The Data Set

Updated on 6/17/2007 3:48 PM

Raw image sequences


The data was collected over four days, May 20-21, 2001 and Nov 15-16, 2001, at the University of South Florida (USF), Tampa. About 33 subjects are common to the May and Nov collections. USF IRB-approved informed consent was obtained from all the subjects in this dataset. The data set consists of persons walking in elliptical paths in front of the camera(s). Each person walked multiple (>= 5) circuits around an ellipse, of which the last circuit forms the data set. The protocol for the data collection, including specifications of the imaging equipment used, is described in detail HERE. An example AVI file is available HERE. For each person, we have up to 5 covariates:


*      2 different shoe types (A and B),

*      2 different carrying conditions (with or without a briefcase),

*      2 different surface types (grass and concrete),

*      2 different viewpoints (Left or Right), and

*      for some subjects, 2 different time instants.


Thus, there are 32 possible conditions under which a person's gait could have been imaged. However, not all subjects were imaged in all conditions. The full data set can be partitioned as depicted in the following grid.

*      We refer to the cells with blue background as the May-2001-No-Briefcase data (used in our ICPR-02 and FGR-02 papers).

*      We distribute only the full version of the dataset, which consists of 1870 sequences from 122 subjects. The total size of the data is around 1.2 terabytes uncompressed, and it is distributed in compressed form on one external drive of 750 GB or more. Please make sure your computer system can handle this large amount of data.

*      You will need a Linux system to uncompress the data.
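
The count of 32 conditions follows from five covariates with two values each (2^5 = 32). As a minimal sketch, the combinations can be enumerated as below. The dictionary keys and value labels are illustrative names chosen here, not the codes used in the dataset's file names, and the two time instants are assumed to correspond to the May and Nov collections.

```python
from itertools import product

# Covariates as listed above. Keys and value labels are illustrative
# assumptions, not the dataset's own naming scheme.
COVARIATES = {
    "shoe": ("A", "B"),
    "carry": ("briefcase", "no_briefcase"),
    "surface": ("grass", "concrete"),
    "view": ("left", "right"),
    "time": ("May", "Nov"),
}


def all_conditions():
    """Return every combination of covariate values as a list of dicts."""
    names = list(COVARIATES)
    return [dict(zip(names, values)) for values in product(*COVARIATES.values())]


if __name__ == "__main__":
    conditions = all_conditions()
    print(len(conditions))  # 2**5 = 32
    print(conditions[0])
```

Because not all subjects were imaged in all conditions, a real index of the data would hold only a subset of these combinations per subject.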



This is a large (1.2 terabyte) dataset of gait video from 122 subjects in up to 32 possible combinations of covariates. A signed release form is required.


Expected response time: 3 months.


Foreign distribution may be restricted.


  • Download the license agreement from HERE


  • The license agreement MUST be reviewed and signed by the individual or entity authorized to make legal commitments on behalf of the institution or corporation. We cannot accept licenses signed by students or faculty members any more. Your institution's legal office must review and execute the license.


  • Return the ORIGINAL properly signed license agreement by postal mail AND email, as instructed in the license agreement.


  • Our university has a new export control process in place. Before you send us the drive, send me the signed license agreement. Send the drive only after the license has been approved.


  • Send one EXTERNAL 1.5 terabyte (1.5 TB) USB drive to the address below, ALONG with a prepaid return label showing the address to which the drive should be returned and a telephone number for the carrier. Also email the tracking numbers to (sarkar AT cseeDOTusfDOTedu) so that we can look out for the drive.


Sudeep Sarkar

Computer Science and Engineering

University of South Florida

4202 E Fowler Ave., ENB 118

Tampa, FL 33620


Phone: 813 974 4100



Note 1: Please make sure your computer setup can handle terabytes of data and that you have access to a Linux machine to uncompress the data.

Note 2: Do not expect immediate service. It will take us a couple of months to respond to each request.

Silhouettes Computed by the Baseline Algorithm


We make available the silhouettes that were computed by the baseline algorithm on the complete dataset, so that one can experiment with similarity computation methods. The complete set of silhouettes is available as a tar file from the Source Code page. As you will notice, the silhouettes are noisy, but they do support the recognition rates reported for the baseline's simple correlation-based similarity computation.
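
For readers who want a starting point for such experiments, the following is a rough sketch of one correlation-style similarity between two silhouette sequences. It is not the baseline code itself, which is distributed on the Source Code page. It assumes each silhouette has already been loaded as an equal-sized binary grid (lists of 0/1 rows), scores a pair of frames by the ratio of overlapping to combined foreground pixels, and slides the shorter probe sequence along the gallery sequence. The function names are hypothetical.

```python
def frame_similarity(a, b):
    """Overlap ratio of two equal-sized binary silhouettes (lists of 0/1 rows)."""
    inter = 0
    union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            if pa and pb:
                inter += 1
            if pa or pb:
                union += 1
    # Two empty silhouettes share no foreground; treat them as dissimilar.
    return inter / union if union else 0.0


def sequence_correlation(probe, gallery):
    """Best summed frame similarity of `probe` slid along `gallery`."""
    n, m = len(probe), len(gallery)
    if n == 0 or n > m:
        raise ValueError("probe must be non-empty and no longer than gallery")
    best = 0.0
    for shift in range(m - n + 1):
        score = sum(
            frame_similarity(probe[k], gallery[shift + k]) for k in range(n)
        )
        best = max(best, score)
    return best
```

Since the silhouettes are noisy, a per-frame overlap score of this kind will rarely reach 1.0 even for sequences of the same person. Loading frames from the tar file is left out because the file layout is not described here.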