OhKBIC Dataset

See the OhKBIC for more details.

Access to this repository is provided under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. In short, the dataset is intended for research and cannot be used for commercial applications.

Publications making use of this dataset should cite the following:

  1. Monaco, John V. and Perez, Gonzalo and Tappert, Charles C. and Bours, Patrick and Mondal, Soumik and Rajkumar, Sudalai and Morales, Aythami and Fierrez, Julian and Ortega-Garcia, Javier, "One-handed Keystroke Biometric Identification Competition," International Conference on Biometrics (ICB) 2015.

Description

The dataset consists of keystroke samples from 64 students answering questions on 3 online exams over a semester. The first exam required students to type normally with both hands. For the second and third exams, students were required to type with only their left hand and only their right hand. This was done to simulate a serious handicap in which a user is able to type with only one hand. Each student provided at least 500 keystrokes for each sample.

The dataset is split into labeled and unlabeled samples. The goal is to identify the user of the unlabeled samples.

Important: not all of the users in the training dataset appear in the testing dataset.

The training dataset contains 500-keystroke samples from 64 users under normal typing conditions. Columns in the training dataset are:

Training dataset columns

| Column | Description |
|---|---|
| user | Unique label for each user |
| condition | Typing condition (both hands, left hand, right hand) |
| handedness | User handedness (left, right, ambidextrous) |
| typingstyle | User typing style (touch typist, hunt-and-peck, or hybrid) |
| timepress | Press timestamp in milliseconds |
| timerelease | Release timestamp in milliseconds |
| keyname | Name of the key |
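
The press and release timestamps are enough to derive the timing features commonly used in keystroke biometrics. The following is a minimal sketch, not a method prescribed by the dataset: it assumes the data is stored as CSV with a header row matching the column names above (the file format is an assumption), and the rows in `TOY_CSV` are made-up values for illustration only. The helper name `timing_features` is hypothetical.

```python
import csv
import io

# Hypothetical toy rows in the training column layout (values are made up,
# and the CSV-with-header format is an assumption about the file layout).
TOY_CSV = """user,condition,handedness,typingstyle,timepress,timerelease,keyname
u01,both hands,right,touch typist,0,95,t
u01,both hands,right,touch typist,140,230,h
u01,both hands,right,touch typist,310,380,e
"""


def timing_features(rows):
    """Return (hold durations, press-to-press latencies) in milliseconds.

    rows: keystroke events in typing order, each a mapping with
    'timepress' and 'timerelease' fields.
    """
    press = [float(r["timepress"]) for r in rows]
    release = [float(r["timerelease"]) for r in rows]
    # Hold duration: how long each key stays down.
    durations = [rel - prs for prs, rel in zip(press, release)]
    # Press-to-press latency: time between consecutive key presses.
    latencies = [b - a for a, b in zip(press, press[1:])]
    return durations, latencies


rows = list(csv.DictReader(io.StringIO(TOY_CSV)))
durations, latencies = timing_features(rows)
print(durations)
print(latencies)
```

A sample of n keystrokes yields n durations and n - 1 latencies, so a 500-keystroke sample gives 500 and 499 values respectively.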

The testing dataset contains 471 samples of 500 keystrokes each from the same population under three different typing conditions: normal typing with both hands, typing with just the left hand, and typing with just the right hand. All samples from the same user are at least 50 keystrokes apart to avoid classification by grammatical structures in the student's response. Timestamps are also normalized by subtracting the first keypress timestamp, to remove any correlation between the time of the attempt in the training and testing datasets. The columns in the testing dataset are:

Testing dataset columns

| Column | Description |
|---|---|
| sample | Globally unique label for each sample |
| condition | Typing condition (both hands, left hand, right hand) |
| timepress | Press timestamp in milliseconds |
| timerelease | Release timestamp in milliseconds |
| keyname | Name of the key |
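
The timestamp normalization described for the testing samples can be illustrated with a short sketch. It assumes each sample is held as a list of dictionaries keyed by the column names above; the function name `normalize_timestamps` and the example values are hypothetical. Applying the same shift to training samples puts both sets on a common time origin.

```python
def normalize_timestamps(events):
    """Shift press/release times so the first keypress occurs at 0 ms.

    events: list of dicts with 'timepress' and 'timerelease' (milliseconds).
    Returns new dicts; the input list is left unchanged.
    """
    if not events:
        return []
    # The earliest press time is the first keypress of the sample.
    t0 = min(e["timepress"] for e in events)
    return [
        {**e,
         "timepress": e["timepress"] - t0,
         "timerelease": e["timerelease"] - t0}
        for e in events
    ]


# Made-up example events, for illustration only.
sample = [
    {"timepress": 1000, "timerelease": 1080, "keyname": "a"},
    {"timepress": 1150, "timerelease": 1210, "keyname": "b"},
]
normalized = normalize_timestamps(sample)
print(normalized)
```

Because every timestamp in a sample is shifted by the same amount, hold durations and latencies between keys are unaffected by the normalization.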
