This IPython Notebook describes the two datasets presented in:
L. Giancardo, A. Sánchez-Ferro, T. Arroyo-Gallego, I. Butterworth, C. S. Mendoza, P. Montero, M. Matarazzo, J. A. Obeso, M. L. Gray & R. San José Estépar. Computer keyboard interaction as an indicator of early Parkinson's disease. Sci. Rep. 6, 34468; doi: 10.1038/srep34468 (2016)
The Early Parkinson's Disease dataset is referred to as MIT-CS1PD; the De-Novo Parkinson's Disease dataset is referred to as MIT-CS2PD. The notebook also shows how to load the key press timing data from the datasets.
These datasets have been collected thanks to the financial support by the Comunidad de Madrid, Fundacion Ramon Areces and The Michael J Fox Foundation for Parkinson's research (grant number 10860). We thank the M + Vision faculty for their guidance in developing this project. We also thank our many clinical collaborators at MGH in Boston, at “12 de Octubre”, Hospital Clinico and Centro Integral en Neurociencias HM CINAC in Madrid for their insightful contributions.
This notebook requires pandas, NumPy, matplotlib and the nqDataLoader data loading library.
The MIT_CS1PD dataset includes the typing information collected on a population sample of 31 subjects, 13 healthy controls and 18 Parkinson’s disease (PD) sufferers. The subjects were recruited from two movement disorder units in Madrid (Spain) following the institutional protocols approved by the Massachusetts Institute of Technology, USA (Committee on the Use of Humans as Experimental Subjects approval no. 1402006203), Hospital 12 de Octubre, Spain (no. CEIC:14/090) and Hospital Clinico San Carlos, Spain (no. 14/136-E).
Each data file includes the timing information collected during two sessions of typing activity using a standard word processor on a Lenovo G50-70 i3-4005U with 4 GB of memory and a 15-inch screen running Manjaro Linux. Subjects were instructed to type as they normally would at home and were free to correct typing mistakes only if they wanted to. The key acquisition software had a temporal resolution of 3/0.28 (mean/std) milliseconds.
Subjects were asked to visit a movement disorder unit twice to complete the study. Clinical evaluation includes UPDRS and finger tapping tests.
This dataset is free to use for research purposes only. We encourage researchers to try new computational algorithms on these data. Any time MIT_CS1PD is used for publishing or presenting, the following publication needs to be referenced:
L. Giancardo, A. Sánchez-Ferro, T. Arroyo-Gallego, I. Butterworth, C. S. Mendoza, P. Montero, M. Matarazzo, J. A. Obeso, M. L. Gray & R. San José Estépar. Computer keyboard interaction as an indicator of early Parkinson's disease. Sci. Rep. 6, 34468; doi: 10.1038/srep34468 (2016)
The ground truth describes the motor impairment of each participant. The GT value discriminates the two classes in the sample population (CNT: False, PD: True). The results of the clinical evaluation (UPDRS-III), alternating finger tapping, single key tapping, neuroQWERTY score and typing speed are included in updrs108, afTap, sTap, nqScore and typingSpeed, respectively.
Columns file_1 and file_2 correspond to the first and second repetition.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import nqDataLoader as nq #data loading library
# load ground truth
cs1PdFr = pd.read_csv( 'MIT-CS1PD/GT_DataPD_MIT-CS1PD.csv' )
# set Patient ID as index
cs1PdFr = cs1PdFr.set_index('pID')
# show part of Data Frame
cs1PdFr.head()
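Once the table is loaded, a per-class summary is a quick sanity check on the labels. The sketch below is only an illustration: it assumes the class label is stored in a boolean column named `gt` (the exact column name is an assumption; check `cs1PdFr.columns`), and it uses a tiny made-up table in place of the real CSV so that it runs on its own. The helper `summarizeByClass` is hypothetical and not part of nqDataLoader.

```python
import pandas as pd

# Toy stand-in for the ground-truth table. All values are made up for
# illustration, and the column name 'gt' is an assumption about the CSV header.
toyFr = pd.DataFrame(
    {
        'pID': [1, 2, 3, 4],
        'gt': [False, False, True, True],
        'nqScore': [0.05, 0.07, 0.20, 0.30],
        'typingSpeed': [150.0, 130.0, 90.0, 70.0],
    }
).set_index('pID')


def summarizeByClass(gtFr, labelCol='gt'):
    """Return subject counts and numeric column means per class (False=CNT, True=PD)."""
    counts = gtFr.groupby(labelCol).size().rename('nSubjects')
    means = gtFr.groupby(labelCol).mean(numeric_only=True)
    # counts and means share the same index (the class label), so they align
    return pd.concat([counts, means], axis=1)


summary = summarizeByClass(toyFr)
print(summary)
```

With the real data, the same call would be `summarizeByClass(cs1PdFr)`, provided the label column name matches.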
The typing data can be loaded as follows:
# load typing data for patient 60 (first repetition)
# this method automatically filters out meta keys, mouse presses and backspace
keyPressed, htArr, pressArr, releaseArr = \
nq.getDataFiltHelper( 'MIT-CS1PD/data_MIT-CS1PD/' + cs1PdFr.loc[60]['file_1'])
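The loader returns four parallel arrays, one entry per keystroke. As a minimal sketch of how they might relate, the code below assumes (this is not confirmed by the loader itself) that all times are in seconds and that each hold time equals the release time minus the press time. The toy arrays are made up and the helper `holdTimesConsistent` is hypothetical.

```python
import numpy as np

# Toy stand-ins for the loader output; values are made up, times assumed in seconds
pressToy = np.array([0.00, 0.35, 0.80, 1.10])
releaseToy = np.array([0.10, 0.47, 0.89, 1.25])
htToy = np.array([0.10, 0.12, 0.09, 0.15])


def holdTimesConsistent(htArrIn, pressArrIn, releaseArrIn, tol=1e-3):
    """Check that hold time matches release minus press within tol seconds."""
    derived = np.asarray(releaseArrIn) - np.asarray(pressArrIn)
    return bool(np.allclose(htArrIn, derived, atol=tol))


print(holdTimesConsistent(htToy, pressToy, releaseToy))

# A simple summary of the hold times for one recording
medianHT = float(np.median(htToy))
print(medianHT)
```

If the check fails on real data, the assumption about units or about how the hold time is defined does not hold and should be revisited.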
Here we display the raw hold times.
def plotRawHT(pressArrIn, htArrIn, maxWinID=3):
    # maxWinID is accepted but not used by this function
    winSize = 90   # spacing of the x-axis ticks
    lblfSize = 24  # font size of the axis labels
    plt.figure()
    plt.ion()
    # one dot per keystroke: press time on x, hold time on y
    plt.plot( pressArrIn, htArrIn, '.' )
    plt.ylim(0,0.49)
    plt.xticks( range( 0, int(np.max(pressArrIn)), winSize ) )
    # rmBorder( plt.gca() )
    plt.xlabel( '$t$', fontsize=lblfSize )
    plt.ylabel( 'Hold time, $a[t]$', fontsize=lblfSize )
    plt.show()
plotRawHT( pressArr, htArr )
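The plot places a tick every 90 units along the press-time axis. One way to look at the same data numerically is to summarize the hold times in consecutive windows of that size. The sketch below is only an illustration and is not the neuroQWERTY index computation from the paper; it assumes press times are in seconds, and the helper `windowedMedianHT` is hypothetical. The toy arrays are made up.

```python
import numpy as np


def windowedMedianHT(pressArrIn, htArrIn, winSize=90):
    """Median hold time in consecutive, non-overlapping windows of winSize seconds."""
    pressArrIn = np.asarray(pressArrIn, dtype=float)
    htArrIn = np.asarray(htArrIn, dtype=float)
    # window index of each keystroke, based on its press time
    winIdx = np.floor(pressArrIn / winSize).astype(int)
    medians = {}
    for w in np.unique(winIdx):
        # windows without any keystroke never appear in winIdx and are skipped
        medians[int(w)] = float(np.median(htArrIn[winIdx == w]))
    return medians


# Toy data: three keystrokes in the first window, two in the second
pressToy = np.array([5.0, 40.0, 85.0, 95.0, 170.0])
htToy = np.array([0.10, 0.12, 0.14, 0.20, 0.30])
res = windowedMedianHT(pressToy, htToy)
print(res)
```

With the arrays loaded above, the call would be `windowedMedianHT(pressArr, htArr)`. The median is used here because it is less sensitive to occasional very long key holds than the mean.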
The MIT_CS2PD dataset includes the typing information collected on a population sample of 54 subjects, 30 healthy controls and 24 Parkinson’s disease (PD) sufferers. The subjects were recruited from two movement disorder units in Madrid (Spain) following the institutional protocols approved by the Massachusetts Institute of Technology, USA (Committee on the Use of Humans as Experimental Subjects approval no. 1402006203), Hospital 12 de Octubre, Spain (no. CEIC:14/090) and Hospital Clinico San Carlos, Spain (no. 14/136-E).
Each data file includes the timing information collected during a session of typing activity using a standard word processor on a Lenovo G50-70 i3-4005U with 4 GB of memory and a 15-inch screen running Manjaro Linux. Subjects were instructed to type as they normally would at home and were free to correct typing mistakes only if they wanted to. The key acquisition software had a temporal resolution of 3/0.28 (mean/std) milliseconds.
Subjects were asked to visit a movement disorder unit once to complete the study. Clinical evaluation includes UPDRS and finger tapping tests.
This dataset is free to use for research purposes only. We encourage researchers to try new computational algorithms on these data. Any time MIT_CS2PD is used for publishing or presenting, the following publication needs to be referenced:
L. Giancardo, A. Sánchez-Ferro, T. Arroyo-Gallego, I. Butterworth, C. S. Mendoza, P. Montero, M. Matarazzo, J. A. Obeso, M. L. Gray & R. San José Estépar. Computer keyboard interaction as an indicator of early Parkinson's disease. Sci. Rep. 6, 34468; doi: 10.1038/srep34468 (2016)
The ground truth describes the motor impairment of each participant. The GT value discriminates the two classes in the sample population (CNT: False, PD: True). The results of the clinical evaluation (UPDRS-III), alternating finger tapping, single key tapping, neuroQWERTY score and typing speed are included in updrs108, afTap, sTap, nqScore and typingSpeed, respectively.
Columns file_1 and file_2 correspond to the first and second repetition.
# load ground truth
cs2PdFr = pd.read_csv( 'MIT-CS2PD/GT_DataPD_MIT-CS2PD.csv' )
# set Patient ID as index
cs2PdFr = cs2PdFr.set_index('pID')
# show part of Data Frame
cs2PdFr.head()
The typing data can be loaded as follows:
# load typing data for patient 1000
# this method automatically filters out meta keys, mouse presses and backspace
keyPressed, htArr, pressArr, releaseArr = \
nq.getDataFiltHelper( 'MIT-CS2PD/data_MIT-CS2PD/' + cs2PdFr.loc[1000]['file_1'])
Here we display the raw hold times.
plotRawHT( pressArr, htArr )