1014 IEEE SYSTEMS JOURNAL, VOL. 10, NO. 3, SEPTEMBER 2016

Movement Analysis of Rehabilitation Exercises: Distance Metrics for Measuring Patient Progress

Roshanak Houmanfar, Michelle Karg, and Dana Kulic

Abstract—Mobility improvement for patients is one of the primary concerns of physiotherapy rehabilitation. Providing the physiotherapist and the patient with a quantified and objective measure of progress can be beneficial for monitoring the patient’s performance. In this paper, two approaches are introduced for quantifying patient performance. Both approaches formulate a distance between patient data and the healthy population as the measure of performance. Distance measures are defined to capture the performance of one repetition of an exercise or multiple repetitions of the same exercise. To capture patient progress across multiple exercises, a quality measure and overall score are defined based on the distance measures and are used to quantify the overall performance for each session. The effectiveness of these measures in detecting patient progress is evaluated on rehabilitation data recorded from patients recovering from knee or hip replacement surgery. The results show that the proposed measures are able to capture the trend of patient improvement over the course of rehabilitation. The trend of improvement is not monotonic and differs between patients.

Index Terms—Biomedical monitoring, biomedical signal processing, computer aided diagnosis, human motion analysis, motion measurement, motion quality assessment, rehabilitation robotics.

I. INTRODUCTION

THE application of machine learning techniques to human motion analysis has grown rapidly over the past few years. Measurement and analysis of physiotherapy data have the potential to provide an objective and quantitative measure of patient progress over the course of physiotherapy treatment.

During a typical physiotherapy session, the physiotherapist instructs the patient to perform a number of exercises, each with several repetitions. The set of exercises chosen and the number of repetitions may be customized for each patient. In current clinical practice, the patient’s performance is typically assessed using visual observation of the patient’s motions and questionnaires, e.g., the Community Balance and Mobility Scale [1] and the Falls Efficiency Scale [2]. Goniometry, a technique of measuring joint angles which isolates a single body joint in order to evaluate range of motion [3], can also be used, but is not accurate when the subject is moving, e.g., during exercises and functional rehabilitation.

The current measurement and assessment techniques require additional physiotherapist effort and monitoring, and are not capable of measuring during movement. An automated system could provide the therapist with numerical metrics to assess the patient’s recovery process and potentially allow physiotherapists to assess the effectiveness of various treatment protocols over a population of patients.

Manuscript received October 31, 2013; revised February 28, 2014 and May 20, 2014; accepted May 27, 2014. Date of publication July 1, 2014; date of current version August 23, 2016.

The authors are with the University of Waterloo, Department of Electrical and Computer Engineering, Waterloo, ON N2L 3G1, Canada.

Digital Object Identifier 10.1109/JSYST.2014.2327792

Patient data analysis for progress monitoring is a challenging task because of the complexity of human motion. Human movement consists of synchronous recruitment of multiple degrees of freedom (DoF), making single DoF comparisons incomplete and possibly unreliable. Human motion exhibits significant temporal and spatial variability for different repetitions of the same exercise. Since humans differ in characteristics such as age, gender, height, and weight, variability between different subjects is also observed. When recovering from an illness or surgery, there are variabilities caused by progress and improvement through rehabilitation, differing levels of pain during the course of treatment, as well as differing levels of fatigue over the course of a session. During the course of rehabilitation, patients are frequently observed to exhibit compensation. Compensation refers to the recruitment of additional or different DoFs [4] while performing a certain exercise. The correct form of the exercise and the DoFs recruited are prescribed by the physiotherapist. Other sources of variability are due to the measurement system and the algorithms used for deriving the joint angles.

The goal of progress monitoring is to identify the variability caused by recovery and improvement. The presence of multiple other sources of variability makes this task challenging. Furthermore, the exercises are performed based on a specific regimen instructed by the physiotherapist for each patient. Therefore, the proposed approach should be flexible enough to detect patient improvement for any set of exercises.

In previous work, we have developed a body-worn sensor system and associated algorithms for measuring human movement during rehabilitation. The overall system is illustrated in Fig. 1. The data is collected from body-worn inertial measurement unit (IMU) sensors attached to the patient, and the joint angle positions, velocities, and accelerations are derived [5]. The data are then segmented such that each segment begins with the start of an exercise repetition and ends when the exercise repetition is finished [6]. In this paper, we propose an approach for progress estimation based on the segmented motion data. Descriptive features are either extracted from joint angle positions, velocities, and accelerations or from a statistical model of these data. The former is a common approach in the biomechanics literature [7]–[9], whereas the latter provides a model of the timeseries based on multiple repetitions and is more common in the machine learning literature [10], [11]. In addition to collecting patient data, we also collect data from healthy participants performing the same exercises, and use the healthy population data as a reference. Distance measures are proposed to quantify the performance quality of a single repetition of an exercise, a set of repetitions of the same exercise, and a set of different exercises. The distances are calculated based on kinematic data and compare movements of one subject to the average healthy exercise performance.

Fig. 1. Overall system. The IMU sensors are mounted on the patient’s knee and ankle. Angular velocity and linear acceleration are collected from the sensors and the five joint angles are estimated using the extended Kalman filter [12] (q1: extension/flexion of the hip, q2: internal/external rotation of the hip, q3: abduction/adduction of the hip, q4: extension/flexion of the knee, q5: internal/external rotation of the lower limb). The data is then segmented [6] so that each segment starts with the beginning of one repetition timeseries and ends when the repetition is completed. Features are extracted from the joint angles’ segmented timeseries and are used to obtain the measures of progress.

1937-9234 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

HOUMANFAR et al.: MOVEMENT ANALYSIS OF REHABILITATION EXERCISES: METRICS FOR PATIENT PROGRESS 1015

This paper is organized as follows: Section II overviews the related work and motivates the application of continuous measures in physiotherapy rehabilitation. In Section III, two approaches are proposed to formulate distance measures for each specific exercise and the overall score for a session of multiple exercises. The proposed approaches are evaluated on a synthetic data set in Section IV. Clinical data set collection and the experimental evaluation of the proposed approaches are detailed in Section V. A discussion of the results is presented in Section VI. Section VII outlines conclusions and directions for future work.

II. RELATED WORK

Human movement analysis is an active area of research with a wide field of applications including action recognition [13], gait identification [14], gesture recognition [15], motion imitation in robotics [16], affective human computer interaction [17], sport science [18], medical diagnosis [19], and rehabilitation [20]. The goal of these applications is either to recognize what movement is performed, e.g., [21], or how a movement is performed, e.g., [15]. Automatic human movement analysis for rehabilitation exercises targets the latter to discriminate between movements performed by healthy and patient populations [22] and perform illness diagnosis [23].

Typically, for a set of movements, key elements of human movement performance are extracted by concatenating position, velocity, and acceleration information in a common feature vector [10], [11], [24]. These important features are used to separate the unhealthy population from the healthy population. Most studies base their methods on classifiers that can discriminate between the healthy and unhealthy populations, e.g., [19], [25]–[27]. These studies rely on a patient database for training such a classifier [26], [27]. There are also studies that focus on monitoring features that change when a certain medication or treatment is applied to a group of patients [28], [29] or focus on detecting features that are specific to the patient population [30]. Unlike classification methods, which distinguish only between two classes (healthy versus patient), we focus on patient monitoring and the detection of gradual changes in patient performance due to rehabilitation. To date, only a few studies focus on assessing the correctness of exercises performed [10] and analyzing continuous changes in the movement performed [11], [24].

Upper body functionality post-stroke is considered in [24]. The data from 77 healthy control subjects and 46 stroke patients performing a single exercise is collected using a robotic exoskeleton; both data sets are used in feature selection and classifier training. A multi-layered neural network is used to select features and distinguish between healthy and patient populations. The summation of outputs in the last layer is used to estimate the continuous measure of progress for patients. The analysis is performed on sessions up to 50 days apart.

Taylor et al. [10] consider three typical multiple knee osteoarthritis rehabilitation exercises and record the movements with wearable accelerometers. Descriptive features such as mean, minimum, and maximum are calculated from the sensor readings and directly used in a multi-label classifier to distinguish between correct performance and several common compensation strategies. The Adaboost algorithm with linear classifiers for each feature is used for classification. Only data collected from a healthy population is used in the analysis. The healthy population data is labeled using expert opinion and analysis is performed on motions that have recognizable differences.

Zhang et al. [11] focus on post-stroke rehabilitation. Motion data is collected with IMUs and raw sensor output is used for feature extraction after basic filtering. The timeseries data for each sensor is partitioned for different exercises. Partitions that correlate least with corresponding partitions of other exercises are considered as motion templates. The patient data is then cross-correlated with the templates and the peak values of the cross-correlation are considered as the features. Data is collected from rehabilitation professionals and a single patient is used for testing. K-Nearest Neighbours is used to classify the patient’s motions and the distance from the center of the cluster is the estimate of continuous progress.

The current state of the art develops models for healthy and patient populations [10], [11], [24] and therefore is capable of assigning the class labels “healthy” or “unhealthy” to captured movement sequences. The disadvantage of classification techniques is that they cannot explicitly model continuous progress and therefore are not suitable for continuous monitoring purposes. Some classifiers, e.g., neural network classifiers [24], often need fine tuning and are hard to replicate or extend because their structure makes clinical interpretation difficult.

Many of the works in the state of the art focus on one specific exercise [11], [24]. This is a limitation for monitoring patients over the course of rehabilitation because the exercise regimen consists of more than one exercise. Furthermore, many of the current works [10], [24] validate their methods based on synthetic and simulated data due to lack of patient data. Validation of studies considering continuous labeling (e.g., [24]) is difficult because an objective quantized ground truth of continuous progress is rarely available. Quantitative assessment scores are often not collected for each physiotherapy session of a patient due to the limited time in each session. Therefore, visual graphs and cross validations (e.g., comparison to other classifiers’ performance) are common methods for validation.

Fig. 2. Repetition timeseries and the repetition set provide the performance measures δ and Δ. The overall score S assesses an exercise set. During the knee extension exercise, the subject performs a full knee extension/flexion without moving any other joints. During the knee hip extension exercise, the subject lies on the ground and performs a knee hip extension/flexion simultaneously. During the squat, the subject bends his knees and hip while standing. (a) Repetition timeseries. (b) Repetition set. (c) Exercise set.

An objective and quantified measure of patient improvement can be beneficial for monitoring patient progress. In this paper, we propose a technique that estimates the continuous measure of patient improvement and is capable of handling a variety of exercises. We validate our proposed approaches based on both synthetic and clinical data. Of the challenges summarized above, we address capturing the variability caused by improvement in human motion, validating the proposed approaches based on clinical data, and handling different exercise regimens for each patient and each session. We do not address factors such as pain and fatigue that affect human motion, and we do not address how the causes for improvement or degradation in performance can be identified.

III. METHODOLOGY

Analysing patient progress during physiotherapy requires answering the following questions: 1) How to assess one repetition of one exercise? 2) How to assess multiple repetitions of one exercise? 3) How to combine the evaluations from different exercises and obtain a score that denotes the overall performance for a session?

In answering the questions above, we assume that the motion data is available in the form of joint angle positions, velocities, and accelerations. We also assume that motion data is available from a healthy population performing the same set of exercises. We make the assumption that, at the time of the analysis, we know which exercise is being performed, and that the data is segmented such that one single repetition of a certain exercise is a repetition timeseries $\omega = [\gamma(1)\ \gamma(2)\ \ldots\ \gamma(T)]$, where $T$ is the duration of the repetition for that exercise, and $\gamma$ is a vector of joint kinematics $\gamma = [q_1\ q_2\ \ldots\ \dot{q}_1\ \dot{q}_2\ \ldots\ \ddot{q}_1\ \ddot{q}_2\ \ldots]$. Multiple repetitions of the same exercise performed in the same session are the repetition set for that exercise, $\Omega = \{\omega_1, \ldots, \omega_n\}$, where $n$ is the number of repetitions. The set of multiple exercises performed in the same session is the exercise set of that session, $\Gamma = \{\Omega_1, \ldots, \Omega_m\}$, where $m$ is the number of different exercises performed in the session (see Fig. 2).

We propose two approaches to extract variation due to progress through rehabilitation. In the feature-based method, descriptive measures are extracted from the joint angle timeseries. The HMM-based method relies on features extracted from a generative model for the joint angle timeseries. For both approaches, we use the healthy population data as the reference for assessing performance. Measures δ and Δ for assessing one and multiple repetitions of one exercise are introduced based on a comparison between the healthy population and the patient. The overall score S is calculated as a function of these measures for multiple exercises in one session.

A. Feature-Based Approach

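In the feature-based method, the mean, minimum, maximum, skew, and range of motion of the joint angle positions, velocities, and accelerations, plus the duration of each repetition timeseries, are considered as the feature vector

$$ v = \left[\mu_{q_1}\ \ \min\nolimits_{q_1}\ \ \max\nolimits_{q_1}\ \ \mathrm{skew}_{q_1}\ \ \mathrm{rom}_{q_1}\ \ \mu_{q_2}\ \ \ldots\ \ \mathrm{duration}\right] \tag{1} $$

$$ \mathrm{skew}_{q_i} = \frac{\frac{1}{T}\sum_{j=1}^{T}\left(q_{ij} - \mu_{q_i}\right)^{3}}{\left(\sqrt{\frac{1}{T}\sum_{j=1}^{T}\left(q_{ij} - \mu_{q_i}\right)^{2}}\right)^{3}} \tag{2} $$

$$ \mathrm{rom}_{q_i} = \max\nolimits_{q_i} - \min\nolimits_{q_i}. \tag{3} $$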

This definition of the feature vector is desirable because it allows modeling the timeseries of the data using statistical features. This method is fast to compute and can capture the attributes of the timeseries from one example. However, since the features are defined directly from the timeseries, the approach is more affected by unwanted variabilities such as noise.
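As an illustration of (1)–(3), the following is a minimal sketch of how such a feature vector could be assembled. The function names, the `channels` layout (one list per joint angle position, velocity, or acceleration), and the use of a sampling rate to express the duration in seconds are assumptions made for this example, not details given in the paper.

```python
import math


def summary_features(series):
    """Mean, minimum, maximum, skew, and range of motion of one timeseries."""
    n = len(series)
    mean = sum(series) / n
    lo, hi = min(series), max(series)
    m2 = sum((x - mean) ** 2 for x in series) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in series) / n  # third central moment
    # Skew as in (2); a constant series has no spread, so skew is set to 0.
    skew = m3 / math.sqrt(m2) ** 3 if m2 > 0 else 0.0
    return [mean, lo, hi, skew, hi - lo]


def feature_vector(channels, sample_rate_hz):
    """Concatenate per-channel features and append the repetition duration.

    `channels` is a list of equal-length timeseries for one repetition
    (hypothetical layout: positions, velocities, accelerations per joint).
    """
    v = []
    for series in channels:
        v.extend(summary_features(series))
    v.append(len(channels[0]) / sample_rate_hz)  # duration in seconds
    return v
```

Because every entry is a simple statistic of a single repetition, the vector can be computed from one example, which matches the advantage noted above.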

1) Feature Selection: To assess the performance, it is essential to extract the informative features from the feature vector and exclude those that are uninformative or redundant. The informative features extracted from the feature vector are the top features. Different features may be informative for different exercises; therefore, the top features are selected automatically from the data by looking for those features which show the most variation over the course of treatment and are most different from the healthy population. Features that reflect what changes most throughout the rehabilitation are chosen using the Least Absolute Shrinkage and Selection Operator (LASSO) [31]. LASSO is a regression tool which can also be used for selecting features.

For a set of inputs $f_1, f_2, \ldots, f_k$, an output $y$, and the following linear model:

$$ \hat{y} = w_0 + w_1 f_1 + w_2 f_2 + w_3 f_3 + \cdots + w_k f_k \tag{4} $$

LASSO adjusts the weights $w_0, \ldots, w_k$ such that $\sum (y - \hat{y})^2$ is minimized and $\sum_{i=0}^{k} |w_i| < t$, where $t \ge 0$ is a tuning parameter [31]. The parameter $t$ is selected so that the weights $w$ are nonzero for only five features. Preliminary experiments showed that the algorithm is not sensitive to this value: any value of $t$ that yields between 5 and 25 features results in the same performance measures. When $w_i$ becomes zero, the input $f_i$ does not contribute to minimizing $\sum (y - \hat{y})^2$, i.e., $f_i$ is either uninformative or its information is redundant. The remaining five features are considered as the top features in the subsequent analysis.

The inputs $f_1, f_2, \ldots, f_k$ are the features of the repetition timeseries and the output $y$ is the corresponding session number. The session numbers are normalized between 0 and 1, such that 0 corresponds to a patient’s first session, and 1 corresponds to a patient’s last session. We select features that change with every session and, among these features, the ones that most likely correspond to a linear relationship. This selection allows us to find the features that are changing as patients progress through the sessions. For the purposes of feature selection, we also consider the healthy population data in this regression. For the healthy population, the label $y$ is set to be 100 times larger than the patients’ last session. Introducing this outlier forces the regression to be in the direction of the healthy population data and helps to detect the features that not only change with the progress of the patients but also separate the healthy population from patients. The value of $y$ for the healthy population directly affects the value of the weights, but the chosen features are not changed as long as $y$ is sufficiently large. We do not use the values of the weights in our analysis and only use the features selected by this method.

As there are multiple sources of variation in human motion, we cannot assume that there is a linear relationship between the number of days in treatment and motion features. We use a linear model only for feature selection, i.e., for identifying which features change during the course of treatment and discriminate between the healthy population and patient data.
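The selection step can be sketched as follows. This is an illustrative stand-in rather than the authors' implementation: it solves the penalized form of LASSO (a penalty weight `lam` in place of the bound $t$) by coordinate descent, and it raises the penalty until at most `n_top` weights remain nonzero. All function names, the starting penalty, and the growth factor of 1.2 are assumptions of this sketch.

```python
def standardize(column):
    """Scale one feature column to zero mean and unit standard deviation."""
    n = len(column)
    mean = sum(column) / n
    std = (sum((x - mean) ** 2 for x in column) / n) ** 0.5
    if std == 0:
        return [0.0] * n  # a constant feature carries no information
    return [(x - mean) / std for x in column]


def soft_threshold(rho, lam):
    """Shrink rho towards zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_weights(columns, y, lam, sweeps=200):
    """Coordinate descent for (1/(2n)) * sum((y - y_hat)^2) + lam * sum(|w_i|).

    `columns` must already be standardized; the intercept is the mean of y
    and is not penalized (an assumption of this sketch).
    """
    n = len(y)
    y_mean = sum(y) / n
    residual = [yi - y_mean for yi in y]
    w = [0.0] * len(columns)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # Correlation of feature j with the residual excluding its own term.
            rho = sum(c * (r + w[j] * c) for c, r in zip(col, residual)) / n
            new_w = soft_threshold(rho, lam)
            step = new_w - w[j]
            if step != 0.0:
                residual = [r - step * c for c, r in zip(col, residual)]
                w[j] = new_w
    return w


def top_features(raw_columns, y, n_top):
    """Indices of the features whose weights stay nonzero (at most n_top)."""
    columns = [standardize(col) for col in raw_columns]
    lam = 1e-3
    while True:
        w = lasso_weights(columns, y, lam)
        selected = [j for j, wj in enumerate(w) if wj != 0.0]
        if len(selected) <= n_top:
            return selected
        lam *= 1.2  # stronger penalty, fewer surviving features
```

In use, each row would be one repetition, with `y` set to the normalized session number for patient repetitions and to 100 for healthy repetitions, following the labeling described above. Only the returned indices are used; the weight values are discarded.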

2) Measure of Performance for Repetition Timeseries: To obtain a measurement for the performance of one exercise, the top feature vectors are extracted from the patient ($V_P$) and healthy population ($V_H$) data as explained in Section III-A1

$$ V'_H = V_H(\text{top features}) \tag{5} $$

$$ V'_P = V_P(\text{top features}). \tag{6} $$

Based on our observations, healthy individuals employ, to a smaller degree, the same compensation strategies that patients use when performing an exercise, i.e., healthy subjects show the same compensation strategies due to mental and physical fatigue, lack of physical readiness, and misunderstanding the exercise instructions. For example, in the knee extension exercise, the correct form of the exercise is to perform a full range of knee extension while keeping the other joints still. Based on our observations, the healthy population often compensates with additional hip extension. We assume that among the features chosen by LASSO, the ones with higher variance in the healthy population are more informative, because the highly variant features are either features of the moving joint or features describing the compensation strategy. Therefore, more weight is given to the more variant features in defining the distance measure. When comparing these features, standard normalization (normalizing to zero mean and unit standard deviation) is performed on the data to make the data unitless. The distance δ between the patient repetition and the healthy population data evaluates each repetition

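$$ \mu_H = \mathrm{mean}\left(V'_H\right) \tag{7} $$

$$ \Sigma_H = \mathrm{diag}\left(\mathrm{std}\left(V'_H\right)\right) \tag{8} $$

$$ \delta_i = \left(V'_{P_i} - \mu_H\right)^{T} \Sigma_H \left(V'_{P_i} - \mu_H\right) \tag{9} $$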

where $V'_H$ is the healthy population top feature vector, $V'_{P_i}$ is the patient top feature vector for the $i$th repetition timeseries, $\mu_H$ is the mean of the healthy population top feature vectors, $\Sigma_H$ is the diagonal matrix of standard deviations for the healthy population top feature vectors, and $\delta_i$ is the distance between one repetition of the exercise performed and the healthy group’s performance. We assume that as patients improve they get closer to the healthy data and therefore a decrease in the value of δ over the course of rehabilitation indicates improvement.
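A minimal sketch of (7)–(9) is given below, assuming the top feature vectors have already been normalized and are stored as plain lists. The function names and the use of the population (rather than sample) standard deviation are assumptions of this example.

```python
def healthy_reference(healthy_vectors):
    """Per-feature mean and standard deviation over healthy top-feature vectors."""
    n = len(healthy_vectors)
    k = len(healthy_vectors[0])
    mu = [sum(v[i] for v in healthy_vectors) / n for i in range(k)]
    sigma = [
        (sum((v[i] - mu[i]) ** 2 for v in healthy_vectors) / n) ** 0.5
        for i in range(k)
    ]
    return mu, sigma


def repetition_distance(patient_vector, mu, sigma):
    """delta = (v - mu)^T diag(sigma) (v - mu): since Sigma_H is diagonal,
    the quadratic form reduces to a sigma-weighted sum of squared deviations."""
    return sum(s * (p - m) ** 2 for p, m, s in zip(patient_vector, mu, sigma))
```

Because the weights are the healthy standard deviations, a deviation in a feature that varies strongly among healthy subjects contributes more to δ than the same deviation in a feature that varies little, which is the weighting described above.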

3) Measure of Performance for Multiple Repetitions of the Same Exercise: For each patient, δ represents the measure of performance for one repetition of an exercise. The median of the distance measures (δ) calculated for one exercise over the session is considered as the overall distance measure for the repetition set of that exercise

$$ \Delta_\Omega = \mathrm{median}\left(\delta_\Omega\right) \tag{10} $$

where $\Delta_\Omega$ is the overall performance of one exercise in one session and $\delta_\Omega$ is the vector of distance measures calculated for every repetition timeseries $\omega_i$ in the repetition set $\Omega$. The median is used to lessen the sensitivity to outliers.
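The effect of using the median in (10) can be seen in a short sketch. The function name and the sample distance values are made up for illustration.

```python
from statistics import mean, median


def repetition_set_distance(deltas):
    """Overall distance of one exercise in one session: the median in (10)."""
    return median(deltas)


# Hypothetical distances for four repetitions; the last one is an outlier,
# e.g. a badly performed or badly segmented repetition.
deltas = [1.0, 1.2, 0.9, 9.0]
robust = repetition_set_distance(deltas)  # stays close to the typical repetitions
naive = mean(deltas)                      # pulled upwards by the outlier
```

With these values the median stays near the three typical repetitions, while the mean is dominated by the single outlier.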

4) Measure of Performance for a Combination of Exercises: The distance $\Delta_\Omega$ describes the patient performance for one exercise (i.e., $\Omega_j$) in each session. There are multiple exercises performed in each physiotherapy session (i.e., $\Gamma$) that need to be considered together for overall patient progress assessment. Quality and quantity are the two factors that affect scoring an exercise. Based on our observations, the features computed from exercises performed by the healthy population have larger variances when the exercise is more difficult. We therefore assume that the distance measures of the healthy population have larger variance for more difficult exercises.

The distance measures (δ) are calculated for every repetition timeseries of the healthy population data according to (9) and are considered as the comparison reference. The healthy population distance measure vector $\delta_{H_j}$ is the vector of the distance measures calculated for every repetition timeseries of exercise $\Omega_j$ in the healthy population data. The patient distance measures ($\Delta_{P_j}$) are calculated for the repetition set of every exercise $\Omega_j$ in the exercise set $\Gamma$. The mean and standard deviation of $\delta_{H_j}$ are considered as the measure of exercise difficulty

$$ \mu_{\delta_{H_j}} = \mathrm{mean}\left(\delta_{H_j}\right) \tag{11} $$

$$ \sigma_{\delta_{H_j}} = \mathrm{std}\left(\delta_{H_j}\right) \tag{12} $$

where $\mu_{\delta_{H_j}}$ is the mean of the healthy population distance measure vector $\delta_{H_j}$ and $\sigma_{\delta_{H_j}}$ is the standard deviation of the healthy population distance measure vector $\delta_{H_j}$.

We define the measure of quality for a repetition set of an exercise $j$ performed by the patient as

$$ Q_j = \frac{\Delta_{P_j} - \mu_{\delta_{H_j}}}{\sigma^{a}_{\delta_{H_j}}} \tag{13} $$

where $\Delta_{P_j}$ is the patient’s distance measure for the repetition set of exercise $j$, and $a$ is the index that penalizes $Q$ based on the exercise difficulty, i.e., larger $a$ increases the importance of exercise difficulty. We observed that the best value for $a$ is 2, which can be interpreted as the inverted dispersion index [32]. A perfect performance over any repetition set $\Omega$ results in a value of zero for the overall distance measure ($\Delta_\Omega = 0$). The overall score for the patient in a specific session is calculated as the difference between the norm of the score resulting from a perfect performance and the norm of the weighted quality measures. The quality measure $Q_j$ of an exercise $j$ is weighted by its number of repetitions. The score of the patient for a given session is calculated using the following equation:

$$ S = \sqrt{\sum_{\Omega \in \Gamma} \left( \frac{n_\Omega}{\sum_{\Omega \in \Gamma} n_\Omega} \, \frac{\mu_{\delta_{H_\Omega}}}{\sigma^{a}_{\delta_{H_\Omega}}} \right)^{2}} \; - \; \sqrt{\sum_{\Omega \in \Gamma} \left( \frac{n_\Omega}{\sum_{\Omega \in \Gamma} n_\Omega} \, Q_\Omega \right)^{2}} \tag{14} $$

where $\Gamma$ is the exercise set, $\Omega$ is an exercise in $\Gamma$, $n_\Omega$ is the number of repetitions for exercise $\Omega$, and $Q_\Omega$ is the quality measure calculated for exercise $\Omega$ using (13). The score S is formulated such that performing a difficult exercise in a session would improve a patient’s score. Furthermore, we assume that exercises with more repetitions in one session are the main focus of that session and therefore, the quality measures Q are weighted by the number of repetitions for each exercise. The score S is defined as the difference between a perfect weighted quality measure and the patient’s weighted quality measure; hence, progress is assumed to result in smaller values for this measure.
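The quality measure and session score can be sketched as below. This follows one reading of the description: the score is the norm of the repetition-weighted quality measures for a perfect performance ($\Delta_\Omega = 0$) minus the same norm for the patient's quality measures. The dictionary layout and the function names are hypothetical, and the order of the difference is an assumption of this sketch.

```python
import math


def quality(delta_patient, mu_h, sigma_h, a=2):
    """Q = (Delta_P - mu_dH) / sigma_dH**a, following (13)."""
    return (delta_patient - mu_h) / sigma_h ** a


def session_score(exercises, a=2):
    """Session score from a list of exercises.

    Each exercise is a dict (hypothetical layout) with keys:
      n       number of repetitions in the session
      delta   patient distance measure for the repetition set
      mu_h    mean of the healthy distance measures for that exercise
      sigma_h standard deviation of the healthy distance measures
    """
    total = sum(e["n"] for e in exercises)
    # Norm of the weighted quality measures for a perfect performance (delta = 0).
    perfect = math.sqrt(sum(
        (e["n"] / total * e["mu_h"] / e["sigma_h"] ** a) ** 2 for e in exercises
    ))
    # Norm of the patient's weighted quality measures.
    actual = math.sqrt(sum(
        (e["n"] / total * quality(e["delta"], e["mu_h"], e["sigma_h"], a)) ** 2
        for e in exercises
    ))
    return perfect - actual
```

Since the function accepts any list of exercises, the same code applies to sessions with different regimens; only the per-exercise healthy statistics need to be available.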

The healthy population’s distance values are often small and have a small variance compared to patient data. To avoid dividing the quality measure by a value less than 1 when normalizing by the healthy population’s distance measure variance $\sigma_{\delta_{H_j}}$, all δ values are scaled uniformly such that all variance values of the healthy population’s distance measures become greater than 1. The flexibility of the algorithm in defining any exercise set allows us to calculate the overall score for any arbitrary set of exercises.

B. HMM-Based Approach

The Hidden Markov Model (HMM) [33] based approach relies on features extracted from HMMs modeling the joint angle timeseries. HMMs are trained on the repetition set of each exercise for the healthy and patient populations.

Individual HMMs are learned for each member of the healthy population and for each session of each patient. Each repetition set is modeled using a 3-state, left-to-right model. State 1 corresponds to moving to the desired posture, state 2 corresponds to holding the desired posture, and state 3 corresponds to returning to the starting posture. The observations of the HMMs are the position, velocity, and acceleration of the joint angles. The mean and variance of the observation distributions in each state are considered as the feature vector

$$ v = \left[\mu^{\mathrm{state1}}_{q_1}\ \ \sigma^{\mathrm{state1}}_{q_1}\ \ \mu^{\mathrm{state2}}_{q_1}\ \ \ldots\ \ \mu^{\mathrm{state3}}_{q_5}\ \ \sigma^{\mathrm{state3}}_{q_5}\right]. \tag{15} $$

LASSO feature selection is used to choose the ten most informative features following Section III-A1. We use the same procedure as Section III-A2 in calculating distance measures: the distance measure, Δ, is calculated using (9), the quality measure Q is calculated using (13), and the overall performance score S is calculated using (14).
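The structure of the model and of the feature vector in (15) can be sketched without committing to a particular HMM library. The sketch below assumes the per-state observation means and variances have already been estimated (for example by Baum-Welch training, which is not shown). The function names and the `state_means[s][c]` layout are hypothetical.

```python
def left_to_right_transitions(stay_probs):
    """Transition matrix of a left-to-right HMM.

    Each state either stays where it is or advances to the next state;
    the final state is absorbing. `stay_probs` holds the self-transition
    probability of every state except the last.
    """
    n = len(stay_probs) + 1
    A = [[0.0] * n for _ in range(n)]
    for i, p in enumerate(stay_probs):
        A[i][i] = p
        A[i][i + 1] = 1.0 - p
    A[n - 1][n - 1] = 1.0
    return A


def hmm_feature_vector(state_means, state_vars):
    """Arrange per-state statistics channel by channel as in (15).

    state_means[s][c] and state_vars[s][c] are the mean and variance of
    observation channel c in state s (hypothetical layout).
    """
    n_states = len(state_means)
    n_channels = len(state_means[0])
    v = []
    for c in range(n_channels):
        for s in range(n_states):
            v.append(state_means[s][c])
            v.append(state_vars[s][c])
    return v
```

For the 3-state model described above, `left_to_right_transitions` would be called with two self-transition probabilities, one each for the moving and holding states.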

The HMM is capable of capturing the statistical essence of a dynamic timeseries. Such a definition of the feature vector is beneficial, because it models the most likely timeseries and the probability of variations. Furthermore, the feature-based approach requires expertise in predefining the features whereas the HMM captures the features that describe the pattern of the data automatically. However, the HMM is computationally more expensive than the feature-based approach and requires multiple samples of the timeseries data, which may not be available. This also means that an individual repetition cannot be evaluated.

Fig. 3. Effect of each source of variability on the correlation index between the average score of each method and the ground truth average score. (a) Noise. (b) Late segmentation. (c) Early segmentation.

Fig. 4. The results of removing each set of exercises on the average overall score for each of the approaches. (a) Feature-based. (b) HMM-based. (c) SVM.

IV. SYNTHETIC DATA EXPERIMENTS

The synthetic data test is designed to evaluate the proposed approaches with a known ground truth signal. The synthetic data is generated for a scalar position, velocity, and acceleration timeseries based on two simulated exercises, an upward bell curve and a downward bell curve. One of the exercises is considered a “hard” exercise, while the other is considered an “easy” exercise. We generate simulated healthy and patient data sets, both including temporal and spatial variability, where the patient range of motion and execution time improve as they advance through sessions. A detailed description of the synthetic data generation procedure is provided in the supplementary material.
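Since the exact generation procedure is only given in the supplementary material, the following is a purely illustrative sketch of how bell-curve repetitions with improving range of motion and execution time might be produced. The Gaussian shape, the width of one sixth of the repetition, the linear improvement schedule, and all names and default values are assumptions of this example.

```python
import math
import random


def bell_repetition(amplitude, n_samples, upward=True, noise_std=0.0, rng=None):
    """One repetition of a bell-shaped position profile, optionally noisy."""
    rng = rng or random.Random(0)
    sign = 1.0 if upward else -1.0
    centre = (n_samples - 1) / 2
    width = n_samples / 6
    return [
        sign * amplitude * math.exp(-0.5 * ((t - centre) / width) ** 2)
        + rng.gauss(0.0, noise_std)
        for t in range(n_samples)
    ]


def derivative(series, dt):
    """Forward differences; the last value is repeated to keep the length."""
    d = [(b - a) / dt for a, b in zip(series, series[1:])]
    return d + d[-1:]


def patient_session(session, n_sessions, healthy_amplitude=1.0, healthy_samples=100):
    """Patient repetition whose amplitude grows and whose duration shrinks
    linearly towards the healthy values over the sessions (assumed schedule)."""
    progress = session / (n_sessions - 1)
    amplitude = healthy_amplitude * (0.5 + 0.5 * progress)
    n_samples = int(healthy_samples * (2.0 - progress))
    return bell_repetition(amplitude, n_samples)
```

Velocity and acceleration timeseries can then be obtained by applying `derivative` once and twice to the position profile.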

The HMM-based approach, feature-based approach, and the Support Vector Machine (SVM)¹ are applied on the synthetic data. When temporal and spatial variability is removed from the synthetic data and all the data is available to all methods, the correlation in average patient scores between the three methods is over 97%. We consider the variability-free, full data case as the ground truth and investigate how correlations between this ground truth and the results from the three approaches are affected under the following conditions: 1) noisy data, 2) poor segmentation, 3) temporal variability, and 4) incomplete data.
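The comparison against the ground truth relies on a correlation between two sequences of per-session average scores. The text does not state which correlation coefficient is used; the sketch below assumes the Pearson coefficient, and the function name is chosen for this example.

```python
def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5
```

Here `x` would hold the average scores of one method under a given disturbance and `y` the ground truth average scores for the same sessions.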

The effect of noise is depicted in Fig. 3(a). The HMM-based approach is least affected while the feature-based approach is most affected. This is because the HMM-based approach generates a model based on a repetition set. The feature-based approach and the SVM have the poorest performance because the features are directly affected by the noise.

¹The SVM is chosen for comparison with the proposed approaches because of the results obtained with the clinical data, as described in Section V.

In the second set of tests, the effect of poor segmentation is investigated. Two types of segmentation error are possible: late segmentation and early segmentation. Late segmentation is modeled by adding points with a constant value to the end of the joint angle position in the patient data set. The results are shown in Fig. 3(b). The feature-based approach is the least affected; the HMM-based approach is affected when the number of points is large enough to alter the states. The SVM is also affected in many cases. Early segmentation results in an incomplete timeseries. The results of this test are depicted in Fig. 3(c). The feature-based and SVM-based approaches are not much affected by this variability. This is a limitation of the predefined features, since they do not consider the timing and do not include enough information to capture the difference between a complete timeseries and an incomplete timeseries. However, the HMM constructs a statistical model of the timeseries and is significantly affected by the incomplete data. In the third set of tests, the effect of scaling the length of the timeseries in the patient data set is analyzed. None of the approaches are significantly affected by this variability.
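The two segmentation errors can be mimicked with a short sketch. The constant used for padding is not specified in the text; repeating the final sample is an assumption of this example, as are the function names and the toy timeseries.

```python
def late_segmentation(series, extra_points):
    """Segment ends too late: pad the end with a constant (the last value)."""
    return series + [series[-1]] * extra_points


def early_segmentation(series, kept_fraction):
    """Segment ends too early: keep only the first part of the repetition."""
    keep = max(1, int(len(series) * kept_fraction))
    return series[:keep]


def range_of_motion(series):
    """Range of motion as in (3)."""
    return max(series) - min(series)


clean = [0, 1, 2, 3, 2, 1, 0]              # toy repetition, peak in the middle
late = late_segmentation(clean, 3)          # three extra constant samples
early = early_segmentation(clean, 0.8)      # last part of the return is missing
```

In this toy case the range of motion is identical for `clean`, `late`, and `early`, because the cut does not remove the peak, while the number of samples changes. This is consistent with the observation that the predefined features carry little timing information.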

The fourth test is designed to investigate the effects of set sparsity when data of some exercises are not available in one session. Fig. 4 shows the values of the average overall score for each exercise when the hard or the easy exercises are not available. It can be seen from the results that set sparsity results in jumps and inconsistencies in the overall score.

V. CLINICAL DATA EXPERIMENTS

The proposed approaches are evaluated on a data set from patients recovering from knee or hip replacement surgery. In this type of rehabilitation, the following exercises are commonly performed: knee extension/flexion while seated, knee and hip extension/flexion while supine, and squat.² Data for these exercises was also recorded for a healthy population. The feature-based and the HMM-based approaches are evaluated for two cases: 1) healthy population data and a subset of patient data are available for training, and 2) only healthy population data is available for training. The healthy population data is only used to learn a reference model, and results are presented for the patient data.

A. Data Collection and Pre-Processing

Motion data is collected using Shimmer sensors [34] mounted at the knee and ankle, providing angular velocity and linear acceleration data (128 Hz). Position q, velocity q̇, and acceleration q̈ of five joint angles, consisting of knee flexion, knee rotation, hip flexion, hip abduction, and hip rotation, are estimated from the sensor data based on a kinematic model and an Extended Kalman Filter [5]; see Fig. 1.

Patient data was collected from eighteen inpatients during their rehabilitation at the Toronto Rehabilitation Institute. Each patient performs one 45–60 minute session per day. The number of days a patient stays in the hospital varies from 4 to 12 days and depends on the patient’s needs and health status. The set of exercises specified by the therapist in each session differs between patients and sessions. Therefore, the repetition set of one specific exercise is not available for every session. The healthy population data consists of 10 subjects (age: 23 ± 4.5) performing each exercise 10–20 times. The patient population tends to be elderly. Therefore, the healthy population performed the exercises slowly to minimize the speed difference between the two populations.

²The patients do not perform the full squat but lower their torso only slightly (i.e., a knee bend of 15 degrees).

TABLE I. PATIENT INFORMATION³

Fig. 5. Patient data and healthy population data are separable for the most informative features. The star indicates the first day the patients have performed the exercises and the triangle indicates the last day the patients have performed the exercises (only 1 session available for patient 5). q1 is the joint angle corresponding to hip extension and q4 is the joint angle corresponding to knee extension. (a) Knee extension/flexion. (b) Knee hip extension/flexion. (c) Squats.

A subset of the patient data is used for feature selection (patients 1, 2, 3, 5, 6, and 7); all patient data is used for testing. We selected patients 2, 8, and 18 to graphically illustrate the results within this paper. Patient 2 is one of the patients included in the feature selection, and shows gradual improvement during the rehabilitation process. Patients 8 and 18 are examples of patients whose data is not used for feature selection; their plots illustrate the performance of estimating progress for subjects whose data is unseen during training. Patient 18 is an example of a patient who shows rapid progress in the course of their rehabilitation, while patient 8 is an example of a patient with a common duration of recovery. General information for patients 2, 8, and 18 is summarized in Table I, including any unique circumstances. Plots and information for all 18 patients can be found in the supplementary material.

B. Feature-Based Approach

The LASSO technique described in Section III-A1 is used for feature selection. Fig. 5 shows the distribution of the repetition timeseries of the healthy population and the training subset of the patient data over the two features selected by LASSO that have the largest variance in the healthy population. The clusters of the healthy population data and the patient data are separable. Furthermore, Fig. 5 shows that a patient’s progress is in the direction of the variance of the healthy population data and moves toward the mean of the healthy population over the course of rehabilitation.

³Patient information for all 18 patients is provided in the supplementary material.

Fig. 6. The measure of performance δ for each repetition timeseries is illustrated for session 2 of patient 2 performing the knee extension/flexion exercise.

Fig. 7. Results for the distance measure δ calculated using the feature-based approach are shown for three exercises. The red circle illustrates the median of the distance measures (i.e., Δ) in each session and the blue bar depicts the variance of the distance measures δ in one session. The size of the circle indicates the number of repetitions available in each repetition set. For knee extension the top features are min q4, mean q1, mean q4, rom q4, and time. (a) P2 knee extension. (b) P18 knee extension. (c) P8 knee extension.

Fig. 6 illustrates the values of δ for the second session of patient 2. As can be seen from this figure, the approach captures the variation in exercise performance over the course of multiple repetitions. Fig. 7 shows the calculated distance measure Δ and the distribution of δ for the 3 example patients. The exercise regimen is specific to each patient. The exercises are performed in a subset of the sessions, e.g., patient 2 performs knee extensions in sessions 1, 2, 7, 8, and 10. Furthermore, factors such as pain, fatigue, psychological status, and environmental conditions contribute to patients’ performance, and it cannot be expected that patient progress increases monotonically. For all three patients an overall improvement over the course of the physiotherapy treatment can be observed. Some patients show rapid progress and are discharged early, e.g., patient 18 [in Fig. 7(b)]. The distance measure for a repetition set is more reliable when the number of repetitions available for that exercise is larger. The feature-based approach generalizes to unseen patient data, e.g., the data of patients 8 and 18 was not used for the feature selection.
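The per-session summary plotted in Fig. 7 (median of δ as Δ, variance of δ as the bar, number of repetitions as the marker size) can be sketched as follows. The function name and the δ values are hypothetical, and the population variance is used here as an assumption, since the text does not state which variance estimator is plotted.

```python
from statistics import median, pvariance

def summarize_session(deltas):
    """Return (Delta, spread, n) for one repetition set.

    Delta is the median of the per-repetition distances, spread is their
    population variance (an assumption), and n is the repetition count.
    """
    if not deltas:
        raise ValueError("a repetition set needs at least one repetition")
    return median(deltas), pvariance(deltas), len(deltas)

# Hypothetical per-repetition distances for two sessions of one exercise.
sessions = {
    1: [-3.0, -1.0, -2.0],
    2: [-1.5, -0.5],
}
for number, deltas in sessions.items():
    big_delta, spread, n = summarize_session(deltas)
    print(number, big_delta, spread, n)
```

A session with very few repetitions yields a median that a single poor repetition can shift, which is one way to read the remark that Δ is more reliable for larger repetition sets.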

The quality measure Q and the overall score S for each session are obtained according to (13) and (14) using the overall distance measure Δ calculated for every repetition set of each session. We assume that as patients improve, the overall score increases from negative values toward zero.

Fig. 8 shows the score measures for each patient. It can be seen from the figures that the trend of the score shows progress but there are some inconsistencies in patient 2 session 8 [in Fig. 11(a)]. These inconsistencies are caused by a small number of performed exercise repetitions.

Fig. 8. An overall score S is calculated for an exercise set, combining individual distance measures Δ of knee extension, knee-hip extension, and squat. The size of the marker indicates the number of exercises available in each session. The green line shows the best score of the patients in their last physiotherapy session. (a) Patient 2. (b) Patient 18. (c) Patient 8.

Fig. 9. HMM-based distance measure ΔHMM shows the trend in progress over the sessions, here illustrated for patients 2, 8, and 18. The marker size indicates the number of repetitions available in the repetition set. (a) P2 knee extension. (b) P18 knee extension. (c) P8 knee extension.

Due to differences in health status, the exercise regimen of each session is different from one patient to the other. Among the three exercises chosen for analysis in this study, there are sessions where only one of these exercises is performed and therefore the score is based solely on the performance quality of that single exercise. This results in inconsistencies in the improvement trend of the score measure since a poor performance in one exercise is not an accurate measure of the patient’s overall status. The score measure estimates the patient’s overall status more accurately when more exercise data from each session is available.

Visual analysis of the distribution of the most relevant features (see Fig. 5) shows that patient variation over the course of rehabilitation is in the direction of healthy data variability. This motivates investigating whether healthy population data alone is sufficient to select the most relevant features. Such an approach is beneficial when a physiotherapist includes a new exercise in the exercise regimen and patient data is not yet available for this exercise. Healthy population data can be easily collected by the physiotherapist performing the new exercise himself or herself. We investigate this extension by using only healthy population data for feature selection. Features affected by initial posture and sensor positioning vary strongly in the healthy population. Because we are only considering the variation in the healthy population, such features could get selected using our current approach. To avoid this, features obtained from joint angle position were removed from the feature vector. The top features are chosen from the fifteen most variant features that correlate less than 0.5 with each other. Equation (9) is used to calculate the distance measure for each repetition timeseries, (10) is used to calculate the distance measure for the repetition set, and (13) and (14) are used to calculate the overall score for each session.

Fig. 10. Overall score SHMM shows the trend of progress during rehabilitation. The marker size indicates the number of exercises available for each session. The green line shows the best score of the patients in their last session of performing the three exercises. (a) Patient 2. (b) Patient 18. (c) Patient 8.
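The healthy-only selection rule (fifteen most variant features, keeping those that correlate less than 0.5 with each other) can be sketched as a greedy filter. This is one possible reading, not the authors' implementation: the text does not say in which order candidates are compared, so the sketch assumes they are visited from most to least variant, and it assumes the feature columns are already on comparable scales. All names are hypothetical.

```python
from math import sqrt

def variance(xs):
    """Population variance of one feature over the healthy repetitions."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either column is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def select_features(columns, n_candidates=15, max_corr=0.5):
    """Greedy pick among the most variant features (assumed ordering).

    columns maps a feature name to its values over healthy repetitions.
    """
    ranked = sorted(columns, key=lambda name: variance(columns[name]),
                    reverse=True)
    selected = []
    for name in ranked[:n_candidates]:
        if all(abs(pearson(columns[name], columns[kept])) < max_corr
               for kept in selected):
            selected.append(name)
    return selected

# Toy example: "b" duplicates "a" and is rejected; "c" is uncorrelated.
healthy = {
    "a": [0.0, 10.0, 0.0, 10.0],
    "b": [0.0, 9.0, 0.0, 9.0],
    "c": [0.0, 0.0, 5.0, 5.0],
}
print(select_features(healthy))
```

Removing the position-derived features beforehand, as the text describes, would amount to dropping those keys from `columns` before calling the function.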

We analyze the correlation between the overall score from the feature-based approach when using healthy and patient population data for feature selection and when using only healthy population data. The results correlate highly (over 0.65) for most patients. Even though feature selection based only on healthy population data does not take compensation strategies specific to the patient population into account, the extension using only healthy population data for feature selection can detect patient progress. When the overall performance of a patient is constantly high (patients 7 and 15) or low (patient 5) over the course of the rehabilitation, changes in the scores are small. In these cases, the correlation index can be low, because the two techniques differ when assessing small changes in performance.

C. HMM-Based Approach

Fig. 9 shows the overall distance measure Δ (see Section III-B) calculated for the sessions when patients 2, 8, and 18 performed the knee extension exercise. The features chosen by LASSO indicate the progress in unseen data by decreasing δ values over the course of the rehabilitation sessions, e.g., shown for patient 8 in Fig. 9(c).

Based on the quality measure Q [see (13)] of each exercise in one session, the overall score S for an entire exercise set is calculated using (14). Fig. 10 shows the scores SHMM for each patient. The scores show an overall trend of improvement for most patients. As before, the reliability of the score measure depends on the number of available exercises, i.e., outliers are usually observed when only one exercise is available to calculate the score. The features chosen by LASSO generalize well to the unseen data, e.g., patient 8, whose trend of improvement is captured by the approach [in Fig. 10(c)]. Furthermore, the method captures the rapid improvement of patient 18 [in Fig. 10(b)].

Moreover, we investigate whether healthy population data is sufficient for feature selection for the HMM-based approach. Feature extraction for the healthy population is described in Section III-B. Among the first fifteen most variant features in the healthy population, those that correlate least with each other (less than 0.5) are chosen as the top features. The distance measure ΔHMM and the overall score SHMM for each patient and each session are calculated following Section III-B. The correlation between the overall score using healthy and patient population data for feature selection and using only healthy population data for feature selection is above 0.65 for most patients. Negative correlation indices are observed when the changes in a patient’s progress are small.

If only healthy population data is available, an intuitive distance measure δ for the HMM-based approach is the log-likelihood, a common approach in the gesture and motion recognition literature [8], [9], [35]. To test this approach, we compute the likelihood that a repetition timeseries of a patient is generated by an HMM trained only on healthy population data. The median of the log-likelihoods is considered as the distance measure Δ for a repetition set. As patients improve and their performance becomes more similar to the healthy population, the log-likelihood should increase. However, this method does not capture the trend of progress for 80% of patients, since distance measures Δ 2–3 times larger than their average range are observed for many repetition sets.
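The log-likelihood baseline can be illustrated with a forward algorithm computed in log space. This is a simplified sketch under stated assumptions, not the model from Section III-B: it uses a one-dimensional Gaussian emission per state, whereas the recorded data has several joint angles, and the two-state parameters below are hypothetical rather than trained on healthy data.

```python
from math import exp, log, pi
from statistics import median

def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (log(2 * pi * var) + (x - mean) ** 2 / var)

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for finite values."""
    top = max(values)
    return top + log(sum(exp(v - top) for v in values))

def log_likelihood(series, log_start, log_trans, means, variances):
    """Forward algorithm for a Gaussian-emission HMM, in log space."""
    n_states = len(means)
    alpha = [log_start[s] + log_gauss(series[0], means[s], variances[s])
             for s in range(n_states)]
    for x in series[1:]:
        alpha = [
            logsumexp([alpha[r] + log_trans[r][s] for r in range(n_states)])
            + log_gauss(x, means[s], variances[s])
            for s in range(n_states)
        ]
    return logsumexp(alpha)

def set_distance(repetitions, model):
    """Median log-likelihood over a repetition set (Delta in the text)."""
    return median(log_likelihood(rep, *model) for rep in repetitions)

# Hypothetical "healthy" model: state 0 near angle 0, state 1 near angle 1.
model = (
    [log(0.9), log(0.1)],
    [[log(0.8), log(0.2)], [log(0.1), log(0.9)]],
    [0.0, 1.0],
    [0.05, 0.05],
)
print(set_distance([[0.0, 0.1, 0.9, 1.0], [0.0, 0.5, 0.5, 0.4]], model))
```

Because the log-likelihood sums one term per sample, it scales with the length of the timeseries, which is one plausible contributor to the large excursions in Δ reported above.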

D. Validation

The patient’s physical status is visually assessed by the physiotherapist in each rehabilitation session. The physiotherapist uses this assessment to formulate the patient’s regimen and decide his or her treatment duration. This evaluation is subjective and does not have a quantified interpretation. Quantified measurements may also be taken (e.g., range of motion score, Berg scale), but these are typically not recorded for every session.

While direct quantified expert evaluation is not available for session-by-session comparison, exercise difficulty and duration can be used as an indirect measure of physiotherapist (PT) assessment. In the first sessions of rehabilitation, exercises recommended by the physiotherapist are mostly composed of supine and sitting exercises with very few repetitions. As patients improve, the recommended exercise regimen becomes harder (addition of standing exercises) and includes more repetitions. Therefore, the exercise regimen can be utilized to obtain an estimate of the clinical assessment of the patient’s overall performance, i.e., overall score. To validate our approaches, we compare the score measures with an estimate of patient progress calculated based on their exercise regimen and we provide a qualitative comparison of the score measures with the physiotherapist notes in the patient health charts. Finally, we compare the proposed approaches with classifier-based approaches.

1) Comparison With Estimate of Patient Progress From Exercise Regimen: To estimate a measure of patient progress from the exercise regimen, we use the complete information of all exercise sets available from all patients. We consider the exercises performed in the last session for patients with fewer than 4 sessions, and the exercises performed in the last two sessions for patients with more than 4 sessions, as the hard exercise set.⁴ The first session exercises are considered as the easy exercise set. For each exercise, the number of patients who performed the exercise in their last two sessions is counted and this number is divided by the total number of patients to obtain the probability that the exercise belongs to the hard exercise set, p_H(Ω). We eliminate the exercises performed by fewer than three patients, i.e., exercises with probability less than 0.15, from the hard exercise set. The same procedure is performed to determine the probability of belonging to the easy exercise set, p_E(Ω). If an exercise is not in the hard exercise set, the probability that this exercise belongs to the hard exercise set, p(H|Ω), is assigned a value of 0.01. The same approach is used for generating the probability that an exercise belongs to the easy exercise set, p(E|Ω).

⁴We consider the last two sessions because for most patients the last session is a last check-up and contains very few exercises.

Fig. 11. Correlation between the overall score calculated for each method and the ground truth for each patient. The data of patients 10 and 11 has only 1 session available and therefore correlation cannot be calculated. (a) Feature-based approach. (b) HMM-based approach.

The overall measure of progress for each session and each patient is calculated as

$$S^{i}_{GT} = \frac{\sum_{\Omega \in \Gamma} \log\left(p(H|\Omega)\right)}{\sum_{\Omega \in \Gamma} \log\left(p(E|\Omega)\right)}$$

$$S_{GT} = [S^{1}_{GT}, S^{2}_{GT}, \ldots] \qquad (16)$$

where i is the session number, $S^{i}_{GT}$ is the ground truth overall score for each session, and $S_{GT}$ is the overall score for all the sessions.
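The construction of the regimen-based estimate can be sketched as below. Two caveats apply. First, the expression in (16) is read here as a ratio of summed log-probabilities over the exercises of a session, which is an interpretation of the printed formula rather than a confirmed form. Second, all names and the example exercises are hypothetical. The 0.15 cut-off and the 0.01 fallback value are the ones stated in the text.

```python
from math import log

FLOOR = 0.01     # value assigned to exercises outside a set (from the text)
MIN_PROB = 0.15  # exercises below this probability are dropped (from the text)

def set_probabilities(sessions_per_patient, n_patients):
    """Fraction of patients who performed each exercise.

    sessions_per_patient holds one set of exercise names per patient, e.g.
    the exercises of the last session(s) for the hard set.
    """
    counts = {}
    for exercises in sessions_per_patient:
        for name in exercises:
            counts[name] = counts.get(name, 0) + 1
    return {name: c / n_patients for name, c in counts.items()
            if c / n_patients >= MIN_PROB}

def session_score(session_exercises, p_hard, p_easy):
    """Ratio of summed log-probabilities (assumed reading of (16))."""
    num = sum(log(p_hard.get(e, FLOOR)) for e in session_exercises)
    den = sum(log(p_easy.get(e, FLOOR)) for e in session_exercises)
    if den == 0.0:
        raise ValueError("denominator is zero: every p(E|exercise) equals 1")
    return num / den

# Hypothetical probabilities for two exercises.
p_hard = {"squat": 0.5}
p_easy = {"seated knee extension": 0.5}
print(session_score(["squat"], p_hard, p_easy))
print(session_score(["seated knee extension"], p_hard, p_easy))
```

Only the correlation of this sequence with the proposed scores is used in the comparison that follows, so the sketch makes no claim about the sign convention of the original implementation.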

Fig. 11 shows the correlation index comparing each method’s overall score, S, for each patient with the overall score obtained from (16). The two approaches correlate moderately in most cases (over 62% for the feature-based approach and over 55% for the HMM-based approach). Low correlations occur mostly in cases where few exercises are available for evaluation in many of the sessions, e.g., patients 3 and 4. As mentioned in Section IV, the inconsistency in the number of available exercises between different sessions can cause jumps in the overall score, which in turn results in a poor correlation with the ground truth. The cases where the clinical assessment does not correlate well with our proposed approaches are caused either by patients who do not show a visible change in their overall score or by patients who have gaps in the number of available exercises in more than half their sessions. When these patients are excluded (9 patients out of 16 remain), the mean of the correlation becomes 67% for the feature-based approach and 72% for the HMM-based approach.

2) Qualitative Comparison With Patient Health Charts: Physiotherapists assess and record patient performance and condition on admission. Even though these assessment forms often contain unfilled sections and are mostly qualitative, they include information about the initial status of the patients. Furthermore, patient performance during rehabilitation is sometimes recorded by the physiotherapists in the daily charts. In this section, we provide the physiotherapist assessments for the exemplar patients and compare these evaluations to the proposed overall scores.

Patient 2 was admitted to the hospital after a hip joint replacement surgery. Based on the first day assessment, she was forbidden to perform hip abduction due to hip precautions. She was capable of bearing her weight, but needed assistance for rolling in bed and transferring from bed to wheelchair. She used a walker and was capable of walking for 2 meters only. The range of motion score in the recovering leg was 8/18 and the patient had a high risk of fall according to her stability test results. On the night before session 5, the patient fell, causing pain in her lower extremity joints and therefore affecting her performance in session 5. This information matches the overall scores calculated for both of the proposed approaches in Figs. 8 and 10. The patient was sent back to the ER in session 8 due to complications unrelated to her surgery; the effects of this incident on her performance are captured by both of the proposed approaches in Figs. 8 and 10. The patient came back a week later to continue her physical rehabilitation. In session 9, the patient was able to walk 70 meters with a walker with supervision and her range of motion score for the rehabilitated leg became 18/18. This progress is captured in the 9th and 10th sessions by both approaches in Figs. 8 and 10.

Patient 8 was admitted to the hospital after a hip joint replacement surgery. Based on her first day assessment, she was capable of bearing her own weight but was feeling severe pain. She had high bed mobility but needed assistance in transitioning from bed to wheelchair. She had a high fall risk according to her stability test and was capable of walking for 10 meters only. Her range of motion score was 12/18 and she could perform the exercises with assistance. In her second session she was capable of performing all her transfers independently and was capable of walking 50 meters independently using a walker. The proposed approaches both capture the progress for this patient between sessions 1 and 3 in Figs. 8 and 10. In her 9th session, she performed 20 repetitions of bilateral exercises, which indicates improvement in her performance. The physiotherapist did not record the range of motion score at discharge.

Patient 18 was admitted to the hospital due to knee replacement surgery. He had a high risk of falls and had normal bed mobility. He needed supervision for bed to wheelchair transfer. He could walk 30 meters with supervision. In session 3, he had two physiotherapy sessions where he walked 40 meters supervised using a walker in the morning and 70 meters in the afternoon. Our scores capture this rapid progress for this patient between the first and third sessions in Figs. 8 and 10. For patient 18, the physiotherapist did not record the range of motion score, either on admission or at discharge.

E. Comparison With Classifier Based Approaches

We provide a comparison of the proposed measures to classifier-based approaches trained using both healthy and patient data. Popular classifiers in the human motion literature are Naive Bayes (NB) [36], Kullback-Leibler (KL) divergence [18], and Support Vector Machines (SVM) [37]. For NB classification, we use the top features selected by LASSO in Section III-A2. The probability of belonging to the healthy population class, normalized by the summation of probabilities of belonging to the healthy or patient class, is considered as the distance measure δNB for each repetition timeseries. To compute the KL divergence, an HMM is trained for the entire healthy population and on every repetition set of every exercise and each patient. The symmetric KL divergence [33] for each patient is calculated for each repetition timeseries and the average of these values is considered as the measure of progress for one repetition timeseries. Both the NB and the KL approaches were unable to capture any trend of progress for the patients. The SVM provided the best results; therefore, we base our comparison on the results obtained from an SVM.

Fig. 12. The correlation index between (a) SVM and feature-based approach and (b) SVM and HMM-based approach for the three exercises: knee extension/flexion (KEF), knee hip extension/flexion (KHEF), and squat is for most cases above 0.6. (a) Correlation between SVM and feature-based approach for different exercises. (b) Correlation between SVM and HMM-based approach for different exercises.
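The normalized healthy-class probability used as δNB can be sketched with a small Gaussian Naive Bayes model. This is an illustration under assumptions that the text does not state: Gaussian per-feature likelihoods, equal class priors, and a small variance floor. The function names and feature values are hypothetical.

```python
from math import exp, log, pi

def fit_class(rows):
    """Per-feature (mean, variance) for one class; rows are feature vectors."""
    n = len(rows)
    stats = []
    for column in zip(*rows):
        mean = sum(column) / n
        # Small floor keeps the variance strictly positive.
        var = sum((v - mean) ** 2 for v in column) / n + 1e-6
        stats.append((mean, var))
    return stats

def class_log_likelihood(x, stats):
    """Sum of independent per-feature Gaussian log densities."""
    return sum(-0.5 * (log(2 * pi * var) + (v - mean) ** 2 / var)
               for v, (mean, var) in zip(x, stats))

def delta_nb(x, healthy_stats, patient_stats):
    """Healthy-class probability normalized over both classes."""
    lh = class_log_likelihood(x, healthy_stats)
    lp = class_log_likelihood(x, patient_stats)
    top = max(lh, lp)  # subtract the maximum before exponentiating
    eh, ep = exp(lh - top), exp(lp - top)
    return eh / (eh + ep)

# Hypothetical two-feature training data for each class.
healthy = fit_class([[1.0, 0.0], [1.2, 0.1], [0.8, -0.1]])
patient = fit_class([[0.2, 0.5], [0.3, 0.6], [0.1, 0.4]])
print(delta_nb([1.0, 0.0], healthy, patient))
```

With well separated classes, this quantity saturates near 0 or 1 for most repetitions, which leaves little room to express gradual change.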

The SVM formulation used is as given in [38], [39]. For the purposes of this study, we use a soft margin SVM with a linear kernel. The SVM is trained using the top features chosen by LASSO that have the highest variances in the healthy population. We consider the distance to the SVM decision hyperplane (the magnitude of the SVM-margin) as the distance measure δSVM for a repetition timeseries of an exercise in each session. The median of these values is considered as the overall distance measure ΔSVM for a repetition set. The correlations between the distance measures obtained from the SVM and the feature-based approach are shown in Fig. 12(a), while those between the SVM and the HMM-based approach are depicted in Fig. 12(b). Results are reported for those patients who performed knee extension, knee hip extension, and squat exercises during rehabilitation. In 94% of the investigated cases for both approaches, the correlation is high (above 0.75). Only for a few cases, e.g., for patient 9, are small or negative correlations observed. For these cases the patient’s status remains either constantly poor or constantly good over all the sessions, and the methods disagree on the small trends of improvement, resulting in correlations less than 0.2. Such small differences in individual exercise trends do not affect the informative value of the overall score, which provides a quantitative assessment of progress status, e.g., the scores of patients 2 and 8 are above the green line in both Figs. 8 and 10. Figs. 8 and 10 also capture the rapid recovery of patient 18.
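For a linear kernel, the distance of a feature vector to the decision hyperplane has a closed form. The sketch below assumes the geometric distance |w·x + b| / ||w||; the text does not say whether the margin is normalized by ||w||, and the weight vector and bias shown are hypothetical values, not the result of training.

```python
from math import sqrt
from statistics import median

def distance_to_hyperplane(features, weights, bias):
    """Unsigned distance |w.x + b| / ||w|| to a linear decision boundary."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return abs(activation) / sqrt(sum(w * w for w in weights))

def set_distance_svm(repetition_features, weights, bias):
    """Median of the per-repetition distances for one repetition set."""
    return median(distance_to_hyperplane(f, weights, bias)
                  for f in repetition_features)

# Hypothetical linear boundary and three repetitions of one exercise.
weights, bias = [3.0, 4.0], -5.0
repetitions = [[3.0, 4.0], [0.0, 0.0], [1.0, 0.5]]
print(set_distance_svm(repetitions, weights, bias))
```

Using the unsigned distance follows the phrase "magnitude of the SVM-margin"; a signed variant would additionally indicate on which side of the boundary a repetition falls.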

VI. DISCUSSION

Monitoring exercise performance during physiotherapy can provide an objective measure of patient progress. Movement performance shows temporal and spatial variability caused by multiple sources, including the stochastic nature of muscle recruitment, as well as individual differences in height, age, pain, fatigue, and progress. The objective of the proposed distance measures and the overall score is to capture the variability caused by improvement and progress over the course of the physiotherapy treatment.

In this paper, we estimate a continuous measure of patient performance to capture their progress through rehabilitation, whereas most existing works [10], [11] can only separate healthy from patient data using classification. We formulate a measure of performance for an exercise set, whereas most current works [11], [24] consider only a single exercise. Moreover, we evaluate our approach both on synthetic and patient data, whereas many of the current works focus on synthetic analysis and simulated data only. Furthermore, our proposed approach can be used when patient data for a motion is not available, whereas the classification techniques require both healthy and patient data for training. The proposed approaches achieve generalization to new patients by including healthy population data as reference. Furthermore, the score measure formulations can be applied to any set of exercises as long as the corresponding healthy population data is available. This flexibility enables the physiotherapist to include patient-specific or novel exercises requiring only a healthy reference set. The score measure is formulated in a way to handle individual exercise regimens and a variable number and type of exercises.

To enable feature selection when little or no patient data is available, we assumed that the healthy population exhibits the same compensation strategies as the patients to a smaller degree. This hypothesis is formulated on the basis that difficult motions result in compensatory strategies in human motion. This assumption is supported for the three exercises discussed in this paper, as shown in Fig. 5. In the absence of patient data, we considered the most variant features in the healthy population as the top features. However, feature selection using both healthy and patient population data is more accurate because it allows the method to detect the compensatory strategies which are specific to the patient population.

For both the feature-based and the HMM-based approach, the distance measure Δ for a repetition set and the overall score S for an exercise set assess patient progress. The feature-based approach is faster to compute, whereas the HMM-based approach provides details about each stage of the motion.

We also compared the proposed approach to estimating patient progress based on the magnitude of the SVM-margin between the healthy and patient population data. Our proposed approach has a high degree of correlation with the SVM-based approach, while requiring less training data. The SVM requires feature selection on top of our LASSO feature selection to identify the most variant features, and requires training data from both the healthy and patient population. In its current form, the SVM is not capable of capturing the progress based on different exercises. We combine the SVM approach for generating distances with our approach to generate the overall performance score for multiple exercises using the SVM. The results obtained with synthetic data illustrate that the proposed approaches are superior to this classification method in the presence of noise, inaccurate segmentation, and incomplete timeseries.

We also compared the proposed approach to physiotherapist evaluations, both quantitatively, by computing the correlation between the progress estimate and the advancement of the exercise regimen, and qualitatively. The method correlates well with physiotherapist evaluations, but it is not yet possible to determine from the current data set whether the proposed approach enhances the ability of the physiotherapist to perform diagnosis and assessment. This will be the focus of our future work. Even if the proposed approach does not provide additional useful information over what a physiotherapist can observe visually, it can be used when a physiotherapist is not available to observe a patient’s motion (e.g., when a physiotherapist is observing multiple patients in the same session or when the patient is performing rehabilitation at home).

The proposed measures consider the improvement due to exercise performance; other factors such as pain and psychological status are not included in our analysis. Different pain treatments can affect motion performance, e.g., reducing painkiller medication may lead to a decrease in observed exercise performance even though the overall health status improves.

The order of exercises performed in obtaining the overall score is not considered in the proposed formulation, and effects of fatigue on movement performance are not included. Exercises vary in their difficulty, and the variance of the healthy population’s performance is considered as an estimate of exercise difficulty. For patients, exercise difficulty may further depend on the type of surgery. Variance in the healthy population depends further on fitness level and familiarity with an exercise.

The proposed approaches can be used both to provide information about how well a patient performs a specific task and repetitions of that task, and also to identify what is different between the ideal motion and the patient’s motion. However, since the proposed approaches calculate the performance measures based on a set of features, the information about the contribution of each feature and the reasons for the observed difference between the patient’s performance and the healthy performance is not captured. To determine the cause of the difference in performance between the patient and the healthy population, either the features need to be further investigated or the hypothesized causes of the difference should be explicitly modeled, e.g., for fatigue [40].

VII. CONCLUSION AND FUTURE WORK

A quantified and continuous measure of performance can be beneficial for monitoring patient progress during the course of physiotherapy rehabilitation. This work introduces two approaches, feature-based and HMM-based, for capturing the continuous change in patient data. A distance measure is introduced as a measure of performance for a repetition timeseries and repetition set. The overall score is then calculated for the exercise set in each session and captures the overall performance of the patient. The proposed approaches are evaluated on data of exercises commonly performed after hip or knee replacement surgery. The results show that the proposed approach is able to track patient progress over the course of treatment. Future work will consider evaluating with an age-matched healthy population or physiotherapist demonstrations and evaluating on a larger set of exercises. The proposed approach could also be enhanced by considering the order of the exercises in the formulation, subject-independent measures of exercise difficulty, and fatigue and pain. Future directions may also investigate the smallest clinically important difference of the proposed score and whether physiotherapists using the new distance metrics gain additional clinically relevant information not available through visual observation alone.

ACKNOWLEDGMENT

The authors would like to thank the patients and the physiotherapists of the Toronto Rehabilitation Institute.

REFERENCES

[1] J. A. Howe, E. L. Inness, A. Venturini, J. I. Williams, and M. C. Verrier, “The community balance and mobility scale-a balance measure for individuals with traumatic brain injury,” Clin. Rehabil., vol. 20, no. 10, pp. 885–895, Oct. 2006.

[2] L. Yardley et al., “Development and initial validation of the falls efficacy scale-international,” Age Ageing, vol. 34, no. 6, pp. 614–619, Nov. 2005.

[3] C. C. Norkin and D. J. White, Measurement of Joint Motion: A Guide to Goniometry, 4th ed. Philadelphia, PA, USA: Davis, 2009.

[4] M. F. Levin, J. A. Kleim, and S. L. Wolf, “What do motor ‘recovery’ and ‘compensation’ mean in patients following stroke?” Neurorehabil. Neural Repair, vol. 23, no. 4, pp. 313–319, May 2009.

[5] J. F. Lin and D. Kulic, “Human pose recovery using wireless inertial measurement units,” Physiol. Meas., vol. 33, no. 12, p. 2099, Dec. 2012.

[6] J. Lin and D. Kulic, “On-line segmentation of human motion for automated rehabilitation exercise analysis,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 22, no. 1, pp. 168–180, Jan. 2014.

[7] D. Kulic, D. Lee, C. Ott, and Y. Nakamura, “Incremental learning of full body motion primitives for humanoid robots,” in Proc. Humanoids, 2008, pp. 326–332.

[8] H.-K. Lee and J. Kim, “An HMM-based threshold model approach for gesture recognition,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 21, no. 10, pp. 961–973, Oct. 1999.

[9] J. Alon, S. Sclaroff, G. Kollios, and V. Pavlovic, “Discovering clusters in motion time-series data,” in Proc. CVPR, 2003, pp. I-375–I-381.

[10] P. E. Taylor, G. J. Almeida, T. Kanade, and J. K. Hodgins, “Classifying human motion quality for knee osteoarthritis using accelerometers,” in Proc. EMBC, 2010, pp. 339–343.

[11] Z. Zhang, Q. Fang, L. Wang, and P. Barrett, “Template matching based motion classification for unsupervised post-stroke rehabilitation,” in Proc. ISBB, 2011, pp. 199–202.

[12] J. F.-S. Lin and D. Kulic, “Human pose recovery for rehabilitation using ambulatory sensors,” in Proc. EMBC, 2013, pp. 4799–4802.

[13] J. Aggarwal and M. S. Ryoo, “Human activity analysis: A review,” ACM Comput. Surv., vol. 43, no. 3, p. 16, Apr. 2011.

[14] P. J. Phillips, S. Sarkar, I. Robledo, P. Grother, and K. Bowyer, “The gait identification challenge problem: Data sets and baseline algorithm,” in Proc. ICPR, 2002, pp. 385–388.

[15] D. Glowinski et al., “Toward a minimal representation of affective gestures,” IEEE Trans. Affective Comput., vol. 2, no. 2, pp. 106–118, Apr./May 2011.

[16] A. Nakazawa, S. Nakaoka, K. Ikeuchi, and K. Yokoi, “Imitating human dance motions through motion structure analysis,” in Proc. IROS, 2002, pp. 2539–2544.

[17] A. Jaimes and N. Sebe, “Multimodal human-computer interaction: A survey,” Comput. Vis. Image Understanding, vol. 108, no. 1/2, pp. 116–134, Nov. 2007.

[18] D. Kulic, G. Venture, and Y. Nakamura, “Detecting changes in motion characteristics during sports training,” in Proc. EMBC, 2009, pp. 4011–4014.

[19] L. Van Gestel et al., “Probabilistic gait classification in children with cerebral palsy: A Bayesian approach,” Res. Develop. Disabil., vol. 32, no. 6, pp. 2542–2552, Nov./Dec. 2011.

[20] R. Ali, L. Atallah, B. Lo, and G.-Z. Yang, “Detection and analysis of transitional activity in manifold space,” IEEE Trans. Inf. Technol. Biomed., vol. 16, no. 1, pp. 119–128, Jan. 2012.

[21] K. Jia and D.-Y. Yeung, “Human action recognition using local spatio-temporal discriminant embedding,” in Proc. CVPR, 2008, pp. 1–8.

[22] B. Sun, X. Liu, J. Shen, and Q. Zhang, “Joint angle measurements based on omni-directional lower limb rehabilitation platform,” in Proc. MHS, 2012, pp. 337–341.

[23] B. Toro, C. J. Nester, and P. C. Farren, “Cluster analysis for the extraction of sagittal gait patterns in children with cerebral palsy,” Gait Posture, vol. 25, no. 2, pp. 157–165, Feb. 2007.

[24] J.-Y. Jung, J. I. Glasgow, and S. H. Scott, “Feature selection and classification for assessment of chronic stroke impairment,” in Proc. BIBE, 2008, pp. 1–5.

[25] B. Zhang, T. Kanno, W. Chen, G. Wu, and D. Wei, “Walking stability by age a feature analysis based on a fourteen-linkage model,” in Proc. CIT, 2007, pp. 145–150.

[26] A. Webber, N. Virji-Babul, R. Edwards, and M. Lesperance, “Stiffness and postural stability in adults with Down syndrome,” Exp. Brain Res., vol. 155, no. 4, pp. 450–458, Apr. 2004.

[27] T. Dao and M. H. B. Tho, “Knowledge-based system for orthopedic pediatric disorders,” in Proc. IFMBE, pp. 125–128, Springer.

[28] A. Świtoński et al., “The effectiveness of applied treatment in Parkinson disease based on feature selection of motion activities,” Przegląd Elektrotechniczny, vol. 88, pp. 103–106, 2012.

[29] B. Rohrer et al., “Movement smoothness changes during stroke recovery,” J. Neurosci., vol. 22, no. 18, pp. 8297–8304, Sep. 2002.

[30] H. I. Krebs, M. L. Aisen, B. T. Volpe, and N. Hogan, “Quantization of continuous arm movements in humans with brain injury,” Proc. Nat. Acad. Sci., vol. 96, no. 8, pp. 4645–4649, Apr. 1999.

[31] R. Tibshirani, “Regression shrinkage and selection via the lasso,” J. Roy. Statist. Soc., Ser. B, vol. 58, no. 1, pp. 267–288, 1996.

[32] D. R. Cox and P. A. W. Lewis, The Statistical Analysis of Series of Events. London, U.K.: Methuen, 1966.

[33] L. R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proc. IEEE, vol. 77, no. 2, pp. 257–286, Feb. 1989.

[34] A. Burns et al., “Shimmer-a wireless sensor platform for noninvasive biomedical research,” IEEE Sens. J., vol. 10, no. 9, pp. 1527–1534, Sep. 2010.

[35] C. Bregler, “Learning and recognizing human dynamics in video sequences,” in Proc. CVPR, Jun. 1997, pp. 568–574.

[36] N. Kern, B. Schiele, and A. Schmidt, “Multi-sensor activity context detection for wearable computing,” in Ambient Intelligence, vol. 2875, E. Aarts, R. Collier, E. Loenen, and B. Ruyter, Eds. Berlin, Germany: Springer-Verlag, 2003, ser. Lecture Notes in Computer Science, pp. 220–232.

[37] C. Schuldt, I. Laptev, and B. Caputo, “Recognizing human actions: A local SVM approach,” in Proc. ICPR, Aug. 2004, pp. 32–36.

[38] C. Cortes and V. Vapnik, “Support-vector networks,” Mach. Learn., vol. 20, no. 3, pp. 273–297, 1995.

[39] J. A. Suykens and J. Vandewalle, “Least squares support vector machine classifiers,” Neural Process. Lett., vol. 9, no. 3, pp. 293–300, Jun. 1999.

[40] M. Karg, G. Venture, J. Hoey, and D. Kulic, “Human movement analysis as a measure for fatigue: A hidden Markov-based approach,” IEEE Trans. Neural Syst. Rehabil. Eng., vol. 22, no. 3, pp. 470–481, May 2014.

Roshanak Houmanfar received the B.A.Sc. degree from Amirkabir University of Technology (Polytechnic of Tehran), Tehran, Iran, in 2011. She is currently pursuing the M.A.Sc. degree in electrical and computer engineering at the University of Waterloo, Waterloo, ON, Canada, developing automated algorithms to estimate patient progress through rehabilitation. Her research interests include machine learning and human motion analysis.

Michelle Karg received the B.A.Sc., Dipl.-Ing. (summa cum laude), and Dr.-Ing. (summa cum laude) degrees in electrical and computer engineering from the Technical University of Munich, München, Germany, in 2005, 2006, and 2012, respectively. Her studies included research stays at LTH Lund, Lund, Sweden, and the University of North Texas, Denton, TX, USA. Since 2012, she has been a Post-Doctoral Fellow at the Electrical and Computer Engineering Department at the University of Waterloo, Waterloo, ON, Canada. Her research interests include machine learning, human movement analysis, and affective computing.

Dana Kulic received the combined B.A.Sc. and M.Eng. degrees in electro-mechanical engineering, and the Ph.D. degree in mechanical engineering from the University of British Columbia, Vancouver, BC, Canada, in 1998 and 2005, respectively. From 2002 to 2006, she was a Ph.D. degree student and a post-doctoral researcher at the CARIS Lab at the University of British Columbia, developing human-robot interaction strategies to quantify and maximize safety during interaction.

From 2006 to 2009, she was a JSPS Post-doctoral Fellow and a Project Assistant Professor at the Nakamura Laboratory at the University of Tokyo, Tokyo, Japan, working on algorithms for incremental learning of human motion patterns for humanoid robots. She is currently an Assistant Professor at the Electrical and Computer Engineering Department at the University of Waterloo, Waterloo, ON, Canada. Her research interests include human motion analysis, robot learning, humanoid robots, and human-machine interaction.
