
The psychometric properties of self-report outcome measures in temporomandibular dysfunction


Special Issue Article


Alicia J. Emerson Kavchak (1), John Jake Mischke (1), Kevin Lulofs-MacPherson (1), Ann M. Vendrely (2)

(1) University of Illinois Hospital & Health Sciences System, Department of Physical Therapy, Chicago, USA; (2) Governors State University, Physical Therapy Department, University Park, IL, USA

Background: Temporomandibular dysfunction (TMD) demonstrates a variety of clinical manifestations. While there are some well-documented self-report outcome measures for diagnostics and screening of TMD, these scales are often not utilized in physical therapy (PT) when assessing the patient’s self-reported functional limitations and disability. Further, there is a lack of understanding of which self-report outcome measures in TMD have sound psychometric properties.

Objective: The purpose of this study is to identify and analyze the psychometric properties of commonly used self-reported outcome measures in adults with TMD undergoing conservative management.

Methods: A comprehensive and systematic search of articles published in PubMed, CINAHL, Dentistry and Oral Science, and PsycINFO databases through June 2013 was completed. Inclusion criteria included (1) any article that described the psychometric properties of a self-report outcome measure utilized in TMD, (2) subjects were adults >18 years, and (3) the full text article was available in English.

Major Findings: Thirteen articles were discovered with eight reporting psychometric analysis. Ten studies reported on reliability, with good internal consistency noted but test–retest reliability varied greatly. Face and content validity was reported for most measures; only 50% of the studies reported on construct validity. Other psychometric concepts such as responsiveness, duration to administer, feasibility, interpretability, and acceptability were less reported.

Conclusion: The results of this systematic review demonstrate that eight non-diagnostic self-report outcome measures in adults diagnosed with TMD have undergone some psychometric development or analysis.

Keywords: Outcome measures, Psychometric properties, Self-report outcomes, Temporomandibular dysfunction

Introduction

Temporomandibular dysfunction (TMD) has a reported prevalence in the global population ranging from 8% [1] to 12% [2]. Patients are typically female, aged 20–40 years, and report primarily chronic symptoms [1–3]. The clinical presentation of TMD can vary and may be either the primary reason for referral to physical therapy (PT) or a secondary complaint. For example, among patients with neck pain seeking PT, 90% also have complaints of TMD [4].

Temporomandibular dysfunction can significantly affect patients, with clinical manifestations including severe daily orofacial, neck, and head pain [3–6], sleep dysfunction [7], and depression [3]. In addition, functional activities that require optimal jaw mobility, such as eating, chewing, biting, kissing, and speaking, are impaired [8]. These functional limitations and disabilities can have a profound impact on the quality of life in patients who have TMD.

Generally, self-report outcome measures are used to quantify the impact of psychological distress or functional limitation in patients with pain. Bombardier (2000) identified five key constructs to be included in self-report outcome measures: pain, function, generic health status, work disability, and patient satisfaction [9]. When determined to have sound psychometric properties, self-report outcome measures can help identify baseline functional limitations, monitor for changes in presentation, and may help identify patients who will respond positively to interventions.

Correspondence to: Alicia J Emerson Kavchak, University of Illinois Hospital & Health Sciences System, Department of Physical Therapy, Chicago, IL, USA. Email: [email protected]

© W. S. Maney & Son Ltd 2014. DOI 10.1179/1743288X13Y.0000000126. Physical Therapy Reviews 2014 VOL. 19 NO. 3

Currently, some outcome measures used in this population were initially designed for the diagnostic and screening process. These are typically used in dental practice and are well published. For example, the research diagnostic criteria for temporomandibular disorder (RDC/TMD) identifies any pain and/or resulting dysfunction in or near the temporomandibular joint attributed to a muscle disorder, disc dysfunction, or arthralgia [10]. Included in the RDC/TMD’s systematic diagnostic process is the identification of clinical signs, such as limited range of motion with jaw movements (Axis I), as well as identification of any psychological and psychosocial distress (Axis II) [10]. Another common scale found in the dental literature, the TMJ scale, has been demonstrated to be reliable and valid in this target population [11]. The 97-item self-report outcome measure assesses the constructs of pain, functional limitations, and psychosocial factors such as stress [12]. Perhaps because it is available only commercially, it is not commonly utilized in the PT clinical setting. Interestingly, only the TMJ scale [13] has evolved from being used only as a diagnostic scale to being capable of monitoring change over an episode of care. Another tool frequently found in the dental literature is the oral health impact profile (OHIP), which was developed as a self-report measure in patients with orofacial pain, has been validated, demonstrates responsiveness, and has been translated into many languages [14,15].

These well-published scales are not commonly used in the PT clinic. One explanation could be that they are primarily associated with the diagnostic process. A second explanation, perhaps, is the lack of awareness by the PT community as to which self-report outcome measures in TMD demonstrate sound psychometric properties. We propose that this has been the concern when attempting to select a scale for use. As relatively few free, well-studied measures specific to TMD are utilized in PT, this creates difficulty when monitoring for change or establishing prognosis. For example, in a recently published article examining the effectiveness of varying manual therapy techniques, a non-validated outcome measure was utilized, thus limiting the value of the final interpretation of the results [16]. This weakness in research design is unfortunately found consistently. In addition, a recent meta-analysis reported two issues: (1) either impairments similar to those found on the RDC/TMD, such as jaw opening, jaw clicking, and pain reduction, were described as the desired outcome or (2) non-validated self-report outcome measures were used [17]. Further, the constructs included in those self-report outcome measures (e.g. depression and perceived change in status) do not appear to embody the breadth of functional limitation that should be assessed according to Bombardier [9].

The purpose of this study is to identify commonly used self-report outcome measures in adults with TMD undergoing conservative management and to analyze the psychometric properties of these measures.

Methods

Identification and selection of the literature

The research question was generated using the PICO system. PICO is a mnemonic device in which a research question can be developed by identifying the P (patient population), I (intervention or variable of interest), C (comparison, which can be none or placebo of groups), and O (outcomes) monitored. Our question was: In adult patients diagnosed with TMD (P) undergoing conservative intervention (I), can we compare the psychometric properties (C) of self-report outcome measures (O) to facilitate selection? Inclusion and exclusion criteria were established a priori. Studies describing the design of or analysis of a self-report outcome measure utilized in temporomandibular pain and/or dysfunction were included. A comprehensive and systematic literature search to ensure optimal identification of outcome measures for TMD was conducted utilizing the PubMed, CINAHL, Dentistry and Oral Science, and PsycINFO databases. Please refer to Figure 1. Keywords used in the database searches included temporomandibular joint dysfunction, temporomandibular joint disorder, temporomandibular disease, temporomandibular pain, incidence, epidemiology, follow-up studies, disease progression, validation studies, evaluation studies, psychometrics, reproducibility of results, sensitivity and specificity, self-report, self-assessment, physical performance, outcome assessment, questionnaires, quality of life, patient satisfaction, health surveys, data collection, and severity of illness index. In addition, we manually reviewed our personal files and cross-referenced articles that described any randomized and quasi-randomized controlled trials that utilized self-report outcome measures as the dependent variable. Any English language article published through June 30, 2013 that included self-report outcome measures investigating varying constructs in TMD and/or pain in adults was included in the study. Any articles that included subjects who had undergone a surgical intervention, had a history of cancer, or had current or previous neurological disease were excluded.

Selection criteria

Three independent reviewers (AEK, KLM, and JJM) screened the titles and abstracts for eligibility using the criteria determined a priori. Any differences were resolved by consensus and/or by obtaining the full article. Studies were included if the following criteria were met: (1) a self-report or physical outcome measure was utilized in TMD and/or pain, (2) subjects were adults over the age of 18 years, and (3) the article was available in full text in English.

Emerson Kavchak et al. Psychometric properties of self-report outcome measures

Physical Therapy Reviews 2014 VOL. 19 NO. 3 175

Psychometric property assessment and data extraction

As no one set method of quality assessment is preferred in the systematic review of psychometric properties, data extraction was based on the previously established literature [17,18]. The psychometric properties (reliability, validity, and responsiveness) and study design were assessed independently by one reviewer (AEK). A second reviewer (JJM), who was not blinded to the results of the first reviewer, confirmed the results of the data extracted.

Results

The search yielded 1100 articles for review. In total, 1010 articles were excluded based on the title. Further hand searching of references resulted in 13 more full text articles for review. The three authors independently reviewed and selected potential articles to review. Any disagreement was resolved by consensus, narrowing the search to 30 articles for full text review. In total, 13 studies were confirmed that described some form of psychometric analysis, with eight different self-report outcome measures identified. In the articles included in this study, the authors’ scope of professional practice was predominantly the dental community, with only one from occupational therapy [19]. The subjects in these studies were predominantly female, with individual studies reporting a high percentage of female patients (ranging from 87 to 93%) and only one study reporting less than this [20]. Please refer to Table 1 for details. Major areas of review were reliability, validity, and acceptability and/or interpretability of the self-report outcome measure; less frequently reported were responsiveness and feasibility and/or duration required to administer the tool. The majority of the studies examined the construct of pain via the visual analog scales (VAS) [20–24], though fear of movement [25], self-efficacy [26], jaw pain and functional limitation [27,28], and overall oral health status [29,30] were assessed to a lesser extent. Overall, all studies reported some component of reliability and validity. Of the six studies that examined the responsiveness of the self-report outcome measure, five examined the VAS.

Figure 1 Data Extraction.

Table 1 Study description

van Grootel et al. (2007) [21]
Self-report outcome measure: (1) Pain diary VAS, (2) CPT and VAS, (3) VAS as a questionnaire
Subject demographics (x ± SD) years in age: Myogenous TMD; 18–65 years (31.6 ± 10.0); chronicity: x = 1.1 years; 93% female
Study design: Prospective, RCT
Number of participants: (1) n = 95–106 diary entry, (2) n = 109 cold CPT, (3) n = 118
Assessment frequency: (1) Daily diary entry (4×/day) for 2 weeks before treatment, (2) CPT: before treatment, during last visit, (3) as questionnaire: 2 weeks before treatment, during last visit, or follow up (2–18 months)

van Grootel et al. (2009) [22]
Self-report outcome measure: (1) VAS diary, (2) pain behavior scale (0–5), (3) VAS as questionnaire
Subject demographics (x ± SD) years in age: Myogenous TMD; 18–65 years (31.6 ± 10.0); chronicity: x = 1.1 years; 93% female
Study design: Prospective, RCT
Number of participants: 107 completed pre-intervention diary (Days 1–7); 96 completed pre-intervention diary (days for ‘interval of 13 days’); 103 completed the diary at least four times reported on 10 of 14 days
Assessment frequency: Daily diary entry (4×/day) for 2 weeks before treatment after last visit

Wassell et al. (2008) [20]
Self-report outcome measure: (1) Daily pain diary VAS for worst pain
Subject demographics (x ± SD) years in age: Referral for TMD; 19–65 years (x = 35 years); 54% female
Study design: Prospective, RCT
Number of participants: 39 completed sufficient diary entry
Assessment frequency: Daily

Emshoff et al. (2010) [23]
Self-report outcome measure: (1) VAS-PI, (2) PGIC
Subject demographics (x ± SD) years in age: Chronic TMD diagnosis, 18–70 years (39.1 ± 15.2); 91.3% female; chronicity in weeks: 97.9 ± 124.3 (mean, SD)
Study design: Prospective, cohort
Number of participants: 588 of 678 recruited
Assessment frequency: Baseline and at 3-month follow up visits

Emshoff et al. (2011) [24]
Self-report outcome measure: (1) VAS-PI, (2) PGIC
Subject demographics (x ± SD) years in age: Patients with chronic TMD; 18–70 years; chronicity: >3 and <6 months (n = 229), >6 months and <2 years (n = 372), >2 years and <5 years (n = 183)
Study design: Prospective, cohort
Number of participants: 784 of 794 recruited
Assessment frequency: Baseline and after 12 weeks

Visscher et al. (2010) [25]
Self-report outcome measure: Tampa scale of kinesiophobia
Subject demographics (x ± SD) years in age: Diagnosis of TMD, 18–87 years (41.3 ± 14.1), 81% female
Study design: Cross-sectional cohort, analysis of questionnaire translation and design
Number of participants: 301 completed fear of movement subgrouping
Assessment frequency: Before first visit, 100 completed within 4 weeks before first visit

Brister et al. (2006) [26]
Self-report outcome measure: Arthritis self-efficacy
Subject demographics (x ± SD) years in age: Patients with TMD 18–68 years (x = 37 years); 87% female; chronicity: 2 months–46 years (x = 60 months)
Study design: Analysis of questionnaire initially developed for arthritis and modified for TMD
Number of participants: 156
Assessment frequency: Baseline

van der Meulen et al. (2012) [29]
Self-report outcome measure: OHIP-TMD abbreviated
Subject demographics (x ± SD) years in age: Referral from dentist to TMD clinic; 41.0 ± 14.9 years; 77% female
Study design: Psychometric analysis of a shorter questionnaire
Number of participants: 245
Assessment frequency: Before first appointment

Durham et al. (2011) [30]
Self-report outcome measure: OHIP-TMD
Subject demographics (x ± SD) years in age: 110 patients with TMD; >18 years old; 110 age- and gender-matched controls
Study design: Development of condition-specific quality of life measure
Number of participants: 110
Assessment frequency: Once for quantitative; 29 patients in a purposive maximum variation for qualitative assessment with interview discussion and completion of the tool

Reliability

One of the psychometric cornerstones of a measure is the establishment of freedom from random error [33]. One method to control for random error is ensuring the internal consistency of the individual items that measure the same construct in a self-report measure; the other method is to report stability of the test results over time (test–retest reliability) [33]. Internal consistency or test–retest reliability was examined in 10 articles. Please see Table 2 for detailed results. Internal consistency was reported in seven studies. Of these articles, six demonstrated at least one subscale deemed to have excellent internal consistency (alpha>0.9) [25–31]. The constructs, however, ranged from questionable to good. Test–retest reliability was reported in five studies [21,25,27,29,32]. One study examined the temporal stability of the measure [31]. Duration from baseline assessment to re-test ranged from daily reassessment to reassessment after 6–8 weeks. The results varied greatly, with only one study reporting ICC > 0.8 for all analyses [27].
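As an illustration of the internal consistency statistic reported in these studies, the following minimal Python sketch computes Cronbach's alpha from a matrix of item scores. It assumes the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores); the function name `cronbach_alpha` and the example responses are hypothetical and are not taken from any of the reviewed studies, which may have used different software or estimation details.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents x 4 items (illustrative only).
example_scores = [
    [2, 3, 3, 2],
    [4, 4, 5, 4],
    [1, 2, 1, 1],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
]
alpha = cronbach_alpha(example_scores)
print(round(alpha, 3))
```

Items that rise and fall together across respondents, as in this made-up example, push alpha toward 1; weakly related items pull it down.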

Validity

Face/content validity

Face and content validity are two important and related concepts in self-report outcome measures. Face validity ensures the construct studied is actually being monitored and content validity assesses the extent to which the key concepts of a construct have been included [33]. As face validity does not have a standardized evaluation, the determination was made as to whether face validity was present or not. Both face and content validity are intuitively easy to assess with the VAS [21–24] and had been previously described for the Tampa scale of kinesiophobia [25]. For self-report outcome measures that are more involved, the process of assuring face and content validity was found for the mandibular functional impairment questionnaire (MFIQ) [2], the Jaw Functional Limitation Scales (8 and 20) [32], the oral health impact profile-TMD (OHIP-TMD) [29,30], and the Tampa Scale of Kinesiophobia-TMD (TSK-TMD) [25]. The methods described for establishing face and content validity ranged from qualitative to quantitative assessment, as described in Table 3.

Table 1 Continued

Campos et al. (2012) [27]
Self-report outcome measure: Mandibular function impairment questionnaire
Subject demographics (x ± SD) years in age: Patients with TMD (RDC/TMD Axis 1); 36.8 ± 8.95 years old; 53–73% female
Study design: Cross-sectional cohort; analysis of questionnaire translation and design
Number of participants: 249
Assessment frequency: Baseline, 2 weeks later

Stegenga et al. (1993) [28]
Self-report outcome measure: Mandibular function impairment questionnaire
Subject demographics (x ± SD) years in age: Patients with varying TMD diagnoses; 25.2 ± 7.3 years old; 89% female
Study design: Cross-sectional cohort; analysis of questionnaire translation and design
Number of participants: 95
Assessment frequency: Baseline

Rollman et al. (2010) [32]
Self-report outcome measure: Patient-specific approach
Subject demographics (x ± SD) years in age: Patients with TMD; >18 years old (39 ± 14); 86.4% female
Study design: Cross-sectional cohort; analysis of questionnaire psychometrics
Number of participants: 132; 93% completed second measure; 83% completed third measure
Assessment frequency: (1) Before first visit, (2) second visit before treatment, (3) 6–8 weeks after treatment started

Ohrbach et al. (2008) [31]
Self-report outcome measure: Jaw Functional Limitation Scale-8 items and 20 items
Subject demographics (x ± SD) years in age: 5 diagnostic groups: n = 31 TMD, n = 25 primary Sjogren, n = 20 burning mouth syndrome, n = 28 skeletal malocclusion, n = 30 healthy recall in adults
Study design: Known groups validity design utilizing qualitative and Rasch item analysis
Number of participants: Goal was 30 patients per diagnostic group; 15 patients with TMD for qualitative study
Assessment frequency: Baseline, 1–2 weeks later

x: mean; SD: standard deviation; VAS: visual analog scale; CPT: cold pressure test; RCT: randomized control trial; VAS-PI: visual analog scale for pain intensity; PGIC: patient global impression of change; TMD: temporomandibular dysfunction; RDC/TMD: research diagnostic criteria for temporomandibular disorder; OHIP-TMD: oral health impact profile-TMD.

Construct validity

Construct validity ensures that the construct, or variable to be assessed, is in fact being quantified by the outcome measure; one method to establish construct validity is via convergent validity or via discriminatory validity analysis [33]. Of the eight self-report outcome measures identified in this systematic review, half reported findings for convergent validity analysis (MFIQ, TSK-TMD, self-efficacy in TMD, and OHIP-TMD). Please see Table 3 for further details. As the construct being studied varied, the correlation analyses varied as well.
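Convergent validity analyses of this kind usually rest on a correlation between scores on the measure under study and scores on an established measure of a related construct. The sketch below, in Python, computes Pearson's r for two score lists. The function name `pearson_r` and both score lists are hypothetical illustrations; they do not reproduce data or methods from the reviewed studies.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the sums of squared deviations.
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross / (dev_x * dev_y)


# Hypothetical scores for six patients (illustrative only).
new_scale = [10, 14, 9, 20, 16, 12]        # measure under study
reference_scale = [22, 30, 25, 41, 33, 24]  # established related measure
r = pearson_r(new_scale, reference_scale)
print(round(r, 2))
```

A coefficient near 1 would suggest the two measures track the same construct, whereas a coefficient near 0 between measures of unrelated constructs is the pattern expected for discriminant validity.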

Responsiveness

The ability to detect both significant and meaningful change to the patient’s status is a key psychometric component when making decisions of treatment effectiveness or efficiency. Six articles examined responsiveness of the instrument; five of these studies utilized the VAS. Please refer to Table 4 for details. Wassell et al. defined the determination of meaningful change as dependent on the clinician’s visual interpretation of the patients’ reported mean, maximum, and combined mean/maximum VAS ratings (pain) over time [20]. The clinicians categorized patients into ‘improved, borderline, or not improved’. The other four studies utilized the responses of the patients and implemented various calculation methods, including distributional analysis based on statistical findings (smallest detectable difference [21,22], standard error of measure [24], and effect size [22]) or an anchoring method (receiver operating characteristic curve) comparing responders with non-responders [20,23,32]. Meaningful change for the VAS varied by calculation method, even within the same study cohort, as well as between the populations studied, with ranges reported from 11.5 to 48.9 mm. The patient-specific approach determined that a 58% relative change from baseline on a VAS was the minimal clinical improvement needed to identify a responder (area under the curve, AUC = 0.91; 95% CI = 0.86–0.97), with a sensitivity of 0.85 and a specificity of 0.84 [32].
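To make the two families of responsiveness calculation more concrete, the Python sketch below shows one common distribution-based computation and one anchor-based computation. It assumes the widely used formulas SEM = SD × sqrt(1 − ICC) and SDD = 1.96 × sqrt(2) × SEM, and it classifies responders with a relative-change cutoff set to the 58% figure quoted above. All function names and patient values are hypothetical, and the reviewed studies may have calculated these quantities differently.

```python
from math import sqrt


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC), a common distribution-based formula."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must be between 0 and 1")
    return sd * sqrt(1.0 - icc)


def smallest_detectable_difference(sem, z=1.96):
    """SDD = z * sqrt(2) * SEM (95% confidence when z = 1.96)."""
    return z * sqrt(2.0) * sem


def relative_change(baseline, follow_up):
    """Percent reduction from baseline; positive values mean less pain."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - follow_up) / baseline


def sensitivity_specificity(changes, improved, cutoff):
    """Treat change >= cutoff as 'responder' and compare with an external anchor."""
    pairs = list(zip(changes, improved))
    tp = sum(1 for c, imp in pairs if imp and c >= cutoff)
    fn = sum(1 for c, imp in pairs if imp and c < cutoff)
    tn = sum(1 for c, imp in pairs if not imp and c < cutoff)
    fp = sum(1 for c, imp in pairs if not imp and c >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical inputs (illustrative only): VAS SD of 20 mm, test-retest ICC of 0.75.
sem = standard_error_of_measurement(sd=20.0, icc=0.75)
sdd = smallest_detectable_difference(sem)

# Hypothetical patients: (baseline VAS mm, follow-up VAS mm, improved per anchor).
patients = [(60, 20, True), (50, 15, True), (70, 40, True),
            (40, 35, False), (55, 50, False), (65, 20, False)]
changes = [relative_change(b, f) for b, f, _ in patients]
improved = [imp for _, _, imp in patients]
sens, spec = sensitivity_specificity(changes, improved, cutoff=58.0)
print(f"SEM={sem:.1f} mm, SDD={sdd:.1f} mm, sens={sens:.2f}, spec={spec:.2f}")
```

Sweeping the cutoff over a range of values and plotting sensitivity against 1 − specificity would trace the receiver operating characteristic curve from which an AUC is derived.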

Duration/feasibility

Beyond reliability and validity, a well-designed outcome measure should address the overall burden to complete the self-report outcome measure [33]. The amount of burden, both for the person administering the measure and for the patient completing it, is an important factor when determining whether the outcome measure will be completed in the real world (i.e. clinical) environment. No one study explicitly discussed the time required to complete the questionnaire, though the number of items on the questionnaire was often reported and could give the clinician a rough idea of burden.

Table 2 Results on reliability

van Grootel et al. (2007) [21]
Outcome tool: VAS
Reliability: Test–retest
Method of calculation: ICC
Results: (1) VAS as diary: Day 7–8; 4–11; 1–14 ranged from 0.52 to 0.70; (2) CPT = 0.48

Wassell et al. (2008) [20]
Outcome tool: VAS
Reliability: Inter-rater reliability
Method of calculation: κ
Results: Amongst the three examiners: κ = 0.76–0.79 for ‘improvers’, κ = 0.26–0.44 for ‘borderline improvers’, κ = 0.69–0.79 for ‘non-improvers’

Visscher et al. (2010) [25]
Outcome tool: TSK-TMD
Reliability: Internal consistency
Method of calculation: alpha
Results: alpha = 0.66–0.83 (n = 120) (depending on analysis of entire scale or subscales)

Visscher et al. (2010) [25]
Outcome tool: TSK-TMD
Reliability: Test–retest
Method of calculation: ICC
Results: ICC = 0.67–0.71 (depending on analysis of entire scale or subscales)

Brister et al. (2006) [26]
Outcome tool: Self-efficacy in TMD
Reliability: Internal consistency
Method of calculation: alpha
Results: alpha = 0.91

van der Meulen et al. (2012) [29]
Outcome tool: OHIP-TMD abbreviated version
Reliability: Internal consistency
Method of calculation: alpha
Results: alpha = 0.67 with OHIP-NL, OHIP-NL14 = 0.90; OHIP-NL = 0.96

van der Meulen et al. (2012) [29]
Outcome tool: OHIP-TMD abbreviated version
Reliability: Test–retest
Method of calculation: ICC
Results: ICC = 0.69 with OHIP-NL5, OHIP-NL14 = 0.80; OHIP-NL = 0.82

Durham et al. (2011) [30]
Outcome tool: OHIP-TMD
Reliability: Internal consistency
Method of calculation: alpha
Results: alpha = 0.890–0.951 (depending on scale version)

Campos et al. (2012) [27]
Outcome tool: MFIQ
Reliability: Test–retest
Method of calculation: ICC for reproducibility; Pearson’s r for test–retest
Results: Dimension I: ICC = 0.895 (95% CI = 0.832–0.935); r = 0.896 (95% CI = 0.834–0.936); Dimension II: ICC = 0.825 (95% CI = 0.726–0.891); r = 0.826 (95% CI = 0.726–0.891)

Campos et al. (2012) [27]
Outcome tool: MFIQ
Reliability: Internal consistency
Method of calculation: alpha
Results: Dimension I alpha = 0.874; dimension II alpha = 0.918

Stegenga et al. (1993) [28]
Outcome tool: MFIQ
Reliability: Internal consistency
Method of calculation: alpha
Results: alpha = 0.63–0.95 depending on scale analyzed

Rollman et al. (2010) [32]
Outcome tool: Patient-specific approach
Reliability: Test–retest
Method of calculation: ICC
Results: ICC = 0.72 (95% CI: 0.57–0.82) (between first and second assessment times)

Ohrbach et al. (2008) [31]
Outcome tool: Jaw Functional Limitation Scale-8 items and 20 items
Reliability: Internal consistency
Method of calculation: alpha
Results: alpha = 0.87 for JFLS-8; alpha = 0.95 for JFLS-20

Ohrbach et al. (2008) [31]
Outcome tool: Jaw Functional Limitation Scale-8 items and 20 items
Reliability: Temporal stability
Method of calculation: CCC rho
Results: CCC rho = 0.81 for JFLS-8; CCC rho = 0.87 for JFLS-20

TMD: temporomandibular dysfunction; VAS: visual analog scales; MFIQ: mandibular functional impairment questionnaire; OHIP-TMD: oral health impact profile-TMD; TSK-TMD: Tampa scale of kinesiophobia-TMD; CPT: cold pressure test; CCC: concordance correlation coefficient; ICC: intraclass correlation coefficient.

Table 3 Psychometric properties

| Author (Date) | Face/content validity | Constructs assessed | Construct validity: discriminate validity | Construct validity: convergent validity | Duration to administer | Acceptability | Interpretability | Further language validation |
|---|---|---|---|---|---|---|---|---|
| van Grootel et al. (2007)21 | Yes intuitively, but not discussed | Pain | No | No | No | 95–106/118 completed diary; 109/118 CPT; 118/188 VAS questionnaire | No | No |
| van Grootel et al. (2009)22 | Yes intuitively, but not discussed | Pain, pain behavior | No | No | No | 96–107/118 completed the diary, depending on assessment version | No | No |
| Wassell et al. (2008)20 | Yes intuitively, but not discussed | Pain | No | No | No | 54% completed the diary sufficient by standards established; the clinicians completed all categorizations | No | No |
| Emshoff et al. (2010)23 | Yes intuitively, but not discussed | Pain; patient satisfaction | No | No | No | 588/678 | No | No |
| Emshoff et al. (2011)24 | Yes intuitively, but not discussed | Pain; patient satisfaction | No | No | No | 782/794 | No | No |
| Visscher et al. (2010)25 | Yes | Fear of movement | ‘lack of correlation between catastrophizing and avoidance is even suggestive of discriminate validity’ | Pearson’s correlation coefficient with catastrophizing scale (n = 120) = 0.12–0.33 (varied whether entire or subscale) | No | 301/327 | Yes | Yes: initially developed in Dutch; back translated into English |
| Brister et al. (2006)26 | Yes intuitively, but not discussed | Self-efficacy | No | Self-efficacy and ability to control/decrease pain: survey of pain attitudes (SOPA) control scale r = 0.54 (P < 0.001); self-efficacy and control pain: coping strategies questionnaire (CSQ) control pain r = 0.58 (P < 0.001); self-efficacy and decrease pain: CSQ decrease pain r = 0.48 (P < 0.001) | Reported 8 items on scale; did not mention duration to complete | No | No | No |


Table 3 Continued

| Author (Date) | Face/content validity | Constructs assessed | Construct validity: discriminate validity | Construct validity: convergent validity | Duration to administer | Acceptability | Interpretability | Further language validation |
|---|---|---|---|---|---|---|---|---|
| van der Meulen et al. (2012)29 | Not discussed for this study | Functional limitation, physical pain, psychological discomfort, physical disability, psychological disability, social disability, and handicap | No | Spearman’s rho: r = 0.46 with pain-related disability for OHIP-NL and OHIP-NL14; r = 0.39 for OHIP-NL5; r = 0.28 self-report oral health status in the OHIP-NL, r = 0.19 in OHIP-NL14, r = 0.21 in OHIP-NL5 | OHIP-14 was the preferred version | No | No | Translated into Dutch |
| Durham et al. (2011)30 | Item analysis (qualitative analysis of purposeful sample of 29 patients) | Functional limitation, physical pain, psychological discomfort, physical disability, psychological disability, social disability, and handicap | No | OHIP-TMD Spearman’s r = 0.751 with multidimensional pain inventory (MPI), 0.576 with VAS | Shorter than previous instruments, not discussed specifically | No | No | No |
| Campos et al. (2012)27 | Face validity: 6 dentistry professionals, 3 English experts; prototype pre-tested in 25 subjects; comprehensive index was required to be 80% for each item; content validity by 21 dentists, content validity ratio (0.43–1) = all items retained | Dimension I = functional capacity, dimension II = feeding | Discriminant validity was found to be low | Average variance extracted: dimension I: 0.507; dimension II: 0.660; composite reliability: dimension I: 0.872, dimension II: 0.921 | No | Yes (completed by professionals and patients) | Yes | Translated only into Portuguese and then translated back into English |
| Stegenga et al. (1993)28 | Yes intuitively, but not discussed | Functional limitation | No | No | No | No | No | No |
| Rollman et al. (2010)32 | Yes intuitively, but not discussed | Pain, function | No | No | No | No | No | No |
| Ohrbach et al. (2008)31 | Content validity: based on experts in ‘oral medicine, pain, TMD and prosthodontics’ | Mastication, jaw mobility, verbal/emotional expression, and miscellaneous | No | No | 8 or 20 items; did not discuss duration | Based on qualitative analysis | Based on qualitative analysis | Translated into Swedish and back translated into American English |

OHIP: oral health impact profile; TMD: temporomandibular dysfunction; VAS: visual analog scales.
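Table 3 reports a content validity ratio of 0.43–1 from a panel of 21 dentists for the Portuguese MFIQ. The review does not state how that ratio was calculated. A minimal sketch, assuming the widely used Lawshe formula CVR = (n_e − N/2) / (N/2), where n_e is the number of panelists rating an item essential and N is the panel size, is given below; the function name and the example counts are hypothetical.

```python
def content_validity_ratio(n_essential, n_panelists):
    """Lawshe's content validity ratio for one item (assumed formula, see lead-in)."""
    if n_panelists <= 0 or not 0 <= n_essential <= n_panelists:
        raise ValueError("n_essential must lie between 0 and n_panelists")
    half = n_panelists / 2
    # Ranges from -1 (no panelist rates the item essential) to +1 (all do).
    return (n_essential - half) / half


# Hypothetical counts for a 21-member panel, for illustration only.
for n_essential in (15, 18, 21):
    print(n_essential, round(content_validity_ratio(n_essential, 21), 2))
```

Under this assumption, the ratio depends only on the proportion of the panel endorsing the item, so a fixed retention threshold translates directly into a minimum number of agreeing panelists.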


Table 4 Responsiveness

| Author (Date) | Comment on reference criteria | Outcome tool | Assessment interval | Method of calculation | Results |
|---|---|---|---|---|---|
| van Grootel et al. (2007)21 | Questionable validity as reference criterion did not include TMD patients; 30% (Farrar, 2001): based on 10 studies with patient diagnoses of diabetic neuropathy, postherpetic neuralgia, OA, CLBP, and fibromyalgia | VAS | (1) VAS as diary: Day 7–8; 4–11; 1–14, (2) CPT | SDD | (1) VAS: 37.6–48.8 mm; (2) VAS with CPT: 52.7 mm |
| van Grootel et al. (2009)22 | (1) VAS referenced van Grootel et al. (2007); (2) Richardson (1983) for pain behavior scale was based on children with headaches, not adults with TMD | VAS | (1) VAS as diary: Day 7–8; 4–11; 1–14, (2) pain behavior | SDD for (1) VAS diary, (2) pain behavior; CID/Cohen’s effect size/SRM for (1) pain behavior, (2) VAS as a questionnaire | SDD = (1) 37.6–48.8 mm for VAS diary, (2) 2.06–2.73; pain behavior = CID = 1.13/ES = 1.38/SRM = 1.12; VAS = CID = 24.2 mm/ES = 1.09/SRM = 0.92 |
| Wassell et al. (2008)20 | Visual assessment compared to numeric definition (mean, max, and combined mean and max) | VAS | Initial through discharged (average of 127 days) | Area under the curve (AUC) to establish sensitivity/specificity | Varied by percentage of pain reduction and numerical definition; 50% was the clinician agreed upon result |
| Emshoff et al. (2010)23 | NA | VAS; PGIC | Baseline and final (at 3 months) | ROC for PGIC = much improved | VAS: −19.55 mm; specificity = 0.92 (95% CI = 0.88–0.94), sensitivity = 0.93 (95% CI = 0.89–0.96) |
| Emshoff et al. (2011)24 | NA | VAS; PGIC | Baseline and 12 weeks | MDC (SEM for VAS); CID (ROC for PGIC = much improved) | MDC varied based on severity and chronicity (7.4–12.7 for >3 and <6 months; 8.6–13.7 for >6 months and <2 years; 6.6–12.1 for >2 years and <5 years); CID for cutoff points of raw change 11.5–28.5 mm; CID for mean change 20.9–57.5 mm; CID for cutoff percentage 29.9–47.7%, mean percentage change 64.1–76.3% |
| Rollman et al. (2010)32 | Change in functional limitation (hindrance) that was poorly correlated with baseline | Patient-specific approach | Between assessment 1 and 3 | ROC between 40 ‘improved’ and 69 ‘nonimproved’ patients | 58% relative change (AUC = 0.91; 95% CI = 0.86–0.97); sensitivity = 0.85 (95% CI = 0.79–0.91); specificity = 0.84 (95% CI = 0.77–0.91) |

SDD: smallest detectable difference; CID: clinically important difference; SRM: standardized response mean; MDC: minimal detectable change; TMD: temporomandibular dysfunction; VAS: visual analog scales; PGIC: patient’s global impression of change; CPT: cold pressure test.
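Several of the distribution-based indices named in Table 4 (MDC derived from the SEM, Cohen's effect size, and the SRM) have commonly cited textbook forms. The sketch below assumes those forms: SEM = SD × sqrt(1 − ICC), MDC95 = 1.96 × sqrt(2) × SEM, effect size = mean change / SD at baseline, and SRM = mean change / SD of the change scores. The reviewed studies may have used variants of these formulas, and all function names and numbers here are hypothetical.

```python
import math
from statistics import mean, stdev


def sem_from_icc(sd, icc):
    """Standard error of measurement from a sample SD and a reliability coefficient."""
    return sd * math.sqrt(1 - icc)


def mdc95(sem):
    """Minimal detectable change at the 95% confidence level."""
    return 1.96 * math.sqrt(2) * sem


def effect_size(baseline, follow_up):
    """Cohen's effect size: mean change divided by the SD of baseline scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """SRM: mean change divided by the SD of the change scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Hypothetical VAS scores in mm for three patients, for illustration only.
vas_baseline = [60, 70, 80]
vas_follow_up = [40, 45, 70]
print(round(mdc95(sem_from_icc(sd=10, icc=0.75)), 1))
print(round(effect_size(vas_baseline, vas_follow_up), 2))
print(round(standardized_response_mean(vas_baseline, vas_follow_up), 2))
```

Because the effect size and the SRM divide the same mean change by different standard deviations, the two indices can diverge for the same data, which is one reason responsiveness estimates differ across methods of calculation.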


Acceptability/interpretability

Acceptability and interpretability were discussed either explicitly by description in three studies 25,27,31 or implicitly by reporting the number of patients satisfactorily completing the questionnaire.20–25 Please see Table 3 for details. Consistent with cultural and linguistic interpretability analyses, experts in the field of TMD and native linguists were used to ensure accurate interpretation of the original self-report outcome measure. To further assess acceptability, Ohrbach et al. completed a qualitative analysis in a purposively sampled population for the Jaw Functional Limitation Scale.31

Discussion

The results of this systematic review demonstrate that few free self-report outcome measures reported for use with adults who have TMD have complete psychometric constructs established or have undergone rigorous analysis. At this point in time, self-report measures in TMD appear to be in the developmental stage, as none of the self-report outcome measures have had all psychometric aspects analyzed. Further, only the MFIQ and the OHIP-TMD reported psychometric analyses in two different patient cohorts. This review found good internal consistency for the self-report outcome measures studied, though test–retest reliability varied. While this provides understanding of the overall components of the measure, it does not fully explore which of those items are most important to patients with TMD. At least one study did complete an item analysis to identify the prevalence of responses when comparing patients with TMD to healthy controls.34 We believe this is a good step in the further analysis of self-report outcome measures and recommend that future research investigate this further. Content and face validity were reported for most measures, but only 50% of the self-report outcome measures reported an analysis of construct validity. Responsiveness was reported only for the VAS, the patient’s global impression of change (PGIC), and the patient-specific approach. Not one study found in this systematic review reported on duration or feasibility of the measure. Acceptability and interpretability were reported implicitly in five studies (two different self-report outcome measures) and explicitly in three studies. The limited analysis of the feasibility of completing the self-report outcome measures is of concern because, if the burden is too high for the patient or the clinician, the tool will not be completed consistently. The implication is an increased potential for bias in cohort analysis when only those who did not find the burden too high are included.

At least five constructs have been identified that should be assessed by a self-report outcome measure: pain, function, generic health status, work disability, and patient satisfaction.9 Not one of the identified outcome measures for TMD fully addresses all five constructs. This is not unexpected, as no single self-report outcome measure captures all five constructs for any disease or diagnosis. However, self-reported pain was the primary construct assessed in the studies identified by this systematic review. While important, assessing only self-reported pain delivers relatively limited insight into the complex biopsychosocial aspects that could be affecting the patient. The more recent inclusion of overall oral health status, patient-identified patient-specific limitations, fear of movement, and self-efficacy allows for a more thorough and nuanced understanding of these patients. While these self-report outcome measures are valued for this understanding, it must be recognized that they may need to be used in conjunction with another self-report outcome measure, much as in low back pain, where the fear avoidance questionnaire and the Oswestry low back pain questionnaire are frequently combined.

The ability of the self-report outcome measure to reflect change owing to a specific intervention is becoming a central theme in the evolving state of health care. Having the means for a patient with TMD to report meaningful change over time is important for understanding the effectiveness of a treatment or the efficiency of the intervention in TMD.17 In addition, recent changes to the Medicare system in the United States have begun to mandate the inclusion of outcome measures, thus indicating governmental involvement in ensuring recipients receive efficient and effective treatment (refer to www.mediserve.com/resource/analysis/cms-clarfications-on-cbor/ for updated information). Responsiveness demonstrates a duality in its nature. On the positive side, responsiveness can assist with monitoring change or effectiveness of an intervention. However, given the complexity of calculation and influence of baseline demographics, the minimal clinically important difference (MCID) has been demonstrated to vary widely.35,36 Despite this duality, responsiveness needs to be studied when establishing the psychometric properties of a self-report outcome measure.

Interestingly, there are only three scales with reported responsiveness in this systematic review: the VAS for pain, a similar linear scale used with the patient-specific approach,32 and the PGIC,23,24 which is also known as the global rating of change.37 The first two scales utilize the same method for rating pain or functional limitation of one important activity. The varying methods of calculation demonstrate a non-robust MCID for the VAS, which is a well-known limitation of the MCID.36 The third scale is similar to the commonly used global rating of change. A weakness of this measure is the influence of recall bias that may lead to underestimating or overestimating change.37
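The anchor-based approach summarized in Table 4, in which an ROC analysis is run against a PGIC rating of 'much improved', can be sketched as follows. This is a minimal illustration, not the procedure of any reviewed study: it assumes the cutoff is chosen by maximizing Youden's J (sensitivity + specificity − 1), that a more negative VAS change means greater pain reduction, and that all names and data are hypothetical.

```python
def sensitivity_specificity(changes, improved, cutoff):
    """Treat a patient as a responder when the change score is <= cutoff."""
    pairs = list(zip(changes, improved))
    tp = sum(1 for c, imp in pairs if imp and c <= cutoff)
    fn = sum(1 for c, imp in pairs if imp and c > cutoff)
    tn = sum(1 for c, imp in pairs if not imp and c > cutoff)
    fp = sum(1 for c, imp in pairs if not imp and c <= cutoff)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one improved and one non-improved patient")
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(changes, improved):
    """Return (cutoff, sensitivity, specificity) with the highest Youden's J.

    Candidate cutoffs are the observed change scores; ties keep the first
    (most negative) candidate. Youden's J is an assumption of this sketch.
    """
    best = None
    for cutoff in sorted(set(changes)):
        sens, spec = sensitivity_specificity(changes, improved, cutoff)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1:]


# Hypothetical VAS change scores (mm) and PGIC anchor ("much improved" = True).
vas_change = [-40, -30, -25, -10, -5, 0]
much_improved = [True, True, True, False, False, False]
print(best_cutoff(vas_change, much_improved))
```

Because the cutoff depends on the anchor, the sample, and the selection rule, different studies can arrive at different clinically important differences for the same scale, consistent with the variability noted above.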

The above limitations in determining responsiveness do not mean that responsiveness should not be studied further in the TMD self-report outcome measures. The self-efficacy scale is an example of where future responsiveness could be examined, given the recent interest regarding patient expectation in the PT literature. While self-efficacy has traditionally been examined at baseline, a recent post-operative cardiology study demonstrated that change in self-efficacy could influence more positive outcomes.38 Further, responsiveness is of value to the clinical researcher. The lack of established responsiveness limits the validity of monitoring change when assessing the effectiveness of an intervention using these specific self-report outcome measures as dependent variables.

The decision of which self-report outcome measure to use should depend on the psychometric properties established. Based on validity assessment, the TSK-TMD had the strongest evidence of construct validity, with both convergent and discriminant validity assessed. The OHIP-TMD and the self-efficacy scale for TMD demonstrated good convergent validity. One can extrapolate that the construct being assessed is valid and the scale is able to differentiate one construct from another. The MFIQ demonstrates poor discriminant validity, indicating difficulty in distinguishing between the two dimensions of feeding and functional capacity. This may not be unexpected given the relationship between the ability to feed oneself and overall function. Based on the reliability assessment, the ability to recommend an internally consistent and reliable self-report outcome measure is limited to the OHIP-TMD, the MFIQ, and the Jaw Functional Limitation Scale (20-item version). At this point in time, the psychometric property of responsiveness needs further research before recommendations for a self-report outcome measure can be made.

Limitations

One of the limitations of this study is that it focuses primarily on the adult population. Admittedly, at least seven additional studies described the development of other self-report measures that were not assessed in this study because the subjects included in those projects were younger than 18 years of age.39–45 Given the prevalence studies indicating the average age of presentation to be between 20 and 40 years, it was decided a priori that the focus of this review would be on patients over 18 years of age. A future systematic review would be helpful to examine the psychometric properties of self-report outcome measures developed in studies with adolescent and young adult patients. A second limitation of this systematic review that has the potential for bias is that only English language articles were included. However, we searched several databases to minimize this bias.

Conclusion

There are few free, non-diagnostic, self-report outcome measures reportedly used for adults with TMD that have complete psychometric properties established. Thirteen studies that met the inclusion criteria were reviewed here, with eight different outcome measures identified. The psychometric properties of reliability and validity were examined primarily, and only three scales with an analysis of responsiveness were noted. Given the global presence of TMD, further validation in a variety of different populations would improve the psychometric properties specific to acceptability and interpretability. Improved analysis of patient-defined responsiveness is recommended to allow improved interpretation of the effectiveness of PT interventions. In addition, future studies identifying the prognostic ability of self-report outcome measures would be of even greater utility.

Disclaimer Statements

Funding None.

Conflict of Interest The authors assert that there are no conflicts of interest with this study.

Ethics approval None.

Contributors All authors were involved with the production of this study. AEK was the primary author, involved with study development and design. AEK, KLM, and JM were all involved with the database search strategy design and full text identification. AEK and JM completed the data extraction and data analysis. KLM and AMV were instrumental in the paper development, writing, and editing/revisions.

References

1 Kohler AA, Hugoson A, Magnusson T. Clinical signs indicative of temporomandibular disorders in adults: time trends and associated factors. Swed Dent J. 2013;37(1):1–11.

2 Liu F, Steinkeler A. Epidemiology, diagnosis, and treatment of temporomandibular disorders. Dent Clin North Am. 2013;57:465–79.

3 Manfredini D, Guarda-Nardini L, Winocur E, Piccotti F, Ahlberg J, Lobbezoo F. Research diagnostic criteria for temporomandibular disorders: a systematic review of axis I epidemiologic findings. Oral Surg Oral Med Oral Pathol Oral Radiol Endod. 2011;112(4):453–62.

4 Ferao M, Traebert J. Prevalence of temporomandibular dysfunction in patients with cervical pain under physiotherapy treatment. Fisioter Mov. 2008;21(4):63–70.

5 van Grootel RJ, van der Glas HW, Buchner R, de Leeuw JR, Passchier J. Patterns of pain variation related to myogenous temporomandibular disorders. Clin J Pain. 2005;21(2):154–65.

6 Zito G, Jull G, Story I. Clinical tests of musculoskeletal dysfunction in the diagnosis of cervicogenic headache. Man Ther. 2006;11(2):118–29.

7 Edwards RR, Grace E, Peterson S, Klick B, Haythornthwaite JA, Smith MT. Sleep continuity and architecture: associations with pain-inhibitory processes in patients with temporomandibular joint disorder. Eur J Pain. 2009;13(10):1043–7.

8 Turner JA, Mancl L, Aaron LA. Short- and long-term efficacy of brief cognitive-behavioral therapy for patients with chronic temporomandibular disorder pain: a randomized, controlled trial. Pain. 2006;121(3):181–94.

9 Bombardier C. Outcome assessments in the evaluation of treatment of spinal disorders: summary and general recommendations. Spine. 2000;25(24):3100–3.

10 Dworkin SF, Sherman J, Mancl L, Ohrbach R, LeResche L, Truelove E. Reliability, validity, and clinical utility of the research diagnostic criteria for Temporomandibular Disorders Axis II Scales: depression, non-specific physical symptoms, and graded chronic pain. J Orofac Pain. 2002;16(3):207–20.

11 Levitt SR, McKinney MW. Validating the TMJ scale in a national sample of 10,000 patients: demographic and epidemiologic characteristics. J Orofac Pain. 1994;8(1):25–35.

12 Brown DT. Temporomandibular disorder treatment options: second report on a large scale prospective clinical study. Cranio. 2002;4(10):244–53.

13 Pan Resource Center’s TMJ Scale™. Information and testing services for TMJ and chronic pain, (last accessed 2013 October 13) http://ww.tmjscale.com/clinician-resources/tmj-scale.

14 Dahlstrom L, Carlsson GE. Temporomandibular disorders and oral health-related quality of life. A systematic review. Acta Odontol Scand. 2010;68(2):80–5.

15 Ozhayat EB, Gotfredsen GT, Elverdam B, Owall B. Comparison of an individual systematic review method and the oral health impact profile. Responsiveness and ability of describing treatment effect of oral rehabilitation. J Oral Rehabil. 2010;37(8):604–14.

16 Gonzalez-Iglesias J, Cleland JA, Neto F, Hall T, Fernandez-de-Las-Penas C. Mobilization with movement, thoracic spine manipulation, and dry needling for the management of temporomandibular disorder: a prospective case series. Physiother Theory Pract. 2013;29(8):586–95.

17 List T, Axelsson S. Management of TMD: evidence from systematic reviews and meta-analyses. J Oral Rehabil. 2010;37(6):430–51.

18 Leite JC, Jerosch-Herold C, Song F. A systematic review of the psychometric properties of the Boston Carpal Tunnel Questionnaire. BMC Musculoskelet Disord. 2006;7:78.

19 Fitzpatrick R, Davey C, Buxton MJ, Jones DR. Evaluating patient-based outcome measures for use in clinical trials. Health Technol Assess. 1998;2(14):1–74.

20 Rochmon DL, Ray SA, Kullch RJ, Mehta NR, Driscoll S. Validity and utility of the Canadian occupational performance measure as an outcome measure in a craniofacial pain center. Occup Particip Health. 2008;28(1):4–11.

21 Wassell RW, Moufti MA, Meechan JG, Steen IN, Steele JG. A method for clinically defining ‘improvers’ in chronic pain studies. J Orofac Pain. 2008;22(1):30–40.

22 van Grootel RJ, van der Bilt A, van der Glas HW. Long-term reliable change of pain scores in individual myogenous TMD patients. Eur J Pain. 2007;11(6):635–43.

23 van Grootel RJ, van der Glas HW. Statistically and clinically important change of pain scores in patients with myogenous temporomandibular disorders. Eur J Pain. 2009;13(5):506–10.

24 Emshoff R, Emshoff I, Bertram S. Estimation of clinically important change for visual analog scales measuring chronic temporomandibular disorder pain. J Orofac Pain. 2010;24(3):262–9.

25 Emshoff R, Bertram S, Emshoff I. Clinically important difference thresholds of the visual analog scale: a conceptual model for identifying meaningful intraindividual changes for pain intensity. Pain. 2011;152(10):2277–82.

26 Visscher CM, Ohrbach R, van Wijk AJ, Wilkosz M, Naeije M. The Tampa Scale for Kinesiophobia for Temporomandibular Disorders (TSK-TMD). Pain. 2010;150(3):492–500.

27 Brister H, Turner JA, Aaron LA, Mancl L. Self-efficacy is associated with pain, functioning, and coping in patients with chronic temporomandibular disorder pain. J Orofac Pain. 2006;20(2):115–24.

28 Campos JA, Carrascosa AC, Maroco J. Validity and reliability of the Portuguese version of Mandibular Function Impairment Questionnaire. J Oral Rehabil. 2012;39(5):377–83.

29 Stegenga B, de Bont LG, de Leeuw R, Boering G. Assessment of mandibular function impairment associated with temporomandibular joint osteoarthrosis and internal derangement. J Orofac Pain. 1993;7(2):183–95.

30 van der Meulen MJ, John MT, Naeije M, Lobbezoo F. Developing abbreviated OHIP versions for use with TMD patients. J Oral Rehabil. 2012;39(1):18–27.

31 Durham J, Steele JG, Wassell RW, Exley C, Meechan JG, Allen PF, et al. Creating a patient-based condition-specific outcome measure for Temporomandibular Disorders (TMDs): Oral Health Impact Profile for TMDs (OHIP-TMDs). J Oral Rehabil. 2011;38(12):871–83.

32 Ohrbach R, Larsson P, List T. The jaw functional limitation scale: development, reliability, and validity of 8-item and 20-item versions. J Orofac Pain. 2008;22(3):219–30.

33 Rollman A, Naeije M, Visscher CM. The reproducibility and responsiveness of a patient-specific approach: a new instrument in evaluation of treatment of temporomandibular disorders. J Orofac Pain. 2010;24(1):101–5.

34 Moufti MA, Wassell RW, Meechan JG, Allen PF, John MT, Steele JG. The Oral Health Impact Profile: ranking of items for temporomandibular disorders. Eur J Oral Sci. 2011;119(2):169–74.

35 Wang Y-C, Hart DL, Stratford PW, Mioduski JE. Baseline dependency of minimal clinically important improvement. Phys Ther. 2011;91:675–88.

36 Wright AA, Hannon J, Hegedgus EJ, Emerson-Kavchak AJ. Clinimetrics corner: a closer look at the minimal clinically important difference (MCID). J Man Manip Ther. 2012;20(3):160–6.

37 Kamper SJ, Maher CG, MacKay G. Global rating of change scales: a review of strengths and weaknesses and considerations for design. J Man Manip Ther. 2009;17(3):163–70.

38 Laferton JA, Mora MS, Auer CJ, Moosdorf R, Rief W. Enhancing the efficacy of heart surgery by optimizing patients preoperative expectations: study protocol of a randomized controlled trial. Am Heart J. 2013;165:1–7.

39 Johnson J, Carlsson S, Johansson M, Pauli N, Ryden A, Fagerberg-Mohlin B, et al. Development and validation of the Gothenburg Trismus Questionnaire (GTQ). Oral Oncol. 2012;48(8):730–6.

40 Ohrbach R, Granger C, List T, Dworkin S. Preliminary development and validation of the Jaw Functional Limitation Scale. Community Dent Oral Epidemiol. 2008;36(3):228–36.

41 Undt G, Murakami K, Clark GT, Ploder O, Dem A, Lang T, et al. Cross-cultural adaptation of the JPF-Questionnaire for German-speaking patients with functional temporomandibular joint disorders. J Craniomaxillofac Surg. 2006;34(4):226–33.

42 Conti PC, de Azevedo LR, de Souza NV, Ferreira FV. Pain measurement in TMD patients: evaluation of precision and sensitivity of different scales. J Oral Rehabil. 2001;28(6):534–9.

43 Andreu Y, Galdon MJ, Dura E, Ferrando M, Pascual J, Turk DC, et al. An examination of the psychometric structure of the Multidimensional Pain Inventory in temporomandibular disorder patients: a confirmatory factor analysis. Head Face Med. 2006;2:48.

44 Dura E, Andreu Y, Galdon MJ, Ferrando M, Murgui S, Poveda R, et al. Psychological assessment of patients with temporomandibular disorders: confirmatory analysis of the dimensional structure of the Brief Symptoms Inventory 18. J Psychosom Res. 2006;60(4):365–70.

45 Bush FM, Harkins SW. Pain-related limitation in activities of daily living in patients with chronic orofacial pain: psychometric properties of a disability index. J Orofac Pain. 1995;9(1):57–63.


Copyright of Physical Therapy Reviews is the property of Maney Publishing and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use.