Cambridge Cognition Examination (CAMCOG)

Evidence Reviewed as of before: 18-03-2009
Editor(s): Lisa Zeltzer, MSc OT; Nicol Korner-Bitensky, PhD OT; Elissa Sitcoff, BA BSc

Purpose

The Cambridge Cognition Examination (CAMCOG) is the cognitive and self-contained part of the Cambridge Examination for Mental Disorders of the Elderly (CAMDEX). The CAMCOG is a standardized instrument used to measure the extent of dementia, and to assess the level of cognitive impairment. The measure assesses orientation, language, memory, praxis, attention, abstract thinking, perception and calculation (Roth, Tym, Mountjoy, Huppert, Hendrie, Verma, et al., 1986).

In-Depth Review

Purpose of the measure

The Cambridge Cognition Examination (CAMCOG) is the cognitive and self-contained part of the Cambridge Examination for Mental Disorders of the Elderly (CAMDEX). The CAMCOG is a standardized instrument used to measure the extent of dementia, and to assess the level of cognitive impairment. The measure assesses orientation, language, memory, praxis, attention, abstract thinking, perception and calculation (Roth, Tym, Mountjoy, Huppert, Hendrie, Verma, et al., 1986).

Available versions

The CAMCOG was developed in 1986 by Roth, Tym, Mountjoy, Huppert, Hendrie, Verma and Goddard. In 1999, Roth, Huppert, Mountjoy and Tym revised it and published the CAMCOG-R. In 2000, de Koning, Dippel, van Kooten and Koudstaal shortened the 67 items of the CAMCOG to 25 items, known as the Rotterdam CAMCOG (R-CAMCOG).

Features of the measure

Items:

The CAMCOG consists of 67 items, including the 19 items from the Mini Mental State Examination (MMSE) (Folstein, Folstein, & McHugh, 1975). It is divided into 8 subscales: orientation, language (comprehension and expression), memory (remote, recent and learning), attention, praxis, calculation, abstraction and perception (de Koning, van Kooten, Dippel, van Harskamp, Grobbee, Kluft, et al. 1998).

The orientation subscale is comprised of 10 items taken from the MMSE. In the language subscale, comprehension is assessed through nonverbal and verbal responses to spoken and written questions, and expression is assessed through tests of naming, repetition, fluency and definitions. The memory subscale assesses remote memory (famous events and people), recent memory (news items, prime minister, etc.), and learning (the recall and recognition of non-verbal and pictorial information learned incidentally as well as intentionally). Attention is assessed by serial sevens and counting backwards from 20. Praxis is assessed by copying, drawing, and writing as well as carrying out instructions. In the calculation subscale, the client is asked to perform an addition and a subtraction question involving money. For the abstraction subscale, the client is asked about similarities between an apple and a banana, a shirt and a dress, a chair and a table, and a plant and an animal. In the perception subscale, the client is asked to identify photographs of famous people and familiar objects from unusual angles, in addition to the tactile recognition of coins (Huppert, Jorm, Brayne, Girling, Barkley, Beardsall et al., 1996).

The number of scored items for each subscale is as follows (de Koning et al., 1998; Huppert et al., 1996).

CAMCOG subscales              Number of scored items
Orientation                   10
Language: comprehension       9
Language: expression          8
Memory: learning              3
Memory: recent                4
Memory: remote                6
Concentration                 2
Praxis                        8
Calculation                   2
Perception                    3
Abstraction                   4
Total number of scored items  59

Items related to aphasia or upper extremity paresis may not be testable in all clients, depending on stroke severity.

Detailed administration guidelines are provided in the CAMCOG manual, which can be obtained from the Cambridge University Department of Psychiatry.

Scoring:

The CAMCOG total score ranges from 0 to 107. Scores lower than 80 are considered indicative of dementia (de Koning et al., 1998; Roth et al., 1986). Among the 67 CAMCOG items, 39 are scored as ‘right’ or ‘wrong’; 11 are scored on a 3-point scale with ‘wrong’, ‘right to a certain degree’ or ‘completely right’ as response options; 9 items encompass questions or commands, and the score for each item is the sum of the correct answers; and finally 8 items are not scored. Five of the non-scored items are from the MMSE and are not included in the total score because they are assessed in more detail by other CAMCOG items. The remaining 3 items are optional during the examination (de Koning, Dippel, van Kooten, & Koudstaal, 2000; Huppert et al., 1996).

The maximum score per subscale is as follows (Huppert et al., 1996):

CAMCOG subscales              Maximum score
Orientation                   10
Language: comprehension       9
Language: expression          21
Memory: learning              17
Memory: recent                4
Memory: remote                6
Concentration                 4
Praxis                        12
Calculation                   5
Perception                    11
Abstraction                   8
Maximum total score           107
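
Putting the scoring rules above together with these subscale maxima, the following is a minimal Python sketch; the dictionary keys and function names are our own illustrative choices and are not part of the CAMCOG materials. The cut-off of 80 is the one reported above.

    # Subscale maxima from the table above; they sum to 107.
    SUBSCALE_MAX = {
        "orientation": 10, "comprehension": 9, "expression": 21,
        "learning": 17, "recent": 4, "remote": 6, "concentration": 4,
        "praxis": 12, "calculation": 5, "perception": 11, "abstraction": 8,
    }

    def camcog_total(subscale_scores):
        """Sum subscale scores after range-checking each against its maximum."""
        total = 0
        for name, score in subscale_scores.items():
            if not 0 <= score <= SUBSCALE_MAX[name]:
                raise ValueError(f"{name} score {score} out of range")
            total += score
        return total

    def suggests_dementia(total, cutoff=80):
        """Scores below 80 are considered indicative of dementia."""
        return total < cutoff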

Time:

The CAMCOG takes 20 to 30 minutes to administer and the R-CAMCOG takes 10 to 15 minutes to administer (de Koning et al., 1998; de Koning et al., 2000; Huppert et al., 1996).

Subscales:

The CAMCOG is comprised of 8 subscales:

  • Orientation
  • Language: subdivided into comprehension and expression
  • Memory: subdivided into remote, recent and learning memory
  • Attention
  • Praxis
  • Calculation
  • Abstraction
  • Perception

Equipment:

The CAMCOG requires a pencil and the test materials enclosed within its manual; no other specialized equipment is needed. The manual can be purchased from the Cambridge University Department of Psychiatry.

Alternative forms of the CAMCOG

  • Revised CAMCOG (CAMCOG-R): Published in 1999 by Roth, Huppert, Mountjoy and Tym, the CAMCOG-R improved the ability of the measure to detect certain types of dementia and to make clinical diagnoses based on the ICD-10 and DSM-IV. This version includes updated items from the remote memory subscale and the addition of items to assess executive function (Leeds, Meara, Woods & Hobson, 2001; Roth, Huppert, Mountjoy & Tym, 1999).
  • Rotterdam CAMCOG (R-CAMCOG): Published in 2000, the R-CAMCOG is a shortened version of the CAMCOG with 25 items. It takes 10 to 15 minutes to administer and is as accurate as the CAMCOG in screening for post-stroke dementia (de Koning et al., 2000).
  • General Practitioner Assessment of Cognition (GPCOG): Published in 2002 to be used in primary care settings, the GPCOG contains 9 cognitive and 6 informant items that were derived from the Cambridge Cognitive Examination, the Psychogeriatric Assessment Scales (Jorm, Mackinnon, Henderson, Scott, Christensen, Korten et al. 1995) and the Instrumental Activities of Daily Living Scale (Lawton & Brody, 1969). The GPCOG takes 4 to 5 minutes to administer and appears to have a diagnostic accuracy similar to the Mini-Mental State Examination (Folstein, Folstein, & McHugh, 1975) in detecting dementia (Brodaty, Pond, Kemp, Luscombe, Harding, Berman et al., 2002).

Client suitability

Can be used with:

  • Clients with stroke
  • Clients with different types of dementia

Should not be used with:

  • The CAMCOG should not be used with clients with severe cognitive impairment.
  • Items related to aphasia and upper extremity paresis might not be tested on all clients and appropriate use depends on stroke severity.

In what languages is the measure available?

English and Dutch (de Koning et al., 2000).

Summary

What does the tool measure? The CAMCOG is a standardized instrument for diagnosis and grading of dementia.
What types of clients can the tool be used for? The CAMCOG can be used with, but is not limited to, clients with stroke.
Is this a screening or assessment tool? Assessment
Time to administer The CAMCOG takes 20 to 30 minutes to administer.
Versions Revised CAMCOG (CAMCOG-R); Rotterdam-CAMCOG (R-CAMCOG); General Practitioner Assessment of Cognition (GPCOG)
Other Languages English; Dutch
Measurement Properties
Reliability Internal consistency:
No studies have examined the internal consistency of the CAMCOG in clients with stroke.
Validity Content:
– No studies have examined the content validity of the CAMCOG in clients with stroke.
– One study examined the content validity of the R-CAMCOG by reporting the steps for generating the shortened version of the CAMCOG.
Criterion:
Concurrent:
No studies have examined the concurrent validity of the CAMCOG in clients with stroke.
Predictive:
Six studies examined the predictive validity of the CAMCOG and reported that CAMCOG scores can be predicted by age, the R-CAMCOG, the Mini-Mental State Examination, and cognitive and emotional impairments. Additionally, the CAMCOG was an excellent predictor of dementia at 3 to 9 months post-stroke. However, the CAMCOG was not able to predict quality of life in clients with stroke and is only weakly predicted by the Functional Independence Measure.
Construct:
Convergent:
– One study examined the convergent validity of the CAMCOG in clients with stroke and reported excellent correlations between the CAMCOG and both the R-CAMCOG and the Mini-Mental State Examination shortly after stroke and at 1 year post-stroke. Correlations between the CAMCOG and the Functional Independence Measure were adequate shortly after stroke and poor at 1 year post-stroke.
– One study examined the convergent validity of the CAMCOG-R and reported excellent correlations between the CAMCOG-R and the Raven Test and the Weigl Test and poor correlations between the CAMCOG-R and the Geriatric Depression Scale and the Barthel Index using Pearson correlation.
Known Groups:
Two studies using Student's t-tests examined the known groups validity of the CAMCOG and reported that the CAMCOG is able to distinguish clients with dementia from those without, and to discriminate according to aphasia severity, in clients with stroke.
Floor and ceiling effect One study examined the floor/ceiling effects of the CAMCOG in clients with stroke and reported that 14 items showed ceiling effects but no floor effects.
Does the tool detect change in patients? – No studies have examined the responsiveness of the CAMCOG in clients with stroke.
– One study examined the responsiveness of the CAMCOG-R and reported that changes in scores at follow-up were statistically significant (p<0.01).
Acceptability Items related to aphasia and upper extremity paresis might not be tested on all clients due to stroke severity.
Feasibility The instructions for administration and coding must be followed closely (Ruchinskas and Curyto, 2003).
How to obtain the tool? The CAMCOG can be obtained by purchasing the entire CAMDEX from the Cambridge University Department of Psychiatry.

Psychometric Properties

Overview

We conducted a literature search to identify all relevant publications on the psychometric properties of the Cambridge Cognition Examination (CAMCOG) in individuals with stroke. We identified 6 studies on the CAMCOG, 1 on the CAMCOG-R and 1 on the R-CAMCOG.

Floor/Ceiling Effects

de Koning, Dippel, van Kooten and Koudstaal (2000) analyzed the floor and ceiling effects of the CAMCOG in 300 clients with stroke. A ceiling effect was found in 2 out of 10 orientation items, 8 out of 17 language items, 2 out of 13 memory items, 1 out of 8 praxis items, and 1 out of 3 perception items, with more than 20% of participants scoring the maximum score. No floor effect was observed in the CAMCOG.
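
The "more than 20% at the maximum score" criterion can be stated compactly. A minimal sketch, assuming item scores are available as a list of numbers; the function name and the threshold parameter are ours, while the 20% criterion is from the study:

    def has_ceiling_effect(item_scores, item_max, threshold=0.20):
        """An item shows a ceiling effect when more than 20% of
        participants obtain its maximum score (criterion from
        de Koning et al., 2000)."""
        at_max = sum(1 for s in item_scores if s == item_max)
        return at_max / len(item_scores) > threshold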

Reliability

No studies have examined the reliability of the CAMCOG in clients with stroke.

Validity

Content:

No studies have examined the content validity of the CAMCOG in clients with stroke.

de Koning et al. (2000) analyzed CAMCOG scores from 300 clients with stroke and reduced the 59 items of the CAMCOG to the 25 items of the R-CAMCOG. Initially, item reduction was performed by removing 14 items with ceiling effects on the CAMCOG. Next, the language, attention, praxis, and calculation subscales were eliminated due to their low diagnostic accuracy. Finally, items with a very low or very high inter-item correlation were removed.

Criterion:

Concurrent:
No studies have examined the concurrent validity of the CAMCOG in clients with stroke.

Predictive:
Kwa, Limburg, Voogel, Teunisse, Derix and Hijdra (1996a) examined whether age, educational level, side and volume of the infarct, aphasia severity, and motor function predicted CAMCOG scores at 3 months after stroke in 129 clients. A cut-off of 80 was used to discriminate between normal and abnormal cognitive function. In a regression analysis including these variables, age was the best predictor of CAMCOG scores 3 months post-stroke.
Note: The timeline for the baseline measurements was not reported in the study.

Kwa, Limburg and de Haan (1996b) verified the ability of the CAMCOG, the Rankin Scale (Rankin, 1957), the Barthel Index (Mahoney & Barthel, 1965), the Motricity Index (Collin & Wade, 1990), aphasia severity, age, educational level, and volume and side of the infarct to predict quality of life in 97 clients with stroke. Linear regression analysis indicated that quality of life is best predicted by the Rankin Scale, volume of infarct and aphasia severity.
Note: The timing of the measurements was not reported in the study.

de Koning, van Kooten, Dippel, van Harskamp, Grobbee, Kluft, et al. (1998) analyzed the ability of the CAMCOG and the Mini-Mental State Examination (MMSE – Folstein, Folstein, & McHugh, 1975), measured shortly after stroke, to predict dementia measured 3 to 9 months later in 300 clients with stroke. Predictive validity was calculated using c-statistics to estimate the area under the Receiver Operating Characteristic (ROC) curve. The ability of the CAMCOG (AUC = 0.95) and the MMSE (AUC = 0.90) to predict dementia after stroke was considered excellent in both cases. These results suggest that the percentage of patients correctly classified according to their dementia status at 3 to 9 months post-stroke is only slightly higher when using the CAMCOG rather than the MMSE.
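
The c-statistic used here equals the area under the ROC curve: the probability that a randomly chosen client who develops dementia scores lower (more impaired) on the measure than one who does not. A minimal sketch of that rank-based computation, assuming raw scores and dementia outcomes are available; the function and variable names are ours:

    def c_statistic(scores, has_dementia):
        """AUC as the probability that a randomly chosen client who
        develops dementia scores lower than one who does not (lower
        CAMCOG scores indicate greater impairment); ties count 0.5."""
        cases = [s for s, d in zip(scores, has_dementia) if d]
        controls = [s for s, d in zip(scores, has_dementia) if not d]
        wins = sum(1.0 if c < n else 0.5 if c == n else 0.0
                   for c in cases for n in controls)
        return wins / (len(cases) * len(controls))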

de Koning et al. (2000) examined whether the CAMCOG and the R-CAMCOG, measured at hospital admission, predicted dementia at 3 to 9 months post-stroke in 300 clients. Predictive validity, calculated using c-statistics to estimate the area under the Receiver Operating Characteristic (ROC) curve, was excellent for both the CAMCOG (AUC = 0.95) and the R-CAMCOG (AUC = 0.95). These results suggest that the percentage of patients correctly classified according to their dementia status at 3 to 9 months post-stroke is the same when using the CAMCOG and the R-CAMCOG. Additionally, when using a cut-off of 77 for the CAMCOG and 33 for the R-CAMCOG, both measures showed a sensitivity of 91%, and specificity was 88% and 90%, respectively.
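
Given a cut-off, sensitivity and specificity follow directly from the resulting two-by-two classification. A minimal sketch, assuming scores at or below the cut-off are classified as positive for dementia; whether the published cut-offs of 77 and 33 are strict or inclusive is not stated in the text, so the '<=' below is an assumption:

    def sensitivity_specificity(scores, has_dementia, cutoff):
        """Classify scores at or below the cut-off as positive for
        dementia, then tabulate against the true outcomes."""
        tp = fn = tn = fp = 0
        for s, d in zip(scores, has_dementia):
            positive = s <= cutoff  # '<=' is an assumption; see lead-in
            if d and positive:
                tp += 1
            elif d:
                fn += 1
            elif positive:
                fp += 1
            else:
                tn += 1
        return tp / (tp + fn), tn / (tn + fp)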

van Heugten, Rasquin, Winkens, Beusmans, and Verhey (2007) estimated the ability of a checklist of cognitive and emotional impairments measured 6 months post-stroke to predict the CAMCOG and the Mini-Mental State Examination (MMSE – Folstein, Folstein, & McHugh, 1975) scores at 12 months in 69 clients. Regression analysis showed that cognitive and emotional impairments explained 31% of the variance on the MMSE and 22% of the variance on the CAMCOG. These results suggest that cognitive and emotional impairments were able to predict the scores of both measures.

Winkel-Witlox, Post, Visser-Meily, and Lindeman (2008) analyzed the ability of the R-CAMCOG, the Mini-Mental State Examination (MMSE – Folstein, Folstein, & McHugh, 1975) and the Functional Independence Measure (FIM – Keith, Granger, Hamilton, & Sherwin, 1987) to predict the CAMCOG in 169 clients. All four outcome measures were collected shortly after and 1 year post-stroke. Regression analysis showed that shortly after stroke the R-CAMCOG explained 83% of the variance in CAMCOG scores, the MMSE explained 53% and the FIM 11%. At 1 year post-stroke the R-CAMCOG explained 82% of the variance, the MMSE explained 57% and the FIM only 4%. These results suggest that the R-CAMCOG is the best predictor of the CAMCOG among these independent variables.
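
The "percentage of variance explained" figures correspond to R² from a regression of CAMCOG scores on each predictor. A minimal single-predictor sketch using NumPy; the study's exact regression models may have differed:

    import numpy as np

    def variance_explained(predictor, camcog):
        """R^2 from a simple linear regression of CAMCOG scores on one
        predictor; multiply by 100 for 'percentage of variance'."""
        x = np.asarray(predictor, dtype=float)
        y = np.asarray(camcog, dtype=float)
        slope, intercept = np.polyfit(x, y, 1)
        residuals = y - (slope * x + intercept)
        return 1.0 - residuals.var() / y.var()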

Construct:

Convergent/Discriminant:
Winkel-Witlox et al. (2008) examined the convergent validity of the CAMCOG by comparing it to the R-CAMCOG, the Mini-Mental State Examination (MMSE – Folstein, Folstein, & McHugh, 1975) and the Functional Independence Measure (FIM – Keith, Granger, Hamilton, & Sherwin, 1987) in 169 clients with stroke. Correlations between the CAMCOG and both the R-CAMCOG and the MMSE were excellent shortly after stroke (rho = 0.92 and 0.66, respectively) and at 1 year post-stroke (rho = 0.92 and 0.69, respectively). The correlation between the CAMCOG and the FIM was adequate shortly after stroke (rho = 0.35) and poor at 1 year (rho = 0.27).

Leeds, Meara, Woods and Hobson (2001) analyzed the construct validity of the CAMCOG-R by comparing it to the Raven Test (Raven, 1982), the Weigl Test (Grewal, Haward, & Davies, 1986), the Geriatric Depression Scale (Sheikh & Yesavage, 1986) and the Barthel Index (Mahoney & Barthel, 1965) in 83 clients with stroke. Correlations as calculated using Pearson correlations were excellent between the CAMCOG-R and the Raven Test (r = 0.75) and the Weigl Test (r = 0.70). Correlations between the CAMCOG-R and the Geriatric Depression Scale (r = -0.30) and the Barthel Index (r = 0.20) were poor.

Known groups:
de Koning et al. (1998) analyzed whether the CAMCOG is able to distinguish individuals with dementia from those without dementia in 300 clients with stroke. Known groups validity, calculated using Student's t-tests, showed that the CAMCOG was able to discriminate clients with dementia from those without dementia. These results demonstrate that clients with dementia have statistically significantly lower scores on the CAMCOG.

Kwa et al. (1996a) verified the ability of the CAMCOG to discriminate between clients without aphasia and those with severe aphasia in 129 clients with stroke. Known groups validity, calculated using Student's t-tests, showed that the CAMCOG was able to differentiate between levels of aphasia severity.

Responsiveness

No studies have examined the responsiveness of the original CAMCOG in clients with stroke.

Leeds et al. (2001) examined the responsiveness of the CAMCOG-R in 83 clients with stroke. Participants were assessed at baseline and 63 days later. At follow-up, changes in CAMCOG-R scores were statistically significant (p<0.01). These results suggest that the CAMCOG-R is sensitive to change in the cognitive status of clients with stroke.

References

  • Brodaty, H., Pond, D., Kemp, N.M., Luscombe, G., Harding, L., Berman, K. et al. (2002). The GPCOG: A new screening test for dementia designed for general practice. Journal of the American Geriatrics Society, 50, 530-534.
  • Collin, C. & Wade, D. (1990). Assessing motor impairment after stroke: A pilot reliability study. J Neurol Neurosurg Psychiatry, 53, 576-579.
  • de Koning, I., Dippel, D.W.J., van Kooten, F. & Koudstaal, P.J. (2000). A short screening instrument for poststroke dementia: The R-CAMCOG. Stroke, 31, 1502-1508.
  • de Koning, I., van Kooten, F., Dippel, D.W.J., van Harskamp, F., Grobbee, D.E., Kluft, C. & Koudstaal, P.J. (1998). The CAMCOG: A useful screening instrument for dementia in stroke patients. Stroke, 29, 2080-2086.
  • Folstein, M.F., Folstein, S. E. & McHugh, P. R. (1975). “Mini-mental state”. A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res, 12(3), 189-198.
  • Grewal, B., Haward, L. & Davies, I. (1986). Color and form stimulus values in a test of dementia. IRCS Med Sci, 14, 693-694.
  • Huppert, F.A., Jorm, A.F., Brayne, C., Girling, D.M., Barkley, C., Beardsall, L., et al. (1996). Psychometric properties of the CAMCOG and its efficacy in the diagnosis of dementia. Aging, Neuropsychology, and Cognition, 3, 201-214.
  • Jorm, A.F., Mackinnon, A.J., Henderson, A.S., Scott, H., Christensen, H., Korten, A.E., et al. (1995). The Psychogeriatric Assessment Scales: A multidimensional alternative to categorical diagnoses of dementia and depression in the elderly. Psychol Med, 25, 447-460.
  • Keith, R.A., Granger, C.V., Hamilton, B.B., & Sherwin, F.S. (1987). The functional independence measure: A new tool for rehabilitation. Adv Clin Rehabil, 1, 6-18.
  • Kwa, V.I.H., Limburg, M. & de Haan, R.J. (1996b). The role of cognitive impairment in the quality of life after ischaemic stroke. J Neurol, 243, 599-604.
  • Kwa, V.I.H., Limburg, M., Voogel, A.J., Teunisse, S., Derix, M.M.A. & Hijdra, A. (1996a). Feasibility of cognitive screening of patients with ischaemic stroke using the CAMCOG: a hospital based study. J Neurol, 243, 405-409.
  • Lawton, M.P. & Brody, E.M. (1969). Assessment of older people: Self-maintaining and instrumental activities of daily living. Gerontologist, 9, 179-186.
  • Leeds, L., Meara, R.J., Woods, R. & Hobson, J.P. (2001). A comparison of the new executive functioning domains of the CAMCOG-R with existing tests of executive function in elderly stroke survivors. Age and Ageing, 30, 251-254.
  • Mahoney, F. & Barthel, D. (1965). Functional evaluation: The Barthel Index. Md State Med J, 14, 61-65.
  • Rankin, J. (1957). Cerebral vascular accidents in patients over the age of 60. Scott Med J, 2, 200-215.
  • Raven, J.C. (1982). Revised manual for Raven’s Coloured Progressive Matrices. Windsor, UK: NFER-Nelson.
  • Roth, M., Huppert, F., Mountjoy, C., & Tym, E. (1999). The Cambridge Examination for Mental Disorders of the Elderly – Revised. Cambridge: Cambridge University Press.
  • Roth, M., Tym, E., Mountjoy, C., Huppert, F.A., Hendrie, H., Verma, S. et al. (1986). CAMDEX: A standardized instrument for the diagnosis of mental disorder in the elderly with special reference to the early detection of dementia. British Journal of Psychiatry, 149, 698-709.
  • Ruchinskas, R.A. & Curyto, K. (2003). Cognitive screening in geriatric rehabilitation. Rehabilitation Psychology, 48(1), 14-22.
  • Sheikh, J.A. & Yesavage, J.A. (1986). Geriatric depression scale (GDS): Recent findings and development of a shorter version. Clinical Gerontologist, 5, 165-172.
  • Winkel-Witlox, A.C.M. te, Post, M.W.M., Visser-Meily, J.M.A., & Lindeman, E. (2008). Efficient screening of cognitive dysfunction in stroke patients: Comparison between the CAMCOG and the R-CAMCOG, Mini-Mental State Examination and Functional Independence Measure-cognition score. Disability and Rehabilitation, 30(18), 1386-1391.
  • van Heugten, C., Rasquin, S., Winkens, I., Beusmans, G., & Verhey, F. (2007). Checklist for cognitive and emotional consequences following stroke (CLCE-24): Development, usability and quality of the self-report version. Clinical Neurology and Neurosurgery, 109, 257-262.

See the measure

How to obtain the CAMCOG

The CAMCOG can be obtained by purchasing the entire CAMDEX from the Cambridge University Department of Psychiatry.

Clock Drawing Test (CDT)

Evidence Reviewed as of before: 19-08-2008
Author(s)*: Lisa Zeltzer, MSc OT; Anita Menon, MSc
Editor(s): Nicol Korner-Bitensky, PhD OT; Elissa Sitcoff, BA BSc

Purpose


The CDT is used to quickly assess visuospatial and praxis abilities, and may determine the presence of both attention and executive dysfunctions (Adunsky, Fleissig, Levenkrohn, Arad, & Nov, 2002; Suhr, Grace, Allen, Nadler, & McKenna, 1998; McDowell, & Newell, 1996).

The CDT may be used in addition to other quick screening tests such as the Mini-Mental State Examination (MMSE), and the Functional Independence Measure (FIM).

In-Depth Review

Purpose of the measure

The CDT is used to quickly assess visuospatial and praxis abilities, and may determine the presence of both attention and executive dysfunctions (Adunsky, Fleissig, Levenkrohn, Arad, & Nov, 2002; Suhr, Grace, Allen, Nadler, & McKenna, 1998; McDowell & Newell, 1996).

The CDT may be used in addition to other quick screening tests such as the Mini-Mental State Examination (MMSE), and the Functional Independence Measure (FIM).

Available versions

The CDT is a simple task completion test in its most basic form. There are several variations to the CDT:

Verbal command:

  • Free drawn clock:
    The individual is given a blank sheet of paper and asked first to draw the face of a clock, place the numbers on the clock, and then draw the hands to indicate a given time. To successfully complete this task, the patient must first draw the contour of the clock, then place the numbers 1 through 12 inside, and finally indicate the correct time by drawing in the hands of the clock.
  • Pre-drawn clock:
    Alternatively, some clinicians prefer to provide the individual with a pre-drawn circle and the patient is only required to place the numbers and the hands on the face of the clock. They argue that the patient’s ability to fill in the numbers may be adversely affected if the contour is poorly drawn. In this task, if an individual draws a completely normal clock, it is a fast indication that a number of functions are intact. However, a markedly abnormal clock is an important indication that the individual may have a cognitive deficit, warranting further investigation.

Regardless of which type is used (free drawn or pre-drawn), the verbal command CDT can simultaneously assess a patient’s language function (verbal comprehension); memory function (recall of a visual engram, short-term storage, and recall of time setting instructions); and executive function. The verbal command variation of the CDT is highly sensitive for temporal lobe dysfunction (due to its heavy involvement in both memory and language processes) and frontal lobe dysfunction (due to its mediation of executive planning) (Shah, 2001).

Copy command:

The individual is given a fully drawn clock with a certain time pre-marked and is asked to replicate the drawing as closely as possible. The successful completion of the copy command requires less use of language and memory functions but requires greater reliance on visuospatial and perceptual processes.

Copy command clock

Clock reading test:
A modified version of the copy command CDT simply asks the patient to read aloud the indicated time on a clock drawn by the examiner. The copy command clock-drawing and clock reading tests are good for assessing parietal lobe lesions such as those that may result in hemineglect. It is important to do both the verbal command and the copy command tests for every patient as a patient with a temporal lobe lesion may copy a pre-drawn clock adequately, whereas their clock drawn to verbal command may show poor number spacing and incorrect time setting. Conversely, a patient with a parietal lobe lesion may draw an adequate clock to verbal command, while their clock drawing with the copy command may show obvious signs of neglect.

Clock reading clock

Time-Setting Instructions:

The most common setting chosen by clinicians is “3 O’clock” (Freedman, Leach, Kaplan, Winocur, Shulman, & Delis, 1994). Although this setting adequately assesses comprehension and motor execution, it does not indicate the presence of any left neglect the patient may have because it does not require the left half of the clock to be used at all. The time setting “10 after 11” is an ideal setting (Kaplan, 1988). It forces the patient to attend to the whole clock and requires the recoding of the command “10” to the number “2” on the clock. It also has the added advantage of uncovering any stimulus-bound errors that the patient may make. For example, the presence of the number “10” on the clock may trap some patients and prevent the recoding of the command “10” into the number “2.” Instead of drawing the minute hand towards the number “2” on the clock to indicate “10 after,” patients prone to stimulus-bound errors will fixate and draw the minute hand toward the number “10” on the clock.

Features of the measure

Scoring:

There are a number of different ways to score the CDT. In general, the scores are used to evaluate any errors or distortions such as neglecting to include numbers, putting numbers in the wrong place, or having incorrect spacing (McDowell & Newell, 1996). Scoring systems may be simple or complex, quantitative or qualitative in nature. As a quick preliminary screening tool to simply detect the presence or absence of cognitive impairment, you may wish to use a simple quantitative method (Lorentz et al., 2002). However, if a more complex assessment is required, a qualitative scoring system would be more telling.

Different scoring methods have been found to be better suited for different subject groups (Richardson & Glass, 2002; Heinik, Solomesh, & Berkman, 2004). In patients with stroke, no single standardized method of scoring exists. Suhr, Grace, Allen, Nadler, and McKenna (1998) examined the utility of the CDT in localizing lesions in 76 patients with stroke and 71 controls. Six scoring systems were used to assess clock drawings (Freedman et al., 1994; Ishiai, Sugishita, Ichikawa, Gono, & Watabiki, 1993; Mendez, Ala, & Underwood, 1992; Rouleau, Salmon, Butters, Kennedy, & McGuire, 1992; Sunderland et al., 1989; Tuokko, Hadjistavropoulos, Miller, & Beattie, 1992; Watson, Arfken, & Birge, 1993; Wolf-Klein et al., 1989). Significant differences were found between controls and patients with stroke on all scoring systems for both quantitative and qualitative features of the CDT. However, quantitative indices were not helpful in differentiating between various stroke groups (left versus right versus bilateral stroke; cortical versus subcortical stroke; anterior versus posterior stroke). Qualitative features were helpful in lateralizing lesion site and differentiating subcortical from cortical groups.

A psychometric study in patients with stroke by South, Greve, Bianchini, and Adams (2001) compared three scoring systems: the Rouleau rating scale (1992), the Freedman scoring system (1994), and the Libon revised system (1993). These scoring systems were found to be reliable in patients with stroke (see the Reliability section below for details of this study).

Subscales:

None typically reported.

Equipment:

Only a paper and pencil is required. Depending on the method chosen, you may need to prepare a circle (about 10 cm in diameter) on the paper for the patient.

Training:

The CDT can be administered by individuals with little or no training in cognitive assessment. Scanlan, Brush, Quijano, & Borson (2002) found that a simple binary rating of clock drawings (normal or abnormal) by untrained raters was surprisingly effective in classifying subjects as having dementia or not. In this study, a common mistake of untrained scorers was failure to recognize incorrect spacing of numbers on the clock face as abnormal. By directing raters' attention to this type of error, concordance between untrained and expert raters should improve.
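
Concordance between untrained and expert raters on a binary normal/abnormal judgment is commonly summarized with a chance-corrected agreement statistic such as Cohen's kappa. A minimal sketch, assuming ratings are coded 0 (normal) and 1 (abnormal); the choice of kappa here is ours, not necessarily the statistic used by Scanlan et al. (2002):

    def cohen_kappa(rater_a, rater_b):
        """Chance-corrected agreement for binary clock ratings coded
        0 = normal, 1 = abnormal."""
        n = len(rater_a)
        observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
        p_a, p_b = sum(rater_a) / n, sum(rater_b) / n
        expected = p_a * p_b + (1 - p_a) * (1 - p_b)
        return (observed - expected) / (1 - expected)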

Time:

All variations of the CDT should take approximately 1-2 minutes to complete (Ruchinskas & Curyto, 2003).

Alternative forms of the CDT

The Clock Drawing Test-Modified and Integrated Approach (CDT-MIA) is a 4-step, 20-item instrument with a maximum score of 33. The CDT-MIA emphasizes differential scoring of contour, numbers, hands, and center. It integrates three existing CDTs:

  • Freedman et al.'s (1994) free-drawn clock, for some item definitions
  • Scoring techniques adapted from Paganini-Hill, Clark, Henderson, & Birge (2001)
  • Some items borrowed from Royall, Cordes, and Polk's (1998) executive clock drawing task (CLOX)

The CDT-MIA was found to be reliable and valid in individuals with dementia; however, this measure has not been validated in the stroke population (Heinik et al., 2004).

Client suitability

Can be used as a screening instrument with:

Virtually any patient population (Wagner, Nayak, & Fink, 1995). The test appears to be differentially sensitive to some types of disease processes. In particular, it has proven to be clinically useful in differentiating among normal elderly, patients with neurodegenerative or vascular diseases, and those with psychiatric disorders, such as depression and schizophrenia (Dastoor, Schwartz, & Kurzman, 1991; Heinik, Vainer-Benaiah, Lahav, & Drummer, 1997; Lee & Lawlor, 1995; Shulman, Gold, & Cohen, 1993; Spreen & Strauss, 1991; Tracy, De Leon, Doonan, Musciente, Ballas, & Josiassen, 1996; Wagner et al., 1995; Wolf-Klein, Silverstone, Levy, & Brod, 1989).

Can be used with:

  • Patients with stroke. Because the CDT requires a nonverbal response, it may be administered to those with speech difficulties but who have sufficient comprehension to understand the requirement of the task.

Should not be used in:

  • Patients who cannot understand spoken or written instructions
  • Patients who cannot write

As with many other neuropsychological screening measures, the CDT is affected by age, education, conditions such as visual neglect and hemiparesis, and other factors such as the presence of depression (Ruchinskas & Curyto, 2003; Lorentz, Scanlan, & Borson, 2002). The degree to which these factors affect one's score depends largely on the scoring method applied (McDowell & Newell, 1996). Moreover, the CDT focuses on right hemisphere function, so it is important to use this test in conjunction with other neuropsychological tests (McDowell & Newell, 1996).

In what languages is the measure available?

The CDT can be conducted in any language. Borson et al. (1999) found that language spoken did not have any direct effect on CDT test performance.

Summary

What does the tool measure? Visuospatial and praxis abilities, and may determine the presence of both attention and executive dysfunctions.
What types of clients can the tool be used for? Virtually any patient population. It has proven to be clinically useful in differentiating among normal elderly, patients with neurodegenerative or vascular diseases, and those with psychiatric disorders, such as depression and schizophrenia.
Is this a screening or assessment tool? Screening
Time to administer All variations of the CDT should take approximately 1-2 minutes to complete.
Versions

  • Verbal command: Free drawn clock; Pre-drawn clock;
  • Copy command: Copy command clock; Clock reading test
  • Time-setting: “10 after 11”
  • The Clock Drawing Test Modified and Integrated Approach (CDT-MIA)
Languages The CDT can be conducted in any language.
Measurement Properties
Reliability Test-retest:
Out of four studies examining test-retest reliability, three reported excellent and one reported adequate test-retest reliability.
Inter-rater:
Out of seven studies examining inter-rater reliability, six reported excellent inter-rater reliability and one reported adequate (for examiner clocks) to excellent (for free-drawn and pre-drawn clocks) inter-rater reliability.
Validity Criterion:
CDT scores predicted lower functional ability and increased need for supervision at hospital discharge; poor physical ability and longer length of stay in geriatric rehabilitation; and activities of daily living at maximal recovery.
Construct:
The CDT correlated adequately with the Mini-Mental State Examination and the Functional Independence Measure.
Known groups:
Significant differences between Alzheimer’s patients and controls detected by CDT.
Does the tool detect change in patients? Not applicable
Acceptability The CDT is short and simple. It is a nonverbal task and may be less threatening to patients than responding to a series of questions.
Feasibility The CDT is inexpensive and highly portable. It can be administered in situations in which longer tests would be impossible or inconvenient. Even the most complex administration and scoring system requires approximately 2 minutes. It can be administered by individuals with minimal training in cognitive assessment.
How to obtain the tool? A pre-drawn circle can be downloaded from the “See the measure” section below.

Psychometric Properties

Overview

Until recently, data on the psychometric properties of the CDT were limited. While there are many possible ways to administer and score the CDT, the psychometric properties of all the various systems seem consistent and all forms correlate strongly with other cognitive measures (Scanlan et al., 2002; Ruchinskas & Curyto, 2003; McDowell & Newell, 1996). Further, scoring of the CDT has been found to be both accurate and consistent in patients with stroke (South et al., 2001).

For the purposes of this review, we conducted a literature search to identify all relevant publications on the psychometric properties of the more commonly applied scoring methods of the CDT. We then selected for review articles from high-impact journals and from a variety of authors.

Reliability

Test-retest:

Test-retest reliability of the CDT, as indicated by Spearman rank order correlations, has been reported by several investigators using a variety of scoring systems:

  • Manos and Wu (1994) reported an “excellent” 2-day test-retest reliability of 0.87 for medical patients and 0.94 for surgical patients.
  • Tuokko et al. (1992) reported an “adequate” test-retest reliability of 0.70 at 4 days.
  • Mendez et al. (1992) reported “excellent” coefficients of 0.78 and 0.76 at 3 and 6 months, respectively.
  • Freedman et al. (1994) reported test-retest reliability as “very low”. However, when the “10 after 11” time setting was used with the examiner clock, which is known to be a more sensitive setting for detecting cognitive dysfunction, test-retest reliability was found to be “excellent” (0.94).
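
As a minimal sketch of the statistic behind these coefficients, the snippet below applies SciPy's spearmanr to illustrative data; the values are made up and are not drawn from the studies cited above:

    from scipy.stats import spearmanr

    # Illustrative CDT scores for the same patients at two sessions
    # (made-up values, not data from the studies cited above).
    session1 = [8, 5, 9, 3, 7, 10, 6, 4]
    session2 = [7, 5, 10, 4, 6, 9, 6, 3]

    rho, p_value = spearmanr(session1, session2)
    print(f"Test-retest Spearman rho = {rho:.2f} (p = {p_value:.3f})")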

Inter-rater:

Inter-rater reliability of the CDT, as indicated by Spearman rank order correlations (not the preferred method of analyses for assessing inter-rater reliability but one used in earlier measurement research), has also been reported by several investigators:

  • Sunderland et al. (1989) found “excellent” coefficients ranging from 0.86 to 0.97 and found no difference between clinician and non-clinician raters (0.84 and 0.86, respectively).
  • Rouleau et al. (1992) found “excellent” inter-rater reliability, with coefficients ranging from 0.92 to 0.97.
  • Mendez et al. (1992) reported “excellent” inter-rater reliability of 0.94.
  • Tuokko et al. (1992) reported high coefficients ranging from 0.94 to 0.97 across three annual assessments.
  • The modified Shulman scale (Shulman, Gold, Cohen, & Zucchero, 1993) also has “excellent” inter-rater reliability (0.94 at baseline, 0.97 at 6 months, and 0.97 at 12 months).
  • Manos and Wu (1994) obtained “excellent” inter-rater reliability coefficients ranging from 0.88 to 0.96.
  • Freedman et al. (1994) reported coefficients ranging from 0.79 to 0.99 on the free-drawn clocks, 0.84 to 0.85 using the pre-drawn contours, and 0.63 to 0.74 for the examiner clocks, demonstrating “excellent” inter-rater reliability.

South et al. (2001) compared the psychometrics of 3 different scoring methods of the CDT (Libon revised system; Rouleau rating scale; and Freedman scoring system) in a sample of 20 patients with stroke. Inter-rater and intra-rater reliability were measured using intraclass correlation coefficients (ICC). Raters used comparable criteria for each score, demonstrating “excellent” inter-rater reliability, and used similar scoring criteria throughout, demonstrating “excellent” intra-rater reliability. South et al. (2001) concluded that while the Libon scoring system demonstrated a range of reliabilities across different domains, the Rouleau and Freedman systems were in the excellent range.

Validity

In a review, Shulman (2000) reported that most studies achieved sensitivities and specificities of approximately 85% and concluded that the CDT, in conjunction with other widely used tests such as the Mini-Mental State Examination (MMSE), could provide a significant advance in the early detection of dementia. In contrast, Powlishta et al. (2002) concluded from their study that the CDT did not appear to be a useful screening tool for detecting very mild dementia. Other authors have concluded that the CDT should not be used alone as a dementia screening test because of its overall inadequate performance (Borson & Brush, 2002; Storey et al., 2001). However, most of the previous studies were based on relatively small sample sizes or were undertaken in a clinical setting, and their results may not be applicable to a larger community population.

Nishiwaki et al. (2004) studied the validity of the CDT in comparison to the MMSE in a large general elderly population (aged 75 years or older). The sensitivity and specificity of the CDT for detecting moderate-to-severe cognitive impairment (MMSE score ≤ 17) were 77% and 87%, respectively, with nurse administration, and 40% and 91%, respectively, with postal administration. The authors conclude that the CDT may have value as a brief face-to-face screening tool for moderate/severe cognitive impairment in an older community population but is relatively poor at detecting milder cognitive impairment.

Few studies have examined the validity of the CDT specifically in patients with stroke. Adunsky et al. (2002) compared the CDT with the MMSE and cognitive Functional Independence Measure (FIM) (cognitive tests used for the evaluation of functional outcomes at discharge in elderly patients with stroke). The tests were administered to 151 patients admitted for inpatient rehabilitation following an acute stroke. Correlation coefficients (Pearson correlation) between the three cognitive tests resulted in r-values ranging from 0.51 to 0.59. Adunsky et al. (2002) concluded that they share a reasonable degree of resemblance to each other, accounting for “adequate” concurrent validity of these tests.

Bailey, Riddoch, and Crome (2000) evaluated a test battery for hemineglect in elderly patients with stroke and determined that the CDT had questionable validity in the assessment of representational neglect. Further, consistent with previous findings (Ishiai et al., 1993; Kaplan et al., 1991), the utility of the CDT as a screening measure for neglect was not supported by these results. Reasons include the subjectivity in scoring, and questionable validity in that the task may also reflect cognitive impairment (Friedman, 1991), constructional apraxia, or impaired planning ability (Kinsella, Packer, Ng, Olver, & Stark, 1995).

Responsiveness

Not applicable.

References

  • Adunsky, A., Fleissig, Y., Levenkrohn, S., Arad, M., & Nov, S. (2002). Clock drawing task, mini-mental state examination and cognitive-functional independence measure: Relation to functional outcome of stroke patients. Arch Gerontol Geriatr, 35(2), 153-160.
  • Bailey, M. J., Riddoch, J., & Crome, P. (2000). Evaluation of a test battery for hemineglect in elderly stroke patients for use by therapists in clinical practice. NeuroRehabilitation, 14(3), 139-150.
  • Borson, S., Brush, M., Gil, E., Scanlan, J., Vitaliano, P., Chen, J., Cashman, J., Sta Maria, M. M., Barnhart, R., & Roques, J. (1999). The Clock Drawing Test: Utility for dementia detection in multiethnic elders. J Gerontol A Biol Sci Med Sci, 54, M534-M540.
  • Dastoor, D. P., Schwartz, G., & Kurzman, D. (1991). Clock-drawing: An assessment technique in dementia. Journal of Clinical and Experimental Gerontology, 13, 69-85.
  • Freedman, M., Leach, L., Kaplan, E., Winocur, G., Shulman, K. I., & Delis, D. C. (1994). Clock Drawing: A Neuropsychological Analysis. New York: Oxford University Press.
  • Friedman, P. J. (1991). Clock drawing in acute stroke. Age and Ageing, 20(2), 140-145.
  • Heinik, J., Vainer-Benaiah, Z., Lahav, D., & Drummer, D. (1997). Clock drawing test in elderly schizophrenia patients. International Journal of Geriatric Psychiatry, 12, 653-655.
  • Heinik, J., Solomesh, I., & Berkman, P. (2004). Correlation between the CAMCOG, the MMSE and three clock drawing tests in a specialized outpatient psychogeriatric service. Arch Gerontol Geriatr, 38, 77-84.
  • Heinik, J., Solomesh, I., Lin, R., Raikher, B., Goldray, D., Merdler, C., & Kemelman, P. (2004). Clock drawing test-modified and integrated approach (CDT-MIA): Description and preliminary examination of its validity and reliability in dementia patients referred to a specialized psychogeriatric setting. J Geriatr Psychiatry Neurol, 17, 73-80.
  • Ishiai, S., Sugishita, M., Ichikawa, T., Gono, S., & Watabiki, S. (1993). Clock drawing test and unilateral spatial neglect. Neurology, 43, 106-110.
  • Kaplan, E. (1988). A process approach to neuropsychological assessment. In T. Boll & B. K. Bryant (Eds.), Clinical neuropsychology and brain function: Research, measurement, and practice (pp. 129-167). Washington, DC: American Psychological Association.
  • Kaplan, R. F., Verfaellie, M., Meadows, M., Caplan, L. R., Pessin, M. S., & DeWitt, L. (1991). Changing attentional demands in left hemispatial neglect. Archives of Neurology, 48, 1263-1267.
  • Kinsella, G., Packer, S., Ng, K., Olver, J., & Stark, R. (1995). Continuing issues in the assessment of neglect. Neuropsychological Rehabilitation, 5, 239-258.
  • Lee, H., & Lawlor, B. A. (1995). State-dependent nature of the Clock Drawing Task in geriatric depression. Journal of the American Geriatrics Society, 43, 796-798.
  • Lorentz, W. J., Scanlan, J. M., & Borson, S. (2002). Brief screening tests for dementia. Can J Psychiatry, 47, 723-733.
  • Manos, P. J., & Wu, R. (1994). The Ten Point Clock Test: A quick screen and grading system for cognitive impairment in medical and surgical patients. International Journal of Psychiatry in Medicine, 24, 229-244.
  • McDowell, I., & Newell, C. (1996). Measuring Health: A Guide to Rating Scales and Questionnaires (2nd ed.). New York: Oxford University Press.
  • Mendez, M. F., Ala, T., & Underwood, K. L. (1992). Development of scoring criteria for the clock drawing task in Alzheimer's disease. Journal of the American Geriatrics Society, 40, 1095-1099.
  • Nishiwaki, Y., Breeze, E., Smeeth, L., Bulpitt, C. J., Peters, R., & Fletcher, A. E. (2004). Validity of the clock-drawing test as a screening tool for cognitive impairment in the elderly. American Journal of Epidemiology, 160(8), 797-807.
  • Paganini-Hill, A., Clark, L. J., Henderson, V. W., & Birge, S. J. (2001). Clock drawing: Analysis in a retirement community. J Am Geriatr Soc, 49, 941-947.
  • Powlishta, K. K., von Dras, D. D., Stanford, A., Carr, D. B., Tsering, C., Miller, J. P., & Morris, J. C. (2002). The Clock Drawing Test is a poor screen for very mild dementia. Neurology, 59, 898-903.
  • Richardson, H. E., & Glass, J. N. (2002). A comparison of scoring protocols on the clock drawing test in relation to ease of use, diagnostic group and correlations with mini-mental state examination. Journal of the American Geriatrics Society, 50, 169-173.
  • Rouleau, I., Salmon, D. P., Butters, N., Kennedy, C., & McGuire, K. (1992). Quantitative and qualitative analyses of clock drawings in Alzheimer's and Huntington's disease. Brain and Cognition, 18, 70-87.
  • Royall, D. R., Cordes, J. A., & Polk, M. (1998). CLOX: An executive clock drawing task. J Neurol Neurosurg Psychiatry, 64, 588-594.
  • Ruchinskas, R. A., & Curyto, K. J. (2003). Cognitive screening in geriatric rehabilitation. Rehabil Psychol, 48, 14-22.
  • Scanlan, J. M., Brush, M., Quijano, C., & Borson, S. (2002). Comparing clock tests for dementia screening: Naïve judgments vs formal systems – what is optimal? International Journal of Geriatric Psychiatry, 17(1), 14-21.
  • Shah, J. (2001). Only time will tell: Clock drawing as an early indicator of neurological dysfunction. P&S Medical Review, 7(2), 30-34.
  • Shulman, K. I. (2000). Clock-drawing: Is it the ideal cognitive screening test? International Journal of Geriatric Psychiatry, 15, 548-561.
  • Shulman, K. I., Gold, D. P., Cohen, C. A., & Zucchero, C. A. (1993). Clock-drawing and dementia in the community: A longitudinal study. International Journal of Geriatric Psychiatry, 8(6), 487-496.
  • Shulman, K., Shedletsky, R., & Silver, I. (1986). The challenge of time: Clock-drawing and cognitive function in the elderly. International Journal of Geriatric Psychiatry, 1, 135-140.
  • South, M. B., Greve, K. W., Bianchini, K. J., & Adams, D. (2001). Inter-rater reliability of three Clock Drawing Test scoring systems. Applied Neuropsychology, 8(3), 174-179.
  • Spreen, O., & Strauss, E. A. (1991). A compendium of neuropsychological tests: Administration, norms, and commentary. New York: Oxford University Press.
  • Storey, J. E., Rowland, J. T., Basic, D., & Conforti, D. A. (2001). A comparison of five clock scoring methods using ROC (receiver operating characteristic) curve analysis. Int J Geriatr Psychiatry, 16, 394-399.
  • Suhr, J., Grace, J., Allen, J., Nadler, J., & McKenna, M. (1998). Quantitative and qualitative performance of stroke versus normal elderly on six clock drawing systems. Archives of Clinical Neuropsychology, 13(6), 495-502.
  • Sunderland, T., Hill, J. L., Mellow, A. M., Lawlor, B. A., Gundersheimer, J., Newhouse, P. A., & Grafman, J. H. (1989). Clock drawing in Alzheimer's disease: A novel measure of dementia severity. J Am Geriatr Soc, 37(8), 725-729.
  • Tracy, J. I., De Leon, J., Doonan, R., Musciente, J., Ballas, T., & Josiassen, R. C. (1996). Clock drawing in schizophrenia. Psychological Reports, 79, 923-928.
  • Tuokko, H., Hadjistavropoulos, T., Miller, J. A., & Beattie, B. L. (1992). The Clock Test: A sensitive measure to differentiate normal elderly from those with Alzheimer disease. Journal of the American Geriatrics Society, 40, 579-584.
  • Wagner, M. T., Nayak, M., & Fink, C. (1995). Bedside screening of neurocognitive function. In L. A. Cushman & M. J. Scherer (Eds.), Psychological assessment in medical rehabilitation: Measurement and instrumentation in psychology (pp. 145-198). Washington, DC: American Psychological Association.
  • Watson, Y. I., Arfken, C. L., & Birge, S. J. (1993). Clock completion: An objective screening test for dementia. J Am Geriatr Soc, 41(11), 1235-1240.
  • Wolf-Klein, G. P., Silverstone, F. A., Levy, A. P., & Brod, M. S. (1989). Screening for Alzheimer's disease by clock drawing. Journal of the American Geriatrics Society, 37, 730-734.

See the measure

A pre-drawn circle that can be used when administering the CDT is available for download.

Color Trails Test (CTT)

Evidence Reviewed as of before: 08-11-2012
Author(s)*: Lisa Zeltzer, MSc OT; Valerie Poulin, OT, PhD candidate
Editor(s): Nicol Korner-Bitensky, PhD OT; Annabel McDermott, BOccThy

Purpose

The Color Trails Test (CTT) is a language-free version of the Trail Making Test (TMT) that was developed to allow for broader cross-cultural assessment of sustained attention and divided attention in adults.

In-Depth Review

Purpose of the measure

The Color Trails Test (CTT) (Maj, D’Elia, Satz, Janssen, Zaudig, Uchiyama et al., 1993; D’Elia, Satz, Uchiyama & White, 1996) is a language-free version of the Trail Making Test (TMT) that was developed to allow for broader cross-cultural application to measure sustained attention and divided attention in adults.

Available versions

There are 4 versions of the CTT (forms A, B, C, and D) but only the first version (form A) has normative data and is the only version that should be used in a clinical setting. Versions B-D are experimental and should be used in research only (Mitrushina, Boone, Razzani, & D’Elia, 2005).

Features of the measure

Items:

The CTT is comprised of two tasks:

  • CTT1: Must be administered first and requires the respondent to connect circles in an ascending numbered sequence (1-25).
  • CTT2: Must follow the CTT1 and requires the respondent to connect numbers in an ascending sequence while alternating between pink and yellow colors. Numbers are presented twice, once in pink and once in yellow, so the client must ignore the distracter item (e.g. start at pink 1, avoid pink 2 to select yellow 2, avoid yellow 3 to select pink 3, etc.).

Untimed practice trials are completed for both the CTT1 and CTT2 to ensure that the client understands the task.
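
The CTT2 alternation rule lends itself to a compact description. The sketch below generates the expected colour-number sequence and flags the first departure from it; the representation is ours and is not part of the published test materials:

    def expected_ctt2_sequence(n=25):
        """Expected CTT2 trail: numbers ascend while colours alternate,
        beginning with pink 1 (our encoding of the rule above)."""
        return [(k, "pink" if k % 2 == 1 else "yellow")
                for k in range(1, n + 1)]

    def first_error(responses):
        """Index of the first departure from the expected sequence,
        or None if the trail was completed correctly."""
        for i, (resp, exp) in enumerate(zip(responses,
                                            expected_ctt2_sequence())):
            if resp != exp:
                return i
        return None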

Scoring and score interpretation:

Time taken to complete each part of the CTT is recorded in seconds and is compared to normative data. Qualitative aspects of the performance that may be indicative of brain dysfunction (e.g. near misses, prompts required, sequencing errors for colour and number) are also recorded.
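
Comparison with normative data typically amounts to converting the raw completion time into a standardized score. A minimal sketch, assuming a normative mean and standard deviation are taken from the CTT manual (the norms themselves are not reproduced here):

    def ctt_z_score(time_seconds, norm_mean, norm_sd):
        """Standardize a completion time against normative data; positive
        values indicate slower-than-average performance."""
        return (time_seconds - norm_mean) / norm_sd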

Time:

The CTT manual reports that the CTT takes 3 to 8 minutes to complete. A task is discontinued if the client takes longer than 240 seconds to complete it.

Equipment:

  • Table and chair
  • Test
  • Pencil
  • Stopwatch

Training requirements:

The CTT has a level “C” qualification, meaning that it must be administered by an experienced professional.

Alternative Forms of the Colour Trails Test

  • Trail Making Test (TMT)
  • Comprehensive Trail Making Test (Reynolds, 2002)
  • Delis-Kaplan Executive Function Scale (D-KEFS): includes subtests modeled after the TMT
  • Oral TMT: an alternative for patients with motor deficits or visual impairments (Ricker & Axelrod, 1994).
  • Repeat testing TMT: alternate forms have been developed for repeat testing purposes (Franzen et al., 1996; Lewis & Rennick, 1979)

Client suitability

Can be used with:

  • Individuals with stroke
  • Clients 18-89 years old
  • Individuals who are colourblind
  • The CTT requires relatively intact motor abilities (i.e. ability to hold and manoeuvre a pen or pencil, ability to move the upper extremity). The Oral TMT may be more appropriate if the examiner considers that the participant’s motor ability may impact his/her performance.
  • Clients must be able to understand Arabic numbers and numerical sequence.

Should not be used with:

  • Clients with motor or coordination impairments (e.g. apraxia). If motor ability may impact performance, consider using the Oral TMT.
  • Should be used with caution in older adults with low education. Age and education have been reported to influence response times on both parts of the CTT, such that older individuals with low education levels have demonstrated significantly slower response times (D'Elia et al., 1996; Messinis, Malegiannaki, Christodoulou, Panagiotopoulos, & Papathanasopoulos, 2011).

In what languages is the measure available?

This is a language-free measure; however, cultural norms have been published for the following populations:

  • Adult Greek population with stroke (Messinis et al., 2011)
  • Turkish population with schizophrenia (Güleç, Kavakçı, Güleç, & Küçükalioğlu, 2006)
  • Healthy Turkish population (Dugbartey, Townes & Mahurin, 2000)
  • Healthy Spanish population (LaRue, Romero, Ortiz, Chi Liang, & Lindeman, 1999)
  • Healthy Brazilian sample (Sant’Ana Rabelo, Pacanaro, Rossetti, Almeida de Sa Leme, de Castro, Guntert, et al., 2010)
  • Healthy sample from China (Hsieh & Riley, 1997)
  • Healthy sample from Hong Kong (Lee & Chan, 2000).

Summary

What does the tool measure? Language-free measure of sustained and divided attention.
What types of clients can the tool be used for? The CTT can be used with, but is not limited to, patients with stroke.
Is this a screening or assessment tool? Assessment tool
Time to administer The CTT takes approximately 3 to 8 minutes to administer.
Versions

  • Trail Making Test (TMT)
  • Comprehensive TMT
  • Oral TMT
  • Repeat testing TMT (developed for repeat testing purposes)
  • Symbol TMT
  • Delis-Kaplan Executive Function Scale (D-KEFS)
Other Languages Language-free measure but norms established for Greek, Turkish, Chinese, Brazilian, and Spanish populations
Measurement Properties
Reliability Internal consistency:
No studies have examined internal consistency of the CTT in patients with stroke.

Test-retest:
No studies have examined test-retest reliability of the CTT in a stroke population but the authors of the measure report excellent test-retest reliability for CTT2 and adequate test-retest reliability for CTT1 in a healthy sample.

Inter-rater:
No studies have examined inter-rater reliability of the CTT in patients with stroke.

Validity Content:
No studies have examined content validity of the CTT in patients with stroke.

Criterion:
Concurrent:
One study reported excellent correlations between the CTT1 and CTT2 and the TMT-A and TMT-B respectively.

Predictive:
Two studies reported that the CTT1 predicted on-road driving test outcome in samples that included clients with stroke.

Construct:
Convergent:
One study reported adequate to excellent correlations between the CTT and the Useful Field of View (UFOV) subtests.

Known groups:
One study reported significant differences in time to complete the CTT between patients with stroke and healthy adults.

Floor/Ceiling Effects No studies have examined floor/ceiling effects of the CTT in patients with stroke.
Does the tool detect change in patients? The responsiveness of the CTT has not been formally studied; however, it has been used to detect change in a clinical trial involving 2 participants with stroke.
Acceptability The CTT is simple and easy to administer and is language-free.
Feasibility The CTT is relatively inexpensive and highly portable. The CTT must be purchased and should be administered by an experienced professional.
How to obtain the tool?

The CTT can be purchased from: Psychological Assessment Resources (http://www4.parinc.com/Products/Product.aspx?ProductID=CTT)

* The CTT was initially developed for a traumatic brain injury population; the tool’s psychometric properties with this population are described in its administration guide.

Psychometric Properties

Overview

We conducted a literature search to identify all relevant publications on the psychometric properties of the CTT in individuals with stroke. We identified 4 studies.

Floor/Ceiling Effects

No studies have reported on floor/ceiling effects of the CTT when used with an adult stroke population.

Reliability

Internal consistency:
No studies have reported on internal consistency of the CTT when used with an adult stroke population.

Test-retest:
D’Elia et al. (1996) examined the test-retest reliability of the CTT in 27 healthy individuals. The CTT was administered twice, two weeks apart. Excellent test-retest reliability was reported for the CTT2 (r=0.79), and adequate test-retest reliability was reported for the CTT1 (r=0.64).

Inter-rater:
No studies have reported on inter-rater reliability of the CTT when used with an adult stroke population.

Validity

Content:

No studies have reported on content validity of the CTT when used with an adult stroke population.

Criterion:

Concurrent:
Elkin-Frankston, Lebowitz, Kapust, Hollis, & O’Connor (2007) examined the concurrent validity of the CTT with the TMT in 29 individuals with various medical conditions including stroke (n=8). Completion times on the CTT and TMT were highly correlated (CTT1 vs. TMT-A: r=0.91; CTT2 vs. TMT-B: r=0.72) suggesting excellent concurrent validity with the original TMT.

Predictive:
Elkin-Frankston et al. (2007) examined the ability of the CTT to predict on-road driving test failure in 29 individuals with various medical conditions including stroke (n=8). Patients who failed an on-road driver evaluation completed the CTT1 significantly more slowly than those who passed (Cohen’s d=0.66, p<0.05). This relationship was also found for the CTT2 but did not reach statistical significance.

Hartman-Maeir et al. (2008) examined predictive validity of the CTT in a sample of 30 individuals with acquired brain injury including stroke (n=17) wishing to obtain a driver’s licence. There was a significant difference in time taken to complete the CTT1 between those who passed and those who failed the on-road test (Cohen’s d = 0.67, p=0.02). A performance time of <60 seconds on the CTT1 was found to predict passing the on-road evaluation, whereas >60 seconds was predictive of failing.
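
For illustration, this cut-off can be expressed as a simple decision rule. The Python sketch below is ours, not part of any published protocol; only the 60-second threshold comes from the study.

```python
# Minimal sketch of the CTT1 driving cut-off reported by Hartman-Maeir et al. (2008).
# The function name is illustrative; only the 60-second threshold is from the study.
def ctt1_predicts_on_road_pass(completion_time_seconds: float) -> bool:
    """Return True if the CTT1 performance time predicts passing the on-road evaluation."""
    return completion_time_seconds < 60
```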

Construct:

Convergent/Discriminant:
Hartman-Maeir, Erez, Ratzon, Mattatia and Weiss (2008) examined convergent validity of the CTT in a sample of 30 individuals with acquired brain injury (including stroke, n=17) wishing to obtain a driver’s licence, using Spearman correlation coefficients. The CTT1 and CTT2 showed adequate to excellent correlations with Useful Field of View (UFOV) subtests of processing speed (CTT1 r=0.407; CTT2 not significant), divided attention (r=0.457, 0.486 respectively) and selective attention (r=0.602, 0.629 respectively). Results support validity of the CTT as a pre-driving assessment tool.

Known groups:
Messinis, Malegiannaki, Christodoulou, Panagiotopoulos, and Papathanasopoulos (2011) examined known groups validity of the CTT with 25 clients who had recently experienced a stroke and 26 healthy participants matched for age, educational level and gender (Greek population). Clients in the stroke group required significantly more time to complete the CTT1 and CTT2 than the healthy controls (p < 0.001).

Responsiveness

Liu, Chan, Lee, and Hui-Chan (2004) used the CTT to evaluate the effectiveness of mental imagery in clients with stroke (n=2). In this study, the CTT detected change in both clients, with reduced times to complete the CTT1 and CTT2 post-intervention.

Sensitivity/ Specificity

No studies have reported on sensitivity/specificity of the CTT when used with an adult stroke population.

References

  • Barncord, S. W. & Wanlass, R. L. (2001). The Symbol Trail Making Test: test development and utility as a measure of cognitive impairment. Applied Neuropsychology, 8, 99-103.
  • D’Elia, L. F., Satz, P., Uchiyama, C.L., & White, T. (1996). Color Trails Test. Odessa, FL: PAR.
  • Dugbartey, A. T., Townes, B. D., & Mahurin, R. K. (2000). Equivalence of the Color Trail Making Test in nonnative English-speakers. Archives of Clinical Neuropsychology, 15, 425-31.
  • Elkin-Frankston, S., Lebowitz, B. K., Kapust, L. R., Hollis, A.M., & O’Connor, M.G. (2007). The use of the Colour Trails Test in the assessment of driver competence: preliminary reports of a culture-fair instrument. Archives of Clinical Neuropsychology, 22(5), 631-5.
  • Franzen, M., Paul, D., & Iverson, G. L. (1996). Reliability of alternate forms of the trail making test. The Clinical Neuropsychologist, 10(2), 125-9.
  • Güleç, H., Kavakçı, O., Güleç, M. Y., & Küçükalioğlu, C. I. (2006). The reliability and validity of the Turkish Color Trails Test in evaluating frontal assessment among Turkish patients with schizophrenia. Düşünen Adam, 19(4), 180-5.
  • Hartman-Maeir, A., Erez, A. B., Ratzon, N., Mattatia, T., & Weiss, P. (2008). The validity of the Color Trails Test in the pre-driver assessment of individuals with acquired brain injury. Brain Injury, 22, 994-1008.
  • Hsieh, S. & Riley, N. (1997, November). Neuropsychological performance in the People’s Republic of China: Age and educational norms for four attentional tasks. Presented at the National Academy of Neuropsychology, Las Vegas, Nevada. In Mitrushina, M., Boone, K., & D’Elia, L. Handbook of Normative Data for Neuropsychological Assessment (pp. 70-73). New York, NY: Oxford University Press.
  • LaRue, A., Romero, L., Ortiz, I., Liang, H.C., & Lindeman, R. D. (1999). Neuropsychological performance of Hispanic and non-Hispanic older adults: an epidemiologic survey. Clinical Neuropsychologist, 13, 474-86.
  • Lee, T. M. & Chan, C. C. (2000). Are Trail Making and Color Trails Tests of equivalent constructs? Journal of Clinical and Experimental Neuropsychology, 22, 529-34.
  • Lewis, R. F. & Rennick, P. M. (1979). Manual for the repeatable Cognitive-Perceptual-Motor Battery. Grosse Point Park, MI: Axon Publishing Company.
  • Liu, K. P., Chan, C. C., Lee, T. M., & Hui-Chan, C.W. (2004). Mental imagery for relearning of people after brain injury. Brain Injury, 18(11), 1163-72.
  • Maj, M., D’Elia, L. F., Satz, P., Janssen, R., Zaudig, M., Uchiyama, C., Starace, F., Galderisi, S., & Chervinsky, A. (1993). Evaluation of two new neuropsychological tests designed to minimize cultural bias in the assessment of HIV-1 seropositive persons: a WHO study. Archives of Clinical Neuropsychology, 8, 123-35.
  • Messinis, L., Malegiannaki, A. C., Christodoulou, T., Panagiotopoulos, V., & Papathanasopoulos, P. (2011). Color Trails Test: normative data and criterion validity for the Greek adult population. Archives of Clinical Neuropsychology, 26(4), 322-30.
  • Mitrushina, M., Boone, K. B., Razzani J., & D’Elia, L. F. (2005). Handbook of normative data for neuropsychological assessment. (2nd ed.). New York: Oxford University Press.
  • Reynolds, C. (2002). Comprehensive Trail Making Test. Austin, TX: Pro-Ed.
  • Ricker, J.H. & Axelrod, B. N. (1994). Analysis of an oral paradigm for the Trail Making Test. Assessment, 1, 47-51.
  • Sant’Ana Rabelo, I., Pacanaro, S.V., de Oliveira Rossetti, M., de Sa Leme, I.F., de Castro, N.R., Guntert, C. M., Correa Miotto, E., & Souza de Lucia, M. C. (2010). Color Trails Test: a Brazilian normative sample. Psychology and Neuroscience, 3, 93-9.

See the measure

How to obtain the CTT

The CTT can be purchased from Psychological Assessment Resources (http://www4.parinc.com/Products/Product.aspx?ProductID=CTT)


DOC Screen

Evidence Reviewed as of before: 30-04-2019
Author(s)*: Alexandra Matteau
Editor(s): Annabel McDermott
Content consistency: Gabriel Plumier

Purpose

The DOC screen is a screening tool that can be used to identify individuals at high risk of depression, obstructive sleep apnea and cognitive impairment following a stroke.

In-Depth Review

Purpose of the measure

The DOC screen is a screening tool that identifies individuals at high risk of depression, obstructive sleep apnea and cognitive impairment following a stroke.

Available versions

The DOC screen was developed by Swartz et al. and was first published in 2013. The tool was developed by combining and modifying three existing validated brief screens: the 2-item Patient Health Questionnaire (PHQ-2), the STOP questionnaire and a 10-point version of the Montreal Cognitive Assessment (MoCA).

Features of the measure

Items:

The DOC screen comprises three screening tests:

DOC – Mood (PHQ-2)

This test comprises two items with the purpose of screening for depression. The test evaluates the degree to which an individual has experienced depressed mood and anhedonia over the past two weeks.

DOC – Apnea (STOP Questionnaire)

This test comprises four items with the purpose of screening for obstructive sleep apnea: snoring, tiredness during daytime, breathing interruption during sleep, and hypertension.

DOC – Cog (10-point version of the MoCA)

This test comprises three tasks with the purpose of screening for cognitive impairment: clock drawing, abstraction, and 5-word recall (memory).

Scoring:

Each subscale has different scoring and is interpreted independently.

DOC – Mood (total score 0-6)

The two items are scored from 0-3 whereby the respondent is asked to rate how often each symptom occurred over the last 2 weeks:

  • 0 = not at all
  • 1 = several days
  • 2 = more than half of the days
  • 3 = nearly every day.

DOC – Apnea (total score 0-4)

The four items are scored on a dichotomous scale (0 = no, 1 = yes) according to whether or not the respondent experiences each symptom.

DOC – Cog (total score 0-10)

  • Clock drawing task (0-3 points): 1 point each is given for (i) contour, (ii) numbers and (iii) the hands of the clock.
  • Abstraction task (0-2 points): 1 point is given for each item pair correctly answered.
  • Delayed recall task (0-5 points): 1 point is given for each word recalled without any cues.

The score for each task is summed to calculate the subscale score.

The three subscale scores are then summed to obtain a total score ranging from 0 to 20 (see the sketch below).

A raw score interpretation and a regression interpretation can be obtained at http://www.docscreen.ca/.
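
As a worked illustration of the scoring rules above, the following Python sketch sums the three subscales and the total. All function and variable names are ours; clinical interpretation should rely on the tools at http://www.docscreen.ca/.

```python
# Illustrative sketch of DOC screen scoring as described above; names are ours.
def doc_screen_scores(mood_items, apnea_items, clock, abstraction, recall):
    """mood_items: two PHQ-2 ratings (0-3 each); apnea_items: four STOP answers (0/1);
    clock: 0-3; abstraction: 0-2; recall: 0-5."""
    doc_mood = sum(mood_items)              # DOC-Mood, 0-6
    doc_apnea = sum(apnea_items)            # DOC-Apnea, 0-4
    doc_cog = clock + abstraction + recall  # DOC-Cog, 0-10
    return {"DOC-Mood": doc_mood, "DOC-Apnea": doc_apnea,
            "DOC-Cog": doc_cog, "Total": doc_mood + doc_apnea + doc_cog}  # total 0-20

# Example: mood 2+1=3, apnea 1+0+1+0=2, cog 3+1+4=8, total 13
print(doc_screen_scores([2, 1], [1, 0, 1, 0], 3, 1, 4))
```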

Time:

The DOC screen takes approximately 5 minutes to complete.

Subscales:

The DOC screen is comprised of three subscales: DOC Mood, DOC Apnea and DOC Cog.

Equipment:

A pencil and the test form are needed to complete the DOC screen.

Training:

No training requirements have been reported. The DOC screen can be administered by any individual who is able to correctly follow the instructions, but must be interpreted by a qualified health professional.

Alternative forms of the DOC Screen:

An alternative version is available and uses different words for the memory and abstraction tasks. This version must be used if the patient has previously been exposed to the MoCA or DOC screen to minimize any learning effects associated with repeated administration.

The E-DOC screen is an electronic version of the tool, which is available through the DOC screen website. The E-DOC screen has not been validated.

Client suitability

Can be used with:

  • Patients with stroke.
  • The DOC screen may also be suitable for use among patients with other neurological and vascular disorders such as multiple sclerosis, Alzheimer’s disease, mild cognitive impairment, Parkinson’s Disease and traumatic brain injury. However, no study has been conducted with this population.

Should not be used with:

While no contraindications have been reported, some considerations must be made when completing the test:

  • A translator, family member or caregiver can provide translation for patients who do not speak English fluently;
  • Provide visual aid (e.g. glasses) for patients with visual loss;
  • Speak loudly and clearly for patients with reduced hearing;
  • Motor tasks such as the clock drawing activity may be difficult for patients with motor impairments – use sound clinical judgement for this task;
  • Use alternative communication strategies for patients with aphasia.

In what languages is the measure available?

English

Summary

What does the tool measure? Depression, obstructive sleep apnea and cognitive impairment following stroke.
What types of clients can the tool be used for? Patients with stroke.
Is this a screening or assessment tool? Screening.
Time to administer Five minutes.
Versions

  • DOC screen
  • E-DOC screen
  • A second version is available to minimize learning effects associated with repeated administration.
Languages The DOC screen is only available in English.
Measurement Properties
Reliability Internal consistency:
No studies have examined internal consistency of the DOC screen.

Test-retest:
No studies have examined test-retest reliability of the DOC screen.

Intra-rater:
No studies have examined intra-rater reliability of the DOC screen.

Inter-rater:
No studies have examined inter-rater reliability of the DOC screen.

Validity Criterion:
Concurrent:
No studies have examined concurrent validity of the DOC screen.

Predictive:
No studies have examined predictive validity of the DOC screen.

Construct:
Convergent/Discriminant:
No studies have examined convergent validity of the DOC screen.

Known groups:
No studies have examined known groups validity. However, one study examined the sensitivity and specificity and reported that the DOC screen is a valid measure that can reliably identify patients at high risk of depression, obstructive sleep apnea and cognitive impairment.

Floor/Ceiling Effects No studies have examined the floor or ceiling effects of the DOC screen.
Does the tool detect change in patients? Not reported.
Acceptability The DOC screen is a standardized screening tool suitable for use with stroke patients.
Feasibility The measure is brief, easy to score and requires no formal training. A study of 1,503 patients found that 89% of participants completed the screen in 5 minutes or less.
How to obtain the tool?

The DOC screen is free to use for clinical and educational purposes.

The administration manual and forms are available online from the following website: http://www.docscreen.ca/

Psychometric Properties

Overview

We conducted a literature search to identify all relevant publications on the psychometric properties of the DOC screen in individuals with stroke. We identified only one study, which was published in part by the developers of the measure. More studies are required before definitive conclusions can be drawn regarding the reliability and validity of the DOC screen.

Floor/Ceiling Effects

No studies have examined the floor or ceiling effects of the DOC screen.

Reliability

Internal consistency:
No studies have examined the internal consistency of the DOC screen.

Test-retest:
No studies have examined the test-retest reliability of the DOC screen.

Inter-rater:
No studies have examined the inter-rater reliability of the DOC screen.

Intra-rater:
No studies have examined the intra-rater reliability of the DOC screen.

Validity

Criterion:

Concurrent:
No studies have examined the concurrent validity of the DOC screen.

Predictive:
No studies have examined the predictive validity of the DOC screen.

Construct:

Convergent/Discriminant:
No studies have examined the convergent validity of the DOC screen.

Known groups:
No studies have examined the known groups validity of the DOC screen.

Responsiveness

No studies have examined the responsiveness of the DOC screen.

Sensitivity and Specificity:

Swartz et al. (2017) examined the sensitivity and specificity of the DOC screen for detecting depression, obstructive sleep apnea and cognitive impairment using receiver operating characteristic (ROC) curves, area under the curve (AUC) analyses and a two cut-point approach. DOC-Mood was compared with the Structured Clinical Interview for DSM Disorders (SCID-D) and showed excellent sensitivity (92%) and specificity (99%) for detecting depression (AUC=0.898). DOC-Apnea was compared with results on polysomnography (PSG) and showed excellent sensitivity (95%) and specificity (96%) for detecting obstructive sleep apnea (AUC=0.660). DOC-Cog was compared with the 30-minute neuropsychological test protocol proposed by Hachinski et al. (2006) and showed excellent sensitivity (100%) and specificity (95%) for detecting cognitive impairment (AUC=0.776).

References

  • Hachinski, V., Iadecola, C., Petersen, R. C., Breteler, M. M., Nyenhuis, D. L., Black, S. E., … & Vinters, H. V. (2006). National Institute of Neurological Disorders and Stroke–Canadian stroke network vascular cognitive impairment harmonization standards. Stroke, 37 (9), 2220-2241.
  • Swartz, R. H., Cayley, M. L., Lanctôt, K. L., Murray, B. J., Cohen, A., Thorpe, K. E., … & Herrmann, N. (2017). The “DOC” screen: Feasible and valid screening for depression, Obstructive Sleep Apnea (OSA) and cognitive impairment in stroke prevention clinics. PLoS ONE, 12(4), e0174451.

See the measure

How to obtain the DOC Screen?

The form and manual of administration are available online from the following website: http://www.docscreen.ca/

The DOC screen is free to use for clinical and educational purposes and therefore no permissions are required.


Executive Function Performance Test (EFPT)

Evidence Reviewed as of before: 25-02-2013
Author(s)*: Valérie Poulin, OT, PhD candidate; Annabel McDermott, OT
Editor(s): Nicol Korner-Bitensky, PhD OT
Content consistency: Gabriel Plumier

Purpose

The Executive Function Performance Test (EFPT) is a performance-based assessment of executive function through observation of four Instrumental Activities of Daily Living (I-ADLs).

In-Depth Review

Purpose of the measure

The Executive Function Performance Test (EFPT) is a performance-based standardized assessment of cognitive function using Instrumental Activities of Daily Living (I-ADLs). The EFPT adopts a top-down approach and is performed in an environmental (real-world) context. The EFPT is used to identify an individual’s: (a) impaired executive functions; (b) capacity for independent functioning; and (c) required amount of assistance for task completion (Baum, 2011).

Available versions

The EFPT was developed by Baum, Morrison, Hahn & Edwards (2003) at the Program in Occupational Therapy at Washington University Medical School.

Features of the measure

Description of Tasks:

The EFPT assesses performance of four functional tasks, completed in the following order:

  1. Simple cooking (oatmeal preparation)
  2. Telephone use
  3. Medication management
  4. Bill payment

The EFPT assesses the client’s ability to complete three executive function components of the task:

  1. Task initiation
  2. Task execution (comprising organization, sequencing, and judgment and safety)
  3. Task completion

The EFPT uses a standardized cueing system that enables use with individuals of varying ability (Baum, 2011).

Scoring and Score Interpretation:

The examiner observes the client’s executive functioning during task performance and also records level of cueing required to support task performance.

Executive functions

  • Initiation: beginning the task. The individual moves to the materials table to collect items needed for the task
  • Execution: the individual carries out the steps of the task
  • Organization: arrangement of the tools/materials to complete the task. The individual correctly retrieves and uses the items that are necessary for the task
  • Sequencing: execution of steps in an appropriate order. The individual carries out the steps in an appropriate order, attends to each step appropriately, and can switch attention from one step to the next
  • Judgment and safety: avoidance of dangerous situations. The individual exhibits an awareness of safety by actively avoiding or preventing the creation of a situation that would be unsafe.
  • Completion: termination of the task. The individual indicates that he/she is finished or moves away from the area of the last step.

Cueing hierarchy:

Cues required Score
No cues required 0
Indirect verbal guidance 1
Gestural guidance 2
Direct verbal assistance 3
Physical assistance 4
Do for the participant 5

The score is the highest level of cue needed by the client to perform the task.

The EFPT results in three overall scores:

Scores How is it calculated? What is the score range?
1. Executive function component score Sum of the numbers recorded on each of the four tasks for initiation, organization, sequencing, judgment and completion Each EF component can range from 0-5, with a total of all four tasks ranging from 0-20
2. Task score Sum of the five scores for each task Each task can range from 0-25
3. Total score Sum of the performance on all four tasks 0-100

A higher score indicates that the client requires more cueing and demonstrates more difficulties with executive functions.
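
The arithmetic behind these three scores can be sketched as follows. This is a minimal Python illustration; the cue-matrix layout and all names are ours, not the official EFPT forms.

```python
# Illustrative sketch of the EFPT score arithmetic described above.
# cues[task][component] holds the highest cue level (0-5) needed for that
# executive function component of that task.
TASKS = ("cooking", "telephone", "medication", "bill_payment")
COMPONENTS = ("initiation", "organization", "sequencing",
              "judgment_safety", "completion")

def efpt_scores(cues):
    ef_component = {c: sum(cues[t][c] for t in TASKS) for c in COMPONENTS}  # each 0-20
    task_score = {t: sum(cues[t][c] for c in COMPONENTS) for t in TASKS}    # each 0-25
    total = sum(task_score.values())                                        # 0-100
    return ef_component, task_score, total
```

For example, a client who needed direct verbal assistance (cue level 3) on every component of the cooking task and no cues elsewhere would receive a cooking task score of 15 and a total score of 15.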

Time:

The EFPT takes approximately 30 – 45 minutes to complete.

Training requirements:

While there are no specific training requirements the examiner should have experience delivering cues (as per cue guidance sheet – please see training manual: Baum 2011).

Equipment:

Leave all of the items necessary for all of the tasks in a clear storage box on a table (the “materials table”). Put the box on a lower table or stool if the person is in a wheelchair.

  • Hand soap in dispenser (as one would find in a home)
  • Paper Towels (if you use cloth they will need to be washed after each use)
  • Pan (with handle that gets hot and requires a pot holder)
  • Pot holder
  • A pad to put beside the burner to set the pan on when finished (have on the table before they start)
  • A spoon rest
  • Measuring cup (glass) – 1 cup
  • Dry measuring cups
  • Spoon for stirring
  • Rubber spatula
  • Old-fashioned Oats
  • Bowl
  • Spoon for eating
  • Salt shaker
  • Timer – a timer that can be used for 2 minutes
  • Pencil/Paper
  • Phone book
  • Magnifying Glass
  • Medicine bottle with instructions with the person’s name on it – filled with sugar-free candy
  • Medicine bottle with instructions with another person’s name on it filled with sugar-free candy
  • Crackers
  • Claritin (or other over-the-counter version) bottle (non prescription) as a distracter – filled with sugar-free candy
  • Drinking cups
  • Two bills: one cable (due in 30 days), one phone (due immediately) with pre-addressed envelopes mixed with 5 other pieces of mail (letter from credit card company, postcard, flier, letter in a plain white envelope, mail order catalogue) in a Ziploc bag
  • Chequebook with person’s name on the check
  • Balance sheet (i.e. account book) with a balance $5.00 less than the bills total
  • Pen
  • Calculator
  • Other distracter items
  • Tongs
  • Pepper shaker
  • An enlarged direction sheet for the cooking task as on the oatmeal box (they may not be able to read it in small print). EXCEPTION: Say cook for 2 minutes (so there is time for them to use the timer and be cued if necessary.)
  • A stop watch or timer (it is acceptable to use the timer function on a phone)
  • Prepare a response card for the pre-test questions.
  • Put Bills and distracter mail in a gallon plastic bag
  • Put medications in a quart plastic bag

Additional items:

  • Pre-test questions
  • Script
  • Forms B-E
  • Cueing chart
  • Behaviour assessment chart

What to consider before beginning:

The EFPT is a standardized cognitive assessment; testing procedures should be followed precisely in order to maintain test validity. All items must be administered; if a client refuses to perform a task it can be skipped and performed later.

Conversations and verbal feedback are not permitted.

Multiple administrations may result in a learning effect.

Alternative Forms of the measure

There are no other forms of the assessment.

Client suitability

Can be used with:

  • Adolescents, adults and elderly adults.
  • The EFPT is suitable for use with clients with motor impairment. Clients are scored according to the cue level required but are not penalized if they ask for assistance because the impairment necessitates physical assistance (Baum et al., 2008).
  • The EFPT has been tested on populations with stroke (Baum et al., 2008), multiple sclerosis (Goverover et al., 2005) and schizophrenia (Katz et al., 2007).
  • The EFPT has been used with patients with chronic traumatic brain injury (Toglia et al., 2010).

Should not be used with:

  • The EFPT is not suitable for use with individuals with severe cognitive impairment who are not able to follow directions.

Note: Assessors should carefully consider the effect of apraxia and aphasia on performance.

Languages of the measure

The EFPT training manual is available in English. It has been translated and validated in Swedish and Hebrew.

Summary

What does the tool measure? The EFPT examines executive functions in the context of performing a task.
What types of clients can the tool be used for? The EFPT can be used with, but is not limited to, clients with stroke.
Is this a screening or assessment tool? Assessment
What ICF domain is measured? Activity
Time to administer 30-45 minutes
Versions An updated EFPT training manual was published in 2011.
Other Languages The EFPT has been translated and validated in Swedish and Hebrew.
Measurement Properties
Reliability Internal consistency:
One study reported excellent internal consistency for the EFPT total score and adequate to excellent internal consistency for tasks. Correlations between the EFPT total score and executive function components were excellent.

Test-retest:
No studies have reported on test-retest reliability of the EFPT in a stroke population.

Intra-rater:
No studies have reported on the intra-rater reliability of the EFPT in a stroke population.

Inter-rater:
One study reported excellent inter-rater reliability for the EFPT total score and all tasks.

Validity Content:
The EFPT was developed based on Baum & Edwards’ (1993) Kitchen Task Assessment.

Criterion:
Concurrent:
Three studies have examined concurrent validity of the EFPT in patients with acute or chronic stroke and reported an excellent correlation with the Functional Assessment Measure; an adequate to excellent correlation with the Assessment of Motor and Process Skills (AMPS) and the Short Blessed Test; and an adequate correlation with the Functional Independence Measure, the Wechsler Memory Scale-Revised Logical Memory Total Recall and Digit Span Backward subtests, the Animal Naming Test, the Delis-Kaplan Executive Function System (DKEFS) Sorting Test, Verbal Fluency Test and Colour Word Interference Test, and the Trail Making Test Part B.

Predictive:
No studies have reported on the predictive validity of the EFPT in a stroke population.

Construct:
Convergent/Discriminant:
No studies have reported on discriminant validity of the EFPT in a stroke population.

Known Groups:
One study reported that the EFPT was able to discriminate between clients with mild and moderate stroke, and between clients with mild stroke and healthy controls.

Floor/Ceiling Effects No studies have reported on floor or ceiling effects of the EFPT in a stroke population.
Sensitivity/ Specificity No studies have reported on sensitivity or specificity of the EFPT in a stroke population.
Does the tool detect change in patients? No studies have reported on responsiveness of the EFPT in a stroke population.
Acceptability The EFPT is comprised of real world tasks. The tool can be administered to individuals of varying ability due to the flexibility to provide a hierarchy of cues as required.
Feasibility The EFPT can be administered in a home or rehabilitation setting. The tool is simple to administer and guidelines are clearly stipulated in the test manual. The EFPT assesses what an individual is able to do rather than what he/she cannot do.
How to obtain the tool?

The EFPT is free and can be obtained from Carolyn Baum at baumc@wustl.edu, or online through the following websites:

Psychometric Properties

Overview

A literature search was conducted to identify all relevant publications on the psychometric properties of the Executive Function Performance Test (EFPT). While this assessment can be used with various populations, this module addresses the psychometric properties of the measure specifically when used with patients with stroke. Three studies were identified.

Floor/Ceiling Effects

No studies have reported on the floor or ceiling effects of the EFPT in a stroke population.

Reliability

Internal consistency:
Baum et al. (2008) examined internal consistency of the EFPT with a sample of 73 patients with mild to moderate chronic stroke and 22 age- and education-matched healthy controls. Internal consistency, calculated using Cronbach’s alpha, was excellent for the total score (α=0.94) and adequate to excellent for test items (cooking: α=0.86; paying bills: α=0.78; managing medication: α=0.88; telephone use: α=0.77). Correlations between the EFPT total score and executive function components were excellent (initiation: r=0.91; organization: r=0.93; sequencing: r=0.88; safety and judgment: r=0.78; task completion: r=0.89).

Test-retest:
No studies have reported on test-retest reliability of the EFPT in a stroke population.

Intra-rater:
No studies have reported on the intra-rater reliability of the EFPT in a stroke population.

Inter-rater:
Baum et al. (2008) examined inter-rater reliability of the EFPT with three trained raters and 10 participants (5 clients with stroke and 5 healthy controls). Inter-rater reliability, calculated using intra-class correlation coefficients (ICCs) was excellent for the total score (ICC=0.91) and all test items (cooking: ICC=0.94; paying bills: ICC=0.89; managing medication: ICC=0.87; telephone use: ICC=0.79).

Validity

Content:

The EFPT was developed at the Program in Occupational Therapy at Washington University Medical School.

The EFPT was developed based on Baum & Edwards’ (1993) Kitchen Task Assessment.

Criterion:

Concurrent:
Baum et al. (2008) examined concurrent validity of the EFPT by comparison with functional and neuropsychological tests in a sample of 73 patients with mild to moderate chronic stroke, using Pearson correlation coefficients. Functional tests included the Functional Assessment Measure and the Functional Independence Measure (FIM). Neuropsychological tests included the Wechsler Memory Scale-Revised (WMS-R) Logical Memory Total Recall, Digit Span Forward and Digit Span Backward subtests, Animal Naming Test, Short Blessed Test and Trail Making Test. The EFPT showed an excellent correlation with the Functional Assessment Measure (r=-0.68), and adequate correlations with the FIM (r=-0.40), WMS-R Logical Memory Total Recall Test (r=-0.59) and Digit Span Backward (r=-0.49) subtests, Animal Naming Test (r=-0.47), Short Blessed Test (r=0.39) and the Trail Making Test Part B (r=0.39). Correlations with cognitive tests that are not considered to assess executive function were not significant (Trail Making Test Part A, WMS-R Digit Span Forward).

Wolf et al. (2010) examined concurrent validity of the EFPT by comparison with neuropsychological tests in a sample of 20 patients with mild to moderate acute stroke, using Pearson correlation coefficients. The EFPT total score showed adequate correlations with the Short Blessed Test (r=0.548) and the Delis-Kaplan Executive Function System (DKEFS) Sorting Test (r=-0.511), Verbal Fluency Test (r=-0.474) and Colour Word Interference Test (r=-0.566), but not the Trail Making Test. The EFPT Cooking task showed adequate correlations with the DKEFS Sorting (1: r=-0.498; 2: r=-0.587) and Verbal Fluency (r=0.527) Tests, and an excellent correlation with the Short Blessed Test (r=0.710). The EFPT Bill Payment task showed adequate correlations with the DKEFS Sorting, Colour Word Interference and Trail Making Tests (r=-0.484 to -0.594). The EFPT Telephone task showed an adequate correlation with the DKEFS Colour Word Interference Test (r=-0.499). There were no significant correlations between the EFPT Medication Management task and other neuropsychological tests.

Cederfeldt et al. (2011) examined concurrent validity of the EFPT by comparison with the Assessment of Motor and Process Skills (AMPS) in a sample of 23 patients with mild acute stroke, using Spearman’s rank correlation test. The correlation between the EFPT total sum of all tasks and AMPS process skills was excellent (rho=0.61). Correlations between the four EFPT tasks and AMPS process skills were adequate to excellent (rho=0.54 – 0.60).

Predictive:
No studies have reported on the predictive validity of the EFPT in a stroke population.

Construct:

Convergent/Discriminant:
No studies have reported on convergent/discriminant validity of the EFPT in a stroke population.

Known Group:
Baum et al. (2008) examined known group validity of the EFPT with a sample of 73 patients with mild (n=59) to moderate (n=14) chronic stroke and 22 age- and education-matched healthy controls. Stroke severity was classified using the National Institutes of Health Stroke Scale (≤5 = mild stroke, 6-15 = moderate stroke). The EFPT was able to discriminate among groups, with healthy controls achieving a lower (better) total score than clients with mild stroke (p<0.05) and moderate stroke (p<0.0001), and clients with mild stroke achieving a lower score than those with moderate stroke (p<0.0001). Significant differences were seen between healthy controls and clients with mild stroke for Cooking (p=0.008) and Paying Bills (p=0.03). Significant differences were seen between clients with mild and moderate stroke for Paying Bills (p=0.01), Managing Medication (p=0.001) and Telephone Use (p=0.0001). Analysis of test EF components showed significant differences between healthy controls and clients with mild stroke for sequencing (p<0.001) and organization (p<0.04). Significant differences between clients with mild and moderate stroke were seen for organization (p<0.0001), sequencing (p<0.001), safety and judgment (p<0.004) and task completion (p<0.01).

Responsiveness

No studies have examined responsiveness of the EFPT in a sample of patients with stroke.

Sensitivity & Specificity:
No studies have examined sensitivity or specificity of the EFPT in a stroke population.

References

  • Baum, C.M. (2011). Executive Function Performance Test: training manual. St. Louis, MO: Washington University.
  • Baum, C.M. & Edwards, D. (1993). Cognitive performance in senile dementia of the Alzheimer’s type: the Kitchen Task Assessment. The American Journal of Occupational Therapy, 47, 431-6.
  • Baum, C.M., Morrison, T., Hahn, M., & Edwards, D.F. (2003). Test manual: Executive Function Performance Test. St. Louis, MO: Washington University.
  • Baum, C.M., Tabor Connor, L., Morrison, T., Hahn, M., Dromerick, A.W., & Edwards, D.F. (2008). Reliability, validity, and clinical utility of the Executive Function Performance Test: a measure of executive function in a sample of people with stroke. The American Journal of Occupational Therapy, 62(4), 446-455.
  • Cederfeldt, M., Widell, Y., Elgmark Andersson, E., Dahlin-Ivanoff, S., & Gosman-Hedström, G. (2011). Concurrent validity of the Executive Function Performance Test in people with mild stroke. British Journal of Occupational Therapy, 74(9), 443-9.
  • Goverover, Y., Kalmar, J., Gaudino-Goering, E., Shawaryn, M., Moore, N.B., Halper, J., & DeLuca, J. (2005). The relation between subjective and objective measures of everyday life activities in persons with multiple sclerosis. Archives of Physical Medicine and Rehabilitation, 86, 2303-8.
  • Katz, N., Tadmore, I., Felzen, B., & Hartman-Maeir, A. (2007). Validity of the Executive Function Performance Test in individuals with schizophrenia. Occupational Therapy Journal of Research, 27, 1-8.
  • Toglia, J., Johnston, M.V., Goverover, Y., & Dain, B. (2010). A multicontext approach to promoting transfer of strategy use and self regulation after brain injury: an exploratory study. Brain Injury, 24(4), 664-77.
  • Wolf, T.J., Stift, S., Tabor Connor, L., Baum, C., & The Cognitive Rehabilitation Research Group. (2010). Feasibility of using the EFPT to detect executive function deficits at the acute stage of stroke. Work: Journal of Prevention, Assessment & Rehabilitation, 36(4), 405-12.

See the measure

How to obtain the assessment?

The EFPT can be obtained from Carolyn Baum at baumc@wustl.edu, or online through the following websites:


Kettle Test (KT)

Evidence Reviewed as of before: 22-03-2011
Author(s)*: Katie Marvin, MSc, PT Candidate
Editor(s): Nicol Korner-Bitensky, PhD OT; Annabel McDermott, OT

Purpose

The Kettle Test was developed as a brief performance-based measure designed to assess cognitive skills in a functional context.

In-Depth Review

Purpose of the measure

The Kettle Test was developed as a brief performance-based measure designed to assess cognitive skills in a functional context. The Kettle Test can be used to evaluate the capacity for independent community living in clients with cognitive impairments. Using the functional task of preparing a hot beverage, the cognitive-functional and problem-solving skills of the client are assessed.

Available versions

The Kettle Test was developed by Dr. Adina Hartman-Maeir, Nira Armon and Dr. Noomi Katz in 2005, and later validated (Hartman-Maeir, Harel & Katz, 2009).

Features of the measure

Items:

The task of preparing two hot beverages is broken down into 13 discrete steps that can be evaluated. These items are described below.

Description of task

The client prepares two cups of hot beverages – one for him/herself and another for the examiner. The examiner asks the client to prepare a hot drink that differs in two ingredients from the one the client chose for him/herself.

  1. Opening the water faucet
  2. Filling the kettle with approximately 2 cups of water
  3. Turning off the faucet
  4. Assembling the kettle
  5. Attaching the electric cord to the kettle
  6. Plugging the electric cord in an electric socket
  7. Turning on the kettle
  8. Assembling the ingredients
  9. Putting the ingredients into the cups
  10. Picking up the kettle when water boils.
  11. Pouring the water into the cups.
  12. Adding milk
  13. Indication of task completion (e.g. verbal, gesture, serving)

What to consider before beginning:

The kettle must be disassembled and the equipment set up.

Scoring and Score Interpretation:

All 13 discrete steps of the task are scored on a scale from 0 to 4. The total score ranges from 0 to 52, with higher scores indicating the need for greater assistance (a scoring sketch follows the scale below). The administrator should note any cueing provided to the client in the “comments” section.

The following scoring scale should be used:

  • 0 = Performance intact.
  • 1 = Item completed independently but completed slowly, by trial and error and/or performance was questionable.
  • 2 = Received general cues
  • 3 = Received specific cueing; or
    Performance was incomplete (for example, only places part of ingredients in cup, removes the kettle before water boils etc.); or
    Performance is deficient (for example, places lid of kettle upside down, uses wrong ingredients or fails to perform step, for example did not turn on kettle, did not add milk etc.)
  • 4 = Received physical demonstration or assistance.
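
A minimal Python sketch of this scoring arithmetic (the function name and the example ratings are ours):

```python
# Illustrative Kettle Test scoring: 13 discrete steps, each rated 0-4.
def kettle_test_total(step_scores):
    assert len(step_scores) == 13, "one rating per discrete step"
    assert all(0 <= s <= 4 for s in step_scores), "each step is rated 0-4"
    return sum(step_scores)  # total 0-52; higher = more assistance needed

# Example: intact performance on every step except general cues (2) on step 8
# (assembling the ingredients) gives a total of 2.
print(kettle_test_total([0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0]))
```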

Following performance, the client and administrator are asked to comment on the following:

  1. Description of the process by the examiner.
  2. Recall of the instructions by the client: “What were the steps you had to do?”
  3. The client’s description of the process: “Describe to me what you did from the beginning to the end of the task.”
  4. Rating of performance by the client: “How do you rate your performance on this task between 0 to 100 percent?” (If the client cannot rate his/her performance then suggest the following options: “very good”, “fair”, “not so good”, “not good at all”).
  5. Rating of difficulty by the client: “How difficult was the task for you? Easy (able to do it by yourself easily); a little difficult; or very difficult (I needed help)”.
  6. Additional comments

Please note that as with most tests that involve everyday problem solving tasks, immediate learning may occur which may impact performance on retesting.

Time:

The average completion time has not been reported; however, it is estimated that the Kettle Test takes approximately 5-20 minutes to complete.

Training requirements:

There is no formal training required to administer the Kettle Test; however, the examiner should have some experience and training in observational evaluation of functional performance. Familiarity with the process and scoring is also recommended.

Subscales:

None typically reported.

Equipment:

  • Electric kettle: it is important to use a kettle that can be disassembled because assembly of the kettle is part of the task.
  • Ingredients for beverages (e.g. instant/decaffeinated coffee, black/herbal tea, sugar/artificial sweeteners, milk, honey)
  • Other ingredients (to be used to distract the client, e.g. salt, pepper, oil)
  • Tray
  • Dishes and utensils for use during the task, plus extra to distract the client (3 cups, milk pitcher, a bowl, 2 plates, 3 tea spoons, a large spoon, 2 forks, a knife, can opener)

Alternative form of the KT

There are no alternative versions of the Kettle Test.

Client suitability

Can be used with:

  • Clients with stroke, who were living independently in the community prior to stroke.
  • Clients with stroke who understand spoken or written language.

Should not be used in:

  • Clients who do not understand spoken or written language.
  • Since the Kettle Test is administered through direct observation of a task, it cannot be completed by a proxy respondent.

In what languages is the measure available?

The manual has only been released in English (Hartman-Maeir, Armon & Katz, 2005); however, only comprehension of spoken language is required of the client during administration.

Summary

What does the tool measure? The Kettle Test measures cognitive skills in a functional context.
What types of clients can the tool be used for? Clients with stroke who were living independently in the community prior to stroke
Is this a screening or assessment tool? Assessment tool
Time to administer Approximately 5 to 20 minutes.
Versions There are no alternative versions.
Other Languages None
Measurement Properties
Reliability Internal consistency:
No studies have examined the internal consistency of the Kettle Test.

Test-retest:
No studies have examined the test-retest reliability of the Kettle Test.

Intra-rater:
No studies have examined the intra-rater reliability of the Kettle Test.

Inter-rater:
One study examined the inter-rater reliability of the Kettle Test and reported excellent inter-rater reliability.

Validity Construct:
Convergent:
One study reported excellent correlation with the Functional Independence Measure (FIM) Cognitive scale and adequate correlation with the Mini-Mental Status Examination (MMSE), Clock Drawing Test and the Behavioural Inattention Test (BIT) Star Cancellation subtest.

Known groups:
The Kettle Test was able to discriminate clients with stroke from healthy controls.

Floor/Ceiling Effects Not yet examined in a stroke population.
Does the tool detect change in patients? Not yet examined in a stroke population.
Acceptability The Kettle Test is accepted by clients with stroke as it involves a real-life functional task.
Feasibility The administration of the Kettle Test is easy and quick to perform.
How to obtain the tool? A preliminary version of the Kettle Test manual can be obtained from: https://www.sralab.org/rehabilitation-measures/kettle-test

Psychometric Properties

Overview

We conducted a literature search to identify all relevant publications on the psychometric properties of the Kettle Test. We identified only one study on the psychometric properties of the Kettle Test, which was published in part by the developers of the measure. More studies are required before definitive conclusions can be drawn regarding the reliability and validity of the Kettle Test.

Floor/Ceiling Effects

Not yet examined in a stroke population.

Reliability

Internal Consistency:
Not yet examined in a stroke population.

Test-retest:
Not yet examined in a stroke population.

Intra-rater:
Not yet examined in a stroke population.

Inter-rater:
Hartman-Maeir, Harel & Katz (2009) examined the inter-rater reliability of the Kettle Test in 21 clients with stroke admitted to one of two rehabilitation hospitals. Clients were within 1 month post-stroke and had been living independently prior to stroke. Inter-rater reliability between four occupational therapists, as measured using Spearman correlation coefficients, was excellent at both sites (r=.851, p=.001; and r=.916, p=.000).

Validity

Content:

Not yet examined in a stroke population.

Criterion:

Concurrent:
Not yet examined in a stroke population.

Predictive:
Not yet examined in a stroke population.

Construct:

Convergent/Discriminant:
Hartman-Maeir, Harel & Katz (2009) examined the convergent validity of the Kettle Test by comparing it to other commonly used measures of cognitive ability in 36 clients with stroke and 36 healthy controls. Correlations were calculated using Pearson Correlation Coefficients. Excellent correlation was found between the Kettle Test and the Cognitive domain of the Functional Independence Measure (FIM) (r=-.659). Adequate correlations were found between the Kettle Test and the Mini-Mental Status Examination (MMSE), Clock Drawing Test and the Behavioural Inattention Test (BIT) Star Cancellation subtest (r=-.478; r=-.566; and r=-.578 respectively).

Known groups:
Hartman-Maeir, Harel & Katz (2009) verified the ability of the Kettle Test to discriminate between healthy controls (n=36) and individuals with stroke (n=36). The healthy controls showed little variability in performance and all scored within a narrow range of 0 to 3 points. The individuals with stroke demonstrated great variability in performance and scored within a large range of 1 to 29 points (with higher scores indicating greater need for assistance). The patients with stroke required significantly more assistance in completing the Kettle Test whereas the healthy controls required very minimal to no assistance.

Ecological:

Hartman-Maeir, Harel & Katz (2009) investigated the ecological validity of the Kettle Test in 36 patients with stroke. Basic activities of daily living (BADL) and safety were measured prior to discharge home, using the Motor domain of the Functional Independence Measure (FIM) and the Safety Rating Scale portion of the Routine Task Inventory (RTI-E) (Allen, 1989; Katz 2006). One month later instrumental activities of daily living (IADL) were assessed using the IADL Scale (Lawton & Brody, 1969). The Kettle Test was found to have excellent correlation with the Motor domain of the FIM (r=-.759) and adequate correlation with the Safety Rating Scale of the RTI-E and the IADL Scale (r=-.571 and r=-.505 respectively), using Pearson correlation coefficients. The results of this study suggest that performance on the Kettle Test is representative of the functional outcome of patients who are discharged to home.

Responsiveness

Not yet examined in a stroke population.

References

  • Hartman-Maeir, A., Armon, N. & Katz, N. (2005). The Kettle Test: A cognitive functional screening test. Unpublished protocol. Hebrew University, Jerusalem, Israel. Retrieved on February 1, 2010 from: http://www.rehabmeasures.org/Lists/RehabMeasures/DispForm.aspx?ID=939
  • Hartman-Maeir, A., Harel, H. & Katz, N. (2009). Kettle Test – A brief measure of cognitive functional performance: Reliability and validity in a stroke population. American Journal of Occupational Therapy, 64, 592-599.

See the measure

How to obtain the Kettle Test?

https://www.sralab.org/rehabilitation-measures/kettle-test


Mini-Mental State Examination (MMSE)

Evidence Reviewed as of before: 07-11-2010
Author(s)*: Lisa Zeltzer, MSc OT
Editor(s): Nicol Korner-Bitensky, PhD OT; Elissa Sitcoff, BA BSc

Purpose

The Mini-Mental State Examination (MMSE) was originally developed as a brief screening tool to provide a quantitative evaluation of cognitive impairment and to record cognitive changes over time (Folstein, Folstein, & McHugh, 1975). Since that time it has become recognized that repeated use of the MMSE with the same client reduces its validity, so it is advised that this screening tool not be used repeatedly with the same individual if the time interval between testing is short. Rather than provide a diagnosis, the measure should be used to detect the presence of cognitive impairment (Folstein, Robins, & Helzer, 1983). The MMSE briefly measures orientation to time and place, immediate recall, short-term verbal memory, calculation, language, and constructional ability. While the measure was originally used to detect dementia within a psychiatric setting, its use has become widespread. Since 1993, the MMSE has been available with an attached table that enables patient-specific norms to be identified on the basis of age and educational level (Crum, Anthony, Bassett, & Folstein, 1993).

In-Depth Review

Purpose of the measure

The Mini-Mental State Examination (MMSE) was originally developed as a brief screening tool to provide a quantitative evaluation of cognitive impairment and to record cognitive changes over time (Folstein, Folstein, & McHugh, 1975). Since that time it has become recognized that repeated use of the MMSE with the same client reduces its validity, so it is advised that this screening tool not be used repeatedly with the same individual if the time interval between testing is short. Rather than provide a diagnosis, the measure should be used to detect the presence of cognitive impairment (Folstein, Robins, & Helzer, 1983). The MMSE briefly measures orientation to time and place, immediate recall, short-term verbal memory, calculation, language, and constructional ability. While the measure was originally used to detect dementia within a psychiatric setting, its use has become widespread. Since 1993, the MMSE has been available with an attached table that enables patient-specific norms to be identified on the basis of age and educational level (Crum, Anthony, Bassett, & Folstein, 1993).

Available versions

The MMSE was published by Folstein et al. in 1975.

Features of the measure

Items:

The MMSE consists of 11 simple questions or tasks that look at various functions, including arithmetic, memory and orientation.

Scoring:

The score is the number of correct items. The measure yields a total score of 30. A score of 23 or less is the generally accepted cutoff point indicating the presence of cognitive impairment (Ruchinskas & Curyto, 2003).

Levels of impairment have also been classified as none (24-30); mild (18-23) and severe (0-17) (Tombaugh & McIntyre 1992).

More recently, Folstein, Folstein, McHugh, and Fanjiang (2001) recommended the following cutoff scores:

Score Level of impairment
≥ 27 None
21-26 Mild
11-20 Moderate
≤ 10 Severe
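
These cutoffs translate directly into a classification rule. The Python sketch below is illustrative only; the bands are those of Folstein et al. (2001) above.

```python
# Illustrative classification using the Folstein et al. (2001) cutoffs above.
def mmse_impairment_level(score: int) -> str:
    if not 0 <= score <= 30:
        raise ValueError("MMSE total scores range from 0 to 30")
    if score >= 27:
        return "None"
    if score >= 21:
        return "Mild"
    if score >= 11:
        return "Moderate"
    return "Severe"  # 10 or below
```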

Crum et al. (1993) reported that cognitive performance as measured by the MMSE varies within the population by age and educational level. There is an inverse relationship between MMSE scores and age, ranging from a median of 29 for those aged 18 to 24 years, to 25 for individuals 80 years of age and older. There is also an inverse relationship between MMSE scores and education. The median MMSE score is 29 for individuals with at least 9 years of schooling, 26 for those with 5 to 8 years of schooling, and 22 for those with 0 to 4 years of schooling.

The following table, created by Crum et al. (1993), can be used to compare your patient’s MMSE score with a reference group based on age and education level.

(Source: Crum et al., 1993)

Age
Education 20-24 25-29 30-34 35-39 40-44
4th grade 22 25 25 23 23
8th grade 27 27 26 26 27
High school 29 29 29 28 28
College 29 29 29 29 29
Age
Education 45-49 50-54 55-59 60-64 65-69
4th grade 23 23 22 23 22
8th grade 26 27 26 26 26
High school 28 28 28 28 28
College 29 29 29 29 29
Age
Education 70-74 75-79 80-84 >84
4th grade 22 21 20 19
8th grade 25 25 25 23
High school 27 27 25 26
College 28 28 27 27
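
To illustrate how the table is used, the sketch below compares a client’s score with the age- and education-specific median. Only a few entries from the table above are transcribed; all names are ours.

```python
# Illustrative comparison against the Crum et al. (1993) reference medians.
# Only the 70-74 age band from the table above is transcribed here.
REFERENCE_MEDIANS = {
    ("4th grade", "70-74"): 22,
    ("8th grade", "70-74"): 25,
    ("High school", "70-74"): 27,
    ("College", "70-74"): 28,
}

def difference_from_norm(score, education, age_band):
    """Positive = above the reference median for this age/education group."""
    return score - REFERENCE_MEDIANS[(education, age_band)]

# Example: a 72-year-old client with a high-school education scoring 24
print(difference_from_norm(24, "High school", "70-74"))  # -3
```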

Subscales:

Orientation (total points = 10), Registration (total points = 3), Attention and calculation (total points = 5), Recall (total points = 3), and Language (total points = 9).

Equipment:

The MMSE requires no specialized equipment.

Training:

Little information has been reported on training for the MMSE; however, a standardized version of the MMSE has been developed (Molloy & Standish, 1997).

Time:

Administration by a trained interviewer takes approximately 10 minutes.

Alternative form of the MMSE

The Modified Mini-Mental State Examination (3MS) (Teng & Chui, 1987).

An expanded version of the MMSE was developed by Teng and Chui (1987), increasing the content, number and difficulty of items included in the assessment. The score of the 3MS ranges from 0 to 100, with a standardized cut-off point of 79/80 for the presence of cognitive impairment. This expanded assessment takes approximately 5 minutes more to administer than the original MMSE, which takes approximately 10 minutes to complete. Grace et al. (1995) compared the MMSE to the 3MS in geriatric patients with stroke. Test-retest reliability of the 3MS was excellent (r = 0.80). The 3MS also correlated with a battery of neuropsychological assessments and with some cognitive domains missed by the MMSE. The 3MS was a significantly better predictor of functional outcome (as measured by the Functional Independence Measure) than the MMSE. The 3MS was found to have higher sensitivity than the MMSE (69% vs. 44%) and similar specificity (80% vs. 79%). The area under the curve (AUC) was 0.798 for the 3MS.

3MS + Clock-drawing (Suhr & Grace, 1999).

The addition of clock drawing, a simple measure of constructional ability, increased the sensitivity of the 3MS in detecting focal brain damage in patients with right hemisphere stroke (87%). The addition of the Clock Drawing Test requires about two extra minutes of administration time.

Standardized MMSE (SMMSE) (Molloy & Standish, 1997).

Molloy and Standish (1997) developed the SMMSE to improve the reliability of the measure by establishing strict guidelines for administration and scoring. To examine the reliability of the SMMSE in 48 older adults, university students were randomized to administer either the MMSE or the SMMSE, and were trained on that test to give to participants on three different occasions. The SMMSE had significantly better inter-rater and intra-rater reliability compared with the MMSE: inter-rater variance was reduced by 76% and intra-rater variance by 86%. It took less time to administer the SMMSE than the MMSE (average 10.5 minutes and 13.4 minutes, respectively). The intraclass correlation (ICC) for the MMSE was adequate (ICC = 0.69), and was excellent for the SMMSE (ICC = 0.90).

Telephone version (ALFI-MMSE) (Roccaforte, Burke, Bayer, & Wengel, 1992).

This version includes 22/30 of the original MMSE items, the majority of which were removed from the last section (language and motor skills). Roccaforte et al. (1992) examined the validity of the ALFI-MMSE in 100 geriatric outpatients. Correlations between phone and face-to-face versions of the MMSE were excellent (Pearson’s r = 0.85). Patients tended to score slightly higher on in-person testing than on the telephone. Sensitivity (using a brief neurological screening test as the criterion) of 67% and specificity of 100% were reported in a population of elderly, community-dwelling individuals. This was similar to the sensitivity (68%) and specificity (100%) reported for screening with the traditional MMSE.

26-item version of the ALFI-MMSE (T-MMSE) (Roccaforte et al. cited in Newkirk, Kim, Thompson, Tinklenberg, Yesavage, & Taylor, 2004).

The T-MMSE was developed from the ALFI-MMSE. It is a 26-point adaptation, containing a 3-step command: “Say hello, tap the mouthpiece of the phone 3 times, then say I’m back”. It also contains a new question that asks the patient to give the interviewer a phone number where they can usually be reached. The T-MMSE had an excellent correlation with the MMSE (r = 0.88). Neither hearing impairment nor years of education were associated with T-MMSE scores. On the 22 points in common between the 2 scales, scores had an excellent correlation (r = 0.88); however, telephone scores tended to be lower than face-to-face scores (Newkirk et al., 2004). The authors provide tables for the conversion of T-MMSE scores to MMSE scores.

Client suitability

Can be used with:

  • Patients with stroke (Agrell & Dehlin, 2000; Ozdemir, Birtane, Tabatabaei, Ekuklu, & Kokino, 2001; Grace et al., 1995; Suhr & Grace, 1999).

Should not be used with:

  • The MMSE was ineffective in detecting cognitive impairment in patients with right-sided stroke (Grace et al., 1995).
  • The MMSE is not suitable for use with a proxy respondent as it is administered via direct observation of task completion.
  • Because the MMSE is heavily language dependent, it is likely to misclassify patients with aphasia.
  • The MMSE has a limited ability to diagnose dementia in general practice and should therefore be used as only one aspect of a patient’s overall cognitive profile (Wind, Schellevis, van Staveren, Scholten, Jonker, & van Eijk, 1997).
  • The MMSE has been criticized for attempting to assess too many functions in one brief test. An individual’s performance on individual items or within a single domain may be more useful than interpretation of a single, overall score (Tombaugh & McIntyre 1992). However, when used to screen for visual or verbal memory problems, or for problems in orientation or attention, it is not possible to identify acceptable cut-off scores (Blake, McKinney, Treece, Lee, & Lincoln, 2002).
  • MMSE scores have been shown to be affected by age, level of education, ethnicity, and sociocultural background (Tombaugh & McIntyre, 1992; Bleecker et al., 1988; Lorentz et al., 2002; Shadlen, Larson, Gibbons, McCormick, & Teri, 1999). These variables may introduce bias leading to the misclassification of individuals. For example, highly educated individuals with mild dementia may score within the normal range on the MMSE because they find the questions easy, while poorly educated individuals may obtain low scores simply because they find the questions difficult, suggesting a diagnosis of dementia where none is present. These biases are not always present (Agrell and Dehlin, 2000, found that age and education did not influence scores in their study), but attention to these factors is warranted when interpreting MMSE results.
  • The MMSE has been found to lack sensitivity in patients with stroke (Blake et al., 2002; Suhr & Grace, 1999; Nys et al., 2005). Other studies have reported low levels of sensitivity among individuals with mild cognitive impairment (Tombaugh & McIntyre, 1992; de Koning et al., 1998) and in patients with right-hemisphere lesions (Dick et al., 1984). One potential solution to increase the sensitivity of the MMSE is the addition of a Clock Drawing Test (Suhr & Grace, 1999). Another solution that has been offered is to administer the Neurobehavioral Cognitive Status Examination (NCSE) in lieu of the MMSE; the NCSE is highly sensitive in detecting cognitive impairment in patients with brain lesions (Schwamm, Van Dyke, Kiernan, Merrin, & Mueller, 1987).
  • Da Costa et al. (2010) investigated the cognitive evolution and clinical severity of illiterate and schooled patients with stroke during a 6-month follow-up, using the MMSE and National Institutes of Health Stroke Scale (NIHSS) respectively. Significant improvement in clinical severity as measured by NIHSS was observed in both groups (P<0.001); however, only schooled individuals showed a significant improvement in MMSE scores, indicating an improvement in their overall cognitive function (P=0.008). Schooling was found to significantly influence MMSE scores.
  • Folstein, Folstein, and McHugh (1998) reported that the MMSE demonstrates marked ceiling effects in younger intact individuals and marked floor effects in moderately to severely impaired individuals.

In what languages is the measure available?

Afrikaans; Arabic; Argentinean Spanish; Austrian German; Belgian Dutch; Belgian French; Bosnian; Brazilian Portuguese; Bulgarian; Chilean Spanish; Chinese; Croatian; Czech; Danish; Dutch; Estonian; Filipino; Finnish; French; German; Greek; Gujarati; Hebrew; Hindi; Hungarian; Indian English; Israeli English; Italian; Japanese; Kannada; Korean; Latvian; Lithuanian; Macedonian; Malayalam; Marathi; Norwegian; Polish; Portuguese; Romanian; Russian; Russian for Estonia; Serbian; Slovakian; South African English; Spanish; Swedish; Telugu; Turkish; UK English; Ukrainian; Urdu.

Authorized translations of the MMSE can be obtained by contacting Custsupp@parinc.com or by calling 1.800.331.8378.

Summary

What does the tool measure? Cognitive impairment
What types of clients can the tool be used for? While originally used to detect dementia within a psychiatric setting, its use is now widespread, and tables of population-based norms by age and education are available to aid interpretation of patient scores.
Is this a screening or assessment tool? Screening
Time to administer Administration by a trained interviewer takes approximately 10 minutes.
Versions The modified mini-mental state examination (3MS); 3MS + Clock-drawing; Standardized MMSE (SMMSE); Telephone version (ALFI-MMSE); 26-item version of the ALFI-MMSE (T-MMSE)
Other Languages Afrikaans; Arabic; Argentinean Spanish; Austrian German; Belgian Dutch; Belgian French; Bosnian; Brazilian Portuguese; Bulgarian; Chilean Spanish; Chinese; Croatian; Czech; Danish; Dutch; Estonian; Filipino; Finnish; French; German; Greek; Gujarati; Hebrew; Hindi; Hungarian; Indian English; Israeli English; Italian; Japanese; Kannada; Korean; Latvian; Lithuanian; Macedonian; Malayalam; Marathi; Norwegian; Polish; Portuguese; Romanian; Russian; Russian for Estonia; Serbian; Slovakian; South African English; Spanish; Swedish; Telugu; Turkish; UK English; Ukrainian; Urdu
Floor/Ceiling effects Folstein, Folstein, and McHugh (1998) reported that the MMSE demonstrates marked ceiling effects in younger intact individuals and marked floor effects in individuals with moderate to severe cognitive impairment.
Reliability Internal consistency:
Out of nine studies examining the internal consistency of the MMSE, three reported poor internal consistency, two reported adequate internal consistency, two reported poor to excellent internal consistency, one reported excellent internal consistency, and one reported excellent internal consistency in patients with Alzheimer’s disease but poor internal consistency in participants without cognitive impairment.

Test-retest:
Out of six studies examining the test-retest reliability of the MMSE, two reported excellent test-retest reliability, one reported adequate test-retest reliability, one reported adequate to excellent test-retest reliability, one reported poor to adequate test-retest reliability, and one reported poor test-retest reliability.

Inter-rater:
Out of three studies examining the inter-rater reliability of the MMSE, one reported excellent inter-rater reliability and two reported adequate inter-rater reliability.

Validity Criterion:
The MMSE can discriminate between patients with Alzheimer’s Disease and frontotemporal dementia; can discriminate between patients with left- and right-hemispheric stroke.

Construct:
Concurrent:
The MMSE had a poor correlation with the Mattis Dementia Rating Scale; poor to excellent correlations with the Wechsler Adult Intelligence Scale; an adequate correlation with the Functional Independence Measure; and significant correlations with the Montgomery Asberg Depression Rating Scale and the Zung Depression Scale.

Predictive:
MMSE scores were found to be predictive of functional improvement in patients with stroke following rehabilitation; discharge destination; development of functional dependence at a 3-year follow-up; ambulatory level; length of hospital stay; and death.

Does the tool detect change in patients? Not applicable.
Acceptability The MMSE is brief to administer. Patient variables such as age, level of education and sociocultural background may affect scores on the measure. It is administered by direct observation and is therefore not appropriate for proxy use.
Feasibility No specialized equipment is required, making the MMSE a highly portable and inexpensive measure. However, one study reported that physicians found the MMSE too lengthy and felt it contributed little useful information.
How to obtain the tool? The MMSE can be obtained from the current copyright owner, Psychological Assessment Resources (PAR).

Psychometric Properties

Overview

We conducted a literature search to identify all relevant publications on the psychometric properties of the MMSE.

Floor/Ceiling Effects

Folstein, Folstein, and McHugh (1998) reported that the MMSE demonstrates marked ceiling effects in younger intact individuals and marked floor effects in individuals with moderate to severe impairment.

Reliability

Internal consistency:
Tombaugh and McIntyre (1992) reviewed studies published on the psychometric properties of the MMSE over the last 26 years. The internal consistency of the MMSE was reported to range from poor to excellent (alpha = 0.54 to 0.96).
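For readers unfamiliar with how these coefficients are derived, the sketch below shows the standard Cronbach’s alpha formula applied to a respondents-by-items score matrix. It is a minimal illustration only; the item scores are invented and do not come from any study reviewed here.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Invented scores: 6 respondents on 4 dichotomous (pass/fail) items.
scores = np.array([
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
])
print(round(cronbach_alpha(scores), 2))  # 0.67 for these invented data
```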

McDowell, Kristjansson, Hill, and Hebert (1997) examined the internal consistency of the MMSE used as a screening test for cognitive impairment and dementia. The internal consistency was adequate (alpha = 0.78).

Holzer, Tischler, Leaf, and Myers (1984) examined the prevalence of dementia in a community sample (n = 4,917). In this study, the internal consistency of the MMSE was found to be adequate (alpha = 0.77). Reliability of individual items ranged from poor (alpha = 0.43 for Orientation) to excellent (alpha = 0.82 for Registration). Calculation/attention items were omitted from this study.

Kay, Henderson, Scott, Wilson, Rickwood, and Grayson (1985) conducted a community survey in 274 individuals over 70 years of age. Rates of dementia were measured by interviewing participants with the MMSE. In this study, the internal consistency of the MMSE was poor (alpha = 0.68).

Foreman (1987) examined the reliability of the MMSE in 66 hospitalized medical-surgical patients (normal, dementia, or delirium) over 65 years of age. The MMSE was found to have an excellent internal consistency (alpha = 0.96).

Jorm, Scott, Henderson, and Kay (1988) examined whether there was a bias in the MMSE such that individuals with less education (8th grade or less) would perform worse on the measure than individuals with more education (more than 8th grade). The MMSE was administered to 269 elderly participants. The internal consistency was found to be poor in both the more educated group (alpha = 0.54) and the less educated group (alpha = 0.65).

Albert and Cohen (1992) administered the MMSE to 40 elderly residents with severe cognitive impairment. The internal consistency of the MMSE was poor in patients with an MMSE score ≤ 10 (alpha = 0.56). However, when subjects representing the full range of MMSE scores were included, the internal consistency was excellent (alpha = 0.90).

Tombaugh, McDowell, Kristjansson, and Hubley (1996) compared the psychometric properties of the MMSE to the 3MS in community-dwelling participants between the ages of 65-89. Participants were divided into two groups, one with no cognitive impairment (n = 406) and one with Alzheimer’s disease (n = 119). The internal consistency of the MMSE was poor in the group without cognitive impairment (alpha = 0.62) and was found to be excellent in patients with Alzheimer’s disease (alpha = 0.81).

Hopp, Dixon, Grut, and Backman (1997) administered the MMSE to 44 adults without dementia, who were over the age of 75 years. In this sample, the internal consistency of the MMSE was poor (alpha ranged from 0.31 to 0.52).

Test-retest:
Tombaugh and McIntyre (1992) reviewed studies published on the psychometric properties of the MMSE over the last 26 years. They reported that in studies having a re-test interval of < 2 months, the MMSE has poor to excellent test-retest reliability with correlations ranging from 0.38 to 0.99. Twenty-four out of 30 studies reported excellent test-retest reliability (r > 0.75).

Folstein et al. (1975) administered the MMSE to 206 patients with dementia syndromes, affective disorder, affective disorder with cognitive impairment, mania, schizophrenia, personality disorders, and to 63 healthy controls. The test-retest reliability of the MMSE when administered twice within 24 hours was excellent, with a Pearson correlation coefficient of r = 0.89. When the MMSE was given to patients with depression and dementia twice, 28 days apart, the correlation was excellent, with a Pearson correlation of r = 0.99.
Note: Pearson correlation coefficients are likely to over-estimate reliability and the Pearson is no longer used for test-retest reliability.
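The note above can be made concrete: Pearson’s r indexes only linear association, so it remains 1.0 even when every retest score shifts by a constant, whereas an agreement index such as the ICC(2,1) penalizes such systematic shifts. The sketch below demonstrates this with invented scores; it illustrates the statistical point and is not a reanalysis of any study cited here.

```python
import numpy as np

def icc_2_1(data: np.ndarray) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rating.
    `data` has shape (n_subjects, k_occasions)."""
    n, k = data.shape
    grand = data.mean()
    ms_rows = k * ((data.mean(axis=1) - grand) ** 2).sum() / (n - 1)
    ms_cols = n * ((data.mean(axis=0) - grand) ** 2).sum() / (k - 1)
    ss_error = ((data - grand) ** 2).sum() - ms_rows * (n - 1) - ms_cols * (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n)

# Invented data: every retest score is exactly 3 points higher than at test.
test = np.array([20.0, 22, 24, 26, 28])
retest = test + 3
print(np.corrcoef(test, retest)[0, 1])           # Pearson r = 1.0 (shift invisible)
print(icc_2_1(np.column_stack([test, retest])))  # ICC(2,1) ~ 0.69 (shift penalized)
```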

Schmand, Lindeboom, Launer, Dinkgreve, Hooijer, and Jonker (1995) examined the test-retest reliability of the MMSE in healthy older subjects who were examined twice with an interval of 1 year between evaluations. Test-retest reliability was adequate (Spearman’s correlation = 0.58). The results of this study are similar to those found in O’Connor et al. (1989). These results suggest that the MMSE is not an appropriate measure for detecting subtle cognitive impairment.

Hopp et al. (1997) administered the MMSE to 44 adults without dementia, who were over the age of 75 years. The test-retest reliability over 6-, 12-, and 18-month intervals, using Pearson’s correlations, ranged from adequate to excellent (r = 0.56 to r = 0.80).

Olin and Zelinski (1991) examined the 12-month reliability of the MMSE in 57 elderly participants without dementia. Poor 12-month test-retest correlations were found for the total MMSE score (r = 0.34 when administering the alternate Attention item; r = 0.23 when administering the same Attention item).

Uhlmann, Larson, and Buchner (1987) also examined the 12-month test-retest reliability of the MMSE in outpatients with dementia. In this study, the test-retest reliability was found to be excellent (r = 0.86).

Mitrushina and Satz (1991) examined the test-retest reliability of the MMSE in 122 healthy community-residing elderly volunteers between the ages of 57-85. The test-retest reliability of the MMSE was adequate (ranging from r = 0.45 to r = 0.50) over a 1-year interval, and poor over a 2-year period (r = 0.38).

Intra-rater/Inter-rater:
Molloy and Standish (1997) examined the intra-rater reliability of the MMSE in comparison to the SMMSE in 48 older adults. University students, who were trained to administer either the MMSE or the SMMSE, tested participants on three different occasions to assess their inter-rater and intra-rater reliability. An adequate ICC of 0.69 was reported for the traditional MMSE.

Inter-rater:
Dick et al. (1984) examined the inter-rater reliability of the MMSE in patients with neurological disorders and reported a kappa of 0.63, demonstrating the adequate inter-rater reliability of the MMSE.

Fabrigoule, Lechevallier, Crasborn, Dartigues, and Orgogozo (2003) examined the reliability of the MMSE in patients who were likely to develop dementia. Fifty trained general practitioners and psychologists examined patients. There was a significant difference in scores between the general practitioners and the psychologists for the MMSE. The concordance correlation coefficient was 0.87 between evaluations performed by general practitioners and those performed by psychologists.

In a study by O’Connor et al. (1989), 5 coders rated taped interviews with 54 general practice patients aged 75 and over. In this study, the inter-rater reliability was excellent, with a mean kappa value of 0.97.

Validity

Criterion:

Although the MMSE is generally considered unidimensional, Jones and Gallo (2000) identified five factors (concentration, language and praxis, orientation, memory, and attention) to support the construct validity of the MMSE as a measure of cognitive mental state among community dwelling older adults.

Concurrent:
Friedl, Schmidt, Stronegger, Fazekas, and Reinhart (1996) examined the concurrent validity of the MMSE and the Mattis Dementia Rating Scale (MDRS) (Mattis, 1976), two measures commonly used to screen for dementia. Concurrent validity between the MMSE and the MDRS was found to be poor (Pearson’s r = 0.29), as were correlations between the MMSE and MDRS subtests (attention r = 0.18; initiation and perseveration r = 0.04; construction r = 0.10; conceptualization r = 0.17; verbal and non-verbal short-term memory r = 0.27).

Folstein et al. (1975) administered the MMSE to 206 patients with dementia syndromes, affective disorder, affective disorder with cognitive impairment, mania, schizophrenia, personality disorders, and to 63 healthy controls. The concurrent validity of the MMSE was examined by correlating the measure with the Wechsler Adult Intelligence Scale (WAIS – Wechsler, 1955). The concurrent validity between the MMSE and the WAIS verbal IQ (r = 0.78) and the WAIS performance IQ (r = 0.66) were both excellent.

Hopp, Dixon, Grut, and Backman (1997) administered the MMSE and the Wechsler Adult Intelligence Scale-Revised (WAIS-R, Wechsler, 1981) to 44 adults without dementia, who were over the age of 75 years. Correlations between the MMSE and the WAIS-R Verbal IQ were adequate, ranging from r = 0.36 to r = 0.52. Correlations between the MMSE and WAIS-R Performance IQ were also adequate, ranging from r = 0.37 to r = 0.57. Correlations between the MMSE and the WAIS-R subtests ranged from poor to excellent (r = 0.20 to r = 0.60). Correlations between the MMSE subscales and the WAIS-R were generally lower than r = 0.41. The Language subscale of the MMSE showed the lowest correlations with both WAIS-R Verbal and WAIS-R Performance. Correlations between MMSE subscales and WAIS-R subtests showed that the MMSE subscale, Orientation, had the lowest correlations with all WAIS-R subtests (r = 0.001 to r = 0.40).

Similar to the results of Hopp et al. (1997), Dick et al. (1984) examined the utility of the MMSE for bedside screening and serial assessment of cognitive function in 126 neurological patients, and found adequate correlations between the MMSE and the Wechsler Adult Intelligence Scale (WAIS) (r = 0.55 for WAIS-Verbal; r = 0.56 for WAIS-Performance).

Agrell and Dehlin (2000) reported significant correlations between MMSE scores and the Barthel Index (Mahoney & Barthel, 1965), the Montgomery Asberg Depression Rating Scale (MADRS – Montgomery & Asberg, 1979) and the Zung Depression Scale (Zung, 1965).

Diamond, Felsenthal, Macciocci, Butler, and Lally-Cassady (1996) examined the relationship between cognition and ability to benefit from inpatient rehabilitation in 52 patients admitted to geriatric rehabilitation. Functional gain was assessed using the change in Functional Independence Measure (FIM – Keith, Granger, Hamilton, & Sherwin, 1987) score from admission to discharge. The MMSE was not found to be associated with change in FIM score (r = 0.10). However, the MMSE alone and in combination with age correlated adequately with functional status on admission (r = 0.58) and discharge (r = 0.49).

Predictive:
Ozdemir et al. (2001) examined the predictive validity of the MMSE in 43 patients with stroke. Baseline total MMSE scores were correlated with discharge Motor Functional Independence Measure (Keith et al., 1987) improvement (r = 0.31). The baseline Orientation subscore of the MMSE correlated significantly with functional ambulation score improvement as measured by the Adapted Patient Evaluation and Conference System functional scale (r = 0.31). These results suggest that baseline total MMSE scores are somewhat predictive of functional improvement in patients with stroke after rehabilitation.

Diamond et al. (1996) examined the relationship between cognition and the ability to benefit from inpatient rehabilitation in 52 patients admitted to geriatric rehabilitation. The MMSE was found to be highly predictive of discharge destination such that low MMSE scores were associated with a greater likelihood of nursing home placement (r = 0.68). While only 8% of the uppermost MMSE quartile was discharged to nursing home placement, 62% of the lowest MMSE quartile was discharged to nursing homes.

Aguero-Torres, Fratiglioni, Guo, Viitanen, von Strauss, and Winblad (1998) examined predictors of dependence in activities of daily living (as measured by the Katz index of Activities of Daily Living (Katz, Downs, Cash, Grotz, 1970)) in the elderly. In patients without dementia, the MMSE was found to be one of the strongest predictors for developing functional dependence at a 3-year follow-up interval. Lower MMSE scores were associated with functional dependence in both adults with dementia (OR = 0.8) and in adults without dementia (OR = 0.8). Initial MMSE performance also predicted future functional dependence and decline among adults without dementia (OR = 0.7). Thus, independent of the presence of other chronic conditions, the MMSE may indicate subsequent functional status in a cognitively intact elderly population.

Matsueda and Ishii (2000) retrospectively examined the relationship between MMSE score and ambulatory level (divided into three groups: dependent, partially dependent, and independent) in 162 elderly patients who experienced a hip fracture. A significant relationship was found between initial MMSE score and ambulatory level such that those in the dependent group had the lowest mean MMSE score of only 6.6, those in the partially dependent group had a mean score of 17.9, and those in the independent group had the highest MMSE score of 24.6.

Huusko, Karppi, Avikainen, Kautiainen, and Sulkava (2000) examined the effect of intensive geriatric rehabilitation (intervention group) versus local hospital treatment (control group) on patients with dementia and a hip fracture. MMSE scores were predictive of length of hospital stay: for patients with moderate dementia (MMSE score of 12-17), the median length of stay was 47 days in the intervention group and 147 days in the control group, while patients with mild dementia (MMSE score of 18-23) had lengths of stay of 29 days in the intervention group and 46.5 days in the control group. No significant differences in mortality or in the length of hospital stay were observed for patients with severe dementia. In the intervention group, 3 months after surgery, 91% of the patients with mild dementia and 63% of the patients with moderate dementia were living independently; in the control group, the corresponding figures were 67% and 17%, respectively. The results of this study suggest that the MMSE is associated with the length of hospital and rehabilitation stay, and that length of stay can be reduced by intervention for those with cognitive impairment.

Pettigrew, Thomas, Howard, Veltkamp, and Toole (2000) examined whether low MMSE scores predict transient ischemic attack, stroke, myocardial infarction, or death. Patients were randomized to receive either carotid endarterectomy or best medical therapy as a means to preserve cognition. A significant relationship was found between a low post-randomization MMSE score and an increased risk of death. Furthermore, patients who experienced stroke after randomization had a significant and persistent reduction in MMSE score.

Construct:

Convergent:
Snowden et al. (1999) examined 140 patients who were part of the Alzheimer’s Disease Patient Registry to evaluate the psychometric properties of a new measure, the Minimum Data Set (MDS). The cognitive performance scores from the MDS were correlated with the MMSE. The MMSE correlated adequately with the MDS (Spearman’s r = -0.45; this correlation is negative because a low score on the MMSE indicates cognitive impairment, whereas a high score on the MDS indicates impairment). Consistent with previous studies, the MMSE had excellent correlations with the Wechsler Adult Intelligence Scale (WAIS) Verbal and Performance IQ scores (r = 0.78 and r = 0.66, respectively).

Discriminant:
Winograd et al. (1994) developed the Physical Performance and Mobility Examination, a measure used to assess 6 domains of physical functioning and mobility for hospitalized elderly. The construct validity of this measure was examined by comparing it to the MMSE, Activities of Daily Living (ADL), Instrumental Activities of Daily Living (IADL) (Lawton & Brody, 1969), Geriatric Depression Scale (Yesavage et al., 1983), and modified Medical Outcomes Study Measure of Physical Functioning (MOS-PFR). The MMSE correlated poorly with the Physical Performance and Mobility Examination (r = 0.36), suggesting that these two measures assess different constructs.

Macnight and Rockwood (1995) examined the discriminant validity of the MMSE by comparing it to a new measure, the Hierarchical Assessment of Balance and Mobility (HABAM), in patients 65 and older. Discriminant validity was demonstrated, as the two measures correlated poorly (r = 0.15).

Known groups:
Wetherell, Darby, Emerson, and Miller (1997) found that the MMSE was able to discriminate between patients with Alzheimer’s Disease and frontotemporal dementia.

Kase, Wolf, Kelly-Hayes, Kannel, Beiser, and D’Agostino (1998) found that baseline pre-stroke MMSE scores were significantly lower for patients with stroke than were the scores for matched controls. This difference became more pronounced when the post-stroke scores were compared. The MMSE could discriminate between patients with left- and right-hemispheric stroke. In patients with right-hemispheric stroke, cognitive impairment was characterized by a significant decline in scores from pre-stroke to post-stroke specifically in the areas of orientation and language. For patients with left-hemisphere strokes, a significant decline in scores from pre-stroke to post-stroke was found in all five domains of the MMSE except memory.

Sensitivity and Specificity

Low reported levels of sensitivity, particularly among individuals with mild cognitive impairment, have been reported for the MMSE (Tombaugh & McIntyre, 1992; de Koning et al. 1998) and may be due to the emphasis placed on language items and a lack of items assessing visual-spatial ability (Grace et al. 1995; de Koning et al. 1998; Suhr & Grace, 1999).

Blake et al. (2002) examined the sensitivity and specificity of the MMSE for detecting cognitive impairment after stroke. An optimum cutoff of <24 was identified, with good specificity (88%) but only moderate sensitivity (62%). However, it was not possible to identify suitable cutoff scores for using the MMSE to screen for the presence of either visual or verbal memory deficits.

Nys, van Zandvoort, de Kort, Jansen, Kappelle, and de Haan (2005) administered the MMSE to 34 patients with stroke and 34 healthy controls. In this study, no optimum cut-off scores yielding both sensitivity greater than 80%, and specificity greater than 60%, could be identified.
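As a methodological aside, cutoff studies such as Blake et al. (2002) and Nys et al. (2005) evaluate candidate thresholds by computing sensitivity and specificity at each one. The sketch below, using invented scores and criterion diagnoses, shows one common way to do this, selecting the cutoff that maximizes Youden’s index (sensitivity + specificity − 1); the exact criterion used in the studies above may differ.

```python
import numpy as np

def sens_spec(scores, impaired, cutoff):
    """Sensitivity/specificity when screening positive means score < cutoff."""
    positive = scores < cutoff
    sensitivity = (positive & impaired).sum() / impaired.sum()
    specificity = (~positive & ~impaired).sum() / (~impaired).sum()
    return sensitivity, specificity

# Invented screening scores and criterion diagnoses (True = impaired).
scores = np.array([18, 20, 21, 23, 23, 25, 26, 27, 28, 29, 30])
impaired = np.array([1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0], dtype=bool)

# Scan candidate cutoffs; maximizing sensitivity + specificity is
# equivalent to maximizing Youden's J = sensitivity + specificity - 1.
best = max(range(19, 31), key=lambda c: sum(sens_spec(scores, impaired, c)))
print(best, sens_spec(scores, impaired, best))  # 26 (0.83, 0.80) for these data
```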

References

  • Agrell, B., Dehlin, O. (2000). Mini mental state examination in geriatric stroke patients. Validity, differences between subgroups of patients, and relationships to somatic and mental variables. Aging (Milano), 12(6), 439-444.
  • Aguero-Torres, H., Fratiglioni, L., Guo, Z., Viitanen, M., von Strauss, E., Winblad, B. (1998). Dementia is the major cause of functional dependence in the elderly: 3-year follow-up data from a population-based study. American Journal of Public Health, 88, 1452-1456.
  • Albert, M., Cohen, C. (1992). The test for severe impairment: An instrument for the assessment of patients with severe cognitive dysfunction. J Am Geriatr Soc, 40(5), 449-453.
  • Blake, H., McKinney, M., Treece, K., Lee, E., Lincoln, N. B. (2002). An evaluation of screening measures for cognitive impairment after stroke. Age and Ageing, 31, 451-456.
  • Bleecker, M. L., Bolla-Wilson, K., Kawas, C., Agnew, J. (1988). Age-specific norms for the Mini-Mental State Exam. Neurology, 10, 1565-1568.
  • Crum, R. M., Anthony, J. C., Bassett, S. S., Folstein, M. F. (1993). Population-based norms for the mini-mental state examination by age and educational level. JAMA, 18, 2386-2391.
  • Da Costa, F.A., Bezerra, I.F.D., de Araujo Silva, D.L., de Oliveira, R. & da Rocha, V.M. (2010). Cognitive evolution by MMSE in poststroke patients. International Journal of Rehabilitation Research, 33, 248-253.
  • de Koning, I., van Kooten, F., Dippel, D. W. J., van Harskamp, F., Grobbee, D. E., Kluft, C., Koudstaal, P. J. (1998). The CAMCOG: A useful screening instrument for dementia in stroke patients. Stroke, 29, 2080-2086.
  • Diamond, P. T., Felsenthal, G., Macciocci, S. N., Butler, D. H., Lally-Cassady, D. (1996). Effect of cognitive impairment on rehabilitation outcome. American Journal of Physical Medicine & Rehabilitation, 75(1), 40-43.
  • Dick, J. P., Guiloff, R. J., Stewart, A., Blackstock, J., Bielawska, C., Paul, E. A., Marsden, C. D. (1984). Mini-mental state examination in neurological patients. Journal of Neurology, Neurosurgery, and Psychiatry, 47, 496-499.
  • Fabrigoule, C., Lechevallier, N., Crasborn, L., Dartigues, J. F., Orgogozo, J. M. (2003). Inter-rater reliability of scales used to measure mild cognitive impairment by general practitioners and psychologists. Current Medial Research and Opinion, 19(7), 603-608.
  • Folstein, M. F., Folstein, S. E., McHugh, P. R. (1975). “Mini-mental state”. A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res, 12(3), 189-198.
  • Folstein, M. F., Folstein, S. E., McHugh, P. R. (1998). Key papers in geriatric psychiatry. Mini-Mental State: A practical method for grading the cognitive state of patients for the clinician. Int J Geriat Psychiatry, 13(5), 285-294.
  • Folstein, M. F., Folstein, S. E., McHugh, P. R., Fanjiang, G. (2001). Mini-Mental State Examination User’s Guide. Odessa, FL: Psychological Assessment Resources.
  • Folstein, M. F., Robins, L. N., Helzer, J. E. (1983). The Mini-Mental State Examination. Arch Gen Psychiatry, 40(7), 812.
  • Foreman, M. D. (1987). Reliability and validity of mental status questionnaires in elderly hospitalized patients. Nurs Res, 36(4), 216-220.
  • Friedl, W., Schmidt, R., Stronegger, W. J., Fazekas, F., Reinhart, B. (1996). Sociodemographic predictors and concurrent validity of the Mini Mental State Examination and the Mattis Dementia Rating Scale. European Archives of Psychiatry and Clinical Neuroscience, 246(6), 317-319.
  • Grace, J., Nadler, J. D., White, D. A., Guilmette, T. J., Giuliano, A. J., Monsch, A. U., Snow, M. G. (1995). Folstein vs modified Mini-Mental State Examination in geriatric stroke. Stability, validity, and screening utility. Archives of Neurology, 52(5), 477-484.
  • Holzer, C. E., Tischler, G. L., Leaf, P. J., Myers, J. K. (1984). An epidemiologic assessment of cognitive impairment in a community. Research in Community Mental Health, 4, 3-32.
  • Hopp, G. A., Dixon, R. A., Grut, M., Backman, L. (1997). Longitudinal and psychometric profiles of two cognitive status tests in very old adults. J Clin Psychol, 53(7), 673-686.
  • Huusko, T. M., Karppi, P., Avikainen, V., Kautiainen, H., Sulkava, R. (2000). Randomised, clinically controlled trial of intensive geriatric rehabilitation in patients with hip fracture: Subgroup analysis of patients with dementia. British Medical Journal, 321, 1107-1111.
  • Jones, R. N., Gallo, J. J. (2000). Dimensions of the Mini-Mental State Examination among community dwelling older adults. Psychological Medicine, 30, 605-618.
  • Jorm, A. F., Scott, R., Henderson, A. S., Kay, K. W. (1988). Educational level differences on the Mini-Mental State: The role of test bias. Psychol Med, 18(3), 727-731.
  • Kase, C. S., Wolf, P. A., Kelly-Hayes, M., Kannel, W. B., Beiser, A., D’Agostino, R. B. (1998). Intellectual decline after stroke: The Framingham study. Stroke, 29, 805-812.
  • Katz, S., Downs, T. D., Cash, H. R., Grotz, R. C. (1970). Index of Activities of Daily Living. The Gerontologist, 1, 20-30.
  • Kay, K. W., Henderson, A. S., Scott, R., Wilson, J., Rickwood, D., Grayson, D. A. (1985). Dementia and depression among the elderly living in the Hobart community: The effect of the diagnostic criteria on the prevalence rates. Psychol Med, 15(4), 771-788.
  • Keith, R. A., Granger, C. V., Hamilton, B. B., Sherwin, F. S. (1987). The functional independence measure: A new tool for rehabilitation. Adv Clin Rehabil, 1, 6-18.
  • Lawton, M. P., Brody, E. M. (1969). Assessment of older people: Self-maintaining and instrumental activities of daily living. Gerontologist, 9, 179-186.
  • Lorentz, W. J., Scanlan, J. M., Borson, S. (2002). Brief screening test for dementia. Can J Psychiatry, 47, 723-733.
  • Macnight, C., Rockwood, K. (1995). A hierarchical assessment of balance and mobility. Age and Ageing, 24(2), 126-130.
  • Mahoney, F. I., Barthel, D. W. (1965). Functional evaluation: The Barthel Index. Md State Med J, 14, 61-5.
  • Matsueda, M., Ishii, Y. (2000). The relationship between dementia score and ambulatory level after hip fracture in the elderly. American Journal of Orthopedics, 29, 691-693.
  • Mattis, S. (1976). Mental status examination for organic mental syndrome in the elderly patient. In: Bellak L, Karasu TB, editors. Geriatric Psychiatry. New York: Grune and Stratton, 77-101.
  • McDowell, I., Kristjansson, B., Hill, G. B., Hebert, R. (1997). Community screening for dementia: The Mini Mental State Exam (MMSE) and modified Mini-Mental State Exam (3MS) compared. Journal of Clinical Epidemiology, 50(4), 377-383.
  • Mitrushina, M., Satz, P. (1991). Reliability and validity of the Mini-Mental State Exam in neurologically intact elderly. J Clin Psychol, 47(4), 537-543.
  • Molloy, D. W., Standish, T. I. M. (1997). A guide to the Standardized Mini-Mental State Examination. International Psychogeriatrics, 9(1), 87-94.
  • Montgomery, S. A., Asberg, M. (1979). A new depression scale designed to be sensitive to change. Brit J Psychiat, 134, 382-389.
  • Newkirk, L. A., Kim, J. M., Thompson, J. M., Tinklenberg, J. R., Yesavage, J. A., Taylor, J. L. (2004). Validation of a 26-point telephone version of the Mini-Mental State Examination. Journal of Geriatric Psychiatry and Neurology, 17(2), 81-87.
  • Nys, G. M., van Zandvoort, M. J., de Kort, P. L., Jansen, B. P., Kappelle, L. J., de Haan, E. H. (2005). Restrictions of the Mini-Mental State Examination in acute stroke. Arch Clin Neuropsychol, 20(5), 623-629.
  • O’Connor, D. W., Pollitt, P. A., Hyde, J. B., Fellows, J. L., Miller, N. D., Brooke, C. P., Reiss, B. B. (1989). The reliability and validity of the Mini-Mental State in a British community survey. J Psychiatr Res, 23(1), 87-96.
  • Olin, J.T., Zelinski, E.M. (1991). The 12-month reliability of the Mini-Mental State Examination. Psychological Assessment, 3, 427-432.
  • Ozdemir, F., Birtane, M., Tabatabaei, R., Ekuklu, G., Kokino, S. (2001). Cognitive evaluation and functional outcome after stroke. American Journal of Physical Medicine & Rehabilitation, 80(6), 410-415.
  • Pettigrew, L. C., Thomas, N., Howard, V. J., Veltkamp, R., Toole, J. F. (2000). Low mini-mental status predicts mortality in asymptomatic carotid arterial stenosis. Neurology, 55, 30-34.
  • Roccaforte, W. H., Burke, W. J., Bayer, B. L., Wengel, S. P. (1992). Validation of a telephone version of the mini-mental state examination. J Am Geriatr Soc, 40(7), 697-702.
  • Ruchinskas, R. A., Curyto, K. J. (2003). Cognitive screening in geriatric rehabilitation. Rehab Psychol, 48, 14-22.
  • Schmand, B., Lindeboom, J., Launer, L., Dinkgreve, M., Hooijer, C., Jonker, C. (1995). What is a significant score change on the Mini-Mental State Examination? International Journal of Geriatric Psychiatry, 10, 411-414.
  • Schwamm, L. H., Van Dyke, C., Kiernan, R. J., Merrin, E. L., Mueller, J. (1987). The Neurobehavioral Cognitive Status Examination: Comparison with the Cognitive Capacity Screening Examination and the Mini-Mental State Examination in a neurosurgical population. Ann Intern Med, 107(4), 486-491.
  • Shadlen, M. F., Larson, E. B., Gibbons, L., McCormick, W. C., Teri, L. (1999). Alzheimer’s disease symptom severity in Blacks and Whites. Journal of the American Geriatrics Society, 47, 482-486.
  • Snowden, M., McCormick, W., Russo, J., Srebnik, D., Comtois, K., Bowen, J., Teri, L., Larson, E. B. (1999). Validity and responsiveness of the Minimum Data Set. Journal of the American Geriatrics Society, 47(8), 1000-1004.
  • Suhr, J. A., Grace, J. (1999). Brief cognitive screening of right hemisphere stroke: Relation to functional outcome. Arch Phys Med Rehabil, 80(7), 773-776.
  • Teng, E. L., Chui, H. C. (1987). The Modified Mini-Mental State (3MS) examination. J Clin Psychiatry, 48(8), 314-318.
  • Tombaugh, T. N., McIntyre, N. J. (1992). The mini-mental state examination: A comprehensive review. J Am Geriatr Soc, 40(9), 922-935.
  • Tombaugh, T. N., McDowell, I., Kristjansson, B., Hubley, A. M. (1996). Mini-Mental State Examination (MMSE) and the modified MMSE (3MS): A psychometric comparison and normative data. Psychol Assess, 8(1), 48-59.
  • Uhlmann, R. F., Larson, E. B., Buchner, D. M. (1987). Correlations of Mini-Mental State and modified Dementia Rating Scale to measures of transitional health status in dementia. J Gerontol, 42(1), 33-36.
  • Wechsler, D. (1981). Wechsler Adult Intelligence Scale-Revised: Test. New York: Harcourt Brace.
  • Wechsler, D. (1955). Manual for the Wechsler Adult Intelligence Scale. New York: The Psychological Corporation.
  • Wetherell, M., Darby, A., Emerson, K., & Miller, B. L. (1997). Mini-Mental State Examination performance in Alzheimer’s disease and frontotemporal dementia. International Journal of Rehabilitation and Health, 3, 253-265.
  • Wind, A. W., Schellevis, F. G., van Staveren, G., Scholten, R. J. P. M., Jonker, C., van Eijk, J. M. (1997). Limitations of the mini-mental state examination in diagnosing dementia in general practice. International Journal of Geriatric Psychiatry, 12(1), 101-108.
  • Winograd, C. H., Lemsky, C. M., Nevitt, M. C., Nordstrom, T. M., Stewart, A. L., Miller, C. J., Bloch, D. A. (1994). Development of a physical performance and mobility examination. J Am Geriatr Soc, 42(7), 743-749.
  • Yesavage, J. A., Brink, T. L., Rose, T. L., Lum, O., Huang, V., Adey, M. B., Leirer, V. O. (1983). Development and validation of a geriatric depression screening scale: A preliminary report. Journal of Psychiatric Research, 17, 37-49.
  • Zung, W. W. K. (1965). A self-rating depression scale. Arch Gen Psychiatry, 12, 63-70.

See the measure

How to obtain the MMSE

The MMSE can be obtained from the current copyright owner, Psychological Assessment Resources (PAR).

Table of contents

Montreal Cognitive Assessment (MoCA)

Evidence Reviewed as of before: 20-01-2011
Author(s)*: Lisa Zeltzer, MSc OT; Katie Marvin, MSc PT Candidate
Editor(s): Nicol Korner-Bitensky, PhD OT; Elissa Sitcoff, BA BSc

Purpose

The Montreal Cognitive Assessment (MoCA) was designed as a rapid screening instrument for the detection of mild cognitive impairment. It was developed in response to the poor sensitivity of the Mini-Mental State Examination (MMSE) in distinguishing clients with mild cognitive impairment from normal elderly clients (Nasreddine et al., 2005). Thus, the MoCA is intended for clients with memory complaints who score within the normal range on the MMSE.

The MoCA assesses the following cognitive domains: attention and concentration, executive functions, memory, language, visuoconstructional skills, conceptual thinking, calculations, and orientation. The measure can be used, but is not limited to patients with stroke.

In-Depth Review

Purpose of the measure

The Montreal Cognitive Assessment (MoCA) was designed as a rapid screening instrument for the detection of mild cognitive impairment. It was developed in response to the poor sensitivity of the Mini-Mental State Examination (MMSE) in distinguishing clients with mild cognitive impairment from normal elderly clients (Nasreddine et al., 2005). Thus, the MoCA is intended for clients with memory complaints who score within the normal range on the MMSE.

The MoCA assesses the following cognitive domains: attention and concentration, executive functions, memory, language, visuoconstructional skills, conceptual thinking, calculations, and orientation. The measure can be used, but is not limited to patients with stroke.

Available versions

The Montreal Cognitive Assessment was developed by Dr Nasreddine in 1996, then validated with the help of Chertkow, Phillips, Whitehead, Bergman, Collin, Cummings, and Hébert in 2004-2005.

Features of the measure

Items:

The items of the MoCA examine attention and concentration, executive functions, memory, language, visuoconstructional skills, conceptual thinking, calculations, and orientation. These items are described in detail below.

  1. Alternating Trail Making: The examiner instructs the client: “Please draw a line, going from a number to a letter in ascending order. Begin here [points to 1] and draw a line from 1 then to A then to 2 and so on. End here [points to E].”
  2. Visuoconstructional Skills – Cube: The examiner gives the following instructions, pointing to the cube: “Copy this drawing as accurately as you can, in the space below“.
  3. Visuoconstructional Skills – Clock: Indicate the right third of the test sheet, where a space is provided for the clock drawing item, and give the following instructions: “Draw a clock. Put in all the numbers and set the time to 10 after 11”.
  4. Naming: Beginning on the left, point to each figure and say: “Tell me the name of this animal”.
  5. Memory: The examiner reads a list of 5 words at a rate of one per second, giving the following instructions: “This is a memory test. I am going to read a list of words that you will have to remember now and later on. Listen carefully. When I am through, tell me as many words as you can remember. It doesn’t matter in what order you say them”. Place a checkmark in the allocated space on the test sheet for each word the client produces on the first trial. When the client indicates that he/she has finished (has recalled all the words), or can recall no more words, read the list a second time with the following instructions: “I am going to read the same list for a second time. Try to remember and tell me as many words as you can, including words you said the first time”. Put a checkmark in the allocated space for each word the client recalls after the second trial. At the end of the second trial, inform the client that he/she will be asked to recall these words again by saying: “I will ask you to recall those words again at the end of the test”.
  6. Attention:
    • Forward Digit Span: Give the following instruction: “I am going to say some numbers and when I am through, repeat them to me exactly as I said them“. Read the five number sequences at a rate of one digit per second.
    • Backward Digit Span: Give the following instruction: “Now I am going to say some more numbers, but when I am through you must repeat them to me in the backwards order“. Read the three number sequences at a rate of one digit per second.
    • Vigilance: The examiner reads the list of letters at a rate of one per second, after giving the following instruction: “I am going to read a sequence of letters. Every time I say the letter A, tap your hand once. If I say a different letter, do not tap your hand”.
    • Serial 7s: The examiner gives the following instruction: “Now I will ask you to count by subtracting seven from 100, and then, keep subtracting seven from your answer until I tell you to stop“. Give this instruction twice if necessary.
  7. Sentence Repetition: The examiner gives the following instructions: “I am going to read you a sentence. Repeat it after me, exactly as I say it [pause]. I only know that John is the one to help today.” Following the response say: “Now I am going to read you another sentence. Repeat it after me, exactly as I say it [pause]. The cat always hid under the couch when dogs were in the room“.
  8. Verbal Fluency: The examiner gives the following instruction: “Tell me as many words as you can think of that begin with a certain letter of the alphabet that I will tell you in a moment. You can say any kind of word you want, except for proper nouns (like Bob or Boston), numbers, or words that begin with the same sound but have a different suffix, for example, love, lover, loving. I will tell you to stop after one minute. Are you ready? [pause]. Now, tell me as many words as you can beginning with the letter F” [time 60 seconds]. “Stop.”
  9. Abstraction: The examiner asks the client to explain what each pair of words has in common, starting with the example: “Tell me how an orange and a banana are alike”. If the client answers in a concrete manner, say only one additional time: “Tell me another way in which those items are alike”. If the client still doesn’t give the appropriate response (fruit), say: “Yes, and they are also both fruit”. Do not give any additional instructions or clarification. After the practice trial say: “Now tell me how a train and a bicycle are alike”. Following the response, administer the second trial, saying: “Now, tell me how a ruler and a watch are alike”. Do not give any additional instructions or prompts.
  10. Delayed Recall: The examiner gives the following instruction: “I read some words to you earlier, which I asked you to remember. Tell me as many of those words as you can remember.” Make a checkmark on the test sheet, in the allocated space, for each of the words correctly recalled spontaneously without any cues.

    Optional: The client can be prompted with semantic category cues for any word that is not recalled. This is to elicit clinical information in order to provide the examiner with additional information regarding the type of memory disorder. For memory deficits due to retrieval failures, performance can be improved with a cue. For memory deficits due to encoding failures, performance does not improve with a cue. No points are awarded for words recalled from a cue.

    Make a checkmark in the allocated space if they remembered the word with the help of a category cue. If not, give them a multiple choice cue.

    Use the following category and/or multiple-choice cues for each word, when appropriate:

    • FACE: category cue: part of the body multiple choice: nose, face, hand
    • VELVET: category cue: type of fabric multiple choice: denim, cotton, velvet
    • CHURCH: category cue: type of building multiple choice: church, school, hospital
    • DAISY: category cue: type of flower multiple choice: rose, daisy, tulip
    • RED: category cue: a color multiple choice: red, blue, green
  11. Orientation: The examiner gives the following instructions: “Tell me the date today“. If the client does not give a complete answer, then prompt accordingly by saying: “Tell me the [year, month, exact date, and day of the week]“. Then say: “Now, tell me the name of this place, and which city it is in.”

Scoring:

Sum all subscores. Add one point for a client who has had 12 years or fewer of formal education, for a possible maximum of 30 points. A final total score of 26 and above is considered normal. A final total score below 26 is indicative of mild cognitive impairment.
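As a minimal sketch of the scoring rule just described (the subscore names and values below are hypothetical, chosen only for illustration):

```python
def moca_total(subscores: dict[str, int], years_of_education: int) -> tuple[int, str]:
    """Sum the subscores, add one point for 12 or fewer years of formal
    education (capped at the 30-point maximum), and classify against 26."""
    total = sum(subscores.values())
    if years_of_education <= 12:
        total = min(total + 1, 30)
    label = "normal" if total >= 26 else "indicative of mild cognitive impairment"
    return total, label

# Hypothetical subscores for one client with 10 years of schooling.
subscores = {"visuospatial/executive": 4, "naming": 3, "attention": 5,
             "language": 2, "abstraction": 2, "delayed recall": 3,
             "orientation": 6}
print(moca_total(subscores, years_of_education=10))  # (26, 'normal')
```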

Below is a breakdown of how each item of the MoCA is to be scored:

Item How to score
Alternate Trail Making
(1 point)
Give 1 point if the following pattern is drawn without drawing any lines that cross:
1-A-2-B-3-C-4-D-5-E. Any error that is not immediately self-corrected earns a score of 0.
Visuoconstructional skills Cube (1 point) Give 1 point for a correctly executed drawing. Drawing must be 3D; all lines drawn; no lines added; lines are relatively parallel and lengths are similar (rectangular prisms are accepted). A point is not assigned if any of the above-criteria are not met.
Visuoconstructional skills Clock (3 points) Contour (1 point): The clock face must be a circle with only minor distortion acceptable (e.g. slight imperfection in closing the circle).
Numbers (1 point): All clock numbers must be present with no additional numbers; numbers must be in the correct order and placed in the approximate quadrants on the clock face; Roman numerals are accepted; numbers can be placed outside the circle contour.
Hands (1 point): There must be 2 hands jointly indicating the correct time; the hour hand must be clearly shorter than the minute hand; hands must be centered within the clock face with their junction close to the clock centre.
A point is not assigned for a given element if any of the above-criteria are not met.
Naming (3 points) One point each is given for the following responses: (1) camel/dromedary, (2) lion, (3) rhinoceros/rhino.
Memory (0 points) No points are given for Trials 1 and 2.
Attention (6 points) Digit span (2 points): Give 1 point for each sequence correctly repeated (the correct response for the backwards trial is 2-4-7).
Vigilance (1 point): Give 1 point if there are 0-1 errors (an error includes a tap on a wrong letter, or a failure to tap on letter A).
Serial 7s (3 points): Give 0 points for no correct subtractions; 1 point for 1 correct subtraction; 2 points for 2-3 correct subtractions; and 3 points for 4-5 correct subtractions. Count each correct subtraction of 7 beginning at 100. Each subtraction is evaluated independently: if the client responds with an incorrect number but continues to correctly subtract 7 from it, give a point for each correct subtraction. For example, a client may respond “92-85-78-71-64”, where the “92” is incorrect but all subsequent numbers are subtracted correctly; this counts as 1 error and the item would be given a score of 3 (a worked sketch of this rule appears after the table).
Sentence Repetition
(2 points)
Give 1 point for each sentence correctly repeated. Repetition must be exact. Be alert for errors that are omissions (e.g., omitting “only”, “always”) and substitutions/additions.
Verbal fluency (1 point) Give 1 point if 11 words or more are generated in 60 seconds. Record responses in the margins.
Abstraction (2 points) Only the last 2 item pairs are scored. Give 1 point to each item pair correctly answered.
The following responses are acceptable:
Train-bicycle = means of transportation, means of traveling, you take trips in both
Ruler-watch = measuring instruments, used to measure
The following responses are not acceptable: Train-bicycle = they have wheels; Ruler-watch = they have numbers.
Delayed recall (5 points) Give 1 point for each word recalled freely without any cues.
Orientation (6 points) Give 1 point for each item correctly answered. The client must tell the exact date and place (name of hospital, clinic, office). No points are awarded if the client makes an error of 1 day for the day and date.
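As noted in the Serial 7s row above, here is a minimal sketch of that scoring rule, reproducing the worked example from the table (the function name and structure are our own):

```python
def score_serial_sevens(responses: list[int]) -> int:
    """Score the Serial 7s item: a response is correct if it is exactly 7
    less than the preceding number (starting from 100); each subtraction
    is judged independently of earlier errors."""
    correct = sum(1 for prev, cur in zip([100] + responses, responses)
                  if prev - cur == 7)
    if correct == 0:
        return 0
    if correct == 1:
        return 1
    return 2 if correct <= 3 else 3

# The worked example from the table: "92" is wrong (100 - 7 = 93), but the
# four subsequent subtractions are each correct, so the item scores 3.
print(score_serial_sevens([92, 85, 78, 71, 64]))  # 3
```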

Time:

The MoCA takes approximately 10-15 minutes to administer for clients with mild cognitive impairment.

Subscales:

Visuospatial/Executive; Naming; Memory; Attention; Language; Abstraction; Delayed recall; Orientation

Equipment:

Only the MoCA test sheet and a pencil are required to complete the measure.

Training:

The MoCA should be administered by a health professional. No formal training is required to administer the measure.

Alternative form of the MoCA

MoCA – version 2 & 3 (English)

Two alternative versions of the MoCA (English) have been validated for use in instances when repeated administration is necessary, to avoid possible learning effects.

MoCA – modified for individuals with visual impairments.

An alternative version of the MoCA has been validated for use with patients with visual impairments.

Please visit http://www.mocatest.org for further information and to download the alternative forms.

Client suitability

Can be used with:

  • Patients with stroke.
  • The MoCA is suitable for any individual who is experiencing memory difficulties but who scores within the normal range on the Mini-Mental State Examination.

Should not be used with:

  • Because the MoCA is heavily language dependent, it is likely to misclassify patients with aphasia.
  • The MoCA is not suitable for use with a proxy respondent as it is administered via direct observation of task completion.

In what languages is the measure available?

The MoCA has been translated into Arabic, Afrikaans, Chinese (Beijing, Cantonese, Changsha, Hong Kong, Taiwan), Czech, Croatian, Danish, Dutch, Estonian, French, Finnish, German, Greek, Hebrew, Italian, Japanese, Korean, Persian, Polish, Portuguese (Brazil), Russian, Serbian, Sinhalese, Spanish, Swedish, Thai, Turkish, Ukrainian and Vietnamese. These translations can be found at the following website: http://www.mocatest.org.

Summary

What does the tool measure? Mild cognitive impairment
What types of clients can the tool be used for? Can be used with but not limited to:
• Patients with stroke
• Any individual who is experiencing memory difficulties but scores within the normal range on the Mini Mental State Examination.
Is this a screening or assessment tool? Screening
Time to administer The MoCA takes approximately 10-15 minutes to administer for clients with mild cognitive impairment.
Versions MoCA (original); MoCA English (version 2); and MoCA English (version 3); MoCA (modified for individuals with visual impairments).
Other Languages The MoCA has been translated into Arabic, Afrikaans, Chinese (Beijing, Cantonese, Changsha, Hong Kong, Taiwan), Czech, Croatian, Danish, Dutch, Estonian, French, Finnish, German, Greek, Hebrew, Italian, Japanese, Korean, Persian, Polish, Portuguese (Brazil), Russian, Serbian, Sinhalese, Spanish, Swedish, Thai, Turkish, Ukrainian and Vietnamese.
Measurement Properties
Reliability Internal consistency:
Only one study has examined the internal consistency of the MoCA and reported excellent levels of internal consistency.

Test-retest:
Only one study has examined the test-retest reliability of the MoCA, and it reported excellent test-retest reliability.

Intra-rater:
No studies have examined the intra-rater reliability of the MoCA.

Inter-rater:
No studies have examined the inter-rater reliability of the MoCA.

Validity Criterion:
Concurrent:
Excellent correlations with the Mini Mental State Examination (MMSE) have been reported.

Construct:
Known groups:
One study reported that the MoCA can distinguish between patients with mild cognitive impairment and healthy controls.

Floor/Ceiling Effects No studies have examined the ceiling or floor effects of the MoCA.
Does the tool detect change in patients? Not Applicable.
Acceptability The MoCA is not suitable for individuals with aphasia or for use with a proxy respondent.
Feasibility The measure is simple to score and only the MoCA test sheet and a pencil are required to complete the measure.
How to obtain the tool? The MoCA is available at: http://www.mocatest.org.

Psychometric Properties

Overview

We conducted a literature search to identify all relevant publications on the psychometric properties of the MoCA. As the MoCA is a relatively new measure, to our knowledge, the creators have personally gathered the majority of psychometric data that are currently published on the scale.

Reliability

Internal consistency:
Nasreddine et al. (2005) examined the internal consistency of the MoCA and reported an excellent Cronbach’s alpha (alpha = 0.83) on the standardized items.

Test-retest:
Nasreddine et al. (2005) examined the test-retest reliability of the MoCA by administering the measure to a subsample of 26 clients (clients with mild cognitive impairment or Alzheimer’s disease, and healthy elderly controls) twice, on average 35 days apart. The correlation between the two evaluations was excellent (r = 0.92). The mean change in MoCA scores from the first to second evaluation was 0.9 points.

Validity

Criterion:

Concurrent:
Nasreddine et al. (2005) administered the MoCA and the Mini Mental State Examination to 94 patients with mild cognitive impairment, 93 patients with mild Alzheimer’s disease, and 90 healthy elderly controls. The correlation between the MoCA and the MMSE was excellent (r = 0.87).

Sensitivity and Specificity

Four studies examined whether the MoCA could detect patients known to have varying degrees of cognitive impairment and found the MoCA to be more sensitive than the Mini-Mental State Examination (MMSE) in detecting these differences.

Nasreddine et al. (2005) examined whether the MoCA could distinguish between patients with mild cognitive impairment and healthy controls. The DSM-IV and NINCDS-ADRDA criteria were used to establish the diagnosis of Alzheimer’s disease, and neurological assessments performed by neurologists and geriatricians were used to establish the diagnosis of cognitive impairment. At a cutoff score of 26, the MoCA had a sensitivity of 90% for identifying clients with mild cognitive impairment and 100% for identifying clients with Alzheimer’s disease, with a specificity of 87%. The MoCA was considerably more sensitive than the Mini-Mental State Examination (MMSE) in detecting mild cognitive impairment (the sensitivity of the MMSE was poor: 18% for patients with mild cognitive impairment; 78% for patients with Alzheimer’s disease).

Smith, Gildeh and Holmes (2007) evaluated whether the MoCA could detect mild cognitive impairment and dementia in patients attending a memory clinic. Dementia and mild cognitive impairment were diagnosed by neuropsychological assessment involving the ICD-10 criteria and CAMCOG scores. At a cutoff score of 26, the MoCA was found to have excellent sensitivity for detecting mild cognitive impairment (83%) and dementia (94%), but poor specificity (50% for both mild cognitive impairment and dementia). The specificity was lower than that identified in the earlier study by Nasreddine et al. (2005), likely due to the heterogeneous nature of the control group. The MoCA was also found to be more sensitive than the MMSE (the sensitivity of the MMSE was poor: 17% for patients with mild cognitive impairment and 25% for patients with dementia).

Luis, Keegan and Mullan (2009) examined whether the MoCA could distinguish between healthy controls and patients with Alzheimer’s disease or mild cognitive impairment. A diagnosis of Alzheimer’s disease was made by neuropsychological assessment using NINCDS-ADRDA criteria, and mild cognitive impairment (MCI) by Petersen’s criteria (Petersen et al., 1999, as cited in Luis, Keegan & Mullan, 2009). At a cutoff score of 26, the MoCA was found to have excellent sensitivity for detecting MCI (100%) and Alzheimer’s disease and MCI combined (97%), but poor specificity (35% for both the MCI and Alzheimer’s disease+MCI groups). A cutoff score of 23 was found to be optimal for identifying MCI, providing excellent sensitivity and specificity (96% and 95% respectively). The MoCA was found to be more sensitive than the MMSE (at a cut-off score of ≤ 24, MMSE sensitivity for detecting MCI and Alzheimer’s disease+MCI was 17% and 36% respectively).

Dong et al. (2010) evaluated the sensitivity and specificity of an alternative language version of the MoCA for detecting vascular cognitive impairment and dementia after stroke. Patients underwent neuro-imaging and neuropsychological assessment in order to establish a diagnosis of cognitive impairment or dementia using the DSM-IV criteria. Using an optimum cutoff score of 21, the MoCA correctly identified 90% of patients with cognitive impairment (excellent sensitivity) and 77% of those without cognitive impairment (adequate specificity). The MoCA was also found to be more sensitive than the MMSE (MMSE sensitivity of 86% and specificity of 82% for detecting cognitive impairment).

In a population-based study of 413 patients with stroke or TIA, the MoCA was found to detect more cognitive deficits than the MMSE. For the purposes of the study, a score of ≥ 27 on the MMSE was used to classify patients as having normal cognitive function, and a score of < 26 on the MoCA to classify mild cognitive impairment. Fifty-eight percent of patients with normal MMSE scores (≥ 27) were found to have scores indicative of mild cognitive impairment when the MoCA was used for screening (< 26). Several of the deficits detected by the MoCA were in domains either not assessed or not detected by the MMSE, including executive function and attention (not assessed) and recall and repetition (not detected) (Pendlebury, Cuthbertson, Welch, Mehta & Rothwell, 2010). Sensitivity and specificity of the MoCA for cognitive impairment could not be established in this study because no formal neuropsychological testing was performed to confirm diagnosis.

Responsiveness

Koski, Xie and Finch (2009) evaluated the MoCA as a quantitative measure of cognitive ability and examined its responsiveness. By applying Rasch analysis techniques to existing data from a geriatric outpatient clinic, the researchers found that, in addition to its usefulness as a screening instrument, MoCA scores can be used to quantify a person’s cognitive ability and to track changes in cognitive ability over time. The significance of scores and of changes in scores can be interpreted relative to the respondent’s baseline score; for example, a 5-point decrease from a baseline score of 25 is a more statistically significant and meaningful change than a 5-point decrease from a baseline score of 15 (please refer to Table 4 in Koski et al., 2009 for the statistical significance of changes in MoCA scores). Further research to determine the minimal clinically important difference is required.

References

  • Dong, Y.H., Sharma, V.K., Chan, B.P.L., Venketasubramanian, N., Teoh, H.L., Seet, R.C.S., Tanicala, S., Chan, Y.H. & Chen, C. (2010). The Montreal Cognitive Assessment (MoCA) is superior to the Mini-Mental State Examination (MMSE) for the detection of vascular cognitive impairment after acute stroke. Journal of the Neurological Sciences. doi:10.1016/j.jns.2010.08.051
  • Koski, L., Xie, H. & Finch, L. (2009). Measuring cognition in a geriatric outpatient clinic: Rasch analysis of the Montreal Cognitive Assessment. Journal of Geriatric Psychiatry and Neurology, 22, 151-160.
  • Luis, C.A, Keegan, A.P. & Mullan, M. (2009). Cross validation of the Montreal Cognitive Assessment in community dwelling older adults residing in the Southeastern US. International Journal of Geriatric Psychiatry, 24, 197-201.
  • Nasreddine, Z. S., Phillips, N. A., Bédirian, V., Charbonneau, S., Whitehead, V., Collin, I., Cummings, J. L., & Chertkow, H. (2005). The Montreal Cognitive Assessment, MoCA: A brief screening tool for mild cognitive impairment. Journal of the American Geriatrics Society, 53(4), 695-699.
  • Nasreddine, Z. S., Chertkow, H., Phillips, N., Whitehead, V., Collin, I., Cummings, J. L. The Montreal Cognitive Assessment (MoCA): A brief cognitive screening tool for detection of mild cognitive impairment. Neurology, 62(7): S5, A132. Presented at the American Academy of Neurology Meeting, San Francisco, May 2004.
  • Nasreddine, Z. S., Chertkow, H., Phillips, N., Whitehead, V., Bergman, H., Collin, I., Cummings, J. L., Hébert, L. The Montreal Cognitive Assessment (MoCA): a Brief Cognitive Screening Tool for Detection of Mild Cognitive Impairment. Presented at the 8th International Montreal/Springfield Symposium on Advances in Alzheimer Therapy. http://www.siumed.edu/cme/AlzBrochure04.pdf p. 90, April 14-17, 2004.
  • Nasreddine, Z. S., Collin, I., Chertkow, H., Phillips, N., Bergman, H., Whitehead, V. Sensitivity and Specificity of The Montreal Cognitive Assessment (MoCA) for Detection of Mild Cognitive Deficits. Can J Neurol Sci, 30 (2), S2, 30. Presented at Canadian Congress of Neurological Sciences Meeting, Québec City, Québec, June 2003.
  • Pendlebury, S.T., Cuthbertson, F.C., Welch, S.J.V., Mehta, Z. & Rothwell, P.M. (2010). Underestimation of cognitive impairment by Mini-Mental State Examination versus the Montreal Cognitive Assessment in patients with transient ischemic attack and stroke. Stroke, 41, 1290-1293.
  • Smith, T., Gildeh, N. & Holmes, C. (2007). The Montreal Cognitive Assessment: Validity and utility in a memory clinic setting. The Canadian Journal of Psychiatry, 52, 329-332.
  • Wittich, W., Phillips, N., Nasreddine, Z.S. & Chertkow, H. (2010). Sensitivity and specificity of the Montreal Cognitive Assessment modified for individuals who are visually impaired. Journal of Visual Impairment & Blindness, 104(6), 360-368.

See the measure

How to obtain the MoCA?

The MoCA is available at: http://www.mocatest.org.

Multiple Errands Test (MET)

Evidence Reviewed as of before: 08-05-2013
Author(s)*: Valérie Poulin, OT, PhD candidate; Annabel McDermott, OT
Editor(s): Nicol Korner-Bitensky, PhD OT
Expert Reviewer: Deirdre Dawson, PhD OT

Purpose

The Multiple Errands Test (MET) evaluates the effect of executive function deficits on everyday functioning through a number of real-world tasks (e.g. purchasing specific items, collecting and writing down specific information, arriving at a stated location). Tasks are performed in a hospital or community setting within the constraints of specified rules. The participant is observed performing the test and the number and type of errors (e.g. rule breaks, omissions) are recorded.

In-Depth Review

Purpose of the measure

The Multiple Errands Test (MET) evaluates the effect of executive function deficits on everyday functioning through a number of real-world tasks (e.g. purchasing specific items, collecting and writing down specific information, arriving at a stated location). Tasks are performed in a hospital or community setting within the constraints of specified rules. The participant is observed performing the test and the number and type of errors (e.g. rule breaks, omissions) are recorded.

The Multiple Errands Test was developed by Shallice and Burgess in 1991. The measure was intended to evaluate a patient’s ability to organize performance of a number of simple unstructured tasks while following several simple rules.

See the Alternative versions section below for information regarding other versions.

Features of the measure

Items:

The original Multiple Errands Test (Shallice and Burgess, 1991) consisted of 8 items: 6 simple tasks (e.g. buy a brown loaf of bread, buy a packet of throat pastilles), 1 task that is time-dependent, and 1 that comprises 4 subtasks (see Description of tasks, below). It should be noted that the MET was originally devised in an experimental context, rather than as a formal assessment.

Description of tasks:

The original Multiple Errands Test (Shallice and Burgess, 1991) consisted of 8 written tasks to be completed in a pedestrian shopping precinct. Tasks and rules are written on a card provided to the participant before arriving at the shopping precinct. Of the 8 tasks, 6 are simple (e.g. buy a brown loaf of bread, buy a packet of throat pastilles), the 7th requires the participant to be at a particular place 15 minutes after starting the test, and the 8th is more demanding as it comprises 4 sets of information that the participant must obtain and write on a postcard:

  1. the name of the shop most likely to have the most expensive item;
  2. the price of a pound of tomatoes;
  3. the name of the coldest place in Britain yesterday; and
  4. the exchange rate of the French franc yesterday.

The card also includes instructions and rules, which are repeated to the participant on arrival at the shopping precinct:

“You are to spend as little money as possible (within reason) and take as little time as possible (without rushing excessively). No shop should be entered other than to buy something. Please tell one or other of us when you leave a shop what you have bought. You are not to use anything not bought on the street (other than a watch) to assist you. You may do the tasks in any order.”

Scoring:

The participant is observed performing the test and errors are recorded according to the following categorizations:

  • Inefficiencies: where a more effective strategy could have been applied
  • Rule breaks: where a specific rule (either social or explicitly mentioned in the task) is broken
  • Interpretation failure: where requirements of a particular task are misunderstood
  • Task failure: where a task is either not carried out or not completed satisfactorily.

Time taken to complete the assessment is recorded and the total number of errors is calculated.
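
As an illustration of this scoring scheme, the sketch below tallies observed errors under the four categories above and reports the total; the category names follow the original MET, while the data structure and example observations are hypothetical:

```python
from collections import Counter

# Error categories from the original MET (Shallice & Burgess, 1991).
CATEGORIES = ("inefficiency", "rule break", "interpretation failure", "task failure")

def score_met(observed_errors: list[str], minutes_taken: float) -> dict:
    """Tally errors by category and compute the total error score."""
    tally = Counter(observed_errors)
    return {
        "errors by category": {c: tally.get(c, 0) for c in CATEGORIES},
        "total errors": sum(tally.values()),
        "time (minutes)": minutes_taken,
    }

# Hypothetical session: one shop entered without a purchase, one task not done
print(score_met(["rule break", "task failure"], minutes_taken=42.0))
```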

Alternative versions of the Multiple Errands Test

Different versions of the MET were developed for use in specific hospitals (MET – Hospital Version and Baycrest MET), a small shopping plaza (MET – Simplified Version), and a virtual reality environment (Virtual MET). For each of these versions, 12 tasks must be performed (e.g. purchasing specific items and collecting specific information) while following several rules.

MET – Hospital Version (MET-HV – Knight, Alderman & Burgess, 2002)

The MET-HV was developed for use with a wider range of participants than the original version by adopting more concrete rules and simpler tasks. Clients are provided with an instruction sheet that explicitly directs them to record designated information. Clients must complete four sets of simple tasks, with a total of 12 separate subtasks:

  1. The client must complete six specific errands (purchase 3 items, use the internal phone, collect an envelope from reception, and send a letter to an external address).
  2. The client must obtain and write down four items of designated information (e.g. the opening time of a shop on Saturday).
  3. The client must meet the assessor outside the hospital reception 20 minutes after the test has begun and state the time.
  4. The client must inform the assessor when he/she finishes the test.

The MET-HV uses 9 rules in order to reduce ambiguity and simplify task demands (Knight et al., 2002). Errors are categorized according to the same definitions as the original MET. The test is preceded by (a) an efficiency question rated using an end-point weighted 10-point Likert scale (“How efficient would you say you were with tasks like shopping, finding out information, and meeting people on time?”); and (b) a familiarity question rated using a 4-point scale (“How well would you say you know the hospital grounds?”). On completion the client answers a question rated using a 10-point scale (“How well do you think you did with the task?”).

MET – Simplified Version (MET-SV – Alderman, Burgess, Knight & Henman, 2003)

The MET-SV includes four sets of simple tasks analogous to those in the original MET; however, the MET-SV incorporates 3 main modifications to the original version:

  1. More concrete rules to enhance task clarity and reduce likelihood of interpretation failures;
  2. Simplification of task demands; and
  3. Space provided on the instruction sheet for the participant to record the information they were required to collect.

The MET-SV has 9 rules that are more explicit than the original MET and are clearly presented on the instruction sheet.

Baycrest MET (BMET – Dawson, Anderson, Burgess, Cooper, Krpan & Stuss, 2009)

The BMET was developed with an identical structure to the MET-HV, except that some items, information and a meeting place are specific to the testing environment (Baycrest Center, Toronto). The BMET comprises 12 items and 8 rules. The test manual provides explicit instructions including collecting test materials, language to be used in describing the test, and a pretest section to ensure participants understand the tasks. Scoring was standardized to allow for increased usability. The score sheet allows identification of specific task errors or omissions, other inefficiencies, rule breaks and strategy use (please contact the authors for further details regarding the manual: ddawson@research.baycrest.org).

Virtual MET (VMET – Rand, Rukan, Weiss & Katz, 2009)

The VMET was developed within the Virtual Mall, a functional video-capture virtual shopping environment that consists of a large supermarket with 9 aisles. The system includes a single camera that films the user and displays his/her image within the virtual environment. The VMET is a complex shopping task that includes the same number of tasks (items to be bought and information to be obtained) as the MET-HV. However, the client is required to check the contents of the shopping cart at a particular time instead of meeting the tester at a certain time. Virtual reality enables the assessor to objectively measure the client’s behaviour in a safe, controlled and ecologically valid environment. It enables repeated learning trials and adaptability of the environment and task according to the client’s needs.

What to consider before beginning:

The MET is performed in a real-world shopping area that allows for minor unpredicted events to occur.

Time:

The BMET takes approximately 60 minutes to administer (Dawson et al., 2009).

Training requirements:

It is advised that the assessor read the test manual and become familiar with the procedures for test administration and scoring.

Equipment:

  • Access to a shopping precinct or virtual shopping environment
  • Pen and paper
  • Instruction sheet (according to version being used)

Client suitability

Can be used with:

  • The MET has been tested on populations with acquired brain injury including stroke.

Should not be used with:

  • The MET cannot be administered to patients who are confined to bed.
  • Participants require sufficient language skills.
  • Some tasks may need to be adapted depending on the rehabilitation setting.

In what languages is the measure available?

The MET was developed in English.

Summary

What does the tool measure? The effect of executive function deficits on everyday functioning.
What types of clients can the tool be used for? The Multiple Errands Test can be used with, but is not limited to, clients with stroke.
Is this a screening or assessment tool? Assessment
Time to administer Baycrest MET: approximately 60 minutes (Dawson et al., 2009).
Versions

  • Multiple Errands Test (MET) (Shallice and Burgess, 1991)
  • MET – Simplified Version (MET-SV) (Alderman et al., 2003)
  • MET – Hospital Version (MET-HV) (Knight, Alderman & Burgess, 2002)
  • Virtual MET (Rand, Rukan, Weiss & Katz, 2009)
  • Baycrest MET (Dawson et al., 2009)
  • Modified version of the MET-SV and MET-HV (including 3 alternate versions) (Novakovic-Agopian et al., 2011, 2012)
Other Languages N/A
Measurement Properties
Reliability Internal consistency:
One study reported adequate internal consistency of the MET-HV in a sample of patients with chronic acquired brain injury including stroke.

Test-retest:
No studies have reported on the test-retest reliability of the MET with a population of patients with stroke.

Intra-rater:
No studies have reported on the intra-rater reliability of the MET with a population of patients with stroke.

Inter-rater:
– One study reported excellent inter-rater reliability of the MET-HV in a sample of patients with chronic acquired brain injury including stroke.
– One study reported adequate to excellent inter-rater reliability of the BMET in a sample of patients with acquired brain injury including stroke.

Validity Criterion:
Concurrent:
No studies have reported on the concurrent validity of the MET in a stroke population.

Predictive:
One study examined predictive validity of the MET-HV with a sample of patients with acquired brain injury including stroke and reported poor to adequate correlations between discharge MET-HV performance and community participation measured by the Mayo-Portland Adaptability Inventory (MPAI-4).

Construct:
Convergent/Discriminant:
– Three studies* examined convergent validity of the MET-HV and reported excellent correlations with the Modified Wisconsin Card Sorting Test (MWCST), Behavioural Assessment of Dysexecutive Syndrome battery (BADS), Dysexecutive questionnaire (DEX), IADL questionnaire and FIM Cognitive score; and an adequate correlation with the Rivermead Behavioural Memory Test (RBMT).
– One study* examined convergent validity of the MET-SV and reported adequate correlations with the Wechsler Adult Intelligence Scale – Revised Full Scale IQ (WAIS-R FSIQ), MWCST, BADS and Cognitive Estimates test; and poor to adequate correlations with the DEX.
– One study* examined convergent validity of the BMET and reported adequate to excellent correlations with the Sickness Impact Profile and Assessment of Motor and Process Skills.
– Three studies* examined convergent validity of the VMET and reported excellent correlations with the MET-HV, BADS, IADL questionnaire, Semantic Fluencies test, Tower of London test, Trail Making Test, Corsi’s supra-span test, Street’s Completion Test and the Test of Attentional Performance.
*Note: Correlations between the MET and other measures of everyday executive functioning and IADLs used in these studies also provide support for the ecological validity of the MET.

Known Groups:
– Two studies reported that the MET-HV is able to differentiate between individuals with acquired brain injury (including stroke) vs. healthy adults, and between healthy older adults vs. healthy younger adults.
– One study reported that the MET-SV is able to differentiate between clients with brain injury including stroke vs. healthy adults.
– One study reported that the BMET is able to differentiate between clients with stroke vs. healthy adults.
– Three studies reported that the VMET is able to differentiate between clients with stroke vs. healthy adults, and between healthy older adults vs. healthy younger adults.

Sensitivity/Specificity:
– One study reported 85% sensitivity and 95% specificity when using a cut-off score ≥ 7 errors on the MET-HV with clients with chronic acquired brain injury including stroke.
– One study reported 82% sensitivity and 95.3% specificity when using a cut-off score ≥ 12 errors on the MET-SV with clients with brain injury including stroke.

Floor/Ceiling Effects No studies have reported on the floor/ceiling effects of the MET.
Does the tool detect change in patients? Responsiveness of the MET has not been formally evaluated, however:
– One study used a modified version of the MET-HV and MET-SV to measure change following intervention;
– One study used the MET-HV and the VMET to detect change in multi-tasking skills of clients with stroke following intervention.
Acceptability The MET provides functional assessment of executive function as it enables clients to participate in real-world activities.
Feasibility Administration of the MET requires access to a shopping area and so is not always feasible in a typical clinical setting. Some tasks may need to be adapted depending on the rehabilitation setting. Administration time can be lengthy. Ecological validity is supported.
How to obtain the tool? The Baycrest MET can be obtained at https://cognitionandeverydaylifelabs.com/multiple-errands-test/

Psychometric Properties

Overview

A literature search was conducted to identify publications on the psychometric properties of the Multiple Errands Test (MET) relevant to a population of patients with stroke. Of the 10 studies reviewed, 8 included a mixed population of patients with acquired brain injury including stroke. Studies have reviewed psychometric properties of the original MET, Hospital Version (MET-HV), Simplified Version (MET-SV), Baycrest MET (BMET) and Virtual MET (VMET), as indicated in the summaries below. While research indicates that the MET demonstrates adequate validity and reliability in populations with acquired brain injury including stroke, further research regarding responsiveness of the measure is warranted.

Floor/Ceiling Effects

No studies have reported on floor/ceiling effects of the MET with a stroke population.

Reliability

Internal consistency:
Knight, Alderman & Burgess (2002) calculated internal consistency of the MET-HV in a sample of 20 patients with chronic acquired brain injury (traumatic brain injury, n=12; stroke, n=5; both TBI and stroke, n=3) and 20 healthy control subjects matched for gender, age and IQ, using Cronbach’s alpha. Internal consistency was adequate (α=0.77).

Test-retest:
No studies have reported on the test-retest reliability of the MET.

Intra-rater:
No studies have reported on the intra-rater reliability of the MET.

Inter-rater:
Knight, Alderman & Burgess (2002) calculated inter-rater reliability of the MET-HV error categories in a sample of 20 patients with chronic acquired brain injury (traumatic brain injury, n=12; stroke, n=5; both TBI and stroke, n=3) and 20 healthy control subjects matched for gender, age and IQ, using intraclass correlation coefficients. Participants were scored by 2 assessors. Inter-rater reliability was excellent (ICCs ranging from 0.81 to 1.00). The ‘rule breaks’ error category demonstrated the strongest inter-rater reliability (ICC=1.00).

Dawson, Anderson, Burgess, Cooper, Krpan and Stuss (2009) examined inter-rater reliability of the BMET with clients with stroke (n=14) or traumatic brain injury (n=13) and healthy matched controls (n=25), using Intraclass Correlation Coefficients and 2-way random effects models. Participants were scored by two assessors. Inter-rater reliability was adequate to excellent for the five summary measures used: mean number of tasks completed accurately (ICC = 0.80), mean number of rules adhered to (ICC = 0.71), mean number of total errors (ICC = 0.82), mean number of total rules broken (ICC = 0.88) and mean number of requests for help (ICC = 0.71).
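
For readers who wish to reproduce this kind of analysis, the sketch below computes intraclass correlations from two raters' scores using the pingouin library (our choice of library, not necessarily the one used by the study authors; the scores are hypothetical):

```python
import pandas as pd
import pingouin as pg

# Hypothetical data: two assessors score the same six participants
scores = pd.DataFrame({
    "participant": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6],
    "rater":       ["A", "B"] * 6,
    "errors":      [7, 8, 3, 3, 10, 9, 5, 6, 2, 2, 8, 7],  # e.g. total errors
})
icc = pg.intraclass_corr(data=scores, targets="participant",
                         raters="rater", ratings="errors")
# The ICC2 row corresponds to a single-rater, two-way random-effects model
print(icc[["Type", "ICC"]])
```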

Validity

Content:

Shallice & Burgess (1991) evaluated the MET in a sample of 3 patients with traumatic brain injury (TBI) who demonstrated above-average performance on measures of general ability and normal or near-normal performance on frontal lobe tests, and 9 age- and IQ-matched controls. Participants were monitored by two observers and were scored according to number of errors (inefficiencies, rule breaks, interpretation failures, task failures and total score) and qualitative observation. The patients demonstrated qualitatively and quantitatively impaired performance, particularly relating to rule breaks and inefficiencies. The most difficult subtest was the least sensitive part of the procedure and presented difficulties for both patients and control subjects.

Criterion:

Concurrent:
No studies have reported on the concurrent validity of the MET in a stroke population.

Predictive:
Maier, Krauss & Katz (2011) examined predictive validity of the MET-HV in relation to community participation with a sample of 30 patients with acquired brain injury including stroke (n=19). Community participation was measured using the Mayo-Portland Adaptability Inventory (MPAI-4) Participation Index (M2PI), completed by the participant and a significant other. The MET-HV was administered 1 week prior to discharge from rehabilitation and the M2PI was administered at 3 months post-discharge. Analyses were performed using Pearson correlation analysis and partial correlation controlling for cognitive status using FIM Cognitive scores. Predictably, higher MET-HV error scores correlated with more restrictions in community participation. There were adequate correlations between participants’ and significant others’ M2PI total scores and the MET-HV total error score (r = 0.403, 0.510 respectively), inefficiencies (r = 0.353, 0.524 respectively) and rule breaks (r = 0.361, 0.449 respectively). The ability of the MET-HV total error score to predict the significant others’ M2PI score remained significant but poor after controlling for cognitive status (r = 0.212).

Construct:

Convergent/Discriminant:
Knight, Alderman & Burgess (2002)* examined convergent validity of the MET-HV by comparison with tests of IQ and cognitive functioning, traditional frontal lobe tests and ecologically sensitive executive function tests, in a sample of 20 patients with chronic acquired brain injury (traumatic brain injury, n=12; stroke, n=5; both TBI and stroke, n=3). Tests of IQ and cognitive functioning included the National Adult Reading Test – Revised Full Scale Intelligence Quotient (NART-R FSIQ), Wechsler Adult Intelligence Scale – Revised Full Scale Intelligence Quotient (WAIS-R FSIQ), Adult Memory and Information Processing Battery (AMIPB), Rivermead Behavioural Memory Test (RBMT) and Visual Objects and Space Perception battery (VOSP). Frontal lobe tests included verbal fluency, the Cognitive Estimates Test (CET), Modified Card Sorting Test (MCST), Tower of London Test (TOLT) and versions of the hand manipulation and hand alternation tests. Ecologically sensitive executive function tests included the Behavioural Assessment of the Dysexecutive Syndrome battery (BADS) and the Test of Everyday Attention (TEA) Map Search and Visual Elevator tasks. The Dysexecutive (DEX) questionnaire was also used, although proxy reports were used rather than self-reports due to the identified lack of insight of individuals with brain injury. There were excellent correlations between the MCST percentage perseverative errors and MET-HV rule breaks (r=0.66) and MET-HV total errors (r=0.67) following Bonferroni adjustment. There were excellent correlations between the BADS Profile score and the MET-HV task failures (r = -0.58), interpretation failures (r = 0.64) and total errors (r = -0.57). There was an excellent correlation between the DEX intentionality factor and MET-HV task failures (r = 0.70). In addition, the relationship between the MET-HV and DEX was re-evaluated to control for possible confounding effects; controlling for age, familiarity and memory function with respect to MET-HV task failures resulted in excellent correlations with the DEX total score (r = 0.79) and DEX inhibition (r = 0.69), intentionality (r = 0.76) and executive memory (r = 0.67) factors. There was an adequate correlation between the RBMT Profile Score and the MET-HV number of task failures (r=-0.57). There were no significant correlations between the MET-HV and the other tests of IQ and cognitive functioning (NART-R FSIQ, WAIS-R FSIQ, AMIPB, VOSP), the other frontal lobe tests (verbal fluency, CET, TOLT, hand manipulation and hand alternation tests), the other ecologically sensitive executive function tests (TEA Map Search and Visual Elevator tasks) or the other DEX factors (positive affect, negative affect).
Note: Initial correlations were measured using Pearson correlation coefficient and significance levels were subsequently adjusted by Bonferroni adjustment to account for multiple comparisons; results reported indicate significant correlations following Bonferroni adjustment.

Rand, Rukan, Weiss & Katz (2009a)* examined convergent validity of the MET-HV by comparison with measures of executive function and IADLs in a sample of 9 patients with subacute or chronic stroke, using Spearman correlation coefficients. Executive function was measured using the BADS Zoo Map test and IADLs were measured using the IADL questionnaire. There were excellent negative correlations between the BADS Zoo Map and the MET-HV outcome measures of total number of mistakes (r = -0.93), partial mistakes in completing tasks (r = -0.80), non-efficiency mistakes (r = -0.86) and time to complete the MET (r = -0.79). There were excellent correlations between the IADL questionnaire and the MET-HV number of rule-break mistakes (r = 0.80) and total number of mistakes (r = -0.76).

Maier, Krauss & Katz (2011)* examined convergent validity of the MET-HV by comparison with the FIM Cognitive score with a sample of 30 patients with acquired brain injury including stroke (n=19), using Pearson correlation analysis. There was an excellent negative correlation between MET-HV total errors score and FIM Cognitive score (r = -0.67).

Alderman, Burgess, Knight and Henman (2003)* examined convergent validity of the MET-SV by comparison with tests of IQ, executive function and everyday executive abilities with 50 clients with brain injury including stroke (n=9). Neuropsychological tests included the WAIS-R FSIQ, BADS, Cognitive Estimates Test, FAS verbal fluency test, a modified version of the Wisconsin Card Sorting Test (MWCST) and the DEX. There were adequate correlations between MET-SV task failure errors and the WAIS-R FSIQ (r = -0.32), MWCST perseverative errors (r = 0.39), BADS profile score (r = -0.46), and the Zoo Map (r = -0.46) and Six Element Test (r = -0.41) subtests. There were adequate negative correlations between MET-SV social rule breaks and the Cognitive Estimates Test (r = -0.33), and between MET-SV task rule breaks, social rule breaks and total rule breaks and the BADS Action Program subtest (r = -0.42, -0.40, -0.43 respectively). There were poor to adequate negative correlations between the DEX and MET-SV rule breaks (r = -0.30), task failures (r = -0.25) and total errors (r = -0.37).

In a subgroup analysis of individuals with brain injury who passed traditional executive function tests but failed the MET-SV (n=17), there were adequate to excellent correlations between MET-SV inefficiencies and DEX factors of intentionality and negative affect (r = 0.59, -0.76); MET-SV interpretation failures and DEX inhibition and total (r = -0.67, -0.57); MET-SV total and actual rule breaks and DEX inhibition (r = -0.70, 0.66), intentionality (r = 0.60, 0.64) and total (r = -0.57, 0.59); MET-SV social rule breaks and DEX positive and negative affect (r = 0.79, -0.59); MET-SV task failures and DEX inhibition and positive affect (r = -0.58, -0.52), and MET-SV total errors and DEX intentionality (r = 0.67).

Dawson et al. (2009)* examined convergent validity of the BMET by comparison with other measures of IADL and everyday function with 14 clients with stroke, using Pearson correlation. Other measures included the DEX (significant other report), Sickness Impact Profile (SIP), Assessment of Motor and Process Skills (AMPS) and Mayo-Portland Adaptability Inventory (MPAI) (significant other report). There were excellent correlations between the BMET number of rules broken and the SIP Physical (r = 0.78) and Affective behavior (r = 0.64) scores and the AMPS motor score (r = -0.75). There was an adequate correlation between the BMET time to completion and the SIP Physical score (r = 0.54).

Rand et al. (2009a)* examined convergent validity of the VMET by comparison with the BADS Zoo Map test and IADL questionnaire with the same sample of 9 patients with subacute or chronic stroke, using Spearman correlation coefficients. There was an excellent negative correlation between the BADS Zoo Map and VMET outcome measure of non-efficiency mistakes (r = -0.87), and between the IADL and VMET total number of mistakes (r = -0.82).

Rand et al. (2009a) also examined the relationships between the scores of the VMET and those of the MET-HV using Spearman and Pearson correlation coefficients. Among patients with stroke, there were excellent correlations between MET-HV and VMET outcomes for the total number of mistakes (r = 0.70), partial mistakes in completing tasks (r = 0.88) and non-efficiency mistakes (r = 0.73). Analysis of the whole population indicated adequate to excellent correlations between MET-HV and VMET outcomes for the total number of mistakes (r = 0.77), complete mistakes of completing a task (r = 0.63), partial mistakes in completing tasks (r = 0.80), non-efficiency mistakes (r = 0.72) and use of strategies (r = 0.44), but not for rule break mistakes.

Raspelli et al. (2010) examined convergent validity of the VMET by comparison with neuropsychological tests, with 6 clients with stroke and 14 healthy subjects. VMET outcome measures included time, searched item in the correct area, sustained attention, maintained sequence and no perseveration. Neuropsychological tests included the Trail Making Test, Corsi spatial memory supra-span test, Street’s Completion Test, Semantic Fluencies and Tower of London test. There were excellent correlations between the VMET variable ‘time’ and the Semantic Fluencies test (r = -0.87) and the Tower of London test (r = -0.82); between the VMET variable ‘searched item in the correct area’ and the Trail Making Test (r = 0.96); and between the VMET variables ‘sustained attention’, ‘maintained sequence’ and ‘no perseveration’ and Corsi’s supra-span test (r = 0.84) and Street’s Completion Test (r = -0.86).

Raspelli et al. (2012) examined convergent validity of the VMET by comparison with the Test of Attentional Performance (TAP) with 9 clients with stroke. VMET outcome measures included time, errors, inefficiencies, rule breaks, strategies, interpretation failures and partial-task failures. The authors reported excellent correlations between the VMET outcomes of time, inefficiencies and total errors and the TAP tests (range r = -0.67 to 0.81).
Note: Other neuropsychological tests were administered but correlations are not reported (Mini Mental Status Examination (MMSE), Beck Depression Inventory (BDI), State and Trait Anxiety Index (STAI), Behavioural Inattention Test (BIT) – Star Cancellation Test, Brief Neuropsychological Examination (ENB) – Token Test, Street’s Completion Test, Stroop Colour-Word Test, Iowa Gambling Task, DEX and ADL/IADL Tests).
*Note: The correlations between the MET and other measures of everyday executive functioning and IADLs also provide support for the ecological validity of the MET (as reported by the authors of these articles).

Known Group:
Knight, Alderman & Burgess (2002) examined known-group validity of the MET-HV in a sample of 20 patients with chronic acquired brain injury (traumatic brain injury, n=12; stroke, n=5; both TBI and stroke, n=3) and 20 healthy control subjects (hospital staff members) matched for gender, age and IQ*. Clients with brain injury made significantly more rule breaks (p=0.002) and total errors (p<0.001), and achieved significantly fewer tasks (p<0.001) than control subjects. Clients with brain injury used significantly more strategies, such as looking at a map (p=0.008) and reading signs (p=0.006), although use of strategies had little effect on test performance. The test was able to discriminate between individuals with acquired brain injury and healthy controls.
*Note: IQ was measured using the National Adult Reading Test – Revised Full Scale Intelligence Quotient (NART-R FSIQ).

Rand et al. (2009a) examined known group validity of the MET-HV with 9 patients with subacute or chronic stroke, 20 healthy young adults and 20 healthy older adults, using the Kruskal-Wallis H test. Patients with stroke made more mistakes than older adults on MET-HV outcomes of total mistakes, mistakes in completing tasks, partial mistakes in completing tasks and non-efficiency mistakes, but not rule break mistakes or use of strategies mistakes. Older adults made more mistakes than younger adults on MET-HV outcomes of total mistakes, partial mistakes in completing tasks and non-efficiency mistakes, but not mistakes in completing tasks, rule break mistakes or use of strategies mistakes.

Alderman et al. (2003) examined known group validity of the MET-SV with 46 individuals with no history of neurological disease (hospital staff members) and 50 clients with brain injury including stroke (n=9), using a series of t-tests. Clients with brain injury made significantly more rule breaks (t = 4.03), task failures (t = 10.10), total errors (t = 7.18) and social rule breaks (chi square = 4.3) than individuals with no history of neurological disease. Results regarding errors were preserved when group comparisons were repeated using age, familiarity and cognitive ability (measured by the NART-R FSIQ) as covariates (F = 11.79, 40.82, 27.92 respectively). There was a significant difference in task failures between groups after covarying for age, IQ (measured by the WAIS-R FSIQ) and familiarity with the shopping centre (F = 11.57). Clients with brain injury made approximately three times as many errors as healthy individuals. For both groups, rule breaks and task failures were the most common errors.

Dawson et al. (2009) examined known group validity of the BMET with 14 clients with stroke and 13 healthy matched controls, using a series of t-tests. Clients with stroke performed significantly worse on number of tasks completed accurately (d = 0.84, p<0.05), rule breaks (d = 0.92, p<0.05) and total failures (d = 1.05, p<0.01). The proportion of group members who completed fewer than 40% of tasks (< 5) satisfactorily was also significantly different between the two groups (28% of clients with stroke vs. 0% of healthy matched controls, p<0.05).
Note: d is the effect size; effect sizes ≥ 0.7 are considered large.

Rand et al. (2009a) examined known group validity of the VMET with a sample of 9 patients with subacute or chronic stroke, 20 healthy young adults and 20 healthy older adults, using the Kruskal-Wallis H test. Patients with stroke made more mistakes than older adults on all VMET outcomes except for rule break mistakes. Older adults made more mistakes than young adults on all VMET outcomes except for the use of strategies mistakes.

Raspelli et al. (2010) examined known group validity of the VMET with 6 clients with stroke and 14 healthy subjects. There were significant differences between groups in time taken to execute the task (higher for healthy subjects) and in the partial error ‘Maintained task objective to completion’.

Raspelli et al. (2012) examined known group validity of the VMET with 9 clients with stroke, 10 healthy young adults and 10 healthy older adults, using Kruskal-Wallis procedures. Results showed that clients with stroke scored lower in VMET time and errors than older adults, and that older adults scored lower in VMET time and errors than young adults.

Sensitivity/ Specificity:
Knight, Alderman & Burgess (2002) investigated sensitivity and specificity of the MET-HV in a sample of 20 patients with chronic acquired brain injury (traumatic brain injury, n=12; stroke, n=5; both TBI and stroke, n=3) and 20 healthy control subjects matched for gender, age and IQ*. A cut-off score ≥ 7 errors (i.e. 5th percentile of total errors of control subjects) resulted in correct identification of 85% of participants with acquired brain injury (85% sensitivity, 95% specificity).
*Note: IQ was measured using the National Adult Reading Test – Revised Full Scale Intelligence Quotient (NART-R FSIQ).

Alderman et al. (2003) reported on sensitivity and specificity of the MET-SV with 46 individuals with no history of neurological disease and 50 clients with brain injury including stroke (n=9). Using a cutoff score ≥ 12 errors (i.e. 5th percentile of controls) resulted in 44% sensitivity (i.e. correct classification of clients with brain injury) and 95.3% specificity (i.e. correct classification of healthy individuals). The authors caution that deriving a single measure based only on number of errors fails to consider between-group qualitative differences in performance. Accordingly, error scores were recalculated to reflect the “normality” of each error type, with errors weighted according to their prevalence in the healthy control group (acceptable errors seen in up to 95% of healthy controls = 1; errors demonstrated by ≥ 5% of healthy controls = 2; errors unique to the patient group = 3). Using a cutoff score ≥ 12 on the weighted errors (5th percentile of controls) resulted in 82% sensitivity and 95.3% specificity. The MET-SV was more sensitive than traditional tests of executive function (Cognitive Estimates, FAS Verbal Fluency, MWCST), and MET-SV error category scores were highly predictive of ratings of executive symptoms in patients who passed traditional executive function tests but failed the MET-SV shopping task.
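
The reweighting procedure lends itself to a short computation. A minimal sketch follows; the three tiers are those described by Alderman et al. (2003), while the function names, the reading of the tier boundaries and the example data are our own illustration:

```python
def error_weight(unique_to_patients: bool, acceptable_in_controls: bool) -> int:
    """Weight an error type by its prevalence in the healthy control group:
    acceptable control errors = 1, rarer control errors = 2,
    errors unique to the patient group = 3 (per Alderman et al., 2003)."""
    if unique_to_patients:
        return 3
    return 1 if acceptable_in_controls else 2

def weighted_error_score(error_types) -> int:
    """error_types: iterable of (unique_to_patients, acceptable_in_controls)."""
    return sum(error_weight(u, a) for u, a in error_types)

# Hypothetical: two acceptable errors, one rare control error, one unique error
print(weighted_error_score([(False, True), (False, True),
                            (False, False), (True, False)]))  # 1+1+2+3 = 7
```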

Responsiveness

Two studies used the MET (MET-HV, VMET and modified version of the MET-HV & MET-SV) to measure change following intervention.

Novakovic-Agopian et al. (2011) developed a modified version of the MET-HV and MET-SV to be used in local hospital settings. They developed 3 alternate forms that were used in a pilot study examining the effect of goal-oriented attentional self-regulation training with a sample of 16 patients with chronic brain injury including stroke or cerebral hemorrhage (n=3). A pseudo-random crossover design was used. During the first 5 weeks, one group (Group A) completed goal-oriented attentional self-regulation training while the other group (Group B) only received a 2-hour educational instructional session. In the subsequent phase, conditions were switched such that participants in Group B received goals training for 5 weeks while those in Group A received educational instruction. At week 5 the group that received goal training first demonstrated a significant reduction in task failures (p<0.01), whereas the group that received the educational session demonstrated no significant improvement in MET scores. From week 5 to week 10 there were no significant changes in MET scores in either group.

Rand, Weiss and Katz (2009b) used the MET-HV and VMET to detect change in multi-tasking skills of 4 clients with subacute stroke following virtual reality intervention using the VMall virtual supermarket. Clients demonstrated improved performance on both measures following 3 weeks of multi-tasking training using the VMall virtual supermarket.

References

  • Alderman, N., Burgess, P.W., Knight, C., & Henman, C. (2003). Ecological validity of a simplified version of the multiple errands shopping test. Journal of the International Neuropsychological Society, 9, 31-44.
  • Dawson, D.R., Anderson, N.D., Burgess, P., Cooper, E., Krpan, K.M., & Stuss, D.T. (2009). Further development of the Multiple Errands Test: Standardized scoring, reliability, and ecological validity for the Baycrest version. Archives of Physical Medicine and Rehabilitation, 90, S41-51.
  • Knight, C., Alderman, N., & Burgess, P.W. (2002). Development of a simplified version of the Multiple Errands Test for use in hospital settings. Neuropsychological Rehabilitation, 12(3), 231-255.
  • Maier, A., Krauss, S., & Katz, N. (2011). Ecological validity of the Multiple Errands Test (MET) on discharge from neurorehabilitation hospital. Occupational Therapy Journal of Research: Occupation, Participation and Health, 31(1), S38-46.
  • Novakovic-Agopian, T., Chen, A.J.W., Rome, S., Abrams, G., Castelli, H., Rossi, A., McKim, R., Hills, N., & D’Esposito, M. (2011). Rehabilitation of executive functioning with training in attention regulation applied to individually defined goals: A pilot study bridging theory, assessment, and treatment. The Journal of Head Trauma Rehabilitation, 26(5), 325-338.
  • Novakovic-Agopian, T., Chen, A. J., Rome, S., Rossi, A., Abrams, G., D’Esposito, M., Turner, G., McKim, R., Muir, J., Hills, N., Kennedy, C., Garfinkle, J., Murphy, M., Binder, D., Castelli, H. (2012). Assessment of Subcomponents of Executive Functioning in Ecologically Valid Settings: The Goal Processing Scale. The Journal of Head Trauma Rehabilitation, 2012 Oct 16. [Epub ahead of print]
  • Rand, D., Rukan, S., Weiss, P.L., & Katz, N. (2009a). Validation of the Virtual MET as an assessment tool for executive functions. Neuropsychological Rehabilitation, 19(4), 583-602.
  • Rand, D., Weiss, P., & Katz, N. (2009b). Training multitasking in a virtual supermarket: A novel intervention after stroke. American Journal of Occupational Therapy, 63, 535-542.
  • Raspelli, S., Carelli, L., Morganti, F., Poletti, B., Corra, B., Silani, V., & Riva, G. (2010). Implementation of the Multiple Errands Test in a NeuroVR-supermarket: A possible approach. Studies in Health Technology and Informatics, 154, 115-119.
  • Raspelli, S., Pallavicini, F., Carelli, L., Morganti, F., Pedroli, E., Cipresso, P., Poletti, B., Corra, B., Sangalli, D., Silani, V., & Riva, G. (2012). Validating the Neuro VR-based virtual version of the Multiple Errands Test: Preliminary results. Presence, 21(1), 31-42.
  • Shallice, T. & Burgess, P.W. (1991). Deficits in strategy application following frontal lobe damage in man. Brain, 114, 727-741.

See the measure

How to obtain the Multiple Errands Test?

See the papers below for test instructions of the Simplified Version (MET-SV) and the Hospital Version (MET-HV):

  • Alderman, N., Burgess, P.W., Knight, C., & Henman, C. (2003). Ecological validity of a simplified version of the multiple errands shopping test. Journal of the International Neuropsychological Society, 9, 31-44.
  • Knight, C., Alderman, N., & Burgess, P.W. (2002). Development of a simplified version of the Multiple Errands Test for use in hospital settings. Neuropsychological Rehabilitation, 12(3), 231-255.

The Baycrest MET can be obtained at https://cognitionandeverydaylifelabs.com/multiple-errands-test/

Trail Making Test (TMT)

Evidence Reviewed as of before: 22-04-2012
Author(s)*: Katie Marvin, MSc. PT
Editor(s): Nicol Korner-Bitensky, PhD OT; Annabel McDermott, OT

Purpose

The Trail Making Test (TMT) is a widely used test to assess executive function in patients with stroke. Successful performance of the TMT requires a variety of mental abilities, including letter and number recognition, mental flexibility, visual scanning, and motor function.

In-Depth Review

Purpose of the measure

The Trail Making Test (TMT) is a widely used test to assess executive function in patients with stroke. Successful performance of the TMT requires a variety of mental abilities, including letter and number recognition, mental flexibility, visual scanning, and motor function.

Performance is evaluated using two different visual conceptual and visuomotor tracking conditions: Part A involves connecting numbers 1-25 in ascending order; and Part B involves connecting numbers and letters in an alternating and ascending fashion.

Available versions

The TMT was originally included as a component of the Army Individual Test Battery and is also a part of the Halstead-Reitan Neuropsychological Test Battery (HNTB).

Features of the measure

Description of tasks:

The TMT is comprised of 2 tasks – Part A and B:

  • Part A: Consists of 25 circles numbered from 1 to 25, randomly distributed over a page of letter-size paper. The participant is required to connect the circles with a pencil as quickly as possible in numerical sequence, beginning with the number 1.
  • Part B: Consists of 25 circles numbered 1 to 13 and lettered A to L, randomly distributed over a page of paper. The participant is required to connect the circles with a pencil as quickly as possible, but alternating between numbers and letters and taking both series in ascending sequence (i.e. 1, A, 2, B, 3, C…).

What to consider before beginning:

  • The TMT requires relatively intact motor abilities (i.e. the ability to hold and maneuver a pen or pencil, and the ability to move the upper extremity). The Oral TMT may be a more appropriate version to use if the examiner considers that the participant’s motor ability may impact his/her performance.
  • Cultural and linguistic variables may impact performance and affect scores.

Scoring and Score Interpretation:

Performance is evaluated using two different visual conceptual and visuomotor tracking conditions: Part A involves connecting numbers 1-25 in ascending order; and Part B involves connecting numbers and letters in an alternating and ascending fashion.

Time taken to complete each task and number of errors made during each task are recorded and compared with normative data. Time to complete the task is recorded in seconds, whereby the greater the number of seconds, the greater the impairment.

In some reported methods of administration, the examiner points out and explains mistakes as they occur.

A maximum time of 5 minutes is typically allowed for Part B. Participants who are unable to complete Part B within 5 minutes are given a score of 300 or 301 seconds. Performance on Part B has not been found to yield any more information on stroke severity than performance on Part A (Tamez et al., 2011).

Ranges and Cut-Off Scores

              Normal          Brain damage
TMT Part A    1-39 seconds    40 or more seconds
TMT Part B    1-91 seconds    92 or more seconds

Adapted from Reitan (1958) as cited in Matarazzo, Wiens, Matarazzo & Goldstein (1974).
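
Applied in code, the cutoffs above amount to a simple classification. The sketch below also applies the 300-second convention for an uncompleted Part B noted earlier (a sketch only, not a clinical decision rule):

```python
# Reitan (1958) cutoffs, as tabulated above.
CUTOFFS = {"A": 40, "B": 92}   # seconds; at or above suggests brain damage
PART_B_CAP = 300               # assigned when Part B is not completed in 5 min

def score_part_b(seconds: float, completed: bool) -> float:
    return seconds if completed else PART_B_CAP

def classify(part: str, seconds: float) -> str:
    return ("suggestive of brain damage"
            if seconds >= CUTOFFS[part] else "within normal range")

print(classify("A", 35))                       # within normal range
print(classify("B", score_part_b(0, False)))   # suggestive of brain damage
```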

Time:

Approximately 5 to 10 minutes

Training requirements:

No training requirements have been reported.

Equipment:

  • A copy of the measure
  • Pencil or pen
  • Stopwatch

Alternative versions of the Trail Making Test

  • Color Trails (D’Elia et al., 1996)
  • Comprehensive Trail Making Test (Reynolds, 2002)
  • Delis-Kaplan Executive Function Scale (D-KEFS) – includes subtests modeled after the TMT
  • Oral TMT – an alternative for patients with motor deficits or visual impairments (Ricker & Axelrod, 1994).
  • Repeat testing – alternate forms have been developed for repeat testing purposes (Franzen et al., 1996; Lewis & Rennick, 1979)
  • Symbol Trail Making Test – developed as an alternative to the Arabic version of the TMT, for populations with no familiarity with the Arabic numerical system (Barncord & Wanlass, 2001)

Client suitability

Can be used with:

  • Patients with stroke and brain damage.

Should not be used with:

  • Patients with motor deficiencies. If motor ability may impact performance, consider using the Oral TMT.

In what languages is the measure available?

Arabic, Chinese and Hebrew

Summary

What does the tool measure? Executive function in patients with stroke.
What types of clients can the tool be used for? The TMT can be used with, but is not limited to, patients with stroke.
Is this a screening or assessment tool? Assessment tool
Time to administer The TMT takes approximately 5 to 10 minutes to administer.
Versions

  • Color Trails
  • Comprehensive Trail Making Test
  • Delis-Kaplan Executive Function Scale (D-KEFS)
  • Oral TMT
  • Repeat testing – alternate forms have been developed for repeat testing purposes
  • Symbol Trail Making Test
Other Languages Arabic, Chinese and Hebrew
Measurement Properties
Reliability Test-retest:
Two studies examined the test-retest reliability of the TMT among patients with stroke and found adequate to excellent test-retest reliability.
Validity Content:
One study examined the content validity of the TMT and found it to be a complex test that involves aspects of abstraction, visual scanning and attention.

Criterion:
Predictive:
Several studies have examined the predictive validity of the TMT and have found Part B to be predictive of fitness to drive following stroke.

Construct:
Convergent:
One study examined the convergent validity of the TMT and found poor to adequate correlations with the Category Test, Wisconsin Card Sort Test, Paced Auditory Serial Addition Task and the Visual Search and Attention Test.

Known groups:
Three studies have examined the known groups validity of the TMT and found that the TMT was able to differentiate between patients with and without brain damage; however, it was not sensitive in differentiating between frontal and non-frontal brain damage.

Floor/Ceiling Effects One study found Part A of the TMT to have significant ceiling effects.
Does the tool detect change in patients? The responsiveness of the TMT has not been formally studied; however, the TMT has been used to detect change in a clinical trial with patients with stroke.
Acceptability The TMT is simple and easy to administer.
Feasibility The TMT is relatively inexpensive and highly portable. The TMT is public domain and can be reproduced without permission. It can be administered by individuals with minimal training in cognitive assessment.
How to obtain the tool? The Trail Making Test (TMT) can be purchased from: http://www.reitanlabs.com

Psychometric Properties

Overview

A literature search was conducted to identify all relevant publications on the psychometric properties of the Trail Making Test (TMT) involving patients with stroke.

Floor/Ceiling Effects

In a study by Mazer, Korner-Bitensky and Sofer (1998) that investigated the ability of perceptual testing to predict on-road driving outcomes in patients with stroke, Part A of the TMT was found to have significant ceiling effects. For this reason, Part A was excluded from study results, as it was deemed too easy for participants when evaluating the ability of the TMT to predict on-road driving test outcomes. No ceiling effects were found for Part B.

Reliability

Internal consistency:
No studies were identified examining the internal consistency of the TMT in patients with stroke.

Test-retest:
Matarazzo, Wiens, Matarazzo and Goldstein (1974) examined the test-retest reliability of the TMT and other components of the Halstead Impairment Index with 29 healthy males and sixteen 60-year-old patients with diffuse cerebrovascular disease. Adequate test-retest reliability was found for both Part A and Part B of the TMT in the healthy control group (r=0.46 and 0.44 respectively), as calculated using Pearson correlation coefficients. Excellent and adequate test-retest reliability was found for Part A and Part B respectively (r=0.78 and 0.67) among participants with diffuse cerebrovascular disease.

Goldstein and Watson (1989) investigated the test-retest reliability of the TMT as a part of the Halstead-Reitan Battery in a sample of 150 neuropsychiatric patients, including patients with stroke. Test-retest correlations were calculated using Pearson correlation coefficients for the entire sample and for the sub-group of patients with stroke. Excellent test-retest reliability was found for both Part A and Part B (r=0.94 and 0.86 respectively) in the sub-group of patients with stroke, and adequate reliability for the entire participant sample (r=0.69 and 0.66 respectively).

Intra-rater:
No studies were identified examining the intra-rater reliability of the TMT in patients with stroke.

Inter-rater:
No studies were identified examining the inter-rater reliability of the TMT in patients with stroke.

Validity

Content:

O’Donnell, MacGregor, Dabrowski, Oestreicher & Romero (1994) examined the face validity of the TMT in a sample of 117 community-dwelling patients, including patients with stroke. The results suggest that the TMT is a complex test that involves aspects of abstraction, visual scanning and attention.

Criterion:

Concurrent:
No studies were identified examining the concurrent validity of the TMT.

Predictive:
Mazer, Korner-Bitensky and Sofer (1998) examined the ability of the TMT and other measures of perceptual function to predict on-road driving test outcomes in 84 patients with subacute stroke. For Part B of the TMT, a cut-off score of < 3 errors demonstrated high positive predictive value (85%) and low negative predictive value (48%) for successful completion of the driving evaluation. The Motor-Free Visual Perception Test (MVPT) and TMT Part B, when combined, demonstrated the highest predictive value for on-road driving test outcome: participants who scored poorly on both the MVPT and TMT Part B were 22 times more likely to fail the on-road evaluation.
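
As a worked illustration of how such predictive values are derived, the sketch below computes PPV and NPV from a 2x2 classification table. The counts are hypothetical, chosen only so the arithmetic reproduces the 85% and 48% reported above; they are not the study's actual data.

```python
# Hypothetical 2x2 table for the Part B error cut-off (< 3 errors = test
# "positive" for passing). Counts are invented to reproduce the reported
# predictive values, not taken from Mazer et al. (1998).
tp = 34  # < 3 errors and passed the on-road evaluation
fp = 6   # < 3 errors but failed
fn = 13  # >= 3 errors but passed
tn = 12  # >= 3 errors and failed

ppv = tp / (tp + fp)  # P(pass | score below cut-off)    -> 0.85
npv = tn / (tn + fn)  # P(fail | score at/above cut-off) -> 0.48
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")
```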

Devos et al. (2011) conducted a systematic review to identify the best determinants of fitness to drive following stroke. The TMT Part B was evaluated in 2 studies (Mazer et al., 1998; Mazer et al., 2003) and found to be one of the best predictors of passing on-road driving evaluation tests (effect size = 0.81, p<0.0001). In addition, when using a cut-off score of 90 seconds, the TMT Part B had a sensitivity of 80% and a specificity of 62% for detecting unsafe on-road performance. In an earlier systematic review, Marshall et al. (2007) likewise found the TMT to be one of the most useful predictors of fitness to drive post-stroke.
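
Sensitivity and specificity at a time cut-off follow the same logic. Below is a minimal sketch, again with invented completion times and outcomes, of how the 90-second Part B cut-off described above would be applied; the resulting percentages are illustrative, not those of the review.

```python
# Invented Part B times (seconds) and on-road outcomes for illustration.
times  = [70, 85, 95, 110, 60, 130, 88, 92, 75, 140]
unsafe = [False, False, True, True, False, True, False, False, False, True]

flagged = [t > 90 for t in times]  # test positive: slower than 90 s
tp = sum(f and u for f, u in zip(flagged, unsafe))              # true positives
tn = sum((not f) and (not u) for f, u in zip(flagged, unsafe))  # true negatives
sensitivity = tp / sum(unsafe)
specificity = tn / (len(unsafe) - sum(unsafe))
print(f"sensitivity = {sensitivity:.0%}, specificity = {specificity:.0%}")
```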

Construct:

Convergent/Discriminant:
O’Donnell et al. (1994) examined the convergent validity of the TMT and four other neuropsychological tests: the Category Test (CAT), Wisconsin Card Sort Test (WCST), Paced Auditory Serial Addition Task (PASAT), and Visual Search and Attention Test (VSAT). The study involved 117 community-dwelling adults, including patients with stroke. Poor to adequate correlations were found between the TMT and the other measures (CAT r=0.38; WCST r=0.31; PASAT r=0.44; VSAT r=0.30), using Pearson product-moment correlations.

Known groups:
Reitan (1955) examined the ability of the TMT to differentiate between patients with and without organic brain damage, including patients with stroke. Highly significant differences in mean and sum scores were found between the two groups (p<0.001) on both parts of the TMT, suggesting that the TMT is able to differentiate between patients with and without brain damage.

Corrigan and Hinkeldey (1987) examined the relationship between Part A and Part B of the TMT. Data were collected from the charts of 497 patients receiving treatment at a rehabilitation centre; patients with traumatic brain injury and stroke comprised a large majority of the sample. A difference score (B-A) and a ratio score (B/A) were calculated. The difference score was highly correlated with intelligence and severity of impairment, and only moderately correlated with age, education and memory functioning. The B/A ratio appeared to show the greatest sensitivity to differences in cerebral lateralization of damage.
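
As a quick illustration of these two derived scores (with made-up completion times, not data from the study), the snippet below computes both:

```python
# Hypothetical TMT completion times in seconds for a single patient.
part_a, part_b = 40.0, 110.0

difference = part_b - part_a  # B - A: intended to subtract out simple speed
ratio = part_b / part_a       # B / A: proportional slowing on Part B
print(f"B-A = {difference:.0f} s, B/A = {ratio:.2f}")
```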

Tamez et al. (2011) examined the effects of frontal versus non-frontal stroke and of stroke severity on TMT performance in 689 patients with stroke. The TMT, Digit Span and National Institutes of Health Stroke Scale (NIHSS) were administered within 72 hours of hospital admission. Stroke severity was classified according to the NIHSS, and frontal or non-frontal lesions were identified by CT or MRI scans. Performance on both Part A and Part B of the TMT was significantly correlated with stroke severity as measured by the NIHSS. Patients with frontal and non-frontal lesions scored similarly on Part A and Part B (p>0.05). Results of this study suggest that the TMT is sensitive to brain damage; however, there is little evidence to support the widely held assumption that Part B is more sensitive to frontal lesions than Part A.

Sensitivity/Specificity:

No studies were identified examining the sensitivity and specificity of the TMT in patients with stroke.

Responsiveness

Barker-Collo, Feigin, Lawes, Senior and Parag (2010) assessed the course of recovery of attention in 43 patients with acute stroke over a 6-month period. The TMT and other measures of attention were administered at baseline (within 4 weeks of stroke onset), at 6 weeks, and at 6 months after stroke. Although the responsiveness of the TMT was not formally assessed in this study, the test was sensitive enough to detect improvement in attention at 6 weeks and 6 months following stroke.

References

  • Barker-Collo, S., Feigin, V., Lawes, C., Senior, H., & Parag, V. (2010). Natural history of attention deficits and their influence on functional recovery from acute stages to 6 months after stroke. Neuroepidemiology, 35(4), 255-262.
  • Barncord, S.W. & Wanlass, R.L. (2001). The Symbol Trail Making Test: Test development and utility as a measure of cognitive impairment. Applied Neuropsychology, 8, 99-103.
  • Corrigan, J. D. & Hinkeldey, N. S. (1987). Relationships between Parts A and B of the Trail Making Test. Journal of Clinical Psychology, 43(4), 402-409.
  • D’Elia, L.F., Satz, P., Uchiyama, C.I., & White, T. (1996). Color Trails Test. Odessa, FL: PAR.
  • Devos, H., Akinwuntan, A. E., Nieuwboer, A., Truijen, S., Tant, M., & De Weerdt, W. (2011). Screening for fitness to drive after stroke: A systematic review and meta-analysis. Neurology, 76(8), 747-756.
  • Elkin-Frankston, S., Lebowitz, B.K., Kapust, L.R., Hollis, A.M., & O’Connor, M.G. (2007). The use of the Colour Trails Test in the assessment of driver competence: Preliminary reports of a culture-fair instrument. Archives of Clinical Neuropsychology, 22, 631-635.
  • Goldstein, G. & Watson, J.R. (1989). Test-retest reliability of the Halstead-Reitan Battery and the WAIS in a neuropsychiatric population. The Clinical Neuropsychologist, 3(3), 265-273.
  • Mark, V. W., Woods, A. J., Mennemeier, M., Abbas, S., & Taub, E. (2006). Cognitive assessment for CI therapy in the outpatient clinic. NeuroRehabilitation, 21(2), 139-146.
  • Marshall, S.C., Molnar, F., Man-Son-Hing, M., Blair, R., Brosseau, L., Finestone, H.M., Lamothe, C., Korner-Bitensky, N., & Wilson, K. (2007). Predictors of driving ability following stroke: A systematic review. Topics in Stroke Rehabilitation, 14(1), 98-114.
  • Matarazzo, J.D., Wiens, A.N., Matarazzo, R.G., & Goldstein, S.G. (1974). Psychometric and clinical test-retest reliability of the Halstead Impairment Index in a sample of healthy, young, normal men. The Journal of Nervous and Mental Disease, 158(1), 37-49.
  • Mazer, B.L., Korner-Bitensky, N.A., & Sofer, S. (1998). Predicting ability to drive after stroke. Archives of Physical Medicine and Rehabilitation, 79, 743-750.
  • Mazer, B.L., Sofer, S., Korner-Bitensky, N., Gelinas, I., Hanley, J., & Wood-Dauphinee, S. (2003). Effectiveness of a visual attention retraining program on the driving performance of clients with stroke. Archives of Physical Medicine and Rehabilitation, 84, 541-550.
  • O’Donnell, J.P., MacGregor, L.A., Dabrowski, J.J., Oestreicher, J.M., & Romero, J.J. (1994). Construct validity of neuropsychological tests of conceptual and attentional abilities. Journal of Clinical Psychology, 50(4), 596-600.
  • Reitan, R.M. (1955). The relation of the Trail Making Test to organic brain damage. Journal of Consulting Psychology, 19(5), 393-394.
  • Reynolds, C. (2002). Comprehensive Trail Making Test. Austin, TX: Pro-Ed.
  • Ricker, J.H. & Axelrod, B.N. (1994). Analysis of an oral paradigm for the Trail Making Test. Assessment, 1, 47-51.
  • Strauss, E., Sherman, E.M.S., & Spreen, O. (2006). A compendium of neuropsychological tests: Administration, norms, and commentary (3rd ed.). New York: Oxford University Press.
  • Tamez, E., Myerson, J., Morris, L., White, D. A., Baum, C., & Connor, L. T. (2011). Assessing executive abilities following acute stroke with the Trail Making Test and Digit Span. Behavioural Neurology, 24(3), 177-185.

See the measure

How to obtain the Trail Making Test (TMT)?

The Trail Making Test (TMT) can be purchased from:

Reitan Neuropsychology Laboratory
P.O. Box 66080
Tucson, AZ
85728

http://www.reitanlabs.com
