ABILHAND

Evidence Reviewed as of before: 17-06-2012
Author(s)*: Annabel McDermott, OT
Editor(s): Nicol Korner-Bitensky, PhD OT

Purpose

The ABILHAND is a semi-structured item-response questionnaire that measures manual ability according to an individual’s perceived difficulty performing daily bimanual tasks.

In-Depth Review

Purpose of the measure

The ABILHAND is an interview-based assessment tool that measures a patient’s perceived difficulty using his/her hands to perform manual activities in daily life. The ABILHAND assesses active function of the upper limbs. The tool measures an individual’s ability to perform bimanual tasks, regardless of strategies used to complete the task (Ashford et al., 2008; Penta et al., 1998).

Available versions

The ABILHAND was originally developed by Penta et al. (1998) as a 56-item, 4-level questionnaire of unimanual and bimanual ability for patients with rheumatoid arthritis. The original ABILHAND was intended to measure rehabilitation outcomes and to provide guidelines for goal setting in treatment planning (Gustafsson et al., 2004). Penta et al. (2001) found that patients with stroke were able to complete unimanual activities with the unaffected limb, regardless of hand dominance, whereas bimanual tasks were more difficult. Accordingly, a version was developed specifically for patients with stroke that only included bimanual items, as well as two alternate ‘unimanual’ activities that require skillful use of the affected hand (cutting nails, filing nails). Penta et al. (2001) also reviewed the 4-level scoring criterion (impossible, very difficult, difficult, easy) and found that patients rarely used the ‘very difficult’ score. This indicated that the two intermediate scoring criteria (very difficult, difficult) were not sufficiently distinct. Accordingly, the stroke version of the ABILHAND was developed with a 3-level scoring criterion (impossible, any difficulty, easy).

Other impairment-specific versions were subsequently created with modified item sets and levels. Each version of the ABILHAND has its own Rasch-derived item difficulty calibrations that rely on computerized algorithms to obtain the patient’s overall measure from his/her responses (Simone et al., 2011).

Features of the measure

Items:

The ABILHAND is an inventory of 23 bimanual activities (from most difficult to least difficult):

  1. Hammering a nail
  2. Threading a needle
  3. Peeling potatoes with a knife
  4. Cutting own nails
  5. Wrapping up gifts
  6. Filing own nails
  7. Cutting meat
  8. Peeling onions
  9. Shelling hazel nuts
  10. Opening a screw-topped jar
  11. Fastening zipper of jacket
  12. Tearing open pack of chips
  13. Buttoning up a shirt
  14. Sharpening a pencil
  15. Spreading butter on a slice of bread
  16. Fastening a snap
  17. Buttoning up trousers
  18. Taking the cap off a bottle
  19. Opening mail
  20. Squeezing toothpaste on a toothbrush
  21. Pulling up the zipper of trousers
  22. Unwrapping a chocolate bar
  23. Washing hands

Scoring:

The patient is asked to rate his/her perceived difficulty performing items without help, according to the following scoring criteria:

  • 0 = impossible
  • 1 = difficult
  • 2 = easy

Tasks that the patient has not performed in the past 3 months are not scored and are encoded as missing responses.
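The scoring rules above can be sketched in a few lines of Python. This is only an illustration: the function name `encode_response` and the sample answers are hypothetical, and the sketch assumes that a missing response is represented as `None`.

```python
from typing import Optional

# Response labels and their ordinal codes, as listed in the scoring criteria.
RESPONSE_CODES = {"impossible": 0, "difficult": 1, "easy": 2}


def encode_response(label: str, performed_in_last_3_months: bool) -> Optional[int]:
    """Return the ordinal code for one item, or None for a missing response."""
    if not performed_in_last_3_months:
        # Tasks not performed in the past 3 months are not scored.
        return None
    return RESPONSE_CODES[label.strip().lower()]


# Hypothetical answers for three of the 23 items.
answers = [
    ("Hammering a nail", "impossible", True),
    ("Threading a needle", "difficult", False),  # not performed recently
    ("Washing hands", "easy", True),
]
encoded = {item: encode_response(label, done) for item, label, done in answers}
print(encoded)
```

Keeping missing responses separate from a score of 0 matters because "impossible" and "not attempted" carry different information about the patient.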

The ABILHAND was developed using the Rasch measurement model, which provides a method to convert the ordinal raw score into a linear measure on a unidimensional scale. Item scores are entered into the WINSTEPS computer program, and raw ordinal data is converted to linear measures expressed in logits (log-odds probability units). The total score is scaled along a unidimensional continuum with 0 at the centre of the scale, whereby the higher the logit number, the greater the patient’s perceived ability (Gustafsson et al., 2004).
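To illustrate how ordinal responses can be turned into a logit measure, the following is a minimal sketch of a Rasch rating scale model with a maximum-likelihood person estimate. It is not the WINSTEPS algorithm and does not use the published ABILHAND calibrations: the item difficulties and thresholds below are hypothetical values chosen for illustration, and the function names are assumptions.

```python
import math
from typing import List, Optional, Sequence


def category_probabilities(theta: float, difficulty: float,
                           thresholds: Sequence[float]) -> List[float]:
    """Rating scale model probabilities for categories 0..len(thresholds)."""
    # Category k has exponent sum_{j<=k} (theta - difficulty - tau_j); category 0 has 0.
    exponents = [0.0]
    for tau in thresholds:
        exponents.append(exponents[-1] + theta - difficulty - tau)
    peak = max(exponents)  # subtract the maximum for numerical stability
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(theta: float, difficulty: float,
                   thresholds: Sequence[float]) -> float:
    """Expected item score for a person with ability theta."""
    probs = category_probabilities(theta, difficulty, thresholds)
    return sum(k * p for k, p in enumerate(probs))


def estimate_ability(responses: Sequence[Optional[int]],
                     difficulties: Sequence[float],
                     thresholds: Sequence[float],
                     lo: float = -10.0, hi: float = 10.0) -> Optional[float]:
    """Person measure in logits; None when no finite estimate exists."""
    # Missing responses (None) are simply left out of the estimate.
    scored = [(x, d) for x, d in zip(responses, difficulties) if x is not None]
    raw = sum(x for x, _ in scored)
    max_raw = len(thresholds) * len(scored)
    if not scored or raw == 0 or raw == max_raw:
        return None  # extreme raw scores have no finite maximum-likelihood estimate
    # The likelihood peaks where the expected total equals the raw score;
    # the expected total increases with theta, so bisection finds that point.
    for _ in range(100):
        mid = (lo + hi) / 2.0
        expected_total = sum(expected_score(mid, d, thresholds) for _, d in scored)
        if expected_total < raw:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical calibrations (in logits) for three items and a 3-level scale.
HYPOTHETICAL_DIFFICULTIES = [1.5, 0.0, -1.5]
HYPOTHETICAL_THRESHOLDS = [-0.5, 0.5]

theta = estimate_ability([0, 1, 2], HYPOTHETICAL_DIFFICULTIES, HYPOTHETICAL_THRESHOLDS)
print(f"Estimated ability: {theta:.2f} logits")
```

In this sketch a higher raw score on the same items always yields a higher logit estimate, which matches the interpretation given above that higher logits indicate greater perceived ability.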

What to consider before beginning:

Users should note that self-estimated measures (i.e. when scores are not based on clinician observation of performance) are subject to overestimation or underestimation of actual performance, depending on motivation and cognitive skills (Penta et al., 2001).

Clinicians should consider patient factors such as self-esteem, insight, vision, hearing, language and cognitive function prior to administering the ABILHAND (Gustafsson et al., 2004).

Mpofu & Oakland (2010) advise caution when using the ABILHAND to measure improvements in impairment of the affected upper limb after stroke rehabilitation. The ABILHAND does not take into consideration the arm used to perform a task or compensatory strategies employed to complete the task. Accordingly, improvement in scores may be based on use of compensatory strategies rather than on improvement in the affected arm.

Time:

The ABILHAND takes 10 to 30 minutes to administer (Ashford et al., 2008; Connell et al., 2012).

Training requirements:

No training requirements have been specified for the ABILHAND, although administration by a clinician is recommended (Ashford et al., 2008).

Equipment:

The ABILHAND is a semi-structured questionnaire that does not require specific equipment; however, the WINSTEPS computer program is required to process raw scores.

Client suitability

Can be used with:

  • Individuals with chronic stroke
  • Individuals with rheumatoid arthritis
  • Individuals with systemic sclerosis

Should not be used with:

  • Due to the subjective nature of the patient’s reports, this measure should not be used with individuals with severe cognitive deficits (Penta et al., 2001).
  • The ABILHAND may not be suitable for use with patients with aphasia or apraxia (Gustafsson et al., 2004).

In what languages is the measure available?

  • French
  • English
  • Dutch
  • Italian
  • Swedish

Summary

What does the tool measure? Manual ability of the upper extremity.
What types of clients can the tool be used for? The ABILHAND can be used with, but is not limited to, patients with stroke.
Is this a screening or assessment tool? Assessment
Time to administer: 10-30 minutes
Versions:
  • AH-RA for rheumatoid arthritis (46 items, 4 levels)
  • AH-RA revised version (27 items, 3 levels)
  • ABILHAND-ULA for upper limb amputees (22 items; 4 levels)
  • SSC-adapted ABILHAND for systemic sclerosis (26 items, 3 levels)
  • ABILHAND – neuromuscular age-independent version (22 items)
  • ABILHAND-Kids (21 items)
Other Languages: French, English, Swedish, Dutch, Italian
Measurement Properties
Reliability
Internal consistency:
– Order of difficulty of items has been confirmed by Rasch analysis.
– One study reported a high item reliability index.
– One study reported high person separation reliability.

Test-retest:
No studies have reported on the test-retest reliability of the ABILHAND.

Intra-rater:
No studies have reported on the intra-rater reliability of the ABILHAND.

Inter-rater:
No studies have reported on the inter-rater reliability of the ABILHAND.

Validity
Content:
– One study reported that the 23 items of the ABILHAND define a common continuum of manual ability, and items are coherent with the overall questionnaire and contribute to the measurement of manual ability.
– One study examined stability of item difficulty of the ABILHAND and found that item hierarchy was substantially retained across different groupings (impairment, age, sex, ability).
– One study reported that scores explained 84% of observed variance. The main factor across the residuals explained only 11.4% of the residual variance (1.8% of the total variance).

Criterion:
Concurrent:
One study examined the concurrent validity of the ABILHAND among patients with chronic upper limb impairment resulting from conditions including stroke and reported adequate correlations with the Box and Block Test, Jamar handgrip and Purdue pegboard test, and an adequate negative correlation with the Nine Hole Peg Test.

Predictive:
No studies have reported on the predictive validity of the ABILHAND.

Construct:
Convergent/Discriminant:
No studies have reported on the convergent/discriminant validity of the ABILHAND.

Known Groups:
– One study reported highly significant differences in ABILHAND scores between patients with tetraparesis, hemiparesis, other neurological impairments (multiple sclerosis, Parkinson’s disease, ataxia) and healthy subjects.
– One study reported no correlation between ABILHAND scores and country, age, sex, time since stroke, affected side, lesion site or tactile sensitivity; poor correlation with grip strength and manual dexterity of the unaffected limb; poor negative correlation with depression; adequate correlation with grip strength and manual dexterity of the affected limb; and excellent correlation with upper limb motricity.

Floor/Ceiling Effects: No studies have reported on the floor/ceiling effects of the ABILHAND.
Does the tool detect change in patients?
– No studies have reported on the responsiveness of the ABILHAND.
– One study reported that the ABILHAND demonstrates 92% sensitivity and 80% specificity at a lower cutoff score of 80/100.
Acceptability: The ABILHAND is non-invasive and quick to administer. The items are considered reflective of real-life activities (i.e. ecologically valid).
Feasibility: The ABILHAND is portable and is suitable for administration in various settings. The assessment is quick to administer and requires minimal specialist equipment or training.
How to obtain the tool? The ABILHAND is available in Penta, M., Tesio, L., Arnould, C., Zancan, A., & Thonnard, J-L. (2001). The ABILHAND questionnaire as a measure of manual ability in chronic stroke patients: Rasch-based validation and relationship to upper limb impairment. Stroke, 32, 1627-34.

Psychometric Properties

Overview

A literature search was conducted to identify all relevant publications on the psychometric properties of the ABILHAND. While additional studies have been conducted on other ABILHAND versions, this review specifically addresses the psychometric properties of the 23-item stroke version of the ABILHAND, unless otherwise specified. Two studies were identified.

Floor/Ceiling Effects

No studies have reported on the floor or ceiling effects of the ABILHAND. However, given the hierarchical relationship of items, lower-level tasks of the ABILHAND may be susceptible to floor effects (Ashford et al., 2008).

Reliability

Internal consistency:
Penta et al. (2001) examined the internal consistency of the original 56-item ABILHAND in a sample of 103 patients with chronic stroke using Rasch analysis and reported high reliability (Rasch separation reliability=0.90; person separation reliability=0.90). The authors examined the stability of the scale through differential item functioning (DIF) tests among 12 subgroups: sex (male/female); country (Belgium/Italy); age (< 60/≥ 60), affected side (dominant/nondominant); delay since stroke (< 2 years/≥ 2 years), level of depression, dexterity and manual ability of the unaffected limb, grip strength, dexterity and sensitivity of the affected limb, and motricity of the affected limb. The difficulty hierarchy of the ABILHAND was uniformly perceived by patients with chronic stroke.
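As a rough aid to interpreting a separation reliability such as 0.90, the sketch below applies the conversion conventionally used in Rasch analysis between a reliability R, the separation index G = sqrt(R / (1 - R)), and the number of statistically distinct strata (4G + 1) / 3. These formulas are an assumption brought in for illustration; the studies reviewed here are not stated to have computed their values this way.

```python
import math


def separation_index(reliability: float) -> float:
    """Separation index G for a separation reliability R (0 <= R < 1)."""
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must be in the range [0, 1)")
    return math.sqrt(reliability / (1.0 - reliability))


def strata(reliability: float) -> float:
    """Number of statistically distinct levels implied by the separation index."""
    return (4.0 * separation_index(reliability) + 1.0) / 3.0


# Reliability value reported above for the sample of patients with chronic stroke.
g = separation_index(0.90)
h = strata(0.90)
print(f"separation = {g:.2f}, strata = {h:.2f}")
```

Under this convention a reliability of 0.90 corresponds to a separation index of about 3, or roughly four distinguishable levels of manual ability.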

Simone et al. (2011) examined the internal consistency of the ABILHAND in a sample of 126 patients with chronic upper limb impairment resulting from stroke (n=83), multiple sclerosis (n=17), peripheral or cerebellar ataxia (n=13), spinal cord lesion (n=10) or Parkinson’s disease (n=3), and 24 healthy subjects. The ABILHAND demonstrated high reliability (item reliability index=0.94; Cronbach’s alpha=0.99). All items of the ABILHAND fit the Rasch model satisfactorily. There were at least 4 strata of statistically different measures, indicating that variance across scores did not reflect randomness. The authors also examined stability of item difficulty through differential item functioning (DIF) by comparing 4 different groupings of the sample pool: impairment (hemiparesis vs. other); age (≤ 69 vs. > 69); sex (male vs. female); and ability (above median vs. below median). There was a very moderate DIF across the grouping criteria, whereby item hierarchy was substantially retained for all subgroups: impairment (1 outlier: buttoning a shirt); sex (6 outliers: fastening a snap, shelling hazel nuts, hammering a nail, wrapping up gifts, peeling potatoes, spreading butter); age (4 outliers: threading a needle, wrapping up gifts, spreading butter, fastening a snap); and ability (2 outliers: sharpening a pencil, cutting meat).

Test-retest:
No studies have reported on the test-retest reliability of the ABILHAND.

Intra-rater:
No studies have reported on the intra-rater reliability of the ABILHAND.

Inter-rater:
No studies have reported on the inter-rater reliability of the ABILHAND. Note, however, that inter-rater reliability is less necessary because administration of the ABILHAND does not rely on clinician observation of patient performance.

Validity

Content:

Penta et al. (2001) examined the measure of perceived difficulty of the ABILHAND in a sample of 103 patients with chronic stroke. Item distribution ranged from 1.72 to -2.18 logits. All items fit the Rasch model and the 23 items define a common continuum of manual ability. All point measure correlation coefficients (RPM) were positive, indicating that all items are coherent with the overall questionnaire and contribute to the measurement of manual ability. Although fit statistics indicated that most activities adequately measure recovery of manual ability in chronic stroke, 1 item obtained an outlier outfit value (buttoning up a shirt, mean square=1.64), and four items obtained outlier infit values (cutting meat, mnsq=0.69; shelling hazel nuts, mnsq=1.33; tearing open a packet of chips, mnsq=1.22; sharpening a pencil, mnsq=0.65).

Penta et al. (2001) examined the content validity of the ABILHAND by comparing the ranking of item difficulty with expert opinion of four occupational therapists regarding the involvement of the affected hand in each activity. The following classifications were used: (1) the item does not require the affected limb, if it is broken down into several unimanual sequences; (2) the task requires the affected upper limb to stabilize an object but does not involve any fingers; and (3) the task requires precision grip, grip strength, dexterity or any digital activity from the affected side. Findings indicate that more difficult items also tend to require a greater degree of use of the affected limb, whereas easier items do not require the use of the affected limb.

Simone et al. (2011) examined the validity of the ABILHAND in a sample of 126 patients with chronic upper limb impairment resulting from stroke (n=83), multiple sclerosis (n=17), peripheral or cerebellar ataxia (n=13), spinal cord lesion (n=10) or Parkinson’s disease (n=3), and 24 healthy subjects. Modeled scores explained 84% of observed variance. The main factor across the residuals explained only 11.4% of the residual variance (1.8% of the total variance).
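The relationship between the two percentages reported for the residual factor can be checked with simple arithmetic, assuming that the residual variance is the share of observed variance not explained by the modeled scores (variable names are illustrative only):

```python
modeled_share = 0.84                    # variance explained by modeled scores
residual_share = 1.0 - modeled_share    # variance left in the residuals
main_factor_share_of_residual = 0.114   # main residual factor, as a share of residual variance

# Share of the total variance attributable to the main residual factor.
main_factor_share_of_total = residual_share * main_factor_share_of_residual
print(f"{main_factor_share_of_total:.1%}")  # prints 1.8%
```

The product of 16% and 11.4% is about 1.8%, which agrees with the figure given for the total variance.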

Criterion:

Concurrent:
Simone et al. (2011) compared the concurrent validity of the ABILHAND, Jamar handgrip, Box and Block Test (BBT), Purdue pegboard test and Nine Hole Peg Test (NHPT) in a sample of 126 patients with chronic upper limb impairment resulting from stroke, multiple sclerosis, sensory or cerebellar ataxia, spinal cord lesion or Parkinson’s disease, and 24 healthy subjects, using Pearson’s r. Adequate correlations were found between the ABILHAND and the Jamar handgrip (r=0.377, p=0.001), BBT (r=0.481, p<0.001) and the Purdue pegboard test (r=0.493, p<0.001), and an adequate negative correlation was found between the ABILHAND and the NHPT (r=-0.370, p=0.007).

Predictive:
No studies have reported on the predictive validity of the ABILHAND.

Construct:

Convergent/Discriminant:
No studies have reported on the convergent/discriminant validity of the ABILHAND.

Known Group:
Penta et al. (2001) examined the relationship of the ABILHAND measures to other demographic and clinical variables in a sample of 103 patients with chronic stroke, using univariate ANOVA and correlation coefficients (Mann-Whitney U test, Kruskal-Wallis H tests, Spearman ρ, Pearson r). Tests revealed no significant differences in ABILHAND measures according to demographic indexes of country (Belgium/Italy), sex or age. Clinical variables such as time since stroke, affected side (dominant/nondominant), lesion site and tactile sensitivity of either limb (measured using the Semmes-Weinstein tactile sensation test) were not significantly related to ABILHAND measures. There was a poor correlation between ABILHAND measures and grip strength (Jamar handgrip, R=0.242, P<0.014) and manual dexterity (Box and Block Test, R=0.248, P=0.012) of the unaffected limb, and a poor negative correlation with depression (Geriatric Depression Scale, ρ=-0.213, P=0.030). ABILHAND measures demonstrated an adequate correlation with grip strength (R=0.562, P<0.001) and manual dexterity (R=0.598, P<0.001) of the affected limb, and an excellent correlation with upper limb motricity (Brunnstrom upper limb motricity test, ρ=0.730, P<0.001). Results showed a direct relationship between ABILHAND measures of manual ability and impairment on the affected side, where more complex combinations of manual dexterity with/without grip strength and/or upper limb motricity impairment correlated with higher manual disability.

Simone et al. (2011) examined the known-group validity of the ABILHAND in a sample of 126 patients with chronic upper limb impairment resulting from stroke, multiple sclerosis, sensory or cerebellar ataxia, spinal cord lesion or Parkinson’s disease, and 24 healthy subjects, using Kruskal-Wallis test. Highly significant differences (P<0.001) were found between patients with tetraparesis, hemiparesis, other neurological impairments (multiple sclerosis, Parkinson’s disease, ataxia) and control participants.

Responsiveness

Simone et al. (2011) reported a satisfactory match between the distribution of item difficulty levels and patients’ ability levels. The average ability of healthy controls vs. patients with chronic upper limb impairment resulting from stroke, multiple sclerosis, sensory or cerebellar ataxia, spinal cord lesion or Parkinson’s disease was 89 (standard error=8) vs. 63 (standard error=17).

Sensitivity & Specificity:
Simone et al. (2011) examined the sensitivity and specificity of the ABILHAND in a sample of 126 patients with chronic upper limb impairment resulting from stroke, multiple sclerosis, sensory or cerebellar ataxia, spinal cord lesion or Parkinson’s disease, and 24 healthy subjects. An “impairment-normality” cut-off was computed through logistic regression and a lower cut-off score of 80/100 is proposed for healthy controls (area under ROC curve=0.9097, p<0.05). This allowed correct classification of patients vs. healthy controls with a 92% sensitivity rate and 80% specificity rate, whereby 82% of the sample was correctly classified.
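For readers less familiar with these indices, the sketch below shows how sensitivity and specificity follow from applying a cut-off to a set of scores. The scores are invented for illustration and are not data from Simone et al. (2011); treating a score strictly below the cut-off as "impaired" is also an assumption of this sketch.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(scores: Sequence[float],
                            is_patient: Sequence[bool],
                            cutoff: float = 80.0) -> Tuple[float, float]:
    """Return (sensitivity, specificity); needs at least one patient and one control."""
    tp = fn = tn = fp = 0
    for score, patient in zip(scores, is_patient):
        flagged = score < cutoff  # assumption: below the cut-off counts as impaired
        if patient and flagged:
            tp += 1   # patient correctly classified as impaired
        elif patient:
            fn += 1   # patient missed by the cut-off
        elif flagged:
            fp += 1   # healthy control wrongly classified as impaired
        else:
            tn += 1   # healthy control correctly classified
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical 0-100 scores: four patients followed by three healthy controls.
example_scores = [55, 63, 70, 85, 90, 88, 78]
example_labels = [True, True, True, True, False, False, False]
sens, spec = sensitivity_specificity(example_scores, example_labels)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Sensitivity is the proportion of patients falling below the cut-off, and specificity is the proportion of healthy controls at or above it.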

References

  • Ashford, S., Slade, M., Malaprade, F., & Turner-Stokes, L. (2008). Evaluation of functional outcome measures for the hemiparetic upper limb: a systematic review. Journal of Rehabilitation Medicine, 40, 787-95.
  • Connell, L.A. & Tyson, S.F. (2012). Clinical reality of measuring upper-limb ability in neurological conditions: a systematic review. Archives of Physical Medicine and Rehabilitation, 93, 221-8.
  • Gustafsson, S., Sunnerhagen, K.S, & Dahlin-Ivanoff, D. (2004). Occupational therapists’ and patients’ perceptions of ABILHAND, a new assessment tool for measuring manual ability. Scandinavian Journal of Occupational Therapy, 11, 107-17.
  • Mpofu, E. & Oakland, T. (2010). Rehabilitation and Health Assessment: Applying ICF Guidelines. New York: Springer Publishing Company.
  • Penta, M., Tesio, L., Arnould, C., Zancan, A., & Thonnard, J-L. (2001). The ABILHAND questionnaire as a measure of manual ability in chronic stroke patients: Rasch-based validation and relationship to upper limb impairment. Stroke, 32, 1627-34.
  • Simone, A., Rota, V., Tesio, L., & Perucca, L. (2011). Generic ABILHAND questionnaire can measure manual ability across a variety of motor impairments. International Journal of Rehabilitation and Research, 34, 131-40.

See the measure

How to obtain the ABILHAND:

The ABILHAND is available in Penta, M., Tesio, L., Arnould, C., Zancan, A., & Thonnard, J-L. (2001). The ABILHAND questionnaire as a measure of manual ability in chronic stroke patients: Rasch-based validation and relationship to upper limb impairment. Stroke, 32, 1627-34.

Action Research Arm Test (ARAT)

Evidence Reviewed as of before: 09-06-2011
Author(s)*: Sabrina Figueiredo, BSc
Editor(s): Lisa Zeltzer, MSc OT; Nicol Korner-Bitensky, PhD OT; Elissa Sitcoff, BA BSc

Purpose

The Action Research Arm Test (ARAT) is an evaluative measure to assess specific changes in limb function among individuals who sustained cortical damage resulting in hemiplegia (Lyle, 1981). It assesses a client’s ability to handle objects differing in size, weight and shape and therefore can be considered to be an arm-specific measure of activity limitation (Platz, Pinkowski, Kim, di Bella, & Johnson, 2005).

In-Depth Review

Available versions

The ARAT was developed by Ronald Lyle in 1981 by adapting the Upper Extremity Function Test (UEFT) (Carroll, 1965). The UEFT test administration and scoring were simplified, the time required to administer the test was shortened, and items were grouped based on the hierarchical scale (Guttman Scale) (Lang, Wagner, Dromerick, & Edwards, 2006). Due to the need for more specific and detailed instructions related to the client’s position, scoring and test administration, Yozbatiran, Der-Yeghiaian, and Cramer (2008) proposed a standardized approach to the ARAT.

Features of the measure

Items:

The ARAT consists of 19 items grouped into four subscales: grasp, grip, pinch, and gross movement. Each subscale constitutes a hierarchical Guttman scale, which means that all items are ordered according to ascending difficulty. In the ARAT, if the client succeeds in completing the most difficult item in a subscale, this suggests he/she will succeed in the easier items for that same subscale. Similarly, failure on an item suggests the client will be unable to complete the remaining more challenging items in the subscale.

According to the rules defined by Lyle (1981), the client must first try to perform the most difficult task in a subscale. If the maximum score (score = 3) is obtained for this task then the maximum score for this entire subscale should be assigned, and the evaluator should move to the next subscale to be administered. When the client is unable to complete the most difficult item (scoring between 0-2), then the easiest item in this specific subscale should be performed. If the client fails completely (score = 0) when performing the easiest task, then the other intermediate items must not be tested, the entire subscale should be scored as zero, and the evaluator should then move to the next subscale. However, if the client succeeds at the easiest task either partially (score = 1 or 2) or completely (score = 3), then all the other tasks in that same subscale should be tested before moving to the next subscale. Following these rules, the items administered will range from a minimum of 4 to a maximum of 19 (van der Lee, Roorda, & Lankhorst, 2002).
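The administration rules described above can be sketched as a small Python function. This is an illustration under stated assumptions rather than an official scoring tool: the function names are hypothetical, each subscale is represented as a list in which the first entry is the most difficult item and the second entry is the easiest item, and the list holds the score the client would obtain on each item if it were tested. Subscale sizes of 6, 4, 6 and 3 items follow the ARAT subscale structure.

```python
from typing import Sequence, Tuple


def score_subscale(item_scores: Sequence[int]) -> Tuple[int, int]:
    """Return (subscale score, number of items actually administered)."""
    if len(item_scores) < 2:
        raise ValueError("a subscale needs at least two items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("item scores must be 0, 1, 2 or 3")
    n_items = len(item_scores)
    hardest, easiest = item_scores[0], item_scores[1]
    if hardest == 3:
        # Maximum score on the most difficult item: full marks for the subscale.
        return 3 * n_items, 1
    if easiest == 0:
        # Complete failure on the easiest item: subscale scored zero, rest skipped.
        return 0, 2
    # Otherwise every item in the subscale is tested and the scores are summed.
    return sum(item_scores), n_items


def score_arat(subscales: Sequence[Sequence[int]]) -> Tuple[int, int]:
    """Return (total ARAT score, total number of items administered)."""
    total = administered = 0
    for scores in subscales:
        subscale_score, n_tested = score_subscale(scores)
        total += subscale_score
        administered += n_tested
    return total, administered


# Hypothetical clients; subscales in the order grasp, grip, pinch, gross movement.
best = [[3] * 6, [3] * 4, [3] * 6, [3] * 3]
mixed = [[2, 3, 3, 3, 2, 2], [1, 3, 2, 2], [0, 1, 0, 0, 0, 0], [2, 0, 1]]
print(score_arat(best))   # (57, 4)
print(score_arat(mixed))  # (24, 18)
```

A client who obtains the maximum score on the most difficult item of every subscale is tested on only 4 items, whereas a client who partially succeeds on the easiest item of every subscale is tested on all 19.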

The ARAT must be administered in a formal setting, since a specially designed table and chair are required (see equipment section for more information). For the starting position, the client should be seated in a chair, with a firm back and no armrests. The client’s trunk should be in contact with the back of the chair at all times during the test performance. Instructions about the required seating posture should be provided to the client prior to initiating the test. Additionally, reminders about the maintenance of this position should be given to the client when this condition is not respected. The client’s feet should be in contact with the floor throughout testing (van der Lee, DeGroot, Beckerman, Wagenaar, Lankhorst, & Bouter, 2001a; Yozbatiran et al., 2008). Both hands should be tested, beginning with the non- or less-affected hand, in order to practice and register baseline scores. Should the client be unable to understand the instructions for the required task, the evaluator should demonstrate the task and allow the client to try it as a trial (Yozbatiran et al., 2008). To facilitate recording the time for each task, the client’s hands should start and finish the task with palms down on the table. However, for the gross movement tasks, the client’s hands should be placed pronated on their lap (Lyle, 1981; Yozbatiran et al., 2008).

In the grasp and pinch subscales, testing materials are lifted 37 cm from the surface of the table to the top of the shelf. In the grip subscale, testing materials are moved from one side of the table to the other. Finally, in the gross movement subscale, the client is requested to place the hand being tested either behind his/her head, on top of his/her head, or to his/her mouth (Lyle, 1981; Hsieh, Hsueh, Chiang, & Lin, 1998; Hsueh, Lee, & Hsieh, 2002a). The proper sequence for testing is 1) grasp subscale, 2) grip subscale, 3) pinch subscale, 4) gross movement subscale (Lyle, 1981). The ARAT comes with simple instructions to guide the evaluator on scoring and administering the test (Lyle, 1981).

Scoring:

The ARAT is scored on a four-level ordinal scale (0-3) (Lyle, 1981).

  • 0 = cannot perform any part of the test
  • 1 = performs the test partially
  • 2 = completes the test, but takes an abnormally long time
  • 3 = performs the test normally

In order to facilitate scoring, time limits have been suggested (Wagenaar, Meijer, van Wierinen, Kuik, Hazenberg, Lindeboom, Wichers, & Rijswijk, 1990; Yozbatiran et al., 2008). Incorporating the time limits into Lyle’s scoring definition, the new scoring system would be:

  • 0 = cannot perform any part of the test;
  • 1 = performs the test partially;
  • 2 = completes the test, but takes an abnormally long time, varying from 5 to 60 seconds.

    If a client takes more than 60 seconds to perform an item, the evaluator should interrupt after 60 seconds and a score of 1 is given on that specific item.

  • 3 = performs the test normally in less than 5 seconds.
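The time-based scoring above can be expressed as a short decision function. The function and parameter names are hypothetical, and the handling of the exact boundaries (5 seconds scored as 2, 60 seconds scored as 2) is this sketch's reading of "less than 5 seconds" and "more than 60 seconds".

```python
def score_item(performed_any_part: bool, completed: bool, seconds: float) -> int:
    """Score one ARAT item on the 0-3 scale using the suggested time limits."""
    if not performed_any_part:
        return 0  # cannot perform any part of the test
    if not completed or seconds > 60:
        # Partial performance, or interrupted by the evaluator after 60 seconds.
        return 1
    if seconds < 5:
        return 3  # performs the test normally
    return 2      # completes the test, but takes 5 to 60 seconds


print(score_item(True, True, 3.2))    # 3
print(score_item(True, True, 12.0))   # 2
print(score_item(True, True, 75.0))   # 1
print(score_item(False, False, 0.0))  # 0
```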

The subscale scores range according to the number of items on each subscale, as follows:

  • Grasp subscale: 6 items, score range 0-18
  • Grip subscale: 4 items, score range 0-12
  • Pinch subscale: 6 items, score range 0-18
  • Gross movement subscale: 3 items, score range 0-9

The total score on the ARAT ranges from 0 to 57, with the lowest score indicating that no movements can be performed, and the upper score indicating normal performance. Thus, higher scores indicate better performance (Lang et al., 2006; van der Lee et al., 2002). The ARAT score is a continuous measure, with no categorical cutoff scores. Therefore, the score obtained on the ARAT does not allow clients to be classified into categories such as normal, mildly limited, or severely limited.

Time:

The time required to complete the ARAT will depend on the number of items administered. Based on its hierarchical design, the ARAT was constructed to save testing time. Thus, no more than 7-10 minutes should be required to assess a client with stroke (De Weerdt & Harrison, 1985). However, if all 19 items are performed, the ARAT usually takes 20 minutes to administer (van der Lee et al., 2002). In one study by Hsieh and colleagues (1998), the ARAT took, on average, 8 minutes to administer to clients with stroke.

Subscales:

The ARAT is divided into four subscales: grasp, grip, pinch and gross movement.

The grasp and pinch subscales have 6 items each, the grip subscale has 4 items, and the gross movement has 3 items (Lyle, 1981).

Equipment:

Standardized equipment is required to administer the ARAT. It can be ordered only from representatives in the Netherlands. The cost of this equipment is approximately 850 Euros ($1200 CAD), with an additional delivery fee of 179 Euros ($252 CAD).

The complete ARAT kit consists of:

  • A specially designed table of 92cm x 45cm x 83cm high, with a shelf of 93cm x 10cm, positioned 37cm above the main surface of the table (Lyle, 1981; Hsueh et al., 2002a).
  • A chair with back rest and no arm rests, that should be placed 44cm above floor level (Lyle, 1981; Hsueh et al., 2002a).
  • Wooden cubes with sides of 2.5, 5, 7.5 and 10cm (Lyle, 1981; Hsueh et al., 2002a).
  • A cricket ball 7.5cm in diameter (Lyle, 1981; Hsueh et al., 2002a).
  • Two alloy tubes: one 2.25cm in diameter x 11.5 cm long, the second one 1.0cm in diameter x 16cm long (Lyle, 1981; Hsueh et al., 2002a).
  • A washer and bolt (a type of screw with its anchor) (Lyle, 1981; Hsueh et al., 2002a).
  • Two glasses (Lyle, 1981; Hsueh et al., 2002a).
  • A marble 1.5cm in diameter (Lyle, 1981; Hsueh et al., 2002a).
  • A ball bearing 6mm in diameter (Lyle, 1981; Hsueh et al., 2002a).
  • A stopwatch (Wagenaar et al., 1990; Yozbatiran et al., 2008)
  • Paper and pencil for the evaluator.

Training:

None typically reported.

Alternative forms of the Action Research Arm Test

None.

Client suitability

Can be used with:

  • The ARAT was constructed for assessing recovery of upper limb function following cortical damage (Lyle, 1981).
  • Clients with stroke.

Should not be used in:

  • When administering the ARAT to clients with finger amputation, the pinch subscale should be scored as 0, as should all other tasks that require movement of an amputated body part (Yozbatiran et al., 2008).

In what languages is the measure available?

There are no official translations of the ARAT.

Nevertheless, some peer-reviewed publications from the Netherlands and Taiwan have used the ARAT as an outcome measure, which may indicate that instructions have been informally translated to other languages (Hsieh et al., 1998; Hsueh et al., 2002a; van der Lee et al., 2002).

Summary

What does the tool measure? The ARAT measures specific changes in limb function among individuals who sustained cortical damage resulting in hemiplegia.
What types of clients can the tool be used for? The ARAT can be used with, but is not limited to, clients with stroke.
Is this a screening or assessment tool? Assessment
Time to administer: An average of 7 to 10 minutes.
Versions: There are no alternative versions.
Other Languages: There are no official translations.
Measurement Properties
Reliability
Internal consistency:
One study examined the internal consistency of the ARAT and reported excellent internal consistency using Cronbach’s alpha.

Test-retest:
Three studies have examined the test-retest reliability of the ARAT. All reported excellent test-retest reliability using ICCs.

Intra-rater:
Four studies have examined the intra-rater reliability of the ARAT and reported excellent intra-rater reliability using Spearman rho correlation, intraclass correlation coefficients (ICC) and weighted kappa.

Inter-rater:
Seven studies examined the inter-rater reliability of the ARAT and reported excellent inter-rater reliability using Spearman rho correlation, ICC and weighted kappa.

Validity
Criterion:
Concurrent:
One study has examined the concurrent validity of the ARAT and reported adequate to excellent correlations with the Box and Block Test (BBT) and the Nine-Hole Peg Test (NHPT) at pre and post-treatment.

Predictive:
No studies have examined the predictive validity of the ARAT.

Construct:
Convergent:
Seven studies examined convergent validity of the ARAT and reported excellent correlations between the ARAT and the Brunnstrom-Fugl-Meyer test; the upper extremity subscale of the Motor Assessment Scale; the Motricity Index; the upper extremity movements of the Modified Motor Assessment Chart; the BBT; the motor function subscore of the Fugl-Meyer test; the Hemispheric Stroke Scale; upper extremity strength and grasp speed. Adequate correlations were reported between the ARAT and the passive joint motion/joint pain subscore of the Fugl-Meyer test, the Functional Independence Measure and spasticity. Poor correlations were reported between the ARAT and the sensation score of the Fugl-Meyer test, the Ashworth Scale, the Modified Barthel Index, the National Institutes of Health Stroke Scale, light touch sensation and pain.

Floor/Ceiling Effects – One study examined the floor/ceiling effects of the ARAT in clients with acute stroke and reported that at earlier phases of the stroke, floor effects were poor. At discharge from the acute rehabilitation ward, ceiling effects on the ARAT were adequate.
– One study examined the floor/ceiling effects of the ARAT in stroke clients with mild to moderate hemiparesis and reported adequate floor and ceiling effects.
Sensitivity/ Specificity No studies have examined the specificity of the ARAT.
Does the tool detect change in patients? Six studies have examined the responsiveness of the ARAT and reported a moderate to large Standardized Response Mean, moderate to large effect sizes and a large responsiveness ratio; the ARAT is therefore able to detect change in clients with stroke.
Acceptability When administering the ARAT to clients with upper extremity amputations, attention is required when scoring (i.e. a score of 0 is given).
Feasibility The administration of the ARAT is quick and simple, but requires standardized equipment.
How to obtain the tool? Information on the ARAT can be obtained in the study by Lyle (1981), Hsieh et al. (1998), van der Lee et al. (2002), Rabadi & Rabadi (2006), and Yozbatiran et al. (2008) and at the website: http://www.aratest.eu/Index_english.htm Standardized equipment can be purchased from the following website: http://www.aratest.eu/ or from http://www.saliarehab.com/

Psychometric Properties

Overview

We conducted a literature search to identify all relevant publications on the psychometric properties of the Action Research Arm Test (ARAT) in individuals with stroke. We identified twelve studies. The ARAT appears to be prone to floor effects early after stroke.

Floor/Ceiling Effects

Hsueh and Hsieh (2002b) examined floor and ceiling effects for the ARAT and the Upper Extremity Motor Assessment Scale (Carr, Shepherd, Nordholm, & Lynne, 1985) in 48 clients with acute stroke. Participants were assessed at admission and discharge from an acute rehabilitation ward. At admission, the ARAT total score demonstrated a poor floor effect, with 52.1% of participants scoring 0. All subscales were also classified as having a poor floor effect: 72.9% of participants were unable to perform the pinch subscale, 70.8% were unable to perform the grasp and grip subscales, and 52.1% were unable to complete the gross movement subscale. At discharge, the ARAT total score demonstrated an adequate ceiling effect, with only 7% of participants scoring the maximal value. When the ARAT subscales were analyzed individually, the gross movement subscale presented the poorest ceiling effect, with 29.2% of participants achieving the maximum score, followed by the grasp subscale at 27%. The grip and pinch subscales had the best classification, with adequate ceiling effects of 18.8% and 16.7%, respectively.

By comparison, at admission 58% of participants scored the minimal value on the Upper Extremity Motor Assessment Scale, indicating a poor floor effect. However, at discharge the Upper Extremity Motor Assessment Scale demonstrated a smaller ceiling effect than the ARAT, with only 4.3% of participants obtaining the maximum score.

Nijland et al. (2010) investigated the psychometric properties of the ARAT and Wolf Motor Function Test in 40 patients with stroke with mild to moderate hemiparesis. The ARAT showed adequate floor and ceiling effects with only 12.5 to 17% of patients scoring the lowest or highest scores.
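The percentages reported above can be reproduced with a short calculation. The following is a minimal sketch, not the method used by the cited authors. It assumes the ARAT total score ranges from 0 to 57 and that a floor or ceiling effect is rated "poor" when more than 20% of participants obtain the extreme score; this cutoff is an assumption chosen because it is consistent with the classifications reported in this section. The function names and the score list are hypothetical.

```python
def floor_ceiling(scores, min_score=0, max_score=57):
    """Return the percentage of participants at the minimum and maximum score."""
    n = len(scores)
    floor_pct = 100.0 * sum(s == min_score for s in scores) / n
    ceiling_pct = 100.0 * sum(s == max_score for s in scores) / n
    return floor_pct, ceiling_pct


def classify_effect(pct, cutoff=20.0):
    """Assumed rating rule: more than `cutoff` percent at the extreme is 'poor'."""
    return "poor" if pct > cutoff else "adequate"


# Hypothetical total scores for 48 participants (illustrative only, not study data):
# 25 score the minimum, 20 score mid-range, 3 score the maximum.
scores = [0] * 25 + [30] * 20 + [57] * 3

floor_pct, ceiling_pct = floor_ceiling(scores)
print(f"floor: {floor_pct:.1f}% ({classify_effect(floor_pct)})")
print(f"ceiling: {ceiling_pct:.1f}% ({classify_effect(ceiling_pct)})")
```

With 25 of 48 participants scoring 0, the floor percentage works out to 52.1%, which illustrates how a figure such as the one reported at admission arises.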

Reliability

Internal Consistency:
Nijland et al. (2010) investigated the internal consistency of the ARAT in 40 patients with stroke with mild to moderate hemiparesis. Internal consistency of the ARAT, as calculated using Cronbach’s coefficient alpha, was excellent (α = 0.98).
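For readers unfamiliar with the statistic, Cronbach’s alpha can be computed from item-level scores as k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The sketch below is an illustration of that standard formula only; the function name and the small data set are hypothetical and do not come from Nijland et al. (2010).

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of participants, each a list of item scores."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose: one tuple of scores per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical scores (0-3 scale) for five participants on four items.
example = [
    [0, 0, 1, 0],
    [1, 1, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 3],
    [1, 2, 2, 1],
]
print(round(cronbach_alpha(example), 2))
```

Values close to 1 indicate that the items vary together, which is what an alpha of 0.98 suggests for the ARAT items.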

Test-retest:
Note: From the descriptions provided of the following studies it appears that some authors called the testing test-retest reliability while others called the same analysis intra-rater reliability.

Lyle (1981) examined test-retest reliability in 20 individuals who sustained cortical damage, either from stroke or traumatic brain lesion. The mean age was 53 years, ranging from 26 to 72 years. Participants were re-assessed with a 1-week interval by the same rater and under the same conditions. The test-retest reliability, as calculated using Pearson correlation, was excellent (r = 0.98).

Hsueh, Lee, and Hsieh (2002a) evaluated test-retest reliability of the ARAT performed using a regular table instead of the specially designed table for this test in 61 individuals with sub-acute stroke and a mean age of 63 years. Participants were re-assessed after a two-day interval by the same rater. The test-retest reliability, as calculated using the Intraclass Correlation Coefficient (ICC), was excellent for the total score (ICC = 0.99) as well as for the grasp, grip, pinch and gross movement subscales (ICC = 0.99, 0.98, 0.96 and 0.95, respectively).

Platz, Pinkowski, van Wijck, Kim, di Bella, and Johnson (2005) estimated test-retest reliability for the ARAT, the Box and Block Test (Cromwell, 1965; Mathiowetz, Volland, Kashman, & Weber, 1985a), and the Fugl-Meyer Test upper extremity items (including items from the Motor function, Sensation and Passive Joint Motion/Joint pain subscores) (Fugl-Meyer, Jääskö, Leyman, Olsson, & Steglind, 1975) in 23 participants with upper extremity paresis either from stroke, multiple sclerosis, or traumatic brain injury. The participant’s most affected arm was re-assessed 1 week later by the same rater. The test-retest reliability of the ARAT total score, as calculated using ICC’s and Spearman rho correlation, was excellent (ICC = 0.96 and rho = 0.96). Furthermore, test-retest reliabilities for each subscale were all excellent: grasp (ICC = 0.94 and rho = 0.96), grip (ICC = 0.94 and rho = 0.95), pinch (ICC = 0.89 and rho = 0.89) and gross movement (ICC = 0.97 and rho = 0.97).
Note: These results apply only to the most affected upper limb.

Intra-rater:
Wagenaar, Meijer, van Wieringen, Kuik, Hazenberg, Lindeboom, Wichers and Rijswijk (1990) evaluated intra-rater reliability in seven patients with acute stroke. The timeframe for assessments was not provided by the authors. Intra-rater reliability, as calculated using Spearman rho correlation, was excellent (rho = 0.99).

Van der Lee, de Groot, Beckerman, Wagenaar, Lankhorst, and Bouter (2001a) estimated intra-rater reliability in 20 patients with chronic stroke and a median age of 62 years. Participants were evaluated by the same rater at three points in time. At the baseline assessment participants were videotaped. The second assessment was 4-27 months following the first assessment, and the final assessment was 4-6 weeks after. Scoring of the last two assessments was based on the videotape recorded at baseline. Intra-rater reliability results were analyzed between the first two assessments, where scoring sources were different (live vs. videotape), and between the last two assessments, where scoring sources were the same (videotape only). Intra-rater reliability, as calculated using ICC and Spearman rho correlation, was excellent (ICC = 0.99 and rho = 0.99), independent of scoring sources. Intra-rater reliability, as calculated using weighted kappa, was also excellent: scoring from the same information source resulted in kappa = 1.00, versus only a slightly lower value when scoring from two different information sources (kappa = 0.94). The gross movement subscale showed the lowest weighted kappa value (kappa = 0.83), suggesting that this subscale had the lowest agreement level.

Yozbatiran, Der-Yeghiaian, and Cramer (2008) examined intra-rater reliability in 8 clients with chronic stroke. Participants were re-assessed by the same rater and under the same conditions with a 1-week interval. Intra-rater reliability for the total score, as calculated using ICC and Spearman rho correlation, was excellent (ICC = 0.99 and rho = 0.99). Additionally, the same excellent level of intra-rater reliability was found for the grasp, grip, pinch, and gross motor movement subscales (ICC = 0.98 and rho = 0.93; ICC = 0.97 and rho = 0.93; ICC = 0.99 and rho = 0.98; ICC = 0.93 and rho = 0.91, respectively).

Nijland et al. (2010) investigated the psychometric properties of the ARAT and Wolf Motor Function Test in 40 patients with stroke with mild to moderate hemiparesis. Eighteen patients participated in the reproducibility testing of the ARAT and were assessed twice by the same observer approximately 10 days apart. Intra-rater reliability, as analyzed using the ICC, was found to be excellent (ICC = 0.97).

Inter-rater:
Lyle (1981) examined inter-rater reliability in 20 individuals who had sustained cortical damage, either from stroke or traumatic brain injury. The mean age was 53 years, ranging from 26 to 72 years. Participants were assessed independently by two different raters. Agreement between raters, as calculated using Pearson correlation, was excellent (r = 0.99).

Hsieh, Hsueh, Chiang, and Lin (1998) assessed inter-rater reliability in 50 clients with stroke. Their mean age was 65 years old. Participants were evaluated independently, on three different days, by three raters. ICC for the total score showed excellent agreement (ICC = 0.98). Agreement between raters was also excellent for grasp, grip, pinch and gross movement subscales (ICC = 0.98; ICC = 0.96; ICC = 0.96; ICC = 0.95, respectively).

Van der Lee et al. (2001a) estimated inter-rater reliability in 20 patients with chronic stroke and a median age of 62 years old. Participants were videotaped and scored independently by two raters. Inter-rater reliability, as calculated using ICC, weighted kappa, and Spearman rho correlation, was excellent (ICC = 0.98; kappa = 0.93; rho = 0.99). With respect to the individual subscales, the gross movement scale had the lowest weighted kappa value (kappa = 0.87), suggesting this subscale has the lowest agreement between raters.

Hsueh, Lee, and Hsieh (2002a) evaluated inter-rater reliability of the ARAT performed with a regular table instead of the specially designed table for this test in 61 individuals with sub-acute stroke and a mean age of 63 years old. Participants were re-assessed with a two-day interval by three different raters. ICC for the total score showed excellent agreement (ICC = 0.99) as well as for grasp, grip, pinch and gross movement subscales (ICC = 0.99; ICC = 0.98; ICC = 0.96; ICC = 0.94, respectively).

Platz et al. (2005) analyzed inter-rater reliability of the ARAT, the Box and Block Test and the Fugl-Meyer Test upper extremity items (including items from the Motor function, Sensation and Passive Joint Motion/Joint pain subscores) in 44 individuals with upper limb paresis either from stroke, multiple sclerosis, or traumatic brain injury. Participants had the most affected arm videotaped and scored independently by two raters. Inter-rater reliability for the ARAT total score, as calculated using the ICC and Spearman rho correlation, was excellent (ICC = 0.99 and rho = 0.99). Inter-rater reliability was also excellent for each subscale: grasp (ICC = 0.99 and rho = 0.99), grip (ICC = 0.96 and rho = 0.95), pinch (ICC = 0.99 and rho = 0.99) and gross movement (ICC = 0.98 and rho = 0.98).
Note: These results apply only to the most affected upper limb.

Yozbatiran et al. (2008) evaluated inter-rater reliability in 9 clients with chronic stroke. Participants were scored simultaneously and independently by two raters. Inter-rater reliability for the total score, as calculated using the ICC and Spearman rho correlation, was excellent (ICC = 0.99 and rho = 0.96). The same excellent level of inter-rater reliability was found for the grasp, grip, pinch and gross motor movement subscales (ICC = 0.99 and rho = 1; ICC = 0.99 and rho = 0.99; ICC = 0.99 and rho = 0.98; ICC = 0.97 and rho = 0.93, respectively).

Nijland et al. (2010) investigated the psychometric properties of the ARAT and Wolf Motor Function Test in 40 patients with stroke with mild to moderate hemiparesis. Eighteen patients participated in the reproducibility testing of the ARAT and were assessed in random order by two observers, within one week. Inter-rater reliability, as analyzed using the ICC, was found to be excellent (ICC = 0.92).

Validity

Content:

Lyle (1981) generated the 19 ARAT items from the 33 items of the Upper Extremity Function Test (UEFT; Carroll, 1965). Items were removed if they showed a low inter-item correlation, if they were redundant (confirmed by a very high inter-item correlation, above r = 0.9), or if they were extremely difficult to perform. Nevertheless, ARAT items were not based on a theoretical model (Finch, Brooks, Stratford, & Mayo, 2002).

Criterion:

Concurrent:
No gold standard exists against which to compare the ARAT.

Lin, Chuang, Wu, Hsieh and Chang (2010) compared the concurrent validity of the ARAT, Box and Block Test (BBT) and Nine-Hole Peg Test (NHPT) for evaluating hand dexterity in 59 patients with stroke. The Fugl-Meyer Assessment of Sensorimotor Recovery After Stroke (FMA), Motor Activity Log (MAL) and Stroke Impact Scale (SIS) were also administered to assess the concurrent validity of the ARAT, BBT and NHPT. Using Spearman rank correlation coefficient, the ARAT, BBT and NHPT were found to have adequate to excellent correlations at pre-treatment (ranging from rho = -0.55 to -0.80) and post-treatment (ranging from rho = -0.57 to -0.71). In addition, the ARAT and BBT were found to have adequate correlations with the FMA, MAL and SIS (ranging from rho = 0.31 to 0.59); however, the NHPT had only poor to adequate correlations with the FMA and MAL (ranging from rho = -0.16 to -0.33) and adequate to excellent correlations with the SIS (ranging from rho = -0.58 to -0.66). When considering the results of both the responsiveness and validation components of the study, the ARAT and BBT are believed to be more appropriate than the NHPT for evaluating dexterity.

Predictive:
No studies have examined the predictive validity of the ARAT.

Construct:

Convergent/Discriminant:
De Weerdt and Harrison (1985) evaluated the convergent validity of the ARAT by comparing it to the Fugl-Meyer test (Fugl-Meyer et al., 1975) in 53 clients with acute stroke. Their mean age was 68 years. Correlations were calculated at two points in time after stroke onset using Spearman correlation coefficient. Excellent correlations were found between the ARAT and Fugl-Meyer test at 2 months (rho = 0.91) and at 8 months (rho = 0.94) post-stroke.

Wagenaar, Meijer, van Wieringen, Kuik, Hazenberg, Lindeboom, Wichers and Rijswijk (1990) evaluated the convergent validity of the ARAT by comparing it to the Sollerman test (Jacobson-Sollerman & Sperling, 1977) in seven patients with acute stroke. An excellent correlation, as calculated using Spearman rho, was found (rho = 0.94).
Note: The Sollerman test measures hand grip function using 20 different daily life activities requiring hand movements.

Hsieh et al. (1998) assessed convergent validity of the ARAT by comparing it to the Upper Extremity portion of the Motor Assessment Scale (Carr et al., 1985), the arm subscale of the Motricity Index (Demeurisse, Demol, & Robaye, 1980), and the upper extremity movements of the Modified Motor Assessment Chart (Lindmark & Hamrin, 1988) in 50 clients with stroke. The mean age of clients was 65 years. Correlations were calculated using Pearson Correlation Coefficients. Excellent correlations were found between the ARAT and the Upper Extremity part of the Motor Assessment Scale (r = 0.96), the Motricity Index (r = 0.87) and the upper extremity movements of the Modified Motor Assessment Chart (r = 0.94).

Platz et al. (2005) tested convergent validity of the ARAT by comparing it to the Box and Block Test (Cromwell, 1965; Mathiowetz et al., 1985a), the Fugl-Meyer Test upper extremity items (including items from the Motor Function, Sensation and Passive Joint Motion/Joint Pain subscores) (Fugl-Meyer et al., 1975), the Motricity Index (Demeurisse et al., 1980), the Ashworth Scale (Ashworth, 1964), the Hemispheric Stroke Scale (Adams, Meador, Sethi, Grotta, & Thomson, 1986) and the Modified Barthel Index (Collin, Wade, Davies, & Horne, 1988) in 56 participants with upper extremity paresis either from stroke (n=37), multiple sclerosis (n=14), or traumatic brain injury (n=5). Correlations were calculated using the Spearman Correlation Coefficient. Excellent correlations were found between the ARAT and the Box and Block Test (rho = 0.95), the Motor Function subscore of the Fugl-Meyer Test (rho = 0.92), the Motricity Index (rho = 0.81), and the Hemispheric Stroke Scale (rho = -0.66). An adequate correlation was found between the ARAT and the Passive Joint Motion/Joint Pain subscore of the Fugl-Meyer Test (rho = 0.42). Poor correlations were found between the ARAT and the Sensation subscore of the Fugl-Meyer Test (rho = 0.29), the Ashworth Scale (rho = -0.29) and the Modified Barthel Index (rho = 0.04).
Note: Negative correlations are observed because a high score on the ARAT indicates normal performance, whereas a low score on the Hemispheric Stroke Scale and the Ashworth Scale indicates normal performance.
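The labels "excellent", "adequate" and "poor" used throughout this section depend on the size of the correlation, not its sign. The sketch below shows one way to apply such labels. The cutoffs (0.60 and above excellent, 0.31 to 0.59 adequate, 0.30 and below poor) are an assumption inferred from how coefficients are described in this review, and the function name is hypothetical.

```python
def classify_correlation(coefficient):
    """Label a correlation coefficient by its absolute size (assumed cutoffs)."""
    size = abs(coefficient)  # the sign only reflects the scoring direction of the scales
    if size >= 0.60:
        return "excellent"
    if size >= 0.31:
        return "adequate"
    return "poor"


# Coefficients reported by Platz et al. (2005) for the ARAT.
reported = {
    "Box and Block Test": 0.95,
    "Hemispheric Stroke Scale": -0.66,
    "Passive Joint Motion/Joint Pain": 0.42,
    "Ashworth Scale": -0.29,
    "Modified Barthel Index": 0.04,
}
for measure, rho in reported.items():
    print(f"{measure}: rho = {rho} ({classify_correlation(rho)})")
```

Taking the absolute value is what allows the negative coefficient for the Hemispheric Stroke Scale to be rated as excellent.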

Lang, Wagner, Dromerick, and Edwards (2006) evaluated the convergent validity of the ARAT in 50 individuals with acute to subacute stroke (mean age 63 years) attending an acute neurology stroke service at three points in time: admission (day 0); post intervention (day 14); and 90 days poststroke (day 90). The ARAT was compared to measures of sensorimotor impairment (e.g. light touch sensation, pain, elbow joint spasticity, upper extremity strength), to kinematic measures (e.g. reach and grasp), to the Functional Independence Measure (FIM) (Keith, Granger, Hamilton, & Sherwin, 1987), and to the National Institutes of Health Stroke Scale (NIHSS) (Brott, Adams, Olinger, Marler, Barsan, Biller, et al., 1989). At day 0, excellent correlations were found between the ARAT and upper extremity strength (r = 0.60) and grasp speed (r = 0.60). Adequate correlations were found between the ARAT and grasp efficiency (r = 0.42), reach efficiency (r = -0.38) and reach speed (r = 0.40), and the FIM upper extremity score (r = 0.38). Poor correlations were found between the ARAT and NIHSS (r = -0.15), light touch sensation (r = 0.15), pain (r = 0.10), elbow joint spasticity (r = -0.28) and the FIM total score (r = 0.20). At day 14, excellent correlations were found between the ARAT and grasp efficiency (r = 0.60) and the FIM upper extremity scores (r = 0.62). Adequate correlations were found between the ARAT and elbow spasticity (r = 0.49), upper extremity strength (r = 0.42), reach efficiency (r = -0.58), grasp speed (r = 0.36) and the FIM total score (r = 0.52). Poor correlations were found between the ARAT and NIHSS (r = -0.24), light touch sensation (r = -0.20), and pain (r = -0.12). At day 90, excellent correlations were found between the ARAT and upper extremity strength (r = 0.60). Adequate correlations were found between the ARAT and elbow spasticity (r = -0.42), reach efficiency (r = -0.42), reach speed (r = 0.50), grasp efficiency (r = -0.48), grasp speed (r = 0.38) and the FIM upper extremity (r = 0.42) and total scores (r = 0.40). Poor correlations were found between the ARAT and the NIHSS (r = -0.29), light touch sensation (r = 0.00), and pain (r = 0.22). In summary, this study’s findings suggest that the NIHSS, light touch sensation, and pain do not relate to the ARAT. The relationship between the ARAT and FIM scores is stronger early on post-stroke and stabilizes by the ninetieth day.

Rabadi and Rabadi (2006) examined convergent validity of the ARAT by comparing it to the Fugl-Meyer Assessment (Fugl-Meyer et al., 1975) at admission and discharge from an acute stroke rehabilitation unit in 104 inpatients with acute stroke with a mean age of 72 years. The correlation between ARAT and the Fugl-Meyer Assessment was excellent both at admission (rho = 0.77) and discharge (rho = 0.87).

Yozbatiran et al. (2008) estimated the convergent validity of the ARAT by comparing it to the arm motor Fugl-Meyer Assessment (Fugl-Meyer et al., 1975) score in 12 clients with chronic stroke at a mean age of 61 years. Excellent correlation (r = 0.94) was found between the ARAT and arm motor Fugl-Meyer score.

Known groups:
No studies have examined known groups validity of the ARAT.

Responsiveness

Van der Lee, Beckerman, Lankhorst, and Bouter (2001b) evaluated the responsiveness of the ARAT and Fugl-Meyer Assessment (Fugl-Meyer et al., 1975) in 22 clients with chronic stroke, mean age of 58 years, receiving intensive forced use treatment. Participants were assessed two weeks pre- and two weeks post-treatment. A responsiveness ratio was calculated. Compared to the Fugl-Meyer Assessment, the ARAT had a greater responsiveness ratio (2.03 for ARAT vs. 0.41 for Fugl-Meyer), suggesting that the ARAT is more sensitive to change.
Note: The responsiveness ratio is a variant of effect size and higher values indicate better responsiveness.

Van der Lee, Roorda, Beckerman, and Lankhorst (2002) estimated the responsiveness of a modified version of the ARAT in 63 participants with chronic stroke. In this study, researchers did not follow Lyle’s standardized instructions. Instead, they administered all 19 ARAT items to verify any possible effect of this format on its psychometric properties. A responsiveness ratio was calculated. Compared to the hierarchical version proposed by Lyle, performing all 19 items was found to improve the measure’s responsiveness, with a responsiveness ratio of 1.7 compared to 1.2 with Lyle’s version.
Note: The responsiveness ratio can be considered an estimate of effect size normalized to the variability in a stable population and higher values indicate better responsiveness.

Hsueh and Hsieh (2002b) analyzed the responsiveness of the ARAT and the upper extremity section of the Motor Assessment Scale (Carr et al., 1985) in 48 participants with acute stroke and a mean age of 62 years. Participants were assessed at two points in time: admission and discharge from the acute rehabilitation centre. The ARAT total score demonstrated a moderate effect size of 0.52, while the Motor Assessment Scale total score demonstrated a small effect size of 0.45.

Lang et al. (2006) examined the responsiveness of the ARAT in 50 participants with acute to subacute stroke, with a mean age of 63 years, receiving constraint-induced movement therapy (CIMT). Assessments were performed at three points in time: baseline, immediately post-treatment, and 2.5 months post-treatment. Effect sizes and responsiveness ratios were calculated. At the first follow-up evaluation, the ARAT total and subscale scores showed moderate to large effect sizes (ARAT total score = 1.01; grasp subscore = 1.04; pinch subscore = 0.85; grip subscore = 1.01; and gross movement subscore = 0.72). The second follow-up evaluation demonstrated large effect sizes, each higher than at the first evaluation (ARAT total score = 1.39; grasp subscore = 1.22; pinch subscore = 1.49; grip subscore = 1.32 and gross movement subscore = 0.98). The responsiveness ratio for the ARAT total score at the first follow-up evaluation was 5.2 and at the second was 7.0. These two responsiveness estimates suggest that the ARAT is a sensitive tool for detecting change even months after stroke onset.
Note: Responsiveness ratio is a variant of effect size and higher values indicate better responsiveness.

Rabadi and Rabadi (2006) assessed the responsiveness of the ARAT and the Fugl-Meyer Assessment (Fugl-Meyer et al., 1975) in 104 participants with acute stroke, with a mean age of 72 years, undergoing inpatient rehabilitation. Participants were evaluated at admission and discharge from acute care. The Standardized Response Mean (SRM) was used to calculate responsiveness. The ARAT was slightly less sensitive than the Fugl-Meyer Assessment (SRM = 0.68 and 0.74, respectively). However, since the difference between the SRMs for these two measures was minimal, these tests can be considered equally sensitive to change during inpatient acute rehabilitation. This result is contrary to the one presented by Van der Lee et al. (2001b). The difference may be due to differences in the studies’ population age and stroke severity.
Note: SRM is a variant of effect size and higher values indicate better responsiveness.

Lin, Chuang, Wu, Hsieh and Chang (2010) evaluated the responsiveness of the ARAT, the Box and Block Test (BBT) and the Nine-Hole Peg Test (NHPT) for evaluating hand dexterity in 59 patients with subacute stroke (< 6 months) and Brunnstrom stage IV to VI for proximal and distal upper extremity function. Patients were randomly assigned to receive constraint-induced therapy, bilateral arm training or control treatment and received 2 hours of therapy, 5 days per week for 3 weeks. Assessments were performed at baseline and 3 weeks. Using the Standardized Response Mean (SRM) to calculate responsiveness, the ARAT, BBT and NHPT were all found to have a moderate SRM (0.79, 0.74 and 0.64, respectively), indicating sensitivity for detecting change in hand dexterity. When considering the results of both the responsiveness and validation components of the study, the ARAT and BBT are believed to be more appropriate than the NHPT for evaluating dexterity.
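The responsiveness indices cited in this section are all ratios of mean change to a measure of variability. The sketch below illustrates the commonly used definitions: effect size (mean change divided by the standard deviation of baseline scores), Standardized Response Mean (mean change divided by the standard deviation of change scores), and responsiveness ratio (mean change in improved clients divided by the standard deviation of change in stable clients). It is an illustration under these assumed definitions, which individual studies may have applied differently; the function names and scores are hypothetical.

```python
from statistics import mean, stdev


def change_scores(baseline, follow_up):
    """Per-participant change from baseline to follow-up."""
    return [after - before for before, after in zip(baseline, follow_up)]


def effect_size(baseline, follow_up):
    """Mean change divided by the standard deviation of baseline scores."""
    return mean(change_scores(baseline, follow_up)) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the standard deviation of the change scores."""
    change = change_scores(baseline, follow_up)
    return mean(change) / stdev(change)


def responsiveness_ratio(change_improved, change_stable):
    """Mean change in improved clients divided by the SD of change in stable clients."""
    return mean(change_improved) / stdev(change_stable)


# Hypothetical ARAT total scores for four clients (illustrative only).
baseline = [10, 20, 30, 40]
follow_up = [14, 26, 34, 46]
print(round(effect_size(baseline, follow_up), 2))
print(round(standardized_response_mean(baseline, follow_up), 2))
```

Because each index uses a different denominator, values from different indices (for example, an SRM of 0.68 and a responsiveness ratio of 2.03) are not directly comparable.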

References

  • Adams, R.J., Meador, K.J., Sethi, K.D., Grotta, J.C., & Thomson, D.S. (1986). Graded neurologic scale for the use in acute hemispheric stroke treatment protocols. Stroke, 18, 665-669.
  • Ashworth, B. (1964). Preliminary trial of carisoprodol in multiple sclerosis. Practitioner, 192, 540-542.
  • Brott, T. G., Adams, H. P., Olinger, C. P., Marler, J. R., Barsan, W. G., Biller, J., Spilker, J., Holleran, R., Eberle, R., Hertzberg, V., Rorick, M., Moomaw, C. J., & Walker, M. (1989). Measurements of acute cerebral infarction: a clinical examination scale. Stroke, 20, 864-870.
  • Carroll, D. (1965). A quantitative test of upper extremity function. Journal of Chronic Diseases, 18, 479-491.
  • Carr, J.H., Shepherd, R.B., Nordholm, L., & Lynne, D. (1985). Investigation of a new motor assessment scale for stroke patients. Physical Therapy, 65, 175-180.
  • Collin, C., Wade, D.T., Davies, S., & Horne, V. (1988). The Barthel ADL Index: a reliability study. International Disability Studies, 10, 61-63.
  • Cromwell, F.S (1965). Occupational therapists manual for basic skills assessment: primary prevocational evaluation. Pasadena, (CA): Fair Oaks Printing; 29-31.
  • Demeurisse, G., Demol, O., & Robaye, E. (1980). Motor evaluation in vascular hemiplegia. European Neurology, 19(6), 382-389.
  • De Weerdt, W.J.G., & Harrison, M.A. (1985). Measuring recovery of arm hand function in stroke patients: a comparison of the Brunnstrom-Fugl-Meyer test and the Action Research Arm Test. Physiotherapy Canada, 37, 65-70.
  • Finch, E., Brooks, D., Stratford, P.W., & Mayo, N.E. (2002). Physical Outcome Measures: A guide to enhance physical outcome measures. Ontario, Canada: Lippincott, Williams, & Wilkins.
  • Fugl-Meyer, A.R., Jääskö, L., Leyman, I., Olsson, S., & Steglind, S. (1975). The post-stroke hemiplegic patient 1. A method for evaluation of physical performance. Scandinavian Journal of Rehabilitation Medicine, 7, 13-31.
  • Gowland, C., Van-Hullenaar, S., Torresin, W., et al., (1995). Chedoke-McMaster Stroke Assessment: development, validation, and administration manual. Hamilton, (ON), Canada: School of Rehabilitation Science, McMaster University
  • Heller, A., Wade, D.T., Wood, V.A., Sunderland, A., Hewer, R., & Ward, E. (1987). Arm function after stroke: measurement and recovery over the first three months. Journal of Neurology, Neurosurgery & Psychiatry, 50(6), 714-719.
  • Hsieh, C.L., Hsueh, I.P., Chiang, F., & Lin, P. (1998). Inter-rater reliability and validity of the action research arm test in stroke patients. Age and Ageing, 27, 107-113.
  • Hsueh, I.P., Lee, M.M., & Hsieh, C.L. (2002a). The action research arm test: Is it necessary for patients being tested to sit at a standardized table? Clinical Rehabilitation, 16, 382-388.
  • Hsueh, I.P. & Hsieh, C.L. (2002b). Responsiveness of two upper extremity function instruments for stroke inpatients receiving rehabilitation. Clinical Rehabilitation, 16, 617-624.
  • Jacobson-Sollerman, C. & Sperling, L. (1977). Grip function of the healthy hand in a standardized hand function test. A study of the Rancho Los Amigos test. Scandinavian Journal of Rehabilitation Medicine, 9(3), 123-129.
  • Keith, R.A, Granger, C.V., Hamilton, B.B., & Sherwin, F.S. (1987). The Functional Independence Measure: a new tool for rehabilitation. In: Eisenberg, M.G. & Grzesiak, R.C. (Ed.), Advances in clinical rehabilitation (pp. 6-18). New York: Springer Publishing Company.
  • Kellor, M., Frost, J., Silberberg, N., Iversen, I., & Cummings R. (1971). Hand strength and dexterity. American Journal of Occupational Therapy, 25, 77-83.
  • Lang, C.E., Wagner, J.M., Dromerick, A.W., & Edwards, D.F. (2006). Measurement of upper extremity function early after stroke: properties of the action research arm test. Archives of Physical Medicine and Rehabilitation, 87, 1605-1610.
  • Lin, K-C., Chuang, L-L., Wu, C-Y., Hsieh, Y-W. & Chang, W-Y. (2010). Responsiveness and validity of three dexterous function measures in stroke rehabilitation. Journal of Rehabilitation Research and Development, 47(6), 563-572.
  • Lindmark, B. & Hamrin, E. (1988). Evaluation of function capacity after stroke as a basis for active intervention: Presentation of a modified chart for motor capacity assessment and its reliability. Scandinavian Journal of Rehabilitation Medicine, 20, 103-109.
  • Lyle, R.C. (1981). A performance test for assessment of upper limb function in physical rehabilitation treatment and research. International Journal of Rehabilitation Research, 4, 483-492.
  • Mathiowetz, V., Volland, G., Kashman, N., & Weber, K. (1985a). Adult norms for the box and block test of manual dexterity. American Journal of Occupational Therapy, 39, 386-391.
  • Mathiowetz, V., Weber, K., Kashman, N., & Volland, G. (1985b). Adult norms for the nine hole peg test of finger dexterity. Occupational Therapy Journal of Research, 5, 24-33.
  • Nijland, R., van Wegen, E., Verbunt, J., van Wijk, R., van Kordelaar, J. & Kwakkel, G. (2010). A comparison of two validated tests for upper limb function after stroke: The Wolf Motor Function Test and the Action Research Arm Test. Journal of Rehabilitation Medicine, 42, 694-696.
  • Platz, T., Pinkowski, C., van Wijck, F., Kim, I.H., di Bella, P., & Johnson, G. (2005). Reliability and validity of arm function assessment with standardized guidelines for the Fugl-Meyer Test, Action Research Arm Test and Box and Block Test: a multicentre study. Clinical Rehabilitation, 19(4), 404-411.
  • Rabadi, M.H. & Rabadi, F.M. (2006). Comparison of the action research arm test and the Fugl-Meyer Assessment as measures of upper-extremity motor weakness after stroke. Archives of Physical Medicine and Rehabilitation, 87, 962-966.
  • van der Lee, J.H., Beckerman, H., Lankhorst, G.J., & Bouter, L.M. (2001a). The responsiveness of the Action Research Arm Test and the Fugl-Meyer Assessment Scale in chronic stroke patients. Journal of Rehabilitation Medicine, 33, 110-113.
  • Van der Lee, J.H., Groot, V., Beckerman, H., Wagenaar, R.C., Lankhorst, G.J., & Bouter, L.M. (2001b). The intra-rater and interrater reliability of the action research arm test: a practical test of upper extremity function in patients with stroke. Archives of Physical Medicine and Rehabilitation, 82, 14-19.
  • Van der Lee, J.H., Roorda, L.D., & Lankhorst, G.J. (2002). Improving the Action Research Arm Test: a unidimensional hierarchical scale. Clinical Rehabilitation, 16, 646-653.
  • Yozbatiran, N., Der-Yerghiaian, L., & Cramer, S.C. (2008). A standardized approach to performing the action research arm test. Neurorehabilitation & Neural Repair, 22(1), 78-90.
  • Wagenaar, R.C., Meijer, O.G., van Wieringen, P.C., Kuik, D.J., Hazenberg, G.J., Lindeboom, J., et al. (1990). The functional recovery of stroke: a comparison between neuro-developmental treatment and the Brunnstrom method. Scandinavian Journal of Rehabilitation Medicine, 22, 1-8.

See the measure

How to obtain the Action Research Arm Test:

The ARAT can be obtained from the studies by Lyle (1981), Hsieh et al. (1998), Van der Lee et al. (2002), Rabadi & Rabadi (2006), and Yozbatiran et al. (2008), and from the website http://www.aratest.eu/Index_english.htm. Standardized equipment can be purchased from http://www.aratest.eu/ or from http://www.saliarehab.com/.

Box and Block Test (BBT)

Evidence Reviewed as of before: 09-06-2011
Author(s)*: Sabrina Figueiredo, BSc
Editor(s): Lisa Zeltzer, MSc OT; Nicol Korner-Bitensky, PhD OT; Elissa Sitcoff, BA BSc

Purpose

The Box and Block Test (BBT) measures unilateral gross manual dexterity. It is a quick, simple and inexpensive test. It can be used with a wide range of populations, including clients with stroke.

In-Depth Review

Purpose of the measure

The Box and Block Test (BBT) measures unilateral gross manual dexterity. It is a quick, simple and inexpensive test. It can be used with a wide range of populations, including clients with stroke.

Available versions

The original version of the BBT was developed in 1957 by Jean Ayres and Patricia Buhler. This version was modified into the current one by E. Fuchs and P. Buhler (Cromwell, 1976). In 1985, normative data for the BBT were established by Mathiowetz, Volland, Kashman, and Weber.

Features of the measure

Items:

The BBT consists of a wooden box divided into two compartments by a partition, and 150 blocks. The client is asked to move as many blocks as possible, one at a time, from one compartment to the other compartment of equal size within 60 seconds. The box should be oriented lengthwise and placed at the client’s midline, with the compartment holding the blocks oriented towards the hand being tested. To allow practice and to register baseline scores, the test should begin with the unaffected upper limb. A 15-second trial period is permitted before testing each side. Before the trial, once the standardized instructions have been given, clients should be advised that their fingertips must cross the partition when transferring the blocks, and that they do not need to pick up blocks that fall outside the box (Mathiowetz, Volland, Kashman, & Weber, 1985-1).

Scoring:

Clients are scored based on the number of blocks transferred from one compartment to the other in 60 seconds (Mathiowetz et al., 1985-1). Higher scores indicate better manual dexterity. During the test, the evaluator should check that the client’s fingertips cross the partition; blocks are counted only when this condition is met. If two blocks are transferred at once, only one block is counted. Blocks that fall outside the box after crossing the partition are counted, even if they do not land in the other compartment.
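
The counting rules above can be illustrated with a short sketch. This is only an illustration of the rules as described here, not an official scoring tool; the `Transfer` record and its field names are hypothetical and were chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    """One observed transfer attempt (hypothetical record format)."""
    blocks_carried: int          # blocks held during this single movement
    fingertips_crossed: bool     # did the fingertips cross the partition?
    landed_in_box: bool = True   # False if the block fell outside the box


def bbt_score(transfers):
    """Count the blocks credited under the scoring rules described above."""
    score = 0
    for t in transfers:
        if t.blocks_carried < 1:
            continue  # nothing was moved
        if not t.fingertips_crossed:
            continue  # counted only when the fingertips cross the partition
        # Several blocks carried at once are credited as one block.
        # A block that falls outside the box after crossing still counts,
        # so landed_in_box does not change the score.
        score += 1
    return score


attempts = [
    Transfer(1, True),                        # counted
    Transfer(2, True),                        # two at once: counted as one
    Transfer(1, False),                       # fingertips did not cross: not counted
    Transfer(1, True, landed_in_box=False),   # fell outside after crossing: counted
]
print(bbt_score(attempts))  # 3
```

In practice the evaluator simply counts blocks during the 60-second trial; the sketch only makes explicit which events are credited.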

Mathiowetz et al. (1985-1) reported that healthy male adults aged 20 to 80 years transfer an average of 77 blocks (SD ±11.6) with the right hand and 75 blocks (SD ±11.4) with the left hand within the 60-second limit. Scores for healthy men aged 60 years or older ranged from 61 to 70 blocks. Healthy female adults aged 20 to 80 years transfer an average of 78 blocks (SD ±10.4) with the right hand and 76 blocks (SD ±9.5) with the left hand. Scores for healthy women aged 60 years or older ranged from 63 to 76 blocks. BBT score and age are inversely correlated: average scores decrease with older age.
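
One way to relate an individual score to these norms is a z-score, which expresses the score as a number of standard deviations from the normative mean. The sketch below is an assumption-laden illustration: the z-score approach is not described by Mathiowetz et al. in this review, the function name is hypothetical, and the pooled 20-to-80-year means are used even though age-specific norms would be more appropriate for an older client.

```python
# Pooled adult norms (ages 20 to 80) as reported above: (mean, SD) in blocks per 60 s.
NORMS = {
    ("male", "right"): (77.0, 11.6),
    ("male", "left"): (75.0, 11.4),
    ("female", "right"): (78.0, 10.4),
    ("female", "left"): (76.0, 9.5),
}


def bbt_z_score(score, sex, hand):
    """Standard deviations between a BBT score and the pooled normative mean."""
    mean, sd = NORMS[(sex, hand)]
    return (score - mean) / sd


# Hypothetical client: a man who transfers 54 blocks with the right hand.
z = bbt_z_score(54, "male", "right")
print(round(z, 2))  # -1.98, i.e. about two SDs below the pooled mean
```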

Time:

The BBT requires 2 to 5 minutes to administer (Finch, Brooks, Stratford, & Mayo, 2002; Mathiowetz et al., 1985-1).

Subscales:

None.

Equipment:

The standardized equipment consists of:
  • A wooden box measuring 53.7 cm x 25.4 cm x 8.5 cm. The partition should be placed at the middle of the box, dividing it into two containers of 25.4 cm each (Mathiowetz et al., 1985-1).
  • 150 wooden cubes, 2.5 cm in size (Mathiowetz et al., 1985-1).
  • Stopwatch.

Training of administrator:

None typically reported.

Alternative forms of the Box and Block Test

None.

Client suitability

Can be used with:

  • Clients with stroke.

Should not be used in:

  • The BBT cannot be used with clients who have severe upper extremity impairment.
  • The BBT cannot be used with clients with severe cognitive impairment.

In what languages is the measure available?

There are no official translations of the BBT. The specific instructions provided to the client are in English. Clinicians and researchers may be using “home-grown” translations of the instructions, as evidenced by peer-reviewed publications from Sweden, French Canada, Italy and Germany that have used the BBT as an outcome measure (Broeren, Rydmark, Bjorkdahl, & Sunnerhagen, 2007; Dannenbaum, Michaelsen, Desrosiers, & Levin, 2002; Mercier & Bourbonnais, 2004; Platz, Pinkowski, van Wijck, Kim, di Bella, & Johnson, 2005; Schneider, Schonle, Altenmuller, & Munte, 2007).

Summary

What does the tool measure? Unilateral gross manual dexterity.
What types of clients can the tool be used for? The BBT can be used with, but is not limited to clients with stroke.
Is this a screening or assessment tool? Assessment
Time to administer From 2 to 5 minutes.
Versions There are no alternative versions.
Other Languages There are no official translations.
Measurement Properties
Reliability Internal consistency:
No studies have examined the internal consistency of the BBT.
Test-retest:
Two studies have examined the test-retest reliability of the BBT. Both reported excellent test-retest reliability using ICCs.
Inter-rater:
Two studies have examined the inter-rater reliability of the BBT and reported excellent inter-rater reliability using correlation coefficients and ICC. One study used Pearson correlation and the other, ICC and Spearman rho correlation.
Validity Criterion:
Concurrent:
One study has examined the concurrent validity of the BBT and reported adequate to excellent correlations with the Action Research Arm Test (ARAT) and the Nine-Hole Peg Test (NHPT) at pre and post-treatment.
Predictive:
One study has examined predictive validity and reported that the BBT, compared to the NHPT, the Frenchay Arm Test, Grip Strength and the Stroke Rehabilitation Assessment of Movement (STREAM) was the best predictor of upper limb function 5 weeks post-stroke.
Construct:
Convergent validity:
Three studies have examined convergent validity of the BBT and reported excellent correlations between the BBT and the Minnesota Rate of Manipulation Test, the ARAT, the Hemispheric Stroke Scale and the motor function score of the Fugl-Meyer Assessment (FMA). Adequate correlations were reported between the BBT and the SMAF, the Ashworth scale and the Passive Joint Motion/Joint Pain subscore of the FMA. Poor correlations were reported between the BBT and the Sensation subscore of the FMA and the Modified Barthel Index.
Floor/Ceiling Effects No studies have examined floor/ceiling effects of the BBT.
Sensitivity/ Specificity No studies have examined sensitivity/specificity of the BBT.
Does the tool detect change in patients?

Two studies have examined the responsiveness of the BBT and reported moderate to large Standardized Response Means, indicating that the BBT is able to detect change in clients with stroke.

Acceptability The BBT should not be used with clients with severe upper extremity impairment or severe cognitive impairment.
Feasibility The administration of the BBT is quick and simple; however, it requires standardized equipment.
How to obtain the tool?

The BBT instructions can be obtained in the study by Mathiowetz et al. (1985-1).

Standardized equipment can be obtained at the website:
http://www.sammonspreston.com/Supply/Product.asp?Leaf_Id=7531

Psychometric Properties

Overview

We conducted a literature search to identify all relevant publications on the psychometric properties of the Box and Block Test (BBT) in healthy individuals and individuals with stroke. We identified four studies. The BBT appears to be responsive in clients with stroke.

Floor/Ceiling Effects

No studies have examined floor/ceiling effects of the BBT.

Reliability

Test-retest:
Desrosiers, Bravo, Hébert, Dutil, and Mercier (1994) examined test-retest reliability of the BBT in 34 elderly individuals with upper limb sensorimotor impairments from stroke (n=13) and other conditions. Participants were re-assessed after a 1-week interval by the same rater and under the same conditions. The test-retest reliability for the BBT was reported as excellent (ICC = 0.97; ICC = 0.96) for the right and left hand, respectively.

Platz, Pinkowski, van Wijck, Kim, di Bella, and Johnson (2005) estimated test-retest reliability of the BBT, the Action Research Arm Test (Lyle, 1981), and the Fugl-Meyer Assessment (FMA) upper extremity items, including items from the motor function, sensation and passive joint motion/joint pain sub-scores (Fugl-Meyer, Jääskö, Leyman, Olsson, & Steglind, 1975), in 23 participants with upper extremity paresis from stroke, multiple sclerosis, or traumatic brain injury. Each participant’s most affected arm was re-assessed after a 1-week interval by the same rater. The test-retest reliability of the BBT, as calculated using ICCs and Spearman rho correlation, was excellent (ICC = 0.96 and r = 0.97).
Note: This result applies only to the most affected upper limb.

Inter-rater:
Mathiowetz, Volland, Kashman, and Weber (1985-1) assessed the inter-rater reliability of the BBT in 26 healthy young females. Participants were evaluated simultaneously and independently by two raters. Pearson correlation coefficients showed excellent agreement (r = 1.00; r = 0.99) for the right and left hand, respectively.
Note: Pearson correlation coefficient is not the statistical analysis of choice for assessing inter-rater reliability as it may artificially inflate agreement.
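
The point of this note can be shown with a small numerical sketch. The data below are hypothetical (not from Mathiowetz et al.), and the ICC form used is an assumption: a two-way, absolute-agreement, single-rater ICC, since the reviewed studies do not state which ICC form they used. When one rater consistently counts 10 blocks more than the other, the Pearson correlation is still perfect, whereas the absolute-agreement ICC is penalized by the systematic difference.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two raters' scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def icc_absolute_agreement(x, y):
    """Two-way, absolute-agreement, single-rater ICC for two raters."""
    n, k = len(x), 2
    grand = (sum(x) + sum(y)) / (n * k)
    row_means = [(a + b) / 2 for a, b in zip(x, y)]   # per participant
    col_means = [sum(x) / n, sum(y) / n]              # per rater
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for v in x + y)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-participant mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical BBT scores: rater B always counts 10 blocks more than rater A.
rater_a = [40, 50, 60, 70, 80]
rater_b = [50, 60, 70, 80, 90]

print(round(pearson_r(rater_a, rater_b), 3))               # 1.0
print(round(icc_absolute_agreement(rater_a, rater_b), 3))  # 0.833
```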

Platz et al. (2005), as described earlier, also analyzed inter-rater reliability of the BBT, the Action Research Arm Test (Lyle, 1981), and the FMA upper extremity items, including items from the motor function, sensation and passive joint motion/joint pain sub-scores (Fugl-Meyer et al., 1975), in 44 individuals with upper limb paresis from stroke, multiple sclerosis, or traumatic brain injury. Each participant’s most affected arm was videotaped and scored independently by two raters. Inter-rater reliability for the BBT, as calculated using the ICC and Spearman rho correlation, was excellent (ICC = 0.99 and r = 0.99).
Note: This result applies only to the most affected upper limb.

Validity

Content:

Not available.

Criterion:

Concurrent:
No gold standard exists against which to compare the BBT.

Lin, Chuang, Wu, Hsieh and Chang (2010) compared the concurrent validity of the BBT, Action Research Arm Test (ARAT) and Nine-Hole Peg Test (NHPT) for evaluating hand dexterity in 59 patients with stroke. The Fugl-Meyer Assessment (FMA), Motor Activity Log (MAL) and Stroke Impact Scale (SIS) were also administered to assess the concurrent validity of the BBT, ARAT and NHPT. Using Spearman rank correlation coefficients, the BBT, ARAT and NHPT were found to have adequate to excellent correlations with one another at pre-treatment (ranging from rho=-0.55 to -0.80) and post-treatment (ranging from rho=-0.57 to -0.71). In addition, the BBT and ARAT were found to have adequate correlations with the FMA, MAL and SIS (ranging from rho=0.31 to 0.59); however, the NHPT had only poor to adequate correlations with the FMA and MAL (ranging from rho=-0.16 to -0.33), and adequate to excellent correlations with the SIS (ranging from rho=-0.58 to -0.66). Considering both the responsiveness and validation components of the study, the BBT and ARAT are believed to be more appropriate than the NHPT for evaluating dexterity.

Predictive:
Higgins, Mayo, Desrosiers, Salbach and Ahmed (2005) estimated whether the BBT, Nine-Hole Peg Test (Kellor, Frost, Silberberg, Iversen, & Cummings, 1971; Mathiowetz, Weber, Kashman, & Volland, 1985-2), Frenchay Arm Test (Heller, Wade, Wood, Sunderland, Hewer, & Ward, 1987), Grip Strength (Mathiowetz, Kashman, Volland, Weber, Dowe, & Rogers, 1985-3), and Stroke Rehabilitation Assessment of Movement (STREAM – Daley, Mayo, Wood-Dauphinee, Danys, & Cabot, 1997) were able to predict upper limb function, measured by the BBT, at 5 weeks post-stroke. Predictive validity of the BBT was measured in 55 participants with acute stroke. Assessments were performed at two points in time: one and five weeks post-stroke. Compared to the other upper limb performance tests, the BBT, when performed at one week post-stroke, was the best predictor of upper limb function at five weeks post-stroke, followed by the STREAM.

Construct:

Convergent/Discriminant:
Cromwell (1976) examined the convergent validity of the BBT by comparing it to the Minnesota Rate of Manipulation Test (American Guidance Service, 1969) in an unspecified population. The correlation between BBT and the Minnesota Rate of Manipulation Test was excellent (r = 0.91).

Desrosiers et al. (1994) assessed the convergent validity of the BBT by comparing it to the Functional Autonomy Measurement System – FAMS, known as the SMAF in French (Hébert, Carrier, & Bilodeau, 1988), and to the Action Research Arm Test (ARAT – Lyle, 1981) in 104 elderly individuals with upper limb impairments secondary to stroke (n=53) amongst other conditions. Excellent correlations (r = 0.80) were found between the BBT and the ARAT. Adequate Pearson correlations were found between the BBT and the FAMS (r = 0.47; r = 0.51) for the right and left hand, respectively.

Platz et al. (2005) tested the convergent validity of the BBT by comparing it to the Action Research Arm Test (ARAT – Lyle, 1981) and to the Fugl-Meyer Assessment (FMA) upper extremity items, including items from the motor function, sensation and passive joint motion/joint pain sub-scores (Fugl-Meyer et al., 1975), using Spearman correlation, in 56 participants with upper extremity paresis from stroke (n=37) or other conditions. Excellent correlations were found between the BBT and the ARAT (r = 0.95) and the Motor Function sub-score (r = 0.92) of the FMA. Furthermore, the BBT was correlated with more general measures of impairment and activity limitation, such as the Ashworth Scale (Ashworth, 1964), the Hemispheric Stroke Scale (Adams, Meador, Sethi, Grotta, & Thomson, 1986) and the Modified Barthel Index (Collin, Wade, Davies, & Horne, 1988). An excellent correlation was found between the BBT and the Hemispheric Stroke Scale (r = -0.67). Adequate correlations were found between the BBT and the passive joint motion/joint pain sub-score of the FMA (r = 0.43) and the Ashworth Scale (r = -0.38). Poor correlations were found between the BBT and the sensation sub-score of the FMA (r = 0.28) and the Modified Barthel Index (r = 0.04).
Note: Negative correlations are observed because a high score on the BBT indicates better performance, whereas a low score on the Hemispheric Stroke Scale or the Ashworth Scale indicates better performance.

Known groups:
No studies have examined known groups validity of the BBT.

Responsiveness

Higgins et al. (2005) evaluated the responsiveness of the BBT, Frenchay Arm Test (Heller et al., 1987), Grip strength (Mathiowetz et al., 1985-3) and the Stroke Rehabilitation Assessment of Movement (STREAM – Daley et al., 1997) in 50 participants with acute stroke. Participants were assessed one and four weeks post-stroke. The Standardized Response Mean (SRM) was used to calculate responsiveness. Amongst these upper extremity performance tests, the BBT was the most sensitive to change, with a large SRM of 0.8.
Note: SRM is a variant of effect size and higher values indicate better responsiveness.

Lin, Chuang, Wu, Hsieh and Chang (2010) evaluated the responsiveness of the BBT, the Action Research Arm Test (ARAT) and the Nine-Hole Peg Test (NHPT) for evaluating hand dexterity in 59 patients with subacute stroke (< 6 months) and Brunnstrom stage IV to VI for proximal and distal upper extremity function. Patients were randomly assigned to receive constraint-induced therapy, bilateral arm training or control treatment and received 2 hours of therapy, 5 days per week for 3 weeks. Assessments were performed at baseline and at 3 weeks. Using the Standardized Response Mean (SRM) to calculate responsiveness, the BBT, ARAT and NHPT were all found to have moderate SRMs (0.74, 0.64 and 0.79, respectively), indicating sensitivity for detecting change in hand dexterity. Considering both the responsiveness and validation components of the study, the BBT and ARAT are believed to be more appropriate than the NHPT for evaluating dexterity.
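
For readers unfamiliar with the SRM, it is commonly computed as the mean change score divided by the standard deviation of the change scores. The sketch below uses hypothetical BBT scores, not data from either study, and the cut-offs (0.2, 0.5, 0.8) are the conventional effect-size thresholds, assumed here because they match how the values above are labelled (0.8 as large; 0.64 to 0.79 as moderate).

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """SRM = mean of the change scores / standard deviation of the change scores."""
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow_up must have the same length")
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


def interpret_srm(srm):
    """Label an SRM using conventional effect-size cut-offs (an assumption)."""
    size = abs(srm)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "small"
    return "trivial"


# Hypothetical BBT scores (blocks per 60 s) for five clients at two assessments.
baseline = [20, 25, 30, 22, 28]
follow_up = [26, 29, 37, 25, 36]
srm = standardized_response_mean(baseline, follow_up)
print(interpret_srm(srm))

print(interpret_srm(0.8))   # large
print(interpret_srm(0.74))  # moderate
```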

References

  • American Guidance Service. The Minnesota Rate Manipulative Tests. Examiner’s manual. Circle Pines, (MN): Author; 1969.
  • Adams, R.J., Meador, K.J., Sethi, K.D., Grotta, J.C., & Thomson, D.S. (1986). Graded neurologic scale for the use in acute hemispheric stroke treatment protocols. Stroke, 18, 665-669.
  • Ashworth, B. (1964). Preliminary trial of carisoprodol in multiple sclerosis. Practitioner, 192, 540-542.
  • Broeren, J., Rydmark, M., Bjorkdahl, A., & Sunnerhagen, K.S. (2007). Assessment and training in a 3-dimensional virtual environment with haptics: a report on 5 cases of motor rehabilitation in the chronic stage after stroke. Neurorehabilitation & Neural Repair, 21(2), 180-189.
  • Collin, C., Wade, D.T., Davies, S., & Horne, V. (1988). The Barthel ADL Index: a reliability study. International Disability Studies, 10, 61-63.
  • Cromwell, F.S (1965). Occupational therapists manual for basic skills assessment: primary prevocational evaluation. Pasadena, (CA): Fair Oaks Printing; 29-31.
  • Daley, K., Mayo, N.E., Wood-Dauphinee, S., Danys, I., & Cabot, R. (1997). Verification of the Stroke Rehabilitation Assessment of Movement (STREAM). Physiotherapy Canada, 49, 269-278.
  • Dannenbaum, R.M., Michaelsen, S.M., Desrosiers, J., & Levin, M.F. (2002). Development and validation of two new sensory tests of the hand for patients with stroke. Clinical Rehabilitation, 16(6), 630-639.
  • Desrosiers, J., Bravo, G., Hébert, R., Dutil, É., & Mercier, L. (1994). Validation of the box and block test as a measure of dexterity of elderly people: reliability, validity and norms studies. Archives of Physical Medicine and Rehabilitation, 75, 751-755.
  • Desrosiers, J., Rochette, A., Hebert, R., & Bravo, G. (1997). The Minnesota manual dexterity test: reliability, validity and reference values studies with healthy elderly People. Canadian Journal of Occupational Therapy, 64(5), 270-276.
  • Finch, E., Brooks, D., Stratford, P.W., & Mayo, N.E. (2002). Physical Outcome Measures: A guide to enhance physical outcome measures. Ontario, Canada: Lippincott, Williams & Wilkins.
  • Fugl-Meyer, A.R., Jääskö, L., Leyman, I., Olsson, S., & Steglind, S. (1975). The post-stroke hemiplegic patient 1. A method for evaluation of physical performance. Scandinavian Journal of Rehabilitation Medicine, 7, 13-31.
  • Hébert, R., Carrier, R., & Bilodeau, A. (1988). The functional autonomy measurement system (SMAF): description and validation of an instrument for the measurement of handicaps. Age and Ageing, 17, 293-302.
  • Heller, A., Wade, D.T., Wood, V.A., Sunderland, A., Hewer, R., & Ward, E. (1987). Arm function after stroke: measurement and recovery over the first three months. Journal of Neurology, Neurosurgery & Psychiatry, 50(6), 714-719.
  • Higgins, J., Mayo, N.E., Desrosiers, J., Salbach, N.M., & Ahmed, S. (2005). Upper-limb function and recovery in the acute phase poststroke. Journal of Rehabilitation Research & Development, 42(1), 65-76.
  • Jebsen, R.H., Taylor, N., Trieschmann, R.B., Trotter, M.J., & Howard, L.A. (1969). An objective and standardized test of hand function. Archives of Physical Medicine and Rehabilitation, 50, 311-319.
  • Kellor, M., Frost, J., Silberberg, N., Iversen, I., & Cummings R. (1971). Hand strength and dexterity. American Journal of Occupational Therapy, 25, 77-83.
  • Lin, K-C., Chuang, L-L., Wu, C-Y., Hsieh, Y-W. & Chang, W-Y. (2010). Responsiveness and validity of three dexterous function measures in stroke rehabilitation. Journal of Rehabilitation Research and Development, 47(6), 563-572.
  • Lyle, R.C. (1981). A performance test for assessment of upper limb function in physical rehabilitation treatment and research. International Journal of Rehabilitation Research, 4, 483-492.
  • Mathiowetz, V., Volland, G., Kashman, N., & Weber, K. (1985-1). Adult norms for the box and block test of manual dexterity. American Journal of Occupational Therapy, 39, 386-391.
  • Mathiowetz, V., Weber, K., Kashman, N., & Volland, G. (1985-2). Adult norms for the nine hole peg test of finger dexterity. Occupational Therapy Journal of Research, 5, 24-33.
  • Mathiowetz, V., Kashman, N., Volland, G., Weber, K., Dowe, M., & Rogers, S. (1985-3). Grip and pinch strength: normative data for adults. Archives of Physical Medicine and Rehabilitation, 66, 69-72.
  • Mercier, C. & Bourbonnais, D. (2004). Relative shoulder flexor and handgrip strength is related to upper limb function after stroke. Clinical Rehabilitation, 18(2), 215-221.
  • Platz, T., Pinkowski, C., van Wijck, F., Kim, I.H., di Bella, P., & Johnson, G. (2005). Reliability and validity of arm function assessment with standardized guidelines for the Fugl-Meyer Test, Action Research Arm Test and Box and Block Test: a multicentre study. Clinical Rehabilitation, 19(4), 404-411.
  • Schneider, S., Schonle, P.W., Altenmuller, E., & Munte, T.F. (2007). Using musical instruments to improve motor skill recovery following a stroke. Journal of Neurology, 254(10), 1339-1346.
  • Tiffin, J. (1968). Purdue Pegboard Examiner Manual. Chicago, USA: Science Research Associates.

See the measure

How to obtain the BBT

The BBT instructions can be obtained in the study by Mathiowetz et al. (1985-1).

Standardized equipment can be obtained at the website:
http://www.sammonspreston.com/Supply/Product.asp?Leaf_Id=7531

By clicking here, you can access a video showing how to administer the assessment.

Chedoke Arm and Hand Activity Inventory (CAHAI)

Evidence Reviewed as of before: 08-01-2009
Author(s)*: Sabrina Figueiredo, BSc
Editor(s): Nicol Korner-Bitensky, PhD OT; Elissa Sitcoff, BA BSc
Expert Reviewer: Susan Barreca,MSc, PT

Purpose

The Chedoke Arm and Hand Activity Inventory (CAHAI) is a functional assessment of the recovering arm and hand after stroke. The CAHAI complements the Chedoke-McMaster Stroke Assessment (Barreca, Stratford, Masters, Lambert, Griffiths, and McBay, 2006).

In-Depth Review

Purpose of the measure

The Chedoke Arm and Hand Activity Inventory (CAHAI) is a functional assessment of the recovering arm and hand after stroke. The CAHAI complements the Chedoke-McMaster Stroke Assessment (Barreca, Stratford, Masters, Lambert, Griffiths, and McBay, 2006).

Available versions

The CAHAI was developed by Barreca, Gowland, Stratford, Huijbregts, Griffiths, Torresin, Dunkley, Miller, and Masters in 2004 to address the need for a valid, clinically relevant, and responsive functional assessment of the recovering paretic upper limb.

Three shortened versions of the CAHAI were developed by Barreca, Stratford, Masters, Lambert, Griffiths, and McBay in 2006. The shortened versions have 7, 8 or 9 items and are identified as CAHAI-7, CAHAI-8, CAHAI-9, respectively.

Features of the measure

Items:

The original CAHAI consists of 13 functional items that are non-gender specific, involve both upper limbs, and incorporate a range of movements and grasps that reflect stages of motor recovery following stroke. The following items were generated from a review of the scientific literature on stroke, as well as from input from individuals with stroke and their families (Barreca et al., 2004):

  1. Open a jar of coffee
  2. Dial 911
  3. Draw a line with a ruler
  4. Pour a glass of water
  5. Wring out a washcloth
  6. Do up five buttons
  7. Dry back with a towel
  8. Put toothpaste on a toothbrush
  9. Cut medium consistency putty
  10. Clean eye glasses
  11. Zip up a zipper
  12. Place a container on a table
  13. Carry a bag up the stairs

The CAHAI-7 utilizes the first 7 items, CAHAI-8 the first 8 items, and CAHAI-9 the first 9 items. The 13 items together represent the original CAHAI (Barreca et al., 2006). On average, clients with stroke consider items 1, 2, 4 and 12 easy to perform; items 8, 10, 11, and 13 moderately difficult; and items 3, 6, 7, and 9 the most difficult (Barreca et al., 2004).

Detailed administration guidelines are in the development manual, which can be obtained by visiting the official website: http://www.cahai.ca

Scoring:

Each item of the CAHAI is scored on a 7-point quantitative scale, similar to the scale used in the Functional Independence Measure (FIM) (Keith, Granger, Hamilton, & Sherwin, 1987).

A score of

  • 1 = client needs total assistance and the weak upper limb performs less than 25% of the task;
  • 2 = client needs maximal assistance and the weak upper limb performs 25% to 49% of the task. There are no signs of arm or hand manipulation, only stabilization;
  • 3 = client needs moderate assistance and the weak upper limb performs 50% to 74% of the task. Begins to show signs of arm or hand manipulation;
  • 4 = client needs minimal assistance (light touch) and the weak upper limb performs more than 75% of the task;
  • 5 = client requires supervision, coaxing, or cueing;
  • 6 = client requires use of assistive devices or requires more than reasonable time, or there are safety concerns; and
  • 7 = total independence in completing the task.

The minimum possible score for the CAHAI is 13 and the maximum is 91, with higher scores indicating greater functional independence (Barreca et al., 2004; Barreca, Stratford, Lambert, Masters, & Streiner, 2005; Barreca, Stratford, Masters, Lambert, & Griffiths, 2006b).
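
The total score is simply the sum of the item scores. The sketch below illustrates this and is not an official scoring tool; the function names are hypothetical. The 13 to 91 range is the one stated above, while the ranges for the shortened versions are an assumption derived by applying the same 1-to-7 scale to 7, 8 or 9 items.

```python
# Number of items in each version of the CAHAI.
VERSION_ITEMS = {"CAHAI-7": 7, "CAHAI-8": 8, "CAHAI-9": 9, "CAHAI-13": 13}


def score_range(version):
    """Minimum and maximum total score, with each item scored from 1 to 7."""
    n_items = VERSION_ITEMS[version]
    return n_items * 1, n_items * 7


def cahai_total(item_scores, version="CAHAI-13"):
    """Sum the item scores after checking their count and the 1-to-7 range."""
    n_items = VERSION_ITEMS[version]
    if len(item_scores) != n_items:
        raise ValueError(f"{version} expects {n_items} item scores")
    for score in item_scores:
        if not isinstance(score, int) or not 1 <= score <= 7:
            raise ValueError("each item score must be an integer from 1 to 7")
    return sum(item_scores)


print(score_range("CAHAI-13"))  # (13, 91)
print(score_range("CAHAI-7"))   # (7, 49)

# Hypothetical client scored on the 7-item version.
print(cahai_total([7, 6, 3, 5, 4, 2, 3], version="CAHAI-7"))  # 30
```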

The affected limb is also scored according to its positioning and functioning during test performance. The therapist should record the performance of the affected limb on each item by checking the appropriate box. The scoring table for the CAHAI is as follows (Barreca et al., 2004):

Item: Affected limb (check one)
1) Open a jar of coffee: Holds jar / Holds lid
2) Call 911: Holds receiver / Dials phone
3) Draw a line with ruler: Holds ruler / Holds pen
4) Put toothpaste on toothbrush: Holds toothpaste / Holds brush
5) Cut medium consistency putty: Holds knife / Holds fork
6) Pour a glass of water: Holds glass / Holds pitcher
7) Clean a pair of eyeglasses: Holds glasses / Wipes lenses
8) Zip up the zipper: Holds zipper / Holds zipper pull
9) Dry back with towel: Reaches for towel / Grasps towel end

Note: Standardized instructions on scoring can be obtained by visiting the official website: http://www.cahai.ca

Time:

The time to administer and score the CAHAI is approximately 25 minutes (Barreca et al., 2004; Barreca et al., 2006).

Subscales:

None

Equipment required:

CAHAI-7

Version (Items 1-7) requires all items in Equipment List A

Equipment List A

  • height adjustable table
  • chair/wheelchair without armrests
  • dycem
  • 200g jar of coffee
  • push-button telephone
  • 12″/30cm ruler
  • 8.5″ x 11″ paper
  • pencil
  • 2.3L plastic pitcher with lid, filled with 1600 ml water
  • 250 ml plastic cup
  • wash cloth
  • wash basin (24.5 cm. in diameter, height 8 cm.)
  • Pull-on vest with 5 buttons (one side male & one side female), buttons (1.5 cm. in diameter, 7 cm. apart)
  • bath towel (65cm X 100cm)

CAHAI-8

Version (Items 1-8) requires all items in Equipment List A and B

Equipment List B

  • 75ml toothpaste with screw lid, >50% full
  • toothbrush

CAHAI-9

Version (Items 1-9) requires all items in Equipment List A, B, and C

Equipment List C

  • dinner plate (Melamine or heavy plastic, 25 cm. in diameter)
  • medium resistance putty
  • knife and fork
  • built up handles the length of the utensil handle

CAHAI-13

Version (Items 1-13) requires all items in Equipment List A, B, C, and D

Equipment List D

  • 27″/67cm metal zipper in polar fleece poncho
  • eyeglasses
  • handkerchief
  • Rubbermaid 38L container (50 x 37 x 27cm)
  • 4 standard size steps with rail
  • plastic grocery bag holding 4lb/2kg weight

Training:

Training may be provided by the authors as a half-day workshop. There is a training DVD available in English for a cost of $29.00 Canadian including shipping. Only cheque or money orders are processed.

Alternative forms of the CAHAI

CAHAI-7, CAHAI-8, CAHAI-9

Client suitability

Can be used with:

  • Clients with stroke.

Should not be used in:

  • To date, there is no information on restrictions of using the CAHAI.

In what languages is the measure available?

English, French, German, Hebrew, Italian

Summary

What does the tool measure? The CAHAI assesses upper limb functional recovery.
What types of clients can the tool be used for? The CAHAI can be used with, but is not limited to clients with stroke.
Is this a screening or assessment tool? Assessment
Time to administer An average of 20 to 25 minutes
Versions CAHAI, CAHAI-9, CAHAI-8, CAHAI-7.
Other Languages English, French, German, Hebrew and Italian.
Measurement Properties
Reliability Internal consistency:
Two studies have examined the internal consistency of the CAHAI and its shortened versions and reported excellent internal consistency using Cronbach’s alpha.

Test-retest:
One study examined the test-retest reliability of the CAHAI and reported excellent test-retest reliability using the Intraclass Correlation Coefficient (ICC).

Intra-rater:
No studies have examined the intra-rater reliability of the CAHAI.

Inter-rater:
One study examined the inter-rater reliability of the CAHAI and reported excellent inter-rater reliability using ICC.

Validity Content:
One study examined the content validity of the CAHAI and reported that items were generated from a review of scientific literature and from input from clients with stroke, their family and caregivers. Items with poor frequency endorsement, difficulty to be standardized, and high inter-item correlation were eliminated.

Criterion:
Concurrent:
One study examined the concurrent validity of the CAHAI and the CAHAI-9 and reported that the CAHAI-9 was not able to predict individual scores and individual change scores of the CAHAI, using regression analysis.

Predictive:
No studies have examined the predictive validity of the CAHAI.

Construct:
Convergent:
Three studies examined convergent validity of the CAHAI and reported excellent correlations between all versions of the CAHAI and the Action Research Arm Test, and all versions of the CAHAI and the Chedoke-McMaster Stroke Assessment (CMSA), and poor to moderate correlations between the CAHAI and the CMSA shoulder pain score, using Pearson Correlation.

Known Groups:
Three studies examined longitudinal/known groups validity of all versions of the CAHAI and reported that all versions are able to distinguish changes between subjects with acute and chronic stroke, and mild from severe impairments, using Receiver Operating Characteristic (ROC) curve analysis.

Floor/Ceiling Effects No studies have examined the floor/ceiling effects of the CAHAI.
Sensitivity/ Specificity No studies have examined the sensitivity/specificity of the CAHAI.
Does the tool detect change in patients? One study examined the responsiveness of the CAHAI and reported that the minimal detectable change between two evaluations in stable patients was 6.3 points.
Acceptability The CAHAI is highly accepted by clients with stroke because it is made up of real-life and non-gender-specific items.
Feasibility The administration of the CAHAI is easy and quick to perform.
How to obtain the tool? The CAHAI can be obtained free of charge by visiting the official website: http://www.cahai.ca

Psychometric Properties

Overview

We conducted a literature search to identify all relevant publications on the psychometric properties of the Chedoke Arm and Hand Activity Inventory (CAHAI) in individuals with stroke. We identified four studies. The CAHAI appears to be responsive in clients with stroke.

Floor/Ceiling Effects

No studies have examined floor/ceiling effects of the CAHAI.

Reliability

Internal Consistency:
Barreca, Gowland, Stratford, Huijbregts, Griffiths, Torresin, Dunkley, Miller, and Masters (2004) assessed the internal consistency of the CAHAI in 100 clients with stroke. Internal consistency of the CAHAI, as calculated using Cronbach’s Coefficient Alpha was excellent (α = 0.98).

Barreca, Stratford, Masters, Lambert, Griffiths, and McBay (2006) examined the internal consistency of the CAHAI-7, CAHAI-8, and CAHAI-9 in 39 clients with stroke. Internal consistency of all shortened versions of the CAHAI, as calculated using Cronbach’s Coefficient Alpha, was excellent (α = 0.97; α = 0.98; α = 0.98, respectively).
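
For readers unfamiliar with the statistic, the following is a minimal sketch of how Cronbach's alpha is conventionally computed from an item-score matrix. The function name and the sample scores are hypothetical and are not taken from the cited studies; the studies do not describe their software or exact computation.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of participants, each a list of item scores.

    Conventional formula: alpha = k/(k-1) * (1 - sum(item variances) / variance of totals).
    """
    k = len(item_scores[0])  # number of items
    item_variances = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores for 4 participants on 3 items (illustration only).
example_scores = [
    [2, 3, 2],
    [5, 5, 6],
    [7, 6, 7],
    [4, 4, 3],
]
alpha = cronbach_alpha(example_scores)
print(f"Cronbach's alpha: {alpha:.2f}")
```

Values close to 1 indicate that the items vary together across participants, which is how the alpha values reported above are usually read.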

Test-retest:
Barreca et al. (2006) examined the test-retest reliability of the shortened versions of the CAHAI in 39 clients with stroke. Participants were stratified into two different groups based on the amount of expected improvement. Participants were re-assessed following a 36-hour interval. The test-retest reliability, as calculated using the Intraclass Correlation Coefficient (ICC), was excellent for all shortened versions: CAHAI-7 (ICC = 0.96), CAHAI-8 (ICC = 0.97), and CAHAI-9 (ICC = 0.97).

Intra-rater:
No studies have examined the intra-rater reliability of the CAHAI.

Inter-rater:
Barreca, Stratford, Lambert, Masters, and Streiner (2005) assessed the inter-rater reliability of the CAHAI in 39 clients with stroke. Participants were stratified into two different groups based on the amount of expected improvement. Participants were re-assessed following a 36-hour interval. The inter-rater reliability, as calculated using the Intraclass Correlation Coefficient (ICC), was excellent (ICC = 0.98).

Validity

Content:

Barreca et al. (2004) performed a literature review to generate items for the CAHAI. From this review, 177 items were selected. Eighty-one clients with stroke, their families and caregivers were surveyed about important and relevant items regarding stroke recovery, which generated an additional 574 items. To reduce the 725 generated items to 26 items, only bilateral, gender-neutral items that fell into the domains identified by clients as important and that were easy to obtain were kept. This version, with 26 items, was then tested in 20 participants with stroke. Items that were difficult to standardize or those with the potential for safety concerns were eliminated. Items with a high degree of difficulty were added in order to minimize possible ceiling effects. Inter-item correlation analyses of this new version (which contained 25 items) identified some redundant items (r > 0.90). Items with poor frequency endorsement, difficulty to standardize and high inter-item correlation were eliminated, resulting in the 13 finalized items.

Criterion:

Concurrent:
Barreca, Stratford, Masters, Lambert, & Griffiths (2006b) examined the ability of the CAHAI-9 to predict the scores and change scores of the original CAHAI in 105 clients with stroke. Mean scores and mean change scores of the CAHAI-9 accurately predicted mean scores and mean change scores of the CAHAI. However, individual scores and individual change scores of the CAHAI-9 displayed moderate variability in predicting individual scores and change scores of the CAHAI. The findings indicate that the CAHAI-9 should not be administered with the intent to predict the CAHAI.

Predictive:
No studies have examined the predictive validity of the CAHAI.

Construct:

Convergent/Discriminant:
Barreca et al. (2005) estimated convergent validity of the CAHAI by comparing it to the Chedoke-McMaster Stroke Assessment (CMSA – Gowland, Stratford, Ward, Moreland, Torresin, VanHullenaar et al., 1993; Gowland, VanHullenaar, Torresin, et al., 1995) arm-hand sum score, and with the Action Research Arm Test (ARAT – Lyle, 1981) in 39 participants with stroke. Assessments were performed at baseline and 2 to 6 weeks later. Correlations, as calculated using the Pearson Correlation Coefficient, were excellent between the CAHAI and the ARAT (r = 0.93) and between the CAHAI and the CMSA arm-hand at baseline (r = 0.81) and at follow-up (r = 0.89). In the same study, the authors analyzed discriminant validity of the CAHAI by comparing it to the CMSA shoulder pain score in the same 39 participants with stroke. The correlation between the CAHAI and CMSA shoulder pain score, as calculated using the Pearson Correlation Coefficient, was adequate at baseline (r = 0.47) and at follow-up (r = 0.39).

Barreca et al. (2006) assessed the convergent validity of the CAHAI-7, CAHAI-8 and CAHAI-9 by comparing them to the Action Research Arm Test (ARAT), CAHAI and CMSA in 39 individuals with stroke. Pearson Correlations were used. Correlations between the ARAT and CAHAI-7 (r = 0.95), CAHAI-8 (r = 0.95) and CAHAI-9 (r = 0.94) were all excellent, as were those between the CAHAI and all the shortened versions (r = 0.99), and between the CMSA and CAHAI-7 (r = 0.85), CAHAI-8 (r = 0.84), and CAHAI-9 (r = 0.84).

Barreca et al. (2006b) determined the convergent validity of the CAHAI-9 and CAHAI by comparing them to the ARAT (Lyle, 1981) in 105 individuals with stroke. Re-assessments were performed after a 36-hour interval. Pearson Correlation Coefficients were excellent between the CAHAI-9 and ARAT at baseline (r = 0.93) and at follow-up (r = 0.95), as well as between the CAHAI and ARAT at baseline (r = 0.93) and at follow-up (r = 0.95).

Known groups:
Barreca et al. (2005) analyzed the longitudinal validity of the CAHAI in 39 clients with stroke by comparing change scores on the CAHAI with change scores on the arm-hand sum and on the shoulder pain dimensions of the Chedoke-McMaster Stroke Assessment (CMSA – Gowland et al., 1995) and on the Action Research Arm Test (ARAT – Lyle, 1981). Change score correlations, as calculated using the Pearson Correlation Coefficient, were excellent between the CAHAI and the ARAT (r = 0.86), adequate between the CAHAI and the CMSA arm-hand sum (r = 0.52) and poor between the CAHAI and the CMSA shoulder pain (r = -0.24). In a second analysis, Barreca et al. (2005) analyzed whether the CAHAI was more adept than the CMSA and the ARAT at distinguishing change in patients with mild/moderate impairments from patients with severe impairments in 39 clients with stroke. Longitudinal/known groups validity, as calculated using Receiver Operating Characteristic (ROC), demonstrated an excellent area under the curve for the CAHAI (ROC = 0.95). The ARAT and CMSA presented an adequate area under the curve (ROC = 0.88 and ROC = 0.76, respectively).
Note: ROC curve analysis quantifies a measure’s ability to distinguish between groups as an area under the ROC curve. Greater areas indicate the measure is better at discriminating between individuals in the two groups.
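
As an illustration of the note above, here is a minimal sketch of how an area under the ROC curve can be computed for two groups. It relies on the standard equivalence between the area under the curve and the probability that a randomly chosen member of one group scores higher than a randomly chosen member of the other (ties counted as half). The function name and the change scores are hypothetical and are not data from the cited studies.

```python
def roc_auc(scores_expected_higher, scores_expected_lower):
    """Area under the ROC curve for two groups of scores.

    Computed as the proportion of cross-group pairs in which the member of the
    first group has the higher score, with ties counted as half a pair.
    """
    wins = 0.0
    for a in scores_expected_higher:
        for b in scores_expected_lower:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(scores_expected_higher) * len(scores_expected_lower))


# Hypothetical change scores (illustration only).
mild_moderate_change = [12, 9, 15, 7, 11]
severe_change = [3, 5, 8, 2, 6]

auc = roc_auc(mild_moderate_change, severe_change)
print(f"Area under the ROC curve: {auc:.2f}")
```

An area of 0.5 means the measure separates the two groups no better than chance, and an area of 1.0 means every member of the first group changed more than every member of the second.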

Barreca et al. (2006) assessed the longitudinal validity of the CAHAI and its three shortened versions in 39 participants with stroke. Participants were divided according to stroke severity into acute and chronic groups. The CAHAI, CAHAI-7, CAHAI-8, and CAHAI-9 were administered at admission and discharge (2 to 6 weeks after admission) to verify which version was more adept at distinguishing changes in patients with acute stroke from changes in patients with chronic stroke. Longitudinal/known groups validity, as calculated using Receiver Operating Characteristic (ROC), demonstrated an excellent area under the curve for all versions of the CAHAI as follows: CAHAI (ROC = 0.95); CAHAI-7 (ROC = 0.97); CAHAI-8 (ROC = 0.93), and CAHAI-9 (ROC = 0.94), meaning all versions of the CAHAI are similarly able to distinguish changes between different groups in stroke.

Barreca et al. (2006b) examined the longitudinal validity of the CAHAI, CAHAI-9 and the ARAT in 105 individuals with stroke. Participants were stratified between mild/moderate impairments and severe impairments, and those with mild/moderate impairments were expected to show greater changes across two repeated measures. The three outcome measures were administered at two points in time to verify which of them was more adept at distinguishing changes in clients with mild/moderate impairment from changes in clients with severe impairment. Longitudinal/known groups validity, as calculated using Receiver Operating Characteristic (ROC), was adequate for the ARAT (ROC = 0.72), the CAHAI-9 (ROC = 0.82), and the CAHAI (ROC = 0.86). This ROC analysis indicated that the CAHAI was the best measure for distinguishing change in patients with mild/moderate impairment from change in patients with severe impairment.

Responsiveness

Barreca et al. (2005) assessed the minimal detectable change of the CAHAI in 39 clients with stroke. Participants were assessed at two points in time: at admission, and after 2 to 6 weeks. For the CAHAI, the minimal detectable change was 6.3 points, meaning that stable patients displayed random fluctuations of 6.3 CAHAI points or less when assessed on two different occasions.

References

  • Barreca, S.R., Gowland, C.K., Stratford, P.W., et al. (2004). Development of the Chedoke Arm and Hand Activity Inventory: Theoretical constructs, item generation, and selection. Topics in Stroke Rehabilitation, 11(4), 31-42.
  • Barreca, S.R., Stratford, P.W., Lambert, C.L., Masters, L.M., & Streiner, D.L. (2005). Test-retest reliability, validity, and sensitivity of the Chedoke Arm and Hand Activity Inventory: a new measure of upper-limb function for survivors of stroke. Archives of Physical Medicine and Rehabilitation, 86, 1616-1622.
  • Barreca, S.R., Stratford, P.W., Masters, L.M., Lambert, C.L., Griffiths, J., McBay, C. (2006). Validation of three shortened versions of the Chedoke Arm and Hand Activity Inventory. Physiotherapy Canada, 58, 148-156.
  • Barreca, S.R., Stratford, P.W., Masters, L.M., Lambert, C.L., Griffiths, J. (2006b). Comparing two versions of the Chedoke Arm and Hand Activity Inventory with the Action Research Arm Test. Physical Therapy, 86(2), 245-253.
  • Gowland, C., Stratford, P., Ward, M., Moreland, J., Torresin, W., VanHullenaar, S. et al. (1993). Measuring physical impairment and disability with the Chedoke-McMaster Stroke Assessment. Stroke, 24, 58-63.
  • Gowland, C., VanHullenaar, S., Torresin, W., et al. (1995). Chedoke-McMaster Stroke Assessment: development, validation, and administration manual. Hamilton, ON, Canada: School of Rehabilitation Science, McMaster University.
  • Heller, A., Wade, D.T., Wood, V.A., Sunderland, A., Hewer, R., & Ward, E. (1987). Arm function after stroke: measurement and recovery over the first three months. Journal of Neurology, Neurosurgery & Psychiatry, 50(6), 714-719.
  • Keith, R.A, Granger, C.V., Hamilton, B.B., & Sherwin, F.S. (1987). The Functional Independence Measure: a new tool for rehabilitation. In: Eisenberg, M.G. & Grzesiak, R.C. (Ed.), Advances in clinical rehabilitation (pp. 6-18). New York: Springer Publishing Company.
  • Kellor, M., Frost, J., Silberberg, N., Iversen, I., & Cummings R. (1971). Hand strength and dexterity. American Journal of Occupational Therapy, 25, 77-83.
  • Lyle, R.C. (1981). A performance test for assessment of upper limb function in physical rehabilitation treatment and research. International Journal of Rehabilitation and Research, 4, 483-492.
  • Mathiowetz, V., Kashman, N., Volland, G., Weber, K., Dowe, M., & Rogers, S. (1985). Grip and pinch strength: normative data for adults. Archives of Physical Medicine and Rehabilitation, 66, 69-72.
  • Mathiowetz, V., Weber, K., Kashman, N., & Volland, G. (1985b). Adult norms for the nine hole peg test of finger dexterity. Occupational Therapy Journal of Research, 5, 24-33.

See the measure

How to obtain the CAHAI

The CAHAI can be obtained free of charge by visiting the official website: http://www.cahai.ca

Comprehensive Coordination Scale (CCS)

Evidence Reviewed as of before: 11-11-2021
Author(s)*: Sandra R. Alouche; Marika Demers; Roni Molad ; Mindy F. Levin

Purpose

The Comprehensive Coordination Scale (CCS) is a measure of coordination of multiple body segments at both motor performance (endpoint movement) and quality of movement (joint rotations and interjoint coordination) levels based on observational kinematics.

In-Depth Review

Purpose of the measure

 The Comprehensive Coordination Scale (CCS) is a measure of coordination of multiple body segments at both motor performance (endpoint movement) and quality of movement (joint rotations and interjoint coordination) levels based on observational kinematics. Coordinated movements are defined as movements of one or more limbs or body segments that occur together in identifiable temporal (i.e., timing) and spatial (i.e., positional/angular) patterns, concerning the desired action. It can be measured at a specific point in time during the movement or over the whole movement time.

The CCS can be used by healthcare professionals to assess coordination in older adults and individuals with various neurological conditions. The CCS is composed of six different tests: the Finger-to-Nose Test, the Arm-Trunk Coordination Test, the Finger Opposition Test, the Interlimb Coordination (synchronous anti-phase forearm rotations) Test, the Lower Extremity MOtor COordination Test (LEMOCOT) and the Four-limb Coordination (Upper and lower limb movements) Test.

Available versions

The CCS was developed by Alouche et al. (2021) from valid and reliable tests used in clinical practice and research to assess complementary aspects of motor coordination of the trunk, upper limb (UL), lower limb (LL) and combinations of them. Behavioral elements used to perform each test were identified and rating scales were developed to guide observational kinematic analysis by expert consensus (Alouche et al., 2021).

Features of the measure

 Items:
The CCS consists of 6 different tests used in either clinical practice or research to assess complementary aspects of motor coordination of the trunk, upper limb (UL), lower limb (LL) and combinations of them.

  1. Finger-to-Nose Test (FTN)
  2. Arm-Trunk Coordination Test (ATC)
  3. Finger Opposition Test (FOT)
  4. Interlimb Coordination Test (ILC-2)
  5. Lower Extremity MOtor COordination Test (LEMOCOT)
  6. Four-limb Coordination Test (ILC-4)

| Body parts tested | Type of test | Test | Behavioral elements scored |
| --- | --- | --- | --- |
| Upper limb | Unilateral | Finger-to-Nose (FTN) | Spatial: Stability, smoothness, accuracy; Temporal: Speed |
| Trunk and arm | Unilateral | Arm-Trunk Coordination test (ATC) | Spatial: Accuracy, interjoint coordination |
| Upper limb (fine dexterity) | Unilateral | Finger Opposition (FOT) | Spatial: Selectivity; Temporal: Timing |
| Interlimb coordination = both upper limbs | Bilateral | Alternate movements of two upper limbs (ILC-2) | Spatial: Compensation; Temporal: Synchronicity/timing |
| Lower limb | Unilateral | Lower Extremity MOtor COordination Test (LEMOCOT) | Spatial: Smoothness, accuracy; Temporal: Speed |
| Four-limb coordination = upper limbs and lower limbs | Bilateral | Alternate movements of both hands and feet (ILC-4) | Temporal: Timing/complexity |

Scoring:
Multiple behavioral elements of each test are scored on separate rating scales ranging from 3 (normal coordination) to 0 (impaired coordination) to assess different elements of motor behavior needed to perform the action.
The CCS includes a total of 13 rating scales for the 6 tests.
The CCS score ranges from 0 to 69 points, with higher scores indicating better motor coordination. The CCS total score represents a coordination score for the whole body.
The CCS scores can be broken into 4 subscores: UL, LL, Unilateral, Bilateral.
UL: 54 points (includes FTN-24 points, ATC-12 points, FOT-12 points, and ILC2-6 points).
LL: 12 points (includes LEMOCOT-12 points).
Unilateral: 30 points (includes FTN-12 points, ATC-6 points, FOT-6 points, and LEMOCOT-6 points).
Bilateral: 9 points (includes ILC2-6 points and ILC4-3 points).
The manual describes the initial position, the instructions, and the detailed scoring.
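
The subscore maxima listed above can be tallied as a quick consistency check. The sketch below uses only the point values given in this section. The dictionary names are hypothetical, and the assumption that the total is the sum of the UL, LL and ILC4 maxima is an inference from the arithmetic (54 + 12 + 3 = 69), not a rule quoted from the manual.

```python
# Maximum points per test, copied from the subscore breakdown above.
UL_MAX = {"FTN": 24, "ATC": 12, "FOT": 12, "ILC2": 6}
LL_MAX = {"LEMOCOT": 12}
UNILATERAL_MAX = {"FTN": 12, "ATC": 6, "FOT": 6, "LEMOCOT": 6}
BILATERAL_MAX = {"ILC2": 6, "ILC4": 3}

ul_max = sum(UL_MAX.values())
ll_max = sum(LL_MAX.values())
unilateral_max = sum(UNILATERAL_MAX.values())
bilateral_max = sum(BILATERAL_MAX.values())

# Assumption: the total score combines the UL and LL subscores plus the
# four-limb test (ILC4), which belongs to neither of those two subscores.
total_max = ul_max + ll_max + BILATERAL_MAX["ILC4"]

print(f"UL: {ul_max}, LL: {ll_max}, Unilateral: {unilateral_max}, "
      f"Bilateral: {bilateral_max}, Total: {total_max}")
```

Consult the CCS manual for the authoritative scoring rules, including how each rating scale contributes to each subscore.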

What to consider before beginning:
The CCS is scored based on observational kinematics.

Time:
The CCS takes approximately 10-15 minutes to administer (Molad et al., 2021).

Training requirements:
The healthcare professional should read the CCS manual available on Open Science Framework:  Marika Demers, Mindy F Levin, Roni Molad, and Sandra Alouche. 2021. “Comprehensive Coordination Scale.” OSF. July 12. osf.io/8h7nm.

 Equipment:

  • Chair with back support and without armrests (suggested seat height: 46 cm)
  • Footstool, if needed
  • Targets:
    • One 2.54 cm-diameter sticker (FTN)
    • One target (sphere of 2.54 cm-diameter or a cube of similar dimensions) on an adjustable height support (ATC)
    • Two 5 cm-diameter stickers placed 30 cm (centre-to-centre) apart and attached to a cardboard (LEMOCOT test)
  • Stopwatch / timer
  • Table (optional, suggested height: 72 cm)
  • Pillow (optional)

Client suitability

Can be used with:

  • Individuals with neurological disorders

Should not be used with:

  • No information available

In what languages is the measure available?

English

Summary

What does the tool measure? Temporal and spatial aspects of coordination.
What types of clients can the tool be used for? The CCS can be used with patients with neurological disorders.
Is this a screening or assessment tool? Assessment tool.
Time to administer 10-15 minutes.
ICF Domain Body function.
Other Languages French Canadian, Portuguese (both not published)
Measurement Properties
Reliability Internal consistency:
One study has reported high internal consistency of the CCS in a stroke population (Molad et al., 2021).

Test-retest:
One study examined test-retest reliability of the CCS within a stroke population and reported excellent test-retest reliability (ICC = 0.97; 95% CI: 0.93-0.98; Molad et al., 2021).

Intra-rater:
One study examined intra-rater reliability of the CCS within a stroke population and reported excellent intra-rater reliability (ICC = 0.97; 95% CI: 0.93-0.98; Molad et al., 2021).

Inter-rater:
One study examined inter-rater reliability of the CCS within a stroke population and reported excellent inter-rater reliability (ICC = 0.98; 95% CI: 0.95-0.99; Molad et al., 2021).

Validity Content:
One study examined the content validity of the CCS using a Delphi study conducted with a panel of experts. The CCS was found to have strong content validity (Alouche et al., 2021).

Criterion:
Concurrent:
Concurrent validity of the CCS has not been examined within a stroke population.
Predictive:
Predictive validity of the CCS has not been examined within a stroke population.

Construct:
Convergent/Discriminant:
One study has examined convergent validity of the CCS within a stroke population and reported adequate convergent validity with the Fugl-Meyer Total Score (ρ=0.602; p=0.001) and the Fugl-Meyer Motor Score (ρ=0.585; p<0.001) (Molad et al., 2021).
Known Groups:
One study has examined the known-group validity of the upper-limb Interlimb Coordination Test (ILC2), a subscale of the CCS, within a stroke population and reported that the ILC2 is able to distinguish between age-matched healthy individuals and chronic stroke survivors (Molad & Levin, 2021).

Floor/Ceiling Effects One study reported no floor or ceiling effects for the CCS total score (Molad et al., 2021).
Does the tool detect change in patients? No studies have reported on the responsiveness of the CCS within a stroke population.
Acceptability The CCS is non-invasive and quick to administer. The use of visual observation instead of complex and costly motion analysis equipment to analyze movement makes this scale clinically accessible and easy to use.
Feasibility The CCS is free and is suitable for administration in various settings. The assessment requires minimal specialist equipment or training. It takes 10-15 minutes to be completed.
How to obtain the tool? Alouche SR, Molad R, Demers M, Levin MF. Development of a Comprehensive Outcome Measure for Motor Coordination; Step 1: Three-Phase Content Validity Process. Neurorehabil Neural Repair. 2021 Feb;35(2):185-193. doi: 10.1177/1545968320981955. [Supplementary materials]
The CCS manual can be accessed on the Open Science Framework website: Marika Demers, Mindy F Levin, Roni Molad, and Sandra Alouche. 2021. “Comprehensive Coordination Scale.” OSF. July 12. osf.io/8h7nm.

Psychometric Properties

Overview

A literature search was conducted to identify all relevant publications on the psychometric properties of the Comprehensive Coordination Scale (CCS) in individuals with stroke. We identified two studies.

Floor/Ceiling Effects

Molad et al. (2021) examined floor/ceiling effects of the CCS in a sample of 30 participants with chronic stroke. There were no floor/ceiling effects for the total score of the CCS and CCS-Bilateral subscale. For the CCS-UL and CCS-LL subscales, 3.3% and 6.7% of participants reached the maximal score, respectively. Ten percent of participants scored 0 or 30 on the CCS-Unilateral subscale.

Reliability

Internal consistency:
Molad et al. (2021) assessed the internal consistency of the CCS in a sample of 30 chronic stroke survivors, using principal component analysis and confirmatory factor analysis. The authors reported excellent internal consistency (composite reliability = 0.938). Factor analysis of the entire CCS revealed two components explaining 99% of the variance: Factor 1: movement quality (8 items), Factor 2: endpoint performance (5 items).

Intra-rater:
Molad et al. (2021) assessed the intra-rater reliability of the CCS in 30 chronic stroke survivors. The intra-rater reliability was evaluated with intraclass correlation coefficients (ICC) with 95% confidence intervals (CI). The CCS has excellent intra-rater reliability (ICC = 0.97; 95% CI: 0.93-0.98). All four subscales also have excellent intra-rater reliability: CCS-UL (ICC = 0.96; 95% CI: 0.92-0.98), CCS-LL (ICC = 0.79; 95% CI: 0.36-0.92), CCS-Unilateral (ICC = 0.98; 95% CI: 0.96-0.99) and CCS-Bilateral (ICC = 0.95; 95% CI: 0.89-0.97).

Inter-rater:
Molad et al. (2021) assessed the inter-rater reliability of the CCS in 30 chronic stroke survivors. The inter-rater reliability was evaluated with intraclass correlation coefficients (ICC) with 95% confidence intervals (CI). The CCS has excellent inter-rater reliability (ICC = 0.98; 95% CI: 0.95-0.99). All four subscales also have excellent inter-rater reliability: CCS-UL (ICC = 0.96; 95% CI: 0.91-0.98), CCS-LL (ICC = 0.76; 95% CI: 0.25-0.9), CCS-Unilateral (ICC = 0.99; 95% CI: 0.97-0.99) and CCS-Bilateral (ICC = 0.95; 95% CI: 0.89-0.98).

Validity

Content:
Alouche et al. (2021) conducted a 3-phase content validation supporting the importance, level of comprehension and feasibility of the CCS in identifying and quantifying coordination of movements made by individuals with neurological deficits in a clinical setting. First, a literature review was performed to generate unilateral and bilateral tests of UL, LL, and trunk coordination currently used in clinical practice or research studies for the CCS. From the 2761 studies reviewed, 5 tests were selected: FTN, ATC, LEMOCOT, ILC2, and ILC4. A Delphi study, using a structured questionnaire with open-ended questions, was done with 8 expert clinicians and researchers to identify the relative importance of each test, test element, and rating scales, the level of comprehension of the instructions, and the feasibility of each test. Then, a focus group meeting was held with 6 experts to refine the instructions and the rating scales. A consensus was reached to add the Finger Opposition Test (FOT) to the final version of the CCS to assess the selectivity and timing of finger movements.

Criterion:
Concurrent:
No studies have reported on the concurrent validity of the CCS.

Predictive:
No studies have reported on the predictive validity of the CCS.

Construct:
Convergent/Discriminant:
Molad et al. (2021) examined the convergent validity of the CCS in a sample of 30 chronic stroke survivors. Convergent validity of the total CCS was measured with the Fugl-Meyer Assessment (total score and motor score). Adequate convergent validity of the CCS with the FMA-Total Score (ρ=0.602; p=0.001) and the FMA-Motor Score (ρ=0.585; p<0.001) was obtained. The convergent validity of the subscales was measured with the Fugl-Meyer Assessment, prehension and pinch strength, Box and Blocks and 10-meter walk test. CCS-UL and CCS-Unilateral scores were moderately to strongly correlated with the Fugl-Meyer Assessment (total score and motor score), prehension and pinch strength, Box and Blocks and 10-meter walk test. The CCS-LL subscale was moderately correlated with the Fugl-Meyer Assessment (total score and motor score) and the Box and Blocks. The CCS-Bilateral subscale was moderately correlated with the Fugl-Meyer Assessment (total score and UL motor score) and the Box and Blocks.

Known Group:
Molad & Levin (2021) examined the known group validity of the ILC2 subscale in a sample of 13 stroke survivors and 13 healthy participants. They compared ILC2 scores with trunk and upper limb kinematics during synchronous bilateral anti-phase forearm rotations in 4 conditions: self-paced internally-paced, fast internally-paced, slow externally-paced, and fast externally-paced. Healthy participants had near maximal ILC2 scores and high temporal and spatial coordination indices. However, participants with stroke had lower ILC2 scores and used trunk and shoulder compensations to perform the task. ILC2 scores distinguished between healthy participants and participants with chronic stroke.

Responsiveness

 The responsiveness for the CCS has not been established.

Measurement error:
Molad et al. (2021) examined the measurement error in a sample of 30 chronic stroke survivors. The standard error of the measurement (SEM) was calculated based on the standard deviation (SD) of the sample and the reliability of measurement.  The minimal detectable change (MDC) at the 95% confidence level was computed. The CCS SEM was 1.80 points and the MDC95 was 4.98 points. The SEM and MDC values for the CCS, the CCS-UL, CCS-Unilateral and CCS-bilateral were less than 17%. Only the CCS-LL had an MDC greater than 17%.  For the CCS and all subscales, the SEM was smaller than the MDC.
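
The relationship between the SEM and the MDC95 can be sketched as follows. The review states only that the SEM was derived from the sample SD and the reliability of measurement; the formulas below (SEM = SD × √(1 − reliability) and MDC95 = 1.96 × √2 × SEM) are the conventional ones and are assumed here rather than quoted from the study. The function names and the SD value are hypothetical.

```python
import math


def sem_from_sd(sd, reliability):
    """Standard error of measurement from the sample SD and a reliability
    coefficient such as the ICC (conventional formula, assumed here)."""
    return sd * math.sqrt(1.0 - reliability)


def mdc95(sem):
    """Minimal detectable change at the 95% confidence level
    (conventional formula, assumed here)."""
    return 1.96 * math.sqrt(2.0) * sem


# Applying the conventional MDC95 formula to the reported CCS SEM of 1.80 points.
mdc_from_reported_sem = mdc95(1.80)
print(f"MDC95 from SEM = 1.80: {mdc_from_reported_sem:.2f} points")

# Hypothetical SD of 10 points with ICC = 0.97 (illustration only).
example_sem = sem_from_sd(10.0, 0.97)
print(f"SEM for SD = 10, ICC = 0.97: {example_sem:.2f} points")
```

With the reported SEM of 1.80 points this gives roughly 4.99 points, close to the reported MDC95 of 4.98; the small difference may reflect rounding of the SEM before publication.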

References

Alouche, S.R., Molad, R., Demers, M., Levin, M.F. (2021) Development of a Comprehensive Outcome Measure for Motor Coordination; Step 1: Three-Phase Content Validity Process. Neurorehabil Neural Repair. 35(2):185-193. doi: 10.1177/1545968320981955. PMID: 33349134.

Molad, R., Alouche, S.R., Demers, M., Levin, M.F. (2021) Development of a Comprehensive Outcome Measure for Motor Coordination, Step 2: Reliability and Construct Validity in Chronic Stroke Patients. Neurorehabil Neural Repair. 35(2):194-203. doi: 10.1177/1545968320981943. PMID: 33410389.

Molad, R., & Levin, M. F. (2021) Construct validity of the upper-limb Interlimb Coordination Test (ILC2) in stroke. Neurorehabil Neural Repair [epub ahead of print]. doi: 10.1177/1545968321105809. PMID: 34715755

See the measure

The tool is available as supplementary material in:
Alouche SR, Molad R, Demers M, Levin MF. Development of a Comprehensive Outcome Measure for Motor Coordination; Step 1: Three-Phase content validity Process. Neurorehabil Neural Repair. 2021 Feb;35(2):185-193. doi: 10.1177/1545968320981955. [Supplementary materials]

The CCS manual can be accessed on the Open Science Framework website:
Marika Demers, Mindy F Levin, Roni Molad, and Sandra Alouche. 2021. “Comprehensive Coordination Scale.” OSF. July 12. osf.io/8h7nm.

Disabilities of the Arm, Shoulder and Hand (DASH)

Evidence Reviewed as of before: 19-06-2012
Author(s)*: Annabel McDermott, OT
Editor(s): Nicol Korner-Bitensky, PhD OT
Expert Reviewer: Natasha Lannin (Associate Professor, OT)
Content consistency: Gabriel Plumier

Purpose

The Disabilities of the Arm, Shoulder and Hand (DASH) is a self-report questionnaire that measures disability and symptoms of upper limb musculoskeletal disorders.

In-Depth Review

Purpose of the measure

The Disabilities of the Arm, Shoulder and Hand (DASH) is a self-report questionnaire that measures physical function and symptoms of the upper limb. The DASH can be used for any joint and any musculoskeletal condition of the upper limb (Hudak et al., 1996; Veehof et al., 2002), which permits comparison across upper limb diagnoses (Atroshi et al., 2000). The DASH is intended for discriminative and evaluative purposes (Schmitt & Di Fabio, 2004).

The DASH demonstrates validity and responsiveness in proximal and distal upper limb disorders (Beaton et al., 2001). The DASH demonstrated better clinimetric properties than other shoulder disability questionnaires including the Simple Shoulder Test (SST), the American Shoulder and Elbow Surgeons Standardised Shoulder Assessment Form (ASES) and the Shoulder Pain and Disability Index (SPADI – Bot et al., 2004).

Available versions

The DASH was developed by the American Academy of Orthopedic Surgeons, the Council of the Musculoskeletal Specialty Societies, and the Institute for Work and Health as a region-specific instrument to measure patients’ perception of disability and symptoms associated with any joint or condition of the upper limb (Hudak et al., 1996; Veehof et al., 2002).

The third edition of the DASH has been recently published to incorporate the latest research and new information regarding cross-cultural use of the measure.

Features of the measure

Items:

The DASH consists of 30 items that measure: (a) physical function (21 items); (b) symptom severity (5 items); and (c) social or role function (4 items).

Ability to do the following activities:

  1. Open a tight or new jar
  2. Write
  3. Turn a key
  4. Prepare a meal
  5. Push open a heavy door
  6. Place an object on a shelf above your head
  7. Do heavy household chores (e.g. wash walls, wash floors)
  8. Garden or do yard work
  9. Make a bed
  10. Carry a shopping bag or briefcase
  11. Carry a heavy object (over 5kg)
  12. Change a light bulb overhead
  13. Wash or blow dry your hair
  14. Wash your back
  15. Put on a pullover sweater
  16. Use a knife to cut food
  17. Recreational activities that require little effort (e.g. card playing, knitting)
  18. Recreational activities that require taking some force or impact through the arm, shoulder or hand (e.g. golf, hammering, tennis)
  19. Recreational activities that require you to move the arm freely (Frisbee, badminton)
  20. Managing transportation needs (getting from one place to another)
  21. Sexual activities
  22. Extent to which arm, shoulder or hand problems interfered with normal social activities with family, friends, neighbours or groups
  23. Extent to which arm, shoulder or hand problems limited work or other regular daily activities

Severity of the following symptoms:

  24. Arm, shoulder or hand pain
  25. Arm, shoulder or hand pain when performing activities
  26. Tingling
  27. Weakness
  28. Stiffness
  29. Difficulty in sleeping
  30. Impact on self-image

The DASH also includes two optional modules regarding work and sports/performing arts that investigate the individual’s difficulty:

  1. Using the usual technique for the activity (work; sport/instrument)
  2. Performing the activity due to arm, shoulder or hand pain
  3. Performing the activity as well as he/she would like
  4. Spending the usual amount of time on the activity

Scoring:

The most recent version of the DASH uses a 5-point Likert scale that rates the individual’s difficulties over the preceding week. Lower scores indicate no difficulty, limitations or symptoms, whereas higher scores indicate inability to perform tasks or extreme difficulties or symptomatology.

Items 1 – 21
  • 1 = no difficulty
  • 2 = mild difficulty
  • 3 = moderate difficulty
  • 4 = severe difficulty
  • 5 = unable
Item 22
  • 1 = not at all
  • 2 = slightly
  • 3 = moderately
  • 4 = quite a bit
  • 5 = extremely
Item 23
  • 1 = not limited at all
  • 2 = slightly limited
  • 3 = moderately limited
  • 4 = very limited
  • 5 = unable
Items 24 – 28
  • 1 = none
  • 2 = mild
  • 3 = moderate
  • 4 = severe
  • 5 = extreme
Optional work and sports/performing arts modules:
  • 1 = no difficulty
  • 2 = mild difficulty
  • 3 = moderate difficulty
  • 4 = severe difficulty
  • 5 = unable

The DASH total score is calculated as a percentage (0=no disability to 100=maximal disability), using the following calculation:

[(Sum of completed responses ÷ number of completed responses) – 1] x 25

The final score for each optional module is calculated as follows:

[(Sum of completed responses ÷ 4) – 1] x 25

Note: A DASH total score cannot be calculated if more than 3 items have not been answered. Total scores for the additional modules cannot be calculated if there are any missing items.

Where 3 or fewer items have been missed, missing responses are replaced by the mean value of the responses to other items before summing.

Please note that earlier versions of the DASH use a different scoring system.
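The scoring rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not an official scoring tool: the function names `dash_total_score` and `dash_module_score` are hypothetical, and unanswered items are assumed to be recorded as `None`. Because missing responses are replaced by the mean of the completed responses, dividing the sum of completed responses by the number of completed responses gives the same result as imputing first.

```python
from typing import Optional, Sequence


def dash_total_score(responses: Sequence[Optional[int]]) -> Optional[float]:
    """Return the DASH total score (0-100), or None if more than 3 items are missing.

    `responses` holds the 30 item ratings (1-5); None marks an unanswered item
    (an assumed convention for this sketch).
    """
    if len(responses) != 30:
        raise ValueError("the DASH has 30 items")
    completed = [r for r in responses if r is not None]
    if any(r not in (1, 2, 3, 4, 5) for r in completed):
        raise ValueError("each response must be a rating from 1 to 5")
    if len(responses) - len(completed) > 3:
        return None  # total score cannot be calculated
    # [(sum of completed responses / number of completed responses) - 1] x 25
    return (sum(completed) / len(completed) - 1) * 25


def dash_module_score(responses: Sequence[Optional[int]]) -> Optional[float]:
    """Return an optional-module score (0-100), or None if any item is missing."""
    if len(responses) != 4:
        raise ValueError("each optional module has 4 items")
    if any(r is None for r in responses):
        return None  # module scores require all 4 items
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be a rating from 1 to 5")
    # [(sum of responses / 4) - 1] x 25
    return (sum(responses) / 4 - 1) * 25
```

For example, a respondent who rates every answered item as 3 (moderate) and skips three items would receive a total score of 50 under these rules.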

What to consider before beginning:

A study by Ring et al. (2006) showed a strong correlation between the DASH and measures of depression (Center for Epidemiologic Studies – Depression) and anxiety (Pain Anxiety Symptoms Scale) in a sample of 235 patients with discrete hand problems (e.g. carpal tunnel syndrome, de Quervain tenosynovitis, lateral elbow pain, trigger finger, distal radial fracture). Subsequently, Lozano Calderon et al. (2010) conducted a study with 516 patients requiring hand surgery and adjusted DASH scores for the influence of depression. This resulted in a significant decrease in the mean and standard deviation of DASH scores, although the decrease in variation was small. There was a high correlation between DASH and depression-adjusted DASH scores, indicating no notable benefit to adjusting DASH scores for depression. Given the high incidence of depression among patients with stroke, the correlation between disability and depression should be considered when using the DASH.

Time:

The DASH takes approximately 5 minutes to administer with patients with musculoskeletal disorders (Bot et al., 2004). Administration with patients with stroke may require more time and support materials.

Training requirements:

No specific training requirements are specified.

Equipment:

No specific equipment is required.

Alternative Forms of the Measure

The QuickDASH is an 11-item questionnaire that was developed from the DASH using a ‘concept-retention’ approach (Beaton et al., 2005). The QuickDASH comprises the following items:

  1. Open a tight or new jar
  2. Do heavy household chores (e.g. wash walls, wash floors)
  3. Carry a shopping bag or briefcase
  4. Wash your back
  5. Use a knife to cut food
  6. Recreational activities that require taking some force or impact through the arm, shoulder or hand (e.g. golf, hammering, tennis)
  7. Extent to which arm, shoulder or hand problems interfered with normal social activities with family, friends, neighbours or groups
  8. Extent to which arm, shoulder or hand problems limited work or other regular daily activities
  9. Arm, shoulder or hand pain
  10. Tingling
  11. Difficulty in sleeping

The QuickDASH also retains the optional work and sports/performing arts modules (Beaton et al., 2005).

Like the DASH, the QuickDASH uses a 5-point Likert rating scale and the total score is calculated as a percentage (0=no disability – 100=most severe disability). At least 10 of the 11 items must be completed for correct use. The QuickDASH demonstrates similar test-retest reliability, validity and responsiveness to the DASH and may demonstrate better precision in detecting different degrees of disability than the DASH. Although there is a high correlation between the QuickDASH and the DASH, an exact match between the numeric scores of the two assessments is not guaranteed (Beaton et al., 2005). Due to the smaller number of items, the QuickDASH is considered to be more efficient than the DASH (Beaton et al., 2005; Gummesson et al., 2006). However, the DASH is more suitable than the QuickDASH for use when monitoring arm pain and function over time in individual patients.
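Because every QuickDASH item also appears in the full DASH, a QuickDASH score can in principle be derived from a completed DASH. The Python sketch below illustrates this under several labeled assumptions: the item mapping is inferred from the item lists in this review (with the symptom items taken as DASH items 24 to 30), the QuickDASH is assumed to use the same percentage formula as the DASH, and the name `quickdash_from_dash` is hypothetical. As noted above, an exact match with a separately administered QuickDASH is not guaranteed.

```python
from typing import Mapping, Optional

# DASH item numbers corresponding to the 11 QuickDASH items
# (assumed mapping, inferred from the item lists in this review).
QUICKDASH_FROM_DASH = (1, 7, 10, 14, 16, 18, 22, 23, 24, 26, 29)


def quickdash_from_dash(dash: Mapping[int, Optional[int]]) -> Optional[float]:
    """Return a QuickDASH-style score (0-100) from DASH responses keyed by item number.

    Returns None when fewer than 10 of the 11 items were answered.
    """
    picked = [dash.get(item) for item in QUICKDASH_FROM_DASH]
    completed = [r for r in picked if r is not None]
    if len(completed) < 10:
        return None  # at least 10 of the 11 items are required
    # Assumed to mirror the DASH formula: (mean response - 1) x 25
    return (sum(completed) / len(completed) - 1) * 25
```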

Client suitability

Can be used with:

  • Individuals with upper limb musculoskeletal impairment.
  • Due to limited research regarding patient acceptability, the DASH may be more suitable for patients with mild impairment.

Should not be used with:

  • N/A

Languages of the measure

Approved translations have been made in the following languages:

  • Afrikaans
  • Arabic
  • Armenian
  • Chinese (Hong Kong)
  • Chinese (Taiwan)
  • Czech
  • Danish
  • Dutch
  • English (Australia)
  • English (Hong Kong)
  • English (South Africa)
  • Finnish
  • French Canadian
  • French
  • German
  • Greek
  • Hebrew
  • Hungarian
  • Italian
  • Japanese
  • Korean
  • Lithuanian
  • Malay
  • Norwegian
  • Persian (Iran)
  • Polish
  • Portuguese (Brazil)
  • Portuguese (Portugal)
  • Romanian
  • Russian
  • Serbian
  • Sinhala (Sri Lanka)
  • Spanish (Argentina)
  • Spanish (Puerto Rico)
  • Spanish (Spain)
  • Swedish
  • Thai
  • Turkish

Translations are also in progress for the following languages:

  • Croatian
  • Estonian
  • Filipino
  • isiXhosa
  • Latvian
  • Malayalam
  • Slovak
  • Spanish (Chile)
  • Spanish (Dominican Republic)
  • Ukrainian

Summary

What does the tool measure? Upper extremity disability and pain.
What types of clients can the tool be used for? Individuals with musculoskeletal disorders of the upper limb.
Is this a screening or assessment tool? Assessment
Time to administer Five minutes.
Versions
  • DASH
  • QuickDASH
Other Languages Afrikaans, Arabic, Armenian, Chinese (Hong Kong), Chinese (Taiwan), Czech, Danish, Dutch, English (Australia), English (Hong Kong), English (South Africa), Finnish, French Canadian, French, German, Greek, Hebrew, Hungarian, Italian, Japanese, Korean, Lithuanian, Malay, Norwegian, Persian (Iran), Polish, Portuguese (Brazil), Portuguese (Portugal), Romanian, Russian, Serbian, Sinhala (Sri Lanka), Spanish (Argentina), Spanish (Puerto Rico), Spanish (Spain), Swedish, Thai, Turkish.
Measurement Properties
Reliability Internal consistency:
No studies have reported on the internal consistency of the DASH among patients with stroke.

Test-retest:
No studies have reported on the test-retest reliability of the DASH among patients with stroke.

Intra-rater:
No studies have reported on the intra-rater reliability of the DASH among patients with stroke.

Inter-rater:
No studies have reported on the inter-rater reliability of the DASH among patients with stroke.

Validity Content:
The DASH was developed by item generation (clinical expert input, literature review and patient focus groups) and item reduction (expert review, and psychometric and clinimetric analysis).

One study that examined the content validity of the DASH in a sample of patients with stroke suggested a disordered rating scale structure and item hierarchy that is not suitable for clinical use.

Criterion:
Concurrent:
No studies have reported on the concurrent validity of the DASH among patients with stroke.

Predictive:
No studies have reported on the predictive validity of the DASH among patients with stroke.

Construct:
Convergent/Discriminant:
One study reported moderate correlations between manual ability and pain.

Known Groups:
No studies have reported on the known-groups validity of the DASH among patients with stroke.

Floor/Ceiling Effects No studies have reported on the floor/ceiling effects of the DASH among patients with stroke.
Does the tool detect change in patients? No studies have reported on the responsiveness among patients with stroke.
Acceptability The DASH is simple to comprehend, quick to complete and is comprised of real-life, non-gender specific items. Due to limited research regarding patient acceptance, this tool may be more suitable for patients with mild impairment.
Feasibility The DASH is a versatile measure that can be used for clinical or research purposes. However, there is insufficient research regarding use of the DASH with patients with stroke; without further testing, its clinical utility in this population remains unknown.
How to obtain the tool? Visit the DASH website for more information: https://dash.iwh.on.ca/

Psychometric Properties

Overview

A literature search was conducted to identify all relevant publications on the psychometric properties of the DASH. While numerous studies have been conducted with other patient groups, this review specifically addresses the psychometric properties relevant to patients with stroke. At the time of publication there was 1 conference paper but no published studies specific to patients with stroke.

Floor/Ceiling Effects

No studies have reported on the floor/ceiling effects of the DASH in a sample of patients with stroke. The DASH demonstrates no floor or ceiling effects in patients with shoulder and combined shoulder-upper limb problems (Bot et al., 2004).

Reliability

Internal consistency:
No studies have examined internal consistency of the DASH in a sample of patients with stroke, although studies conducted among patient groups with other upper limb conditions indicate excellent internal consistency (see: Atroshi et al., 2000; Bot et al., 2004; Veehof et al., 2002). However, such high internal consistency may indicate item redundancy (Beaton et al., 2005).

Test-retest:
No studies have examined test-retest reliability of the DASH in a sample of patients with stroke, although studies conducted among patient groups with other upper limb conditions indicate excellent test-retest reliability (see: Atroshi et al., 2000; Bot et al., 2004; Beaton et al., 2001).

Intra-rater:
No studies have examined intra-rater reliability of the DASH in a sample of patients with stroke.

Inter-rater:
No studies have examined inter-rater reliability of the DASH in a sample of patients with stroke.

Validity

Content:

The DASH was developed in two stages of item generation and item reduction. The first stage of item generation involved clinical expert input, review of 13 relevant outcome measurement scales and patient focus groups to identify possible items. The second stage of item reduction involved preliminary item review by three content experts, secondary review by a panel of 15 experts for content/face validity and item importance, and subsequent pre-testing on 20 individuals with upper extremity difficulties. Further item reduction was conducted by psychometric and clinimetric analysis among patients with upper limb conditions, including (i) field-testing in a cross-sectional study of 407 patients with various upper limb problems, and (ii) importance- and difficulty- rating in a second sample of 76 patients. This resulted in the 30-item questionnaire (Hudak et al., 1996; Marx et al., 1999).

Lannin et al. (2010) examined the content validity of the DASH in a sample of 157 patients with stroke. Analysis of the original rating scale revealed a disordered structure; Rasch measurement modeling was used to transform ordinal ratings into a collapsed linear measure, which then conformed to the expectations of the model. The study also found that the hierarchy of the original 30 items is not appropriate for clinical use, as there are few items suitable for the most disabled patients.

Franchignoni et al. (2010) investigated the dimensionality, rating scale diagnostics and model fit of the DASH (Italian version) in a sample of 238 patients with upper extremity disorders (excluding stroke). The authors noted that some items do not rely exclusively on upper limb function (e.g. item 9: Make a bed; item 20: Managing transportation needs), and that items measure different ICF constructs (impairment, activity limitation and participation restriction). The authors found that patients were not able to reliably use the 5-level rating scale. Factor analysis revealed 3 underlying constructs: (i) manual functioning (items 1-5, 7-11, 16-18, 20, 21); (ii) shoulder range of motion (items 6, 12-15, 19); and (iii) symptoms and consequences (items 22-30). Two items (Tingling, Sexual activities) showed misfit on Rasch analysis. While results from this study identify issues to consider when using the DASH, it is important to note that patients with stroke were excluded from the sample population.

Criterion:

Concurrent:
No studies have reported on the concurrent validity of the DASH in a sample of patients with stroke.

Predictive:
No studies have reported on the predictive validity of the DASH in a sample of patients with stroke.

Construct:

Convergent/Discriminant:
Lannin et al. (2010) compared the DASH with a self-report questionnaire of upper limb function and an observational upper limb movement assessment in 90 patients with stroke. The authors reported moderate correlations between manual ability and pain (statistical data not provided).

While no other studies have examined construct validity of the DASH in a sample of patients with stroke, numerous studies conducted among patient groups with other upper limb conditions report adequate to excellent correlations with constructs of function and pain (see: Atroshi et al., 2000; Beaton et al., 2001; Bot et al., 2004; Kirkley et al., 1998; Schmitt & Di Fabio, 2004; SooHoo et al., 2002; Turchin et al., 1998).

Known Groups:
No studies have examined known-group validity of the DASH in a sample of patients with stroke, although studies have been conducted among patient groups with other upper limb conditions (see: Beaton et al., 2001).

Responsiveness

No studies have examined responsiveness of the DASH in a sample of patients with stroke, although studies have been conducted among patient groups with other upper limb conditions (see: Beaton et al., 2001; Bot et al., 2004; MacDermid & Tottenham, 2004; Schmitt & Di Fabio, 2004).

Sensitivity & Specificity:
No studies have examined the sensitivity and specificity of the DASH in a sample of patients with stroke, although studies have been conducted among patient groups with other upper limb conditions (see: Beaton et al., 2001).

References

  • Atroshi, I., Gummesson, C., Andersson, B., Dahlgren, E. & Johansson, A. (2000). The disabilities of the arm, shoulder and hand (DASH) outcome questionnaire: reliability and validity of the Swedish version evaluated in 176 patients. Acta Orthopaedica Scandinavica, 71(6), 613-8.
  • Beaton, D.E., Katz, J.N., Fossel, A.H., Wright, J.G., Tarasuk, V., & Bombardier, C. (2001). Measuring the whole or the parts? Validity, reliability, and responsiveness of the Disabilities of the Arm, Shoulder and Hand outcome measure in different regions of the upper extremity. Journal of Hand Therapy, 14, 128-46.
  • Beaton, D.E., Wright, J.G., Katz, J.N., and the Upper Extremity Collaborative Group. (2005). Development of the QuickDASH: comparison of three item-reduction approaches. The Journal of Bone and Joint Surgery, 87-A(5), 1038-46.
  • Bot, S.D.M., Terwee, C.B., van der Windt, D.A.W.M., Bouter, L.M., Dekker, J., & de Vet, H.C.W. (2004). Clinimetric evaluation of shoulder disability questionnaires: a systematic review of the literature. Annals of the Rheumatic Diseases, 63, 335-41.
  • Franchignoni, F., Giordano, A., Sartorio, F., Vercelli, S., Pascariello, B., & Ferriero, G. (2010). Suggestions for refinement of the Disabilities of the Arm, Shoulder and Hand outcome measure (DASH): a factor analysis and Rasch validation study. Archives of Physical Medicine and Rehabilitation, 91, 1370-7.
  • Gummesson, C., Ward, M.M., & Atroshi, I. (2006). The shortened disabilities of the arm, shoulder and hand questionnaire (QuickDASH): validity and reliability based on responses within the full-length DASH. BMC Musculoskeletal Disorders, 7(44). doi:10.1186/1471-2474-7-44.
  • Hudak, P.L., Amadio, P.C., Bombardier, C., and the Upper Extremity Collaborative Group. (1996). Development of an upper extremity outcome measure: the DASH (Disabilities of the Arm, Shoulder, and Hand). American Journal of Industrial Medicine, 29, 602-8.
  • Kirkley, A., Griffin, S., McLintock, H., & Ng, L. (1998). The development and evaluation of a disease-specific quality of life measurement tool for shoulder instability: The Western Ontario Shoulder Instability Index (WOSI). The American Journal of Sports Medicine, 26(6), 764-72.
  • Lannin, N., McCluskey, A., Cusick, A., Ashford, S., & Ross, L. (2010). Measuring function in everyday life: enhancing the Disabilities of the Arm Shoulder Hand questionnaire for use post-stroke. World Federation of Occupational Therapy, Santiago, Chile, May.
  • Lozano Calderon, S.A., Zurakowski, D., Davis, J.S., & Ring, D. (2010). Quantitative adjustment of the influence of depression on the Disabilities of the Arm, Shoulder and Hand (DASH) questionnaire. Hand, 5, 49-55.
  • MacDermid, J.C. & Tottenham, V. (2004). Responsiveness of the Disabilities of the Arm, Shoulder and Hand (DASH) and patient-rated wrist/hand evaluation (PRWHE) in evaluating change after hand therapy. Journal of Hand Therapy, 17, 18-23.
  • Marx, R.G., Bombardier, C., Hogg-Johnson, S., & Wright, J.G. (1999). Clinimetric and psychometric strategies for development of a health measurement scale. Journal of Clinical Epidemiology, 52(2) 105-11.
  • Ring, D., Kadzielski, J., Fabien, L., Zurakowski, D., Malhotra, L.R., & Jupiter, J.B. (2006). Self-reported upper extremity health status correlates with depression. The Journal of Bone and Joint Surgery, 88-A(9), 1983-8.
  • Schmitt, J.S. & Di Fabio, R. (2004). Reliable change and minimum important difference (MID) proportions facilitated group responsiveness comparisons using individual threshold criteria. Journal of Clinical Epidemiology, 57, 1008-18.
  • SooHoo, N.F., McDonald, A.P., Seiler, J.G., & McGillivary, G.R. (2002). Evaluation of construct validity of the DASH questionnaire by correlation to the SF-36. Journal of Hand Surgery, 27A, 537-41.
  • Turchin, D.C., Beaton, D.E. & Richards, R.R. (1998). Validity of observer-based aggregate scoring systems as descriptors of elbow pain, function and disability. The Journal of Bone and Joint Surgery, 80A(2), 154-62.
  • Veehof, M.M., Sleegers, E.J.A., van Veldhoven, N.H.M.J., Schuurman, A.H., & van Meeteren, N.L.U. (2002). Psychometric qualities of the Dutch language version of the Disabilities of the Arm, Shoulder, and Hand questionnaire (DASH-DLV). Journal of Hand Therapy, 15, 347-54.

See the measure

How to obtain the DASH?

You can obtain a copy of the DASH through https://dash.iwh.on.ca/

Frenchay Arm Test (FAT)

Evidence Reviewed as of before: 17-09-2012
Author(s)*: Katie Marvin, MPT
Editor(s): Annabel McDermott, OT

Purpose

The Frenchay Arm Test (FAT) is a measure of upper extremity proximal motor control and dexterity during ADL performance in patients with impairments resulting from neurological conditions. The FAT is an upper extremity specific measure of activity limitation.

In-Depth Review

Purpose of the measure

The Frenchay Arm Test (FAT) is a measure of upper extremity proximal motor control and dexterity during ADL performance in patients with impairments of the upper extremity resulting from neurological conditions. The FAT is an upper extremity specific measure of activity limitation.

Available versions

None typically reported.

Features of the measure

Description of tasks:

Clients sit comfortably at a table with hands on their lap; each test item starts from this position. Clients are then asked to use their affected arm to:

  • Stabilize a ruler, while drawing a line with a pencil held in the other hand. To pass, the ruler must be held firmly.
  • Grasp a cylinder (12 mm diameter, 5 cm long), set on its side approximately 15 cm from the table edge, lift it about 30 cm and replace it without dropping.
  • Pick up a glass, half full of water positioned about 15 to 30 cm from the edge of the table, drink some water and replace without spilling.
  • Remove and replace a sprung clothes peg from a 10 mm diameter dowel, 15 cm long, set in a 10 cm base, 15 to 30 cm from the table edge. To pass, the client must not drop the peg or knock the dowel over.
  • Comb hair (or imitate); must comb across top, down the back and down each side of head.

What to consider before beginning:

  • Before administering the FAT, the clinician should ensure that the client is able to comprehend either written or spoken language.
  • The FAT has been criticized for lacking assessment of quality of movement and performance (Kopp, 1997). In addition, clients were found to either pass or fail all or most subtests, indicating that the FAT may not be sensitive to change or subtleties in progress (Hsieh, Hsueh, Chiang & Lin, 1998), especially in clients performing in the upper range of arm function (Wade et al., 1983).

Scoring and Score Interpretation:

Each item is scored as either pass (=1) or fail (=0). Total scores range from 0 to 5.
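As an illustration of the pass/fail scoring described above, the following Python sketch totals the five items. The item keys and the function name `fat_total_score` are hypothetical labels chosen for this example; they are not part of the published test.

```python
from typing import Mapping

# Short labels for the five FAT tasks (hypothetical names for this sketch).
FAT_ITEMS = ("ruler", "cylinder", "glass", "peg", "comb")


def fat_total_score(results: Mapping[str, bool]) -> int:
    """Return the FAT total (0-5): one point per item passed, zero per item failed."""
    missing = [item for item in FAT_ITEMS if item not in results]
    if missing:
        raise ValueError(f"no result recorded for: {', '.join(missing)}")
    return sum(1 for item in FAT_ITEMS if results[item])
```

A client who passes the ruler, cylinder and glass tasks but fails the peg and comb tasks would score 3/5.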

Time:

The FAT takes approximately 3 minutes to administer.

Training requirements:

None typically reported; however, familiarity with the measure is recommended.

Equipment:

  • Ruler
  • Pencil
  • Paper
  • Cylinder (12mm diameter, 5 cm long)
  • Glass (Half filled with water)
  • Clothes peg
  • Dowel (10 mm diameter, 15 cm long)
  • Hair comb

Alternative Forms of the FAT

None typically reported.

Client suitability

Can be used with:

  • Clients with stroke

Should not be used in:

  • Clients with difficulty understanding written and spoken language

Languages of the measure

  • English
  • French
  • Dutch

Summary

What does the tool measure? The FAT measures upper extremity proximal control and dexterity during performance of functional tasks.
What types of clients can the tool be used for? The FAT can be used with, but is not limited to, clients with stroke.
Is this a screening or assessment tool? Assessment
Time to administer The FAT takes approximately 3 minutes to administer.
Versions There are no alternative versions of the FAT.
Other Languages French and Dutch
Measurement Properties
Reliability Intra-rater:
One study examined the intra-rater reliability of the FAT in clients with stroke and found adequate to excellent intra-rater reliability.

Inter-rater:
One study examined the inter-rater reliability of the FAT in clients with stroke and found excellent inter-rater reliability.

Validity Sensitivity/ Specificity:
Two studies compared the sensitivity of the FAT with that of the Nine-Hole Peg Test (NHPT) and found the NHPT to be more sensitive than the FAT for detecting impaired upper extremity function in clients with stroke.
Floor/Ceiling Effects No studies have examined the floor/ceiling effects of the FAT in clients with stroke.
Does the tool detect change in patients? No studies have investigated the responsiveness of the FAT in clients with stroke.
Acceptability

The FAT has been criticized for lacking assessment of quality of movement and performance (Kopp, 1997). In addition, clients were found to either pass or fail all or most subtests, indicating that the FAT may not be sensitive to change (Hsieh, Hsueh, Chiang & Lin, 1998).

The FAT is quick to complete and should not produce any undue fatigue for patients.

Feasibility The FAT is short and easy to administer and score.
How to obtain the tool? For more information on the FAT, please visit the article by Parker, Wade & Langton Hewer (1986).

Psychometric Properties

Overview

A literature search was conducted to identify all relevant publications on the psychometric properties of the Frenchay Arm Test (FAT) in clients with stroke. Two studies were found and have been reviewed in this module. More studies are required before definitive conclusions can be drawn regarding the reliability and validity of the FAT.

Floor/Ceiling Effects

No studies have examined the floor/ceiling effects of the FAT in clients with stroke.

Reliability

Internal consistency:
No studies have examined the internal consistency of the FAT in clients with stroke.

Intra-rater:
Heller, Wade, Wood, Sunderland, Langton Hewer, and Ward (1987) examined the intra-rater reliability of the FAT, Nine-Hole Peg Test (NHPT), Finger Tapping Rate (Lezak, 1983), and Grip Strength (Mathiowetz, Kashman, Volland, Weber, Dowe, & Rogers, 1985) in 10 patients with subacute stroke. Participants were re-assessed by the same rater after a 2-week interval. The study reported only the range of reliability across the four measures; values for each individual measure were not provided. Spearman rho correlation coefficients were adequate to excellent (ranging for all four measures from r = 0.68 to 0.99).
Note: Although it is not possible to discern the exact value for the FAT reliability, all values were considered adequate to excellent and statistically significant, suggesting that the FAT may be reliable with stable stroke clients.

Inter-rater:
Heller et al. (1987) examined the inter-rater reliability of the FAT, Nine-Hole Peg Test (NHPT), Finger Tapping Rate (Lezak, 1983), and Grip Strength (Mathiowetz et al., 1985) in 10 patients with subacute stroke. Participants were assessed twice within a week by two raters. Spearman rho correlation coefficients were excellent (ranging for all four measures from r = 0.75 to 0.99).
Note: In this study, individual values for each measure were not provided. Although it is not possible to discern the exact value for the FAT reliability, all values were considered excellent.

Test-retest:
No studies have examined the test-retest reliability of the FAT in clients with stroke.

Validity

Content:

No studies have examined the content validity of the FAT in clients with stroke.

Criterion:

Concurrent:
No studies have examined the concurrent validity of the FAT in clients with stroke.

Predictive:
No studies have examined the predictive validity of the FAT in clients with stroke.

Construct:

Convergent/Discriminant:
No studies have examined the convergent or discriminant validity of the FAT in clients with stroke.

Known Groups:
No studies have examined the known groups validity of the FAT in clients with stroke.

Sensitivity/specificity:
Heller et al. (1987) investigated the specificity of the FAT and the Nine-Hole Peg Test (NHPT) in 56 clients with chronic stroke. All of the clients who scored less than 5/5 on the FAT were correctly identified as having impaired dexterity, as defined by the normal cut-off scores for the NHPT. However, 48 percent of patients who scored 5/5 on the FAT scored in the below-normal range on the NHPT. These results indicate that the NHPT is more sensitive than the FAT for detecting impaired upper extremity function in clients with stroke.

Parker, Wade & Langton Hewer (1986) compared the specificity of the FAT and the Nine-Hole Peg Test (NHPT) in 187 clients with sub-acute stroke. Participants who were able to successfully place nine pegs in the pegboard were further categorized according to those who completed the NHPT in less than 19 seconds (n=37) and those who required over 19 seconds (n=69). For the FAT, 114 participants scored 5/5, 33 participants scored in the middle range (1/5 – 4/5) and 36 participants scored 0/5. Researchers concluded that the NHPT is more sensitive than the FAT because 13 percent of participants who scored perfectly on the FAT placed fewer than 9 pegs on the NHPT, and all participants who scored perfectly on the NHPT (9 pegs placed in less than 19 seconds) also scored 5/5 on the FAT.

Responsiveness

No studies have examined the responsiveness of the FAT in clients with stroke.

References

  • Heller, A., Wade, D.T., Wood, V.A., Sunderland, A., Langton Hewer, R., & Ward, E. (1987). Arm function after stroke: Measurement and recovery over the first three months. Journal of Neurology, Neurosurgery, and Psychiatry, 50, 714-719.
  • Hsieh, C-L., Hsueh, I-P., Chiang, F-M., & Lin, P-H. (1998). Inter-rater reliability and validity of the Action Research Arm Test in stroke patients. Age and Ageing, 27, 107-113.
  • Parker, V.M., Wade, D.T., & Langton Hewer, R. (1986). Loss of arm function after stroke: Measurement, frequency, and recovery. International Rehabilitative Medicine, 8, 69-73.
  • Wade, D.T., Langton-Hewer, R., Wood, V.A., Skilbeck, C.E., & Ismail, H.M. (1983). The hemiplegic arm after stroke: Measurement and recovery. Journal of Neurology, Neurosurgery and Psychiatry, 46, 521-524.

See the measure

For more information on the FAT, please review the article by Parker, Wade & Langton Hewer (1986).

Jebsen Hand Function Test (JHFT)

Evidence Reviewed as of before: 17-09-2012
Author(s)*: Jennifer Vissers
Editor(s): Annabel McDermott, OT; Nicol Korner-Bitensky, PhD OT

Purpose

The Jebsen Hand Function Test (JHFT) assesses fine motor skills and weighted and non-weighted hand function during performance of activities of daily living.

In-Depth Review

Purpose of the measure

The Jebsen Hand Function Test (JHFT) is a standardized evaluative measure of functional hand motor skills (Hummel et al., 2005).

Available versions

The JHFT was developed in 1969 by Jebsen, Taylor, Trieschmann, Trotter, and Howard (Cook, McCluskey, & Bowman, 2006). The JHFT is also referred to as the Jebsen-Taylor Hand Function Test or the Jebsen-Taylor Test of Hand Function.

A 3-item version (Modified Jebsen Hand Function Test, MJT) was developed by Bovend’Eerdt et al. (2004) to measure gross functional dexterity in patients with moderate unilateral or bilateral upper limb impairment.

An 8-item Australian version was developed by Agnew and Maas (1982). It consists of the original 7 items with the addition of a grip strength item, measured using the Jamar dynamometer (Cook, McCluskey, & Bowman, 2006).

Features of the measure

Items:

The JHFT consists of 7 items that measure: (a) fine motor skills; (b) weighted functional tasks; and (c) non-weighted functional tasks (Jebsen et al., 1969):

  • Writing a short sentence (24 letters, 3rd grade reading difficulty)
  • Turning over a 3×5 inch card
  • Picking up small common objects
  • Simulated feeding
  • Stacking checkers
  • Picking up large light cans
  • Picking up large heavy cans

Administration guidelines specify that testing begin with the non-dominant hand (Jebsen et al., 1969). Further details about the administration procedures of the JHFT can be found in the original article by Jebsen et al. (1969).

Items of the Modified Jebsen Hand Function Test (MJT) (Bovend’Eerdt et al., 2004):

  • Turning over 5 cards
  • Stacking 4 cones
  • Spooning 5 kidney beans into a bowl (simulated feeding)

Scoring:

Each item is scored according to time taken to complete the task. Times are rounded to the nearest second (Spinal Cord Injury Rehabilitation Evidence, 2010). The scores for all 7 items are then summed for a total score. Jebsen et al. (1969) established norms with a sample of 300 healthy subjects of different age groups (20-29 years, 30-39 years, 40-49 years, 50-59 years, 60-94 years). With the exception of writing, all items took under 10 seconds to perform. See Jebsen et al. (1969) for norms according to age, gender and hand use (dominant/non-dominant).
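The scoring rule above can be sketched in code. This is a minimal illustration, assuming one recorded time per item for a single hand. The function name `jhft_total`, the item labels and the example times are hypothetical, and rounding halves upward is an assumption, since the source only states that times are rounded to the nearest second.

```python
# Illustrative sketch only: labels mirror the 7 JHFT items listed above.
JHFT_ITEMS = [
    "writing",
    "card turning",
    "small common objects",
    "simulated feeding",
    "stacking checkers",
    "large light cans",
    "large heavy cans",
]


def jhft_total(times_seconds):
    """Return the total JHFT score in seconds for one hand.

    times_seconds maps each item label to the time taken (seconds).
    Each time is rounded to the nearest second, then all 7 are summed.
    """
    missing = [item for item in JHFT_ITEMS if item not in times_seconds]
    if missing:
        raise ValueError(f"missing item times: {missing}")
    total = 0
    for item in JHFT_ITEMS:
        t = times_seconds[item]
        if t < 0:
            raise ValueError(f"negative time for item: {item}")
        # Assumption: halves round up (Python's round() rounds halves to even).
        total += int(t + 0.5)
    return total


# Hypothetical times for one hand, not taken from any study.
example_times = {
    "writing": 12.4,
    "card turning": 4.6,
    "small common objects": 6.2,
    "simulated feeding": 7.5,
    "stacking checkers": 3.9,
    "large light cans": 3.2,
    "large heavy cans": 3.4,
}
print(jhft_total(example_times))  # 41
```

A lower total indicates faster performance. Because norms are reported by age, gender and hand, a sketch like this would be run separately for the dominant and non-dominant hand.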

What to consider before beginning:

It is necessary to identify the patient’s dominant hand before beginning the JHFT. When working with patients with stroke it is recommended to take into consideration the area(s) of cortical insult, as damage to areas of the brain responsible for speech and language function may affect performance on the writing task (Celnik et al., 2007). Prior to beginning the writing task, individuals should be reminded to use reading glasses if necessary (Jebsen et al., 1969).

Time:

The JHFT requires 15 – 45 minutes to complete.

Training requirements:

No specific training is required.

Equipment:

The JHFT does not require standardized equipment but the following equipment is used (Jebsen et al., 1969):

  • wooden board (41 1/2 inches long x 11 1/4 inches wide x 3/4 inch thick)
  • ball point pen
  • 8×11 inch sheets unruled paper
  • 5×8 inch index cards
  • 3×5 inch index cards
  • 1 pound coffee can
  • 1 inch paper clips
  • teaspoon
  • 5 kidney beans
  • standard size wooden checkers
  • 5 empty 303 cans
  • 5 full (1 pound) 303 cans.

Test equipment can be collated by the clinician or purchased as pre-packaged assessment kits from commercial suppliers.

Client suitability

Can be used with:

  • Clients with neurological or musculoskeletal conditions, e.g. stroke, spinal cord injury, arthritis (Cook, McCluskey, & Bowman, 2006).
  • This assessment has been administered in clients over 8 years of age (Cook, McCluskey, & Bowman, 2006).

Should not be used with:

  • Individuals with speech and language disorders may have difficulty understanding instructions.
  • The writing task can be excluded for individuals with speech and language difficulties due to dominant cerebral hemisphere stroke (Beebe & Lang, 2009, 2007; Hummel et al., 2005).

Languages of the measure

  • English
  • Portuguese (Ferreiro, dos Santos, & Conforto, 2010)

Summary

What does the tool measure? Hand function
What types of clients can the tool be used for? The JHFT can be used with, but is not limited to, clients with stroke.
Is this a screening or assessment tool? Assessment
Time to administer 15-45 minutes
Versions
  • JHFT
  • Modified Jebsen Hand Function Test (MJT)
  • JHFT Australian version, Portuguese version
Other Languages English, Portuguese
Measurement Properties
Reliability Internal consistency:
One study reported excellent internal consistency of the JHFT (Portuguese version), and adequate to excellent internal consistency of individual items.

Test-retest:
One study reported adequate to excellent test-retest reliability of JHFT individual items.

One study reported excellent test-retest reliability of the MJT.

Intra-rater:
One study reported excellent intra-rater reliability of the JHFT (Portuguese version).

Inter-rater:
One study reported excellent inter-rater reliability of the JHFT (Portuguese version) and individual items.

Validity Content:
No studies have examined the content validity of the JHFT.

Criterion:
Concurrent:
Two studies reported excellent correlation between the JHFT and grip strength, pinch strength, Action Research Arm Test, Nine Hole Peg Test, and Stroke Impact Scale – Hand Domain.

One study reported an excellent correlation between the MJT and the Nine Hole Peg Test and an adequate correlation with grip strength.

Predictive:
No studies have examined the predictive validity of the JHFT.

Construct:
No studies have examined the construct validity of the JHFT.

One study reported no significant difference in scores on the JHFT (Portuguese version) according to education level or hand dominance.

Floor/Ceiling Effects No studies have examined the floor or ceiling effects of the JHFT.
Sensitivity/ Specificity No studies have reported on the sensitivity or specificity of the JHFT.
Does the tool detect change in patients?

One study reported moderate responsiveness of the JHFT from 1 to 3 months post-stroke, and from 3 to 6 months post-stroke.

Acceptability The JHFT is comprised of simple, familiar, and functional tasks. Consideration must be paid to individuals with speech and language difficulties, who may have difficulty understanding instructions and performing the writing task.
Feasibility The JHFT is easy to administer and does not require standardized equipment.
How to obtain the tool?

Information regarding test administration is provided in:

Jebsen, R.H., Taylor, N., Trieschmann, R.B., Trotter, M.J., & Howard, L.A. (1969). An objective and standardized test of hand function. Archives of Physical Medicine and Rehabilitation, 50(6), 311 – 319.

Assessment kits can be purchased from commercial suppliers.

Psychometric Properties

Overview

A literature search was conducted to identify all relevant publications on the psychometric properties of the Jebsen Hand Function Test (JHFT). While studies have been conducted with other patient groups, this review specifically addresses the psychometric properties relevant to patients with stroke. At the time of publication five studies were identified: three relating to the JHFT, and one each for the JHFT (Portuguese version) and the Modified Jebsen Hand Function Test (MJT).

Floor/Ceiling Effects

No studies have examined the floor or ceiling effects of the JHFT.

Reliability

Internal consistency:
Ferreiro, dos Santos, & Conforto (2010) examined the internal consistency of the JHFT (Portuguese version) with a sample of 40 patients with stroke using Cronbach’s alpha, and reported excellent internal consistency (α=0.924). Internal consistency of individual items, reported using Pearson’s correlation coefficient and Cronbach’s alpha, was adequate to excellent (writing: r=0.812, α=0.844; card turning: r=0.857, α=0.632; small common objects: r=0.657, α=0.651; simulated feeding: r=0.813, α=0.646; checkers: r=0.712, α=0.633; large light objects: r=0.849, α=-0.681; large heavy objects: r=0.898, α=0.687).

Test-retest:
Jebsen et al. (1969) examined test-retest reliability of the JHFT in a sample of 26 patients with a range of upper limb conditions including hemiparesis from cerebral vascular disease (n=5), using Pearson’s correlation coefficient. Test-retest reliability of individual tasks was adequate to excellent (writing: r=0.67, 0.84; cards: r=0.91, 0.78; small objects: r=0.93, 0.85; simulated feeding: r=0.92, 0.60; checkers: r=0.99, 0.91; large light objects: r=0.89, 0.67; large heavy objects: r=0.89, 0.92, dominant and non-dominant hands respectively).

Bovend’Eerdt et al. (2004) examined the test-retest reliability of the Modified Jebsen Hand Function Test (MJT) in a sample of 26 individuals with neurological disorders including stroke (n=12), Multiple Sclerosis (n=7), head injury (n=4), and tumours (n=3). The mean time between retesting was 9.6 days. The study reported excellent test-retest reliability of the MJT (r = 0.95), using Pearson’s correlation coefficient.

Intra-rater:
Ferreiro, dos Santos, & Conforto (2010) examined intra-rater reliability of the JHFT (Portuguese version) with a sample of 40 patients with stroke and reported excellent intra-rater reliability (ICC=0.997), using intraclass correlation coefficient (ICC).

Inter-rater:
Ferreiro, dos Santos, & Conforto (2010) examined the inter-rater reliability of the JHFT (Portuguese version) with a sample of 40 patients with stroke using intraclass correlation coefficient (ICC), and reported excellent inter-rater reliability (ICC=1.0). Inter-rater reliability for individual items was also excellent (writing, ICC=0.999; card turning, ICC=0.977; small common objects, ICC=0.998; simulated feeding, ICC=0.991; checkers, ICC=0.995; large light objects, ICC=0.988; large heavy objects, ICC=0.991).

Validity

Content:

No studies have examined the content validity of the JHFT.

Criterion:

Concurrent:
Beebe & Lang (2009) examined the concurrent validity of the JHFT with grip and pinch strength (measured by dynamometer), the Action Research Arm Test (ARAT), Nine Hole Peg Test (NHPT), and the Stroke Impact Scale – Hand domain (SIS-Hand) in a sample of 33 patients with stroke, using Spearman’s correlation. Measures were administered at 1 month, 3 months and 6 months post-stroke. The JHFT demonstrated excellent correlations with grip strength (r=0.79-0.81), pinch strength (r=0.60-0.79), ARAT (r=0.87-0.95), NHPT (r=0.84-0.97) and SIS-Hand (r=0.61-0.83) at all time points.
Note: The study did not use the first task of the JHFT (writing a sentence) due to its dependence on hand dominance and education level.

Beebe & Lang (2007) examined concurrent validity of the JHFT with grip and pinch strength (measured by dynamometer), Action Research Arm Test (ARAT), Nine Hole Peg Test (NHPT), and Stroke Impact Scale – Hand Function Subscale (SIS-Hand) in a sample of 32 participants with stroke, using Pearson’s product moment correlation. The JHFT demonstrated excellent correlations with ARAT (r=-0.89), grip strength (r=-0.76), pinch strength (r=-0.68), NHPT (r=-0.89), and SIS-Hand (r=-0.82).
Note: The study did not use the first task of the JHFT (writing a sentence) due to its dependence on hand dominance and education level.

Bovend’Eerdt et al. (2004) examined the concurrent validity of the Modified Jebsen Hand Function Test (MJT) with the University of Maryland Arm Questionnaire for Stroke (UMAQS), Nine Hole Peg Test (NHPT), and grip strength (measured by dynamometer) in a sample of 26 individuals with neurological disorders including stroke (n=12), Multiple Sclerosis (n=7), head injury (n=4), and tumours (n=3). Measures were administered on two occasions (T1, T2) on average 9.6 days apart. The MJT showed excellent correlation with the NHPT (r=0.86 and 0.88 on T1 and T2 respectively) and adequate correlation with grip strength (r=0.44, significant on T2 only), using Pearson’s correlation coefficient. Correlations between the MJT and UMAQS were not significant at either time point.

Predictive:
No studies have examined the predictive validity of the JHFT.

Construct:

No studies have examined the construct validity of the JHFT.

Known Groups:
Ferreiro et al. (2010) reported no significant difference in scores on the JHFT (Portuguese version) according to education level or hand dominance in a sample of 40 patients with stroke.

Responsiveness

Beebe & Lang (2009) measured the responsiveness of the JHFT with a sample of 33 patients with stroke, using the single population effect size method. Measures were taken at 1, 3 and 6 months post-stroke, during which time participants received conventional stroke rehabilitation. The JHFT demonstrated moderate responsiveness from 1 to 3 months post-stroke (ES=0.69) and from 3 to 6 months post-stroke (ES=0.73).
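As a worked illustration of the effect size statistic mentioned above, the sketch below assumes the commonly used definition of the single population effect size: mean change score divided by the standard deviation of the baseline scores. The function name and the example scores are hypothetical and are not data from Beebe & Lang (2009). The sample standard deviation is used here, which is also an assumption.

```python
from statistics import mean, stdev


def single_population_effect_size(baseline, follow_up):
    """Mean change divided by the SD of baseline scores.

    baseline and follow_up are paired lists, one score per patient.
    The sign follows the direction of change; for timed tests such as
    the JHFT, improvement (faster times) gives a negative value, so the
    magnitude is what is usually interpreted.
    """
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow_up must be paired")
    if len(baseline) < 2:
        raise ValueError("at least two patients are needed for an SD")
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)


# Hypothetical total times (seconds) for three patients.
baseline_scores = [60, 80, 100]
follow_up_scores = [50, 70, 90]
es = single_population_effect_size(baseline_scores, follow_up_scores)
print(abs(es))  # 0.5
```

Under the widely used convention that magnitudes of about 0.2, 0.5 and 0.8 denote small, moderate and large effects, the reported values of 0.69 and 0.73 fall in the moderate range.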

Sensitivity & Specificity:
No studies have examined the sensitivity and specificity of the JHFT.

References

  • Beebe, J.A. & Lang, C.E. (2007). Relating movement control at 9 upper extremity segments to loss of hand function in people with chronic hemiparesis. Neurorehabilitation and Neural Repair, 21(3), 279 – 291.
  • Beebe, J.A. & Lang, C.E. (2009). Relationships and responsiveness of six upper extremity function tests during the first six months of recovery after stroke. Journal of Neurologic Physical Therapy, 33(2), 96-103.
  • Bovend’Eerdt, T.J.H., Dawes, H., Johansen-Berg, H., & Wade, D.T. (2004). Evaluation of the Modified Jebsen Test of Hand Function and the University of Maryland Arm Questionnaire for Stroke. Clinical Rehabilitation, 18, 195-202.
  • Celnik, P., Hummel, F., Harris-Love, M., Wolk, R., & Cohen, L. (2007). Somatosensory stimulation enhances the effects of training functional hand tasks in patients with chronic stroke. Archives of Physical Medicine and Rehabilitation, 88, 1369-76.
  • Cook, C., McCluskey, A., & Bowman, J. (2006). Jebsen Test of Hand Function. Penrith South, NSW: University of Western Sydney. Retrieved from http://www.maa.nsw.gov.au/default.aspx?MenuID=376
  • Duncan, P., Richards, L., Wallace, D., Stoker-Yates, J., Pohl, P., Luchies, C., Ogle, A., & Studenski, S. (1998). A randomized, controlled pilot study of a home-based exercise program for individuals with mild and moderate stroke. Stroke, 29, 2055-2060.
  • Ferreiro, K.N., dos Santos, R.L., & Conforto, A.B. (2010). Psychometric properties of the Portuguese version of the Jebsen-Taylor test for adults with mild hemiparesis. Revista Brasileira de Fisioterapia (Brazilian Journal of Physiotherapy), 14(5), 377-81.
  • Hummel, F., Celnik, P., Giraux, P., Floel, A., Wu, W., Gerloff, C., & Cohen, L. (2005). Effects of non-invasive cortical stimulation on skilled motor function in chronic stroke. Brain, 128, 490-9.
  • Jebsen, R.H., Taylor, N., Trieschmann, R.B., Trotter, M.J., & Howard, L.A. (1969). An objective and standardized test of hand function. Archives of Physical Medicine and Rehabilitation, 50(6), 311 – 319.
  • Poole, J. (2003). Measures of Adult Hand Function: Arthritis Hand Function Test (AHFT), Grip Ability Test (GAT), Jebsen Test of Hand Function, and The Rheumatoid Hand Functional Disability Scale (The Duruöz Hand Index [DHI]). Arthritis and Rheumatism (Arthritis Care and Research), 49(5S), S59-66.
  • Spinal Cord Injury Rehabilitation Evidence. (2010). Jebsen Hand Function Test. Retrieved from http://www.scireproject.com/outcome-measures/jebsen-hand-function-test
  • Wu, C., Seo, H., & Cohen, L. (2006). Influence of electric somatosensory stimulation on paretic-hand function in chronic stroke. Archives of Physical Medicine and Rehabilitation, 87, 351-7.

See the measure

How to obtain the JHFT?

Administration instructions are published in Jebsen, R.H., Taylor, N., Trieschmann, R.B., Trotter, M.J., & Howard, L.A. (1969). An objective and standardized test of hand function. Archives of Physical Medicine and Rehabilitation, 50(6), 311 – 319.

While the JHFT does not require standardized equipment, assessment kits can be purchased from commercial suppliers.

Leeds Adult Spasticity Impact Scale (LASIS)

Evidence Reviewed as of before: 13-06-2012
Author(s)*: Annabel McDermott, OT
Editor(s): Nicol Korner-Bitensky, PhD OT

Purpose

The Leeds Adult Spasticity Impact Scale (LASIS) is a measure of passive arm function, suitable for patients with spasticity and little or no active movement of the upper extremity.

In-Depth Review

Purpose of the measure

The Leeds Adult Spasticity Impact Scale (LASIS) is a measure of passive arm function that is administered by semi-structured interview to the patient or carer. It consists of 12 items of low difficulty that evaluate performance of daily functional tasks in the individual’s normal environment. The LASIS is useful for patients with minimal or no active movement or function but with self-care issues of the upper extremity (Ashford et al., 2008).

Available versions

The LASIS was originally published as the Patient Disability and Carer Burden Scale by Bhakta et al. (1996), which included 8 patient items and 4 carer items (Bhakta et al., 2000). The four carer items have been excluded from the current version of the LASIS.

Features of the measure

Items:

The LASIS consists of 12 items that measure passive and low-level active function.

Passive function items:

  • Cleaning the palm (affected hand)*
  • Cutting fingernails (affected hand)*
  • Cleaning the affected elbow*
  • Cleaning the affected armpit*
  • Cleaning the unaffected elbow*
  • Putting arm through coat sleeve*
  • Difficulty putting on a glove
  • Difficulty rolling over in bed
  • Doing physiotherapy exercises to arm*

Active function items:

  • Difficulty balancing in standing*
  • Difficulty balancing when walking*
  • Hold object steady, use other hand (jar)

* Items originally included in the Patient Disability and Carer Burden Rating Scale (Bhakta et al., 2000).

Scoring:

Items are rated between 0 – 4 according to the following criteria:

  • 0 = No difficulty
  • 1 = Little difficulty
  • 2 = Moderate difficulty
  • 3 = A great deal of difficulty
  • 4 = Inability to perform the activity

The total score is calculated as the sum of individual item scores divided by the number of questions answered. This results in a total score between 0 and 4 that represents disability or carer burden (Ashford et al., 2008).

Note: As the final score does not rely on responses to all 12 items, it may not reflect actual disability or function in the arm (Ashford et al., 2008).
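The calculation described above can be sketched as follows. This is a minimal illustration, assuming unanswered items are recorded as `None`; the function name `lasis_total` and the example responses are hypothetical.

```python
def lasis_total(responses):
    """Return the LASIS total score (0 to 4).

    responses holds one entry per item: an integer rating from 0
    (no difficulty) to 4 (inability to perform the activity), or
    None if the question was not answered. The total is the sum of
    the ratings divided by the number of questions answered.
    """
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("no questions were answered")
    for rating in answered:
        if rating not in (0, 1, 2, 3, 4):
            raise ValueError(f"invalid rating: {rating!r}")
    return sum(answered) / len(answered)


# Hypothetical interview: 12 items, 3 of them unanswered.
example_responses = [2, 3, None, 4, 1, 0, None, 2, 3, None, 1, 2]
print(lasis_total(example_responses))  # 2.0
```

The example shows why the note above matters: the same total of 2.0 could arise from 9 answered items or from 12, so the number of answered items is worth recording alongside the score.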

Description of tasks:

The LASIS is administered through semi-structured interview with the patient or carer, with regard to the patient’s performance of tasks over the past 7 days.

Time:

The LASIS takes approximately 10 minutes to administer (Ashford et al., 2008).

Training requirements:

The LASIS should be administered by a clinician (Ashford et al., 2008).

Equipment:

Equipment such as a jar may be required to validate responses.

Alternative form of the Leeds Adult Spasticity Impact Scale (LASIS)

None reported.

Client suitability

Can be used with:

  • Patients with spasticity, including patients with stroke.

Should not be used with:

  • None reported.

Languages of the measure

No translations reported.

Summary

What does the tool measure? Passive and low-level active function of the upper limb.
What types of clients can the tool be used for? Patients with upper limb spasticity, including patients who have experienced a stroke.
Is this a screening or assessment tool? Assessment tool
Time to administer 10 minutes
Versions The LASIS was originally published as the Patient Disability and Carer Burden Scale, which included four dressing and grooming items that have been excluded from the current version of the LASIS.
Other Languages None reported
Measurement Properties
Reliability Internal consistency:
No studies have reported on the internal consistency of the LASIS.

Test-retest:
No studies have reported on the test-retest reliability of the LASIS.

Intra-rater:
No studies have reported on the intra-rater reliability of the LASIS.

Inter-rater:
No studies have reported on the inter-rater reliability of the LASIS.

Validity Content:
No studies have reported on the content validity of the LASIS.

Criterion:
Concurrent:
No studies have reported on the concurrent validity of the LASIS.

Predictive:
No studies have reported on the predictive validity of the LASIS.

Construct:
Convergent/Discriminant:
No studies have reported on the convergent/discriminant validity of the LASIS.

Known Groups:
No studies have reported on the known-groups validity of the LASIS.

Floor/Ceiling Effects No studies have reported on the floor or ceiling effects of the LASIS.
Does the tool detect change in patients? No studies have reported on the responsiveness of the LASIS in patients with stroke.
Acceptability The LASIS is useful for patients with minimal or no active movement or function of the upper extremity.
Feasibility Administrative burden due to calculation of total score, but not complex.
How to obtain the tool? Further information can be found here.

Psychometric Properties

Overview

A literature search was conducted to identify all relevant publications on the psychometric properties of the Leeds Adult Spasticity Impact Scale (LASIS). At the time of publication no studies have reported on the psychometric properties of the LASIS in the stroke population.

Floor/Ceiling Effects

While no studies have investigated the floor or ceiling effects of the LASIS when used with a stroke population, it can be anticipated that ceiling effects may exist when the LASIS is used with high-functioning patients, due to the hierarchical relationship of items (Ashford et al., 2008).

Reliability

Internal consistency:
No studies have reported on the internal consistency of the LASIS.

Test-retest:
No studies have reported on the test-retest reliability of the LASIS.

Intra-rater:
No studies have reported on the intra-rater reliability of the LASIS.

Inter-rater:
No studies have reported on the inter-rater reliability of the LASIS.

Validity

Content:

No studies have reported on the content validity of the LASIS.

Criterion:

Concurrent:
No studies have reported on the concurrent validity of the LASIS.

Predictive:
No studies have reported on the predictive validity of the LASIS.

Construct:

Convergent/Discriminant:
No studies have reported on the convergent/discriminant validity of the LASIS.

Known Groups:
No studies have reported on the known-groups validity of the LASIS.

Responsiveness

No studies have reported on the responsiveness of the LASIS.

Sensitivity/Specificity:
No studies have reported on the sensitivity or the specificity of the LASIS.

References

  • Ashford, S., Slade, M., Malaprade, F., & Turner-Stokes, L. (2008). Evaluation of functional outcome measures for the hemiparetic upper limb: A systematic review. Journal of Rehabilitation Medicine, 40, 787-95.
  • Bhakta, B.B., Cozens, J.A., Chamberlain, M.A., & Bamford, J.M. (2000). Impact of botulinum toxin type A on disability and carer burden due to arm spasticity after stroke: a randomised double blind placebo controlled trial. Journal of Neurology, Neurosurgery and Psychiatry, 69, 217-21.

See the measure

How to obtain the LASIS?

Further information can be found here.

Motor Activity Log (MAL)

Evidence Reviewed as of before: 28-03-2019
Author(s)*: Annabel McDermott, OT
Content consistency: Gabriel Plumier

Purpose

The Motor Activity Log (MAL) is a subjective measure of an individual’s real life functional upper limb performance. The MAL is administered by semi-structured interview to determine (a) how much, and (b) how well the individual uses his/her upper limb in his/her own home (Ashford et al., 2008; Li et al., 2012; Simpson & Eng, 2013).

In-Depth Review

Purpose of the measure

The Motor Activity Log (MAL) was developed by Taub et al. (1993) as a subjective outcome measure of an individual’s real life functional upper limb performance. The MAL is administered by semi-structured interview to determine (a) how much (Amount of Use – AOU), and (b) how well (Quality of Movement – QOM) the individual uses his/her upper limb in his/her own home (Ashford et al., 2008; Li et al., 2012; Simpson & Eng, 2013).

Available versions

There are four versions of the original MAL-30, according to number of items.

  • MAL-14: Contains unilateral and simple items, to detect change in individuals with limited arm function.
  • MAL-26: Contains the same items as the MAL-14 as well as 11 additional items and 1 optional item chosen by the patient; this version includes some bilateral tasks.
  • MAL-28: Contains the same items as the MAL-14 and MAL-26, and additional items that challenge reach and strength.
  • MAL-12: A short version of the MAL-28 (Ashford et al., 2008).

Other adaptations of the MAL include:

  • Graded Motor Activity Log (Morera Silva et al., 2018)
  • Lower-Functioning Motor Activity Log (LF-MAL)
  • Lower-Extremity Motor Activity Log
  • Pediatric Motor Activity Log – Revised

Features of the measure

The MAL is comprised of two scales:

  • Amount of Use (AOU) scale – the amount the individual uses the paretic arm; and
  • Quality of Movement (QOM) scale – the patient’s perceived quality of movement while performing the functional activity (Ashford et al., 2008).

The MAL-QOM scale captures components of amount of arm use and has been shown to be more reliable than the MAL-AOU scale, and as such can be used independently (Uswatte & Taub, 2005).

Items:

Items of the original MAL-30:

  1. Turn on a light with a light switch
  2. Open drawer
  3. Remove an item from a drawer
  4. Pick up phone
  5. Wipe off a kitchen counter or other surface
  6. Get out of a car
  7. Open refrigerator
  8. Open a door by turning a door knob/handle
  9. Use a TV remote control
  10. Wash your hands
  11. Turning water on/off with knob/lever on faucet
  12. Dry your hands
  13. Put on your socks
  14. Take off your socks
  15. Put on your shoes
  16. Take off your shoes
  17. Get up from a chair with armrests
  18. Pull chair away from table before sitting down
  19. Pull chair toward table after sitting down
  20. Pick up a glass, bottle, drinking cup, or can
  21. Brush your teeth
  22. Put on makeup base, lotion, or shaving cream on face
  23. Use a key to unlock a door
  24. Write on paper
  25. Carry an object in your hand
  26. Use a fork or a spoon for eating
  27. Comb your hair
  28. Pick up a cup by a handle
  29. Button a shirt
  30. Eat half a sandwich or finger foods

Additional items for the MAL-45:

  • Removing bills from a wallet
  • Taking individual coins out of a pocket or purse
  • Removing keys out of a pocket or purse
  • Using a zipper pull
  • Pouring liquid from a bottle
  • Buckling a belt
  • Popping top of beverage can
  • Removing top from a medicine bottle
  • Keypad press
  • Use of keyboard/computer
  • Putting on or taking off watch band
  • Putting on glasses
  • Pumping a soap dispenser
  • Swiping a credit card or a card for an ATM
  • Adjusting a home or hotel air conditioner or heat

Items of the MAL-12:

  1. Pick up phone
  2. Open a door by turning a door knob
  3. Eat half a sandwich or finger food
  4. Turn water on/off with faucet
  5. Pick up a glass
  6. Pick up toothbrush and brush teeth
  7. Use a key to open a door
  8. Letter writing/typing
  9. Use removable computer storage
  10. Pick up fork or spoon, use for eating
  11. Pick up cup by handle
  12. Carry an object from place to place

Items of the MAL-14:

  1. Putting arm through coat sleeve
  2. Steady myself while standing
  3. Carry an object from place to place
  4. Pick up fork or spoon, use for eating
  5. Comb hair
  6. Pick up cup by handle
  7. Hand craft/card playing
  8. Hold a book for reading
  9. Use towel to dry face or other body part
  10. Pick up a glass
  11. Pick up toothbrush and brush teeth
  12. Shaving/makeup
  13. Use a key to open a door
  14. Letter writing/typing

The MAL-26 includes the 14 items from the MAL-14 as well as the following items:

  1. Pour coffee/tea
  2. Peel fruit/potatoes
  3. Dial number on the phone
  4. Open/close a window
  5. Open an envelope
  6. Take money out of a wallet or purse
  7. Undo buttons on clothing
  8. Buttons on clothing
  9. Undo a zip
  10. Do up a zip
  11. Cut fingernails (affected hand)
  12. Other optional activity

Items of the MAL-28:

  1. Turn on a light with a light switch
  2. Open a drawer
  3. Remove item of clothing from drawer
  4. Pick up phone
  5. Wipe kitchen counter
  6. Get out of car
  7. Open refrigerator
  8. Open a door by turning a door knob
  9. Use a TV remote control
  10. Wash your hands
  11. Turn water on/off with faucet
  12. Dry your hands
  13. Put on your socks
  14. Take off your socks
  15. Put on your shoes
  16. Take off your shoes
  17. Get up from chair with armrests
  18. Pull chair away from table before sitting
  19. Pull chair toward table after sitting
  20. Pick up a glass
  21. Pick up toothbrush and brush teeth
  22. Use a key to unlock a door
  23. Steady self while standing
  24. Carry an object from place to place
  25. Comb hair
  26. Pick up cup by handle
  27. Buttons on clothing (shirt, trousers)
  28. Eat half a sandwich or finger food

For each item, the individual is asked whether he/she attempted the activity in the past 7 days, and the relevant score is assigned according to his/her response. The examiner can verify the response by paraphrasing it back to the individual (Uswatte & Taub, 2005). The MAL can also be used with caregivers.

Scoring:

The MAL is administered by semi-structured interview and items are scored by patients according to their performance of each task over the past 7 days; the MAL-28 can also be used to score performance over the past 3 days (Ashford et al., 2008; Uswatte & Taub, 2005).

The MAL adopts a 6-point ordinal scale, although patients can attribute a half-score, resulting in 11-point Likert scales with specified anchoring definitions at 6 points (Uswatte & Taub, 2005):

Amount of Use scale scoring:

  • 0: Never – The weaker arm was not used at all for that activity.
  • 1: Very rarely – Occasionally used the weaker arm, but only very rarely.
  • 2: Rarely – Sometimes used the weaker arm but did the activity most of the time with the stronger arm.
  • 3: Half pre-stroke – Used the weaker arm about half as much as before the stroke.
  • 4: Three quarters pre-stroke – Used the weaker arm almost as much as before the stroke.
  • 5: Same – Used the weaker arm as often as before the stroke.

Quality of Movement scale scoring:

  • 0: Never – The weaker arm was not used at all for that activity.
  • 1: Very rarely – The weaker arm was moved during the activity but was not very helpful.
  • 2: Rarely – The weaker arm was of some use during the activity but needed some help from the stronger arm, or moved very slowly or with difficulty.
  • 3: Fair – The weaker arm was used for that activity, but the movements were slow or were made only with some effort.
  • 4: Almost normal – The movements made by the weaker arm for the activity were almost normal but not quite as fast or accurate as normal.
  • 5: Normal – The ability to use the weaker arm for that activity was as good as before the stroke.

Scale total scores (summary scores) are the mean of the item scores.
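The scoring rules above can be sketched in code. This is a minimal illustration, assuming each scale is scored separately from a list of item ratings between 0 and 5 in half-point steps; the function name `mal_scale_score` and the example ratings are hypothetical.

```python
def mal_scale_score(item_scores):
    """Return the summary score for one MAL scale (AOU or QOM).

    item_scores is a list of ratings, each from 0 to 5 in steps of
    0.5 (the 6 anchored points plus half-scores). The summary score
    is the mean of the item scores.
    """
    if not item_scores:
        raise ValueError("no item scores provided")
    for score in item_scores:
        on_half_step = float(score * 2).is_integer()
        if not (0 <= score <= 5) or not on_half_step:
            raise ValueError(f"invalid MAL rating: {score!r}")
    return sum(item_scores) / len(item_scores)


# Hypothetical ratings for five items on each scale.
aou_ratings = [3, 2.5, 4, 0, 1.5]
qom_ratings = [2, 2.5, 3, 0, 1]
print(mal_scale_score(aou_ratings))
print(mal_scale_score(qom_ratings))
```

Keeping the two scales in separate calls mirrors how the Amount of Use and Quality of Movement summary scores are reported separately.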

What to consider before beginning:

The MAL is subject to experimenter bias and also the patient’s ability to accurately recall upper limb use (Page & Levine, 2003; Uswatte & Taub, 2005).

Ashford et al. (2008) noted an inadequate relationship between overall/item scores and their qualitative meaning, and an unclear minimal clinically important difference.

Taub & Uswatte (2000) discuss the use of the MAL as an outcome measure in Constraint-Induced Movement Therapy (CIMT) research and recommend an upper cut-off score of 2.5 on the MAL-AOU, as the effect of stroke can impose an upper physiological limit on the amount of improvement that can be produced. The authors also note that individuals who score > 2.5 do not demonstrate learned non-use, which is the phenomenon that CIMT aims to overcome.

Time:

All versions of the MAL are administered through semi-structured interview with the patient and/or carer and require more than 10 minutes to administer (Ashford et al., 2008).

Training requirements:

The MAL can be administered by health professionals who have reviewed the manual and literature.

Equipment:

Survey instrument and pencil.

Client suitability

Can be used with:

  • The MAL is suitable for use with adults and elderly adults following stroke and their caregivers. It is suitable for use in the subacute and chronic stages of stroke recovery.

Should not be used in:

Not specified.

  • The MAL is often used to measure outcomes following constraint induced movement therapy (Li et al., 2012; Page, 2003). The MAL is commonly used in research in conjunction with the Wolf Motor Function Test, Fugl-Meyer Assessment or the Action Research Arm Test (Santisteban et al., 2016; Simpson & Eng, 2013).

In what languages is the measure available?

  • Brazilian-Portuguese (Saliba et al., 2011)
  • English
  • German (Khan et al., 2013)
  • Portuguese (Pereira et al., 2011)
  • Turkish translation and cultural adaptation (Cakar et al., 2010).

Summary

What does the tool measure? Real life upper limb performance.
What types of clients can the tool be used for? Individuals following stroke and their caregivers.
Is this a screening or assessment tool? Assessment
What domain of the ICF does this measure? Activity/participation
Time to administer 20 minutes
Versions
  • MAL-30
  • MAL-28
  • MAL-26
  • MAL-14
  • MAL-12
  • Graded Motor Activity Log
  • Lower-Functioning Motor Activity Log (LF-MAL)
  • Lower-Extremity Motor Activity Log
  • Pediatric Motor Activity Log – Revised
Other Languages Brazilian-Portuguese, English, German, Portuguese, Turkish.
Measurement Properties
Reliability Internal consistency:
– MAL-14: Two studies reported excellent internal consistency.
– MAL: One study reported excellent internal consistency; one study reported excellent internal consistency among patients with mild-moderate hemiparesis and adequate to excellent internal consistency among patients with severe hemiparesis.
– MAL-28 (Turkish): One study reported excellent internal consistency.
– MAL-30 (German): One study reported excellent internal consistency.
– Grade 4/5 MAL: One study reported excellent internal consistency.

Test-retest:
– MAL-14: One study reported excellent test-retest reliability; one study reported adequate to excellent test-retest reliability.
– MAL: One study reported excellent test-retest reliability; one study reported adequate to excellent test-retest reliability.
– MAL-28 (Turkish): One study reported excellent test-retest reliability.
– MAL-28 (Brazilian): One study reported excellent test-retest reliability.
– MAL-45: One study reported excellent test-retest reliability.
– Grade 4/5 MAL: One study reported excellent test-retest reliability.

Intra-rater:
No studies have reported on the intra-rater reliability of the MAL.

Inter-rater:
MAL-14: One study reported adequate inter-rater reliability.

Validity Content:
No studies have reported on content validity of the MAL.

Criterion:
Concurrent:
– MAL-14: One study reported excellent correlations with accelerometry.
– MAL: Three studies reported an excellent correlation with SIS – Hand function domain; adequate correlations with the BBT, ARAT, FAI; poor to adequate correlations with SIS, SS-QOL, NEADL; and poor correlations with the Nine Hole Peg Test.
– MAL-30 (German): One study reported excellent negative correlations with WMFT-PT; excellent correlations with WMFT-FA and Grip strength scores, CMSA – Arm and Hand scores, isometric strength.
– MAL-45: One study reported excellent correlations with the Abilhand.

Predictive:
No studies have reported on predictive validity of the MAL.

Construct:
– MAL-14: One study reported excellent correlations between QOM and AOU patient/carer change scores; one study reported an excellent correlation between AOU and QOM scales.
– MAL: One study reported an excellent correlation between AOU and QOM scales; one study reported an adequate correlation between AOU and QOM scales; one study conducted item analysis and removed two items due to low item-total correlations and reliability coefficients; one study conducted item fit analysis and principal component analysis.
– MAL (Brazilian): One study reported an excellent correlation between AOU and QOM scales.
– MAL-30 (German): One study reported excellent correlations between AOU and QOM scales.
– MAL-28 (Turkish): One study reported an excellent correlation between AOU and QOM scales.
– LF-MAL: One study reported an adequate correlation between the AOU and QOM scales.

Convergent/Discriminant:
– MAL-14: Three studies reported excellent correlations with ARAT, accelerometry, Simple Test for Evaluating Hand Function (STEF).
– MAL: Seven studies reported excellent correlations with Actual Amount of Use Test, WMFT; adequate to excellent correlations with accelerometry ratios, SIS 2.0 – Hand function scale, FMA-UE; adequate correlations with ARAT, Motor Assessment Scale – Upper Extremity, 16 Hole Peg Test, grip strength; SF-36 – Physical domain; poor to adequate correlations with accelerometry ratios of the less affected arm; poor correlations with the SIS 2.0 – Mobility scale.
– MAL-28 (Turkish): One study reported excellent correlations with WMFT-FA; adequate negative correlations with the WMFT-PT.
– MAL (Brazilian): One study reported adequate correlations with grip strength of the more affected arm.

Known Group:
MAL: One study reported that correlations with accelerometry were stronger among patients with paresis of the dominant arm vs. the non-dominant arm.

Floor/Ceiling Effects – Floor effects are evident when detecting change in lower level and passive functional tasks.
– One study found modest floor effects when the MAL-28 was administered to patients with upper extremity motor recovery at Brunnstrom stage III and higher; and modest floor effects when the LF-MAL was administered to patients with upper extremity motor recovery at Brunnstrom stage III and lower.
Does the tool detect change in patients? The MAL can be used to detect change.
Acceptability The MAL reflects real life functional performance. It is simple and non-invasive to administer.
Feasibility The MAL is a free tool that requires no additional equipment. It can be administered in the clinical setting or the patient’s home. No additional training is required.
How to obtain the tool?

Click here to see the Motor Activity Log manual.

Psychometric Properties

Overview

A literature search was conducted to identify all relevant publications on the psychometric properties of the MAL. Twenty-six studies were identified, most of which included patients in the chronic phase of stroke recovery. This review includes different versions of the MAL – the original MAL-30, MAL-28, MAL-14, MAL-45, LF-MAL, Grade 4/5 MAL and Turkish, Brazilian and German versions.

Floor/Ceiling Effects

Chuang et al. (2017) examined floor/ceiling effects of the 30-item MAL in a sample of 403 patients with chronic stroke. The MAL was administered to patients with motor recovery of the proximal and distal upper limb at Brunnstrom stage III and higher. Results showed modest floor effects within this cohort, whereby 17.3% of participants received minimum scores on the MAL.

Chuang et al. (2017) examined floor/ceiling effects of the LF-MAL in a sample of 134 patients with chronic stroke. The LF-MAL was administered to patients with motor recovery of the proximal and distal upper limb at Brunnstrom stage III and lower. Results showed modest floor effects within this cohort, whereby 16.4% of participants received minimum scores on the LF-MAL.
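The floor-effect percentages above are the share of participants who scored at the scale minimum. A minimal Python sketch of that calculation follows. It assumes a hypothetical list of mean MAL scores and a scale minimum of 0; the function name and the data are illustrative and are not taken from Chuang et al. (2017).

```python
def floor_effect_percent(scores, minimum=0.0):
    """Percentage of respondents whose score equals the scale minimum."""
    if not scores:
        raise ValueError("scores must not be empty")
    at_floor = sum(1 for s in scores if s == minimum)
    return 100.0 * at_floor / len(scores)


# Hypothetical mean MAL scores for 8 participants (0 = arm not used at all).
example_scores = [0.0, 0.0, 1.2, 2.5, 3.1, 0.4, 4.0, 1.8]
floor_pct = floor_effect_percent(example_scores)
print(f"{floor_pct:.1f}% of participants scored at the floor")
```

The same function could be pointed at the scale maximum to check for ceiling effects.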

Reliability

Internal consistency:
van der Lee et al. (2004) examined internal consistency of the MAL-14 in a sample of 56 patients with chronic stroke, using Cronbach’s alpha. Correlation among items was excellent for the MAL-AOU (α = 0.87) and the MAL-QOM (α = 0.90). Limits of agreement ranged from -0.70 to 0.85 for the MAL-AOU and from -0.61 to 0.71 for the MAL-QOM, indicating reproducibility sufficient to detect an individual change of approximately 12-15% of the range of the scale.

Uswatte et al. (2005b) examined internal consistency of the MAL-14 in a sample of 41 patients with chronic stroke and their caregivers, using Cronbach’s alpha. Correlation among items was excellent for patients’ MAL-QOM (α = 0.87) and caregivers’ MAL-AOU and MAL-QOM (α > 0.83). The authors also examined internal consistency of the MAL-14 (QOM scale only) in a sample of 27 patients with chronic stroke. Correlation among items was excellent for the MAL-QOM (α = 0.81).

Uswatte et al. (2006b) examined internal consistency of the MAL-28 in a sample of 222 patients with subacute/chronic stroke and their caregivers, using Cronbach’s alpha. Responses from both patient and caregiver groups showed excellent correlation among items for the MAL-AOU (patients α = 0.94; caregivers α = 0.95) and the MAL-QOM (patients α = 0.94; caregivers α = 0.95).

Huseyinsinoglu et al. (2011) examined internal consistency of the MAL-28 (Turkish version) in a sample of 30 patients with stroke, using Cronbach’s alpha. Internal consistency was excellent for the MAL-AOU (α = 0.96) and MAL-QOM (α = 0.96).

Khan et al. (2013) examined internal consistency of the MAL-30 (German version) in a sample of 42 patients with acute to chronic stroke, using Cronbach’s alpha. Measures were taken at baseline, discharge from rehabilitation and at 6-month follow-up. Internal consistency for the MAL-AOU and MAL-QOM was excellent at all timepoints (α = 0.98-0.995). The authors also calculated internal consistency after eliminating items that scored “N/A”, reducing the scale to 26 items, and reported that internal consistency remained high at all timepoints (α = 0.94-0.98).

Taub et al. (2013) reported on internal consistency of the Grade 4/5 MAL, referencing unpublished data from Morris (2009) that used a sample of 30 individuals with stroke, using Cronbach’s alpha. Internal consistency for the Grade 4/5 MAL was excellent (α = 0.95).
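The studies above all summarize internal consistency with Cronbach’s alpha. As a rough illustration of how that coefficient is obtained from item-level ratings, the sketch below applies the standard formula (number of items, sum of item variances, variance of total scores). The function name and the ratings are hypothetical and do not come from any of the cited studies.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings."""
    k = len(item_scores[0])
    if k < 2 or len(item_scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Transpose so each tuple holds all respondents' ratings for one item.
    items = list(zip(*item_scores))
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical ratings: 4 respondents x 3 items on a 0-5 scale.
example = [
    [1, 2, 1],
    [3, 3, 2],
    [4, 5, 4],
    [5, 5, 5],
]
alpha_example = cronbach_alpha(example)
print(f"alpha = {alpha_example:.2f}")
```

Values close to 1 indicate that items rise and fall together across respondents, which is what the reviewed studies report for the MAL-AOU and MAL-QOM.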

Chuang et al. (2017) examined the 6-point rating system of the MAL and found rater difficulty discriminating among the 6 levels of functional ability. Results showed that 15 items of the MAL-AOU and MAL-QOM displayed disordering of step difficulty. Accordingly, the 6 levels were collapsed into 4 levels to restore the reversed thresholds (0 = 0; 1-2 = 1; 3-4 = 2; 5 = 3). Using the 4-point system, 9 items still showed disordered thresholds, so the levels were further collapsed into 2 categories (0 = 0; 1 to 3 = 1), at which point all items exhibited ordered thresholds. The authors examined unidimensionality of the 30-item MAL in a sample of 403 patients with chronic stroke, using the revised scoring system. Item fit analysis of the MAL revealed that 7 items* of the MAL-AOU and MAL-QOM were a poor fit and were removed. Principal component analysis (PCA) of the remaining 23 items showed that Rasch measures accounted for 76% of the variance for both the MAL-AOU and MAL-QOM, with an eigenvalue of the first residual factor of 2.7. This indicates that the 23 items constitute unidimensional constructs. The authors examined reliability of the revised MAL (23 items, 4-point rating system), using Rasch analysis. With person separation values of 2.4 and 2.6 for the MAL-AOU and MAL-QOM respectively, the revised version was sensitive enough to distinguish among 3 strata of upper limb performance. Person reliability coefficients were 0.85 and 0.87 (respectively), suggesting good reliability. Results showed no Differential Item Functioning (DIF) items across age, gender or hand dominance. Item difficulty hierarchy was consistent with clinical expectation; however, items were more difficult than individuals’ ability, suggesting unsuitable targeting for the participants of this sample.

* Misfit items: (6) Get out of car; (12) Dry your hands; (18) Pull a chair away from the table before sitting down; (19) Pull chair toward table after sitting down; (21) Brush your teeth; (24) Write on paper; (29) Button a shirt.

Chuang et al. (2017) examined the 6-point rating system of the LF-MAL and found disordered thresholds; accordingly, the 6 levels were collapsed into 3 levels to restore the reversed thresholds (0 = 0; 1-3 = 1; 4-5 = 2). This 3-point rating system achieved step ordering. The authors examined unidimensionality of the LF-MAL in a sample of 134 patients with chronic stroke, using the revised 3-point scoring system. Item fit analysis of the LF-MAL-AOU revealed that 6 items were out of the acceptable range; PCA of the remaining 24 items showed that the Rasch dimension explained 70.5% of the variance, with an eigenvalue of the first residual factor of 2.6. Item fit analysis of the LF-MAL-QOM revealed that 7 items were out of the acceptable range; PCA of the remaining 23 items showed that the Rasch dimension explained 71.0% of the variance, with an eigenvalue of the first residual factor of 2.5. The authors examined reliability of the revised LF-MAL (25 items, 3-point rating system), using Rasch analysis. With person separation values of 1.9 for both the LF-MAL-AOU and LF-MAL-QOM, the revised version was sensitive enough to distinguish 2 strata of upper limb performance. Person reliability coefficients were 0.79 for both the LF-MAL-AOU and LF-MAL-QOM, indicating acceptable reliability. Results showed no DIF items across age, gender or hand dominance. Item difficulty hierarchy was consistent with clinical expectation; however, items were more difficult than individuals’ ability, suggesting unsuitable targeting for the participants of this sample.

* Misfit items: (5) Wipe off a kitchen counter or another surface; (6) Get out of a car; (7) Open a refrigerator; (19) Apply soap to your body while bathing (LF-MAL-QOM only); (21) Brush your teeth; (23) Steady yourself while standing; (24) Carry an object in your hand.
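Two steps in the Rasch analyses above lend themselves to a short illustration: collapsing the 6-point ratings into fewer categories, and converting a separation value into a number of distinguishable strata. The sketch below uses the first-stage collapsing rules quoted above and the commonly used strata formula (4G + 1) / 3. It is an assumption that the cited authors applied this exact formula, and all names are hypothetical.

```python
# First-stage collapsing rule quoted for the MAL: 0 = 0; 1-2 = 1; 3-4 = 2; 5 = 3.
MAL_COLLAPSE = {0: 0, 1: 1, 2: 1, 3: 2, 4: 2, 5: 3}
# Collapsing rule quoted for the LF-MAL: 0 = 0; 1-3 = 1; 4-5 = 2.
LF_MAL_COLLAPSE = {0: 0, 1: 1, 2: 1, 3: 1, 4: 2, 5: 2}


def collapse(ratings, mapping):
    """Recode integer 0-5 ratings into the collapsed categories."""
    return [mapping[r] for r in ratings]


def strata(separation):
    """Distinct strata implied by a separation index G, as (4G + 1) / 3.

    Assumption: this widely used conversion is shown for illustration only;
    the review does not state which formula each study applied.
    """
    return (4 * separation + 1) / 3


recoded = collapse([0, 1, 2, 3, 4, 5], MAL_COLLAPSE)
print(recoded)
print(round(strata(2.4), 2), round(strata(1.9), 2))
```

Under this formula, separation values of 2.4 and 1.9 give about 3.5 and 2.9. Reading these as 3 and 2 whole strata assumes the values are truncated rather than rounded.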

Moreira Silva et al. (2018) examined internal consistency of the MAL-30 in a sample of 66 individuals with chronic stroke, using Cronbach’s alpha. Participants were classified according to upper extremity motor function using the Fugl-Meyer Assessment – Upper Extremity (FMA-UE): mild to moderate hemiparesis (FMA-UE ≥ 32, n = 49) or severe hemiparesis (FMA-UE ≤ 31, n = 17). Internal consistency of the MAL-AOU and MAL-QOM was excellent among participants with mild-moderate hemiparesis (α = 0.95), and adequate to excellent among participants with severe hemiparesis (MAL-AOU: α = 0.79; MAL-QOM: α = 0.89). Rasch analysis was used to further evaluate reliability of the MAL-30. Item calibration of the MAL-AOU and MAL-QOM revealed one misfit (#19: Pull a chair toward table after sitting down). Item separation index of the MAL-AOU and MAL-QOM was 2.92 and 2.59 (respectively), suggesting 5 levels of difficulty for the MAL-AOU and 4 levels of difficulty for the MAL-QOM. Person separation index of the MAL-AOU and MAL-QOM was 2.62 and 2.58 (respectively), suggesting 4 ability levels for both the MAL-AOU and the MAL-QOM.

Test-retest:
Miltner et al. (1999) examined test-retest reliability of the MAL in a sample of 15 patients with chronic stroke. Measures were taken within a 2-week interval before participants began constraint-induced movement therapy. Test-retest reliability was excellent (r = 0.98).

Johnson et al. (2003) examined test-retest reliability of the MAL-45 in a sample of 12 patients with chronic stroke, using Pearson’s correlation coefficient. Measures were taken within a 3-week interval. Test-retest reliability was excellent for the MAL-AOU (r=0.96) and MAL-QOM (r = 0.99).

van der Lee et al. (2004) examined test-retest reliability of the MAL-14 in a sample of 56 patients with chronic stroke, using the Bland and Altman method. Measures were taken within a 2-week interval before participants commenced an intervention program. Test-retest reliability was excellent, with limits of agreement of -0.70 to 0.85 for the MAL-AOU and -0.61 to 0.71 for the MAL-QOM.
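The Bland and Altman method mentioned above summarizes test-retest agreement as limits of agreement: the mean of the paired differences plus or minus 1.96 standard deviations of those differences. The sketch below shows that calculation on hypothetical paired MAL scores. The function name and data are illustrative and are not drawn from van der Lee et al. (2004).

```python
from statistics import mean, stdev


def limits_of_agreement(test, retest):
    """Return (lower, upper) 95% limits of agreement for paired scores."""
    if len(test) != len(retest) or len(test) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    diffs = [b - a for a, b in zip(test, retest)]
    bias = mean(diffs)                 # mean retest-minus-test difference
    half_width = 1.96 * stdev(diffs)   # 1.96 x SD of the differences
    return bias - half_width, bias + half_width


# Hypothetical mean MAL scores at two sessions for 5 patients.
session_1 = [1.0, 2.0, 3.0, 2.5, 1.5]
session_2 = [1.2, 1.9, 3.4, 2.5, 1.8]
lower, upper = limits_of_agreement(session_1, session_2)
print(f"limits of agreement: {lower:.2f} to {upper:.2f}")
```

A change in an individual patient that falls inside these limits cannot be distinguished from measurement variation.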

Uswatte et al. (2005b) examined test-retest reliability of the MAL-14 in a sample of 41 patients with chronic stroke and their caregivers, using Pearson correlation coefficients. Test-retest reliability was excellent for patient MAL-QOM scores (r = 0.91), and adequate for patient MAL-AOU scores (r = 0.44), and caregiver MAL-AOU and MAL-QOM scores (r = 0.61, r = 0.50 respectively).

Uswatte et al. (2006b) examined 2-week test-retest reliability of the MAL-30 in a sample of 116 patients with subacute/chronic stroke and their caregivers, using intraclass correlation coefficients (ICC). Test-retest reliability for the MAL-AOU and MAL-QOM was excellent among patients (ICC = 0.79, ICC = 0.82, respectively), and adequate among caregivers (ICC = 0.66, ICC = 0.72, respectively). There was a trend toward an increase from test 1 to test 2 among both patients and caregivers (patient MAL-AOU: 0.3 ± 0.6, p = 0.04; patient MAL-QOM: 0.3 ± 0.5, p = 0.02; caregiver MAL-AOU: 0.4 ± 0.7, p = 0.05; caregiver MAL-QOM: 0.4 ± 0.7, p = 0.02), although increases were less than the minimal clinically important difference (< 0.5 points).

Huseyinsinoglu et al. (2011) examined 3-day test-retest reliability of the MAL-28 (Turkish version) in a sample of 30 patients with stroke, using intraclass correlation coefficients (ICC) and Spearman correlation coefficients. Test-retest reliability was excellent for the MAL-AOU (ICC = 0.97, r = 0.94) and the MAL-QOM (ICC = 0.96, r = 0.93).

Saliba et al. (2011) examined test-retest reliability of the MAL (Brazilian version), using intra-class correlation coefficients (ICC). Test-retest reliability for the MAL-AOU and MAL-QOM was excellent (ICC = 0.98).

Taub et al. (2013) reported on test-retest reliability of the Grade 4/5 MAL, referencing unpublished data from Morris (2009) that used a sample of 10 individuals with stroke. Test-retest reliability for the Grade 4/5 MAL was excellent (r = 0.95).

Intra-rater:
No studies have reported on the intra-rater reliability of the MAL.

Inter-rater:
Uswatte et al. (2005b) examined inter-rater reliability of the MAL-14 in a sample of 41 patients with chronic stroke and their caregivers, using intraclass correlation coefficients (ICC). Participants received Constraint-Induced Movement Therapy (CIMT) or time-matched general fitness rehabilitation for two weeks. Reliability between patient and carer pre-treatment scores was adequate (ICC = 0.52, p < 0.01); reliability between patient and carer change scores following treatment was adequate (ICC = 0.7, p < 0.0001).

Validity

Content:

No studies have reported on content validity of the MAL.

Criterion:

Concurrent:
Johnson et al. (2003) examined concurrent validity of the MAL-45 in a sample of 12 patients with chronic stroke by comparison with the Abilhand, using Pearson correlation coefficients. Correlations with the Abilhand were excellent for the MAL-AOU (r = 0.71, p < 0.05) and MAL-QOM (r = 0.88, p < 0.05).

Uswatte et al. (2005b) examined concurrent validity of the MAL-14 (QOM scale only) in a sample of 27 patients with chronic stroke by comparison with accelerometry of the affected arm, using Pearson correlation coefficients. Correlations between the MAL-QOM and accelerometer recordings at pre-treatment (r = 0.70, p < 0.05) were excellent. Correlations between MAL-QOM change scores from pre-treatment to post-treatment and corresponding change scores on accelerometer readings were also excellent (r = 0.91, p < 0.01).

Lin et al. (2010a) examined concurrent validity of the MAL-30 by comparison with the Nine Hole Peg Test (9HPT), the Box and Block Test (BBT) and the Action Research Arm Test (ARAT), using Spearman rank correlation coefficients. Patients with chronic stroke (n=59) were randomized to receive distributed constraint-induced movement therapy, bilateral arm training or neurodevelopmental therapy, and measures were taken at baseline and post-treatment (3 weeks). Correlations at baseline and post-treatment were significant and adequate with the BBT (MAL-AOU: r = 0.37, r = 0.49; MAL-QOM: r = 0.52, r = 0.52) and the ARAT (MAL-AOU: r = 0.31, r = 0.32; MAL-QOM: r = 0.39, r = 0.35). Correlations with the 9HPT were significant for the MAL-QOM only (r = -0.26, r = -0.33).

Lin et al. (2010b) examined concurrent validity of the MAL-30 by comparison with the Stroke Impact Scale 3.0 (SIS) and the Stroke-Specific Quality of Life Scale (SS-QOL), using Spearman rank correlation coefficients. Patients with chronic stroke (n = 74) were randomized to receive distributed constraint-induced movement therapy, bilateral arm training or neurodevelopmental therapy, and measures were taken at baseline and post-treatment (3 weeks). There were significant poor to adequate correlations between the MAL-AOU and most SIS domains at baseline (r = 0.24-0.58) and post-treatment (r = 0.24-0.59). There were significant excellent correlations between the MAL-QOM and the SIS – Hand function domain at baseline (r = 0.65) and post-treatment (r = 0.68), and significant poor to adequate correlations between the MAL-QOM and most other SIS domains at baseline (r = 0.26-0.52) and post-treatment (r = 0.28-0.51). There were significant correlations between the MAL-AOU and some SS-QOL domains at baseline (r = 0.25-0.37) and post-treatment (r = 0.24-0.35), and between the MAL-QOM and some SS-QOL domains at baseline (r = 0.28-0.38) and post-treatment (r = 0.26-0.39).

Wu et al. (2011) examined concurrent validity of the MAL-30 in a sample of 77 patients with chronic stroke by comparison with a modified version of the Nottingham Extended ADL Scale (NEADL) and the Frenchay Activities Index (FAI), using Spearman rank correlation coefficients. Measures were taken at pre-treatment and 3 weeks later at post-treatment. Correlations with the NEADL were poor to adequate (MAL-AOU: r = 0.3; MAL-QOM: r = 0.2-0.3). Correlations with the FAI were adequate (MAL-AOU: r = 0.3-0.4; MAL-QOM: r = 0.3).

Khan et al. (2013) examined cross-sectional concurrent validity of the MAL-30 (German version) by comparison with the Wolf Motor Function Test (WMFT) – Time and Functional ability subtests, the Chedoke McMaster Stroke Assessment (CMSA) – Arm and Hand subtests, the grip strength scale, and isometric strength measured by handheld dynamometer (mean of shoulder and elbow flexion and extension), using Spearman’s rank correlation coefficients. Patients with acute to chronic stroke (n = 42) received inpatient rehabilitation and measures were taken at baseline, discharge from hospital and at 6-month follow-up. Significant negative correlations were seen with the WMFT – Time scores (MAL-AOU r = -0.747 – -0.878; MAL-QOM r = -0.770 – -0.901). Correlations were excellent at all time points with the WMFT – Functional ability (MAL-AOU r = 0.769 – 0.808; MAL-QOM r = 0.789 – 0.837), the CMSA – Arm (MAL-AOU r = 0.680 – 0.765; MAL-QOM r = 0.691 – 0.798) and CMSA – Hand (MAL-AOU r = 0.692 – 0.801; MAL-QOM r = 0.717 – 0.803), grip strength (MAL-AOU r = 0.698 – 0.716; MAL-QOM r = 0.659 – 0.733) and isometric strength (MAL-AOU r = 0.643 – 0.719; MAL-QOM r = 0.714 – 0.726).

Predictive:
No studies have examined predictive validity of the MAL.

Construct:

Uswatte et al. (2006b) conducted item analysis of the original MAL-30 using item-total correlations, reliability and proportion of missing data (with an a priori cut-off of 20%) in a sample of 222 patients with subacute/chronic stroke and their caregivers. Of the 30 items, 25 items were completed by > 80% of caregivers and 28 items were completed by > 80% of patients; analysis of these 28 items indicated item-total correlations > 0.5 for 92% of items, and reliability coefficients > 0.5 for 89% of items. The remaining 2 items (write on paper: 48% missing data; put makeup/shaving cream on face: 20% missing data) showed lower item-total correlations and reliability coefficients and were dropped accordingly.

van der Lee et al. (2004) examined construct validity of the MAL-14 in a sample of 56 patients with chronic stroke, using Spearman’s correlation coefficient. There was an excellent correlation between the MAL-AOU and MAL-QOM (r = 0.95, p < 0.001).

Uswatte et al. (2005b) examined construct validity of the MAL-14 (QOM scale only) in a sample of 27 patients with chronic stroke by comparison with patient/caregiver MAL-AOU scores, using Pearson correlation coefficients. Correlations were excellent between MAL-QOM change scores from pre-treatment to post-treatment and corresponding change scores in patient MAL-AOU (r = 0.80, p < 0.01), carer MAL-AOU (r = 0.73, p < 0.01) and carer MAL-QOM (r = 0.70, p < 0.01).

Uswatte et al. (2006a) examined construct validity of the MAL-30 in a sample of 169 individuals with subacute/chronic stroke, using Pearson correlation coefficient. There was an excellent correlation between the MAL-AOU and MAL-QOM (r = 0.92, p < 0.001).

Huseyinsinoglu et al. (2011) examined construct validity of the MAL-28 (Turkish version) in a sample of 30 patients with stroke, using Spearman’s correlation coefficient. The correlation between the MAL-AOU and the MAL-QOM was excellent (r = 0.95).

Saliba et al. (2011) examined construct validity of the MAL (Brazilian version) in a sample of 77 individuals with chronic stroke, using Rasch analysis. There was an excellent correlation between the MAL-AOU and the MAL-QOM (r = 0.97, p < 0.0001).

Khan et al. (2013) examined construct validity of the MAL-30 (German version), using Spearman’s rank correlation coefficients. Patients with acute to chronic stroke (n = 42) received inpatient rehabilitation and measures were taken at baseline, discharge from hospital and at 6-month follow-up. There was an excellent correlation between the MAL-AOU and MAL-QOM at all timepoints (r = 0.994, 0.982, 0.980).

Chuang et al. (2017) examined construct validity of the MAL-30 in a sample of 403 patients with chronic stroke with motor recovery of the proximal and distal upper limb at Brunnstrom stage III and higher, using Rasch analysis. Correlation between the MAL-AOU and MAL-QOM was adequate (r = 0.603), indicating that the subscales are not highly correlated and can be perceived as different concepts.

Chuang et al. (2017) examined construct validity of the LF-MAL in a sample of 134 patients with chronic stroke with motor recovery of the proximal and distal upper limb at Brunnstrom stage III and lower, using Rasch analysis. Correlation between the LF-MAL-AOU and LF-MAL-QOM was adequate (r = 0.607), indicating that the subscales are not highly correlated and can be perceived as different concepts.

Convergent/Discriminant:
van der Lee et al. (2004) examined cross-sectional convergent validity of the MAL-14 by comparison with the Action Research Arm Test (ARAT) in a sample of 56 patients with chronic stroke, using Spearman’s correlation coefficient. There were excellent correlations between the MAL-AOU and the ARAT (r = 0.63, p < 0.001) and between the MAL-QOM and the ARAT (r = 0.63, p < 0.001).

Uswatte et al. (2005a) examined convergent validity of the MAL-14 in a sample of 20 patients with chronic stroke by comparison with accelerometry of the affected arm, using Spearman rank correlations. There was an excellent correlation between the MAL-14 and accelerometry (r = 0.74, p < 0.001).

Uswatte et al. (2006a) examined convergent validity of the MAL-30 (QOM scale only) in a sample of 169 patients with subacute/chronic stroke by comparison with accelerometry of the affected arm and the Actual Amount of Use Test (AAUT), using Pearson correlation coefficients. Correlations between the MAL-QOM and accelerometry ratios (ratio summary variable, impaired arm summary variable) were adequate (r = 0.52, r = 0.41 respectively, p < 0.001). The correlation between the MAL-QOM and AAUT was excellent (r = 0.94, p < 0.001).

Uswatte et al. (2006b) examined convergent validity of the MAL-30 in a sample of 222 patients with subacute/chronic stroke and their caregivers by comparison with accelerometry of the affected arm, and the SIS 2.0 – Hand function scale, using Pearson correlation coefficients. Comparison of the MAL with accelerometry ratios showed adequate to excellent correlations for patient scores (MAL-AOU: r = 0.47; MAL-QOM: r = 0.52, p < 0.01), and adequate correlations for caregiver scores (MAL-AOU: r = 0.57; MAL-QOM, r = 0.61, p < 0.01). Comparison of the MAL and SIS – Hand function scores showed excellent correlations for patient scores (MAL-AOU: r = 0.68; MAL-QOM: r = 0.72, p < 0.01), and adequate correlations for caregiver scores (MAL-AOU: r = 0.35, MAL-QOM: r = 0.40, p < 0.01).

Uswatte et al. (2006b) examined divergent validity of the MAL-30 in a sample of 222 patients with subacute/chronic stroke and their caregivers by comparison with accelerometry of the less affected arm, and the SIS 2.0 – Mobility scale, using Pearson correlation coefficients. Comparison of the MAL with accelerometry ratios of the less affected arm showed poor correlations for patient scores (MAL-AOU: r = 0.14; MAL-QOM: r = 0.14, p > 0.05), and poor to adequate correlations for caregiver scores (MAL-AOU: r = 0.25; MAL-QOM, r = 0.23, p < 0.001). Comparison of the MAL and SIS – Mobility scores showed poor correlations for patient scores (MAL-AOU: r = 0.14; MAL-QOM: r = 0.14, p > 0.05), and poor correlations for caregiver scores (MAL-AOU: r = 0.10, MAL-QOM: r = 0.07, p > 0.05).

Hammer and Lindmark (2010) examined cross-sectional convergent validity of the MAL-30 by comparison with the FMA-UE, ARAT, Motor Assessment Scale – Upper Extremity score (MAS-UE), 16-hole peg test (16HPT) and the Grippit ratio of isometric grip strength, using Spearman’s correlation coefficient. Patients with subacute stroke (n = 30) were randomized to receive forced use therapy or standard upper limb rehabilitation, and measures were taken at baseline, post-treatment (2 weeks) and follow-up (3 months). Correlations were significant and adequate with all measures: FMA-UE (r = 0.43-0.52); ARAT (r = 0.31-0.51); MAS-UE (r = 0.41-0.54); 16HPT (r = -0.41 – -0.67); Grippit (r = 0.41-0.53).

Huseyinsinoglu et al. (2011) examined convergent validity of the MAL-28 (Turkish version) by comparison with the WMFT – Performance Time (WMFT-PT) and – Functional Ability (WMFT-FA) scores in a sample of 30 patients with stroke. There were excellent correlations with the WMFT-FA (MAL-AOU: r = 0.63; MAL-QOM: r = 0.63), and adequate negative correlations with the WMFT-PT (MAL-AOU: r = -0.56; MAL-QOM: r = -0.55).

Saliba et al. (2011) examined convergent validity of the MAL (Brazilian version) by comparison with grip strength of the more severely affected upper limb in a sample of 77 individuals with chronic stroke, using Rasch analysis. There were adequate correlations between grip strength and the MAL-AOU (r = 0.51, p < 0.0001) and the MAL-QOM (r = 0.57, p < 0.0001).

Sterr et al. (2014) examined divergent validity of the MAL in a sample of 65 patients with chronic stroke by comparison with the Short Form 36 (SF-36), Stroke Impact Scale (SIS), Hospital Anxiety and Depression Scale (HADS) and Visual Analog Mood Score (VAMS), using regression analysis. Participants received four different Constraint-Induced Movement Therapy (CIMT) treatment protocols that differed in intensity and use of a constraint. Following treatment there was a significant positive association between the MAL-AOU and the SF-36 Physical domain (r = 0.38, p = 0.025) and a trend towards a moderate association with the SIS Total score (r = 0.43, p = 0.061).

Shindo et al. (2015) examined convergent validity of the MAL-14 in a sample of 34 patients with acute/subacute stroke by comparison with the Simple Test for Evaluating Hand Function (STEF), using Spearman’s correlation coefficient. There was a significant and excellent correlation between the assessments (MAL-AOU: r = 0.805; MAL-QOM: r = 0.768).

Simpson, Conroy & Beaver (2015) examined convergent validity of the MAL-28 in a sample of 9 patients with stroke, by comparison with the FMA, Wolf Motor Function Test and Stroke Impact Scale, using Spearman’s correlation coefficient. There were excellent correlations between baseline MAL-AOU and FMA (ρ = 0.6889, p < 0.0132) and MAL-QOM and FMA (ρ = 0.7276, p < 0.0073).

Moreira Silva et al. (2018) examined convergent validity of the MAL-30 in a sample of 66 individuals with chronic stroke by comparison with the FMA-UE, using Spearman’s correlation coefficient. There was a significant and excellent correlation with the FMA-UE (MAL-AOU: r = 0.87; MAL-QOM: r = 0.87).

Chen et al. (2018) examined convergent validity of the MAL in a sample of 82 patients with stroke by comparison with accelerometry of the affected arm, using Pearson’s correlation coefficient. There was an adequate correlation with accelerometry (MAL-AOU: r = 0.47; MAL-QOM: r = 0.57).

Known Group:
Uswatte et al. (2006b) examined known-group validity of the MAL in a sample of 222 patients with subacute/chronic stroke and their caregivers. Correlations between the MAL and accelerometry ratio were stronger among patients with paresis of their dominant arm (MAL-AOU: r = 0.56; MAL-QOM: r = 0.59) than among patients with paresis of the non-dominant arm (MAL-AOU: r = 0.28; MAL-QOM: r = 0.34).

Responsiveness

Taub et al. (1993) reported on effect sizes (ES) of the MAL in a sample of 9 patients with chronic stroke. Participants received two weeks of upper extremity restraint and measures were taken at baseline, post-treatment and follow-up (1 month, 2 years). Effect sizes were large from baseline to 1-month follow-up (2.80) and from baseline to 2-year follow-up (2.95).

Kunkel et al. (1999) reported on ES of the MAL in a sample of 5 patients with chronic stroke. Participants received two weeks of Constraint-Induced Movement Therapy (CIMT) and measures were taken at baseline, post-treatment and follow-up (3 months). Effect sizes were large from baseline to post-treatment (MAL-AOU: 9.57; MAL-QOM: 3.24), and from baseline to 3-month follow-up (MAL-AOU: 7.59; MAL-QOM: 1.99).

Taub et al. (1999) reported on ES of the MAL in a sample of patients with stroke who received CIMT and reported a large effect size for lower-functioning individuals (n = 11, d = 4.0) and higher-functioning individuals (n = 40, d = 3.3). The ES was larger for lower-functioning patients due to lower variability in scores from baseline to post-treatment.

Miltner et al. (1999) reported on ES of the MAL in a sample of 15 patients with chronic stroke. Participants received two weeks of CIMT and measures were taken at baseline, post-treatment and follow-up (4 weeks and 6 months). Effect sizes were large from first contact to post-treatment (MAL-AOU: 2.07; MAL-QOM: 1.33), from first contact to 4 weeks post-treatment (MAL-AOU: 2.98; MAL-QOM: 1.70), and from first contact to 6-month follow-up (MAL-AOU: 2.68; MAL-QOM: 2.14).

van der Lee et al. (1999) reported on ES of the MAL in a sample of 66 patients with chronic stroke. Participants were randomly assigned to receive forced use therapy or bimanual training based on neurodevelopmental techniques for two weeks. A 25-item modified version of the MAL was used. There were no significant between-group differences in MAL-QOM scores following treatment. There was a significant difference in MAL-AOU scores, in favour of forced use therapy. The mean difference in gain was 0.52 points (95% CI, 0.11-0.93). Improvements exceeded the Minimal Clinically Important Difference of 0.50 within both groups. The treatment effect was clinically relevant for patients with hemineglect.

van der Lee et al. (2004) examined responsiveness and longitudinal construct validity of the MAL-14 in a sample of 56 patients with chronic stroke who were randomized to receive CIMT or bimanual training for a 2-week intervention period. Responsiveness was measured by responsiveness ratios (RR). Results showed adequate responsiveness for the MAL-AOU and MAL-QOM (RR = 1.9, 2.0 respectively). Longitudinal validity was measured by comparing MAL change scores with the Action Research Arm Test (ARAT) change scores and a global change rating (GCR), using Spearman’s correlation coefficient. Correlations between change scores on the measures were neither significant nor strong (MAL-AOU vs. ARAT: r = 0.16, p = 0.23; MAL-QOM vs. ARAT: r = 0.16, p = 0.25; MAL-AOU vs. GCR: r = 0.20, p = 0.15; MAL-QOM vs. GCR: r = 0.22, p = 0.10).

Uswatte et al. (2005b) examined responsiveness of the MAL-14 in a sample of 41 patients with chronic stroke who received CIMT or time-matched general fitness rehabilitation, and their caregivers. Responsiveness was measured by responsiveness ratios (RR). Results showed high responsiveness for patient scores (MAL-AOU: 3.2; MAL-QOM: 4.5), and caregiver scores (MAL-AOU: 4.3; MAL-QOM: 3.0).

Uswatte et al. (2005b) examined responsiveness of the MAL-14 in a sample of 27 patients with chronic stroke who received an automated form of constraint-induced movement therapy (AutoCITE) or general fitness rehabilitation. Responsiveness was measured by responsiveness ratios; results showed high responsiveness for the MAL-AOU and MAL-QOM (RR = 3.8, 5.0, respectively).

Hammer and Lindmark (2010) examined responsiveness and longitudinal construct validity of the MAL-30 in a sample of 30 patients with subacute stroke who were randomized to receive forced use therapy or standard upper extremity rehabilitation. Responsiveness was measured according to effect size (ES), standardized response means (SRM) and responsiveness ratios (RR) from baseline to post-treatment (2 weeks), and from baseline to follow-up (3 months). Effect sizes for the MAL-AOU and MAL-QOM were moderate to large from baseline to post-treatment (MAL-AOU: 0.51; MAL-QOM: 0.54) and from baseline to follow-up (MAL-AOU: 1.02; MAL-QOM: 1.17), indicating sensitivity to change. Standardized response means were large from baseline to post-treatment (MAL-AOU: 1.28; MAL-QOM: 1.03), and from baseline to follow-up (MAL-AOU: 1.14; MAL-QOM: 1.19). The greater SRM compared to ES reflects smaller variability in change scores than baseline scores. Responsiveness ratios were large from baseline to post-treatment (MAL-AOU: 1.22; MAL-QOM: 1.23) and from baseline to follow-up (MAL-AOU: 2.44; MAL-QOM: 2.69). Longitudinal construct validity was measured by comparison with the FMA-UE, ARAT, Motor Assessment Scale – Upper Extremity score (MAS-UE), 16-hole peg test (16HPT) and the Grippit ratio of isometric grip strength, using Spearman’s correlation coefficient. Correlations with the MAS-UE were significant and adequate from baseline to follow-up (MAL-AOU r = 0.53, MAL-QOM r = 0.47); and with the FMA-UE from baseline to post-treatment (MAL-AOU r = 0.44, MAL-QOM r = 0.67) and from baseline to follow-up (MAL-AOU r = 0.39, MAL-QOM r = 0.43).
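
The three responsiveness indices reported in this section share the same numerator (mean change) and differ only in the denominator. The Python sketch below illustrates this under common textbook definitions: ES divides by the standard deviation of baseline scores, SRM by the standard deviation of change scores, and RR by the standard deviation of change in a clinically stable group. These definitions are assumptions for illustration; the individual studies may have used variants. All data are hypothetical and are not taken from any study cited here.

```python
from statistics import mean, stdev


def change_scores(baseline, followup):
    """Per-patient change (follow-up minus baseline)."""
    return [f - b for b, f in zip(baseline, followup)]


def effect_size(baseline, followup):
    """ES: mean change divided by the SD of baseline scores."""
    return mean(change_scores(baseline, followup)) / stdev(baseline)


def standardized_response_mean(baseline, followup):
    """SRM: mean change divided by the SD of the change scores."""
    changes = change_scores(baseline, followup)
    return mean(changes) / stdev(changes)


def responsiveness_ratio(baseline, followup, stable_changes):
    """RR (assumed definition): mean change divided by the SD of change
    observed in clinically stable patients."""
    return mean(change_scores(baseline, followup)) / stdev(stable_changes)


# Hypothetical MAL-AOU scores for five patients; not study data.
baseline = [0.5, 1.0, 2.0, 3.0, 3.5]
followup = [1.0, 2.2, 2.8, 4.3, 4.7]
# Hypothetical change scores from stable patients; illustration only.
stable_changes = [-0.2, 0.1, 0.0, 0.3, -0.2]

es = effect_size(baseline, followup)
srm = standardized_response_mean(baseline, followup)
rr = responsiveness_ratio(baseline, followup, stable_changes)
print(f"ES = {es:.2f}, SRM = {srm:.2f}, RR = {rr:.2f}")
```

In this hypothetical sample the change scores vary less than the baseline scores, so the SRM exceeds the ES, which is the pattern Hammer and Lindmark describe.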

Khan & Oesch (2013) examined responsiveness of the German MAL-30 in a sample of 42 patients with acute to chronic stroke, using standardized response mean (SRM). Participants were stratified into two groups according to level of arm and hand function using the Chedoke McMaster Stroke Assessment (CMSA). Measures were taken at baseline, discharge from rehabilitation and 6-month follow-up. Change scores from the lower-function group (CMSA arm and hand score ≤ 6) revealed high responsiveness of the MAL-AOU and MAL-QOM from baseline to discharge (SRM = 0.93, 0.94 respectively) and from baseline to follow-up (SRM = 0.95, 0.98 respectively), but poor responsiveness from discharge to follow-up (SRM = 0.20, 0.42 respectively). Change scores from the high-function group (CMSA arm and hand score > 6) showed high responsiveness of the MAL-AOU and MAL-QOM from baseline to discharge (SRM = 1.43, 1.31 respectively) and from baseline to follow-up (SRM = 1.34, 1.33 respectively), but poor responsiveness from discharge to follow-up (SRM = 0.22, 0.24 respectively). The authors concluded that the MAL is a responsive measure when the intervention period is included in the measured time interval.

Simpson & Eng (2013) conducted a literature review of upper limb assessments commonly used in stroke rehabilitation, including the MAL. In studies that measured outcomes following CIMT, the observed change (i.e. patients’ perceptions of change, effect size) was 1.6-6.2 times larger than measures of functional change such as the ARAT or WMFT. Similarly, assessments which measure perceived function in the individual’s environment require larger percentage changes than laboratory-based performance measures to surpass the measurement error. Minimal Detectable Change for the MAL-AOU ranged from 72.5% to 86.7% (90% and 95% confidence levels).

Taub et al. (2013) reported on effect size (ES) of the Lower Functioning MAL (LF-MAL) in a sample of 6 individuals with chronic stroke who used orthotics/splints and adaptive equipment outside the laboratory over 6 sessions (Phase A), then received mCIMT + neurodevelopmental therapy for 15 consecutive weekdays with continued use of assistive devices (Phase B). Effect sizes were calculated from (i) baseline to pre-mCIMT; (ii) pre-mCIMT to post-mCIMT; and (iii) baseline to post-mCIMT and were large at all timepoints (ES = 2.6, 2.1, 3.0, respectively, p < 0.002).

Sterr et al. (2014) reported on treatment effect in a sample of 65 patients with chronic stroke. Participants received four different CIMT treatment protocols that differed in intensity and use of a constraint. Whole-group analysis showed a significant and large treatment effect from baseline to post-treatment (MAL-AOU: d = 1.19; MAL-QOM: d = 1.38); the treatment effect from post-treatment to 6-month follow-up was small but significant for the MAL-AOU only (d = 0.4). Treatment effect was not significant at 12-month follow-up. There was a significant positive association between training intensity and improvement in MAL-AOU scores.

Sensitivity & Specificity:
Chen et al. (2012) examined minimal detectable change (MDC) of the MAL. This study used data from the EXCITE trial, in which 222 patients with subacute/chronic stroke were randomized to receive constraint-induced movement therapy (CIMT) for 2 weeks (n = 106) or no treatment (n = 116). MDC at the 90% confidence level was calculated from pre-post test data from the control group. The MDC of the MAL-AOU was 16.8% (Standard Error of Measurement [SEM] 7.2%), indicating that a change in amount of use of the affected upper limb greater than 16.8% is required to be 90% certain that the change is not due to measurement error. The MDC (90% confidence) for the MAL-QOM was 15.4% (SEM 6.6%), indicating higher sensitivity than the MAL-AOU scale. After treatment, the CIMT group showed an 84.6% increase in MAL-AOU scores and a 72.2% increase in MAL-QOM scores. Both MAL scores exceeded the MDC and were sensitive to change in the context of this intervention.
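
The relationship between the SEM and MDC values above can be checked with the conventional formula MDC = z × √2 × SEM, taking SEM as the standard error of measurement and z as the two-sided normal quantile for the chosen confidence level. The source does not state which formula Chen et al. used, so the Python sketch below is an assumption; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def minimal_detectable_change(sem, confidence=0.90):
    """Conventional MDC: z * sqrt(2) * SEM, with z the two-sided normal
    quantile for the given confidence level (about 1.645 for 90%)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(2) * sem


# SEM values (in %) as reported for the EXCITE control group.
mdc_aou = minimal_detectable_change(7.2)
mdc_qom = minimal_detectable_change(6.6)
print(f"MAL-AOU MDC90 ~ {mdc_aou:.2f}%, MAL-QOM MDC90 ~ {mdc_qom:.2f}%")
```

With the published, rounded SEM values the results fall within about 0.1 percentage points of the reported MDCs of 16.8% and 15.4%; small discrepancies are expected because the authors presumably worked from unrounded SEMs.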

Simpson, Conroy & Bever (2015) examined sensitivity of the MAL-28 in a sample of 9 patients with stroke, by comparison with the Fugl-Meyer Assessment, the Wolf Motor Function Test and the Stroke Impact Scale. Measures were taken at baseline, post-treatment and follow-up, and correlations were analysed using Spearman’s correlation coefficient. Changes in MAL-AOU scores were sensitive to changes in SIS physical domain scores (ρ = 0.7342, p < 0.0243). Changes in MAL-QOM scores were sensitive to changes in WMFT Functional Ability scores (ρ = 0.6245, p < 0.0722).

References

  • Ashford, S., Slade, M., Malaprade, F., & Turner-Stokes, L. (2008). Evaluation of functional outcome measures for the hemiparetic upper limb: a systematic review. Journal of Rehabilitation Medicine, 40, 787-95.
    DOI: 10.2340/16501977-0276
  • Cakar, E., Dincer, U., Zeki, M., Kilac, H., Tongur, N., & Taub, E. (2010). Turkish adaptation of Motor Activity Log-28. Turkish Journal of Physical Medicine and Rehabilitation, 56, 1-5.
    http://www.ftrdergisi.com/eng/arsiv.asp
  • Chen, H.L., Lin, K.C., Hsieh, Y.W., Wu, C.Y., Liing, R.J., & Chen, C.L. (2018). A study of predictive validity, responsiveness, and minimal clinically important difference of arm accelerometer in real-world activity of patients with chronic stroke. Clinical Rehabilitation, 32(1), 75-83.
    DOI: 10.1177/0269215517712042
  • Chen, S., Wolf, S.L., Zhang, Q., Thompson, P.A., & Winstein, C.J. (2012). Minimal detectable change of the Actual Amount of Use Test and the Motor Activity Log: the EXCITE trial. Neurorehabilitation and Neural Repair, 26(5), 507-14.
    DOI: 10.1177/1545968311425048
  • Chuang, I.-C., Lin, K.-C., Wu, C.-Y., Hsieh, Y.-W., Liu, C.-T., & Chen, C.-L. (2017). Using Rasch analysis to validate the Motor Activity Log and the Lower Functioning Motor Activity Log in patients with stroke. Physical Therapy, 97(10), 1030-40.
    DOI: 10.1093/ptj/pzx071
  • Dettmers, C., Teske, U., Hamzei, F., Uswatte, G., Taub, E., & Weiller, C. (2005). Distributed form of constraint-induced movement therapy improves functional outcome and quality of life after stroke. Archives of Physical Medicine and Rehabilitation, 86, 204-9.
    DOI: 10.1016/j.apmr.2004.05.007
  • Hammer, A.M. & Lindmark, B. (2010). Responsiveness and validity of the Motor Activity Log in patients during the subacute phase after stroke. Disability and Rehabilitation, 32(14), 1184-93.
    DOI: 10.3109/09638280903437253
  • Huseyinsinoglu, B.E., Ozdincler, A.R., Ogul, O.E., & Krespi, Y. (2011). Reliability and validity of Turkish version of Motor Activity Log-28. Turkish Journal of Neurology, 17(2), 83-9.
  • Johnson, A., Judkins, L., Morris, D.M., Uswatte, G., & Taub, E. (2003). The validity and reliability of the 45-item Upper Extremity Motor Activity Log. Journal of Neurologic Physical Therapy, 27(4), 172.
  • Khan, C.M. & Oesch, P. (2013). Validity and responsiveness of the German version of the Motor Activity Log for the assessment of self-perceived arm use in hemiplegia after stroke. NeuroRehabilitation, 33, 413-21.
    DOI: 10.3233/NRE-130972
  • Kunkel, A., Kopp, B., Muller, G., Villringer, K., Villringer, A., Taub, E., & Flor, H. (1999). Constraint-induced movement therapy for motor recovery in chronic stroke patients. Archives of Physical Medicine & Rehabilitation, 80, 624-8.
    PMID: 10378486.
  • Li, K.-Y., Lin, K.-C., Wang, T.-N., Wu, C.-Y., Huang, Y.-H., & Ouyang, P. (2012). Ability of three motor measures to predict functional outcomes reported by stroke patients after rehabilitation. NeuroRehabilitation, 30, 267-75.
    DOI: 10.3233/NRE-2012-0755
  • Lin, K.-C., Chuang, L.-L., Wu, C.-Y., Hsieh, Y.-W., & Chang, W.-Y. (2010a). Responsiveness and validity of three dexterous function measures in stroke rehabilitation. Journal of Rehabilitation Research & Development, 47(6), 563-72.
    DOI: 10.1682/JRRD.2009.09.0155
  • Lin, K.-C., Fu, T., Wu, C.-Y., Hsieh, Y.-W., Chen, C.-L., & Lee, P.-C. (2010b). Psychometric comparisons of the Stroke Impact Scale 3.0 and Stroke-Specific Quality of Life Scale. Quality of Life Research, 19(3), 435-43.
    DOI: 10.1007/s11136-010-9597-5
  • Miltner, W.H.R., Bauder, H., Sommer, M., Dettmers, C., & Taub, E. (1999). Effects of constraint-induced movement therapy on patients with chronic motor deficits after stroke: a replication. Stroke, 30(3), 586-92.
    PMID: 10066856
  • Page, S. (2003). Forced use after TBI: promoting plasticity and function through practice. Brain Injury, 17(8), 675-84.
    DOI: 10.1080/0269905031000107160
  • Pereira, N.D., Ovando, A.C., Michaelsen, S.M., Anjos, S.M.D., Lima, R.C.M., Nascimento, L.R., & Teixeira-Salmela, L.F. (2012). Motor Activity Log-Brazil: reliability and relationships with motor impairments in individuals with chronic stroke. Arquivos de Neuro-Psiquiatria, 70(3), 196-201.
  • Saliba, V.A., Magalhães, L.C., Faria, C.D., Laurentino, G.E.C., Cassiano, J.G., Teixeira-Salmela, L.F. (2011). [Cross-cultural adaptation and analysis of the psychometric properties of the Brazilian version of the Motor Activity Log]. Revista Panamericana de Salud Pública, 30(3), 262-71.
    https://www.researchgate.net/publication/266487017
  • Santisteban, L., Teremetz, M., Bleton, J.-P., Baron, J.-C., Maier, M.A., & Lindberg, P.G. (2016). Upper limb outcome measures used in stroke rehabilitation studies: a systematic literature review. Plos One, May 6.
    DOI: 10.1371/journal.pone.0154792
  • Shindo, K., Oba, H., Hara, J., Ito, M., Hotta, F. & Liu, M. (2015). Psychometric properties of the simple test for evaluating hand function in patients with stroke. Brain Injury, 29(6), 772-6.
    DOI: 10.3109/02699052.2015.1004740
  • Silva, E.S.M., Pereira, N.D., Gianlorenço, A.C.L., & Camargo, P.R. (2018). The evaluation of non-use of the upper limb in chronic hemiparesis is influenced by the level of motor impairment and difficulty of the activities – proposal of a new version of the Motor Activity Log. Physiotherapy Theory and Practice.
    DOI: 10.1080/09593985.2018.1460430
  • Simpson, A., Conroy, S., & Bever, C. (2015). Preliminary assessment of the Motor Activity Log-28 in patients with chronic stroke. Neurology, 84(14 Supplement), P5.174.
  • Simpson, L.A. & Eng, J.J. (2013). Functional recovery following stroke: capturing changes in upper extremity function. Neurorehabilitation and Neural Repair, 27(3), 240-50.
    DOI: 10.1177/1545968312461719
  • Sterr, A., O’Neill, D., Dean, P.J.A., & Herron, K.A. (2014). CI therapy is beneficial to patients with chronic low-functioning hemiparesis after stroke. Frontiers in Neurology, 5, 204.
    DOI: 10.3389/fneur.2014.00204
  • Taub, E., Miller, N.E., Novack, T.A., Cook, E.W., Fleming, W.C., Nepomuceno, C.S., Connel, J.S., & Crago, J.E. (1993). Technique to improve chronic motor deficit after stroke. Archives of Physical Medicine and Rehabilitation, 74(4), 347-54.
    PMID: 8466415
  • Taub, E. & Uswatte, G. (2000). Constraint-induced movement therapy and massed practice. Stroke, 31(4), 986-8.
    PMID: 10754013.
  • Taub, E., Uswatte, G., Bowman, M.H., Mark, V.W., Delgado, A., Bryson, C., Morris, D., & Bishop-McKay, S. (2013). Constraint-induced movement therapy combined with conventional neurorehabilitation techniques in chronic stroke patients with plegic hands: a case series. Archives of Physical Medicine and Rehabilitation, 94, 86-94.
    DOI: 10.1016/j.apmr.2012.07.029
  • Taub, E., Uswatte, G., & Pidikiti, R. (1999). Constraint-induced movement therapy: a new family of techniques with broad application to physical rehabilitation – a clinical review. Journal of Rehabilitation Research & Development, 36(3), 237-51.
    PMID: 10659807
  • Uswatte, G. & Taub, E. (2005). Implications of the learned nonuse formulation for measuring rehabilitation outcomes: lessons from constraint-induced movement therapy. Rehabilitation Psychology, 50(1), 34-42.
    DOI: 10.1037/0090-5550.50.1.34
  • Uswatte, G., Giuliani, C., Winstein, C., Zeringue, A., Hobbs, L., & Wolf, S.L. (2006a). Validity of accelerometry for monitoring real-world arm activity in patients with subacute stroke: evidence from the extremity constraint-induced therapy evaluation trial. Archives of Physical Medicine and Rehabilitation, 87, 1340-5.
    DOI: 10.1016/j.apmr.2006.06.006
  • Uswatte, G., Taub, E., Morris, D., Light, K., & Thompson, P.A. (2006b). The Motor Activity Log-28: assessing daily use of the hemiparetic arm after stroke. Neurology, 67(7), 1189-94.
    https://www.ncbi.nlm.nih.gov/pubmed/17030751
  • Uswatte, G., Foo, W.L., Olmstead, H., Lopez, K., Holand, A., & Simms, L.B. (2005a). Ambulatory monitoring of arm movement using accelerometry: an objective measure of upper-extremity rehabilitation in persons with chronic stroke. Archives of Physical Medicine and Rehabilitation, 86, 1498-1501.
    PMID: 16003690
  • Uswatte, G., Taub, E., Morris, D., Vignolo, M., & McCulloch, K. (2005b). Reliability and validity of the upper-extremity Motor Activity Log-14 for measuring real-world arm use. Stroke, 36(11), 2493-6.
    DOI: 10.1161/01.STR.0000185928.90848.2e
  • van der Lee, J.H., Beckerman, H., Knol, D.L., de Vet, H.C.W., & Bouter, L.M. (2004). Clinimetric properties of the Motor Activity Log for the assessment of arm use in hemiparetic patients. Stroke, 35, 1410-14.
    DOI: 10.1161/01.STR.0000126900.24964.7e
  • van der Lee, J.H., Wagenaar, R.C., Lankhorst, G.J., Vogelaar, T.W., Deville, W.L., & Bouter, L.M. (1999). Forced use of the upper extremity in chronic stroke patients: results from a single-blind randomized clinical trial. Stroke, 30, 2369-75.
  • Wu, C.-Y., Chuang, L.-L., Lin, K.-C., & Horng, Y.-S. (2011). Responsiveness and validity of two outcome measures of instrumental activities of daily living in stroke survivors receiving rehabilitative therapies. Clinical Rehabilitation, 25, 175-83.
    DOI: 10.1177/0269215510385482

See the measure

How to obtain the Motor Activity Log?

Click here to see the Motor Activity Log manual.

Motor Evaluation Scale for Upper Extremity in Stroke Patients (MESUPES)

Evidence Reviewed as of before: 08-09-2015
Author(s)*: Annabel McDermott, OT
Editor(s): Annie Rochette, PhD OT
Expert Reviewer: Prof. Ann Van de Winckel, PhD, MSc, PT
Content consistency: Gabriel Plumier

Purpose

The MESUPES measures quality of movement performance of the hemiparetic arm and hand in stroke patients. Authors of the assessment are Perfetti & Dal Pezzo (original version of the scale) and Ann Van de Winckel, PhD, MSc, PT (final version of the scale). The original publication of the final version of the scale is by Van de Winckel et al. (2006).

In-Depth Review

Purpose of the measure

The MESUPES measures quality of movement performance of the hemiparetic arm and hand in stroke patients.

Available versions

The original version of the MESUPES comprised 22 items within three categories of arm function (10 items), hand function (9 items) and functional tasks (3 items).

Analysis of the measure with Principal Component Analysis and Rasch analysis resulted in a final 17-item version with two categories of arm function (8 items) and hand function (“range of motion” 6 items; and “orientation during functional tasks” 3 items) (Van de Winckel et al., 2006).

Features of the measure

Items:

The original MESUPES is comprised of 22 items in three subscales:

  1. Arm function: 10 items
  2. Hand function: 9 items
  3. Functional tasks: 3 items

The final version of the MESUPES is comprised of 17 items in two subscales:

  1. MESUPES–Arm function: 8 items with 6 response categories (0-5)
  2. MESUPES–Hand function: 9 items with 3 response categories (0-2).

During the MESUPES–Arm subset, patients are required to perform specific movements of the upper limb in three consecutive phases:

  1. The task is performed passively
  2. The therapist assists the patient during the movement
  3. The patient performs the task by him/herself.

During the MESUPES–Hand subsets, patients are instructed to perform specific movements of the hand and fingers by themselves.

Scoring:

As the MESUPES adopts an ordinal scale, Rasch analysis has been performed to translate ordinal data into interval measures (logit scores) (Van de Winckel et al., 2006).

Online scoring will soon be available to enable users to input the ordinal scores and retrieve logit scores immediately (personal correspondence, Van de Winckel, 2015).

Subset 1: Arm function

The MESUPES–Arm subset evaluates ‘normal’ movement of the hemiparetic limb, which can be judged by comparison with movement of the patient’s unaffected arm. Only qualitatively ‘normal’ movements of the arm are scored.

The tasks are performed in three phases. The number of phases evaluated depends on the level of ability the patient has, to perform the movement correctly.

Testing phases and points achieved:

  1. The therapist moves the patient’s arm and hand and evaluates muscle tone first.
     • No adequate adaptation of tone to movement: 0 points
     • Adequate adaptation of tone (normal tone) to at least part of the movement: 1 point
  2. If the patient exhibits normal tone, the patient participates in the movement and the therapist evaluates muscle contractions.
     • The patient demonstrates functionally and qualitatively correct muscle contraction in at least part of the movement: 2 points
  3. If the patient exhibits normal muscle contraction, the patient performs the movement independently and the therapist assesses range of movement. A score is given for the range of motion that the patient can perform with good quality of motion.
     • Part of the movement is performed normally: 3 points
     • Total range of normal movement is done slowly or with great effort: 4 points
     • The patient demonstrates normal movement performance: 5 points

The patient is allowed to repeat test items with a maximum of three attempts; the patient is awarded the highest score achieved. See the measure for more scoring information.

Subset 2: Hand function (Range of Motion)

Performance of movement and measurement of range of motion is not compared with the unaffected hand for this subset. Only qualitatively normal movements of the hand and fingers are scored.

Testing procedure and points achieved (0-2 points):

The patient performs the instructed movement actively and the therapist assesses range of movement between 0-2 cm qualitatively and quantitatively.

  • No movement: 0 points
  • Movement amplitude < 2 cm: 1 point
  • Movement amplitude ≥ 2 cm: 2 points

Subset 3: Hand function (Orientation during functional tasks)

Quality of movement is not compared with the unaffected hand for this subset.

Testing procedure and points achieved (0-2 points):

The patient manipulates materials as instructed and the therapist assesses whether the patient is able to orient the wrist and fingers to the object throughout the movement in a normal way.

  • No movement, or movement with abnormal orientation of fingers and wrist towards the object: 0 points
  • Movement with normal orientation of fingers or wrist towards the object: 1 point
  • Whole movement correct: 2 points

The maximum achievable score is 58 (MESUPES-Arm maximum score is 40; MESUPES-Hand maximum score is 18). The patient is awarded one score for each task, and the highest score is retained. A score of 0 is awarded when the patient demonstrates inadequate tone, abnormal muscle contractions, or synergic (flexor/extensor) or mass movement patterns (Appendix 2, Instructions, Van de Winckel et al., 2006).
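
The arithmetic behind these totals can be sketched in Python as follows. This is only an illustration of the raw ordinal scoring described above (8 arm items scored 0-5, 9 hand items scored 0-2, best of up to three attempts retained); it does not perform the Rasch conversion to logits, and all function names and patient data are hypothetical. The amplitude mapping covers only the quantitative part of the range-of-motion criterion; the therapist's judgement of movement quality is not modelled.

```python
ARM_ITEMS, ARM_MAX = 8, 5    # MESUPES-Arm: 8 items scored 0-5
HAND_ITEMS, HAND_MAX = 9, 2  # MESUPES-Hand: 9 items scored 0-2


def hand_rom_score(amplitude_cm):
    """Range-of-motion hand item: 0 = no movement, 1 = < 2 cm, 2 = >= 2 cm."""
    if amplitude_cm <= 0:
        return 0
    return 1 if amplitude_cm < 2 else 2


def best_attempt(attempts, max_score):
    """Highest score across one to three attempts at a single item."""
    if not 1 <= len(attempts) <= 3:
        raise ValueError("each item allows one to three attempts")
    if any(not 0 <= score <= max_score for score in attempts):
        raise ValueError(f"scores must lie between 0 and {max_score}")
    return max(attempts)


def mesupes_totals(arm_attempts, hand_attempts):
    """Return (arm total /40, hand total /18, overall total /58)."""
    if len(arm_attempts) != ARM_ITEMS or len(hand_attempts) != HAND_ITEMS:
        raise ValueError("expected 8 arm items and 9 hand items")
    arm = sum(best_attempt(a, ARM_MAX) for a in arm_attempts)
    hand = sum(best_attempt(h, HAND_MAX) for h in hand_attempts)
    return arm, hand, arm + hand


# Hypothetical patient; illustration only.
arm_attempts = [[3, 4], [5], [2, 2, 3], [1], [4, 5], [0], [3], [2, 4, 4]]
rom_items = [[hand_rom_score(cm)] for cm in (0, 1.0, 2.5, 3.0, 0.5, 2.0)]
orientation_items = [[1], [2], [0, 1]]
hand_attempts = rom_items + orientation_items

print(mesupes_totals(arm_attempts, hand_attempts))
```

A patient scoring the maximum on every item would obtain 8 × 5 = 40 on the arm subtest and 9 × 2 = 18 on the hand subtest, giving the overall maximum of 58.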

What to consider before beginning:

The first four items are performed in supine; all other items are performed in a sitting position with hips and knees at 90 degrees and elbows resting on the table. The patient can be provided support to maintain a sitting position if required. The patient cannot be assessed (and therefore cannot be awarded points) on a task if he/she is not able to sit in an upright position for that task. The therapist can reposition the patient’s upper extremity before beginning each new task, and should wait until the tone is normalized before starting a new task. If the patient is not able to achieve a relaxed starting position, he/she is awarded a score of 0 for the item.

The patient must be given clear instructions using the following steps:

  1. The therapist explains the task verbally and demonstrates the movement
  2. The patient is asked to perform the task with the non-affected side first to ensure he/she understands the demands of the task.

Time:

It takes approximately 10 minutes to administer the evaluation, ranging from about 5 min for patients with very poor or very good motor performance to about 15 min for patients with more severe hypertonia.

Training requirements:

Instructions are given in Appendix 2 (Van de Winckel et al., 2006) and are available online. These instructions should suffice for trained clinicians (physical therapists, occupational therapists, etc.).

For the original evaluation, seven raters were trained for an hour to familiarize them with the assessment protocol (Van de Winckel et al., 2006). In Johansson & Hager’s study (2012), raters underwent a 2h training session.

An instructional video will soon be made available online. In the meantime, the developer of the MESUPES (Prof. Ann Van de Winckel, avandewi@umn.edu) can be contacted to address questions concerning the use of the MESUPES.

Equipment:

  • Plinth or mat
  • Desk and chair, positioned so that the patient is sitting with hip and knees in 90 degrees flexion
  • Wooden or plastic block marked with 1cm and 2cm to measure range of movement during hand tasks
  • One larger plastic bottle (cylinder, diameter 6 cm, like a 20 fl oz or 591 ml soda or water bottle)
  • One smaller plastic bottle (cylinder, diameter 2.5cm, height 8cm, like a round correction fluid bottle, as shown in the figure)
  • Dice (1.5 x 1.5 cm)

Client suitability

Differential item functioning was performed with Rasch analysis to test the stability of item hierarchy (from easy to difficult items) on several variables.

There is no differential item functioning across subgroups of gender, age (< 60 / ≥ 60 years), time since stroke (< 3 months / ≥ 3 months), country of residence, side of lesion and type of stroke (hemorrhagic, ischemic) (Van de Winckel et al., 2006). This means that the hierarchy of items (from easy to difficult) is maintained across stroke patient groups defined by these variables.

Can be used with:

  • Individuals with stroke

Should not be used with:

  • The measure is intended for use with adult patients with stroke; there is insufficient evidence regarding psychometric properties of the tool with other populations, including a pediatric population.

In what languages is the measure available?

  • Catalan (available online, Van de Winckel A, 2015)
  • Dutch (Flemish) (available online, Van de Winckel, A., 2015)
  • English (available online, Van de Winckel et al., 2006)
  • French (available online, Van de Winckel A, 2015)
  • German (available online, Van Bellingen, T., Van de Winckel, A., et al. 2009. Chapter 1: Assessment in Neurorehabilitation. In Neurology (2nd ed.) (192-201). Huber.)
  • Italian (available online, Van de Winckel A, 2015) (Perfetti & Dal Pezzo, original version)
  • Portuguese (available online, Van de Winckel A, 2015)
  • Spanish (available online, Van de Winckel A, 2015)
  • Swedish (available online, Johansson & Hager, 2012)

Summary

What does the tool measure? The MESUPES measures quality of movement performance of the hemiparetic arm and hand in patients with stroke.
What types of clients can the tool be used for? The MESUPES was developed for use with adults with stroke.
Is this a screening or assessment tool? Assessment tool
Time to administer: 10 minutes (range 5-15 min)
ICF Domain:
  • Body function/structure
  • Activity
Versions: Final version (Van de Winckel et al., 2006) = 17 items (total score /58; MESUPES-Arm score /40; MESUPES-Hand score /18)
Languages

Available online on StrokEngine:

  • Catalan
  • Dutch (Flemish)
  • English
  • French
  • German
  • Italian
  • Portuguese
  • Spanish
  • Swedish
Measurement Properties

Reliability

Internal consistency:
One study has reported on the internal consistency of the MESUPES using Principal Component Analysis and Rasch analysis. Results showed high person separation indices and unidimensionality within subtests.

Test-retest:
Two studies have reported on the test-retest reliability of the MESUPES in patients with subacute to chronic stroke and reported good to very good agreement over 24-48 hours.

Intra-rater:
No studies have reported on the intra-rater reliability of the MESUPES.

Inter-rater:
Two studies have reported on the inter-rater reliability of the MESUPES in patients with subacute to chronic stroke and reported good to very good agreement between raters for subtests; moderate to very high item reliability; and sufficient absolute reliability of the total score.

Validity

Content:
One study investigated validity of the 17-item MESUPES and reported unidimensionality of the arm and hand scales.

Criterion:
Concurrent:
One study examined concurrent validity of the MESUPES and reported high correlations with the Modified Motor Assessment Scale (MMAS).

Predictive:
No studies have reported on predictive validity of the MESUPES.

Construct:
Convergent/Discriminant:
No studies have reported on convergent/discriminant validity of the MESUPES.

Known Groups:
No studies have reported on known group validity of the MESUPES.

Floor/Ceiling Effects: No studies have reported on the floor/ceiling effects of the MESUPES.
Does the tool detect change in patients?
  • No studies have reported on the sensitivity or specificity of the MESUPES.
  • One study reported MDC scores of 8, 7 and 5 (95%, 90% and 80% CI, respectively).
Acceptability: Administration of the MESUPES is easy and fast. The measure is inexpensive and requires minimal standard equipment.
Feasibility: The MESUPES requires no specialized training to administer. However, the MESUPES should only be administered by clinicians with knowledge of stroke and clinical assessment of tone, muscle contraction and movement.
How to obtain the tool? See the measure

Psychometric Properties

Overview

A literature search was conducted to identify all relevant publications on the psychometric properties of the MESUPES. Two English studies were identified.

Floor and ceiling effect

No studies have reported on the floor or ceiling effects of the MESUPES.

Van de Winckel (personal correspondence, 2015) noted that in the study by Van de Winckel et al. (2006), in which 396 patients with low to high motor performance following stroke were assessed using the MESUPES, less than 5% of patients achieved a score of 0 on the arm items and less than 20% of participants achieved the maximum score. Approximately 42% of participants achieved a score of 0 on the hand items and less than 5% of patients achieved a maximum score on the hand items.

Reliability

Internal consistency:
Van de Winckel et al. (2006) examined internal consistency of the MESUPES in a sample of patients with stroke using Principal Component Analysis and Rasch analysis. Rasch analysis was used to determine ‘item-trait interaction’, which shows the degree of invariance across the intended dimension, and ‘person separation index’. Internal consistency was obtained when the MESUPES was divided into the MESUPES-Arm (8 items) and MESUPES-Hand (9 items) subtests. Rasch analysis and fit statistics showed that both subtests adhered to unidimensional characteristics, whereby all items in the subtests pertain to the same construct. The person separation index was 0.99 for the MESUPES-Arm and 0.97 for the MESUPES-Hand, indicating very high internal consistency.

Test-retest:
See inter-rater reliability below for results also pertaining to test-retest reliability.

Intra-rater:
No studies have reported on the intra-rater reliability of the MESUPES.

Inter-rater:
Van de Winckel et al. (2006) investigated inter-rater reliability of the MESUPES in a sample of 56 patients with subacute to chronic stroke. Assessments were conducted by 2 assessors over 24 hours. Inter-rater reliability, calculated using intra-class correlation coefficients (ICCs), was excellent for the arm function total score (ICC=0.95, 95% CI 0.91-0.97) and hand function total score (ICC=0.97, 95% CI 0.95-0.98). Assessment of inter-rater reliability by weighted percentage agreement and weighted kappa confirmed item reliability for the arm function subtest (weighted kappa coefficient = 0.62-0.79; weighted percentage agreement 85.71-98.21); scores were not derived for hand function items as more than 50% of the sample scored 0.

Johansson & Hager (2012) investigated inter-rater reliability of the MESUPES in a sample of 42 patients with subacute to chronic stroke. Assessments were conducted by 2 therapists within 48 hours. Inter-rater reliability, calculated by percentage agreement using linear-weighted kappa analysis, revealed good to very good agreement between raters (kappa range 0.63-0.96). Relative and absolute reliability were measured using intra-class correlation coefficients (ICCs) and standard error of measurement (SEM): item reliability was moderate to very high (ICC=0.63-0.96); reliability of subscores and the total score was very high (ICC=0.98, 95% CI 0.96-0.99); and the total score demonstrated sufficient absolute reliability (SEM=2.68).

Validity

Content:

The original version of the MESUPES developed by Perfetti & Dal Pezzo comprised 22 items across three categories of (i) arm function (10 items); (ii) hand function (9 items); and (iii) functional tasks (3 items).

Van de Winckel et al. (2006) investigated validity and unidimensionality of the MESUPES in a sample of 396 patients with subacute to chronic stroke. Principal Component Analysis (PCA) of the original 22-item version revealed two dimensions: arm function and hand function. Rasch analysis of these two separate scales identified misfit among five items (2 arm items and 3 hand items, respectively). Following removal of these items, subsequent Rasch analysis of the remaining 17 items and fit statistics confirmed unidimensionality of both arm and hand scales:

Scale | Person fit | Item fit | Person separation index
Arm function | -0.51±1.19 | -0.65±1.07 | 0.99
Hand function | -0.12±0.71 | 0.15±1.21 | 0.97

Test items followed an order of increasing difficulty with no reversed thresholds and no differential item functioning (DIF) according to gender, age (<60, ≥60), side of hemiparesis, time since stroke (< 3 months, ≥ 3 months), type of stroke or country (Van de Winckel et al., 2006).

Criterion:

Concurrent:
Johansson & Hager (2012) investigated concurrent validity of the MESUPES in a sample of 42 patients with subacute to chronic stroke by comparison with the Modified Motor Assessment Scale (MMAS), using Spearman’s rho. Correlations were high between the MESUPES total scores and the MMAS (r=0.87); MESUPES arm items and MMAS (r=0.84); and MESUPES hand items and MMAS (r=0.80).

Predictive:
No studies have reported on the predictive validity of the MESUPES.

Construct:

Convergent/Discriminant:
No studies have reported on convergent/discriminant validity of the MESUPES.

Known Group:
No studies have reported on the known group validity of the MESUPES.

Responsiveness

Johansson & Hager (2012) assessed minimal detectable change (MDC) of the MESUPES with a sample of 42 patients with subacute to chronic stroke. Patients were assessed at two time points 48 hours apart. The authors reported change scores of 8, 7 and 5 (95%, 90% and 80% confidence intervals, respectively) were required for certainty of true change.
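
The reported change scores are consistent with the commonly used formula MDC = z × SEM × √2, applied to the SEM of 2.68 reported for the total score and rounded up to the next whole point. The source does not state which formula the authors used, so the following Python sketch is an assumption-based check rather than a description of their method; the function name `mdc` is illustrative only.

```python
import math
from statistics import NormalDist


def mdc(sem: float, confidence: float) -> float:
    """Minimal detectable change for a two-sided confidence level.

    Assumes the conventional formula MDC = z * SEM * sqrt(2); the
    source does not state which formula the authors applied.
    """
    # Two-sided z value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sem * math.sqrt(2)


SEM_TOTAL = 2.68  # SEM of the MESUPES total score (Johansson & Hager, 2012)

for level in (0.95, 0.90, 0.80):
    value = mdc(SEM_TOTAL, level)
    # Round up: a true change must reach the next whole point on the scale.
    print(f"{level:.0%} confidence: {value:.2f} -> {math.ceil(value)} points")
```

Under these assumptions the rounded-up values are 8, 7 and 5 points, matching the change scores reported above.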

Sensitivity & Specificity:
No studies have reported on sensitivity/specificity of the MESUPES.

References

  • Johansson, G.M. & Hager, C.K. (2012). Measurement properties of the Motor Evaluation Scale for Upper Extremity in Stroke Patients (MESUPES). Disability & Rehabilitation, 34(4):288-94. DOI: 10.3109/09638288.2011.606343
  • Van de Winckel, A., Feys, H., van der Knaap, S., Messerli, R., Baronti, F., Lehmann, R., Van Hemelrijk, B., Pante, F., Perfetti, C., & De Weerdt, W. (2006). Can quality of movement be measured? Rasch analysis and inter-rater reliability of the Motor Evaluation Scale for Upper Extremity in Stroke Patients (MESUPES). Clinical Rehabilitation, 20, 871-84.

Nine Hole Peg Test (NHPT)

Evidence Reviewed as of before: 09-06-2011
Author(s)*: Sabrina Figueiredo, BSc
Editor(s): Lisa Zeltzer, MSc OT; Nicol Korner-Bitensky, PhD OT; Elissa Sitcoff, BA BSc

Purpose

The Nine Hole Peg Test (NHPT) was developed to measure finger dexterity, also known as fine manual dexterity. It can be used with a wide range of populations, including clients with stroke. Additionally, the NHPT is a relatively inexpensive test and can be administered quickly.

In-Depth Review

Purpose of the measure

The Nine Hole Peg Test (NHPT) was developed to measure finger dexterity, also known as fine manual dexterity. It can be used with a wide range of populations, including clients with stroke. Additionally, the NHPT is a relatively inexpensive test and can be administered quickly.

The NHPT should be used in association with other upper extremity performance tests, in order to estimate upper limb function with more accuracy.

Available versions

The NHPT was first introduced by Kellor, Frost, Silberberg, Iversen, and Cummings in 1971. In 1985, norms for the NHPT in healthy individuals were established by Mathiowetz, Weber, Kashman, and Volland.

Features of the measure

Items:

The NHPT is composed of a square board with 9 pegs. At one end of the board are holes for the pegs to fit into, and at the other end is a shallow round dish to store the pegs. The NHPT is administered by asking the client to take the pegs from the container, one by one, and place them into the holes on the board as quickly as possible. Clients must then remove the pegs from the holes, one by one, and return them to the container. In order to practice and register baseline scores, the test should begin with the unaffected upper limb. The board should be placed at the client’s midline, with the container holding the pegs oriented towards the hand being tested. Only the hand being evaluated should perform the test. The hand not being evaluated is permitted to hold the edge of the board in order to provide stability (Mathiowetz et al., 1985; Sommerfeld, Eek, Svensson, Holmqvist, & Arbin, 2004).

Scoring:

Clients are scored based on the time taken to complete the test activity, recorded in seconds. The stopwatch should be started at the moment the participant touches the first peg and stopped at the moment the last peg hits the container (Grice, Vogel, Le, Mitchell, Muniz, & Vollmer, 2003; Mathiowetz et al., 1985).

Mathiowetz et al. (1985) reported that on average, healthy male adults complete the NHPT in 19.0 seconds (SD 3.2) with the right hand, and in 20.6 seconds (SD 3.9) with the left hand. For healthy female adults, the NHPT was completed in 17.9 seconds (SD 2.8) and 19.6 seconds (SD 3.4) with the right and left hand, respectively.

Alternative scoring – the number of pegs placed in 50 or 100 seconds can be recorded. In this case, results are expressed as the number of pegs placed per second (Jacob-Lloyd, Dunn, Brain, & Lamb, 2005; Sunderland, Trinson, Bradley, & Langton-Hewer, 1989).
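
The alternative scoring described above can be sketched in Python as follows. This is a minimal illustration only; the function name `pegs_per_second` and the example values are hypothetical and do not come from the cited studies.

```python
def pegs_per_second(pegs_placed: int, time_limit_s: float) -> float:
    """Alternative NHPT score: pegs placed within the time limit, per second."""
    if not 0 <= pegs_placed <= 9:
        raise ValueError("the NHPT uses 9 pegs")
    if time_limit_s <= 0:
        raise ValueError("the time limit must be positive")
    return pegs_placed / time_limit_s


# Hypothetical client who places 6 pegs in the 50-second version.
print(pegs_per_second(6, 50))  # 0.12
```

A client who places no pegs within the time limit scores 0, which is how floor effects arise with this scoring mode.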

Time:

Not typically reported. The norms reported above give approximate testing times for healthy adults.

Subscales:

None

Equipment:

The standardized equipment consists of:

  • A board, in wood or plastic, with 9 holes (10 mm diameter, 15 mm depth), spaced 32 mm apart (Mathiowetz et al., 1985; Sommerfeld et al., 2004) or 50 mm apart (Heller, Wade, Wood, Sunderland, Hewer, & Ward, 1987).
  • A container for the pegs. Initially the container was a square box (100 x 100 x 10 mm) separate from the board. The most current container is a shallow round dish at the end of the board (Grice et al., 2003).
  • 9 pegs (7 mm diameter, 32 mm length) (Mathiowetz et al., 1985).
  • Stopwatch.

Training:

None typically reported.

Alternative forms of the Nine Hole Peg Test

None.

Client suitability

Can be used with:

  • Clients with stroke.
  • Clients should have a satisfactory level of upper limb fine motor skills as they must be able to pick up the pegs to complete the test.

Should not be used in:

  • The NHPT cannot be used with clients who have severe upper extremity impairment.
  • The NHPT cannot be used with clients with severe cognitive impairment.
  • Scoring with an upper time limit of 50 or 100 seconds requires caution, especially in the acute post-stroke period, due to the possibility of floor effects (Jacob-Lloyd et al., 2005; Sunderland et al., 1989).

In what languages is the measure available?

There are no official translations of the NHPT.

Some publications from the Netherlands, Japan and Sweden have used the NHPT as an outcome measure, which shows its use in languages other than English (Dekker, Van Staalduinem, Beckerman, Van der Lee, Koppe, & Zondervan, 2001; Hatanaka, Koyama, Kanematsu, Takahashi, Matsumoto, & Domen, 2007; Sommerfeld et al., 2004).

Summary

What does the tool measure? Finger dexterity.
What types of clients can the tool be used for? The NHPT can be used with, but is not limited to clients with stroke.
There are no restrictions when administering it to clients with chronic stroke. With clients with acute stroke the mode of scoring should be observed in order to avoid floor effects.
Is this a screening or assessment tool? Assessment
Time to administer The amount of time it takes to administer the NHPT has not been reported and it will vary according to the client’s impairment or the mode of scoring.
Versions There are no alternative versions.
Other Languages There are no official translations.
Measurement Properties
Reliability Internal consistency:
No studies have examined the internal consistency of the NHPT.

Intra-rater:
Three studies have examined the intra-rater reliability of the NHPT. Two reported excellent intra-rater reliability and one reported adequate intra-rater reliability using correlation coefficients. One study used Spearman rho and the other two used Pearson correlation.

Inter-rater:
Three studies have examined the inter-rater reliability of the NHPT and reported excellent inter-rater reliability using correlation coefficients. One study used Spearman rho and the other two used Pearson correlation.

Validity Criterion:
Concurrent:
Two studies have examined the concurrent validity of the NHPT. The first study examined the sensitivity of the NHPT, using the Frenchay Arm Test as the gold standard, and reported that the NHPT has low sensitivity, with 27% of cases misclassified. The second study examined the concurrent validity of the NHPT and reported adequate to excellent correlations with the Box and Block Test (BBT) and the Action Research Arm Test (ARAT) at pre- and post-treatment.

Predictive:
One study has examined predictive validity and reported that the NHPT is not able to predict functional outcomes six months after stroke.

Construct:
Convergent:
One study has examined convergent validity of the NHPT and reported excellent correlations between the NHPT and the Motricity Index using Pearson correlation coefficients.

Floor/Ceiling Effects

Two studies have examined floor effects of the NHPT. In both studies, clients were scored based on a cutoff of 50 or 100 seconds. Participants not able to complete the test within this time were scored as 0. In both studies, floor effects were poor or adequate in the earlier phases after stroke. Six months after stroke, the floor effects were adequate.

Does the tool detect change in patients? Two studies have examined the ability to detect change of the NHPT and reported that the NHPT is able to detect change.
Acceptability The NHPT should not be used with clients with severe upper extremity impairment or those who are not able to pick up the pegs.
Feasibility

The administration of the NHPT is quick and simple, however it requires standardized equipment.

One study has examined the feasibility of the NHPT and reported that, on average, 52% of clients with acute stroke were not able to perform the NHPT (Jacob-Lloyd et al., 2005).

How to obtain the tool?

The NHPT instructions can be obtained in the study by Mathiowetz et al. (1985). Also, a version of the measure can be obtained from the publication by Wade (1992). Davis et al. (1999) reported that the most commonly used standardized equipment for the NHPT in the United States is produced by Smith and Nephew Rehabilitation, Inc. and Sammons Preston.

Standardized equipment can be obtained at the website: http://www.sammonspreston.com/Supply/Product.asp?Leaf_Id=A8515

Psychometric Properties

Overview

We conducted a literature search to identify all relevant publications on the psychometric properties of the Nine-Hole Peg Test (NHPT) in two different populations – healthy normal subjects and individuals with stroke. We identified seven studies. The results of these suggest that the NHPT may be a reliable, valid and responsive measure in clients with stroke. In clients with acute stroke, the NHPT needs to be used carefully due to the possibility of floor effects.

In a literature review, Croarkin, Danoff, and Barnes (2004) identified the level of evidence for nine upper extremity motor function tests. The level of evidence was established based on the total number of psychometric properties addressed in studies of each test. Compared to the Action Research Arm Test (Lyle, 1981), Chedoke-McMaster Stroke Assessment (Gowland, VanHullenaar & Torresin et al., 1995), Fugl-Meyer Sensorimotor Assessment (Fugl-Meyer, Jääskö, Leyman, Olsson & Steglind, 1975), Modified Motor Assessment Chart (Lindmark & Hamrin, 1988), Motor Assessment Scale (Carr, Shepherd, Nordholm & Lynne, 1985), Motor Club Assessment (Ashburn, 1982), Motricity Index (Demeurisse, Demol & Robaye, 1980) and Rivermead Motor Assessment (Lincoln & Leadbitter, 1979), the NHPT was found to have the greatest number of psychometric properties supported, with studies on intra-rater reliability, inter-rater reliability, convergent validity and predictive validity.

Floor/Ceiling Effects

Jacob-Lloyd, Dunn, Brain, and Lamb (2005) examined the ceiling and floor effects of the NHPT in 50 persons with stroke. Participants were assessed twice within a 6-month interval. The first assessment was at hospital discharge. In this study, participants were scored based on a cutoff of 100 seconds. Those who took more than 100 seconds to complete the test were scored as 0. At discharge, the NHPT demonstrated an adequate floor effect, with less than 20% of the participants scoring the minimal value. After 6 months, the number of participants scoring the minimal value had decreased, with the NHPT still demonstrating an adequate floor effect.

Sunderland, Trinson, Bradley, and Langton-Hewer (1989) examined the presence of a floor effect in 31 participants with stroke. Assessments were performed at four points in time: admission, 1, 3 and 6 months post-stroke. Participants were given 50 seconds to complete the test. Those who were not able to complete the test within this time limit were scored as 0. Initially, the NHPT demonstrated a poor floor effect of 65%, which decreased by the 6-month follow-up.
Note: No values were provided by the authors for the 6 month follow-up.
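
As a minimal sketch of how a floor effect of this kind can be quantified, the Python function below computes the percentage of participants who would be scored 0 under a given time cutoff. The function name and the sample times are hypothetical and are not data from either study.

```python
from typing import Optional, Sequence


def floor_effect_percent(times_s: Sequence[Optional[float]], cutoff_s: float) -> float:
    """Percentage of participants scored 0 (test not completed within the cutoff).

    `None` stands for a participant who could not complete the test at all.
    """
    if not times_s:
        raise ValueError("at least one participant is required")
    at_floor = sum(1 for t in times_s if t is None or t > cutoff_s)
    return 100 * at_floor / len(times_s)


# Hypothetical completion times in seconds, not data from either study.
sample = [32.0, None, 48.5, 75.0, 120.0]
print(floor_effect_percent(sample, cutoff_s=50))   # 60.0
print(floor_effect_percent(sample, cutoff_s=100))  # 40.0
```

The same sample yields a smaller floor effect with the 100-second cutoff than with the 50-second cutoff, which illustrates why the scoring mode matters in the acute phase.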

Reliability

Note: A number of the publications on reliability reviewed below used statistical analyses such as Pearson’s correlation coefficient that are not considered the analyses of preference for testing reliability and may artificially inflate reliability coefficients. Future studies should examine the reliability of the NHPT using ICC or Kappa statistics.

Test-retest:
No studies were identified examining the test-retest reliability of the NHPT.

Intra-rater:
Heller, Wade, Wood, Sunderland, Hewer, and Ward (1987) examined the intra-rater reliability of the NHPT, Frenchay Arm Test (Heller et al., 1987), Finger Tapping Rate (Lezak, 1983), and Grip Strength (Mathiowetz, Kashman, Volland, Weber, Dowe, & Rogers, 1985) in 10 patients with chronic stroke. Participants were re-assessed with a 2-week interval by the same rater. In this study, results describe the range of reliability of the four measures mentioned above, and values for each individual measure were not provided. Spearman rho correlation coefficient was excellent (ranging for all four measures from r = 0.68 to 0.99).
Note: Although it is not possible to discern the exact value for the NHPT’s reliability, all values were considered excellent and statistically significant, suggesting that the NHPT may be reliable with stable stroke clients.

Mathiowetz, Weber, Kashman, and Volland (1985) examined the intra-rater reliability of the NHPT in 26 healthy female young adults. Participants were re-assessed with a 1-week interval by the same rater. The Pearson correlation coefficient showed excellent agreement (r = 0.69) for the right hand and adequate agreement (r = 0.43) for the left.

Grice et al. (2003) reproduced the Mathiowetz et al. (1985) study in order to estimate the intra-rater reliability of the NHPT, after its design was slightly modified. In the Mathiowetz and associates’ study, the NHPT equipment was composed of a wooden board for the holes and a wooden square container for the pegs. The NHPT equipment was then modified to a plastic board with a shallow round dish as container, at the end of the board. Pearson correlation coefficient for the new NHPT was reported as adequate (r = 0.46; r = 0.44) for the right and left hand, respectively.

Inter-rater:
Heller et al. (1987) examined the inter-rater reliability of the NHPT, Frenchay Arm Test (Heller et al., 1987), Finger Tapping Rate (Lezak, 1983), and Grip Strength (Mathiowetz et al., 1985) in 10 patients with chronic stroke. Participants were assessed twice within a week by two different raters. Spearman rho correlation coefficients were excellent (ranging for all four measures from r = 0.75 to 0.99).
Note: In this study, individual values for each measure were not provided. Although it is not possible to discern the exact value for the NHPT’s reliability, all values were considered excellent.

Mathiowetz et al. (1985) examined the inter-rater reliability of the NHPT in 26 healthy young female adults. Participants were evaluated simultaneously and independently by two raters. Pearson correlation coefficients showed excellent agreement (r = 0.97; r = 0.99) for the right and left hand, respectively.

Grice et al. (2003) reproduced the Mathiowetz et al. (1985) study to estimate the inter-rater reliability of the new NHPT. Pearson correlation coefficients showed excellent agreement (r = 0.98; r = 0.99) for the right and left hand, respectively.

Validity

Content:

Not available.

Criterion:

Concurrent:
Sunderland et al. (1989) estimated the sensitivity of the NHPT, the Motor Club Assessment (Ashburn, 1982) and the Motricity Index (Demeurisse et al., 1980) by comparing them to the Frenchay Arm Test (Heller et al., 1987), as the gold standard, in 31 participants with acute stroke. The NHPT had the lowest sensitivity with 27% of the cases incorrectly classified. The most sensitive measure, with 0% of cases misclassified, was the Motricity Index.

Lin, Chuang, Wu, Hsieh and Chang (2010) compared the concurrent validity of the NHPT, Action Research Arm Test (ARAT) and Box and Block Test (BBT) for evaluating hand dexterity in 59 patients with stroke. The Fugl-Meyer Assessment of Sensorimotor Recovery After Stroke (FMA), Motor Activity Log (MAL) and Stroke Impact Scale (SIS) were also administered to assess the concurrent validity of the NHPT, ARAT and BBT. Using Spearman rank correlation coefficient, the NHPT, ARAT and BBT were found to have adequate to excellent correlations at pre-treatment (ranging from rho=-0.55 to -0.80) and post-treatment (ranging from rho=-0.57 to -0.71). In addition, the ARAT and BBT were found to have adequate correlations with the FMA, MAL and SIS (ranging from rho=0.31 to -0.59); however, the NHPT had only poor to adequate correlations with the FMA and MAL (ranging from rho=-0.16 to -0.33); and adequate to excellent correlations with the SIS (ranging from rho=-0.58 to -0.66). When considering both the results of responsiveness and validation components of the study, the ARAT and BBT are believed to be more appropriate than the NHPT for evaluating dexterity.

Predictive:
Sunderland et al. (1989) examined whether the NHPT, Motor Club Assessment (Ashburn, 1982) and Motricity Index (Demeurisse et al., 1980) were able to predict functional outcomes at six months after stroke measured by the Frenchay Arm Test (Heller et al., 1987). Predictive validity of the NHPT was examined in 31 participants with acute stroke. Assessments were performed at four points in time: admission, 1, 3 and 6 months post-stroke. The NHPT administered at 1 month did not predict functional outcomes at 6 months. The best predictor of functional outcomes at 6 months was the Motricity Index.

Construct:

Convergent/Discriminant:
Parker, Wade, and Hewer (1986) tested the construct validity of the NHPT by comparing the NHPT to the Motricity Index (Demeurisse et al., 1980) in 187 persons with stroke. The correlation between NHPT and Motricity Index was excellent (r = 0.82).

Known groups:
No studies have examined known groups’ validity of the NHPT.

Responsiveness

Jacob-Lloyd et al. (2005) examined the responsiveness of the NHPT in 50 persons with stroke. Participants were assessed twice within a 6-month interval. The first assessment was at hospital discharge. Effect sizes were calculated using the Wilcoxon signed rank test. Although the authors reported a large effect size in this study, no reference values were provided. The NHPT was more likely to detect change than the Motricity Index (Demeurisse et al., 1980).

Lin, Chuang, Wu, Hsieh and Chang (2010) evaluated the responsiveness of the NHPT, the Action Research Arm Test (ARAT) and Box and Block Test (BBT) for evaluating hand dexterity in 59 patients with subacute stroke (< 6 months) and Brunnstrom stage IV to VI for proximal and distal upper extremity function. Patients were randomly assigned to receive constraint-induced therapy, bilateral arm training or control treatment and received 2 hours of therapy, 5 days per week for 3 weeks. Assessments were performed at baseline and 3 weeks. Using Standardized Response Mean (SRM) to calculate responsiveness, the NHPT, ARAT and BBT were all found to have moderate SRM (0.64, 0.79 and 0.74, respectively), indicating sensitivity for detecting change in hand dexterity. When considering both the results of responsiveness and validation components of the study, the ARAT and BBT are believed to be more appropriate than the NHPT for evaluating dexterity.
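
The SRM is conventionally computed as the mean of the change scores divided by the standard deviation of the change scores. The Python sketch below illustrates that definition; it assumes the conventional formula, uses hypothetical scores rather than data from Lin et al. (2010), and the function name is illustrative only.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """SRM = mean of the change scores / standard deviation of the change scores."""
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need paired scores for at least two patients")
    changes = [post - pre for pre, post in zip(baseline, follow_up)]
    spread = stdev(changes)
    if spread == 0:
        raise ValueError("SRM is undefined when every patient changes by the same amount")
    return mean(changes) / spread


# Hypothetical paired scores, not data from Lin et al. (2010).
pre = [10, 12, 9, 15, 11]
post = [13, 14, 10, 19, 12]
print(round(standardized_response_mean(pre, post), 2))
```

For a timed measure such as the NHPT, improvement appears as a decrease in seconds, so the sign of the change scores needs to be interpreted accordingly.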

References

  • Ashburn, A. (1982). A physical assessment for stroke patients. Physiotherapy, 68, 109-113.
  • Carr, J.H., Shepherd, R.B., Nordholm, L., & Lynne, D. (1985). Investigation of a new motor assessment scale for stroke patients. Physical Therapy, 65, 175-180.
  • Croarkin, E., Danoff, J., & Barnes, C. (2004). Evidence-based rating of upper-extremity motor function tests used for people following a stroke. Physical Therapy, 84, 62-74.
  • Cromwell, F.S. (1965). Occupational therapists manual for basic skills assessment: primary prevocational evaluation. California, USA: Fair Oaks Printing.
  • Davis, J., Kayser, J., Matlin, P., Mower, S., & Tadano, P. (1999). Nine hole peg tests – are they all the same? Occupational Therapy Practice, 4, 59-61.
  • Dekker, C.L., Van Staalduinem, A.M., Beckerman, H., Van der Lee, J.H., Koppe, P.A., & Zondervan, R.C.J. (2001). Concurrent validity of instruments to measure upper extremity performance: the action research arm test; the nine hole peg test and the motricity index. Nederlands Tijdschrift Voor Fysiotherapie, 111(15), 110-115.
  • Demeurisse, G., Demol, O., & Robaye, E. (1980). Motor evaluation in vascular hemiplegia. European Neurology, 19(6), 382-389.
  • Desrosiers, J., Rochette, A., Hebert, R., & Bravo, G. (1997). The Minnesota manual dexterity test: reliability, validity and reference values studies with healthy elderly people. Canadian Journal of Occupational Therapy, 64(5), 270-276.
  • Fugl-Meyer, A.R., Jääskö, L., Leyman, I., Olsson, S., & Steglind, S. (1975). The post-stroke hemiplegic patient 1. A method for evaluation of physical performance. Scandinavian Journal of Rehabilitation Medicine, 7, 13-31.
  • Grice, K.O., Vogel, K.A., Le, V., Mitchell, A., Muniz, S., & Vollmer, M.A. (2003). Adult norms for a commercially available nine hole peg test for finger dexterity. American Journal of Occupational Therapy, 57, 570-573.
  • Gowland, C., VanHullenaar, S., Torresin, W., et al., (1995). Chedoke-McMaster Stroke Assessment: development, validation, and administration manual. Hamilton, (ON), Canada: School of Rehabilitation Science, McMaster University.
  • Hatanaka, T., Koyama, T., Kanematsu, M., Takahashi, N., Matsumoto, K., & Domen, K. (2007). New evaluation method for upper extremity dexterity of patients with hemiparesis after stroke: the 10-second tests. International Journal of Rehabilitation Research, 30(3), 243-247.
  • Heller, A., Wade, D.T., Wood, V.A., Sunderland, A., Hewer, R., & Ward, E. (1987). Arm function after stroke: measurement and recovery over the first three months. Journal of Neurology, Neurosurgery & Psychiatry, 50(6), 714-719.
  • Jacob-Lloyd, H.A., Dunn, O.M., Brain, N.D., & Lamb, S.E. (2005). Effective measurement of the functional progress of stroke clients. British Journal of Occupational Therapy, 68 (6), 253-259.
  • Jebsen, R.H., Taylor, N., Trieschmann, R.B., Trotter, M.J., & Howard, L.A. (1969). An objective and standardized test of hand function. Archives of Physical Medicine and Rehabilitation, 50, 311-319.
  • Kellor, M., Frost, J., Silberberg, N., Iversen, I., & Cummings R. (1971). Hand strength and dexterity. American Journal of Occupational Therapy, 25, 77-83.
  • Lezak, M.D. (1983). Neuropsychological assessment. Oxford, England: Oxford University Press.
  • Lincoln, N.B. & Leadbitter, D. (1979). Assessment of motor function in stroke patients. Physiotherapy, 65, 48-51.
  • Lin, K-C., Chuang, L-L., Wu, C-Y., Hsieh, Y-W. & Chang, W-Y. (2010). Responsiveness and validity of three dexterous function measures in stroke rehabilitation. Journal of Rehabilitation Research and Development, 47(6), 563-572.
  • Lindmark, B. & Hamrin, E. (1988). Evaluation of function capacity after stroke as a basis for active intervention: Presentation of a modified chart for motor capacity assessment and its reliability. Scandinavian Journal of Rehabilitation Medicine, 20, 103-109.
  • Lyle, R.C. (1981). A performance test for assessment of upper limb function in physical rehabilitation treatment and research. International Journal of Rehabilitation Research, 4, 483-92.
  • Mathiowetz, V., Weber, K., Kashman, N., & Volland, G. (1985). Adult norms for the nine hole peg test of finger dexterity. Occupational Therapy Journal of Research, 5, 24-33.
  • Mathiowetz, V., Kashman, N., Volland, G., Weber, K., Dowe, M., & Rogers, S. (1985). Grip and pinch strength: normative data for adults. Archives of Physical Medicine and Rehabilitation, 66, 69-72.
  • Parker, V. M., Wade, D. T., & Hewer, R. (1986). Loss of arm function after stroke: measurement, frequency, and recovery. International Rehabilitation Medicine, 8(2), 69-73.
  • Sommerfeld, D.K., Eek, E.U.B., Svensson, A.K., Holmqvist, L.W., & Arbin, M.H. (2004). Spasticity after stroke: its occurrence and association with motor impairments and activity limitations. Stroke, 35, 134-140.
  • Sunderland, A., Trinson, D., Bradley, L., & Hewer, R. (1989). Arm function after stroke: an evaluation of grip strength as a measure of recovery and a prognostic indicator. Journal of Neurology, Neurosurgery & Psychiatry, 52, 1267-1272.
  • Tiffin, J. (1968). Purdue Pegboard Examiner Manual. Chicago, USA: Science Research Associates.
  • Wade, D.T. (1992). Measurement in Neurological Rehabilitation. Oxford, England: Oxford University Press.

See the measure

How to obtain the NHPT?

The NHPT instructions can be obtained in the study by Mathiowetz et al. (1985) and Wade (1992).

Davis, Kayser, Matlin, Mower, and Tadano (1999) reported that the most commonly used standardized equipment for the NHPT in the United States is produced by Smith and Nephew Rehabilitation, Inc., and Sammons Preston.

Standardized equipment can be obtained at the website: http://www.sammonspreston.com/Supply/Product.asp?Leaf_Id=A8515

Purdue Pegboard Test (PPT)

Evidence Reviewed as of before: 06-09-2012
Author(s)*: Katie Marvin, MSc.PT
Editor(s): Annabel McDermott, OT; Nicol Korner-Bitensky, PhD OT

Purpose

The Purdue Pegboard Test (PPT) is a test of fingertip dexterity and gross movement of the hand, fingers and arm in patients with impairments of the upper extremity resulting from neurological and musculoskeletal conditions.

In-Depth Review

Purpose of the measure

The Purdue Pegboard Test (PPT) was developed by Joseph Tiffin in 1948. The PPT is now used widely by clinicians and researchers as a measure of (1) gross movement of the arm, hand and fingers, and (2) fingertip dexterity. The PPT is suitable for use with patients with impairments of the upper extremity resulting from neurological and musculoskeletal conditions.

Available versions

None typically reported

Features of the measure

Description of tasks:

The PPT measures:

  • (1) Gross movement of the fingers, hand and arm; and
  • (2) Fingertip dexterity

The patient should be seated comfortably at a testing table with the PPT on the table in front of him/her. The test apparatus consists of a board with 4 cups across the top and two vertical rows of 25 small holes down the centre. The two outside cups contain 25 pins each; the cup to the immediate left of the centre contains 40 washers and the cup to the immediate right of the centre contains 20 collars.

The clinician demonstrates and then administers 4 timed subtests; a fifth score is calculated from the first three:

  • Right hand (30 seconds): Clients use their right hand to place as many pins as possible down the right-hand row within 30 seconds.
  • Left hand (30 seconds): Clients use their left hand to place as many pins as possible down the left-hand row within 30 seconds.
  • Both hands (30 seconds): Clients use both hands simultaneously to place as many pins as possible down both rows within 30 seconds.
  • Right + Left + Both hands: This is not a separate test; it is the sum of the three scores above.
  • Assembly (60 seconds): Clients use both hands simultaneously to assemble pins, washers and collars within 60 seconds.

Specific administration instructions can be found in the instruction manual that accompanies the PPT.

Scoring and Score Interpretation:

The clinician compiles 5 separate scores from the complete test procedure, one for each of the following tasks:

  • Right hand (30 seconds): The total number of pins placed in the right hand column using the right hand in the allotted time.
  • Left hand (30 seconds): The total number of pins placed in the left hand column using the left hand in the allotted time.
  • Both hands (30 seconds): The total number of pairs of pins placed in both columns using both hands in the allotted time.
  • Right + Left + Both hands: The sum of scores for the previous three tasks (right hand + left hand + both hands).
  • Assembly (60 seconds): The total number of pins, washers and collars assembled in the allotted time.

Testing should proceed in the order outlined above. If the patient is left-handed, tasks 1 and 2 are reversed. The preferred method of administration is the three-trial method, in which the patient is permitted three trials of each task after a single demonstration by the clinician. (The one-trial administration method permits the patient only one trial following demonstration by the clinician.) The test can be administered in an individual or group setting.

Desrosiers, Hebert, Bravo and Dutil (1995) developed predictive equations for Purdue Pegboard subtest scores, based on normative data resulting from their study. The normative data portion of the study involved 360 healthy participants over the age of 60 years. The following predictive equations were determined:

Purdue subtests | Females | Males
Right hand | 24.0 – 0.15 x (age) | 22.5 – 0.15 x (age)
Left hand | 23.7 – 0.16 x (age) | 24.1 – 0.18 x (age)
Both hands | 19.9 – 0.14 x (age) | 20.0 – 0.15 x (age)
Right + Left + Both hands | 67.7 – 0.45 x (age) | 66.5 – 0.48 x (age)
Assembly | 59.4 – 0.45 x (age) | 62.2 – 0.53 x (age)

Example: The expected score for an 80 year old woman on the right hand task is: 24.0 – (0.15 x 80) = 12.
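The predictive equations can also be evaluated programmatically. The following Python sketch is illustrative only: the coefficients are transcribed from the table above, while the function name, the dictionary keys and the decision to reject ages under 60 (because the normative sample was aged 60 years and over) are assumptions made for this example.

```python
# (intercept, decline per year of age) for each subtest, transcribed from
# the Desrosiers, Hebert, Bravo and Dutil (1995) table above.
PPT_EQUATIONS = {
    "female": {
        "right_hand": (24.0, 0.15),
        "left_hand": (23.7, 0.16),
        "both_hands": (19.9, 0.14),
        "right_left_both": (67.7, 0.45),
        "assembly": (59.4, 0.45),
    },
    "male": {
        "right_hand": (22.5, 0.15),
        "left_hand": (24.1, 0.18),
        "both_hands": (20.0, 0.15),
        "right_left_both": (66.5, 0.48),
        "assembly": (62.2, 0.53),
    },
}


def expected_ppt_score(sex, subtest, age):
    """Return the expected subtest score: intercept - slope x age.

    Assumption: ages under 60 are rejected, because the equations were
    derived from participants aged 60 years and over.
    """
    if age < 60:
        raise ValueError("equations were derived from people aged 60 and over")
    intercept, slope = PPT_EQUATIONS[sex][subtest]
    return intercept - slope * age


# Reproduces the worked example: 80-year-old woman, right hand task.
print(round(expected_ppt_score("female", "right_hand", 80), 1))  # 12.0
```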

Time:

The PPT takes approximately 5 to 10 minutes to administer and score.

Training requirements:

None typically reported; however, it is recommended that the clinician is familiar with the assessment tool. The clinician should be able to demonstrate performance of the PPT to clients at an average speed.

Equipment:

Purdue Pegboard Test (Model #32020)

  • Instruction manual
  • One test board
  • Pins x 50, collars x 20, washers x 40
  • Score sheets
  • Testing table approximately 30 inches tall
  • Stopwatch or clock that reads in seconds

Alternative forms of the Purdue Pegboard Test

None typically reported.

Client suitability

Can be used with:

  • Clients presenting with lateral brain damage (Costa et al., 1963)
  • Clients with hemiplegia resulting from stroke (Ashford, Slade, Malaprade & Turner-Stokes, 2008)
  • Clients requiring assessment for vocational rehabilitation (Hemm and Curtis, 1980)
  • Clients with dyslexia (Leslie, Davidson and Batey, 1985)
  • Clients of all ages

Should not be used in:

  • None reported

In what languages is the measure available?

No formal translations of the PPT have been reported. Because of the non-verbal nature of the assessment, it can be used with non-English-speaking groups.

Summary

What does the tool measure? | Dexterity and gross movement of the upper limb
What types of clients can the tool be used for? | The PPT can be used with, but is not limited to, clients with stroke.
Is this a screening or assessment tool? | Assessment tool
Time to administer | The PPT takes approximately 5 to 10 minutes to administer.
Versions | There are no alternative versions of the PPT.
Other Languages | None typically reported.
Measurement Properties
Reliability | Test-retest: Several studies have investigated the test-retest reliability of the PPT in healthy participants and found adequate to excellent test-retest reliability for all subtests. A three-trial administration method has been found to be more reliable than a one-trial method.
Validity | Construct (known groups): One study examined the known groups validity of the PPT and found it to have 70% accuracy for detecting lateralization of brain damage and 90% accuracy for detecting brain damage regardless of lateralization.
Floor/Ceiling Effects | No studies have examined the floor/ceiling effects of the PPT in clients with stroke.
Does the tool detect change in patients? | No studies have formally investigated the responsiveness of the PPT in clients with stroke.
Acceptability | The PPT has been criticized for not being reflective of real life activities of daily living (Ashford, Slade, Malaprade & Turner-Stokes, 2008). The test is quick to complete and should not produce undue fatigue for patients.
Feasibility | The PPT is short and easy to administer and score.
How to obtain the tool?

The PPT can be ordered by contacting the manufacturer directly at:

Lafayette Instruments
3700 Sagamore Parkway North
P.O. Box 5729 | Lafayette, IN 47903 USA
Tel: 765.423.1505 | 800.428.7545
Fax: 765.423.4111
E-mail: info@lafayetteinstrument.com
www.lafayetteinstrument.com

Psychometric Properties

Overview

A literature search was conducted to identify all relevant publications on the psychometric properties of the Purdue Pegboard Test (PPT). Several studies have been conducted; however, only one study was specific to clients with stroke.

Reliability

Internal consistency:
No studies have examined the internal consistency of the PPT in clients with stroke.

Test-retest:
Buddenberg and Davis (1999) examined the 1-week test-retest reliability of the PPT using the one-trial and three-trial administration procedures in 47 healthy participants. Using the intraclass correlation coefficient (ICC), the three-trial administration method was found to have excellent test-retest reliability for all subtests (ICC=0.82, 0.89, 0.85, 0.89 and 0.81 for the right hand, left hand, both hands, R+L+B and assembly subtests respectively). The one-trial administration method was found to have poor to adequate test-retest reliability (ICC=0.37, 0.61, 0.58, 0.70 and 0.51 for the right hand, left hand, both hands, R+L+B and assembly subtests respectively).

Several studies have investigated the test-retest reliability of the one-trial administration method of the PPT in healthy participants. The following chart has been adapted from Buddenberg and Davis (1999).

Reliability Coefficients Reported for One-Trial Administrations of the Purdue Pegboard Test

Subtests | Tiffin & Asher (1948) | Bass & Stucki (1951) | Tiffin (1968) | Reddon, Gill, Gauk & Maerz (1988) (men/women) | Desrosiers, Hebert, Bravo & Dutil (1995)
Right hand | 0.63 | 0.67 | 0.68 | 0.63/0.76 | 0.66
Left hand | 0.60 | 0.66 | 0.65 | 0.64/0.79 | 0.66
Both hands | 0.68 | 0.71 | 0.73 | 0.67/0.81 | 0.81
Right + Left + Both hands | 0.71 | 0.79 | 0.71 | NR | 0.90
Assembly | 0.68 | 0.72 | 0.67 | 0.81/0.83 | 0.84

NR=not reported

Desrosiers, Hebert, Bravo and Dutil (1995) investigated the test-retest reliability of the PPT in 35 healthy individuals aged 60-89 years with no known upper limb impairment. Each individual completed the PPT on 2 occasions approximately 1 week apart. Test-retest reliability, calculated using the ICC, was found to be adequate to excellent for the 5 subtests (ICC=0.66, 0.83, 0.81, 0.90 and 0.84 for Right hand, Left hand, Both hands, Right+Left+Both hands and Assembly subtests respectively). Scores from the second administration were higher, indicating a practice effect.

Intra-rater:
No studies have examined the intra-rater reliability of the PPT in clients with stroke.

Inter-rater:
No studies have examined the inter-rater reliability of the PPT in clients with stroke.

Validity

Content:

No studies have examined the content validity of the PPT in clients with stroke.

Criterion:

Concurrent:
No studies have examined the concurrent validity of the PPT in clients with stroke.

Predictive:
No studies have examined the predictive validity of the PPT in clients with stroke.

Construct:

Convergent/Discriminant:
No studies have examined the convergent or discriminant validity of the PPT in clients with stroke.

Known Groups:
Costa, Vaughan, Levita & Farber (1963) examined the known groups validity of Purdue Pegboard subtests (Right, Left and Both hands) in 54 clients with brain damage resulting from neoplasms, traumatic injury or degenerative, vascular or infectious diseases; and 26 clients with peripheral nervous system lesions or lesions below the level of the thoracic spine (control group). Clinical neurological examination, electroencephalography and neuroradiographic procedures were used to confirm diagnosis. For clients below the age of 60 years, the PPT identified brain damage if one or more of the following were found on scoring: left score < 11; right score < 13; both hands score < 10; or left score > right score + 3. For clients above the age of 60 years, the corresponding criteria were: left score < 10; right score < 10; both hands score < 8; or left score > right score + 3. In both age groups, the lesion was classified as being on the left if left score > right score, and on the right if right score > left score + 3. If the client’s scores classified the client as having brain damage but neither a left nor a right lesion was identified from the scores, the brain damage was categorized as bilateral. These cutoff scores were found to have 70 percent accuracy for lateralization and 90 percent accuracy for brain damage without regard to lateralization.
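The cutoff rules described above amount to a short decision procedure. The following Python sketch encodes them exactly as they are summarized in this review, for illustration only. The function name, argument names and return format are hypothetical, and applying the older-group cutoffs to a client aged exactly 60 is an assumption, since the review only describes clients "below" and "above" 60 years.

```python
def classify_ppt_costa(right, left, both, age):
    """Apply the Costa et al. (1963) cutoffs as summarized in this review.

    right, left, both: scores on the Right hand, Left hand and Both hands
    subtests. Assumption: age exactly 60 uses the older-group cutoffs.
    """
    if age < 60:
        left_cut, right_cut, both_cut = 11, 13, 10
    else:
        left_cut, right_cut, both_cut = 10, 10, 8

    # Brain damage is flagged if one or more criteria are met.
    brain_damage = (
        left < left_cut
        or right < right_cut
        or both < both_cut
        or left > right + 3
    )
    if not brain_damage:
        return {"brain_damage": False, "lesion_side": None}

    # Lateralization rules; anything else is categorized as bilateral.
    if left > right:
        side = "left"
    elif right > left + 3:
        side = "right"
    else:
        side = "bilateral"
    return {"brain_damage": True, "lesion_side": side}


print(classify_ppt_costa(right=9, left=14, both=11, age=45))
# {'brain_damage': True, 'lesion_side': 'left'}
```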

Sensitivity/specificity:

No studies have examined the sensitivity/specificity of the PPT in clients with stroke.

Responsiveness

No studies have examined the responsiveness of the PPT in clients with stroke.

References

  • Ashford, S., Slade, M., Malaprade, F. & Turner-Stokes, L. (2008). Evaluation of functional outcome measures for the hemiparetic upper limb: A systematic review. Journal of Rehabilitation Medicine, 40, 787-795.
  • Buddenberg, L.A. & Davis, C. (1999). Test-retest reliability of the Purdue Pegboard Test. The American Journal of Occupational Therapy, 54(5), 555-558.
  • Costa, L.D., Vaughan, H.G., Levita, E. & Farber, N. (1963). Purdue Pegboard as a predictor of the presence and laterality of cerebral lesions. Journal of Consulting Psychology, 27(2), 133-137.
  • Desrosiers, J., Hebert, R., Bravo, G. & Dutil, E. (1995). The Purdue Pegboard Test: Normative data for people aged 60 and over. Disability and Rehabilitation, 17(5), 217-224.


Stroke Arm Ladder

Evidence Reviewed as of before: 15-02-2012
Author(s)*: Katie Marvin, MSc. PT (Candidate)
Editor(s): Annabel McDermott, OT; Nicol Korner-Bitensky, PhD OT
Expert Reviewer: Johanne Higgins, PhD

Purpose

The Stroke Arm Ladder was developed from an existing bank of test items used to evaluate upper extremity function in patients with stroke. The Stroke Arm Ladder incorporates observable tests of capacity or performance and questions aimed at identifying activity and participation components of the World Health Organization’s International Classification of Functioning, Disability and Health (ICF). The measure includes items that cover a wide range of difficulty levels.

In-Depth Review

Purpose of the measure

The Stroke Arm Ladder was developed from an existing bank of test items used to evaluate upper extremity function in patients with stroke. The measure incorporates observable tests of capacity or performance and questions aimed at identifying activity and participation components of the World Health Organization’s International Classification of Functioning, Disability and Health (ICF). The measure includes items that cover a wide range of difficulty levels.

Clinicians and researchers need to use a variety of evaluation measures to assess interventions and constructs related to upper extremity function in patients following stroke. Administering several tests can be time-consuming and burdensome for clients. The Stroke Arm Ladder was developed to address this issue by providing a single, more comprehensive interval-scale measure for evaluation and monitoring of upper extremity function.

Available versions

None yet reported

Features of the measure

Items:

The Stroke Arm Ladder comprises 34 items selected from an existing bank of 49 test items used to evaluate upper extremity function in patients with stroke. The item bank reflects the domains of the World Health Organization’s International Classification of Functioning, Disability and Health (ICF) (body functions; and activity and participation), and was derived from commonly used outcome measures, such as the Chedoke McMaster Stroke Assessment, Barthel Index and the Stroke Rehabilitation Assessment of Movement.

Description of tasks:

Starting item: pistol grip, pull trigger then return.

  • If patient is unable to perform starting item – then proceed to EASY subtest, start with number 7.
  • If patient is able to perform starting item – then proceed to DIFFICULT subtest, start with number 36.

EASY subtest items:

Item | Score/100
1. Tie a scarf around one’s neck (bilateral task). The task is partially executed (more than 25%) or certain steps are executed with major difficulties necessitating repeated efforts. Part of the task may have had to be modified or needed assistance to make it achievable. | 3
2. Open a jar and remove a spoonful of coffee (bilateral task). The task is partially executed (more than 25%) or certain steps are executed with major difficulties necessitating repeated efforts. Part of the task may have had to be modified or needed assistance to make it achievable. | 4
3. Unlock a lock and open a pill container (bilateral task). The task is partially executed (more than 25%) or certain steps are executed with major difficulties necessitating repeated efforts. Part of the task may have had to be modified or needed assistance to make it achievable. | 5
4. Feeding independently. The patient needs some assistance to feed him- or herself a meal from a tray or table when someone places the food within reach. The patient needs assistance to put on an assistive device if required, cut up food, use salt and pepper, spread butter, etc. The patient needs assistance to be able to accomplish this in a reasonable time. | 23
5. Write on an envelope and stick a stamp on it (bilateral task). The task is partially executed (more than 25%) or certain steps are executed with major difficulties necessitating repeated efforts. Part of the task may have had to be modified or needed assistance to make it achievable. | 29
6. Dressing and undressing. The patient needs some assistance: to put on, remove and fasten all clothing and tie shoelaces (unless it is necessary to use adaptive aids for this). This includes putting on, removing and fastening corsets or braces when they are prescribed. | 33
7. Shuffle and deal playing cards (bilateral task). The task is partially executed (more than 25%) or certain steps are executed with major difficulties necessitating repeated efforts. Part of the task may have had to be modified or needed assistance to make it achievable. Able to perform item number 7: Move down until patient is unable to meet the criteria for the specific task. Unable to perform item number 7: Move up until patient is able to meet the criteria for the specific task. | 34
8. Elbow at side, 90 degrees flexion: supination then pronation. | 44
9. Finger flexion then extension. | 45
10. Extends elbow in supine (starting with elbow fully flexed). Able to complete the movement in a manner that is comparable to the unaffected side. | 46
11. Protract scapula in supine. Able to complete the movement in a manner that is comparable to the unaffected side. | 48
12. Can the patient prepare their own meals? Cook meals independently? | 49
13. Feeding independently: The patient can feed him- or herself a meal from a tray or table when someone places the food within reach. The patient is able to put on an assistive device if required, cut up food, use salt and pepper, spread butter, etc. The patient must be able to accomplish this in a reasonable time. | 51
14. Hand unsupported: opposition of thumb to little finger. | 51
15. Handle coins (unilateral task). The task is partially executed (more than 25%) or certain steps are executed with major difficulties necessitating repeated efforts. Part of the task may have had to be modified or needed assistance to make it achievable. | 52
16. Place hand on sacrum. Able to complete the movement in a manner that is comparable to the unaffected side. | 53
17. Shrug shoulders (scapular elevation). Able to complete the movement in a manner that is comparable to the unaffected side. | 55
18. Can patient perform housework? Without help? | 56
19. Dressing and undressing independently. Patient is able to put on, remove and fasten all clothing and tie shoelaces (unless it is necessary to use adaptive aids for this). This includes putting on, removing and fastening corsets or braces when they are prescribed. | 56
20. Pick up and move small objects (unilateral task). The task is partially executed (more than 25%) or certain steps are executed with major difficulties necessitating repeated efforts. Part of the task may have had to be modified or needed assistance to make it achievable. | 57
21. Write on an envelope and stick a stamp on it (bilateral task). The task is successfully completed without hesitation or difficulty, as instructed or demonstrated. | 57

DIFFICULT subtest items:

Item | Score/100
22. In the past two weeks, were you able to cut your food with a knife and fork? | 58
23. In the past two weeks, were you able to use your hand that was more affected by your stroke to turn a doorknob? | 59
24. Pick up and move a jar (unilateral task). The task is successfully completed without hesitation or difficulty, as instructed or demonstrated. | 59
25. Unlock a lock and open a pill container (bilateral task). The task is successfully completed without hesitation or difficulty, as instructed or demonstrated. | 60
26. In the past two weeks, were you able to do light household tasks/chores (e.g. dust, make a bed, take out garbage, do the dishes)? Just a little or not difficult at all. | 61
27. Bathing independently. The patient must be able to use a bathtub, a shower or take a complete sponge bath. The patient must be able to perform all the steps involved in any one of these tasks without another person being present. | 63
28. Tie a scarf around one’s neck (bilateral task). The task is successfully completed without hesitation or difficulty, as instructed or demonstrated. | 63
29. Hand from knee to forehead 5x in 5 seconds. | 64
30. Arm resting at side of body: raise arm overhead with full supination. | 64
31. Pronation: tap index finger 10x in 5 seconds. | 65
32. In the past two weeks, were you able to use your hand that was most affected by your stroke to carry heavy objects (e.g. bag of groceries)? (Men) | 66
33. Open a jar and remove a spoonful of coffee (bilateral task). The task is successfully completed without hesitation or difficulty, as instructed or demonstrated. | 71
34. In the past two weeks, were you able to clip your toenails? | 73
35. In the past two weeks, were you able to use your hand that was most affected by your stroke to carry heavy objects (e.g. bag of groceries)? (Women) Just a little or not difficult at all. | 73
36. Elbow at side, 90 degrees flexion: resisted shoulder external rotation. Able to perform item number 36: Move down until patient is unable to meet the criteria for the specific task. Unable to perform item number 36: Move up until patient is able to meet the criteria for the specific task. | 76
37. Thumb to finger tips, then reverse 3x in 12 seconds. | 78
38. Number of blocks transferred in 60 seconds > 30 | 82
39. Clap hands overhead then behind back 3x in 5 seconds. | 82
40. Bounce ball 4 times in succession then catch. | 93
41. Number of blocks transferred in 60 seconds > 60 | 100

Scoring and Score Interpretation:

The Stroke Arm Ladder is scored out of 100, based on completion of test items. For example, a patient who is able to perform the starting test item (pistol grip, pull trigger then return) begins at item number 36 in the DIFFICULT subtest, and items are then tested in sequential order (36, 37, 38, 39, etc.). If the patient completes items 36 to 39 but is unable to complete item 40, he/she receives a score of 82 out of 100 (the score shown in the right hand column beside item 39).
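The entry and stopping rules can be sketched as a simple traversal of the item list. The following Python sketch is illustrative only and rests on several assumptions that the source does not confirm: the 41 listed items are treated as one continuous ladder, the score is taken from the last item completed, a patient who completes no item scores 0, and the sex-specific wording of items 32 and 35 is ignored. The function and variable names are hypothetical.

```python
# Score out of 100 shown beside each item number in the tables above.
ITEM_SCORES = {
    1: 3, 2: 4, 3: 5, 4: 23, 5: 29, 6: 33, 7: 34, 8: 44, 9: 45, 10: 46,
    11: 48, 12: 49, 13: 51, 14: 51, 15: 52, 16: 53, 17: 55, 18: 56, 19: 56,
    20: 57, 21: 57, 22: 58, 23: 59, 24: 59, 25: 60, 26: 61, 27: 63, 28: 63,
    29: 64, 30: 64, 31: 65, 32: 66, 33: 71, 34: 73, 35: 73, 36: 76, 37: 78,
    38: 82, 39: 82, 40: 93, 41: 100,
}
EASY_ENTRY = 7        # entry item if the starting item is failed
DIFFICULT_ENTRY = 36  # entry item if the starting item is passed


def stroke_arm_ladder_score(passes_starting_item, can_perform):
    """Return a score out of 100.

    can_perform: callable taking an item number and returning True if the
    patient meets the criteria for that item.
    """
    entry = DIFFICULT_ENTRY if passes_starting_item else EASY_ENTRY

    if can_perform(entry):
        # Move down the list (harder items) until an item is failed.
        last_passed = entry
        item = entry + 1
        while item in ITEM_SCORES and can_perform(item):
            last_passed = item
            item += 1
        return ITEM_SCORES[last_passed]

    # Move up the list (easier items) until an item is passed.
    item = entry - 1
    while item in ITEM_SCORES:
        if can_perform(item):
            return ITEM_SCORES[item]
        item -= 1
    return 0  # assumption: no item completed


# Worked example from the text: passes items 36 to 39, fails item 40.
print(stroke_arm_ladder_score(True, lambda item: item <= 39))  # 82
```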

Information on score interpretation is not yet available.

Time:

Not reported.

Training requirements:

None reported.

Equipment:

  • Scarf
  • Jar with lid
  • Coffee
  • Pill container
  • Manual lock
  • Feeding utensils
  • Plate, bowl, glass, mug
  • Salt and pepper shakers
  • Envelope
  • Stamp
  • Clothing (shirt and pants with buttons)
  • Deck of cards
  • Coins
  • Pen or pencil
  • Access to a kitchen and bathroom if observation of tasks is required

Alternative forms of the assessment

None yet reported

Client suitability

Can be used with:

  • Clients with stroke (mild, moderate and severe) in the acute and sub-acute phase.

Should not be used with:

  • Patients more than 7 months post-stroke, until further validation testing is completed.

In what languages is the measure available?

English

Summary

What does the tool measure? | Upper extremity function following stroke.
What types of clients can the tool be used for? | Can be used with clients with stroke.
Is this a screening or assessment tool? | Assessment tool
Time to administer | Not yet reported.
Versions | There are no alternative versions.
Other Languages | There are no official translations.
Measurement Properties
Reliability | Internal consistency: One study examined the internal consistency of the Stroke Arm Ladder and found internal consistency to be excellent.
Validity | Content: One study examined the content validity of the Stroke Arm Ladder and confirmed the hierarchical sequencing of the items using Rasch analysis.
Validity | Construct (convergent/discriminant): One study examined convergent validity of the Stroke Arm Ladder and reported excellent correlations between the Stroke Arm Ladder and the Stroke Rehabilitation Assessment of Movement; and poor correlation between the Stroke Arm Ladder and the mental and emotional health subsets of the Medical Outcomes Study Short Form 36.
Validity | Construct (known groups): One study examined known groups validity and found that the Stroke Arm Ladder could differentiate between the two extremes of stroke severity: mild and severe.
Floor/Ceiling Effects | One study examined the floor and ceiling effects and found no floor or ceiling effects in a sample population of patients with stroke ranging from mild to severe. Note: The Stroke Arm Ladder has only been tested on patients up to 7 months post-stroke.
Does the tool detect change in patients? | Not yet assessed.
Acceptability | Results support preliminary validation of the psychometric properties; however, further research is needed before the tool is ready for use clinically.
Feasibility | The Stroke Arm Ladder is easy to administer. It provides a more comprehensive tool for evaluation and monitoring of upper extremity function.
How to obtain the tool? | Information on the Stroke Arm Ladder can be obtained from the Higgins, Finch, Kopec & Mayo (2011) study.

Psychometric Properties

Overview

A literature search was conducted to identify all relevant publications on the psychometric properties of the Stroke Arm Ladder and revealed only the initial validation study. Results support preliminary validation of the psychometric properties; however, further research is needed before the tool is ready for use clinically.

Floor/Ceiling Effects

Higgins, Finch, Kopec and Mayo (2011) examined the floor and ceiling effects of the Stroke Arm Ladder in patients with stroke and found no floor or ceiling effects, as no patients scored below or above the easiest and hardest items (respectively).

Note: This sample only included patients up to 7 months post-stroke and thus, the Stroke Arm Ladder should not be used for patients past 7 months post-stroke until further validation testing is completed.

Reliability

Internal consistency:
Higgins, Finch, Kopec and Mayo (2011) investigated the internal consistency of the Stroke Arm Ladder and found excellent internal consistency (Cronbach’s alpha = 0.97).

Test-retest:
Test-retest reliability has not been examined.

Intra-rater:
Intra-rater reliability has not been examined.

Inter-rater:
Inter-rater reliability has not been examined.

Validity

Content:

Higgins, Finch, Kopec and Mayo (2011) investigated the content validity of the Stroke Arm Ladder in clients with stroke. In the development of the Stroke Arm Ladder, 49 items were selected from validated tests and indices used to assess upper extremity function and movement, such as the Box and Block Test. Fifteen items were deleted for reasons such as redundancy and lack of fit to the model. When validating the 34 items selected for the final version of the measure, all patients with stroke had fit residuals between -2.0 and +2.0. The hierarchical sequencing of the items was confirmed using Rasch analysis. The results from this study suggest that all 34 items in the Stroke Arm Ladder reflect the same construct.

Criterion:

Concurrent:
Concurrent validity has not been examined.

Predictive:
Predictive validity has not been examined.

Construct:

Convergent/Discriminant:
Higgins, Finch, Kopec and Mayo (2011) investigated the convergent validity of the Stroke Arm Ladder by comparing it to the index of global functional recovery (total score on the Stroke Rehabilitation Assessment of Movement). Excellent correlation was found between the two measures (r=0.6, P<0.0001). The authors also reviewed the correlation between the Stroke Arm Ladder and the mental and emotional subsets of the Medical Outcomes Study Short Form 36 (SF-36), and found poor correlation (r=0.2, P<0.0001). Results from this study indicate that the Stroke Arm Ladder adequately measures the construct of upper extremity function, with limited ability to assess mental and emotional status following stroke, as intended by the developers.

Known Groups:
Higgins, Finch, Kopec and Mayo (2011) examined known groups validity of the Stroke Arm Ladder in patients with stroke. Patients with stroke were classified as having mild, mild-moderate, moderate or severe stroke using the Canadian Neurological Scale (CNS). Results revealed that the Stroke Arm Ladder was able to differentiate two out of four different levels of stroke severity: mild and severe. Patients classified as having either moderate or severe stroke scored similarly on the measure, as did patients classified as having mild and mild-moderate stroke. Patients classified as having moderate or severe stroke differed significantly from those classified as having mild or mild-moderate stroke, indicating the ability of the Stroke Arm Ladder to differentiate between the two extremes (mild versus severe).

Sensitivity/specificity:

Sensitivity and specificity have not been examined.

Responsiveness

Responsiveness has not been examined.

References

  • Higgins, J., Finch, L.E., Kopec, J. & Mayo, N.E. (2011). Development and initial psychometric evaluation of the Stroke Arm Ladder: A measure of upper extremity function post stroke. Clinical Rehabilitation, 25(8), 740-759.


Stroke Impact Scale (SIS)

Evidence Reviewed as of before: 29-06-2018
Author(s)*: Lisa Zeltzer, MSc OT; Katherine Salter, BA; Annabel McDermott
Editor(s): Nicol Korner-Bitensky, PhD OT; Elissa Sitcoff, BA BSc

Purpose

The Stroke Impact Scale (SIS) is a stroke-specific, self-report, health status measure. It was designed to assess multidimensional stroke outcomes, including strength, hand function, Activities of Daily Living / Instrumental Activities of Daily Living (ADL/IADL), mobility, communication, emotion, memory and thinking, and participation. The SIS can be used both in clinical and in research settings.

In-Depth Review

Purpose of the measure

The Stroke Impact Scale (SIS) is a stroke-specific, self-report, health status measure. It was designed to assess multidimensional stroke outcomes, including strength, hand function, Activities of Daily Living / Instrumental Activities of Daily Living (ADL/IADL), mobility, communication, emotion, memory and thinking, and participation. The SIS can be used both in clinical and research settings.

Available versions

The Stroke Impact Scale was developed at the Landon Center on Aging, University of Kansas Medical Center. The scale was first published as version 2.0 by Duncan, Wallace, Lai, Johnson, Embretson, and Laster in 1999. Version 2.0 of the SIS is comprised of 64 items in 8 domains (Strength, Hand function, Activities of Daily Living (ADL) / Instrumental ADL, Mobility, Communication, Emotion, Memory and thinking, Participation). Based on the results of a Rasch analysis process, 5 items were removed from version 2.0 to create the current version 3.0 (Duncan, Bode, Lai, & Perera, 2003b).

Features of the measure

Items:

The SIS version 3.0 includes 59 items and assesses 8 domains:

  • Strength – 4 items
  • Hand function – 5 items
  • ADL/IADL – 10 items
  • Mobility – 9 items
  • Communication – 7 items
  • Emotion – 9 items
  • Memory and thinking – 7 items
  • Participation/Role function – 8 items

An extra question on stroke recovery asks the client to rate, on a scale from 0 to 100, how much he/she feels that he/she has recovered from his/her stroke.


Instructions on item administration:

Prior to administering the SIS, the purpose statement must be read as written below. It is important to tell the respondent that the information is based on his/her point of view.

Purpose statement:
“The purpose of this questionnaire is to evaluate how stroke has impacted your health and life. We want to know from your point of view how stroke has affected you. We will ask you questions about impairments and disabilities caused by your stroke, as well as how stroke has affected your quality of life. Finally, we will ask you to rate how much you think you have recovered from your stroke”.

Response sheets in large print should be provided with the instrument, so that the respondent may see, as well as hear, the choice of responses for each question. The respondent may either answer with the number or the text associated with the number (eg. “5” or “Not difficult at all”) for an individual question. If the respondent uses the number, it is important for the interviewer to verify the answer by stating the corresponding text response. The interviewer should display the sheet appropriate for that particular set of questions, and after each question must read all five choices.

Questions are listed in sections, or domains, with a general description of the type of questions that will follow (e.g. “These questions are about the physical problems which may have occurred as a result of your stroke”). Each group of questions is then given a statement with a reference to a specific time period (e.g. “In the past week how would you rate the strength of your…”). The statement must be repeated before each individual question. Within the measure the time period changes from one week, to two weeks, to four weeks. It is therefore important to emphasize the change in the time period being assessed for the specific group of questions.

Scoring:

The SIS is a patient-based, self-report questionnaire. Each item is rated using a 5-point Likert scale. The patient rates his/her difficulty completing each item, where:

  • 1 = an inability to complete the item
  • 5 = no difficulty experienced at all.

Note: Scores for three items in the Emotion domain (3f, 3h, 3i) must be reversed before calculating the Emotion domain score (i.e. 1 » 5, 2 » 4, 3 = 3, 4 » 2, 5 » 1). The SIS scoring database (see link below) takes this change of direction into account when scoring. When scoring manually, use the following equation to compute the item score for 3f, 3h and 3i: Item score = 6 – individual’s rating.

A final single-item Recovery domain assesses the individual’s perception of his/her recovery from stroke, measured in the form of a visual analogue scale from 0-100, where:

  • 0 = no recovery
  • 100 = full recovery.

Domain scores range from 0-100 and are calculated using the following equation:

  • Domain score = [(Mean item score – 1) / (5 – 1)] x 100

Scores are interpreted by generating a summative score for each domain using an algorithm equivalent to that used in the SF-36 (Ware & Sherbourne, 1992).
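The manual scoring steps described above (reversing Emotion items 3f, 3h and 3i, then transforming the mean item score to a 0-100 domain score) can be sketched in Python. This is a minimal illustration, not the official SIS scoring database: the function names and the item labels used for the Strength example are assumptions made for this sketch, and missing responses are not handled.

```python
from statistics import mean

# Emotion items that are reverse-scored, per the scoring note above.
REVERSED_EMOTION_ITEMS = frozenset({"3f", "3h", "3i"})


def reverse_item(rating: int) -> int:
    """Reverse a 1-5 rating: 1 -> 5, 2 -> 4, 3 -> 3, 4 -> 2, 5 -> 1."""
    return 6 - rating


def domain_score(ratings: dict, reversed_items=frozenset()) -> float:
    """Transform the mean 1-5 item rating of one domain to a 0-100 score.

    `ratings` maps an item label to the respondent's 1-5 rating.
    Items listed in `reversed_items` are reversed before averaging.
    """
    adjusted = []
    for item, rating in ratings.items():
        if not 1 <= rating <= 5:
            raise ValueError(f"rating for item {item!r} must be between 1 and 5")
        adjusted.append(reverse_item(rating) if item in reversed_items else rating)
    return (mean(adjusted) - 1) / (5 - 1) * 100


# Hypothetical Strength ratings (item labels are illustrative only).
strength = {"1a": 3, "1b": 4, "1c": 2, "1d": 3}
print(domain_score(strength))  # 50.0

# Hypothetical Emotion ratings: the three reversed items are rated 1,
# which becomes 5 after reversal, so the domain score is 100.
emotion = {"3a": 5, "3b": 5, "3c": 5, "3d": 5, "3e": 5,
           "3f": 1, "3g": 5, "3h": 1, "3i": 1}
print(domain_score(emotion, REVERSED_EMOTION_ITEMS))  # 100.0
```

A mean item score of 1 maps to a domain score of 0 and a mean item score of 5 maps to 100, which matches the stated 0-100 range.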

See http://www.kumc.edu/school-of-medicine/preventive-medicine-and-public-health/research-and-community-engagement/stroke-impact-scale/instructions.html to download the scoring database.

Time:

The SIS is reported to take approximately 15-20 minutes to administer (Finch, Brooks, Stratford, & Mayo, 2002).

Subscales:

The SIS 3.0 is comprised of 8 subscales or ‘Domains’:

  1. Strength
  2. Hand function
  3. ADL/IADL
  4. Mobility
  5. Communication
  6. Emotion
  7. Memory and thinking
  8. Participation

A final single-item domain measures perceived recovery since stroke onset.

Equipment:

Only the scale and a pencil are needed.

Training:

The SIS 3.0 requires no formal training for administration (Mulder & Nijland, 2016). Instructions for administration of the SIS 3.0 are available online through the University of Kansas Medical Center SIS information page.

Alternative forms of the SIS

SIS-16 (Duncan et al., 2003a).

Duncan et al. (2003a) developed the SIS-16 to address the lack of sensitivity to differences in physical functioning in functional measures of stroke outcome. Factor analysis of the SIS 2.0 revealed that the four physical domains (Strength, Hand function, ADL/IADL, Mobility) are highly correlated and can be summed together to create a single physical dimension score (Duncan et al., 1999; Mulder & Nijland, 2016). Accordingly, the SIS-16 consists of 16 items from the SIS 2.0:

  1. ADL/IADL – 7 items
  2. Mobility – 8 items
  3. Hand Function – 1 item.

All other domains should remain separate (Duncan et al., 1999).

SF-SIS (Jenkinson et al., 2013).

Jenkinson et al. (2013) developed a modified short form of the SIS (SF-SIS) comprised of eight items. The developers selected the one item from each domain that correlated most highly with the total domain score, through three methods: initial pilot research, validation analysis and a focus group. The final SF-SIS comprised those items that were selected by at least 2 of the 3 methods. The SF-SIS was evaluated for face validity and acceptability within a focus group of patients from acute and rehabilitation stroke settings and with multidisciplinary stroke healthcare staff. The SF-SIS has also been evaluated for content, convergent and discriminant validity (MacIsaac et al., 2016).

Client suitability

Can be used with:

  • The SIS can only be administered to patients with stroke.
  • The SIS 3.0 and SIS-16 can be completed by telephone, mail administration, by proxy, and by proxy mail administration (Duncan et al., 2002a; Duncan et al., 2002b; Kwon et al., 2006). Studies have shown potential proxy bias for physical domains (Mulder & Nijland, 2016). It is recommended that possible responder bias and the inherent difficulties of proxy use be weighed against the economic advantages of a mailed survey when considering these methods of administration.

Should not be used with:

  • The SIS version 2.0 should be used with caution in individuals with mild impairment as items in the Communication, Memory and Emotion domains are considered easy and only capture limitations in the most impaired individuals (Duncan et al., 2003).
  • Respondents must be able to follow a 3-step command (Sullivan, 2014).
  • Time taken to administer the SIS is a limitation for individuals with difficulties with concentration, attention or fatigue following stroke (MacIsaac et al., 2016).

In what languages is the measure available?

The SIS was originally developed in English.

Cultural adaptations, translations and psychometric testing have also been conducted in the following languages:

  • Brazilian (Carod-Artal et al., 2008)
  • French (Cael et al., 2015)
  • German (Geyh, Cieza & Stucki, 2009)
  • Italian (Vellone et al., 2010; Vellone et al., 2015)
  • Japanese (Ochi et al., 2017)
  • Korean (Choi et al., 2017; Lee & Song, 2015)
  • Nigerian (Hausa) (Hamza et al., 2012; Hamza et al., 2014)
  • Portuguese (Goncalves et al., 2012; Brandao et al., 2018)
  • Ugandan (Kamwesiga et al., 2016)
  • United Kingdom (Jenkinson et al., 2013)

The MAPI Research Institute has translated the SIS and/or SIS-16 into numerous languages including Afrikaans, Arabic, Bulgarian, Cantonese, Czech, Danish, Dutch, Farsi, Finnish, French, German, Greek, Hebrew, Hungarian, Icelandic, Italian, Japanese, Korean, Malay, Mandarin, Norwegian, Portuguese, Russian, Slovak, Spanish, Swedish, Tagalog, Thai and Turkish. Translations may not be validated.

Summary

What does the tool measure? Multidimensional stroke outcomes, including strength, hand function, activities of daily living/instrumental activities of daily living, mobility, communication, emotion, memory, thinking and participation.
What types of clients can the tool be used for? Patients with stroke.
Is this a screening or assessment tool? Assessment
Time to administer: The SIS takes 15-20 minutes to administer.
Versions: SIS 2.0, SIS 3.0, SIS-16, SF-SIS.
Other Languages The SIS has been translated into several languages. Please click here to see a list of translations.
Measurement Properties
Reliability Internal consistency:
SIS 2.0:
Two studies reported excellent internal consistency; one study reported excellent internal consistency for 5/8 domains and adequate internal consistency for 3/8 domains.

SIS 3.0:
Two studies reported excellent internal consistency; one study reported excellent internal consistency for 6/8 domains and adequate internal consistency for 2/8 domains.

SIS-16:
One study reported good spread of item difficulty.

SF-SIS:
One study reported excellent internal consistency.

Test-retest:
SIS 2.0:
One study reported adequate to excellent test-retest reliability in all domains except for the Emotion domain.

Validity Criterion :
Concurrent:
SIS 2.0:
Excellent correlations with the Barthel Index, FMA, Instrumental Activities of Daily Living (IADL) Scale, Duke Mobility Scale and Geriatric Depression Scale; adequate to excellent correlations with the FIM; adequate correlations with the NIHSS and MMSE; and poor to excellent correlations with the SF-36.

SIS 3.0:
Excellent correlation between SIS Hand Function and MAL-QOM; excellent correlation between SIS ADL/IADL and FIM, Barthel Index, Lawton IADL Scale; excellent correlation between SIS Strength and Motricity Index; excellent correlation between SIS Mobility and Barthel Index; adequate to excellent correlation between SIS ADL/IADL and NEADL; adequate correlation between SIS Social Participation and SF-36 Social Functioning, Lawton IADL scale; adequate correlation between SIS Memory domain and MMSE; poor to adequate correlations between remaining SIS domains and FIM, NEADL, FMA, MAL-AOU, MAL-QOM, FAI.

SIS-16:
Excellent correlation with the Barthel Index; adequate to excellent correlations with the STREAM total and subscale scores; adequate correlation with SF-36 Physical Functioning.

Predictive:
SIS 2.0:
Physical function, Emotion and Participation domains were statistically significant predictors of the patient’s own assessment of recovery; SIS scores were poor predictors of mean steps per day.

SIS 3.0:
Pre-treatment SIS scores were compared with outcome measures after 3 weeks of upper extremity rehabilitation: Hand function and ADL/IADL domains showed adequate to excellent correlations with FIM, FMA, MAL-AOU, MAL-QOM, FAI, and NEADL; other domains demonstrated poor to adequate correlations with outcome measures.

SIS-16:
– Admission scores show an excellent correlation with actual length of stay and an adequate correlation with predicted length of stay; there was a significant correlation with discharge destination (home/rehabilitation).
– The combination of early outcomes of MAL-QOM and SIS show high accuracy in predicting final QOL among patients with stroke.

Construct:
Convergent/Discriminant:
SIS 2.0:
Domains demonstrate adequate to excellent correlations with corresponding WHOQOL-BREF subscales and Zung’s Self-Rating Depression Scale; poor correlations between the SIS Communication domain and both WHOQOL-BREF and Zung’s Self-Rating Depression Scale; and a poor correlation between the SIS Physical domain and the WHOQOL Environment scores.

SIS 3.0:
Excellent correlations with the SF-SIS, EQ-5D, mRS, BI and NIHSS; moderate to excellent correlations with the EQ-VAS; and a moderate correlation with the SIS-VAS.

SIS 3.0 telephone survey:
Adequate to excellent correlations with the FIM and SF-36V.

SIS-16:
Adequate to excellent correlations with the WHOQOL-BREF Physical domain; poor correlation with the WHOQOL Social relationships domain.

SF-SIS:
Excellent correlations with the EQ-5D, mRS, BI and NIHSS; moderate to excellent correlations with the EQ-VAS; and a moderate correlation with the SIS-VAS.

Known groups:
SIS 2.0: Most domains can differentiate between patients with varying degrees of stroke severity.

SIS 3.0:
Physical and ADL/IADL domains showed score discrimination and distribution for different degrees of stroke severity.

SIS-16:
Can discriminate between patients of varying degrees of stroke severity.

Floor/Ceiling Effects Three studies have examined floor/ceiling effects of the SIS.

SIS 2.0:
Two studies reported the potential for floor effects in the domain of Hand function among patients with moderate stroke severity, and a potential for ceiling effects in the Communication, Memory and Emotion domains.

SIS 3.0:
One study reported minimal floor and ceiling effects for the Social participation domain; one study reported ceiling effects for the Hand function, Memory and thinking, Communication, Mobility and ADL/IADL domains over time.

SIS-16:
One study reported no floor effects and minimal ceiling effects.

Does the tool detect change in patients? Five studies have investigated responsiveness of the SIS.

SIS 2.0:
One study reported significant change in patients’ recovery in the expected direction between assessments at 1 and 3 months, and at 1 and 6 months post-stroke; however, sensitivity to change was affected by stroke severity and time of post-stroke assessment.

SIS 3.0:
– One study determined change scores for a clinically important difference (CID) in four subscales: Strength, ADL/IADL, Mobility and Hand function. The minimal detectable change (MDC) was 24.0, 17.3, 15.1 and 25.9 (respectively); minimal CID was 9.2, 5.9, 4.5 and 17.8 (respectively).
– One study reported medium responsiveness for Hand function, Stroke recovery and SIS total score; other domains showed small responsiveness.
– One study found Participation and Recovery from stroke were the most responsive domains over the first year post-stroke; Strength and Hand function domains also showed high clinically meaningful positive/negative change.

SIS-16:
One study reported change scores of 23.1 indicated statistically significant improvement from admission to discharge, and sensitivity to change was large.

Acceptability – SIS 3.0 and SIS-16 are available in proxy version. The patient-centred nature of the scale’s development may enhance its relevance to patients and assessment across multiple levels may reduce patient burden.
– Time taken to administer the SIS has been identified as a limitation.
– The SIS 2.0 should be used with caution in individuals with mild impairment as some domains only capture limitations in the most impaired individuals.
Feasibility – The SIS is a patient-based self-report scale that takes 15-20 minutes to administer.
– The SIS can be administered in person or by proxy, by mail or telephone.
– The SIS does not require any formal training.
– Instructions for administration of the SIS 3.0 are available online.
How to obtain the tool?

Please click here to see a copy of the SIS.

Psychometric Properties

Overview

We conducted a literature search to identify relevant publications on the psychometric properties of the SIS. Seventeen studies were included. Studies included in this review are specific to the original English versions of the SIS version 2.0, SIS 3.0 or SIS-16.

Floor/Ceiling Effects

Duncan et al. (1999) found that SIS version 2.0 showed the potential for floor effects in the Hand function domain in the moderate stroke group (40.2%) and a possible ceiling effect in the Communication domain for both the mild (35.4%) and moderate (25.7%) stroke groups. The highest percentage of ceiling effects for the SIS was for the Communication domain (35%) compared with a 64.6% ceiling rate for the Barthel Index (Mahoney & Barthel, 1965).

Duncan et al. (2003b) conducted a Rasch analysis which confirmed these two effects observed in Duncan et al. (1999) – a floor effect in the SIS Hand function domain and a ceiling effect in the Communication domain. A ceiling effect in the Memory and Emotion domains was also reported.

Lai et al. (2003) examined floor/ceiling effects of the SIS-16 and SIS Social Participation domain in a sample of 278 patients at 3 months post-stroke. The authors reported floor/ceiling effects of 0% and 4% (respectively) for the SIS-16, and 1% and 5% (respectively) for the SIS Social Participation domain.

Richardson et al. (2016) examined floor/ceiling effects of the SIS 3.0 in a sample of 164 patients with subacute stroke. Measures were taken at three timepoints: on admission to the study and at 6-month and 12-month follow-up (n=164, 108, 37 respectively). Poor ceiling effects (>20%) were seen for the Hand function domain at baseline, 6 months and 12 months (25.0%, 36.4%, 37.8%, respectively); the Memory and thinking domain at 6 months and 12 months (22.2%, 21.6%, respectively); the Communication domain at 6 months and 12 months (30.6%, 27%, respectively); the Mobility domain at 6 months (20.4%); and the ADL/IADL domain at 12 months (21.6%). There were no significant floor effects at any timepoint.
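The percentages reported above can be illustrated with a short Python sketch. It assumes the conventional definition of floor and ceiling effects (the proportion of respondents scoring the lowest or highest possible domain score); the source does not spell out the definition, and the function name and sample scores below are hypothetical. The 20% cut-off is the one quoted for Richardson et al. (2016).

```python
def floor_ceiling_percentages(scores, minimum=0.0, maximum=100.0):
    """Return (floor %, ceiling %) for a list of 0-100 domain scores.

    Floor = share of respondents at the minimum possible score;
    ceiling = share of respondents at the maximum possible score.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    floor_pct = 100 * sum(s == minimum for s in scores) / n
    ceiling_pct = 100 * sum(s == maximum for s in scores) / n
    return floor_pct, ceiling_pct


# Invented domain scores for 10 respondents (illustration only).
sample = [100, 100, 100, 75, 50, 0, 87.5, 62.5, 100, 25]
floor_pct, ceiling_pct = floor_ceiling_percentages(sample)
print(floor_pct, ceiling_pct)  # 10.0 40.0

# Flag an effect using the >20% threshold cited above.
print(ceiling_pct > 20)  # True
```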

Reliability

Internal consistency:
Duncan et al. (1999) examined internal consistency of the SIS version 2.0 using Cronbach’s alpha coefficients and reported excellent internal consistency for each of the 8 domains (ranging from α=0.83 to 0.90).
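For readers unfamiliar with the statistic, Cronbach’s alpha for a domain with k items is k/(k – 1) multiplied by (1 – sum of the item variances / variance of the total score). The Python sketch below implements this standard formula; it is not taken from any of the cited studies, and the function name and ratings are invented for illustration.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for one domain.

    `item_scores` holds one list per item; each list contains one rating
    per respondent, in the same respondent order.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("at least two items are required")
    n = len(item_scores[0])
    if n < 2 or any(len(item) != n for item in item_scores):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    # Total score per respondent across all items.
    totals = [sum(ratings) for ratings in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Invented 1-5 ratings: 4 items (rows) answered by 6 respondents (columns).
ratings = [
    [1, 2, 3, 4, 5, 3],
    [2, 2, 3, 5, 5, 3],
    [1, 3, 3, 4, 4, 2],
    [2, 2, 4, 4, 5, 3],
]
print(round(cronbach_alpha(ratings), 2))
```

The function divides by the variance of the total scores, so it raises an error if every respondent has the same total.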

Duncan et al. (2003b) examined reliability of the SIS version 2.0 by Rasch analysis. Item separation reliability is the ratio of the “true” (observed minus error) variance to the obtained variation. The smaller the error, the higher the ratio will be. It ranges from 0.00 to 1.00 and is interpreted the same as the Cronbach’s alpha. Item separation reliability of the SIS version 2.0 ranged from 0.93-1.00. A separation index > 2.00 is equivalent to a Cronbach’s alpha of 0.80 or greater (excellent). In this study, 5 out of 8 domains had a separation index that exceeded 2.00 (in addition to the composite physical domain). The values for the Emotion and Communication domains were only in the adequate range because of the ceiling effect in those domains and those for the Hand function domain were only adequate because of the floor effect in that domain.
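The stated equivalence between a separation index of 2.00 and a reliability of 0.80 follows from the relationship commonly used in Rasch measurement, reliability = G² / (1 + G²), where G is the separation index. The source does not give this formula, so the Python sketch below should be read as an assumption-based illustration; the function names are hypothetical.

```python
import math


def reliability_from_separation(g: float) -> float:
    """Convert a Rasch separation index G to a 0-1 reliability coefficient."""
    if g < 0:
        raise ValueError("separation index must be non-negative")
    return g ** 2 / (1 + g ** 2)


def separation_from_reliability(r: float) -> float:
    """Convert a reliability coefficient (0 <= r < 1) to a separation index."""
    if not 0 <= r < 1:
        raise ValueError("reliability must be in the range [0, 1)")
    return math.sqrt(r / (1 - r))


print(reliability_from_separation(2.0))  # 0.8
print(round(separation_from_reliability(0.93), 2))
```

Under this relationship, the reported item separation reliabilities of 0.93 and above correspond to separation indices well above 2.00.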

Edwards and O’Connell (2003) administered the SIS version 2.0 to 74 patients with stroke and reported excellent internal consistency (ranging from α=0.87 for participation to α=0.95 for hand function). The percentage of item-domain correlations >0.40 was 100% for all domains except emotion and ADL/IADL. In the ADL/IADL scale, one item (cutting food) was more closely associated with hand function than ADL/IADL.

Lai et al. (2003) examined reliability of the SIS-16 and SIS Social Participation domain in a sample of 278 patients at 3 months post-stroke. Both the SIS-16 and SIS Social Participation domain showed good spread of item difficulty, with easier items that are able to measure lower levels of physical functioning in patients with severe stroke.

Jenkinson et al. (2013) examined internal consistency of the SIS 3.0 and the SF-SIS among individuals with stroke (n=73, 151 respectively), using Cronbach’s alpha. Internal consistency of the SIS 3.0 was excellent for all domains (α=0.86 to 0.96). Higher order factor analysis of the SIS 3.0 showed one factor with an eigenvalue > 1 that accounted for 68.76% of the variance. Each dimension of the SIS 3.0 loaded on this factor (eigenvalue = 5.5). Internal consistency of the SF-SIS was high (α=0.89). Factor analysis of the SF-SIS similarly showed one factor that accounted for 57.25% of the variance.

Richardson et al. (2016) examined internal consistency of the SIS 3.0 in a sample of 164 patients with subacute stroke, using Cronbach’s alpha. Internal consistency was measured at three timepoints: on admission to the study and at 6-month and 12-month follow-up. Internal consistency of all domains was excellent at all timepoints (α=0.81 to 0.97). Internal consistency of the composite Physical Functioning score was also excellent at all timepoints (α=0.95 to 0.97).

MacIsaac et al. (2016) examined internal consistency of the SIS 3.0 in a sample of 5549 individuals in an acute stroke setting and 332 individuals in a stroke rehabilitation setting, using Cronbach’s alpha. Internal consistency was excellent within both acute and rehabilitation data sets (α=0.98, 0.93 respectively). Internal consistency of individual domains was excellent for both acute and rehabilitation data sets, except for the Emotion domain (α=0.60, 0.63 respectively) and the Strength domain (α=0.77, rehabilitation data set only).

Test-retest:
Duncan et al. (1999) examined test-retest reliability of the SIS version 2.0 in 25 patients who were administered the SIS at 3 or 6 months post stroke and again one week later. Test-retest was calculated using intraclass correlation coefficients (ICC), which ranged from adequate to excellent (ICC=0.7 to 0.92) with the exception of the Emotion domain, which had only a poor correlation (ICC=0.57).

Validity

Content:

Development of the SIS was based on a study at the Landon Center on Aging, University of Kansas Medical Center (Duncan, Wallace, Studenski, Lai, & Johnson, 2001) using feedback from individual interviews with patients and focus group interviews with patients, caregivers, and health care professionals. Participants included 30 individuals with mild and moderate stroke, 23 caregivers, and 9 stroke experts. Qualitative analysis of the individual and focus group interviews generated a list of potential items. Consensus panels reviewed the potential items, established domains for the measure, developed item scales, and decided on mechanisms for administration and scoring.

Criterion:

Concurrent:
Duncan et al. (1999) examined concurrent validity of the SIS by comparison with the Barthel Index, Functional Independence Measure (FIM), Fugl-Meyer Assessment (FMA), Mini-Mental State Examination (MMSE), National Institute of Health Stroke Scale (NIHSS), Medical Outcomes Study Short Form 36 (SF-36), Lawton Instrumental Activities of Daily Living (IADL) Scale, Duke Mobility Scale and Geriatric Depression Scale. The following results were found for each domain of the SIS:

SIS Domain | Comparative Measure | Correlation | Rating
Hand function | FMA – Upper Extremity Motor | r = 0.81 | Excellent
Mobility | FIM Motor | r = 0.83 | Excellent
Mobility | Barthel Index | r = 0.82 | Excellent
Mobility | Duke Mobility Scale | r = 0.83 | Excellent
Mobility | SF-36 Physical Functioning | r = 0.84 | Excellent
Strength | NIHSS Motor | r = -0.59 | Adequate
Strength | FMA Total | r = 0.72 | Excellent
ADL/IADL | Barthel Index | r = 0.84 | Excellent
ADL/IADL | FIM Motor | r = 0.84 | Excellent
ADL/IADL | Lawton IADL Scale | r = 0.82 | Excellent
Memory | MMSE | r = 0.58 | Adequate
Communication | FIM Social/Cognition | r = 0.53 | Adequate
Communication | NIHSS Language | r = -0.44 | Adequate
Emotion | Geriatric Depression Scale | r = -0.77 | Excellent
Emotion | SF-36 Mental Health | r = 0.74 | Excellent
Participation | SF-36 Emotional Role | r = 0.28 | Poor
Participation | SF-36 Physical Role | r = 0.45 | Adequate
Participation | SF-36 Social Functioning | r = 0.70 | Excellent
Physical | Barthel Index | r = 0.76 | Excellent
Physical | FIM Motor | r = 0.79 | Excellent
Physical | SF-36 Physical Functioning | r = 0.75 | Excellent
Physical | Lawton IADL Scale | r = 0.73 | Excellent

Duncan et al. (2002a) examined concurrent validity of the SIS version 3.0 and SIS-16 using Pearson correlations. The SIS was correlated with the Mini-Mental State Examination (MMSE), Barthel Index, Lawton IADL Scale and the Motricity Index. The SIS ADL/IADL domain showed an excellent correlation with the Barthel Index (r=0.72) and with the Lawton IADL Scale (r=0.77). The SIS Mobility domain showed an excellent correlation with the Barthel Index (r=0.69). The SIS Strength domain showed an excellent correlation with the Motricity Index (r=0.67). The SIS Memory domain showed an adequate correlation with the MMSE (r=0.42).

Lai et al. (2003) examined concurrent validity of the SIS-16 and SIS Social Participation domain by comparison with the SF-36 Physical Functioning and Social Functioning subscales, Barthel Index and Lawton IADL Scale, using Pearson correlation coefficients. Measures were administered to 278 patients with stroke at 3 months post-stroke. There was an adequate correlation between SIS-16 and SF-36 Physical Functioning (r=0.79), and an adequate correlation between SIS Social Participation and SF-36 Social Functioning (r=0.65). There was an excellent correlation between SIS-16 and the Barthel Index at 3 months post-stroke (r=0.75), and an adequate correlation between SIS Social Participation and Lawton IADL Scale at 3 months post-stroke (r=0.47).

Lin et al. (2010a) examined concurrent validity of the SIS version 3.0 by comparison with the Fugl-Meyer Assessment (FMA), Motor Activity Log – Amount of Use and – Quality of Movement (MAL-AOU, MAL-QOM), Functional Independence Measure (FIM), Frenchay Activities Index (FAI) and Nottingham Extended Activities of Daily Living Scale (NEADL). Concurrent validity was measured using Spearman correlation coefficients prior to and on completion of a 3-week intervention period. SIS Hand Function showed an excellent correlation with MAL-QOM at pre-treatment and post-treatment (r=0.65, 0.68, respectively, p<0.01), and adequate correlations with all other measures (FMA, MAL-AOU, FIM, FAI, NEADL). SIS ADL/IADL showed an excellent correlation with the FIM at pre-treatment and post-treatment (r=0.69, 0.75, respectively, p<0.01). Correlations between SIS ADL/IADL and the NEADL were adequate at pre-treatment (r=0.54, p<0.01) and excellent at post-treatment (r=0.62, p<0.01). Correlations between the SIS ADL-IADL and all other measures (FMA, MAL-AOU, MAL-QOM, FAI) were adequate at pre-treatment and post-treatment. Other SIS domains demonstrated poor to adequate correlations with comparison measures.

Ward et al. (2011) examined concurrent validity of the SIS-16 by comparison with the Stroke Rehabilitation Assessment of Movement (STREAM), using Spearman correlations. Measures were administered to 30 patients with acute stroke on admission to and discharge from an acute rehabilitation setting. Correlations between the SIS-16 and STREAM total and subscale scores were adequate to excellent on admission (STREAM total r=0.7073; STREAM subtests r=0.5992 to 0.6451, p<0.0005) and discharge (STREAM total r=0.7153; STREAM subtests r=0.5499 to 0.7985, p<0.0002).

Richardson et al. (2016) examined concurrent validity of the SIS 3.0 by comparison with the 5-level EuroQol 5D (EQ-5D-5L), using Pearson correlation coefficients. Measures were administered to patients with subacute stroke on admission to the study and at 6-month and 12-month follow-up (n=164, 108, 37, respectively). At admission, correlations with the EQ-5D-5L were excellent for the ADL (r=0.663) and Hand function (r=0.618) domains and Physical composite score (r=0.71); correlations with other domains were adequate (r=0.318 to 0.588), except for the Communication domain (r=0.228). At 6-month follow-up, correlations with the EQ-5D-5L were excellent for the Strength (r=0.628), ADL (r=0.684), Mobility (r=0.765), Hand function (r=0.668), Participation (r=0.740) and Recovery domains (r=0.601) and Physical composite score (r=0.772); correlations with other domains were adequate (r=0.402 to 0.562). At 12-month follow-up, correlations with the EQ-5D-5L were excellent for the Strength (r=0.604), ADL (r=0.760), Mobility (r=0.683) and Participation (r=0.738) domains and the Physical composite score (r=0.756); correlations with other domains were adequate (r=0.364 to 0.592).

Predictive:
Duncan et al. (1999) examined which domain scores of the SIS version 2.0 could most accurately predict a patient’s own assessment of stroke recovery, using multiple regression analysis. The SIS domains of Physical function, Emotion, and Participation were found to be statistically significant predictors of the patient’s assessment of recovery. Forty-five percent of the variance in the patient’s assessment of percentage of recovery was explained by these factors.

Fulk, Reynolds, Mondal & Deutsch (2010) examined the predictive validity of the 6MWT and other widely used clinical measures (FMA-LE, self-selected gait-speed, SIS and BBS) in 19 patients with stroke. The SIS was found to be a poor predictor of mean steps per day (r=0.18, p=0.471). Although gait speed and balance were related to walking activity, only the 6MWT was found to be a predictor of community ambulation in patients with stroke.

Huang et al. (2010) examined change in quality of life after distributed constraint-induced movement therapy (CIMT) in a sample of 58 patients with chronic stroke, using CHAID analysis. Predictors of change included age, gender, side of lesion, time since stroke, cognitive status (measured by the MMSE), upper extremity motor impairment (measured by the FMA-UE) and independence in activities of daily living (measured by the FIM). Initial FIM scores were the strongest predictor of overall SIS score (p=0.006) and ADL/IADL domain score (p=0.004) at post-treatment. Participants with FIM scores ≤ 109 showed significantly greater improvement in overall SIS scores than participants with FIM scores > 109. There were no significant associations between other SIS domains and other predictors.

Lin et al. (2010a) examined predictive validity of the SIS version 3.0 by comparing pre-treatment SIS scores with post-treatment scores of the Fugl-Meyer Assessment (FMA), Motor Activity Log – Amount of Use and – Quality of Movement (MAL-AOU, MAL-QOM), Functional Independence Measure (FIM), Frenchay Activities Index (FAI) and Nottingham Extended Activities of Daily Living Scale (NEADL). Predictive validity was measured using Spearman correlation coefficients prior to and on completion of a 3-week intervention period. The SIS Hand Function showed excellent correlations with MAL-AOU (r=0.61, p<0.01) and MAL-QOM (r=0.66, p<0.01), and adequate correlations with all other measures (FMA, FIM, FAI, NEADL). The SIS ADL/IADL showed an excellent correlation with the FIM (r=0.70, p<0.01), and adequate correlations with all other measures (FMA, MAL-AOU, MAL-QOM, FAI, NEADL). Other SIS domains demonstrated poor to adequate correlations with comparison measures.

Ward et al. (2011) examined predictive validity of the SIS-16 and other clinical measures (STREAM, FIM) in a sample of 30 patients in an acute rehabilitation setting, using Spearman rho coefficients and Wilcoxon rank-sum tests. Results indicated an adequate correlation between SIS-16 admission scores and predicted length of stay (rho=-0.6743, p<0.001) and an excellent correlation between SIS-16 admission scores and actual length of stay (rho=-0.7953, p<0.001). There was a significant correlation with discharge destination (p<0.05).

Lee et al. (2016) developed a computational method to predict quality of life after stroke rehabilitation, using Particle Swarm-Optimized Support Vector Machine (PSO-SVM) classifier. A sample of 130 patients with subacute/chronic stroke received occupational therapy for 1.5-2 hours/day, 5 days/week for 3-4 weeks. Predictors of outcome included 5 personal parameters (age, gender, time since stroke onset, education, MMSE score) and 9 early functional outcomes (Fugl-Meyer Assessment, Wolf Motor Function Test, Action Research Arm Test, Functional Independence Measure, Motor Activity Log – Amount of Use (MAL-AOU) and – Quality of Movement (MAL-QOM), ABILHAND, physical function, SIS). The combination of early outcomes of MAL-QOM and SIS showed highest accuracy (70%) and highest cross-validated accuracy (81.43%) in predicting final QOL among patients with stroke. SIS alone showed high accuracy (60%) and cross-validated accuracy (81.43%).

Construct:

Duncan et al. (2003b) performed a Rasch analysis on version 2.0 of the SIS. For measures that have been developed using a conceptual hierarchy of items, the theoretical ordering can be compared with the empirical ordering produced by the Rasch analysis as evidence of the construct validity of the measure. In this study, the expectation regarding the theoretical ordering of task difficulty was consistent with the empirical ordering of the items by difficulty for each domain, providing evidence for the construct validity of the SIS.

Convergent/Discriminant:
Edwards and O’Connell (2003) examined discriminant validity of the SIS version 2.0 and SIS-16 in a sample of 74 patients with stroke, by comparison with the World Health Organization Quality of Life Bref-Scale (WHOQOL-BREF) and Zung’s Self-Rating Depression Scale (ZSRDS). There were adequate to excellent correlations between the SIS-16 and the WHOQOL-BREF Physical domain (r=0.40 to 0.63); correlations with the WHOQOL-BREF Social relationships domain were poor (r=0.13 to 0.18). There were adequate to excellent correlations between the SIS Participation domain and all WHOQOL-BREF domains (r=0.45 to 0.69). The correlation between the SIS Participation domain and the WHOQOL-BREF Physical domain was excellent (r=0.69). The SIS Participation domain demonstrated an adequate correlation with the ZSRDS (r=-0.56). There were adequate correlations between the SIS Memory and Emotion domains and the WHOQOL-BREF Psychological domain (r=0.49, 0.70, respectively) and between the SIS Memory and Emotion domains and the ZSRDS (r=-0.38, -0.62, respectively). There was a poor correlation between the SIS Physical domain and the WHOQOL-BREF Environment scores (r=0.15). Neither the ZSRDS nor the WHOQOL-BREF assesses communication; accordingly, both measures demonstrated poor correlations with the SIS Communication domain (ZSRDS: r=-0.28; WHOQOL-BREF: r=0.11 to 0.28).
Note: Some correlations are negative because a high score on the SIS indicates normal performance whereas a high score on other measures indicates impairment.

Jenkinson et al. (2013) examined convergent validity of the SIS version 3.0 and the SF-SIS in a sample of individuals with stroke (n=73, 151, respectively) by comparison with the EuroQoL EQ-5D, using Spearman’s correlation coefficient. The SIS and SF-SIS demonstrated identical excellent correlations with the EQ-5D (r=0.83).

MacIsaac et al. (2016) examined convergent validity of the SIS 3.0 and the SF-SIS in a sample of 5549 patients in an acute stroke setting and 332 patients in a stroke rehabilitation setting, using Spearman’s correlation coefficient (ρ). Convergent validity was measured by comparison with the SIS-VAS, the patient-reported outcome measures EuroQoL EQ-5D and EQ-5D-VAS, and the functional measures Barthel Index (BI), modified Rankin Score (mRS), and National Institutes of Health Stroke Scale (NIHSS). Within acute data, the SIS and SF-SIS demonstrated significant excellent correlations with the mRS (ρ=-0.87, -0.80, respectively), the BI (ρ=0.89, 0.80), the NIHSS (ρ=-0.77, -0.73), the EQ-5D (ρ=0.88, 0.82) and the EQ-VAS (ρ=0.73, 0.72). Within rehabilitation data, the SIS and SF-SIS demonstrated excellent correlations with the BI (ρ=0.72, 0.65, respectively) and the EQ-5D (ρ=0.69, 0.69), and moderate correlations with the SIS-VAS (ρ=0.56, 0.57) and the EQ-VAS (ρ=0.46, 0.40). Correlations between the SIS and SF-SIS were excellent in the acute data (ρ=0.94) and rehabilitation data (ρ=0.96).

Kwon et al. (2006) examined convergent validity of the SIS 3.0 by telephone administration in a sample of 95 patients with stroke, using Pearson coefficients. Convergent validity was measured by comparison with the Functional Independence Measure (FIM) – Motor component (FIM-M) and – Cognitive component (FIM-C), and with the Medical Outcomes Study Short Form 36 for veterans (SF-36V). Patients were administered the SIS at 12 weeks post-stroke and the FIM and SF-36V at 16 weeks post-stroke. The SIS 3.0 telephone survey showed adequate to excellent correlations with the FIM (r=0.404 to 0.858, p<0.001) and SF-36V (r=0.362 to 0.768, p<0.001).

Known groups:
Duncan et al. (1999) found that all domains of the SIS version 2.0, with the exception of the Memory/thinking and Emotion domains, were able to discriminate between patients across 4 Rankin levels of stroke severity (p<0.0001, except for the Communication domain, p=0.02). These results suggest that scores from most domains of the SIS can differentiate between patients based on stroke severity.

Lai et al. (2003) administered the SIS and SF-36 to 278 patients with stroke 90 days after stroke. The SIS-16 was able to discriminate among the Modified Rankin Scale (MRS) levels of 0 to 1, 2, 3, and 4. The SIS Participation domain was also able to discriminate across the MRS levels of 0 to 1, 2, and 3 to 4. These results suggest that the SIS can discriminate between patients of varying degrees of stroke severity.

Kwon et al. (2006) administered the SIS 3.0 by telephone administration to a sample of 95 patients at 12 weeks post-stroke. The MRS was administered to patients at hospital discharge. SIS 3.0 scores were reported by domains: SIS-16, SIS-Physical and SIS-ADL; all domains showed score discrimination and distribution for different degrees of stroke severity: MRS 0/1 vs. MRS 4/5; MRS 2 vs. MRS 4/5; and MRS 3 vs. MRS 4/5.

Sensitivity and Specificity:

Beninato, Portney & Sullivan (2009) examined sensitivity and specificity of the SIS-16 relative to a history of multiple falls in a sample of 27 patients with chronic stroke. Participants reported a history of no falls or one fall (n=18) vs. multiple falls (n=9), according to Tinetti’s definition of falls. SIS-16 cut-off scores of 61.7 yielded 78% sensitivity and 89% specificity. Area under the ROC curve was adequate (0.86). Likelihood ratios were used to calculate post-test probability of a history of falls, and results showed high positive (LR+ = 7.0) and low negative (LR- = 0.25) likelihood ratios. Results indicate that the SIS-16 demonstrated good overall accuracy in detecting individuals with a history of multiple falls.
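The likelihood ratios reported above follow arithmetically from sensitivity and specificity. The sketch below is illustrative only and is not taken from the study: the function names are hypothetical, the counts 7 of 9 and 16 of 18 are assumed because they are consistent with the reported 78% sensitivity and 89% specificity, and the pre-test probability of 9/27 is simply the proportion of the sample with multiple falls.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a dichotomous test."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Assumed counts, chosen to match the reported 78% and 89%.
sensitivity = 7 / 9     # multiple fallers scoring at or below the cut-off
specificity = 16 / 18   # non-multiple fallers scoring above the cut-off

lr_pos, lr_neg = likelihood_ratios(sensitivity, specificity)

# Assumed pre-test probability: sample proportion with multiple falls.
pre_test = 9 / 27
prob_after_positive = post_test_probability(pre_test, lr_pos)
prob_after_negative = post_test_probability(pre_test, lr_neg)

print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
print(f"Post-test probability (positive): {prob_after_positive:.2f}")
print(f"Post-test probability (negative): {prob_after_negative:.2f}")
```

Under these assumed counts the computed ratios agree with the reported LR+ of 7.0 and LR- of 0.25; a different pre-test probability would shift the post-test probabilities accordingly.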

Responsiveness

Duncan et al. (1999) examined responsiveness of the SIS version 2.0. Significant change was observed in patients’ recovery in the expected direction between assessments at 1 and 3 months, and at 1 and 6 months post-stroke; however, sensitivity to change was affected by stroke severity and time of post-stroke assessment. All domains of the SIS showed statistically significant change from 1 to 3 months and 1 to 6 months post-stroke, but this was not observed between 3 and 6 months post-stroke for the domains of Hand function, Mobility, ADL/IADL, combined physical, and Participation among patients recovering from minor stroke. For patients with moderate stroke, statistically significant change was observed at both 1 to 3 months and 1 to 6 months post-stroke in all domains, and from 3 to 6 months for the domains of Mobility, ADL/IADL, combined physical, and Participation.

Lin et al. (2010a) examined responsiveness of the SIS version 3.0 in a sample of 74 patients with chronic stroke. Participants were randomly assigned to receive constraint-induced movement therapy (CIMT), bilateral arm training (BAT) or conventional rehabilitation over a 3-week intervention period. Responsiveness was measured according to change from pre- to post-treatment, using Wilcoxon signed rank test and Standardised Response Mean (SRM). Most SIS domains showed small responsiveness (SRM = 0.22-0.33, Wilcoxon Z = 1.78-2.72). Medium responsiveness was seen for Hand Function (SRM = 0.52, Wilcoxon Z = 4.24, P<0.05), Stroke Recovery (SRM = 0.57, Wilcoxon Z = 4.56, P<0.05) and SIS total score (SRM=0.50, Wilcoxon Z = 3.89, P<0.05).
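The Standardised Response Mean is conventionally calculated as the mean change score divided by the standard deviation of the change scores. The following minimal sketch shows that calculation; the function name and the pre- and post-treatment scores are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def standardized_response_mean(pre_scores, post_scores):
    """SRM = mean of paired change scores / sample SD of those change scores."""
    if len(pre_scores) != len(post_scores):
        raise ValueError("pre and post scores must be paired")
    changes = [post - pre for pre, post in zip(pre_scores, post_scores)]
    return mean(changes) / stdev(changes)


# Hypothetical domain scores (0-100) for five patients, before and after treatment.
pre = [40.0, 55.0, 60.0, 35.0, 50.0]
post = [50.0, 55.0, 70.0, 45.0, 55.0]

srm = standardized_response_mean(pre, post)
print(f"SRM = {srm:.2f}")
```

Because the denominator is the variability of change rather than of baseline scores, a small but very consistent improvement can yield a large SRM.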

Lin et al. (2010b) evaluated the clinically important difference (CID) within four physical domains of the SIS 3.0 (strength, ADL/IADL, mobility, hand function) in a sample of 74 patients with chronic stroke. Participants were randomly assigned to receive CIMT, BAT or conventional rehabilitation over a 3-week intervention period. The following change scores were found to indicate a true and reliable improvement (MDC): Strength subscale = 24.0; ADL/IADL subscale = 17.3; Mobility subscale = 15.1; and Hand Function subscale = 25.9. The following mean change scores were considered to represent a CID: Strength subscale = 9.2; ADL/IADL subscale = 5.9; Mobility subscale = 4.5; and Hand Function subscale = 17.8. CID values were determined by the effect-size index and from comparison with a global rating of change (defined by a score of 10-15% in patients’ perceived overall recovery from pre- to post-treatment).
Note: Lin et al. (2010b) note that CID estimates may have been influenced by the age of participants and baseline degree of severity. Younger patients needed greater change scores from pre- to post-treatment to have a clinically important improvement compared to older patients. Those with higher baseline severity of symptoms showed greater MDC values and therefore must show more change from pre- to post-treatment in order to demonstrate significant improvements. Also, the results may be limited to stroke patients who demonstrate improvement after rehabilitation therapies, Brunnstrom stage III and sufficient cognitive ability. Therefore, a larger sample size is recommended for future validation of these findings.

Ward et al. (2011) examined responsiveness of the SIS-16 and other clinical measures (STREAM, FIM) in a sample of 30 patients with acute stroke. Change scores were evaluated using Wilcoxon signed rank test and responsiveness to change was assessed using standardized response means (SRM). Measures were taken on admission to and discharge from an acute rehabilitation setting (average length of stay 23.3 days, range 7-53 days). SIS-16 change scores indicated statistically significant improvement from admission to discharge (23.1, p<0.0001) and sensitivity to change was large (SRM=1.65).

Guidetti et al. (2014) examined responsiveness of the SIS 3.0 in a sample of 204 patients with stroke who were assessed at 3 and 12 months post-stroke, using Wilcoxon’s matched pairs test. Clinically meaningful change within a domain was defined as a change of 10-15 points between timepoints. The Participation and Recovery domains were the most responsive domains over the first year post-stroke, with 27.5% and 29.4% of participants (respectively) reporting a clinically meaningful positive change, and 20% and 10.3% of participants (respectively) reporting a clinically meaningful negative change, from 3 to 12 months post-stroke. The Strength and Hand function domains also showed high clinically meaningful positive change (23%, 18.0% respectively) and negative change (14.7%, 14.2% respectively) from 3 to 12 months post-stroke. There were significant changes in scores on the Strength (p=0.045), Emotion (p=0.001) and Recovery (p<0.001) domains from 3 to 12 months post-stroke. The Strength, Hand function and Participation domains had the highest perceived impact (i.e. lowest mean scores) at 3 months and 12 months.

References

  • Beninato, M., Portney, L.G., & Sullivan, P.E. (2009). Using the International Classification of Functioning, Disability and Health as a framework to examine the association between falls and clinical assessment tools in people with stroke. Physical Therapy, 89(8), 816-25.
  • Brandao, A.D., Teixeira, N.B., Brandao, M.C., Vidotto, M.C., Jardim, J.R., & Gazzotti, M.R. (2018). Translation and cultural adaptation of the Stroke Impact Scale 2.0 (SIS): a quality-of-life scale for stroke. Sao Paulo Medical Journal, 136(2), 144-9. doi: 10.1590/1516-3180.2017.0114281017
  • Brott, T.G., Adams, H.P., Olinger, C.P., Marler, J.R., Barsan, W.G., Biller, J., Spilker, J., Holleran, R., Eberle, R., Hertzberg, V., Rorick, M., Moomaw, C.J., & Walker, M. (1989). Measurements of acute cerebral infarction: A clinical examination scale. Stroke, 20, 864-70.
  • Cael, S., Decavel, P., Binquet, C., Benaim, C., Puyraveau, M., Chotard, M., Moulin, T., Parrette, B., Bejot, Y., & Mercier, M. (2015). Stroke Impact Scale version 2: validation of the French version. Physical Therapy, 95(5), 778-90.
  • Carod-Artal, F.J., Coral, L.F., Trizotto, D.S., & Moreira, C.M. (2008). The Stroke Impact Scale 3.0: evaluation of acceptability, reliability, and validity of the Brazilian version. Stroke, 39, 2477-84.
  • Choi, S.U., Lee, H.S., Shin, J.H., Ho, S.H., Koo, M.J., Park, K.H., Yoon, J.A., Kim, D.M., Oh, J.E., Yu, S.H., & Kim, D.A. (2017). Stroke Impact Scale 3.0: reliability and validity evaluation of the Korean version. Annals of Rehabilitation Medicine, 41(3), 387-93.
  • Collin, C. & Wade, D. (1990). Assessing motor impairment after stroke: a pilot reliability study. Journal of Neurology, Neurosurgery, and Psychiatry, 53, 576-9.
  • Duncan, P. W., Bode, R. K., Lai, S. M., & Perera, S. (2003b). Rasch analysis of a new stroke-specific outcome scale: The Stroke Impact Scale. Archives of Physical Medicine and Rehabilitation, 84, 950-63.
  • Duncan, P. W., Lai, S. M., Tyler, D., Perera, S., Reker, D. M., & Studenski, S. (2002a). Evaluation of Proxy Responses to the Stroke Impact Scale. Stroke, 33, 2593-9.
  • Duncan, P.W., Reker, D.M., Horner, R.D., Samsa, G.P., Hoenig, H., LaClair, B.J., & Dudley, T.K. (2002b). Performance of a mail-administered version of a stroke-specific outcome measure: The Stroke Impact Scale. Clinical Rehabilitation, 16(5), 493-505.
  • Duncan, P.W., Wallace, D., Lai, S.M., Johnson, D., Embretson, S., & Laster, L.J. (1999). The Stroke Impact Scale version 2.0: Evaluation of reliability, validity, and sensitivity to change. Stroke, 30, 2131-40.
  • Duncan, P.W., Wallace, D., Studenski, S., Lai, S.M., & Johnson, D. (2001). Conceptualization of a new stroke-specific outcome measure: The Stroke Impact Scale. Topics in Stroke Rehabilitation, 8(2), 19-33.
  • Duncan, P.W., Lai, S.M., Bode, R.K., Perera, S., DeRosa, J.T., & GAIN Americas Investigators. (2003a). Stroke Impact Scale-16: A brief assessment of physical function. Neurology, 60, 291-6.
  • Edwards, B. & O’Connell, B. (2003). Internal consistency and validity of the Stroke Impact Scale 2.0 (SIS 2.0) and SIS-16 in an Australian sample. Quality of Life Research, 12, 1127-35.
  • Finch, E., Brooks, D., Stratford, P.W., & Mayo, N.E. (2002). Physical Rehabilitation Outcome Measures. A Guide to Enhanced Clinical Decision-Making (2nd ed.), Canadian Physiotherapy Association, Toronto.
  • Folstein, M.F., Folstein, S.E., & McHugh, P.R. (1975). “Mini-mental state”. A practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research, 12(3), 189-98.
  • Fugl-Meyer, A.R., Jaasko, L., Leyman, I., Olsson, S., & Steglind, S. (1975). The post-stroke hemiplegic patient: a method for evaluation of physical performance. Scandinavian Journal of Rehabilitation Medicine, 7, 13-31.
  • Fulk, G.D., Reynolds, C., Mondal, S., & Deutsch, J.E. (2010). Predicting home and community walking activity in people with stroke. Archives of Physical Medicine and Rehabilitation, 91, 1582-6.
  • Geyh, S., Cieza, A., & Stucki, G. (2009). Evaluation of the German translation of the Stroke Impact Scale using Rasch analysis. The Clinical Neuropsychologist, 23(6), 978-95.
  • Goncalves, R.S., Gil, J.N., Cavalheiro, L.M., Costa, R.D., & Ferreira, P.L. (2012). Reliability and validity of the Portuguese version of the Stroke Impact Scale 2.0 (SIS 2.0). Quality of Life Research, 21(4), 691-6.
  • Guidetti, S., Ytterberg, C., Ekstam, L., Johansson, U., & Eriksson, G. (2014). Changes in the impact of stroke between 3 and 12 months post-stroke, assessed with the Stroke Impact Scale. Journal of Rehabilitation Medicine, 46, 963-8.
  • Hamilton, B.B., Granger, C.V., & Sherwin, F.S. (1987). A uniform national data system for medical rehabilitation. In: Fuhrer, M. J., ed. Rehabilitation Outcome: Analysis and Measurement. Baltimore, Md: Paul Brookes, 137-47.
  • Hamza, A.M., Nabilla, A.S., & Loh, S.Y. (2012). Evaluation of quality of life among stroke survivors: linguistic validation of the Stroke Impact Scale (SIS) 3.0 in Hausa language. Journal of Nigeria Soc Physiotherapy, 20, 52-9.
  • Hamza, A.M., Nabilla, A.-S., Yim, L.S., & Chinna, K. (2014). Reliability and validity of the Nigerian (Hausa) version of the Stroke Impact Scale (SIS) 3.0 index. BioMed Research International, 14, Article ID 302097, 7 pages. doi: 10.1155/2014/302097
  • Hogue, C., Studenski, S., & Duncan, P.W. (1990). Assessing mobility: The first steps in preventing falls. In: Funk, S.G., Tornquist, E.M., Champagne, M.T., Copp, L.A., & Wiese, R.A., eds. Key Aspects of Recovery. New York, NY: Springer, 275-81.
  • Hsieh, F.-H., Lee, J.-D., Chang, T.-C., Yang, S.-T., Huang, C.-H., & Wu, C.-Y. (2016). Prediction of quality of life after stroke rehabilitation. Neuropsychiatry, 6(6), 369-75.
  • Huang, Y-h., Wu, C-y., Hsieh, Y-w., & Lin, K-c. (2010). Predictors of change in quality of life after distributed constraint-induced therapy in patients with chronic stroke. Neurorehabilitation and Neural Repair, 24(6), 559-66. doi: 10.1177/1545968309358074
  • Jenkinson, C., Fitzpatrick, R., Crocker, H., & Peters, M. (2013). The Stroke Impact Scale: validation in a UK setting and development of a SIS short form and SIS index. Stroke, 44, 2532-5.
  • Kamwesiga, J.T., von Koch, L., Kottorp, A., & Guidetti, S. (2009). Cultural adaptation and validation of Stroke Impact Scale 3.0 version in Uganda: a small-scale study. SAGE Open Medicine, 4: 2050312116671859. doi: 10.1177/2050312116671859
  • Kwon, S., Duncan, P., Studenski, S., Perera, S., Lai, S.M., & Reker, D. (2006). Measuring stroke impact with SIS: Construct validity of SIS telephone administration. Quality of Life Research, 15, 367-76.
  • Lai, S.M., Perera, S., Duncan, P.W., & Bode, R. (2003). Physical and Social Functioning After Stroke: Comparison of the Stroke Impact Scale and Short Form-36. Stroke, 34, 488-93.
  • Lawton, M. & Brody, E. (1969). Assessment of older people: self-maintaining and instrumental activities of daily living. Gerontologist, 9, 179-86.
  • Lee, H.-J. & Song, J.-M. (2015). The Korean language version of Stroke Impact Scale 3.0: cross-cultural adaptation and translation. Journal of the Korean Society of Physical Medicine, 10(3), 47-55.
  • Lin, K.C., Fu, T., Wu, C.Y., Hsieh, Y.W., Chen, C.L., & Lee, P.C. (2010a). Psychometric comparisons of the Stroke Impact Scale 3.0 and Stroke-Specific Quality of Life Scale. Quality of Life Research, 19(3), 435-43. doi: 10.1007/s11136-010-9597-5.
  • Lin, K.-C., Fu, T., Wu, C.-Y., Wang, Y.-H., Liu, J.-S., Hsieh, C.-J., & Lin, S.-F. (2010b). Minimal detectable change and clinically important difference of the Stroke Impact Scale in stroke patients. Neurorehabilitation and Neural Repair, 24, 486-92.
  • MacIsaac, R., Ali, M., Peters, M., English, C., Rodgers, H., Jenkinson, C., Lees, K.R., Quinn, T.J., & VISTA Collaboration. (2016). Derivation and validation of a modified short form of the Stroke Impact Scale. Journal of the American Heart Association, 5:e003108. doi: 10.1161/JAHA.115.003108
  • Mahoney, F.I. & Barthel, D.W. (1965). Functional evaluation: The Barthel Index. Maryland State Medical Journal, 14, 61-5.
  • Mulder, M. & Nijland, R. (2016). Stroke Impact Scale. Journal of Physiotherapy, 62, 117.
  • Ochi, M., Ohashi, H., Hachisuka, K., & Saeki, S. (2017). The reliability and validity of the Japanese version of the Stroke Impact Scale version 3.0. Journal of UOEH, 39(3), 215-21. doi: 10.7888/juoeh.39.215
  • Richardson, M., Campbell, N., Allen, L., Meyer, M., & Teasell, R. (2016). The stroke impact scale: performance as a quality of life measure in a community-based stroke rehabilitation setting. Disability and Rehabilitation, 38(14), 1425-30. doi: 10.3109/09638288.2015.1102337
  • Sullivan, J. (2014). Measurement characteristics and clinical utility of the Stroke Impact Scale. Archives of Physical Medicine and Rehabilitation, 95, 1799-1800.
  • Vellone, E., Savini, S., Barbato, N., Carovillano, G., Caramia, M., & Alvaro, R. (2010). Quality of life in stroke survivors: first results from the reliability and validity of the Italian version of the Stroke Impact Scale 3.0. Annali di Igiene, 22, 469-79.
  • Vellone, E., Savini, S., Fida, R., Dickson, V.V., Melkus, G.D., Carod-Artal, F.J., Rocco, G., & Alvaro, R. (2015). Psychometric evaluation of the Stroke Impact Scale 3.0. Journal of Cardiovascular Nursing, 30(3), 229-41. doi: 10.1097/JCN.0000000000000145
  • Ward, I., Pivko, S., Brooks, G., & Parkin, K. (2011). Validity of the Stroke Rehabilitation Assessment of Movement Scale in acute rehabilitation: a comparison with the Functional Independence Measure and Stroke Impact Scale-16. Physical Medicine and Rehabilitation, 3(11), 1013-21. doi: 10.1016/j.pmrj.2011.08.537
  • Ware, J.E. Jr., & Sherbourne, C.D. (1992). The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Medical Care, 30, 473-83.
  • Yesavage, J.A., Brink, T., Rose, T.L., Lum, O., Huang, V., Adey, M., & Leirer, V.O. (1983). Development and validation of a geriatric depression screening scale: A preliminary report. Journal of Psychiatric Research, 17, 37-49.

See the measure

How to obtain the SIS?

Please click here to see a copy of the SIS.

This instrument was developed by:

  • Pamela Duncan, PhD, PT
  • Dennis Wallace, PhD
  • Sue Min Lai, PhD, MS, MBA
  • Stephanie Studenski, MD, MPH
  • Dallas Johnson, PhD, and
  • Susan Embretson, PhD.

In order to gain permission to use the SIS and its translations, please contact MAPI Research Trust: contact@mapi-trust.org

Upper Extremity Function Test (UEFT)

Evidence Reviewed as of before: 19-04-2013
Author(s)*: Katie Marvin, MSc. PT
Editor(s): Annabel McDermott; Nicol Korner-Bitensky, PhD OT

Purpose

The Upper Extremity Function Test (UEFT) is an evaluative measure to assess upper extremity functional impairment and the severity of impairment in patients exhibiting dysfunction in the upper extremity.

In-Depth Review

Purpose of the measure

The Upper Extremity Function Test (UEFT) is an evaluative measure to assess upper extremity functional impairment and the severity of impairment in patients exhibiting dysfunction in the upper extremity. The test assesses function based on the assumption that complex upper extremity movements used in everyday activities are made up of certain movement patterns (e.g. supination/pronation, grasp/release, pinch grip, etc.), so that evaluation of these movement patterns can predict the patient’s ability to perform functional activities. The UEFT was designed primarily to quantify the patient’s ability to execute upper extremity activities of a general nature, and does not take into consideration factors such as skill, speed, range of motion, endurance, sensation etc. The selected list of test items is believed to represent the upper limb movements that are necessary to perform many of the major activities of daily living. The UEFT has not yet been correlated to vocational activities of the upper extremity.

Available versions

The Action Research Arm Test (ARAT) was developed by Ronald Lyle in 1981 by adapting the Upper Extremity Function Test (UEFT) (Carroll, 1965). The UEFT test administration and scoring was simplified, the time required to administer the test was shortened, and items were grouped based on the hierarchical scale (Guttman Scale) (Lang, Wagner, Dromerick, & Edwards, 2006). Due to the need for more specific and detailed instructions related to the client’s position, scoring and test administration, Yozbatiran, Der-Yeghiaian, and Cramer (2008) proposed a standardized approach to the ARAT.

Please visit our Action Research Arm Test module for further information.

Features of the measure

Items:

The UEFT consists of 33 items or tasks, detailed below.

Description of tasks:

The patient is positioned comfortably in a chair in front of the table used for testing. The patient is evaluated while performing different tasks, such as moving objects to a shelf, placing objects over a peg, writing their name, etc. The objects are of varying shapes and weights in order to evaluate the patient’s grasp, grip, pinch, placing, arm extension and elevation, pronation and supination, and functional strength.

Please note that the patient is not permitted to move from the chair during testing (unless a break is required), although weight transfer and rolling from side to side of the buttock is permitted. Each arm is tested individually. Demonstration of tasks is permitted (Carroll, 1965).

Scoring and Score Interpretation:

The UEFT uses a simple scoring method where results can be compared at different time intervals.

Scoring:

3 Performs test normally.
2 Completes test, but takes abnormally long time or has great difficulty.
1 Performs test partially. This grade is assigned when the patient is able to pick up or lift the test item from the table but is unable to place the object in its correct end position. For example, in items 27 to 29, the patient is able to lift the pitcher or glass but is unable to pour the water into the proper receptacle.
0 Can perform no part of the test. If the patient pushes objects out of their slots or around on the table a grade of 0 is assigned.

The total score is tallied. The maximum score for the dominant hand is 99 as compared to a maximum score of 96 for the non-dominant hand, because item 33 consists of writing of the patient’s name with the dominant hand.

The authors of the test concluded that a score increase or decrease of 10 points represents a meaningful gain or loss of important function, respectively.

Nearly equal scoring points have been allotted for the two functions ‘prehension’ (grasp, grip and pinch) and ‘placing’ (shoulder stability; shoulder abduction and flexion/extension; elbow flexion/extension; wrist flexion/extension and pronation/supination); as such, both functions need to be intact in order for a high score to be awarded.

Score interpretation:

0-25: Trace function
26-50: Very poor
51-75: Poor
76-89: Partial function
90-98: Functional
99 (dominant hand) / 96 (non-dominant hand): Maximal function
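The scoring arithmetic and interpretation bands above can be sketched as follows. This is an illustrative sketch rather than an official scoring tool: the function names are hypothetical, and it assumes that a non-dominant hand total of 96 is read as maximal function rather than falling in the 90-98 band, since the published bands overlap at that value.

```python
ITEM_COUNT = 33        # item 33 (writing name) is dominant hand only
MAX_ITEM_GRADE = 3     # each item is graded 0-3


def ueft_max_score(dominant_hand):
    """Maximum total: 99 for the dominant hand, 96 for the non-dominant hand."""
    items = ITEM_COUNT if dominant_hand else ITEM_COUNT - 1
    return items * MAX_ITEM_GRADE


def ueft_total(item_grades, dominant_hand):
    """Sum the item grades after checking their number and range."""
    expected = ITEM_COUNT if dominant_hand else ITEM_COUNT - 1
    if len(item_grades) != expected:
        raise ValueError(f"expected {expected} item grades")
    if any(grade not in (0, 1, 2, 3) for grade in item_grades):
        raise ValueError("each grade must be 0, 1, 2 or 3")
    return sum(item_grades)


def interpret_ueft(total, dominant_hand):
    """Map a total score to the interpretation bands listed above."""
    maximum = ueft_max_score(dominant_hand)
    if not 0 <= total <= maximum:
        raise ValueError("total outside the possible range")
    if total == maximum:          # assumption: hand-specific maximum takes priority
        return "Maximal function"
    if total <= 25:
        return "Trace function"
    if total <= 50:
        return "Very poor"
    if total <= 75:
        return "Poor"
    if total <= 89:
        return "Partial function"
    return "Functional"


# Hypothetical patient: non-dominant hand, grade 2 on every one of the 32 items.
total = ueft_total([2] * 32, dominant_hand=False)
print(total, interpret_ueft(total, dominant_hand=False))
```

Checking the hand-specific maximum first is a design choice made here to resolve the overlap; a clinician following Carroll (1965) directly may apply the bands differently.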

Functional Implications of UEFT:

Basmajian et al. (1982) investigated the functional implications of UEFT scores and found the following scores to be indicative of the following patient capabilities:

  • 0: no function
  • 10: holding a book for reading
  • 20: driving
  • 30: carrying objects from place to place
  • 40: dressing
  • 50: feeding
  • 60: shaving/make-up
  • 70: hand crafts
  • 80: fine crafts (needlework, gardening, carpentry)
  • 90: card playing
  • 100: letter writing/typing

Adapted from Basmajian, Gowland, Brandstater, Swanson & Trotter (1982).

Time:

The UEFT takes approximately 1 hour to administer (Lyle, 1981).

Training requirements:

None typically reported, however it is recommended that the clinician is familiar with the assessment tool.

Subscales:

None typically reported.

Equipment:

  • 17.5 in. width x 28.5 in. length x 30.75 in. height table
  • 3.75 in. width shelf mounted 14.75 in. from the table
  • Wooden cubes: 4 x 4 x 4 in. (576g); 3 x 3 x 3 in. (243g); 2 x 2 x 2 in. (72g); 1 x 1 x 1 in. (9g)
  • Large iron pipe: 1.625 O.D. x 6.125in. (500g)
  • Small iron pipe: 0.87 O.D. x 4.125 (125g)
  • Slate: 4.125 x 1 x .375 (61g)
  • Wooden ball: 3 O.D. (100g)
  • Glass marble 0.625 O.D. (6.3g)
  • Metal sphere 0.44 O.D. (6.6g); 0.25 O.D. (1.0g); 0.16 (0.34g)
  • Steel washer 0.16 thick x 1.375 O.D. x 0.56 I.D. (14.5g)
  • Iron 6 lb approximately
  • 2 Plastic tumblers 8 fl. oz
  • Aluminum water pitcher 3 qt capacity
  • Pencil

*O.D. = outside diameter; I.D. = inside diameter

Please refer to Carroll (1965) for further information regarding administration set-up of the UEFT.

Alternative form of the UEFT

None typically reported.

Client suitability

Can be used with:

  • Clients with stroke.

Should not be used with:

  • When administering the UEFT to clients with upper extremity amputations, the total score should be adjusted according to the following scale.

Total UEFT Scores for people with amputations:

Wrist: 0
Three fingers: 41
Middle finger: 87
Index finger and 2nd metacarpal: 84
Thumb and metacarpal-phalangeal joint: 91
Index finger at proximal interphalangeal joint: 93

Languages of the measure?

There are no official translations of the UEFT.

Summary

What does the tool measure? The UEFT measures specific changes in upper extremity impairment and function.
What types of clients can the tool be used for? The UEFT can be used with, but is not limited to, clients with stroke.
Is this a screening or assessment tool? Assessment
Time to administer The UEFT takes approximately 1 hour to administer.
Versions The Action Research Arm Test (ARAT) was developed by Ronald Lyle in 1981 by adapting the Upper Extremity Function Test (Carroll, 1965).
Other Languages There are no official translations.
Measurement Properties
Reliability Test-retest:
One study investigated the test-retest reliability of the UEFT and found strong test-retest reliability in a sample of patients with chronic upper extremity impairment resulting from conditions including stroke.

Inter-rater:
One study investigated inter-rater reliability of the UEFT and found strong inter-rater reliability.

Validity Predictive:
One study examined the predictive validity of the UEFT and found admission UEFT scores to be predictive of discharge UEFT scores.
Floor/Ceiling Effects No studies have examined the floor/ceiling effects of the UEFT.
Does the tool detect change in patients? No studies have formally examined the responsiveness of the UEFT.
Acceptability The UEFT is simple to administer and can be easily administered in a variety of settings (e.g. home or medical office settings).
Feasibility The administration of the UEFT and the ARAT is quick and simple, but requires standardized equipment.
How to obtain the tool? Please refer to the initial validation study by Carroll (1965) for further information on the UEFT.

Psychometric Properties

Overview

A literature search was conducted to identify all relevant publications on the psychometric properties of the Upper Extremity Function Test. Limited information is available on the UEFT. However, the Action Research Arm Test, developed in 1981 as an adaptation of the UEFT, is a more reliable, valid and responsive measure currently used for clients with stroke.

Floor/Ceiling Effects

No studies have examined the floor/ceiling effects of the UEFT.

Reliability

Internal consistency:
No studies have examined the internal consistency of the UEFT.

Test-retest:
Carroll (1965) examined test-retest reliability of the UEFT in a sample of 23 patients with chronic stable upper extremity impairment due to varying causes (including stroke) and 7 patients with typical upper extremity function. The UEFT was administered two times, 30 days apart. Scores for individuals with typical upper extremity function were identical on the two different testing days. Of scores attained for patients with chronic stable upper extremity impairment, 1 case was identical, 5 cases showed a 1-point difference, 7 cases showed a 3-point difference, 2 cases showed a 5-point difference, and 3 cases showed a difference of 6, 7 and 8 points. The results of this initial validation study suggest that the UEFT has strong test-retest reliability.

Intra-rater:
No studies have examined the intra-rater reliability of the UEFT.

Inter-rater:
Carroll (1965) investigated inter-rater reliability of the UEFT among clinicians who were either experienced or not experienced with the UEFT. Two raters with experience using the UEFT rated the upper extremities of 48 individuals with stroke. The two examiners rated 46% of the patients identically, 21% within 1 point, 8% within 2 points, 10% within 3 points, 8% within 4 points and 6% of patients within 5 points. Subsequently, three examiners without experience using the UEFT were educated on the grading system and were then asked to rate the performance of 15 patients with stroke. The inexperienced raters scored within 7 points of the experienced raters 97% of the time. The results of this study indicate that the UEFT has strong inter-rater reliability.

Validity

Content:

No studies have examined the content validity of the UEFT.

Criterion:

Concurrent:
No studies have examined the concurrent validity of the UEFT.

Predictive:
Barreca, Finlayson, Gowland & Basmajian (1999) examined the predictive validity of the UEFT and the Halstead Category Test in 16 patients with stroke. Admission UEFT and Halstead Category Test scores were found to be predictive of discharge UEFT scores (approximately 5 weeks later), even in patients with severe upper extremity disability following stroke.

Construct:

Convergent/Discriminant:
No studies have examined the convergent or discriminant validity of the UEFT.

Known Groups:
No studies have examined the known groups validity of the UEFT.

Sensitivity/ Specificity:
No studies have examined the sensitivity or specificity of the UEFT.

Responsiveness

Popovic, Popovic, Sinkjaer, Stefanovic & Schwirtlich (2003) investigated the effects of Functional Electrical Stimulation on upper extremity function in patients with stroke. The UEFT was used as an outcome measure and was able to detect change in upper extremity function in patients with stroke.

References

  • Barreca, S., Finlayson, A., Gowland, C. & Basmajian, J. (1999). Use of the Halstead Category Test as a predictor of functional recovery in the hemiplegic upper limb: A cross-validation study. The Clinical Neuropsychologist, 13(2), 171-178.
  • Basmajian, J.V., Gowland, C., Brandstater, M.E., Swanson, L. & Trotter, J. (1982). EMG feedback treatment of upper limb in hemiplegic stroke patients: A pilot study. Archives of Physical Medicine and Rehabilitation, 63, 614.
  • Carroll, D. (1965). A quantitative test of upper extremity function. Journal of Chronic Diseases, 18, 479-491.
  • Lang, C.E., Wagner, J.M., Dromerick, A.W., & Edwards, D.F. (2006). Measurement of upper extremity function early after stroke: properties of the action research arm test. Archives of Physical Medicine and Rehabilitation, 87, 1605-1610.
  • Lyle, R.C. (1981). A performance test for assessment of upper limb function in physical rehabilitation treatment and research. International Journal of Rehabilitation Research, 4(4), 483-492.
  • Okkema, K.A. (1998). Functional evaluation of upper extremity use following stroke: A literature review. Topics in Stroke Rehabilitation, 4(4), 54-75.
  • Popovic, M.B., Popovic, D.B., Sinkjaer, T., Stefanovic, A. & Schwirtlich, L. (2003). Clinical evaluation of Functional Electrical Therapy in acute hemiplegic subjects. Journal of Rehabilitation Research and Development, 40(5), 443-454.

See the measure

Further information on the UEFT can be found in the following publication:

Carroll, D. (1965). A quantitative test of upper extremity function. Journal of Chronic Diseases, 18, 479-491.

Wolf Motor Function Test (WMFT)

Evidence Reviewed as of before: 11-01-2011
Author(s)*: Sabrina Figueiredo, BSc
Editor(s): Nicol Korner-Bitensky, PhD OT
Content consistency: Gabriel Plumier

Purpose

The Wolf Motor Function Test (WMFT) quantifies upper extremity (UE) motor ability through timed and functional tasks (Wolf, Catlin, Ellis, Archer, Morgan & Piacentino, 2001).

In-Depth Review

Purpose of the measure

The Wolf Motor Function Test (WMFT) quantifies upper extremity (UE) motor ability through timed and functional tasks (Wolf, Catlin, Ellis, Archer, Morgan & Piacentino, 2001).

Available versions

The original version of the WMFT was developed by Wolf, Lecraw, Barton, and Jann in 1989 to examine the effects of constraint-induced movement therapy in clients with mild to moderate stroke and traumatic brain injury. In 1999, a graded WMFT was developed by Uswatte and Taub to assess the motor abilities of patients who were functioning at a lower level (Morris, Uswatte, Crago, Cook & Taub, 2001).

Features of the measure

Items:

The original version of the WMFT consisted of 21 items. The widely used version of the WMFT consists of 17 items. The first 6 items involve timed functional tasks, items 7 and 14 are measures of strength, and the remaining 9 items consist of analyzing movement quality when completing various tasks (Wolf et al., 2001; Whitall, Savin, Harris-Love, & Waller, 2006).

The examiner should test the less affected upper extremity first, followed by the more affected side. The following items should be performed as quickly as possible, with each attempt truncated at 120 seconds (Wolf, Thompson, Morris, Rose, Winstein, Taub, et al., 2005):

  1. Forearm to table (side): client attempts to place forearm on a table by abducting at the shoulder
  2. Forearm to box (side): client attempts to place forearm on a box, 25.4 cm tall, by abducting at the shoulder
  3. Extended elbow (side): client attempts to reach across a table, 28 cm long, by extending the elbow (to the side)
  4. Extended elbow (side) with 1 lb weight: client attempts to push the weight, placed against the outer wrist joint, across the table by extending the elbow
  5. Hand to table (front): client attempts to place involved hand on a table
  6. Hand to box (front): client attempts to place hand on the box placed on the tabletop
  7. Weight to box: client attempts to place the heaviest possible weight on the box placed on the tabletop
  8. Reach and retrieve (front): client attempts to pull 1lb weight across the table by using elbow flexion and cupped wrist
  9. Lift can (front): client attempts to lift a can and bring it close to his/her lips with a cylindrical grasp
  10. Lift pencil (front): client attempts to pick up a pencil by using a 3-jaw chuck grasp
  11. Pick-up paper clip (front): client attempts to pick up a paper clip by using a pincer grasp
  12. Stack checkers (front): client attempts to stack checkers onto the center checker
  13. Flip 3 cards (front): using the pincer grasp, client attempts to flip each card over
  14. Grip strength
  15. Turning the key in lock (front): using pincer grasp, while maintaining contact, client turns key 180 degrees to the left and right
  16. Fold towel (front): client grasps towel, folds it lengthwise, and then uses the tested hand to fold the towel in half again
  17. Lift basket (standing): client picks up a 3lb basket from a chair, by grasping the handles, and placing it on a bedside table

Scoring:

The items are rated on a 6-point scale as outlined below (Wolf et al., 2005):

0. “Does not attempt with UE being tested”
1. “UE being tested does not participate functionally; however, an attempt is made to use the UE. In unilateral tasks, the UE not being tested may be used to move the UE being tested”.
2. “Does attempt, but requires assistance of the UE not being tested for minor readjustments or change of position, or requires more than 2 attempts to complete, or accomplishes very slowly. In bilateral tasks, the UE being tested may serve only as a helper”.
3. “Does attempt, but movement is influenced to some degree by synergy or is performed slowly or with effort”.
4. “Does attempt; movement is similar to the non-affected side but slightly slower; may lack precision, fine coordination or fluidity”.
5. “Does attempt, movement appears to be normal”.

Lower scores are indicative of lower functioning levels.
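
The item-level rules above can be combined into summary scores. The following is a minimal sketch, assuming a median performance time and a mean functional ability rating as the summaries; the source describes only the 0 to 5 item scale and the 120-second truncation, so the aggregation, the handling of uncompleted items and the function name `summarize_wmft` are illustrative assumptions rather than the official scoring procedure.

```python
from statistics import mean, median

TIME_LIMIT_SECONDS = 120  # per-item ceiling stated in the text


def summarize_wmft(times_seconds, ability_ratings):
    """Summarize one administration of the timed items.

    times_seconds: completion time per item; None marks an item that
        was not completed (assumption: scored at the 120 s ceiling).
    ability_ratings: 0-5 functional ability rating for the same items.
    """
    if len(times_seconds) != len(ability_ratings):
        raise ValueError("need one rating per timed item")
    if any(rating not in range(6) for rating in ability_ratings):
        raise ValueError("ratings must be integers from 0 to 5")
    # Truncate every time at the ceiling, as described in the text.
    truncated = [
        TIME_LIMIT_SECONDS if t is None else min(t, TIME_LIMIT_SECONDS)
        for t in times_seconds
    ]
    return {
        "median_time_s": median(truncated),
        "mean_functional_ability": mean(ability_ratings),
    }


# Hypothetical data for six items, for illustration only.
example = summarize_wmft(
    times_seconds=[1.2, 1.5, 2.0, None, 3.1, 150.0],
    ability_ratings=[5, 4, 4, 1, 3, 2],
)
print(example)
```

A median is used for time in this sketch because truncated values of 120 seconds would otherwise dominate a mean; shorter times and higher ratings both indicate better function.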

Time:

Not reported, but since a maximum of 120 seconds is allocated to each item, it should take approximately 30 minutes with additional time for measuring grip strength (item 14).
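
The 30-minute estimate can be reproduced with simple arithmetic. This sketch assumes that only the 15 non-strength items are timed and that every item runs to the full 120-second limit, so it gives an upper bound on task time rather than a typical duration; set-up and instruction time are not included.

```python
TOTAL_ITEMS = 17
STRENGTH_ITEMS = 2         # items 7 and 14, assumed not to be timed
TIME_LIMIT_SECONDS = 120   # maximum time allocated to each timed item

timed_items = TOTAL_ITEMS - STRENGTH_ITEMS
max_minutes = timed_items * TIME_LIMIT_SECONDS / 60
print(f"{timed_items} timed items, at most {max_minutes:.0f} minutes")
```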

Subscales:

None officially documented. However, many studies use the Performance Time (WMFT-PT) and Functional Ability (WMFT-FA) scales as subtests of the WMFT.

Equipment:

  • Table 28 cm long (height not reported)
  • Chair (dimensions not reported)
  • Bedside table (dimensions not reported)
  • Box (25.4 cm tall)
  • Free-weights
  • Can
  • Pencil
  • Paperclip
  • Checkers
  • Cards
  • Key lock with the key
  • Towel
  • Basket
  • Dynamometer for measuring hand grip strength

Training:

Not reported.

Alternative forms of the WMFT

  • The original version (21 items)
  • The modified version (17 items): The modified version is most widely used and allows assessment of clients with severe, moderate and mild stroke.

Client suitability

Can be used with:

  • Clients with stroke
  • Clients with upper limb functional deficits

Should not be used with:

  • Severe cases of upper limb spasticity, and upper limb amputees

In what languages is the measure available?

French and English.

Summary

Wolf Motor Function Test (WMFT) Evaluation Summary

What does the tool measure? The WMFT quantifies upper extremity motor ability through timed and functional tasks.
What types of clients can the tool be used for? The WMFT can be used with, but is not limited to, clients with stroke.
Is this a screening or assessment tool? Assessment
Time to administer The WMFT takes approximately 30 minutes to administer.
Versions The original version (21 items), and the modified version (17 items)
Other Languages French and English
Measurement Properties
Reliability Internal consistency:
Two studies examined the internal consistency of the WMFT and reported excellent internal consistency using Cronbach’s alpha.

Test-retest:
Two studies examined the test-retest reliability of the WMFT and reported excellent reliability using Pearson correlation coefficients and intraclass correlation coefficients (ICC).

Inter-rater:
Three studies examined the inter-rater reliability and one study examined the intra-rater reliability of the WMFT; all reported excellent reliability using the ICC.

Validity Content:
No studies have reported the content validity of the WMFT.

Criterion:
Concurrent:
– Two studies examined the concurrent validity of the WMFT and reported adequate to excellent correlations with the Fugl-Meyer Assessment, used as the gold standard measure.
– One study examined the concurrent validity of the WMFT and reported excellent correlations with the Action Research Arm Test.

Construct:
Known Groups:
One study examined the known groups validity of the WMFT using Wilcoxon Test and reported that the WMFT is able to discriminate between healthy individuals and those with upper extremity impairments.

Floor/Ceiling Effects One study examined floor/ceiling effects of the WMFT in clients with stroke and reported adequate floor and ceiling effects.
Does the tool detect change in patients? No studies have examined the responsiveness of the WMFT in clients with stroke.
Acceptability The WMFT is widely used as an outcome measure for constraint-induced movement therapy.
Feasibility The administration of the WMFT is quick and simple.
How to obtain the tool? The WMFT can be found at: Wolf, S., Thompson, P., Morris, D., Rose, D., Winstein, C., Taub, E., Giuliani, C., & Pearson, S. (2005). The EXCITE Trial: Attributes of the Wolf Motor Function Test in patients with Subacute Stroke. Neurorehabil Neural Repair, 19, 194-205.

Psychometric Properties

Overview

We conducted a literature search to identify all relevant publications on the psychometric properties of the Wolf Motor Function Test (WMFT) in individuals with stroke. We identified 4 studies.

Floor/Ceiling Effects

Nijland et al. (2010) investigated the psychometric properties of the WMFT and the Action Research Arm Test in 40 patients with stroke with mild to moderate hemiparesis. The WMFT showed adequate floor and ceiling effects, with only 5 to 17% of patients scoring the lowest or highest score.
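
Floor and ceiling effects are usually expressed as the percentage of patients who obtain the lowest or highest possible score. The following is a minimal sketch of that calculation; the scores and the 0 to 75 range are hypothetical and are not taken from Nijland et al. (2010).

```python
def floor_ceiling_percent(scores, minimum, maximum):
    """Return (percent at the lowest score, percent at the highest score)."""
    if not scores:
        raise ValueError("at least one score is required")
    n = len(scores)
    floor_percent = 100 * sum(score == minimum for score in scores) / n
    ceiling_percent = 100 * sum(score == maximum for score in scores) / n
    return floor_percent, ceiling_percent


# Hypothetical total scores for 10 patients on a hypothetical 0-75 scale.
hypothetical_scores = [0, 12, 20, 33, 41, 48, 55, 60, 68, 75]
floor_percent, ceiling_percent = floor_ceiling_percent(
    hypothetical_scores, minimum=0, maximum=75
)
print(f"floor: {floor_percent}%, ceiling: {ceiling_percent}%")
```

A large share of patients at either extreme would mean the measure cannot register further deterioration (floor) or further improvement (ceiling) in those patients.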

Reliability

Internal consistency:
Morris, Uswatte, Crago, Cook, and Taub (2001) evaluated the internal consistency of the WMFT in 24 clients with stroke. The internal consistency of the WMFT, as calculated using Cronbach’s Coefficient Alpha, was excellent (α = 0.92).

Nijland et al. (2010) investigated the internal consistency of the WMFT in 40 patients with stroke with mild to moderate hemiparesis. Internal consistency of the WMFT, as calculated using Cronbach’s Coefficient Alpha, was excellent (α = 0.98).
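
For readers unfamiliar with the statistic, Cronbach's alpha compares the sum of the individual item variances with the variance of the total score: alpha = k / (k - 1) × (1 - sum of item variances / variance of totals), where k is the number of items. The following is a minimal sketch using the standard formula; the ratings are hypothetical and do not reproduce the data from either study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of rows (one row of item ratings per client)."""
    n_items = len(item_scores[0])
    if n_items < 2 or len(item_scores) < 2:
        raise ValueError("need at least 2 items and 2 clients")
    columns = list(zip(*item_scores))  # one tuple of ratings per item
    item_variance_sum = sum(variance(column) for column in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return n_items / (n_items - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical 0-5 ratings: 5 clients (rows) on 3 items (columns).
hypothetical_ratings = [
    [1, 2, 1],
    [2, 3, 2],
    [3, 3, 4],
    [4, 5, 4],
    [5, 5, 5],
]
alpha = cronbach_alpha(hypothetical_ratings)
print(round(alpha, 2))
```

Values close to 1 indicate that the items rank clients in a very similar way, which is the pattern reported for the WMFT above.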

Test-retest:
Morris et al. (2001) analyzed the test-retest reliability of the WMFT in 24 clients with stroke. Participants were re-assessed within a 2-week interval. The test-retest reliability, as calculated using Pearson Correlation Coefficient, was excellent for both functional ability and performance tests (r = 0.95; 0.90, respectively).
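
The Pearson coefficient used here measures how closely scores from the two sessions follow a straight line. The following is a minimal sketch of the standard calculation; the session scores are hypothetical.

```python
from math import sqrt


def pearson_r(first_session, second_session):
    """Pearson correlation between paired scores from two test sessions."""
    n = len(first_session)
    if n != len(second_session) or n < 2:
        raise ValueError("need at least 2 paired scores")
    mean_x = sum(first_session) / n
    mean_y = sum(second_session) / n
    covariance_sum = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(first_session, second_session)
    )
    ss_x = sum((x - mean_x) ** 2 for x in first_session)
    ss_y = sum((y - mean_y) ** 2 for y in second_session)
    return covariance_sum / sqrt(ss_x * ss_y)


# Hypothetical scores for 5 clients, tested twice.
session_1 = [2.0, 3.5, 4.0, 1.5, 4.5]
session_2 = [2.5, 3.0, 4.5, 1.0, 4.5]
r = pearson_r(session_1, session_2)
print(round(r, 2))
```

Note that Pearson's r is unaffected by a constant shift between sessions: if every client scored exactly 1 point higher at retest, r would still equal 1. Agreement-based statistics such as the ICC are sensitive to that kind of systematic difference.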

Whitall, Savin, Harris-Love, and Waller (2006) examined the test-retest reliability of the WMFT in 66 clients with stroke. Participants were re-assessed within a 2-week interval by the same rater and under the same conditions. Test-retest reliability, as calculated using the Intraclass Correlation Coefficient (ICC), was found to be excellent (ICC = 0.97).

Inter-rater:
Morris et al. (2001) evaluated the inter-rater reliability of the WMFT in 24 clients with stroke. Evaluations were conducted by a physiotherapist and were videotaped. The recordings were then rated by two physiotherapists and one occupational therapist. Inter-rater reliability, as calculated using ICC, was excellent for both functional ability and performance tests (ICC = 0.93; 0.99, respectively).

Wolf et al. (2001) verified the inter-rater reliability of the WMFT in 19 clients with stroke and in 19 healthy individuals. All participants were evaluated independently by 2 raters. Inter-rater reliability, as calculated using ICC, was excellent (ICC = 0.97).

Whitall et al. (2006) estimated the inter-rater reliability of the WMFT in 10 clients with stroke. The assessment of functional ability was videotaped and rated by three different raters. Inter-rater reliability was excellent (ICC = 0.99).
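
The studies above do not state which form of the ICC was calculated. As an illustration only, the following sketch computes one common form, the two-way random-effects, absolute-agreement, single-rater coefficient often labelled ICC(2,1), from a table of ratings; the choice of this form and the example ratings are assumptions.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per client and one column per rater."""
    n = len(ratings)      # number of clients
    k = len(ratings[0])   # number of raters
    if n < 2 or k < 2:
        raise ValueError("need at least 2 clients and 2 raters")
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    column_means = [sum(column) / n for column in zip(*ratings)]

    # Two-way ANOVA sums of squares: clients, raters, and residual error.
    ss_clients = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand_mean) ** 2 for m in column_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_clients - ss_raters

    ms_clients = ss_clients / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_clients - ms_error) / (
        ms_clients + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    )


# Hypothetical functional ability ratings: 5 clients, 3 raters.
hypothetical_ratings = [
    [4, 4, 5],
    [2, 2, 2],
    [3, 4, 3],
    [5, 5, 5],
    [1, 1, 2],
]
icc = icc_2_1(hypothetical_ratings)
print(round(icc, 2))
```

Because this form measures absolute agreement, a rater who scores every client consistently higher than the others lowers the coefficient even when the ranking of clients is identical. The function does not guard against the degenerate case in which all ratings are identical, where the denominator is zero.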

Intra-rater:
Nijland et al. (2010) investigated the psychometric properties of the WMFT and Action Research Arm Test in 40 patients with stroke with mild to moderate hemiparesis. Eighteen patients participated in the reproducibility testing of the WMFT and were assessed twice by the same observer approximately 10 days apart. Intra-rater reliability, as analyzed using the ICC, was found to be excellent (ICC = 0.94).

Validity

Content:
No studies have reported the content validity of the WMFT.

Criterion:
Concurrent:
Wolf et al. (2001) examined the concurrent validity of the WMFT by comparing it to the Upper Extremity Fugl-Meyer Assessment (UE-FMA – Fugl-Meyer, Jääskö, Leyman, Olsson, & Steglind, 1975) as the gold standard in 19 clients with stroke. Adequate correlations were found between the WMFT and the UE-FMA (r = -0.57).

Whitall et al. (2006) assessed the concurrent validity of the WMFT by comparing it to the UE-FMA as the gold standard in 66 clients with stroke. Correlations between the functional ability test of the WMFT and the UE-FMA were excellent (r = -0.88).

Nijland et al. (2010) investigated the concurrent validity of the WMFT by comparing it to the Action Research Arm Test (ARAT – Lyle, 1981) in 40 patients with stroke with mild to moderate hemiparesis. For the purpose of their investigation, the WMFT score was split into 4 variables: Functional Ability Score (FAS), median time score (s), item 7 and item 14 (strength). Correlations were calculated between the ARAT total score and the four variables. Excellent correlations were found between the ARAT total score and the WMFT FAS (r = 0.86), median time score (r = -0.89) and strength tasks (items 7 and 14) (r = 0.70).

Predictive:
No studies have reported the predictive validity of the WMFT.

Construct:
Known groups:
Wolf et al. (2001) evaluated whether the WMFT was able to distinguish individuals with impairment secondary to stroke (n=19) from those without impairment (n=19). Known groups validity, as calculated using the Wilcoxon test, showed that the WMFT scores for the dominant and the non-dominant hand of individuals without impairment were significantly higher than those for the most and the least affected upper extremity of clients with stroke.

Responsiveness

No studies have reported the responsiveness of the WMFT.

References

  • Barreca, S.R., Gowland, C.K., Stratford, P.W., et al. (2004). Development of the Chedoke Arm and Hand Activity Inventory: Theoretical constructs, item generation, and selection. Topics in Stroke Rehabilitation, 11(4), 31-42.
  • Fugl-Meyer, A.R., Jääskö, L., Leyman, I., Olsson, S., & Steglind, S. (1975). The post-stroke hemiplegic patient 1. A method for evaluation of physical performance. Scandinavian Journal of Rehabilitation Medicine, 7, 13-31.
  • Lyle, R.C. (1981). A performance test for assessment of upper limb function in physical rehabilitation treatment and research. International Journal of Rehabilitation Research, 4(4), 483-492.
  • Morris, D., Uswatte, G., Crago, J., Cook, E., Taub, E. (2001). The reliability of the Wolf Motor Function Test for assessing upper extremity function after stroke. Arch Phys Med Rehabil, 82, 750-755.
  • Nijland, R., van Wegen, E., Verbunt, J, van Wijk, R., van Kordelaar, J. & Kwakkel, G. (2010) A comparison of two validated tests for upper limb function after stroke: The Wolf Motor Function Test and the Action Research Arm Test. Journal of Rehabilitation Medicine, 42, 694-696.
  • Whitall, J., Savin, D., Harris-Love, M., Waller, S. (2006). Psychometric properties of a modified Wolf Motor Function Test for people with mild and moderate upper extremity hemiparesis. Arch Phys Med Rehabil, 87, 656-660.
  • Wolf, S., Catlin, P., Ellis, M., Archer, A., Morgan, B., Piacentino, A. (2001). Assessing Wolf Motor Function Test as outcome measure for research in patients after stroke. Stroke, 32, 1635-1639.
  • Wolf, S., Thompson, P., Morris, D., Rose, D., Winstein, C., Taub, E., Giuliani, C., and Pearson, S. (2005). The EXCITE Trial: Attributes of the Wolf Motor Function Test in patients with Subacute Stroke. Neurorehabil Neural Repair, 19, 194-205.

See the measure

The WMFT can be obtained from the following publication:

Wolf, S., Thompson, P., Morris, D., Rose, D., Winstein, C., Taub, E., Giuliani, C., & Pearson, S. (2005). The EXCITE Trial: Attributes of the Wolf Motor Function Test in patients with Subacute Stroke. Neurorehabil Neural Repair, 19, 194-205.
