Action Research Arm Test (ARAT)

Evidence Reviewed as of before: 09-06-2011
Author(s)*: Sabrina Figueiredo, BSc
Editor(s): Lisa Zeltzer, MSc OT; Nicol Korner-Bitensky, PhD OT; Elissa Sitcoff, BA BSc

Purpose

The Action Research Arm Test (ARAT) is an evaluative measure to assess specific changes in limb function among individuals who sustained cortical damage resulting in hemiplegia (Lyle, 1981). It assesses a client’s ability to handle objects differing in size, weight and shape and therefore can be considered to be an arm-specific measure of activity limitation (Platz, Pinkowski, Kim, di Bella, & Johnson, 2005).

In-Depth Review

Purpose of the measure

The Action Research Arm Test (ARAT) is an evaluative measure to assess specific changes in limb function among individuals who sustained cortical damage resulting in hemiplegia (Lyle, 1981). It assesses a client’s ability to handle objects differing in size, weight and shape and therefore can be considered to be an arm-specific measure of activity limitation (Platz, Pinkowski, Kim, di Bella, & Johnson, 2005).

Available versions

The ARAT was developed by Ronald Lyle in 1981 by adapting the Upper Extremity Function Test (UEFT) (Carroll, 1965). The UEFT administration and scoring were simplified, the time required to administer the test was shortened, and items were grouped based on a hierarchical scale (Guttman scale) (Lang, Wagner, Dromerick, & Edwards, 2006). Due to the need for more specific and detailed instructions related to the client’s position, scoring and test administration, Yozbatiran, Der-Yeghiaian, and Cramer (2008) proposed a standardized approach to the ARAT.

Features of the measure

Items:

The ARAT consists of 19 items grouped into four subscales: grasp, grip, pinch, and gross movement. Each subscale constitutes a hierarchical Guttman scale, which means that all items are ordered according to ascending difficulty. In the ARAT, if the client succeeds in completing the most difficult item in a subscale, this suggests he/she will succeed in the easier items for that same subscale. Similarly, failure on an item suggests the client will be unable to complete the remaining more challenging items in the subscale.

According to the rules defined by Lyle (1981), the client must first try to perform the most difficult task in a subscale. If the maximum score (score = 3) is obtained for this task, then the maximum score for the entire subscale should be assigned, and the evaluator should move to the next subscale to be administered. When the client is unable to complete the most difficult item (scoring between 0 and 2), then the easiest item in this specific subscale should be performed. If the client fails completely (score = 0) when performing the easiest task, then the intermediate items must not be tested, the entire subscale should be scored as zero, and the evaluator should then move to the next subscale. However, if the client succeeds at the easiest task either partially (score = 1 or 2) or completely (score = 3), then all the other tasks in that same subscale should be tested before moving to the next subscale. Following these rules, the number of items administered will range from a minimum of 4 to a maximum of 19 (van der Lee, Roorda, & Lankhorst, 2002).
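The skipping rules above can be sketched as a short procedure. This is a minimal Python illustration, not part of Lyle's (1981) instructions: the names `administer_subscale`, `count_items_administered` and `test_item`, and the easiest-to-hardest item indexing, are assumptions made for this example.

```python
from typing import Callable, List, Tuple

# Items per subscale, in the testing order reported by Lyle (1981).
SUBSCALE_ITEMS = {"grasp": 6, "grip": 4, "pinch": 6, "gross movement": 3}


def administer_subscale(
    n_items: int, test_item: Callable[[int], int]
) -> Tuple[List[int], int]:
    """Return (item scores, number of items actually administered).

    Items are indexed 0..n_items-1 from easiest to most difficult (an
    indexing chosen for this sketch). test_item(i) returns the 0-3 score
    observed when item i is administered.
    """
    easiest, hardest = 0, n_items - 1

    # Step 1: the most difficult item is tried first.
    top = test_item(hardest)
    if top == 3:
        return [3] * n_items, 1  # full marks assigned to the whole subscale

    # Step 2: otherwise the easiest item is tried.
    bottom = test_item(easiest)
    if bottom == 0:
        return [0] * n_items, 2  # whole subscale scored as zero

    # Step 3: all remaining (intermediate) items are tested.
    scores = [0] * n_items
    scores[hardest], scores[easiest] = top, bottom
    for i in range(1, n_items - 1):
        scores[i] = test_item(i)
    return scores, n_items


def count_items_administered(test_item: Callable[[int], int]) -> int:
    """Total items administered across the four subscales."""
    return sum(administer_subscale(n, test_item)[1] for n in SUBSCALE_ITEMS.values())


print(count_items_administered(lambda i: 3))  # 4: one item per subscale
print(count_items_administered(lambda i: 2))  # 19: every item is tested
```

The two printed counts correspond to the minimum of 4 and maximum of 19 items cited from van der Lee et al. (2002).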

The ARAT must be administered in a formal setting, since a specially designed table and chair are required (see equipment section for more information). For the starting position, the client should be seated in a chair with a firm back and no armrests. The client’s trunk should be in contact with the back of the chair at all times during the test performance. Instructions about the required seating posture should be provided to the client prior to initiating the test, and the client should be reminded to maintain this position whenever it is not respected. The client’s feet should be in contact with the floor throughout testing (van der Lee, DeGroot, Beckerman, Wagenaar, Lankhorst, & Bouter, 2001a; Yozbatiran et al., 2008). Both hands should be tested, beginning with the non- or less-affected hand, in order to practice and register baseline scores. Should the client be unable to understand the instructions for the required task, the evaluator should demonstrate the task and allow the client to try it as a trial (Yozbatiran et al., 2008). To facilitate recording the time for each task, the client’s hands should start and finish the task with palms down on the table. However, for the gross movement tasks, the client’s hands should be placed pronated on their lap (Lyle, 1981; Yozbatiran et al., 2008).

In the grasp and pinch subscales, testing materials are lifted 37 cm from the surface of the table to the top of the shelf. In the grip subscale, testing materials are moved from one side of the table to the other. Finally, in the gross movement subscale, the client is requested to place the hand being tested either behind his/her head, on top of his/her head, or to his/her mouth (Lyle, 1981; Hsieh, Hsueh, Chiang, & Lin, 1998; Hsueh, Lee, & Hsieh, 2002a). The proper sequence for testing is 1) grasp subscale, 2) grip subscale, 3) pinch subscale, 4) gross movement subscale (Lyle, 1981). The ARAT comes with simple instructions to guide the evaluator on scoring and administering the test (Lyle, 1981).

Scoring:

The ARAT is scored on a four-level ordinal scale (0-3) (Lyle, 1981).

  • 0 = cannot perform any part of the test;
  • 1 = performs the test partially;
  • 2 = completes the test, but takes an abnormally long time;
  • 3 = performs the test normally.

In order to facilitate scoring, time limits have been suggested (Wagenaar, Meijer, van Wierinen, Kuik, Hazenberg, Lindeboom, Wichers, & Rijswijk, 1990; Yozbatiran et al., 2008). Incorporating the time limits into Lyle’s scoring definition, the new scoring system would be:

  • 0 = cannot perform any part of the test;
  • 1 = performs the test partially;
  • 2 = completes the test, but takes an abnormally long time, varying from 5 to 60 seconds. If a client takes more than 60 seconds to perform an item, the evaluator should interrupt after 60 seconds and a score of 1 is given on that specific item;

  • 3 = performs the test normally in less than 5 seconds.
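The time-based scoring rules can be sketched as a small function. This is an illustrative Python sketch only: the function name `score_item` and its parameters are hypothetical, and it covers the time limits alone, not any clinical judgement of movement quality.

```python
def score_item(performed_any_part: bool, completed: bool, seconds: float) -> int:
    """Return the 0-3 item score using the suggested time limits.

    performed_any_part: the client performed at least part of the task.
    completed: the client finished the task (before being interrupted).
    seconds: time taken; the evaluator interrupts after 60 seconds.
    """
    if not performed_any_part:
        return 0  # cannot perform any part of the test
    if seconds > 60 or not completed:
        return 1  # partial performance, or interrupted at 60 seconds
    if seconds < 5:
        return 3  # normal performance in less than 5 seconds
    return 2  # completed, but took 5 to 60 seconds


print(score_item(True, True, 3.2))   # 3
print(score_item(True, True, 12.0))  # 2
print(score_item(True, True, 75.0))  # 1
```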

The subscale scores range according to the number of items on each subscale, as follows:

Subscales on the ARAT      Number of items per subscale   Score ranges per subscale
Grasp subscale             6 items                        Score 0-18
Grip subscale              4 items                        Score 0-12
Pinch subscale             6 items                        Score 0-18
Gross Movement subscale    3 items                        Score 0-9

The total score on the ARAT ranges from 0 to 57, with the lowest score indicating that no movements can be performed, and the highest score indicating normal performance. Thus, higher scores indicate better performance (Lang et al., 2006; van der Lee et al., 2002). The ARAT score is a continuous measure, with no categorical cutoff scores. Therefore, the score obtained on the ARAT does not allow clients to be classified into categories such as normal, mildly limited, or severely limited.
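The subscale ranges and the 0 to 57 total follow directly from the number of items and the 0-3 item scale. A minimal Python sketch of that arithmetic is given here; the function and variable names are hypothetical and chosen for illustration.

```python
from typing import Dict, List

# Number of items per subscale, as listed in the table above.
SUBSCALE_ITEMS = {"grasp": 6, "grip": 4, "pinch": 6, "gross movement": 3}
MAX_ITEM_SCORE = 3


def subscale_scores(item_scores: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the 0-3 item scores within each subscale, checking their validity."""
    totals = {}
    for name, n_items in SUBSCALE_ITEMS.items():
        scores = item_scores[name]
        if len(scores) != n_items:
            raise ValueError(f"{name}: expected {n_items} item scores")
        if any(s not in (0, 1, 2, 3) for s in scores):
            raise ValueError(f"{name}: item scores must be 0, 1, 2 or 3")
        totals[name] = sum(scores)
    return totals


def total_score(item_scores: Dict[str, List[int]]) -> int:
    """Total ARAT score: the sum of the four subscale scores."""
    return sum(subscale_scores(item_scores).values())


# Maximum performance on every item gives the upper end of each range.
best = {name: [MAX_ITEM_SCORE] * n for name, n in SUBSCALE_ITEMS.items()}
print(subscale_scores(best))  # {'grasp': 18, 'grip': 12, 'pinch': 18, 'gross movement': 9}
print(total_score(best))      # 57
```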

Time:

The time required to complete the ARAT will depend on the number of items administered. Based on its hierarchical design, the ARAT was constructed to save testing time. Thus, no more than 7-10 minutes should be required to assess a client with stroke (DeWeerdt & Harrison, 1985). However, if all 19 items are performed, the ARAT usually takes 20 minutes to administer (van der Lee et al., 2002). In one study by Hsieh and colleagues (1998), the ARAT took, on average, 8 minutes to administer to clients with stroke.

Subscales:

The ARAT is divided into four subscales: grasp, grip, pinch and gross movement.

The grasp and pinch subscales have 6 items each, the grip subscale has 4 items, and the gross movement has 3 items (Lyle, 1981).

Equipment:

Standardized equipment is required to administer the ARAT. It can be ordered only from representatives in the Netherlands. The average cost for this equipment is approximately 850 Euros ($1200 CAD) with an additional delivery fee of 179 Euros ($252 CAD).

The complete ARAT kit consists of:

  • A specially designed table of 92cm x 45cm x 83cm high, with a shelf of 93cm x 10cm, positioned 37cm above the main surface of the table (Lyle, 1981; Hsueh et al., 2002a).
  • A chair with back rest and no arm rests, that should be placed 44cm above floor level (Lyle, 1981; Hsueh et al., 2002a).
  • Wooden blocks (cubes) with sides of 2.5, 5, 7.5 and 10cm (Lyle, 1981; Hsueh et al., 2002a).
  • A cricket ball 7.5cm in diameter (Lyle, 1981; Hsueh et al., 2002a).
  • Two alloy tubes: one 2.25cm in diameter x 11.5 cm long, the second one 1.0cm in diameter x 16cm long (Lyle, 1981; Hsueh et al., 2002a).
  • A washer and bolt, which is a type of screw with its anchor (Lyle, 1981; Hsueh et al., 2002a).
  • Two glasses (Lyle, 1981; Hsueh et al., 2002a).
  • A marble 1.5cm in diameter (Lyle, 1981; Hsueh et al., 2002a).
  • A ball bearing 6mm in diameter (Lyle, 1981; Hsueh et al., 2002a).
  • A stopwatch (Wagenaar et al., 1990; Yozbatiran et al., 2008)
  • Paper and pencil for the evaluator.

Training:

None typically reported.

Alternative forms of the Action Research Arm Test

None.

Client suitability

Can be used with:

  • The ARAT was constructed for assessing recovery of upper limb function following cortical damage (Lyle, 1981).
  • Clients with stroke.

Should not be used in:

  • When administering the ARAT to clients with finger amputation, the pinch subscale should be scored as 0, as should all other tasks that require movement of an amputated body part (Yozbatiran et al., 2008).

In what languages is the measure available?

There are no official translations of the ARAT.

Nevertheless, some peer-reviewed publications from the Netherlands and Taiwan have used the ARAT as an outcome measure, which may indicate that instructions have been informally translated to other languages (Hsieh et al., 1998; Hsueh et al., 2002a; van der Lee et al., 2002).

Summary

What does the tool measure? The ARAT measures specific changes in limb function among individuals who sustained cortical damage resulting in hemiplegia.
What types of clients can the tool be used for? The ARAT can be used with, but is not limited to, clients with stroke.
Is this a screening or assessment tool? Assessment
Time to administer: An average of 7 to 10 minutes.
Versions: There are no alternative versions.
Other Languages: There are no official translations.
Measurement Properties
Reliability:
Internal consistency:
One study examined the internal consistency of the ARAT and reported excellent internal consistency using Cronbach’s alpha.

Test-retest:
Three studies have examined the test-retest reliability of the ARAT. All reported excellent test-retest reliability using Pearson correlation, ICCs or Spearman rho correlation.

Intra-rater:
Four studies have examined the intra-rater reliability of the ARAT and reported excellent intra-rater reliability using Spearman rho correlation, intraclass correlation coefficients (ICC) and weighted kappa.

Inter-rater:
Seven studies examined the inter-rater reliability of the ARAT and reported excellent inter-rater reliability using Spearman rho correlation, ICC and weighted kappa.

Validity:
Criterion:
Concurrent:
One study has examined the concurrent validity of the ARAT and reported adequate to excellent correlations with the Box and Block Test (BBT) and the Nine-Hole Peg Test (NHPT) at pre and post-treatment.

Predictive:
No studies have examined the predictive validity of the ARAT.

Construct:
Convergent:
Seven studies examined convergent validity of the ARAT and reported excellent correlations between the ARAT and the Brunnstrom-Fugl-Meyer test; the upper extremity subscale of the Motor Assessment Scale; the Motricity Index; the upper extremity movements of the Modified Motor Assessment Chart; the BBT; the motor function subscore of the Fugl-Meyer test; the Hemispheric Stroke Scale; upper extremity strength and grasp speed. Adequate correlations were reported between the ARAT and the passive joint motion/joint pain subscore of the Fugl-Meyer test, the Functional Independence Measure and spasticity. Poor correlations were reported between the ARAT and the sensation score of the Fugl-Meyer test, the Ashworth scale, the Modified Barthel Index, the National Institutes of Health Stroke Scale, light touch sensation and pain.

Floor/Ceiling Effects:
– One study examined the floor/ceiling effects of the ARAT in clients with acute stroke and reported that at earlier phases of the stroke, floor effects were poor. At discharge from the acute rehabilitation ward, ceiling effects on the ARAT were adequate.
– One study examined the floor/ceiling effects of the ARAT in stroke clients with mild to moderate hemiparesis and reported adequate floor and ceiling effects.
Sensitivity/Specificity: No studies have examined the specificity of the ARAT.
Does the tool detect change in patients? Six studies have examined the responsiveness of the ARAT and reported that the ARAT has a moderate to large Standardized Response Mean, a moderate to large effect size and a large responsiveness ratio, and is therefore able to detect change in clients with stroke.
Acceptability: When administering the ARAT to clients with upper extremity amputations, attention is required when scoring (i.e., a score of 0 is given).
Feasibility: The administration of the ARAT is quick and simple, but requires standardized equipment.
How to obtain the tool? Information on the ARAT can be obtained in the studies by Lyle (1981), Hsieh et al. (1998), van der Lee et al. (2002), Rabadi & Rabadi (2006), and Yozbatiran et al. (2008) and at the website: http://www.aratest.eu/Index_english.htm. Standardized equipment can be purchased from the following website: http://www.aratest.eu/ or from http://www.saliarehab.com/

Psychometric Properties

Overview

We conducted a literature search to identify all relevant publications on the psychometric properties of the Action Research Arm Test (ARAT) in individuals with stroke. We identified twelve studies. The ARAT appears to be subject to floor effects, particularly in the early phase after stroke.

Floor/Ceiling Effects

Hsueh and Hsieh (2002b) examined floor and ceiling effects for the ARAT and the Upper Extremity Motor Assessment Scale (Carr, Shepherd, Nordholm, & Lynne, 1985) in 48 clients with acute stroke. Participants were assessed at admission and discharge from an acute rehabilitation ward. At admission, the ARAT total score demonstrated a poor floor effect, with 52.1% of participants scoring 0. Although all subscales were classified as having a poor floor effect, when comparing the ARAT’s subscales among themselves, 72.9% of participants were unable to perform the pinch subscale, 70.8% were unable to perform both grasp and grip subscales and 52.1% were unable to complete the gross movement subscale. At discharge, the ARAT total score demonstrated an adequate ceiling effect, with only 7% of participants scoring the maximal value. When analyzing the ARAT’s subscales individually, the gross movement subscale presented the poorest ceiling effect, with 29.2% of participants scoring the maximum score, followed by 27% of participants on the grasp subscale. The grip and pinch subscales had the best classification, with adequate ceiling effects of 18.8% and 16.7%, respectively.

Compared to the ARAT, at admission the Upper Extremity Motor Assessment Scale had 58% of participants scoring the minimal value, indicating a poor floor effect. However, at discharge the Upper Extremity Motor Assessment Scale demonstrated a more adequate ceiling effect than the ARAT, with only 4.3% of participants obtaining the maximum score.

Nijland et al. (2010) investigated the psychometric properties of the ARAT and Wolf Motor Function Test in 40 patients with stroke with mild to moderate hemiparesis. The ARAT showed adequate floor and ceiling effects with only 12.5 to 17% of patients scoring the lowest or highest scores.
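Floor and ceiling effects in the studies above are reported as the percentage of participants scoring the minimum or maximum value. A minimal Python sketch of that calculation follows; the function name and the sample data are hypothetical, and no cutoff for "poor" or "adequate" is encoded because none is stated here.

```python
from typing import Sequence, Tuple


def floor_ceiling_percentages(
    scores: Sequence[int], min_score: int = 0, max_score: int = 57
) -> Tuple[float, float]:
    """Return (% of scores at the minimum, % of scores at the maximum)."""
    if not scores:
        raise ValueError("at least one score is required")
    n = len(scores)
    floor_pct = 100.0 * sum(1 for s in scores if s == min_score) / n
    ceiling_pct = 100.0 * sum(1 for s in scores if s == max_score) / n
    return floor_pct, ceiling_pct


# Hypothetical sample of 48 ARAT total scores: 25 at the minimum (0),
# 2 at the maximum (57), and 21 in between. These are not study data.
sample = [0] * 25 + [30] * 21 + [57] * 2
floor_pct, ceiling_pct = floor_ceiling_percentages(sample)
print(round(floor_pct, 1), round(ceiling_pct, 1))  # 52.1 4.2
```

The same function can be applied to a single subscale by passing that subscale's maximum (for example, `max_score=18` for grasp).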

Reliability

Internal Consistency:
Nijland et al. (2010) investigated the internal consistency of the ARAT in 40 patients with stroke with mild to moderate hemiparesis. Internal consistency of the ARAT, as calculated using Cronbach’s Coefficient Alpha, was excellent (α = 0.98).

Test-retest:
Note: From the descriptions provided of the following studies it appears that some authors called the testing test-retest reliability while others called the same analysis intra-rater reliability.

Lyle (1981) examined test-retest reliability in 20 individuals who sustained cortical damage, either from stroke or traumatic brain lesion. The mean age was 53 years, ranging from 26 to 72 years. Participants were re-assessed with a 1-week interval by the same rater and under the same conditions. The test-retest reliability, as calculated using Pearson correlation, was excellent (r = 0.98).

Hsueh, Lee, and Hsieh (2002a) evaluated test-retest reliability performed using a regular table instead of the specially designed table for this test in 61 individuals with sub-acute stroke and a mean age of 63 years old. Participants were re-assessed after a two-day interval by the same rater. The test-retest reliability, as calculated using the Intraclass Correlation Coefficient (ICC), was excellent for the total score (ICC = 0.99) as well as for the grasp, grip, pinch and gross movement subscales (ICC = 0.99, 0.98, 0.96 and 0.95, respectively).

Platz, Pinkowski, van Wijck, Kim, di Bella, and Johnson (2005) estimated test-retest reliability for the ARAT, the Box and Block Test (Cromwell, 1965; Mathiowetz, Volland, Kashman, & Weber, 1985a), and the Fugl-Meyer Test upper extremity items (including items from the Motor function, Sensation and Passive Joint Motion/Joint pain subscores) (Fugl-Meyer, Jääskö, Leyman, Olsson, & Steglind, 1975) in 23 participants with upper extremity paresis either from stroke, multiple sclerosis, or traumatic brain injury. The participant’s most affected arm was re-assessed 1 week later by the same rater. The test-retest reliability of the ARAT total score, as calculated using ICCs and Spearman rho correlation, was excellent (ICC = 0.96 and rho = 0.96). Furthermore, test-retest reliabilities for each subscale were all excellent: grasp (ICC = 0.94 and rho = 0.96), grip (ICC = 0.94 and rho = 0.95), pinch (ICC = 0.89 and rho = 0.89) and gross movement (ICC = 0.97 and rho = 0.97).
Note: These results apply only to the most affected upper limb.

Intra-rater:
Wagenaar, Meijer, van Wierinen, Kuik, Hazenberg, Lindeboom, Wichers and Rijswijk (1990) evaluated intra-rater reliability in seven patients with acute stroke. The timeframe for assessments was not provided by the authors. Intra-rater reliability, as calculated using Spearman rho correlation, was excellent (rho = 0.99).

Van der Lee, DeGroot, Beckerman, Wagenaar, Lankhorst, and Bouter (2001a) estimated intra-rater reliability in 20 patients with chronic stroke and a median age of 62 years. Participants were evaluated by the same rater at three points in time. At the baseline assessment participants were videotaped. The second assessment was 4-27 months following the first assessment, and the final assessment was 4-6 weeks after. Scoring of the last two assessments was based on the videotape recorded at baseline. Intra-rater reliability results were analyzed between the first two assessments, where scoring sources were different (live vs. videotape), and between the last two assessments, where scoring sources were the same (videotape only). Intra-rater reliability, as calculated using ICC and Spearman rho correlation, was excellent (ICC = 0.99 and rho = 0.99), independent of scoring sources. Intra-rater reliability, as calculated using weighted kappa, was also excellent: scoring with the same information source resulted in a kappa = 1.00 versus only a slightly lower kappa when scoring from two different information sources (kappa = 0.94). The gross movement subscale showed the lowest weighted kappa value (kappa = 0.83), suggesting that this subscale had the lowest agreement level.

Yozbatiran, Der-Yeghiaian, and Cramer (2008) examined intra-rater reliability in 8 clients with chronic stroke. Participants were re-assessed by the same rater and under the same conditions with a 1-week interval. Intra-rater reliability for the total score, as calculated using ICC and Spearman rho correlation, was excellent (ICC = 0.99 and rho = 0.99). Additionally, the same excellent level of intra-rater reliability was found for the grasp, grip, pinch, and gross motor movement subscales (ICC = 0.98 and rho = 0.93; ICC = 0.97 and rho = 0.93; ICC = 0.99 and rho = 0.98; ICC = 0.93 and rho = 0.91, respectively).

Nijland et al. (2010) investigated the psychometric properties of the ARAT and Wolf Motor Function Test in 40 patients with stroke with mild to moderate hemiparesis. Eighteen patients participated in the reproducibility testing of the ARAT and were assessed twice by the same observer approximately 10 days apart. Intra-rater reliability, as analyzed using the ICC, was found to be excellent (ICC = 0.97).

Inter-rater:
Lyle (1981) examined inter-rater reliability in 20 individuals who had sustained cortical damage, either from stroke or traumatic brain injury. The mean age was 53 years, ranging from 26 to 72 years. Participants were assessed independently by two different raters. Agreement between raters, as calculated using Pearson correlation, was excellent (r = 0.99).

Hsieh, Hsueh, Chiang, and Lin (1998) assessed inter-rater reliability in 50 clients with stroke. Their mean age was 65 years old. Participants were evaluated independently, on three different days, by three raters. ICC for the total score showed excellent agreement (ICC = 0.98). Agreement between raters was also excellent for grasp, grip, pinch and gross movement subscales (ICC = 0.98; ICC = 0.96; ICC = 0.96; ICC = 0.95, respectively).

Van der Lee et al. (2001a) estimated inter-rater reliability in 20 patients with chronic stroke and a median age of 62 years old. Participants were videotaped and scored independently by two raters. Inter-rater reliability, as calculated using ICC, weighted kappa, and Spearman rho correlation, was excellent (ICC = 0.98; kappa = 0.93; rho = 0.99). With respect to the individual subscales, the gross movement scale had the lowest weighted kappa value (kappa = 0.87), suggesting this subscale has the lowest agreement between raters.

Hsueh, Lee, and Hsieh (2002a) evaluated inter-rater reliability of the ARAT performed with a regular table instead of the specially designed table for this test in 61 individuals with sub-acute stroke and a mean age of 63 years old. Participants were re-assessed with a two-day interval by three different raters. ICC for the total score showed excellent agreement (ICC = 0.99) as well as for grasp, grip, pinch and gross movement subscales (ICC = 0.99; ICC = 0.98; ICC = 0.96; ICC = 0.94, respectively).

Platz et al. (2005) analyzed inter-rater reliability of the ARAT, the Box and Block Test and the Fugl-Meyer Test upper extremity items (including items from the Motor function, Sensation and Passive Joint Motion/Joint pain subscores) in 44 individuals with upper limb paresis either from stroke, multiple sclerosis, or traumatic brain injury. Participants had the most affected arm videotaped and scored independently by two raters. Inter-rater reliability for the ARAT total score, as calculated using the ICC and Spearman rho correlation, was excellent (ICC = 0.99 and rho = 0.99). Additionally, the scores for each subscale were provided and inter-rater reliability for grasp (ICC = 0.99 and rho = 0.99), grip (ICC = 0.96 and rho = 0.95), pinch (ICC = 0.99 and rho = 0.99) and gross movement (ICC = 0.98 and rho = 0.98) subscales were all excellent.
Note: These results apply only to the most affected upper limb.

Yozbatiran et al. (2008) evaluated inter-rater reliability in 9 clients with chronic stroke. Participants were scored simultaneously and independently by two raters. Inter-rater reliability for the total score, as calculated using the ICC and Spearman rho correlation, was excellent (ICC = 0.99 and rho = 0.96). The same excellent level of inter-rater reliability was found for the grasp, grip, pinch and gross motor movement subscales (ICC = 0.99 and rho = 1; ICC = 0.99 and rho = 0.99; ICC = 0.99 and rho = 0.98; ICC = 0.97 and rho = 0.93, respectively).

Nijland et al. (2010) investigated the psychometric properties of the ARAT and Wolf Motor Function Test in 40 patients with stroke with mild to moderate hemiparesis. Eighteen patients participated in the reproducibility testing of the ARAT and were assessed in random order by two observers, within one week. Inter-rater reliability, as analyzed using the ICC, was found to be excellent (ICC = 0.92).

Validity

Content:

Lyle (1981) generated the 19 ARAT items from the 33 items of the Upper Extremity Function Test (UEFT; Carroll, 1965). Items were removed if they showed a low inter-item correlation, were redundant (confirmed through a very high inter-item correlation, above r = 0.9), or were extremely difficult to perform. Nevertheless, ARAT items were not based on a theoretical model (Finch, Brooks, Stratford, & Mayo, 2002).

Criterion:

Concurrent:
No gold standard exists against which to compare the ARAT.

Lin, Chuang, Wu, Hsieh and Chang (2010) compared the concurrent validity of the ARAT, Box and Block Test (BBT) and Nine-Hole Peg Test (NHPT) for evaluating hand dexterity in 59 patients with stroke. The Fugl-Meyer Assessment of Sensorimotor Recovery After Stroke (FMA), Motor Activity Log (MAL) and Stroke Impact Scale (SIS) were also administered to assess the concurrent validity of the ARAT, BBT and NHPT. Using the Spearman rank correlation coefficient, the ARAT, BBT and NHPT were found to have adequate to excellent correlations with each other at pre-treatment (ranging from rho = -0.55 to -0.80) and post-treatment (ranging from rho = -0.57 to -0.71). In addition, the ARAT and BBT were found to have adequate correlations with the FMA, MAL and SIS (ranging from rho = 0.31 to 0.59); however, the NHPT had only poor to adequate correlations with the FMA and MAL (ranging from rho = -0.16 to -0.33), and adequate to excellent correlations with the SIS (ranging from rho = -0.58 to -0.66). When considering the results of both the responsiveness and validation components of the study, the ARAT and BBT are believed to be more appropriate than the NHPT for evaluating dexterity.

Predictive:
No studies have examined the predictive validity of the ARAT.

Construct:

Convergent/Discriminant:
DeWeerdt and Harrison (1985) evaluated the convergent validity of the ARAT by comparing it to the Fugl-Meyer test (Fugl-Meyer et al., 1975) in 53 clients with acute stroke. Their mean age was 68 years. Correlations were calculated at two points in time after stroke onset using Spearman correlation coefficient. Excellent correlations were found between the ARAT and Fugl-Meyer test at 2 months (rho = 0.91) and at 8 months (rho = 0.94) post-stroke.

Wagenaar, Meijer, van Wierinen, Kuik, Hazenberg, Lindeboom, Wichers and Rijswijk (1990) evaluated the convergent validity of the ARAT by comparing it to the Sollerman test (Jacobson-Sollerman & Sperling, 1977) in seven patients with acute stroke. An excellent correlation, as calculated using Spearman rho, was found (rho = 0.94).
Note: The Sollerman test measures hand grip function using 20 different daily life activities requiring hand movements.

Hsieh et al. (1998) assessed convergent validity of the ARAT by comparing it to the Upper Extremity portion of the Motor Assessment Scale (Carr et al., 1985), the arm subscale of the Motricity Index (Demeurisse, Demol, & Robaye, 1980), and the upper extremity movements of the Modified Motor Assessment Chart (Lindmark & Hamrin, 1988) in 50 clients with stroke. The mean age of clients was 65 years old. Correlations were calculated using Pearson Correlation Coefficients. Excellent correlations were found between the ARAT and the Upper Extremity part of the Motor Assessment Scale (r = 0.96), the Motricity Index (r = 0.87) and the upper extremity movements of the Modified Motor Assessment Chart (r = 0.94).

Platz et al. (2005) tested convergent validity of the ARAT by comparing it to the Box and Block Test (Cromwell, 1965; Mathiowetz et al., 1985a), the Fugl-Meyer Test upper extremity items (including items from the Motor Function, Sensation and Passive Joint Motion/Joint Pain subscores) (Fugl-Meyer et al., 1975), the Motricity Index (Demeurisse et al., 1980), the Ashworth Scale (Ashworth, 1964), the Hemispheric Stroke Scale (Adams, Meador, Sethi, Grotta, & Thomson, 1986) and the Modified Barthel Index (Collin, Wade, Davies, & Horne, 1988) in 56 participants with upper extremity paresis either from stroke (n=37), multiple sclerosis (n=14), or traumatic brain injury (n=5). Correlations were calculated using the Spearman Correlation Coefficient. Excellent correlations were found between the ARAT and the Box and Block Test (rho = 0.95), the Motor Function subscore of the Fugl-Meyer Test (rho = 0.92), the Motricity Index (rho = 0.81), and the Hemispheric Stroke Scale (rho = -0.66). Adequate correlations were found between the ARAT and the Passive Joint Motion/Joint Pain subscore of the Fugl-Meyer Test (rho = 0.42). Poor correlations were found between the ARAT and the Sensation subscore of the Fugl-Meyer Test (rho = 0.29), the Ashworth Scale (rho = -0.29) and the Modified Barthel Index (rho = 0.04).
Note: Negative correlations are observed because a high score on the ARAT indicates normal performance, whereas a low score on the Hemispheric Stroke Scale and the Ashworth Scale indicates normal performance.
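Because the sign of a correlation depends only on the direction in which each scale is scored, the qualitative labels used in this section depend on the magnitude of the coefficient. The Python sketch below illustrates this. The cutoffs (0.60 and 0.31) are an assumption inferred from how the labels are applied to the values reported in this section (for example, 0.60 is labelled excellent, 0.58 and 0.31 adequate, and 0.29 poor); they are not a rule stated by the cited authors, and the function name is hypothetical.

```python
def label_correlation(coefficient: float) -> str:
    """Label a correlation by its absolute value.

    Cutoffs are assumed for illustration, inferred from the labels
    used in this review rather than taken from a stated rule.
    """
    if not -1.0 <= coefficient <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    magnitude = abs(coefficient)  # the sign reflects scale direction only
    if magnitude >= 0.60:
        return "excellent"
    if magnitude >= 0.31:
        return "adequate"
    return "poor"


print(label_correlation(-0.66))  # excellent (Hemispheric Stroke Scale)
print(label_correlation(0.42))   # adequate (Passive Joint Motion/Joint Pain)
print(label_correlation(-0.29))  # poor (Ashworth Scale)
```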

Lang, Wagner, Dromerick, and Edwards (2006) evaluated the convergent validity of the ARAT in 50 individuals with acute to sub-acute stroke, mean age of 63 years old, attending an acute neurology stroke service at three points in time: admission (day 0); post intervention (day 14); and 90 days post-stroke (day 90). The ARAT was compared to measures of sensorimotor impairment (e.g. light touch sensation, pain, elbow joint spasticity, upper extremity strength), to kinematic measures (e.g. reach and grasp), to the Functional Independence Measure (FIM) (Keith, Granger, Hamilton, & Sherwin, 1987), and to the National Institutes of Health Stroke Scale (NIHSS) (Brott, Adams, Olinger, Marler, Barsan, Biller, et al., 1989). At day 0, excellent correlations were found between the ARAT and upper extremity strength (r = 0.60) and grasp speed (r = 0.60). Adequate correlations were found between the ARAT and grasp efficiency (r = 0.42), reach efficiency (r = -0.38) and reach speed (r = 0.40), and the FIM upper extremity score (r = 0.38). Poor correlations were found between the ARAT and the NIHSS (r = -0.15), light touch sensation (r = 0.15), pain (r = 0.10), elbow joint spasticity (r = -0.28) and the FIM total score (r = 0.20). At day 14, excellent correlations were found between the ARAT and grasp efficiency (r = 0.60) and the FIM upper extremity scores (r = 0.62). Adequate correlations were found between the ARAT and elbow spasticity (r = 0.49), upper extremity strength (r = 0.42), reach efficiency (r = -0.58), grasp speed (r = 0.36) and the FIM total score (r = 0.52). Poor correlations were found between the ARAT and the NIHSS (r = -0.24), light touch sensation (r = -0.20), and pain (r = -0.12). At day 90, excellent correlations were found between the ARAT and upper extremity strength (r = 0.60). Adequate correlations were found between the ARAT and elbow spasticity (r = -0.42), reach efficiency (r = -0.42), reach speed (r = 0.50), grasp efficiency (r = -0.48), grasp speed (r = 0.38) and the FIM upper extremity (r = 0.42) and total scores (r = 0.40). Poor correlations were found between the ARAT and the NIHSS (r = -0.29), light touch sensation (r = 0.00), and pain (r = 0.22). In summary, this study’s findings suggest that the NIHSS, light touch sensation, and pain do not relate to the ARAT. The relationship between the ARAT and FIM scores is stronger early on post-stroke and stabilizes by the ninetieth day.

Rabadi and Rabadi (2006) examined convergent validity of the ARAT by comparing it to the Fugl-Meyer Assessment (Fugl-Meyer et al., 1975) at admission and discharge from an acute stroke rehabilitation unit in 104 inpatients with acute stroke with a mean age of 72 years. The correlation between ARAT and the Fugl-Meyer Assessment was excellent both at admission (rho = 0.77) and discharge (rho = 0.87).

Yozbatiran et al. (2008) estimated the convergent validity of the ARAT by comparing it to the arm motor Fugl-Meyer Assessment (Fugl-Meyer et al., 1975) score in 12 clients with chronic stroke and a mean age of 61 years. An excellent correlation (r = 0.94) was found between the ARAT and the arm motor Fugl-Meyer score.

Known groups:
No studies have examined known groups validity of the ARAT.

Responsiveness

Van der Lee, Beckerman, Lankhorst, and Bouter (2001a) evaluated the responsiveness of the ARAT and the Fugl-Meyer Assessment (Fugl-Meyer et al., 1975) in 22 clients with chronic stroke, with a mean age of 58 years, receiving intensive forced use treatment. Participants were assessed two weeks pre-treatment and two weeks post-treatment. A responsiveness ratio was calculated. Compared to the Fugl-Meyer Assessment, the ARAT had a greater responsiveness ratio (2.03 for the ARAT vs. 0.41 for the Fugl-Meyer Assessment), suggesting that the ARAT is more sensitive to change.
Note: The responsiveness ratio is a variant of effect size and higher values indicate better responsiveness.

Van der Lee, Roorda, Beckerman, and Lankhorst (2002) estimated the responsiveness of a modified version of the ARAT in 63 participants with chronic stroke. In this study, researchers did not follow Lyle’s standardized instructions. Instead, they administered all 19 ARAT items to verify any possible effect of this format on its psychometric properties. A responsiveness ratio was calculated. Compared to the hierarchical version proposed by Lyle, performing all 19 items was found to improve the measure’s responsiveness, with a responsiveness ratio of 1.7 compared to 1.2 with Lyle’s version.
Note: The responsiveness ratio can be considered an estimate of effect size normalized to the variability in a stable population and higher values indicate better responsiveness.

Hsueh and Hsieh (2002b) analyzed the responsiveness of the ARAT and the upper extremity section of the Motor Assessment Scale (Carr et al., 1985) in 48 participants with acute stroke and a mean age of 62 years. Participants were assessed at two points in time: admission and discharge from the acute rehabilitation centre. The ARAT total score demonstrated a moderate effect size of 0.52, while the Motor Assessment Scale total score demonstrated a small effect size of 0.45.

Lang et al. (2006) examined the responsiveness of the ARAT in 50 participants with acute to subacute stroke, with a mean age of 63 years, receiving constraint-induced movement therapy (CIMT). Assessments were performed at three points in time: baseline, immediately post-treatment, and 2.5 months post-treatment. Effect sizes and responsiveness ratios were calculated. ARAT total and subscale scores at the first follow-up evaluation were similar, with moderate to large effect sizes (ARAT total score = 1.01; grasp subscore = 1.04; pinch subscore = 0.85; grip subscore = 1.01; and gross movement subscore = 0.72). The second follow-up evaluation demonstrated large effect sizes, each higher than at the first evaluation (ARAT total score = 1.39; grasp subscore = 1.22; pinch subscore = 1.49; grip subscore = 1.32; and gross movement subscore = 0.98). The responsiveness ratio for the ARAT total score was 5.2 at the first follow-up evaluation and 7.0 at the second. These two responsiveness estimates suggest that the ARAT is a sensitive tool for detecting change even months after stroke onset.
Note: Responsiveness ratio is a variant of effect size and higher values indicate better responsiveness.

Rabadi and Rabadi (2006) assessed the responsiveness of the ARAT and the Fugl-Meyer Assessment (Fugl-Meyer et al., 1975) in 104 participants with acute stroke, with a mean age of 72 years, undergoing inpatient rehabilitation. Participants were evaluated at admission and discharge from acute care. The Standardized Response Mean (SRM) was used to calculate responsiveness. The ARAT was slightly less sensitive than the Fugl-Meyer Assessment (SRM = 0.68 and 0.74, respectively). However, since the difference between the SRMs for these two measures was minimal, the tests can be considered equally sensitive to change during inpatient acute rehabilitation. This result is contrary to the one presented by Van der Lee et al. (2001a). The discrepancy may be due to differences in the age and stroke severity of the two study populations.
Note: SRM is a variant of effect size and higher values indicate better responsiveness.

Lin, Chuang, Wu, Hsieh and Chang (2010) evaluated the responsiveness of the ARAT, the Box and Block Test (BBT), and the Nine-Hole Peg Test (NHPT) for evaluating hand dexterity in 59 patients with subacute stroke (< 6 months) and Brunnstrom stage IV to VI for proximal and distal upper extremity function. Patients were randomly assigned to receive constraint-induced therapy, bilateral arm training or control treatment, and received 2 hours of therapy, 5 days per week for 3 weeks. Assessments were performed at baseline and at 3 weeks. Using the Standardized Response Mean (SRM) to calculate responsiveness, the ARAT, BBT and NHPT were all found to have moderate SRMs (0.79, 0.74 and 0.64, respectively), indicating sensitivity for detecting change in hand dexterity. Considering both the responsiveness and validation components of the study, the ARAT and BBT are believed to be more appropriate than the NHPT for evaluating dexterity.
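The three responsiveness indices reported in this section (effect size, SRM and responsiveness ratio) all divide the mean change score by a measure of variability and differ only in which variability is used. The sketch below uses the commonly cited definitions: the standard deviation of baseline scores for the effect size, the standard deviation of change scores for the SRM, and the standard deviation of change scores in a stable group for the responsiveness ratio. It is an illustration under those assumptions, not the exact computation used in any of the studies above; the function names and the scores are hypothetical.

```python
from statistics import mean, stdev


def change_scores(baseline, follow_up):
    """Paired change scores (follow-up minus baseline)."""
    return [after - before for before, after in zip(baseline, follow_up)]


def effect_size(baseline, follow_up):
    """Mean change divided by the SD of the baseline scores."""
    return mean(change_scores(baseline, follow_up)) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the SD of the change scores."""
    changes = change_scores(baseline, follow_up)
    return mean(changes) / stdev(changes)


def responsiveness_ratio(baseline, follow_up, stable_changes):
    """Mean change divided by the SD of change in a stable group."""
    return mean(change_scores(baseline, follow_up)) / stdev(stable_changes)


# Hypothetical ARAT total scores for three treated clients.
baseline = [10, 20, 30]
follow_up = [15, 27, 39]
# Hypothetical change scores for clients who were clinically stable.
stable_changes = [-1, 0, 1]

print(effect_size(baseline, follow_up))                           # 0.7
print(standardized_response_mean(baseline, follow_up))            # 3.5
print(responsiveness_ratio(baseline, follow_up, stable_changes))  # 7.0
```

Because the denominators differ, the same mean change can yield very different values for the three indices, so values should only be compared between measures when the same index was used.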

References

  • Adams, R.J., Meador, K.J., Sethi, K.D., Grotta, J.C., & Thomson, D.S. (1986). Graded neurologic scale for the use in acute hemispheric stroke treatment protocols. Stroke, 18, 665-669.
  • Ashworth, B. (1964). Preliminary trial of carisoprodol in multiple sclerosis. Practitioner, 192, 540-542.
  • Brott, T.G., Adams, H.P., Olinger, C.P., Marler, J.R., Barsan, W.G., Biller, J., Spilker, J., Holleran, R., Eberle, R., Hertzberg, V., Rorick, M., Moomaw, C.J., & Walker, M. (1989). Measurements of acute cerebral infarction: a clinical examination scale. Stroke, 20, 864-870.
  • Carroll, D. (1965). A quantitative test of upper extremity function. Journal of Chronic Diseases, 18, 479-491.
  • Carr, J.H., Shepherd, R.B., Nordholm, L., & Lynne, D. (1985). Investigation of a new motor assessment scale for stroke patients. Physical Therapy, 65, 175-180.
  • Collin, C., Wade, D.T., Davies, S., & Horne, V. (1988). The Barthel ADL Index: a reliability study. International Disability Studies, 10, 61-63.
  • Cromwell, F.S (1965). Occupational therapists manual for basic skills assessment: primary prevocational evaluation. Pasadena, (CA): Fair Oaks Printing; 29-31.
  • Demeurisse, G., Demol, O., & Robaye, E. (1980). Motor evaluation in vascular hemiplegia. European Neurology, 19(6), 382-389.
  • De Weerdt, W.J.G., & Harrison, M.A. (1985). Measuring recovery of arm hand function in stroke patients: a comparison of the Brunnstrom-Fugl-Meyer test and the Action Research Arm Test. Physiotherapy Canada, 37, 65-70.
  • Finch, E., Brooks, D., Stratford, P.W., & Mayo, N.E. (2002). Physical Outcome Measures: A guide to enhance physical outcome measures. Ontario, Canada: Lippincott, Williams, & Wilkins.
  • Fugl-Meyer, A.R., Jääskö, L., Leyman, I., Olsson, S., & Steglind, S. (1975). The post-stroke hemiplegic patient 1. A method for evaluation of physical performance. Scandinavian Journal of Rehabilitation Medicine, 7, 13-31.
  • Gowland, C., Van-Hullenaar, S., Torresin, W., et al. (1995). Chedoke-McMaster Stroke Assessment: development, validation, and administration manual. Hamilton, (ON), Canada: School of Rehabilitation Science, McMaster University.
  • Heller, A., Wade, D.T., Wood, V.A., Sunderland, A., Hewer, R., & Ward, E. (1987). Arm function after stroke: measurement and recovery over the first three months. Journal of Neurology, Neurosurgery & Psychiatry, 50(6), 714-719.
  • Hsieh, C.L., Hsueh, I.P, Chiang, F., & Lin, P. (1998). Inter-rater reliability and validity of the action research arm test in stroke patients. Age and Ageing, 27, 107-113.
  • Hsueh, I.P, Lee, M.M., & Hsieh, C.L. (2002a). The action research arm test: Is it necessary for patients being tested to sit at a standardized table? Clinical Rehabilitation, 16, 382-388.
  • Hsueh, I.P. & Hsieh, C.L. (2002b). Responsiveness of two upper extremity function instruments for stroke inpatients receiving rehabilitation. Clinical Rehabilitation, 16, 617-624.
  • Jacobson-Sollerman, X & Sperling, Y. (1977). Grip function of the healthy hand in a standardized hand function test. A study of the Rancho Los Amigos test. Scandinavian Journal of Rehabilitation Medicine, 9(3), 123-129.
  • Keith, R.A, Granger, C.V., Hamilton, B.B., & Sherwin, F.S. (1987). The Functional Independence Measure: a new tool for rehabilitation. In: Eisenberg, M.G. & Grzesiak, R.C. (Ed.), Advances in clinical rehabilitation (pp. 6-18). New York: Springer Publishing Company.
  • Kellor, M., Frost, J., Silberberg, N., Iversen, I., & Cummings R. (1971). Hand strength and dexterity. American Journal of Occupational Therapy, 25, 77-83.
  • Lang, C.E., Wagner, J.M., Dromerick, A.W., & Edwards, D.F. (2006). Measurement of upper extremity function early after stroke: properties of the action research arm test. Archives of Physical Medicine and Rehabilitation, 87, 1605-1610.
  • Lin, K-C., Chuang, L-L., Wu, C-Y., Hsieh, Y-W. & Chang, W-Y. (2010). Responsiveness and validity of three dexterous function measures in stroke rehabilitation. Journal of Rehabilitation Research and Development, 47(6), 563-572.
  • Lindmark, B. & Hamrin, E. (1988). Evaluation of function capacity after stroke as a basis for active intervention: Presentation of a modified chart for motor capacity assessment and its reliability. Scandinavian Journal of Rehabilitation Medicine, 20, 103-109.
  • Lyle, R.C. (1981). A performance test for assessment of upper limb function in physical rehabilitation treatment and research. International Journal of Rehabilitation Research, 4, 483-492.
  • Mathiowetz, V., Volland, G., Kashman, N., & Weber, K. (1985a). Adult norms for the box and block test of manual dexterity. American Journal of Occupational Therapy, 39, 386-391.
  • Mathiowetz, V., Weber, K., Kashman, N., & Volland, G. (1985b). Adult norms for the nine hole peg test of finger dexterity. Occupational Therapy Journal of Research, 5, 24-33.
  • Nijland, R., van Wegen, E., Verbunt, J, van Wijk, R., van Kordelaar, J. & Kwakkel, G. (2010) A comparison of two validated tests for upper limb function after stroke: The Wolf Motor Function Test and the Action Research Arm Test. Journal of Rehabilitation Medicine, 42, 694-696.
  • Platz, T., Pinkowski, C., van Wijck, F., Kim, I.H., di Bella, P., & Johnson, G. (2005). Reliability and validity of arm function assessment with standardized guidelines for the Fugl-Meyer Test, Action Research Arm Test and Box and Block Test: a multicentre study. Clinical Rehabilitation, 19(4), 404-411.
  • Rabadi, M.H. & Rabadi, F.M. (2006). Comparison of the action research arm test and the Fugl-Meyer Assessment as measures of upper-extremity motor weakness after stroke. Archives of Physical Medicine and Rehabilitation, 87, 962-966.
  • Van der Lee, J.H., Beckerman, H., Lankhorst, G.J., & Bouter, L.M. (2001a). The responsiveness of the Action Research Arm Test and the Fugl-Meyer Assessment Scale in chronic stroke patients. Journal of Rehabilitation Medicine, 33, 110-113.
  • Van der Lee, J.H., Groot, V., Beckerman, H., Wagenaar, R.C., Lankhorst, G.J., & Bouter, L.M. (2001b). The intra-rater and interrater reliability of the action research arm test: a practical test of upper extremity function in patients with stroke. Archives of Physical Medicine and Rehabilitation, 82, 14-19.
  • Van der Lee, J.H, Roorda, L.D., & Lankhorst, G.J. (2002). Improving the Action Research Arm Test: a unidimensional hierarchical scale. Clinical Rehabilitation, 16, 646-653.
  • Wagenaar, R.C., Meijer, O.G., van Wieringen, P.C., Kuik, D.J., Hazenberg, G.J., Lindeboom, J., et al. (1990). The functional recovery of stroke: a comparison between neuro-developmental treatment and the Brunnstrom method. Scandinavian Journal of Rehabilitation Medicine, 22, 1-8.
  • Yozbatiran, N., Der-Yeghiaian, L., & Cramer, S.C. (2008). A standardized approach to performing the action research arm test. Neurorehabilitation & Neural Repair, 22(1), 78-90.

See the measure

How to obtain the Action Research Arm Test:

The ARAT can be obtained in the studies by Lyle (1981), Hsieh et al. (1998), Van der Lee et al. (2002), Rabadi and Rabadi (2006), and Yozbatiran et al. (2008), and from the website http://www.aratest.eu/Index_english.htm. Standardized equipment can be purchased from http://www.aratest.eu/ or from http://www.saliarehab.com/.
