American Speech-Language-Hearing Association Functional Assessment of Communication Skills for adults (ASHA-FACS)

Evidence Reviewed as of before: 26-11-2010
Author(s)*: Sabrina Figueiredo, BSc
Editor(s): Lisa Zeltzer, MSc OT; Nicol Korner-Bitensky, PhD OT

Purpose

The ASHA-FACS is a measure of functional communication. It does not aim to measure impairment. Rather, the assessment aims to measure how specific speech, language, hearing and/or cognitive deficits affect the performance of daily life activities (Frattali, Holland, Thompson, Wohl, & Ferketic, 1995).

In-Depth Review

Purpose of the measure

The ASHA-FACS is a measure of functional communication. It does not aim to measure impairment. Rather, the assessment aims to measure how specific speech, language, hearing and/or cognitive deficits affect the performance of daily life activities (Frattali, Holland, Thompson, Wohl, & Ferketic, 1995).

Available versions

The ASHA-FACS was developed in 1995 by Frattali, Holland, Thompson, Wohl, and Ferketic.

Features of the measure

Items:
The ASHA-FACS consists of 43 items. The items within the ASHA-FACS address four domains: social communication; communication of basic needs; reading, writing and number concepts; and daily planning. It covers different activities of daily living (ADLs) such as understanding television and radio, responding in an emergency, and using a calendar (Frattali et al., 1995).

The ASHA-FACS is also subdivided into 2 scales: Communicative Independence and Qualitative Dimensions of Communication. These scales measure the level of independence, and the nature of the functional deficit, respectively (Frattali et al., 1995).

The ASHA-FACS is in an observational format and requires that the examiner be familiar with the client prior to rating his/her communication. A major strength of the ASHA-FACS is its simple wording and behavioral operationalization of item content (Glueckauf, Blonder, Ecklund-Johnson, Maher, Crosson, & Gonzalez-Rothi, 2003).

The examiner can also solicit judgments from significant others in order to augment his/her observations (Frattali et al., 1995). It has been recommended that three observational sessions be conducted before ratings are made (Worrall & Yiu, 2000).

Scoring:
Items on both scales are rated after observing the client's functional communication.

The Communicative Independence Scale is rated on a 7-point scale ranging from 1) “does not perform the behavior” to 7) “does perform the behavior”. It reflects the extent of assistance the individual requires to perform routine verbal and nonverbal transactions. The Qualitative Dimensions of Communication Scale is rated on a 5-point scale reflecting adequacy, appropriateness, and promptness of communication and communication sharing.

The total score is obtained by summing the item scores and dividing by the number of items (i.e., a mean score). Scores for each of the two scales can be calculated the same way, using the mean of the items on the scale of interest (Frattali et al., 1995).

Lower scores are indicative of greater impairment.
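Because both the overall score and the scale scores are item means, the arithmetic is simple to sketch. The following Python snippet is illustrative only: the domain names follow the manual, but the ratings and the number of items per domain are hypothetical (the real form has 43 items).

```python
# Illustrative sketch of ASHA-FACS mean-based scoring.
# Domain names are from the manual; item counts and ratings are hypothetical.

def mean_score(ratings):
    """Average a list of 1-7 Communicative Independence ratings."""
    return sum(ratings) / len(ratings)

# Hypothetical ratings grouped by the four ASHA-FACS domains.
domains = {
    "social_communication": [6, 5, 7, 4],
    "basic_needs": [7, 7, 6],
    "reading_writing_number_concepts": [3, 4, 2],
    "daily_planning": [5, 4],
}

# A mean per domain, and one overall mean across all items.
domain_means = {name: mean_score(items) for name, items in domains.items()}
all_items = [r for items in domains.values() for r in items]
overall = mean_score(all_items)  # lower values indicate greater impairment
```

The same mean-of-items logic applies to the Qualitative Dimensions of Communication scale, except that its items are rated 1 to 5.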

Time:
The ASHA-FACS takes, on average, 20 minutes to score. Total administration time will vary according to the time spent on the observational sessions (Frattali et al., 1995).

Subscales:
The ASHA-FACS is comprised of 2 subscales (Frattali et al., 1995):
– Communicative Independence
– Qualitative Dimensions of Communication

Equipment:
Not reported.

Training:
Not reported.

Alternative forms of the ASHA-FACS

None.

Client suitability

Can be used with:

  • Clients with stroke.
  • Clients with communicative deficits.

Should not be used in:

  • The ASHA-FACS should not be used with clients who are unable to communicate.

In what languages is the measure available?

English (Frattali et al., 1995)

Summary

What does the tool measure? The ASHA-FACS is a measure of functional communication.
What types of clients can the tool be used for? The ASHA-FACS can be used with, but is not limited to clients with stroke.
Is this a screening or assessment tool? Assessment
Time to administer The ASHA-FACS takes approximately 20 minutes to administer
Versions None
Other Languages English
Measurement Properties
Reliability Internal consistency:
One study examined the internal consistency of the ASHA-FACS and reported that, after deleting items with low correlations, all remaining item scores and domain scores correlated well with the overall score.
Intra-rater:
One study examined the intra-rater reliability of the ASHA-FACS and reported excellent reliability using Pearson correlations.
Inter-rater:
One study examined the inter-rater reliability of the ASHA-FACS and reported excellent reliability using Pearson correlations.
Validity Content:
One study examined the content validity of the ASHA-FACS and reported the ASHA-FACS items are representative of the everyday communication of older individuals in Australia.
Construct:
Convergent:
Two studies examined the convergent validity of the ASHA-FACS and reported excellent correlations between the social communication domain of the ASHA-FACS and the FOQ-A; adequate correlations between the ASHA-FACS and the Western Aphasia Battery, the SCATBI, the communication domain of the Functional Independence Measure, and the Rancho Los Amigos Levels of Cognitive Functioning; and poor correlations between the ASHA-FACS and the SF-36.
Known Groups:
One study examined known groups validity of the ASHA-FACS using Wilcoxon W and reported that the ASHA-FACS is able to discriminate between healthy individuals and those with communicative impairments.
Floor/Ceiling Effects No studies have examined floor/ceiling effects of the ASHA-FACS in clients with stroke.
Sensitivity / Specificity One study examined the sensitivity/specificity of the ASHA-FACS by comparing it with the Aphasia Checklist Score as the gold standard. A cut-off of 6.99 on the Communicative Independence scale yielded a sensitivity of 90% and a specificity of 70%, while a cut-off of 4.81 on the Qualitative Dimensions of Communication scale yielded a sensitivity of 100% and a specificity of 90%.
Does the tool detect change in patients? No studies have examined the responsiveness of the ASHA-FACS in clients with stroke.
Acceptability Examiners must have adequate opportunity for direct observation of communication at home or in the community. Additional information, as required, may be obtained from the significant other.
Feasibility The administration of the ASHA-FACS is quick and simple.
How to obtain the tool? The ASHA-FACS can be purchased from the ASHA website: http://www.asha.org/eWeb/OLSDynamicPage.aspx?Webcode=olsdetails&title=Functional+Assessment+of+Communication+Skills+for+Adults+(ASHA+FACS)

Psychometric Properties

Overview

We conducted a literature search to identify all relevant publications on the psychometric properties of the ASHA-FACS in individuals with stroke. We identified 5 studies.

Floor/Ceiling Effects

No studies have reported floor or ceiling effects of the ASHA-FACS in clients with stroke.

Reliability

Internal Consistency:
Frattali, Holland, Thompson, Wohl, and Ferketic (1995) verified the internal consistency of the ASHA-FACS in 32 subjects with stroke and 26 with traumatic brain injury. Items with correlations lower than 0.70 were deleted. All the other item scores and domain scores correlated well with the overall score.

Intra-rater:
Frattali et al. (1995) analyzed the intra-rater reliability of the ASHA-FACS. Intra-rater reliability, as calculated using the Pearson correlation coefficient, was excellent (r = 0.99).

Inter-rater:
Frattali et al. (1995) evaluated the inter-rater reliability of the ASHA-FACS. Speech language pathologists from five different centers throughout the US evaluated 32 subjects with stroke and 26 with traumatic brain injury. Inter-rater reliability, as calculated using Pearson Correlation Coefficient, was excellent (r = 0.82).
Note: the exact number of therapists was not provided.

Validity

Content:
Davidson, Worrall and Hickson (2003) analyzed the content validity of the ASHA-FACS among 30 older Australians (15 with stroke and aphasia and 15 healthy individuals). This observational study reported that the ASHA-FACS items are representative of the everyday communication of older individuals in Australia.

Criterion:
Concurrent:
No studies have reported the concurrent validity of the ASHA-FACS in clients with stroke.

Predictive:
No studies have reported the predictive validity of the ASHA-FACS in clients with stroke.

Sensitivity/ Specificity:
Ross and Wertz (2004) estimated the sensitivity and specificity of the ASHA-FACS by comparing it with the Aphasia Checklist Score (Ross and Wertz, 2004), as the gold standard, in 10 clients with stroke and aphasia. The Aphasia Checklist Score is a diagnostic test that takes into account information collected from the medical records and the World Health Organization's concepts of disability. On the Communicative Independence scale, a cut-off of 6.99 yielded a sensitivity of 90% and a specificity of 70%. On the Qualitative Dimensions of Communication scale, a cut-off of 4.81 produced 100% sensitivity and 90% specificity.

Construct:
Convergent/Discriminant:
Frattali et al. (1995) assessed the construct validity of the ASHA-FACS by comparing it to the Western Aphasia Battery (Kertesz, 1982), the Scales of Cognitive Ability for Traumatic Brain Injury (SCATBI) (Adamovich & Henderson, 1992), the communication domain of the Functional Independence Measure (FIM) (Keith, Granger, Hamilton, & Sherwin, 1987), and the Rancho Los Amigos Levels of Cognitive Functioning (Hagen, Malkmus, & Durham, 1972) in 32 subjects with stroke and 26 with traumatic brain injury. Correlations between the ASHA-FACS and all outcome measures were adequate.
Note: the authors did not report the correlation values.

Glueckauf, Blonder, Ecklund-Johnson, Maher, Crosson and Gonzalez-Rothi (2003) assessed correlations between the scores of the ASHA-FACS, the Western Aphasia Battery (Kertesz, 1982), the Functional Outcome Questionnaire for Aphasia (FOQ-A) (Glueckauf et al., 2003) and the Medical Outcomes Study 36-item Short-Form Health Survey (SF-36) (Ware & Sherbourne, 1992). All outcome measures were scored by 18 caregivers of individuals with stroke. Correlations were found to be excellent between the social communication domain of the ASHA-FACS and the FOQ-A (r = 0.74) and poor between this domain and the Aphasia Quotient of the Western Aphasia Battery and the SF-36 (r = 0.18; r = 0.19, respectively). Correlations between the basic needs domain of the ASHA-FACS and the FOQ-A were adequate (r = 0.58), while correlations between the basic needs domain of the ASHA-FACS and the Aphasia Quotient of the Western Aphasia Battery and the SF-36 were poor (r = -0.16; r = 0.14, respectively).

Known groups:
Ross and Wertz (2003) verified the ability of the ASHA-FACS to discriminate between healthy elderly individuals (n = 18) and individuals who had experienced stroke and mild aphasia (n = 18). Known groups validity, as calculated using Wilcoxon W, suggested that scores of healthy subjects were significantly higher than the scores of the participants with verbal communicative impairments. Therefore, the ASHA-FACS is able to differentiate between healthy individuals and individuals with mild aphasia.

Responsiveness

No studies have reported the responsiveness of the ASHA-FACS in clients with stroke.

References

  • Adamovich, B. B., & Henderson, J. (1992). Scales of Cognitive Ability for Traumatic Brain Injury (SCATBI). Riverside.
  • Blomert, L., Kean, M. L., Koster, C., & Schokker, J. (1994). Amsterdam-Nijmegen Everyday Language Test: construction, reliability and validity. Aphasiology, 8, 381-407.
  • Davidson, B., Worrall, L., & Hickson, L. (2003). Identifying the communication activities of older people with aphasia: Evidence from naturalistic observation. Aphasiology, 17, 243-264.
  • Frattali, C. M., Thompson, C. K., Holland, A. L., Wohl, C., & Ferketic, M. M. (1995). The FACS of Life. ASHA FACS — A Functional Outcome Measure for Adults. ASHA, 7, 40-46.
  • Glueckauf, R. L., Blonder, L. X., Ecklund-Johnson, E., Maher, L., Crosson, B., & Gonzalez-Rothi, L. (2003). Functional Outcome Questionnaire for Aphasia: overview and preliminary psychometric evaluation. NeuroRehabilitation, 18, 281-290.
  • Hagen, C., Malkmus, D., & Durham, P. (1972). Rancho Los Amigos Levels of Cognitive Functioning. Communication Disorders Service: Rancho Los Amigos Hospital.
  • Keith, R. A., Granger, C. V., Hamilton, B. B., Sherwin, F. S. (1987). The functional independence measure: A new tool for rehabilitation. Adv Clin Rehabil, 1, 6-18.
  • Kertesz, A. (1982). Western Aphasia Battery. NY: Grune & Stratton.
  • Ross, K. B. & Wertz, R. T. (2003). Discriminative validity of selected measures for differentiating normal from aphasic performance. American Journal of Speech-Language Pathology, 12, 312-319.
  • Ross, K. B. & Wertz, R. T. (2004). Accuracy of formal tests for diagnosing mild aphasia: An application of evidence-based medicine. Aphasiology, 18, 337-355.
  • Ware, J. E. Jr. & Sherbourne, C. D. (1992) The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Med Care, 30, 473-483.
  • Worrall, L. & Yiu, E. (2000). Effectiveness of functional communication therapy by volunteers for people with aphasia following stroke. Aphasiology, 14, 911-924.

See the measure

How to obtain the ASHA-FACS:

The ASHA-FACS can be purchased from the ASHA website and costs USD 165 for certified non-ASHA members and USD 124 for ASHA members. These prices do not include taxes and shipping.

Following is the website address for purchasing the ASHA-FACS: https://www.asha.org/


Amsterdam-Nijmegen Everyday Language Test (ANELT)

Evidence Reviewed as of before: 23-04-2009
Author(s)*: Sabrina Figueiredo, BSc
Editor(s): Lisa Zeltzer, MSc OT; Nicol Korner-Bitensky, PhD OT; Elissa Sitcoff, BA BSc

Purpose

The ANELT is designed to assess the level of verbal communicative abilities of individuals with aphasia. A second goal of the ANELT is to estimate a client's change in verbal communicative abilities over time (Blomert et al., 1994).

In-Depth Review

Purpose of the measure

The ANELT is designed to assess the level of verbal communicative abilities of individuals with aphasia. A second goal of the ANELT is to estimate a client's change in verbal communicative abilities over time (Blomert et al., 1994).

Available versions

The ANELT was published in 1994 by Blomert, Kean, Koster, and Schokker. There are two available versions (ANELT I and ANELT II). Both versions have the same number of items and the same difficulty level; the main difference is how the items are worded. Typically, the ANELT II is used to re-evaluate a client when the second assessment is performed within a short period of time, as using the alternate form may prevent possible learning and memory effects.

Features of the measure

Items:
The ANELT I and II consist of 10 items each that characterize familiar everyday life situations. Before starting the test, the examiner should allow the client to practice by asking two items from the measure. During the practice trial, the examiner should provide instructions and correct the client if he or she does not appear to understand the instructions (Blomert et al., 1994).

The examiner should record the administration of the ANELT on audiotape for later scoring. The examiner presents each item verbally to the patient and must avoid conversing with the client during the administration of the test. Instead, the examiner should act as an interested listener, while the client answers the items as a monologue (Blomert et al., 1994).

The items-scenarios all have a strongly conventional script-like character. They engage the interest of the client, minimize stress in the testing situation, and encourage optimal performance. The ANELT I test items are as follows (Blomert et al., 1994):

  1. You are now at the dry cleaner’s. You have come to pick this up and you get it back like this [present shirt with scorch mark]. What do you say?
  2. The kids on the street are playing football in your yard. You have asked them before not to do that. You go outside and speak to the boys. What do you say?
  3. You are in a store and want to buy a television. I am the salesperson here. ‘Can I help you?’
  4. You go to the shoemaker with this shoe. [Present shoe] There is a lot wrong with this shoe, but for some reason you want him to repair only one thing. You may choose one. What do you say?
  5. You have an appointment with the doctor. Something else has come up. You call up and what do you say?
  6. You are in the drug store and this [present glove] is lying on the floor. What do you say?
  7. You see your neighbor walking by. You want to ask him/her to come to visit some time. What do you say?
  8. Your neighbor’s dog barks all day long. You are really tired of it. You want to talk to him about it. What do you say?
  9. You have just moved in next door to me. You would like to meet me. You ring my doorbell and say…
  10. You are at the florist. You want to have a bouquet of flowers delivered to a friend. I am the salesperson. What do you say?

Scoring:
Each item is scored from 0 to 5 on two different scales: One scale is used to score understandability and is also known as ANELT A. This scale assesses whether the content of the message given by the client is interpretable. The other scale, ANELT B, rates intelligibility. This scale is independent of content and assesses whether the words provided by the client are able to be perceived or clearly recognized (Blomert et al., 1994).

A score of 0 is given when the patient, due to severe aphasia, is incapable of taking instructions and/or producing an answer. A score of 5 indicates the client’s speech is unimpaired. The total score for each scale is obtained by summing all items. The total score on each scale (ANELT A & B) ranges from 0 to 50. Scores lower than 36, on each scale, are indicative of a moderate or severe verbal communicative deficit. Non-verbal responses should only be scored when they are provided by the client to reinforce or clarify a verbal response (Blomert et al., 1994).
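The summing and cut-off logic described above can be sketched in a few lines of Python. This is an illustrative sketch only: the item ratings are hypothetical, while the 0-5 item range, the 0-50 scale totals, and the cut-off of 36 come from Blomert et al. (1994).

```python
# Illustrative sketch of ANELT scale scoring: 10 items, each rated 0-5,
# summed separately for ANELT A (understandability) and ANELT B
# (intelligibility). The ratings below are hypothetical.

def anelt_scale_score(item_scores):
    """Sum 10 item ratings (0-5 each); the scale total ranges from 0 to 50."""
    assert len(item_scores) == 10 and all(0 <= s <= 5 for s in item_scores)
    return sum(item_scores)

def moderate_or_severe_deficit(scale_score):
    """Per Blomert et al. (1994), a scale score below 36 indicates a
    moderate or severe verbal communicative deficit."""
    return scale_score < 36

anelt_a = anelt_scale_score([4, 3, 4, 5, 3, 4, 4, 3, 5, 4])  # understandability
anelt_b = anelt_scale_score([3, 3, 4, 3, 3, 4, 3, 3, 4, 3])  # intelligibility
```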

Time:
The ANELT takes 15 to 25 minutes to administer (Blomert et al., 1994).

Subscales:
The ANELT is comprised of 2 subscales:
Understandability (ANELT A) and
Intelligibility (ANELT B).

Equipment:
The ANELT requires specific props for certain items, such as a shirt with a scorch mark, a damaged shoe, and a glove.

Training:
Not reported.

Alternative forms of the ANELT

None

Client suitability

Can be used with:

  • Clients with stroke.
  • Clients with communicative deficits.

Should not be used in:

  • The ANELT should not be used with clients who are unable to communicate.

In what languages is the measure available?

Dutch, Swedish, German and English (Blomert et al., 1994; Doesborgh, 2004; Laska, 2007).

Summary

What does the tool measure? The ANELT was designed to assess verbal communicative abilities of patients with aphasia and to estimate change over time.
What types of clients can the tool be used for? The ANELT can be used with, but is not limited to clients with stroke.
Is this a screening or assessment tool? Assessment
Time to administer The ANELT takes 15 to 25 minutes to administer.
Versions ANELT I and ANELT II.
Other Languages Dutch, Swedish, German and English.
Measurement Properties
Reliability Internal consistency:
One study examined the internal consistency of the ANELT and reported a Cronbach’s alpha >0.90, which could be indicative of redundancy.

Test-retest:
One study examined the test-retest reliability of the ANELT and reported reliability across repeated measures using factor analysis.

Inter-rater:
One study examined the inter-rater reliability of the ANELT and reported, using Krippendorff analysis, excellent reliability for the understandability scale and adequate reliability for the intelligibility scale, as well as excellent agreement between naïve and expert raters using Pearson correlations.

Validity

Content:
One study examined the content validity of the ANELT and reported the item generation process for the ANELT.

Criterion:
Concurrent:
Two studies examined the concurrent validity of the ANELT and reported an excellent correlation between the ANELT and the Aachener Aphasia Test (AAT) using Pearson correlation. Moreover, the semantic component of the AAT explained 33% of the variance in ANELT scores.

Predictive:
Three studies examined the predictive validity of the ANELT and reported that the ANELT, measured shortly after stroke, predicted recovery at 6 and 18 months post-stroke. However, it does not predict the life situation of the significant other.

Construct:
Convergent:
Two studies examined the convergent validity of the ANELT and reported excellent correlations between the ANELT and the Coefficient in Norsk Grunntest for Afasi and adequate correlations between the ANELT and the Scandinavian Stroke Supervision Scale using Pearson correlation.

Longitudinal:
One study examined the longitudinal validity of the ANELT and reported, in the group receiving semantic treatment, adequate change score correlations between the ANELT and semantic measures and poor correlations between the ANELT and phonological measures. In the group receiving phonological intervention, change score correlations were adequate between the ANELT, the phonological measures and the Semantic Association Test, and poor between the ANELT and the Synonym Judgment subscale from the Psycholinguistic Assessment of Language Processing Aphasia.

Known Groups:
One study using ANOVA examined known groups validity of the ANELT and reported that the ANELT is able to discriminate between healthy individuals and those with communicative impairments.

Floor/Ceiling Effects One study reported that ceiling effects may be present when administering the ANELT to clients with mild communication deficits.
Sensitivity/Specificity One study examined the sensitivity/specificity of the ANELT and reported that an ANELT cut-off of 3.5 yields a sensitivity of 79% and a specificity of 83%.
Does the tool detect change in patients? Three studies examined the responsiveness of the ANELT and reported significant changes on the ANELT measured at 3, 6 or 18 months post-stroke. Significant changes were more pronounced in the first 3 months and in clients with fluent aphasia. Furthermore, a positive change of 8 points was identified as the minimal clinically significant change.
Acceptability Within 11 days of stroke onset, ANELT administration achieves a 90% completion rate (Laska et al., 2001)
Feasibility The administration of the ANELT is quick and simple, but requires some standardized equipment.
How to obtain the tool?

The ANELT I can be obtained on the website:

http://www.hogrefe.nl/site/?/test/show/52/

The complete pack consists of the manual, 20 forms, instruction card and CD-ROM. It costs 150.00 Euros, excluding taxes and postage.

Psychometric Properties

Overview

We conducted a literature search to identify all relevant publications on the psychometric properties of the Amsterdam Nijmegen Everyday Language Test (ANELT) in individuals with stroke. We identified 6 studies.

Floor/Ceiling Effects

Laska, Hellblom, Murray, Kahan and Von Arbin (2001) reported that ceiling effects may be present when administering the ANELT to clients with mild communication deficits.

Reliability

Internal Consistency:
Blomert, Kean, Koster, and Schokker (1994) verified the internal consistency of the ANELT I & II in 35 clients with stroke. Each version showed a Cronbach's alpha > 0.90. This result suggests the possibility of some redundant items on the ANELT.

Test-retest:
Blomert et al. (1994) examined the test-retest reliability of the ANELT I scales (A & B) in 30 clients with stroke. Participants were re-assessed within a 3-month interval. Stability over test-retest was measured using factor analysis. Comparisons for all items showed no significant differences, suggesting there was no change across the two repeated measures. These results suggest both ANELT I scales (A & B) are stable and reliable over time.

Inter-rater:
Blomert et al. (1994) evaluated the inter-rater reliability of the ANELT in 14 clients with stroke. Participants' answers were rated by 6 evaluators, who were blinded to each other's scores. Inter-rater reliability on individual items was calculated using Krippendorff analysis. Inter-rater reliability was excellent for the understandability scale (0.92) and adequate for the intelligibility scale (0.70). Additionally, in the same study, the authors analyzed the correlation between naïve and expert raters. In the first analysis, when naïve evaluators rated a video performance and expert evaluators rated an audio performance, correlations between naïve and expert raters were excellent for the understandability (r = 0.83) and the intelligibility (r = 0.63) scales. When both naïve and expert evaluators rated audio performance, correlations for the understandability and the intelligibility scales were also excellent but with higher values (r = 0.99 and 0.97, respectively).

Validity

Content:
Blomert et al. (1994) began content validation with a large number of items that they administered to 60 healthy individuals. Twenty items were then selected based on a high response rate. After completion of the test, the 60 participants were questioned about the nature of the remaining items. All items were considered highly imaginable and recognizable, independent of biographical background, and representative of daily situations.

Criterion:
Concurrent:
Blomert et al. (1994) analyzed, in 254 clients with stroke, the concurrent validity of the ANELT by comparing it with the Aachener Aphasia Test (AAT) (Huber, Poeck, Weninger & Willmes, 1983) as the gold standard. The AAT is a 10-minute semi-structured interview used to elicit information on the communicative level of the patient and also to diagnose aphasia syndromes. Correlations between the ANELT and the AAT were excellent (r = 0.81).

Doesborgh, van de Sandt-Koenderman, Dippel, van Harskamp, Koudstaal and Visch-Brink (2002) assessed the concurrent validity of the ANELT by comparing it to semantic (word-meaning) and phonological (word-sound) measures in 29 clients with stroke and aphasia. Regression analysis indicated that semantic measures, when compared to phonological measures, were better related to ANELT scores. The semantic component of the Aachener Aphasia Test (AAT) (Huber et al., 1983) explained 33% of the variance in ANELT scores.

Predictive:
Laska et al. (2001) assessed the ability of ANELT scores, measured shortly after stroke, to predict functional recovery at 18 months post-stroke in 119 clients. Linear regression analysis indicated that the ANELT scores were a significant predictor of functional recovery. Furthermore, less severe aphasia at baseline was related to higher degrees of functional recovery.

Franzen-Dahlin, Laska, Larson, Wredling, Billing, and Murray (2007) examined, in 148 clients with stroke (71 with depression and 77 with aphasia), whether age, gender, need of assistance, personality change, state of aggression, ANELT scores, Barthel Index scores (Mahoney & Barthel 1965), severity of depression, cohabitant/single and previous stroke were able to predict life situation of the significant other measured at 3 to 6 months post-stroke. Linear regression analysis indicated that ANELT scores were not a significant predictor of life situation of the significant other, which was best predicted by the need of assistance, personality change and living with the patient.

Laska, Bartfai, Hellblom, Murray and Kahan (2007) assessed the ability of the ANELT and the Coefficient in Norsk Grunntest for Afasi (Coeff) (Reinvang, 1985) to predict recovery at 6 months post-stroke. The Coefficient in Norsk Grunntest for Afasi is a measure of the severity of impairments in fluency, comprehension, naming and repetition, in addition to writing and reading (Reinvang, 1985). Predictive validity was calculated using c-statistics to derive the area under the Receiver Operating Characteristic (ROC) curve. A Coeff ≥ 49 (AUC = 0.82) and an ANELT ≥ 3.5 (AUC = 0.80) were both excellent at predicting recovery 6 months post-stroke. These results suggest that the percentage of patients correctly classified according to their recovery level at 6 months post-stroke is slightly lower when using the ANELT than when using the Coefficient in Norsk Grunntest for Afasi. An ANELT cut-off of 3.5 yields a sensitivity of 79% and a specificity of 83% for predicting recovery 6 months post-stroke.
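For readers less familiar with these statistics, sensitivity and specificity are simple proportions computed from a 2x2 classification table. The sketch below is purely illustrative; the counts are hypothetical and not taken from Laska et al. (2007).

```python
# Illustrative sensitivity/specificity calculation for a dichotomous cut-off.
# Counts are hypothetical, chosen only to approximate the reported values.

def sensitivity(true_pos, false_neg):
    """Proportion of truly positive cases the cut-off correctly identifies."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Proportion of truly negative cases the cut-off correctly identifies."""
    return true_neg / (true_neg + false_pos)

# e.g., classifying "recovered at 6 months" with ANELT >= 3.5
# in a hypothetical sample of 48 clients:
sens = sensitivity(true_pos=19, false_neg=5)   # about 0.79
spec = specificity(true_neg=20, false_pos=4)   # about 0.83
```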

Construct:
Convergent/Discriminant:
Blomert et al. (1994) assessed the construct validity of the ANELT subscales – understandability and intelligibility – in 254 clients with stroke and aphasia. Correlations between the understandability and intelligibility subscales were excellent (r = 0.70). However, the strength of the association between the ANELT subscales varied according to the type of aphasia: excellent in clients with Wernicke’s aphasia (r = 0.66), adequate in clients with Global and Rest aphasia (r = 0.35; r = 0.33, respectively), and poor in clients with Anomic and Broca’s aphasia (r = 0.28; r = 0.27, respectively). These results suggest that within a large sample, with large variance, both scales are reflective of verbal communicative impairments. However, within certain types of aphasia, the ANELT is able to measure two sub-constructs: understandability and intelligibility. In other words, these two scales are not completely independent, but each contributes uniquely to the overall validity of the construct of verbal communication.

Laska et al. (2007) examined the convergent validity of the ANELT by comparing it to the Coefficient in Norsk Grunntest for Afasi (Reinvang, 1985) and the Scandinavian Stroke Supervision Scale, a measure of neurological impairment (Röden-Jüllig, Britton, Gustavsson, & Fugl-Meyer, 1994), at baseline and 6 months later. The number of participants ranged from 72 to 118. Correlations between the ANELT and the Coefficient in Norsk Grunntest for Afasi were excellent both at baseline and 6 months later (r = 0.71 and r = 0.87, respectively). Adequate correlations were found between the ANELT and the Scandinavian Stroke Supervision Scale (r = 0.33; r = 0.53). This result suggests that aphasia severity is directly associated with neurological impairment.

Longitudinal:
Doesborgh et al. (2004) analyzed the longitudinal validity of the ANELT in 29 clients with stroke and aphasia by comparing change scores on the ANELT with change scores on semantic and phonological measures within each group. Semantic measures, which reflect word meaning, were composed of the Semantic Association Test (SAT) (Visch-Brink, Denes, & Stronks, 1996) and the Synonym Judgment subscale from the Psycholinguistic Assessment of Language Processing Aphasia (PALPA) (Kay, Lesser, & Coltheart, 1992). Phonological measures, which concern word sounds, were represented by the Repetition Non-words and Lexical Decisions subscales from the PALPA. In the SAT, the client is required to make a semantic association with the target (word or picture) by grouping the relevant information from a multiple-choice set (words or pictures), while the PALPA assesses orthography, phonology, word and picture semantics, morphology and syntax, and is therefore a complete assessment of language impairment. Participants were randomized into two groups: either semantic or phonological intervention. The group receiving semantic treatment demonstrated adequate change score correlations between the ANELT and both semantic measures (r = 0.58; 0.34, respectively), and poor correlations between the ANELT and both phonological measures (r = 0.04; 0.24, respectively). In the group receiving phonological intervention, change score correlations were adequate between the ANELT, the phonological measures and the Semantic Association Test (0.58, 0.50, 0.40, respectively) and poor between the ANELT and the Synonym Judgment subscale from the PALPA (r = 0.16).

Known groups:
Blomert et al. (1994) verified the ability of the ANELT to discriminate between healthy individuals (n = 60) and individuals who had experienced stroke and aphasia (n = 252). ANOVA showed that scores of healthy subjects were significantly higher than those of participants with verbal communicative impairments, supporting the known-groups validity of the ANELT.

Responsiveness

Laska et al. (2001) evaluated the responsiveness of the ANELT in 119 clients with stroke and aphasia. Participants were assessed at four points in time: baseline and 3, 6, and 18 months post-stroke. Clients with fluent aphasia had greater ANELT score changes than clients with non-fluent aphasia (p < 0.0001). Additionally, ANELT score changes were greatest in the first 3 months of recovery (p < 0.0001), as would be expected based on what is known about post-stroke recovery.

Doesborgh et al. (2004) assessed the responsiveness of the ANELT in 55 clients with stroke. Participants were assessed at two points in time: at admission to a rehabilitation program and after 40 hours of treatment. In this study, the percentage of patients who showed a clinically significant improvement (> 8 points) was 39% after semantic treatment (focusing on word-meaning) compared with 35% after phonological treatment (focusing on word-sounding).
Note: The threshold for a clinically significant improvement (> 8 points) was determined by Blomert, Koster, and Kean in 1995; however, the original publication is in Dutch (Blomert, L., Koster, Ch., & Kean, M.L. Amsterdam-Nijmegen Test voor Alledaagse Taalvaardigheid. Lisse, Netherlands: Swets & Zeitlinger).

Laska et al. (2007) examined the responsiveness of the ANELT in 148 clients with stroke and aphasia. Participants were evaluated at baseline and at 6 months post-stroke. Changes on ANELT scores were significant for all participants (p<0.0001) from baseline to 6 months suggesting that the ANELT is responsive to clinical improvement.

References

  • Blomert, L., Kean, M.L., Koster, C., & Schokker, J. (1994). Amsterdam-Nijmegen Everyday Language Test: construction, reliability and validity. Aphasiology, 8, 381-407.
  • Doesborgh, S.J.C., van de Sandt-Koenderman, W.M.E., Dippel, D.W.J., van Harskamp, F., Koudstaal, P.J., & Visch-Brink, E.G. (2002). The impact of linguistic deficits on verbal communication. Aphasiology, 16, 413-423.
  • Doesborgh, S.J.C., van de Sandt-Koenderman, M.W.E., Dippel, D.W.J., van Harskamp, F., Koudstaal, P.J., & Visch-Brink, E.G. (2004). Effects of semantic treatment on verbal communication and linguistic processing in aphasia after stroke: A randomized controlled trial. Stroke, 35, 141-146.
  • Franzen-Dahlin, A., Laska, A.C., Larson, J., Wredling, R., Billing, E., & Murray, V. (2008). Predictors of life situation among significant others of depressed or aphasic stroke patients. Journal of Clinical Nursing, 17, 1574-1580.
  • Frattali, C., Thompson, C.K., Holland, A.L., Wohl, C., & Ferketic, M.M. (1995). The American Speech-Language-Hearing Association Functional Assessment of Communication Skills for Adults (ASHA FACS). Rockville, MD: ASHA.
  • Huber, W., Poeck, K., Weninger, D., & Willmes, K. (1983). Der Aachener Aphasietest. Göttingen: Hogrefe.
  • Kay, J., Lesser, R., & Coltheart, M. (1992). Psycholinguistic Assessments of Language Processing in Aphasia. Hove, UK: Lawrence Erlbaum Associates Ltd.
  • Laska, A.C., Hellblom, A., Murray, V., Kahan, T., & von Arbin, M. (2001). Aphasia in acute stroke and relation to outcome. Journal of Internal Medicine, 249, 413-422.
  • Laska, A.C., Bartfai, A., Hellblom, A., Murray, V., & Kahan, T. (2007). Clinical and prognostic properties of standardized and functional aphasia assessments. Journal of Rehabilitation Medicine, 39, 387-392.
  • Mahoney, F.I., & Barthel, D.W. (1965). Functional evaluation: The Barthel Index. Maryland State Medical Journal, 14, 61-65.
  • Reinvang, I. (1985). Aphasia and brain organisation. New York: Plenum Press.
  • Röden-Jüllig, Å., Britton, M., Gustavsson, C., & Fugl-Meyer, A. (1994). Validation of four scales for the acute stage of stroke. Journal of Internal Medicine, 236, 125-136.
  • Visch-Brink, E.G., Denes, G., & Stronks, D. (1996). Visual and verbal semantic processing in aphasia. Brain and Language, 55, 130-132.

See the measure

How to obtain the ANELT:

The ANELT can be obtained from the publisher’s website: http://www.hogrefe.nl/site/?/test/show/52/

The complete pack consists of the manual, 20 forms, instruction card and CD-ROM. It costs 150.00 Euros, excluding taxes and postage.


Boston Diagnostic Aphasia Examination (BDAE)

Evidence Reviewed as of before: 25-10-2012
Author(s)*: Sabrina Figueiredo, BSc; Vanessa Barfod, BA
Editor(s): Lisa Zeltzer, MSc OT; Nicol Korner-Bitensky, PhD OT; Elissa Sitcoff, BA BSc
Expert Reviewer: Dr. Lorraine Obler

Purpose

The BDAE is designed to diagnose aphasia and related disorders. This test evaluates various perceptual modalities (auditory, visual, and gestural), processing functions (comprehension, analysis, problem-solving) and response modalities (writing, articulation, and manipulation). The BDAE can be used by neurologists, psychologists, speech language pathologists and occupational therapists (Goodglass & Kaplan, 1972).

In-Depth Review

Purpose of the measure

The BDAE is designed to diagnose aphasia and related disorders. This test evaluates various perceptual modalities (auditory, visual, and gestural), processing functions (comprehension, analysis, problem-solving) and response modalities (writing, articulation, and manipulation). The BDAE can be used by neurologists, psychologists, speech language pathologists and occupational therapists (Goodglass & Kaplan, 1972).

Available versions

The BDAE was developed in 1972 by Goodglass and Kaplan. A second edition was published in 1983 by the same authors. The most recent edition was published in 2001 by Goodglass, Kaplan, and Barresi and contains both a shortened and extended version of the BDAE.

Features of the measure

Items and scoring:
Items and scoring on the BDAE are as follows (Goodglass & Kaplan, 1972):

1. Fluency:

In this section the client should be encouraged to engage in a free narrative and an open-ended conversation. The following features are then assessed:

Melodic line: The examiner should observe the intonational pattern in the entire sentence.

Phrase length: The examiner should observe the length of uninterrupted runs of words.

Articulatory agility: The examiner should observe how the client articulates phonemic sequences.

Grammatical form: The examiner should observe the variety of grammatical construction.

Paraphasia in running speech: The examiner should observe substitutions or insertions of semantically erroneous words in running conversation.

Word-finding: The examiner should observe the client’s capacity to evoke needed concept names and informational content in the sentences.

All features are scored on a 7-point scale where 1 is the maximum abnormality and 7 the minimum abnormality.

2. Auditory Comprehension

Word discrimination: Consists of a multiple-choice task sampling six categories of words: objects, geometric forms, letters, actions, numbers, and colors. Five words are written on cards and the client is asked to identify among them the word requested by the examiner. Clients are given 2 points for correctly identifying the word within 5 seconds, 1 point for a correct identification taking longer than 5 seconds, and half a point for localizing only the right category. The maximum score is 72. The examiner should record in writing all incorrect choices made by the client.
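The word-discrimination rule above can be sketched as a scoring function; the item count of 36 is not stated directly but follows from the maximum of 72 at 2 points per item:

```python
def word_discrimination_score(correct: bool, seconds: float,
                              right_category: bool = False) -> float:
    """Score one BDAE word-discrimination item: 2 points for a correct
    choice within 5 s, 1 point for a correct but slower choice, and
    half a point for localizing only the right category."""
    if correct:
        return 2.0 if seconds <= 5 else 1.0
    return 0.5 if right_category else 0.0

# 36 items at 2 points each give the stated maximum of 72
assert 36 * word_discrimination_score(True, 3) == 72
```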

Body-part identification: Includes 24 items; the first 18 are related to body part names and the remaining 8 to right-left comprehension. The client is asked to identify on his/her own body the body part named by the examiner. One point is given for correctly identifying the body part within 5 seconds; when more time is required, only half a point is scored. For right-left comprehension, the examiner asks the client to identify, for example, the right forearm. One point is given for correctly identifying the side within 5 seconds; when more time is required, only half a point is scored.

Commands: The client is requested to carry out commands. The score in this subscale ranges from 0 to 15.

Complex ideational material: In this section the examiner asks general questions such as “will a stone sink in water?” and the client is required to understand and express agreement or disagreement. Each item consists of two questions, one having yes and the other no as response options. One point is scored for each item with both questions correctly answered. Score ranges from 0 to 10.

3. Naming

Responsive naming: The examiner asks the client a question containing a key word associated with the expected answer. Then the client should answer the question using the following words: nouns (watch, scissors, match, drugstore); colors (green, black), verbs (shave, wash, write) and a number (twelve). Three points are given when the response is provided within 3 seconds, 2 points within 3 to 10 seconds, 1 point within 10 to 30 seconds, and 0 if the client provides an improper answer. Maximum score is 30.

Visual Confrontation: The client should name the images presented by the examiner. The visual stimulus items are from cards 2 and 3 and represent objects, geometric forms, letters, actions, numbers, colors and body parts. Three points are given when the response is given within 3 seconds, 2 points within 3 to 10 seconds, 1 point within 10 to 30 seconds, and 0 if the client is unable to provide the correct answer. Maximum score is 105.

Animal naming: The first word, “dog”, is provided by the examiner to prompt the client. The client should then name as many animals as he/she can within 60 seconds. The score is the number of different animals named by the client.

Body part naming: The examiner points to 10 body parts on himself/herself to be named by the client. Three points are given when the response is given within 3 seconds, 2 points within 3 to 10 seconds, 1 point within 10 to 30 seconds, and 0 if the client provides the wrong answer. Maximum score is 30.
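The naming subtests above (and the word-reading subtest that follows) share one latency-graded rule; a minimal sketch:

```python
def latency_points(correct: bool, seconds: float) -> int:
    """Latency-graded scoring shared by the BDAE naming and word-reading
    subtests: 3 points within 3 s, 2 points within 3-10 s, 1 point
    within 10-30 s, 0 otherwise."""
    if not correct or seconds > 30:
        return 0
    if seconds <= 3:
        return 3
    if seconds <= 10:
        return 2
    return 1

# Responsive naming has 10 items (4 nouns, 2 colors, 3 verbs, 1 number),
# so a perfect performance reaches the stated maximum of 30.
assert 10 * latency_points(True, 2) == 30
```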

4. Oral Reading

Word reading: The examiner indicates a word from card 5 that should be read by the client. Three points are given when the word is read within 3 seconds, 2 points within 3 to 10 seconds, 1 point within 10 to 30 seconds, and 0 if the client provides the wrong answer. Maximum score is 30.

Oral sentence: Ten sentences should be read from cards 6 and 7. The sentences are scored as pass (score of 1) or fail (score of 0).

5. Repetition

Words: A wide sampling of word types is presented, including a grammatical function word, objects, colors, a letter, numbers, an abstract verb of three syllables and a tongue twister. An item is scored correct if all phonemes are in correct order and recognizable. One point is allowed per item for a total of 10.

High and low probability sentences: The sentences should be repeated by the client, alternating between a high- and a low-probability item. One point is given for each sentence correctly repeated and high- and low- probability sections are scored separately from 0 to 8.

6. Automatic speech

Automatized sequences: Four sequences are tested: days of the week, months of the year, numbers from one to twenty-one, and the alphabet. A maximum of 2 points is given for complete recitation of any series, and 1 point is given for unaided runs of 4 consecutive words when reciting days, 5 consecutive words when reciting months, 8 consecutive words when reciting numbers, and 7 consecutive words when reciting the alphabet.
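The automatized-sequences rule can be sketched as follows (the series labels are illustrative keys, not part of the test form):

```python
# Minimum unaided run of consecutive words earning 1 point, per series
RUN_THRESHOLDS = {"days": 4, "months": 5, "numbers": 8, "alphabet": 7}

def automatized_sequence_score(series: str, complete: bool,
                               longest_run: int) -> int:
    """Score one BDAE automatized sequence: 2 points for a complete
    recitation, 1 point for an unaided run meeting the series
    threshold, 0 otherwise."""
    if complete:
        return 2
    return 1 if longest_run >= RUN_THRESHOLDS[series] else 0

assert automatized_sequence_score("months", True, 0) == 2
assert automatized_sequence_score("days", False, 4) == 1
```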

Reciting: Several nursery rhymes are suggested to elicit completion responses. A score of 0 is given if the client is unable to recite, 1 for impaired recitation and 2 for good recitation.

7. Reading Comprehension

Symbol discrimination: Cards 8 and 9 contain 10 items each. The examiner shows the word or letter centered above the five multiple-choice responses and asks the client to select the equivalent. One point is given to each correct item.

Word recognition: Using cards 10 and 11 the client is requested to identify the one word, out of 5, which matches the word said previously by the examiner. This task is repeated another 7 times and a score of 1 point is given to each correct answer.

Oral spelling: The client should recognize 8 words spelled by the examiner. One point is given for each correct recognition.

Word-picture matching: Ten words are selected from card 5 to be identified on cards 2 and 3. One point is given for each correct recognition.

Sentences and paragraphs: The examiner reads 10 sentences from cards 12 to 16. The client is requested to complete the ending of each sentence from four multiple-choice options. One point is given for each correct sentence.

8. Writing

Mechanics: The client is requested to write his/her name and address with the stronger hand. If he/she is unable to do so, the examiner writes them and the client transcribes them. Scores range from 0 to 3 according to performance level.

Serial writing: The client should write the alphabet and the numbers from 1 to 21. The score is the total number of different, correct letters and numbers combined, for a maximum score of 47 (26 letters plus 21 numbers).

Primer-level dictation: The client should write the letters, numbers and primer words that are dictated by the examiner. A score is given by adding the number of correct words.

Spelling to dictation: The client should write the words dictated by the examiner. Score is based on the amount of correct words written by the client.

Written confrontation naming: The patient should write the name of the figure that is shown from cards 2 and 3 by the examiner. The examiner should show 10 figures. One point is given for each correctly spelled response.

Sentences to dictation: The client should write the three sentences dictated by the examiner. Scores for each sentence range from 0 to 4.

Narrative writing: Card 1 has a picture of a cookie theft which is shown to the client who must then write as much as he/she can about what he/she sees in the picture. The client should be encouraged to keep writing for 2 minutes. Scores for this section range from 0 (no relevant writing) to 4 (full description in grammatical sentences).

Sixteen stimulus cards are enclosed with the BDAE. These cards include a range of images, words and sentences that are shown to the client during the assessment.

Detailed administration guidelines are provided in the test manual, which must be purchased.

Time:
The BDAE takes 90 to 120 minutes to administer. The extended format of the BDAE may take up to 2 1/2 hours (Sbordone, Saul & Purisch, 2007). The shortened version takes 30 to 45 minutes (Goodglass & Kaplan, 2001).

Subscales:
The BDAE is comprised of 8 subscales:

  • Fluency
  • Auditory comprehension
  • Naming
  • Oral reading
  • Repetition
  • Automatic speech
  • Reading comprehension
  • Writing

Equipment:
The BDAE requires specialized equipment, which can be purchased from specialty stores or online.

Training:
The test costs approximately US$450.00 and includes the full test battery, manual and instructional video.

Alternative forms of the BDAE

Shortened version: described as “a brief, no frills assessment.”

Extended version: includes an assessment of praxis in addition to the standard assessment.

Client suitability

Can be used with:

  • Adults with stroke
  • Adults with communication and language impairments

Should not be used with:

  • Not reported

In what languages is the measure available?

English, Spanish, Portuguese, French, Hindi, Finnish, and Greek (Radanovic & Scaff, 2003; Rosselli, Ardila, Florez & Castro, 1990; Tsapkini, Vlahou & Potagas, 2009/2010).

Summary

What does the tool measure? The BDAE is designed to diagnose aphasia and related disorders.
What types of clients can the tool be used for? The BDAE can be used with, but is not limited to clients with stroke.
Is this a screening or assessment tool? Assessment
Time to administer
  • 90 to 120 minutes (BDAE)
  • 30 to 45 minutes (shortened version)
  • up to 2 1/2 hours (extended version)
Versions Shortened and extended versions
Other Languages English, Spanish, Portuguese, French, Hindi, Finnish, and Greek
Measurement Properties
Reliability No studies have examined the reliability of the BDAE in clients with stroke.
Validity Content:
No studies have examined the content validity of the BDAE in clients with stroke.

Criterion
Concurrent:
No studies have examined the concurrent validity of the BDAE in clients with stroke.

Predictive:
No studies have examined the predictive validity of the BDAE in clients with stroke.

Construct:
Convergent/Discriminant:
Four studies have examined the convergent validity of the BDAE and reported:

  • poor to adequate correlations between the Repetition and Commands subscales of the BDAE and the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS);
  • poor correlations between the BDAE and the Visuospatial Index of the RBANS, and adequate correlations between the BDAE and the Language Index of the RBANS;
  • adequate to excellent correlations with modified versions of the Stroke Impact Scale (Communication, Participation, Physical and Stroke Recovery domains) and the Activity Card Sort (total, Instrumental and Low Demand Leisure scores);
  • excellent correlations with comparable subtests of the Bilingual Aphasia Test (BAT): automated sequences, listening comprehension and reading.

Known Groups:
One study has examined known groups validity of the BDAE-SF (Greek version) using Wilcoxon W and reported differentiation between healthy adults and patients with aphasia following stroke.

Floor/Ceiling Effects No studies have examined ceiling effects of the BDAE in clients with stroke.
Sensitivity/Specificity No studies have explored the sensitivity/specificity of the BDAE.
Does the tool detect change in patients? No studies have examined the responsiveness of the BDAE in clients with stroke.
Acceptability The BDAE administration is lengthy, and some clients may become irritated by the simpler items.
Feasibility The BDAE is widely used as an assessment of aphasia. Age and education-adjusted norms are available (Borod, Goodglass & Kaplan, 1980).
How to obtain the tool?

The BDAE can be obtained from one of the following websites at costs from US$435 to US$496:

Psychometric Properties

Overview

We conducted a literature search to identify all relevant publications on the psychometric properties of the Boston Diagnostic Aphasia Examination (BDAE) in individuals with stroke. We identified 5 studies.

Floor/Ceiling Effects

No studies have examined the ceiling effects of the BDAE in clients with stroke.

Reliability

No studies have examined the reliability of the BDAE in clients with stroke.

Validity

Content:
No studies have examined the content validity of the BDAE in clients with stroke.

Criterion:
Concurrent:
No studies have examined the concurrent validity of the BDAE in clients with stroke.

Predictive:
No studies have examined the predictive validity of the BDAE in clients with stroke.

Construct:
Convergent/Discriminant:
Larson, Kirschner, Bode, Heinemann and Goodman (2005) analyzed the construct validity of the BDAE by comparing it to the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) (Randolph, 1998) in 88 clients with stroke. Correlations between the Repetition subscale of the BDAE and the Attention, Language, Immediate Memory, and Delayed Memory Indexes of the RBANS were adequate (r = 0.45; 0.42; 0.40; 0.38, respectively), while correlations between the Repetition subscale of the BDAE and the Visuospatial Index of the RBANS were poor (r = 0.25). Correlations between the Commands subscale of the BDAE and the Language and Immediate Memory Indexes of the RBANS were adequate (r = 0.38; 0.37, respectively) while between the Commands subscale of the BDAE and the Delayed Memory, Attention and Visuospatial Indexes of the RBANS correlations were poor (r = 0.30; 0.24; 0.14, respectively).

Wilde (2006) examined the construct validity of the BDAE by comparing it to the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) (Randolph, 1998) in 22 clients with stroke. The BDAE showed an adequate correlation with the Language Index of the RBANS (r = 0.40) and a poor correlation with the Visuospatial Index of the RBANS (r = 0.19).

Tucker et al. (2012) developed modified versions of the Stroke Impact Scale (SIS), Medical Outcomes Study Short Form-36 (SF-36), Activity Card Sort (ACS) and the Reintegration to Normal Living Scale to improve the use of the measures with patients with aphasia. The authors examined the relationship between patients’ performance on these modified measures and the severity of aphasia as measured using the BDAE-3 Short Form, in a sample of 29 community-dwelling people with subacute or chronic stroke and aphasia. The BDAE Expressive component demonstrated excellent correlations with the SIS Communication, Participation and Stroke Recovery domains (r=0.72, 0.66, 0.60 respectively), and adequate correlations with the SIS Physical domain (r=0.46) and the ACS total, Instrumental and Low Demand Leisure scores (r=0.55, 0.53, 0.57 respectively). The BDAE Auditory Comprehension component demonstrated adequate correlations with the SIS Communication, Participation and Stroke Recovery domains (r=0.57, 0.50, 0.45 respectively) and the ACS total, Instrumental and Low Demand Leisure scores (r=0.51, 0.48, 0.47 respectively). The BDAE Language Competency Index (LCI) demonstrated excellent correlations with the SIS Communication and Participation domains (r=0.67, 0.61 respectively) and adequate correlations with the SIS Stroke Recovery domain (r=0.56) and the ACS total, Instrumental and Low Demand Leisure scores (r=0.55, 0.53, 0.55 respectively). Correlations with other SIS domains, ACS scores and the SF-36 and Reintegration to Normal Living Scale were not significant.

Peristeri and Tsapkini (2011) examined correlations between similar subtests of the BDAE-3 Short Form (Greek version) and the Bilingual Aphasia Test (BAT) (Greek version) in 9 patients with agrammatic aphasia, including 7 with chronic stroke. Correlations between tests of automated sequences, listening comprehension (BDAE complex ideational material subtest) and reading were excellent (r=0.75; 0.75; 0.88, respectively). Correlations between tests of fluency, commands, verbal auditory discrimination, word repetition, sentence repetition and naming were not significant.

Known groups:
Tsapkini, Vlahou and Potagas (2009/2010) examined the discriminative (known group) validity of the BDAE-SF (Greek version) by comparing the performance of healthy community-dwelling adults and patients with aphasia secondary to stroke, using Wilcoxon’s rank sum test (W). Participants were matched according to education, age and gender. Significant differences between healthy adults and individuals with aphasia were seen in subgroups of middle-aged individuals (40-59 years) of middle education and higher education on subtests of auditory comprehension (W=32, p=0.005; W=20, p=0.015 respectively), oral expression (W=32, p=0.005; W=20, p=0.015 respectively) and reading (W=24, p=0.003; W=10, p=0.035 respectively), and in a subgroup of older individuals (60 years+) with low education on subtests of auditory comprehension (W=56.5, p=0.009) and oral expression (W=51, p=0.005).

Responsiveness

No studies have examined the responsiveness of the BDAE in clients with stroke.

References

  • Borod, J.C., Goodglass, H., & Kaplan, E. (1980). Normative data on the Boston Diagnostic Aphasia Examination, Parietal Lobe Battery, and the Boston Naming Test. Journal of Clinical Neuropsychology, 3, 209-215.
  • Enderby, P.M., Wood, V.A., Wade, D.T., & Hewer, L.R. (1987). The Frenchay Aphasia Screening Test: A short, simple test for aphasia appropriate for nonspecialists. International Journal of Rehabilitation Medicine, 8, 166-170.
  • Goodglass, H. & Kaplan, E. (1972). The assessment of aphasia and related disorders. Philadelphia, Boston: Lea & Febiger.
  • Larson, E.B., Kirschner, K., Bode, R., Heinemann, A., & Goodman, R. (2005). Construct and predictive validity of the Repeatable Battery for the Assessment of Neuropsychological Status in the evaluation of stroke patients. Journal of Clinical and Experimental Neuropsychology, 27, 16-32.
  • Peristeri, E., & Tsapkini, K. (2011). A comparison of the BAT and BDAE-SF batteries in determining the linguistic ability in Greek-speaking patients with Broca’s aphasia. Clinical Linguistics & Phonetics, 25 (6-7): 464-479.
  • Radanovic, M., & Scaff, M. (2003). Speech and language disturbances due to subcortical lesions. Brain and Language, 84, 337-352.
  • Randolph, C. (1998). The Repeatable Battery for the Assessment of Neuropsychological Status. San Antonio, TX: The Psychological Corporation.
  • Rosselli, M., Ardila, A., Florez, A., & Castro, C. (1990). Normative data on the Boston Diagnostic Aphasia Examination in a Spanish-speaking population. Journal of Clinical and Experimental Neuropsychology, 12, 313-322.
  • Sbordone, R.J., Saul, R.E., & Purisch, A.D. (2007). Neuropsychology for Psychologists, Health Care Professionals, and Attorneys. Boca Raton, FL: Taylor and Francis Group.
  • Tsapkini, K., Vlahou, C.H., & Potagas, C. (2009/2010). Adaptation and validation of standardized aphasia tests in different languages- Lessons from the Boston Diagnostic Aphasia Examination – Short Form in Greek. Behavioural Neurology, 22, 111-119.
  • Tucker, F.M., Edwards, D.F., Mathews, L.K., Baum, C.M., & Connor, L.T. (2012). Modifying Health Outcome Measures for People With Aphasia. American Journal of Occupational Therapy, 66, 42-50.
  • Wilde, M.C. (2006). The validity of the Repeatable Battery of Neuropsychological Status in acute stroke. The Clinical Neuropsychologist, 20, 702-715.

See the measure

How to obtain the BDAE:

The BDAE can be obtained from the following websites, at costs ranging from US$435 to US$496.


Frenchay Aphasia Screening Test (FAST)

Evidence Reviewed as of before: 19-08-2008
Author(s)*: Lisa Zeltzer, MSc OT
Editor(s): Nicol Korner-Bitensky, PhD OT; Elissa Sitcoff, BA BSc

Purpose

The Frenchay Aphasia Screening Test (FAST) was developed to provide healthcare professionals working with patients who might have aphasia with a quick and simple method to identify the presence of a language deficit. The FAST was intended to be used as a screening device to identify those patients having communication difficulties who should be referred for a more detailed evaluation performed by a speech and language pathologist.

In-Depth Review

Purpose of the measure

The Frenchay Aphasia Screening Test (FAST) was developed to provide healthcare professionals working with patients who might have aphasia with a quick and simple method to identify the presence of a language deficit. The FAST was intended to be used as a screening device to identify those patients having communication difficulties who should be referred for a more detailed evaluation performed by a speech and language pathologist.

Available versions

The FAST was first published in 1987 by Enderby, Wood, Wade, and Hewer. This version has 4 subscales (comprehension; verbal expression; reading; writing) and is scored out of a total of 30 points.

Features of the measure

Items:
The FAST assesses language in four major areas: comprehension, verbal expression, reading, and writing. Testing is focused around a single, double-sided stimulus card depicting a scene on one side and geometric shapes on the other, plus five written sentences. All instructions and item tasks presented to the respondent are of graded length and difficulty.

The instructions for administering each section of the FAST are as follows:

Comprehension (out of 10).
Show the patient the card with the river scene and say: ‘Look at the picture. Listen carefully to what is said and point to the things I tell you to.’

(a) River scene
Practice item: “Point to the river“. Do not score this item. Repeat until patient understands what is required.
1. “Point to a boat”
2. “Point to the tallest tree”
3. “Point to the man and point to the dog”
4. “Point to the man’s left leg and then to the canoe”
5. “Before pointing to a duck near the bridge, show me the middle hill”

(b) Shapes
Practice item: “Point to the circle“. Repeat until patient understands task.
1. “Point to the square”
2. “Point to the cone”
3. “Point to the oblong and the square”
4. “Point to the square, the cone and the semicircle”
5. “Point to the one that looks like a pyramid and the one that looks like a segment of orange”

Verbal expression (out of 10).
(a) Show the patient the river scene and say: ‘Tell me as much about the picture as you can.’ If the patient does not appear to understand, say: ‘Name anything you can see in the picture.’

(b) Remove picture card from view and inform patient that you are now going to attempt something a little different. Ask the patient to name as many animals as he/she can think of in 1 minute. If the patient appears doubtful, explain that you want the names of any kind of animal, wild or domestic, and not just those which may have been seen in the picture. Start timing with the stopwatch as soon as the patient names their first animal and allow the patient to list for 60 seconds before stopping the task.

Reading (out of 5).
Check that the patient is wearing the correct eyeglasses for reading. Show the patient the river scene and the first reading card. Ask the patient to read the sentence to him/herself, not aloud, and do whatever it instructs him/her to do. Proceed in the same manner with the remaining four reading cards.

Writing (out of 5).
Show patient river scene and say: ‘Please write as much as you can about what is happening in the picture’. If the patient does not appear to understand say: ‘Write anything that you can see in the picture’. If their dominant hand is affected, ask the patient to attempt the test with their non-dominant hand. Encourage the patient if he/she stops prematurely. Allow a maximum of 5 minutes to complete this section.

Scoring:
Points are awarded based on the correctness or completeness of the response. Scores from each test area are summed to provide a total score.

How to score comprehension section:
Score 1 point for each item performed correctly. If instructions require repeating, score as an error. Unprompted self-correction may be scored as correct. This section is scored out of a total score of 10 points.

How to score verbal expression section:
The verbal expression section is scored out of a total score of 10 points.

Part (a):

Score 0: unable to name any objects intelligibly
Score 1: names 1-2 objects
Score 2: names 3-4 objects
Score 3: names 5-7 objects
Score 4: names 8 or 9 objects, or uses phrases and sentences but performance is not normal (e.g. hesitations, inappropriate comments)
Score 5: normal – uses phrases and sentences, naming 10 items

Part (b):

Score 0: none named
Score 1: names 1-2
Score 2: names 3-5
Score 3: names 6-9
Score 4: names 10-14
Score 5: names 15 or more
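Assuming the six levels of part (b) map to scores 0–5 (each part of this section contributes half of the 10-point verbal expression total), the animal-naming count can be banded as a small function; this is a sketch, not the official scoring form:

```python
def fast_animal_naming_score(n_animals: int) -> int:
    """Band the number of animals named in 60 s into a 0-5 score,
    using the part (b) levels listed above. The 0-5 mapping is an
    assumption based on the 10-point section maximum across two parts."""
    for threshold, score in ((15, 5), (10, 4), (6, 3), (3, 2), (1, 1)):
        if n_animals >= threshold:
            return score
    return 0

assert fast_animal_naming_score(12) == 4
```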

How to score reading section:
Score 1 point for each item completed correctly, for a total score of 5 points.

How to score writing section:
The writing section is scored out of a total score of 5 points.

Score 0: able to attempt the task but does not write any intelligible or appropriate words
Score 1: writes 1 or 2 appropriate words
Score 2: writes down the names of 3 objects, or a phrase including 2 or 3 objects
Score 3: writes down the names of 4 objects (correctly spelled), or 2 or 3 phrases including the names of 4 items
Score 4: uses phrases and sentences, including the names of 5 items, but performance not considered ‘normal’ (e.g. the sentence does not integrate people and actions)
Score 5: definitely normal performance, e.g. a sentence integrating people and actions

Total score interpretation:
The presence of aphasia is indicated if the patient scores below the following cutoff points:

Age                 Cutoff (raw score)
Up to 60 years      27
61 years and over   25

A significant inverse relationship between age and FAST total score has been reported (O’Neill, Cheadle, Wyatt, McGuffog, and Fullerton, 1990). Although stratified cutoffs and normative data are available for both the complete and shortened versions of the FAST for three age groups (≤60 years, 61-70 years, and ≥71 years), these are based on the assessment of a small sample (n=123) of normal individuals aged 21-81. As there was limited representation of the very elderly within the normative sample, it has been recommended that test scores be interpreted with caution and that the cutoff point signifying the presence of language difficulties in this group be lowered, to avoid the incorrect classification of very elderly patients (O’Neill et al., 1990).
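As an illustrative sketch (not part of the published FAST materials), the age-stratified decision rule above can be expressed as follows. The function name is ours; the cutoff values are those reported in the table.

```python
def fast_suggests_aphasia(total_score: int, age: int) -> bool:
    """Return True if a FAST total score falls below the age-appropriate
    cutoff, indicating the possible presence of aphasia.

    Cutoffs (from the table above): below 27 for patients up to 60 years,
    below 25 for patients aged 61 and over.
    """
    cutoff = 27 if age <= 60 else 25
    return total_score < cutoff

print(fast_suggests_aphasia(24, 55))  # below the age <= 60 cutoff of 27 -> True
print(fast_suggests_aphasia(25, 70))  # at the 61+ cutoff of 25 -> False
```

Consistent with the caution noted above, a rule like this should be applied conservatively in very elderly patients, for whom a lower cutoff has been recommended.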

Time:
The FAST takes 3-10 minutes to complete (Enderby et al., 1987).

Subscales:
There are 4 subscales to the FAST: comprehension, verbal expression, reading, and writing.

Equipment: Completion of the FAST requires the following items:

  • Double-sided stimulus card with attached reading cards
  • Pencil and paper
  • Stopwatch or watch with second hand

Training:
The FAST is suitable for use by general practitioners, junior medical staff, and other non-specialists (Enderby et al., 1987).

Alternative Forms of the FAST

  • Shortened version of the FAST (Enderby, Wood, Wade, Langton Hewer, 1987).
    To reduce administration time, only the comprehension and expression sections of the test can be administered, for a total combined score of 20. A score of 13 or less out of 20 indicates aphasia. The classification sensitivity of this shortened version of the FAST is reported to be similar to that for the complete assessment.

Client suitability

Can be used with:

  • Patients with stroke.

Should not be used with:

  • The specificity of the FAST appears to be adversely affected by the presence of visual field deficits, visual neglect or inattention, illiteracy, deafness, poor concentration or confusion and therefore should be used with caution in patients with these conditions (Enderby, 1987; Al-Khawaja, Wade, & Collin, 1996; Gibson, MacLennan, Gray, & Pentland, 1991).

In what languages is the measure available?

To our knowledge, the FAST is only available in English.

Summary

What does the tool measure? The presence of a language deficit.
What types of clients can the tool be used for? Patients who might have aphasia.
Is this a screening or assessment tool? Screening
Time to administer The FAST takes approximately 3-10 minutes to administer.
Versions Original FAST (4 subscales: comprehension, verbal expression, reading, writing), scored out of a total of 30 points; Shortened Version of the FAST (only the comprehension and expression sections are administered, scored out of 20 points).
Other Languages None.
Measurement Properties
Reliability Internal consistency:
No studies have examined the internal consistency of the FAST.

Test-retest:
Two studies have examined the test-retest reliability of the FAST. One study reported excellent test-retest reliability using kappa statistics, and one study reported high test-retest reliability using Kendall’s coefficient of concordance (W).

Intra-rater:
No studies have examined the intra-rater reliability of the FAST.

Inter-rater:
One study has examined the inter-rater reliability of the FAST and reported high inter-rater reliability as measured by Kendall’s coefficient of concordance (W).

Validity Criterion:
Concurrent:
Excellent correlation between the FAST and the Functional Communication Profile.

Construct:
Adequate correlation between the FAST and the Barthel Index.

Convergent/Discriminant:
Excellent correlations between the comprehension scores on the FAST and the receptive skills on the Sheffield Screening Test for Acquired Language Disorders (SST), between the expression scores on the FAST and the expressive skills on the SST, between the total scores of the FAST and the SST as well as with the total score of the Short Orientation, Memory and Concentration test (SOMC). Excellent correlations between the FAST and the total scores of the Functional Communication Profile and the shortened Minnesota Test for Differential Diagnosis of Aphasia, as well as with subtests of the Minnesota Test for Differential Diagnosis of Aphasia.

Floor/Ceiling Effects No studies have examined the ceiling effects of the FAST.
Sensitivity/Specificity One study compared the FAST to the examination by speech therapists (the “gold standard”) and reported an overall sensitivity of 87% and a specificity of 80%.

One study compared the comprehension and expression subtests of the FAST to examination by an experienced clinician and reported a sensitivity of 96%-100% and a specificity of 61%-79%.

Does the tool detect change in patients? Although the FAST is intended to be a screening measure, one study reported that the FAST demonstrated significant change in the expected direction.
Acceptability The specificity of the FAST appears to be adversely affected by the presence of visual field deficits, visual neglect or inattention, illiteracy, deafness, poor concentration or confusion and therefore should be used with caution in patients with these conditions. It has been recommended that when testing the very elderly, test scores be interpreted with caution and the cutoff point signifying the presence of language difficulties be lowered to avoid incorrect classification.
Feasibility The administration of the FAST is quick and simple and can be administered by general practitioners, junior medical staff and other non-specialists and does not require any formal training. FAST stimulus and reading cards are required to complete the measure and can be purchased online. The FAST is simple to score and stratified cutoffs and normative data are available.
How to obtain the tool?

The FAST stimuli and reading cards are available from Wiley at: http://ca.wiley.com/WileyCDA/WileyTitle/productCd-1861564422.html

Psychometric Properties

Overview

We conducted a literature search to identify all relevant publications on the psychometric properties of the FAST.

Reliability

Test-retest:
Philp, Lowles, Armstrong, and Whitehead (2002) administered the FAST to 50 older patients, with repeat administration by a nurse one or two weeks later. Using kappa statistics, the test-retest reliability of the FAST was found to be excellent (kappa = 1.00). A kappa value of 1.00 represents perfect agreement between the two test administration times.

Enderby, Wood, Wade, and Langton Hewer (1987) examined the test-retest reliability of the FAST and reported a high Kendall’s coefficient of concordance (W) (W = 0.97).

Inter-rater:
Enderby, Wood, Wade, and Langton Hewer (1987) examined the inter-rater reliability of the FAST and reported a high Kendall’s coefficient of concordance (W) (W = 0.97).

Sweeney, Sheahan, Rice, Malone, Walsh, and Coakley (1993) examined the inter-rater reliability of the FAST and reported 93% agreement between raters.

Validity

Criterion:
Concurrent:
Enderby, Wood, Wade, and Langton Hewer (1987) examined the concurrent validity of the FAST with the Functional Communication Profile in patients 15 days post-stroke, and in patients with chronic aphasia. Excellent correlations between the two measures were reported for both groups (r = 0.87 and r = 0.96, respectively).

Construct:
Al-Khawaja, Wade, and Collin (1995) administered the FAST to 50 patients who were suspected to have aphasia. The FAST had an adequate correlation with the Barthel Index (r=0.59). The authors state that these findings confirm reports that language disorders are associated with the severity of disability (i.e. patients who were incontinent, unable to transfer/walk and required help for personal care, showed more significant speech and language disorders on screening).

Convergent:
Al-Khawaja, Wade, and Collin (1995) examined the relationship between the FAST and the Sheffield Screening Test for Acquired Language Disorders (SST) (Syder, Body, Parker, & Boddy, 1993). The SST is another measure developed to detect the presence of language disorders in adults. The comprehension scores on the FAST had an excellent correlation with receptive skills on the SST (r = 0.74), and the expression scores on the FAST had an excellent correlation with expressive skills on the SST (r = 0.92). The total scores of the two tests also had an excellent correlation (r = 0.89). In this study, the FAST was also compared to the Short Orientation, Memory and Concentration test (SOMC) (Katzman, Brown, Fuld, Peck, Schechter, & Schimmel, 1983). Excellent correlations between the total score on the FAST and the SOMC (r = 0.86), and between the total scores on the SST and the SOMC (r = 0.91), were reported.

Enderby and Crow (1996) examined correlations between the FAST, the Functional Communication Profile and the shortened Minnesota Test for Differential Diagnosis of Aphasia and reported excellent correlations between the FAST and the total scores on both of these measures (r = 0.73 and r = 0.91, respectively). The correlations between the FAST and subtests of the Minnesota Test for Differential Diagnosis of Aphasia ranged from 0.70 to 0.82 and are considered ‘excellent‘.

Sensitivity/ Specificity:
Al-Khawaja et al. (1995) compared the presence (or absence) of aphasia as confirmed by speech therapists (the “gold standard”) to the FAST and reported that the FAST has an overall sensitivity of 87% and a specificity of 80% using age-stratified cut-off scores.

O’Neill, Cheadle, Wyatt, McGuffog, and Fullerton (1990) compared the comprehension and expression subtests of the FAST to clinical examination by an experienced clinician. Using a cutoff score of < 25 out of 30 to identify the presence of aphasia, a sensitivity of 96% and a specificity of 61% were reported at one day post-stroke. At one week post-stroke, a sensitivity of 100% and a specificity of 79% were observed. However, in this study, lower specificity was associated with the FAST than with clinical examination, suggesting that administration of the FAST confers no real advantage over the careful examination of an experienced clinician.
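For readers unfamiliar with these statistics, the sketch below shows how sensitivity and specificity are derived from a screening study’s confusion counts. The counts used here are invented for illustration (chosen to reproduce values close to the 87%/80% reported by Al-Khawaja et al.); they are not the actual study data.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple:
    """Sensitivity = TP / (TP + FN): proportion of true cases detected.
    Specificity = TN / (TN + FP): proportion of non-cases correctly cleared."""
    return tp / (tp + fn), tn / (tn + fp)

# Invented counts: 26 true positives, 4 false negatives,
# 16 true negatives, 4 false positives.
sens, spec = sensitivity_specificity(tp=26, fn=4, tn=16, fp=4)
print(round(sens, 2), round(spec, 2))  # 0.87 0.8
```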

Responsiveness

Although the FAST is intended to be a screening measure, the first published study on the FAST examined repeat administration of the test over time and found that the FAST demonstrated significant change in the expected direction (Enderby, 1987). However, the responsiveness of the FAST to change has not been evaluated in more detail.

References

  • Enderby, P. M., Wood, V. A., Wade, D. T., Langton Hewer, R. (1987). The Frenchay Aphasia Screening Test: A short, simple test for aphasia appropriate for nonspecialists. International Journal of Rehabilitation Medicine, 8, 166-170.
  • Enderby, P., Crow, E. (1996). Frenchay Aphasia Screening Test: Validity and comparability. Disability and Rehabilitation, 18, 238-240.
  • Gibson, L., MacLennan, W. J., Gray, C., Pentland, B. (1991). Evaluation of a comprehensive assessment battery for stroke patients. International Journal of Rehabilitation Research, 14, 93-100.
  • Katzman, R., Brown, T., Fuld, P., Peck, A., Schechter, R., Schimmel, H. (1983). Validation of a short Orientation-Memory-Concentration test of cognitive impairment. Am J Psychiatry, 140, 734-739.
  • O’Neill, P. A., Cheadle, B., Wyatt, R., McGuffog, J., Fullerton, K. J. (1990). The value of the Frenchay Aphasia Screening Test in screening for dysphasia: Better than the clinician? Clinical Rehabilitation, 4, 123-128.
  • Philp, I., Lowles, R. V., Armstrong, G. K., Whitehead, C. (2002). Repeatability of standardized tests of functional impairment and well-being in older people in a rehabilitation setting. Disability and Rehabilitation, 24, 243-249.
  • Salter, K., Jutai, J., Foley, N., Hellings, C., Teasell, R. (2006). Identification of aphasia post stroke: A review of screening assessment tools. Brain Injury, 20(6), 559-568.
  • Sweeney, T., Sheahan, N., Rice, I., Malone, J., Walsh, J. B., Coakley, D. (1993). Communication disorders in a hospital elderly population. Clinical Rehabilitation, 7, 113-117.
  • Syder, D., Body, R., Parker, M., Boddy, M. (1993). Sheffield Screening Test for Acquired Language Disorders: Manual. NFER-Nelson.

See the measure

How to obtain the FAST:

The FAST test manual and stimulus and reading cards can be purchased by accessing the Stass publications website.

Table of contents

Stroke Impact Scale (SIS)

Evidence Reviewed as of before: 29-06-2018
Author(s)*: Lisa Zeltzer, MSc OT; Katherine Salter, BA; Annabel McDermott
Editor(s): Nicol Korner-Bitensky, PhD OT; Elissa Sitcoff, BA BSc

Purpose

The Stroke Impact Scale (SIS) is a stroke-specific, self-report, health status measure. It was designed to assess multidimensional stroke outcomes, including strength, hand function Activities of Daily Living / Instrumental Activities of Daily Living (ADL/IADL), mobility, communication, emotion, memory and thinking, and participation. The SIS can be used both in clinical and in research settings.

In-Depth Review

Purpose of the measure

The Stroke Impact Scale (SIS) is a stroke-specific, self-report, health status measure. It was designed to assess multidimensional stroke outcomes, including strength, hand function, Activities of Daily Living / Instrumental Activities of Daily Living (ADL/IADL), mobility, communication, emotion, memory and thinking, and participation. The SIS can be used both in clinical and research settings.

Available versions

The Stroke Impact Scale was developed at the Landon Center on Aging, University of Kansas Medical Center. The scale was first published as version 2.0 by Duncan, Wallace, Lai, Johnson, Embretson, and Laster in 1999. Version 2.0 of the SIS is comprised of 64 items in 8 domains (Strength, Hand function, Activities of Daily Living (ADL) / Instrumental ADL, Mobility, Communication, Emotion, Memory and thinking, Participation). Based on the results of a Rasch analysis process, 5 items were removed from version 2.0 to create the current version 3.0 (Duncan, Bode, Lai, & Perera, 2003b).

Features of the measure

Items:

The SIS version 3.0 includes 59 items and assesses 8 domains:

  • Strength – 4 items
  • Hand function – 5 items
  • ADL/IADL – 10 items
  • Mobility – 9 items
  • Communication – 7 items
  • Emotion – 9 items
  • Memory and thinking – 7 items
  • Participation/Role function – 8 items

An extra question on stroke recovery asks the client to rate, on a scale from 0 – 100, how much he/she feels he/she has recovered from the stroke.

To see the items of the SIS, please click here.

Instructions on item administration:

Prior to administering the SIS, the purpose statement must be read as written below. It is important to tell the respondent that the information is based on his/her point of view.

Purpose statement:
“The purpose of this questionnaire is to evaluate how stroke has impacted your health and life. We want to know from your point of view how stroke has affected you. We will ask you questions about impairments and disabilities caused by your stroke, as well as how stroke has affected your quality of life. Finally, we will ask you to rate how much you think you have recovered from your stroke”.

Response sheets in large print should be provided with the instrument, so that the respondent may see, as well as hear, the choice of responses for each question. The respondent may answer either with the number or with the text associated with the number (e.g. “5” or “Not difficult at all”) for an individual question. If the respondent uses the number, it is important for the interviewer to verify the answer by stating the corresponding text response. The interviewer should display the sheet appropriate for that particular set of questions, and after each question must read all five choices.

Questions are listed in sections, or domains, with a general description of the type of questions that will follow (e.g. “These questions are about the physical problems which may have occurred as a result of your stroke”). Each group of questions is then given a statement with a reference to a specific time period (e.g. “In the past week how would you rate the strength of your…”). The statement must be repeated before each individual question. Within the measure the time period changes from one week, to two weeks, to four weeks. It is therefore important to emphasize the change in the time period being assessed for the specific group of questions.

Scoring:

The SIS is a patient-based, self-report questionnaire. Each item is rated using a 5-point Likert scale. The patient rates his/her difficulty completing each item, where:

  • 1 = an inability to complete the item
  • 5 = no difficulty experienced at all.

Note: Scores for three items in the Emotion domain (3f, 3h, 3i) must be reversed before calculating the Emotion domain score (i.e. 1 → 5, 2 → 4, 3 = 3, 4 → 2, 5 → 1). The SIS scoring database (see link below) takes this change of direction into account when scoring. When scoring manually, use the following equation to compute the item score for 3f, 3h and 3i: Item score = 6 – individual’s rating.

A final single-item Recovery domain assesses the individual’s perception of his/her recovery from stroke, measured in the form of a visual analogue scale from 0-100, where:

  • 0 = no recovery
  • 100 = full recovery.

Domain scores range from 0-100 and are calculated using the following equation:

  • Domain score = [(Mean item score – 1) / (5 – 1)] x 100

Scores are interpreted by generating a summative score for each domain using an algorithm equivalent to that used in the SF-36 (Ware & Sherbourne, 1992).
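A minimal sketch of the scoring rules above (function and variable names are ours): Emotion items 3f, 3h and 3i are reversed with 6 – rating, and each domain’s mean item rating (on the 1-5 scale) is then transformed onto a 0-100 domain score.

```python
def reverse_item(rating: int) -> int:
    """Reverse-score Emotion items 3f, 3h and 3i: item score = 6 - rating."""
    return 6 - rating

def domain_score(item_ratings: list) -> float:
    """Transform the mean item rating (1-5) onto a 0-100 domain score:
    [(mean - 1) / (5 - 1)] * 100."""
    mean = sum(item_ratings) / len(item_ratings)
    return (mean - 1) / (5 - 1) * 100

print(domain_score([5, 5, 5, 5]))  # all 'not difficult at all' -> 100.0
print(domain_score([1, 1, 1, 1]))  # all 'unable to do'          -> 0.0
print(domain_score([3, 3, 3, 3]))  # scale midpoint              -> 50.0
print(reverse_item(2))             # reversed rating             -> 4
```

This mirrors the official scoring database’s behaviour only in outline; the database linked below remains the authoritative scoring method.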

See http://www.kumc.edu/school-of-medicine/preventive-medicine-and-public-health/research-and-community-engagement/stroke-impact-scale/instructions.html to download the scoring database.

Time:

The SIS is reported to take approximately 15-20 minutes to administer (Finch, Brooks, Stratford, & Mayo, 2002).

Subscales:

The SIS 3.0 is comprised of 8 subscales or ‘Domains’:

  1. Strength
  2. Hand function
  3. ADL/IADL
  4. Mobility
  5. Communication
  6. Emotion
  7. Memory and thinking
  8. Participation

A final single-item domain measures perceived recovery since stroke onset.

Equipment:

Only the scale and a pencil are needed.

Training:

The SIS 3.0 requires no formal training for administration (Mulder & Nijland, 2016). Instructions for administration of the SIS 3.0 are available online through the University of Kansas Medical Center SIS information page.

Alternative forms of the SIS

SIS-16 (Duncan et al., 2003a).

Duncan et al. (2003a) developed the SIS-16 to address the lack of sensitivity to differences in physical functioning in functional measures of stroke outcome. Factor analysis of the SIS 2.0 revealed that the four physical domains (Strength, Hand function, ADL/IADL, Mobility) are highly correlated and can be summed together to create a single physical dimension score (Duncan et al., 1999; Mulder & Nijland, 2016). Accordingly, the SIS-16 consists of 16 items from the SIS 2.0:

  1. ADL/IADL – 7 items
  2. Mobility – 8 items
  3. Hand Function – 1 item.

All other domains should remain separate (Duncan et al., 1999).

SF-SIS (Jenkinson et al., 2013).

Jenkinson et al. (2013) developed a modified short form of the SIS (SF-SIS) comprised of eight items. Through three methods (initial pilot research, validation analysis and a focus group), the developers selected from each domain the one item that correlated most highly with the total domain score. The final choice of questions for the SF-SIS comprised those items chosen by two or more of these methods. The SF-SIS was evaluated for face validity and acceptability within a focus group of patients from acute and rehabilitation stroke settings and with multidisciplinary stroke healthcare staff. The SF-SIS has also been evaluated for content, convergent and discriminant validity (MacIsaac et al., 2016).

Client suitability

Can be used with:

  • The SIS can only be administered to patients with stroke.
  • The SIS 3.0 and SIS-16 can be completed by telephone, mail administration, by proxy, and by proxy mail administration (Duncan et al., 2002a; Duncan et al., 2002b; Kwon et al., 2006). Studies have shown potential proxy bias for physical domains (Mulder & Nijland, 2016). It is recommended that possible responder bias and the inherent difficulties of proxy use be weighed against the economic advantages of a mailed survey when considering these methods of administration.

Should not be used with:

  • The SIS version 2.0 should be used with caution in individuals with mild impairment as items in the Communication, Memory and Emotion domains are considered easy and only capture limitations in the most impaired individuals (Duncan et al., 2003).
  • Respondents must be able to follow a 3-step command (Sullivan, 2014).
  • Time taken to administer the SIS is a limitation for individuals with difficulties with concentration, attention or fatigue following stroke (MacIsaac et al., 2016).

In what languages is the measure available?

The SIS was originally developed in English.

Cultural adaptations, translations and psychometric testing have also been conducted in the following languages:

  • Brazilian (Carod-Artal et al., 2008)
  • French (Cael et al., 2015)
  • German (Geyh, Cieza & Stucki, 2009)
  • Italian (Vellone et al., 2010; Vellone et al., 2015)
  • Japanese (Ochi et al., 2017)
  • Korean (Choi et al., 2017; Lee & Song, 2015)
  • Nigerian (Hausa) (Hamza et al., 2012; Hamza et al., 2014)
  • Portuguese (Goncalves et al., 2012; Brandao et al., 2018)
  • Ugandan (Kamwesiga et al., 2016)
  • United Kingdom (Jenkinson et al., 2013)

The MAPI Research Institute has translated the SIS and/or SIS-16 into numerous languages including Afrikaans, Arabic, Bulgarian, Cantonese, Czech, Danish, Dutch, Farsi, Finnish, French, German, Greek, Hebrew, Hungarian, Icelandic, Italian, Japanese, Korean, Malay, Mandarin, Norwegian, Portuguese, Russian, Slovak, Spanish, Swedish, Tagalog, Thai and Turkish. Translations may not be validated.

Summary

What does the tool measure? Multidimensional stroke outcomes, including strength, hand function, activities of daily living / instrumental activities of daily living, mobility, communication, emotion, memory and thinking, and participation.
What types of clients can the tool be used for? Patients with stroke.
Is this a screening or assessment tool? Assessment
Time to administer The SIS takes 15-20 minutes to administer.
Versions SIS 2.0, SIS 3.0, SIS-16, SF-SIS.
Other Languages The SIS has been translated into several languages. Please click here to see a list of translations.
Measurement Properties
Reliability Internal consistency:
SIS 2.0:
Two studies reported excellent internal consistency; one study reported excellent internal consistency for 5/8 domains and adequate internal consistency for 3/8 domains.

SIS 3.0:
Two studies reported excellent internal consistency; one study reported excellent internal consistency for 6/8 domains and adequate internal consistency for 2/8 domains.

SIS-16:
One study reported good spread of item difficulty.

SF-SIS:
One study reported excellent internal consistency.

Test-retest:
SIS 2.0:
One study reported adequate to excellent test-retest reliability in all domains except for the Emotion domain.

Validity Criterion :
Concurrent:
SIS 2.0:
Excellent correlations with the Barthel Index, FMA, Instrumental Activities of Daily Living (IADL) Scale, Duke Mobility Scale and Geriatric Depression Scale; adequate to excellent correlations with the FIM; adequate correlations with the NIHSS and MMSE; and poor to excellent correlations with the SF-36.

SIS 3.0:
Excellent correlation between SIS Hand Function and MAL-QOM; excellent correlation between SIS ADL/IADL and FIM, Barthel Index, Lawton IADL Scale; excellent correlation between SIS Strength and Motricity Index; excellent correlation between SIS Mobility and Barthel Index; adequate to excellent correlation between SIS ADL/IADL and NEADL; adequate correlation between SIS Social Participation and SF-36 Social Functioning, Lawton IADL scale; adequate correlation between SIS Memory domain and MMSE; poor to adequate correlations between remaining SIS domains and FIM, NEADL, FMA, MAL-AOU, MAL-QOM, FAI.

SIS-16:
Excellent correlation with the Barthel Index; adequate to excellent correlations with the STREAM total and subscale scores; adequate correlation with SF-36 Physical Functioning.

Predictive:
SIS 2.0:
Physical function, Emotion and Participation domains were statistically significant predictors of the patient’s own assessment of recovery; SIS scores were poor predictors of mean steps per day.

SIS 3.0:
Pre-treatment SIS scores were compared with outcome measures after 3 weeks of upper extremity rehabilitation: Hand function and ADL/IADL domains showed adequate to excellent correlations with FIM, FMA, MAL-AOU, MAL-QOM, FAI, and NEADL; other domains demonstrated poor to adequate correlations with outcome measures.

SIS-16:
– Admission scores show an excellent correlation with actual length of stay and an adequate correlation with predicted length of stay; there was a significant correlation with discharge destination (home/rehabilitation).
– The combination of early outcomes of MAL-QOM and SIS show high accuracy in predicting final QOL among patients with stroke.

Construct:
Convergent/Discriminant:
SIS 2.0:
Domains demonstrate adequate to excellent correlations with corresponding WHOQOL-BREF subscales and Zung’s Self-Rating Depression Scale; poor correlations between the SIS Communication domain and both WHOQOL-BREF and Zung’s Self-Rating Depression Scale; and a poor correlation between the SIS Physical domain and the WHOQOL Environment scores.

SIS 3.0:
Excellent correlations with the SF-SIS, EQ-5D, mRS, BI and NIHSS; moderate to excellent correlations with the EQ-VAS; and a moderate correlation with the SIS-VAS.

SIS 3.0 telephone survey:
Adequate to excellent correlations with the FIM and SF-36V.

SIS-16:
Adequate to excellent correlations with the WHOQOL-BREF Physical domain; poor correlation with the WHOQOL Social relationships domain.

SF-SIS:
Excellent correlations with the EQ-5D, mRS, BI and NIHSS; moderate to excellent correlations with the EQ-VAS; and a moderate correlation with the SIS-VAS.

Known groups:
SIS 2.0: Most domains can differentiate between patients with varying degrees of stroke severity.

SIS 3.0:
Physical and ADL/IADL domains showed score discrimination and distribution for different degrees of stroke severity.

SIS-16:
Can discriminate between patients of varying degrees of stroke severity.

Floor/Ceiling Effects Three studies have examined floor/ceiling effects of the SIS.

SIS 2.0:
Two studies reported the potential for floor effects in the domain of Hand function among patients with moderate stroke severity, and a potential for ceiling effects in the Communication, Memory and Emotion domains.

SIS 3.0:
One study reported minimal floor and ceiling effects for the Social participation domain; one study reported ceiling effects for the Hand function, Memory and thinking, Communication, Mobility and ADL/IADL domains over time.

SIS-16:
One study reported no floor effects and minimal ceiling effects.

Does the tool detect change in patients? Five studies have investigated responsiveness of the SIS.

SIS 2.0:
One study reported significant change in patients’ recovery in the expected direction between assessments at 1 and 3 months, and at 1 and 6 months post-stroke, however sensitivity to change was affected by stroke severity and time of post-stroke assessment.

SIS 3.0:
– One study determined change scores for a clinically important difference (CID) within four domains: Strength, ADL/IADL, Mobility and Hand function. The MDC was 24.0, 17.3, 15.1 and 25.9 (respectively); the minimal CID was 9.2, 5.9, 4.5 and 17.8 (respectively).
– One study reported medium responsiveness for Hand function, Stroke recovery and SIS total score; other domains showed small responsiveness.
– One study found Participation and Recovery from stroke were the most responsive domains over the first year post-stroke; Strength and Hand function domains also showed high clinically meaningful positive/negative change.

SIS-16:
One study reported that a change score of 23.1 indicated statistically significant improvement from admission to discharge, and that sensitivity to change was large.

Acceptability – SIS 3.0 and SIS-16 are available in proxy version. The patient-centred nature of the scale’s development may enhance its relevance to patients and assessment across multiple levels may reduce patient burden.
– Time taken to administer the SIS has been identified as a limitation.
– The SIS 2.0 should be used with caution in individuals with mild impairment as some domains only capture limitations in the most impaired individuals.
Feasibility – The SIS is a patient-based self-report scale that takes 15-20 minutes to administer.
– The SIS can be administered in person or by proxy, by mail or telephone.
– The SIS does not require any formal training.
– Instructions for administration of the SIS 3.0 are available online.
How to obtain the tool?

Please click here to see a copy of the SIS.

Psychometric Properties

Overview

We conducted a literature search to identify relevant publications on the psychometric properties of the SIS. Seventeen studies were included. Studies included in this review are specific to the original English versions of the SIS version 2.0, SIS 3.0 or SIS-16.

Floor/Ceiling Effects

Duncan et al. (1999) found that SIS version 2.0 showed the potential for floor effects in the Hand function domain in the moderate stroke group (40.2%) and a possible ceiling effect in the Communication domain for both the mild (35.4%) and moderate (25.7%) stroke groups. The highest percentage of ceiling effects for the SIS was for the Communication domain (35%) compared with a 64.6% ceiling rate for the Barthel Index (Mahoney & Barthel, 1965).

Duncan et al. (2003b) conducted a Rasch analysis which confirmed these two effects observed in Duncan et al. (1999) – a floor effect in the SIS Hand function domain and a ceiling effect in the Communication domain. A ceiling effect in the Memory and Emotion domains was also reported.

Lai et al. (2003) examined floor/ceiling effects of the SIS-16 and SIS Social Participation domain in a sample of 278 patients at 3 months post-stroke. The authors reported floor/ceiling effects of 0% and 4% (respectively) for the SIS-16, and 1% and 5% (respectively) for the SIS Social Participation domain.

Richardson et al. (2016) examined floor/ceiling effects of the SIS 3.0 in a sample of 164 patients with subacute stroke. Measures were taken at three timepoints: on admission to the study and at 6-month and 12-month follow-up (n=164, 108, 37 respectively). Poor ceiling effects (>20%) were seen for the Hand function domain at baseline, 6 months and 12 months (25.0%, 36.4%, 37.8%, respectively); the Memory and thinking domain at 6 months and 12 months (22.2%, 21.6%, respectively); the Communication domain at 6 months and 12 months (30.6%, 27%, respectively); the Mobility domain at 6 months (20.4%); and the ADL/IADL domain at 12 months (21.6%). There were no significant floor effects at any timepoint.
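As background on the method, floor/ceiling effects such as those reported above are typically quantified as the percentage of respondents scoring at the scale minimum or maximum, with values above roughly 20% flagged as problematic (the convention used by Richardson et al., 2016). The sketch below uses invented scores, not data from these studies.

```python
def floor_ceiling(scores, minimum=0, maximum=100):
    """Return (floor %, ceiling %): the percentage of respondents scoring
    at the scale minimum and maximum, respectively."""
    n = len(scores)
    floor = 100 * sum(s == minimum for s in scores) / n
    ceiling = 100 * sum(s == maximum for s in scores) / n
    return floor, ceiling

scores = [100, 100, 45, 70, 0, 100, 88, 100]  # invented domain scores
f, c = floor_ceiling(scores)
print(f, c)  # 12.5 50.0 -> this sample would be flagged for a ceiling effect
```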

Reliability

Internal consistency:
Duncan et al. (1999) examined internal consistency of the SIS version 2.0 using Cronbach’s alpha coefficients and reported excellent internal consistency for each of the 8 domains (ranging from α=0.83 to 0.90).

Duncan et al. (2003b) examined reliability of the SIS version 2.0 by Rasch analysis. Item separation reliability is the ratio of the “true” (observed minus error) variance to the total observed variance: the smaller the error, the higher the ratio. It ranges from 0.00 to 1.00 and is interpreted in the same way as Cronbach’s alpha. Item separation reliability of the SIS version 2.0 ranged from 0.93-1.00. A separation index > 2.00 is equivalent to a Cronbach’s alpha of 0.80 or greater (excellent). In this study, 5 out of 8 domains had a separation index that exceeded 2.00 (in addition to the composite physical domain). The Emotion and Communication domains were only in the adequate range because of their ceiling effects, and the Hand function domain was only adequate because of its floor effect.
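
The equivalence quoted above (a separation index of 2.00 corresponding to a reliability of 0.80) follows from the standard Rasch relation G = sqrt(R / (1 - R)), where R is the separation reliability:

```python
def separation_index(reliability):
    """Rasch separation index G from separation reliability R: G = sqrt(R/(1-R))."""
    return (reliability / (1 - reliability)) ** 0.5

print(round(separation_index(0.80), 6))  # 2.0 -- the threshold cited in the text
```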

Edwards and O’Connell (2003) administered the SIS version 2.0 to 74 patients with stroke and reported excellent internal consistency (ranging from a=0.87 for participation to a=0.95 for hand function). The percentage of item-domain correlations >0.40 was 100% for all domains except emotion and ADL/IADL. In the ADL/IADL scale, one item (cutting food) was more closely associated with hand function than ADL/IADL.

Lai et al. (2003) examined reliability of the SIS-16 and SIS Social Participation domain in a sample of 278 patients at 3 months post-stroke. Both the SIS-16 and SIS Social Participation domain showed good spread of item difficulty, with easier items that are able to measure lower levels of physical functioning in patients with severe stroke.

Jenkinson et al. (2013) examined internal consistency of the SIS 3.0 and the SF-SIS among individuals with stroke (n=73, 151 respectively), using Cronbach’s alpha. Internal consistency of the SIS 3.0 was excellent for all domains (a=0.86 to 0.96). Higher-order factor analysis of the SIS 3.0 showed one factor with an eigenvalue > 1 that accounted for 68.76% of the variance. Each dimension of the SIS 3.0 loaded on this factor (eigenvalue = 5.5). Internal consistency of the SF-SIS was high (a=0.89). Factor analysis of the SF-SIS similarly showed one factor that accounted for 57.25% of the variance.

Richardson et al. (2016) examined internal consistency of the SIS 3.0 in a sample of 164 patients with subacute stroke, using Cronbach’s alpha. Internal consistency was measured at three timepoints: on admission to the study and at 6-month and 12-month follow-up. Internal consistency of all domains was excellent at all timepoints (a=0.81 to 0.97). The composite Physical Functioning score was excellent at all timepoints (a=0.95 to 0.97).

MacIsaac et al. (2016) examined internal consistency of the SIS 3.0 in a sample of 5549 individuals in an acute stroke setting and 332 individuals in a stroke rehabilitation setting, using Cronbach’s alpha. Internal consistency was excellent within both acute and rehabilitation data sets (a=0.98, 0.93 respectively). Internal consistency of individual domains was excellent for both acute and rehabilitation data sets, except for the Emotion domain (a=0.60, 0.63 respectively) and the Strength domain (a=0.77, rehabilitation data set only).

Test-retest:
Duncan et al. (1999) examined test-retest reliability of the SIS version 2.0 in 25 patients who were administered the SIS at 3 or 6 months post-stroke and again one week later. Test-retest reliability was calculated using intraclass correlation coefficients (ICC), which ranged from adequate to excellent (ICC=0.70 to 0.92), with the exception of the Emotion domain, which showed only poor reliability (ICC=0.57).
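
Test-retest ICCs of this kind can be derived from a two-way ANOVA decomposition. A from-scratch sketch of Shrout and Fleiss's ICC(2,1) (absolute agreement, single measures; whether Duncan et al. used exactly this ICC form is an assumption, and the scores below are fabricated):

```python
def icc_2_1(test, retest):
    """Shrout & Fleiss ICC(2,1): two-way random effects, absolute agreement,
    single measures, for n subjects scored on two occasions."""
    n, k = len(test), 2
    rows = list(zip(test, retest))
    grand = sum(test + retest) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(test) / n, sum(retest) / n]
    msr = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)  # between subjects
    msc = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)  # between occasions
    sse = sum((x - row_means[i] - col_means[j] + grand) ** 2
              for i, r in enumerate(rows) for j, x in enumerate(r))
    mse = sse / ((n - 1) * (k - 1))                               # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# A constant 1-point shift between occasions lowers absolute agreement:
print(round(icc_2_1([1, 2, 3], [2, 3, 4]), 4))  # 0.6667
```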

Validity

Content:

Development of the SIS was based on a study at the Landon Center on Aging, University of Kansas Medical Center (Duncan, Wallace, Studenski, Lai, & Johnson, 2001) using feedback from individual interviews with patients and focus group interviews with patients, caregivers, and health care professionals. Participants included 30 individuals with mild and moderate stroke, 23 caregivers, and 9 stroke experts. Qualitative analysis of the individual and focus group interviews generated a list of potential items. Consensus panels reviewed the potential items, established domains for the measure, developed item scales, and decided on mechanisms for administration and scoring.

Criterion:

Concurrent:
Duncan et al. (1999) examined concurrent validity of the SIS by comparison with the Barthel Index, Functional Independence Measure (FIM), Fugl-Meyer Assessment (FMA), Mini-Mental State Examination (MMSE), National Institute of Health Stroke Scale (NIHSS), Medical Outcomes Study Short Form 36 (SF-36), Lawton Instrumental Activities of Daily Living (IADL) Scale, Duke Mobility Scale and Geriatric Depression Scale. The following results were found for each domain of the SIS:

SIS Domain | Comparative Measure | Correlation | Rating
Hand function | FMA – Upper Extremity Motor | r = 0.81 | Excellent
Mobility | FIM Motor | r = 0.83 | Excellent
Mobility | Barthel Index | r = 0.82 | Excellent
Mobility | Duke Mobility Scale | r = 0.83 | Excellent
Mobility | SF-36 Physical Functioning | r = 0.84 | Excellent
Strength | NIHSS Motor | r = -0.59 | Adequate
Strength | FMA Total | r = 0.72 | Excellent
ADL/IADL | Barthel Index | r = 0.84 | Excellent
ADL/IADL | FIM Motor | r = 0.84 | Excellent
ADL/IADL | Lawton IADL Scale | r = 0.82 | Excellent
Memory | MMSE | r = 0.58 | Adequate
Communication | FIM Social/Cognition | r = 0.53 | Adequate
Communication | NIHSS Language | r = -0.44 | Adequate
Emotion | Geriatric Depression Scale | r = -0.77 | Excellent
Emotion | SF-36 Mental Health | r = 0.74 | Excellent
Participation | SF-36 Emotional Role | r = 0.28 | Poor
Participation | SF-36 Physical Role | r = 0.45 | Adequate
Participation | SF-36 Social Functioning | r = 0.70 | Excellent
Physical (composite) | Barthel Index | r = 0.76 | Excellent
Physical (composite) | FIM Motor | r = 0.79 | Excellent
Physical (composite) | SF-36 Physical Functioning | r = 0.75 | Excellent
Physical (composite) | Lawton IADL Scale | r = 0.73 | Excellent
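
The ratings in the table follow a correlation-strength convention common in this literature. A small helper reflecting the cut-offs apparently applied here (|r| ≥ 0.60 excellent, 0.30-0.59 adequate, < 0.30 poor; these thresholds are inferred from the reported values, so they are an assumption):

```python
def rate_correlation(r):
    """Correlation rating convention inferred from the table (assumption):
    |r| >= 0.60 excellent, 0.30 <= |r| < 0.60 adequate, |r| < 0.30 poor."""
    a = abs(r)
    return "Excellent" if a >= 0.60 else "Adequate" if a >= 0.30 else "Poor"

print(rate_correlation(0.81), rate_correlation(-0.59), rate_correlation(0.28))
# Excellent Adequate Poor -- matches the corresponding table rows
```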

Duncan et al. (2002a) examined concurrent validity of the SIS version 3.0 and SIS-16 using Pearson correlations. The SIS was correlated with the Mini-Mental State Examination (MMSE), Barthel Index, Lawton IADL Scale and the Motricity Index. The SIS ADL/IADL domain showed an excellent correlation with the Barthel Index (r=0.72) and with the Lawton IADL Scale (r=0.77). The SIS Mobility domain showed an excellent correlation with the Barthel Index (r=0.69). The SIS Strength domain showed an excellent correlation with the Motricity Index (r=0.67). The SIS Memory domain showed an adequate correlation with the MMSE (r=0.42).

Lai et al. (2003) examined concurrent validity of the SIS-16 and SIS Social Participation domain by comparison with the SF-36 Physical Functioning and Social Functioning subscales, Barthel Index and Lawton IADL Scale, using Pearson correlation coefficients. Measures were administered to 278 patients with stroke at 3 months post-stroke. There was an excellent correlation between the SIS-16 and SF-36 Physical Functioning (r=0.79), and an excellent correlation between SIS Social Participation and SF-36 Social Functioning (r=0.65). There was an excellent correlation between the SIS-16 and the Barthel Index at 3 months post-stroke (r=0.75), and an adequate correlation between SIS Social Participation and Lawton IADL Scale at 3 months post-stroke (r=0.47).

Lin et al. (2010a) examined concurrent validity of the SIS version 3.0 by comparison with the Fugl-Meyer Assessment (FMA), Motor Activity Log – Amount of Use and – Quality of Movement (MAL-AOU, MAL-QOM), Functional Independence Measure (FIM), Frenchay Activities Index (FAI) and Nottingham Extended Activities of Daily Living Scale (NEADL). Concurrent validity was measured using Spearman correlation coefficients prior to and on completion of a 3-week intervention period. SIS Hand Function showed an excellent correlation with MAL-QOM at pre-treatment and post-treatment (r=0.65, 0.68, respectively, p<0.01), and adequate correlations with all other measures (FMA, MAL-AOU, FIM, FAI, NEADL). SIS ADL/IADL showed an excellent correlation with the FIM at pre-treatment and post-treatment (r=0.69, 0.75, respectively, p<0.01). Correlations between SIS ADL/IADL and the NEADL were adequate at pre-treatment (r=0.54, p<0.01) and excellent at post-treatment (r=0.62, p<0.01). Correlations between the SIS ADL-IADL and all other measures (FMA, MAL-AOU, MAL-QOM, FAI) were adequate at pre-treatment and post-treatment. Other SIS domains demonstrated poor to adequate correlations with comparison measures.

Ward et al. (2011) examined concurrent validity of the SIS-16 by comparison with the Stroke Rehabilitation Assessment of Movement (STREAM), using Spearman correlations. Measures were administered to 30 patients with acute stroke on admission to and discharge from an acute rehabilitation setting. Correlations between the SIS-16 and STREAM total and subscale scores were adequate to excellent on admission (STREAM total r=0.7073; STREAM subtests r=0.5992 to 0.6451, p<0.0005) and discharge (STREAM total r=0.7153; STREAM subtests r=0.5499 to 0.7985, p<0.0002).

Richardson et al. (2016) examined concurrent validity of the SIS 3.0 by comparison with the 5-level EuroQol 5D (EQ-5D-5L), using Pearson correlation coefficients. Measures were administered to patients with subacute stroke on admission to the study and at 6-month and 12-month follow-up (n=164, 108, 37, respectively). At admission correlations with the EQ-5D-5L were excellent for the ADL (r=0.663) and Hand function (r=0.618) domains and Physical composite score (r=0.71); correlations with other domains were adequate (r=0.318 to 0.588), except for the Communication domain (r=0.228). At 6-month follow-up correlations with the EQ-5D-5L were excellent for the Strength (r=0.628), ADL (r=0.684), Mobility (r=0.765), Hand function (r=0.668), Participation (r=0.740) and Recovery domains (r=0.601) and Physical composite score (r=0.772); correlations with other domains were adequate (r=0.402 to 0.562). At 12-month follow-up correlations with the EQ-5D-5L were excellent for the Strength (r=0.604), ADL (r=0.760), Mobility (r=0.683) and Participation (r=0.738) domains and the Physical composite score (r=0.756); correlations with other domains were adequate (r=0.364 to 0.592).

Predictive:
Duncan et al. (1999) examined which domain scores of the SIS version 2.0 could most accurately predict a patient’s own assessment of stroke recovery, using multiple regression analysis. The SIS domains of Physical function, Emotion, and Participation were found to be statistically significant predictors of the patient’s assessment of recovery. Forty-five percent of the variance in the patient’s assessment of percentage of recovery was explained by these factors.

Fulk, Reynolds, Mondal & Deutsch (2010) examined the predictive validity of the 6-Minute Walk Test (6MWT) and other widely used clinical measures (Fugl-Meyer Assessment – Lower Extremity (FMA-LE), self-selected gait speed, SIS and Berg Balance Scale (BBS)) in 19 patients with stroke. The SIS was found to be a poor predictor of mean steps per day (r=0.18, p=0.471). Although gait speed and balance were related to walking activity, only the 6MWT was found to be a predictor of community ambulation in patients with stroke.

Huang et al. (2010) examined change in quality of life after distributed constraint-induced movement therapy (CIMT) in a sample of 58 patients with chronic stroke, using CHAID analysis. Predictors of change included age, gender, side of lesion, time since stroke, cognitive status (measured by the MMSE), upper extremity motor impairment (measured by the FMA-UE) and independence in activities of daily living (measured by the FIM). Initial FIM scores were the strongest predictor of overall SIS score (p=0.006) and ADL/IADL domain score (p=0.004) at post-treatment. Participants with FIM scores ≤ 109 showed significantly greater improvement in overall SIS scores than participants with FIM scores > 109. There were no significant associations between other SIS domains and other predictors.

Lin et al. (2010a) examined predictive validity of the SIS version 3.0 by comparing pre-treatment SIS scores with post-treatment scores of the Fugl-Meyer Assessment (FMA), Motor Activity Log – Amount of Use and – Quality of Movement (MAL-AOU, MAL-QOM), Functional Independence Measure (FIM), Frenchay Activities Index (FAI) and Nottingham Extended Activities of Daily Living Scale (NEADL). Predictive validity was measured using Spearman correlation coefficients prior to and on completion of a 3-week intervention period. The SIS Hand Function showed excellent correlations with MAL-AOU (r=0.61, p<0.01) and MAL-QOM (r=0.66, p<0.01), and adequate correlations with all other measures (FMA, FIM, FAI, NEADL). The SIS ADL/IADL showed an excellent correlation with the FIM (r=0.70, p<0.01), and adequate correlations with all other measures (FMA, MAL-AOU, MAL-QOM, FAI, NEADL). Other SIS domains demonstrated poor to adequate correlations with comparison measures.

Ward et al. (2011) examined predictive validity of the SIS-16 and other clinical measures (STREAM, FIM) in a sample of 30 patients in an acute rehabilitation setting, using Spearman rho coefficients and Wilcoxon rank-sum tests. Results indicated an adequate correlation between SIS-16 admission scores and predicted length of stay (rho=-0.6743, p<0.001) and an excellent correlation between SIS-16 admission scores and actual length of stay (rho=-0.7953, p<0.001). There was a significant correlation with discharge destination (p<0.05).

Hsieh et al. (2016) developed a computational method to predict quality of life after stroke rehabilitation, using a Particle Swarm-Optimized Support Vector Machine (PSO-SVM) classifier. A sample of 130 patients with subacute/chronic stroke received occupational therapy for 1.5-2 hours/day, 5 days/week for 3-4 weeks. Predictors of outcome included 5 personal parameters (age, gender, time since stroke onset, education, MMSE score) and 9 early functional outcomes (Fugl-Meyer Assessment, Wolf Motor Function Test, Action Research Arm Test, Functional Independence Measure, Motor Activity Log – Amount of Use (MAL-AOU) and – Quality of Movement (MAL-QOM), ABILHAND, physical function, SIS). The combination of early outcomes of MAL-QOM and SIS showed the highest accuracy (70%) and highest cross-validated accuracy (81.43%) in predicting final QOL among patients with stroke. SIS alone showed high accuracy (60%) and cross-validated accuracy (81.43%).

Construct:

Duncan et al. (2003b) performed a Rasch analysis on version 2.0 of the SIS. For measures that have been developed using a conceptual hierarchy of items, the theoretical ordering can be compared with the empirical ordering produced by the Rasch analysis as evidence of the construct validity of the measure. In this study, the expectation regarding the theoretical ordering of task difficulty was consistent with the empirical ordering of the items by difficulty for each domain, providing evidence for the construct validity of the SIS.

Convergent/Discriminant:
Edwards and O’Connell (2003) examined discriminant validity of the SIS version 2.0 and SIS-16 in a sample of 74 patients with stroke, by comparison with the World Health Organization Quality of Life Bref-Scale (WHOQOL-BREF) and Zung’s Self-Rating Depression Scale (ZSRDS). There were adequate to excellent correlations between the SIS-16 and the WHOQOL-BREF Physical domain (r=0.40 to 0.63); correlations with the WHOQOL-BREF Social relationships domain were poor (r=0.13 to 0.18). There were adequate to excellent correlations between the SIS Participation domain and all WHOQOL-BREF domains (r=0.45 to 0.69). The correlation between the SIS Participation domain and the WHOQOL-BREF Physical domain was excellent (r=0.69). The SIS Participation domain demonstrated an adequate correlation with the ZSRDS (r=-0.56). There were adequate correlations between the SIS Memory and Emotion domains and the WHOQOL-BREF Psychological domain (r=0.49, 0.70, respectively) and between the SIS Memory and Emotion domains and the ZSRDS (r=-0.38, -0.62, respectively). There was a poor correlation between the SIS Physical domain and the WHOQOL-BREF Environment scores (r=0.15). Neither the ZSRDS nor the WHOQOL-BREF assess communication, accordingly both measures demonstrated poor correlations with the SIS Communication domain (ZSRDS: r=-0.28; WHOQOL-BREF: r=0.11 to 0.28).
Note: Some correlations are negative because a high score on the SIS indicates normal performance whereas a high score on other measures indicates impairment.

Jenkinson et al. (2013) examined convergent validity of the SIS version 3.0 and the SF-SIS in a sample of individuals with stroke (n=73, 151, respectively) by comparison with the EuroQoL EQ-5D, using Spearman’s correlation coefficient. The SIS and SF-SIS demonstrated identical excellent correlations with the EQ-5D (r=0.83).

MacIsaac et al. (2016) examined convergent validity of the SIS 3.0 and the SF-SIS in a sample of 5549 patients in an acute stroke setting and 332 patients in a stroke rehabilitation setting, using Spearman’s correlation coefficient. Convergent validity was measured by comparison with the SIS-VAS, patient-reported outcome measures (the EuroQoL EQ-5D and EQ-5D-VAS), and functional measures (the Barthel Index (BI), modified Rankin Score (mRS), and the National Institutes of Health Stroke Scale (NIHSS)). Within acute data, the SIS and SF-SIS demonstrated significant excellent correlations with the mRS (rho=-0.87, -0.80, respectively), the BI (rho=0.89, 0.80), the NIHSS (rho=-0.77, -0.73), the EQ-5D (rho=0.88, 0.82) and the EQ-VAS (rho=0.73, 0.72). Within rehabilitation data, the SIS and SF-SIS demonstrated excellent correlations with the BI (rho=0.72, 0.65, respectively) and the EQ-5D (rho=0.69, 0.69), and moderate correlations with the SIS-VAS (rho=0.56, 0.57) and the EQ-VAS (rho=0.46, 0.40). Correlations between the SIS and SF-SIS were excellent in the acute data (rho=0.94) and rehabilitation data (rho=0.96).

Kwon et al. (2006) examined convergent validity of the SIS 3.0 by telephone administration in a sample of 95 patients with stroke, using Pearson coefficients. Convergent validity was measured by comparison with the Functional Independence Measure (FIM) – Motor component (FIM-M) and – Cognitive component (FIM-C), and with the Medical Outcomes Study Short Form 36 for veterans (SF-36V). Patients were administered the SIS at 12 weeks post-stroke and the FIM and SF-36V at 16 weeks post-stroke. The SIS 3.0 telephone survey showed adequate to excellent correlations with the FIM (r=0.404 to 0.858, p<0.001) and SF-36V (r=0.362 to 0.768, p<0.001).

Known groups:
Duncan et al. (1999) found that all domains of the SIS version 2.0, with the exception of the Memory/thinking and Emotion domains, were able to discriminate between patients across 4 Rankin levels of stroke severity (p<0.0001, except for the Communication domain, p=0.02). These results suggest that scores from most domains of the SIS can differentiate between patients based on stroke severity.

Lai et al. (2003) administered the SIS and SF-36 to 278 patients with stroke 90 days after stroke. The SIS-16 was able to discriminate among the Modified Rankin Scale (MRS) levels of 0 to 1, 2, 3, and 4. The SIS Participation domain was also able to discriminate across the MRS levels of 0 to 1, 2, and 3 to 4. These results suggest that the SIS can discriminate between patients of varying degrees of stroke severity.

Kwon et al. (2006) administered the SIS 3.0 by telephone to a sample of 95 patients at 12 weeks post-stroke. The MRS was administered to patients at hospital discharge. SIS 3.0 scores were reported for the SIS-16, SIS-Physical and SIS-ADL domains; all showed discrimination across different degrees of stroke severity: MRS 0/1 vs. MRS 4/5; MRS 2 vs. MRS 4/5; and MRS 3 vs. MRS 4/5.

Sensitivity and Specificity:

Beninato, Portney & Sullivan (2009) examined sensitivity and specificity of the SIS-16 relative to a history of multiple falls in a sample of 27 patients with chronic stroke. Participants reported a history of no falls or one fall (n=18) vs. multiple falls (n=9), according to Tinetti’s definition of falls. A SIS-16 cut-off score of 61.7 yielded 78% sensitivity and 89% specificity. Area under the ROC curve was adequate (0.86). Likelihood ratios were used to calculate the post-test probability of a history of falls; results showed a high positive (LR+ = 7.0) and low negative (LR- = 0.25) likelihood ratio. These results indicate that the SIS-16 demonstrated good overall accuracy in detecting individuals with a history of multiple falls.
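
The reported accuracy statistics can be reproduced from the implied 2x2 classification table. A sketch (the cell counts are a reconstruction consistent with 78% sensitivity among the 9 multiple fallers and 89% specificity among the 18 other participants, not published values):

```python
def diagnostic_stats(tp, fp, fn, tn):
    """Sensitivity, specificity, and positive/negative likelihood ratios
    from a 2x2 table (tp = multiple fallers flagged by the cut-off, etc.)."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return sens, spec, sens / (1 - spec), (1 - sens) / spec

sens, spec, lr_pos, lr_neg = diagnostic_stats(tp=7, fp=2, fn=2, tn=16)
print(round(sens, 2), round(spec, 2), round(lr_pos, 2), round(lr_neg, 2))
# 0.78 0.89 7.0 0.25

# Post-test probability via Bayes: with the study's 9/27 pre-test prevalence,
# a positive test raises the probability of a multiple-fall history:
pre_odds = (9 / 27) / (1 - 9 / 27)
post_p = pre_odds * lr_pos / (1 + pre_odds * lr_pos)
print(round(post_p, 2))  # 0.78
```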

Responsiveness

Duncan et al. (1999) examined responsiveness of the SIS version 2.0. Significant change was observed in patients’ recovery in the expected direction between assessments at 1 and 3 months, and at 1 and 6 months post-stroke; however, sensitivity to change was affected by stroke severity and time of post-stroke assessment. All domains of the SIS showed statistically significant change from 1 to 3 months and 1 to 6 months post-stroke, but this was not observed between 3 and 6 months post-stroke for the domains of Hand function, Mobility, ADL/IADL, combined physical, and Participation among patients recovering from minor stroke. For patients with moderate stroke, statistically significant change was observed at both 1 to 3 months and 1 to 6 months post-stroke in all domains, and from 3 to 6 months for the domains of Mobility, ADL/IADL, combined physical, and Participation.

Lin et al. (2010a) examined responsiveness of the SIS version 3.0 in a sample of 74 patients with chronic stroke. Participants were randomly assigned to receive constraint-induced movement therapy (CIMT), bilateral arm training (BAT) or conventional rehabilitation over a 3-week intervention period. Responsiveness was measured according to change from pre- to post-treatment, using Wilcoxon signed rank test and Standardised Response Mean (SRM). Most SIS domains showed small responsiveness (SRM = 0.22-0.33, Wilcoxon Z = 1.78-2.72). Medium responsiveness was seen for Hand Function (SRM = 0.52, Wilcoxon Z = 4.24, P<0.05), Stroke Recovery (SRM = 0.57, Wilcoxon Z = 4.56, P<0.05) and SIS total score (SRM=0.50, Wilcoxon Z = 3.89, P<0.05).
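
The SRM used here is the mean change score divided by the standard deviation of the change scores, with values around 0.2, 0.5 and 0.8 conventionally read as small, medium and large. A sketch with fabricated pre/post domain scores:

```python
def srm(pre, post):
    """Standardised response mean: mean change / SD of change (sample SD)."""
    changes = [b - a for a, b in zip(pre, post)]
    n = len(changes)
    mean_c = sum(changes) / n
    sd_c = (sum((c - mean_c) ** 2 for c in changes) / (n - 1)) ** 0.5
    return mean_c / sd_c

# Four hypothetical patients, pre- and post-treatment domain scores:
print(round(srm([50, 55, 60, 65], [55, 52, 68, 67]), 2))  # 0.64 -> medium
```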

Lin et al. (2010b) evaluated the clinically important difference (CID) within four physical domains of the SIS 3.0 (strength, ADL/IADL, mobility, hand function) in a sample of 74 patients with chronic stroke. Participants were randomly assigned to receive CIMT, BAT or conventional rehabilitation over a 3-week intervention period. The following change scores were found to indicate a true and reliable improvement (minimal detectable change, MDC): Strength subscale = 24.0; ADL/IADL subscale = 17.3; Mobility subscale = 15.1; and Hand Function subscale = 25.9. The following mean change scores were considered to represent a CID: Strength subscale = 9.2; ADL/IADL subscale = 5.9; Mobility subscale = 4.5; and Hand Function subscale = 17.8. CID values were determined by the effect-size index and from comparison with a global rating of change (defined by a score of 10-15% in patients’ perceived overall recovery from pre- to post-treatment).
Note: Lin et al. (2010b) note that CID estimates may have been influenced by the age of participants and baseline degree of severity. Younger patients needed greater change scores from pre- to post-treatment to show a clinically important improvement compared to older patients. Those with higher baseline severity of symptoms showed greater MDC values and therefore must show more change from pre- to post-treatment in order to demonstrate significant improvements. Also, the results may be limited to stroke patients who demonstrate improvement after rehabilitation therapies, Brunnström stage III and sufficient cognitive ability. Therefore, a larger sample size is recommended for future validation of these findings.
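
MDC values like those above are conventionally derived from the standard error of measurement: MDC95 = 1.96 × √2 × SEM, where SEM = SD × √(1 − reliability). Whether Lin et al. (2010b) used exactly this formula is an assumption, and the inputs below are illustrative rather than taken from their data:

```python
def mdc95(sd_baseline, reliability):
    """Minimal detectable change at 95% confidence from baseline SD and
    test-retest reliability: MDC95 = 1.96 * sqrt(2) * SEM,
    with SEM = sd_baseline * sqrt(1 - reliability)."""
    sem = sd_baseline * (1 - reliability) ** 0.5
    return 1.96 * (2 ** 0.5) * sem

# e.g. a baseline SD of 20 points and test-retest reliability of 0.90:
print(round(mdc95(20, 0.90), 1))  # 17.5
```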

Ward et al. (2011) examined responsiveness of the SIS-16 and other clinical measures (STREAM, FIM) in a sample of 30 patients with acute stroke. Change scores were evaluated using Wilcoxon signed rank test and responsiveness to change was assessed using standardized response means (SRM). Measures were taken on admission to and discharge from an acute rehabilitation setting (average length of stay 23.3 days, range 7-53 days). SIS-16 change scores indicated statistically significant improvement from admission to discharge (23.1, p<0.0001) and sensitivity to change was large (SRM=1.65).

Guidetti et al. (2014) examined responsiveness of the SIS 3.0 in a sample of 204 patients with stroke who were assessed at 3 and 12 months post-stroke, using Wilcoxon’s matched pairs test. Clinically meaningful change within a domain was defined as a change of 10-15 points between timepoints. The Participation and Recovery domains were the most responsive domains over the first year post-stroke, with 27.5% and 29.4% of participants (respectively) reporting a clinically meaningful positive change, and 20% and 10.3% of participants (respectively) reporting a clinically meaningful negative change, from 3 to 12 months post-stroke. The Strength and Hand function domains also showed high clinically meaningful positive change (23%, 18.0% respectively) and negative change (14.7%, 14.2% respectively) from 3 to 12 months post-stroke. There were significant changes in scores on the Strength (p=0.045), Emotion (p=0.001) and Recovery (p<0.001) domains from 3 to 12 months post-stroke. The Strength, Hand function and Participation domains had the highest perceived impact (i.e. lowest mean scores) at 3 months and 12 months.

References

  • Beninato, M., Portney, L.G., & Sullivan, P.E. (2009). Using the International Classification of Functioning, Disability and Health as a framework to examine the association between falls and clinical assessment tools in people with stroke. Physical Therapy, 89(8), 816-25.
  • Brandao, A.D., Teixeira, N.B., Brandao, M.C., Vidotto, M.C., Jardim, J.R., & Gazzotti, M.R. (2018). Translation and cultural adaptation of the Stroke Impact Scale 2.0 (SIS): a quality-of-life scale for stroke. Sao Paulo Medical Journal, 136(2), 144-9. doi: 10.1590/1516-3180.2017.0114281017
  • Brott, T.G., Adams, H.P., Olinger, C.P., Marler, J.R., Barsan, W.G., Biller, J., Spilker, J., Holleran, R., Eberle, R., Hertzberg, V., Rorick, M., Moomaw, C.J., & Walker, M. (1989). Measurements of acute cerebral infarction: A clinical examination scale. Stroke, 20, 864-70.
  • Cael, S., Decavel, P., Binquet, C., Benaim, C., Puyraveau, M., Chotard, M., Moulin, T., Parrette, B., Bejot, Y., & Mercier, M. (2015). Stroke Impact Scale version 2: validation of the French version. Physical Therapy, 95(5), 778-90.
  • Carod-Artal, F.J., Coral, L.F., Trizotto, D.S., Moreira, C.M. (2008). The Stroke Impact Scale 3.0: evaluation of acceptability, reliability, and validity of the Brazilian version. Stroke, 39, 2477-84.
  • Choi, S.U., Lee, H.S., Shin, J.H., Ho, S.H., Koo, M.J., Park, K.H., Yoon, J.A., Kim, D.M., Oh, J.E., Yu, S.H., & Kim, D.A. (2017). Stroke Impact Scale 3.0: reliability and validity evaluation of the Korean version. Annals of Rehabilitation Medicine, 41(3), 387-93.
  • Collin, C. & Wade, D. (1990). Assessing motor impairment after stroke: a pilot reliability study. Journal of Neurology, Neurosurgery, and Psychiatry, 53, 576-9.
  • Duncan, P. W., Bode, R. K., Lai, S. M., & Perera, S. (2003b). Rasch analysis of a new stroke-specific outcome scale: The Stroke Impact Scale. Archives of Physical Medicine and Rehabilitation, 84, 950-63.
  • Duncan, P. W., Lai, S. M., Tyler, D., Perera, S., Reker, D. M., & Studenski, S. (2002a). Evaluation of Proxy Responses to the Stroke Impact Scale. Stroke, 33, 2593-9.
  • Duncan, P.W., Reker, D.M., Horner, R.D., Samsa, G.P., Hoenig, H., LaClair, B.J., & Dudley, T.K. (2002b). Performance of a mail-administered version of a stroke-specific outcome measure: The Stroke Impact Scale. Clinical Rehabilitation, 16(5), 493-505.
  • Duncan, P.W., Wallace, D., Lai, S.M., Johnson, D., Embretson, S., & Laster, L.J. (1999). The Stroke Impact Scale version 2.0: Evaluation of reliability, validity, and sensitivity to change. Stroke, 30, 2131-40.
  • Duncan, P.W., Wallace, D., Studenski, S., Lai, S.M., & Johnson, D. (2001). Conceptualization of a new stroke-specific outcome measure: The Stroke Impact Scale. Topics in Stroke Rehabilitation, 8(2), 19-33.
  • Duncan, P.W., Lai, S.M., Bode, R.K., Perea, S., DeRosa, J.T., GAIN Americas Investigators. (2003a). Stroke Impact Scale-16: A brief assessment of physical function. Neurology, 60, 291-6.
  • Edwards, B. & O’Connell, B. (2003). Internal consistency and validity of the Stroke Impact Scale 2.0 (SIS 2.0) and SIS-16 in an Australian sample. Quality of Life Research, 12, 1127-35.
  • Finch, E., Brooks, D., Stratford, P.W., & Mayo, N.E. (2002). Physical Rehabilitations Outcome Measures. A Guide to Enhanced Clinical Decision-Making (2nd ed.), Canadian Physiotherapy Association, Toronto.
  • Folstein, M.F., Folstein, S.E., & McHugh, P.R. (1975). “Mini-mental state”. A practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research, 12(3), 189-98.
  • Fugl-Meyer, A.R., Jaasko, L., Leyman, I., Olsson, S., & Steglind, S. (1975). The post-stroke hemiplegic patient: a method for evaluation of physical performance. Scandinavian Journal of Rehabilitation Medicine, 7, 13-31.
  • Fulk, G.D., Reynolds, C., Mondal, S., & Deutsch, J.E. (2010). Predicting home and community walking activity in people with stroke. Archives of Physical Medicine and Rehabilitation, 91, 1582-6.
  • Geyh, S., Cieza, A., & Stucki, G. (2009). Evaluation of the German translation of the Stroke Impact Scale using Rasch analysis. The Clinical Neuropsychologist, 23(6), 978-95.
  • Goncalves, R.S., Gil, J.N., Cavalheiro, L.M., Costa, R.D., & Ferreira, P.L. (2012). Reliability and validity of the Portuguese version of the Stroke Impact Scale 2.0 (SIS 2.0). Quality of Life Research, 21(4), 691-6.
  • Guidetti, S., Ytterberg, C., Ekstam, L., Johansson, U., & Eriksson, G. (2014). Changes in the impact of stroke between 3 and 12 months post-stroke, assessed with the Stroke Impact Scale. Journal of Rehabilitative Medicine, 46, 963-8.
  • Hamilton, B.B., Granger, C.V., & Sherwin, F.S. (1987). A uniform national data system for medical rehabilitation. In: Fuhrer, M. J., ed. Rehabilitation Outcome: Analysis and Measurement. Baltimore, Md: Paul Brookes, 137-47.
  • Hamza, A.M., Nabilla, A.S., & Loh, S.Y. (2012). Evaluation of quality of life among stroke survivors: linguistic validation of the Stroke Impact Scale (SIS) 3.0 in Hausa language. Journal of Nigeria Soc Physiotherapy, 20, 52-9.
  • Hamza, A.M., Nabilla, A.-S., Yim, L.S., & Chinna, K. (2014). Reliability and validity of the Nigerian (Hausa) version of the Stroke Impact Scale (SIS) 3.0 index. BioMed Research International, 14, Article ID 302097, 7 pages. doi: 10.1155/2014/302097
  • Hogue, C., Studenski, S., Duncan, P.W. (1990). Assessing mobility: The first steps in preventing fall. In: Funk, SG., Tornquist, EM., Champagne, M.T., Copp, L.A., & Wiese, R.A., eds. Key Aspects of Recovery. New York, NY: Springer, 275-81.
  • Hsieh, F.-H., Lee, J.-D., Chang, T.-C., Yang, S.-T., Huang, C.-H., & Wu, C.-Y. (2016). Prediction of quality of life after stroke rehabilitation. Neuropsychiatry, 6(6), 369-75.
  • Huang, Y-h., Wu, C-y., Hsieh, Y-w., & Lin, K-c. (2010). Predictors of change in quality of life after distributed constraint-induced therapy in patients with chronic stroke. Neurorehabilitation and Neural Repair, 24(6), 559-66. doi: 10.1177/1545968309358074
  • Jenkinson, C., Fitzpatrick, R., Crocker, H., & Peters, M. (2013). The Stroke Impact Scale: validation in a UK setting and development of a SIS short form and SIS index. Stroke, 44, 2532-5.
  • Kamwesiga, J.T., von Koch, L., Kottorp, A., & Guidetti, S. (2016). Cultural adaptation and validation of Stroke Impact Scale 3.0 version in Uganda: a small-scale study. SAGE Open Medicine, 4: 2050312116671859. doi: 10.1177/2050312116671859
  • Kwon, S., Duncan, P., Studenski, S., Perera, S., Lai, S.M., & Reker, D. (2006). Measuring stroke impact with SIS: Construct validity of SIS telephone administration. Quality of Life Research, 15, 367-76.
  • Lai, S.M., Perera, S., Duncan, P.W., & Bode, R. (2003). Physical and Social Functioning After Stroke: Comparison of the Stroke Impact Scale and Short Form-36. Stroke, 34, 488-93.
  • Lawton, M. & Brody, E. (1969). Assessment of older people: self-maintaining and instrumental activities of daily living. Gerontologist, 9, 179 -86.
  • Lee, H.-J. & Song, J.-M. (2015). The Korean language version of Stroke Impact Scale 3.0: cross-cultural adaptation and translation. Journal of the Korean Society of Physical Medicine, 10(3), 47-55.
  • Lin, K.C., Fu, T., Wu, C.Y., Hsieh, Y.W., Chen, C.L., & Lee, P.C. (2010a). Psychometric comparisons of the Stroke Impact Scale 3.0 and Stroke-Specific Quality of Life Scale. Quality of Life Research, 19(3), 435-43. doi: 10.1007/s11136-010-9597-5.
  • Lin K.-C., Fu T., Wu C.Y., Wang Y.-H., Wang Y-.H., Liu J.-S., Hsieh C.-J., & Lin S.-F. (2010b). Minimal detectable change and clinically important difference of the Stroke Impact Scale in stroke patients. Neurorehabilitation and Neural Repair, 24, 486-92.
  • MacIsaac, R., Ali, M., Peters, M., English, C., Rodgers, H., Jenkinson, C., Lees, K.R., Quinn, T.J., VISTA Collaboration. (2016). Derivation and validation of a modified short form of the Stroke Impact Scale. Journal of the American Heart Association, 5:e003108. doi: 10.1161/JAHA.115.003108
  • Mahoney, F.I. & Barthel, D.W. (1965). Functional evaluation: The Barthel Index. Maryland State Medical Journal, 14, 61-5.
  • Mulder, M. & Nijland, R. (2016). Stroke Impact Scale. Journal of Physiotherapy, 62, 117.
  • Ochi, M., Ohashi, H., Hachisuka, K., & Saeki, S. (2017). The reliability and validity of the Japanese version of the Stroke Impact Scale version 3.0. Journal of UOEH, 39(3), 215-21. doi: 10.7888/juoeh.39.215
  • Richardson, M., Campbell, N., Allen, L., Meyer, M., & Teasell, R. (2016). The stroke impact scale: performance as a quality of life measure in a community-based stroke rehabilitation setting. Disability and Rehabilitation, 38(14), 1425-30. doi: 10.3109/09638288.2015.1102337
  • Sullivan, J. (2014). Measurement characteristics and clinical utility of the Stroke Impact Scale. Archives of Physical Medicine and Rehabilitation, 95, 1799-1800.
  • Vellone, E., Savini, S., Barbato, N., Carovillano, G., Caramia, M., & Alvaro, R. (2010). Quality of life in stroke survivors: first results from the reliability and validity of the Italian version of the Stroke Impact Scale 3.0. Annali di Igiene, 22, 469-79.
  • Vellone, E., Savini, S., Fida, R., Dickson, V.V., Melkus, G.D., Carod-Artal, F.J., Rocco, G., & Alvaro, R. (2015). Psychometric evaluation of the Stroke Impact Scale 3.0. Journal of Cardiovascular Nursing, 30(3), 229-41. doi: 10.1097/JCN.0000000000000145
  • Ward, I., Pivko, S., Brooks, G., & Parkin, K. (2011). Validity of the Stroke Rehabilitation Assessment of Movement Scale in acute rehabilitation: a comparison with the Functional Independence Measure and Stroke Impact Scale-16. Physical Medicine and Rehabilitation, 3(11), 1013-21. doi: 10.1016/j.pmrj.2011.08.537
  • Ware, J.E. Jr., & Sherbourne, C.D. (1992). The MOS 36-item short-form health survey (SF-36). I. Conceptual framework and item selection. Medical Care, 30, 473-83.
  • Yesavage, J.A., Brink, T., Rose, T.L., Lum, O., Huang, V., Adey, M., & Leirer, V.O. (1983). Development and validation of a geriatric depression screening scale: A preliminary report. Journal of Psychiatric Research, 17, 37-49.

See the measure

How to obtain the SIS?

Please click here to see a copy of the SIS.

This instrument was developed by:

  • Pamela Duncan, PhD, PT
  • Dennis Wallace, PhD
  • Sue Min Lai, PhD, MS, MBA
  • Stephanie Studenski, MD, MPH
  • Dallas Johnson, PhD, and
  • Susan Embretson, PhD.

In order to gain permission to use the SIS and its translations, please contact MAPI Research Trust: contact@mapi-trust.org

Western Aphasia Battery (WAB)

Evidence Reviewed as of before: 07-06-2013
Author(s)*: Vanessa Barfod, BA
Editor(s): Annabel McDermott, OT., Nicol Korner-Bitensky, PhD OT.
Content consistency: Gabriel Plumier

Purpose

The Western Aphasia Battery (WAB) is a diagnostic tool used to assess the linguistic skills and main nonlinguistic skills of adults with aphasia. This provides information for the diagnosis of the type of aphasia and identifies the location of the lesion causing aphasia.

In-Depth Review

Purpose of the measure

The Western Aphasia Battery (WAB) was developed by Kertesz in 1979 based on the original format of the Boston Diagnostic Aphasia Examination (Goodglass & Kaplan, 1972). It was designed as an assessment tool to examine linguistic skills (information content, fluency, auditory comprehension, repetition, naming and word finding, reading, and writing) and main nonlinguistic skills (drawing, block design, calculation, and praxis) of adults with aphasia. The observed language behaviours facilitate diagnosis by classifying the patient as having 1 of 8 types of aphasia (Global, Broca’s, Transcortical motor, Wernicke’s, Transcortical sensory, Mixed transcortical, Conduction, and Anomic) according to primary aspects of language functioning (Risser & Spreen, 1985). The WAB is also used to determine the location of the lesion.

The WAB is one of the most commonly used assessment tools by speech language pathologists in Canada (Korner-Bitensky et al., 2006).

Available versions

The WAB was originally developed by Kertesz in 1979. It was revised in 1982 and then again in 2006, when it was published as the Western Aphasia Battery-Revised (WAB-R). The WAB-R has a number of improvements including supplemental tasks, revision of the items, more modern equipment (e.g. spiral-bound stimulus book replacing loose stimulus cards), as well as revised directions and scoring guidelines for clarity. The WAB-R also includes a bedside screening tool.

Features of the measure

Subscales:

The WAB comprises 8 subscales; the WAB-R includes an additional subscale:

  1. Spontaneous speech
  2. Auditory verbal comprehension
  3. Repetition
  4. Naming and word finding
  5. Reading
  6. Writing
  7. Apraxia
  8. Constructional, visuospatial, and calculation tasks
  9. Supplemental writing and reading tasks (WAB-R only)

Items and scoring:

The WAB consists of two parts:

PART 1
Subtest 1. Spontaneous speech
Task Description Scoring
1. Conversational questions: The client verbally responds to 6 personal questions. There are 2 scoring sections for Spontaneous speech: Information content; and Fluency, grammatical competence, and paraphasias. For Information content: 10 points are given if all 6 questions are answered correctly with sentences of normal length and complexity, as well as a reasonably complete description of the picture. Nine points are given if all 6 questions are answered correctly, with an almost complete description of the picture. Eight points are given if 5 questions are answered correctly, with an incomplete description of the picture. Seven points are given if 4 questions are answered correctly, with at least 6 items in the picture mentioned. Six points are given if 4 questions are answered correctly, with some response to the picture. Five points are given if 3 questions are answered correctly, with some response to the picture. Four points are given if 3 questions are answered correctly. Three points are given if 2 questions are answered correctly. Two points are given if 1 question is answered correctly. One point is given for incomplete responses. No points are given for no information.

For Fluency, grammatical competence, and paraphasias:

Scoring is from 0 to 10. Ten points are given for sentences of normal length and complexity, without slowing, stopping, or articulatory difficulty, and with no paraphasias. As sentences become shorter, less complex, and slower, or contain more paraphasias, fewer points are given.

Please see the test manual for further information.

2. Personal description: The client describes a picture in the stimulus book.
Subtest 2. Auditory verbal comprehension
Task Description Scoring
1. Yes/No questions: The client must answer personal, environmental and general questions with a Yes or No. If the client corrects themselves, the last answer is scored. Three points are given for each correct answer; if the answer is ambiguous, 0 points are given. The examiner also marks whether the response was verbal, gestural, or eye blink.
2. Auditory word recognition: The client is shown 6 real objects, as well as cards of pictured objects, forms, letters, numbers and colors. The client must point to what the examiner says. There are 6 items in each category: real objects, drawn objects, forms, letters, numbers, colors, furniture, body parts, fingers and right-left body parts. One point is given to each item pointed to correctly. For the right-left category, the client must get both the side and the body part correct to receive the point. Maximum score is 60.
3. Sequential commands: The client must execute 11 commands that increase in difficulty and length. There are scores associated with the segments in each of the listed commands. Points are given for each correct execution. Please see the test booklet for further information. Maximum score is 80.
Subtest 3. Repetition
Description Scoring
1. The client must repeat words, phrases and sentences of increasing difficulty (from single words to a complex sentence), with a total of 15 items. As length and difficulty increase, more points are given for correct repetition. Two points are given if an item is incompletely repeated. One point is taken off for errors in word sequence, or for every literal paraphasia. Maximum score is 100. Scoring takes phonemic errors into account by permitting partial credit.
Subtest 4. Naming and word finding
Task Description Scoring
1. Object naming: Show 20 objects from various categories to the client, and ask them to name each one at a time. If there is no response to the visual stimulus, the examiner allows the client to touch the stimulus. If there still isn’t a correct response, the examiner presents a phonemic or semantic cue. A maximum of 20 seconds is given to the client to respond. Three points are given if named correctly (or with a minor articulatory error); 2 points are given for a recognizable phonemic paraphasia; 1 point is given if the client needed a tactile or phonemic cue to respond correctly. Maximum score is 60. The task permits sequential tactile and phonemic cueing for the patient who cannot provide the proper name upon initial confrontation with the object, yielding qualitatively useful information without sacrificing the integrity of the scoring system.
2. Word fluency: The client must name as many animals as they can in one minute. One point is given for each animal named, even if it is distorted by literal paraphasia. Maximum score is 20.
3. Sentence completion: The client must complete sentences read to them. Two points are given for correct responses; 1 point is given for phonemic paraphasias. Reasonable alternatives are accepted. Maximum score is 10.
4. Responsive speech: The client must answer 5 sentences read to them. Two points are given for acceptable responses; 1 point is given for phonemic paraphasias. Maximum score is 10.
PART 2
Subtest 1: Reading
Task Description Scoring
1. Reading comprehension of sentences: The client is shown 8 test sentences and asked to point to the best missing word from a list, to complete the sentence. As the sentences increase in length and difficulty, more points are given to the client for correct answers. Maximum score is 40.
2. Reading commands: The client is asked to read a command out loud and then do what it says. There are 6 items. For the first 3 items, 1 point is given for correctly reading the command out loud and 1 point for correctly performing it. For the last 3 items, 2 points are given for correctly reading the command out loud and 2 points for correctly performing it. A partial score is given if only part of the command is read, if it contains paraphasias, or if only part of the command is performed. Maximum score is 20.
At this point, if the combined score of tasks 1 and 2 is 50 or more, award full credit for the remaining tasks by scoring the Reading subtest as 100 minus twice the difference between 60 and the client’s combined score
[Score = 100 – 2(60 – patient’s score)].
If this is not the case, continue testing.
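The discontinue rule above is simple arithmetic. As a minimal sketch (function and parameter names are my own, not from the WAB manual; tasks 1 and 2 together have a maximum of 60 points):

```python
# Illustrative sketch of the Reading discontinue rule described above.
# All names are hypothetical, for exposition only.
def reading_total(task1_score, task2_score, remaining_task_scores=()):
    combined = task1_score + task2_score
    if combined >= 50:
        # Full credit: 100 minus twice the shortfall from 60.
        return 100 - 2 * (60 - combined)
    # Otherwise testing continues and the remaining task scores are added.
    return combined + sum(remaining_task_scores)
```

For example, a combined score of 53 on the first two tasks yields 100 – 2(60 – 53) = 86.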
3. Written word stimulus-object choice matching: Objects are placed in a random order in front of the client. They are then asked to point to the object that matches the word on the card shown to them. One point is given for every correct response. Maximum score is 6.
4. Written word stimulus-picture choice matching: A card with pictures on it is shown to the client. They are then asked to point to the picture that matches the words on the same card as the previous task. One point is given for every correct response. Maximum score is 6.
5. Picture stimulus-written word choice matching: The same words on the card from the previous two tasks are shown to the client. They are then asked to point to the word that matches the picture that is presented to them one at a time. One point is given for every correct response. Maximum score is 6.
6. Spoken word-written word choice matching: The examiner will say a word and the client must point to one of 5 words written on the card, that matches it. One point is given for every correct response. Maximum score is 4.
7. Letter discrimination: The examiner refers to the client’s score on the Auditory word recognition task (Part 1, subtest 2). If that score is less than 3, the examiner shows the client cutout letters and has them point to the ones that appear on the card presented to them. One point is given for every correct response. Maximum score is 6.
8. Spelled word recognition: The client must say the word that the examiner spells out orally. One point is given for every correct response. Maximum score is 6.
9. Spelling: The client must spell the word said orally to them by the examiner. One point is given for every correct response. Maximum score is 6.
Subtest 2: Writing
Task Description Scoring
1. Writing on request: The client must write down their name and address. One point is given for every recognizable word or number. Half a point is deducted for every spelling mistake or paraphasic error. Maximum score is 6.
2. Written output: The client is given 3 minutes to write a story about a picture they are shown. Thirty-four points are given for a full description; 8 points are given for every complete sentence with at least 6 words; 1 point is given for every correct word in incomplete or short sentences (with a maximum of 10 points for this). Half a point is deducted for every spelling mistake or paraphasic error. Punctuation is not scored. Maximum score is 34.
3. Writing to dictation: The client must write down sentences dictated by the examiner. Ten points are given for the complete sentence being written down; 1 point is given for every correct word; 0.5 point is deducted for every spelling mistake or paraphasic error.
At this point, if the combined score of tasks 1, 2 and 3 is 40 or more,
the score is: 2 x patient’s score.
4. Writing of dictated or visually presented words: The client must write down words that were dictated by the examiner. There are 6 items. If the client does not understand the word dictated, the examiner shows them the real object; correct answers at this point still receive full credit. If they still do not understand, the examiner spells the word orally, and if the client still does not know it, the examiner provides cut-out letters with 2 extra ones; correct answers at this point receive half the score. Half a point is deducted for incorrect letters. Maximum score is 10.
5. Alphabet and numbers: The client must write down the alphabet, as well as the numbers 0-20. Half a point is given for every letter or number, regardless of order. Maximum score for the alphabet is 12.5; for the numbers, 10.
6. Dictated letters and numbers: The client must write down letters, followed by numbers, dictated by the examiner. Half a point is given for every letter written correctly; 1 point is given for every complete number. Maximum score for the letters is 2.5; for the numbers is 5.
7. Copying a sentence: The client is shown a card with a sentence written on it. They are instructed to copy down this sentence. Ten points are given for the complete sentence; 1 point is given for every correct word; 0.5 point is deducted for every incorrect letter. Maximum score is 10.
Subtest 3: Apraxia
Description Scoring
1. Clients are asked to perform 20 actions. Three points are given for a good performance; 2 points are given for an approximate performance. If the client fails to perform the command well, the examiner imitates the action; 2 points are given for a good performance at this point; 1 point is given for an approximate performance at this point. If the client fails to perform well after imitation, the examiner gives the client the real object, where applicable; 2 points are given if the client uses a body part for an object; 1 point is given for a good performance with the real object. Maximum score is 60.
Subtest 4: Constructional, visuospatial and calculation tasks
Task Description Scoring
1. Drawing: The client must draw 8 figures of different complexity.
2. Block design: The client must arrange 4 blocks to match the picture shown to them from the Stimulus book. There are 3 different pictures that must be matched. The examiner first shows the client how to arrange the blocks, then mixes them to allow the client to complete the task. If the client completes the task in 60 seconds, they are given 3 points. If the client fails to complete it in 90 seconds, the examiner mixes the blocks and lets them try again; if they complete it within the extra time (2 minutes) they are given 2 points. They are given 1 point for putting the blocks together.
3. Calculation: The client must solve 12 mathematical equations (addition, subtraction, division, multiplication) by either pointing to or saying one of the 4 answers shown to them on the card. Two points are given for each correct answer. No partial marks are given. Maximum score is 24.
4. Raven’s colored progressive matrices: The client must point to the piece that completes a larger pattern. One point is given for each correct answer. One additional point is awarded if the task is completed under 5 minutes. Maximum score is 37.
WAB-R Supplemental writing and reading tasks
Task Description Scoring
1. Writing irregular words to dictation The client must write down the 10 words, with irregular spelling, that the examiner dictates to them. Please refer to the manual.
2. Writing non-words to dictation The client must write down 10 nonsense words that are dictated to them by the examiner. Please refer to the manual.
3. Reading irregular words The client must read 10 words, with irregular spelling, from the stimulus book, out loud. Please refer to the manual.
4. Reading non-words (supplemental) The client must read 10 nonsense words out loud. Please refer to the manual.

Composite scores:

In addition to the subscales scores, there are three additional composite scores (note that the following do not include the supplemental subtest of the WAB-R):

1 – Language Quotient (LQ):

This is the newest summary score that encompasses auditory comprehension, oral expression, reading, and writing performance (Shewan, 1986). The first two elements relate to spoken language performance and the latter two relate to written language performance. The LQ weights spoken to written language performance 60:40, because people use spoken language skills more regularly.

Computation: The LQ uses the scores from all subtests of the first part of the WAB (Spontaneous speech, Auditory verbal comprehension, Repetition, Naming and word finding) and the first two subtests of the second part of the WAB (Reading, Writing). The LQ score can range from 0 to 100. Each subtest score is a portion of the total score according to their respective maximum scores, as follows:

Subtest Contribution to the LQ Computation
Spontaneous speech 20% maximum score = 20; patient’s score is recorded as-is
Auditory verbal comprehension 20% maximum score = 200; patient’s score is divided by 10 to reach 20%
Repetition 10% maximum score = 100; patient’s score is divided by 10 to reach 10%
Naming and word finding 10% maximum score = 100; patient’s score is divided by 10 to reach 10%
Reading 20% maximum score = 100; patient’s score is divided by 5 to reach 20%
Writing 20% maximum score = 100; patient’s score is divided by 5 to reach 20%

Accordingly, the LQ is calculated as the sum of each subtest revised score.
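Under the weightings in the table above, the LQ reduces to a short sum. A hedged sketch (function and parameter names are my own; raw maximums follow the table: Spontaneous speech 20, Auditory verbal comprehension 200, Repetition 100, Naming 100, Reading 100, Writing 100):

```python
# Sketch of the Language Quotient computation from the table above.
# Names are illustrative, not from the WAB manual.
def language_quotient(spontaneous_speech, auditory_comprehension,
                      repetition, naming, reading, writing):
    return (spontaneous_speech             # max 20, counted as-is (20%)
            + auditory_comprehension / 10  # max 200 -> 20 points (20%)
            + repetition / 10              # max 100 -> 10 points (10%)
            + naming / 10                  # max 100 -> 10 points (10%)
            + reading / 5                  # max 100 -> 20 points (20%)
            + writing / 5)                 # max 100 -> 20 points (20%)
```

A client with maximum raw scores on every subtest thus obtains an LQ of 20 + 20 + 10 + 10 + 20 + 20 = 100.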

2. Cortical Quotient (CQ):

The CQ is a weighted average of all subtest scores. Since the non-language portion is included in this summary score, it is not a sound indicator of language ability or of aphasia severity. The CQ score can range from 0 to 100.

Computation: Like the LQ, each subtest’s score is a portion of the total score.

Subtest Contribution to the CQ Computation
Spontaneous speech 20% maximum score = 20; patient’s score is recorded as-is
Auditory verbal comprehension 20% maximum score = 200; patient’s score is divided by 10 to reach 20%
Repetition 10% maximum score = 100; patient’s score is divided by 10 to reach 10%
Naming and word finding 10% maximum score = 100; patient’s score is divided by 10 to reach 10%
Reading 20% maximum score = 100; patient’s score is divided by 5 to reach 20%
Writing 20% maximum score = 100; patient’s score is divided by 5 to reach 20%

3. Aphasia Quotient (AQ):

The AQ is a weighted average of all subtest scores relating to spoken language, measuring language ability. It is a sum of all subtest scores from the first part of the WAB (Spontaneous speech, Auditory verbal comprehension, Repetition, Naming and word finding). The examiner can use the AQ score to classify the client’s aphasia as 1 of 8 types (Kyoung Kang et al, 2010).

AQ Score Severity
0-25 Very severe
26-50 Severe
51-75 Moderate
76+ Mild
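The severity bands above can be read as a simple lookup; a minimal sketch (function name mine, bands taken directly from the table):

```python
# Illustrative mapping of a WAB Aphasia Quotient (0-100) to the
# severity bands tabulated above. For exposition only.
def aq_severity(aq):
    if aq <= 25:
        return "Very severe"
    if aq <= 50:
        return "Severe"
    if aq <= 75:
        return "Moderate"
    return "Mild"
```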

Taxonomic Table of the Western Aphasia Battery (Kertesz, 1982)

Fluency Comprehension Repetition Naming
Global 0-4 0-3.9 0-4.9 <7
Broca’s 0-4 4-10 0-7.9 <9
Isolation 0-4 0-3.9 5-10 <7
Transcortical Motor 0-4 4-10 8-10 <9
Wernicke’s 5-10 0-6.9 0-7.9 <10
Transcortical Sensory 5-10 0-6.9 8-10 <10
Conduction 5-10 7-10 0-6.9 <10
Anomic 5-10 7-10 7-10 <10

Note: Risser and Spreen (1984) commented that the WAB classification criteria force assignment to a type of aphasia based on idealized subtest scores, and avoid the term ‘mixed’ aphasia that accounts for a proportion of the aphasic population.
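The taxonomic table amounts to a decision tree on the fluency, comprehension, and repetition cut-offs. As an illustrative sketch only (the naming criteria are omitted for simplicity, and names are my own; this is not a clinical tool):

```python
# Sketch of the taxonomic lookup implied by the table above, using the
# fluency, comprehension, and repetition ranges (scores 0-10).
def classify_aphasia(fluency, comprehension, repetition):
    if fluency <= 4:                     # nonfluent types
        if comprehension <= 3.9:
            return "Global" if repetition <= 4.9 else "Isolation"
        return "Broca's" if repetition <= 7.9 else "Transcortical Motor"
    # fluent types
    if comprehension <= 6.9:
        return "Wernicke's" if repetition <= 7.9 else "Transcortical Sensory"
    return "Conduction" if repetition <= 6.9 else "Anomic"
```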

What to consider before beginning:

  • The client should be well rested and should be allowed frequent breaks during the assessment.
  • The examiner may want to video-record the testing session to allow for later review of the client’s performance.
  • The examiner must remember to place all pictures and objects within the client’s visual field.
  • The examiner must be aware of the client’s self-corrections and must always record the client’s responses, whether or not they are correct.
  • The examiner may repeat the instructions once, unless otherwise instructed in the manual.
  • The examiner must not let the client know how they are doing on the test and should be aware not to give cues unless instructed to.
  • Before administering the supplemental section of the WAB-R, the examiner must review the client’s performance on the previous Reading and Writing tasks. The examiner may choose not to administer this last section if the client did not perform satisfactorily, because the level of difficulty is increased. The supplemental section is used to distinguish whether a client has semantic or phonological dyslexia; the examiner may also gather information about spelling dyslexia.

Time:

Part 1 of the WAB takes approximately 30-45 minutes; Part 2 takes approximately 45-60 minutes to administer.

The WAB-R includes a Bedside Screener that takes about 15 minutes to administer.

Training requirements:

None typically reported.

Equipment:

The WAB requires specialized equipment, in addition to unlined paper, a pen and pencil, and a stopwatch. The complete set includes 25 record forms, 57 stimulus cards and a test manual. Stimulus cards include a range of images, words and sentences that are shown to the client during the assessment.

The WAB-R contains a pack of 25 bedside record forms, a Raven’s Coloured Progressive Matrices test booklet and manipulative set (cup, comb, flower, matches, screwdriver, 4 Koh’s blocks, watch, hammer, telephone, ball, knife, safety pin, toothbrush, eraser, padlock, key, paperclip, rubber band, spoon, tape, fork).

Detailed administration guidelines are provided in the test manual.

Alternative forms of the Assessment

There are no alternate forms, although the WAB-R includes a bedside screener for use when time is constrained. This bedside screener can be administered in a comfortable setting. The record form provides administration and scoring guidelines. The resulting score can help establish a client’s abilities and functioning before surgery or another medical procedure.

Can be used with:

  • Patients between the ages of 18 and 89 with acquired neurological disorders due to stroke, a head injury, or dementia.
  • Alzheimer’s disease (Risser & Spreen, 1984)
  • Nonaphasic right hemisphere lesions (Risser & Spreen, 1984)

In what languages is the measure available?

  • Published in English (Risser & Spreen, 1984).
  • Published in Japanese (WAB Aphasia Test Construction Committee, 1986).
  • Available in Hungarian, French, Portuguese, and two Indian language translations (Risser & Spreen, 1984); translated into Turkish (Keklikoglu, Selcuki & Keskin, 2009), translated into Korean (Kim and Na, 2004) (Unpublished).
  • Standardized Cantonese version (CAB; Yiu, 1992).
  • Hebrew version (Kasher, Batori, Soroker, Graves & Zaidel, 1999).

Summary

What does the tool measure? The WAB assesses the linguistic and main nonlinguistic skills of adults with aphasia. This provides information regarding the type and severity of aphasia and lesion location.
What types of clients can the tool be used for? Clients between the ages of 18 and 89 with acquired neurological disorders due to stroke, head injury, or dementia.
Is this a screening or assessment tool? Assessment tool (The WAB-R has a bedside screening tool).
Time to administer
  • 30-45 minutes (Part 1)
  • 45-60 minutes (Part 2)
  • 15 minutes (Bedside Screener)
Versions
  • Cantonese version of the WAB (CAB)
  • Korean version of the WAB (K-WAB)
Other Languages
  • Published in English (Risser & Spreen, 1984) and Japanese (WAB Aphasia Test Construction Committee, 1986).
  • Available in Hungarian, French, Portuguese, and two Indian language translations (Risser & Spreen, 1984); translated into Turkish (Keklikoglu, Selcuki & Keskin, 2009), translated into Korean (Kim and Na, 2004) (Unpublished).
Measurement Properties
Reliability Internal consistency:
– One study reported excellent internal consistency of the WAB-LQ.
– One study reported adequate to excellent internal consistency of the K-WAB Naming subtest.

Test-retest:
– Three studies reported adequate to excellent test-retest reliability of the WAB.
– One study reported excellent test-retest reliability of the K-WAB.
– One study reported excellent test-retest reliability of the WAB.

Intra-rater:
One study reported excellent intra-rater reliability of the WAB.

Inter-rater:
– One study reported excellent inter-rater reliability of the WAB.
– One study reported excellent inter-rater reliability of the K-WAB.

Validity Content:
Content validity of the WAB was derived from comparison with content of other aphasia test batteries.

Criterion:
Concurrent:
One study examined concurrent validity of the K-WAB and reported adequate to excellent correlations with the Korean version of the Boston Naming Test.

Predictive:
No studies have examined predictive validity of the WAB with patients with stroke.

Construct:
One study examined construct validity of the WAB by factor analysis.

Convergent/Discriminant:
– Six studies examined convergent validity of the WAB and have reported excellent correlations with the Neurosensory Center Comprehensive Examination for Aphasia and the Mississippi Aphasia Screening Test; adequate to excellent correlations with the Communicative Effectiveness Index; and adequate correlations with tests of word productivity and error frequency during fluent speech.
– One study examined convergent validity of the WAB and reported adequate to excellent correlations with Main Concept analysis.
– Two studies examined discriminant validity of the WAB and reported adequate correlations with the Raven’s Coloured Progressive Matrices; excellent correlations with the Scandinavian Stroke Scale; and adequate correlations with the Functional Independence Measure and Barthel Index.

Known Groups:
Four studies examined known group validity of the WAB and determined that the tool is able to differentiate between individuals with aphasia and individuals without aphasia; different types of aphasia (except Broca’s and Wernicke’s aphasia); and different severity of aphasia.

Floor/Ceiling Effects No studies have examined floor/ceiling effects of the WAB in clients with stroke.
Sensitivity/Specificity Two studies have reported on sensitivity/specificity of the WAB with patients with stroke. The authors of the measure recommended a cut-off score of 93.8 to determine ‘aphasic’ performance on the WAB Aphasia Quotient, with 60% sensitivity and 100% specificity. A corresponding WAB Cortical Quotient cut-off score of 95.32 resulted in 80% sensitivity and specificity.
Does the tool detect change in patients? The WAB has been used to detect change in communication or severity of aphasia over time or in response to intervention.
Acceptability The WAB is quite lengthy to administer.
Feasibility No specific training is required to administer the WAB.
How to obtain the tool? Available at Pearson assessments

Psychometric Properties

Overview

A literature search was conducted to identify all relevant publications on the psychometric properties of the WAB relevant to individuals with stroke. Thirteen studies were found.

Floor/Ceiling Effects

No studies have reported on floor or ceiling effects of the WAB specific to clients with stroke.

Reliability

Internal consistency:
Shewan and Kertesz (1980) examined internal consistency of WAB Spontaneous Speech, Auditory Comprehension, Repetition, Naming, Reading and Writing subtests (reported by Shewan (1986) as the WAB-LQ) with 140 patients with aphasia secondary to acute stroke, using Cronbach’s alpha and Bentler’s coefficient theta. Internal consistency of all subtests was excellent (α=0.905; θ=0.974).

Kim and Na (2004) used the Korean Western Aphasia Battery (K-WAB) to examine the relationships between the Naming subtest total score and its tasks: Object Naming (ON), Word Fluency (WF), Sentence Completion (SC) and Responsive Speech (RS). There were excellent correlations between the Naming subtest total score and all tasks (ON: r=0.971; WF: r=0.693; SC: r=0.785; RS: r=0.797). There were excellent correlations between all tasks (r=0.606-0.726), except for an adequate correlation between the SC and WF tasks (r=0.534). These correlations suggest that all tasks are similar enough to be part of the same subtest, but not so similar as to be redundant.

Test-retest:
Kertesz and McCabe (1977) assessed 1-year test-retest reliability of the WAB in 22 patients with aphasia secondary to chronic stroke, using Pearson r correlation coefficient and reported excellent test-retest reliability (r=0.992).

Shewan and Kertesz (1980) examined test-retest reliability of the WAB in a sample of up to 38 patients with aphasia secondary to chronic stroke, using Pearson correlation coefficients. Participants were assessed on two occasions anywhere from 6 months to 6.5 years apart (average 12-23 months between testing). Test-retest reliability was excellent for the WAB-AQ (r=0.968), WAB-CQ (n=9, r=0.895), WAB-LQ subtests (also reported in Shewan, 1986): Spontaneous Speech – Information Content (r=0.947) and Fluency (r=0.941), Comprehension (r=0.881), Repetition (r=0.970), Naming (r=0.923), Reading (n=32; r=0.927) and Writing (n=25; r=0.956) and the Construction subtest (n=14, r=0.974). Test-retest reliability was adequate for the Praxis subtest (n=18, r=0.581).
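The test-retest coefficients reported throughout this section are ordinary Pearson product-moment correlations between the two administrations. A minimal sketch with hypothetical paired AQ scores (all values invented):

```python
import numpy as np

def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xd, yd = x - x.mean(), y - y.mean()          # centre both score sets
    return (xd * yd).sum() / np.sqrt((xd ** 2).sum() * (yd ** 2).sum())

# Hypothetical AQ scores for the same six patients at two sessions;
# near-identical rank ordering produces a high test-retest coefficient
t1 = [34.2, 56.0, 71.5, 88.3, 42.9, 63.7]
t2 = [35.0, 54.8, 73.1, 87.6, 44.2, 62.5]
print(round(pearson_r(t1, t2), 3))
```

Note that a high r indicates stable rank ordering across sessions; it does not by itself rule out a uniform shift in scores, which is why several studies above also test for mean change between time points.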

Pedersen, Vinter and Skyhoj Olsen (2001) examined 3½-month test-retest reliability of the WAB in a sample of 19 patients with aphasia secondary to chronic stroke, using Pearson correlation coefficient and reported excellent test-retest reliability (r=0.96). There was no significant change in WAB scores across the two time points.

Kim and Na (2004) examined 5-day test-retest reliability of the K-WAB in a sample of 20 neurologically stable aphasic patients, using Pearson’s correlation coefficient. Test-retest reliability of composite scores and all subtests was excellent (Aphasia Quotient: r=0.976; Language Quotient: r=0.977; Cortical Quotient: r=0.920; Spontaneous Speech: r=0.96; Auditory Comprehension: r=0.967; Repetition: r=0.952; Naming: r=0.934; Reading: r=0.986; Writing: r=0.988; Praxis, r=0.908; Construction: r=0.922). These results suggest that all subtests of the K-WAB are stable and reliable over a short time period.

Pak-Hin Kong (2011) examined test-retest reliability of the Cantonese version of the WAB (CAB) in a sample of 16 participants with aphasia secondary to chronic stroke, using Spearman’s rho correlation. Participants were examined on two occasions, 12 to 16 months apart. Test-retest reliability was excellent for the Spontaneous Speech subtest – Information, Fluency and total scores (r=0.83, 0.94, 0.96 respectively) and for the Naming subtest (r=0.91) and overall AQ (r=0.93). Participants did not demonstrate significant differences in CAB scores across the two evaluations.

Intra-rater:
Shewan and Kertesz (1980) reported on intra-rater reliability of the WAB in 10 patients with aphasia secondary to stroke. Videotaped administrations were viewed and scored by 3 judges on two occasions several months apart. Intra-rater reliability was excellent for the WAB-AQ (r=0.986-0.997), WAB-CQ (r=0.992-0.998), all WAB-LQ subtests (also reported in Shewan, 1986): Spontaneous Speech – Information Content (r=0.926-0.944), Spontaneous Speech – Fluency (r=0.794-0.989), Comprehension (r=0.985-0.993), Repetition (r=0.993-0.997), Naming (r=0.995-0.999), Reading/Writing (r=0.985-0.998), and the Praxis (r=0.983-0.989) and Construction (r=0.948-0.996) subtests.

Inter-rater:
Shewan and Kertesz (1980) examined inter-rater reliability of the WAB administered to 10 patients with aphasia secondary to stroke several months apart, using Pearson product moment correlation coefficients. Videotapes were independently scored by 8 judges and correlations were averaged for each subtest. Inter-rater reliability was excellent for the WAB-AQ, WAB-CQ, WAB-LQ subtests (also reported in Shewan, 1986) and the Praxis and Construction subtests.

Kim and Na (2004) examined inter-rater reliability of the K-WAB in a sample of patients, where 93.7% presented with acute or subacute stroke, using Pearson’s correlation coefficient. Three certified Speech-Language Pathologists assessed video-recordings of participants’ language performance. Inter-rater reliability was excellent for the K-WAB composite scores and all subtests (Aphasia Quotient: r=1.000; Language Quotient: r=0.997; Cortical Quotient: r=1.000; Spontaneous Speech – Fluency: r=0.994; Spontaneous Speech – Content: r=0.993; Auditory Comprehension: r=1.000; Repetition: r=0.999; Naming: r=0.999; Reading: r=0.999; Writing: r=0.999; Praxis: r=1.000; Construction: r=0.998).
Note: The number of participants used to calculate inter-rater reliability was unclear.

Validity

Content:

Shewan and Kertesz (1980) reported that the WAB appears to meet subjective criteria for content validity as it measures language content areas common to all aphasia batteries and has similar items to the BDAE.

Shewan (1986) examined content validity of the WAB-LQ by comparison with content of other aphasia test batteries. WAB-LQ subtests (Spontaneous Speech, Auditory Comprehension, Naming, Repetition, Reading, Writing) span the domain of spoken and written language, are aspects typically addressed in rehabilitation, and are also included in common aphasia test batteries in terms of area assessed, content and difficulty.

Criterion:

Concurrent:
Kim and Na (2004) examined concurrent validity of the K-WAB Naming subtest and the Korean version of the Boston Naming Test (K-BNT) in 238 patients, where 93.7% (n=223) of patients presented with stroke. There were excellent correlations between the K-BNT score and K-WAB Object Naming (r=0.720) and Word Fluency (r=0.613) tasks and Naming subtotal score (r=0.719). There were adequate correlations between the K-BNT score and the K-WAB Sentence Completion (r=0.526) and Responsive Speech (r=0.553) tasks.

Predictive:
No studies have reported on predictive validity of the WAB with patients with stroke.

Construct:

Kertesz and Phipps (1977) examined construct validity of the WAB by factor analysis with a sample of 142 patients with aphasia secondary to stroke, using an r-type Principal Components Analysis. Results indicated that the five subtests contributing to the total AQ loaded relatively equally on Root 1, which accounted for 83% of total variance. The Fluency and Comprehension subtests were the principal components for Root 2, which accounted for 9% of total variance. Repetition and Information Content were principal components of Root 3, which accounted for 7% of total variance. Naming was the principal component for Root 4, which accounted for 2% of total variance.
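The “percentage of total variance per root” figures above come from a principal components analysis: each eigenvalue of the subtest covariance matrix, divided by the eigenvalue sum. A minimal sketch on simulated data (a single shared “severity” factor plus noise, not the study’s data), where the first component should dominate much as Root 1 does above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated scores for 100 examinees on 5 subtests: one shared "severity"
# factor plus per-subtest noise, so most variance loads on one component
severity = rng.normal(size=100)
X = np.column_stack([severity + 0.3 * rng.normal(size=100) for _ in range(5)])

Xc = X - X.mean(axis=0)                                        # centre each subtest
eigvals = np.linalg.eigvalsh(np.cov(Xc, rowvar=False))[::-1]   # descending eigenvalues
var_explained = eigvals / eigvals.sum()                        # proportion per root
print(var_explained.round(2))
```

With five subtests driven by one underlying severity dimension, the first root captures the large majority of variance, mirroring the 83% reported for the WAB’s AQ subtests.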

Convergent/Discriminant:
Shewan and Kertesz (1980) examined convergent validity of the WAB by comparison with corresponding subtests of the Neurosensory Center Comprehensive Examination for Aphasia (NCCEA) with a sample of 15 patients with aphasia, using Pearson correlation coefficients. Correlations were excellent between: WAB Spontaneous Speech and NCCEA Description of Use and Sentence Construction (r= 0.817); WAB Comprehension and NCCEA Identification by Name and Identification by Sentence (r= 0.915); WAB Repetition and NCCEA Sentence Repetition (r= 0.880); WAB Naming and NCCEA Visual Naming and Word Fluency (r= 0.904); WAB Reading and NCCEA Reading subtests (r=0.919); WAB Writing and NCCEA Writing subtests (r=0.905); and WAB and NCCEA total scores (r=0.973). There was also an excellent correlation between the WAB-CQ (minus the Praxis and Construction subtests) and a comparable NCCEA score (minus the Tactile Naming-Right/Left, Articulation, Digit Repetition-Forward/Backward subtests) (r=0.964).

Shewan and Kertesz (1980) examined discriminant validity of the WAB by comparison with Raven’s Coloured Progressive Matrices scores with a sample of 140 patients with aphasia, using Pearson product-moment correlation coefficients. There was an adequate correlation (r=0.547) between the two tests, indicating that performance on the WAB is somewhat influenced by intelligence.

Laures-Gore, DuBay, Duff and Buchanan (2010) examined convergent validity of the WAB AQ by comparison with measures of word productivity (WP) and error frequency (EF) in fluent speech in a sample of 14 patients with aphasia secondary to stroke, using Pearson correlation coefficients. WP was measured as the proportion of productive words to total words for each minute of a language sample, and EF was measured as the total number of errors divided by the total number of productive words across a 10-minute speech sample. There was an adequate positive correlation between WAB AQ and WP (r=0.59) and an adequate negative correlation between WAB AQ and EF (r=-0.54).

Bakheit, Carrington, Griffiths and Searle (2005) examined convergent validity of the WAB by comparison with the Communicative Effectiveness Index (CETI) in 67 patients with aphasia and acute stroke, using Pearson correlation coefficients. Participants received multidisciplinary rehabilitation, including conventional speech-language therapy, that began on admission to a rehabilitation unit and continued after discharge from hospital. Measures were taken at baseline, 4 weeks, 8 weeks, 12 weeks and 24 weeks. There was an excellent correlation between the two assessments over all test periods (r=0.71), with adequate to excellent correlations at each time point (r= 0.70, 0.67, 0.53, 0.66 and 0.68, respectively).

Pedersen, Vinter and Skyhoj Olsen (2001) examined convergent validity of the WAB by comparison with a Danish adaptation and translation of the Communicative Effectiveness Index (CETI) in 66 patients with aphasia secondary to chronic stroke, using Pearson correlation coefficients. There was an excellent correlation between the CETI and the WAB Aphasia Quotient (r=0.76). There were adequate to excellent correlations between the CETI and WAB Information (r=0.73), Fluency (r=0.73), Comprehension (r=0.65), Repetition (r=0.53) and Naming (r=0.75) subtests. There were adequate to excellent correlations between the CETI and WAB reading (r=0.70), writing (r=0.69), apraxia (r=0.60), calculation (r=0.64), block design (r=0.34), drawing (r=0.41) and Raven Coloured Matrices (r=0.41) tasks.

Pedersen, Vinter and Skyhoj Olsen (2001) examined discriminant validity of the WAB Aphasia Quotient (WAB-AQ) and Raven Coloured Matrices by comparison with the Scandinavian Stroke Scale (SSS), Barthel Index (BI) and Frenchay Activities Index (FAI) with a sample of 66 patients with aphasia secondary to chronic stroke, using Pearson correlation. There was an excellent correlation between the WAB-AQ and the SSS (r=0.64) and adequate correlations between the WAB-AQ and the BI (r=0.44) and the FAI (r=0.50). There were adequate correlations between the WAB Raven Coloured Matrices and the BI (r=0.41) and FAI (r=0.48).

Kostalova et al. (2008) examined convergent validity of the WAB by comparison with the Czech version of the Mississippi Aphasia Screening Test (MASTcz) in 45 patients with left hemisphere strokes with aphasia, using Pearson product-moment correlation. There was an excellent correlation between the two tests (r= 0.933).

Ivanova and Hallowell (2012) examined convergent validity of the WAB-R Spontaneous Speech, Auditory Verbal Comprehension, Repetition and Naming subtests and Aphasia Quotient with an Eye Movement Working Memory (EMWM) task in a sample of 28 people with aphasia secondary to stroke. There were no significant correlations between tests following Holm correction to control for familywise alpha.

Pak-Hin Kong (2011) examined convergent validity of the Cantonese version of the WAB (CAB) with the main concept (MC) analysis in 16 participants with aphasia secondary to chronic stroke, using Spearman’s rho coefficients. MC analysis captures the presence, accuracy, completeness and efficiency of content in oral narratives among Cantonese speakers with aphasia, and scores included: number of accurate and complete concepts (AC); number of accurate but incomplete concepts (AI), number of inaccurate concepts (IN), number of absent concepts (AB), MC score (MC) and number of accurate and complete concepts per min (AC/min). Scores were correlated with the CAB Spontaneous Speech subtest Information, Fluency and total scores, Naming subtest and overall AQ. Participants were examined on two occasions, 12 to 16 months apart. At both time points there were adequate to excellent correlations between the CAB Spontaneous Speech Information score and most related MC measures (AC: r=0.54, 0.73; AI: r=0.60, 0.62; AB: r=-0.64, -0.75; MC score: r=0.64, 0.76), and excellent correlations between the CAB Spontaneous Speech Fluency score and most related MC measures (AC: r=0.94, 0.93; AB: r=-0.91, -0.87; MC score: r=0.94, 0.91; AC/min: r=0.94, 0.93), CAB Spontaneous Speech score and MC score (r=0.96, 0.90), CAB Naming subtest and related MC measures (AC: r=0.91, 0.89; AB: r=-0.85, -0.84; MC score: r=0.89, 0.92) and CAB overall AQ and related MC measures (AC: r=0.91, 0.89; AB: r=-0.93, -0.84; MC score: r=0.95, 0.91).

Known groups:
Shewan and Kertesz (1980) examined known-group validity of the WAB with a sample of 117 patients with aphasia and 122 control subjects without aphasia, using Newman-Keuls Shortest Significant Range Test. Patients with aphasia were categorized according to type: anomic aphasia (n=37), conduction aphasia (n=13), Wernicke’s aphasia (n=18), Broca’s aphasia (n=25) or global type aphasia (n=24). Control subjects without aphasia formed one of three groups: non-brain injured adults (n=31), individuals with non-dominant-hemisphere lesions (n=70) or individuals with diffuse brain lesions (n=21). Results indicated that the control groups differed significantly from all aphasic types, excluding the anomic group. The control groups did not differ significantly among themselves. All aphasic groups differed significantly among themselves, with the exception of the Broca’s and Wernicke’s aphasic types. The global aphasic group scored significantly lower than other aphasic groups.

Shewan (1986) examined change in WAB-LQ scores over time in a sample of 50 adults with aphasia secondary to acute stroke. Participants were stratified according to aphasia severity based on baseline WAB-AQ scores (mild, moderate, severe) and were assessed at baseline (2-4 weeks post-onset of aphasia), 3 months, and at least 6 months post-baseline. Post-hoc analysis by Newman-Keuls shortest significant range test revealed significant differences among the three groups from baseline to final assessment (p<0.05). WAB-LQ scores changed in a similar fashion for the three severity levels across time. Greatest gains were seen among patients with moderate or severe aphasia.

Ross and Wertz (2003) examined known-group validity of the WAB with a sample of 18 healthy adults and 18 patients with aphasia secondary to chronic stroke, using Wilcoxon’s rank sum test (W). There were significant differences between groups for WAB aphasia quotient and cortical quotient (p<0.001).

Ross and Wertz (2004) examined known-group validity of the WAB in a sample of 10 healthy (non-brain-injured) adults and 10 patients with mild aphasia secondary to chronic stroke. There were significant differences between groups for the WAB aphasia quotient and cortical quotient (p<0.001).

Sensitivity/Specificity:
Ross and Wertz (2004) examined sensitivity and specificity of the WAB with a sample of 10 patients with mild aphasia secondary to chronic stroke and 10 healthy (non-brain-injured) adults. A prescribed cut-off score of 93.8 was used to determine ‘aphasic’ performance on the WAB Aphasia Quotient, yielding 60% sensitivity and 100% specificity. A corresponding WAB Cortical Quotient cut-off score of 95.32 resulted in 80% sensitivity and specificity.

Kim and Na (2004) examined sensitivity and specificity of the K-WAB with a sample of 238 patients, where 93.7% (n=223) presented with acute or subacute stroke. Patients were grouped according to age (15-74 years vs. ≥75 years) and years of education (0, 1-6, ≥7). The following results were found for the AQ:

  • For patients between 15-74 years with 0 years of education an optimal cutoff score of 85.25 resulted in 93% diagnostic accuracy, 92% sensitivity and 94% specificity (area under the curve (AUC) = 0.973);
  • For patients between 15-74 years with 1-6 years of education an optimal cutoff score of 87.45 resulted in 96% diagnostic accuracy, 94% sensitivity and 100% specificity (AUC = 0.998);
  • For patients between 15-74 years with over 7 years of education an optimal cutoff score of 93.25 resulted in 94% diagnostic accuracy, 93% sensitivity and 96% specificity (AUC=0.986);
  • For patients 75 years and older with 0 years of education an optimal cutoff score of 74.05 resulted in 94% diagnostic accuracy, 100% sensitivity and 94% specificity (AUC=0.976);
  • For patients 75 years and older with 1-6 years of education an optimal cutoff score of 86.00 resulted in 100% diagnostic accuracy, sensitivity and specificity (AUC=1.000);
  • For patients 75 years of age and older with over 7 years of education an optimal cutoff score of 88.30 resulted in 93% diagnostic accuracy, 94% sensitivity and 90% specificity (AUC=0.975).

The following results were found for the LQ:

  • For patients between 15-74 years with 1-6 years of education an optimal cutoff score of 79.96 produced 96% diagnostic accuracy, 94% sensitivity and 100% specificity (AUC=0.982);
  • For patients between 15-74 years with over 7 years of education an optimal cutoff score of 89.05 produced 94% diagnostic accuracy, 96% sensitivity and 93% specificity (AUC=0.977);
  • For patients 75 years and older with 1-6 years of education an optimal cutoff of 65.9 produced 100% diagnostic accuracy, sensitivity and specificity (AUC=1.000);
  • For patients 75 years and older with over 7 years of education an optimal cutoff of 72.7 produced 100% diagnostic accuracy, sensitivity and specificity (AUC=1.000).

The following results were found for the CQ:

  • For patients between 15-74 years with 1-6 years of education the optimal cutoff score of 77.65 produced 100% diagnostic accuracy, sensitivity and specificity (AUC=1.000);
  • For patients between 15-74 years of age, with over 7 years of education the optimal cutoff score of 90.85 produced 95% diagnostic accuracy, 94% sensitivity and 96% specificity (AUC=0.986);
  • For patients 75 years and older with 1-6 years of education the optimal cutoff score of 69.72 produced 100% diagnostic accuracy, sensitivity and specificity (AUC=1.000);
  • For patients 75 years and older with over 7 years of education the optimal cutoff score of 75.86 produced 100% diagnostic accuracy, sensitivity and specificity (AUC=1.000).
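The sensitivity, specificity, and diagnostic-accuracy figures above all follow from classifying each score against a single cut-off. A minimal Python sketch, assuming a “score below cut-off = aphasic” rule; the 93.8 cut-off matches the WAB AQ convention cited earlier, but every score below is invented for illustration:

```python
def cutoff_stats(aphasic_scores, control_scores, cutoff):
    """Sensitivity, specificity and accuracy of a 'score < cutoff = aphasic' rule."""
    tp = sum(s < cutoff for s in aphasic_scores)    # aphasic, correctly flagged
    tn = sum(s >= cutoff for s in control_scores)   # control, correctly cleared
    sensitivity = tp / len(aphasic_scores)
    specificity = tn / len(control_scores)
    accuracy = (tp + tn) / (len(aphasic_scores) + len(control_scores))
    return sensitivity, specificity, accuracy

# Hypothetical AQ scores: three mildly aphasic patients score above the
# cut-off, which is what drives sensitivity below 100%
aphasic = [52.1, 78.4, 91.0, 94.5, 95.2, 60.3, 88.7, 96.0, 70.5, 85.9]
control = [97.2, 98.5, 94.1, 99.0, 96.4, 95.8, 98.1, 97.7, 94.9, 99.3]

sens, spec, acc = cutoff_stats(aphasic, control, 93.8)
print(sens, spec, acc)  # → 0.7 1.0 0.85
```

This also illustrates why Kim and Na (2004) report a cut-off per age/education stratum: moving the cut-off trades sensitivity against specificity, and the optimum (maximum AUC-based accuracy) differs across groups whose score distributions differ.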

Responsiveness

Shewan (1986) examined change in WAB-LQ scores over time in a sample of 50 adults with aphasia secondary to acute stroke, who received treatment (n=42) or no treatment (n=8). Participants were assessed at baseline (2-4 weeks post-onset of aphasia), 3 months, and at least 6 months post-baseline. There was a significant main effect for time (F=43.33, df=2,96, p<0.0001), and post-hoc analysis by Newman-Keuls shortest significant range test revealed significant differences in the mean scores for the three tests (p<0.01). These significant increases in WAB-LQ scores with recovery support its validity as a measure of severity.

Aftonomos, Steele, Appelbaum and Harris (2001) used the WAB to measure change in communication at the impairment level following a community-based Language Care Center (LCC) Treatment Program with a sample of 50 patients with aphasia, 49 secondary to subacute or chronic stroke. Participants’ mean scores improved significantly from pre- to post-treatment on all WAB subtests, with absolute percentages ranging from 6.5% to 13% improvement (p values ranging from p<0.01 to p<0.0001). Participants were categorized according to pre-treatment WAB AQ scores: (a) < 25 (n=13); (b) 25-50 (n=16); (c) 50-75 (n=12); and (d) ≥ 75 (n=9). Mean AQ scores improved by +8.8, +11.2, +13.6 and +7.9 (respectively), whereby participants with scores in the two middle groups showed greater gains than those with scores at the extremes.

References

  • Aftonomos, L.B., Steele, R.D., Appelbaum, J.S., & Harris, V.M. (2001). Relationships between impairment-level assessments and functional-level assessments in aphasia: Findings from LCC treatment programmes. Aphasiology, 15(10/11), 951-964.
  • Bakheit, A.M.O., Carrington, S., Griffiths, S., & Searle, K. (2005). High scores on the Western Aphasia Battery correlate with good functional communication skills (as measured with the Communicative Effectiveness Index) in aphasic stroke patients. Disability & Rehabilitation, 27(6), 287-291.
  • Goldstein, G., Beers, S. R., & Hersen, M. (2004). Comprehensive Handbook of Psychological Assessment: Intellectual and Neuropsychological Assessment. New Jersey: John Wiley & Sons.
  • Goodglass, H. & Kaplan, E. (1972). The Assessment of Aphasia and Related Disorders. Philadelphia: Lea & Febiger.
  • Ivanova, M.V. & Hallowell, B. (2012). Validity of an eye-tracking method to index working memory in people with and without aphasia. Aphasiology, 26(3-4), 556-578.
  • Kertesz, A. & McCabe, P. (1977). Recovery patterns and prognosis in aphasia. Brain, 100, 1-18.
  • Kertesz, A., & Phipps, J. B. (1977). Numerical taxonomy of aphasia. Brain and Language, 4(1), 1-10.
  • Kertesz, A. (1982). Western Aphasia Battery Test Manual. New York: Grune and Stratton.
  • Kim, H., & Na, D.L. (2004). Normative data on the Korean version of the Western Aphasia Battery. Journal of Clinical and Experimental Neuropsychology, 26(8), 1011-1020.
  • Kostalova, M., Bartkova, E., Sajgalikova, K., Dolenska, A., Dusek, L., & Bednarik, J. (2008). A standardization study of the Czech version of the Mississippi Aphasia Screening Test (MASTcz) in stroke patients and control subjects. Brain Injury, 22(10), 793-801.
  • Laures-Gore, J.S., DuBay, M. F., Duff, M. C., & Buchanan, T. W. (2010). Identifying behavioral measures of stress in individuals with aphasia. Journal of Speech, Language, and Hearing Research, 53(5), 1394-1400.
  • Pak-Hin Kong, A. (2011). The main concept analysis in Cantonese aphasic oral discourse: External validation and monitoring chronic aphasia. Journal of Speech, Language, and Hearing Research, 54(1), 148-159.
  • Pedersen, P.M., Vinter, K., & Skyhoj Olsen, T. (2001). The Communicative Effectiveness Index: Psychometric properties of a Danish adaptation. Aphasiology, 15(8), 787-802.
  • Risser, A. H., & Spreen, O. (1985). Test review: The Western Aphasia Battery. Journal of Clinical and Experimental Neuropsychology, 7(4), 463-470.
  • Ross, K. B. & Wertz, R. T. (2003). Discriminative validity of selected measures for differentiating normal from aphasic performance. American Journal of Speech-Language Pathology, 12(3), 312-319.
  • Ross, K.B. & Wertz, R.T. (2004). Accuracy of formal tests for diagnosing mild aphasia: An application of evidence‐based medicine. Aphasiology, 18(4), 337-355.
  • Shewan, C.M. (1986). The language quotient (LQ): A new measure for the Western Aphasia Battery. Journal of Communication Disorders, 19(6), 427-439.
  • Shewan, C.M., & Kertesz, A. (1980). Reliability and validity characteristics of the Western Aphasia Battery (WAB). Journal of Speech and Hearing Disorders, 45(3), 308-324.
  • WAB Aphasia Test Construction Committee (1986). The Japanese version of the Western Aphasia Battery. Tokyo: Igaku-Shoin Ltd.
  • Yiu, E.M.L. (1992). Linguistic assessment of Chinese-speaking aphasics: Development of a Cantonese aphasia battery. Journal of Neurolinguistics, 7, 379-424.

See the measure

How to obtain the WAB

The WAB-R can be purchased online at Pearson Assessment & Information.
