Multiple Errands Test (MET)

Evidence Reviewed as of before: 08-05-2013

Author(s)*: Valérie Poulin, OT, PhD candidate; Annabel McDermott, OT

Editor(s): Nicol Korner-Bitensky, PhD OT

Expert Reviewer: Deirdre Dawson, PhD OT

Purpose

The Multiple Errands Test (MET) evaluates the effect of executive function deficits on everyday functioning through a number of real-world tasks (e.g. purchasing specific items, collecting and writing down specific information, arriving at a stated location). Tasks are performed in a hospital or community setting within the constraints of specified rules. The participant is observed performing the test and the number and type of errors (e.g. rule breaks, omissions) are recorded.

In-Depth Review

Purpose of the measure

The Multiple Errands Test was developed by Shallice and Burgess in 1991. The measure was intended to evaluate a patient’s ability to organize performance of a number of simple unstructured tasks while following several simple rules.

See Alternative Forms sections below for information regarding other versions.

Features of the measure

Items:

The original Multiple Errands Test (Shallice and Burgess, 1991) was comprised of 8 items: 6 simple tasks (e.g. buy a brown loaf of bread, buy a packet of throat pastilles), 1 task that is time-dependent, and 1 that comprises 4 subtasks (see Description of tasks, below). It should be noted that the MET was originally devised in an experimental context, rather than as a formal assessment.

Description of tasks:

The original Multiple Errands Test (Shallice and Burgess, 1991) was comprised of 8 written tasks to be completed in a pedestrian shopping precinct. Tasks and rules are written on a card provided to the participant before arriving at the shopping precinct. Of the 8 tasks, 6 are simple (e.g. buy a brown loaf of bread, buy a packet of throat pastilles), the 7th requires the participant to be at a particular place 15 minutes after starting the test, and the 8th is more demanding as it comprises 4 sets of information that the participant must obtain and write on a postcard:

the name of the shop most likely to have the most expensive item;
the price of a pound of tomatoes;
the name of the coldest place in Britain yesterday; and
the rate of the exchange of the French franc yesterday.

The card also includes instructions and rules, which are repeated to the participant on arrival at the shopping precinct:

“You are to spend as little money as possible (within reason) and take as little time as possible (without rushing excessively). No shop should be entered other than to buy something. Please tell one or other of us when you leave a shop what you have bought. You are not to use anything not bought on the street (other than a watch) to assist you. You may do the tasks in any order.“

Scoring:

The participant is observed performing the test and errors are recorded according to the following categorizations:

Inefficiencies: where a more effective strategy could have been applied
Rule breaks: where a specific rule (either social or explicitly mentioned in the task) is broken
Interpretation failure: where requirements of a particular task are misunderstood
Task failure: where a task is either not carried out or not completed satisfactorily.

Time taken to complete the assessment is recorded and the total number of errors is calculated.

Alternative versions of the Multiple Errands Test

Different versions of the MET were developed for use in specific hospitals (MET – Hospital Version and Baycrest MET), a small shopping plaza (MET – Simplified Version), and a virtual reality environment (Virtual MET). For each of these versions, 12 tasks must be performed (e.g. purchasing specific items and collecting specific information) while following several rules.

MET – Hospital Version (MET-HV – Knight, Alderman & Burgess, 2002)

The MET-HV was developed for use with a wider range of participants than the original version by adopting more concrete rules and simpler tasks. Clients are provided with an instruction sheet that explicitly directs them to record designated information. Clients must achieve four sets of simple tasks, with a total of 12 separate subtasks:

The client must complete six specific errands (purchase 3 items, use the internal phone, collect an envelope from reception, and send a letter to an external address).
The client must obtain and write down four items of designated information (e.g. the opening time of a shop on Saturday).
The client must meet the assessor outside the hospital reception 20 minutes after the test had begun and state the time.
The client must inform the assessor when he/she finishes the test.

The MET-HV uses 9 rules in order to reduce ambiguity and simplify task demands (Knight et al., 2002). Errors are categorized according to the same definitions as the original MET. The test is preceded by (a) an efficiency question rated using an end-point weighted 10-point Likert scaleLikert scaling is one type of response to items in a questionnaire or tool. For example, Likert scaling would have you rate an item such as "I am satisfied with the care I received" on a scale using a 1-to-5 response scale where:
• 1 = strongly disagree
• 2 = disagree
• 3 = undecided
• 4 = agree
• 5 = strongly agree
You will find various options and scaling methods for the number of response choices (1-to-7, 1-to-9, 0-to-4). Odd-numbered scales usually have a middle value that is labelled Neutral or Undecided. Some tools used forced-choice Likert scaling with an even number of responses and no middle neutral or undecided choice. (“How efficient would you say you were with tasks like shopping, finding out information, and meeting people on time?“); and (b) a familiarity question rated using a 4-point scale (“How well would you say you know the hospital grounds?“). On completion the client answers a question rated using a 10-point scale (“How well do you think you did with the task?“).

MET – Simplified Version (MET-SV – Alderman, Burgess, Knight & Henman, 2003)

The MET-SV includes four sets of simple tasks analogous to those in the original MET, however the MET-SV incorporates 3 main modifications to the original version:

More concrete rules to enhance task clarity and reduce likelihood of interpretation failures;
Simplification of task demands; and
Space provided on the instruction sheet for the participant to record the information they were required to collect.

The MET-SV has 9 rules that are more explicit than the original MET and are clearly presented on the instruction sheet.

Baycrest MET (BMET – Dawson, Anderson, Burgess, Cooper, Krpan & Stuss, 2009)

The BMET was developed with an identical structure to the MET-HV, except that some items, information and a meeting place are specific to the testing environment (Baycrest Center, Toronto). The BMET comprises 12 items and 8 rules. The test manual provides explicit instructions including collecting test materials, language to be used in describing the test, and a pretest section to ensure participants understand the tasks. Scoring was standardized to allow for increased usability. The score sheet allows identification of specific task errors or omissions, other inefficiencies, rule breaks and strategy use (please contact the authors for further details regarding the manual: ddawson@research.baycrest.org).

Virtual MET (VMET – Rand, Rukan, Weiss & Katz, 2009)

The VMET was developed within the Virtual Mall, a functional video-capture virtual shopping environment that consists of a large supermarket with 9 aisles. The system includes a single camera that films the user and displays his/her image within the virtual environment. The VMET is a complex shopping task that includes the same number of tasks (items to be bought and information to be obtained) as the MET-HV. However, the client is required to check the contents of the shopping cart at a particular time instead of meeting the tester at a certain time. Virtual reality enables the assessor to objectively measure the client’s behaviour in a safe, controlled and ecologically valid environment. It enables repeated learning trials and adaptability of the environment and task according to the client’s needs.

What to consider before beginning:

The MET is performed in a real-world shopping area that allows for minor unpredicted events to occur.

Time:

The BMET takes approximately 60 minutes to administer (Dawson et al., 2009).

Training requirements:

It is advised that the assessor reads the test manual and becomes familiar with the procedures for test administration and scoring.

Equipment:

Access to a shopping precinct or virtual shopping environment
Pen and paper
Instruction sheet (according to version being used)

Client suitability

Can be used with:

The MET has been tested on populations with acquired brain injury including strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain..

Should not be used with:

The MET cannot be administered to patients who are confined to bed.
Participants require sufficient language skills.
Some tasks may need to be adapted depending on the rehabilitation setting.

In what languages is the measure available?

The MET was developed in English.

Summary

What does the tool measure?	The effect of executive function deficits on everyday functioning.
What types of clients can the tool be used for?	The Multiple Errands Test can be used with, but is not limited to, clients with strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain..
Is this a screeningTesting for disease in people without symptoms. or assessment tool?	Assessment
Time to administer	Baycrest MET: approximately 60 minutes (Dawson et al., 2009).
Versions	Multiple Errands Test (MET) (Shallice and Burgess, 1991) MET – Simplified Version (MET-SV) (Alderman et al., 2003) MET – Hospital Version (MET-HV) (Knight, Alderman & Burgess, 2002) Virtual MET (Rand, Rukan, Weiss & Katz, 2009) Baycrest MET (Dawson et al., 2009) Modified version of the MET-SV and MET-HV (including 3 alternate versions) (Novakovic-Agopian et al., 2011, 2012)
Other Languages	N/A
Measurement Properties
ReliabilityReliability can be defined in a variety of ways. It is generally understood to be the extent to which a measure is stable or consistent and produces similar results when administered repeatedly. A more technical definition of reliability is that it is the proportion of "true" variation in scores derived from a particular measure. The total variation in any given score may be thought of as consisting of true variation (the variation of interest) and error variation (which includes random error as well as systematic error). True variation is that variation which actually reflects differences in the construct under study, e.g., the actual severity of neurological impairment. Random error refers to "noise" in the scores due to chance factors, e.g., a loud noise distracts a patient thus affecting his performance, which, in turn, affects the score. Systematic error refers to bias that influences scores in a specific direction in a fairly consistent way, e.g., one neurologist in a group tends to rate all patients as being more disabled than do other neurologists in the group. There are many variations on the measurement of reliability including alternate-forms, internal consistency , inter-rater agreement , intra-rater agreement , and test-retest .	Internal consistencyA method of measuring reliability . Internal consistency reflects the extent to which items of a test measure various aspects of the same characteristic and nothing else. Internal consistency coefficients can take on values from 0 to 1. Higher values represent higher levels of internal consistency.: One study reported adequate internal consistencyA method of measuring reliability . Internal consistency reflects the extent to which items of a test measure various aspects of the same characteristic and nothing else. Internal consistency coefficients can take on values from 0 to 1. Higher values represent higher levels of internal consistency. of the MET-HV in a sample of patients with chronic acquired brain injury including strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain.. Test-retest: No studies have reported on the test-retest reliabilityA way of estimating the reliability of a scale in which individuals are administered the same scale on two different occasions and then the two scores are assessed for consistency. This method of evaluating reliability is appropriate only if the phenomenon that the scale measures is known to be stable over the interval between assessments. If the phenomenon being measured fluctuates substantially over time, then the test-retest paradigm may significantly underestimate reliability. In using test-retest reliability, the investigator needs to take into account the possibility of practice effects, which can artificially inflate the estimate of reliability (National Multiple Sclerosis Society). of the MET with a population of patients with strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain.. Intra-rater: No studies have reported on the intra-rater reliabilityThis is a type of reliability assessment in which the same assessment is completed by the same rater on two or more occasions. These different ratings are then compared, generally by means of correlation. Since the same individual is completing both assessments, the rater's subsequent ratings are contaminated by knowledge of earlier ratings. of the MET with a population of patients with strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain.. Inter-rater: – One study reported excellent inter-rater reliabilityA method of measuring reliability . Inter-rater reliability determines the extent to which two or more raters obtain the same result when using the same instrument to measure a concept. of the MET-HV in a sample of patients with chronic acquired brain injury including strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain.. – One study reported adequate to excellent inter-rater reliabilityA method of measuring reliability . Inter-rater reliability determines the extent to which two or more raters obtain the same result when using the same instrument to measure a concept. of the BMET in a sample of patients with acquired brain injury including strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain..
ValidityThe degree to which an assessment measures what it is supposed to measure.	Criterion: Concurrent: No studies have reported on the concurrent validityTo validate a new measure, the results of the measure are compared to the results of the gold standard obtained at approximately the same point in time (concurrently), so they both reflect the same construct. This approach is useful in situations when a new or untested tool is potentially more efficient, easier to administer, more practical, or safer than another more established method and is being proposed as an alternative instrument. See also "gold standard." of the MET in a strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. population. Predictive: One study examined predictive validityA form of criterion validity that examines a measure's ability to predict some subsequent event. Example: can the Berg Balance Scale predict falls over the following 6 weeks? The criterion standard in this example would be whether the patient fell over the next 6 weeks. of the MET-HV with a sample of patients with acquired brain injury including strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. and reported poor to adequate correlations between discharge MET-HV performance and community participationAs defined by the International Classification of Functioning, Disability and Health, participation is an individual's involvement in life situations in relation to health conditions, body functions or structures, activities, and contextual factors. Participation restrictions are problems an individual may have in the manner or extent of involvement in life situations. measured by the Mayo-Portland Adaptability Inventory (MPAI-4). Construct: Convergent/Discriminant: – Three studies* examined convergent validityA type of validity that is determined by hypothesizing and examining the overlap between two or more tests that presumably measure the same construct. In other words, convergent validity is used to evaluate the degree to which two or more measures that theoretically should be related to each other are, in fact, observed to be related to each other. of the MET-HV and reported excellent correlations with the Modified Wisconsin Card Sorting Test (MWCST), Behavioural Assessment of Dysexecutive Syndrome battery (BADS), Dysexecutive questionnaire (DEX), IADL questionnaire and FIM Cognitive score; and an adequate correlationThe extent to which two or more variables are associated with one another. A correlation can be positive (as one variable increases, the other also increases - for example height and weight typically represent a positive correlation) or negative (as one variable increases, the other decreases - for example as the cost of gasoline goes higher, the number of miles driven decreases. There are a wide variety of methods for measuring correlation including: intraclass correlation coefficients (ICC), the Pearson product-moment correlation coefficient, and the Spearman rank-order correlation. with the Rivermead Behavioural Memory Test (RBMT). – One study* examined convergent validityA type of validity that is determined by hypothesizing and examining the overlap between two or more tests that presumably measure the same construct. In other words, convergent validity is used to evaluate the degree to which two or more measures that theoretically should be related to each other are, in fact, observed to be related to each other. of the MET-SV and reported adequate correlations with the Weschler Adult Intelligence Scale – Revised Full Scale IQ (WAIS-R FSIQ), MWCST, BADS and Cognitive Estimates test; and poor to adequate correlations with the DEX. – One study* examined convergent validityA type of validity that is determined by hypothesizing and examining the overlap between two or more tests that presumably measure the same construct. In other words, convergent validity is used to evaluate the degree to which two or more measures that theoretically should be related to each other are, in fact, observed to be related to each other. of the BMET and reported adequate to excellent correlations with the Sickness Impact Profile and Assessment of Motor and Process Skills. – Three studies* examined convergent validityA type of validity that is determined by hypothesizing and examining the overlap between two or more tests that presumably measure the same construct. In other words, convergent validity is used to evaluate the degree to which two or more measures that theoretically should be related to each other are, in fact, observed to be related to each other. of the VMET and reported excellent correlations with the MET-HV, BADS, IADL questionnaire, Semantic Fluencies test, Tower of London test, Trail Making Test, Corsi’s supra-span test, Street’s Completion Test and the Test of Attentional Performance. Note: Correlations between the MET and other measures of everyday executive functioning and IADLs used in these studies also provide support for the ecological validityRefers to the extent to which a measure captures behaviours that are reflective of those that would occur in a natural setting of the MET. Known Groups:* – Two studies reported that the MET-HV is able to differentiate between individuals with acquired brain injury (including strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain.) vs. healthy adults, and between healthy older adults vs. healthy younger adults. – One study reported that the MET-SV is able to differentiate between clients with brain injury including strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. vs. healthy adults. – One study reported that the BMET is able to differentiate between clients with strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. vs. healthy adults. – Three studies reported that the VMET is able to differentiate between clients with strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. vs. healthy adults, and between healthy older adults vs. healthy younger adults. SensitivitySensitivity refers to the probability that a diagnostic technique will detect a particular disease or condition when it does indeed exist in a patient (National Multiple Sclerosis Society). See also "Specificity." /Specificity: – One study reported 85% sensitivitySensitivity refers to the probability that a diagnostic technique will detect a particular disease or condition when it does indeed exist in a patient (National Multiple Sclerosis Society). See also "Specificity." and 95% specificitySpecificity refers to the probability that a diagnostic technique will indicate a negative test result when the condition is absent (true negative). when using a cut-off score ≥ 7 errors on the MET-HV with clients with chronic acquired brain injury including strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain.. – One study reported 82% sensitivitySensitivity refers to the probability that a diagnostic technique will detect a particular disease or condition when it does indeed exist in a patient (National Multiple Sclerosis Society). See also "Specificity." and 95.3% specificitySpecificity refers to the probability that a diagnostic technique will indicate a negative test result when the condition is absent (true negative). when using a cut-off score ≥ 12 errors on the MET-SV with clients with brain injury including strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain..
Floor/Ceiling Effects	No studies have reported on the floor/ceiling effects of the MET.
Does the tool detect change in patients?	ResponsivenessThe ability of an instrument to detect clinically important change over time. of the MET has not been formally evaluated, however: – One study used a modified version of the MET-HV and MET-SV to measure change following intervention; – One study used the MET-HV and the VMET to detect change in multi-tasking skills of clients with strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. following intervention.
Acceptability	The MET provides functional assessment of executive function as it enables clients to participate in real-world activitiesAs defined by the International Classification of Functioning, Disability and Health, activity is the performance of a task or action by an individual. Activity limitations are difficulties in performance of activities. These are also referred to as function. .
Feasibility	Administration of the MET requires access to a shopping area and so is not always feasible in a typical clinical setting. Some tasks may need to be adapted depending on the rehabilitation setting. Administration time can be lengthy. Ecological validityRefers to the extent to which a measure captures behaviours that are reflective of those that would occur in a natural setting is supported.
How to obtain the tool?	The Baycrest MET can be obtained at https://cognitionandeverydaylifelabs.com/multiple-errands-test/

Psychometric Properties

Overview

A literature search was conducted to identify publications on the psychometric properties of the Multiple Errands Test (MET) relevant to a population of patients with strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain.. Of the 10 studies reviewed, 8 included a mixed population of patients with acquired brain injury including strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain.. Studies have reviewed psychometric properties of the original MET, Hospital Version (MET-HV), Simplified Version (MET-SV), Baycrest MET (BMET) and Virtual MET (VMET), as indicated in the summaries below. While research indicates that the MET demonstrates adequate validityThe degree to which an assessment measures what it is supposed to measure.
and reliabilityReliability can be defined in a variety of ways. It is generally understood to be the extent to which a measure is stable or consistent and produces similar results when administered repeatedly. A more technical definition of reliability is that it is the proportion of "true" variation in scores derived from a particular measure. The total variation in any given score may be thought of as consisting of true variation (the variation of interest) and error variation (which includes random error as well as systematic error). True variation is that variation which actually reflects differences in the construct under study, e.g., the actual severity of neurological impairment. Random error refers to "noise" in the scores due to chance factors, e.g., a loud noise distracts a patient thus affecting his performance, which, in turn, affects the score. Systematic error refers to bias that influences scores in a specific direction in a fairly consistent way, e.g., one neurologist in a group tends to rate all patients as being more disabled than do other neurologists in the group. There are many variations on the measurement of reliability including alternate-forms, internal consistency , inter-rater agreement , intra-rater agreement , and test-retest .
in populations with acquired brain injury including strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain., further research regarding responsivenessThe ability of an instrument to detect clinically important change over time.
of the measure is warranted.

Floor/Ceiling Effects

No studies have reported on floor/ceiling effects of the MET with a strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. population.

Reliability

Internal consistencyA method of measuring reliability . Internal consistency reflects the extent to which items of a test measure various aspects of the same characteristic and nothing else. Internal consistency coefficients can take on values from 0 to 1. Higher values represent higher levels of internal consistency.:
Knight, Alderman & Burgess (2002) calculated internal consistencyA method of measuring reliability . Internal consistency reflects the extent to which items of a test measure various aspects of the same characteristic and nothing else. Internal consistency coefficients can take on values from 0 to 1. Higher values represent higher levels of internal consistency. of the MET-HV in a sample of 20 patients with chronic acquired brain injury (traumatic brain injury, n=12; strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain., n=5, both TBI and strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain., n=3) and 20 healthy control subjects matched for gender, age and IQ, using Cronbach’s alpha. Internal consistencyA method of measuring reliability . Internal consistency reflects the extent to which items of a test measure various aspects of the same characteristic and nothing else. Internal consistency coefficients can take on values from 0 to 1. Higher values represent higher levels of internal consistency. was adequate (α=0.77).

Test-retest:
No studies have reported on the test-retest reliabilityA way of estimating the reliability of a scale in which individuals are administered the same scale on two different occasions and then the two scores are assessed for consistency. This method of evaluating reliability is appropriate only if the phenomenon that the scale measures is known to be stable over the interval between assessments. If the phenomenon being measured fluctuates substantially over time, then the test-retest paradigm may significantly underestimate reliability. In using test-retest reliability, the investigator needs to take into account the possibility of practice effects, which can artificially inflate the estimate of reliability (National Multiple Sclerosis Society).
of the MET.

Intra-rater:
No studies have reported on the intra-rater reliabilityThis is a type of reliability assessment in which the same assessment is completed by the same rater on two or more occasions. These different ratings are then compared, generally by means of correlation. Since the same individual is completing both assessments, the rater's subsequent ratings are contaminated by knowledge of earlier ratings.
of the MET.

Inter-rater:
Knight, Alderman & Burgess (2002) calculated inter-rater reliabilityA method of measuring reliability . Inter-rater reliability determines the extent to which two or more raters obtain the same result when using the same instrument to measure a concept.
of the MET-HV error categories in a sample of 20 patients with chronic acquired brain injury (traumatic brain injury, n=12; strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain., n=5, both TBI and stroke, n=3) and 20 healthy control subjects matched for gender, age and IQ, using intraclass correlationThe extent to which two or more variables are associated with one another. A correlation can be positive (as one variable increases, the other also increases - for example height and weight typically represent a positive correlation) or negative (as one variable increases, the other decreases - for example as the cost of gasoline goes higher, the number of miles driven decreases. There are a wide variety of methods for measuring correlation including: intraclass correlation coefficients (ICC), the Pearson product-moment correlation coefficient, and the Spearman rank-order correlation.
coefficients. Participants were scored by 2 assessors. Inter-rater reliabilityA method of measuring reliability . Inter-rater reliability determines the extent to which two or more raters obtain the same result when using the same instrument to measure a concept.
was excellent (ICC ranging from 0.81-1.00). The ‘rule breaks’ error category demonstrated the strongest inter-rater reliabilityA method of measuring reliability . Inter-rater reliability determines the extent to which two or more raters obtain the same result when using the same instrument to measure a concept.
(ICC=1.00).

Dawson, Anderson, Burgess, Cooper, Krpan and Stuss (2009) examined inter-rater reliabilityA method of measuring reliability . Inter-rater reliability determines the extent to which two or more raters obtain the same result when using the same instrument to measure a concept.
of the BMET with clients with strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. (n=14) or traumatic brain injury (n=13) and healthy matched controls (n=25), using Intraclass CorrelationThe extent to which two or more variables are associated with one another. A correlation can be positive (as one variable increases, the other also increases - for example height and weight typically represent a positive correlation) or negative (as one variable increases, the other decreases - for example as the cost of gasoline goes higher, the number of miles driven decreases. There are a wide variety of methods for measuring correlation including: intraclass correlation coefficients (ICC), the Pearson product-moment correlation coefficient, and the Spearman rank-order correlation.
Coefficients and 2-way random effects models. Participants were scored by two assessors. Inter-rater reliabilityA method of measuring reliability . Inter-rater reliability determines the extent to which two or more raters obtain the same result when using the same instrument to measure a concept.
was adequate to excellent for the five summary measures used: mean number of tasks completed accurately (ICC = 0.80), mean number of rules adhered to (ICC = 0.71), mean number of total errors (ICC = 0.82), mean number of total rules broken (ICC = 0.88) and mean number of requests for help (ICC = 0.71).

Validity

Content:

Shallice & Burgess (1991) evaluated the MET in a sample of 3 patients with traumatic brain injury (TBI) who demonstrated above-average performance on measures of general ability and normal or near-normal performance on frontal lobe tests, and 9 age- and IQ-matched controls. Participants were monitored by two observers and were scored according to number of errors (inefficiencies, rule breaks, interpretation failures, task failures and total score) and qualitative observation. The patients demonstrated qualitatively and quantitatively impaired performance, particularly relating to rule breaks and inefficiencies. The most difficult subtest was the least sensitive part of the procedure and presented difficulties for both patients and control subjects.

Criterion:

Concurrent:
No studies have reported on the concurrent validityTo validate a new measure, the results of the measure are compared to the results of the gold standard obtained at approximately the same point in time (concurrently), so they both reflect the same construct. This approach is useful in situations when a new or untested tool is potentially more efficient, easier to administer, more practical, or safer than another more established method and is being proposed as an alternative instrument. See also "gold standard."
of the MET in a strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. population.

Predictive:
Maier, Krauss & Katz (2011) examined predictive validityA form of criterion validity that examines a measure's ability to predict some subsequent event. Example: can the Berg Balance Scale predict falls over the following 6 weeks? The criterion standard in this example would be whether the patient fell over the next 6 weeks.
of the MET-HV in relation to community participationAs defined by the International Classification of Functioning, Disability and Health, participation is an individual's involvement in life situations in relation to health conditions, body functions or structures, activities, and contextual factors. Participation restrictions are problems an individual may have in the manner or extent of involvement in life situations. with a sample of 30 patients with acquired brain injury including strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. (n=19). Community participationAs defined by the International Classification of Functioning, Disability and Health, participation is an individual's involvement in life situations in relation to health conditions, body functions or structures, activities, and contextual factors. Participation restrictions are problems an individual may have in the manner or extent of involvement in life situations. was measured using the Mayo-Portland Adaptability Inventory (MPAI-4) ParticipationAs defined by the International Classification of Functioning, Disability and Health, participation is an individual's involvement in life situations in relation to health conditions, body functions or structures, activities, and contextual factors. Participation restrictions are problems an individual may have in the manner or extent of involvement in life situations. Index (M2PI), completed by the participant and a significant other. The MET-HV was administered 1 week prior to discharge from rehabilitation and the M2PI was administered at 3 months post-discharge. Analyses were performed using Pearson correlationThe extent to which two or more variables are associated with one another. A correlation can be positive (as one variable increases, the other also increases - for example height and weight typically represent a positive correlation) or negative (as one variable increases, the other decreases - for example as the cost of gasoline goes higher, the number of miles driven decreases. There are a wide variety of methods for measuring correlation including: intraclass correlation coefficients (ICC), the Pearson product-moment correlation coefficient, and the Spearman rank-order correlation.
analysis and partial correlationThe extent to which two or more variables are associated with one another. A correlation can be positive (as one variable increases, the other also increases - for example height and weight typically represent a positive correlation) or negative (as one variable increases, the other decreases - for example as the cost of gasoline goes higher, the number of miles driven decreases. There are a wide variety of methods for measuring correlation including: intraclass correlation coefficients (ICC), the Pearson product-moment correlation coefficient, and the Spearman rank-order correlation.
controlling for cognitive status using FIM Cognitive scores. Predictably, higher MET-HV error scores correlated with more restrictions in community participationAs defined by the International Classification of Functioning, Disability and Health, participation is an individual's involvement in life situations in relation to health conditions, body functions or structures, activities, and contextual factors. Participation restrictions are problems an individual may have in the manner or extent of involvement in life situations.. There were adequate correlations between participants’ and significant others’ M2PI total score and MET-HV total error score (r = 0.403, 0.510 respectively), inefficiencies (r = 0.353, 0.524 respectively) and rule breaks (r = 0.361, 0.449 respectively). The ability for the MET total error score to predict the M2PI significant other score remained significant but poor following partial correction controlling for cognitive status using FIM Cognitive scores (r = 0.212).

Construct:

Convergent/Discriminant:
Knight, Alderman & Burgess (2002)* examined convergent validityA type of validity that is determined by hypothesizing and examining the overlap between two or more tests that presumably measure the same construct. In other words, convergent validity is used to evaluate the degree to which two or more measures that theoretically should be related to each other are, in fact, observed to be related to each other.
of the MET-HV by comparison with tests of IQ and cognitive functioning, traditional frontal lobe tests and ecologically sensitive executive function tests, in a sample of 20 patients with chronic acquired brain injury (traumatic brain injury, n=12; strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain., n=5, both TBI and strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain., n=3). Tests of IQ and cognitive functioning included the National Adult Reading Test – Revised Full Scale Intelligence Quotient (NART-R FSIQ), Weschler Adult Intelligence Scale – Revised Full Scale Intelligence Quotient (WAIS-R FSIQ), Adult Memory and Information Processing Battery (AMIPB), Rivermead Behavioural Memory Test (RBMT) and Visual Objects and Space Perception battery (VOSP). Frontal lobe tests included verbal fluency, the Cognitive Estimates Test (CET), Modified Card Sorting Test (MCST), Tower of London Test (TOLT) and versions of the hand manipulation and hand alternation tests. Ecologically sensitive executive function tests included the Behavioural Assessment of the Dysexecutive Syndrome battery (BADS) and the Test of Everyday Attention (TEA) Map Search and Visual Elevator tasks. The Dysexecutive (DEX) questionnaire was also used, although proxy reports were used rather than self-reports due to identified lack of insight of individuals with brain injury. There were excellent correlations between the MCST percentage perseverative errors with MET-HV rule breaks (r=0.66) and MET-HV total errors (r=0.67) following Bonferroni adjustment. There were excellent correlations between the BADS Profile score and the MET-HV task failures (r = -0.58), interpretation failures (r = 0.64) and total errors (r = -0.57). There was an excellent correlationThe extent to which two or more variables are associated with one another. A correlation can be positive (as one variable increases, the other also increases - for example height and weight typically represent a positive correlation) or negative (as one variable increases, the other decreases - for example as the cost of gasoline goes higher, the number of miles driven decreases. There are a wide variety of methods for measuring correlation including: intraclass correlation coefficients (ICC), the Pearson product-moment correlation coefficient, and the Spearman rank-order correlation.
between the DEX intentionality factor and MET-HV task failures (r = 0.70). In addition, the relationship between the MET-HV and DEX was re-evaluated to control for possible confounding effects; controlling variables age, familiarity and memory function with respect to MET-HV task failures resulted in excellent correlations with the DEX total score (r = 0.79) and DEX inhibitionThe ability to suppress automatic actions that are inappropriate in a given context that interfere with a certain behavior (Grieve & Gnanasekaran, 2008)
(r = 0.69), intentionality (r = 0.76) and executive memory (r = 0.67) factors. There was an adequate correlationThe extent to which two or more variables are associated with one another. A correlation can be positive (as one variable increases, the other also increases - for example height and weight typically represent a positive correlation) or negative (as one variable increases, the other decreases - for example as the cost of gasoline goes higher, the number of miles driven decreases. There are a wide variety of methods for measuring correlation including: intraclass correlation coefficients (ICC), the Pearson product-moment correlation coefficient, and the Spearman rank-order correlation.
between the RBMT Profile Score and the MET-HV number of task failures (r=-0.57). There were no significant correlations between the MET and other tests of IQ and cognitive functioning (MET-HV, NART-R FSIQ, WAIS-R FSIQ, AMIPB, VOSP), and other frontal lobe tests (verbal fluency, CET, TOLT, hand manipulation and hand alternation tests), other ecologically sensitive executive function tests (TEA Map Search and Visual Elevator tasks) or other DEX factors (positive affect, negative affect).
Note: Initial correlations were measured using Pearson correlationThe extent to which two or more variables are associated with one another. A correlation can be positive (as one variable increases, the other also increases - for example height and weight typically represent a positive correlation) or negative (as one variable increases, the other decreases - for example as the cost of gasoline goes higher, the number of miles driven decreases. There are a wide variety of methods for measuring correlation including: intraclass correlation coefficients (ICC), the Pearson product-moment correlation coefficient, and the Spearman rank-order correlation.
coefficient and significance levels were subsequently adjusted by Bonferroni adjustment to account for multiple comparisons; results reported indicate significant correlations following Bonferroni adjustment.

Rand, Rukan, Weiss & Katz (2009a)* examined convergent validityA type of validity that is determined by hypothesizing and examining the overlap between two or more tests that presumably measure the same construct. In other words, convergent validity is used to evaluate the degree to which two or more measures that theoretically should be related to each other are, in fact, observed to be related to each other.
of the MET-HV by comparison with measures of executive function and IADLs with a sample of 9 patients with subacute or chronic strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain., using Spearman correlationThe extent to which two or more variables are associated with one another. A correlation can be positive (as one variable increases, the other also increases - for example height and weight typically represent a positive correlation) or negative (as one variable increases, the other decreases - for example as the cost of gasoline goes higher, the number of miles driven decreases. There are a wide variety of methods for measuring correlation including: intraclass correlation coefficients (ICC), the Pearson product-moment correlation coefficient, and the Spearman rank-order correlation.
coefficients. Executive function was measured using the BADS Zoo Map test and IADLs were measured using the IADL questionnaire. There were excellent negative correlations between the BADS Zoo Map and MET-HV outcome measures of total number of mistakes (r = -0.93), partial mistakes in completing tasks (r = -0.80), non-efficiency mistakes (r = -0.86) and time to complete the MET (r = -0.79). There were excellent correlations between the IADL questionnaire and the MET-HV number of mistakes of rule breaks (r = 0.80) and total number of mistakes (r = -0.76).

Maier, Krauss & Katz (2011)* examined convergent validityA type of validity that is determined by hypothesizing and examining the overlap between two or more tests that presumably measure the same construct. In other words, convergent validity is used to evaluate the degree to which two or more measures that theoretically should be related to each other are, in fact, observed to be related to each other.
of the MET-HV by comparison with the FIM Cognitive score with a sample of 30 patients with acquired brain injury including strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. (n=19), using Pearson correlationThe extent to which two or more variables are associated with one another. A correlation can be positive (as one variable increases, the other also increases - for example height and weight typically represent a positive correlation) or negative (as one variable increases, the other decreases - for example as the cost of gasoline goes higher, the number of miles driven decreases. There are a wide variety of methods for measuring correlation including: intraclass correlation coefficients (ICC), the Pearson product-moment correlation coefficient, and the Spearman rank-order correlation.
analysis. There was an excellent negative correlationThe extent to which two or more variables are associated with one another. A correlation can be positive (as one variable increases, the other also increases - for example height and weight typically represent a positive correlation) or negative (as one variable increases, the other decreases - for example as the cost of gasoline goes higher, the number of miles driven decreases. There are a wide variety of methods for measuring correlation including: intraclass correlation coefficients (ICC), the Pearson product-moment correlation coefficient, and the Spearman rank-order correlation.
between MET-HV total errors score and FIM Cognitive score (r = -0.67).

Alderman, Burgess, Knight and Henman (2003)* examined convergent validityA type of validity that is determined by hypothesizing and examining the overlap between two or more tests that presumably measure the same construct. In other words, convergent validity is used to evaluate the degree to which two or more measures that theoretically should be related to each other are, in fact, observed to be related to each other.
of the MET-SV by comparison with tests of IQ, executive function and everyday executive abilities with 50 clients with brain injury including strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. (n=9). Neuropsychological tests included the WAIS-R FSIQ, BADS, Cognitive Estimates Test, FAS verbal fluency test, a modified version of the Wisconsin Card Sorting Test (MWCST) and the DEX. There were adequate correlations between MET-SV task failure errors and WAIS-R FSIQ (r = -0.32), MWCST perseverative errors (r = 0.39), BADS profile score (r = -0.46) and Zoo-Map (r = -0.46) and Six Element Test (r = -0.41) subtests. There were adequate negative correlations between MET-SV social rule breaks and the Cognitive Estimates (r = -0.33), and between MET-SV task rule breaks, social rule breaks and total rule breaks and the BADS Action Program subtest (r = -0.42, -0.40, -0.43 respectively). There were poor to adequate negative correlations between the DEX and MET-SV rule breaks (r = -0.30), task failures (r = -0.25) and total errors (r = -0.37).

In a subgroup analysis of individuals with brain injury who passed traditional executive function tests but failed the MET-SV (n=17), there were adequate to excellent correlations between MET-SV inefficiencies and DEX factors of intentionality and negative affect (r = 0.59, -0.76); MET-SV interpretation failures and DEX inhibitionThe ability to suppress automatic actions that are inappropriate in a given context that interfere with a certain behavior (Grieve & Gnanasekaran, 2008)
and total (r = -0.67, -0.57); MET-SV total and actual rule breaks and DEX inhibitionThe ability to suppress automatic actions that are inappropriate in a given context that interfere with a certain behavior (Grieve & Gnanasekaran, 2008)
(r = -0.70, 0.66), intentionality (r = 0.60, 0.64) and total (r = -0.57, 0.59); MET-SV social rule breaks and DEX positive and negative affect (r = 0.79, -0.59); MET-SV task failures and DEX inhibitionThe ability to suppress automatic actions that are inappropriate in a given context that interfere with a certain behavior (Grieve & Gnanasekaran, 2008)
and positive affect (r = -0.58, -0.52), and MET-SV total errors and DEX intentionality (r = 0.67).

Dawson et al. (2009)* examined convergent validityA type of validity that is determined by hypothesizing and examining the overlap between two or more tests that presumably measure the same construct. In other words, convergent validity is used to evaluate the degree to which two or more measures that theoretically should be related to each other are, in fact, observed to be related to each other.
of the BMET by comparison with other measures of IADL and everyday function with 14 clients with strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain., using Pearson correlationThe extent to which two or more variables are associated with one another. A correlation can be positive (as one variable increases, the other also increases - for example height and weight typically represent a positive correlation) or negative (as one variable increases, the other decreases - for example as the cost of gasoline goes higher, the number of miles driven decreases. There are a wide variety of methods for measuring correlation including: intraclass correlation coefficients (ICC), the Pearson product-moment correlation coefficient, and the Spearman rank-order correlation.
. Other measures included the DEX (significant other report), StrokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. Impact Profile (SIP), Assessment of Motor and Process Skills (AMPS) and Mayo Portland Adaptability Inventory (MPAI) (significant other report). There were excellent correlations between the BMET number of rules broken and the SIP – Physical (r = 0.78) and Affective behavior (r = 0.64) scores and the AMPS motor score (r = -0.75). There was an adequate correlationThe extent to which two or more variables are associated with one another. A correlation can be positive (as one variable increases, the other also increases - for example height and weight typically represent a positive correlation) or negative (as one variable increases, the other decreases - for example as the cost of gasoline goes higher, the number of miles driven decreases. There are a wide variety of methods for measuring correlation including: intraclass correlation coefficients (ICC), the Pearson product-moment correlation coefficient, and the Spearman rank-order correlation.
between the BMET time to completion and SIP physical score (r = 0.54).

Rand et al. (2009a)* examined convergent validityA type of validity that is determined by hypothesizing and examining the overlap between two or more tests that presumably measure the same construct. In other words, convergent validity is used to evaluate the degree to which two or more measures that theoretically should be related to each other are, in fact, observed to be related to each other.
of the VMET by comparison with the BADS Zoo Map test and IADL questionnaire with the same sample of 9 patients with subacute or chronic strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain., using Spearman correlationThe extent to which two or more variables are associated with one another. A correlation can be positive (as one variable increases, the other also increases - for example height and weight typically represent a positive correlation) or negative (as one variable increases, the other decreases - for example as the cost of gasoline goes higher, the number of miles driven decreases. There are a wide variety of methods for measuring correlation including: intraclass correlation coefficients (ICC), the Pearson product-moment correlation coefficient, and the Spearman rank-order correlation.
coefficients. There was an excellent negative correlationThe extent to which two or more variables are associated with one another. A correlation can be positive (as one variable increases, the other also increases - for example height and weight typically represent a positive correlation) or negative (as one variable increases, the other decreases - for example as the cost of gasoline goes higher, the number of miles driven decreases. There are a wide variety of methods for measuring correlation including: intraclass correlation coefficients (ICC), the Pearson product-moment correlation coefficient, and the Spearman rank-order correlation.
between the BADS Zoo Map and VMET outcome measure of non-efficiency mistakes (r = -0.87), and between the IADL and VMET total number of mistakes (r = -0.82).

Rand et al. (2009a) also examined the relationships between the scores of the VMET and those of the MET-HV using Spearman and Pearson correlationThe extent to which two or more variables are associated with one another. A correlation can be positive (as one variable increases, the other also increases - for example height and weight typically represent a positive correlation) or negative (as one variable increases, the other decreases - for example as the cost of gasoline goes higher, the number of miles driven decreases. There are a wide variety of methods for measuring correlation including: intraclass correlation coefficients (ICC), the Pearson product-moment correlation coefficient, and the Spearman rank-order correlation.
coefficients. Among patients with strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain., there were excellent correlations between MET-HV and VMET outcomes for the total number of mistakes (r = 0.70), partial mistakes in completing tasks (r = 0.88) and non-efficiency mistakes (r = 0.73). Analysis of the whole population indicated adequate to excellent correlations between MET-HV and VMET outcomes for the total number of mistakes (r = 0.77), complete mistakes of completing a task (r = 0.63), partial mistakes in completing tasks (r = 0.80), non-efficiency mistakes (r = 0.72) and use of strategies (r = 0.44), but not for rule break mistakes.

Raspelli et al. (2010) examined convergent validityA type of validity that is determined by hypothesizing and examining the overlap between two or more tests that presumably measure the same construct. In other words, convergent validity is used to evaluate the degree to which two or more measures that theoretically should be related to each other are, in fact, observed to be related to each other.
of the VMET by comparison with neuropsychological tests, with 6 clients with strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. and 14 healthy subjects. VMET outcome measures included time, searched item in the correct area, sustained attention, maintained sequence and no perseveration. Neuropsychological tests included the Trail Making Test, Corsi spatial memory supra-span test, Street’s Completion Test, Semantic Fluencies and Tower of London test. There were excellent correlations between the VMET variable ‘time’ and the Semantic Fluencies test (r = -0.87) and the Tower of London test (r = -0.82); between the VMET variable ‘searched item in the correct area’ and the Trail Making Test (r = 0.96); and between the VMET variables ‘sustained attention’, ‘maintained sequence’ and ‘no perseveration’ and Corsi’s supra-span test (r = 0.84) and Street’s Completion Test (r = -0.86).

Raspelli et al. (2012) examined convergent validityA type of validity that is determined by hypothesizing and examining the overlap between two or more tests that presumably measure the same construct. In other words, convergent validity is used to evaluate the degree to which two or more measures that theoretically should be related to each other are, in fact, observed to be related to each other.
of the VMET by comparison with the Test of Attentional Performance (TEA) with 9 clients with strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain.. VMET outcome measures included time, errors, inefficiencies, rule breaks, strategies, interpretation failures and partial-task failures. Authors reported excellent correlations between the VMET outcomes time, inefficiencies and total errors and TEA tests (range r = -0.67 to 0.81).
Note: Other neuropsychological tests were administered but correlations are not reported (Mini Mental Status Examination (MMSE), Beck DepressionIllness involving the body, mood, and thoughts, that affects the way a person eats and sleeps, the way one feels about oneself, and the way one thinks about things. A depressive disorder is not the same as a passing blue mood or a sign of personal weakness or a condition that can be wished away. People with a depressive disease cannot merely "pull themselves together" and get better. Without treatment, symptoms can last for weeks, months, or years. Appropriate treatment, however, can help most people with depression.
Inventory (BDI), State and Trait Anxiety Index (STAI), Behavioural Inattention Test (BIT) – Star Cancellation Test, Brief Neuropsychological Examination (ENB) – Token Test, Street’s Completion Test, Stroop Colour-Word Test, Iowa Gambling Task, DEX and ADL/IADL Tests).
*Note: The correlations between the MET and other measures of everyday executive functioning and IADLs also provide support for the ecological validityRefers to the extent to which a measure captures behaviours that are reflective of those that would occur in a natural setting
of the MET (as reported by the authors of these articles).

Known Group:
Knight, Alderman & Burgess (2002) examined known-group validityThe degree to which an assessment measures what it is supposed to measure.
of the MET-HV in a sample of 20 patients with chronic acquired brain injury (traumatic brain injury, n=12; strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain., n=5, both TBI and strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain., n=3) and 20 healthy control subjects (hospital staff members) matched for gender, age and IQ*. Clients with brain injury made significantly more rule breaks (p=0.002) and total errors (p<0.001), and achieved significantly fewer tasks (p<0.001) than control subjects. Clients with brain injury used significantly more strategies such as looking at a map (p=0.008), reading signs (p=0.006), although use of strategies had little effect on test performance. The test was able to discriminate between individuals with acquired brain injury and healthy controls.
*Note: IQ was measured using the National Adult Reading Test – Revised Full Scale Intelligence Quotient (NART-R FSIQ).

Rand et al. (2009a) examined known group validityThe degree to which an assessment measures what it is supposed to measure.
of the MET-HV with 9 patients with subacute or chronic strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain., 20 healthy young adults and 20 healthy older adults, using Kruskal Wallis H. Patients with strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. made more mistakes than older adults on VMET outcomes of total mistakes, mistakes in completing tasks, partial mistakes in completing tasks and non-efficiency mistakes, but not rule break mistakes or use of strategies mistakes. Older adults made more mistakes than younger adults on VMET outcomes of total mistakes, partial mistakes in completing tasks and non-efficiency mistakes, but not mistakes in completing tasks, rule break mistakes or use of strategies mistakes.

Alderman et al. (2003) examined known group validityThe degree to which an assessment measures what it is supposed to measure.
of the MET-SV with 46 individuals with no history of neurological disease (hospital staff members) and 50 clients with brain injury including strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. (n=9), using a series of t-tests. Clients with brain injury made significantly more rule breaks (t = 4.03), task failures (t = 10.10), total errors (t = 7.18), and social rule breaks (chi square 4.3) than individuals with no history of neurological disease. Results regarding errors were preserved when group comparisons were repeated using age, familiarity and cognitive ability (measured by the NART-R FSIQ) as covariates (F = 11.79, 40.82, 27.92 respectively). There was a significant difference in task failures between groups after covarying for age, IQ (measured by the WAIS-R FSIQ) and familiarity with the shopping centre (F = 11.57). Clients with brain injury made approximately three times more errors as healthy individuals. For both groups, rule breaks and task failures were the most common errors.

Dawson et al. (2009) examined known group validityThe degree to which an assessment measures what it is supposed to measure.
of the BMET with 14 clients with strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. and 13 healthy matched controls, using a series of t-tests. Clients with strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. performed significantly worse on number of tasks completed accurately (d = 0.84, p<0.05), rule breaks (d = 0.92, p<0.05) and total failures (d = 1.05, r<0.01). The proportion of group members who completed fewer than 40% (< 5) tasks satisfactorily was also significantly different between the two groups (28% of clients with strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. vs. 0% of healthy matched controls, p<0.05).
Note: d is the effect sizeEffect size (ES) is a name given to a family of indices that measure the magnitude of a treatment effect. Unlike significance tests, these indices are independent of sample size. The ES is generally measured in two ways: as the standardized difference between two means, or as the correlation between the independent variable classification and the individual scores on the dependent variable. This correlation is called the "effect size correlation".
; effect sizes ≥ 0.7 are considered large.

Rand et al. (2009a) examined known group validityThe degree to which an assessment measures what it is supposed to measure.
of the VMET with a sample of 9 patients with subacute or chronic strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain., 20 healthy young adults and 20 healthy older adults, using Kruskal Wallis H. Patients with strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. made more mistakes than older adults on all VMET outcomes except for rule break mistakes. Older adults made more mistakes than young adults on all VMET outcomes except for the use of strategies mistakes.

Raspelli et al. (2010) examined known group validityThe degree to which an assessment measures what it is supposed to measure.
of the VMET with 6 clients with strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. and 14 healthy subjects. There were significant differences between groups in time taken to execute the task (higher for healthy subjects) and in the partial error ‘Maintained task objective to completion’.

Raspelli et al. (2012) examined known group validityThe degree to which an assessment measures what it is supposed to measure.
of the VMET with 9 clients with strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain., 10 healthy young adults and 10 healthy older adults, using Kruskal-Wallis procedures. Results showed that clients with strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. scored lower in VMET time and errors than older adults, and that older adults scored lower in VMET time and errors than young adults.

SensitivitySensitivity refers to the probability that a diagnostic technique will detect a particular disease or condition when it does indeed exist in a patient (National Multiple Sclerosis Society). See also "Specificity."
/ SpecificitySpecificity refers to the probability that a diagnostic technique will indicate a negative test result when the condition is absent (true negative).
:
Knight, Alderman & Burgess (2002) investigated sensitivitySensitivity refers to the probability that a diagnostic technique will detect a particular disease or condition when it does indeed exist in a patient (National Multiple Sclerosis Society). See also "Specificity."
and specificitySpecificity refers to the probability that a diagnostic technique will indicate a negative test result when the condition is absent (true negative).
of the MET-HV in a sample of 20 patients with chronic acquired brain injury (traumatic brain injury, n=12; strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain., n=5, both TBI and strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain., n=3) and 20 healthy control subjects matched for gender, age and IQ*. A cut-off score ≥ 7 errors (i.e. 5th percentile of total errors of control subjects) resulted in correct identification of 85% of participants with acquired brain injury (85% sensitivitySensitivity refers to the probability that a diagnostic technique will detect a particular disease or condition when it does indeed exist in a patient (National Multiple Sclerosis Society). See also "Specificity."
, 95% specificitySpecificity refers to the probability that a diagnostic technique will indicate a negative test result when the condition is absent (true negative).
).
*Note: IQ was measured using the National Adult Reading Test – Revised Full Scale Intelligence Quotient (NART-R FSIQ).

Alderman et al. (2003) reported on sensitivitySensitivity refers to the probability that a diagnostic technique will detect a particular disease or condition when it does indeed exist in a patient (National Multiple Sclerosis Society). See also "Specificity."
and specificitySpecificity refers to the probability that a diagnostic technique will indicate a negative test result when the condition is absent (true negative).
of the MET-SV with 46 individuals with no history of neurological disease and 50 clients with brain injury including strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. (n=9). Using a cutoff score ≥ 12 errors (i.e. 5th percentile of controls) results in 44% sensitivitySensitivity refers to the probability that a diagnostic technique will detect a particular disease or condition when it does indeed exist in a patient (National Multiple Sclerosis Society). See also "Specificity."
(i.e. correct classification of clients with brain injury) and 95.3% specificitySpecificity refers to the probability that a diagnostic technique will indicate a negative test result when the condition is absent (true negative).
(i.e. correct classification of healthy individuals). The authors caution that deriving a single measure based only on number of errors fails to consider between-group qualitative differences in performance. Accordingly, error scores were recalculated to reflect “normality” of the error type, with weighting of errors according to prevalence in the healthy control group (acceptable errors seen in up to 95% of healthy controls = 1; errors demonstrated by ≥ 5% of healthy controls = 2; errors unique to the patient group = 3). Using a cutoff score ≥ 12 errors (5th percentile of controls) resulted in 82% sensitivitySensitivity refers to the probability that a diagnostic technique will detect a particular disease or condition when it does indeed exist in a patient (National Multiple Sclerosis Society). See also "Specificity."
and 95.3% specificitySpecificity refers to the probability that a diagnostic technique will indicate a negative test result when the condition is absent (true negative).
. The MET-SV was more sensitive than traditional tests of executive function (Cognitive Estimates, FAS Verbal Fluency, MWCST), and MET-SV error category scores were highly predictive of rating s of executive symptoms of patients who passed traditional executive function tests but failed the MET-SV shopping task.

Responsiveness

Two studies used the MET (MET-HV, VMET and modified version of the MET-HV & MET-SV) to measure change following intervention.

Novakovic-Agopian et al. (2011) developed a modified version of the MET-HV and MET-SV to be used in local hospital settings. They developed 3 alternate forms that were used in a pilot study examining the effect of goal-oriented attentional self-regulation training with a sample of 16 patients with chronic brain injury including strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. or cerebral hemorrhage (n=3). A pseudo-random crossover design was used. During the first 5 weeks, one group (Group A) completed goal-oriented attentional self-regulation training while the other group (Group B) only received a 2-hour educational instructional session. In the subsequent phase, conditions were switched such that participants in Group B received goals training for 5 weeks while those in Group A received educational instruction. At week 5 the group that received goal training first demonstrated a significant reduction in task failures (p<0.01), whereas the group that received the educational session demonstrated no significant improvement in MET scores. From week 5 to week 10 there were no significant changes in MET scores in either group.

Rand, Weiss and Katz (2009b) used the MET-HV and VMET to detect change in multi-tasking skills of 4 clients with subacute strokeAlso called a "brain attack" and happens when brain cells die because of inadequate blood flow. 20% of cases are a hemorrhage in the brain caused by a rupture or leakage from a blood vessel. 80% of cases are also know as a "schemic stroke", or the formation of a blood clot in a vessel supplying blood to the brain. following virtual reality intervention using the VMall virtual supermarket. Clients demonstrated improved performance on both measures following 3 weeks of multi-tasking training using the VMall virtual supermarket.

References

Alderman, N., Burgess, P.W., Knight, C., & Henman, C. (2003). Ecological validity of a simplified version of the multiple errands shopping test. Journal of the International Neuropsychological Society, 9, 31-44.
Dawson, D.R., Anderson, N.D., Burgess, P., Cooper, E., Krpan, K.M., & Stuss, D.T. (2009). Further development of the Multiple Errands Test: Standardized scoring, reliability, and ecological validity for the Baycrest version. Archives of Physical Medicine and Rehabilitation, 90, S41-51.
Knight, C., Alderman, N., & Burgess, P.W. (2002). Development of a simplified version of the Multiple Errands Test for use in hospital settings. Neuropsychological Rehabilitation, 12(3), 231-255.
Maier, A., Krauss, S., & Katz, N. (2011). Ecological validity of the Multiple Errands Test (MET) on discharge from neurorehabilitation hospital. Occupational Therapy Journal of Research: Occupation, Participation and Health, 31(1) S38-46.
Novakovic-Agopian, T., Chen, A.J.W., Rome, S., Abrams, G., Castelli, H., Rossi, A., McKim, R., Hills, N., & D’Esposito, M. (2011). Rehabilitation of executive functioning with training in attention regulation applied to individually defined goals: A pilot study bridging theory, assessment, and treatment. The Journal of Health Trauma Rehabilitation, 26(5), 325-338.
Novakovic-Agopian, T., Chen, A. J., Rome, S., Rossi, A., Abrams, G., DÃŠ¼esposito, M., Turner, G., McKim, R., Muir, J., Hills, N., Kennedy, C., Garfinkle, J., Murphy, M., Binder, D., Castelli, H. (2012). Assessment of Subcomponents of Executive Functioning in Ecologically Valid Settings: The Goal Processing Scale. The Journal of Health Trauma Rehabilitation, 2012 Oct 16. [Epub ahead of print]
Rand, D., Rukan, S., Weiss, P.L., & Katz, N. (2009a). Validation of the Virtual MET as an assessment tool for executive functions. Neuropsychological Rehabilitation, 19(4), 583-602.
Rand, D., Weiss, P., & Katz, N. (2009b). Training multitasking in a virtual supermarket: A novel intervention after stroke. American Journal of Occupational Therapy, 63, 535-542.
Raspelli, S., Carelli, L., Morganti, F., Poletti, B., Corra, B., Silani, V., & Riva, G. (2010). Implementation of the Multiple Errands Test in a NeuroVR-supermarket: A possible approach. Studies in Health Technology and Informatic, 154, 115-119.
Raspelli, S., Pallavicini, F., Carelli, L., Morganti, F., Pedroli, E., Cipresso, P., Poletti, B., Corra, B., Sangalli, D., Silani, V., & Riva, G. (2012). Validating the Neuro VR-based virtual version of the Multiple Errands Test: Preliminary results. Presence, 21(1), 31-42.
Shallice, T. & Burgess, P.W. (1991). Deficits in strategy application following frontal lobe damage in man. Brain, 114, 727-741.

See the measure

How to obtain the Multiple Errands Test?

See the papers below for test instructions of the Simplified Version (MET-SV) and the Hospital Version (MET-HV):

Alderman, N., Burgess, P.W., Knight, C., & Henman, C. (2003). Ecological validityRefers to the extent to which a measure captures behaviours that are reflective of those that would occur in a natural setting
of a simplified version of the multiple errands shopping test.Journal of the International Neuropsychological Society, 9, 31-44.
Knight, C., Alderman, N., & Burgess, P.W. (2002). Development of a simplified version of the Multiple Errands Test for use in hospital settings.Neuropsychological Rehabilitation, 12(3), 231-255.

The Baycrest MET can be obtained at https://cognitionandeverydaylifelabs.com/multiple-errands-test/