Mini-Mental State Examination (MMSE)
Purpose
The Mini-Mental State Examination (MMSE) was originally developed as a brief screening tool to provide a quantitative evaluation of cognitive impairment and to record cognitive changes over time (Folstein, Folstein, & McHugh, 1975). It has since become recognized that repeated administration of the MMSE to the same client reduces its validity, so the tool should not be readministered to the same individual after only a short interval. Rather than provide a diagnosis, the measure should be used to detect the presence of cognitive impairment (Folstein, Robins, & Helzer, 1983). The MMSE briefly measures orientation to time and place, immediate recall, short-term verbal memory, calculation, language, and constructional ability. Although the measure was originally used to detect dementia within a psychiatric setting, its use has become widespread. Since 1993, the MMSE has been available with an attached table that enables patient-specific norms to be identified on the basis of age and educational level (Crum, Anthony, Bassett, & Folstein, 1993).
In-Depth Review
Purpose of the measure
The Mini-Mental State Examination (MMSE) was originally developed as a brief screening tool to provide a quantitative evaluation of cognitive impairment and to record cognitive changes over time (Folstein, Folstein, & McHugh, 1975). It has since become recognized that repeated administration of the MMSE to the same client reduces its validity, so the tool should not be readministered to the same individual after only a short interval. Rather than provide a diagnosis, the measure should be used to detect the presence of cognitive impairment (Folstein, Robins, & Helzer, 1983). The MMSE briefly measures orientation to time and place, immediate recall, short-term verbal memory, calculation, language, and constructional ability. Although the measure was originally used to detect dementia within a psychiatric setting, its use has become widespread. Since 1993, the MMSE has been available with an attached table that enables patient-specific norms to be identified on the basis of age and educational level (Crum, Anthony, Bassett, & Folstein, 1993).
Available versions
The MMSE was published by Folstein et al. in 1975.
Features of the measure
Items:
The MMSE consists of 11 simple questions or tasks that assess various functions, including arithmetic, memory, and orientation.
Scoring:
The score is the number of correct items. The measure yields a total score of 30. A score of 23 or less is the generally accepted cutoff point indicating the presence of cognitive impairment (Ruchinskas & Curyto, 2003).
Levels of impairment have also been classified as none (24-30), mild (18-23), and severe (0-17) (Tombaugh & McIntyre, 1992).
More recently, Folstein, Folstein, McHugh, and Fanjiang (2001) recommended the following cutoff scores:
Score | Level of impairment |
≥ 27 | None |
21-26 | Mild |
11-20 | Moderate |
≤ 10 | Severe |
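To make the banding concrete, here is a minimal Python sketch that classifies a total score using the Folstein et al. (2001) cutoffs reproduced above; the function name and structure are illustrative only and are not part of the published scoring instructions.

```python
def classify_mmse(score: int) -> str:
    """Map a total MMSE score (0-30) to the impairment level
    suggested by Folstein et al. (2001)."""
    if not 0 <= score <= 30:
        raise ValueError("MMSE total score must be between 0 and 30")
    if score >= 27:
        return "none"
    if score >= 21:
        return "mild"
    if score >= 11:
        return "moderate"
    return "severe"

print(classify_mmse(28))  # -> none
print(classify_mmse(15))  # -> moderate
```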
Crum et al. (1993) reported that cognitive performance as measured by the MMSE varies within the population by age and educational level. There is an inverse relationship between MMSE scores and age: the median score ranges from 29 for those aged 18 to 24 years to 25 for individuals 80 years of age and older. MMSE scores also rise with educational level: the median score is 29 for individuals with at least 9 years of schooling, 26 for those with 5 to 8 years of schooling, and 22 for those with 0 to 4 years of schooling.
The following table, created by Crum et al. (1993), can be used to compare your patient’s MMSE score with a reference group based on age and education level.
(Source: Crum et al., 1993)
Education | Age 20-24 | 25-29 | 30-34 | 35-39 | 40-44 | 45-49 | 50-54 | 55-59 | 60-64 | 65-69 | 70-74 | 75-79 | 80-84 | >84 |
4th grade | 22 | 25 | 25 | 23 | 23 | 23 | 23 | 22 | 23 | 22 | 22 | 21 | 20 | 19 |
8th grade | 27 | 27 | 26 | 26 | 27 | 26 | 27 | 26 | 26 | 26 | 25 | 25 | 25 | 23 |
High school | 29 | 29 | 29 | 28 | 28 | 28 | 28 | 28 | 28 | 28 | 27 | 27 | 25 | 26 |
College | 29 | 29 | 29 | 29 | 29 | 29 | 29 | 29 | 29 | 29 | 28 | 28 | 27 | 27 |
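As an illustration of how this table can be used in practice, the following minimal Python sketch stores the medians transcribed from the table above and returns the reference value for a given education level and age band. The variable and function names are our own and are not part of any published tool.

```python
# Median MMSE scores by education level and age band, transcribed from
# Crum et al. (1993) as reproduced in the table above.
AGE_BANDS = ["20-24", "25-29", "30-34", "35-39", "40-44", "45-49", "50-54",
             "55-59", "60-64", "65-69", "70-74", "75-79", "80-84", ">84"]

MEDIANS = {
    "4th grade":   [22, 25, 25, 23, 23, 23, 23, 22, 23, 22, 22, 21, 20, 19],
    "8th grade":   [27, 27, 26, 26, 27, 26, 27, 26, 26, 26, 25, 25, 25, 23],
    "High school": [29, 29, 29, 28, 28, 28, 28, 28, 28, 28, 27, 27, 25, 26],
    "College":     [29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 28, 28, 27, 27],
}

def reference_median(education: str, age_band: str) -> int:
    """Return the Crum et al. (1993) median MMSE score for a given
    education level and age band."""
    return MEDIANS[education][AGE_BANDS.index(age_band)]

# Example: a 72-year-old high-school graduate scoring 24 falls below the
# reference median of 27 for that age/education group.
print(reference_median("High school", "70-74"))  # -> 27
```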
Subscales:
Orientation (total points = 10), Registration (total points = 3), Attention and calculation (total points = 5), Recall (total points = 3), and Language (total points = 9).
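For reference, the subscale maxima can be written down as a small mapping whose values sum to the 30-point total. This is a trivial Python sketch; the dictionary name is our own.

```python
# Maximum points obtainable on each MMSE subscale.
SUBSCALE_MAX = {
    "Orientation": 10,
    "Registration": 3,
    "Attention and calculation": 5,
    "Recall": 3,
    "Language": 9,
}

# The subscale maxima add up to the 30-point total score.
print(sum(SUBSCALE_MAX.values()))  # -> 30
```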
Equipment:
The MMSE requires no specialized equipment.
Training:
Little information has been reported on training requirements for the MMSE; however, a standardized version of the MMSE has been developed (Molloy & Standish, 1997).
Time:
Administration by a trained interviewer takes approximately 10 minutes.
Alternative form of the MMSE
The Modified Mini-Mental State Examination (3MS) (Teng & Chui, 1987).
An expanded version of the MMSE was developed by Teng and Chui (1987), increasing the content, number, and difficulty of the items included in the assessment. The 3MS is scored from 0 to 100, with a standardized cutoff point of 79/80 for the presence of cognitive impairment. The expanded assessment takes approximately 5 minutes longer to administer than the original MMSE, which takes approximately 10 minutes to complete. Grace et al. (1995) compared the MMSE to the 3MS in geriatric patients with stroke. Test-retest reliability of the 3MS was excellent (r = 0.80). The 3MS also correlated with a battery of neuropsychological assessments and with some cognitive domains missed by the MMSE. The 3MS was a significantly better predictor of functional outcome (as measured by the Functional Independence Measure) than the MMSE. The 3MS had higher sensitivity than the MMSE (69% vs. 44%) and similar specificity (80% vs. 79%). The area under the curve (AUC) was 0.798 for the 3MS.
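Sensitivity and specificity figures like those quoted above are computed from a 2x2 cross-tabulation of screening results against a reference diagnosis. The minimal Python sketch below shows the arithmetic; the counts are invented for illustration and are not data from Grace et al. (1995).

```python
def sensitivity_specificity(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical counts for a screening test against a reference diagnosis.
sens, spec = sensitivity_specificity(tp=35, fp=10, fn=15, tn=40)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")  # 0.70, 0.80
```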
3MS + Clock-drawing (Suhr & Grace, 1999).
The addition of clock drawing, a simple measure of constructional ability, increased the sensitivity of the 3MS in detecting focal brain damage in patients with right-hemisphere stroke (to 87%). The addition of the Clock Drawing Test requires about two extra minutes of administration time.
Standardized MMSE (SMMSE) (Molloy & Standish, 1997).
Molloy and Standish (1997) developed the SMMSE to improve the reliability of the measure by providing strict guidelines for administration and scoring. To examine the reliability of the SMMSE in 48 older adults, university students were randomized to administer either the MMSE or the SMMSE, were trained on that test, and then administered it to participants on three different occasions. The SMMSE had significantly better inter-rater and intra-rater reliability than the MMSE: inter-rater variance was reduced by 76% and intra-rater variance by 86%. The SMMSE also took less time to administer than the MMSE (an average of 10.5 minutes vs. 13.4 minutes). The intraclass correlation coefficient (ICC) was adequate for the MMSE (ICC = 0.69) and excellent for the SMMSE (ICC = 0.90).
Telephone version (ALFI-MMSE) (Roccaforte, Burke, Bayer, & Wengel, 1992).
This version includes 22 of the 30 original MMSE items, the majority of the removed items coming from the last section (language and motor skills). Roccaforte et al. (1992) examined the validity of the ALFI-MMSE in 100 geriatric outpatients. Correlations between the telephone and face-to-face versions of the MMSE were excellent (Pearson’s r = 0.85). Patients tended to score slightly higher on in-person testing than on the telephone. A sensitivity of 67% and a specificity of 100% (using a brief neurological screening test as the criterion) were reported in a population of elderly, community-dwelling individuals. These values were similar to the sensitivity (68%) and specificity (100%) reported for screening with the traditional MMSE.
26-item version of the ALFI-MMSE (T-MMSE) (Roccaforte et al. cited in Newkirk, Kim, Thompson, Tinklenberg, Yesavage, & Taylor, 2004).
The T-MMSE was developed from the ALFI-MMSE. It is a 26-point adaptation containing a 3-step command (“Say hello, tap the mouthpiece of the phone 3 times, then say ‘I’m back’”) and a new question that asks the patient to give the interviewer a telephone number where they can usually be reached. The T-MMSE had an excellent correlation with the MMSE (r = 0.88). Neither hearing impairment nor years of education was associated with T-MMSE scores. On the 22 points common to the two scales, scores also showed an excellent correlation (r = 0.88); however, telephone scores tended to be lower than face-to-face scores (Newkirk et al., 2004). The authors provide tables for the conversion of T-MMSE scores to MMSE scores.
Client suitability
Can be used with:
- Patients with stroke (Agrell & Dehlin, 2000; Ozdemir, Birtane, Tabatabaei, Ekuklu, Kokino, & Siranus, 2001; Grace et al., 1995; Suhr & Grace, 1999).
Should not be used with:
- The MMSE was ineffective in detecting cognitive impairment in patients with right-sided stroke (Grace et al., 1995).
- The MMSE is not suitable for use with a proxy respondent as it is administered via direct observation of task completion.
- Because the MMSE is heavily language dependent, it is likely to misclassify patients with aphasia.
- The MMSE has a limited ability to diagnose dementia in general practice and should therefore be used as only one aspect of a patient’s overall cognitive profile (Wind, Schellevis, van Staveren, Scholten, Jonker, & van Eijk, 1997).
- The MMSE has been criticized for attempting to assess too many functions in one brief test. An individual’s performance on individual items or within a single domain may be more useful than interpretation of a single overall score (Tombaugh & McIntyre, 1992). However, when the MMSE is used to screen for visual or verbal memory problems, or for problems in orientation or attention, acceptable cutoff scores cannot be identified (Blake, McKinney, Treece, Lee, & Lincoln, 2002).
- MMSE scores have been shown to be affected by age, level of education, ethnicity, and sociocultural background (Tombaugh & McIntyre, 1992; Bleeker et al., 1988; Lorentz et al., 2002; Shadlen, Larson, Gibbons, McCormick, & Teri, 1999). These variables may introduce bias leading to the misclassification of individuals. For example, highly educated individuals who have mild dementia may score within the normal range on the MMSE because they find the questions easy, while poorly educated individuals may obtain low scores simply because they find the questions difficult, suggesting a diagnosis of dementia when none is present. Although these biases are not always present (Agrell and Dehlin (2000), for example, found that age and education did not influence scores in their study), attention to these factors is warranted when interpreting MMSE results.
- The MMSE has been found to lack sensitivity in patients with stroke (Blake et al., 2002; Suhr & Grace, 1999; Nys et al., 2005). Other studies have reported low levels of sensitivity among individuals with mild cognitive impairment (Tombaugh & McIntyre, 1992; de Koning et al., 1998) and in patients with right-hemisphere lesions (Dick et al., 1984). One potential solution to increase the sensitivity of the MMSE is the addition of a Clock Drawing Test (Suhr & Grace, 1999). Another is to administer the Neurobehavioral Cognitive Status Examination (NCSE) in lieu of the MMSE; the NCSE is a highly sensitive measure for detecting cognitive impairment in patients with brain lesions (Schwamm, Van Dyke, Kiernan, Merrin, & Mueller, 1997).
- Da Costa et al. (2010) investigated the cognitive evolution and clinical severity of illiterate and schooled patients with stroke during a 6-month follow-up, using the MMSE and the National Institutes of Health Stroke Scale (NIHSS), respectively. Significant improvement in clinical severity as measured by the NIHSS was observed in both groups (P < 0.001); however, only schooled individuals showed a significant improvement in MMSE scores, indicating an improvement in their overall cognitive function (P = 0.008). Schooling was found to significantly influence MMSE scores.
- Folstein, Folstein, and McHugh (1998) reported that the MMSE demonstrates marked ceiling effects in younger intact individuals and marked floor effects in moderately to severely impaired individuals.
In what languages is the measure available?
Afrikaans | Dutch | Israeli English | Romanian |
Arabic | Estonian | Italian | Russian |
Argentinean Spanish | Filipino | Japanese | Russian for Estonia |
Belgian Dutch | Finnish | Kannada | Serbian |
Belgian French | French | Korean | Slovakian |
Bosnian | Austrian German | Latvian | South African English |
Brazilian Portuguese | German | Lithuanian | Spanish |
Bulgarian | Greek | Macedonian | Swedish |
Chilean Spanish | Gujarati | Malayalam | Telugu |
Chinese | Hebrew | Marathi | Turkish |
Croatian | Hindi | Norwegian | UK English |
Czech | Hungarian | Polish | Ukrainian |
Danish | Indian English | Portuguese | Urdu |
Authorized translations of the MMSE can be obtained by contacting Custsupp@parinc.com or by calling 1.800.331.8378.
Summary
What does the tool measure? | Cognitive impairment |
What types of clients can the tool be used for? | Although originally used to detect dementia within a psychiatric setting, the MMSE is now used with a wide range of clients. Patient-specific norms based on age and educational level are available (Crum et al., 1993). |
Is this a screening or assessment tool? | Screening |
Time to administer | Administration by a trained interviewer takes approximately 10 minutes. |
Versions | The modified mini-mental state examination (3MS); 3MS + Clock-drawing; Standardized MMSE (SMMSE); Telephone version (ALFI-MMSE); 26-item version of the ALFI-MMSE (T-MMSE) |
Other Languages | Afrikaans; Arabic; Argentinean Spanish; Austrian German; Belgian Dutch; Belgian French; Bosnian; Brazilian Portuguese; Bulgarian; Chilean Spanish; Chinese; Croatian; Czech; Danish; Dutch; Estonian; Filipino; Finnish; French; German; Greek; Gujarati; Hebrew; Hindi; Hungarian; Indian English; Israeli English; Italian; Japanese; Kannada; Korean; Latvian; Lithuanian; Macedonian; Malayalam; Marathi; Norwegian; Polish; Portuguese; Romanian; Russian; Russian for Estonia; Serbian; Slovakian; South African English; Spanish; Swedish; Telugu; Turkish; UK English; Ukrainian; Urdu |
Floor/Ceiling effects | Folstein, Folstein, and McHugh (1998) reported that the MMSE demonstrates marked ceiling effects in younger intact individuals and marked floor effects in individuals with moderate to severe cognitive impairment. |
Reliability | Internal consistency: Out of nine studies examining the internal consistency of the MMSE, results ranged from poor to excellent (alpha = 0.31 to 0.96). Test-retest: Studies report test-retest reliability ranging from poor to excellent (r = 0.23 to 0.99). Inter-rater: Studies report adequate to excellent inter-rater reliability (e.g., kappa = 0.63 to 0.97). |
Validity | Criterion: The MMSE can discriminate between patients with Alzheimer’s disease and frontotemporal dementia, and between patients with left- and right-hemispheric stroke. Construct: Jones and Gallo (2000) identified five factors (concentration, language and praxis, orientation, memory, and attention) supporting the construct validity of the MMSE. Predictive: Baseline MMSE scores were somewhat predictive of functional improvement after stroke (Ozdemir et al., 2001) and highly predictive of discharge destination after geriatric rehabilitation (Diamond et al., 1996). |
Does the tool detect change in patients? | Not applicable. |
Acceptability | The MMSE is brief to administer. Patient variables such as age, level of education, and sociocultural background may affect scores on the measure. It is administered by direct observation and is therefore not appropriate for proxy use. |
Feasibility | No specialized equipment is required, so the MMSE is highly portable and inexpensive. However, one study reported that physicians found the MMSE too lengthy and felt it contributed little useful information. |
How to obtain the tool? | The MMSE can be obtained from the current copyright owner, Psychological Assessment Resources (PAR). |
Psychometric Properties
Overview
We conducted a literature search to identify all relevant publications on the psychometric properties of the MMSE.
Floor/Ceiling Effects
Folstein, Folstein, and McHugh (1998) reported that the MMSE demonstrates marked ceiling effects in younger intact individuals and marked floor effects in individuals with moderate to severe impairment.
Reliability
Internal consistency:
Tombaugh and McIntyre (1992) reviewed studies published on the psychometric properties of the MMSE over the preceding 26 years. The internal consistency of the MMSE was reported to range from poor to excellent (alpha = 0.54 to 0.96).
McDowell, Kristjansson, Hill, and Hebert (1997) examined the internal consistency of the MMSE used as a screening test for cognitive impairment and dementia. The internal consistency was adequate (alpha = 0.78).
Holzer, Tischler, Leaf, and Myers (1984) examined the prevalence of dementia in a community sample (n = 4,917). In this study, the internal consistency of the MMSE was found to be adequate (alpha = 0.77). Reliability of individual items ranged from poor (alpha = 0.43 for Orientation) to excellent (alpha = 0.82 for Registration). Calculation/attention items were omitted from this study.
Kay, Henderson, Scott, Wilson, Rickwood, and Grayson (1985) conducted a community survey of 274 individuals over 70 years of age. Rates of dementia were measured by interviewing participants with the MMSE. In this study, the internal consistency of the MMSE was poor (alpha = 0.68).
Foreman (1987) examined the reliability of the MMSE in 66 hospitalized medical-surgical patients (normal, dementia, or delirium) over 65 years of age. The MMSE was found to have excellent internal consistency (alpha = 0.96).
Jorm, Scott, Henderson, and Kay (1988) examined whether there was a bias in the MMSE such that individuals with less education (8th grade or less) would perform worse on the measure than individuals with more education (more than 8th grade). The MMSE was administered to 269 elderly participants. The internal consistency was found to be poor in both the more educated group (alpha = 0.54) and the less educated group (alpha = 0.65).
Albert and Cohen (1992) administered the MMSE to 40 elderly residents with severe cognitive impairment. The internal consistency of the MMSE was poor in patients with an MMSE score ≤ 10 (alpha = 0.56). However, when subjects representing the full range of MMSE scores were included, the internal consistency was excellent (alpha = 0.90).
Tombaugh, McDowell, Kristjansson, and Hubley (1996) compared the psychometric properties of the MMSE to the 3MS in community-dwelling participants aged 65 to 89. Participants were divided into two groups, one with no cognitive impairment (n = 406) and one with Alzheimer’s disease (n = 119). The internal consistency of the MMSE was poor in the group without cognitive impairment (alpha = 0.62) and excellent in the patients with Alzheimer’s disease (alpha = 0.81).
Hopp, Dixon, Grut, and Backman (1997) administered the MMSE to 44 adults without dementia who were over the age of 75 years. In this sample, the internal consistency of the MMSE was poor (alpha ranged from 0.31 to 0.52).
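The alpha values quoted throughout this subsection are Cronbach’s alpha, which compares the sum of the item variances to the variance of the total score. The following minimal Python sketch shows the computation with made-up item scores; it does not use data from any of the studies cited.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents x n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)          # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)      # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical pass/fail scores for five respondents on four items.
scores = np.array([
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
])
print(round(cronbach_alpha(scores), 2))  # -> 0.79 for these invented data
```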
Test-retest:
Tombaugh and McIntyre (1992) reviewed studies published on the psychometric properties of the MMSE over the preceding 26 years. They reported that in studies with a re-test interval of less than 2 months, the MMSE had poor to excellent test-retest reliability, with correlations ranging from 0.38 to 0.99. Twenty-four out of 30 studies reported excellent test-retest reliability (r > 0.75).
Folstein et al. (1975) administered the MMSE to 206 patients with dementia syndromes, affective disorder, affective disorder with cognitive impairment, mania, schizophrenia, or personality disorders, and to 63 healthy controls. The test-retest reliability of the MMSE when administered twice within 24 hours was excellent, with a Pearson correlation coefficient of r = 0.89. When the MMSE was given twice, 28 days apart, to patients with depression and dementia, the correlation was also excellent (Pearson’s r = 0.99).
Note: Pearson correlation coefficients are likely to over-estimate reliability, and the Pearson coefficient is no longer used for test-retest reliability.
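To see why, consider the following minimal Python sketch with invented test-retest scores (not data from any study cited here). A uniform 3-point practice effect leaves the Pearson correlation at a perfect 1.0, whereas a two-way, absolute-agreement intraclass correlation, computed here from the standard ANOVA mean squares, is penalized by the systematic shift.

```python
import numpy as np

# Hypothetical test-retest scores in which every retest score is 3 points
# higher (e.g., a practice effect). The two occasions do not agree, yet the
# Pearson correlation is still a perfect 1.0.
test = np.array([20, 22, 24, 26, 28], dtype=float)
retest = test + 3

print(round(np.corrcoef(test, retest)[0, 1], 2))  # -> 1.0

def icc_agreement(x: np.ndarray, y: np.ndarray) -> float:
    """Two-way random-effects, absolute-agreement, single-measure ICC,
    computed from the usual ANOVA mean squares."""
    scores = np.column_stack([x, y])       # rows = subjects, cols = occasions
    n, k = scores.shape
    grand = scores.mean()
    ms_rows = k * scores.mean(axis=1).var(ddof=1)   # between-subjects MS
    ms_cols = n * scores.mean(axis=0).var(ddof=1)   # between-occasions MS
    resid = (scores - scores.mean(axis=1, keepdims=True)
             - scores.mean(axis=0, keepdims=True) + grand)
    ms_err = (resid ** 2).sum() / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

print(round(icc_agreement(test, retest), 2))  # -> 0.69: the shift is penalized
```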
Schmand, Lindeboom, Launer, Dinkgreve, Hooijer, and Jonker (1995) examined the test-retest reliability of the MMSE in healthy older subjects who were examined twice with an interval of 1 year between evaluations. Test-retest reliability was adequate (Spearman’s correlation = 0.58). The results of this study are similar to those found by O’Connor et al. (1989). These results suggest that the MMSE is not an appropriate measure for detecting subtle cognitive impairment.
Hopp et al. (1997) administered the MMSE to 44 adults without dementia who were over the age of 75 years. Test-retest reliability over 6-, 12-, and 18-month intervals, using Pearson’s correlations, ranged from adequate to excellent (r = 0.56 to r = 0.80).
Olin and Zelinski (1991) examined the 12-month reliability of the MMSE in 57 elderly participants without dementia. Poor 12-month test-retest correlations were found for the total MMSE score (r = 0.34 when administering the alternate Attention item, r = 0.23 when administering the same Attention item).
Uhlmann, Larson, and Buchner (1987) also examined the 12-month test-retest reliability of the MMSE in outpatients with dementia. In this study, the test-retest reliability was found to be excellent (r = 0.86).
Mitrushina and Satz (1991) examined the test-retest reliability of the MMSE in 122 healthy community-residing elderly volunteers aged 57 to 85. The test-retest reliability of the MMSE was adequate over a 1-year interval (ranging from r = 0.45 to r = 0.50) and poor over a 2-year period (r = 0.38).
Intra-rater:
Molloy and Standish (1997) examined the intra-rater reliability of the MMSE in comparison to the SMMSE in 48 older adults. University students, who were trained to administer either the MMSE or the SMMSE, tested participants on three different occasions to assess inter-rater and intra-rater reliability. An adequate ICC of 0.69 was reported for the traditional MMSE.
Inter-rater:
Dick et al. (1984) examined the inter-rater reliability of the MMSE in patients with neurological disorders and reported a kappa of 0.63, demonstrating adequate inter-rater reliability.
Fabrigoule, Lechevallier, Crasborn, Dartigues, and Orgogozo (2003) examined the reliability of the MMSE in patients who were likely to develop dementia. Fifty trained general practitioners and psychologists examined the patients. There was a significant difference in MMSE scores between the general practitioners and the psychologists; the concordance correlation coefficient between evaluations performed by general practitioners and those performed by psychologists was 0.87.
In a study by O’Connor et al. (1989), 5 coders rated taped interviews with 54 general practice patients aged 75 and over. In this study, the inter-rater reliability was excellent, with a mean kappa value of 0.97.
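The kappa values quoted in this subsection correct raw rater agreement for the agreement expected by chance. The minimal Python sketch below shows the calculation for two raters making a binary judgement; the counts are invented for illustration and are not data from Dick et al. (1984) or O’Connor et al. (1989).

```python
def cohens_kappa(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa for two raters making a binary judgement
    (impaired / not impaired), given the 2x2 agreement counts:
    a = both say impaired, d = both say not impaired,
    b and c = the two kinds of disagreement."""
    n = a + b + c + d
    observed = (a + d) / n
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical counts for two raters classifying 100 patients.
print(round(cohens_kappa(a=40, b=10, c=5, d=45), 2))  # -> 0.7
```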
Validity
Criterion:
Although the MMSE is generally considered unidimensional, Jones and Gallo (2000) identified five factors (concentration, language and praxis, orientation, memory, and attention) that support the construct validity of the MMSE as a measure of cognitive mental state among community-dwelling older adults.
Concurrent:
Friedl, Schmidt, Stronegger, Fazekas, and Reinhart (1996) examined the concurrent validity of the MMSE and the Mattis Dementia Rating Scale (MDRS) (Mattis, 1976), two measures commonly used to screen for dementia. Concurrent validity between the MMSE and the MDRS was found to be poor (Pearson’s r = 0.29), as were correlations between the MMSE and the MDRS subtests (attention r = 0.18; initiation and perseveration r = 0.04; construction r = 0.10; conceptualization r = 0.17; verbal and non-verbal short-term memory r = 0.27).
Folstein et al. (1975) administered the MMSE to 206 patients with dementia syndromes, affective disorder, affective disorder with cognitive impairment, mania, schizophrenia, or personality disorders, and to 63 healthy controls. The concurrent validity of the MMSE was examined by correlating the measure with the Wechsler Adult Intelligence Scale (WAIS; Wechsler, 1955). Correlations between the MMSE and the WAIS Verbal IQ (r = 0.78) and the WAIS Performance IQ (r = 0.66) were both excellent.
Hopp, Dixon, Grut, and Backman (1997) administered the MMSE and the Wechsler Adult Intelligence Scale-Revised (WAIS-R; Wechsler, 1981) to 44 adults without dementia who were over the age of 75 years. Correlations between the MMSE and the WAIS-R Verbal IQ were adequate, ranging from r = 0.36 to r = 0.52, as were correlations between the MMSE and the WAIS-R Performance IQ (r = 0.37 to r = 0.57). Correlations between the MMSE and the WAIS-R subtests ranged from poor to excellent (r = 0.20 to r = 0.60). Correlations between the MMSE subscales and the WAIS-R were generally lower than r = 0.41; the Language subscale of the MMSE showed the lowest correlations with both the WAIS-R Verbal and Performance IQ. Among the MMSE subscales, Orientation had the lowest correlations with the WAIS-R subtests (r = 0.001 to r = 0.40).
Similar to the results of Hopp et al. (1997), Dick et al. (1984) examined the utility of the MMSE for bedside screening and serial assessment of cognitive function in 126 neurological patients, and found adequate correlations between the MMSE and the Wechsler Adult Intelligence Scale (WAIS) (r = 0.55 for WAIS Verbal; r = 0.56 for WAIS Performance).
Agrell and Dehlin (2000) reported significant correlations between MMSE scores and the Barthel Index (Mahoney & Barthel, 1965), the Montgomery Asberg Depression Rating Scale (MADRS – Montgomery & Asberg, 1979), and the Zung Depression Scale (Zung, 1965).
Diamond, Felsenthal, Macciocci, Butler, and Lally-Cassady (1996) examined the relationship between cognition and ability to benefit from inpatient rehabilitation in 52 patients admitted to geriatric rehabilitation. Functional gain was assessed using the change in Functional Independence Measure (FIM – Keith, Granger, Hamilton, & Sherwin, 1987) score from admission to discharge. The MMSE was not found to be associated with change in FIM score (r = 0.10). However, the MMSE alone and in combination with age correlated adequately with functional status on admission (r = 0.58) and discharge (r = 0.49).
Predictive:
Ozdemir et al. (2001) examined the predictive validity of the MMSE in 43 patients with stroke. MMSE scores correlated with functional score improvement as measured by the Adapted Patient Evaluation and Conference System functional scale (r = 0.31). These results suggest that baseline total MMSE scores are somewhat predictive of functional improvement in patients with stroke.
Diamond et al. (1996) examined the relationship between cognition and the ability to benefit from inpatient rehabilitation in 52 patients admitted to geriatric rehabilitation. The MMSE was found to be highly predictive of discharge destination such that low MMSE scores were associated with a greater likelihood of nursing home placement (r = 0.68). While only 8% of the uppermost MMSE quartile was discharged to nursing home placement, 62% of the lowest MMSE quartile was discharged to nursing homes.
Aguero-Torres, Fratiglioni, Guo, Viitanen, von Strauss, and Winblad (1998) examined predictors of dependence in activities of daily living, as measured by the Katz Index of Activities of Daily Living (Katz, Downs, Cash, & Grotz, 1970), in the elderly. In patients without dementia, the MMSE was found to be one of the strongest predictors of developing functional dependence at a 3-year follow-up. Lower MMSE scores were associated with functional dependence both in adults with dementia (OR = 0.8) and in adults without dementia (OR = 0.8). Initial MMSE performance also predicted future functional dependence and decline among adults without dementia (OR = 0.7). Thus, independent of the presence of other chronic conditions, the MMSE may indicate subsequent functional status in a cognitively intact elderly population.
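As a rough illustration of how a per-point odds ratio of 0.8 compounds over larger MMSE differences, the sketch below assumes the conventional per-unit (per MMSE point) interpretation of a logistic-regression odds ratio; that interpretation is our reading, not a detail reported in the study above.

```python
# Illustrative arithmetic only: compounding a per-point odds ratio of 0.8 for
# functional dependence across larger MMSE differences, assuming the usual
# per-unit interpretation of a logistic-regression odds ratio.
ODDS_RATIO_PER_POINT = 0.8

for points_higher in (1, 5, 10):
    factor = ODDS_RATIO_PER_POINT ** points_higher
    print(f"{points_higher}-point higher MMSE -> odds of dependence multiplied by {factor:.2f}")
```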
Matsueda and Ishii (2000) retrospectively examined the relationship between MMSE score and ambulatory level (divided into three groups: dependent, partially dependent, and independent) in 162 elderly patients who experienced a hip fracture. A significant relationship was found between initial MMSE score and ambulatory level such that those in the dependent group had the lowest mean MMSE score of only 6.6, those in the partially dependent group had a mean score of 17.9, and those in the independent group had the highest MMSE score of 24.6.
Huusko, Karppi, Avikainen, Kautiainen, and Sulkava (2000) examined the effect of intensive geriatric rehabilitation (intervention group) versus local hospital treatment (control group) in patients with dementia and a hip fracture. MMSE scores were predictive of length of hospital stay: for patients with moderate dementia (MMSE score of 12-17), the median length of stay was 47 days in the intervention group and 147 days in the control group; for patients with mild dementia (MMSE score of 18-23), it was 29 days in the intervention group and 46.5 days in the control group. No significant differences in mortality or in length of hospital stay were observed for patients with severe dementia. In the intervention group, 3 months after surgery, 91% of the patients with mild dementia and 63% of the patients with moderate dementia were living independently; in the control group, the corresponding figures were 67% and 17%, respectively. These results suggest that the MMSE is associated with the length of hospital and rehabilitation stay, and that length of stay can be influenced by intervention for those with cognitive impairment.
Pettigrew, Thomas, Howard, Veltkamp, and Toole (2000) examined whether low MMSE scores predict transient ischemic attack, stroke, or death in patients with asymptomatic carotid arterial stenosis, and found that low MMSE scores predicted mortality in this population.
Construct:
Convergent:
Snowden et al. (1999) examined 140 patients who were part of the Alzheimer’s Disease Patient Registry to evaluate the psychometric properties of a new measure, the Minimum Data Set (MDS). Cognitive performance scores from the MDS were correlated with the MMSE. The MMSE correlated adequately with the MDS (Spearman’s r = -0.45); the correlation is negative because a low score on the MMSE indicates cognitive impairment, whereas a high score on the MDS indicates impairment. Consistent with previous studies, the MMSE had excellent correlations with the Wechsler Adult Intelligence Scale (WAIS) Verbal and Performance IQ scores (r = 0.78 and r = 0.66, respectively).
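The sign convention noted above can be illustrated with a small sketch: when one scale scores higher for better cognition and the other scores higher for greater impairment, the rank correlation comes out negative. The values below are invented for illustration and are not data from the studies cited.

```python
# A minimal sketch (hypothetical data) of why the reported coefficient is negative:
# MMSE totals rise with better cognition, while the comparison index rises with
# greater impairment, so their ranks run in opposite directions.
from scipy.stats import spearmanr

mmse_totals = [30, 28, 25, 21, 17, 12, 9]   # hypothetical MMSE totals (higher = better)
impairment_index = [0, 1, 1, 3, 4, 5, 6]    # hypothetical index (higher = more impaired)

rho, p_value = spearmanr(mmse_totals, impairment_index)
print(f"Spearman rho = {rho:.2f}")  # negative for these opposed scoring directions
```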
Discriminant:
Winograd et al. (1994) developed the Physical Performance and Mobility Examination, a measure used to assess 6 domains of physical functioning and mobility for hospitalized elderly. The construct validity of this measure was examined by comparing it to the MMSE, Activities of Daily Living (ADL), Instrumental Activities of Daily Living (IADL) (Lawton & Brody, 1969), the Geriatric Depression Scale (Yesavage et al., 1983), and the modified Medical Outcomes Study Measure of Physical Functioning (MOS-PFR). The MMSE correlated poorly with the Physical Performance and Mobility Examination (r = 0.36), suggesting that these two measures assess different constructs.
Macnight and Rockwood (1995) examined the discriminant validity of the MMSE by comparing it to a new measure, the Hierarchical Assessment of Balance and Mobility (HABAM), in patients 65 and older. Discriminant validity was demonstrated, as the two measures correlated poorly (r = 0.15).
Known groups:
Wetherell, Darby, Emerson, and Miller (1997) found that the MMSE was able to discriminate between patients with Alzheimer’s Disease and frontotemporal dementia.
Kase, Wolf, Kelly-Hayes, Kannel, Beiser, and D’Agostino (1998) found that baseline pre-stroke MMSE scores were significantly lower for patients with stroke than for stroke-free control subjects.
Sensitivity and Specificity
Low levels of sensitivity, particularly among individuals with mild cognitive impairment, have been reported for the MMSE (Tombaugh & McIntyre, 1992; de Koning et al., 1998) and may be due to the emphasis placed on language items and a lack of items assessing visual-spatial ability (Grace et al., 1995; de Koning et al., 1998; Suhr & Grace, 1999).
Blake et al. (2002) examined the sensitivity and specificity of the MMSE for detecting cognitive impairment after stroke. When MMSE scores were compared with the presence of cognitive impairment, an optimum cutoff of < 24 was identified, with good specificity (88%) and moderate sensitivity (62%). However, it was not possible to identify suitable cutoff scores for using the MMSE to assess for the presence of either visual or verbal memory deficits.
Nys, van Zandvoort, de Kort, Jansen, Kappelle, and de Haan (2005) administered the MMSE to 34 patients with stroke and 34 healthy controls. In this study, no optimum cut-off score yielding both a sensitivity greater than 80% and a specificity greater than 60% could be identified.
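For readers unfamiliar with how cutoff-based sensitivity and specificity are derived, the sketch below applies the commonly used < 24 cutoff to a small set of hypothetical patients. The scores and reference-standard labels are illustrative, not data from Blake et al. (2002) or Nys et al. (2005).

```python
# A minimal sketch (hypothetical data): sensitivity and specificity of an MMSE
# cutoff. A score below the cutoff counts as a positive screen; impairment
# status comes from a reference standard.
CUTOFF = 24  # "screen positive" when MMSE < 24

# (MMSE total, impaired according to the reference standard) -- illustrative only
patients = [
    (29, False), (26, False), (23, True), (25, True),
    (19, True), (28, False), (22, True), (21, False),
]

true_pos = sum(1 for score, impaired in patients if impaired and score < CUTOFF)
false_neg = sum(1 for score, impaired in patients if impaired and score >= CUTOFF)
true_neg = sum(1 for score, impaired in patients if not impaired and score >= CUTOFF)
false_pos = sum(1 for score, impaired in patients if not impaired and score < CUTOFF)

sensitivity = true_pos / (true_pos + false_neg)  # impaired patients correctly flagged
specificity = true_neg / (true_neg + false_pos)  # unimpaired patients correctly passed
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```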
References
- Agrell, B., Dehlin, O. (2000). Mini mental state examination in geriatric stroke patients. Validity, differences between subgroups of patients, and relationships to somatic and mental variables. Aging (Milano), 12(6), 439-444.
- Aguero-Torres, H., Fratiglioni, L., Guo, Z., Viitanen, M., von Strauss, E., Winblad, B. (1998). Dementia is the major cause of functional dependence in the elderly: 3-year follow-up data from a population-based study. American Journal of Public Health, 88, 1452-1456.
- Albert, M., Cohen, C. (1992). The test for severe impairment: An instrument for the assessment of patients with severe cognitive dysfunction. J Am Geriatr Soc, 40(5), 449-453.
- Blake, H., McKinney, M., Treece, K., Lee, E., Lincoln, N. B. (2002). An evaluation of screening measures for cognitive impairment after stroke. Age and Ageing, 31, 451-456.
- Bleecker, M. L., Bolla-Wilson, K., Kawas, C., Agnew, J. (1988). Age-specific norms for the Mini-Mental State Exam. Neurology, 10, 1565-1568.
- Crum, R. M., Anthony, J. C., Bassett, S. S., Folstein, M. F. (1993). Population-based norms for the mini-mental state examination by age and educational level. JAMA, 18, 2386-2391.
- Da Costa, F.A., Bezerra, I.F.D., de Araujo Silva, D.L., de Oliveira, R. & da Rocha, V.M. (2010). Cognitive evolution by MMSE in poststroke patients. International Journal of Rehabilitation Research, 33, 248-253.
- de Koning, I., van Kooten, F., Dippel, D. W. J., van Harskamp, F., Grobbee, D. E., Kluft, C., Koudstaal, P. J. (1998). The CAMCOG: A useful screening instrument for dementia in stroke patients. Stroke, 29, 2080-2086.
- Diamond, P. T., Felsenthal, G., Macciocci, S. N., Butler, D. H., Lally-Cassady, D. (1996). Effect of cognitive impairment on rehabilitation outcome. American Journal of Physical Medicine & Rehabilitation, 75(1), 40-43.
- Dick, J. P., Guiloff, R. J., Stewart, A., Blackstock, J., Bielawska, C., Paul, E. A., Marsden, C. D. (1984). Mini-mental state examination in neurological patients. Journal of Neurology, Neurosurgery, and Psychiatry, 47, 496-499.
- Fabrigoule, C., Lechevallier, N., Crasborn, L., Dartigues, J. F., Orgogozo, J. M. (2003). Inter-rater reliability of scales used to measure mild cognitive impairment by general practitioners and psychologists. Current Medial Research and Opinion, 19(7), 603-608.
- Folstein, M. F., Folstein, S. E., McHugh, P. R. (1975). “Mini-mental state”. A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res, 12(3), 189-198.
- Folstein, M. F., Folstein, S. E., McHugh, P. R. (1998). Key papers in geriatric psychiatry. Mini-Mental State: A practical method for grading the cognitive state of patients for the clinician. Int J Geriat Psychiatry, 13(5), 285-294.
- Folstein, M. F., Folstein, S. E., McHugh, P. R., Fanjiang, G. (2001). Mini-Mental State Examination User’s Guide. Odessa, FL: Psychological Assessment Resources.
- Folstein, M. F., Robins, L. N., Helzer, J. E. (1983). The Mini-Mental State Examination. Arch Gen Psychiatry, 40(7), 812.
- Foreman, M. D. (1987). Reliability and validity of mental status questionnaires in elderly hospitalized patients. Nurs Res, 36(4), 216-220.
- Friedl, W., Schmidt, R., Stronegger, W. J., Fazekas, F., Reinhart, B. (1996). Sociodemographic predictors and concurrent validity of the Mini Mental State Examination and the Mattis Dementia Rating Scale. European Archives of Psychiatry and Clinical Neuroscience, 246(6), 317-319.
- Grace, J., Nadler, J. D., White, D. A., Guilmette, T. J., Giuliano, A. J., Monsch, A. U., Snow, M. G. (1995). Folstein vs modified Mini-Mental State Examination in geriatric stroke. Stability, validity, and screening utility. Archives of Neurology, 52(5), 477-484.
- Holzer, C. E., Tischler, G. L., Leaf, P. J., Myers, J. K. (1984). An epidemiologic assessment of cognitive impairment in a community. Research in Community Mental Health, 4, 3-32.
- Hopp, G. A., Dixon, R. A., Grut, M., Backman, L. (1997). Longitudinal and psychometric profiles of two cognitive status tests in very old adults. J Clin Psychol, 53(7), 673-686.
- Huusko, T. M., Karppi, P., Avikainen, V., Kautiainen, H., Sulkava, R. (2000). Randomised, clinically controlled trial of intensive geriatric rehabilitation in patients with hip fracture: Subgroup analysis of patients with dementia. British Medical Journal, 321, 1107-1111.
- Jones, R. N., Gallo, J. J. (2000). Dimensions of the Mini-Mental State Examination among community dwelling older adults. Psychological Medicine, 30, 605-618.
- Jorm, A. F., Scott, R., Henderson, A. S., Kay, K. W. (1988). Educational level differences on the Mini-Mental State: The role of test bias. Psychol Med, 18(3), 727-731.
- Kase, C. S., Wolf, P. A., Kelly-Hayes, M., Kannel, W. B., Beiser, A., D’Agostino, R. B. (1998). Intellectual decline after stroke: The Framingham study. Stroke, 29, 805-812.
- Katz, S., Downs, T. D., Cash, H. R., Grotz, R. C. (1970). Index of Activities of Daily Living. The Gerontologist, 1, 20-30.
- Kay, K. W., Henderson, A. S., Scott, R., Wilson, J., Rickwood, D., Grayson, D. A. (1985). Dementia and depression among the elderly living in the Hobart community: The effect of the diagnostic criteria on the prevalence rates. Psychol Med, 15(4), 771-788.
- Keith, R. A., Granger, C. V., Hamilton, B. B., Sherwin, F. S. (1987). The functional independence measure: A new tool for rehabilitation. Adv Clin Rehabil, 1, 6-18.
- Lawton, M. P., Brody, E. M. (1969). Assessment of older people: Self-maintaining and instrumental activities of daily living. Gerontologist, 9, 179-186.
- Lorentz, W. J., Scanlan, J. M., Borson, S. (2002). Brief screening test for dementia. Can J Psychiatry, 47, 723-733.
- Macnight, C., Rockwood, K. (1995). A hierarchical assessment of balance and mobility. Age and Ageing, 24(2), 126-130.
- Mahoney, F. I., Barthel, D. W. (1965). Functional evaluation: The Barthel Index. Md State Med J, 14, 61-5.
- Matsueda, M., Ishii, Y. (2000). The relationship between dementia score and ambulatory level after hip fracture in the elderly. American Journal of Orthopedics, 29, 691-693.
- Mattis, S. (1976). Mental status examination for organic mental syndrome in the elderly patient. In: Bellak L, Karasu TB, editors. Geriatric Psychiatry. New York: Grune and Stratton, 77-101.
- McDowell, I., Kristjansson, B., Hill, G. B., Hebert, R. (1997). Community screening for dementia: The Mini Mental State Exam (MMSE) and modified Mini-Mental State Exam (3MS) compared. Journal of Clinical Epidemiology, 50(4), 377-383.
- Mitrushina, M., Satz, P. (1991). Reliability and validity of the Mini-Mental State Exam in neurologically intact elderly. J Clin Psychol, 47(4), 537-543.
- Molloy, D. W., Standish, T. I. M. (1997). A guide to the Standardized Mini-Mental State Examination. International Psychogeriatrics, 9(1), 87-94.
- Montgomery, S. A., Asberg, M. (1979). A new depression scale designed to be sensitive to change. Brit J Psychiat, 134, 382-389.
- Newkirk, L. A., Kim, J. M., Thompson, J. M., Tinklenberg, J. R., Yesavage, J. A., Taylor, J. L. (2004). Validation of a 26-point telephone version of the Mini-Mental State Examination. Journal of Geriatric Psychiatry and Neurology, 17(2), 81-87.
- Nys, G. M., van Zandvoort, M. J., de Kort, P. L., Jansen, B. P., Kappelle, L. J., de Haan, E. H. (2005). Restrictions of the Mini-Mental State Examination in acute stroke. Arch Clin Neuropsychol, 20(5), 623-629.
- O’Connor, D. W., Pollitt, P. A., Hyde, J. B., Fellows, J. L., Miller, N. D., Brooke, C. P., Reiss, B. B. (1989). The reliability and validity of the Mini-Mental State in a British community survey. J Psychiatr Res, 23(1), 87-96.
- Olin, J.T., Zelinski, E.M. (1991). The 12-month reliability of the Mini-Mental State Examination. Psychological Assessment, 3, 427-432.
- Ozdemir, F., Birtane, M., Tabatabaei, R., Ekuklu, G., Kokino, S. (2001). Cognitive evaluation and functional outcome after stroke. American Journal of Physical Medicine & Rehabilitation, 80(6), 410-415.
- Pettigrew, L. C., Thomas, N., Howard, V. J., Veltkamp, R., Toole, J. F. (2000). Low mini-mental status predicts mortality in asymptomatic carotid arterial stenosis. Neurology, 55, 30-34.
- Roccaforte, W. H., Burke, W. J., Bayer, B. L., Wengel, S. P. (1992). Validation of a telephone version of the mini-mental state examination. J Am Geriatr Soc, 40(7), 697-702.
- Ruchinskas, R. A., Curyto, K. J. (2003). Cognitive screening in geriatric rehabilitation. Rehab Psychol, 48, 14-22.
- Schmand, B., Lindeboom, J., Launer, L., Dinkgreve, M., Hooijer, C., Jonker, C. (1995). What is a significant score change on the Mini-Mental State Examination? International Journal of Geriatric Psychiatry, 10, 411-414.
- Schwamm, L. H., Van Dyke, C., Kiernan, R. J., Merrin, E. L., Mueller, J. (1987). The Neurobehavioral Cognitive Status Examination: Comparison with the Cognitive Capacity Screening Examination and the Mini-Mental State Examination in a neurosurgical population. Ann Intern Med, 107(4), 486-491.
- Shadlen, M. F., Larson, E. B., Gibbons, L., McCormick, W. C., Teri, L. (1999). Alzheimer’s disease symptom severity in Blacks and Whites. Journal of the American Geriatrics Society, 47, 482-486.
- Snowden, M., McCormick, W., Russo, J., Srebnik, D., Comtois, K., Bowen, J., Teri, L., Larson, E. B. (1999). Validity and responsiveness of the Minimum Data Set. Journal of the American Geriatrics Society, 47(8), 1000-1004.
- Suhr, J. A., Grace, J. (1999). Brief cognitive screening of right hemisphere stroke: Relation to functional outcome. Arch Phys Med Rehabil, 80(7), 773-776.
- Teng, E. L., Chui, H. C. (1987). The Modified Mini-Mental State (3MS) examination. J Clin Psychiatry, 48(8), 314-318.
- Tombaugh, T. N., McIntyre, N. J. (1992). The mini-mental state examination: A comprehensive review. J Am Geriatr Soc, 40(9), 922-935.
- Tombaugh, T. N., McDowell, I., Kristjansson, B., Hubley, A. M. (1996). Mini-Mental State Examination (MMSE) and the modified MMSE (3MS): A psychometric comparison and normative data. Psychol Assess, 8(1), 48-59.
- Uhlmann, R. F., Larson, E. B., Buchner, D. M. (1987). Correlations of Mini-Mental State and modified Dementia Rating Scale to measures of transitional health status in dementia. J Gerontol, 42(1), 33-36.
- Wechsler, D. (1981). Wechsler Adult Intelligence Scale-Revised: Test. New York: Harcourt Brace.
- Wechsler, D. (1955). Manual for the Wechsler Adult Intelligence Scale. New York: The Psychological Corporation.
- Wetherell, M., Darby, A., Emerson, K., & Miller, B. L. (1997). Mini-Mental State Examination performance in Alzheimer’s disease and frontotemporal dementia. International Journal of Rehabilitation and Health, 3, 253-265.
- Wind, A. W., Schellevis, F. G., van Staveren, G., Scholten, R. J. P. M., Jonker, C., van Eijk, J. M. (1997). Limitations of the mini-mental state examination in diagnosing dementia in general practice. International Journal of Geriatric Psychiatry, 12(1), 101-108.
- Winograd, C. H., Lemsky, C. M., Nevitt, M. C., Nordstrom, T. M., Stewart, A. L., Miller, C. J., Bloch, D. A. (1994). Development of a physical performance and mobility examination. J Am Geriatr Soc, 42(7), 743-749.
- Yesavage, J. A., Brink, T. L., Rose, T. L., Lum, O., Huang, V., Adey, M. B., Leirer, V. O. (1983). Development and validation of a geriatric depression screening scale: A preliminary report. Journal of Psychiatric Research, 17, 37-49.
- Zung, W. W. K. (1965). A self-rating depression scale. Arch Gen Psychiatry, 12, 63-70.
See the measure
How to obtain the MMSE
The MMSE can be obtained from the current copyright owner, Psychological Assessment Resources (PAR).