Abstract
Objective/methods: The Montreal Cognitive Assessment (MoCA) is an increasingly used screening tool for cognitive impairment. While it has been validated in multiple settings and languages, most studies have used a biased case-control design including healthy controls as comparisons not representing a clinical setting. The purpose of the present cross-sectional study is to test the criterion validity of the MoCA for mild cognitive impairment (MCI) and mild dementia (MD) in an old age psychiatry cohort (n = 710). The reference standard consists of a multidisciplinary, consensus-based diagnosis in accordance with international criteria. As a secondary outcome, the use of healthy community older adults as additional comparisons allowed us to underscore the effects of case-control spectrum-bias.
Results: The criterion validity of the MoCA for cognitive impairment (MCI + MD) in a case-control design, using healthy controls, was satisfactory (area under the curve [AUC] 0.93; specificity of 73% at a cutoff of less than 26), but declined in the cross-sectional design using referred but not cognitively impaired patients as comparisons (AUC 0.77; specificity of 37% at a cutoff of less than 26). In an old age psychiatry setting, the MoCA is valuable for confirming normal cognition (greater than or equal to 26, 95% sensitivity), excluding MD (greater than or equal to 21; negative predictive value [NPV] 98%) and excluding MCI (greater than or equal to 26; NPV 94%), but not for diagnosing MD (less than 21; positive predictive value [PPV] 31%) or MCI (less than 26; PPV 33%).
Conclusions: This study shows that validating the MoCA using healthy controls overestimates specificity. Taking clinical and demographic characteristics into account, the MoCA is a suitable screening tool in an old age psychiatry setting for distinguishing between those in need of further diagnostic investigations and those who are not, but not for diagnosing cognitive impairment.
1 INTRODUCTION
The Montreal Cognitive Assessment (MoCA) was developed as a brief screening test for mild cognitive impairment (MCI). It is widely used across the world in a variety of settings. The MoCA is recommended by the Alzheimer Society to objectively assess cognitive complaints in a clinical setting.
Even though more and more advocacy groups or policy makers favor screening for dementia, there is still debate about whether screening in various populations is wise. In our opinion, however, the setting of old age psychiatry is different. Knowing a patient's cognitive functioning at referral allows not only timely detection of dementia but also monitoring of all causes of MCI in old age psychiatry, so that (psychiatric) treatment can be adapted, eg, pharmacotherapy (including compliance) or psychotherapy. This matters especially because cognitive functioning in this population is at greater risk of changing, not only with age but also because of (psychotropic) medication or the reasons for referral. In The Netherlands, referrals to old age psychiatry consist of a mix of neurodegenerative and other psychiatric disorders, such as depression, bipolar disorders, schizophrenia, and severe anxiety disorders, all of which can be accompanied by poor cognitive functioning.
In our clinic, we introduced a short cognitive assessment using the MoCA for all referred patients. The aim was to reduce doctor's delay by adding an objective aid for triaging those in need of a specialized diagnostic route, and to obtain baseline cognitive data. Therefore, we need to know its diagnostic test accuracy in this setting.
The MoCA shows good validity in multiple languages, although only moderate validity in Dutch in a geriatric memory clinic setting. It is important to validate the MoCA in specific settings, as the selection of subjects with different characteristics may influence the test characteristics of a scale such as the MoCA. This is especially relevant in case-control study designs using community-based healthy controls (HC), as these are not representative of the clinical reality. The MoCA has not yet been validated in old age psychiatry settings, where patients are referred with multidimensional causes for MCI; to our knowledge, our study is the first to do so. Differentiation between cognitive impairment as a consequence of a psychiatric disease and/or as a consequence of early stage dementia is complicated and may affect the test characteristics of the MoCA.
According to the Cochrane review, “the MoCA may help identify people requiring specialist assessment and treatment for dementia.”
We aim to validate the MoCA in this clinical setting following the standards for reporting diagnostic accuracy (STARD 2015) recommendations by using a cross-sectional study design. The purpose of the present study is to test the criterion validity (ie, can the MoCA predict a diagnosis correctly) of the MoCA to detect MCI and early stage/mild dementia (MD) in an old age psychiatry cohort including referred but not cognitively impaired patients as primary comparisons. The reference standard consists of a multidisciplinary, consensus-based diagnosis in accordance with international criteria. The above cross-sectional design avoids the spectrum bias of most case-control studies, in which the extremes of the spectrum of cognitive function are included. To illustrate this effect, we present as a secondary outcome the MoCA results in a case-control design, using community-based HC with normal cognitive aging as secondary comparisons.
2 METHOD
2.1 Sample
This study was performed in an old age (60 years +) psychiatry outpatient clinic in a large Dutch city (Utrecht), which offers services to the north-west side of the city and its rural surroundings (57,000 inhabitants aged 60+ in the north-west). Between 2008 and 2018, all newly referred patients were eligible for this study. The inclusion criterion was the ability to give written informed consent. Therefore, patients referred with severe dementia (Global Deterioration Scale [GDS] greater than or equal to 6), Behavioral and Psychological Symptoms of Dementia (BPSD), or compulsory referrals were not included.
Participants were assessed by a multidisciplinary team, on all occasions including an old age psychiatrist and a trained psychiatric nurse practitioner. After referral, patients with an obvious cause of their cognitive complaints were excluded to resemble a clinical screening population: Those with a diagnosis of severe mid-stage dementia (GDS greater than or equal to 5), a recent history of substance abuse (<2 years), recent delirium (<6 months), or an acquired brain injury including cerebrovascular accident (CVA) or transient ischemic attack (TIA). In addition, patients with insufficient command of the Dutch language were excluded.
The secondary study compares the test properties of the MoCA with an unrealistic situation: a group of community-based HC, age 60+. They were recruited from acquaintances of patients or research assistants. Inclusion criteria were no cognitive complaints and no risk factors for cognitive dysfunction. Exclusion criteria were acquired brain injury including CVA or TIA, substance abuse, recent delirium, recent treatment for psychiatric or neurologic diseases, and use of medication that can alter cognitive functioning. From potential HC showing signs of cognitive impairment during the interview or with a MoCA score below 25, consent was obtained to interview the next of kin (n = 11), who were assessed with the Modified Informant Questionnaire on Cognitive Decline in the Elderly (IQCode). No potential HC had an IQCode higher than 3.5, which would indicate potential moderate cognitive impairment and would be an exclusion criterion.
The Committee for Research and Ethics of the institution approved this study (CWO-nr 1606). All participants gave their informed consent. Data are available on request due to privacy/ethical restrictions.
2.2 Measurements
Initial assessment was performed by an old age psychiatrist, including a medical history obtained from the next of kin and relevant laboratory tests for cognitive impairment. During the diagnostic procedure, the 15-item Geriatric Depression Scale (GDS15) and the Global Assessment of Functioning (GAF) were collected. Investigation of instrumental activities of daily living (IADL) was done by a psychiatric nurse practitioner on a home visit. When this initial assessment raised any suspicion of cognitive impairment, further assessment took place with a neuropsychological assessment (n = 289) and, when applicable, CT/MRI imaging and cerebrospinal fluid (CSF) analysis. The neuropsychological assessment, done by a neuropsychologist not aware of the MoCA score, was an extensive and comprehensive assessment including multiple tests in the domains of memory, attention, executive function, fluid intelligence, and language capacities.
Full tests of: Dutch reading test for adults to estimate premorbid intelligence (“Nederlandse Leestest voor Volwassenen,” NLV); Proverbs; Zung 12; Self-rating Depression scale (ZDS); Raven Coloured Progressive Matrices; Questionnaire for orientation and personal and non-personal episodic memories (“Toutenburger Vragenlijst”); Visual Association Test (VAT); Fifteen words imprinting and recall or recognition; Copying of Drawings (Meander of Luria, Complex figure of Rey, House, Cube, Greek cross); D-KEFS Trail Making Test A and B (TMT); Hooper Visual Organization test (VOT, short version); Calculation; Spelling and reading; Binet-Bobertag story; Fluency test, category (and letter); Groninger Intelligence test (GIT); Clock reading and writing. Subtests of: Wechsler Adult Intelligence Scale IV (WAIS IV; Symbol substitution, Numerical series/Digit Span, Agreements/Similarities, Figures; Figure Weights); Wechsler Memory Scale IV (WMS IV; numerical series); Behavioral Assessment of the Dysexecutive Syndrome (BADS; Keysearch test and Zoo-plan test).
The HC were interviewed and assessed by research assistants. The assessment was carried out in a single day and included the MoCA, the GDS15, and GAF.
2.3 Diagnostic test
All referred participants were assessed with a MoCA as soon as possible, within a maximum of 3 months from referral, by a trained research assistant or psychiatric nurse practitioner. This was independent from the diagnostic procedure. The MoCA was assessed during the feedback appointment of the initial assessment when the treatment plan was presented. The treatment plan included referral to our memory clinic for further assessment if there was doubt or suspicion of CI.
The MoCA consists of one page, covering the cognitive domains of executive function and visuospatial abilities, naming, short-term memory, attention and working memory, language, concentration, verbal abstraction, and orientation. It can be carried out within 10 minutes, with a maximum score of 30 indicating no errors were made. Scores were corrected for low education according to instructions, by adding one point to the total score of patients with 12 years of education or less. The originally suggested cutoff for the diagnosis of CI was a score below 26.
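The scoring rule described above can be sketched in a few lines. This is a minimal illustration, not the authors' scoring software: the function names are hypothetical, and capping the corrected total at 30 is an assumption that the text does not state explicitly.

```python
MAX_SCORE = 30            # maximum MoCA total, no errors made
LOW_EDUCATION_YEARS = 12  # 12 years of education or less earns one extra point
CI_CUTOFF = 26            # totals below 26 screen positive for cognitive impairment


def corrected_moca(raw_score: int, education_years: int) -> int:
    """Return the education-corrected MoCA total (hypothetical helper)."""
    if not 0 <= raw_score <= MAX_SCORE:
        raise ValueError("raw MoCA score must be between 0 and 30")
    bonus = 1 if education_years <= LOW_EDUCATION_YEARS else 0
    # Assumption: the corrected total cannot exceed the 30-point maximum.
    return min(raw_score + bonus, MAX_SCORE)


def screens_positive(raw_score: int, education_years: int,
                     cutoff: int = CI_CUTOFF) -> bool:
    """True when the corrected total falls below the chosen cutoff."""
    return corrected_moca(raw_score, education_years) < cutoff
```

Under this sketch, a raw score of 25 screens negative for a patient with 10 years of education (corrected total 26) but positive for a patient with 16 years of education.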
2.4 Reference test
The reference test was the diagnosis determined at multidisciplinary team meetings, including an old age psychiatrist, neuropsychologist, and geriatrician.
The diagnoses of dementia and MCI were supported by a minimum of a neuropsychological assessment and laboratory tests. The diagnoses were made in consensus and in accordance with the MCI criteria as proposed by an international consortium or the Dutch guideline on dementia. This guideline covers the criteria of DSM IV for dementia, NIA-AA/NINCDS-ADRDA for Alzheimer disease, NINDS-AIREN/AHA-ASA for vascular dementia, frontotemporal dementia (FTD) according to The Lund and Manchester Groups and the Consensus for Dementia with Lewy Body. The MCI group included those with MCI due to psychiatric causes, in accordance with the international consensus. No further differentiation of MCI was made in this study. The results of the MoCA were not used to diagnose MCI or dementia.
Referred patients without suspicion of CI during initial assessment were followed up for a minimum of 2 years to compensate for not having a neuropsychological assessment. Patients who did not meet the aforementioned criteria for a diagnosis of dementia or MCI during follow-up were classified as no cognitive impairment (NoCI). Patients who did meet the aforementioned criteria after the initial 3 months during follow-up were classified as inconclusive to be cautious (n = 3).
2.5 Statistical analyses
Results were compared within the referred patients (MD, MCI, or NoCI) and between the total group of referred patients (MD + MCI + NoCI) and HC, using the Statistical Package for the Social Sciences (SPSS, version 22; SPSS Inc., Chicago, IL). A chi-squared test was used to compare sex and education. ANOVA was used to compare age, GAF, GDS15, and MoCA scores, followed by a least significant difference (LSD) post hoc test (and a Bonferroni post hoc test, not shown). An ANCOVA with age as a covariate was run additionally.
Using receiver operating characteristic (ROC) analysis, the area under the curve (AUC) was calculated as a measure for the diagnostic accuracy of the MoCA. As the MoCA can be used to detect dementia in a clinical setting as well as to rule out cognitive impairment in a clinical setting, we calculated different ROC curves: (a) to detect dementia in a clinical setting, (b) to detect cognitive impairment (MD + MCI) in a clinical setting, and (c) to detect MCI in a subgroup of patients (MD excluded). To compare these analyses with previous case-control studies and to see the effect of bias, all analyses were repeated with HC.
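For readers less familiar with the AUC, the sketch below shows one standard way to compute it: as the probability that a randomly chosen impaired patient scores lower on the MoCA than a randomly chosen unimpaired participant, with ties counted as one half. The analyses in this study were run in SPSS; the function name and the scores below are hypothetical and serve only to illustrate the calculation.

```python
def auc_lower_is_impaired(impaired_scores, unimpaired_scores):
    """AUC for a test on which lower scores indicate impairment.

    Compares every impaired/unimpaired pair: a lower impaired score counts
    as 1, a tie as 0.5, and a higher impaired score as 0.
    """
    wins = 0.0
    for case in impaired_scores:
        for control in unimpaired_scores:
            if case < control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(impaired_scores) * len(unimpaired_scores))


# Hypothetical MoCA totals, invented for illustration only.
impaired = [18, 20, 22, 24]
unimpaired = [22, 25, 27, 29]
auc = auc_lower_is_impaired(impaired, unimpaired)
```

An AUC of 0.5 corresponds to no diagnostic accuracy, and the more the two score distributions overlap, the closer the AUC moves toward that value.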
PPV and NPV were calculated for the “optimal” cutoff scores as calculated by the Youden J index. Cronbach alpha was calculated for internal consistency of the MoCA.
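The Youden J index is defined as sensitivity + specificity - 1, and the "optimal" cutoff is the one that maximizes it. The sketch below applies this definition to the two MCI versus NoCI cutoffs whose sensitivity and specificity are reported in the Results section; the variable and function names are hypothetical, and the rounded percentages are used as inputs.

```python
def youden_j(sensitivity: float, specificity: float) -> float:
    """Youden J index: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# cutoff ("less than") -> (sensitivity, specificity), MCI versus referred NoCI,
# taken from the rounded values reported in the Results section.
mci_vs_noci = {
    26: (0.94, 0.37),
    25: (0.88, 0.49),
}

j_by_cutoff = {cutoff: youden_j(*pair) for cutoff, pair in mci_vs_noci.items()}
best_cutoff = max(j_by_cutoff, key=j_by_cutoff.get)
```

With these inputs, J is about 0.31 at less than 26 and about 0.37 at less than 25, so the lower cutoff is preferred by this criterion. Note that J weighs sensitivity and specificity equally, which may not reflect the priorities of a screening setting.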
3 RESULTS
3.1 Study groups
Out of 2204 referrals, 1337 were not eligible for this study. Eight hundred sixty-seven referred patients were assessed with a MoCA for this study (mean delay 21.5 days, 65% within 3 weeks of referral). After applying the exclusion criteria (Figure 1), a group of 710 participants remained: 81 MD, 153 MCI, 459 referred patients with no MCI or dementia (NoCI), and 17 inconclusive. Mean time needed for diagnosing was 40.5 days for the NoCI and 60.8 days for the CI. For the secondary outcome, 84 HC were included from a group of 96 potential healthy volunteers (Figure 1). Two of them had an IQCode between 3.25 and 3.5, indicating minor decline over the past 10 years. All others scored between 3.0 and 3.25, indicating (almost) no decline.
Figure 1

Flowchart referred patients and healthy controls. MCI: mild cognitive impairment; NoCI: no cognitive impairment; HC: healthy controls; GDS: Global Deterioration Scale; IQCode: Informant Questionnaire on Cognitive Decline in the Elderly; BPSD: Behavioral and Psychological Symptoms of Dementia
3.2 Demographic findings
Within the referred patients, there was a significant difference in age between the diagnostic groups (ANOVA F = 26.0, P < .001), as expected. There was no significant difference in sex (P = .39) or education length (P = .142). Disability, as measured by the GAF, showed an expected difference: the MCI group had the best GAF score, and the dementia and NoCI groups (as most of them were psychiatrically ill) the worst (P = .001). The GDS15 showed no significant differences between the referred groups.
As for the secondary outcome, there were no significant differences in age, education, and sex between the population of referred patients and the HC (Table 1). The significant differences in GDS15 and GAF were as expected; the HC had significantly fewer depressive symptoms (GDS15-score) and better global functioning (GAF score).
Table 1. Key demographic and clinical characteristics

Note. Education and sex were compared between b, c, and d and between a and e with a chi-squared test. Groups b, c, and d were compared with ANOVA. Groups a and e were compared with t test.
Abbreviations: GAF, Global Assessment of Functioning; GDS15, Geriatric Depression Scale 15 question version; MCI, mild cognitive impairment; MoCA, Montreal Cognitive Assessment; NoCI, no cognitive impairment.
3.3 MoCA outcome
The mean MoCA scores differed significantly between groups; the differences in average MoCA scores between the individual referred groups were significant (P < .001), as were those for the secondary outcome between the combined total referred group and the HC (P < .001). The standard deviations (MCI towards NoCI) and range (all groups) of the referred groups did overlap and showed a wide distribution (Table 1). The internal consistency of the MoCA, as expressed by the Cronbach alpha on the standardized items (.761), was good. All 12 items of the MoCA contribute to a positive Cronbach alpha, as no item “if item deleted” gives a higher outcome (.708-.737).
The results of the ROC analysis, for clinical situations, are shown in Figure 2A,B. Table 2 displays the AUCs of these and additional analyses, as well as their sensitivity and specificity at the literature-recommended cutoff scores of 26 and 21. All AUCs were significantly different from 0.5 (no diagnostic accuracy), P < .001. The AUCs with HC as secondary comparison ranged between 0.90 and 0.98, an excellent accuracy. The MoCA performed less well in a clinical setting, with AUCs between 0.70 and 0.87.
Figure 2

Results of receiver operating characteristic (ROC) analysis. A, dementia versus no dementia (mild cognitive impairment [MCI] + no cognitive impairment [NoCI]); B, cognitive impairment (CI = mild dementia [MD] + MCI) versus NoCI [Colour figure can be viewed at wileyonlinelibrary.com]
Table 2. The effect of using HC instead of NoCI as comparisons on area under the curve between variations of groups and their sensitivity and specificity at cutoff scores 26 and 21, often used in literature.

Note. Dem: dementia (n = 81); NoDem: no dementia (MCI + NoCI; n = 612); MCI: mild cognitive impairment (n = 153); NoCI: referred patients no cognitive impairment (n = 459); HC: healthy controls (n = 84); CI: cognitive impairment (Dem + MCI; n = 234). AUC: area under the curve. SE: standard error. Sens: sensitivity. Spec: specificity.
For the original suggested cutoff score of 26 to discriminate MCI from HC, the sensitivity and specificity are 94% and 73%, respectively (in the original article 90% and 87%). Using the same cutoff score in a realistic setting (ie, discriminating against referred NoCI) leads to a drop in specificity to 37%. The clinical situation of detecting CI (MD + MCI) below this cutoff had a sensitivity of 95%.
A cutoff score for diagnosing dementia is still under debate but is often set around 21, which in our study results in a sensitivity of 90%. The specificity dropped from 99% using dementia vs HC to 74% in a clinical setting (dementia vs MCI + NoCI), and 63% for dementia vs MCI. To find the “best” cutoff score for our population, the specificity and sensitivity were calculated for different scores of the MoCA (Table 3).
Table 3. Sensitivity and Specificity at MoCA scores from 28 to 18

Note. Dem: dementia (n = 81); MCI: mild cognitive impairment (n = 153); NoCI: referred patients no cognitive impairment (n = 459); HC: healthy controls (n = 84); CI: cognitive impairment (Dem + MCI; n = 234).
a MoCA-D below score.
The “optimum” cutoff scores against NoCI as calculated by the Youden index were less than 25 for detecting MCI, sensitivity 88% (95% CI, 81-92), specificity 49% (95% CI, 44-53); less than 23 for CI, sensitivity 75% (95% CI, 69-81), specificity 68% (95% CI, 63-72); and less than 21 for MD, sensitivity 90% (95% CI, 81-95), specificity 78% (95% CI, 74-81), comparable with the literature.
The PPV and NPV were calculated (Table 4) for the two scores with the highest computed Youden index. The PPV and the NPV show different results: the PPV was low in almost all situations, whereas the NPV was high in all situations. Using a cutoff of less than 21 for dementia, 31% of patients with a positive MoCA have MD and 98% of patients with a negative test do not have MD. For detecting MCI at a cutoff of less than 26, 33% of patients with a positive MoCA do indeed have MCI, and 94% of patients scoring at or above this threshold do not have MCI.
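The dependence of the predictive values on prevalence can be made explicit with the standard formulas. The sketch below is an illustration only: it uses the rounded MCI sensitivity and specificity reported above and assumes that the relevant prevalence is that of MCI among MCI plus NoCI patients (153 of 612), so the output agrees with the reported values only approximately. The function name is hypothetical.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumption: MD patients are excluded, so prevalence = MCI / (MCI + NoCI).
prevalence_mci = 153 / (153 + 459)

# Cutoff of less than 26: sensitivity 94%, specificity 37% against referred NoCI.
ppv_mci, npv_mci = predictive_values(0.94, 0.37, prevalence_mci)
```

With these rounded inputs the PPV comes out near 33% and the NPV near 95%, close to the reported 33% and 94%. The same function shows why specificity measured against healthy controls flatters the test: raising the specificity to 73% at the same prevalence would lift the PPV considerably.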
Table 4. Positive and negative predictive values of cutoff scores with the highest Youden index

Note. Dem: dementia (n = 81); MCI: mild cognitive impairment (n = 153); NoCI: referred patients no cognitive impairment (n = 459); HC: healthy controls (n = 84); CI: cognitive impairment (Dem + MCI; n = 234). PPV: positive predictive value; NPV: negative predictive value; 95% CI, 95% confidence intervals.
a MoCA-D below score.
4 DISCUSSION
In this cross-sectional study, patients with dementia were significantly older than those without. There were more females in each group, which is representative of the population referred to old age psychiatry. Age has been shown to be of influence, as MoCA scores decline with aging and can alter the (interpretation of) results. However, age has little unique variance and a correlation of less than 10%. An additional ANCOVA sensitivity analysis with age as a covariate still showed significant differences in MoCA scores between the different diagnostic groups in our study.
The GDS15, a geriatric depression scale, revealed no differences between the referred groups. This finding underscores again the necessity to be cautious when using a screening tool like the GDS15 in attempting to differentiate between or detect psychiatric causes of cognitive complaints.
Our study reproduced the significantly different mean MoCA scores reported in previous literature. Our secondary outcome, differentiating patients with MD or MCI from HC, shows properties comparable to those reported in previous case-control studies. To avoid this spectrum bias, however, we studied the MoCA in a cohort of patients referred to old age psychiatry, which more accurately represents the clinical reality. This is illustrated in Table 3, where the AUC and specificity drop when the comparison is realistic (NoCI as comparisons) and not artificial (HC as comparisons). One can argue that the bias we underscore by adding HC is well known and that its effect on the AUC has been shown before. Nevertheless, it remains important to stress the effect it has on optimum cutoff scores, as the case-control design still accounts for the majority of MoCA validation studies. Clinicians should be cautious about using cutoffs based on those studies. Twenty-seven percent of the HC had a MoCA score below 26, compared with 63% of the referred NoCI. The MoCA scores of our NoCI patients match those of a longitudinal, population-based study (n = 2653; mean MoCA 23.36, 64% specificity at less than 26), indicating that we have a realistic comparison group. Even though there was a wide range of MoCA scores in our group, this occurred in a clinical setting and can be explained by the following.
False negative results were found in cases of high educational and/or professional levels or FTD in the dementia group. False positive results occurred due to a lack of motivation and/or attention in depressed, manic, or psychotic patients, with or without MCI. One may argue the latter should have been diagnosed with MCI due to their psychiatric conditions. However, it was the clinical opinion of the team, after IADL investigation, that their presentation was not persistent and did not justify a diagnosis of MCI, as the MoCA score was not taken into account.
There is a risk, including in this study, of a subjective decision whether MCI is diagnosed or not when a psychiatric disorder explains its etiology, despite the criteria for MCI being met. We minimized this by including a neuropsychological assessment during the diagnostic work-up when there was suspicion of persistent impaired cognitive functioning. In the future, the MoCA would make it easier and more objective to select these possible MCIs and identify those in need of a further work-up.
False positives (ie, a low MoCA score) due to unrecognized neurodegenerative MCI can be excluded in our study, as progression to any DSM IV diagnosis of cognitive impairment was monitored with a mean follow-up of 3.5 years.
This study shows it is safe to use a threshold of greater than or equal to 26 to indicate normal cognition (95% sensitivity for CI), taking specific situations, like a university degree or FTD, into account.
While the MoCA detects most MD (less than 21; 90% sensitivity) and MCI (less than 26; 94% sensitivity) below these cutoff scores, making it fit for screening, it is not suitable for diagnosing MD or MCI in our study population, as the PPV for MD and MCI are still only fair (31% and 33% PPV, respectively). The proportion of referred psychiatric patients scoring below these cutoff scores is too high for diagnostic purposes (22% and 63% of NoCI, respectively).
The MoCA is suitable for excluding dementia (greater than or equal to 21; NPV 92%-98%) and MCI (greater than or equal to 26; NPV 94%), if used to assess patients referred to an old age psychiatry setting. This, combined with the high sensitivity at these cutoffs, makes the MoCA a useful screening tool.
In the case of a positive test result, further work-up is usually necessary; the absolute number of false positives is substantial, since the majority of referred patients do not suffer from MD.
Using our study cohort as an example, applying a MoCA cutoff of less than 21 to screen 100 referred patients would lead to 33 patients receiving specialized diagnostic tests, of whom 14.7 would be NoCI, 8.2 MCI, and 10.5 correctly identified MD. One patient (1.15) with MD would not be detected using this cutoff score. This confirms that screening comes with its price, also in old age psychiatry.
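The arithmetic behind this example can be retraced as follows. This is a reconstruction under stated assumptions rather than the authors' own calculation: it takes the 693 patients with a conclusive diagnosis as the denominator (excluding the 17 inconclusive cases) and uses the rounded sensitivity and specificities reported in the Results, so individual figures can differ from the text by a tenth or so.

```python
# Conclusive diagnoses in the referred cohort.
counts = {"MD": 81, "MCI": 153, "NoCI": 459}
total = sum(counts.values())  # 693; assumption: inconclusive cases excluded

# Expected group sizes per 100 referred patients.
per_100 = {group: 100 * n / total for group, n in counts.items()}

# Rounded test characteristics at a cutoff of less than 21.
sens_md = 0.90       # sensitivity for MD
spec_vs_noci = 0.78  # specificity against referred NoCI
spec_vs_mci = 0.63   # specificity against MCI

true_pos = sens_md * per_100["MD"]                     # MD correctly flagged
missed_md = per_100["MD"] - true_pos                   # MD scoring 21 or higher
false_pos_noci = (1 - spec_vs_noci) * per_100["NoCI"]  # NoCI scoring below 21
false_pos_mci = (1 - spec_vs_mci) * per_100["MCI"]     # MCI scoring below 21

referred_on = true_pos + false_pos_noci + false_pos_mci
ppv = true_pos / referred_on
npv = (100 - referred_on - missed_md) / (100 - referred_on)
```

Under these assumptions roughly 33 of every 100 referred patients score below 21, about 10.5 of whom have MD, which gives a PPV near 31% and an NPV near 98%, in line with the values reported above.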
We recommend further research to find methods that increase the specificity and improve selection of those in need of a specialized diagnostic pathway. The aforementioned weaknesses of our study (unrealistic scattering and seemingly missed CI diagnoses) would in practice be interpreted as part of a larger clinical picture; incongruous results would prompt reconsideration of whether these MoCA scores are clinically relevant or correct, or whether the patient should be considered as having CI. This would increase the specificity of the MoCA. Further research should focus on the suspected CI referrals only and investigate whether a MoCA reassessment after recovery from serious psychiatric episodes can lower the false positive rate. Another limitation is that we did not give all the comparisons the same full diagnostic assessment, due to practicality and resource constraints. Because adding the HC was mainly to underscore the spectrum-bias effect, this is in our opinion acceptable.
The NoCIs that were not suspected of CI, and hence did not get a full diagnostic work-up, were followed for at least 2 years to compensate for this limitation. The NoCIs that were suspected of CI did get the same full diagnostic assessment. Excluding patients with a GDS greater than or equal to 5 and those with BPSD could be seen as selection bias and a limitation. In our opinion, avoiding the extremes of the spectrum is a strength of our study. The clinical reality is that patients with obvious dementia will not be screened to determine whether they need a specialized diagnostic route, and including their low MoCA scores in the study would bias the results.
5 CONCLUSION
This study shows that validating the MoCA in a biased setting, ie, against HC, overestimates specificity. Our findings are in line with the literature, where lower cutoff scores are repeatedly suggested to tackle this problem.
Taking the above results into account, one can conclude that the MoCA can be useful in an old age psychiatric setting to confirm normal cognitive functioning and to identify those who are in need of a specialized diagnostic pathway. However, further research is necessary to minimize the number of false positives in the latter group.