Identifying Depression: How Accurate is the PHQ-2?
The accuracy of the PHQ-2 in identifying major depression is lower than originally thought, according to a new meta-analysis conducted by a team of researchers in the United Kingdom and published in the Journal of Affective Disorders.1
The researchers analyzed data from 21 studies including 11,175 individuals, of whom 1,529 were diagnosed with major depressive disorder (MDD). Of the 21 studies, 19 reported data for a cut-off point of ≥3 and 17 reported data for an alternative cut-off point of ≥2. The researchers combined data from the studies using bivariate diagnostic meta-analysis to derive sensitivity, specificity, likelihood ratios, and diagnostic odds ratios.
Clinical settings varied, and studies included data from outpatient primary care, neurology, ob/gyn, geriatric, and university services. The wide range of patient populations included healthy adults as well as patients with epilepsy, stroke, movement disorders, addiction, or cardiovascular disease.
Studies utilizing the ≥3 cut-off point had pooled sensitivity of 0.76 and pooled specificity of 0.87. At this cut-off, the researchers reported “substantial heterogeneity.” The studies that used the ≥2 cut-off point found lower heterogeneity and higher pooled sensitivity (0.91), but lower specificity (0.70).
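As a rough illustration of how likelihood ratios and a diagnostic odds ratio relate to sensitivity and specificity, the sketch below computes them directly from the pooled estimates reported above. The function name is hypothetical, and these back-of-envelope values are an assumption-laden approximation: they are not the figures reported by the bivariate meta-analysis, which estimates such quantities jointly across studies.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-, diagnostic odds ratio) for a single sens/spec pair."""
    lr_pos = sensitivity / (1 - specificity)   # how much a positive screen raises the odds
    lr_neg = (1 - sensitivity) / specificity   # how much a negative screen lowers the odds
    dor = lr_pos / lr_neg                      # diagnostic odds ratio
    return lr_pos, lr_neg, dor


# Pooled sensitivity and specificity from the article, keyed by PHQ-2 cut-off.
POOLED = {">=3": (0.76, 0.87), ">=2": (0.91, 0.70)}

for cutoff, (sens, spec) in POOLED.items():
    lr_pos, lr_neg, dor = likelihood_ratios(sens, spec)
    print(f"cut-off {cutoff}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, DOR = {dor:.1f}")
```

Read this way, the ≥3 cut-off gives the larger positive likelihood ratio (a positive screen is more informative), while the ≥2 cut-off gives the smaller negative likelihood ratio (a negative screen is more reassuring).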
The researchers compared their findings to the original validation study of the PHQ-2, conducted by Kroenke and colleagues,2 which utilized a ≥3 cut-off point and found a sensitivity of 0.83 and a specificity of 0.90 to identify depression in a sample of primary and secondary care patients.
The current meta-analysis found lower sensitivity at the ≥3 cut-off than the original study did (0.76 vs 0.83). According to the researchers, this finding indicates that the cut-off of ≥2 “might be preferable if clinicians want to ensure that few cases of depression are missed.” In situations with a low prevalence of depression, however, “this may result in an unacceptably high false-positive rate because of the modest specificity,” they noted.
The extent to which this might be problematic would “depend on the prevalence of depression in which the screen is being used and the cost and availability of strategies to further assess those who score positively on the initial screen.” For this reason, the cut-off point of ≥2 might not be a “useful clinical tool,” although it may be helpful in screening situations in groups known to be at high risk for depression.
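The dependence on prevalence can be made concrete with a short calculation. The sketch below applies Bayes' rule to the pooled sensitivity and specificity reported above to estimate the positive predictive value (PPV), the share of positive screens that are true cases. The prevalence values and the function name are hypothetical choices for illustration and do not come from the meta-analysis.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of positive screens that are true cases, via Bayes' rule."""
    true_pos = sensitivity * prevalence                # screened positive and depressed
    false_pos = (1 - specificity) * (1 - prevalence)   # screened positive, not depressed
    return true_pos / (true_pos + false_pos)


# Pooled sensitivity and specificity from the article, keyed by PHQ-2 cut-off.
POOLED = {">=3": (0.76, 0.87), ">=2": (0.91, 0.70)}

# Hypothetical prevalence levels (assumed for illustration only).
PREVALENCES = [0.05, 0.10, 0.30]

for prevalence in PREVALENCES:
    for cutoff, (sens, spec) in POOLED.items():
        ppv = positive_predictive_value(sens, spec, prevalence)
        print(f"prevalence {prevalence:.0%}, cut-off {cutoff}: PPV = {ppv:.2f}")
```

Under these assumptions, at 5% prevalence the ≥2 cut-off yields a PPV of about 0.14, so most positive screens would be false positives, whereas at 30% prevalence the PPV rises to about 0.57.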
1. Manea L, Gilbody S, Hewitt C, et al. Identifying depression with the PHQ-2: a diagnostic meta-analysis. J Affect Disord. 2016. doi: 10.1016/j.jad.2016.06.003. [Epub ahead of print]
2. Kroenke K, Spitzer RL, Williams JB. The patient health questionnaire-2: validity of a two-item depression screener. Med Care. 2003;41:1284-1292.