The Patient Health Questionnaire-9 (PHQ-9) markedly overestimates the prevalence of depression, by nearly 12 percentage points compared with validated semistructured diagnostic interviews, according to the results of a study published in the Journal of Clinical Epidemiology.

The classification of major depression requires the use of validated diagnostic interviews, but conducting interviews to determine the prevalence of depression in the general population requires extensive mental health resources. Self-report questionnaires or screening tools lack key components of diagnostic interviews, such as an assessment of functional impairment or an evaluation of nonpsychiatric disorders that may cause similar symptoms.

Brooke Levis, PhD, from the department of epidemiology, biostatistics and occupational health, McGill University, Montreal, Quebec, Canada, and colleagues analyzed data from 9242 participants, including 1389 cases of depression, from 44 primary studies. They sought to determine how well the PHQ-9, the most commonly used depression screening tool in primary care, estimates the prevalence of major depression in the overall population. To do so, they compared prevalence estimates based on the PHQ-9 with those based on the Structured Clinical Interview for DSM (SCID).

Using a PHQ-9 cutoff score of ≥10, the pooled prevalence of major depression was 24.6%, whereas the pooled SCID-based prevalence was 12.1%, with a pooled difference of 11.9%. A PHQ-9 cutoff of ≥14 and the PHQ-9 diagnostic algorithm yielded prevalence estimates closer to the SCID results. However, study-level prevalence still differed from SCID-based prevalence by an average absolute difference of 4.8% for PHQ-9 ≥14 and 5.6% for the PHQ-9 diagnostic algorithm. Furthermore, these approaches both overestimated and underestimated the prevalence of depression across individual studies, reflecting the considerable heterogeneity among them.
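
One way to see why a screening cutoff can inflate prevalence is to look at how false positives add up in a population where most people are not depressed. The sketch below is only an illustration: the function name and the sensitivity and specificity values are hypothetical and are not taken from the study; only the 12.1% interview-based prevalence comes from the results above.

```python
def apparent_prevalence(true_prevalence, sensitivity, specificity):
    """Share of people who screen positive: true positives plus false positives."""
    true_positives = sensitivity * true_prevalence
    false_positives = (1.0 - specificity) * (1.0 - true_prevalence)
    return true_positives + false_positives


# Interview-based (SCID) prevalence reported in the article.
scid_prevalence = 0.121

# Hypothetical accuracy values, chosen only for illustration (not from the study).
sensitivity = 0.85
specificity = 0.85

screen_positive = apparent_prevalence(scid_prevalence, sensitivity, specificity)
excess = screen_positive - scid_prevalence

print(f"Screen-positive share: {screen_positive:.1%}")
print(f"Excess over interview-based prevalence: {excess:.1%}")
```

Under these assumed values, more than half of the people who screen positive would be false positives, simply because the group without depression is much larger than the group with it. Changing the assumed sensitivity and specificity shifts the result in either direction.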

The researchers noted a number of limitations. They were unable to include data from 14 of 58 eligible published data sets, and data sets were almost all from patients in healthcare settings, which may have contributed to error variance. In addition, the data sets came from a wide variety of study settings, which may have resulted in the heterogeneity observed.

“Researchers sometimes report prevalence estimates based on cutoffs from questionnaires, including the PHQ-9, as prevalence of ‘clinically significant’ symptoms or ‘symptoms’ of depression, rather than ‘depression,’” the investigators noted. “[S]creening tool cutoffs do not reflect a meaningful divide between impairment and non-impairment.”

Disclosure: Several study authors reported conflicts of interest. Please see the original study for a full list of disclosures.


Levis B, Benedetti A, Ioannidis JPA, et al. Patient Health Questionnaire-9 scores do not accurately estimate depression prevalence: individual participant data meta-analysis [published online February 24, 2020]. J Clin Epidemiol. doi:10.1016/j.jclinepi.2020.02.002