Harmonizing Depression Severity Scales Preserves Statistical Power

Researchers analyzed how categorical, continuous, and harmonized measures of depression relate to global functioning in older adults with bipolar disorder.

Harmonizing depression data from multiple small studies does not substantially reduce statistical power, which makes it a useful approach for examining specific subgroups, according to a study published in the Journal of Affective Disorders.

The researchers wanted to know more about how depression affects adults aged 60 years and older with bipolar disorder (BD). Most studies in this age group involve 50 or fewer participants. Harmonizing datasets would increase the sample size, but harmonization has been known to reduce sensitivity.

The researchers compared categorical, continuous, and harmonized measures of depression in relation to global functioning in a large dataset of older adults with BD. They used pooled data from 8 studies that included adults aged older than 50 years with BD and grouped the studies by the depression scale used: the Hamilton Depression Rating Scale (HAM-D), the Montgomery-Åsberg Depression Rating Scale (MADRS), and the Center for Epidemiologic Studies Depression Scale (CES-D).

The difference between effect sizes for the continuous and categorical variables was smallest for HAM-D and CES-D (about 6% and 2% less variance associated with the categorical depression severity predictor, respectively) and slightly larger for MADRS (about 14% less variance associated with the categorical variable). The effect size and explained variance for the model using the harmonized depression severity score in the full sample (η²=0.137, R²=0.291) were higher than those for both the categorical and continuous measures in the CES-D subsample, but lower than those for both measures in the HAM-D and MADRS subsamples, the researchers reported.
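For readers less familiar with the effect size reported above, η² is the share of total variance in the outcome that is attributable to a predictor. The following is a minimal sketch of that calculation for a single categorical predictor, not the study's analysis: the function name `eta_squared`, the category labels, and the functioning scores are all made up for illustration, and the published models likely included covariates, which is why η² and R² differ there.

```python
from statistics import mean


def eta_squared(groups):
    """Eta squared for one categorical predictor: SS_between / SS_total.

    `groups` maps a category label to a list of outcome scores.
    """
    all_scores = [score for scores in groups.values() for score in scores]
    grand_mean = mean(all_scores)
    # Total variation of every score around the grand mean.
    ss_total = sum((score - grand_mean) ** 2 for score in all_scores)
    # Variation of each group mean around the grand mean, weighted by group size.
    ss_between = sum(
        len(scores) * (mean(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    return ss_between / ss_total


# Hypothetical functioning scores by depression category (illustration only,
# not data from the study).
functioning = {
    "none": [80, 75, 85],
    "mild": [70, 65, 75],
    "severe": [50, 55, 45],
}

eta2 = eta_squared(functioning)
print(f"eta squared = {eta2:.3f}")
```

With only one categorical predictor and no covariates, this value equals the model's R². Collapsing a continuous score into categories discards within-category variation, which is why a categorical predictor typically accounts for somewhat less variance than the continuous score it was derived from.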

Across all measures, more severe depression symptoms were associated with worse functioning. However, the variances differed depending on the scale. “Harmonizing different depression scales into clinically relevant categories appears feasible without greatly reducing effect sizes,” researchers stated.

The analysis has limitations. The included studies differed in their inclusion and exclusion criteria, study designs, and other factors, which may have affected the results. The studies were conducted in the United States and the Netherlands, so the overall sample skewed toward White, educated adults, limiting the generalizability of the results.

Since the number of older adults with BD is increasing, “it is essential to gain more knowledge about this vulnerable group,” researchers concluded. “Future research might therefore explore these harmonization methods in relation to different cognitive measures to study the relationship between cognition and functioning in this group, and to investigate potential subgroups accounting for heterogeneity.”


Orhan M, Millett C, Klaus F, et al. Comparing continuous and harmonized measures of depression severity in older adults with bipolar disorder: relationship to functioning. J Affect Disord. Published online July 6, 2022. doi:10.1016/j.jad.2022.06.074