AUTHORS:
Jonathan F. Bauer Department for Clinical Psychology and Psychotherapy, Friedrich-Alexander-Universität, Erlangen, Germany
Maurice Gerzcuk Chair of Embedded Intelligence for Health Care & Wellbeing, University of Augsburg, Augsburg, Germany
Björn Schuller Munich Data Science Institute, Technical University Munich, Munich, Germany
Matthias Berking Department for Clinical Psychology and Psychotherapy, Friedrich-Alexander-Universität, Erlangen, Germany
ABSTRACT:
Machine learning classification of depression from paralinguistic speech parameters offers a novel approach to detecting depression. However, it is unclear how different types of speech recordings affect classification accuracy. We suggest that recordings of free speech containing anti-depressive statements may be particularly suitable for depression classification.
To test this hypothesis, we conducted Structured Clinical Interviews for DSM-5 with suitable candidates to determine depression diagnoses, resulting in a final sample of 48 clinically depressed individuals, 48 sub-clinically depressed individuals, and 48 non-depressed individuals. Participants from each group completed up to four speech tasks: they read aloud neutral texts, read aloud scripted depressive statements, came up with and expressed their own anti-depressive statements, and, for 50% of participants, read aloud scripted anti-depressive statements. For each speech type, separate models for classifying current depression were trained with each of two different state-of-the-art machine learning methods.
We found that training a depression classification model on recordings of anti-depressive statements was not superior to training models on other types of speech recordings. The only significant difference was that the model trained on recordings of neutral read speech achieved better accuracy than the model trained on recordings of depressive read speech.
We could not confirm our hypothesis that recordings of anti-depressive statements would result in superior depression classification accuracy compared to recordings of neutral text reading. Eliciting depression-related speech may reduce affective variability in individuals’ responses and therefore diminish depression-discriminative information. Our findings provide important directions for future research aimed at optimizing speech elicitation tasks for depression classification.
Keywords: Depression, Speech, Voice, Machine Learning, Speech Elicitation
Conference Venue: Male, Maldives
Conference Date: 5-7 November 2024
ISBN Number: 978-625-00-7517-3
DOI Number: https://doi.org/10.53375/imhsc.2024.126