MIT’s Neural Network uses Text & Audio to Perform Sentiment Analysis for Healthcare
- A neural network model designed by MIT’s researchers can diagnose depression
- The model currently achieves a precision of 71% and a recall of 83%
- Text and audio recordings of natural conversations are used to diagnose the illness, rather than a set of pre-defined questions
Depression is an overwhelmingly common illness in society these days. It has been exacerbated in recent years with the advent of social media and is especially prevalent in young people. Doctors typically ask a set of pre-defined questions to the patient, based on which they diagnose depression.
And now MIT researchers have designed a neural network model that doesn’t need all these questions to detect depression in a person. Instead, the model focuses on an individual’s way of speaking and their writing style.
Think about it for a second – how do we tell if a person is depressed? We typically analyze their way of talking, whether they sound low, etc. And that’s essentially what this neural network does. It does away with the constraint of pre-defined questions, so the model is context-free and can work with any natural conversation.
The researchers have based their model on a technique called sequence modeling. They obtained samples of audio recordings and text and trained the neural network on those. These samples were from both depressed and non-depressed people. According to the lead researcher, Tuka Alhanai, the dataset contained 142 interactions from the Distress Analysis Interview Corpus (DAIC).
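To give a rough feel for what sequence modeling means here, below is a toy sketch – not the researchers’ actual architecture, and with illustrative names and sizes throughout – of a recurrent network reading a conversation one word vector at a time, carrying context forward in its hidden state and finally emitting a probability:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy, untrained parameters; a real model would be far larger and learned from data.
embed_dim, hidden_dim = 8, 16
W_xh = rng.normal(scale=0.1, size=(embed_dim, hidden_dim))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))  # hidden -> hidden (memory)
w_out = rng.normal(scale=0.1, size=hidden_dim)               # hidden -> output logit

def predict_depression(sequence):
    """Run a simple RNN over a sequence of word vectors, return P(depressed)."""
    h = np.zeros(hidden_dim)
    for x in sequence:                    # one step per word
        h = np.tanh(x @ W_xh + h @ W_hh)  # hidden state accumulates context
    logit = h @ w_out
    return 1.0 / (1.0 + np.exp(-logit))   # sigmoid squashes the logit to (0, 1)

transcript = rng.normal(size=(12, embed_dim))  # stand-in for 12 word embeddings
p = predict_depression(transcript)             # a probability between 0 and 1
```

The key idea is that the prediction depends on the whole sequence of words, not on answers to fixed questions – which is what lets the model analyze free-form conversations.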
The job of the neural network model was to analyze sequences of words (or the speaking style) and then predict whether the individual was depressed or not. The model gave results of 71% precision and 83% recall when tested on a held-out dataset from DAIC itself.
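To unpack those two metrics, here is a toy calculation on made-up labels – the numbers are chosen purely so the arithmetic reproduces the 71% / 83% figures, and are not taken from the study:

```python
# 1 = depressed, 0 = not depressed; hypothetical ground truth and predictions.
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 1, 1, 0, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives

precision = tp / (tp + fp)  # of those flagged as depressed, how many truly were: 5/7 ≈ 0.71
recall = tp / (tp + fn)     # of those truly depressed, how many were flagged: 5/6 ≈ 0.83
```

In a screening context, the high recall matters most: the model misses relatively few genuinely depressed individuals, at the cost of some false alarms.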
It might not surprise you to know that the model had a much tougher time detecting depression from the audio recordings than from the text. The model required an average of seven text sequences to detect depression, while this number jumped to around 30 for audio recordings.
Our take on this
I’ve recently been reading about sequence modeling and can attest to its usefulness. Its potential (at least in my opinion) lies in the audio/speech processing space, and this study further reinforces my thoughts. A great place to start learning about this technique is here – A Must-Read Introduction to Sequence Modeling.
The results may not jump out at you, but they’re still better than most research in this area. The model’s ability to analyze any kind of conversation is significant, rather than working with only a specific set of questions (as has been the case with other models).
Subscribe to AVBytes here to get regular data science, machine learning and AI updates in your inbox!