AI speech analysis may aid in assessing and preventing potential suicides, says researcher


Speech is critical to detecting suicidal ideation and a key to understanding the mental and emotional state of people experiencing it. Suicide hotline counselors are trained to quickly analyze speech variation to better help callers through a crisis.

But no system is perfect, and there is room for error in interpreting a caller’s speech. To help hotline counselors properly assess a caller’s condition, Concordia Ph.D. student Alaa Nfissi has developed a model for speech emotion recognition (SER) using artificial intelligence tools. The model analyzes and codes waveform modulations in callers’ voices. This model, he says, can lead to improved responder performance in real-life suicide monitoring.

The research is published as part of the 2024 IEEE 18th International Conference on Semantic Computing (ICSC).

“Traditionally, SER was done manually by trained psychologists who would annotate speech signals, which requires high levels of time and expertise,” he says. “Our deep learning model automatically extracts speech features that are relevant to emotion recognition.”

Nfissi is a member of the Centre for Research and Intervention on Suicide, Ethical Issues and End-of-Life Practices (CRISE). His paper was first presented at the February 2024 IEEE 18th International Conference on Semantic Computing in California, where it received the Best Student Paper Award.

Instant emotional reads
To build his model, Nfissi used a database of actual calls made to suicide hotlines, which were merged with a database of recordings from a diverse range of actors expressing particular emotions. Both sets of recordings were segmented and annotated by trained researchers, or by the actors who had voiced the recordings, according to a protocol tailored for this task.

Each segment was annotated to reflect a specific state of mind: angry, neutral, sad, or fearful/concerned/worried. The actors’ recordings broadened the emotional coverage of the original dataset, in which the angry and fearful/concerned/worried states were underrepresented.

Nfissi’s deep learning model then analyzed the data using a neural network and gated recurrent units. These deep learning architectures process sequences of data and extract both local and time-dependent features from them.

“This method conveys emotions through a time process, meaning we can detect emotions by what has been prior to one individual instant. We have an idea of what happened and what was before, and that allows us to better detect the emotional state at a certain time.”
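To make the idea of carrying earlier context forward more concrete, here is a minimal sketch of a single gated recurrent unit (GRU) layer in plain Python. It is not Nfissi's model: the class name `TinyGRU`, the layer sizes, the random untrained weights, and the omission of bias terms are all assumptions made for illustration. It only shows how a hidden state is updated frame by frame, so that the state at any instant depends on everything that came before it.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class TinyGRU:
    """Hypothetical one-layer GRU with random, untrained weights and no biases."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        # One input matrix W and one recurrent matrix U per gate:
        # "z" = update gate, "r" = reset gate, "h" = candidate state.
        self.W = {g: rand_matrix(n_hidden, n_in, rng) for g in "zrh"}
        self.U = {g: rand_matrix(n_hidden, n_hidden, rng) for g in "zrh"}

    def step(self, x, h):
        # Update gate: how much of the new candidate to let in.
        z = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        # Reset gate: how much of the past to use when forming the candidate.
        r = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in
                zip(matvec(self.W["h"], x), matvec(self.U["h"], reset_h))]
        # Blend old state and candidate (one common GRU convention).
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def run(self, frames):
        # Start from a zero state and fold in one feature frame at a time.
        h = [0.0] * self.n_hidden
        for x in frames:
            h = self.step(x, h)
        return h


gru = TinyGRU(n_in=3, n_hidden=4)
frames = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(gru.run(frames))        # state after seeing the frames in order
print(gru.run(frames[::-1]))  # same frames reversed give a different state
```

Because the final state depends on the order of the frames, the same acoustic content arriving in a different sequence produces a different summary, which is the property the quote describes.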

This model improves on existing architectures, according to Nfissi. Older models required segments to be the same length in order to be processed, usually somewhere in the five- to six-second range. His model handles variable-length signals, so it can process segments of different durations with no need for hand-crafted features.
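One common way to handle segments of different lengths in a single batch is to pad them to a shared length and keep track of how many samples in each are real. The sketch below illustrates that general idea only; the article does not describe how Nfissi's model manages variable lengths, and the helper names `pad_batch` and `masked_mean` are hypothetical.

```python
def pad_batch(segments, pad_value=0.0):
    """Pad 1-D segments to the longest one; return padded rows and true lengths."""
    lengths = [len(s) for s in segments]
    max_len = max(lengths)
    padded = [list(s) + [pad_value] * (max_len - len(s)) for s in segments]
    return padded, lengths


def masked_mean(padded, lengths):
    """Average each row over its real samples only (assumes no empty segment)."""
    return [sum(row[:n]) / n for row, n in zip(padded, lengths)]


# Two toy "segments" of different duration (illustrative values, not real audio).
segments = [[1.0, 3.0], [2.0, 4.0, 6.0, 8.0]]
padded, lengths = pad_batch(segments)

print(padded)                        # [[1.0, 3.0, 0.0, 0.0], [2.0, 4.0, 6.0, 8.0]]
print(masked_mean(padded, lengths))  # [2.0, 5.0]
print(sum(padded[0]) / len(padded[0]))  # 1.0: naive mean is skewed by padding
```

Tracking the true lengths keeps the padding from distorting the summary of the shorter segment, so no segment has to be cut or stretched to a fixed duration.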

The results validated Nfissi’s model. It recognized the four emotions in the merged dataset accurately. It correctly identified fearful/concerned/worried 82% of the time; neutral, 78%; sad, 77%; and angry, 72% of the time.
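Per-emotion figures like these are typically computed as per-class recall: of all segments annotated with a given emotion, the fraction the model labeled the same way. The sketch below assumes that reading, which the article does not confirm. The function name `per_class_recall` is hypothetical, and the labels are toy values, not data from the study.

```python
from collections import Counter


def per_class_recall(true_labels, predicted_labels):
    """Fraction of each true class that was predicted correctly."""
    totals = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: correct[label] / totals[label] for label in totals}


# Toy annotations and predictions, for illustration only.
true_labels = ["sad", "sad", "sad", "sad", "angry", "angry", "neutral", "neutral"]
predicted = ["sad", "sad", "sad", "neutral", "angry", "sad", "neutral", "neutral"]

print(per_class_recall(true_labels, predicted))
# {'sad': 0.75, 'angry': 0.5, 'neutral': 1.0}
```

Reporting one figure per emotion, rather than a single overall accuracy, shows which states the model confuses most, which matters when some emotions are rarer in the data than others.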

The model proved particularly adept at correctly identifying the professionally recorded segments, with success rates between 78% for sad and 100% for angry.

This work is personal to Nfissi, who had to study suicide hotline intervention in depth while developing the model.

“Many of these people are suffering, and sometimes just a simple intervention from a counselor can help a lot. However, not all counselors are trained the same way, and some may need more time to process and understand the emotions of the caller.”

He says he hopes his model can be used to develop a real-time dashboard that counselors can use when talking to emotional callers, to help them choose the appropriate intervention strategy.

“This will hopefully ensure that the intervention will help them and ultimately prevent a suicide.”

Professor Nizar Bouguila at the Concordia Institute for Information Systems and Engineering co-authored the paper, along with Wassim Bouachir at Université TÉLUQ and CRISE and Brian Mishara at UQÀM and CRISE.

https://techxplore.com/news/2024-04-ai-speech-analysis-aid-potential.html