Are We Building Robust AI for Mental Health Prediction in Social Media?


[A shortened version of this post appears on the npj digital medicine website]

Did you know that Facebook has a “suicide prevention AI”? The popular social media site uses behavioral and linguistic patterns to guess whether someone may harm themselves. NPR reports that this AI flags about 10 people every day, and from there, Facebook can intervene and potentially save someone’s life. Facebook isn’t the only one using social media data to make predictions about people’s well-being. In fact, researchers have been investigating this for nearly a decade, and these systems hold great promise for lifesaving and cost-reducing applications.

A crucial part of this work is the trustworthy evaluation of mental health from social media data. Without a doctor to make a diagnosis or a screener to evaluate, how can we be sure the social media signals measure what we hope they measure? If these answers are wrong and made without human oversight, we risk life-altering mistakes. We may allocate resources to a person who is not in distress and be overbearing, or, the reverse, miss a person who desperately needs assistance. As researchers who conduct this work, we were curious about practices in the larger research community: what methods and datasets are chosen when important fields like machine learning and clinical psychiatry work together? We set out to answer those questions.

In our paper in npj Digital Medicine, we studied 75 scientific papers on predicting mental illness with machine learning and social media data, the state of the art in predictions within the space over the last five years. We looked at two dimensions: whether the data could capture signals of mental illness, and whether the AI systems themselves were mathematically sound and adhered to scientific standards for machine learning.

What we found was that most papers have large gaps in scientific reporting and in the validation strategies for “clinical” assessment that are essential to making trustworthy predictions. This has serious consequences. Failing to validate one’s data can lead to the risks we mentioned earlier, incorrectly allocating already limited monetary and human resources. Not following reporting standards can make it difficult, even borderline impossible, for fellow scientists to independently evaluate and confirm a study. Either alone can be enough to undermine accurate research; together, they pose a major barrier to solving the pressing challenges of AI applied to mental health.

Social Media and Mental Health Prediction

You have probably felt “anxious” in your life — public speaking makes almost all of us uneasy, for example. But you may have also felt anxious as a longer-term emotional state around an uncertain job search or challenging problem. Perhaps you know someone who suffers from anxiety.

Anxiety is a word with lots of meanings — the term is overloaded because it has several definitions, both formal and informal. It can refer to a category of disorders, a symptom of anxiety that accompanies other disorders, an emotion people occasionally feel, or, in casual use, a personality trait.

In our study, we examined whether the papers defined what they were studying — for anxiety and for other disorders and symptoms, such as depression, eating disorders, and suicidal thoughts. We found seven types of “proxy signals” used to make these decisions, such as hashtags like #depressed or a self-disclosure like posting “I was diagnosed with anorexia”. At first glance, this seems promising — who would participate in a depression community if they weren’t depressed?
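To make the idea of a proxy signal concrete, here is a minimal sketch of how a study might harvest two of these signals — hashtags and self-disclosure phrases — from raw posts. The pattern, hashtag set, and example posts are all invented for illustration; they are not the methods of any paper in the review.

```python
import re

# Hypothetical self-disclosure pattern, e.g. "I was diagnosed with anorexia".
DISCLOSURE = re.compile(
    r"\bI (?:was|have been|am) diagnosed with (\w+)", re.IGNORECASE
)
# Hypothetical hashtag list a study might treat as a proxy for mental illness.
HASHTAGS = {"#depressed", "#anxiety", "#edrecovery"}

def proxy_signals(post: str) -> dict:
    """Return any proxy signals found in a single post."""
    signals = {}
    match = DISCLOSURE.search(post)
    if match:
        signals["self_disclosure"] = match.group(1)
    tags = {word.lower() for word in post.split()} & HASHTAGS
    if tags:
        signals["hashtags"] = sorted(tags)
    return signals

posts = [
    "I was diagnosed with anorexia last spring.",
    "rough week at work #depressed",
    "training for a 5k this weekend!",
]
for post in posts:
    print(proxy_signals(post))
```

Note what this sketch does not do: it never checks whether the matched phrase reflects a clinical diagnosis, which is exactly the validation gap discussed below.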

What we discovered is that very few studies define what they mean when they explore these terms — we couldn’t tell whether researchers wanted to study anxiety the emotion, anxiety the symptom, or anxiety the disorder. Even more concerning, very few studies went back to check whether those proxy signals actually map to a valid measurement of mental illness and symptoms.

In addition to studying proxy signals from social media, we also wondered whether the AI models were mathematically correct and reported in enough detail that later work could produce the same results. We cataloged the data collection and study design, as almost all papers in the dataset used machine learning to predict the presence of mental illness or an important symptom.

To assess whether these papers could be reproduced, we looked at five factors that are essential for machine learning applications. These include how large the dataset is and who is in it, the variables used for prediction, and the statistics for how well the model performed, like accuracy. What we found alarmed us — only 32 of 75 papers, about 43%, reported on all five of these factors and therefore could be reproduced. We noticed that many papers did not say how many variables they used or what those variables measured, which leads to models that will not perform reliably over time.
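As a small illustration of the performance statistics at issue, the sketch below computes accuracy alongside precision and recall from confusion-matrix counts — reporting only a single number like accuracy hides how many at-risk people a model misses. The counts are made up for illustration, not taken from any paper in the review.

```python
def performance(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, and recall from confusion-matrix counts.

    tp/fp/fn/tn: true positives, false positives, false negatives,
    true negatives for a binary "at risk" prediction.
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical classifier: flags 40 of 50 truly at-risk users (tp=40, fn=10)
# and mislabels 20 of 150 not-at-risk users (fp=20, tn=130).
print(performance(tp=40, fp=20, fn=10, tn=130))
```

Here accuracy looks respectable (0.85), yet recall shows the model still misses one in five at-risk users — which is why the full set of reporting factors matters.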

In light of these results, we are hopeful the field can improve in these areas. In the paper, we propose a list of reporting standards for what must be included in these papers. We also discuss promising opportunities for collaborations with medical researchers to align proxy signals with clinical findings. Both are crucial to developing trustworthy and accurate predictions that lead to better research, and eventually better tools we all can use to help solve the pressing challenge of mental illness diagnosis and treatment.

The paper is available here (open access, so everyone can read it!). We’d love to hear your thoughts or comments on this phenomenon.

Short version of the methods: This paper is drawn from a larger literature review of the state of the art in predicting mental illness from social media data. From 4400 papers, we found 75 papers published between 2013–2018 that discuss prediction, mental illness, and social media data. By reading and qualitatively analyzing each paper, we examined its data collection, methods, results, and analysis, addressing the questions we mentioned above.

Full paper citation: Chancellor, S., De Choudhury, M. Methods in predictive techniques for mental health status on social media: a critical review. npj Digit. Med. 3, 43 (2020).




Professor at Minnesota CS, Georgia Tech PhD. Human-centered machine learning, work/life balance, and productivity. @snchancellor on Twitter
