This blog post is reposted from the SpringerOpen blog.
When you’re feeling sad, the people around you probably know it. Moody playlists, slumped shoulders, drawn-out sighs – there are many ways we signal to the rest of the world when we’re having a down day. It’s not all that much of a stretch, then, to imagine your Instagram posts might look happier when you’re feeling happy, and sadder when you’re feeling sad.
What if you were feeling depressed, but didn’t quite know it yet – would your depression still show up somehow in the photos you shared online? This possibility got us thinking: how might we combine what psychologists know about depression with what data scientists know about analytics to develop a quantifiable approach for evaluating mental health on Instagram?
The results of our work suggest that early-warning signs of emerging mental health issues like depression can be observed in Instagram posts, even before any clinical diagnosis is made.
We asked people to share their Instagram posting histories with us, along with details about their mental health history. By design, roughly half of our study participants reported having been clinically diagnosed with depression sometime in the last three years. All in all, we collected 43,950 photos posted to Instagram for analysis.
Using findings from clinical psychology research, we identified several visual and behavioral markers associated with depression that seemed like good candidates for measurement. For example, individuals suffering from depression exhibit different preferences for color, shading, and brightness in imagery, compared to healthy individuals. Pixel analysis of the photos in our dataset revealed that depressed individuals in our sample tended to post photos that were, on average, bluer, darker, and grayer than those posted by healthy individuals.
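To make the idea of pixel analysis concrete, here is a minimal sketch, not the code used in the study. It assumes each photo is available as a list of 8-bit RGB pixels and averages hue, saturation, and value (brightness) with Python's standard `colorsys` module. The function name `mean_hsv` and the two tiny "photos" are made up for illustration. On the `colorsys` scale, blue sits near a hue of 0.67, lower saturation means grayer, and lower value means darker. Note that hue is circular, so a plain arithmetic mean is only a rough summary.

```python
import colorsys


def mean_hsv(pixels):
    """Return the mean (hue, saturation, value) of 8-bit RGB pixels.

    Each component is on the 0.0-1.0 scale used by colorsys.
    """
    totals = [0.0, 0.0, 0.0]
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        totals[0] += h
        totals[1] += s
        totals[2] += v
        count += 1
    if count == 0:
        raise ValueError("no pixels given")
    return tuple(t / count for t in totals)


# Two hypothetical two-pixel "photos", invented for this example.
warm_bright = [(255, 200, 120), (250, 180, 90)]
cool_dim = [(60, 70, 110), (50, 60, 90)]

for name, photo in [("warm_bright", warm_bright), ("cool_dim", cool_dim)]:
    h, s, v = mean_hsv(photo)
    print(f"{name}: hue={h:.2f} saturation={s:.2f} value={v:.2f}")
```

In this toy example the second photo comes out with a higher hue, lower saturation, and lower value than the first, which is the "bluer, grayer, darker" pattern in miniature.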
Depression is also characterized by reduced or avoidant social engagement. Social engagement involves other people, so we speculated that one rough measure of sociability might be the average number of people that show up in the photos you post. We wrote a face detection algorithm to count the number of faces that appeared in each posted photograph. It turned out that depressed individuals posted significantly fewer faces per photograph, compared to healthy individuals.
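As a rough illustration of the faces-per-photo measure, the sketch below averages face counts over a user's photos. It is not the study's algorithm: the detector is passed in as a callable, and the `stub_detector` shown here is a hypothetical stand-in that simply counts bounding boxes already attached to made-up photo records. A real detector would take its place.

```python
from statistics import mean


def mean_faces_per_photo(photos, count_faces):
    """Average the face count over a user's photos.

    `count_faces` is any callable that takes one photo and returns an int;
    here it stands in for a real face detector.
    """
    counts = [count_faces(photo) for photo in photos]
    if not counts:
        raise ValueError("user has no photos")
    return mean(counts)


def stub_detector(photo):
    # Hypothetical detector: the boxes are precomputed in the toy data.
    return len(photo["face_boxes"])


# Made-up records; each box is (x, y, width, height).
user_a = [
    {"face_boxes": [(10, 10, 40, 40), (60, 12, 38, 38)]},
    {"face_boxes": [(5, 5, 50, 50)]},
]
user_b = [
    {"face_boxes": []},
    {"face_boxes": [(20, 20, 30, 30)]},
]

print(mean_faces_per_photo(user_a, stub_detector))
print(mean_faces_per_photo(user_b, stub_detector))
```

Keeping the detector as a parameter means the averaging step can be checked on its own, separately from the much harder problem of finding faces.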
Even the way depressed and healthy people chose to present their photos on Instagram was different. Instagram offers a series of ready-made filters that adjust a photo’s appearance. Among healthy users, we observed that the most popular filter was Valencia, which gives photos a warmer, brighter feel. Among depressed users, however, the most popular filter was Inkwell, which turns photos black-and-white. In other words, people suffering from depression were more likely to favor a filter that literally drained all the color out of the images they wanted to share.
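Finding the most popular filter in a group is a simple tally. The sketch below assumes each post is a record with a `filter` field that is `None` when no filter was applied. That record layout, the function name, and the sample posts are all invented for illustration and do not reflect how the study stored its data.

```python
from collections import Counter


def most_popular_filter(posts):
    """Return the most frequently used filter name, or None if none was used."""
    counts = Counter(post["filter"] for post in posts if post.get("filter"))
    return counts.most_common(1)[0][0] if counts else None


# Made-up posts for one hypothetical group of users.
sample_posts = [
    {"filter": "Valencia"},
    {"filter": "Valencia"},
    {"filter": None},
    {"filter": "Inkwell"},
]

print(most_popular_filter(sample_posts))
```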
We were able to observe these differences reliably, even when only looking at depressed users’ posts made prior to receiving a clinical diagnosis of depression. These and other recent findings indicate that social media data may be a valuable resource for developing efficient, low-cost, and accurate predictive mental health screening methods.
We do feel strongly that there’s an important ethical discussion that must occur in step with these technological developments, regarding data privacy and the implications of applying sophisticated analytical tools in an online medium which doesn’t forget. Even so, the possibility that social media analytics may offer a means of getting help faster to people in need is important, and should be explored further.
I have a method question about this study, in particular about the screening process used.
The paper states “Separate surveys were created for depressed and healthy individuals. In the depressed survey, participants were invited to complete a survey that involved passing a series of inclusion criteria, responding to a standardized clinical depression survey, answering questions related to demographics and history of depression, and sharing social media history.” (Section 2.1, p.4). So the idea seems to be that the authors only wanted to include participants who had actually been diagnosed with depression at some point.
The paper continues “We also excluded participants with CES-D scores of 22 or higher. Studies have indicated that a CES-D score of 22 represents an optimal cutoff for identifying clinically relevant depression across a range of age groups and circumstances [31, 32]” (Section 2.3, p. 5).
From the APA: “The Center for Epidemiological Studies-Depression (CES-D), originally published by Radloff in 1977, is a 20-item measure that asks caregivers to rate how often over the past week they experienced symptoms associated with depression, such as restless sleep, poor appetite, and feeling lonely.” So the CES-D is a measure of current experience of depression.
From the Supplementary Information, Section IV, I note that “Among depressed participants, 84 individuals successfully completed participation and provided access to their Instagram data. Imposing the CES-D cutoff reduced the number of viable participants to 71.” So 13 depressed individuals were recorded above the cutoff for the CES-D and therefore excluded.
The purpose of a cutoff point in a depression measure is to flag individuals who may be putative cases in need of further attention. Such a screening tool is used to partition a population into ‘presumed well’ and ‘presumed ill’ (see Lewinsohn, Seeley, Roberts, & Allen, 1997).
Hence, by excluding those people above this cutoff the idea in this study seems to be that the authors wanted to analyse the Instagram photos of people who had been diagnosed with depression at some point *but weren’t currently experiencing a depressive episode*. This isn’t particularly clear from the paper but it makes a certain amount of sense. Otherwise, I don’t understand why, in a study on depression, you would exclude the most depressed participants…
That’s the only logic I can see in excluding those people from the analysis. But it does raise the obvious question: *what happened to these participants?* By the study’s own report, these people had been previously diagnosed with depression and were currently experiencing a high level of depression. Perhaps I am missing something?
Critically, the study cites Haringsma, Engels, Beekman, & Spinhoven (2004, p. 561) as Reference 32, which states “In our opinion, subjects scoring >25 on the CES-D should be followed up with a diagnostic interview to specify clinical diagnosis and appropriate treatment.” What follow-up was there with these 13 participants, who had previously been diagnosed with depression and were currently experiencing high levels of depression? I may have misread the paper, but I cannot see any mention of such follow-up in it.