Data Impact Challenge II Question 3

Question 3: What proportion of Canadian youth (ages 13 to 17) post about their mental health on social media, and describe experiencing bullying or suicidal thoughts in the past 12 months?

Suicide is the second leading cause of death among youth in Canada, and mental illnesses such as depression are the main risk factors for suicide. Youth ages 18 to 24 have higher rates of mental illnesses such as depression and generalized anxiety disorder than any other age group.

With response rates to traditional surveys in decline, it is incumbent on governments and their partners to tap into new and emerging sources of publicly available data such as social media. Response rates to traditional surveys among youth, in particular male youth, are some of the lowest among all age groups in the country.

Youth are active users of social media, with about 74% of Twitter users between 15 and 25 years of age. Using social media outlets such as Facebook or Twitter as a data source would augment existing survey and administrative data sources and allow for better analysis, contextualization and interpretation of the traditional incidence, prevalence and case-level data currently available. These new sources could even provide an early indication of possible trends to guide more formal surveillance activities.


SAS used the data available through Twitter’s API to analyze 1.1 million Tweets from within Canada that profiling software identified as likely coming from 13- to 17-year-old users. The analysis made use of natural language processing, predictive modelling, text mining, and data visualization.
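
As a rough illustration of the kind of proportion the question asks for, the sketch below counts the share of distinct users with at least one post matching a topic. It is a minimal keyword-matching example under stated assumptions, not the method SAS used: the keyword lists, the `TOPIC_PATTERNS` name, the `proportion_of_users_by_topic` function, and the `(user_id, text)` input shape are all hypothetical, and a real analysis would rely on the natural language processing and predictive modelling described above rather than simple keywords.

```python
import re
from collections import defaultdict

# Hypothetical keyword patterns, chosen only for illustration.
# They are not taken from the SAS analysis.
TOPIC_PATTERNS = {
    "bullying": re.compile(r"\b(bully|bullied|bullying)\b", re.IGNORECASE),
    "suicidal_thoughts": re.compile(
        r"\b(suicidal|kill myself|end my life)\b", re.IGNORECASE
    ),
}


def proportion_of_users_by_topic(posts):
    """Return, for each topic, the share of distinct users with a matching post.

    posts: iterable of (user_id, text) pairs (an assumed input shape).
    """
    all_users = set()
    users_by_topic = defaultdict(set)

    for user_id, text in posts:
        all_users.add(user_id)
        for topic, pattern in TOPIC_PATTERNS.items():
            if pattern.search(text):
                users_by_topic[topic].add(user_id)

    # Avoid dividing by zero when no posts were supplied.
    if not all_users:
        return {topic: 0.0 for topic in TOPIC_PATTERNS}

    return {
        topic: len(users_by_topic[topic]) / len(all_users)
        for topic in TOPIC_PATTERNS
    }
```

Counting distinct users rather than posts keeps one very active account from inflating the estimate, which matters when the question is about the proportion of youth rather than the volume of Tweets.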


Potential new data sources must meet the following requirements:

  • The data source must be currently available;
  • Data extraction must safeguard individual privacy and confidentiality;
  • The sample size and time period must be large enough for the analysis to be meaningful;
  • The data source must not be currently used by federal/provincial/territorial governments in a systematic way; and,
  • The data must pertain to the Canadian population.

Participating Team


Team bio: SAS is committed to delivering empowering knowledge through the use of data in healthcare. The growth in the availability of unstructured, self-reported health data is one area of focus that SAS is pioneering to improve health outcomes.
Entry: PDF