
Using custom topic modelling to analyse COVID-19 voice of the community sentiment trends

In light of the Victorian government stepping up its COVID-19 public health ad campaign this week, with a messaging focus on driving home the real human impact of the virus, we analysed a sample dataset of r/melbourne subreddit posts published since the weekend to get an indicative feel for local voice of the community sentiment trends.

Using a range of otso’s custom topic modelling machine learning (ML) techniques to extract insights, we found that prevention emerged as a leading thematic driver of positive sentiment. Anecdotal accounts of everyday mask adoption were rich in surprise and hope that the majority of Victorians were doing the right thing in shared public and retail spaces.

dashboard chart: topic classification filter by sentiment

More polarising in sentiment were direct expressions of personal mental health struggles with lockdown and isolation. Balancing the posts tinged with sadness and anxiety was gratitude towards Redditors sharing stories that it’s okay to not be okay.

dashboard chart: topic impact on overall sentiment score

Now here comes the (data) science bit…

Sam Hardy, otso ML Lead, took time out from the machine to share the knowledge wealth. “Leveraging otso’s pre-existing suite of sentiment and Named Entity Recognition models allows us to focus on the domain-specific problem at hand, for this project classifying ~6000 Reddit posts and comments by topic.”

“Customising ML solutions is often one of the best ways to ensure their effectiveness. Analysis of the r/melbourne Reddit corpus revealed that interactions are often multi-faceted and complex. For instance, Redditors are often talking about the effect of restrictions (topic A) on mental health (topic B). Given that this is the case, we sought to develop a multi-label classifier, capable of returning multiple labels given a single comment or post, consistent with these preliminary findings.”
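To make the multi-label idea concrete, here is a minimal sketch of one common approach, binary relevance, where a separate yes/no classifier is trained per topic and a post receives every topic whose classifier says yes. This is not otso's model: the tiny naive Bayes classifier, the class names (`BinaryNaiveBayes`, `BinaryRelevance`) and the five training posts are all invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class BinaryNaiveBayes:
    """Tiny multinomial naive Bayes answering one yes/no question per label."""

    def fit(self, docs, targets):
        self.counts = {True: Counter(), False: Counter()}
        self.n_docs = {True: 0, False: 0}
        for tokens, target in zip(docs, targets):
            self.counts[target].update(tokens)
            self.n_docs[target] += 1
        self.vocab = set(self.counts[True]) | set(self.counts[False])
        return self

    def _log_score(self, tokens, target):
        total_docs = self.n_docs[True] + self.n_docs[False]
        total_tokens = sum(self.counts[target].values())
        # Smoothed class prior plus add-one smoothed token likelihoods.
        score = math.log((self.n_docs[target] + 1) / (total_docs + 2))
        for token in tokens:
            if token in self.vocab:  # words never seen in training are ignored
                score += math.log(
                    (self.counts[target][token] + 1)
                    / (total_tokens + len(self.vocab))
                )
        return score

    def predict(self, tokens):
        return self._log_score(tokens, True) > self._log_score(tokens, False)


class BinaryRelevance:
    """Multi-label classifier: one independent binary model per label."""

    def fit(self, texts, label_sets):
        docs = [tokenize(text) for text in texts]
        self.labels = sorted(set().union(*label_sets))
        self.models = {
            label: BinaryNaiveBayes().fit(docs, [label in s for s in label_sets])
            for label in self.labels
        }
        return self

    def predict(self, text):
        tokens = tokenize(text)
        return {label for label, model in self.models.items() if model.predict(tokens)}


# Invented toy posts (NOT taken from the r/melbourne corpus).
train_texts = [
    "lockdown restrictions are wearing down my mental health",
    "everyone at the shops was wearing a mask today",
    "the curfew and travel restrictions got extended",
    "feeling anxious and lonely in isolation",
    "wash your hands and wear a mask",
]
train_labels = [
    {"Restrictions", "Mental Health"},
    {"Prevention"},
    {"Restrictions"},
    {"Mental Health"},
    {"Prevention"},
]

clf = BinaryRelevance().fit(train_texts, train_labels)
print(sorted(clf.predict("restrictions are making me anxious")))
```

Because each topic is decided independently, a single post can come back with two topics, one topic, or none at all. A production system would use far more data and a stronger model; the sketch only shows the shape of the problem.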

“Uploading the Reddit corpus to otso’s annotator platform enabled quick development of a labelled data set, including labels such as Communication, Employment, Mental Health, Political Discussion, Prevention, Restrictions and Testing.”
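For readers wondering what a multi-label data set looks like once labelled, one common representation is a binary indicator row per post, with one column per topic. The sketch below uses the seven labels named above; the helper names `to_indicator` and `from_indicator` are hypothetical and are not part of otso's annotator platform, whose export format is not described here.

```python
# The seven topic labels named in the post, in a fixed column order.
LABELS = [
    "Communication",
    "Employment",
    "Mental Health",
    "Political Discussion",
    "Prevention",
    "Restrictions",
    "Testing",
]


def to_indicator(label_set, labels=LABELS):
    """Turn a set of topic names into a 0/1 row, one column per label."""
    unknown = set(label_set) - set(labels)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if label in label_set else 0 for label in labels]


def from_indicator(row, labels=LABELS):
    """Turn a 0/1 row back into the set of topic names it encodes."""
    return {label for label, flag in zip(labels, row) if flag}


# A post about the effect of restrictions on mental health carries two labels.
row = to_indicator({"Restrictions", "Mental Health"})
print(row)
print(from_indicator(row))
```

Stacking these rows gives a matrix with one row per post and one column per topic, which is the target format most multi-label training routines expect.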

To wrap up with added context, we’ve developed a suite of ready-to-deploy models directly applicable to Australian public sector clients. These include, for example, topic classifiers targeted at extracting community safety and customer service issues, which perform ‘off the shelf’ at around the 75 per cent accuracy mark.

From here, the final supervised learning stretch to attain and maintain high-performing classification accuracy, in the region of 85 to 95 per cent, is tweaking and training the classifiers on community feedback data unique to an organisation’s domain.
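A note on what "accuracy" can mean for a multi-label classifier: the figures above do not say which definition is used, so the sketch below simply contrasts two common ones. Exact-match (subset) accuracy counts a post as correct only if every topic is right, while per-label accuracy scores each yes/no topic decision separately. The function names and the four example posts are invented for illustration.

```python
def subset_accuracy(true_sets, pred_sets):
    """Share of posts whose predicted label set matches the true set exactly."""
    matches = sum(t == p for t, p in zip(true_sets, pred_sets))
    return matches / len(true_sets)


def per_label_accuracy(true_sets, pred_sets, labels):
    """Share of individual (post, label) yes/no decisions that are correct."""
    correct = sum(
        (label in t) == (label in p)
        for t, p in zip(true_sets, pred_sets)
        for label in labels
    )
    return correct / (len(true_sets) * len(labels))


labels = [
    "Communication", "Employment", "Mental Health", "Political Discussion",
    "Prevention", "Restrictions", "Testing",
]

# Invented example: two posts predicted perfectly, two with one wrong decision each.
true_sets = [{"Restrictions", "Mental Health"}, {"Prevention"}, {"Testing"}, {"Employment"}]
pred_sets = [{"Restrictions"}, {"Prevention"}, {"Testing"}, {"Employment", "Communication"}]

print(subset_accuracy(true_sets, pred_sets))
print(per_label_accuracy(true_sets, pred_sets, labels))
```

The same predictions score very differently under the two definitions, which is worth keeping in mind when comparing accuracy figures across multi-label models.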

We’re excited to measure over the coming weeks how far we can take ready-to-deploy model performance for COVID-19 sentiment and topic classification, trained on Reddit voice of the community social media data. Stay tuned!

Copyright © 2021 - otso.ai. All rights reserved.