Opinion: social media sites face a huge challenge in stopping the publication of offensive content, especially as the task requires more than just technology 

Communicating online is the addiction of our modern times. Surveys suggest that young adults check their phones 85 times per day, with 50 percent of them going on to Facebook as soon as they wake up. For social media companies, this reach into people’s lives raises the huge challenge of how to prevent the publication of hateful or offensive content. The use of online platforms as a vehicle for bullying and abuse is a worldwide phenomenon, with "nasty or hurtful messages" the most commonly reported form of cyberbullying among teenagers. Social media and other companies hosting user content have an ethical and, increasingly, legal responsibility to prevent their platforms from being used as conduits for hateful content. 

The problem is conceptually simple: assess or "moderate" every piece of online content before it is published, just as newspaper editorial staff do every day. But the scale of user-driven content removes this simplicity. By the time you are one minute into reading this article, 510,000 comments will have been posted on Facebook and 360,000 tweets on Twitter.

"At best, social media companies can aim to catch a lot of the bad content, a lot of the time"

Social media companies use a number of approaches to content moderation. These can be broadly divided into pre-publication moderation, where content is checked before it is published, and post-publication moderation, where content is published in close to real time with few or no checks.

With post-publication moderation, users can report or flag content as offensive, but by then the damage may already be done. Some providers such as Facebook also allow users to specify what published content they would like to block from their feed, a form of self-directed post-publication moderation. Taylor Swift was famously able to eliminate a barrage of abusive comments on her Instagram account by using Instagram’s new feature that prevents offensive comments from being made public.

The holy grail of content moderation is pre-publication moderation of all content with 100 percent accuracy and close to zero time lag, so that damaging content never reaches an audience. In reality, however, social media companies can at best aim to catch a lot of the bad content, a lot of the time.

Abusive content is subjective: what I find offensive may be different for you

Computers and algorithms are wonderful for addressing large-scale problems, but assessing user comments is an ambiguous task. We humans are complicated individuals, and we communicate online through a highly ambiguous medium: written language. Abusive content is subjective: what I find offensive may be different for you. As researchers, we have already noted that, when collecting datasets of abusive content, human moderators do not fully agree on whether samples are abusive or not, and a voting system is needed to reach a majority view.
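To make the voting idea concrete, the short Python sketch below resolves disagreements between annotators with a simple majority vote. It is illustrative only: the post identifiers, the labels and the three-moderator setup are assumptions, not a description of any real annotation pipeline.

```python
# A minimal sketch of majority voting over annotator labels, assuming each
# post has been labelled independently by several human moderators.
# All data below is invented for illustration.
from collections import Counter

def majority_label(labels):
    """Return the label most annotators agreed on, plus the level of agreement."""
    counts = Counter(labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)

# Hypothetical annotations: three moderators per post, "abusive" or "clean"
annotations = {
    "post_1": ["abusive", "abusive", "clean"],
    "post_2": ["clean", "clean", "clean"],
    "post_3": ["abusive", "clean", "clean"],
}

for post_id, labels in annotations.items():
    label, agreement = majority_label(labels)
    print(post_id, label, f"{agreement:.0%} agreement")
```

Even in this toy example, only one of the three posts gets a unanimous verdict, which is exactly the disagreement problem researchers see in practice.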

There are also the more subtle problems of ambiguity of language and context. "You were such a success" is a positive, heart-warming seal of approval when posted online by a proud aunt. But put it in a post sent to a 14-year-old girl (and seen by her peers) after a humiliating performance in the school play and it takes on a different identity: sarcastic, humiliating, damaging. 

Artificial intelligence techniques are a highly active area of research, producing algorithms that can automatically flag content as abusive. These algorithms learn from high volumes of sample posts, good and bad, so that new examples can be moderated automatically. Text analytics, natural language processing (NLP), network behaviour monitoring and machine learning techniques are used to develop and train algorithms to support pre-publication moderation. Deep learning, a branch of machine learning based on neural networks, is now delivering higher accuracy rates.
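As a rough illustration of this general supervised-learning approach (and not of any platform's actual system), the Python sketch below trains a small text classifier using scikit-learn's TfidfVectorizer and LogisticRegression. The handful of example posts and their 0/1 labels are invented; a real system would learn from millions of moderated examples.

```python
# A minimal sketch of a text classifier for flagging potentially abusive posts,
# assuming a labelled training set is available. The example posts are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_posts = [
    "You were brilliant tonight, well done!",
    "Nobody wants you here, just leave.",
    "Looking forward to the weekend",
    "You are pathetic and everyone laughs at you",
]
train_labels = [0, 1, 0, 1]  # 0 = clean, 1 = abusive

# Turn posts into word and word-pair features, then fit a simple classifier
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(train_posts, train_labels)

# Score a new post before publication; a threshold would decide whether to
# publish it, block it, or hold it back for human review.
new_post = ["you were such a success"]
probability_abusive = model.predict_proba(new_post)[0][1]
print(f"Estimated probability of abuse: {probability_abusive:.2f}")
```

Notice that the ambiguous "you were such a success" example gets a single score with no knowledge of who sent it or why, which is precisely where context defeats the algorithm.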

From RTÉ One's Claire Byrne Live show, Jackie Fox discusses her campaign for parents to check their children's social media accounts and talk to them about bullying

However, the impact of wrong decisions is high. If an algorithm identifies 90 percent of all abusive posts, the remaining 10 percent will still be published, with the subsequent damage. And what about false positives: is it acceptable for clean posts to be blocked from publication? Will dissatisfied users leave the service?
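To give a sense of the scale involved, the sketch below runs that arithmetic for a single minute of Facebook comments, using the 510,000-per-minute figure cited earlier. The 90 percent detection rate comes from the example above; the 2 percent abusive share and the 1 percent false-positive rate are purely assumed figures for illustration.

```python
# A back-of-the-envelope sketch of the error arithmetic for one minute of posts.
# The detection rate and volume come from the article; the other rates are assumed.
posts_per_minute = 510_000        # Facebook comments per minute, as cited above
abusive_fraction = 0.02           # assumed share of posts that are abusive
recall = 0.90                     # the algorithm catches 90% of abusive posts
false_positive_rate = 0.01        # assumed share of clean posts wrongly blocked

abusive = posts_per_minute * abusive_fraction
clean = posts_per_minute - abusive

missed_abusive = abusive * (1 - recall)       # published despite being abusive
blocked_clean = clean * false_positive_rate   # blocked despite being clean

print(f"Abusive posts slipping through per minute: {missed_abusive:,.0f}")
print(f"Clean posts wrongly blocked per minute: {blocked_clean:,.0f}")
```

Under these assumed numbers, roughly a thousand abusive comments would slip through and several thousand innocent ones would be blocked every minute, which is why neither error rate can be dismissed.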

AI-based techniques that can moderate abusive comments automatically will continue to improve in accuracy, but they will never be 100 percent accurate. Social media companies will continue to need human moderators, with governments and society doing their part to make online bullying behaviour unacceptable or even illegal.


The views expressed here are those of the author and do not represent or reflect the views of RTÉ