
AI tool boosts ‘help speech’ to muffle hate speech

Image credit: Dreamstime

Researchers at Carnegie Mellon University in Pennsylvania have created an AI tool which identifies positive comments. The tool could be used to find and promote these comments online in an alternative approach to combatting hate speech.

A team at the university’s Language Technologies Institute developed the tool, hoping that it could be used to counter hate speech towards largely voiceless groups, such as the Rohingya people displaced from Myanmar. According to Dr Ashique KhudaBukhsh, who led the project, the Rohingyas are mostly defenceless against online hate speech on account of limited proficiency in English and other global languages, limited internet access, and limited time to spend defending themselves against hate speech when their lives are threatened.

KhudaBukhsh and his colleagues decided to focus on hate speech against Rohingyas, collecting more than a quarter of a million YouTube comments to analyse.

To analyse this vast quantity of text for positive opinions, they used machine learning models which predict what words are likely to appear in a sequence. However, it was a challenge to apply these models to short social media comments in South Asia, which often include spelling and grammar errors, different writing systems, and samples of different languages. In order to handle these comments, they obtained new ‘embeddings’ – representations of words used in machine learning to group words with similar meanings and compute their proximity to other words – specific to this distinctive writing style.
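As a rough illustration of what ‘proximity’ between embeddings means, the sketch below compares toy word vectors using cosine similarity. The vectors, the word list and the function names are invented for this example and are not taken from the researchers’ model; real embeddings are learned from the comment corpus and have far more dimensions.

```python
import math

# Toy 3-dimensional vectors, invented purely for illustration.
# Real embeddings are learned from text and are much larger.
EMBEDDINGS = {
    "refugee": [0.9, 0.1, 0.3],
    "refugees": [0.85, 0.15, 0.35],
    "refugeees": [0.8, 0.2, 0.3],  # a misspelling of the kind seen in informal comments
    "cricket": [0.1, 0.9, 0.2],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word, embeddings):
    """Rank every other word by similarity to `word`, most similar first."""
    target = embeddings[word]
    scored = [
        (other, cosine_similarity(target, vector))
        for other, vector in embeddings.items()
        if other != word
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for other, score in nearest("refugee", EMBEDDINGS):
        print(f"{other}: {score:.3f}")
```

With these made-up vectors, the plural and the misspelt form sit close to “refugee”, while an unrelated word such as “cricket” sits far away. Embeddings trained on noisy text would aim for the same kind of grouping.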

This adaptation worked as well as or better than commercially available solutions and has been harnessed as a tool for automatic analysis of social media content in the region.

Using this approach, the researchers found that approximately 10 per cent of the comments in their sample were positive “help speech”, such as “No country is too small to take on refugees” or “All the countries should take a stand for these people”. The researchers suggest that human moderators, who would be unable to assess such vast numbers of comments themselves, could be presented with these machine-selected comments and choose some to “highlight” in comment sections.
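The workflow the researchers propose, in which a model narrows down the comments and a human makes the final choice, could look something like the sketch below. The scores, the threshold and the function name are hypothetical; the article does not describe how the real tool ranks or presents comments.

```python
from typing import NamedTuple


class ScoredComment(NamedTuple):
    text: str
    help_score: float  # hypothetical classifier output between 0 and 1


def shortlist_for_moderators(comments, threshold=0.8, limit=2):
    """Keep comments scoring at or above `threshold`, highest first, capped at `limit`.

    The shortlist is only a suggestion: a human moderator still decides
    which comments, if any, to highlight.
    """
    candidates = [c for c in comments if c.help_score >= threshold]
    candidates.sort(key=lambda c: c.help_score, reverse=True)
    return candidates[:limit]


# Scores below are invented for illustration.
SAMPLE = [
    ScoredComment("first!", 0.10),
    ScoredComment("All the countries should take a stand for these people", 0.91),
    ScoredComment("No country is too small to take on refugees", 0.95),
    ScoredComment("Nice video", 0.35),
]

if __name__ == "__main__":
    for comment in shortlist_for_moderators(SAMPLE):
        print(f"{comment.help_score:.2f}  {comment.text}")
```

Keeping a person in the final step matters because, as the researchers acknowledge, some comments the model scores as positive may still be unsuitable to promote.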

“Even if there’s lots of hateful content, we can still find positive comments,” said KhudaBukhsh. He suggested that promoting these comments could – along with demoting hate speech and banning the users who post it – make the internet a safer and healthier place.

A small number of the comments identified as positive by the model could be categorised as hate speech against other groups, the researchers acknowledged, such as a comment comparing antagonists of the Rohingya to animals. They said that this showed that there is a continued need for human judgement and further research.

In a similar study, the researchers applied the same technique to search for anti-war ‘hope speech’ among almost a million YouTube comments about the February 2019 Pulwama terrorist attack in Kashmir, which inflamed tensions in the region.

In the past few years, violent events associated with online hate groups – such as the fatal right-wing Charlottesville rally and the Christchurch terrorist attack – have prompted social media companies to take a stand against hate speech. Facebook and Google have employed thousands more human moderators and deployed more sophisticated AI tools to detect and remove inappropriate content and the users who spread it. However, these companies have continued to attract criticism for passively allowing controversial figures like Steven Crowder and David Duke to remain on their platforms.
