Tackling the impossible problem of content moderation
Moderating content online is messy, arbitrary and expensive – a huge headache for lawmakers and social media companies alike. While automating such moderation is essential at scale due to the sheer volume of traffic, it remains problematic.
“I always perform with nipple tassels,” says Dr Carolina Are, pole dance instructor, activist and innovation fellow at Northumbria University. “Because if a nipple shows up, I can’t post it on social media.”
In her academic life at the university’s Centre for Digital Citizens, she researches the below-the-radar strategies social sites use to control content and the impact of being blocked or limited online.
She speaks from experience. As a performer, her pole-dancing videos have been ‘shadow banned’ (by Instagram) or outright blocked (by TikTok). Instagram apologised and now collaborates with her – the platform has long been criticised for moderation policies that hide posts without telling the user. TikTok alleges her videos ‘imply’ nudity, which is banned on the youth-skewing platform – and her dance to celebrate her PhD was taken down. “But I’m clothed,” she says.
By contrast, the vicious trolling she’s received – “that I deserve to get raped, my morals are loose, etc” – is on a different scale. When she reported this abuse, platforms didn’t respond or take it down. Users are censored, but not protected, she says.
Are’s pole-dancing videos are indeed raunchy for the 13-year-old age limit of most social media platforms and their many underage users, “but they’re similar to what you could see in a music video, advert or celebrity posting – and these are not controlled”.
She’s careful to remain within community guidelines. A Guardian investigation found that artificial intelligence (AI) trained to spot nudity censors female bodies more heavily than male ones – male nipples are fine, female nipples aren’t – and over the last decade or so, controversy has raged over this algorithmic bias.
These policies are misogynistic, objectifying, and just unfair, says Are.
TikTok has since reinstated her and given a “wishy-washy” apology. Yet many women – from entrepreneurs and small businesses to artists, performers and sex workers who vet clients online – rely on online communities for their livelihoods and are vulnerable to faceless, unaccountable censorship, she says, which sometimes results from the complaints of other users. “But platforms have huge populations and they’re trying to moderate them as if all users were the same,” says Are. “This one-size-fits-all approach that lumps sexual content and harm together doesn’t make sense.”
Moderators don’t have time or incentive to make thoughtful decisions, she says. Beyond obvious cases, defining what is harmful or toxic is at the heart of the problem. Why not empower individuals, Are says, to decide what they do or don’t want to see? “We need to protect users who are targeted by this kind of platform governance because it can be weaponised against them.”
However, in the case of under 18s, there’s a disconnect between what tech companies consider safe content, and what parents expect their children to be protected from.
Blunt tools have flaws, say activists – witness the experience of a San Francisco-based father who, in 2021, was concerned about a swelling in his toddler son’s genitals. After sending photos to a doctor, he was horrified to be flagged as a potential paedophile, subjected to a police investigation and to find his online accounts frozen for months.
And wherever there is control, there is evasion. In one recent TikTok craze, the so-called ‘Foopah’ challenge, mostly female users have been getting one over on the censors, apparently flashing their breasts using blink-and-you’ll-miss-it techniques that are difficult to spot and that moderators probably can’t keep up with. In darker corners, political extremes also deploy subterfuge to stay under the radar – ‘88’ is white supremacist code for ‘Heil Hitler’, for instance.
More than four billion video views take place on Facebook every day, while on Pinterest, people watch close to a billion videos. In a single minute, 500 hours of content are uploaded to YouTube and 347,222 stories are posted to Instagram. Facebook – which didn’t respond to a request for comment – has committed 5 per cent of revenue ($3.7bn) to content moderation and employs 40,000 moderators worldwide.
Most individuals’ experience of social media is harmless, fun, sometimes annoying. But for some it can be catastrophic. The husband of Nicola Bulley, whose disappearance sparked a national hunt in the UK, had his Pinterest account hacked with sexually explicit images, while conspiracy theorists had an online field day before his wife’s body was found.
Nearly six years after 14-year-old Molly Russell took her own life (in November 2017) after seeing graphic self-harm content driven by algorithmic recommendations, her father says the response from social platforms has been underwhelming, and he has little faith that planned legislation – the Online Safety Bill (see boxout below) – will check this kind of toxic content. Leading platforms want to hook users for as long as possible: less engagement means less cash.
All larger platforms use automation as a first line of defence. TikTok runs videos through computer-vision systems to flag any content that infringes its guidelines. “There’s absolutely no way humans can check everything,” says Professor Oliver Lemon, an AI and natural language processing expert at Heriot-Watt University, Edinburgh.
Tech companies can either buy, build or partner to implement safety technology, says Crispin Pikes, a member of the Online Safety Tech Industry Association and co-founder of AI moderation software firm Image Analyzer.
Yet ambiguous material is still run past real humans. Moderators see thousands of videos in a working day, and much of this work is outsourced abroad, overseen by in-house teams.
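In practice, that division of labour usually comes down to a confidence threshold: automation acts on near-certain cases and queues the ambiguous middle for people. The sketch below is purely illustrative – the classifier is a stub standing in for a real computer-vision model, and the thresholds are invented – and is not any platform’s actual pipeline.

```python
# Illustrative triage: automation handles clear-cut cases and routes the
# ambiguous middle to human moderators. The classifier is a stub standing
# in for a real computer-vision model; the thresholds are invented.

def classify_frame(frame: bytes) -> float:
    """Stub: return the estimated probability that a frame violates
    guidelines. A real system would run a trained vision model here."""
    return 0.5  # placeholder score

AUTO_REMOVE = 0.95   # near-certain violation: act without review
AUTO_ALLOW = 0.05    # near-certain safe: publish without review

def triage(frame: bytes) -> str:
    score = classify_frame(frame)
    if score >= AUTO_REMOVE:
        return "remove"
    if score <= AUTO_ALLOW:
        return "allow"
    return "human_review"  # ambiguous material goes to a person

print(triage(b"example-frame-bytes"))  # -> "human_review"
```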
A string of well-publicised legal challenges has revealed the grim life of content moderators who see the toxic stuff automation can’t classify. Facebook has compensated more than 11,000 content moderators over claims of PTSD, caused by viewing thousands of disturbing graphic videos and images. Three years ago, professional services firm Cognizant – which policed Facebook, Google and Twitter – withdrew from content moderation after an investigation into working conditions by magazine The Verge.
Some images, text and audio will always be hard to decipher. “Consider four lines of flour beside a card and a rolled-up bank note,” says Pikes. “Even a human will say ‘that’s cocaine’.” AI, he says, has only been used since 2017 and it’s a work in progress, though many new safety technologies in development could be available within a year or so. Amateur pornography used to pass under the radar because technology had learned to spot only the professionally produced videos. “The lighting was so bad [in amateur porn] because it wasn’t filmed in a studio, so the system left it to one side,” he says.
Context can dictate meaning. One culture’s ‘glamour’ shots are another’s pornography, while an insult in one country is a harmless label in another – language is complex to classify. ‘Smoke a fag’ is innocuous in the UK, but in the US, it means to murder a gay man. Any moderation technology must be highly configurable, says Pikes, not only to accommodate cultural sensitivities but also the different platform models. TikTok’s algorithms focus on how long users linger and engage with videos, whereas Facebook focuses on social links. And there are thousands of platforms and closed communities following different models.
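Pikes’ point about configurability can be illustrated with something as simple as a per-locale rule table: the same phrase is allowed in one region and escalated in another. The policy table below is invented purely for illustration and is not any vendor’s real schema.

```python
# Toy illustration of locale-dependent moderation rules: the same phrase can
# be harmless in one region and a potential threat in another. The policy
# table is invented for illustration only.

POLICY = {
    "en-GB": {"smoke a fag": "allow"},     # UK slang for smoking a cigarette
    "en-US": {"smoke a fag": "escalate"},  # potential threat of violence
}

def check_phrase(phrase: str, locale: str) -> str:
    return POLICY.get(locale, {}).get(phrase.lower(), "allow")

print(check_phrase("Smoke a fag", "en-GB"))  # allow
print(check_phrase("Smoke a fag", "en-US"))  # escalate
```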
Safety tech is a burgeoning sector with a groundswell of collaborative projects. Some £550,000 of government grants were awarded last year to five UK initiatives to make the internet safer for children by tackling the challenge posed by end-to-end encryption on messaging apps, which abusers use to share images under the radar.
Winning projects include an AI plug-in for encrypted messaging which could flag images of child abuse, and facial-recognition technology that can scan and estimate someone’s age to detect child abuse images before they’re even uploaded. Another initiative aims to prevent livestreaming of child sexual abuse with AI-driven live video moderation technology. An app developed by the same UK company, SafeToNet – that works on devices to identify and block harmful images before users see them – has just received EU funding. The company’s AI technology will be trained by Cambridge-based Internet Watch Foundation (IWF), which assesses and removes child abuse images from the internet.
IWF already uses digital tools to hunt down images of child abuse. Today, the charity works with 185 members, including major tech platforms, to check for any known images that have already received a unique digital fingerprint – a hash (which can survive minor editing or formatting) – which flags them without the need for people to see them. It’s the unknown images, or those that are sent via encrypted messaging, that must also be stopped.
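The IWF’s own tooling isn’t public, but the general idea of a fingerprint that survives minor editing is what perceptual hashing provides. Below is a minimal sketch using the open-source Pillow and imagehash libraries; the file names and the distance threshold are illustrative assumptions, not the IWF’s system.

```python
# Minimal sketch of perceptual-hash matching: two images that differ only by
# light editing (resizing, re-encoding, small crops) produce similar hashes,
# so known material can be flagged without anyone having to view it again.
# File names and the distance threshold are illustrative.

from PIL import Image
import imagehash

MATCH_THRESHOLD = 5  # maximum Hamming distance to treat as the same image

def fingerprint(path: str) -> imagehash.ImageHash:
    return imagehash.phash(Image.open(path))

known_hashes = [fingerprint("known_image.jpg")]  # e.g. hashes from a shared list

def is_known(path: str) -> bool:
    candidate = fingerprint(path)
    # Subtracting two ImageHash objects returns their Hamming distance
    return any(candidate - known <= MATCH_THRESHOLD for known in known_hashes)

print(is_known("uploaded_image.jpg"))
```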
“We want to be able to say to a child ‘we’ve got your image and it will never be distributed again’,” says chief technology officer Dan Sexton, who wants to see tech firms embrace emerging safety technology. He wants suspect images to be blocked on devices before they can be shared – in the same way that companies flag and block malware.
Children will have stronger legal protection if the UK’s Online Safety Bill passes, but lawmakers themselves need to understand how safety tech could work, says Shweta Singh, assistant professor in information systems and management at Warwick Business School. “If they knew what was possible, they could demand it,” she says.
She’s creating an AI tool which she describes as a layer of ‘understanding’ that could sit on top of online communication and read between the lines. “Content without context doesn’t mean much.” Her technology will focus on specific harms, including child sex abuse and violence, hate speech, cyberbullying, anorexia, and suicide. It’s laborious work – sifting through vast amounts of communication to compile ‘dictionaries’ of insights based on these defined harms. By leveraging Google’s natural language processing algorithm BERT (Bidirectional Encoder Representations from Transformers), which allows technology to understand language more as humans do, the AI will better grasp nuance and context and help flag when children are at risk; a prototype could be ready this summer. It could raise the alarm by spotting suspect communications and links – banter between friends differs from a lonely child being groomed by an adult. “There are probably thousands of children who’ve been traumatised and haven’t made the headlines, and it’s our responsibility to protect them,” Singh says.
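Singh’s tool isn’t public, but the building block she names – BERT used as a text classifier – can be sketched with the Hugging Face transformers library. The harm labels, model choice and example message below are illustrative assumptions, not her system, and a real classifier would need fine-tuning on labelled data before its outputs meant anything.

```python
# Minimal sketch of a BERT-based harm classifier of the kind Singh describes,
# using the Hugging Face transformers library. Labels, model and example
# message are illustrative; the classification head here is untrained.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["benign", "grooming", "hate_speech", "self_harm", "bullying"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS)
)  # a real system would fine-tune this on labelled examples of each harm

def classify(message: str) -> str:
    inputs = tokenizer(message, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return LABELS[int(logits.argmax(dim=-1))]

print(classify("You can trust me, don't tell your parents we talk."))
```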
Tech platforms already hold a lot of information about their users, such as online habits, identity, location, topics of engagement, says Pikes. “They can see a lot through lexical analysis.” Some tech already scans images before they’re uploaded – Facebook Marketplace for instance – but this level of power in the hands of technology makes freedom of speech advocates feel queasy.
Fast-learning natural language processing tools such as ChatGPT also present a terrifying prospect, says Lemon at Heriot-Watt, who’s also co-founder of conversational AI firm Alana AI. “These tools can create very believable texts that could contain falsehoods. If at the press of a button you could request a couple of paragraphs on why invading a country is a good idea and send out thousands of messages across all social media platforms which don’t appear to have been written by a computer – that’s very risky. You can rapidly disseminate propaganda.”
Access to large language models has expanded the scope of conversational AI, he says – but it must be trained on truths rather than toxic communications. Working with Unicef, Alana AI has already created a conversational chatbot that can spot and help combat myths around Covid-19 and vaccines – such as the claim that drinking whisky cures the virus. Now a team at the National Robotarium, hosted by Heriot-Watt, is working on AI tools that can identify and counter online gender-based violence, as well as provide proactive AI support for victims.
Smaller online groups tend to police themselves, and gaming communities have embraced moderation, says Pikes – because it pays to do so. “If they get toxic behaviour in their environment, people leave. And that costs the industry because there’s so much money spent there now.”
One huge challenge platforms will face is age verification, which forthcoming legislation will demand. How can companies, in a country without a central identity system, check up on underage users? About 60 per cent of children aged 8 to 12 who use social media are signed up with their own profile – and a fake age – despite the 13-year age limit. This matters – once a user reaches 16 or 18, platforms might introduce new features such as adult content and direct messaging. Should content moderation reflect where young users actually are, rather than where they should be?
In the meantime, Ofcom is recruiting technology experts in anticipation that legislation will pass this year. And rather than custodial sentences, platform executives may be fined if Ofcom judges they haven’t lived up to their own standards – to the disappointment of campaigners such as the NSPCC, which wants to see individuals held to account. But creating the law must be a priority, says Pikes. “Otherwise, platforms can say you’re telling us to build something to a design that doesn’t exist yet.”
The Online Safety Bill
Nearly four years in the making, this ambitious piece of legislation aims to walk the tightrope of policing harmful content while retaining free speech. Firms could be fined up to 10 per cent of global turnover if they fail to moderate or remove illegal content, and regulator Ofcom will gain wide-ranging powers.
New laws will be seen around the world as a test case of how to regulate online content.
Legislation could force platforms, forums and search engines to tackle racist and sexual abuse, revenge porn and other dangerous communications. Technology companies will have to stop under 18s from seeing content that risks serious harm – content that encourages suicide is already illegal; now anything that encourages self-harm could also be prosecuted. Platforms must introduce age verification and Ofcom will have power to audit how tech companies control users’ experience through algorithms, and how they meet their own standards.
Controversially, a clause that would limit ‘legal but harmful’ content for adults has been removed, after concern that it would allow tech companies to censor legal speech. Custodial sentences have also been dropped, replaced by fines. Groups outside parliament fear smaller businesses will be hit with complicated rules and pay the cost of compliance. The bill doesn’t recognise users’ rights to freedom of expression, says Monica Horten, policy manager at Open Rights Group.
“[New clauses] raise new concerns around embedded power of the tech companies and a worrying lack of transparency around how Ofcom will enforce them.”
The growth and complexity of online child abuse
In 2021, 96 per cent of imagery removed by the Internet Watch Foundation (IWF) was of girls.
Since 2019, the IWF has recorded a 1,058 per cent increase in the number of webpages depicting the sexual abuse of children aged between seven and 10 who were groomed or coerced online (in these cases offenders trick or coerce children into abusing themselves). In the last year alone, this has grown by 129 per cent, and the IWF has removed 63,050 webpages of sexually abused 7–10-year-olds – 14 per cent of these involved the most severe forms of abuse.
In 2022, the IWF investigated 375,230 reports of suspected child abuse and confirmed 255,588 reports contained abuse, some cases containing thousands of images and videos.