From social media cyberbullying to attacks in the metaverse, the internet can be a dangerous place. Online content moderation is one of the key ways companies can make their platforms more secure for users.
However, moderating content is not an easy task. The amount of content online is staggering. Moderators deal with everything from hate speech and terrorist propaganda to nudity and blood. The “data overload” of the digital world is only compounded by the fact that much of the content is user generated and can be difficult to identify and categorize.
AI to automatically detect hate speech
That’s where AI comes in. By using machine learning algorithms to identify and categorize content, companies can identify insecure content as soon as it is created, instead of waiting hours or days for human review, reducing the number of people exposed to insecure content.
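The pipeline described above can be sketched in a few lines. This is a minimal illustration with toy, hand-labeled data and a simple bag-of-words classifier; production moderation systems use far larger labeled datasets and transformer models, but the shape is the same: vectorize text, train on labeled examples, then score new posts the moment they are created.

```python
# Toy sketch of ML-based content moderation (illustrative only).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical hand-labeled training examples (1 = unsafe, 0 = safe).
posts = [
    "I hate you and everyone like you",
    "you people should all disappear",
    "what a beautiful day at the park",
    "great game last night, well played",
]
labels = [1, 1, 0, 0]

# Vectorize the text and fit a linear classifier in one pipeline.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(posts, labels)

# Score new content immediately, instead of waiting hours for human review.
new_post = "I hate everyone like you"
probability_unsafe = model.predict_proba([new_post])[0][1]
print(f"unsafe probability: {probability_unsafe:.2f}")
```

In a real deployment, posts scoring above a threshold would be hidden or queued for human review, so fewer users are exposed while a moderator confirms the decision.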
For example, Twitter uses AI to identify and remove terrorist propaganda from its platform. AI flags more than half of the tweets that violate its terms of service, and CEO Parag Agrawal has made using AI to identify hate speech and misinformation a focus. That said, more needs to be done, as toxicity is still rampant on the platform.
Similarly, Facebook’s AI detects nearly 90% of the hate speech the platform removes, along with nudity, violence, and other potentially offensive content. But like Twitter, Facebook still has a long way to go.
Where AI goes wrong
Despite its promise, AI-based content moderation faces many challenges. One is that these systems often mistakenly mark safe content as unsafe, which can have serious consequences. For example, at the start of the pandemic, Facebook marked legitimate news articles about the coronavirus as spam. It falsely banned a Republican party Facebook page for more than two months. And it flagged posts and comments about the Plymouth Hoe, a public landmark in England, as offensive.
However, the problem is tricky: failing to flag content can have even more dangerous consequences. The perpetrators of both the El Paso and Gilroy shootings publicized their violent intentions on 8chan and Instagram before carrying out their attacks. Robert Bowers, the accused perpetrator of the Pittsburgh synagogue massacre, was active on Gab, a Twitter-like site used by white supremacists. Misinformation about the war in Ukraine has gained millions of views and likes on Facebook, Twitter, YouTube and TikTok.
Another issue is that many AI-based moderation systems exhibit racial biases that need to be addressed to create a safe and useful environment for all.
Improving AI for moderation
To solve these problems, AI moderation systems need higher-quality training data. Today, many companies outsource the labeling of the data used to train their AI systems to low-skilled, poorly trained call centers in developing countries. These labelers lack the language skills and cultural context to make accurate moderation decisions. For example, unless you’re familiar with American politics, you probably don’t know what a post that says “January 6” or “Rudy and Hunter” refers to, despite its importance to content moderation. If you’re not a native English speaker, you’ll likely over-index on profane terms, even when they’re used in a positive context, erroneously marking references to the Plymouth Hoe or “she’s such a bad bitch” as offensive.
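The Plymouth Hoe failure mode is easy to reproduce: a naive keyword filter (a toy stand-in for a labeler or model that over-indexes on profane terms) flags a benign post about the landmark, because a term lookup carries no context.

```python
# Toy word list for illustration only -- not any platform's actual list.
PROFANITY = {"hoe", "bitch"}

def naive_flag(post: str) -> bool:
    """Flag a post if any token matches the profanity word list."""
    tokens = post.lower().replace(",", " ").replace(".", " ").split()
    return any(token in PROFANITY for token in tokens)

# A benign post about the English landmark is flagged -- a false positive.
print(naive_flag("Lovely sunset over Plymouth Hoe tonight."))  # True
print(naive_flag("Lovely sunset over the harbor tonight."))    # False
```

Context-aware labels, written by annotators who know the language and culture, are what let a trained model distinguish these two cases where a word list cannot.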
One company solving this challenge is Surge AI, a data labeling platform designed to train AI in the nuances of language. It was founded by a team of engineers and researchers who built the trust and security platforms on Facebook, YouTube and Twitter.
For example, Facebook has had a lot of trouble collecting high-quality data to train its moderation systems in key languages. Despite the company’s size and its scope as a global communications platform, it barely had enough content to train and maintain a model for standard Arabic, let alone its dozens of dialects. The lack of a comprehensive list of toxic slurs in the languages spoken in Afghanistan meant the company could miss many violations. It lacked an Assamese hate speech model, even though employees flagged hate speech as a major risk in Assam due to the increasing violence against ethnic groups there. These are issues Surge AI is helping to solve through its focus on languages and its datasets on toxicity and profanity.
In short, with larger, higher-quality datasets, social media platforms can train more precise content moderation algorithms to detect harmful content, keeping their users safe and free from abuse. Just as large datasets have laid the foundation for today’s state-of-the-art language generation models, such as OpenAI’s GPT-3, they can also power better AI for moderation. With enough data, machine learning models can learn to detect toxicity with greater accuracy, and without the biases found in lower-quality datasets.
AI-assisted content moderation isn’t a perfect solution, but it’s a valuable tool that can help companies keep their platforms safe. With the increasing use of AI, we can hope for a future where the online world is safer for everyone.
Valerias Bangert is a strategy and innovation consultant, founder of three profitable media outlets and published author.