How Artificial Intelligence is Creating a More Secure Online Environment

The Internet can be a dangerous place, from social media cyberbullying to assault in the metaverse. Moderation of online content is a critical way for businesses to make their platforms safer for users.

Moderating content, however, is not an easy task. The volume of content posted online is enormous, and moderators must deal with everything from hate speech and terrorist propaganda to nudity and gore. This “data overload” is made worse by the fact that much of the content is user-generated and therefore difficult to identify and categorise.

Automatic detection of hate speech via artificial intelligence

That is where artificial intelligence comes into play. By using machine learning algorithms to identify and categorise content, businesses can catch unsafe content as soon as it is created rather than waiting hours or days for human review, which reduces the number of people exposed to it.
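
As a rough illustration of checking content at creation time, here is a minimal sketch of a tiny naive Bayes text classifier that decides whether a new post is published or held for human review. Everything in it is an assumption made for illustration: the training examples are invented, and the names `TRAINING`, `train`, `score`, and `moderate` are hypothetical. It does not describe how any real platform's system works.

```python
import math
from collections import Counter

# Hypothetical, hand-written examples. A real system would be trained on
# a large, carefully labelled dataset.
TRAINING = [
    ("i will hurt you", "unsafe"),
    ("you people are vermin", "unsafe"),
    ("i hate you and will hurt you", "unsafe"),
    ("lovely walk by the sea today", "safe"),
    ("i love this recipe", "safe"),
    ("see you at the game today", "safe"),
]


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(examples):
    """Count words per label and posts per label."""
    word_counts = {"safe": Counter(), "unsafe": Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["safe"]) | set(word_counts["unsafe"])
    return word_counts, doc_counts, vocab


def score(text, label, model):
    """Log-probability of the label given the text, with add-one smoothing."""
    word_counts, doc_counts, vocab = model
    total_words = sum(word_counts[label].values())
    log_prob = math.log(doc_counts[label] / sum(doc_counts.values()))
    for word in tokenize(text):
        if word not in vocab:
            continue  # ignore words never seen during training
        count = word_counts[label][word] + 1
        log_prob += math.log(count / (total_words + len(vocab)))
    return log_prob


def moderate(text, model):
    """Decide at posting time whether to publish or hold for human review."""
    if score(text, "unsafe", model) > score(text, "safe", model):
        return "hold_for_review"
    return "publish"


MODEL = train(TRAINING)
print(moderate("i will hurt you", MODEL))
print(moderate("lovely walk today", MODEL))
```

Because the decision is made when the post is submitted, a post the model considers unsafe can be held back before anyone sees it. With only six training examples the model is far too small to be useful; the point is the shape of the pipeline, not its accuracy.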

For example, Twitter employs artificial intelligence to detect and remove terrorist propaganda from its platform. Over half of the tweets that violate the company’s terms of service are flagged by AI, and CEO Parag Agrawal has made it a priority to use AI to detect hate speech and misinformation. That said, more needs to be done, as toxicity continues to be a problem on the platform.

Similarly, Facebook’s artificial intelligence detects and removes nearly 90% of hate speech, as well as nudity, violence, and other potentially offensive content. However, Facebook, like Twitter, has a long way to go.

Where artificial intelligence goes wrong

Despite its potential, AI-assisted content moderation faces numerous obstacles. One issue is that these systems frequently flag safe content as unsafe, which can have serious consequences. For example, at the start of the pandemic, Facebook flagged legitimate news articles about the coronavirus as spam. It also mistakenly suspended a Republican Party Facebook page for more than two months, and it flagged posts and comments about the Plymouth Hoe, a public landmark in England, as offensive.

However, the issue is complicated, because failing to flag content can have even more dangerous consequences. Before their attacks, the El Paso and Gilroy shooters expressed their violent intentions on 8chan and Instagram. Robert Bowers, the alleged perpetrator of the Pittsburgh synagogue massacre, was a frequent user of Gab, a Twitter-like website popular with white supremacists. Misinformation about the Ukraine war has garnered millions of views and likes on Facebook, Twitter, YouTube, and TikTok.

Additionally, many AI-based moderation systems exhibit racial biases that must be addressed in order to create a safe and usable environment for all.

Enhancing artificial intelligence for moderation

To address these concerns, AI moderation systems require more accurate training data. Many businesses now outsource the labelling of the data used to train their AI systems to call centres in developing countries staffed by low-skilled, poorly trained workers. These labellers lack the linguistic proficiency and cultural context necessary to make accurate moderation decisions.

For instance, unless you’re familiar with American politics, you’re unlikely to understand what a message mentioning “Jan 6” or “Rudy and Hunter” means, even though such references are critical to content moderation. If you are not a native English speaker, you are likely to over-index on profane terms even when they are used positively or innocently, flagging references to the Plymouth Hoe or “she’s such a bad bitch” as offensive.
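
The over-flagging described above can be sketched in a few lines. The snippet below compares a naive word blocklist with a version that first sets aside phrases that informed labellers would mark as benign. The `BLOCKLIST` and `BENIGN_PHRASES` lists and both function names are hypothetical, chosen only to mirror the examples in this article. A phrase list is a crude stand-in for the context a model picks up from well-labelled data, not a recommended moderation technique.

```python
import re

# Hypothetical lists, for illustration only.
BLOCKLIST = {"hoe", "bitch"}
BENIGN_PHRASES = ["plymouth hoe", "bad bitch"]


def naive_flag(text):
    """Flag any text containing a blocklisted word, ignoring context."""
    words = re.findall(r"[a-z']+", text.lower())
    return any(word in BLOCKLIST for word in words)


def context_aware_flag(text):
    """Remove known benign phrases first, then apply the blocklist."""
    lowered = text.lower()
    for phrase in BENIGN_PHRASES:
        lowered = lowered.replace(phrase, " ")
    return naive_flag(lowered)


for post in ["Meet me at the Plymouth Hoe", "she's such a bad bitch"]:
    print(post, naive_flag(post), context_aware_flag(post))
```

The naive filter flags both posts, while the context-aware version flags neither and still catches the blocklisted words when they appear outside the benign phrases.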

Surge AI, a data labelling platform designed for training AI in the nuances of language, is one company addressing this challenge. It was founded by a group of engineers and researchers who previously worked on trust and safety platforms at Facebook, YouTube, and Twitter.

Facebook, for example, has encountered numerous challenges in obtaining high-quality data to train its moderation systems in critical languages. Despite the company’s size and scope as a global communications platform, it lacked the content necessary to train and maintain a model for standard Arabic, let alone its dozens of dialects.

Because the company lacks a comprehensive list of toxic slurs in the Afghan languages, it may be overlooking numerous violating posts. It also lacked an Assamese hate speech model, even though employees had identified hate speech as a significant risk in Assam, owing to the region’s growing violence against ethnic groups. Surge AI helps resolve these issues by focusing on languages as well as on toxicity and profanity datasets.

In short, by training more accurate content moderation algorithms on larger, higher-quality datasets, social media companies can help keep their platforms safe and free of abuse.

Just as large datasets have fuelled today’s state-of-the-art language generation models, such as OpenAI’s GPT-3, they can also fuel improved AI for moderation. With sufficient data, machine learning models can be trained to detect toxicity more accurately and without the biases associated with low-quality datasets.

While AI-assisted content moderation is not a perfect solution, it is an important tool that can assist businesses in keeping their platforms safe and secure. With the growing use of artificial intelligence, we can hope for a future in which the online world is a safer place for everyone.
