Reddit’s New AI Tool to Detect Online Harassment

Last Updated: 12 Mar, 2024

Reddit, the self-proclaimed “front page of the internet,” has long grappled with issues of online harassment. From trolling to hate speech, these negative interactions can create a hostile environment and drive away valuable users. To address this, Reddit recently introduced an AI-powered tool designed to detect and flag potentially harassing content.

In Short

Reddit launches an AI-powered harassment filter to detect and flag offensive content.

The tool uses a Large Language Model trained on moderator actions and previously removed content.

The new feature aims to support moderators and enhance user experience on the platform.

What Is Reddit’s New AI-Powered Tool?

Reddit’s new tool is an AI-powered harassment filter. It scans posts and comments for signs of harassment, such as threats and insults, which helps moderators identify problems faster. The AI does not have the final say, however: human moderators still review flagged content before any action is taken.

How Does Reddit’s AI Tool Work?

The new tool uses a Large Language Model (LLM) trained on a massive dataset of moderator actions and content removed by Reddit’s internal teams. This training allows the LLM to identify patterns in language that often correlate with online harassment. When a user submits a post or comment, the AI analyzes the text, searching for flags like insults, threats, and other forms of abusive language.
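To make the idea of learning from past moderator actions concrete, here is a minimal sketch in Python. It is not Reddit's model: a toy word-frequency scorer stands in for the LLM, and the training examples and the names `TRAINING`, `train`, and `harassment_score` are all invented for illustration.

```python
from collections import Counter

# Hypothetical training data: (comment text, was it removed by a moderator?).
# These examples are invented; Reddit's real dataset is not public.
TRAINING = [
    ("you are an idiot and everyone hates you", True),
    ("shut up you worthless idiot", True),
    ("thanks for the helpful answer", False),
    ("great post, thanks for sharing", False),
]


def train(examples):
    """Count how often each word appears in removed vs. kept comments."""
    removed, kept = Counter(), Counter()
    for text, was_removed in examples:
        (removed if was_removed else kept).update(text.lower().split())
    return removed, kept


def harassment_score(text, removed, kept):
    """Return the fraction of known words that lean toward 'removed' (0.0 to 1.0)."""
    words = [w for w in text.lower().split() if w in removed or w in kept]
    if not words:
        return 0.0  # nothing recognizable, so nothing to flag
    leaning_removed = sum(1 for w in words if removed[w] > kept[w])
    return leaning_removed / len(words)


removed, kept = train(TRAINING)
abusive_score = harassment_score("you idiot", removed, kept)
friendly_score = harassment_score("thanks for sharing", removed, kept)
```

A real LLM captures context and phrasing far beyond single words, but the principle is the same: patterns seen in previously removed content raise the score of new content that resembles it.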

What Happens When the AI Detects Harassment?

If the AI identifies a potential harassment issue, the content won’t be automatically removed. Instead, the system will flag the post or comment for human moderators to review. This two-step approach ensures that flagged content undergoes a human review process before any action is taken.
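The two-step flow described above can be sketched roughly as follows. This is an illustrative model only: the `ReviewQueue` class, its method names, and the status strings are assumptions made for this example, not part of any Reddit API.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewQueue:
    """Hypothetical queue of flagged items awaiting a human moderator."""
    pending: list = field(default_factory=list)

    def submit(self, text: str, flagged_by_ai: bool) -> str:
        # Step 1: the AI can only flag content; it never removes it.
        if flagged_by_ai:
            self.pending.append(text)
            return "flagged_for_review"
        return "published"

    def review(self, text: str, moderator_approves: bool) -> str:
        # Step 2: only the human decision produces a final outcome.
        self.pending.remove(text)
        return "published" if moderator_approves else "removed"


queue = ReviewQueue()
first_status = queue.submit("a friendly comment", flagged_by_ai=False)
second_status = queue.submit("a suspicious comment", flagged_by_ai=True)
final_status = queue.review("a suspicious comment", moderator_approves=False)
```

The point of the design is that "removed" can only ever be returned from `review`, which requires a moderator's input.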

What Does This Mean for Reddit Moderators?

Reddit’s AI harassment tool is a game-changer for moderators. Imagine AI scanning content for red flags, freeing you to focus on complex issues and building your community. Faster response times to harassment and a potential decrease in overall nastiness mean a more positive environment for everyone. But remember, the AI is an assistant, not a dictator. Human moderators will always have the final say on content removal.

Set Up Reddit’s Harassment Filter on Desktop

Step 1: Navigate to the Community You Moderate

Head over to the specific subreddit community you want to manage.

Step 2: Access the “Mod Tools” Section

On the right-hand sidebar, locate the “About Community” tab. Click on this tab and then select “Mod Tools” from the available options.

Set Up Reddit’s Harassment Filter on iOS and Android

Step 1: Go To The Community You Want to Moderate

Open the Reddit app and navigate to the specific community where you have moderation privileges.

Step 2: Tap On The “Mod Tools” Button

Look for the “Mod Tools” button located below the community’s banner. This button might be represented by a wrench icon or labeled explicitly.

Step 3: Go To “Moderation” And Then “Safety”

Once you’ve accessed the Mod Tools menu, find the “Moderation” section and tap on it. Within “Moderation,” locate the “Safety” option and tap again to proceed.

Step 4: Enable The “Harassment Filter” And Choose Your Desired Strength

In the “Safety” settings, you should see the “Harassment filter” option. Toggle the switch to activate it. You can then choose between “Low” and “High” filter strength depending on your community’s needs.

Note that individual users cannot set up a personal harassment filter on Reddit. The harassment filter is a community-level setting that only moderators can enable.

What Are the Filter Options Available?

Individual users have no filter options of their own for targeting harassment on Reddit. Moderators, however, can configure the harassment filter within their moderation tools.

Here’s a quick breakdown:

On/off switch: Moderators enable or disable the filter for their community. Beyond the strength setting below, the filter is not customizable.
Two strength options: Once enabled, moderators can choose between “Low” and “High” filter strength.
Low: This setting catches less content but is more accurate in identifying true harassment.
High: This setting casts a wider net, potentially filtering out some legitimate content alongside genuine harassment. Reddit recommends “High” for communities experiencing significant harassment issues.
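One way to picture the difference between the two strengths is as a cut-off on a harassment score. The sketch below assumes the filter produces a score between 0 and 1; the cut-off values and the names `THRESHOLDS` and `is_flagged` are hypothetical, since Reddit does not publish how the settings are implemented.

```python
# Hypothetical score cut-offs; the real values are not public.
# "low" strength demands a high score, so it flags less but is more accurate.
# "high" strength accepts lower scores, so it casts a wider net.
THRESHOLDS = {"low": 0.9, "high": 0.6}


def is_flagged(score: float, strength: str) -> bool:
    """Flag content whose score meets the cut-off for the chosen strength."""
    return score >= THRESHOLDS[strength]


# Invented scores for three comments: clearly abusive, borderline, benign.
scores = [0.95, 0.7, 0.3]
flagged_low = [s for s in scores if is_flagged(s, "low")]
flagged_high = [s for s in scores if is_flagged(s, "high")]
```

With these assumed numbers, the borderline comment is flagged only under “High”, which illustrates why that setting catches more harassment but also risks filtering legitimate content.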

Potential Benefits of Reddit’s AI Tool

Reddit’s new AI-powered harassment detection tool holds the promise of significant improvements to the platform’s online safety. Here’s a closer look at some of the potential benefits:

Increased Efficiency for Moderators: Sifting through massive amounts of content can be overwhelming for moderators. The AI tool can act as a valuable assistant, flagging potentially harassing posts and comments for review. This frees up moderator time to focus on more complex issues and community-building initiatives.
Faster Response Times: Online harassment often thrives when left unchecked. The AI tool can help moderators identify potential issues much faster, allowing them to take action and address the situation before it escalates. This can contribute to a more positive and inclusive environment for all users.
Reduced Harassment and Trolling: The mere presence of a harassment detection tool can act as a deterrent. Knowing their posts might be flagged can discourage users from resorting to abusive language or trolling behavior. Over time, this could lead to a noticeable decrease in overall harassment on the platform.
Improved User Experience: By filtering out abusive content, the AI tool can create a more welcoming and enjoyable experience for the vast majority of Reddit users. This can encourage participation, foster healthy discussions, and ultimately lead to a more vibrant and engaged user base.
Enhanced Community Building: When users feel safe and respected, they’re more likely to participate in discussions and contribute to the community. The AI tool can help cultivate a more positive atmosphere, allowing moderators and users to work together in building stronger, harassment-free communities.

Impact on Free Speech on Reddit

Free speech is a cornerstone principle of Reddit, allowing for open and diverse discussions. However, this freedom doesn’t extend to harassment and hate speech. The implementation of the AI tool raises questions about its potential impact on free speech on the platform. Here’s a breakdown of the potential concerns and how Reddit aims to navigate them:

Potential for Misflagging
Transparency and User Trust
Human Oversight Remains Paramount
Finding the Right Balance

As the AI tool evolves and Reddit gathers user feedback, refinements can be made to ensure it upholds free speech while creating a positive online experience for everyone.

Conclusion

The implementation of an AI-powered harassment detection tool demonstrates Reddit’s commitment to creating a safer platform for its users. While the technology has its limitations, it has the potential to significantly improve online safety. As Reddit and other platforms continue to develop and implement AI tools, it will be interesting to see how they shape the future of online discourse.

Reddit’s AI-Powered Tool – FAQs

What is Reddit’s AI technology?

Reddit’s AI tool is a harassment detection system that helps moderators identify potentially abusive content.

Will Reddit’s AI tool remove my content?

No, the AI tool flags content for human moderators to review. Moderators make the final decision on removal.

Is Reddit’s AI Tool free?

Yes, the AI tool is part of the Reddit platform and there’s no additional cost for users.
