Content Moderation on News and Social Media

AI tools are used to spot potentially harmful comments, posts, and other content and remove them from discussion boards and social media platforms. These tools often misconstrue language from cultural and dialectal varieties they were not trained on, effectively censoring those speakers' voices.
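To make that failure mode concrete, here is a deliberately simplified sketch of how a rule-based comment filter might behave. The blocklist, threshold, and function name are invented for illustration; real moderation systems use trained classifiers rather than word lists, but the core weakness the papers below examine (no awareness of dialect or context) is the same.

```python
# Toy illustration of a keyword-based comment filter. The word list and
# threshold are invented for this sketch; production systems use trained
# classifiers, not hard-coded blocklists.

FLAGGED_TERMS = {"idiot", "trash", "stupid"}  # hypothetical blocklist


def should_remove(comment: str, threshold: int = 1) -> bool:
    """Flag a comment for removal if it contains enough blocklisted terms."""
    tokens = comment.lower().split()
    hits = sum(1 for token in tokens if token.strip(".,!?") in FLAGGED_TERMS)
    return hits >= threshold


# A filter like this has no notion of context or in-group usage, so harmless
# speech can be flagged at the same rate as genuine abuse.
print(should_remove("That movie was trash, honestly"))  # True: a false positive
print(should_remove("Have a great day"))                # False
```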


Responsible AI for Inclusive, Democratic Societies: A cross-disciplinary approach to detecting and countering abusive language online

A research paper about responsible AI. Toxic and abusive language threatens the integrity of public dialogue and democracy. In response, governments worldwide have enacted strong laws against abusive language that leads to hatred, violence, and criminal offences against a particular group. The responsible (i.e. effective, fair, and unbiased) moderation of abusive language carries significant challenges. Our […]


Let’s Talk About Race: identity, chatbots, and AI

A research paper about race and AI chatbots. Why is it so hard for AI chatbots to talk about race? By researching databases, natural language processing, and machine learning in conjunction with critical, intersectional theories, we investigate the technical and theoretical constructs underpinning the problem space of race and chatbots. This paper questions how to […]


Racial bias in hate speech and abusive language detection datasets

A research paper on racial bias in hate speech detection datasets. Technologies for abusive language detection are being developed and applied with little consideration of their potential biases. We examine racial bias in five different sets of Twitter data annotated for hate speech and abusive language. Tweets written in African-American English are far more likely to be automatically […]


Risk of racial bias in hate speech detection

A research paper investigating how insensitivity to differences in dialect can lead to racial bias in automatic hate speech detection models, potentially amplifying harm against minority populations.
