What It Is
Potentially offensive comments, identified through a combination of user reports and algorithmic scoring, are hidden from view and can only be seen if users click to reveal them.
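A minimal sketch of the hide-or-reveal mechanic described above, in Python. The thresholds, field names, and toy classifier are all illustrative assumptions, not any platform's actual implementation; real systems tune these values and combine far richer signals.

```python
from dataclasses import dataclass

REPORT_THRESHOLD = 3    # hypothetical: hide after this many user reports
SCORE_THRESHOLD = 0.8   # hypothetical: hide above this algorithmic toxicity score

@dataclass
class Comment:
    text: str
    user_reports: int = 0        # count of "offensive" reports from other users
    toxicity_score: float = 0.0  # output of an algorithmic classifier, 0 to 1

def is_hidden(comment: Comment) -> bool:
    """Hide a comment if either signal crosses its threshold."""
    return (comment.user_reports >= REPORT_THRESHOLD
            or comment.toxicity_score >= SCORE_THRESHOLD)

def render(comment: Comment, show_all: bool = False) -> str:
    """Hidden comments are not deleted: passing show_all=True, the analogue
    of clicking 'see all comments', reveals them."""
    if is_hidden(comment) and not show_all:
        return "[Comment hidden. Click 'see all comments' to view.]"
    return comment.text
```

Note the design choice this sketch encodes: filtering reduces exposure while preserving access, which is what distinguishes it from outright removal.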
Civic Signal Being Amplified
When To Use It
What Is Its Intended Impact
Filtering offensive comments aims to reduce users' exposure to toxic content, fostering a healthier online environment and, by shifting platform norms, potentially reducing the production of further offensive comments.
Evidence That It Works
Using data from a large-scale randomized controlled trial on Nextdoor, Katsaros et al. (2024) evaluated the impact of offensive comment filtering on how, and how much, users engage on the platform. In the treatment condition, comments that had been flagged as offensive were hidden from users, though they could be revealed by clicking "see all comments" (only 1% of users did). The treatment reduced exposure to offensive comments by 12% relative to the control group; the drop was limited because of the lag between when a comment was posted and when it was flagged. That decrease in exposure, however, had no detectable effect on the content users created: the comments of users with less exposure to potentially offensive content were just as civil as those in the control condition. On the plus side, users in the treatment group showed no decrease in the amount of time they spent on the platform.
In a related study, Ribeiro et al. (2022) found that filtering comments, as opposed to removing them, likewise had no prosocial effect on the users who authored the problematic comments.
Although Katsaros et al. (2024) observed no improvement in user interactions as a result of the intervention, that may be due to the delays in filtering and the limited reduction in exposure. Future designs and research might explore whether automated algorithmic, and thus immediate, filtering (see the sketch below) increases prosocial outcomes.
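A hypothetical sketch of that "immediate filtering" design: the comment is scored synchronously at submission, so a potentially offensive comment is never displayed unfiltered during a report-and-review lag. The classify_toxicity stand-in and the 0.8 threshold are assumptions made for illustration.

```python
def classify_toxicity(text: str) -> float:
    """Illustrative stand-in for an algorithmic toxicity classifier (0 to 1).
    A real system would call a trained model here."""
    offensive_words = {"idiot", "stupid"}  # toy lexicon for the sketch
    return 1.0 if offensive_words & set(text.lower().split()) else 0.0

def submit_comment(text: str, feed: list) -> None:
    """Score the comment at posting time, so the hidden flag is set from
    the first moment it appears in the feed, not after a flagging delay."""
    score = classify_toxicity(text)
    feed.append({"text": text, "hidden": score >= 0.8})
```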
Why It Matters
As online platforms work to create healthier environments, filtering offensive comments offers a way to reduce the visibility of toxic content and potentially shape norms that lead to healthier interactions. Current research does not show that filtering shifts norms enough to produce prosocial outcomes, but it also shows no reduction in user engagement. In other words, even if filtering offensive comments does not make behavior more prosocial, platforms can reduce the risk of exposing users to offensive material without paying a cost in engagement.