Online Content Moderation
Our research in natural language processing and computational social science has developed new methods for moderating abusive content online, across different types of media including social media platforms and news outlets. These methods have led to a number of software tools, which we have been scaling up to support real-world use in three ways.
First, we are building generalisable models that can handle content in different languages and formats. These models can be initialised even without labelled data for the target setting, by drawing on data from other sources and languages, and then fine-tuned in situ, quickly learning to perform well in a given setting.
Second, we have developed state-of-the-art temporal adaptation techniques that make these tools robust to changes over time in the content to be moderated, and that mitigate potential biases in their predictions.
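One simple, generic way to make a classifier track such drift is to down-weight older training examples so that recent comments dominate. The sketch below illustrates this with an exponential recency weighting; it is a standard technique shown for illustration, not necessarily the method used in our tools, and all data and the half-life parameter are hypothetical.

```python
# Illustrative temporal adaptation: exponentially down-weight older
# training examples so the model tracks drift in moderated content.
import math
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# (text, label, age_in_days) -- older examples may reflect
# outdated slang, topics or abuse patterns.
examples = [
    ("classic troll nonsense", 1, 400),
    ("lovely piece of reporting", 0, 380),
    ("get lost you clown", 1, 10),
    ("really enjoyed this interview", 0, 5),
]
texts, labels, ages = zip(*examples)

half_life = 90.0  # days until an example's weight halves (tunable)
weights = [math.exp(-math.log(2) * age / half_life) for age in ages]

vec = CountVectorizer().fit(texts)
clf = LogisticRegression().fit(vec.transform(texts), labels,
                               sample_weight=weights)
```

Retraining periodically with such weights lets the model forget stale patterns gradually rather than all at once.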
And third, we are making our techniques personalisable, so that they can learn the moderation preferences of different users, including whether each user wants abusive content blocked, hidden or demoted. Together, these advances are pushing the boundaries of online content moderation by providing tools that are effective and efficient in real-world scenarios; our models have been tested and used in production by news publishers including 24sata (Croatia’s largest daily newspaper), RTVslo (Slovenia’s national broadcaster) and Ekspress Meedia (Estonia’s largest news publisher).
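The per-user preference idea can be sketched as a mapping from a model's abuse score to an action chosen by each user's own thresholds. The class and threshold values below are hypothetical illustrations, not the production system's interface.

```python
# Hypothetical sketch: map an abuse score to a per-user action
# (block, hide, demote or show) according to that user's preferences.
from dataclasses import dataclass

@dataclass
class ModerationPrefs:
    block_above: float = 0.9   # score above which content is blocked
    hide_above: float = 0.7    # ... hidden behind a click-through
    demote_above: float = 0.5  # ... ranked lower in the feed

def action_for(score: float, prefs: ModerationPrefs) -> str:
    """Return the moderation action implied by this user's thresholds."""
    if score >= prefs.block_above:
        return "block"
    if score >= prefs.hide_above:
        return "hide"
    if score >= prefs.demote_above:
        return "demote"
    return "show"

# A stricter user lowers every threshold, so the same score
# triggers a stronger action.
strict_user = ModerationPrefs(block_above=0.6, hide_above=0.4,
                              demote_above=0.2)
```

Under this framing, learning a user's preferences amounts to fitting their thresholds from feedback, while the underlying abuse model stays shared.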

Children are particularly vulnerable to online threats.