0

How Hateful are Movies? A Study and Prediction on Movie Subtitles

Transfer learning from social media datasets using BERT effectively classifies hate and offensive speech in movie subtitles.

Year
2021
Venue
KONVENS (WS) 2021 9
Authors
5
Hosting
Abstract onlyARXIV-DEFAULT

Cite

Notes

Only stored in your browser.

Attribution

Abstract & full text
arxiv.org/abs/2108.10724ARXIV-DEFAULT
TL;DR
Semantic Scholar
Attribution policy →

Abstract

In this research, we investigate techniques to detect hate speech in movies. We introduce a new dataset collected from the subtitles of six movies, where each utterance is annotated either as hate, offensive or normal. We apply transfer learning techniques of domain adaptation and fine-tuning on existing social media datasets, namely from Twitter and Fox News. We evaluate different representations, i.e., Bag of Words (BoW), Bi-directional Long short-term memory (Bi-LSTM), and Bidirectional Encoder Representations from Transformers (BERT) on 11k movie subtitles. The BERT model obtained the best macro-averaged F1-score of 77%. Hence, we show that transfer learning from the social media domain is efficacious in classifying hate and offensive speech in movies through subtitles.

Authors

5