Classification of Toxic Comments on Social Networks Using Machine Learning

María Fernanda Revelo-Bautista, Jair Oswaldo Bedoya-Benavides, Jaime Paúl Sayago-Heredia, Pablo Pico-Valencia, Xavier Quiñonez-Ku

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

This research addresses the problem of toxic comments on social networks and how artificial intelligence (AI), and machine learning (ML) in particular, can help. It presents the development of a machine learning classification model that identifies toxic comments on Twitter. The proposed classifier, developed in Python, was built with 7 different algorithms, using multi-label classification strategies together with data preprocessing, cleaning, and visualization. The model was trained on a total of 159,571 comments from the Jigsaw dataset in the Kaggle repository, in which the comments are classified according to various features. After training, evaluation, and comparison, the result was a classifier capable of identifying toxic and offensive words or comments with an accuracy of 92.16%.
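The abstract does not name the 7 algorithms or the specific multi-label strategy, so the following is only a minimal sketch of one common strategy, binary relevance, in which an independent yes/no classifier is trained for each label. The choice of a multinomial Naive Bayes base classifier, the helper names (`tokenize`, `BinaryNaiveBayes`, `BinaryRelevance`), the two label names, and the six toy comments are all assumptions made for illustration; none of them are taken from the paper or from the actual Jigsaw data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Simple cleaning step: lowercase and keep alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class BinaryNaiveBayes:
    """Multinomial Naive Bayes for a single yes/no label (Laplace smoothing)."""

    def fit(self, docs, targets):
        self.word_counts = {0: Counter(), 1: Counter()}
        self.doc_counts = Counter(targets)
        for tokens, y in zip(docs, targets):
            self.word_counts[y].update(tokens)
        self.vocab = set(self.word_counts[0]) | set(self.word_counts[1])
        return self

    def predict(self, tokens):
        n_docs = sum(self.doc_counts.values())
        scores = {}
        for y in (0, 1):
            # Smoothed log prior for class y.
            score = math.log((self.doc_counts[y] + 1) / (n_docs + 2))
            total = sum(self.word_counts[y].values())
            for t in tokens:
                if t in self.vocab:  # words never seen in training are ignored
                    count = self.word_counts[y][t]
                    score += math.log((count + 1) / (total + len(self.vocab)))
            scores[y] = score
        return 1 if scores[1] > scores[0] else 0


class BinaryRelevance:
    """Multi-label wrapper: one independent binary model per label."""

    def __init__(self, labels):
        self.labels = labels
        self.models = {}

    def fit(self, texts, label_rows):
        docs = [tokenize(t) for t in texts]
        for i, label in enumerate(self.labels):
            targets = [row[i] for row in label_rows]
            self.models[label] = BinaryNaiveBayes().fit(docs, targets)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        return {label: self.models[label].predict(tokens) for label in self.labels}


# Hypothetical toy data (invented for this sketch, not from the Jigsaw dataset).
LABELS = ["toxic", "insult"]
TEXTS = [
    "you are an idiot",
    "this is garbage and awful",
    "thanks for the helpful edit",
    "great article well written",
    "what an idiot move",
    "have a nice day",
]
LABEL_ROWS = [[1, 1], [1, 0], [0, 0], [0, 0], [1, 1], [0, 0]]

model = BinaryRelevance(LABELS).fit(TEXTS, LABEL_ROWS)
print(model.predict("you idiot"))
print(model.predict("this is garbage"))
print(model.predict("thanks for the great article"))
```

Because each label gets its own model, a single comment can receive any combination of labels, which is the property that distinguishes multi-label from ordinary multi-class classification. A data set this small says nothing about real-world accuracy.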

Original language: English
Title of host publication: International Conference on Applied Technologies - 5th International Conference on Applied Technologies, ICAT 2023, Revised Selected Papers
Editors: Miguel Botto-Tobar, Marcelo Zambrano Vizuete, Sergio Montes León, Pablo Torres-Carrión, Benjamin Durakovic
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 257-270
Number of pages: 14
ISBN (Print): 9783031589522
State: Published - 2024
Event: 5th International Conference on Applied Technologies, ICAT 2023 - Samborondon, Ecuador
Duration: 22 Nov 2023 to 24 Nov 2023

Publication series

Name: Communications in Computer and Information Science
Volume: 2050 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 5th International Conference on Applied Technologies, ICAT 2023
Country/Territory: Ecuador
City: Samborondon
Period: 22/11/23 to 24/11/23

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.

Keywords

  • machine learning
  • sentiment analysis
  • text classification
  • toxic comments
  • tweets
  • twitter
