TY - GEN
T1 - Exploring the Impact of Toxic Comments in Code Quality
AU - Sayago-Heredia, Jaime
AU - Chango, Gustavo
AU - Pérez-Castillo, Ricardo
AU - Piattini, Mario
N1 - Publisher Copyright:
Copyright © 2022 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved.
PY - 2022
Y1 - 2022
N2 - Software development has an important human side, which implies that developers' feelings have a significant impact on software development and could affect developers' quality, productivity, and performance. In this paper, we explore the process to find, understand, and relate the effects of toxic emotions on code quality. We propose a tool and a sentiment dataset, a clean set of commit messages, extracted from SonarQube code quality metrics and toxic comments obtained from GitHub. Moreover, we perform a preliminary statistical analysis of the dataset. We apply natural language processing techniques to identify toxic developer sentiments on commits that could impact code quality. Our study describes the data retrieval process along with the tools used for performing a preliminary analysis. The preliminary dataset is available in CSV format to facilitate queries on the data and to investigate in depth the factors that impact developer emotions. Preliminary results suggest that there is a relationship between toxic comments and code quality that may affect the quality of the software project. Future research will involve the development of a complete dataset and an in-depth analysis for efficiency validation experiments along with a linear regression. Finally, we will estimate the code quality as a function of developers' toxic comments.
KW - Commits
KW - GitHub
KW - Sentiments Analysis
KW - Software Engineering
KW - Software Quality
KW - SonarQube
KW - Toxic Comment Classification
UR - https://www.scopus.com/pages/publications/85140988970
U2 - 10.5220/0011039700003176
DO - 10.5220/0011039700003176
M3 - Conference contribution
AN - SCOPUS:85140988970
T3 - International Conference on Evaluation of Novel Approaches to Software Engineering, ENASE - Proceedings
SP - 335
EP - 343
BT - Proceedings of the 17th International Conference on Evaluation of Novel Approaches to Software Engineering, ENASE 2022
A2 - Kaindl, Hermann
A2 - Mannion, Mike
A2 - Maciaszek, Leszek
PB - Science and Technology Publications, Lda
T2 - 17th International Conference on Evaluation of Novel Approaches to Software Engineering, ENASE 2022
Y2 - 25 April 2022 through 26 April 2022
ER -