The emergence of toxic information on social networking sites, such as Twitter, Parler, and Reddit, has become a growing concern. Consequently, this study aims to assess the level of toxicity in COVID-19 discussions on Twitter, Parler, and Reddit. Analyzing data from January 1 through December 31, 2020, we examine the development of toxicity over time and compare the findings across the three platforms. The results indicate that Parler had lower toxicity levels than both Twitter and Reddit in discussions related to COVID-19. In contrast, Reddit showed the highest levels of toxicity, largely due to various anti-vaccine forums that spread misinformation about COVID-19 vaccines. Notably, our analysis of COVID-19 vaccination conversations on Twitter also revealed a significant presence of conspiracy theories among individuals with highly toxic attitudes. Our computational approach provides decision-makers with useful information about reducing the spread of toxicity within online communities. The study's findings highlight the importance of taking action to encourage more uplifting and productive online discourse across all platforms.

Much of software engineering research focuses on tools, algorithms, and the optimization of software. Recently, we, as a community, have come to acknowledge a gap in meta-research and in addressing human factors in software engineering research. Through meta-research, we aim to deepen our understanding of online participant recruitment and human-subjects software engineering research. In this paper, we motivate the need to consider the unique challenges that human studies pose in software engineering research. We present several challenges faced by our research team across several distinct studies, describe how they affected the research, and discuss how, as researchers, we can address them. We present results from a pilot study and categorize the issues faced into three broad categories: participant recruitment, community engagement, and data poisoning. We further discuss how we can address these challenges and outline the benefits a full study could provide to the software engineering research community.

While online social media offers a way for ignored or stifled voices to be heard, it also gives users a platform to spread hateful speech. Such speech usually originates in fringe communities, yet it can spill over into mainstream channels. In this paper, we measure the impact of joining fringe hateful communities in terms of the hate speech propagated to the rest of the social network. We leverage data from Reddit to assess the effect of joining one type of echo chamber: a digital community of like-minded users exhibiting hateful behavior. We measure members' usage of hate speech outside the studied community before and after they become active participants, using the level of out-of-community hate-word usage as a proxy for learned hate. Using Interrupted Time Series (ITS) analysis as a causal inference method, we gauge the spillover effect, in which hateful language from within a certain community spreads outside that community. We investigate four Reddit sub-communities (subreddits) covering three areas of hate speech: racism, misogyny, and fat-shaming. In all cases we find an increase in hate speech outside the originating community, implying that joining such a community leads to a spread of hate speech throughout the platform. Moreover, users are found to pick up this new hateful speech for months after initially joining the community. We show that the harmful speech does not remain contained within the community. Our results provide new evidence of the harmful effects of echo chambers and of the potential benefit of moderating them to reduce the adoption of hateful speech.
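To make the ITS design above concrete, here is a minimal segmented-regression sketch in Python, assuming a per-user weekly rate of out-of-community hate-word usage and a known joining week. The data are synthetic and the variable names are illustrative, not the authors' actual pipeline.

```python
# Minimal sketch of an interrupted time series (ITS) segmented regression,
# assuming a weekly rate of out-of-community hate-word usage and a known
# "joining" week t0. Synthetic data; illustrative only.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
t0 = 26  # hypothetical week the user joins the hateful subreddit

# Synthetic weekly rates: flat baseline, then a level jump and steeper slope.
weeks = np.arange(52)
rate = 0.02 + 0.0005 * weeks + (weeks >= t0) * (0.01 + 0.001 * (weeks - t0))
rate += rng.normal(0, 0.003, size=weeks.size)

df = pd.DataFrame({
    "t": weeks,
    "post": (weeks >= t0).astype(int),        # 1 after joining
    "t_since": np.clip(weeks - t0, 0, None),  # weeks since joining (0 before)
    "rate": rate,
})

# rate ~ b0 + b1*t + b2*post + b3*t_since:
#   b2 = immediate level change at joining, b3 = slope change afterwards.
model = smf.ols("rate ~ t + post + t_since", data=df).fit()
print(model.params)
print(model.conf_int())
```

Positive estimates for both the `post` (level) and `t_since` (slope) coefficients would be consistent with the spillover effect the study describes: an immediate jump in out-of-community hate speech at joining, followed by continued growth.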
This research investigates changes in the online behavior of users who publish in multiple communities on Reddit by measuring their toxicity at two levels. With the aid of crowdsourcing, we built a labeled dataset of 10,083 Reddit comments, then used the dataset to train and fine-tune a Bidirectional Encoder Representations from Transformers (BERT) neural network model. The model predicted the toxicity levels of 87,376,912 posts from 577,835 users and 2,205,581,786 comments from 890,913 users on Reddit over 16 years, from 2005 to 2020. This study utilized the toxicity levels of user content to identify toxicity changes by the user within the same community, across multiple communities, and over time. As for toxicity detection performance, the BERT model achieved 91.27% classification accuracy and an area under the receiver operating characteristic curve (AUC) of 0.963, outperforming several baseline machine learning and neural network models.
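As a rough illustration of the pipeline described above, the following sketch fine-tunes a BERT classifier on a binary-labeled comment dataset and reports accuracy and AUC, using Hugging Face transformers. The CSV schema ("text", "label"), checkpoint, and hyperparameters are assumptions for illustration; the study's exact training setup is not specified here.

```python
# Sketch: fine-tune BERT for binary toxicity classification and report
# accuracy and AUC. File names, schema, and hyperparameters are hypothetical.
import numpy as np
from datasets import load_dataset
from sklearn.metrics import accuracy_score, roc_auc_score
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

# Hypothetical labeled comments: a "text" column and a 0/1 "label" column.
data = load_dataset("csv", data_files={"train": "train.csv", "test": "test.csv"})
data = data.map(lambda b: tokenizer(b["text"], truncation=True,
                                    padding="max_length", max_length=128),
                batched=True)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    z = logits - logits.max(-1, keepdims=True)          # stable softmax
    probs = np.exp(z) / np.exp(z).sum(-1, keepdims=True)
    preds = logits.argmax(-1)
    return {"accuracy": accuracy_score(labels, preds),
            "auc": roc_auc_score(labels, probs[:, 1])}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="toxicity-bert", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=data["train"],
    eval_dataset=data["test"],
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())  # accuracy and AUC on the held-out split
```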
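Both toxicity studies above then aggregate the model's predicted scores over time, per community or platform. A minimal pandas sketch of that aggregation step, with an assumed schema, might look like this:

```python
# Sketch: average predicted toxicity per group (subreddit or platform) per
# month. The DataFrame columns are assumed, not the authors' actual schema.
import pandas as pd

# Hypothetical scored content: one row per post/comment.
scores = pd.DataFrame({
    "group": ["r/news", "r/news", "twitter", "parler"],
    "created": pd.to_datetime(["2020-01-03", "2020-02-11",
                               "2020-01-20", "2020-02-28"]),
    "toxicity": [0.12, 0.34, 0.51, 0.08],  # model-predicted score in [0, 1]
})

monthly = (scores
           .assign(month=scores["created"].dt.to_period("M"))
           .groupby(["group", "month"])["toxicity"]
           .mean()
           .unstack("group"))  # one column per community/platform
print(monthly)  # rows: months; cells: mean toxicity, ready to plot/compare
```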