Academic literature on the topic 'Toxic comments classifier'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Toxic comments classifier.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Journal articles on the topic "Toxic comments classifier"

1

Muhammad, Savad N., and K. H. Hasna. "Toxic comment classifier on social media platform." i-manager's Journal on Computer Science 10, no. 4 (2023): 17. http://dx.doi.org/10.26634/jcom.10.4.19307.

Full text
Abstract:
The objective of the paper is to mitigate internet negativity by identifying and blocking toxic comments related to a particular topic or product. The detrimental effects of social media abuse and harassment can cause people to refrain from expressing themselves. Although some platforms disable user comments altogether, this method is not efficient. Detecting toxicity in comments can help platforms take appropriate measures. The paper aims to classify comments according to their toxicity levels for future blocking. The dataset comprises comments classified into six types: toxic, severe toxic, threat, obscene, identity hate, and insult. The authors employ four classification techniques and select the most precise one for the data. This methodology enables the authors to choose various datasets for the problem and select the most accurate classifier for each dataset.
APA, Harvard, Vancouver, ISO, and other styles
2

Abubakar, Muhammad, Aminu Tukur, and Usman Bukar Usman. "AN IMPROVED MULTI-LABELED LSTM TOXIC COMMENT CLASSIFICATION." Journal of Applied Science, Information and Computing 1, no. 2 (2020): 57–66. http://dx.doi.org/10.59568/jasic-2020-1-2-08.

Full text
Abstract:
Text classification dates back to the early 1960s; it assigns text to predefined categories. One technique used for text classification is long short-term memory (LSTM), an artificial recurrent neural network architecture. Today, people all around the world express their opinions and discuss them with others via social media. In such a setting, it is quite common for discussions to arise from differences of opinion. These discussions can turn ugly, escalate into conflicts on social media platforms, and lead to offensive language termed toxic comments. To identify online hate speech, a large number of scientific studies have applied Natural Language Processing in combination with machine learning and deep learning methods. Among the challenges of toxic comment classifiers is the out-of-vocabulary problem: the occurrence of words that are not present in the training data. Long-range dependencies are another challenge: the toxicity of a comment often depends on expressions made in its early parts, which is especially problematic for longer comments. A further challenge is the low accuracy of comment classification techniques. The number of training epochs was used to improve the accuracy of the LSTM, since it positively affects the speed and quality of the learning process. We obtained an improvement of 0.4068 in precision, 0.2871 in recall, 0.2293 in F1, and 0.4291 in accuracy.
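The out-of-vocabulary problem described in this abstract is commonly handled at the preprocessing stage by reserving an `<unk>` token for unseen words; a minimal stdlib-only sketch of that idea (the toy comments and function names are illustrative assumptions, not from the paper):

```python
# Build a vocabulary from training comments and map any token not seen
# during training to a reserved <unk> index, so an LSTM embedding layer
# never receives an out-of-vocabulary token id.
UNK, PAD = "<unk>", "<pad>"

def build_vocab(comments):
    vocab = {PAD: 0, UNK: 1}
    for text in comments:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def encode(text, vocab):
    # Unknown words fall back to the <unk> id instead of failing.
    return [vocab.get(tok, vocab[UNK]) for tok in text.lower().split()]

vocab = build_vocab(["you are great", "this is toxic"])
ids = encode("you are toxic garbage", vocab)  # "garbage" was never seen
```

The reserved `<pad>` id is conventionally used to bring variable-length comments to a fixed length before batching.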
3

Rahman, Md Abdur, Abu Nayem, Mahfida Amjad, and Md Saeed Siddik. "How do Machine Learning Algorithms Effectively Classify Toxic Comments? An Empirical Analysis." International Journal of Intelligent Systems and Applications 15, no. 4 (2023): 1–14. http://dx.doi.org/10.5815/ijisa.2023.04.01.

Full text
Abstract:
Toxic comments on social media platforms, news portals, and online forums are impolite, insulting, or unreasonable posts that usually make other users leave a conversation. Due to the significant number of comments, it is impractical to moderate them manually. Therefore, online service providers detect toxicity automatically using Machine Learning (ML) algorithms. However, a model's toxicity identification performance relies on the best combination of classifier and feature extraction technique. In this empirical study, we set up a comparison environment for toxic comment classification using 15 frequently used supervised ML classifiers with the four most prominent feature extraction schemes. We considered the publicly available Jigsaw dataset of toxic comments written by human users. We tested, analyzed, and compared every investigated classifier-feature pair and report our conclusions. We used accuracy and area under the ROC curve as the evaluation metrics. We found that Logistic Regression and AdaBoost are the best toxic comment classifiers. The average accuracy of Logistic Regression and AdaBoost is 0.895 and 0.893, respectively, and both achieved the same area under the ROC curve (0.828). Therefore, the primary takeaway of this study is that Logistic Regression and AdaBoost leveraging BoW, TF-IDF, or Hashing features can perform sufficiently well for toxic comment classification.
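The winning pair this abstract reports (Logistic Regression over TF-IDF features) can be reproduced in outline with scikit-learn; the tiny inline comments below are illustrative stand-ins for the Jigsaw data, not the paper's setup:

```python
# Sketch of the reported best-performing combination: TF-IDF features
# fed to Logistic Regression, wrapped in a single pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

comments = [
    "you are an idiot", "have a nice day", "shut up fool",
    "thanks for sharing", "what an idiot move", "great point, well made",
]
labels = [1, 0, 1, 0, 1, 0]  # 1 = toxic, 0 = clean (toy labels)

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(comments, labels)
pred = model.predict(["you idiot"])[0]
```

Swapping `TfidfVectorizer` for `CountVectorizer` (BoW) or `HashingVectorizer` covers the other feature schemes the study compares.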
4

Koriashkina, L. S., and H. V. Symonets. "APPLICATION OF MACHINE LEARNING ALGORITHMS FOR PROCESSING COMMENTS FROM THE YOUTUBE VIDEO HOSTING UNDER TRAINING VIDEOS." Science and Transport Progress. Bulletin of Dnipropetrovsk National University of Railway Transport, no. 6(90) (April 8, 2021): 33–42. http://dx.doi.org/10.15802/stp2020/225264.

Full text
Abstract:
Purpose. Detecting toxic comments under training videos on the YouTube video hosting service by classifying unstructured text using a combination of machine learning methods. Methodology. To work with this type of data, machine learning methods were used for cleaning, normalizing, and representing textual data in a form suitable for computer processing. To classify comments as "toxic", we used a logistic regression classifier, linear support vector classification with and without stochastic gradient descent as the learning method, a random forest classifier, and a gradient boosting classifier. To assess the classifiers, we computed the confusion matrix, accuracy, recall (completeness), and F-measure; for a more generalized assessment, cross-validation was used. The Python programming language was used throughout. Findings. Based on the assessment indicators, the most suitable methods were selected: the linear support vector machine (Linear SVM), with and without stochastic gradient descent as the training method. The described technologies can be used to analyze textual comments under any training videos to detect toxic reviews; the approach can also be useful for identifying unwanted or even aggressive information on social networks or review services. Originality. It consists in a combination of preprocessing methods for this specific type of text, taking into account features such as timecodes, emoji, and links, as well as in the adaptation of machine learning classification methods for the analysis of Russian-language comments. Practical value. It lies in optimizing (simplifying) the comment analysis process. The need for this processing is due to growing volumes of text data, especially in education under quarantine conditions and the transition to distance learning. The volume of educational Internet content already calls for automated processing and analysis of feedback, and this need will only grow over time.
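The evaluation measures this abstract lists (confusion matrix counts, accuracy, completeness/recall, F-measure) reduce to a few lines of arithmetic; a stdlib-only sketch, not the authors' code:

```python
# Accuracy, precision, recall (completeness) and F-measure computed
# from binary confusion-matrix counts, as used to compare classifiers.
def metrics(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(y_true),
            "precision": precision, "recall": recall, "f1": f1}

m = metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

Cross-validation, the paper's more general assessment, simply averages these metrics over repeated train/test splits.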
5

Omprakash, Yadav, Barretto Giselle, Bhosle Siddhi, and Dmello Candice. "Detection and Classification of Online Toxic Comments." Journal of Advancement in Software Engineering and Testing 4, no. 1 (2021): 1–5. https://doi.org/10.5281/zenodo.4817933.

Full text
Abstract:
In the current century, social media has created many job opportunities and has become a unique place for people to freely express their opinions. But as every coin has two sides, along with its pros social media has many cons. A few users take advantage of this system and misuse the opportunity to express their toxic mindset (i.e., insulting, verbal sexual harassment, foul behavior, etc.), and hence cyberbullying has become a major problem. If we can filter out the hurtful, toxic words expressed on social media platforms like Twitter, Instagram, and Facebook, the online world will become a safer and more harmonious place. We gained initial ideas by researching current toxic comment classifiers to come up with this design. We then took what we found and made the most user-friendly product possible. For this project, we created a Toxic Comments Classifier which classifies comments by category of toxicity and displays the probability, as a percentage, for each category.
6

Kadam, Ms Shivani, Ms Komal Ghatage, Mr Aadesh Chaugule, Mr Shubham Dilip Gajarushi, and Prof J. W. Bakal. "Comment Toxicity Tracker Using NLP with Emphasis on Machine Learning Algorithms." International Journal for Research in Applied Science and Engineering Technology 12, no. 3 (2024): 795–804. http://dx.doi.org/10.22214/ijraset.2024.58929.

Full text
Abstract:
The rise of online platforms has led to an unprecedented volume of user-generated content, including comments on various forums, social media posts, and news articles. However, this abundance of user comments has also brought to light the issue of toxicity, where certain comments contain harmful, offensive, or inflammatory language that can negatively impact online discussions and communities. To address this issue, this research centers on the development of a comment toxicity detection model using Natural Language Processing (NLP) and machine learning. The proposed system leverages state-of-the-art NLP. By training these models on labelled datasets of toxic and non-toxic comments, the system learns to identify patterns and linguistic cues associated with toxic language. Key components of the system include preprocessing steps to clean and tokenize the comments, feature extraction using word embeddings or contextual embeddings, and model training using machine learning algorithms such as neural networks and the Random Forest classifier. Evaluation metrics such as accuracy, precision, recall, and F1-score are used to assess the performance of the trained model.
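The cleaning and tokenization step this abstract names can be sketched with the standard library; the specific cleaning rules below are illustrative assumptions, not the authors':

```python
import re

def clean_and_tokenize(comment):
    # Lowercase, strip URLs and non-letter characters, then split into tokens.
    comment = comment.lower()
    comment = re.sub(r"https?://\S+", " ", comment)  # drop links
    comment = re.sub(r"[^a-z\s]", " ", comment)      # keep letters only
    return comment.split()

tokens = clean_and_tokenize("You're SO dumb!!! http://spam.example")
```

The resulting token lists are what downstream feature extraction (word embeddings, counts) consumes.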
7

Romadina, Vira Nindya, Oktalia Juwita, and Priza Pandunata. "Analisis Komentar Toxic Terhadap Informasi COVID-19 pada YouTube Kementerian Kesehatan Menggunakan Metode Naïve Bayes Classifier." INFORMAL: Informatics Journal 9, no. 1 (2024): 92. http://dx.doi.org/10.19184/isj.v9i1.48126.

Full text
Abstract:
Countries around the world were shocked by the outbreak of a new virus in 2020 that spread quickly and attacks humans of all ages: COVID-19. The government has advised through social media to stay at home and get vaccinated. YouTube has become a platform for the government, especially the Ministry of Health, to share public information during the COVID-19 pandemic. The public can post comments on videos uploaded by the Ministry of Health. An analysis of these comments is needed so that the information in them can be useful to readers and can be evaluated by the government to provide information the public can understand. Text mining can be used to analyze toxic comments; one such method is the Naïve Bayes Classifier, which this study uses to produce its analysis. To measure accuracy, this study uses confusion matrix evaluation. The final results show that the highest accuracy, 80%, was obtained with a 90%:10% train-test split. The analysis also shows that the most common toxic words are 'dead', 'business', 'public', and 'fool'. The results indicate that many people still do not believe in the existence of COVID-19 and think that vaccines can cause death in people who are vaccinated.
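The Naïve Bayes classifier this study applies fits in a few lines; below is a stdlib-only multinomial Naïve Bayes with add-one smoothing, with toy English comments standing in for the Indonesian YouTube data (all names and examples are illustrative, not the authors'):

```python
import math
from collections import Counter

class NaiveBayes:
    # Multinomial Naïve Bayes with add-one (Laplace) smoothing.
    def fit(self, docs, labels):
        self.classes = set(labels)
        self.prior = {c: math.log(labels.count(c) / len(labels)) for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        for doc, c in zip(docs, labels):
            self.counts[c].update(doc.split())
        self.vocab = {w for cnt in self.counts.values() for w in cnt}
        return self

    def predict(self, doc):
        def score(c):
            total = sum(self.counts[c].values()) + len(self.vocab)
            return self.prior[c] + sum(
                math.log((self.counts[c][w] + 1) / total) for w in doc.split())
        return max(self.classes, key=score)

nb = NaiveBayes().fit(
    ["you fool", "dead fool government", "thanks for the info", "very helpful video"],
    ["toxic", "toxic", "clean", "clean"])
label = nb.predict("you dead fool")
```

The 90%:10% comparison in the abstract corresponds to fitting on 90% of the labeled comments and scoring accuracy on the held-out 10%.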
8

Subha K, Benitlin. "Enhancing Social Media Safety with Machine Learning-Based Cyberbullying Detection." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 04 (2025): 1–9. https://doi.org/10.55041/ijsrem46570.

Full text
Abstract:
Social media platforms have revolutionized global communication but also facilitated the rise of cyberbullying, posing serious threats to user well-being, particularly among youth. Manual moderation is inadequate for managing the scale and velocity of harmful content online. This paper proposes a machine learning-based system for real-time cyberbullying detection, leveraging TF-IDF vectorization and a Logistic Regression classifier to identify and categorize user comments as toxic, obscene, threatening, or hateful. A Flask-powered web interface enables users to evaluate comment toxicity interactively. Designed for scalability and future expansion, the system supports integration with social media APIs, multi-language processing, and potential adoption of deep learning models. This work aims to contribute to safer online environments through intelligent, automated moderation. Key Words: Cyberbullying Detection, Machine Learning, TF-IDF, Logistic Regression, Toxic Comment Classification, Online Safety, Flask Web Application, Social Media Moderation, Natural Language Processing, Real-Time Content Filtering
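TF-IDF vectorization, the feature step named in this abstract, is short enough to write out in full; a stdlib-only sketch using smoothed inverse document frequency (the smoothing formula is an assumption borrowed from common practice, not necessarily the paper's exact variant):

```python
import math
from collections import Counter

def tfidf(docs):
    # Term frequency weighted by smoothed inverse document frequency:
    # rare terms get a higher weight than terms appearing in every doc.
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    df = Counter(w for toks in tokenized for w in set(toks))
    idf = {w: math.log((1 + n) / (1 + d)) + 1 for w, d in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({w: (c / len(toks)) * idf[w] for w, c in tf.items()})
    return vectors

vecs = tfidf(["you are toxic", "you are kind"])
```

In `vecs[0]`, the discriminative word "toxic" outweighs "you", which occurs in both documents; these weighted vectors are what a Logistic Regression classifier would then be trained on.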
9

A, Thilagavathy, Deepa R, Lalitha S.D, et al. "Semantic-Based Classification of Toxic Comments Using Ensemble Learning." E3S Web of Conferences 399 (2023): 04017. http://dx.doi.org/10.1051/e3sconf/202339904017.

Full text
Abstract:
Social media is rapidly expanding, and its anonymity completely supports free speech. Hate speech directed at any person or group because of their ethnicity, clan, religion, national or cultural heritage, sex, disability, sexual orientation, or other characteristics is a violation of their rights. It seriously encourages violence or hate crimes and causes social unrest by undermining peace, trustworthiness, and human rights, among other things. Identifying toxic remarks in social media conversation is a critical but difficult job. There are several difficulties in detecting toxic text remarks, including choosing a suitable social media dataset and a high-performance classifier for it. People nowadays share messages not only in person, but also in online settings such as social networking sites and online groups. As a result, all social media sites and apps, as well as all current communities in the digital world, require an identification and prevention system. Finding toxic social media remarks has proven critical for content screening. The identifying component in such a system would need to notice any bad online behavior and alert the blocking component to take appropriate action. The purpose of this research was to assess each text and find various kinds of toxicity, such as profanity, threats, name-calling, and identity-based hatred. Jigsaw's Wikipedia comment dataset is used for this.
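The ensemble learning in this paper's title can be illustrated with a simple hard-voting combiner; the base classifiers below are stand-in keyword heuristics, purely for illustration, not the paper's models:

```python
from collections import Counter

def majority_vote(classifiers, comment):
    # Hard-voting ensemble: each base classifier casts one vote,
    # and the most common label wins.
    votes = [clf(comment) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]

# Stand-in base classifiers (keyword heuristics, purely illustrative).
clf_a = lambda c: "toxic" if "idiot" in c else "clean"
clf_b = lambda c: "toxic" if "hate" in c else "clean"
clf_c = lambda c: "toxic" if any(w in c for w in ("idiot", "hate", "threat")) else "clean"

label = majority_vote([clf_a, clf_b, clf_c], "i hate you")
```

In practice the base learners would be trained models (e.g., over semantic features), but the combination rule is the same.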
10

Wu, Siqi, and Paul Resnick. "Calibrate-Extrapolate: Rethinking Prevalence Estimation with Black Box Classifiers." Proceedings of the International AAAI Conference on Web and Social Media 18 (May 28, 2024): 1634–47. http://dx.doi.org/10.1609/icwsm.v18i1.31414.

Full text
Abstract:
In computational social science, researchers often use a pre-trained, black box classifier to estimate the frequency of each class in unlabeled datasets. A variety of prevalence estimation techniques have been developed in the literature, each yielding an unbiased estimate if certain stability assumptions hold. This work introduces a framework to rethink the prevalence estimation process as calibrating the classifier outputs against ground truth labels to obtain the joint distribution of a base dataset and then extrapolating to the joint distribution of a target dataset. We call this framework "Calibrate-Extrapolate". It clarifies what stability assumptions must hold for a prevalence estimation technique to yield accurate estimates. In the calibration phase, the techniques assume only a stable calibration curve between a calibration dataset and the full base dataset. This allows the classifier outputs to be used for disproportionate random sampling, thus improving the efficiency of calibration. In the extrapolation phase, some techniques assume a stable calibration curve while others assume stable class-conditional densities. We discuss the stability assumptions from a causal perspective. By specifying base and target joint distributions, we can generate simulated datasets as a way to build intuitions about the impacts of assumption violations. This also leads to a better understanding of how the classifier's predictive power affects the accuracy of prevalence estimates: the greater the predictive power, the lower the sensitivity to violations of stability assumptions in the extrapolation phase. We illustrate the framework with an application that estimates the prevalence of toxic comments on news topics over time on Reddit, Twitter/X, and YouTube, using Jigsaw's Perspective API as a black box classifier. Finally, we offer practical advice for prevalence estimation.
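The calibrate-then-extrapolate idea can be made concrete in a few lines: bin the classifier's scores, estimate P(toxic | bin) on a labeled base sample, then apply that curve to unlabeled target scores. This stdlib sketch assumes, as the paper stresses, a stable calibration curve between base and target, and uses toy numbers rather than Perspective API output:

```python
def calibration_curve(scores, labels, n_bins=5):
    # P(positive | score bin), estimated from labeled base data.
    bins = [[] for _ in range(n_bins)]
    for s, y in zip(scores, labels):
        bins[min(int(s * n_bins), n_bins - 1)].append(y)
    return [sum(b) / len(b) if b else 0.0 for b in bins]

def estimate_prevalence(target_scores, curve):
    # Extrapolate: average the calibrated probability over target scores.
    n_bins = len(curve)
    return sum(curve[min(int(s * n_bins), n_bins - 1)]
               for s in target_scores) / len(target_scores)

curve = calibration_curve([0.1, 0.2, 0.85, 0.9, 0.95, 0.05], [0, 0, 1, 1, 1, 0])
prev = estimate_prevalence([0.9, 0.1, 0.92, 0.15], curve)
```

If the calibration curve drifts between base and target datasets, this estimate becomes biased, which is exactly the stability assumption the paper analyzes.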
More sources

Book chapters on the topic "Toxic comments classifier"

1

Govinda, K., and Korhan Cengiz. "Toxic Comment Classifier." In Hybridization of Blockchain and Cloud Computing. Apple Academic Press, 2023. http://dx.doi.org/10.1201/9781003336624-3.

Full text
2

Sengupta, Disha, Zaid Rupani, Rajvi Jagani, and Kashish Nahar. "A Review Paper on Toxic Comment Classifier." In Emerging Trends in IoT and Computing Technologies. CRC Press, 2024. http://dx.doi.org/10.1201/9781003535423-13.

Full text
3

Singh, Sarthak Kumar, Sanskar Patwari, Siddharth Gautam, and Avjeet Singh. "An overview of toxic comment classifier using BLSTM." In Intelligent Computing and Communication Techniques. CRC Press, 2025. https://doi.org/10.1201/9781003530176-94.

Full text
4

Gajic, Jelena, Lazar Drazeta, Lepa Babic, Jelena Kaljevic, Dejan Jovanovic, and Luka Jovanovic. "Twitter toxic comment identification in digital media and advertising using NLP and optimized classifiers." In Proceedings of the 2nd International Conference on Innovation in Information Technology and Business (ICIITB 2024). Atlantis Press International BV, 2024. http://dx.doi.org/10.2991/978-94-6463-482-2_12.

Full text
5

S, Priyanka, Miss Hase, and Dr Baisa L. Gunjal. "A COMPREHENSIVE STUDY OF SENTIMENT ANALYSIS PREDICTED ON MACHINE LEARNING CLASSIFIERS FOR VARIOUS DATASETS." In Futuristic Trends in IOT Volume 3 Book 6. Iterative International Publishers, Selfypage Developers Pvt Ltd, 2024. http://dx.doi.org/10.58532/v3bbio6p1ch9.

Full text
Abstract:
Opinion mining refers to the analysis of sentiment and is among the most important intelligence tools. It analyzes the subjective information in an expression to determine its emotional tone, which can be optimistic, pessimistic, or unbiased. Today, a huge amount of data is available on the internet as emails, social media reviews, comments, etc. Sentiment analysis helps determine an author's attitude or the public opinion on a particular topic. A variety of applications of sentiment analysis are explored in this chapter.
6

"Basic Reflections on Photodynamic Therapy." In Combination Therapies Involving Photodynamic Therapy. Royal Society of Chemistry, 2023. http://dx.doi.org/10.1039/bk9781837672226-00026.

Full text
Abstract:
Photodynamic therapy photosensitizers are now classified into several generations depending on a number of factors, the top among which is their position in the photosensitizer and photodynamic therapy developmental timeline and milestones. Although the generational development of photosensitizers was first used to mark various milestone improvements in photodynamic therapy, it became diffuse after the description of third-generation photosensitizers. The technology that emerged around the 1950s as a simple application of the photosensitizer to generate oxygen-based tissue toxicity has now become one of the leading alternatives to cancer and antimicrobial therapy. Among the first photosensitizers to be licensed was Photofrin, now termed the first generation. A proposal of photosensitizer classification into five generations is presented in this chapter. The discussion of the mechanism of photodynamic therapy, which was introduced in Chapter 1, was described with the aid of a Jablonski diagram, showing the generation of reactive oxygen species, which is due to the interaction of the triplet-state photosensitizer with oxygen molecules present in the disease site in the triplet state. Excitation of the photosensitizer to its singlet excited state leads to photosensitization of oxygen present in the disease site in the triplet state after intersystem crossing to produce toxic reactive oxygen species, which cause irreversible cell damage. Photosensitizer development for photodynamic therapy generally follows a trajectory that commences with chemical synthesis, incorporation into nanomaterials, in vitro and in vivo studies, clinical trials and clinical case studies. A wide variety of innovations now in clinical applications of photodynamic therapy are based on photosensitizers that went through this trajectory.
7

Griep, Mark A., and Marjorie L. Mikasen. "Isomorphs of Paranoia: Chemical Arsenals." In ReAction! Oxford University Press, 2009. http://dx.doi.org/10.1093/oso/9780195326925.003.0007.

Full text
Abstract:
In the United States after the September 2001 attacks, citizens were advised to protect themselves from toxic dusts by covering their windows with plastic sheeting and duct tape that could be purchased from any hardware store. One hundred years ago, terrorists would not have had ready access to today’s common chemicals to create makeshift explosives, and citizens would not have had access to plastic sheeting or duct tape to protect themselves from aerosols or gases. Chemical weapons have engendered a cloud of fear since their introduction into warfare during World War I. Recently, the large-scale use of chemicals as lethal weapons has drifted from warfare to terrorism. Chemical weapons are often equated with poison gases (either asphyxiation or nerve agents), but as can be seen in the list of movies for this chapter, they are actually the most diverse type of weapon. Some of these weapons are discussed elsewhere in the book (psychedelic agents, chapter 5; explosives, chapter 9). The chemistry in nuclear weapons movies is discussed in the commentary sections for those movies that use them. The movies in this chapter are closely linked to spy movies, which lie at the nexus of the action and thriller genres. Spy movies are appealing in part because these charming, good-looking government employees live by their wits and gut reactions to make split-second decisions that are best for the spy and the government. But a spy is only as good as the villain; otherwise, it wouldn’t be challenging or fun. So, the final ingredient for the movie choices in this chapter is that many of them refer to actual chemical weapons, which grounds them in the real world. The audience knows these weapons are dangerous and can be misused by the wrong person. Only about 70 chemical compounds have been put to use during military conflicts over the past century, and they are classified based on their effects. 
Asphyxiating and blistering agents were created for WWI (1914–1918); nerve agents were developed for WWII (1940–1945) but never used in that war; napalm was also created for WWII but it generated public comment only when used in the Vietnam War; nonlethal psychedelics were tested extensively during the 1950s but haven’t been documented as having been used yet; herbicides and tear gas were used tactically during the Vietnam War.

Conference papers on the topic "Toxic comments classifier"

1

Kapse, Arvind S., Anamay Dubey, Harshvardhan Bisen, Kapil Kumar, and Md Tamheed. "Multilingual Toxic Comment Classifier." In 2023 7th International Conference on Intelligent Computing and Control Systems (ICICCS). IEEE, 2023. http://dx.doi.org/10.1109/iciccs56967.2023.10142540.

Full text
2

Maity, Amit, Rishi More, Prof Abhijit Patil, Jay Oza, and Gitesh Kambli. "Toxic Comment Detection Using Bidirectional Sequence Classifiers." In 2024 2nd International Conference on Intelligent Data Communication Technologies and Internet of Things (IDCIoT). IEEE, 2024. http://dx.doi.org/10.1109/idciot59759.2024.10467922.

Full text
3

Arnold, Tim, Helen Fuller, Serge Yee, Seema Nazeer, and Ruth Reeves. "Team-Based Text Analytics for Health Information Systems Learning." In 14th International Conference on Applied Human Factors and Ergonomics (AHFE 2023). AHFE International, 2023. http://dx.doi.org/10.54941/ahfe1003486.

Full text
Abstract:
In healthcare operations, narrative text and comments from questionnaires are common and abundant. Making sense of and coming to some shared meaning around text comments from such questionnaires is often time consuming. A lack of resources and expertise may contribute to hesitation and indecision when deciding how, or whether, to analyze text. Because of the challenges of analyzing text in operational settings, there can be reluctance to capture rich narrative information. Nonetheless, narrative comments can be a source of rich information that, with reliable and faster approaches to analysis, may help inform operational decisions and human-centered design efforts. In this paper, we describe using text analytics approaches to contribute to thematic analysis of users' comments to help with health information systems learning. Several text analytic approaches were explored as possible pathways to reduce the burden of reviewing comments about training on health information systems. Approaches included topic modelling, keyword extraction and creating word clouds, word co-occurrence and uniquely co-occurring word visualizations, and text classifiers with nomograms that highlight the top linguistic features for the trained classifier. The team walked through example approaches and visualizations and decided on next steps. Visualizations of word co-occurrence and uniquely co-occurring word networks, together with the top linguistic features used to train a naïve Bayes text classifier, were used to envision possible categories or codes. Regular expressions consisting of combinations of words and stems were iteratively formulated as codes took shape and extracts were repeatedly reviewed. Code formulation corresponded with refinement of the regular expressions. Individual comments could be multi-labeled, and not all comments were coded. 
Static visuals, text examples, regular expressions, and extract quantities were collected, presented, discussed, and refined with the review team. The purpose of this work was to explore text analytic approaches to assist with response interpretation and to apply filtering techniques to address concerns of information overload. Addressing concerns about information overload may reduce hesitation in collecting and examining text. By reframing this as a filtering problem, we began to inquire into ways to review, create codes, and code comments more quickly. Including and fine-tuning text analytics approaches may help teams learn more quickly from questionnaire comments about how users perceive working within health information systems. Finally, lowering thresholds for analyzing text may boost motivation for gathering rich information, keeping us from missing out on vital viewpoints and language use across time.
4

Baségio Junior, Ademir, Lucas Darlindo Freitas Rodrigues, Antonio Fernando Lavareda Jacob Junior, and Fábio Manoel França Lobato. "Analisando Tweets Relacionados a Deficiências: uma Abordagem Baseada em Classificação." In Computer on the Beach. Universidade do Vale do Itajaí, 2020. http://dx.doi.org/10.14210/cotb.v11n1.p366-373.

Full text
Abstract:
Approximately 80% of people with some form of physical, mental, or intellectual disability live in developing countries. These same countries have shown significant growth in the availability of the internet. Such facts reveal good possibilities regarding access to emotional support and the exchange of experiences among people with disabilities through social media. However, hate speech and derogatory comments about these people can be a recurring problem on these platforms. In order to identify such posts, this article presents a classifier developed using Twitter posts related to disabilities. The results show that the tool developed is promising in detecting offensive and pejorative comments on this topic, and it can be used in content management systems.

Reports on the topic "Toxic comments classifier"

1

Paynter, Robin A., Celia Fiordalisi, Elizabeth Stoeger, et al. A Prospective Comparison of Evidence Synthesis Search Strategies Developed With and Without Text-Mining Tools. Agency for Healthcare Research and Quality (AHRQ), 2021. http://dx.doi.org/10.23970/ahrqepcmethodsprospectivecomparison.

Full text
Abstract:
Background: In an era of explosive growth in biomedical evidence, improving systematic review (SR) search processes is increasingly critical. Text-mining tools (TMTs) are a potentially powerful resource to improve and streamline search strategy development. Two types of TMTs are especially of interest to searchers: word frequency (useful for identifying most used keyword terms, e.g., PubReminer) and clustering (visualizing common themes, e.g., Carrot2). Objectives: The objectives of this study were to compare the benefits and trade-offs of searches with and without the use of TMTs for evidence synthesis products in real world settings. Specific questions included: (1) Do TMTs decrease the time spent developing search strategies? (2) How do TMTs affect the sensitivity and yield of searches? (3) Do TMTs identify groups of records that can be safely excluded in the search evaluation step? (4) Does the complexity of a systematic review topic affect TMT performance? In addition to quantitative data, we collected librarians' comments on their experiences using TMTs to explore when and how these new tools may be useful in systematic review search creation. Methods: In this prospective comparative study, we included seven SR projects, and classified them into simple or complex topics. The project librarian used conventional “usual practice” (UP) methods to create the MEDLINE search strategy, while a paired TMT librarian simultaneously and independently created a search strategy using a variety of TMTs. TMT librarians could choose one or more freely available TMTs per category from a pre-selected list in each of three categories: (1) keyword/phrase tools: AntConc, PubReMiner; (2) subject term tools: MeSH on Demand, PubReMiner, Yale MeSH Analyzer; and (3) strategy evaluation tools: Carrot2, VOSviewer.
We collected results from both MEDLINE searches (with and without TMTs), coded every citation’s origin (UP or TMT respectively), deduplicated them, and then sent the citation library to the review team for screening. When the draft report was submitted, we used the final list of included citations to calculate the sensitivity, precision, and number-needed-to-read for each search (with and without TMTs). Separately, we tracked the time spent on various aspects of search creation by each librarian. Simple and complex topics were analyzed separately to provide insight into whether TMTs could be more useful for one type of topic or another. Results: Across all reviews, UP searches seemed to perform better than TMT, but because of the small sample size, none of these differences was statistically significant. UP searches were slightly more sensitive (92% [95% confidence intervals (CI) 85–99%]) than TMT searches (84.9% [95% CI 74.4–95.4%]). The mean number-needed-to-read was 83 (SD 34) for UP and 90 (SD 68) for TMT. Keyword and subject term development using TMTs generally took less time than those developed using UP alone. The average total time was 12 hours (SD 8) to create a complete search strategy by UP librarians, and 5 hours (SD 2) for the TMT librarians. TMTs neither affected search evaluation time nor improved identification of exclusion concepts (irrelevant records) that can be safely removed from the search set. Conclusion: Across all reviews but one, TMT searches were less sensitive than UP searches. For simple SR topics (i.e., single indication–single drug), TMT searches were slightly less sensitive, but reduced time spent in search design. For complex SR topics (e.g., multicomponent interventions), TMT searches were less sensitive than UP searches; nevertheless, in complex reviews, they identified unique eligible citations not found by the UP searches. TMT searches also reduced time spent in search strategy development. 
For all evidence synthesis types, TMT searches may be more efficient in reviews where comprehensiveness is not paramount, or as an adjunct to UP for evidence syntheses, because they can identify unique includable citations. If TMTs were easier to learn and use, their utility would be increased.
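The word-frequency class of TMTs described in this report (PubReminer-style term counting) amounts to counting terms across a set of candidate records; a stdlib sketch with toy record titles and an illustrative stopword list:

```python
import re
from collections import Counter

def top_keywords(records, k=3, stopwords=("the", "of", "and", "in", "a", "with")):
    # Count word frequencies across record titles/abstracts to suggest
    # candidate search terms, PubReminer-style.
    words = []
    for text in records:
        words += [w for w in re.findall(r"[a-z]+", text.lower())
                  if w not in stopwords]
    return [w for w, _ in Counter(words).most_common(k)]

kws = top_keywords([
    "Toxic comment classification with deep learning",
    "Detecting toxic comments in social media",
    "A survey of toxic language detection",
])
```

A searcher would scan this frequency list for high-yield terms to fold into the MEDLINE strategy, which is the time-saving step the study measures.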