Journal articles on the topic 'TF-IDF vs'

Consult the top 30 journal articles for your research on the topic 'TF-IDF vs.'

1

Gani, Mohammed Osman, Ramesh Kumar Ayyasamy, Anbuselvan Sangodiah, and Yong Tien Fui. "USTW Vs. STW: A Comparative Analysis for Exam Question Classification based on Bloom’s Taxonomy." MENDEL 28, no. 2 (2022): 25–40. http://dx.doi.org/10.13164/mendel.2022.2.025.

Abstract:
Bloom’s Taxonomy (BT) is widely used in educational institutions to produce high-quality exam papers to evaluate students’ knowledge at different cognitive levels. However, manual question labeling takes a long time, and not all evaluators are familiar with BT. The researchers worked to automate the exam question classification process based on BT as a solution. Enhancement in term weighting is one of the ways to increase classification accuracy while working with text data. However, all the past work on the term weighting in exam question classification focused on unsupervised term weighting
2

Tri, Wahyuningsih, Manongga Danny, and Sembiring Irwan. "Comparing logistic regression and extreme gradient boosting on student arguments." IAES International Journal of Artificial Intelligence (IJ-AI) 13, no. 3 (2024): 3119–28. https://doi.org/10.11591/ijai.v13.i3.pp3119-3128.

Abstract:
Identifying the effectiveness level and quality of students' arguments poses a challenge for teachers. This is due to the lack of techniques that can accurately assist in identifying the effectiveness and quality of students' arguments. This research aims to develop a model that can identify effectiveness categories in students' arguments. The method employed involves the logistic regression+XGBoost algorithm combined with separate implementations of term frequency-inverse document frequency (TF-IDF) and CountVectorizer. Student argument data were collected and processed using natural language
3

Zhan, Zerui. "Comparative Analysis of TF-IDF and Word2Vec in Sentiment Analysis: A Case of Food Reviews." ITM Web of Conferences 70 (2025): 02013. https://doi.org/10.1051/itmconf/20257002013.

Abstract:
Sentiment analysis is an important area of natural language processing that supports applications such as market analysis, customer feedback, and social media monitoring by identifying and classifying opinions in text. Text representation is the basis of sentiment analysis, and TF-IDF and Word2Vec are two commonly used methods to carry out text vectorization by counting word frequency and capturing semantic relations respectively. This paper compares the performance of TF-IDF and Word2Vec in sentiment analysis of food reviews to provide a more effective basis for enterprises and researchers to
4

Dias Canedo, Edna, and Bruno Cordeiro Mendes. "Software Requirements Classification Using Machine Learning Algorithms." Entropy 22, no. 9 (2020): 1057. http://dx.doi.org/10.3390/e22091057.

Abstract:
The correct classification of requirements has become an essential task within software engineering. This study shows a comparison among the text feature extraction techniques, and machine learning algorithms to the problem of requirements engineer classification to answer the two major questions “Which works best (Bag of Words (BoW) vs. Term Frequency–Inverse Document Frequency (TF-IDF) vs. Chi Squared (CHI2)) for classifying Software Requirements into Functional Requirements (FR) and Non-Functional Requirements (NF), and the sub-classes of Non-Functional Requirements?” and “Which Machine Lea
5

Nallapati, Ramesh, Xiaolin Shi, Daniel McFarland, Jure Leskovec, and Daniel Jurafsky. "LeadLag LDA: Estimating Topic Specific Leads and Lags of Information Outlets." Proceedings of the International AAAI Conference on Web and Social Media 5, no. 1 (2021): 558–61. http://dx.doi.org/10.1609/icwsm.v5i1.14147.

Abstract:
Identifying which outlet in social media leads the rest in disseminating novel information on specific topics is an interesting challenge for information analysts and social scientists. In this work, we hypothesize that novel ideas are disseminated through the creation and propagation of new or newly emphasized key words, and therefore lead/lag of outlets can be estimated by tracking word usage across these outlets. First, we demonstrate the validity of our hypothesis by showing that a simple TF-IDF based nearest-neighbors approach can recover generally accepted lead/lag behavior on the outlet
6

Nyoman Prayana Trisna, I., Ni Wayan Emmy Rosiana Dewi, and Muhammad Alam Pasirulloh. "Oversampling vs. undersampling in TF-IDF variations for imbalanced Indonesian short texts classification." TELKOMNIKA (Telecommunication Computing Electronics and Control) 23, no. 2 (2025): 382. https://doi.org/10.12928/telkomnika.v23i2.26510.

7

Zhang, Li. "Features extraction based on Naive Bayes algorithm and TF-IDF for news classification." PLOS One 20, no. 7 (2025): e0327347. https://doi.org/10.1371/journal.pone.0327347.

Abstract:
The rapid proliferation of online news demands robust automated classification systems to enhance information organization and personalized recommendation. Although traditional methods like TF-IDF with Naive Bayes provide foundational solutions, their limitations in capturing semantic nuances and handling real-time demands hinder practical applications. This study proposes a hybrid news classification framework that integrates classical machine learning with modern advances in NLP to address these challenges. Our methodology introduces three key innovations: (1) Domain-Specific Feature Enginee
8

Ashraf, Mariyam Mohammed, and M. W. P. Maduranga. "Ingredient-Only Recipe Recommendation via Self-Supervised Contrastive Learning for Sustainable Cooking." Journal of Future Artificial Intelligence and Technologies 2, no. 2 (2025): 215–26. https://doi.org/10.62411/faith.3048-3719-98.

Abstract:
This paper presents a novel ingredient-based recipe recommendation system that suggests dishes using only available ingredients, eliminating reliance on user preference data. As online culinary platforms host millions of recipes, users struggle to find dishes that match their pantry constraints, dietary needs, and culinary preferences. Unlike traditional methods like TF-IDF, which lack semantic depth, or existing methods such as graph neural networks (GNNs), which incur high computational costs and struggle with cold-start scenarios, our self-supervised contrastive learning approach maps ingre
9

Zhuang, Guanghe, and Xiang Lu. "A KeyBERT-Enhanced Pipeline for Electronic Information Curriculum Knowledge Graphs: Design, Evaluation, and Ontology Alignment." Information 16, no. 7 (2025): 580. https://doi.org/10.3390/info16070580.

Abstract:
This paper proposes a KeyBERT-based method for constructing a knowledge graph of the electronic information curriculum system, aiming to enhance the structured representation and relational analysis of educational content. Electronic Information Engineering curricula encompass diverse and rapidly evolving topics; however, existing knowledge graphs often overlook multi-word concepts and more nuanced semantic relationships. To address this gap, this paper presents a KeyBERT-enhanced method for constructing a knowledge graph of the electronic information curriculum system. Utilizing teaching plan
10

Ardiansyah, Ricy, Herman Yuliansyah, and Anton Yudhana. "Multi-Label Classification for Opinion Mining in The Presidential Election using TF-IDF with NB And SVM." Jurnal ELTIKOM 9, no. 1 (2025): 35–46. https://doi.org/10.31961/eltikom.v9i1.1432.

Abstract:
Public opinion plays a crucial role in presidential elections, shaping voter choices and influencing outcomes. Most sentiment analysis studies focus on binary (positive vs. negative) or multiclass (positive, negative, neutral) classification, which limits their ability to capture opinions that express multiple sentiments simultaneously. In presidential elections, a single opinion may support one candidate while criticizing another. This study proposes a MultiLabelBinarizer model to classify candidate and sentiment labels simultaneously—an approach that remains underexplored. The model combines
11

Agustio Dwitama, Meilinda Meilinda, Nevin Julian Masidin, and Andri Wijaya. "Perbandingan Algoritma Naïve Bayes Dan Support Vector Machine Dalam Menentukan Sentimen Publik Terhadap Copilot." Jurnal Sistem Informasi, Manajemen dan Teknologi Informasi 3, no. 1 (2025): 1–12. https://doi.org/10.33020/jsimtek.v3i1.800.

Abstract:
Modern chatbots, including Copilot from Microsoft Edge, have developed rapidly thanks to natural language processing (NLP) technology. This study analyzes public sentiment toward Copilot using the Naïve Bayes and Support Vector Machine (SVM) algorithms. After collecting 20,000 reviews from the Google Play Store via the Python library “google_play_scraper”, the data were run through preprocessing steps and then labeled with VADER to classify the reviews as positive, negative, or neutral. The TF-IDF method was used to index and weight terms before applying
12

Huang, Shuyue, Lena Jingen Liang, and Hwansuk Chris Choi. "How We Failed in Context: A Text-Mining Approach to Understanding Hotel Service Failures." Sustainability 14, no. 5 (2022): 2675. http://dx.doi.org/10.3390/su14052675.

Abstract:
Service failure is inevitable. Although empirical studies on the outcomes and processes of service failures have been conducted in the hotel industry, the findings need more exploration to understand how different segments perceive service failures and the associated emotions differently. This approach enables hotel managers to develop more effective strategies to prevent service failures and implement more specific service-recovery actions. For analysis, we obtained a nine-year (2010–2018) longitudinal dataset containing 1224 valid respondents with 73,622 words of textual content from a prope
13

Afuan, Lasmedi. "Sentiment Analysis of the Kampus Merdeka Program on Twitter Using Support Vector Machine and a Feature Extraction Comparison: TF-IDF vs. FastText." Journal of Applied Data Sciences 5, no. 4 (2024): 1738–53. https://doi.org/10.47738/jads.v5i4.436.

14

Rizki, Agus Maula, Bustami Bustami, and Said Fadlan Anshari. "Comparison of Support Vector Machine and Naïve Bayes Algorithms in Sentiment Analysis of Tiktokshop Application User Reviews." Journal of Renewable Energy, Electrical, and Computer Engineering 5, no. 1 (2025): 18–29. https://doi.org/10.29103/jreece.v5i1.21342.

Abstract:
This study presents a comparative analysis of Support Vector Machine (SVM) and Naïve Bayes algorithms for sentiment analysis of TikTokShop application user reviews. As TikTokShop emerges as an innovative platform integrating social media with e-commerce, understanding user sentiments becomes crucial for both consumers and businesses. A balanced dataset of 3,000 user reviews (1,000 positive, 1,000 neutral, and 1,000 negative) was collected through web scraping from Google Play Store. Following comprehensive preprocessing including cleansing, case folding, normalization, tokenization, stopword r
15

Shi, Tongxin, Roy A. McCann, Ying Huang, Wei Wang, and Jun Kong. "Malware Detection for Internet of Things Using One-Class Classification." Sensors 24, no. 13 (2024): 4122. http://dx.doi.org/10.3390/s24134122.

Abstract:
The increasing usage of interconnected devices within the Internet of Things (IoT) and Industrial IoT (IIoT) has significantly enhanced efficiency and utility in both personal and industrial settings but also heightened cybersecurity vulnerabilities, particularly through IoT malware. This paper explores the use of one-class classification, a method of unsupervised learning, which is especially suitable for unlabeled data, dynamic environments, and malware detection, which is a form of anomaly detection. We introduce the TF-IDF method for transforming nominal features into numerical formats tha
16

Vrbanec, Tedo, and Ana Meštrović. "Corpus-Based Paraphrase Detection Experiments and Review." Information 11, no. 5 (2020): 241. http://dx.doi.org/10.3390/info11050241.

Abstract:
Paraphrase detection is important for a number of applications, including plagiarism detection, authorship attribution, question answering, text summarization, text mining in general, etc. In this paper, we give a performance overview of various types of corpus-based models, especially deep learning (DL) models, with the task of paraphrase detection. We report the results of eight models (LSI, TF-IDF, Word2Vec, Doc2Vec, GloVe, FastText, ELMO, and USE) evaluated on three different public available corpora: Microsoft Research Paraphrase Corpus, Clough and Stevenson and Webis Crowd Paraphrase Cor
17

Kargozari, Kate, Junhua Ding, and Haihua Chen. "Empowering Consumer Decision-Making: Decoding Incentive vs. Organic Reviews for Smarter Choices Through Advanced Textual Analysis." Electronics 13, no. 21 (2024): 4316. http://dx.doi.org/10.3390/electronics13214316.

Abstract:
Online reviews play a crucial role in influencing seller–customer dynamics. This research evaluates the credibility and consistency of reviews based on volume, length, and content to understand the impacts of incentives on customer review behaviors, how to improve review quality, and decision-making in purchases. The data analysis reveals major factors such as costs, support, usability, and product features that may influence the impact. The analysis also highlights the indirect impact of company size, the direct impact of user experience, and the varying impacts of changing conditions over th
18

Hadi, Abdul, Mukti Qamal, and Yesy Afrillia. "Comparison of Random Forest Algorithm Classifier and Naïve Bayes Algorithm in Whatsapp Message Type Classification." Journal of Renewable Energy, Electrical, and Computer Engineering 5, no. 1 (2025): 9–17. https://doi.org/10.29103/jreece.v5i1.21227.

Abstract:
This study compares the effectiveness of Random Forest and Naïve Bayes algorithms in classifying WhatsApp messages into three categories: normal, promotional, and fraudulent messages. With over 2.78 billion active users worldwide and 90% of Indonesian internet users utilizing WhatsApp, the platform's end-to-end encryption creates challenges for automatic spam detection, necessitating machine learning approaches. A dataset of 300 messages, equally distributed across the three categories, underwent preprocessing including cleansing, case folding, stopword removal, normalization, and stemming bef
19

Putri, Sri Raihan, Asrianda Asrianda, and Lidya Rosnita. "Sentiment Analysis of Youtube and Gotube Reviews on Google Play Using the Support Vector Machine (SVM) Method in Indonesia." Journal of Applied Informatics and Computing 9, no. 3 (2025): 1025–33. https://doi.org/10.30871/jaic.v9i3.9461.

Abstract:
This research, titled Sentiment Analysis of YouTube and GoTube Reviews on Google Play Using the Support Vector Machine (SVM) Method in Indonesia, analyzes user perceptions of YouTube and GoTube based on Google Play reviews. The study is motivated by the growing popularity of video streaming apps in Indonesia and the limited sentiment analysis research on these platforms. The research collects 1,600 reviews (800 per app) from 2023-2024 using Python’s Scrapy library. The data is split 70% for training and 30% for testing, undergoing text preprocessing (tokenization, stop word removal, stemming),
20

CHARAN, KONKI, and RONGALA RAJESH. "Automated Resume Screening Using Machine Learning." International Scientific Journal of Engineering and Management 04, no. 07 (2025): 1–9. https://doi.org/10.55041/isjem04860.

Abstract:
The exponential growth of digital job applications has posed significant challenges for recruiters, who often face the daunting task of manually screening thousands of resumes to identify suitable candidates. This research addresses these challenges by proposing an automated resume classification system leveraging Natural Language Processing (NLP) and machine learning techniques. The proposed system integrates comprehensive text preprocessing, feature extraction using Term Frequency–Inverse Document Frequency (TF-IDF), and a One-vs-Rest K-Nearest Neighbors (KNN) classifier to categorize resume
21

Rizinski, Maryan, Andrej Jankov, Vignesh Sankaradas, Eugene Pinsky, Igor Mishkovski, and Dimitar Trajanov. "Comparative Analysis of NLP-Based Models for Company Classification." Information 15, no. 2 (2024): 77. http://dx.doi.org/10.3390/info15020077.

Abstract:
The task of company classification is traditionally performed using established standards, such as the Global Industry Classification Standard (GICS). However, these approaches heavily rely on laborious manual efforts by domain experts, resulting in slow, costly, and vendor-specific assignments. Therefore, we investigate recent natural language processing (NLP) advancements to automate the company classification process. In particular, we employ and evaluate various NLP-based models, including zero-shot learning, One-vs-Rest classification, multi-class classifiers, and ChatGPT-aided classifica
22

Tian, Liang, and Nan Jiang. "Research on Detection Methods for Text Generated by Large Language Models Based on Multi-Model Ensemble." Applied and Computational Engineering 106, no. 1 (2024): 59–67. http://dx.doi.org/10.54254/2755-2721/106/20241331.

Abstract:
The rapid development of Large Language Models (LLMs) has made their generated text almost indistinguishable from human writing, posing significant challenges to traditional human-machine recognition techniques. This paper proposes a detection method based on multi-model ensemble to accurately identify text generated by LLMs. Firstly, a large-scale, diverse, and heterogeneous dataset is constructed, covering student writings and texts generated by models such as GPT-3, GPT-2, CTRL, and XLM. Then, a multifaceted detection framework integrating linear models, deep learning models, and pre-traine
23

Ali, Irfan, Nimra Mughal, Zahid Hussain Khan, Javed Ahmed, and Ghulam Mujtaba. "Resume Classification System using Natural Language Processing and Machine Learning Techniques." Mehran University Research Journal of Engineering and Technology 41, no. 1 (2022): 65–79. http://dx.doi.org/10.22581/muet1982.2201.07.

Abstract:
The selection of a suitable job applicant from the pool of thousands applications is often daunting job for an employer. The categorization of job applications submitted in form of Resumes against available vacancy(s) takes significant time and efforts of an employer. Thus, Resume Classification System (RCS) using the Natural Language Processing (NLP) and Machine Learning (ML) techniques could automate this tedious process. Moreover, the automation of this process can significantly expedite and transparent the applicants’ screening process with mere human involvement. This experimental study p
24

Beierle, Felix, Rüdiger Pryss, and Akiko Aizawa. "Sentiments about Mental Health on Twitter—Before and during the COVID-19 Pandemic." Healthcare 11, no. 21 (2023): 2893. http://dx.doi.org/10.3390/healthcare11212893.

Abstract:
During the COVID-19 pandemic, the novel coronavirus had an impact not only on public health but also on the mental health of the population. Public sentiment on mental health and depression is often captured only in small, survey-based studies, while work based on Twitter data often only looks at the period during the pandemic and does not make comparisons with the pre-pandemic situation. We collected tweets that included the hashtags #MentalHealth and #Depression from before and during the pandemic (8.5 months each). We used LDA (Latent Dirichlet Allocation) for topic modeling and LIWC, VADER
25

Zabolotnia, Tetiana M., and Nazarii V. Kozynets. "Hybrid detection of fuzzy duplicate texts: cosine similarity and transformers." Applied Aspects of Information Technology 8, no. 1 (2025): 48–61. https://doi.org/10.15276/aait.08.2025.4.

Abstract:
This paper addresses the challenge of detecting texts that share the same meaning but differ in wording and structure. Such “fuzzy duplicates” are increasingly prevalent in user-generated content, media articles, and academic materials. Traditional TF-IDF-based methods with cosine similarity process data swiftly but often overlook deeper semantic nuances, especially in languages with free word order and complex morphology (for example, Slavic languages such as Ukrainian or Bulgarian, and agglutinative languages like Hungarian). Fully neural solutions (e.g., transformers) typically offer higher
26

Mohammed, Mohammed, Jabbar Abed Eleiwy, Hassan Mohamed Muhi Muhi-Aldeen, Yusra Al Al-Yasiri, and Ahmed Adil Nafea. "Human to Chatbot Text Classification Using Multi-Source AI Chatbots and Machine Learning Models." Journal of Intelligent Systems and Internet of Things 16, no. 1 (2025): 152–65. https://doi.org/10.54216/jisiot.160113.

Abstract:
The fast growth of artificial intelligence technologies, especially language processing technology has obscured the lines in between human-generated text comparing to chatbot-generated message. Recognizing which generated such, a text is essential for applications like information generating and manipulated text in order to guarantee authenticity between communicated parties. This research applies to a set of machine learning models to identify text as either human-written or chatbot-generated. The methodology of this research starts with a dataset including text generated from different Large
27

Felipe, Luis, and Gilmer Valdes. "Abstract B007: TheBlueScrubs-v1: A Large-Scale Curated Dataset with ∼11 Billion Oncology Tokens for AI-Driven Cancer Research." Clinical Cancer Research 31, no. 13_Supplement (2025): B007. https://doi.org/10.1158/1557-3265.aimachine-b007.

Abstract:
Large language models (LLMs) are increasingly pivotal in cancer research, yet current public datasets offer insufficient scale and diversity to capture the complexity of oncology. To address this gap, we created TheBlueScrubs-v1, a 25-billion-token corpus of medical texts curated from the SlimPajama dataset. Approximately one-third of these tokens (∼11 billion) are annotated as cancer-related, making this one of the largest public, domain-focused text collections available for training and benchmarking oncology LLMs. Our two-stage pipeline first applied a high-speed logistic regressio
28

Li, Baoku, Yafeng Nan, and Ruoxi Yao. "Carbon neutrality cognition, environmental value, and consumption preference of low-carbon products." Frontiers in Environmental Science 10 (September 8, 2022). http://dx.doi.org/10.3389/fenvs.2022.979783.

Abstract:
It is now the mainstream scientific consensus that carbon emissions cause global climate change. Achieving the goal of China’s carbon neutrality is essential for environmental protection and economic sustainable development worldwide. In the above context, this paper aims to explore the carbon neutrality cognition, environmental value, and consumption preference for low-carbon products from the perspective of consumption end. Thus, we built and checked a new conceptual model of consumers’ carbon neutrality cognition and the consumption preference for low-carbon products. The TF-IDF algorithm i
29

Herrmann, Evelyn, Lukas Urs Von Rohr, Fizi Kotta Parambil, Sarath M Joy, and Yojena Chittazhathu Kurian Kuruvilla. "Validating a semantic similarity approach for automated data extraction in phase II oncology trials." Journal of Clinical Oncology 43, no. 16_suppl (2025). https://doi.org/10.1200/jco.2025.43.16_suppl.e13666.

Abstract:
Background: Manual data extraction from phase II oncology publications is error-prone, especially when text is paraphrased. Traditional exact-match metrics underestimate correctness. We developed a pipeline using GPT-4 and BERT to capture semantic meaning, aiming to improve large-scale data extraction. Methods: We analyzed 26 phase II oncology publications, each mapped to > 200 predefined fields (e.g., study ID, drug names, mechanisms of action, outcomes). Domain experts established ground-truth (“base file”) labels. PDFs were parsed via Azure Document Intelligence Studio (custom mod
30

Adejumo, Philip, Benjamin Rosand, and Rohan Khera. "Abstract 251: Natural Language Processing Of Clinical Documentation To Assess Racial Differences In The Clinical Profile Of Patients Hospitalized With Heart Failure." Circulation: Cardiovascular Quality and Outcomes 15, Suppl_1 (2022). http://dx.doi.org/10.1161/circoutcomes.15.suppl_1.251.

Abstract:
Background: Self-reported racial classification is often used as a proxy for phenotypic differences among clinical populations. We demonstrate an approach to identify differences between Black and White individuals using clinical notes among patients hospitalized with heart failure (HF). Methods: We identified HF hospitalizations for patients at Yale and developed a word frequency embedding (TF-IDF)-based natural language processing (NLP) model on clinical notes to classify “tokens” or words from notes of Black and White individuals based on their frequency. We manually classified these tokens