Journal articles on the topic 'Lexicon based classifier'

Author: Grafiati

Published: 5 June 2025

Last updated: 15 July 2025

Consult the top 50 journal articles for your research on the topic 'Lexicon based classifier.'

1

Yang, Ai Min, Jiang Hao Lin, Yong Mei Zhou, and Jin Chen. "Research on Building a Chinese Sentiment Lexicon Based on SO-PMI." Applied Mechanics and Materials 263-266 (December 2012): 1688–93. http://dx.doi.org/10.4028/www.scientific.net/amm.263-266.1688.

Abstract:

Considering user behavior, this paper builds a Chinese sentiment lexicon based on an improved SO-PMI algorithm. Semantic lexicons were used to classify the sentiment of collected Chinese hotel reviews. The experiment compares feature extraction based on CHI with feature extraction based on sentiment lexicons to assess their classification performance. The results indicate that feature extraction based on a sentiment lexicon achieves a higher F1. The classification method “Basic Semantic Lexicon + BOOL + NB” achieves an F1 of 92.40%. Across the different sentiment lexicons, the experimental results show that (SO-A) and (SO-P) are slightly better than the NB classifier. Therefore, (SO-A) and (SO-P) would be effective as text sentiment classifiers. The experiment also finds that the method “Hotel Reviews Semantic Lexicon using improved SO-PMI algorithm + (SO-A)” achieves the highest F1, 92.84%. The results suggest that the improved SO-PMI is more effective for weight calculation and sentiment lexicon building.
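
As a rough illustration of the SO-PMI idea named in this abstract, the sketch below scores a word by its pointwise mutual information with positive seed words minus its PMI with negative seed words, using document-level co-occurrence counts. The toy corpus, the seed words, and the smoothing constant `eps` are assumptions made for illustration; the paper's improved variant and its (SO-A) and (SO-P) classifiers are not reproduced here.

```python
import math

def so_pmi(word, docs, pos_seeds, neg_seeds, eps=0.01):
    """Semantic orientation of `word` from document-level co-occurrence.

    docs is a list of token lists. eps is an assumed smoothing constant
    that avoids log(0); it is not taken from the paper.
    """
    doc_sets = [set(d) for d in docs]
    n = len(doc_sets)

    def count(*terms):
        # Number of documents that contain all of the given terms.
        return sum(1 for d in doc_sets if all(t in d for t in terms))

    def pmi(a, b):
        p_ab = (count(a, b) + eps) / n
        p_a = (count(a) + eps) / n
        p_b = (count(b) + eps) / n
        return math.log2(p_ab / (p_a * p_b))

    positive = sum(pmi(word, s) for s in pos_seeds)
    negative = sum(pmi(word, s) for s in neg_seeds)
    return positive - negative

# Hypothetical hotel-review corpus, already tokenized.
docs = [
    ["room", "clean", "good"],
    ["staff", "friendly", "good"],
    ["room", "dirty", "bad"],
    ["noisy", "dirty", "bad"],
]
clean_score = so_pmi("clean", docs, ["good"], ["bad"])
dirty_score = so_pmi("dirty", docs, ["good"], ["bad"])
```

A positive score suggests the word leans toward the positive seeds and a negative score toward the negative seeds; a lexicon can then store the score as the word's weight.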

2

Srivastava, Anima, Amit Srivastava, and Tanveer J. Siddiqui. "Sentiment Classification Using a Sense Enriched Lexicon-based Approach." International Journal on Recent and Innovation Trends in Computing and Communication 11, no. 5 (2023): 208–15. http://dx.doi.org/10.17762/ijritcc.v11i5.6607.

Abstract:

The prominent approach in sentiment polarity classification is the Lexicon-based approach, which relies on a dictionary to assign a score to subjective words. Most existing work uses the score of the most dominant sense in this process instead of the contextually appropriate sense. The use of Word Sense Disambiguation (WSD) has been less investigated in sentiment classification tasks. This paper investigates the effect of integrating WSD into a Lexicon-based approach for sentiment polarity classification and compares it with existing Lexicon-based approaches and state-of-the-art supervised approaches. The lexicon used in this work is SentiWordNet v2.0. The proposed approach, called Sense Enriched Lexicon-based Approach (SELSA), uses a word sense disambiguation module to identify the correct sense of subjective words. Instead of using the score of the most frequent sense, it uses only the score of the contextually appropriate sense. For comparison with the supervised approaches, the authors investigate Naïve Bayes (NB) and Support Vector Machine (SVM) classifiers, which tended to perform better in earlier research. The performance of these classifiers is evaluated using Word2vec, Hashing Vectorizer, and bi-gram features. The best-performing classifier-feature combination is used for comparison. All evaluations are done on the Movie Review dataset. SELSA achieves an accuracy of 96.25%, which is significantly better than the accuracy obtained by the SentiWordNet-based approach without WSD on the same dataset. The performance of the proposed algorithm is also compared with the best-performing supervised classifier investigated in this work and with earlier reported works on the same dataset. The results reveal that the SVM classifier performs better than the SentiWordNet approach without WSD. However, after incorporating WSD, the performance of the proposed Lexicon-based approach improves significantly and surpasses the best-performing supervised classifier (SVM with bi-gram features).

3

Mohamed, Ensaf Hussein, Mohammed ElSaid Moussa, and Mohamed Hassan Haggag. "An Enhanced Sentiment Analysis Framework Based on Pre-Trained Word Embedding." International Journal of Computational Intelligence and Applications 19, no. 04 (2020): 2050031. http://dx.doi.org/10.1142/s1469026820500315.

Abstract:

Sentiment analysis (SA) is a technique that lets people in fields such as business, economy, research, government, and politics learn about people’s opinions, which greatly affects decision-making. SA techniques are classified into lexicon-based techniques, machine learning techniques, and hybrids of the two. Each approach has its limitations and drawbacks: the machine learning approach depends on manual feature extraction, while the lexicon-based approach relies on sentiment lexicons that are usually unscalable, unreliable, and manually annotated by human experts. Word-embedding techniques are now commonly used in SA classification. Currently, Word2Vec and GloVe are among the most accurate and usable word embedding techniques, which can transform words into meaningful semantic vectors. However, these techniques ignore the sentiment information of texts and require a huge text corpus for training and generating accurate vectors, which are used as inputs to deep learning models. In this paper, we propose an enhanced ensemble classifier framework. Our framework is based on our previously published lexicon-based method, bag-of-words, and pre-trained word embedding. First, the sentence is preprocessed by stop-word removal, POS tagging, stemming and lemmatization, and shortening of exaggerated words. Second, the processed sentence is passed to three modules, namely our previous lexicon-based method (Sum Votes), a bag-of-words module, and a semantic module (Word2Vec and GloVe), which produce feature vectors. Finally, these feature vectors are fed into 11 different classifiers. The proposed framework is tested and evaluated on four datasets with five different lexicons. The experimental results show that the proposed model outperforms the previous lexicon-based method and the machine learning methods individually.

4

Lee, Ju-Hyoung, Sang-Ki Ko, and Yo-Sub Han. "SALNet: Semi-supervised Few-Shot Text Classification with Attention-based Lexicon Construction." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 14 (2021): 13189–97. http://dx.doi.org/10.1609/aaai.v35i14.17558.

Abstract:

We propose a semi-supervised bootstrap learning framework for few-shot text classification. From a small amount of the initial dataset, our framework obtains a larger set of reliable training data by using the attention weights from an LSTM-based trained classifier. We first train an LSTM-based text classifier from a given labeled dataset using the attention mechanism. Then, we collect a set of words for each class called a lexicon, which is supposed to be a representative set of words for each class based on the attention weights calculated for the classification task. We bootstrap the classifier using the new data that are labeled by the combination of the classifier and the constructed lexicons to improve the prediction accuracy. As a result, our approach outperforms the previous state-of-the-art methods including semi-supervised learning algorithms and pretraining algorithms for few-shot text classification task on four publicly available benchmark datasets. Moreover, we empirically confirm that the constructed lexicons are reliable enough and substantially improve the performance of the original classifier.
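
The lexicon-construction step described in this abstract can be sketched roughly as follows: given per-token attention weights for each labeled example, accumulate the weight each word receives per class and keep the top-ranked words as that class's lexicon. The attention weights here are hard-coded hypothetical numbers standing in for the output of a trained LSTM classifier, and the `top_k` cutoff and summation rule are assumptions, not the authors' exact procedure.

```python
from collections import defaultdict

def build_lexicons(examples, top_k=2):
    """Collect the top_k highest-attention words per class.

    examples is an iterable of (label, tokens, attention_weights), where
    attention_weights[i] is the weight the classifier gave to tokens[i].
    """
    totals = defaultdict(lambda: defaultdict(float))
    for label, tokens, weights in examples:
        for token, weight in zip(tokens, weights):
            totals[label][token] += weight

    lexicons = {}
    for label, scores in totals.items():
        # Sort by descending accumulated weight, then alphabetically for ties.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        lexicons[label] = [token for token, _ in ranked[:top_k]]
    return lexicons

# Hypothetical attention weights; a real system would read them from the model.
examples = [
    ("sports", ["the", "team", "won", "the", "match"], [0.05, 0.40, 0.15, 0.05, 0.35]),
    ("sports", ["a", "great", "match"], [0.10, 0.30, 0.60]),
    ("finance", ["the", "stock", "market", "fell"], [0.05, 0.45, 0.40, 0.10]),
]
lexicons = build_lexicons(examples)
```

In a bootstrapping loop, unlabeled texts that contain many words from one class's lexicon could then be added to the training data with that label.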

5

Rani, Sangeeta, Nasib Singh Gill, and Preeti Gulia. "Analyzing impact of number of features on efficiency of hybrid model of lexicon and stack based ensemble classifier for twitter sentiment analysis using WEKA tool." Indonesian Journal of Electrical Engineering and Computer Science 22, no. 2 (2021): 1041–51. https://doi.org/10.11591/ijeecs.v22.i2.pp1041-1051.

Abstract:

Twitter is used by millions of people across the world, so the data collected from Twitter can be highly valuable for research and helpful in decision support. In this paper, the ‘Twitter US Airline data’ from the Kaggle data repository is used for sentiment classification of customers’ reviews. The research aims to implement various machine learning classifiers, stack-based ensemble classifiers, and hybrids of a lexicon classifier with other classifiers. Eleven different classification models are implemented for feature sets of different sizes. All 11 models are also re-implemented with the sentiment score of the lexicon-based classifier added as one of the features in the feature set. Results are analyzed by varying the number of input feature variables used in the classification. Four feature sets of 301, 501, 701, and 1301 features are used to analyze the variations in the final findings. Chi-Square and Information Gain techniques are used for feature selection. The results show that increasing the number of features increases accuracy up to 701 features. After that, accuracy is stable or decreases as the feature set grows. Also, the cost of adding the lexicon classifier's sentiment score to the input feature set is nominal, but the results improve consistently. WEKA and R Studio tools are used for analysis and implementation. Accuracy and Kappa are used to represent and compare the efficiency of the models.
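
The hybrid idea in this abstract, adding a lexicon classifier's sentiment score as one more input feature, can be illustrated with a minimal sketch. The vocabulary, lexicon entries, and example tweet below are hypothetical, and the study itself used WEKA and R Studio rather than Python.

```python
def bow_vector(tokens, vocabulary):
    """Bag-of-words counts for a fixed, ordered vocabulary."""
    return [tokens.count(term) for term in vocabulary]

def lexicon_score(tokens, lexicon):
    """Sum of word polarities; words missing from the lexicon count as 0."""
    return sum(lexicon.get(token, 0) for token in tokens)

def hybrid_vector(tokens, vocabulary, lexicon):
    """Bag-of-words features with the lexicon score appended as one extra feature."""
    return bow_vector(tokens, vocabulary) + [lexicon_score(tokens, lexicon)]

# Hypothetical vocabulary and polarity lexicon for airline tweets.
vocabulary = ["flight", "delayed", "crew", "great"]
lexicon = {"great": 1, "delayed": -1, "rude": -1}

tweet = "flight delayed again and crew was rude".split()
features = hybrid_vector(tweet, vocabulary, lexicon)
```

The resulting vector has one more dimension than the plain bag-of-words vector, so any downstream classifier can use the lexicon score at negligible extra cost.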

6

Al Khadafi, Madonna, Kurnia Paranitha Kartika, and Filda Febrinita. "PENERAPAN METODE NAÏVE BAYES CLASSIFIER DAN LEXICON BASED UNTUK ANALISIS SENTIMEN CYBERBULLYING PADA BPJS." JATI (Jurnal Mahasiswa Teknik Informatika) 6, no. 2 (2022): 725–33. http://dx.doi.org/10.36040/jati.v6i2.5633.

Abstract:

President Joko Widodo issued Presidential Instruction of the Republic of Indonesia Number 1 of 2022 on Optimizing the Implementation of the National Health Insurance Program. The instruction requires registration as a BPJS Kesehatan participant in order to access a number of public services, such as land sale and purchase, obtaining a driving licence (SIM) or a police clearance certificate (SKCK), and the hajj and umrah pilgrimages. The regulation took effect on 1 March 2022. The problem arises when opinions are expressed without regard for ethics, leading to ridicule or attacks on the parties concerned, which can result in cyberbullying. It is therefore necessary to analyze the sentiment of Twitter comments in order to classify tweets as cyberbullying or non-cyberbullying. The method used in this study is descriptive qualitative, with the cyberbullying data processed using the Naive Bayes Classifier and Lexicon Based algorithms. The process begins with tweet collection, pre-processing, TF-IDF weighting, and evaluation of the Naive Bayes Classifier's performance. Classification is then carried out using the Lexicon Based method, with the system output identifying whether a tweet is cyberbullying or non-cyberbullying. In this study, the Naive Bayes Classifier achieved an accuracy of 80%, while Lexicon Based achieved an accuracy of 22%. Comparing the two methods, the conclusion is that the Naive Bayes Classifier is better and more accurate than Lexicon Based.

7

Setiawan, Samuel Budi, and Auliya Rahman Isnain. "Sentimen Analisis Masyarakat Terhadap Pembangunan IKN Menggunakan Algoritma Lexicon Based Approach dan Naïve Bayes." JURNAL MEDIA INFORMATIKA BUDIDARMA 8, no. 2 (2024): 1019. http://dx.doi.org/10.30865/mib.v8i2.7506.

Abstract:

The relocation and construction of IKN (Capital City of the Archipelago) as a center for state administration has both benefits and shortcomings, from the choice of location to the ratification of laws considered too hasty, which has raised pros and cons among the Indonesian people. President Joko Widodo decided to move the country's capital outside Java in a meeting on April 29, 2019. The location of the IKN development was determined to be East Kalimantan. This research retrieved data from Twitter with the keyword "IKN Development", collecting a total of 3,680 tweets. Data analysis was carried out with two methods, namely Naïve Bayes Classifier and Lexicon Based, to find which of the two gives the better accuracy in analyzing public responses to the IKN development. The first step of the analysis is preprocessing, which comprises labelling, case folding, cleaning, tokenizing, stopword removal, and stemming. The Naïve Bayes Classifier method achieved an accuracy of 79%, and Lexicon Based an accuracy of 76%. Both methods classify sentiment as Positive, Negative, or Neutral. The Naïve Bayes Classifier method shows Positive sentiment of 47.18%, Negative of 6.33%, and Neutral of 46.49%, while Lexicon Based shows Positive sentiment of 54.15%, Negative of 29.36%, and Neutral of 16.49%. The highest positive polarity is thus found with the Lexicon Based algorithm at 54.15%, compared with 47.18% for the Naïve Bayes Classifier. It can be concluded from the results of both methods that the Naïve Bayes Classifier gives a better analysis than the Lexicon Based analysis.
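
A minimal sketch of a three-class lexicon-based classifier, following some of the preprocessing stages listed in this abstract (case folding, cleaning, tokenizing, stopword removal), might look like the following. The English stopword list and polarity lexicon are hypothetical stand-ins for the Indonesian resources such a study would use, stemming is omitted for brevity, and the sign-of-sum decision rule is an assumption rather than the authors' documented rule.

```python
import re

# Hypothetical resources; a real study would load an Indonesian stopword
# list and sentiment lexicon instead.
STOPWORDS = {"the", "is", "a", "of", "and"}
LEXICON = {"good": 1, "support": 1, "hasty": -1, "waste": -1}

def preprocess(text):
    text = text.lower()                     # case folding
    text = re.sub(r"[^a-z\s]", " ", text)   # cleaning: drop digits and punctuation
    tokens = text.split()                   # tokenizing
    return [t for t in tokens if t not in STOPWORDS]  # stopword removal

def classify(text):
    """Label a text by the sign of its summed word polarities."""
    score = sum(LEXICON.get(token, 0) for token in preprocess(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

labels = [
    classify("The plan is GOOD, I support it!"),
    classify("A hasty waste of money."),
    classify("The capital moves in 2024"),
]
```

Because a text with no lexicon hits scores zero, the share of neutral labels depends heavily on lexicon coverage, which is one reason lexicon-based and trained classifiers can report different class distributions on the same tweets.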

8

Ansari, Mohd Zeeshan, Tanvir Ahmad, Mirza Mohd Sufyan Beg, and Noaima Bari. "Language lexicons for Hindi-English multilingual text processing." International Journal of Artificial Intelligence (IJ-AI) 11, no. 2 (2022): 641–48. https://doi.org/10.11591/ijai.v11.i2.pp641-648.

Abstract:

Language identification (LI) in textual documents is the process of automatically detecting the language contained in a document based on its content. Present language identification techniques presume that a document contains text in one of a fixed set of languages. However, this presumption is incorrect when dealing with a multilingual document that includes content in more than one possible language. Due to the unavailability of standard corpora for Hindi-English mixed-lingual language processing tasks, we propose language lexicons, a novel kind of lexical database that augments several bilingual language processing tasks. These lexicons are built by learning classifiers over English and transliterated Hindi vocabulary. The designed lexicons possess condensed quantitative characteristics that reflect their linguistic strength with respect to the Hindi and English languages. On evaluating the lexicons, it is observed that words of the same language tend to cluster together and are separable over language classes. On comparing classifier performance with existing works, the proposed lexicon models exhibit better performance.

9

Faizal, Ahmad, Agung Susilo Yuda Irawan, and Didi Juardi. "Perbandingan Lexicon Based Dan Naïve Bayes Classifier Pada Analisis Sentimen Pengguna Twitter Terhadap Gempa Turki." INTECOMS: Journal of Information Technology and Computer Science 6, no. 2 (2023): 1037–48. http://dx.doi.org/10.31539/intecoms.v6i2.7360.

Abstract:

The Turkey earthquake disaster, which claimed many lives, is currently receiving wide coverage in both national and international media, prompting many opinions from social media users, especially on the Twitter platform. Tweets posted by Twitter users can then serve as a useful source of information. For this reason, sentiment analysis can be used as a solution for processing these voices, using the Lexicon Based approach and the Naïve Bayes Classifier. The aim of this study is to classify opinions about the earthquake disaster that struck Turkey on 6 February 2023 into positive, neutral, and negative sentiment classes. A 90:10 scenario was used for testing. The evaluation results show that the Lexicon Based approach combined with the Naïve Bayes Classifier achieved an accuracy of 65%, whereas the Naïve Bayes Classifier without the Lexicon Based approach achieved an accuracy of 64%.

10

Nang, Noon Kham. "Lexicon Based Emotion Analysis on Twitter Data." International Journal of Trend in Scientific Research and Development 3, no. 5 (2019): 1008–12. https://doi.org/10.5281/zenodo.3590493.

Abstract:

This paper presents a system that extracts information from automatically annotated tweets using well-known existing opinion lexicons and a supervised machine learning approach. The sentiment features are primarily extracted from novel high-coverage tweet-specific sentiment lexicons. These lexicons are automatically generated from tweets with sentiment-word hashtags and from tweets with emoticons. Sentence-level or tweet-level classification is done based on these word-level sentiment features using a Sequential Minimal Optimization (SMO) classifier. The SemEval 2013 Twitter sentiment dataset is used in this work. The ablation experiments show that this system gains up to 6.8 absolute percentage points in F-score. Nang Noon Kham, "Lexicon Based Emotion Analysis on Twitter Data", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume 3, Issue 5, August 2019. URL: https://www.ijtsrd.com/papers/ijtsrd26566.pdf

11

Yang, Jia Neng, Ai Min Yang, and Yong Mei Zhou. "A Method of Building Chinese Sentiment Lexicon Based on Semantics." Applied Mechanics and Materials 596 (July 2014): 263–70. http://dx.doi.org/10.4028/www.scientific.net/amm.596.263.

Abstract:

A method was proposed to build a Chinese sentiment lexicon based on semantics. The sentiment intensity of a word was automatically calculated by decomposing it into multiple English semantic units (Esu). A lexicon proofreading method was used to optimize the sentiment intensity of words. The proposed lexicon was applied to the task of sentiment analysis, in which a support vector machine was used to build the sentiment classifier. The experimental results showed that the built sentiment lexicon was more effective than the general polar sentiment lexicon.

12

Yogesha T. "Ensemble Classifier for Web Data Scraping with Lexicon Support." Journal of Information Systems Engineering and Management 10, no. 20s (2025): 406–16. https://doi.org/10.52783/jisem.v10i20s.3133.

Abstract:

An ensemble classifier for web data is used for selective web scraping with lexical support, an innovative way of improving the accuracy and efficiency of data classification from web sources. Web scraping, a method for obtaining information from webpages, frequently produces massive, unstructured datasets that are difficult to manage and carry a high risk to data reliability, which can lead to misuse in communication. To overcome this, selective online scraping is used to target information important to the classification task, resulting in less noise and higher data quality. The ensemble classifier integrates numerous machine learning models to maximize their strengths, resulting in better overall performance. In this approach, separate classifiers are trained on distinct subsets of scraped data that are chosen based on predetermined criteria using lexicons, which are collections of domain-specific words and phrases. These lexicons guide the selective scraping process, ensuring that only the most relevant data is captured and hence improving classifier accuracy. After the data is scraped and pre-processed, the ensemble method aggregates predictions from each classifier, generally using techniques such as majority voting, stacking, or weighted averaging, to get a final classification result. This strategy not only promotes robustness by reducing the risk of overfitting, but also improves flexibility across other domains by incorporating lexical assistance tailored to specific themes or sectors. The combination of selective web scraping and lexical assistance enables more targeted and resource-efficient data collection, while the use of an ensemble classifier supports high accuracy and reliability in classification tasks. This methodology is especially useful where the online data is large, dynamic, and contains a lot of unnecessary or noisy information. The resulting system provides a scalable and effective solution for real-time web data classification, with applications in sentiment analysis, content categorization, and market intelligence.
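
Two of the ingredients this abstract mentions, lexicon-guided selection of scraped text and majority voting over classifier outputs, can be sketched as below. The lexicon, the `min_hits` threshold, the tie-breaking rule, and the sample predictions are all hypothetical; the paper's actual selection criteria and aggregation settings are not specified in the abstract.

```python
from collections import Counter

def is_relevant(text, lexicon, min_hits=1):
    """Keep a scraped text only if it contains enough domain-lexicon terms."""
    tokens = set(text.lower().split())
    return len(tokens & lexicon) >= min_hits

def majority_vote(predictions):
    """Return the most common label; ties go to the earliest-listed label."""
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label

# Hypothetical domain lexicon and scraped snippets.
finance_lexicon = {"stock", "market", "shares"}
snippets = ["Stock market rallies today", "Cute cat video"]
kept = [s for s in snippets if is_relevant(s, finance_lexicon)]

# Hypothetical outputs of three base classifiers for one kept snippet.
final_label = majority_vote(["positive", "negative", "positive"])
```

Stacking or weighted averaging, which the abstract also names, would replace `majority_vote` with a learned or weighted combiner.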

15

Jordan, Andrean, and Suharjito. "Detection of Hoax Spread in The Whatsapp Group with Lexicon Based and Naive Bayes Classification." International Journal of Engineering and Advanced Technology (IJEAT) 9, no. 4 (2020): 506–11. https://doi.org/10.35940/ijeat.C6587.049420.

Abstract:

Spreading hoaxes through WhatsApp social media can lead to differing beliefs and can cause disputes among those affected. This paper proposes a hybrid model for finding hoaxes in WhatsApp groups using a combination of knowledge-based and machine learning approaches. The hybrid model combines two methods, namely Lexicon based and Naive Bayes Classifier, and is applied to a WhatsApp monitoring application. The research focuses on two main aspects: word weighting using the lexicon based method, and data classification using the Naive Bayes Classifier and Decision tree-j48 methods. The dataset used is conversation data that is crossed from the WhatsApp group. In the experiments, classification using the Naive Bayes classifier labelled 86.670% of the conversation data as not indicating hoaxes and 13.330% as indicating hoaxes. The average percentage of truth obtained was more than 75%. The classification performance evaluation gave an average precision of 0.771, recall of 0.754, and F-measure of 0.773.

16

Amaliah, Fitrah, and I. Kadek Dwi Nuryana. "Perbandingan Akurasi Metode Lexicon Based Dan Naive Bayes Classifier Pada Analisis Sentimen Pendapat Masyarakat Terhadap Aplikasi Investasi Pada Media Twitter." Journal of Informatics and Computer Science (JINACS) 3, no. 03 (2022): 384–93. http://dx.doi.org/10.26740/jinacs.v3n03.p384-393.

Abstract:

Investment in this era of globalization has become an important activity in the economy and in business. Many people have chosen to place their funds in investments. With advances in technology, developers have built investment applications to simplify the investment process. These applications have both advantages and drawbacks, ranging from fraudulent investment applications to trustworthy ones. Sentiment analysis on Twitter is carried out to identify which investment applications should be avoided and which can be trusted. The lexicon based and naive Bayes classifier methods were chosen to classify tweets as having positive, neutral, or negative sentiment, to help the public make their choice and to determine the accuracy of the two methods. The results of both methods show that positive sentiment has the highest percentage for investment applications. The accuracy comparison yields 67% for the lexicon based method and 78% for the naive Bayes classifier method. These results indicate that the analyzed data on investment applications is positive overall and that the naive Bayes classifier achieves higher accuracy than the lexicon based method.

17

Saprizal, Arpan Mualief, and Nor Anisa. "Analisis Sentimen Tiktok: Wajib Militer dengan Metode Lexicon Based dan Naive Bayes Classifier." TAMIKA: Jurnal Tugas Akhir Manajemen Informatika & Komputerisasi Akuntansi 4, no. 2 (2024): 242–46. https://doi.org/10.46880/tamika.vol4no2.pp242-246.

Abstract:

The issue of conscription in Indonesia has sparked a heated debate among the public, especially on the social media platform TikTok. This study aims to analyze public sentiment on the issue through analysis of TikTok user comments. The method used is lexicon-based sentiment analysis. Data of 5,212 comments were collected using web scraping techniques with the keyword "conscription in Indonesia". The results of the analysis showed that the majority of comments (53.28%) were positive, followed by neutral comments (35.79%), and negative comments (10.92%). This finding indicates that there is considerable support for the issue of military service among TikTok users. The research process includes data collection, data processing, sentiment analysis using a lexicon-based approach, and visualization of results. The results of this study are expected to provide a clearer picture of public perception of the issue of military conscription in Indonesia.

18

Priadana, Adri, and Ahmad Ashril Rizal. "Sentiment Analysis on Government Performance in Tourism During The COVID-19 Pandemic Period With Lexicon Based." CAUCHY 7, no. 1 (2021): 28–39. http://dx.doi.org/10.18860/ca.v7i1.12488.

Abstract:

The COVID-19 pandemic impact has affected all industries in Indonesia and even the world, including the tourism industry. Researchers have a role in researching to answer the needs of the tourism industry, especially in making tourism and business destination management programs and carrying out activities oriented to meet the needs of the tourism industry. Meanwhile, the government has a role in making policies, especially in the roadmap, for developing the tourism industry. This study aims to track trending topics in social media Instagram since COVID-19 hit. The results of trending topics will be classified by sentiment analysis using a Lexicon-based and Naive Bayes Classifier. Based on Instagram data taken since January 2020, it shows the five highest topics in the tourism sector, namely health protocols, hotels, homes, streets, and beaches. Of the five topics, sentiment analysis was carried out with the Lexicon-based and Naive Bayes classifier, showing that beaches get an incredibly positive sentiment, namely 80.87%, and hotels provide the highest negative sentiment 57.89%. The accuracy of the Confusion matrix's sentiment results shows that the accuracy, precision, and recall are 82.53%, 86.99%, and 83.43%, respectively.

19

Al-Sheikh, Eman S., and Mozaherul Hoque Abul Hasanat. "Social Media Mining for Assessing Brand Popularity." International Journal of Data Warehousing and Mining 14, no. 1 (2018): 40–59. http://dx.doi.org/10.4018/ijdwm.2018010103.

Abstract:

Businesses seek to analyse their customer feedback to compare their brand's popularity with the popularity of competing brands. The increasing use of social media in recent years is producing large amounts of textual content, which has become a rich source of data for brand popularity analysis. In this article, a novel hybrid approach of classification and lexicon based methods is proposed to assess brand popularity based on the sentiments expressed in social media posts. Two different classification models using Naïve Bayes (NB) and SVM are built based on Twitter messages for 9 different brands of 3 cosmetic products. In addition, sentiment quantification has been performed using a lexicon-based approach. Based on the overall comparison of the proposed models, the SVM classifier has the highest performance with 78.85% accuracy and 94.60% AUC, compared with 73.57% accuracy and 80.63% AUC for the NB classifier and 63.63% accuracy and 69.38% AUC for the sentiment quantification approach. Specific indices based on classification and lexicon approaches are proposed to assess the brand popularity.

20

Mudinas, Andrius, Dell Zhang, and Mark Levene. "Bootstrap Domain-Specific Sentiment Classifiers from Unlabeled Corpora." Transactions of the Association for Computational Linguistics 6 (December 2018): 269–85. http://dx.doi.org/10.1162/tacl_a_00020.

Abstract:

There is often the need to perform sentiment classification in a particular domain where no labeled document is available. Although we could make use of a general-purpose off-the-shelf sentiment classifier or a pre-built one for a different domain, the effectiveness would be inferior. In this paper, we explore the possibility of building domain-specific sentiment classifiers with unlabeled documents only. Our investigation indicates that in the word embeddings learned from the unlabeled corpus of a given domain, the distributed word representations (vectors) for opposite sentiments form distinct clusters, though those clusters are not transferable across domains. Exploiting such a clustering structure, we are able to utilize machine learning algorithms to induce a quality domain-specific sentiment lexicon from just a few typical sentiment words (“seeds”). An important finding is that simple linear model based supervised learning algorithms (such as linear SVM) can actually work better than more sophisticated semi-supervised/transductive learning algorithms which represent the state-of-the-art technique for sentiment lexicon induction. The induced lexicon could be applied directly in a lexicon-based method for sentiment classification, but a higher performance could be achieved through a two-phase bootstrapping method which uses the induced lexicon to assign positive/negative sentiment scores to unlabeled documents first, and then uses those documents found to have clear sentiment signals as pseudo-labeled examples to train a document sentiment classifier via supervised learning algorithms (such as LSTM). On several benchmark datasets for document sentiment classification, our end-to-end pipelined approach which is overall unsupervised (except for a tiny set of seed words) outperforms existing unsupervised approaches and achieves an accuracy comparable to that of fully supervised approaches.
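
As a rough illustration of the first phase of the two-phase bootstrapping idea described in this abstract, the Python sketch below scores unlabeled documents with a small lexicon and keeps only those with a clear sentiment signal as pseudo-labeled examples. The lexicon entries, the `margin` threshold, and the function names are hypothetical and are not taken from the paper; the second phase (training a supervised classifier on the pseudo-labeled set) is omitted.

```python
# Hypothetical induced lexicon: word -> polarity score (illustrative values only).
LEXICON = {"great": 1.0, "excellent": 1.0, "good": 0.5,
           "poor": -0.5, "bad": -1.0, "terrible": -1.0}

def lexicon_score(text, lexicon=LEXICON):
    """Sum the polarity scores of the lexicon words found in the text."""
    return sum(lexicon.get(tok, 0.0) for tok in text.lower().split())

def pseudo_label(docs, margin=1.0):
    """Keep documents whose absolute score reaches `margin`; label them by sign."""
    labeled = []
    for doc in docs:
        score = lexicon_score(doc)
        if abs(score) >= margin:
            labeled.append((doc, "positive" if score > 0 else "negative"))
    return labeled

unlabeled = [
    "great food and excellent service",   # score 2.0: kept as positive
    "terrible room and bad staff",        # score -2.0: kept as negative
    "good location but poor wifi",        # score 0.0: discarded as unclear
]
pseudo = pseudo_label(unlabeled)
```

In a full pipeline, the `pseudo` list would serve as training data for a document classifier, so the choice of `margin` trades the size of the pseudo-labeled set against label noise.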

21

Irwiensyah, Faldy, and Firman Noor Hasan. "Perbandingan Akurasi Metode Naïve Bayes Classifier dan Lexicon Based Pada Analisis Sentimen Respon Masyarakat Tentang Kebijakan Kenaikan Harga Minyak Goreng." Jurnal Teknik Informatika dan Komputer 2, no. 1 (2023): 18–23. http://dx.doi.org/10.22236/jutikom.v2i1.11500.

Abstract:

Cooking oil is a basic necessity for Indonesians. Indonesia experienced a cooking oil shortage in March 2022. The shortage became a hot topic on Twitter that month, with many people expressing positive or negative opinions. Behind this lay differing assessments from those for and against, as the various parties held different points of view. This article performs sentiment analysis of public responses to the cooking oil shortage using a dataset obtained from Twitter. It aims to group tweets about the cooking oil shortage into positive and negative sentiment using a machine learning strategy with the Naive Bayes and lexicon-based methods. These algorithms were chosen to make it easier for interested users to compare the methods and determine how accurate they are: the lexicon-based method achieved an accuracy of 42% and the Naïve Bayes classifier 72%. The analysis of the cooking oil shortage shows a neutral result, and the Naïve Bayes classifier achieves higher accuracy than the lexicon-based method.

22

Muhammad, Anwarul Azim, and Hasan Bhuiyan Mahmudul. "Text to Emotion Extraction Using Supervised Machine Learning Techniques." TELKOMNIKA Telecommunication, Computing, Electronics and Control 16, no. 3 (2018): 1394–401. https://doi.org/10.12928/TELKOMNIKA.v16i3.8387.

Abstract:

The proliferation of the internet and social media has greatly increased the popularity of text communication. People convey their sentiment and emotion through text, which promotes lively communication. Consequently, a tremendous amount of emotional text is generated on social media and blogs at every moment. This has raised the need for automated tools for emotion mining from text. There are various rule-based approaches to emotion extraction from text that rely on an emotion intensity lexicon. However, creating an emotion intensity lexicon is a time-consuming and tedious process. Moreover, there is no hard and fast rule for assigning emotion intensity to words. To solve these difficulties, we propose a machine learning based approach to emotion extraction from text which relies on annotated examples rather than an emotion intensity lexicon. We investigated the Multinomial Naïve Bayesian (MNB) classifier, Artificial Neural Network (ANN), and Support Vector Machine (SVM) for mining emotion from text. In our setup, SVM outperformed the other classifiers with promising accuracy.

23

Bhagyalakshmi, N. "Safeguarding Nations from Online News Threats Using Hybrid Technique." International Journal for Research in Applied Science and Engineering Technology 12, no. 3 (2024): 1185–90. http://dx.doi.org/10.22214/ijraset.2024.58781.

Abstract:

The internet provides a potent platform for individuals to express their opinions and emotions, facilitated by widespread smartphone usage and high internet accessibility. However, monitoring these online sentiments is crucial for identifying any extreme emotions that could potentially pose risks to national security. To address this, a new theoretical framework has been proposed, which combines a lexicon-based approach with machine learning techniques in the digital realm. This hybrid framework incorporates Decision Tree, Naive Bayes, and Support Vector Machine classifiers to predict political security threats. Through experimentation, it was found that the combination of a lexicon-based approach with the Decision Tree classifier yielded the highest performance score in predicting these threats. Natural Language Processing (NLP) techniques are employed for opinion mining within this framework.

24

Alanazi, Saad Awadh. "Robust Sentimental Class Prediction Based on Cryptocurrency-Related Tweets Using Tetrad of Feature Selection Techniques in Combination with Filtered Classifier." Applied Sciences 12, no. 12 (2022): 6070. http://dx.doi.org/10.3390/app12126070.

Abstract:

Individual mental feelings and reactions are becoming more significant as they help researchers, domain experts, businesses, companies, and other individuals understand the overall response of every individual in specific situations or circumstances. Every pure and compound sentiment can be classified using a dataset, which can take the form of text posted by various Twitter users. Twitter is one of the vital platforms for individuals to participate and share their ideas about different topics; it is also considered to be one of the most famous and biggest micro-blogging websites on the Internet. One of the key purposes of this study is to classify pure and compound sentiments based on text related to cryptocurrencies, an innovative way of trading that is flourishing daily. The cryptocurrency market incurs many fluctuations in the coins’ value. A small positive or negative piece of news can sway the whole outlook for specific cryptocurrencies. In this paper, individuals’ pure and compound sentiments based on cryptocurrency-related Twitter text are classified. The dataset is collected through the Twitter API. In WEKA, two deployment schemes are compared: first, a single feature selection technique (Tweet to lexicon feature vector), and second, a tetrad of feature selection techniques (Tweet to lexicon feature vector, Tweet to input lexicon feature vector, Tweet to SentiStrength feature vector, and Tweet to embedding feature vector), each used to purify the data for the LibLINEAR (LL) classifier, which contains fast algorithms for linear classification using L2-regularized L2-loss support vector machines (Dual SVM). The LL classifier differs in that it can potentially alleviate the sum of the absolute values of errors rather than the sum of the squared errors and is typically much speedier.
Based on the overall performance parameters, the deployment scheme containing the tetrad of feature selection techniques with the LL classifier is considered the best choice for the purpose of classification. Among machine learning techniques, LL produces effective results and gives an efficient performance compared to other prevailing techniques. The findings of this research would be beneficial for Twitter users as well as cryptocurrency traders.

25

Herbert, Marjorie. "A new classifier-based plural morpheme in German Sign Language (DGS)." Sign Language and Linguistics 21, no. 1 (2018): 115–36. http://dx.doi.org/10.1075/sll.00012.her.

Abstract:

German Sign Language (DGS) displays variation in the simple plural, the form of which is conditioned by classes of phonological features within the lexicon. As a consequence, the overt realization of the plural marker is restricted to a small set of nouns specified for the appropriate phonological features, while the rest are left bare (Pfau & Steinbach 2005, 2006; Steinbach 2012). Pfau & Steinbach (2005) report a number of ‘alternative pluralization strategies’ available as repairs for this underspecification, including classifier constructions, spatial localization, and number and quantifier phrases. I propose a previously undescribed mechanism for plural marking, the ‘classifier-based plural morpheme’ (CLP), grammaticalized from the classifier system into a morpheme in the grammars of individual DGS signers. Elicitation data show that this morpheme attaches only to nouns which are specified for phonological features that restrict the realization of the canonical plural marker, adding a new option to the range of pluralization strategies available.

26

Šuman, Sabrina, Sanja Čandrlić, and Alen Jakupović. "A Corpus-Based Sentence Classifier for Entity–Relationship Modelling." Electronics 11, no. 6 (2022): 889. http://dx.doi.org/10.3390/electronics11060889.

Abstract:

Automated creation of a conceptual data model based on user requirements expressed in the textual form of a natural language is a challenging research area. The complexity of natural language requires deep insight into the semantics buried in words, expressions, and string patterns. For the purpose of natural language processing, we created a corpus of business descriptions and an adherent lexicon containing all the words in the corpus. Thus, it was possible to define rules for the automatic translation of business descriptions into the entity–relationship (ER) data model. However, since the translation rules could not always lead to accurate translations, we created an additional classification process layer—a classifier which assigns to each input sentence some of the defined ER method classes. The classifier represents a formalized knowledge of the four data modelling experts. This rule-based classification process is based on the extraction of ER information from a given sentence. After the detailed description, the classification process itself was evaluated and tested using the standard multiclass performance measures: recall, precision and accuracy. The accuracy in the learning phase was 96.77% and in the testing phase 95.79%.

27

Prianto, Cahyo, and Nurul Izza Hamka. "Sentimen Analisis Terhadap Pembelajaran Jarak Jauh Menggunakan Metode Naïve Bayes Classifier dan Lexicon Based." Jurnal Ilmu Komputer 14, no. 2 (2021): 79. http://dx.doi.org/10.24843/jik.2021.v14.i02.p02.

Abstract:

Since the start of the COVID-19 pandemic, every sector has been affected, especially education, where learning is now conducted remotely. This study was conducted to find out the responses of people within the education sphere, such as school students, university students, and teachers. A total of 265 respondents gave their responses by filling in a questionnaire in the form of a Google Form. With 265 respondents answering 6 statements each, 1,590 distinct answers were obtained. The 1,590 records were processed further, leaving a final dataset of 1,468. Labeling with the lexicon-based dictionary yielded 162 positive, 516 negative, and 790 neutral items. Testing with Naïve Bayes gave an accuracy of 53.8%, with effectiveness measured using a confusion matrix.

28

Maree, Mohammed, Mujahed Eleyat, Shatha Rabayah, and Mohammed Belkhatir. "A hybrid composite features based sentence level sentiment analyzer." IAES International Journal of Artificial Intelligence (IJ-AI) 12, no. 1 (2023): 284. http://dx.doi.org/10.11591/ijai.v12.i1.pp284-294.

Abstract:

Current lexica and machine learning based sentiment analysis approaches still suffer from a two-fold limitation. First, manual lexicon construction and machine training is time consuming and error-prone. Second, the prediction’s accuracy entails sentences and their corresponding training text should fall under the same domain. In this article, we experimentally evaluate four sentiment classifiers, namely Support Vector Machines, Naive Bayes, Logistic Regression and Random Forest. We quantify the quality of each of these models using three real-world datasets that comprise 50,000 movie reviews, 10,662 sentences, and 300 generic movie reviews. Specifically, we study the impact of a variety of natural language processing (NLP) pipelines on the quality of the predicted sentiment orientations. Additionally, we measure the impact of incorporating lexical semantic knowledge captured by WordNet on expanding original words in sentences. Findings demonstrate that utilizing different NLP pipelines and semantic relationships impacts the quality of the sentiment analyzers. In particular, results indicate that coupling lemmatization and knowledge-based n-gram features proved to produce higher accuracy results. With this coupling, the accuracy of the support vector machine (SVM) classifier has improved to 90.43%, while it was 86.83%, 90.11%, 86.20%, respectively using the three other classifiers.

29

Mohammed, Maree, Eleyat Mujahed, Rabayah Shatha, and Belkhatir Mohammed. "A hybrid composite features based sentence level sentiment analyzer." International Journal of Artificial Intelligence (IJ-AI) 12, no. 1 (2023): 284–94. https://doi.org/10.11591/ijai.v12.i1.pp284-294.

Abstract:

Current lexica and machine learning based sentiment analysis approaches still suffer from a two-fold limitation. First, manual lexicon construction and machine training is time consuming and error-prone. Second, the prediction’s accuracy entails sentences and their corresponding training text should fall under the same domain. In this article, we experimentally evaluate four sentiment classifiers, namely support vector machines (SVMs), Naive Bayes (NB), logistic regression (LR) and random forest (RF). We quantify the quality of each of these models using three real-world datasets that comprise 50,000 movie reviews, 10,662 sentences, and 300 generic movie reviews. Specifically, we study the impact of a variety of natural language processing (NLP) pipelines on the quality of the predicted sentiment orientations. Additionally, we measure the impact of incorporating lexical semantic knowledge captured by WordNet on expanding original words in sentences. Findings demonstrate that utilizing different NLP pipelines and semantic relationships impacts the quality of the sentiment analyzers. In particular, results indicate that coupling lemmatization and knowledge-based n-gram features proved to produce higher accuracy results. With this coupling, the accuracy of the SVM classifier has improved to 90.43%, while it was 86.83%, 90.11%, 86.20%, respectively using the three other classifiers.

30

GHOSH, MOUMITA, RANADHIR GHOSH, and BRIJESH VERMA. "A FULLY AUTOMATED OFFLINE HANDWRITING RECOGNITION SYSTEM INCORPORATING RULE BASED NEURAL NETWORK VALIDATED SEGMENTATION AND HYBRID NEURAL NETWORK CLASSIFIER." International Journal of Pattern Recognition and Artificial Intelligence 18, no. 07 (2004): 1267–83. http://dx.doi.org/10.1142/s0218001404003654.

Abstract:

In this paper we propose a fully automated offline handwriting recognition system that incorporates rule-based segmentation, contour-based feature extraction, neural network validation, a hybrid neural network classifier and a Hamming neural network lexicon. The work is based on our earlier promising results in this area using heuristic segmentation and contour-based feature extraction. The segmentation is done iteratively using a set of heuristic-based rules, followed by a neural network validation system. Feature extraction is performed using both contour-based and structure-based feature extraction algorithms. The classification is performed by a hybrid neural network that incorporates a hybrid combination of an evolutionary algorithm and a matrix-based solution method. Finally, a Hamming neural network is used as a lexicon. A benchmark dataset from CEDAR has been used for training and testing.

31

Mejova, Yelena, and Padmini Srinivasan. "Exploring Feature Definition and Selection for Sentiment Classifiers." Proceedings of the International AAAI Conference on Web and Social Media 5, no. 1 (2021): 546–49. http://dx.doi.org/10.1609/icwsm.v5i1.14163.

Abstract:

In this paper, we systematically explore feature definition and selection strategies for sentiment polarity classification. We begin by exploring basic questions, such as whether to use stemming, term frequency versus binary weighting, negation-enriched features, n-grams or phrases. We then move on to more complex aspects including feature selection using frequency-based vocabulary trimming, part-of-speech and lexicon selection (three types of lexicons), as well as using expected Mutual Information (MI). Using three product and movie review datasets of various sizes, we show, for example, that some techniques are more beneficial for larger datasets than for smaller ones. A classifier trained on only a few features ranked high by MI outperformed one trained on all features in large datasets, yet in small datasets this did not prove to be true. Finally, we perform a space and computation cost analysis to further understand the merits of various feature types.

32

Purwitasari, Diana, Adi Surya Suwardi Ansyah, Arya Putra Kurniawan, and Asiyah Nur Kholifah. "A Hybrid Method on Emotion Detection for Indonesian Tweets of COVID-19." Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) 7, no. 2 (2023): 254–62. http://dx.doi.org/10.29207/resti.v7i2.4816.

Abstract:

As a result of the COVID-19 pandemic, there have been restrictions on activities outside the home, which has caused people to interact more and express their emotions through social media platforms, one of which is Twitter. Previous studies on emotion classification used only one feature extraction method, namely either lexicon-based or word embedding. Feature extraction using the emotion lexicon has the advantage of recognizing emotional words in a sentence, while feature extraction using word embedding has the advantage of recognizing the semantic meaning. Therefore, the main contribution of this research is to use both lexicon feature extraction and word embedding to classify emotions. The classification technique used in this research is the Ensemble Voting Classifier, built by selecting the two best classifiers to try on both types of feature extraction. The experimental results for both types of feature extraction are the same, indicating that the best classifiers are Random Forest and SVM. Models using both types of feature extraction show increased accuracy compared to using only one. The results of this emotional analysis can be used to determine the public's reaction to an event, product, or public policy.

33

Tiana, Serly Marlis. "Analisis Sentimen Terhadap Kebijakan Pemerintah Tentang Ditutupnya Fitur Belanja Pada Tiktok Dengan Menggunakan Naïve Bayes Classifier Dan Random Forest Classifier." J-Com (Journal of Computer) 4, no. 1 (2024): 76–86. http://dx.doi.org/10.33330/j-com.v4i1.3140.

Abstract:

The Indonesian government faces a number of economic problems and needs appropriate solutions to improve economic conditions. Direct government involvement in responding to and solving these problems requires a deep understanding of the problems faced by the public. The market, as the center of the economy, has undergone significant transformation with the arrival of online platforms such as TikTok, which previously offered a popular shopping feature. In dealing with the positive and negative impacts of the TikTok shopping feature, the government decided to close it. This study aims to analyze public sentiment toward that policy using a lexicon-based method for labeling, and Naïve Bayes and Random Forest as classification models. Using data crawling techniques, this study presents a sentiment analysis to understand public views on the closure of the TikTok shopping feature and its implications for the Indonesian economy.

34

Hasan, Arid, Yudhi Raymond Ramadhan, and Minarto Minarto. "Sentiment Analysis of Telemedicine Applications on Twitter Using Lexicon-Based and Naive Bayes Classifier Methods." Jurnal Riset Informatika 5, no. 4 (2023): 481–90. http://dx.doi.org/10.34288/jri.v5i4.244.

Abstract:

Since the onset of the COVID-19 pandemic in Indonesia, many people have turned to telemedicine programs as an alternative to minimize social interactions, opting for consultations from the safety of their homes using smartphones and internet connectivity. Given the necessity for physical distancing and avoiding crowded places, these applications have become indispensable substitutes for in-person medical consultations. Numerous apps facilitating access to healthcare services have been introduced in Indonesia, ranging from business startups to initiatives by the Ministry of Health. Telemedicine can potentially revolutionize healthcare in Indonesia, addressing critical health challenges. A significant issue within Indonesia's healthcare system is the scarcity of doctors and their uneven distribution. With only four doctors per 10,000 people, this figure falls far below the WHO guideline of 10 doctors per 1,000. Sentiment analysis of these applications was conducted to evaluate how telemedicine applications meet public needs and offer an alternative solution. Lexicon-based and naive Bayes methods were employed to classify tweet data into positive, neutral, and negative sentiments. The results revealed 908 positive tweets, 172 negative tweets, and 168 neutral tweets, indicating predominantly positive public perceptions of telemedicine applications. The naive Bayes classifier exhibited a 74% accuracy rate, with a precision of 98% and a recall of 86%. These findings underscore the positive impact and acceptance of telemedicine applications among the Indonesian populace, emphasizing their significance in augmenting the nation's healthcare landscape.
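
For readers unfamiliar with how accuracy, precision, and recall figures like those reported in this abstract are derived, the Python sketch below computes them from confusion-matrix counts for a single class. The counts are hypothetical and are not taken from the study, and `binary_metrics` is an illustrative name; for a three-class task such as positive/neutral/negative, the per-class values would typically be averaged.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Hypothetical counts, not taken from the study.
acc, prec, rec = binary_metrics(tp=80, fp=10, fn=20, tn=90)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f}")
# prints: accuracy=0.85 precision=0.89 recall=0.80
```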

35

Prasetya, Heru, Ghulam Asrofi Buntoro, and Dyah Mustikasari. "ANALISIS SENTIMEN PADA CHANNEL AUTONETMAGZ TERHADAP REVIEW MOBIL ALMAZ 2019 DENGAN METODE NAIVE BAYES CLASSIFIER DAN LEXICON BASED." KOMPUTEK 4, no. 1 (2020): 58. http://dx.doi.org/10.24269/jkt.v4i1.358.

Abstract:

The Almaz is one of the products of the car company Wuling. Wuling cars have seen a large increase in sales over the last 7 months in Indonesia. Wuling ranks 9th in sales among 20 manufacturers even though its products have been in Indonesia for only 2 years. Although sales have increased, not all Indonesians comment positively; some comments are negative or neutral. People can now watch videos discussing the Wuling Almaz on YouTube, the most commonly used social media platform. On a vlogger's video, users can give feedback through comments in the form of opinions that are supportive (positive), strongly critical (negative), or neutral, such as questions. Given the variety of comments on the Autonetmagz YouTube channel, a technique is needed to divide them into positive, neutral, and negative opinion classes. This study applies preprocessing and labels opinions into 3 sentiment classes (positive, neutral, and negative) using the lexicon-based method, while classification uses the Naive Bayes classifier. The data consist of 1,000 comments on the Wuling Almaz review from the Autonetmagz YouTube channel. Lexicon-based labeling yielded 232 positive, 456 neutral, and 312 negative comments. Classification with the Naive Bayes classifier produced an accuracy of 66.5%, precision of 60.94%, and recall of 61.2%.

36

Alharbi, Omar. "Negation Handling in Machine Learning-Based Sentiment Classification for Colloquial Arabic." International Journal of Operations Research and Information Systems 11, no. 4 (2020): 33–45. http://dx.doi.org/10.4018/ijoris.2020100102.

Abstract:

One crucial aspect of sentiment analysis is negation handling, where the occurrence of negation can flip the sentiment of a review and negatively affect machine learning-based sentiment classification. The role of negation in Arabic sentiment analysis has been explored only to a limited extent, especially for colloquial Arabic. In this paper, the authors address the negation problem in colloquial Arabic sentiment classification using the machine learning approach. To this end, they propose a simple rule-based algorithm for handling the problem that affects the performance of a machine learning classifier. The rules were crafted based on observing many cases of negation, simple linguistic knowledge, and a sentiment lexicon. They also examine the impact of the proposed algorithm on the performance of different machine learning algorithms. Furthermore, they compare the performance of the classifiers when their algorithm is used against three baselines. The experimental results show that there is a positive impact on the classifiers when the proposed algorithm is used compared to the baselines.
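
The abstract does not spell out the authors' rules, so the Python sketch below only illustrates the general idea of rule-based negation handling: a sentiment word that falls within a short window after a negator has its polarity flipped. The lexicon, the negator list, the window size, and the English toy sentences are all assumptions made for illustration and do not reproduce the algorithm proposed for colloquial Arabic.

```python
# Hypothetical sentiment lexicon and negator list (illustrative only).
LEXICON = {"good": 1, "happy": 1, "bad": -1, "slow": -1}
NEGATORS = {"not", "never", "no"}

def score_with_negation(text, window=2):
    """Sum word polarities, flipping any sentiment word that appears
    within `window` tokens after a negator."""
    total = 0
    negate_left = 0  # number of upcoming tokens still under negation scope
    for tok in text.lower().split():
        if tok in NEGATORS:
            negate_left = window
            continue
        polarity = LEXICON.get(tok, 0)
        if negate_left > 0:
            polarity = -polarity
            negate_left -= 1
        total += polarity
    return total

print(score_with_negation("the service was good"))       # 1
print(score_with_negation("the service was not good"))   # -1
print(score_with_negation("not very good but not bad"))  # 0
```

A fixed window is the simplest scope rule; it can misfire when the negator and the sentiment word are far apart, which is one reason hand-crafted rules are usually tuned to the language at hand.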

37

Ashir, Abubakar M. "A Generalized Method for Sentiment Analysis across Different Sources." Applied Computational Intelligence and Soft Computing 2021 (December 18, 2021): 1–8. http://dx.doi.org/10.1155/2021/2529984.

Abstract:

Sentiment analysis is widely used in a variety of applications, such as online opinion gathering for policy directives in government, monitoring of customer and staff satisfaction in corporate bodies, public tension monitoring in politics and security structures, and so on. In recent times, the field has met a new set of challenges in which new algorithms have to contend with highly unstructured sources of sentiment expression emanating from online social media fora. In this study, a rule- and lexical-based procedure is proposed together with unsupervised machine learning to implement sentiment analysis with improved generalization ability across different sources. To deal with sources devoid of syntactic and grammatical structure, the approach incorporates a rule-based technique for emoticon detection, word contraction expansion, noise removal, and lexicon-based text preprocessing using lexical features such as part of speech (POS), stop words, and lemmatization for local context analysis. A text is broken into a number of tokens, each representing a sentence, and lexicon-dependent features are then extracted from each token. The features are merged using a combining function for a given text before being used to train a machine learning classifier. The proposed combining functions leverage averaging and information gain concepts. Experimental results with different machine learning classifiers indicate that improved performance with a great deal of generalization capacity across both structured and nonstructured sources can be realized. The findings show that carefully designed lexical features reinforce the learning process in unsupervised learning more than using word embeddings alone as the features.
Experimental results obtained from the movie review dataset (recall = 74.9%, precision = 70.9%, F1-score = 72.9%, and accuracy = 72.0%) and the Twitter sample datasets (recall = 93.4%, precision = 89.5%, F1-score = 91.4%, and accuracy = 91.1%) show the efficacy of the proposed approach in comparison with other state-of-the-art research studies.

38

Rizky Fauzi Akbar and Muhammad Habibi. "Sentiment Analysis Related National Social Security Agency for Employment in Indonesia: Hybrid Method Using Lexicon Based and Naive Bayes Classifier Approaches." INDONESIAN JOURNAL ON DATA SCIENCE 1, no. 1 (2023): 32–38. http://dx.doi.org/10.30989/ijds.v1i1.896.

Abstract:

The National Social Security Agency (BPJS) for Employment is the Social Security Administering Agency with the goal of ensuring that each participant or family member receives adequate necessities. In its implementation, information has spread, particularly on Twitter, regarding the Ministry of Health's decision on Old Age Security (JHT), which can only be disbursed/withdrawn after the participant turns 56 years old, causing both pros and cons among the public. Because these tweets had not been analyzed, extensive research is needed to collect relevant information based on netizens' viewpoints. This research describes sentiment analysis of tweets from Twitter using the terms JHT, BPJSTK, and BPJS, which yield 4154 tweets. We employ two approaches in this study: Lexicon Based and Naïve Bayes Classifier. According to this study, the accuracy on the testing data is 92% for the Lexicon Based approach and 95% for the Naïve Bayes Classifier. This study concluded that the JHT at BPJS Employment received unfavorable attitudes and negative reactions among users who discussed their rejection of the new restrictions, under which JHT can only be disbursed or withdrawn when participants at BPJS Employment are 56 years old.

39

Fauzi Akbar, Rizky, Muhammad Habibi, Puji Winar Cahyo, and Nafisa Alfi Sa'diya. "Metode Hybrid Menggunakan Pendekatan Lexicon Based dan Naive Bayes Classifier Untuk Analisis Sentimen Terkait Jaminan Hari Tua." Teknomatika: Jurnal Informatika dan Komputer 16, no. 2 (2023): 73–79. http://dx.doi.org/10.30989/teknomatika.v16i2.1247.

Abstract:

The Employment Social Security Administering Agency (Badan Penyelenggara Jaminan Sosial (BPJS) Ketenagakerjaan) is a public legal body established under Law No. 24 of 2011 on the Social Security Administering Agency, with the aim of guaranteeing that the basic needs of every participant or their family members are adequately met. In practice, information has spread, particularly in tweets on Twitter, about a decision of the Ministry of Health concerning Old Age Security (Jaminan Hari Tua, JHT), which can only be disbursed or withdrawn after a BPJS Ketenagakerjaan participant reaches 56 years of age, prompting both support and opposition among the public. Because these tweets had not yet been analyzed, an in-depth analysis is needed to obtain relevant information based on netizens' opinions. This study obtained a testing-data accuracy of 92% for the Lexicon Based method and 95% for the Naïve Bayes Classifier method, while the Naïve Bayes Classifier reached 82% accuracy on the training data. The study concludes that JHT at BPJS Ketenagakerjaan received negative sentiment from netizens, many of whom discussed rejecting the new regulation under which JHT can only be disbursed or withdrawn once participants reach 56 years of age.

40

Mustofa, R. L., and B. Prasetiyo. "Sentiment analysis using lexicon-based method with naive bayes classifier algorithm on #newnormal hashtag in twitter." Journal of Physics: Conference Series 1918, no. 4 (2021): 042155. http://dx.doi.org/10.1088/1742-6596/1918/4/042155.

41

Khalifa, Khalid, and Nazlia Omar. "A HYBRID METHOD USING LEXICON-BASED APPROACH AND NAIVE BAYES CLASSIFIER FOR ARABIC OPINION QUESTION ANSWERING." Journal of Computer Science 10, no. 10 (2014): 1961–68. http://dx.doi.org/10.3844/jcssp.2014.1961.1968.

42

Dubey, Gaurav, Santosh Kumar, Pavas Navaney, and Sunil Kumar. "Extended opinion lexicon and ML-based sentiment analysis of tweets: a novel approach towards accurate classifier." International Journal of Computational Vision and Robotics 10, no. 6 (2020): 505. http://dx.doi.org/10.1504/ijcvr.2020.10031564.

43

Dubey, Gaurav, Santosh Kumar, Sunil Kumar, and Pavas Navaney. "Extended opinion lexicon and ML-based sentiment analysis of tweets: a novel approach towards accurate classifier." International Journal of Computational Vision and Robotics 10, no. 6 (2020): 505. http://dx.doi.org/10.1504/ijcvr.2020.110640.

44

Hadjadji, Bilal, Youcef Chibani, and Hassiba Nemmour. "Hybrid one-class classifier ensemble based on fuzzy integral for open-lexicon handwritten Arabic word recognition." Pattern Analysis and Applications 22, no. 1 (2018): 99–113. http://dx.doi.org/10.1007/s10044-018-0735-y.

45

Anggina, Sarah, Nanang Yudi Setiawan, and Fitra A. Bachtiar. "Analisis Ulasan Pelanggan Menggunakan Multinomial Naïve Bayes Classifier dengan Lexicon-Based dan TF-IDF Pada Formaggio Coffee and Resto." is The Best Accounting Information Systems and Information Technology Business Enterprise 7, no. 1 (2022): 76–90. http://dx.doi.org/10.34010/aisthebest.v7i1.7072.

Abstract:

Formaggio Coffee and Resto Tangerang serves western dishes with flavors adapted to Indonesian tastes. Because the number of restaurants in Tangerang City grows every year, Formaggio Coffee and Resto needs a competitive advantage, which it seeks by increasing customer satisfaction. Customer satisfaction can be achieved when customer expectations are met. Formaggio's management regards criticism and suggestions from customers as positive input that can improve its performance. However, the large number of customer reviews scattered across various sites makes it difficult for the restaurant to manage customer opinions. This can be addressed by web scraping Traveloka, PergiKuliner, Zomato, and Google Review, which yielded 741 reviews spanning 2018 to 2021. One way to extract information from these reviews is sentiment analysis using features from the Indonesian Sentiment Lexicon (InSet Lexicon) dictionary and TF-IDF weighting, together with the Multinomial Naïve Bayes classification algorithm. The classification model was then evaluated using a Confusion Matrix with four parameters: accuracy, recall, precision, and f1-score. The average values obtained for these parameters were 95%, 68%, 85%, and 72%, respectively. The results were then visualized in a dashboard and tested with the System Usability Scale (SUS) questionnaire, giving a final score of 67.5, which means the dashboard was well accepted by Formaggio's management.
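The four evaluation parameters named above can all be read off a confusion matrix. The following is a minimal sketch of that calculation; the abstract does not say how the per-class values were averaged, so macro averaging (an unweighted mean over classes) is an assumption here, and the function name and the example matrix are hypothetical.

```python
def macro_metrics(confusion):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of items whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[k][k] for k in range(n)) / total

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)

    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }

# Made-up two-class matrix: rows are true classes, columns are predictions.
metrics = macro_metrics([[8, 2], [1, 9]])
```

With imbalanced classes, macro-averaged recall and F1 can sit well below accuracy, which is consistent with the pattern of the reported figures (95% accuracy against 68% recall), though the paper's exact averaging scheme is not stated.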

46

Khatoon, Shaheen, and Lamis Abu Romman. "Domain Independent Automatic Labeling system for Large-scale Social Data using Lexicon and Web-based Augmentation." Information Technology And Control 49, no. 1 (2020): 36–54. http://dx.doi.org/10.5755/j01.itc.49.1.23769.

Abstract:

Recently, with the large-scale adoption of social media, people have begun to express their opinions on these sites in the form of reviews. Potential consumers are often forced to wade through a huge number of reviews to make an informed decision. Sentiment analysis has become a rapid and effective way to automatically gauge consumers' opinions. However, such analysis often requires the tedious process of manually tagging large sets of training examples or manually building a lexicon to classify reviews as positive or negative. In this paper, we present a method to automate the tedious process of labeling large textual data in an unsupervised, domain-independent, and scalable manner. The proposed method combines lexicon-based and Web-based Pointwise Mutual Information (PMI) statistics to find the Semantic Orientation (SO) of the opinion expressed in a review. Based on the proposed method, a system called Domain Independent Automatic Labeling System (DIALS) has been implemented, which takes a collection of text from any domain as input and generates a fully labeled dataset in an unsupervised and scalable manner. The generated result can be used to track and summarize online discussion and/or to train any classifier in the next stage of development. The effectiveness of the system is tested by comparing it with baseline machine learning and lexicon-based methods. Experiments on a multi-domain dataset show that the proposed method consistently improves recall and accuracy compared with baseline machine learning and lexicon-based methods.
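The PMI-based semantic orientation mentioned above is commonly computed as SO(w) = PMI(w, positive seed) - PMI(w, negative seed), which simplifies to a log ratio of co-occurrence counts. The sketch below illustrates that formula on a small in-memory corpus. It is not the DIALS implementation: the seed words, the smoothing constant, the toy documents, and the use of local document counts in place of Web hit counts are all assumptions made for illustration.

```python
import math

def semantic_orientation(word, docs, pos_seed="excellent", neg_seed="poor", smooth=0.01):
    """SO(word) = PMI(word, pos_seed) - PMI(word, neg_seed).

    docs is a list of token sets; counts are document co-occurrences.
    Both seeds are assumed to occur in at least one document.
    """
    def hits(*terms):
        # Number of documents that contain every given term.
        return sum(1 for d in docs if all(t in d for t in terms))

    near_pos = hits(word, pos_seed) + smooth  # smoothing avoids log(0) and division by zero
    near_neg = hits(word, neg_seed) + smooth
    return math.log2((near_pos * hits(neg_seed)) / (near_neg * hits(pos_seed)))

# Made-up review snippets, each reduced to a set of tokens.
docs = [
    set("excellent room and friendly staff".split()),
    set("friendly service excellent food".split()),
    set("poor room and rude staff".split()),
    set("rude reply poor service".split()),
]
so_friendly = semantic_orientation("friendly", docs)
so_rude = semantic_orientation("rude", docs)
```

A positive score marks a word that co-occurs more with the positive seed, a negative score the opposite; summing or averaging word scores over a review is one way to turn them into a document label.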

47

ROJRATANAVIJIT, Jitrlada, Preecha VICHITTHAMAROS, and Sukanya PHONGSUPHAP. "Acquiring Sentiment from Twitter using Supervised Learning and Lexicon-based Techniques." Walailak Journal of Science and Technology (WJST) 15, no. 1 (2016): 63–80. http://dx.doi.org/10.48048/wjst.2018.2731.

Abstract:

The emergence of Twitter in Thailand has given millions of users a platform to express and share their opinions about products and services, among other subjects, and so Twitter is considered to be a rich source of information for companies to understand their customers by extracting and analyzing sentiment from Tweets. This offers companies a fast and effective way to monitor public opinions on their brands, products, services, etc. However, sentiment analysis performed on Thai Tweets has challenges brought about by language-related issues, such as the difference in writing systems between Thai and English, short-length messages, slang words, and word usage variation. This research paper focuses on Tweet classification and on solving data sparsity issues. We propose a mixed method of supervised learning techniques and lexicon-based techniques to filter Thai opinions and to then classify them into positive, negative, or neutral sentiments. The proposed method includes a number of pre-processing steps before the text is fed to the classifier. Experimental results showed that the proposed method overcame previous limitations from other studies and was very effective in most cases. The average accuracy was 84.80 %, with 82.42 % precision, 83.88 % recall, and 82.97 % F-measure.

48

Neha, Kumari Dr. Mukesh Kumar. "SENTIMENT ANALYSIS USING NOVEL NORMALISATION, NEGATION HANDLING USING MACHINE LEARNING ALGORITHM." International Journal For Technological Research In Engineering 11, no. 5 (2024): 74–81. https://doi.org/10.5281/zenodo.10472620.

Abstract:

With advances in technology, use of microblogging sites such as Twitter has surged for sharing feelings and emotions about current hot topics, products, services, and events. Such opinionated data needs to be leveraged effectively to gain valuable insight. This research work focused on designing a comprehensive feature-based Twitter Sentiment Analysis (TSA) framework using a supervised machine learning approach with an integrated, sophisticated negation handling approach and a knowledge-based Tweet Normalization System (TNS). We leveraged a variety of features, such as lexicon-based, POS-based, morphological, n-gram, negation, and cluster-based features, to ascertain which classifier works well with which feature group. We employed three state-of-the-art classifiers, including Support Vector Machine (SVM), for our Twitter sentiment analysis framework. We found SVM to be the best-performing classifier across all the Twitter datasets except #9pm9minutes (DTC turned out to be the best for this dataset). Moreover, our SVM model trained on the SemEval-2013 training dataset outperformed the winning team, NRC Canada, of SemEval-2013 Task 2 in terms of macro-averaged F1 score, averaged over the positive and negative classes only.

49

Zheng, Hang, Qingsong Li, Shen Chen, Yuxuan Liang, and Li Liu. "SENCR: A Span Enhanced Two-Stage Network with Counterfactual Rethinking for Chinese NER." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 17 (2024): 19679–87. http://dx.doi.org/10.1609/aaai.v38i17.29941.

Abstract:

Recently, many works that incorporate external lexicon information into character-level Chinese named entity recognition (NER), to compensate for the lack of natural word delimiters, have achieved strong performance. However, obtaining and maintaining high-quality lexicons is costly, especially in specialized domains. In addition, the entity boundary bias caused by high mention coverage of some boundary characters poses a significant challenge to the generalization of NER models but receives little attention in the existing literature. To address these issues, we propose SENCR, a Span Enhanced Two-stage Network with Counterfactual Rethinking for Chinese NER, which contains a boundary detector for boundary supervision, a convolution-based type classifier for better span representation, and a counterfactual rethinking (CR) strategy for debiased boundary detection at inference. The proposed boundary detector and type classifier are jointly trained with the same contextual encoder, and the trained boundary detector is then debiased by our proposed CR strategy in the inference stage without modifying any model parameters. Extensive experiments on four Chinese NER datasets show the effectiveness of our proposed approach.

50

Nasser, Ahmed, and Hayri Sever. "A Concept-based Sentiment Analysis Approach for Arabic." International Arab Journal of Information Technology 17, no. 5 (2020): 778–88. http://dx.doi.org/10.34028/iajit/17/5/11.

Abstract:

Concept-Based Sentiment Analysis (CBSA) methods are considered more advanced and more accurate than ordinary Sentiment Analysis methods because they can detect the emotions conveyed by multi-word expression concepts in language. This paper presents a CBSA system for the Arabic language that uses both machine learning approaches and a concept-based sentiment lexicon. For extracting concepts from Arabic, a rule-based concept extraction algorithm called semantic parser is proposed. Different types of feature extraction and representation techniques are tried during the building process of the sentiment analysis model for the presented Arabic CBSA system. Comprehensive and comparative experiments using different types of classification methods and classifier fusion models, together with different combinations of our proposed feature sets, are used to evaluate and test the presented CBSA system. The experimental results showed that the best performance for the sentiment analysis model is achieved by the combined Support Vector Machine-Logistic Regression (SVM-LR) model, which obtained an F-score of 93.23% using the Concept-Based-Features+Lexicon-Based-Features+Word2vec-Features (CBF+LEX+W2V) feature combination.
