Journal articles on the topic 'Tweet Classifier'

Consult the top 50 journal articles for your research on the topic 'Tweet Classifier.'

1

V, Ashwin. "Twitter Tweet Classifier." IAES International Journal of Artificial Intelligence (IJ-AI) 5, no. 1 (March 1, 2016): 41. http://dx.doi.org/10.11591/ijai.v5.i1.pp41-44.

Abstract:
This paper addresses the task of building a classifier that categorises tweets on Twitter. Microblogging has become a communication tool for Internet users, who share opinions on different aspects of life. As the popularity of microblogging sites increases, we move closer to the era of information explosion. Twitter is the second most used microblogging site and handles more than 500 million tweets every day, which translates to a mind-boggling 5,700 tweets per second. Despite the heavy usage of Twitter, there is no specific classifier for the tweets posted on the site. This research attempts to segregate tweets and classify them into categories such as Sports, News, Entertainment, Technology, Music, TV, Meme, etc. Naïve Bayes, a machine learning algorithm, is used to build a classifier that classifies tweets once trained on a Twitter corpus. With this kind of classifier, the user can simply skim the tweets without the tedious work of going through the whole newsfeed.
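The abstract above names Naïve Bayes as the algorithm but gives no implementation details. As a rough illustration only, and not the author's code, the following is a minimal sketch of a multinomial Naïve Bayes topic classifier with add-one smoothing. The class name, the tokenizer, and the six training tweets are all hypothetical and chosen just to make the example run.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesTweetClassifier:
    """Multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()            # tweets seen per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()

    def fit(self, tweets, labels):
        for text, label in zip(tweets, labels):
            self.class_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokenize(text):
                if tok not in self.vocab:
                    continue  # tokens never seen in training are ignored
                count = self.word_counts[label][tok]
                score += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy corpus; a real system would train on a labeled Twitter corpus.
train = [
    ("The striker scored a late goal in the final match", "Sports"),
    ("Great match tonight, the team won the league", "Sports"),
    ("New phone released with a faster processor", "Technology"),
    ("The software update fixes a security bug in the phone", "Technology"),
    ("The band announced a new album and tour", "Music"),
    ("Loved the guitar solo on that album", "Music"),
]
model = NaiveBayesTweetClassifier().fit(
    [text for text, _ in train], [label for _, label in train]
)
print(model.predict("What a goal, the team played a great match"))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the smoothing term keeps a single unseen word from zeroing out a class.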
2

Harsehanto, Ireicca Agustiorini, and M. Didik R. Wahyudi. "Analysis of Personality Characteristic Using the Naïve Bayess Classifier Algorithm (Case Study Official Twitter of Basuki Tjahaja Purnama's and Anies Baswedan)." IJID (International Journal on Informatics for Development) 7, no. 2 (January 7, 2019): 14. http://dx.doi.org/10.14421/ijid.2018.07203.

Abstract:
This research uses Twitter data drawn from the user_timeline of @basuki_btp and @aniesbaswedan. The study uses 2100 tweets. The collected data are first pre-processed and labeled manually. They are then classified with the Naïve Bayes Classifier algorithm according to the Big Five personality theory. Testing used 500 tweets as training data and 1600 tweets as testing data. The classification results obtained with the Naïve Bayes Classifier method are grouped into the "Big Five" personality groups (Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism) for tweet data in Indonesian.
3

Tarigan, Thomas Edison, Robby C. Buwono, and Sri Redjeki. "Extraction Opinion of Social Media in Higher Education Using Sentiment Analysis." bit-Tech 2, no. 1 (October 30, 2019): 11–19. http://dx.doi.org/10.32877/bt.v2i1.92.

Abstract:
The purpose of this research is to extract opinion about a tertiary institution from Twitter using sentiment analysis. The results of the sentiment analysis provide input to universities as a form of evaluation of management performance in running the institution. Sentiment is classified with the Naïve Bayes Classifier method into 4 classes: positive, normal, negative, and unknown. The study uses 1000 tweets as training data, labeled manually to determine the sentiment of each tweet; 20 tweets are then used for testing. The result is a system that can classify sentiment automatically, with a test result of 75%. Obstacles in processing real-time tweets include duplicate tweets (spam tweets) and the complexity and diversity of Indonesian sentence structures.
4

Talpur, Bandeh Ali, and Declan O’Sullivan. "Multi-Class Imbalance in Text Classification: A Feature Engineering Approach to Detect Cyberbullying in Twitter." Informatics 7, no. 4 (November 15, 2020): 52. http://dx.doi.org/10.3390/informatics7040052.

Abstract:
Twitter enables millions of active users to send and read concise messages on the internet every day. Yet some people use Twitter to propagate violent and threatening messages, resulting in cyberbullying. Previous research has focused on whether cyberbullying behavior exists or not in a tweet (binary classification). In this research, we developed a model for detecting the severity of cyberbullying in a tweet. The developed model is a feature-based model that uses features from the content of a tweet to develop a machine learning classifier for classifying tweets as non-cyberbullied, or as low, medium, or high-level cyberbullied tweets. In this study, we introduced pointwise semantic orientation as a new input feature along with utilizing predicted features (gender, age, and personality type) and Twitter API features. Results from experiments with our proposed framework in a multi-class setting are promising with respect to the Kappa (84%), classifier accuracy (93%), and F-measure (92%) metrics. Overall, 40% of the classifiers increased performance in comparison with baseline approaches. Our analysis shows that the features with the highest odds ratio are: for detecting low-level severity, the age group between 19–22 years and users with <1 year of Twitter account activation; for medium-level severity, neuroticism, the age group between 23–29 years, and being a Twitter user for between one and two years; and for high-level severity, neuroticism and extraversion, and the number of times a tweet has been favorited by other users. We believe that this research using a multi-class classification approach provides a step forward in identifying severity at different levels (low, medium, high) when the content of a tweet is classified as cyberbullied. Lastly, the current study only focused on the Twitter platform; other social network platforms can be investigated using the same approach to detect cyberbullying severity patterns.
5

da Silva, Nádia F. F., Eduardo R. Hruschka, and Estevam R. Hruschka. "Tweet sentiment analysis with classifier ensembles." Decision Support Systems 66 (October 2014): 170–79. http://dx.doi.org/10.1016/j.dss.2014.07.003.

6

Zuhri, Khoirul, and Nurul Adha Oktarini Saputri. "Analisis Sentimen Masyarakat Terhadap Pilpres 2019 Berdasarkan Opini Dari Twitter Menggunakan Metode Naive Bayes Classifier." Journal of Computer and Information Systems Ampera 1, no. 3 (September 17, 2020): 185–99. http://dx.doi.org/10.51519/journalcisa.v1i3.45.

Abstract:
Twitter is a currently popular social media platform where the public is free to comment and write anything. It is not uncommon for people to comment with harsh words and even hate speech. The 2019 presidential election drew many comments; some praised, some criticized, and some insulted. Sentiment analysis is needed to dig up information and classify text. In this study, sentiment analysis is the process of classifying textual documents into two classes, negative and positive sentiment. Opinion data were obtained from Twitter in the form of tweets. The data comprised 3337 tweets, split into 80% training data and 20% test data. Training data is data with known sentiment. The study aims to determine whether a tweet posted on Twitter in Indonesian is positive or negative. The tweets are classified with the naïve Bayes classifier algorithm. The classification results on the test data show that the Naïve Bayes Classifier algorithm achieves an accuracy of 71%. The accuracy for each sentiment is 71% for positive sentiment and 70% for negative sentiment.
7

Hafidz, Noor, and Dewi Yanti Liliana. "Klasifikasi Sentimen pada Twitter Terhadap WHO Terkait Covid-19 Menggunakan SVM, N-Gram, PSO." Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) 5, no. 2 (April 28, 2021): 213–19. http://dx.doi.org/10.29207/resti.v5i2.2960.

Abstract:
In March 2020, the World Health Organization (WHO) declared Covid-19 a global pandemic. As the United Nations agency responsible for international public health, WHO has taken various actions to reduce the rate at which the pandemic spreads. However, WHO's handling of Covid-19 has not been free from controversies, which gave rise to criticism and public opinion on the Twitter platform. In this research, a machine-learning-based classifier model was built to determine the opinion or sentiment of a tweet. The dataset is a set of tweets containing the phrases WHO and Covid-19, posted between March 1 and May 6, 2020, consisting of 4000 tweets with positive sentiment and 4000 tweets with negative sentiment. The proposed classifier model combines Support Vector Machine (SVM), N-gram, and Particle Swarm Optimization (PSO). Its performance is evaluated using Accuracy, Precision, Recall, and Area Under the ROC Curve (AUC). In the experiments, the combination of SVM, N-gram (bigram), and PSO performed fairly well in classifying tweet sentiment, with Accuracy 0.755, Precision 0.719, Recall 0.837, and AUC 0.844.
8

Silitonga, Wiranto Horsen, and Jay Idoan Sihotang. "Analisis Sentimen Pemilihan Presiden Indonesia Tahun 2019 Di Twitter Berdasarkan Geolocation Menggunakan Metode Naïve Bayesian Classification." TeIKa 9, no. 02 (October 31, 2019): 115–27. http://dx.doi.org/10.36342/teika.v9i02.2199.

Abstract:
The 2019 Indonesian presidential election was widely discussed both offline and online, especially on the social media platform Twitter. Everyone was free to express opinions about the 2019 presidential candidate pairs, which gave rise to many opinions: not only positive or neutral ones but negative ones as well. Social media, Twitter in particular, has become an effective and efficient venue for promotion or campaigning to attract supporters. In this study, the researchers examine the public figures running for President of Indonesia. The method used is the Naïve Bayesian Classifier classification algorithm. The data are 1009 Indonesian-language tweets containing the keywords Jokowi (#Jokowi2Periode) and Prabowo (#PrabowoSandi), collected over 5 months from 1 September 2019 to 31 January 2019. The tweets were taken from four of the largest regions in Indonesia: Jakarta, Bandung, Medan, and Surabaya. Each item was collected manually using the Geolocation API provided by Twitter through Twitter search. Classification with the Naïve Bayesian Classifier algorithm yielded 839 positive, 32 negative, and 67 neutral tweets out of 938 tweets in total about Mr. Joko Widodo, or in percentage terms 90% positive, 3% negative, and 7% neutral sentiment. For Mr. Prabowo it yielded 56 positive, 6 negative, and 8 neutral tweets out of 70 tweets in total, or 80% positive, 9% negative, and 11% neutral sentiment. The accuracy of the Naïve Bayesian Classifier algorithm in this study was 77.62%.
9

Rozi, Imam Fahrur, Elok Nur Hamdana, and Muhammad Balya Iqbal Alfahmi. "PENGEMBANGAN APLIKASI ANALISIS SENTIMEN TWITTER MENGGUNAKAN METODE NAÏVE BAYES CLASSIFIER (Studi Kasus SAMSAT Kota Malang)." Jurnal Informatika Polinema 4, no. 2 (February 1, 2018): 149. http://dx.doi.org/10.33795/jip.v4i2.164.

Abstract:
Twitter is a social media platform where users can search for particular topics and discuss current issues. Short messages, or tweets, can contain opinions about products and services experienced by the public. This data can serve as a data source for research. This study aims to build a sentiment analysis application that applies the Naïve Bayes Classifier approach to classify words, focusing on tweets in Indonesian. The data were obtained by web scraping, and the topic of the source text is the Sistem Administrasi Manunggal Satu Atap (SAMSAT) of Malang City. Classification proceeds through a series of stages: preprocessing (case folding, cleaning, tokenizing, and stopword removal), followed by classification with the Naïve Bayes Classifier algorithm itself, which assigns the category positive, negative, or neutral. Based on the results, the Naïve Bayes Classifier algorithm performs well in sentiment analysis. The classification accuracy tests carried out by the application gave highest accuracy values of 82%, 92%, and 80% for the positive, negative, and neutral categories respectively, with training data of 200 negative tweets, 200 positive tweets, and 200 neutral tweets.
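The preprocessing stages named in this abstract (case folding, cleaning, tokenizing, stopword removal) can be sketched as a short pipeline. This is an illustrative assumption rather than the authors' implementation: the regular expressions, the decision to drop mentions and hashtags during cleaning, the tiny stopword list, and the example tweet and handle are all hypothetical.

```python
import re

# Hypothetical, deliberately tiny Indonesian stopword list for illustration only.
STOPWORDS = {"yang", "di", "dan", "ke", "dari", "ini", "itu"}


def preprocess(tweet, stopwords=STOPWORDS):
    """Apply case folding, cleaning, tokenizing, and stopword removal."""
    text = tweet.lower()                        # case folding
    text = re.sub(r"https?://\S+", " ", text)   # cleaning: strip URLs
    text = re.sub(r"[@#]\w+", " ", text)        # cleaning: strip mentions and hashtags
    text = re.sub(r"[^a-z\s]", " ", text)       # cleaning: strip digits and punctuation
    tokens = text.split()                       # tokenizing on whitespace
    return [tok for tok in tokens if tok not in stopwords]  # stopword removal


# Hypothetical example tweet; the account handle is made up.
example = "Pelayanan di @samsat_malang cepat dan ramah! https://t.co/abc123 #SAMSAT"
print(preprocess(example))
```

The resulting token lists are what a Naïve Bayes classifier would count when estimating per-class word probabilities.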
10

Tarihoran, Yusran, and Kevin Jeremy Manurip. "Analisis Sentimen Pemilihan Gubernur Jawa Barat Tahun 2018 Dengan Aplikasi Twitter Menggunakan Metode Naïve Bayesian Classification." TeIKa 8, no. 1 (April 30, 2018): 99–105. http://dx.doi.org/10.36342/teika.v8i1.2243.

Abstract:
The 2018 West Java gubernatorial election was widely discussed both offline and online, especially on the social media platform Twitter. Everyone was free to express views or opinions about the 2018 West Java gubernatorial candidates, which gave rise to many opinions: not only positive or neutral ones but negative ones as well. Social media, Twitter in particular, has become an effective and efficient venue for promotion or campaigning to attract supporters. In this study, the researchers examine one of the public figures running for governor of West Java. The method used is the Naïve Bayesian Classifier classification algorithm. The data are 1031 Indonesian-language tweets containing the keyword Ridwan Kamil (#RidwanKamil), collected daily from 15 January 2018 to 15 April 2018. Classification with the Naïve Bayesian Classifier algorithm found 690 tweets, or 67% of all tweets, that support Mr. Ridwan Kamil or are positive, particularly toward his planned work programs, with an accuracy (Correctly Classified Instances) of 73.13%.
11

Ulfa, Maria Arista, Budi Irmawati, and Ario Yudo Husodo. "Twitter Sentiment Analysis using Naïve Bayes Classifier with Mutual Information Feature Selection." Journal of Computer Science and Informatics Engineering (J-Cosine) 2, no. 2 (December 24, 2018): 106–11. http://dx.doi.org/10.29303/jcosine.v2i2.120.

Abstract:
Sentiment analysis is a technique for identifying emotions expressed through text. Its goal is to determine whether an opinion in a sentence or document falls into the positive or negative category. Twitter is one of the social media platforms frequently used to express opinions. Twitter lets its users write their opinions on various topics in a tweet. The Twitter data in this study were downloaded through the Twitter Application Programming Interface (API). The data consist of 500 tweets about Lombok tourism with the hashtags #lombok and #woderfullombok. Informative features of each tweet were selected using the Mutual Information method and analyzed using the Naïve Bayes classification model (Naïve Bayes Classifier). Testing of Twitter sentiment classification into positive and negative categories using 10-fold cross validation achieved an average accuracy of 97.9%. Keywords: Sentiment Analysis, Twitter, Naïve Bayes Classifier, Mutual Information
12

Hidayatillah, Rumaisah, Mirwan Mirwan, Mohammad Hakam, and Aryo Nugroho. "Levels of Political Participation Based on Naive Bayes Classifier." IJCCS (Indonesian Journal of Computing and Cybernetics Systems) 13, no. 1 (January 31, 2019): 73. http://dx.doi.org/10.22146/ijccs.42531.

Abstract:
Social media is growing rapidly and globally and has become an important part of society. During the campaign period for regional head elections in Indonesia, the candidates and their supporting parties actively use social media as a campaign tool. Social media such as Twitter is known as a political microblogging medium that can provide data about current political events based on users’ tweets. Using Twitter as a data source, this study analyzes public participation during the campaign period for the 2018 Central Java regional head election. The purpose is to observe how much reaction each candidate in the election receives. A crawling program downloads all tweets containing certain candidate names. After a series of preprocessing stages, the data are classified using Naive Bayes. The predictor features in the classification datasets are the number of replies, retweets, and likes, while the target variable is the reaction, divided into three levels: high, medium, and low. These levels are determined from users’ reactions to a tweet. Using these rules, Naive Bayes classified 76.74% of the data correctly for Ganjar Pranowo and 68.81% for Sudirman Said.
13

Astari, Ni Made Ayu Juli, Dewa Gede Hendra Divayana, and Gede Indrawan. "Analisis Sentimen Dokumen Twitter Mengenai Dampak Virus Corona Menggunakan Metode Naive Bayes Classifier." Jurnal Sistem dan Informatika (JSI) 15, no. 1 (November 30, 2020): 27–29. http://dx.doi.org/10.30864/jsi.v15i1.332.

Abstract:
The coronavirus became an international problem in 2020. It has had a major impact on people's lives. The Indonesian government has taken a role in curbing the rise in coronavirus cases by restricting public activities outside the home. One significant impact of the coronavirus is on the economic sector. Sentiment analysis is therefore needed to determine the tendency of public opinion on the impact of the coronavirus. Twitter is one of the platforms people use to express current conditions since the coronavirus spread. The aim of this study is to analyze text documents to obtain the public's positive or negative sentiment. The data are tweet documents from Twitter about the impact of the coronavirus. The collected data were divided into training data and test data for the classification process. The classification method used in this study is the Naive Bayes Classifier. The classification results were evaluated using accuracy and error rate to determine how accurately documents were classified into positive or negative sentiment. The results show that Naive Bayes can classify tweet documents with an accuracy of 67% and an error rate of 33%. Experiments with three different data sizes (100, 200, and 500) produced accuracy values that differed only slightly, by 0.02. This indicates that Naive Bayes delivers stable performance for classifying tweet data on the impact of the coronavirus. The accuracy obtained is fairly good, and future research could take the semantic elements of tweet documents into account.
14

Sari, Diana Ika, Yuliana Fajar Wati, and Widiastuti. "ANALISIS SENTIMEN DAN KLASIFIKASI TWEETS BERBAHASA INDONESIA TERHADAP TRANSPORTASI UMUM MRT JAKARTA MENGGUNAKAN NAÏVE BAYES CLASSIFIER." Jurnal Ilmiah Informatika Komputer 25, no. 1 (2020): 64–75. http://dx.doi.org/10.35760/ik.2020.v25i1.2427.

Abstract:
Social media is widely used as a means of accessing and spreading information, Twitter being one example. In this study Twitter serves as the source of information, namely as data for analyzing Indonesian-language tweets that discuss Jakarta's new public transport system, MRT Jakarta. Sentiment analysis of MRT Jakarta's Twitter is used to see whether MRT Jakarta users' responses tend to be positive or negative, based on tweets from MRT Jakarta's Twitter. This sentiment analysis can help the Indonesian public choose comfortable and safe public transport based on reviews of public transport posted on Twitter by MRT Jakarta users. The results can be used to improve the MRT Jakarta system, in both services and facilities, so as to attract the public to use MRT Jakarta as a means of transport. The sentiment analysis uses the Naïve Bayes Classifier, a classification method. The first stages of the program are crawling and preprocessing, which consists of case folding, cleansing, stopword removal, stemming, emoticon conversion, and tokenization. The classification stage follows the preprocessing phase and labels each tweet as tending positive or negative using the Naïve Bayes Classifier. The system's accuracy in sentiment analysis of the tweets on MRT Jakarta's Twitter is 96%.
15

Munarko, Y., M. S. Sutrisno, W. A. I. Mahardika, I. Nuryasin, and Y. Azhar. "Named entity recognition model for Indonesian tweet using CRF classifier." IOP Conference Series: Materials Science and Engineering 403 (October 9, 2018): 012067. http://dx.doi.org/10.1088/1757-899x/403/1/012067.

16

Mala Olhang, Maria Mega, Sentot Achmadi, and F. X. Ari Wibisono. "ANALISIS SENTIMEN PENGGUNA TWITTER TERHADAP COVID-19 DI INDONESIA MENGGUNAKAN METODE NAIVE BAYES CLASSIFIER (NBC)." JATI (Jurnal Mahasiswa Teknik Informatika) 4, no. 2 (December 19, 2020): 214–21. http://dx.doi.org/10.36040/jati.v4i2.2695.

Abstract:
Social media, Twitter in particular, currently carries much discussion of the spread of the coronavirus, better known as COVID-19. Beginning with the discovery of the first case in Wuhan, China, coverage of the coronavirus continued until its spread reached Indonesia. Coverage through articles on Twitter about the impact of COVID-19, including staple goods starting to rise in price (among them masks and hand sanitizer) and the public's agreement and disagreement with government policy seen as insufficiently responsive in handling the outbreak, attracted great interest and criticism from the public. In this study, public sentiment toward the aspirations voiced through Twitter is analyzed by developing a system, with reference to various existing systems, that uses the Naïve Bayes Classifier method to classify sentiment. The input to the system is tweets obtained from Twitter using keywords such as #coronavirusindonesia or #covid-19, with no more than 500 tweets. The output is a grouping of each tweet, after the preprocessing stage, into positive or negative sentiment. In testing on 75 tweet documents, the measurements were recall 32%, precision 80%, F-Measure 45%, and an average accuracy of 36%.
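As a quick consistency check on the figures in this abstract, the F-measure can be recomputed from the reported precision and recall. The sketch below assumes the standard balanced F-measure (the harmonic mean of precision and recall); the abstract does not say which variant the authors used, and the function name is illustrative.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision 80% and recall 32%, as reported in the abstract.
f1 = f_measure(0.80, 0.32)
print(f"{f1:.3f}")  # 0.457
```

The result of roughly 45.7% is in line with the reported F-Measure of 45%.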
17

Karajeh, Ola, Dirar Darweesh, Omar Darwish, Noor Abu-El-Rub, Belal Alsinglawi, and Nasser Alsaedi. "A Classifier to Detect Informational vs. Non-Informational Heart Attack Tweets." Future Internet 13, no. 1 (January 16, 2021): 19. http://dx.doi.org/10.3390/fi13010019.

Abstract:
Social media sites are considered one of the most important sources of data in many fields, such as health, education, and politics. While surveys provide explicit answers to specific questions, posts in social media have the same answers implicitly occurring in the text. This research aims to develop a method for extracting implicit answers from large tweet collections, and to demonstrate this method for an important concern: the problem of heart attacks. The approach is to collect tweets containing “heart attack” and then select from those the ones with useful information. Informational tweets are those which express real heart attack issues, e.g., “Yesterday morning, my grandfather had a heart attack while he was walking around the garden.” On the other hand, there are non-informational tweets such as “Dropped my iPhone for the first time and almost had a heart attack.” The starting point was to manually classify around 7000 tweets as either informational (11%) or non-informational (89%), thus yielding a labeled dataset to use in devising a machine learning classifier that can be applied to our large collection of over 20 million tweets. Tweets were cleaned and converted to a vector representation, suitable to be fed into different machine-learning algorithms: Deep neural networks, support vector machine (SVM), J48 decision tree and naïve Bayes. Our experimentation aimed to find the best algorithm to use to build a high-quality classifier. This involved splitting the labeled dataset, with 2/3 used to train the classifier and 1/3 used for evaluation besides cross-validation methods. The deep neural network (DNN) classifier obtained the highest accuracy (95.2%). In addition, it obtained the highest F1-scores with (73.6%) and (97.4%) for informational and non-informational classes, respectively.
18

Ramasamy, Lakshmana Kumar, Seifedine Kadry, Yunyoung Nam, and Maytham N. Meqdad. "Performance analysis of sentiments in Twitter dataset using SVM models." International Journal of Electrical and Computer Engineering (IJECE) 11, no. 3 (June 1, 2021): 2275. http://dx.doi.org/10.11591/ijece.v11i3.pp2275-2284.

Abstract:
Sentiment analysis is a current research topic pursued by many researchers using supervised and machine learning algorithms. The analysis can be applied to movie reviews, Twitter reviews, online product reviews, blogs, discussion forums, Myspace comments, and social networks. The Twitter dataset is analyzed using a support vector machine (SVM) classifier with various parameters. The content of each tweet is classified to determine whether it contains factual data or opinion data. Deeper analysis is required to find the opinion in the tweets posted by an individual. Sentiment is classified into positive, negative, and neutral. From this classification and analysis, important decisions can be made to improve productivity. The performance of SVM radial kernel, SVM linear grid, and SVM radial grid was compared, and SVM linear grid was found to perform better than the other SVM models.
19

Tyagi, Abhilasha, and Naresh Sharma. "Sentiment Analysis using Logistic Regression and Effective Word Score Heuristic." International Journal of Engineering & Technology 7, no. 2.24 (April 25, 2018): 20. http://dx.doi.org/10.14419/ijet.v7i2.24.11991.

Abstract:
Sentiment analysis is a method for judging someone's sentiment or feeling with respect to a specific thing. It is used to recognize and categorize the sentiments expressed in texts. Social networking sites such as Twitter attract a huge number of users who go online to share their views in the form of tweets or comments. The tweets can then be classified as positive, negative, or neutral. In the proposed work, logistic regression is used as the classifier and unigrams as the feature vector. Accuracy is assessed with k-fold cross-validation. Tweet subjectivity is used to choose precise training samples. An Effective Word Score heuristic is also introduced to find the polarity score of frequently used words. This additional heuristic can speed up sentiment classification with standard machine learning approaches.
20

Rahmawati, Siti, and Muhammad Habibi. "Public Sentiments Analysis about Indonesian Social Insurance Administration Organization on Twitter." IJID (International Journal on Informatics for Development) 9, no. 2 (December 31, 2020): 87–93. http://dx.doi.org/10.14421/ijid.2020.09205.

Abstract:
Insurance Administration Organization, which can be used by all people. However, this organization has received various criticisms from the public through social media, namely Twitter. This study aims to analyze public sentiment about the Indonesian Social Insurance Administration Organization on Twitter. The method used in this research is the Naive Bayes Classifier (NBC) method and uses the Support Vector Machine (SVM) method as a comparison. The amount of data used was 12,990 tweets with a data collection period from September 14, 2019 - February 18, 2020. The study compared the two classifier models built, namely the classifier model with two sentiment classes and four sentiment classes. The accuracy results show that the SVM method has a better accuracy value than the NBC method. SVM has an accuracy value of 63.60% and 82.77% for the two sentiment classes in the four sentiment classifier model. The tweet classification results show that the public's conversation about the Indonesian Social Insurance Administration Organization on Twitter has a negative polarity value tendency.
21

Gaye, Babacar, Dezheng Zhang, and Aziguli Wulamu. "A Tweet Sentiment Classification Approach Using a Hybrid Stacked Ensemble Technique." Information 12, no. 9 (September 14, 2021): 374. http://dx.doi.org/10.3390/info12090374.

Abstract:
With the extensive availability of social media platforms, Twitter has become a significant tool for the acquisition of peoples’ views, opinions, attitudes, and emotions towards certain entities. Within this frame of reference, sentiment analysis of tweets has become one of the most fascinating research areas in the field of natural language processing. A variety of techniques have been devised for sentiment analysis, but there is still room for improvement where the accuracy and efficacy of the system are concerned. This study proposes a novel approach that exploits the advantages of the lexical dictionary, machine learning, and deep learning classifiers. We classified the tweets based on the sentiments extracted by TextBlob using a stacked ensemble of three long short-term memory (LSTM) as base classifiers and logistic regression (LR) as a meta classifier. The proposed model proved to be effective and time-saving since it does not require feature extraction, as LSTM extracts features without any human intervention. We also compared our proposed approach with conventional machine learning models such as logistic regression, AdaBoost, and random forest. We also included state-of-the-art deep learning models in comparison with the proposed model. Experiments were conducted on the sentiment140 dataset and were evaluated in terms of accuracy, precision, recall, and F1 Score. Empirical results showed that our proposed approach manifested state-of-the-art results by achieving an accuracy score of 99%.
22

Neelakandan, S., and D. Paulraj. "A gradient boosted decision tree-based sentiment classification of twitter data." International Journal of Wavelets, Multiresolution and Information Processing 18, no. 04 (May 26, 2020): 2050027. http://dx.doi.org/10.1142/s0219691320500277.

Abstract:
People communicate their views, arguments and emotions about their everyday life on social media (SM) platforms (e.g. Twitter and Facebook). Twitter stands as an international micro-blogging service that features a brief message called tweets. Freestyle writing, incorrect grammar, typographical errors and abbreviations are some noises that occur in the text. Sentiment analysis (SA) centered on a tweet posted by the user, and also opinion mining (OM) of the customers review is another famous research topic. The texts are gathered from users’ tweets by means of OM and automatic-SA centered on ternary classifications, namely positive, neutral and negative. It is very challenging for the researchers to ascertain sentiments as a result of its limited size, misspells, unstructured nature, abbreviations and slangs for Twitter data. This paper, with the aid of the Gradient Boosted Decision Tree classifier (GBDT), proposes an efficient SA and Sentiment Classification (SC) of Twitter data. Initially, the twitter data undergoes pre-processing. Next, the pre-processed data is processed using HDFS MapReduce. Now, the features are extracted from the processed data, and then efficient features are selected using the Improved Elephant Herd Optimization (I-EHO) technique. Now, score values are calculated for each of those chosen features and given to the classifier. At last, the GBDT classifier classifies the data as negative, positive, or neutral. Experiential results are analyzed and contrasted with the other conventional techniques to show the highest performance of the proposed method.
23

Fajar Rodiyansyah, Sandi, and Edi Winarko. "Klasifikasi Posting Twitter Kemacetan Lalu Lintas Kota Bandung Menggunakan Naive Bayesian Classification." IJCCS (Indonesian Journal of Computing and Cybernetics Systems) 7, no. 1 (January 1, 2013): 13. http://dx.doi.org/10.22146/ijccs.3048.

Abstract:
Every day the Twitter server receives a very large volume of tweet data, which can be mined for specific purposes. One such purpose is visualizing traffic jams in a city. The naive Bayes classifier is an approach based on Bayes' theorem that combines prior knowledge with new knowledge, making it a simple classification algorithm that nevertheless achieves high accuracy. This research therefore tests the ability of the naive Bayes classifier to classify tweets containing information about traffic jams in Bandung. In testing, the application achieved its lowest accuracy, 78%, with a sample of 100 records and its highest accuracy, 91.60%, with a sample of 13,106 records. Testing with the Rapid Miner 5.1 software gave a lowest accuracy of 72% with a sample of 100 records and a highest accuracy of 93.58% with a sample of 13,106 records for naive Bayesian classification. For the support vector machine method, the lowest accuracy was 92% with a sample of 100 records and the highest accuracy was 99.11% with a sample of 13,106 records. Keywords: Twitter, tweet, classification, naive Bayesian classification, support vector machine
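As an illustration of the idea described in the abstract above (combining a class prior with word evidence via Bayes' theorem), the following is a minimal sketch of a multinomial naive Bayes text classifier in plain Python. It is not the authors' implementation: the whitespace tokenization, the add-one (Laplace) smoothing, the function names, and the toy training tweets are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Fit a multinomial naive Bayes model from (text, label) pairs."""
    class_counts = Counter()            # tweets per class, used for the prior
    word_counts = defaultdict(Counter)  # word frequencies per class, used for the likelihood
    vocabulary = set()
    for text, label in examples:
        tokens = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest posterior log-probability."""
    class_counts, word_counts, vocabulary = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for token in text.lower().split():
            if token in vocabulary:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, invented for illustration only.
training_data = [
    ("heavy traffic jam on the main road", "traffic"),
    ("road closed long traffic queue downtown", "traffic"),
    ("great concert tonight in the city", "other"),
    ("enjoying coffee with friends tonight", "other"),
]
model = train_naive_bayes(training_data)
print(predict(model, "traffic jam near the road"))
```

Working in log space keeps the product of many small probabilities from underflowing, which matters once the vocabulary grows beyond a toy example.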
24

Dharsono, Yulius Paulus. "Analisa Sentimen Dengan Korpus Sentiment140 Menggunakan Classifier Support Vector Machine RBF." CSRID (Computer Science Research and Its Development Journal) 12, no. 2 (March 3, 2021): 89. http://dx.doi.org/10.22303/csrid.12.2.2020.89-97.

Abstract:
Singapore once applied studies and strategies for suppressing the spread of the COVID-19 pandemic, during the epidemic of SARS-CoV, a novel coronavirus variant, by implementing social restriction policies. This became a trending hashtag topic on the social network Twitter. The large number of users and the speed of response to environmental situations and conditions make Twitter a potential big-data source of opinions in the form of subjective information that carries sentiment. In this context, how opinions can be transformed into structured knowledge that has value and can be applied in practice is an interesting research question. The research approach adopts Twitter sentiment labels as input for building a supervised machine learning model of current public opinion. The research focuses on sentiment analysis of the labeled Sentiment140 dataset, with test data consisting of tweets with the hashtag #socialdistancing, using an SVM RBF classifier. Testing the SVM RBF classifier model on 1116 test tweets gave predicted sentiment of 77.51% positive in test 1 and 63.97% positive in test 2. Of the two tests, the dominant metric was found in test 2, with a precision of 72.83%. In general, the best parameter for testing the model lies in the balance between precision and recall, namely the F-measure, at 70.57% in test 1 and 70.77% in test 2.
25

Mayasari, Rakhmah Wahyu, Kartika Fithriasari, Nur Iriawan, and Wiwiek Setya Winahju. "Surabaya Government Performance Evaluation Using Tweet Analysis." MATEMATIKA 36, no. 1 (March 31, 2020): 31–42. http://dx.doi.org/10.11113/matematika.v36.n1.1176.

Abstract:
The purpose of this research is to determine the various positive attributes appreciated by the public, and the negative things that need to be improved by the Surabaya government. The sentiment analysis methods, including the Naïve Bayes Classifier, Support Vector Machine, and Logistic Regression, are employed to classify the pros and cons of the Surabaya government. The comparison of the three methods demonstrated that SVM gives the best classification accuracy compared to others. Police performance is the highlighted word in the positive category, while traffic congestion is in the negative.
26

Deviyanti, Fransisca Julia Kusuma, Sri Suning Kusumawardani, and Paulus Insap Santosa. "Ontology-Based Social Media Talks Topic Classification (Twitter Case)." IJITEE (International Journal of Information Technology and Electrical Engineering) 3, no. 1 (September 13, 2019): 1. http://dx.doi.org/10.22146/ijitee.46534.

Abstract:
In the era of digital communication, the use of Twitter as a customer service has been widely encountered. Companies have started to develop strategies around effective use of Twitter, one of which was to identify problems that customers frequently complain about. Twitter, with its straightforward tweet characteristics, will certainly contain sentences with very specific and easily recognizable keywords. These characteristics can be used as a basis for classifying tweets into certain topics. With a help of ontology, classification with keywords can be done automatically. The purpose of this paper is to design an ontology used as a basis for classifying tweets into certain topics related to the 4G telecommunications network in Indonesia and to evaluate performance of proposed classifier model.
27

Tiwari, Sunita, Sushil Kumar, Vikas Jethwani, Deepak Kumar, and Vyoma Dadhich. "PNTRS." Journal of Cases on Information Technology 24, no. 3 (July 2022): 1–19. http://dx.doi.org/10.4018/jcit.20220701.oa9.

Abstract:
A news recommendation system not only must recommend the latest, trending and personalized news to the users but also give opportunity to know about the people’s opinion on trending news. Most of the existing news recommendation systems focus on recommending news articles based on user-specific tweets. In contrast to these recommendation systems, the proposed Personalized News and Tweet Recommendation System (PNTRS) recommends tweets based on the recommended article. It firstly generates news recommendation based on user’s interest and twitter profile using the Multinomial Naïve Bayes (MNB) classifier. Further, the system uses these recommended articles to recommend various trending tweets using fuzzy inference system. Additionally, feedback-based learning is applied to improve the efficiency of the proposed recommendation system. The user feedback rating is taken to evaluate the satisfaction level and it is 7.9 on the scale of 10.
28

Reddy, Cherlakola Abhinav, Sai Nitesh Gadiraju, and Dr Samala Nagaraj. "Detecting Fake News Tweets from Twitter." Journal of University of Shanghai for Science and Technology 23, no. 08 (August 16, 2021): 532–37. http://dx.doi.org/10.51201/jusst/21/08428.

Abstract:
Online media has progressively obtained integral to the route billions of individuals experience news and occasions, frequently bypassing writers, the conventional guardians of breaking news. Occasions, in reality, make a relating spike of posts (tweets) on Twitter. This projects a great deal of significance on the validity of data found via online media stages like Twitter. We have utilized different managed learning techniques like Naïve Bayes, Decision Trees, and Support Vector Machines on the information to separate tweets among genuine and counterfeit news. For our AI models, we have utilized tweet and client highlights as our indicators. We accomplished a precision of 88% utilizing the Random Forest classifier and 88% utilizing the Decision tree. Notwithstanding, we accept that breaking down client records would build the accuracy of our models.
29

Liu, Li, Dashi Luo, Ming Liu, Jun Zhong, Ye Wei, and Letian Sun. "A Self-Adaptive Hidden Markov Model for Emotion Classification in Chinese Microblogs." Mathematical Problems in Engineering 2015 (2015): 1–8. http://dx.doi.org/10.1155/2015/987189.

Abstract:
Microblogging is increasingly becoming one of the most popular online social media for people to express ideas and emotions. The amount of socially generated content from this medium is enormous. Text mining techniques have been intensively applied to discover the hidden knowledge and emotions from this huge dataset. In this paper, we propose a modified version of hidden Markov model (HMM) classifier, called self-adaptive HMM, whose parameters are optimized by Particle Swarm Optimization algorithms. Since manually labeling large-scale dataset is difficult, we also employ the entropy to decide whether a new unlabeled tweet shall be contained in the training dataset after being assigned an emotion using our HMM-based approach. In the experiment, we collected about 200,000 Chinese tweets from Sina Weibo. The results show that the F-score of our approach gets 76% on happiness and fear and 65% on anger, surprise, and sadness. In addition, the self-adaptive HMM classifier outperforms Naive Bayes and Support Vector Machine on recognition of happiness, anger, and sadness.
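The entropy-based selection mentioned in the abstract above can be pictured with a short sketch. The abstract does not state the exact criterion, so this example assumes that a newly labeled tweet is added to the training set only when the entropy of its predicted emotion distribution falls at or below a threshold. The function names, the threshold, and the sample probabilities are hypothetical.

```python
import math


def shannon_entropy(probabilities):
    """Entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def select_confident(predictions, max_entropy):
    """Keep (text, label) pairs whose predicted distribution has low entropy."""
    selected = []
    for text, distribution in predictions:
        if shannon_entropy(distribution.values()) <= max_entropy:
            # Assign the most probable emotion as the pseudo-label.
            label = max(distribution, key=distribution.get)
            selected.append((text, label))
    return selected


# Hypothetical classifier outputs, invented for illustration only.
predictions = [
    ("tweet A", {"happiness": 0.90, "anger": 0.05, "sadness": 0.05}),
    ("tweet B", {"happiness": 0.40, "anger": 0.35, "sadness": 0.25}),
]
print(select_confident(predictions, max_entropy=1.0))
```

A peaked distribution such as the one for "tweet A" has low entropy and passes the filter, while a nearly flat distribution such as the one for "tweet B" is left out of the training set.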
30

Sangavi, K. "An Efficient Subjective Sentiment Classification of Hate Speech Using Tri-Model Approach." Revista Gestão Inovação e Tecnologias 11, no. 2 (June 5, 2021): 406–17. http://dx.doi.org/10.47059/revistageintec.v11i2.1677.

Abstract:
Arrangement highlights were gotten from the substance of each tweet, including syntactic conditions between words to perceive "othering" phrases, actuation to react with adversarial activity, and cases of very much established or legitimized oppression social gatherings. The consequences of the classifier were ideal utilizing a blend of probabilistic, rule-based, and spatial-based classifiers with a casted a ballot group meta-classifier. We show how the consequences of the classifier can be powerfully used in a factual model used to figure the probably spread of digital scorn in an example of Twitter information. The applications to strategy and dynamic are examined. We propose a cooperative multi-space assessment arrangement way to deal with train supposition classifiers for numerous areas at the same time. In our methodology, the supposition data in various spaces is shared to prepare more precise and vigorous notion classifiers for every area when named information is scant. In particular, we decay the slant classifier of every space into two segments, a worldwide one and an area explicit one. The area explicit model can catch the particular feeling articulations in every space. Moreover, we extricate Tri_Model (Naive Bayes IBK, SVM) sentiment information from both marked and unlabelled examples in every area and use it to upgrade the learning of Tri_Model (Naive Bayes IBK, SVM) sentiment classifiers.
31

Helmstetter, Stefan, and Heiko Paulheim. "Collecting a Large Scale Dataset for Classifying Fake News Tweets Using Weak Supervision." Future Internet 13, no. 5 (April 29, 2021): 114. http://dx.doi.org/10.3390/fi13050114.

Abstract:
The problem of automatic detection of fake news in social media, e.g., on Twitter, has recently drawn some attention. Although, from a technical perspective, it can be regarded as a straight-forward, binary classification problem, the major challenge is the collection of large enough training corpora, since manual annotation of tweets as fake or non-fake news is an expensive and tedious endeavor, and recent approaches utilizing distributional semantics require large training corpora. In this paper, we introduce an alternative approach for creating a large-scale dataset for tweet classification with minimal user intervention. The approach relies on weak supervision and automatically collects a large-scale, but very noisy, training dataset comprising hundreds of thousands of tweets. As a weak supervision signal, we label tweets by their source, i.e., trustworthy or untrustworthy source, and train a classifier on this dataset. We then use that classifier for a different classification target, i.e., the classification of fake and non-fake tweets. Although the labels are not accurate according to the new classification target (not all tweets by an untrustworthy source need to be fake news, and vice versa), we show that despite this unclean, inaccurate dataset, the results are comparable to those achieved using a manually labeled set of tweets. Moreover, we show that the combination of the large-scale noisy dataset with a human labeled one yields more advantageous results than either of the two alone.
32

Seth, Pranav, Apoorv Sharma, and R. Vidhya. "Sentiment Analysis of Tweets Using Hadoop." International Journal of Engineering & Technology 7, no. 3.12 (July 20, 2018): 434. http://dx.doi.org/10.14419/ijet.v7i3.12.16123.

Abstract:
Blogging and networking platforms like Facebook, Reddit, Twitter and LinkedIn are social media channels where users can share their thoughts and opinions. Since online chatter is a vital and exhaustive source of information, these thoughts and opinions hold the key to the success of any endeavour. Tweets which are posted by millions all over the world can be used to analyse consumers’ opinions about individual products, services and campaigns. These tweets have proven to be a valuable source of information in the recent years, playing key roles in success of brands, businesses and politicians. We have tackled Sentiment Analysis with a lexicon-based approach for extracting positive, negative, and neutral tweets by using part-of-speech tagging from natural language processing. The approach manifests in the design of a software toolkit that facilitates the sentiment analysis. We collect dataset, i.e. the tweets are fetched from Twitter and text mining techniques like tokenization are executed to use it for building classifier that is able to predict sentiments for each tweet.
33

Shin, Han-Sub, Hyuk-Yoon Kwon, and Seung-Jin Ryu. "A New Text Classification Model Based on Contrastive Word Embedding for Detecting Cybersecurity Intelligence in Twitter." Electronics 9, no. 9 (September 18, 2020): 1527. http://dx.doi.org/10.3390/electronics9091527.

Abstract:
Detecting cybersecurity intelligence (CSI) on social media such as Twitter is crucial because it allows security experts to respond to cyber threats in advance. In this paper, we devise a new text classification model based on deep learning to classify CSI-positive and -negative tweets from a collection of tweets. For this, we propose a novel word embedding model, called contrastive word embedding, that maximizes the difference between base embedding models. First, we define CSI-positive and -negative corpora, which are used for constructing embedding models. Here, to supplement the imbalance of tweet data sets, we additionally employ the background knowledge for each tweet corpus: (1) CVE data set for CSI-positive corpus and (2) Wikitext data set for CSI-negative corpus. Second, we adopt the deep learning models such as CNN or LSTM to extract adequate feature vectors from the embedding models and integrate the feature vectors into one classifier. To validate the effectiveness of the proposed model, we compare our method with two baseline classification models: (1) a model based on a single embedding model constructed with CSI-positive corpus only and (2) another model with CSI-negative corpus only. As a result, we indicate that the proposed model shows high accuracy, i.e., 0.934 of F1-score and 0.935 of area under the curve (AUC), which improves the baseline models by 1.76∼6.74% of F1-score and by 1.64∼6.98% of AUC.
34

Banbhrani, Santosh Kumar, Bo Xu, Haifeng Liu, and Hongfei Lin. "SC-Political ResNet: Hashtag Recommendation from Tweets Using Hybrid Optimization-Based Deep Residual Network." Information 12, no. 10 (September 22, 2021): 389. http://dx.doi.org/10.3390/info12100389.

Abstract:
Hashtags are considered important in various real-world applications, including tweet mining, query expansion, and sentiment analysis. Hence, recommending hashtags from tagged tweets has been considered significant by the research community. However, while many hashtag recommendation methods have been developed, finding the features from dictionary and thematic words has not yet been effectively achieved. Therefore, we developed an effective method to perform hashtag recommendations, using the proposed Sine Cosine Political Optimization-based Deep Residual Network (SC-Political ResNet) classifier. The developed SCPO is designed by integrating the Sine Cosine Algorithm (SCA) with the Political Optimizer (PO) algorithm. Employing the parametric features from both, optimization can enable the acquisition of the global best solution, by training the weights of classifier. The hybrid features acquired from the keyword set can effectively find the information of words associated with dictionary, thematic, and more relevant keywords. Extensive experiments are conducted on the Apple Twitter Sentiment and Twitter datasets. Our empirical results demonstrate that the proposed model can significantly outperform state-of-the-art methods in hashtag recommendation tasks.
35

Arini, Arini, Luh Kesuma Wardhani, and Dimas Octaviano. "Perbandingan Seleksi Fitur Term Frequency & Tri-Gram Character Menggunakan Algoritma Naïve Bayes Classifier (Nbc) Pada Tweet Hashtag #2019gantipresiden." KILAT 9, no. 1 (April 25, 2020): 103–14. http://dx.doi.org/10.33322/kilat.v9i1.878.

Abstract:
Ahead of the 2019 election year, many mass campaigns were conducted through social media networks, including Twitter. One online campaign that is very popular among the public is the campaign with the hashtag #2019GantiPresiden. Sentiment analysis studies of the #2019GantiPresiden hashtag require a classifier and a robust feature selection that obtain high accuracy values. One such combination is the Naive Bayes classifier (NBC) with Character Tri-Gram and Term-Frequency feature selection, which previous research has found to give fairly high accuracy. The purpose of this study was to implement the Naive Bayes classifier (NBC) with each feature selection and to compare the accuracy obtained with the two. The authors use observation to collect data and run simulations. Using 1,000 tweets from the hashtag #2019GantiPresiden taken on 15 September 2018, the authors divide the data into 950 tweets as training data and 50 tweets as test data, with labeling done by a lexicon-based sentiment method. The study showed that the Naïve Bayes classifier (NBC) achieved an accuracy of 76% with Character Tri-Gram feature selection and 74% with Term-Frequency; the results show that Character Tri-Gram feature selection performs better than Term-Frequency.
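To make the two feature representations compared in the abstract above concrete, here is a minimal sketch of how term-frequency features and character tri-gram features might be extracted from a tweet. The whitespace tokenization, lowercasing, function names, and sample text are assumptions for illustration and are not taken from the paper.

```python
from collections import Counter


def term_frequency(text):
    """Count whitespace-separated word tokens."""
    return Counter(text.lower().split())


def char_trigrams(text):
    """Count overlapping three-character sequences, spaces included."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + 3] for i in range(len(normalized) - 2))


# Hypothetical sample text, invented for illustration only.
tweet = "Good service good price"
print(term_frequency(tweet))
print(char_trigrams(tweet))
```

Character tri-grams produce many more features per tweet than word counts and can still match misspelled or abbreviated words, which is one plausible reason they may help on informal text.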
36

Chayangkoon, Narongsak, and Anongnart Srivihok. "Text classification model for methamphetamine-related tweets in Southeast Asia using dual data preprocessing techniques." International Journal of Electrical and Computer Engineering (IJECE) 11, no. 4 (August 1, 2021): 3617. http://dx.doi.org/10.11591/ijece.v11i4.pp3617-3628.

Abstract:
Methamphetamine addiction is a prominent problem in Southeast Asia. Drug addicts often discuss illegal activities on popular social networking services. These individuals spread messages on social media as a means of both buying and selling drugs online. This paper proposes a model, the “text classification model of methamphetamine tweets in Southeast Asia” (TMTA), to identify whether a tweet from Southeast Asia is related to methamphetamine abuse. The research addresses the weakness of bag of words (BoW) by introducing BoW and Word2Vec feature selection (BWF) techniques. A domain-based feature selection method was performed using the BoW dataset and Word2Vec. The BWF dataset provided a smaller number of features than the BoW and TF–IDF dataset. We experimented with three candidate classifiers: Support vector machine (SVM), decision tree (J48) and naive bayes (NB). We found that the J48 classifier with the BWF dataset provided the best performance for the TMTA in terms of accuracy (0.815), F-measure (0.818), Kappa (0.528), Matthews correlation coefficient (0.529) and high area under the ROC Curve (0.763). Moreover, TMTA provided the lowest runtime (3.480 seconds) using the J48 with the BWF dataset.
37

Narayanan, K. S., and Dr S. Suganya. "Rocchio Nearest Centroid and Normalized Neural Network based Lead Generation in Social Media Marketing." Revista Gestão Inovação e Tecnologias 11, no. 4 (July 22, 2021): 2583–602. http://dx.doi.org/10.47059/revistageintec.v11i4.2302.

Abstract:
The employment of the internet and social media has transposed consumer behavior and the methods in which business organizations carry over their business. Social and digital marketing recommends noteworthy changes to business establishments via cost curtailment, enhanced brand understanding and surged sales. Despite enormous amount of potentialities, noteworthy disputes subsist from gloomy digital oral message in addition to trespassing and troublesome online presence of brand. Deep learning (DL) has fascinated escalated awareness owing to its notable processing power in tasks, to name a few being, speech, image, or text processing. Due to its aggressive evolution and extensive accessibility of digital social media (SM), examining these data utilizing conventional materials and methods is substantial or even complex. Also with the large growth in the volume of data, the multifariousness in data heterogeneity, are the most distinguished reasons, why and how the SM data mounted. In this paper we study the impact of tweets on distance learning to understand people’s opinions and to discover facts. However, adding redundant features minimizes the generalization capability of the model and may also minimize the overall accuracy of a classifier. We introduce Rocchio Nearest Centroid Laplacian Feature Selection model that combines Rocchio Nearest Centroid and Laplace function for selecting relevant features or tweets. Next an Arbitrary Normalized Attention-based Recurrent Neural Network Lead Generation algorithm is designed aggregating characterizations from preceding and succeeding tweets while generating lead via digital marketing tweet funnel. We validate and evaluate our method using data from distance learning dataset. 
Experiments and comparisons on distance learning data show that, compared to existing SMM methods, considering generalization capability and digital marketing tweet funnel results in improvements in processing time, lead generation accuracy and precision to a significant extent.
38

Omran, Nahla F., Sara F. Abd-el Ghany, Hager Saleh, and Ayman Nabil. "Breast Cancer Identification from Patients’ Tweet Streaming Using Machine Learning Solution on Spark." Complexity 2021 (January 27, 2021): 1–12. http://dx.doi.org/10.1155/2021/6653508.

Abstract:
Twitter integrates with streaming data technologies and machine learning to add new value to healthcare. This paper presented a real-time system to predict breast cancer based on streaming patient’s health data from Twitter. The proposed system consists of two major components: developing an offline building model and an online prediction pipeline. For the first component, we made a correlation between the features to determine the correlation between features and reduce the number of features from the Breast Cancer Wisconsin Diagnostic dataset. Two feature selection algorithms are recursive feature elimination and univariate feature selection algorithms which are applied to features after correlation to select the essential features. Four decision trees, logistic regression, support vector machine, and random forest classifier have been used on features after correlation and feature selection. Also, hyperparameter tuning and cross-validation have been applied with machine learning to optimize models and enhance accuracy. Apache Spark, Apache Kafka, and Twitter Streaming API are used to develop the second component. The best model with the highest accuracy obtained from the first component predicts breast cancer in real time from tweets’ streaming. The results showed that the best model is the random forest classifier which achieved the best accuracy.
39

Yang, Yuan-Chi, Mohammed Ali Al-Garadi, Whitney Bremer, Jane M. Zhu, David Grande, and Abeed Sarker. "Developing an Automatic System for Classifying Chatter About Health Services on Twitter: Case Study for Medicaid." Journal of Medical Internet Research 23, no. 5 (May 3, 2021): e26616. http://dx.doi.org/10.2196/26616.

Abstract:
Background The wide adoption of social media in daily life renders it a rich and effective resource for conducting near real-time assessments of consumers’ perceptions of health services. However, its use in these assessments can be challenging because of the vast amount of data and the diversity of content in social media chatter. Objective This study aims to develop and evaluate an automatic system involving natural language processing and machine learning to automatically characterize user-posted Twitter data about health services using Medicaid, the single largest source of health coverage in the United States, as an example. Methods We collected data from Twitter in two ways: via the public streaming application programming interface using Medicaid-related keywords (Corpus 1) and by using the website’s search option for tweets mentioning agency-specific handles (Corpus 2). We manually labeled a sample of tweets in 5 predetermined categories or other and artificially increased the number of training posts from specific low-frequency categories. Using the manually labeled data, we trained and evaluated several supervised learning algorithms, including support vector machine, random forest (RF), naïve Bayes, shallow neural network (NN), k-nearest neighbor, bidirectional long short-term memory, and bidirectional encoder representations from transformers (BERT). We then applied the best-performing classifier to the collected tweets for postclassification analyses to assess the utility of our methods. Results We manually annotated 11,379 tweets (Corpus 1: 9179; Corpus 2: 2200) and used 7930 (69.7%) for training, 1449 (12.7%) for validation, and 2000 (17.6%) for testing. 
A classifier based on BERT obtained the highest accuracies (81.7%, Corpus 1; 80.7%, Corpus 2) and F1 scores on consumer feedback (0.58, Corpus 1; 0.90, Corpus 2), outperforming the second best classifiers in terms of accuracy (74.6%, RF on Corpus 1; 69.4%, RF on Corpus 2) and F1 score on consumer feedback (0.44, NN on Corpus 1; 0.82, RF on Corpus 2). Postclassification analyses revealed differing intercorpora distributions of tweet categories, with political (400778/628411, 63.78%) and consumer feedback (15073/27337, 55.14%) tweets being the most frequent for Corpus 1 and Corpus 2, respectively. Conclusions The broad and variable content of Medicaid-related tweets necessitates automatic categorization to identify topic-relevant posts. Our proposed system presents a feasible solution for automatic categorization and can be deployed and generalized for health service programs other than Medicaid. Annotated data and methods are available for future studies.
40

ROJRATANAVIJIT, Jitrlada, Preecha VICHITTHAMAROS, and Sukanya PHONGSUPHAP. "Acquiring Sentiment from Twitter using Supervised Learning and Lexicon-based Techniques." Walailak Journal of Science and Technology (WJST) 15, no. 1 (December 2, 2016): 63–80. http://dx.doi.org/10.48048/wjst.2018.2731.

Abstract:
The emergence of Twitter in Thailand has given millions of users a platform to express and share their opinions about products and services, among other subjects, and so Twitter is considered to be a rich source of information for companies to understand their customers by extracting and analyzing sentiment from Tweets. This offers companies a fast and effective way to monitor public opinions on their brands, products, services, etc. However, sentiment analysis performed on Thai Tweets has challenges brought about by language-related issues, such as the difference in writing systems between Thai and English, short-length messages, slang words, and word usage variation. This research paper focuses on Tweet classification and on solving data sparsity issues. We propose a mixed method of supervised learning techniques and lexicon-based techniques to filter Thai opinions and to then classify them into positive, negative, or neutral sentiments. The proposed method includes a number of pre-processing steps before the text is fed to the classifier. Experimental results showed that the proposed method overcame previous limitations from other studies and was very effective in most cases. The average accuracy was 84.80 %, with 82.42 % precision, 83.88 % recall, and 82.97 % F-measure.
41

Cahyani, Alvie Delia, and Tati Mardiana. "SENTIMENT ANALYSIS OF DIGITAL WALLET SERVICE USERS USING NAÏVE BAYES CLASSIFIER AND PARTICLE SWARM OPTIMIZATION." Jurnal Riset Informatika 2, no. 4 (September 15, 2020): 241–50. http://dx.doi.org/10.34288/jri.v2i4.160.

Abstract:
Digital wallet services provide many conveniences and benefits to their users. However, not all digital wallet users have a positive opinion of the service. This study uses sentiment analysis to determine whether opinions given by users of the Dana and I.Saku digital wallet services are positive or negative, applying the Naïve Bayes Classifier and Particle Swarm Optimization (PSO). The Naïve Bayes Classifier is used because it is simple, fast, accurate, and performs well enough to classify data, but it has the disadvantage that its assumption that the variables are independent can reduce accuracy. Therefore, this research adds an attribute weighting method, Particle Swarm Optimization (PSO), to increase the classification accuracy of the Naïve Bayes Classifier. The study uses 490 tweets taken from Twitter. Test results using the confusion matrix and ROC curve show that accuracy increased from 60.00% to 91.67% for the Dana digital wallet and from 53.23% to 85.00% for the I.Saku digital wallet. T-test and ANOVA results show that the two classification methods differ significantly in accuracy.
42

Saputra, Ade chandra, and Agus sehatman saragih. "APLIKASI SENTIMENT MONITORING UNTUK TWITTER DENGAN ALGORITMA NAIVE-BAYES CLASSIFIER." Jurnal Teknologi Informasi: Jurnal Keilmuan dan Aplikasi Bidang Teknik Informatika 15, no. 1 (January 10, 2021): 82–91. http://dx.doi.org/10.47111/jti.v15i1.1902.

Abstract:
Every day, millions of opinions are spread across social networks. Various parties often use them to gauge public opinion and sentiment towards their products, brands, or public figures. Given the abundance of data and opinions, sentiment analysis cannot be done manually. In this research, the authors design and implement a sentiment monitoring application that tracks people's sentiment about a particular keyword, showing whether the response to that keyword is positive, negative, or neutral. Among the existing social networks, Twitter is chosen as the data source to be monitored. The classification algorithm is a Naive-Bayes Classifier with the Boolean Multinomial model, using unigram words as features. The training data comprise 400,000 items for each sentiment type, for a total of 1,200,000. During training and classification, the application applies stemming to extract the root words in each tweet. The stemming algorithm used is Confix Stripping. The application is developed using the staged delivery methodology and implemented in the PHP programming language. The result of this research is a sentiment monitoring application that can monitor public sentiment about a particular keyword in a particular time frame. Testing with k-fold cross-validation gave a sentiment classification accuracy of 85%.
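As a rough illustration of the Boolean Multinomial model with unigram features mentioned in this abstract, the minimal sketch below counts each word at most once per tweet and applies add-one (Laplace) smoothing. It is not the authors' implementation (theirs is in PHP and adds Confix Stripping stemming); the class name, the whitespace tokenizer, the smoothing choice, and the toy English training tweets are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercased whitespace unigrams; no stemming in this sketch.
    return text.lower().split()


class BooleanMultinomialNB:
    """Multinomial Naive Bayes where a word counts once per tweet."""

    def fit(self, texts, labels):
        self.class_docs = Counter(labels)       # tweets per class
        self.word_counts = defaultdict(Counter)  # per-class word counts
        self.vocab = set()
        for text, label in zip(texts, labels):
            unique = set(tokenize(text))  # boolean: drop repeated words
            self.word_counts[label].update(unique)
            self.vocab |= unique
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        unique = set(tokenize(text)) & self.vocab
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            total = sum(self.word_counts[label].values())
            score = math.log(n_docs / self.total_docs)  # log prior
            for word in unique:
                # Add-one smoothed log likelihood of the word in this class.
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration only.
texts = [
    "great phone love it",
    "love this great service",
    "terrible phone hate it",
    "hate this awful service",
    "phone arrived today",
    "service opens today",
]
labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

model = BooleanMultinomialNB().fit(texts, labels)
print(model.predict("love love love great"))
print(model.predict("awful hate"))
```

Because repeated words are collapsed to a single occurrence, a tweet that repeats "love" three times is scored the same as one that uses it once, which is the point of the Boolean variant.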
44

Putri, Alviana Dina, and Ajib Susanto. "Sistem Rekomendasi Pertemanan berdasarkan Hobi menggunakan Metode Multicriteria Decision Making." Jurnal Informatika dan Rekayasa Perangkat Lunak 2, no. 1 (March 3, 2020): 1. http://dx.doi.org/10.36499/jinrpl.v2i1.2787.

Abstract:
The word "hobby" entered the vocabulary in 1816, at first only among a number of English speakers. In that century the term referred to leisure time. Today, pursuing a hobby can serve to satisfy personal interest, to gain knowledge, or to develop a business. Twitter is one of the media many people use to present their hobbies, which can be inferred from whom a person follows and from that person's tweets. From these, a person can be classified as sharing a hobby with others. Hybrid filtering recommendation is the supporting method used to obtain a person's category class, because most existing applications capture hobbies only through form input. The naive Bayes classifier algorithm offers the authors a new solution for classifying a user's tweets into hobby categories. Multi-criteria decision making is added as the final stage of further data processing to produce results, which are then recommended to other users in the same hobby category. Users thus receive friend recommendations that match their own hobbies.
45

Fitriyyah, Sitti Nurul Jannah, Novi Safriadi, and Enda Esyudha Pratama. "Analisis Sentimen Calon Presiden Indonesia 2019 dari Media Sosial Twitter Menggunakan Metode Naive Bayes." Jurnal Edukasi dan Penelitian Informatika (JEPIN) 5, no. 3 (December 22, 2019): 279. http://dx.doi.org/10.26418/jp.v5i3.34368.

Abstract:
In 2019 Indonesia will hold its democratic festival, the election of the head of state. Every political figure nominated as head of state will consider their popularity based on public opinion. Since the General Elections Commission (Komisi Pemilihan Umum, KPU) announced the names of the 2019 Indonesian presidential candidates, those names have been widely discussed, especially on social media, including Twitter. Twitter users express a variety of opinions with negative, positive, and neutral sentiment. However, determining the sentiment of Twitter users takes considerable effort and time because of the large number of tweets involved. Machine learning is needed to quickly classify these tweets into negative, positive, and neutral classes. The Naive Bayes Classifier is a text classification method with fairly high processing speed and accuracy when applied to large, numerous, and varied data. Before the tweet data are classified, they must go through several steps, such as preprocessing, term weighting, and data splitting. The aim of this study is to find out how the Naive Bayes method performs on Twitter user sentiment with 2 classes (negative, positive) and 3 classes (negative, positive, neutral). Three-class and two-class tests were carried out for each candidate pair (pasangan calon, paslon). In the 3-class test, paslon 01 and paslon 02 obtained accuracies of 64.6% and 58%, respectively. In the 2-class test, paslon 01 and paslon 02 obtained accuracies of 77.7% and 88%, respectively. The highest performance was obtained for presidential candidate number two, with an f-measure of 0.88.
46

Kurniawan, Imam, and Ajib Susanto. "Implementasi Metode K-Means dan Naïve Bayes Classifier untuk Analisis Sentimen Pemilihan Presiden (Pilpres) 2019." Eksplora Informatika 9, no. 1 (September 30, 2019): 1–10. http://dx.doi.org/10.30864/eksplora.v9i1.237.

Abstract:
The presidential election, held once every five years, is an important moment for realizing democracy in the Unitary State of the Republic of Indonesia. Campaign teams, "buser", and supporters all voice support to build a positive image of their respective candidates. Various media are used, one of which is Twitter, where people post positive and negative comments about the election, some tending towards "black campaigning" and hoaxes, both before and during the election. Whether comments on Twitter lean positive or negative cannot currently be determined, so sentiment analysis is needed to find out the tendency of public opinion about the election. The aim of this research is to analyse text documents to obtain positive or negative sentiment. The methods used are K-Means to cluster the training data and the Naive Bayes classifier to classify the test data. The result of this weighting is a positive or negative sentiment. The data consist of 500 tweets about the 2019 presidential election taken from Twitter. Tests on 100 and 150 test items yielded an average accuracy of 93.35% and an error rate of 6.66%.
47

Imam, Niddal H., and Vassilios G. Vassilakis. "A Survey of Attacks Against Twitter Spam Detectors in an Adversarial Environment." Robotics 8, no. 3 (July 4, 2019): 50. http://dx.doi.org/10.3390/robotics8030050.

Abstract:
Online Social Networks (OSNs), such as Facebook and Twitter, have become a very important part of many people’s daily lives. Unfortunately, the high popularity of these platforms makes them very attractive to spammers. Machine learning (ML) techniques have been widely used as a tool to address many cybersecurity application problems (such as spam and malware detection). However, most of the proposed approaches do not consider the presence of adversaries that target the defense mechanism itself. Adversaries can launch sophisticated attacks to undermine deployed spam detectors either during training or the prediction (test) phase. Not considering these adversarial activities at the design stage makes OSNs’ spam detectors vulnerable to a range of adversarial attacks. Thus, this paper surveys the attacks against Twitter spam detectors in an adversarial environment, and a general taxonomy of potential adversarial attacks is presented using common frameworks from the literature. Examples of adversarial activities on Twitter that were discovered after observing Arabic trending hashtags are discussed in detail. A new type of spam tweet (adversarial spam tweet), which can be used to undermine a deployed classifier, is examined. In addition, possible countermeasures that could increase the robustness of Twitter spam detectors to such attacks are investigated.
48

Lamsal, Rabindra, and T. V. Vijay Kumar. "Classifying Emergency Tweets for Disaster Response." International Journal of Disaster Response and Emergency Management 3, no. 1 (January 2020): 14–29. http://dx.doi.org/10.4018/ijdrem.2020010102.

Abstract:
During disaster events such as floods, landslides, earthquakes, tsunamis, fire hazards, etc., social media platforms provide easy and timely access to information regarding the ongoing crisis events and thereby become an essential vehicle of information sharing. During such events, great amounts of such socially generated data become available, which can be accessed and processed to extract situational awareness insights. These insights, in turn, can be used to enhance the effectiveness and efficiency of disaster response in order to minimize the loss of lives and damage to property. People actively use social platforms like Facebook and Twitter to post information related to crisis events. Further, these platforms provide people with the location and safety status of their family and friends during such events. Twitter, the microblogging platform, witnesses thousands of informally written tweets during crisis events, and since it provides high-level APIs to access its near real-time feed, it has become the primary source of data for researchers. It is generally observed that there is an exponential burst in the number of tweets during an ongoing crisis event. This sudden burst makes the task of monitoring, identifying, and processing each tweet virtually impossible for a human. However, such voluminous data can be processed using various machine learning and natural language processing techniques in coordination with a certain level of human intervention. This paper is focused on designing a semi-automated artificial intelligence-based classifier, which can classify the plethora of disaster-related tweets into various categories such as community needs, loss of lives, and damage.
49

Fauzi, M. Ali, and Anny Yuniarti. "Ensemble Method for Indonesian Twitter Hate Speech Detection." Indonesian Journal of Electrical Engineering and Computer Science 11, no. 1 (July 1, 2018): 294. http://dx.doi.org/10.11591/ijeecs.v11.i1.pp294-299.

Abstract:
Due to the massive increase in user-generated web content, in particular on social media networks where anyone can post a statement freely without any limitations, the amount of hateful activity is also increasing. Social media and microblogging web services, such as Twitter, allow user tweets to be read and analyzed in near real time. Twitter is a logical source of data for hate speech analysis since Twitter users are more likely to express their emotions about an event by posting tweets. This analysis can help identify hate speech early so that it can be prevented from spreading widely. Manually classifying hateful content on Twitter is costly and not scalable. Therefore, automatic hate speech detection needs to be developed for tweets in the Indonesian language. In this study, we used an ensemble method for hate speech detection in Indonesian. We employed five stand-alone classification algorithms (Naïve Bayes, K-Nearest Neighbours, Maximum Entropy, Random Forest, and Support Vector Machines) and two ensemble methods, hard voting and soft voting, on a Twitter hate speech dataset. The experimental results showed that using an ensemble method can improve classification performance. The best result was achieved with soft voting, with an F1 measure of 79.8% on the unbalanced dataset and 84.7% on the balanced dataset. Although the improvement is not truly remarkable, using an ensemble method can reduce the risk of choosing a poor classifier for detecting whether new tweets are hate speech.
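The difference between the two ensemble rules named in this abstract can be sketched in a few lines. This is a generic illustration, not the authors' code: the function names, the label strings, and the three sets of class probabilities are hypothetical, chosen so that the two rules disagree.

```python
from collections import Counter


def hard_vote(prob_dicts):
    """Majority vote over each classifier's most probable label."""
    votes = [max(probs, key=probs.get) for probs in prob_dicts]
    return Counter(votes).most_common(1)[0][0]


def soft_vote(prob_dicts):
    """Pick the label with the highest probability averaged over classifiers."""
    labels = prob_dicts[0].keys()
    averaged = {
        label: sum(probs[label] for probs in prob_dicts) / len(prob_dicts)
        for label in labels
    }
    return max(averaged, key=averaged.get)


# Hypothetical class probabilities from three classifiers for one tweet.
outputs = [
    {"hate": 0.45, "not_hate": 0.55},
    {"hate": 0.40, "not_hate": 0.60},
    {"hate": 0.95, "not_hate": 0.05},
]

print(hard_vote(outputs))  # two of three classifiers prefer "not_hate"
print(soft_vote(outputs))  # averaged probability of "hate" is 0.60
```

Hard voting ignores how confident each classifier is, while soft voting lets one very confident classifier outweigh two marginal ones, which is one way the two rules can produce different F1 scores on the same data.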
50

Samuel, Sajay Thomas, and Booma Poolan Marikannan. "Sentiment Analysis on Movie Reviews Using Twitter." Journal of Computational and Theoretical Nanoscience 17, no. 7 (July 1, 2020): 2869–75. http://dx.doi.org/10.1166/jctn.2020.9326.

Abstract:
Machine learning can help people perform complex tasks and solve problems, as it learns patterns from historical data and makes predictions based on them. This research addresses movie reviews on social media, specifically Twitter: it gathers tweets about movies and displays a rating based on the sentiment of the tweets. Twitter is an online social media website where people from all walks of life communicate by tweeting short updates without exceeding the character limit, which is 240 characters. Twitter is continuously growing as a business and has become one of the biggest platforms for communication and instant messaging. Due to the large number of users, voluminous amounts of data are available that can be analysed for more in-depth information and insights, including the sentiment of tweets. Today, many applications use sentiment analysis in various fields, for example to gain insights about a particular brand or product. Sentiment analysis using traditional methods can be time-consuming and very complex. The aim of this research is to investigate the domain of sentiment analysis and incorporate a machine learning algorithm to create a system that can obtain and display the ratings of a particular movie. The machine learning algorithms used are the Naïve Bayes Classifier and SVM. The algorithm with better accuracy will be chosen for the implementation phase.