
Journal articles on the topic 'Twitter stream analysis'



Consult the top 50 journal articles for your research on the topic 'Twitter stream analysis.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

D'Andrea, Eleonora, Pietro Ducange, Beatrice Lazzerini, and Francesco Marcelloni. "Real-Time Detection of Traffic From Twitter Stream Analysis." IEEE Transactions on Intelligent Transportation Systems 16, no. 4 (2015): 2269–83. http://dx.doi.org/10.1109/tits.2015.2404431.

2

Srivastava, Ritesh, and M. P. S. Bhatia. "Real-Time Unspecified Major Sub-Events Detection in the Twitter Data Stream That Cause the Change in the Sentiment Score of the Targeted Event." International Journal of Information Technology and Web Engineering 12, no. 4 (2017): 1–21. http://dx.doi.org/10.4018/ijitwe.2017100101.

Abstract:
Twitter behaves as a social sensor of the world. The tweets provided by the Twitter Firehose exhibit the defining properties of big data (i.e. volume, variety, and velocity). With millions of users on Twitter, its virtual communities now replicate real-world communities; consequently, real-world events are very often discussed on Twitter. This work performs real-time analysis of tweets related to a targeted event (e.g. an election) to identify potential sub-events that occurred in the real world, were discussed on Twitter, and caused significant changes in the aggregated sentiment score of the targeted event over time. Such analysis can enrich the real-time decision-making ability of the event bearer. The proposed approach follows a three-step process: (1) real-time sentiment analysis of tweets; (2) application of Bayesian change point detection to determine the sentiment change points; (3) detection of the major sub-events that influenced the sentiment of the targeted event. The work is evaluated on Twitter data from the Delhi election of 2015.
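Step (2) of the pipeline above, locating the point where the aggregated sentiment score shifts, can be illustrated with a minimal sketch. This is a simple least-squares stand-in for the Bayesian change point detection the paper actually uses, and the per-interval sentiment scores below are invented for illustration.

```python
def find_change_point(scores):
    """Return the index at which a sentiment-score series shifts most,
    chosen by minimising the two-segment sum of squared errors (a simple
    least-squares stand-in for Bayesian change point detection)."""
    best_idx, best_cost = None, float("inf")
    for i in range(1, len(scores)):
        left, right = scores[:i], scores[i:]
        mu_l = sum(left) / len(left)
        mu_r = sum(right) / len(right)
        cost = (sum((x - mu_l) ** 2 for x in left)
                + sum((x - mu_r) ** 2 for x in right))
        if cost < best_cost:
            best_idx, best_cost = i, cost
    return best_idx

# Invented per-interval sentiment scores for a targeted event:
# sentiment collapses after a damaging sub-event.
scores = [0.6, 0.7, 0.65, 0.6, 0.7, -0.4, -0.5, -0.45, -0.6, -0.5]
print(find_change_point(scores))  # → 5
```

A production system would scan for multiple change points with a proper Bayesian posterior over segmentations, but the single-split version shows the core idea: the change point is wherever two constant-mean segments best explain the series.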
3

Anbu Durai, Srinaath. "Resale HDB Price Prediction Considering Covid-19 through Sentiment Analysis." European Conference on Social Media 10, no. 1 (2023): 276–85. http://dx.doi.org/10.34190/ecsm.10.1.1020.

Abstract:
Twitter sentiment has been used as a predictor of price values and trends in both the stock market and the housing market. The pioneering works in this stream of research drew upon behavioural economics to show that sentiment and emotions impact economic decisions. The latest works in this stream focus on the algorithm used rather than the data used. A literature review of this stream through the lens of the data used shows a paucity of work that considers the impact of sentiment caused by an external factor on either the stock or the housing market, despite an abundance of work in behavioural economics showing that sentiment and emotions caused by an external factor do impact economic decisions. To address this gap, this research studies the impact of Twitter sentiment pertaining to the Covid-19 pandemic on resale Housing Development Board (HDB) apartment prices in Singapore. SNSCRAPE is used to collect tweets pertaining to Covid-19; the lexicon-based tool VADER is used for sentiment analysis; Granger causality is used to examine the relationship between Covid-19 cases and the sentiment score; and neural networks serve as prediction models. Twitter sentiment pertaining to Covid-19 as a predictor of HDB prices in Singapore is studied in comparison with the traditional predictors of housing prices, i.e., structural and neighbourhood characteristics. The results indicate that using Twitter sentiment pertaining to Covid-19 leads to better prediction than using only the traditional predictors, and that it performs better as a predictor than two of the traditional predictors. Hence, Twitter sentiment pertaining to an external factor should be considered as important as traditional predictors. In a micro sense, this paper demonstrates the use of sentiment analysis of Twitter data in urban economics. In a macro sense, it demonstrates the extent to which social media can capture the behavioural economic cues of a population.
4

Smetanin, Sergey. "RuSentiTweet: a sentiment analysis dataset of general domain tweets in Russian." PeerJ Computer Science 8 (July 19, 2022): e1039. http://dx.doi.org/10.7717/peerj-cs.1039.

Abstract:
The Russian language is still not as well-resourced as English, especially in the field of sentiment analysis of Twitter content. Though several sentiment analysis datasets of Russian-language tweets exist, they are all either automatically annotated or manually annotated by a single annotator; thus, there is either no inter-annotator agreement, or the annotation is focused on a specific domain. In this article, we present RuSentiTweet, a new sentiment analysis dataset of general domain tweets in Russian. RuSentiTweet is currently the largest in its class for Russian, with 13,392 tweets manually annotated with moderate inter-rater agreement into five classes: Positive, Neutral, Negative, Speech Act, and Skip. As a source of data, we used Twitter Stream Grab, a historical collection of tweets obtained from the general Twitter API stream, which provides a 1% sample of the public tweets. Additionally, we released a RuBERT-based sentiment classification model that achieved F1 = 0.6594 on the test subset.
5

Rasul, Hakar Mohammed, and Alaa Khalil Jumaa. "Real-Time Twitter Data Analysis: A Survey." UHD Journal of Science and Technology 6, no. 2 (2022): 147–55. http://dx.doi.org/10.21928/uhdjst.v6n2y2022.pp147-155.

Abstract:
Internet users are used to a steady stream of facts in the contemporary world. Numerous social media platforms, including Twitter, Facebook, and Quora, are plagued with spam accounts, posing a significant problem. These accounts are created to trick unwary real users into clicking on dangerous links or to continue publishing repetitious messages using automated software. This may significantly affect the user experiences on these websites. Effective methods for detecting certain types of spam have been intensively researched and developed. Effectively resolving this issue might be aided by doing sentiment analysis on these postings. Hence, this research provides a background study on Twitter data analysis, and surveys existing papers on Twitter sentiment analysis and fake account detection and classification. The investigation is restricted to the identification of social bots on the Twitter social media network. It examines the methodologies, classifiers, and detection accuracies of the several detection strategies now in use.
6

Sarawagi, Ankit, Rajeev Pandey, Raju Barskar, and S. P. "A Real Time Stream Data Processing and Analysis Model and Catchments over Twitter Stream Data." International Journal of Computer Applications 179, no. 1 (2017): 22–33. http://dx.doi.org/10.5120/ijca2017915663.

7

Mamo, Nicholas, Joel Azzopardi, and Colin Layfield. "An Automatic Participant Detection Framework for Event Tracking on Twitter." Algorithms 14, no. 3 (2021): 92. http://dx.doi.org/10.3390/a14030092.

Abstract:
Topic Detection and Tracking (TDT) on Twitter emulates human identifying developments in events from a stream of tweets, but while event participants are important for humans to understand what happens during events, machines have no knowledge of them. Our evaluation on football matches and basketball games shows that identifying event participants from tweets is a difficult problem exacerbated by Twitter’s noise and bias. As a result, traditional Named Entity Recognition (NER) approaches struggle to identify participants from the pre-event Twitter stream. To overcome these challenges, we describe Automatic Participant Detection (APD) to detect an event’s participants before the event starts and improve the machine understanding of events. We propose a six-step framework to identify participants and present our implementation, which combines information from Twitter’s pre-event stream and Wikipedia. In spite of the difficulties associated with Twitter and NER in the challenging context of events, our approach manages to restrict noise and consistently detects the majority of the participants. By empowering machines with some of the knowledge that humans have about events, APD lays the foundation not just for improved TDT systems, but also for a future where machines can model and mine events for themselves.
8

Rahmat, Al Fauzi, and M. Rafi. "Social Media Network Analysis on Twitter Users Network to the Pension Plan Policy." Communicare : Journal of Communication Studies 8, no. 1 (2022): 62. http://dx.doi.org/10.37535/101009120225.

Abstract:
This article scrutinizes the network of Twitter users disseminating tweets on the Old-Age Guarantee policy in Indonesia. A qualitative method with a social media network analysis approach was used. Data sources were obtained from Twitter through #JHT_JokowiHarusTurun, #jaminanharitua, and #JHT. NVivo 12 Plus software was used to analyze the qualitative data from Twitter, including the dissemination rate of tweets, a geographical map of the tweet stream, Twitter users' network patterns, sentiment proportions, and word frequencies. Our results indicate networks of Twitter users with backgrounds as politicians, political parties, governments, online news media, actors, and cultural practitioners participating in disseminating tweets. This network generated significant distribution patterns and a moderately negative overall sentiment, with word clouds echoing both protest against and support for this movement. Overall, our research contributes to a better understanding of how collective protest movements promoted on social media can impact public opinion and policy, and of how unpredictable their evolution is.
9

Domade, Ashwini S. "Twitter Sentiment Analysis Using Machine Learning." International Journal of Scientific Research in Engineering and Management 09, no. 02 (2025): 1–9. https://doi.org/10.55041/ijsrem41623.

Abstract:
With the development and expansion of web technology, a vast amount of data is generated and made available to internet users, and the internet has evolved into a platform for online learning, idea exchange, and opinion sharing. Social networking sites like Facebook, Google, and Twitter are quickly becoming more popular because they enable people to share and express their opinions on various topics, engage in discussions with various communities, or post messages globally. A lot of work has been done in the field of sentiment analysis of Twitter data, which is useful for analyzing the information in tweets, where opinions are highly unstructured, heterogeneous, and either positive, negative, or neutral. This paper presents a survey and comparative analysis of current techniques for opinion mining, such as machine learning and lexicon-based approaches, along with evaluation metrics for various machine learning algorithms such as naive Bayes, maximum entropy, and support vector machines. We also offer a data stream study on Twitter and discuss the basic difficulties and applications of sentiment analysis on Twitter keywords. Key Words: Twitter, Sentiment analysis (SA), Opinion mining, Machine learning, Naive Bayes (NB), Maximum Entropy, Support Vector Machine (SVM).
10

Kim, Erin Hea-Jin, Yoo Kyung Jeong, Yuyoung Kim, Keun Young Kang, and Min Song. "Topic-based content and sentiment analysis of Ebola virus on Twitter and in the news." Journal of Information Science 42, no. 6 (2016): 763–81. http://dx.doi.org/10.1177/0165551515608733.

Abstract:
The present study investigates topic coverage and sentiment dynamics of two different media sources, Twitter and news publications, on the hot health issue of Ebola. We conduct content and sentiment analysis by: (1) applying vocabulary control to collected datasets; (2) employing the n-gram LDA topic modeling technique; (3) adopting entity extraction and entity network; and (4) introducing the concept of topic-based sentiment scores. With the query term ‘Ebola’ or ‘Ebola virus’, we collected 16,189 news articles from 1006 different publications and 7,106,297 tweets with the Twitter stream API. The experiments indicate that topic coverage of Twitter is narrower and more blurry than that of the news media. In terms of sentiment dynamics, the life span and variance of sentiment on Twitter is shorter and smaller than in the news. In addition, we observe that news articles focus more on event-related entities such as person, organization and location, whereas Twitter covers more time-oriented entities. Based on the results, we report on the characteristics of Twitter and news media as two distinct news outlets in terms of content coverage and sentiment dynamics.
11

Punia, Sanjeev Kumar, Manoj Kumar, Thompson Stephan, Ganesh Gopal Deverajan, and Rizwan Patan. "Performance Analysis of Machine Learning Algorithms for Big Data Classification." International Journal of E-Health and Medical Communications 12, no. 4 (2021): 60–75. http://dx.doi.org/10.4018/ijehmc.20210701.oa4.

Abstract:
In broad terms, three categories of machine learning classification algorithms are used to discover correlations, hidden patterns, and other useful information from the large data sets known as big data. Today, Twitter, Facebook, Instagram, and many other social media networks are used to collect unstructured data. The conversion of unstructured data into structured data or meaningful information is a very tedious task, and different machine learning classification algorithms are used to perform it. In this paper, the authors first collect unstructured research data from a frequently used social media network (i.e., Twitter) by using the Twitter application program interface (API) stream. Secondly, they apply different machine learning classification algorithms (supervised, unsupervised, and reinforcement) such as decision trees (DT), neural networks (NN), support vector machines (SVM), naive Bayes (NB), linear regression (LR), and k-nearest neighbor (K-NN) to the collected research data set. The paper concludes with a comparison of the different machine learning classification algorithms.
12

Weng, Jianshu, and Bu-Sung Lee. "Event Detection in Twitter." Proceedings of the International AAAI Conference on Web and Social Media 5, no. 1 (2021): 401–8. http://dx.doi.org/10.1609/icwsm.v5i1.14102.

Abstract:
Twitter, as a form of social media, has been fast emerging in recent years, and users are using it to report real-life events. This paper focuses on detecting those events by analyzing the text stream in Twitter. Although event detection has long been a research topic, the characteristics of Twitter make it a non-trivial task: tweets reporting such events are usually overwhelmed by a high flood of meaningless "babbles", and the event detection algorithm needs to be scalable given the sheer volume of tweets. This paper tackles these challenges with EDCoW (Event Detection with Clustering of Wavelet-based Signals). EDCoW builds signals for individual words by applying wavelet analysis to the frequency-based raw signals of the words. It then filters away trivial words by looking at their corresponding signal autocorrelations. The remaining words are then clustered to form events with a modularity-based graph partitioning technique. Experimental results show that EDCoW is promising.
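The filtering step described above, discarding trivial words by their signal autocorrelations, can be sketched in miniature. Note that EDCoW computes autocorrelation on wavelet-transformed signals; this illustration applies it directly to hypothetical raw per-interval word frequencies, which is enough to show why bursty event words score higher than steady background "babble".

```python
def autocorrelation(signal, lag=1):
    """Lag-k autocorrelation of a word's per-interval frequency signal:
    values near 1 indicate a sustained burst; values near 0 or below
    indicate uncorrelated background chatter."""
    n = len(signal)
    mu = sum(signal) / n
    var = sum((x - mu) ** 2 for x in signal)
    if var == 0:
        return 0.0  # a perfectly flat signal carries no burst information
    return sum((signal[t] - mu) * (signal[t + lag] - mu)
               for t in range(n - lag)) / var

# Hypothetical frequencies over 8 time intervals:
# an event word bursts, while a babble word fluctuates around its mean.
event_word = [1, 1, 2, 9, 11, 10, 2, 1]
babble     = [5, 4, 6, 5, 5, 4, 6, 5]
print(autocorrelation(event_word) > autocorrelation(babble))  # → True
```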
13

Karuna, G., Pavuluri Anvesh, Chiranji Sharath Singh, Kommula Ruthvik Reddy, Praveen Kumar Shah, and S. Siva Shankar. "Feasible Sentiment Analysis of Real Time Twitter Data." E3S Web of Conferences 430 (2023): 01045. http://dx.doi.org/10.1051/e3sconf/202343001045.

Abstract:
Sentiment analysis plays a significant role in understanding public opinion, trends, and sentiments expressed on social media platforms. In this paper, we focus on performing sentiment analysis on real-time Twitter data to gain insights into the sentiments related to specific topics or events. We collect a stream of tweets based on predefined keywords or hashtags, and the collected tweets undergo pre-processing steps to clean and standardize the text for sentiment analysis. We employ machine learning to classify the sentiments expressed in tweets, utilizing sentiment lexicons and training data as references. Real-time sentiment analysis is performed as new tweets are collected, enabling continuous monitoring and analysis of public sentiment. The results are presented through informative visualizations such as sentiment distribution charts and sentiment trends over time. Additionally, we focus on topic-specific analysis by filtering tweets based on relevant keywords or hashtags, providing deeper insights into sentiments related to specific subjects. The work faces challenges such as noisy and informal text, ambiguity in sentiment expression, and handling large volumes of real-time data. By addressing these challenges, we aim to develop an effective sentiment analysis system that provides valuable insights into public sentiment and supports decision-making processes in various domains.
14

M., Manideep, and Malireddy Venkata. "Sentiment Analysis of Data Mining Techniques for Social Networks." JournalNX - A Multidisciplinary Peer Reviewed Journal 4, no. 8 (2018): 27–31. https://doi.org/10.5281/zenodo.1472709.

Abstract:
Sentiment analysis is defined as the task of finding and analyzing the opinions of authors about specific entities or topics. Social networks have attracted astounding attention in the last decade, and accessing social network sites such as Twitter, Facebook, LinkedIn, and Google+ through the web and Web 2.0 technologies has become more affordable. Data mining provides a wide range of methods for discovering useful knowledge from huge datasets, such as patterns, trends, and rules; its techniques are used for information retrieval, statistical modelling, and machine learning, and they employ data pre-processing, data analysis, and data interpretation in the course of analysis. This survey discusses the various data mining methods used to mine different aspects of social networks over the decades, ranging from early techniques to up-to-date models. Twitter is a micro-blogging service built to discover what is happening at any moment in time, anywhere in the world. Twitter messages are short, generated constantly, and well suited to knowledge discovery using data stream mining. https://journalnx.com/journal-article/20150775
15

Lanagan, James, and Alan Smeaton. "Using Twitter to Detect and Tag Important Events in Sports Media." Proceedings of the International AAAI Conference on Web and Social Media 5, no. 1 (2021): 542–45. http://dx.doi.org/10.1609/icwsm.v5i1.14170.

Abstract:
In this paper we examine the effectiveness of using a filtered stream of tweets from Twitter to automatically identify events of interest within the video of live sports transmissions. We show that using just the volume of tweets generated at any moment of a game actually provides a very accurate means of event detection, as well as an automatic method for tagging events with representative words from the tweet stream. We compare this method with an alternative approach that uses complex audio-visual content analysis of the video, showing that it provides near-equivalent accuracy for major event detection at a fraction of the computational cost. Using community tweets and discussion also provides a sense of what the audience themselves found to be the talking points of a video.
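The volume-based detection the abstract describes can be sketched as a simple threshold rule: flag any time bin whose tweet count exceeds the recent mean by a few standard deviations. The window size, threshold, and per-minute counts below are illustrative assumptions, not parameters from the paper.

```python
import statistics

def detect_volume_spikes(counts, window=5, k=2.0):
    """Flag time bins whose tweet volume exceeds the mean of the
    preceding `window` bins by more than `k` standard deviations."""
    spikes = []
    for t in range(window, len(counts)):
        recent = counts[t - window:t]
        mu = statistics.mean(recent)
        sigma = statistics.pstdev(recent) or 1.0  # guard flat windows
        if counts[t] > mu + k * sigma:
            spikes.append(t)
    return spikes

# Hypothetical tweets-per-minute during a match: a goal around minute 8
counts = [12, 14, 13, 15, 14, 13, 14, 15, 95, 60, 30, 16, 14]
print(detect_volume_spikes(counts))  # → [8]
```

The tagging step in the paper would then take the most frequent terms from the tweets inside each flagged bin as the event's representative words.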
16

Gera, Suruchi, and Adwitiya Sinha. "A machine learning-based malicious bot detection framework for trend-centric twitter stream." Journal of Discrete Mathematical Sciences and Cryptography 24, no. 5 (2021): 1337–48. http://dx.doi.org/10.1080/09720529.2021.1932923.

17

Chierichetti, Flavio, Jon Kleinberg, Ravi Kumar, Mohammad Mahdian, and Sandeep Pandey. "Event Detection via Communication Pattern Analysis." Proceedings of the International AAAI Conference on Web and Social Media 8, no. 1 (2014): 51–60. http://dx.doi.org/10.1609/icwsm.v8i1.14536.

Abstract:
Social media applications such as Twitter provide a powerful medium through which users can communicate their observations with friends and with the world at large. We have witnessed live reporting of many events, from soccer games in Johannesburg to revolutions in Cairo and Tunis, and these reports have in many ways rivaled the content provided by the official media. Tapping into this valuable resource is a challenge, due to the heterogeneity and noise inherent in real-time text, the diversity of languages, and fast-evolving linguistic norms. In this paper we seek to analyze a tweet stream to automatically discover points in time when an important event happens, and to classify such events based on the type of sentiments they evoke, using only non-textual features of the tweeting pattern. This results not only in a robust way of analyzing tweet streams independent of the languages used; it also provides insights about how users behave on social media websites. For example, we observe that users often react to an exciting external event by decreasing the volume of communication with other users. We explain this effect through a model of how users switch between producing information or sentiments and sharing others' news or sentiments. We develop and evaluate our models and algorithms using several Twitter data sets, focusing in particular on the tweets sent during the soccer World Cup of 2010. This data set has the feature that the underlying ground truth is well-defined and known, with goals serving as events.
18

Hoeber, Orland, Larena Hoeber, Maha El Meseery, Kenneth Odoh, and Radhika Gopi. "Visual Twitter Analytics (Vista)." Online Information Review 40, no. 1 (2016): 25–41. http://dx.doi.org/10.1108/oir-02-2015-0067.

Abstract:
Purpose – Due to the size and velocity at which user generated content is created on social media services such as Twitter, analysts are often limited by the need to pre-determine the specific topics and themes they wish to follow. Visual analytics software may be used to support the interactive discovery of emergent themes. The paper aims to discuss these issues. Design/methodology/approach – Tweets collected from the live Twitter stream matching a user's query are stored in a database and classified based on their sentiment. The temporally changing sentiment is visualized, along with sparklines showing the distribution of the top terms, hashtags, user mentions, and authors in each of the positive, neutral, and negative classes. Interactive tools are provided to support sub-querying and the examination of emergent themes. Findings – A case study of using Vista to analyze sport fan engagement within a mega-sport event (2013 Le Tour de France) is provided. The authors illustrate how emergent themes can be identified and isolated from the large collection of data, without the need to identify these a priori. Originality/value – Vista provides mechanisms that support the interactive exploration of Twitter data. By combining automatic data processing and machine learning methods with interactive visualization software, researchers are relieved of tedious data processing tasks and can focus on the analysis of high-level features of the data. In particular, patterns of Twitter use can be identified, emergent themes can be isolated, and purposeful samples of the data can be selected by the researcher for further analysis.
19

Abdullahi, Habeeba Ibraheem, Muhammad Aminu Ahmad, and Khalid Haruna. "Twitter sentiment analysis for Hausa abbreviations and acronyms." Science World Journal 19, no. 1 (2024): 101–4. http://dx.doi.org/10.4314/swj.v19i1.13.

Abstract:
The use of natural language processing to identify, extract, and organize sentiment from user-generated texts in social networks, blogs, or product reviews is known as sentiment analysis or opinion mining. Hausa is among the major widely spoken languages in Africa and one of the three major Nigerian languages, so investigating such a language will have a significant influence on social, economic, business, political, and even educational services and settings. Some Hausa texts are abbreviated and some are in acronym format, which is a challenge to researchers, as such comments are unstructured and need normalization for further understanding of the text; there is also a scarcity of sentiment analysis work on Hausa abbreviations and acronyms. An abbreviation is a shortened form of a word, while an acronym is an abbreviation formed from the initial letters of other words and pronounced as a word. This research aims to develop an improved Hausa sentiment dataset for the enhancement of sentiment analysis with abbreviations and acronyms. This is achieved by adapting the approach for Hausa sentiment analysis based on Multinomial Naïve Bayes (MNB) and Logistic Regression algorithms using the count vectorizer, along with Python libraries for NLP. This research affirms that the improved dataset with abbreviations and acronyms outperforms the plain Hausa dataset by 4% in accuracy using Multinomial Naïve Bayes. The result shows that, in addition to the normal preprocessing techniques for the social media stream, understanding, interpreting, and resolving ambiguity in the usage of abbreviations and acronyms leads to improved accuracy of the algorithms, as evidenced in the experimental results.
20

Vijayaragavan, P., S. Swetha, and M. Selva Udhaya. "User Behaviour Analysis Using Sequence of Document on Internet of Stream." International Journal of Engineering Sciences & Research Technology 6, no. 4 (2017): 87–91. https://doi.org/10.5281/zenodo.496088.

Abstract:
Textual documents created and distributed on the Internet are ever changing in various forms. The aim of this project is to characterize and detect personalized and abnormal behaviours of Internet users. It can be applied in many real-life scenarios, such as real-time monitoring of abnormal user behaviours. Existing works are devoted to topic modelling and the evolution of individual topics, while the sequential relations of topics in successive documents published by a specific user are ignored; hence, user activity monitoring is neither feasible nor effective. We propose a system to extract users' activity from real-time web application data sets on Twitter and Gmail. Our technique can monitor a user's sequential topic patterns based on session identification across multiple applications with a single sign-on email ID and session ID.
21

Naidu, K. V. G. N., and P. Sireesha. "Twitter Analysis for Identification of Real-Time Traffic." International Journal of Research Science & Management 4, no. 5 (2017): 148–51. https://doi.org/10.5281/zenodo.573507.

Abstract:
Social networks have recently been employed as a source of information for event detection, with specific reference to road traffic congestion and accidents or earthquake reporting. In this paper, we present real-time detection of traffic events from Twitter stream analysis. The system fetches tweets from Twitter according to several search criteria, processes the tweets by applying text mining methods, and finally classifies them. The aim is to assign a suitable class label to every tweet, indicating whether or not it relates to a traffic event. The traffic detection framework was employed for real-time monitoring of several areas of the Italian road network, allowing detection of traffic events almost in real time, often before online traffic news sites. We employed a support vector machine as the classification model and achieved an accuracy value of 90.75% on a binary classification problem (traffic versus non-traffic tweets). We were also able to discriminate whether or not the traffic was caused by an external event, by solving a multiclass classification problem and obtaining an accuracy value of 80.89%.
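The classification stage of such a pipeline (fetch tweets, text-mine, label traffic vs. non-traffic) can be sketched with a tiny bag-of-words classifier. A multinomial Naive Bayes is used here as a stdlib-only stand-in for the SVM the paper employs, and the training tweets are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def train_nb(examples):
    """Train a multinomial Naive Bayes on (text, label) pairs
    (a simple stand-in for the SVM used in the paper)."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab

def classify(model, text):
    """Pick the label with the highest log-posterior under Laplace smoothing."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_lp = None, -math.inf
    for label in label_counts:
        lp = math.log(label_counts[label] / total)
        n = sum(word_counts[label].values())
        for tok in text.lower().split():
            lp += math.log((word_counts[label][tok] + 1) / (n + len(vocab)))
        if lp > best_lp:
            best_label, best_lp = label, lp
    return best_label

# Invented training tweets for illustration
train = [
    ("heavy traffic jam on the ring road", "traffic"),
    ("accident blocking lane near the bridge", "traffic"),
    ("queue of cars stuck at the junction", "traffic"),
    ("lovely sunny morning for a walk", "nontraffic"),
    ("great match last night", "nontraffic"),
    ("coffee with friends downtown", "nontraffic"),
]
model = train_nb(train)
print(classify(model, "cars stuck in a jam near the bridge"))  # → traffic
```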
22

Aljebreen, Abdullah, Weiyi Meng, and Eduard Dragut. "Segmentation of Tweets with URLs and its Applications to Sentiment Analysis." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 14 (2021): 12480–88. http://dx.doi.org/10.1609/aaai.v35i14.17480.

Abstract:
An important means for disseminating information in social media platforms is by including URLs that point to external sources in user posts. In Twitter, we estimate that about 21% of the daily stream of English-language tweets contains URLs. We notice that NLP tools make little attempt at understanding the relationship between the content of the URL and the text surrounding it in a tweet. In this work, we study the structure of tweets with URLs relative to the content of the Web documents pointed to by the URLs. We identify several segment classes that may appear in a tweet with URLs, such as the title of a Web page and the user's original content. Our goals in this paper are to introduce, define, and analyze the segmentation problem of tweets with URLs, develop an effective algorithm to solve it, and show that our solution can benefit sentiment analysis on Twitter. We also show that the problem is an instance of the block edit distance problem, and thus NP-hard.
23

Sailaja Kumar, K., D. Evangelin Geetha, and Pratap Rudra Sahoo. "A Methodology to Handle Heterogeneous Data Generated in Online Social Networks." Journal of Computational and Theoretical Nanoscience 17, no. 9 (2020): 4098–102. http://dx.doi.org/10.1166/jctn.2020.9025.

Full text
Abstract:
Analyzing the heterogeneous data generated by social networking sites is a research challenge. Twitter is a massive social networking site. In this paper, for processing the heterogeneous data, a methodology is devised, which helps in categorizing the data obtained from Twitter into different directories and understanding the text data explicitly. The methodology is implemented using the Python programming language. Python's tweepy package is used to download the Twitter stream data, which includes images, videos and text data. Python's Aylien API is used for analyzing the Twitter text data. Using this API, a sentiment analysis report is generated. Using Python's matplotlib package, a pie chart is generated to visualize the sentiment analysis results. Further, an algorithm is proposed for sentiment analysis, which not only categorizes the tweets into positive, negative and neutral (as the Aylien API does), but also categorizes the tweets into strongly and weakly positive and negative based on polarity and subjectivity. The Django platform and Python's TextBlob package are used for implementing this algorithm. For this experiment, data is collected from Twitter using hashtags related to different events/topics such as IPL2018, World Cup2018, Modi, and Delete Facebook during the period Monday, Jan 22, 2018 to Monday, May 28, 2018. Moreover, the data is collected and processed using Python's TextBlob, sentiment analysis is conducted on the text data using TextBlob, and visual reports are generated using Google Charts. The results obtained from both of the above-mentioned approaches are compared and it is observed that the proposed algorithm gives better sentiment analysis of the tweets.
APA, Harvard, Vancouver, ISO, and other styles
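The strong/weak categorization described in the abstract can be expressed as a small function over TextBlob-style polarity and subjectivity scores. Polarity lies in [-1, 1] and subjectivity in [0, 1], as in TextBlob; the 0.5 cut-off separating "strongly" from "weakly" is an assumed threshold, since the paper does not publish its exact values.

```python
def categorize(polarity, subjectivity, strong=0.5):
    """Map a TextBlob-style (polarity, subjectivity) pair to a label.

    The 0.5 'strong' threshold is an assumption for illustration; the
    paper's exact cut-offs are not stated in the abstract.
    """
    if polarity == 0 or subjectivity == 0:
        return "neutral"
    strength = "strongly" if abs(polarity) >= strong else "weakly"
    sign = "positive" if polarity > 0 else "negative"
    return f"{strength} {sign}"

print(categorize(0.8, 0.9))   # strongly positive
print(categorize(-0.2, 0.4))  # weakly negative
print(categorize(0.0, 0.5))   # neutral
```

In practice the two scores would come from `TextBlob(text).sentiment`, and the labels could feed the pie-chart visualization the paper describes.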
24

Ranjan, Ayush. "Real-Time Twitter Trends Analysis Using Latent Dirichlet Allocation and Machine Learning." INTERANTIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 08, no. 03 (2024): 1–6. http://dx.doi.org/10.55041/ijsrem29198.

Full text
Abstract:
As far as social media is concerned, Twitter has become a major source of public data in the form of tweets. Twitter is an emerging source of large textual data (big data). People can easily express their opinions, reviews, interests and tastes about a particular event, topic, product, etc., occurring worldwide. This makes Twitter a good source of valuable data which can be further used to perform sentiment analysis, identify trending topics of public interest, and gauge public opinion on a particular product or event, which can be beneficial for business growth and for political parties to know public choices and take actions accordingly. The extraction of meaningful insights from the vast and dynamic stream of social media data is greatly facilitated by the implementation of topic modeling and sentiment analysis on Twitter tweets. The identification of recurring themes or subjects within a collection of tweets, which is the essence of topic modeling, serves to reveal patterns in discussions, thereby aiding in the comprehension of prevalent topics and trends. Concurrently, sentiment analysis, through the evaluation of the emotional tone of tweets, enables the discernment of whether the expressed sentiments are positive, negative, or neutral. The combination of these techniques provides researchers and businesses with valuable perspectives on public opinions, emerging issues, and user sentiments, thereby empowering them to make informed decisions and develop effective engagement strategies in the ever-evolving landscape of social media. But Twitter data (tweets) are unstructured and contain noisy data, URLs, stop words, re-tweets, videos, emoji, etc., which need to be cleaned and preprocessed to perform proper sentiment analysis and to extract meaningful, effective insights.
This paper surveys various methods of topic modeling and discovering latent topics, text-mining approaches, and micro-blogging methods used in previous research. In particular, it focuses on the Latent Dirichlet Allocation (LDA) method of topic modeling and text mining to discover latent topics in tweets. A proper survey of previous research on this topic is presented in this article. Keywords: LDA, text-mining, topic modeling, micro-blogging, machine learning, sentiment analysis, NLP.
APA, Harvard, Vancouver, ISO, and other styles
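The cleaning steps the abstract lists (URLs, mentions, hashtag symbols, emoji, stop words) can be sketched with the standard library alone; the stopword list here is a tiny illustrative subset, not a full one.

```python
import re

# Illustrative subset only; real pipelines use full stopword lists.
STOPWORDS = {"the", "a", "an", "is", "are", "and", "or", "to", "of", "rt"}

def clean_tweet(text):
    """Strip URLs, mentions, hashtag symbols, punctuation/emoji and
    stopwords: a minimal sketch of the pre-processing the paper lists."""
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"@\w+", " ", text)           # remove mentions
    text = text.replace("#", " ")               # keep tag text, drop symbol
    text = re.sub(r"[^\w\s]", " ", text)        # punctuation and emoji
    tokens = [t for t in text.lower().split() if t not in STOPWORDS]
    return " ".join(tokens)

print(clean_tweet("RT @user: LDA is great! #topicmodeling https://t.co/xyz"))
```

The cleaned tokens would then feed a topic model (e.g. LDA) or a sentiment classifier, as the survey describes.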
25

Burns, John, Tom Kelsey, and Carl Donovan. "Extracting emerging events from social media: X/Twitter and the multilingual analysis of emerging geopolitical topics in near real time." Journal of Social Media Research 2, no. 1 (2025): 50–70. https://doi.org/10.29329/jsomer.14.

Full text
Abstract:
This study uses multiple languages to investigate the emergence of geopolitical topics on X / Twitter across two different time intervals: daily and hourly. For the daily interval, we examined the emergence of topics from February 4th, 2023, to March 23rd, 2023, at random three-hour intervals, compiling the topic modeling results for each day into a time series. For the hourly interval, we considered two days of data, June 1st, 2023, and June 6th, 2023, where we tracked the growth of topics for those days. We collected our data through the X / Twitter Filtered Stream using key bigrams (two-word phrases) for various geopolitical topics for multiple languages to identify emerging geopolitical events at the global and regional levels. Lastly, we compared the trends created by tracking emerging topics over time to Google Trends data, another data source for emerging topics. At the daily level, we found that our X / Twitter-based algorithm was able to identify multiple geopolitical events at least a day before they became relevant on Google Trends, and in the case of North Korean missile launches during this period, several languages identified more missile launches than the Google Trends data. As for the hourly data, we again found several topics that emerged hours before they started appearing on Google Trends. Our analyses also found that the different languages allowed for greater diversity in topics that would not have been possible if only one language had been used.
APA, Harvard, Vancouver, ISO, and other styles
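The core tracking idea above, reducing a filtered stream to per-interval counts of key bigrams whose growth flags an emerging topic, can be sketched as follows. The hour buckets and example texts are invented for illustration; the study's actual data came from the X/Twitter Filtered Stream.

```python
from collections import Counter, defaultdict

def bigrams(text):
    # Two-word phrases, the unit the study filters and tracks.
    toks = text.lower().split()
    return list(zip(toks, toks[1:]))

def track_bigrams(stream):
    """Count bigrams per time bucket from (hour, text) pairs.

    A stdlib sketch: a sudden rise of a bigram's count across
    consecutive buckets would mark an emerging topic.
    """
    buckets = defaultdict(Counter)
    for hour, text in stream:
        buckets[hour].update(bigrams(text))
    return buckets

# Invented mini-stream for illustration.
stream = [
    (9, "missile launch reported this morning"),
    (9, "another missile launch detected"),
    (10, "missile launch confirmed by officials"),
]
buckets = track_bigrams(stream)
print(buckets[9][("missile", "launch")])  # prints 2
```

A real deployment would compare such counts against a baseline (as the study does against Google Trends) rather than read raw totals.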
26

Cherichi, Soumaya, and Rim Faiz. "Leveraging Temporal Markers to Detect Event from Microblogs." International Journal of Knowledge Society Research 8, no. 3 (2017): 54–67. http://dx.doi.org/10.4018/ijksr.2017070104.

Full text
Abstract:
One of the marvels of our time is the unprecedented development and use of technologies that support social interaction. Social mediating technologies have engendered radically new ways of information and communication, particularly during events; in case of natural disaster like earthquakes tsunami and American presidential election. This paper is based on data obtained from Twitter because of its popularity and sheer data volume. This content can be combined and processed to detect events, entities and popular moods to feed various new large-scale data-analysis applications. On the downside, these content items are very noisy and highly informal, making it difficult to extract sense out of the stream. Taking to account all the difficulties, we propose a new event detection approach combining linguistic features and Twitter features. Finally, we present our system that aims (1) detect new events, (2) to recognize temporal markers pattern of an event, (3) and to classify important events according to thematic pertinence, author pertinence and tweet volume.
APA, Harvard, Vancouver, ISO, and other styles
27

García-Méndez, Silvia, Arriba-Pérez Francisco de, Ana Barros-Vila, and Francisco J. González-Castaño. "Targeted aspect-based emotion analysis to detect opportunities and precaution in financial Twitter messages." Expert Systems with Applications 218 (January 23, 2023): 14. https://doi.org/10.1016/j.eswa.2023.119611.

Full text
Abstract:
Microblogging platforms, of which Twitter is a representative example, are valuable information sources for market screening and financial models. In them, users voluntarily provide relevant information, including educated knowledge on investments, reacting to the state of the stock markets in real-time and, often, influencing this state. We are interested in the user forecasts in financial, social media messages expressing opportunities and precautions about assets. We propose a novel Targeted Aspect-Based Emotion Analysis (TABEA) system that can individually discern the financial emotions (positive and negative forecasts) on the different stock market assets in the same tweet (instead of making an overall guess about that whole tweet). It is based on Natural Language Processing (NLP) techniques and Machine Learning streaming algorithms. The system comprises a constituency parsing module for parsing the tweets and splitting them into simpler declarative clauses; an offline data processing module to engineer textual, numerical and categorical features and analyse and select them based on their relevance; and a stream classification module to continuously process tweets on-the-fly. Experimental results on a labelled data set endorse our solution. It achieves over 90 % precision for the target emotions, financial opportunity, and precaution on Twitter. To the best of our knowledge, no prior work in the literature has addressed this problem despite its practical interest in decision-making, and we are not aware of any previous NLP nor online Machine Learning approaches to TABEA.
APA, Harvard, Vancouver, ISO, and other styles
28

SRISANKAR, M., and Dr K. P. LOCHANAMBAL. "THE SENTIMENTAL ANALYSIS USING DEEP LEARNING MODELS." INTERANTIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 07, no. 11 (2023): 1–11. http://dx.doi.org/10.55041/ijsrem27151.

Full text
Abstract:
The tweets are brief and come in a steady stream. Emotions have a significant impact on feelings. People can express their ideas about anything on social media. Public perception is divided into three categories: positive, negative, and neutral. In this study, Twitter hotel reviews are gathered and pre-processed before being analyzed using Python's Tweepy package. Re-tweets, tags, URLs, hashtag symbols, and duplicate entries are all eliminated as part of a screening procedure to remove any discrepancies in the data. Using Python's scikit-learn module, tweets are up-sampled and divided. Python turns textual data into vectors using the Keras Tokenizer. Bi-sense Emoji Embedding (BSEE) is used to perform sentiment analysis. Sentiment is classified using Support Vector Machine (SVM), Random Forest (RF), and Long Short-Term Memory (LSTM) models, which are compared based on accuracy, recall, F-measure, precision, time duration, and performance. It is clear that the proposed classifier produces better results. Keywords: Bi-Sense Emoji Embedding (BSEE), Long Short-Term Memory (LSTM), Support Vector Machine (SVM), Random Forest (RF).
APA, Harvard, Vancouver, ISO, and other styles
29

Elmas, Tuğrulcan, Rebekah Overdorf, and Karl Aberer. "A Dataset of State-Censored Tweets." Proceedings of the International AAAI Conference on Web and Social Media 15 (May 22, 2021): 1009–15. http://dx.doi.org/10.1609/icwsm.v15i1.18124.

Full text
Abstract:
Many governments impose traditional censorship methods on social media platforms. Instead of removing it completely, many social media companies, including Twitter, only withhold the content from the requesting country. This makes such content still accessible outside of the censored region, allowing for an excellent setting in which to study government censorship on social media. We mine such content using the Internet Archive's Twitter Stream Grab. We release a dataset of 583,437 tweets by 155,715 users that were censored between 2012-2020 July. We also release 4,301 accounts that were censored in their entirety. Additionally, we release a set of 22,083,759 supplemental tweets made up of all tweets by users with at least one censored tweet as well as instances of other users retweeting the censored user. We provide an exploratory analysis of this dataset. Our dataset will not only aid in the study of government censorship but will also aid in studying hate speech detection and the effect of censorship on social media users. The dataset is publicly available at https://doi.org/10.5281/zenodo.4439509
APA, Harvard, Vancouver, ISO, and other styles
30

Aramburu, María José, Rafael Berlanga, and Indira Lanza. "Social Media Multidimensional Analysis for Intelligent Health Surveillance." International Journal of Environmental Research and Public Health 17, no. 7 (2020): 2289. http://dx.doi.org/10.3390/ijerph17072289.

Full text
Abstract:
Background: Recent work in social network analysis has shown the usefulness of analysing and predicting outcomes from user-generated data in the context of Public Health Surveillance (PHS). Most of the proposals have focused on dealing with static datasets gathered from social networks, which are processed and mined off-line. However, little work has been done on providing a general framework to analyse the highly dynamic data of social networks from a multidimensional perspective. In this paper, we claim that such a framework is crucial for including social data in PHS systems. Methods: We propose a dynamic multidimensional approach to deal with social data streams. In this approach, dynamic dimensions are continuously updated by applying unsupervised text mining methods. More specifically, we analyse the semantics and temporal patterns in posts for identifying relevant events, topics and users. We also define quality metrics to detect relevant user profiles. In this way, the incoming data can be further filtered to cope with the goals of PHS systems. Results: We have evaluated our approach over a long-term stream of Twitter. We show how the proposed quality metrics allow us to filter out the users that are out-of-domain as well as those with low quality in their messages. We also explain how specific user profiles can be identified through their descriptions. Finally, we illustrate how the proposed multidimensional model can be used to identify main events and topics, as well as to analyse their audience and impact. Conclusions: The results show that the proposed dynamic multidimensional model is able to identify relevant events and topics and analyse them from different perspectives, which is especially useful for PHS systems.
APA, Harvard, Vancouver, ISO, and other styles
31

Rodrigues, Anisha P., Roshan Fernandes, Aakash A, et al. "Real-Time Twitter Spam Detection and Sentiment Analysis using Machine Learning and Deep Learning Techniques." Computational Intelligence and Neuroscience 2022 (April 15, 2022): 1–14. http://dx.doi.org/10.1155/2022/5211949.

Full text
Abstract:
In this modern world, we are accustomed to a constant stream of data. Major social media sites like Twitter, Facebook, or Quora face a huge dilemma as a lot of these sites fall victim to spam accounts. These accounts are made to trap unsuspecting genuine users by making them click on malicious links or keep posting redundant posts by using bots. This can greatly impact the experiences that users have on these sites. A lot of time and research has gone into effective ways to detect these forms of spam. Performing sentiment analysis on these posts can help us in solving this problem effectively. The main purpose of this proposed work is to develop a system that can determine whether a tweet is “spam” or “ham” and evaluate the emotion of the tweet. The extracted features after preprocessing the tweets are classified using various classifiers, namely, decision tree, logistic regression, multinomial naïve Bayes, support vector machine, random forest, and Bernoulli naïve Bayes for spam detection. The stochastic gradient descent, support vector machine, logistic regression, random forest, naïve Bayes, and deep learning methods, namely, simple recurrent neural network (RNN) model, long short-term memory (LSTM) model, bidirectional long short-term memory (BiLSTM) model, and 1D convolutional neural network (CNN) model are used for sentiment analysis. The performance of each classifier is analyzed. The classification results showed that the features extracted from the tweets can be satisfactorily used to identify if a certain tweet is spam or not and create a learning model that will associate tweets with a particular sentiment.
APA, Harvard, Vancouver, ISO, and other styles
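Multinomial naive Bayes is one of the spam classifiers the abstract compares; a stdlib-only sketch on invented toy data illustrates the word-count model (the paper itself applies library implementations to real tweets).

```python
import math
from collections import Counter

class NaiveBayes:
    """Multinomial naive Bayes over word counts: a minimal sketch of
    one of the spam/ham classifiers the paper evaluates."""

    def fit(self, docs, labels):
        self.classes = set(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        self.class_counts = Counter(labels)
        for text, label in zip(docs, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}

    def predict(self, text):
        def log_prob(c):
            total = sum(self.word_counts[c].values())
            lp = math.log(self.class_counts[c] / sum(self.class_counts.values()))
            for w in text.lower().split():
                # Laplace smoothing keeps unseen words from zeroing out.
                lp += math.log((self.word_counts[c][w] + 1) /
                               (total + len(self.vocab)))
            return lp
        return max(self.classes, key=log_prob)

# Invented toy corpus for illustration.
docs = ["win free prize now", "free money click here",
        "meeting at noon tomorrow", "lunch with the team"]
labels = ["spam", "spam", "ham", "ham"]
nb = NaiveBayes()
nb.fit(docs, labels)
print(nb.predict("claim your free prize"))  # prints spam
```

The same fit/predict shape carries over to the other classifiers compared in the paper; only the decision rule changes.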
32

Schafer, Valérie, Gérôme Truc, Romain Badouard, Lucien Castex, and Francesca Musiani. "Paris and Nice terrorist attacks: Exploring Twitter and web archives." Media, War & Conflict 12, no. 2 (2019): 153–70. http://dx.doi.org/10.1177/1750635219839382.

Full text
Abstract:
The attacks suffered by France in January and November 2015, and then in the course of 2016, especially the Nice attack, provoked intense online activity both during the events and in the months that followed. The digital traces left by this reactivity and reactions to events gave rise, from the very first days and even hours after the attacks, to a ‘real-time’ institutional archiving by the National Library of France ( Bibliothèque nationale de France, BnF) and the National Audio-visual Institute ( Institut national de l’audiovisuel, Ina). The results amount to millions of archived tweets and URLs. This article seeks to highlight some of the most significant issues raised by these relatively unprecedented corpora, from collection to exploitation, from online stream of data to its mediation and re-composition. Indeed, web archiving practices in times of emergency and crises are significant, almost emblematic, loci to explore the human and technical agencies, and the complex temporalities, of ‘born-digital’ heritage. The cases examined here emphasize the way these ‘emergency collections’ challenge the perimeters and the very nature of web archives as part of our digital and societal heritage, and the guiding visions of its governance and mission. Finally, the present analysis underlines the need for a careful contextualization of the design process – both of original web pages or tweets and of their archived images – and of the tools deployed to collect, retrieve and analyse them.
APA, Harvard, Vancouver, ISO, and other styles
33

Rajeshri, R. Shelke. "Identification of User Aware Rare Sequential Pattern in Document Stream An Overview." International Journal of Trend in Scientific Research and Development 3, no. 4 (2019): 1340–42. https://doi.org/10.5281/zenodo.3591065.

Full text
Abstract:
Documents created and distributed on the Internet are ever changing in various forms. Most existing works are devoted to topic modeling and the evolution of individual topics, while sequential relations of topics in successive documents published by a specific user are ignored. In order to characterize and detect personalized and abnormal behaviours of Internet users, we propose Sequential Topic Patterns (STPs) and formulate the problem of mining User-aware Rare Sequential Topic Patterns (URSTPs) in document streams on the Internet. They are rare on the whole but relatively frequent for specific users, so they can be applied in many real-life scenarios, such as real-time monitoring of abnormal user behaviours. We present solutions to this innovative mining problem through three phases: pre-processing to extract probabilistic topics and identify sessions for different users; generating all the STP candidates with expected support values for each user by pattern growth; and selecting URSTPs by making user-aware rarity analysis on the derived STPs. Experiments on both real Twitter and synthetic datasets show that our approach can indeed discover special users and interpretable URSTPs effectively and efficiently, which significantly reflect users' characteristics.
APA, Harvard, Vancouver, ISO, and other styles
34

Kryvasheyeu, Yury, Haohui Chen, Nick Obradovich, et al. "Rapid assessment of disaster damage using social media activity." Science Advances 2, no. 3 (2016): e1500779. http://dx.doi.org/10.1126/sciadv.1500779.

Full text
Abstract:
Could social media data aid in disaster response and damage assessment? Countries face both an increasing frequency and an increasing intensity of natural disasters resulting from climate change. During such events, citizens turn to social media platforms for disaster-related communication and information. Social media improves situational awareness, facilitates dissemination of emergency information, enables early warning systems, and helps coordinate relief efforts. In addition, the spatiotemporal distribution of disaster-related messages helps with the real-time monitoring and assessment of the disaster itself. We present a multiscale analysis of Twitter activity before, during, and after Hurricane Sandy. We examine the online response of 50 metropolitan areas of the United States and find a strong relationship between proximity to Sandy’s path and hurricane-related social media activity. We show that real and perceived threats, together with physical disaster effects, are directly observable through the intensity and composition of Twitter’s message stream. We demonstrate that per-capita Twitter activity strongly correlates with the per-capita economic damage inflicted by the hurricane. We verify our findings for a wide range of disasters and suggest that massive online social networks can be used for rapid assessment of damage caused by a large-scale disaster.
APA, Harvard, Vancouver, ISO, and other styles
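The reported link between per-capita Twitter activity and per-capita economic damage rests on correlation analysis; a Pearson correlation computed over invented per-capita figures illustrates the calculation behind that claim.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient, the statistic behind the
    activity-vs-damage relationship the study reports."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented per-capita figures for four hypothetical metro areas;
# the study used real data for 50 US metropolitan areas.
tweets_per_capita = [0.2, 0.5, 1.1, 1.6]
damage_per_capita = [10, 30, 70, 95]
r = pearson(tweets_per_capita, damage_per_capita)
print(round(r, 3))
```

A value of r near 1, as in this toy data, is what "strongly correlates" means quantitatively in the abstract.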
35

Pandey, Vidushi, Sumeet Gupta, and Manojit Chattopadhyay. "A framework for understanding citizens’ political participation in social media." Information Technology & People 33, no. 4 (2019): 1053–75. http://dx.doi.org/10.1108/itp-03-2018-0140.

Full text
Abstract:
Purpose The purpose of this paper is to explore how the use of social media by citizens has impacted the traditional conceptualization and operationalization of political participation in the society. Design/methodology/approach This study is based on Teorell et al.’s (2007) classification of political participation which is modified to suit the current context of social media. The authors classified 15,460 tweets along three parameters suggested in the framework with help of supervised text classification algorithms. Findings The analysis reveals that Activism is the most prominent form of political participation undertaken by people on Twitter. Other activities that were undertaken include Formal Political participation and Consumer participation. The analysis also reveals that identity of participant does not play a classifying role as expected from the theoretical framework. It was found that the social media as a platform facilitates new forms of participation which are not feasible offline. Research limitations/implications The current work considers only the microblogging platform of Twitter as the data source. For a more comprehensive insight, analysis of other social media platforms is also required. Originality/value To the best of the authors’ knowledge, this is one of the few analyses where such a large database covering multiple social media events has been created and analysed using supervised text classification algorithms. A large proportion of previous studies on social media have been based on case study and have limited analysis to only a particular event on social media. Although there exist a few works that have studied a vast and varied collection of social media data (Gaby and Caren, 2012; Shirazi, 2013; Rane and Salem, 2012), such efforts are few in number. This study aims to add to that stream of work where a wider and more generalized set of social media data is studied.
APA, Harvard, Vancouver, ISO, and other styles
36

Trezise, Bryoni. "Minor Representations: From Anne Frank to Bana Alabed – The Radically Performative Literacies of a Viral Child." Forum for Modern Language Studies 56, no. 2 (2020): 115–34. http://dx.doi.org/10.1093/fmls/cqaa006.

Full text
Abstract:
Abstract In this article, I construct a comparative analysis of two forms of child-authored life-narrative: the famous Diary of a Young Girl written by Anne Frank and the contemporary Twitter stream authored by the Syrian child-writer Bana Alabed. My interest in these two textual practices is focused on how they each formulate notions and experiences of temporality that are central to how conceptions of the modern, innocent child and its most recent counterpart – a figure whom I term the viral child – function. Across this analysis, I observe how the textual utterances performed by each child-author make specific claims about the autobiographical ‘I’ as it enacts a literary relationship to the temporality of its ongoing construction. I argue that the radical shift in relations between the child-figure and the future cast by Alabed’s tweets positions the viral child as not only a voice, but also a medium, of the cultural present.
APA, Harvard, Vancouver, ISO, and other styles
37

Caviggioli, Federico, Lucio Lamberti, Paolo Landoni, and Paolo Meola. "Technology adoption news and corporate reputation: sentiment analysis about the introduction of Bitcoin." Journal of Product & Brand Management 29, no. 7 (2020): 877–97. http://dx.doi.org/10.1108/jpbm-03-2018-1774.

Full text
Abstract:
Purpose Evidence from previous literature indicates that adopting a new innovative technology has a positive impact on a company’s business performance. Much less work has been carried out into examining whether a technology adoption has impact on corporate reputation. This paper aims to examine the latter topic in a context where social media is the channel used to share news about the introduction of a new technology. The empirical setting of the study consists of five retail companies located in the USA that decided to include Bitcoin as a payment platform. Design/methodology/approach Twitter data were used to measure how sharing news about the adoption of new technology could affect the reputation of the companies selected, keeping a clear distinction between the volume of data relating to social media responses and the sentiment expressed in the tweets. A panel vector autoregression model was used to incorporate series of data relating to news items, volume and sentiment. Findings The results show that the news about the adoption of a new technology has a positive impact on both the volume of tech-related tweets and the sentiment expressed in the tweets themselves, although the patterns of these two effects are different. The resulting impact decreases after a few days, both in volume and in sentiment. Research limitations/implications The analysis has limitations that future research could address by extending and diversifying the examined companies and the social media used as data sources. The research suggests that managers in medium-sized companies can leverage on the introduction of new technologies that have a direct impact on their customers and gain reputational benefits in terms of immediate visibility. Originality/value The research introduces an additional dimension of analysis to the current stream of corporate reputation. 
Although the literature has already covered the dynamics of response to events on Twitter, by focusing on the adoption of the new Bitcoin technology, the paper provides novel insights.
APA, Harvard, Vancouver, ISO, and other styles
38

Salama, Mohamed, Hatem Abdul Kader, and Amira Abdelwahab. "An analytic framework for enhancing the performance of big heterogeneous data analysis." International Journal of Engineering Business Management 13 (January 1, 2021): 184797902199052. http://dx.doi.org/10.1177/1847979021990523.

Full text
Abstract:
The use of social media networks has become a pervasive phenomenon in the world today, where people share posts and tweets, connect with different groups, and share their opinions about things. This data is extremely heterogeneous, so it is hard to analyze and derive information from it, even though it is an indispensable source for decision-makers. New techniques are therefore needed to handle these huge amounts of data to find the hidden information and thus improve the results of the analysis. We develop a framework for the analysis of heterogeneous data using machine learning (ML) techniques. In contrast to most frameworks in the literature, which focus on a specific type of heterogeneous data for evaluation, we have analyzed 15k tweets from six American airlines. These tweets are collected from the open stream of Twitter; we also predict and classify each tweet as a negative or positive review, and test the ability of deep learning (DL) algorithms by comparing them with traditional ML algorithms. The findings confirmed the validity of the proposed framework and helped to achieve the study objective by providing excellent analysis performance and insights into additional aspects of information extraction from heterogeneous data.
APA, Harvard, Vancouver, ISO, and other styles
39

Ahmed, Wasim, and Sergej Lugovic. "Social media analytics: analysis and visualisation of news diffusion using NodeXL." Online Information Review 43, no. 1 (2019): 149–60. http://dx.doi.org/10.1108/oir-03-2018-0093.

Full text
Abstract:
Purpose The purpose of this paper is to provide an overview of NodeXL in the context of news diffusion. Journalists often include a social media dimension in their stories but lack the tools to get digital photos of the virtual crowds about which they write. NodeXL is an easy to use tool for collecting, analysing, visualising and reporting on the patterns found in collections of connections in streams of social media. With a network map patterns emerge that highlight key people, groups, divisions and bridges, themes and related resources. Design/methodology/approach This study conducts a literature review of previous empirical work which has utilised NodeXL and highlights the potential of NodeXL to provide network insights of virtual crowds during emerging news events. It then develops a number of guidelines which can be utilised by news media teams to measure and map information diffusion during emerging news events. Findings One emergent software application known as NodeXL has allowed journalists to take “group photos” of the connections among a group of users on social media. It was found that a diverse range of disciplines utilise NodeXL in academic research. Furthermore, based on the features of NodeXL, a number of guidelines were developed which provide insight into how to measure and map emerging news events on Twitter. Social implications With a set of social media network images a journalist can cover a set of social media content streams and quickly grasp “situational awareness” of the shape of the crowd. Since social media popular support is often cited but not documented, NodeXL social media network maps can help journalists quickly document the social landscape utilising an innovative approach. Originality/value This is the first empirical study to review literature on NodeXL, and to provide insight into the value of network visualisations and analytics for the news media domain. 
Moreover, it is the first empirical study to develop guidelines that will act as a valuable resource for newsrooms looking to acquire insight into emerging news events from the stream of social media posts. In the era of fake news and automated accounts (i.e., bots), the ability to highlight opinion leaders and ascertain their allegiances will be of importance in today's news climate.
APA, Harvard, Vancouver, ISO, and other styles
40

Yang, Kai-Cheng, Onur Varol, Pik-Mai Hui, and Filippo Menczer. "Scalable and Generalizable Social Bot Detection through Data Selection." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 01 (2020): 1096–103. http://dx.doi.org/10.1609/aaai.v34i01.5460.

Full text
Abstract:
Efficient and reliable social bot classification is crucial for detecting information manipulation on social media. Despite rapid development, state-of-the-art bot detection models still face generalization and scalability challenges, which greatly limit their applications. In this paper we propose a framework that uses minimal account metadata, enabling efficient analysis that scales up to handle the full stream of public tweets of Twitter in real time. To ensure model accuracy, we build a rich collection of labeled datasets for training and validation. We deploy a strict validation system so that model performance on unseen datasets is also optimized, in addition to traditional cross-validation. We find that strategically selecting a subset of training data yields better model accuracy and generalization than exhaustively training on all available data. Thanks to the simplicity of the proposed model, its logic can be interpreted to provide insights into social bot characteristics.
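The metadata-only idea the abstract describes can be sketched as a logistic score over a handful of account features. The feature set and weights below are hypothetical, chosen for illustration; they are not the paper's trained model.

```python
import math

def metadata_features(account):
    """Derive lightweight features from account metadata alone.
    (Illustrative; the paper's actual feature set is not reproduced here.)"""
    followers = account["followers_count"]
    friends = account["friends_count"]
    age_days = max(account["age_days"], 1)
    return {
        "follower_friend_ratio": followers / max(friends, 1),
        "tweets_per_day": account["statuses_count"] / age_days,
        "default_profile": 1.0 if account["default_profile"] else 0.0,
    }

def bot_score(features, weights, bias=0.0):
    """Logistic score in [0, 1]; higher suggests automation."""
    z = bias + sum(weights[k] * v for k, v in features.items())
    return 1.0 / (1.0 + math.exp(-z))

weights = {  # hypothetical weights, for illustration only
    "follower_friend_ratio": -0.8,
    "tweets_per_day": 0.05,
    "default_profile": 1.2,
}
acct = {"followers_count": 3, "friends_count": 2000,
        "statuses_count": 9000, "age_days": 30, "default_profile": True}
score = bot_score(metadata_features(acct), weights)
```

Because the features need no tweet content or network crawl, a scorer of this shape can keep up with a full public tweet stream, which is the scalability argument the paper makes.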
APA, Harvard, Vancouver, ISO, and other styles
41

Atiqah Sia Abdullah, Nur, and Hamizah Binti Anuar. "Review of Data Visualization for Social Media Postings." International Journal of Engineering & Technology 7, no. 4.38 (2018): 939. http://dx.doi.org/10.14419/ijet.v7i4.38.27613.

Full text
Abstract:
Facebook and Twitter are the most popular social media platforms among netizens. People are now more aggressive in expressing their opinions, perceptions, and emotions through social media platforms. These massive data provide great value for the data analyst to understand patterns and emotions related to a certain issue. Mining the data requires techniques and time; therefore, data visualization has become a popular way to represent this type of information. This paper aims to review data visualization studies that involved data from social media postings. Past literature used node-link diagrams, node-link trees, directed graphs, line graphs, heatmaps, and stream graphs to represent the data collected from social media platforms. An analysis comparing the social media data types, representations, and data visualization techniques is carried out based on the previous studies. This paper critically discusses the comparison and suggests suitable data visualization techniques based on the type of social media data at hand.
APA, Harvard, Vancouver, ISO, and other styles
42

Enriquez-Gibson, Judith. "Following hushtag (#)MOOC." Proceedings of the International Conference on Networked Learning 9 (April 7, 2014): 111–20. http://dx.doi.org/10.54337/nlc.v9.8977.

Full text
Abstract:
Electronic posts in social media sites have led to an interpersonal shift that allows discourse-search through the use of hashtags, resulting in the emergence of ‘searchable talk' (Zappavigna, 2011) and the rise of database culture (Miller, 2008). Hashtags have referred to ‘trending topics' and have become linguistic markers of ‘findability' towards a new form of sociality that is not based on reciprocity or the notion of a virtual community. What connects users is not who, or an ego-centric node, but what is passed along in a ‘stream' (i.e. the movement of ideas, information and sentiments) of re-tweets (RTs) and mentions (@) about some hashtagged (#) topics. This streaming sociality is the focus of this paper. To understand streaming sociality as an opportunity to expand social science research, this paper focuses on small talk about ‘massive' and trendy topics on education-related ideas and initiatives, namely mooc, coursera and futurelearn, on Twitter. First, it examines how Twitter as a social technology could be performed differently in simultaneous ways as context, tool and data set of social science research. At the level of theory, a ‘new' form of networked sociality is considered outside the ‘big data' slogan and extreme reactions on topics related to particular political events, campaigns, disasters, disease outbreaks, brand marketing or self-promotion. Then, this paper asks how the streaming sociality of Twitter may transform the content, movement and geographical trends of online courses associated with #coursera and #futurelearn by tagging and performing the circulation of a different kind of mooc - mobility of online courses ((#)mooc). It will venture into unfamiliar territories and spaces, mostly at an interface, hoping to follow (#)mooc through hashtags, re-tweets and mentions of education-related tweets.
At the point of data collection and analysis, free trial versions of social media analytics and visualisation software are used to attempt an alternative way of following the movement of online courses without ‘going big' or ‘being massive'. To follow and tag what is happening to online courses through the ‘code/space' (Kitchin & Dodge, 2011) of Twitter, it is necessary to attend to ‘hushtag' bondings. ‘Hushtag' refers to all those posts or news and events not explicitly tagged or mentioned by users. These muted tags give room for analytical moves that could potentially bridge the methodological divide between qualitative and quantitative epistemologies as social science researchers are confronted with the power of algorithms and monumentally detailed by-product data (Beer, 2012).
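The markers this study follows, hashtags, mentions and re-tweets, can be extracted with a short sketch like the following (an illustration only, not the analytics software the study used):

```python
import re

def tweet_markers(text):
    """Extract the linguistic markers of 'searchable talk':
    hashtags (#), mentions (@), and whether the post is a retweet."""
    return {
        "hashtags": re.findall(r"#(\w+)", text.lower()),
        "mentions": re.findall(r"@(\w+)", text),
        "is_retweet": text.startswith("RT @"),
    }

m = tweet_markers("RT @edtech: signing up for #coursera and #futurelearn #MOOC")
```

Counting these markers over a stream of posts is enough to trace which hashtagged topics are circulating, and through whom, without 'going big'.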
APA, Harvard, Vancouver, ISO, and other styles
43

Tao, Chunliang, Destiny Diaz, Zidian Xie, Long Chen, Dongmei Li, and Richard O’Connor. "Potential Impact of a Paper About COVID-19 and Smoking on Twitter Users’ Attitudes Toward Smoking: Observational Study." JMIR Formative Research 5, no. 6 (2021): e25010. http://dx.doi.org/10.2196/25010.

Full text
Abstract:
Background A cross-sectional study (Miyara et al, 2020) conducted by French researchers showed that the rate of current daily smoking was significantly lower in patients with COVID-19 than in the French general population, implying a potentially protective effect of smoking. Objective We aimed to examine the dissemination of the Miyara et al study among Twitter users and whether a shift in their attitudes toward smoking occurred after its publication as preprint on April 21, 2020. Methods Twitter posts were crawled between April 14 and May 4, 2020, by the Tweepy stream application programming interface, using a COVID-19–related keyword query. After filtering, the final 1929 tweets were classified into three groups: (1) tweets that were not related to the Miyara et al study before it was published, (2) tweets that were not related to Miyara et al study after it was published, and (3) tweets that were related to Miyara et al study after it was published. The attitudes toward smoking, as expressed in the tweets, were compared among the above three groups using multinomial logistic regression models in the statistical analysis software R (The R Foundation). Results Temporal analysis showed a peak in the number of tweets discussing the results from the Miyara et al study right after its publication. Multinomial logistic regression models on sentiment scores showed that the proportion of negative attitudes toward smoking in tweets related to the Miyara et al study after it was published (17.07%) was significantly lower than the proportion in tweets that were not related to the Miyara et al study, either before (44/126, 34.9%; P<.001) or after the Miyara et al study was published (68/198, 34.3%; P<.001). Conclusions The public’s attitude toward smoking shifted in a positive direction after the Miyara et al study found a lower incidence of COVID-19 cases among daily smokers.
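The study's three-group comparison can be illustrated with a toy proportion calculation. Note that the paper itself fits multinomial logistic regression in R; this sketch only compares raw proportions of negative attitudes per group.

```python
from collections import defaultdict

def negative_proportion(tweets):
    """Share of tweets labelled 'negative' toward smoking, per group.

    Each tweet is a (group, sentiment) pair; the group names follow
    the study design: unrelated tweets before publication, unrelated
    tweets after publication, and study-related tweets after.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [negative, total]
    for group, sentiment in tweets:
        counts[group][1] += 1
        if sentiment == "negative":
            counts[group][0] += 1
    return {g: neg / total for g, (neg, total) in counts.items()}

tweets = [  # toy labels, for illustration only
    ("pre_unrelated", "negative"), ("pre_unrelated", "positive"),
    ("post_related", "positive"), ("post_related", "positive"),
    ("post_related", "negative"), ("post_unrelated", "negative"),
]
props = negative_proportion(tweets)
```

A regression model adds significance testing on top of such proportions, which is how the paper reaches its P<.001 comparisons.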
APA, Harvard, Vancouver, ISO, and other styles
44

Panagiotou, Nikolaos, Antonia Saravanou, and Dimitrios Gunopulos. "News Monitor: A Framework for Exploring News in Real-Time." Data 7, no. 1 (2021): 3. http://dx.doi.org/10.3390/data7010003.

Full text
Abstract:
News articles generated by online media are a major source of information. In this work, we present News Monitor, a framework that automatically collects news articles from a wide variety of online news portals and performs various analysis tasks. The framework initially identifies fresh news (first stories) and clusters articles about the same incidents. For every story, it first extracts all of the corresponding triples and then creates a knowledge base (KB) using open information extraction techniques. This knowledge base is then used to create a summary for the user. News Monitor allows users to use it as a search engine, asking questions in natural language and receiving answers generated by the state-of-the-art framework BERT. In addition, News Monitor crawls the Twitter stream using a dynamic set of “trending” keywords in order to retrieve all messages relevant to the news. The framework is distributed, online and performs analysis in real time. According to the evaluation results, the fake news detection techniques utilized by News Monitor achieve an F-measure of 82% in the rumor identification task and an accuracy of 92% in the stance detection task. The major contribution of this work can be summarized as a novel real-time and scalable architecture that combines various effective techniques under a news analysis framework.
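The first-story detection step can be sketched as a nearest-neighbour novelty test over term-frequency vectors: an article is "fresh" if nothing seen before is similar enough. This is a simplified illustration, not News Monitor's actual algorithm.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def first_stories(articles, threshold=0.5):
    """Flag an article as a first story when its nearest neighbour
    among previously seen articles falls below the threshold."""
    seen, fresh = [], []
    for text in articles:
        vec = Counter(text.lower().split())
        nearest = max((cosine(vec, s) for s in seen), default=0.0)
        if nearest < threshold:
            fresh.append(text)
        seen.append(vec)
    return fresh

articles = [
    "earthquake strikes coastal city overnight",
    "earthquake strikes coastal city rescue underway",
    "parliament passes new budget bill",
]
stories = first_stories(articles)
```

The second article is filtered out as a follow-up to the first, while the budget article surfaces as a new story.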
APA, Harvard, Vancouver, ISO, and other styles
45

Shashwat Shukla., Srishti Sinha.,, Sohan Singh &. Anupam Lakhanpal. "Jarvis: Desktop Assistant." International Journal for Modern Trends in Science and Technology 7, no. 05 (2021): 178–83. http://dx.doi.org/10.46501/ijmtst0705030.

Full text
Abstract:
“Jarvis” was the name of Tony Stark’s life assistant in the Iron Man movies. Unlike the original comic, in which Jarvis was Stark’s human butler, the movie version of Jarvis is an intelligent computer that converses with Stark, monitors his household and helps build and program his superhero suit. In this project, Jarvis is a Digital Life Assistant which mainly uses human communication means such as Twitter, instant messaging and voice to create a two-way connection between a human and his apartment: controlling lights and appliances, assisting in cooking, notifying him of breaking news, Facebook notifications and more. In our project we mainly use voice as the communication means, so Jarvis is basically a speech recognition application. The concept of speech technology really encompasses two technologies: synthesizers and recognizers. A speech synthesizer takes text as input and produces an audio stream as output. A speech recognizer, on the other hand, does the opposite: it takes an audio stream as input and turns it into a text transcription. The voice is a signal of infinite information. Direct analysis and synthesis of the complex voice signal is difficult due to the amount of information contained in the signal. Therefore, digital signal processing techniques such as feature extraction and feature matching are introduced to represent the voice signal. In this project we directly use a speech engine whose feature extraction technique is Mel-scaled frequency cepstral coefficients. The mel-scaled frequency cepstral coefficients (MFCCs) derived from Fourier transform and filter bank analysis are perhaps the most widely used front-ends in state-of-the-art speech recognition systems. Our aim is to create more and more functionality that can help humans in their daily lives and reduce their effort. In our tests we check that all this functionality works properly; we tested with two speakers (one female and one male) to assess accuracy.
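The mel-scale mapping behind MFCC front-ends is a fixed formula; it can be sketched together with the placement of filter-bank centre frequencies. These are the standard textbook definitions, independent of the project's specific speech engine.

```python
import math

def hz_to_mel(f_hz):
    """Standard mel-scale mapping used in MFCC front-ends."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse mapping, used to place triangular filter-bank edges."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_edges(f_low, f_high, n_filters):
    """Filter-bank edge frequencies (Hz) spaced evenly on the mel scale:
    n_filters triangular filters need n_filters + 2 edge points."""
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / (n_filters + 1)
    return [mel_to_hz(m_low + i * step) for i in range(n_filters + 2)]
```

The even spacing in mel units translates into narrow filters at low frequencies and wide ones at high frequencies, mimicking how human hearing resolves pitch.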
APA, Harvard, Vancouver, ISO, and other styles
46

Baru Khan Bau. "Managing the E-commerce Data Deluge through Text Analytics and Web Management (Overview of Amazon.com)." International Journal of Information Technology and Computer Science Applications 2, no. 2 (2024): 17–24. http://dx.doi.org/10.58776/ijitcsa.v2i2.147.

Full text
Abstract:
Today, more than 80% of the big data handled in the e-commerce industry is text and unstructured data. Text analytics is an automated process for analyzing text and extracting useful information from it. It can discover trends and relationships in data. Web analytics is the collection, processing, and analysis of data in order to draw conclusions to optimize usability on a website. Web analytics can be used to improve the usability of a site by analyzing user behavior patterns such as time spent on the site, abandonment rates, most frequently accessed products, click-through rates, etc. It can also help analyze the interests of different user demographics, as it tracks granular details such as user demographics, age and gender, geography, and devices used. In order to obtain up-to-date information, a business can utilize business intelligence for real-time data processing and then apply stream analysis to a continuous flow of data. For instance, a business can collect real-time information from Twitter or other social media and analyse it using social media analytics. For website management, a business can apply web analytics to analyse customer behaviour. Tracking customer activity, page views, and conversion rates helps the business determine how to improve website performance. Text analytics of comments received on Amazon can be used to group text data and produce results in terms of word frequency distribution and sentiment analysis. Text analytics can be used for decision making, improving service quality, and developing new business models.
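The word-frequency distribution and sentiment analysis described for Amazon comments can be sketched minimally. The sentiment word lists here are illustrative; a production system would use a full sentiment lexicon or a trained model.

```python
from collections import Counter
import re

# Tiny illustrative lexicons (hypothetical, not a real lexicon).
POSITIVE = {"great", "excellent", "fast", "love"}
NEGATIVE = {"broken", "slow", "poor", "refund"}

def word_frequency(reviews):
    """Word-frequency distribution over a batch of review texts."""
    tokens = []
    for text in reviews:
        tokens.extend(re.findall(r"[a-z']+", text.lower()))
    return Counter(tokens)

def lexicon_sentiment(text):
    """Crude lexicon score: count of positive minus negative word hits."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return len(words & POSITIVE) - len(words & NEGATIVE)

reviews = ["Great product, fast delivery", "Arrived broken, slow refund"]
freq = word_frequency(reviews)
```

Even this crude scoring separates clearly positive from clearly negative comments; the frequency counts feed the kind of word-distribution reports the abstract mentions.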
APA, Harvard, Vancouver, ISO, and other styles
47

Alok Soreng. "Empowering Public Trust in Vaccines for Effective Outbreak Response." Journal of Information Systems Engineering and Management 10, no. 19s (2025): 625–48. https://doi.org/10.52783/jisem.v10i19s.3104.

Full text
Abstract:
Social media sites like Twitter act as vibrant centres of worldwide communication, highly influential due to the large number of users and the constant stream of information shared. This can be both positive and negative. On the negative side, social media can be a breeding ground for misinformation and can exacerbate public anxieties, especially during crisis situations like pandemics. This paper highlights how important it is to measure public opinion on social media sites, which is critical to navigating today's digital environments, especially regarding concerns lingering after a pandemic. By analyzing these concerns, policymakers can make better decisions to address public anxieties and improve public health policies in the future. Preprocessing involves tokenization, stop word removal, stemming/lemmatization, and normalization. Each preprocessed token is then embedded using an LSTM layer to capture context and sequence. An attention mechanism focuses on crucial aspects of the text. Sentiment analysis with polarity and objectivity ranking is performed alongside context extraction. Finally, an LSTM classifier leverages both sentiment and context features to categorize the text. This approach offers a comprehensive method for text classification considering sentiment, context, and inherent textual features. The proposed approach achieved higher accuracy (92.8%) than Random Forest and LSTM models used with the same embeddings.
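The preprocessing steps the abstract lists (tokenization, stop-word removal, stemming, normalization) can be sketched as follows. This is a naive illustration; the actual pipeline likely relies on a standard NLP toolkit such as NLTK or spaCy.

```python
import re

# Tiny illustrative stop-word list; real pipelines use a fuller set.
STOP_WORDS = {"the", "is", "a", "of", "and", "to", "in"}

def preprocess(text):
    """Tokenization, normalization, stop-word removal, and a crude
    suffix-stripping stemmer, mirroring the listed preprocessing steps."""
    text = text.lower()                       # normalization
    tokens = re.findall(r"[a-z]+", text)      # tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]
    stemmed = []
    for t in tokens:                          # naive stemming
        for suffix in ("ing", "ed", "s"):
            if t.endswith(suffix) and len(t) > len(suffix) + 2:
                t = t[: -len(suffix)]
                break
        stemmed.append(t)
    return stemmed

tokens = preprocess("The vaccines are reducing anxieties in the public")
```

The resulting token sequence is what would be fed to the embedding and LSTM layers downstream.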
APA, Harvard, Vancouver, ISO, and other styles
48

Alok Soreng. "Empowering Public Trust in Vaccines for Effective Outbreak Response." Panamerican Mathematical Journal 35, no. 3s (2025): 251–72. https://doi.org/10.52783/pmj.v35.i3s.3890.

Full text
Abstract:
Social media sites like Twitter act as vibrant centres of worldwide communication, highly influential due to the large number of users and the constant stream of information shared. This can be both positive and negative. On the negative side, social media can be a breeding ground for misinformation and can exacerbate public anxieties, especially during crisis situations like pandemics. This paper highlights how important it is to measure public opinion on social media sites, which is critical to navigating today's digital environments, especially regarding concerns lingering after a pandemic. By analyzing these concerns, policymakers can make better decisions to address public anxieties and improve public health policies in the future. Preprocessing involves tokenization, stop word removal, stemming/lemmatization, and normalization. Each preprocessed token is then embedded using an LSTM layer to capture context and sequence. An attention mechanism focuses on crucial aspects of the text. Sentiment analysis with polarity and objectivity ranking is performed alongside context extraction. Finally, an LSTM classifier leverages both sentiment and context features to categorize the text. This approach offers a comprehensive method for text classification considering sentiment, context, and inherent textual features. The proposed approach achieved higher accuracy (92.8%) than Random Forest and LSTM models used with the same embeddings.
APA, Harvard, Vancouver, ISO, and other styles
49

Baqi, Md Adnan, Mohammad Sarfraz, Mohammad Umar, Aquib Jawed, and Dr Anum Kamal. "A HYBRID TRANSFORMER BASED MODEL FOR DISASTER TWEET CLASSIFICATION USING BERT AND RoBERTa." Journal of Dynamics and Control 9, no. 5 (2025): 107–14. https://doi.org/10.71058/jodac.v9i5010.

Full text
Abstract:
In the era of social media, platforms such as Twitter play an essential role in real-time disaster reporting, offering immediate access to firsthand information during emergencies. This research presents a novel hybrid deep learning model for classifying disaster-related tweets by integrating two state-of-the-art transformer architectures: BERT and RoBERTa. Our approach leverages the complementary strengths of each model by independently encoding the same tweet using both architectures and then fusing their mean-pooled representations to generate a more robust feature set for final classification. This dual-stream method captures rich semantic nuances and contextual variations, substantially improving performance on noisy, unstructured text data. Extensive experiments were conducted on a carefully curated dataset of disaster and non-disaster tweets. The experimental results demonstrate that our hybrid model significantly outperforms individual transformer models in terms of accuracy, precision, recall, and overall robustness. Detailed analysis of the combined outputs reveals that the hybrid approach effectively mitigates model-specific limitations and enhances semantic representation. This work provides valuable insights into multi-transformer fusion strategies and suggests that integrating diverse pretrained models can yield substantial improvements in natural language processing tasks for real world crisis management applications. Furthermore, these findings exhibit strong promise.
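The fusion step, mean-pooling each encoder's token representations and concatenating them, can be sketched with stand-in vectors. In practice the token vectors would come from pretrained BERT and RoBERTa; here they are plain lists to show the arithmetic.

```python
def mean_pool(token_vectors):
    """Average a sequence of token embeddings into one fixed-size vector."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in token_vectors) / n for i in range(dim)]

def fuse(bert_tokens, roberta_tokens):
    """Concatenate the mean-pooled representations of the two encoder
    streams, the fusion scheme the abstract describes."""
    return mean_pool(bert_tokens) + mean_pool(roberta_tokens)

bert_out = [[1.0, 2.0], [3.0, 4.0]]     # 2 tokens, dim 2 (stand-in values)
roberta_out = [[0.0, 1.0], [2.0, 3.0]]
fused = fuse(bert_out, roberta_out)
```

The fused vector has twice the dimensionality of either stream, so the final classifier sees both models' views of the same tweet.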
APA, Harvard, Vancouver, ISO, and other styles
50

Wongkoblap, Akkapon, Miguel A. Vadillo, and Vasa Curcin. "Deep Learning With Anaphora Resolution for the Detection of Tweeters With Depression: Algorithm Development and Validation Study." JMIR Mental Health 8, no. 8 (2021): e19824. http://dx.doi.org/10.2196/19824.

Full text
Abstract:
Background Mental health problems are widely recognized as a major public health challenge worldwide. This concern highlights the need to develop effective tools for detecting mental health disorders in the population. Social networks are a promising source of data wherein patients publish rich personal information that can be mined to extract valuable psychological cues; however, these data come with their own set of challenges, such as the need to disambiguate between statements about oneself and third parties. Traditionally, natural language processing techniques for social media have looked at text classifiers and user classification models separately, hence presenting a challenge for researchers who want to combine text sentiment and user sentiment analysis. Objective The objective of this study is to develop a predictive model that can detect users with depression from Twitter posts and instantly identify textual content associated with mental health topics. The model can also address the problem of anaphoric resolution and highlight anaphoric interpretations. Methods We retrieved the data set from Twitter by using a regular expression or stream of real-time tweets comprising 3682 users, of which 1983 self-declared their depression and 1699 declared no depression. Two multiple instance learning models were developed—one with and one without an anaphoric resolution encoder—to identify users with depression and highlight posts related to the mental health of the author. Several previously published models were applied to our data set, and their performance was compared with that of our models. Results The maximum accuracy, F1 score, and area under the curve of our anaphoric resolution model were 92%, 92%, and 90%, respectively. The model outperformed alternative predictive models, which ranged from classical machine learning models to deep learning models. 
Conclusions Our model with anaphoric resolution shows promising results when compared with other predictive models and provides valuable insights into textual content that is relevant to the mental health of the tweeter.
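The self-versus-third-party disambiguation problem this paper addresses can be illustrated with a crude heuristic: a post mentioning "my friend" is about someone else even though it contains first-person words. This is a stand-in for the trained anaphoric resolution encoder, not the paper's method.

```python
import re

FIRST_PERSON = re.compile(r"\b(i|i'm|i've|me|my|myself)\b", re.IGNORECASE)
# Hypothetical third-party cue list, for illustration only.
THIRD_PARTY = re.compile(
    r"\bmy (?:friend|mother|father|brother|sister|son|daughter)\b",
    re.IGNORECASE,
)

def about_self(post):
    """Heuristically decide whether a post is about the author or
    about a third party."""
    if THIRD_PARTY.search(post):   # "my friend ..." -> third party
        return False
    return bool(FIRST_PERSON.search(post))
```

The paper replaces such brittle pattern matching with a learned encoder, which is what lifts its accuracy on posts where the referent is ambiguous.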
APA, Harvard, Vancouver, ISO, and other styles
