Dissertations / Theses on the topic 'Sentiment Analysis Applications'

Consult the top 23 dissertations / theses for your research on the topic 'Sentiment Analysis Applications.'

1

Osika, Anton. "Statistical analysis of online linguistic sentiment measures with financial applications." Thesis, KTH, Matematisk statistik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-177106.

Abstract:
Gavagai is a company that uses different methods to aggregate sentiment towards specific topics from a large stream of documents published in real time. Gavagai wants a procedure for deciding which way of measuring sentiment (sentiment measure) towards a topic is most useful in a given context. This work discusses which criteria are desirable for aggregating sentiment, and derives and evaluates procedures for selecting "optimal" sentiment measures. Three novel models for selecting a set of sentiment measures that describe independent attributes of the aggregated data are evaluated. The models can be summarized as: maximizing the variance of the last principal component of the data; maximizing the differential entropy of the data; and, in the special case of selecting an additional sentiment measure, maximizing the unexplained variance conditional on the previous sentiment measures. When exogenous time-varying data concerning a topic is available, it can be used to select the sentiment measure that best explains the data. With this goal in mind, the hypothesis that sentiment data can be used to predict financial volatility and political poll data is tested. The null hypothesis cannot be rejected. A framework for aggregating sentiment measures in a mathematically coherent way is summarized in a road map, together with recommendations for future research.
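The third selection model named in the abstract above (choose the additional sentiment measure with the largest unexplained variance given those already chosen) can be illustrated with a minimal sketch. It assumes a single previously selected measure and a linear fit, so the unexplained variance reduces to Var(y) - Cov(x, y)^2 / Var(x). The function names and the toy series are hypothetical and are not taken from the thesis.

```python
from statistics import mean


def residual_variance(x, y):
    """Sample variance of y left unexplained by a linear fit on x."""
    mx, my = mean(x), mean(y)
    n = len(x)
    var_x = sum((a - mx) ** 2 for a in x) / (n - 1)
    var_y = sum((b - my) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return var_y - cov_xy ** 2 / var_x


def pick_additional_measure(selected, candidates):
    """Return the name of the candidate with the largest unexplained variance."""
    return max(
        candidates,
        key=lambda name: residual_variance(selected, candidates[name]),
    )


# Toy daily series (hypothetical data, for illustration only).
positivity = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "scaled_copy": [2.0, 4.0, 6.0, 8.0, 10.0],   # fully explained by positivity
    "alternating": [1.0, -1.0, 1.0, -1.0, 1.0],  # uncorrelated with positivity
}
best = pick_additional_measure(positivity, candidates)
```

Under these assumptions the redundant candidate scores close to zero, so the selection favours the measure that adds information not already carried by the existing one.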
2

Clark, Eric Michael. "Applications In Sentiment Analysis And Machine Learning For Identifying Public Health Variables Across Social Media." ScholarWorks @ UVM, 2019. https://scholarworks.uvm.edu/graddis/1006.

Abstract:
Twitter, a popular social media outlet, has evolved into a vast source of linguistic data, rich with opinion, sentiment, and discussion. We mined data from several public Twitter endpoints to identify content relevant to healthcare providers and public health regulatory professionals. We began by compiling content related to electronic nicotine delivery systems (or e-cigarettes), as these had become popular alternatives to tobacco products. There was an apparent need to remove high-frequency tweeting entities, called bots, that would spam messages and advertisements and fabricate testimonials. Algorithms were constructed using natural language processing and machine learning to sift human responses from automated accounts with high degrees of accuracy. We found that the average number of hyperlinks per tweet, the average character dissimilarity between each individual's tweets, and the rate of introduction of unique words were valuable attributes in identifying automated accounts. We performed 10-fold cross-validation and measured the performance of each set of tweet features at various bin sizes, the best of which performed with 97% accuracy. These methods were used to isolate automated content related to the advertising of electronic cigarettes. A rich taxonomy of automated entities, including robots, cyborgs, and spammers, each with different measurable linguistic features, was categorized. Electronic cigarette related posts were classified as automated or organic, and content was investigated with a hedonometric sentiment analysis. The overwhelming majority (≈ 80%) were automated, many of which were commercial in nature. Others used false testimonials that were sent directly to individuals as a personalized form of targeted marketing. 
Many tweets advertised nicotine vaporizer fluid (or e-liquid) in various “kid-friendly” flavors including 'Fudge Brownie', 'Hot Chocolate', 'Circus Cotton Candy' along with every imaginable flavor of fruit, which were long ago banned for traditional tobacco products. Others offered free trials, as well as incentives to retweet and spread the post among their own network. Free prize giveaways were also hosted, with raffle tickets issued for sharing the tweet. Due to the large youth presence on the public social media platform, this was evidence that the marketing of electronic cigarettes needed considerable regulation. Twitter has since officially banned all electronic cigarette advertising on its platform. Social media has the capacity to provide the healthcare industry with valuable feedback from patients who reveal and express their medical decision-making process, as well as self-reported quality of life indicators both during and post treatment. We have studied several active cancer patient populations discussing their experiences with the disease as well as survivorship. We experimented with a Convolutional Neural Network (CNN) as well as logistic regression to classify tweets as patient related. This led to a sample of 845 breast cancer survivor accounts to study over 16 months. We found positive sentiments regarding patient treatment, raising support, and spreading awareness. A large portion of negative sentiments were shared regarding political legislation that could result in loss of coverage of their healthcare. We refer to these online public testimonies as “Invisible Patient Reported Outcomes” (iPROs), because they carry relevant indicators, yet are difficult to capture by conventional means of self-reporting. Our methods can be readily applied across disciplines to obtain insights into a particular group of public opinions. 
Capturing iPROs and public sentiments from online communication can help inform healthcare professionals and regulators, leading to more connected and personalized treatment regimens. Social listening can provide valuable insights into public health surveillance strategies.
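The three account-level attributes the abstract above reports as useful for spotting automated accounts (hyperlinks per tweet, pairwise character dissimilarity, and the rate at which unique words appear) can be sketched as follows. This is an illustrative approximation only: the use of `difflib.SequenceMatcher` for dissimilarity, the unique-to-total word ratio, and the name `account_features` are assumptions, not the definitions used in the dissertation.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

URL_RE = re.compile(r"https?://\S+")


def account_features(tweets):
    """Compute three simple per-account features from a list of tweet texts."""
    n = len(tweets)
    links_per_tweet = sum(len(URL_RE.findall(t)) for t in tweets) / n

    # Average dissimilarity over all tweet pairs (assumed metric: 1 - ratio).
    pairs = list(combinations(tweets, 2))
    if pairs:
        dissimilarity = sum(
            1.0 - SequenceMatcher(None, a, b).ratio() for a, b in pairs
        ) / len(pairs)
    else:
        dissimilarity = 0.0

    # Share of distinct words among all words, with URLs removed first.
    words = [w for t in tweets for w in URL_RE.sub("", t).lower().split()]
    unique_word_rate = len(set(words)) / len(words) if words else 0.0

    return {
        "links_per_tweet": links_per_tweet,
        "dissimilarity": dissimilarity,
        "unique_word_rate": unique_word_rate,
    }


# Hypothetical accounts: one repeats an advert, one posts varied text.
bot = account_features(["Buy e-liquid now http://x.example/a"] * 3)
human = account_features([
    "had a great walk today",
    "coffee then work",
    "anyone seen the new film?",
])
```

Features like these would then feed a classifier; the abstract reports cross-validated accuracy but does not specify the model, so none is assumed here.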
3

Yu, Xiang. "Analysis of news sentiment and its application to finance." Thesis, Brunel University, 2014. http://bura.brunel.ac.uk/handle/2438/9062.

Abstract:
We report our investigation of how news stories influence the behaviour of tradable financial assets, in particular equities. We consider the established methods of turning news events into a quantifiable measure and explore the models which connect these measures to financial decision making and risk control. The thesis is built around two problems that are both practical and of research interest: determining trading strategies and quantifying trading risk. We have constructed a new measure which takes into consideration (i) the volume of news and (ii) the decaying effect of news sentiment. In this way we derive the impact of aggregated news events for a given asset; we define this as the impact score. We also characterise the behaviour of assets using three parameters (return, volatility and liquidity) and construct predictive models which incorporate impact scores. The derivation of the impact measure and the characterisation of asset behaviour by introducing liquidity are two innovations reported in this thesis and are claimed to be contributions to knowledge. The impact of news on asset behaviour is explored using two sets of predictive models: univariate and multivariate. In our univariate predictive models, a universe of 53 assets was considered in order to justify the relationship between news and assets across 9 different sectors. For the multivariate case, we selected 5 stocks from the financial sector only, as this is relevant for the purpose of constructing trading strategies. We have analysed the celebrated Black-Litterman model (1991) and constructed our Bayesian multivariate predictive models such that we can incorporate domain expertise to improve the predictions. Not only does this suggest one of the best ways to choose priors in Bayesian inference for financial models using news sentiment, but it also allows the use of current and synchronised data with market information. 
This is also a novel aspect of our work and a further contribution to knowledge.
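The abstract above says the impact score combines the volume of news with the decaying effect of news sentiment, but it does not give the formula. A minimal sketch of one way to combine the two is shown below, assuming exponential decay with a hypothetical 90-minute half-life and simple summation over items; `impact_score` and its parameters are illustrative names, not the thesis's definition.

```python
import math


def impact_score(news, now, half_life=90.0):
    """Sum the sentiment of past news items, each decayed by its age.

    news: iterable of (timestamp_minutes, sentiment) pairs.
    now: current time in minutes; items stamped after `now` are ignored.
    half_life: assumed decay half-life in minutes (hypothetical value).
    """
    decay = math.log(2.0) / half_life
    return sum(
        sentiment * math.exp(-decay * (now - timestamp))
        for timestamp, sentiment in news
        if timestamp <= now
    )


# One item loses half its weight after one half-life; a second, fresh item
# adds its full sentiment, so a higher news volume raises the score.
single = impact_score([(0.0, 1.0)], now=90.0)
double = impact_score([(0.0, 1.0), (90.0, 1.0)], now=90.0)
future = impact_score([(120.0, 1.0)], now=90.0)
```

Because the score is a sum, more items in a window raise it (volume), while older items contribute less (decay).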
4

Norell, Alexandra Jenny. "Application of sentiment analysis for information overload detection in an Ecommerce competitive environment." Thesis, Högskolan i Halmstad, Akademin för ekonomi, teknik och naturvetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-42065.

Abstract:
This master's thesis focuses on information overload in digital marketing and uses sentiment analysis to detect whether the issue occurs. A model and method covering different sentiments (positive and negative) were set up and evaluated on the basis of statistical and prominent findings on the emotional value of customer satisfaction in online reviews. The findings were analyzed to determine which data and categories gave evidence of information overload, and these were then related to previous academic studies of sentiment analysis and customer satisfaction in connection with information overload. The results showed that sentiment analysis was significant in some aspects and categories for combating the information overload issue in digital marketing for online consumers.
5

Chouchani, Nadia. "Une approche de détection des communautés d'intérêt dans les réseaux sociaux : application à la génération d'IHM personnalisées." Thesis, Valenciennes, 2018. http://www.theses.fr/2018VALE0048/document.

Abstract:
Nowadays, social networks are ubiquitous in all aspects of life. A fundamental feature of these networks is the connection between users, who are increasingly encouraged to contribute their own content. Social networks therefore also incorporate user creations, which calls for revisiting the methods used to analyze them. This field has given rise to a great deal of research in recent years. One of the main problems is community detection. The research presented in this thesis lies within the themes of semantic analysis of social networks and the generation of personalized interactive applications. The thesis proposes an approach for detecting communities of interest in social networks. The approach models social data as a social user profile represented by an ontology. It implements a sentiment analysis method based on the phenomena of social influence and homophily. The detected communities are exploited in the generation of personalized interactive applications. This generation is based on an MDA-type approach, independent of the application domain. In addition, the manuscript reports an evaluation of our proposals on data from real social networks.
6

Tian, Nan. "Feature taxonomy learning from user generated content and application in review selection." Thesis, Queensland University of Technology, 2016. https://eprints.qut.edu.au/101169/1/Nan_Tian_Thesis.pdf.

Abstract:
This thesis developed new methods to find useful information in massive customer-generated product review data in order to assist customers in decision making. It first examines the distinct features of review text to find useful information about the reviewed product using a number of existing techniques. Then, based upon the derived product information, it develops novel methods to assess review quality in order to find the most useful or helpful product reviews for customers.
7

El Alaoui, Imane. "Transformer les big social data en prévisions - méthodes et technologies : Application à l'analyse de sentiments." Thesis, Angers, 2018. http://www.theses.fr/2018ANGE0011/document.

Abstract:
Extracting public opinion by analyzing big social data has grown substantially owing to the interactive, real-time nature of such data. Our actions on social media generate digital traces that are closely related to our personal lives and can be used to follow major events by analyzing people's behavior. It is in this context that we are particularly interested in big data analysis methods. The volume of these daily-generated traces increases exponentially, creating massive loads of information known as big data. Such a volume of information can be neither stored nor handled with conventional tools, and so new tools have emerged to help cope with big data challenges. The first part of this manuscript therefore goes through the pros and cons of these tools, compares their respective performances and highlights some of their application areas, such as health, marketing and politics. We also introduce the general context of big data, Hadoop and its different distributions, and provide a comprehensive overview of big data tools and their related applications. The main contribution of this PhD thesis is a generic analysis approach for automatically detecting trends on given topics from big social data. Given a very small set of manually annotated hashtags, the proposed approach transfers the known sentiment (positive or negative) of the hashtags to individual words. The resulting lexical resource is a large-scale polarity lexicon whose effectiveness is measured on different sentiment analysis tasks. Comparison of our method with different paradigms in the literature confirms its benefit for designing accurate sentiment analysis systems. Our model reaches an overall accuracy of 90.21%, significantly exceeding current models for social sentiment analysis.
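The idea of transferring sentiment from a few annotated hashtags to individual words, as described in the abstract above, can be sketched with plain co-occurrence counts. This is a simplified stand-in and not the thesis's method: the scoring rule (positive minus negative counts over their sum), the whitespace tokenizer, and the names `build_polarity_lexicon` and `seed_hashtags` are all assumptions for illustration.

```python
from collections import Counter


def build_polarity_lexicon(tweets, seed_hashtags):
    """Score each word in [-1, 1] from the labelled hashtags it co-occurs with."""
    pos, neg = Counter(), Counter()
    for text in tweets:
        tokens = text.lower().split()
        labels = {seed_hashtags[t] for t in tokens if t in seed_hashtags}
        if len(labels) != 1:
            # Skip tweets with no seed hashtag or with conflicting seeds.
            continue
        target = pos if labels == {"positive"} else neg
        for word in set(tokens):
            if not word.startswith("#"):
                target[word] += 1
    return {
        word: (pos[word] - neg[word]) / (pos[word] + neg[word])
        for word in set(pos) | set(neg)
    }


# Hypothetical seed hashtags and tweets.
seeds = {"#love": "positive", "#fail": "negative"}
tweets = [
    "great phone #love",
    "great battery #love",
    "terrible service #fail",
    "phone broke #fail",
    "no seed here",
]
lexicon = build_polarity_lexicon(tweets, seeds)
```

Words seen only with positive seeds score 1.0, words seen only with negative seeds score -1.0, and words seen equally with both score 0.0; words from tweets without a seed hashtag never enter the lexicon.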
8

Cambria, Erik. "Application of common sense computing for the development of a novel knowledge-based opinion mining engine." Thesis, University of Stirling, 2011. http://hdl.handle.net/1893/6497.

Abstract:
The ways people express their opinions and sentiments have radically changed in the past few years thanks to the advent of social networks, web communities, blogs, wikis and other online collaborative media. The distillation of knowledge from this huge amount of unstructured information can be a key factor for marketers who want to create an image or identity in the minds of their customers for their product, brand, or organisation. These online social data, however, remain hardly accessible to computers, as they are specifically meant for human consumption. The automatic analysis of online opinions, in fact, involves a deep understanding of natural language text by machines, from which we are still very far. Hitherto, online information retrieval has been mainly based on algorithms relying on the textual representation of web pages. Such algorithms are very good at retrieving texts, splitting them into parts, checking the spelling and counting their words. But when it comes to interpreting sentences and extracting meaningful information, their capabilities are known to be very limited. Existing approaches to opinion mining and sentiment analysis, in particular, can be grouped into three main categories: keyword spotting, in which text is classified into categories based on the presence of fairly unambiguous affect words; lexical affinity, which assigns arbitrary words a probabilistic affinity for a particular emotion; and statistical methods, which calculate the valence of affective keywords and word co-occurrence frequencies on the basis of a large training corpus. Early works aimed to classify entire documents as containing overall positive or negative polarity, or rating scores of reviews. Such systems were mainly based on supervised approaches relying on manually labelled samples, such as movie or product reviews where the opinionist’s overall positive or negative attitude was explicitly indicated. 
However, opinions and sentiments do not occur only at document level, nor are they limited to a single valence or target. Contrary or complementary attitudes toward the same topic or multiple topics can be present across the span of a document. In more recent works, text analysis granularity has been taken down to segment and sentence level, e.g., by using the presence of opinion-bearing lexical items (single words or n-grams) to detect subjective sentences, or by exploiting association rule mining for a feature-based analysis of product reviews. These approaches, however, are still far from being able to infer the cognitive and affective information associated with natural language, as they mainly rely on knowledge bases that are still too limited to efficiently process text at sentence level. In this thesis, common sense computing techniques are further developed and applied to bridge the semantic gap between word-level natural language data and the concept-level opinions conveyed by these. In particular, the ensemble application of graph mining and multi-dimensionality reduction techniques on two common sense knowledge bases was exploited to develop a novel intelligent engine for open-domain opinion mining and sentiment analysis. The proposed approach, termed sentic computing, performs a clause-level semantic analysis of text, which allows the inference of both the conceptual and emotional information associated with natural language opinions and, hence, a more efficient passage from (unstructured) textual information to (structured) machine-processable data. The engine was tested on three different resources, namely a Twitter hashtag repository, a LiveJournal database and a PatientOpinion dataset, and its performance was compared both with results obtained using standard sentiment analysis techniques and with results obtained using different state-of-the-art knowledge bases such as Princeton’s WordNet, MIT’s ConceptNet and Microsoft’s Probase. 
Unlike most currently available opinion mining services, the developed engine does not base its analysis on a limited set of affect words and their co-occurrence frequencies, but rather on common sense concepts and the cognitive and affective valence conveyed by these. This allows the engine to be domain-independent and, hence, to be embedded in any opinion mining system for the development of intelligent applications in multiple fields such as Social Web, HCI and e-health. Looking ahead, the combined novel use of different knowledge bases and of common sense reasoning techniques for opinion mining proposed in this work will eventually pave the way for the development of more bio-inspired approaches to the design of natural language processing systems capable of handling knowledge, retrieving it when necessary, making analogies and learning from experience.
9

Dermouche, Mohamed. "Modélisation conjointe des thématiques et des opinions : application à l'analyse des données textuelles issues du Web." Thesis, Lyon 2, 2015. http://www.theses.fr/2015LYO22007/document.

Abstract:
This work is located at the junction of two domains: topic modeling and sentiment analysis. The problem that we propose to tackle is the joint and dynamic modeling of topics (subjects) and sentiments (opinions) on the Web. In the literature, the task is usually divided into sub-tasks that are treated separately. Models that operate this way fail to capture the interaction and association between topics and sentiments. In this work, we propose a joint modeling of topics and sentiments that takes into account the associations between them. We are also interested in the dynamics of topic-sentiment associations. To this end, we adopt a statistical approach based on probabilistic topic models. Our main contributions can be summarized in two points: 1. TS (Topic-Sentiment model): a new probabilistic topic model for the joint extraction of topics and sentiments. This model characterizes the extracted topics with distributions over the sentiment polarities. The goal is to discover the sentiment proportions specific to each of the extracted topics. 2. TTS (Time-aware Topic-Sentiment model): a new probabilistic model to characterize topic-sentiment dynamics. Relying on each document's time information, TTS characterizes the quantitative evolution of each extracted topic-sentiment pair. We also present two other contributions: a new evaluation framework for measuring the performance of topic-extraction methods, and a new hybrid method for sentiment detection and classification from text. This method is based on combining supervised machine learning and prior knowledge. All of the proposed methods are tested on real-world data using adapted evaluation frameworks.
10

Lawani, Abdelaziz. "THREE ESSAYS ON THE APPLICATION OF MACHINE LEARNING METHODS IN ECONOMICS." UKnowledge, 2018. https://uknowledge.uky.edu/agecon_etds/68.

Abstract:
Over the last decades, economics as a field has experienced a profound transformation from theoretical work toward an emphasis on empirical research (Hamermesh, 2013). One common constraint of empirical studies is access to data, the quality of the data and the time span it covers. In general, applied studies rely on surveys and on administrative or private sector data. These data are limited and rarely have universal or near-universal population coverage. The growth of the internet has made available a vast amount of digital information. These big digital data are generated through social networks, sensors, and online platforms. They account for an increasing part of economic activity, yet for economists their availability also raises many new challenges related to the techniques needed to collect, manage, and derive knowledge from them. The data are in general unstructured, complex and voluminous, and the traditional software used for economic research is not always effective in dealing with these types of data. Machine learning is a branch of computer science that uses statistics to deal with big data. The objective of this dissertation is to reconcile machine learning and economics. It uses three case studies to demonstrate how data freely available online can be harvested and used in economics. The dissertation uses web scraping to collect large volumes of unstructured data online. It uses machine learning methods to derive information from the unstructured data and shows how this information can be used to answer economic questions or address econometric issues. The first essay shows how machine learning can be used to derive sentiments from reviews and, using the sentiments as a measure of quality, examines an old economic theory: price competition in oligopolistic markets. The essay confirms the economic theory that agents compete on price. 
It also confirms that the quality measure derived from sentiment analysis of the reviews is a valid proxy for quality and influences price. The second essay uses a random forest algorithm to show that reviews can be harnessed to predict consumers’ preferences. The third essay shows how property descriptions can be used to address an old but still current problem in hedonic pricing models: omitted variable bias. Using the Least Absolute Shrinkage and Selection Operator (LASSO), it shows that pricing errors in hedonic models can be reduced by including the descriptions of the properties in the models.
11

Abidi, Karima. "La construction automatique de ressources multilingues à partir des réseaux sociaux : application aux données dialectales du Maghreb." Electronic Thesis or Diss., Université de Lorraine, 2019. http://www.theses.fr/2019LORR0274.

Abstract:
Natural language processing relies on language resources such as text corpora, dictionaries, sentiment lexicons, morpho-syntactic analyzers, taggers, etc. For natural languages, these resources are often available. When it comes to under-resourced languages, however, there is often a lack of tools and data. In this thesis, we are interested in certain vernacular forms of Arabic used in the Maghreb. These forms are known as dialects and can be classified as under-resourced languages. Apart from raw texts, generally extracted from social networks, there are very few resources for processing Arabic dialects. Compared with other under-resourced languages, these dialects have several specific features that make them harder to process. In particular, they lack writing rules, so users write the dialect without following strict conventions and the same word can have several spellings. Words in dialectal Arabic can be written in Arabic script and/or Latin script (a form of writing known as arabizi). The Arabic dialects of the Maghreb are particularly influenced by foreign languages such as French and English. In addition to words borrowed from these languages, another phenomenon must be taken into account in automatic dialect processing: the problem known as code-switching. This phenomenon is known in linguistics as diglossia. It leaves users free to write in several languages within the same sentence: they can start in dialectal Arabic and, in the middle of the sentence, switch to French, English, or Modern Standard Arabic. Moreover, there are several dialects within a single country and, a fortiori, several different dialects across the Arab world. It is therefore clear that classic NLP tools developed for Modern Standard Arabic cannot be used directly to process dialects.
The main objective of this work is to propose methods for automatically building resources for Arabic dialects in general and for Maghreb dialects in particular. This is our contribution to the effort made by the community working on the automatic processing of Arabic dialects. We have thus produced methods for building comparable corpora and lexical resources containing the different forms of an entry and their polarity. In addition, we developed methods for processing Modern Standard Arabic on Twitter data and on transcripts from an automatic speech recognition system operating on Arabic videos taken from Arab television channels such as Al Jazeera, France24, Euronews, etc. We also compared the opinions expressed in automatic transcriptions from different multilingual video sources covering the same subject, by developing a method based on the linguistic theory known as Appraisal.
12

"Sensing Human Sentiment via Social Media Images: Methodologies and Applications." Doctoral diss., 2018. http://hdl.handle.net/2286/R.I.50552.

Abstract:
Social media refers to computer-based technology that allows the sharing of information and the building of virtual networks and communities. With the development of internet-based services and applications, users can engage with social media via computers and smart mobile devices. In recent years, social media has taken the form of different activities such as social networking, business networking, text sharing, photo sharing, blogging, etc. With its increasing popularity, social media has accumulated a large amount of data, which makes it possible to understand human behavior. Compared with traditional survey-based methods, the analysis of social media provides a golden opportunity to understand individuals at scale, which in turn allows us to design better services tailored to individuals' needs. From this perspective, we can view social media as sensors that provide online signals, from a virtual world with no geographical boundaries, about real-world individuals' activity. One of the key features of social media is that it is social: users actively interact with each other by generating content and expressing opinions, such as posts and comments on Facebook. As a result, sentiment analysis, which refers to computational models that identify, extract, or characterize subjective information expressed in a given piece of text, has successfully employed user signals and led to many real-world applications in domains such as e-commerce, politics, marketing, etc. The goal of sentiment analysis is to classify a user's attitude towards various topics into positive, negative, or neutral categories based on textual data in social media. Recently, however, an increasing number of people have started to use photos to express their daily lives on social media platforms like Flickr and Instagram. Analyzing sentiment from visual data is therefore poised to greatly improve user understanding.
In this dissertation, I study the problem of understanding human sentiment from large-scale collections of social images, based on both image features and contextual social network features. We show that neither visual features nor textual features are by themselves sufficient for accurate sentiment prediction. We therefore provide a way of using both, and formulate the sentiment prediction problem in two scenarios: supervised and unsupervised. First, we show that the proposed framework is flexible enough to incorporate multiple modalities of information and can learn jointly from heterogeneous features given sufficient training data. Secondly, we observe that negative sentiment may be related to human mental health issues. Based on this observation, we aim to understand negative social media posts, especially posts related to depression, e.g., self-harm content. Our analysis, the first of its kind, reveals a number of important findings. Thirdly, we extend the proposed sentiment prediction task to a general multi-label visual recognition task to demonstrate the flexibility of the methodology behind our sentiment analysis model.
Doctoral Dissertation, Computer Science, 2018
13

Ou, Jen-Bing, and 歐仁彬. "Applications of fuzzy support vector machines for sentiment analysis: stock prediction and live streaming analysis." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/wf9865.

Abstract:
Doctoral dissertation, National Kaohsiung University of Science and Technology, Department of Electronic Engineering, academic year 107 (ROC calendar).
Fuzziness is considered in decision-making systems because human estimation is involved. This thesis applies fuzzy set theory to the decision model of the support vector machine (SVM) to tackle two real-life applications, combining the advantages of the SVM (good generalization ability) and of fuzzy set theory (closer to human thinking). Application I: There are many incentives to forecast stock price trends. According to the efficient market hypothesis, stock prices reflect all currently available information. In this big-data era, with the explosive growth of online news and text information, more and more institutions rely on powerful modern computers for text mining and machine learning to build more precise price-trend forecasting models. In this study, we first extract the implied topic model and emotional information from news articles. Then, a fuzzy SVM is introduced to merge the abundant information from online news, which can be used to forecast the trend of stock prices. The highest forecast accuracy was 87% for food-related stocks, 71% for semiconductor-related stocks, and 69% for computer-accessory-related stocks. As the forecast accuracy of the traditional SVM is barely above 50%, our proposed method performs significantly better than the traditional SVM forecasting model. Application II: Live streaming video has been an emerging social media platform that cannot be ignored in recent years. Compared with traditional video platforms, where pre-recorded videos are uploaded to the Internet for sharing, live streaming has two additional characteristics: immediacy and interactivity. While the live stream is running, the streamer can reply immediately to the audience's responses in the chat room, creating more differentiated content. However, when the audience is chatting actively, the streamer may miss some messages. The purpose of this study is to explore the content of the chat room through emotion mining and to interpret what the audience wants to express in a relatively simple way. This may help the streamer understand audience reactions with less effort and use them as a reference for content adjustment. The emotion mining system proposed in this study consists of three main steps. First, an "Internet chat room crawler" establishes a socket connection to the Twitch IRC server to retrieve chat room content instantly and automatically, and then applies "internet terms normalization" to the retrieved content. Second, fuzzy support vector machines classify emotions into "positive" and "negative" categories. The membership degree of positive (negative) emotions based on fuzzy set theory was shown to be more accurate. Finally, the results are presented through various visualizations, including a "word cloud", "emotional fluctuation chart", "emotional radar chart", "emotional histogram", and "emotional box plot".
14

Glorot, Xavier. "Apprentissage des réseaux de neurones profonds et applications en traitement automatique de la langue naturelle." Thèse, 2014. http://hdl.handle.net/1866/11989.

Abstract:
Machine learning aims to leverage data in order for computers to solve problems of interest. Despite being invented close to sixty years ago, Artificial Neural Networks (ANNs) remain an area of active research and a powerful tool. Their resurgence in the context of deep learning has led to dramatic improvements in various domains, from computer vision and speech processing to natural language processing. The quantity of available data and computing power keep increasing, which makes it easier to train high-capacity models such as deep ANNs. However, some intrinsic learning difficulties, such as local minima, remain problematic. Deep learning aims to find solutions to these problems, either by adding regularisation or by improving optimisation. Unsupervised pre-training and Dropout are examples of such solutions. The first two articles presented in this thesis follow this line of research. The first analyzes the problem of vanishing/exploding gradients in deep architectures. It shows that simple choices, like the activation function or the weight initialization, can have an important impact. We propose the normalized initialization scheme to improve learning. The second focuses on the activation function, where we propose the rectified linear unit. This work was the first to emphasise the use of piecewise linear activation functions for deep supervised neural networks, which are now an essential component of such models.
The last two papers show some applications of ANNs to natural language processing. The first focuses on domain adaptation in the context of sentiment analysis, using Stacked Denoising Auto-encoders. It remains state of the art to this day. The second tackles learning with multi-relational data using an energy-based model, which can also be applied to the task of word-sense disambiguation.
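As a rough illustration of the two techniques named in this abstract, the sketch below shows a normalized uniform initialization and a rectified linear unit in plain Python. The bound sqrt(6 / (fan_in + fan_out)) is the commonly cited form of the normalized initialization and is an assumption here, since the abstract does not state it; the function names, layer sizes, and input values are hypothetical.

```python
import math
import random


def normalized_init(fan_in, fan_out, rng=None):
    """Draw a fan_in x fan_out weight matrix from U[-limit, +limit].

    limit = sqrt(6 / (fan_in + fan_out)) is the commonly cited bound of the
    normalized initialization; it is an assumption of this sketch.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    weights = [[rng.uniform(-limit, limit) for _ in range(fan_out)]
               for _ in range(fan_in)]
    return weights, limit


def relu(x):
    """Rectified linear unit: piecewise linear, zero for negative inputs."""
    return max(0.0, x)


def dense_relu(inputs, weights):
    """One dense layer (no bias) followed by the rectifier."""
    fan_out = len(weights[0])
    return [relu(sum(x * row[j] for x, row in zip(inputs, weights)))
            for j in range(fan_out)]


# Hypothetical layer: 4 inputs, 3 hidden units.
weights, limit = normalized_init(fan_in=4, fan_out=3)
activations = dense_relu([1.0, -2.0, 0.5, 0.0], weights)
```

Scaling the bound by both fan-in and fan-out is meant to keep activation and gradient magnitudes roughly comparable across layers, which is the concern the abstract raises about vanishing and exploding gradients.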
15

Huang, Shu Wei, and 黃書韋. "AppCAT: Systematic Sentiment Analysis of Mobile Application Reviews." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/7uzvfp.

Abstract:
Master's thesis, National Chengchi University, Department of Management Information Systems, academic year 104 (ROC calendar).
User reviews of mobile apps often contain complaints or suggestions that are valuable for app developers seeking to improve user experience and satisfaction. However, due to the large volume and noisy nature of these reviews, manually analyzing them for useful opinions is challenging. To address this problem, we propose AppCAT, a sentiment and feature mining framework for automated review analysis. AppCAT defines initial sets of keywords from the review comments and uses a word similarity technique to expand these sets by grouping other keywords, in order to identify the product features of the apps. Furthermore, AppCAT detects the sentiment of each review and its subject (a product feature) to determine user attitudes towards the product features of a specific app. AppCAT uses these data to plot a bar chart visualizing feature polarities, helping users decide whether to consider the app. App developers can use the system to get an overview of user opinions as a basis for revision.
16

Liao, Ying-Chien, and 廖瑩蒨. "Chinese Sentiment Analysis with Application to Online Opinion Reviews." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/xwv3fe.

Abstract:
Master's thesis, Tamkang University, Department of Statistics, Master's Program in Applied Statistics, academic year 106 (ROC calendar).
In this paper, we present a semi-supervised sentiment analysis method divided into two stages. The first stage uses an unsupervised sentiment analysis approach that adopts the SO-PMI technique to build emotion lexicons for different topics and calculates an emotion score based on the topic emotion lexicon. The second stage uses a supervised sentiment analysis approach based on the support vector machine (SVM). The proposed method can automatically find the best threshold value for the first stage to select the data that need to enter the second-stage analysis. This study also considers the cost incurred by analysis errors and manual labeling. When other analysts perform similar operations in the future, they can judge whether the semi-supervised sentiment analysis method is suitable according to the proportion of loss cost in this study.
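The first stage described in this abstract can be sketched roughly as follows, assuming the usual definition of SO-PMI as the summed pointwise mutual information with positive seed words minus that with negative seed words. The toy corpus, seed words, additive smoothing, and fixed threshold are all hypothetical; the thesis selects its threshold automatically, and the second-stage SVM is not shown.

```python
import math


def so_pmi(word, docs, pos_seeds, neg_seeds, smooth=0.5):
    """Semantic orientation of `word` from document-level co-occurrence.

    SO-PMI(w) = sum_p PMI(w, p) - sum_n PMI(w, n). The additive smoothing is
    a simple choice made for this sketch so unseen pairs avoid log(0).
    """
    n = len(docs)
    doc_sets = [set(d.split()) for d in docs]

    def count(*terms):
        # Number of documents that contain every given term.
        return sum(1 for s in doc_sets if all(t in s for t in terms))

    def pmi(a, b):
        p_ab = (count(a, b) + smooth) / (n + smooth)
        p_a = (count(a) + smooth) / (n + smooth)
        p_b = (count(b) + smooth) / (n + smooth)
        return math.log2(p_ab / (p_a * p_b))

    return (sum(pmi(word, p) for p in pos_seeds)
            - sum(pmi(word, q) for q in neg_seeds))


def first_stage(doc, lexicon, threshold):
    """Label a document from lexicon scores, or defer it to stage two."""
    score = sum(lexicon.get(tok, 0.0) for tok in doc.split())
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return None  # ambiguous: hand over to the supervised classifier


# Hypothetical toy corpus and seed words.
docs = [
    "good excellent phone",
    "good excellent battery",
    "bad awful phone",
    "bad awful battery",
    "phone battery",
]
lexicon = {w: so_pmi(w, docs, ["good"], ["bad"])
           for w in ["excellent", "awful", "phone"]}
labels = [first_stage(d, lexicon, threshold=1.0)
          for d in ["excellent phone", "awful battery", "phone battery"]]
```

Documents whose lexicon score stays inside the threshold band return `None`, which marks them as the ones that would be passed on to a supervised classifier.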
17

Gordillo, Juan Pablo Guevara. "Mobile Application for Analysis of Sentiments in Twitter." Master's thesis, 2018. http://hdl.handle.net/10400.8/3488.

Abstract:
Sentiment analysis is a very popular technique for studying social networks. One of the most popular and fastest-growing social networks for microblogging is Twitter, since it allows people to express their opinions in short, simple sentences. These texts are generated daily, and for this reason people commonly want to know what the current topics are and what follows from them. In this work, we propose to implement a mobile application that gives people information, such as a degree of positive or negative polarity, on any topic relevant to society, thereby helping them make the best decision. The application will use several text classification techniques in combination. These techniques are based on machine learning and lexicons.
18

Chuang, Chih-Hsin, and 莊之信. "The Effect of Temperature on Consumption Emotion: Application of Sentiment Analysis." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/6j3x4v.

Abstract:
Master's thesis, National Chung Hsing University, Department of Marketing, academic year 106 (ROC calendar).
Sensory marketing is derived from embodied cognition theory, which explains how external stimuli influence consumers' five senses and in turn affect their perception, judgment, and behavior. Temperature cues are an important sensory stimulus. Findings on the relationship between temperature and emotion are inconsistent. This research therefore considers three important boundary conditions, namely the uncertainty of consumption and the characteristics of reviewers (i.e., the type of avatar and the degree of expertise), in the context of writing online reviews, to clarify the relationship between temperature and consumption emotion. Because of the popularity of e-commerce and social media, many consumers prefer to express their thoughts after consumption on social media. This research conducted a sentiment analysis via text mining, using the Mozenda crawling tool to retrieve a dataset from the iPeen.com website in Taiwan. The sentiment score was calculated from the review content and treated as the dependent variable (i.e., consumption emotion). The independent variable, temperature, was retrieved from the Taichung City Observatory for January 2010 to October 2017. The results of multiple regression analysis revealed that when the consumption condition is uncertain and the reviewer is a novice who uses a virtual avatar, higher temperature leads to negative consumption emotion. Conversely, when the consumption condition is certain, consumption emotion did not differ with temperature, regardless of reviewer characteristics. With an understanding of how temperature affects consumption emotion in online review content, marketers could monitor the external environment to improve their service quality and in turn strengthen the consumption experience. For online review platforms, the results of this research could help establish policies for promoting the helpfulness of reviews more effectively.
19

Veiga, Gonçalo Marmelo da Silva. "The application of sentiment analysis to a psychotherapy session : an exploratory study using four general-purpose lexicons." Master's thesis, 2019. http://hdl.handle.net/10400.12/7372.

Abstract:
Master's dissertation presented at ISPA – Instituto Universitário to obtain the degree of Master in the specialty of Clinical Psychology.
In this study we explore the application of sentiment analysis to a complete and in-person psychotherapy session. Sentiment analysis is a text mining technique that allows for the analysis, interpretation, and visualization of textual data. We investigate how we can apply a lexicon-based approach to analyze clinical session data, using four general-purpose lexicons available within an open-source statistical programming language environment, R. We conducted our study by comparing the performance of four general-purpose lexicons to the performance of n = 52 human raters, using inter-rater reliability (IRR) and intraclass correlation (ICC) measurements. Our findings suggest there is low to moderate agreement between human ratings and lexicon-generated ratings, depending on the lexicon used. There are some benefits in applying a lexicon-based sentiment analysis approach to psychotherapy session data, namely the way it efficiently processes and analyses data and allows for novel visualizations of psychotherapy data. We recommend further investigation into the application of sentiment analysis as a technique, focusing on the performance of specific-purpose lexicons. We also recommend further research into comparing the performance of lexicon-based approaches to text classification approaches to the analysis of psychotherapy data.
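To make the comparison in this abstract more concrete, the following sketch scores sentences with a toy lexicon and measures agreement with human ratings using a one-way intraclass correlation, ICC(1,1). The study itself was carried out in R and the abstract does not say which ICC form was used, so the ICC variant, the lexicon, the sentences, and the human ratings below are all hypothetical illustrations.

```python
def lexicon_score(sentence, lexicon):
    """Sum the polarity of every token found in the lexicon."""
    return sum(lexicon.get(tok, 0) for tok in sentence.lower().split())


def icc_oneway(ratings):
    """One-way random-effects ICC(1,1) for an n-targets x k-raters table."""
    n, k = len(ratings), len(ratings[0])
    row_means = [sum(row) / k for row in ratings]
    grand_mean = sum(row_means) / n
    # Between-target and within-target mean squares.
    msb = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    msw = sum((x - m) ** 2
              for row, m in zip(ratings, row_means)
              for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical toy lexicon, sentences, and human ratings.
toy_lexicon = {"happy": 1, "hopeful": 1, "sad": -1, "afraid": -1}
sentences = [
    "I feel happy today",
    "I was sad",
    "We talked about work",
    "Happy and hopeful",
]
human = [1, -2, 1, 2]

machine = [lexicon_score(s, toy_lexicon) for s in sentences]
agreement = icc_oneway([[m, h] for m, h in zip(machine, human)])
```

Here each sentence is a target and the lexicon and the human act as two raters; values of `agreement` closer to 1 indicate that the two sets of ratings coincide more closely.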
20

ZHU, JUN-YU, and 朱俊宇. "The Application of Paragraph Vector Representation with Long Short-Term Memory Neural Network in Sentiment Analysis." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/63h8d9.

Abstract:
Master's thesis, National Kaohsiung First University of Science and Technology, Department of Information Management, Master's Program, academic year 105 (ROC calendar).
Sentiment analysis is one of the main challenging tasks in natural language processing. In recent years, deep learning has achieved impressive results in natural language processing tasks. Many deep learning algorithms, including LSTM, have achieved excellent results in statistical language modeling, natural language understanding, and machine translation. This thesis tried several natural language processing algorithms for sentiment analysis, including bag of words, distributed bag of words paragraph vectors, logistic regression, LSTM, LSTM with dropout, and bidirectional LSTM. The experimental datasets are the IMDB movie reviews and Booking.com hotel reviews crawled for this study. Both are binary classification datasets. On the IMDB dataset, the results show that the accuracy of logistic regression with bag of words or paragraph vectors is close to that reported in the publication. On the Booking.com review dataset, the accuracy of LSTM is better than that of logistic regression with bag of words or paragraph vectors. The accuracies of LSTM, LSTM with dropout, and bidirectional LSTM are as follows: an average accuracy of 93.89%, a highest accuracy of 94.61%, and a lowest accuracy of 91.61%. This study concludes that the LSTM models are the most suitable for sentiment analysis of the Booking.com dataset.
21

Yu, Shu-Ti, and 俞舒禔. "Application of Sentiment Analysis in Product Comparison and Brand Recommendation System - Taking Cosmetics as an Example." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/t799s5.

22

Yan, Wun-jin, and 顏雯津. "AN ANALYSIS OF THE RELATIONSHIP BETWEEN MARKET SENTIMENT AND TAIEX TREND--WITH AN APPLICATION TO TXO VOLATILITY." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/kq366q.

Abstract:
Master's thesis, Nanhua University, Graduate Institute of Financial Management, academic year 95 (ROC calendar).
Based on the decision-making process of traders described in behavioral finance, we construct a mental volatility index from option prices in the Taiwan options market. The index takes the difference between the information implicit in option prices and the information in the market index, to capture the extra sentiment that traders' risk perception adds to their trading decisions given current market information. Since call options and put options carry different information, we use the implied volatility of call options, the implied volatility of put options, and the historical volatility of the market index to build separate volatility indexes of traders' mental sentiment for calls and for puts. We then use these volatility indexes to examine the relationship between market sentiment and the returns of the market index. The empirical results are as follows. In general, there are positive (negative) relationships between the volatility indexes for call (put) mental sentiment and stock returns. However, when market sentiment reaches extremes, the relationships are reversed. Moreover, we add information about past prices of the market index to distinguish momentum traders, news watchers, and risk demand traders, and show that under the perception of the same risk, different traders' risk perceptions exert different degrees of influence on the trend of the market index.
23

Zhang, Chuan-Heng, and 張傳珩. "Text Mining and Sentiment Analysis for the Application of the Product Recommendation-The Case of PTT Movie Board." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/62w2x2.

Abstract:
Master's thesis, Soochow University, Department of Information Management, academic year 107 (ROC calendar).
Thanks to improvements in internet technology and the spread of smart devices, a huge variety of information and many kinds of social media platforms are available. Nowadays, people prefer to search for comments and information on the internet rather than ask others' opinions before making purchases. However, the amount of information on the internet is massive. When people search comments by keyword, they have to read many texts and pages, which takes a lot of time. This research uses "subject analysis" and "emotional analysis" to help people find diverse sentiment analysis results for movies, so that they do not have to review many comments to understand how a movie is evaluated. Using half a year of comments collected from PTT, this research analyzes adjectives to obtain an emotional score and uses the score to build movie recommendations. It then analyzes topics to obtain topic models that incorporate the emotional scores of the analyzed words, in order to recommend movies people prefer.