Academic literature on the topic 'Unsupervised short texts clustering'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Unsupervised short texts clustering.'

Journal articles on the topic "Unsupervised short texts clustering"

1

Dai, Zuhua, Kelong Li, Hongyi Li, and Xiaoting Li. "An Unsupervised Learning Short Text Clustering Method." Journal of Physics: Conference Series 1650 (October 2020): 032090. http://dx.doi.org/10.1088/1742-6596/1650/3/032090.

2

Gong, Xiaolong, Linpeng Huang, and Fuwei Wang. "Feature Sampling Based Unsupervised Semantic Clustering for Real Web Multi-View Content." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 102–9. http://dx.doi.org/10.1609/aaai.v33i01.3301102.

Abstract:
Real web datasets are often associated with multiple views such as long and short commentaries, users preference and so on. However, with the rapid growth of user generated texts, each view of the dataset has a large feature space and leads to the computational challenge during matrix decomposition process. In this paper, we propose a novel multi-view clustering algorithm based on the non-negative matrix factorization that attempts to use feature sampling strategy in order to reduce the complexity during the iteration process. In particular, our method exploits unsupervised semantic informatio
3

Zupan, Katja, Nikola Ljubešić, and Tomaž Erjavec. "How to tag non-standard language: Normalisation versus domain adaptation for Slovene historical and user-generated texts." Natural Language Engineering 25, no. 5 (2019): 651–74. http://dx.doi.org/10.1017/s1351324919000366.

Abstract:
Part-of-speech (PoS) tagging of non-standard language with models developed for standard language is known to suffer from a significant decrease in accuracy. Two methods are typically used to improve it: word normalisation, which decreases the out-of-vocabulary rate of the PoS tagger, and domain adaptation where the tagger is made aware of the non-standard language variation, either through supervision via non-standard data being added to the tagger’s training set, or via distributional information calculated from raw texts. This paper investigates the two approaches, normalisation and
4

Park, Dongjoo, and Laurence R. Rilett. "Forecasting Multiple-Period Freeway Link Travel Times Using Modular Neural Networks." Transportation Research Record: Journal of the Transportation Research Board 1617, no. 1 (1998): 163–70. http://dx.doi.org/10.3141/1617-23.

Abstract:
With the advent of route guidance systems (RGS), the prediction of short-term link travel times has become increasingly important. For RGS to be successful, the calculated routes should be based on not only historical and real-time link travel time information but also anticipatory link travel time information. An examination is conducted on how real-time information gathered as part of intelligent transportation systems can be used to predict link travel times for one through five time periods (of 5 minutes’ duration). The methodology developed consists of two steps. First, the historical link
5

Kocher, Mirco, and Jacques Savoy. "Evaluation of text representation schemes and distance measures for authorship linking." Digital Scholarship in the Humanities 34, no. 1 (2018): 189–207. http://dx.doi.org/10.1093/llc/fqy013.

Abstract:
Based on n text excerpts, the authorship linking task is to determine a way to link pairs of documents written by the same person together. This problem is closely related to authorship attribution questions, and its solution can be used in the author clustering task. However, no training information is provided and the solution must be unsupervised. To achieve this, various text representation strategies can be applied, such as characters, punctuation symbols, or letter n-grams as well as words, lemmas, Part-Of-Speech (POS) tags, and sequences of them. To estimate the stylistic dista
6

Feng, Cong, Mingjian Cui, Bri-Mathias Hodge, Siyuan Lu, Hendrik F. Hamann, and Jie Zhang. "Unsupervised Clustering-Based Short-Term Solar Forecasting." IEEE Transactions on Sustainable Energy 10, no. 4 (2019): 2174–85. http://dx.doi.org/10.1109/tste.2018.2881531.

7

Rabinovich, Ella, and Shuly Wintner. "Unsupervised Identification of Translationese." Transactions of the Association for Computational Linguistics 3 (December 2015): 419–32. http://dx.doi.org/10.1162/tacl_a_00148.

Abstract:
Translated texts are distinctively different from original ones, to the extent that supervised text classification methods can distinguish between them with high accuracy. These differences were proven useful for statistical machine translation. However, it has been suggested that the accuracy of translation detection deteriorates when the classifier is evaluated outside the domain it was trained on. We show that this is indeed the case, in a variety of evaluation scenarios. We then show that unsupervised classification is highly accurate on this task. We suggest a method for determining the c
8

Öztürk, Ömer Faruk, Alessandro Pigoni, Julian Wenzel, et al. "O6.4. ASSOCIATION BETWEEN CLUSTERS OF FORMAL THOUGHT DISORDERS SEVERITY AND NEUROCOGNITIVE AND FUNCTIONAL OUTCOME INDICES IN THE EARLY STAGES OF PSYCHOSIS – RESULTS FROM THE PRONIA COHORT." Schizophrenia Bulletin 46, Supplement_1 (2020): S14–S15. http://dx.doi.org/10.1093/schbul/sbaa028.033.

Abstract:
Background: Formal thought disorder (FThD) has been associated with more severe illness courses and functional deficits in psychosis patients. Given these associations, it remains unclear whether the presence of FThD accounts for the heterogeneous presentation of psychoses, and whether it characterises a specific subgroup of patients showing prominent differential illness severity, neurocognitive and functional impairments already in the early stages of psychosis. Thus, our aim is 1) to evaluate whether there are stable subtypes of patients with Recent-Onset Psychosis (ROP) that are ch
9

Kozlowski, Marek, and Henryk Rybinski. "Clustering of semantically enriched short texts." Journal of Intelligent Information Systems 53, no. 1 (2018): 69–92. http://dx.doi.org/10.1007/s10844-018-0541-4.

10

Wang, Yu, Lihui Wu, and Hongyu Shao. "Clusters Merging Method for Short Texts Clustering." Open Journal of Social Sciences 02, no. 09 (2014): 186–92. http://dx.doi.org/10.4236/jss.2014.29032.


Dissertations / Theses on the topic "Unsupervised short texts clustering"

1

Yahaya, Alassan Mahaman Sanoussi. "Amélioration du système de recueils d'information de l'entreprise Semantic Group Company grâce à la constitution de ressources sémantiques." Thesis, Paris 10, 2017. http://www.theses.fr/2017PA100086/document.

Abstract:
Taking the semantic aspect of textual data into account during classification has emerged as a real challenge over the past ten years. This difficulty is compounded by the fact that most of the data available on social networks are short texts, which notably makes methods based on the "bag of words" representation ineffective. The approach proposed in this research project differs from those proposed in earlier work on enriching short messages, for three reasons. First of all,
2

Hang, Sijia. "Clustering Short Texts: Categorizing Initial Utterances from Customer Service Dialogue Agents." Thesis, Uppsala universitet, Institutionen för lingvistik och filologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-453814.

Abstract:
Text classification involves labeled data, which is not always available, or requires expensive manual labour. User-generated short texts are being produced in abundance in customer service sectors through transcripts of phone calls or chats online. This kind of unstructured textual data can be noisy and thus poses challenges to unsupervised classification methods developed for standard documents such as news articles. This thesis project explores some possible methods of unsupervised classification of user-generated short texts in Swedish on a real-world dataset of short texts collected from fi
3

Dail, Mathias. "Clustering unstructured life sciences experiments with unsupervised machine learning : Natural language processing for unstructured life sciences texts." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-265549.

Abstract:
The purpose of this master’s thesis is to analyse different types of document representations in the context of improving, in an unsupervised manner, the searchability of unstructured textual life sciences experiments by clustering similar experiments together. The challenge is to produce, analyse and compare different representations of the life sciences data by using traditional and advanced unsupervised Machine learning models. The text data analysed in this work is noisy and very heterogeneous, as it comes from a real-world Electronic Lab Notebook. Clustering unstructured and unlabeled tex
4

Pinto, Avendaño David Eduardo. "On Clustering and Evaluation of Narrow Domain Short-Text Corpora." Doctoral thesis, Universitat Politècnica de València, 2008. http://hdl.handle.net/10251/2641.

Abstract:
This doctoral thesis investigates the problem of clustering special sets of documents called narrow-domain short texts. To carry out this task, various corpora and clustering methods were analysed. Moreover, several corpus evaluation measures, term selection techniques, and clustering validity measures were introduced in order to study the following problems: - Determining the relative difficulty of clustering a corpus and studying some of its characteristics, such as text length,

Book chapters on the topic "Unsupervised short texts clustering"

1

Dey, Lipika, Kunal Ranjan, Ishan Verma, and Abir Naskar. "A Semantic Overlapping Clustering Algorithm for Analyzing Short-Texts." In Rough Sets. Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-47160-0_43.

2

Pinto, David, José-Miguel Benedí, and Paolo Rosso. "Clustering Narrow-Domain Short Texts by Using the Kullback-Leibler Distance." In Computational Linguistics and Intelligent Text Processing. Springer Berlin Heidelberg, 2007. http://dx.doi.org/10.1007/978-3-540-70939-8_54.

3

Popova, Svetlana, Vera Danilova, and Artem Egorov. "Clustering Narrow-Domain Short Texts Using K-Means, Linguistic Patterns and LSI." In Communications in Computer and Information Science. Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-12580-0_18.

4

Chen, Xin, Yuqing Zhang, Long Cao, and Donghui Li. "An Improved Feature Selection Method for Chinese Short Texts Clustering Based on HowNet." In Lecture Notes in Electrical Engineering. Springer International Publishing, 2013. http://dx.doi.org/10.1007/978-3-319-01766-2_73.

5

Yan, Yingying, Ruizhang Huang, Can Ma, et al. "Improving Document Clustering for Short Texts by Long Documents via a Dirichlet Multinomial Allocation Model." In Web and Big Data. Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-63579-8_47.

6

Torney, Rosemary, John Yearwood, Peter Vamplew, and Andrei V. Kelarev. "Applications of Machine Learning for Linguistic Analysis of Texts." In Machine Learning Algorithms for Problem Solving in Computational Applications. IGI Global, 2012. http://dx.doi.org/10.4018/978-1-4666-1833-6.ch008.

Abstract:
This chapter describes a novel multistage method for linguistic clustering of large collections of texts available on the Internet as a precursor to linguistic analysis of these texts. This method addresses the practicalities of applying clustering operations to a very large set of text documents by using a combination of unsupervised clustering and supervised classification. The method relies on creating a multitude of independent clusterings of a randomized sample selected from the International Corpus of Learner English. Several consensus functions and sophisticated algorithms are applied in two substages to combine these independent clusterings into one final consensus clustering, which is then used to train fast classifiers in order to enable them to perform the profiling of very large collections of text and web data. This approach makes it possible to apply advanced highly accurate and sophisticated clustering techniques by combining them with fast supervised classification algorithms. For the effectiveness of this multistage method it is crucial to determine how well the supervised classification algorithms are going to perform at the final stage, when they are used to process large data sets available on the Internet. This performance may also serve as an indication of the quality of the combined consensus clustering obtained in the preceding stages. The authors’ experimental results compare the performance of several classification algorithms incorporated in this multistage scheme and demonstrate that several of these classification algorithms achieve very high precision and recall and can be used in practical implementations of their method.
7

Connolly, Andrew J., Jacob T. VanderPlas, and Alexander Gray. "Classification." In Statistics, Data Mining, and Machine Learning in Astronomy. Princeton University Press, 2014. http://dx.doi.org/10.23943/princeton/9780691151687.003.0009.

Abstract:
Chapter 6 described techniques for estimating joint probability distributions from multivariate data sets and for identifying the inherent clustering within the properties of sources. This approach can be viewed as the unsupervised classification of data. If, however, we have labels for some of these data points (e.g., an object is tall, short, red, or blue) we can utilize this information to develop a relationship between the label and the properties of a source. We refer to this as supervised classification, which is the focus of this chapter. The motivation for supervised classification comes from the long history of classification in astronomy. Possibly the most well known of these classification schemes is that defined by Edwin Hubble for the morphological classification of galaxies based on their visual appearance. This chapter discusses generative classification, k-nearest-neighbor classifier, discriminative classification, support vector machines, decision trees, and evaluating classifiers.

Conference papers on the topic "Unsupervised short texts clustering"

1

Banerjee, Somnath, Krishnan Ramanathan, and Ajay Gupta. "Clustering short texts using wikipedia." In the 30th annual international ACM SIGIR conference. ACM Press, 2007. http://dx.doi.org/10.1145/1277741.1277909.

2

Singh, Neetu, and Narendra S. Chaudhari. "A novel clustering technique for short texts." In 2016 5th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO). IEEE, 2016. http://dx.doi.org/10.1109/icrito.2016.7784956.

3

Goswami, Koustava, Rajdeep Sarkar, Bharathi Raja Chakravarthi, Theodorus Fransen, and John P. McCrae. "Unsupervised Deep Language and Dialect Identification for Short Texts." In Proceedings of the 28th International Conference on Computational Linguistics. International Committee on Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.coling-main.141.

4

Glavaš, Goran, Federico Nanni, and Simone Paolo Ponzetto. "Unsupervised Cross-Lingual Scaling of Political Texts." In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers. Association for Computational Linguistics, 2017. http://dx.doi.org/10.18653/v1/e17-2109.

5

He, Yunchao, Chin-Sheng Yang, Liang-Chih Yu, K. Robert Lai, and Weiyi Liu. "Sentiment classification of short texts based on semantic clustering." In 2015 International Conference on Orange Technologies (ICOT). IEEE, 2015. http://dx.doi.org/10.1109/icot.2015.7498505.

6

Rangarajan Sridhar, Vivek Kumar. "Unsupervised Topic Modeling for Short Texts Using Distributed Representations of Words." In Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing. Association for Computational Linguistics, 2015. http://dx.doi.org/10.3115/v1/w15-1526.

7

Jin, Ou, Nathan N. Liu, Kai Zhao, Yong Yu, and Qiang Yang. "Transferring topical knowledge from auxiliary long texts for short text clustering." In the 20th ACM international conference. ACM Press, 2011. http://dx.doi.org/10.1145/2063576.2063689.

8

Mei, Mei, Xinyu Guo, Belinda C. Williams, et al. "Using Semantic Clustering And Autoencoders For Detecting Novelty In Corpora Of Short Texts." In 2018 International Joint Conference on Neural Networks (IJCNN). IEEE, 2018. http://dx.doi.org/10.1109/ijcnn.2018.8489431.

9

Yuan, Lingling. "An effective Chinese short message texts clustering algorithm based on the ward's method." In 2011 2nd International Conference on Artificial Intelligence, Management Science and Electronic Commerce (AIMSEC). IEEE, 2011. http://dx.doi.org/10.1109/aimsec.2011.6010901.