Academic literature on the topic 'Text dataset'

Author: Grafiati

Published: 29 September 2021

Last updated: 18 July 2025

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Text dataset.'

Journal articles on the topic "Text dataset"

Khan, Shafiq Ur Rehman, and Muhammad Arshad Islam. "Event-Dataset: Temporal information retrieval and text classification dataset." Data in Brief 25 (August 2019): 104048. http://dx.doi.org/10.1016/j.dib.2019.104048.

Assad, Ali, Abdul Hadi M. Alaidi, Amjad Yousif Sahib, Haider TH Salim ALRikabi, and Ahmed Magdy. "Transformer-based automatic Arabic text diacritization." Sustainable Engineering and Innovation 6, no. 2 (2024): 285–96. https://doi.org/10.37868/sei.v6i2.id305.

Abstract:

In Arabic natural language processing (NLP), automatic text diacritization is a major obstacle, and progress has been slow when compared to other language processing tasks. Automatic diacritical marking of Arabic text is proposed in this work using the first transformer-based paradigm designed solely for this task. By taking advantage of the attention mechanism, our system is able to capture more of the innate patterns in Arabic, surpassing the performance of both rule-based alternatives and neural network techniques. The model trained with the Clean-50 dataset had a diacritic error rate (DER)

Васильев, А. А., and А. С. Нестеров. "APPLYING TEXT QUESTIONS GENERATION ALGORITHMS FOR AUTOMATIC TEST GENERATION." Proceedings in Cybernetics 22, no. 3 (2023): 17–22. http://dx.doi.org/10.35266/1999-7604-2023-3-17-22.

Abstract:

The article presents findings of manual, semi-automatic, and automatic approaches to generate test questions based on such methods as annotation, keyword extraction, and learning datasets for compiling tests for studying material, along with a description of each method algorithm, examples of generated questions, and their quality assessment. These examples demonstrate the advantages of an algorithm for generating a method using a dataset and a combination of methods, as well as their possible practical application.

Saeed, Ari M. "AN AUTOMATED NEW APPROACH IN FAST TEXT CLASSIFICATION: A CASE STUDY FOR KURDISH TEXT." Science Journal of University of Zakho 12, no. 3 (2024): 330–36. http://dx.doi.org/10.25271/sjuoz.2024.12.3.1296.

Abstract:

With the rapid development of internet technology, text classification has become a vital part of obtaining quick and accurate data. Traditional machine learning methods often suffer from poor performance and high-dimensional feature spaces, which reduce their accuracy. In this paper, the FastText model is proposed as the first-ever classifier on Kurdish text and the results are compared with traditional machine learning methods to show the effects on Kurdish Text. For evaluating the model four datasets Kurdish News Dataset Headlines (KNDH), Medical Kurdish Dataset (MKD), Kurdish-Emotional-Dat

O, Hyon-Gwang, Myong-Chol Kim, Il-Nam Pak, Un-Hyok Choe, and Chol-Jun O. "RanPil: New Dataset and Benchmark for Offline Handwritten Korean Text Recognition." International Journal on Data Science and Technology 11, no. 2 (2025): 27–34. https://doi.org/10.11648/j.ijdst.20251102.12.

Abstract:

In recent years, since deep learning technology has been applied to handwritten text recognition, the need for handwritten document image datasets has been growing more and more. In particular, the development of the dataset is of great significance for improving the performance of handwritten Korean text recognition because no dataset for handwritten Korean text recognition has been published. In this paper, we present “RanPil”, a new training and performance evaluation dataset for handwritten Korean text recognition, which consists of a total of 8,600 pages of images (182,000 text lines and

Maekawa, Aru, Satoshi Kosugi, Kotaro Funakoshi, and Manabu Okumura. "DiLM: Distilling Dataset into Language Model for Text-level Dataset Distillation." Journal of Natural Language Processing 32, no. 1 (2025): 252–82. https://doi.org/10.5715/jnlp.32.252.

Kolesov, Anton, Dmitry Kamyshenkov, Maria Litovchenko, Elena Smekalova, Alexey Golovizin, and Alex Zhavoronkov. "On Multilabel Classification Methods of Incompletely Labeled Biomedical Text Data." Computational and Mathematical Methods in Medicine 2014 (2014): 1–11. http://dx.doi.org/10.1155/2014/781807.

Abstract:

Multilabel classification is often hindered by incompletely labeled training datasets; for some items of such dataset (or even for all of them) some labels may be omitted. In this case, we cannot know if any item is labeled fully and correctly. When we train a classifier directly on incompletely labeled dataset, it performs ineffectively. To overcome the problem, we added an extra step, training set modification, before training a classifier. In this paper, we try two algorithms for training set modification: weighted k-nearest neighbor (WkNN) and soft supervised learning (SoftSL). Both of the

Tian, Jing, Wushour Slamu, Miaomiao Xu, Chunbo Xu, and Xue Wang. "Research on Aspect-Level Sentiment Analysis Based on Text Comments." Symmetry 14, no. 5 (2022): 1072. http://dx.doi.org/10.3390/sym14051072.

Abstract:

Sentiment analysis is the processing of textual data and giving positive or negative opinions to sentences. In the ABSA dataset, most sentences contain one aspect of sentiment polarity, or sentences of one aspect have multiple identical sentiment polarities, which weakens the sentiment polarity of the ABSA dataset. Therefore, this paper uses the SemEval 14 Restaurant Review dataset, in which each document is symmetrically divided into individual sentences, and two versions of the datasets ATSA and ACSA are created. ATSA: Aspect Term Sentiment Analysis Dataset. ACSA: Aspect Category Sentiment A

Zhao, Huanhuan, Haihua Chen, Thomas A. Ruggles, Yunhe Feng, Debjani Singh, and Hong-Jun Yoon. "Improving Text Classification with Large Language Model-Based Data Augmentation." Electronics 13, no. 13 (2024): 2535. http://dx.doi.org/10.3390/electronics13132535.

Abstract:

Large Language Models (LLMs) such as ChatGPT possess advanced capabilities in understanding and generating text. These capabilities enable ChatGPT to create text based on specific instructions, which can serve as augmented data for text classification tasks. Previous studies have approached data augmentation (DA) by either rewriting the existing dataset with ChatGPT or generating entirely new data from scratch. However, it is unclear which method is better without comparing their effectiveness. This study investigates the application of both methods to two datasets: a general-topic dataset (Re

Dissertations / Theses on the topic "Text dataset"

Zakaria, Suliman Zubi. "Retrieving Electronic Data Interchange (EDI) Dataset using Text Mining Methods." Thesis, Сумський державний університет, 2012. http://essuir.sumdu.edu.ua/handle/123456789/28658.

Abstract:

The internet is a huge source of documents, containing a massive number of texts in multiple languages on a wide range of topics. These texts are presented in an electronic document format hosted on the web. The documents are exchanged using special forms in an Electronic Data Interchange (EDI) environment. Using web text mining approaches to mine documents in an EDI environment could open challenging new directions in web text mining. Applying text-mining approaches to discover previously unknown patterns retrieved from the web documents by using partitioned clu

Sharma, Nabin. "Multi-lingual Text Processing from Videos." Thesis, Griffith University, 2015. http://hdl.handle.net/10072/367489.

Abstract:

Advances in digital technology have produced low priced portable imaging devices such as digital cameras attached to mobile phones, camcorders, PDA’s etc. which are highly portable. These devices can be used to capture videos and images at ease, which can be shared through the internet and other communication media. In the commercial domain, cameras are used to create news, advertisement videos and other forms of material for information communication. The use of multiple languages to create information for targeted audiences is quite common in countries having mul

Milintsevich, Kirill. "Estimation of depression level from text: symptom-based approach, external knowledge, dataset validity." Electronic Thesis or Diss., Normandie, 2024. http://www.theses.fr/2024NORMC227.

Abstract:

Major depressive disorder (MDD) is one of the most widespread mental disorders in the world, often leading to disability and an increased risk of suicide. The recent coronavirus (COVID-19) pandemic has driven up depression rates worldwide. In addition, stigma and limited access to treatment hinder proper diagnosis and care for many people. Preliminary studies have shown that depressed and non-depressed people use different vocabulary. For example, depressed people tend to use more nega

Wu, Yingyu. "Using Text based Visualization in Data Analysis." Kent State University / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=kent1398079502.

Shrimpton, Luke William. "Efficient techniques for streaming cross document coreference resolution." Thesis, University of Edinburgh, 2017. http://hdl.handle.net/1842/28895.

Abstract:

Large text streams are commonplace; news organisations are constantly producing stories and people are constantly writing social media posts. These streams should be analysed in real-time so useful information can be extracted and acted upon instantly. When natural disasters occur people want to be informed, when companies announce new products financial institutions want to know and when celebrities do things their legions of fans want to feel involved. In all these examples people care about getting information in real-time (low latency). These streams are massively varied, people’s interest

Ryan, Elisabeth. "Towards word alignment and dataset creation for shorthand documents and transcripts." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-452278.

Abstract:

Analysing handwritten texts and creating labelled data sets can facilitate novel research on languages and advanced computerized analysis of authors works. However, few handwritten works have word wise labelling or data sets associated with them. More often a transcription of the text is available, but without any exact coupling between words in the transcript and word representations in the document images. Can an algorithm be created that will take only an image of handwritten text and a corresponding transcript and return a partial alignment and data set? An algorithm is developed in this t

Baraheem, Samah Saeed. "Text to Image Synthesis via Mask Anchor Points and Aesthetic Assessment." University of Dayton / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=dayton158800567702413.

Belay, Birhanu Hailu. "Deep Learning for Amharic Text-Image Recognition: Algorithm, Dataset and Application." Supervised by Didier Stricker. Technische Universität Kaiserslautern, 2021. http://d-nb.info/1229436308/34.

Monsen, Julius. "Building high-quality datasets for abstractive text summarization : A filtering‐based method applied on Swedish news articles." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-176352.

Abstract:

With an increasing amount of information on the internet, automatic text summarization could potentially make content more readily available for a larger variety of people. Training and evaluating text summarization models require datasets of sufficient size and quality. Today, most such datasets are in English, and for minor languages such as Swedish, it is not easy to obtain corresponding datasets with handwritten summaries. This thesis proposes methods for compiling high-quality datasets suitable for abstractive summarization from a large amount of noisy data through characterization and fi

Lewis, Jonathan Scott. "The Role of Work Experiences in College Student Leadership Development: Evidence From a National Dataset and a Text Mining Approach to Examining Beliefs About Leadership." Thesis, Boston College, 2017. http://hdl.handle.net/2345/bc-ir:107652.

Abstract:

Thesis advisor: Heather Rowan-Kenyon. Paid employment is one of the most common extracurricular activities among full-time undergraduates, and an array of studies has attempted to measure its impact. Methodological concerns with the extant literature, however, make it difficult to draw reliable conclusions. Furthermore, the research on working college students has little to say about relationships between employment and leadership development, a key student learning outcome. This study addressed these gaps in two ways, using a national sample of 77,489 students from the 2015 Multi-Institutio

Books on the topic "Text dataset"

Shi, Feng. Learn About Text Pre-Processing in R With Data From How ISIS Uses Twitter Dataset (2016). SAGE Publications, Ltd., 2019. http://dx.doi.org/10.4135/9781526488909.

Shi, Feng. Learn About Text Pre-Processing in Python With Data From How ISIS Uses Twitter Dataset (2016). SAGE Publications, Ltd., 2019. http://dx.doi.org/10.4135/9781526497864.

Shi, Feng. Learn About Basic Concepts in Text Analysis in R With Data From How ISIS Uses Twitter Dataset (2016). SAGE Publications, Ltd., 2019. http://dx.doi.org/10.4135/9781526488626.

Shi, Feng. Learn About Basic Concepts in Text Analysis in Python With Data From How ISIS Uses Twitter Dataset (2016). SAGE Publications, Ltd., 2019. http://dx.doi.org/10.4135/9781526497796.

Shi, Feng. Learn About Term Frequency–Inverse Document Frequency in Text Analysis in R With Data From How ISIS Uses Twitter Dataset (2016). SAGE Publications, Ltd., 2019. http://dx.doi.org/10.4135/9781526489012.

Shi, Feng. Learn About Term Frequency–Inverse Document Frequency in Text Analysis in Python With Data From How ISIS Uses Twitter Dataset (2016). SAGE Publications, Ltd., 2019. http://dx.doi.org/10.4135/9781526498038.

Wiesen, Christopher. Learn to Use the Kolmogorov–Smirnov Test in Stata With the Cardiac Catheterization Diagnostic Dataset (2018). SAGE Publications, Ltd., 2019. http://dx.doi.org/10.4135/9781526489302.

Scott Jones, Julie. Learn to Test for Multicollinearity in SPSS With Data From the English Health Survey (Teaching Dataset) (2002). SAGE Publications, Ltd., 2019. http://dx.doi.org/10.4135/9781526485793.

Scott Jones, Julie. Learn to Test for Multicollinearity in R With Data From the English Health Survey (Teaching Dataset) (2002). SAGE Publications, Ltd., 2019. http://dx.doi.org/10.4135/9781526498670.

Scott Jones, Julie. Learn to Use the Kaiser-Meyer-Olkin Test in SPSS With Data From the Northern Ireland Life and Times Survey: Lesbian, Gay, Bisexual, and Transgender Issues Teaching Dataset (Open Access Dataset) (2012). SAGE Publications, Ltd., 2019. http://dx.doi.org/10.4135/9781526486745.

Book chapters on the topic "Text dataset"

Aghaebrahimian, Ahmad. "Quora Question Answer Dataset." In Text, Speech, and Dialogue. Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-64206-2_8.

Hao, Yanrong, Bo Chen, and Xiaobing Zhao. "TiLTS: Tibetan Long Text Summarization Dataset." In Lecture Notes in Computer Science. Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-9440-9_21.

Iwamura, Masakazu, Takahiro Matsuda, Naoyuki Morimoto, Hitomi Sato, Yuki Ikeda, and Koichi Kise. "Downtown Osaka Scene Text Dataset." In Lecture Notes in Computer Science. Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-46604-0_32.

Svoboda, Lukáš, and Tomáš Brychcín. "Czech Dataset for Semantic Textual Similarity." In Text, Speech, and Dialogue. Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-00794-2_23.

Rafique, Aftab, and M. Ishtiaq. "UOHTD: Urdu Offline Handwritten Text Dataset." In Frontiers in Handwriting Recognition. Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-21648-0_34.

Sowański, Marcin, and Artur Janicki. "Leyzer: A Dataset for Multilingual Virtual Assistants." In Text, Speech, and Dialogue. Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-58323-1_51.

Saxena, Prateek, and Soma Paul. "EPIE Dataset: A Corpus for Possible Idiomatic Expressions." In Text, Speech, and Dialogue. Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-58323-1_9.

Saxena, Prateek, and Soma Paul. "Labelled EPIE: A Dataset for Idiom Sense Disambiguation." In Text, Speech, and Dialogue. Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-83527-9_18.

Yang, Zhongliang, Jin He, Siyu Zhang, Jinshuai Yang, and Yongfeng Huang. "TStego-THU: Large-Scale Text Steganalysis Dataset." In Advances in Artificial Intelligence and Security. Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-78621-2_27.

Yin, Xu-Cheng, Chun Yang, and Chang Liu. "Open-Set Text Recognition: Concept, Dataset, Protocol, and Framework." In Open-Set Text Recognition. Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-0361-6_3.

Conference papers on the topic "Text dataset"

Mareen, Hannes, Dimitrios Karageorgiou, Glenn Van Wallendael, Peter Lambert, and Symeon Papadopoulos. "TGIF: Text-Guided Inpainting Forgery Dataset." In 2024 IEEE International Workshop on Information Forensics and Security (WIFS). IEEE, 2024. https://doi.org/10.1109/wifs61860.2024.10810690.

Al-Dulaimi, Ahmed, Hala Adnan Fadel, and Maryam K. Hasan. "Ultimate Arabic News Dataset: A New Efficient Dataset for Arabic Text Classification." In 2024 10th International Engineering Conference on Advances in Computer and Civil Engineering (IEC). IEEE, 2024. https://doi.org/10.1109/iec61018.2024.11063800.

Wang, Hao, Zhengdong Lu, Hang Li, and Enhong Chen. "A Dataset for Research on Short-Text Conversations." In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2013. http://dx.doi.org/10.18653/v1/d13-1096.

Madanbhavi, Lalitha, Padmashree Desai, Neha Dhirendra Sirur, Ananya Deshpande, Risheek V. Hiremath, and Chetan M. Patil. "An Efficient Multilingual Text Classification using IndicCorp dataset." In 2024 5th IEEE Global Conference for Advancement in Technology (GCAT). IEEE, 2024. https://doi.org/10.1109/gcat62922.2024.10923964.

Xie, Zeyu, Xuenan Xu, Zhizheng Wu, and Mengyue Wu. "AudioTime: A Temporally-aligned Audio-text Benchmark Dataset." In ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2025. https://doi.org/10.1109/icassp49660.2025.10889879.

Abdiansah, Abdiansah, Novi Yusliani, Fathoni Fathoni, Muhammad Fazri Nizar, Aulia Salsabella, and Agi Agustian Davi. "IDSpider: Indonesian Standard Dataset for Text-to-SQL." In 2024 Ninth International Conference on Informatics and Computing (ICIC). IEEE, 2024. https://doi.org/10.1109/icic64337.2024.10956918.

Koto, Fajri, Jey Han Lau, and Timothy Baldwin. "Liputan6: A Large-scale Indonesian Dataset for Text Summarization." In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing. Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.aacl-main.60.

Bouchiha, Djelloul, Abdelghani Bouziane, Noureddine Doumi, et al. "WiHArD: Wikipedia Based Hierarchical Arabic Dataset for Text Classification." In 2024 4th International Conference on Embedded & Distributed Systems (EDiS). IEEE, 2024. https://doi.org/10.1109/edis63605.2024.10783418.

Weng, Lifen, Qibing Zhu, and Jiangbin Guo. "Generating Sketch Faces from Text Descriptions: Dataset and Algorithm." In 2024 IEEE 18th International Conference on Anti-counterfeiting, Security, and Identification (ASID). IEEE, 2024. https://doi.org/10.1109/asid63618.2024.10839706.

Gongqu, Zhuome, Peng Luo, Dongzhou Jiayang, Jia Cairang, Jiacuo Cizhen, and Dongzhu Renqing. "A Tibetan Ancient Uchen Text Line Dataset for OCR." In 2024 International Conference on Image Processing, Computer Vision and Machine Learning (ICICML). IEEE, 2024. https://doi.org/10.1109/icicml63543.2024.10958135.

Reports on the topic "Text dataset"

Montiel Olea, César E., and Leonardo R. Corral. Text Analysis of Project Completion Reports. Inter-American Development Bank, 2021. http://dx.doi.org/10.18235/0003611.

Abstract:

Project Completion Reports (PCRs) are the main instrument through which different multilateral organizations measure the success of a project once it closes. PCRs are important for development effectiveness as they serve to understand achievements, failures, and challenges within the project cycle they can feed back into the design and execution of new projects. The aim of this paper is to introduce text analysis tools for the exploration of PCR documents. We describe and apply different text analysis tools to explore the content of a sample of PCRs. We seek to illustrate a way in which PCRs c

Hoshi Larsson, Kaori. Do LiU researchers publish data – and where? Dataset analysis using ODDPub. Linköping University Electronic Press, 2025. https://doi.org/10.3384/report-119790.

Abstract:

Swedish researchers are encouraged to share their research data, with a government goal for all publicly funded research to provide open research data by 2026. Hoshi Larsson (2023) investigated the extent and location of research data from LiU researchers. However, the search was limited to datasets with DOIs listed in DataCite Commons, suggesting that many datasets were excluded in the investigation. Therefore, the purpose of this study is to identify, through articles’ descriptions of open data, to what extent LiU’s researchers are sharing their research data and open code, and if so, which

Marra de Artiñano, Ignacio, Franco Riottini Depetris, and Christian Volpe Martincus. Automatic Product Classification in International Trade: Machine Learning and Large Language Models. Inter-American Development Bank, 2023. http://dx.doi.org/10.18235/0005012.

Abstract:

Accurately classifying products is essential in international trade. Virtually all countries categorize products into tariff lines using the Harmonized System (HS) nomenclature for both statistical and duty collection purposes. In this paper, we apply and assess several different algorithms to automatically classify products based on text descriptions. To do so, we use agricultural product descriptions from several public agencies, including customs authorities and the United States Department of Agriculture (USDA). We find that while traditional machine learning (ML) models tend to perform we

Warin, Thierry. The World Health Organization in a Post-COVID-19 Era: An Exploration of Public Engagement on Twitter. CIRANO, 2022. http://dx.doi.org/10.54932/ehuh4224.

Abstract:

This article analyses the conversations on Twitter related to the World Health Organization (WHO). We collect the text of the discussions as well as the metadata associated with each tweet. Our dataset is exhaustive as it includes all the tweets produced by WHO. Likes, retweets, and replies capture the level of engagement. The goal is to quantify the balance of likes, retweets, and replies, also known as “ratios”, and study their dynamics as proxy for the collective engagement in response to WHO’s communications. Our results demonstrate a higher engagement of the public receiving the informati

Madsen, Jens, Nikhil Kuppa, and Lucas Parra. The Brain, Body, and Behaviour Dataset - Neural Engineering Lab, CCNY. Fcp-indi, 2025. https://doi.org/10.15387/fcp_indi.retro.bbbd.

Abstract:

When humans engage with video, their brain and body interact in response to sensory input. To investigate these interactions, we recorded and are releasing a dataset from N=178 participants across five experiments featuring short online educational videos. This dataset comprises approximately 110 hours of multimodal data including electrocardiogram (ECG), heart rate, respiration, breathing rate, pupil size, electrooculogram (EOG), gaze position, saccades, blinks, fixations, head movement, and electroencephalogram (EEG). Participants viewed 3-6 videos (mean total duration: 28±5 min) to test att

Zinilli, Antonio. Text Mining in Action: Tools and Techniques using Python. Instats Inc., 2024. http://dx.doi.org/10.61700/k4powzm518m5z1739.

Abstract:

This seminar provides a comprehensive exploration of text mining techniques using Python, tailored for academic researchers seeking to analyze large textual datasets effectively. Participants will gain hands-on experience with Python libraries and methodologies for natural language processing, sentiment analysis, topic modeling, text classification, and more, enhancing their data analysis capabilities across various disciplines.

Johra, Hicham, Martin Veit, Mathias Østergaard Poulsen, et al. Training and testing labelled image and video datasets of human faces for different indoor visual comfort and glare visual discomfort situations. Department of the Built Environment, 2023. http://dx.doi.org/10.54337/aau542153983.

Abstract:

The aim of this technical report is to provide a description and access to labelled image and video datasets of human faces that have been generated for different indoor visual comfort and glare visual discomfort situations. These datasets have been used to train and test a computer-vision artificial neural network detecting glare discomfort from images of human faces.

Kumar, Praveen. PR753-233900-R01 Enhanced Leak Detection Using Minimally Invasive Multi-Sensor Device Based Inspection. Pipeline Research Council International, Inc. (PRCI), 2024. http://dx.doi.org/10.55274/r0000078.

Abstract:

The project team investigated the feasibility of identifying pipeline leaks using Novel sensing approaches that have been recently gaining popularity in the "Pipeline Integrity assessment" realm (such as multi-Sensor inline inspection tools) that incorporate sensors such as Audio, Magnetometry, Pressure etc. The flow loop setup at the PRCI TDC site was leveraged to create a customized test setup and a test execution methodology was developed and executed towards this end. Two Sensing equipment vendors (hereinafter referred to as Vendor A and Vendor B) were used to collect various sensor datase

Stucchi, Rodolfo, Alessandro Maffioli, Sofía Rojo, and Victoria Castillo. Knowledge Spillovers of Innovation Policy through Labor Mobility: An Impact Evaluation of the FONTAR Program in Argentina. Inter-American Development Bank, 2014. http://dx.doi.org/10.18235/0011534.

Abstract:

Although knowledge spillovers are at the core of the innovation policy's justification, they have never been properly measured by any impact evaluation. This paper fills this gap by estimating the spillover effects of the FONTAR program in Argentina. We use an employer-employee matched panel dataset with the entire population of firms and workers in Argentina for the period 2002-2010. This dataset allows us to track the mobility of qualified workers from FONTAR beneficiary firms to other firms and, therefore, to identify firms that indirectly benefit from the program through knowledge diffusio

Meloncelli, Daniel. Foundations of Statistical Analysis in R. Instats Inc., 2025. https://doi.org/10.61700/90jrgibx52lka1460.

Abstract:

This seminar introduces researchers to analysing relational and correlational research questions in R, covering key methods for hypothesis testing, t-tests, ANOVA, and non-parametric alternatives. Participants will learn to choose the right test, check assumptions, perform analyses, interpret results, and visualise findings using R. The seminar will include live coding demonstrations and practical examples to help participants apply statistical methods to real-world datasets.
