Contents

Journal articles
Dissertations / Theses
Books
Book chapters
Conference papers
Reports

Academic literature on the topic 'Text document classification'

Author: Grafiati

Published: 19 June 2022

Last updated: 31 July 2025

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Text document classification.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Journal articles on the topic "Text document classification"

1

Y Baravkar, B. "Automated Text Document Classification Using Predictive Network." International Journal of Scientific Engineering and Research 12, no. 1 (2024): 17–19. https://doi.org/10.70729/se24120142420.

2

Krishna, D., Erukulla Laasya, A. Sowmya Sri, T. Ravinder Reddy, and Akhil Sanjoy. "Biomedical Text Document Classification." International Journal of Engineering Technology and Management Sciences 7, no. 3 (2023): 788–92. http://dx.doi.org/10.46647/ijetms.2023.v07i03.121.

Abstract:

Information extraction, retrieval, and text categorization are only a few of the significant research fields covered by "biomedical text classification." This study examines many text categorization techniques used in practice, as well as their strengths and weaknesses, in order to improve knowledge of various information extraction opportunities in the field of data mining. We compiled a dataset with a focus on three categories: "Thyroid Cancer," "Lung Cancer," and "Colon Cancer." This paper presents an empirical study of a classifier. The investigation was carried out using biomedical l

3

Mukherjee, Indrajit, Prabhat Kumar Mahanti, Vandana Bhattacharya, and Samudra Banerjee. "Text classification using document-document semantic similarity." International Journal of Web Science 2, no. 1/2 (2013): 1. http://dx.doi.org/10.1504/ijws.2013.056572.

4

Wu, Tiandeng, Qijiong Liu, Yi Cao, Yao Huang, Xiao-Ming Wu, and Jiandong Ding. "Continual Graph Convolutional Network for Text Classification." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 11 (2023): 13754–62. http://dx.doi.org/10.1609/aaai.v37i11.26611.

Abstract:

Graph convolutional network (GCN) has been successfully applied to capture global non-consecutive and long-distance semantic information for text classification. However, while GCN-based methods have shown promising results in offline evaluations, they commonly follow a seen-token-seen-document paradigm by constructing a fixed document-token graph and cannot make inferences on new documents. It is a challenge to deploy them in online systems to infer streaming text data. In this work, we present a continual GCN model (ContGCN) to generalize inferences from observed documents to unobserved docum

5

Katta, Divya. "Multilingual Text Document Clustering and Classification." International Journal for Research in Applied Science and Engineering Technology 13, no. 7 (2025): 1009–14. https://doi.org/10.22214/ijraset.2025.73119.

Abstract:

The increasing volume of digital content in multiple languages has created a strong need for intelligent systems that can organize and retrieve multilingual documents efficiently. This project introduces a comprehensive pipeline for clustering and semantic search of multilingual text documents, supporting English, Hindi, and Telugu. The system begins by accepting PDF documents and identifying their language using the langdetect library. This is followed by language-specific preprocessing, including Unicode normalization, sentence tokenization, punctuation removal, stopword elimination, and lem

6

Yao, Liang, Chengsheng Mao, and Yuan Luo. "Graph Convolutional Networks for Text Classification." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 7370–77. http://dx.doi.org/10.1609/aaai.v33i01.33017370.

Abstract:

Text classification is an important and classical problem in natural language processing. There have been a number of studies that applied convolutional neural networks (convolution on regular grid, e.g., sequence) to classification. However, only a limited number of studies have explored the more flexible graph convolutional neural networks (convolution on non-grid, e.g., arbitrary graph) for the task. In this work, we propose to use graph convolutional networks for text classification. We build a single text graph for a corpus based on word co-occurrence and document word relations, then lea

7

Mohammed, Ali Sura I., Marwah Nihad, Sharaf Hussien Mohamed, and Haitham Farouk. "Machine learning for text document classification-efficient classification approach." IAES International Journal of Artificial Intelligence (IJ-AI) 13, no. 1 (2024): 703–10. https://doi.org/10.11591/ijai.v13.i1.pp703-710.

Abstract:

Numerous alternative methods for text classification have been created because of the increase in the amount of online text information available. The cosine similarity classifier is the most extensively utilized simple and efficient approach. It improves text classification performance. It is combined with estimated values provided by conventional classifiers such as Multinomial Naive Bayesian (MNB). Consequently, combining the similarity between a test document and a category with the estimated value for the category enhances the performance of the classifier. This approach provides a text d

8

K, Dinesh Balaji. "Smart Document Companion - Text Data Classification in Documents Using AI." International Research Journal of Education and Technology 6, no. 11 (2024): 2041–46. https://doi.org/10.70127/irjedt.vol.7.issue03.2046.

9

Mohammed Ali, Sura I., Marwah Nihad, Hussien Mohamed Sharaf, and Haitham Farouk. "Machine learning for text document classification-efficient classification approach." IAES International Journal of Artificial Intelligence (IJ-AI) 13, no. 1 (2024): 703. http://dx.doi.org/10.11591/ijai.v13.i1.pp703-710.

Abstract:

Numerous alternative methods for text classification have been created because of the increase in the amount of online text information available. The cosine similarity classifier is the most extensively utilized simple and efficient approach. It improves text classification performance. It is combined with estimated values provided by conventional classifiers such as Multinomial Naive Bayesian (MNB). Consequently, combining the similarity between a test document and a category with the estimated value for the category enhances the performance of the classifier. This approach provides

10

Cheng, Betty Yee Man, Jaime G. Carbonell, and Judith Klein-Seetharaman. "Protein classification based on text document classification techniques." Proteins: Structure, Function, and Bioinformatics 58, no. 4 (2005): 955–70. http://dx.doi.org/10.1002/prot.20373.

Dissertations / Theses on the topic "Text document classification"

1

Mondal, Abhro Jyoti. "Document Classification using Characteristic Signatures." University of Cincinnati / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1511793852923472.

2

Sendur, Zeynel. "Text Document Categorization by Machine Learning." Scholarly Repository, 2008. http://scholarlyrepository.miami.edu/oa_theses/209.

Abstract:

Because of the explosion of digital and online text information, automatic organization of documents has become a very important research area. There are mainly two machine learning approaches to enhance the organization task of the digital documents. One of them is the supervised approach, where pre-defined category labels are assigned to documents based on the likelihood suggested by a training set of labeled documents; and the other one is the unsupervised approach, where there is no need for human intervention or labeled documents at any point in the whole process. In this thesis, we conce

3

Blein, Florent. "Automatic Document Classification Applied to Swedish News." Thesis, Linköping University, Department of Computer and Information Science, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-3065.

Abstract:

The first part of this paper briefly presents the ELIN system, an electronic newspaper project. ELIN is a framework that stores news and displays it to the end user. The news is formatted in XML. The project partner Corren provided ELIN with XML articles, but in a different format. My first task was to develop software that converts the news from one XML format (Corren) to another (ELIN).

The second and main part addresses the problem of automatic document classification and tries to find a solution for a specific issue. The goal is to

4

Alsaad, Amal. "Enhanced root extraction and document classification algorithm for Arabic text." Thesis, Brunel University, 2016. http://bura.brunel.ac.uk/handle/2438/13510.

Abstract:

Many text extraction and classification systems have been developed for English and other international languages; most of the languages are based on Roman letters. However, Arabic language is one of the difficult languages which have special rules and morphology. Not many systems have been developed for Arabic text categorization. Arabic language is one of the Semitic languages with morphology that is more complicated than English. Due to its complex morphology, there is a need for pre-processing routines to extract the roots of the words then classify them according to the group of acts or m

5

Anne, Chaitanya. "Advanced Text Analytics and Machine Learning Approach for Document Classification." ScholarWorks@UNO, 2017. http://scholarworks.uno.edu/td/2292.

Abstract:

Text classification is used in information extraction and retrieval from a given text, and text classification has been considered as an important step to manage a vast number of records given in digital form that is far-reaching and expanding. This thesis addresses patent document classification problem into fifteen different categories or classes, where some classes overlap with other classes for practical reasons. For the development of the classification model using machine learning techniques, useful features have been extracted from the given documents. The features are used to classify

6

McElroy, Jonathan David. "Automatic Document Classification in Small Environments." DigitalCommons@CalPoly, 2012. https://digitalcommons.calpoly.edu/theses/682.

Abstract:

Document classification is used to sort and label documents. This gives users quicker access to relevant data. Users that work with large inflow of documents spend time filing and categorizing them to allow for easier procurement. The Automatic Classification and Document Filing (ACDF) system proposed here is designed to allow users working with files or documents to rely on the system to classify and store them with little manual attention. By using a system built on Hidden Markov Models, the documents in a smaller desktop environment are categorized with better results than the traditional N

7

Felhi, Mehdi. "Document image segmentation: content categorization." Thesis, Université de Lorraine, 2014. http://www.theses.fr/2014LORR0109/document.

Abstract:

In this thesis, we address the problem of document image segmentation by proposing new approaches for detecting and classifying document content. First, we study the problem of skew estimation in scanned documents. The goal of this work is to develop an automatic approach able to estimate the skew angle of the text in document images. Our method is based on the Maximum Gradient Difference (MGD) method, the R-signature, and the Ridgelet transform. We then propose a hybrid approach for

8

Wang, Yanbo Justin. "Language-independent pre-processing of large document bases for text classification." Thesis, University of Liverpool, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.445960.

Abstract:

Text classification is a well-known topic in the research of knowledge discovery in databases. Algorithms for text classification generally involve two stages. The first is concerned with identification of textual features (i.e. words and/or phrases) that may be relevant to the classification process. The second is concerned with classification rule mining and categorisation of "unseen" textual data. The first stage is the subject of this thesis and often involves an analysis of text that is both language-specific (and possibly domain-specific), and that may also be computationally costly espe

9

Wang, Yalin. "Document analysis: table structure understanding and zone content classification." Thesis, University of Washington, 2002. http://hdl.handle.net/1773/6079.

10

Wei, Zhihua. "The research on Chinese text multi-label classification." Thesis, Lyon 2, 2010. http://www.theses.fr/2010LYO20025/document.

Abstract:

Text Classification (TC), which is an important field in information technology, has many valuable applications. When facing the sea of information resources, the objects of TC are more complicated and diverse. Research in pursuit of effective and practical TC technology is fairly challenging. More and more researchers consider multi-label TC better suited to many applications. This thesis analyses the difficulties and problems in multi-label TC and Chinese text representation based on a mass of algorithms for single-label TC and multi-label TC. Aiming at high dimensionality in fea

Books on the topic "Text document classification"

1

Meister, Burkhardt W. The German Limited Liability Company: An introduction to the Act on limited liability companies with German/English text, synoptically arranged, of the act, a sample of articles of association, samples of the other formation documents of the company, the classification of the balance sheet and the profit and loss statement of a company and an extract from the commercial register = Die deutsche Gesellschaft mit beschränkter Haftung : eine Einführung zum Gesetz betreffend die Gesellschaften mit beschränkter Haftung mit synoptisch angeordnetem deutsch/englischem Text des Gesetzes, eines Gesellschaftsvertrages, der sonstigen Gründungsdokumente einer Bilanz und der Gewinn- und Verlustrechnung einer Gesellschaft und eines Auszugs aus dem Handelsregister. 7th ed. Beck, 2010.

2

Newton, David E. Substance Abuse. 2nd ed. ABC-CLIO, LLC, 2017. http://dx.doi.org/10.5040/9798216021032.

Abstract:

This go-to resource on substance abuse supplies the broad background knowledge and historical information needed to understand this important sociological issue and provides readers with a range of additional sources for continuing their study of the topic. From the pharmaceuticals advertised on television for various specific medical conditions; to alcohol, which is consumed regularly as a societal norm; to illicit drugs such as cocaine, heroin, and methamphetamine; to marijuana, which is becoming legal in an increasing number of U.S. states, drugs are all around us and are ingrained in our c

Book chapters on the topic "Text document classification"

1

Guthrie, Louise, Joe Guthrie, and James Leistensnider. "Document Classification and Routing." In Text, Speech and Language Technology. Springer Netherlands, 1999. http://dx.doi.org/10.1007/978-94-017-2388-6_12.

2

Huang, Chaochao, Xipeng Qiu, and Xuanjing Huang. "Text Classification with Document Embeddings." In Lecture Notes in Computer Science. Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-12277-9_12.

3

Hrala, Michal, and Pavel Král. "Multi-label Document Classification in Czech." In Text, Speech, and Dialogue. Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-40585-3_44.

4

Kralicek, Jiri, and Jiri Matas. "Fast Text vs. Non-text Classification of Images." In Document Analysis and Recognition – ICDAR 2021. Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-86337-1_2.

5

Král, Pavel, and Ladislav Lenc. "Confidence Measure for Czech Document Classification." In Computational Linguistics and Intelligent Text Processing. Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-18117-2_39.

6

Penha, Gustavo, Raphael Campos, Sérgio Canuto, Marcos André Gonçalves, and Rodrygo L. T. Santos. "Document Performance Prediction for Automatic Text Classification." In Lecture Notes in Computer Science. Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-15719-7_17.

7

Anni, Isaac Kobby, and Venu G. Dasigi. "Analysis of Document Representation for Text Classification." In Lecture Notes in Networks and Systems. Springer Nature Switzerland, 2025. https://doi.org/10.1007/978-3-031-84460-7_17.

8

Howland, Peg, and Haesun Park. "Cluster-Preserving Dimension Reduction Methods for Document Classification." In Survey of Text Mining II. Springer London, 2008. http://dx.doi.org/10.1007/978-1-84800-046-9_1.

9

Xia, Zhonghang, Guangming Xing, Houduo Qi, and Qi Li. "Applications of Semidefinite Programming in XML Document Classification." In Survey of Text Mining II. Springer London, 2008. http://dx.doi.org/10.1007/978-1-84800-046-9_7.

10

Gelbukh, Alexander, Grigori Sidorov, and Adolfo Guzman-Arénas. "Use of a Weighted Topic Hierarchy for Document Classification." In Text, Speech and Dialogue. Springer Berlin Heidelberg, 1999. http://dx.doi.org/10.1007/3-540-48239-3_24.

Conference papers on the topic "Text document classification"

1

Labarga, John, and Sam Friedman. "Automated Labeling of Helicopter Maintenance Records with Text Classification." In Vertical Flight Society 74th Annual Forum & Technology Display. The Vertical Flight Society, 2018. http://dx.doi.org/10.4050/f-0074-2018-12850.

Abstract:

Separating plaintext documents into defined classes, known in the data science literature as text classification, is a time-consuming task that is essential to the processing of correspondence, safety records, and maintenance documents in the aviation industry. This work discusses a machine learning approach to this problem, and demonstrates how to construct an end-to-end document classification solution. This system is capable of classifying documents with high accuracy, thus alleviating the need for labor dedicated to this task.

2

Saraswat, Pankaj, S. E. Manu, and Ritesh Kumar. "Automatic Detection and Classification of Handwritten Text in Document Images." In 2024 3rd International Conference for Advancement in Technology (ICONAT). IEEE, 2024. https://doi.org/10.1109/iconat61936.2024.10775224.

3

Deshpande, Vivek, K. Yuvaraj, Akhilesh Kalia, Praveena J, Shivangi Gupta, and Uma C. Swadimath. "Using DL Measure and Distance to Assess Text Mining Document Classification Problems." In 2024 Global Conference on Communications and Information Technologies (GCCIT). IEEE, 2024. https://doi.org/10.1109/gccit63234.2024.10862243.

4

Alshamari, Fatimah, and Abdou Youssef. "A Study into Math Document Classification using Deep Learning." In 8th International Conference on Computational Science and Engineering (CSE 2020). AIRCC Publishing Corporation, 2020. http://dx.doi.org/10.5121/csit.2020.101702.

Abstract:

Document classification is a fundamental task for many applications, including document annotation, document understanding, and knowledge discovery. This is especially true in STEM fields where the growth rate of scientific publications is exponential, and where the need for document processing and understanding is essential to technological advancement. Classifying a new publication into a specific domain based on the content of the document is an expensive process in terms of cost and time. Therefore, there is a high demand for a reliable document classification system. In this paper, we foc

5

Wu, Qin, Eddie Fuller, and Cun-Quan Zhang. "Text Document Classification and Pattern Recognition." In 2009 International Conference on Advances in Social Network Analysis and Mining (ASONAM). IEEE, 2009. http://dx.doi.org/10.1109/asonam.2009.21.

6

Chen, ZhiHang, Liping Huang, and Yi L. Murphey. "Incremental Learning for Text Document Classification." In 2007 International Joint Conference on Neural Networks. IEEE, 2007. http://dx.doi.org/10.1109/ijcnn.2007.4371367.

7

Zhang, Haopeng, and Jiawei Zhang. "Text Graph Transformer for Document Classification." In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.emnlp-main.668.

8

Simske, Steven J., and Rafael Lins. "Automatic Text Summarization and Classification." In DocEng '18: ACM Symposium on Document Engineering 2018. ACM, 2018. http://dx.doi.org/10.1145/3209280.3232791.

9

Lima, João Marcos Carvalho, and José Everardo Bessa Maia. "A Topical Word Embeddings for Text Classification." In XV Encontro Nacional de Inteligência Artificial e Computacional. Sociedade Brasileira de Computação - SBC, 2018. http://dx.doi.org/10.5753/eniac.2018.4401.

Abstract:

This paper presents an approach that uses topic models based on LDA to represent documents in text categorization problems. The document representation is achieved through the cosine similarity between document embeddings and embeddings of topic words, creating a Bag-of-Topics (BoT) variant. The performance of this approach is compared against those of two other representations: BoW (Bag-of-Words) and Topic Model, both based on standard tf-idf. Also, to reveal the effect of the classifier, we compared the performance of the nonlinear classifier SVM against that of the linear classifier Naive B

10

Thi Xuan Lam, Thanh, Anh Duc Le, and Masaki Nakagawa. "User Interface for Text and Non-Text Classification." In 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW). IEEE, 2019. http://dx.doi.org/10.1109/icdarw.2019.20044.

Reports on the topic "Text document classification"

1

Idakwo, Gabriel, Sundar Thangapandian, Joseph Luttrell, Zhaoxian Zhou, Chaoyang Zhang, and Ping Gong. Deep learning-based structure-activity relationship modeling for multi-category toxicity classification: a case study of 10K Tox21 chemicals with high-throughput cell-based androgen receptor bioassay data. Engineer Research and Development Center (U.S.), 2021. http://dx.doi.org/10.21079/11681/41302.

Abstract:

Deep learning (DL) has attracted the attention of computational toxicologists as it offers a potentially greater power for in silico predictive toxicology than existing shallow learning algorithms. However, contradicting reports have been documented. To further explore the advantages of DL over shallow learning, we conducted this case study using two cell-based androgen receptor (AR) activity datasets with 10K chemicals generated from the Tox21 program. A nested double-loop cross-validation approach was adopted along with a stratified sampling strategy for partitioning chemicals of multiple AR
