A selection of scholarly literature on the topic "Text document classification"

Browse lists of current articles, books, dissertations, conference abstracts, and other scholarly sources on the topic "Text document classification".

Journal articles on the topic "Text document classification"

1

Y Baravkar, B. "Automated Text Document Classification Using Predictive Network." International Journal of Scientific Engineering and Research 12, no. 1 (2024): 17–19. https://doi.org/10.70729/se24120142420.

2

D Krishna, Erukulla Laasya, A Sowmya Sri, T Ravinder Reddy, and Akhil Sanjoy. "Biomedical Text Document Classification." International Journal of Engineering Technology and Management Sciences 7, no. 3 (2023): 788–92. http://dx.doi.org/10.46647/ijetms.2023.v07i03.121.

Abstract:
Information extraction, retrieval, and text categorization are only a few of the significant research fields covered by "bio medical text classification." This study examines many text categorization techniques utilised in practise, as well as their strengths and weaknesses, in order to improve knowledge of various information extraction opportunities in the field of data mining. We compiled a dataset with a focus on three categories: "Thyroid Cancer," "Lung Cancer," and "Colon Cancer." This paper presents an empirical study of a classifier. The investigation was carried out using biomedical l
3

Mukherjee, Indrajit, Prabhat Kumar Mahanti, Vandana Bhattacharya, and Samudra Banerjee. "Text classification using document-document semantic similarity." International Journal of Web Science 2, no. 1/2 (2013): 1. http://dx.doi.org/10.1504/ijws.2013.056572.

4

Wu, Tiandeng, Qijiong Liu, Yi Cao, Yao Huang, Xiao-Ming Wu, and Jiandong Ding. "Continual Graph Convolutional Network for Text Classification." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 11 (2023): 13754–62. http://dx.doi.org/10.1609/aaai.v37i11.26611.

Abstract:
Graph convolutional network (GCN) has been successfully applied to capture global non-consecutive and long-distance semantic information for text classification. However, while GCN-based methods have shown promising results in offline evaluations, they commonly follow a seen-token-seen-document paradigm by constructing a fixed document-token graph and cannot make inferences on new documents. It is a challenge to deploy them in online systems to infer streaming text data. In this work, we present a continual GCN model (ContGCN) to generalize inferences from observed documents to unobserved docum
5

Katta, Divya. "Multilingual Text Document Clustering and Classification." International Journal for Research in Applied Science and Engineering Technology 13, no. 7 (2025): 1009–14. https://doi.org/10.22214/ijraset.2025.73119.

Abstract:
The increasing volume of digital content in multiple languages has created a strong need for intelligent systems that can organize and retrieve multilingual documents efficiently. This project introduces a comprehensive pipeline for clustering and semantic search of multilingual text documents, supporting English, Hindi, and Telugu. The system begins by accepting PDF documents and identifying their language using the langdetect library. This is followed by language-specific preprocessing, including Unicode normalization, sentence tokenization, punctuation removal, stopword elimination, and lem
6

Yao, Liang, Chengsheng Mao, and Yuan Luo. "Graph Convolutional Networks for Text Classification." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 7370–77. http://dx.doi.org/10.1609/aaai.v33i01.33017370.

Abstract:
Text classification is an important and classical problem in natural language processing. There have been a number of studies that applied convolutional neural networks (convolution on regular grid, e.g., sequence) to classification. However, only a limited number of studies have explored the more flexible graph convolutional neural networks (convolution on non-grid, e.g., arbitrary graph) for the task. In this work, we propose to use graph convolutional networks for text classification. We build a single text graph for a corpus based on word co-occurrence and document word relations, then lea
7

Mohammed, Ali Sura I., Marwah Nihad, Sharaf Hussien Mohamed, and Haitham Farouk. "Machine learning for text document classification-efficient classification approach." IAES International Journal of Artificial Intelligence (IJ-AI) 13, no. 1 (2024): 703–10. https://doi.org/10.11591/ijai.v13.i1.pp703-710.

Abstract:
Numerous alternative methods for text classification have been created because of the increase in the amount of online text information available. The cosine similarity classifier is the most extensively utilized simple and efficient approach. It improves text classification performance. It is combined with estimated values provided by conventional classifiers such as Multinomial Naive Bayesian (MNB). Consequently, combining the similarity between a test document and a category with the estimated value for the category enhances the performance of the classifier. This approach provides a text d
8

K, Dinesh Balaji. "Smart Document Companion - Text Data Classification in Documents Using AI." International Research Journal of Education and Technology 6, no. 11 (2024): 2041–46. https://doi.org/10.70127/irjedt.vol.7.issue03.2046.

9

Mohammed Ali, Sura I., Marwah Nihad, Hussien Mohamed Sharaf, and Haitham Farouk. "Machine learning for text document classification-efficient classification approach." IAES International Journal of Artificial Intelligence (IJ-AI) 13, no. 1 (2024): 703. http://dx.doi.org/10.11591/ijai.v13.i1.pp703-710.

Abstract:
Numerous alternative methods for text classification have been created because of the increase in the amount of online text information available. The cosine similarity classifier is the most extensively utilized simple and efficient approach. It improves text classification performance. It is combined with estimated values provided by conventional classifiers such as Multinomial Naive Bayesian (MNB). Consequently, combining the similarity between a test document and a category with the estimated value for the category enhances the performance of the classifier. This approach provides
10

Cheng, Betty Yee Man, Jaime G. Carbonell, and Judith Klein-Seetharaman. "Protein classification based on text document classification techniques." Proteins: Structure, Function, and Bioinformatics 58, no. 4 (2005): 955–70. http://dx.doi.org/10.1002/prot.20373.

Dissertations on the topic "Text document classification"

1

Mondal, Abhro Jyoti. "Document Classification using Characteristic Signatures." University of Cincinnati / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1511793852923472.

2

Sendur, Zeynel. "Text Document Categorization by Machine Learning." Scholarly Repository, 2008. http://scholarlyrepository.miami.edu/oa_theses/209.

Abstract:
Because of the explosion of digital and online text information, automatic organization of documents has become a very important research area. There are mainly two machine learning approaches to enhance the organization task of the digital documents. One of them is the supervised approach, where pre-defined category labels are assigned to documents based on the likelihood suggested by a training set of labeled documents; and the other one is the unsupervised approach, where there is no need for human intervention or labeled documents at any point in the whole process. In this thesis, we conce
3

Blein, Florent. "Automatic Document Classification Applied to Swedish News." Thesis, Linköping University, Department of Computer and Information Science, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-3065.

Abstract:
The first part of this paper presents briefly the ELIN system, an electronic newspaper project. ELIN is a framework that stores news and displays them to the end-user. Such news are formatted using the xml format. The project partner Corren provided ELIN with xml articles, however the format used was not the same. My first task has been to develop a software that converts the news from one xml format (Corren) to another (ELIN). The second and main part addresses the problem of automatic document classification and tries to find a solution for a specific issue. The goal is to
4

Alsaad, Amal. "Enhanced root extraction and document classification algorithm for Arabic text." Thesis, Brunel University, 2016. http://bura.brunel.ac.uk/handle/2438/13510.

Abstract:
Many text extraction and classification systems have been developed for English and other international languages; most of the languages are based on Roman letters. However, Arabic language is one of the difficult languages which have special rules and morphology. Not many systems have been developed for Arabic text categorization. Arabic language is one of the Semitic languages with morphology that is more complicated than English. Due to its complex morphology, there is a need for pre-processing routines to extract the roots of the words then classify them according to the group of acts or m
5

Anne, Chaitanya. "Advanced Text Analytics and Machine Learning Approach for Document Classification." ScholarWorks@UNO, 2017. http://scholarworks.uno.edu/td/2292.

Abstract:
Text classification is used in information extraction and retrieval from a given text, and text classification has been considered as an important step to manage a vast number of records given in digital form that is far-reaching and expanding. This thesis addresses patent document classification problem into fifteen different categories or classes, where some classes overlap with other classes for practical reasons. For the development of the classification model using machine learning techniques, useful features have been extracted from the given documents. The features are used to classify
6

McElroy, Jonathan David. "Automatic Document Classification in Small Environments." DigitalCommons@CalPoly, 2012. https://digitalcommons.calpoly.edu/theses/682.

Abstract:
Document classification is used to sort and label documents. This gives users quicker access to relevant data. Users that work with large inflow of documents spend time filing and categorizing them to allow for easier procurement. The Automatic Classification and Document Filing (ACDF) system proposed here is designed to allow users working with files or documents to rely on the system to classify and store them with little manual attention. By using a system built on Hidden Markov Models, the documents in a smaller desktop environment are categorized with better results than the traditional N
7

Felhi, Mehdi. "Document image segmentation: content categorization." Thesis, Université de Lorraine, 2014. http://www.theses.fr/2014LORR0109/document.

Abstract:
In this thesis, we address the problem of document image segmentation by proposing new approaches for detecting and classifying document contents. First, we study the problem of skew estimation for scanned documents. The aim of this work is to develop an automatic approach capable of estimating the skew angle of text in document images. Our method is based on the Maximum Gradient Difference (MGD) method, the R-signature, and the Ridgelet transform. We then propose a hybrid approach for
8

Wang, Yanbo Justin. "Language-independent pre-processing of large document bases for text classification." Thesis, University of Liverpool, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.445960.

Abstract:
Text classification is a well-known topic in the research of knowledge discovery in databases. Algorithms for text classification generally involve two stages. The first is concerned with identification of textual features (i.e. words and/or phrases) that may be relevant to the classification process. The second is concerned with classification rule mining and categorisation of "unseen" textual data. The first stage is the subject of this thesis and often involves an analysis of text that is both language-specific (and possibly domain-specific), and that may also be computationally costly espe
9

Wang, Yalin. "Document analysis: table structure understanding and zone content classification." Thesis, University of Washington, 2002. http://hdl.handle.net/1773/6079.

10

Wei, Zhihua. "The research on chinese text multi-label classification." Thesis, Lyon 2, 2010. http://www.theses.fr/2010LYO20025/document.

Abstract:
Text Classification (TC) which is an important field in information technology has many valuable applications. When facing the sea of information resources, the objects of TC are more complicated and diversity. The researches in pursuit of effective and practical TC technology are fairly challenging. More and more researchers regard that multi-label TC is more suited for many applications. This thesis analyses the difficulties and problems in multi-label TC and Chinese text representation based on a mass of algorithms for single-label TC and multi-label TC. Aiming at high dimensionality in fea

Books on the topic "Text document classification"

1

Meister, Burkhardt W. The German Limited Liability Company: An introduction to the Act on limited liability companies with German/English text, synoptically arranged, of the act, a sample of articles of association, samples of the other formation documents of the company, the classification of the balance sheet and the profit and loss statement of a company and an extract from the commercial register = Die deutsche Gesellschaft mit beschränkter Haftung : eine Einführung zum Gesetz betreffend die Gesellschaften mit beschränkter Haftung mit synoptisch angeordnetem deutsch/englischem Text des Gesetzes, eines Gesellschaftsvertrages, der sonstigen Gründungsdokumente einer Bilanz und der Gewinn- und Verlustrechnung einer Gesellschaft und eines Auszugs aus dem Handelsregister. 7th ed. Beck, 2010.

2

Newton, David E. Substance Abuse. 2nd ed. ABC-CLIO, LLC, 2017. http://dx.doi.org/10.5040/9798216021032.

Abstract:
This go-to resource on substance abuse supplies the broad background knowledge and historical information needed to understand this important sociological issue and provides readers with a range of additional sources for continuing their study of the topic. From the pharmaceuticals advertised on television for various specific medical conditions; to alcohol, which is consumed regularly as a societal norm; to illicit drugs such as cocaine, heroin, and methamphetamine; to marijuana, which is becoming legal in an increasing number of U.S. states, drugs are all around us and are ingrained in our c

Book chapters on the topic "Text document classification"

1

Guthrie, Louise, Joe Guthrie, and James Leistensnider. "Document Classification and Routing." In Text, Speech and Language Technology. Springer Netherlands, 1999. http://dx.doi.org/10.1007/978-94-017-2388-6_12.

2

Huang, Chaochao, Xipeng Qiu, and Xuanjing Huang. "Text Classification with Document Embeddings." In Lecture Notes in Computer Science. Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-12277-9_12.

3

Hrala, Michal, and Pavel Král. "Multi-label Document Classification in Czech." In Text, Speech, and Dialogue. Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-40585-3_44.

4

Kralicek, Jiri, and Jiri Matas. "Fast Text vs. Non-text Classification of Images." In Document Analysis and Recognition – ICDAR 2021. Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-86337-1_2.

5

Král, Pavel, and Ladislav Lenc. "Confidence Measure for Czech Document Classification." In Computational Linguistics and Intelligent Text Processing. Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-18117-2_39.

6

Penha, Gustavo, Raphael Campos, Sérgio Canuto, Marcos André Gonçalves, and Rodrygo L. T. Santos. "Document Performance Prediction for Automatic Text Classification." In Lecture Notes in Computer Science. Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-15719-7_17.

7

Anni, Isaac Kobby, and Venu G. Dasigi. "Analysis of Document Representation for Text Classification." In Lecture Notes in Networks and Systems. Springer Nature Switzerland, 2025. https://doi.org/10.1007/978-3-031-84460-7_17.

8

Howland, Peg, and Haesun Park. "Cluster-Preserving Dimension Reduction Methods for Document Classification." In Survey of Text Mining II. Springer London, 2008. http://dx.doi.org/10.1007/978-1-84800-046-9_1.

9

Xia, Zhonghang, Guangming Xing, Houduo Qi, and Qi Li. "Applications of Semidefinite Programming in XML Document Classification." In Survey of Text Mining II. Springer London, 2008. http://dx.doi.org/10.1007/978-1-84800-046-9_7.

10

Gelbukh, Alexander, Grigori Sidorov, and Adolfo Guzman-Arénas. "Use of a Weighted Topic Hierarchy for Document Classification." In Text, Speech and Dialogue. Springer Berlin Heidelberg, 1999. http://dx.doi.org/10.1007/3-540-48239-3_24.


Conference papers on the topic "Text document classification"

1

Labarga, John, and Sam Friedman. "Automated Labeling of Helicopter Maintenance Records with Text Classification." In Vertical Flight Society 74th Annual Forum & Technology Display. The Vertical Flight Society, 2018. http://dx.doi.org/10.4050/f-0074-2018-12850.

Abstract:
Separating plaintext documents into defined classes, known in the data science literature as text classification, is a time-consuming task that is essential to the processing of correspondence, safety records, and maintenance documents in the aviation industry. This work discusses a machine learning approach to this problem, and demonstrates how to construct an end-to-end document classification solution. This system is capable of classifying documents with high accuracy, thus alleviating the need for labor dedicated to this task.
2

Saraswat, Pankaj, S. E. Manu, and Ritesh Kumar. "Automatic Detection and Classification of Handwritten Text in Document Images." In 2024 3rd International Conference for Advancement in Technology (ICONAT). IEEE, 2024. https://doi.org/10.1109/iconat61936.2024.10775224.

3

Deshpande, Vivek, K. Yuvaraj, Akhilesh Kalia, Praveena J, Shivangi Gupta, and Uma C. Swadimath. "Using DL Measure and Distance to Assess Text Mining Document Classification Problems." In 2024 Global Conference on Communications and Information Technologies (GCCIT). IEEE, 2024. https://doi.org/10.1109/gccit63234.2024.10862243.

4

Alshamari, Fatimah, and Abdou Youssef. "A Study into Math Document Classification using Deep Learning." In 8th International Conference on Computational Science and Engineering (CSE 2020). AIRCC Publishing Corporation, 2020. http://dx.doi.org/10.5121/csit.2020.101702.

Abstract:
Document classification is a fundamental task for many applications, including document annotation, document understanding, and knowledge discovery. This is especially true in STEM fields where the growth rate of scientific publications is exponential, and where the need for document processing and understanding is essential to technological advancement. Classifying a new publication into a specific domain based on the content of the document is an expensive process in terms of cost and time. Therefore, there is a high demand for a reliable document classification system. In this paper, we foc
5

Wu, Qin, Eddie Fuller, and Cun-Quan Zhang. "Text Document Classification and Pattern Recognition." In 2009 International Conference on Advances in Social Network Analysis and Mining (ASONAM). IEEE, 2009. http://dx.doi.org/10.1109/asonam.2009.21.

6

Chen, ZhiHang, Liping Huang, and Yi L. Murphey. "Incremental Learning for Text Document Classification." In 2007 International Joint Conference on Neural Networks. IEEE, 2007. http://dx.doi.org/10.1109/ijcnn.2007.4371367.

7

Zhang, Haopeng, and Jiawei Zhang. "Text Graph Transformer for Document Classification." In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.emnlp-main.668.

8

Simske, Steven J., and Rafael Lins. "Automatic Text Summarization and Classification." In DocEng '18: ACM Symposium on Document Engineering 2018. ACM, 2018. http://dx.doi.org/10.1145/3209280.3232791.

9

Lima, João Marcos Carvalho, and José Everardo Bessa Maia. "A Topical Word Embeddings for Text Classification." In XV Encontro Nacional de Inteligência Artificial e Computacional. Sociedade Brasileira de Computação - SBC, 2018. http://dx.doi.org/10.5753/eniac.2018.4401.

Abstract:
This paper presents an approach that uses topic models based on LDA to represent documents in text categorization problems. The document representation is achieved through the cosine similarity between document embeddings and embeddings of topic words, creating a Bag-of-Topics (BoT) variant. The performance of this approach is compared against those of two other representations: BoW (Bag-of-Words) and Topic Model, both based on standard tf-idf. Also, to reveal the effect of the classifier, we compared the performance of the nonlinear classifier SVM against that of the linear classifier Naive B
10

Thi Xuan Lam, Thanh, Anh Duc Le, and Masaki Nakagawa. "User Interface for Text and Non-Text Classification." In 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW). IEEE, 2019. http://dx.doi.org/10.1109/icdarw.2019.20044.


Reports on the topic "Text document classification"

1

Idakwo, Gabriel, Sundar Thangapandian, Joseph Luttrell, Zhaoxian Zhou, Chaoyang Zhang, and Ping Gong. Deep learning-based structure-activity relationship modeling for multi-category toxicity classification: a case study of 10K Tox21 chemicals with high-throughput cell-based androgen receptor bioassay data. Engineer Research and Development Center (U.S.), 2021. http://dx.doi.org/10.21079/11681/41302.

Abstract:
Deep learning (DL) has attracted the attention of computational toxicologists as it offers a potentially greater power for in silico predictive toxicology than existing shallow learning algorithms. However, contradicting reports have been documented. To further explore the advantages of DL over shallow learning, we conducted this case study using two cell-based androgen receptor (AR) activity datasets with 10K chemicals generated from the Tox21 program. A nested double-loop cross-validation approach was adopted along with a stratified sampling strategy for partitioning chemicals of multiple AR