
Journal articles on the topic '20 newsgroup'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 29 journal articles for your research on the topic '20 newsgroup.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Nurdin, Arliyanti, Bernadus Anggo Seno Aji, Anugrayani Bustamin, and Zaenal Abidin. "PERBANDINGAN KINERJA WORD EMBEDDING WORD2VEC, GLOVE, DAN FASTTEXT PADA KLASIFIKASI TEKS." Jurnal Tekno Kompak 14, no. 2 (2020): 74. http://dx.doi.org/10.33365/jtk.v14i2.732.

Full text
Abstract:
The unstructured nature of text is a challenge for feature extraction in the field of text processing. This study aims to compare the performance of word embeddings such as Word2Vec, GloVe and FastText, classified with a Convolutional Neural Network algorithm. These three methods were chosen because, compared with traditional feature engineering such as Bag of Words, they capture semantic and syntactic meaning, word order and even the context around a word. The word embeddings produced by these methods are compared in terms of performance on news classification from the 20 New
APA, Harvard, Vancouver, ISO, and other styles
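As a rough, hedged illustration of the word-embedding pipeline this entry describes: the paper classifies with a Convolutional Neural Network, while the sketch below trains Word2Vec with gensim on 20 Newsgroups and substitutes a simple logistic regression over averaged word vectors, so the library choices and parameters are assumptions rather than the authors' setup.

```python
# Minimal sketch, not the paper's method: Word2Vec embeddings on 20 Newsgroups
# with averaged document vectors fed to a logistic regression.
import numpy as np
from gensim.models import Word2Vec
from gensim.utils import simple_preprocess
from sklearn.datasets import fetch_20newsgroups
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

data = fetch_20newsgroups(subset="all", remove=("headers", "footers", "quotes"))
tokens = [simple_preprocess(doc) for doc in data.data]

w2v = Word2Vec(sentences=tokens, vector_size=100, window=5, min_count=5, workers=4)

def doc_vector(words):
    """Average the Word2Vec vectors of the in-vocabulary words."""
    vecs = [w2v.wv[w] for w in words if w in w2v.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(w2v.vector_size)

X = np.vstack([doc_vector(t) for t in tokens])
X_tr, X_te, y_tr, y_te = train_test_split(X, data.target, test_size=0.2, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```

Swapping Word2Vec for GloVe or FastText vectors changes only the embedding-lookup step, which is the comparison the paper is concerned with.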
2

Zhou, Hongfang, Jie Guo, Yinghui Wang, and Minghua Zhao. "A Feature Selection Approach Based on Interclass and Intraclass Relative Contributions of Terms." Computational Intelligence and Neuroscience 2016 (2016): 1–8. http://dx.doi.org/10.1155/2016/1715780.

Full text
Abstract:
Feature selection plays a critical role in text categorization. During feature selection, high-frequency terms and the interclass and intraclass relative contributions of terms all have significant effects on classification results. In this paper we therefore put forward a feature selection approach, IIRCT, based on the interclass and intraclass relative contributions of terms. In our proposed algorithm, three critical factors, namely term frequency, the interclass relative contribution and the intraclass relative contribution of terms, are all considered synthetically. Finally, experiments are made
APA, Harvard, Vancouver, ISO, and other styles
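The IIRCT scoring formula is not given in this excerpt, so the hedged sketch below shows only the kind of standard filter feature selection (chi-square with scikit-learn, on 20 Newsgroups) that term-contribution methods like IIRCT are usually benchmarked against; it is not the paper's algorithm.

```python
# Minimal sketch: chi-square filter feature selection, a common comparison
# baseline; this is not the IIRCT method proposed in the paper.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

train = fetch_20newsgroups(subset="train")
test = fetch_20newsgroups(subset="test")

pipe = Pipeline([
    ("counts", CountVectorizer(stop_words="english")),
    ("select", SelectKBest(chi2, k=2000)),   # keep the 2000 highest-scoring terms
    ("clf", MultinomialNB()),
])
pipe.fit(train.data, train.target)
print("test accuracy:", pipe.score(test.data, test.target))
```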
3

Ghanem, Khadoudja. "Local and Global Latent Semantic Analysis for Text Categorization." International Journal of Information Retrieval Research 4, no. 3 (2014): 1–13. http://dx.doi.org/10.4018/ijirr.2014070101.

Full text
Abstract:
In this paper the authors propose a semantic approach to document categorization. The idea is to create for each category a semantic index (representative term vector) by performing a local Latent Semantic Analysis (LSA) followed by a clustering process. A second use of LSA (global LSA) is applied to a term-class matrix in order to retrieve the class most similar to the query (the document to classify), in the same way that LSA is used to retrieve the documents most similar to a query in Information Retrieval. The proposed system is evaluated on a popular dataset which i
APA, Harvard, Vancouver, ISO, and other styles
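A hedged sketch of plain (global) LSA classification, assuming a TF-IDF term space reduced with truncated SVD and a nearest-class-centroid decision by cosine similarity; the paper's local LSA and clustering stages are not reproduced here.

```python
# Minimal sketch: LSA (TF-IDF + truncated SVD) with a nearest-centroid
# decision in the latent space; the paper's local-LSA/clustering step is omitted.
import numpy as np
from sklearn.datasets import fetch_20newsgroups
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

train = fetch_20newsgroups(subset="train")
test = fetch_20newsgroups(subset="test")

tfidf = TfidfVectorizer(stop_words="english")
svd = TruncatedSVD(n_components=200, random_state=0)
Z_train = svd.fit_transform(tfidf.fit_transform(train.data))
Z_test = svd.transform(tfidf.transform(test.data))

# One centroid per class in the latent semantic space.
centroids = np.vstack([Z_train[train.target == c].mean(axis=0)
                       for c in np.unique(train.target)])
pred = cosine_similarity(Z_test, centroids).argmax(axis=1)
print("test accuracy:", (pred == test.target).mean())
```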
4

Borrajo, L., A. Seara Vieira, and E. L. Iglesias. "An HMM-based synthetic view generator to improve the efficiency of ensemble systems." Logic Journal of the IGPL 28, no. 1 (2019): 4–18. http://dx.doi.org/10.1093/jigpal/jzz067.

Full text
Abstract:
One of the most active areas of research in semi-supervised learning has been the study of methods for constructing good ensembles of classifiers. Ensemble systems are techniques that create multiple models and then combine them to produce improved results. These systems usually produce more accurate solutions than a single model would. In particular, multi-view ensemble systems improve the accuracy of text classification because they optimize the functions to exploit different views of the same input data. However, despite being more promising than the single-view approaches, document datase
APA, Harvard, Vancouver, ISO, and other styles
5

Li, Qin, Shaobo Li, Jie Hu, Sen Zhang, and Jianjun Hu. "Tourism Review Sentiment Classification Using a Bidirectional Recurrent Neural Network with an Attention Mechanism and Topic-Enriched Word Vectors." Sustainability 10, no. 9 (2018): 3313. http://dx.doi.org/10.3390/su10093313.

Full text
Abstract:
Sentiment analysis of online tourist reviews is playing an increasingly important role in tourism. Accurately capturing the attitudes of tourists regarding different aspects of the scenic sites or the overall polarity of their online reviews is key to tourism analysis and application. However, the performances of current document sentiment analysis methods are not satisfactory as they either neglect the topics of the document or do not consider that not all words contribute equally to the meaning of the text. In this work, we propose a bidirectional gated recurrent unit neural network model (B
APA, Harvard, Vancouver, ISO, and other styles
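As a hedged sketch of the network family this entry builds on, the following PyTorch snippet defines a minimal bidirectional GRU classifier run on random stand-in data; the paper's attention mechanism and topic-enriched word vectors are deliberately omitted, and all dimensions are illustrative assumptions.

```python
# Minimal sketch: a bidirectional GRU text classifier (no attention,
# no topic-enriched vectors), exercised on random stand-in token ids.
import torch
import torch.nn as nn

class BiGRUClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        x = self.embed(token_ids)            # (batch, seq_len, embed_dim)
        _, h = self.gru(x)                   # h: (2, batch, hidden_dim)
        h = torch.cat([h[0], h[1]], dim=1)   # concatenate both directions
        return self.fc(h)

model = BiGRUClassifier(vocab_size=5000)
dummy_batch = torch.randint(0, 5000, (8, 40))   # 8 reviews, 40 tokens each
logits = model(dummy_batch)
print(logits.shape)   # torch.Size([8, 2])
```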
6

Kaur, Bipanjyot, and Gourav Bathla. "An efficient technique for hybrid classification and feature extraction using normalization." International Journal of Engineering & Technology 7, no. 2.27 (2018): 156. http://dx.doi.org/10.14419/ijet.v7i2.27.14534.

Full text
Abstract:
Text classification is a technique for assigning a class or label to a particular document from a set of predefined class labels. Examples of predefined classes are sports, business, technical, education, science, etc. Classification is a supervised learning technique, i.e. the classes are trained with certain features and a document is then classified based on a similarity measure against the trained document set. Text classification is used in many applications, such as assigning labels to documents, separating spam messages from genuine ones, filtering text, natural language processing, etc.
APA, Harvard, Vancouver, ISO, and other styles
7

Chirra, Venkata RamiReddy, Hoolda Daniel Maddiboyina, Yakobu Dasari, and Ranganadhareddy Aluru. "Performance Evaluation of Email Spam Text Classification Using Deep Neural Networks." Review of Computer Engineering Studies 7, no. 4 (2020): 91–95. http://dx.doi.org/10.18280/rces.070403.

Full text
Abstract:
Spam arrives in the email inbox because of advertising, the collection of personal information, or the delivery of malware through websites or scripts. Most often, spammers send junk mail with the intention of committing email fraud. Today spam mail accounts for 45% of all email, and hence there is an ever-increasing need to build efficient spam filters to identify and block spam mail. Notably, however, today's spam filters in use are built using traditional approaches such as statistical and content-based techniques. These techniques do not improve their performance while handling huge data and they need a lot
APA, Harvard, Vancouver, ISO, and other styles
8

Vidyadhari, Ch, N. Sandhya, and P. Premchand. "Particle Grey Wolf Optimizer (PGWO) Algorithm and Semantic Word Processing for Automatic Text Clustering." International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 27, no. 02 (2019): 201–23. http://dx.doi.org/10.1142/s0218488519500090.

Full text
Abstract:
Text mining refers to the process of extracting high-quality information from text. It is broadly used in applications such as text clustering, text categorization, text classification, etc. Recently, text clustering has become a facilitating and challenging task used to group text documents. Owing to irrelevant terms and large dimensionality, the accuracy of text clustering is reduced. In this paper, semantic word processing and a novel Particle Grey Wolf Optimizer (PGWO) are proposed for automatic text clustering. Initially, the text documents are given as input to the pre-process
APA, Harvard, Vancouver, ISO, and other styles
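The Particle Grey Wolf Optimizer itself cannot be reconstructed from this excerpt, so the hedged sketch below shows only a conventional k-means clustering of TF-IDF document vectors on 20 Newsgroups, the sort of baseline that metaheuristic clusterers are normally compared with.

```python
# Minimal sketch: k-means clustering of TF-IDF vectors (not the paper's PGWO),
# scored with a label-agnostic clustering metric.
from sklearn.cluster import KMeans
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import normalized_mutual_info_score

data = fetch_20newsgroups(subset="all", remove=("headers", "footers", "quotes"))
X = TfidfVectorizer(stop_words="english", max_features=20000).fit_transform(data.data)

km = KMeans(n_clusters=20, n_init=10, random_state=0).fit(X)
print("NMI vs. true topics:", normalized_mutual_info_score(data.target, km.labels_))
```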
9

Kjelgren, Roger, and Larry Rupp. "461 Multimedia Dissemination On and Off Campus of Two Landscape Horticulture Courses." HortScience 35, no. 3 (2000): 473C—473. http://dx.doi.org/10.21273/hortsci.35.3.473c.

Full text
Abstract:
We developed two courses, sustainable landscaping and landscape water conservation, to meet time-constrained students on campus and place-bound students off campus. Lecture material consisting of text, slides, drawings, and some video was assembled digitally using presentation software. Each course was broken into nine to 10 units by topic matter, and each unit consisted of 50 to 100 individual “slides” containing visuals, text, and audio narration. The lecture material was then packaged for student consumption onto videotape and CD-ROM, on the Web (without audio), and as hard copy. Student
APA, Harvard, Vancouver, ISO, and other styles
10

Ogura, Hiroshi, Hiromi Amano, and Masato Kondo. "Gamma-Poisson Distribution Model for Text Categorization." ISRN Artificial Intelligence 2013 (April 4, 2013): 1–17. http://dx.doi.org/10.1155/2013/829630.

Full text
Abstract:
We introduce a new model for describing word frequency distributions in documents for automatic text classification tasks. In the model, the gamma-Poisson probability distribution is used to achieve better text modeling. The framework of the modeling and its application to text categorization are demonstrated with practical techniques for parameter estimation and vector normalization. To investigate the efficiency of our model, text categorization experiments were performed on 20 Newsgroups, Reuters-21578, Industry Sector, and TechTC-100 datasets. The results show that the model allows perform
APA, Harvard, Vancouver, ISO, and other styles
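For readers unfamiliar with the model family, the following is a standard identity stated here as background (not quoted from the paper, whose parameterization may differ): placing a gamma prior on the Poisson rate of a term's in-document frequency yields a negative binomial marginal, which captures word burstiness better than a single Poisson.

```latex
% Gamma-Poisson mixture for a term frequency k (standard identity)
P(k) = \int_0^\infty \frac{\lambda^k e^{-\lambda}}{k!}\,
       \frac{\beta^\alpha}{\Gamma(\alpha)}\,\lambda^{\alpha-1} e^{-\beta\lambda}\, d\lambda
     = \frac{\Gamma(k+\alpha)}{k!\,\Gamma(\alpha)}
       \left(\frac{\beta}{\beta+1}\right)^{\alpha}
       \left(\frac{1}{\beta+1}\right)^{k}
```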
11

Vlachostergiou, Aggeliki, George Caridakis, Phivos Mylonas, and Andreas Stafylopatis. "Learning Representations of Natural Language Texts with Generative Adversarial Networks at Document, Sentence, and Aspect Level." Algorithms 11, no. 10 (2018): 164. http://dx.doi.org/10.3390/a11100164.

Full text
Abstract:
The ability to learn robust, resizable feature representations from unlabeled data has potential applications in a wide variety of machine learning tasks. One way to create such representations is to train deep generative models that can learn to capture the complex distribution of real-world data. Generative adversarial network (GAN) approaches have shown impressive results in producing generative models of images, but relatively little work has been done on evaluating the performance of these methods for the learning representation of natural language, both in supervised and unsupervised set
APA, Harvard, Vancouver, ISO, and other styles
12

Xu, Shuo. "Bayesian Naïve Bayes classifiers to text classification." Journal of Information Science 44, no. 1 (2016): 48–59. http://dx.doi.org/10.1177/0165551516677946.

Full text
Abstract:
Text classification is the task of assigning predefined categories to natural language documents, and it can provide conceptual views of document collections. The Naïve Bayes (NB) classifier is a family of simple probabilistic classifiers based on a common assumption that all features are independent of each other, given the category variable, and it is often used as the baseline in text classification. However, classical NB classifiers with multinomial, Bernoulli and Gaussian event models are not fully Bayesian. This study proposes three Bayesian counterparts, where it turns out that classica
APA, Harvard, Vancouver, ISO, and other styles
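A hedged sketch of the classical multinomial Naive Bayes baseline that this entry takes as its starting point, using scikit-learn on 20 Newsgroups; the fully Bayesian counterparts proposed in the paper are not implemented here.

```python
# Minimal sketch: classical multinomial Naive Bayes on 20 Newsgroups;
# the paper's fully Bayesian variants are not shown.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train = fetch_20newsgroups(subset="train")
test = fetch_20newsgroups(subset="test")

nb = make_pipeline(CountVectorizer(stop_words="english"),
                   MultinomialNB(alpha=1.0))   # Laplace smoothing
nb.fit(train.data, train.target)
print("test accuracy:", nb.score(test.data, test.target))
```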
13

Yang, Jieming, Zhaoyang Qu, and Zhiying Liu. "Improved Feature-Selection Method Considering the Imbalance Problem in Text Categorization." Scientific World Journal 2014 (2014): 1–17. http://dx.doi.org/10.1155/2014/625342.

Full text
Abstract:
The filtering feature-selection algorithm is an important approach to dimensionality reduction in the field of text categorization. Most filtering feature-selection algorithms evaluate the significance of a feature for a category based on a balanced dataset and do not consider the imbalance factor of the dataset. In this paper, a new scheme was proposed, which can weaken the adverse effect caused by the imbalance factor in the corpus. We evaluated the improved versions of nine well-known feature-selection methods (Information Gain, Chi statistic, Document Frequency, Orthogonal Centroid F
APA, Harvard, Vancouver, ISO, and other styles
14

Aubaid, Asmaa M., and Alok Mishra. "A Rule-Based Approach to Embedding Techniques for Text Document Classification." Applied Sciences 10, no. 11 (2020): 4009. http://dx.doi.org/10.3390/app10114009.

Full text
Abstract:
With the growth of online information and the sudden expansion in the number of electronic documents provided on websites and in electronic libraries, there is difficulty in categorizing text documents. A rule-based approach is therefore a solution to this problem; the purpose of this study is to classify documents by using a rule-based approach. This paper deals with the rule-based approach together with the embedding technique of document to vector (doc2vec) files. An experiment was performed on two data sets, Reuters-21578 and 20 Newsgroups, to classify the top ten categories of these data sets by using a
APA, Harvard, Vancouver, ISO, and other styles
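A hedged sketch of the doc2vec embedding step on 20 Newsgroups with gensim; the paper's rule-based classification layer is not reproduced, a plain logistic regression is used on the document vectors instead, and the hyperparameters are illustrative assumptions.

```python
# Minimal sketch: doc2vec document vectors on 20 Newsgroups; the paper's
# rule-based classifier is replaced by a simple logistic regression.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from gensim.utils import simple_preprocess
from sklearn.datasets import fetch_20newsgroups
from sklearn.linear_model import LogisticRegression

train = fetch_20newsgroups(subset="train")
test = fetch_20newsgroups(subset="test")

tagged = [TaggedDocument(simple_preprocess(doc), [i])
          for i, doc in enumerate(train.data)]
d2v = Doc2Vec(tagged, vector_size=100, epochs=20, min_count=5, workers=4)

X_train = [d2v.dv[i] for i in range(len(train.data))]
X_test = [d2v.infer_vector(simple_preprocess(doc)) for doc in test.data]

clf = LogisticRegression(max_iter=1000).fit(X_train, train.target)
print("test accuracy:", clf.score(X_test, test.target))
```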
15

AL-TAHRAWI, MAYY M., and RAED ABU ZITAR. "POLYNOMIAL NETWORKS VERSUS OTHER TECHNIQUES IN TEXT CATEGORIZATION." International Journal of Pattern Recognition and Artificial Intelligence 22, no. 02 (2008): 295–322. http://dx.doi.org/10.1142/s0218001408006247.

Full text
Abstract:
Many techniques and algorithms for automatic text categorization have been devised and proposed in the literature. However, there is still much room for researchers in this area to improve existing algorithms or come up with new techniques for text categorization (TC). Polynomial Networks (PNs) have never been used before in TC. This can be attributed to the huge datasets used in TC, as well as to the technique itself, which has high computational demands. In this paper, we investigate and propose using PNs in TC. The proposed PN classifier has achieved a competitive classification performance in our e
APA, Harvard, Vancouver, ISO, and other styles
16

Thangamani, M., and P. Thangaraj. "Effective Fuzzy Ontology Based Distributed Document Using Non-Dominated Ranked Genetic Algorithm." International Journal of Intelligent Information Technologies 7, no. 4 (2011): 26–46. http://dx.doi.org/10.4018/jiit.2011100102.

Full text
Abstract:
The increase in the number of documents has aggravated the difficulty of classifying those documents according to specific needs. Clustering analysis in a distributed environment is a thrust area in artificial intelligence and data mining. Its fundamental task is to use document characteristics to compute the degree of correspondence between objects and to accomplish automatic classification without prior knowledge. Document clustering uses clustering techniques to group documents of high resemblance by computing document similarity. Recent studies have s
APA, Harvard, Vancouver, ISO, and other styles
17

Iwendi, Celestine, Suresh Ponnan, Revathi Munirathinam, Kathiravan Srinivasan, and Chuan-Yu Chang. "An Efficient and Unique TF/IDF Algorithmic Model-Based Data Analysis for Handling Applications with Big Data Streaming." Electronics 8, no. 11 (2019): 1331. http://dx.doi.org/10.3390/electronics8111331.

Full text
Abstract:
As the field of data science grows, document analytics has become a more challenging task for rough classification, response analysis, and text summarization. These tasks are used for the analysis of text data from various intelligent sensing systems. The conventional approach for data analytics and text processing is not useful for big data coming from intelligent systems. This work proposes a novel TF/IDF algorithm with the temporal Louvain approach to solve the above problem. Such an approach is supposed to help the categorization of documents into hierarchical structures showing the relati
APA, Harvard, Vancouver, ISO, and other styles
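For reference, this is the standard TF/IDF weighting the title refers to, computed from scratch on toy tokenized documents; the paper's temporal Louvain extension is not shown, and the toy data are assumptions.

```python
# Minimal sketch: plain TF-IDF weights computed from scratch (the paper's
# temporal Louvain component is not reproduced here).
import math
from collections import Counter

def tf_idf(docs):
    """Return a list of {term: tf-idf weight} dicts, one per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return weights

docs = [["spam", "filter", "email"], ["email", "newsgroup", "post"], ["spam", "spam", "offer"]]
print(tf_idf(docs)[0])
```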
18

UCHYIGIT, GULDEN, and KEITH CLARK. "A NEW FEATURE SELECTION METHOD FOR TEXT CLASSIFICATION." International Journal of Pattern Recognition and Artificial Intelligence 21, no. 02 (2007): 423–38. http://dx.doi.org/10.1142/s0218001407005466.

Full text
Abstract:
Text classification is the problem of classifying a set of documents into a pre-defined set of classes. A major problem with text classification problems is the high dimensionality of the feature space. Only a small subset of these words are feature words which can be used in determining a document's class, while the rest adds noise and can make the results unreliable and significantly increase computational time. A common approach in dealing with this problem is feature selection where the number of words in the feature space are significantly reduced. In this paper we present the experiments
APA, Harvard, Vancouver, ISO, and other styles
19

SONG, FENGXI, DAVID ZHANG, YONG XU, and JIZHONG WANG. "FIVE NEW FEATURE SELECTION METRICS IN TEXT CATEGORIZATION." International Journal of Pattern Recognition and Artificial Intelligence 21, no. 06 (2007): 1085–101. http://dx.doi.org/10.1142/s0218001407005831.

Full text
Abstract:
Feature selection has been extensively applied in statistical pattern recognition as a mechanism for cleaning up the set of features that are used to represent data and as a way of improving the performance of classifiers. Four schemes commonly used for feature selection are Exponential Searches, Stochastic Searches, Sequential Searches, and Best Individual Features. The most popular scheme used in text categorization is Best Individual Features as the extremely high dimensionality of text feature spaces render the other three feature selection schemes time prohibitive. This paper proposes fiv
APA, Harvard, Vancouver, ISO, and other styles
20

PERA, MARIA SOLEDAD, and YIU-KAI NG. "A NAÏVE BAYES CLASSIFIER FOR WEB DOCUMENT SUMMARIES CREATED BY USING WORD SIMILARITY AND SIGNIFICANT FACTORS." International Journal on Artificial Intelligence Tools 19, no. 04 (2010): 465–86. http://dx.doi.org/10.1142/s0218213010000285.

Full text
Abstract:
Text classification categorizes web documents in large collections into predefined classes based on their contents. Unfortunately, the classification process can be time-consuming and users are still required to spend a considerable amount of time scanning through the classified web documents to identify the ones with contents that satisfy their information needs. In solving this problem, we first introduce CorSum, an extractive single-document summarization approach, which is simple and effective in performing the summarization task, since it only relies on word similarity to generate high-qual
APA, Harvard, Vancouver, ISO, and other styles
21

Selvaraj, Suganya, and Eunmi Choi. "Swarm Intelligence Algorithms in Text Document Clustering with Various Benchmarks." Sensors 21, no. 9 (2021): 3196. http://dx.doi.org/10.3390/s21093196.

Full text
Abstract:
Text document clustering refers to the unsupervised classification of textual documents into clusters based on content similarity and can be applied in applications such as search optimization and extracting hidden information from data generated by IoT sensors. Swarm intelligence (SI) algorithms use stochastic and heuristic principles that include simple and unintelligent individuals that follow some simple rules to accomplish very complex tasks. By mapping features of problems to parameters of SI algorithms, SI algorithms can achieve solutions in a flexible, robust, decentralized, and self-o
APA, Harvard, Vancouver, ISO, and other styles
22

Balakumar, Janani, and S. Vijayarani Mohan. "Artificial bee colony algorithm for feature selection and improved support vector machine for text classification." Information Discovery and Delivery 47, no. 3 (2019): 154–70. http://dx.doi.org/10.1108/idd-09-2018-0045.

Full text
Abstract:
Purpose: Owing to the huge volume of documents available on the internet, text classification becomes a necessary task to handle these documents. To achieve optimal text classification results, feature selection, an important stage, is used to curtail the dimensionality of text documents by choosing suitable features. The main purpose of this research work is to classify the personal computer documents based on their content. Design/methodology/approach: This paper proposes a new algorithm for feature selection based on artificial bee colony (ABCFS) to enhance the text classification accuracy. T
APA, Harvard, Vancouver, ISO, and other styles
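Neither the ABCFS selector nor the improved SVM can be recovered from this excerpt; the hedged sketch below is only the ordinary TF-IDF plus linear SVM pipeline that such contributions plug into, run on 20 Newsgroups as an assumed stand-in corpus.

```python
# Minimal sketch: TF-IDF + linear SVM text classifier (not the paper's
# ABCFS feature selection or its improved SVM).
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

train = fetch_20newsgroups(subset="train")
test = fetch_20newsgroups(subset="test")

svm = make_pipeline(TfidfVectorizer(stop_words="english"), LinearSVC())
svm.fit(train.data, train.target)
print("test accuracy:", svm.score(test.data, test.target))
```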
23

Srilakshmi, V., K. Anuradha, and C. Shoba Bindu. "Incremental text categorization based on hybrid optimization-based deep belief neural network." Journal of High Speed Networks 27, no. 2 (2021): 183–202. http://dx.doi.org/10.3233/jhs-210659.

Full text
Abstract:
One of the effective text categorization methods for learning from large-scale and accumulated data is incremental learning. The major challenge in incremental learning is improving the accuracy, as the text document consists of numerous terms. In this research, an incremental text categorization method is developed using the proposed Spider Grasshopper Crow Optimization Algorithm based Deep Belief Neural network (SGrC-based DBN) for providing optimal text categorization results. The proposed text categorization method has four processes: pre-processing, feature extractio
APA, Harvard, Vancouver, ISO, and other styles
24

Parlak, Bekir, and Alper Kursat Uysal. "A novel filter feature selection method for text classification: Extensive Feature Selector." Journal of Information Science, April 13, 2021, 016555152199103. http://dx.doi.org/10.1177/0165551521991037.

Full text
Abstract:
As the huge dimensionality of textual data restrains classification accuracy, it is essential to apply feature selection (FS) methods as a dimension reduction step in the text classification (TC) domain. Most FS methods for TC involve a number of probabilities in their calculations. In this study, we propose a new FS method named Extensive Feature Selector (EFS), which benefits from corpus-based and class-based probabilities in its calculations. The performance of EFS is compared with nine well-known FS methods, namely, Chi-Squared (CHI2), Class Discriminating Measure (CDM), Discriminative Power Meas
APA, Harvard, Vancouver, ISO, and other styles
25

"AEAO: Auto Encoder with Adam Optimizer Method for Efficient Document Indexing Of Big Data." International Journal of Recent Technology and Engineering 8, no. 3 (2019): 3933–42. http://dx.doi.org/10.35940/ijrte.c5141.098319.

Full text
Abstract:
In the big data era, document classification has become an active research area due to the explosive growth in the volume of data. Document indexing is one of the important tasks in text classification. The objective of this research is to increase the performance of document indexing by introducing the Adam optimizer into the auto-encoder. Due to the large dimensionality and the multi-class classification problem, the accuracy of document indexing is reduced. In this paper, an enhanced auto-encoder is used, based on the objective function of the Adam optimization (AEAO), which improves the learning ra
APA, Harvard, Vancouver, ISO, and other styles
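A hedged sketch of an auto-encoder trained with the Adam optimizer in PyTorch, on random stand-in document vectors; the paper's full AEAO indexing pipeline is not reproduced and all sizes are illustrative assumptions.

```python
# Minimal sketch: a small document auto-encoder trained with Adam on
# random TF-IDF-like inputs (stand-in data, not the paper's AEAO pipeline).
import torch
import torch.nn as nn

class DocAutoencoder(nn.Module):
    def __init__(self, input_dim=5000, code_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 512), nn.ReLU(),
                                     nn.Linear(512, code_dim))
        self.decoder = nn.Sequential(nn.Linear(code_dim, 512), nn.ReLU(),
                                     nn.Linear(512, input_dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = DocAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

X = torch.rand(256, 5000)        # stand-in for TF-IDF document vectors
for epoch in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(X), X)  # reconstruct the input
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```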
26

Abd Al-Ameer, Esraa H., and Ahmed H. Aliwy. "English Text Classification Using Improved Recursive Feature Elimination (IRFE) Algorithm: تصنيف النص الإنجليزي باستخدام الخوارزمية العودية المحسنة لإزالة الخواص (IRFE)." Journal of Engineering Sciences and Information Technology 4, no. 2 (2020). http://dx.doi.org/10.26389/ajsrp.r080420.

Full text
Abstract:
Document classification is one of the most important fields in natural language processing and text mining. Many algorithms can be used for this task. This paper focuses on improving text classification through feature selection, i.e. retaining only some of the original features without affecting the accuracy of the work. A new feature selection method is suggested as a general formulation and mathematical model of Recursive Feature Elimination (RFE). The method was compared with two other well-known feature selection methods: Chi-square and threshold. T
APA, Harvard, Vancouver, ISO, and other styles
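A hedged sketch of the baseline this paper generalizes: scikit-learn's standard Recursive Feature Elimination over TF-IDF features with a linear SVM ranker, on a small two-class slice of 20 Newsgroups (an assumed stand-in corpus); the improved IRFE formulation itself is not reproduced.

```python
# Minimal sketch: standard RFE with a linear SVM ranker (not the paper's IRFE).
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import RFE
from sklearn.svm import LinearSVC

# A small two-class slice keeps the elimination loop fast.
cats = ["sci.space", "rec.autos"]
train = fetch_20newsgroups(subset="train", categories=cats)
test = fetch_20newsgroups(subset="test", categories=cats)

tfidf = TfidfVectorizer(stop_words="english", max_features=5000)
X_train = tfidf.fit_transform(train.data)
X_test = tfidf.transform(test.data)

rfe = RFE(LinearSVC(), n_features_to_select=500, step=0.1)
rfe.fit(X_train, train.target)
print("test accuracy with 500 features:", rfe.score(X_test, test.target))
```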
27

"Text Classification Using Ensemble Of Non-Linear Support Vector Machines." VOLUME-8 ISSUE-10, AUGUST 2019, REGULAR ISSUE 8, no. 10 (2019): 3169–74. http://dx.doi.org/10.35940/ijitee.j9520.0881019.

Full text
Abstract:
With the advent of the digital era, billions of documents are generated every day that need to be managed, processed and classified. An enormous amount of text data is available on the world wide web and other sources. The first step in managing this mammoth amount of data is the classification of the available documents into the right categories. Supervised machine learning approaches try to solve the problem of document classification, but working on large data sets of heterogeneous classes is a big challenge. Automatic tagging and classification of text documents is a useful task due to its many potential applications su
APA, Harvard, Vancouver, ISO, and other styles
28

Esau, Katharina. "Incivility (Hate Speech/Incivility)." DOCA - Database of Variables for Content Analysis, March 26, 2021. http://dx.doi.org/10.34778/5c.

Full text
Abstract:
The variable incivility is an indicator used to describe violations of communication norms. These norms can be social norms established within a society, a culture or parts of a society (e.g. a social class, milieu or group) or democratic norms established within a democratic society. In this sense incivility is associated with behaviors that threaten a collective face or a democratic society, deny people their personal freedoms, and stereotype individuals or social groups. Furthermore, some scholars include impoliteness into the concept of incivility and argue that the two concepts have no cl
APA, Harvard, Vancouver, ISO, and other styles
29

Esau, Katharina. "Impoliteness (Hate Speech/Incivility)." DOCA - Database of Variables for Content Analysis, March 26, 2021. http://dx.doi.org/10.34778/5b.

Full text
Abstract:
The variable impoliteness is an indicator used to describe violations of communication norms. These norms can be social norms established within a society, a culture or parts of a society (e.g. a social class, milieu or group). In this sense impoliteness is associated with, among other things, aggressive, offensive or derogatory communication expressed directly or indirectly to other individuals or parties. More specifically name calling, vulgar expressions or aspersions are classified as examples of impolite statements (e.g. Papacharissi, 2004; Seely, 2017). While some scholars distinguish be
APA, Harvard, Vancouver, ISO, and other styles