Journal articles on the topic 'Stemming Techniques'

Author: Grafiati

Published: 3 June 2025

Last updated: 20 July 2025

Consult the top 50 journal articles for your research on the topic 'Stemming Techniques.'

1

Kaur, Prabhjot. "REVIEW ON STEMMING TECHNIQUES." International Journal of Advanced Research in Computer Science 9, no. 5 (2018): 64–68. http://dx.doi.org/10.26483/ijarcs.v9i5.6308.

2

Razmi, Nurul Atiqah, Muhammad Zharif Zamri, Sharifah Syafiera Syed Ghazalli, and Noraini Seman. "Visualizing stemming techniques on online news articles text analytics." Bulletin of Electrical Engineering and Informatics 10, no. 1 (2021): 365–73. http://dx.doi.org/10.11591/eei.v10i1.2504.

Abstract:

Stemming is the process of converting words into their root forms by means of a stemming algorithm. It is one of the main steps in text analytics, where text data must be stemmed before further analysis. Text analytics is now commonly used to analyze the content of text data from sources such as the mass media and social media. In this study, two stemming techniques, Porter and Lancaster, are evaluated. The differences between their outputs are discussed in terms of the stemming error and the resulting visualization. The findings show that Porter stemming performs better than Lancaster stemming, by 43%, based on the stemming error produced. Stemmed text data can still be visualized, but tool users need some background understanding of the text data to interpret the visualization outputs correctly.
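One common way to quantify "stemming error" is to count under-stemming (related words that end up with different stems) and over-stemming (unrelated words that collapse to the same stem). The sketch below is only an illustration of that idea: the abstract does not say how the authors computed their error figure, and the two toy stemmers (`light`, `aggressive`) and the word groups are hypothetical stand-ins, not the Porter or Lancaster algorithms.

```python
from itertools import combinations

def light(word):
    """Toy conservative stemmer: strips a few suffixes (assumption, not Porter)."""
    for suf in ("ing", "ed", "s"):
        if word.endswith(suf) and len(word) - len(suf) >= 3:
            return word[:-len(suf)]
    return word

def aggressive(word):
    """Toy aggressive stemmer: keeps only the first three letters (not Lancaster)."""
    return word[:3]

# Hypothetical gold standard: each inner list holds words that share one root.
GROUPS = [
    ["connect", "connected", "connecting", "connection"],
    ["console", "consoles"],
]

def stemming_errors(stem, groups):
    """Return (under_stemming, over_stemming) pair counts for a stemmer."""
    # Under-stemming: two words of the same group receive different stems.
    under = sum(
        1 for g in groups for a, b in combinations(g, 2) if stem(a) != stem(b)
    )
    # Over-stemming: two words of different groups receive the same stem.
    over = sum(
        1
        for g1, g2 in combinations(groups, 2)
        for a in g1
        for b in g2
        if stem(a) == stem(b)
    )
    return under, over

print(stemming_errors(light, GROUPS))       # (3, 0)
print(stemming_errors(aggressive, GROUPS))  # (0, 8)
```

On this toy data the conservative stemmer only under-stems and the aggressive one only over-stems, which is the kind of trade-off a comparison between a lighter and a heavier stemmer has to weigh.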

4

Muhammad Hassaan Rafiq, Shiza Gul Niazi, and Subaika Ali. "Comparative Analysis of Urdu Based Stemming Techniques." Lahore Garrison University Research Journal of Computer Science and Information Technology 2, no. 3 (2018): 11–14. http://dx.doi.org/10.54692/lgurjcsit.2018.020348.

Abstract:

Stemming reduces the variant forms of a word to its base, stem, or root, which is necessary for many language processing applications, including those for Urdu. Urdu is a morphologically rich and resourceful language. Multilingual Urdu words are very challenging to process because of the complexity of their morphology. Research on Urdu stemming is about a decade old. The present work introduces research on Urdu stemmers with better performance compared to the existing Urdu stemmer.

5

Singh, Jasmeet, and Vishal Gupta. "A systematic review of text stemming techniques." Artificial Intelligence Review 48, no. 2 (2016): 157–217. http://dx.doi.org/10.1007/s10462-016-9498-2.

6

Ullah, Sami, Muhammad Ibrahim, Tufail Ahmed, Asif Abbas, and Roshan Khan. "AN ASSESSMENT OF STEMMING TECHNIQUES AND THE PERFORMANCE OF STEMMING PLUGS IN BENCH BLASTING OPERATIONS IN PAKISTAN." Journal of Mountain Area Research 9 (May 8, 2024): 54. http://dx.doi.org/10.53874/jmar.v9i0.187.

Abstract:

This research article presents a comparative analysis of stemming techniques and the performance of stemming plugs in bench blasting operations, based on a case study carried out in cement quarries in Pakistan. For image analysis, the Wip-Frag software is used to demonstrate the benefits of stemming plugs over traditional methods. Real-time images captured during blasting at various cement quarries reveal a substantial presence of boulders within the muck pile when conventional stemming materials such as drill cuttings and clay are used. This typically requires additional time and cost for secondary drilling and blasting or hydraulic hammering. Blast test results indicate that stemming plugs improved the throw and shape of the muck pile and significantly decreased the percentage of boulders. This study also compares the results of the Wip-Frag software with sieve analysis of drill cuttings, indicating potential limitations in the software's reliability for particle size distribution. These results contribute to improving blasting operations and emphasize the importance of proper stemming techniques. By improving blasting efficiency, the use of stemming plugs can potentially minimize operational costs and improve overall performance.

7

Carolina, Vinnesa Patricia, Ema Utami, and Ainul Yaqin. "Exploring Stemming Techniques in Ambon Malay Languages: A Systematic Literature Review." Jambura Journal of Informatics 6, no. 1 (2024): 01–13. http://dx.doi.org/10.37905/jji.v6i1.24954.

Abstract:

Stemming in Ambonese posed a significant challenge due to its extensive lexicon, encompassing approximately 127,000 base words as recorded in the Kamus Besar Bahasa Indonesia (Indonesian Dictionary). This complexity arises from the task of extracting base words from those with affixes, necessitating the removal of various affixes such as prefixes, infixes, suffixes, and their combinations. This process greatly influences analytical outcomes. To address this linguistic complexity, several stemming algorithms were developed. These include Nazief Adriani, Enhanced Confix Stripping, Sastrawi, and Tala, each offering unique techniques to handle stemming complexities in Indonesian. The selection of the appropriate algorithm is crucial for ensuring the accuracy and reliability of the stemming process within the analytical framework. In conducted stemming research, there were variations in methods used. The most frequently used algorithm was Nazief Adriani, with 17 recorded cases, followed by Enhanced Confix Stripping with 12 cases. Sastrawi, although less frequent, was used in 4 cases, while Tala appeared in 1 case. This diversity reflects the available choices in selecting a fitting stemming method. However, this may relate to factors such as ongoing research projects, funding availability, or other external conditions affecting research production during that period. Consequently, stemming research remains an interesting and relevant topic, with the potential for continued growth and significant contributions to text processing and linguistic research in the future.

8

Mustafa, Mohammad, Afag Salah Aldeen, Mohammed E. Zidan, Rihab E. Ahmed, and Yasir Eltigani. "Developing Two Different Novel Techniques for Arabic Text Stemming." Intelligent Information Management 11, no. 01 (2019): 1–23. http://dx.doi.org/10.4236/iim.2019.111001.

9

M., Rouhia, Hamdy M., and Mahmoud Hussein. "Improving Arabic Text Categorization using Normalization and Stemming Techniques." International Journal of Computer Applications 135, no. 2 (2016): 38–43. http://dx.doi.org/10.5120/ijca2016908328.

10

Hawraa Fadhil Khelil, Mohammed Fadhil Ibrahim, Hafsa Ataallah Hussein, and Raed Kamil Naser. "Evaluation of Different Stemming Techniques on Arabic Customer Reviews." Journal of Techniques 6, no. 2 (2024): 1–8. http://dx.doi.org/10.51173/jt.v6i2.2313.

Abstract:

Customer opinions and reviews play a vital role in marketing expansion. Big companies all over the world devote much of their effort to analyzing customers’ feedback to keep track of their needs. Natural Language Processing (NLP) is widely used to analyze such review texts. Arabic customer analysis and classification has also begun to gain researchers’ attention due to the large number of Arabic speakers. Working with the Arabic language is very challenging because of the orthographic nature of Arabic. Also, customers often write their reviews in their own dialect, which often diverges from standard Arabic. This study presents a method to classify Arabic customer reviews using four classifiers: K-nearest Neighbor (KNN), Support Vector Machine (SVM), Logistic Regression (LR), and Naïve Bayes (NB). The classification is implemented with three stemming techniques (Snowball, Khoja, and Tashaphyne). The HARD dataset is adopted to perform the experiments. The results show that stemming methods can enhance classification performance despite the complexity of Arabic scripts and dialects. This work sheds light on utilizing and investigating more machine learning (ML) techniques and evaluating the results.

11

Mustafa, Suleiman H. "Combining N-Grams and Stemming for Arabic Word-Based Inexact Matching and Term Conflation." Journal of Information & Knowledge Management 04, no. 01 (2005): 29–36. http://dx.doi.org/10.1142/s0219649205000992.

Abstract:

In this paper, the results of three N-gram techniques have been reported. Two of these techniques were based on the idea of combining N-grams and stemming. The first used first-order stemming, while the other used light stemming. The performance of the combined approach was then compared with that of pure conventional N-gram-based string matching. The results provide good evidence that combining N-grams with stemming improves the overall performance, as measured by word-match recall and word-match precision, using different similarity threshold values.
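The combined approach described in this abstract can be sketched roughly as follows: stem both words first, then compare their character N-gram sets against a similarity threshold. This is a minimal illustration under stated assumptions, not the paper's method. It uses bigrams with the Dice coefficient, English words, and a hypothetical suffix list (`SUFFIXES`) in place of an Arabic light stemmer, and the threshold value is arbitrary.

```python
def bigrams(word):
    """Return the set of character bigrams of a word."""
    return {word[i:i + 2] for i in range(len(word) - 1)}

def dice(a, b):
    """Dice coefficient between the bigram sets of two strings."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        # Strings shorter than two characters have no bigrams.
        return 1.0 if a == b else 0.0
    return 2 * len(x & y) / (len(x) + len(y))

# Hypothetical suffix list, for illustration only.
SUFFIXES = ("ing", "ers", "er", "ed", "s")

def light_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suf in SUFFIXES:
        if word.endswith(suf) and len(word) - len(suf) >= 3:
            return word[:-len(suf)]
    return word

def words_match(a, b, threshold=0.7, stem=True):
    """Inexact match: optionally stem both words, then threshold the Dice score."""
    if stem:
        a, b = light_stem(a), light_stem(b)
    return dice(a, b) >= threshold

print(dice("computers", "computing"))                     # 0.625
print(words_match("computers", "computing", stem=False))  # False
print(words_match("computers", "computing", stem=True))   # True
```

With pure bigram matching the two variants fall below the threshold, while stemming first maps both to the same string, so they match. That mirrors the recall gain the abstract attributes to combining the two techniques.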

12

Bounabi, Mariem, Karim El Moutaouakil, and Khalid Satori. "A comparison of text classification methods using different stemming techniques." International Journal of Computer Applications in Technology 60, no. 4 (2019): 298. http://dx.doi.org/10.1504/ijcat.2019.101171.

13

Yohana, Karuniawati Paskahningrum, Utami Ema, and Yaqin Ainul. "A Systematic Literature Review of Stemming in Non-Formal Indonesian Language." International Journal of Innovative Science and Research Technology 8, no. 1 (2023): 62–69. https://doi.org/10.5281/zenodo.7547482.

Abstract:

This study tries to review studies on the stemming process in non-standard Indonesian Language. This is done to understand the methods that researchers use to collect data from various sources, process the data that has been collected, and classify data so that it becomes information that is easier to understand. The researcher collected, filtered, and reviewed the discovered research papers using a Systematic Literature Review approach. We collect research works from ScienceDirect, IEEE, arXiv, ACM Digital Library, Semantic Scholar, Google Scholar, Springer, and Elsevier by choosing research published from 2016 to 2022. The purpose of the researcher conducting this literature review is to understand stemming in Indonesian, gain an understanding of data collection techniques, stemming methods, and study stemming results from previous research. This study collects and summarizes twenty-seven stemming studies on the Indonesian language, selected from the forty-seven previously collected studies. The study was conducted regarding how to collect data, the language-stemming research methods used, and the stemming research results.

14

Athallah, Muhammad Rafi, and Kemas Muslim Lhaksmana. "Hadith Text Classification Based on Topic Using Convolutional Neural Network (CNN) and TF-IDF." Journal of Renewable Energy, Electrical, and Computer Engineering 5, no. 1 (2025): 30–36. https://doi.org/10.29103/jreece.v5i1.20354.

Abstract:

This study develops a hadith classification system based on Convolutional Neural Networks (CNN) to categorize texts by specific topics or categories. It compares two text representation techniques, Term Frequency-Inverse Document Frequency (TF-IDF) and Word2Vec, with and without stemming applied in the process. The study uses category IDs 0-5. About 2,845 records were processed for testing. The data was split 80:20 into training and testing sets. Several models were then evaluated, namely Word2Vec with stemming, TFIDFCNN without stemming, and TFIDFCNN with stemming. Accuracy, precision, recall, and F1 score were used to assess performance. The results show that the TFIDFCNN model without stemming performs best, with 85% accuracy in topic-based text classification. This is attributed to the stability and efficiency of the model in processing data.

15

Kumar, K. Dileep. "Multilingual Hate Speech Detection Using NLP Techniques." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 03 (2025): 1–9. https://doi.org/10.55041/ijsrem42063.

Abstract:

Hate speech detection in multiple languages has emerged as a significant challenge in Natural Language Processing (NLP), primarily due to the diverse linguistic structures, cultural nuances, and variations in contextual meanings across languages. Unlike monolingual hate speech detection, which relies on well-established lexicons and training datasets, multilingual detection requires sophisticated models capable of handling code-switching, dialectal variations, and the absence of extensive labeled data for many languages. We explore various NLP techniques, including machine learning models, deep learning architectures, and transformer-based approaches for detecting hate speech across different languages. A critical aspect of hate speech detection is text preprocessing, which varies depending on the language. Preprocessing techniques include tokenization, stopword removal, stemming, lemmatization, and the handling of emojis, slang, and abbreviations commonly found in online discourse. Additionally, we examine feature engineering methods, including Term Frequency-Inverse Document Frequency (TF-IDF), word embeddings (Word2Vec, GloVe, FastText), and contextual embeddings generated by transformer models.

Key Words: Hate speech detection, NLP, multilingual, machine learning, deep learning, tokenization, stopword removal, stemming, lemmatization

16

Alshammari, Nasser O., and Fawaz D. Alharbi. "Combining a Novel Scoring Approach with Arabic Stemming Techniques for Arabic Chatbots Conversation Engine." ACM Transactions on Asian and Low-Resource Language Information Processing 21, no. 4 (2022): 1–21. http://dx.doi.org/10.1145/3511215.

Abstract:

Arabic is recognized as one of the main languages around the world. Many attempts and efforts have been done to provide computing solutions to support the language. Developing Arabic chatbots is still an evolving research field and requires extra efforts due to the nature of the language. One of the common tasks of any natural language processing application is the stemming step. It is important for developing chatbots, since it helps with pre-processing the input data and it can be involved with different phases of the chatbot development process. The aim of this article is to combine a scoring approach with Arabic stemming techniques for developing an Arabic chatbot conversation engine. Two experiments are conducted to evaluate the proposed solution. The first experiment is to select which stemmer is more accurate when applying our solution, since our algorithm can support various stemmers. The second experiment was conducted to evaluate our proposed approach against various machine learning models. The results show that the ISRIS stemming algorithm is the best fit for our solution with accuracy 78.06%. The results also indicate that our novel solution achieved an F1 score of 65.5%, while the other machine learning models achieved slightly lower scores. Our study presents a novel technique by combining scoring mechanisms with stemming processes to produce the best answer for every query sent by chatbots users compared to other approaches. This can be helpful for developing Arabic chatbot and can support many domains such as education, business, and health. This technique is among the first techniques that developed purposefully to serve the development of Arabic chatbots conversation engine.

17

Jabbar, Abdul, Sajid Iqbal, Muhammad Usman Ghani Khan, and Shafiq Hussain. "A survey on Urdu and Urdu like language stemmers and stemming techniques." Artificial Intelligence Review 49, no. 3 (2016): 339–73. http://dx.doi.org/10.1007/s10462-016-9527-1.

18

Ashwini Brahme. "Association Rule Mining and Information Retrieval Using Stemming and Text Mining Techniques." Journal of Information Systems Engineering and Management 10, no. 18s (2025): 622–28. https://doi.org/10.52783/jisem.v10i18s.2958.

Abstract:

Mining heterogeneous, complex, and enormous data plays a significant role in today’s big data scenario across the globe. This research paper focuses on natural language processing, mining of textual data, and pattern discovery through association rule mining. The research aims to mine digital news about epidemic diseases and to generate hidden patterns from the corpus data. The study also aims to develop a knowledge discovery system for healthcare that predicts epidemic viral diseases and their related measures, which will help healthcare experts, doctors, healthcare organizations, and governments take precautionary measures. The study is designed for predictive analytics of epidemic diseases and their patterns using association rule mining. The proposed system generates precautionary healthcare measures and identifies the geographical locations most affected by widespread diseases.

19

Jefriyanto, Jefriyanto, Nur Ainun, and Muchamad Arif Al Ardha. "Application of Naïve Bayes Classification to Analyze Performance Using Stopwords." Journal of Information System, Technology and Engineering 1, no. 2 (2023): 49–53. http://dx.doi.org/10.61487/jiste.v1i2.15.

Abstract:

Based on current data, there has been an increase in social media users, which shows that more and more people are using social media as a place to express themselves and their emotions. This will generate thousands of tweets within a day. The tweet data is processed so that it is useful for stakeholders who need it to help them make a decision. Because sentence structures on social media are often irregular, pre-processing is necessary to make tweet sentences normal. Stemming and Stopwords are pre-processing techniques that are widely used in sentiment analysis. In previous studies, there were indications that its use did not have a significant effect on accuracy. In this study, the authors divide it into four models: using stemming and stopwords and without using stemming and stopwords. Data using stemming gets the best results with an f1-score of 65%. These results indicate an increase in performance in the use of stemming and stopwords using Multi-class Naive Bayes

20

Aufar, Arrizqi Fauzy, Mochamad Alfan Rosid, Ade Eviyanti, and Ika Ratna Indra Astutik. "Optimizing Text Preprocessing for Accurate Sentiment Analysis on E-Wallet Reviews." JICTE (Journal of Information and Computer Technology Education) 7, no. 2 (2023): 42–50. https://doi.org/10.21070/jicte.v7i2.1650.

Abstract:

This research aims to optimize preprocessing techniques in sentiment analysis of reviews for the E-Wallet Dana application on the Google Play Store. Text preprocessing is a crucial step in Natural Language Processing (NLP) that affects the accuracy and efficiency of sentiment analysis. This study employs various preprocessing methods, including stopwords removal, stemming, and lemmatization, to clean and prepare the review data before analysis. The results show that lemmatization techniques significantly improve accuracy compared to basic preprocessing techniques such as stopwords removal and stemming. With proper preprocessing optimization, sentiment analysis can provide more accurate and informative results, which can be used to enhance the application's quality and user experience. This study uses SVM classification testing models with 4 kernels, where the highest results were achieved with cleaning, case folding, tokenization, and lemmatization techniques at 100% for Linear, 100% for RBF, 99% for Polynomial, and 99.50% for Sigmoid, with an average accuracy of 99.63%.

Highlights:

Preprocessing Optimization: Lemmatization significantly improves sentiment analysis accuracy compared to basic techniques like stemming or stopword removal.

High SVM Accuracy: The best preprocessing combination achieved an average accuracy of 99.63% across multiple SVM kernels, with linear and RBF kernels reaching 100%.

Real-World Dataset: Analysis used 1000 authentic reviews from Google Play Store, highlighting practical insights for improving E-Wallet services like DANA.

Keywords: DANA, Google Play Store, Preprocessing, Sentiment Analysis

21

Ramalingam, Sugumar, and M. Rama Priya. "IMPROVED PERFORMANCE OF STEMMING USING ENHANCED PORTER STEMMER ALGORITHM FOR INFORMATION RETRIEVAL." INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY 7, no. 4 (2018): 681–86. https://doi.org/10.5281/zenodo.1228745.

Abstract:

In the era of digitalization, information retrieval (IR), which retrieves and ranks documents from large collections according to users' search queries, has been applied in several domains. Building electronic records and searching the literature for topics of interest are some IR use cases. Meanwhile, Natural Language Processing (NLP) techniques such as tokenization, stop word removal, stemming, and Part-Of-Speech (POS) tagging have been developed for processing documents or literature. This study suggests that NLP can be incorporated into IR to strengthen conventional IR models. This paper proposes the Enhanced Porter Stemmer algorithm to improve the efficiency of pre-processing in text mining. The Enhanced Porter Stemmer algorithm is an extended version of the New Porter stemmer. Its performance is compared with several algorithms, such as Porter and New Porter, and it performs better than the others.

22

G. Anish Kumar and Dr. C Jayapratha. "Twitter Sentiment Analysis Using Machine Learning Techniques." International Journal of Scientific Research in Science and Technology 12, no. 4 (2025): 01–04. https://doi.org/10.32628/ijsrst251241.

Abstract:

This paper presents an effective sentiment analysis system designed to classify the polarity of tweets into positive, negative, or neutral sentiments. The framework utilizes supervised machine learning algorithms, including Logistic Regression, Support Vector Machines (SVM), and Random Forest, trained on the Sentiment140 dataset. Text preprocessing techniques such as tokenization, stopword removal, stemming, and TF-IDF vectorization are applied to improve classification performance. The proposed system achieves an accuracy of 87.2% with SVM, outperforming other baseline models. This solution offers scalable deployment in social media monitoring, political campaign tracking, and customer feedback analysis.

23

Namly, Driss, and Karim Bouzoubaa. "An innovative Arabic light stemmer developed using a hybrid approach." International Journal of Electrical and Computer Engineering (IJECE) 15, no. 2 (2025): 2356–63. https://doi.org/10.11591/ijece.v15i2.pp2356-2363.

Abstract:

Our study introduces an innovative light stemming tool tailored for Arabic morphology challenges. In conformance with the templatic and concatenative structures, our stemmer utilizes a combination of clitic stripping, lexicon-based, and statistical disambiguation techniques to ensure accurate stemming. To accomplish this, we rely on our clitic rules lexicon to detect all potential combinations of clitics for each input entry. Subsequently, we depend on an extensive lexicon of over 7 million stems to verify the potential stems. Lastly, we employ a statistical model to ascertain the most likely stem based on the sentence's context. Experimental results demonstrate the effectiveness of the proposed stemmer in comparison with existing ones. Using different datasets, our stemmer achieves higher accuracy and F1 scores, highlighting its efficiency in Arabic stemming tasks.
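The three stages named in this abstract (clitic stripping, lexicon verification, statistical disambiguation) can be sketched as a small pipeline. Everything in the sketch is an assumption made for illustration: the transliterated clitic lists, the tiny `LEXICON`, and in particular the use of a plain frequency count as the "statistical model", whereas the authors describe a model based on sentence context. It is not their implementation.

```python
# Hypothetical transliterated clitics; the empty string means "no clitic".
PROCLITICS = ("", "al", "wa", "wal")
ENCLITICS = ("", "ha", "hum")

# Hypothetical stem lexicon mapping stem -> toy corpus frequency.
LEXICON = {"kitab": 120, "walad": 80, "lad": 1}

def candidates(word):
    """Stage 1: generate every stem left after stripping a clitic combination."""
    found = set()
    for pre in PROCLITICS:
        for post in ENCLITICS:
            if (
                word.startswith(pre)
                and word.endswith(post)
                and len(word) > len(pre) + len(post)
            ):
                found.add(word[len(pre):len(word) - len(post)])
    return found

def stem(word):
    """Stages 2 and 3: keep lexicon-attested candidates, pick the likeliest."""
    valid = [c for c in candidates(word) if c in LEXICON]
    if not valid:
        return word  # no attested stem: leave the word unchanged
    # Frequency stands in for the context-based model described in the abstract.
    return max(valid, key=lambda c: LEXICON[c])

print(stem("walkitabhum"))  # kitab
print(stem("walad"))        # walad
```

The second call shows why the disambiguation step matters: both "walad" and the over-stripped "lad" are in the toy lexicon, and the scoring step keeps the more plausible stem.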

25

Saeed, Ari M., Tarik A. Rashid, Arazo M. Mustafa, Polla Fattah, and Birzo Ismael. "Improving Kurdish Web Mining through Tree Data Structure and Porter’s Stemmer Algorithms." UKH Journal of Science and Engineering 2, no. 1 (2018): 48–54. http://dx.doi.org/10.25079/ukhjse.v2n1y2018.pp48-54.

Abstract:

Stemming is one of the most important preprocessing techniques for enhancing the accuracy of text classification. The key purpose of stemming is to combine words that share the same stem in order to reduce the high dimensionality of the feature space. Reducing the feature space decreases the time needed to construct a model and minimizes memory usage. In this paper, a new stemming approach is explored for enhancing Kurdish text classification performance. Tree data structure and Porter’s stemmer algorithms are incorporated to build the proposed approach. The system is assessed using Support Vector Machine (SVM) and Decision Tree (C4.5) classifiers to illustrate performance before and after applying the suggested stemmer. Furthermore, the usefulness of stop words is considered before and after implementing the suggested approach.
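The dimensionality argument in this abstract can be made concrete by counting distinct features (vocabulary entries) before and after stemming. The sketch below assumes a toy English suffix stripper (`toy_stem`) and three invented sentences. It does not reproduce the Kurdish stemmer or the tree structure from the paper.

```python
def toy_stem(word):
    """Toy suffix stripper, for illustration only."""
    for suf in ("ing", "ed", "s"):
        if word.endswith(suf) and len(word) - len(suf) >= 3:
            return word[:-len(suf)]
    return word

def vocabulary(docs, stem=None):
    """Collect the distinct whitespace tokens of a corpus, optionally stemmed."""
    vocab = set()
    for doc in docs:
        for token in doc.lower().split():
            vocab.add(stem(token) if stem else token)
    return vocab

# Hypothetical mini-corpus.
DOCS = ["she walks home", "he walked home", "they are walking"]

print(len(vocabulary(DOCS)))            # 8
print(len(vocabulary(DOCS, toy_stem)))  # 6
```

Here "walks", "walked", and "walking" collapse into a single feature, so a bag-of-words classifier trained on the stemmed corpus would work with 6 columns instead of 8.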

26

Ivanedra, Kasyfi, and Metty Mustikasari. "Implementasi Metode Reccurrent Neural Network pada Text Summarization dengan Teknik Abstraktif." Jurnal Teknologi Informasi dan Ilmu Komputer 6, no. 4 (2019): 377. http://dx.doi.org/10.25126/jtiik.2019641067.

Abstract:

Text summarization is an application of Artificial Intelligence (AI) in which a computer summarizes the text of an article so that humans can draw conclusions from long articles without having to read them in full. Abstractive techniques can summarize text more naturally, as humans do. The summaries produced by abstractive techniques are more in context than those of extractive techniques, which only arrange sentences based on word frequency. Producing a text summarization system with an abstractive technique requires Deep Learning using a Recurrent Neural Network (RNN), rather than a simple Artificial Neural Network (ANN); the RNN computes weights repeatedly in order to improve accuracy. The type of RNN used is LSTM (Long Short Term Memory), which covers the shortcoming of the RNN that it cannot store memory to be sorted, and an Attention mechanism is added so that each word can focus more on the context. This study examines Precision, Recall, and F-Measure by comparing the summaries produced by the system with summaries made by humans. The dataset consists of 4515 news articles. Testing was divided between data processed with stemming and without stemming. For non-stemmed news articles, the average recall is 41%, precision is 81%, and F-measure is 54.27%. For news articles with stemming, the average recall is 44%, precision is 88%, and F-measure is 58.20%.

27

AARAB, Abdelkrim, Ahmed Oussous, and Mohammed Saddoune. "Review on Recent Arabic Information Retrieval Techniques." EAI Endorsed Transactions on Internet of Things 8, no. 3 (2022): e5. http://dx.doi.org/10.4108/eetiot.v8i3.2276.

Abstract:

Information retrieval is an important field that aims to provide documents relevant to a user's information need, expressed through a query. Arabic is a challenging language that has recently gained much attention in the information retrieval domain. To overcome the problems related to its complexity, many studies and techniques have been presented, most of them addressing the stemming problem. This paper presents an overview of the Arabic information retrieval process, including various text processing techniques, ranking approaches, evaluation measures, and some important information retrieval models. Finally, the paper presents some recent related studies and approaches in different Arabic information retrieval fields.

28

K Alese, Boniface, Aderonke Favour-Bethy Thompson, Olufunso Dayo Alowolodu, and Blessing Emmanuel Oladele. "Multilevel Authentication System for Stemming Crime in Online Banking." Interdisciplinary Journal of Information, Knowledge, and Management 13 (2018): 079–94. http://dx.doi.org/10.28945/4063.

Abstract:

Aim/Purpose: The wide use of online banking and technological advancement has attracted the interest of malicious and criminal users with a more sophisticated form of attacks. Background: Therefore, banks need to adapt their security systems to effectively stem threats posed by imposters and hackers and to also provide higher security standards that assure customers of a secured environment to perform their financial transactions. Methodology: The use of authentication techniques that include the mutual secure socket layer authentication embedded with some specific features. Contribution: An approach was made through this paper towards providing a more reliable and complete solution for implementing multi-level user authentication in a banking environment. Findings: The use of soft token as the final stage of authentication provides ease of management with no additional hardware requirement. Recommendations for Practitioners: This work is an approach made towards providing a more reliable and complete solution for implementing multi-level user authentication in a banking environment to stem cybercrime. Recommendation for Researchers: With this approach, a reliable system of authentication is being suggested to stem the growing rate of hacking activities in the information technology sector. Impact on Society: This work, if adopted, will give the entire populace confidence in carrying out online banking without fear of any compromise. Future Research: This work can be adopted to model a real-life scenario.

29

Gavali, Mrs A. V., and Mr Sunil B. Hebbale. "A Rule Based Answer Extraction System with Stemming & Anaphora Resolution." INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY 11, no. 2 (2013): 2256–61. http://dx.doi.org/10.24297/ijct.v11i2.1178.

Abstract:

Natural Language Processing (NLP) is an area of computer science and a sub-area of Artificial Intelligence (AI). We are developing a rule-based system that can read a large collection of text (e.g. a story) and find the sentence in the text that best answers a given question. The system uses a set of handcrafted rules, augmented with NLP techniques such as stemming and named entity extraction, that look for lexical and semantic clues in the question and the text (i.e. the story). Each rule awards a certain number of points to each sentence. After all of the rules have been applied, the sentence that obtains the highest score is returned as the answer.
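
The scoring scheme described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' system: the two rules, their point values, the stop-word list, and the toy suffix stripper are all assumptions made for the example.

```python
import re

# Hypothetical stop-word list; the paper's actual list is not given.
STOP_WORDS = {"the", "a", "an", "is", "was", "of", "in", "to",
              "did", "who", "what", "where", "when"}


def tokens(text):
    """Lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def toy_stem(word):
    """Very rough suffix stripper standing in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def word_match_rule(question, sentence):
    """Award 1 point per stemmed content word shared with the question."""
    q = {toy_stem(w) for w in tokens(question)}
    s = {toy_stem(w) for w in tokens(sentence)}
    return len(q & s)


def who_rule(question, sentence):
    """Award 2 points to 'who' questions if a non-initial word is capitalised."""
    if not question.lower().startswith("who"):
        return 0
    words = sentence.split()
    return 2 if any(w[0].isupper() for w in words[1:]) else 0


RULES = [word_match_rule, who_rule]


def best_answer(question, story):
    """Return the sentence with the highest total score over all rules."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", story) if s.strip()]
    return max(sentences, key=lambda s: sum(rule(question, s) for rule in RULES))


story = ("The village was quiet. Yesterday the baker Anna opened a new shop. "
         "Children played near the river.")
print(best_answer("Who opened the shop?", story))
# Yesterday the baker Anna opened a new shop.
```

Keeping each rule as a separate function that returns points makes it easy to add or reweight rules without touching the selection step.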

30

Jena, Gouranga Charan, and Siddharth Swarup Rautaray. "Design and implementation of an effective web-based hybrid stemmer for Odia language." International Journal of Advances in Applied Sciences 9, no. 1 (2020): 12. http://dx.doi.org/10.11591/ijaas.v9.i1.pp12-19.

Abstract:

A stemmer reduces an inflected or derived word to its stem. The technique involves removing the suffix or prefix attached to a word. It can be used in an information retrieval system to refine the overall execution of the retrieval process. This process is not equivalent to morphological analysis; it only finds the stem of a word. It also decreases the number of terms in an information retrieval system. Various stemming techniques exist. In this paper, a new web-based stemmer named “Mula” is proposed for the Odia language. It uses a hybrid approach (i.e. a combination of brute force and suffix removal). The new stemmer is both computationally faster and domain independent. The results are favourable and indicate that the proposed stemmer can be used effectively in Odia information retrieval systems. The stemmer also handles the problems of over-stemming and under-stemming to some extent.
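
A hybrid of brute-force lookup and suffix removal, as named in this abstract, might look roughly like the sketch below. It is not the “Mula” implementation: the lookup table, the suffix list, and the minimum stem length are hypothetical, and English toy data is used because the abstract does not list the Odia rules.

```python
# Hypothetical lookup table for the brute-force step: inflected form -> stem.
LOOKUP = {"went": "go", "mice": "mouse", "better": "good"}

# Hypothetical suffix list for the suffix-removal step, longest first.
SUFFIXES = ("ations", "ation", "ing", "ies", "es", "ed", "s")

# Assumed guard against over-stemming very short words.
MIN_STEM_LEN = 3


def hybrid_stem(word: str) -> str:
    """Try the lookup table first, then fall back to suffix stripping."""
    word = word.lower()
    if word in LOOKUP:
        return LOOKUP[word]
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            return word[: -len(suffix)]
    return word


for w in ["went", "walking", "stemmers", "is"]:
    print(w, "->", hybrid_stem(w))
# went -> go
# walking -> walk
# stemmers -> stemmer
# is -> is
```

The lookup step handles irregular forms that no suffix rule can reach, while the length guard is one simple way to limit over-stemming.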

31

Gouranga, Charan Jena, and Swarup Rautaray Siddharth. "Design and implementation of an effective web-based hybrid stemmer for Odia language." International Journal of Advances in Applied Sciences (IJAAS) 9, no. 1 (2020): 12–19. https://doi.org/10.11591/ijaas.v9.i1.pp12-19.

Abstract:

A stemmer reduces an inflected or derived word to its stem. The technique involves removing the suffix or prefix attached to a word. It can be used in an information retrieval system to refine the overall execution of the retrieval process. This process is not equivalent to morphological analysis; it only finds the stem of a word. It also decreases the number of terms in an information retrieval system. Various stemming techniques exist. In this paper, a new web-based stemmer named “Mula” is proposed for the Odia language. It uses a hybrid approach (i.e. a combination of brute force and suffix removal). The new stemmer is both computationally faster and domain independent. The results are favourable and indicate that the proposed stemmer can be used effectively in Odia information retrieval systems. The stemmer also handles the problems of over-stemming and under-stemming to some extent.

32

El Koshiry, Amr Mohamed, Entesar Hamed I. Eliwa, Tarek Abd El-Hafeez, and Marwa Khairy. "Detecting cyberbullying using deep learning techniques: a pre-trained glove and focal loss technique." PeerJ Computer Science 10 (March 27, 2024): e1961. http://dx.doi.org/10.7717/peerj-cs.1961.

Abstract:

This study investigates the effectiveness of various deep learning and classical machine learning techniques in identifying instances of cyberbullying. The study compares the performance of five classical machine learning algorithms and three deep learning models. The data undergoes pre-processing, including text cleaning, tokenization, stemming, and stop word removal. The experiment uses accuracy, precision, recall, and F1 score metrics to evaluate the performance of the algorithms on the dataset. The results show that the proposed technique achieves high accuracy, precision, and F1 score values, with the Focal Loss algorithm achieving the highest accuracy of 99% and the highest precision of 86.72%. However, the recall values were relatively low for most algorithms, indicating that they struggled to identify all relevant data. Additionally, the study proposes a technique using a convolutional neural network with a bidirectional long short-term memory layer, trained on a pre-processed dataset of tweets using GloVe word embeddings and the focal loss function. The model achieved high accuracy, precision, and F1 score values, with the GRU algorithm achieving the highest accuracy of 97.0% and the NB algorithm achieving the highest precision of 96.6%.
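
For readers unfamiliar with the focal loss mentioned in this abstract, a minimal per-example sketch of the binary form follows. The defaults `gamma=2.0` and `alpha=0.25` are the commonly cited values, not settings reported in this abstract, and the function name is illustrative. The loss down-weights examples the model already classifies confidently, which is why it is often applied to imbalanced data.

```python
import math


def binary_focal_loss(p: float, y: int, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Focal loss for one example; p is the predicted probability of class 1."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)            # clamp to avoid log(0)
    p_t = p if y == 1 else 1 - p             # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1 - alpha  # class-balancing weight
    return -alpha_t * (1 - p_t) ** gamma * math.log(p_t)


easy = binary_focal_loss(0.95, 1)  # confident and correct: tiny loss
hard = binary_focal_loss(0.10, 1)  # confident and wrong: large loss
print(easy < hard)  # True
```

With `gamma=0` the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy.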

33

Yani Zhang. "Narrative Strategies in Sherwood Anderson’s Death in the Woods." Frontiers in Humanities and Social Research 2, no. 3 (2025): 242–47. https://doi.org/10.71465/fhsr251.

Abstract:

Sherwood Anderson’s Death in the Woods utilizes intricate narrative strategies to explore the themes of isolation and existential reflection. This essay focuses on three key techniques: (1) a complex first-person narrator whose framing creates intimacy yet reveals subjectivity; (2) the narrator’s unreliability, stemming from distance and invention, questioning factual truth; and (3) narrative time manipulation, using flashbacks, varied pacing, and repetition. These intertwined techniques help to create a hauntingly ambiguous meditation on life, death, and human relationship.

34

Wensrich, Christopher M., Erich H. Kisi, Vladimir Luzin, Oliver Kirstein, Alexander L. Smith, and Jian Feng Zhang. "Neutron Diffraction Techniques in Granular Mechanics." Materials Science Forum 905 (August 2017): 190–95. http://dx.doi.org/10.4028/www.scientific.net/msf.905.190.

Abstract:

Granular materials demonstrate unique mechanical properties stemming from their discrete nature. At large length scales granular assemblies are often viewed from the perspective of continuum theory where they show complex behaviour such as elastic and plastic anisotropy related to the load and deformation history. This complex behaviour is inextricably linked to the micromechanics of load sharing and force transmission at the particle level. At these scales, bulk stress is not shared homogeneously between particles, but rather by a network of 'force chains' that form a skeleton supporting the vast majority of the applied load. The formation and failure of these structures govern much of the bulk behaviour of these materials. Neutron diffraction techniques are now providing a window into the mechanics of granular materials at both bulk and particle scales. Through a combination of tomographic neutron imaging and diffraction based strain measurement it is now possible to directly examine the stress within individual particles in granular assemblies. Results of these experiments in two and three dimensions are presented and the outlook for this approach to studying the mechanics of granular materials is discussed.

35

Adnan Allama. "Analysis of Blast Fragmentation Results of the Top Air Decking Method in Coal Mines." Journal of Information System, Technology and Engineering 1, no. 4 (2023): 132–36. http://dx.doi.org/10.61487/jiste.v1i4.50.

Abstract:

This research aims to find out what influences the results of blasting fragmentation in the top air decking method. The method used in this research is a quantitative method. The data collection techniques used in this research came from literature studies and field data collection. Processing data analysis is carried out by considering several factors, such as rock characteristics, fragmentation results, and digging time results. After analysis, it can be seen that the design for determining the ADF, ADL, and stemming values is adjusted to the hole depth and PF used so that the fragmentation and digging time obtained are better in order to increase production activities. Based on the research results, it was found that the factors that influence the fragmentation results from top air deck blasting are the powder factor, blasting geometry (burden and spacing), air deck length (ADL), stemming depth, and wet hole conditions. The greater the PF value used, the smaller the fragmentation resulting from blasting and vice versa. The greater the ADL value used, the stemming depth will decrease and there will be potential for stemming injection and producing boulders. More holes in wet conditions will potentially result in greater fragmentation as well. The optimal ADF value to use is 0.3. With an ADF value of 0.3, the ADL used is 0.4–1.47 meters, resulting in better fragmentation. The previous boulder percentage of 18% changed to 7% and the previous digging time of 11.87 seconds changed to 10.57 seconds.

36

Nafea, Ahmed Adil, Muhmmad Shihab Muayad, Russel R. Majeed, et al. "A Brief Review on Preprocessing Text in Arabic Language Dataset: Techniques and Challenges." Babylonian Journal of Artificial Intelligence 2024 (May 18, 2024): 46–53. http://dx.doi.org/10.58496/bjai/2024/007.

Abstract:

Text preprocessing plays an important role in natural language processing (NLP) tasks including text classification, sentiment analysis, and machine translation. The preprocessing of Arabic text still presents unique challenges due to the language's rich morphology, complex grammar, and various character sets. This brief review studies various techniques used for preprocessing Arabic text data. It discusses the challenges specific to Arabic text and presents an overview of key preprocessing steps including normalization, tokenization, stemming, stop-word removal, and noise reduction. The survey analyzes preprocessing techniques for NLP tasks and focuses on current research trends and future directions in Arabic text preprocessing.

37

Taiwo, Blessing Olamide, and Babatunde Adebayo. "Assessment of Small Scale Mine Blast Toe Volume Effect on Fragmentation Size Distribution: An Application of Edge Detection Software." International Journal of Engineering and Advanced Technology Studies 12, no. 3 (2024): 49–59. http://dx.doi.org/10.37745/ijeats.13/vol12n34959.

Abstract:

This study explores the critical aspect of blast toe volume in small-scale mining operations and its impact on fragmentation size distribution. The assessment is conducted to understand the relationship between blast toe volume and the resultant blast design patterns. A comprehensive analysis is carried out using production blast results from four dolomite quarries in Akoko Edo, Nigeria, focusing on various explosive engineering parameters. The research employs advanced measurement techniques and statistical methods to quantify blast toe volume and assess its influence on fragmentation size distribution. By systematically varying blast toe volumes in controlled experiments, the study aims to establish correlations between toe volume and the resulting fragmentation size. The variance inflation factor obtained in this study revealed that the selection of parameters during toe volume simulation must be carried out with respect to stemming length and explosive weight (MIC). The result shows that the maximum instantaneous charge has a negative correlation influence on toe volume as stemming also increases. This reveals that variation in stemming length results in low explosive energy release along the blast hole column, causing toe undulation. Finally, it was also revealed that as the blast mean size (X50) increases, the toe also increases due to the poor utilization of explosive energy at the blast column.

38

Borjigin, Garimagai, Yuqiang Ding, John Semmen, Hosna Tajvidi Safa, Hideki Kakeya, and Shin-Tson Wu. "Coarse Integral Volumetric Imaging Display with Time and Polarization Multiplexing." Photonics 11, no. 1 (2023): 7. http://dx.doi.org/10.3390/photonics11010007.

Abstract:

This paper introduces an innovative approach to integral volumetric imaging employing time and polarization multiplexing techniques to present volumetric three-dimensional images. Traditional integral volumetric imaging systems with a coarse lens array often face moiré pattern issues stemming from layered panel structures. In response, our proposed system utilizes a combination of time and polarization multiplexing to achieve two focal planes using a single display panel.

39

Sejati, Priska Trisna, Farrikh Alzami, Aris Marjuni, Heni Indrayani, and Ika Dewi Puspitarini. "Aspect-Based Sentiment Analysis for Enhanced Understanding of 'Kemenkeu' Tweets." Journal of Applied Informatics and Computing 8, no. 2 (2024): 487–98. https://doi.org/10.30871/jaic.v8i2.8558.

Abstract:

The perceptions and expressions shared by the public on social media play a crucial role in shaping the reputation of government institutions, such as the Ministry of Finance MOF (Kemenkeu) in Indonesia which also has faced increased scrutiny, particularly on Twitter. This study analyzes public sentiment towards the Indonesian Ministry of Finance (MoF) through Aspect-Based Sentiment Analysis (ABSA) on Twitter data. Using a dataset of 10,099 tweets from January to July 2024, this study combines IndoBERT for sentiment classification and Latent Dirichlet Allocation (LDA) for topic modeling. Here, LDA was tested across four scenarios that considered various combinations of stopwords removal and stemming techniques, resulting in coherence scores of 0.314256, 0.369636, 0.350285, and 0.541752. The most optimal results were achieved in the scenario of stopwords removal without stemming (with 0.314256 coherence score). The main results show: 1) Identification of four main topics related to MoF: Economy, Budget, Employees, and Tax; 2) The dominance of negative sentiment (6,837 tweets) compared to positive sentiment (198 tweets) across all topics; 3) The effectiveness of IndoBERT in handling the complexity of the Indonesian language, especially in interpreting context and language nuances; 4) The importance of proper preprocessing, with a scenario of removing stopwords without stemming resulting in the most relevant topics. This study provides valuable insights for MoF to understand public perception and identify areas that require special attention in public communication and policy.

40

Muthee, Mutwiri George, Mutua Makau, and Omamo Amos. "review of techniques for morphological analysis in natural language processing." African Journal of Science, Technology and Social Sciences 1, no. 2 (2022): 93–103. http://dx.doi.org/10.58506/ajstss.v1i2.11.

Abstract:

Natural language is a crucial tool to facilitate communication in our day-to-day activities. This can be achieved either in text or speech forms. Natural language processing (NLP) involves making computers understand and process natural language. NLP has enhanced the way humans interact with computers, from having computers use speech to talk to humans as well as having computers translate human speech. Apart from speech, computers also create and understand sentences in natural language in a process called morphological analysis. Morphological analysis is an important part in natural language processing, being applied as a preprocessing step in most NLP tasks. Morphological analysis consists of four subtasks, that is, lemmatization, part-of-speech (POS) tagging, word segmentation and stemming. In this paper, we explore in detail each of these tasks of morphological analysis. We then evaluate the techniques used in this NLP field. Finally, we give a summary of the results of each of these techniques.

41

Mathis, Stéphane, Gwendal Le Masson, and Jean-Michel Vallat. "Early clinicopathologic description of nodoparanodopathy in the 19th century." Neurology 93, no. 18 (2019): 788–92. http://dx.doi.org/10.1212/wnl.0000000000008399.

Abstract:

Nodoparanodopathy is a recent concept in the field of peripheral neuropathy, corresponding to peripheral nerve disorders stemming from an autoimmune attack directed and limited to the nodal region. This concept was identified using modern techniques of electrophysiology, immunology, and pathology (including electron microscopy). We present here what we believe to be the earliest well-documented case of nodoparanodopathy in the medical literature, based on an article written by Samuel Gilbert Webber (1838–1926) in 1884.

42

DiGiovanni, Jeffrey J., and Travis L. Riffle. "The Developing Relationship Among Cognition, Amplification, and Aural Rehabilitation." Perspectives of the ASHA Special Interest Groups 1, no. 6 (2016): 47–54. http://dx.doi.org/10.1044/persp1.sig6.46.

Abstract:

The search for best practices in hearing aid fittings and aural rehabilitation has generally used the audiogram and function stemming from peripheral sensitivity. In recent years, however, we have learned that individuals respond differently to various hearing aid and aural rehabilitation techniques based on cognitive abilities. In this paper, we review basic concepts of working memory and the literature driving our knowledge in newer concepts of hearing aid fitting and aural rehabilitation.

43

Kernberg, Otto F. "Therapeutic Implications of Transference Structures in Various Personality Pathologies." Journal of the American Psychoanalytic Association 67, no. 6 (2019): 951–86. http://dx.doi.org/10.1177/0003065119898190.

Abstract:

Definitions of specific organizations of transference developments are proposed for neurotic, borderline, narcissistic, schizoid, symbiotic, and psychotic character structures. These distinct organizations of transference developments correspond to the underlying characteristics of internalized object relations stemming from the conflictual implications of split-off, idealized, and persecutory self- and object representations. The transference structures described have implications for the corresponding application of psychoanalytic technique. Clinical cases illustrate the relationship between personality structure, transference organization, and psychoanalytic techniques.

44

Ahmad, M. Haroon, Ali Saeed, M. Usman Bhatti, Naveed Hussain, Muhammad Farhat Ullah, and Mehmood Anwar. "Next Word Prediction for Urdu using Deep Learning Techniques." VFAST Transactions on Software Engineering 13, no. 1 (2025): 49–59. https://doi.org/10.21015/vtse.v13i1.2044.

Abstract:

A language model for next-word prediction is a probabilistic representation of a natural language that utilizes text corpora to generate word probabilities. These models play a crucial role in text generation, machine translation, and question-answering applications. The focus of this study is to develop an improved algorithm for next-word prediction in Urdu. The study implements deep learning models, including RNN, LSTM, and Bi-LSTM, on a subset of the Ur-Mono Urdu corpus containing 3,000 and 5,000 sentences. To prepare the data for experimentation, tokenization and stemming data cleaning techniques are applied. The study achieved an accuracy of 87% using the RNN model on the first 3,000 sentences of the Ur-Mono dataset and 84% accuracy using the RNN model on the first 5,000 sentences of the Ur-Mono dataset. In conclusion, it can be stated that when the corpus size is small, the RNN outperforms both the LSTM and Bi-LSTM. However, as the corpus size increases, the Bi-LSTM exhibits superior performance compared to both RNN and LSTM.

45

MR ADEPU RAJESH and DR TRYAMBAK HIWARKAR. "Exploring Preprocessing Techniques for Natural Language Text: A Comprehensive Study Using Python Code." International Journal of Engineering Technology and Management Sciences 7, no. 5 (2023): 390–99. http://dx.doi.org/10.46647/ijetms.2023.v07i05.047.

Abstract:

The paper highlights the significance of efficient text preprocessing strategies in Natural Language Processing (NLP), a field focused on enabling machines to understand and interpret human language. Text preprocessing is a crucial step in converting unstructured text into a machine-understandable format. It plays a vital role in various text classification tasks, including web search, document classification, chatbots, and virtual assistants. Techniques such as tokenization, stop word removal, and lemmatization are carefully studied and applied in a specific order to ensure accurate and efficient information retrieval. The paper emphasizes the importance of selecting and ordering preprocessing techniques wisely to achieve high-quality results. Effective text preprocessing involves cleaning and filtering textual data to eliminate noise and enhance efficiency. Furthermore, it provides insights into the impact of different techniques, such as raw text, tokenization, stop word removal, and stemming, using a Python implementation.
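
The ordered steps named in this abstract (raw text, tokenization, stop-word removal, stemming) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the paper's code: the stop-word list is a tiny hypothetical sample and `toy_stem` is a crude suffix stripper standing in for a real stemmer such as Porter's.

```python
import re

# Hypothetical stop-word sample; real lists are much longer.
STOP_WORDS = {"the", "is", "a", "an", "and", "of", "to", "in", "are"}


def tokenize(text):
    """Lowercase the raw text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that carry little content on their own."""
    return [t for t in tokens if t not in STOP_WORDS]


def toy_stem(token):
    """Strip one common suffix, keeping at least three characters."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Apply the steps in order: tokenize, remove stop words, stem."""
    return [toy_stem(t) for t in remove_stop_words(tokenize(text))]


print(preprocess("The chatbots are answering questions in the documents."))
# ['chatbot', 'answer', 'question', 'document']
```

The order matters here: removing stop words before stemming keeps the stop-word list in surface form, so it does not need stemmed variants.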

46

Gómez-Torrecillas, J. "Separable functors in corings." International Journal of Mathematics and Mathematical Sciences 30, no. 4 (2002): 203–25. http://dx.doi.org/10.1155/s016117120201270x.

Abstract:

We develop some basic functorial techniques for the study of the categories of comodules over corings. In particular, we prove that the induction functor stemming from every morphism of corings has a left adjoint, called ad-induction functor. This construction generalizes the known adjunctions for the categories of Doi-Hopf modules and entwined modules. The separability of the induction and ad-induction functors are characterized, extending earlier results for coalgebra and ring homomorphisms, as well as for entwining structures.

47

Maharani, Zahra Nabila, Ardytha Luthfiarta, and Nabila Zibriza Farsya. "Sentiment Analysis of the 2024 Indonesian Presidential Dispute Trial Election using SVM and Naïve Bayes on Platform X." Building of Informatics, Technology and Science (BITS) 6, no. 1 (2024): 440–49. https://doi.org/10.47065/bits.v6i1.5380.

Abstract:

Indonesian presidential dispute trial election sessions are crucial activities in the democratic process where open exchanges of views and opinions occur. Sentiment analysis can help understand public opinion regarding these sessions. This study aims to conduct sentiment analysis of the 2024 Indonesian presidential dispute trial election using the Support Vector Machine (SVM) and Gaussian Naïve Bayes (GNB) with the Nazief Adriani and Sastrawi stemming methods on Platform X. The research addresses the challenge of uncertainty in interpreting public sentiment towards the Indonesian presidential dispute trial election. SVM and GNB were chosen for their ability to classify large and complex data sets. The Nazief Adriani and Sastrawi stemming techniques were employed to reduce words to their base forms, thereby enhancing the quality of text analysis. The study was conducted on Platform X, which provides access to text data from various sources including social media and news platforms. The data used covered specific periods before, during, and after the Indonesian presidential dispute trial election. The keywords used for the crawling process are “sidang sengketa pilpress”, “sidang sengketa pemilu”, and “sidang pilpres”. Tweets are classified into two classes, positive and negative. Several machine learning methods are commonly used for sentiment analysis. In a comparison of tests carried out on 2,443 tweets, SVM with the Sastrawi stemming method produced the best results: accuracy of 91.1%, precision of 90%, recall of 91%, and F1-score of 91%.

48

Bamatov, Dzhabrail M., Ismail L. Daudov, and Magomed M. Arsanov. "Emerging technologies are revolutionizing the rootstock rooting process in agricultural engineering through in vitro techniques." E3S Web of Conferences 486 (2024): 03007. http://dx.doi.org/10.1051/e3sconf/202448603007.

Abstract:

This article endeavors to showcase the seamless integration of cutting-edge technologies into the agricultural sector, with a specific emphasis on their application within biotechnology laboratories specializing in in vitro cultivation. The pioneering technology at the heart of this project centers on the automated detection of viral symptoms in micro-cut rootstocks grown in nutrient-rich substrates. Furthermore, we will delve into a comparative analysis, contrasting laboratories equipped with automated processes against those devoid of such advancements, specifically in the context of in vitro cultivation of subject-rooted plants in identical nutrient-rich environments. The anticipated annual financial gains stemming from the incorporation of an automated system into the in vitro laboratory are projected to reach an impressive 1.8 million Russian rubles.

49

Strickler, J. Rudi. "Observing free-swimming copepods mating." Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences 353, no. 1369 (1998): 671–80. http://dx.doi.org/10.1098/rstb.1998.0233.

Abstract:

Planktonic copepods are small transparent animals swimming in water. To observe how a male finds its mate, special optical systems had to be designed. The animals are treated as phase objects and matched spatial filters allow three-dimensional recordings of the swimming behaviour in a 1-litre vessel. Application of the techniques described shows how a male cyclopoid copepod swims for 20 s in synchronicity with the female before mating. Results stemming from observations with this optical system are published in this volume.

50

Rahamat, Basha S., and J. K. Rani. "A Comparative Approach of Dimensionality Reduction Techniques in Text Classification." Engineering, Technology & Applied Science Research 9, no. 6 (2019): 4974–79. https://doi.org/10.5281/zenodo.3566201.

Abstract:

This work deals with document classification. It is a supervised learning method (it needs a labeled document set for training and a test set of documents to be classified). The procedure of document categorization includes a sequence of steps consisting of text preprocessing, feature extraction, and classification. In this work, a self-made data set was used to train the classifiers in every experiment. This work compares the accuracy, average precision, precision, and recall with or without combinations of some feature selection techniques and two classifiers (KNN and Naive Bayes). The results concluded that the Naive Bayes classifier performed better in many situations.
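
The sequence this abstract outlines (preprocessing, feature extraction, classification) can be illustrated with a small k-nearest-neighbours sketch over bag-of-words vectors. The training sentences, labels, cosine similarity measure, and `k=3` are all assumptions chosen for the example. The paper's self-made data set and feature selection techniques are not reproduced here.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Feature extraction: lowercase whitespace tokens with their counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def knn_classify(text, training, k=3):
    """Label a document by majority vote among its k most similar neighbours."""
    query = bag_of_words(text)
    ranked = sorted(training,
                    key=lambda item: cosine(query, bag_of_words(item[0])),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labelled training set.
TRAINING = [
    ("the team won the football match", "sports"),
    ("the striker scored a late goal", "sports"),
    ("the player signed for a new team", "sports"),
    ("the bank raised interest rates", "finance"),
    ("shares fell as the market closed", "finance"),
    ("the central bank cut rates again", "finance"),
]

print(knn_classify("bank interest rates rise", TRAINING))  # finance
```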
