Journal articles on the topic 'Automatic analysis of texts'

Author: Grafiati

Published: 3 June 2025

Last updated: 31 July 2025

Consult the top 50 journal articles for your research on the topic 'Automatic analysis of texts.'

1

DIERKS, K. "Automatic Stylistic Analysis of Lyrical Texts." Literary and Linguistic Computing 1, no. 3 (1986): 129–35. http://dx.doi.org/10.1093/llc/1.3.129.

2

Archer, Wendy, Stefano Consiglio, Paolo Ferri, Luca Pareschi, and Silvio Peroni. "Call for papers: Automatic understanding of texts in social and computer sciences." puntOorg International Journal 1, no. 1 (2019): 1. http://dx.doi.org/10.19245/25.05.cfp.05.

Abstract:

Over the last 20 years, the use of automated and semi-automated techniques for extracting meanings from text has been widely debated in the social sciences. Automated and semi-automated techniques can be employed in all research phases: data collection (e.g. scraping), data cleaning (e.g. lemmatization of words), analysis (e.g. Named Entity Recognition, Part-of-speech Tagging, Topic Modeling, Keyword Analysis, Semantic Network Analysis, Sentiment Analysis), and visualization. Far from forcing epistemological choices, these techniques can be inductively used to deal with big corpora of data, i

3

Mandravickaitė, Justina, Eglė Rimkienė, Danguolė Kotryna Kapkan, Danguolė Kalinauskaitė, and Tomas Krilavičius. "Automatic Simplification of Lithuanian Administrative Texts." Algorithms 17, no. 11 (2024): 533. http://dx.doi.org/10.3390/a17110533.

Abstract:

Text simplification reduces the complexity of text while preserving essential information, thus making it more accessible to a broad range of readers, including individuals with cognitive disorders, non-native speakers, children, and the general public. In this paper, we present experiments on text simplification for the Lithuanian language, aiming to simplify administrative texts to a Plain Language level. We fine-tuned mT5 and mBART models for this task and evaluated the effectiveness of ChatGPT as well. We assessed simplification results via both quantitative metrics and qualitative evaluat

4

Zuban, Oksana. "Automatic Morphemic Analysis in the Corpus of the Ukrainian Language: Results and Prospects." Journal of Linguistics/Jazykovedný casopis 68, no. 2 (2017): 415–25. http://dx.doi.org/10.1515/jazcas-2017-0051.

Abstract:

The article describes theoretical issues and the principles of construction and functioning of the Automated System of Morphemic and Derivational Analysis (ASMDA). The ASMDA system performs the following functions: 1) information system; 2) automatic morphemic annotation of text; 3) automatic linguistic constructor for frequency dictionaries. The focus is on the use of ASMDA as an automatic morphemic analyser of the lexicon of Ukrainian texts; the article also describes the structure as well as the search and classification options of the electronic morphemic dictionaries present

5

Dmitrishin, A. N., V. D. Revina, V. I. Rusnak, Al-dr A. Khoroshilov, and Al-ey A. Khoroshilov. "METHODS OF AUTOMATED CREATION OF THEMATIC ONTOLOGY." Informatization and communication, no. 1 (March 20, 2019): 28–35. http://dx.doi.org/10.34219/2078-8320-2019-10-1-28-35.

Abstract:

The article describes the methods, procedures, and technology for the automated creation of a thematic ontology, based on automatic processing and semantic analysis of unstructured normative-technical texts and draft documents and on determining the semantic relations between the detected concept names. Applying this method requires a complex of procedures for automatic semantic text processing. The article describes hybrid methods of automated creation of technologies, based on the use of the MetaFraz program-linguistic platform. The research was supported by the Russia

6

Myznikov, P. V. "Case-Based Reasoning in Automatic Analysis of News Texts." Vestnik NSU. Series: Information Technologies 15, no. 2 (2017): 59–65. http://dx.doi.org/10.25205/1818-7900-2017-15-2-59-65.

7

Balabanova, Elisaveta. "Tagging, Morphological Dictionary in Electronic Form, Lemmatization and Normalization of Texts." Izdatel XXII, no. 2 (2020): 48–55. https://doi.org/10.70300/cyl5dcubw2zzthrd8.

Abstract:

The paper deals with the process of automatic morphological analysis. Morphology and its operations are briefly discussed, as are the morphological dictionary in electronic form and the process of automatic morphological analysis in NLP. The problems encountered in automatic morphological analysis are reviewed. The processes of automatic normalization and lemmatization are also discussed briefly.

8

Ugryn, Tetiana. "AUTOMATIC TEXT SUMMARIZATION: PROBLEMS AND PERSPECTIVES." Naukovì zapiski Nacìonalʹnogo unìversitetu «Ostrozʹka akademìâ». Serìâ «Fìlologìâ» 1, no. 17(85) (2023): 96–101. http://dx.doi.org/10.25264/2519-2558-2023-17(85)-96-101.

Abstract:

The present paper focuses on automatic text summarization (AS), the analysis of the linguistic problems related to it and the ways to overcome them, as well as on the prospects of using some natural language processing computer programs. The author carries out a comparative analysis of two AS programs, MSWord2003 and Pertinence Summarizer, for literary, journalistic and scientific texts. The chosen methodology of comparative analysis makes it possible not only to single out the peculiarities and limitations of each program, but also to draw some general conclusions about the problems existing in the

9

Misini, Arta, Ercan Canhasi, Arbana Kadriu, and Endrit Fetahi. "Automatic authorship attribution in Albanian texts." PLOS ONE 19, no. 10 (2024): e0310057. http://dx.doi.org/10.1371/journal.pone.0310057.

Abstract:

Automatic authorship identification is a challenging task that has been the focus of extensive research in natural language processing. Regardless of the progress made in attributing authorship, the need for corpora in under-resourced languages impedes advancing and examining present methods. To address this gap, we investigate the problem of authorship attribution in Albanian. We introduce a newly compiled corpus of Albanian newsroom columns and literary works and analyze machine-learning methods for detecting authorship. We create a set of hand-crafted features targeting various categories (

10

Salton, G., J. Allan, C. Buckley, and A. Singhal. "Automatic Analysis, Theme Generation, and Summarization of Machine-Readable Texts." Science 264, no. 5164 (1994): 1421–26. http://dx.doi.org/10.1126/science.264.5164.1421.

11

Molina Beltrán, Carlos, Alejandra Andrea Segura Navarrete, Christian Vidal-Castro, Clemente Rubio-Manzano, and Claudia Martínez-Araneda. "Improving the affective analysis in texts." Electronic Library 37, no. 6 (2019): 984–1006. http://dx.doi.org/10.1108/el-11-2018-0219.

Abstract:

Purpose: This paper aims to propose a method for automatically labelling an affective lexicon with intensity values by using the WordNet Similarity (WS) software package with the purpose of improving the results of an affective analysis process, which is relevant to interpreting the textual information that is available in social networks. The hypothesis states that it is possible to improve affective analysis by using a lexicon that is enriched with the intensity values obtained from similarity metrics. Encouraging results were obtained when an affective analysis based on a labelled lexicon wa

12

Bjekic, Jovana, Ljiljana Lazarevic, Marko Zivanovic, and Goran Knezevic. "Psychometric evaluation of the Serbian dictionary for automatic text analysis - LIWCser." Psihologija 47, no. 1 (2014): 5–32. http://dx.doi.org/10.2298/psi1401005b.

Abstract:

LIWC (Linguistic Inquiry and Word Count) is a widely used word-level content analysis software package. It has been used in a large number of studies in the fields of clinical, social and personality psychology, and it has been adapted for text analysis in 11 world languages. The aim of this research was to empirically validate a newly constructed adaptation of the LIWC software for the Serbian language (LIWCser). The sample consisted of 384 texts in Serbian and 141 texts in English. It included scientific paper abstracts, newspaper articles, movie subtitles, short stories and essays. Comparative analysis of Serb

13

Yangiboyevna, Urazaliyeva Mavluda. "Linguistic and Statistical Analysis of Audio Texts." Journal for Research in Applied Sciences and Biotechnology 4, no. 3 (2025): 45–48. https://doi.org/10.55544/jrasb.4.3.6.

Abstract:

This article provides an in-depth analysis of the phonetic, morphological, and syntactic features of Uzbek audio texts based on linguostatistical analysis. The study integrates modern information and communication technologies, particularly natural language processing (NLP), speech technologies, and corpus linguistics methods. The article scientifically examines phonetic analysis (elision, assimilation, coarticulation), morphological modeling (word forms, affixes), and syntactic structures (asyndetic sentences, introductory words) conducted on audio texts. The Praat software was used for exper

14

Kang, Hyerim, Jaepa Baek, Taesoo Kong, and Juhee Yun. "A Study on Automatic Detection of Off-Topic Texts in KFL Automatic Writing Assessment Using Document Similarity." Urimal Society 80 (January 31, 2025): 89–111. https://doi.org/10.35902/urm.2025.80.89.

Abstract:

The purpose of this study is to propose a method for detecting off-topic texts in KFL (Korean as a Foreign Language) automatic writing assessments. To achieve this, a training dataset consisting of 150 texts was constructed using the Korean learner corpus provided by the National Institute of the Korean Language. Based on this dataset, cosine similarity, Euclidean similarity, and Manhattan similarity models were developed. Subsequently, two test datasets were created: one consisting of texts on the same topic as the training dataset (30 texts) and the other consisting of texts on different top

15

Avanesyan, Nina, Fedor Solovev, Elizaveta Tikhomirova, and Andrey Chepovskiy. "Identifying the Significant Features in Illegal Texts." Voprosy kiberbezopasnosti, no. 4(38) (2020): 76–84. http://dx.doi.org/10.21681/2311-3456-2020-04-76-84.

Abstract:

The purpose of the study: development of a technique for determining lexical characteristics and psycholinguistic factors as discriminative features for identifying the topics of illegal texts by frequency methods for information security purposes. Method: automatic morphological and syntactic analysis, frequency methods, comparison of auto-generated dictionaries by correlation analysis methods. Results: a technique for frequency analysis of the vocabulary of illegal texts has been developed, which makes it possible to compare different sets of texts using frequency dictionaries and identify discriminative f

16

Maksimenko, Olga I. "AUTOMATIC DISTRIBUTIVE-STATISTIC ANALYSIS AS SYSTEM TEXT PROCESSING." RUDN Journal of Language Studies, Semiotics and Semantics 10, no. 1 (2019): 92–100. http://dx.doi.org/10.22363/2313-2299-2019-10-1-92-100.

Abstract:

The article describes automatic distributive-statistic analysis, the distinction between the distributive and statistical approaches to the analysis of terminological lexicon, the possibility of applying the method to automatic text processing and thesaurus construction, and the possibility of using it to identify text genre in a corpus of technical texts.

17

Dagaev, Alexander Evgenevich, and Dmitry Ivanovich Popov. "Comparison of automatic summarization of texts in Russian." Программные системы и вычислительные методы, no. 4 (April 2024): 13–22. http://dx.doi.org/10.7256/2454-0714.2024.4.69474.

Abstract:

The subject of the research in this article is the summarization of texts in Russian using artificial intelligence models. In particular, the authors compare the popular models GigaChat, YaGPT2, ChatGPT-3.5, ChatGPT-4, Bard, Bing AI and YouChat and conduct a comparative study of their work on Russian texts. The article uses datasets for the Russian language, such as Gazeta, XL-Sum and WikiLingua, as source materials for subsequent summarization; additional datasets in English, CNN Dailymail and XSum, were used to compare the effectiveness of summarization. The article uses the f

18

Li, Huan Qin, and Shi Tao Yan. "The Research on Chinese Automatic Segmentation." Advanced Materials Research 798-799 (September 2013): 818–21. http://dx.doi.org/10.4028/www.scientific.net/amr.798-799.818.

Abstract:

Word segmentation is a fundamental problem in Chinese natural language processing. Based on an analysis of the state of research and its key issues, the paper introduces a tree-structured statistical method of word frequency, which enables keywords to be matched efficiently. With it, texts can be rapidly expressed as sets of high-frequency words, so that text classification is conveniently achieved.

19

Kan, A. V., Y. D. Kozlovskaya, N. A. Kadushkin, and A. A. Khoroshilov. "Automatic Clustering of Mass Media Documents Based on the Analysis of Their Semantic Content." Моделирование и анализ данных 10, no. 3 (2020): 24–38. http://dx.doi.org/10.17759/mda.2020100302.

Abstract:

The article describes a solution to the problem of automatic clustering of media documents based on the analysis of their semantic content. The proposed solution is based on the methods of machine grammar, semantic-syntactic and conceptual analysis of texts, as well as methods for identifying the conceptual composition of a collection of documents and formalizing the semantic content of texts. The developed document clustering algorithm can be run in a fully automatic mode without prior machine learning.

20

Serediuk, Vitalii. "Possibilities of using artificial intelligence and natural language processing to analyse legal norms and interpret them." Social Legal Studios 7, no. 2 (2024): 191–200. http://dx.doi.org/10.32518/sals2.2024.191.

Abstract:

The study addressed the possibilities of using information technology and natural language in the study of legal norms. The study aimed to develop methods for using artificial intelligence and natural language processing to analyse jurisprudence. To achieve this goal, automatic strategies were created to recognise the main topics in legal texts, identify key legal concepts and analyse the structure of documents. The results of the study included an analysis of existing methods of using technology and natural language to analyse legal norms. The methods used included machine and deep learning,

21

Sidorova, E. A., I. R. Akhmadeeva, Yu A. Zagorulko, et al. "An integrated approach to the analysis of argumentative relationships in scientific communication texts." Ontology of Designing 13, no. 4 (2023): 562–79. http://dx.doi.org/10.18287/2223-9537-2023-13-4-562-579.

Abstract:

The problem of automatic analysis of argumentation in scientific communication texts is considered. Argumentation is understood as an ordered set of arguments used to support a certain thesis. An argument includes at least one premise and one conclusion, connected by an argumentative relation. The purpose of the work is an experimental study of neural network approaches to solving the problem of searching and extracting argumentative relations between statements located closely in the text. The study was conducted on a corpus of texts with argumentative markup created using the previously dev

22

Manna, Sukanya. "Automatic analysis of microblogging data to aid in emergency management." Encyclopedia with Semantic Computing and Robotic Intelligence 02, no. 02 (2018): 1850019. http://dx.doi.org/10.1142/s2529737618500193.

Abstract:

Microblogging platforms like Twitter have, in recent years, become one of the important sources of information for a wide spectrum of users. As a result, these platforms have become great resources to provide support for emergency management. During any crisis, it is necessary to sieve through a huge amount of social media texts within a short span of time to extract meaningful information from them. Extraction of emergency-specific information, such as topic keywords or landmarks or geo-locations of sites, from these texts plays a significant role in building an application for emergency

23

Shanshar, S. Sh, and I. M. Ualiyeva. "FEATURE SELECTION FOR AUTOMATIC DETECTION OF TEXT GENRE." Bulletin of Kazakh National Women's Teacher Training University, no. 1 (April 9, 2021): 84–90. http://dx.doi.org/10.52512/2306-5079-2021-85-1-84-90.

Abstract:

This article discusses the algorithms that can be used in the study and analysis of symbols to determine the genre of texts. There are differences in defining the genre of texts. The algorithm is also defined by describing the text, removing unnecessary characters, leaving only the text, and comparing it with the database. The article describes a practical method of automatic recognition of the text genre based on all parameters. Comparing the logistic regression, decision tree, random forest, MLPClassifier, AdaBoostClassifier, svm, GaussianNB algorithms, the choice of the most important paramete

24

Li, Min, Meng Dong Chen, and Xiang Bin Li. "An Approach for Sentiment Tendency Analysis on Comment Text." Advanced Materials Research 989-994 (July 2014): 1913–17. http://dx.doi.org/10.4028/www.scientific.net/amr.989-994.1913.

Abstract:

With the rapid development of networks, texts that contain positions, views and opinions on events are proliferating. Review texts contain the author's feelings and views and the tendencies the author wants to express. People need to analyze these texts automatically to determine the author's sentiment tendency. This paper presents a model for automatic analysis of sentiment tendency in comment text. The model combines an emotional-dictionary-based algorithm with a Support Vector Machine learning algorithm, taking advantage of both.

25

Барахнин, Владимир Борисович, Ольга Юрьевна Кожемякина, Ирина Владимировна Кузнецова, and Вера Алексеевна Карпова. "The model of facture of Russian poetic texts." Вычислительные технологии, no. 3 (August 13, 2021): 107–17. http://dx.doi.org/10.25743/ict.2021.26.3.007.

Abstract:

When studying how the semantic content of poetic texts depends on their metrical-rhythmic characteristics and stanza structure, it becomes necessary to describe factures, that is, the sets of these characteristics of poetic texts. The article formulates a definition of facture as a structural model of a poetic text that uniquely determines the metrical-rhythmic and stanzaic template of the verse text; it presents algorithms for determining the meter and the number of feet, as well as the stanza structure of a poem, which make it possible to determine its facture. In the course of the work, a table was created of all the factures found in the poems of A.S.

26

Kazachkova, M. B., and Kh N. Galimova. "Comparative Analysis of Text Complexity in English Textbooks." Professional Discourse & Communication 4, no. 4 (2022): 22–32. http://dx.doi.org/10.24833/2687-0126-2022-4-4-22-32.

Abstract:

The article presents the results of a comparative analysis of the complexity of texts on the material of the linguistic subcorpus of English textbooks “English”, “Starlight” and “Rainbow English” for the 11th grade. The total volume of the sub-corpus of texts acting as empirical material for the research amounted to 36 texts (3955 word uses). One of the characteristics of the complexity of the text is its readability. It depends on the degree of complexity of words and the structures used in the sentences, as well as on such quantitative parameters of the text as the average length of the sent

27

Kupriyanov, Roman, Marina Solnyshkina, and Polina Lekhnitskaya. "Parametric Taxonomy of Educational Texts." Vestnik Volgogradskogo gosudarstvennogo universiteta. Serija 2. Jazykoznanije, no. 6 (February 2024): 80–94. http://dx.doi.org/10.15688/jvolsu2.2023.6.6.

Abstract:

The article is aimed at considering the issue of the discursive text typology and developing a parametric model of the elementary school texts for the ontological domain by employing a corpus-based approach and methods of linguistic statistics. The research corpus of over 90,000 tokens comprises texts of 13 textbooks acknowledged in the 2nd grade of Russian schools. The applied multifactor discriminant analysis enabled identification and validation of typological characteristics of the texts under study, offering the formula for referring educational texts to a subject domain on Philology, Ma

28

Wolfe, Christopher R., Mitchell Dandignac, Rachel Sullivan, Tatum Moleski, and Valerie F. Reyna. "Automatic Evaluation of Cancer Treatment Texts for Gist Inferences and Comprehension." Medical Decision Making 39, no. 8 (2019): 939–49. http://dx.doi.org/10.1177/0272989x19874316.

Abstract:

Background. It is difficult to write about cancer for laypeople such that everyone understands. One common approach to readability is the Flesch-Kincaid Grade Level (FKGL). However, FKGL has been shown to be less effective than emerging discourse technologies in predicting readability. Objective. Guided by fuzzy-trace theory, we used the discourse technology Coh-Metrix to create a Gist Inference Score (GIS) and applied it to texts from the National Cancer Institute website written for patients and health care providers. We tested the prediction that patient cancer texts with higher GIS scores

29

AMANCIO, DIEGO R., LUCAS ANTIQUEIRA, THIAGO A. S. PARDO, LUCIANO da F. COSTA, OSVALDO N. OLIVEIRA, and MARIA G. V. NUNES. "COMPLEX NETWORKS ANALYSIS OF MANUAL AND MACHINE TRANSLATIONS." International Journal of Modern Physics C 19, no. 04 (2008): 583–98. http://dx.doi.org/10.1142/s0129183108012285.

Abstract:

Complex networks have been increasingly used in text analysis, including in connection with natural language processing tools, as important text features appear to be captured by the topology and dynamics of the networks. Following previous works that apply complex networks concepts to text quality measurement, summary evaluation, and author characterization, we now focus on machine translation (MT). In this paper we assess the possible representation of texts as complex networks to evaluate cross-linguistic issues inherent in manual and machine translation. We show that different quality tran

30

Zarco-Tejada, Ángeles, Carmen Noya Gallardo, Mª Carmen Merino Ferradá, and Isabel Calderón López. "Analysing corpus-based criterial conjunctions for automatic proficiency classification." Journal of English Studies 14 (December 16, 2016): 215. http://dx.doi.org/10.18172/jes.3090.

Abstract:

The linguistic profiling of L2 learning texts can be taken as a model for automatic proficiency assessment of new texts. But proficiency levels are distinguished by many different linguistic features among which the use of cohesive devices can be a criterial element for level distinctions, either in the number of conjunctions used (quantitative) and/or in the type and variety of them (qualitative). We have carried out such an analysis with a subgroup of the CLEC (CEFR-levelled English Corpus) using Coh-Metrix, a tool for computing computational cohesion and coherence metrics for written and spoken

31

Santucci, Valentino, Filippo Santarelli, Luciana Forti, and Stefania Spina. "Automatic Classification of Text Complexity." Applied Sciences 10, no. 20 (2020): 7285. http://dx.doi.org/10.3390/app10207285.

Abstract:

This work introduces an automatic classification system for measuring the complexity level of a given Italian text from a linguistic point of view. The task of measuring the complexity of a text is cast as a supervised classification problem by exploiting a dataset of texts purposely produced by linguistic experts for second language teaching and assessment purposes. The commonly adopted Common European Framework of Reference for Languages (CEFR) levels were used as target classification classes, texts were elaborated by considering a large set of numeric linguistic features, and an experimen

32

Voronina, Irina Evgenievna, and Marina Konstantinovna Pastrevich. "Model for Classifying Verbal Aggression in Unstructured Texts." Herald of the Siberian State University of Telecommunications and Information Science 19, no. 2 (2025): 81–88. https://doi.org/10.55648/1998-6920-2025-19-2-81-88.

Abstract:

A software package for automatic classification of aggressive vocabulary in unstructured texts in the information space is considered, taking into account the contextual analysis of statements at the lexical-semantic level and the justification for the choice of a vectorizer and the optimal classification for machine learning.

33

Dimitrova, Ludmila, and Radovan Garabík. "Translation equivalence of demonstrative pronouns in Bulgarian-Slovak parallel texts." Cognitive Studies | Études cognitives, no. 14 (September 4, 2014): 65–74. http://dx.doi.org/10.11649/cs.2014.007.

Abstract:

In this paper we describe our automatic analysis of several parallel Bulgarian-Slovak texts with the goal to obtain useful information about Slovak translation equivalents of (definite) articles and demonstrative pronouns in Bulgarian. Rather than focusing on individual translation equivalents, we present a method for automatic extraction and visualization of the translations. This can serve as a guide for pinpointing interesting features in specific translated documents and could be extended for other parts of

34

Бершадский, А., П. А. Гудков, and Е. М. Подмарькова. "Bottom-up syntax analysis for natural language texts." МОДЕЛИРОВАНИЕ, ОПТИМИЗАЦИЯ И ИНФОРМАЦИОННЫЕ ТЕХНОЛОГИИ 9, no. 1(32) (2021): 1–2. http://dx.doi.org/10.26102/2310-6018/2021.32.1.001.

Abstract:

The relevance of this work stems from the need to automate decision-making on legal matters in various areas of human activity. Accordingly, this article sets out an approach to organizing the syntactic analysis of natural language texts for the subsequent automatic construction of a semantic network from the given input documents. The domain considered is legal information. The approach proposed by the authors opens up broad possibilities for the semantic analysis of legal documents and their comparison

35

Hayashi, Victor, Mateus Carvalho, João Carlos Néto, et al. "Information Retrieval Based on Brazilian Portuguese Texts." Journal of Systemics, Cybernetics and Informatics 20, no. 1 (2022): 249–69. http://dx.doi.org/10.54808/jsci.20.01.249.

Abstract:

Knowledge-based intelligent systems might be used in the banking sector to automate customer service. One of the ways to represent knowledge that is both understandable by humans and readable by machines is by using ontologies. Whenever a customer queries its bank regarding specific products or services, the existing knowledge modeled in an ontology might be used by a customer service chatbot to answer it in an automated way. The existing manual information retrieval process from banking specialists is laborious and time-consuming. Specialists use natural language, visual representations, and

36

Metlitskaya, N. A. "LINGUISTIC DATABASE FOR AUTOMATIC GENERATION SYSTEM OF ENGLISH ADVERTISING TEXTS." «System analysis and applied information science», no. 2 (August 7, 2017): 62–67. http://dx.doi.org/10.21122/2309-4923-2017-2-62-67.

Abstract:

The article deals with the linguistic database for the system of automatic generation of English advertising texts on cosmetics and perfumery. The database for such a system includes two main blocks: automatic dictionary (that contains semantic and morphological information for each word), and semantic-syntactical formulas of the texts in a special formal language SEMSINT. The database is built on the result of the analysis of 30 English advertising texts on cosmetics and perfumery. First, each word was given a unique code. For example, N stands for nouns, A – for adjectives, V – for verbs, et

37

Kochkonbaeva, B., and A. Aldosova. "Automatic processing of text in natural language." Bulletin of Science and Practice 4, no. 7 (2018): 216–21. https://doi.org/10.5281/zenodo.1312217.

Abstract:

This article considers questions of artificial intelligence, in particular the automatic processing of natural language texts. Types of word-form analysis are also considered, and an algorithm for finding the initial form of a word is proposed.

38

Vopilova, E. V., and E. N. Kryuchkova. "Methods of Automatic Analysis of Information Presentation Dynamics in Texts Based on Adaptable Dictionaries of Scientific Terms." Programmnaya Ingeneria 15, no. 4 (2024): 206–15. http://dx.doi.org/10.17587/prin.15.206-215.

Abstract:

The paper proposes a method for determining the dynamics of the aspectual content of scientific texts. It discusses the problems arising in the automatic processing of scientific texts and proposes approaches to creating a combined method of aspect-oriented analysis. The results of experiments on aspect analysis of scientific publications in mathematics are given. The aspect analysis algorithm is built on processing both the semantic domain graph formed as a result of automatic extraction of information from the text of the mathematical enc

39

Zheng, Xuan, Wei Chen, Haijun Zhou, Zhe Li, Tianfan Zhang, and Qi Yuan. "Emoji-Integrated Polyseme Probabilistic Analysis Model: Sentiment Analysis of Short Review Texts on Library Service Quality." Traitement du Signal 39, no. 1 (2022): 313–22. http://dx.doi.org/10.18280/ts.390133.

Abstract:

It is a great challenge to understand user evaluation of library service quality based on short review texts. This is because short texts are limited in length and lack context support. What is worse, the polysemes and emojis in short texts make the literal emotions of these texts rather ambiguous and variable. The variability is often overlooked in previous research on service quality evaluation, which reduces the accuracy of automatic analysis methods. Considering the effects of polysemes and emojis in short texts, this paper introduces probabilistic linguistic term sets (PLTS) and support v

40

Korycinski, C., and Alan F. Newell. "Natural-language processing and automatic indexing." Indexer: The International Journal of Indexing 17, no. 1 (1990): 21–29. http://dx.doi.org/10.3828/indexer.1990.17.1.8.

Abstract:

The task of producing satisfactory indexes by automatic means has been tackled on two fronts: by statistical analysis of text and by attempting content analysis of the text in much the same way as a human indexer does. Though statistical techniques have a lot to offer for free-text database systems, neither method has had much success with back-of-the-book indexing. This review examines some problems associated with the application of natural-language processing techniques to book texts.

41

Pakniat, Nasrollah, and Jalaleddin Nasiri. "Analysis of Citation Texts in Persian Using Support Vector Machines." Journal of Information Processing and Management 37, no. 4 (2022): 1245–68. https://doi.org/10.5281/zenodo.14041748.

Abstract:

A citation text can be considered as a collection of components such as author names, title, place of publication, year of publication, page numbers, and so on. While the analysis of citation texts at the end of a scientific document can be easily performed by a human user, the diversity in citation styles, along with errors made by authors in writing these texts, complicates the automation of this process. Many methods have been proposed for the automation of citation text analysis; however, these methods are language-dependent, and applying a method designed for one language to another can l

42

Zinoveva, Anastasiia Yu, Svetlana O. Sheremetyeva, and Ekaterina D. Nerucheva. "THE ANALYSIS OF AMBIGUITY IN CONCEPTUAL ANNOTATION OF RUSSIAN TEXTS." Tyumen State University Herald. Humanities Research. Humanitates 6, no. 3 (2020): 38–60. http://dx.doi.org/10.21684/2411-197x-2020-6-3-38-60.

Abstract:

Properly annotated text corpora are an essential condition in constructing effective and efficient tools for natural language processing (NLP), which provide an operational solution to both theoretical and applied linguistic and informational problems. One of the main and the most complex problems of corpus annotation is resolving tag ambiguities on a specific level of annotation (morphological, syntactic, semantic, etc.). This paper addresses the issue of ambiguity that emerges on the conceptual level, which is the most relevant text annotation level for solving informational tasks. Conceptua

43

Tkachenko, Kostiantyn. "Semantic Analysis of Natural Language Texts: Ontological Approach." Digital Platform: Information Technologies in Sociocultural Sphere 7, no. 2 (2024): 211–23. https://doi.org/10.31866/2617-796x.7.2.2024.317726.

Abstract:

The development of information (intelligent) learning systems, electronic document management systems, web-oriented systems working with text information in natural language has led to an increase in the volume of educational content and/or arrays of processed full-text documents. All this requires new means of organizing access to information, many of which should be classified as intelligent systems for knowledge processing. One of the effective approaches to identifying and processing the meaning of educational content (and/or text documents) is the use of ontologies. The purpose of the art

44

Kolesnichenko, Aliona, and Natalya Zhmayeva. "Grammatical Difficulties of Automatic Translation of Scientific and Technical Literature." Naukovy Visnyk of South Ukrainian National Pedagogical University named after K. D. Ushynsky: Linguistic Sciences 26, no. 27 (2019): 134–41. http://dx.doi.org/10.24195/2616-5317-2018-27-16.

Abstract:

The article analyzes the grammatical difficulties encountered in automatic translation. It discusses the advantages and disadvantages of the SDL Trados automatic translation service. The types of grammatical errors that arise when translating scientific and technical texts in SDL Trados are classified, and ways of overcoming them are outlined. Key words: scientific and technical literature, automatic translation, grammatical difficulties.

45

Isaeva, Ekaterina V., and Behruz Z. Safarbekov. "Basic Algorithm for Automatic Spelling Correction of Russian Texts: Development, Evaluation and Prospects." Вестник Пермского университета. Математика. Механика. Информатика, no. 1 (68) (2025): 91–108. https://doi.org/10.17072/1993-0550-2025-1-91-108.

Abstract:

Automatic spelling check and correction of texts in Russian is an urgent task in the field of natural language processing. Our research is aimed at developing, evaluating, and describing a computer programme for correcting spelling errors with high accuracy. The proposed method is based on line-by-line text processing using rules for spelling and capitalisation accuracy and a probabilistic model for proposing candidate words for error correction. Our algorithm operates at the level of individual words, which limits its ability to take context into account. The metrics used to test the quality o

46

KUZNIETSOV, Oleksii, and Gennadiy KYSELOV. "USING AND ANALYSIS OF FORMAL METHODS FOR EVALUATING THE RELEVANCE OF AUTOMATICALLY GENERATED SUMMARIES OF INFORMATIONAL TEXTS." Advanced Information Technology, no. 1 (3) (2024): 32–48. https://doi.org/10.17721/ait.2024.1.04.

Abstract:

Background. The article reviews existing approaches to evaluating the quality of automatically generated summaries of informational texts. It provides an overview of automatic summarization methods, including classical approaches and modern models based on artificial intelligence. The review covers extractive summarization methods such as TF-IDF and PageRank, as well as graph-based methods, specifically TextRank. Special attention is given to abstractive approaches, including Generative Pretrained Transformer (GPT) and Bidirectional and Auto-Regressive Transformers (BART) models. The

47

Butenko, I. I., N. S. Nikolaeva, and T. D. Margaryan. "Structural Models of Terminological Word Combinations for Marking up a Corpus of Scientific and Technical Texts." NSU Vestnik. Series: Linguistics and Intercultural Communication 19, no. 3 (2021): 45–56. http://dx.doi.org/10.25205/1818-7935-2021-19-3-45-56.

Abstract:

The article presents structural models of terminological phrases from the subject area “Welding” as the basis for creating automated tools to mark up the corpus of scientific and technical texts. The place of scientific and technical corpora in corpus linguistics and the prospects for their further research are outlined. The relevance of the research stems from the need to create corpora of scientific and technical texts in general and to provide tools for automatic detection of terms in particular. It is substantiated that the main problem in designing such corpora is the automatic markup of

48

Pérez-Guadarramas, Yamel, Manuel Barreiro-Guerrero, Alfredo Simón-Cuevas, Francisco P. Romero, and José A. Olivas. "Analysis of OWA operators for automatic keyphrase extraction in a semantic context." Intelligent Data Analysis 24 (December 4, 2020): 43–62. http://dx.doi.org/10.3233/ida-200008.

Abstract:

Automatic keyphrase extraction from texts is useful for many computational systems in the fields of natural language processing and text mining. Although a number of solutions to this problem have been described, semantic analysis is one of the least exploited linguistic features in the most widely known proposals, causing the results obtained to have low accuracy and performance rates. This paper presents an unsupervised method for keyphrase extraction, based on the use of lexico-syntactic patterns for extracting information from texts and on fuzzy topic modeling. An OWA operator combining se

49

Gulyamova, Shakhnoza Kakhramonovna. "SEMANTIC ANALYSIS AND SYNTHESIS IN THE AUTOMATIC ANALYSIS OF THE TEXT." Scientific Reports of Bukhara State University 5, no. 1 (2021): 112–24. http://dx.doi.org/10.52297/2181-1466/2021/5/1/9.

Abstract:

Introduction. Semantic analysis and synthesis occupy a leading place in information search engines. By automatic semantic analysis we mean a set of methods and techniques, based on specially developed linguistic algorithms and carried out on a computer, that can express the meaning of arbitrary natural-language speech with sufficient accuracy by means of a rigorous, precise tool. Highlighting the importance of the semantic analyzer in the information search engine, it is first of all associated with the study of the process of semantic analysis and synthe

50

Lila, A. M., I. Yu Torshin, A. N. Gromov, V. A. Semenov, and O. A. Gromova. "Pharmacoinformation studies of chondroprotectors." Modern Rheumatology Journal 15, no. 5 (2021): 114–20. http://dx.doi.org/10.14412/1996-7012-2021-5-114-120.

Abstract:

The pharmacoinformation approach to the assessment and modeling of drugs involves the use of modern methods of data mining. These methods include: 1) analysis of big data (selection of texts of scientific publications, search for new biomarkers); 2) computer analysis of texts (automatic classification of texts by content, identification of pseudoscientific texts); 3) analysis of metric maps (visualization and analysis of complex patterns, including clustering) and 4) chemoinformation analysis, including the assessment of the effect of drugs on the transcriptome, proteome and microbiome of a pe
