Journal articles on the topic 'Linguistic Extraction'

Consult the top 50 journal articles for your research on the topic 'Linguistic Extraction.'

1

Diedrichsen, Elke. "Linguistic challenges in automatic summarization technology." Journal of Computer-Assisted Linguistic Research 1, no. 1 (2017): 40. http://dx.doi.org/10.4995/jclr.2017.7787.

Abstract:
Automatic summarization is a field of Natural Language Processing that is increasingly used in industry today. The goal of the summarization process is to create a summary of one document or a multiplicity of documents that will retain the sense and the most important aspects while reducing the length considerably, to a size that may be user-defined. One differentiates between extraction-based and abstraction-based summarization. In an extraction-based system, the words and sentences are copied out of the original source without any modification. An abstraction-based summary can compress, fuse
2

Rodzuan, Nur Aniq Syafiq, Shahreen Kasim, Mohanavali Sithambranathan, and Muhammad Zaki Hassan. "Classification of Biomedical Literature in Hypertension and Diabetes." International Journal on Data Science 1, no. 2 (2020): 114–19. http://dx.doi.org/10.18517/ijods.1.2.114-119.2020.

Abstract:
Textual information gives us more clear information as it is presented using words and characters, which is easy for humans to understand. To extract this kind of information, text mining was introduced as new technology. Text mining is the process of extracting non-trivial patterns or knowledge from text documents or from textual databases. The purpose of this research paper is to perform and compare keyword extraction using statistical and linguistic extraction tools for 120 text documents related to hypertension and diabetes disease. In order to draw this comparison, RStudio, a statistical-
3

Vo, Duc-Thuan, and Ebrahim Bagheri. "Open information extraction." Encyclopedia with Semantic Computing and Robotic Intelligence 01, no. 01 (2017): 1630003. http://dx.doi.org/10.1142/s2425038416300032.

Abstract:
Open information extraction (Open IE) systems aim to obtain relation tuples with highly scalable extraction in portable across domain by identifying a variety of relation phrases and their arguments in arbitrary sentences. The first generation of Open IE learns linear chain models based on unlexicalized features such as Part-of-Speech (POS) or shallow tags to label the intermediate words between pair of potential arguments for identifying extractable relations. Open IE currently is developed in the second generation that is able to extract instances of the most frequently observed relation typ
4

Hukari, Thomas E., and Robert D. Levine. "Adjunct extraction." Journal of Linguistics 31, no. 2 (1995): 195–226. http://dx.doi.org/10.1017/s0022226700015590.

Abstract:
In current linguistic theory, the theoretical status of adjunct extractions, as in for example How often do you think Robin sees Kim? is, somewhat surprisingly, an unresolved issue, with some investigators arguing that only arguments extract syntactically, entailing analyses of adverbial gaps via fundamentally different mechanisms from those posited for argument extraction. We adduce extensive evidence against such positions from a number of languages which exhibit morphological or syntactic phenomena which are sensitive to binding (extraction) domains and where this morphosyntactic flagging i
5

Szymanski, Terrence. "Automatic Extraction of Linguistic Data from Digitized Documents." Annual Meeting of the Berkeley Linguistics Society 39, no. 1 (2013): 273. http://dx.doi.org/10.3765/bls.v39i1.3886.

Abstract:
In lieu of an abstract, here is a brief excerpt: This paper presents a system for automatically extracting linguistic data from digitized linguistic documents using a combination of existing software packages and custom scripts. The system is designed to leverage existing resources in online digital libraries in order to bootstrap the creation of large, multi-lingual linguistic corpora, which can then be used to conduct data-driven experimental research into cross-linguistic or universal linguistic phenomena. The system identifies instances of foreign-language text accompanied by reference-lang
6

Khairova, Nina, Orken Mamyrbayev, Kuralay Mukhsina, Anastasiia Kolesnyk, and Saurabh Pratap. "Logical-linguistic model for multilingual Open Information Extraction." Cogent Engineering 7, no. 1 (2020): 1714829. http://dx.doi.org/10.1080/23311916.2020.1714829.

7

Benamara, Farah, Maite Taboada, and Yannick Mathieu. "Evaluative Language Beyond Bags of Words: Linguistic Insights and Computational Applications." Computational Linguistics 43, no. 1 (2017): 201–64. http://dx.doi.org/10.1162/coli_a_00278.

Abstract:
The study of evaluation, affect, and subjectivity is a multidisciplinary enterprise, including sociology, psychology, economics, linguistics, and computer science. A number of excellent computational linguistics and linguistic surveys of the field exist. Most surveys, however, do not bring the two disciplines together to show how methods from linguistics can benefit computational sentiment analysis systems. In this survey, we show how incorporating linguistic insights, discourse information, and other contextual phenomena, in combination with the statistical exploitation of data, can result in
8

Solovyev, Valery, and Vladimir Ivanov. "Knowledge-Driven Event Extraction in Russian: Corpus-Based Linguistic Resources." Computational Intelligence and Neuroscience 2016 (2016): 1–11. http://dx.doi.org/10.1155/2016/4183760.

Abstract:
Automatic event extraction from text is an important step in knowledge acquisition and knowledge base population. Manual work in development of extraction system is indispensable either in corpus annotation or in vocabularies and pattern creation for a knowledge-based system. Recent works have been focused on adaptation of existing system (for extraction from English texts) to new domains. Event extraction in other languages was not studied due to the lack of resources and algorithms necessary for natural language processing. In this paper we define a set of linguistic resources that are neces
9

Fulford, Heather. "Exploring terms and their linguistic environment in text." Terminology 7, no. 2 (2001): 259–79. http://dx.doi.org/10.1075/term.7.2.08ful.

Abstract:
The proliferation of specialist texts over recent decades has exacerbated the need for term extraction software to assist terminologists in compiling terminology collections. To this end, an automated approach to English term extraction is presented, which, in keeping with the multidisciplinary working environments of many contemporary terminologists, is designed to be domain independent. Based on observations made of the linguistic features of terms and their linguistic environment in text, this approach identifies single- and multi-word terms spanning a range of word classes. An implementati
10

Gu, Jinghang, Longhua Qian, and Guodong Zhou. "Chemical-induced disease relation extraction with various linguistic features." Database 2016 (2016): baw042. http://dx.doi.org/10.1093/database/baw042.

11

BASILI, ROBERTO, MARIA TERESA PAZIENZA, and PAOLA VELARDI. "SEMI-AUTOMATIC EXTRACTION OF LINGUISTIC INFORMATION FOR SYNTACTIC DISAMBIGUATION." Applied Artificial Intelligence 7, no. 4 (1993): 339–64. http://dx.doi.org/10.1080/08839519308949994.

12

BEREND, GÁBOR. "Exploiting extra-textual and linguistic information in keyphrase extraction." Natural Language Engineering 22, no. 1 (2014): 73–95. http://dx.doi.org/10.1017/s1351324914000126.

Abstract:
Keyphrases are the most important phrases of documents that make them suitable for improving natural language processing tasks, including information retrieval, document classification, document visualization, summarization and categorization. Here, we propose a supervised framework augmented by novel extra-textual information derived primarily from Wikipedia. Wikipedia is utilized in such an advantageous way that – unlike most other methods relying on Wikipedia – a full textual index of all the Wikipedia articles is not required by our approach, as we only exploit the category hierarc
13

DeLima, Pedro G., and Gary G. Yen. "Multiple objective evolutionary algorithm for temporal linguistic rule extraction." ISA Transactions 44, no. 2 (2005): 315–27. http://dx.doi.org/10.1016/s0019-0578(07)60184-0.

14

Jun-Tae Kim and D. I. Moldovan. "Acquisition of linguistic patterns for knowledge-based information extraction." IEEE Transactions on Knowledge and Data Engineering 7, no. 5 (1995): 713–24. http://dx.doi.org/10.1109/69.469825.

15

Rau, Lisa F., Paul S. Jacobs, and Uri Zernik. "Information extraction and text summarization using linguistic knowledge acquisition." Information Processing & Management 25, no. 4 (1989): 419–28. http://dx.doi.org/10.1016/0306-4573(89)90069-1.

16

Fu, Xiuju, and Lipo Wang. "Linguistic Rule Extraction From a Simplified RBF Neural Network." Computational Statistics 16, no. 3 (2001): 361–72. http://dx.doi.org/10.1007/s001800100072.

17

Poibeau, Thierry, and Dominique Dutoit. "Automatic extraction of paraphrastic phrases from small-size corpora." Lingvisticæ Investigationes. International Journal of Linguistics and Language Resources 32, no. 1 (2009): 77–98. http://dx.doi.org/10.1075/li.32.1.04poi.

Abstract:
This paper presents a versatile system intended to acquire paraphrastic phrases from a small-size representative corpus. In order to decrease the time spent on the elaboration of resources for NLP system (for example for Information Extraction), we suggest to use a knowledge acquisition module that helps extracting new information despite linguistic variation. This knowledge is semi-automatically derived from the text collection, in interaction with a large semantic network.
18

Ishibuchi, Hisao, Tadahiko Murata, and Tomoharu Nakashima. "Linguistic Rule Extraction from Numerical Data for High-dimensional Classification Problems." Journal of Advanced Computational Intelligence and Intelligent Informatics 3, no. 5 (1999): 386–93. http://dx.doi.org/10.20965/jaciii.1999.p0386.

Abstract:
We discuss the linguistic rule extraction from numerical data for high-dimensional classification problems. Difficulties in the handling of high-dimensional problems stem from the curse of dimensionality: the number of combinations of antecedent linguistic values exponentially increases as the number of attributes increases. Our goal is to extract a small number of simple linguistic rules with high classification ability. In this paper, the rule extraction is to find a set of linguistic rules using three criteria: its classification ability, its compactness, and the simplicity of each rule. Ou
19

GARCIA, MARCOS, and PABLO GAMALLO. "Exploring the effectiveness of linguistic knowledge for biographical relation extraction." Natural Language Engineering 21, no. 4 (2013): 519–51. http://dx.doi.org/10.1017/s1351324913000314.

Abstract:
Machine learning techniques have been implemented to extract instances of semantic relations using diverse features based on linguistic knowledge, such as tokens, lemmas, PoS-tags, or dependency paths. However, there has been little work aiming to know which of these features works better in the relation extraction task, and less in languages other than English. In this paper, various features representing different levels of linguistic knowledge are systematically evaluated for biographical relation extraction. The effectiveness of these features was measured by training several super
20

AlArfaj, Abeer. "Towards relation extraction from Arabic text: a review." International Robotics & Automation Journal 5, no. 5 (2019): 212–15. http://dx.doi.org/10.15406/iratj.2019.05.00195.

Abstract:
Semantic relation extraction is an important component of ontologies that can support many applications e.g. text mining, question answering, and information extraction. However, extracting semantic relations between concepts is not trivial and one of the main challenges in Natural Language Processing (NLP) Field. The Arabic language has complex morphological, grammatical, and semantic aspects since it is a highly inflectional and derivational language, which makes task even more challenging. In this paper, we present a review of the state of the art for relation extraction from texts, address
21

Ishibuchi, Hisao, Tomoharu Nakashima, and Tadahiko Murata. "Three-objective genetics-based machine learning for linguistic rule extraction." Information Sciences 136, no. 1-4 (2001): 109–33. http://dx.doi.org/10.1016/s0020-0255(01)00144-x.

22

Agrawal, Shaishav, Ratna Sanyal, and Sudip Sanyal. "Statistics and linguistic rules in multiword extraction: a comparative analysis." International Journal of Reasoning-based Intelligent Systems 6, no. 1/2 (2014): 59. http://dx.doi.org/10.1504/ijris.2014.063954.

23

Golik, Wiktoria, Robert Bossy, Zorana Ratkovic, and Claire Nédellec. "Improving term extraction with linguistic analysis in the biomedical domain." Research in Computing Science 70, no. 1 (2013): 157–72. http://dx.doi.org/10.13053/rcs-70-1-12.

24

Spasić, Irena, Farzaneh Sarafraz, John A. Keane, and Goran Nenadić. "Medication information extraction with linguistic pattern matching and semantic rules." Journal of the American Medical Informatics Association 17, no. 5 (2010): 532–35. http://dx.doi.org/10.1136/jamia.2010.003657.

25

Serban, Radu, Annette ten Teije, Frank van Harmelen, Mar Marcos, and Cristina Polo-Conde. "Extraction and use of linguistic patterns for modelling medical guidelines." Artificial Intelligence in Medicine 39, no. 2 (2007): 137–49. http://dx.doi.org/10.1016/j.artmed.2006.07.012.

26

Segura-Bedmar, Isabel, Paloma Martínez, and Cesar de Pablo-Sánchez. "Using a shallow linguistic kernel for drug–drug interaction extraction." Journal of Biomedical Informatics 44, no. 5 (2011): 789–804. http://dx.doi.org/10.1016/j.jbi.2011.04.005.

27

Law, Locky. "Creativity and television drama: a corpus-based multimodal analysis of pattern-reforming creativity in House M.D." Corpora 14, no. 2 (2019): 135–71. http://dx.doi.org/10.3366/cor.2019.0167.

Abstract:
Carter's (2004) theory of creativity in everyday common talk is by far the most influential in the field. He hypothesises that linguistic creativity can be categorised into pattern-forming and pattern-reforming creativity. Television drama, despite its global popularity, receives little attention from the field of linguistics. This paper aims to explore the ‘common ground’ in television drama dialogue and linguistic creativity through deciphering how pattern-reforming creativity is realised through screenplay, telecinematography and acting as meaning-making strategies. Using dialogues from th
28

Gillam, Lee, Mariam Tariq, and Khurshid Ahmad. "Terminology and the construction of ontology." Terminology 11, no. 1 (2005): 55–81. http://dx.doi.org/10.1075/term.11.1.04gil.

Abstract:
This paper discusses a method for corpus-driven ontology design: extracting conceptual hierarchies from arbitrary domain-specific collections of texts. These hierarchies can form the basis for a concept-oriented (onomasiological) terminology collection, and hence may be used as the basis for developing knowledge-based systems using ontology editors. This reference to ontology is explored in the context of collections of terms. The method presented is a hybrid of statistical and linguistic techniques, employing statistical techniques initially to elicit a conceptual hierarchy, which is then aug
29

Macketanz, Vivien, Eleftherios Avramidis, Aljoscha Burchardt, Jindrich Helcl, and Ankit Srivastava. "Machine Translation: Phrase-Based, Rule-Based and Neural Approaches with Linguistic Evaluation." Cybernetics and Information Technologies 17, no. 2 (2017): 28–43. http://dx.doi.org/10.1515/cait-2017-0014.

Abstract:
In this article we present a novel linguistically driven evaluation method and apply it to the main approaches of Machine Translation (Rule-based, Phrase-based, Neural) to gain insights into their strengths and weaknesses in much more detail than provided by current evaluation schemes. Translating between two languages requires substantial modelling of knowledge about the two languages, about translation, and about the world. Using English-German IT-domain translation as a case-study, we also enhance the Phrase-based system by exploiting parallel treebanks for syntax-aware phrase extr
30

Gang, Ju-Yeon, Hyo-Jeong Jang, and Hyo-Jung Oh. "Linguistic Features Extraction for Analysis of Great Natural Disasters Damage State." Journal of Korean Institute of Information Technology 15, no. 9 (2017): 11–21. http://dx.doi.org/10.14801/jkiit.2017.15.9.11.

31

Campello, R. J. G. B., and W. C. Amaral. "Modeling and linguistic knowledge extraction from systems using fuzzy relational models." Fuzzy Sets and Systems 121, no. 1 (2001): 113–26. http://dx.doi.org/10.1016/s0165-0114(99)00175-x.

32

Zou, Li, Hongmei Lin, Xiaoying Song, Kaihua Feng, and Xin Liu. "Rule extraction based on linguistic-valued intuitionistic fuzzy layered concept lattice." International Journal of Approximate Reasoning 133 (June 2021): 1–16. http://dx.doi.org/10.1016/j.ijar.2020.12.018.

33

Condamines, Anne. "Terminology." Terminology 2, no. 2 (1995): 219–38. http://dx.doi.org/10.1075/term.2.2.03con.

Abstract:
Terminology is reaching an important moment in its development. Within firms, new terminological needs are appearing which go beyond those of traditional translation departments. Thanks to the results of natural-language processing (NLP) and knowledge acquisition and representation in artificial intelligence (AI), new possibilities for integrating terminological data are coming to light. In order to meet these new needs, it is necessary to approach the analysis of phenomena with strong theoretical bases. The article proposes to root terminology firmly within linguistics, while paying special a
34

KIM, JUNG-JAE, and JONG C. PARK. "BIOIE: RETARGETABLE INFORMATION EXTRACTION AND ONTOLOGICAL ANNOTATION OF BIOLOGICAL INTERACTIONS FROM THE LITERATURE." Journal of Bioinformatics and Computational Biology 02, no. 03 (2004): 551–68. http://dx.doi.org/10.1142/s0219720004000739.

Abstract:
The need for extracting general biological interactions of arbitrary types from the rapidly growing volume of the biomedical literature is drawing increased attention, while the need for this much diversity also requires both a robust treatment of complex linguistic phenomena and a method to consistently characterize the results. We present a biomedical information extraction system, BioIE, to address both of these needs by utilizing a full-fledged English grammar formalism, or a combinatory categorial grammar, and by annotating the results with the terms of Gene Ontology, which provides a com
35

Xi, Xiaoming. "What does corpus linguistics have to offer to language assessment?" Language Testing 34, no. 4 (2017): 565–77. http://dx.doi.org/10.1177/0265532217720956.

Abstract:
In recent years, continuing advances in technology have increased the capacity to automate the extraction of a range of linguistic features of texts and thus have provided the impetus for the substantial growth of corpus linguistics. While corpus linguistic tools and methods have been used extensively in second language learning research, they have also been used increasingly in the design and validation of language assessments (Callies & Götz, 2015; Deshors, Götz, & Laporte, 2016; Park, 2014). The collection of papers in this special issue represents an intentional and systematic effo
36

Agrawal, Shaishav, Ratna Sanyal, and Sudip Sanyal. "Hybrid method for automatic extraction of multiword expressions." International Journal of Engineering & Technology 7, no. 2.6 (2018): 33. http://dx.doi.org/10.14419/ijet.v7i2.6.10063.

Abstract:
A three phase hybrid method for automatic extraction of English multiword expressions (MWEs) has been proposed. The proposed method is based on linguistic patterns, association and context similarity between constituent words of the MWEs. First, the expressions are extracted in the form of N-grams from the raw text and then filtered using well defined linguistic patterns. Next, these expressions are again filtered using association score and context similarity score between their constituent words. Two association measures, Dice’s coefficient and PMI have been used for calculating the associ
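The two association measures named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `association_scores`, the restriction to adjacent word pairs, and the simple relative-frequency probability estimates are all assumptions made for illustration.

```python
import math
from collections import Counter

def association_scores(tokens):
    """Return {bigram: (dice, pmi)} for adjacent word pairs (hypothetical helper)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)  # number of adjacent pairs
    scores = {}
    for (w1, w2), f12 in bigrams.items():
        # Dice's coefficient: 2 * f(w1, w2) / (f(w1) + f(w2))
        dice = 2 * f12 / (unigrams[w1] + unigrams[w2])
        # PMI: log2( P(w1, w2) / (P(w1) * P(w2)) ), relative-frequency estimates
        p12 = f12 / n_bi
        p1 = unigrams[w1] / n_uni
        p2 = unigrams[w2] / n_uni
        pmi = math.log2(p12 / (p1 * p2))
        scores[(w1, w2)] = (dice, pmi)
    return scores

tokens = "new york is big and new york is busy".split()
scores = association_scores(tokens)
```

In this toy input, "new" and "york" only ever occur together, so their Dice score is at its maximum, while a pair such as ("and", "new") scores lower. A real pipeline would first apply the linguistic-pattern filter the abstract mentions and then threshold these scores.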
37

Peng, Nanyun, Hoifung Poon, Chris Quirk, Kristina Toutanova, and Wen-tau Yih. "Cross-Sentence N-ary Relation Extraction with Graph LSTMs." Transactions of the Association for Computational Linguistics 5 (December 2017): 101–15. http://dx.doi.org/10.1162/tacl_a_00049.

Abstract:
Past work in relation extraction has focused on binary relations in single sentences. Recent NLP inroads in high-value domains have sparked interest in the more general setting of extracting n-ary relations that span multiple sentences. In this paper, we explore a general relation extraction framework based on graph long short-term memory networks (graph LSTMs) that can be easily extended to cross-sentence n-ary relation extraction. The graph formulation provides a unified way of exploring different LSTM approaches and incorporating various intra-sentential and inter-sentential dependencies, s
38

Fkih, Fethi, and Mohamed Nazih Omri. "Complex Terminology Extraction Model from Unstructured Web Text Based Linguistic and Statistical Knowledge." International Journal of Information Retrieval Research 2, no. 3 (2012): 1–18. http://dx.doi.org/10.4018/ijirr.2012070101.

Abstract:
Textual data remain the most interesting source of information in the web. In the authors’ research, they focus on a very specific kind of information namely “complex terms”. Indeed, complex terms are defined as semantic units composed of several lexical units that can describe in a relevant and exhaustive way the text content. In this paper, they present a new model for complex terminology extraction (COTEM), which integrates linguistic and statistical knowledge. Thus, the authors try to focus on three main contributions: firstly, they show the possibility of using a linear Conditional Random
39

Pimeshkov, V. K., V. V. Dikovitsky, and M. G. Shishaev. "Extraction of relation from natural language texts using statistical and linguistic methods." Transaction Kola Science Centre 11, no. 8-2020 (2020): 188–92. http://dx.doi.org/10.37614/2307-5252.2020.8.11.028.

Abstract:
The work is devoted to the automated extraction of knowledge from unstructured text with the aim of their application in fact extraction, the formation and replenishment of a thesaurus, analysis of document consistency. To extract and structure knowledge, methods of statistical and linguistic analysis are used.
40

Omar, Nazlia, and Qasem Al-Tashi. "Arabic Nested Noun Compound Extraction Based on Linguistic Features and Statistical Measures." GEMA Online® Journal of Language Studies 18, no. 2 (2018): 93–107. http://dx.doi.org/10.17576/gema-2018-1802-07.

41

Ville-Ometz, Fabienne, Jean Royauté, and Alain Zasadzinski. "Enhancing in automatic recognition and extraction of term variants with linguistic features." Terminology 13, no. 1 (2007): 35–59. http://dx.doi.org/10.1075/term.13.1.03vil.

Abstract:
The recognition and extraction of terms and their variants in texts are crucial processes in text mining. We use the ILC platform, an automatic controlled indexing platform, to perform these linguistic processes. We present a methodology for enhancing the recognition of syntactic term variation in English, using syntactic and morpho-syntactic features. Principal spurious variants of terms are ascribed to incorrect word dependencies. To overcome these problems, we consider each term variant as a window on the sentence and introduce two criteria: an internal syntactic criterion which checks that
42

HONKELA, TIMO, AAPO HYVÄRINEN, and JAAKKO J. VÄYRYNEN. "WordICA—emergence of linguistic representations for words by independent component analysis." Natural Language Engineering 16, no. 3 (2010): 277–308. http://dx.doi.org/10.1017/s1351324910000057.

Abstract:
We explore the use of independent component analysis (ICA) for the automatic extraction of linguistic roles or features of words. The extraction is based on the unsupervised analysis of text corpora. We contrast ICA with singular value decomposition (SVD), widely used in statistical text analysis, in general, and specifically in latent semantic analysis (LSA). However, the representations found using the SVD analysis cannot easily be interpreted by humans. In contrast, ICA applied on word context data gives distinct features which reflect linguistic categories. In this paper, we provid
43

Rudyakov, A. N. "Linguistic Functionalism as a Basis for the Formation of Reading Literacy." Uchenye Zapiski Kazanskogo Universiteta. Seriya Gumanitarnye Nauki 162, no. 5 (2020): 89–100. http://dx.doi.org/10.26907/2541-7738.2020.5.89-100.

Abstract:
The theoretical and applied aspects of linguistics associated with the use of the functional approach were studied. The analysis of the interrelated problems of the formation and development of functional literacy, reading literacy, as well as perception and understanding of the text was carried out taking into account the principles of studying and describing linguistic phenomena outlined in the earlier own publications. The text was described as a major object of Russian philology at the present stage and as an instrument of social interaction, which is structurally determined by its functio
44

Beliga, Slobodan, Ana Meštrović, and Sanda Martinčić-Ipšić. "Selectivity-Based Keyword Extraction Method." International Journal on Semantic Web and Information Systems 12, no. 3 (2016): 1–26. http://dx.doi.org/10.4018/ijswis.2016070101.

Abstract:
In this work the authors propose a novel Selectivity-Based Keyword Extraction (SBKE) method, which extracts keywords from the source text represented as a network. The node selectivity value is calculated from a weighted network as the average weight distributed on the links of a single node and is used in the procedure of keyword candidate ranking and extraction. The authors show that selectivity-based keyword extraction slightly outperforms an extraction based on the standard centrality measures: in/out-degree, betweenness and closeness. Therefore, they include selectivity and its modificati
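The selectivity measure described in the abstract above (the average weight on the links of a single node) can be sketched as node strength divided by node degree. The function name `node_selectivity`, the undirected edge-list representation, and the absence of self-loops are assumptions for illustration; the paper itself compares against in/out-degree measures and may well use a directed network.

```python
from collections import defaultdict

def node_selectivity(edges):
    """edges: iterable of (u, v, weight) for an undirected weighted network.

    Returns {node: total link weight / number of links} (hypothetical helper,
    assumes no self-loops).
    """
    strength = defaultdict(float)  # sum of weights on a node's links
    degree = defaultdict(int)      # number of links on a node
    for u, v, w in edges:
        for node in (u, v):
            strength[node] += w
            degree[node] += 1
    return {node: strength[node] / degree[node] for node in strength}

# Toy co-occurrence network: weights are co-occurrence counts.
edges = [
    ("keyword", "extraction", 3),
    ("keyword", "method", 1),
    ("extraction", "method", 1),
]
sel = node_selectivity(edges)
```

Ranking nodes by this value and keeping the top candidates gives a rough picture of the candidate-ranking step the abstract mentions.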
45

Xu, Jia Xin, Yan Wei, Feng Qiu, and Bo Sun. "Novel Long-Term Information Based Language Identification." Advanced Materials Research 655-657 (January 2013): 1805–8. http://dx.doi.org/10.4028/www.scientific.net/amr.655-657.1805.

Abstract:
A novel long-term information feature for language identification called shifted cepstra curve (SCC) is presented in this paper. Long-term information consists of information over multiple frames, which are commonly used in language identification systems. For instance, in parallel phone recognition language model (PPRLM), the feature vector contains not only information surpassing multiple frames but also linguistic knowledge [1]. However high computational cost for modeling linguistic gram may preclude their use in tasks which demand low memory. By contrast, experiments have proved that ling
46

Rayson, Paul. "From key words to key semantic domains." International Journal of Corpus Linguistics 13, no. 4 (2008): 519–49. http://dx.doi.org/10.1075/ijcl.13.4.06ray.

Abstract:
This paper reports the extension of the key words method for the comparison of corpora. Using automatic tagging software that assigns part-of-speech and semantic field (domain) tags, a method is described which permits the extraction of key domains by applying the keyness calculation to tag frequency lists. The combination of the key words and key domains methods is shown to allow macroscopic analysis (the study of the characteristics of whole texts or varieties of language) to inform the microscopic level (focussing on the use of a particular linguistic feature) and thereby suggesting those l
47

MIURA, KIKUKA, ICHIRO YAMADA, HIDEKI SUMIYOSHI, and NOBUYUKI YAGI. "IDENTIFICATION OF NAMES AND ACTIONS OF PRINCIPAL OBJECTS IN TV PROGRAM SEGMENTS USING CLOSED CAPTIONS." International Journal of Semantic Computing 02, no. 02 (2008): 191–206. http://dx.doi.org/10.1142/s1793351x08000403.

Abstract:
This paper proposes a method for automatically extracting principal video objects that appear in TV program segments and their actions using linguistic analysis of closed captions. We focus on features based on the text style of the closed captions by using Quinlan's C4.5 decision-tree learning algorithm. We extract a noun describing a video object and a verb describing an action for each video shot. To show the effectiveness of the method, we conducted experiments on the extraction of video segments in which animals appear and perform actions in twenty episodes of a Nature program. We obtaine
48

Pérez-Guadarramas, Yamel, Manuel Barreiro-Guerrero, Alfredo Simón-Cuevas, Francisco P. Romero, and José A. Olivas. "Analysis of OWA operators for automatic keyphrase extraction in a semantic context." Intelligent Data Analysis 24 (December 4, 2020): 43–62. http://dx.doi.org/10.3233/ida-200008.

Abstract:
Automatic keyphrase extraction from texts is useful for many computational systems in the fields of natural language processing and text mining. Although a number of solutions to this problem have been described, semantic analysis is one of the least exploited linguistic features in the most widely-known proposals, causing the results obtained to have low accuracy and performance rates. This paper presents an unsupervised method for keyphrase extraction, based on the use of lexico-syntactic patterns for extracting information from texts, and a fuzzy topic modeling. An OWA operator combining se
49

Mohamed, Sally, Mahmoud Hussien, and Hamdy M. Mousa. "ADPBC: Arabic Dependency Parsing Based Corpora for Information Extraction." International Journal of Information Technology and Computer Science 13, no. 1 (2021): 54–61. http://dx.doi.org/10.5815/ijitcs.2021.01.04.

Abstract:
There is a massive amount of different information and data in the World Wide Web, and the number of Arabic users and contents is widely increasing. Information extraction is an essential issue to access and sort the data on the web. In this regard, information extraction becomes a challenge, especially for languages, which have a complex morphology like Arabic. Consequently, the trend today is to build a new corpus that makes the information extraction easier and more precise. This paper presents Arabic linguistically analyzed corpus, including dependency relation. The collected data includes
50

K. AL-Mashhadany, Abeer, Dalal N. Hamood, Ahmed T. Sadiq Al-Obaidi, and Waleed K. Al-Mashhsdany. "Extracting numerical data from unstructured Arabic texts (ENAT)." Indonesian Journal of Electrical Engineering and Computer Science 21, no. 3 (2021): 1759. http://dx.doi.org/10.11591/ijeecs.v21.i3.pp1759-1770.

Abstract:
Unstructured data becomes challenges because in recent years have observed the ability to gather a massive amount of data from annotated documents. This paper interested with Arabic unstructured text analysis. Manipulating unstructured text and converting it into a form understandable by computer is a high-level aim. An important step to achieve this aim is to understand numerical phrases. This paper aims to extract numerical data from Arabic unstructured text in general. This work attempts to recognize numeri