Journal articles on the topic 'Corpus linguistics'

Consult the top 50 journal articles for your research on the topic 'Corpus linguistics.'

1

Jorroch, Anna, and Irena Prawdzic-Jankowska. "Corpus Linguistics meets Contact Linguistics and Sociolinguistics: multimodal German–Polish digital linguistic corpus." Corpora 19, no. 3 (November 2024): 375–94. http://dx.doi.org/10.3366/cor.2024.0319.

Abstract:
The objective of this paper is to present the procedures for building the bilingual corpus LangGener, along with its tools and annotation schemes. LangGener is an innovative compilation of data influenced by contact linguistics and sociolinguistics. The corpus is multimodal in that users can view the transcription and the grammatical and sociolinguistic annotation of the original recorded interviews and also listen to them, because it contains ‘recordings, transcriptions and annotations’ (Allwood, 2008: 207). The corpus is part of the international LangGener project on changes in morphology and syntax in Polish and German in the speech of bilinguals over two generations. It contains seventy-eight hours of material, equating to 679,741 words. The material was collected during fieldwork in Germany and in the former German territories incorporated into Poland in 1945. The article describes the demanding procedures and the complex work on the recorded interviews, the tools used in the project, and the annotation of mutual Polish–German linguistic contact phenomena and dialectal remnants. In order to relate the results to the language biographies of the interviewees, sociolinguistic annotation was carried out in parallel, presenting the modes of acquisition and language use in the speech of bilinguals. The corpus also enables a general typologisation of language contact phenomena for Slavic languages in contact with Germanic languages.
2

MEYER, CHARLES F. "CORPUS LINGUISTICS." World Englishes 13, no. 1 (March 1994): 101–3. http://dx.doi.org/10.1111/j.1467-971x.1994.tb00288.x.

3

Showalter, Michael. "Corpus Linguistics Criticisms of Heller Misuse Corpus Linguistics." SMU Law Review Forum 75, no. 1 (June 2022): 264–76. http://dx.doi.org/10.25172/slrf.75.1.9.

4

Ooi, V. B. Y. "Corpus Linguistics at Work (Studies in Corpus Linguistics.)." Literary and Linguistic Computing 17, no. 2 (June 1, 2002): 263–66. http://dx.doi.org/10.1093/llc/17.2.263.

5

Toirova, Guli. "Creation and importance of language corps in Uzbekistan." BIO Web of Conferences 84 (2024): 04003. http://dx.doi.org/10.1051/bioconf/20248404003.

Abstract:
The article discusses the transformation of language into the language of the Internet, computer technology, mathematical linguistics, its continuation and the formation and development of computer linguistics, in particular the question of modeling natural languages for artificial intelligence. The Uzbek National Corpus plays an important role in enhancing the international status of the Uzbek language. The work carried out in the field of computer linguistics plays an important role in resolving existing problems in the Uzbek language. The question of the linguistic and extralinguistic separation of special tags for marking texts and their components is studied in particular. The coding requirements for important text information are defined. The state analyzes the linguistic module and the algorithm and its types from independent components of the linguistic program code. The need for algorithms for phonological, morphological and spelling rules for the formation of the lexical and grammatical code is scientifically substantiated. The importance of such linguistic modules as phonology, morphology and spelling in the formation of the linguistic base of the national corpus of the Uzbek language is emphasized. The article examines the corpus’s primary purpose as a complex linguistic source, as well as the fact that it primarily contains two sorts of information and its types. The key effective capabilities of the corpus, according to the paper, are reducing time spent on the text analysis process and being able to explain the properties of language units in speech with thousands of instances. The national corpus, the educational corpus, and the parallel corpus are all discussed in the subject of computer linguistics. The article stresses that their linguistic and extralinguistic tagging, the development of corpus formation algorithms, and the establishment of corpus linguistic support are all societal needs.
It recognizes the urgency of developing the basis for the creation of the Uzbek language corpus, conducting research in the field of computer linguistics as a scientific and theoretical source.
6

Toirova, Guli Ibragimovna. "THE IMPORTANCE OF LINGUISTIC MODELS IN THE DEVELOPMENT OF LANGUAGE BASES GE BASE." Scientific Reports of Bukhara State University 4, no. 6 (December 29, 2020): 98–106. http://dx.doi.org/10.52297/2181-1466/2020/4/6/8.

Abstract:
Relevance. In Uzbek linguistics, a number of studies have been carried out on automatic translation, the development of the linguistic foundations of the author's corpus, the processing of lexicographic texts and linguistic-statistical analysis. However, the processing of the Uzbek language as the language of the Internet (spelling, automatic processing and translation programs, search programs for various characters, text generation, the linguistic basis of the text corpus and national corpus, and the technology of its software) has not been studied in any monograph. The article discusses such problems as the transformation of language into the language of the Internet, computer technology, mathematical linguistics, its continuation and the formation and development of computer linguistics, in particular the question of modeling natural languages for artificial intelligence. The Uzbek National Corpus plays an important role in enhancing the international status of the Uzbek language.
7

Chilingaryan, Kamo P. "Corpus Linguistics: Theory Vs Methodilogy." RUDN Journal of Language Studies, Semiotics and Semantics 12, no. 1 (December 15, 2021): 196–218. http://dx.doi.org/10.22363/2313-2299-2021-12-1-196-218.

Abstract:
The article is devoted to a comprehensive study of the stages of formation and development of corpus linguistics. The purpose of the article is to analyze various scientific approaches to the scientific significance of this linguistic discipline and identify a set of concepts and criteria that form the foundation of this field. Corpus linguistics is one of the most promising and rapidly developing areas of language research. Linguistics of the XIX century set as its goal the study of language as such, and linguistics of the XXI century sees the relevance of the research not in identifying absolute linguistic categories and meanings but in the practical application of linguistic knowledge. The relevance of the article is determined by the fact that the linguistic corpus contains a vast potential, which the scientific community has not fully comprehended, since the text as the main object of corpus linguistics in various forms of its implementation is one of the central components of the systems of language and speech-thinking activity of a modern native speaker of any language. The content and volume of linguistic corpora of various kinds allow obtaining reliable information about the modern and real use of a particular term: the corpus becomes a tool for analyzing the functioning of this term both in the linguistic field of morphology, syntax, and vocabulary and in the theory and practice of translation, identifying the register of its formal or informal usage. The fundamental novelty of this study's results allows us to speak about the legitimacy of the creation of corpus dictionaries and corpus grammars of a new generation, developed and verified concerning a specific fixed corpus. Simultaneously, the author substantiates the proposition that the corpus nature of dictionaries and grammars increases their reliability and objectivity and avoids the subjectivity that is often characteristic of research based solely on the intuition of a linguist.
The corpus is a medium for obtaining new scientific data, the comprehension of which seems to be a priority for modern linguistic description and necessary in the scientific activity of a modern researcher. From our point of view, this article's relevance and novelty lie in the fact that the expediency of corpus research is an essential requirement of the time, associated with a new quality of linguistic reality and meeting the needs of modern society. The article examines the main stages of the formation of corpus linguistics as a scientific field, characterizes the scientific concepts and approaches inherent in each of these stages, provides an overview of the main conceptual provisions of corpus linguistics within the framework of domestic and foreign linguistics. The author analyzes in detail the polemics between representatives of various scientific directions and reveals the advantages of one or another approach, traces the similarities and differences between approaches to the study of corpora at various historical stages of their formation. The review's focus is the role and place of corpus studies of language in modern linguistics, comparison of the pro and contra arguments of the use of corpus technologies in linguistic description. Considerable attention is paid to the main criteria for the classification of corpora, a brief overview of the most famous corpora in history is offered, and the prospects for their use in various fields of modern language science are discussed.
8

Vukolova, Kateryna. "USAGE OF LANGUAGE CORPUSES IN STUDYING WORD FORMATION OF MODERN AMERICAN ENGLISH DIALECTS." Naukovì zapiski Nacìonalʹnogo unìversitetu «Ostrozʹka akademìâ». Serìâ «Fìlologìâ» 1, no. 17(85) (June 22, 2023): 10–13. http://dx.doi.org/10.25264/2519-2558-2023-17(85)-10-13.

Abstract:
The article deals with the principles of using the concepts of corpus linguistics in studying American English dialects. The origins of corpus linguistics of American English dialects are considered, and developments in corpus linguistics by W. Francis and H. Kučera are investigated (Brown Corpus, 1964). The concept of «corpus linguistics» is defined, and some problematic aspects are described. The qualifications for using the corpus-linguistic approach in the study of dialect grammatical features of American English are defined. The grammatical differences between British and American English are outlined. "Spoken English" is defined as how the latter was dialectal in the United States. The American dialects and Standardized American English are distinguished. Attention is focused on comparative and legal developments in corpus linguistics of British and American English (Brown and Lancaster corpora). Discographic studies of spoken American (dialectal) are mentioned: Lancaster/IBM SEC and Corpus of Spoken American English. The peculiarities of using the International Corpus of English, which studies British and American English in philological and methodological unity, are outlined. The processes of expanding the language corpus of the American English language from 1990 till the present (through the prism of statistics from The Bank of English) are determined. The main methodological developments in the systematization of the corpus of American English from the point of view of the theory of Corpus Linguistics are indicated: software corpus; problem-parametric industry selections; text transcription; search and system work; description encoding of "preserved" dialect words and word formations.
9

Barlow, Michael. "Corpus linguistics and theoretical linguistics." International Journal of Corpus Linguistics 16, no. 1 (March 11, 2011): 3–44. http://dx.doi.org/10.1075/ijcl.16.1.02bar.

Abstract:
This paper examines the relationship between corpus linguistics and theoretical linguistics from a variety of standpoints. We consider the nature of the fit between particular theoretical approaches and the three areas in which corpus linguistics has made a significant contribution to our understanding of language: the provision of frequency information, the highlighting of the importance of collocations, and the description of variation and text types. The complex relationship between data, theory, and representation is described with the aim of situating corpus-based research with respect to different linguistic theories, looking broadly at British and American traditions and paying particular attention to usage-based models of language. We then briefly discuss some current issues surrounding theoretical developments within corpus linguistics, including the divide between cognitive and social perspectives; the representation of corpus-based generalisations; and the relationship between patterns in corpus data and patterns in the mind.
10

Seidl-Péch, Olívia. "Zu theoretischen und praktischen Aspekten des Fachübersetzens [On theoretical and practical aspects of specialized translation]." Acta Universitatis Sapientiae, Philologica 9, no. 3 (December 1, 2017): 135–44. http://dx.doi.org/10.1515/ausp-2017-0034.

Abstract:
In the past few decades, much has been written about corpus linguistics, which has owed its upswing mainly to the use of electronic corpora since the 1960s (Brown Corpus). Meanwhile, an increasing number of fields within general and applied linguistics (e.g. computational linguistics, discourse analysis, contrastive linguistics, diachronic and synchronic linguistics, language teaching and learning research, lexicology and lexicography, psycholinguistics, sociolinguistics, translation studies) have been using corpus linguistic methods. In linguistic research, the empirical and descriptive character of corpus-based linguistic analysis has also been given an emphasis. Thanks to the digital revolution of the 20th and 21st centuries, the creation and provision of digital linguistic corpora is becoming accessible for smaller nations and language communities as well as for scientists. Nowadays, linguistic corpora can be regarded not only as a tool to support language research and Translation Studies; they also contribute to the enrichment of cultural diversity. The article focuses on international examples as well as on the most significant Hungarian corpora. The paper also discusses the criteria of corpus creation and several cultural aspects of corpus linguistics.
11

Gries, Stefan Th. "Corpus linguistics and theoretical linguistics." International Journal of Corpus Linguistics 15, no. 3 (July 30, 2010): 327–43. http://dx.doi.org/10.1075/ijcl.15.3.02gri.

12

Wilks, Yorick. "Corpus linguistics and computational linguistics." International Journal of Corpus Linguistics 15, no. 3 (July 30, 2010): 408–11. http://dx.doi.org/10.1075/ijcl.15.3.12wil.

13

Romero-Barranco, Jesús, and Paula Rodríguez-Abruñeiras. "Current trends in Corpus Linguistics and textual variation." Research in Corpus Linguistics 9, no. 2 (2021): i–xiii. http://dx.doi.org/10.32714/ricl.09.02.01.

Abstract:
Corpus Linguistics has proved of great value as a methodological tool in shedding light on how discourse is constructed in different text types. This opening contribution to the special issue “Corpus-linguistic perspectives on textual variation” provides an account of some of the most common applications of Corpus Linguistics, describes some of the most widely used corpora, and pins down some of the most influential corpus-based research works. In so doing, we contextualise the contributions to this collection of articles. The main aim of this special issue is to showcase cutting-edge research on textual variation based on linguistic corpora, thus illustrating how Corpus Linguistics draws from but also feeds a multiplicity of linguistic branches, such as (Critical) Discourse Analysis, Register Studies, Historical Linguistics, and Dialectology.
14

Gulyamova, Shakhnoza Kakhramonovna. "CORPUS LINGUISTICS IS A PRIORITY AREA OF MODERN APPLIED LINGUISTICS." Scientific Reports of Bukhara State University 4, no. 4 (August 28, 2020): 106–10. http://dx.doi.org/10.52297/2181-1466/2020/4/4/9.

Abstract:
This article describes an independent branch of computational linguistics - corpus linguistics, which is the main and most promising direction of modern applied linguistics. Based on the essence, goals and objectives of corpus linguistics, the results achieved in world linguistics, the scientific views of a number of scientists are summarized. It was noted that the creation of a national corpus in the Uzbek language is one of the urgent tasks facing science, and comments were made on this.
15

Motschenbacher, Heiko. "Corpus linguistics in language and sexuality studies." Journal of Language and Sexuality 7, no. 2 (August 27, 2018): 145–74. http://dx.doi.org/10.1075/jls.17019.mot.

Abstract:
As an introduction to the special issue, this paper presents an overview of previous corpus linguistic work in the field of language and sexuality and discusses the compatibility of corpus linguistic methodology with queer linguistics as a central theoretical approach in language and sexuality studies. The discussion is structured around five prototypical aspects of corpus linguistics that may be deemed problematic from a poststructuralist, queer linguistic perspective: quantification and associated notions of objectivity, reliance on linguistic forms and formal presence, concentration on highly frequent features, reliance on categories, and highlighting of differences. It is argued that none of these aspects rules out an application of corpus linguistic techniques within queer theoretically informed linguistic work per se and that it is rather the way these techniques are employed that can be seen as more or less compatible with queer linguistics. To complement the theoretical discussion, a collocation analysis of sexual descriptive adjectives in the Corpus of Contemporary American English (COCA) is conducted in an attempt to address some of the issues raised. The concluding section makes suggestions for future research.
16

Helt, Marie E., Graeme Kennedy, Tony McEnery, Andrew Wilson, Douglas Biber, Susan Conrad, and Randi Reppen. "Corpus Linguistics Texts." TESOL Quarterly 37, no. 3 (October 1, 2003): 559. http://dx.doi.org/10.2307/3588406.

17

Čermák, František. "Today's corpus linguistics." International Journal of Corpus Linguistics 7, no. 2 (December 31, 2002): 265–82. http://dx.doi.org/10.1075/ijcl.7.2.06cer.

Abstract:
The paper is concerned with problems of methodology. Against this background, the situation of today's corpora is discussed and some fields are identified as being in a far from satisfactory shape. The place of corpora in linguistics is briefly looked at, suggesting that structuralist tradition is the only one to use them extensively. Problems of annotation and ways, less (statistical) or more successful (rule-based), are raised and discussed. Here, some of the most serious shortcomings, such as multi-word units or status of language units in general that computational linguists should deal with, are listed. In a more general direction, implications and status of paradigmatics and syntagmatics are discussed, too, with considerable and critical attention paid to ontologies.
18

Stoykova, Velislava. "Teaching Corpus Linguistics." Procedia - Social and Behavioral Sciences 143 (August 2014): 437–41. http://dx.doi.org/10.1016/j.sbspro.2014.07.513.

19

Farwell, David. "English corpus linguistics." English for Specific Purposes 13, no. 1 (1994): 109–12. http://dx.doi.org/10.1016/0889-4906(94)90032-9.

20

Plungian, V. A. "Corpus linguistics nowadays." Vestnik Rossijskoj akademii nauk 94, no. 9 (November 4, 2024): 787–94. http://dx.doi.org/10.31857/s0869587324090018.

Abstract:
The article proposes a general presentation of corpus linguistics, its history, its methodology and its influence on current views on how a language must be studied – the process which is usually referred to as “corpus revolution”.
21

Tolmasova, Khursantosh Anvar kizi. "USING CORPUS LINGUISTICS IN TEACHING LEISURE READING." American Journal of Social Science and Education Innovations 04, no. 04 (April 1, 2022): 34–39. http://dx.doi.org/10.37547/tajssei/volume04issue04-05.

Abstract:
The use of methods of applied linguistics in teaching a foreign language is considered. The potential of learning on the basis of corpus language material is analyzed. An analysis of such corpus linguistics methods as information retrieval (IR), data-driven learning, text searches in large-scale corpora (concordances), and natural language processing (NLP) methods is presented. In addition, the role and importance of the corpus in leisure reading was studied.
22

Degano, Chiara. "Corpus linguistics and argumentation." Journal of Argumentation in Context 5, no. 2 (October 14, 2016): 113–38. http://dx.doi.org/10.1075/jaic.5.2.01deg.

Abstract:
This paper explores the viability of a synergy between corpus linguistics and the study of argumentation in context. While quantitative approaches to the study of discourse have been profitably integrated at the levels of lexico-grammar and syntax, more rarely has this been the case for higher levels of analysis such as argumentative structures. Such an approach would help identify those recurring patterns of argumentation that build up cumulatively, and which can only be identified in larger samples of discourse. In particular this paper concerns how the tools of corpus linguistics can be put to use for the analysis of strategic manoeuvring, and especially topical selection. In order to do so, the televised prime ministerial debates held on the occasion of the 2010 general election in the UK will be taken as a case study, with a focus on the use of linguistic indicators that might help retrieve argumentative patterns.
23

ALYIEVA, A., and I. OSTAPCHUK. "CORPORAL LINGUISTICS IN TEACHING FOREIGN LANGUAGES." Current issues of linguistics and translations studies, no. 27 (April 27, 2023): 5–8. http://dx.doi.org/10.31891/2415-7929-2023-27-1.

Abstract:
The article is devoted to the study of the role of corpus linguistics in the teaching of foreign languages, in particular English. It defines the main aspects of corpus linguistics, considers the corpus approach as a method of teaching foreign languages, its advantages and disadvantages, gives a brief history of corpus linguistics as a branch of linguistics, and gives examples of possible tasks based on linguistic corpora. The article presents the tasks of the corpus approach that are related to the teaching of foreign languages and indicates its main characteristics, which determine its reliability and validity. By analyzing research on corpus linguistics, the authors summarized the typology of corpora. The article substantiates the expediency of using the corpus approach in teaching a foreign language in higher educational institutions. The authors established that the use of corpora in students' work, combined with the inductive method, contributes to the students’ awareness of the main language patterns and the development of linguistic intuition, while the study material consists of authentic texts. The authors revealed the possibility of using a direct and indirect corpus approach in the formation of lexical and grammatical skills of students. Thus, the direct use of this method may involve the teaching of corpus linguistics to university students as a purely academic subject, the performance of certain tasks or exercises using concordance programs, and the implementation of individual research projects of students. The indirect corpus approach includes material development and language testing. The article contains information about special software that can be used to implement training on the main types of tasks that can be offered on the basis of these programs.
24

Kogan, Marina, and Victor Zakharov. "A Project Work as a Way of Bringing Corpora to Secondary School." Journal of Linguistics/Jazykovedný casopis 73, no. 1 (June 1, 2022): 986–95. http://dx.doi.org/10.2478/jazcas-2022-0022.

Abstract:
Corpus linguistics is one of the most dynamic and rapidly developing areas of modern linguistics. It affects all areas of linguistics, including methodology of teaching foreign languages, translation and other linguistic disciplines. Corpus linguistics has had a direct impact on teaching foreign languages. However, in general, it remains a marginal method in teaching. Analysis of publications on the subject allows us to conclude that very few studies are long-term and aimed at working with schoolchildren. This article proposes a model for the development of sustainable interest among high school students in online corpora as sources of linguistic information, including the initiation stage in the form of project work in mini-groups to study well-known sayings with the consequent stage aiming at completing tasks supplementing the main textbook on a regular basis. The organization of project work addressing the corps of 11th grade students of the Natural Science Lyceum at Peter the Great St. Petersburg Polytechnic University is described. The paper outlines further research.
25

Jensen, Kim Ebensgaard. "Linguistics in the digital humanities: (computational) corpus linguistics." MedieKultur: Journal of media and communication research 30, no. 57 (December 19, 2014): 20. http://dx.doi.org/10.7146/mediekultur.v30i57.15968.

Abstract:
Corpus linguistics has been closely intertwined with digital technology since the introduction of university computer mainframes in the 1960s. Making use of both digitized data in the form of the language corpus and computational methods of analysis involving concordancers and statistics software, corpus linguistics arguably has a place in the digital humanities. Still, it remains obscure and figures only sporadically in the literature on the digital humanities. This article provides an overview of the main principles of corpus linguistics and the role of computer technology in relation to data and method and also offers a bird's-eye view of the history of corpus linguistics with a focus on its intimate relationship with digital technology and how digital technology has impacted the very core of corpus linguistics and shaped the identity of the corpus linguist. Ultimately, the article is oriented towards an acknowledgment of corpus linguistics' alignment with the digital humanities.
26

Aarts, Jan, Hans van Halteren, and Nelleke Oostdijk. "The Linguistic Annotation of Corpora." International Journal of Corpus Linguistics 3, no. 2 (January 1, 1998): 189–210. http://dx.doi.org/10.1075/ijcl.3.2.02aar.

Abstract:
The article discusses the role of linguistic annotation in corpus linguistics as opposed to annotation in natural language processing. In corpus linguistics, annotation is an integral part of the process of linguistic interpretation and description of the data. Tagging and parsing are discussed as the automatic counterparts of, respectively, the paradigmatic and the syntagmatic description of corpus data. The requirements for a corpus linguistic annotation system are considered. An account is given of the TOSCA analysis system as representative of such an annotation system. Performance results of the system are given, and an evaluation is made.
27

Xu, Jiajin. "Corpus-based Chinese studies." Chinese Language and Discourse 6, no. 2 (December 30, 2015): 218–44. http://dx.doi.org/10.1075/cld.6.2.06xu.

Abstract:
This article reviews corpus-based Chinese studies, both applied and theoretical, from the 1920s to the present. It will be shown that, while corpus-based Chinese studies have been gaining momentum for only the last couple of decades, the roots of Chinese corpus linguistics go all the way back to the beginning of the 20th century. Today the bulk of corpus-based Chinese studies is oriented toward applied linguistics, with the compilation of frequency character/word lists and interlanguage Chinese studies being the most popular types of research. In addition to applied linguistic studies, this overview also highlights some innovative corpus studies on lexical and grammatical aspects of both classical and modern Chinese, as well as studies of sociolinguistic variation and discourse pragmatics. Overall, important groundwork in Chinese corpus linguistics is acknowledged and future directions are discussed.
28

Khoy, Bunlot, and Piseth An. "Identifying High-Frequency Words in Khmer Texts: A Corpus Linguistics Analysis." Cambodian Journal of Education and STEM 2, no. 1 (March 31, 2024): 56–70. http://dx.doi.org/10.62219/cjes.2024214.

Abstract:
Reading, listening, speaking, and writing skills are essential in human communication. Learners or teachers who do not understand the high-frequency words that are the foundation for understanding the four skills are more likely to spend much time and less likely to get good results. Therefore, high-frequency words play an essential role in helping learners achieve their goals and in helping curriculum designers or developers create applications that are easily accessible to the public. This study aims to identify the high-frequency words in Khmer text, using a normalized frequency (NormFreq) threshold of 40 occurrences per 1 million words, in texts drawn from Koh Santepheap newspapers, Let’s Read! (developed by the Asia Foundation), books on a collection of Khmer/Cambodian folktales, a set of Khmer wisdom books, and books on Khmer literature at the high school level. AntConc is a corpus linguistics program used to analyze high-frequency words, and the Khmer dictionary 2022 is used to classify the parts of speech of the most frequent words, selected with the NormFreq threshold of 40 occurrences per 1 million words. As a result, this study identified a list of 1,974 high-frequency words, with nouns being the most common part of speech, comprising 1,008 words. These research findings may assist teachers, curriculum developers, NGOs, or relevant partners in considering high-frequency words and nouns when preparing reading texts or materials for basic or elementary levels.
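
The normalized-frequency threshold mentioned in this abstract (40 occurrences per 1 million words) can be illustrated with a minimal Python sketch. The function names, the toy token list and the whitespace tokenization are assumptions made for illustration only; the study itself used AntConc, and real Khmer text would need a proper word segmenter rather than `str.split`.

```python
from collections import Counter


def normalized_frequencies(tokens, per=1_000_000):
    """Scale each word's raw count to occurrences per `per` tokens."""
    total = len(tokens)
    if total == 0:
        return {}
    counts = Counter(tokens)
    return {word: count * per / total for word, count in counts.items()}


def high_frequency_words(tokens, threshold=40.0, per=1_000_000):
    """Return (word, normalized frequency) pairs at or above the threshold,
    most frequent first, ties broken alphabetically."""
    freqs = normalized_frequencies(tokens, per)
    kept = [(word, freq) for word, freq in freqs.items() if freq >= threshold]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


# Hypothetical toy corpus of 11 tokens. Setting per=11 makes the
# normalized frequency equal to the raw count, which keeps the
# example easy to check by hand.
tokens = "the cat sat on the mat and the dog sat too".split()
print(high_frequency_words(tokens, threshold=2, per=11))
```

With a real corpus, the defaults (`threshold=40.0`, `per=1_000_000`) would correspond to the cut-off described in the abstract, under the assumption that normalized frequency is computed as raw count divided by corpus size, times one million.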
APA, Harvard, Vancouver, ISO, and other styles
29

Koplenig, Alexander. "Against statistical significance testing in corpus linguistics." Corpus Linguistics and Linguistic Theory 15, no. 2 (October 25, 2019): 321–46. http://dx.doi.org/10.1515/cllt-2016-0036.

Abstract:
In the first volume of Corpus Linguistics and Linguistic Theory, Gries (2005. Null-hypothesis significance testing of word frequencies: A follow-up on Kilgarriff. Corpus Linguistics and Linguistic Theory 1(2). doi:10.1515/cllt.2005.1.2.277. http://www.degruyter.com/view/j/cllt.2005.1.issue-2/cllt.2005.1.2.277/cllt.2005.1.2.277.xml: 285) asked whether corpus linguists should abandon null-hypothesis significance testing. In this paper, I want to revive this discussion by defending the argument that the assumptions that allow inferences about a given population – in this case about the studied languages – based on results observed in a sample – in this case a collection of naturally occurring language data – are not fulfilled. As a consequence, corpus linguists should indeed abandon null-hypothesis significance testing.
30

Джуманиязова, Интизор. "КОРПУСНАЯ ЛИНГВИСТИКА КАК НОВЕЙШЕЕ НАПРАВЛЕНИЕ В ЯЗЫКОЗНАНИИ [Corpus linguistics as the newest direction in linguistics]." TAMADDUN NURI JURNALI 8, no. 59 (August 30, 2024): 65–67. http://dx.doi.org/10.69691/27gybh22.

Abstract:
Currently, there are a fairly large number of language corpora, including Russian language corpora, which differ from each other in a variety of ways. The article gives a brief analysis of the history of corpus linguistics, defines its tasks and methods, identifies its interaction with other linguistic disciplines, gives a general description of the corpus as the basic concept of corpus linguistics, and presents a classification of text corpora.
31

Arppe, Antti, Gaëtanelle Gilquin, Dylan Glynn, Martin Hilpert, and Arne Zeschel. "Cognitive Corpus Linguistics: five points of debate on current theory and methodology." Corpora 5, no. 1 (May 2010): 1–27. http://dx.doi.org/10.3366/cor.2010.0001.

Abstract:
Within cognitive linguistics, there is an increasing awareness that the study of linguistic phenomena needs to be grounded in usage. Ideally, research in cognitive linguistics should be based on authentic language use, its results should be replicable, and its claims falsifiable. Consequently, more and more studies now turn to corpora as a source of data. While corpus-based methodologies have increased in sophistication, the use of corpus data is also associated with a number of unresolved problems. The study of cognition through off-line linguistic data is, arguably, indirect, even if such data fulfils desirable qualities such as being natural, representative and plentiful. Several topics in this context stand out as particularly pressing issues. This discussion note addresses (1) converging evidence from corpora and experimentation, (2) whether corpora mirror psychological reality, (3) the theoretical value of corpus linguistic studies of ‘alternations’, (4) the relation of corpus linguistics and grammaticality judgments, and, lastly, (5) the nature of explanations in cognitive corpus linguistics. We do not claim to resolve these issues nor to cover all possible angles; instead, we strongly encourage reactions and further discussion.
32

Oktavianti, Ikmi Nur. "Corpora: From theoretical linguistics to language teaching." UAD TEFL International Conference 2 (January 16, 2021): 19. http://dx.doi.org/10.12928/utic.v2.5731.2019.

Abstract:
Corpora have gained popularity in linguistics over the past five decades, from the computerized storage of English in the Survey of English Usage in 1959 to the ongoing development of the Corpus of Contemporary American English. Because of the huge amount of actual language data compiled in corpora, many linguists and language teachers working with English have benefited from them in linguistic research and teaching practice. There are now numerous online English corpora recording data from various genres, modes, and regions, as well as corpus tools for analyzing self-compiled corpora. The massive development of corpora, however, has not been widely discussed among English language researchers and practitioners in Indonesia, let alone in English language teaching. Although linguistics and language teaching are inseparable and firmly related fields, the corpus as a concept and product of linguistics seems to be ignored or even avoided. This paper therefore aims to review the nature of corpora and how they are used to assist linguistic analysis. More importantly, this paper discusses another possible application of corpora, namely their use in language teaching. Considering the nature and the benefits of corpora, it is important to promote their use to enhance English language teaching and learning, either directly in the classroom or indirectly in materials development.
33

Gulchekhra, Khurramova. "SIMILARITIES OF LEXICAL-SEMANTIC RELATIONS IN UZBEK AND ENGLISH LANGUAGES." European International Journal of Multidisciplinary Research and Management Studies 4, no. 6 (June 1, 2024): 91–94. http://dx.doi.org/10.55640/eijmrms-04-06-14.

Abstract:
This article deals with corpus linguistics: ideas about the corpus and its link to parallel corpora, its structure, corpus types, tokens, lemmas, and stemming. Today, the theoretical and practical significance of the corpus lies in studying the existing possibilities of language in Uzbek linguistics, identifying problematic aspects of linguistics, creating electronic dictionaries, and increasing the effectiveness of modern information technology in language learning, automatic translation, search, and computer analysis. Solving these problems requires building language corpora in specific areas.
34

Mahmudov, Masud. "THE PRİORİTY İSSUES OF THE CORPUS LİNGUİSTİCS." Alatoo Academic Studies 19, no. 3 (October 30, 2019): 124–28. http://dx.doi.org/10.17015/aas.2019.193.10.

Abstract:
Corpus linguistics is an intensively developing branch of modern linguistics that deals with the development, creation, and use of large-volume text corpora. The term was introduced into scientific circulation in the 1960s in connection with the development of technology for the creation of corpora, which since the 1980s has contributed to the development of a new generation of computer technology. The thesis gives brief information about the specific features and direction of development of corpus linguistics as the newest field of modern linguistic science, the research carried out on the Turkish and Azerbaijani languages, and the priority issues of corpus linguistics.
35

Brookes, Gavin, and Tony McEnery. "Corpus linguistics for indexing." Indexer: The International Journal of Indexing 37, no. 2 (June 2019): 105–23. http://dx.doi.org/10.3828/indexer.2019.16.

36

Sardinha, Tony Berber. "Metaphor and Corpus Linguistics." Revista Brasileira de Linguística Aplicada 11, no. 2 (2011): 329–60. http://dx.doi.org/10.1590/s1984-63982011000200004.

Abstract:
In this paper, I look at four different aspects of metaphor research from a corpus linguistic perspective, namely: (1) the lexicogrammar of metaphors, which refers to the patterning of linguistic metaphor revealed by corpus analysis; (2) metaphor probabilities, which is a facet of metaphor that emerges from frequency-based studies of metaphor; (3) dimensions of metaphor variation, or the search for systematic parameters of variation in metaphor use across different registers; and (4) automated metaphor retrieval, which relates to the development of software to help identify metaphors in corpora. I argue that these four aspects are interrelated, and that advances in one of them can drive changes in the others.
37

Butler, Christopher S. "Corpus linguistics at work." System 31, no. 1 (March 2003): 128–32. http://dx.doi.org/10.1016/s0346-251x(02)00078-7.

38

Schönefeld, Doris. "Corpus Linguistics and Cognitivism." International Journal of Corpus Linguistics 4, no. 1 (August 13, 1999): 137–71. http://dx.doi.org/10.1075/ijcl.4.1.07sch.

Abstract:
This article discusses the status of corpus linguistics, how it is seen and sees itself as a field: Is it merely a method of doing linguistics, or can it be considered a distinct approach to language description? In our argument, we claim that corpus linguistics is on the way to becoming more than a methodology, since its research results are increasingly interpreted with regard to their impact on commonly held views about language. Dealing with these interpretations, we have noticed a number of similarities with assumptions made by cognitive linguistics, and we aim to show that the two trends, corpus linguistics and cognitivism, are compatible in that they complement each other.
39

Teubert, W. "Corpus Linguistics and Lexicography." International Journal of Corpus Linguistics 6, no. 1 (December 1, 2001): 125–53. http://dx.doi.org/10.1075/ijcl.6.3.11teu.

40

Teubert, Wolfgang. "Corpus Linguistics and Lexicography." Text Corpora and Multilingual Lexicography 6, no. 3 (December 17, 2001): 125–53. http://dx.doi.org/10.1075/ijcl.6.si.11teu.

41

Adolphs, Svenja. "Advances in Corpus Linguistics." Journal of Pragmatics 38, no. 2 (February 2006): 292–96. http://dx.doi.org/10.1016/j.pragma.2005.02.008.

42

Tognini-Bonelli, Elena. "Corpus Linguistics at Work." Computational Linguistics 28, no. 4 (December 2002): 583. http://dx.doi.org/10.1162/coli.2002.28.4.583a.

43

Gries, Stefan Th. "What is Corpus Linguistics?" Language and Linguistics Compass 3, no. 5 (July 9, 2009): 1225–41. http://dx.doi.org/10.1111/j.1749-818x.2009.00149.x.

44

Le Foll, Elen. "‘Opening up’ Corpus Linguistics." Second Language Teacher Education 2, no. 2 (March 7, 2024): 161–86. http://dx.doi.org/10.1558/slte.25371.

Abstract:
Despite a multitude of empirical studies pointing to the benefits of integrating corpora in second language (L2) teaching and learning, corpus use in the pre-tertiary L2 classroom remains the exception rather than the norm. This much-discussed research-practice gap can be attributed to the limited physical, intellectual and social accessibility of both corpus research and corpus resources. To address this issue, I explore the potential of Open Education in a course aimed at imparting corpus literacy to pre-service EFL teachers. I present the design of a course focused on the collaborative creation of a new Open Educational Resource (OER): a guide to creating corpus-informed teaching materials. The present study evaluates the effectiveness of three iterations of this course in enhancing the physical, intellectual, and social accessibility of corpus linguistics for L2 education. Analyses of pre- and post-course surveys, students’ OER chapter submissions, and reflection statements show that the semester-long course successfully developed participants’ technical and pedagogical corpus literacy. The findings suggest that the adoption of OER-enabled pedagogy in an initial teacher education course can make a positive contribution to bridging the corpus research-teaching gap.
45

Ünsal Şakiroğlu, Hülya. "Corpus-based variationist linguistics." Asian Languages and Linguistics 5, no. 1 (July 5, 2024): 91–103. http://dx.doi.org/10.1075/alal.23005.uns.

Abstract:
This paper aims to identify which archaic words and word groups are still known and used, both among speakers of the language and in the Turkish National Corpus (TNC), as an indication of lexical change in Turkish from 1900 to 2020. The present study explores diachronic lexical change in Turkish by combining the corpus-based variationist sociolinguistic approach with the perspective of historical sociolinguistics. Words and collocations thought to be outdated were selected from the original version of the novel “Eylül”, written in 1900, and randomly subsampled using a computer-based randomization algorithm. A survey was formed using the outdated words and collocations along with their context. The results indicated that demographic variables did not affect word knowledge and that the archaic words were uniformly unfamiliar to all participants. The overall comparison of the words and collocations tested in the TNC and in the survey indicated similar results, as the most and least frequently used words were also the most and least abundant in the TNC.
46

MISNAWATI, Misnawati, Sahril NUR, and Saidna Zulfikar Bin TAHIR. "Corpus Linguistics Today: A Qualitative Approach." Research and Innovation in Applied Linguistics-Electronic Journal 2, no. 1 (February 29, 2024): 45. http://dx.doi.org/10.31963/rial.v2i1.4486.

Abstract:
Corpus linguistics, the study of language form and function using computerized corpora, involves collecting extensive electronic texts to analyze language usage. It serves various objectives: qualitative analysis, which explores nuances in language use, and quantitative analysis, which identifies patterns in word usage and collocations. Corpus linguistics tracks language variation over time, aids language teaching, supports lexicography, and contributes to discourse analysis. The field has evolved from the 1950s to the present day, marked by technological advancements and theoretical integration. The article underscores the significance of qualitative analysis in providing contextual insights into language use. The systematic creation of a corpus involves steps like text selection, data collection, preprocessing, annotation, and quality control. Various qualitative analysis techniques, from discourse analysis to lexical semantics, offer diverse perspectives for studying linguistic phenomena. This article provides a concise overview of corpus linguistics, its evolution, the importance of qualitative analysis, corpus creation, and qualitative analysis techniques.
47

Yue, Liao, and Li Kunyu. "Research Status and Current Problems of Corpus Linguistics in China." Sinología hispánica. China Studies Review 17, no. 2 (March 6, 2024): 139–58. http://dx.doi.org/10.18002/sin.v17i2.8236.

Abstract:
After more than 40 years of development, China has made significant achievements in corpus-based research, but problems remain: corpus-based studies of linguistic phenomena are not thorough enough, practice and research have not yet been bridged, and while English corpora have made significant progress, multilingual corpora lag far behind. It has become urgent to accelerate the construction of corpora in China to keep pace with international corpora. Taking the development of linguistic research on Chinese corpora and corpus construction in China as its research object, this study adopts both diachronic and synchronic research methods, systematically reviewing the history and current state of corpus linguistics in China. The paper is designed to summarize the current bottlenecks and problems of Chinese corpus construction on which there is academic consensus, and to comprehensively and objectively analyze the problems in Chinese corpus linguistics research and the difficulties in solving them. It is hoped that the paper will draw the attention of academic circles in China and abroad, provide international experience in advanced corpus construction, help solve the problems restricting the development of Chinese corpora, and promote corpus linguistics research and corpus construction in China.
48

Goldfarb, Neal. "The Use of Corpus Linguistics in Legal Interpretation." Annual Review of Linguistics 7, no. 1 (January 14, 2021): 473–91. http://dx.doi.org/10.1146/annurev-linguistics-050520-093942.

Abstract:
Over the past decade, the idea of using corpus linguistics in legal interpretation has attracted interest on the part of judges, lawyers, and legal academics in the United States. This review provides an introduction to this nascent movement, which is generally referred to as Law and Corpus Linguistics (LCL). After briefly summarizing LCL's origin and development, I situate LCL within legal interpretation by discussing the legal concept of ordinary meaning, which establishes the framework within which LCL operates. Next, I situate LCL within linguistics by identifying the subfields that are most relevant to LCL. I then offer a linguistic justification for an idea that is implicit in the case law and that provides important support for using corpus analysis in legal interpretation: that data about patterns of usage provide evidence of how words and other expressions are ordinarily understood. I go on to discuss linguistic issues that arise from the use of corpus linguistics in disputes that involve lexical ambiguity and categorization. Finally, I point out some challenges that the growth of LCL will present for both legal professionals and linguists.
49

Sisebaev, A. Zh, and A. Zh Suyundukova. "TECHNOLOGIES OF CORPUS LINGUISTICS IN THE DEVELOPMENT OF STUDENTS' VOCABULARY." BULLETIN Series of Philological Sciences 76, no. 2 (June 15, 2021): 46–55. http://dx.doi.org/10.51889/2021-2.1728-7804.06.

Abstract:
The main tasks facing a modern foreign language teacher are to enrich the content of education, improve the educational process, and master new information technologies in a timely manner. The technologies of corpus linguistics considered in this article certainly belong to these technologies. The article focuses on the use of corpus linguistics technology in the development of students' vocabulary. It also touches on corpus linguistics more broadly: its history of development, types, and applications. As a result, it is determined that exercises for the development of students' lexical skills on the basis of a linguistic corpus are effective if the necessary conditions are met.
50

Roberto, Tania Mikaela Garcia. "Corpus linguistics in language teaching." REVISTA FOCO 16, no. 7 (July 19, 2023): e2631. http://dx.doi.org/10.54751/revistafoco.v16n7-089.

Abstract:
In this article, Corpus Linguistics is assumed as a methodology of linguistic studies that uses the computer as a tool, in pedagogical proposals in which the student is active in the process of construction of knowledge through research, interacting with the environment and the other agents, and having the teacher as the mediator of this process. In CL, the student, through the so-called concordance programs, will be able to manipulate the corpus, analyze it, observe contextual information, draw conclusions and build patterns and concepts about the subject studied, and not just browse the web, type texts or do exercises. The teacher thus effectively assumes the role of mediator in the process, promoting the development of learners' autonomy in the teaching and learning process. The challenge of making use of a methodology that does not present linguistic analysis a priori, but instead proposes its construction throughout the teaching/learning process, leads to the conclusion that the relationship with knowledge has changed with the use of new technologies, the Internet and new conceptions of social interaction. The person who holds knowledge needs to give space to the mediator of these relationships for a linguistic and cognitive development that meets the new demands of this learner.