Academic literature on the topic 'Statistical linguistics'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Statistical linguistics.'

Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Statistical linguistics"

1

Koplenig, Alexander. "Against statistical significance testing in corpus linguistics." Corpus Linguistics and Linguistic Theory 15, no. 2 (October 25, 2019): 321–46. http://dx.doi.org/10.1515/cllt-2016-0036.

Full text
Abstract:
In the first volume of Corpus Linguistics and Linguistic Theory, Gries (2005. Null-hypothesis significance testing of word frequencies: A follow-up on Kilgarriff. Corpus Linguistics and Linguistic Theory 1(2). doi:10.1515/cllt.2005.1.2.277. http://www.degruyter.com/view/j/cllt.2005.1.issue-2/cllt.2005.1.2.277/cllt.2005.1.2.277.xml: 285) asked whether corpus linguists should abandon null-hypothesis significance testing. In this paper, I want to revive this discussion by defending the argument that the assumptions that allow inferences about a given population – in this case about the studied languages – based on results observed in a sample – in this case a collection of naturally occurring language data – are not fulfilled. As a consequence, corpus linguists should indeed abandon null-hypothesis significance testing.
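To make clear what is being argued against, here is a minimal sketch in R, with invented counts rather than data from the article, of a typical null-hypothesis significance test on word frequencies across two corpora:

```r
# Invented counts: occurrences of one target word in two corpora of different sizes.
word_freq   <- c(corpusA = 120, corpusB = 85)
corpus_size <- c(corpusA = 1e6, corpusB = 9e5)

# 2 x 2 contingency table: target word vs. all other tokens.
tab <- rbind(word = word_freq, other = corpus_size - word_freq)

chisq.test(tab)                    # chi-squared test of equal relative frequency
prop.test(word_freq, corpus_size)  # equivalent two-proportion test
```

Koplenig's argument is that the sampling assumptions behind the p-values such calls return are not met when the "sample" is a collection of naturally occurring language data.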
APA, Harvard, Vancouver, ISO, and other styles
2

Zhukovska, Victoriia V., Oleksandr O. Mosiiuk, and Veronika V. Komarenko. "ЗАСТОСУВАННЯ ПРОГРАМНОГО ПАКЕТУ R У НАУКОВИХ ДОСЛІДЖЕННЯХ МАЙБУТНІХ ФІЛОЛОГІВ" [Application of the R software package in the research of future philologists]. Information Technologies and Learning Tools 66, no. 4 (September 30, 2018): 272. http://dx.doi.org/10.33407/itlt.v66i4.2196.

Full text
Abstract:
Corpus linguistics is a newly emerging field of applied linguistics that deals with the construction, processing, and exploitation of text corpora. To date, high-quality analysis of the vast amounts of empirical language data provided by computerized corpora is impossible without computer technologies and relevant statistical methods. Teaching future philologists to apply statistical software effectively is therefore an important stage in their research training. The article discusses the possibilities of using the R statistical software environment – one of the leading packages for statistical data analysis in Western linguistics, though still little known in Ukraine – in the research of future philologists. The paper weighs the advantages and disadvantages of this program against similar software packages (SPSS and Statistica) and provides Internet links to R self-study tutorials. The flexibility and efficacy of R for linguistic research are demonstrated with a statistical analysis of the use of hedges in a corpus of academic speech. So that novice philologists can properly understand the peculiarities of conducting a statistical linguistic experiment with R, a detailed description of each stage of the study is provided. The statistical analysis of hedges in the speech of students and lecturers was carried out using the Kolmogorov–Smirnov test and the Mann–Whitney U test. The article presents algorithms for computing these tests with built-in commands and with specialized library functions created by the R user community to extend the functionality of the software. Each R script is accompanied by a detailed description and interpretation of the results obtained. Further work will involve activities aimed at raising awareness and improving the skills of future philologists in using R, which is important for their professional development as researchers.
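By way of illustration only (the authors' own data and scripts are in the article), the two tests named in the abstract can be run in base R roughly as follows, here on invented hedge frequencies:

```r
# Invented data: hedges per 1,000 words in student and lecturer speech.
set.seed(1)
students  <- rnorm(30, mean = 6, sd = 2)
lecturers <- rnorm(30, mean = 9, sd = 2)

# Kolmogorov-Smirnov test of the student data against a fitted normal distribution.
ks.test(students, "pnorm", mean = mean(students), sd = sd(students))

# Mann-Whitney U test (implemented as wilcox.test in base R).
wilcox.test(students, lecturers)
```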
APA, Harvard, Vancouver, ISO, and other styles
3

Gries, Stefan Th., and Nick C. Ellis. "Statistical Measures for Usage-Based Linguistics." Language Learning 65, S1 (May 21, 2015): 228–55. http://dx.doi.org/10.1111/lang.12119.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Zhukovska, Viktoriia V., and Oleksandr O. Mosiiuk. "STATISTICAL SOFTWARE R IN CORPUS-DRIVEN RESEARCH AND MACHINE LEARNING." Information Technologies and Learning Tools 86, no. 6 (December 30, 2021): 1–18. http://dx.doi.org/10.33407/itlt.v86i6.4627.

Full text
Abstract:
The rapid development of computer software and network technologies has facilitated the intensive application of specialized statistical software not only in the traditional information technology spheres (statistics, engineering, artificial intelligence) but also in linguistics. The statistical software R is one of the most popular analytical tools for the statistical processing of huge arrays of digitized language data, especially in quantitative corpus-linguistic studies in Western Europe and North America. This article discusses the functionality of the software package R, focusing on its advantages for performing complex statistical analyses of linguistic data in corpus-driven studies and for creating linguistic classifiers in machine learning. With this in mind, a three-stage strategy of computer-statistical analysis of linguistic corpus data is elaborated: 1) processing the data and preparing them for statistical analysis, 2) applying statistical hypothesis-testing methods (MANOVA, ANOVA) and the Tukey post-hoc test, and 3) developing a linguistic classifier model and analyzing its effectiveness. The strategy is implemented on 11,000 tokens of English detached nonfinite constructions with an explicit subject extracted from the BNC-BYU corpus. The statistical analysis indicates significant differences in the realization of the factors of the parameter "part of speech of the subject". The analyzed linguistic data are then used to build a machine-learning model for classifying the constructions. Particular attention is devoted to the methodological perspectives of interdisciplinary research in linguistics and computer science. The potential application of the case study in training undergraduate, master's, and postgraduate students of Applied Linguistics is indicated. The article provides all the statistical data and R scripts with comprehensive descriptions and explanations. The concluding part summarizes the results obtained and highlights issues for further research connected with popularizing the R statistical software and raising specialists' awareness of this statistical analysis system.
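A hedged sketch of stages 2 and 3 of this strategy, with invented data and illustrative variable names rather than the authors' BNC-BYU material:

```r
# Stage 2: ANOVA over an invented factor "part of speech of the subject",
# followed by the Tukey post-hoc test.
set.seed(42)
pos     <- factor(sample(c("noun", "pronoun", "proper_noun"), 300, replace = TRUE))
feature <- rnorm(300, mean = 10, sd = 3) + ifelse(pos == "pronoun", 2, 0)
dat     <- data.frame(subject_pos = pos, feature = feature)

fit <- aov(feature ~ subject_pos, data = dat)
summary(fit)
TukeyHSD(fit)

# Stage 3: a simple classifier (a decision tree from the rpart package, chosen
# here purely for illustration) and its training accuracy.
library(rpart)
tree <- rpart(subject_pos ~ feature, data = dat, method = "class")
mean(predict(tree, dat, type = "class") == dat$subject_pos)
```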
APA, Harvard, Vancouver, ISO, and other styles
5

Mortarino, Cinzia. "An improved statistical test for historical linguistics." Statistical Methods and Applications 18, no. 2 (January 3, 2008): 193–204. http://dx.doi.org/10.1007/s10260-007-0085-1.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Shaikevich, Anatole. "Contrastive and Comparable Corpora: Quantitative Aspects." International Journal of Corpus Linguistics 6, no. 2 (December 31, 2001): 229–55. http://dx.doi.org/10.1075/ijcl.6.2.03sha.

Full text
Abstract:
This paper draws attention to the complexity of the problems that arise in statistical linguistics when various corpora have to be compared. These problems are discussed from the point of view of distributional statistical analysis of texts, that is, a set of formal procedures that rely on a minimum of preconceived linguistic knowledge. The terminological distinction between contrastive and comparable corpora is introduced.
APA, Harvard, Vancouver, ISO, and other styles
7

Kusz, Ewa. "Statistics for linguists revisited: the review of some basic statistical tools in linguistic research and data analysis." Studia Anglica Resoviensia 17 (2021): 31–46. http://dx.doi.org/10.15584/sar.2020.17.3.

Full text
Abstract:
The major aim of this paper is to emphasise the importance of statistical tools in linguistic research and to acquaint the reader with the basic statistical methods that can be used in linguistic studies. The article introduces five steps in data analysis that any researcher in applied linguistics can take in order to carry out relevant studies: choosing statistical programmes, eliciting data, selecting visual methods and applying normality tests, and choosing applicable parametric or nonparametric tests, all of which require appropriate planning, design, analysis and interpretation of data. The theoretical part is an interlude to the practical realisation of these five steps, based on part of a linguistic study conducted among students of English Philology. Its major purpose was to confirm (or refute) a positive correlation between participants' level of musical intelligence and their L2 pronunciation skills. The practical use of statistical methods enables readers to familiarise themselves with one pattern of statistical analysis in applied linguistics.
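A minimal sketch of the normality-check and correlation steps, with invented scores standing in for the study's musical-intelligence and L2 pronunciation measurements:

```r
# Invented scores for illustration only.
set.seed(7)
musical_iq    <- rnorm(40, mean = 100, sd = 15)
pronunciation <- 0.03 * musical_iq + rnorm(40, mean = 3, sd = 1)

# Normality tests (the "applying normality tests" step).
shapiro.test(musical_iq)
shapiro.test(pronunciation)

# Parametric correlation if both variables look normal; otherwise the
# non-parametric alternative.
cor.test(musical_iq, pronunciation, method = "pearson")
cor.test(musical_iq, pronunciation, method = "spearman")
```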
APA, Harvard, Vancouver, ISO, and other styles
8

Sidnyaev, Nikolai I., Juliia I. Butenko, and Vladislav V. Garazha. "STATISTICAL ASSESSMENT OF MEANINGLESS LETTER STRINGS ASSOCIATIVE POWER." Theoretical and Applied Linguistics, no. 4 (2019): 107–24. http://dx.doi.org/10.22250/2410-7190_2019_5_4_107_124.

Full text
Abstract:
The article proposes a method for compiling statistics on the most common trigrams in texts of different lengths and for comparing several short passages with the overall statistics; on the basis of the data obtained, a minimum adequate sample size is proposed. A hypothesis-verification procedure is proposed for testing the distribution laws against different criteria. The statistical processing of the results of the quantitative trigram analysis is presented, and metrological parameters are calculated to estimate the unknown parameters of the trigram distribution. In the quantitative analysis, not an infinitely large number of measurements but several independent measurements are made, that is, a sample of 5–6 observations is used. The conditions for choosing linguistic models are described, along with two types of linguistic-mathematical models: ideal and reproducing. The methodological functions of applied linguistics and the branches of mathematics used in linguistic theory and practice are reviewed. The hypothesis that the sample is drawn from a log-normal general population is tested statistically as a composite non-parametric hypothesis, using the Kolmogorov test.
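A rough sketch of the two ingredients described above – compiling trigram statistics and testing a log-normal hypothesis with the Kolmogorov criterion – using a toy string and simulated frequencies rather than the authors' material:

```r
# Letter trigrams and their frequencies in a toy string.
text     <- "quantitative analysis of trigram frequencies in texts of different lengths"
chars    <- strsplit(gsub("[^a-z]", "", tolower(text)), "")[[1]]
trigrams <- sapply(seq_len(length(chars) - 2),
                   function(i) paste(chars[i:(i + 2)], collapse = ""))
head(sort(table(trigrams), decreasing = TRUE))

# Kolmogorov-Smirnov test of the hypothesis that (simulated) trigram frequencies
# come from a log-normal population, parameters estimated from the sample.
set.seed(3)
freqs <- rlnorm(500, meanlog = 3, sdlog = 1)
ks.test(freqs, "plnorm", meanlog = mean(log(freqs)), sdlog = sd(log(freqs)))
```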
APA, Harvard, Vancouver, ISO, and other styles
9

Janda, Laura A. "Quantitative perspectives in Cognitive Linguistics." Review of Cognitive Linguistics 17, no. 1 (August 20, 2019): 7–28. http://dx.doi.org/10.1075/rcl.00024.jan.

Full text
Abstract:
As a usage-based approach to the study of language, cognitive linguistics is theoretically well poised to apply quantitative methods to the analysis of corpus and experimental data. In this article, I review the historical circumstances that led to the quantitative turn in cognitive linguistics and give an overview of statistical models used by cognitive linguists, including the chi-square test, Fisher test, binomial test, t-test, ANOVA, correlation, regression, classification and regression trees, naïve discriminative learning, cluster analysis, multi-dimensional scaling, and correspondence analysis. I stress the essential role of introspection in the design and interpretation of linguistic studies, and assess the pros and cons of the quantitative turn. I also make a case for open-access science and appropriate archiving of linguistic data.
APA, Harvard, Vancouver, ISO, and other styles
10

Silva-Corvalán, Carmen. "Analyzing Linguistic Variation: Statistical Models and Methods." Journal of Linguistic Anthropology 16, no. 2 (December 2006): 295–96. http://dx.doi.org/10.1525/jlin.2006.16.2.295.

Full text
APA, Harvard, Vancouver, ISO, and other styles
More sources

Dissertations / Theses on the topic "Statistical linguistics"

1

Onnis, Luca. "Statistical language learning." Thesis, University of Warwick, 2003. http://wrap.warwick.ac.uk/54811/.

Full text
Abstract:
Theoretical arguments based on the "poverty of the stimulus" have denied a priori the possibility that abstract linguistic representations can be learned inductively from exposure to the environment, given that the linguistic input available to the child is both underdetermined and degenerate. I reassess such learnability arguments by exploring a) the type and amount of statistical information implicitly available in the input in the form of distributional and phonological cues; b) psychologically plausible inductive mechanisms for constraining the search space; c) the nature of linguistic representations, algebraic or statistical. To do so I use three methodologies: experimental procedures, linguistic analyses based on large corpora of naturally occurring speech and text, and computational models implemented in computer simulations. In Chapters 1, 2, and 5, I argue that long-distance structural dependencies - traditionally hard to explain with simple distributional analyses based on n-gram statistics - can indeed be learned associatively provided the amount of intervening material is highly variable or invariant (the Variability effect). In Chapter 3, I show that simple associative mechanisms instantiated in Simple Recurrent Networks can replicate the experimental findings under the same conditions of variability. Chapter 4 presents successes and limits of such results across perceptual modalities (visual vs. auditory) and perceptual presentation (temporal vs. sequential), as well as the impact of long and short training procedures. In Chapter 5, I show that generalisation to abstract categories from stimuli framed in non-adjacent dependencies is also modulated by the Variability effect. In Chapter 6, I show that the putative separation of algebraic and statistical styles of computation based on successful speech segmentation versus unsuccessful generalisation experiments (as published in a recent Science paper) is premature and is the effect of a preference for phonological properties of the input. In Chapter 7, computer simulations of learning irregular constructions suggest that it is possible to learn from positive evidence alone, despite Gold's celebrated arguments on the unlearnability of natural languages. Evolutionary simulations in Chapter 8 show that irregularities in natural languages can emerge from full regularity and remain stable across generations of simulated agents. In Chapter 9, I conclude that the brain may be endowed with a powerful statistical device for detecting structure, generalising, segmenting speech, and recovering from overgeneralisations. The experimental and computational evidence gathered here suggests that statistical language learning is more powerful than heretofore acknowledged in the literature.
APA, Harvard, Vancouver, ISO, and other styles
2

Zhang, Lidan, and 张丽丹. "Exploiting linguistic knowledge for statistical natural language processing." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2011. http://hub.hku.hk/bib/B46506299.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

White, Christopher Wm. "Some Statistical Properties of Tonality, 1650-1900." Thesis, Yale University, 2014. http://pqdtopen.proquest.com/#viewpdf?dispub=3578472.

Full text
Abstract:

This dissertation investigates the statistical properties present within corpora of common practice music, involving a data set of more than 8,000 works spanning from 1650 to 1900, and focusing specifically on the properties of the chord progressions contained therein.

In the first chapter, methodologies concerning corpus analysis are presented and contrasted with text-based methodologies. It is argued that corpus analyses not only can show large-scale trends within data, but can empirically test and formalize traditional or inherited music theories, while also modeling corpora as a collection of discursive and communicative materials. Concerning the idea of corpus analysis as an analysis of discourse, literature concerning musical communication and learning is reviewed, and connections between corpus analysis and statistical learning are explored. After making this connection, we explore several problems with models of musical communication (e.g., music's composers and listeners likely use different cognitive models for their respective production and interpretation) and several implications of connecting corpora to cognitive models (e.g., a model's dependency on a particular historical situation).

Chapter 2 provides an overview of literature concerning computational musical analysis. The divide between top-down systems and bottom-up systems is discussed, and examples of each are reviewed. The chapter ends with an examination of more recent applications of information theory in music analysis.

Chapter 3 considers various ways corpora can be grouped as well as the implications those grouping techniques have on notions of musical style. It is hypothesized that the evolution of musical style can be modeled through the interaction of corpus statistics, chronological eras, and geographic contexts. This idea is tested by quantifying the probabilities of various composers' chord progressions, and cluster analyses are performed on these data. Various ways to divide and group corpora are considered, modeled, and tested.

In the fourth chapter, this dissertation investigates notions of harmonic vocabulary and syntax, hypothesizing that music involves syntactic regularity in much the same way as occurs in spoken languages. This investigation first probes this hypothesis through a corpus analysis of the Bach chorales, identifying potential syntactic/functional categories using a Hidden Markov Model. The analysis produces a three-function model as well as models with higher numbers of functions. In the end, the data suggest that music does indeed involve regularities, while also arguing for a definition of chord function that adds subtlety to models used by traditional music theory. A number of implications are considered, including the interaction of chord frequency and chord function, and the preeminence of triads in the resulting syntactic models.

Chapter 5 considers a particularly difficult problem of corpus analysis as it relates to musical vocabulary and syntax: the variegated and complex musical surface. One potential algorithm for vocabulary reduction is presented. This algorithm attempts to change each chord within an n-gram to the subset or superset that maximizes the probability of that n-gram occurring. When a corpus of common-practice music is processed using this algorithm, a standard tertian chord vocabulary results, along with a bigram chord syntax that adheres to our intuitions concerning standard chord function.

In the sixth chapter, this study probes the notion of musical key as it concerns communication, suggesting that if musical practice is constrained by its point in history and progressions of chords exhibit syntactic regularities, then one should be able to build a key-finding model that learns to identify key by observing some historically situated corpus. Such a model is presented, and is trained on the music of a variety of different historical periods. The model then analyzes two famous moments of musical ambiguity: the openings of Beethoven's Eroica and Wagner's prelude to Tristan und Isolde. The results confirm that different corpus-trained models produce subtly different behavior.

The dissertation ends by considering several general and summarizing issues, for instance the notion that there are many historically-situated tonal models within Western music history, and that the difference between listening and compositional models likely accounts for the gap between the complex statistics of the tonal tradition and traditional concepts in music theory.
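To make the corpus-statistical idea of Chapter 3 concrete, here is a small hedged sketch (invented Roman-numeral progressions, not the dissertation's 8,000-work data set) that turns each "composer" into a vector of chord-bigram probabilities and clusters the results:

```r
# Invented chord progressions for three hypothetical composers.
progressions <- list(
  composer1 = c("I", "IV", "V", "I", "ii", "V", "I", "vi", "IV", "V", "I"),
  composer2 = c("I", "V", "vi", "IV", "I", "V", "I", "ii", "V", "I"),
  composer3 = c("I", "vi", "ii", "V", "I", "IV", "V", "I", "V", "I")
)

# Chord bigrams per composer, as relative frequencies over a shared vocabulary.
bigrams <- lapply(progressions, function(p) paste(head(p, -1), tail(p, -1)))
vocab   <- sort(unique(unlist(bigrams)))
mat     <- t(sapply(bigrams,
                    function(b) table(factor(b, levels = vocab)) / length(b)))

# Hierarchical cluster analysis of composers by their progression statistics.
plot(hclust(dist(as.matrix(mat))))
```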

APA, Harvard, Vancouver, ISO, and other styles
4

Arad, Iris. "A quasi-statistical approach to automatic generation of linguistic knowledge." Thesis, University of Manchester, 1991. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.358872.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

McMahon, John George Gavin. "Statistical language processing based on self-organising word classification." Thesis, Queen's University Belfast, 1994. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.241417.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Clark, Stephen. "Class-based statistical models for lexical knowledge acquisition." Thesis, University of Sussex, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.341541.

Full text
Abstract:
This thesis is about the automatic acquisition of a particular kind of lexical knowledge, namely the knowledge of which noun senses can fill the argument slots of predicates. The knowledge is represented using probabilities, which agrees with the intuition that there are no absolute constraints on the arguments of predicates, but that the constraints are satisfied to a certain degree; thus the problem of knowledge acquisition becomes the problem of probability estimation from corpus data. The problem with defining a probability model in terms of senses is that this involves a huge number of parameters, which results in a sparse data problem. The proposal here is to define a probability model over senses in a semantic hierarchy, and exploit the fact that senses can be grouped into classes consisting of semantically similar senses. A novel class-based estimation technique is developed, together with a procedure that determines a suitable class for a sense (given a predicate and argument position). The problem of determining a suitable class can be thought of as finding a suitable level of generalisation in the hierarchy. The generalisation procedure uses a statistical test to locate areas consisting of semantically similar senses, and, as well as being used for probability estimation, is also employed as part of a re-estimation algorithm for estimating sense frequencies from incomplete data. The rest of the thesis considers how the lexical knowledge can be used to resolve structural ambiguities, and provides empirical evaluations. The estimation techniques are first integrated into a parse selection system, using a probabilistic dependency model to rank the alternative parses for a sentence. Then, a PP-attachment task is used to provide an evaluation which is more focussed on the class-based estimation technique, and, finally, a pseudo disambiguation task is used to compare the estimation technique with alternative approaches.
APA, Harvard, Vancouver, ISO, and other styles
7

Lakeland, Corrin. "Lexical approaches to backoff in statistical parsing." University of Otago. Department of Computer Science, 2006. http://adt.otago.ac.nz./public/adt-NZDU20060913.134736.

Full text
Abstract:
This thesis develops a new method for predicting probabilities in a statistical parser so that more sophisticated probabilistic grammars can be used. A statistical parser uses a probabilistic grammar derived from a training corpus of hand-parsed sentences. The grammar is represented as a set of constructions - in a simple case these might be context-free rules. The probability of each construction in the grammar is then estimated by counting its relative frequency in the corpus. A crucial problem when building a probabilistic grammar is to select an appropriate level of granularity for describing the constructions being learned. The more constructions we include in our grammar, the more sophisticated a model of the language we produce. However, if too many different constructions are included, then our corpus is unlikely to contain reliable information about the relative frequency of many constructions. In existing statistical parsers two main approaches have been taken to choosing an appropriate granularity. In a non-lexicalised parser constructions are specified as structures involving particular parts-of-speech, thereby abstracting over individual words. Thus, in the training corpus two syntactic structures involving the same parts-of-speech but different words would be treated as two instances of the same event. In a lexicalised grammar the assumption is that the individual words in a sentence carry information about its syntactic analysis over and above what is carried by its part-of-speech tags. Lexicalised grammars have the potential to provide extremely detailed syntactic analyses; however, Zipf's law makes it hard for such grammars to be learned. In this thesis, we propose a method for optimising the trade-off between informative and learnable constructions in statistical parsing. We implement a grammar which works at a level of granularity in between single words and parts-of-speech, by grouping words together using unsupervised clustering based on bigram statistics. We begin by implementing a statistical parser to serve as the basis for our experiments. The parser, based on that of Michael Collins (1999), contains a number of new features of general interest. We then implement a model of word clustering, which we believe is the first to deliver vector-based word representations for an arbitrarily large lexicon. Finally, we describe a series of experiments in which the statistical parser is trained using categories based on these word representations.
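A toy sketch of the clustering idea (a four-sentence invented corpus; the thesis itself builds vector representations for an arbitrarily large lexicon): each word is represented by the distribution of the words that follow it, and words with similar distributions are grouped.

```r
corpus <- c("the cat sat on the mat",
            "the dog sat on the rug",
            "a cat chased a dog",
            "the dog chased the cat")

# Bigram counts within each sentence: which word follows which.
tokens  <- strsplit(corpus, " ")
bigrams <- do.call(rbind, lapply(tokens, function(s)
  data.frame(w1 = head(s, -1), w2 = tail(s, -1))))

# Each word as a normalized distribution over its right-hand neighbours.
ctab <- table(bigrams$w1, bigrams$w2)
vecs <- as.matrix(prop.table(ctab, margin = 1))

# Words used in similar contexts end up in the same cluster.
plot(hclust(dist(vecs)))
```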
APA, Harvard, Vancouver, ISO, and other styles
8

Stymne, Sara. "Compound Processing for Phrase-Based Statistical Machine Translation." Licentiate thesis, Linköping : Department of Computer and Information Science, Linköpings universitet, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-51416.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Yamangil, Elif. "Rich Linguistic Structure from Large-Scale Web Data." Thesis, Harvard University, 2013. http://dissertations.umi.com/gsas.harvard:11162.

Full text
Abstract:
The past two decades have shown an unexpected effectiveness of Web-scale data in natural language processing. Even the simplest models, when paired with unprecedented amounts of unstructured and unlabeled Web data, have been shown to outperform sophisticated ones. It has been argued that the effectiveness of Web-scale data has undermined the necessity of sophisticated modeling or laborious data set curation. In this thesis, we argue for and illustrate an alternative view, that Web-scale data not only serves to improve the performance of simple models, but also can allow the use of qualitatively more sophisticated models that would not be deployable otherwise, leading to even further performance gains.
APA, Harvard, Vancouver, ISO, and other styles
10

Phillips, Aaron B. "Modeling Relevance in Statistical Machine Translation: Scoring Alignment, Context, and Annotations of Translation Instances." Research Showcase @ CMU, 2012. http://repository.cmu.edu/dissertations/134.

Full text
Abstract:
Machine translation has advanced considerably in recent years, primarily due to the availability of larger datasets. However, one cannot rely on the availability of copious, high-quality bilingual training data. In this work, we improve upon the state-of-the-art in machine translation with an instance-based model that scores each instance of translation in the corpus. A translation instance reflects a source and target correspondence at one specific location in the corpus. The significance of this approach is that our model is able to capture that some instances of translation are more relevant than others. We have implemented this approach in Cunei, a new platform for machine translation that permits the scoring of instance-specific features. Leveraging per-instance alignment features, we demonstrate that Cunei can outperform Moses, a widely-used machine translation system. We then expand on this baseline system in three principal directions, each of which shows further gains. First, we score the source context of a translation instance in order to favor those that are most similar to the input sentence. Second, we apply similar techniques to score the target context of a translation instance and favor those that are most similar to the target hypothesis. Third, we provide a mechanism to mark up the corpus with annotations (e.g. statistical word clustering, part-of-speech labels, and parse trees) and then exploit this information to create additional per-instance similarity features. Each of these techniques explicitly takes advantage of the fact that our approach scores each instance of translation on demand after the input sentence is provided and while the target hypothesis is being generated; similar extensions would be impossible or quite difficult in existing machine translation systems. Ultimately, this approach provides a more flexible framework for integration of novel features that adapts better to new data. In our experiments with German-English and Czech-English translation, the addition of instance-specific features consistently shows improvement.
APA, Harvard, Vancouver, ISO, and other styles
More sources

Books on the topic "Statistical linguistics"

1

Quantitative and statistical linguistics: Bibliography. Montréal, Qué: Infolingua, 1994.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
2

Statistical language learning. Cambridge, Mass: MIT Press, 1993.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
3

Těšitelová, Marie. Quantitative linguistics. Praha: Academia, 1992.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
4

Quantitative linguistics. Amsterdam: Benjamins Pub., 1992.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
5

Statistics in linguistics. Oxford [Oxfordshire]: B. Blackwell, 1985.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
6

Statistics for corpus linguistics. Edinburgh: Edinburgh University Press, 1998.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
7

Statistics in historical linguistics. Bochum: Studienverlag Brockmeyer, 1986.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
8

Issues in quantitative linguistics. Lüdenscheid [Germany]: RAM-Verlag, 2009.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
9

Statistical methods for speech recognition. Cambridge, Mass: MIT Press, 1997.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
10

Seton, Bregtje, ed. Essential statistics for applied linguistics. New York, NY: Palgrave Macmillan, 2012.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
More sources

Book chapters on the topic "Statistical linguistics"

1

Altmann, Eduardo G., and Martin Gerlach. "Statistical Laws in Linguistics." In Lecture Notes in Morphogenesis, 7–26. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-24403-7_2.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Lowie, Wander, and Bregtje Seton. "Statistical Logic." In Essential Statistics for Applied Linguistics, 39–49. London: Macmillan Education UK, 2013. http://dx.doi.org/10.1007/978-1-137-28490-7_4.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Kneser, Reinhard, and Hermann Ney. "Forming Word Classes by Statistical Clustering for Statistical Language Modelling." In Contributions to Quantitative Linguistics, 221–26. Dordrecht: Springer Netherlands, 1993. http://dx.doi.org/10.1007/978-94-011-1769-2_15.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Womser-Hacker, Christa. "Statistical Experiments on Computer Talk." In Contributions to Quantitative Linguistics, 251–63. Dordrecht: Springer Netherlands, 1993. http://dx.doi.org/10.1007/978-94-011-1769-2_18.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Clark, Stephen. "Statistical Parsing." In The Handbook of Computational Linguistics and Natural Language Processing, 333–63. Oxford, UK: Wiley-Blackwell, 2010. http://dx.doi.org/10.1002/9781444324044.ch13.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Desagulier, Guillaume. "Notions of Statistical Testing." In Corpus Linguistics and Statistics with R, 151–95. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-64572-8_8.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Ibrahim, Michael Nawar. "Statistical Arabic Grammar Analyzer." In Computational Linguistics and Intelligent Text Processing, 187–200. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-18111-0_15.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Essen, Ute, and Hermann Ney. "Statistical Language Modelling Using a Cache Memory." In Contributions to Quantitative Linguistics, 213–20. Dordrecht: Springer Netherlands, 1993. http://dx.doi.org/10.1007/978-94-011-1769-2_14.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Chelba, Ciprian. "Statistical Language Modeling." In The Handbook of Computational Linguistics and Natural Language Processing, 74–104. Oxford, UK: Wiley-Blackwell, 2010. http://dx.doi.org/10.1002/9781444324044.ch3.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Krámský, J. "On the Statistical Investigation of Explosives." In Prague Studies in Mathematical Linguistics, 85. Amsterdam: John Benjamins Publishing Company, 1987. http://dx.doi.org/10.1075/llsee.22.09kra.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Conference papers on the topic "Statistical linguistics"

1

Johnson, Mark. "How the statistical revolution changes (computational) linguistics." In the EACL 2009 Workshop. Morristown, NJ, USA: Association for Computational Linguistics, 2009. http://dx.doi.org/10.3115/1642038.1642041.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Reynar, Jeffrey C. "Statistical models for topic segmentation." In the 37th annual meeting of the Association for Computational Linguistics. Morristown, NJ, USA: Association for Computational Linguistics, 1999. http://dx.doi.org/10.3115/1034678.1034735.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Collins, Michael, Lance Ramshaw, Jan Hajič, and Christoph Tillmann. "A statistical parser for Czech." In the 37th annual meeting of the Association for Computational Linguistics. Morristown, NJ, USA: Association for Computational Linguistics, 1999. http://dx.doi.org/10.3115/1034678.1034754.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Vydrin, V. F., and J. J. Méric. "CORPUS-DRIVEN BAMBARA SPELLING DICTIONARY." In International Conference on Computational Linguistics and Intellectual Technologies "Dialogue". Russian State University for the Humanities, 2020. http://dx.doi.org/10.28995/2075-7182-2020-19-1180-1187.

Full text
Abstract:
A model for the development of a corpus-driven spelling dictionary for the Bambara language is described. First, a list of about 4,000 lexemes characterized by spelling variability is extracted from an electronic Bambara-French dictionary. At the next stage, a script is applied to determine the number of occurrences of each spelling variant in the Bambara Reference Corpus, separately for the entire corpus (more than 11 million words) and for its disambiguated subcorpus (about 1.5 million words). Statistics on the diversity of sources and authors are also obtained automatically. The statistical data are then sorted manually into two lists of lexemes: those whose standard spelling can be established statistically, and those requiring evaluation by expert linguists. Some difficult cases are discussed in the paper. At the final stage, a representative expert commission will discuss all those lexemes for which statistical data alone do not suffice to define a standard spelling variant, before taking a final decision on each. The resulting Bambara spelling dictionary will be published electronically and on paper.
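A hedged sketch of the sorting step (invented counts and an invented threshold; the real criteria and the final decisions belong to the expert commission described above): a lexeme's standard spelling is set automatically only when the corpus counts are clearly one-sided, otherwise it is flagged for expert review.

```r
# Invented spelling-variant counts for two hypothetical lexemes.
variants <- data.frame(
  lexeme   = c("lexeme1", "lexeme1", "lexeme2", "lexeme2"),
  spelling = c("form_a",  "form_b",  "form_c",  "form_d"),
  count    = c(950,        50,        210,       190)
)

decide <- function(counts) {
  share <- max(counts) / sum(counts)
  p     <- binom.test(max(counts), sum(counts), p = 0.5)$p.value
  if (share >= 0.9 && p < 0.05) "set spelling statistically" else "refer to expert commission"
}

tapply(variants$count, variants$lexeme, decide)
```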
APA, Harvard, Vancouver, ISO, and other styles
5

Bladier, Tatiana, Jakub Waszczuk, and Laura Kallmeyer. "Statistical Parsing of Tree Wrapping Grammars." In Proceedings of the 28th International Conference on Computational Linguistics. Stroudsburg, PA, USA: International Committee on Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.coling-main.595.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Mukherjee, Arjun. "Detecting Deceptive Opinion Spam using Linguistics, Behavioral and Statistical Modeling." In Tutorials. Stroudsburg, PA, USA: Association for Computational Linguistics, 2015. http://dx.doi.org/10.3115/v1/p15-5007.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Bernasconi, Beatrice, and Valentina Noseda. "Examining the role of linguistic context in aspectual competition: a statistical study." In Computational Linguistics and Intellectual Technologies. Russian State University for the Humanities, 2021. http://dx.doi.org/10.28995/2075-7182-2021-20-110-118.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Koehn, Philipp, Franz Josef Och, and Daniel Marcu. "Statistical phrase-based translation." In the 2003 Conference of the North American Chapter of the Association for Computational Linguistics. Morristown, NJ, USA: Association for Computational Linguistics, 2003. http://dx.doi.org/10.3115/1073445.1073462.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Nixon, Jessie S., Jacolien van Rij, Peggy Mok, Harald Baayen, and Yiya Chen. "Eye movements reflect acoustic cue informativity and statistical noise." In 6th Tutorial and Research Workshop on Experimental Linguistics. ExLing Society, 2019. http://dx.doi.org/10.36505/exling-2015/06/0013/000250.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Reports on the topic "Statistical linguistics"

1

Magerman, David, Mitchell Marcus, and Beatrice Santorini. Deducing Linguistic Structure from the Statistics of Large Corpora. Fort Belvoir, VA: Defense Technical Information Center, January 1990. http://dx.doi.org/10.21236/ada458686.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Moore, Robert C., and Michael H. Cohen. A Real-Time Spoken-Language System for Interactive Problem-Solving, Combining Linguistic and Statistical Technology for Improved Spoken Language Understanding. Fort Belvoir, VA: Defense Technical Information Center, September 1993. http://dx.doi.org/10.21236/ada270901.

Full text
APA, Harvard, Vancouver, ISO, and other styles