Academic literature on the topic 'Corpus-based data'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Corpus-based data.'

Journal articles on the topic "Corpus-based data"

1

Gu, Chonglong. "Corpus triangulation: combining data and methods in corpus-based translation studies." Translator 24, no. 1 (December 6, 2017): 107–10. http://dx.doi.org/10.1080/13556509.2018.1411639.

2

Gamper, Johann, and Oliviero Stock. "Corpus-based terminology." Terminology 5, no. 2 (December 31, 1998): 147–59. http://dx.doi.org/10.1075/term.5.2.05gam.

Abstract:
The manual acquisition of terminological material from the domain-specific text material is a very time-consuming task. Recent advances in text-processing research provide a basis for automating this task. Computer-assisted term acquisition improves both the quantity and the quality of terminological work. This paper gives a brief overview of this new approach in terminology acquisition. Three subtasks are distinguished: compilation of an electronic text corpus, extraction of terminological data, and management of terminological data. Each of the subtasks will be discussed in some detail by identifying the core problems as well as proposed solutions. As a concrete initiative in this emerging field, we present an ongoing research project at the European Academy Bolzano, which illustrates the importance of computer-assisted terminology acquisition and of the resulting steps that have been taken in recent times. The paper concludes with a summary of five selected papers which have been presented at a workshop on corpus-based terminology in Bolzano. The full papers are published in this volume and in volume 4(2) of this journal.
3

Wolk, Christoph, and Benedikt Szmrecsanyi. "Probabilistic corpus-based dialectometry." Journal of Linguistic Geography 6, no. 1 (April 2018): 56–75. http://dx.doi.org/10.1017/jlg.2018.6.

Abstract:
Researchers in dialectometry have begun to explore measurements based on fundamentally quantitative metrics, often sourced from dialect corpora, as an alternative to the traditional signals derived from dialect atlases. This change of data type amplifies an existing issue in the classical paradigm, namely that locations may vary in coverage and that this affects the distance measurements: pairs involving a location with lower coverage suffer from greater noise and therefore imprecision. We propose a method for increasing robustness using generalized additive modeling, a statistical technique that allows leveraging the spatial arrangement of the data. The technique is applied to data from the British English dialect corpus FRED; the results are evaluated regarding their interpretability and according to several quantitative metrics. We conclude that data availability is an influential covariate in corpus-based dialectometry and beyond, and recommend that researchers be aware of this issue and of methods to alleviate it.
4

Khamis, Noorli. "Corpus-based Data for Determining Specialised Language Features." International Journal of Advanced Trends in Computer Science and Engineering 9, no. 1 (February 15, 2020): 36–41. http://dx.doi.org/10.30534/ijatcse/2020/07912020.

5

Mikulová, Marie, Eduard Bejček, Veronika Kolářová, and Jarmila Panevová. "Subcategorization of Adverbial Meanings Based on Corpus Data." Journal of Linguistics/Jazykovedný časopis 68, no. 2 (December 1, 2017): 268–77. http://dx.doi.org/10.1515/jazcas-2017-0036.

Abstract:
We introduce a corpus-based description of selected adverbial meanings in Czech sentences. Its basic repertory has a long-lasting tradition in both scientific and school grammars. Before the corpus era, researchers had to rely on their own excerption; nowadays, syntax research has a vast material basis available in the form of electronic corpora. Using the case of spatial adverbials, we describe the methodology we used to acquire a detailed, comprehensive, well-arranged description of the meanings of adverbials, including a list of formal realizations with examples. Theoretical knowledge stemming from this work will lead to an improvement of the annotation of the meanings in the Prague Dependency Treebanks, which serve as the corpus sources for our research. The Prague Dependency Treebanks include data manually annotated on the layer of deep syntax and thus provide a large number of valuable examples on the basis of which the meanings of adverbials can be defined more accurately and subcategorized more precisely. Both theoretical and practical results will subsequently be used in NLP applications such as machine translation.
6

Bloothooft, Gerrit. "Corpus-based Name Standardization." History and Computing 6, no. 3 (October 1994): 153–67. http://dx.doi.org/10.3366/hac.1994.6.3.153.

Abstract:
A method is described to standardize nominal data on the basis of a combination of rules and a probabilistic similarity measure. Onomastic corpora are used to estimate the probability of spelling variations automatically. These corpora are also the basis for finding the most likely standard for a name not encountered before.
7

Szmrecsanyi, Benedikt, and Christoph Wolk. "Holistic corpus-based dialectology." Revista Brasileira de Linguística Aplicada 11, no. 2 (2011): 561–92. http://dx.doi.org/10.1590/s1984-63982011000200011.

Abstract:
This paper is concerned with sketching future directions for corpus-based dialectology. We advocate a holistic approach to the study of geographically conditioned linguistic variability, and we present a suitable methodology, 'corpus-based dialectometry', in exactly this spirit. Specifically, we argue that in order to live up to the potential of the corpus-based method, practitioners need to (i) abandon their exclusive focus on individual linguistic features in favor of the study of feature aggregates, (ii) draw on computationally advanced multivariate analysis techniques (such as multidimensional scaling, cluster analysis, and principal component analysis), and (iii) aid interpretation of empirical results by marshalling state-of-the-art data visualization techniques. To exemplify this line of analysis, we present a case study which explores joint frequency variability of 57 morphosyntax features in 34 dialects all over Great Britain.
8

Escudero-Mancebo, David, and Valentín Cardeñoso-Payo. "Applying data mining techniques to corpus based prosodic modeling." Speech Communication 49, no. 3 (March 2007): 213–29. http://dx.doi.org/10.1016/j.specom.2007.01.008.

9

Lyddon, Paul. "Discovering Language Properties through Corpus-Based Dictionary Data Analysis." Vocabulary Learning and Instruction 6, no. 2 (2017): 61–70. http://dx.doi.org/10.7820/vli.v06.2.lyddon.

Abstract:
To reveal underlying patterns in real language use, linguists have increasingly come to rely on corpus analyses, involving the evaluation of statistical frequencies in generally sizable bodies of natural linguistic data. However, accessing and analyzing large samples of raw language is neither always practical nor even truly necessary, especially in cases pertaining to structural characteristics. In fact, the requisite data can oftentimes be gleaned from a state-of-the-art (i.e., corpus-based) dictionary. Moreover, given the widespread availability of easily searchable electronic dictionaries nowadays, almost any language teacher or learner can use one to answer a number of these types of queries. This paper illustrates this claim with a step-by-step analysis of corpus-based dictionary data for the purpose of formulating the sound-symbol relations in English words with vowels preceding –gh.
10

de Monnink, Inge. "Combining Corpus and Experimental Data." International Journal of Corpus Linguistics 4, no. 1 (August 13, 1999): 77–111. http://dx.doi.org/10.1075/ijcl.4.1.05mon.

Abstract:
In this article I argue that, from a methodological point of view, descriptive studies improve considerably if they use a multi-method approach to the data, more specifically, if they use a combination of corpus data and experimental data. In the modern conception of corpus linguistics, intuitive data play an important role. The linguist formulates research hypotheses based on his or her intuitive knowledge. These hypotheses are then tested on the corpus data. I argue that a sound descriptive study should not end with simply stating the results from the corpus study. Instead, the corpus data have to be supplemented. An appropriate way to supplement corpus data is through the use of elicitation techniques. I illustrate the multi-method approach on a case study of floating postmodification in the English noun phrase.

Dissertations / Theses on the topic "Corpus-based data"

1

Nolli, Carla Fernanda. "Data-driven learning and corpus-based approaches in language education." Florianópolis, SC, 2006. http://repositorio.ufsc.br/xmlui/handle/123456789/88465.

Abstract:
Master's dissertation, Universidade Federal de Santa Catarina, Centro de Comunicação e Expressão, Programa de Pós-Graduação em Letras/Inglês e Literatura Correspondente.
This study focuses on the analysis of conditional sentence examples found in teaching materials (textbooks and grammar books) and compares them with a large corpus in order to verify their frequency and authenticity. The comparison was carried out with the help of corpus analysis software, which generated a concordance list of the word if. These tokens were analyzed and classified in order to distinguish the three types of conditional sentences studied in this thesis. One of the purposes of this research is also to shed light on an approach that still remains largely unexplored in Brazil, namely Data-Driven Learning (DDL), which explores teaching and learning through corpus linguistics.
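The concordance step mentioned in this abstract can be illustrated with a minimal keyword-in-context (KWIC) sketch. This is not the software used in the thesis, which the abstract does not name; the function `concordance`, the `width` parameter, and the sample sentences are all assumptions made for illustration.

```python
import re

def concordance(text, keyword, width=30):
    """Return keyword-in-context lines for every whole-word match of `keyword`."""
    lines = []
    # \b restricts matches to whole words, so "if" does not match inside "gift".
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the keyword forms a vertical column.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right:<{width}}")
    return lines

# Invented sample text, one sentence per conditional pattern.
sample = ("If it rains, we will stay home. I would go if I had time. "
          "If she had called, I would have answered.")

for line in concordance(sample, "if"):
    print(line)
```

Each output line centres one occurrence of the keyword, which is the form in which tokens would then be inspected and classified by hand.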
2

Adolphs, Svenja. "Linking lexico-grammar and speech acts : a corpus-based approach." Thesis, University of Nottingham, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.391412.

3

Marchewka, Katarzyna M. "Gender agreement in Polish : a study based on elicitation and corpus data." Thesis, University of Surrey, 2016. http://epubs.surrey.ac.uk/809946/.

Abstract:
This thesis explores the role of gender and explains how gender agreement operates in Polish and presents the possible agreement that particular gender can provide when conjoined with a noun of different gender or with a hybrid noun. The linguistic representation of specific gender is connected not only with the morphological shape but also the inherent semantics of a given noun. In the Polish language a great deal of information about possible masculine-personal or masculine-non-personal agreement is provided by the value ‘person’ for a given noun, and recent research on gender agreement in Polish has shown that some of the proposed rules for gender resolution and agreement between subject and predicate do not describe all the agreement possibilities. Likewise, with regard to hybrid nouns in Polish little research has been done on their agreement. This thesis thus examines the interaction of nouns of different genders, their values and their verbal agreement. Drawing mostly on primary questionnaire work with native speakers of Polish, I argue that semantics has a predominant impact on gender agreement. I support my claim by presenting data from the Polish corpus. The thesis provides the most comprehensive description of Polish gender agreement in sentences with conjoined noun phrases and agreement with hybrid nouns to date, by investigating their morphological status, their semantic restrictions, and their use in discourse. Building on previous analyses of agreement possibilities in the Polish language, I argue for an additional rule in gender resolution. I provide a description of various types of hybrid nouns in Polish and check the impact of semantic agreement versus formal agreement on Polish hybrid nouns using Corbett’s Agreement Hierarchy.
4

Wang, Lixum. "The use of parallel texts in language learning : computer software and teaching materials for English and Chinese." Thesis, University of Birmingham, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.368990.

5

Zhang, Min, and 張珉. "Using corpus data in a MOODLE-based self-learning course : teaching education students to 'cite like an academic'." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2015. http://hdl.handle.net/10722/211141.

Abstract:
Citation, an essential feature of academic writing, is a challenging area for second language (L2) student writers due to its linguistic and functional complexities. In an effort to address this challenge, I report the development and evaluation of a MOODLE-based self-access workshop on citation learning, Cite Like an Academic (CLA). CLA aims to enhance the understanding of citation use among postgraduate students in education. It employs a design-based research approach characterized by three iterative phases involving needs analysis, pedagogical design, and evaluation of an online learning artefact for increased understanding to guide further improvements (Phillips, McNaught, & Kennedy, 2012). For the first-phase needs analysis research, I investigated the rhetorical functions of citations across various research article (RA) sections and their linguistic features. To this end, genre and corpus approaches were integrated to compare an expert corpus of research articles (the RAC) and a student corpus of master’s in education (MEd) dissertations (the MDC). The findings indicate that (1) all the RA Introduction-Methods-Results-Discussion (IMRD) sections contained citations fulfilling a wide range of rhetorical functions, and (2) RAC writers differed from MDC writers in their preference for citation types across sections, citation density across sections, reporting verb (RV) categories, RV lexico-grammatical patterns, and RV rhetorical functions. Alongside this investigation on citation use, I interviewed postgraduate students and communicated via email with supervisors to understand the needs of potential workshop participants. The second phase, the CLA pedagogy design, was guided by the adapted critical pragmatic approach (Harwood & Hadley, 2004) with adaption. Following the pragmatic approach, instruction materials were informed by the needs analysis research findings. 
The critical approach involved the participants in trying out genre analysis and corpus analysis of RAs they selected for citation learning. The third phase was the evaluation of the workshop through a user walk-through trial and three rounds of implementations. Various types of data were collected from 41 participants, including personal communications, MOODLE records of forum discussions and log reports, participants’ writing, interviews, and pre-CLA and post-CLA questionnaires. I report the findings on the effects of genre-based materials on thesis revision, as well as students’ gains and difficulties in carrying out genre analysis and building and using their I-Corpus for citation learning. The findings indicate that content familiarity and peer interaction contributed to learners’ in-depth genre analysis; however, Move interpretation needed attention in students’ learning of genre analysis. Genre familiarity and completed writing ready for revision facilitated learners’ direct use of genre-based materials in writing, and building an individual corpus of RA part genres raised learners’ awareness of the variations in RA macro-structures. In addition, the findings demonstrate that students needed training on formulating search terms for citation searches and using corpus analytic software for corpus data observation and interpretation. In particular, students should be reminded of the disciplinary context and textual context when reusing language data from a corpus in writing revision. Finally, I provide suggestions for how to improve and adapt the workshop to support students’ citation learning and accommodate their different learning needs.
Education
Doctoral
Doctor of Philosophy
6

Tsiros, Augoustinos. "A multidimensional sketching interface for visual interaction with corpus-based concatenative sound synthesis." Thesis, Edinburgh Napier University, 2016. http://researchrepository.napier.ac.uk/Output/463438.

Abstract:
The present research sought to investigate the correspondence between auditory and visual feature dimensions and to utilise this knowledge in order to inform the design of audio-visual mappings for visual control of sound synthesis. The first stage of the research involved the design and implementation of Morpheme, a novel interface for interaction with corpus-based concatenative synthesis. Morpheme uses sketching as a model for interaction between the user and the computer. The purpose of the system is to facilitate the expression of sound design ideas by describing the qualities of the sound to be synthesised in visual terms, using a set of perceptually meaningful audio-visual feature associations. The second stage of the research involved the preparation of two multidimensional mappings for the association between auditory and visual dimensions. The third stage of this research involved the evaluation of the Audio-Visual (A/V) mappings and of Morpheme's user interface. The evaluation comprised two controlled experiments, an online study and a user study. Our findings suggest that the strength of the perceived correspondence between the A/V associations prevails over the timbre characteristics of the sounds used to render the complementary polar features. Hence, the empirical evidence gathered by previous research is generalizable/ applicable to different contexts and the overall dimensionality of the sound used to render should not have a very significant effect on the comprehensibility and usability of an A/V mapping. However, the findings of the present research also show that there is a non-linear interaction between the harmonicity of the corpus and the perceived correspondence of the audio-visual associations. For example, strongly correlated cross-modal cues such as size-loudness or vertical position-pitch are affected less by the harmonicity of the audio corpus in comparison to weaker correlated dimensions (e.g. texture granularity-sound dissonance). 
No significant differences were revealed as a result of musical/audio training. The third study consisted of an evaluation of Morpheme's user interface in which participants were asked to use the system to design a sound for given video footage. The usability of the system was found to be satisfactory. An interface for drawing visual queries was developed for high-level control of the retrieval and signal processing algorithms of concatenative sound synthesis. This thesis elaborates on previous research findings and proposes two methods for empirically driven validation of audio-visual mappings for sound synthesis. These methods could be applied to a wide range of contexts in order to inform the design of cognitively useful multi-modal interfaces and the representation and rendering of multimodal data. Moreover, this research contributes to the broader understanding of multimodal perception by gathering empirical evidence about the correspondence between auditory and visual feature dimensions and by investigating which factors affect the perceived congruency between aural and visual structures.
7

Vieira, Nataliya Godinho Soares. "Training and discovering corpus-based data driven exercices in english teaching (L2/FL) to native speakers of portuguese (L1)." Master's thesis, Faculdade de Ciências Sociais e Humanas, Universidade Nova de Lisboa, 2012. http://hdl.handle.net/10362/7422.

Abstract:
Project submitted as part requirement for the degree of Masters in English teaching.
Given the rapid development of new technologies and their use in foreign language teaching, Corpus Linguistics offers new tools and materials that enrich second language learning. This project presents a framework of theoretical principles related to online corpora and proposes examples of training and discovering corpus-based data-driven exercises, which are an original contribution to the teaching and learning of English (L2) for native speakers of Portuguese (L1). The data-driven exercises, based on concordances extracted from corpora, provide discovery teaching and involve students in "discovery learning", thereby enriching the personal development of teachers and students. The pedagogical aims of this project are manifold and relate to the use of the data-driven learning (DDL) approach as well as the application of ICT-based resources in foreign language teaching and learning.
8

Garcia, William Danilo. "Fanfictions, linguística de corpus e aprendizagem direcionada por dados : tarefas de produção escrita com foco no uso autêntico de língua e atividades que visam à autonomia dos alunos de letras em analisar preposições /." São José do Rio Preto, 2020. http://hdl.handle.net/11449/192699.

Abstract:
Advisor: Paula Tavares Pinto
Abstract: Although the relation between Corpus Linguistics and Language Teaching has been emphasized even before the advent of computers, it has been highlighted around the 90s. This was the moment when research on learner corpora and Data-Driven Learning was focused. Having said that, this study aimed to compile four learner corpora based on the authentic use of the language. This was done in order to develop data-driven teaching activities that could promote, among the students, an autonomous profile of linguistic investigation (more precisely about the prepositions with, in, on, at, for and to). Concerning the existing literature, we highlight the works of Prabhu (1987), Skehan (1996), Willis (1996), Nunan (2004) and Ellis (2006) about Task-Based Language Teaching, and Jenkins (2012) and Neves (2014) about fanfictions. In relation to Corpus Linguistics, this study is based on Sinclair (1991), Berber Sardinha (2000) and Viana (2011). Granger (1998, 2012, 2013) is referenced to define learner corpora, and Johns (1991, 1994), Berber Sardinha (2011) and Boulton (2010) to discuss Data-Driven Learning. The methodological approach involved the collection of the compositions from Language Teaching undergraduate students who developed a writing task in which they had to write a fanfiction. These texts composed two learner corpora, which were analyzed with the AntConc tool (ANTHONY, 2018) with the purpose of observing the occurrence of prepositions in English and whether they were accurately ...
Master's degree
9

Gentilini, Livia. "La terminologia della sicurezza informatica nella banca dati FranceTerme: un'analisi corpus-based." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/17696/.

Abstract:
This thesis investigates the spread of anglicisms in French cybersecurity terminology. The aim is to assess the work of the official French bodies for language protection by comparing the percentage of use in French of selected anglicisms with that of the corresponding official French equivalents proposed by the Dispositif d’enrichissement de la langue française. The first chapter gives an overview of specialised languages, with particular attention to terminological variation, describes the characteristics of computing language in English and French, and defines the concept of cybersecurity. The second chapter addresses linguistic interference, moving from the concept of neology to those of borrowing and calque. The third chapter deals with language policy in France: the evolution of the official bodies, the political approach to the spread of foreign elements in the language, and the specific laws enacted on the matter. The focus is on the main bodies belonging to the Dispositif d’enrichissement de la langue française, whose workings and objectives are presented. The fourth chapter focuses on the research methodology. A set of terminological records related to the cybersecurity domain was selected from the ministerial database FranceTerme. The web corpus Araneum Francogallicum Maius was used to find the occurrences of these terms, which were then compared quantitatively. The fifth chapter concentrates on the analysis of the materials: after listing the total occurrences found in the corpus, it compares the absolute frequencies and percentage relative frequencies of the official French equivalents and the corresponding loanwords, in order to identify possible trends.
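The frequency comparison described for the fifth chapter can be sketched as follows. This is only one plausible reading of "percentage relative frequencies", namely each form's share of the combined occurrences of an anglicism and its official equivalent; the abstract does not give the exact formula. The function `usage_shares`, the placeholder term names, and the counts are all hypothetical and are not taken from the thesis.

```python
def usage_shares(pairs):
    """For each (anglicism, official equivalent) pair, return each form's
    percentage share of the pair's combined occurrences."""
    shares = {}
    for (loan, loan_n), (official, official_n) in pairs:
        total = loan_n + official_n
        if total == 0:
            continue  # neither form attested, so no share can be computed
        shares[(loan, official)] = (
            round(100 * loan_n / total, 1),
            round(100 * official_n / total, 1),
        )
    return shares

# Invented absolute frequencies for illustration only.
pairs = [
    (("anglicism_a", 750), ("official_a", 250)),
    (("anglicism_b", 40), ("official_b", 160)),
]

for (loan, official), (loan_pct, official_pct) in usage_shares(pairs).items():
    print(f"{loan}: {loan_pct}%  vs  {official}: {official_pct}%")
```

Working with shares rather than raw counts keeps pairs of very different overall frequency comparable, which matters when looking for trends across many term records.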
10

Ghisi, Daniele. "Music across music : towards a corpus-based, interactive computer-aided composition." Thesis, Paris 6, 2017. http://www.theses.fr/2017PA066561/document.

Abstract:
The reworking of existing music in order to build new music is a quintessential characteristic of the Western musical tradition. This thesis proposes and discusses my personal approach to the subject: the borrowing of music fragments from large-scale corpora (containing audio samples as well as symbolic scores) in order to build a low-level, descriptor-based palette of grains. Parameters are handled via digital hybrid scores, in order to equip corpus-based composition with the control of notational practices. This thesis also introduces the dada library, providing Max with the ability to organize, select and generate musical content via a set of graphical interfaces manifesting an exploratory approach towards music composition. Its modules address a range of scenarios, including, but not limited to, database visualization, score segmentation and analysis, concatenative synthesis, music generation via physical or geometrical modelling, wave terrain synthesis, graph exploration, cellular automata, swarm intelligence, and videogames. The library is open-source and it fosters a performative approach to computer-aided composition. Finally, this thesis addresses the issue of whether classical representation of music, disentangled in the standard set of traditional parameters, is optimal. Two possible alternatives to orthogonal decompositions are presented: grain-based score representations, inheriting techniques from corpus-based composition, and unsupervised machine learning models, providing entangled, 'agnostic' representations of music. The thesis also details my first experience of collaborative writing within the /nu/thing collective.

Books on the topic "Corpus-based data"

1

Antonymy: A corpus based perspective. London: Routledge, 2002.

2

Müller-Landmann, Sonja. Corpus-based parse pruning: Applying empirical data to symbolic knowledge. Saarbrücken: DFKI, 2000.

3

Corpus-based studies of lesser-described languages: The CorpAfroAs corpus of spoken AfroAsiatic languages. Amsterdam: John Benjamins Publishing Company, 2015.

4

Studies in authorship recognition: A corpus-based approach. Frankfurt am Main: P. Lang, 1999.

5

Corpus-based analyses of the problem-solution pattern: A phraseological approach. Amsterdam: John Benjamins Pub., 2008.

6

Postmodifying clauses in the English noun phrase: A corpus-based study. Amsterdam: Rodopi, 1989.

7

Basciano, Bianca, Franco Gatti, and Anna Morbiato. Corpus-Based Research on Chinese Language and Linguistics. Venice: Fondazione Università Ca’ Foscari, 2020. http://dx.doi.org/10.30687/978-88-6969-406-6.

Abstract:
This volume collects papers presenting corpus-based research on Chinese language and linguistics, from both a synchronic and a diachronic perspective. The contributions cover different fields of linguistics, including syntax and pragmatics, semantics, morphology and the lexicon, sociolinguistics, and corpus building. There is now considerable emphasis on the reliability of linguistic data: the studies presented here are all grounded in the tenet that corpora, intended as collections of naturally occurring texts produced by a variety of speakers/writers, provide a more robust, statistically significant foundation for linguistic analysis. The volume explores not only the potential of using corpora as tools allowing access to authentic language material, but also the challenges involved in corpus interrogation, analysis, and building.
8

Wohlgenannt, Gerhard. Learning ontology relations by combining corpus-based techniques and reasoning on data from semantic web sources. Frankfurt am Main: P. Lang, 2011.

9

Wohlgenannt, Gerhard. Learning Ontology Relations by Combining Corpus-Based Techniques and Reasoning on Data from Semantic Web Sources. Bern: Peter Lang International Academic Publishers, 2018.

10

Hundt, Marianne. English mediopassive constructions: A cognitive, corpus-based study of their origin, spread, and current status. Amsterdam: Rodopi, 2004.


Book chapters on the topic "Corpus-based data"

1

Gries, Stefan Th. "Corpus data in usage-based linguistics." In Human Cognitive Processing, 237–56. Amsterdam: John Benjamins Publishing Company, 2011. http://dx.doi.org/10.1075/hcp.32.15gri.

2

Viereck, Wolfgang. "The Atlas Linguarum Europae: A diachronic analysis of its data." In Corpus-based Analysis and Diachronic Linguistics, 21–36. Amsterdam: John Benjamins Publishing Company, 2011. http://dx.doi.org/10.1075/tufs.3.04vie.

3

Yaneva, Victoria, Shiva Taslimipoor, Omid Rohanian, and Le An Ha. "Cognitive Processing of Multiword Expressions in Native and Non-native Speakers of English: Evidence from Gaze Data." In Computational and Corpus-Based Phraseology, 363–79. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-69805-2_26.

4

Pooley, Tim. "The uneasy interface: Methodological issues in using data from traditional and urban dialectology in (re-)constructing sociolinguistic history." In Corpus-Based Perspectives in Linguistics, 169–89. Amsterdam: John Benjamins Publishing Company, 2007. http://dx.doi.org/10.1075/ubli.6.13poo.

5

Yoshitomi, Asako. "Testing the primacy of aspect and reverse order hypothesis in Japanese returnees: Towards constructing a corpus of second language attrition data." In Corpus-Based Perspectives in Linguistics, 371–89. Amsterdam: John Benjamins Publishing Company, 2007. http://dx.doi.org/10.1075/ubli.6.25yos.

6

Wang, Xingfu, Zhongfu Wu, Yan Li, Qian Huang, and Jinglu Hui. "Corpus-Based Analysis of the Co-occurrence of Chinese Antonym Pairs." In Advanced Data Mining and Applications, 500–507. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-17313-4_50.

7

Maes, Francis, Ludovic Denoyer, and Patrick Gallinari. "Corpus-Based Structure Mapping of XML Document Corpora: A Reinforcement Learning Based Model." In Modeling, Learning, and Processing of Text Technological Data Structures, 249–66. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-22613-7_13.

8

Oliveira, Francisco, Fai Wong, Anna Ho, Yiping Li, and Mingchui Dong. "Overcoming Data Sparseness Problem in Statistical Corpus Based Sense Disambiguation." In Computational Methods in Engineering & Science, 314. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006. http://dx.doi.org/10.1007/978-3-540-48260-4_160.

9

Pimentel, Braulio Andres Soncco, and Roxana L. Q. Portugal. "Fake News in Spanish: Towards the Building of a Corpus Based on Twitter." In Information Management and Big Data, 333–39. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-46140-9_32.

10

Sun, Jiawen. "A Corpus-Based Multi-dimensional Study of Tourism English Register Features." In Lecture Notes on Data Engineering and Communications Technologies, 262–68. Singapore: Springer Singapore, 2021. http://dx.doi.org/10.1007/978-981-16-5854-9_33.


Conference papers on the topic "Corpus-based data"

1

Tian, Xueqin. "Foreign Language Writing Based on Corpus-based Data-driven." In 4th International Conference on Management Science, Education Technology, Arts, Social Science and Economics 2016. Paris, France: Atlantis Press, 2016. http://dx.doi.org/10.2991/msetasse-16.2016.110.

2

Wawer, Aleksander, and Dominika Rogozinska. "How Much Supervision? Corpus-Based Lexeme Sentiment Estimation." In 2012 IEEE 12th International Conference on Data Mining Workshops. IEEE, 2012. http://dx.doi.org/10.1109/icdmw.2012.119.

3

Yang, Yanyu. "A Corpus-Based Study on Oral Language Education of Police English." In DSDE '21: 2021 4th International Conference on Data Storage and Data Engineering. New York, NY, USA: ACM, 2021. http://dx.doi.org/10.1145/3456146.3456164.

4

Wu, Yaguang, Haichun Sun, and Chungang Yan. "An event timeline extraction method based on news corpus." In 2017 IEEE 2nd International Conference on Big Data Analysis (ICBDA). IEEE, 2017. http://dx.doi.org/10.1109/icbda.2017.8078725.

5

Larsen-Walker, Melissa. "How Does Data Driven Learning Affect the Production of Multi-Word Sequences in EAP Students’ Academic Writing?" In EUROPHRAS 2017 - Computational and Corpus-based Phraseology: Recent Advances and Interdisciplinary Approaches. Editions Tradulex, Geneva, Switzerland, 2017. http://dx.doi.org/10.26615/978-2-9701095-2-5_010.

6

Guo, Siqiao, Xianbo Li, and Zhixin Ma. "Association Rule Mining of Anaphora Based on ParCorFull Corpus." In ICCDE 2020: 2020 The 6th International Conference on Computing and Data Engineering. New York, NY, USA: ACM, 2020. http://dx.doi.org/10.1145/3379247.3379277.

7

Liu, Xuanjun, Zheyu Zhu, Tengyan Fu, Jiaxuan Chen, and Ying Jiang. "Corpus Annotation System Based on HanLP Chinese Word Segmentation." In CONF-CDS 2021: The 2nd International Conference on Computing and Data Science. New York, NY, USA: ACM, 2021. http://dx.doi.org/10.1145/3448734.3450845.

8

Qingzhi, Sun, Du Qingfeng, Zhang Chenxi, and Li Jun. "Chinese News Event Corpus Construction Method Based on Syntax Tree." In ICBDT 2020: 2020 3rd International Conference on Big Data Technologies. New York, NY, USA: ACM, 2020. http://dx.doi.org/10.1145/3422713.3422741.

9

Zhu, Ying, and Eric Friginal. "Interactive Visual Text Analysis for Corpus-Based Language Learning." In 2015 IEEE First International Conference on Big Data Computing Service and Applications (BigDataService). IEEE, 2015. http://dx.doi.org/10.1109/bigdataservice.2015.55.

10

Rybka, Roman, Alexander Sboev, Ivan Moloshnikov, and Dmitry Gudovskikh. "Morpho-syntactic parsing based on neural networks and corpus data." In 2015 Artificial Intelligence and Natural Language and Information Extraction, Social Media and Web Search FRUCT Conference (AINL-ISMW FRUCT). IEEE, 2015. http://dx.doi.org/10.1109/ainl-ismw-fruct.2015.7382975.

