Dissertations / Theses on the topic 'Évaluation de la traduction automatique'
Rémillard, Judith. "Utilité et utilisation de la traduction automatique dans l’environnement de traduction : une évaluation axée sur les traducteurs professionnels." Thesis, Université d'Ottawa / University of Ottawa, 2018. http://hdl.handle.net/10393/37784.
Raybaud, Sylvain. "De l'utilisation de mesures de confiance en traduction automatique : évaluation, post-édition et application à la traduction de la parole." Thesis, Université de Lorraine, 2012. http://www.theses.fr/2012LORR0260/document.
In this thesis I deal with the issues of confidence estimation for machine translation and statistical machine translation of large-vocabulary spontaneous speech. I first formalize the problem of confidence estimation. I present experiments under the paradigm of multivariate classification and regression, review the performances yielded by different techniques, present the results obtained during the WMT2012 international evaluation campaign, and give the details of an application to post-editing of automatically translated documents. I then deal with the issue of speech translation. After going into the details of what makes it a very specific and particularly challenging problem, I present original methods to partially solve it, using phonetic confusion networks, confidence estimation techniques and speech segmentation. I show that the prototype I developed yields performances comparable to the state of the art for systems of more standard design.
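The word-level confidence estimation task described in this abstract — deciding, for each word of an MT output, whether it is likely correct, framed as classification or regression over multivariate features — can be illustrated with a small sketch. The features, the toy data and the flagging threshold below are illustrative assumptions, not those used in the thesis.

```python
# Minimal sketch of word-level confidence estimation as binary classification.
# Feature names and the toy data are illustrative assumptions, not the
# features or corpora used in the thesis.
from sklearn.linear_model import LogisticRegression

# Each target word is described by a small feature vector:
# [decoder posterior, target LM probability, word length, is_OOV]
X_train = [
    [0.92, 0.30, 5, 0],
    [0.15, 0.02, 9, 1],
    [0.80, 0.25, 4, 0],
    [0.30, 0.01, 11, 1],
    [0.95, 0.40, 3, 0],
    [0.20, 0.05, 7, 1],
]
# 1 = word judged correct against the post-edited reference, 0 = incorrect
y_train = [1, 0, 1, 0, 1, 0]

clf = LogisticRegression().fit(X_train, y_train)

# Score a new hypothesis word and flag it for post-editing if confidence is low.
p_correct = clf.predict_proba([[0.55, 0.10, 6, 0]])[0][1]
print(f"estimated confidence: {p_correct:.2f}",
      "-> flag for post-editing" if p_correct < 0.5 else "-> keep")
```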
Nikoulina, Vassilina. "Modèle de traduction statistique à fragments enrichi par la syntaxe." Phd thesis, Université de Grenoble, 2010. http://tel.archives-ouvertes.fr/tel-00996317.
Gouirand, Olivier. "Méthodologie de l'évaluation de la traduction assistée par ordinateur : application au traducteur professionnel en français-anglais et vice versa." Toulouse 2, 2005. http://www.theses.fr/2005TOU20069.
This research aims at laying the foundations of an essentially linguistic evaluation of computer-aided translation, limited to its use by independent translators on French-English language pairs. Beyond defining the specificities of the former — which had never been done before — and carrying out a critical and systematic study of the numerous approaches to evaluation, an experiment was conducted on corpora, confirming the tight dependency of semantics and syntax, the latter matching a categorial distribution close to the law of anomalous numbers. The invariants obtained statistically were then compared to syntactic and conceptual primitives in the continuity of the connexionist paradigm, with a view to forming a dynamic analysis system for linguistic quality in machine translation, the aim of which is to help it break the semantic barrier.
Bawden, Rachel. "Going beyond the sentence : Contextual Machine Translation of Dialogue." Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLS524/document.
While huge progress has been made in machine translation (MT) in recent years, the majority of MT systems still rely on the assumption that sentences can be translated in isolation. The result is that these MT models only have access to context within the current sentence; context from other sentences in the same text and information relevant to the scenario in which they are produced remain out of reach. The aim of contextual MT is to overcome this limitation by providing ways of integrating extra-sentential context into the translation process. Context, concerning the other sentences in the text (linguistic context) and the scenario in which the text is produced (extra-linguistic context), is important for a variety of cases, such as discourse-level and other referential phenomena. Successfully taking context into account in translation is challenging. Evaluating such strategies on their capacity to exploit context is also a challenge, standard evaluation metrics being inadequate and even misleading when it comes to assessing such improvement in contextual MT. In this thesis, we propose a range of strategies to integrate both extra-linguistic and linguistic context into the translation process. We accompany our experiments with specifically designed evaluation methods, including new test sets and corpora. Our contextual strategies include pre-processing strategies designed to disambiguate the data on which MT models are trained, post-processing strategies to integrate context by post-editing MT outputs and strategies in which context is exploited during translation proper. We cover a range of different context-dependent phenomena, including anaphoric pronoun translation, lexical disambiguation, lexical cohesion and adaptation to properties of the scenario such as speaker gender and age. Our experiments for both phrase-based statistical MT and neural MT are applied in particular to the translation of English to French and focus specifically on the translation of informal written dialogues.
Shah, Ritesh. "SUFT-1, un système pour aider à comprendre les tweets spontanés multilingues et à commutation de code en langues étrangères : expérimentation et évaluation sur les tweets indiens et japonais." Thesis, Université Grenoble Alpes (ComUE), 2017. http://www.theses.fr/2017GREAM062/document.
As Twitter evolves into a ubiquitous information dissemination tool, understanding tweets in foreign languages becomes an important and difficult problem. Because of the inherent code-mixed, disfluent and noisy nature of tweets, state-of-the-art Machine Translation (MT) is not a viable option (Farzindar & Inkpen, 2015). Indeed, at least for Hindi and Japanese, we observe that the percentage of "understandable" tweets falls from 80% for natives to below 30% for target (English or French) monolingual readers using Google Translate. Our starting hypothesis is that it should be possible to build generic tools, which would enable foreigners to make sense of at least 70% of "native tweets", using a versatile "active reading" (AR) interface, while simultaneously determining the percentage of understandable tweets under which such a system would be deemed useless by intended users. We have thus specified a generic "SUFT" (System for Helping Understand Tweets), and implemented SUFT-1, an interactive multi-layout system based on AR, and easily configurable by adding dictionaries, morphological modules, and MT plugins. It is capable of accessing multiple dictionaries for each source language and provides an evaluation interface. For evaluations, we introduce a task-related measure inducing a negligible cost, and a methodology aimed at enabling a "continuous evaluation on open data", as opposed to classical measures based on test sets related to closed learning sets. We propose to combine understandability ratio and understandability decision time as a two-pronged quality measure, one subjective and the other objective, and experimentally ascertain that a dictionary-based active reading presentation can indeed help understand tweets better than available MT systems. In addition to gathering various lexical resources, we constructed a large resource of "word-forms" appearing in Indian tweets with their morphological analyses (viz. 163,221 Hindi word-forms from 68,788 lemmas and 72,312 Marathi word-forms from 6,026 lemmas) for creating a multilingual morphological analyzer specialized to tweets, which can handle code-mixed tweets, compute unified features, and present a tweet with an attached AR graph from which foreign readers can intuitively extract a plausible meaning, if any.
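The two-pronged quality measure mentioned in the abstract above (an understandability ratio paired with understandability decision time) can be computed along the following lines; the record format and the toy judgements are assumptions made only for illustration.

```python
# Sketch of the two-pronged measure: share of tweets judged understandable,
# plus the mean time readers needed to reach that judgement.
# The record format and values are assumptions made for this illustration.
from statistics import mean

# (tweet_id, judged_understandable, decision_time_seconds)
judgements = [
    ("t1", True, 4.2),
    ("t2", False, 11.5),
    ("t3", True, 3.1),
    ("t4", True, 6.8),
    ("t5", False, 14.0),
]

understandability_ratio = sum(ok for _, ok, _ in judgements) / len(judgements)
mean_decision_time = mean(t for _, _, t in judgements)

print(f"understandability ratio: {understandability_ratio:.0%}")
print(f"mean decision time: {mean_decision_time:.1f} s")
```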
Denoual, Etienne. "Méthodes en caractères pour le traitement automatique des langues." Phd thesis, Université Joseph Fourier (Grenoble), 2006. http://tel.archives-ouvertes.fr/tel-00107056.
The present work promotes the use of methods working at the level of the written signal: the character, a unit immediately accessible in any computerized language, makes it possible to dispense with word segmentation, a step that is currently unavoidable for languages such as Chinese or Japanese.
First, we transpose and apply to characters a well-established method for the objective evaluation of machine translation, BLEU.
The encouraging results then allow us to tackle other linguistic data processing tasks: first, grammaticality filtering; then, the characterization of the similarity and homogeneity of linguistic resources. In all these tasks, character-level processing obtains acceptable results, comparable to those obtained with words.
Third, we tackle linguistic data production tasks: analogical computation on character strings allows the production of paraphrases as well as machine translation.
This work shows that a complete machine translation system requiring no segmentation can be built, a fortiori to process languages without an orthographic separator.
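The character-level transposition of BLEU described above amounts to computing the usual n-gram precisions over character strings instead of word sequences. The sketch below illustrates the idea; the n-gram order and the omission of the brevity penalty are simplifying assumptions, not necessarily the choices made in the thesis.

```python
# Sketch of BLEU-style n-gram precision computed on characters instead of words.
# The n-gram order and the absence of a brevity penalty are simplifying assumptions.
from collections import Counter

def ngram_precision(hyp, ref, n):
    hyp_ngrams = Counter(hyp[i:i + n] for i in range(len(hyp) - n + 1))
    ref_ngrams = Counter(ref[i:i + n] for i in range(len(ref) - n + 1))
    overlap = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
    return overlap / max(1, sum(hyp_ngrams.values()))

def char_bleu(hyp, ref, max_n=4):
    # Geometric mean of character n-gram precisions.
    precisions = [ngram_precision(hyp, ref, n) for n in range(1, max_n + 1)]
    score = 1.0
    for p in precisions:
        score *= max(p, 1e-9)
    return score ** (1.0 / max_n)

# Works identically for languages without word separators (Chinese, Japanese).
print(char_bleu("la traduction automatique", "la traduction est automatique"))
```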
Ahmed, Assowe Houssein. "Construction et évaluation pour la TA d'un corpus journalistique bilingue : application au français-somali." Thesis, Université Grenoble Alpes (ComUE), 2019. http://www.theses.fr/2019GREAM019/document.
As part of ongoing work to computerize a large number of "poorly endowed" languages, especially those in the French-speaking world, we have created a French-Somali machine translation system dedicated to a journalistic sub-language, allowing quality translations to be obtained from a bilingual corpus built by post-editing GoogleTranslate results, for the Somali-speaking and non-French-speaking populations of the Horn of Africa. For this, we have created the very first quality French-Somali parallel corpus, comprising to date 98,912 words (about 400 standard pages) and 10,669 segments. The latter is an aligned corpus of very good quality, because we built it by post-editing pre-translations produced by GT, which uses a combination of its French-English and English-Somali MT language pairs. That corpus was also evaluated by 9 bilingual annotators who assigned a quality score to each segment of the corpus and corrected our post-editing. Using this growing corpus as training data, we have built several successive versions of a MosesLIG-fr-so statistical phrase-based machine translation (PBMT) system, which has proven to be better than GoogleTranslate on this language pair and this sub-language, in terms of BLEU and post-editing time. We also used OpenNMT to build a first French-Somali neural MT system and experimented with it, in order to improve the results of MT without leading to prohibitive computation times, both during training and during decoding. On the other hand, we have set up an iMAG (multilingual interactive access gateway) that allows non-French-speaking Somali surfers on the continent to access the online edition of the newspaper "La Nation de Djibouti" in Somali. The segments (sentences or titles), pre-translated automatically by any available fr-so MT system, can be post-edited and rated (on a 1 to 20 scale) by the readers themselves, so as to improve the system by incremental learning, in the same way as has been done before for the French-Chinese PBMT system created by [Wang, 2015].
Potet, Marion. "Vers l'intégration de post-éditions d'utilisateurs pour améliorer les systèmes de traduction automatiques probabilistes." Phd thesis, Université de Grenoble, 2013. http://tel.archives-ouvertes.fr/tel-00995104.
Negrichi, Khalil. "Approche intégrée pour l'analyse de risques et l'évaluation des performances : application aux services de stérilisation hospitalière." Thesis, Université Grenoble Alpes (ComUE), 2015. http://www.theses.fr/2015GREAI105/document.
Sterilization services are vulnerable to risks, due to the contagious nature of their environment and to the degradation that risks can cause to their performance and to the safety of patients and staff. The risks in these facilities can range from equipment failure to the transmission of nosocomial infections and diseases. In this kind of high-risk environment, these services are also required to maintain an adequate level of performance to ensure continuity of care in operating theatres. In this research we focus on the development of an integrated approach for risk analysis and performance assessment. This work is part of a collaboration between the G-SCOP laboratory and the sterilization service of the University Hospital of Grenoble, which was the case study chosen to implement the proposed approach. The approach we propose proceeds in several steps: first, following a comparison of risk analysis methods, we chose a model-driven approach called FIS (Function Interaction Structure). Based on FIS, we developed a risk model of the Grenoble University Hospital sterilization service. This model describes the functions, the resources needed to achieve these functions, as well as the various risks that may be encountered. Secondly, we introduced a new view of the FIS model dedicated to describing the dynamic behaviour of the resulting risk model. This dynamic model can simulate the behaviour of the sterilization service both in normal operation and in risk situations. To do this, we introduced a new Petri net class called PTPS (Predicate-Transition, Prioritized, Synchronous) Petri nets to represent and simulate the dynamic behaviour of the FIS model. Subsequently, we automated the switching between the risk model and the dynamic model. This automation is performed by a set of translation algorithms capable of automatically converting the FIS model into a PTPS Petri net simulation model. This approach resulted in a degraded-mode modelling and simulation tool called SIM-RISK. We also showed the usefulness of this tool on examples based on different risks encountered in the sterilization service.
Wang, Lingxiao. "Outils et environnements pour l'amélioration incrémentale, la post-édition contributive et l'évaluation continue de systèmes de TA. Application à la TA français-chinois." Thesis, Université Grenoble Alpes (ComUE), 2015. http://www.theses.fr/2015GREAM057/document.
The thesis, conducted as part of a CIFRE grant, and extending one of the aspects of the ANR project Traouiero, first addresses the production, extension and improvement of multilingual corpora by machine translation (MT) and contributory post-editing (PE). Functional and technical improvements have been made to the SECTra and iMAG software produced in previous PhD theses (P.C. Huynh, H.T. Nguyen), and progress has been made toward a generic definition of the structure of a multilingual, annotated and multimedia corpus that may contain usual documents as well as pseudo-documents (such as Web pages) and meta-segments. This part has been validated by the creation of good French-Chinese bilingual corpora, one of them resulting from the first application to literary translation (a Jules Verne novel). A second part, initially motivated by an industrial need, consisted in building MT systems of the Moses type, specialized to sub-languages, for French↔Chinese, and in studying how to improve them in the context of continuous use with the possibility of PE. As part of an internal project on the LIG website and of a project (TABE-FC) in cooperation with Xiamen University, it has been possible to demonstrate the value of incremental learning in statistical MT, under certain conditions, through an experiment that spread over the whole thesis. The third part of the thesis is devoted to contributing and making available computer tools and resources. The main ones are related to the COST project MUMIA of the EU and result from the exploitation of the CLEF-2011 collection of 1.5 million partially multilingual patents. Large translation memories have been extracted from it (17.5 million segments), 3 MT systems have been produced (de-fr, en-fr, fr-de), and a website supporting multilingual IR on patents has been constructed. We also describe the ongoing implementation of JianDan-eval, a platform for building, deploying and evaluating MT systems.
Cennamo, Ilaria. "Enseigner la traduction humaine en s'inspirant de la traduction automatique." Thesis, Brest, 2015. http://www.theses.fr/2015BRES0021/document.
Our project aims at studying human-machine (H-M) interaction in the context of Italian to French translation teaching and learning, at master's degree level in translation and interpretation. More precisely, our focus is on the pedagogical usefulness of such an H-M interaction, put in place thanks to the integration of a rule-based machine translator, namely the system Apertium, in a prototypical version. Can this interaction between machine translation and human translation strategies represent a useful pedagogical tool for translation training? Our hypothesis is that the H-M interaction taking place between human translation learners and our machine translation prototype can encourage learners' meta-translational reflection. This process would help them become aware of all the factors involved in translating, and would allow the systematisation of their translation knowledge.
Dragović-Drouet, Mila. "Évaluation de la qualité des traductions éditoriales." Paris 3, 2003. http://www.theses.fr/2003PA030049.
The present thesis aims to define quality assessment criteria for professionally translated published works, based on the interpretive theory of translation (Paris school). The theoretical assumption is that the translation process and the assessment of its results must be viewed holistically as part of a single communicative act encompassing the roles, competences and objectives of all those involved. The revising and the reviewing of published translated works are observed in the light of the fact that published translations stand alongside "native language" works and must therefore be able to stand alone as texts in their own right, the reference to the original being inevitable only in fine. A corpus of translated works is assessed and used to demonstrate that the effective quality assessment of published translations requires at least some knowledge of the translation process and of the translator's specific skills.
Edighoffer, Kling Véronique. "Approche linguistique et informatique de la traduction automatique. Élaboration d'un programme de traduction automatique du français vers l'italien : Traductor 98." Université Marc Bloch (Strasbourg), 1998. http://www.theses.fr/1998STR20072.
Kuroda, Kyoko. "Traduction automatique : divergences de traduction entre le japonais et le français." Besançon, 2006. http://www.theses.fr/2006BESA1004.
Sentences obtained by translation show varied and important syntactic discrepancies. This is especially true when the sentences are in distant languages such as Japanese and French. The work described here explores the issue of translation divergence between these particular languages in order to apply the results of this investigation to our transfer system. With this intention we are interested in the discrepancies observed at the level of verbal argument structures. We put special focus on changes of lexical category, changes of voice, diversification of the actancial distribution and various forms of actualized predicates. These disparities are often correlated with each other and have common origins. For example, they can be explained by what Pottier calls the 'event statutes' of the predicate: depending on whether the event to be expressed is one of state or evolution, the way of representing the event is different. Furthermore, this differentiation largely depends on the lexicon of each language and also on the syntactico-semantic constraints which each language imposes on its lexicon. We have thus endeavored to extract factors which, on the one hand, are correlated across the diverging facts and, on the other hand, are common to our two languages. We have included these factors in the lexical description, considering that they thus enable the transfer system to apply operations which neutralize the disparity. Having formalized these lexical descriptions, which are based on unification grammar, we show how the transfer system uses the common factors in order to neutralize the discrepancy.
Kanelliadou, Polyxène. "Procédures de levée d'ambiguïté en traduction automatique." Paris 13, 2002. http://www.theses.fr/2002PA131007.
Abdul Rauf, Sadaf. "Sélection de corpus en traduction automatique statistique." Phd thesis, Université du Maine, 2012. http://tel.archives-ouvertes.fr/tel-00732984.
Full textHewson, Lance. "Les paramètres de la traduction." Montpellier 3, 1987. http://www.theses.fr/1987MON30012.
The subject of our thesis is the parameters of translation. We consider both translation as an activity in all its complexity, and the status and importance of the translated text. After exploring recent translation theories and practices, we examine the parameters of the source language text (SLT). The role played by the person ordering the translation (and choosing the SLT) is underlined, as is the importance of his own linguistic background. The text itself is considered from the point of view of its genesis (its production situation, i.e. being produced by a writer writing in particular conditions and within a cultural system of reference). The central part of the thesis deals with the translation operation. After a consideration of the influence exercised by SLT structures, we look at the way in which the translator comes under the sway of texts which have already been translated — "le déjà-traduit". We then look at creative activity in translation, where the translator explores the potential and the limits of the target language — "le possible-du-traduire" in our terminology. In part four, we examine the target language text (TLT). We consider the nature and status of the TLT together with the importance of the receiver of the TLT. We look at the different images of the TLT in our society and explore the translations of a Kafka short story in three languages. We then turn our attention to translation criticism, with a consideration both of its epistemological status and its practices. In our fifth part, we look at the future of research in translation studies, and in particular at applications for translation teaching.
Lin, Hsiang-I. "Vers une traduction automatique des expressions figées françaises en chinois : la traduction canonique." Besançon, 2004. http://www.theses.fr/2004BESA1023.
Fixed expressions are frequently used in the media, in literature and in daily conversations. Thus, they often perplex learners of a new language, but also translators. Moreover, existing machine translation systems give frustrating results for fixed expression entries, especially when it comes to translating French fixed expressions into Chinese. Would it be possible to create a machine translation system that is able to recognize the right sense of a French fixed expression and then to translate it into Chinese in an intelligible way? After studying fixed expressions, (human) translation and machine translation, we identify the principal problems that are inherent in the related areas: the polysemic nature of French fixed expressions; the principle of fidelity in translation: fidelity to the sense; the recognition of the right sense of a French fixed expression, and the output of an intelligible Chinese translation — by a machine: the formalization of all linguistic and translational aspects is therefore indispensable. These problems show that the sense is the essential element in the related areas, and that the sense underlies concrete forms. The sense of a fixed expression may not only be determined by the form that the expression takes, but also be outlined by its co-texts. Based on these findings, we suggest the canonical approach, grounded in semantic and formal aspects of French fixed expressions and their Chinese translations. With this approach, we establish the trace system (canonical translation of expressions), which is able to translate a French fixed expression correctly into Chinese.
Guilbault, Isabelle. "Les dictionnaires dans les systèmes de traduction automatique." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape9/PQDD_0001/MQ45222.pdf.
Douib, Ameur. "Algorithmes bio-inspirés pour la traduction automatique statistique." Thesis, Université de Lorraine, 2019. http://www.theses.fr/2019LORR0005/document.
Different components of statistical machine translation systems can be seen as optimization problems. Indeed, the learning of the translation model, the decoding and the optimization of the weights of the log-linear function are three important optimization problems. Knowing how to define the right algorithms to solve them is one of the most important tasks in building an efficient translation system. Several optimization algorithms are proposed to deal with the decoder's optimization problems. They are combined to solve, on the one hand, the decoding problem that produces a translation in the target language for each source sentence, and, on the other hand, the problem of optimizing the weights of the combined scores in the log-linear function, which fixes the translation evaluation function used during decoding. The reference system in statistical translation is based on a beam-search algorithm for decoding, and a line-search algorithm for optimizing the weights associated with the scores. We propose a new statistical translation system with a decoder entirely based on genetic algorithms. Genetic algorithms are bio-inspired optimization algorithms that simulate the natural process of evolution of species. They handle a set of solutions through several iterations so as to converge towards optimal solutions. This work allows us to study the efficiency of genetic algorithms for machine translation. The originality of our work is the proposal of two algorithms: a genetic algorithm, called GAMaT, as a decoder for a phrase-based machine translation system, and a second genetic algorithm, called GAWO, for optimizing the weights of the log-linear function in order to use it as a fitness function for GAMaT. We also propose a neural approach to define a new fitness function for GAMaT. This approach consists in using a neural network to learn a function that combines several scores, which evaluate different aspects of a translation hypothesis, previously combined in the log-linear function, and that predicts the BLEU score of this translation hypothesis. This work allowed us to propose a new machine translation system with a decoder entirely based on genetic algorithms.
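The genetic search idea behind GAMaT can be illustrated with a toy sketch: a population of candidate translations is evolved by selection, crossover and mutation, and ranked by a fitness function. Everything below — the tiny lexicon, the overlap-based fitness standing in for the learned fitness function, and all parameters — is an illustrative assumption.

```python
# Toy sketch of a genetic-algorithm search over translation hypotheses.
# The tiny lexicon, the fitness function and all parameters are illustrative
# assumptions; they only mimic the selection / crossover / mutation loop.
import random

random.seed(0)
source = "la maison bleue".split()
lexicon = {"la": ["the", "a"], "maison": ["house", "home"], "bleue": ["blue", "sad"]}
reference = "the blue house".split()          # stands in for a learned fitness model

def random_hypothesis():
    return [random.choice(lexicon[w]) for w in source]

def fitness(hyp):
    # Crude stand-in: unigram overlap with the reference.
    return len(set(hyp) & set(reference)) / len(reference)

def crossover(a, b):
    cut = random.randint(1, len(a) - 1)
    return a[:cut] + b[cut:]

def mutate(hyp, rate=0.2):
    return [random.choice(lexicon[s]) if random.random() < rate else w
            for s, w in zip(source, hyp)]

population = [random_hypothesis() for _ in range(20)]
for generation in range(30):
    population.sort(key=fitness, reverse=True)
    parents = population[:10]                 # selection of the fittest
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(10)]
    population = parents + children

print("best hypothesis:", " ".join(max(population, key=fitness)))
```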
Gasquet, Olivier. "Déduction automatique en logique multi-modale par traduction." Toulouse 3, 1994. http://www.theses.fr/1994TOU30078.
Déchelotte, Daniel. "Traduction automatique de la parole par méthodes statistiques." Paris 11, 2007. http://www.theses.fr/2007PA112244.
The subject of this thesis is automatic speech translation. The task is the translation of the European Parliamentary Plenary Sessions proceedings, between English and Spanish. Two statistical translation systems are used. The first one has been entirely developed during this thesis and relies on the IBM-4 model. The second system employs Moses, an open-source, state-of-the-art phrase-based translation decoder. A collaboration between the two decoders is envisaged. The neural-network language model proves extremely useful in both translation directions. The systems described in this thesis obtained top rankings at the last TC-Star evaluation of February 2007. An algorithm inspired by the Perceptron is proposed to modify the phrase-table scores in a discriminative manner, based on errors observed on a development corpus. With respect to the interaction between speech recognition and translation, we measure the impact of the speech recognition word error rate on translation performance, and evaluate separately the respective impact of the source language model and the acoustic model. We also run experiments to take into account the ambiguity of the speech recognition output, i.e. the words between which the speech recognizer "hesitates". We then present several speech-specific processing steps, occurring after recognition and before translation. Eventually, we modify the speech recognition system so as to improve the overall speech translation performance.
Do, Quoc Khanh. "Apprentissage discriminant des modèles continus en traduction automatique." Thesis, Université Paris-Saclay (ComUE), 2016. http://www.theses.fr/2016SACLS071/document.
Over the past few years, neural network (NN) architectures have been successfully applied to many Natural Language Processing (NLP) applications, such as Automatic Speech Recognition (ASR) and Statistical Machine Translation (SMT). For the language modeling task, these models consider linguistic units (i.e. words and phrases) through their projections into a continuous (multi-dimensional) space, and the estimated distribution is a function of these projections. Also called continuous-space models (CSMs), their peculiarity hence lies in this exploitation of a continuous representation, which can be seen as an attempt to address the sparsity issue of conventional discrete models. In the context of SMT, these techniques have been applied to neural network-based language models (NNLMs) included in SMT systems, and to continuous-space translation models (CSTMs). These models have led to significant and consistent gains in SMT performance, but are also very expensive in training and inference, especially for systems involving large vocabularies. To overcome this issue, Structured Output Layer (SOUL) and Noise Contrastive Estimation (NCE) have been proposed; the former modifies the standard structure over vocabulary words, while the latter approximates maximum-likelihood estimation (MLE) by a sampling method. All these approaches share the same estimation criterion, MLE; however, using this procedure results in an inconsistency between the objective function defined for parameter estimation and the way models are used in the SMT application. The work presented in this dissertation aims to design new performance-oriented and global training procedures for CSMs to overcome these issues. The main contributions lie in the investigation and evaluation of efficient training methods for (large-vocabulary) CSMs which aim: (a) to reduce the total training cost, and (b) to improve the efficiency of these models when used within the SMT application. On the one hand, the training and inference cost can be reduced (using the SOUL structure or the NCE algorithm), or by reducing the number of iterations via faster convergence. This thesis provides an empirical analysis of these solutions on different large-scale SMT tasks. On the other hand, we propose a discriminative training framework which optimizes the performance of the whole system containing the CSM as a component model. The experimental results show that this framework is efficient both to train and to adapt CSMs within SMT systems, opening promising research perspectives.
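A minimal sketch of the noise-contrastive estimation idea mentioned above: instead of normalising a softmax over the whole vocabulary, the model learns to separate each observed word from k words drawn from a noise distribution. The model size, the uniform noise distribution and the single toy update below are assumptions for illustration only.

```python
# Minimal numpy sketch of noise-contrastive estimation (NCE) for a toy
# continuous-space language model. Vocabulary size, dimensions and the
# uniform noise distribution are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
V, D, k, lr = 1000, 32, 10, 0.1           # vocab size, embedding dim, noise samples, step
context_emb = rng.normal(scale=0.1, size=(V, D))
output_emb = rng.normal(scale=0.1, size=(V, D))
noise_dist = np.full(V, 1.0 / V)          # unigram noise distribution (uniform here)

def nce_step(context_word, target_word):
    h = context_emb[context_word]                       # context representation
    noise_words = rng.choice(V, size=k, p=noise_dist)
    for w, label in [(target_word, 1.0)] + [(int(nw), 0.0) for nw in noise_words]:
        w_emb = output_emb[w].copy()
        # Unnormalised score shifted by log(k * Pn(w)), as in NCE.
        s = w_emb @ h - np.log(k * noise_dist[w])
        p = 1.0 / (1.0 + np.exp(-s))                    # P(observed | w, h)
        grad = p - label                                # gradient of the logistic loss
        output_emb[w] -= lr * grad * h
        context_emb[context_word] -= lr * grad * w_emb

# One toy update: word 5 observed after context word 3.
nce_step(3, 5)
print("updated embedding norm:", round(float(np.linalg.norm(output_emb[5])), 4))
```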
Afli, Haithem. "La Traduction automatique statistique dans un contexte multimodal." Thesis, Le Mans, 2014. http://www.theses.fr/2014LEMA1012/document.
The performance of statistical machine translation systems depends on the availability of bilingual parallel texts, also known as bitexts. However, freely available parallel texts are a sparse resource: their size is often limited, their linguistic coverage insufficient, or their domain inappropriate. There are relatively few language pairs for which parallel corpora of adequate size are available for certain domains. One way to overcome the lack of parallel data is to exploit comparable corpora, which are more abundant. Previous work in this area has been applied to the text modality. The question we asked in this thesis is: can comparable multimodal corpora provide solutions to the lack of parallel data in machine translation? In this thesis, we studied how to use resources from different modalities (text or speech) for the development of a statistical machine translation system. The first part of our contributions is to provide a method for extracting parallel data from a comparable multimodal corpus (text and audio). The audio data are transcribed with an automatic speech recognition system and translated with a machine translation system. These translations are then used as queries to select parallel sentences and generate a bitext. In the second part of our contributions, we aim to improve our method to exploit sub-sentential entities, creating an extension of our system to generate parallel segments. We also improve the filtering module. Finally, we present several approaches to adapt translation systems with the extracted data. Our experiments were conducted on data from the TED and Euronews websites, which show the feasibility of our approaches.
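The retrieval step of the extraction pipeline summarised above (transcribe the audio, translate the transcript, use the translation as a query to select target-language sentences) can be sketched as follows; the word-overlap similarity and the threshold are illustrative assumptions, not the actual components of the system.

```python
# Sketch of the retrieval step: the MT output of a transcribed audio segment is
# used as a query against target-language sentences, and the best match above
# a threshold is kept as a pseudo-parallel pair. The similarity measure and the
# threshold are illustrative assumptions.
def similarity(a, b):
    """Word-overlap (Jaccard) similarity between two sentences."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def extract_pairs(translated_transcripts, target_sentences, threshold=0.4):
    pairs = []
    for source_sentence, query in translated_transcripts:
        best = max(target_sentences, key=lambda t: similarity(query, t))
        if similarity(query, best) >= threshold:
            pairs.append((source_sentence, best))
    return pairs

translated = [("discours original transcrit", "the president spoke about the crisis")]
candidates = ["markets reacted to the speech",
              "the president spoke about the economic crisis"]
print(extract_pairs(translated, candidates))
```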
Kraif, Olivier. "Constitution et exploitation de bi-textes pour l'Aide à la traduction." Nice, 2001. http://www.theses.fr/2001NICE2018.
Severini, Alfiero. "Mise au point d'un système de traduction automatique italien-français." Paris 13, 2001. http://www.theses.fr/2001PA131007.
Haddaine, Mihoubi Houria. "Une approche déclarative de traduction d'ontologies." Grenoble 3, 2000. http://www.theses.fr/2000GRE39044.
Full textLavecchia, Caroline. "Les Triggers Inter-langues pour la Traduction Automatique Statistique." Phd thesis, Université Nancy II, 2010. http://tel.archives-ouvertes.fr/tel-00545463.
Rubino, Raphaël. "Traduction automatique statistique et adaptation à un domaine spécialisé." Phd thesis, Université d'Avignon, 2011. http://tel.archives-ouvertes.fr/tel-00879945.
Loginova Clouet, Elizaveta. "Traitement automatique des termes composés : segmentation, traduction et variation." Nantes, 2014. http://archive.bu.univ-nantes.fr/pollux/show.action?id=f9a1d95c-ba61-4322-96a9-ffda96d82504.
The number of specialized terms in documents grows continuously, at a pace which is difficult for terminology standardization organizations to follow. Methods for bilingual term lexicon construction from text corpora provide solutions. Our thesis falls within this topic: bilingual lexicon acquisition from comparable corpora. Compound terms (terms containing several roots but forming a single graphical unit) are challenging for natural language processing applications. Given their graphical form, they are often handled in the same manner as single-word terms, which prevents their semantic complexity from being apprehended. Our involvement in an automatic terminology extraction evaluation allowed us to check our hypothesis: compound terms need particular processing in a multilingual context. We proposed a method for compound term recognition and splitting, which combines language-independent and language-specific features. It allowed us to obtain results comparable with those of state-of-the-art methods, while validating on a sample of languages from several families (Germanic, Slavic, Romance languages), and adapting the method to specialized domains (tested on two domains: wind energy and breast cancer). We used the produced segmentations for the compositional translation of compound terms, and for the recognition of their multi-word variants in specialized texts. These two experiments illustrate that compound splitting is beneficial for the bilingual term lexicon acquisition task.
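The splitting step can be illustrated with a simple frequency-based sketch in the spirit of corpus-driven compound splitting; the toy corpus counts, the geometric-mean scoring and the single German linking element handled here are assumptions, not the exact combination of language-independent and language-specific features used in the thesis.

```python
# Frequency-based compound splitting sketch: try split points and keep the
# split whose parts are most frequent in a monolingual corpus (geometric mean).
# The toy corpus counts and the single linking "s" are illustrative assumptions.
from math import prod

corpus_counts = {"wind": 800, "energie": 650, "anlage": 400, "windenergie": 90}

def candidate_splits(word, min_part=4):
    yield (word,)                                    # no split
    for i in range(min_part, len(word) - min_part + 1):
        left, right = word[:i], word[i:]
        yield (left, right)
        if left.endswith("s"):                       # drop a German linking "s"
            yield (left[:-1], right)

def best_split(word):
    def score(parts):
        freqs = [corpus_counts.get(p, 0) for p in parts]
        return prod(freqs) ** (1.0 / len(parts))     # geometric mean of counts
    return max(candidate_splits(word.lower()), key=score)

print(best_split("Windenergieanlage"))    # expected: ('windenergie', 'anlage')
```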
Dymetman, Marc. "Transformations de grammaires logiques et réversibilité en traduction automatique." Grenoble 1, 1992. http://www.theses.fr/1992GRE10097.
Marie, Benjamin. "Exploitation d'informations riches pour guider la traduction automatique statistique." Thesis, Université Paris-Saclay (ComUE), 2016. http://www.theses.fr/2016SACLS066/document.
Although communication between languages has without question been made easier thanks to Machine Translation (MT), especially given the recent advances in statistical MT systems, the quality of the translations produced by MT systems is still well below the translation quality that can be obtained through human translation. This gap is partly due to the way in which statistical MT systems operate; the types of models that can be used are limited because of the need to construct and evaluate a great number of partial hypotheses to produce a complete translation hypothesis. While more "complex" models learnt from richer information do exist, in practice their integration into the system is not always possible, would necessitate a complete hypothesis to be computed, or would be too computationally expensive. Such features are therefore typically used in a reranking step applied to the list of the best complete hypotheses produced by the MT system. Using these features in a reranking framework does often provide a better modelization of certain aspects of the translation. However, this approach is inherently limited: reranked hypothesis lists represent only a small portion of the decoder's search space, tend to contain hypotheses that vary little between each other, and were obtained with features that may be very different from the complex features to be used during reranking. In this work, we put forward the hypothesis that such translation hypothesis lists are poorly adapted for exploiting the full potential of complex features. The aim of this thesis is to establish new and better methods of exploiting such features to improve translations produced by statistical MT systems. Our first contribution is a rewriting system guided by complex features. Sequences of rewriting operations, applied to hypotheses obtained by a reranking framework that uses the same features, allow us to obtain a substantial improvement in translation quality. The originality of our second contribution lies in the construction of hypothesis lists with a multi-pass decoding that exploits information derived from the evaluation of previously translated hypotheses, using a set of complex features. Our system is therefore capable of producing more diverse hypothesis lists, which are globally of better quality and which are better adapted to a reranking step with complex features. What is more, our aforementioned rewriting system enables us to further improve the hypotheses produced with our multi-pass decoding approach. Our third contribution is based on the simulation of an ideal information type, designed to perfectly identify the correct fragments of a translation hypothesis. This perfect information gives us an indication of the best attainable performance with the systems described in our first two contributions, in the case where the complex features are able to modelize the translation perfectly. Through this approach, we also introduce a novel form of interactive translation, coined "pre-post-editing", under a very simplified form: a statistical MT system produces its best translation hypothesis, then a human indicates which fragments of the hypothesis are correct, and this new information is then used during a new decoding pass to find a new best translation.
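The reranking setting that this work starts from — rescoring an n-best list with rich features that are too costly to use during decoding — can be sketched as a simple linear combination of feature scores; the feature names, weights and toy hypotheses below are illustrative assumptions.

```python
# Sketch of reranking an n-best list with a linear combination of rich features
# that would be too costly to use during decoding. Feature names, weights and
# the toy hypotheses are illustrative assumptions.
nbest = [
    {"text": "the meeting will held tomorrow",
     "decoder_score": -4.1, "parse_prob": 0.20, "coherence": 0.55},
    {"text": "the meeting will be held tomorrow",
     "decoder_score": -4.6, "parse_prob": 0.85, "coherence": 0.80},
    {"text": "meeting will be held the tomorrow",
     "decoder_score": -5.0, "parse_prob": 0.10, "coherence": 0.40},
]
weights = {"decoder_score": 1.0, "parse_prob": 3.0, "coherence": 2.0}

def rerank_score(hyp):
    return sum(weights[name] * hyp[name] for name in weights)

best = max(nbest, key=rerank_score)
print(best["text"])        # the grammatical hypothesis wins after reranking
```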
Ben Larbi, Sara. "La traduction de la métaphore en poésie : ses difficultés et son évaluation." Thesis, Université de Lorraine, 2014. http://www.theses.fr/2014LORR0218/document.
We undertake to study the translation of metaphor, working on corpora in three linguistic codes: English, Arabic and French, which differ in their systems and linguistic usage. To what extent has metaphor been translated without being disfigured? In a comparative study, we analyse the metaphors of the poet Thomas Stearns Eliot translated from English into French by Pierre Leyris in The Waste Land, Ash Wednesday and the Ariel Poems, setting the verse against standard English. We also study Mahmoud Darwich's metaphors translated from Arabic into French by Elias Sanbar in three poetry books: Why do you leave the horse to loneliness?, Don't apologize, and As an almond tree flowers or farther. Our thesis is divided into three chapters: firstly, we rethink metaphor by evaluating its definitions in language, traditional grammar and rhetoric, and we redefine its features linguistically. Secondly, in order to solve the difficulties of metaphor translation in poetry, we propose to reclassify six rhetorical variations of metaphor into two sets: common metaphors and particular ones. Thirdly, we propose several kinds of metaphor translation strategies. Translating metaphor can be examined from a contrastive linguistic and translatological perspective that explains the translator's difficulties. The sample, composed of 30 difficulties, shows that the translators Leyris and Sanbar share the same problems: problems of a lexical and semantic order, an inappropriate lexicon or the creation of a new lexical unit; and the analysis of their approach to metaphor translation shows that they rely on three strategies: translation by the same reference, by a different reference, and conversion. To conclude, we ask the following question: could the translation of common metaphors from English into French be modelled?
Schwarzl, Anja. "The (im)possibilities of machine translation." Frankfurt am Main ; Berlin ; Paris [etc.] : P. Lang, 2001. http://catalogue.bnf.fr/ark:/12148/cb38934029b.
Circe, Karine. "Traduction automatique, mémoire de traduction ou traduction humaine? Proposition d'une approche pour déterminer la meilleure méthode à adopter, selon le texte." Thesis, University of Ottawa (Canada), 2005. http://hdl.handle.net/10393/26876.
Full textHadj, salah Marwa. "Désambiguïsation lexicale de l'arabe pour et par la traduction automatique." Thesis, Université Grenoble Alpes (ComUE), 2018. http://www.theses.fr/2018GREAM089/document.
This thesis concerns a study of Word Sense Disambiguation (WSD), a central task in natural language processing that can improve applications such as machine translation or information extraction. Research in word sense disambiguation predominantly concerns the English language, because the majority of other languages lack a standard lexical reference for the annotation of corpora, and also lack sense-annotated corpora for evaluation and, more importantly, for the construction of word sense disambiguation systems. In English, the lexical database WordNet is a long-standing de facto standard used in most sense-annotated corpora and in most WSD evaluation campaigns. Our contribution in this thesis focuses on several areas: first of all, we present a method for the automatic creation of sense-annotated corpora for any language, by taking advantage of the large amount of WordNet sense-annotated English corpora and by using a machine translation system. This method is applied to Arabic and is evaluated on, to our knowledge, the only Arabic corpus manually sense-annotated with WordNet: the Arabic OntoNotes 5.0, which we have semi-automatically enriched. Its evaluation is performed thanks to an implementation of two supervised word sense disambiguation systems that are trained on the corpora produced using our method. We hence propose a solid baseline for the evaluation of future Arabic word sense disambiguation systems, in addition to sense-annotated Arabic corpora that we provide as a freely available resource. Secondly, we propose an in vivo evaluation of our Arabic word sense disambiguation system by measuring its contribution to the performance of the machine translation task.
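The corpus-creation method summarised above (translating English sense-annotated corpora and transferring the annotations to the target language) can be sketched as a projection of sense tags through word alignments; the alignment links, the toy sentence pair and the sense keys below are illustrative assumptions.

```python
# Sketch of projecting WordNet sense tags from an English sense-annotated
# sentence to its (machine-)translated Arabic counterpart through word
# alignments. The alignment links, the toy sentences and the sense keys are
# illustrative assumptions.
english = ["the", "bank", "approved", "the", "loan"]
english_senses = {1: "bank%1:14:00::", 4: "loan%1:21:00::"}   # token index -> sense key
arabic = ["وافق", "البنك", "على", "القرض"]
# Word alignment as (english_index, arabic_index) pairs, e.g. from an aligner.
alignment = [(2, 0), (1, 1), (4, 3)]

def project_senses(src_senses, alignment):
    projected = {}
    for src_i, tgt_i in alignment:
        if src_i in src_senses:
            projected[tgt_i] = src_senses[src_i]
    return projected

arabic_senses = project_senses(english_senses, alignment)
for i, sense in sorted(arabic_senses.items()):
    print(arabic[i], "->", sense)
```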
Vial, Loïc. "Modèles neuronaux joints de désambiguïsation lexicale et de traduction automatique." Thesis, Université Grenoble Alpes, 2020. http://www.theses.fr/2020GRALM032.
Word Sense Disambiguation (WSD) and Machine Translation (MT) are two central and among the oldest tasks of Natural Language Processing (NLP). Although they share a common origin, WSD being initially conceived as a fundamental problem to be solved for MT, the two tasks have subsequently evolved very independently of each other. Indeed, on the one hand, MT has been able to do without the explicit disambiguation of terms thanks to statistical and neural models trained on large amounts of parallel corpora, and on the other hand, WSD, which faces limitations such as the lack of unified resources and a restricted scope of applications, remains a major challenge on the way to a better understanding of language in general. Today, in a context in which neural networks and word embeddings are becoming more and more important in NLP research, recent neural architectures and new pre-trained language models offer not only new possibilities for developing more efficient WSD and MT systems, but also an opportunity to bring the two tasks together through joint neural models, which facilitate the study of their interactions. In this thesis, our contributions initially focus on the improvement of WSD systems by unifying the resources that are necessary for their implementation, constructing new neural architectures and developing original approaches to improve the coverage and the performance of these systems. Then, we develop and compare different approaches for the integration of our state-of-the-art WSD systems and language models into MT systems for the overall improvement of their performance. Finally, we present a new architecture that allows training a joint model for both WSD and MT, based on our best neural systems.
Pétry, Jean-Luc. "Le pronom "il" en traduction assistée par ordinateur : étude théorique." Metz : Université de Metz, 2008. ftp://ftp.scd.univ-metz.fr/pub/Theses/1996/Petry.Jean_Luc.LMZ969.pdf.
Lavigne, Pierre-Étienne. "L'utilisation de la traduction automatique en contexte professionnel : étude de cas concernant les perceptions de la traduction automatique ainsi que son utilisation en contexte professionnel." Thesis, Université d'Ottawa / University of Ottawa, 2017. http://hdl.handle.net/10393/36592.
Gahbiche-Braham, Souhir. "Amélioration des systèmes de traduction par analyse linguistique et thématique : Application à la traduction depuis l'arabe." Phd thesis, Université Paris Sud - Paris XI, 2013. http://tel.archives-ouvertes.fr/tel-00878887.
Saint-André, Louise. "Quelle formation donner aux traducteurs-postéditeurs de demain?" Thesis, Université d'Ottawa / University of Ottawa, 2014. http://hdl.handle.net/10393/31881.
Full textIgnat, Camelia Rousselot François. "Improving statistical alignment and translation using highly multilingual corpora." Strasbourg : Université de Strasbourg, 2009. http://eprints-scd-ulp.u-strasbg.fr:8080/1147/01/IGNAT_Camelia_2009.pdf.
Anchaleenukoon, Sunant. "Étude contrastive des systèmes prédicatifs français et thai en vue de la traduction automatique." Paris 3, 1993. http://www.theses.fr/1993PA030135.
The predicative systems of French and Thai are analyzed with a view to machine translation. This analysis is carried out in terms of logic and semantics, as these two languages are quite different at the syntactic level. We then use the notion of lexical units. The lexical units concerning the predicative system play a very important role in machine translation. They allow the differences between syntactic classes to be neutralized for the same semantic class. Following the method of Prof. B. Vauquois, multilevel structures are used in MT systems to represent the analysis of units of translation. A multilevel structure is a tree structure whose nodes bear complex labels containing the information necessary to interpret each selected level. These levels are presented by the hierarchy of their "deep structure". The general analysis allows us to observe that French and Thai have the same kinds of concepts: process, state and entity, which constitute the derivative types of the predicates. The result of this analysis shows that the greater part of the predicative systems of French and Thai is much the same. We found that there will be very few difficult points in French-Thai machine translation caused by differences in the predicative systems. On the one hand, there are very few differences concerning semantic restrictions on arguments of. .
Anchaleenukoon, Sunant. "Étude contrastive des systèmes prédicatifs français et thai en vue de la traduction automatique." Paris 3, 1993. http://www.theses.fr/2003PA03A135.
Blain, Frédéric. "Modèles de traduction évolutifs." Thesis, Le Mans, 2013. http://www.theses.fr/2013LEMA1034/document.
Although machine translation research has made great progress over the last several years, the output of an automated system cannot be published without prior revision by human annotators. Based on this fact, we wanted to exploit the user feedback from the revision process in order to incrementally adapt our statistical system over time. As part of this thesis, we are therefore interested in post-editing, one of the most active fields of research, and one that is widely used in the translation and localization industry. However, the integration of user feedback is not an obvious task. On the one hand, we must be able to identify the information that will be useful for the system among all the changes made by the user. To address this problem, we introduced a new concept (the "Post-Editing Actions") and proposed an analysis methodology for the automatic identification of this information from post-edited data. On the other hand, for the continuous integration of user feedback, we have developed an algorithm for the incremental adaptation of a statistical machine translation system, which achieves higher performance than the standard procedure. This is all the more interesting as both the development and the optimization of this type of translation system have a very high computational cost, sometimes requiring several days of computing. Conducted jointly with SYSTRAN and LIUM, the research work of this thesis is part of the French National Research Agency project COSMAT. This project aimed to provide a collaborative machine translation service for scientific content to the scientific community. The collaborative aspect of this service, with the possibility for users to review the translations, gives an application framework for our research.
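The incremental adaptation idea described above — feeding each newly post-edited sentence pair back into the system so that its models improve over time — can be sketched as a running update of translation counts; the one-to-one word pairing standing in for phrase extraction and the toy sentences are illustrative assumptions.

```python
# Sketch of incremental adaptation: pairs extracted from each newly post-edited
# sentence are added to running counts, and translation probabilities are
# re-estimated on the fly. The one-to-one word pairing is a crude stand-in for
# phrase extraction, an illustrative assumption only.
from collections import defaultdict

counts = defaultdict(lambda: defaultdict(int))   # source token -> target token -> count

def update_from_post_edit(source_sentence, post_edited_target):
    # Crude stand-in for phrase extraction: align words one-to-one by position.
    for s, t in zip(source_sentence.split(), post_edited_target.split()):
        counts[s][t] += 1

def translation_prob(source_token, target_token):
    total = sum(counts[source_token].values())
    return counts[source_token][target_token] / total if total else 0.0

# The system keeps learning as users post-edit its output.
update_from_post_edit("la réunion est annulée", "the meeting is cancelled")
update_from_post_edit("la décision est prise", "the decision is taken")
print(translation_prob("est", "is"))     # -> 1.0 after two consistent post-edits
```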
Nakamura-Delloye, Yayoi. "Alignement automatique de textes parallèles français - japonais." Paris 7, 2007. http://www.theses.fr/2007PA070054.
Automatic alignment aims to match elements of parallel texts. We are especially interested in the implementation of a system which carries out alignment at the clause level. The clause is a useful linguistic unit for many applications. This thesis consists of two types of work: the introductory work and the work that constitutes the thesis core. It is structured around the concept of the syntactic clause. The introductory work includes an overview of alignment and studies on sentence alignment. This work resulted in the creation of a sentence alignment system adapted to French and Japanese text processing. The thesis core consists of two types of work: linguistic studies and implementations. The linguistic studies are themselves divided into two topics: the French clause and the Japanese clause. The goal of our French clause studies is to define a grammar for clause identification. For this purpose, we attempted to define a typological classification of clauses based on formal criteria only. In the Japanese studies, we first define the Japanese sentence on the basis of the theme-rheme structure. We then try to elucidate the notion of clause. The implementation work consists of three tasks which together constitute the clause alignment processing. These tasks are carried out by three separate tools: two clause identification systems (one for French texts and one for Japanese texts) and a clause alignment system.
Dahmani, Halima. "Contribution à la généralisation de la traduction noir braille." Paris 11, 1988. http://www.theses.fr/1988PA112029.
Fränne, Ellen. "Google Traduction et le texte idéologique : dans quelle mesure une traduction automatique transmet-elle le contenu idéologique d'un texte?" Thesis, Linnéuniversitetet, Institutionen för språk (SPR), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-64460.
Lab, Frédérique. "Linguistique contrastive et traduction automatique, une étude de cas : les traductions anglaises du présent français." Paris 7, 1991. http://www.theses.fr/1991PA070041.
After a brief introduction to machine translation, with a few definitions relating to this domain, the author studies the treatment of time and aspect in some of the existing MT systems. A whole section is devoted to the European project "Eurotra", with a critical approach. The main part of the thesis consists of a thorough examination of the translations of the French "present" tense into English: 1) what are the various possible translations (study of a corpus)? 2) what has to be taken into account in order to infer the value of the present and hence translate it (referential interpretation, status of the process, noun determination, lexical aspect)? 3) what rules can be drawn for the automatic translation of these verbal forms?