Theses on the topic "TAL (Traitement Automatique des Langues)"
Create an accurate citation in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 theses for your research on the topic "TAL (Traitement Automatique des Langues)".
Next to every source in the list of references there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Vancouver, Chicago, etc.
You can also download the full text of the scholarly publication as a PDF and read its abstract online whenever these are available in the metadata.
Explore theses on a wide variety of disciplines and organise your bibliography correctly.
Tirilly, Pierre. "Traitement automatique des langues pour l'indexation d'images". PhD thesis, Université Rennes 1, 2010. http://tel.archives-ouvertes.fr/tel-00516422.
Bourgeade, Tom. "Interprétabilité a priori et explicabilité a posteriori dans le traitement automatique des langues". Thesis, Toulouse 3, 2022. http://www.theses.fr/2022TOU30063.
With the advent of Transformer architectures in Natural Language Processing a few years ago, we have observed unprecedented progress in various text classification and generation tasks. However, the explosion in the number of parameters and the complexity of these state-of-the-art black-box models make ever more apparent the now urgent need for transparency in machine learning approaches. The ability to explain, interpret, and understand algorithmic decisions will become paramount as computer models become more and more present in our everyday lives. Using eXplainable AI (XAI) methods, we can for example diagnose dataset biases and spurious correlations which can taint the training process of models, leading them to learn undesirable shortcuts that could result in unfair, incomprehensible, or even risky algorithmic decisions. These failure modes of AI may ultimately erode the trust humans would otherwise have placed in beneficial applications. In this work, we explore two major aspects of XAI in the context of Natural Language Processing tasks and models. In the first part, we approach the subject of intrinsic interpretability, which encompasses all methods for which explanations are inherently easy to produce. In particular, we focus on word embedding representations, an essential component of practically all NLP architectures that allows these mathematical models to process human language in a more semantically rich way. Unfortunately, many of the models which generate these representations produce them in a way which is not interpretable by humans. To address this problem, we experiment with the construction and usage of Interpretable Word Embedding models, which attempt to correct this issue by using constraints that enforce interpretability on these representations.
We then make use of these, in a simple but effective novel setup, to attempt to detect lexical correlations, spurious or otherwise, in some popular NLP datasets. In the second part, we explore post-hoc explainability methods, which can target already-trained models and attempt to extract various forms of explanations of their decisions. These can range from diagnosing which parts of an input were the most relevant to a particular decision, to generating adversarial examples carefully crafted to reveal weaknesses in a model. We explore a novel type of approach, in part enabled by the highly performant but opaque recent Transformer architectures: instead of using a separate method to produce explanations of a model's decisions, we design and fine-tune an architecture which jointly learns to perform its task while also producing free-form Natural Language Explanations of its own outputs. We evaluate our approach on a large-scale dataset annotated with human explanations, and qualitatively judge some of our approach's machine-generated explanations.
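The lexical-correlation detection mentioned above can be illustrated with a simple word-label association count. This is a minimal sketch, not the thesis's method: it scores each (word, label) pair by pointwise mutual information, and the toy sentiment data (where "soundtrack" spuriously co-occurs with the positive label) is invented for illustration.

```python
import math
from collections import Counter

def label_word_pmi(examples):
    """PMI between each (word, label) pair in a labelled dataset.

    A high PMI flags a word that co-occurs with one label far more often
    than chance would predict, i.e. a potential spurious lexical shortcut.
    """
    word_counts, label_counts, pair_counts = Counter(), Counter(), Counter()
    n = len(examples)
    for text, label in examples:
        words = set(text.lower().split())  # presence, not frequency
        label_counts[label] += 1
        for w in words:
            word_counts[w] += 1
            pair_counts[(w, label)] += 1
    return {
        (w, lab): math.log((c / n) / ((word_counts[w] / n) * (label_counts[lab] / n)))
        for (w, lab), c in pair_counts.items()
    }

# Toy data: "soundtrack" only ever appears in positive reviews.
data = [
    ("great movie soundtrack", "pos"),
    ("lovely soundtrack and cast", "pos"),
    ("boring plot", "neg"),
    ("weak acting", "neg"),
]
scores = label_word_pmi(data)
```

On this toy corpus, `scores[("soundtrack", "pos")]` is log 2, flagging the association even though the word says nothing about sentiment.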
Denoual, Etienne. "Méthodes en caractères pour le traitement automatique des langues". PhD thesis, Université Joseph Fourier (Grenoble), 2006. http://tel.archives-ouvertes.fr/tel-00107056.
The present work promotes the use of methods working at the level of the written signal: the character, a unit immediately accessible in any computerized language, makes it possible to dispense with word segmentation, a step that is currently unavoidable for languages such as Chinese or Japanese.
First, we transpose and apply at the character level a well-established method for the objective evaluation of machine translation, BLEU.
The encouraging results allow us, in a second step, to address other linguistic data processing tasks: first, grammaticality filtering; then, the characterization of the similarity and homogeneity of linguistic resources. In all these tasks, character-level processing obtains acceptable results, comparable to those obtained with words.
Third, we address linguistic data production tasks: analogical computation on character strings enables the production of paraphrases as well as machine translation.
This work shows that a complete machine translation system requiring no segmentation can be built, a fortiori for processing languages without orthographic separators.
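The character-level transposition of BLEU described above can be sketched in a few lines: the same clipped n-gram precision is computed, but over character n-grams instead of word n-grams. This is a simplified illustration (single reference, no brevity penalty, no smoothing), not the thesis's evaluation setup.

```python
import math
from collections import Counter

def ngram_precision(cand, ref, n):
    """Clipped n-gram precision of a candidate sequence against a reference."""
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
    total = sum(cand_ngrams.values())
    return overlap / total if total else 0.0

def bleu(cand, ref, max_n=4):
    """Geometric mean of 1..max_n n-gram precisions (brevity penalty omitted)."""
    precisions = [ngram_precision(cand, ref, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    return math.exp(sum(math.log(p) for p in precisions) / max_n)

hyp, ref = "the cat sat", "the cat sat down"
word_score = bleu(hyp.split(), ref.split())  # word n-grams
char_score = bleu(list(hyp), list(ref))      # character n-grams
```

Note how the unit matters: the three-word hypothesis has no word 4-grams at all, so word-level BLEU collapses to 0, while at the character level every n-gram of the hypothesis occurs in the reference.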
Moreau, Fabienne. "Revisiter le couplage traitement automatique des langues et recherche d'information". PhD thesis, Université Rennes 1, 2006. http://tel.archives-ouvertes.fr/tel-00524514.
Bouamor, Houda. "Etude de la paraphrase sous-phrastique en traitement automatique des langues". PhD thesis, Université Paris Sud - Paris XI, 2012. http://tel.archives-ouvertes.fr/tel-00717702.
Texto completoLe, Berre Guillaume. "Vers la mitigation des biais en traitement neuronal des langues". Electronic Thesis or Diss., Université de Lorraine, 2023. http://www.theses.fr/2023LORR0074.
It is well known that deep learning models are sensitive to biases that may be present in the data used for training. These biases, which can be defined as information that is useless or detrimental to the task in question, can be of different kinds: one can, for example, find biases in the writing styles used, but also much more problematic biases relating to the sex or ethnic origin of individuals. These biases can come from different sources, such as the annotators who created the databases, or from the annotation process itself. My thesis deals with the study of these biases and, in particular, is organized around mitigating the effects of biases on the training of Natural Language Processing (NLP) models. I worked extensively with pre-trained models such as BERT, RoBERTa or UnifiedQA, which have become essential in recent years in all areas of NLP and which, despite their extensive pre-training, are very sensitive to these bias problems. My thesis is organized in three parts, each presenting a different way of managing the biases present in the data. The first part presents a method that uses the biases present in an automatic summarization dataset to increase the variability and controllability of the generated summaries. In the second part, I address the automatic generation of a training dataset for the multiple-choice question-answering task. The advantage of such a generation method is that it does not rely on annotators and therefore eliminates the biases they introduce into the data. Finally, I focus on training a multitask model for optical text recognition. I show in this last part that it is possible to increase the performance of our models by using different types of data (handwritten and typed) during their training.
Filhol, Michael. "Modèle descriptif des signes pour un traitement automatique des langues des signes". PhD thesis, Université Paris Sud - Paris XI, 2008. http://tel.archives-ouvertes.fr/tel-00300591.
Azili, Abrak Saida. "Une architecture logicielle pour un systeme de traitement automatique de la langue : cas du systeme criss-tal". Grenoble 2, 1991. http://www.theses.fr/1991GRE29048.
This work aims at contributing to the definition and design of the ASTAL architecture for the CRISS-TAL system, an automatic processing system for written French used especially for automatic indexing. ASTAL is designed as a flexible and open frame structure around a nucleus: the object management system (OMS). A prototype of the nucleus (OMS) was realised using the Prolog-CRISS programming language. Specifications in terms of user help are also proposed for ASTAL.
Charnois, Thierry. "Accès à l'information : vers une hybridation fouille de données et traitement automatique des langues". Habilitation à diriger des recherches, Université de Caen, 2011. http://tel.archives-ouvertes.fr/tel-00657919.
Beust, Pierre. "Pour une démarche centrée sur l'utilisateur dans les ENT. Apport au Traitement Automatique des Langues". Habilitation à diriger des recherches, Université de Caen, 2013. http://tel.archives-ouvertes.fr/tel-01070522.
Texto completoDuran, Maximiliano. "Dictionnaire électronique français-quechua des verbes pour le TAL". Thesis, Bourgogne Franche-Comté, 2017. http://www.theses.fr/2017UBFCC006/document.
The automatic processing of the Quechua language (APQL) lacks an electronic French-Quechua dictionary of verbs. However, any NLP project requires this important linguistic resource. The present thesis proposes such a dictionary. The realization of such a resource could also open new perspectives in different domains such as multilingual access to information and distance learning, in the areas of annotation/indexing of documents and spelling correction, and eventually in machine translation. The first challenge was the choice of the French dictionary which would be used as our basic reference. Among the numerous French dictionaries, very few are available in an electronic format, and even fewer may be used as open source. Among the latter, we found the dictionary Les verbes français (LVF), by Jean Dubois and Françoise Dubois-Charlier, published by Larousse in 1997. It is a remarkably complete dictionary: it contains 25,610 verbal senses, comes with an open-source license, and is entirely compatible with the NooJ platform. That is why we chose this dictionary as the one to translate into Quechua. However, this task faces a considerable obstacle: the Quechua lexicon of simple verbs contains around 1,500 entries. How to match 25,610 French verbal senses with only 1,500 Quechua verbs? Are we condemned to produce many polysemies? For example, LVF has 27 verbal senses for the verb "tourner" (to turn); should we translate them all by the Quechua verb muyuy (to turn)? Or can we make use of a particular and remarkable Quechua strategy that may allow us to face this challenge: the generation of new verbs by suffix derivation? As a first step, we inventoried all the Quechua suffixes that make it possible to obtain a derived verbal form which behaves as if it were a simple verb. This set of suffixes, which we call IPS_DRV, contains 27 elements. Thus each Quechua verb, transitive or intransitive, gives rise to at least 27 derived verbs.
Next, we needed to formalize the paradigms and grammars that would allow us to obtain derivations compatible with the morphology of the language. This was done with the help of the NooJ platform. The application of these grammars allowed us to obtain 40,500 conjugable atomic linguistic units (CALU) out of 1,500 simple Quechua verbs. This encouraging first result lets us hope for a favorable outcome of our project of translating the 25,610 French verbal senses into Quechua. At this point, a new difficulty appears: the translation into French of this enormous quantity of generated conjugable verbal forms. This work is essential if we want to obtain the translation of a large part of the twenty-five thousand French verbal senses into Quechua. In order to obtain the translation of these CALUs, we first needed to know the modalities of enunciation that each IPS has and transmits to the verbal radical when agglutinated to it. Each suffix can have several modalities of enunciation. We obtained an inventory of them from the corpus, our own experience, and some recordings obtained in fieldwork, and constructed an indexed table containing all of these modalities. Next, we used NooJ operators to program grammars that produce an automatic translation into a glossed form of the enunciation modalities. Finally, we developed an algorithm that allowed us to obtain the reciprocal translation from French to Quechua of more than 8,500 verbal senses of level 3 and a number of verbal senses of levels 4 and 5.
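The suffix-derivation step described above can be sketched as plain agglutination of verbalizing suffixes onto verb radicals. The three suffixes shown are a tiny illustrative subset of the 27-element IPS_DRV set (their glosses are common descriptions of these Quechua suffixes, not the thesis's inventory); with 27 suffixes per radical, the 1,500 simple verbs indeed yield 1,500 × 27 = 40,500 derived units.

```python
# Illustrative subset of derivational suffixes (IPS_DRV has 27 in the thesis).
IPS_DRV_SAMPLE = {
    "chi": "causative (make someone V)",
    "ku": "reflexive (V oneself)",
    "mu": "cislocative (V toward the speaker)",
}

def derive(infinitive):
    """Agglutinate each derivational suffix onto a verb radical.

    Quechua infinitives end in -y (e.g. muyuy 'to turn'); the suffix is
    inserted between the radical and the infinitive marker.
    """
    stem = infinitive[:-1] if infinitive.endswith("y") else infinitive
    return {stem + suffix + "y": gloss for suffix, gloss in IPS_DRV_SAMPLE.items()}

derived = derive("muyuy")   # e.g. muyuchiy 'to make (something) turn'
total_calu = 1500 * 27      # scaling up: 40,500 derived units
```

The dictionary returned for muyuy contains one derived infinitive per suffix, mirroring how each simple verb fans out into (at least) 27 derived verbs.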
Stroppa, Nicolas. "Définitions et caractérisations de modèles à base d'analogies pour l'apprentissage automatique des langues naturelles". PhD thesis, Télécom ParisTech, 2005. http://tel.archives-ouvertes.fr/tel-00145147.
In the context of machine learning over linguistic data, alternative inferential models have been proposed that question the principle of abstraction carried out by rules or probabilistic models. Under this view, linguistic knowledge remains implicitly represented in the accumulated corpus. In Machine Learning, methods following the same principles are grouped under the label of "lazy learning". These methods generally rely on the following learning bias: if an object Y is "close" to an object X, then its analysis f(Y) is a good candidate for f(X). While this hypothesis is justified for the applications usually addressed in Machine Learning, the structured nature and paradigmatic organization of linguistic data suggest a slightly different approach. To account for this particularity, we study a model based on the notion of "analogical proportion". In this model, the analysis f(T) of a new object T is obtained by identifying an analogical proportion with objects X, Y and Z that are already known. The analogical hypothesis thus postulates that if X : Y :: Z : T, then f(X) : f(Y) :: f(Z) : f(T). To infer f(T) from the known f(X), f(Y), f(Z), one solves the "analogical equation" with unknown I: f(X) : f(Y) :: f(Z) : I.
In the first part of this work, we present a study of this analogical proportion model within a more general framework that we call "learning by analogy". This framework is instantiated in a number of contexts: in cognitive science, it corresponds to analogical reasoning, an essential faculty at the heart of many cognitive processes; in traditional linguistics, it underpins a number of mechanisms such as analogical creation, opposition and commutation; in machine learning, it corresponds to the family of lazy learning methods. This perspective sheds light on the nature of the model and its underlying mechanisms.
The second part of our work proposes a unified algebraic framework defining the notion of analogical proportion. Starting from a model of analogical proportion between strings of symbols, elements of a free monoid, we present an extension to the more general case of semigroups. This generalization leads directly to a definition valid for all sets deriving from the semigroup structure, thus allowing the modelling of analogical proportions between common representations of linguistic entities such as strings of symbols, trees, feature structures and finite languages. Algorithms suited to the processing of analogical proportions between such structured objects are presented. We also propose some directions for enriching the model, so as to allow its use in more complex cases.
The inferential model under study, motivated by needs in Natural Language Processing, is then explicitly interpreted as a Machine Learning method. This formalization made it possible to highlight several of its characteristic features. A notable particularity of the model lies in its capacity to handle structured objects, both as input and as output, whereas the classical classification task generally assumes an output space consisting of a finite set of classes. We then show how to express the learning bias of the method through the notion of analogical extension. Finally, we conclude by presenting experimental results from the application of our model to several Natural Language Processing tasks: orthographic/phonetic transcription, inflectional analysis and derivational analysis.
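The analogical equation X : Y :: Z : ? can be illustrated on strings with a deliberately naive solver: assume the alternation between X and Y is a single prefix/suffix substitution and apply it to Z. This is a toy sketch only; the thesis's semigroup-based algorithms handle far more general proportions.

```python
def solve_analogy(x, y, z):
    """Naively solve the string analogical equation x : y :: z : ?.

    Assumes x and y differ by one contiguous edit outside their common
    prefix/suffix, and applies the same edit to z.
    """
    # Longest common prefix of x and y.
    p = 0
    while p < min(len(x), len(y)) and x[p] == y[p]:
        p += 1
    # Longest common suffix of x and y, not overlapping the prefix.
    s = 0
    while s < min(len(x), len(y)) - p and x[len(x) - 1 - s] == y[len(y) - 1 - s]:
        s += 1
    x_mid = x[p:len(x) - s]   # the part of x that alternates
    y_mid = y[p:len(y) - s]   # what it alternates with in y
    if not x_mid:             # x is y minus an affix: insert it into z
        return z[:p] + y_mid + z[p:]  # naive: assumes z shares the prefix length
    if x_mid in z:
        return z.replace(x_mid, y_mid, 1)
    return None               # no proportion found under this naive assumption

result = solve_analogy("walk", "walked", "jump")  # walk : walked :: jump : ?
```

Here the solver recovers "jumped", and the reverse direction (`"looking" : "look" :: "jumping" : ?`) yields "jump"; real analogical learning then transfers such proportions from forms to their analyses f(·).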
Nie, Shuling. "Enseignement de français à un public chinois constitué sur un modèle TAL implanté sur internet". Besançon, 2002. http://www.theses.fr/2002BESA1005.
Boulaknadel, Siham. "Traitement Automatique des Langues et Recherche d'Information en langue arabe dans un domaine de spécialité : Apport des connaissances morphologiques et syntaxiques pour l'indexation". PhD thesis, Université de Nantes, 2008. http://tel.archives-ouvertes.fr/tel-00479982.
Perez, Laura Haide. "Génération automatique de phrases pour l'apprentissage des langues". Electronic Thesis or Diss., Université de Lorraine, 2013. http://www.theses.fr/2013LORR0062.
In this work, we explore how Natural Language Generation (NLG) techniques can be used to address the task of (semi-)automatically generating language learning material and activities in Computer-Assisted Language Learning (CALL). In particular, we show how a grammar-based Surface Realiser (SR) can be usefully exploited for the automatic creation of grammar exercises. Our surface realiser uses a wide-coverage reversible grammar, namely SemTAG, which is a Feature-Based Tree Adjoining Grammar (FB-TAG) equipped with a unification-based compositional semantics. More precisely, the FB-TAG grammar integrates a flat and underspecified representation of First Order Logic (FOL) formulae. In the first part of the thesis, we study the task of surface realisation from flat semantic formulae and we propose an optimised FB-TAG-based realisation algorithm that supports the generation of longer sentences given a large-scale grammar and lexicon. The approach followed to optimise TAG-based surface realisation from flat semantics draws on the fact that an FB-TAG can be translated into a Feature-Based Regular Tree Grammar (FB-RTG) describing its derivation trees. The derivation tree language of TAG constitutes a simpler language than the derived tree language, and thus generation approaches based on derivation trees have already been proposed. Our approach departs from previous ones in that our FB-RTG encoding accounts for the feature structures present in the original FB-TAG, with important consequences regarding over-generation and the preservation of the syntax-semantics interface. The concrete derivation tree generation algorithm that we propose is an Earley-style algorithm integrating a set of well-known optimisation techniques: tabulation, sharing-packing, and semantic-based indexing. In the second part of the thesis, we explore how our SemTAG-based surface realiser can be put to work for the (semi-)automatic generation of grammar exercises.
Usually, teachers manually edit exercises and their solutions, and classify them according to their degree of difficulty or expected learner level. A strand of research in Natural Language Processing (NLP) for CALL addresses the (semi-)automatic generation of exercises. Mostly, this work draws on texts extracted from the Web and uses machine learning and text analysis techniques (e.g. parsing, POS tagging, etc.). These approaches expose the learner to sentences that have a potentially complex syntax and diverse vocabulary. In contrast, the approach we propose in this thesis addresses the (semi-)automatic generation of grammar exercises of the type found in grammar textbooks. In other words, it deals with the generation of exercises whose syntax and vocabulary are tailored to specific pedagogical goals and topics. Because the grammar-based generation approach associates natural language sentences with a rich linguistic description, it permits defining a syntactic and morpho-syntactic constraint specification language for the selection of stem sentences in compliance with a given pedagogical goal. Further, it allows for the post-processing of the generated stem sentences to build grammar exercise items. We show how Fill-in-the-blank, Shuffle and Reformulation grammar exercises can be automatically produced. The approach has been integrated in the Interactive French Learning Game (I-FLEG), a serious game for learning French, and has been evaluated both through interactions with online players and in collaboration with a language teacher.
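The post-processing of a generated stem sentence into exercise items can be sketched as below. The French sentence, the blanked position and the helper names are illustrative stand-ins: in the actual pipeline the stem sentence and the targeted words come from the SemTAG realiser's linguistic annotations, not from hand-picked indices.

```python
import random

def fill_in_the_blank(tokens, target_indices):
    """Turn a tokenised stem sentence into a fill-in-the-blank item.

    target_indices: positions of the words the exercise drills (in a real
    pipeline these would be selected via grammatical annotations, e.g.
    all finite verbs for a conjugation exercise).
    """
    stem = ["____" if i in target_indices else tok
            for i, tok in enumerate(tokens)]
    answers = [tokens[i] for i in sorted(target_indices)]
    return " ".join(stem), answers

def shuffle_exercise(tokens, seed=0):
    """Word-order ('Shuffle') exercise: scramble the sentence; the original
    order is the answer key."""
    rng = random.Random(seed)           # fixed seed: reproducible items
    scrambled = tokens[:]
    rng.shuffle(scrambled)
    return scrambled, " ".join(tokens)

item, key = fill_in_the_blank(["Elle", "mange", "une", "pomme"], {1})
scrambled, answer = shuffle_exercise(["Elle", "mange", "une", "pomme"])
```

`item` reads "Elle ____ une pomme" with `key == ["mange"]`; the Reformulation type would instead pair two realiser outputs sharing the same semantics.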
Arnulphy, Béatrice. "Désignations nominales des événements : étude et extraction automatique dans les textes". PhD thesis, Université Paris Sud - Paris XI, 2012. http://tel.archives-ouvertes.fr/tel-00758062.
Texto completoPerez, Laura Haide. "Génération automatique de phrases pour l'apprentissage des langues". Thesis, Université de Lorraine, 2013. http://www.theses.fr/2013LORR0062/document.
Texto completoIn this work, we explore how Natural Language Generation (NLG) techniques can be used to address the task of (semi-)automatically generating language learning material and activities in Camputer-Assisted Language Learning (CALL). In particular, we show how a grammar-based Surface Realiser (SR) can be usefully exploited for the automatic creation of grammar exercises. Our surface realiser uses a wide-coverage reversible grammar namely SemTAG, which is a Feature-Based Tree Adjoining Grammar (FB-TAG) equipped with a unification-based compositional semantics. More precisely, the FB-TAG grammar integrates a flat and underspecified representation of First Order Logic (FOL) formulae. In the first part of the thesis, we study the task of surface realisation from flat semantic formulae and we propose an optimised FB-TAG-based realisation algorithm that supports the generation of longer sentences given a large scale grammar and lexicon. The approach followed to optimise TAG-based surface realisation from flat semantics draws on the fact that an FB-TAG can be translated into a Feature-Based Regular Tree Grammar (FB-RTG) describing its derivation trees. The derivation tree language of TAG constitutes a simpler language than the derived tree language, and thus, generation approaches based on derivation trees have been already proposed. Our approach departs from previous ones in that our FB-RTG encoding accounts for feature structures present in the original FB-TAG having thus important consequences regarding over-generation and preservation of the syntax-semantics interface. The concrete derivation tree generation algorithm that we propose is an Earley-style algorithm integrating a set of well-known optimisation techniques: tabulation, sharing-packing, and semantic-based indexing. In the second part of the thesis, we explore how our SemTAG-based surface realiser can be put to work for the (semi-)automatic generation of grammar exercises. 
Usually, teachers manually edit exercises and their solutions, and classify them according to the degree of dificulty or expected learner level. A strand of research in (Natural Language Processing (NLP) for CALL addresses the (semi-)automatic generation of exercises. Mostly, this work draws on texts extracted from the Web, use machine learning and text analysis techniques (e.g. parsing, POS tagging, etc.). These approaches expose the learner to sentences that have a potentially complex syntax and diverse vocabulary. In contrast, the approach we propose in this thesis addresses the (semi-)automatic generation of grammar exercises of the type found in grammar textbooks. In other words, it deals with the generation of exercises whose syntax and vocabulary are tailored to specific pedagogical goals and topics. Because the grammar-based generation approach associates natural language sentences with a rich linguistic description, it permits defining a syntactic and morpho-syntactic constraints specification language for the selection of stem sentences in compliance with a given pedagogical goal. Further, it allows for the post processing of the generated stem sentences to build grammar exercise items. We show how Fill-in-the-blank, Shuffle and Reformulation grammar exercises can be automatically produced. The approach has been integrated in the Interactive French Learning Game (I-FLEG) serious game for learning French and has been evaluated both based in the interactions with online players and in collaboration with a language teacher
Cadilhac, Anaïs. "Preference extraction and reasoning in negotiation dialogues". Toulouse 3, 2013. http://thesesups.ups-tlse.fr/2168/.
Modelling user preferences is crucial in many real-life problems, ranging from individual and collective decision-making to strategic interactions between agents, for example. But handling preferences is not easy. Since agents do not come with their preferences transparently given in advance, we have only two means to determine what they are if we wish to exploit them in reasoning: we can infer them from what an agent says or from his non-linguistic actions. Preference acquisition from non-linguistic actions has been widely studied within the Artificial Intelligence community. However, to our knowledge, there has been little work so far investigating how preferences can be efficiently elicited from users using Natural Language Processing (NLP) techniques. In this work, we propose a new approach to extract and reason on preferences expressed in negotiation dialogues. After having extracted the preferences expressed in each dialogue turn, we use the discourse structure to follow their evolution as the dialogue progresses. We use CP-nets, a model for the representation of preferences, to formalize and reason about these extracted preferences. The method is first evaluated on different negotiation corpora, for which we obtain promising results. We then apply the end-to-end method, together with principles from Game Theory, to predict trades in the win-lose game The Settlers of Catan. Our method shows good results, beating baselines that do not adequately track or reason about preferences. This work thus presents a new approach at the intersection of several research domains: Natural Language Processing (for the automatic preference extraction and the reasoning on their verbalisation), Artificial Intelligence (for the modelling and reasoning on the extracted preferences) and Game Theory (for strategic action prediction in a bargaining game).
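A CP-net, the preference model mentioned above, is essentially a table of conditional "ceteris paribus" preference statements: which value of a variable an agent prefers, given the values of its parent variables. The toy negotiation example below (resource names, variables, orderings) is invented for illustration and is not the thesis's extracted data.

```python
# Toy CP-net: which resource the agent wants depends on what it holds.
# cpnet maps a variable to (parent variables, conditional preference table);
# each table entry maps a parent assignment to a most-to-least-preferred order.
cpnet = {
    "wants": (("holds",), {
        ("wood",):  ["clay", "sheep"],   # holding wood: prefer clay over sheep
        ("clay",):  ["wood", "sheep"],
        ("sheep",): ["wood", "clay"],
    }),
}

def preferred(cpnet, variable, assignment):
    """Most preferred value of `variable` given an assignment of its
    parents (all else being equal, per CP-net semantics)."""
    parents, table = cpnet[variable]
    key = tuple(assignment[p] for p in parents)
    return table[key][0]

best = preferred(cpnet, "wants", {"holds": "wood"})
```

Reading off `best` predicts which trade offer the agent should accept; chaining such lookups over the dialogue's discourse structure is what lets the extracted preferences evolve turn by turn.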
Sébillot, Pascale. "Apprentissage sur corpus de relations lexicales sémantiques - La linguistique et l'apprentissage au service d'applications du traitement automatique des langues". Habilitation à diriger des recherches, Université Rennes 1, 2002. http://tel.archives-ouvertes.fr/tel-00533657.
Weissenbacher, Davy. "Influence des annotations imparfaites sur les systèmes de Traitement Automatique des Langues, un cadre applicatif : la résolution de l'anaphore pronominale". PhD thesis, Université Paris-Nord - Paris XIII, 2008. http://tel.archives-ouvertes.fr/tel-00641504.
Kervajan, Loïc. "Contribution à la traduction automatique Français/Langue des Signes Française (LSF) au moyen de personnages virtuels". PhD thesis, Université de Provence - Aix-Marseille I, 2011. http://tel.archives-ouvertes.fr/tel-00697726.
Texto completoPham, Thi Nhung. "Résolution des anaphores nominales pour la compréhension automatique des textes". Thesis, Sorbonne Paris Cité, 2017. http://www.theses.fr/2017USPCD049/document.
In order to facilitate the interpretation of texts, this thesis is devoted to the development of a system to identify and resolve indirect nominal anaphora and associative anaphora. Resolution of indirect nominal anaphora is based on calculating the salience weights of candidate antecedents, with the purpose of associating these antecedents with the identified anaphoric expressions. It is processed by two different methods based on a linguistic approach: the first method uses lexical and morphological parameters; the second uses morphological and syntactic parameters. The resolution of associative anaphora is based on syntactic and semantic parameters. The results obtained are encouraging: 90.6% for indirect anaphora resolution with the first method, 75.7% for indirect anaphora resolution with the second method, and 68.7% for associative anaphora resolution. These results show the contribution of each parameter used and the utility of this system in the automatic interpretation of texts.
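The salience-weight calculation described above can be sketched as a weighted sum of binary features over candidate antecedents, with the highest-scoring candidate chosen. The feature names and weights below are illustrative classic salience factors, not the thesis's calibrated lexical/morphological/syntactic parameters.

```python
def rank_antecedents(candidates, weights):
    """Score candidate antecedents by a weighted sum of their active
    features and return them sorted from most to least salient."""
    def salience(cand):
        return sum(weights[f] for f, active in cand["features"].items()
                   if active and f in weights)
    return sorted(candidates, key=salience, reverse=True)

# Illustrative weights for a few common salience factors.
weights = {"recent": 2.0, "subject": 1.5, "head_noun_match": 3.0}

candidates = [
    {"np": "the report",
     "features": {"recent": True, "subject": False, "head_noun_match": True}},
    {"np": "the committee",
     "features": {"recent": False, "subject": True, "head_noun_match": False}},
]
best = rank_antecedents(candidates, weights)[0]["np"]
```

With these weights "the report" (5.0) outranks "the committee" (1.5) and is linked to the anaphoric expression; swapping in a different parameter set is how the two methods of the thesis would differ.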
Ramisch, Carlos Eduardo. "Un environnement générique et ouvert pour le traitement des expressions polylexicales : de l'acquisition aux applications". PhD thesis, Université de Grenoble, 2012. http://tel.archives-ouvertes.fr/tel-00859910.
El Ayari, Sarra. "Évaluation transparente du traitement des éléments de réponse à une question factuelle". PhD thesis, Université Paris Sud - Paris XI, 2009. http://tel.archives-ouvertes.fr/tel-00618355.
Segouat, Jérémie. "Modélisation de la coarticulation en Langue des Signes Française pour la diffusion automatique d'informations en gare ferroviaire à l'aide d'un signeur virtuel". PhD thesis, Université Paris Sud - Paris XI, 2010. http://tel.archives-ouvertes.fr/tel-00602117.
Audibert, Laurent. "Outils d'exploration de corpus et désambiguïsation lexicale automatique". PhD thesis, Université de Provence - Aix-Marseille I, 2003. http://tel.archives-ouvertes.fr/tel-00004475.
Cailliau, Frederik. "Des ressources aux traitements linguistiques : le rôle d'une architecture linguistique". PhD thesis, Université Paris-Nord - Paris XIII, 2010. http://tel.archives-ouvertes.fr/tel-00546798.
Ignat, Camelia. "Amélioration de l'alignement et de la traduction statistique par utilisation de corpus parallèles multilingues". PhD thesis, Université de Strasbourg, 2009. http://tel.archives-ouvertes.fr/tel-00405733.
Buet, François. "Modèles neuronaux pour la simplification de parole, application au sous-titrage". Electronic Thesis or Diss., université Paris-Saclay, 2022. https://theses.hal.science/tel-03920729.
In the context of linguistics, simplification is generally defined as the process of reducing the complexity of a text (or speech) while preserving its meaning as much as possible. Its primary application is to make understanding and reading easier for a user. It is regarded, inter alia, as a way to enhance the legibility of texts for deaf and hard-of-hearing people (deafness often causes a delay in reading development), in particular in the case of subtitling. While interlingual subtitles are used to disseminate movies and programs in other languages, intralingual subtitles (or captions) are the only means, along with sign language interpretation, by which the deaf and hard-of-hearing can access audio-visual contents. Yet videos have taken a prominent place in society, whether for work, recreation, or education. In order to ensure the equality of people through participation in public and social life, many countries in the world (including France) have implemented legal obligations concerning the subtitling of television programs. ROSETTA (Subtitling RObot and Adapted Translation) is a public-private collaborative research program seeking to develop technological accessibility solutions for audio-visual content in French. This thesis, conducted within the ROSETTA project, aims to study automatic speech simplification with neural models and to apply it to intralingual subtitling for French television programs. Our work mainly focuses on analysing length-control methods, adapting subtitling models to television genres, and evaluating subtitle segmentation. We notably present a new subtitling corpus created from data collected as part of the ROSETTA project, as well as a new metric for subtitle evaluation, Sigma.
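The length control discussed above is commonly expressed, in subtitling practice, as limits on characters per line, lines per subtitle, and reading speed in characters per second. The sketch below checks a subtitle against such constraints; the numeric limits are typical broadcast values chosen for illustration, not the ROSETTA project's settings.

```python
def check_subtitle(text, duration_s, max_chars_per_line=37,
                   max_lines=2, max_cps=15.0):
    """Check a subtitle against common length-control constraints:
    line length, line count, and reading speed (characters per second)."""
    lines = text.split("\n")
    cps = sum(len(line) for line in lines) / duration_s
    return (len(lines) <= max_lines
            and all(len(line) <= max_chars_per_line for line in lines)
            and cps <= max_cps)

# A two-line subtitle shown for two seconds: 27 chars -> 13.5 cps, within limits.
ok = check_subtitle("Il pleut sur Paris\nce matin.", duration_s=2.0)
```

A simplification model that compresses the transcript is what makes such constraints satisfiable without dropping whole utterances; a constraint checker like this one is the evaluation-side counterpart.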
Tanguy, Ludovic. "Complexification des données et des techniques en linguistique : contributions du TAL aux solutions et aux problèmes". Habilitation à diriger des recherches, Université Toulouse le Mirail - Toulouse II, 2012. http://tel.archives-ouvertes.fr/tel-00734493.
Full text
Nguyen, Thi Minh Huyen. "Outils et ressources linguistiques pour l'alignement de textes multilingues français-vietnamiens". PhD thesis, Université Henri Poincaré - Nancy I, 2006. http://tel.archives-ouvertes.fr/tel-00105592.
Full text
Ramisch, Carlos Eduardo. "Un environnement générique et ouvert pour le traitement des expressions polylexicales : de l'acquisition aux applications". PhD thesis, Université de Grenoble, 2012. http://tel.archives-ouvertes.fr/tel-00741147.
Full text
Bove, Rémi. "Analyse syntaxique automatique de l'oral : étude des disfluences". PhD thesis, Université de Provence - Aix-Marseille I, 2008. http://tel.archives-ouvertes.fr/tel-00647900.
Full text
Sam, Sethserey. "Vers une adaptation autonome des modèles acoustiques multilingues pour le traitement automatique de la parole". PhD thesis, Université de Grenoble, 2011. http://tel.archives-ouvertes.fr/tel-00685204.
Full text
Falco, Mathieu-Henri. "Répondre à des questions à réponses multiples sur le Web". PhD thesis, Université Paris Sud - Paris XI, 2014. http://tel.archives-ouvertes.fr/tel-01015869.
Full text
Retoré, Christian. "Logique linéaire et syntaxe des langues". Habilitation à diriger des recherches, Université de Nantes, 2002. http://tel.archives-ouvertes.fr/tel-00354041.
Full text
Laignelet, Marion. "Analyse discursive pour le repérage automatique de segments obsolescents dans des documents encyclopédiques". PhD thesis, Université Toulouse le Mirail - Toulouse II, 2009. http://tel.archives-ouvertes.fr/tel-00461579.
Full text
Ji, Hyungsuk. "Étude d'un modèle computationnel pour la représentation du sens des mots par intégration des relations de contexte". PhD thesis, Grenoble INPG, 2004. http://tel.archives-ouvertes.fr/tel-00008384.
Full text
Lecorvé, Gwénolé. "Adaptation thématique non supervisée d'un système de reconnaissance automatique de la parole". PhD thesis, INSA de Rennes, 2010. http://tel.archives-ouvertes.fr/tel-00566824.
Texto completoKostov, Jovan. "Le verbe macédonien : pour un traitement informatique de nature linguistique et applications didactiques (réalisation d'un conjugueur)". Institut National des Langues et Civilisations Orientales, 2013. http://www.theses.fr/2013INAL0033.
Texto completoAfter the standardization of the Macedonian language in 1945, the description of its current standard variety has been carried out by several generations of experts working – most often – in Macedonian institutions. The fact that several manuals were published is an undeniable proof of significant efforts made to describe the Macedonian verbal system and yet, today verbs represent the least exploited word-category. Inflexion rules cannot envisage all possible models of the Macedonian conjugaison and their approach is too synthetic to be fully operational from a didactic point of view. For all these reasons, the purpose of this doctoral thesis is to study a large number of conjugated verbs in order to map stable patterns opening up new forays into the teaching of the Macedonian verbal system. Moreover, these patterns are used to produce computational models resulting in an automatized conjugation tool which derives paradigms from the lexical verbal forms : FlexiMac 1. 1
Claveau, Vincent. "Acquisition automatique de lexiques sémantiques pour la recherche d'information". Phd thesis, Université Rennes 1, 2003. http://tel.archives-ouvertes.fr/tel-00524646.
Full text
Guinaudeau, Camille. "Structuration automatique de flux télévisuels". PhD thesis, INSA de Rennes, 2011. http://tel.archives-ouvertes.fr/tel-00646522.
Full text
Daoud, Mohammad. "Utilisation de ressources non conventionnelles et de méthodes contributives pour combler le fossé terminologique entre les langues en développant des "préterminologies" multilingues". PhD thesis, Grenoble, 2010. http://tel.archives-ouvertes.fr/tel-00583682.
Full text
Loiseau, Mathieu. "Elaboration d'un modèle pour une base de textes indexée pédagogiquement pour l'enseignement des langues". Grenoble 3, 2009. https://tel.archives-ouvertes.fr/tel-00440460v3.
Full text
This PhD thesis deals with the notion of pedagogical indexation and tackles it from the point of view of searching for and selecting texts for language teaching. This problem is first set in the field of Computer Assisted Language Learning (CALL) and of the potential contribution of Natural Language Processing (NLP) to this discipline, before being considered within the scope of elements more directly relevant to language didactics, in order to propose an empirical approach. The latter is justified by the inadequacy of current description standards for pedagogical resources when it comes to modeling raw objects in a consistent fashion, which is particularly true for texts in the context of language learning. The thesis subsequently revolves around two questionnaires, the aim of which is to provide insight into language teachers' declared practices regarding searching for and selecting texts when planning classes. The first questionnaire provides data to formalize the notion of pedagogical context, which is later considered through some of its components thanks to the second questionnaire. These first formalization drafts then provide foundations for the definition of a model aiming to take into account the contextuality of the properties said to be pedagogical, which is inherent to raw resources. Finally, possible leads for implementing this model are suggested through the description of a computerized system.
Falaise, Achille. "Conception et prototypage d'un outil web de médiation et d'aide au dialogue tchaté écrit en langue seconde". Phd thesis, Université Joseph Fourier (Grenoble), 2009. http://tel.archives-ouvertes.fr/tel-00442754.
Full text
Viale, Greta. "Auxiliary selection in Italian and French : a comparative study of the so-called peripheral verbs". Electronic Thesis or Diss., Sorbonne Université, 2025. http://www.theses.fr/2025SORUL020.
Full text
This thesis investigates the intricate phenomenon of auxiliary selection in two Romance languages, Italian and French. The primary objective is to elucidate the factors that influence the choice between the auxiliaries 'be' and 'have' in the formation of the perfect tense. The study focuses on verbs that can select both auxiliaries, commonly known as peripheral verbs (Sorace 2000), which, despite extensive individual examination, have not been comprehensively analyzed (Giancarli 2015). The central research questions addressed are: What characteristics enable these verbs to select both auxiliaries? Which factors determine the predominance of one auxiliary over the other? What is the relative weight of factors such as agentivity and telicity in auxiliary selection (Sorace 2000)? For the first time, this research systematically explores auxiliary selection in Italian and French using corpus analysis and natural language processing (NLP). By integrating these methods, the study aims to identify the most significant factors influencing auxiliary choice in intransitive verbs with double auxiliation. The research combines qualitative analysis of manually annotated occurrences from SketchEngine (Kilgarriff et al., 2014) with quantitative analysis using statistical models to determine the most significant parameters in auxiliary selection. The findings reveal the paramount importance of semantic, syntactic, and morphological aspects in the choice of 'be' or 'have'. Notably, telicity is found to be less relevant for these verbs. The study also highlights significant differences between Italian and French. Italian verbs are categorized into full verbs and semi-auxiliaries. For full verbs, internal cause and human traits are crucial factors in auxiliary selection. For semi-auxiliary verbs, the type of infinitive and the human trait associated with particular infinitives are shown to be significant.
In French, the type of construction heavily influences auxiliary choice. By providing comprehensive answers to previously unexplored areas, this study aligns with and extends the existing literature. It significantly enhances our understanding of verb categorization and auxiliary selection, with substantial implications for both theoretical and applied linguistics. Furthermore, it underscores the importance of integrative methodological approaches for analyzing complex linguistic phenomena.
Malaisé, Véronique. "Méthodologie linguistique et terminologique pour la structuration d'ontologies différentielles à partir de corpus textuels". PhD thesis, Université Paris-Diderot - Paris VII, 2005. http://tel.archives-ouvertes.fr/tel-00162575.
Full text
Even, Fabrice. "Extraction d'Information et modélisation de connaissances à partir de Notes de Communication Orale". PhD thesis, Université de Nantes, 2005. http://tel.archives-ouvertes.fr/tel-00109400.
Texto completoLes Notes de Communication Orale sont des textes issus de prises de notes réalisées lors d'une communication orale (entretien, réunion, exposé, etc.) et dont le but est de synthétiser le contenu informatif de la communication. Leurs contraintes de rédaction (rapidité et limitation de la quantité d'écrits) sont à l'origine de particularités linguistiques auxquelles sont mal adaptées les méthodes classiques de Traitement Automatique des Langues et d'Extraction d'Information. Aussi, bien qu'elles soient riches en informations, elles ne sont pas exploitées par les systèmes extrayant des informations à partir de textes.
In this thesis, we propose an extraction method adapted to Notes de Communication Orale. This method, named MEGET, is based on an ontology modeling the knowledge contained in the texts that is relevant to the information sought (an "extraction ontology"). This ontology is built by unifying a "needs ontology", which describes the information to be extracted, with a "term ontology", which conceptualizes the terms of the corpus related to this information. The term ontology is derived from a terminology extracted from the texts and enriched with terms from specialized documents. The extraction ontology is represented as a set of formal rules supplied as a knowledge base to the SYGET extraction system. This system first tags the instances of the extraction ontology elements present in the texts, then extracts the information sought. The approach is validated on several corpora.
Andreani, Vanessa. "Immersion dans des documents scientifiques et techniques : unités, modèles théoriques et processus". Phd thesis, Université de Grenoble, 2011. http://tel.archives-ouvertes.fr/tel-00662668.
Full text
Lebranchu, Julien. "Étude des phénomènes itératifs en langue : Inscription discursive et Calcul aspectuo-temporel, vers un traitement automatisé". PhD thesis, Université de Caen, 2011. http://tel.archives-ouvertes.fr/tel-00664788.
Full text