
Dissertations / Theses on the topic 'Modèle lexical'


Consult the top 50 dissertations / theses for your research on the topic 'Modèle lexical.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Jousse, Anne-Laure. "Modèle de structuration des relations lexicales fondé sur le formalisme des fonctions lexicales." Thèse, Paris 7, 2010. http://hdl.handle.net/1866/4347.

Full text
Abstract:
This thesis proposes a model for structuring lexical relations, based on the concept of lexical functions (LFs) put forward in Meaning-Text Theory [Mel'cuk, 1997]. The lexical relations taken into account include semantic derivations and collocations as defined within this theoretical framework, known as Explanatory and Combinatorial Lexicology [Mel'cuk et al., 1995]. Starting from the observation that lexical relations are neither encoded nor made available in lexical databases in an entirely satisfactory manner, we argue for the necessity of designing a new model for structuring them. First of all, we justify the relevance of devising a system of lexical functions rather than a simple classification. Next, we present the four perspectives developed in the system: a semantic perspective, a combinatorial one, another targeting the parts of speech of the elements involved in a lexical relation, and, finally, one emphasizing which element of the relation is focused on. This system covers all LFs, including non-standard ones, for which we propose a normalized encoding. Our system has been implemented in the DiCo relational database. We present three applications in which it can be exploited. First, it can be used to build browsing interfaces for lexical databases such as the DiCo. It can also be consulted directly as a tool to assist lexicographers in encoding lexical relations by means of lexical functions. Finally, it constitutes a reference for computing lexicographic information which will, in future work, be implemented in order to automatically fill in certain fields of entries in lexical databases.
Thesis completed under joint supervision (cotutelle) with Université Paris Diderot (Paris 7)
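The abstract above describes relational records organized along four perspectives. As a purely illustrative aid (not the DiCo schema or Jousse's actual model), the sketch below shows how lexical-function records of this kind could be stored and queried; all field names and example relations are invented.

```python
# A minimal sketch (not the DiCo schema) of how lexical-function records
# such as Mel'cuk's Magn or Oper1 could be stored and queried.
# The perspective-like labels loosely mirror the axes named in the abstract.

from dataclasses import dataclass

@dataclass(frozen=True)
class LexicalRelation:
    keyword: str        # the base lexical unit
    function: str       # lexical function name, e.g. "Magn", "Oper1"
    value: str          # the collocate or derivative
    pos: str            # part of speech of the value
    focus: str          # which element the relation focuses on

RELATIONS = [
    LexicalRelation("fumeur", "Magn", "invétéré", "Adj", "keyword"),   # heavy smoker
    LexicalRelation("attention", "Oper1", "prêter", "V", "value"),     # pay attention
]

def values_of(function: str, keyword: str):
    """Return all values linked to `keyword` by the lexical function `function`."""
    return [r.value for r in RELATIONS if r.function == function and r.keyword == keyword]

print(values_of("Magn", "fumeur"))  # ['invétéré']
```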
APA, Harvard, Vancouver, ISO, and other styles
2

Quang, Vu Minh. "Exploitation de la prosodie pour la segmentation et l'analyse automatique de signaux de parole." Grenoble INPG, 2007. http://www.theses.fr/2007INPG0104.

Full text
Abstract:
This thesis work lies at the frontier between multimedia information retrieval and automatic speech processing. In recent years, a new task has emerged in speech processing: the rich transcription of an audio document. Among the extra-linguistic information carried by speech, an important piece of metadata for rich transcription is the sentence type (i.e., whether a spoken sentence is interrogative, affirmative, or something else). The main subject of this research is the prosodic difference between affirmative and interrogative sentences in French and Vietnamese, the automatic detection and classification of sentence type in each of the two languages, and the comparison of the strategies specific to each language. We began with a study of French, building a system for the segmentation and automatic detection of sentence type based on both prosodic and lexical information. The system was validated on real-world spontaneous speech corpora: recordings of telephone conversations between customers and a tourism agency, job interviews, and project meetings. After this first study on French, we extended our research to Vietnamese, a language for which all studies of the prosodic system are still preliminary. We first carried out a study to identify the prosodic differences between interrogative and affirmative sentences at both the production and the perception level. Then, based on these results, a classification engine was built.
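To make the combination of prosodic and lexical information concrete, here is a minimal, hypothetical sketch of a sentence-type detector in the spirit described above; the threshold, the marker list, and the contour representation are assumptions, not the system built in the thesis.

```python
# Illustrative sketch only: a rule combining a prosodic cue (final F0 rise)
# with a lexical cue, in the spirit of the prosodic + lexical system the
# abstract describes. Thresholds and cue lists are invented for the example.

QUESTION_MARKERS = {"est-ce que", "quel", "comment", "pourquoi"}

def sentence_type(f0_contour, transcript):
    """Classify an utterance as 'interrogative' or 'affirmative'."""
    # Prosodic feature: mean F0 over the last 20% vs. the rest of the utterance.
    tail = max(1, len(f0_contour) // 5)
    head_mean = sum(f0_contour[:-tail]) / max(1, len(f0_contour) - tail)
    tail_mean = sum(f0_contour[-tail:]) / tail
    final_rise = tail_mean > 1.1 * head_mean
    # Lexical feature: presence of an interrogative marker.
    has_marker = any(m in transcript.lower() for m in QUESTION_MARKERS)
    return "interrogative" if (final_rise or has_marker) else "affirmative"

print(sentence_type([110, 112, 111, 115, 140, 155], "tu viens demain"))  # interrogative
```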
APA, Harvard, Vancouver, ISO, and other styles
3

Abdel Jalil, Mohamed Ali. "Approche polysémique et traductologique du Coran : la sourate XXII (Al-Hajj [le pèlerinage]) comme modèle." Thesis, Aix-Marseille, 2017. http://www.theses.fr/2017AIXM0348.

Full text
Abstract:
According to Islamic tradition, one of the core characteristics of the Quran is that it is a polysemic text par excellence (ḥammāl dhū wujūh, bearer of several faces). To say that the Quranic text is polysemic implies that its various exegeses are as many possible readings of it, which implies in turn that its translations are also as many readings that complement each other. The accumulation of translations is thus another expression of the polysemy of the original text, even if the diversity of these translations does not match that of the exegeses. The thesis takes the Surah of Al-Ḥajj as a case study and is built on two research axes: (i) a study of the polysemy of the source text (the Surah of Al-Ḥajj); (ii) a study of the polysemy of the target text (18 French translations), showing how translation reduces and/or modifies polysemy. The corpus of 18 translations is representative of every tendency and period in the history of the French translation of the Quran from 1647 to 2010. Constituting a closed space that evolves independently of exegesis towards greater literality, the translations converge and complement each other, reflecting in their diversity, with slight modification, a large part of the polysemy that is united and concentrated in the source text but sporadic, sparse and dispersed across the translations.
APA, Harvard, Vancouver, ISO, and other styles
4

Ghoul, Dhaou. "Classifications et grammaires des invariants lexicaux arabes en prévision d’un traitement informatique de cette langue. Construction d’un modèle théorique de l’arabe : la grammaire des invariants lexicaux temporels." Thesis, Paris 4, 2016. http://www.theses.fr/2016PA040184.

Full text
Abstract:
This thesis focuses on the classification and processing of Arabic lexical invariants that express a temporal aspect. Our aim is to create a model that presents each invariant as a grammar schema (a finite-state automaton). In this work, we limited our treatment to 20 lexical invariants. Our hypothesis is that lexical invariants are located at the same structural (formal) level as the schemes in the quotient language (skeleton) of Arabic. They conceal a great deal of information and induce syntactic expectations that make it possible to predict the structure of the sentence. In the first part, we introduce the notion of 'lexical invariant' by setting out the various levels of invariance, and then classify the invariants studied according to several criteria. The second part presents our own study of temporal lexical invariants: our linguistic method, our modelling of the invariants as grammar schemas, and the analysis itself of simple lexical invariants such as 'ḥattā, baʿda' and complex ones such as 'baʿdamā, baynamā'. Finally, an experimental application, 'Kawâkib', was used to detect and identify the lexical invariants, showing their strong points as well as their gaps. We also propose a new vision of the next version of 'Kawâkib' that could represent a pedagogical application of Arabic without a lexicon.
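Since the thesis models each invariant as a finite-state grammar schema, a toy recognizer can illustrate the idea; the states, tags, and transitions below are invented for the example and do not reproduce Ghoul's schemas.

```python
# A toy finite-state recognizer, standing in for the grammar schemas
# ("automates à états finis") the thesis builds for temporal invariants.
# The transition table below is illustrative, not the actual schema.

TRANSITIONS = {
    ("q0", "INVARIANT"): "q1",   # e.g. baʿda 'after'
    ("q1", "NOUN"): "q2",        # baʿda followed by a noun phrase
    ("q1", "VERB"): "q2",        # a clause, as with complex invariants like baʿdamā
}
ACCEPTING = {"q2"}

def accepts(tags):
    """Return True if the tag sequence is licensed by the toy schema."""
    state = "q0"
    for tag in tags:
        state = TRANSITIONS.get((state, tag))
        if state is None:
            return False
    return state in ACCEPTING

print(accepts(["INVARIANT", "NOUN"]))  # True: baʿda + NP
print(accepts(["NOUN", "INVARIANT"]))  # False
```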
APA, Harvard, Vancouver, ISO, and other styles
5

Desalle, Yann. "Réseaux lexicaux, métaphore, acquisition : une approche interdisciplinaire et inter-linguistique du lexique verbal." PhD thesis, Université Toulouse le Mirail - Toulouse II, 2012. http://tel.archives-ouvertes.fr/tel-00714834.

Full text
Abstract:
When speakers of a language cannot access the conventional item to label an object or an event, they often extend, consciously or unconsciously, the meaning of another available lexical item. This phenomenon occurs particularly during young children's lexical acquisition. When such a semantic over-extension goes beyond the category of object or event conventionally denoted by the lexical item produced, it is a categorial over-extension and the resulting utterance is metaphor-like: for example, 'undressing the apple' for the action of peeling an apple. First, this thesis developed SLAM, a system for the automatic lexical resolution of metaphors produced in situations of word-finding difficulty. SLAM relies, on the one hand, on the syntactic analysis of large corpora and, on the other hand, on the Hierarchical Small World structure of synonymy-based lexical networks. For example, from utterances such as 'she undresses* the apple' or 'the arms* of the tree', SLAM yields the interpretations 'she peels the apple' and 'the branches of the tree', respectively. Second, the thesis studied the dynamics of verb lexicon acquisition, which stabilizes after that of nouns. On the one hand, methodological tools for the cross-linguistic study of verb lexicon acquisition were developed: (a) a methodological framework for building procedures to identify categorial semantic over-extensions of verbs; (b) a methodology for selecting visual action stimuli free of cultural bias. On the other hand, the links between the structure of synonymy-based lexical networks and the dynamics of verb lexicon acquisition in French and Mandarin were brought to light. After identifying differences in how young native speakers of French and Mandarin acquire the verb lexicon, this study was used to build the REFLEX score, a measure of the degree of verb lexicon acquisition, which automatically categorizes young children vs. adults in French and Mandarin.
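The SLAM idea lends itself to a compact illustration: choose, among the synonymy-network neighbours of a metaphorically used verb, the candidate best attested with the verb's argument. The toy network and co-occurrence counts below are invented; the real system works over large parsed corpora and Hierarchical Small World synonymy networks.

```python
# A schematic re-implementation idea for SLAM-style resolution: replace a
# metaphorically used verb by the neighbour in a synonymy network that best
# fits the verb's argument. Network and counts are invented placeholders.

SYNONYMS = {"déshabiller": ["dévêtir", "peler", "dénuder"]}
COOCCURRENCE = {("peler", "pomme"): 57, ("dévêtir", "pomme"): 0, ("dénuder", "pomme"): 1}

def resolve(verb, argument):
    """Pick the synonym of `verb` most strongly attested with `argument`."""
    candidates = SYNONYMS.get(verb, [])
    return max(candidates, key=lambda v: COOCCURRENCE.get((v, argument), 0), default=verb)

print(resolve("déshabiller", "pomme"))  # 'peler' → "elle pèle la pomme"
```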
APA, Harvard, Vancouver, ISO, and other styles
6

Maheux, Marie-Andrée. "Description lexicale du français québécois : un modèle prédictionnairique." Mémoire, Université de Sherbrooke, 1994. http://hdl.handle.net/11143/10036.

Full text
Abstract:
To date, Quebec French has received little or no systematic description. The vast majority of work devoted to Quebec French has treated it with a differential approach, describing it as a set of divergences from European French. It is therefore important to consider a description of Quebec French in its entirety, that is, as the combination of its specific features and of the elements common to the whole French-speaking world. The best starting point for describing a language is the analysis of a corpus of texts that accurately reflects its usage, in writing as well as in speech. Taking the Sherbrooke textual database (hereafter BDTS) as the main documentary source, I describe certain nouns of Quebec French, but in an original way. My work consists in collecting all the information available for a lexical description, without producing finished lexicographic articles. My goal is rather the preparation of pre-dictionary records ('fiches prédictionnairiques'), a stage prior to any lexicographic statement. In this thesis, I set out and test this new method of lexical description. The thesis thus pursues several objectives. I first present the descriptive fields of the pre-dictionary record. I then apply the content of this record to two nouns denoting body parts (tête 'head' and oeil 'eye'). I also seek to demonstrate the value of my pre-dictionary description by comparing my information with that recorded in two recent lexicographic works, the Dictionnaire québécois d'aujourd'hui and the Petit Robert 1993. In addition, I propose a model lexicographic article.
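One way to picture a pre-dictionary record is as a structured set of descriptive fields filled in before any article is drafted. The sketch below is hypothetical; the field names are invented for illustration and are not Maheux's actual record layout.

```python
# Purely illustrative: one possible shape for a pre-dictionary record
# ("fiche prédictionnairique"). Field names and values are invented.

PREDICTIONARY_RECORD = {
    "headword": "tête",
    "pos": "noun, feminine",
    "senses_attested": ["body part", "mind, judgement", "front position"],
    "quebec_specific_uses": ["se faire une tête 'to form an opinion'"],
    "bdts_examples": ["elle a une bonne tête sur les épaules"],
    "frequency_in_bdts": 15234,   # placeholder count
}

def missing_fields(record, required=("headword", "pos", "senses_attested")):
    """List required descriptive fields not yet filled in the record."""
    return [f for f in required if not record.get(f)]

print(missing_fields(PREDICTIONARY_RECORD))  # [] → record is complete enough
```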
APA, Harvard, Vancouver, ISO, and other styles
7

Romeo, Lauren Michele. "The Structure of the lexicon in the task of the automatic acquisition of lexical information." Doctoral thesis, Universitat Pompeu Fabra, 2015. http://hdl.handle.net/10803/325420.

Full text
Abstract:
Lexical semantic class information for nouns is critical for a broad variety of Natural Language Processing (NLP) tasks including, but not limited to, machine translation, discrimination of referents in tasks such as event detection and tracking, question answering, named entity recognition and classification, automatic construction and extension of ontologies, textual inference, etc. One approach to the costly and time-consuming manual construction and maintenance of the large-coverage lexica that feed NLP systems is the automatic acquisition of lexical information, which involves inducing the semantic class of a particular word from distributional data gathered from a corpus. This is precisely why current research on methods for the automatic production of high-quality, information-rich, class-annotated lexica, such as the work presented here, is expected to have a high impact on the performance of most NLP applications. In this thesis, we address the automatic acquisition of lexical information as a classification problem. To this end, we adopt machine learning methods to generate a model representing vectorial distributional data which, grounded on known examples, allows for predictions about other, unknown words. The main research questions we investigate are: (i) whether corpus data provides sufficient distributional information to build efficient word representations that result in accurate and robust classification decisions, and (ii) whether automatic acquisition can also handle polysemous nouns. To tackle these problems, we conducted a number of empirical validations on English nouns. Our results confirmed that the distributional information obtained from corpus data is indeed sufficient to automatically acquire lexical semantic classes, as demonstrated by an average overall F1 score of approximately 0.80 using diverse count-context models and differently sized corpora. Nonetheless, both the state of the art and the experiments we conducted highlighted a number of challenges for this type of model, such as reducing vector sparsity and accounting for nominal polysemy in distributional word representations. In this context, word embedding (WE) models preserve the 'semantics' underlying the occurrences of a noun in corpus data by mapping it to a feature vector. With this choice, we were able to overcome the sparse-data problem, as demonstrated by an average overall F1 score of 0.91 for single-sense lexical semantic noun classes, through a combination of reduced dimensionality and real-valued features. In addition, the WE representations performed better at handling the asymmetrical occurrences of each sense of regular polysemous complex-type nouns in corpus data. As a result, we were able to classify such nouns directly into their own lexical semantic class with an average overall F1 score of 0.85. The main contribution of this dissertation is an empirical validation of the different distributional representations used for nominal lexical semantic classification, along with a subsequent expansion of previous work, resulting in novel lexical resources and data sets that have been made freely available for download and use.
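The classification setup described above can be illustrated with a deliberately tiny sketch: nouns represented as vectors, assigned to a semantic class by similarity. The vectors and classes are hand-made placeholders; the thesis uses count-context models and word embeddings learned from corpora.

```python
# Minimal sketch of nominal semantic classification: nouns as distributional
# vectors, classes assigned by nearest neighbour. Vectors are toy values.

import math

TRAIN = {
    "dog":    ([0.9, 0.1, 0.0], "ANIMAL"),
    "horse":  ([0.8, 0.2, 0.1], "ANIMAL"),
    "hammer": ([0.1, 0.9, 0.0], "ARTIFACT"),
    "saw":    ([0.0, 0.8, 0.2], "ARTIFACT"),
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def classify(vector):
    """1-nearest-neighbour semantic class assignment by cosine similarity."""
    best_word = max(TRAIN, key=lambda w: cosine(TRAIN[w][0], vector))
    return TRAIN[best_word][1]

print(classify([0.85, 0.15, 0.05]))  # ANIMAL
```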
APA, Harvard, Vancouver, ISO, and other styles
8

De la Garza, Bernardo. "Creating lexical models: do foreign language learning techniques affect lexical organization in fluent bilinguals?" Diss., Kansas State University, 2012. http://hdl.handle.net/2097/14127.

Full text
Abstract:
Doctor of Philosophy
Department of Psychology
Richard J. Harris
The use of different language learning methods for acquiring foreign language vocabulary has long been explored, but studies have often failed to take into account the potential effects on lexical processing. The current study examined the effectiveness of the Keyword, Context and Paired-Associate learning methods in acquiring foreign language vocabulary, focusing primarily on the lexical and conceptual organization effects that each method may have on a foreign language learner. Three main theories/models (i.e., Word Association, Concept Mediated and Revised Asymmetrical Hierarchical) have been used to explain the organization of bilingual lexical and conceptual stores and the connections between them, but studies have not examined the addition of a third language (i.e., L3) and the potential connections created between the new L3 and the two existing language stores. It was predicted that, since low-proficiency bilinguals create lexical models that rely heavily on translation equivalents, the use of non-elaborative learning methods would create only lexical translation links, while more sophisticated elaborative methods would succeed in creating direct access to conceptual meaning. The current study further explored the potential effects of language learning methods on comprehension ability, which requires the construction of situation models. Finally, the present study explored the immediate and delayed effects of language learning methods on both vocabulary acquisition and comprehension ability. Results indicated that all learning methods were successful in creating lexical and conceptual connections between the languages and the conceptual store, while Keyword learners had significantly better scores on certain trial types. Differences in the strength of lexical and conceptual links are suggested, since differences in RTs and scores were found between some of the learning methods. Furthermore, in comparisons across time, repeatedly tested learners attained better scores on all trial types than learners who were only tested at Time 2. Lastly, when assessing whether lexical links could be created to a non-associated, highly fluent second language known by the bilingual, results indicated that each language learning method successfully created such lexical connections, but these links were weaker than those of the base language used during learning. Based on the current results, new models of lexical access are proposed which vary according to the language learning method used. The current findings also have strong implications and applications for the field of foreign language acquisition, primarily for bilingual language learners acquiring an L3.
APA, Harvard, Vancouver, ISO, and other styles
9

Rivière, Laura. "Etude de l'importance relative des contraintes linguistiques et extralinguistiques conduisant à la compréhension de l'ironie." Electronic Thesis or Diss., Aix-Marseille, 2019. http://www.theses.fr/2019AIXM0284.

Full text
Abstract:
The objective of this thesis was, using the framework of the Constraint Satisfaction model, to determine, for the first time in French, the role played by several types of constraints (i.e., pragmatic, linguistic and sociocultural) in the understanding of ironic criticisms and ironic praises. The results of a first experiment, which used a listening task, showed that the incongruity between context and utterance is a stronger cue than prosody in the understanding of ironic criticisms. Indeed, while all participants relied on contextual information in their interpretations, only some participants also used prosodic cues. The results of two subsequent experiments, consisting of written tasks, confirmed the major role played by pragmatic constraints (i.e., allusion to a failed expectation, negative tension and the presence of a victim) in irony understanding, particularly in the understanding of ironic criticisms. Our results also highlighted the contribution, albeit at a lower level than that of the pragmatic constraints, of participants' sociocultural characteristics to irony understanding. Finally, they confirmed the asymmetry of irony and showed that the pragmatic constraints contributing to the understanding of ironic praises may differ from those contributing to the understanding of ironic criticisms.
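The constraint-satisfaction framing suggests a simple additive picture: each cue contributes weighted evidence toward an ironic reading. The weights and threshold below are invented for illustration; only the constraint names follow the abstract.

```python
# Sketch of the constraint-satisfaction idea: each cue contributes evidence,
# and an ironic reading wins when the weighted evidence passes a threshold.
# Weights and the threshold are invented, not estimated from the thesis data.

WEIGHTS = {
    "allusion_to_failed_expectation": 0.5,  # pragmatic constraints dominate
    "negative_tension": 0.3,
    "presence_of_victim": 0.1,
    "sociocultural_fit": 0.1,               # weaker, per the reported results
}

def irony_score(cues):
    """`cues` maps constraint names to strengths in [0, 1]."""
    return sum(WEIGHTS[name] * cues.get(name, 0.0) for name in WEIGHTS)

utterance = {"allusion_to_failed_expectation": 1.0, "negative_tension": 0.8, "presence_of_victim": 1.0}
print(irony_score(utterance) > 0.5)  # True → ironic-criticism reading
```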
APA, Harvard, Vancouver, ISO, and other styles
10

De Nadai, Patrick. "De l'image dictionnairique au modèle lexicographique : la systémique lexicale." Paris 8, 1992. http://www.theses.fr/1993PA080771.

Full text
Abstract:
Monolingual language dictionaries contain a wealth of information describing the language they deal with. Since this description is based on a given corpus, dictionaries provide a certain 'image' of the language. In order to improve the quality of dictionaries, lexicographers would need a 'model' corresponding to a description of the language that is as objective as possible. If language is conceived of as a 'system' of signs entering into relations (both lexical and syntactic) with one another, such a model should present all the relations the language establishes. The purpose of our work is to propose a method for building such a model from the information contained in an extensive dictionary. After establishing what a 'language dictionary' and a 'word definition' are, we carry out a practical analysis of our corpus of definitions. This analysis allows us to extract a number of lexical relations, with which we sketch a linguistic analysis of the 'prefix' anti-.
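Extracting lexical relations from definitions can be illustrated by a single invented pattern that pulls the genus (hypernym) out of a gloss; real work of this kind analyses full definitional structures, and the pattern below is an assumption for the example only.

```python
# Illustrative sketch: mining one lexical relation from a dictionary
# definition. A single invented pattern extracts the genus (hypernym)
# from glosses of the form "a kind/type of X ...".

import re

GENUS_PATTERN = re.compile(r"^(?:a|an)\s+(?:kind|type)\s+of\s+(\w+)", re.IGNORECASE)

def genus_of(definition):
    """Return the hypernym named by the definition, if the pattern matches."""
    match = GENUS_PATTERN.match(definition.strip())
    return match.group(1) if match else None

print(genus_of("a kind of vehicle with two wheels"))  # 'vehicle'
```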
APA, Harvard, Vancouver, ISO, and other styles
11

Clark, Stephen. "Class-based statistical models for lexical knowledge acquisition." Thesis, University of Sussex, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.341541.

Full text
Abstract:
This thesis is about the automatic acquisition of a particular kind of lexical knowledge, namely the knowledge of which noun senses can fill the argument slots of predicates. The knowledge is represented using probabilities, which agrees with the intuition that there are no absolute constraints on the arguments of predicates, but that the constraints are satisfied to a certain degree; thus the problem of knowledge acquisition becomes the problem of probability estimation from corpus data. The problem with defining a probability model in terms of senses is that this involves a huge number of parameters, which results in a sparse data problem. The proposal here is to define a probability model over senses in a semantic hierarchy, and exploit the fact that senses can be grouped into classes consisting of semantically similar senses. A novel class-based estimation technique is developed, together with a procedure that determines a suitable class for a sense (given a predicate and argument position). The problem of determining a suitable class can be thought of as finding a suitable level of generalisation in the hierarchy. The generalisation procedure uses a statistical test to locate areas consisting of semantically similar senses, and, as well as being used for probability estimation, is also employed as part of a re-estimation algorithm for estimating sense frequencies from incomplete data. The rest of the thesis considers how the lexical knowledge can be used to resolve structural ambiguities, and provides empirical evaluations. The estimation techniques are first integrated into a parse selection system, using a probabilistic dependency model to rank the alternative parses for a sentence. Then, a PP-attachment task is used to provide an evaluation which is more focussed on the class-based estimation technique, and, finally, a pseudo disambiguation task is used to compare the estimation technique with alternative approaches.
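A toy version of the class-based estimation idea described above: when a (predicate, sense) pair is unseen, back off to counts pooled over a class higher in the hierarchy. The hierarchy, counts, and uniform pooling below are assumptions for illustration; the thesis selects the generalisation level with a statistical test rather than a fixed parent.

```python
# Toy class-based smoothing: an unseen (verb, noun) pair inherits an
# estimate pooled over semantically similar senses in the hierarchy.
# Hierarchy and counts are invented placeholders.

HYPERNYM = {"apple": "fruit", "pear": "fruit", "fruit": "food", "soup": "food"}
COUNTS = {("eat", "apple"): 0, ("eat", "pear"): 12, ("eat", "soup"): 9}

def class_members(cls):
    return [w for w, h in HYPERNYM.items() if h == cls]

def smoothed_count(verb, noun):
    """Back off to the hypernym class when the (verb, noun) pair is unseen."""
    c = COUNTS.get((verb, noun), 0)
    if c > 0:
        return c
    members = class_members(HYPERNYM.get(noun))
    pooled = sum(COUNTS.get((verb, m), 0) for m in members)
    return pooled / len(members) if members else 0

print(smoothed_count("eat", "apple"))  # 6.0, inherited from the 'fruit' class
```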
APA, Harvard, Vancouver, ISO, and other styles
12

Hagiwara, Masato, Yasuhiro Ogawa, and Katsuhiko Toyama. "AUTOMATIC ACQUISITION OF LEXICAL KNOWLEDGE USING LATENT SEMANTIC MODELS." INTELLIGENT MEDIA INTEGRATION NAGOYA UNIVERSITY / COE, 2006. http://hdl.handle.net/2237/10444.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

Mendling, Jan, Fabian Pittke, and Henrik Leopold. "Automatic detection and resolution of lexical ambiguity in process models." Gesellschaft für Informatik e.V., 2016. https://dl.gi.de/handle/20.500.12116/730.

Full text
APA, Harvard, Vancouver, ISO, and other styles
14

Vial, Loïc. "Modèles neuronaux joints de désambiguïsation lexicale et de traduction automatique." Thesis, Université Grenoble Alpes, 2020. http://www.theses.fr/2020GRALM032.

Full text
Abstract:
Word Sense Disambiguation (WSD) and Machine Translation (MT) are two central tasks, among the oldest in Natural Language Processing (NLP). Although they share a common origin, WSD having initially been conceived as a fundamental problem to be solved for MT, the two tasks subsequently evolved very independently of each other. On the one hand, MT has managed to dispense with explicit disambiguation of terms thanks to statistical and neural models trained on large amounts of parallel corpora; on the other hand, WSD, which faces limitations such as the lack of unified resources and a restricted scope of applications, remains a major challenge on the way to better language understanding in general. Today, in a context in which neural networks and word embeddings are becoming increasingly important in NLP research, recent neural architectures and new pre-trained language models offer not only new possibilities for developing more efficient WSD and MT systems, but also an opportunity to bring the two tasks together through joint neural models, which facilitate the study of their interactions. In this thesis, our contributions first focus on improving WSD systems by unifying the resources necessary for their implementation, designing new neural architectures, and developing original approaches to improve their coverage and performance. We then develop and compare different approaches for integrating our state-of-the-art WSD systems and language models into MT systems in order to improve their overall performance. Finally, we present a new architecture for training a joint model for both WSD and MT, based on our best neural systems for each task.
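A skeleton of WSD-as-classification may help fix ideas: a contextual vector for the target word is scored against one weight vector per sense. The sense inventory and vectors below are placeholders; the systems in the thesis rely on pretrained language models and joint training with MT.

```python
# Skeleton of the WSD-as-classification setup: a contextual vector for the
# target word is scored against one weight vector per sense, and the best
# sense wins. All vectors below are invented placeholders.

SENSE_WEIGHTS = {
    "mouse%animal": [0.9, 0.0, 0.1],
    "mouse%device": [0.1, 0.9, 0.2],
}

def disambiguate(context_vector):
    """Return the sense whose weight vector scores highest on the context."""
    def score(sense):
        return sum(w * x for w, x in zip(SENSE_WEIGHTS[sense], context_vector))
    return max(SENSE_WEIGHTS, key=score)

# "click the mouse" → a context vector dominated by the computing dimension
print(disambiguate([0.05, 0.8, 0.3]))  # mouse%device
```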
APA, Harvard, Vancouver, ISO, and other styles
15

Schwab, Didier. "Approche hybride - lexicale et thématique - pour la modélisation, la détection et l'exploitation des fonctions lexicales en vue de l'analyse sémantique de texte." PhD thesis, Université Montpellier II - Sciences et Techniques du Languedoc, 2005. http://tel.archives-ouvertes.fr/tel-00333334.

Full text
Abstract:
Used both for learning and for exploiting conceptual vectors, the semantic analysis of text is central to our research. Qualitative improvement of the analysis process improves the vectors; in return, this greater accuracy has a positive effect on the analysis. Among the various ways of obtaining this virtuous circle, one of the most promising appears to be the discovery and then the exploitation of the lexical relations between the words of a text. These relations, which include synonymy, antonymy, hypernymy, bonification and intensification, can be modelled as lexical functions. Formulated by Igor Mel'čuk essentially within a production framework, we seek in this thesis to adapt them to an analysis framework. We introduce two classes of Analysis Lexical Functions: construction ALFs, which build a conceptual vector from the available lexical information, and evaluation ALFs, which measure the relevance of a lexical relation between several terms. The latter can be modelled using thematic information (conceptual vectors) and/or lexical information (symbolic relations between lexical objects).

The lexical information comes from the semantic lexical base, for which we introduce an architecture with three levels of lexical objects (lexical item, acception, lexie). It is materialized as Valued Lexical Relations, which express the probability that the relation holds between the objects. The usefulness of these relations for semantic analysis was demonstrated through the use of the ant-colony algorithm paradigm. The model introduced in this thesis uses both conceptual vectors and the relations of the lexical network to solve part of the problems posed by semantic analysis.

All our tools were implemented in Java. They rest on Blexisma (Base LEXIcale Sémantique Multi-Agent), a multi-agent architecture designed during this thesis, whose objective is to integrate any element enabling it to create, improve and exploit one or more semantic lexical bases. The experiments carried out showed the feasibility of this approach and its relevance in terms of overall improvement of the analysis, and opened up very interesting research perspectives.
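The notion of a Valued Lexical Relation combining thematic and lexical evidence can be sketched as a weighted blend of vector similarity and a recorded symbolic link; every number and the mixing weight below are illustrative assumptions, not values from Blexisma.

```python
# Sketch of a "valued lexical relation": the weight of a relation between
# two lexical objects blends thematic evidence (conceptual-vector similarity)
# with symbolic evidence (a recorded link). All values are illustrative.

import math

VECTORS = {"voiture": [0.9, 0.2], "automobile": [0.88, 0.25], "fourmi": [0.1, 0.9]}
SYMBOLIC_SYNONYMY = {("voiture", "automobile"): 1.0}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def relation_value(a, b, alpha=0.6):
    """Blend thematic (vector) and lexical (symbolic) evidence for (a, b)."""
    thematic = cosine(VECTORS[a], VECTORS[b])
    symbolic = SYMBOLIC_SYNONYMY.get((a, b), SYMBOLIC_SYNONYMY.get((b, a), 0.0))
    return alpha * thematic + (1 - alpha) * symbolic

print(round(relation_value("voiture", "automobile"), 2))  # close to 1.0
print(round(relation_value("voiture", "fourmi"), 2))      # low
```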
APA, Harvard, Vancouver, ISO, and other styles
16

Aravena, Sandra. "Dynamics of language induced cortical motor activity : determining the linguistic contexts that trigger motor activation during lexical semantic processing." Thesis, Lyon 2, 2014. http://www.theses.fr/2014LYO20010/document.

Full text
Abstract:
The present dissertation was conducted in order to specify the relationship between motor and language structures as cooperative systems in lexical meaning construction. Specifically, this thesis aimed at deepening our understanding of how the linguistic context coordinates the recruitment of motor structures during lexical semantic processing. Although the involvement of motor activity in action-related language comprehension is now sufficiently documented, the specific role that motor structures play in action language processing is still unclear. 'Embodied' and 'disembodied' theories debate the nature of meaning representation in terms of the necessity of motor structures, neglecting the fact that the conditions of their activation during language processing are not well described. Very recent research has begun to note the necessity of exploring the contexts under which words trigger modality-specific cortical activity. However, this trend is at odds with implicit theoretical assumptions that have been made in research on motor-language crosstalk, which are based on the 'two-step' model of semantic processing and the 'dictionary-like' view of lexical meaning representation. Within such a framework, word meaning recognition is taken to proceed in a modular fashion; only after this process has concluded is the context thought to exert its effects. These assumptions have biased the debate on the role of language-induced motor activity: the discussion has centered on whether motor activation should be considered an integral part of the lexical access process or taken as the result of an ensuing 'higher-order' operation (i.e., situation model construction). A large body of work evidences that lexical semantic processing and semantic context are far more integrated and interdependent, and it seems crucial to integrate this knowledge gained from psycholinguistics into research on the role of language-induced motor activity. In an effort to free the debate from the 'lexical vs. post-lexical' discussion, this thesis aimed at determining the conditions under which language triggers motor activity. To accomplish these objectives, we introduced a novel tool that analyzes on-line modulations of grip force while participants listen to specific target words embedded within different types of contexts. Our results show that when the target word was a hand-action verb and the sentence focus centered on that action ('John signs the contract'), an increase in grip force was observed in the temporal window classically associated with lexical semantic processing. No comparable increase in grip force was detected when the same action word was embedded in negative sentences ('John doesn't sign the contract') or in sentences whose focus was shifted towards the agent's mental state ('John wants to sign the contract'). Our results suggest that the presence of an action word in an utterance is not, by itself, sufficient to drive motor activation: the linguistic context determines whether motor structures are recruited during lexical semantic processing.
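The grip-force paradigm implies a simple analysis shape: compare force in the window associated with lexical processing against a pre-word baseline. The sketch below is hypothetical; sampling rate, window bounds, and data are invented placeholders.

```python
# Hedged sketch of the kind of analysis the grip-force paradigm calls for:
# compare force in the window tied to lexical processing against a pre-word
# baseline. Sampling rate, window bounds and data are invented placeholders.

def window_mean(samples, start_ms, end_ms, rate_hz=1000):
    lo, hi = start_ms * rate_hz // 1000, end_ms * rate_hz // 1000
    window = samples[lo:hi]
    return sum(window) / len(window)

def force_increase(samples, word_onset_ms):
    """Mean force 100-400 ms post-onset minus the 200 ms pre-onset baseline."""
    baseline = window_mean(samples, word_onset_ms - 200, word_onset_ms)
    target = window_mean(samples, word_onset_ms + 100, word_onset_ms + 400)
    return target - baseline

# 1 s of fake force samples with a small rise after a word onset at 500 ms
fake = [1.0] * 600 + [1.08] * 400
print(round(force_increase(fake, 500), 3))  # positive → force increase
```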
APA, Harvard, Vancouver, ISO, and other styles
17

Trybocki, Christine. "Elaboration d'un modèle conceptuel pour les bases de données lexicales." Aix-Marseille 3, 1995. http://www.theses.fr/1995AIX30088.

Full text
Abstract:
For the past ten years or so, research teams and publishers have been working on transforming published dictionaries into databases intended for natural-language applications or for public distribution on CD-ROM. Our objective here is to describe a new schema for dictionary databases that fills the gaps of previous models. Before developing this schema, we examined several dictionaries in detail. Because of their arbitrary structuring, we set aside the concept of entry and propose a new dictionary unit. We define our model formally using a reference tool, SGML, and choose a computational representation based on the object-oriented paradigm. Finally, we verify its validity on a French monolingual dictionary.
APA, Harvard, Vancouver, ISO, and other styles
18

Muniz, Juliana Aguiar. "Processos de indeterminação lexical em conversas telefônicas interceptadas." Universidade do Estado do Rio de Janeiro, 2013. http://www.bdtd.uerj.br/tde_busca/arquivo.php?codArquivo=5340.

Full text
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
O objetivo principal deste trabalho é estudar estratégias de indeterminação de sentido em um corpus de conversas telefônicas interceptadas, considerando que a produção de sentido é um processo cognitivo dependente do contexto. Delimitamos a linguística cognitiva como a área na qual essa pesquisa se encontra inserida, para melhor compreender os fundamentos e os pressupostos norteadores da Teoria dos Modelos Cognitivos Idealizados (TMCI) e da Teoria da Mesclagem Conceptual (blending), tendo como base, principalmente, os estudos de Lakoff (1987), Fauconnier (1997) e Fauconnier e Turner (2002). No decorrer do trabalho propomo-nos responder às seguintes questões de pesquisa: a) que estratégias de indeterminação de sentido são mais frequentemente usadas nestas conversas? b) que elementos do contexto e do cotexto permitem a delimitação do sentido do item lexical em determinada conversa? c) como funcionam, no corpus, as estratégias de indeterminação de sentido e de que forma elas contribuem para sustentar determinado tipo de relação interpessoal? Para responder a estas questões de pesquisa, das 22 gravações de conversas telefônicas de atores sociais envolvidos com tráfico de armas e drogas, sequestro e extorsão, fornecidas pela Coordenadoria de Segurança e Inteligência do Ministério Público do Rio de Janeiro, selecionamos 10 conversas, em função da sua qualidade sonora, para serem transcritas e para proceder à análise qualitativa do uso da polissemia e da vagueza lexical. A partir das discussões teóricas e das análises desenvolvidas, concluímos que a polissemia representa a estratégia de indeterminação de sentido mais frequente no corpus desta pesquisa e que a mesma pode ser entendida como um processo de mesclagem conceptual, que sofre influências sociais e culturais: é a dinamicidade do pensamento e da linguagem que geram a polissemia. Concluímos também que a vagueza lexical é utilizada, no corpus, como um recurso linguístico para referência a assuntos ilícitos. Os itens lexicais analisados instanciam esquemas mentais abstratos que têm seus sentidos realizados a partir de pistas linguísticas e extralinguísticas que apontam para um processo interacional que pode ser entendido como um enquadre de transações comerciais (tráfico de drogas)
The main objective of this research is to study strategies of indeterminacy of meaning in a corpus of intercepted telephone conversations between social actors involved in the trafficking of drugs and weapons, kidnapping and extortion. We chose Cognitive Linguistics as the area within which to develop this research, since we understand the production of meaning as a cognitive, context-dependent process. Within Cognitive Linguistics, we adopted the principles and assumptions guiding the Theory of Idealized Cognitive Models (TMCI) and Conceptual Blending Theory, based principally on studies by Lakoff (1987), Fauconnier (1997) and Fauconnier and Turner (2002). Throughout the work we set out to answer the following research questions: a) which strategies of indeterminacy of meaning are most often used in these conversations? b) which elements of context and co-text (the immediate grammatical context) trigger the instantiation of the meaning of a lexical item in a particular conversation? c) how do the strategies of indeterminacy of meaning operate in the corpus, and how do they contribute to the creation of a particular kind of interpersonal relationship? In order to answer these questions, from the 22 recordings provided by the Coordinator of Intelligence and Security of the Public Ministry of Rio de Janeiro we selected 10 conversations on the basis of their sound quality, transcribed them, and submitted them to qualitative analysis, investigating the use of lexical polysemy and vagueness. From the theoretical discussions and analyses undertaken, we conclude that polysemy is the strategy of indeterminacy of meaning most frequently used in the corpus, and that it can be understood as a process of conceptual blending under the influence of social and cultural factors: it is the dynamics of thought and language that generates polysemy. We also conclude that lexical vagueness is used as a language resource for referring to illicit affairs. The lexical items studied instantiate abstract mental schemas whose meanings are triggered by particular linguistic and extralinguistic cues, within the domain, or frame, of a commercial transaction (drug trafficking).
APA, Harvard, Vancouver, ISO, and other styles
19

Lara, Leandro Zanetti. "Um estudo acerca da representação semântico-lexical no modelo da gramática discursivo-funcional." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2012. http://hdl.handle.net/10183/49685.

Full text
Abstract:
Esta tese objetiva apresentar um estudo acerca da representação lexical, de uma forma geral, e do tratamento da semântica lexical, em específico, no âmbito da Gramática Discursivo-Funcional (HENGEVELD, 2000, 2004a, 2006; HENGEVELD e MACKENZIE, 2006; 2008), tomando como crivo para este estudo os dados em português brasileiro, sobretudo no que tange ao comportamento sintático dos adjetivos com respeito a seu conteúdo semântico. Uma vez que a Gramática Discursivo-Funcional foi erigida sob a égide da adequação pragmática, perseguida desde os tempos da GF de Dik (1978, 1997), que estipula que a configuração morfossintática e fonológica, ou seja, a codificação estrutural é uma decorrência das representações pragmático-semânticas, nossa indagação girou em torno do papel que a semântica dos itens lexicais apresenta neste modelo. Tal se deve ao fato de identificarmos fenômenos linguísticos de codificação sintática, por exemplo, que estão mais ligados ao conteúdo dos itens do léxico do que a uma semântica frasal ou textual, enfim, composicional. Nosso ponto de apoio para a análise foi o comportamento sintático dos adjetivos do português brasileiro, que se mostra intimamente atrelado ao sentido por eles vinculado. Recolhemos um corpus textual de crítica de arte em português brasileiro, para analisar que subclasses semânticas estavam atuando nos exemplos e como eram expressas sintaticamente. Os dados apontam que a representação do léxico na Gramática Discursivo- Funcional ganharia em poder explanatório se evidenciasse a organização interna do léxico, que parece apresentar regras de formação lexical, bem como uma definição semântica que motiva diretamente o comportamento morfossintático de seus itens.
This dissertation presents a study of lexical representation in general, and of the treatment of lexical semantics in particular, within the framework of Functional Discourse Grammar (HENGEVELD, 2000, 2004a, 2006; HENGEVELD; MACKENZIE, 2006, 2008), using data from Brazilian Portuguese, especially the syntactic behavior of adjectives in relation to their semantic content. Since Functional Discourse Grammar is based on pragmatic adequacy, pursued since Dik's FG (1978, 1997), which stipulates that the phonological and morphosyntactic configuration, that is, the structural encoding, is a result of pragmatic-semantic representations, our inquiry concerns the role played by the semantics of lexical items in this model. The reason for this choice is the identification of linguistic phenomena of syntactic encoding, for instance, which are linked to the semantic content of lexical items rather than to phrasal or textual, that is, compositional, semantics. Our point of support for the analysis was the syntactic behavior of Brazilian Portuguese adjectives, which is closely related to the senses they convey. We collected a corpus of art criticism in Brazilian Portuguese in order to analyze which adjectival semantic subclasses were at work in the examples and how they were expressed syntactically. The data indicate that lexical representation in Functional Discourse Grammar would gain in explanatory power if it made explicit the internal organization of the lexicon, which seems to involve rules of lexical formation as well as meaning definitions that directly motivate the syntactic behavior of adjectival lexical items.
Esta tesis tiene como objetivo presentar un estudio sobre la representación léxica, en general, y el tratamiento de la semántica léxica, en particular, en la Gramática Discursivo-Funcional (HENGEVELD, 2000, 2004a, 2006; HENGEVELD; MACKENZIE, 2006, 2008), tomando como base para este estudio datos del portugués de Brasil, especialmente en relación con el comportamiento morfosintáctico de los adjetivos con respecto a su contenido semántico. Una vez que la Gramática Discursivo-Funcional fue construida bajo los auspicios de la adecuación pragmática, perseguida desde los días de la GF de Dik (1978, 1997), que estipula que la configuración fonológica y morfosintáctica, es decir, la estructura de codificación es el resultado de las representaciones pragmático-semánticas, nuestra investigación ha girado en torno al papel que la semántica de las unidades léxicas presentan en este modelo. Esto se debe al hecho de que identificamos fenómenos lingüísticos de codificación sintáctica, por ejemplo, que están vinculados directamente al contenido de la semántica lexical, y no solo de la semántica composicional. Nuestro punto de apoyo para el análisis fue el comportamiento sintáctico de los adjetivos del portugués de Brasil, que está estrechamente ligada al influjo del contenido léxico. Hemos recogido un corpus de crítica de arte en portugués de Brasil, para analizar las subclases semánticas presentes en los ejemplos y la forma en que se expresan sintácticamente. Los datos indican que la representación léxica de la Gramática Discursivo- Funcional ganaría en poder explicativo se tuviera también una representación de la organización interna del léxico, que parece contener reglas léxicas de formación, así como definiciones semánticas que motivan el comportamiento sintáctico de los elementos léxicos adjetivales.
APA, Harvard, Vancouver, ISO, and other styles
20

Humphreys, Jane. "WIRDS, WERDS, WYRDZ : visual wordlikeness, lexical phonology, and models of visual word recognition." Thesis, University of Bristol, 2008. http://hdl.handle.net/1983/7f7f66b6-f4e7-420e-ab74-af32bbec6ce0.

Full text
Abstract:
Nonwords such as 'brate' and pseudohomophones such as 'brane' are often used to explore the processes involved in decoding orthography, without the potential confound from semantics. Experiments using these useful items can provide evidence that sheds light on competing models of visual word recognition in general and the status of lexical phonology in particular. However, previous experiments have often used stimuli that look implausible as exemplars of English spelling (e.g. phret, woez), and it is arguable that some of the current controversies in the area may be partly attributable to the use of such stimuli. To investigate this notion, new items were constructed from real words on the grounds that they would contain a high proportion of existing orthotactic patterns. Ratings were gathered for the visual wordlikeness of the previous stimuli and these new items; the latter generated higher ratings than the former. Analysis of the ratings suggested that readers are sensitive to multiple sources of orthographic and graphophonemic information. In a series of naming and lexical decision experiments using the new stimuli, results showed that participants responded to visual wordlikeness across all tasks; for example, reading wordlike pseudohomophones more quickly than unwordlike ones, and responding to them more slowly in visual lexical decision. A masked priming experiment using wordlike and unwordlike primes showed that lexical phonology was less likely to be activated for the unwordlike pseudohomophones than the wordlike. Overall, the results support a view of visual word recognition as a highly interactive system, processing multiple grain-sizes of sublexical and lexical information, in which phonology plays a functional, non-optional role. While orthotactic violations constrain its normal workings, the system has mechanisms that can be used to process unwordlike items; but it is unlikely that these processes are the same in all respects as those used for wordlike stimuli.
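To make the notion of orthotactic wordlikeness concrete, here is a minimal Python sketch that scores items by the smoothed log-frequency of their character bigrams in a small training lexicon. The lexicon, the test items and the boundary-marking scheme are invented for illustration; this is not the rating procedure used in the thesis.

from collections import Counter
from math import log

lexicon = ["brain", "crate", "plate", "brave", "train", "grate", "phrase"]

def bigrams(word):
    padded = f"#{word}#"          # '#' marks word boundaries
    return [padded[i:i+2] for i in range(len(padded) - 1)]

counts = Counter(bg for w in lexicon for bg in bigrams(w))
total = sum(counts.values())

def wordlikeness(item, alpha=1.0):
    """Mean smoothed log-probability of the item's bigrams (higher = more wordlike)."""
    scores = [log((counts[bg] + alpha) / (total + alpha * len(counts)))
              for bg in bigrams(item)]
    return sum(scores) / len(scores)

for item in ["brate", "phret", "woez"]:
    print(item, round(wordlikeness(item), 2))

Under such a measure, an item like 'brate', built from common English letter patterns, scores higher than 'phret' or 'woez'.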
APA, Harvard, Vancouver, ISO, and other styles
21

Coté, Myriam. "Utilisation d'un modele d'accès lexical et de concepts perceptifs pour la reconnaissance d'images de mots cursifs." Paris, ENST, 1997. http://www.theses.fr/1997ENST0009.

Full text
Abstract:
The objective of this thesis is the recognition of images of isolated cursive words. We developed a recognition method, called the Percepto system, that models contextual effects by drawing on studies in experimental psychology. We were particularly interested in the reading model proposed by McClelland & Rumelhart because it takes into account the word superiority effect (letters are recognized better within a word context than in isolation). Our recognition method takes as its founding principles the ideas presented in the M&R model, to which we added other concepts specific to cursive handwriting recognition: primitives adapted to handwriting, handling of the ambiguity of letter positions, and contextual analysis. Like their model, our method has the following characteristics: a connectionist network, parallel processing of information, and an activation mechanism propagating between three layers of detectors through bottom-up and top-down processes. In addition, to recognize cursive words we use primitives adapted to handwriting that serve as anchor points for recognition, such as ascenders, descenders and loops. We handle ambiguities in letter position by introducing fuzzy matching. Missing letters and their positions are found by exploiting the contextual information provided by a lexicon, via a sequence of perceptual cycles that converge toward a recognition solution. The implementation of the method and its validation on a base of real images gave encouraging results. We then analyze these results and conclude with the promising perspectives of this method.
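As an illustration of the kind of interactive activation the abstract describes, the following Python sketch runs bottom-up and top-down activation cycles over a toy letter layer and word layer. The lexicon, parameters and evidence values are invented; this is a caricature of the mechanism, not the Percepto system itself.

lexicon = ["lire", "lime", "mire"]

def cycles(evidence, n_cycles=5, up=0.05, down=0.05):
    """evidence: one dict per letter position, mapping letters to feature-level scores."""
    word_act = {w: 0.0 for w in lexicon}
    letter_act = [dict(pos) for pos in evidence]
    for _ in range(n_cycles):
        # bottom-up: letters excite words containing them at the right position
        for w in lexicon:
            support = sum(letter_act[i].get(c, 0.0) for i, c in enumerate(w))
            word_act[w] = min(1.0, word_act[w] + up * support / len(w))
        # top-down: active words reinforce their own letters
        for w, act in word_act.items():
            for i, c in enumerate(w):
                letter_act[i][c] = min(1.0, letter_act[i].get(c, 0.0) + down * act)
    return word_act

# First letter ambiguous between l and m, the rest clear: lexical context
# should make "lire" end up with the highest activation.
evidence = [{"l": 0.5, "m": 0.4}, {"i": 0.9}, {"r": 0.8}, {"e": 0.9}]
print(cycles(evidence))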
APA, Harvard, Vancouver, ISO, and other styles
22

Kharrazen, Essaïd. "PSILISP, un modèle d'interprétation parallèle de programmes LISP." Paris 11, 1986. http://www.theses.fr/1986PA112385.

Full text
Abstract:
PSILISP comprend la définition d’un langage dérivé de LISP et d’une implémentation de ce langage sur une architecture multiprocesseur de type MIMD à mémoire partagée. Les principales caractéristiques de ce langage sont : portée lexicale des identificateurs, appel des arguments par valeur, évaluation parallèle explicite des arguments d’une application, primitives sans effet de bord. PSILISP étend LISP par l’introduction des « applications parallèles ». Leur évaluation se traduit par une exploitation massive des processeurs pour le calcul en parallèle des arguments. PSILISP utilise la portée lexicale. Ce choix permet d’éviter les défauts sémantiques caractérisant la plupart des implémentations actuelles de LISP. De plus, l’implémentation des environnements qui en résulte, se prête mieux à la gestion du parallélisme. PSILISP apporte une solution au problème du Funarg ascendant par rétention des environnements. Il en résulte que les fonctions sont des objets à part entière. L’expérience PSILISP montre qu’il est possible d’accroitre considérablement la vitesse d’exécution des programmes LISP par l’exploitation du parallélisme
PSILISP comprises the definition of a language derived from LISP and its implementation on a shared-memory MIMD parallel architecture. The main features of PSILISP are: lexically scoped variables, call by value, explicit parallel evaluation of the arguments of an application, and primitives with no side effects. PSILISP extends LISP with the new "parallel application" construct, whose evaluation leads to intensive use of the processors for the parallel computation of the arguments. PSILISP uses lexically scoped variables; this choice avoids the semantic defects common to the usual implementations of LISP. Furthermore, the resulting implementation of environments lends itself better to the management of parallel evaluation. PSILISP solves the upward Funarg problem by environment retention; functions thus become first-class citizens. The PSILISP experience demonstrates that the execution speed of LISP programs can be increased considerably by exploiting parallelism.
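A rough Python analogue of the "parallel application" construct: the argument expressions of an application are evaluated concurrently (here as thunks on a thread pool) before the function is applied by value. This is only a sketch of the idea, not PSILISP's actual MIMD shared-memory implementation.

from concurrent.futures import ThreadPoolExecutor

def papply(fn, *arg_thunks):
    """Evaluate all argument thunks concurrently, then apply fn by value."""
    with ThreadPoolExecutor() as pool:
        args = list(pool.map(lambda thunk: thunk(), arg_thunks))
    return fn(*args)

def slow_square(n):
    total = 0
    for _ in range(10**6):   # stand-in for an expensive computation
        total += 1
    return n * n

# (plus (square 3) (square 4)) with both arguments evaluated concurrently.
# Note: under CPython's GIL this shows the construct, not a real speedup;
# a process pool would give true parallelism on an MIMD machine.
print(papply(lambda a, b: a + b, lambda: slow_square(3), lambda: slow_square(4)))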
APA, Harvard, Vancouver, ISO, and other styles
23

He, Yanzhang. "Segmental Models with an Exploration of Acoustic and Lexical Grouping in Automatic Speech Recognition." The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1429881253.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Swaileh, Wassim. "Des modèles de langage pour la reconnaissance de l'écriture manuscrite." Thesis, Normandie, 2017. http://www.theses.fr/2017NORMR024/document.

Full text
Abstract:
Cette thèse porte sur le développement d'une chaîne de traitement complète pour réaliser des tâches de reconnaissance d'écriture manuscrite non contrainte. Trois difficultés majeures sont à résoudre: l'étape du prétraitement, l'étape de la modélisation optique et l'étape de la modélisation du langage. Au stade des prétraitements il faut extraire correctement les lignes de texte à partir de l'image du document. Une méthode de segmentation itérative en lignes utilisant des filtres orientables a été développée à cette fin. La difficulté dans l’étape de la modélisation optique vient de la diversité stylistique des scripts d'écriture manuscrite. Les modèles optiques statistiques développés sont des modèles de Markov cachés (HMM-GMM) et les modèles de réseaux de neurones récurrents (BLSTM-CTC). Les réseaux récurrents permettent d’atteindre les performances de l’état de l’art sur les deux bases de référence RIMES (pour le Français) et IAM (pour l’anglais). L'étape de modélisation du langage implique l'intégration d’un lexique et d’un modèle de langage statistique afin de rechercher parmi les hypothèses proposées par le modèle optique, la séquence de mots (phrase) la plus probable du point de vue linguistique. La difficulté à ce stade est liée à l’obtention d’un modèle de couverture lexicale optimale avec un minimum de mots hors vocabulaire (OOV). Pour cela nous introduisons une modélisation en sous-unités lexicales composée soit de syllabes soit de multigrammes. Ces modèles couvrent efficacement une partie importante des mots hors vocabulaire. Les performances du système de reconnaissance avec les unités sous-lexicales dépassent les performances des systèmes de reconnaissance traditionnelles de mots ou de caractères en présence d’un fort taux de mots hors lexique. Elles sont équivalentes aux modèles traditionnels en présence d’un faible taux de mots hors lexique. Grâce à la taille compacte du modèle de langage reposant sur des unités sous-lexicales, un système de reconnaissance multilingue unifié a été réalisé. Le système multilingue unifié améliore les performances de reconnaissance par rapport aux systèmes spécialisés dans chaque langue, notamment lorsque le modèle optique unifié est utilisé
This thesis is about the design of a complete processing chain dedicated to unconstrained handwriting recognition. Three main difficulties are addressed: pre-processing, optical modeling and language modeling. The pre-processing stage consists in properly extracting the text lines to be recognized from the document image. An iterative text-line segmentation method using oriented steerable filters was developed for this purpose. The difficulty in the optical modeling stage lies in the stylistic diversity of handwriting. Statistical optical models are traditionally used to tackle this problem, such as hidden Markov models (HMM-GMM) and, more recently, recurrent neural networks (BLSTM-CTC). Using BLSTM we achieve state-of-the-art performance on the RIMES (French) and IAM (English) datasets. The language modeling stage involves integrating a lexicon and a statistical language model into the recognition processing chain in order to constrain the recognition hypotheses to the most probable sequence of words (sentence) from the language point of view. The difficulty at this stage lies in finding the vocabulary with the lowest out-of-vocabulary (OOV) word rate. Enhanced language modeling approaches are introduced, using sub-lexical units made of syllables or multigrams. The sub-lexical units cover an important portion of the OOV words. Language coverage then depends on the domain of the language model's training corpus, hence the need to train the language model with in-domain data. The recognition system with sub-lexical units outperforms traditional recognition systems that use word or character language models when OOV rates are high; otherwise, equivalent performance is obtained with a more compact sub-lexical language model. Thanks to the compact lexicon of the sub-lexical units, a unified multilingual recognition system has been designed and evaluated on the RIMES and IAM datasets. The unified multilingual system shows improved recognition performance over the specialized systems, especially when a unified optical model is used.
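The OOV-coverage argument can be illustrated with a minimal sketch: an out-of-vocabulary word is decomposed into smaller units that the language model does know. The greedy longest-match segmenter and the unit inventory below are invented for the example; the thesis learns syllable or multigram inventories from data.

units = {"re", "con", "nais", "sance", "é", "cri", "ture"}

def segment(word):
    """Greedy longest-match decomposition into sub-lexical units."""
    out, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):       # try the longest unit first
            if word[i:j] in units:
                out.append(word[i:j])
                i = j
                break
        else:
            out.append(word[i])                 # back off to single characters
            i += 1
    return out

print(segment("reconnaissance"))   # ['re', 'con', 'nais', 'sance']
print(segment("écriture"))         # ['é', 'cri', 'ture']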
APA, Harvard, Vancouver, ISO, and other styles
25

Neff, Kathryn Joan Eggers. "Neural net models of word representation : a connectionist approach to word meaning and lexical relations." Virtual Press, 1991. http://liblink.bsu.edu/uhtbin/catkey/832999.

Full text
Abstract:
This study examines the use of the neural net paradigm as a modeling tool to represent word meanings. The neural net paradigm, also called "connectionism" and "parallel distributed processing," provides a new metaphor and vocabulary for representing the structure of the mental lexicon. As a research method applied to the componential analysis of word meanings, the neural net approach has one primary advantage over the traditional introspective method: freedom from the investigator's personal biases. The connectionist method is illustrated in this thesis with an extensive examination of the meanings of the words "cup" and "mug." These words have been studied previously by Labov (1973), Wierzbicka (1985), Andersen (1975), and Kempton (1978), using very different methods. The neural net models developed in this study are based on empirical data acquired through interviews with nine informants who classified 37 objects, 37 photographs, and 37 line drawings as "cups," "mugs," or "neither." These responses were combined with a data file representing the coded attributes of each object, to construct neural net models which reflect each informant's classification process. In the neural net models, the "cup" and "mug" features are interconnected with positive and negative weights that represent the association strengths of the features. When the connection weights are set so that they reflect the informants' responses, the neural net models can account for the extreme discrepancies in object-naming among informants, and the models can also account for the inconsistent classifications of each individual informant with respect to the mode of presentation (drawing, photograph, or actual object). Further, the neural net models can predict classifications for novel objects with an accuracy varying from 82% to 100%. By examining the connection weight patterns within the neural net model, it is possible to discover the "cup" and "mug" features which are most salient for each informant, and for the informants collectively. This analysis shows that each informant has acquired internal meanings for the words "cup" and "mug" which are unique to the individual, although there is considerable overlap with respect to the most salient features.
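A toy rendering of the set-up described above: each feature carries a signed connection weight toward "cup" and toward "mug", and classification follows the larger summed activation, with "neither" below a threshold. The features and weights are invented, not the informant-derived values of the study.

weights = {
    #  feature: (cup_weight, mug_weight)
    "has_handle":    (0.4,  0.5),
    "tapered_shape": (0.6, -0.3),
    "thick_walls":   (-0.2, 0.7),
    "with_saucer":   (0.8, -0.5),
    "cylindrical":   (-0.4, 0.6),
}

def classify(features, threshold=0.3):
    cup = sum(weights[f][0] for f in features if f in weights)
    mug = sum(weights[f][1] for f in features if f in weights)
    if max(cup, mug) < threshold:
        return "neither"
    return "cup" if cup >= mug else "mug"

print(classify({"has_handle", "tapered_shape", "with_saucer"}))  # cup
print(classify({"has_handle", "thick_walls", "cylindrical"}))    # mug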
Department of English
APA, Harvard, Vancouver, ISO, and other styles
26

Dugua, Céline. "Liaison, segmentation lexicale et schémas syntaxiques entre 2 et 6 ans : un modèle développemental basé sur l'usage." Grenoble 3, 2006. https://hal.archives-ouvertes.fr/tel-01272976.

Full text
Abstract:
Cette thèse aborde la question de l'acquisition de la liaison par les enfants francophones entre 2 à 6 ans. Elle s'inscrit dans les théories basées sur l'usage et les grammaires de construction et permet de montrer la forte interaction des différents niveaux linguistiques (phonologique, lexical, syntaxique) au cours du développement. A partir de l'analyse de 8 études de corpus et d'un relevé d'erreurs de liaison chez une fillette, nous avons élaboré 6 démarches expérimentales comprenant notamment un suivi longitudinal de 20 enfants durant 4 ans et 2 études transversales avec 122 et 200 sujets. Dans ce cadre, nous proposons un modèle développemental en trois étapes qui intègre liaison, segmentation lexicale et émergence de schémas syntaxiques. Précocement, l'enfant récupérerait des séquences globales concrètes dans son environnement langagier et les mémoriserait telles quelles dans son lexique. Par exemple, il récupèrerait un âne, zâne, l'âne. A partir du socle constitué par ces éléments lexicaux de formes diverses, des schémas plus abstraits émergeraient progressivement. D'abord généraux, ils se présenteraient sous la forme un + X ; dans ce cas le déterminant distribue des emplacements pouvant accueillir tout type de forme lexicale, d'où la prégnance des erreurs par remplacement (un zâne) à 2-3 ans. Peu à peu, ces schémas se spécifieraient en intégrant la nature de la consonne de liaison dans leur représentation (un + nX). Ces derniers expliquent les progrès en liaison et aussi le pic d'erreurs par surgénéralisation à des mots à initiale consonantique (erreur du type un nèbre pour un zèbre) vers 4-5 ans
This thesis focuses on the acquisition of liaison by French children aged between 2 and 6. Within cognitive functional approaches, more specifically usage-based models and construction grammars, our analyses highlight how linguistic levels (phonological, lexical, syntactic) interact during development. From 8 corpus studies as well as a survey of liaison errors in one child's utterances, we elaborated 6 experimental protocols, in particular a four-year longitudinal follow-up of 20 children, as well as 2 cross-sectional studies with larger samples (122 and 200 subjects). We propose a 3-stage developmental model integrating the liaison phenomenon, lexical segmentation and the emergence of constructional schemas. Precociously, the child retrieves concrete linguistic sequences from her linguistic environment; she then memorises these sequences and stores them in her lexicon in the same form as the one heard. For example, she could memorise sequences like un âne (a donkey), l'âne (with determiners), zâne, nâne (with a liaison consonant on the initial). These concrete sequences constitute the base from which more abstract schemas emerge progressively. The first are general, integrating a determiner (the pivot) and a slot which can receive any lexical form; they look like un (a/an) + X and they explain frequent early substitution errors (like un zâne). Gradually, these schemas become more specific, integrating the phonetic nature of the liaison consonant: un + nX. Their application explains progress in liaison contexts and overgeneralization errors on words starting with a consonant (like un nèbre instead of un zèbre (zebra)).
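The two developmental schemas can be caricatured in a few lines of Python: the early general schema un + X accepts any memorised form of the noun (producing errors such as un zâne), while the later specific schema un + nX supplies the liaison consonant itself and, applied to a consonant-initial word, yields exactly the overgeneralization un nèbre reported above. The stored forms and the vowel test are illustrative.

stored_forms = {"âne": ["âne", "zâne", "nâne"], "zèbre": ["zèbre"]}

def general_schema(noun):
    # early "un + X": any memorised form of the word may fill the slot
    return [f"un {form}" for form in stored_forms[noun]]

def specific_schema(noun):
    # later "un + nX": the schema itself supplies the liaison consonant n
    base = stored_forms[noun][0]
    if base[0] in "aeiouâéèê":           # vowel-initial: liaison correctly realized
        return f"un n{base}"
    return f"un n{base[1:]}"             # overgeneralized to consonant-initial words

print(general_schema("âne"))     # ['un âne', 'un zâne', 'un nâne'], incl. substitution errors
print(specific_schema("âne"))    # 'un nâne', the liaison correctly realized
print(specific_schema("zèbre"))  # 'un nèbre', the attested overgeneralization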
APA, Harvard, Vancouver, ISO, and other styles
27

Zhang, Ying. "Modèles et outils pour des bases lexicales "métier" multilingues et contributives de grande taille, utilisables tant en traduction automatique et automatisée que pour des services dictionnairiques variés." Thesis, Université Grenoble Alpes (ComUE), 2016. http://www.theses.fr/2016GREAM017/document.

Full text
Abstract:
Notre recherche se situe en lexicographie computationnelle, et concerne non seulement le support informatique aux ressources lexicales utiles pour la TA (traduction automatique) et la THAM (traduction humaine aidée par la machine), mais aussi l'architecture linguistique des bases lexicales supportant ces ressources, dans un contexte opérationnel (thèse CIFRE avec L&M).Nous commençons par une étude de l'évolution des idées, depuis l'informatisation des dictionnaires classiques jusqu'aux plates-formes de construction de vraies "bases lexicales" comme JIBIKI-1 [Mangeot, M. et al., 2003 ; Sérasset, G., 2004] et JIBIKI-2 [Zhang, Y. et al., 2014]. Le point de départ a été le système PIVAX-1 [Nguyen, H.-T. et al., 2007 ; Nguyen, H. T. & Boitet, C., 2009] de bases lexicales pour systèmes de TA hétérogènes à pivot lexical supportant plusieurs volumes par "espace lexical" naturel ou artificiel (UNL). En prenant en compte le contexte industriel, nous avons centré notre recherche sur certains problèmes, informatiques et lexicographiques.Pour passer à l'échelle, et pour profiter des nouvelles fonctionnalités permises par JIBIKI-2, dont les "liens riches", nous avons transformé PIVAX-1 en PIVAX-2, et réactivé le projet GBDLEX-UW++ commencé lors du projet ANR TRAOUIERO, en réimportant toutes les données (multilingues) supportées par PIVAX-1, et en les rendant disponibles sur un serveur ouvert.Partant d'un besoin de L&M concernant les acronymes, nous avons étendu la "macrostructure" de PIVAX en y intégrant des volumes de "prolexèmes", comme dans PROLEXBASE [Tran, M. & Maurel, D., 2006]. Nous montrons aussi comment l'étendre pour répondre à de nouveaux besoins, comme ceux du projet INNOVALANGUES. Enfin, nous avons créé un "intergiciel de lemmatisation", LEXTOH, qui permet d'appeler plusieurs analyseurs morphologiques ou lemmatiseurs, puis de fusionner et filtrer leurs résultats. Combiné à un nouvel outil de création de dictionnaires, CREATDICO, LEXTOH permet de construire à la volée un "mini-dictionnaire" correspondant à une phrase ou à un paragraphe d'un texte en cours de "post-édition" en ligne sous IMAG/SECTRA, ce qui réalise la fonctionnalité d'aide lexicale proactive prévue dans [Huynh, C.-P., 2010]. On pourra aussi l'utiliser pour créer des corpus parallèles "factorisés" pour construire des systèmes de TA en MOSES
Our research is in computational lexicography, and concerns not only the computer support for lexical resources useful for MT (machine translation) and MAHT (machine-aided human translation), but also the linguistic architecture of the lexical databases supporting these resources in an operational context (CIFRE thesis with L&M). We begin with a study of the evolution of ideas in this area, from the computerization of classical dictionaries to platforms for building true "lexical databases" such as JIBIKI-1 [Mangeot, M. et al., 2003; Sérasset, G., 2004] and JIBIKI-2 [Zhang, Y. et al., 2014]. The starting point was the PIVAX-1 system [Nguyen, H.-T. et al., 2007; Nguyen, H. T. & Boitet, C., 2009], designed for the lexical bases of heterogeneous MT systems with a lexical pivot, able to support multiple volumes in each "lexical space", be it natural or artificial (such as UNL). Considering the industrial context, we focused our research on certain issues, in informatics and in lexicography. To scale up, and to add new features enabled by JIBIKI-2, such as "rich links", we transformed PIVAX-1 into PIVAX-2 and reactivated the GBDLEX-UW++ project begun during the ANR TRAOUIERO project, re-importing all the (multilingual) data supported by PIVAX-1 and making them available on an open server. Starting from a need of L&M concerning acronyms, we expanded the "macrostructure" of PIVAX by incorporating volumes of "prolexemes", as in PROLEXBASE [Tran, M. & Maurel, D., 2006]. We also show how to extend it to meet new needs, such as those of the INNOVALANGUES project. Finally, we created a "lemmatisation middleware", LEXTOH, which makes it possible to call several morphological analyzers or lemmatizers and then merge and filter their results. Combined with a new dictionary-creation tool, CREATDICO, LEXTOH allows building on the fly a "mini-dictionary" corresponding to a sentence or paragraph of a text being "post-edited" online under IMAG/SECTRA, which realizes the proactive lexical help functionality foreseen in [Huynh, C.-P., 2010]. It could also be used to create "factored" parallel corpora with the aim of building MOSES-based factored MT systems.
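To give an idea of the kind of merge-and-filter step a lemmatisation middleware performs, here is a hedged Python sketch in which stub analyzers propose candidate lemmas and a simple voting rule keeps the consensus ones. The stub analyzers and the voting threshold are invented; LEXTOH itself wraps real external morphological analyzers.

from collections import Counter

def analyzer_a(token):
    return {"suis": ["être", "suivre"], "porte": ["porter", "porte"]}.get(token, [token])

def analyzer_b(token):
    return {"suis": ["être"], "porte": ["porte"]}.get(token, [token])

def merge_lemmas(token, analyzers, min_votes=2):
    """Merge candidate lemmas, keeping those proposed by enough analyzers."""
    votes = Counter(lemma for an in analyzers for lemma in an(token))
    kept = [lemma for lemma, v in votes.items() if v >= min_votes]
    return kept or [lemma for lemma, _ in votes.most_common(1)]

for tok in ["suis", "porte", "chat"]:
    print(tok, "->", merge_lemmas(tok, [analyzer_a, analyzer_b]))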
APA, Harvard, Vancouver, ISO, and other styles
28

Coté, Myriam. "Utilisation d'un modèle d'accès lexical et de concepts perceptifs pour la reconnaissance d'images de mots cursifs /." Paris : École nationale supérieure des télécommunications, 1997. http://catalogue.bnf.fr/ark:/12148/cb367038172.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Hasan, Saša [Verfasser]. "Triplet lexicon models for statistical machine translation / Sasa Hasan." Aachen : Hochschulbibliothek der Rheinisch-Westfälischen Technischen Hochschule Aachen, 2012. http://d-nb.info/1028004060/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Chan, May T. M. "Alveolarization in Hong Kong Cantonese : a sociophonetic study of neogrammarian and lexical diffusion models of sound change." Thesis, University of Oxford, 2017. https://ora.ox.ac.uk/objects/uuid:2d40e687-83cd-4d93-9c3e-fa6e5569cf6b.

Full text
Abstract:
This thesis is a quantitative study of sociophonetic variation which focuses on the Hong Kong Cantonese velar coda consonants -ŋ and -k. These codas have, under certain linguistic contexts, become increasingly realized as alveolar nasal and oral stop consonants, [-n] and [-t], respectively. For the purpose of this thesis, the phenomenon of sound shift from -ŋ and -k to [-n] and [-t] respectively will be termed 'alveolarization'. Insofar as the language of a speech community is a shared vehicle for communication and sound changes are constrained by the need for mutual intelligibility, the central aim of this thesis is to uncover the factors which contribute the most to driving this sound change. In describing the variation in these consonants, I examine the concomitant social and linguistic factors which might help explain it. While this study focuses on one specific set of linguistic variables, it aims to analyse a broad set of factors to obtain a picture of the complexity of this sound change. The sound change is of theoretical interest as it provides an opportunity to evaluate the neogrammarian regularity hypothesis and the lexical diffusion model of sound change. The neogrammarian regularity hypothesis states that sound changes are regular and admit no exceptions - they are purely driven by phonological context and show no lexical variation. On the other hand, the lexical diffusion model predicts that sound changes progress through the lexicon in a gradual manner. By examining the effects of neighbouring phonological environment on the velar codas, and by analyzing which lexemes might be leading the change, as well as whether there are any lexical frequency effects, this thesis sets out to test both models of sound change.
APA, Harvard, Vancouver, ISO, and other styles
31

Souza, Marcos Antônio de. "O dicionário de hebraico bíblico de Brown, Driver e Briggs (BDB) como modelo de sistema lexical bilíngüe." reponame:Repositório Institucional da UFSC, 2012. http://repositorio.ufsc.br/xmlui/handle/123456789/90857.

Full text
Abstract:
Master's dissertation - Universidade Federal de Santa Catarina, Centro de Comunicação e Expressão, Programa de Pós-Graduação em Estudos da Tradução, Florianópolis, 2008.
Esta dissertação de mestrado foi concebida como um trabalho original relacionado à lexicografia bíblica hebraica. O problema abordado neste estudo pode ser formulado da seguinte maneira: O estudo procura investigar o popular dicionário de hebraico bíblico de Brown, Driver e Briggs (BDB) baseado em um dos mais antigos dicionários e editado por volta de 1810 por Wilhelm Gesenius como um modelo de sistema lexical bilíngüe no contexto da polêmica envolvendo glosas, definições e domínios semânticos. Partindo da hipótese de que ciência é qualquer conhecimento obtido pelo Método Científico, este estudo está estruturado em uma série de três passos. No primeiro passo observação a lexicografia hebraica bíblica é investigada mediante uma identificação dos principais dicionários de hebraico bíblico publicados em língua inglesa e portuguesa quanto as suas macro-estrutura e micro-estrutura. No segundo passo, uma hipótese é formulada. Partindo do conceito de que lexicografia é uma disciplina de aplicação em que o propósito vem primeiro e posteriormente a teoria, formula-se a hipótese de que um dicionário de hebraico bíblico é um modelo de um sistema lexical bilíngüe. Como fundamento teórico para esta hipótese, é apresentada uma analogia com um dos mais bem sucedido modelo de sistema e utilizado pelos engenheiros de telecomunicações o modelo de Erlang e o conceito de cadeia de transferência desenvolvido a partir do triângulo da significação de Ogden & Richards. O terceiro e último passo consiste em um experimento apropriado para verificação da validade da hipótese. Neste experimento, quatro poemas hebraicos (dois bíblicos, um medieval e um moderno) são traduzidos segundo as glosas fornecidas pelo BDB e comparadas às glosas de quatro outros dicionários de hebraico, além de uma comparação com antigas traduções da Bíblia Hebraica (Septuaginta e Vulgata) para os dois poemas bíblicos.
This Master's dissertation was conceived as an original work mainly concerned with Biblical Hebrew lexicography. The basic problem of the study can be formulated as follows: to investigate the well-known Biblical Hebrew dictionary by Brown, Driver, and Briggs (BDB), based on one of the oldest Hebrew dictionaries, published around 1810 by Wilhelm Gesenius, as a model of a bilingual lexical system, in the context of the polemic involving glosses, definitions and semantic domains. Proceeding on the assumption that science is any knowledge arrived at by the scientific method, the study is structured as a series of three definite steps. In the first step, observation, modern Biblical Hebrew lexicography is examined through a survey of Hebrew dictionaries published in English and Portuguese, with an analysis of their macrostructure and microstructure. In the second step, a hypothesis is formulated. On the assumption that lexicography is an applied discipline in which the purpose comes first and the theory comes last, the hypothesis is that a Biblical Hebrew dictionary is a model of a bilingual lexical system. As a framework for this hypothesis, an analogy is made with one of the most successful system models used by telecommunication engineers, the Erlang model, and the concept of a chain of transference is developed based on Ogden & Richards' triangle of signification. The third and final step is an appropriate experiment to see whether the hypothesis is substantiated. In this experiment, four Hebrew poems (two biblical, one medieval and one modern) are translated according to the glosses provided by BDB and compared to the glosses of four other Hebrew dictionaries; the two biblical poems are also compared with ancient versions of the Hebrew Bible (Septuagint and Vulgate).
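For reference, the telecommunication model the dissertation borrows as an analogy can be stated compactly: the Erlang B formula gives the blocking probability of a system of m servers offered E erlangs of traffic, computed here with the standard numerically stable recurrence. This is the general telecom formula, not a lexicographic result of the dissertation.

def erlang_b(traffic_erlangs, servers):
    """Blocking probability B(E, m) via the recurrence B(E, 0) = 1,
    B(E, m) = E*B(E, m-1) / (m + E*B(E, m-1))."""
    b = 1.0
    for m in range(1, servers + 1):
        b = (traffic_erlangs * b) / (m + traffic_erlangs * b)
    return b

print(round(erlang_b(2.0, 5), 4))   # blocking for 2 erlangs over 5 lines, ~0.0367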
APA, Harvard, Vancouver, ISO, and other styles
32

Laporte, Elena-Mirabela. "La traduction automatique statistique factorisée : une application à la paire de langues français - roumain." Thesis, Strasbourg, 2014. http://www.theses.fr/2014STRAC022/document.

Full text
Abstract:
Un premier objectif de cette thèse est la constitution de ressources linguistiques pour un système de traduction automatique statistique factorisée français - roumain. Un deuxième objectif est l’étude de l’impact des informations linguistiques exploitées dans le processus d’alignement lexical et de traduction. Cette étude est motivée, d’une part, par le manque de systèmes de traduction automatique pour la paire de langues étudiées et, d’autre part, par le nombre important d’erreurs générées par les systèmes de traduction automatique actuels. Les ressources linguistiques requises par ce système sont des corpus parallèles alignés au niveau propositionnel et lexical. Ces corpus sont également segmentés lexicalement, lemmatisés et étiquetés au niveau morphosyntaxique
Our first aim is to build linguistic resources for a French-Romanian factored phrase-based statistical machine translation system. Our second aim is to study the impact of the linguistic information exploited in the lexical alignment and translation process. This study is motivated, on the one hand, by the lack of such systems for the language pair studied and, on the other hand, by the high number of errors produced by current machine translation systems. The linguistic resources required by the system are tokenized, lemmatized, tagged, word- and sentence-aligned parallel corpora.
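As a small illustration of what "factored" data looks like in Moses-style systems, the sketch below joins surface form, lemma and part-of-speech tag with '|' for each token, one sentence per line. The toy annotations are invented; in practice they come from the taggers and lemmatizers mentioned above.

def to_factored(tokens):
    """tokens: list of (surface, lemma, pos) triples -> one factored corpus line."""
    return " ".join(f"{w}|{lemma}|{pos}" for w, lemma, pos in tokens)

sentence = [("les", "le", "DET"), ("chats", "chat", "NOUN"), ("dorment", "dormir", "VERB")]
print(to_factored(sentence))   # les|le|DET chats|chat|NOUN dorment|dormir|VERB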
APA, Harvard, Vancouver, ISO, and other styles
33

Zhang, Xuelu. "Les tons lexicaux du chinois mandarin en voix modale et en voix chuchotée." Thesis, Strasbourg, 2017. http://www.theses.fr/2017STRAC041/document.

Full text
Abstract:
Notre recherche est une contribution à l’étude des indices acoustiques secondaires des tons lexicaux en mandarin, comparant les données recueillies en parole modale avec celles obtenues en parole chuchotée. Selon la littérature, ces indices devraient se présenter en tant qu’un ensemble d’attributs dans les dimensions acoustiques du spectre, outre que dans la fréquence fondamentale. Nous avons analysé des attributs temporels, des attributs au niveau de l’intensité, des attributs spectraux, ainsi que leur corrélation avec les tons. Les résultats montrent que certains paramètres temporels et la quatrième résonance du spectre sont étroitement liés au ton. Leurs rapports dépendent de la caractéristique intrinsèque de la voyelle qui porte le ton (équivalente de la rime dans notre recherche)
Our research is a contribution to studies of secondary acoustic cues in Mandarin tone identification, comparing acoustic data collected in modal speech and in whispered speech. According to the literature on this issue, these cues should be found, as a set of attributes, in acoustic dimensions other than the fundamental frequency. We analyzed these attributes in the temporal domain, at the intensity level and in the spectrum, as well as their relations with tones. Our results show that some temporal parameters and the fourth resonance in the spectrum are closely related to tones. These relations depend on the intrinsic characteristics of the vowel that carries the tone (equivalent to the rime in our research).
APA, Harvard, Vancouver, ISO, and other styles
34

Azevedo, Luciana de Oliveira Faria. ""Uma flor tapoja e uma casa jufosa: o papel da nomeação e de propriedades morfofonológicas no processo de identificação de novos adjetivos por crianças brasileiras"." Universidade Federal de Juiz de Fora (UFJF), 2008. https://repositorio.ufjf.br/jspui/handle/ufjf/4607.

Full text
Abstract:
Esta dissertação aborda o processo de aquisição lexical por crianças brasileiras e investiga, particularmente, a relação entre categoria conceitual e categoria lingüística, e propriedades morfofonológicas do adjetivo. A hipótese que orienta esta dissertação é a de que a nomeação dos objetos e a presença de morfemas característicos de adjetivos são pistas robustas usadas pelas crianças no processo de aquisição de novos adjetivos. Adota-se uma perspectiva psicolingüística da aquisição da linguagem que pretende a conciliação de um modelo de processamento lingüístico (modelos de Bootstrapping Fonológico e Sintático), com um modelo de língua proposto pela Teoria Gerativa. A conciliação entre os modelos visa a explicar, satisfatoriamente, a forma pela qual a criança se torna capaz de, uma vez exposta a uma língua natural, extrair do material lingüístico ao qual é apresentada os elementos formadores do léxico de sua língua. Foram desenvolvidas duas atividades experimentais, usando-se a técnica de identificação de objeto, com crianças de dois e três anos. A primeira avalia o reconhecimento de novos adjetivos, comparando-se a apresentação de objetos nomeados (uma flor tapoja) ou com nomes vagos (uma coisa tapoja). No segundo experimento, foram acrescentados aos pseudo-adjetivos os sufixos -oso/a e –ado/a (uma casa jufosa) / uma coisa jufosa), em vista de investigar o papel do sufixo juntamente com a nomeação dos objetos como facilitadores na identificação do adjetivo pela criança. Adjetivos acompanhados de nome (Exper. 1) são mais facilmente identificados, mas quando acrescidos de sufixo (Exper. 2) são reconhecidos mesmo na presença de nomes vagos. Os resultados são compatíveis com nossa hipótese, pois sugerem que a nomeação e a marca morfofonológica são pistas robustas usadas pelas crianças para identificar novos adjetivos.
This dissertation addresses the process of lexical acquisition by Brazilian children and investigates, in particular, the relationship between conceptual category, linguistic category, and the morphophonological properties of the adjective. The hypothesis guiding this dissertation is that the naming of objects and the presence of morphemes characteristic of adjectives are robust cues used by children in the process of acquiring new adjectives. A psycholinguistic perspective on language acquisition is adopted that seeks to reconcile a model of linguistic processing (phonological and syntactic bootstrapping models) with the model of language proposed by Generative Theory. The reconciliation between the models aims to explain satisfactorily how a child, once exposed to a natural language, becomes able to extract from the linguistic material presented to her the elements that form the lexicon of her language. Two experimental activities were developed, using an object-identification technique, with two- and three-year-old children. The first evaluates the recognition of new adjectives, comparing the presentation of named objects (uma flor tapoja, 'a tapoja flower') with vaguely named ones (uma coisa tapoja, 'a tapoja thing'). In the second experiment, the suffixes -oso/a and -ado/a were added to the pseudo-adjectives (uma casa jufosa / uma coisa jufosa) in order to investigate the role of the suffix, together with the naming of objects, as facilitators in the child's identification of the adjective. Adjectives accompanied by a noun (Experiment 1) are more easily identified, but when a suffix is added (Experiment 2) they are recognized even in the presence of vague names. The results are compatible with our hypothesis, since they suggest that naming and morphophonological marking are robust cues used by children to identify new adjectives.
APA, Harvard, Vancouver, ISO, and other styles
35

Séguéla, Patrick. "Construction de modèles de connaissances par analyse linguistique de relations lexicales dans les documents techniques." Toulouse 3, 2001. http://www.theses.fr/2001TOU30210.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Lowry, Jonathan E. "The Language of Team: Building a lexicon integrating multiple disciplines for effective project management." University of Cincinnati / OhioLINK, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1306499898.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Eklund, Robert. "A Probabilistic Tagging Module Based on Surface Pattern Matching." Thesis, Stockholm University, Department of Computational Linguistics, Institute of Linguistics, 1993. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-135294.

Full text
Abstract:
A problem with automatic tagging and lexical analysis is that it is never 100% accurate. In order to arrive at better figures, one needs to study the character of what is left untagged by automatic taggers. In this paper, untagged residue output by the automatic analyser SWETWOL (Karlsson 1992) at Helsinki is studied. SWETWOL assigns tags to words in Swedish texts mainly through dictionary lookup. The contents of the untagged residue files are described and discussed, and possible ways of solving different problems are proposed. One method of tagging residual output is proposed and implemented: the left-stripping method, through which untagged words are stripped of their left-most letters, searched for in a dictionary, and, if found, tagged according to the information in that dictionary. If the stripped word is not found in the dictionary, a match is sought in ending lexica containing statistical information about the word classes associated with that particular word form (i.e., final letter cluster, be this a grammatical suffix or not) and the relative frequency of each word class. If a match is found, the word is given graduated tagging according to the statistical information in the ending lexicon. If a match is not found, the word is stripped of what is now its left-most letter and is recursively searched for in a dictionary and ending lexica (in that order). The ending lexica employed in this paper are retrieved from a reversed version of Nusvensk Frekvensordbok (Allén 1970), and contain endings of between one and seven letters. The contents of the ending lexica are described and discussed to a certain degree. The programs working according to the principles described are run on files of untagged residual output. Appendices include, among other things, LISP source code, untagged and tagged files, the ending lexica containing one- and two-letter endings, and excerpts from ending lexica containing three to seven letters.
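The left-stripping method lends itself to a direct sketch. The dictionary, ending lexica and tag probabilities below are toy stand-ins for the SWETWOL resources and the lexica derived from Nusvensk Frekvensordbok; the control flow is a simplification (dictionary lookup, then ending-lexicon lookup with graduated tagging, then strip the left-most letter and recurse) of the procedure described above.

dictionary = {"hund": "NOUN", "springa": "VERB"}
ending_lexica = {        # ending -> relative frequency of word classes
    "ar":   {"NOUN": 0.6, "VERB": 0.4},
    "orna": {"NOUN": 1.0},
}

def left_strip_tag(word):
    if not word:
        return None
    if word in dictionary:                          # dictionary lookup
        return {dictionary[word]: 1.0}
    for ending, classes in ending_lexica.items():   # ending-lexicon lookup
        if word.endswith(ending):
            return classes                          # graduated (probabilistic) tagging
    return left_strip_tag(word[1:])                 # strip left-most letter, recurse

print(left_strip_tag("xhund"))      # {'NOUN': 1.0} after one stripping step
print(left_strip_tag("hundar"))     # {'NOUN': 0.6, 'VERB': 0.4}: graduated tagging
print(left_strip_tag("blommorna"))  # {'NOUN': 1.0} via the ending lexicon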
APA, Harvard, Vancouver, ISO, and other styles
38

SHIMA, Yoshihiro, and 義弘 島. "内的作業モデルが情報処理に及ぼす影響 : プライムされた関係との関連." 名古屋大学大学院教育発達科学研究科, 2012. http://hdl.handle.net/2237/16160.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Adelstein, Andreina. "Unidad léxica y significado especializado: modelo de representación a partir del nombre relacional madre." Doctoral thesis, Universitat Pompeu Fabra, 2007. http://hdl.handle.net/10803/7505.

Full text
Abstract:
Esta tesis estudia la especificidad de los significados léxicos especializados, a partir del análisis del funcionamiento de nombres relacionales; ofrece una explicación integrada de la semántica especializada y no especializada y una modelización de entrada léxica unificada. Conjuga así, el modelo comunicativo de la terminología con modelos polisémicos de generación del significado léxico.
El trabajo presenta una revisión crítica de las propuestas lingüísticas y terminológicas acerca de las propiedades semánticas del léxico científico. Luego, analiza las propiedades del significado léxico especializado, los factores y los mecanismos de generación semántica, a partir del análisis contrastivo del comportamiento de madre en corpora textuales. La tesis proporciona, a su vez, criterios de reconocimiento formal de información semántica especializada, útiles para desarrollar diversos tipos de aplicaciones. Finalmente, propone una generalización de la semántica especializada de los nombres relacionales y una representación de entrada dinámica, que contempla componentes de conocimiento lingüístico y extralingüístico que interactúan en la generación del significado léxico.
This dissertation studies the specificity of specialized lexical meanings, based on an analysis of the behaviour of relational nouns; it offers an integrated explanation of specialized and non-specialized semantics and a model of a unified lexical entry. It thus combines the communicative model of terminology with polysemic models of the generation of lexical meaning.

This work starts by presenting a critical review of linguistic and terminological approaches to the semantic properties of the scientific lexicon. It then proceeds to analyze the properties of specialized lexical meaning and the factors and mechanisms of semantic generation, based on the behaviour of madre in text corpora. It also provides criteria for the formal recognition of specialized semantic information which can help develop different kinds of applications. Finally, the dissertation puts forward a generalization of the specialized semantics of relational nouns and a representation of a dynamic entry, which contemplates components of linguistic and extralinguistic knowledge interacting in the generation of lexical meaning.
APA, Harvard, Vancouver, ISO, and other styles
40

Jacob, Bruno. "Un outil informatique de gestion de modèles de Markov cachés : expérimentations en reconnaissance automatique de la parole." Toulouse 3, 1995. http://www.theses.fr/1995TOU30240.

Full text
Abstract:
We propose in this document the use of a compiler of hidden Markov models for automatic speech recognition. After presenting the characteristics of the compiler, we present several applications that use it in order to validate the tool: (1) a two-step lexical filtering method, in which a sub-dictionary is selected by a main hidden Markov model whose units are major classes; from this sub-dictionary, a temporary hidden Markov model is built with pseudo-diphone units in order to obtain the recognized word; the compiler is used here in a classical recognition application; (2) a new method for fusing acoustic and articulatory data by means of a master/slave relation between two hidden Markov models, with the aim of increasing recognition robustness in noise; we adapted the compiler so that it builds these variants of hidden Markov models; (3) an acoustic-phonetic decoding system based on phonetic units derived from vector quantization; we use the compiler as a tool for validating the decoding system; (4) a proposal for post-processing the results of an isolated-word recognition system in order to increase its performance; here we test the compatibility of the networks built by the compiler with those of an existing system. We conclude with a discussion of possible extensions of the compiler.
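Since every application above ultimately compiles to a hidden Markov model network that is then decoded, a generic Viterbi routine conveys the core operation involved. The two-state model below is a toy, not the major-class or pseudo-diphone models of the thesis.

def viterbi(obs, states, start, trans, emit):
    """Return the most probable state path for an observation sequence."""
    path = {s: ([s], start[s] * emit[s][obs[0]]) for s in states}
    for o in obs[1:]:
        path = {s: max(((p + [s], prob * trans[p[-1]][s] * emit[s][o])
                        for p, prob in path.values()), key=lambda t: t[1])
                for s in states}
    return max(path.values(), key=lambda t: t[1])

states = ["C", "V"]                       # toy consonant / vowel pseudo-units
start = {"C": 0.6, "V": 0.4}
trans = {"C": {"C": 0.3, "V": 0.7}, "V": {"C": 0.6, "V": 0.4}}
emit = {"C": {"k": 0.7, "a": 0.3}, "V": {"k": 0.2, "a": 0.8}}
print(viterbi(["k", "a", "k"], states, start, trans, emit))  # (['C', 'V', 'C'], ...)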
APA, Harvard, Vancouver, ISO, and other styles
41

Nobrega, Karliane Fernandes. "A interpretação semântica dos auxiliares modais poder, precisar e dever: uma abordagem da semântica cognitiva." Universidade Federal do Rio Grande do Norte, 2007. http://repositorio.ufrn.br:8080/jspui/handle/123456789/16359.

Full text
Abstract:
Apresentamos, neste trabalho, com base na semântica cognitiva, uma análise do significado, em contexto, dos auxiliares modais poder, precisar e dever. Analisamos 120 textos produzidos por candidatos ao vestibular e por alunos do ensino fundamental, como resposta da questão número três da prova discursiva de Língua Portuguesa do vestibular 2005 da UFRN, que pede aos candidatos para explicitar a diferença de sentido entre três frases, observando o uso desses três verbos. Consideramos que um item lexical não é incorporado a uma representação lingüística semântica fixa, limitada e única, mas antes, é ligado a uma representação lingüística semântica flexível e aberta que provê acesso a muitas concepções e sistemas conceituais dependente de cada contexto determinado. Com base em seu significado, um item lexical evoca um grupo de domínios cognitivos, que por sua vez, apresentam um determinado conteúdo conceitual. Isto implica em afirmar que a rede de significados lexicais vai variar conforme o conhecimento de mundo de cada um (LANGACKER, 2000). A relevância deste trabalho é proporcionar uma contribuição para a descrição semântica do português
We present in this work, based on cognitive semantics, an analysis of the meaning in context of the modal auxiliaries poder, precisar and dever ('can', 'need' and 'must'). We analysed 120 texts produced by applicants for university entrance examinations and by primary school students in answer to question number three of the Portuguese Language discursive test of the 2005 entrance examinations for UFRN, which asked the candidates to make explicit the difference in meaning between three sentences, observing the use of those three verbs. We consider that a lexical item is not tied to a fixed, limited and unique semantic representation but is instead linked to an open and flexible semantic representation that provides access to many conceptions and conceptual systems, depending on each particular context. Based on its meaning, a lexical item evokes a group of cognitive domains, which in turn present a determined conceptual content. This makes it possible to affirm that the network of lexical meanings will vary according to the world knowledge each speaker has (LANGACKER, 2000). The relevance of this work is to provide a contribution to the semantic description of Portuguese.
APA, Harvard, Vancouver, ISO, and other styles
42

Carter, Kelli Patrice. "Investigating Student Conceptual Understanding of Structure and Function by Using Formative Assessment and Automated Scoring Models." Scholar Commons, 2019. https://scholarcommons.usf.edu/etd/7761.

Full text
Abstract:
There has been a call from the national community of biologists and biology educators to increase the biological literacy of undergraduate students, including understanding and application of core concepts. The structure-function relationship is a core concept identified by the wider biology community and by physiology faculty. Understanding of this core concept across multiple levels of organization may promote biological literacy. My research focused on the development of formative written assessment tools to provide insight into student understanding of structure and function in anatomy and physiology. In chapter two I developed automated scoring tools to facilitate the evaluation of written formative assessment based on structure and function. Formative written assessments allow students to demonstrate their thinking by encouraging them to use their diverse ideas to construct their responses. However, formative written assessments are not often used in the undergraduate biology classroom due to barriers such as time spent grading and the intricacy of interpreting student responses. Automated scoring, such as lexical analysis and machine scoring, can examine student thinking in formative written responses. The core concept of structure-function provides a foundation upon which many topics in anatomy and physiology can be built across all levels of organization. My research objective was to examine student understanding of a core concept in anatomy and physiology by using automated scoring. Ten short answer questions were administered to students in a junior-level General Physiology course and a sophomore-level Human Anatomy and Physiology course at a large Southeastern public university, and to students in Human Anatomy and Physiology courses at two Southeastern two-year colleges. Seventeen students were interviewed to determine whether their responses to the short answer questions accurately reflected their thinking. Lexical analysis and machine scoring were used to build predictive models that can analyze student thinking about the structure-function relationship in anatomy and physiology with high agreement with human scoring. Less than half of the student responses in this study demonstrated conceptual understanding of the structure-function relationship. Automated scoring can successfully evaluate a large number of student responses in Human Anatomy and Physiology and General Physiology courses. In chapter three I compared conceptual understanding of structure and function in two-year and four-year student responses. Anatomy and physiology is taught at a variety of institutions, including two-year community colleges and four-year research universities. Regardless of the type of institution offering anatomy and physiology, conceptual understanding of the structure-function relationship is necessary to understand physiological processes. The focus of my research was to compare the conceptual understanding of two-year versus four-year anatomy and physiology students by using written formative assessment. I hypothesize that differences in students' academic readiness between two-year and four-year institutions may affect conceptual understanding and student performance.
Based on prior research, I predicted that there would be a difference in conceptual understanding of the core concept of structure and function between two-year and four-year students in anatomy and physiology, and that the students at the two-year institution would not perform as well as the students at the four-year institution, as measured by performance on the constructed-response questions. Responses to eight short-answer essay questions were collected from human anatomy and physiology students at both types of institutions over six semesters. My results demonstrated a difference in conceptual understanding of the structure-function relationship between two-year and four-year students in anatomy and physiology, with more four-year students mentioning structure-function concepts in their responses than two-year students. A potential reason for this difference may be college readiness. There was no difference in performance between institution types on the structure-function concepts examined in the A&P II course. My results suggested that students may benefit from a focus on core concepts within the content of anatomy and physiology courses. This focus should occur in both the first and second semesters of anatomy and physiology. Instructors can use written formative assessment to allow students to demonstrate their conceptual understanding within the organ systems. In chapter four I investigated how question features affect student responses to anatomy and physiology formative assessment questions. Short-answer essay questions contain features, elements of the question that aid students in connecting the question to their existing knowledge. Varying the features of a question can provide insight into the different stages of students' emerging biological expertise and can differentiate novice students who have memorized an explanation from those who exhibit understanding. I examined the cognitive level of questions, the use of a guiding context or references in question prompts, and the order of questions, and how these features elicit student explanations of the core concept of structure and function in anatomy and physiology. I hypothesized that varying the features of short-answer questions may affect student explanations. Short-answer questions based on the core concept of structure and function were administered to 767 students in a junior-level General Physiology course and to 573 students in a sophomore-level Human Anatomy and Physiology course at a large southeastern public university. Student responses were first human-scored and then scored by using lexical analysis and machine scoring. Students were interviewed to examine their familiarity with levels of organization and to confirm their interpretation of the questions. Students demonstrated more conceptual understanding of four of the structure-function concepts when answering the 'understand' questions and more conceptual understanding of two structure-function concepts when answering the 'apply' questions. The question prompts provided a different context, which may have influenced student explanations. There was no difference in conceptual understanding of the structure-function relationship with and without the use of a guiding context in the wording of the question prompt. For question sequence, students performed better on the last questions in the sequence, regardless of whether the last question was easier or more difficult.
Instructors should provide students with questions in varying contexts and at varying cognitive levels, which will allow students to demonstrate their heterogeneous ideas about a concept.
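A rough sketch of the lexical-analysis-plus-machine-scoring approach this abstract describes: lexical features extracted from human-scored responses train a classifier, and machine-human agreement is checked with Cohen's kappa. The responses, scores, features and learner below are illustrative assumptions, not the dissertation's actual models or data.

```python
# Hypothetical automated-scoring sketch: lexical features + classifier,
# evaluated by machine-human agreement (Cohen's kappa).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import cohen_kappa_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Invented human-scored responses (1 = expresses the structure-function
# concept, 0 = does not).
responses = [
    "Alveoli have thin walls and a large surface area, which speeds gas exchange.",
    "The folded inner membrane gives mitochondria more area to make ATP.",
    "Villi increase the intestine's surface so more nutrients are absorbed.",
    "Breathing moves air in and out of the lungs.",
    "The heart pumps blood around the body.",
    "Digestion breaks food down into smaller pieces.",
]
scores = [1, 1, 1, 0, 0, 0]

X_train, X_test, y_train, y_test = train_test_split(
    responses, scores, test_size=1 / 3, stratify=scores, random_state=0)

model = Pipeline([
    ("lexical_features", TfidfVectorizer(ngram_range=(1, 2))),  # word/bigram weights
    ("classifier", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)

# Agreement between machine and human scores on held-out responses.
print("Cohen's kappa:", cohen_kappa_score(y_test, model.predict(X_test)))
```

In practice such models are trained on hundreds of scored responses per question; the point here is only the shape of the pipeline.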
APA, Harvard, Vancouver, ISO, and other styles
43

Belkacem, Thiziri. "Neural models for information retrieval : towards asymmetry sensitive approaches based on attention models." Thesis, Toulouse 3, 2019. http://www.theses.fr/2019TOU30167.

Full text
Abstract:
Ce travail se situe dans le contexte de la recherche d'information (RI) utilisant des techniques d'intelligence artificielle (IA) telles que l'apprentissage profond (DL). Il s'intéresse à des tâches nécessitant l'appariement de textes, telles que la recherche ad-hoc, le domaine des questions-réponses et l'identification des paraphrases. L'objectif de cette thèse est de proposer de nouveaux modèles, utilisant les méthodes de DL, pour construire des modèles d'appariement basés sur la sémantique de textes, et permettant de pallier les problèmes de l'inadéquation du vocabulaire relatifs aux représentations par sac de mots, ou bag of words (BoW), utilisées dans les modèles classiques de RI. En effet, les méthodes classiques de comparaison de textes sont basées sur la représentation BoW qui considère un texte donné comme un ensemble de mots indépendants. Le processus d'appariement de deux séquences de texte repose sur l'appariement exact entre les mots. La principale limite de cette approche est l'inadéquation du vocabulaire. Ce problème apparaît lorsque les séquences de texte à apparier n'utilisent pas le même vocabulaire, même si leurs sujets sont liés. Par exemple, la requête peut contenir plusieurs mots qui ne sont pas nécessairement utilisés dans les documents de la collection, notamment dans les documents pertinents. Les représentations BoW ignorent plusieurs aspects, tels que la structure du texte et le contexte des mots. Ces caractéristiques sont très importantes et permettent de différencier deux textes utilisant les mêmes mots et dont les informations exprimées sont différentes. Un autre problème dans l'appariement de texte est lié à la longueur des documents. Les parties pertinentes peuvent être réparties de manières différentes dans les documents d'une collection. Ceci est d'autant plus vrai dans les documents volumineux qui ont tendance à couvrir un grand nombre de sujets et à inclure un vocabulaire variable. Un document long pourrait ainsi comporter plusieurs passages pertinents qu'un modèle d'appariement doit capturer. Contrairement aux documents longs, les documents courts sont susceptibles de concerner un sujet spécifique et ont tendance à contenir un vocabulaire plus restreint. L'évaluation de leur pertinence est en principe plus simple que celle des documents plus longs. Dans cette thèse, nous avons proposé différentes contributions répondant chacune à l'un des problèmes susmentionnés. Tout d'abord, afin de résoudre le problème d'inadéquation du vocabulaire, nous avons utilisé des représentations distribuées des mots (plongement lexical) pour permettre un appariement basé sur la sémantique entre les différents mots. Ces représentations ont été utilisées dans des applications de RI où la similarité document-requête est calculée en comparant tous les vecteurs de termes de la requête avec tous les vecteurs de termes du document, indifféremment. Contrairement aux modèles proposés dans l'état de l'art, nous avons étudié l'impact des termes de la requête concernant leur présence/absence dans un document. Nous avons adopté différentes stratégies d'appariement document/requête. L'intuition est que l'absence des termes de la requête dans les documents pertinents est en soi un aspect utile à prendre en compte dans le processus de comparaison. En effet, ces termes n'apparaissent pas dans les documents de la collection pour deux raisons possibles : soit leurs synonymes ont été utilisés ; soit ils ne font pas partie du contexte des documents en question.
This work is situated in the context of information retrieval (IR) using machine learning (ML) and deep learning (DL) techniques. It concerns tasks requiring text matching, such as ad-hoc retrieval, question answering and paraphrase identification. The objective of this thesis is to propose new approaches, using DL methods, to construct semantic-based models for text matching, and to overcome the vocabulary mismatch problems related to the classical bag-of-words (BoW) representations used in traditional IR models. Indeed, traditional text matching methods are based on the BoW representation, which considers a given text as a set of independent words. The process of matching two sequences of text then relies on exact matching between words. The main limitation of this approach is vocabulary mismatch. This problem occurs when the text sequences to be matched do not use the same vocabulary, even if their subjects are related. For example, the query may contain several words that are not necessarily used in the documents of the collection, including the relevant documents. BoW representations also ignore several aspects of a text sequence, such as the structure of the text and the context of words. These characteristics are important and make it possible to differentiate between two texts that use the same words but express different information. Another problem in text matching is related to the length of documents. The relevant parts can be distributed in different ways across the documents of a collection. This is especially true of long documents, which tend to cover a large number of topics and to use a varied vocabulary. A long document may thus contain several relevant passages that a matching model must capture. Unlike long documents, short documents are likely to concern a specific subject and tend to contain a more restricted vocabulary. Assessing their relevance is in principle simpler than assessing that of longer documents. In this thesis, we propose different contributions, each addressing one of the above-mentioned issues. First, in order to address the vocabulary mismatch problem, we used distributed representations of words (word embeddings) to allow semantic matching between different words. Such representations have been used in IR applications where document-query similarity is computed by comparing all the term vectors of the query with all the term vectors of the document indiscriminately. Unlike the models proposed in the state of the art, we studied the impact of query terms with regard to their presence or absence in a document, adopting different document-query matching strategies. The intuition is that the absence of query terms from the relevant documents is in itself a useful signal to take into account in the matching process. Indeed, these terms may not appear in the documents of the collection for two possible reasons: either their synonyms have been used, or they are not part of the context of the documents in question. The methods we propose make it possible, on the one hand, to perform an inexact matching between the document and the query and, on the other hand, to evaluate the impact of the different terms of a query in the matching process. Although the use of word embeddings allows semantic matching between different text sequences, these representations combined with classical matching models still treat the text as a list of independent elements (a bag of vectors instead of a bag of words).
However, the structure of the text and the order of its words are important: any change in the structure of a text and/or in its word order alters the information expressed. To address this problem, neural models were used for text matching.
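A minimal sketch of the embedding-based matching idea summarized above (not the thesis's actual models): each query term is scored by its best cosine match among document terms, so a term absent from the document can still contribute through a near-synonym, while truly out-of-context terms score low. The toy vectors are hypothetical stand-ins for trained word embeddings such as word2vec or GloVe.

```python
import numpy as np

# Hypothetical 3-d embeddings; real ones are typically 100-300-d.
embeddings = {
    "car":     np.array([0.9, 0.1, 0.0]),
    "vehicle": np.array([0.8, 0.2, 0.1]),
    "engine":  np.array([0.1, 0.9, 0.2]),
    "repair":  np.array([0.0, 0.3, 0.9]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def match_score(query_terms, doc_terms):
    """Average best-match similarity of each query term in the document.
    Exact matches score 1.0; absent terms fall back to their closest
    semantic neighbour, which softens vocabulary mismatch."""
    scores = []
    for q in query_terms:
        if q in doc_terms:
            scores.append(1.0)                         # exact match
        elif q in embeddings:
            sims = [cosine(embeddings[q], embeddings[d])
                    for d in doc_terms if d in embeddings]
            scores.append(max(sims) if sims else 0.0)  # semantic match
        else:
            scores.append(0.0)                         # out-of-vocabulary term
    return sum(scores) / len(scores)

# "car" is absent from the document, but "vehicle" is a close neighbour.
print(match_score(["car", "repair"], ["vehicle", "engine"]))
```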
APA, Harvard, Vancouver, ISO, and other styles
44

Balza, Tardaguila Irene. "Syntactic structure and modal interpretation : the case of Basque "behar"." Thesis, Bordeaux 3, 2018. http://www.theses.fr/2018BOR30070.

Full text
Abstract:
Cette thèse est une investigation de la structure syntaxique et de l'interprétation modale des phrases impliquant le modal dénominal de nécessité behar ‘devoir, falloir, avoir besoin’ et un complément infinitif. La thèse analyse le statut syntaxique des compléments non finis du verbe modal denominal behar en examinant leur interaction avec des phénomènes syntaxiques sensibles à des conditions structurelles et de localité diverses, et conclut que les compléments d’infinitif de behar peuvent correspondre à différentes structures sous-jacentes. Le type d'infinitif le plus complexe du point de vue structurel est un infinitif non-restructurant qui projette une architecture de phrase complète (c.-à-d. une CP), et le plus petit est un infinitif réduit de restructuration qui projette une structure de phrase de niveau vP. Il y a des preuves pour l'existence des types intermédiaires projetant jusqu'au domaine flexionnel (IP / TP). D'autre part, la thèse examine les propriétés thématiques et de portée des sujets dans chacun des différents types structurels et l'interprétation modale à laquelle elles donnent cours. Sur la base de cette analyse, la thèse soutient que l'interprétation modale n'est déterminée par aucun facteur en particulier (la présence de la restructuration, le statut référentiel du sujet et sa portée relative vis-à-vis du prédicat modal, parmi d'autres fréquemment mentionnés), mais dépend de l'effet cumulatif de plusieurs facteurs travaillant ensemble. La thèse montre également la nécessité d'adopter une vision plus fine de la modalité radicale (root modality), qui permet une association plus simple entre structures syntaxiques et significations modales
This dissertation is an investigation of the syntactic structure and modal interpretation of clauses involving the denominal necessity predicate behar ‘need’ and an infinitival complement. On the one hand, it analyses the syntactic status of non-finite complements of denominal behar by examining their interaction with syntactic phenomena sensitive to different structural and locality conditions, and concludes that the infinitival complements of behar can correspond to different underlying structures. The largest type of infinitive is a non-restructuring infinitive that projects a full clausal architecture (i.e. a CP), and the smallest one is a reduced restructuring infinitive that projects up to vP. There is evidence for intermediate types projecting up to the inflectional domain (IP/TP). On the other hand, the dissertation examines the thematic and scope properties of the subjects in each of the different structural types and the modal interpretation that they can give rise to. On the basis of this analysis it is argued that modal interpretation is not constrained by any single factor (the presence of restructuring, the referential status of the subject and its relative scope vis-à-vis the modal predicate, among other frequently mentioned ones), but depends on the cumulative effect of several factors working together. The dissertation also shows the necessity of adopting a more fine-grained view of root modality, one that allows a simpler mapping of syntactic structures into modal meanings
APA, Harvard, Vancouver, ISO, and other styles
45

Bosse, Marie-Line. "L'acquisition et la mobilisation des connaissances lexicales orthographiques : tests d'hypothèses développementales issues du modèle de lecture de Ans, Carbonnel et Valdois (1998)." Grenoble 2, 2004. http://www.theses.fr/2004GRE29038.

Full text
Abstract:
Cette thèse étudie l'acquisition des connaissances orthographiques lexicales chez l'enfant. Les hypothèses testées ont été élaborées à partir du modèle multi-trace de lecture experte de Ans, Carbonnel, et Valdois (1998). Une première série d'études permet de conforter l'hypothèse d'un fonctionnement analogique chez l'enfant en montrant notamment que des connaissances orthographiques lexicales peuvent être acquises dès le début de l'apprentissage et peuvent être activées lors du traitement de mots nouveaux. La suite de la thèse explore l'hypothèse selon laquelle l'acquisition des connaissances orthographiques dépendrait non seulement des traitements phonologiques, mais également de l'efficacité des traitements visuo-attentionnels. Une série de recherches étudie la perturbation du traitement visuo-attentionnel chez les dyslexiques. Elle met en évidence un trouble du traitement visuo-attentionnel, accompagné d'un déficit marqué des connaissances lexicales, dans certains cas de dyslexie développementale. Une dernière série d'études examine l'implication du traitement visuo-attentionnel lors de l'acquisition normale des connaissances lexicales orthographiques au cours de l'école élémentaire. Les résultats montrent que le traitement visuo-attentionnel est fortement prédictif des connaissances lexicales orthographiques de l'enfant, même après contrôle de la part prédite par le traitement phonologique, le QI et la mémoire verbale à court terme. Ces travaux apportent des éléments en faveur de l'hypothèse selon laquelle l'acquisition des connaissances orthographiques dépendrait en partie de l'efficacité des traitements visuo-attentionnels.
This research studies the acquisition of orthographic knowledge in children. Hypotheses were derived from the multi-trace model of expert reading of Ans, Carbonnel, and Valdois (1998). A first series of experiments confirmed the existence of an analogical process in children: the studies showed that children can acquire lexical orthographic knowledge from the beginning of literacy learning, and that this knowledge can be activated during the processing of new words. The next part of the research tested the hypothesis that the acquisition of orthographic knowledge depends not only on phonological processing but also on the efficiency of visual attentional processing. To do so, a second series of experiments studied the impairment of visual attentional processing in dyslexic children. These experiments evidenced, in some dyslexic children, both a visual attentional processing impairment and a marked deficit in lexical orthographic knowledge. A last series of studies examined the involvement of visual attentional processing in the normal acquisition of orthographic knowledge. Results show that visual attentional processing is highly predictive of orthographic knowledge in children from first to fifth grade, even after controlling for the variance predicted by phonological processing, IQ and verbal short-term memory. These studies on large samples of children, with consistent results for reading and spelling, provide convincing arguments for the hypothesis that the acquisition of orthographic knowledge depends not only on phonological processing but also on the efficiency of visual attentional processing.
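The control logic behind this finding can be illustrated with a generic hierarchical regression (simulated data, not the dissertation's measures): enter phonological processing, IQ and verbal short-term memory first, then test how much variance in orthographic knowledge visual attentional processing adds.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 120
phono = rng.normal(size=n)   # phonological processing
iq = rng.normal(size=n)
vstm = rng.normal(size=n)    # verbal short-term memory
va = rng.normal(size=n)      # visual attentional processing
# Simulated outcome in which visual attention genuinely contributes.
ortho = 0.4 * phono + 0.2 * iq + 0.1 * vstm + 0.5 * va + rng.normal(size=n)

base = sm.OLS(ortho, sm.add_constant(np.column_stack([phono, iq, vstm]))).fit()
full = sm.OLS(ortho, sm.add_constant(np.column_stack([phono, iq, vstm, va]))).fit()

# Variance explained by visual attentional processing beyond the controls.
print(f"R2 base = {base.rsquared:.3f}, R2 full = {full.rsquared:.3f}")
print(f"incremental R2 = {full.rsquared - base.rsquared:.3f}")
```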
APA, Harvard, Vancouver, ISO, and other styles
46

Santos, Anderson Roberto Santos dos. "A computational investigation of verbs during aging with and without Alzheimer’s disease." Biblioteca Digital de Teses e Dissertações da UFRGS, 2011. http://hdl.handle.net/10183/119124.

Full text
Abstract:
A doença de Alzheimer produz alterações nas funções cognitivas, entre elas, nos processos que são responsáveis pela linguagem e pela memória. Com o intuito de termos uma melhor compreensão das alterações da linguagem, este trabalho investigou características presentes em redes semânticas de pacientes com diagnóstico de provável Alzheimer, com foco nos verbos. Os resultados das comparações entre as redes de indivíduos saudáveis e as de pacientes com Alzheimer indicam diferenças topológicas entre elas. Neste trabalho, também foram construídos classificadores que poderiam captar as diferenças entre os vários perfis de indivíduos, e que podem ser utilizados para classificar novos indivíduos de acordo com o perfil mais próximo. Esse esforço se deu com o intuito de ajudar no diagnóstico de doenças que afetam a linguagem, como a doença de Alzheimer.
Alzheimer’s disease produces alterations of cognitive functions, including the processes responsible for language and memory. In order to gain a better understanding of language changes, we investigated the characteristics of the semantic networks of patients diagnosed with probable Alzheimer’s disease, focusing on verbs. The results of comparisons between the networks of healthy individuals and those of patients with Alzheimer’s disease highlight topological differences between them. We also constructed classifiers that capture the differences between the various speaker profiles and that can be used to classify unknown speakers according to the closest profile. We made this effort in order to help the diagnosis of diseases that affect language, such as Alzheimer’s disease.
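The kind of comparison reported here can be illustrated with a small sketch (hypothetical word pairs, not the study's corpus): build a semantic network per speaker group from verb co-occurrences and compare simple topological measures, which could in turn serve as features for the classifiers.

```python
import networkx as nx

def build_network(cooccurrence_pairs):
    """Semantic network: verbs are nodes, co-occurrences are edges."""
    g = nx.Graph()
    g.add_edges_from(cooccurrence_pairs)
    return g

def topology(g):
    """A few topological measures often compared across groups."""
    return {
        "nodes": g.number_of_nodes(),
        "edges": g.number_of_edges(),
        "avg_degree": sum(d for _, d in g.degree()) / g.number_of_nodes(),
        "clustering": nx.average_clustering(g),
    }

# Invented verb co-occurrence pairs for two speaker profiles.
healthy = build_network([("run", "walk"), ("walk", "jump"), ("run", "jump"),
                         ("eat", "drink"), ("eat", "cook")])
patient = build_network([("run", "walk"), ("eat", "drink")])

print("healthy:", topology(healthy))
print("patient:", topology(patient))
```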
APA, Harvard, Vancouver, ISO, and other styles
47

Aymoré, Debora de Sá Ribeiro. "O modelo de Historiografia da ciência Kuhniano: da obra A estrutura das revoluções científicas aos ensaios tardios." Universidade de São Paulo, 2010. http://www.teses.usp.br/teses/disponiveis/8/8133/tde-26102010-093744/.

Full text
Abstract:
O objetivo central de nosso trabalho é analisar criticamente os aspectos centrais do modelo de historiografia da ciência proposto por Thomas Kuhn (1922-1996). Para alcançar este objetivo, começaremos o nosso exame com A estrutura das revoluções científicas (1962), que contém a primeira formulação mais completa sobre a estrutura de desenvolvimento da ciência, juntamente com o Posfácio de 1969. Em seguida, investigaremos alguns dos ensaios publicados nas coletâneas A tensão essencial (1977) e O caminho desde a estrutura (2000). Ao final da análise veremos que a historiografia de Kuhn tem como base o postulado da história real da ciência e os pressupostos da relação entre a história e a filosofia da ciência, da centralidade do paradigma, da pluralidade de leituras de texto e da relação entre história interna e externa da ciência.
The central aim of our work is to critically examine the main aspects of the historiography of science proposed by Thomas Kuhn (1922-1996). To achieve this goal, we will begin our examination with The structure of scientific revolutions (1962), which contains the first and most complete formulation of the structure of scientific development, together with the Postscript of 1969. We will then investigate some essays in the collections The essential tension (1977) and The road since Structure (2000). After this analysis, we find that Kuhn's historiography is based on the postulate of the real history of science and on the assumptions of the relationship between history and philosophy of science, the centrality of the paradigm, the plurality of readings of a text, and the relationship between the internal and external history of science.
APA, Harvard, Vancouver, ISO, and other styles
48

Nobre, Alexandre de Pontes. "Processamento léxico-semântico : relações com reconhecimento visual de palavras e compreensão de leitura textual." Biblioteca Digital de Teses e Dissertações da UFRGS, 2013. http://hdl.handle.net/10183/101860.

Full text
Abstract:
Esta dissertação teve como objetivo investigar as relações entre reconhecimento de palavras e compreensão de leitura textual e o processamento léxico. A dissertação é constituída de dois estudos. No primeiro estudo, são revisados modelos de leitura de palavras e de texto com o objetivo de examinar o papel do processamento léxico-semântico no reconhecimento visual de palavras e na compreensão de leitura textual. O paradigma de priming semântico é apresentado como uma ferramenta para a investigação da relação entre processamento léxico-semântico e ambos os componentes de leitura examinados. São apresentados os principais modelos teóricos de priming semântico, juntamente com uma revisão dos estudos empíricos que relacionam priming semântico e leitura, e algumas conclusões e perspectivas de investigação são apresentadas. No segundo estudo, foram investigadas empiricamente as relações entre processamento léxico-semântico e leitura (reconhecimento visual de palavras e compreensão de leitura textual) em uma amostra de 68 crianças, de 7 a 12 anos, de escolas particulares de Porto Alegre. O processamento léxico-semântico foi avaliado através de uma tarefa de decisão lexical no paradigma de priming semântico, enquanto as habilidades de leitura foram medidas por uma tarefa de leitura de palavras/pseudopalavras isoladas e uma tarefa de compreensão de leitura textual (resposta a questões e reconto de história). Foram investigadas correlações entre efeitos de priming semântico e desempenho em tarefas de leitura de palavras e compreensão de leitura textual e se o priming semântico prediz o desempenho dos participantes nas tarefas de leitura. Os resultados mostraram que o priming semântico se correlaciona com ambas as medidas de leitura, e que o reconhecimento de palavras medeia parcialmente a relação entre processamento léxico-semântico e compreensão de leitura textual.
The aim of this dissertation was to investigate the relationships of word recognition and reading comprehension with lexical-semantic processing. The dissertation is composed of two studies. In the first study, models of word reading and reading comprehension are reviewed in order to examine the role of lexical-semantic processing in visual word recognition and in reading comprehension. The semantic priming paradigm is presented as an instrument for investigating the relationships between lexical-semantic processing and the components of reading examined. The main theoretical models of semantic priming are presented, a review of studies relating semantic priming and reading is conducted, and some conclusions and perspectives for investigation are offered. In the second study, relations between lexical-semantic processing and reading (visual word recognition and reading comprehension) were investigated empirically in a sample of 68 children, aged seven to twelve years, from private schools in Porto Alegre, Brazil. Lexical-semantic processing was evaluated with a lexical decision task in the semantic priming paradigm, and reading abilities were assessed with a word/nonword reading task and a reading comprehension task (questionnaire and story retelling). Correlations between semantic priming effects and word reading and reading comprehension were investigated, as well as whether semantic priming effects predict performance on the reading tasks. Results showed that semantic priming correlates with both sets of reading measures, and that word reading partially mediates the relation between lexical-semantic processing and reading comprehension.
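The core analysis pattern can be made concrete with a small sketch (all numbers invented): compute each child's semantic priming effect as the reaction-time difference between unrelated-prime and related-prime trials in the lexical decision task, then correlate it with a reading measure.

```python
import numpy as np
from scipy.stats import pearsonr

# Mean lexical-decision reaction times (ms) per child, by prime condition.
rt_related = np.array([620, 655, 700, 580, 640])
rt_unrelated = np.array([680, 690, 735, 600, 695])
reading_score = np.array([34, 28, 22, 40, 30])  # e.g. words read correctly

priming_effect = rt_unrelated - rt_related  # larger = stronger priming

r, p = pearsonr(priming_effect, reading_score)
print("priming effects (ms):", priming_effect)
print(f"correlation with reading: r = {r:.2f}, p = {p:.3f}")
```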
APA, Harvard, Vancouver, ISO, and other styles
49

Oliveira, Ana Flávia Souto de. "A multiplicidade semântica em learners' dictionaries : por uma abordagem semântico-cognitiva para a organização das acepções." Biblioteca Digital de Teses e Dissertações da UFRGS, 2015. http://hdl.handle.net/10183/130772.

Full text
Abstract:
A multiplicidade semântica, isto é, o fato de uma forma linguística apresentar mais de um significado ou nuance contextual, é um fenômeno que, apesar de não trazer grandes problemas à comunicação cotidiana, impõe diversas questões às teorias semântico-lexicais e a suas aplicações linguísticas. Neste trabalho, buscamos compreender o tratamento dispensado à multiplicidade semântica por learners’ dictionaries do inglês (dicionários monolíngues para aprendizes avançados de inglês como língua estrangeira) à luz do quadro teórico da Semântica Cognitiva Lexical. Em um primeiro momento, sistematizamos as implicações trazidas pela multiplicidade semântica para os learners’ dictionaries, principalmente no que diz respeito aos procedimentos de lumping e splitting, ao tipo de solução adotada para a divisão de verbetes para as formas lexicais (homonímica ou polissêmica) e aos critérios de organização das acepções nos verbetes. Com relação a esses aspectos, demonstramos que não há uma base teórica sólida que permita definir quantos e quais significados um item lexical apresenta, que não há consenso quanto ao tratamento das soluções nesse tipo de dicionário e que a frequência – critério utilizado pelas obras para organizar as acepções – não é tão objetivo quanto se esperaria, nem tem respaldo empírico para a utilização nesse tipo de dicionário. Assim, defendemos que, por conta do caráter interpretativo da descrição semântico-lexical, evidenciada pela flexibilidade no tratamento dispensado pelas obras às questões da multiplicidade semântica, seja buscada uma abordagem distinta a essas questões. Com esse intuito, em um segundo momento, introduzimos a concepção semântico-cognitiva de estrutura semasiológica, que, ancorada em postulados da Teoria Prototípica, destaca a sobreposição e a saliência semântica como características estruturais do léxico, postulados que contemplam a flexibilidade e a instabilidade do significado lexical. Na busca por subsídios metodológicos que fundamentem uma proposta alternativa aos problemas lexicográficos, apresentamos alguns dos modelos de descrição da estrutura semântica dos itens lexicais propostos pelo paradigma cognitivo: o Modelo Radial, o Modelo Esquemático e o Modelo de Grupos em Sobreposição. Avaliamos a estrutura semasiológica do item lexical case e propomos formas alternativas para sua representação nos learners’ dictionaries que condizem tanto com os postulados semântico-cognitivos, quanto com o que se sabe sobre o tipo de dicionário em questão e as necessidades de seus usuários. Com relação às soluções homonímica e polissêmica, sugerimos quatro configurações possíveis que permitem representar a estrutura coesa da polissemia, mas cuja validação depende ainda de testes de uso. Quanto à organização das acepções, consideramos que, mesmo através do uso de uma estrutura hierárquica, é possível representar os fenômenos de sobreposição e saliência semântica que julgamos relevantes para os usuários dessas obras, por exemplo, através do uso dos próprios recursos possibilitados pela hierarquia e da redação de definições que destaquem atributos compartilhados por dois ou mais significados que não podem ser relacionados na estrutura linear do verbete. Assim, uma abordagem semântico-cognitiva parece ser útil para nortear práticas lexicográficas relativas à estruturação das informações sobre a multiplicidade semântica nos learners’ dictionaries.
Semantic multiplicity can be defined as a case in which a single linguistic form presents more than one meaning or contextual reading. Even though this phenomenon usually does not pose serious challenges for everyday communication, it certainly raises important issues for lexical semantic theories and their linguistic applications. The present dissertation aims at evaluating the treatment semantic multiplicity receives in English monolingual advanced learners’ dictionaries from a cognitive-semantic point of view. Therefore, the consequences of semantic multiplicity for the organization of learners’ dictionaries are presented, mainly with regard to the procedures of lumping and splitting, the solutions applied for structuring entries (homonymic or polysemous solutions), and the criteria used in arranging senses. First, it is demonstrated that there are no solid methodological bases on which to decide how many (or which) senses a lexical item has. Second, it is shown that there is no agreement on the solution to be applied to this type of dictionary. Third, it is advocated that the criterion used for sense arrangement (frequency) is not as objective as one would expect and has not yet been proven to bring any advantage for the users of learners’ dictionaries. Because of the interpretative nature of lexical semantic description, which is reflected in the different treatments the dictionaries provide to these matters, a distinct approach is sought. To this end, the cognitive-semantic conception of semasiological structure is introduced. With its origins linked to Prototype Theory tenets, this notion highlights that semantic salience and overlapping are structural characteristics of the lexicon, which reflect the flexibility and instability of lexical meaning. In order to search for methods that could ground an alternative proposal for these lexicographic issues, the cognitive-semantic descriptive models of semantic structure are assessed: the radial model, the schematic model, and the overlapping sets model. The semasiological structure of the lexical item case is described and a new proposal for its organization is provided, in tune with cognitive-semantic tenets and with what is known about this type of dictionary and its users’ needs. Regarding homonymic and polysemous solutions, four different arrangements are suggested, which represent the coherent structure of polysemy. Regarding sense arrangement, it is shown that even through the use of a hierarchical structure, it is possible to represent the semantic overlapping and salience found to be useful for the users of learners’ dictionaries. By exploring the hierarchical resources themselves and by manipulating the wording of definitions, it is feasible to accentuate attributes shared by two or more senses that cannot be related in the linear structure of the dictionary entry. Thus, Cognitive Semantics presents itself as a useful approach to guide lexicographic practices related to the structuring of semantic multiplicity information in learners’ dictionaries.
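The overlapping-sets idea can be pictured with a toy representation (the senses and attributes below are drastic simplifications, not the dissertation's actual analysis of case): senses are attribute sets, and sense clusters emerge from their overlaps, which a strictly linear entry cannot display.

```python
# Toy semasiological structure: each sense of "case" as a set of
# semantic attributes; overlaps reveal clusters of related senses.
senses = {
    "container":  {"concrete", "object", "holds_things"},
    "instance":   {"abstract", "situation"},
    "legal_case": {"abstract", "situation", "institutional"},
    "patient":    {"abstract", "situation", "institutional", "person"},
}

def overlap(s1, s2):
    """Attributes shared by two senses (their semantic overlap)."""
    return senses[s1] & senses[s2]

# "instance", "legal_case" and "patient" form an overlapping cluster,
# while "container" stands apart.
for a in senses:
    for b in senses:
        if a < b and overlap(a, b):
            print(a, "&", b, "->", sorted(overlap(a, b)))
```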
APA, Harvard, Vancouver, ISO, and other styles
50

Khuc, Vinh Ngoc. "Approaches to Automatically Constructing Polarity Lexicons for Sentiment Analysis on Social Networks." The Ohio State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=osu1343187623.

Full text
APA, Harvard, Vancouver, ISO, and other styles