
Dissertations / Theses on the topic 'Semantic analyses'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Semantic analyses.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses from a wide variety of disciplines and organise your bibliography correctly.

1

Ho, Man-yee. "Trendy expressions in Hong Kong Cantonese morphological, semantic and pragmatic analyses /." Click to view the E-thesis via HKUTO, 2005. http://sunzi.lib.hku.hk/hkuto/record/B31601029.

Full text
2

Ho, Man-yee 何敏兒. "Trendy expressions in Hong Kong Cantonese: morphological, semantic and pragmatic analyses." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2005. http://hub.hku.hk/bib/B31601029.

Full text
3

Kautz, Oliver. "Model Analyses Based on Semantic Differencing and Automatic Model Repair." Düren: Shaker, 2021. http://d-nb.info/1233548298/34.

Full text
4

Kossmann, Bianca. "Rich and poor in the history of English: corpus-based analyses of lexico-semantic variation and change in Old and Middle English." [S.l. : s.n.], 2007. http://nbn-resolving.de/urn:nbn:de:bsz:25-opus-46897.

Full text
5

Zoltan, Kazi. "Ontološki zasnovana analiza semantičke korektnosti modela podataka primenom sistema automatskog rezonovanja." PhD thesis, Univerzitet u Novom Sadu, Tehnički fakultet Mihajlo Pupin u Zrenjaninu, 2014. https://www.cris.uns.ac.rs/record.jsf?recordId=85033&source=NDLTD&language=en.

Full text
Abstract:
This thesis presents a theoretical study and analysis of existing positions and solutions in the area of data model validation and quality checking. A theoretical model of ontology-based analysis of the semantic correctness of data models using an automated reasoning system is created and practically implemented, and its validity is confirmed by the experimental research conducted. A software application was developed for formalizing data models and mapping ontologies into the form of Prolog clauses. Inference rules in first-order predicate logic were formulated and integrated with the data model and the domain ontology; the semantic correctness of the data model is then checked through queries in the Prolog system. Finally, a metric of the ontological quality of data models is defined, based on the answers of the automated reasoning system.
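The pipeline described above (facts from the data model, facts from the ontology, rules that expose mismatches) can be pictured with a small illustrative sketch. The entity and relation names below are invented for the example, and plain Python stands in for the Prolog reasoner used in the thesis:

```python
# Illustrative sketch of ontology-based correctness checking, not the
# author's Prolog system: a data-model entity is flagged when it has no
# counterpart concept, or an undeclared supertype, in the ontology.

ontology_concepts = {"Person", "Customer", "Order", "Product"}  # hypothetical
subsumptions = {("Customer", "Person")}  # Customer is-a Person

data_model = {                 # entity -> declared supertype (or None)
    "Customer": "Person",
    "Invoice": None,           # no ontology counterpart: suspect
}

def check_semantic_correctness(model, concepts, is_a):
    """Return violations, mimicking the answers of a reasoner query."""
    violations = []
    for entity, parent in model.items():
        if entity not in concepts:
            violations.append(f"{entity}: no matching ontology concept")
        elif parent and (entity, parent) not in is_a:
            violations.append(f"{entity}: '{parent}' is not a known supertype")
    return violations

print(check_semantic_correctness(data_model, ontology_concepts, subsumptions))
# ['Invoice: no matching ontology concept']
```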
6

Malmqvist, Anita. "Sparsamkeit und Geiz, Grosszügigkeit und Verschwendung : ethische Konzepte im Spiegel der Sprache." Doctoral thesis, Umeå universitet, Moderna språk, 2000. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-61584.

Full text
Abstract:
The object of this study is to analyse the lexemes and phraseological units that constitute the semantic fields employed in naming four abstract domains (greed, thrift, generosity, and extravagance) that make up the ethical concept <Attitude to Ownership> in German. On the assumption that ideas are accessible to us through the lexicalised items of a language, recent theories in the field of semantic analysis and conceptualisation were applied to the source material. In each domain key words were identified and their definitions in modern and historical dictionaries were analysed. Various dimensions of meaning, which proved to be inherent in the lexical items, emerged from this analysis. The oppositions a/o (action directed to others vs. to oneself), right/wrong (virtues vs. vices) and too much / too little vs. the ideal mean were established as central. To achieve a more precise description of meaning, tentative explications of cognitive levels were proposed. By means of these the underlying ideas, as they were reflected in the lexical units, could be described. The analysis showed greater variation and expressivity in words, idioms, and proverbs referring to the two vices compared to the virtues. Furthermore, a diachronic study produced evidence of semantic and conceptual changes. On the basis of such observations, conclusions could be drawn about changes in the ethical system. The data derived from a contrastive corpus analysis of the German and Swedish key words showed numerous similarities as well as some conspicuous differences in the conceptualisation and valuation of attitudes pertaining to the four abstract domains. Moreover, the key words denoting the two virtues showed a clear domination in frequency, indicating that these are more central conceptual categories in today's society than the vices. An ongoing shift in meaning could be established for the key words naming the latter. Applying modern theories of metaphor and metonymy, the experiential basis of meaning and thought was explored, showing that the structures forming the ethical concepts studied in this work are grounded in experiences of a physical and socio-cultural nature. The metaphorical concept ILLNESS emerged as a common source domain for the two vices, while the PATH concept was shown to form the basis of metaphors expressing the o-virtue but not the a-virtue. Among the numerous metonymic concepts, HAND proved to be characteristic of all four domains.
7

Gao, Boyang. "Contributions to music semantic analysis and its acceleration techniques." Thesis, Ecully, Ecole centrale de Lyon, 2014. http://www.theses.fr/2014ECDL0044/document.

Full text
Abstract:
Digitized music production has exploded in the past decade, and the resulting volume of data calls for effective and efficient methods for automatic music analysis and retrieval. This thesis focuses on the semantic analysis of music, in particular mood and genre classification, using low-level and mid-level features, since mood and genre are among the most natural semantic concepts expressed by music and perceivable by audiences. To derive semantics from low-level features, feature modeling techniques such as K-means- and GMM-based bag-of-words and Gaussian super vectors are applied; at this scale of data, time efficiency and accuracy become the main issues in low-level feature modeling. The first contribution therefore focuses on accelerating the K-means, GMM and UBM-MAP frameworks, both on a single machine and on clusters of workstations. To achieve maximum speed on a single machine, we show that the dictionary learning procedures can be rewritten elegantly in matrix form, which can be accelerated efficiently by high-performance parallel computing infrastructures such as multi-core CPUs and GPUs; with GPU support and careful tuning, we achieved a speed-up of two orders of magnitude compared with a single-threaded implementation. For datasets that cannot fit into the memory of an individual computer, we show that the K-means and GMM training procedures can be divided into a map-reduce pattern executable on Hadoop and Spark clusters, where our matrix-format version runs 5 to 10 times faster than state-of-the-art libraries. Beyond signal-level features, mid-level features such as harmony, the most natural semantics given by the composer, are also important, since they carry a higher level of abstraction than the physical waveform. The second contribution therefore focuses on recovering note information from the music signal using two levels of musical knowledge: instrument note sounds and note co-occurrence and transition statistics. At the instrument level, a note dictionary is built from the MIDI synthesizer of Logic Pro 9; with this dictionary in hand, we propose a positive constraint matching pursuit (PCMP) algorithm to perform the decomposition. At the inter-note level, we propose a two-stage sparse decomposition approach integrating note statistics: in the frame-level stage, note co-occurrence probabilities guide atom selection and build a sparse multiple-candidate graph providing backup choices for later selections; in the global optimal path-searching stage, note transition probabilities are incorporated. Experiments on multiple datasets show that the proposed approaches outperform the state of the art in accuracy and recall for note recovery and for music mood and genre classification.
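The matrix reformulation behind the single-machine speed-ups can be sketched briefly: since ||x - c||^2 = ||x||^2 - 2x·c + ||c||^2, the K-means assignment step for all points and all centroids reduces to one matrix product, exactly the kind of operation BLAS and GPU libraries accelerate. A minimal NumPy sketch of this idea (an illustration, not the thesis implementation):

```python
import numpy as np

def kmeans_matrix(X, k, iters=20, seed=0):
    """K-means with the distance computation in pure matrix form:
    the assignment step is a single X @ C.T product, which is what
    maps well to multi-core CPUs and GPUs."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)]        # initial centroids
    for _ in range(iters):
        # ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2, for all pairs at once
        d2 = (X**2).sum(1)[:, None] - 2 * X @ C.T + (C**2).sum(1)[None, :]
        labels = d2.argmin(axis=1)                     # assignment step
        for j in range(k):                             # update step
            members = X[labels == j]
            if len(members):
                C[j] = members.mean(axis=0)
    return C, labels

X = np.random.default_rng(1).normal(size=(1000, 13))   # e.g. frames of audio features
centroids, labels = kmeans_matrix(X, k=8)
```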
8

Krull, Kirsten. "Lieber Gott, mach mich fromm ... : Zum Wort und Konzept “fromm” im Wandel der Zeit." Doctoral thesis, Umeå : Institutionen för moderna språk, Umeå univ, 2004. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-286.

Full text
9

Dang, Qinran. "Brouillard de pollution en Chine. Analyse sémantique différentielle de corpus institutionnels, médiatiques et de microblogues." Thesis, Paris, INALCO, 2020. http://www.theses.fr/2020INAL0009.

Full text
Abstract:
As air quality in China has degraded, more and more journalistic articles and microblogs (weibo in Chinese, the equivalent of tweets), coming from government and media websites, social networks, forums and blogs, address the problem of « 雾霾 » (wumai in Chinese, denoting smog) in China from several angles: political, ecological, economic, sociological, health-related, etc. The semantics of the themes addressed in these texts differ significantly according to their textual genre. This thesis has a twofold objective: on the one hand, to identify the different themes of a purpose-built digital corpus relating to wumai, and on the other hand, to interpret the semantics of these themes differentially. We first collect the Chinese-language textual data related to wumai; these texts, drawn from three traditional Chinese websites and a social network, are divided into four textual genres. After a series of preparatory treatments (cleaning, word segmentation, normalization, POS tagging, markup and data organization), we study the characteristics of the four textual genres of the corpus through a set of discriminating variables (hyperstructural, lexical, semiotic, rhetorical, modal and syntactic) distributed at the infratextual and intratextual levels. Then, based on the characteristics of each textual genre, we identify the main themes exposed in each genre of sub-corpus and analyse the semantics of these themes contrastively. The results are interpreted both quantitatively and qualitatively: the quantitative analyses are carried out with textometric tools, and the semantic interpretations fall within the theoretical framework of Interpretative Semantics (Sémantique interprétative) proposed by Rastier (1987).
10

Steinmetz, Nadine. "Context-aware semantic analysis of video metadata." PhD thesis, Universität Potsdam, 2013. http://opus.kobv.de/ubp/volltexte/2014/7055/.

Full text
Abstract:
The Semantic Web provides the information contained in the World Wide Web as machine-readable facts. In comparison to a keyword-based inquiry, semantic search enables a more sophisticated exploration of web documents: by clarifying the meaning behind entities, search results are more precise, and the semantics simultaneously enable an exploration of semantic relationships. However, unlike keyword search, a semantic entity-focused search requires that web documents be annotated with semantic representations of common words and named entities. Manual semantic annotation of (web) documents is time-consuming; in response, automatic annotation services have emerged in recent years. These services take continuous text as input, detect important key terms and named entities, and annotate them with semantic entities contained in widely used knowledge bases such as Freebase or DBpedia. Metadata of video documents require special attention: semantic analysis approaches for continuous text cannot be applied, because the information of a context in video documents originates from multiple sources possessing different reliabilities and characteristics. This thesis presents a semantic analysis approach consisting of a context model and a disambiguation algorithm for video metadata. The context model takes into account the characteristics of video metadata and derives a confidence value for each metadata item, representing the level of correctness and ambiguity of its textual information: the lower the ambiguity and the higher the prospective correctness, the higher the confidence value. The metadata items are analyzed in a specific order from high to low confidence, and previously analyzed metadata are used as reference points in the context for subsequent disambiguation. The contextually most relevant entity is identified by means of descriptive texts and semantic relationships to the context, which is created dynamically for each metadata item. The proposed semantic analysis follows two hypotheses: metadata items of a context should be processed in descending order of their confidence value, and the metadata pertaining to a context should be limited by content-based segmentation boundaries. The evaluation results support both hypotheses and show increased recall and precision for annotated entities, especially for metadata originating from sources with low reliability; the algorithms have been evaluated against several state-of-the-art annotation approaches. The presented semantic analysis process is integrated into a video analysis framework and has been successfully applied in several projects for the purpose of semantic video exploration.
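The processing order at the heart of the approach is easy to state in code: sort metadata items by confidence, then disambiguate each against the context built from the already-resolved items. A schematic Python sketch, in which the candidate and scoring functions are illustrative stand-ins rather than the thesis's algorithms:

```python
# Schematic sketch of confidence-ordered disambiguation; the scoring
# and candidate functions are illustrative, not the thesis algorithms.

def disambiguate(items, candidates_for, relatedness):
    """items: dicts with 'text' and 'confidence'; candidates_for(text)
    returns candidate entities; relatedness(entity, context) scores a
    candidate against the already-resolved entities."""
    context, resolved = [], {}
    # Hypothesis 1: process items in descending order of confidence so
    # that reliable metadata anchor the context for ambiguous ones.
    for item in sorted(items, key=lambda i: i["confidence"], reverse=True):
        candidates = candidates_for(item["text"])
        if not candidates:
            continue
        best = max(candidates, key=lambda e: relatedness(e, context))
        resolved[item["text"]] = best
        context.append(best)      # reference point for later items
    return resolved

# Toy run: the confident item 'France' is resolved first and then
# steers 'Paris' toward the city rather than the celebrity.
items = [{"text": "Paris", "confidence": 0.4},
         {"text": "France", "confidence": 0.9}]
cands = {"France": ["France"], "Paris": ["Paris, France", "Paris Hilton"]}
rel = {("Paris, France", "France"): 1.0, ("Paris Hilton", "France"): 0.1}
print(disambiguate(items, cands.get,
                   lambda e, ctx: sum(rel.get((e, c), 0.0) for c in ctx)))
# {'France': 'France', 'Paris': 'Paris, France'}
```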
11

Tramutoli, Rosanna. "'Love' encoding in Swahili: a semantic description through a corpus-based analysis." Universitätsbibliothek Leipzig, 2016. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-199690.

Full text
Abstract:
Several studies have described emotional expressions used by speakers from different linguistic and cultural areas all around the world. It has been demonstrated that there are universal cognitive bases for the metaphorical expressions that speakers use to describe their emotional status. There are nevertheless significant differences concerning the use of emotional expressions, not only across languages but also language-internally. Quite a number of studies focus on the language of emotions in several European languages and in languages of West Africa, whereas not enough research has been done in this regard on Eastern African languages.
12

Wong, Shuk-mei Elva. "Combined treatment of semantic priming and semantic feature analysis for anomia with semantic impairment." E-thesis, The University of Hong Kong (HKU Scholars Hub), 2005. http://lookup.lib.hku.hk/lookup/bib/B3827937X.

Full text
Abstract:
Thesis (B.Sc.)--University of Hong Kong, 2005.
"A dissertation submitted in partial fulfilment of the requirements for the Bachelor of Science (Speech and Hearing Sciences), The University of Hong Kong, June 30, 2005." Also available in print.
13

Sulieman, Dalia. "Towards Semantic-Social Recommender Systems." PhD thesis, Université de Cergy-Pontoise, 2014. http://tel.archives-ouvertes.fr/tel-01017586.

Full text
Abstract:
In this thesis we propose semantic-social recommendation algorithms that recommend an input item to users connected by a collaboration social network. These algorithms use two types of information: semantic and social. The semantic information is based on the semantic relevance between users and the input item, while the social information is based on the users' positions and the type and quality of their connections in the collaboration social network. Depth-first and breadth-first search strategies are used to explore the graph. Using both kinds of information allows the recommender system to explore the social network only partially, which reduces the size of the explored data and minimizes graph searching time. We apply our algorithms to real datasets, MovieLens and Amazon, and compare their accuracy and performance with classical recommendation algorithms, mainly item-based collaborative filtering and hybrid recommendation. Our results show satisfying accuracy values and very significant gains in execution time and in the size of explored data compared to the classical algorithms. The importance of our algorithms lies in the fact that they explore only a very small part of the graph, instead of the whole graph as classical search methods do, while still achieving good accuracy; minimizing the amount of searched data does not noticeably harm the accuracy of the results.
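The early-stopping traversal at the core of this idea can be sketched as follows; the relevance function, threshold, and graph encoding are illustrative assumptions, not the thesis's exact algorithm:

```python
from collections import deque

def recommend_bfs(graph, start_user, item, relevance, k=10, threshold=0.5):
    """Explore the collaboration network breadth-first, but expand only
    users whose semantic relevance to `item` passes `threshold`, and stop
    as soon as k candidates are found, so most of the graph is never read.
    `graph` maps user -> neighbours; `relevance(user, item)` -> [0, 1]."""
    seen, queue, picked = {start_user}, deque([start_user]), []
    while queue and len(picked) < k:
        user = queue.popleft()
        for friend in graph.get(user, ()):
            if friend in seen:
                continue
            seen.add(friend)
            if relevance(friend, item) >= threshold:  # semantic filter
                picked.append(friend)
                queue.append(friend)    # expand only relevant users
    return picked
```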
14

Nikitkova, Jelena. "Semantics of English and Lithuanian number idioms: contrastive analysis." Master's thesis, Lithuanian Academic Libraries Network (LABT), 2013. http://vddb.laba.lt/obj/LT-eLABa-0001:E.02~2013~D_20130802_134815-91260.

Full text
Abstract:
The purpose of this paper is to explore the characteristic features and meanings of number idioms in the English and Lithuanian languages and to determine similarities and differences in the symbolic meaning conveyed by numbers in the two cultures. The research adopts both quantitative and qualitative approaches, focusing on the main theoretical issues related to idioms and to such a universal and abstract phenomenon as 'number', and searching for relationships between number idioms in the two languages. To illustrate the main similarities and differences between the two languages, 156 English and 212 Lithuanian idioms containing the cardinal and ordinal numbers from one to ten were subjected to analysis, using contrastive, descriptive and statistical methods. The results showed that the numbers one (70 instances) and two (47 instances) are the most productive in English idioms, whereas in Lithuanian idioms, besides the same numbers one (99 instances) and two (35 instances), the number nine (39 instances) is also common; the frequency of one and two is determined mainly by logic and reality rather than symbolism, while the frequent use of nine in Lithuanian idioms reflects its close ties with Lithuanian culture. The research demonstrated that numbers in the idioms of both languages communicate non-quantitative meaning more often than quantitative meaning, and that the choice of numbers in idioms may be determined by logic and reality or reflect a cultural point of view. The analysis showed that the numbers one, three, six... [to full text]
15

Romero Rivas, Carlos. "The effects of foreign-accented speech on language comprehension and retrieval processes." Doctoral thesis, Universitat Pompeu Fabra, 2016. http://hdl.handle.net/10803/399504.

Full text
Abstract:
When people learn a second language, they typically speak it with a foreign accent. Crucially, foreign-accented speech is more difficult to understand and requires more processing time than native speech. Nevertheless, native listeners adapt very quickly to the variability introduced by foreign-accented speech, reaching intelligibility levels similar to those of native speech comprehension. In this thesis, we show that, despite a lack of improvement at the phonetic-acoustic level of processing during exposure to foreign-accented speech, listeners use lexical information to map the foreign-accented variations onto canonical representations. We also demonstrate that this adaptation has a cost: the higher demands on lexical processing during foreign-accented speech comprehension affect lexical anticipation and semantic integration processes. Finally, we show that semantic spreading activation is also modulated by foreign-accented speech, particularly by strong foreign accents. In summary, these results suggest that foreign-accented speech hinders semantic processing.
16

Hassan, Samer. "Measuring Semantic Relatedness Using Salient Encyclopedic Concepts." Thesis, University of North Texas, 2011. https://digital.library.unt.edu/ark:/67531/metadc84212/.

Full text
Abstract:
While pragmatics, through its integration of situational awareness and real-world knowledge, offers a high level of analysis suitable for real interpretation of natural dialogue, semantics, on the other hand, represents a lower yet more tractable and affordable linguistic level of analysis using current technologies. The understanding of semantic meaning in the literature has generally revolved around the famous quote "You shall know a word by the company it keeps." In this thesis we investigate the role of context constituents in decoding the semantic meaning of the surrounding context; specifically, we probe the role of salient concepts, defined as content-bearing expressions that afford encyclopedic definitions, as a suitable source of semantic clues for an unambiguous interpretation of context. Furthermore, we integrate this world knowledge to build a new and robust unsupervised semantic model and apply it to infer semantic relatedness between textual pairs, whether words, sentences or paragraphs. Moreover, we explore the abstraction of semantics across languages and use our findings to build a novel multilingual semantic relatedness model exploiting information acquired from various languages. We demonstrate the effectiveness and superiority of our monolingual and multilingual models through a comprehensive set of evaluations on specialized synthetic datasets for semantic relatedness as well as real-world applications such as paraphrase detection and short answer grading. Our work represents a novel approach to integrating world knowledge into current semantic models and a means to cross the language boundary for a better and more robust semantic relatedness representation, opening the door to an improved abstraction of meaning that carries the potential of ultimately imparting understanding of natural language to machines.
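The representation behind such a model (text as a weighted vector over salient encyclopedic concepts, relatedness as the cosine between two such vectors) can be sketched in a few lines; the concepts and weights below are invented for illustration:

```python
import math

def cosine(u, v):
    """Cosine similarity of two sparse concept vectors (concept -> weight)."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Two texts mapped to salient encyclopedic concepts (illustrative weights).
t1 = {"Jaguar": 0.9, "Felidae": 0.7, "Predation": 0.4}
t2 = {"Leopard": 0.8, "Felidae": 0.8, "Africa": 0.3}
print(round(cosine(t1, t2), 3))   # shared 'Felidae' drives the relatedness
```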
17

Li, Jie. "Intention-driven textual semantic analysis." School of Computer Science and Software Engineering, 2008. http://ro.uow.edu.au/theses/104.

Full text
Abstract:
The explosion of the World Wide Web has brought an endless amount of information within our reach. To take advantage of this phenomenon, text search has become a major contemporary research challenge. Due to the nature of the Web, assisting users in finding the desired information remains a challenging task. In this thesis, we investigate semantic analysis techniques that can facilitate the search process at the semantic level. We also study the problem that short queries are less informative and make it difficult to convey the user's intention to the search service. We propose a generalized framework to address these issues and conduct a case study of movie plot search in which a semantic analyzer works seamlessly with a user intention detector. Our experimental results show the importance and effectiveness of intention detection and semantic analysis techniques.
18

Kachintseva, Dina (Dina D.). "Semantic knowledge representation and analysis." Thesis, Massachusetts Institute of Technology, 2011. http://hdl.handle.net/1721.1/76983.

Full text
Abstract:
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 103).
Natural language is the means through which humans convey meaning to each other: each word or phrase is a label, or name, for an internal representation of a concept. This internal representation is built up from repeated exposure to particular examples, or instances, of a concept. The way in which we learn that a particular entity in our environment is a "bird" comes from seeing countless examples of different kinds of birds, and combining these experiences to form a mental representation of the concept. Consequently, each individual's understanding of a concept is slightly different, depending on their experiences. A person living in a place where the predominant types of birds are ostriches and emus will have a different representation of birds than a person who predominantly sees penguins, even if the two people speak the same language. This thesis presents a semantic knowledge representation that incorporates this fuzziness and context-dependence of concepts. In particular, it provides several algorithms for learning the meaning behind text by using a dataset of experiences to build up an internal representation of the underlying concepts. Furthermore, several methods are proposed for learning new concepts by discovering patterns in the dataset and using them to compile representations for unnamed ideas; essentially, these methods learn new concepts without knowing the particular label, or word, used to refer to them. Words are not the only way in which experiences can be described: numbers can often communicate a situation more precisely than words. In fact, many qualitative concepts can be characterized using a set of numeric values. For instance, the qualitative concepts of "young" or "strong" can be characterized using a range of ages or strengths that are equally context-specific and fuzzy: a young adult corresponds to a different range of ages from a young child or a young puppy. By examining the numeric values associated with a particular word in a given context, a person can build up an understanding of the concept. This thesis presents algorithms that use a combination of qualitative and numeric data to learn the meanings of concepts. Ultimately, it demonstrates that this combination of qualitative and quantitative data enables more accurate and precise learning of concepts.
by Dina Kachintseva.
M.Eng.
19

Laird, James David. "A semantic analysis of control." Thesis, University of Edinburgh, 1999. http://hdl.handle.net/1842/382.

Full text
Abstract:
This thesis examines the use of denotational semantics to reason about control flow in sequential, basically functional languages. It extends recent work in game semantics, in which programs are interpreted as strategies for computation by interaction with an environment. Abramsky has suggested that an intensional hierarchy of computational features such as state, and their fully abstract models, can be captured as violations of the constraints on strategies in the basic functional model. Non-local control flow is shown to fit into this framework as the violation of strong and weak 'bracketing' conditions, related to linear behaviour. The language muPCF (Parigot's mu_lambda with constants and recursion) is adopted as a simple basis for higher-type, sequential computation with access to the flow of control. A simple operational semantics for both call-by-name and call-by-value evaluation is described. It is shown that dropping the bracketing condition on games models of PCF yields fully abstract models of muPCF. The games models of muPCF are instances of a general construction based on a continuations monad on Fam(C), where C is a rational cartesian closed category with infinite products. Computational adequacy, definability and full abstraction can then be captured by simple axioms on C. The fully abstract and universal models of muPCF are shown to have an effective presentation in the category of Berry-Curien sequential algorithms. There is further analysis of observational equivalence, in the form of a context lemma, and a characterization of the unique functor from the (initial) games model, which is an isomorphism on its (fully abstract) quotient. This establishes decidability of observational equivalence for finitary muPCF, contrasting with the undecidability of the analogous relation in pure PCF.
20

Kwon, Byungok. "A semantic analysis of conditionals." The Ohio State University, 1994. http://rave.ohiolink.edu/etdc/view?acc_num=osu1487849377293743.

Full text
21

Gad, Soumyashree Shrikant. "Semantic Analysis of Ladder Logic." The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1502740043946349.

Full text
22

Saif, Hassan. "Semantic sentiment analysis of microblogs." Thesis, Open University, 2015. http://oro.open.ac.uk/44063/.

Full text
Abstract:
Microblogs and social media platforms are now considered among the most popular forms of online communication. Through a platform like Twitter, much information reflecting people's opinions and attitudes is published and shared among users on a daily basis. This has recently brought great opportunities to companies interested in tracking and monitoring the reputation of their brands and businesses, and to policy makers and politicians seeking to assess public opinion about their policies or political issues. A wide range of approaches to sentiment analysis on Twitter, and on other similar microblogging platforms, have recently been built. Most of these approaches rely mainly on the presence of affect words or syntactic structures that explicitly and unambiguously reflect sentiment (e.g., "great", "terrible"). However, these approaches are semantically weak: they do not account for the semantics of words when detecting their sentiment in text. This is problematic, since the sentiment of words is in many cases associated with their semantics, whether through the context they occur within (e.g., "great" is negative in the context "pain") or through the conceptual meaning associated with the words (e.g., "Ebola" is negative when its associated semantic concept is "Virus"). This thesis investigates the role of word semantics in sentiment analysis of microblogs, aiming mainly at addressing the above problem. In particular, Twitter is used as a case study of microblogging platforms to investigate whether capturing the sentiment of words with respect to their semantics leads to more accurate sentiment analysis models on Twitter. To this end, several approaches are proposed for extracting and incorporating two types of word semantics for sentiment analysis: contextual semantics (i.e., semantics captured from word co-occurrences) and conceptual semantics (i.e., semantics extracted from external knowledge sources). Experiments are conducted with both types of semantics by assessing their impact on three popular sentiment analysis tasks on Twitter: entity-level sentiment analysis, tweet-level sentiment analysis, and context-sensitive sentiment lexicon adaptation. Evaluation under each task includes several sentiment lexicons and up to 9 Twitter datasets of different characteristics, as well as comparison against several state-of-the-art sentiment analysis approaches widely used in the literature. The findings from this body of work demonstrate the value of using semantics in sentiment analysis on Twitter: the proposed approaches, which consider word semantics at both entity and tweet level, surpass non-semantic approaches on most datasets.
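The contextual-semantics idea (a word's sentiment shifts with the words it co-occurs with) can be pictured by blending a word's prior lexicon score with the scores of its context; the numbers and the blending weight below are invented for illustration, not the thesis's exact method:

```python
# Illustrative sketch of contextual sentiment: a word's prior lexicon
# score is blended with the scores of the words it co-occurs with.
# All numbers here are invented.

prior = {"great": 0.8, "pain": -0.7}   # lexicon polarity in [-1, 1]

def contextual_score(word, cooccurrences, alpha=0.5):
    """cooccurrences: list of (neighbour, count) pairs for `word`.
    Blends the prior with the count-weighted polarity of the context."""
    total = sum(c for _, c in cooccurrences)
    if not total:
        return prior.get(word, 0.0)
    ctx = sum(prior.get(w, 0.0) * c for w, c in cooccurrences) / total
    return (1 - alpha) * prior.get(word, 0.0) + alpha * ctx

# Heavy co-occurrence with 'pain' pulls 'great' down from 0.8 to 0.05.
print(contextual_score("great", [("pain", 40)]))
```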
23

Boiko, Irena. "Lietuvių kalbos semantinių požymių lentelės valdymo programinė įranga." Master's thesis, Lithuanian Academic Libraries Network (LABT), 2004. http://vddb.library.lt/obj/LT-eLABa-0001:E.02~2004~D_20040611_155606-78522.

Full text
Abstract:
This paper covers one stage of the computerization of semantic analysis: the development of software able to improve the quality of automated translation. The software, "Lexes", is a browser and editor for Lithuanian words and the semantic attributes related to those words.
24

Simmons, Nathan G. "Semantic Role Agency in Perceptions of the Lexical Items Sick and Evil." Diss., Brigham Young University, 2008. http://contentdm.lib.byu.edu/ETD/image/etd2658.pdf.

Full text
25

Xian, Yikun, and Liu Zhang. "Semantic Search with Information Integration." Thesis, Linnéuniversitetet, Institutionen för datavetenskap, fysik och matematik, DFM, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-13832.

Full text
Abstract:
Since the first search engine was released in 1993, development has never slowed down, and various search engines have emerged to vie for popularity. However, current traditional search engines like Google and Yahoo! are based on keywords, which leads to imprecise results and information redundancy. A new search engine with semantic analysis could be the alternative solution in the future: it is more intelligent and informative, and provides better interaction with users. This thesis discusses semantic search in detail, explains the advantages of semantic search over keyword-based search, and introduces how to integrate semantic analysis with common search engines. At the end of the thesis, there is an example implementation of a simple semantic search engine.
26

Ozsoy, Makbule Gulcin. "Text Summarization Using Latent Semantic Analysis." Master's thesis, METU, 2011. http://etd.lib.metu.edu.tr/upload/12612988/index.pdf.

Full text
Abstract:
Text summarization addresses the problem of presenting the information needed by a user in a compact form. There are different approaches to creating well-formed summaries in the literature, and one of the newest is the Latent Semantic Analysis (LSA) method. In this thesis, different LSA-based summarization algorithms are explained and two new LSA-based summarization algorithms are proposed. The algorithms are evaluated on Turkish and English documents, and their performances are compared using their ROUGE scores.
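The shared LSA machinery is compact enough to sketch: build a term-by-sentence matrix, factor it with SVD, and read sentence scores off the right singular vectors. The sketch below follows the classic Gong and Liu selection rule (one sentence per top concept) as an illustration of the family of algorithms the thesis builds on, not of its two new algorithms:

```python
# Minimal LSA extractive summarizer in the style of Gong & Liu (an
# illustration of the LSA family, not the thesis's new algorithms).
import numpy as np

def lsa_summary(sentences, k=2):
    words = sorted({w for s in sentences for w in s.lower().split()})
    idx = {w: i for i, w in enumerate(words)}
    A = np.zeros((len(words), len(sentences)))   # term-by-sentence matrix
    for j, s in enumerate(sentences):
        for w in s.lower().split():
            A[idx[w], j] += 1.0
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    # For each of the k strongest latent concepts, pick the sentence with
    # the largest weight in the corresponding right singular vector.
    chosen = {int(np.abs(Vt[c]).argmax()) for c in range(min(k, len(S)))}
    return [sentences[j] for j in sorted(chosen)]

docs = ["the cat sat on the mat", "dogs chase the cat",
        "stocks fell sharply on monday", "the market rallied today"]
print(lsa_summary(docs, k=2))   # one sentence per dominant topic
```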
27

Norguet, Jean-Pierre. "Semantic analysis in web usage mining." Doctoral thesis, Université Libre de Bruxelles, 2006. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/210890.

Full text
Abstract:
With the emergence of the Internet and of the World Wide Web, the Web site has become a key communication channel in organizations. To satisfy the objectives of the Web site and of its target audience, adapting the Web site content to the users' expectations has become a major concern. In this context, Web usage mining, a relatively new research area, and Web analytics, the part of Web usage mining that has emerged most strongly in the corporate world, offer many Web communication analysis techniques. These techniques include prediction of the user's behaviour within the site, comparison between expected and actual Web site usage, adjustment of the Web site with respect to the users' interests, and mining and analyzing Web usage data to discover interesting metrics and usage patterns. However, Web usage mining and Web analytics suffer from significant drawbacks when it comes to supporting the decision-making process at the higher levels of the organization.

Indeed, according to organizations theory, the higher levels in the organizations need summarized and conceptual information to take fast, high-level, and effective decisions. For Web sites, these levels include the organization managers and the Web site chief editors. At these levels, the results produced by Web analytics tools are mostly useless. Indeed, most of these results target Web designers and Web developers. Summary reports like the number of visitors and the number of page views can be of some interest to the organization manager but these results are poor. Finally, page-group and directory hits give the Web site chief editor conceptual results, but these are limited by several problems like page synonymy (several pages contain the same topic), page polysemy (a page contains several topics), page temporality, and page volatility.

Web usage mining research projects on their part have mostly left aside Web analytics and its limitations and have focused on other research paths. Examples of these paths are usage pattern analysis, personalization, system improvement, site structure modification, marketing business intelligence, and usage characterization. A potential contribution to Web analytics can be found in research about reverse clustering analysis, a technique based on self-organizing feature maps. This technique integrates Web usage mining and Web content mining in order to rank the Web site pages according to an original popularity score. However, the algorithm is not scalable and does not answer the page-polysemy, page-synonymy, page-temporality, and page-volatility problems. As a consequence, these approaches fail at delivering summarized and conceptual results.

An interesting attempt to obtain such results has been the Information Scent algorithm, which produces a list of term vectors representing the visitors' needs. These vectors provide a semantic representation of the visitors' needs and can be easily interpreted. Unfortunately, the results suffer from term polysemy and term synonymy, are visit-centric rather than site-centric, and are not scalable to produce. Finally, according to a recent survey, no Web usage mining research project has proposed a satisfying solution to provide site-wide summarized and conceptual audience metrics.

In this dissertation, we present our solution to the need for summarized and conceptual audience metrics in Web analytics. We first describe several methods for mining the Web pages output by Web servers: content journaling, script parsing, server monitoring, network monitoring, and client-side mining. These techniques can be used alone or in combination to mine the Web pages output by any Web site. The occurrences of taxonomy terms in these pages can then be aggregated to provide concept-based audience metrics, as sketched below. To evaluate the results, we implement a prototype and run a number of test cases with real Web sites.
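The aggregation step is simple to picture: count taxonomy-term occurrences over the pages served and roll them up to the concepts the terms belong to. A toy sketch with an invented two-concept taxonomy:

```python
from collections import Counter

# Hypothetical taxonomy: term -> the concept it belongs to.
taxonomy = {"invoice": "Billing", "payment": "Billing",
            "router": "Networking", "firewall": "Networking"}

def concept_metrics(served_pages):
    """served_pages: iterable of page texts output by the Web server.
    Returns concept -> number of taxonomy-term occurrences, i.e. a
    summarized, concept-based audience metric."""
    hits = Counter()
    for page in served_pages:
        for token in page.lower().split():
            if token in taxonomy:
                hits[taxonomy[token]] += 1
    return hits

pages = ["Pay your invoice online", "Configure the router firewall"]
print(concept_metrics(pages))   # Counter({'Networking': 2, 'Billing': 1})
```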

According to the first experiments with our prototype and SQL Server OLAP Analysis Service, concept-based metrics prove extremely summarized and much more intuitive than page-based metrics. As a consequence, concept-based metrics can be exploited at higher levels in the organization. For example, organization managers can redefine the organization strategy according to the visitors' interests. Concept-based metrics also give an intuitive view of the messages delivered through the Web site and allow to adapt the Web site communication to the organization objectives. The Web site chief editor on his part can interpret the metrics to redefine the publishing orders and redefine the sub-editors' writing tasks. As decisions at higher levels in the organization should be more effective, concept-based metrics should significantly contribute to Web usage mining and Web analytics.


Doctorate in applied sciences (Doctorat en sciences appliquées)

28

Fazekas, György. "Semantic audio analysis utilities and applications." Thesis, Queen Mary, University of London, 2012. http://qmro.qmul.ac.uk/xmlui/handle/123456789/8443.

Full text
Abstract:
Extraction, representation, organisation and application of metadata about audio recordings are the concern of semantic audio analysis. Our broad interpretation, aligned with recent developments in the field, includes methodological aspects of semantic audio, such as those related to information management, knowledge representation and applications of the extracted information. In particular, we look at how Semantic Web technologies may be used to enhance information management practices in two audio-related areas: music informatics and music production. In the first area, we are concerned with music information retrieval (MIR) and related research. We examine how structured data may be used to support reproducibility and provenance of extracted information, and aim to support multi-modality and context adaptation in the analysis. In creative music production, our goals can be summarised as follows: off-the-shelf sound editors do not hold appropriately structured information about the edited material, so human-computer interaction is inefficient. We believe that recent developments in sound analysis and music understanding are capable of bringing about significant improvements in the music production workflow; providing visual cues related to music structure can serve as an example of intelligent, context-dependent functionality. The central contributions of this work are a Semantic Web ontology for describing recording studios, including a model of the technological artefacts used in music production; methodologies for collecting data about music production workflows and describing the work of audio engineers, which facilitates capturing their contribution to music production; and finally a framework for creating Web-based applications for automated audio analysis. These contributions have applications demonstrating how Semantic Web technologies and ontologies can facilitate interoperability between music research tools and the creation of semantic audio software, for instance for music recommendation, temperament estimation or multi-modal music tutoring.
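To make the ontology contribution concrete, here is a hedged sketch of how studio provenance might be stated as RDF triples with the rdflib library; the namespace and all property names are invented stand-ins, not terms from the thesis's ontology:

```python
from rdflib import Graph, Namespace, Literal, RDF

STU = Namespace("http://example.org/studio#")   # placeholder namespace

g = Graph()
g.add((STU.take1, RDF.type, STU.AudioSignal))
g.add((STU.take1, STU.recordedWith, STU.micU87))      # device provenance
g.add((STU.micU87, RDF.type, STU.Microphone))
g.add((STU.take1, STU.engineer, Literal("J. Doe")))   # workflow actor

# Structured provenance like this is what lets tools answer questions
# such as "which signals were captured with a given microphone?".
for s, _, o in g.triples((None, STU.recordedWith, None)):
    print(s, o)
```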
29

Mann, Jasleen Kaur. "Semantic Topic Modeling and Trend Analysis." Thesis, Linköpings universitet, Statistik och maskininlärning, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-173924.

Full text
Abstract:
This thesis focuses on finding an end-to-end unsupervised solution to a two-step problem: extracting semantically meaningful topics from a large temporal text corpus and analysing the trends of these topics. To achieve this, the focus is on using the latest developments in Natural Language Processing (NLP) related to pre-trained language models like Google's Bidirectional Encoder Representations from Transformers (BERT) and other BERT-based models. These transformer-based pre-trained language models provide word and sentence embeddings based on the context of the words. The results are then compared with traditional machine learning techniques for topic modeling, to evaluate whether the quality of topic models has improved and how dependent the techniques are on manually defined hyperparameters and data preprocessing. These topic models provide a good mechanism for summarizing and organizing a large text corpus and give an overview of how the topics evolve with time. In the context of research publications or scientific journals, such analysis of the corpus can give an overview of research and scientific interest areas and how these interests have evolved over the years. The dataset used for this thesis consists of research articles and papers from the Journal of Cleaner Production, which contained more than 24,000 research articles at the time of this project. We started by implementing Latent Dirichlet Allocation (LDA) topic modeling, and in the next step implemented LDA along with document clustering to get topics within these clusters; this gave us an idea of the dataset and also gave us a benchmark. With these base results in hand, we explored transformer-based contextual word and sentence embeddings to evaluate whether they lead to more meaningful, contextual, and semantic topics. For document clustering, we used K-means clustering. In this thesis, we also discuss methods to optimally visualize the topics and the trend changes of these topics over the years. Finally, we conclude with a method for leveraging contextual embeddings using BERT and Sentence-BERT to solve this problem and achieve semantically meaningful topics, and we discuss the results from traditional machine learning techniques and their limitations.
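The embed-then-cluster pipeline the thesis converges on can be sketched with the sentence-transformers and scikit-learn packages; the model name, cluster count, and the TF-IDF labeling heuristic below are illustrative choices, not the thesis's exact configuration:

```python
# Sketch of the embed-then-cluster pipeline (illustrative configuration,
# assuming the sentence-transformers and scikit-learn packages).
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

abstracts = [  # stand-in documents; in the thesis, journal abstracts
    "Life cycle assessment of cement production",
    "Carbon footprint of concrete manufacturing",
    "Consumer behaviour and green marketing strategies",
    "Survey of sustainable consumption attitudes",
]

model = SentenceTransformer("all-MiniLM-L6-v2")   # illustrative model choice
embeddings = model.encode(abstracts)              # contextual sentence vectors
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(embeddings)

# Rough topic labels: highest-TFIDF terms per cluster.
tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(abstracts)
terms = np.array(tfidf.get_feature_names_out())
for c in range(km.n_clusters):
    rows = np.where(km.labels_ == c)[0]
    top = np.asarray(X[rows].mean(axis=0)).ravel().argsort()[-3:]
    print(f"topic {c}:", ", ".join(terms[top]))
```

Running the same clustering per publication year then yields the topic-trend view the thesis describes.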
APA, Harvard, Vancouver, ISO, and other styles
30

Tramutoli, Rosanna. "'Love' encoding in Swahili: a semantic description through a corpus-based analysis." Swahili Forum 22 (2015), pp. 72-103, 2015. https://ul.qucosa.de/id/qucosa%3A14607.

Full text
Abstract:
Several studies have described emotional expressions used by speakers from different linguistic and cultural areas all around the world. It has been demonstrated that there are universal cognitive bases for the metaphorical expressions that speakers use to describe their emotional status. There are, nevertheless, significant differences concerning the use of emotional expressions, not only across languages but also language-internally. Quite a number of studies focus on the language of emotions in several European languages and languages of West Africa, whereas not enough research has been done in this regard on Eastern African languages.
APA, Harvard, Vancouver, ISO, and other styles
31

Abbas, Abdullah. "Static analysis of semantic web queries with ShEx schema constraints." Thesis, Université Grenoble Alpes (ComUE), 2017. http://www.theses.fr/2017GREAM064/document.

Full text
Abstract:
Data structured in the Resource Description Framework (RDF) are increasingly available in large volumes. This leads to a major need and research interest in novel methods for query analysis and compilation for making the most of RDF data extraction. SPARQL is the widely used and well supported standard query language for RDF data. In parallel to query language evolutions, schema languages for expressing constraints on RDF datasets also evolve. Shape Expressions (ShEx) are increasingly used to validate RDF data, and to communicate expected graph patterns. Schemas in general are important for static analysis tasks such as query optimisation and containment. Our purpose is to investigate the means and methodologies for SPARQL query static analysis and optimisation in the presence of ShEx schema constraints. Our contribution is mainly divided into two parts. In the first part we consider the problem of SPARQL query containment in the presence of ShEx constraints. We propose a sound and complete procedure for the problem of containment with ShEx, considering several SPARQL fragments. In particular, our procedure handles OPTIONAL query patterns, which turn out to be an important feature to study with schemas. We provide complexity bounds for the containment problem with respect to the language fragments considered. We also propose an alternative method for SPARQL query containment with ShEx by reduction to First Order Logic satisfiability, which allows a larger SPARQL fragment to be considered than with the first method. This is the first work addressing SPARQL query containment in the presence of ShEx constraints. In the second part of our contribution we propose an analysis method to optimise the evaluation of conjunctive SPARQL queries, on RDF graphs, by taking advantage of ShEx constraints. The optimisation is based on computing and assigning ranks to query triple patterns, dictating their order of execution. The presence of intermediate joins between the query triple patterns is the reason why ordering is important for increasing efficiency. We define a set of well-formed ShEx schemas that possess interesting characteristics for SPARQL query optimisation. We then develop our optimisation method by exploiting information extracted from a ShEx schema. Finally, we report on evaluation results showing the advantages of applying our optimisation on top of an existing state-of-the-art query evaluation system.
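The triple-pattern ranking idea from the second part can be illustrated with a small Python sketch; the rank values here are a hypothetical lookup table, whereas the thesis derives them from the ShEx schema itself.

```python
# Hypothetical ranks: lower means more selective. In the thesis these are
# computed from ShEx constraints (e.g. cardinalities); here a lookup table
# merely mimics the outcome of that analysis.
rank = {
    ("?s", "rdf:type", ":Person"): 2,  # type patterns match broadly
    ("?s", ":email", "?m"): 1,         # functional property: very selective
    ("?s", ":knows", "?o"): 3,         # many-to-many: least selective
}

def order_triple_patterns(patterns):
    """Run the most selective patterns first to shrink intermediate joins."""
    return sorted(patterns, key=lambda tp: rank.get(tp, float("inf")))

bgp = [("?s", ":knows", "?o"), ("?s", "rdf:type", ":Person"), ("?s", ":email", "?m")]
print(order_triple_patterns(bgp))
```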
APA, Harvard, Vancouver, ISO, and other styles
32

Rukeyser, Alison Smiley. "A semantic analysis of Yup'ik spatial deixis /." For electronic version search Digital dissertations database. Restricted to UC campuses. Access is free to UC campus dissertations, 2005. http://uclibs.org/PID/11984.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Tallberg, Ing-Mari. "Semantic analysis of irrelevant speech in dementia /." Stockholm, 2001. http://diss.kib.ki.se/2001/91-628-4613-2/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Menzies, Stacey. "Nsyilxcen modality : semantic analysis of epistemic modality." Thesis, University of British Columbia, 2013. http://hdl.handle.net/2429/43809.

Full text
Abstract:
The aim of this thesis is to describe and analyze the modal system of Nsyilxcen, an Interior Salish language spoken in south central British Columbia and northern Washington State. In particular, it focuses on the epistemic modals mat and cmay, which express necessity and possibility with respect to certain bodies of knowledge. Similar to modals in St'át'imcets (Rullmann et al. 2008) and Gitksan (Peterson 2010), these modals lexically encode an epistemic modal base and an indirect inferential evidential restriction. I propose that these two modals can be distinguished by modal force, where mat has variable modal force and cmay a strictly encoded existential modal force. Based on these generalizations, I propose a formal semantic analysis for the epistemic modals drawing from Kratzer (1977, 1981, 1991, 2012), Rullmann et al. (2008), Peterson (2010), and Deal (2011). The analysis defines each modal in a way that accounts for the strictly encoded modal base and evidential restriction, as well as the variable modal force for mat and the strictly encoded existential modal force for cmay. In addition to the epistemic modals mat and cmay, this thesis documents the reportative modal kʷukʷ as well as how Nsyilxcen encodes non-epistemic modality. It looks at the bouletic modal cakʷ and how Nsyilxcen encodes a deontic, circumstantial, ability, and teleological modal base which makes use of the irrealis marker ks-, imperative markers -x and -ikʷ, or the basic predicate, depending on the addressee and the context. This thesis will discuss how the Nsyilxcen system fits into a preliminary modal typology based on the semantics of these modals.
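As a rough illustration of the kind of Kratzer-style lexical entry such an analysis assigns, a schematic denotation for an existential-force epistemic modal might look as follows; this is a generic textbook-style sketch, not Menzies' actual formulation.

```latex
% Schematic Kratzer-style entry for an existential epistemic modal
% (generic sketch; f = epistemic modal base, g = ordering source,
% both lexically fixed rather than supplied by context).
\[
  [\![\mathit{cmay}]\!]^{w,f,g}(p) = 1
  \iff
  \exists w' \in \mathrm{BEST}_{g(w)}\!\left(\bigcap f(w)\right)
  \;:\; p(w') = 1
\]
```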
APA, Harvard, Vancouver, ISO, and other styles
35

Lascarides, A. "A formal semantic analysis of the progressive." Thesis, University of Edinburgh, 1988. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.234152.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Rose, Tony Gerard. "Large vocabulary semantic analysis for text recognition." Thesis, Nottingham Trent University, 1993. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.333961.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Huang, Fang. "Multi-document summarization with latent semantic analysis." Thesis, University of Sheffield, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.419255.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Zombolou, Katerina. "Verbal alternations in Greek : a semantic analysis." Thesis, University of Reading, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.412172.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Nikolopoulos, Spyridon. "Semantic multimedia analysis using knowledge and context." Thesis, Queen Mary, University of London, 2012. http://qmro.qmul.ac.uk/xmlui/handle/123456789/3148.

Full text
Abstract:
The difficulty of semantic multimedia analysis can be attributed to the extended diversity in form and appearance exhibited by the majority of semantic concepts and the difficulty to express them using a finite number of patterns. In meeting this challenge there has been a scientific debate on whether the problem should be addressed from the perspective of using overwhelming amounts of training data to capture all possible instantiations of a concept, or from the perspective of using explicit knowledge about the concepts’ relations to infer their presence. In this thesis we address three problems of pattern recognition and propose solutions that combine the knowledge extracted implicitly from training data with the knowledge provided explicitly in structured form. First, we propose a Bayesian networks (BNs) modeling approach that defines a conceptual space where both domain related evidence and evidence derived from content analysis can be jointly considered to support or disprove a hypothesis. The use of this space leads to significant gains in performance compared to analysis methods that cannot handle combined knowledge. Then, we present an unsupervised method that exploits the collective nature of social media to automatically obtain large amounts of annotated image regions. By proving that the quality of the obtained samples can be almost as good as manually annotated images when working with large datasets, we significantly contribute towards scalable object detection. Finally, we introduce a method that treats images, visual features and tags as the three observable variables of an aspect model and extracts a set of latent topics that incorporates the semantics of both visual and tag information space. By showing that the cross-modal dependencies of tagged images can be exploited to increase the semantic capacity of the resulting space, we advocate the use of all existing information facets in the semantic analysis of social media.
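The evidence-combination idea behind the conceptual space can be illustrated with a bare-bones Bayes'-rule calculation in Python; the hypothesis, prior and likelihood numbers below are invented for illustration, and the conditional-independence assumption is a simplification of a real Bayesian network.

```python
# Hypothesis H: the image region depicts "sea".
prior = 0.2  # domain knowledge: how often "sea" occurs in this collection

# (P(e | H), P(e | not H)) for two pieces of evidence (invented numbers):
#   e1: a visual classifier fired on blue texture (content analysis)
#   e2: the related concept "sand" was detected (domain relation)
likelihoods = [(0.7, 0.3), (0.6, 0.2)]

# Combine all evidence assuming conditional independence given H.
num, den = prior, 1.0 - prior
for p_given_h, p_given_not_h in likelihoods:
    num *= p_given_h
    den *= p_given_not_h
posterior = num / (num + den)
print(f"P(sea | evidence) = {posterior:.3f}")  # ~0.64: evidence supports H
```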
APA, Harvard, Vancouver, ISO, and other styles
40

Xu, Xun. "Semantic spaces for video analysis of behaviour." Thesis, Queen Mary, University of London, 2016. http://qmro.qmul.ac.uk/xmlui/handle/123456789/23885.

Full text
Abstract:
There is ever-growing interest in the computer vision community in human behaviour analysis based on visual sensors. This interest generally includes: (1) behaviour recognition - given a video clip or specific spatio-temporal volume of interest, discriminate it into one or more of a set of pre-defined categories; (2) behaviour retrieval - given a video or textual description as query, search for video clips with related behaviour; (3) behaviour summarisation - given a number of video clips, summarise out representative and distinct behaviours. Although countless efforts have been dedicated to the problems mentioned above, few works have attempted to analyse human behaviours in a semantic space. In this thesis, we define semantic spaces as a collection of high-dimensional Euclidean spaces in which semantically meaningful events, e.g. individual words, phrases and visual events, can be represented as vectors or distributions which are referred to as semantic representations. With the semantic space, semantic texts and visual events can be quantitatively compared by inner product, distance and divergence. The introduction of semantic spaces can bring many benefits for visual analysis. For example, discovering semantic representations for visual data can facilitate semantically meaningful video summarisation, retrieval and anomaly detection. Semantic space can also seamlessly bridge categories and datasets which are conventionally treated as independent. This has encouraged the sharing of data and knowledge across categories and even datasets to improve recognition performance and reduce labelling effort. Moreover, semantic space has the ability to generalise learned models beyond known classes, which is usually referred to as zero-shot learning. Nevertheless, discovering such a semantic space is non-trivial because (1) a semantic space is hard to define manually: humans have a good sense of the semantic relatedness between visual and textual instances, but a measurable and finite semantic space can be difficult to construct with limited manual supervision, so the semantic space is instead constructed from data in an unsupervised manner; (2) it is hard to build a universal semantic space, i.e. the space is always context dependent, so it is important to build the semantic space upon selected data such that it is always meaningful within the context. Even with a well constructed semantic space, challenges are still present, including (3) how to represent visual instances in the semantic space; and (4) how to mitigate the misalignment of visual feature and semantic spaces across categories and even datasets when knowledge/data are generalised. This thesis tackles the above challenges by exploiting data from different sources and building a contextual semantic space with which data and knowledge can be transferred and shared to facilitate general video behaviour analysis. To demonstrate the efficacy of semantic space for behaviour analysis, we focus on studying real world problems including surveillance behaviour analysis, zero-shot human action recognition and zero-shot crowd behaviour recognition, with techniques specifically tailored to the nature of each problem. Firstly, for video surveillance scenes, we propose to discover semantic representations from the visual data in an unsupervised manner, owing to the large availability of unlabelled visual data in surveillance systems.
By representing visual instances in the semantic space, data and annotations can be generalised to new events and even new surveillance scenes. Specifically, to detect abnormal events this thesis studies a geometrical alignment between semantic representations of events across scenes. Semantic actions can thus be transferred to new scenes and abnormal events can be detected in an unsupervised way. To model multiple surveillance scenes simultaneously, we show how to learn a shared semantic representation across a group of semantically related scenes through a multi-layer clustering of scenes. With multi-scene modelling we show how to improve surveillance tasks including scene activity profiling/understanding, cross-scene query-by-example, behaviour classification, and video summarisation. Secondly, to avoid extremely costly and ambiguous video annotating, we investigate how to generalise recognition models learned from known categories to novel ones, which is often termed zero-shot learning. To exploit the limited human supervision, e.g. category names, we construct the semantic space via a word-vector representation trained on a large textual corpus in an unsupervised manner. The representation of a visual instance in the semantic space is obtained by learning a visual-to-semantic mapping. We notice that blindly applying the mapping learned from known categories to novel categories can cause bias and deteriorate the performance, which is termed domain shift. To solve this problem we employed techniques including semi-supervised learning, self-training, hubness correction, multi-task learning and domain adaptation. All these methods in combination achieve state-of-the-art performance in the zero-shot human action task. Lastly, we study the possibility of re-using known and manually labelled semantic crowd attributes to recognise rare and unknown crowd behaviours. This task is termed zero-shot crowd behaviour recognition. Crucially, we point out that given the multi-labelled nature of semantic crowd attributes, zero-shot recognition can be improved by exploiting the co-occurrence between attributes. To summarise, this thesis studies methods for analysing video behaviours and demonstrates that exploring semantic spaces for video analysis is advantageous and, more importantly, enables multi-scene analysis and zero-shot learning beyond conventional learning strategies.
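A minimal sketch of the zero-shot recipe described above, assuming scikit-learn and random stand-ins for visual features and word vectors; the domain-shift corrections the thesis develops are deliberately omitted.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.default_rng(0)

# Toy stand-ins: 100 clips with 512-d visual features over 5 known actions,
# plus word vectors (e.g. word2vec) for known and novel class names.
X_train = rng.normal(size=(100, 512))
y_train = rng.integers(0, 5, size=100)
known_vecs = rng.normal(size=(5, 300))
novel_vecs = rng.normal(size=(3, 300))  # unseen classes

# Learn the visual-to-semantic regression on known classes only.
f = Ridge(alpha=1.0).fit(X_train, known_vecs[y_train])

# Zero-shot prediction: project a new clip into the semantic space and
# pick the nearest novel class prototype (hubness/domain-shift fixes
# from the thesis would be applied here).
z = f.predict(rng.normal(size=(1, 512)))
pred = cosine_similarity(z, novel_vecs).argmax(axis=1)
print(pred)
```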
APA, Harvard, Vancouver, ISO, and other styles
41

Buys, Stephanus. "Log analysis aided by latent semantic mapping." Thesis, Rhodes University, 2013. http://hdl.handle.net/10962/d1002963.

Full text
Abstract:
In an age of zero-day exploits and increased on-line attacks on computing infrastructure, operational security practitioners are becoming increasingly aware of the value of the information captured in log events. Analysis of these events is critical during incident response, forensic investigations related to network breaches, hacking attacks and data leaks. Such analysis has led to the discipline of Security Event Analysis, also known as Log Analysis. There are several challenges when dealing with events, foremost being the increased volumes at which events are often generated and stored. Furthermore, events are often captured as unstructured data, with very little consistency in the formats or contents of the events. In this environment, security analysts and implementers of Log Management (LM) or Security Information and Event Management (SIEM) systems face the daunting task of identifying, classifying and disambiguating massive volumes of events in order for security analysis and automation to proceed. Latent Semantic Mapping (LSM) is a proven paradigm shown to be an effective method of, among other things, enabling word clustering, document clustering, topic clustering and semantic inference. This research is an investigation into the practical application of LSM in the discipline of Security Event Analysis, showing the value of using LSM to assist practitioners in identifying types of events, classifying events as belonging to certain sources or technologies and disambiguating different events from each other. The culmination of this research presents adaptations to traditional natural language processing techniques that resulted in improved efficacy of LSM when dealing with Security Event Analysis. This research provides strong evidence supporting the wider adoption and use of LSM, as well as further investigation into Security Event Analysis assisted by LSM and other natural language or computer-learning processing techniques.
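A toy version of this pipeline can be sketched with latent semantic analysis, LSM's close relative, in scikit-learn; the log lines are invented, and the adaptations the thesis makes to the natural language processing steps are not reproduced here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans

logs = [
    "sshd[312]: Failed password for root from 10.0.0.5",
    "sshd[312]: Accepted password for alice from 10.0.0.7",
    "kernel: eth0 link down",
    "kernel: eth0 link up",
]

# Latent semantic mapping in miniature: a TF-IDF term space reduced by SVD,
# so events from the same source/technology land near each other.
X = TfidfVectorizer(token_pattern=r"[A-Za-z]+").fit_transform(logs)
Z = TruncatedSVD(n_components=2, random_state=0).fit_transform(X)

# Cluster in the latent space to group event types for triage.
print(KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(Z))
```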
APA, Harvard, Vancouver, ISO, and other styles
42

Greenwood, Rob. "Semantic analysis for system level design automation." Thesis, This resource online, 1992. http://scholar.lib.vt.edu/theses/available/etd-10062009-020216/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Kawasha, Boniface Kaumba. "Lunda grammar : a morphosyntactic and semantic analysis /." view abstract or download file of text, 2003. http://wwwlib.umi.com/cr/uoregon/fullcit?p3095256.

Full text
Abstract:
Thesis (Ph. D.)--University of Oregon, 2003.
Typescript. Includes vita and abstract. Includes bibliographical references (leaves 453-461). Also available for download via the World Wide Web; free to University of Oregon users.
APA, Harvard, Vancouver, ISO, and other styles
44

Beadle, Lawrence. "Semantic and structural analysis of genetic programming." Thesis, University of Kent, 2009. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.509628.

Full text
Abstract:
Genetic programming (GP) is a subset of evolutionary computation where candidate solutions are evaluated through execution or interpreted execution. The candidate solutions generated by GP are in the form of computer programs, which are evolved to achieve a stated objective. Darwinian evolutionary theory inspires the processes that make up GP, which include crossover, mutation and selection. During a GP run, crossover, mutation and selection are performed iteratively until a program that satisfies the stated objectives is produced or a certain number of time steps have elapsed. The objectives of this thesis are to empirically analyse three different aspects of these evolved programs. These three aspects are diversity, efficient representation and the changing structure of programs during evolution. In addition to these analyses, novel algorithms are presented in order to test theories, improve the overall performance of GP and reduce program size. This thesis makes three contributions to the field of GP. Firstly, a detailed analysis is performed of the process of initialisation (generating random programs to start evolution) using four novel algorithms to empirically evaluate specific traits of starting populations of programs. It is shown how two factors simultaneously affect how strong the performance of a starting population will be after a GP run. Secondly, semantically based operators are applied during evolution to encourage behavioural diversity and reduce the size of programs by removing inefficient segments of code during evolution. It is demonstrated in a series of experiments how these specialist operators can be effective individually and when combined. Finally, the role of the structure of programs during evolution is considered under different evolutionary parameters across different problem domains. This analysis reveals some interesting effects of evolution on program structure as well as offering evidence to support the success of the specialist operators.
APA, Harvard, Vancouver, ISO, and other styles
45

Eades, Harley D. III. "The semantic analysis of advanced programming languages." Diss., University of Iowa, 2014. https://ir.uiowa.edu/etd/1312.

Full text
Abstract:
We live in a time where computing devices power essential systems of our society: our automobiles, our airplanes and even our medical services. In these safety-critical systems, bugs do not just cost money to fix; they have a potential to cause harm, even death. Therefore, software correctness is of paramount importance. Existing mainstream programming languages do not support software verification as part of their design, but rely on testing, and thus cannot completely rule out the possibility of bugs during software development. To fix this problem we must reshape the very foundation on which programming languages are based. Programming languages must support the ability to verify the correctness of the software developed in them, and this software verification must be possible using the same language the software is developed in. In the first half of this dissertation we introduce three new programming languages: Freedom of Speech, Separation of Proof from Program, and Dualized Type Theory. The Freedom of Speech language separates a logical fragment from a general recursive programming language, while still allowing the types of the logical fragment to depend on general recursive programs and maintaining logical consistency, thus obtaining the ability to verify properties of general recursive programs. Separation of Proof from Program builds on the Freedom of Speech language by relieving several restrictions and adding a number of extensions. Finally, Dualized Type Theory is a terminating functional programming language rich in constructive duality, and shows promise of being a logical foundation of induction and coinduction. These languages have the ability to verify properties of software, but how can we trust this verification? To be able to put our trust in these languages requires that each language be rigorously and mathematically defined so that the programming language itself can be studied as a mathematical object. Then we must show one very important property, logical consistency of the fragment of the programming language used to verify mathematical properties of the software. In the second half of this dissertation we introduce a well-known proof technique for showing logical consistency called hereditary substitution. Hereditary substitution shows promise of being less complex than existing proof techniques like the Tait-Girard reducibility method. However, we are unsure which programming languages can be proved terminating using hereditary substitution. Our contribution to this line of work is the application of the hereditary substitution technique to predicative polymorphic programming languages, and the first proof of termination using hereditary substitution for a classical type theory.
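To give a flavour of hereditary substitution, here is an untyped Python sketch for lambda terms: substitution reduces any redex it creates on the fly, so normal forms stay normal. The termination argument needs the type index, which this sketch elides, and alpha-renaming is assumed to have been done upfront.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

def hsubst(t, x, s):
    """Hereditary substitution [s/x]t: any beta-redex created by the
    substitution is reduced immediately, so the result stays normal."""
    if isinstance(t, Var):
        return s if t.name == x else t
    if isinstance(t, Lam):
        # assumes alpha-renaming so t.param is not free in s
        return Lam(t.param, hsubst(t.body, x, s))
    f, a = hsubst(t.fun, x, s), hsubst(t.arg, x, s)
    if isinstance(f, Lam):  # substitution exposed a redex:
        return hsubst(f.body, f.param, a)  # reduce it hereditarily
    return App(f, a)

# ((\y. y) x)[x := \z. z] normalises directly to \z. z
print(hsubst(App(Lam("y", Var("y")), Var("x")), "x", Lam("z", Var("z"))))
```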
APA, Harvard, Vancouver, ISO, and other styles
46

Kurita, Shuhei. "Neural Approaches for Syntactic and Semantic Analysis." Kyoto University, 2019. http://hdl.handle.net/2433/242436.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Rahgozar, Arya. "Automatic Poetry Classification and Chronological Semantic Analysis." Thesis, Université d'Ottawa / University of Ottawa, 2020. http://hdl.handle.net/10393/40516.

Full text
Abstract:
The correction, authentication, validation and identification of the original texts in Hafez’s poetry among the 16 or so old versions of his Divan have been a challenge for scholars. The semantic analysis of poetry with modern Digital Humanities techniques is also challenging. Analyzing latent semantics is more challenging in poetry than in prose for evident reasons, such as conciseness, imagery and metaphorical constructions. Hafez’s poetry is, on the one hand, cryptic and complex because of his era’s restricting social properties and censorship impediments, and on the other hand, sophisticated because of his encapsulation of high-calibre world-views, mystical and philosophical attributes, artistically knitted within majestic decorations. Our research is strongly influenced by, and is a continuation of, Mahmoud Houman’s instrumental and essential chronological classification of ghazals by Hafez. Houman’s chronological classification method (Houman, 1938), which we have adopted here, provides guidance to choose the correct version of Hafez’s poems among multiple manuscripts. Houman’s semantic analysis of Hafez’s poetry is unique in that the central concept of his classification is based on intelligent scrutiny of meanings and careful observation of the evolutionary psychology of Hafez through his remarkable body of work. Houman’s analysis has provided the annotated data for the classification algorithms we will develop to classify the poems. We pursue an understanding of Hafez through Houman’s perspective. In addition, we asked a contemporary expert to annotate Hafez’s ghazals (Raad, 2019). The rationale behind our research is also to satisfy the need for more efficient means of scholarly research, and to bring literature and computer science together as much as possible. Our research will support semantic analysis, and help with the design and development of tools for poetry research. We have developed a digital corpus of Hafez’s ghazals and applied proper word forms and punctuation. We digitized and extended chronological criteria to guide the correction and validation of Hafez’s poetry. To our knowledge, no automatic chronological classification has been conducted for Hafez’s poetry. Other than the meticulous preparation of our bilingual Hafez corpus for computational use, the innovative aspect of our classification research is two-fold. The first objective of our work is to develop semantic features to better train automatic classifiers for annotated poems and to apply the classifiers to unannotated poems, that is, to classify the rest of the poems by applying machine learning (ML) methodology. The second task is to extract semantic information and properties to help design a visualization scheme that provides a link between the predictions’ rationale and Houman’s perception of the chronological properties of Hafez’s poetry. We identified and used effective Natural Language Processing (NLP) techniques such as classification, word-embedding features, and visualization to facilitate and automate semantic analysis of Hafez’s poetry. We defined and applied rigorous and repeatable procedures that can potentially be applied to other kinds of poetry. We showed that the chronological segments identified automatically were coherent.
We presented and compared two independent chronological labellings of Hafez’s ghazals in digital form, produced their ontologies and explained the inter-annotator agreement and distributional semantic properties using relevant NLP techniques, to help guide future corrections, authentication, and interpretation of Hafez’s poetry. Chronological labelling of the whole corpus not only helps better understand Hafez’s poetry, but is a rigorous guide to better recognition of the correct versions of Hafez’s poems among multiple manuscripts. Such a small volume of complex poetic text required careful selection when choosing and developing appropriate ML techniques for the task. Through many classification and clustering experiments, we have achieved state-of-the-art prediction of chronological poems, trained and evaluated against our hand-made Hafez corpus. Our selected classification algorithm was a Support Vector Machine (SVM), trained with Latent Dirichlet Allocation (LDA)-based similarity features. We used clustering to produce an alternative perspective to classification. For our visualization methodology, we used the LDA features but also passed the results to a Principal Component Analysis (PCA) module to reduce the number of dimensions to two, thereby enabling graphical presentations. We believe that applying this method to poetry classification, and showing the topic relations between poems in the same classes, will help us better understand the interrelated topics within the poems. Many of our methods can potentially be used in similar cases in which the intention is to semantically classify poetry.
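The classification-and-visualization pipeline can be sketched in a few lines of scikit-learn; the toy poems and labels are invented, and raw LDA topic mixtures stand in for the LDA-based similarity features the thesis actually uses.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation, PCA
from sklearn.svm import SVC

poems = ["wine cup beloved tavern", "ascetic preacher pulpit piety",
         "nightingale rose garden dawn", "hypocrisy sermon robe wine"]
labels = [0, 1, 0, 1]  # toy stand-ins for Houman's period labels

# LDA topic mixtures as the document features...
X = CountVectorizer().fit_transform(poems)
topics = LatentDirichletAllocation(n_components=3, random_state=0).fit_transform(X)

# ...fed to an SVM for chronological classification,
clf = SVC(kernel="rbf").fit(topics, labels)

# ...and reduced to 2-D with PCA for the visualisation step.
xy = PCA(n_components=2).fit_transform(topics)
print(clf.predict(topics), xy.shape)
```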
APA, Harvard, Vancouver, ISO, and other styles
48

Ashraf, Jamshaid. "A semantic framework for ontology usage analysis." Thesis, Curtin University, 2013. http://hdl.handle.net/20.500.11937/1142.

Full text
Abstract:
The Semantic Web envisions a Web where information is accessible and processable by computers as well as humans. Ontologies are the cornerstones for realizing this vision of the Semantic Web by capturing domain knowledge, defining the terms and the relationships between these terms to provide a formal representation of the domain with machine-understandable semantics. Ontologies are used for semantic annotation, data interoperability and knowledge assimilation and dissemination. In the literature, different approaches have been proposed to build and evolve ontologies, but in addition to these, one more important concept needs to be considered in the ontology lifecycle: its usage. Measuring the “usage” of ontologies will help us to effectively and efficiently make use of semantically annotated structured data published on the Web (formalized knowledge published on the Web), improve the state of ontology adoption and reusability, provide a usage-based feedback loop to the ontology maintenance process for a pragmatic conceptual model update, and source information accurately and automatically, which can then be utilized in other areas of the ontology lifecycle. Ontology Usage Analysis is the area which evaluates, measures and analyses the use of ontologies on the Web. However, in spite of its importance, no formal approach is present in the literature which focuses on measuring the use of ontologies on the Web. This is in contrast to the approaches proposed in the literature for the other concepts of the ontology lifecycle, such as ontology development, ontology evaluation and ontology evolution. To address this gap, this thesis is an effort to assess, analyse and represent the use of ontologies on the Web. In order to address the problem and realize the abovementioned benefits, an Ontology Usage Analysis Framework (OUSAF) is presented. The OUSAF framework implements a methodological approach which is comprised of identification, investigation, representation and utilization phases. These phases provide a complete solution for usage analysis by allowing users to identify the key ontologies, and investigate, represent and utilize usage analysis results. Various computational components with several methods, techniques, and metrics for each phase are presented and evaluated using Semantic Web data crawled from the Web. For the dissemination of ontology-usage-related information accessible to machines and humans, the U Ontology is presented to formalize the conceptual model of the ontology usage domain. The evaluation of the framework, solution components, methods, and formalized conceptual model is presented, indicating the usefulness of the overall proposed solution.
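A first-phase "identification" step might be approximated as follows with rdflib in Python; the input file is hypothetical, and real usage analysis in the OUSAF framework goes well beyond predicate counting.

```python
from collections import Counter
from rdflib import Graph
from rdflib.util import guess_format

def namespace(uri):
    """Hash namespaces keep everything up to '#'; slash namespaces up to the last '/'."""
    u = str(uri)
    return u.rsplit("#", 1)[0] + "#" if "#" in u else u.rsplit("/", 1)[0] + "/"

# "crawl-sample.nt" is a hypothetical dump standing in for a Semantic Web crawl.
g = Graph()
g.parse("crawl-sample.nt", format=guess_format("crawl-sample.nt"))

# Count how often each predicate namespace is instantiated: a crude signal
# for which ontologies are actually in use in the crawled data.
usage = Counter(namespace(p) for _s, p, _o in g)
for ns, n in usage.most_common(5):
    print(f"{n:6d}  {ns}")
```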
APA, Harvard, Vancouver, ISO, and other styles
49

Chabot, Yoan. "Construction, enrichment and semantic analysis of timelines : application to digital forensics." Thesis, Dijon, 2015. http://www.theses.fr/2015DIJOS037/document.

Full text
Abstract:
Having a clear view of events that occurred over time is a difficult objective to achieve in digital investigations (DI). Event reconstruction, which allows investigators to build and to understand the timeline of an incident, is one of the most important steps of a DI process. The complete understanding of an incident and its circumstances requires on the one hand to associate each piece of information to its meaning, and on the other hand to identify semantic relationships between these fragments. This complex task requires the exploration of a large and heterogeneous amount of information found on the crime scene. Therefore, investigators encounter cognitive overload problems when processing this data, causing them to make mistakes or omit information that could have a high added value for the progress of the investigation. In addition, any result produced by the reconstruction process must meet several legal requirements to be admissible at trial, including the ability to explain how the results were produced. To help the investigators to deal with these problems, this thesis introduces a semantic-based approach called SADFC. The main objective of this approach is to provide investigators with tools to help them find the meaning of the entities composing the crime scene and understand the relationships linking these entities, while respecting the legal requirements. To achieve this goal, SADFC is composed of two elements. First, SADFC is based on theoretical foundations, ensuring the credibility of the results produced by the tools via a formal and rigorous definition of the processes used. This approach then proposes an architecture centered on an ontology to model and structure the knowledge inherent to an incident and to assist the investigator in the analysis of this knowledge. The relevance and the effectiveness of this architecture are demonstrated through a case study describing a fictitious investigation
APA, Harvard, Vancouver, ISO, and other styles
50

De Luca, Ernesto William. "Semantic support in multilingual text retrieval." Aachen : Shaker, 2008. http://d-nb.info/990194914/04.

Full text
APA, Harvard, Vancouver, ISO, and other styles