Dissertations / Theses on the topic 'Semantics - Data processing. eng'

Consult the top 50 dissertations / theses for your research on the topic 'Semantics - Data processing. eng.'


1

Marques, Caio Miguel. "Pangea - Arquitetura semântica para a integração de dados e modelos geoespaciais na Web /." São José do Rio Preto : [s.n.], 2010. http://hdl.handle.net/11449/98654.

Abstract:
Advisor: Ivan Rizzo Guilherme. Committee: Marilde Terezinha Prado Santos, Carlos Roberto Valêncio.
Geographic information is required in many areas of human knowledge and activity. Nowadays, a large part of this information is published on the Web by many different actors, from governmental institutions and academia to ordinary citizens, in several formats and using a variety of technologies. Despite the enormous amount of geographic data and models published on the Web, this diversity of formats and technologies, together with the shortcomings of existing solutions, limits the consumption, integration and sharing of geographic information. Recently, approaches have been proposed that add semantics to the description of geographic information so that its discovery and integration can be improved. This work presents a survey of the semantic architectures and infrastructures used to integrate and share geographic data and models. Based on this survey, the aspects common to the studied infrastructures were identified and used to define the architecture proposed in this work, named Pangea, which comprises the following modules: semantic annotation, semantic description alignment, semantic repositories, and semantic discovery and integration of geographic data and models. Of these, the semantic repository and part of the semantic data discovery and integration functionality were implemented. To evaluate the implemented components of Pangea, a case study on coastal oil spills is presented.
2

鄧偉明 and Wai-ming Tang. "Semantics of authentication in workflow security." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2001. http://hub.hku.hk/bib/B30110828.

3

Sax, Matthias J. "Performance Optimizations and Operator Semantics for Streaming Data Flow Programs." Doctoral thesis, Humboldt-Universität zu Berlin, 2020. http://dx.doi.org/10.18452/21424.

Abstract:
Modern companies are able to collect more data, and require insights from it faster, than ever before. Relational databases do not meet the requirements for processing these often unstructured data sets with reasonable performance. The database research community started to address these trends in the early 2000s, and two new research directions have attracted major interest since: large-scale non-relational data processing and low-latency data stream processing. Large-scale non-relational data processing, commonly known as "Big Data" processing, was quickly adopted by industry. In parallel, low-latency data stream processing was mainly driven by the research community, which developed new systems that embrace a distributed architecture and scalability and exploit data parallelism. While these systems have gained more and more attention in industry, there are still major challenges in operating them at large scale. The goal of this dissertation is two-fold: first, to investigate the runtime characteristics of large-scale data-parallel distributed streaming systems; and second, to propose the "Dual Streaming Model" to express the semantics of continuous queries over data streams and tables. Our goal is to improve the understanding of system and query runtime behavior with the aim of provisioning queries automatically. We introduce a cost model for streaming data flow programs that takes into account the two techniques of record batching and data parallelization, and we introduce optimization algorithms that leverage this model for cost-based query provisioning.
The proposed Dual Streaming Model expresses the result of a streaming operator as a stream of successive updates to a result table, inducing a duality between streams and tables. The model handles the inconsistency between the logical and the physical order of records within a data stream natively, which allows for deterministic semantics as well as low-latency query execution.
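The stream-table duality at the heart of the Dual Streaming Model can be illustrated with a short sketch: a changelog stream of keyed updates is folded into a result table, and a record arriving late, with an event timestamp older than the latest update already applied for its key, is ignored, so the materialized table is deterministic regardless of physical arrival order. This is a minimal illustration of the idea, not Sax's implementation; all names are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Update:
    key: str
    value: int
    ts: int  # event timestamp; records may arrive out of order

def materialize(stream):
    """Fold a changelog stream into a result table (key -> value).

    A record whose event timestamp is older than the latest update
    already applied for its key is ignored, so the result does not
    depend on physical arrival order."""
    table = {}  # key -> (ts, value)
    for rec in stream:
        last = table.get(rec.key)
        if last is None or rec.ts >= last[0]:
            table[rec.key] = (rec.ts, rec.value)
        yield {k: v for k, (_, v) in table.items()}  # table snapshot

# The second update for "a" arrives last but carries an older timestamp.
updates = [Update("a", 1, ts=10), Update("b", 5, ts=11), Update("a", 7, ts=9)]
for snapshot in materialize(updates):
    print(snapshot)  # final snapshot keeps a=1, because ts=9 < ts=10
```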
4

Giese, Holger, Stephan Hildebrandt, and Leen Lambers. "Toward bridging the gap between formal semantics and implementation of triple graph grammars." Universität Potsdam, 2010. http://opus.kobv.de/ubp/volltexte/2010/4521/.

Abstract:
The correctness of model transformations is a crucial element for the model-driven engineering of high-quality software. A prerequisite for verifying model transformations at the level of the model transformation specification is that an unambiguous formal semantics exists and that the employed implementation of the model transformation language adheres to this semantics. However, for existing relational model transformation approaches it is usually not clear under which constraints particular implementations really conform to the formal semantics. In this paper, we bridge this gap for the formal semantics of triple graph grammars (TGG) and an existing efficient implementation. Whereas the formal semantics assumes backtracking and ignores non-determinism, practical implementations do not support backtracking, require rule sets that ensure determinism, and include further optimizations. Therefore, we capture how the considered TGG implementation realizes the transformation by means of operational rules, define the required criteria, and show conformance to the formal semantics when these criteria are fulfilled. We further outline how static analysis can be employed to guarantee these criteria.
5

Nedas, Konstantinos A. "Semantic Similarity of Spatial Scenes." Fogler Library, University of Maine, 2006. http://www.library.umaine.edu/theses/pdf/NedasKA2006.pdf.

6

Wong, Ping-wai, and 黃炳蔚. "Semantic annotation of Chinese texts with message structures based on HowNet." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2007. http://hub.hku.hk/bib/B38212389.

7

Herre, Heinrich, and Axel Hummel. "A paraconsistent semantics for generalized logic programs." Universität Potsdam, 2010. http://opus.kobv.de/ubp/volltexte/2010/4149/.

Abstract:
We propose a paraconsistent declarative semantics of possibly inconsistent generalized logic programs which allows for arbitrary formulas in the body and in the head of a rule (i.e. does not depend on the presence of any specific connective, such as negation(-as-failure), nor on any specific syntax of rules). For consistent generalized logic programs this semantics coincides with the stable generated models introduced in [HW97], and for normal logic programs it yields the stable models in the sense of [GL88].
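For readers unfamiliar with the stable models of [GL88] that this semantics generalizes, the brute-force sketch below computes them for a three-rule normal logic program by checking every candidate interpretation against its Gelfond-Lifschitz reduct. It illustrates the baseline notion only, not the paraconsistent semantics proposed in the paper.

```python
from itertools import chain, combinations

# A normal logic program: each rule is (head, positive_body, negative_body).
rules = [
    ("p", ["q"], []),   # p :- q.
    ("q", [], ["r"]),   # q :- not r.
    ("r", [], ["q"]),   # r :- not q.
]
atoms = {"p", "q", "r"}

def reduct(rules, model):
    """Gelfond-Lifschitz reduct: drop rules whose negative body meets the
    candidate model, and strip negative literals from the rest."""
    return [(h, pos) for (h, pos, neg) in rules if not set(neg) & model]

def least_model(positive_rules):
    """Least Herbrand model of a negation-free program (fixpoint iteration)."""
    m, changed = set(), True
    while changed:
        changed = False
        for h, pos in positive_rules:
            if set(pos) <= m and h not in m:
                m.add(h)
                changed = True
    return m

def stable_models(rules, atoms):
    candidates = chain.from_iterable(
        combinations(sorted(atoms), r) for r in range(len(atoms) + 1))
    return [set(c) for c in candidates
            if least_model(reduct(rules, set(c))) == set(c)]

print(stable_models(rules, atoms))  # two stable models: {'r'} and {'p', 'q'}
```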
8

Lamprecht, Anna-Lena, Tiziana Margaria, and Bernhard Steffen. "Bio-jETI : a framework for semantics-based service composition." Universität Potsdam, 2009. http://opus.kobv.de/ubp/volltexte/2010/4506/.

Abstract:
Background: The development of bioinformatics databases, algorithms, and tools throughout the last years has led to a highly distributed world of bioinformatics services. Without adequate management and development support, in silico researchers are hardly able to exploit the potential of building complex, specialized analysis processes from these services. The Semantic Web aims at thoroughly equipping individual data and services with machine-processable meta-information, while workflow systems support the construction of service compositions. However, even in this combination, in silico researchers currently would have to deal manually with the service interfaces, the adequacy of the semantic annotations, type incompatibilities, and the consistency of service compositions. Results: In this paper, we demonstrate by means of two examples how Semantic Web technology together with an adequate domain modelling frees in silico researchers from dealing with interfaces, types, and inconsistencies. In Bio-jETI, bioinformatics services can be graphically combined into complex services without worrying about details of their interfaces or about type mismatches of the composition. These issues are taken care of at the semantic level by Bio-jETI's model checking and synthesis features. Whenever possible, they automatically resolve type mismatches in the considered service setting; otherwise, they graphically indicate impossible/incorrect service combinations. In the latter case, the workflow developer may either modify his service composition using semantically similar services, or ask for help in developing the missing mediator that correctly bridges the detected type gap. Newly developed mediators should then be adequately annotated semantically, and added to the service library for later reuse in similar situations. Conclusion: We show the power of semantic annotations in an adequately modelled and semantically enabled domain setting. Using model checking and synthesis methods, users may orchestrate complex processes from a wealth of heterogeneous services without worrying about interfaces and (type) consistency. The success of this method strongly depends on a careful semantic annotation of the provided services and on its consequent exploitation for analysis, validation, and synthesis. We are convinced that these annotations will become standard, as they will become preconditions for the success and widespread use of (preferred) services in the Semantic Web.
9

Harrison, Dave. "Functional real-time programming : the language Ruth and its semantics." Thesis, University of Stirling, 1988. http://hdl.handle.net/1893/12116.

Abstract:
Real-time systems are amongst the most safety critical systems involving computer software and the incorrect functioning of this software can cause great damage, up to and including the loss of life. It seems sensible therefore to write real-time software in a way that gives us the best chance of correctly implementing specifications. Because of the high level of functional programming languages, their semantic simplicity and their amenability to formal reasoning and correctness preserving transformation it thus seems natural to use a functional language for this task. This thesis explores the problems of applying functional programming languages to real-time by defining the real-time functional programming language Ruth. The first part of the thesis concerns the identification of the particular problems associated with programming real-time systems. These can broadly be stated as a requirement that a real-time language must be able to express facts about time, a feature we have called time expressibility. The next stage is to provide time expressibility within a purely functional framework. This is accomplished by the use of timestamps on inputs and outputs and by providing a real-time clock as an input to Ruth programs. The final major part of the work is the construction of a formal definition of the semantics of Ruth to serve as a basis for formal reasoning and transformation. The framework within which the formal semantics of a real-time language are defined requires time expressibility in the same way as the real-time language itself. This is accomplished within the framework of domain theory by the use of specialised domains for timestamped objects, called herring-bone domains. These domains could be used as the basis for the definition of the semantics of any real-time language.
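Ruth's two devices for time expressibility, timestamps on all inputs and outputs plus a real-time clock supplied as an input stream, can be sketched briefly. Python generators stand in here for the lazy streams of a functional language; the names, clock period and declared latency are illustrative assumptions of ours, not Ruth's actual syntax or semantics.

```python
import itertools

def clock(period_ms=10):
    """The real-time clock as an input stream of timestamps."""
    return itertools.count(start=0, step=period_ms)

def stamp(values, clk):
    """Pair each input value with the clock reading at which it arrives."""
    return ((t, v) for t, v in zip(clk, values))

def control(inputs):
    """A purely functional 'program': output timestamps are computed from
    input timestamps, so facts about time (here, a declared 5 ms
    processing latency) are expressed in the program itself."""
    for t, v in inputs:
        yield (t + 5, v * 2)

for out in control(stamp([1, 2, 3], clock())):
    print(out)  # (5, 2), (15, 4), (25, 6)
```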
10

Otten, Frederick John. "Using semantic knowledge to improve compression on log files." Thesis, Rhodes University, 2008. http://eprints.ru.ac.za/1660/.

11

Fan, Yang, Hidehiko Masuhara, Tomoyuki Aotani, Flemming Nielson, and Hanne Riis Nielson. "AspectKE*: Security aspects with program analysis for distributed systems." Universität Potsdam, 2010. http://opus.kobv.de/ubp/volltexte/2010/4136/.

Abstract:
Enforcing security policies in distributed systems is difficult, in particular when a system contains untrusted components. We designed AspectKE*, a distributed AOP language based on a tuple space, to tackle this issue. In AspectKE*, aspects can enforce access control policies that depend on the future behavior of running processes. One of the key language features is the set of predicates and functions that extract results of static program analysis, which are useful for defining security aspects that have to know about the future behavior of a program. AspectKE* also provides a novel variable binding mechanism for pointcuts, so that pointcuts can uniformly specify join points based on both static and dynamic information about the program. Our implementation strategy performs the fundamental static analysis at load time, so as to keep runtime overheads minimal. We implemented a compiler for AspectKE*, and demonstrate the usefulness of AspectKE* through a security aspect for a distributed chat system.
12

Pham, Son Bao. "Incremental knowledge acquisition for natural language processing." University of New South Wales, School of Computer Science and Engineering, 2006. http://handle.unsw.edu.au/1959.4/26299.

Abstract:
Linguistic patterns have been used widely in shallow methods to develop numerous NLP applications. Approaches for acquiring linguistic patterns can be broadly categorised into three groups: supervised learning, unsupervised learning and manual methods. In supervised learning approaches, a large annotated training corpus is required for the learning algorithms to achieve decent results. However, annotated corpora are expensive to obtain and usually available only for established tasks. Unsupervised learning approaches usually start with a few seed examples and gather some statistics based on a large unannotated corpus to detect new examples that are similar to the seed ones. Most of these approaches either populate lexicons for predefined patterns or learn new patterns for extracting general factual information; hence they are applicable to only a limited number of tasks. Manually creating linguistic patterns has the advantage of utilising an expert's knowledge to overcome the scarcity of annotated data. In tasks with no annotated data available, the manual way seems to be the only choice. One typical problem that occurs with manual approaches is that the combination of multiple patterns, possibly being used at different stages of processing, often causes unintended side effects. Existing approaches, however, do not focus on the practical problem of acquiring those patterns but rather on how to use linguistic patterns for processing text. A systematic way to support the process of manually acquiring linguistic patterns in an efficient manner is long overdue. This thesis presents KAFTIE, an incremental knowledge acquisition framework that strongly supports experts in creating linguistic patterns manually for various NLP tasks. KAFTIE addresses difficulties in manually constructing knowledge bases of linguistic patterns, or rules in general, often faced in existing approaches by: (1) offering a systematic way to create new patterns while ensuring they are consistent; (2) alleviating the difficulty in choosing the right level of generality when creating a new pattern; (3) suggesting how existing patterns can be modified to improve the knowledge base's performance; (4) making the effort in creating a new pattern, or modifying an existing pattern, independent of the knowledge base's size. KAFTIE, therefore, makes it possible for experts to efficiently build large knowledge bases for complex tasks. This thesis also presents the KAFDIS framework for discourse processing using new representation formalisms: the level-of-detail tree and the discourse structure graph.
13

Zhan, Tianjie. "Semantic analysis for extracting fine-grained opinion aspects." HKBU Institutional Repository, 2010. http://repository.hkbu.edu.hk/etd_ra/1213.

14

Yang, Yimin. "Exploring Hidden Coherent Feature Groups and Temporal Semantics for Multimedia Big Data Analysis." FIU Digital Commons, 2015. http://digitalcommons.fiu.edu/etd/2254.

Abstract:
Thanks to advanced technologies and social networks that allow data to be widely shared across the Internet, there is an explosion of pervasive multimedia data, generating high demand for multimedia services and applications that let people easily access and manage multimedia data in various areas. In response to such demands, multimedia big data analysis has become an emerging hot topic in both industry and academia, ranging from basic infrastructure, management, search, and mining to security, privacy, and applications. Within the scope of this dissertation, a multimedia big data analysis framework is proposed for semantic information management and retrieval, with a focus on rare event detection in videos. The proposed framework is able to explore hidden semantic feature groups in multimedia data and to incorporate temporal semantics, especially for video event detection. First, a hierarchical semantic data representation is presented to alleviate the semantic gap issue, and the Hidden Coherent Feature Group (HCFG) analysis method is proposed to capture the correlation between features and separate the original feature set into semantic groups, seamlessly integrating multimedia data in multiple modalities. Next, an Importance Factor based Temporal Multiple Correspondence Analysis (IF-TMCA) approach is presented for effective event detection. Specifically, the HCFG algorithm is integrated with the Hierarchical Information Gain Analysis (HIGA) method to generate the Importance Factor (IF) for producing the initial detection results. Then, the TMCA algorithm is proposed to efficiently incorporate temporal semantics for re-ranking and improving the final performance. Finally, a sampling-based ensemble learning mechanism is applied to further accommodate imbalanced datasets. In addition to the multimedia semantic representation and class imbalance problems, lack of organization is another critical issue for multimedia big data analysis. In this framework, an affinity propagation-based summarization method is therefore also proposed to transform the unorganized data into a better structure with clean and well-organized information. The whole framework has been thoroughly evaluated across multiple domains, such as soccer goal event detection and disaster information management.
15

Gunaratna, Kalpa. "Semantics-based Summarization of Entities in Knowledge Graphs." Wright State University / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=wright1496124815009777.

16

Barber, Nicole. "Aktionsart coercion." University of Western Australia. School of Humanities, 2008. http://theses.library.uwa.edu.au/adt-WU2008.0248.

Abstract:
This study aimed to investigate English Aktionsart coercion, particularly novel coercion, through corpus-based research. Novel coercions are those which need some contextual support in order to make sense or be grammatical. Due to the nature of the data, a necessary part of the study was the design of a program to help in the process of tagging corpora for Aktionsart. This thesis starts with a discussion of five commonly accepted Aktionsarten: state, activity, achievement, accomplishment, and semelfactive. One significant contribution of the thesis is that it offers a comprehensive review and discussion of various theories that have been proposed to account for Aktionsart or aspectual coercion, as no such synthesis is available in the literature. The thesis thus moves on to a review of many of the more prominent works in the area of Aktionsart coercion, including Moens and Steedman (1988), Pustejovsky (1995), and De Swart (1998). I also present a few theories drawn from less prominent studies by authors in the area who have different or interesting views on the topic, such as Bickel (1997), Krifka (1998), and Xiao and McEnery (2004). In order to study the Aktionsart coercion of verbs in large corpora, examples of Aktionsart coercion needed to be collected. I aimed to design a computer program that could ideally perform a large portion of this task automatically. I present the methods I used in designing the program, as well as the process involved in using it to collect data. Some major steps in my research were the tagging of corpora, the counting of coercion frequency by type, and the selection of representative examples of different types of coercion for analysis and discussion. All of the examples collected from the corpora, both by my Aktionsart-tagging program and manually, were conventional coercions, so there was no opportunity for an analysis of novel coercions. I nevertheless discuss the examples of conventional coercion that I gathered from the corpus analysis, with particular reference to Moens and Steedman's (1988) theory. Three dominant types of coercion were identified in the data: from activities into accomplishments, activities into states, and accomplishments into states. Coercion took place in two main ways: from activity to accomplishment through the addition of an endpoint, and from various Aktionsarten into state by coercing the event into being a property of someone or something. Many of the Aktionsart coercion theories are supported at least in part by the data found in natural language. One of the most prominent coercions that is underrepresented in the data is from achievement to accomplishment through the addition of a preparatory process. I conclude that while there are reasons for analysing Aktionsart at verb phrase or sentence level, this does not mean the possibility of analyses at the lexical level should be ignored.
17

Varde, Aparna S. "Graphical data mining for computational estimation in materials science applications." Link to electronic thesis, 2006. http://www.wpi.edu/Pubs/ETD/Available/etd-081506-152633/.

18

Herre, Heinrich, and Axel Hummel. "Stationary generated models of generalized logic programs." Universität Potsdam, 2010. http://opus.kobv.de/ubp/volltexte/2010/4150/.

Abstract:
The interest in extensions of the logic programming paradigm beyond the class of normal logic programs is motivated by the need for an adequate representation and processing of knowledge. One of the most difficult problems in this area is to find an adequate declarative semantics for logic programs. In the present paper a general preference criterion is proposed that selects the 'intended' partial models of generalized logic programs; it is a conservative extension of the stationary semantics for normal logic programs of [Prz91]. The presented preference criterion defines a partial model of a generalized logic program as intended if it is generated by a stationary chain. It turns out that the stationary generated models coincide with the stationary models on the class of normal logic programs. The general well-founded semantics of such a program is defined as the set-theoretic intersection of its stationary generated models. For normal logic programs the general well-founded semantics equals the well-founded semantics.
19

Doyen, Laurent. "Algorithmic analysis of complex semantics for timed and hybrid automata." Doctoral thesis, Universite Libre de Bruxelles, 2006. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/210853.

Abstract:
In the field of formal verification of real-time systems, major developments have been recorded in the last fifteen years, concerning logics, automata, process algebra, programming languages, and more. From the beginning, one formalism has played an important role: timed automata and their natural extension, hybrid automata. Those models allow the definition of real-time constraints using real-valued clocks or, more generally, analog variables whose evolution is governed by differential equations. They generalize finite automata in that their semantics defines timed words, where each symbol is associated with an occurrence timestamp.

The decidability and algorithmic analysis of timed and hybrid automata have been intensively studied in the literature. The central result for timed automata is that they are positively decidable. This is not the case for hybrid automata, but semi-algorithmic methods are known when the dynamics is relatively simple, namely a linear relation between the derivatives of the variables. With the increasing complexity of today's systems, those models are however limited in their classical semantics for modelling realistic implementations or dynamical systems.

In this thesis, we study the algorithmics of complex semantics for timed and hybrid automata. On the one hand, we propose implementable semantics for timed automata and study their computational properties: in contrast with other works, we identify a semantics that is implementable and has decidable properties. On the other hand, we give new algorithmic approaches to the analysis of hybrid automata whose dynamics is given by an affine function of its variables.
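As a reminder of the base formalism behind these results, a timed automaton's run can be simulated directly: real-valued clocks advance uniformly as time elapses, transitions are guarded by clock constraints and may reset clocks, and a run spells out a timed word of (symbol, timestamp) pairs. The two-location automaton below is a toy example of ours, not one from the thesis.

```python
class TimedAutomaton:
    """Minimal simulator: locations, a dictionary of real-valued clocks,
    and edges of the form (src, symbol, guard, resets, dst)."""
    def __init__(self, edges, init):
        self.edges, self.loc = edges, init
        self.clocks, self.now = {"x": 0.0}, 0.0

    def step(self, delay, symbol):
        # time elapses: every clock advances at rate 1
        self.now += delay
        for c in self.clocks:
            self.clocks[c] += delay
        for src, sym, guard, resets, dst in self.edges:
            if src == self.loc and sym == symbol and guard(self.clocks):
                for c in resets:
                    self.clocks[c] = 0.0
                self.loc = dst
                return (symbol, self.now)  # one letter of the timed word
        raise ValueError("no enabled edge")

# Constraint expressed with clock x: 'b' must follow 'a' within 2 time units.
ta = TimedAutomaton(
    edges=[("s0", "a", lambda cl: True, ["x"], "s1"),
           ("s1", "b", lambda cl: cl["x"] <= 2.0, [], "s0")],
    init="s0",
)
print(ta.step(1.0, "a"), ta.step(1.5, "b"))  # ('a', 1.0) ('b', 2.5)
```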
20

Smith, Marc L. "View-centric reasoning about parallel and distributed computation." Doctoral diss., University of Central Florida, 2000. http://digital.library.ucf.edu/cdm/ref/collection/RTD/id/1597.

Abstract:
The development of distributed applications has not progressed as rapidly as its enabling technologies. In part, this is due to the difficulty of reasoning about such complex systems. In contrast to sequential systems, parallel systems give rise to parallel events and the resulting uncertainty in the observed order of these events. Loosely coupled distributed systems complicate this even further by introducing the element of multiple imperfect observers of these parallel events. The goal of this dissertation is to advance parallel and distributed systems development by producing a parameterized model that can be instantiated to reflect the computation and coordination properties of such systems. The result is a model called paraDOS, which we show to be general enough to have instantiations of two very distinct distributed computation models, Actors and tuple space. We show how paraDOS allows us to use operational semantics to reason about computation when such reasoning must account for multiple, inconsistent and imperfect views. We then extend the paraDOS model with an abstraction to support composition of communicating computational systems. This extension gives us a tool to reason formally about heterogeneous systems, and about new distributed computing paradigms such as the multiple tuple spaces support seen in Sun's JavaSpaces and IBM's T Spaces.
21

Hartig, Olaf. "Querying a Web of Linked Data." Doctoral thesis, Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät II, 2014. http://dx.doi.org/10.18452/17015.

Abstract:
During recent years a set of best practices for publishing and connecting structured data on the World Wide Web (WWW) has emerged. These best practices are referred to as the Linked Data principles, and the resulting form of Web data is called Linked Data. The increasing adoption of these principles has led to the creation of a globally distributed space of Linked Data that covers various domains such as government, libraries, life sciences, and media. Approaches that conceive this data space as a huge distributed database and enable an execution of declarative queries over this database hold an enormous potential; they allow users to benefit from a virtually unbounded set of up-to-date data. As a consequence, several research groups have started to study such approaches. However, the main focus of existing work is to address practical challenges that arise in this context; research on the foundations of such approaches is largely missing. This dissertation closes this gap.
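The link-traversal style of query execution studied in this line of work can be sketched as follows: dereference seed URIs, merge the retrieved RDF into a local graph, follow URIs discovered along the way up to some bound, and evaluate the query over the merged data. The sketch assumes the rdflib library and invented example URIs; it is a simplification, not Hartig's actual execution model.

```python
from rdflib import Graph, URIRef

def traverse_and_query(seed_uris, sparql, max_docs=10):
    """Follow-your-nose query execution over Linked Data (simplified)."""
    g = Graph()
    seen, frontier = set(), list(seed_uris)
    while frontier and len(seen) < max_docs:
        uri = frontier.pop(0)
        if uri in seen:
            continue
        seen.add(uri)
        try:
            g.parse(uri)  # dereference the URI; the server returns RDF
        except Exception:
            continue      # skip unreachable or non-RDF documents
        for s, p, o in g:  # enqueue URIs seen in the retrieved triples
            for term in (s, o):
                if isinstance(term, URIRef) and str(term) not in seen:
                    frontier.append(str(term))
    return g.query(sparql)

# Hypothetical usage: names reachable from a seed document.
results = traverse_and_query(
    ["http://example.org/alice"],  # illustrative seed URI
    "SELECT ?name WHERE { ?s <http://xmlns.com/foaf/0.1/name> ?name }",
)
for row in results:
    print(row.name)
```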
22

Zafalon, Zaira Regina. "Scan for MARC : princípios sintáticos e semânticos de registros bibliográficos aplicados à conversão de dados analógicos para o formato MARC 21 bibliográfico /." Marília : [s.n.], 2012. http://hdl.handle.net/11449/103386.

Abstract:
Advisor: Plácida Leopoldina Ventura Amorim da Costa Santos. Committee: Dulce Maria Baptista, Edberto Ferneda, Elisa Campos Machado, Ricardo César Gonçalves Sant'Ana.
The research presents as its central theme the study of the bibliographic record conversion process. The object of study is framed by an understanding of analog bibliographic record conversion to the MARC 21 Bibliographic format, based on a syntactic and semantic analysis of records described according to descriptive metadata structure standards and content standards. The thesis of this research is that the syntactic and semantic principles of bibliographic records, defined by the description and visualization schemes used in cataloguing and present in descriptive metadata structure standards and content standards, determine the process of converting bibliographic records to the MARC 21 Bibliographic format. In the light of this, the purpose of this research is to develop a theoretical study of the syntax and semantics of bibliographic records, grounded in the linguistic theories of Saussure and Hjelmslev, which can underlie analog bibliographic record conversion to the MARC 21 Bibliographic format using a computational interpreter. To this end, the general aim was to develop a theoretical-conceptual model of the syntax and semantics of bibliographic records, based on Saussurean and Hjelmslevian linguistic studies of the manifestations of human language, applicable to a computational interpreter designed for the conversion of bibliographic records to the MARC 21 Bibliographic format. To attain this goal, the following specific objectives were identified, in two groups, related respectively to the theoretical-conceptual model of bibliographic record syntax and semantics and to the conversion process of the records: to make explicit the relationship between the syntax and semantics of bibliographic records...
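The target side of the conversion the thesis studies can be made concrete with a small sketch: a flat descriptive record is mapped onto MARC 21 bibliographic fields. It assumes the pymarc library (5.x API), and the four mapping rules are a deliberately tiny, hypothetical rule set standing in for the syntactic and semantic analysis the thesis proposes; a real conversion depends on the source metadata structure standard and content standard.

```python
from pymarc import Record, Field, Subfield  # assumes pymarc 5.x

def to_marc21(meta):
    """Map a flat descriptive record onto MARC 21 bibliographic fields."""
    record = Record()
    rules = {                       # hypothetical mapping: key -> (tag, subfield)
        "creator":   ("100", "a"),  # main entry, personal name
        "title":     ("245", "a"),  # title statement
        "publisher": ("260", "b"),  # publication information
        "subject":   ("650", "a"),  # topical subject
    }
    for key, (tag, code) in rules.items():
        if key in meta:
            record.add_field(
                Field(tag=tag, subfields=[Subfield(code=code, value=meta[key])]))
    return record

rec = to_marc21({"creator": "Zafalon, Zaira Regina", "title": "Scan for MARC"})
print(rec)  # prints the record in MARC text form
```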
23

Yang, Li. "Improving Topic Tracking with Domain Chaining." Thesis, University of North Texas, 2003. https://digital.library.unt.edu/ark:/67531/metadc4274/.

Abstract:
Topic Detection and Tracking (TDT) research has produced some successful statistical tracking systems. While lexical chaining, a non-statistical approach, was also applied to the tracking task by Carthy and Stokes for the 2001 TDT evaluation, an efficient tracking system based on this technology has yet to be developed. In this thesis we investigate two new techniques which can improve Carthy's original design. First, at the core of our system is a semantic domain chainer. This chainer relies not only on the WordNet database for semantic relationships but also on Magnini's semantic domain database, which is an extension of WordNet. The domain-chaining algorithm is a linear algorithm. Second, to handle proper nouns, we gather all of the proper nouns that occur in a news story into a chain reserved for them. In this thesis we also discuss the linguistic limitations of lexical chainers in representing textual meaning.
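The domain-chaining step can be sketched as the single linear pass the abstract describes: content words are grouped into chains keyed by semantic domain, and capitalized tokens go to a chain reserved for proper nouns. The tiny DOMAINS dictionary below is a stand-in for Magnini's WordNet Domains resource, and the crude capitalization test replaces real proper-noun detection.

```python
from collections import defaultdict

# Toy stand-in for Magnini's WordNet Domains mapping (word -> domain).
DOMAINS = {
    "bank": "economy", "loan": "economy", "interest": "economy",
    "match": "sport", "team": "sport", "goal": "sport",
}

def domain_chains(tokens):
    """One linear pass: group words into chains keyed by semantic domain;
    capitalized tokens are collected in a reserved proper-noun chain."""
    chains = defaultdict(list)
    for tok in tokens:
        if tok[0].isupper():
            chains["PROPER_NOUNS"].append(tok)
        elif tok.lower() in DOMAINS:
            chains[DOMAINS[tok.lower()]].append(tok)
    return dict(chains)

story = "Barclays raised loan interest while the Arsenal team scored a goal"
print(domain_chains(story.split()))
# {'PROPER_NOUNS': ['Barclays', 'Arsenal'],
#  'economy': ['loan', 'interest'], 'sport': ['team', 'goal']}
```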
24

Sousa, Sidney Roberto de. "Gerenciamento de anotações semanticas de dados na Web para aplicações agricolas." [s.n.], 2010. http://repositorio.unicamp.br/jspui/handle/REPOSIP/275829.

Abstract:
Advisor: Claudia Maria Bauzer Medeiros. Master's dissertation, Universidade Estadual de Campinas, Instituto de Computação.
Geographic information systems (GIS) increasingly use geospatial data from the Web to produce geographic information. One big challenge is to find the relevant data, and such searches are often based on keywords or even file names. However, these approaches lack semantics. Thus, it is necessary to provide mechanisms that prepare data so as to help the retrieval of semantically relevant data. To attack this problem, this dissertation proposes a service-based architecture to manage semantic annotations. In this work, a semantic annotation is a set of triples, called semantic annotation units, <subject, metadata field, object>, where subject is a geospatial resource, metadata field contains some characteristic of this resource, and object is an ontology term that semantically associates the metadata field with some appropriate concept. The main contributions of this dissertation are: a comparative study of annotation tools; the specification and implementation of a service-based architecture to manage semantic annotations, including services for handling ontology terms; and a comparative analysis of mechanisms for storing semantic annotations. The work takes as a case study semantic annotations of agricultural resources.
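The annotation unit <subject, metadata field, object> maps naturally onto RDF triples, which also suggests how storage and retrieval can work. The namespaces and the query below are our illustrative choices, not the dissertation's actual schema; the sketch assumes the rdflib library.

```python
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/geo/")         # illustrative namespaces
AGRO = Namespace("http://example.org/agro-onto#")

g = Graph()

def annotate(subject_doc, metadata_field, ontology_term):
    """One semantic annotation unit: <subject, metadata field, object>,
    tying a metadata field of a geospatial resource to an ontology term."""
    g.add((subject_doc, metadata_field, ontology_term))

# The 'theme' metadata field of a soil map is tied to the concept Soil.
annotate(EX["soil-map-42"], EX["theme"], AGRO["Soil"])

# Retrieving semantically relevant documents then becomes a graph query.
q = "SELECT ?doc WHERE { ?doc ?field <http://example.org/agro-onto#Soil> }"
for row in g.query(q):
    print(row.doc)  # http://example.org/geo/soil-map-42
```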
25

Vitaliano Filho, Arnaldo Francisco, 1982. "Mecanismos de anotação semântica para workflows científicos." [s.n.], 2009. http://repositorio.unicamp.br/jspui/handle/REPOSIP/275760.

Abstract:
Advisor: Claudia Maria Bauzer Medeiros. Master's dissertation, Universidade Estadual de Campinas, Instituto de Computação.
The sharing of information, processes and models of experiments is increasing among scientists from many organizations and areas of knowledge, and thus there is a need for workflow discovery mechanisms. Many of these models are described as scientific workflows. However, there is no standard specification to describe them, which complicates the reuse of workflows and components that are already available. This thesis contributes to solving this problem with the following results: an analysis of issues related to the sharing and cooperative design of scientific workflows on the Web; an analysis of semantic aspects and metadata related to these workflows; and the development of a Web-based workflow editor, using WFMC standards, which incorporates our semantic annotation model for scientific workflows. With this, the thesis creates the basis for the discovery, reuse and sharing of scientific workflows on the Web. The editor allows researchers to build their workflows and annotations online, enabling the annotation system to be tested with external data.
26

Faruque, Md Ehsanul. "A Minimally Supervised Word Sense Disambiguation Algorithm Using Syntactic Dependencies and Semantic Generalizations." Thesis, University of North Texas, 2005. https://digital.library.unt.edu/ark:/67531/metadc4969/.

Abstract:
Natural language is inherently ambiguous. For example, the word "bank" can mean a financial institution or a river shore. Finding the correct meaning of a word in a particular context is a task known as word sense disambiguation (WSD), which is essential for many natural language processing applications such as machine translation, information retrieval, and others. While most current WSD methods try to disambiguate a small number of words for which enough annotated examples are available, the method proposed in this thesis attempts to address all words in unrestricted text. The method is based on constraints imposed by syntactic dependencies and concept generalizations drawn from an external dictionary. The method was tested on standard benchmarks as used during the SENSEVAL-2 and SENSEVAL-3 WSD international evaluation exercises, and was found to be competitive.
27

Sinha, Ravi Som. "Graph-based Centrality Algorithms for Unsupervised Word Sense Disambiguation." Thesis, University of North Texas, 2008. https://digital.library.unt.edu/ark:/67531/metadc9736/.

Abstract:
This thesis introduces an innovative methodology of combining some traditional dictionary based approaches to word sense disambiguation (semantic similarity measures and overlap of word glosses, both based on WordNet) with some graph-based centrality methods, namely the degree of the vertices, Pagerank, closeness, and betweenness. The approach is completely unsupervised, and is based on creating graphs for the words to be disambiguated. We experiment with several possible combinations of the semantic similarity measures as the first stage in our experiments. The next stage attempts to score individual vertices in the graphs previously created based on several graph connectivity measures. During the final stage, several voting schemes are applied on the results obtained from the different centrality algorithms. The most important contributions of this work are not only that it is a novel approach and it works well, but also that it has great potential in overcoming the new-knowledge-acquisition bottleneck which has apparently brought research in supervised WSD as an explicit application to a plateau. The type of research reported in this thesis, which does not require manually annotated data, holds promise of a lot of new and interesting things, and our work is one of the first steps, despite being a small one, in this direction. The complete system is built and tested on standard benchmarks, and is comparable with work done on graph-based word sense disambiguation as well as lexical chains. The evaluation indicates that the right combination of the above mentioned metrics can be used to develop an unsupervised disambiguation engine as powerful as the state-of-the-art in WSD.
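A compressed sketch of the construction just described: candidate senses become graph vertices, edges between senses of different words are weighted by a WordNet similarity measure, and a centrality score (PageRank here) selects each word's sense. It assumes nltk's WordNet interface and the networkx library; the similarity measure, the 0.1 threshold and the noun-only restriction are illustrative choices, not the thesis's exact configuration.

```python
import networkx as nx
from nltk.corpus import wordnet as wn  # requires: nltk.download('wordnet')

def disambiguate(words):
    """Pick a sense for each word via PageRank over a sense-similarity graph."""
    senses = {w: wn.synsets(w, pos=wn.NOUN) for w in words}
    g = nx.Graph()
    for w, ss in senses.items():
        g.add_nodes_from((w, s.name()) for s in ss)
    for w1, ss1 in senses.items():       # connect senses of different words
        for w2, ss2 in senses.items():
            if w1 >= w2:
                continue
            for s1 in ss1:
                for s2 in ss2:
                    sim = s1.path_similarity(s2) or 0.0
                    if sim > 0.1:        # illustrative threshold
                        g.add_edge((w1, s1.name()), (w2, s2.name()), weight=sim)
    rank = nx.pagerank(g, weight="weight")
    return {w: max(ss, key=lambda s: rank.get((w, s.name()), 0.0)).name()
            for w, ss in senses.items() if ss}

print(disambiguate(["bank", "money", "river"]))
```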
28

Souza, José Eduardo Pereira de. "Informática na EJA : contribuições da teoria histórico-cultural /." Marília : [s.n.], 2010. http://hdl.handle.net/11449/91180.

Abstract:
Advisor: José Carlos Miguel. Committee: Mariângela Braga Norte, Maria Raquel Miotto Morelatti.
Having worked for many years supplying educational software to schools and universities, we were convinced that technology could be an important tool in the teaching-learning process. When we began our studies we found that Youth and Adult Education (YAE) was a segment little studied and with low use of educational technologies. We also noted that there were already some studies on the use of technology in education focusing on theories of teaching and learning; however, we did not find any directed to the Historical-Cultural Theory. We decided that the goal of our research would be to seek assumptions of the Historical-Cultural Theory that could contribute to the use of computing in Youth and Adult Education. To carry out the study we conducted a literature investigation covering the history of Brazilian education, quality indexes in education, functional illiteracy and literacy, the dual function of the school, which at the same time emancipates and alienates, government initiatives for YAE, and the various learning theories, relating them to technologies. We analyzed the issues of digital literacy and teacher training, and went deeper into the Historical-Cultural Theory, thinking about the environment and about the educational prospects that open up when considering the Zone of Proximal Development and learning through social interaction. We introduced Information and Communication Technologies (ICT), related computing in education to YAE, and presented a real case on the subject. We collected data through interviews, forms, observations and records during the process of teacher education with an emphasis on the historical-cultural approach and during the teachers' activities with students in the computer labs in Pirassununga (SP). We concluded our work showing that the pedagogical approach...
29

Thakur, Amritanshu. "Semantic construction with provenance for model configurations in scientific workflows." Master's thesis, Mississippi State : Mississippi State University, 2008. http://library.msstate.edu/etd/show.asp?etd=etd-07312008-092758.

30

Schwartz, Hansen A. "The acquisition of lexical knowledge from the web for aspects of semantic interpretation." Doctoral diss., University of Central Florida, 2011. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/5028.

Abstract:
This work investigates the effective acquisition of lexical knowledge from the Web to perform semantic interpretation. The Web provides an unprecedented amount of natural language from which to gain knowledge useful for semantic interpretation. The knowledge acquired is described as common sense knowledge, information one uses in his or her daily life to understand language and perception. Novel approaches are presented for both the acquisition of this knowledge and the use of the knowledge in semantic interpretation algorithms. The goal is to increase accuracy over other automatic semantic interpretation systems, and in turn enable stronger real-world applications such as machine translation, advanced Web search, sentiment analysis, and question answering. The major contributions of this dissertation consist of two methods of acquiring lexical knowledge from the Web, namely a database of common sense knowledge and Web selectors. The first method is a framework for acquiring a database of concept relationships. To acquire this knowledge, relationships between nouns are found on the Web and analyzed over WordNet using information theory, producing information about concepts rather than ambiguous words. For the second contribution, words called Web selectors are retrieved which take the place of an instance of a target word in its local context. The selectors serve for the system to learn the types of concepts to which the sense of a target word should be similar. Web selectors are acquired dynamically as part of a semantic interpretation algorithm, while the relationships in the database are useful to stand-alone programs. A final contribution of this dissertation concerns a novel semantic similarity measure and an evaluation of similarity and relatedness measures on tasks of concept similarity. Such tasks are useful when applying acquired knowledge to semantic interpretation.

Applications to word sense disambiguation, an aspect of semantic interpretation, are used to evaluate the contributions. Disambiguation systems which utilize semantically annotated training data are considered supervised. The algorithms of this dissertation are considered minimally supervised; they do not require training data created by humans, though they may use human-created data sources. In the case of evaluating a database of common sense knowledge, integrating the knowledge into an existing minimally supervised disambiguation system significantly improved results -- a 20.5% error reduction. Similarly, the Web selectors disambiguation system, which acquires knowledge directly as part of the algorithm, achieved results comparable with top minimally supervised systems, an F-score of 80.2% on a standard noun disambiguation task. This work enables the study of many subsequent related tasks for improving semantic interpretation and its application to real-world technologies. Other aspects of semantic interpretation, such as semantic role labeling, could utilize the same methods presented here for word sense disambiguation. As the Web continues to grow, the capabilities of the systems in this dissertation are expected to increase. Although the Web selectors system achieves strong results, a study in this dissertation shows likely improvements from acquiring more data. Furthermore, the methods for acquiring a database of common sense knowledge could be applied in a more exhaustive fashion for other types of common sense knowledge. Finally, perhaps the greatest benefits from this work will come from the enabling of real-world technologies that utilize semantic interpretation.

ID: 029808979; System requirements: World Wide Web browser and PDF reader; Mode of access: World Wide Web; Thesis (Ph.D.)--University of Central Florida, 2011; Includes bibliographical references (p. 141-160). Ph.D., Doctorate, Electrical Engineering and Computer Science, Engineering and Computer Science.
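The Web-selectors idea lends itself to a compact illustration. The sketch below is a minimal, hypothetical rendering of the core loop: words that can fill the target word's slot in its local context vote for the sense whose concept signature they overlap most. The sense signatures here are toy stand-ins for the WordNet-derived knowledge the dissertation actually uses.

```python
# Hypothetical sketch of the Web-selectors idea: words that can replace the
# target in its local context ("selectors") vote for the sense whose
# signature they resemble most. Signatures are toy stand-ins for
# WordNet-derived concept descriptions.

def disambiguate(target, selectors, sense_signatures):
    """Pick the sense whose signature overlaps most with the selectors.

    selectors: words observed (e.g. on the Web) filling the target's slot.
    sense_signatures: dict mapping sense id -> set of related words.
    """
    def score(signature):
        return len(signature & set(selectors))
    return max(sense_signatures, key=lambda s: score(sense_signatures[s]))

# "He sat on the bank of the river": plausible Web selectors for "bank"
selectors = ["shore", "edge", "side", "bank"]
senses = {
    "bank#river": {"shore", "edge", "slope", "riverside", "side"},
    "bank#finance": {"institution", "deposit", "loan", "branch"},
}
print(disambiguate("bank", selectors, senses))  # -> bank#river
```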
APA, Harvard, Vancouver, ISO, and other styles
31

Santanchè, André, 1968. "Fluid Web e componentes de conteúdo digital: da visão centrada em documentos para a visão centrada em conteúdo." [s.n.], 2006. http://repositorio.unicamp.br/jspui/handle/REPOSIP/276279.

Full text
Abstract:
Advisor: Claudia Bauer Medeiros. Doctoral thesis, Universidade Estadual de Campinas, Instituto de Computação. The Web is evolving from a space for publication/consumption of documents to an environment for collaborative work, where digital content can travel and be replicated, adapted, decomposed, fused and transformed. We call this the Fluid Web perspective. This view requires a thorough revision of the typical document-oriented approach that permeates content management on the Web. This thesis presents our solution for the Fluid Web, which allows moving from the document-oriented to a content-oriented perspective, where "content" can be any digital object. The solution is based on two axes: a self-descriptive unit to encapsulate any kind of content artifact - the Digital Content Component (DCC); and a Fluid Web infrastructure that provides management and deployment of DCCs through the Web, and whose goal is to support collaboration on the Web. Designed to be reused and adapted, DCCs encapsulate data and software using a single structure, thus allowing homogeneous composition and processing of any digital content, be it executable or not. These properties are exploited by our Fluid Web infrastructure, which supports DCC multilevel annotation and discovery mechanisms, configuration management and version control. Our work extensively explores Semantic Web standards and taxonomic ontologies, which serve as a semantic bridge, unifying DCC management vocabularies and improving DCC description/indexing/discovery. DCCs and the infrastructure have been implemented and are illustrated by means of examples, for scientific applications. The main contributions of this thesis are: the model of the Digital Content Component; the design of the Fluid Web infrastructure based on DCCs, with support for repository-based storage, distributed sharing, version control and configuration management; an algorithm for digital content discovery that explores DCC semantics; and a practical validation of the main concepts in this research through the implementation of prototypes.
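The DCC's key property, one envelope for both data and software, can be suggested in a few lines. This is a hedged sketch, not the thesis's actual schema; all field names are illustrative assumptions.

```python
# A minimal, hypothetical sketch of a self-descriptive content unit in the
# spirit of a Digital Content Component (DCC): one envelope that carries
# either data or executable content plus semantic metadata.
from dataclasses import dataclass

@dataclass
class ContentComponent:
    component_id: str
    ontology_terms: list          # taxonomic terms used for discovery
    payload: object               # raw data or a callable (software)
    version: int = 1

    def is_executable(self):
        return callable(self.payload)

data_dcc = ContentComponent("dcc:rainfall-2005", ["geo:Rainfall"], [3.1, 0.0, 12.4])
code_dcc = ContentComponent("dcc:mean", ["op:Aggregation"],
                            lambda xs: sum(xs) / len(xs))

# Homogeneous composition: an executable DCC processes a data DCC.
if code_dcc.is_executable():
    print(code_dcc.payload(data_dcc.payload))
```

Because an executable component and a data component share one structure, the consumer only needs to check whether the payload is callable, which is the homogeneity the abstract describes.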
APA, Harvard, Vancouver, ISO, and other styles
32

Cregan, Anne, Computer Science & Engineering, Faculty of Engineering, UNSW. "Weaving the semantic web: Contributions and insights." Publisher: University of New South Wales, Computer Science & Engineering, 2008. http://handle.unsw.edu.au/1959.4/42605.

Full text
Abstract:
The semantic web aims to make the meaning of data on the web explicit and machine processable. Harking back to Leibniz in its vision, it imagines a world of interlinked information that computers 'understand' and 'know' how to process based on its meaning. Spearheaded by the World Wide Web Consortium, the ontology languages OWL and RDF form the core of the current technical offerings. RDF has successfully enabled the construction of virtually unlimited webs of data, whilst OWL gives the ability to express complex relationships between RDF data triples. However, the formal semantics of these languages limit themselves to that aspect of meaning that can be captured by mechanical inference rules, leaving many open questions as to other aspects of meaning and how they might be made machine processable. The Semantic Web has faced a number of problems that are addressed by the included publications. Its germination within academia and logical semantics has seen it struggle to become familiar, accessible and implementable for the general IT population, so an overview of semantic technologies is provided. Faced with competing 'semantic' languages, such as the ISO's Topic Map standards, a method for building ISO-compliant Topic Maps in the OWL DL language is provided, enabling them to take advantage of the more mature OWL language and tools. Supplementation with rules is needed to deal with many real-world scenarios, and this is explored as a practical exercise. The available syntaxes for OWL have hindered domain experts in ontology building, so a natural language syntax for OWL designed for use by non-logicians is offered and compared with similar offerings. In recent years, the proliferation of ontologies has resulted in far more than are needed in any given domain space, so a mechanism is proposed to facilitate the reuse of existing ontologies by giving contextual information and leveraging social factors to encourage wider adoption of common ontologies and achieve interoperability. Lastly, the question of meaning is addressed in relation to the need to define one's terms and to ground one's symbols by anchoring them effectively, ultimately providing the foundation for evolving a 'Pragmatic Web' of action.
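The 'webs of data' that RDF enables can be shown concretely. The snippet below is a minimal sketch using the rdflib library (an assumed choice; any RDF toolkit would do): three triples form a small graph that a machine can traverse, query, and serialize.

```python
# A small sketch of RDF "webs of data", using the rdflib library
# (assumed available via `pip install rdflib`). Names under the example
# namespace are illustrative.
from rdflib import Graph, Namespace, RDF

EX = Namespace("http://example.org/")
g = Graph()
g.add((EX.anne, RDF.type, EX.Researcher))
g.add((EX.anne, EX.contributedTo, EX.SemanticWeb))
g.add((EX.SemanticWeb, EX.usesLanguage, EX.OWL))

# Triples can be queried with SPARQL; here we simply serialize them.
print(g.serialize(format="turtle"))
```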
APA, Harvard, Vancouver, ISO, and other styles
33

Macario, Carla Geovana do Nascimento. "Anotação semântica de dados geoespaciais." [s.n.], 2009. http://repositorio.unicamp.br/jspui/handle/REPOSIP/275838.

Full text
Abstract:
Advisor: Claudia Maria Bauzer Medeiros. Doctoral thesis, Universidade Estadual de Campinas, Instituto de Computação. Geospatial data are a basis for decision making in a wide range of domains, such as traffic planning, consumer services and disaster control. However, to be used, this kind of data has to be analyzed and interpreted, which constitutes a hard task, prone to errors, and usually performed by experts. Despite all of these factors, the interpretations are not stored; when they are, they correspond to descriptive text kept in technical files. The absence of solutions to store them efficiently leads to problems such as rework and difficulties in information sharing. In this work we present a solution for these problems based on semantic annotations, an approach for a common understanding of the concepts being used. We propose the use of scientific workflows to describe the annotation process for each kind of data, and also the adoption of well-known metadata schemas and ontologies. The contributions of this thesis involve: (i) identification of a set of requirements for semantic search of geospatial data; (ii) identification of desirable features for annotation tools; (iii) proposal, and partial implementation, of a framework for semantic annotation of different kinds of geospatial data; and (iv) identification of the challenges involved in adopting scientific workflows to describe the annotation process. This framework was partially validated, through an implementation to produce annotations for applications in agriculture.
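What such a semantic annotation might look like as a data structure can be sketched briefly. The record below is purely illustrative; the field names, URIs and workflow identifier are assumptions rather than the framework's actual schema, but it shows how tying a resource to ontology concepts enables semantic search.

```python
# Hedged illustration of a semantic annotation: a record that ties a
# geospatial resource to ontology concepts and standard metadata fields.
# All names and URIs below are assumptions for illustration only.
annotation = {
    "resource": "soil_map_campinas.tif",
    "metadata": {                      # fields drawn from a metadata schema
        "title": "Soil map, Campinas region",
        "date": "2009-03-01",
    },
    "concepts": [                      # terms from well-known domain ontologies
        "http://purl.org/ontology/agro#SoilType",
        "http://purl.org/ontology/geo#Region",
    ],
    "produced_by_workflow": "wf:soil-interpretation-v2",
}

def matches(annotation, concept_uri):
    """Semantic search hook: does this resource mention a concept?"""
    return concept_uri in annotation["concepts"]

print(matches(annotation, "http://purl.org/ontology/agro#SoilType"))  # True
```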
APA, Harvard, Vancouver, ISO, and other styles
34

Marques, Caio Miguel [UNESP]. "Pangea - Arquitetura semântica para a integração de dados e modelos geoespaciais na Web." Universidade Estadual Paulista (UNESP), 2010. http://hdl.handle.net/11449/98654.

Full text
Abstract:
Supported by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES). Geographic information is required in many areas of human knowledge and activity. Nowadays, a large part of this geographic information is published on the Web by various actors, from governmental institutions and academia to ordinary citizens. These actors publish geographic data in several formats and using different technologies. In this context, despite the great amount of data available on the Web, the diversity of formats and technologies in which they are released limits the consumption, integration and sharing of geographic information. Recently, approaches have been proposed that add semantics to the description of geographic information, so that discovery and integration can be enhanced. This work presents a survey of the semantic architectures and frameworks used in geographic data integration and sharing. Based on this survey, the transversal aspects of the studied architectures were identified. Those aspects were used in the design of the Pangea architecture, which is composed of the following modules: semantic annotation, alignment of semantic descriptions, semantic repositories, and semantic discovery and integration of geographic data and models. To evaluate some of the Pangea components, a case study is conducted on a problem from the environmental domain, considering oil spill disasters on the coast.
APA, Harvard, Vancouver, ISO, and other styles
35

Necşulescu, Silvia. "Automatic acquisition of lexical-semantic relations: gathering information in a dense representation." Doctoral thesis, Universitat Pompeu Fabra, 2016. http://hdl.handle.net/10803/374234.

Full text
Abstract:
Lexical-semantic relationships between words are key information for many NLP tasks, which require this knowledge in the form of lexical resources. This thesis addresses the acquisition of lexical-semantic relation instances. State-of-the-art systems rely on word pair representations based on patterns of contexts where two related words co-occur to detect their relation. This approach is hindered by data sparsity: even when mining very large corpora, not every semantically related word pair co-occurs, or does not co-occur frequently enough. In this work, we investigate novel representations to predict if two words hold a lexical-semantic relation. Our intuition was that these representations should contain information about word co-occurrences combined with information about the meaning of the words involved in the relation. These two sources of information have to be the basis of a generalization strategy able to provide information even for words that do not co-occur.
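The combination of the two information sources can be made concrete with a toy word-pair representation. The sketch below, with invented vector values and pattern counts, shows the general shape: distributional vectors capture word meaning, pattern features capture co-occurrence, and their concatenation feeds a relation classifier.

```python
# A compact sketch of combining two information sources for a word pair:
# distributional vectors for each word (meaning) plus features of the
# patterns linking them in corpus sentences (co-occurrence). The numbers
# are toy values; real systems use corpus-derived embeddings and counts.
import numpy as np

word_vec = {
    "car":     np.array([0.9, 0.1, 0.0]),
    "vehicle": np.array([0.8, 0.2, 0.1]),
}
# Counts of connecting patterns observed between the two words, e.g.
# "X is a Y", "X such as Y" -- informative for hypernymy even when sparse.
pattern_feats = np.array([3.0, 1.0])   # toy counts

pair_repr = np.concatenate([word_vec["car"], word_vec["vehicle"], pattern_feats])
print(pair_repr)  # input to a relation classifier (e.g. logistic regression)
```

The point of the concatenation is generalization: even when the pattern features are all zero (the pair never co-occurs), the word vectors still carry signal.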
APA, Harvard, Vancouver, ISO, and other styles
36

Mille, Simon. "Deep stochastic sentence generation : resources and strategies." Doctoral thesis, Universitat Pompeu Fabra, 2014. http://hdl.handle.net/10803/283136.

Full text
Abstract:
The present Ph.D. thesis addresses the problem of deep data-driven Natural Language Generation (NLG), and in particular the role of proper corpus annotation schemata for stochastic sentence realization. The lack of multilevel corpus annotation has so far prevented the development of proper statistical NLG systems starting from abstract structures. We first detail a methodology for annotating corpora at different levels of linguistic abstraction (namely, the semantic, deep-syntactic, surface-syntactic, topological, and morphological levels), and report on the actual annotation of such corpora, manually for Spanish and automatically for English. Then, using the resulting annotated data for our experiments, we train and evaluate deep stochastic NLG tools which go beyond the current state of the art, in particular thanks to the absence of rules in non-isomorphic transductions. Finally, we show that such data can also serve other purposes well, such as statistical surface and deep dependency parsing.
APA, Harvard, Vancouver, ISO, and other styles
37

O'Connell, Gordon Wayne. "A concurrent object coordination language : semantics and applications." 2005. http://hdl.handle.net/1828/652.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

"Building a semantics-assisted risk analysis (SARA) framework for vendor risk management." Thesis, 2007. http://library.cuhk.edu.hk/record=b6074376.

Full text
Abstract:
Recently, electronic procurement, or eProcurement, has gradually acquired wide acceptance in various industries as an effective, efficient, and cost-saving mechanism to search for and contact potential vendors over the Internet. However, it is also a common situation that purchasers do not have handy and reliable tools for evaluating the risk deriving from their choice of seemingly promising but unfamiliar vendors identified through the eProcurement mechanism. Purchasing corporations need to implement a systematic framework to identify and assess the risks associated with their vendor choices, that is, the vendor risk, and even to memorize their collective experience of risk analysis, while they try to gain benefits from the practice of the eProcurement strategy.

Although there are several solutions available in the industry to manage the vendor risk confronting corporate purchasers in their practice of the traditional procurement mechanism, they are not widely accepted among industries practicing that mechanism, and they are infeasible to implement in the eProcurement mechanism. They rely heavily on self-assessment data provided by vendors or on transaction records from purchasing departments, and there is a lack of a systematic approach to accumulating the collective experience of the corporation in vendor risk management.

This study proposes the establishment of the Vendor Risk Analysis (VRA) system to assist procurement officers in vendor risk analysis as support for their decisions when seeking promising vendors over the Internet. The VRA system adopts a Semantics-Assisted Risk Analysis (SARA) framework to implement an innovative approach to risk assessment. The SARA framework deploys the collaboration of a knowledge-based Expert System and several emerging semantic technologies, including Information Extraction, a Community Template Repository, and a Semantic Platform for Information Indexing and Retrieval, to enhance the VRA system's capability of acquiring sufficient risk evidence over the Internet to provide timely and reliable risk assessment support for vendor choice decisions.

The structure for the establishment of the semantic application identified in this study can be generalized as a common framework for developing an automatic information extractor that acquires Internet content as support for making important business decisions. The structure is composed of three basic components: (1) an information collection method to identify specific information over the Internet through the deployment of semantic technology, (2) an ontology repository to associate the collected data with a specific data schema, and (3) a scheme to associate the data schema with the analytical methods deployed to provide decision support. Moreover, the risk cause taxonomy identified in this study lays out the theoretical grounds for the development of software applications relating to the deployment of the risk perceptions held by procurement professionals and practitioners.

Chou, Ling Yu. "July 2007." Advisers: Vincent Sie-king Lai; Timon Chih-ting Du. Source: Dissertation Abstracts International, Volume: 68-12, Section: A, page: 5128. Thesis (Ph.D.)--Chinese University of Hong Kong, 2007. Includes bibliographical references (p. 178-186). Abstracts in English and Chinese.
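The three-component structure described above can be suggested with a toy pipeline. Everything below is hypothetical, the function names, the miniature risk ontology and the scoring rule alike; it only illustrates how collection, ontology mapping and analysis chain together.

```python
# A hedged sketch of the three-part structure: (1) collect text evidence,
# (2) map it onto an ontology-backed schema, (3) hand the structured
# record to an analytical method. All names are hypothetical.
def collect_evidence(vendor):
    # Stand-in for semantic information extraction over Web pages.
    return ["late delivery reported", "financial dispute"]

RISK_ONTOLOGY = {"late delivery": "delivery_risk", "financial": "credit_risk"}

def map_to_schema(snippets):
    record = {}
    for s in snippets:
        for term, risk_class in RISK_ONTOLOGY.items():
            if term in s:
                record[risk_class] = record.get(risk_class, 0) + 1
    return record

def assess(record):
    # Stand-in for the knowledge-based expert system's scoring rules.
    return sum(record.values())

print(assess(map_to_schema(collect_evidence("Acme Ltd"))))  # -> 2
```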
APA, Harvard, Vancouver, ISO, and other styles
39

Yeh, Peter Zei-Chan. "Flexible semantic matching of rich knowledge structures." Thesis, 2006. http://hdl.handle.net/2152/3007.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

"Geometric and topological approaches to semantic text retrieval." Thesis, 2007. http://library.cuhk.edu.hk/record=b6074419.

Full text
Abstract:
With the vast amount of textual information available today, the task of designing effective and efficient retrieval methods becomes more important and complex. The Basic Vector Space Model (BVSM) is well known in information retrieval. Unfortunately, it cannot retrieve all relevant documents, since it is based on literal term matching. The Generalized Vector Space Model (GVSM) and Latent Semantic Indexing (LSI) are two famous semantic retrieval methods, in which some underlying latent semantic structure in the dataset is assumed. However, their assumptions about where the semantic structure is located are rather strong. Moreover, the performance of LSI can be very different for various datasets, and the questions of which characteristics of a dataset, and why these characteristics, contribute to this difference have not been fully understood. The present thesis focuses on providing answers to these two questions.

In the first part of this thesis, we present a new understanding of the latent semantic space of a dataset from the dual perspective, which relaxes the above assumed conditions and leads naturally to a unified kernel function for a class of vector space models. New semantic analysis methods based on the unified kernel function are developed, which combine the advantages of LSI and GVSM. We also show that the new methods possess a stability property with respect to the rank choice, i.e., even if the selected rank is quite far away from the optimal one, the retrieval performance will not degrade much. The experimental results of our methods on the standard test sets are promising.

In the second part of this thesis, we propose that the mathematical structure of simplexes can be attached to a term-document matrix in the vector-space model (VSM) for information retrieval. The Q-analysis devised by R. H. Atkin may then be applied to effect an analysis of the topological structure of the simplexes and their corresponding dataset. Experimental results of this analysis reveal that there is a correlation between the effectiveness of LSI and the topological structure of the dataset. By using the information obtained from the topological analysis, we develop a new query expansion method. Experimental results show that our method can enhance the performance of the VSM for datasets over which LSI is not effective. Finally, the notion of homology is introduced into the topological analysis of datasets, and its possible relation to word sense disambiguation is studied through a simple example.

Li, Dandan. "August 2007." Adviser: Chung-Ping Kwong. Source: Dissertation Abstracts International, Volume: 69-02, Section: B, page: 1108. Thesis (Ph.D.)--Chinese University of Hong Kong, 2007. Includes bibliographical references (p. 118-120). Abstract in English and Chinese.
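The latent-semantic machinery under discussion reduces, in its simplest form, to a truncated SVD of the term-document matrix. The sketch below, with a toy matrix and an arbitrary rank choice, shows the standard LSI construction that the thesis's kernel view generalizes.

```python
# LSI in miniature: project a term-document matrix onto its top singular
# directions, then measure document similarity in the reduced space.
# The tiny matrix and the rank k are illustrative only.
import numpy as np

A = np.array([[1, 1, 0, 0],    # rows: terms, columns: documents
              [1, 0, 1, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1]], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                                    # chosen rank
docs_k = np.diag(s[:k]) @ Vt[:k]         # documents in the latent space

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(docs_k[:, 0], docs_k[:, 1]))  # latent similarity of docs 0 and 1
```

The stability question the abstract raises is exactly the sensitivity of results like this to the choice of k.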
APA, Harvard, Vancouver, ISO, and other styles
41

Johnson, Donald Gordon. "An improved theorem prover by using the semantics of structure." 1985. http://hdl.handle.net/2097/27463.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

"Performance characteristics of semantics-based concurrency control protocols." Chinese University of Hong Kong, 1995. http://library.cuhk.edu.hk/record=b5888489.

Full text
Abstract:
by Keith, Hang-kwong Mak. Thesis (M.Phil.)--Chinese University of Hong Kong, 1995. Includes bibliographical references (leaves 122-127). Contents: 1. Introduction -- 2. Background (read/write model; abstract data type model; overview of semantics-based concurrency control protocols; concurrency hierarchy; control flow of the strict two-phase locking protocol) -- 3. Semantics-based concurrency control protocols (strict two-phase locking; conflict relations: commutativity, forward and right backward commutativity; exploiting context-specific information; relaxing the correctness criterion by allowing bounded inconsistency) -- 4. Related work (exploiting transaction semantics; exploiting object semantics; sacrificing consistency; other approaches) -- 5. Performance study, testbed approach (system model: main memory database, system configuration, execution of operations, recovery; parameter settings and performance metrics) -- 6. Performance results and analysis, testbed approach (read/write model vs. abstract data type model; using context-specific information; role of conflict ratio; relaxing the correctness criterion: overhead and performance gain, range queries using bounded inconsistency) -- 7. Performance study, simulation approach (logical and physical queueing models; parameter settings and performance metrics) -- 8. Performance results and analysis, simulation approach (relaxing the correctness criterion of serial executions; exploiting context-specific information; summary and discussion) -- 9. Conclusions -- Appendices: commutativity tables for queue objects; specification of a queue object; commutativity tables with bounded inconsistency for queue objects; implementation issues; simulation results.
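The central mechanism, checking conflicts against operation semantics rather than reads and writes, fits in a few lines. The table below is an illustrative simplification of the thesis's commutativity tables for queue objects, not a faithful copy.

```python
# Semantics-based conflict checking for a queue object: two operations
# conflict only if they fail to commute, which admits more interleavings
# than read/write locking. This table is a conservative simplification.
COMMUTES = {
    ("enqueue", "enqueue"): False,   # order changes queue contents
    ("enqueue", "dequeue"): False,   # assumed visible interaction
    ("dequeue", "enqueue"): False,
    ("dequeue", "dequeue"): False,
    ("size", "size"): True,          # observers commute with each other
    ("size", "enqueue"): False,
    ("enqueue", "size"): False,
}

def conflicts(op_a, op_b):
    return not COMMUTES.get((op_a, op_b), False)

# Under read/write locking every pair above would conflict; here "size"
# operations from concurrent transactions can proceed together.
print(conflicts("size", "size"))        # False -> may run concurrently
print(conflicts("enqueue", "dequeue"))  # True  -> must be ordered
```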
APA, Harvard, Vancouver, ISO, and other styles
43

Bauer, Daniel. "Grammar-Based Semantic Parsing Into Graph Representations." Thesis, 2017. https://doi.org/10.7916/D8JH3ZRR.

Full text
Abstract:
Directed graphs are an intuitive and versatile representation of natural language meaning because they can capture relationships between instances of events and entities, including cases where entities play multiple roles. Yet, there are few approaches in natural language processing that use graph manipulation techniques for semantic parsing. This dissertation studies graph-based representations of natural language meaning, discusses a formal-grammar-based approach to the semantic construction of graph representations, and develops methods for open-domain semantic parsing into such representations. To perform string-to-graph translation I use synchronous hyperedge replacement grammars (SHRG). The thesis studies this grammar formalism from a formal, linguistic, and algorithmic perspective. It proposes a new lexicalized variant of this formalism (LSHRG), which is inspired by tree insertion grammar and provides a clean syntax/semantics interface. The thesis develops a new method for automatically extracting SHRG and LSHRG grammars from annotated "graph banks", which uses existing syntactic derivations to structure the extracted grammar. It also discusses a new method for semantic parsing with large, automatically extracted grammars, which translates syntactic derivations into derivations of the synchronous grammar, as well as initial work on parse reranking and selection using a graph model. I evaluate this work on the Abstract Meaning Representation (AMR) dataset. The results show that the grammar-based approach to semantic analysis shows promise as a technique for semantic parsing and that string-to-graph grammars can be induced efficiently. Taken together, the thesis lays the foundation for future work on graph methods in natural language semantics.
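The re-entrancy that makes graphs more expressive than trees is easy to exhibit. Below is a tiny rendering of the standard AMR example "the boy wants to go", in which one node fills roles in two events; the edge-list encoding is an illustrative choice, not the dissertation's data structure.

```python
# A directed-graph rendering of "the boy wants to go", in the spirit of
# the Abstract Meaning Representation: the boy node plays two roles
# (wanter and goer), which a tree cannot share but a graph can.
edges = [
    ("want-01", "ARG0", "boy"),    # who wants
    ("want-01", "ARG1", "go-01"),  # what is wanted
    ("go-01",   "ARG0", "boy"),    # who goes -- re-entrant node
]

def roles_of(node):
    return [(src, role) for src, role, tgt in edges if tgt == node]

print(roles_of("boy"))  # [('want-01', 'ARG0'), ('go-01', 'ARG0')]
```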
APA, Harvard, Vancouver, ISO, and other styles
44

"Extracting causation knowledge from natural language texts." 2002. http://library.cuhk.edu.hk/record=b5891058.

Full text
Abstract:
Chan Ki, Cecia. Thesis (M.Phil.)--Chinese University of Hong Kong, 2002. Includes bibliographical references (leaves 95-99). Abstracts in English and Chinese. Contents: 1. Introduction (our contributions; thesis organization) -- 2. Related work (using knowledge-based inferences; using linguistic techniques: linguistic clues, graphical patterns, lexicon-syntactic patterns of causative verbs; discovery of extraction patterns for extracting relations: Snowball system, DIRT system; comparisons with our approach) -- 3. Semantic expectation-based knowledge extraction (semantic expectations; semantic template; causation semantic template; sentence templates; consequence and reason templates; causation knowledge extraction framework: template design, sentence screening, semantic processing) -- 4. Using thesaurus and pattern discovery for SEKE (using a thesaurus; pattern discovery: use of semantic expectation-based knowledge extraction, use of part-of-speech information, pattern representation, constructing and merging the patterns; pattern matching: matching score, support of patterns, relevancy of sentence templates; applying the newly discovered patterns) -- 5. Applying SEKE on the Hong Kong stock market domain (template design; pattern discovery; causation knowledge extraction results: evaluation approach, parameter investigations, experimental results, knowledge discovered, parameter effect) -- 6. Applying SEKE on the global warming domain (template design; pattern discovery; results: evaluation approach, experimental results, knowledge discovered) -- 7. Conclusions and future directions -- Appendix A. Penn Treebank part-of-speech tags.
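The template-matching style of extraction summarized in the contents above can be caricatured with two regular expressions. These stand in for the far richer sentence templates of the thesis; they only show how a causation anchor splits a sentence into reason and consequence slots.

```python
# A hedged miniature of template-driven causation extraction: sentence
# templates anchor a causation cue and carve the sentence into reason
# and consequence slots. Real templates are richer than these regexes.
import re

TEMPLATES = [
    re.compile(r"^(?P<reason>.+?) (?:causes|caused|leads to|led to) (?P<consequence>.+)$"),
    re.compile(r"^(?P<consequence>.+?) (?:because of|due to) (?P<reason>.+)$"),
]

def extract_causation(sentence):
    for t in TEMPLATES:
        m = t.match(sentence.rstrip("."))
        if m:
            return m.group("reason"), m.group("consequence")
    return None

print(extract_causation("Rising interest rates led to a fall in stock prices."))
print(extract_causation("The stock slid because of weak earnings."))
```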
APA, Harvard, Vancouver, ISO, and other styles
45

Wang, Yuanyong, Computer Science & Engineering, Faculty of Engineering, UNSW. "Using web texts for word sense disambiguation." 2007. http://handle.unsw.edu.au/1959.4/40530.

Full text
Abstract:
In all natural languages, ambiguity is a universal phenomenon. When a word has multiple meanings depending on its context, it is called an ambiguous word. The process of determining the correct meaning of a word (formally, its word sense) in a given context is word sense disambiguation (WSD). WSD is one of the most fundamental problems in natural language processing. If properly addressed, it could lead to revolutionary advancement in many other technologies, such as text search engine technology, automatic text summarization and classification, automatic lexicon construction, machine translation and automatic learning agent technology. One difficulty that has always confronted WSD researchers is the lack of high-quality sense-specific information. For example, if the word "power" immediately precedes the word "plant", it strongly constrains the meaning of "plant" to be "an industrial facility". If "power" is replaced by the phrase "root of a", then the sense of "plant" is dictated to be "an organism" of the kingdom Plantae. It is obvious that manually building a comprehensive sense-specific information base for each sense of each word is impractical. Researchers have also tried to extract such information from large dictionaries as well as manually sense-tagged corpora. Most of the dictionaries used for WSD are not built for this purpose and have many inherited peculiarities. While manual tagging is slow and costly, automatic tagging has not succeeded in providing reliable performance. Furthermore, it is often the case that for a randomly chosen word (to be disambiguated), the sense-specific context corpora that can be collected from dictionaries are not large enough. Therefore, manually building sense-specific information bases and extracting such information from dictionaries are not effective approaches to obtaining sense-specific information. Web text, due to its vast quantity and wide diversity, is an ideal source for the extraction of large quantities of sense-specific information. In this thesis, the impact of Web texts on various aspects of WSD is investigated. New measures and models are proposed to tame the enormous amount of Web text for the purpose of WSD. They are formally evaluated by testing their disambiguation performance on about 70 ambiguous nouns. The results are very encouraging and have helped reveal the great potential of using Web texts for WSD. The results are published in three papers at the Australian national and international level (Wang & Hoffmann, 2004, 2005, 2006) [42][43][44].
APA, Harvard, Vancouver, ISO, and other styles
46

Yuan, Yidong, Computer Science & Engineering, Faculty of Engineering, UNSW. "Efficient computation of advanced skyline queries." 2007. http://handle.unsw.edu.au/1959.4/40511.

Full text
Abstract:
Skyline has been proposed as an important operator for many applications, such as multi-criteria decision making, data mining and visualization, and user-preference queries. Due to its importance, the skyline and its computation have received considerable attention from the database research community recently. All the existing techniques, however, focus on conventional databases. They are not applicable to online computation environments, such as data streams. In addition, the existing studies consider efficiency of skyline computation only, while the fundamental problem of the semantics of skylines remains open. In this thesis, we study three problems of skyline computation: (1) online computation of the skyline over a data stream; (2) skyline cube computation and its analysis; and (3) the top-k most representative skyline. To tackle the problem of online skyline computation, we develop a novel framework which converts the more expensive multi-dimensional skyline computation to stabbing queries in 1-dimensional space. Based on this framework, a rigorous theoretical analysis of the time complexity of online skyline computation is provided. Then, efficient algorithms are proposed to support ad hoc and continuous skyline queries over a data stream. Inspired by the idea of the data cube, we propose the novel concept of the skyline cube, which consists of the skylines of all possible non-empty subsets of a given full space. We identify the unique sharing strategies for skyline cube computation and develop two efficient algorithms which compute the skyline cube in a bottom-up and a top-down manner, respectively. Finally, a theoretical framework to answer the question about the semantics of the skyline and an analysis of multidimensional subspace skylines are presented. Motivated by the fact that the full skyline may be less informative because it generally consists of a large number of skyline points, we propose a novel skyline operator -- the top-k most representative skyline. This operator selects the k skyline points such that the number of data points dominated by at least one of these k skyline points is maximized. To compute the top-k most representative skyline, two efficient algorithms and their theoretical analysis are presented.
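The skyline operator itself has a crisp definitional core, sketched below with a toy dataset where smaller values are better on both dimensions. The thesis's contribution lies in computing this efficiently online and over subspaces, which this naive quadratic check does not attempt.

```python
# The skyline operator in miniature: a point is on the skyline if no
# other point dominates it (at least as good on every dimension and
# strictly better on one). Smaller is better; the data are illustrative.
def dominates(p, q):
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def skyline(points):
    return [p for p in points if not any(dominates(q, p) for q in points)]

hotels = [(50, 8), (60, 2), (40, 9), (55, 3), (65, 9)]  # (price, distance)
print(skyline(hotels))  # all but (65, 9), which (50, 8) dominates
```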
APA, Harvard, Vancouver, ISO, and other styles
47

Neubauer, Nicolas. "Semantik und Sentiment: Konzepte, Verfahren und Anwendungen von Text-Mining." Doctoral thesis, 2014. https://repositorium.ub.uni-osnabrueck.de/handle/urn:nbn:de:gbv:700-2014060612524.

Full text
Abstract:
This thesis deals with two subject areas of data mining and text mining, their associated algorithmic methods and concepts, and investigates possible application scenarios. On one side is the field of semantic similarity: in short, the question of how to determine algorithmically how much two terms or concepts have to do with each other. The technology around knowing that, say, "rain" can be a component of "weather" enables a wide variety of applications. This thesis gives an overview of the relevant literature, roughly divides the research field into the two schools of knowledge-based and statistical methods, and contributes to each by examining existing similarity measures and introducing new ones. A study with human subjects, and the dataset derived from it, finally provides insights into people's preferences regarding their perception of similarity. On the other side stands the field of sentiment mining, which attempts to algorithmically identify and classify moods and opinions in large collections of unstructured text, such as messages from Twitter or other social networks. After a discussion of the related literature, the construction of a new test dataset is motivated and the process of obtaining it is described. On this new basis, an extensive evaluation of a multitude of approaches and classification methods is carried out. Finally, the practical usability of the results is demonstrated through various application scenarios involving product presentations as well as media or public events such as the German federal election (Bundestagswahl).
APA, Harvard, Vancouver, ISO, and other styles
48

Lokhande, Hrishikesh. "Pharmacodynamics miner : an automated extraction of pharmacodynamic drug interactions." Thesis, 2013. http://hdl.handle.net/1805/3757.

Full text
Abstract:
Indiana University-Purdue University Indianapolis (IUPUI). Pharmacodynamics (PD) studies the relationship between drug concentration and drug effect on target sites. This field has recently gained attention, as studies involving PD drug-drug interactions (DDIs) aid the discovery of multi-targeted drug agents and novel efficacious drug combinations. A PD drug combination can be synergistic, additive or antagonistic, depending upon the summed effect of the drug combination at a target site. The PD literature has grown immensely, and most of its knowledge is dispersed across different scientific journals, so the manual identification of PD DDIs is a challenge. In order to support an automated means of extracting PD DDIs, we propose Pharmacodynamics Miner (PD-Miner). PD-Miner is a text-mining tool capable of identifying PD DDIs from in vitro PD experiments. It is powered by two major resources: a collection of full-text articles and an in vitro PD ontology. The in vitro PD ontology currently has four classes and more than a hundred subclasses; based on these classes and subclasses, the full-text corpus is annotated. The annotated full-text corpus forms a database of articles which can be queried by drug keywords and ontology subclasses. Since the ontology covers term and concept meanings, the system is capable of formulating semantic queries. PD-Miner extracts in vitro PD DDIs based upon references to cell lines and cell phenotypes. The results are presented as fragments of sentences in which important concepts are visually highlighted. To determine the accuracy of the system, we used a gold standard of 5 expert-curated articles. PD-Miner identified DDIs with a recall of 75% and a precision of 46.55%. Along with the development of PD-Miner, we also report the development of a semantically annotated in vitro PD corpus. This corpus includes term- and sentence-level annotations and serves as a gold standard for future text mining.
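The reported figures follow from the standard definitions, as a quick check shows. The counts below are hypothetical, chosen only so the arithmetic lands on the stated 75% recall and 46.55% precision.

```python
# Reproducing the evaluation arithmetic in miniature: against a gold
# standard of expert-curated interactions, recall = TP / (TP + FN) and
# precision = TP / (TP + FP). These counts are assumed, not from the thesis.
tp, fn, fp = 27, 9, 31
recall = tp / (tp + fn)         # 27 / 36  = 0.75
precision = tp / (tp + fp)      # 27 / 58 ~= 0.4655
print(f"recall={recall:.2%} precision={precision:.2%}")
```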
APA, Harvard, Vancouver, ISO, and other styles
49

Newsom, Eric Tyner. "An exploratory study using the predicate-argument structure to develop methodology for measuring semantic similarity of radiology sentences." Thesis, 2013. http://hdl.handle.net/1805/3666.

Full text
Abstract:
Indiana University-Purdue University Indianapolis (IUPUI). The amount of information produced in the form of electronic free text in healthcare is increasing to levels that humans cannot process in the course of advancing their professional practice. Information extraction (IE) is a sub-field of natural language processing whose goal is the data reduction of unstructured free text. Pertinent to IE is an annotated corpus that frames how IE methods should create the logical expression necessary for processing the meaning of text. Most annotation approaches seek to maximize meaning and knowledge by chunking sentences into phrases and mapping these phrases to a knowledge source to create a logical expression. However, these studies consistently have problems addressing semantics, and none has addressed the issue of semantic similarity (or synonymy) to achieve data reduction. A successful methodology for data reduction depends on a framework that can represent the currently popular phrasal methods of IE while also fully representing the sentence. This study explores and reports on the benefits, problems, and requirements of using the predicate-argument statement (PAS) as that framework. A convenience sample from a prior study, consisting of ten synsets of 100 unique sentences from radiology reports deemed by domain experts to mean the same thing, provides the text from which PAS structures are formed.
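The role of the PAS as a normalizing frame can be illustrated in miniature: two differently worded sentences reduce to the same predicate and argument fillers. The frame labels and the similarity score below are illustrative assumptions, not the study's methodology.

```python
# An illustration of the predicate-argument statement (PAS) idea: two
# differently worded radiology sentences reduce to the same predicate
# and argument fillers, the hook for measuring semantic similarity.
pas_a = ("show", {"arg0": "chest x-ray", "arg1": "pleural effusion"})
pas_b = ("show", {"arg0": "chest x-ray", "arg1": "pleural effusion"})
# from "The chest x-ray shows a pleural effusion." and
# "A pleural effusion is shown on the chest x-ray."

def pas_similarity(a, b):
    """Toy score: predicate match plus shared argument fillers."""
    pred_match = a[0] == b[0]
    shared = sum(a[1].get(k) == v for k, v in b[1].items())
    return (pred_match + shared) / (1 + len(b[1]))

print(pas_similarity(pas_a, pas_b))  # 1.0 -> candidate synonymy
```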
APA, Harvard, Vancouver, ISO, and other styles
50

Shehata, Shady. "Concept Mining: A Conceptual Understanding based Approach." Thesis, 2009. http://hdl.handle.net/10012/4430.

Full text
Abstract:
Due to the rapid daily growth of information, there is a considerable need to extract and discover valuable knowledge from data sources such as the World Wide Web. Most of the common techniques in text mining are based on the statistical analysis of a term, either a word or a phrase. These techniques treat documents as bags of words and pay no attention to the meanings of the document content. In addition, statistical analysis of term frequency captures the importance of a term within a document only. However, two terms can have the same frequency in their documents while one term contributes more to the meaning of its sentences than the other. Therefore, there is an intensive need for a model that captures the meaning of linguistic utterances in a formal structure. The underlying model should indicate the terms that capture the semantics of the text. In this case, the model can capture terms that present the concepts of a sentence, which leads to discovering the topic of the document. A new concept-based model is introduced that analyzes terms on the sentence, document and corpus levels, rather than the traditional analysis of the document only. The concept-based model can effectively discriminate between non-important terms with respect to sentence semantics and terms which hold the concepts that represent the sentence meaning. The proposed model consists of a concept-based statistical analyzer, a conceptual ontological graph representation, a concept extractor and a concept-based similarity measure. A term which contributes to the sentence semantics is assigned two different weights, by the concept-based statistical analyzer and by the conceptual ontological graph representation. These two weights are combined into a new weight. The concepts that have maximum combined weights are selected by the concept extractor. The similarity between documents is calculated based on a new concept-based similarity measure. The proposed similarity measure takes full advantage of using the concept analysis measures on the sentence, document, and corpus levels in calculating the similarity between documents. Large sets of experiments using the proposed concept-based model on different datasets in text clustering, categorization and retrieval are conducted. The experiments include extensive comparison between traditional weighting and the concept-based weighting obtained by the concept-based model. Experimental results in text clustering, categorization and retrieval demonstrate substantial enhancement of quality using: (1) concept-based term frequency (tf), (2) conceptual term frequency (ctf), (3) the concept-based statistical analyzer, (4) the conceptual ontological graph, and (5) the concept-based combined model. In text clustering, the evaluation of results relies on two quality measures, the F-measure and the Entropy. In text categorization, the evaluation relies on three quality measures: the micro-averaged F1, the macro-averaged F1 and the error rate. In text retrieval, the evaluation relies on three quality measures: precision at 10 documents retrieved P(10), the preference measure (bpref), and the mean uninterpolated average precision (MAP). All of these quality measures are improved when the newly developed concept-based model is used to enhance the quality of text clustering, categorization and retrieval.
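The two-level weighting described above can be sketched with a toy document. The combination rule below (a plain average of document-level and sentence-level frequencies) is an assumed stand-in for the model's actual formula; it only shows how sentence-level participation boosts a term's weight beyond raw frequency.

```python
# A sketch of two-level term weighting: a term's weight combines how
# often it occurs in the document (tf) with how many sentence-level
# concept sets it participates in (a stand-in for ctf). The averaging
# rule is an assumption, not the thesis's exact formula.
from collections import Counter

doc = [
    ["texas", "heavy", "rain", "caused", "flooding"],   # sentence 1
    ["flooding", "displaced", "residents"],             # sentence 2
]
tf = Counter(w for sent in doc for w in sent)
sent_tf = Counter(w for sent in doc for w in set(sent))  # sentence-level count

def combined_weight(term):
    n_words = sum(tf.values())
    return (tf[term] / n_words + sent_tf[term] / len(doc)) / 2

for term in ("flooding", "texas"):
    print(term, round(combined_weight(term), 3))  # flooding outweighs texas
```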
APA, Harvard, Vancouver, ISO, and other styles