Dissertations / Theses on the topic 'Information retrieval keyword extraction'

Author: Grafiati

Published: 5 June 2025

Last updated: 15 July 2025

Consult the top 50 dissertations / theses for your research on the topic 'Information retrieval keyword extraction.'

1

Grosz, Sandra. "Keyword Extraction from Swedish Court Documents." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-272117.

Abstract:

This thesis addresses the problem of extracting keywords which represent the rulings and grounds for the rulings in Swedish court documents. The problem of identifying the candidate keywords was divided into two steps: first, preprocessing the documents, and second, extracting keywords using a keyword extraction algorithm on the preprocessed documents. The preprocessing methods used in conjunction with the keyword extraction algorithms were stop-word removal and stemming. Then, three different approaches for extracting keywords were used; one statistical approach, one machine learning

2

Luo, Yi. "SPARK: a keyword search system on relational databases." University of New South Wales, Computer Science & Engineering, 2009. http://handle.unsw.edu.au/1959.4/41542.

Abstract:

With textual data increasingly being stored in relational databases, there is a demand for the databases to support keyword queries over textual data. Due to the normalization and the inherent connections among tuples in different tables, traditional IR-style ranking and query evaluation methods do not apply. A number of systems have been proposed to deal with this issue. In this thesis, I will give a detailed demonstration and description of our SPARK project. In the project, we study both the effectiveness and the efficiency issues of answering top-k keyword query on a relational dat

3

Rogerson, Brittany E. "An Evaluation of Existing Light Stemming Algorithms for Arabic Keyword Searches." Thesis, School of Information and Library Science, 2008. http://hdl.handle.net/1901/572.

Abstract:

The field of Information Retrieval recognizes the importance of stemming in improving retrieval effectiveness. This same tool, when applied to searches conducted in the Arabic language, increases the relevancy of documents returned and expands searches to encompass the general meaning of a word instead of the word itself. Since the Arabic language relies mainly on triconsonantal roots for verb forms and derives nouns by adding affixes, words with similar consonants are closely related in meaning. Stemming allows a search term to focus more on the meaning of a term and closely related terms

4

Romano, Nicholas C., Dmitri G. Roussinov, Jay F. Nunamaker, and Hsinchun Chen. "Collaborative Information Retrieval Environment: Integration of Information Retrieval with Group Support Systems." HICSS, 1999. http://hdl.handle.net/10150/105688.

Abstract:

Artificial Intelligence Lab, Department of MIS, University of Arizona.

Observations of Information Retrieval (IR) system user experiences reveal a strong desire for collaborative search while at the same time suggesting that collaborative capabilities are rarely, and then only in a limited fashion, supported by current searching and visualization tools. Equally interesting is the fact that observations of user experiences with Group Support Systems (GSS) reveal that although access to external information and the ability to search for relevant material is often vital to the progress

5

James, David Anthony. "The application of classical information retrieval techniques to spoken documents." Thesis, University of Cambridge, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.361412.

6

Loper, Edward (Edward Daniel) 1977. "Applying semantic relation extraction to information retrieval." Thesis, Massachusetts Institute of Technology, 2000. http://hdl.handle.net/1721.1/86521.

7

Song, Min Song Il-Yeol. "Robust knowledge extraction over large text collections." Philadelphia, Pa. : Drexel University, 2005. http://dspace.library.drexel.edu/handle/1860/495.

8

Kushmerick, Nicholas. "Wrapper induction for information extraction." Thesis, 1997. http://hdl.handle.net/1773/6867.

9

Brucato, Matteo. "Temporal Information Retrieval." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2013. http://amslaurea.unibo.it/5690/.

10

Zitzelberger, Andrew J. "HyKSS: Hybrid Keyword and Semantic Search." BYU ScholarsArchive, 2011. https://scholarsarchive.byu.edu/etd/2832.

Abstract:

The rapid production of digital information makes the task of locating relevant information increasingly difficult. Keyword search alleviates this difficulty by retrieving documents containing keywords of interest. However, keyword search suffers from a number of issues such as ambiguity, synonymy, and the inability to handle semantic constraints. Semantic search helps resolve these issues but is limited by the quality of annotations, which are likely to be incomplete or imprecise. Hybrid search, a search technique that combines the merits of both keyword and semantic search, appears to be a promi

11

Sukhahuta, Rattasit. "Information extraction system for Thai documents." Thesis, University of East Anglia, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.368173.

12

Zhu, Bin, and Hsinchun Chen. "Validating a Geographic Image Retrieval System." Wiley Periodicals, Inc, 2000. http://hdl.handle.net/10150/105934.

Abstract:

Artificial Intelligence Lab, Department of MIS, University of Arizona.

This paper summarizes a prototype geographical image retrieval system that demonstrates how to integrate image processing and information analysis techniques to support large-scale content-based image retrieval. By using an image as its interface, the prototype system addresses a troublesome aspect of traditional retrieval models, which require users to have complete knowledge of the low-level features of an image. In addition, we describe an experiment to validate the performance of this image retrieval system ag

13

Wimalasuriya, Daya Chinthana. "Use of ontologies in information extraction." Thesis, University of Oregon, 2011. http://hdl.handle.net/1794/11216.

Abstract:

xiii, 149 p. : ill. (some col.)

Information extraction (IE) aims to recognize and retrieve certain types of information from natural language text. For instance, an information extraction system may extract key geopolitical indicators about countries from a set of web pages while ignoring other types of information. IE has existed as a research field for a few decades, and ontology-based information extraction (OBIE) has recently emerged as one of its subfields. Here, the general idea is to use ontologies--which provide formal and explicit specifications of shared conceptualizations--to gui

14

Wang, Jiying. "Information extraction and integration for Web databases." Ph.D. thesis, Hong Kong University of Science and Technology, 2004. http://library.ust.hk/cgi/db/thesis.pl?COMP%202004%20WANGJ.

Abstract:

Thesis (Ph.D.), Hong Kong University of Science and Technology, 2004. Includes bibliographical references (leaves 112-118). Also available in electronic version. Access restricted to campus users.

15

McManigal, Chris A. "Towards More Comprehensive Information Retrieval Systems: Entity Extraction Using XSLT." UNF Digital Commons, 2005. http://digitalcommons.unf.edu/etd/222.

Abstract:

One problem that exists in today's document management arena is the issue of retrieving information from electronic documents such as images, Microsoft Office documents, and e-mail. Specific data entities must be extracted from these documents so that the data can be searched and queried. This study presents a unique approach to extracting these entities: using Extensible Stylesheet Language Transformations (XSLT) to match patterns in text. Because XSLT is processed at run time, new XSLT templates can be created and used without having to recompile and redeploy the application. The specific im

16

Quintavalle, Bruno <1966>. "Information retrieval and extraction from forums, complaints and technical reviews." Doctoral thesis, Università Ca' Foscari Venezia, 2019. http://hdl.handle.net/10579/15586.

Abstract:

Complaints and technical reviews often describe complex problems, most of the time in very articulated ways. Over that kind of corpora, we are considering here three classical tasks: Information Retrieval, Text Classification and Information Extraction. In this context, however, these tasks should take into special consideration the structure of the sentence, with special attention to verbal phrases, as complaints are usually descriptions of actions that have been performed whilst they shouldn’t (or the other way around). We want to leverage results from traditional NLP tasks like Semantic Rol

17

Huang, Junhao. "CE Standard Documents Keyword Extraction and Comparison Between Different Machine Learning Methods." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-236481.

Abstract:

Conformité Européenne (CE) approval is a complex task for producers in Europe. The producers need to search for the necessary standard documents and do the tests by themselves. CE-CHECK is a website that provides a document search service, and the company's engineers want to use machine learning methods to analyze the documents so that the results can improve the search system. The first task is to construct an automatic keyword extraction system to analyze the standard documents. This paper applied three different machine learning methods: Conditional Random Field (CRF), joint-layer Recurrent Neural

18

Puigcerver, I. Pérez Joan. "A Probabilistic Formulation of Keyword Spotting." Doctoral thesis, Universitat Politècnica de València, 2019. http://hdl.handle.net/10251/116834.

Abstract:

Keyword spotting, applied to handwritten text documents, aims to retrieve the documents, or parts of them, that are relevant to a given query specified by the user, from among a large collection of documents. The topic has attracted great interest over the last 20 years among researchers in Pattern Recognition, as well as among digital libraries and archives. This thesis first defines the goal of keyword spotting from a perspective based

19

Chen, Hsinchun, and Vasant Dhar. "A Knowledge-Based Approach to the Design of Document-Based Retrieval Systems." ACM, 1990. http://hdl.handle.net/10150/106486.

Abstract:

Artificial Intelligence Lab, Department of MIS, University of Arizona.

This article presents a knowledge-based approach to the design of document-based retrieval systems. We conducted two empirical studies investigating the users' behavior using an online catalog. The studies revealed a range of knowledge elements which are necessary for performing a successful search. We proposed a semantic network based representation to capture these knowledge elements. The findings we derived from our empirical studies were used to construct a knowledge-based retrieval system. We performed a laboratory e

20

Chen, Hsinchun, and Jinwoo Kim. "GANNET: A machine learning approach to document retrieval." M.E. Sharpe, Inc, 1994. http://hdl.handle.net/10150/105547.

Abstract:

Artificial Intelligence Lab, Department of MIS, University of Arizona.

Information science researchers have recently turned to new artificial intelligence-based inductive learning techniques including neural networks, symbolic learning and genetic algorithms. An overview of the new techniques and their usage in information science research is provided. The algorithms adopted for a hybrid genetic algorithms and neural nets based system, called GANNET, are presented. GANNET performed concept (keyword) optimization for user-selected documents during information retrieval using the genetic algor

21

Hao, Dayang. "Content extraction, analysis, and retrieval for plant visual traits studies." Diss., Columbia, Mo. : University of Missouri-Columbia, 2008. http://hdl.handle.net/10355/5704.

Abstract:

Thesis (M.S.), University of Missouri-Columbia, 2008. The entire dissertation/thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file (which also appears in the research.pdf); a non-technical general description, or public abstract, appears in the public.pdf file. Title from title screen of research.pdf file (viewed on August 12, 2009). Includes bibliographical references.

22

Chen, Hsinchun, and Vasant Dhar. "Cognitive Process as a Basis for Intelligent Retrieval Systems Design." Pergamon Press, 1991. http://hdl.handle.net/10150/105912.

Abstract:

Artificial Intelligence Lab, Department of MIS, University of Arizona.

Two studies were conducted to investigate the cognitive processes involved in online document-based information retrieval. These studies led to the development of five computational models of online document retrieval. These models were then incorporated into the design of an "intelligent" document-based retrieval system. Following a discussion of this system, we discuss the broader implications of our research for the design of information retrieval systems.

23

Ademi, Muhamet. "adXtractor – Automated and Adaptive Generation of Wrappers for Information Retrieval." Thesis, Malmö högskola, Fakulteten för teknik och samhälle (TS), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-20071.

Abstract:

The aim of this project is to investigate the feasibility of retrieving unstructured automotive listings from structured web pages on the Internet. The research has two major purposes: (1) to investigate whether it is feasible to pair information extraction algorithms and compute wrappers; (2) to demonstrate the results of pairing these techniques and evaluate the measurements. We merge two training sets available on the web to construct reference sets, which are the basis for the information extraction. The wrappers are computed by using information extraction techniques to identify data properties

24

McEnnis, Daniel. "On-demand metadata extraction network (OMEN)." Thesis, McGill University, 2006. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=99382.

Abstract:

OMEN (On-demand Metadata Extraction Network) addresses a fundamental problem in Music Information Retrieval: the lack of universal access to a large dataset containing significant amounts of copyrighted music. This thesis proposes a solution to this problem that is accomplished by utilizing the large collections of digitized music available at many libraries. Using OMEN, libraries will be able to perform on-demand feature extraction on site, returning feature values to researchers instead of providing direct access to the recordings themselves. This avoids copyright difficulties, since the und

25

Califf, Mary Elaine. "Relational learning techniques for natural language information extraction." 1998. http://wwwlib.umi.com/cr/utexas/main.

26

Chen, Hsinchun. "From Information Retrieval to Knowledge Management Enabling Technologies and Best Practices." Elsevier, 1999. http://hdl.handle.net/10150/106079.

Abstract:

Artificial Intelligence Lab, Department of MIS, University of Arizona.

In this era of the Internet and distributed multimedia computing, new and emerging classes of information technologies have swept into the lives of office workers and everyday people. As technologies and applications become more overwhelming, pressing, and diverse, several well-known information technology problems have become even more urgent. Information overload, a result of the ease of information creation and rendering via Internet and WWW, has become more evident in people’s lives. Significant variations of database

27

Xhemali, Daniela. "Automated retrieval and extraction of training course information from unstructured web pages." Thesis, Loughborough University, 2010. https://dspace.lboro.ac.uk/2134/7022.

Abstract:

Web Information Extraction (WIE) is the discipline dealing with the discovery, processing and extraction of specific pieces of information from semi-structured or unstructured web pages. The World Wide Web comprises billions of web pages and there is much need for systems that will locate, extract and integrate the acquired knowledge into organisations' practices. There are some commercial, automated web extraction software packages; however, their success comes from heavily involving their users in the process of finding the relevant web pages, preparing the system to recognise items of interes

28

Pridaphattharakun, Wilasini. "Information retrieval and answer extraction for an XML knowledge base in WebNL." [Gainesville, Fla.] : University of Florida, 2001. http://purl.fcla.edu/fcla/etd/UFE0000344.

Abstract:

Thesis (M.S.), University of Florida, 2001. Title from title page of source document. Document formatted into pages; contains xiii, 71 p.; also contains graphics. Includes vita. Includes bibliographical references.

29

Gollapally, Devender R. "Multi-Agent Architecture for Internet Information Extraction and Visualization." Thesis, University of North Texas, 2000. https://digital.library.unt.edu/ark:/67531/metadc2575/.

Abstract:

The World Wide Web is one of the largest sources of information; more and more applications are being developed daily to make use of this information. This thesis presents a multi-agent architecture that deals with some of the issues related to Internet data extraction. The primary issue addresses the reliable, efficient and quick extraction of data through the use of HTTP performance monitoring agents. A second issue focuses on how to make use of available data to take decisions and alert the user when there is change in data; this is done with the help of user agents that are equipped with a

30

Chen, Hsinchun. "Machine Learning for Information Retrieval: Neural Networks, Symbolic Learning, and Genetic Algorithms." Wiley Periodicals, Inc, 1995. http://hdl.handle.net/10150/106427.

Abstract:

Artificial Intelligence Lab, Department of MIS, University of Arizona.

Information retrieval using probabilistic techniques has attracted significant attention on the part of researchers in information and computer science over the past few decades. In the 1980s, knowledge-based techniques also made an impressive contribution to “intelligent” information retrieval and indexing. More recently, information science researchers have turned to other newer artificial-intelligence-based inductive learning techniques including neural networks, symbolic learning, and genetic algorithms. These newe

31

Chartrand, Tim. "Ontology-based extraction of RDF data from the World Wide Web." Diss., 2003. http://contentdm.lib.byu.edu/ETD/image/etd168.pdf.

32

Tabassum, Binte Jafar Jeniya. "Information Extraction From User Generated Noisy Texts." The Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu1606315356821532.

33

Wang, Wei. "Automated spatiotemporal and semantic information extraction for hazards." Diss., University of Iowa, 2014. https://ir.uiowa.edu/etd/1415.

Abstract:

This dissertation explores three research topics related to automated spatiotemporal and semantic information extraction about hazard events from Web news reports and other social media. The dissertation makes a unique contribution of bridging geographic information science, geographic information retrieval, and natural language processing. Geographic information retrieval and natural language processing techniques are applied to extract spatiotemporal and semantic information automatically from Web documents, to retrieve information about patterns of hazard events that are not explicitly desc

34

Chen, Hsinchun, Joanne Martinez, Tobun Dorbin Ng, and Bruce R. Schatz. "A Concept Space Approach to Addressing the Vocabulary Problem in Scientific Information Retrieval: An Experiment on the Worm Community System." Wiley Periodicals, Inc, 1997. http://hdl.handle.net/10150/105991.

Abstract:

Artificial Intelligence Lab, Department of MIS, University of Arizona.

This research presents an algorithmic approach to addressing the vocabulary problem in scientific information retrieval and information sharing, using the molecular biology domain as an example. We first present a literature review of cognitive studies related to the vocabulary problem and vocabulary-based search aids (thesauri) and then discuss techniques for building robust and domain-specific thesauri to assist in cross-domain scientific information retrieval. Using a variation of the automatic thesaurus generation t

35

Pham, Nam Wilamowski Bogdan M. "Data extraction from servers by the Internet Robot." Auburn, Ala, 2009. http://hdl.handle.net/10415/1781.

36

Alsaad, Amal. "Enhanced root extraction and document classification algorithm for Arabic text." Thesis, Brunel University, 2016. http://bura.brunel.ac.uk/handle/2438/13510.

Abstract:

Many text extraction and classification systems have been developed for English and other international languages; most of the languages are based on Roman letters. However, Arabic language is one of the difficult languages which have special rules and morphology. Not many systems have been developed for Arabic text categorization. Arabic language is one of the Semitic languages with morphology that is more complicated than English. Due to its complex morphology, there is a need for pre-processing routines to extract the roots of the words then classify them according to the group of acts or m

37

Chen, Hsinchun, and Vasant Dhar. "Online Query Refinement on Information Retrieval Systems: A Process Model of Searcher/System Interactions." ACM, 1990. http://hdl.handle.net/10150/105597.

Abstract:

Artificial Intelligence Lab, Department of MIS, University of Arizona.

This article reports findings of empirical research that investigated information searchers' online query refinement process. Prior studies have recognized the information specialists' role in helping searchers articulate and refine queries. Using a semantic network and a Problem Behavior Graph to represent the online search, our study revealed that searchers also refined their own queries in an online task environment. The information retrieval system played a passive role in assisting online query refinement, which was

38

Halling, Leonard. "Feature Extraction for Content-Based Image Retrieval Using a Pre-Trained Deep Convolutional Neural Network." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-274340.

Abstract:

This thesis examines the performance of features, extracted from a pre-trained deep convolutional neural network, for content-based image retrieval in images of news articles. The industry constantly awaits improved methods for image retrieval, including the company hosting this research project, who are looking to improve their existing image description-based method for image retrieval. It has been shown that in a neural network, the invoked activations from an image can be used as a high-level representation (feature) of the image. This study explores the efficiency of these features in an

39

Paradzinets, Aliaksandr V. "Variable resolution transform-based music feature extraction and their applications for music information retrieval." Ecully, Ecole centrale de Lyon, 2007. http://www.theses.fr/2007ECDL0047.

Abstract:

In the entertainment sector, a considerable number of digital music recordings are produced, broadcast and exchanged, which drives the growing demand for intelligent music search services. Content-based navigation is becoming crucial to allow professionals and amateurs alike to easily access the quantities of music data available. This work presents new musical content descriptors and similarity measures that enable the automatic organization of music data (similarity search, automatic playlist generation)

40

Schatz, Bruce R., Eric H. Johnson, Pauline A. Cochrane, and Hsinchun Chen. "Interactive Term Suggestion for Users of Digital Libraries: Using Subject Thesauri and Co-occurrence Lists for Information Retrieval." ACM, 1996. http://hdl.handle.net/10150/106216.

Abstract:

Artificial Intelligence Lab, Department of MIS, University of Arizona.

The basic problem in information retrieval is that large scale searches can only match terms specified by the user to terms appearing in documents in the digital library collection. Intermediate sources that support term suggestion can thus enhance retrieval by providing alternative search terms for the user. Term suggestion increases the recall, while interaction enables the user to attempt to not decrease the precision. We are building a prototype user interface that will become the Web interface for the University of I

41

Cui, Licong. "Ontology-guided Health Information Extraction, Organization, and Exploration." Case Western Reserve University School of Graduate Studies / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=case1401709795.

42

Normand, Eric. "A Semi-Supervised Information Extraction Framework for Large Redundant Corpora." ScholarWorks@UNO, 2008. http://scholarworks.uno.edu/td/877.

Abstract:

The vast majority of text freely available on the Internet is not available in a form that computers can understand. There have been numerous approaches to automatically extract information from human-readable sources. The most successful attempts rely on vast training sets of data. Others have succeeded in extracting restricted subsets of the available information. These approaches have limited use and require domain knowledge to be coded into the application. The current thesis proposes a novel framework for Information Extraction. From large sets of documents, the system develops sta

43

Zhou, Yuanqiu. "Generating Data-Extraction Ontologies By Example." Diss., 2005. http://contentdm.lib.byu.edu/ETD/image/etd1115.pdf.

44

Kucuk, Dilek. "Exploiting Information Extraction Techniques For Automatic Semantic Annotation And Retrieval Of News Videos In Turkish." PhD thesis, METU, 2011. http://etd.lib.metu.edu.tr/upload/12613043/index.pdf.

Abstract:

Information extraction (IE) is known to be an effective technique for automatic semantic indexing of news texts. In this study, we propose a text-based fully automated system for the semantic annotation and retrieval of news videos in Turkish which exploits several IE techniques on the video texts. The IE techniques employed by the system include named entity recognition, automatic hyperlinking, person entity extraction with coreference resolution, and event extraction. The system utilizes the outputs of the components implementing these IE techniques as the semantic annotations for the underl

45

Tarczyńska, Anna. "Methods of Text Information Extraction in Digital Videos." Thesis, Blekinge Tekniska Högskola, Sektionen för datavetenskap och kommunikation, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-2656.

Abstract:

Context. The huge number of existing digital video files needs indexing to make them accessible to customers (easier searching). The indexing can be provided by text information extraction. In this thesis we have analysed and compared methods of text information extraction in digital videos. Furthermore, we have evaluated them in the new context proposed by us, namely usefulness in sports news indexing and information retrieval. Objectives. The objectives of this thesis are as follows: providing a better understanding of the nature of text extraction; performing a systematic literature

46

Walker, Troy L. "Automating the Extraction of Domain-Specific Information from the Web - A Case Study for the Genealogical Domain." Diss., 2004. http://contentdm.lib.byu.edu/ETD/image/etd607.walker.

47

Lipani, Aldo. "Query rewriting in information retrieval: automatic context extraction from local user documents to improve query results." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2012. http://amslaurea.unibo.it/4528/.

Abstract:

The central objective of research in Information Retrieval (IR) is to discover new techniques to retrieve relevant information in order to satisfy an Information Need. The Information Need is satisfied when relevant information can be provided to the user. In IR, relevance is a fundamental concept which has changed over time, from popular to personal, i.e., what was considered relevant before was information for the whole population, but what is considered relevant now is specific information for each user. Hence, there is a need to connect the behavior of the system to the condition of a part

48

Mohammadzadeh, Hadi. "Improving Retrieval Accuracy in Main Content Extraction from HTML Web Documents." Doctoral thesis, Universitätsbibliothek Leipzig, 2013. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-130500.

Abstract:

The rapid growth of text-based information on the World Wide Web and the various applications making use of this data motivate the need for efficient and effective methods to identify and separate the “main content” from the additional content items, such as navigation menus, advertisements, design elements or legal disclaimers. Firstly, in this thesis, we study, develop, and evaluate R2L, DANA, DANAg, and AdDANAg, a family of novel algorithms for extracting the main content of web documents. The main concept behind R2L, which also provided the initial idea and motivation for the other three algo

49

Lee, Kyogu. "A system for acoustic chord transcription and key extraction from audio using hidden Markov models trained on synthesized audio." 2008. http://proquest.umi.com/login?COPT=REJTPTU1MTUmSU5UPTAmVkVSPTI=&clientId=12498.

50

Ebadat, Ali-Reza. "Toward Robust Information Extraction Models for Multimedia Documents." PhD thesis, INSA de Rennes, 2012. http://tel.archives-ouvertes.fr/tel-00760383.

Abstract:

Over the last decade, huge quantities of multimedia documents have been generated. It is therefore important to find a way to manage these data, particularly from a semantic point of view, which requires detailed knowledge of their content. There are two families of approaches for doing so: either extracting information from the document itself (e.g., audio, image), or using textual data extracted from the document or from external sources (e.g., the Web). Our work belongs to this second family of approaches; the information extracted from the texts can
