
Dissertations / Theses on the topic 'Search index'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Search index.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Reimers, Axel, and Isak Gustafsson. "Indexing and Search Algorithms for Web Shops." Thesis, KTH, Data- och elektroteknik, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-193373.

Full text
Abstract:
Web shops today need to be more and more responsive, and one part of this responsiveness is fast product search. One way of getting faster searches is to search against an index instead of directly against a database. Network Expertise Sweden AB (Net Exp) wants to explore different methods of implementing an index in their future web shop, building upon the open-source web shop platform SmartStore.NET. Since SmartStore.NET does all of its searches directly against its database, it will not scale well and will wear more on the database. The aim was therefore to find different solutions to offload the database by using an index instead. A prototype that retrieved products from a database and made them searchable through an index was developed, evaluated and implemented. The prototype indexed the data with an inverted index algorithm, and was made searchable with a search algorithm that mixed boolean queries with normal queries.
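As a rough illustration of what the abstract describes, here is a minimal Python sketch of an inverted index searched with either boolean AND queries or plain OR queries; the product data and tokenizer are invented stand-ins, not code from the thesis:

```python
from collections import defaultdict

def tokenize(text):
    return text.lower().split()

def build_inverted_index(products):
    """Map each term to the set of product ids whose description contains it."""
    index = defaultdict(set)
    for pid, description in products.items():
        for term in tokenize(description):
            index[term].add(pid)
    return index

def search(index, query, boolean_and=True):
    """AND-combine term postings for boolean queries, OR-combine otherwise."""
    postings = [index.get(term, set()) for term in tokenize(query)]
    if not postings:
        return set()
    op = set.intersection if boolean_and else set.union
    return op(*postings)

products = {1: "red running shoes", 2: "blue running jacket"}
idx = build_inverted_index(products)
print(search(idx, "running shoes"))          # {1}
print(search(idx, "running shoes", False))   # {1, 2}
```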
APA, Harvard, Vancouver, ISO, and other styles
2

Gupta, Chirag. "EFFICIENT K-WORD PROXIMITY SEARCH." Case Western Reserve University School of Graduate Studies / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=case1197213718.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Wareing, Malcolm. "A search for an index of lift traffic performance." Thesis, University of Manchester, 1985. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.293283.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Temple, Thomas J. (Thomas John). "A general index heuristic for search with mobile agents." Thesis, Massachusetts Institute of Technology, 2011. http://hdl.handle.net/1721.1/67175.

Full text
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 181-187) and index.

This dissertation considers a suite of search problems in which agents are trying to find goals in minimum expected time. Unlike search in data structures, in which time is measured by a number of operations, search in metric spaces measures time by units of distance and has received much less attention. In particular, search strategies that attempt to minimize expected search time are only available for a handful of relatively simple cases. Nonetheless, many relevant search problems take place in metric spaces. This dissertation includes several concrete examples from navigation and surveillance that would previously have been approachable only by much more ad hoc methods. We visit these examples along the way to establishing relevance to a much larger set of problems. We present a policy that is an extension of Whittle's index heuristic and is applicable under the following assumptions:

- The locations of goals are independent random variables.
- The agents and goals are in a length space, i.e., a metric space with continuous paths.
- The agents move along continuous paths with bounded speed.
- The agents' sensing is noiseless.

We demonstrate the performance of our policy by applying it to a diverse set of problems for which solutions are available in the literature. We treat each of the following problems as a special case of a more general search problem:

- search in one-dimensional spaces, such as the Line Search Problem (LSP) and Cow Path Problem (CPP);
- search in two-dimensional spaces, such as the Lost in a Forest Problem (LFP) and problems of coverage;
- problems in networks, such as the Graph Search Problem (GSP) and Minimum Latency Tour Problem (MLTP); and
- dynamic problems, such as the Persistent Patrol Problem (PPP) and Dynamic Traveling Repairperson Problem (DTRP).

On each of these we find that our policy performs comparably to, and occasionally better than, the accepted solutions developed specifically for these problems. As a result, we believe that this dissertation contributes a significant inroad into a large space of search problems that meets our assumptions but remains unaddressed.

by Thomas J. Temple. Ph.D.
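The index-heuristic idea can be sketched very roughly: score each candidate search region with an index and greedily travel to the region with the largest index. The ratio used below (probability mass per unit travel time) is our own crude stand-in, not Temple's actual policy, and the regions are invented:

```python
import math

# Candidate search regions: (x, y) center and the probability the goal is there.
# Numbers are illustrative, not from the dissertation.
regions = [((0.0, 0.0), 0.2), ((3.0, 4.0), 0.5), ((6.0, 0.0), 0.3)]

def travel_time(pos, target, speed=1.0):
    return math.dist(pos, target) / speed

def next_region(agent_pos, regions):
    """Greedy index rule: visit the region with the best probability-per-time
    ratio, a crude stand-in for a Whittle-style index."""
    return max(regions, key=lambda r: r[1] / max(travel_time(agent_pos, r[0]), 1e-9))

print(next_region((0.0, 0.0), regions))
```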
APA, Harvard, Vancouver, ISO, and other styles
5

Nilsson, Malin, and Linus Engback. "Visualization of a blog search engine index using 3D graphics." Thesis, Linköping University, Department of Science and Technology, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-8569.

Full text
Abstract:
The purpose of this thesis is to find ways to make the extent and constant movement in the blogosphere visible. An application has been developed using C# and OpenGL. The application is an interactive screensaver to be run on the Windows platform. It visualizes data combining 3D and 2D elements. Geographical data is rendered using a model of the Earth, where the blog posts are constantly updated. Various statistics are displayed to give information on the current state of the blogosphere.
APA, Harvard, Vancouver, ISO, and other styles
6

Kropf, Carsten. "Efficient Reorganisation of Hybrid Index Structures Supporting Multimedia Search Criteria." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2017. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-216425.

Full text
Abstract:
This thesis describes the development and setup of hybrid index structures. They are access methods for retrieval techniques in hybrid data spaces which are formed by one or more relational or normalised columns in conjunction with one non-relational or non-normalised column. Examples of these hybrid data spaces are, among others, textual data combined with geographical data, or data from enterprise content management systems. However, arbitrary non-relational data types may be stored as well, such as image feature vectors or comparable types. Hybrid index structures are known to function efficiently regarding retrieval operations. Unfortunately, little information is available about reorganisation operations which insert or update the row tuples. The fundamental research is mainly executed in simulation-based environments. This work follows on from a previous thesis that implements hybrid access structures in realistic database surroundings. During this implementation it became obvious that retrieval works efficiently. Yet, the restructuring approaches require too much effort to be set up, e.g., in web search engine environments where several thousands of documents are inserted or modified every day. These search engines rely on relational database systems as storage backends. Hence, the setup of these access methods for hybrid data spaces is required in real-world database management systems. This thesis applies a systematic approach to the optimisation of the rearrangement algorithms inside realistic scenarios. Thus, a measurement and evaluation scheme is created which is repeatedly deployed to an evolving state and a model of hybrid index structures, in order to optimise the regrouping algorithms and make a setup of hybrid index structures in real-world information systems possible. To this end, a set of input corpora is selected and applied to the test suite, together with an evaluation scheme. To sum up, this thesis describes input sets, a test suite including an evaluation scheme, and optimisation iterations on reorganisation algorithms reflecting a theoretical model framework, in order to provide efficient reorganisations of hybrid index structures supporting multimedia search criteria.
APA, Harvard, Vancouver, ISO, and other styles
7

Maršálek, Tomáš. "Návrh vyhledávacího systému pro moderní potřeby." Master's thesis, Vysoká škola ekonomická v Praze, 2016. http://www.nusl.cz/ntk/nusl-262227.

Full text
Abstract:
In this work I argue that the field of text search has focused mostly on long text documents, while there is a growing need for efficient short text search, which comes with different user expectations. Because of the reduced data set sizes, different algorithmic techniques become computationally affordable. The focus of this work is on approximate and prefix search and on purely text-based ranking methods, which are needed due to the lower precision of text statistics on short texts. A basic prototype search engine was created using the researched techniques. Its capabilities were demonstrated on example search scenarios, and the implementation was compared to two other open-source systems representing currently recommended approaches to the short text search problem. The results show the feasibility of the implemented prototype regarding both user expectations and performance. Several options for the future direction of the system are proposed.
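One of the techniques named above, prefix search, admits a compact sketch: binary search over a sorted vocabulary finds the whole range of completions. The vocabulary is made up for illustration:

```python
import bisect

vocabulary = sorted(["search", "searching", "seat", "select", "server"])

def prefix_matches(vocab, prefix):
    """All vocabulary entries starting with `prefix`, found by binary search
    for the first and one-past-last candidates in the sorted list."""
    lo = bisect.bisect_left(vocab, prefix)
    hi = bisect.bisect_left(vocab, prefix + "\uffff")  # upper bound of the prefix range
    return vocab[lo:hi]

print(prefix_matches(vocabulary, "sea"))  # ['search', 'searching', 'seat']
```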
APA, Harvard, Vancouver, ISO, and other styles
8

Lester, Nicholas. "Efficient Index Maintenance for Text Databases." RMIT University. Computer Science and Information Technology, 2006. http://adt.lib.rmit.edu.au/adt/public/adt-VIT20070214.154933.

Full text
Abstract:
All practical text search systems use inverted indexes to quickly resolve user queries. Offline index construction algorithms, where queries are not accepted during construction, have been the subject of much prior research. As a result, current techniques can invert virtually unlimited amounts of text in limited main memory, making efficient use of both time and disk space. However, these algorithms assume that the collection does not change during the use of the index. This thesis examines the task of index maintenance, the problem of adapting an inverted index to reflect changes in the collection it describes. Existing approaches to index maintenance are discussed, including proposed optimisations. We present analysis and empirical evidence suggesting that existing maintenance algorithms either scale poorly to large collections, or significantly degrade query resolution speed. In addition, we propose a new strategy for index maintenance that trades a strictly controlled amount of querying efficiency for greatly increased maintenance speed and scalability. Analysis and empirical results are presented that show that this new algorithm is a useful trade-off between indexing and querying efficiency. In scenarios described in Chapter 7, the use of the new maintenance algorithm reduces the time required to construct an index to under one sixth of the time taken by algorithms that maintain contiguous inverted lists. In addition to work on index maintenance, we present a new technique for accumulator pruning during ranked query evaluation, as well as providing evidence that existing approaches are unsatisfactory for collections of large size. Accumulator pruning is a key problem in both querying efficiency and overall text search system efficiency. Existing approaches either fail to bound the memory footprint required for query evaluation, or suffer loss of retrieval accuracy. In contrast, the new pruning algorithm can be used to limit the memory footprint of ranked query evaluation, and in our experiments gives retrieval accuracy not worse than previous alternatives. The results presented in this thesis are validated with robust experiments, which utilise collections of significant size, containing real data, and tested using appropriate numbers of real queries. The techniques presented in this thesis allow information retrieval applications to efficiently index and search changing collections, a task that has been historically problematic.
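The accumulator-pruning problem described above can be sketched as a hard cap on the number of score accumulators held during term-at-a-time evaluation; the postings, weights, and cap below are invented, and the thesis's actual pruning strategy is more refined:

```python
def ranked_query(query_terms, postings, max_accumulators=4):
    """Score documents term-at-a-time, but never hold more than
    `max_accumulators` candidate documents in memory."""
    accumulators = {}
    for term in query_terms:                       # rarest-first helps in practice
        for doc_id, weight in postings.get(term, []):
            if doc_id in accumulators:
                accumulators[doc_id] += weight
            elif len(accumulators) < max_accumulators:
                accumulators[doc_id] = weight      # admit a new accumulator
            # else: prune -- new documents are not admitted once the cap is hit
    return sorted(accumulators.items(), key=lambda kv: -kv[1])

postings = {
    "index": [(1, 2.1), (3, 1.4), (7, 0.9)],
    "maintenance": [(3, 1.8), (5, 1.1), (9, 0.7)],
}
print(ranked_query(["index", "maintenance"], postings))
```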
APA, Harvard, Vancouver, ISO, and other styles
9

González, Cornejo Senen Andrés. "To index or not to index: Time-space trade-offs in search engines with positional ranking functions." Tesis, Universidad de Chile, 2014. http://www.repositorio.uchile.cl/handle/2250/116403.

Full text
Abstract:
Magíster en Ciencias, Mención Computación. Web search has become an important part of day-to-day life. Web search engines are important tools that give access to the information stored in the web. The success of a web search engine mostly depends on its efficiency and the quality of its ranking function. Web search engines also give extra aids to their users, which make them more usable; instances of this are the ability to generate result snippets and to retrieve the in-cache version of a web page, among others. Inverted indexes are a fundamental data structure used by web search engines to efficiently answer user queries. In a basic setup, inverted indexes only allow for simple (though fairly effective) ranking functions (e.g., BM25). It is well known that the high quality of today's search-engine results is due to sophisticated ranking functions. A particular example that has been widely studied in the literature is that of positional ranking functions, where the positions of the query terms within the resulting documents are used in order to rank them. To support this kind of ranking, the classical solution is positional inverted indexes. However, these usually demand large amounts of extra space, typically about three times the space of an inverted index. Moreover, if the web search engine needs to produce text snippets or display a cached copy of a web page, the textual data must also be stored. In this thesis we study time/space trade-offs for web search engines with positional ranking functions and text snippet generation. We aim to answer the question of whether positional inverted indexes are the most efficient way to store and retrieve positional data. In particular, we propose to get rid of positional data in inverted indexes, and instead obtain that information from the text collection itself. The challenge is to compress the text collection such that one can support the extraction of arbitrary documents, in order to find the positions of the query terms within them. We study and compare several alternatives for compressing the textual data. The first one uses a succinct data structure (in particular, a Wavelet Tree). We show how the space of the data structure can be reduced significantly, but also slowed down, by using high-order compressors within the nodes of the data structure. We then show how several text compression alternatives behave when used to obtain arbitrary documents (note that decompression speed is key in this application). Our starting points are compressors that either (1) use little space for the text, yet with a slow decompression speed, or (2) have a very efficient decompression time (achieving a total performance comparable to that of positional inverted indexes), yet with a poor compression ratio. We then show how to obtain the best of both worlds: an efficient compression ratio with a high decompression speed. We conclude that there exists a wide range of practical time/space trade-offs other than just positional inverted indexes. The main result is that, using only about 50% of the space of current solutions (i.e., positional inverted indexes plus the compressed text), one can support positional ranking and snippet generation almost with no time penalties. This seems to indicate that not indexing positional data is the best solution in many practical scenarios. This can change the way in which positional data is stored and retrieved in web search engines.
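A simplified illustration of the trade-off studied here: rather than storing positions in the index, keep documents compressed and recover term positions by decompressing the candidate document at query time. zlib stands in for the much more carefully chosen compressors compared in the thesis:

```python
import zlib

docs = {42: "to index or not to index that is the question"}
store = {doc_id: zlib.compress(text.encode()) for doc_id, text in docs.items()}

def term_positions(doc_id, term):
    """Decompress the document and scan for the term's word positions,
    avoiding a positional inverted index entirely."""
    words = zlib.decompress(store[doc_id]).decode().split()
    return [i for i, w in enumerate(words) if w == term]

print(term_positions(42, "index"))  # [1, 5]
```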
APA, Harvard, Vancouver, ISO, and other styles
10

Kubilay, Mustafa. "Special Index And Retrieval Mechanism For Ontology Based Medical Domain Search Engines." Master's thesis, METU, 2005. http://etd.lib.metu.edu.tr/upload/2/12606470/index.pdf.

Full text
Abstract:
This thesis focuses on the index and retrieval mechanism of an ontology-based medical domain search engine. First, indexing techniques and retrieval methods are reviewed. Then, a special indexing and retrieval mechanism is introduced. This thesis also specifies the functional requirements of these mechanisms. Finally, an evaluation is given, indicating the positive and negative aspects of the mechanisms.
APA, Harvard, Vancouver, ISO, and other styles
11

Harrison, Makiko Ito. "The human development index : a search for a measure of human values." Thesis, London School of Economics and Political Science (University of London), 2001. http://etheses.lse.ac.uk/2499/.

Full text
Abstract:
The thesis investigates methods of evaluating indexes that measure concepts of human values. My understanding of indexes, especially of how they relate to the real world and to concepts (which are the objectives of the measurement), is influenced by my study of the literature on models used in economics and in physics. We learn the following from this study of models: (1) regularities described in theories do not represent real-world phenomena, which consist of many different forces acting simultaneously; (2) but such regularities are true in models, because models describe specific conditions under which regularities in nature are displayed; (3) there is more than one model that can represent the same phenomenon, depending on which particular aspect of the phenomenon is in focus; and (4) the success of a model has to be evaluated partly by criteria that are independent of theoretical ones. Since the role indexes play in relation to the real world and concepts is similar to the role models play in relation to theories, I have applied the above knowledge to propose the following three criteria for evaluating successful indexes: (1) Purpose-dependent criteria: criteria that are based on particular motivations of the measurement project; (2) Theory-dependent criteria: criteria that are reflected in the theories that expressly or implicitly guide the development of the project of measurement; and (3) Conditions-dependent criteria: criteria that are based on the conditions under which the index measures what it is designed to measure. I apply these three criteria of successful indexes to examine two projects of measuring human values, one called the Human Development Index, developed by the United Nations Development Programme, and the other called the Life Satisfaction Indicator, developed by an officer at the Economic Planning Agency in Japan. Among the findings from the examination of those two indexes is that they can be the products of a mixture of concerns that include convenience, conventions, practicality, politics and consistency with relevant theories, and that some of these concerns may conflict with each other. Another important finding is that, because many assumptions are made and simplifications applied in order to choose a quantitative representation of a human value, the application of the measure is limited. I conclude that both in using and in evaluating indexes of human values, it is important that we are aware of such limitations, so that we can more effectively know both how to avoid misusing the indexes and how to improve them over time.
APA, Harvard, Vancouver, ISO, and other styles
12

Pan, Jenq-Shyang. "Improved algorithms for VQ codeword search, codebook design and codebook index assignment." Thesis, University of Edinburgh, 1996. http://hdl.handle.net/1842/15581.

Full text
Abstract:
This thesis investigates efficient codeword search algorithms and efficient clustering algorithms for vector quantization (VQ), improved codebook design algorithms and improved codebook index assignment for noisy channels. In the investigation of codeword search algorithms, several fast approaches are proposed, such as the improved absolute error inequality criterion, improved algorithms for partial distortion search, improved algorithms for extended partial distortion search and a fast approximate search algorithm. The bound for the Minkowski metric is derived as the generalised form of the partial distortion search algorithm, hypercube approach, absolute error inequality criterion and improved absolute error inequality criterion. This bound provides a better criterion than the absolute error inequality elimination rule on the Euclidean distortion measure. For the Minkowski metric of order n, this bound extends the elimination criterion from the L1 metric to the Ln metric. This bound is also extended to the bound for the quadratic metric by using methods of metric transformation. The improved absolute error inequality criterion is also extended to the generalised form of the mean-distance-ordered search algorithm for VQ image coding. Several fast clustering algorithms for vector quantization based on the LBG algorithm are presented. Genetic algorithms are applied to codebook design to derive improved codevectors. The approach of stochastic relaxation is also applied to the mutation step of the genetic algorithm to further improve the codebook design algorithm. Vector quantization is very efficient for data compression of speech and images, where the binary indices of the optimally chosen codevectors are used. The effect of channel errors is to cause errors in the received indices. A parallel genetic algorithm is applied to assign the codevector indices for noisy channels so as to minimize the distortion due to bit errors. The novel property of multiple global optima and the average distortion of the memoryless binary symmetric channel for any bit error in the assignment of the codebook index are also introduced.
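Partial distortion search, one of the classic fast codeword-search techniques the thesis improves upon, fits in a few lines: while accumulating squared error against a codevector, abandon it as soon as the partial sum exceeds the best distortion so far. The codebook and input vector are invented:

```python
def nearest_codeword(x, codebook):
    """Full-search VQ with partial distortion elimination."""
    best_index, best_dist = -1, float("inf")
    for i, c in enumerate(codebook):
        dist = 0.0
        for xj, cj in zip(x, c):
            dist += (xj - cj) ** 2
            if dist >= best_dist:   # partial sum already too large: abandon
                break
        else:
            best_index, best_dist = i, dist
    return best_index, best_dist

codebook = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.5)]
print(nearest_codeword((0.9, 1.1), codebook))  # (1, 0.02)
```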
APA, Harvard, Vancouver, ISO, and other styles
13

Li, Chaoyang, and Ke Liu. "Smart Search Engine : A Design and Test of Intelligent Search of News with Classification." Thesis, Högskolan Dalarna, Institutionen för information och teknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:du-37601.

Full text
Abstract:
Background: Google, Bing, and Baidu are the most commonly used search engines in the world, but they also have problems. For example, when searching for 'Jaguar', most of the search results are cars, not animals. This is the problem of polysemy: search engines provide the most popular, but not the most correct, results. Aim: We want to design and implement a search function and explore whether classifying news can improve the precision of users searching for news. Method: We collect data using a web crawler that crawls news from BBC News. We then use NLTK and an inverted index for data pre-processing, and BM25 for ranking. Results: Compared to the normal search function, our function has a lower recall rate and a higher precision. Conclusions: This search function can improve precision when people search for news. Implications: This search function can be used not only to search news but to search anything, and it has a promising future in search engines. It can be combined with machine learning to analyse users' search habits so as to search and classify more accurately.
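For reference, a self-contained sketch of BM25 scoring as used in the method above, with the usual k1 and b parameters; the tiny two-document corpus is illustrative:

```python
import math

docs = {1: "jaguar is a big cat".split(), 2: "jaguar cars are fast".split()}
N = len(docs)
avgdl = sum(len(d) for d in docs.values()) / N

def bm25(query, doc, k1=1.5, b=0.75):
    score = 0.0
    for term in query:
        df = sum(1 for d in docs.values() if term in d)
        if df == 0:
            continue
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)  # smoothed IDF
        tf = doc.count(term)
        score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avgdl))
    return score

query = "jaguar cat".split()
print(sorted(docs, key=lambda i: -bm25(query, docs[i])))  # doc 1 ranked first
```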
APA, Harvard, Vancouver, ISO, and other styles
14

Lallali, Saliha. "A scalable search engine for the Personal Cloud." Thesis, Université Paris-Saclay (ComUE), 2016. http://www.theses.fr/2016SACLV009.

Full text
Abstract:
A new embedded search engine designed for smart objects. Such devices are generally equipped with extremely little RAM and a large Flash storage capacity. To tackle these conflicting hardware constraints, conventional search engines privilege either insertion or query scalability, but cannot meet both requirements at the same time. Moreover, very few solutions support document deletions and updates in this context. We introduce three design principles, namely Write-Once Partitioning, Linear Pipelining and Background Linear Merging, and show how they can be combined to produce an embedded search engine reconciling a high insert/delete/update rate with query scalability. We have implemented our search engine on a development board with a hardware configuration representative of smart objects, and have conducted extensive experiments using two representative datasets. The experimental results demonstrate the scalability of the approach and its superiority compared to state-of-the-art methods.
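The three principles are reminiscent of log-structured merge designs, which suggests a toy sketch: an in-memory buffer is flushed as an immutable (write-once) partition, queries scan the partitions linearly, and a background step merges partitions. This is our own LSM-flavored illustration, not the thesis design:

```python
class TinyEmbeddedIndex:
    def __init__(self, flush_at=2, merge_at=3):
        self.buffer, self.partitions = {}, []
        self.flush_at, self.merge_at = flush_at, merge_at

    def insert(self, term, doc_id):
        self.buffer.setdefault(term, set()).add(doc_id)
        if len(self.buffer) >= self.flush_at:      # flush when the buffer fills up
            self.partitions.append(self.buffer)    # write-once: never modified again
            self.buffer = {}
            if len(self.partitions) >= self.merge_at:
                self._merge()

    def _merge(self):
        merged = {}
        for part in self.partitions:               # background linear merge
            for term, docs in part.items():
                merged.setdefault(term, set()).update(docs)
        self.partitions = [merged]

    def query(self, term):
        result = set(self.buffer.get(term, set()))
        for part in self.partitions:               # linear scan over partitions
            result |= part.get(term, set())
        return result

idx = TinyEmbeddedIndex()
for t, d in [("sensor", 1), ("flash", 1), ("sensor", 2), ("ram", 3)]:
    idx.insert(t, d)
print(idx.query("sensor"))  # {1, 2}
```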
APA, Harvard, Vancouver, ISO, and other styles
15

Åkesson, Erik. "Nyhetssöktjänster på webben : En utvärdering av News Index, Excite News Search och Ananova." Thesis, Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan, 2001. http://urn.kb.se/resolve?urn=urn:nbn:se:hb:diva-18320.

Full text
Abstract:
The purpose of this study is to examine the retrieval performance of three search engines specialized in retrieving news articles: News Index, Excite News Search and Ananova. Thirty questions, grouped into three categories (politics, economy and sports), were used, and the first twenty documents for each question were examined. The questions were designed to be as current as possible, and efforts were made to perform the searches with as little time span as possible between each search engine. The precision of the search engines was determined for each question, for each category, and for the combined categories. In measuring precision, an average was calculated that is intended to favour search engines that place their relevant documents early in the ranked list. The relevance of the retrieved documents was evaluated using a three-grade scale: irrelevant articles and duplicates were given 0 points, partially relevant documents 0.5 points, and those judged highly relevant 1 point. The results of the study show surprisingly high precision from two of the search engines, Excite and News Index, with the former performing slightly better than the latter. Ananova performed considerably worse than the other two. One possible reason for the high precision observed is the relatively low complexity of the retrieved documents compared to web pages in general. When comparing the different categories of questions, one notable result was that all search engines performed considerably worse in the 'economy' category. Possible reasons for this are, apart from a higher number of duplicates, a shortage of relevant articles for the questions in this category, as well as possible differences between either the documents retrieved in the different categories or the web pages publishing them. Thesis level: D
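The rank-weighted, graded precision measure described above might look like the following sketch; the exact formula is not given in the abstract, so the reciprocal-rank weighting here is our own guess at 'favouring early relevant documents':

```python
def graded_precision(grades):
    """Graded relevance (0, 0.5, 1) per ranked position, weighted so that
    early positions count more. The 1/rank weights are hypothetical."""
    weights = [1.0 / rank for rank in range(1, len(grades) + 1)]
    return sum(w * g for w, g in zip(weights, grades)) / sum(weights)

# First twenty results of one query: 1 = relevant, 0.5 = partial, 0 = irrelevant/duplicate.
run = [1, 1, 0.5, 0, 1, 0, 0, 0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(round(graded_precision(run), 3))
```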
APA, Harvard, Vancouver, ISO, and other styles
16

Chatterjee, Kasturi. "A generalized multidimensional index structure for multimedia data to support content-based similarity searches in a collaborative search environment." FIU Digital Commons, 2010. http://digitalcommons.fiu.edu/etd/2114.

Full text
Abstract:
Since multimedia data, such as images and videos, are way more expressive and informative than ordinary text-based data, people find it more attractive to communicate and express with them. Additionally, with the rising popularity of social networking tools such as Facebook and Twitter, multimedia information retrieval can no longer be considered a solitary task. Rather, people constantly collaborate with one another while searching and retrieving information. But the very cause of the popularity of multimedia data, the huge and different types of information a single data object can carry, makes their management a challenging task. Multimedia data is commonly represented as multidimensional feature vectors and carry high-level semantic information. These two characteristics make them very different from traditional alpha-numeric data. Thus, to try to manage them with frameworks and rationales designed for primitive alpha-numeric data, will be inefficient. An index structure is the backbone of any database management system. It has been seen that index structures present in existing relational database management frameworks cannot handle multimedia data effectively. Thus, in this dissertation, a generalized multidimensional index structure is proposed which accommodates the atypical multidimensional representation and the semantic information carried by different multimedia data seamlessly from within one single framework. Additionally, the dissertation investigates the evolving relationships among multimedia data in a collaborative environment and how such information can help to customize the design of the proposed index structure, when it is used to manage multimedia data in a shared environment. Extensive experiments were conducted to present the usability and better performance of the proposed framework over current state-of-art approaches.
APA, Harvard, Vancouver, ISO, and other styles
17

Ali, Halil. "Effective web crawlers." RMIT University. CS&IT, 2008. http://adt.lib.rmit.edu.au/adt/public/adt-VIT20081127.164414.

Full text
Abstract:
Web crawlers are the component of a search engine that must traverse the Web, gathering documents in a local repository for indexing by a search engine so that they can be ranked by their relevance to user queries. Whenever data is replicated in an autonomously updated environment, there are issues with maintaining up-to-date copies of documents. When documents are retrieved by a crawler and have subsequently been altered on the Web, the effect is an inconsistency in user search results. While the impact depends on the type and volume of change, many existing algorithms do not take the degree of change into consideration, instead using simple measures that consider any change as significant. Furthermore, many crawler evaluation metrics do not consider index freshness or the amount of impact that crawling algorithms have on user results. Most of the existing work makes assumptions about the change rate of documents on the Web, or relies on the availability of a long history of change. Our work investigates approaches to improving index consistency: detecting meaningful change, measuring the impact of a crawl on collection freshness from a user perspective, developing a framework for evaluating crawler performance, determining the effectiveness of stateless crawl ordering schemes, and proposing and evaluating the effectiveness of a dynamic crawl approach. Our work is concerned specifically with cases where there is little or no past change statistics with which predictions can be made. Our work analyses different measures of change and introduces a novel approach to measuring the impact of recrawl schemes on search engine users. Our schemes detect important changes that affect user results. Other well-known and widely used schemes have to retrieve around twice the data to achieve the same effectiveness as our schemes. Furthermore, while many studies have assumed that the Web changes according to a model, our experimental results are based on real web documents. We analyse various stateless crawl ordering schemes that have no past change statistics with which to predict which documents will change, none of which, to our knowledge, has been tested to determine effectiveness in crawling changed documents. We empirically show that the effectiveness of these schemes depends on the topology and dynamics of the domain crawled and that no one static crawl ordering scheme can effectively maintain freshness, motivating our work on dynamic approaches. We present our novel approach to maintaining freshness, which uses the anchor text linking documents to determine the likelihood of a document changing, based on statistics gathered during the current crawl. We show that this scheme is highly effective when combined with existing stateless schemes. When we combine our scheme with PageRank, our approach allows the crawler to improve both freshness and quality of a collection. Our scheme improves freshness regardless of which stateless scheme it is used in conjunction with, since it uses both positive and negative reinforcement to determine which document to retrieve. Finally, we present the design and implementation of Lara, our own distributed crawler, which we used to develop our testbed.
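The dynamic approach described above can be caricatured in a few lines: keep a change-likelihood score per document, recrawl the highest-scoring one, and update the score with positive or negative reinforcement after each fetch. The scores and learning rate are arbitrary placeholders, not values from the thesis:

```python
import heapq

scores = {"/news": 0.9, "/about": 0.1, "/blog": 0.5}  # estimated change likelihood

def next_to_crawl():
    # heapq is a min-heap, so negate scores to pop the most-likely-changed page.
    heap = [(-s, url) for url, s in scores.items()]
    heapq.heapify(heap)
    return heapq.heappop(heap)[1]

def after_fetch(url, changed, lr=0.3):
    """Positive/negative reinforcement of the change estimate."""
    target = 1.0 if changed else 0.0
    scores[url] += lr * (target - scores[url])

url = next_to_crawl()              # '/news'
after_fetch(url, changed=False)
print(url, round(scores[url], 2))  # '/news' 0.63
```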
APA, Harvard, Vancouver, ISO, and other styles
18

Giusti, Miguel. "In Search of Lost Happiness. On the Conflict of Paradigms in Contemporary Ethics." Pontificia Universidad Católica del Perú - Departamento de Humanidades, 2013. http://repositorio.pucp.edu.pe/index/handle/123456789/113022.

Full text
Abstract:
Following Hegel's revealing judgment concerning the ambivalence of modern morals, it could be contended that the consensus sought in contemporary ethical debates oscillates between utopia and nostalgia, invention and tradition. The idea of a harmonious agreement of everybody's interests that could act as a binding moral norm is usually sought in the projection of an ideal community or in the retrieval of a lost paradise. It is, with all the ambiguity of this expression, a quest for lost happiness. But an oscillating quest like this inevitably faces argumentative paradoxes, as the aporetic moral debates of the last decades show, and as this essay also intends to show.
APA, Harvard, Vancouver, ISO, and other styles
19

Lange, Dustin. "Effective and efficient similarity search in databases." Phd thesis, Universität Potsdam, 2013. http://opus.kobv.de/ubp/volltexte/2013/6571/.

Full text
Abstract:
Given a large set of records in a database and a query record, similarity search aims to find all records sufficiently similar to the query record. To solve this problem, two main aspects need to be considered: First, to perform effective search, the set of relevant records is defined using a similarity measure. Second, an efficient access method is to be found that performs only few database accesses and comparisons using the similarity measure. This thesis solves both aspects with an emphasis on the latter. In the first part of this thesis, a frequency-aware similarity measure is introduced. Compared record pairs are partitioned according to frequencies of attribute values. For each partition, a different similarity measure is created: machine learning techniques combine a set of base similarity measures into an overall similarity measure. After that, a similarity index for string attributes is proposed, the State Set Index (SSI), which is based on a trie (prefix tree) that is interpreted as a nondeterministic finite automaton. For processing range queries, the notion of query plans is introduced in this thesis to describe which similarity indexes to access and which thresholds to apply. The query result should be as complete as possible under some cost threshold. Two query planning variants are introduced: (1) Static planning selects a plan at compile time that is used for all queries. (2) Query-specific planning selects a different plan for each query. For answering top-k queries, the Bulk Sorted Access Algorithm (BSA) is introduced, which retrieves large chunks of records from the similarity indexes using fixed thresholds, and which focuses its efforts on records that are ranked high in more than one attribute and thus promising candidates. The described components form a complete similarity search system. Based on prototypical implementations, this thesis shows comparative evaluation results for all proposed approaches on different real-world data sets, one of which is a large person data set from a German credit rating agency.
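The frequency-aware measure admits a small sketch: partition compared pairs by how frequent their attribute values are, and apply a differently weighted combination of base measures per partition. The base measures, weights, and frequency table below are invented placeholders; the thesis learns the combinations with machine learning rather than fixing them by hand:

```python
from difflib import SequenceMatcher

value_frequency = {"Smith": 900, "Nakamura": 12}   # hypothetical corpus counts

def edit_sim(a, b):
    return SequenceMatcher(None, a, b).ratio()

def prefix_sim(a, b):
    n = min(len(a), len(b))
    shared = next((i for i in range(n) if a[i] != b[i]), n)
    return shared / max(len(a), len(b))

# One weight vector per frequency partition (frequent vs. rare values).
weights = {"frequent": (0.3, 0.7), "rare": (0.8, 0.2)}

def frequency_aware_sim(a, b, threshold=100):
    freq = max(value_frequency.get(a, 0), value_frequency.get(b, 0))
    w1, w2 = weights["frequent" if freq >= threshold else "rare"]
    return w1 * edit_sim(a, b) + w2 * prefix_sim(a, b)

print(frequency_aware_sim("Smith", "Smyth"))
print(frequency_aware_sim("Nakamura", "Nakamora"))
```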
APA, Harvard, Vancouver, ISO, and other styles
20

Rehnby, Nicklas. "Performance of alternative option pricing models during spikes in the FTSE 100 volatility index : Empirical evidence from FTSE100 index options." Thesis, Linköpings universitet, Institutionen för ekonomisk och industriell utveckling, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-139718.

Full text
Abstract:
Derivatives play a large and significant role on today's financial markets, and the popularity of options has increased. This has also increased the demand for a suitable option pricing model, since the ground-breaking model developed by Black & Scholes (1973) has poor pricing performance. Practitioners and academics have over the years developed different models under the assumption of non-constant volatility, without reaching any conclusion regarding which model is most suitable to use. This thesis examines four different models. The first is the Practitioner Black & Scholes model proposed by Christoffersen & Jacobs (2004b). The second is Heston's (1993) continuous-time stochastic volatility model; a modification of this model is also included, called the Strike Vector Computation, suggested by Kilin (2011). The last model is the Heston & Nandi (2000) Generalized Autoregressive Conditional Heteroscedasticity (GARCH) type discrete model. The models are evaluated from a practical point of view, with the goal of finding the model with the best pricing performance and the most practical usage. The models' robustness is also tested, to see how they perform out-of-sample during high and low implied volatility markets. All models are affected in the robustness test: out-of-sample ability is negatively affected by a high implied volatility market. The results show that both stochastic volatility models have superior performance in the in-sample and out-of-sample analyses. The GARCH-type discrete model shows surprisingly poor results in both the in-sample and out-of-sample analyses. The results indicate that option data, rather than historical return data, should be used to estimate the models' parameters. This thesis also provides insight into why overnight-index-swap (OIS) rates should be used instead of LIBOR rates as a proxy for the risk-free rate.
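For reference, a minimal Black-Scholes (1973) European call pricer, the baseline that all of the models above try to improve on; the inputs are illustrative, FTSE-like numbers:

```python
from math import exp, log, sqrt, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    """Black-Scholes call price under constant volatility."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

# Illustrative inputs: spot 7400, strike 7500, 3 months, 1% rate, 15% volatility.
print(round(bs_call(7400, 7500, 0.25, 0.01, 0.15), 2))
```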
APA, Harvard, Vancouver, ISO, and other styles
21

Al-Akashi, Falah Hassan Ali. "Using Wikipedia Knowledge and Query Types in a New Indexing Approach for Web Search Engines." Thesis, Université d'Ottawa / University of Ottawa, 2014. http://hdl.handle.net/10393/31773.

Full text
Abstract:
The Web comprises a vast quantity of text. Modern search engines struggle to index it independently of the structure of queries and the type of Web data, and commonly use indexing based on the Web's graph structure to identify high-quality relevant pages. However, despite the apparent widespread use of these algorithms, Web indexing based on human feedback and document content is controversial. There are many fundamental questions that need to be addressed, including: How many types of domains/websites are there in the Web? What type of data is in each type of domain? For each type, which segments/HTML fields in the documents are most useful? What are the relationships between the segments? How can web content be indexed efficiently in all forms of document configurations? Our investigation of these questions has led to a novel way to use Wikipedia to find the relationships between the query structures and document configurations throughout the document indexing process, and to use them to build an efficient index that allows fast indexing and searching, and optimizes the retrieval of highly relevant results. We consider the top page on the ranked list to be highly important in determining the types of queries. Our aim is to design a powerful search engine with a strong focus on how to make the first page highly relevant to the user, and on how to retrieve other pages based on that first page. By processing the user query using the Wikipedia index and determining the type of the query, our approach can trace the path of a query in our index and retrieve specific results for each type. We use two kinds of data to increase the relevancy and efficiency of the ranked results: offline and real-time. Traditional search engines find it difficult to use these two kinds of data together, because building a real-time index from social data and integrating it with the index for the offline data is difficult in a traditional distributed index. As a source of offline data, we use data from the Text Retrieval Conference (TREC) evaluation campaign. The web track at TREC offers researchers the chance to investigate different retrieval approaches for web indexing and searching. The crawled offline dataset makes it possible to design powerful search engines that extend current methods, and to evaluate and compare them. We propose a new indexing method based on the structures of the queries and the content of documents. Our search engine uses a core index for offline data and a hash index for real-time data, which leads to improved performance. The TREC Web track evaluation of our experiments showed that our approach can be successfully employed for different types of queries. We evaluated our search engine on different sets of queries from the TREC 2010, 2011 and 2012 Web tracks. Our approach achieved very good results on the TREC 2010 training queries. On the TREC 2011 testing queries, our approach was one of the six best compared to all other approaches (including those that used a very large corpus of 500 million documents), and it was second best when compared to approaches that used only part of the corpus (50 million documents), as ours did. On the TREC 2012 testing queries, our approach was second best compared to all approaches, and first compared only to systems that used the subset of 50 million documents.
APA, Harvard, Vancouver, ISO, and other styles
22

Appelros, Peter. "Stroke severity and outcome : in search of predictors using a population-based strategy /." Stockholm : [Karolinska institutets bibl.], 2002. http://diss.kib.ki.se/2002/91-7349-275-2/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
23

Tekli, Joe, Richard Chbeir, Agma J. M. Traina, et al. "Full-fledged semantic indexing and querying model designed for seamless integration in legacy RDBMS." Elsevier B.V, 2018. http://hdl.handle.net/10757/624626.

Full text
Abstract:
In the past decade, there has been an increasing need for semantic-aware data search and indexing in textual (structured and NoSQL) databases, as full-text search systems became available to non-experts where users have no knowledge about the data being searched and often formulate query keywords which are different from those used by the authors in indexing relevant documents, thus producing noisy and sometimes irrelevant results. In this paper, we address the problem of semantic-aware querying and provide a general framework for modeling and processing semantic-based keyword queries in textual databases, i.e., considering the lexical and semantic similarities/disparities when matching user query and data index terms. To do so, we design and construct a semantic-aware inverted index structure called SemIndex, extending the standard inverted index by constructing a tightly coupled inverted index graph that combines two main resources: a semantic network and a standard inverted index on a collection of textual data. We then provide a general keyword query model with specially tailored query processing algorithms built on top of SemIndex, in order to produce semantic-aware results, allowing the user to choose the results' semantic coverage and expressiveness based on her needs. To investigate the practicality and effectiveness of SemIndex, we discuss its physical design within a standard commercial RDBMS allowing to create, store, and query its graph structure, thus enabling the system to easily scale up and handle large volumes of data. We have conducted a battery of experiments to test the performance of SemIndex, evaluating its construction time, storage size, query processing time, and result quality, in comparison with legacy inverted index. Results highlight both the effectiveness and scalability of our approach.
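The general idea of coupling an inverted index with a semantic network can be sketched as query-time expansion through a small synonym graph; the graph and documents below are invented, and the actual SemIndex couples the two structures far more tightly than this:

```python
from collections import defaultdict

synonyms = {"car": {"automobile", "vehicle"}, "automobile": {"car"}}  # toy semantic network

docs = {1: "vintage automobile repair", 2: "electric vehicle charging", 3: "car insurance"}
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

def semantic_search(term, hops=1):
    """Expand the query term through the semantic network, then union postings."""
    terms = {term}
    for _ in range(hops):
        terms |= {s for t in terms for s in synonyms.get(t, ())}
    return sorted(set().union(*(index.get(t, set()) for t in terms)))

print(semantic_search("car"))  # [1, 2, 3]: a plain keyword lookup would return only [3]
```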
APA, Harvard, Vancouver, ISO, and other styles
24

Prívozník, Michal. "Lokální vyhledávání pro Linux." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2010. http://www.nusl.cz/ntk/nusl-237126.

Full text
Abstract:
This work deals with indexing and different types of indexing structures, with their advantages and disadvantages. It provides the basis for a search engine with support for morphology and different file formats, and offers insight into the basic ideas whose answers are the aim of this master's thesis.
APA, Harvard, Vancouver, ISO, and other styles
25

Towell, Alexander R. "Encrypted Search: Enabling Standard Information Retrieval Techniques for Several New Secure Index Types While Preserving Confidentiality Against an Adversary With Access to Query Histories and Secure Index Contents." Thesis, Southern Illinois University at Edwardsville, 2015. http://pqdtopen.proquest.com/#viewpdf?dispub=1601582.

Full text
Abstract:
Encrypted Search is a way for a client to store searchable documents on untrusted systems such that the untrusted system can obliviously search the documents on the client's behalf, i.e., the untrusted system does not know what the client is searching for nor what the documents contain. Several new secure index types (which enable Encrypted Search functionality) are designed and implemented, and then compared against each other and against the more typical Bloom filter-based secure index. We compare them with respect to several performance measures: time complexity, space complexity, and retrieval accuracy with respect to two rank-ordered search heuristics, MinDist* and BM25. In order to support these search heuristics, the secure indexes must store frequency and proximity information. We investigate the risk this poses to confidentiality and explore ways to mitigate said risk. Finally, we analyze the effect the false positive rate and secure index poisoning techniques have on both confidentiality and performance. Separately, we also simulate an adversary who has access to a history of hidden (encrypted) queries and design techniques that demonstrably mitigate the risk posed by this adversary, e.g., query obfuscation, without adversely affecting retrieval accuracy.
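A bare-bones sketch of the Bloom-filter-style secure index the thesis uses as its baseline: a document's terms are keyed with an HMAC before insertion into a per-document Bloom filter, so the server can test membership without seeing plaintext terms. Filter size, hash count, and key handling are illustrative only; a real scheme needs much more care:

```python
import hashlib, hmac

M, K, SECRET = 256, 3, b"client-side-key"   # filter bits, hash count, trapdoor key

def positions(term):
    """K Bloom positions derived from an HMAC of the term (the 'trapdoor')."""
    tag = hmac.new(SECRET, term.encode(), hashlib.sha256).digest()
    return [int.from_bytes(tag[2*i:2*i+2], "big") % M for i in range(K)]

def build_secure_index(terms):
    bits = [0] * M
    for term in terms:
        for p in positions(term):
            bits[p] = 1
    return bits

def oblivious_lookup(bits, trapdoor_positions):
    """Server-side test: sees only bit positions, never the plaintext term."""
    return all(bits[p] for p in trapdoor_positions)

idx = build_secure_index(["encrypted", "search", "index"])
print(oblivious_lookup(idx, positions("search")))   # True
print(oblivious_lookup(idx, positions("banana")))   # False, with high probability
```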
APA, Harvard, Vancouver, ISO, and other styles
26

Kunze, Matthias. "Searching business process models by example." Phd thesis, Universität Potsdam, 2013. http://opus.kobv.de/ubp/volltexte/2013/6884/.

Full text
Abstract:
Business processes are fundamental to the operations of a company. Each product manufactured and every service provided is the result of a series of actions that constitute a business process. Business process management is an organizational principle that makes the processes of a company explicit and offers capabilities to implement procedures, control their execution, analyze their performance, and improve them. Therefore, business processes are documented as process models that capture these actions and their execution ordering, and make them accessible to stakeholders. As these models are an essential knowledge asset, they need to be managed effectively. In particular, the discovery and reuse of existing knowledge becomes challenging in the light of companies maintaining hundreds and thousands of process models. In practice, searching process models has been solved only superficially by means of free-text search of process names and their descriptions. Scientific contributions are limited in their scope, as they either present measures for process similarity or elaborate on query languages to search for particular aspects. However, they fall short in addressing efficient search, the presentation of search results, and the support to reuse discovered models. This thesis presents a novel search method, where a query is expressed by an exemplary business process model that describes the behavior of a possible answer. This method builds upon a formal framework that captures and compares the behavior of process models by the execution ordering of actions. The framework contributes a conceptual notion of behavioral distance that quantifies commonalities and differences of a pair of process models, and enables process model search. Based on behavioral distances, a set of measures is proposed that evaluate the quality of a particular search result to guide the user in assessing the returned matches. A projection of behavioral aspects to a process model enables highlighting relevant fragments that led to a match and facilitates its reuse. The thesis further elaborates on two search techniques that provide concrete behavioral distance functions as an instantiation of the formal framework. Querying enables search with a notion of behavioral inclusion with regard to the query. In contrast, similarity search obtains process models that are similar to a query, even if the query is not precisely matched. For both techniques, indexes are presented that enable efficient search. Methods to evaluate the quality and performance of process model search are introduced and applied to the techniques of this thesis. They show good results with regard to human assessment and scalability in a practical setting.
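A much-simplified sketch of comparing process models by execution ordering: derive 'eventually follows' relations from each model's example traces and take the Jaccard distance between the relation sets. This is our own illustrative stand-in, not the formal behavioral distance defined in the thesis:

```python
def eventually_follows(traces):
    """All ordered activity pairs (a, b) where b occurs after a in some trace."""
    relation = set()
    for trace in traces:
        for i, a in enumerate(trace):
            for b in trace[i + 1:]:
                relation.add((a, b))
    return relation

def behavioral_distance(traces1, traces2):
    r1, r2 = eventually_follows(traces1), eventually_follows(traces2)
    return 1 - len(r1 & r2) / len(r1 | r2) if (r1 | r2) else 0.0

order_process = [["receive", "check", "ship"], ["receive", "ship"]]
query_process = [["receive", "check", "ship"]]
print(round(behavioral_distance(order_process, query_process), 2))  # 0.0: behavior included
```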
APA, Harvard, Vancouver, ISO, and other styles
27

Dietze, Heiko. "GoWeb: Semantic Search and Browsing for the Life Sciences." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2010. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-63267.

Full text
Abstract:
Searching is a fundamental task in research. Current search engines are keyword-based; semantic technologies promise a next generation of semantic search engines that will be able to answer questions. Current approaches either apply natural language processing to unstructured text or assume the existence of structured statements over which they can reason. This work provides a system that combines classical keyword-based search engines with semantic annotation. Conventional search results are annotated using a customized annotation algorithm that takes the textual properties and requirements such as speed and scalability into account. The biomedical background knowledge consists of the Gene Ontology, Medical Subject Headings, and other related entities, e.g. protein/gene names and person names. Together they provide the relevant semantic context for a search engine for the life sciences. We develop the system GoWeb for semantic web search and evaluate it using three benchmarks. It is shown that GoWeb is able to aid question answering with success rates of up to 79%. Furthermore, the system includes semantic hyperlinks that enable semantic browsing of the knowledge space. The semantic hyperlinks facilitate the use of the eScience infrastructure, even complex workflows of composed web services. To complement the web search of GoWeb, other data sources and more specialized information needs are tested in different prototypes, including patent and intranet search. Semantic search is applicable to these usage scenarios, but the developed systems also show the limits of the semantic approach: the size, applicability, and completeness of the integrated ontologies, as well as technical issues of text extraction and metadata gathering. Additionally, semantic indexing is implemented as an alternative approach to semantic search and evaluated with a question answering benchmark. A semantic index can help to answer questions and address some limitations of GoWeb, but the maintenance and optimization of such an index remain a challenge, whereas GoWeb provides a straightforward system.
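The annotation step described here — matching ontology terms in the snippets of conventional search results — can be pictured with a small dictionary-based tagger. This is a minimal sketch, not the customized algorithm from the thesis; the vocabulary entries are invented stand-ins for Gene Ontology and MeSH terms.

# Minimal sketch of dictionary-based semantic annotation of search-result
# snippets. The vocabulary stands in for ontology terms; the real GoWeb
# annotator is more elaborate and tuned for speed and scalability.

# Hypothetical example vocabulary: surface form -> ontology identifier.
VOCABULARY = {
    "apoptosis": "GO:0006915",
    "cell cycle": "GO:0007049",
    "diabetes mellitus": "MeSH:D003920",
}

def annotate(snippet: str) -> list[tuple[str, str]]:
    """Return (term, identifier) pairs found in a result snippet."""
    text = snippet.lower()
    hits = []
    for term, ident in VOCABULARY.items():
        if term in text:
            hits.append((term, ident))
    return hits

print(annotate("New results on apoptosis and the cell cycle in yeast."))
# -> [('apoptosis', 'GO:0006915'), ('cell cycle', 'GO:0007049')]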
APA, Harvard, Vancouver, ISO, and other styles
28

Ozturk, Ozgur. "Feature extraction and similarity-based analysis for proteome and genome databases." The Ohio State University, 2007. http://rave.ohiolink.edu/etdc/view?acc_num=osu1190138805.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Dvořák, Pavel. "Vyhledávání fotografií podle obsahu." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2014. http://www.nusl.cz/ntk/nusl-236108.

Full text
Abstract:
This thesis covers the design and practical realization of a tool for quick search in large image databases, containing from tens to hundreds of thousands of photos, based on image similarity. The proposed technique uses various methods of descriptor extraction, the creation of Bag of Words dictionaries, and methods of storing image data in a PostgreSQL database. Further, experiments with the implemented software were carried out to evaluate search time efficiency and the scaling possibilities of the proposed solution.
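The Bag of Words step mentioned here can be sketched briefly: local descriptors extracted from an image are each assigned to the nearest "visual word" in a pre-trained codebook, and the image is represented by the resulting histogram. A minimal NumPy sketch, assuming the codebook was already built (e.g. by k-means over descriptors from the whole collection):

import numpy as np

def bow_histogram(descriptors: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Map local descriptors (n x d) to a normalized histogram over
    the k visual words of the codebook (k x d)."""
    # Squared Euclidean distance from every descriptor to every word.
    dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)              # nearest visual word per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / max(hist.sum(), 1.0)        # normalize so images are comparable

# Toy usage with random data in place of real SIFT/SURF descriptors.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(32, 64))          # 32 visual words, 64-dim descriptors
descriptors = rng.normal(size=(200, 64))      # 200 descriptors from one image
print(bow_histogram(descriptors, codebook).shape)  # (32,)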
APA, Harvard, Vancouver, ISO, and other styles
30

Figueroa, Portilla Carlos Saussure. "The use of the smartphone as a tool for the search of information in the undergraduate students of Education of a Metropolitan Lima’s university." Pontificia Universidad Católica del Perú, 2016. http://repositorio.pucp.edu.pe/index/handle/123456789/116829.

Full text
Abstract:
Mobile devices like the tablet and the smartphone — the latter especially, for its portability and easy access to the internet — have extended their use to a mass public, which includes university students. This article presents the results of a quantitative study of how information search is carried out through the educational use of the smartphone by the incoming students of the 2015-I cycle of the Faculty of Education at a Metropolitan Lima university, all of whom own a smartphone. To obtain the information for this study, a survey was applied to this group. A synthesis of the results is presented, along with the respective conclusions.
APA, Harvard, Vancouver, ISO, and other styles
31

Reck, Ryan. "Suffix Trees for Document Retrieval." DigitalCommons@CalPoly, 2012. https://digitalcommons.calpoly.edu/theses/773.

Full text
Abstract:
This thesis presents a look at the suitability of suffix trees for full-text indexing and retrieval. Typically, suffix trees are built at the character level, where the tree records which character follows each other character. By building suffix trees for documents based on words instead of characters, the resulting tree effectively indexes every word or sequence of words that occurs in any of the documents. Ukkonen's algorithm is adapted to build word-level suffix trees, but the primary focus is on developing algorithms for searching the suffix tree for exact and approximate, or fuzzy, matches to arbitrary query strings. A proof-of-concept implementation is built and compared to a Lucene index for retrieval over a subset of the Reuters RCV1 data set.
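The word-level indexing idea can be illustrated with a deliberately naive structure: insert every word-level suffix of a document into a trie, so any phrase occurring in the document can be found by walking the trie. This sketch builds the trie quadratically rather than with the adapted Ukkonen construction the thesis uses, and handles only exact matches:

# Naive word-level suffix trie: every suffix of the token sequence is
# inserted, so any contiguous phrase can be located. The thesis builds
# an equivalent structure in linear time and adds fuzzy matching;
# this sketch shows only the exact case.

def build_suffix_trie(tokens, doc_id, trie=None):
    trie = trie if trie is not None else {}
    for i in range(len(tokens)):
        node = trie
        for word in tokens[i:]:
            node = node.setdefault(word, {"_docs": set()})
            node["_docs"].add(doc_id)
    return trie

def phrase_search(trie, phrase_tokens):
    node = trie
    for word in phrase_tokens:
        if word not in node:
            return set()
        node = node[word]
    return node["_docs"]

trie = build_suffix_trie("the quick brown fox".split(), doc_id=1)
trie = build_suffix_trie("the quick red fox".split(), doc_id=2, trie=trie)
print(phrase_search(trie, ["quick", "brown"]))  # {1}
print(phrase_search(trie, ["the", "quick"]))    # {1, 2}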
APA, Harvard, Vancouver, ISO, and other styles
32

Lengua, Parra Adrián, and Ana Paula Mendoza. "A pending issue that does not disappear: the need to implement a policy of search of missing persons parting from the establishment of a central agency in the Peruvian State." THĒMIS-Revista de Derecho, 2016. http://repositorio.pucp.edu.pe/index/handle/123456789/109008.

Full text
Abstract:
As a product of the armed violence and the human rights violations committed in the eighties and nineties, the Peruvian government initiated a process of transitional justice in order to compensate the victims and reconcile a fragmented and divided society. However, there are still pending issues in this matter. One of them is the search for missing persons. The present article delves into the importance of a policy for the search of missing persons in light of the Peruvian State's international obligations on human rights, and analyzes the weaknesses of its judicial actions in accomplishing this task. The need for a centralized organism in charge of this function is sustained, and a normative proposal for its implementation in the Peruvian legal system is presented.
APA, Harvard, Vancouver, ISO, and other styles
33

Copland, Gordon Arthur. "A House for the Governor: Settlement Theory, the South Australian Experiment, and the Search for the First Government House." Flinders University. Education, Theology, Law, Humanities, 2006. http://catalogue.flinders.edu.au./local/adt/public/adt-SFU20061010.104925.

Full text
Abstract:
This thesis considers the human spatial occupational behaviour generically called 'settlement'. Within this process, a diagnostic index of settlement is created to assist in analysing, defining, and exploring the parameters of 'Settlement Theory'. There is particular reference to Edward Gibbon Wakefield's Theory of Systematic Colonisation in South Australia, as it is one of the few settlement theories actually put into practice. Two case studies are examined to develop a transitional argument that connects theory to material outcome. The first considers the macro implications of theory and material culture by comparing the implementation of Wakefield's theory (The South Australian Experiment) with the site, design, and Government Domain of the capital (Adelaide). The second considers the micro effect of the theory on material culture in the form of the Governor's residence between 1836 and 1856, including the search for the first Government House (Government Hut), to test the connection at this level.
APA, Harvard, Vancouver, ISO, and other styles
34

Granell, Albin, and Filip Carlsson. "How Google Search Trends Can Be Used as Technical Indicators for the S&P500-Index : A Time Series Analysis Using Granger’s Causality Test." Thesis, KTH, Matematisk statistik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-228740.

Full text
Abstract:
This thesis studies whether Google search trends can be used as indicators for movements in the S&P 500 index. Using Granger's causality test, the level of causality between movements in the S&P 500 index and Google search volumes for certain keywords is analyzed. The result of the analysis is used to form an investment strategy entirely based on Google search volumes, which is then backtested over a five-year period using historic data. The causality tests show that 8 of 30 words indicate causality at a 10% level of significance, where one word, mortgage, indicates causality at a 1% level of significance. Several investment strategies based on search volumes yield higher returns than the index itself over the considered five-year period, where the best-performing strategy beats the index by over 60 percentage points.
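Granger's causality test, as used here, checks whether lagged search volumes improve the prediction of index returns beyond what past returns alone achieve. A minimal sketch with statsmodels, using synthetic series in place of the actual Google Trends and S&P 500 data:

import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import grangercausalitytests

# Synthetic stand-ins for weekly index returns and search volume changes;
# the search series is built to lead the return series by one week.
rng = np.random.default_rng(42)
search = rng.normal(size=200)
returns = 0.5 * np.roll(search, 1) + rng.normal(scale=0.5, size=200)

# grangercausalitytests expects a 2-column array and tests whether the
# SECOND column Granger-causes the FIRST, for every lag up to maxlag.
data = pd.DataFrame({"returns": returns[1:], "search": search[1:]})
results = grangercausalitytests(data[["returns", "search"]], maxlag=3)

# For each lag, the F-test p-value indicates whether past search volumes
# add predictive power for returns.
for lag, res in results.items():
    print(lag, res[0]["ssr_ftest"][1])  # p-value of the F-test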
APA, Harvard, Vancouver, ISO, and other styles
35

Kropf, Carsten [Verfasser], Thorsten [Akademischer Betreuer] [Gutachter] Claus, Richard [Akademischer Betreuer] [Gutachter] Göbel, and Alexander [Akademischer Betreuer] [Gutachter] Schill. "Efficient Reorganisation of Hybrid Index Structures Supporting Multimedia Search Criteria / Carsten Kropf ; Gutachter: Thorsten Claus, Richard Göbel, Alexander Schill." Dresden : Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2017. http://d-nb.info/1125850914/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Liu, Danzhou. "EFFICIENT TECHNIQUES FOR RELEVANCE FEEDBACK PROCESSING IN CONTENT-BASED IMAGE RETRIEVAL." Doctoral diss., University of Central Florida, 2009. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/2991.

Full text
Abstract:
In content-based image retrieval (CBIR) systems, there are two general types of search: target search and category search. Unlike queries in traditional database systems, users in most cases cannot specify an ideal query to retrieve the desired results for either target search or category search in multimedia database systems, and have to rely on iterative feedback to refine their query. Efficient evaluation of such iterative queries can be a challenge, especially when the multimedia database contains a large number of entries, when the search needs many iterations, and when the underlying distance measure is computationally expensive. The overall processing costs, including CPU and disk I/O, are further emphasized if there are numerous concurrent accesses. To address these limitations involved in relevance feedback processing, we propose a generic framework, including a query model, index structures, and query optimization techniques. Specifically, this thesis has five main contributions, as follows. The first contribution is an efficient target search technique. We propose four target search methods: naive random scan (NRS), local neighboring movement (LNM), neighboring divide-and-conquer (NDC), and global divide-and-conquer (GDC). All these methods are built around a common strategy: they do not retrieve checked images (i.e., they shrink the search space). Furthermore, NDC and GDC exploit Voronoi diagrams to aggressively prune the search space and move towards target images. We theoretically and experimentally prove that the convergence speeds of GDC and NDC are much faster than those of NRS and recent methods. The second contribution is a method to reduce the number of expensive distance computations when answering k-NN queries with non-metric distance measures. We propose an efficient distance mapping function that transforms non-metric measures into metric ones while preserving the original distance orderings. Existing metric index structures (e.g., the M-tree) can then be used to reduce the computational cost by exploiting the triangle inequality. The third contribution is an incremental query processing technique for Support Vector Machines (SVMs). SVMs have been widely used in multimedia retrieval to learn a concept in order to find the best matches, but they suffer from a scalability problem as database sizes grow. To address this limitation, we propose an efficient query evaluation technique that employs incremental updates. The proposed technique also takes advantage of a tuned index structure to efficiently prune irrelevant data, so that only a small portion of the data set needs to be accessed for query processing. This index structure also provides an inexpensive means to process the set of candidates when evaluating the final query result, and the technique works with different kernel functions and kernel parameters. The fourth contribution is a method to avoid local optimum traps. Existing CBIR systems, designed around query refinement based on relevance feedback, suffer from local optimum traps that may severely impair overall retrieval performance. We therefore propose a simulated annealing-based approach to address this important issue: when the search becomes stuck at a local optimum, we employ a neighborhood search technique (i.e., simulated annealing) to continue the search for additional matching images, thus escaping from the local optimum. We also propose an index structure to speed up such neighborhood searches. Finally, the fifth contribution is a generic framework to support concurrent accesses. We develop new storage and query processing techniques to exploit sequential access and leverage inter-query concurrency to share computation. Our experimental results, based on the Corel dataset, indicate that the proposed optimizations can significantly reduce average response time while achieving better precision and recall, and that the framework is scalable to support a large user community. This latter performance characteristic is largely neglected in existing systems, making them less suitable for large-scale deployment. With the growing interest in Internet-scale image search applications, our framework offers an effective solution to the scalability problem.
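The common strategy behind the four target-search methods — never revisit images the user has already judged, so the candidate space shrinks every iteration — can be sketched in a few lines. This illustrates only the naive random scan (NRS) baseline; NDC and GDC add Voronoi-based pruning on top of this idea:

import random

def naive_random_scan(image_ids, is_target, batch_size=8, seed=0):
    """Iteratively show random unchecked images until the target is found.
    `is_target` stands in for the user's relevance feedback."""
    rng = random.Random(seed)
    unchecked = list(image_ids)
    iterations = 0
    while unchecked:
        iterations += 1
        n = min(batch_size, len(unchecked))
        # Checked images are removed, so the search space only shrinks.
        batch = [unchecked.pop(rng.randrange(len(unchecked))) for _ in range(n)]
        for image in batch:
            if is_target(image):
                return image, iterations
    return None, iterations

found, rounds = naive_random_scan(range(1000), is_target=lambda i: i == 321)
print(found, rounds)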
APA, Harvard, Vancouver, ISO, and other styles
37

Yang, Tony King. "The needs of a lifetime the search for security, 1865-1914 /." Diss., [Riverside, Calif.] : University of California, Riverside, 2009. http://proquest.umi.com/pqdweb?index=6&did=1957340911&SrchMode=2&sid=1&Fmt=2&VInst=PROD&VType=PQD&RQT=309&VName=PQD&TS=1269458235&clientId=48051.

Full text
Abstract:
Thesis (Ph. D.)--University of California, Riverside, 2009. Includes abstract. Includes color illustrations. Available via ProQuest Digital Dissertations. Title from first page of PDF file (viewed March 24, 2010). Includes bibliographical references (p. 231-239). Also issued in print.
APA, Harvard, Vancouver, ISO, and other styles
38

The, Paw Liang. "In search of unity for the Methodist Church in Indonesia." Available from ProQuest, 2008. http://proquest.umi.com.ezproxy.drew.edu/pqdweb?index=0&sid=2&srchmode=2&vinst=PROD&fmt=6&startpage=-1&clientid=10355&vname=PQD&RQT=309&did=1626382391&scaling=FULL&ts=1263925423&vtype=PQD&rqt=309&TS=1263925429&clientId=10355.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Yaghmaei, Sepideh. "In search of a low barrier hydrogen bond in proton-bridged diamines." Diss., [Riverside, Calif.] : University of California, Riverside, 2009. http://proquest.umi.com/pqdweb?index=0&did=1663079761&SrchMode=2&sid=2&Fmt=2&VInst=PROD&VType=PQD&RQT=309&VName=PQD&TS=1269364839&clientId=48051.

Full text
Abstract:
Thesis (Ph. D.)--University of California, Riverside, 2009. Includes abstract. Available via ProQuest Digital Dissertations. Title from first page of PDF file (viewed March 23, 2010). Includes bibliographical references. Also issued in print.
APA, Harvard, Vancouver, ISO, and other styles
40

Pal, Anibrata. "Multi-objective optimization in learn to pre-compute evidence fusion to obtain high quality compressed web search indexes." Universidade Federal do Amazonas, 2016. http://tede.ufam.edu.br/handle/tede/5128.

Full text
Abstract:
The world of information retrieval revolves around web search engines. Text search engines are one of the most important sources for finding information; they index huge volumes of data and handle billions of documents. Learning-to-rank methods have been adopted in recent years to generate high-quality answers for search engines. The ultimate goal of these systems is to provide high-quality results and, at the same time, reduce the computational time for query processing. Directly related to this, reading from smaller, more compact indexes accelerates data access and thus reduces computational time during query processing. In this thesis we study using a learning-to-rank method not only to produce a high-quality ranking of search results, but also to optimize another important aspect of search systems: the compression achieved in their indexes. We show that it is possible to achieve impressive gains in search engine index compression with virtually no loss in the final quality of results by using simple, yet effective, multi-objective optimization techniques in the learning process. We also used basic pruning techniques to assess the impact of pruning on index compression. In our best approach, we were able to achieve more than 40% compression of the existing index while keeping the quality of results on par with methods that disregard compression.
APA, Harvard, Vancouver, ISO, and other styles
41

Holzapfel, Christina [Verfasser], Johann Josef [Akademischer Betreuer] Hauner, Thomas [Akademischer Betreuer] Illig, and Martin [Akademischer Betreuer] Halle. "Search for single nucleotide polymorphisms (SNPs) for weight loss and lifestyle factors associated with body mass index / Christina Holzapfel. Gutachter: Johann J. Hauner ; Thomas Illig ; Martin Halle. Betreuer: Johann J. Hauner." München : Universitätsbibliothek der TU München, 2011. http://d-nb.info/101433053X/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

White, Barbara Jo. "Evaluating the impact of typical images for visual query formulation on search efficacy /." Full text available from ProQuest UM Digital Dissertations, 2005. http://0-proquest.umi.com.umiss.lib.olemiss.edu/pqdweb?index=0&did=1253473101&SrchMode=1&sid=3&Fmt=2&VInst=PROD&VType=PQD&RQT=309&VName=PQD&TS=1193754304&clientId=22256.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Kyjovský, Marek. "Extrakce klíčových slov z vědeckých článků." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2010. http://www.nusl.cz/ntk/nusl-237148.

Full text
Abstract:
The main goal of this thesis is to explore the basic methods used for extracting important words from articles, and to understand how keywords are used across an available set of English test articles. Based on these findings, a system using these methods is designed and implemented. The resulting system is then tested on real English articles and the results are analysed.
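One of the basic methods such work typically surveys, TF-IDF weighting, can be demonstrated with scikit-learn; the articles below are toy stand-ins for the English test set, not data from the thesis:

from sklearn.feature_extraction.text import TfidfVectorizer

articles = [
    "Suffix trees support fast substring search in large texts.",
    "Inverted indexes map words to the documents containing them.",
    "Search engines rank documents using term weighting schemes.",
]

vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(articles)
terms = vectorizer.get_feature_names_out()

# The highest-weighted terms of each article are its keyword candidates.
for row in range(tfidf.shape[0]):
    scores = tfidf[row].toarray().ravel()
    top = scores.argsort()[::-1][:3]
    print([terms[i] for i in top])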
APA, Harvard, Vancouver, ISO, and other styles
44

FINOMORE, VICTOR S. JR. "EFFECTS OF FEATURE PRESENCE/ABSENCE AND EVENT ASYNCHRONY ON VIGILANCE PERFORMANCE AND PERCEIVED MENTAL WORKLOAD." University of Cincinnati / OhioLINK, 2006. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1143732659.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Kalmegh, Prajakta. "Image mining methodologies for content based retrieval." Thesis, Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/39587.

Full text
Abstract:
The thesis presents a system for content-based image retrieval and mining. The research presents the design of a scalable solution for efficient retrieval of images from large image databases using image features such as color, shape, and texture. A framework is proposed for automatic labeling of images and clustering of metadata in the database based on the dominant shapes, textures, and colors in the image. The thesis also presents a new image tagging methodology that annotates the dominant image features to the image as metadata. Users of this system can input a query image and select similar-image retrieval criteria by choosing a feature type from among color, texture, or shape. The system retrieves images from the database that match the specified pattern and displays them by relevance. The user can also enter a set of keywords, or a combination of keywords, that form the input text query; images in the database that match the input text query are fetched and displayed. This enables content-based similar-image search even for text-based queries. An efficient clustering algorithm is shown to improve image retrieval by an order of magnitude.
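The color feature used for retrieval can be pictured as a simple histogram signature compared with a distance measure. A minimal NumPy sketch, assuming a joint RGB quantization (real systems typically quantize in a perceptual color space and add shape and texture features):

import numpy as np

def color_histogram(image: np.ndarray, bins_per_channel: int = 4) -> np.ndarray:
    """Joint RGB histogram of an H x W x 3 uint8 image, normalized to sum 1."""
    quantized = (image.astype(int) * bins_per_channel) // 256   # 0..bins-1
    codes = ((quantized[..., 0] * bins_per_channel + quantized[..., 1])
             * bins_per_channel + quantized[..., 2])
    hist = np.bincount(codes.ravel(), minlength=bins_per_channel ** 3)
    return hist / hist.sum()

def histogram_distance(h1: np.ndarray, h2: np.ndarray) -> float:
    return float(np.abs(h1 - h2).sum())  # L1 distance between signatures

rng = np.random.default_rng(1)
query = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
candidate = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
print(histogram_distance(color_histogram(query), color_histogram(candidate)))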
APA, Harvard, Vancouver, ISO, and other styles
46

Hansén, Jacob, and Axel Gustafsson. "A Study on Comparison Websites in the Airline Industry and Using CART Methods to Determine Key Parameters in Flight Search Conversion." Thesis, KTH, Matematisk statistik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-254309.

Full text
Abstract:
This bachelor thesis in applied mathematics and industrial engineering and management aimed to identify relationships between search parameters in flight comparison search engines and the exit conversion rate, while also investigating how the emergence of such comparison search engines has impacted the airline industry. To identify such relationships, several classification models were employed in conjunction with several sampling methods to produce a predictive model using the program R. To investigate the impact of the emergence of comparison websites, Porter's five forces and a SWOT analysis were employed to analyze the findings of a literature study and a qualitative interview. The classification models developed performed poorly with regard to several assessment metrics, which suggested little to no significant relationship between the search parameters investigated and the exit conversion rate. Porter's five forces and the SWOT analysis suggested that the airline industry has become more competitive and that airlines which do not manage to adapt to this changing market environment will experience decreasing profitability.
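The CART idea of the title — fitting a binary decision tree to classify which searches convert — can be sketched with scikit-learn rather than R. The data and feature names below are invented stand-ins for the flight-search parameters studied, not the thesis's dataset:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: each row is one flight search, the label says
# whether it ended in an exit conversion. Feature names are hypothetical.
rng = np.random.default_rng(7)
X = np.column_stack([
    rng.integers(0, 60, 1000),     # days until departure
    rng.integers(1, 5, 1000),      # number of passengers
    rng.integers(0, 2, 1000),      # round-trip flag
])
y = ((X[:, 0] < 21) ^ (rng.random(1000) < 0.1)).astype(int)  # noisy rule

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=7)
tree = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)
print("test accuracy:", tree.score(X_test, y_test))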
APA, Harvard, Vancouver, ISO, and other styles
47

Yasin, Zafar. "Search for the decay of a charged B meson to a charged rho meson and a f₀(980) meson." Diss., UC access only, 2009. http://proquest.umi.com/pqdweb?index=138&did=1907183711&SrchMode=1&sid=1&Fmt=7&retrieveGroup=0&VType=PQD&VInst=PROD&RQT=309&VName=PQD&TS=1270493424&clientId=48051.

Full text
Abstract:
Thesis (Ph. D.)--University of California, Riverside, 2009.<br>Includes abstract. Includes bibliographical references (leaves 104-105). Issued in print and online. Available via ProQuest Digital Dissertations.
APA, Harvard, Vancouver, ISO, and other styles
48

Santos, Célia Francisca dos. "Métodos de poda estática para índices de máquinas de busca." Universidade Federal do Amazonas, 2006. http://tede.ufam.edu.br/handle/tede/2944.

Full text
Abstract:
This work proposes and experimentally evaluates new static pruning methods specially designed for web search engines. The methods take the locality of term occurrences within documents into account when pruning search engine indexes and are therefore called "locality-based pruning methods". Four new pruning methods that use locality information are proposed here: two-pass lbpm, full coverage, top fragments, and random. The two-pass lbpm method is the most effective of the locality-based methods, but requires a complete construction of the indexes before the pruning process. In contrast, full coverage, top fragments, and random are single-pass methods that prune the indexes without requiring prior construction of the original indexes. Single-pass methods are useful for environments where the document base changes continuously, as in large-scale web search engines. Experiments using a real search engine show that the methods proposed in this work can reduce index storage costs by up to 60% while keeping precision loss minimal. More importantly, the experimental results indicate that this same 60% reduction in index size can reduce query processing time to almost 57% of the original time. Furthermore, the experiments show that, for conjunctive queries and phrases, the locality-based methods produce better results than Carmel's method, the best method proposed in the literature. For example, using only phrase queries, with a 67% reduction in index size, the locality-based two-pass lbpm method produced results with a similarity of 0.71 relative to the results obtained with the original indexes, while Carmel's method produced results with a similarity of only 0.39. The results show that locality-based pruning methods are more effective at preserving the quality of the results provided by search engines.
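The effect of static pruning on an inverted index can be sketched with a simple score-based criterion: keep only the top-scoring postings of each term. The locality-based methods above use term positions within documents instead; this sketch shows only the general top-k idea, in the spirit of Carmel-style baselines:

def prune_index(inverted_index, keep_fraction=0.4):
    """Keep only the highest-scoring fraction of each posting list.
    Postings are (doc_id, score) pairs; score might be a TF-IDF weight."""
    pruned = {}
    for term, postings in inverted_index.items():
        k = max(1, int(len(postings) * keep_fraction))
        best = sorted(postings, key=lambda p: p[1], reverse=True)[:k]
        pruned[term] = sorted(best)              # restore doc-id order
    return pruned

index = {
    "search": [(1, 0.9), (2, 0.2), (3, 0.6), (4, 0.1)],
    "index":  [(1, 0.5), (3, 0.8)],
}
print(prune_index(index, keep_fraction=0.5))
# {'search': [(1, 0.9), (3, 0.6)], 'index': [(3, 0.8)]}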
APA, Harvard, Vancouver, ISO, and other styles
49

Chu, Po-Ju (朱柏儒). "The Research of Search Volume Index and TSEC Taiwan 50 Index." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/17379830499250405335.

Full text
Abstract:
碩士<br>國立高雄應用科技大學<br>金融系金融資訊碩士班<br>105<br>Search volume index Taking Google search trends launched as a proxy variable investor attention, and the use of financing growth, the growth rate of foreign ownership and the growth rate of investment trust shares as sentiment indicators, the study of Taiwan's 50 components stocks, selected from 2010 to 2015 during the weekly frequency data sample after screening 31 stalls stocks. The empirical results show: 1. Search volume index of abnormal changes can only be interpreted week stock-based compensation, was unable to make a forecast for the next week and the next week the stock returns because investors in the week of the search term stocks of interest, then possible Trade and push up stock prices, but this effect can only be maintained within a week, at one week and the next two weeks search behavior is not obvious. 2. In the week and the week before, the volume buy and sell of foreign investment and investment trust of the impacts is more consistent direction, opposite to the individual investors operation was. 3. When the search volume index rises abnormal in the previous week, the use of financing growth and the growth rate increased foreign ownership would make a stock compensation decreased in this week.
APA, Harvard, Vancouver, ISO, and other styles
50

Patel, Hiren. "Inverted Index Partitioning Strategies for a Distributed Search Engine." Thesis, 2010. http://hdl.handle.net/10012/5683.

Full text
Abstract:
One of the greatest challenges in information retrieval is to develop an intelligent system for user and machine interaction that supports users in their quest for relevant information. The dramatic increase in the amount of Web content gives rise to the need for a large-scale distributed information retrieval system, targeted to support millions of users and terabytes of data. To retrieve information from such a large amount of data in an efficient manner, the index is split among the servers in a distributed information retrieval system. Thus, partitioning the index among these collaborating nodes plays an important role in enhancing the performance of a distributed search engine. The two widely known inverted index partitioning schemes for a distributed information retrieval system are document partitioning and term partitioning. In a document-partitioned system, each server hosts a subset of the documents in the collection and executes every query against its local sub-collection. In a term-partitioned index, each node is responsible for a subset of the terms in the collection, and serves them to a central node as they are required for query evaluation. In this thesis, we introduce the Document over Term inverted index distribution scheme, which splits a set of nodes into several groups (sub-clusters) and then performs document partitioning between the groups and term partitioning within each group. As this approach is based on the term and document index partitioning approaches, we also refer to it as a Hybrid Inverted Index. This approach retains the disk access benefits of term partitioning and the benefits of shared computational load, scalability, maintainability, and availability of document partitioning. We also introduce the Document over Document index partitioning scheme, based on the document partitioning approach. In this approach, a set of nodes is split into groups, and documents in the collection are partitioned between groups and also within each group. This strategy retains all the benefits of the document partitioning approach, but reduces the computational load more effectively and uses resources more efficiently. We compare distributed index approaches experimentally and show that, in terms of efficiency and scalability, document partition-based approaches perform significantly better than the others. The Document over Term partitioning offers efficient utilization of search servers and lowers disk access, but suffers from the problem of load imbalance. The Document over Document partitioning emerged as the preferred method under high workload.
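The Document over Term scheme can be pictured as two routing steps: a document is first assigned to a sub-cluster (document partitioning between groups), and within that group each of its terms is assigned to a node (term partitioning within the group). A minimal sketch of the assignment logic, assuming simple hash-based routing rather than the thesis's actual assignment policy:

def assign_posting(doc_id: int, term: str, num_groups: int, nodes_per_group: int):
    """Return (group, node) holding the posting for `term` of `doc_id`
    under Document-over-Term (hybrid) partitioning."""
    group = doc_id % num_groups             # document partitioning between groups
    node = hash(term) % nodes_per_group     # term partitioning within the group
    return group, node                      # a real system would use a stable hash

# Every posting of one document stays inside one group but is spread over
# that group's nodes by term, so a query touches one node per query term
# in every group.
for term in ["inverted", "index", "partitioning"]:
    print(term, assign_posting(doc_id=42, term=term, num_groups=4, nodes_per_group=8))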
APA, Harvard, Vancouver, ISO, and other styles