
Dissertations / Theses on the topic 'Data structures and algorithms for data management'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Data structures and algorithms for data management.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Karras, Panagiotis. "Data structures and algorithms for data representation in constrained environments." Thesis, The University of Hong Kong (HKUTO), 2007. http://sunzi.lib.hku.hk/hkuto/record/B38897647.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

蔡纓 and Ying Choi. "Improved data structures for two-dimensional library management and dictionary problems." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1996. http://hub.hku.hk/bib/B31213029.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Choi, Ying. "Improved data structures for two-dimensional library management and dictionary problems." Hong Kong : University of Hong Kong, 1996. http://sunzi.lib.hku.hk/hkuto/record.jsp?B18037240.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Upadhyay, Abhyudaya. "Big Vector: An External Memory Algorithm and Data Structure." University of Cincinnati / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1439279714.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Johansen, Valdemar. "Object serialization vs relational data modelling in Apache Cassandra: a performance evaluation." Thesis, Blekinge Tekniska Högskola, Institutionen för datalogi och datorsystemteknik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-10391.

Full text
Abstract:
Context. In newer database solutions designed for large-scale, cloud-based services, database performance is of particular concern as these services face scalability challenges due to I/O bottlenecks. These issues can be alleviated through various data model optimizations that reduce I/O loads. Object serialization is one such approach. Objectives. This study investigates the performance of serialization using the Apache Avro library in the Cassandra database. Two different serialized data models are compared with a traditional relational database model. Methods. This study uses an experimental approach that compares read and write latency using Twitter data in JSON format. Results. Avro serialization is found to improve performance. However, the extent of the performance benefit is found to be highly dependent on the serialization granularity defined by the data model. Conclusions. The study concludes that developers seeking to improve database throughput in Cassandra through serialization should prioritize data model optimization as serialization by itself will not outperform relational modelling in all use cases. The study also recommends that further work is done to investigate additional use cases, as there are potential performance issues with serialization that are not covered in this study.
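The serialization-granularity trade-off described above can be made concrete with a small sketch. The thesis uses Apache Avro with Cassandra; the sketch below instead uses Python's standard json module as a stand-in serializer, and the tweet fields are invented for illustration, so it only mirrors the data-model choice, not the evaluated system.

```python
import json

# A simplified tweet record (hypothetical fields, not the thesis dataset schema).
tweet = {"id": 42, "user": "alice", "text": "hello", "retweets": 7}

# Coarse granularity: the whole record becomes one serialized blob.
# One column read/write per tweet, but any access decodes everything.
coarse_row = {"id": 42, "blob": json.dumps(tweet)}

# Fine granularity: each field is its own column.
# Individual fields can be read cheaply, at the cost of more columns per row.
fine_row = dict(tweet)

def read_user(row):
    """Reading one field illustrates the trade-off."""
    if "blob" in row:
        return json.loads(row["blob"])["user"]   # must decode the full blob
    return row["user"]                            # direct column access

print(read_user(coarse_row), read_user(fine_row))
```

Reading a single field from the coarse row forces a full decode, which is the kind of cost the chosen serialization granularity determines.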
APA, Harvard, Vancouver, ISO, and other styles
6

Fischer, Johannes. "Data Structures for Efficient String Algorithms." Diss., lmu, 2007. http://nbn-resolving.de/urn:nbn:de:bvb:19-75053.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Ochoa, Méndez Carlos Ernesto. "Synergistic (Analysis of) algorithms and data structures." Tesis, Universidad de Chile, 2019. http://repositorio.uchile.cl/handle/2250/170961.

Full text
Abstract:
Thesis submitted for the degree of Doctor of Sciences, specialization in Computer Science.
Current refinements of worst-case analysis over instances of fixed input size consider either the order of the input (for example, the sorted subsequences in a sequence of numbers, or the simple polygonal chains into which a sequence of points can be partitioned) or the structure of the input (for example, the multiplicity of the elements in a multiset, or the relative positions among a set of points in the plane), but never, to the best of our knowledge, both at the same time. This thesis proposes new techniques that combine solutions exploiting the order and the structure of the input into a single synergistic solution for sorting multisets, and for computing the Pareto front and the convex hull of a set of points in the plane. These synergistic solutions take advantage of the order and the structure of the input in such a way that they asymptotically outperform any comparable solution that exploits only one of these features. As intermediate results, several merging algorithms are described and analyzed: an algorithm for merging sorted sequences that is optimal for every instance of the problem; the first adaptive algorithm for merging Pareto fronts; and an adaptive algorithm for merging convex hulls in the plane. These three algorithms are based on a paradigm in which the structures are split before being merged. This paradigm extends conveniently to the setting where queries are answered. Karp et al. (1998) described deferred data structures as 'lazy' structures that process the input gradually as they answer queries about the data, doing as little work as possible in the worst case over instances of fixed size and a fixed number of queries. This thesis develops new techniques to refine these results further, taking advantage at the same time of the order and structure of the input and of the order and structure of the query sequence, in three distinct problems: computing the rank and the position of an element in a multiset, deciding whether a point is dominated by the Pareto front of a set of points in the plane, and deciding whether a point belongs to the convex hull of a set of points in the plane. The resulting deferred data structures outperform all previous solutions that exploit only a subset of these features. As a natural extension of the synergistic results obtained in this work for sorting a multiset, compressed data structures are described that exploit the order and structure of the input to represent a multiset while answering rank and position queries over its elements.
CONICYT-PCHA/Doctorado Nacional/2013-63130161, and CONICYT Fondecyt/Regular projects nos. 1120054 and 1170366.
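One ingredient of the synergistic approach above, exploiting pre-existing order in the input, can be illustrated with a small run-decomposition sort. This is only a hedged sketch of the 'order' side; the multiset-structure side (exploiting repeated elements) and the instance-optimal merging analysed in the thesis are not reproduced.

```python
import heapq

def ascending_runs(seq):
    """Split seq into maximal non-decreasing runs (exploits existing order)."""
    runs, run = [], [seq[0]]
    for x in seq[1:]:
        if x >= run[-1]:
            run.append(x)
        else:
            runs.append(run)
            run = [x]
    runs.append(run)
    return runs

def adaptive_sort(seq):
    """Run-decomposition sort: cheap when the input already contains long runs."""
    if not seq:
        return []
    return list(heapq.merge(*ascending_runs(seq)))

print(adaptive_sort([1, 2, 5, 3, 4, 4, 0, 9]))
```

On inputs that already consist of a few long non-decreasing runs, the merge step does far less work than a from-scratch sort, which is the kind of adaptivity the abstract refers to.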
APA, Harvard, Vancouver, ISO, and other styles
8

Tsanakas, Panagiotis D. "Algorithms and data structures for hierarchical image processing." Ohio : Ohio University, 1985. http://www.ohiolink.edu/etd/view.cgi?ohiou1184075678.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Allen, Sam D. "Algorithms and data structures for three-dimensional packing." Thesis, University of Nottingham, 2011. http://eprints.nottingham.ac.uk/12779/.

Full text
Abstract:
Cutting and packing problems are increasingly prevalent in industry. A well utilised freight vehicle will save a business money when delivering goods, as well as reducing the environmental impact, when compared to sending out two lesser-utilised freight vehicles. A cutting machine that generates less wasted material will have a similar effect. Industry reliance on automating these processes and improving productivity is increasing year-on-year. This thesis presents a number of methods for generating high quality solutions for these cutting and packing challenges. It does so in a number of ways. A fast, efficient framework for heuristically generating solutions to large problems is presented, and a method of incrementally improving these solutions over time is implemented and shown to produce even higher packing utilisations. The results from these findings provide the best known results for 28 out of 35 problems from the literature. This framework is analysed and its effectiveness shown over a number of datasets, along with a discussion of its theoretical suitability for higher-dimensional packing problems. A way of automatically generating new heuristics for this framework that can be problem specific, and therefore highly tuned to a given dataset, is then demonstrated and shown to perform well when compared to the expert-designed packing heuristics. Finally some mathematical models which can guarantee the optimality of packings for small datasets are given, and the (in)effectiveness of these techniques discussed. The models are then strengthened and a novel model presented which can handle much larger problems under certain conditions. The thesis finishes with a discussion about the applicability of the different approaches taken to the real-world problems that motivate them.
APA, Harvard, Vancouver, ISO, and other styles
10

Parry-Smith, David John. "Algorithms and data structures for protein sequence analysis." Thesis, University of Leeds, 1990. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.277404.

Full text
APA, Harvard, Vancouver, ISO, and other styles
11

Tor, S. B. "Geometric algorithms and data structures for CAD/CAM." Thesis, University of Westminster, 1985. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.356320.

Full text
APA, Harvard, Vancouver, ISO, and other styles
12

Ochi, Hiroyuki. "Algorithms and Data Structures for Manipulating Boolean Functions." Kyoto University, 1993. http://hdl.handle.net/2433/74648.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

Toss, Julio. "Parallel algorithms and data structures for interactive applications." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2017. http://hdl.handle.net/10183/172043.

Full text
Abstract:
The quest for performance has been a constant through the history of computing systems. It has been more than a decade now since the sequential processing model showed its first signs of exhaustion in sustaining performance improvements. The walls facing sequential computation pushed a paradigm shift and established parallel processing as the standard in modern computing systems. With the widespread adoption of parallel computers, many algorithms and applications have been ported to fit these new architectures. However, in unconventional applications with interactivity and real-time requirements, achieving efficient parallelizations is still a major challenge. The real-time performance requirement shows up, for instance, in user-interactive simulations where the system must be able to react to the user's input within a computation time-step of the simulation loop. The same kind of constraint appears in streaming data monitoring applications: for instance, when an external source of data, such as traffic sensors or social media posts, provides a continuous flow of information to be consumed by an online analysis system, the consumer system has to keep a controlled memory budget and deliver processed information about the stream quickly. Common optimizations relying on pre-computed models or static indexes of the data are not possible in these highly dynamic scenarios. The dynamic nature of the data brings up several performance issues originating from the problem decomposition for parallel processing and from the data locality maintenance needed for efficient cache utilization. In this thesis we address data-dependent problems in two different applications: one on physically based simulations and another on streaming data analysis. For the simulation problem, we present a parallel GPU algorithm for computing multiple shortest paths and Voronoi diagrams on a grid-like graph. Our contribution to the streaming data analysis problem is a parallelizable data structure, based on packed memory arrays, for indexing dynamic geo-located data while keeping good memory locality.
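The graph Voronoi idea mentioned above can be sketched sequentially: every grid cell is labelled with its nearest seed under hop distance using a multi-source breadth-first search. This is only an illustrative CPU version; the thesis's contribution is a parallel GPU algorithm on grid-like graphs, which this sketch does not reproduce.

```python
from collections import deque

def grid_voronoi(rows, cols, seeds):
    """Label every cell of a rows x cols grid with the index of the nearest seed
    (hop distance), via multi-source breadth-first search."""
    owner = [[-1] * cols for _ in range(rows)]
    queue = deque()
    for i, (r, c) in enumerate(seeds):
        owner[r][c] = i
        queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and owner[nr][nc] == -1:
                owner[nr][nc] = owner[r][c]
                queue.append((nr, nc))
    return owner

for row in grid_voronoi(4, 6, [(0, 0), (3, 5)]):
    print(row)
```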
APA, Harvard, Vancouver, ISO, and other styles
14

Mostaghim, Sanaz. "Multi-objective evolutionary algorithms: data structures, convergence, and diversity." [S.l. : s.n.], 2004. http://deposit.ddb.de/cgi-bin/dokserv?idn=974405604.

Full text
APA, Harvard, Vancouver, ISO, and other styles
15

Chen, Calvin Ching-Yuen. "Efficient Parallel Algorithms and Data Structures Related to Trees." Thesis, University of North Texas, 1991. https://digital.library.unt.edu/ark:/67531/metadc332626/.

Full text
Abstract:
The main contribution of this dissertation proposes a new paradigm, called the parentheses matching paradigm. It claims that this paradigm is well suited for designing efficient parallel algorithms for a broad class of nonnumeric problems. To demonstrate its applicability, we present three cost-optimal parallel algorithms for breadth-first traversal of general trees, sorting a special class of integers, and coloring an interval graph with the minimum number of colors.
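The parentheses matching primitive that the paradigm above is named after can be stated very simply in its sequential form; the dissertation's contribution is showing how such pairings (and problems reducible to them) can be computed cost-optimally in parallel, which this sketch does not attempt.

```python
def match_parentheses(s):
    """Return a dict mapping the index of each '(' to the index of its ')'.
    Sequential stack version of the primitive; the input is assumed balanced."""
    stack, match = [], {}
    for i, ch in enumerate(s):
        if ch == '(':
            stack.append(i)
        elif ch == ')':
            match[stack.pop()] = i
    return match

print(match_parentheses("(()(()))"))   # {1: 2, 4: 5, 3: 6, 0: 7}
```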
APA, Harvard, Vancouver, ISO, and other styles
16

Karlsson, Jonas S. "Scalable distributed data structures for database management." [S.l. : Amsterdam : s.n.] ; Universiteit van Amsterdam [Host], 2000. http://dare.uva.nl/document/57022.

Full text
APA, Harvard, Vancouver, ISO, and other styles
17

Seiferth, Paul [Verfasser]. "Disk Intersection Graphs: Models, Data Structures, and Algorithms / Paul Seiferth." Berlin : Freie Universität Berlin, 2016. http://d-nb.info/111255307X/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
18

Jeger, Olivier. "Extending the Eiffel Library for data structures and algorithms: EiffelBase." Zürich : ETH, Eidgenössische Technische Hochschule Zürich, Department of Computer Science, Chair of Software Engineering, 2004. http://e-collection.ethbib.ethz.ch/show?type=dipl&nr=187.

Full text
APA, Harvard, Vancouver, ISO, and other styles
19

Wason, Jasmin Lesley. "Automating data management in science and engineering." Thesis, University of Southampton, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.396143.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

Mak, Vivian, and 麥慧芸. "Algorithms for proximity problems in the presence of obstacles." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1999. http://hub.hku.hk/bib/B29822749.

Full text
APA, Harvard, Vancouver, ISO, and other styles
21

Zampetakis, Stamatis. "Scalable algorithms for cloud-based Semantic Web data management." Thesis, Paris 11, 2015. http://www.theses.fr/2015PA112199/document.

Full text
Abstract:
In order to build smart systems, where machines are able to reason exactly like humans, data with semantics is a major requirement. This need led to the advent of the Semantic Web, proposing standard ways for representing and querying data with semantics. RDF is the prevalent data model used to describe web resources, and SPARQL is the query language that allows expressing queries over RDF data. Being able to store and query data with semantics triggered the development of many RDF data management systems. The rapid evolution of the Semantic Web provoked the shift from centralized data management systems to distributed ones. The first systems to appear relied on P2P and client-server architectures, while recently the focus moved to cloud computing. Cloud computing environments have strongly impacted research and development in distributed software platforms. Cloud providers offer distributed, shared-nothing infrastructures that may be used for data storage and processing. The main features of cloud computing involve scalability, fault-tolerance, and elastic allocation of computing and storage resources following the needs of the users. This thesis investigates the design and implementation of scalable algorithms and systems for cloud-based Semantic Web data management. In particular, we study the performance and cost of exploiting commercial cloud infrastructures to build Semantic Web data repositories, and the optimization of SPARQL queries for massively parallel frameworks. First, we introduce the basic concepts around the Semantic Web and the main components and frameworks interacting in massively parallel cloud-based systems. In addition, we provide an extended overview of existing RDF data management systems in the centralized and distributed settings, emphasizing the critical concepts of storage, indexing, query optimization, and infrastructure. Second, we present AMADA, an architecture for RDF data management using public cloud infrastructures. We follow the Software as a Service (SaaS) model, where the complete platform runs in the cloud and appropriate APIs are provided to the end-users for storing and retrieving RDF data. We explore various storage and querying strategies, revealing pros and cons with respect to performance and also to monetary cost, which is an important new dimension to consider in public cloud services. Finally, we present CliqueSquare, a distributed RDF data management system built on top of Hadoop, incorporating a novel optimization algorithm that is able to produce massively parallel plans for SPARQL queries. We present a family of optimization algorithms, relying on n-ary (star) equality joins to build flat plans, and compare their ability to find the flattest plans possible. Inspired by existing partitioning and indexing techniques, we present a generic storage strategy suitable for storing RDF data in HDFS (Hadoop's Distributed File System). Our experimental results validate the efficiency and effectiveness of the optimization algorithm, and also demonstrate the overall performance of the system.
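The n-ary (star) equality join mentioned above, the building block of CliqueSquare's flat plans, can be illustrated with a toy in-memory triple store. The triples and predicate names below are invented for illustration; the sketch shows only what a single-step star join on a shared subject means, not the HDFS storage or the Hadoop-based optimizer.

```python
# Toy RDF triples: (subject, predicate, object).
triples = [
    ("s1", "type", "Person"), ("s1", "name", "Alice"), ("s1", "age", "30"),
    ("s2", "type", "Person"), ("s2", "name", "Bob"),
]

def star_join(triples, predicates):
    """n-ary 'star' equality join: find subjects that carry ALL the given
    predicates, joining on the shared subject in one step rather than as a
    chain of binary joins."""
    by_pred = {}
    for s, p, o in triples:
        by_pred.setdefault(p, {})[s] = o
    subjects = set.intersection(*(set(by_pred.get(p, {})) for p in predicates))
    return {s: {p: by_pred[p][s] for p in predicates} for s in sorted(subjects)}

print(star_join(triples, ["type", "name", "age"]))   # only s1 qualifies
```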
APA, Harvard, Vancouver, ISO, and other styles
22

Pockrandt, Christopher Maximilian [Verfasser]. "Approximate String Matching : Improving Data Structures and Algorithms / Christopher Maximilian Pockrandt." Berlin : Freie Universität Berlin, 2019. http://d-nb.info/1183675879/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
23

Mostaghim, Sanaz [Verfasser]. "Multi-Objective Evolutionary Algorithms : Data Structures, Convergence, and Diversity / Sanaz Mostaghim." Aachen : Shaker, 2005. http://d-nb.info/1181620465/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Li, Miaoqi. "Statistical models and algorithms for large data with complex dependence structures." University of Cincinnati / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1584015958922068.

Full text
APA, Harvard, Vancouver, ISO, and other styles
25

Fylakis, A. (Angelos). "Data hiding algorithms for healthcare applications." Doctoral thesis, Oulun yliopisto, 2019. http://urn.fi/urn:isbn:9789526224008.

Full text
Abstract:
Developments in information technology have had a big impact on healthcare, producing vast amounts of data and increasing the demands associated with their secure transfer, storage and analysis. To serve these needs, biomedical data need to carry patient information and records, or even extra biomedical images or signals required for multimodal applications. The proposed solution is to host this information within the data themselves, using data hiding algorithms that introduce imperceptible modifications, achieving two main purposes: increasing data management efficiency and enhancing the security aspects of confidentiality, reliability and availability. Data hiding achieves this by embedding the payload in objects, including components such as authentication tags, without requiring extra space or modifications in repositories. The proposed methods address two research problems. The first is the hospital-centric problem of providing efficient and secure management of data in hospital networks. This includes combinations of multimodal data in single objects. The host data were biomedical images and sequences intended for diagnosis, meaning that even non-visible modifications can cause errors. Thus, a determining restriction was reversibility. Reversible data hiding methods remove the introduced modifications upon extraction of the payload. Embedding capacity was another priority that determined the proposed algorithms. To meet those demands, the algorithms were based on the Least Significant Bit Substitution and Histogram Shifting approaches. The second was the patient-centric problem, including user authentication and issues of secure and efficient data transfer in eHealth systems. Two novel solutions were proposed. The first method uses data hiding to increase the robustness of face biometrics in photos, where, due to the high robustness requirements, a periodic pattern embedding approach was used. The second method protects sensitive user data collected by smartphones. In this case, to meet the low computational cost requirements, the method was based on Least Significant Bit Substitution. In conclusion, the proposed algorithms introduced novel data hiding applications and demonstrated competitive embedding properties in existing applications.
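Least Significant Bit substitution, one of the two embedding approaches named above, is easy to sketch on a list of 8-bit pixel values. This is a minimal illustration only; the reversible, histogram-shifting and periodic-pattern variants developed in the thesis are not shown.

```python
def embed_lsb(pixels, bits):
    """Hide a list of payload bits in the least significant bit of each pixel."""
    out = list(pixels)
    for i, b in enumerate(bits):
        out[i] = (out[i] & ~1) | b      # clear the LSB, then set it to the payload bit
    return out

def extract_lsb(pixels, n_bits):
    """Recover the first n_bits payload bits from the pixel LSBs."""
    return [p & 1 for p in pixels[:n_bits]]

cover = [120, 121, 200, 33, 90, 45, 18, 250]
payload = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_lsb(cover, payload)
assert extract_lsb(stego, len(payload)) == payload
print(stego)   # pixel values change by at most 1, hence the imperceptibility
```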
APA, Harvard, Vancouver, ISO, and other styles
26

Benjamin, Jim Isaac. "Quadtree algorithms for image processing /." Online version of thesis, 1991. http://hdl.handle.net/1850/11078.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Costa, Andre. "Analytic modelling of agent-based network routing algorithms." Title page, contents and abstract only, 2002. http://web4.library.adelaide.edu.au/theses/09PH/09phc8373.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Feldmann, Michael [Verfasser]. "Algorithms for distributed data structures and self-stabilizing overlay networks / Michael Feldmann." Paderborn : Universitätsbibliothek, 2021. http://d-nb.info/1231907754/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Kammaje, Ravi Prakash. "Investigating ray tracing algorithms and data structures in the context of visibility." Thesis, Swansea University, 2009. https://cronfa.swan.ac.uk/Record/cronfa42220.

Full text
Abstract:
Ray tracing is a popular rendering method with built-in visibility determination. However, the computational costs are significant. To reduce them, there has been extensive research leading to innovative data structures and algorithms that optimally utilize both object and image coherence. Investigating these from a visibility determination context, without considering further optical effects, is the main motivation of the research. Three methods - one structure and two coherent tree traversal algorithms - are discussed. While the structure aims to increase coherence, the algorithms aim to optimise utilization of coherence provided by ray tracing structures (kd-trees, octrees). RBSF trees - Restricted Binary Space Partitioning Trees - build upon the research in ray tracing with kd-trees. A higher degree of freedom for split plane selection increases object coherence, implying a reduction in the number of node traversals and triangle intersections for most scenes. Consequently, reduced ray casting times for scenes with predominantly non-axis-aligned triangles are observed. Coherent Rendering is a rendering method that shows improved complexity, but at an absolute performance that is much slower than packet ray tracing. However, since it led to the creation of the 'Row Tracing' algorithm, it is described briefly. Row Tracing can be considered as an adaptation of Coherent Rendering, scanline rendering or packet ray tracing. One row of the image is considered and its pixels are determined. Similar to Coherent Rendering, an adapted version of Hierarchical Occlusion Maps is used to identify and skip occluded nodes. To maximize utilisation of coherence, the method is extended so that several adjacent rows are traversed through the tree. The two versions of Row Tracing demonstrate excellent performance, exceeding that of packet ray tracing. Further, it is shown that for larger models (2 million+ triangles), Row Tracing and Packet Row Tracing significantly outperform Z-buffer based methods (OpenGL). Row Tracing shows scalability across scene sizes, leading to a rendering method that has fast rendering times for both large and small models. In addition, it has excellent parallelisation properties, allowing utilisation of multiple cores with ease. Thus, the Row Tracing and Packet Row Tracing algorithms can be considered the significant contributions of the Ph.D. These data structures and algorithms demonstrate that ray tracing data structures and adaptations of ray tracing algorithms exhibit excellent potential in a visibility context.
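A core visibility primitive behind kd-tree and octree traversal in ray tracers is the ray-versus-axis-aligned-box test (the slab method). The sketch below shows that primitive only, under the simplifying assumption of non-zero direction components, and does not reproduce the RBSF tree or Row Tracing algorithms themselves.

```python
def ray_hits_box(origin, direction, box_min, box_max):
    """Slab test: does a ray hit an axis-aligned bounding box?
    Direction components are assumed non-zero for brevity."""
    t_near, t_far = float("-inf"), float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        t1, t2 = (lo - o) / d, (hi - o) / d
        t_near = max(t_near, min(t1, t2))   # latest entry across all slabs
        t_far = min(t_far, max(t1, t2))     # earliest exit across all slabs
    return t_near <= t_far and t_far >= 0.0

print(ray_hits_box((0, 0, 0), (1, 1, 1), (2, 2, 2), (3, 3, 3)))      # True
print(ray_hits_box((0, 0, 0), (1, 0.1, 0.1), (2, 2, 2), (3, 3, 3)))  # False
```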
APA, Harvard, Vancouver, ISO, and other styles
30

Fouh, Mbindi Eric Noel. "Building and Evaluating a Learning Environment for Data Structures and Algorithms Courses." Diss., Virginia Tech, 2015. http://hdl.handle.net/10919/51951.

Full text
Abstract:
Learning technologies in computer science education have been most closely associated with the teaching of programming, including automatic assessment of programming exercises. However, when it comes to teaching computer science content and concepts, learning technologies have not been heavily used. Perhaps the best known application today is Algorithm Visualization (AV), of which there are hundreds of examples. AVs tend to focus on presenting the procedural aspects of how a given algorithm works, rather than more conceptual content. There are also new electronic textbooks (eTextbooks) that incorporate the ability to edit and execute program examples. For many traditional courses, a longstanding problem is the lack of sufficient practice exercises with feedback to the student. Automated assessment provides a way to increase the number of exercises on which students can receive feedback. Interactive eTextbooks have the potential to make it easy for instructors to introduce both visualizations and practice exercises into their courses. OpenDSA is an interactive eTextbook for data structures and algorithms (DSA) courses. It integrates tutorial content with AVs and automatically assessed interactive exercises. Since Spring 2013, OpenDSA has been regularly used to teach a fundamental data structures and algorithms course (CS2), and also a more advanced data structures, algorithms, and analysis course (CS3) at various institutions of higher education. In this thesis, I report on findings from early adoption of the OpenDSA system. I describe how OpenDSA's design addresses obstacles in the use of AV systems. I identify a wide variety of uses for OpenDSA in the classroom. I found that instructors used OpenDSA exercises as graded assignments in all the courses where it was used. Some instructors assigned an OpenDSA assignment before lectures and started spending more time teaching higher-level concepts. OpenDSA also supported some instructors in implementing a "flipped classroom". I found that students are enthusiastic about OpenDSA and voluntarily used the AVs embedded within OpenDSA. Students found OpenDSA beneficial and expressed a preference for a class format that included using OpenDSA as part of the assigned graded work. The relationship between OpenDSA and students' performance was inconclusive, but I found that students with higher grades tend to complete more exercises.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
31

Maquet, Nicolas. "New algorithms and data structures for the emptiness problem of alternating automata." Doctoral thesis, Universite Libre de Bruxelles, 2011. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/209961.

Full text
Abstract:
This work studies new algorithms and data structures that are useful in the context of program verification. As computers have become more and more ubiquitous in our modern societies, an increasingly large number of computer-based systems are considered safety-critical. Such systems are characterized by the fact that a failure or a bug (computer error in the computing jargon) could potentially cause large damage, whether in loss of life, environmental damage, or economic damage. For safety-critical systems, the industrial software engineering community increasingly calls for using techniques which provide some formal assurance that a certain piece of software is correct.

One of the most successful program verification techniques is model checking, in which programs are typically abstracted by a finite-state machine. After this abstraction step, properties (typically in the form of some temporal logic formula) can be checked against the finite-state abstraction, with the help of automated tools. Alternating automata play an important role in this context, since many temporal logics on words and trees can be efficiently translated into those automata. This property allows for the reduction of model checking to automata-theoretic questions and is called the automata-theoretic approach to model checking. In this work, we provide three novel approaches for the analysis (emptiness checking) of alternating automata over finite and infinite words. First, we build on the successful framework of antichains to devise new algorithms for LTL satisfiability and model checking, using alternating automata. These algorithms combine antichains with reduced ordered binary decision diagrams in order to handle the exponentially large alphabets of the automata generated by the LTL translation. Second, we develop new abstraction and refinement algorithms for alternating automata, which combine the use of antichains with abstract interpretation, in order to handle ever larger instances of alternating automata. Finally, we define a new symbolic data structure, coined lattice-valued binary decision diagrams, that is particularly well-suited for the encoding of transition functions of alternating automata over symbolic alphabets. All of these works are supported with empirical evaluations that confirm the practical usefulness of our approaches.

Doctorate in Sciences
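The antichain idea referred to above can be sketched as a small container that keeps only subset-minimal sets, discarding any newly seen superset. This is just the data-structure kernel, not the emptiness-checking algorithms or the BDD/LVBDD machinery of the thesis.

```python
class MinAntichain:
    """Keep only subset-minimal sets; supersets of stored elements are pruned.
    This pruning is the basic operation behind antichain-based state-space algorithms."""
    def __init__(self):
        self.elements = []          # list of frozensets, pairwise incomparable

    def add(self, s):
        s = frozenset(s)
        if any(e <= s for e in self.elements):   # dominated: a smaller set is already stored
            return False
        self.elements = [e for e in self.elements if not s <= e]  # drop dominated supersets
        self.elements.append(s)
        return True

ac = MinAntichain()
for s in [{1, 2, 3}, {1, 2}, {2, 3, 4}, {1, 2, 5}]:
    ac.add(s)
print([set(e) for e in ac.elements])   # [{1, 2}, {2, 3, 4}]
```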

APA, Harvard, Vancouver, ISO, and other styles
32

Zhang, Yan. "Improving the efficiency of graph-based data mining with application to public health data." Online access for everyone, 2007. http://www.dissertations.wsu.edu/Thesis/Fall2007/y_zhang_112907.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Tobeck, Daniel. "Data Structures and Reduction Techniques for Fire Tests." Thesis, University of Canterbury. Civil Engineering, 2007. http://hdl.handle.net/10092/1578.

Full text
Abstract:
To perform fire engineering analysis, data on how an object or group of objects burn is almost always needed. This data should be collected and stored in a logical and complete fashion to allow for meaningful analysis later. This thesis details the design of a new fire test Data Base Management System (DBMS) termed UCFIRE which was built to overcome the limitations of existing fire test DBMS and was based primarily on the FDMS 2.0 and FIREBASEXML specifications. The UCFIRE DBMS is currently the most comprehensive and extensible DBMS available in the fire engineering community and can store the following test types: Cone Calorimeter, Furniture Calorimeter, Room/Corner Test, LIFT and Ignitability Apparatus Tests. Any data reduction which is performed on this fire test data should be done in an entirely mechanistic fashion rather than rely on human intuition which is subjective. Currently no other DBMS allows for the semi-automation of the data reduction process. A number of pertinent data reduction algorithms were investigated and incorporated into the UCFIRE DBMS. An ASP.NET Web Service (WEBFIRE) was built to reduce the bandwidth required to exchange fire test information between the UCFIRE DBMS and a UCFIRE document stored on a web server. A number of Mass Loss Rate (MLR) algorithms were investigated and it was found that the Savitzky-Golay filtering algorithm offered the best performance. This algorithm had to be further modified to autonomously filter other noisy events that occurred during the fire tests. This algorithm was then evaluated on test data from exemplar Furniture Calorimeter and Cone Calorimeter tests. The LIFT test standard (ASTM E 1321-97a) requires its ignition and flame spread data to be scrutinised but does not state how to do this. To meet these requirements the fundamentals of linear regression were reviewed and an algorithm to mechanistically scrutinise ignition and flame spread data was developed. This algorithm seemed to produce reasonable results when used on exemplar ignition and flame spread test data.
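The Savitzky-Golay step singled out above can be sketched with SciPy on synthetic data: the filter fits a local polynomial in a sliding window and, with deriv=1, returns a smoothed derivative, from which a mass loss rate estimate follows. The data below are invented, and the UCFIRE-specific handling of other noisy test events is not reproduced.

```python
import numpy as np
from scipy.signal import savgol_filter

# Synthetic, noisy mass record (kg) sampled once per second -- illustrative only.
t = np.arange(0, 300, 1.0)
mass = 10.0 - 0.02 * t + np.random.normal(scale=0.01, size=t.size)

# Savitzky-Golay filter: local polynomial fit in a sliding window.
# deriv=1 with delta=sample spacing yields the smoothed first derivative,
# so the mass loss rate (kg/s) is its negation.
dmass_dt = savgol_filter(mass, window_length=21, polyorder=2, deriv=1, delta=1.0)
mlr = -dmass_dt

print(f"mean MLR estimate: {mlr.mean():.4f} kg/s (true value 0.02)")
```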
APA, Harvard, Vancouver, ISO, and other styles
34

Mak, Vivian. "Algorithms for proximity problems in the presence of obstacles /." Hong Kong : University of Hong Kong, 1999. http://sunzi.lib.hku.hk/hkuto/record.jsp?B21414944.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Fieldsend, Jonathan E. "Novel algorithms for multi-objective search and their application in multi-objective evolutionary neural network training." Thesis, University of Exeter, 2003. http://hdl.handle.net/10871/11706.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Hatzinger, Reinhold, and Wolfgang Panny. "Single and Twin-Heaps as Natural Data Structures for Percentile Point Simulation Algorithms." Department of Statistics and Mathematics, WU Vienna University of Economics and Business, 1993. http://epub.wu.ac.at/574/1/document.pdf.

Full text
Abstract:
Sometimes percentile points cannot be determined analytically. In such cases one has to resort to Monte Carlo techniques. In order to provide reliable and accurate results it is usually necessary to generate rather large samples. Thus the proper organization of the relevant data is of crucial importance. In this paper we investigate the appropriateness of heap-based data structures for the percentile point estimation problem. Theoretical considerations and empirical results give evidence of the good performance of these structures regarding their time and space complexity. (author's abstract)
Series: Forschungsberichte / Institut für Statistik
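The twin-heap organisation evaluated above can be sketched for streaming percentile estimation: a max-heap holds the values at or below the current percentile point and a min-heap the values above it, with sizes rebalanced after every insertion. This is a generic illustration (using the ceil(q*n)-th order statistic as the percentile definition), not the paper's exact algorithms.

```python
import heapq
import math

class PercentileHeaps:
    """Track the q-th percentile of a stream with two heaps."""
    def __init__(self, q):
        self.q = q                        # e.g. 0.9 for the 90th percentile
        self.lower, self.upper = [], []   # max-heap (stored negated) / min-heap

    def add(self, x):
        if not self.lower or x <= -self.lower[0]:
            heapq.heappush(self.lower, -x)
        else:
            heapq.heappush(self.upper, x)
        # Rebalance so len(lower) == ceil(q * n); the percentile point is then
        # the largest element of the lower heap.
        target = max(1, math.ceil(self.q * (len(self.lower) + len(self.upper))))
        while len(self.lower) > target:
            heapq.heappush(self.upper, -heapq.heappop(self.lower))
        while len(self.lower) < target:
            heapq.heappush(self.lower, -heapq.heappop(self.upper))

    def percentile(self):
        return -self.lower[0]

ph = PercentileHeaps(0.9)
for v in range(1, 101):                   # stream the integers 1..100
    ph.add(v)
print(ph.percentile())                    # 90
```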
APA, Harvard, Vancouver, ISO, and other styles
37

Zhang, Yingying. "Algorithms and Data Structures for Efficient Timing Analysis of Asynchronous Real-time Systems." Scholar Commons, 2013. http://scholarcommons.usf.edu/etd/4622.

Full text
Abstract:
This thesis presents a framework to verify asynchronous real-time systems based on model checking. These systems are modeled using a common modeling formalism named Labeled Petri nets (LPNs). In order to verify the real-time systems algorithmically, the zone-based timing analysis method is used for LPNs. It searches the state space with timing information (represented by zones). When there is a high degree of concurrency in the model, firing concurrently enabled transitions in different orders may result in different zones, and these zones may be combined without affecting the verification result. Since the zone-based method cannot deal with this problem efficiently, the POSET timing analysis method is adopted for LPNs. It separates concurrency from causality and generates exactly one zone for a single state, but it needs to maintain an extra POSET matrix for each state. In order to save time and memory, an improved zone-based timing analysis method is introduced by integrating the above two methods. It searches the state space with zones but eliminates the use of the POSET matrix, and it generates the same result as the POSET method. To illustrate these methods, a circuit example is used throughout the thesis. Since the state space generated is usually very large, a graph data structure named multi-valued decision diagrams (MDDs) is implemented to store the zones compactly. In order to share common clock values of different zones, two zone encoding methods are described: direct encoding and minimal constraint encoding. They ignore unnecessary information in zones and thus reduce the length of the integer tuples. The effectiveness of these two encoding methods is demonstrated by experimental results for the circuit example.
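Zones in this style of timing analysis are commonly represented as difference-bound matrices (DBMs); the sketch below shows that representation in its plainest form, with Floyd-Warshall canonicalisation and an emptiness test. It is a generic illustration and does not reproduce the LPN semantics, the POSET method, or the MDD encoding described above.

```python
import itertools

INF = float("inf")

def canonical(dbm):
    """Close a difference-bound matrix: dbm[i][j] bounds x_i - x_j from above.
    Floyd-Warshall tightens every bound through intermediate clocks."""
    n = len(dbm)
    for k, i, j in itertools.product(range(n), repeat=3):
        dbm[i][j] = min(dbm[i][j], dbm[i][k] + dbm[k][j])
    return dbm

def is_empty(dbm):
    """A zone is empty iff canonicalisation makes some diagonal entry negative."""
    return any(canonical(dbm)[i][i] < 0 for i in range(len(dbm)))

# Clock 0 is the constant reference clock; clocks 1 and 2 are real clocks.
# Constraints: 3 <= x1 <= 5, x2 - x1 <= 1, and x1 - x2 <= -2 (i.e. x2 >= x1 + 2).
zone = [
    [0,   -3,  0],
    [5,    0, -2],
    [INF,  1,  0],
]
print(is_empty(zone))   # True: x2 >= x1 + 2 contradicts x2 <= x1 + 1
```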
APA, Harvard, Vancouver, ISO, and other styles
38

Siniolakis, Constantinos J. "Design, analysis and implementation of bulk-synchronous parallel algorithms, data structures and techniques." Thesis, University of Oxford, 1998. https://ora.ox.ac.uk/objects/uuid:dd317ff5-f25d-45f7-8405-8c48df99674b.

Full text
Abstract:
The objective of this thesis is the unified investigation of a wide range of fundamental parallel methods that are transportable amongst pragmatic parallel machines having different number of processors, different periodicity of global synchronization and different bandwidth of inter-processor communication. The computational model adopted is the bulk-synchronous parallel (BSP) model, which abstracts the characteristics of parallel machines into three numerical parameters p, L and g, that quantify, respectively, processors, periodicity and bandwidth - the model differentiates memory that is local to a processor from memory that is non-local, yet, for the sake of universality, does not differentiate network proximity. The BSP parameters p, L and g, together with the problem size n, are employed to measure the performance, and consequently, the transportability of parallel methods across machines having different values of these parameters. We show that optimality to within small multiplicative constant factors close to one can be achieved for a multiplicity of fundamental computational problems by transportable algorithms and data structures that can be applied for a wide range of values of the BSP parameters. While these algorithms and data structures are fairly simple themselves, description of their performance in terms of these parameters is somewhat complicated. The main reward for quantifying these complications, is that it enables software to be written once and for all that can be migrated efficiently amongst a variety of parallel machines. The methods considered in this thesis - both theoretically and experimentally - embody deterministic and randomized techniques for the efficient realization of fundamental algorithms (broadcasting, computing parallel-prefixes, load-balancing, list-contracting, merging, sorting, integer-sorting, selecting, searching and hashing), data structures (heaps, search trees and hash tables) and applications (computational geometry, parallel model simulations and structured query language primitives).
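The role of the BSP parameters p, L and g can be made concrete with the standard superstep cost estimate: maximum local work plus g times the largest h-relation plus the synchronisation cost L. The numbers below are invented, and summing per-processor sends and receives differently is also common, so treat this as one conventional formulation rather than the thesis's exact accounting.

```python
def superstep_cost(work, sent, received, g, L):
    """Standard BSP superstep cost: max local work + g * max h-relation + L.
    work/sent/received are per-processor lists; g and L are machine parameters."""
    w = max(work)
    h = max(max(s, r) for s, r in zip(sent, received))
    return w + g * h + L

# Hypothetical 4-processor superstep (illustrative numbers only).
print(superstep_cost(work=[1000, 950, 1020, 400],
                     sent=[64, 32, 64, 0],
                     received=[32, 64, 16, 48],
                     g=4, L=100))   # 1020 + 4*64 + 100 = 1376
```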
APA, Harvard, Vancouver, ISO, and other styles
39

宋永健 and Wing-kin Sung. "Fast labeled tree comparison via better matching algorithms." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1998. http://hub.hku.hk/bib/B31239316.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Sung, Wing-kin. "Fast labeled tree comparison via better matching algorithms /." Hong Kong : University of Hong Kong, 1998. http://sunzi.lib.hku.hk/hkuto/record.jsp?B20229999.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Bougiouklis, Theodoros C. "Traffic management algorithms in wireless sensor networks." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2006. http://library.nps.navy.mil/uhtbin/hyperion/06Sep%5FBougiouklis.pdf.

Full text
Abstract:
Thesis (M.S. in Electrical Engineering)--Naval Postgraduate School, September 2006. Thesis Advisor(s): Weillian Su. Includes bibliographical references (p. 79-80). Also available in print.
APA, Harvard, Vancouver, ISO, and other styles
42

Geum, Seong. "An approximate load balancing parallel hash join algorithm to handle data skew in a parallel data base system." Thesis, Georgia Institute of Technology, 1995. http://hdl.handle.net/1853/9222.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Chan, Sze-hang, and 陳思行. "Competitive online job scheduling algorithms under different energy management models." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2013. http://hdl.handle.net/10722/206690.

Full text
Abstract:
Online flow-time scheduling is a fundamental problem in computer science and has been extensively studied for years. It is about how to design a scheduler to serve computer jobs with unpredictable arrival times and varying sizes and priorities so as to minimize the total flow time (better understood as response time) of jobs. It has many applications, most notably in the operation of server farms. As energy has become an important issue, the design of the scheduler also has to take power management into consideration, for example, how to scale the speed of the processors dynamically. The objectives are orthogonal, as one would prefer lower processor speed to save energy, yet a good quality of service must be retained. In this thesis, I study a few scheduling problems for energy and flow time in depth and give new algorithms to tackle them. The competitiveness of our algorithms is guaranteed with worst-case mathematical analysis against the best possible or hypothetical solutions. In the speed scaling model, the power of a processor increases with its speed according to a certain function (e.g., a cubic function of speed). Among all online scheduling problems with speed scaling, the nonclairvoyant setting (in which the size of a job is not known during its execution) with arbitrary priorities is perhaps the most challenging. This thesis gives the first competitive algorithm, called WLAPS, for this setting. In reality, it is not uncommon that during the peak-load period, some (low-priority) users have their jobs rejected by the servers. This motivates the study of more complicated scheduling algorithms that can strike a good balance among speed scaling, flow time and rejection penalty. Two new algorithms, UPUW and HDFAC, for different models of rejection penalty have been proposed and analyzed. Last, but perhaps most interesting, we study power management in a large server farm environment in which the primary energy saving mechanism is to put some processors to sleep. Two new algorithms, POOL and SATA, have been designed to tackle jobs that cannot and can migrate among the processors, respectively. They are integrated algorithms that consider speed scaling, job scheduling and processor sleep management together to optimize the energy usage and flow time simultaneously. These algorithms are again proven mathematically to be competitive even in the worst case.
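The tension between speed scaling and flow time described above can be seen in a one-job toy calculation: with power s^alpha, running faster shrinks flow time but inflates energy, and the total cost has a closed-form optimal speed. This is a textbook-style illustration, not any of the thesis's algorithms (WLAPS, UPUW, HDFAC, POOL, SATA).

```python
def total_cost(w, s, alpha=3.0):
    """Flow time plus energy for one job of size w run at constant speed s,
    with power s**alpha: flow = w/s, energy = power * time = s**alpha * (w/s)."""
    return w / s + s ** (alpha - 1) * w

def best_speed(alpha=3.0):
    """Closed-form minimiser of total_cost in s (set the derivative to zero)."""
    return (1.0 / (alpha - 1.0)) ** (1.0 / alpha)

s_star = best_speed()
print(f"optimal speed ~ {s_star:.3f}")
print(f"cost at s*    ~ {total_cost(1.0, s_star):.3f}")
print(f"cost at s=1.0 ~ {total_cost(1.0, 1.0):.3f}")   # running faster wastes energy here
```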
Computer Science
Doctoral
Doctor of Philosophy
APA, Harvard, Vancouver, ISO, and other styles
44

Jess, Torben. "Recommender systems and market approaches for industrial data management." Thesis, University of Cambridge, 2017. https://www.repository.cam.ac.uk/handle/1810/270103.

Full text
Abstract:
Industrial companies are dealing with an increasing data overload problem in all aspects of their business: vast amounts of data are generated in and outside each company. Determining which data is relevant and how to get it to the right users is becoming increasingly difficult. There are a large number of datasets to be considered, and an even higher number of combinations of datasets that each user could be using. Current techniques to address this data overload problem necessitate detailed analysis. These techniques have limited scalability due to their manual effort and their complexity, which makes them impractical for a large number of datasets. Search, the alternative used by many users, is limited by the user's knowledge about the available data and does not consider the relevance or costs of providing these datasets. Recommender systems and so-called market approaches have previously been used to solve this type of resource allocation problem, as shown for example in the allocation of equipment for production processes in manufacturing or for spare part supplier selection. They can therefore also be seen as a potential application for the problem of data overload. This thesis introduces the so-called RecorDa approach: an architecture using market approaches and recommender systems on their own or by combining them into one system. Its purpose is to identify which data is more relevant for a user's decision and improve the allocation of relevant data to users. Using a combination of case studies and experiments, this thesis develops and tests the approach. It further compares RecorDa to search and other mechanisms. The results indicate that RecorDa can provide significant benefit to users, with easier and more flexible access to relevant datasets compared to other techniques, such as search in these databases. It is able to provide a fast increase in precision and recall of relevant datasets while still keeping high novelty and coverage of a large variety of datasets.
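A minimal flavour of the recommender-system half of the approach described above: score unseen datasets for a user by similarity with other users' dataset usage. The access log and dataset names are invented, and RecorDa's actual combination of market mechanisms and recommenders is not reproduced here.

```python
import math

# Hypothetical access log: user -> set of datasets they already use.
usage = {
    "u1": {"sensors", "maintenance", "orders"},
    "u2": {"sensors", "maintenance"},
    "u3": {"orders", "invoices"},
}

def cosine(a, b):
    """Cosine similarity between two sets of used datasets."""
    return len(a & b) / math.sqrt(len(a) * len(b)) if a and b else 0.0

def recommend(user, usage, k=2):
    """Score unseen datasets by summing similarities with the users who use them."""
    mine = usage[user]
    scores = {}
    for other, theirs in usage.items():
        if other == user:
            continue
        sim = cosine(mine, theirs)
        for d in theirs - mine:
            scores[d] = scores.get(d, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend("u2", usage))   # 'orders' ranks first via similarity with u1
```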
APA, Harvard, Vancouver, ISO, and other styles
45

Valero, Bresó Alejandro. "Hybrid caches: design and data management." Doctoral thesis, Editorial Universitat Politècnica de València, 2013. http://hdl.handle.net/10251/32663.

Full text
Abstract:
Cache memories have usually been implemented with Static Random-Access Memory (SRAM) technology since it is the fastest electronic memory technology. However, this technology consumes a high amount of leakage current, which is a major design concern because leakage energy consumption increases as the transistor size shrinks. Alternative technologies are being considered to reduce this consumption. Among them, embedded Dynamic RAM (eDRAM) technology provides minimal area and leakage by design, but reads are destructive and it is not as fast as SRAM. In this thesis, both SRAM and eDRAM technologies are combined to take the advantages that each of them offers. First, they are combined at cell level to implement an n-bit macrocell consisting of one SRAM cell and n-1 eDRAM cells. The macrocell is used to build n-way set-associative hybrid first-level (L1) data caches having one SRAM way and n-1 eDRAM ways. A single SRAM way is enough to achieve good performance given the high data locality of L1 caches. Architectural mechanisms such as way-prediction, swaps, and scrub operations are considered to avoid unnecessary eDRAM reads, to maintain the Most Recently Used (MRU) data in the fast SRAM way, and to completely avoid refresh logic. Experimental results show that, compared to a conventional SRAM cache, leakage and area are largely reduced with a scarce impact on performance. The study of the benefits of hybrid caches has also been carried out in second-level (L2) caches acting as Last-Level Caches (LLCs). In this case, the technologies are combined at bank level, and the optimal ratio of SRAM and eDRAM banks that achieves the best trade-off among performance, energy, and area is identified. As in L1 caches, the MRU blocks are kept in the SRAM banks and they are accessed first to avoid unnecessary destructive reads. Nevertheless, refresh logic is not removed, since data locality differs widely at this cache level. Experimental results show that a hybrid LLC with an eighth of its banks built with SRAM technology is enough to achieve the best target trade-off. This dissertation also deals with the performance of replacement policies in heterogeneous LLCs, mainly focusing on the energy overhead incurred by refresh operations. The thesis defines a new concept, the MRU-Tour (MRUT), that helps estimate reuse information of cache blocks. Based on this concept, a family of MRUT-based replacement algorithms is proposed that randomly select the victim block among those having a single MRUT. These policies are enhanced to leverage recency information for a few blocks and to adapt to changes in the working set of the benchmarks. Results show that the proposed MRUT policies, with simpler hardware complexity, outperform the Least Recently Used (LRU) policy and a set of the most representative state-of-the-art replacement policies for LLCs. Refresh operations represent an important fraction of the overall dynamic energy consumption of eDRAM LLCs. This fraction increases with the cache capacity, since more blocks have to be refreshed in a given period of time. Prior works have attacked the refresh energy taking into account inter-cell feature variations. Unlike these works, this thesis proposes a selective refresh policy based on the MRUT concept. The devised policy takes into account the number of MRUTs of a block to decide whether the block is refreshed. In this way, many refreshes done in a typical distributed refresh policy are skipped (i.e., in those blocks having a single MRUT). This refresh mechanism is applied in the hybrid LLC memory. Results show that refresh energy consumption is largely reduced with respect to a conventional eDRAM cache, while the performance degradation is minimal with respect to a conventional SRAM cache.
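As a rough illustration of the victim-selection idea, the following Python sketch implements a toy fully associative cache that counts how many times each block has become most recently used and evicts, at random, a block with a single such "tour". This is a loose reading of the abstract, not the exact MRUT replacement policy proposed in the thesis.

import random

# Illustrative only: a toy fully associative cache whose victim selection is a
# loose reading of the MRU-Tour idea -- each block counts how many times it has
# become most recently used, and eviction picks randomly among blocks with a
# single such tour.
class ToyMRUTCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.tours = {}    # block -> number of times it became MRU
        self.mru = None    # current most-recently-used block

    def access(self, block):
        if block not in self.tours and len(self.tours) >= self.capacity:
            candidates = [b for b in self.tours if b != self.mru]
            single = [b for b in candidates if self.tours[b] == 1]
            victim = random.choice(single if single else candidates)
            del self.tours[victim]
        if block != self.mru:  # the block (re)enters the MRU position: a new tour
            self.tours[block] = self.tours.get(block, 0) + 1
            self.mru = block

cache = ToyMRUTCache(capacity=4)
for b in [1, 2, 3, 1, 4, 5, 1, 2]:
    cache.access(b)
print(cache.tours)  # contents vary between runs because the victim is chosen at random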
Valero Bresó, A. (2013). Hybrid caches: design and data management [Tesis doctoral]. Editorial Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/32663
Alfresco
Award-winning
APA, Harvard, Vancouver, ISO, and other styles
46

Gendron, Marlin. "Algorithms and Data Structures for Automated Change Detection and Classification of Sidescan Sonar Imagery." ScholarWorks@UNO, 2004. http://scholarworks.uno.edu/td/210.

Full text
Abstract:
During Mine Warfare (MIW) operations, MIW analysts perform change detection by visually comparing historical sidescan sonar imagery (SSI) collected by a sidescan sonar with recently collected SSI in an attempt to identify objects (which might be explosive mines) placed at sea since the last time the area was surveyed. This dissertation presents a data structure and three algorithms, developed by the author, that are part of an automated change detection and classification (ACDC) system. MIW analysts at the Naval Oceanographic Office are currently using ACDC to reduce the amount of time needed to perform change detection. The introductory chapter gives background information on change detection and ACDC, and describes how SSI is produced from raw sonar data. Chapter 2 presents the author's Geospatial Bitmap (GB) data structure, which is capable of storing information geographically and is utilized by the three algorithms. This chapter shows that a GB data structure used in a polygon-smoothing algorithm ran between 1.3x and 48.4x faster than a sparse matrix data structure. Chapter 3 describes the GB clustering algorithm, which is the author's repeatable, order-independent method for clustering. Results from tests performed in this chapter show that the time to cluster a set of points is not affected by the distribution or the order of the points. In Chapter 4, the author presents his real-time computer-aided detection (CAD) algorithm that automatically detects mine-like objects on the seafloor in SSI. The author ran his GB-based CAD algorithm on real SSI data, and the results of these tests indicate that his real-time CAD algorithm performs comparably to or better than other non-real-time CAD algorithms. The author presents his computer-aided search (CAS) algorithm in Chapter 5. CAS helps MIW analysts locate mine-like features that are geospatially close to previously detected features. A comparison between the CAS and a great-circle distance algorithm shows that the CAS performs geospatial searching 1.75x faster on large data sets. Finally, the concluding chapter of this dissertation gives important details on how the completed ACDC system will function, and discusses the author's future research to develop additional algorithms and data structures for ACDC.
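The following Python sketch illustrates the general idea of a bitmap indexed by geographic position: points from two surveys are rasterised onto a grid, and a bitwise difference flags cells present only in the newer survey. The grid, coordinates, and operations are invented for illustration and do not reproduce the author's GB data structure.

# Illustrative only: a toy "geospatial bitmap" that rasterises lat/lon points
# onto a fixed grid (stored as one big integer) and uses bitwise operations to
# flag cells occupied in a new survey but not in the historical one.
class ToyGeoBitmap:
    def __init__(self, lat_min, lat_max, lon_min, lon_max, rows, cols):
        self.bounds = (lat_min, lat_max, lon_min, lon_max)
        self.rows, self.cols = rows, cols
        self.bits = 0

    def _cell(self, lat, lon):
        lat_min, lat_max, lon_min, lon_max = self.bounds
        r = min(self.rows - 1, int((lat - lat_min) / (lat_max - lat_min) * self.rows))
        c = min(self.cols - 1, int((lon - lon_min) / (lon_max - lon_min) * self.cols))
        return r * self.cols + c

    def mark(self, lat, lon):
        self.bits |= 1 << self._cell(lat, lon)

    def new_cells(self, other):
        """Number of cells set in self but not in other (candidate changes)."""
        return bin(self.bits & ~other.bits).count("1")

historical = ToyGeoBitmap(30.0, 31.0, -89.0, -88.0, 100, 100)
recent = ToyGeoBitmap(30.0, 31.0, -89.0, -88.0, 100, 100)
historical.mark(30.2, -88.7)
recent.mark(30.2, -88.7)
recent.mark(30.6, -88.3)             # an object absent from the older survey
print(recent.new_cells(historical))  # -> 1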
APA, Harvard, Vancouver, ISO, and other styles
47

Breakiron, Daniel Aubrey. "Evaluating the Integration of Online, Interactive Tutorials into a Data Structures and Algorithms Course." Thesis, Virginia Tech, 2013. http://hdl.handle.net/10919/23107.

Full text
Abstract:
OpenDSA is a collection of open-source tutorials for teaching data structures and algorithms. It was created with the goals of visualizing complex, abstract topics; increasing the amount of practice material available to students; and providing immediate feedback and incremental assessment. In this thesis, I first describe aspects of the OpenDSA architecture relevant to collecting user interaction data. I then present an analysis of the interaction log data gathered from three classes during Spring 2013. The analysis focuses on determining the time distribution of student activity, determining the time required for assignment completion, and exploring "credit-seeking" behaviors and behavior related to non-required exercises. We identified clusters of students based on when they completed exercises, verified the reliability of estimated time requirements for exercises, provided evidence that a majority of students do not read the text, discovered a measurement that could be used to identify exercises that require additional development, and found evidence that students complete exercises after obtaining credit. Furthermore, we determined that slideshow usage was fairly high (even when credit was not offered), and that skipping to the end of slideshows was more common when credit was offered but also occurred when it was not.
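As a small illustration of this kind of log analysis, the Python sketch below groups hypothetical interaction-log events by student and counts activity within 24 hours of a due date. The event format and field names are assumptions, not OpenDSA's actual log schema.

from datetime import datetime

# Illustrative only: count events each student logged within a day of the deadline.
events = [
    {"student": "s1", "exercise": "heapsortProficiency", "time": "2013-03-01T22:15"},
    {"student": "s1", "exercise": "heapsortProficiency", "time": "2013-03-02T23:40"},
    {"student": "s2", "exercise": "heapsortProficiency", "time": "2013-02-20T10:05"},
]
due = datetime.fromisoformat("2013-03-03T00:00")

last_minute = {}
for e in events:
    hours_before_due = (due - datetime.fromisoformat(e["time"])).total_seconds() / 3600
    if 0 <= hours_before_due <= 24:
        last_minute[e["student"]] = last_minute.get(e["student"], 0) + 1

print(last_minute)  # {'s1': 1}: one event logged within a day of the deadline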
Master of Science
APA, Harvard, Vancouver, ISO, and other styles
48

Agarwalla, Bikash Kumar. "Resource management for data streaming applications." Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/34836.

Full text
Abstract:
This dissertation investigates novel middleware mechanisms for building streaming applications. Developing streaming applications is a challenging task because (i) they are continuous in nature; (ii) they require fusion of data coming from multiple sources to derive higher level information; (iii) they require efficient transport of data from/to distributed sources and sinks; (iv) they need access to heterogeneous resources spanning sensor networks and high performance computing; and (v) they are time critical in nature. My thesis is that an intuitive programming abstraction will make it easier to build dynamic, distributed, and ubiquitous data streaming applications. Moreover, such an abstraction will enable an efficient allocation of shared and heterogeneous computational resources thereby making it easier for domain experts to build these applications. In support of the thesis, I present a novel programming abstraction, called DFuse, that makes it easier to develop these applications. A domain expert only needs to specify the input and output connections to fusion channels, and the fusion functions. The subsystems developed in this dissertation take care of instantiating the application, allocating resources for the application (via the scheduling heuristic developed in this dissertation) and dynamically managing the resources (via the dynamic scheduling algorithm presented in this dissertation). Through extensive performance evaluation, I demonstrate that the resources are allocated efficiently to optimize the throughput and latency constraints of an application.
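The following Python sketch illustrates the flavour of such an abstraction: the user supplies input streams and a fusion function, and a channel yields the fused output. The class and function names are invented for illustration and are not the actual DFuse API.

# Illustrative only: a toy "fusion channel" in the spirit of the abstraction
# described above -- input streams plus a fusion function produce a fused stream.
class FusionChannel:
    def __init__(self, inputs, fusion_fn):
        self.inputs = inputs          # list of iterables (e.g., sensor streams)
        self.fusion_fn = fusion_fn    # combines one reading from each input

    def __iter__(self):
        for readings in zip(*self.inputs):
            yield self.fusion_fn(readings)

# Two hypothetical temperature sensor streams fused by averaging.
sensor_a = [20.1, 20.4, 21.0]
sensor_b = [19.8, 20.6, 20.9]
average = FusionChannel([sensor_a, sensor_b], lambda r: sum(r) / len(r))
print(list(average))  # [19.95, 20.5, 20.95]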
APA, Harvard, Vancouver, ISO, and other styles
49

Lenharth, Andrew D. "Algorithms for stable allocations in distributed real-time resource management systems." Ohio : Ohio University, 2004. http://www.ohiolink.edu/etd/view.cgi?ohiou1102697777.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Archer, David William. "Conceptual Modeling of Data with Provenance." PDXScholar, 2011. https://pdxscholar.library.pdx.edu/open_access_etds/133.

Full text
Abstract:
Traditional database systems manage data, but often do not address its provenance. In the past, users were often implicitly familiar with data they used, how it was created (and hence how it might be appropriately used), and from which sources it came. Today, users may be physically and organizationally remote from the data they use, so this information may not be easily accessible to them. In recent years, several models have been proposed for recording provenance of data. Our work is motivated by opportunities to make provenance easy to manage and query. For example, current approaches model provenance as expressions that may be easily stored alongside data, but are difficult to parse and reconstruct for querying, and are difficult to query with available languages. We contribute a conceptual model for data and provenance, and evaluate how well it addresses these opportunities. We compare the expressive power of our model's language to that of other models. We also define a benchmark suite with which to study performance of our model, and use this suite to study key model aspects implemented on existing software platforms. We discover some salient performance bottlenecks in these implementations, and suggest future work to explore improvements. Finally, we show that our implementations can comprise a logical model that faithfully supports our conceptual model.
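As a toy illustration of keeping provenance alongside data, the Python sketch below wraps each value with a record of the source or operation that produced it. This is only a sketch of the general idea, not the conceptual model or query language proposed in the dissertation.

# Illustrative only: a value-with-provenance record, where deriving a new value
# records which inputs and which operation produced it.
class Provenanced:
    def __init__(self, value, provenance):
        self.value = value
        self.provenance = provenance   # e.g. ("source", name) or ("op", name, inputs)

    @staticmethod
    def source(value, name):
        return Provenanced(value, ("source", name))

    @staticmethod
    def derive(op_name, fn, *inputs):
        value = fn(*(i.value for i in inputs))
        return Provenanced(value, ("op", op_name, [i.provenance for i in inputs]))

a = Provenanced.source(10, "sales_2010.csv")
b = Provenanced.source(12, "sales_2011.csv")
total = Provenanced.derive("sum", lambda x, y: x + y, a, b)
print(total.value)        # 22
print(total.provenance)   # ('op', 'sum', [('source', 'sales_2010.csv'), ('source', 'sales_2011.csv')])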
APA, Harvard, Vancouver, ISO, and other styles
