
Dissertations / Theses on the topic 'Data rescue and reuse'

Consult the top 50 dissertations / theses for your research on the topic 'Data rescue and reuse.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Moreau, Benjamin. "Facilitating reuse on the web of data." Thesis, Nantes, 2020. http://www.theses.fr/2020NANT4045.

Full text
Abstract:
The Web of Data is a web of interlinked datasets that can be queried and reused through federated query engines. To protect their datasets, data producers use licenses to specify their conditions of reuse. However, choosing a compliant license is not easy: licensing the reuse of several licensed datasets must take into account the compatibility among their licenses. To facilitate reuse, federated query engines should preserve license compliance. We therefore focus on two problems: (1) how to compute compatibility relations among licenses, and (2) how to ensure license compliance during federated query processing. For the first problem, we propose CaLi, a model that partially orders any set of licenses in terms of compatibility. For the second problem, we propose FLiQue, a license-aware federated query processing strategy. FLiQue uses CaLi to detect license compatibility conflicts and ensures that the result of a federated query preserves license compliance. Within the scope of this thesis, we also propose three approaches, ODMTP, EvaMap, and the SemanticBot, that aim to facilitate the integration of datasets into the Web of Data.
APA, Harvard, Vancouver, ISO, and other styles
2

Liu, Qiang. "Data Reuse and Parallelism in Hardware Compilation." Thesis, Imperial College London, 2008. http://hdl.handle.net/10044/1/4370.

Full text
Abstract:
This thesis presents a methodology to automatically determine a data memory organisation at compile time, suitable to exploit data reuse and loop-level parallelization, in order to achieve high performance and low power design for data-dominated applications. Moore's Law has enabled more and more heterogeneous components to be integrated on a single chip. However, there are challenges to extract maximum performance from these hardware resources efficiently. Unlike previous approaches, which mainly focus on making efficient use of computational resources, our focus is on data memory organisation and input-output bandwidth considerations, which are the typical stumbling blocks of existing hardware compilation schemes. To optimize accesses to large off-chip memories, an approach is adopted and formalized to identify data reuse opportunities in local scratch-pad memory. An approach is presented for evaluating different data reuse options in terms of the memory space required by buffering reused data and the execution time for loading the data to the local memories. Determining the data reuse design option that consumes the least power or performs operations quickest with respect to a memory constraint is an NP-hard problem. In this work, the problem of data reuse exploration for low-power designs is formulated as a Multiple-Choice Knapsack problem. Together with a proposed power model, the problem is solved efficiently. An integer geometric programming framework is presented for exploring data reuse and loop-level parallelization within a single step. The objective is to find the design that achieves the shortest execution time for an application. We describe our approaches based on formal optimization techniques, and present some results from applying these approaches to several benchmarks that show the advantages of optimizing data memory organisation and of exposing the interaction between data memory system design and parallelism extraction to the compiler.
APA, Harvard, Vancouver, ISO, and other styles
3

Rabby, Md Hasib Mahmud. "Tethered drone for rescue boats." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-290819.

Full text
Abstract:
Humans have had a great interest in aerial devices since ancient times. The journey started with kites and continues with the invention of the airplane, the helicopter, the rocket, and many more. The drone, a miniature of a helicopter, is one of the latest fields of research, with a great impact on military use, navigation, and rescue missions. In recent years we have seen its use in defence and attack strategies among developed countries. There are many ways in which the use of drones in practical circumstances can be further explored. For example, in rescue missions it is sometimes difficult for rescuers to reach the site of an accident, which might be in the middle of a sea or an ocean. A navigation technology that leads them towards the destination can therefore be of immense use. Rescue organizations like the Swedish Sea Rescue Society (SSRS) lack such a technology to assist them further in their missions. A tethered drone can act as navigation guidance for them to reach the destination very quickly, saving time and fuel by taking the shortest possible route. Its bird's-eye view and sensors will increase the chances of successful rescues. Current similar technologies have some drawbacks. An ordinary drone is generally powered by a battery and therefore has limited flight time. During flight, it can record and send a live video stream to a base station via the mobile network. The information collected is intended to assist in crucial rescue decisions such as which boat to use, the rescue crew size, what instruments to carry, and so forth. A tethered drone can fly longer than the average flight time: because it is powered from the ground through a wire, the drone gets a constant power supply that does not depend on batteries with limited life. The main aim of this thesis is to design a system in which a drone can fly longer in a fixed position and altitude, to find a suitable wire for powering it, and to ensure the drone's weight balance. A Da-Jiang Innovations Spark drone has been used as a model drone for implementing the project, and the task was divided into a few parts through a testing process. The drone's flight-time limitations have been overcome and a longer flight time has been achieved. Testing also revealed a few other limitations.
APA, Harvard, Vancouver, ISO, and other styles
4

Jeeson, Daniel Joshua. "An investigation into information reuse for cost prediction : from needs to a data reuse framework." Thesis, University of Southampton, 2014. https://eprints.soton.ac.uk/363782/.

Full text
Abstract:
The need to be able to reuse a wide variety of data that an organisation has created constitutes a part of the challenge known as 'Broad Data'. The aim of this research was to create a framework that would enable the reuse of broad data while complying with the corporate requirements of data security and privacy. A key requirement in enabling reuse of broad data is to ensure maximum interoperability among datasets, which in Linked Data depends on the URIs (Uniform Resource Identifiers) that the datasets have in common (i.e. reused). A URI in linked data can be dereferenced to obtain more information about it from its owner, and dereferencing can therefore have a profound impact on whether someone reuses a URI. However, the wide variety of vocabulary in broad data means the provenance and ownership of URIs could be key in promoting their reuse by data creators. The full potential offered by linked data cannot be realised due to the fundamental way URIs are currently constructed. In part, this is because the World Wide Web (Web) was designed for an open web of documents, not a secure web of data. By making subtle but essential changes to the building blocks one can change the way data is handled on the Web, thereby creating what has been proposed in this thesis as the World Wide Information web (WWI). The WWI is based on a framework of people and things that are active contributors to the web of data (hereinafter referred to as 'active things'), identified by URIs. The URI for an active thing is constructed from its path in the organisational stakeholder hierarchy to represent the provenance of ownership. As a result, it becomes easier to reference data held in sparse and heterogeneous resources, to navigate complex organisational structures, and to automatically include the provenance of the data to support trust-based data reuse and an organic growth of linked data. A new data retrieval technique referred to as a 'domino request' was demonstrated, where sparsely located linked data can be reused as though it came from a single source. With the use of a domino request on the WWI web there is no longer a need to include the name of the organisation itself to maintain a catalogue of all the data sources to be queried, thus making 'security by privacy' on the Web a reality. At the same time, the WWI allows the data owner or its stakeholders to maintain their privacy not only over the source of data, but also over the provenance of the individual URIs that describe the data. The thesis concludes that the WWI is a suitable framework for broad data reuse and in addition demonstrates its application in managing data in the air travel industry, where security by privacy could play a significant role in controlling the flow of data among its 'internet of things' with multiple stakeholders.
APA, Harvard, Vancouver, ISO, and other styles
5

Meyer, K. C. (Kobus Cornelius). "Development of a GIS for sea rescue." Thesis, Stellenbosch : Stellenbosch University, 2003. http://hdl.handle.net/10019.1/53360.

Full text
Abstract:
Saving the life of another person cannot be measured in monetary terms. It is also impossible to describe to anybody the satisfaction of carrying out a successful rescue. However, the disappointment and sense of failure when a rescue mission fails and a life is lost is devastating. Many rescue workers, including those of the National Sea Rescue Institute (NSRI), have experienced this overwhelming sense of failure. Rescue workers often dwell on failed rescue attempts, wishing that they could have arrived on the scene earlier or had known where to start looking for people. The fact that lives are still lost, despite the best efforts of rescue workers, points to the need to improve life-saving techniques, procedures, equipment and technology. Providing the NSRI with a workable tool to help them manage and allocate resources, plan a rescue, determine drift speed and distance or create search patterns may one day be just enough to save one more life. With this goal in mind, a search and rescue application, called RescueView, was developed utilising ArcView 3.2a. This application was specifically designed for use by the NSRI, and it will be used as a command centre in all NSRI control rooms and for all rescue efforts.
APA, Harvard, Vancouver, ISO, and other styles
6

Goncharuk, Elena. "A case study on pragmatic software reuse." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-300047.

Full text
Abstract:
Software reuse has become a very active and demanding research area. The research on this topic is extensive, but there is a gap between theory and industrial reuse practices. There is a need to connect the theoretical and practical aspects of reuse, especially for reuse which is not performed in a planned, systematic manner. This case study investigates a real-world case of pragmatic software reuse with the help of existing academic research. The investigation includes a literature study on software reuse processes, which are summarised in the form of a general process model of pragmatic software reuse. A case of pragmatic reuse performed in industry is then analysed and compared to the proposed model, as well as to additional academic research. The real-world reuse process is shown to closely follow the proposed general model.
APA, Harvard, Vancouver, ISO, and other styles
7

Lima, Lucas Albertins de. "Test case prioritization based on data reuse for Black-box environments." Universidade Federal de Pernambuco, 2009. https://repositorio.ufpe.br/handle/123456789/1922.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Jaldell, Henrik. "Essays on the performance of fire and rescue services /." Göteborg : Dept. of Economics, School of Economics and Commercial Law [Nationalekonomiska institutionen, Handelshögsk.], Univ, 2002. http://www.handels.gu.se/epc/data/html/html/PDF/JaldelldissNE.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Rodriguez, Alfredo. "Frequency reuse through RF power management in ship-to-ship data networks." Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 1997. http://handle.dtic.mil/100.2/ADA341252.

Full text
Abstract:
Thesis (M.S. in Electrical Engineering), Naval Postgraduate School, December 1997. Thesis advisor(s): Chin-Hwa Lee. Includes bibliographical references (p. 59). Also available online.
APA, Harvard, Vancouver, ISO, and other styles
10

Arnesen, Adam T. "Increasing Design Productivity for FPGAs Through IP Reuse and Meta-Data Encapsulation." BYU ScholarsArchive, 2011. https://scholarsarchive.byu.edu/etd/2614.

Full text
Abstract:
As Moore's law continues to progress, it is becoming increasingly difficult for hardware designers to fully utilize the increasing number of transistors available in semiconductor devices, including FPGAs. This design productivity gap must be addressed to allow designs to take full advantage of the increased logic density that results from rising transistor density. The reuse of previously developed and verified intellectual property (IP) is one approach that has been claimed to narrow the design productivity gap. Reuse, however, has proved difficult to realize in practice because of the complexity of IP and the reluctance of designers to reuse IP that they do not understand. This thesis proposes to narrow the design productivity gap for FPGAs by simplifying the reuse problem by encapsulating IP with extra machine-readable information, or meta-data. This meta-data simplifies reuse by providing a language-independent format for composing complex systems, providing a parameter representation system, defining high-level data types for FPGA IP, and allowing arbitrary IP to be described as actors in the homogeneous synchronous dataflow model of computation. This work implements meta-data in XML and presents two XML schemas that enable reuse. A new XML schema known as CHREC XML is presented, as well as extensions that enable IP-XACT to be used to describe FPGA dataflow IP. Two tools developed in this work are also presented that leverage meta-data to simplify reuse of arbitrary IP. These tools simplify structural composition of IP, allow designers to manipulate parameters, check and validate high-level data types, and automatically synthesize control circuitry for dataflow designs. Productivity improvements are also demonstrated by reusing IP to quickly compose software radio receivers.
APA, Harvard, Vancouver, ISO, and other styles
11

Gordon, Justin. "Research on post commencement finance data from South African companies in business rescue." Master's thesis, University of Cape Town, 2018. http://hdl.handle.net/11427/29551.

Full text
Abstract:
South Africa (SA) has one of the lowest survival rates of small and medium enterprises (hereafter referred to as "SMEs") in the world (Edmore, December 2011). Business rescue is therefore critical in developing SA's economy, as defined in Section 7(b)(i) of the Companies Act, No. 71 of 2008 ("the Act"), which reads: "Promote the development of the South African Economy by encouraging entrepreneurship and enterprise efficiency". The literature on business rescue concludes that post commencement finance is critical to the success of business rescue. However, to date, no research has been performed on actual data collected from practitioners to answer the question of whether post commencement finance is a predictor of a successful business rescue. The findings of this study initially contradict the literature insofar as 56% of business rescues received post commencement finance; however, further investigation showed that only 7% of the companies in this study received post commencement finance from third-party financial institutions, with the balance being introduced by shareholders. The main finding of this study was that the introduction of post commencement finance is only a partial predictor of a successful business rescue: of those companies which received finance under business rescue, only 57% were successful. Another finding of this study is that the combination that provides the best probability of a successful business rescue is when equity in the business rescue company is made available after the successful adoption of the business rescue plan.
APA, Harvard, Vancouver, ISO, and other styles
12

Björklund, Rickard. "Software Reuse in Game Development : Creating Building Blocks for Prototyping." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-259370.

Full text
Abstract:
As games and the technologies used by them have become more advanced, the cost of producing games has increased. Today, the latest AAA titles are the result of hundreds or even thousands of people working full-time for years, and even developing a prototype requires a large investment. This project sets out to reduce that cost by looking at how reusable building blocks can be used in the prototyping process. During the project, seven interviews with game designers were conducted. The interviews found that building character controllers for the player was the most common activity and one of the more difficult tasks when prototyping a new game. As a result, a tool for creating character controllers was made. The tool builds the character controllers to work as state machines, where actions in a state and transitions between states are editable through a visual programming language. The visual programming language uses nodes, and these nodes work as reusable building blocks. The tool was evaluated by six game designers and four programmers, who all thought the tool used a good approach for building and prototyping character controllers. The evaluation also showed that the building blocks, in the form of nodes in the tool, should be functionally small and general, like nodes for applying forces and accessing character data.
APA, Harvard, Vancouver, ISO, and other styles
13

Hutchings, Elizabeth Helen. "The secondary use and linking of administrative health data and clinical trial data in cancer: attitudes towards data reuse in Australia." Thesis, The University of Sydney, 2021. https://hdl.handle.net/2123/26924.

Full text
Abstract:
Since the introduction of personal computing technology in 1974, the capacity to collect, store, and analyse data has grown exponentially. Terms such as 'big data' and 'data mining' were coined shortly after, with the potential for large datasets and the analysis of stored data to generate important and useful outcomes being recognised early. Since then, the near universal adoption of technology by individuals and organisations has seen data move from being 'just information' to an organisational asset which requires clear governance structures and curation. As technology continues to become more interconnected, increasing amounts of data will be generated, collated, and analysed to inform new technologies and processes. Healthcare, like other sectors, collects a significant amount of data, which will also continue to grow over time. This data can be leveraged to improve health systems and provide insight into disease presentations and treatment patterns. The research for this thesis focused on the attitudes of healthcare consumers and researchers and/or clinicians towards the reuse (secondary data analysis) of cancer data in Australia. The thesis examines the secondary use of two types of data: administrative health data, collected during an individual's interaction with the health system, and clinical trial data, collected during an individual's participation in a clinical trial. While administrative health data are considered more reflective of a real-world population, their collection is primarily for economic reimbursement, not research purposes. Criticisms of this data include that it may be incomplete and of poor quality. In contrast, clinical trial data, whilst usually of high quality, is only collected in a well-defined, highly selected population, meaning that it is more difficult to generalise the findings to a wider population. The thesis commenced with a systematic literature review, which found that health data by its nature is sensitive, and attitudes towards its secondary use are highly individualised. Central themes identified in the systematic literature review included issues related to trust, transparency, data ownership, privacy, information security, and consent. The review also found a paucity of information about attitudes towards data reuse in cancer, highlighting the need for continued research in this area. Three papers summarising the results of the review are now published. Based on the systematic literature review, a series of semi-structured interviews were conducted with professional stakeholders with an interest in cancer research. Results highlighted similar concerns as those raised in the systematic literature review. A paper summarising these results has been submitted for publication. Based on the findings of both the systematic literature review and the structured interviews, two online anonymous questionnaires were developed to determine the attitudes of: 1) Australian healthcare consumers with a previous diagnosis of breast cancer, and 2) Australian and New Zealand researchers and/or clinicians, towards the secondary use of data. Responses to the questionnaires were diverse, indicating that attitudes towards health data reuse in Australia are unique to the individual, and reflect the findings of both the systematic literature review and structured interviews. Two papers reporting these results have been submitted for publication. Collectively, these results suggest that a continued dialogue about the reuse of data in cancer is required.
This thesis concludes with recommendations for secondary data use in cancer in Australia and areas for future research.
APA, Harvard, Vancouver, ISO, and other styles
14

Schaible, Johann [Verfasser]. "TermPicker: Recommending Vocabulary Terms for Reuse When Modeling Linked Open Data / Johann Schaible." Kiel : Universitätsbibliothek Kiel, 2017. http://d-nb.info/1127044257/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
15

King, Bradley. "Data Center Conversion: The Adaptive Reuse of a Remote Textile Mill in Augusta, Georgia." University of Cincinnati / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1459438392.

Full text
APA, Harvard, Vancouver, ISO, and other styles
16

Thorstensson, Eleonor. "Important properties for requirements reuse tools." Thesis, University of Skövde, Department of Computer Science, 2002. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-741.

Full text
Abstract:
Requirements reuse is based upon the idea that it is possible to reuse requirements from previous software development projects. Requirements reuse leads to efficiency and quality gains in the beginning of the development process. One of the obstacles to requirements reuse is the lack of appropriate tools for it. This work approaches this obstacle by identifying properties that are important, that is, properties that represent something with so much influence that it should be present in a requirements reuse tool. These identified properties may then guide the development of requirements reuse tools.

In order to find the properties, this work was conducted as a literature study in which both tool-specific and non-tool-specific articles were searched in order to elicit the properties. The work focuses on properties present in both tool-specific and non-tool-specific articles. This makes the result more reliable, since two different sources have identified them. 18 verified properties were identified through this work.
APA, Harvard, Vancouver, ISO, and other styles
17

Prats-Abadia, Esther. "On the systematic reuse of legacy data in distributed object-based enterprise resource planning software." Thesis, Loughborough University, 2000. https://dspace.lboro.ac.uk/2134/34563.

Full text
Abstract:
The study concerns the development and testing of a systematic approach to reuse of legacy data. A key aspect of the approach is that it is designed to transform relational data, as might typically be stored in a source database, into object-oriented data. By building upon and extending the use of a set of general-purpose software engineering concepts the systematic approach has many potential application areas. However, within this study focus has been on testing the utility and practicality of the approach in the domain of enterprise resource planning (ERP). The systematic approach is based on a process consisting of four main steps, two of which are centred on the use of an algorithm which was conceived and developed to automate the processing of relational schema and data input by expert users so that object schema can be drawn out, new object classes can be added and redundant object classes removed.
APA, Harvard, Vancouver, ISO, and other styles
18

Larsson, Henrik. "Öppna data : En teknisk undersökning av Sveriges offentliga sektor." Thesis, Mittuniversitetet, Avdelningen för informations- och kommunikationssystem, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-28125.

Full text
Abstract:
Today's information society faces constant new challenges, one of which is how the public sector provides information to its citizens. Sweden has long been a pioneer when it comes to openness and transparency. According to a study by the European Commission, Sweden is a well-functioning digital society, while the digital transparency and openness of the public sector is still lower than in other countries. The problem is that Sweden's public sector has not kept up with developments regarding open data, and over the years the work has taken place in a decentralized manner. Timothy Berners-Lee, a British computer engineer and founder of the World Wide Web, advocates open linked data and urges data owners to release their data as soon as possible, regardless of format. This study is based on Berners-Lee's theory of open data and comprises a literature study, a survey and a proof-of-concept model. The purpose of this study is to make a technical analysis of open data in Sweden's public organizations and to present a picture of the current situation, as there are few technical studies about it. According to this study, Berners-Lee's theory is fully applicable in Sweden, and there are several factors that the public sector should take into account when publishing data on the Internet. The technical potential for Sweden to comply with the EU directive already exists, and the biggest pitfalls have been a lack of knowledge in certain authorities and certain legal obstacles. The Swedish government needs to make a larger national effort to centralize the work with open data in order to once again turn Sweden into a leading country in terms of digital transparency and openness.
APA, Harvard, Vancouver, ISO, and other styles
19

Brushett, Ben Amon. "Assessment of Metocean Forecast Data and Consensus Forecasting for Maritime Search and Rescue and Pollutant Response Applications." Thesis, Griffith University, 2015. http://hdl.handle.net/10072/367700.

Full text
Abstract:
Effective prediction of objects drifting on the water surface is essential to successful maritime search and rescue (SAR) services, since a more accurate prediction of the object's likely location results in a greater probability for the success of a SAR operation. SAR drift models, based on Lagrangian stochastic particle trajectory models, are frequently utilised for this task. More recently, metocean (meteorological and oceanographic) forecast data has been used as input to these models to provide the environmental forcing (due to winds and ocean currents) that the object may be subject to. Further, the slip of a drifting object across the water surface due to the ambient wind and waves (irrespective of currents) is described by its leeway drift coefficients, which are also required by the SAR drift model to calculate the potential drift of the object. This study examined several ways to improve the prediction of an object's drift on the water surface, with the primary focus being the improvement of SAR forecasting. To achieve this, many simulations were undertaken, comparing the trajectories of actual drifters deployed in the ocean and the corresponding model simulations of drift, using the commercially available SARMAP (Search and Rescue Mapping and Analysis Program) SAR drift model. Each drifter trajectory was simulated independently using a different ocean model to provide ocean current forcing. The ocean models tested included BLUElink, FOAM (Forecasting Ocean Assimilation Model), HYCOM (Hybrid Coordinate Ocean Model) and NCOM (Navy Coastal Ocean Model).
APA, Harvard, Vancouver, ISO, and other styles
20

Kavadiya, Jenis. "Test Derivation and Reuse Using Horizontal Transformation of System Models." Thesis, Mälardalen University, School of Innovation, Design and Engineering, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-7808.

Full text
APA, Harvard, Vancouver, ISO, and other styles
21

Zemirline, Nadjet. "Assisting in the reuse of existing materials to build adaptive hypermedia." Phd thesis, Université Paris Sud - Paris XI, 2011. http://tel.archives-ouvertes.fr/tel-00664996.

Full text
Abstract:
Nowadays, there is a growing demand for personalization, and the "one-size-fits-all" approach for hypermedia systems is no longer applicable. Adaptive hypermedia (AH) systems adapt their behavior to the needs of individual users. However, due to the complexity of their authoring process and the different skills required from authors, only a few of them have been proposed. In recent years, numerous efforts have been put into proposing assistance for authors to create their own AH. However, as explained in this thesis, some problems remain. In this thesis, we tackle two particular problems. The first problem concerns the integration of authors' materials (information and user profile) into models of existing systems, thus allowing authors to directly reuse existing reasoning and execute it on their materials. We propose a semi-automatic merging/specialization process to integrate an author's model into a model of an existing system. Our objectives are twofold: to create a support for defining mappings between elements in a model of an existing system and elements in the author's model, and to help create consistent and relevant models integrating the two models and taking into account the mappings between them. The second problem concerns the adaptation specification, which is famously the hardest part of the authoring process of adaptive web-based systems. We propose an EAP framework with three main contributions: a set of elementary adaptation patterns for adaptive navigation, a typology organizing the proposed elementary adaptation patterns, and a semi-automatic process to generate adaptation strategies based on the use and combination of patterns. Our objective is to easily define adaptation strategies at a high level by combining simple ones. Furthermore, we have studied the expressivity of some existing solutions allowing the specification of adaptation versus the EAP framework, thus discussing, based on this study, the pros and cons of various decisions in terms of the ideal way of defining an adaptation language. We propose a unified vision of adaptation and adaptation languages, based on the analysis of these solutions and our framework, as well as a study of the adaptation expressivity and the interoperability between them, resulting in an adaptation typology. The unified vision and adaptation typology are not limited to the solutions analysed, and can be used to compare and extend other approaches in the future. Besides these theoretical qualitative studies, this thesis also describes implementations and experimental evaluations of our contributions in an e-learning application.
APA, Harvard, Vancouver, ISO, and other styles
22

Dufbäck, Dennis, and Fredrik Håkansson. "Adapting network interactions of a rescue service mobile application for improved battery life." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-139836.

Full text
Abstract:
Today, it is not unusual that smartphone devices cannot survive even one day of regular use before the battery needs to be recharged. The batteries are drained by power-hungry applications made by developers who have not taken their application's energy impact into consideration. In this thesis we study the network transmissions made by a mobile application and the impact these have on battery life. The application was developed with the local rescue and emergency service as a hypothetical target group. We test how the mobile network technologies 3G and WiFi, together with the device's current signal strength and battery level, affect the energy usage of the battery when uploading data to a server. We develop an adaptation mechanism at the application level which uses a mathematical model for calculating a suitable adaptation of the scheduling of network interactions. The adaptation mechanism makes use of burst buffering of packets, and adjusts for 3G tail times as well as for different priorities of incoming requests. Custom packet scheduling profiles are made to obtain consistent measurements, and with this implementation we are able to reduce the amount of energy consumed using 3G and WiFi by 67% and 39%, respectively, during tests.
APA, Harvard, Vancouver, ISO, and other styles
23

Hu, Hang. "Characterizing and Detecting Online Deception via Data-Driven Methods." Diss., Virginia Tech, 2020. http://hdl.handle.net/10919/98575.

Full text
Abstract:
In recent years, online deception has become a major threat to information security. Online deception that causes significant consequences is usually spear phishing. Spear-phishing emails come in a very small volume, target a small number of audiences, sometimes impersonate a trusted entity and use very specific content to redirect targets to a phishing website, where the attacker tricks targets into sharing their credentials. In this thesis, we aim at measuring the entire process. Starting from phishing emails, we examine anti-spoofing protocols, analyze email services' policies and warnings towards spoofed emails, and measure the email tracking ecosystem. With phishing websites, we implement a powerful tool to detect domain name impersonation and detect phishing pages using dynamic and static analysis. We also analyze credential sharing on phishing websites, and measure what happens after victims share their credentials. Finally, we discuss potential phishing and privacy concerns on new platforms such as Alexa and Google Assistant. In the first part of this thesis (Chapter 3), we focus on measuring how email providers detect and handle forged emails. We also try to understand how forged emails can reach user inboxes by deliberately composing emails. Finally, we check how email providers warn users about forged emails. In the second part (Chapter 4), we measure the adoption of anti-spoofing protocols and seek to understand the reasons behind the low adoption rates. In the third part of this thesis (Chapter 5), we observe that a lot of phishing emails use email tracking techniques to track targets. We collect a large dataset of email messages using disposable email services and measure the landscape of email tracking. In the fourth part of this thesis (Chapter 6), we move on to phishing websites. We implement a powerful tool to detect squatting domains and train a machine learning model to classify phishing websites. In the fifth part (Chapter 7), we focus on credential leaks. More specifically, we measure what happens after the targets' credentials are leaked, and we monitor and measure potential post-phishing exploitation activities. Finally, with new voice platforms such as Alexa becoming more and more popular, we wonder whether new phishing and privacy concerns emerge with these platforms. In this part (Chapter 8), we systematically assess the attack surfaces by measuring sensitive applications on voice assistant systems. This thesis measures important parts of the complete process of online deception. With a deeper understanding of phishing attacks, more complete and effective defense mechanisms can be developed to mitigate attacks in various dimensions.

In recent years, online deception has become a major threat to information security. The most common form of online deception starts with a phishing email that redirects targets to a phishing website, where the attacker tricks targets into sharing their credentials. General phishing emails are relatively easy to recognize from both the target's and the defender's perspective: they usually come from strange addresses, their content is usually very general, and they arrive in large volume. However, the online deception that causes significant consequences is usually spear phishing. Spear-phishing emails come in a very small volume, target a small number of audiences, sometimes impersonate a trusted entity and use very specific content to redirect targets to a phishing website, where the attacker tricks targets into sharing their credentials. Sometimes, attackers use domain impersonation techniques to make the phishing website even more convincing. In this thesis, we measure the entire process. Starting from phishing emails, we examine anti-spoofing protocols, analyze email services' policies and warnings towards spoofed emails, and measure the email tracking ecosystem. With phishing websites, we implement a tool to detect domain name impersonation and detect phishing pages using dynamic and static analysis. We also study credential sharing on phishing websites and measure what happens after targets share their credentials. Finally, we analyze potential phishing and privacy concerns on new platforms such as Alexa and Google Assistant.
APA, Harvard, Vancouver, ISO, and other styles
24

Wakchaure, Abhijit. "Exploring techniques for measurement and improvement of data quality with application to determination of the last known position (LKP) in search and rescue (SAR) data." Doctoral diss., University of Central Florida, 2011. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/5076.

Full text
Abstract:
There is a tremendous volume of data being generated in today's world. As organizations around the globe realize the increased importance of their data as a valuable asset in gaining a competitive edge in a fast-paced and dynamic business world, more and more attention is being paid to the quality of that data. Advances in the fields of data mining, predictive modeling, text mining, web mining, business intelligence, health care analytics, etc. all depend on clean, accurate data. That one cannot effectively mine data which is dirty comes as no surprise. This research is an exploratory study of different domain data sets, addressing the data quality issues specific to each domain, identifying the challenges faced and arriving at techniques or methodologies for measuring and improving the data quality. The primary focus of the research is on the SAR, or Search and Rescue, dataset, identifying key issues related to data quality therein and developing an algorithm for improving the data quality. SAR missions, which are routinely conducted all over the world, show a trend of increasing mission costs. Retrospective studies of historic SAR data not only allow for a detailed analysis and understanding of SAR incidents and patterns, but also form the basis for generating probability maps, analytical data models, etc., which allow for an efficient use of valuable SAR resources and their distribution. One of the challenges with regard to the SAR dataset is that the collection process is not perfect. Often, the LKP, or Last Known Position, is not known or cannot be arrived at. The goal is to fully or partially geocode the LKP for as many data points as possible, identify those data points where the LKP cannot be geocoded at all, and further highlight the underlying data quality issues. To this end, the SAR Algorithm has been developed, which makes use of partial or incomplete information, cleans and validates the data, and further extracts address information from relevant fields to successfully geocode the data. The algorithm improves the geocoding accuracy and has been validated by a set of approaches.
APA, Harvard, Vancouver, ISO, and other styles
25

Kalyadin, Dmitry. "Robot data and control server for Internet-based training on ground robots." [Tampa, Fla.] : University of South Florida, 2007. http://purl.fcla.edu/usf/dc/et/SFE0002111.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Stöver, Ben [Verfasser], and Kai F. [Akademischer Betreuer] Müller. "Software components for increased data reuse and reproducibility in phylogenetics and phylogenomics / Ben Stöver ; Betreuer: Kai F. Müller." Münster : Universitäts- und Landesbibliothek Münster, 2018. http://d-nb.info/1173248579/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Zia, Huma. "Enabling proactive agricultural drainage reuse for improved water quality through collaborative networks and low-complexity data-driven modelling." Thesis, University of Southampton, 2015. https://eprints.soton.ac.uk/384511/.

Full text
Abstract:
With the increasing prevalence of Wireless Sensor Networks (WSNs) in agriculture and hydrology, there exists an opportunity to provide a technologically viable solution for the conservation of already scarce fresh water resources. In this thesis, a novel framework is proposed for enabling proactive management of agricultural drainage and nutrient losses at farm scale, where complex models are replaced by in-situ sensing, communication and low-complexity predictive models suited to autonomous operation. This is achieved through the development of the proposed Water Quality Management using Collaborative Monitoring (WQMCM) framework, which combines local farm-scale WSNs through an information sharing mechanism. Under the proposed WQMCM framework, various functional modules are developed to demonstrate the overall mechanism: (1) neighbour learning and linking, (2) low-complexity predictive models for drainage dynamics, (3) a low-complexity predictive model for nitrate losses, and (4) a decision support model for drainage and nitrate reusability. The predictive models for drainage dynamics and nitrate losses are developed by abstracting model complexity from the traditional models (the National Resource Conservation Method (NRCS) and the De-Nitrification-DeComposition (DNDC) model, respectively). Machine learning algorithms such as the M5 decision tree, multiple linear regression, artificial neural networks, C4.5, and Naïve Bayes are used in this thesis. For the predictive models, validation is performed using a 12-month-long event dataset from a sub-catchment in Ireland. Overall, the following contributions are achieved: (1) framework architecture and implementation of WQMCM for a networked catchment, (2) model development for low-complexity drainage discharge dynamics and nitrate losses, reducing the number of model parameters to less than 50%, (3) validation of the predictive models for drainage and nitrate losses using the M5 tree algorithm and measured catchment data, with the modelling results additionally compared with existing models and further tested using other learning algorithms, and (4) development of a decision support model, based on the Naïve Bayes algorithm, for suggesting the reusability of drainage and nitrate losses.
APA, Harvard, Vancouver, ISO, and other styles
28

Kunda, Saketh Ram. "Methods to Reuse CAD Data in Tessellated Models for Efficient Visual Configurations : An Investigation for Tacton Systems AB." Thesis, KTH, Skolan för industriell teknik och management (ITM), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-281748.

Full text
Abstract:
Data loss has always been a side effect of sharing 3D models between different CAD systems. Continuous research and new frameworks have been implemented to minimise data loss in CAD and, separately, in downstream applications such as 3D visual graphics applications (e.g. 3DS Max, Blender). As a first step into this research area, this thesis is an explorative study aimed at understanding the problem of CAD data loss when exchanging models between a CAD application and a visual application. The thesis was performed at Tacton Systems, which provides product configurations to its customers in both CAD and visual environments, and hence the research is focused on reusing the CAD data in visual applications or restoring the data after exchange. The research questions are framed to answer the reasons for data loss and to address possible implementation techniques at the company. Being a niche topic, the thesis required input from different perspectives and knowledge sharing from people outside the company, which shows the significance of open innovation in technology-oriented companies. Ten different ideas were brainstormed and developed into concepts to solve the problem of data loss. All the concepts were analysed and evaluated to check the functionality and feasibility of implementing them within the company workflow. The evaluations resulted in several concepts that are capable of solving the research problem. These have also been verified with various people internal and external to the company. The results also highlight the strengths and weaknesses of each of these concepts, giving clear instructions to the company on the next steps.
APA, Harvard, Vancouver, ISO, and other styles
29

Salter, Chris. "Economics of fire : exploring fire incident data for a design tool methodology." Thesis, Loughborough University, 2013. https://dspace.lboro.ac.uk/2134/13199.

Full text
Abstract:
Fires within the built environment are a fact of life, and through design and the application of the building regulations and design codes, the risk of fire to the building occupants can be minimised. However, the building regulations within the UK do not deal with property protection and focus solely on the safety of the building occupants. This research details the statistical analysis of the UK Fire and Rescue Service's and the Fire Protection Association's fire incident databases to create a loss model framework, allowing the designers of a building's fire safety systems to conduct a cost-benefit analysis on installing additional fire protection solely for property protection. The statistical analysis of the FDR 1 incident database shows that the data collection methods of the Fire and Rescue Service would ideally need to change to allow further risk analysis of the UK building stock; that the main factors affecting the size of a fire are the time from ignition to discovery and the presence of dangerous materials; that sprinkler activation rates may not be as high as claimed by sprinkler groups; and that the activation of an alarm system is associated with smaller fires. The original contribution to knowledge of this PhD is the analysis of the FDR 1 database to create a loss model, using data from both the Fire Protection Association and the Fire and Rescue Service.
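The loss-model idea above lends itself to a very small worked example. The sketch below is purely illustrative and is not the thesis's loss model: every figure (fire frequency, loss sizes, protection cost) is a hypothetical placeholder, and the calculation only shows the general shape of a property-protection cost-benefit comparison.

```python
# Illustrative sketch only: a toy expected-loss comparison of the kind a
# property-protection cost-benefit analysis might perform. All figures
# (fire frequency, loss sizes, costs) are hypothetical placeholders, not
# values taken from the FDR 1 or FPA databases.

def expected_annual_loss(fire_rate_per_year: float, mean_loss: float) -> float:
    """Expected annual fire loss = annual fire frequency x mean loss per fire."""
    return fire_rate_per_year * mean_loss

def net_benefit_of_protection(
    fire_rate_per_year: float,
    mean_loss_unprotected: float,
    mean_loss_protected: float,
    annualised_protection_cost: float,
) -> float:
    """Reduction in expected annual loss minus the annualised cost of the measure."""
    saving = expected_annual_loss(fire_rate_per_year, mean_loss_unprotected) - \
             expected_annual_loss(fire_rate_per_year, mean_loss_protected)
    return saving - annualised_protection_cost

if __name__ == "__main__":
    # Hypothetical example: sprinklers shrink the mean loss per fire.
    benefit = net_benefit_of_protection(
        fire_rate_per_year=0.02,        # one fire every 50 years (assumed)
        mean_loss_unprotected=250_000,  # assumed mean loss without sprinklers
        mean_loss_protected=40_000,     # assumed mean loss with sprinklers
        annualised_protection_cost=3_000,
    )
    print(f"Net annual benefit of the measure: {benefit:,.0f}")  # -> 1,200
```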
APA, Harvard, Vancouver, ISO, and other styles
30

Friedrich, Tanja. "Looking for data." Doctoral thesis, Humboldt-Universität zu Berlin, 2020. http://dx.doi.org/10.18452/22173.

Full text
Abstract:
Die Informationsverhaltensforschung liefert zahlreiche Erkenntnisse darüber, wie Menschen Informationen suchen, abrufen und nutzen. Wir verfügen über Forschungsergebnisse zu Informationsverhaltensmustern in einem breiten Spektrum von Kontexten und Situationen, aber wir wissen nicht genug über die Informationsbedürfnisse und Ziele von Forschenden hinsichtlich der Nutzung von Forschungsdaten. Die Informationsverhaltensforschung gibt insbesondere Aufschluss über das literaturbezogene Informationsverhalten. Die vorliegende Studie basiert auf der Annahme, dass diese Erkenntnisse nicht ohne weiteres auf datenbezogenes Informationsverhalten übertragen werden können. Um diese Annahme zu untersuchen, wurde eine Studie zum Informationssuchverhalten von Datennutzenden durchgeführt. Übergeordnetes Ziel der Studie war es, Erkenntnisse über das Informationsverhalten der Nutzenden eines bestimmten Retrievalsystems für sozialwissenschaftliche Daten zu erlangen, um die Entwicklung von Forschungsdateninfrastrukturen zu unterstützen, die das Data Sharing erleichtern sollen. Das empirische Design dieser Studie folgt einem Mixed-Methods-Ansatz. Dieser umfasst eine qualitative Studie in Form von Experteninterviews und – darauf aufbauend – eine quantitative Studie in Form einer Online-Befragung von Sekundärnutzenden von Daten aus Bevölkerungs- und Meinungsumfragen (Umfragedaten). Im Kern hat die Untersuchung ergeben, dass die Einbindung in die Forschungscommunity bei der Datensuche eine zentrale Rolle spielt. Die Analysen zeigen, dass Communities eine wichtige Determinante für das Informationssuchverhalten sind. Die Einbindung in die Community hat das Potential, Probleme oder Barrieren bei der Datensuche zu reduzieren. Diese Studie trägt zur Theorieentwicklung in der Informationsverhaltensforschung durch die Modellierung des Datensuchverhaltens bei. In praktischer Hinsicht gibt die Studie Empfehlungen für das Design von Dateninfrastrukturen, basierend auf empirischen Anforderungsanalysen.<br>From information behaviour research we have a rich knowledge of how people are looking for, retrieving, and using information. We have scientific evidence for information behaviour patterns in a wide scope of contexts and situations, but we don’t know enough about researchers’ information needs and goals regarding the usage of research data. Having emerged from library user studies, information behaviour research especially provides insight into literature-related information behaviour. This thesis is based on the assumption that these insights cannot be easily transferred to data-related information behaviour. In order to explore this assumption, a study of secondary data users’ information-seeking behaviour was conducted. The study was designed and evaluated in comparison to existing theories and models of information-seeking behaviour. The overall goal of the study was to create evidence of actual information practices of users of one particular retrieval system for social science data in order to inform the development of research data infrastructures that facilitate data sharing. The empirical design of this study follows a mixed methods approach. This includes a qualitative study in the form of expert interviews and – building on the results found therein – a quantitative web survey of secondary survey data users. The core result of this study is that community involvement plays a pivotal role in survey data seeking. 
The analyses show that survey data communities are an important determinant in survey data users' information seeking behaviour and that community involvement facilitates data seeking and has the capacity of reducing problems or barriers. Community involvement increases with growing experience, seniority, and data literacy. This study advances information behaviour research by modelling the specifics of data seeking behaviour. In practical respect, the study specifies data-user oriented requirements for systems design.
APA, Harvard, Vancouver, ISO, and other styles
31

Silva, Daniel Lins da. "Estratégia computacional para apoiar a reprodutibilidade e reuso de dados científicos baseado em metadados de proveniência." Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/3/3141/tde-05092017-095907/.

Full text
Abstract:
A ciência moderna, apoiada pela e-science, tem enfrentado desafios de lidar com o grande volume e variedade de dados, gerados principalmente pelos avanços tecnológicos nos processos de coleta e processamento dos dados científicos. Como consequência, houve também um aumento na complexidade dos processos de análise e experimentação. Estes processos atualmente envolvem múltiplas fontes de dados e diversas atividades realizadas por grupos de pesquisadores geograficamente distribuídos, que devem ser compreendidas, reutilizadas e reproduzíveis. No entanto, as iniciativas da comunidade científica que buscam disponibilizar ferramentas e conscientizar os pesquisadores a compartilharem seus dados e códigos-fonte, juntamente com as publicações científicas, são, em muitos casos, insuficientes para garantir a reprodutibilidade e o reuso das contribuições científicas. Esta pesquisa objetiva definir uma estratégia computacional para o apoio ao reuso e a reprodutibilidade dos dados científicos, por meio da gestão da proveniência dos dados durante o seu ciclo de vida. A estratégia proposta nesta pesquisa é apoiada em dois componentes principais, um perfil de aplicação, que define um modelo padronizado para a descrição da proveniência dos dados, e uma arquitetura computacional para a gestão dos metadados de proveniência, que permite a descrição, armazenamento e compartilhamento destes metadados em ambientes distribuídos e heterogêneos. Foi desenvolvido um protótipo funcional para a realização de dois estudos de caso que consideraram a gestão dos metadados de proveniência de experimentos de modelagem de distribuição de espécies. Estes estudos de caso possibilitaram a validação da estratégia computacional proposta na pesquisa, demonstrando o seu potencial no apoio à gestão de dados científicos.<br>Modern science, supported by e-science, has faced challenges in dealing with the large volume and variety of data generated primarily by technological advances in the processes of collecting and processing scientific data. Therefore, there was also an increase in the complexity of the analysis and experimentation processes. These processes currently involve multiple data sources and numerous activities performed by geographically distributed research groups, which must be understood, reused and reproducible. However, initiatives by the scientific community with the goal of developing tools and sensitize researchers to share their data and source codes related to their findings, along with scientific publications, are often insufficient to ensure the reproducibility and reuse of scientific results. This research aims to define a computational strategy to support the reuse and reproducibility of scientific data through data provenance management during its entire life cycle. Two principal components support our strategy in this research, an application profile that defines a standardized model for the description of provenance metadata, and a computational architecture for the management of the provenance metadata that enables the description, storage and sharing of these metadata in distributed and heterogeneous environments. We developed a functional prototype for the accomplishment of two case studies that considered the management of provenance metadata during the experiments of species distribution modeling. These case studies enabled the validation of the computational strategy proposed in the research, demonstrating the potential of this strategy in supporting the management of scientific data.
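As a rough illustration of the kind of provenance description the abstract refers to, the following sketch records one hypothetical species-distribution-modelling run as W3C PROV-O triples with rdflib. The URIs, identifiers and the selection of properties are illustrative assumptions, not the application profile defined in the thesis.

```python
# Minimal sketch (not the thesis's actual application profile): recording the
# provenance of a species-distribution-modelling run as W3C PROV-O triples
# with rdflib. Identifiers and URIs below are hypothetical examples.
from rdflib import Graph, Literal, Namespace, RDF
from rdflib.namespace import XSD

PROV = Namespace("http://www.w3.org/ns/prov#")
EX = Namespace("http://example.org/sdm/")          # hypothetical project namespace

g = Graph()
g.bind("prov", PROV)
g.bind("ex", EX)

occurrences = EX["dataset/occurrences-v1"]          # input entity
model_run = EX["activity/model-run-42"]             # the modelling activity
prediction = EX["dataset/suitability-map-v1"]       # output entity
researcher = EX["agent/researcher-1"]

g.add((occurrences, RDF.type, PROV.Entity))
g.add((prediction, RDF.type, PROV.Entity))
g.add((model_run, RDF.type, PROV.Activity))
g.add((researcher, RDF.type, PROV.Agent))

# Core provenance relations: what was used, what was generated, by whom.
g.add((model_run, PROV.used, occurrences))
g.add((prediction, PROV.wasGeneratedBy, model_run))
g.add((prediction, PROV.wasAttributedTo, researcher))
g.add((model_run, PROV.startedAtTime,
       Literal("2017-01-15T10:00:00", datatype=XSD.dateTime)))

print(g.serialize(format="turtle"))
```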
APA, Harvard, Vancouver, ISO, and other styles
32

Heaton, Tyler DeVoe. "Cloud Based IP Data Management Theory and Implementation for a Secure and Trusted Design Space." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu155498721009978.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

LAZAROVA, ELENA. "An Interoperable Clinical Cardiology Electronic Health Record System - a standards based approach for Clinical Practice and Research with Data Reuse." Doctoral thesis, Università degli studi di Genova, 2022. https://hdl.handle.net/11567/1103145.

Full text
Abstract:
Currently in hospitals, several information systems manage, very often autonomously, the patient’s personal, clinical and diagnostic data. This originates a clinical information management system consisting of a myriad of independent subsystems which, although efficient in their specific purpose, make the integration of the whole system very difficult and limit the use of clinical data, especially as regards the reuse of these data for research purposes. Mainly for these reasons, the management of the Genoese ASL3 decided to commission the University of Genoa to set up a medical record system that could be easily integrated with the rest of the information system already present, but which offered solid interoperability features, and which could support the research skills of hospital health workers. My PhD work aimed to develop an electronic health record system for a cardiology ward, obtaining a prototype which is functional and usable in a hospital ward. The choice of cardiology was due to the wide availability of the staff of the cardiology department to support me in the development and in the test phase. The resulting medical record system has been designed “ab initio” to be fully integrated into the hospital information system and to exchange data with the regional health information infrastructure. In order to achieve interoperability the system is based on the Health Level Seven standards for exchanging information between medical information systems. These standards are widely deployed and allow for the exchange of information in several functional domains. Specific decision support sections for particular aspects of the clinical life were also included. The data collected by this system were the basis for examples of secondary use for the development of two models based on machine learning algorithms. The first model allows to predict mortality in patients with heart failure within 6 months from their admission, and the second is focused on the discrimination between heart failure versus chronic ischemic heart disease in the elderly population, which is the widest population section served by the cardiological ward.
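To make the secondary-use idea concrete, the sketch below trains a generic six-month mortality classifier on synthetic stand-in features with scikit-learn. It is not the model, feature set or data described in the thesis; the clinical variables named in the comments are assumptions for illustration only.

```python
# Illustrative sketch only: a generic 6-month mortality classifier of the kind
# the abstract describes, trained on synthetic stand-in features. It is not
# the thesis's actual model, feature set, or data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n = 500
# Hypothetical features: age, ejection fraction, log NT-proBNP, creatinine.
X = np.column_stack([
    rng.normal(75, 8, n),
    rng.normal(35, 10, n),
    rng.normal(7.5, 1.0, n),
    rng.normal(1.2, 0.4, n),
])
# Synthetic outcome loosely tied to the features, for demonstration only.
logit = 0.04 * (X[:, 0] - 75) - 0.05 * (X[:, 1] - 35) + 0.6 * (X[:, 2] - 7.5)
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("AUC:", round(roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]), 3))
```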
APA, Harvard, Vancouver, ISO, and other styles
34

Chehade, Samer. "Designing a Customisable Communication System for Situation Awareness in Rescue Operations." Thesis, Troyes, 2021. http://www.theses.fr/2021TROY0007.

Full text
Abstract:
Cette thèse porte sur le problème d'awareness et des communications dans les opérations de secours. Nous cherchons à concevoir et à mettre en œuvre un système visant à simplifier les communications dans ces opérations en se basant sur des techniques de représentation sémantique et une personnalisation des usages. Pour être utilisé par les unités opérationnelles, il est essentiel de concevoir un tel système de manière à répondre à leurs besoins. De plus, afin de garantir la confidentialité des informations, il est essentiel d'intégrer des techniques de sécurité. Pour aborder ces aspects, nous proposons une approche pour concevoir les interfaces et les spécifications du système. Cette approche consiste en une méthodologie basée sur cinq étapes. Tout d'abord, nous modélisons les interactions entre les différentes parties sur la base de pratiques opérationnelles. Deuxièmement, nous formalisons ces interactions et connaissances à travers une ontologie d'application. Cette ontologie intègre des concepts liés au domaine du secours, à la conception de systèmes et à la sécurité de l'information. Ensuite, nous présentons une plate-forme pour concevoir le système. Basée sur l'ontologie développée, cette plateforme permettra aux utilisateurs finaux du système de définir leurs spécifications et de concevoir leurs interfaces de manière personnalisée. De plus, nous proposons une politique de contrôle d'accès basée sur l'ontologie proposée. Finalement, nous présentons un cas d'usage de la plateforme proposée<br>This thesis deals with the problem of awareness and communications in rescue operations. We aim to design and implement a communication system that simplifies information sharing in rescue operations based on semantic representation techniques and a customisation of uses. In order to be used by operational units, it is essential to design such a system in a way that meets their practical needs. Moreover, in order to guarantee the privacy of information, it is essential to integrate security techniques into the proposed system. Consequently, we propose in this thesis a novel approach for defining and designing the system's interfaces and specifications. This approach consists of a five-step methodology. First, we analyse and model communications and interactions between different stakeholders based on practical operations. Secondly, we formalise those interactions and knowledge through an application ontology. This ontology integrates concepts related to the rescue domain, to the design of systems and to information security. Afterwards, we present an ontology-based platform for designing the system. Based on the developed ontology, this platform will allow the end-users of the system to define its specifications and design its interfaces in a customised way. Moreover, we propose an access control and rights management policy based on the proposed ontology. Finally, we present a use case scenario of the proposed platform.
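The access-control idea can be illustrated with a minimal sketch: a role-based check of the kind an ontology-driven policy could enforce. The roles and information categories below are hypothetical and do not come from the thesis's ontology.

```python
# Illustrative sketch only: a tiny role-based check of the kind an
# ontology-driven access-control policy could enforce, with hypothetical
# roles and information categories (not concepts from the actual ontology).
from dataclasses import dataclass

# Hypothetical mapping from stakeholder roles to readable information classes.
ACCESS_POLICY = {
    "incident_commander": {"victim_status", "resource_status", "hazard_report"},
    "field_responder":    {"hazard_report", "resource_status"},
    "external_partner":   {"resource_status"},
}

@dataclass
class Message:
    category: str        # e.g. "victim_status"
    content: str

def can_read(role: str, message: Message) -> bool:
    """Grant access only if the role's policy lists the message category."""
    return message.category in ACCESS_POLICY.get(role, set())

msg = Message("victim_status", "Two persons evacuated from deck 3")
print(can_read("incident_commander", msg))   # True
print(can_read("external_partner", msg))     # False
```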
APA, Harvard, Vancouver, ISO, and other styles
35

Stoltzfus, Arlin, Hilmar Lapp, Naim Matasci, et al. "Phylotastic! Making tree-of-life knowledge accessible, reusable and convenient." BioMed Central, 2013. http://hdl.handle.net/10150/610243.

Full text
Abstract:
BACKGROUND: Scientists rarely reuse expert knowledge of phylogeny, in spite of years of effort to assemble a great "Tree of Life" (ToL). A notable exception involves the use of Phylomatic, which provides tools to generate custom phylogenies from a large, pre-computed, expert phylogeny of plant taxa. This suggests great potential for a more generalized system that, starting with a query consisting of a list of any known species, would rectify non-standard names, identify expert phylogenies containing the implicated taxa, prune away unneeded parts, and supply branch lengths and annotations, resulting in a custom phylogeny suited to the user's needs. Such a system could become a sustainable community resource if implemented as a distributed system of loosely coupled parts that interact through clearly defined interfaces. RESULTS: With the aim of building such a "phylotastic" system, the NESCent Hackathons, Interoperability, Phylogenies (HIP) working group recruited 2 dozen scientist-programmers to a weeklong programming hackathon in June 2012. During the hackathon (and a three-month follow-up period), 5 teams produced designs, implementations, documentation, presentations, and tests including: (1) a generalized scheme for integrating components; (2) proof-of-concept pruners and controllers; (3) a meta-API for taxonomic name resolution services; (4) a system for storing, finding, and retrieving phylogenies using semantic web technologies for data exchange, storage, and querying; (5) an innovative new service, DateLife.org, which synthesizes pre-computed, time-calibrated phylogenies to assign ages to nodes; and (6) demonstration projects. These outcomes are accessible via a public code repository (GitHub.com), a website (http://www.phylotastic.org), and a server image. CONCLUSIONS: Approximately 9 person-months of effort (centered on a software development hackathon) resulted in the design and implementation of proof-of-concept software for 4 core phylotastic components, 3 controllers, and 3 end-user demonstration tools. While these products have substantial limitations, they suggest considerable potential for a distributed system that makes phylogenetic knowledge readily accessible in computable form. Widespread use of phylotastic systems will create an electronic marketplace for sharing phylogenetic knowledge that will spur innovation in other areas of the ToL enterprise, such as annotation of sources and methods and third-party methods of quality assessment.
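The "pruning" step mentioned above is easy to illustrate. The sketch below reduces a small hypothetical expert tree to the taxa in a query and collapses internal nodes left with a single child; it is a stand-in for the idea only, not the Phylotastic services or their APIs.

```python
# Rough sketch of the pruning step: reduce a larger expert tree to just the
# taxa in a user's query, collapsing internal nodes left with a single child.
# Illustrative only; not the actual Phylotastic components.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    name: Optional[str] = None            # leaf taxon name, None for internal nodes
    children: List["Node"] = field(default_factory=list)

def prune(node: Node, keep: set) -> Optional[Node]:
    """Return a copy of the subtree containing only leaves in `keep`, or None."""
    if not node.children:                                   # leaf
        return Node(node.name) if node.name in keep else None
    kept = [c for c in (prune(ch, keep) for ch in node.children) if c is not None]
    if not kept:
        return None
    if len(kept) == 1:                                      # collapse unary node
        return kept[0]
    return Node(node.name, kept)

def to_newick(node: Node) -> str:
    if not node.children:
        return node.name or ""
    return "(" + ",".join(to_newick(c) for c in node.children) + ")"

# Tiny hypothetical expert tree: ((A,B),(C,(D,E)))
tree = Node(children=[
    Node(children=[Node("A"), Node("B")]),
    Node(children=[Node("C"), Node(children=[Node("D"), Node("E")])]),
])
print(to_newick(prune(tree, {"A", "C", "D"})) + ";")        # -> (A,(C,D));
```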
APA, Harvard, Vancouver, ISO, and other styles
36

GONÇALVES, JÚNIOR Paulo Mauricio. "Multivariate non-parametric statistical tests to reuse classifiers in recurring concept drifting environments." Universidade Federal de Pernambuco, 2013. https://repositorio.ufpe.br/handle/123456789/12226.

Full text
Abstract:
Data streams are a recent processing model where data arrive continuously, in large quantities, at high speeds, so that they must be processed on-line. Besides that, several private and public institutions store large amounts of data that also must be processed. Traditional batch classifiers are not well suited to handle huge amounts of data for basically two reasons. First, they usually read the available data several times until convergence, which is impractical in this scenario. Second, they imply that the context represented by data is stable in time, which may not be true. In fact, the context change is a common situation in data streams, and is named concept drift. This thesis presents rcd, a framework that offers an alternative approach to handle data streams that suffer from recurring concept drifts. It creates a new classifier to each context found and stores a sample of the data used to build it. When a new concept drift occurs, rcd compares the new context to old ones using a non-parametric multivariate statistical test to verify if both contexts come from the same distribution. If so, the corresponding classifier is reused. If not, a new classifier is generated and stored. Three kinds of tests were performed. One compares the rcd framework with several adaptive algorithms (among single and ensemble approaches) in artificial and real data sets, among the most used in the concept drift research area, with abrupt and gradual concept drifts. It is observed the ability of the classifiers in representing each context, how they handle concept drift, and training and testing times needed to evaluate the data sets. Results indicate that rcd had similar or better statistical results compared to the other classifiers. In the real-world data sets, rcd presented accuracies close to the best classifier in each data set. Another test compares two statistical tests (knn and Cramer) in their capability in representing and identifying contexts. Tests were performed using adaptive and batch classifiers as base learners of rcd, in artificial and real-world data sets, with several rates-of-change. Results indicate that, in average, knn had better results compared to the Cramer test, and was also faster. Independently of the test used, rcd had higher accuracy values compared to their respective base learners. It is also presented an improvement in the rcd framework where the statistical tests are performed in parallel through the use of a thread pool. Tests were performed in three processors with different numbers of cores. Better results were obtained when there was a high number of detected concept drifts, the buffer size used to represent each data distribution was large, and there was a high test frequency. Even if none of these conditions apply, parallel and sequential execution still have very similar performances. Finally, a comparison between six different drift detection methods was also performed, comparing the predictive accuracies, evaluation times, and drift handling, including false alarm and miss detection rates, as well as the average distance to the drift point and its standard deviation.<br>Fluxos de dados são um modelo de processamento de dados recente, onde os dados chegam continuamente, em grandes quantidades, a altas velocidades, de modo que eles devem ser processados em tempo real. Além disso, várias instituições públicas e privadas armazenam grandes quantidades de dados que também devem ser processadas. Classificadores tradicionais não são adequados para lidar com grandes quantidades de dados por basicamente duas razões. Primeiro, eles costumam ler os dados disponíveis várias vezes até convergirem, o que é impraticável neste cenário. Em segundo lugar, eles assumem que o contexto representado por dados é estável no tempo, o que pode não ser verdadeiro. Na verdade, a mudança de contexto é uma situação comum em fluxos de dados, e é chamada de mudança de conceito. Esta tese apresenta o rcd, uma estrutura que oferece uma abordagem alternativa para lidar com os fluxos de dados que sofrem de mudanças de conceito recorrentes. Ele cria um novo classificador para cada contexto encontrado e armazena uma amostra dos dados usados para construí-lo. Quando uma nova mudança de conceito ocorre, rcd compara o novo contexto com os antigos, utilizando um teste estatístico não paramétrico multivariado para verificar se ambos os contextos provêm da mesma distribuição. Se assim for, o classificador correspondente é reutilizado. Se não, um novo classificador é gerado e armazenado. Três tipos de testes foram realizados. Um compara o rcd com vários algoritmos adaptativos (entre as abordagens individuais e de agrupamento) em conjuntos de dados artificiais e reais, entre os mais utilizados na área de pesquisa de mudança de conceito, com mudanças bruscas e graduais. É observada a capacidade dos classificadores em representar cada contexto, como eles lidam com as mudanças de conceito e os tempos de treinamento e teste necessários para avaliar os conjuntos de dados. Os resultados indicam que rcd teve resultados estatísticos semelhantes ou melhores, em comparação com os outros classificadores. Nos conjuntos de dados do mundo real, rcd apresentou precisões próximas do melhor classificador em cada conjunto de dados. Outro teste compara dois testes estatísticos (knn e Cramer) em suas capacidades de representar e identificar contextos. Os testes foram realizados utilizando classificadores tradicionais e adaptativos como base do rcd, em conjuntos de dados artificiais e do mundo real, com várias taxas de variação. Os resultados indicam que, em média, KNN obteve melhores resultados em comparação com o teste de Cramer, além de ser mais rápido. Independentemente do critério utilizado, rcd apresentou valores mais elevados de precisão em comparação com seus respectivos classificadores base. Também é apresentada uma melhoria do rcd onde os testes estatísticos são executados em paralelo por meio do uso de um pool de threads. Os testes foram realizados em três processadores com diferentes números de núcleos. Melhores resultados foram obtidos quando houve um elevado número de mudanças de conceito detectadas, o tamanho das amostras utilizadas para representar cada distribuição de dados era grande, e havia uma alta frequência de testes. Mesmo que nenhuma destas condições se aplicam, a execução paralela e sequencial ainda têm performances muito semelhantes. Finalmente, uma comparação entre seis diferentes métodos de detecção de mudança de conceito também foi realizada, comparando a precisão, os tempos de avaliação, a manipulação das mudanças de conceito, incluindo as taxas de falsos positivos e negativos, bem como a média da distância ao ponto de mudança e o seu desvio padrão.
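A much-simplified sketch of the context-reuse idea is given below. The actual rcd framework relies on multivariate non-parametric tests (kNN- and Cramér-based); here a per-feature two-sample Kolmogorov-Smirnov test from SciPy stands in as a crude proxy, and the Gaussian naive Bayes base learner and synthetic contexts are illustrative assumptions.

```python
# Simplified sketch of the context-reuse idea described above. The real rcd
# framework uses multivariate non-parametric tests; a per-feature two-sample
# Kolmogorov-Smirnov test stands in here purely for illustration.
import numpy as np
from scipy.stats import ks_2samp
from sklearn.naive_bayes import GaussianNB

class ContextStore:
    def __init__(self, alpha: float = 0.01):
        self.alpha = alpha
        self.contexts = []                      # list of (sample, classifier)

    def _same_distribution(self, a: np.ndarray, b: np.ndarray) -> bool:
        # Reject reuse if any feature looks significantly different.
        return all(ks_2samp(a[:, j], b[:, j]).pvalue > self.alpha
                   for j in range(a.shape[1]))

    def classifier_for(self, X: np.ndarray, y: np.ndarray):
        """Reuse a stored classifier if the new context matches an old one,
        otherwise train and store a new classifier."""
        for sample, clf in self.contexts:
            if self._same_distribution(sample, X):
                return clf                      # recurring context: reuse
        clf = GaussianNB().fit(X, y)            # new context: train and store
        self.contexts.append((X.copy(), clf))
        return clf

# Hypothetical usage with two synthetic contexts drawn from different means.
rng = np.random.default_rng(0)
store = ContextStore()
ctx_a = rng.normal(0.0, 1.0, size=(200, 3))
ctx_b = rng.normal(3.0, 1.0, size=(200, 3))
labels = (rng.random(200) > 0.5).astype(int)
store.classifier_for(ctx_a, labels)
store.classifier_for(ctx_b, labels)
print(len(store.contexts))                      # -> 2 distinct contexts stored
```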
APA, Harvard, Vancouver, ISO, and other styles
37

Zirbes, Sergio Felipe. "A reutilização de modelos de requisitos de sistemas por analogia : experimentação e conclusões." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 1995. http://hdl.handle.net/10183/17809.

Full text
Abstract:
A exemplo de qualquer outra atividade que se destine a produzir um produto, a engenharia de software necessariamente passa por um fase inicial, onde necessário definir o que será produzido. A análise de requisitos é esta fase inicial, e o produto dela resultante é a especificação do sistema a ser construído. As duas atividades básicas durante a analise de requisitos são a eliciação (busca ou descoberta das características do sistema) e a modelagem. Uma especificação completa e consistente é condição indispensável para o adequado desenvolvimento de um sistema. Muitos tem sido, entretanto, os problemas enfrentados pelos analistas na execução desta tarefa. A variedade e complexidade dos requisitos, as limitações humanas e a dificuldade de comunicação entre usuários e analistas são as principais causas destas dificuldades. Ao considerarmos o ciclo de vida de um sistema de informação, verificamos que a atividade principal dos profissionais em computação é a transformação de uma determinada porção do ambiente do usuário, em um conjunto de modelos. Inicialmente, através de um modelo descritivo representamos a realidade. A partir dele derivamos um modelo das necessidades (especificação dos requisitos), transformando-o a seguir num modelo conceitual. Finalizando o ciclo de transformações, derivamos o modelo programado (software), que ira se constituir no sistema automatizado requerido. Apesar da reconhecida importância da analise dos requisitos e da conseqüente representação destes requisitos em modelos, muito pouco se havia inovado nesta área ate o final dos anos 80. Com a evolução do conceito de reutilização de software para reutilização de especificações ou reutilização de modelos de requisitos, finalmente surge não apenas um novo método, mas um novo paradigma: a reutilização sistemática (sempre que possível) de modelos integrantes de especificações de sistemas semelhantes ao que se pretende desenvolver. Muito se tem dito sobre esta nova forma de modelagem e um grande número de pesquisadores tem se dedicado a tornar mais simples e eficientes várias etapas do novo processo. Entretanto, para que a reutilização de modelos assuma seu papel como uma metodologia de use geral e de plena aceitação, resta comprovar se, de fato, ele produz software de melhor quantidade e confiabilidade, de forma mais produtiva. A pesquisa descrita neste trabalho tem por objetivo investigar um dos aspectos envolvido nesta comprovação. A experimentação viabilizou a comparação entre modelos de problemas construídos com reutilização, a partir dos modelos de problemas similares previamente construídos e postos a disposição dos analistas, e os modelos dos mesmos problemas elaborados sem nenhuma reutilização. A comparação entre os dois conjuntos de modelos permitiu concluir, nas condições propostas na pesquisa, serem os modelos construídos com reutilização mais completos e corretos do que os que foram construídos sem reutilização. A apropriação dos tempos gastos pelos analistas durante as diversas etapas da modelagem, permitiu considerações sobre o esforço necessário em cada um dos dois tipos de modelagem. 0 protocolo experimental e a estratégia definida para a pesquisa possibilitaram também que medidas pudessem ser realizadas com duas series de modelos, onde a principal diferença era o grau de similaridade entre os modelos do problema reutilizado e os modelos do problema alvo. 
A variação da qualidade e completude dos dois conjuntos de modelos, bem como do esforço necessário para produzi-los, evidenciou uma questão fundamental do processo: a reutilização só terá efeitos realmente produtivos se realizada apenas com aplicações integrantes de domínios específicos e bem definidos, compartilhando, em alto grau, dados e procedimentos. De acordo com as diretrizes da pesquisa, o processo de reutilização de modelos de requisitos foi investigado em duas metodologias de desenvolvimento: na metodologia estruturada a modelagem foi realizada com Diagramas de Fluxo de Dados (DFD's) e na metodologia orientada a objeto com Diagramas de Objetos. A pesquisa contou com a participação de 114 alunos/analistas, tendo sido construídos 175 conjuntos de modelos com diagramas de fluxo de dados e 23 modelos com diagramas de objeto. Sobre estas amostras foram realizadas as analises estatísticas pertinentes, buscando-se responder a um considerável número de questões existentes sobre o assunto. Os resultados finais mostram a existência de uma série de benefícios na análise de requisitos com modelagem baseada na reutilização de modelos análogos. Mas, a pesquisa em seu todo mostra, também, as restrições e cuidados necessários para que estes benefícios de fato ocorram.<br>System Engineering, as well as any other product oriented activity, starts by a clear definition of the product to be obtained. This initial activity is called Requirement Analysis and the resulting product consists of a system specification. The Requirement Analysis is divided in two separated phases: elicitation and modeling. An appropriate system development definition relies in a complete, and consistent system specification phase. However, many problems have been faced by system analysts in the performance of such task, as a result of requirements complexity, and diversity, human limitations, and communication gap between users and developers. If we think of a system life cycle, we'll find out that the main activity performed by software engineers consists in the generation of models corresponding to specific parts of the users environment. This modeling activity starts by a descriptive model of the portion of reality from which the requirement model is derived, resulting in the system conceptual model. The last phase of this evolving modeling activity is the software required for the system implementation. In spite of the importance of requirement analysis and modeling, very little research effort was put in these activities and none significant improvement in available methodologies were presented until the late 80s. Nevertheless, when the concepts applied in software reuse were also applied to system specification and requirements modeling, then a new paradigm was introduced, consisting in the specification of new systems based on systematic reuse of similar available system models. Research effort have been put in this new modeling technique in the aim of make it usable and reliable. However, only after this methodology is proved to produce better and reliable software in a more productive way, it would be world wide accepted by the scientific and technical community. The present work provides a critical analysis about the use of such requirement modeling technique. Experimental modeling techniques based on the reuse of similar existing models are analyzed. Systems models were developed by system analyst with similar skills, with and without reusing previously existing models. 
The resulting models were compared in terms of correctness, time consumed in each modeling phase, effort, etc. An experimental protocol and a special strategy were defined in order to compare and measure the results obtained from the use of two different groups of models. The main difference between the two selected groups was the similarity level between the model available for reuse and the model to be developed. The diversity of the resulting models in terms of quality and completeness, as well as in the modeling effort, corroborated the hypothesis that reuse effectiveness is related to the similarity between the domains, data and procedures of pre-existing models and of the applications being developed. In this work, the reuse of requirements models is investigated in two different methodologies: in the first one, the modeling process is based on the use of Data Flow Diagrams, as in the structured methodology; in the second methodology, based on Object Orientation, Object Diagrams are used for modeling purposes. The research was carried out with the cooperation of 114 students/analysts, resulting in 175 series of Data Flow Diagrams and 23 series of Object Diagrams. Appropriate statistical analyses were conducted on these samples in order to clarify questions about requirements reuse. According to the final results, modeling techniques based on the reuse of analogous models provide an improvement in requirements analysis, without disregarding the restrictions resulting from differences in domain, data and procedures.
APA, Harvard, Vancouver, ISO, and other styles
38

Gonçalves, Júnior Paulo Mauricio. "Multivariate non-parametric statistical tests to reuse classifiers in recurring concept drifting environments." Universidade Federal de Pernambuco, 2013. https://repositorio.ufpe.br/handle/123456789/12288.

Full text
Abstract:
Data streams are a recent processing model where data arrive continuously, in large quantities, at high speeds, so that they must be processed on-line. Besides that, several private and public institutions store large amounts of data that also must be processed. Traditional batch classifiers are not well suited to handle huge amounts of data for basically two reasons. First, they usually read the available data several times until convergence, which is impractical in this scenario. Second, they imply that the context represented by data is stable in time, which may not be true. In fact, the context change is a common situation in data streams, and is named concept drift. This thesis presents rcd, a framework that offers an alternative approach to handle data streams that suffer from recurring concept drifts. It creates a new classifier to each context found and stores a sample of the data used to build it. When a new concept drift occurs, rcd compares the new context to old ones using a non-parametric multivariate statistical test to verify if both contexts come from the same distribution. If so, the corresponding classifier is reused. If not, a new classifier is generated and stored. Three kinds of tests were performed. One compares the rcd framework with several adaptive algorithms (among single and ensemble approaches) in artificial and real data sets, among the most used in the concept drift research area, with abrupt and gradual concept drifts. It is observed the ability of the classifiers in representing each context, how they handle concept drift, and training and testing times needed to evaluate the data sets. Results indicate that rcd had similar or better statistical results compared to the other classifiers. In the real-world data sets, rcd presented accuracies close to the best classifier in each data set. Another test compares two statistical tests (knn and Cramer) in their capability in representing and identifying contexts. Tests were performed using adaptive and batch classifiers as base learners of rcd, in artificial and real-world data sets, with several rates-of-change. Results indicate that, in average, knn had better results compared to the Cramer test, and was also faster. Independently of the test used, rcd had higher accuracy values compared to their respective base learners. It is also presented an improvement in the rcd framework where the statistical tests are performed in parallel through the use of a thread pool. Tests were performed in three processors with different numbers of cores. Better results were obtained when there was a high number of detected concept drifts, the buffer size used to represent each data distribution was large, and there was a high test frequency. Even if none of these conditions apply, parallel and sequential execution still have very similar performances. Finally, a comparison between six different drift detection methods was also performed, comparing the predictive accuracies, evaluation times, and drift handling, including false alarm and miss detection rates, as well as the average distance to the drift point and its standard deviation.<br>Fluxos de dados são um modelo de processamento de dados recente, onde os dados chegam continuamente, em grandes quantidades, a altas velocidades, de modo que eles devem ser processados em tempo real. Além disso, várias instituições públicas e privadas armazenam grandes quantidades de dados que também devem ser processadas. Classificadores tradicionais não são adequados para lidar com grandes quantidades de dados por basicamente duas razões. Primeiro, eles costumam ler os dados disponíveis várias vezes até convergirem, o que é impraticável neste cenário. Em segundo lugar, eles assumem que o contexto representado por dados é estável no tempo, o que pode não ser verdadeiro. Na verdade, a mudança de contexto é uma situação comum em fluxos de dados, e é chamada de mudança de conceito. Esta tese apresenta o rcd, uma estrutura que oferece uma abordagem alternativa para lidar com os fluxos de dados que sofrem de mudanças de conceito recorrentes. Ele cria um novo classificador para cada contexto encontrado e armazena uma amostra dos dados usados para construí-lo. Quando uma nova mudança de conceito ocorre, rcd compara o novo contexto com os antigos, utilizando um teste estatístico não paramétrico multivariado para verificar se ambos os contextos provêm da mesma distribuição. Se assim for, o classificador correspondente é reutilizado. Se não, um novo classificador é gerado e armazenado. Três tipos de testes foram realizados. Um compara o rcd com vários algoritmos adaptativos (entre as abordagens individuais e de agrupamento) em conjuntos de dados artificiais e reais, entre os mais utilizados na área de pesquisa de mudança de conceito, com mudanças bruscas e graduais. É observada a capacidade dos classificadores em representar cada contexto, como eles lidam com as mudanças de conceito e os tempos de treinamento e teste necessários para avaliar os conjuntos de dados. Os resultados indicam que rcd teve resultados estatísticos semelhantes ou melhores, em comparação com os outros classificadores. Nos conjuntos de dados do mundo real, rcd apresentou precisões próximas do melhor classificador em cada conjunto de dados. Outro teste compara dois testes estatísticos (knn e Cramer) em suas capacidades de representar e identificar contextos. Os testes foram realizados utilizando classificadores tradicionais e adaptativos como base do rcd, em conjuntos de dados artificiais e do mundo real, com várias taxas de variação. Os resultados indicam que, em média, KNN obteve melhores resultados em comparação com o teste de Cramer, além de ser mais rápido. Independentemente do critério utilizado, rcd apresentou valores mais elevados de precisão em comparação com seus respectivos classificadores base. Também é apresentada uma melhoria do rcd onde os testes estatísticos são executados em paralelo por meio do uso de um pool de threads. Os testes foram realizados em três processadores com diferentes números de núcleos. Melhores resultados foram obtidos quando houve um elevado número de mudanças de conceito detectadas, o tamanho das amostras utilizadas para representar cada distribuição de dados era grande, e havia uma alta frequência de testes. Mesmo que nenhuma destas condições se aplicam, a execução paralela e sequencial ainda têm performances muito semelhantes. Finalmente, uma comparação entre seis diferentes métodos de detecção de mudança de conceito também foi realizada, comparando a precisão, os tempos de avaliação, a manipulação das mudanças de conceito, incluindo as taxas de falsos positivos e negativos, bem como a média da distância ao ponto de mudança e o seu desvio padrão.
APA, Harvard, Vancouver, ISO, and other styles
39

Niu, Qingpeng. "Characterization and Enhancement of Data Locality and Load Balancing for Irregular Applications." The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1420811652.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Ogidan, Olugbenga Kayode. "Design of nonlinear networked control for wastewater distributed systems." Thesis, Cape Peninsula University of Technology, 2014. http://hdl.handle.net/20.500.11838/1201.

Full text
Abstract:
Thesis submitted in fulfilment of the requirements for the degree Doctor of Technology: Electrical Engineering in the Faculty of Engineering at the Cape Peninsula University of Technology 2014<br>This thesis focuses on the design, development and real-time simulation of a robust nonlinear networked control scheme for the dissolved oxygen concentration as part of wastewater distributed systems. This concept differs from previous methods of wastewater control in the sense that the controller and the wastewater treatment plants are separated by a wide geographical distance and exchange data through a communication medium. The communication network introduced between the controller and the DO process creates imperfections during its operation, such as time delays, which are an object of investigation in this thesis. Due to these communication network imperfections, new control strategies that take cognisance of the network imperfections in the controller design process are needed to provide adequate robustness for the DO process control system. This thesis first investigates the effects of constant and random network-induced time delays and the effects of controller parameters on the DO process behaviour, with a view to using the obtained information to design an appropriate controller for the networked closed-loop system. On the basis of the above information, a Smith predictor delay compensation controller is developed in the thesis to eliminate the deadtime, provide robustness and improve the performance of the DO process. Two approaches are adopted in the design of the Smith predictor compensation scheme. The first is the transfer function approach, which allows a linearized model of the DO process to be described in the frequency domain. The second is the nonlinear linearising approach in the time domain. Simulation results reveal that the developed Smith predictor controllers out-performed the nonlinear linearising controller designed for the DO process without time delays by compensating for the network imperfections and maintaining the DO concentration within a desired acceptable level. The transfer function approach to designing the Smith predictor is found to perform better under small time delays, but its performance deteriorates under large time delays and disturbances. It is also found to respond faster than the nonlinear approach. The nonlinear feedback linearising approach is slower in response time but out-performs the transfer function approach in providing robustness and performance for the DO process under large time delays and disturbances. The developed Smith predictor compensation schemes were later simulated on a real-time platform using LabVIEW. The Smith predictor controllers developed in this thesis can be applied to other process control plants apart from wastewater plants, wherever distributed control is required. They can also be applied in nuclear reactor plants where remote control is required under hazardous conditions. The developed LabVIEW real-time simulation environment would be a valuable tool for researchers and students in the field of control system engineering. Lastly, this thesis forms a basis for further research in the field of distributed wastewater control.
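The Smith predictor structure described above can be sketched in a few lines of discrete-time simulation. The first-order process, PI gains and delay below are hypothetical illustration values rather than the dissolved-oxygen model or tuning used in the thesis; the point is only to show how the internal delay-free model replaces the delayed measurement in the feedback path.

```python
# Minimal discrete-time sketch of a Smith predictor wrapped around a PI
# controller, for a first-order process with deadtime. The plant model,
# gains, and delay below are hypothetical illustration values.
a, b, d = 0.9, 0.1, 10          # assumed process: y[k+1] = a*y[k] + b*u[k-d]
kp, ki = 2.0, 0.25              # assumed PI gains
steps, setpoint = 200, 1.0

y = 0.0                         # measured process output
ym_fast = 0.0                   # internal model without deadtime
ym_slow = 0.0                   # internal model with deadtime
u_hist = [0.0] * (d + 1)        # past control actions (deadtime buffer)
integral = 0.0
outputs = []

for k in range(steps):
    # Smith predictor feedback: measurement corrected by the undelayed model.
    feedback = y + (ym_fast - ym_slow)
    error = setpoint - feedback
    integral += error
    u = kp * error + ki * integral

    # Update internal models (assumed to match the plant perfectly here).
    ym_fast = a * ym_fast + b * u
    ym_slow = a * ym_slow + b * u_hist[0]

    # Update the real plant, which only sees the delayed control action.
    y = a * y + b * u_hist[0]
    u_hist = u_hist[1:] + [u]
    outputs.append(y)

print("output while the deadtime still masks the input:", outputs[d])   # -> 0.0
print("final output (tracks the setpoint):", round(outputs[-1], 3))     # -> ~1.0
```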
APA, Harvard, Vancouver, ISO, and other styles
41

Deric, Sanjin. "Increased Capacity for VDL Mode 2 Aeronautical Data Communication." Cleveland State University / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=csu1376063529.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Trinkune, Anna Marija. "Enhancement of the Swedish Emergency Services : A study of the potential of human enhancement implementation within the Swedish police and fire and rescue service." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-167198.

Full text
Abstract:
Human enhancement (HE) is the field of research aiming to improve and overcome the biological limitations of the physical and mental performance of humans. The implications of HE might prove especially relevant within high-intensity environments, such as the working environment of police and fire and rescue services. The aim of this thesis is to identify the needs of personnel within these domains and to highlight potential HE solutions that could help improve responder performance in the setting of an active emergency. Interviews conducted with representatives of the Swedish police and fire and rescue service highlighted the need to support mental performance in the form of destressing during high-intensity situations, as well as improved support for creating situational awareness and making adequate decisions in a short amount of time. The results imply a potential for implementing HE solutions that aid emergency responders by keeping them alert and open for information gathering, or by minimizing the experienced level of stress while providing vital information for a proper response. The thesis also concluded that attitudes towards the implementation of HE solutions were generally positive.
APA, Harvard, Vancouver, ISO, and other styles
43

Ait, Mouhoub Louali Nadia. "Le service public à l’heure de l’Open Data." Thesis, Paris 2, 2018. http://www.theses.fr/2018PA020022.

Full text
Abstract:
Le service public a éprouvé une ouverture massive de données publiques dite "Open Data". Ce phénomène s'est développé avec l'émergence des nouvelles technologies d’information dans les administrations publiques, devenant un facteur important dans le renouveau et la modernisation du service public. Cette nouvelle tendance que le monde explore depuis quelques années, vise à partager et à réutiliser les données publiques détenues par le service public. L’objectif de l’Open Data étant la transparence démocratique en réponse à l’exigence de rendre des comptes aux citoyens, pour lutter contre la corruption et promouvoir un gouvernement ouvert en faveur de la participation citoyenne. À cet égard, le concept Open Data mène à nous interroger sur l'importance de l'ouverture des données du service public, sur le degré de l'obligation de s'adapter à cette ouverture, sur les conséquences de l'intrusion d’Open Data dans la sphère du service public et sur les limites imposées à l'Open Data. Pour répondre à ces interrogations, on s’intéressera à l’apparition et au développement de l’Open Data dans le service public, tout en illustrant son impact sur l'évolution de la démocratie et son rôle éminent dans la création de nouveaux services publics, avec notamment le cas du service public de la donnée en France. Ainsi, le meilleur angle pour étudier l'ouverture des données publiques dans le service public sera le droit public comparé, cela nous permettra d'analyser la pratique d'Open Data dans les pays pionniers dans ce domaine et les pays du Maghreb qui intègrent, depuis peu, cette nouvelle méthode de travail. Cette étude a pour but aussi de démontrer ce que l'Open Data peu apporter concrètement à l'administration et au citoyen<br>The public service has experienced a massive opening of public data known as "Open Data". This phenomenon has developed with the emergence of new information technologies in public administrations, becoming then an important factor in the renewal and modernization of the public service. This new trend that the world has been exploring for a few years aims to share and reuse the public data held by the public service, while keeping as objective the democratic transparency in response to the requirement to be accountable to citizens to fight against corruption and promote Open Government in favor of citizen involvement.In this respect, the Open Data concept leads us to question ourselves about the importance of opening up data in the public service, about the degree of obligation to adapt to this opening, also about the consequences of the Open Data’s intrusion into the sphere of the public service and the limits that Open Data may encounter.To answer these questions, we will focus on the emergence and development of Open Data in the public service, with a depiction of its impact on democracy evolution and its eminent role in the creation of new public services, such as the case of data public service in France. Thus, the best angle to study the opening of public data in the public service is the comparative public law, this allows us to analyze the practice of Open Data in the pioneer countries in this field and the Maghreb countries who recently integrated this new way of work. This study also aims to prove what are the benefits of Open Data for the administration and the citizen
APA, Harvard, Vancouver, ISO, and other styles
44

Bergsten, Linnea. "Communication and Resilience in a Crisis Management Exercise : A qualitative study of the communication of a staff leading the rescue work during a simulated ferry fire, understood through the systemic resilience model." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-152124.

Full text
Abstract:
This study concerns communication in a crisis management exercise from a resilience perspective. The staff's communication during a crisis management exercise simulating a ferry fire, facilitated by DARWIN, a European research project in resilience, is analysed with thematic analysis and understood through the Systemic Resilience (SyRes) model (Lundberg & Johansson, 2015), which combines different aspects of resilience. The main themes found are The Staff's Decision Making, Operational Care of Affected Persons, and Communication. The Staff's Decision Making consists of the following subthemes: Situation Analysis, Value of Measures and Delegation. Operational Care of Affected Persons involves the subthemes Transport and Healthcare. Communication consists of the subthemes Stakeholders and External Communication. The themes are connected in the way that, in order to make informed decisions about the operational care of affected persons, the staff need to communicate with external stakeholders. The themes could be understood through the functions in the SyRes model, as they share elements with, could be seen as parts of, or in another way fit into the adaptive functions of the SyRes model. This study found themes in the communication of a staff in crisis management. These themes seem to be central for this staff, are reflected in the SyRes model and would reflect what is important for a staff to behave resiliently. It is therefore suggested to examine whether staffs in crisis management are supported in their work involving these themes.
APA, Harvard, Vancouver, ISO, and other styles
45

Cecchinel, Cyril. "DEPOSIT : une approche pour exprimer et déployer des politiques de collecte sur des infrastructures de capteurs hétérogènes et partagées." Thesis, Université Côte d'Azur (ComUE), 2017. http://www.theses.fr/2017AZUR4094/document.

Full text
Abstract:
Les réseaux de capteurs sont utilisés dans l'IoT pour collecter des données. Cependant, une expertise envers les réseaux de capteurs est requise pour interagir avec ces infrastructures. Pour un ingénieur logiciel, cibler de tels systèmes est difficile. Les spécifications des plateformes composant l'infrastructure de capteurs les obligent à travailler à un bas niveau d'abstraction et à utiliser des plateformes hétérogènes. Cette fastidieuse activité peut conduire à un code exploitant de manière non optimisée l'infrastructure. En étant spécifiques à une infrastructure, ces applications ne peuvent également pas être réutilisées facilement vers d'autres infrastructures. De plus, le déploiement de ces applications est hors du champ de compétences d'un ingénieur logiciel car il doit identifier la ou les plateforme(s) requise(s) pour supporter l'application. Enfin, l'architecture peut ne pas être conçue pour supporter l'exécution simultanée d'application, engendrant des déploiements redondants lorsqu'une nouvelle application est identifiée. Dans cette thèse, nous présentons une approche qui supporte (i) la définition de politiques de collecte de données à haut niveau d'abstraction et réutilisables, (ii) leur déploiement sur une infrastructure hétérogène dirigée par des modèles apportés par des experts réseau et (iii) la composition automatique de politiques sur des infrastructures hétérogènes. De ces contributions, un ingénieur peut dès lors manipuler un réseau de capteurs sans en connaitre les détails, en réutilisant des abstractions architecturales disponibles lors de l'expression des politiques, des politiques qui pourront également coexister au sein d'un même réseau<br>Sensing infrastructures are classically used in the IoT to collect data. However, a deep knowledge of sensing infrastructures is needed to properly interact with the deployed systems. For software engineers, targeting these systems is tedious. First, the specifications of the platforms composing the infrastructure compel them to work at a low level of abstraction and with heterogeneous devices. This can lead to code that poorly exploits the network infrastructure. Moreover, by being infrastructure-specific, these applications cannot be easily reused across different systems. Secondly, the deployment of an application is outside the domain expertise of a software engineer, as she needs to identify the required platform(s) to support her application. Lastly, the sensing infrastructure might not be designed to support the concurrent execution of various applications, leading to redundant deployments when a new application is contemplated. In this thesis we present an approach that supports (i) the definition of data collection policies at a high level of abstraction with a focus on their reuse, (ii) their deployment over a heterogeneous infrastructure driven by models designed by a network expert and (iii) the automatic composition of the policies on top of the heterogeneous sensing infrastructures. Based on these contributions, a software engineer can exploit sensor networks without knowing the associated details, while reusing architectural abstractions available off-the-shelf in their policies. The network will also be shared automatically between the policies.
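The notion of an infrastructure-independent collection policy can be sketched as follows. The class and field names are illustrative assumptions, not DEPOSIT's actual DSL; the sketch only shows a policy described independently of any platform and a naive deployment step that maps policies onto platforms declared in a network expert's model.

```python
# Hedged sketch of an infrastructure-independent data-collection policy and a
# naive deployment step. Names and fields are illustrative, not DEPOSIT's DSL.
from dataclasses import dataclass
from typing import Dict, List

@dataclass(frozen=True)
class CollectionPolicy:
    name: str
    sensor_type: str          # e.g. "temperature"
    period_s: int             # sampling period requested by the policy
    threshold: float          # only forward readings above this value

@dataclass
class Platform:
    name: str
    sensor_types: List[str]   # capabilities declared in the network expert's model

def deploy(policies: List[CollectionPolicy],
           platforms: List[Platform]) -> Dict[str, List[str]]:
    """Map each policy onto every platform able to serve it; policies sharing a
    platform are 'composed' by reusing the same deployment target."""
    plan: Dict[str, List[str]] = {p.name: [] for p in platforms}
    for policy in policies:
        for platform in platforms:
            if policy.sensor_type in platform.sensor_types:
                plan[platform.name].append(policy.name)
    return plan

# Hypothetical example: two policies sharing one temperature-capable platform.
policies = [
    CollectionPolicy("office_monitoring", "temperature", period_s=60, threshold=19.0),
    CollectionPolicy("fire_alert", "temperature", period_s=5, threshold=50.0),
]
platforms = [Platform("sensing_node_1", ["temperature", "humidity"]),
             Platform("sound_node_2", ["noise"])]
print(deploy(policies, platforms))
# -> {'sensing_node_1': ['office_monitoring', 'fire_alert'], 'sound_node_2': []}
```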
APA, Harvard, Vancouver, ISO, and other styles
46

Kamat, Niranjan Ganesh. "Sampling-based Techniques for Interactive Exploration of Large Datasets." The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu1523552932728325.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Bouzillé, Guillaume. "Enjeux et place des data sciences dans le champ de la réutilisation secondaire des données massives cliniques : une approche basée sur des cas d’usage." Thesis, Rennes 1, 2019. http://www.theses.fr/2019REN1B023/document.

Full text
Abstract:
La dématérialisation des données de santé a permis depuis plusieurs années de constituer un véritable gisement de données provenant de tous les domaines de la santé. Ces données ont pour caractéristiques d’être très hétérogènes et d’être produites à différentes échelles et dans différents domaines. Leur réutilisation dans le cadre de la recherche clinique, de la santé publique ou encore de la prise en charge des patients implique de développer des approches adaptées reposant sur les méthodes issues de la science des données. L’objectif de cette thèse est d’évaluer au travers de trois cas d’usage, quels sont les enjeux actuels ainsi que la place des data sciences pour l’exploitation des données massives en santé. La démarche utilisée pour répondre à cet objectif consiste dans une première partie à exposer les caractéristiques des données massives en santé et les aspects techniques liés à leur réutilisation. La seconde partie expose les aspects organisationnels permettant l’exploitation et le partage des données massives en santé. La troisième partie décrit les grandes approches méthodologiques en science des données appliquées actuellement au domaine de la santé. Enfin, la quatrième partie illustre au travers de trois exemples l’apport de ces méthodes dans les champs suivants : la surveillance syndromique, la pharmacovigilance et la recherche clinique. Nous discutons enfin les limites et enjeux de la science des données dans le cadre de la réutilisation des données massives en santé<br>The dematerialization of health data, which started several years ago, now generates a huge amount of data produced by all actors in healthcare. These data are characterized by being very heterogeneous and by being produced at different scales and in different domains. Their reuse in the context of clinical research, public health or patient care involves developing appropriate approaches based on methods from data science. The aim of this thesis is to evaluate, through three use cases, the current issues as well as the place of data science in the reuse of massive health data. To meet this objective, the first section presents the characteristics of health big data and the technical aspects related to their reuse. The second section presents the organizational aspects for the exploitation and sharing of health big data. The third section describes the main methodological approaches in data science currently applied in the field of health. The fourth section then illustrates, through three use cases, the contribution of these methods in the following fields: syndromic surveillance, pharmacovigilance and clinical research. Finally, we discuss the limits and challenges of data science in the context of health big data reuse.
APA, Harvard, Vancouver, ISO, and other styles
48

Ficheur, Grégoire. "Réutilisation de données hospitalières pour la recherche d'effets indésirables liés à la prise d'un médicament ou à la pose d'un dispositif médical implantable." Thesis, Lille 2, 2015. http://www.theses.fr/2015LIL2S015/document.

Full text
Abstract:
Introduction : les effets indésirables associés à un traitement médicamenteux ou à la pose d'un dispositif médical implantable doivent être recherchés systématiquement après le début de leur commercialisation. Les études réalisées pendant cette phase sont des études observationnelles qui peuvent s'envisager à partir des bases de données hospitalières. L'objectif de ce travail est d'étudier l'intérêt de la ré-utilisation de données hospitalières pour la mise en évidence de tels effets indésirables. Matériel et méthodes : deux bases de données hospitalières sont ré-utilisées pour les années 2007 à 2013 : une première contenant 171 000 000 de séjours hospitaliers incluant les codes diagnostiques, les codes d'actes et des données démographiques, ces données étant chaînées selon un identifiant unique de patient ; une seconde issue d'un centre hospitalier contenant les mêmes types d'informations pour 80 000 séjours ainsi que les résultats de biologie médicale, les administrations médicamenteuses et les courriers hospitaliers pour chacun des séjours. Quatre études sont conduites sur ces données afin d'identifier d'une part des évènements indésirables médicamenteux et d'autre part des évènements indésirables faisant suite à la pose d'un dispositif médical implantable. Résultats : la première étude démontre l'aptitude d'un jeu de règles de détection à identifier automatiquement les effets indésirables à type d'hyperkaliémie. Une deuxième étude décrit la variation d'un paramètre de biologie médicale associée à la présence d'un motif séquentiel fréquent composé d'administrations de médicaments et de résultats de biologie médicale. Un troisième travail a permis la construction d'un outil web permettant d'explorer à la volée les motifs de réhospitalisation des patients ayant eu une pose de dispositif médical implantable. Une quatrième et dernière étude a permis l'estimation du risque thrombotique et hémorragique faisant suite à la pose d'une prothèse totale de hanche. Conclusion : la ré-utilisation de données hospitalières dans une perspective pharmacoépidémiologique permet l'identification d'effets indésirables associés à une administration de médicament ou à la pose d'un dispositif médical implantable. L'intérêt de ces données réside dans la puissance statistique qu'elles apportent ainsi que dans la multiplicité des types de recherches d'association qu'elles permettent<br>Introduction: Adverse events associated with drug administration or the placement of an implantable medical device should be sought systematically after the start of their commercialisation. Studies conducted in this phase are observational studies that can be performed from hospital databases. The objective of this work is to study the interest of the re-use of hospital data for the identification of such adverse events. Materials and methods: Two hospital databases were re-used for the years 2007 to 2013: the first contains 171 million inpatient stays including diagnostic codes, procedures and demographic data, linked through a single patient identifier; the second, from one hospital centre, contains the same kinds of information for 80,000 stays as well as the laboratory results and drug administrations for each inpatient stay. Four studies were conducted on these data to identify, on the one hand, adverse drug events and, on the other hand, adverse events following the placement of an implantable medical device. Results: The first study demonstrates the ability of a set of detection rules to automatically identify adverse drug events involving hyperkalaemia.
The second study describes the variation of a laboratory parameter associated with the presence of a frequent sequential pattern composed of drug administrations and laboratory results. The third piece of work led to a web tool that lets the user explore on the fly the reasons for rehospitalisation of patients with an implantable medical device. The fourth and final study estimates the thrombotic and bleeding risks following a total hip replacement. Conclusion: The re-use of hospital data from a pharmacoepidemiological perspective allows the identification of adverse events associated with drug administration or the placement of an implantable medical device. The value of these data lies in the statistical power they bring as well as in the variety of association studies they make possible.
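The rule-based detection evaluated in the first study can be pictured with a minimal sketch. The threshold (serum potassium above 5.5 mmol/L within 72 hours of a potassium-increasing drug), the drug list and the record layout below are illustrative assumptions, not the rule set validated in the thesis.

    # Minimal sketch of a rule flagging potential drug-induced hyperkalaemia.
    # Threshold, drug list and record fields are illustrative assumptions.
    from datetime import datetime, timedelta

    K_THRESHOLD_MMOL_L = 5.5       # serum potassium above this value counts as hyperkalaemia
    WINDOW = timedelta(hours=72)   # look-ahead window after a drug administration
    SUSPECT_DRUGS = {"spironolactone", "enalapril", "potassium chloride"}

    def flag_hyperkalaemia_events(administrations, lab_results):
        """Return (administration, lab) pairs where high potassium follows a suspect drug within the window."""
        events = []
        for admin in administrations:
            if admin["drug"].lower() not in SUSPECT_DRUGS:
                continue
            for lab in lab_results:
                if (lab["test"] == "potassium"
                        and lab["value"] > K_THRESHOLD_MMOL_L
                        and admin["time"] <= lab["time"] <= admin["time"] + WINDOW):
                    events.append((admin, lab))
        return events

    administrations = [{"drug": "Spironolactone", "time": datetime(2013, 5, 2, 9)}]
    lab_results = [{"test": "potassium", "value": 6.1, "time": datetime(2013, 5, 3, 8)}]
    print(flag_hyperkalaemia_events(administrations, lab_results))   # one flagged pair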
APA, Harvard, Vancouver, ISO, and other styles
49

Jang, Jiyong. "Scaling Software Security Analysis to Millions of Malicious Programs and Billions of Lines of Code." Research Showcase @ CMU, 2013. http://repository.cmu.edu/dissertations/306.

Full text
Abstract:
Software security is a big data problem. The volume of new software artifacts created far outpaces the current capacity of software analysis. This gap has brought an urgent challenge to our security community: scalability. If our techniques cannot cope with an ever-increasing volume of software, we will always be one step behind attackers. Thus developing scalable analysis to bridge the gap is essential. In this dissertation, we argue that automatic code reuse detection enables an efficient data reduction of a high volume of incoming malware for downstream analysis and enhances software security by efficiently finding known vulnerabilities across large code bases. In order to demonstrate the benefits of automatic software similarity detection, we discuss two representative problems that are remedied by scalable analysis: malware triage and unpatched code clone detection. First, we tackle the onslaught of malware. Although over one million new malware samples are reported each day, existing research shows that most malware are not written from scratch; instead, they are automatically generated variants of existing malware. When groups of highly similar variants are clustered together, new malware more easily stands out. Unfortunately, current systems struggle with handling this high volume of malware. We scale clustering using feature hashing and perform semantic analysis using co-clustering. Our evaluation demonstrates that these techniques are an order of magnitude faster than previous systems and automatically discover highly correlated features and malware groups. Furthermore, we design algorithms to infer evolutionary relationships among malware, which helps analysts understand trends over time and make informed decisions about which malware to analyze first. Second, we address the problem of detecting unpatched code clones at scale. When buggy code gets copied from project to project, eventually all projects will need to be patched. We call clones of buggy code that have been fixed in only a subset of projects unpatched code clones. Unfortunately, code copying is usually ad hoc and is often not tracked, which makes it challenging to identify all unpatched vulnerabilities in code bases at the scale of entire OS distributions. We scale unpatched code clone detection to spot over 15,000 latent security vulnerabilities in 2.1 billion lines of code from the Linux kernel, all Debian and Ubuntu packages, and all C/C++ projects in SourceForge in three hours on a single machine. To the best of our knowledge, this is the largest set of bugs ever reported in a single paper.
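The feature-hashing step mentioned in the abstract can be sketched as follows: each program is reduced to byte n-grams, every n-gram is hashed into a fixed-size set of buckets, and the similarity of two programs is approximated by the Jaccard index of their bucket sets. The parameters below (4-byte n-grams, 1,024 buckets, MD5 as the hash) are illustrative assumptions, not the exact design evaluated in the dissertation.

    # Minimal sketch of feature hashing for scalable code similarity.
    import hashlib

    VECTOR_BITS = 1024   # fixed-size feature space (a real system would use a much larger one)
    NGRAM = 4            # byte n-gram length

    def feature_set(data: bytes) -> set:
        """Hash every n-gram of the input into a bucket index; the set of buckets is the feature vector."""
        buckets = set()
        for i in range(len(data) - NGRAM + 1):
            digest = hashlib.md5(data[i:i + NGRAM]).digest()
            buckets.add(int.from_bytes(digest[:4], "little") % VECTOR_BITS)
        return buckets

    def jaccard(a: set, b: set) -> float:
        """Approximate similarity as the Jaccard index of two hashed feature sets."""
        return len(a & b) / len(a | b) if a | b else 0.0

    v1 = feature_set(b"mov eax, ebx; call decrypt_payload; jmp loop_start")
    v2 = feature_set(b"mov eax, ebx; call decrypt_payload; jmp loop_end")
    print(round(jaccard(v1, v2), 2))   # high score for near-duplicate byte sequences

Because the feature space has a fixed size, millions of such sets can be stored compactly and compared in bulk, which is what makes clustering at this volume practical.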
APA, Harvard, Vancouver, ISO, and other styles
50

Sáez, Silvestre Carlos. "Probabilistic methods for multi-source and temporal biomedical data quality assessment." Doctoral thesis, Editorial Universitat Politècnica de València, 2016. http://hdl.handle.net/10251/62188.

Full text
Abstract:
[EN] Nowadays, biomedical research and decision making depend to a great extent on the data stored in information systems. As a consequence, a lack of data quality (DQ) may lead to suboptimal decisions, or hinder the derived research processes and outcomes. This thesis aims at the research and development of methods for assessing two DQ problems of special importance in Big Data and large-scale repositories, based on multi-institutional, cross-border infrastructures, and acquired during long periods of time: the variability of data probability distributions (PDFs) among different data sources (multi-source variability) and the variability of data PDFs over time (temporal variability). Variability in PDFs may be caused by differences in data acquisition methods, protocols or health care policies; systematic or random errors during data input and management; demographic differences in populations; or even falsified data. To date, these issues have received little attention as DQ problems and lack adequate assessment methods. The developed methods aim to measure, detect and characterize variability in multi-type, multivariate, multi-modal data, and are not affected by large sample sizes. To this end, we defined an Information Theory and Geometry probabilistic framework based on the inference of non-parametric statistical manifolds from the normalized distances of PDFs among data sources and over time. Based on this, a number of contributions have been generated. For the multi-source variability assessment we have designed two metrics: the Global Probabilistic Deviation, which measures the degree of global variability among the PDFs of multiple sources (equivalent to the standard deviation among PDFs); and the Source Probabilistic Outlyingness, which measures the dissimilarity of the PDF of a single data source to a global latent average. They are based on the construction of a simplex geometrical figure (the maximum-dimensional statistical manifold) using the distances among sources, and complemented by the Multi-Source Variability plot, an exploratory visualization of that simplex which permits detecting grouping patterns among sources. The temporal variability method provides two main tools: the Information Geometric Temporal plot, an exploratory visualization of the temporal evolution of PDFs based on the projection of the statistical manifold from temporal batches; and the PDF Statistical Process Control, a monitoring and automatic change detection algorithm for PDFs. The methods have been applied to repositories in real case studies, including the Public Health Mortality and Cancer Registries of the Region of Valencia, Spain; the UCI Heart Disease dataset; the United States NHDS; and Spanish Breast Cancer and In-Vitro Fertilization datasets. The methods uncovered several findings, such as partitions of the repositories into probabilistically separated temporal subgroups, punctual temporal anomalies due to anomalous data, and outlying or clustered data sources due to differences in populations or practices. A software toolbox including the methods and the automated generation of DQ reports was developed. Finally, we defined the theoretical basis of a biomedical DQ evaluation framework, which has been used in the construction of quality-assured infant feeding repositories, in the contextualization of data for their reuse in Clinical Decision Support Systems using an HL7-CDA wrapper; and in an on-line service for the DQ evaluation and rating of biomedical data repositories.
The results of this thesis have been published in eight scientific contributions, including top-ranked journals and conferences. One of the journal publications was selected by the IMIA as one of the best publications in Health Information Systems in 2013. Additionally, the results have contributed to several research projects, and have led the way to the industrialization of the developed methods and approaches for the audit and control of biomedical DQ.<br>[ES] Actualmente, la investigación biomédica y toma de decisiones dependen en gran medida de los datos almacenados en los sistemas de información. En consecuencia, una falta de calidad de datos (CD) puede dar lugar a decisiones sub-óptimas o dificultar los procesos y resultados de las investigaciones derivadas. Esta tesis tiene como propósito la investigación y desarrollo de métodos para evaluar dos problemas especialmente importantes en repositorios de datos masivos (Big Data), basados en infraestructuras multi-céntricas, adquiridos durante largos periodos de tiempo: la variabilidad de las distribuciones de probabilidad (DPs) de los datos entre diferentes fuentes o sitios-variabilidad multi-fuente-y la variabilidad de las distribuciones de probabilidad de los datos a lo largo del tiempo-variabilidad temporal. La variabilidad en DPs puede estar causada por diferencias en los métodos de adquisición, protocolos o políticas de atención; errores sistemáticos o aleatorios en la entrada o gestión de datos; diferencias demográficas en poblaciones; o incluso por datos falsificados. Esta tesis aporta métodos para detectar, medir y caracterizar dicha variabilidad, tratando con datos multi-tipo, multivariantes y multi-modales, y sin ser afectados por tamaños muestrales grandes. Para ello, hemos definido un marco de Teoría y Geometría de la Información basado en la inferencia de variedades de Riemann no-paramétricas a partir de distancias normalizadas entre las PDs de varias fuentes de datos o a lo largo del tiempo. En consecuencia, se han aportado las siguientes contribuciones: Para evaluar la variabilidad multi-fuente se han definido dos métricas: la Global Probabilistic Deviation, la cual mide la variabilidad global entre las PDs de varias fuentes-equivalente a la desviación estándar entre PDs; y la Source Probabilistic Outlyingness, la cual mide la disimilaridad entre la DP de una fuente y un promedio global latente. Éstas se basan en un simplex construido mediante las distancias entre las PDs de las fuentes. En base a éste, se ha definido el Multi-Source Variability plot, visualización que permite detectar patrones de agrupamiento entre fuentes. El método de variabilidad temporal proporciona dos herramientas: el Information Geometric Temporal plot, visualización exploratoria de la evolución temporal de las PDs basada en la variedad estadística de los lotes temporales; y el Control de Procesos Estadístico de PDs, algoritmo para la monitorización y detección automática de cambios en PDs. Los métodos han sido aplicados a casos de estudio reales, incluyendo: los Registros de Salud Pública de Mortalidad y Cáncer de la Comunidad Valenciana; los repositorios de enfermedades del corazón de UCI y NHDS de los Estados Unidos; y repositorios españoles de Cáncer de Mama y Fecundación In-Vitro. Los métodos detectaron hallazgos como particiones de repositorios en subgrupos probabilísticos temporales, anomalías temporales puntuales, y fuentes de datos agrupadas por diferencias en poblaciones y en prácticas. 
Se han desarrollado herramientas software incluyendo los métodos y la generación automática de informes. Finalmente, se ha definido la base teórica de un marco de CD biomédicos, el cual ha sido utilizado en la construcción de repositorios de calidad para la alimentación del lactante, en la contextualización de datos para el reuso en Sistemas de Ayuda a la Decisión Médica usando un wrapper HL7-CDA, y en un servicio on-line para la evaluación y clasificación de la CD de repositorios biomédicos. Los resultados de esta tesis han sido publicados en ocho contribuciones científicas (revistas indexadas y artículos en congresos), una de ellas seleccionada por la IMIA como una de las mejores publicaciones en Sistemas de Información de Salud en 2013. Los resultados han contribuido en varios proyectos de investigación, y facilitado los primeros pasos hacia la industrialización de las tecnologías<br>[CAT] Actualment, la investigació biomèdica i presa de decisions depenen en gran mesura de les dades emmagatzemades en els sistemes d'informació. En conseqüència, una manca en la qualitat de les dades (QD) pot donar lloc a decisions sub-òptimes o dificultar els processos i resultats de les investigacions derivades. Aquesta tesi té com a propòsit la investigació i desenvolupament de mètodes per avaluar dos problemes especialment importants en repositoris de dades massius (Big Data) basats en infraestructures multi-institucionals o transfrontereres, adquirits durant llargs períodes de temps: la variabilitat de les distribucions de probabilitat (DPs) de les dades entre diferents fonts o llocs-variabilitat multi-font-i la variabilitat de les distribucions de probabilitat de les dades al llarg del temps-variabilitat temporal. La variabilitat en DPs pot estar causada per diferències en els mètodes d'adquisició, protocols o polítiques d'atenció; errors sistemàtics o aleatoris durant l'entrada o gestió de dades; diferències demogràfiques en les poblacions; o fins i tot per dades falsificades. Aquesta tesi aporta mètodes per detectar, mesurar i caracteritzar aquesta variabilitat, tractant amb dades multi-tipus, multivariants i multi-modals, i no sent afectats per mides mostrals grans. Per a això, hem definit un marc de Teoria i Geometria de la Informació basat en la inferència de varietats de Riemann no-paramètriques a partir de distàncies normalitzades entre les DPs de diverses fonts de dades o al llarg del temps. En conseqüència s'han aportat les següents contribucions: Per avaluar la variabilitat multi-font s'han definit dos mètriques: la Global Probabilistic Deviation, la qual mesura la variabilitat global entre les DPs de les diferents fonts-equivalent a la desviació estàndard entre DPs; i la Source Probabilistic Outlyingness, la qual mesura la dissimilaritat entre la DP d'una font de dades donada i una mitjana global latent. Aquestes estan basades en la construcció d'un simplex mitjançant les distàncies en les DPs entre fonts. Basat en aquest, s'ha definit el Multi-Source Variability plot, una visualització que permet detectar patrons d'agrupament entre fonts. El mètode de variabilitat temporal proporciona dues eines: l'Information Geometric Temporal plot, visualització exploratòria de l'evolució temporal de les distribucions de dades basada en la varietat estadística dels lots temporals; i el Statistical Process Control de DPs, algoritme per al monitoratge i detecció automàtica de canvis en les DPs de dades. 
Els mètodes han estat aplicats en repositoris de casos d'estudi reals, incloent: els Registres de Salut Pública de Mortalitat i Càncer de la Comunitat Valenciana; els repositoris de malalties del cor de UCI i NHDS dels Estats Units; i repositoris espanyols de Càncer de Mama i Fecundació In-Vitro. Els mètodes han detectat troballes com particions dels repositoris en subgrups probabilístics temporals, anomalies temporals puntuals, i fonts de dades anòmales i agrupades a causa de diferències en poblacions i en les pràctiques. S'han desenvolupat eines programari incloent els mètodes i la generació automàtica d'informes. Finalment, s'ha definit la base teòrica d'un marc de QD biomèdiques, el qual ha estat utilitzat en la construcció de repositoris de qualitat per l'alimentació del lactant, la contextualització de dades per a la reutilització en Sistemes d'Ajuda a la Decisió Mèdica usant un wrapper HL7-CDA, i en un servei on-line per a l'avaluació i classificació de la QD de repositoris biomèdics. Els resultats d'aquesta tesi han estat publicats en vuit contribucions científiques (revistes indexades i en articles en congressos), una de elles seleccionada per la IMIA com una de les millors publicacions en Sistemes d'Informació de Salut en 2013. Els resultats han contribuït en diversos projectes d'investigació, i han facilitat la industrialització de les tecnologies d<br>Sáez Silvestre, C. (2016). Probabilistic methods for multi-source and temporal biomedical data quality assessment [Tesis doctoral]. Editorial Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/62188<br>TESIS<br>Premiado
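As a rough numerical illustration of the multi-source variability assessment described in the abstract above, the sketch below measures how far each source's distribution lies from a simple per-category average using the Jensen-Shannon distance. The distance choice, the toy hospital sources and the plain averaging are simplifying assumptions; they do not reproduce the simplex-based Global Probabilistic Deviation and Source Probabilistic Outlyingness metrics themselves.

    # Minimal sketch: compare the distribution of one categorical variable across data sources.
    from math import log, sqrt

    def js_distance(p, q):
        """Jensen-Shannon distance (base 2, bounded in [0, 1]) between two discrete PDFs."""
        m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
        def kl(a, b):
            return sum(ai * log(ai / bi, 2) for ai, bi in zip(a, b) if ai > 0)
        return sqrt(0.5 * kl(p, m) + 0.5 * kl(q, m))

    # Probability of each diagnosis category estimated per source (toy numbers).
    sources = {
        "hospital_A": [0.50, 0.30, 0.20],
        "hospital_B": [0.48, 0.32, 0.20],
        "hospital_C": [0.20, 0.30, 0.50],   # probabilistically outlying source
    }

    # A latent "central" distribution, approximated here by the per-category mean.
    central = [sum(p[i] for p in sources.values()) / len(sources) for i in range(3)]

    for name, pdf in sources.items():
        print(name, round(js_distance(pdf, central), 3))   # hospital_C stands out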
APA, Harvard, Vancouver, ISO, and other styles