
Dissertations / Theses on the topic 'Forensic engineering - Data processing'



Consult the top 50 dissertations / theses for your research on the topic 'Forensic engineering - Data processing.'


You can also download the full text of each academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Hargreaves, C. J. "Assessing the Reliability of Digital Evidence from Live Investigations Involving Encryption." Thesis, Department of Informatics and Sensors, 2009. http://hdl.handle.net/1826/4007.

Full text
Abstract:
The traditional approach to a digital investigation when a computer system is encountered in a running state is to remove the power, image the machine using a write blocker and then analyse the acquired image. This has the advantage of preserving the contents of the computer’s hard disk at that point in time. However, the disadvantage of this approach is that the preservation of the disk is at the expense of volatile data such as that stored in memory, which does not remain once the power is disconnected. There are an increasing number of situations where this traditional approach of ‘pulling the plug’ is not ideal since volatile data is relevant to the investigation; one of these situations is when the machine under investigation is using encryption. If encrypted data is encountered on a live machine, a live investigation can be performed to preserve this evidence in a form that can be later analysed. However, there are a number of difficulties with using evidence obtained from live investigations that may cause the reliability of such evidence to be questioned. This research investigates whether digital evidence obtained from live investigations involving encryption can be considered to be reliable. To determine this, a means of assessing reliability is established, which involves evaluating digital evidence against a set of criteria; evidence should be authentic, accurate and complete. This research considers how traditional digital investigations satisfy these requirements and then determines the extent to which evidence from live investigations involving encryption can satisfy the same criteria. This research concludes that it is possible for live digital evidence to be considered to be reliable, but that reliability of digital evidence ultimately depends on the specific investigation and the importance of the decision being made. However, the research provides structured criteria that allow the reliability of digital evidence to be assessed, demonstrates the use of these criteria in the context of live digital investigations involving encryption, and shows the extent to which each can currently be met.
2

Hargreaves, Christopher James. "Assessing the reliability of digital evidence from live investigations involving encryption." Thesis, Cranfield University, 2009. http://dspace.lib.cranfield.ac.uk/handle/1826/4007.

Full text
Abstract:
The traditional approach to a digital investigation when a computer system is encountered in a running state is to remove the power, image the machine using a write blocker and then analyse the acquired image. This has the advantage of preserving the contents of the computer’s hard disk at that point in time. However, the disadvantage of this approach is that the preservation of the disk is at the expense of volatile data such as that stored in memory, which does not remain once the power is disconnected. There are an increasing number of situations where this traditional approach of ‘pulling the plug’ is not ideal since volatile data is relevant to the investigation; one of these situations is when the machine under investigation is using encryption. If encrypted data is encountered on a live machine, a live investigation can be performed to preserve this evidence in a form that can be later analysed. However, there are a number of difficulties with using evidence obtained from live investigations that may cause the reliability of such evidence to be questioned. This research investigates whether digital evidence obtained from live investigations involving encryption can be considered to be reliable. To determine this, a means of assessing reliability is established, which involves evaluating digital evidence against a set of criteria; evidence should be authentic, accurate and complete. This research considers how traditional digital investigations satisfy these requirements and then determines the extent to which evidence from live investigations involving encryption can satisfy the same criteria. This research concludes that it is possible for live digital evidence to be considered to be reliable, but that reliability of digital evidence ultimately depends on the specific investigation and the importance of the decision being made. However, the research provides structured criteria that allow the reliability of digital evidence to be assessed, demonstrates the use of these criteria in the context of live digital investigations involving encryption, and shows the extent to which each can currently be met.
3

Hansson, Desiree Shaun. "A prototype fact sheet designed for the development of a forensic computerized information system at Valkenberg and Lentegeur Hospitals." Master's thesis, University of Cape Town, 1987. http://hdl.handle.net/11427/15865.

Full text
Abstract:
Includes bibliography.
The discussion in this paper centers on the development of a paper-and-pencil fact sheet for collecting and systematizing forensic case material. This paper-and-pencil device is the prototype fact sheet that will be used to collect the data to form a computerized forensic information system. The system, known as FOCIS, the Forensic Computerized Information System, will serve the largest Forensic Unit in the Western Cape, at Valkenberg Hospital, and the new unit that is being developed at Lentegeur Hospital. FOCIS will comprise case material from all forensic referrals to these two hospitals under the present law: Sections 77, 78 and 79 of the Criminal Procedure Act 51 of the 1st of July 1977. Additionally, FOCIS will develop dynamically, continuing to incorporate case material as referrals are made to these hospitals. The estimated 7500 cases that will constitute FOCIS by the time this project is completed include all of the officially classified population groups of South Africa, i.e. the so-called 'black', 'coloured' and 'white' groups [POPULATION REGISTRATION ACT, 1982]. The prototype fact sheet has a schematic layout and uses a mixed format for data collection, i.e. checklists, multiple-choice answer options and semi-structured narrative text.
4

Fernandez, Noemi. "Statistical information processing for data classification." FIU Digital Commons, 1996. http://digitalcommons.fiu.edu/etd/3297.

Full text
Abstract:
This thesis introduces new algorithms for analysis and classification of multivariate data. Statistical approaches are devised for the objectives of data clustering, data classification and object recognition. An initial investigation begins with the application of fundamental pattern recognition principles. Where such fundamental principles meet their limitations, statistical and neural algorithms are integrated to augment the overall approach for an enhanced solution. This thesis provides a new dimension to the problem of classification of data as a result of the following developments: (1) application of algorithms for object classification and recognition; (2) integration of a neural network algorithm which determines the decision functions associated with the task of classification; (3) determination and use of the eigensystem using newly developed methods with the objectives of achieving optimized data clustering and data classification, and dynamic monitoring of time-varying data; and (4) use of the principal component transform to exploit the eigensystem in order to perform the important tasks of orientation-independent object recognition, and dimensionality reduction of the data such as to optimize the processing time without compromising accuracy in the analysis of this data.
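A minimal sketch of the eigensystem-based dimensionality reduction described above, written in Python with NumPy on synthetic data; it illustrates the general technique, not the author's implementation.

```python
# Illustrative sketch (assumed, not the thesis code): computing the eigensystem
# of the data covariance matrix and using the principal component transform
# for dimensionality reduction before clustering or classification.
import numpy as np

rng = np.random.default_rng(0)
# Synthetic multivariate data: 300 samples, 10 correlated features.
X = rng.normal(size=(300, 3)) @ rng.normal(size=(3, 10)) + rng.normal(0, 0.1, (300, 10))

Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)                 # 10 x 10 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)         # eigensystem (ascending order)
order = np.argsort(eigvals)[::-1]              # sort by decreasing variance
W = eigvecs[:, order[:3]]                      # keep the 3 dominant eigenvectors

scores = Xc @ W                                # principal component transform
explained = eigvals[order[:3]].sum() / eigvals.sum()
print(f"3 components explain {explained:.1%} of the variance")
```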
5

Chiu, Cheng-Jung. "Data processing in nanoscale profilometry." Thesis, Massachusetts Institute of Technology, 1995. http://hdl.handle.net/1721.1/36677.

Full text
Abstract:
Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 1995.
Includes bibliographical references (p. 176-177).
New developments on the nanoscale are taking place rapidly in many fields. Instrumentation used to measure and understand the geometry and properties of small-scale structures is therefore essential. One of the most promising devices for taking measurement science into the nanoscale is the scanning probe microscope. A prototype of a nanoscale profilometer based on the scanning probe microscope has been built in the Laboratory for Manufacturing and Productivity at MIT. A sample is placed on a precision flip stage and different sides of the sample are scanned under the SPM to acquire their separate surface topographies. To reconstruct the original three-dimensional profile, many techniques such as digital filtering, edge identification, and image matching are investigated and implemented in computer programs to post-process the data, with greater emphasis placed on the nanoscale application. The important programming issues are also addressed. Finally, this system's error sources are discussed and analyzed.
by Cheng-Jung Chiu.
M.S.
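As a rough illustration of the digital filtering and edge identification steps mentioned in the abstract above (not the thesis software), the following Python sketch smooths a noisy synthetic step profile and locates the edge from the gradient:

```python
# Illustrative sketch (assumed, not the thesis code): smoothing a noisy 1-D
# surface profile and locating a step edge from the gradient magnitude.
import numpy as np
from scipy.ndimage import gaussian_filter1d

x = np.linspace(0.0, 10.0, 1000)                 # scan position (arbitrary units)
profile = np.where(x > 5.0, 100.0, 0.0)          # an ideal 100 nm step
profile += np.random.default_rng(0).normal(0.0, 5.0, x.size)  # measurement noise

smoothed = gaussian_filter1d(profile, sigma=10)  # digital low-pass filtering
gradient = np.gradient(smoothed, x)              # slope of the filtered profile

# Edge identification: positions where the slope exceeds a threshold.
edges = x[np.abs(gradient) > 0.5 * np.abs(gradient).max()]
print("step edge located near x =", edges.mean())
```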
6

Derksen, Timothy J. (Timothy John). "Processing of outliers and missing data in multivariate manufacturing data." Thesis, Massachusetts Institute of Technology, 1996. http://hdl.handle.net/1721.1/38800.

Full text
Abstract:
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1996.
Includes bibliographical references (leaf 64).
by Timothy J. Derksen.
M.Eng.
7

Nyström, Simon, and Joakim Lönnegren. "Processing data sources with big data frameworks." Thesis, KTH, Data- och elektroteknik, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-188204.

Full text
Abstract:
Big data is a concept that is expanding rapidly. As more and more data is generated and garnered, there is an increasing need for efficient solutions that can be utilized to process all this data in attempts to gain value from it. The purpose of this thesis is to find an efficient way to quickly process a large number of relatively small files. More specifically, the purpose is to test two frameworks that can be used for processing big data. The frameworks that are tested against each other are Apache NiFi and Apache Storm. A method is devised in order to, firstly, construct a data flow and, secondly, construct a method for testing the performance and scalability of the frameworks running this data flow. The results reveal that Apache Storm is faster than Apache NiFi at the sort of task that was tested. As the number of nodes included in the tests went up, the performance did not always follow. This indicates that adding more nodes to a big data processing pipeline does not always result in a better-performing setup and that, sometimes, other measures must be taken to improve performance.
Big data is a concept that is growing quickly. As more and more data is generated and collected, there is a growing need for efficient solutions that can be used to process all this data in attempts to extract value from it. The purpose of this degree project is to find an efficient way to quickly process a large number of files of relatively small size. More specifically, it is to test two frameworks that can be used for big data processing. The two frameworks tested against each other are Apache NiFi and Apache Storm. A method is described for, firstly, constructing a data flow and, secondly, constructing a method for testing the performance and scalability of the frameworks that run the data flow. The results reveal that Apache Storm is faster than NiFi for the type of test that was performed. When the number of nodes included in the tests was increased, performance did not always increase. This shows that an increase in the number of nodes in a big data processing chain does not always lead to better performance and that other measures are sometimes required to increase performance.
8

徐順通 and Sung-thong Andrew Chee. "Computerisation in Hong Kong professional engineering firms." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1985. http://hub.hku.hk/bib/B31263124.

Full text
9

Wang, Yi. "Data Management and Data Processing Support on Array-Based Scientific Data." The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1436157356.

Full text
10

Bostanudin, Nurul Jihan Farhah. "Computational methods for processing ground penetrating radar data." Thesis, University of Portsmouth, 2013. https://researchportal.port.ac.uk/portal/en/theses/computational-methods-for-processing-ground-penetrating-radar-data(d519f94f-04eb-42af-a504-a4c4275d51ae).html.

Full text
Abstract:
The aim of this work was to investigate signal processing and analysis techniques for Ground Penetrating Radar (GPR) and its use in the civil engineering and construction industry. GPR is the general term applied to techniques which employ radio waves, typically in the megahertz and gigahertz range, to map structures and features buried in the ground or in manmade structures. GPR measurements can suffer from a large amount of noise. This is primarily caused by interference from other radio-wave-emitting devices (e.g., cell phones, radios, etc.) that are present in the surrounding area of the GPR system during data collection. In addition to noise, the presence of clutter – reflections from other, non-target objects buried underground in the vicinity of the target – can make GPR measurements difficult to understand and interpret, even for skilled human GPR analysts. This thesis is concerned with the improvements and processes that can be applied to GPR data in order to enhance the target detection and characterisation process, particularly with multivariate signal processing techniques. Those primarily include Principal Component Analysis (PCA) and Independent Component Analysis (ICA). Both techniques have been investigated, implemented and compared regarding their abilities to separate the target-originating signals from the noise and clutter-type signals present in the data. The combination of PCA and ICA (SVDPICA) and two-dimensional PCA (2DPCA) are the specific approaches adopted and further developed in this work. The ability of those methods to reduce the amount of clutter and unwanted signals present in GPR data has been investigated and reported in this thesis, suggesting that their use in automated analysis of GPR images is a possibility. Further analysis carried out in this work concentrated on analysing the performance of the developed multivariate signal processing techniques and at the same time investigating the possibility of identifying and characterising the features of interest in pre-processed GPR images. The driving idea behind this part of the work was to extract the resonant modes present in the individual traces of each GPR image and to use the properties of those poles to characterise the target. Three related but different methods have been implemented and applied in this work – Extended Prony, Linear Prediction Singular Value Decomposition and Matrix Pencil methods. In addition to these approaches, the PCA technique has been used to reduce the dimensionality of extracted traces and to compare signals measured in various experimental setups. Performance analysis shows that the Matrix Pencil method offers the best results.
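To illustrate the kind of multivariate clutter suppression described above, here is a hedged Python sketch that removes the dominant component of a synthetic B-scan via SVD; the SVDPICA and 2DPCA processing developed in the thesis is considerably more sophisticated.

```python
# Illustrative sketch (an assumption, not the thesis code): suppressing clutter
# that is common to all traces of a GPR B-scan by discarding the leading
# singular components of the trace matrix.
import numpy as np

rng = np.random.default_rng(1)
n_traces, n_samples = 200, 512
bscan = rng.normal(0.0, 0.05, (n_traces, n_samples))       # noise floor
bscan += np.sin(np.linspace(0, 40 * np.pi, n_samples))      # clutter band shared by all traces
bscan[80:120, 250:300] += 1.0                                # a buried target signature

# Each row is one trace; the strongest singular components capture signal
# that is highly correlated across traces, i.e. clutter rather than targets.
U, s, Vt = np.linalg.svd(bscan, full_matrices=False)
n_clutter = 1                                                # components treated as clutter
clutter = (U[:, :n_clutter] * s[:n_clutter]) @ Vt[:n_clutter, :]
cleaned = bscan - clutter
print("image energy before/after clutter removal: %.1f -> %.1f"
      % (np.linalg.norm(bscan), np.linalg.norm(cleaned)))
```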
11

Moses, Samuel Isaiah. "Measuring The Robustness of Forensic Tools' Ability to Detect Data Hiding Techniques." BYU ScholarsArchive, 2017. https://scholarsarchive.byu.edu/etd/6464.

Full text
Abstract:
The goal of this research is to create a methodology that measures the robustness and effectiveness of forensic tools' ability to detect data hiding. First, an extensive search for any existing guidelines for testing against data hiding was performed. After finding none, existing guidelines and frameworks in cybersecurity and cyber forensics were reviewed. Next, I created the methodology in this thesis. This methodology includes a set of steps that a user should take to evaluate a forensic tool. The methodology has been designed to be flexible and scalable so that, as new anti-forensic data hiding methods are discovered and developed, they can easily be added to the framework, and the evaluator using the framework can tailor it to the files they are most focused on. Once a polished draft of the entire methodology was completed, it was reviewed by information technology and security professionals and updated based on their feedback. Two popular forensic tools – Autopsy/Sleuthkit and X-Ways – were evaluated using the methodology developed. The evaluation revealed improvements to the methodology, which were then incorporated. I propose that the methodology can be an effective tool to provide insight and evaluate forensic tools.
12

Grinman, Alex J. "Natural language processing on encrypted patient data." Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/113438.

Full text
Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 85-86).
While many industries can benefit from machine learning techniques for data analysis, they often do not have the technical expertise or the computational power to do so. Therefore, many organizations would benefit from outsourcing their data analysis. Yet, stringent data privacy policies prevent outsourcing sensitive data and may stop the delegation of data analysis in its tracks. In this thesis, we put forth a two-party system where one party capable of powerful computation can run certain machine learning algorithms from the natural language processing domain on the second party's data, while the first party is limited to learning only specific functions of the second party's data and nothing else. Our system provides simple cryptographic schemes for locating keywords, matching approximate regular expressions, and computing frequency analysis on encrypted data. We present a full implementation of this system in the form of an extendible software library and a command line interface. Finally, we discuss a medical case study where we used our system to run a suite of unmodified machine learning algorithms on encrypted free text patient notes.
by Alex J. Grinman.
M. Eng.
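The following toy Python sketch illustrates the general idea of locating keywords over encrypted text using deterministic HMAC tokens; it is my own simplification of the problem setting, not the cryptographic schemes developed in the thesis (and it leaks token frequencies, which a real scheme would need to address).

```python
# Toy sketch (a simplification, not the thesis system): a client uploads
# HMAC-based tokens for each word so an untrusted server can locate keyword
# matches without learning the underlying text.
import hmac
import hashlib

def token(word: str, key: bytes) -> bytes:
    """Deterministic keyword token; equal words map to equal tokens."""
    return hmac.new(key, word.lower().encode(), hashlib.sha256).digest()

key = b"client-secret-key"
note = "patient reports chest pain radiating to left arm"
encrypted_index = [token(w, key) for w in note.split()]   # what the server stores

# Later, the client asks the server to search for one keyword.
query = token("pain", key)
positions = [i for i, t in enumerate(encrypted_index) if t == query]
print("keyword found at word positions:", positions)
```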
13

Westlund, Kenneth P. (Kenneth Peter). "Recording and processing data from transient events." Thesis, Massachusetts Institute of Technology, 1988. https://hdl.handle.net/1721.1/129961.

Full text
Abstract:
Thesis (B.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1988.
Includes bibliographical references.
by Kenneth P. Westlund Jr.
14

Setiowijoso, Liono. "Data Allocation for Distributed Programs." PDXScholar, 1995. https://pdxscholar.library.pdx.edu/open_access_etds/5102.

Full text
Abstract:
This thesis shows that both data and code must be efficiently distributed to achieve good performance in a distributed system. Most previous research has either tried to distribute code structures to improve parallelism or to distribute data to reduce communication costs. Code distribution (exploiting functional parallelism) is an effort to distribute or to duplicate function codes to optimize parallel performance. On the other hand, data distribution tries to place data structures as close as possible to the function codes that use them, so that communication cost can be reduced. In particular, dataflow researchers have primarily focused on code partitioning and assignment. We have adapted existing data allocation algorithms for use with an existing dataflow-based system, ParPlum. ParPlum allows the execution of dataflow graphs on networks of workstations. To evaluate the impact of data allocation, we extended ParPlum to more effectively handle data structures. We then implemented tools to extract from dataflow graphs information that is relevant to the mapping algorithms and fed this information to our version of a data distribution algorithm. To see the relation between code and data parallelism we added optimizations for the distribution of the loop function components and the data structure access components. All of these are done automatically without programmer or user involvement. We ran a number of experiments using matrix multiplication as our workload. We used different numbers of processors and different existing partitioning and allocation algorithms. Our results show that automatic data distribution greatly improves the performance of distributed dataflow applications. For example, with 15 x 15 matrices, applying data distribution speeds up execution about 80% on 7 machines. Using data distribution and our code optimizations on 7 machines speeds up execution over the base case by 800%. Our work shows that it is possible to make efficient use of distributed networks with compiler support and shows that both code mapping and data mapping must be considered to achieve optimal performance.
15

Jakovljevic, Sasa. "Data collecting and processing for substation integration enhancement." Texas A&M University, 2003. http://hdl.handle.net/1969/93.

Full text
16

Aygar, Alper. "Doppler Radar Data Processing And Classification." Master's thesis, METU, 2008. http://etd.lib.metu.edu.tr/upload/12609890/index.pdf.

Full text
Abstract:
In this thesis, improving the performance of the automatic recognition of Doppler radar targets is studied. The radar used in this study is a ground-surveillance Doppler radar. Target types are car, truck, bus, tank, helicopter, moving man and running man. The input to this thesis is the output of real Doppler radar signals, which were normalized and preprocessed (TRP vectors: Target Recognition Pattern vectors) in the doctoral thesis by Erdogan (2002). TRP vectors are normalized and homogenized Doppler radar target signals with respect to target speed, target aspect angle and target range. Some target classes have repetitions in time in their TRPs. By the use of these repetitions, improvement of the target type classification performance is studied. K-Nearest Neighbor (KNN) and Support Vector Machine (SVM) algorithms are used for Doppler radar target classification and the results are evaluated. Before classification, PCA (Principal Component Analysis), LDA (Linear Discriminant Analysis), NMF (Nonnegative Matrix Factorization) and ICA (Independent Component Analysis) are implemented and applied to the normalized Doppler radar signals for feature extraction and dimension reduction in an efficient way. These techniques transform the input vectors, which are the normalized Doppler radar signals, to another space. The effects of these feature extraction algorithms and of the use of the repetitions in Doppler radar target signals on the Doppler radar target classification performance are studied.
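A small Python sketch of the overall recipe, PCA-based dimension reduction followed by KNN and SVM classification, using scikit-learn on synthetic stand-in data rather than the actual TRP vectors; this is an illustration of the technique, not the thesis experiments.

```python
# Illustrative sketch (assumed setup): comparing KNN and SVM classifiers on
# PCA-reduced feature vectors standing in for normalized radar target signals.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for seven target classes (car, truck, bus, tank, ...).
X, y = make_classification(n_samples=700, n_features=64, n_informative=20,
                           n_classes=7, random_state=0)

for name, clf in [("KNN", KNeighborsClassifier(n_neighbors=5)),
                  ("SVM", SVC(kernel="rbf", C=10.0))]:
    pipe = make_pipeline(PCA(n_components=20), clf)   # feature reduction + classifier
    scores = cross_val_score(pipe, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```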
17

Lu, Feng. "Big data scalability for high throughput processing and analysis of vehicle engineering data." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-207084.

Full text
Abstract:
"Sympathy for Data" is a platform used for Big Data analytics automation. It is based on a visual interface and workflow configurations. The main purpose of the platform is to reuse parts of code for structured analysis of vehicle engineering data. However, there are performance issues on a single machine when processing a large amount of data in Sympathy for Data. There are also disk and CPU I/O bottlenecks when the data is too large to fit comfortably in memory. In addition, for data at the TB or PB level, Sympathy for Data needs separate functionality for efficient simultaneous processing and scalable distributed computation. This thesis focuses on exploring the possibilities and limitations of using the Sympathy for Data platform in various data analytic scenarios within the Volvo Cars vision and strategy. The project rewrites the CDE workflow of over 300 nodes into pure Python script code and makes it executable on the Apache Spark and Dask infrastructure. We explore and compare both distributed computing frameworks, implemented on Amazon Web Services EC2 using four machines of a 4x instance type as a distributed cluster for the measurements. The benchmark results show that Spark is superior to Dask from a performance perspective. Apache Spark and Dask are combined with the Sympathy for Data products into a Big Data processing engine to optimize system disk and CPU I/O utilization. There are several challenges when using Spark and Dask to analyze large-scale scientific data on such systems. For instance, parallel file systems are shared among all computing machines, in contrast to shared-nothing architectures. Moreover, accessing data stored in commonly used scientific data formats, such as HDF5, is not natively supported in Spark. This report presents research carried out on the next generation of Big Data platforms in the automotive industry, called "Sympathy for Data". The research questions focus on improving the I/O performance and scalable distributed functionality to support Big Data analytics. During this project, we used the Dask.Array parallelism features to interpret the data sources as a raster in table format, and Apache Spark was used as the data processing engine, loading data sources to memory in parallel to improve big data computation capacity. The experiments chapter demonstrates a 640 GB engineering data benchmark for single-node and distributed computation modes to evaluate the Sympathy for Data disk, CPU and memory metrics. Finally, the outcome of this project improved the performance of the original Sympathy for Data by a factor of six through the development of a middleware, SparkImporter, which is used in Sympathy for Data for distributed computation and connects to Apache Spark for data processing, maximizing the utilization of system resources. This improves its throughput, scalability, and performance. It also increases the capacity of Sympathy for Data to process Big Data and avoids the need for big data cluster infrastructures.
APA, Harvard, Vancouver, ISO, and other styles
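As a hedged illustration of the Dask side of such a pipeline (the file name, dataset path and chunk size below are hypothetical), a chunked Dask array over an HDF5 dataset lets per-channel statistics be computed in parallel and out of core:

```python
# Minimal sketch (hypothetical file and dataset names, not the thesis pipeline):
# interpret a large HDF5 measurement log as a chunked Dask array and compute
# per-channel statistics without loading everything into memory at once.
import h5py
import dask.array as da

with h5py.File("vehicle_log.h5", "r") as f:           # hypothetical engineering log
    dset = f["/measurements/signals"]                  # assumed shape: (n_samples, n_channels)
    signals = da.from_array(dset, chunks=(1_000_000, dset.shape[1]))

    # Lazy task graph; nothing is read from disk until compute() is called.
    channel_means = signals.mean(axis=0)
    channel_peaks = abs(signals).max(axis=0)
    means, peaks = da.compute(channel_means, channel_peaks)

print(means[:5], peaks[:5])
```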
18

Chen, Jiawen (Jiawen Kevin). "Efficient data structures for piecewise-smooth video processing." Thesis, Massachusetts Institute of Technology, 2011. http://hdl.handle.net/1721.1/66003.

Full text
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 95-102).
A number of useful image and video processing techniques, ranging from low level operations such as denoising and detail enhancement to higher level methods such as object manipulation and special effects, rely on piecewise-smooth functions computed from the input data. In this thesis, we present two computationally efficient data structures for representing piecewise-smooth visual information and demonstrate how they can dramatically simplify and accelerate a variety of video processing algorithms. We start by introducing the bilateral grid, an image representation that explicitly accounts for intensity edges. By interpreting brightness values as Euclidean coordinates, the bilateral grid enables simple expressions for edge-aware filters. Smooth functions defined on the bilateral grid are piecewise-smooth in image space. Within this framework, we derive efficient reinterpretations of a number of edge-aware filters commonly used in computational photography as operations on the bilateral grid, including the bilateral filter, edge-aware scattered data interpolation, and local histogram equalization. We also show how these techniques can be easily parallelized onto modern graphics hardware for real-time processing of high definition video. The second data structure we introduce is the video mesh, designed as a flexible central data structure for general-purpose video editing. It represents objects in a video sequence as 2.5D "paper cutouts" and allows interactive editing of moving objects and modeling of depth, which enables 3D effects and post-exposure camera control. In our representation, we assume that motion and depth are piecewise-smooth, and encode them sparsely as a set of points tracked over time. The video mesh is a triangulation over this point set and per-pixel information is obtained by interpolation. To handle occlusions and detailed object boundaries, we rely on the user to rotoscope the scene at a sparse set of frames using spline curves. We introduce an algorithm to robustly and automatically cut the mesh into local layers with proper occlusion topology, and propagate the splines to the remaining frames. Object boundaries are refined with per-pixel alpha mattes. At its core, the video mesh is a collection of texture-mapped triangles, which we can edit and render interactively using graphics hardware. We demonstrate the effectiveness of our representation with special effects such as 3D viewpoint changes, object insertion, depth-of-field manipulation, and 2D to 3D video conversion.
by Jiawen Chen.
Ph.D.
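A much-simplified Python sketch of the bilateral grid idea from the abstract above follows (grayscale only, nearest-neighbour splatting and slicing); the thesis version is more elaborate and runs on graphics hardware.

```python
# Simplified sketch of a bilateral grid (my own reduction, not the thesis code):
# intensity is treated as a third coordinate, so a plain Gaussian blur inside
# the grid behaves as an edge-aware filter back in image space.
import numpy as np
from scipy.ndimage import gaussian_filter

def bilateral_grid_filter(img, sigma_s=8, sigma_r=0.1):
    h, w = img.shape
    gh, gw, gr = h // sigma_s + 2, w // sigma_s + 2, int(1 / sigma_r) + 2

    data = np.zeros((gh, gw, gr))     # accumulated intensity
    weight = np.zeros((gh, gw, gr))   # homogeneous coordinate (counts)

    ys, xs = np.mgrid[0:h, 0:w]
    gy, gx = ys // sigma_s, xs // sigma_s
    gz = np.clip((img / sigma_r).astype(int), 0, gr - 1)

    np.add.at(data, (gy, gx, gz), img)          # splat
    np.add.at(weight, (gy, gx, gz), 1.0)

    data = gaussian_filter(data, sigma=1.0)     # blur in space and range
    weight = gaussian_filter(weight, sigma=1.0)

    out = data[gy, gx, gz] / np.maximum(weight[gy, gx, gz], 1e-8)   # slice
    return out

img = np.clip(np.random.default_rng(0).normal(0.5, 0.05, (128, 128)), 0, 1)
img[:, 64:] += 0.3                   # a sharp intensity edge that should survive
print(bilateral_grid_filter(img).shape)
```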
19

Jakubiuk, Wiktor. "High performance data processing pipeline for connectome segmentation." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/106122.

Full text
Abstract:
Thesis: M. Eng. in Computer Science and Engineering, Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, February 2016.
"December 2015." Cataloged from PDF version of thesis.
Includes bibliographical references (pages 83-88).
By investigating neural connections, neuroscientists try to understand the brain and reconstruct its connectome. Automated connectome reconstruction from high-resolution electron microscopy is a challenging problem, as all neurons and synapses in a volume have to be detected. A mm3 of high-resolution brain tissue takes roughly a petabyte of space, which state-of-the-art pipelines are unable to process to date. A high-performance, fully automated image processing pipeline is proposed. Using a combination of image processing and machine learning algorithms (convolutional neural networks and random forests), the pipeline constructs a 3-dimensional connectome from 2-dimensional cross-sections of a mammal's brain. The proposed system achieves a low error rate (comparable with the state-of-the-art) and is capable of processing volumes of hundreds of gigabytes in size. The main contributions of this thesis are multiple algorithmic techniques for 2-dimensional pixel classification with varying accuracy and speed trade-offs, as well as a fast object segmentation algorithm. The majority of the system is parallelized for multi-core machines, and with minor additional modifications is expected to work in a distributed setting.
by Wiktor Jakubiuk.
M. Eng. in Computer Science and Engineering
20

Nguyen, Qui T. "Robust data partitioning for ad-hoc query processing." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/106004.

Full text
Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 59-62).
Data partitioning can significantly improve query performance in distributed database systems. Most proposed data partitioning techniques choose the partitioning based on a particular expected query workload or use a simple upfront scheme, such as uniform range partitioning or hash partitioning on a key. However, these techniques do not adequately address the case where the query workload is ad-hoc and unpredictable, as in many analytic applications. The HYPER-PARTITIONING system aims to fill that gap, by using a novel space-partitioning tree on the space of possible attribute values to define partitions incorporating all attributes of a dataset. The system creates a robust upfront partitioning tree, designed to benefit all possible queries, and then adapts it over time in response to the actual workload. This thesis evaluates the robustness of the upfront hyper-partitioning algorithm, describes the implementation of the overall HYPER-PARTITIONING system, and shows how hyper-partitioning improves the performance of both selection and join queries.
by Qui T. Nguyen.
M. Eng.
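A rough Python sketch of an upfront space-partitioning tree in the spirit described above: it cycles through all attributes and splits at medians, so no single anticipated workload is favoured. This is my own simplification, not the HYPER-PARTITIONING implementation.

```python
# Rough sketch (an assumption, not the thesis system): recursively split a
# table on every attribute in turn at the median, producing workload-agnostic
# leaf partitions.
import numpy as np

def build_partitions(rows, depth=0, leaf_size=4):
    """Split a (n_rows, n_attrs) array into leaf partitions."""
    if len(rows) <= leaf_size:
        return [rows]
    attr = depth % rows.shape[1]            # cycle through all attributes
    pivot = np.median(rows[:, attr])
    left = rows[rows[:, attr] <= pivot]
    right = rows[rows[:, attr] > pivot]
    if len(left) == 0 or len(right) == 0:   # degenerate split, stop here
        return [rows]
    return (build_partitions(left, depth + 1, leaf_size) +
            build_partitions(right, depth + 1, leaf_size))

table = np.random.default_rng(0).integers(0, 100, size=(64, 3))
partitions = build_partitions(table)
print(len(partitions), "partitions; first sizes:", [len(p) for p in partitions[:5]])
```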
21

Bao, Shunxing. "Algorithmic Enhancements to Data Colocation Grid Frameworks for Big Data Medical Image Processing." Thesis, Vanderbilt University, 2019. http://pqdtopen.proquest.com/#viewpdf?dispub=13877282.

Full text
Abstract:

Large-scale medical imaging studies to date have predominantly leveraged in-house, laboratory-based or traditional grid computing resources for their computing needs, where the applications often use hierarchical data structures (e.g., Network File System file stores) or databases (e.g., COINS, XNAT) for storage and retrieval. The resulting performance for laboratory-based approaches reveals that performance is impeded by standard network switches, since typical processing can saturate network bandwidth during transfer from storage to processing nodes for even moderate-sized studies. On the other hand, the grid may be costly to use due to the dedicated resources used to execute the tasks and the lack of elasticity. With the increasing availability of cloud-based big data frameworks, such as Apache Hadoop, cloud-based services for executing medical imaging studies have shown promise.

Despite this promise, our studies have revealed that existing big data frameworks exhibit different performance limitations for medical imaging applications, which calls for new algorithms that optimize their performance and suitability for medical imaging. For instance, Apache HBase's data distribution strategy of region split and merge is detrimental to the hierarchical organization of imaging data (e.g., project, subject, session, scan, slice). Big data medical image processing applications involving multi-stage analysis often exhibit significant variability in processing times, ranging from a few seconds to several days. Due to the sequential nature of executing the analysis stages with traditional software technologies and platforms, any errors in the pipeline are only detected at the later stages, despite the sources of errors predominantly being the highly compute-intensive first stage. This wastes precious computing resources and incurs prohibitively higher costs for re-executing the application. To address these challenges, this research proposes a framework - Hadoop & HBase for Medical Image Processing (HadoopBase-MIP) - which develops a range of performance optimization algorithms and employs a number of system behavior models for data storage, data access and data processing. We also describe how to build prototypes to verify the modeled system behaviors empirically. Furthermore, we present a discovery made during the development of HadoopBase-MIP: a new type of contrast for enhancing deep brain structures in medical imaging. Finally, we show how to carry the Hadoop-based framework design forward into a commercial big data / high-performance computing cluster with a cheap, scalable and geographically distributed file system.

22

Einstein, Noah. "SmartHub: Manual Wheelchair Data Extraction and Processing Device." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1555352793977171.

Full text
23

Hatchell, Brian. "Data base design for integrated computer-aided engineering." Thesis, Georgia Institute of Technology, 1987. http://hdl.handle.net/1853/16744.

Full text
24

Waite, Martin. "Data structures for the reconstruction of engineering drawings." Thesis, Nottingham Trent University, 1989. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.328794.

Full text
25

Guttman, Michael. "Sampled-data IIR filtering via time-mode signal processing." Thesis, McGill University, 2010. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=86770.

Full text
Abstract:
In this work, the design of sampled-data infinite impulse response filters based on time-mode signal processing circuits is presented. Time-mode signal processing (TMSP), defined as the processing of sampled analog information using time-difference variables, has become one of the more popular emerging technologies in circuit design. As TMSP is still relatively new, there is still much development needed to extend the technology into a general signal-processing tool. In this work, a set of general building blocks will be introduced that perform the most basic mathematical operations in the time mode. By arranging these basic structures, higher-order time-mode systems, specifically, time-mode filters, will be realized. Three second-order time-mode filters (low-pass, band-reject, high-pass) are modeled using MATLAB, and simulated in Spectre to verify the design methodology. Finally, a damped integrator and a second-order low-pass time-mode IIR filter are both implemented using discrete components.
In this thesis, the design of sampled-data filters with an infinite impulse response based on time-mode signal processing is presented. Time-mode signal processing (TMSP), defined as the processing of sampled analog information using time differences as variables, has become one of the most popular emerging circuit design techniques. Since TMSP is still relatively new, much development is still required to extend this technology into a general signal processing tool. In this research, a set of building blocks capable of performing most mathematical operations in the time domain is introduced. By arranging these elementary structures, higher-order time-mode systems, more specifically time-mode filters, are realized. Three second-order time-domain filters (low-pass, band-pass and high-pass) are modeled in MATLAB and simulated in Spectre to verify the design methodology. Finally, a damped integrator and a second-order low-pass time-mode IIR filter are implemented with discrete components.
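For reference, a conventional sampled-data (voltage-domain) model of the kind of second-order low-pass IIR filter discussed in the abstract above can be written in a few lines of Python; the time-mode circuit realization developed in the thesis is, of course, quite different, and the sample rate and cut-off below are assumed values.

```python
# Reference sketch in the conventional sampled-data domain (not the time-mode
# circuit itself): a second-order low-pass IIR filter as a biquad difference
# equation, with scipy used to derive the coefficients.
import numpy as np
from scipy.signal import butter, lfilter

fs = 48_000.0                      # sampling rate, Hz (assumed)
fc = 2_000.0                       # cut-off frequency, Hz (assumed)
b, a = butter(2, fc / (fs / 2))    # 2nd-order Butterworth low-pass

# y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]
t = np.arange(0, 0.01, 1 / fs)
x = np.sin(2 * np.pi * 500 * t) + 0.5 * np.sin(2 * np.pi * 10_000 * t)
y = lfilter(b, a, x)               # the 10 kHz component is strongly attenuated

print("input RMS %.3f -> output RMS %.3f" % (x.std(), y.std()))
```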
26

Faber, Marc. "On-Board Data Processing and Filtering." International Foundation for Telemetering, 2015. http://hdl.handle.net/10150/596433.

Full text
Abstract:
ITC/USA 2015 Conference Proceedings / The Fifty-First Annual International Telemetering Conference and Technical Exhibition / October 26-29, 2015 / Bally's Hotel & Convention Center, Las Vegas, NV
One of the requirements resulting from mounting pressure on flight test schedules is the reduction of time needed for data analysis, in pursuit of shorter test cycles. This requirement has ramifications such as the demand for recording and processing of not just raw measurement data but also of data converted to engineering units in real time, as well as for an optimized use of the bandwidth available for telemetry downlink and ultimately for shortening the duration of procedures intended to disseminate pre-selected recorded data among different analysis groups on the ground. A promising way to successfully address these needs consists in implementing more CPU intelligence and processing power directly on the on-board flight test equipment. This provides the ability to process complex data in real time. For instance, data acquired at different hardware interfaces (which may be compliant with different standards) can be directly converted to more easy-to-handle engineering units. This leads to a faster extraction and analysis of the actual data contents of the on-board signals and busses. Another central goal is the efficient use of the available bandwidth for telemetry. Real-time data reduction via intelligent filtering is one approach to achieve this challenging objective. The data filtering process should be performed simultaneously on an all-data-capture recording, and the user should be able to easily select the interesting data without building PCM formats on board or carrying out decommutation on the ground. This data selection should be as easy as possible for the user, and the on-board FTI devices should generate a seamless and transparent data transmission, making a quick data analysis viable. On-board data processing and filtering has the potential to become the future main path to handle the challenge of FTI data acquisition and analysis in a more comfortable and effective way.
27

Breest, Martin, Paul Bouché, Martin Grund, Sören Haubrock, Stefan Hüttenrauch, Uwe Kylau, Anna Ploskonos, Tobias Queck, and Torben Schreiter. "Fundamentals of Service-Oriented Engineering." Universität Potsdam, 2006. http://opus.kobv.de/ubp/volltexte/2009/3380/.

Full text
Abstract:
Since 2002, keywords like service-oriented engineering, service-oriented computing, and service-oriented architecture have been widely used in research, education, and enterprises. These and related terms are often misunderstood or used incorrectly. To correct these misunderstandings, a deeper knowledge of the concepts, their historical background, and an overview of service-oriented architectures is needed, and is given in this paper.
28

Hinrichs, Angela S. (Angela Soleil). "An architecture for distributing processing on realtime data streams." Thesis, Massachusetts Institute of Technology, 1995. http://hdl.handle.net/1721.1/11418.

Full text
29

Marcus, Adam Ph D. Massachusetts Institute of Technology. "Optimization techniques for human computation-enabled data processing systems." Thesis, Massachusetts Institute of Technology, 2012. http://hdl.handle.net/1721.1/78454.

Full text
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 119-124).
Crowdsourced labor markets make it possible to recruit large numbers of people to complete small tasks that are difficult to automate on computers. These marketplaces are increasingly widely used, with projections of over $1 billion being transferred between crowd employers and crowd workers by the end of 2012. While crowdsourcing enables forms of computation that artificial intelligence has not yet achieved, it also presents crowd workflow designers with a series of challenges including describing tasks, pricing tasks, identifying and rewarding worker quality, dealing with incorrect responses, and integrating human computation into traditional programming frameworks. In this dissertation, we explore the systems-building, operator design, and optimization challenges involved in building a crowd-powered workflow management system. We describe a system called Qurk that utilizes techniques from databases such as declarative workflow definition, high-latency workflow execution, and query optimization to aid crowd-powered workflow developers. We study how crowdsourcing can enhance the capabilities of traditional databases by evaluating how to implement basic database operators such as sorts and joins on datasets that could not have been processed using traditional computation frameworks. Finally, we explore the symbiotic relationship between the crowd and query optimization, enlisting crowd workers to perform selectivity estimation, a key component in optimizing complex crowd-powered workflows.
by Adam Marcus.
Ph.D.
30

Stein, Oliver. "Intelligent Resource Management for Large-scale Data Stream Processing." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-391927.

Full text
Abstract:
With the increasing trend of using cloud computing resources, the efficient utilization of these resources becomes more and more important. Working with data stream processing is a paradigm gaining in popularity, with tools such as Apache Spark Streaming or Kafka widely available, and companies are shifting towards real-time monitoring of data such as sensor networks, financial data or anomaly detection. However, it is difficult for users to efficiently make use of cloud computing resources and studies show that a lot of energy and compute hardware is wasted. We propose an approach to optimizing resource usage in cloud computing environments designed for data stream processing frameworks, based on bin packing algorithms. Test results show that the resource usage is substantially improved as a result, with future improvements suggested to further increase this. The solution was implemented as an extension of the HarmonicIO data stream processing framework and evaluated through simulated workloads.
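A minimal Python sketch of the bin-packing idea referenced above, using first-fit decreasing to place stream-processing tasks onto hosts by CPU demand; the actual HarmonicIO extension is more involved, and the task/host model here is assumed.

```python
# Minimal sketch (assumed task/host model, not the HarmonicIO extension):
# first-fit decreasing bin packing of stream-processing tasks onto hosts,
# so fewer hosts need to stay powered on.
def first_fit_decreasing(task_demands, host_capacity):
    hosts = []  # each host is a list of assigned task demands
    for demand in sorted(task_demands, reverse=True):
        for host in hosts:
            if sum(host) + demand <= host_capacity:
                host.append(demand)
                break
        else:
            hosts.append([demand])      # open a new host
    return hosts

tasks = [0.6, 0.2, 0.8, 0.3, 0.5, 0.1, 0.7, 0.4]    # CPU cores requested per task
placement = first_fit_decreasing(tasks, host_capacity=1.0)
print(f"{len(placement)} hosts used:", placement)
```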
31

DeMaio, William (William Aloysius). "Data processing and inference methods for zero knowledge nuclear disarmament." Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/106698.

Full text
Abstract:
Thesis: S.B., Massachusetts Institute of Technology, Department of Nuclear Science and Engineering, 2016.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 63-64).
It is hoped that future nuclear arms control treaties will call for the dismantlement of stored nuclear warheads. To make the authenticated decommissioning of nuclear weapons agreeable, methods must be developed to validate the structure and composition of nuclear warheads without it being possible to gain knowledge about these attributes. Nuclear resonance fluorescence (NRF) imaging potentially enables the physically-encrypted verification of nuclear weapons in a manner that would meet treaty requirements. This thesis examines the physics behind NRF, develops tools for processing resonance data, establishes methodologies for simulating information gain during warhead verification, and tests potential inference processes. The influence of several inference parameters is characterized, and success is shown in predicting the properties of an encrypting foil and the thickness of a warhead in a one-dimensional verification scenario.
by William DeMaio.
S.B.
32

Gardener, Michael Edwin. "A multichannel, general-purpose data logger." Thesis, Cape Technikon, 1986. http://hdl.handle.net/20.500.11838/2179.

Full text
Abstract:
Thesis (Diploma (Electrical Engineering))--Cape Technikon, 1986.
This thesis describes the implementation of a general-purpose, microprocessor-based data logger. The hardware allows analog data acquisition from one to thirty-two channels with 12-bit resolution and a data throughput of up to 2 kHz. The data is logged directly to a buffer memory and from there, at the end of each log, it is dumped to an integral cassette data recorder. The recorded data can be transferred from the logger to a desk-top computer, via the IEEE 488 port, for further processing and display. All log parameters are user-selectable by means of menu-prompted keyboard entry, and a real-time clock (RTC) provides date and time information automatically.
33

Nedstrand, Paul, and Razmus Lindgren. "Test Data Post-Processing and Analysis of Link Adaptation." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-121589.

Full text
Abstract:
Analysing the performance of cell phones and other wireless devices connected to mobile networks is key when validating whether the standard of the system is achieved. This justifies having testing tools that can produce a good overview of the data between base stations and cell phones to see the performance of the cell phone. This master thesis involves developing a tool that produces graphs with statistics from the traffic data in the communication link between a connected mobile device and a base station. The statistics will be the correlation between two parameters in the traffic data in the channel (e.g. throughput over the channel condition). The tool is oriented towards analysis of link adaptation, and from the produced graphs the testing personnel at Ericsson will be able to analyse the performance of one or several mobile devices. We performed our own analysis of link adaptation using the tool to show that this type of analysis is possible with it. To show that the tool is useful for Ericsson, we let test personnel answer a survey on its usability and user friendliness.
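A small Python sketch of the kind of post-processing involved, correlating two traffic-data parameters and plotting them; the field names and data here are hypothetical stand-ins, not Ericsson's log format.

```python
# Small sketch (hypothetical parameters and synthetic data): correlating
# throughput against a channel-quality indicator over logged samples and
# saving a scatter plot.
import numpy as np
import matplotlib
matplotlib.use("Agg")            # render to file, no display needed
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
cqi = rng.integers(1, 16, size=500)                       # channel quality indicator
throughput = 2.0 * cqi + rng.normal(0, 3, size=500)       # Mbit/s, synthetic

r = np.corrcoef(cqi, throughput)[0, 1]
plt.scatter(cqi, throughput, s=8)
plt.xlabel("CQI")
plt.ylabel("Throughput [Mbit/s]")
plt.title(f"Throughput vs. channel quality (r = {r:.2f})")
plt.savefig("throughput_vs_cqi.png", dpi=150)
print("correlation coefficient:", round(r, 2))
```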
34

Narayanan, Shruthi (Shruthi P. ). "Real-time processing and visualization of intensive care unit data." Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/119537.

Full text
Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (page 83).
Intensive care unit (ICU) patients undergo detailed monitoring so that copious information regarding their condition is available to support clinical decision-making. Full utilization of the data depends heavily on its quantity, quality and manner of presentation to the physician at the bedside of a patient. In this thesis, we implemented a visualization system to aid ICU clinicians in collecting, processing, and displaying available ICU data. Our goals for the system are: to be able to receive large quantities of patient data from various sources, to compute complex functions over the data that are able to quantify an ICU patient's condition, to plot the data using a clean and interactive interface, and to be capable of live plot updates upon receiving new data. We made significant headway toward our goals, and we succeeded in creating a highly adaptable visualization system that future developers and users will be able to customize.
by Shruthi Narayanan.
M. Eng.
35

Shih, Daphne Yong-Hsu. "A data path for a pixel-parallel image processing system." Thesis, Massachusetts Institute of Technology, 1995. http://hdl.handle.net/1721.1/40570.

Full text
Abstract:
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1996.
Includes bibliographical references (p. 65).
by Daphne Yong-Hsu Shih.
M.Eng.
36

Kardos, Péter. "Performance optimization of the online data processing software of a high-energy physics experiment." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-404475.

Full text
Abstract:
The LHCb experiment probes the differences between matter and anti-matter by examining particle collisions. Like any modern high energy physics experiment, LHCb relies on a complex hardware and software infrastructure to collect and analyze the data generated from particle collisions. To filter out unimportant data before writing it to permanent storage, particle collision events have to be processed in real-time which requires a lot of computing power. This thesis focuses on performance optimizations of several parts of the real-time data processing software: i) one of the particle path reconstruction steps; ii) the particle path refining step; iii) the data structures used by the real-time reconstruction algorithms. The thesis investigates and employs techniques such as vectorization, cache-friendly memory structures, microarchitecture analysis, and memory allocation optimizations. The resulting performance-optimized code uses today's many-core, data-parallel, superscalar processors to their full potential in order to meet the performance demands of the experiment. The thesis results show that the reconstruction step got 3 times faster, the refinement step got 2 times faster and the changes to the data model allowed vectorization of most reconstruction algorithms.
37

Akleman, Ergun. "Pseudo-affine functions : a non-polynomial implicit function family to describe curves and surfaces." Diss., Georgia Institute of Technology, 1992. http://hdl.handle.net/1853/15409.

Full text
38

van, Schaik Sebastiaan Johannes. "A framework for processing correlated probabilistic data." Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:91aa418d-536e-472d-9089-39bef5f62e62.

Full text
Abstract:
The amount of digitally-born data has surged in recent years. In many scenarios, this data is inherently uncertain (or: probabilistic), such as data originating from sensor networks, image and voice recognition, location detection, and automated web data extraction. Probabilistic data requires novel and different approaches to data mining and analysis, which explicitly account for the uncertainty and the correlations therein. This thesis introduces ENFrame: a framework for processing and mining correlated probabilistic data. Using this framework, it is possible to express both traditional and novel algorithms for data analysis in a special user language, without having to explicitly address the uncertainty of the data on which the algorithms operate. The framework will subsequently execute the algorithm on the probabilistic input, and perform exact or approximate parallel probability computation. During the probability computation, correlations and provenance are succinctly encoded using probabilistic events. This thesis contains novel contributions in several directions. An expressive user language – a subset of Python – is introduced, which allows a programmer to implement algorithms for probabilistic data without requiring knowledge of the underlying probabilistic model. Furthermore, an event language is presented, which is used for the probabilistic interpretation of the user program. The event language can succinctly encode arbitrary correlations using events, which are the probabilistic counterparts of deterministic user program variables. These highly interconnected events are stored in an event network, a probabilistic interpretation of the original user program. Multiple techniques for exact and approximate probability computation (with error guarantees) of such event networks are presented, as well as techniques for parallel computation. Adaptations of multiple existing data mining algorithms are shown to work in the framework, and are subsequently subjected to an extensive experimental evaluation. Additionally, a use-case is presented in which a probabilistic adaptation of a clustering algorithm is used to predict faults in energy distribution networks. Lastly, this thesis presents techniques for integrating a number of different probabilistic data formalisms for use in this framework and in other applications.
APA, Harvard, Vancouver, ISO, and other styles
39

Jungner, Andreas. "Ground-Based Synthetic Aperture Radar Data processing for Deformation Measurement." Thesis, KTH, Geodesi och satellitpositionering, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-199677.

Full text
Abstract:
This thesis describes a first hands-on experience working with a Ground-Based Synthetic Aperture Radar (GB-SAR) at the Institute of Geomatics in Castelldefels (Barcelona, Spain), used to exploit the radar interferometry usually employed on spaceborne platforms. We describe the key concepts of a GB-SAR as well as the data processing procedure to obtain deformation measurements. A large part of the thesis work has been devoted to the development of GB-SAR processing tools such as coherence and interferogram generation, automation of the co-registration process, geocoding of GB-SAR data, and the adaptation of existing satellite SAR tools to GB-SAR data. Finally, a series of field campaigns was conducted to test the instrument in different environments, to collect the data necessary to develop the GB-SAR processing tools, and to explore the capabilities and limitations of the instrument.   The key outcome of the field campaigns is that the high coherence necessary to conduct interferometric measurements can be obtained with a long temporal baseline. Several factors that affect the result are discussed, such as the reflectivity of the observed scene, the image co-registration and the illuminating geometry.
This Master's thesis builds on experience of working with a ground-based synthetic aperture radar (GB-SAR) at the Institute of Geomatics in Castelldefels (Barcelona, Spain). The SAR technique enables radar interferometry, which is commonly used on both satellite and airborne platforms. The work describes the technical characteristics of the instrument and the processing of the data to measure deformations. A large part of the work has been devoted to the development of GB-SAR data applications such as coherence and interferogram computation, automation of image co-registration with scripts, geocoding of GB-SAR data, and the adaptation of existing SAR software to GB-SAR data. Finally, field measurements were carried out to collect the data needed for the development of the GB-SAR applications and to gain experience of the instrument's characteristics and limitations.   The main result of the field measurements is that the high coherence required for interferometric measurements can be achieved with a relatively long time between measurement epochs. Several factors that affect the result are discussed, such as the reflectivity of the observed area, the co-registration of the radar images, and the illuminating geometry.
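As a rough illustration of the two interferometric products mentioned in the abstracts above, the Python/NumPy sketch below forms an interferogram and a windowed coherence estimate from two co-registered complex images; the synthetic data, the 5x5 estimation window, and the use of SciPy are assumptions, not the thesis's actual tool chain.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def interferogram(master, slave):
    """Complex interferogram: master times the complex conjugate of the slave."""
    return master * np.conj(slave)

def coherence(master, slave, win=5):
    """Sample coherence estimated over a sliding win x win window (0 = noise, 1 = stable)."""
    cross = master * np.conj(slave)
    num = uniform_filter(cross.real, win) + 1j * uniform_filter(cross.imag, win)
    den = np.sqrt(uniform_filter(np.abs(master) ** 2, win) *
                  uniform_filter(np.abs(slave) ** 2, win))
    return np.abs(num) / np.maximum(den, 1e-12)

# Synthetic co-registered acquisitions: the same scene plus a phase ramp and noise.
rng = np.random.default_rng(0)
master = rng.standard_normal((200, 200)) + 1j * rng.standard_normal((200, 200))
ramp = np.exp(1j * np.linspace(0, np.pi, 200))[None, :]
noise = 0.1 * (rng.standard_normal((200, 200)) + 1j * rng.standard_normal((200, 200)))
slave = master * ramp + noise

phase = np.angle(interferogram(master, slave))   # interferometric phase (radians)
gamma = coherence(master, slave)                 # high values flag reliable pixels
```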
APA, Harvard, Vancouver, ISO, and other styles
40

R, S. Umesh. "Algorithms for processing polarization-rich optical imaging data." Thesis, Indian Institute of Science, 2004. http://hdl.handle.net/2005/96.

Full text
Abstract:
This work mainly focuses on signal processing issues related to continuous-wave, polarization-based direct imaging schemes. Here, we present a mathematical framework to analyze the performance of Polarization Difference Imaging (PDI) and Polarization Modulation Imaging (PMI). We have considered three visualization parameters, namely, the polarization intensity (PI), the Degree of Linear Polarization (DOLP) and the polarization orientation (PO), for comparing these schemes. The first two parameters appear frequently in the literature, possibly under different names. The last parameter, polarization orientation, has been introduced and elaborated in this thesis. We have also proposed some extensions/alternatives for the existing imaging and processing schemes and analyzed their advantages. Theoretically and through Monte-Carlo simulations, we have studied the performance of these schemes under white and coloured noise conditions, concluding that, in general, PMI gives better estimates of all the parameters. Experimental results corroborate our theoretical arguments. PMI is shown to give asymptotically efficient estimates of these parameters, whereas PDI is shown to give biased estimates of the first two and is also shown to be incapable of estimating PO. Moreover, it is shown that PDI is a particular case of PMI. The ability of PDI to yield estimates at lower variances is recognized as its major strength. We have also shown that the three visualization parameters can be fused to form a colour image, giving a holistic view of the scene. We report the advantages of analyzing chunks of data and bootstrapped data under various circumstances. Experiments were conducted to image objects through calibrated scattering media and natural media like mist, with successful results. Scattering media prepared with polystyrene microspheres of diameters 2.97 µm, 0.06 µm and 0.13 µm dispersed in water were used in our experiments. An intensified charge coupled device (CCD) camera was used to capture the images. Results showed that imaging could be performed beyond an optical thickness of 40 for particles with 0.13 µm diameter. For larger particles, the depth to which we could image was much smaller. An experiment using an incoherent source yielded better results than with coherent sources, which we attribute to the speckle noise induced by coherent sources. We have suggested a harmonics-based imaging scheme, which can perhaps be used when we have a mixture of scattering particles. We have also briefly touched upon the possible post-processing that can be performed on the obtained results, and as an example, shown segmentation based on a PO imaging result.
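The following Python sketch shows how the three visualization parameters can be estimated from intensities modulated at several polarizer angles, assuming the standard model I(θ) = ½(S0 + S1 cos 2θ + S2 sin 2θ); the angles, noise level and least-squares fit are illustrative assumptions rather than the thesis's exact estimator.

```python
import numpy as np

def visualization_parameters(intensities, angles):
    """Estimate PI, DOLP and PO from intensities recorded at several polarizer
    angles, assuming I(theta) = 0.5 * (S0 + S1*cos(2*theta) + S2*sin(2*theta))."""
    A = 0.5 * np.column_stack([np.ones_like(angles),
                               np.cos(2 * angles),
                               np.sin(2 * angles)])
    s0, s1, s2 = np.linalg.lstsq(A, intensities, rcond=None)[0]
    pi = np.hypot(s1, s2)           # polarization intensity
    dolp = pi / s0                  # degree of linear polarization
    po = 0.5 * np.arctan2(s2, s1)   # polarization orientation (radians)
    return pi, dolp, po

# PDI, in contrast, differences only two orthogonal measurements, I(0) and I(pi/2);
# that difference estimates S1 alone and carries no information about S2, which is
# consistent with the abstract's statement that PDI cannot estimate PO.
angles = np.deg2rad(np.arange(0, 180, 20, dtype=float))
s0_true, s1_true, s2_true = 1.0, 0.3, 0.2        # assumed parameter values
clean = 0.5 * (s0_true + s1_true * np.cos(2 * angles) + s2_true * np.sin(2 * angles))
noisy = clean + 0.01 * np.random.randn(angles.size)
print(visualization_parameters(noisy, angles))
```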
APA, Harvard, Vancouver, ISO, and other styles
41

Korziuk, Kamil, and Tomasz Podbielski. "Engineering Requirements for platform, integrating health data." Thesis, Blekinge Tekniska Högskola, Institutionen för tillämpad signalbehandling, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-16089.

Full text
Abstract:
In a world where people are increasingly mobile and the population is ageing significantly, new technologies are trying to do their best to meet people's expectations. The results of a survey conducted with elderly participants during a technology conference at Blekinge Institute of Technology showed that none of them had any kind of assistive help at home, although they would need it. This Master's thesis presents human health-state monitoring with a focus on fall detection. Health-care systems will never completely eliminate falls, but studying their causes further can help prevent them. In this thesis, the integration of sensors for measuring vital parameters and human position, together with the evaluation of the measured data, is presented. The work is based on technologies compatible with the Arduino Uno and Arduino Mega microcontrollers, measurement sensors, and data exchange between a database, MATLAB/Simulink and a web page. Sensors integrated into one common system make it possible to examine the patient's health state and call for assistance in case of health decline or a serious risk of injury. System performance was evaluated over many series of measurements. In the first phase, a comparison between different filters was carried out to choose the one with the best performance. Kalman filtering and a trim parameter for the accelerometer were used to obtain satisfactory results and the final human fall detection algorithm. The acquired measurements and data evaluation showed that Kalman filtering achieves high performance and gives the most reliable results. In the second phase, sensor placement was tested. The collected data showed that human falls are correctly recognized by the system with high accuracy. As a result, the designed system allows measuring human health and vital parameters such as temperature, heartbeat, position and activity. Additionally, the system provides an online overview with the current health state, historical data, and an IP camera preview when an alarm is raised due to a poor health condition.
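A minimal sketch of the signal-processing chain described above is given below in Python: a scalar Kalman filter smooths the accelerometer magnitude and a simple free-fall-then-impact rule flags a fall; the noise parameters and thresholds are invented for illustration and are not the thesis's tuned values.

```python
import numpy as np

def kalman_smooth(z, q=0.01, r=0.05):
    """Scalar Kalman filter smoothing a noisy acceleration-magnitude signal (in g).
    q (process noise) and r (measurement noise) are assumed values, not tuned ones."""
    x, p = z[0], 1.0
    out = np.empty_like(z)
    for i, meas in enumerate(z):
        p = p + q                  # predict
        k = p / (p + r)            # Kalman gain
        x = x + k * (meas - x)     # update with the new measurement
        p = (1.0 - k) * p
        out[i] = x
    return out

def detect_fall(acc_magnitude, free_fall=0.4, impact=1.5, window=50):
    """Flag a fall if a free-fall dip is followed by an impact within `window` samples.
    The thresholds are illustrative assumptions."""
    a = kalman_smooth(np.asarray(acc_magnitude, dtype=float))
    dips = np.where(a < free_fall)[0]
    return any(np.any(a[i:i + window] > impact) for i in dips)

# Synthetic trace: quiet standing (1 g), a short free-fall dip, an impact, recovery.
signal = np.concatenate([np.ones(100), np.full(10, 0.1), np.full(5, 3.0), np.ones(50)])
print(detect_fall(signal + 0.05 * np.random.randn(signal.size)))   # expected: True
```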
APA, Harvard, Vancouver, ISO, and other styles
42

Lampi, J. (Jaakko). "Large-scale distributed data management and processing using R, Hadoop and MapReduce." Master's thesis, University of Oulu, 2014. http://urn.fi/URN:NBN:fi:oulu-201406191771.

Full text
Abstract:
The exponential growth of raw, i.e. unstructured, data collected by various methods has forced companies to change their business strategies and operational approaches. The revenue strategies of a growing number of companies are based solely on the information gained from data and its utilization. Managing and processing large-scale data sets, also known as Big Data, requires new methods and techniques, but storing and transporting the ever-growing amount of data also creates new technological challenges. Wireless sensor networks monitor their clients and track their behavior. A client on a wireless sensor network can be anything from a random object to a living being. The Internet of Things binds these clients together, forming a single, massive network. Data is increasingly produced and collected by, for example, research projects, commercial products, and governments for different purposes. This thesis covers the theory of managing large-scale data sets, introduces existing techniques and technologies, and analyzes the situation with respect to the growing amount of data. As an implementation, a Hadoop cluster running R and Matlab is built, and sample data sets collected from different sources are stored and analyzed using the cluster. The data sets include the cellular band of the long-term spectral occupancy measurements from the observatory of IIT (Illinois Institute of Technology) and open weather data from weatherunderground.com. An R software environment running on the master node is used as the main tool for calculations and for controlling the data flow between the different software components. These include Hadoop's HDFS and MapReduce for storage and analysis, as well as a Matlab server for processing sample data and pipelining it to R. The hypothesis is set that the cold weather front and snowfall in the Chicago (IL, US) area should be visible in the cellular band occupancy. As a result of the implementation, thorough step-by-step guides for setting up and managing a Hadoop cluster and using it via an R environment are produced, along with examples and calculations. An analysis of the data sets and a comparison of performance between R and MapReduce are presented and discussed. The results of the analysis correlate to some extent with the weather, but the data set used for the performance comparison should clearly have been larger in order to produce meaningful results through distributed computing.
The tremendous growth in recent years in the amount of raw data, i.e. unstructured data collected by various methods, has driven companies to change their strategies and operating models. The revenue strategies of many new companies are based purely on the information obtained from data and its exploitation. Large amounts of data and so-called Big Data require new methods and applications for both processing and analysis, but the physical storage of large data sets and the transfer of data from databases to users have also created new technological challenges. Wireless sensor networks follow their users, which in principle can be any physical objects or living beings, and monitor and record their behavior. The so-called Internet of Things connects these objects, or things, into one massive network. Data and information are collected at an ever-increasing pace, for example in research projects, for commercial purposes and for ensuring national security. The thesis discusses the theory of managing large amounts of data, presents the use of new and existing techniques and technologies, and analyzes the situation from the perspective of data and information. In the practical part, the construction of a Hadoop cluster and the use of the most common analysis tools are walked through step by step. The available test data is analyzed using the built cluster, the analysis results and the computational performance of the cluster are recorded, and the obtained results are analyzed from the perspective of existing solutions and needs. The data sets used in the work are the cellular band occupancy collected at the IIT (Illinois Institute of Technology) observation station and open weather data from weatherunderground.com. As an analysis result, the occupancy of the cellular band is expected to correlate with the cold and snowy period in the Chicago area in the United States. The outcomes of the work are detailed installation and usage instructions for the Hadoop cluster and the software used, the analysis results of the data sets, and a performance comparison of the analysis using the R software environment and MapReduce. In conclusion, the occupancy of the cellular band can be said to correlate to some extent with the weather conditions. The data set used in the performance measurement was clearly too small for distributed computing to be beneficial.
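As a language-swapped illustration of the MapReduce model the cluster relies on (the thesis itself drives Hadoop from R), the Python sketch below emulates the map → shuffle → reduce pipeline locally to average occupancy samples per hour; the record format and values are assumptions.

```python
from collections import defaultdict
from operator import itemgetter

# Hypothetical records: (timestamp_hour, channel_occupancy) pairs.
records = [("2013-01-15T08", 0.31), ("2013-01-15T08", 0.40),
           ("2013-01-15T09", 0.12), ("2013-01-15T09", 0.18),
           ("2013-01-15T09", 0.09)]

def mapper(record):
    """Emit (key, value) pairs — here: hour -> occupancy sample."""
    hour, occupancy = record
    yield hour, occupancy

def reducer(key, values):
    """Aggregate all values shuffled to the same key — here: mean occupancy."""
    vals = list(values)
    return key, sum(vals) / len(vals)

# Local stand-in for Hadoop's map -> shuffle/sort -> reduce pipeline.
shuffled = defaultdict(list)
for rec in records:
    for k, v in mapper(rec):
        shuffled[k].append(v)

results = [reducer(k, vs) for k, vs in sorted(shuffled.items(), key=itemgetter(0))]
print(results)   # [('2013-01-15T08', 0.355), ('2013-01-15T09', 0.13)]
```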
APA, Harvard, Vancouver, ISO, and other styles
43

Svenblad, Tobias. "An Analysis of Using Blockchains for Processing and Storing Digital Evidence." Thesis, Högskolan Dalarna, Datateknik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:du-27855.

Full text
Abstract:
A review of digital forensics today shows that it can be exposed to threats that jeopardize the integrity of digital evidence. There are several techniques to counter this risk, one of which involves the use of blockchains. Blockchains use an advanced system to keep the data within them persistent and transparent, which makes them a natural candidate for anything integrity-sensitive. Several blockchain techniques and infrastructures are described in this study, based on previous studies and other literature. Interviews and experiments made it possible to compare traditional digital forensic methodologies with blockchains in later chapters. The results showed that blockchains could be the answer to securing the integrity of digital evidence. However, there is still a lot of work to be done before blockchains are ready to be implemented in production systems. The results of the blockchain analysis are presented such that they can be used as an aid to further research, either theoretical or practical, on digital evidence blockchains.
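A minimal Python sketch of the integrity mechanism at the heart of the blockchain argument is shown below: each evidence record is bound to the hash of the previous block, so any later edit is detectable. This is a toy hash chain under assumed record fields, not any specific blockchain platform evaluated in the thesis.

```python
import hashlib
import json
import time

def add_block(chain, evidence_record):
    """Append an evidence record, binding it to the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = {"record": evidence_record, "prev": prev_hash, "ts": time.time()}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    chain.append({**body, "hash": digest})

def verify(chain):
    """Recompute every hash and link; any later edit to a record breaks the chain."""
    for i, block in enumerate(chain):
        body = {k: block[k] for k in ("record", "prev", "ts")}
        if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() != block["hash"]:
            return False
        if i > 0 and block["prev"] != chain[i - 1]["hash"]:
            return False
    return True

chain = []
add_block(chain, {"exhibit": "disk-image-001", "sha256": "<digest of the image>"})
add_block(chain, {"exhibit": "memory-dump-002", "sha256": "<digest of the dump>"})
print(verify(chain))                              # True
chain[0]["record"]["exhibit"] = "tampered"        # simulate an integrity violation
print(verify(chain))                              # False: the tampering is detected
```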
APA, Harvard, Vancouver, ISO, and other styles
44

Xie, Tian, and 謝天. "Development of a XML-based distributed service architecture for product development in enterprise clusters." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2005. http://hub.hku.hk/bib/B30477165.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Huang, Dachuan. "Improving Performance in Data Processing Distributed Systems by Exploiting Data Placement and Partitioning." The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1483312415041341.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Mawji, Afzal. "Achieving Scalable, Exhaustive Network Data Processing by Exploiting Parallelism." Thesis, University of Waterloo, 2004. http://hdl.handle.net/10012/779.

Full text
Abstract:
Telecommunications companies (telcos) and Internet Service Providers (ISPs) monitor the traffic passing through their networks for the purposes of network evaluation and planning for future growth. Most monitoring techniques currently use a form of packet sampling. However, exhaustive monitoring is a preferable solution because it ensures accurate traffic characterization and also allows encoding operations, such as compression and encryption, to be performed. To overcome the very high computational cost of exhaustive monitoring and encoding of data, this thesis suggests exploiting parallelism. By utilizing a parallel cluster in conjunction with load-balancing techniques, a simulation is created to distribute the load across the parallel processors. It is shown that a very scalable system, capable of supporting a fairly high data rate, can potentially be designed and implemented. A complete system is then implemented in the form of a transparent Ethernet bridge, ensuring that it can be deployed into a network without any changes to that network. The system focuses its encoding efforts on obtaining the maximum compression rate and, to that end, utilizes the concept of streams, which separates data packets into individual flows that are correlated and whose redundancy can be removed through compression. Experiments show that compression rates are favourable and confirm good throughput rates and high scalability.
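The stream concept can be sketched briefly: grouping packets by their 5-tuple gives each conversation its own compression context. The Python example below is a hedged illustration with invented packets and zlib as a stand-in compressor, not the thesis's Ethernet-bridge implementation.

```python
import zlib
from collections import defaultdict

# Hypothetical captured packets: (src, dst, sport, dport, proto, payload).
packets = [
    ("10.0.0.1", "10.0.0.2", 40000, 80, "tcp", b"GET /index.html HTTP/1.1\r\n" * 40),
    ("10.0.0.3", "10.0.0.2", 52000, 25, "tcp", b"MAIL FROM:<a@example.org>\r\n" * 40),
    ("10.0.0.1", "10.0.0.2", 40000, 80, "tcp", b"GET /style.css HTTP/1.1\r\n" * 40),
]

# Group packets into streams by their 5-tuple so that correlated, redundant
# data from one conversation shares a single compression context.
streams = defaultdict(bytearray)
for src, dst, sport, dport, proto, payload in packets:
    streams[(src, dst, sport, dport, proto)] += payload

for key, data in streams.items():
    compressed = zlib.compress(bytes(data), level=9)
    print(key, len(data), "->", len(compressed), "bytes")
```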
APA, Harvard, Vancouver, ISO, and other styles
47

Bew, M. D. "Engineering better social outcomes through requirements management & integrated asset data processing." Thesis, University of Salford, 2017. http://usir.salford.ac.uk/42341/.

Full text
Abstract:
The needs of society are no longer serviceable using the traditional methods of infrastructure providers and operators. Urbanisation, pressure on global resources, population growth and migration across borders are placing new demands which traditional methods can no longer adequately serve. The emergence of data and digital technology has enabled new skills and possibilities to emerge, and has set much higher expectations for a younger generation who have only known life in the digital age. The data describing the physical properties of built assets are well understood, and digital methods such as Building Information Modelling are providing historically unprecedented levels of access and quality. The concepts of human perception are not so well understood, with research only documented over the last forty years or so, but the understanding of human needs and the impact of poor infrastructure and services has now been linked to poor perception and social outcomes. This research has developed and instantiated a methodology which uses data from the delivery and operational phases of a built asset and, with the aid of an understanding of the user community's perceptions, creates intelligence that can optimise the asset's performance for the benefit of its users. The instantiation was accomplished by experiment in an educational environment, using the "Test Bench" to gather physical asset data and social perception data and using analytics to implement comparative measurements and double-loop feedback to identify actionable interventions. The scientific contributions of this research are the identification of methods which establish valuable and effective relationships between physical and social data to provide "actionable" interventions for performance improvement, and the instantiation of this discovery through the development and application of the "Test Bench". The major implication has been to develop a testable relationship between social outcomes and physical assets, which with further development could provide a valid challenge to the least-cost build option taken by the vast majority of asset owners, by better understanding the full implications for people's perceptions and social outcomes. The cost of operational staff and resources rapidly outweighs the cost of the assets themselves, and the right environment can either improve or inhibit motivation, productivity, performance and social outcomes.
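As a hedged sketch of the comparative-measurement and double-loop-feedback idea, the Python example below correlates an invented physical condition (CO2 level) with invented perception scores per space and flags spaces for intervention; the metric, thresholds and data are assumptions, not the Test Bench's actual analytics.

```python
from statistics import correlation, mean   # statistics.correlation needs Python 3.10+

# Hypothetical weekly records per space: a measured physical condition (CO2, ppm)
# and occupant perception scores (1 = very poor ... 5 = very good).
physical = {"room_a": [650, 700, 680], "room_b": [1450, 1500, 1600], "room_c": [900, 950, 980]}
perception = {"room_a": [4.2, 4.0, 4.3], "room_b": [2.1, 2.0, 1.8], "room_c": [3.4, 3.2, 3.1]}

# First feedback loop: compare asset data with how the users actually feel.
rooms = sorted(physical)
avg_co2 = [mean(physical[r]) for r in rooms]
avg_score = [mean(perception[r]) for r in rooms]
print("CO2 vs perception correlation:", round(correlation(avg_co2, avg_score), 2))

# Second feedback loop: turn the comparison into actionable interventions.
for r in rooms:
    if mean(physical[r]) > 1000 and mean(perception[r]) < 2.5:
        print(f"intervention suggested for {r}: improve ventilation, then re-survey")
```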
APA, Harvard, Vancouver, ISO, and other styles
48

Prabhakar, Aditya 1978. "A data processing subsystem for the Holo-Chidi video concentrator card." Thesis, Massachusetts Institute of Technology, 2001. http://hdl.handle.net/1721.1/86838.

Full text
Abstract:
Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2001.
Includes bibliographical references (p. 75-76).
by Aditya Prabhakar.
M.Eng.
APA, Harvard, Vancouver, ISO, and other styles
49

O'Sullivan, John J. D. "Teach2Learn : gamifying education to gather training data for natural language processing." Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/117320.

Full text
Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 65-66).
Teach2Learn is a website which crowd-sources the problem of labeling natural text samples using gamified education as an incentive. Students assign labels to text samples from an unlabeled data set, thereby teaching supervised machine learning algorithms how to interpret new samples. In return, students can learn how that algorithm works by unlocking lessons written by researchers. This aligns the incentives of researchers and learners to help both achieve their goals. The application used current best practices in gamification to create a motivating structure around the labeling task. Testing showed that 27.7% of the user base (5/18 users) engaged with the content and labeled enough samples to unlock all of the lessons, suggesting that learning modules are sufficient motivation for the right users. Attempts to grow the platform through paid social media advertising were unsuccessful, likely because users aren't looking for a class when they browse those sites. Unpaid posts on subreddits discussing related topics, where users were more likely to be searching for learning opportunities, were more successful. Future research should seek users through comparable sites and explore how Teach2Learn can be used as an additional learning resource in classrooms.
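A plausible (but assumed, not Teach2Learn's actual) aggregation step is sketched below in Python: crowd-sourced labels are reduced to training targets by majority vote, with an agreement score that could gate low-confidence samples.

```python
from collections import Counter, defaultdict

# Hypothetical (student, sample_id, label) tuples gathered through the site.
submissions = [
    ("alice", 1, "positive"), ("bob", 1, "positive"), ("carol", 1, "negative"),
    ("alice", 2, "negative"), ("bob", 2, "negative"),
]

# Aggregate the crowd-sourced labels per sample by majority vote before using
# them as training targets for a supervised learner.
votes = defaultdict(Counter)
for user, sample_id, label in submissions:
    votes[sample_id][label] += 1

training_labels = {s: c.most_common(1)[0][0] for s, c in votes.items()}
agreement = {s: c.most_common(1)[0][1] / sum(c.values()) for s, c in votes.items()}
print(training_labels)   # {1: 'positive', 2: 'negative'}
print(agreement)         # sample 1 has 2/3 agreement; low-agreement samples could be dropped
```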
by John J.D. O'Sullivan
M. Eng.
APA, Harvard, Vancouver, ISO, and other styles
50

Rowe, Keja S. "Autonomous data processing and behaviors for adaptive and collaborative underwater sensing." Thesis, Massachusetts Institute of Technology, 2012. http://hdl.handle.net/1721.1/77025.

Full text
Abstract:
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 73).
In this thesis, I designed, simulated and developed behaviors for active riverine data collection platforms. The current state of the art in riverine data collection is plagued by several issues, which I identify and address. I completed a real-time test of my behaviors to ensure they worked as designed. Then, in a joint effort between the NATO Undersea Research Center (NURC) and the Massachusetts Institute of Technology (MIT), I assisted with the Shallow Water Autonomous Mine Sensing Initiative (SWAMSI) '11 experiment, which aimed to demonstrate the viability of multi-static sonar tracking techniques for seabed and sub-seabed targets. By detecting the backscattered energy at the monostatic and several bi-static angles simultaneously, the probabilities of both target detection and target classification should be improved. However, due to equipment failure, we were not able to show the benefits of this technique.
by Keja S. Rowe.
M.Eng.
APA, Harvard, Vancouver, ISO, and other styles
