Dissertations / Theses on the topic 'Data processors'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 dissertations / theses for your research on the topic 'Data processors.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.
Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.
Chen, Tien-Fu. "Data prefetching for high-performance processors /." Thesis, Connect to this title online; UW restricted, 1993. http://hdl.handle.net/1773/6871.
García, Almiñana Jordi. "Automatic data distribution for massively parallel processors." Doctoral thesis, Universitat Politècnica de Catalunya, 1997. http://hdl.handle.net/10803/5981.
The selection of an optimal data placement depends on the program structure, the program's data sizes, the compiler capabilities, and some characteristics of the target machine. In addition, there is often a trade-off between minimizing interprocessor data movement and load balancing on processors. Automatic data distribution tools can assist the programmer in the selection of a good data layout strategy. These are usually source-to-source tools which annotate the original program with data distribution directives.
Crucial aspects such as data movement, parallelism, and load balance have to be taken into consideration in a unified way to efficiently solve the data distribution problem.
In this thesis a framework for automatic data distribution is presented, in the context of a parallelizing environment for massively parallel processor (MPP) systems. The applications considered for parallelization are usually regular problems, in which data structures are dense arrays. The data mapping strategy generated is optimal for a given problem size and target MPP architecture, according to our current cost and compilation model.
A single data structure, named the Communication-Parallelism Graph (CPG), which holds symbolic information related to the data movement and parallelism inherent in the whole program, is the core of our approach. This data structure allows the estimation of the data movement and parallelism effects of any data distribution strategy supported by our model. Assuming that some program characteristics have been obtained by profiling and that some specific target machine features have been provided, the symbolic information included in the CPG can be replaced by constant values expressed in seconds, representing data movement time overhead and time saved due to parallelization. The CPG is then used to model a minimal path problem which is solved by a general-purpose linear 0-1 integer programming solver. Linear programming techniques guarantee that the solution provided is optimal and are highly efficient for solving this kind of problem.
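The abstract does not reproduce the CPG formulation itself. Purely as an illustration of the kind of decision such a 0-1 model encodes, the following Python sketch picks one distribution per program phase so that the total of per-phase execution cost plus remapping cost is minimal; the phases, candidate distributions, and costs are invented, and exhaustive search stands in for the linear 0-1 solver used in the thesis.

```python
from itertools import product

# Hypothetical per-phase execution cost (seconds) for each candidate distribution,
# and a remapping cost paid when consecutive phases use different distributions.
phase_cost = [
    {"BLOCK": 1.2, "CYCLIC": 0.9},   # phase 0
    {"BLOCK": 0.7, "CYCLIC": 1.5},   # phase 1
    {"BLOCK": 0.8, "CYCLIC": 0.8},   # phase 2
]
remap_cost = 0.4  # cost of redistributing data between two different layouts

def total_cost(assignment):
    """Execution cost of every phase plus remapping between adjacent phases."""
    cost = sum(phase_cost[i][d] for i, d in enumerate(assignment))
    cost += sum(remap_cost for a, b in zip(assignment, assignment[1:]) if a != b)
    return cost

# Exhaustive search over all 0-1 choices stands in for the ILP solver.
best = min(product(*(costs.keys() for costs in phase_cost)), key=total_cost)
print(best, total_cost(best))
```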
The data mapping capabilities provided by the tool include alignment of the arrays, one- or two-dimensional distribution in BLOCK or CYCLIC fashion, a set of remapping actions to be performed between phases if profitable, plus the associated parallelization strategy.
The effects of control flow statements between phases are taken into account in order to improve the accuracy of the model. The novelty of the approach resides in handling all stages of the data distribution problem, which traditionally have been treated in several independent phases, in a single step, and in providing an optimal solution according to our model.
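For readers unfamiliar with the BLOCK and CYCLIC distributions mentioned above, a minimal sketch of the standard owner mappings (background material, not code from the thesis):

```python
import math

def block_owner(i, n, p):
    """Processor owning element i of an n-element array distributed BLOCK over p processors."""
    return i // math.ceil(n / p)

def cyclic_owner(i, p):
    """Processor owning element i under a CYCLIC (round-robin) distribution over p processors."""
    return i % p

n, p = 16, 4
print([block_owner(i, n, p) for i in range(n)])   # [0,0,0,0, 1,1,1,1, 2,2,2,2, 3,3,3,3]
print([cyclic_owner(i, p) for i in range(n)])     # [0,1,2,3, 0,1,2,3, ...]
```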
Agarwal, Virat. "Algorithm design on multicore processors for massive-data analysis." Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/34839.
Dreibelbis, Harold N., Dennis Kelsch, and Larry James. "REAL-TIME TELEMETRY DATA PROCESSING and LARGE SCALE PROCESSORS." International Foundation for Telemetering, 1991. http://hdl.handle.net/10150/612912.
Real-time data processing of telemetry data has evolved from a highly centralized single large scale computer system to multiple mini-computers or super mini-computers tied together in a loosely coupled distributed network, with each mini-computer or super mini-computer essentially performing a single function in the real-time processing sequence of events. The reasons in the past for this evolution are many and varied. This paper will review some of the more significant factors in that evolution and will present some alternatives to a fully distributed mini-computer network that appear to offer significant real-time data processing advantages.
Bartlett, Viv A. "Exploiting data dependencies in low power asynchronous VLSI signal processors." Thesis, University of Westminster, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.252037.
Full textRevuelta, Fernández Borja. "Study of Scalable Architectures on FPGA for Space Data Processors." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-254909.
Satellites and spacecraft are complex systems developed by interdisciplinary design teams. Within the on-board systems, the on-board computer is responsible for processing and managing scientific data from the payloads, and it requires radiation-tolerant processors with high performance. Over the last decades, the demand for high-performance on-board data processing has grown steadily due to new mission requirements, such as increased flexibility, shorter development times, and new applications. For the same reasons, the demand for radiation-tolerant processor components with even higher performance has grown at a similar pace. This master's thesis proposes an architecture motivated by ESA (European Space Agency) supported activities on networks-on-chip (NoC) and digital signal processors (DSPs) with floating-point support. The proposed architecture is intended to be suitable for investigating the scaling of complex system-on-chip (SoC) designs with signal processors by means of FPGA technology. Finally, by using several dedicated signal-processing cores, a LEON3 processor for control, and several components from the GRLIB library, a potential user is given the possibility to run software tests on a multi-core embedded processor system. The architecture has been designed with a quad-core signal-processing system, which is considered to provide high performance.
Mullins, Robert D. "Dynamic instruction scheduling and data forwarding in asynchronous superscalar processors." Thesis, University of Edinburgh, 2001. http://hdl.handle.net/1842/12701.
Åström, Fransson Donny. "Utilizing Multicore Processors with Streamed Data Parallel Applications for Mobile Platforms." Thesis, KTH, Elektronik- och datorsystem, ECS, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-125822.
Full textDuric, Milovan. "Specialization and reconfiguration of lightweight mobile processors for data-parallel applications." Doctoral thesis, Universitat Politècnica de Catalunya, 2016. http://hdl.handle.net/10803/386568.
The worldwide use of mobile devices makes the low-power mobile processor segment the leader of the computing industry. Customers demand low-cost, high-performance, low-power mobile devices that run sophisticated mobile applications such as multimedia and 3D games. The most advanced mobile devices use chip multiprocessors (CMPs) with dedicated accelerators that exploit data-level parallelism (DLP) in these applications. Such heterogeneous system design allows mobile processors to deliver the desired performance and efficiency. Heterogeneity, however, increases the complexity and manufacturing cost of the processors by adding extra special-purpose hardware to implement the accelerators. This thesis proposes new hardware techniques that exploit the resources available in a mobile CMP to achieve low-cost acceleration of DLP applications. Our techniques are inspired by classic vector processors and by recent reconfigurable architectures, since both achieve high power efficiency when executing DLP workloads. But the large amount of additional resources that these two architectures require limits their applicability beyond high-performance computers. To bring their advantages to mobile devices, this thesis proposes techniques that: 1) specialize lightweight mobile cores for classic vector execution of DLP workloads; 2) dynamically adjust the number of specialized execution cores; and 3) reconfigure, as a block, the existing general-purpose execution resources into a hardware compute accelerator. Specialization allows one or more cores to process configurable amounts of long vector operands with new vector instructions. Reconfiguration goes a step further and allows the compute hardware in the mobile cores to dynamically execute the full functionality of diverse computing algorithms. The proposed specialization and reconfiguration techniques are applicable to the various general-purpose processors available in today's mobile devices. However, in this thesis we have chosen to implement and evaluate them on a lightweight processor based on the Explicit Data Graph Execution architecture, which we find promising for low-power processor research. The applied techniques improve the performance of the mobile processor and the energy efficiency of its existing general-purpose resources. With the specialization/reconfiguration techniques enabled, the processor exploits DLP efficiently without the extra cost of special-purpose accelerators.
Picciau, Andrea. "Concurrency and data locality for sparse linear algebra on modern processors." Thesis, Imperial College London, 2017. http://hdl.handle.net/10044/1/58884.
Erici, Michael. "A processor in control : a study of whether processors face increased liability under the General Data Protection Regulation." Thesis, Stockholms universitet, Juridiska institutionen, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-142776.
Full textYu, Jason Kwok Kwun. "Vector processing as a soft-core processor accelerator." Thesis, University of British Columbia, 2008. http://hdl.handle.net/2429/2394.
Davis, Edward L., and William E. Grahame. "HELICOPTER FLIGHT TESTING and REAL TIME ANALYSIS with DATA FLOW ARRAY PROCESSORS." International Foundation for Telemetering, 1986. http://hdl.handle.net/10150/615414.
When flight testing helicopters, it is essential to process and analyze many parameters spontaneously and accurately for instantaneous feedback in order to make spot decisions on the safety and integrity of the aircraft. As various maneuvers stress the airframe or load oscillatory components, the absolute limits as well as interrelated limits including average and cumulative cycle loading must be continuously monitored. This paper presents a complete acquisition and analysis system (LDF/ADS) that contains modularly expandable array processors which provide real time acquisition, processing and analysis of multiple concurrent data streams and parameters. Simple limits checking and engineering units conversions are performed as well as more complex spectrum analyses, correlations and other high level interprocessing interactively with the operator. An example configuration is presented herein which illustrates how the system interacts with the operator during an actual flight test. The processed and derived parameters are discussed and the part they play in decision making is demonstrated. The LDF/ADS system may perform vibration analyses on many structural components during flight. Potential problems may also be isolated and reported during flight. Signatures or frequency domain representations of past problems or failures may be stored in nonvolatile memory and the LDF/ADS system will perform real time convolutions to determine the degrees of correlation of a present problem with all known past problems and reply instantly. This real time fault isolation is an indispensable tool for potential savings in lives and aircraft as well as eliminating unnecessary down time.
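The paper's signature-matching hardware is not described in detail in this abstract. As a loose illustration of the idea of scoring current flight data against stored signatures of past problems, the sketch below computes a normalized correlation between a measured vibration spectrum and each entry of a hypothetical signature library; all data and names are invented.

```python
import numpy as np

def correlation_score(measured, signature):
    """Normalized correlation between a measured spectrum and a stored failure signature."""
    m = (measured - measured.mean()) / (measured.std() + 1e-12)
    s = (signature - signature.mean()) / (signature.std() + 1e-12)
    return float(np.dot(m, s) / len(m))

# Hypothetical 64-bin vibration spectra: stored signatures of known past problems.
rng = np.random.default_rng(0)
signatures = {"gearbox_wear": rng.random(64), "rotor_imbalance": rng.random(64)}
measured = signatures["rotor_imbalance"] + 0.05 * rng.random(64)  # current flight data

scores = {name: correlation_score(measured, sig) for name, sig in signatures.items()}
print(max(scores, key=scores.get), scores)  # best-matching known problem
```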
Swenson, Kim Christian. "Exploiting network processors for low latency, high throughput, rate-based sensor update delivery." Pullman, Wash. : Washington State University, 2009. http://www.dissertations.wsu.edu/Thesis/Fall2009/k_swenson_121109.pdf.
Title from PDF title page (viewed on Feb. 9, 2010). "School of Electrical Engineering and Computer Science." Includes bibliographical references (p. 92-94).
Dutt, Nikil D., Hiroaki Takada, and Hiroyuki Tomiyama. "Memory Data Organization for Low-Energy Address Buses." Institute of Electronics, Information and Communication Engineers, 2004. http://hdl.handle.net/2237/15042.
Lindener, Tobias. "Enabling Arbitrary Memory Constraint Standing Queries on Distributed Stream Processors using Approximate Algorithms." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-237459.
Relational algebra and SQL have been the standard in analytics for decades. In web-scale distributed streaming environments, however, even simpler analytical queries can prove challenging. Two examples are the queries "count" and "count distinct". Since these queries require all keys (the values that identify an element) to be kept, they traditionally lead to a continuous growth in memory requirements. With approximation methods that use a fixed-size memory layout, such queries become feasible and potentially more resource-efficient in streaming systems. This work (1) demonstrates the benefits and limits of approximate queries on distributed stream processors. Furthermore, (2) the resource efficiency and (3) the difficulties of the approximation methods are presented, and (4) optimizations with respect to different datasets are reported. The prototype is implemented with the Yahoo DataSketches library on Apache Flink. Based on the evaluation results and the experience with the prototype, possible improvements such as deeper integration into streaming frameworks are presented. The analysis shows that combining approximate algorithms with distributed stream processing yields promising results, depending on the dataset and the requested accuracy.
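The thesis builds on the Yahoo DataSketches library running on Apache Flink; that integration is not reproduced here. As a toy illustration of why a fixed-memory sketch can answer "count distinct" over an unbounded key stream, the following is a minimal HyperLogLog-style estimator; the register count, hash choice, and accuracy shown are illustrative assumptions, not the DataSketches implementation.

```python
import hashlib

class ApproxDistinctCounter:
    """Fixed-memory estimate of the number of distinct keys in a stream (HyperLogLog-style)."""

    def __init__(self, num_registers=256):
        self.registers = [0] * num_registers  # memory stays constant however long the stream is

    def add(self, key):
        h = int(hashlib.sha1(str(key).encode()).hexdigest(), 16)
        idx = h % len(self.registers)          # which register this key updates
        rest = h // len(self.registers)
        rank = (rest & -rest).bit_length()     # 1 + number of trailing zero bits (0 if rest == 0)
        self.registers[idx] = max(self.registers[idx], rank)

    def estimate(self):
        m = len(self.registers)
        z = sum(2.0 ** -r for r in self.registers)
        return 0.7213 / (1 + 1.079 / m) * m * m / z  # raw HyperLogLog estimate, no corrections

counter = ApproxDistinctCounter()
for i in range(100_000):
    counter.add(i % 30_000)        # a stream with 30,000 distinct keys
print(round(counter.estimate()))   # roughly 30,000, using only 256 small registers
```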
Pflieger, Mark Eugene. "A theory of optimal event-related brain signal processors applied to omitted stimulus data /." The Ohio State University, 1991. http://rave.ohiolink.edu/etdc/view?acc_num=osu1487757723995598.
Zhuang, Xiaotong. "Compiler Optimizations for Multithreaded Multicore Network Processors." Diss., Georgia Institute of Technology, 2006. http://hdl.handle.net/1853/11566.
Full textSilva, João Paulo Sá da. "Data processing in Zynq APSoC." Master's thesis, Universidade de Aveiro, 2014. http://hdl.handle.net/10773/14703.
Field-Programmable Gate Arrays (FPGAs) were invented by Xilinx in 1985, i.e. less than 30 years ago. The influence of FPGAs on many directions in engineering is growing continuously and rapidly. There are many reasons for such progress and the most important are the inherent reconfigurability of FPGAs and relatively cheap development cost. Recent field-configurable micro-chips combine the capabilities of software and hardware by incorporating multi-core processors and reconfigurable logic enabling the development of highly optimized computational systems for a vast variety of practical applications, including high-performance computing, data, signal and image processing, embedded systems, and many others. In this context, the main goals of the thesis are to study the new micro-chips, namely the Zynq-7000 family and to apply them to two selected case studies: data sort and Hamming weight calculation for long vectors.
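One of the two case studies is Hamming weight calculation for long vectors. For reference only, a plain software version of that computation (the thesis implements it in reconfigurable logic on the Zynq, not in Python) looks like this:

```python
def hamming_weight(vector_bytes: bytes) -> int:
    """Number of set bits (population count) in a long bit vector given as bytes."""
    return sum(bin(b).count("1") for b in vector_bytes)

# Example: a 1 KiB vector with a known number of set bits.
vector = bytes([0b10110010] * 1024)
print(hamming_weight(vector))  # 4 set bits per byte * 1024 bytes = 4096
```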
BenDor, Jonathan, and J. D. Baker. "Processing Real-Time Telemetry with Multiple Embedded Processors." International Foundation for Telemetering, 1994. http://hdl.handle.net/10150/611671.
This paper describes a system in which multiple embedded processors are used for real-time processing of telemetry streams from satellites and radars. Embedded EPC-5 modules are plugged into VME slots in a Loral System 550. Telemetry streams are acquired and decommutated by the System 550, and selected parameters are packetized and appended to a mailbox which resides in VME memory. A Windows-based program continuously fetches packets from the mailbox, processes the data, writes to log files, displays processing results on screen, and sends messages via a modem connected to a serial port.
Baumstark, Lewis Benton Jr. "Extracting Data-Level Parallelism from Sequential Programs for SIMD Execution." Diss., Georgia Institute of Technology, 2004. http://hdl.handle.net/1853/4823.
Full textTidball, John E. "REAL-TIME HIGH SPEED DATA COLLECTION SYSTEM WITH ADVANCED DATA LINKS." International Foundation for Telemetering, 1997. http://hdl.handle.net/10150/609754.
The purpose of this paper is to describe the development of a very high-speed instrumentation and digital data recording system. The system converts multiple asynchronous analog signals to digital data, forms the data into packets, transmits the packets across fiber-optic lines and routes the data packets to destinations such as high speed recorders, hard disks, Ethernet, and data processing. This system is capable of collecting approximately one hundred megabytes per second of filtered packetized data. The significant system features are its design methodology, system configuration, decoupled interfaces, data as packets, the use of RACEway data and VME control buses, distributed processing on mixed-vendor PowerPCs, real-time resource management objects, and an extendible and flexible configuration.
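The paper's packet format is not given in the abstract. As a rough sketch of the packetization step it describes, converting a burst of digitized samples into a self-describing packet might look like the following; the header layout and field names are hypothetical, not the system's actual format.

```python
import struct
import time

HEADER = struct.Struct(">HHId")  # channel id, sample count, sequence number, timestamp

def make_packet(channel_id, seq, samples):
    """Frame a list of 16-bit samples with a minimal header for routing and recording."""
    header = HEADER.pack(channel_id, len(samples), seq, time.time())
    payload = struct.pack(f">{len(samples)}h", *samples)
    return header + payload

packet = make_packet(channel_id=3, seq=42, samples=[100, -250, 3070, 12])
print(len(packet), "bytes")
```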
Lee, Yu-Heng George. "DYNAMIC KERNEL FUNCTION FOR HIGH-SPEED REAL-TIME FAST FOURIER TRANSFORM PROCESSORS." Wright State University / OhioLINK, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=wright1260821902.
Johnson, Carl E. "AN APPLICATION OF ETHERNET TECHNOLOGY AND PC TELEMETRY DATA PROCESSORS IN REAL-TIME RANGE SAFETY DECISION MAKING." International Foundation for Telemetering, 1992. http://hdl.handle.net/10150/608910.
Ethernet technology has vastly improved the capability to make real-time decisions during the flight of a vehicle. This asset, combined with a PC telemetry data processor and the power of a high-resolution graphics workstation, allows the decision makers to have a highly reliable graphical display of information on which to make vehicle-related safety decisions in real time.
Bierman, Cathy. "Revision and writing quality of seventh graders composing with and without word processors." Diss., Virginia Polytechnic Institute and State University, 1988. http://hdl.handle.net/10919/53912.
Ed. D.
Ungureanu, George. "Automatic Software Synthesis from High-Level ForSyDe Models Targeting Massively Parallel Processors." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-127832.
Majd, Farjam. "Two new parallel processors for real time classification of 3-D moving objects and quad tree generation." PDXScholar, 1985. https://pdxscholar.library.pdx.edu/open_access_etds/3421.
Full textSu, Chun-Yi. "Energy-aware Thread and Data Management in Heterogeneous Multi-Core, Multi-Memory Systems." Diss., Virginia Tech, 2015. http://hdl.handle.net/10919/51255.
Ph. D.
Guney, Murat Efe. "High-performance direct solution of finite element problems on multi-core processors." Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/34662.
Weston, Mindy. "The Right to Be Forgotten: Analyzing Conflicts Between Free Expression and Privacy Rights." BYU ScholarsArchive, 2017. https://scholarsarchive.byu.edu/etd/6453.
Full textHuo, Jiale. "On testing concurrent systems through contexts of queues." Thesis, McGill University, 2006. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=102987.
Hayes, Timothy. "Novel vector architectures for data management." Doctoral thesis, Universitat Politècnica de Catalunya, 2015. http://hdl.handle.net/10803/397645.
Full textEl crecimiento exponencial de la ratio de creación de datos anual conlleva asociada una demanda para gestionar, consultar y resumir cantidades enormes de información rápidamente. En el pasado, se confiaba en el escalado de la frecuencia de los procesadores para incrementar el rendimiento. Hoy en día los incrementos de rendimiento deben conseguirse mediante la explotación de paralelismo. Las arquitecturas vectoriales ofrecen una manera muy eficiente y escalable de explotar el paralelismo a nivel de datos (DLP, por sus siglas en inglés) a través de sofisticados conjuntos de instrucciones "Single Instruction-Multiple Data" (SIMD). Tradicionalmente, las máquinas vectoriales se usaban para acelerar aplicaciones científicas y no de negocios. En esta tesis diseñamos extensiones vectoriales innovadoras para una microarquitectura superescalar moderna, optimizadas para tareas de gestión de datos. Basándonos en un extenso análisis de estas aplicaciones, también proponemos nuevos algoritmos, instrucciones novedosas y optimizaciones en la microarquitectura. Primero, caracterizamos un sistema comercial de soporte de decisiones. Encontramos que el operador "hash join" es responsable de una porción significativa del tiempo. Basándonos en nuestra caracterización, desarrollamos extensiones vectoriales ligeras para datos enteros, con el objetivo de capturar el paralelismo en este operandos. Entonces implementos y evaluamos estas extensiones usando un simulador especialmente adaptado por nosotros, basado en PTLsim y DRAMSim2. Descubrimos que relajar el modelo de memoria de la arquitectura base es altamente beneficioso, permitiendo ejecutar instrucciones vectoriales de memoria indexadas, fuera de orden, sin necesitar hardware asociativo complejo. Encontramos que nuestra implementación vectorial consigue buenos incrementos de rendimiento. Seguimos con la realización de un estudio detallado de algoritmos de ordenación SIMD. Usando nuestra infraestructura de simulación, evaluamos los puntos fuertes y débiles así como la escalabilidad de tres algoritmos vectorizados de ordenación diferentes quicksort, bitonic mergesort y radix sort. A partir de este análisis, proponemos "VSR sort" un nuevo algoritmo de ordenación vectorizado, basado en radix sort pero sin sus limitaciones. Sin embargo, VSR sort no puede ser implementado directamente con instrucciones vectoriales típicas, debido a la irregularidad de su DLP. Para facilitar la implementación de este algoritmo, definimos dos nuevas instrucciones vectoriales y proponemos una estructura hardware correspondiente. VSR sort consigue un rendimiento significativamente más alto que los otros algoritmos. A continuación, proponemos y evaluamos cinco maneras diferentes de vectorizar agregaciones de datos "GROUP BY". Encontramos que, aunque los algoritmos de agregación de datos tienen DLP abundante, frecuentemente este es demasiado irregular para ser expresado eficientemente usando instrucciones vectoriales típicas. Mediante la extensión del hardware usado para VSR sort, proponemos un conjunto de instrucciones vectoriales y algoritmos para capturar mejor este DLP irregular. Finalmente, evaluamos el área, energía y potencia de estas extensiones usando McPAT. Nuestros resultados muestran que las extensiones vectoriales propuestas conllevan un aumento modesto del área del procesador, incluso cuando se utiliza una longitud vectorial larga con varias líneas de ejecución vectorial paralelas. 
Taking the sorting algorithms as a case study, we find that all of the vectorized algorithms consume much less energy than a scalar implementation. In particular, our new VSR sort algorithm requires an order of magnitude less energy than the scalar baseline. Regarding power dissipation, we find that our vector extensions present a very reasonable increase.
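VSR sort relies on two new vector instructions and dedicated hardware, so it cannot be reproduced from the abstract alone. For orientation, the scalar least-significant-digit radix sort that it takes as its starting point can be sketched as follows (a textbook version, not the thesis algorithm):

```python
def lsd_radix_sort(keys, bits_per_pass=8, key_bits=32):
    """Least-significant-digit radix sort of non-negative integers."""
    mask = (1 << bits_per_pass) - 1
    for shift in range(0, key_bits, bits_per_pass):
        buckets = [[] for _ in range(mask + 1)]
        for k in keys:
            buckets[(k >> shift) & mask].append(k)        # scatter by current digit
        keys = [k for bucket in buckets for k in bucket]  # stable gather
    return keys

print(lsd_radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```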
Low, Douglas Wai Kok. "Network processor memory hierarchy designs for IP packet classification /." Thesis, Connect to this title online; UW restricted, 2005. http://hdl.handle.net/1773/6973.
Bousselham, Abdel Kader. "FPGA based data acquisition and digital pulse processing for PET and SPECT." Doctoral thesis, Stockholm University, Department of Physics, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-6618.
Full textThe most important aspects of nuclear medicine imaging systems such as Positron Emission Tomography (PET) or Single Photon Emission Computed Tomography (SPECT) are the spatial resolution and the sensitivity (detector efficiency in combination with the geometric efficiency). Considerable efforts have been spent during the last two decades in improving the resolution and the efficiency by developing new detectors. Our proposed improvement technique is focused on the readout and electronics. Instead of using traditional pulse height analysis techniques we propose using free running digital sampling by replacing the analog readout and acquisition electronics with fully digital programmable systems.
This thesis describes a fully digital data acquisition system for KS/SU SPECT, new algorithms for high resolution timing for PET, and modular FPGA based decentralized data acquisition system with optimal timing and energy. The necessary signal processing algorithms for energy assessment and high resolution timing are developed and evaluated. The implementation of the algorithms in field programmable gate arrays (FPGAs) and digital signal processors (DSP) is also covered. Finally, modular decentralized digital data acquisition systems based on FPGAs and Ethernet are described.
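The exact energy and timing algorithms are developed in the thesis itself. As a generic baseline for what pulse processing on free-running samples involves, the sketch below integrates a digitized pulse for an energy estimate and interpolates a threshold crossing for a sub-sample time stamp; it is an illustration, not the algorithms evaluated in the thesis.

```python
def pulse_energy(samples, baseline):
    """Energy estimate: baseline-subtracted integral of the sampled pulse."""
    return sum(s - baseline for s in samples)

def threshold_crossing_time(samples, threshold):
    """Sub-sample timing by linear interpolation at the first threshold crossing."""
    for i in range(1, len(samples)):
        if samples[i - 1] < threshold <= samples[i]:
            frac = (threshold - samples[i - 1]) / (samples[i] - samples[i - 1])
            return (i - 1) + frac  # time in sampling periods
    return None

pulse = [0, 1, 2, 10, 40, 80, 60, 30, 12, 4, 1, 0]  # made-up digitized pulse
print(pulse_energy(pulse, baseline=0.5), threshold_crossing_time(pulse, threshold=20))
```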
Kim, Jongmyon. "Architectural Enhancements for Color Image and Video Processing on Embedded Systems." Diss., Georgia Institute of Technology, 2005. http://hdl.handle.net/1853/6948.
El, Moussawi Ali Hassan. "SIMD-aware word length optimization for floating-point to fixed-point conversion targeting embedded processors." Thesis, Rennes 1, 2016. http://www.theses.fr/2016REN1S150/document.
In order to cut down their cost and/or their power consumption, many embedded processors do not provide hardware support for floating-point arithmetic. However, applications in many domains, such as signal processing, are generally specified using floating-point arithmetic for the sake of simplicity. Porting these applications to such embedded processors requires a software emulation of floating-point arithmetic, which can greatly degrade performance. To avoid this, the application is converted to use fixed-point arithmetic instead. Floating-point to fixed-point conversion involves a subtle trade-off between performance and precision; it enables the use of narrower data word lengths at the cost of degrading the computation accuracy. Besides, most embedded processors provide support for SIMD (Single Instruction Multiple Data) as a means to improve performance. This allows the execution of one operation on multiple data in parallel, thus ultimately reducing the execution time. However, the application usually has to be transformed in order to take advantage of the SIMD instruction set. This transformation, known as simdization, is affected by the data word lengths; narrower word lengths enable a higher SIMD parallelism rate. Hence the trade-off between precision and simdization. Much existing work has aimed at providing or improving methodologies for automatic floating-point to fixed-point conversion on the one hand, and simdization on the other. In the state of the art, both transformations are considered separately even though they are strongly related. In this context, we study the interactions between these transformations in order to better exploit the performance/accuracy trade-off. First, we propose an improved SLP (Superword Level Parallelism) extraction algorithm (a simdization technique). Then, we propose a new methodology to jointly perform floating-point to fixed-point conversion and SLP extraction. Finally, we implement this work as a fully automated source-to-source compiler flow. Experimental results, targeting four different embedded processors, show the validity of our approach in efficiently exploiting the performance/accuracy trade-off compared to a typical approach which considers both transformations independently.
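As a minimal illustration of the precision side of floating-point to fixed-point conversion (independent of the word-length optimization and SLP extraction proposed in the thesis), a value can be quantized to a signed fixed-point format with a chosen number of fractional bits; narrower fractional parts lose more precision:

```python
def to_fixed(x, frac_bits, total_bits=16):
    """Quantize a float to a signed fixed-point integer with saturation."""
    scaled = round(x * (1 << frac_bits))
    lo, hi = -(1 << (total_bits - 1)), (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, scaled))

def to_float(q, frac_bits):
    """Interpret a fixed-point integer back as a float."""
    return q / (1 << frac_bits)

x = 3.14159
for frac_bits in (4, 8, 12):
    q = to_fixed(x, frac_bits)
    print(frac_bits, q, to_float(q, frac_bits))  # fewer fractional bits, larger error
```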
Chai, Sek Meng. "Real time image processing on parallel arrays for gigascale integration." Diss., Georgia Institute of Technology, 1999. http://hdl.handle.net/1853/15513.
Stanić, Milan. "Design of energy-efficient vector units for in-order cores." Doctoral thesis, Universitat Politècnica de Catalunya, 2017. http://hdl.handle.net/10803/405647.
Full textEn los últimos 15 años, la potencia disipada y el consumo de energía se han convertido en elementos cruciales del diseño de la práctica totalidad de sistemas de computación. El escalado del tamaño de los transistores conlleva densidades de potencia más altas y, en consecuencia, sistemas de refrigeración más complejos y costosos. Mientras que la potencia disipada es crítica para sistemas de alto rendimiento, como por ejemplo centros de datos, debido a su uso de gran potencia, para sistemas móviles la duración de la batería es la preocupación principal. Para el mercado de procesadores móviles de prestaciones más modestas, los límites permitidos para la potencia, energía y área del chip son significativamente más bajas que para los servidores, ordenadores de sobremesa, portátiles o móviles de gama alta. El objetivo final en sistemas de gama baja es igualmente el de incrementar el rendimiento, pero sólo si el "presupuesto" para energía o área no se ve comprometido. Tradicionalmente, las arquitecturas vectoriales han sido usadas en el ámbito de la supercomputación, con diversas implementaciones exitosas. La eficiencia energética y el alto rendimiento de los procesadores vectoriales, así como que se puedan aplicar a ámbitos emergentes, motivan a continuar la investigación en arquitecturas vectoriales. No obstante, añadir soporte paravectores basado en diseños convencionales conlleva incrementos de potencia y área que no son aceptables para procesadores móviles de gama baja. Además, no existen herramientas apropiadas para realizar esta investigación. En esta tesis, proponemos un diseño integrado vectorial-escalar para arquitecturas ARM de bajo consumo, que principalmente reutiliza el hardware escalar ya presente en el procesador para implementar el soporte de ejecución de instrucciones vectoriales. El elemento clave del diseño es nuestro modelo de ejecución por bloques propuesto en la tesis, que agrupa instrucciones de cómputo vectorial para ejecutarlas de manera coordinada. Complementamos esto con un diseño integrado avanzado que implementa tres ideas para incrementar el rendimiento eficientemente en cuanto a la energía consumida: (1) encadenamiento (chaining) desde la jerarquía de memoria, (2) reenvío (forwarding) directo de los resultados, y (3) instrucciones de memoria "shape", con patrones de acceso complejos. Además, esta tesis presenta dos herramientas para medir y analizar lo apropiado de usar microarquitecturas vectoriales para una aplicación. La primera herramienta es VALib, una biblioteca que permite la vectorización manual de aplicaciones, cuyo propósito principal es el de recolectar datos para una caracterización detallada a nivel de instrucción, así como el de generar trazas para la segunda herramienta, SimpleVector. SimpleVector es un simulador rápido basado en trazas que estima el tiempo de ejecución de una aplicación vectorial en la microarquitectura vectorial candidata. Finalmente, la tesis también evalúa las características del procesador Knight's Corner, con unidades SIMD en orden sencillas. Lo aprendido en estos análisis se ha aplicado en el diseño integrado.
McKenzie, Donald John. "An investigation of the effects which using the word processor has on the writing of standard six pupils." Thesis, Rhodes University, 1994. http://hdl.handle.net/10962/d1003531.
Colombet, Quentin. "Decoupled (SSA-based) register allocators : from theory to practice, coping with just-in-time compilation and embedded processors constraints." Phd thesis, Ecole normale supérieure de lyon - ENS LYON, 2012. http://tel.archives-ouvertes.fr/tel-00764405.
Full textMeyer, Andreas, Sergey Smirnov, and Mathias Weske. "Data in business processes." Universität Potsdam, 2011. http://opus.kobv.de/ubp/volltexte/2011/5304/.
Processes and data are equally important for business process management. Process data are particularly relevant in the context of business process automation, process controlling, and the representation of an organization's assets. Many process modeling languages exist, each of which enables the representation of data through a fixed set of modeling constructs. However, these representations, and hence the degree of data modeling, differ considerably. This report evaluates several process modeling languages with respect to their support for data modeling. As a common basis, we develop a framework that systematically organizes process- and data-related aspects, with the criteria focusing on the data-related aspects. After introducing the framework, we compare twelve process modeling languages against it. We generalize the findings of the comparison and identify clusters regarding the degree of data modeling into which the individual languages are grouped.
Bharadwaj, V. "Distributed Computation With Communication Delays: Design And Analysis Of Load Distribution Strategies." Thesis, Indian Institute of Science, 1994. http://hdl.handle.net/2005/161.
Henriksson, Tomas. "Intra-packet data-flow protocol processor /." Linköping : Univ, 2003. http://www.bibl.liu.se/liupubl/disp/disp2003/tek813s.pdf.
Full textBai, Shuanghua. "Data reconciliation for dynamic processes." Thesis, University of Ottawa (Canada), 2006. http://hdl.handle.net/10393/29279.
Jasovský, Filip. "Realizace superpočítače pomocí grafické karty." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2014. http://www.nusl.cz/ntk/nusl-220617.
Full textLari, Kamran A. "Sparse data estimation for knowledge processes." Thesis, McGill University, 2004. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=86073.
Full textProcess monitoring is one of the major components for any process management system. There have been efforts to design process control and monitoring systems; however, no integrated system has yet been developed as a "generic intelligent system shell". In this dissertation, an architecture for an integrated process monitoring system (IPMS) is developed, whereby the end-to-end activities of a process can be automatically measured and evaluated. In order to achieve this goal, various components of the IPMS and the interrelationship among these components are designed.
Furthermore, a comprehensive study on the available methodologies and techniques revealed that sparse data estimation (SDE) is the key component of the IPMS which does not yet exist. Consequently, a series of algorithms and methodologies are developed as the basis for the sparse data estimation of knowledge based processes. Finally, a series of computer programs demonstrate the feasibility and functionality of the proposed approach when applied to a sample process. The sparse data estimation method is successful for not only knowledge based processes, but also for any process, and indeed for any set of activities that can be modeled as a network.
Lozano, Albalate Maria Teresa. "Data Reduction Techniques in Classification Processes." Doctoral thesis, Universitat Jaume I, 2007. http://hdl.handle.net/10803/10479.
Full textThe aim of any condensing technique is to obtain a reduced training set in order to spend as few time as possible in classification. All that without a significant loss in classification accuracy. Some
new approaches to training set size reduction based on prototypes are presented. These schemes basically consist of defining a small number of prototypes that represent all the original instances. That includes those approaches that select among the already existing examples (selective condensing algorithms), and those which generate new representatives (adaptive condensing algorithms).
These new reduction techniques are experimentally compared to some traditional ones, for data represented in feature spaces. To test them, the classical 1-NN rule is applied here. However, other (fast) classifiers have also been considered, such as linear and quadratic classifiers constructed in dissimilarity spaces based on prototypes, in order to see how the editing and condensing concepts work for this different family of classifiers.
Although the goal of the algorithms proposed in this thesis is to obtain a strongly reduced set of representatives, the performance is empirically evaluated over eleven real data sets by comparing not only the reduction rate but also the classification accuracy with those of other condensing techniques. Therefore, the ultimate aim is not only to find a strongly reduced set, but also a balanced one.
Several ways to solve the same problem can be found. When using a distance-based rule as a classifier, reducing the training set is not the only option: a different family of approaches consists of applying efficient search methods. Therefore, the results obtained by the algorithms presented here are also compared, in terms of classification accuracy and time, to several efficient search techniques.
Finally, the main contributions of this PhD report can be briefly summarised in four principal points. Firstly, two selective algorithms based on the idea of the surrounding neighbourhood; they obtain better results than the other algorithms presented here, as well as than other traditional schemes. Secondly, a generative approach based on mixtures of Gaussians; it presents better results in classification accuracy and size reduction than traditional adaptive algorithms, and results similar to those of the LVQ. Thirdly, it is shown that classification rules other than the 1-NN can be used, even leading to better results. And finally, it is deduced from the experiments carried out that, with some databases (such as the ones used here), the approaches presented here execute the classification processes in less time than the efficient search techniques.
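The selective and adaptive condensing algorithms contributed by the thesis are only named in this abstract. For orientation, the classic selective condensing scheme in this family, Hart's condensed nearest neighbour, can be sketched in a few lines (a standard textbook version, not one of the algorithms proposed here):

```python
def condense(points, labels):
    """Hart's condensed nearest neighbour: keep only points needed for 1-NN consistency."""
    def nearest(store, x):
        return min(store, key=lambda i: sum((a - b) ** 2 for a, b in zip(points[i], x)))

    store = [0]  # start with one arbitrary prototype
    changed = True
    while changed:
        changed = False
        for i, (x, y) in enumerate(zip(points, labels)):
            if labels[nearest(store, x)] != y and i not in store:
                store.append(i)      # keep points the current store misclassifies
                changed = True
    return store

points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
print(condense(points, labels))  # a small subset that still classifies the rest correctly
```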
Hoenicke, Jochen. "Combination of processes, data, and time /." Oldenburg : Univ., Fak. II, Dep. für Informatik, 2006. http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&doc_number=014970023&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA.
Tusche, Marco. "Empirical processes of multiple mixing data." Thesis, Tours, 2013. http://www.theses.fr/2013TOUR4033/document.
Full textThe present thesis studies weak convergence of empirical processes of multiple mixing data. It is based on the articles Durieu and Tusche (2012), Dehling, Durieu, and Tusche (2012), and Dehling, Durieu, and Tusche (2013). We follow the approximating class approach introduced by Dehling, Durieu, and Voln (2009)and Dehling and Durieu (2011), who established empirical central limit theorems for dependent R- and R”d-valued random variables, respectively. Extending their technique, we generalize their results to arbitrary state spaces and to empirical processes indexed by classes of functions. Moreover we study sequential empirical processes. Our results apply to B-geometrically ergodic Markov chains, iterative Lipschitz models, dynamical systems with a spectral gap on the Perron—Frobenius operator, and ergodic toms automorphisms. We establish conditions under which the empirical process of such processes converges weakly to a Gaussian process
Malayattil, Sarosh Aravind. "Design of a Multibus Data-Flow Processor Architecture." Thesis, Virginia Tech, 2012. http://hdl.handle.net/10919/31379.
Master of Science