Dissertations / Theses on the topic 'Hardware and software co-simulation'

Consult the top 50 dissertations / theses for your research on the topic 'Hardware and software co-simulation.'
1

Brankovic, Aleksandar. "Performance simulation methodologies for hardware/software co-designed processors." Doctoral thesis, Universitat Politècnica de Catalunya, 2015. http://hdl.handle.net/10803/287978.

Abstract:
Recently, the community has started looking into Hardware/Software (HW/SW) co-designed processors as potential solutions for moving towards less power-consuming and less complex designs. Unlike other solutions, they reduce power and complexity by performing so-called dynamic binary translation and optimization from a guest ISA to an internal custom host ISA. This thesis addresses the question of how to simulate this kind of architecture. For any kind of processor architecture, simulation is the common practice, because it is impossible to build several versions of the hardware in order to try all alternatives. Simulating HW/SW co-designed processors poses a big issue in comparison with the simulation of traditional HW-only architectures. First of all, open-source tools do not exist. Therefore, researchers often assume that the overhead of the software layer, which is in charge of dynamic binary translation and optimization, is constant or can be ignored. In this thesis we show that such an assumption is not valid and can lead to very inaccurate results; including the software layer in the simulation is therefore a must. On the other hand, simulation is very slow in comparison to native execution, so the community has spent a big effort on delivering accurate results in a reasonable amount of time. It is therefore common practice for HW-only processors to simulate only parts of the application stream, called samples. Samples usually correspond to different phases in the application stream and are usually no longer than a few million instructions. In order to achieve an accurate starting state for each sample, microarchitectural structures are warmed up for a few million instructions prior to the sample's instructions. Unfortunately, such a methodology cannot be directly applied to HW/SW co-designed processors. The warm-up for HW/SW co-designed processors needs to be 3-4 orders of magnitude longer than the warm-up needed for a traditional HW-only processor, because the warm-up of the software layer needs to be much longer than the warm-up of the hardware structures. To overcome this problem, in this thesis we propose a novel warm-up technique specialized for HW/SW co-designed processors. Our solution reduces the simulation time by at least 65X with an average error of just 0.75%. This trend holds for different software and hardware configurations. The process used to determine simulation samples cannot be applied to HW/SW co-designed processors either, because, due to the software layer, samples show more dissimilarities than in the case of HW-only processors. We therefore propose a novel algorithm that needs 3X fewer samples to achieve an error similar to state-of-the-art algorithms. Again, this trend holds for different software and hardware configurations.
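As a purely illustrative sketch of the sampling-plus-warm-up methodology the abstract refers to (not the thesis's actual technique), the flow might look like the following Python fragment; the simulator interface (`fast_forward`, `warm_up`, `simulate_detailed`), the warm-up length and the sample length are all assumptions:

```python
# Illustrative sketch of sampled simulation with a warm-up phase.
# The Simulator interface and all constants are hypothetical.

WARMUP_INSNS = 2_000_000   # instructions used only to warm microarchitectural state
SAMPLE_INSNS = 1_000_000   # instructions measured per sample

def simulate_samples(simulator, sample_starts):
    """Run detailed simulation only for selected samples, warming up first."""
    results = []
    for start in sample_starts:
        # Functional warm-up: update caches, predictors (and, for a HW/SW
        # co-designed processor, the translation/optimization software layer)
        # without collecting statistics.
        simulator.fast_forward(to_instruction=max(0, start - WARMUP_INSNS))
        simulator.warm_up(instructions=min(WARMUP_INSNS, start))
        # Detailed, statistics-collecting simulation of the sample itself.
        stats = simulator.simulate_detailed(instructions=SAMPLE_INSNS)
        results.append(stats)
    # A whole-program estimate is then extrapolated from the per-sample stats.
    return results
```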
2

Freitas, Arthur. "Hardware/Software Co-Verification Using the SystemVerilog DPI." Universitätsbibliothek Chemnitz, 2007. http://nbn-resolving.de/urn:nbn:de:swb:ch1-200700941.

Abstract:
During the design and verification of the Hyperstone S5 flash memory controller, we developed a highly effective way to use the SystemVerilog direct programming interface (DPI) to integrate an instruction set simulator (ISS) and a software debugger in logic simulation. The processor simulation was performed by the ISS, while all other hardware components were simulated in the logic simulator. The ISS integration allowed us to filter many of the bus accesses out of the logic simulation, accelerating runtime drastically. The software debugger integration freed both hardware and software engineers to work in their chosen development environments. Other benefits of this approach include testing and integrating code earlier in the design cycle and more easily reproducing, in simulation, problems found in FPGA prototypes.
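The bus-access filtering idea can be pictured with a small, hypothetical sketch (in Python rather than SystemVerilog/C, with an invented interface and invented address ranges): accesses served by the ISS's local memory model never reach the cycle-accurate logic simulation, and only device-mapped addresses are forwarded.

```python
# Hypothetical sketch: an ISS memory handler that only forwards
# device-mapped accesses to the (slow) logic simulator.

DEVICE_RANGES = [(0x4000_0000, 0x4000_FFFF)]  # assumed memory-mapped peripherals

def is_device_address(addr):
    return any(lo <= addr <= hi for lo, hi in DEVICE_RANGES)

class IssMemory:
    def __init__(self, logic_sim):
        self.local_ram = {}          # plain memory modeled inside the ISS
        self.logic_sim = logic_sim   # handle to the HDL/logic simulation

    def read(self, addr):
        if is_device_address(addr):
            # Only these accesses pay the cost of the cycle-accurate simulation.
            return self.logic_sim.bus_read(addr)
        return self.local_ram.get(addr, 0)

    def write(self, addr, value):
        if is_device_address(addr):
            self.logic_sim.bus_write(addr, value)
        else:
            self.local_ram[addr] = value
```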
3

Nilsson, Per. "Hardware / Software co-design for JPEG2000." Thesis, Linköping University, Department of Electrical Engineering, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-5796.

Abstract:
For demanding applications, for example image or video processing, there may be computations that aren't well suited to digital signal processors. While a DSP processor is appropriate for some tasks, its instruction set could be extended in order to achieve higher performance for the tasks that such a processor normally isn't actually designed for. The platform used in this project is flexible in the sense that new hardware can be designed to speed up certain computations.

This thesis analyzes the computationally complex parts of JPEG2000. In order to achieve sufficient performance for JPEG2000, there may be a need for hardware acceleration.

First, a JPEG2000 decoder was implemented for a DSP processor in assembler. When the firmware had been written, the cycle consumption of its parts was measured and estimated. From this analysis, the bottlenecks of the system were identified. Furthermore, new processor instructions are proposed that could be implemented for this system. Finally, the performance improvements are estimated.
4

Bappudi, Bhargav. "Example Modules for Hardware-software Co-design." University of Cincinnati / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1470043472.

5

Tiwari, Anurag. "Hardware/Software Co-Debugging for Reconfigurable Computing Applications." University of Cincinnati / OhioLINK, 2002. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1011816501.

6

Lu, Lipin. "Simulation Software and Hardware for Teaching Ultrasound." Scholarly Repository, 2008. http://scholarlyrepository.miami.edu/oa_theses/143.

Abstract:
Over the years, medical imaging modalities have evolved drastically. Accordingly, the need for conveying the basic imaging knowledge to future specialists and other trainees becomes even more crucial for devoted educators. Understanding the concepts behind each imaging modality requires a plethora of advanced physics, mathematics, mechanics and medical background. Absorbing all of this background information is a daunting task for any beginner. This thesis focuses on developing an ultrasound imaging education tutorial with the goal of easing the process of learning the principles of ultrasound. This tutorial utilizes three diverse approaches, including software and hardware applications. By performing these methodologies from different perspectives, not only will the efficiency of the training be enhanced, but also the trainee's understanding of crucial concepts will be reinforced through repetitive demonstration. The first goal of this thesis was developing an online medical imaging simulation system and deploying it on the website of the University of Miami. In order to construct an easy, understandable, and interactive environment without deteriorating the important aspects of the ultrasound principles, interactive flash animations (developed with Macromedia Director MX) were used to present concepts via graphic-oriented simulations. The second goal was developing a stand-alone MATLAB program, intended to manipulate the intensity of the pixels in the image in order to simulate how ultrasound images are derived. Additionally, a GUI (graphical user interface) was employed to maximize the accessibility of the program and provide easily adjustable parameters. The GUI window enables trainees to see the changes in outcomes by altering different parameters of the simulation. The third goal of this thesis was to incorporate an actual ultrasound demonstration into the tutorial. This was achieved by using a real ultrasound transducer with a pulse/receiver so that trainees could observe actual ultrasound phenomena and view the results using an oscilloscope. By manually adjusting the panels on the pulse/receiver console, basic A-mode ultrasound experiments can be performed with ease. By combining software and hardware simulations, the ultrasound education package presented in this thesis will help trainees more efficiently absorb the various concepts behind ultrasound.
7

Cadenelli, Luca. "Hardware/software co-design for data-intensive genomics workloads." Doctoral thesis, Universitat Politècnica de Catalunya, 2019. http://hdl.handle.net/10803/668250.

Abstract:
Since the last decade, the main components of computer systems have been evolving and diversifying to overcome their physical limits and to minimize their energy footprint. Hardware specialization and heterogeneity have become key to designing more efficient systems and tackling ever-important problems with ever-larger volumes of data. However, to fully take advantage of the new hardware, a tighter integration between hardware and software, called hardware/software co-design, is also needed. Hardware/software co-design is a time-consuming process that poses its challenges, such as code and performance portability. Despite its challenges and considerable costs, it is an effort that is crucial for data-intensive applications that run at scale. Such applications span different fields, such as engineering, chemistry, life sciences, astronomy, high energy physics, earth sciences, et cetera. Another scientific field where hardware/software co-design is fundamental is genomics. Here, modern DNA sequencing technologies reduced the sequencing time and made its cost orders of magnitude cheaper than it was just a few years ago. This breakthrough, together with novel genomics methods, will eventually enable the long-awaited personalized medicine. Personalized medicine selects appropriate and optimal therapies based on the context of a patient's genome, and it has the potential to change medical treatments as we know them today. However, the broad adoption of genomics methods is limited by their capital and operational costs. In fact, genomics pipelines consist of complex algorithms with execution times of many hours per patient and vast intermediate data structures stored in main memory for good performance. To satisfy the main memory requirement, genomics applications are usually scaled out to multiple compute nodes. Therefore, these workloads require infrastructures of enterprise-class servers, with entry and running costs that most labs, clinics, and hospitals cannot afford. For these reasons, co-designing genomics workloads to lower their total cost of ownership is essential and worth investigating. This thesis demonstrates that hardware/software co-design allows migrating data-intensive genomics applications to inexpensive desktop-class machines to reduce the total cost of ownership when compared to traditional cluster deployments. Firstly, the thesis examines algorithmic improvements to ease co-design and to reduce the workload footprint, using NVMs as a memory extension so that the workload can run on one single node. Secondly, it investigates how data-intensive algorithms can offload computation to programmable accelerators (i.e., GPUs and FPGAs) to reduce the execution time and the energy-to-solution. Thirdly, it explores and proposes techniques to substantially reduce the memory footprint through the adoption of flash memory, to the point that genomics methods can run on one affordable desktop-class machine. Results on SMUFIN, a state-of-the-art real-world genomics method, prove that hardware/software co-design allows significant reductions in the total cost of ownership of data-intensive genomics methods, easing their adoption on large repositories of genomes and also in the field.
8

Lobo, Tiago Mendonça. "Co-projeto hardware/software para cálculo de fluxo ótico." Universidade de São Paulo, 2013. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-28082013-094816/.

Abstract:
The calculation of motion vectors is used in many processes in the area of computer vision. Problems such as establishing collision routes and estimating camera movement (egomotion) use these vectors as input for complex algorithms that demand many computational resources and, consequently, higher energy consumption. The optical flow is an approximation of the field generated by the motion vectors. However, for mobile, low-power applications it becomes infeasible to use general-purpose computers. An embedded system is defined as a computer designed with a specific purpose related to the application in which it is inserted. The main objective of this work was to develop an embedded module that computes the optical flow. A dedicated hardware/software co-design was developed and implemented on Cyclone II and Stratix IV FPGAs for prototyping the system. Thus, the implementation of a design that assists the detection and measurement of movement is important not only as an isolated application, but also as a basis for the development of other applications such as tracking, video compression, and collision prediction.
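To illustrate the kind of computation such a module accelerates, here is a minimal block-matching optical-flow sketch in Python/NumPy; it is not the co-design described in the thesis, and the block and search-window sizes are arbitrary.

```python
# Illustrative block-matching optical flow (sum of absolute differences).
# Not the thesis's hardware design; just the kind of computation it accelerates.
import numpy as np

def block_matching_flow(prev, curr, block=8, search=4):
    """Return one (dy, dx) motion vector per block of the previous frame."""
    h, w = prev.shape
    flow = np.zeros((h // block, w // block, 2), dtype=np.int32)
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            ref = prev[by:by + block, bx:bx + block].astype(np.int32)
            best, best_vec = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = by + dy, bx + dx
                    if y < 0 or x < 0 or y + block > h or x + block > w:
                        continue
                    cand = curr[y:y + block, x:x + block].astype(np.int32)
                    sad = np.abs(ref - cand).sum()
                    if best is None or sad < best:
                        best, best_vec = sad, (dy, dx)
            flow[by // block, bx // block] = best_vec
    return flow
```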
9

Dias, Maurício Acconcia. "Co-Projeto de hardware/software para correlação de imagens." Universidade de São Paulo, 2011. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-31082011-124626/.

Abstract:
This work presents an FPGA-based hardware/software co-design for the normalized cross-correlation image algorithm. The main goal is to achieve a significant speedup in execution time with respect to the all-software implementation. The proposed co-design method is a modified profiling-based method with an added software development and optimization step. Executions were compared in terms of execution time, resulting in a significant speedup. To achieve this speedup, 21 different configurations of the Nios II soft-processor were compared, including configurations with new dedicated instructions. The influence of basic and dedicated hardware structures on the final execution time of the algorithm was also evaluated. Result analysis suggests that the method is efficient considering the achieved speedup, but the final execution time still remains higher than desired, considering the need for real-time image processing in robotic navigation systems. These real-time limitations are, however, a consequence of the hardware adopted in this work, which is based on a low-cost, medium-capacity FPGA.
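For reference, a minimal software-only version of the normalized cross-correlation kernel that this co-design accelerates might look as follows (a NumPy sketch, not the thesis's implementation):

```python
# Illustrative normalized cross-correlation (NCC) between a template and
# equally sized image patches; a plain software reference, not the co-design.
import numpy as np

def ncc(patch, template):
    p = patch.astype(np.float64) - patch.mean()
    t = template.astype(np.float64) - template.mean()
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    return 0.0 if denom == 0 else float((p * t).sum() / denom)

def correlate(image, template):
    """Slide the template over the image and return the NCC map."""
    ih, iw = image.shape
    th, tw = template.shape
    out = np.zeros((ih - th + 1, iw - tw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = ncc(image[y:y + th, x:x + tw], template)
    return out
```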
10

Li, Juncao. "An Automata-Theoretic Approach to Hardware/Software Co-verification." PDXScholar, 2010. https://pdxscholar.library.pdx.edu/open_access_etds/12.

Abstract:
Hardware/Software (HW/SW) interfaces are pervasive in computer systems. However, many HW/SW interface implementations are unreliable due to their intrinsically complicated nature. In industrial settings, there are three major challenges to improving reliability. First, as there is no systematic framework for HW/SW interface specifications, interface protocols cannot be precisely conveyed to engineers. Second, as there is no unifying formal model for representing the implementation semantics of HW/SW interfaces accurately, some critical properties cannot be formally verified on HW/SW interface implementations. Finally, few automatic tools exist to help engineers in HW/SW interface development. In this dissertation, we present an automata-theoretic approach to HW/SW co-verification that addresses these challenges. We designed a co-specification framework to formally specify HW/SW interface protocols; we synthesized a hybrid Büchi Automaton Pushdown System, namely Büchi Pushdown System (BPDS), as the unifying formal model for HW/SW interfaces; and we created a co-verification tool, CoVer that implements our model checking algorithms and realizes our reduction algorithms for BPDS. The application of our approach to the Windows device/driver framework has resulted in the detection of fifteen specification issues. Furthermore, utilizing CoVer, we discovered twelve real bugs in five drivers. These non-trivial findings have demonstrated the significance of our approach in industrial applications.
11

Tang, Yi. "SUNSHINE: Integrate TOSSIM and P-Sim." Thesis, Virginia Tech, 2011. http://hdl.handle.net/10919/40721.

Abstract:
Simulators are important tools for wireless sensor network (sensornet) design and evaluation. However, existing simulators only support evaluations of protocols and software aspects of sensornet design. Thus they cannot accurately capture the significant impacts of various hardware designs on sensornet performance. To fill in the gap, we proposed SUNSHINE, a scalable hardware-software cross-domain simulator for sensornet applications. SUNSHINE is the first sensornet simulator that effectively supports joint evaluation and design of sensor hardware and software performance in a networked context. SUNSHINE captures the performance of network protocols, software and hardware through the integration of two modules: a network simulator TOSSIM [1] and hardware-software simulator P-Sim composed of an instruction-set simulator SimulAVR [2] and a hardware simulator GEZEL [3]. This thesis focuses on the integration of TOSSIM and P-Sim. It discusses the integration design considerations and explains how to address several integration challenges: time conversion, data conversion, and time synchronization. Some experiments are also given to demonstrate SUNSHINE's cross-domain simulation capability, showing SUNSHINE's strength by integrating simulators from different domains.

Master of Science
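The time-synchronization challenge mentioned above can be sketched as a conservative lockstep loop between two simulation domains; the interface below is entirely hypothetical and only illustrates converting times to a common unit and always advancing the domain that lags behind.

```python
# Hypothetical sketch of time synchronization between two simulators that
# advance in different time units (e.g., network events vs. clock cycles).
# SUNSHINE's actual mechanism is more involved; this only shows the idea.

CYCLES_PER_SECOND = 8_000_000  # assumed MCU clock for unit conversion

def to_seconds(cycles):
    return cycles / CYCLES_PER_SECOND

def co_simulate(network_sim, hw_sim, end_time_s):
    """Advance two simulation domains in lockstep up to end_time_s."""
    while min(network_sim.now_s(), to_seconds(hw_sim.now_cycles())) < end_time_s:
        # Always advance the domain that is furthest behind, so neither
        # domain ever receives an event from its own past.
        if network_sim.now_s() <= to_seconds(hw_sim.now_cycles()):
            network_sim.run_next_event()
        else:
            hw_sim.run_one_cycle()
        # Exchange any pending cross-domain data (packets, pin values, ...).
        network_sim.deliver_pending(hw_sim.collect_outputs())
        hw_sim.deliver_pending(network_sim.collect_outputs())
```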
12

López, Muñoz Pedro. "Efficient hardware/software co-designed schemes for low-power processors." Doctoral thesis, Universitat Politècnica de Catalunya, 2014. http://hdl.handle.net/10803/144619.

Abstract:
Nowadays, we are reaching a point where further improving single-thread performance can only be done at the expense of significantly increasing power consumption. Thus, multi-core chips have been adopted by the industry and the scientific community as a proven solution to improve performance with limited power consumption. However, the number of units to be integrated into a single die is limited by its area and power restrictions, and therefore the thread-level parallelism (TLP) that can be exploited is also limited. One way to continue increasing the number of cores is to reduce the complexity of each individual core at the cost of sacrificing instruction-level parallelism (ILP). We face a design trade-off here: dedicate the total available die area to many simple cores and favor TLP, or dedicate it to fewer, more complex cores and favor ILP. Among the different solutions already studied in the literature to deal with this challenge, we selected hybrid hardware/software co-designed processors. This solution provides high single-thread performance on simple low-power cores through a software dynamic binary optimizer tightly coupled with the hardware underneath. For this reason, we believe that hardware/software co-designed processors are an area that deserves special attention in the design of multi-core systems, since they allow implementing multiple simple cores suitable for maximizing TLP while sustaining better ILP than conventional pure hardware approaches. In particular, this thesis explores three different techniques to address some of the most relevant challenges in the design of a simple low-power hardware/software co-designed processor. The first technique is a profiling mechanism, named the LIU Profiler, able to detect hot code regions. It consists of a small hardware table that uses a novel replacement policy aimed at detecting hot code. This simple hardware structure implements the mechanism and allows the software to apply heuristics when building code regions and applying optimizations. The LIU Profiler achieves 85.5% code coverage detection, whereas similar profilers implementing traditional replacement policies reach up to 60% coverage while requiring a 4x bigger table. Moreover, the LIU Profiler increases the total area of a simple low-power processor by only 1% and consumes less than 0.87% of the total processor power. The LIU Profiler thus enables improving single-thread performance without significantly increasing the area and power of the processor. The second technique is a rollback scheme aimed at supporting code reordering and aggressive speculative optimizations on hot code regions. It is named HRC and combines software and hardware mechanisms to checkpoint and recover the architectural register state of the processor. When compared with pure hardware solutions that require doubling the number of registers, the proposal reduces the area of the processor by 11% and the register file power consumption by 24.4%, at the cost of degrading performance by only 1%. The third technique is a loop parallelization (LP) scheme that uses the software layer to dynamically detect loops of instructions and prepare them to execute multiple iterations in parallel using Simultaneous Multi-Threading threads. These are optimized by employing dedicated loop-parallelization binary optimizations to speed up loop execution. The LP scheme uses a novel fine-grain register communication and dynamic thread register binding technique, as well as already existing processor resources.
It introduces small overheads to the system, and even small loops and loops that iterate just a few times are able to obtain significant performance improvements. The execution time of the loops is improved by more than 16.5% when compared to a fully optimized baseline. LP contributes positively to the integration of a high number of simple cores on the same die and allows those cores to cooperate to some extent to continue exploiting ILP when necessary.
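A hot-code profiling table of the kind described can be sketched as follows; the LIU Profiler's actual replacement policy is specific to the thesis, so the counter-based eviction below is only a generic placeholder, and the table size and threshold are invented.

```python
# Generic sketch of a small profiling table for hot-code detection.
# The LIU Profiler uses its own replacement policy; the eviction rule
# below (evict the entry with the lowest counter) is just a placeholder.

TABLE_SIZE = 64       # assumed number of entries
HOT_THRESHOLD = 1024  # executions after which a region is considered hot

class ProfilingTable:
    def __init__(self):
        self.counters = {}  # branch-target address -> execution counter

    def record(self, target_addr):
        """Called on every taken branch / region entry."""
        if target_addr in self.counters:
            self.counters[target_addr] += 1
        else:
            if len(self.counters) >= TABLE_SIZE:
                coldest = min(self.counters, key=self.counters.get)
                del self.counters[coldest]
            self.counters[target_addr] = 1
        if self.counters[target_addr] == HOT_THRESHOLD:
            return target_addr  # hand the hot region to the software optimizer
        return None
```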
13

Ramírez, Bellido Alejandro. "High performance instruction fetch using software and hardware co-design." Doctoral thesis, Universitat Politècnica de Catalunya, 2002. http://hdl.handle.net/10803/5969.

Abstract:
In recent years, the design of high-performance processors has progressed along two research directions: increasing pipeline depth to allow higher clock frequencies, and widening the pipeline to allow the parallel execution of more instructions. Designing a high-performance processor implies balancing all processor components to ensure that the overall performance is not limited by any individual component. This means that if we provide the processor with a faster execution unit, we must make sure we can fetch and decode instructions fast enough to keep that execution unit busy.

This thesis explores the challenges posed by the design of the fetch unit from two points of view: the design of software better suited to existing fetch architectures, and the design of hardware adapted to the special characteristics of the new software we have generated.

Our approach to the design of new software has been to propose a new code-reordering algorithm that not only aims to improve instruction cache performance, but at the same time aims to increase the effective width of the fetch unit. Using information about program behavior (profile data), we chain the basic blocks of the program so that conditional branches tend to be not taken, which favors the sequential execution of the code. Once we have organized the basic blocks into these traces, we map the different traces in memory so as to minimize the amount of space required for the really useful code and the memory conflicts of this code. Besides describing the algorithm, we have performed a detailed analysis of the impact of these optimizations on the different aspects of fetch unit performance: memory latency, the effective width of the fetch unit, and the accuracy of the branch predictor.

Based on the analysis of the behavior of the optimized codes, we also propose a modification of the trace cache mechanism intended to make more effective use of the scarce available storage space. This mechanism uses the trace cache only to store those traces that could not be provided by the instruction cache in a single cycle.

Also based on the knowledge acquired about the behavior of the optimized codes, we propose a new branch predictor that makes extensive use of the same information that was used to reorder the code, but in this case to improve the accuracy of the branch predictor.

Finally, we propose a new architecture for the processor fetch unit based on exploiting the special characteristics of the optimized codes. Our architecture has a very low level of complexity, similar to that of an architecture capable of fetching a single basic block per cycle, but it offers much higher performance, comparable to that of a trace cache, which is much more costly and complex.
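The code-reordering idea, chaining basic blocks so that conditional branches tend to fall through, can be sketched with a greedy trace builder over profile data; this is only an illustration of the general technique, not the algorithm proposed in the thesis, and all data structures are assumptions.

```python
# Illustrative greedy chaining of basic blocks using profile data, so that
# the most frequent successor is laid out fall-through (branch not taken).

def build_traces(entry_blocks, successors, edge_counts):
    """successors[b] -> list of successor blocks; edge_counts[(b, s)] -> count."""
    placed, traces = set(), []
    for entry in entry_blocks:
        if entry in placed:
            continue
        trace, block = [], entry
        while True:
            trace.append(block)
            placed.add(block)
            # Follow the hottest not-yet-placed successor to extend the trace.
            candidates = [s for s in successors.get(block, []) if s not in placed]
            if not candidates:
                break
            block = max(candidates, key=lambda s, b=block: edge_counts.get((b, s), 0))
        traces.append(trace)
    return traces  # traces are then mapped to memory to minimize conflicts
```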
14

Zhang, Jingyao. "Hardware-Software Co-Design for Sensor Nodes in Wireless Networks." Diss., Virginia Tech, 2013. http://hdl.handle.net/10919/50972.

Abstract:
Simulators are important tools for analyzing and evaluating different design options for wireless sensor networks (sensornets) and hence, have been intensively studied in the past decades. However, existing simulators only support evaluations of protocols and software aspects of sensornet design. They cannot accurately capture the significant impacts of various hardware designs on sensornet performance. As a result, the performance/energy benefits of customized hardware designs are difficult to be evaluated in sensornet research. To fill in this technical void, in the first section, we describe the design and implementation of SUNSHINE, a scalable hardware-software emulator for sensornet applications.

SUNSHINE is the first sensornet simulator that effectively supports joint evaluation and design of sensor hardware and software performance in a networked context. SUNSHINE captures the performance of network protocols, software and hardware up to cycle-level accuracy through its seamless integration of three existing sensornet simulators: a network simulator TOSSIM, an instruction-set simulator SimulAVR and a hardware simulator GEZEL. SUNSHINE solves several sensornet simulation challenges, including data exchanges and time synchronization across different simulation domains and simulation accuracy levels. SUNSHINE also provides a hardware specification scheme for simulating flexible and customized hardware designs. Several experiments are given to illustrate SUNSHINE's simulation capability. Evaluation results are provided to demonstrate that SUNSHINE is an efficient tool for software-hardware co-design in sensornet research.

Even though SUNSHINE can simulate flexible sensor nodes (nodes that contain FPGA chips as coprocessors) in wireless networks, it does not estimate the power/energy consumption of sensor nodes. So far, no simulators have been developed to evaluate the performance of such flexible nodes in wireless networks. In the second section, we present PowerSUNSHINE, a power- and energy-estimation tool that fills the void. PowerSUNSHINE is the first scalable power/energy estimation tool for WSNs that provides an accurate prediction for both fixed and flexible sensor nodes. In the section, we first describe the requirements and challenges of building PowerSUNSHINE. Then, we present power/energy models for both fixed and flexible sensor nodes. Two testbeds, a MicaZ platform and a flexible node consisting of a microcontroller, a radio and an FPGA-based co-processor, are provided to demonstrate the simulation fidelity of PowerSUNSHINE. We also discuss several evaluation results based on simulation and testbeds to show that PowerSUNSHINE is a scalable simulation tool that provides accurate estimation of power/energy consumption for both fixed and flexible sensor nodes.

Since the main components of sensor nodes include a microcontroller and a wireless transceiver (radio), their real-time performance may be a bottleneck when executing computation-intensive tasks in sensor networks. A coprocessor can alleviate the burden of the microcontroller from multiple tasks and hence decrease the probability of dropping packets from the wireless channel. Even though adding a coprocessor brings benefits to sensor networks, designing applications for sensor nodes with coprocessors from scratch is challenging due to the consideration of design details in multiple domains, including software, hardware, and network.

To solve this problem, we propose a hardware-software co-design framework for network applications that contain multiprocessor sensor nodes. The framework includes a three-layered architecture for multiprocessor sensor nodes and application interfaces under the framework. The layered architecture is to make the design of multiprocessor nodes' applications flexible and efficient. The application interfaces under the framework are implemented for deploying reliable applications of multiprocessor sensor nodes. A resource sharing technique is provided to make processor, coprocessor and radio work coordinately via the communication bus. Several testbeds containing multiprocessor sensor nodes are deployed to evaluate the effectiveness of our framework. Network experiments are executed in the SUNSHINE emulator to demonstrate the benefits of using multiprocessor sensor nodes in many network scenarios.

Ph. D.
15

Cavalcante, Sergio Vanderlei. "A hardware-software co-design system for embedded real-time applications." Thesis, University of Newcastle Upon Tyne, 1997. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.360339.

16

Zhang, Jingyao. "SUNSHINE: A Multi-Domain Sensor Network Simulator." Thesis, Virginia Tech, 2010. http://hdl.handle.net/10919/45146.

Abstract:
Simulators are important tools for analyzing and evaluating different design options for wireless sensor networks (sensornets) and hence, have been intensively studied in the past decades. However, existing simulators only support evaluations of protocols and software aspects of sensornet design. They cannot accurately capture the significant impacts of various hardware designs on sensornet performance. As a result, the performance/energy benefits of customized hardware designs are difficult to be evaluated in sensornet research. To fill in this technical void, in this thesis, we describe the design and implementation of SUNSHINE, a scalable hardware-software cross-domain simulator for sensornet applications. SUNSHINE is the first sensornet simulator that effectively supports joint evaluation and design of sensor hardware and software performance in a networked context. SUNSHINE captures the performance of network protocols, software and hardware up to cycle-level accuracy through its seamless integration of three existing sensornet simulators: a network simulator TOSSIM, an instruction-set simulator SimulAVR and a hardware simulator GEZEL. SUNSHINE solves challenging design problems, including data exchanges and time synchronizations across different simulation domains and simulation accuracy levels. SUNSHINE also provides a hardware specification scheme for simulating flexible and customized hardware designs. Several experiments are given to illustrate SUNSHINE's cross-domain simulation capability, demonstrating that SUNSHINE is an efficient tool for software-hardware codesign in sensornet research.

Master of Science
17

Blumer, Aric David. "Register Transfer Level Simulation Acceleration via Hardware/Software Process Migration." Diss., Virginia Tech, 2007. http://hdl.handle.net/10919/29380.

Abstract:
The run-time reconfiguration of Field Programmable Gate Arrays (FPGAs) opens new avenues to hardware reuse. Through the use of process migration between hardware and software, an FPGA provides a parallel execution cache. Busy processes can be migrated into hardware-based, parallel processors, and idle processes can be migrated out, increasing the utilization of the hardware. The application of hardware/software process migration to the acceleration of Register Transfer Level (RTL) circuit simulation is developed and analyzed. RTL code can exhibit a form of locality of reference such that executing processes tend to be executed again. This property is termed executive temporal locality, and it can be exploited by migration systems to accelerate RTL simulation. In this dissertation, process migration is first formally modeled using Finite State Machines (FSMs). Upon FSMs are built programs, processes, migration realms, and the migration of process state within a realm. From this model, a taxonomy of migration realms is developed. Second, process migration is applied to the RTL simulation of digital circuits. The canonical form of an RTL process is defined, and transformations of HDL code are justified and demonstrated. These transformations allow a simulator to identify basic active units within the simulation and combine them to balance the load across a set of processors. Through the use of input monitors, executive locality of reference is identified and demonstrated on a set of six RTL designs. Finally, the implementation of a migration system is described which utilizes Virtual Machines (VMs) and Real Machines (RMs) in existing FPGAs. Empirical and algorithmic models are developed from the data collected from the implementation to evaluate the effect of optimizations and migration algorithms.

Ph. D.
18

Adhipathi, Pradeep. "Model based approach to Hardware/ Software Partitioning of SOC Designs." Thesis, Virginia Tech, 2003. http://hdl.handle.net/10919/9986.

Abstract:
As the IT industry marks a paradigm shift from the traditional system design model to System-On-Chip (SOC) design, the design of custom hardware, embedded processors and associated software have become very tightly coupled. Any change in the implementation of one of the components affects the design of other components and, in turn, the performance of the system. This has led to an integrated design approach known as hardware/software co-design and co-verification. The conventional techniques for co-design favor partitioning the system into hardware and software components at an early stage of the design and then iteratively refining it until a good solution is found. This method is expensive and time consuming. A more modern approach is to model the whole system and rigorously test and refine it before the partitioning is done. The key to this method is the ability to model and simulate the entire system. The advent of new System Level Modeling Languages (SLML), like SystemC, has made this possible. This research proposes a strategy to automate the process of partitioning a system model after it has been simulated and verified. The partitioning idea is based on systems modeled using Process Model Graphs (PmG). It is possible to extract a PmG directly from a SLML like SystemC. The PmG is then annotated with additional attributes like IO delay and rate of activation. A complexity heuristic is generated from this information, which is then used by a greedy algorithm to partition the graph into different architectures. Further, a command line tool has been developed that can process textually represented PmGs and partition them based on this approach.

Master of Science
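A greedy partitioner driven by a complexity heuristic, as described above, can be sketched as follows; the attribute names, scores and area budget are invented for illustration and do not reproduce the thesis's tool.

```python
# Hypothetical greedy hardware/software partitioning over annotated processes.
# The thesis derives its complexity heuristic from attributes such as I/O delay
# and rate of activation; the scores and area numbers below are made up.

def partition(processes, hw_area_budget):
    """processes: list of dicts with 'name', 'complexity', 'hw_area' keys."""
    hw, sw, used_area = [], [], 0
    # Most complex (i.e., most profitable to accelerate) processes first.
    for proc in sorted(processes, key=lambda p: p["complexity"], reverse=True):
        if used_area + proc["hw_area"] <= hw_area_budget:
            hw.append(proc["name"])
            used_area += proc["hw_area"]
        else:
            sw.append(proc["name"])
    return hw, sw

# Example usage with invented numbers:
# hw, sw = partition([{"name": "fft", "complexity": 9.1, "hw_area": 1200},
#                     {"name": "ui",  "complexity": 0.4, "hw_area": 300}],
#                    hw_area_budget=1500)
```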
19

Yildirim, Gokce. "Smoke Simulation On Programmable Graphics Hardware." Master's thesis, METU, 2005. http://etd.lib.metu.edu.tr/upload/12606545/index.pdf.

Abstract:
Fluids such as smoke, water and fire are simulated for both Computer Graphics applications and engineering fields such as Mechanical Engineering. Generally, Fluid Dynamics is used for the achievement of realistic-looking fluid simulations. However, the complexity of these calculations makes it difficult to achieve high performance. With the advances in graphics hardware, it has been possible to provide programmability both at the vertex and the fragment level, which allows for faster simulations of complex fluids and other events. In this thesis, one gaseous fluid, smoke, is simulated in three dimensions by solving the Navier-Stokes Equations (NSEs) using a semi-Lagrangian unconditionally stable method. Simulation is performed both on the Central Processing Unit (CPU) and the Graphics Processing Unit (GPU). For the programmability at the vertex and the fragment level, C for Graphics (Cg), a platform-independent and architecture-neutral shading language, is used. Owing to the advantage of programmability and parallelism of the GPU, smoke simulation on graphics hardware runs significantly faster than the corresponding CPU implementation. The test results prove the higher performance of the GPU over the CPU for running three-dimensional fluid simulations.
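The semi-Lagrangian advection step at the heart of such a solver can be sketched in two dimensions as follows (the thesis works in 3D and on the GPU via Cg); this NumPy fragment only illustrates the backtrace-and-interpolate idea and assumes unit grid spacing.

```python
# Illustrative semi-Lagrangian advection of a scalar field on a 2D grid.
# For each cell, trace the velocity backwards in time and sample the field
# at the departure point with bilinear interpolation (unconditionally stable).
import numpy as np

def advect(field, u, v, dt):
    """field, u, v: 2D arrays of equal shape; returns the advected field."""
    h, w = field.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Backtrace particle positions (grid spacing assumed to be 1).
    src_y = np.clip(ys - dt * v, 0, h - 1)
    src_x = np.clip(xs - dt * u, 0, w - 1)
    # Bilinear interpolation at the departure points.
    y0, x0 = np.floor(src_y).astype(int), np.floor(src_x).astype(int)
    y1, x1 = np.minimum(y0 + 1, h - 1), np.minimum(x0 + 1, w - 1)
    fy, fx = src_y - y0, src_x - x0
    top = field[y0, x0] * (1 - fx) + field[y0, x1] * fx
    bot = field[y1, x0] * (1 - fx) + field[y1, x1] * fx
    return top * (1 - fy) + bot * fy
```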
20

Tuncali, Cumhur Erkan. "Implementation and Simulation of MC68HC11 Microcontroller Unit Using SystemC for Co-design Studies." Master's thesis, METU, 2007. http://etd.lib.metu.edu.tr/upload/12609177/index.pdf.

Abstract:
In this thesis, co-design and co-verification of a microcontroller's hardware and software using SystemC is studied. For this purpose, an MC68HC11 microcontroller unit and a test bench that contains input and output modules for the verification of the microcontroller unit are implemented using the SystemC programming language, and a visual simulation program is developed using the C# programming language on the Microsoft .NET platform. SystemC is a C++ class library that is used for co-designing the hardware and software of a system. One of the advantages of using SystemC in system design is the ability to design each module of the system at a different abstraction level. In this thesis, test bench modules are designed at a high abstraction level and microcontroller hardware modules are designed at a lower abstraction level. At the end, a simulation platform used for the co-simulation and co-verification of the hardware and software modules of the overall system is developed by combining the microcontroller implementation, the test bench modules, the test software and the visual simulation program. Simulations at different levels are performed on the system in the developed simulation platform. The simulation results made it easy to observe errors in the designed modules and to make corrections until all results verified the designed hardware modules. This showed that co-designing and co-verifying the hardware and software of a system helps find errors and make corrections in the early stages of the system design cycle, thereby reducing the design time of the system.
21

Restrepo, Calle Felipe. "Co-diseño de sistemas hardware/software tolerantes a fallos inducidos por radiación." Doctoral thesis, Universidad de Alicante, 2011. http://hdl.handle.net/10045/23522.

Abstract:
This thesis proposes a development methodology for hybrid strategies to mitigate radiation-induced faults in modern embedded systems. The proposal is based on the principles of system co-design and consists of the selective, incremental, and flexible combination of hardware- and software-based fault-tolerance approaches; that is, the exploration of the solution space relies on a fine-grained hybrid strategy. The design flow is guided by the application requirements. This methodology has been named co-hardening. In this way, it is possible to design dependable embedded systems at low cost, where not only the dependability requirements and design constraints are satisfied, but the excessive use of costly protection mechanisms (hardware and software) is also avoided.
22

Cornevaux-Juignet, Franck. "Hardware and software co-design toward flexible terabits per second traffic processing." Thesis, Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire, 2018. http://www.theses.fr/2018IMTA0081/document.

Abstract:
The reliability and security of communication networks require efficient components to finely analyze the traffic of data. Service diversification and throughput increases force network operators to constantly improve analysis systems in order to handle throughputs of hundreds, even thousands, of Gigabits per second. Commonly used solutions are software-oriented; they offer a flexibility and an accessibility welcomed by network operators, but they can no longer meet these strong constraints in many critical cases. This thesis studies architectural solutions based on programmable chips such as Field-Programmable Gate Arrays (FPGAs), which combine computation power and processing flexibility. Boards equipped with such chips are integrated into a common software/hardware processing flow in order to balance the shortcomings of each element. Network components developed with this innovative approach ensure an exhaustive processing of packets transmitted on physical links while keeping the flexibility of usual software solutions, which was never achieved in the previous state of the art. This approach is validated by the design and implementation of a flexible packet processing architecture on FPGA. It is able to process any packet type at the cost of a slight over-consumption of resources, and it is fully customizable from the software side. With the proposed solution, network engineers can transparently use the processing power of a hardware accelerator without the need for prior knowledge of digital circuit design.
23

Liang, Cao. "Hardware/Software Co-Design Architecture and Implementations of MIMO Decoders on FPGA." ScholarWorks@UNO, 2006. http://scholarworks.uno.edu/td/416.

Abstract:
During the last years, multiple-input multiple-output (MIMO) technology has attracted great attention in the area of wireless communications. The hardware implementation of MIMO decoders becomes a challenging task as the complexity of the MIMO system increases. This thesis presents a hardware/software co-design architecture and implementations of two typical lattice decoding algorithms: the Agrell and Vardy (AV) algorithm and the Viterbo and Boutros (VB) algorithm. Three levels of parallelism are analyzed for an efficient implementation, with the preprocessing part on an embedded MicroBlaze soft processor and the decoding part in customized hardware. The decoders for a 4 by 4 MIMO system with a 16-QAM modulation scheme are prototyped on a Xilinx XC2VP30 FPGA device. The hardware implementations of the AV and VB decoders show that they support data rates of up to 81 Mbps and 37 Mbps, respectively. Performance in terms of resource utilization and BER is also compared between the two decoders.
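To make the decoding problem concrete, the sketch below shows brute-force maximum-likelihood detection for a small MIMO system with 16-QAM; the AV and VB lattice decoders studied in the thesis reach the same solution far more efficiently, so this is only a statement of the problem, not of their algorithms.

```python
# Brute-force maximum-likelihood MIMO detection for a small system:
#   x_hat = argmin_x || y - H x ||^2  over all candidate symbol vectors.
# Lattice decoders (e.g., AV, VB) avoid this exhaustive search.
import itertools
import numpy as np

# 16-QAM constellation (unnormalized).
QAM16 = np.array([a + 1j * b for a in (-3, -1, 1, 3) for b in (-3, -1, 1, 3)])

def ml_detect(H, y):
    """H: (nr, nt) channel matrix, y: (nr,) received vector."""
    nt = H.shape[1]
    best, best_x = None, None
    for combo in itertools.product(QAM16, repeat=nt):
        x = np.array(combo)
        metric = np.linalg.norm(y - H @ x) ** 2
        if best is None or metric < best:
            best, best_x = metric, x
    return best_x  # 16**nt candidates: feasible only for tiny antenna counts
```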
24

O'Connor, R. Brendan. "Dataflow Analysis and Optimization of High Level Language Code for Hardware-Software Co-Design." Thesis, Virginia Tech, 1996. http://hdl.handle.net/10919/36653.

Abstract:
Recent advancements in FPGA technology have provided devices which are not only suited for digital logic prototyping, but also are capable of implementing complex computations. The use of these devices in multi-FPGA Custom Computing Machines (CCMs) has provided the potential to execute large sections of programs entirely in custom hardware which can provide a substantial speedup over execution in a general-purpose sequential processor. Unfortunately, the development tools currently available for CCMs do not allow users to easily configure multi-FPGA platforms. In order to exploit the capabilities of such an architecture, a procedure has been developed to perform a dataflow analysis of programs written in C which is capable of several hardware-specific optimizations. This, together with other software tools developed for this purpose, allows CCMs and their host processors to be targeted from the same high-level specification.

Master of Science
25

Wells, George James. "Hardware emulation and real-time simulation strategies for the concurrent development of microsatellite hardware and software." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2001. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp05/MQ62899.pdf.

26

Subramanian, Sriram. "Software Performance Estimation Techniques in a Co-Design Environment." University of Cincinnati / OhioLINK, 2003. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1061553201.

27

Vilanova, García Lluís. "Code-Centric Domain Isolation : a hardware/software co-design for efficient program isolation." Doctoral thesis, Universitat Politècnica de Catalunya, 2016. http://hdl.handle.net/10803/385746.

Abstract:
Current software systems contain a multitude of software components: from simple libraries to complex plugins and services. System security and resiliency depends on being able to isolate individual components onto separate domains. Conventional systems impose large performance and programmability overheads when isolating components. Importantly, when performance and isolation are at stake, performance often takes precedence at the expense of security and reliability. These performance and programmability overheads are rooted at the co-evolution of conventional architectures and OSs, which expose isolation in terms of a loose "virtual CPU" model. Operating Systems (OSs) expose isolation domains to users in the form of processes. The OS kernel is isolated from user code by running at a separate privileged level. At the same time, user processes are isolated from each other through the utilization of different page tables. The OS kernel then multiplexes processes across the available physical resources, providing processes the illusion of having a machine for their exclusive use. Given this virtual CPU model, processes interact through interfaces designed for distributed systems, making their programming and performance poorer. The architectural foundations used for building processes impose performance overheads in the excess of 10× and 1000× compared to a function call (for privilege level and page table switches, respectively). Even more, not all overheads can be attributed to the hardware itself, but to the inherent overheads imposed by current OS designs; the OS kernel must mediate cross-process communications through expensive Inter-Process Communication (IPC) operations, which deviate from the traditional synchronous function call semantics. Threads are bound to their creating process, and invoking functionality across processes requires costly OS kernel mediation and application developer involvement to synchronize and exchange information through IPC channels. This thesis proposes a hardware and software co-design that eliminate the overheads of process isolation, while providing a path for gradual adoption for more aggressive optimizations. That is, it allows processes to efficiently call into functions residing on other isolation domains (e.g., processes) without breaking the synchronous function call semantics. On the hardware side, this thesis proposes the CODOMs protection architecture. It provides memory and privilege protection across software components in a way that is at the same time very efficient and very flexible. This hardware substrate is then used to propose DomOS, a set of changes to the OS at the runtime and kernel layers to allow threads to efficiently and securely cross process boundaries using regular function calls. That is, a thread in one process is allowed to call into a function residing in another process without involving the OS in the critical communication path. This is achieved by mapping processes into a shared address space and eliminating IPC overheads through a combination of new hardware primitives and compile-time and run-time optimizations. IPC in DomOS is up to 24× times faster than Linux pipes, and up to 14× times faster than IPC in L4 Fiasco.OC. When applied to a multi-tier web server, DomOS performs up to 2.18× better than an unmodified Linux system, and 1.32× on average. 
On all configurations, DomOS provides more than 85% of the ideal system efficiency.
APA, Harvard, Vancouver, ISO, and other styles
28

Ying, Victor A. "Scaling sequential code with hardware-software co-design for fine-grain speculative parallelization." Thesis, Massachusetts Institute of Technology, 2019.

Find full text
Abstract:
This electronic version was submitted by the student author; the certified thesis is available in the Institute Archives and Special Collections. Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019. Cataloged from the PDF version of the thesis. Includes bibliographical references (pages 51-55). Multicores are now ubiquitous, but most programmers still write sequential code. Speculative parallelization is an enticing approach to parallelize code while retaining the ease and simplicity of sequential programming, making parallelism pervasive. However, prior speculative parallelizing compilers and architectures achieved limited speedups due to high costs of recovering from misspeculation, limited support for fine-grain parallelism, and hardware scalability bottlenecks. We present SCC, a parallelizing compiler for sequential C/C++ programs. SCC targets the recent Swarm architecture, which exposes a flexible execution model, enables fine-grain speculative parallelism, supports locality and composition, and scales efficiently. SCC introduces novel compiler techniques to exploit Swarm's features and parallelize a broader range of applications than prior work. SCC performs whole-program fine-grain parallelization, breaking applications into many small tasks of tens of instructions each, and decouples the spawning of speculative tasks to enable cheap selective aborts. SCC exploits parallelism across function calls, loops, and loop nests; performs new transformations to expose more speculative parallelism enabled by Swarm's execution model; and exploits locality across fine-grain tasks. As a result, SCC speeds up seven SPEC CPU2006 benchmarks by gmean 6.7x and by up to 29x on 36 cores, over optimized serial code. By Victor A. Ying.
APA, Harvard, Vancouver, ISO, and other styles
29

Jin, Liwu. "Hardware and software co-design in space compaction of cores-based digital circuit." Thesis, University of Ottawa (Canada), 2004. http://hdl.handle.net/10393/26670.

Full text
Abstract:
Implementing a fault-testing environment for embedded cores-based digital circuits is a challenging endeavor. This thesis aims at developing techniques for design verification and test architecture using well-known concepts of hardware and software co-design. Methods are available to ensure correct functionality, in both hardware and software, for embedded cores-based systems, but one of the most widely used and accepted approaches is design for testability. Specifically, the thesis considers applications of the built-in self-test (BIST) methodology to testing embedded cores, with specific implementations targeted at the ISCAS 85 combinational benchmark circuits. Experimental results provided in the thesis demonstrate the validity and importance of the proposed design verification and test approaches, which are based on hardware and software co-design concepts and use the Altera MAX Plus II simulation environment.
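The thesis targets ISCAS 85 circuits inside the Altera flow; purely as an illustration of the BIST idea it relies on, the following minimal C++ sketch models the two classic BIST building blocks, a linear-feedback shift register (LFSR) used as a pseudo-random pattern generator and a multiple-input signature register (MISR) used as a response compactor. The polynomial taps, the 16-bit width, and the stand-in circuit are arbitrary choices for the example, not taken from the thesis.

    #include <cstdint>
    #include <cstdio>

    // 16-bit Fibonacci LFSR used as a pseudo-random test-pattern generator.
    // The tap positions are a standard textbook choice, not taken from the thesis.
    static uint16_t lfsr_next(uint16_t s) {
        uint16_t bit = ((s >> 0) ^ (s >> 2) ^ (s >> 3) ^ (s >> 5)) & 1u;
        return static_cast<uint16_t>((s >> 1) | (bit << 15));
    }

    // MISR-style response compactor: fold each response word into a running signature.
    static uint16_t misr_next(uint16_t sig, uint16_t response) {
        return static_cast<uint16_t>(lfsr_next(sig) ^ response);
    }

    // Stand-in for the circuit under test (any combinational function would do).
    static uint16_t circuit_under_test(uint16_t pattern) {
        return static_cast<uint16_t>(pattern ^ (pattern << 1));
    }

    int main() {
        uint16_t pattern = 0xACE1u, signature = 0;
        for (int i = 0; i < 1000; ++i) {                 // apply 1000 pseudo-random patterns
            signature = misr_next(signature, circuit_under_test(pattern));
            pattern = lfsr_next(pattern);
        }
        std::printf("golden signature = 0x%04X\n", signature);
        // A fault in the circuit would, with high probability, change this signature.
        return 0;
    }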
APA, Harvard, Vancouver, ISO, and other styles
30

Silva, Junior José Cláudio Vieira e. "Verificação de Projetos de Sistemas Embarcados através de Cossimulação Hardware/Software." Universidade Federal da Paraíba, 2015. http://tede.biblioteca.ufpb.br:8080/handle/tede/7856.

Full text
Abstract:
This work proposes an environment for the verification of heterogeneous embedded systems through distributed co-simulation. Verification occurs in real time, co-simulating the system software and the hardware platform using the High Level Architecture (HLA) as middleware. The novelty of this approach is not only providing support for simulations but also allowing synchronous integration with physical hardware devices. In this work we use the Ptolemy framework as the simulation platform. The integration of HLA with Ptolemy and the hardware models opens up a vast set of applications, such as testing many devices at the same time running the same or different applications or modules, using Ptolemy for real-time control of embedded systems, and distributing the execution of different embedded devices for performance improvement. Furthermore, the HLA-based approach allows any type of robot, as well as simulators other than Ptolemy, to be connected to the environment. Case studies are presented to prove the concept, showing the successful integration between Ptolemy and the HLA and the verification of systems using hardware-in-the-loop and robot-in-the-loop.
APA, Harvard, Vancouver, ISO, and other styles
31

Liu, Tsun-Ho. "Future hardware realization of self-organizing learning array and its software simulation." Ohio : Ohio University, 2002. http://www.ohiolink.edu/etd/view.cgi?ohiou1174680878.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Rudraiah, Dakshinamurthy Amruth. "A Compiler-based Framework for Automatic Extraction of Program Skeletons for Exascale Hardware/Software Co-design." Master's thesis, University of Central Florida, 2013. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/5695.

Full text
Abstract:
The design of high-performance computing architectures requires performance analysis of large-scale parallel applications to derive various parameters concerning hardware design and software development. The process of performance analysis and benchmarking an application can be done in several ways with varying degrees of fidelity. One of the most cost-effective ways is to do a coarse-grained study of large-scale parallel applications through the use of program skeletons. The concept of a "program skeleton" that we discuss in this paper is an abstracted program that is derived from a larger program where source code that is determined to be irrelevant is removed for the purposes of the skeleton. In this work, we develop a semi-automatic approach for extracting program skeletons based on compiler program analysis. We demonstrate correctness of our skeleton extraction process by comparing details from communication traces, as well as show the performance speedup of using skeletons by running simulations in the SST/macro simulator. Extracting such a program skeleton from a large-scale parallel program requires a substantial amount of manual effort and often introduces human errors. We outline a semi-automatic approach for extracting program skeletons from large-scale parallel applications that reduces cost and eliminates errors inherent in manual approaches. Our skeleton generation approach is based on the use of the extensible and open-source ROSE compiler infrastructure that allows us to perform flow and dependency analysis on larger programs in order to determine what code can be removed from the program to generate a skeleton. M.S., Electrical Engineering and Computer Science.
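As a purely illustrative example of what "skeletonization" means (the thesis derives skeletons automatically with the ROSE infrastructure), the hand-made C++ sketch below keeps the communication structure of a time step while replacing the computation around it with a calibrated delay. The function names and the delay model are assumptions for the illustration, not output of the thesis tool.

    #include <chrono>
    #include <cstdio>
    #include <thread>
    #include <vector>

    // Hypothetical stand-in for the communication layer of a parallel code.
    static void send_halo(const std::vector<double>& buf) {
        std::printf("send %zu doubles\n", buf.size());
    }

    // Original time step: real computation followed by communication.
    static void original_step(std::vector<double>& grid) {
        for (double& x : grid) x = 0.25 * x + 1.0;   // compute kernel (irrelevant to the skeleton)
        send_halo(grid);
    }

    // Skeleton time step: the compute kernel is replaced by a calibrated delay,
    // while the communication calls and message sizes are preserved.
    static void skeleton_step(std::vector<double>& grid) {
        std::this_thread::sleep_for(std::chrono::microseconds(grid.size() / 64));
        send_halo(grid);
    }

    int main() {
        std::vector<double> grid(1 << 16, 1.0);
        original_step(grid);    // what the full application does
        skeleton_step(grid);    // what the extracted skeleton does
        return 0;
    }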
APA, Harvard, Vancouver, ISO, and other styles
33

Alhamwi, Ali. "Co-design hardware/software of real time vision system on FPGA for obstacle detection." Thesis, Toulouse 3, 2016. http://www.theses.fr/2016TOU30342/document.

Full text
Abstract:
Obstacle detection, localization, and occupancy map reconstruction are essential abilities for a mobile robot navigating an environment. Solutions based on passive monocular vision, such as simultaneous localization and mapping (SLAM) or optical flow (OF), require intensive computation, and systems based on these methods often rely on over-sized computation resources to meet real-time constraints. Inverse perspective mapping allows obstacles to be detected at a low computational cost under the hypothesis of a flat ground observed during motion; an occupancy grid map can then be built by integrating obstacle detections along the sensor's trajectory. In this work we propose a hardware/software system for obstacle detection, localization, and 2D occupancy map reconstruction in real time. The proposed system uses an FPGA-based design for vision and proprioceptive sensors for localization. Fusing these two complementary sources of information yields a simple model of the environment surrounding the sensor. The resulting architecture is a low-cost, low-latency, high-throughput, and low-power system.
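Inverse perspective mapping itself is a standard homography-based operation; the C++ sketch below shows the flat-ground back-projection for a simple pinhole camera as a software reference. The intrinsics, mounting height, and pitch in main() are arbitrary illustrative values, not the thesis calibration, and the thesis implements this datapath in FPGA logic rather than in software.

    #include <cmath>
    #include <cstdio>
    #include <optional>

    struct GroundPoint { double x; double z; };          // metres on the ground plane

    // Back-project pixel (u, v) onto a flat ground plane for a pinhole camera.
    // f: focal length in pixels, (cx, cy): principal point, h: camera height in metres,
    // pitch: downward tilt in radians.
    static std::optional<GroundPoint> inverse_perspective(double u, double v,
                                                          double f, double cx, double cy,
                                                          double h, double pitch) {
        double xc = (u - cx) / f;                        // normalised ray components
        double yc = (v - cy) / f;
        double denom = yc * std::cos(pitch) + std::sin(pitch);
        if (denom <= 1e-9) return std::nullopt;          // pixel at or above the horizon
        double t = h / denom;                            // ray length to the ground plane
        return GroundPoint{t * xc, t * (std::cos(pitch) - yc * std::sin(pitch))};
    }

    int main() {
        const double deg = 3.14159265358979 / 180.0;
        // 640x480 image, 500-pixel focal length, camera 0.5 m high, tilted 10 degrees down.
        if (auto p = inverse_perspective(400.0, 350.0, 500.0, 320.0, 240.0, 0.5, 10.0 * deg))
            std::printf("ground point: x = %.2f m, z = %.2f m\n", p->x, p->z);
        return 0;
    }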
APA, Harvard, Vancouver, ISO, and other styles
34

Liu, Ming-Lun, and 劉明倫. "Efficient Hardware/Software Co-design with System Software Co-simulation via Native Translation." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/43299696572914011209.

Full text
Abstract:
Master's thesis, National Chung Cheng University, Institute of Computer Science and Information Engineering, ROC academic year 91. In the past years, System-on-Chip (SoC) has become an industry in great demand. As chip designs reach larger gate counts and time-to-market windows shrink, the cost of simulation and verification in the hardware/software co-design process rises markedly. Slowdown is one of the most important factors in simulation, since the practicality of a simulation depends on the length of the simulation time. In this thesis, we introduce a native translator embedded in the instruction-set simulator that automatically identifies and translates the translatable basic blocks of the target program. The experimental results show that the proposed mechanism can remarkably reduce the slowdown of the instruction-set simulator. We also implemented a simple scheduler demonstrating that system software co-simulation with native translation is a practical process and can be considered a way to achieve hardware/software co-design.
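The thesis does not publish its translator's code; as a rough illustration of the general idea it describes (caching natively translated basic blocks inside an instruction-set simulator and falling back to interpretation for blocks that cannot be translated), a minimal C++ sketch might look like the following. The block discovery, the "native" code, and the interpreter are trivial stubs standing in for the simulator's decoder and code generator.

    #include <cstdint>
    #include <cstdio>
    #include <functional>
    #include <unordered_map>

    // Minimal sketch of a basic-block translation cache inside an instruction-set simulator.
    struct CpuState { uint32_t pc = 0; };

    using TranslatedBlock = std::function<uint32_t(CpuState&)>;   // returns next guest PC

    static bool is_translatable(uint32_t) { return true; }                 // stub
    static uint32_t interpret_block(CpuState& s) { return (s.pc + 16) % 256; }  // stub slow path
    static TranslatedBlock translate_block(uint32_t pc) {
        // A real translator would emit host machine code for the guest block at pc;
        // here the "native" version is just a host lambda over a toy guest program
        // that loops over 16 blocks.
        return [pc](CpuState&) { return (pc + 16) % 256; };
    }

    static void run(CpuState& s, int max_blocks) {
        static std::unordered_map<uint32_t, TranslatedBlock> cache;
        for (int i = 0; i < max_blocks; ++i) {
            auto it = cache.find(s.pc);
            if (it != cache.end()) { s.pc = it->second(s); continue; }      // fast path: reuse
            if (is_translatable(s.pc)) {
                it = cache.emplace(s.pc, translate_block(s.pc)).first;      // translate once
                s.pc = it->second(s);
            } else {
                s.pc = interpret_block(s);                                  // fall back to interpretation
            }
        }
    }

    int main() {
        CpuState cpu;
        run(cpu, 1000);
        std::printf("final pc = 0x%08X\n", cpu.pc);
        return 0;
    }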
APA, Harvard, Vancouver, ISO, and other styles
35

Lee, Chun-Yi, and 李俊億. "A Hardware-Software Co-simulation Environment for Embedded Processor Development." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/38889908583588512892.

Full text
Abstract:
Master's thesis, National Chung Cheng University, Institute of Computer Science and Information Engineering, ROC academic year 93. In today's embedded system design, performance constraints lead to systems built from a growing number of cooperating software and hardware components, which in turn increases system complexity. System verification therefore takes more effort and time than before, and verification of the complete system has become the critical bottleneck in the design process. We provide a co-simulation environment that verifies the functionality of the hardware and software components, integrates partial hardware and software components for architecture exploration, and simulates the system architecture before the system is implemented. In our environment, the hardware and software components are treated as two separate UNIX processes: the hardware is described in a hardware description language (Verilog) and the software is written in a programming language (C/C++), and they communicate through shared-memory IPC (Inter-Process Communication). In addition, we model a fast bidirectional data transfer interface bridging the hardware/software components and the shared memory region. To make the communication between components run smoothly, we establish a component backplane that coordinates the hardware/software components through the shared memory. The benefits of our co-simulation environment are its extensibility, co-simulation speedup, and convenience, which shorten both system verification time and design iteration time, with a visible effect on the verification cost and time-to-market of embedded system development.
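The thesis couples a Verilog simulator and C/C++ software as two UNIX processes exchanging data through shared-memory IPC; purely as an illustration of that mechanism, the C++ sketch below sets up a tiny POSIX shared-memory "backplane" with a request/response handshake. The field names, the region name, and the handshake protocol are assumptions for the example, not the thesis's actual backplane, and both sides are shown in one process only to keep the sketch runnable.

    #include <atomic>
    #include <cstdint>
    #include <cstdio>
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    // Layout of the shared "backplane" region that both processes would map.
    struct Backplane {
        std::atomic<uint32_t> request;    // software sets to 1 when a transaction is posted
        std::atomic<uint32_t> response;   // simulator sets to 1 when it has been served
        uint32_t address;
        uint32_t data;
    };

    static void software_side(Backplane* bp) {          // runs in the C/C++ software process
        bp->address = 0x40000000u;
        bp->data = 0xDEADBEEFu;
        bp->request.store(1, std::memory_order_release);
    }

    static void simulator_side(Backplane* bp) {         // runs in the HDL-simulator process
        if (bp->request.load(std::memory_order_acquire) == 1) {
            std::printf("simulator saw write of 0x%08X to 0x%08X\n", bp->data, bp->address);
            bp->response.store(1, std::memory_order_release);
        }
    }

    int main() {
        // Both processes would shm_open() the same name; the name is arbitrary here.
        int fd = shm_open("/cosim_backplane", O_CREAT | O_RDWR, 0600);
        if (fd < 0 || ftruncate(fd, sizeof(Backplane)) != 0) return 1;
        void* mem = mmap(nullptr, sizeof(Backplane), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (mem == MAP_FAILED) return 1;
        Backplane* bp = static_cast<Backplane*>(mem);    // region starts zero-filled

        software_side(bp);
        simulator_side(bp);

        munmap(mem, sizeof(Backplane));
        close(fd);
        shm_unlink("/cosim_backplane");
        return 0;
    }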
APA, Harvard, Vancouver, ISO, and other styles
36

Chang, Keng-Chia, and 張耿嘉. "Adaboost-based Hardware Accelerator DIP Design and Hardware/Software Co-simulation for Face Detection." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/96k2r2.

Full text
Abstract:
Master's thesis, National Chung Hsing University, Department of Electrical Engineering, ROC academic year 106. In recent years, many car accidents caused by fatigued driving have occurred, so scholars and experts all over the world have devoted great effort to this issue and are developing detection technologies to reduce accidents caused by driver drowsiness. For fatigue detection, the driver's alertness can be evaluated from eye-blinking behavior. The proposed design therefore implements a hardware accelerator, on a hardware/software co-design platform, to process the large amount of highly repetitive data in a drowsiness detection system. By recognizing accurate facial and eye positions, the proposed eye detection methodology with hardware acceleration enhances the efficiency of driver fatigue detection. The system includes four parts: face detection, eyeglasses-bridge detection, eye detection, and eye-closure detection. First, the input images are captured by an NIR camera with 720x480 resolution. The system uses gray-scale images without any color information in all steps, and the proposed design works effectively both in daytime and at night. Second, for face detection, the system uses a machine learning method to detect the face position and face size, and the geometric information of the face is used to reduce the search range for the driver's eyes. In this thesis, the proposed design uses an Adaboost-based hardware accelerator for face detection; once the face size and position are known, the system can narrow the search range for the eyes. The hardware accelerator architecture for face detection is the main contribution of the thesis, and the accelerator has an expandable classifier structure: if the system needs a more complicated machine learning classifier, the hardware-based classifier can be conveniently expanded and improved in future work. In the experimental results, the proposed hardware accelerator achieves an average processing rate of 331 frames/sec in a 90-nanometer CMOS technology, meeting the goal of real-time applications. Furthermore, the hardware architecture of the face classifier can be expanded for a more complex training module, and the input image size and the complexity of the face classifier can be adjusted to improve system accuracy while still meeting real-time requirements.
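The accelerator evaluates an AdaBoost cascade in hardware; as a software-level illustration of the arithmetic such a classifier performs, the C++ sketch below builds an integral image and evaluates one Haar-like feature with it. The feature geometry and the threshold are made up for the example, not taken from the trained model in the thesis.

    #include <cstdint>
    #include <cstdio>
    #include <vector>

    // Integral image: ii(x, y) holds the sum of all pixels above and to the left of (x, y).
    struct IntegralImage {
        int w, h;
        std::vector<long long> ii;                        // (w + 1) x (h + 1), row-major
        IntegralImage(const std::vector<uint8_t>& img, int w_, int h_)
            : w(w_), h(h_), ii(static_cast<size_t>(w_ + 1) * (h_ + 1), 0) {
            for (int y = 1; y <= h; ++y)
                for (int x = 1; x <= w; ++x)
                    ii[y * (w + 1) + x] = img[(y - 1) * w + (x - 1)]
                                        + ii[(y - 1) * (w + 1) + x]
                                        + ii[y * (w + 1) + x - 1]
                                        - ii[(y - 1) * (w + 1) + x - 1];
        }
        // Sum over the rectangle [x, x + rw) x [y, y + rh) with four lookups.
        long long rect(int x, int y, int rw, int rh) const {
            return ii[(y + rh) * (w + 1) + x + rw] - ii[y * (w + 1) + x + rw]
                 - ii[(y + rh) * (w + 1) + x] + ii[y * (w + 1) + x];
        }
    };

    int main() {
        std::vector<uint8_t> window(24 * 24, 128);        // dummy 24x24 detection window
        IntegralImage ii(window, 24, 24);
        // One two-rectangle (edge-type) Haar-like feature: top half minus bottom half.
        long long feature = ii.rect(4, 4, 16, 8) - ii.rect(4, 12, 16, 8);
        const long long threshold = 500;                  // illustrative, not a trained value
        std::printf("feature = %lld -> %s\n", feature, feature > threshold ? "face-like" : "reject");
        return 0;
    }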
APA, Harvard, Vancouver, ISO, and other styles
37

Chang, Po-Hao, and 張博豪. "Downlink-LTE-Based Transceiver Implementation by Using Software-Hardware Co-Simulation." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/63845361040217816705.

Full text
Abstract:
Master's thesis, Chung Yuan Christian University, Master's Program in Communication Engineering, ROC academic year 101. In this thesis, we employ a hardware-software co-simulation scheme to implement some critical blocks of the LTE downlink transceiver in frequency-division duplexing (FDD) mode. We adopt the Simulink® and System Generator® tools for rapid prototyping of the system. We first write MATLAB code for each building block and simulate each block's function as a design reference. Then we simulate the individual blocks and the whole system using System Generator®. In the transmitter, we design a packet control circuit that generates the control signals of the transmitted signal frame. We then implement a pseudo-noise binary sequence generator and quadrature-phase-shift-keying (QPSK) symbol-mapping circuits, and use the inverse fast Fourier transform (IFFT) module of System Generator® to generate the orthogonal frequency-division-multiplexing (OFDM) signal. For the channel, we use the multipath fading channel module to generate a frequency-selective channel. At the receiver, we implement a packet detection module and fractional carrier-frequency-offset estimation and correction modules for timing and frequency synchronization. Next, we implement a DFT-based channel estimator module and use the FFT module to realize the downlink LTE demodulator. Finally, for hardware verification, we convert the LTE transmitter and receiver into bitstream files using the ISE design suite and download them onto the WARP software-defined radio platform to verify the correctness of the downlink LTE-based system.
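Most of the blocks listed above are standard; as a tiny software reference for one of them, the C++ sketch below maps a pseudo-random bit stream onto Gray-coded QPSK symbols. The 1/sqrt(2) scaling and the bit-to-quadrant convention are common choices assumed here, not values copied from the thesis's System Generator model.

    #include <cmath>
    #include <complex>
    #include <cstdio>
    #include <random>
    #include <vector>

    // Map pairs of bits to Gray-coded QPSK symbols with unit average energy.
    static std::vector<std::complex<double>> qpsk_map(const std::vector<int>& bits) {
        const double a = 1.0 / std::sqrt(2.0);
        std::vector<std::complex<double>> symbols;
        for (size_t i = 0; i + 1 < bits.size(); i += 2) {
            double re = bits[i]     ? -a : a;             // first bit selects the I sign
            double im = bits[i + 1] ? -a : a;             // second bit selects the Q sign
            symbols.emplace_back(re, im);
        }
        return symbols;
    }

    int main() {
        std::mt19937 prng(1);                             // stand-in for the PN bit generator
        std::vector<int> bits(16);
        for (int& b : bits) b = static_cast<int>(prng() & 1u);
        for (const auto& s : qpsk_map(bits))
            std::printf("(% .3f, % .3f)\n", s.real(), s.imag());
        return 0;
    }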
APA, Harvard, Vancouver, ISO, and other styles
38

Tsai, An-jie, and 蔡安捷. "Enhancing QEMU into an integrated Hardware/Software Co-simulation Tool for SoC Platform." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/60431007259217167248.

Full text
Abstract:
Master's thesis, National Cheng Kung University, Department of Engineering Science, ROC academic year 96. The development of hardware modules and software components is usually separated during the development of an SoC platform. There are many discussions about hardware/software co-work, but they mostly address planning in the initial period of development. We therefore focus on software/hardware co-simulation during the development process and propose an emulation framework for software/hardware co-simulation that combines an emulator for the embedded platform with a simulator for the hardware description language (HDL) design. The major goal of the framework is to minimize the modifications needed for software to execute successfully on both the simulator and the real SoC target platform, so that bugs in software and hardware can be found quickly and more attention can be paid to optimizing the desired system.
APA, Harvard, Vancouver, ISO, and other styles
39

Yu, Hsiao-Ting, and 于曉婷. "Hardware-Software Co-simulation Platform Development for Baseband Transceiver:A Case Study on ZigBee System." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/56055756951441451781.

Full text
Abstract:
Master's thesis, Southern Taiwan University of Science and Technology, Institute of Communication Engineering, ROC academic year 100. The semiconductor and IC design industry has recently developed quickly. In general, IC design and implementation include several processes, such as system design, system function verification by hardware-software simulation, circuit design, and chip fabrication. In the system function verification stage in particular, a hardware-software co-simulation platform can provide fast function verification and shorten the IC design and implementation schedule. Wireless communication networking technology has also developed quickly in recent years. Aiming at the possible needs of wireless communication network IC design, in this thesis we establish a hardware-software co-simulation platform for wireless communication baseband transmission, with ZigBee as a case study. First, we study the ZigBee standard developed by the ZigBee Alliance and research the ZigBee baseband transceiver. Then we build the ZigBee baseband transmission simulation platform using the Simulink/System Generator built-in library; the platform provides both system-level and hardware-level simulations. Finally, we use the hardware-software co-simulation function in System Generator to complete the hardware-software co-simulation platform for ZigBee baseband transmission. The developed platforms are used to simulate and analyze the performance of the ZigBee baseband transmission system at the system level and the hardware level, and the simulation results verify the usability of the developed hardware-software co-simulation platform. The platform will be useful for researchers and teachers who work on the ZigBee system or other wireless communication systems.
APA, Harvard, Vancouver, ISO, and other styles
40

Lin, Huang-Cang, and 林煌翔. "On Software/Hardware Co-Design of FFT." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/03182064851667338426.

Full text
Abstract:
Master's thesis, National Chiao Tung University, in-service Master's program, College of Electrical Engineering and Computer Science, ROC academic year 93. In this thesis, we propose a new platform for software/hardware co-design of the FFT based on the SID hardware simulation software with an ARM processor simulation core. With this platform, we compare different hardware structures and analyze their efficiency, cost, and speed improvements. Experiments show that the platform provides a very good simulation environment for system designers, and area and timing optimization of the hardware FFT can be achieved easily.
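The thesis compares hardware FFT structures on the SID/ARM platform; as a behavioural reference for the transform those structures implement, here is a compact iterative radix-2 decimation-in-time FFT in C++, the kind of golden model a hardware FFT is usually checked against. It is an illustration only, not the thesis's code.

    #include <cmath>
    #include <complex>
    #include <cstdio>
    #include <vector>

    // In-place iterative radix-2 decimation-in-time FFT; the length must be a power of two.
    static void fft(std::vector<std::complex<double>>& a) {
        const size_t n = a.size();
        for (size_t i = 1, j = 0; i < n; ++i) {           // bit-reversal permutation
            size_t bit = n >> 1;
            for (; j & bit; bit >>= 1) j ^= bit;
            j |= bit;
            if (i < j) std::swap(a[i], a[j]);
        }
        const double pi = 3.14159265358979323846;
        for (size_t len = 2; len <= n; len <<= 1) {       // butterfly stages
            std::complex<double> wlen(std::cos(-2.0 * pi / len), std::sin(-2.0 * pi / len));
            for (size_t i = 0; i < n; i += len) {
                std::complex<double> w(1.0, 0.0);
                for (size_t k = 0; k < len / 2; ++k) {
                    std::complex<double> u = a[i + k];
                    std::complex<double> v = a[i + k + len / 2] * w;
                    a[i + k] = u + v;
                    a[i + k + len / 2] = u - v;
                    w *= wlen;
                }
            }
        }
    }

    int main() {
        std::vector<std::complex<double>> x = {1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0};  // 8-point test vector
        fft(x);
        for (const auto& c : x) std::printf("% .3f %+.3fj\n", c.real(), c.imag());
        return 0;
    }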
APA, Harvard, Vancouver, ISO, and other styles
41

Huang, Shih-tung, and 黃士桐. "Hardware/software co-verification for processor-OpenOCD integration." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/96045069838755019105.

Full text
Abstract:
Master's thesis, National Sun Yat-sen University, Department of Computer Science and Engineering, ROC academic year 101. RVDS [18] (RealView Development Suite) and Multi-ICE (a protocol converter) are commonly used as the ARM program debug environment, controlling the CPU through the ICE module. An alternative is the debug tool chain combining Eclipse, GDB, and OpenOCD (Open On-Chip Debugger), where program debugging is achieved through an FT2232-based protocol converter produced by FTDI that connects the ICE module to the JTAG port. For an ARM7-like CPU, namely SYS32TM, the ICE can control the CPU module developed in our laboratory. Neither ARM RVDS nor OpenOCD, however, can observe the interaction between the debugger and the ICE, which makes verifying the ICE together with the debugger difficult. To solve this problem, we considered using the PLI (Programming Language Interface) to communicate with the RTL simulator and then connect to OpenOCD, or using the co-simulation environment of Platform Architect itself connected to OpenOCD. We finally chose the Platform Architect environment and adopted JTAGSC [4], a protocol converter from Embecosm application note EAN5, which can be placed into the Platform Architect environment. To achieve co-verification, we used a shared-memory mechanism for communication between OpenOCD and the Platform Architect environment containing JTAGSC.
APA, Harvard, Vancouver, ISO, and other styles
42

Lee, Yuan-Cheng, and 李沅臻. "Optimizing Memory Virtualization through Hardware/Software Co-design." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/ujxkag.

Full text
Abstract:
Doctoral dissertation, National Taiwan University, Graduate Institute of Networking and Multimedia, ROC academic year 105. Virtualization is a technology enabling consolidation of multiple operating systems into a single physical machine. It originated from the need to create a multi-user time-sharing operating system based on multiple single-user operating systems. This long-lasting technology has evolved constantly. In addition to the popular applications of server-side virtualization, the advances in the capabilities of embedded processors make virtualization available on a much wider range of systems than before. The diversity of the target systems demands new design approaches that consider the characteristics of those systems. In this dissertation, we propose the idea of optimizing virtualization environments through hardware/software co-design, and demonstrate the potential power of hardware/software co-design through the development of a new optimization technique for memory virtualization. Based on the existing studies, we recognize the memory subsystem as a major bottleneck of a virtualization environment. Therefore, we concentrate our efforts on optimizing memory virtualization for a specific type of virtualization environment as a working example. We first present a quantitative analysis of the impacts of memory virtualization. We then propose an optimized memory virtualization technique along with a comprehensive evaluation, including a qualitative analysis with a formal proof and a quantitative analysis based on software emulation and hardware simulation. The results suggest that the proposed technique outperforms the existing technique. The research points out that hardware/software co-design is a promising direction for optimizing virtualization for the emerging applications.
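One way to make the claim about the memory subsystem concrete is the textbook cost of a TLB miss under two-dimensional (nested) paging, where every guest page-table level must itself be translated through the host page table. With g guest levels and h host levels, the worst-case walk touches

    (g + 1)(h + 1) - 1  =  (4 + 1)(4 + 1) - 1  =  24

memory references for four-level tables on both sides, versus 4 references for a native walk. This count is the standard figure for hardware-assisted nested paging, quoted here only as background; it is not a result reported by the dissertation.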
APA, Harvard, Vancouver, ISO, and other styles
43

Lee, Jen-Chieh, and 李仁傑. "Hardware/Software Co-design for Image Object Segmentation." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/67040552908716287125.

Full text
Abstract:
Master's thesis, National Yunlin University of Science and Technology, Graduate School of Electronic and Information Engineering, ROC academic year 95. In modern systems, image processing is becoming more and more important, and binarization is the most commonly used operation in image-processing technology. Traditional binarization, however, is easily affected by illumination changes, and current image processing is usually implemented purely in software. How to improve the effectiveness of the image-processing step and design hardware to reduce its timing cost therefore becomes the focus. In this thesis, we propose an improved automatic thresholding algorithm that addresses the insufficient performance of the traditional algorithm, and we also implement this algorithm as a VLSI architecture. We implement two different architectures that reach 100 MHz and 250 MHz, respectively, in the UMC 0.18 um technology. To verify the system completely, we introduce the ARM Integrator AP platform to apply the concept of hardware/software co-design. We design AHB slave and AHB master interfaces and integrate them with our automatic thresholding circuit, memory controller, CMOS sensor controller, VGA DAC controller, and other system devices. To optimize the system, we provide AHB DMA devices to improve system performance. On the software side, we use the U-Boot bootloader to boot the system, adopt BusyBox and SysV init as the operating system's command set and initialization procedure, respectively, and use embedded Linux as the operating system of the verification platform. We design Linux drivers for our hardware devices. To verify the software/hardware co-design completely, we build an image-processing software flow on top of our hardware system to implement a motion detection system, which can process 7 frames per second.
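The thesis's improved automatic thresholding algorithm is its own contribution and is not reproduced here; as a reference point for what an automatic threshold computation looks like, the C++ sketch below implements the classical Otsu method (maximising between-class variance over a 256-bin histogram), which is the usual baseline such work improves on. The synthetic input in main() is illustrative only.

    #include <cstdint>
    #include <cstdio>
    #include <vector>

    // Classical Otsu threshold over an 8-bit grey-scale image.
    static int otsu_threshold(const std::vector<uint8_t>& img) {
        double hist[256] = {0};
        for (uint8_t p : img) hist[p] += 1.0;
        const double total = static_cast<double>(img.size());

        double sum_all = 0.0;
        for (int i = 0; i < 256; ++i) sum_all += i * hist[i];

        double sum_b = 0.0, w_b = 0.0, best_var = -1.0;
        int best = 0;
        for (int t = 0; t < 256; ++t) {
            w_b += hist[t];                      // background weight
            if (w_b == 0.0) continue;
            double w_f = total - w_b;            // foreground weight
            if (w_f == 0.0) break;
            sum_b += t * hist[t];
            double m_b = sum_b / w_b, m_f = (sum_all - sum_b) / w_f;
            double between_var = w_b * w_f * (m_b - m_f) * (m_b - m_f);
            if (between_var > best_var) { best_var = between_var; best = t; }
        }
        return best;
    }

    int main() {
        std::vector<uint8_t> img;
        for (int i = 0; i < 1000; ++i) img.push_back(i % 2 ? 40 : 200);   // synthetic bimodal image
        std::printf("Otsu threshold = %d\n", otsu_threshold(img));
        return 0;
    }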
APA, Harvard, Vancouver, ISO, and other styles
44

Yung-Tai, Hsu. "A Co-simulation based Hardware/Software Deign and Verification methodology using User Mode Linux and SystemC." 2006. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0001-3006200615031500.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Hsu, Yung-Tai, and 許永泰. "A Co-simulation based Hardware/Software Deign and Verification methodology using User Mode Linux and SystemC." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/54792146757812327838.

Full text
Abstract:
Master's thesis, National Taiwan University, Graduate Institute of Electrical Engineering, ROC academic year 94. This thesis presents a pure software framework that can be used to assist electronic system-level design. The proposed solution relies on the interaction between a User Mode Linux virtual machine, which abstracts the model of the real programmable device on which the embedded software will run, and hardware devices simulated in SystemC. In this way, designers are able to program and validate embedded software, as well as the device driver, in the early stages of the design flow. A driver-based data access mechanism makes data transmission between the simulated hardware and the software possible, and the synchronization mechanism is mainly based on GDB and signals in SystemC. The integration of the entire system is done at run time. This work offers a new way to take the impact of the operating system into consideration in electronic system-level design and should therefore improve correctness in the design phase.
APA, Harvard, Vancouver, ISO, and other styles
46

Choi, Jongsok. "Enabling Hardware/Software Co-design in High-level Synthesis." Thesis, 2012. http://hdl.handle.net/1807/33380.

Full text
Abstract:
A hardware implementation can bring orders of magnitude improvements in performance and energy consumption over a software implementation. Hardware design, however, can be extremely difficult. High-level synthesis, the process of compiling software to hardware, promises to make hardware design easier. However, compiling an entire software program to hardware can be inefficient. This thesis proposes hardware/software co-design, where computationally intensive functions are accelerated by hardware, while remaining program segments execute in software. The work in this thesis builds a framework where user-designated software functions are automatically compiled to hardware accelerators, which can execute serially or in parallel to work in tandem with a processor. To support multiple parallel accelerators, new multi-ported cache designs are presented. These caches provide low-latency high-bandwidth data to further improve the performance of accelerators. An extensive range of cache architectures are explored, and results show that certain cache architectures significantly outperform others in a processor/accelerator system.
APA, Harvard, Vancouver, ISO, and other styles
47

"Application hardware-software co-design for reconfigurable computing systems." THE GEORGE WASHINGTON UNIVERSITY, 2008. http://pqdtopen.proquest.com/#viewpdf?dispub=3297468.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Wang, Yu-Min, and 王裕閔. "Hardware/Software Co-Synthesis Of An Industrial Embedded Microcontroller." Thesis, 1997. http://ndltd.ncl.edu.tw/handle/90861545925801770342.

Full text
Abstract:
Master's thesis, National Sun Yat-sen University, Institute of Computer Science and Engineering, ROC academic year 85. Due to time-to-market pressure and the lack of appropriate CAD tools and design methodology, designers of industrial embedded microcontrollers often miss the opportunity to systematically analyze the architectural properties of their designs and explore hardware and software alternatives for future upgrades. To satisfy this industrial need, we develop a hardware/software co-synthesis tool (PIPER-II) for microcontrollers. The synthesis tool accepts the instruction set architecture (behavioral) specification as input and produces pipelined RTL designs as output, which can be synthesized by RT-level synthesis tools such as the Synopsys Design Compiler(TM) to continue the synthesis flow. We show how our synthesis approach synthesizes the microcontrollers and helps to evaluate design quality, analyze architectural properties, and explore possible architectural improvements and their impact on both hardware and software.
APA, Harvard, Vancouver, ISO, and other styles
49

Lo, Sheng-Hsin, and 羅聖心. "Hardware and Software Co-design of IPsec Database Query." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/06182288128288495542.

Full text
Abstract:
Master's thesis, National Cheng Kung University, Institute of Computer and Communication Engineering, ROC academic year 100. With the popularity of the Internet, confidentiality requirements for Internet traffic have become more critical. The IETF has proposed IP security (IPsec) to provide encryption/decryption and authentication services without changing the current network architecture. Once IPsec is enabled, every transmitted or received packet must query the IPsec databases. As network speeds increase, software searching of the IPsec databases may become the critical path. The purpose of this thesis is to describe and analyze a database structure and its query flow for IPsec, and to propose a database searching algorithm for the Security Policy Database and the Security Association Database. To accelerate IPsec database queries, hardware acceleration is applied together with software searching. We evaluate three designs: a scratchpad memory, a hardware cache, and a software cache. We use the SystemC language to implement our design on an ESL virtual platform with an ARM processor. The proposed design is implemented in Platform Architect and provides an on-line verification environment. Compared to software searching with 256 security policies, the software cache reduces query time by 83.54%, the hardware cache by 85.89%, and the scratchpad memory by 83.87%. We find that the efficiency of the software cache is nearly equal to that of the hardware cache while incurring less cost.
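The thesis evaluates scratchpad, hardware-cache, and software-cache variants; as an illustration of the software-cache idea alone, the C++ sketch below memoises Security Policy Database lookups in a small direct-mapped table keyed by a hash of the traffic selector. The selector fields, the table size, the hash, and the placeholder SPD search are arbitrary choices for the example, not the thesis's design.

    #include <array>
    #include <cstdint>
    #include <cstdio>

    struct Selector {                       // simplified IPsec traffic selector
        uint32_t src_ip, dst_ip;
        uint16_t src_port, dst_port;
        uint8_t  proto;
        bool operator==(const Selector& o) const {
            return src_ip == o.src_ip && dst_ip == o.dst_ip && src_port == o.src_port &&
                   dst_port == o.dst_port && proto == o.proto;
        }
    };

    enum class Policy { Bypass, Discard, Protect };

    // Placeholder for the full (slow) SPD search: protect TCP traffic to port 443.
    static Policy search_spd(const Selector& s) {
        return s.dst_port == 443 ? Policy::Protect : Policy::Bypass;
    }

    struct SpdCache {
        struct Entry { bool valid = false; Selector key{}; Policy policy = Policy::Bypass; };
        std::array<Entry, 64> table{};      // small direct-mapped software cache

        static size_t slot(const Selector& s) {
            uint32_t h = s.src_ip ^ (s.dst_ip * 2654435761u) ^
                         (static_cast<uint32_t>(s.src_port) << 16) ^ s.dst_port ^ s.proto;
            return h % 64;
        }
        Policy lookup(const Selector& s) {
            Entry& e = table[slot(s)];
            if (e.valid && e.key == s) return e.policy;   // hit: skip the database walk
            Policy p = search_spd(s);                     // miss: do the full search
            e = Entry{true, s, p};                        // fill the slot
            return p;
        }
    };

    int main() {
        SpdCache cache;
        Selector s{0x0A000001u, 0xC0A80001u, 40000, 443, 6};
        std::printf("first lookup:  %d\n", static_cast<int>(cache.lookup(s)));   // miss, fills the cache
        std::printf("second lookup: %d\n", static_cast<int>(cache.lookup(s)));   // hit
        return 0;
    }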
APA, Harvard, Vancouver, ISO, and other styles
50

Liu, Chih-Wen, and 劉智文. "Energy Efficient Hardware / Software Co-scheduling in Reconfigurable Systems." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/02529361493267192407.

Full text
Abstract:
Master's thesis, National Chung Cheng University, Institute of Computer Science and Information Engineering, ROC academic year 94. Great advances in technology have made electronic system applications more and more complex, and at the same time we must consider the high power consumption these complex applications cause. An embedded system may consist of software and hardware components. If a functional unit is implemented in software, it has high flexibility but poor performance; if it is implemented in hardware, it usually has high performance along with poor flexibility and high cost. When performance, flexibility, and cost are all taken into account, a reconfigurable system is a feasible solution: the system can be dynamically reconfigured for different applications, providing both flexibility and performance. However, if the system is reconfigured frequently without proper resource management, performance may degrade and significant energy may be consumed, neutralizing the benefits brought by reconfigurable systems. Generally speaking, previous work addressed static scheduling in such systems, but not all system behavior can be anticipated at design time. In this work, we present a dynamic hardware/software co-scheduling method that focuses on reducing the energy consumption caused by frequent reconfiguration. We integrate processor dynamic voltage scaling (DVS) into the proposed method to maximize configuration reuse, thereby reducing both the delay and the energy consumption caused by reconfiguration. Finally, we randomly generate numerous cases to demonstrate the scheduling quality of the proposed method.
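The energy argument behind combining DVS with configuration reuse can be made concrete with the usual CMOS dynamic-energy model; the numbers below are illustrative and not taken from the thesis. For a task of N cycles executed at supply voltage V with effective switched capacitance C_eff, the dynamic energy is approximately

    E ~= N * C_eff * V^2

so if the slack created by reusing an already-loaded configuration lets the processor drop from 1.2 V to 0.9 V, the dynamic energy of that task falls by roughly 1 - (0.9 / 1.2)^2 ~= 44%, provided the longer run time does not trigger extra reconfigurations, whose own energy cost the scheduler must account for.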
APA, Harvard, Vancouver, ISO, and other styles