Theses:

1

Namork, Magnus Krokum. "Network on Chip for FPGA : Development of a test system for Network on Chip". Thesis, Norges teknisk-naturvitenskapelige universitet, Institutt for elektronikk og telekommunikasjon, 2011. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-13654.

Abstract

Testing and verification of digital systems is an essential part of product development. The Network on Chip (NoC), as a new paradigm within interconnections, has a specific need for testing. This is to determine how the performance and properties of the NoC compare to the requirements of different systems, such as processors or media applications.

A NoC has been developed within the AHEAD project to form a basis for a reconfigurable platform used in the AHEAD system. This report gives an outline of the project to develop testing and benchmarking systems for a NoC. The specific work has concerned the development of a generic module that is connected to the NoC and is capable of testing the NoC's properties. The test system was initiated by Ivar Ersland in 2009 and developed further by Andreas Hepsø and Magnus Namork in the fall of 2010. The functionality and systems that are implemented are the following:

• Fully functional hardware/software interface which defines communication between the NoC and the user
• Reactive system which responds to interaction based on package information
• MPEG example system that mimics an MPEG data stream
• Software reconfiguration of the traffic tables by sending specific packages to the system
• Cell processor example application to test simple computation and communicating modules on the network

The systems have been tested successfully, verified and implemented on a Xilinx Spartan FPGA. A software system written in C has also been developed to read and interpret data from the network in on-chip tests. In total, these implementations have been the foundation for building a benchmarking platform for the NoC.

2

Ljungberg, Jan. "SYSTEM ON CHIP : Fördelar i konstruktion med system on chip i förhållande till fristående FPGA och processor". Thesis, Tekniska Högskolan, Högskolan i Jönköping, JTH, Data- och elektroteknik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:hj:diva-28263.

Abstract

In this degree project, an investigation was carried out to determine what gains can be made by replacing a bus between two chips, an FPGA and a processor, with an internal bus implemented on a single chip, a System on Chip. The work is based on measurements made in real time in Xilinx's development tools on different buses, AXI4 and AXI4-Lite, connected internally to AXI3. The port used is the FPGA's own GP port. Besides measuring transaction times, physical aspects such as space, cost and development time were also investigated. Based on these criteria, a comparison with the original design was made to determine which benefits can be achieved. The work produced a number of results that are compared against the original design. The System on Chip turned out to be the better solution in most cases. When using the AXI4-Lite bus, the benefits were not as obvious. Cosmic radiation, temperature and humidity are beyond the scope of this investigation. The hypothetico-deductive method was used to try to prove that the System on Chip is faster than the original design. In this method, three statements are set against each other: one statement that is assumed to be true, one statement that contradicts it, and a conclusion stating what has been shown. The pre-study indicated that the System on Chip is a faster solution than the original design. The method is useful since it shows that the pre-study is comparable to the measured results.
I detta examensarbete har undersökningar gjorts för att fastställa vilka vinster som går att göra genom att byta en internbuss mellan två chip, en FPGA och en processor, mot en intern buss implementerat på ett enda chip, System on Chip. Arbetet bygger på mätningar gjorda i realtid i Xilinx utvecklingsverktyg på olika bussar, AXI4 och AXI4‑Lite som är kopplade internt mot AXI3. Den port som används är FPGAs egen GP‑port. Förutom att mäta överföringshastigheterna, har även fysiska aspekter som utrymme, kostnader och utvecklingstid undersökts. Utifrån dessa kriterier har en jämförelse gjorts med den befintliga konstruktionen för att fastställa vilka vinster som går att uppnå. Arbetet har resulterat i ett antal resultat som är ställda mot de förutsättningar som fanns i den ursprungliga lösningen. I de flesta fall visar resultatet att ett System on Chip är en bättre lösning. De fall som var tveksamma var vid viss typ av överföring med AXI4‑Lite bussen. I arbetet har inte undersökning av kosmisk strålning, temperatur eller luftfuktighet betraktas. I arbetet med att försöka att bevisa att ett System on Chip är snabbare än den ursprungliga uppsättningen har utvecklingsmetoden hypotetisk deduktiv använts. Denna metod bygger på att man från början sätter upp ett påstående, som man förutsätter är sant, följt av en konjunktion, som inte får inträffa, för att slutligen dra en slutsats, som konstaterar fakta. Eftersom fakta som lästes in i början av arbetet pekade på att ett System on Chip var en snabbare och billigare lösning kändes metoden användbar. Under arbetets gång har det visat sig vara en bra metod som också ger ett resultat där sannolikheten för att det är en snabbare lösning ökar. Däremot säger inte metoden att det är helt säkert att den i alla situationer är bättre, vilket kan ändras om man använder andra förutsättningar eller tar med andra aspekter.

3

Bretz, Daniel. "Digitales Diktiergerät als System-on-a-Chip mit FPGA-Evaluierungsboard". [S.l. : s.n.], 2001. http://www.bsz-bw.de/cgi-bin/xvms.cgi?SWB9033538.

4

Yabarrena, Jean Mimar Santa Cruz. "Tecnologias system on chip e CAN em sistemas de controle distribuído". Universidade de São Paulo, 2006. http://www.teses.usp.br/teses/disponiveis/18/18149/tde-31072006-203757/.

Abstract

Sistemas de controle precisam trabalhar com restrições temporais rigorosas para garantir seu correto funcionamento, sendo por isso considerados sistemas de tempo-real. Quando tais sistemas são distribuídos, as redes de sensores, atuadores e controladores estão interligados em geral, por redes de campo. Nesse contexto, as redes de campo desempenham um papel extremamente importante no comportamento global do sistema. O presente trabalho de pesquisa apresenta a descrição do processo de desenvolvimento de um system on-chip (SoC) para um sistema de controle. Diferentemente das abordagens clássicas, o trabalho está focado em implementar o sistema baseado em um paradigma diferenciado, baseado em lógica reprogramável. Apresenta-se o projeto e construção dos IP cores necessários para controlar um motor DC, utilizando o barramento control area network (CAN) para obter uma plataforma distribuída. A arquitetura on chip utilizada está baseada na especificação CoreConnect da IBM. São expostos, ainda, trabalhos de simulação tanto dos componentes isolados, como do sistema integrado, de forma a realizar uma comparação qualitativa do processo de desenvolvimento
Control systems must work under strict time constraints to function properly and are therefore considered real-time systems. When such systems are distributed, controllers, sensors, and actuators are generally interconnected by fieldbuses. In this context, the fieldbuses play an important role in the global behavior of the system. This research describes the development process of a system-on-chip (SoC) for a control system. Unlike classical approaches, this work focuses on implementing the system in reprogrammable logic. It presents the design and construction of the IP cores needed to control a DC motor, using a Controller Area Network (CAN) bus to obtain a distributed platform. The on-chip architecture is based on the IBM CoreConnect specification. Simulations of both the isolated components and the integrated system are also presented, in order to make a qualitative comparison of the development process.

5

Zhou, Yuteng. "Computer Vision System-On-Chip Designs for Intelligent Vehicles". Digital WPI, 2018. https://digitalcommons.wpi.edu/etd-dissertations/162.

Abstract

Intelligent vehicle technologies, which can enhance road safety, improve transport efficiency, and aid driver operations through sensors and intelligence, are growing rapidly. The advanced driver assistance system (ADAS) is a common platform for intelligent vehicle technologies. Many sensors, such as LiDAR, radar, and cameras, have been deployed on intelligent vehicles. Among these sensors, optical cameras are the most widely used due to their low cost and easy installation. However, most computer vision algorithms are complicated and computationally slow, making them difficult to deploy on power-constrained systems. This dissertation investigates several mainstream ADAS applications and proposes corresponding efficient digital circuit implementations for them. It presents three ways of dividing algorithms between software and hardware for three ADAS applications: lane detection, traffic sign classification, and traffic light detection. By using an FPGA to offload critical parts of the algorithm, the entire computer vision system is able to run in real time while maintaining low power consumption and a high detection rate. Keeping pace with the advent of deep learning in computer vision, we also present two deep-learning-based hardware implementations on application-specific integrated circuits (ASICs) to achieve even lower power consumption and higher accuracy. The real-time lane detection system is implemented on the Xilinx Zynq platform, which has a dual-core ARM processor and FPGA fabric. The Xilinx Zynq platform integrates the software programmability of an ARM processor with the hardware programmability of an FPGA. For the lane detection task, the FPGA handles the majority of the work: region-of-interest extraction, edge detection, image binarization, and the Hough transform. The ARM processor then takes in the Hough transform results and highlights lanes using the Hough peaks algorithm. 
The entire system is able to process a 1080P video stream at a constant speed of 69.4 frames per second, achieving real-time capability. An efficient system-on-chip (SoC) design which classifies up to 48 traffic signs in real time is presented in this dissertation. The traditional histogram of oriented gradients (HoG) and support vector machine (SVM) are proven to be very effective for traffic sign classification, with an average accuracy rate of 93.77%. For traffic sign classification, the biggest challenge comes from the low execution efficiency of the HoG on embedded processors. By dividing the HoG algorithm into three fully pipelined stages, as well as leveraging extra on-chip memory to store intermediate results, we achieved a throughput of 115.7 frames per second at 1080P resolution. The proposed generic HoG hardware implementation could also be used as an individual IP core by other computer vision systems. A real-time traffic signal detection system is implemented to present an efficient hardware implementation of the traditional grass-fire blob detection. The traditional grass-fire blob detection method iterates over the input image multiple times to calculate connected blobs. In the digital circuit, five extra on-chip block memories are used to save intermediate results. By using these additional memories, all connected blob information can be obtained in a single pass over the image. The proposed hardware-friendly blob detection can run at 72.4 frames per second with 1080P video input. Applying HoG + SVM as feature extractor and classifier, a 92.11% recall rate and a 99.29% precision rate are obtained on red lights, and a 94.44% recall rate and a 98.27% precision rate on green lights. Nowadays, the convolutional neural network (CNN) is revolutionizing computer vision due to its learnable layer-by-layer feature extraction. However, CNNs are usually slow to train and slow to execute at inference time. 
In this dissertation, we studied the implementation of the principal component analysis based network (PCANet), which strikes a balance between algorithm robustness and computational complexity. Compared to a regular CNN, the PCANet needs only one training iteration and typically has at most a few tens of convolutions in a single layer. Compared to hand-crafted feature extraction methods, the PCANet algorithm reflects the variance in the training dataset well and can better adapt to difficult conditions. The PCANet algorithm achieves accuracy rates of 96.8% and 93.1% on road marking detection and traffic light detection, respectively. Implemented in Synopsys 32 nm process technology, the proposed chip can classify 724,743 32-by-32 image candidates in one second, with only 0.5 watt power consumption. In this dissertation, the binary neural network (BNN) is adopted as a potential detector for intelligent vehicles. The BNN constrains all activations and weights to be +1 or -1. Compared to a CNN with the same network configuration, the BNN achieves 50 times better resource usage with only 1% - 2% accuracy loss. Taking car detection and pedestrian detection as examples, the BNN achieves an average accuracy rate of over 95%. Furthermore, a BNN accelerator implemented in Synopsys 32 nm process technology is presented in our work. The elastic architecture of the BNN accelerator enables it to process any number of convolutional layers with high throughput. The BNN accelerator consumes only 0.6 watt and does not rely on external memory for storage.
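
The abstract above states that a BNN constrains all activations and weights to +1 or -1. The following is a minimal Python sketch of why that constraint is cheap in hardware: a dot product of two such vectors can be computed with an XNOR and a bit count. The function names and the bit-packing convention (bit 1 for +1, bit 0 for -1) are illustrative assumptions and are not taken from the dissertation.

```python
def binarize(values):
    """Map each real value to +1 or -1 (zero is mapped to +1 by assumption)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a list of +1/-1 values into an int: bit i is 1 for +1, 0 for -1."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_word, b_word, n):
    """Dot product of two packed +1/-1 vectors of length n."""
    mask = (1 << n) - 1
    agree = ~(a_word ^ b_word) & mask      # XNOR: bit set where the signs agree
    matches = bin(agree).count("1")        # popcount
    # Each agreeing position contributes +1, each disagreeing one -1.
    return 2 * matches - n


weights = binarize([0.7, -1.2, 0.1, -0.3])
activations = binarize([-0.5, -0.9, 2.0, 0.4])

reference = sum(w * a for w, a in zip(weights, activations))
fast = xnor_popcount_dot(pack_bits(weights), pack_bits(activations), len(weights))
print(reference, fast)
```

Both values agree because every multiplication of two signs is replaced by a single-bit comparison, which is one plausible reason a binarized network needs far fewer logic resources than a full-precision one.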

6

Reiche, Myrgård Martin. "Acceleration of deep convolutional neural networks on multiprocessor system-on-chip". Thesis, Uppsala universitet, Avdelningen för datorteknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-385904.

Abstract

In this master's thesis, some of the most promising existing frameworks and implementations of deep convolutional neural networks on multiprocessor systems-on-chip (MPSoCs) are researched and evaluated. The starting point was a previous thesis, conducted in the spring of 2018, which evaluated possible deep learning models and frameworks for object detection on infrared images. An existing deep convolutional neural network (DCNN) needs modifications in order to fit on an MPSoC. Most DCNNs are trained on graphics processing units (GPUs) with a bit width of 32 bits. This is not optimal for a platform with hard memory constraints such as the MPSoC, which means the bit width needs to be reduced. The optimal bit width depends on the network structure and on the requirements in terms of throughput and accuracy, although most of the currently available object detection networks drop significantly when reduced below a width of 6 bits. After reducing the bit width, the network needs to be quantized and pruned for better memory usage. After quantization it can be implemented using one of many existing frameworks. This thesis focuses on Xilinx CHaiDNN and DNNWeaver V2, though it touches a little on revision, HLS4ML and DNNWeaver V1 as well. In conclusion, the implementation of two network models on the Xilinx Zynq UltraScale+ ZCU102 using CHaiDNN was evaluated. Conversion of existing networks was done and quantization was tested, though it was not fully working. The result was an implementation that is two to six times more power efficient than GPU inference.
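
To illustrate the bit-width reduction the abstract describes, here is a minimal Python sketch of symmetric uniform quantization, one common scheme and not necessarily the one used in the thesis or in CHaiDNN. The function names and the sample weights are hypothetical.

```python
def quantize(weights, bits):
    """Symmetric uniform quantization to signed integer codes of the given width."""
    levels = 2 ** (bits - 1) - 1                  # e.g. 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / levels if max_abs > 0 else 1.0
    codes = [max(-levels, min(levels, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate real-valued weights."""
    return [c * scale for c in codes]


def max_error(weights, bits):
    """Largest absolute reconstruction error after a quantize/dequantize round trip."""
    codes, scale = quantize(weights, bits)
    return max(abs(w - r) for w, r in zip(weights, dequantize(codes, scale)))


# Hypothetical weights, chosen only to show how the error grows as bits shrink.
weights = [0.82, -0.41, 0.05, -0.99, 0.33]
for bits in (8, 6, 4):
    print(bits, max_error(weights, bits))
```

The reconstruction error is bounded by half a quantization step, and the step doubles with every bit removed, which is consistent with the abstract's observation that accuracy degrades sharply at small bit widths.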

7

Bayona, Adam Robert. "System on a chip Soft IP from the FPGA-vendor or an OpenCore-processor?" Thesis, Norwegian University of Science and Technology, Department of Electronics and Telecommunications, 2007. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-9507.

Abstract

Two different processors from two FPGA vendors and an OpenCore processor have been investigated. Two different boards were used for this work. The first was an Altera Cyclone II FPGA board, on which the Altera Nios II microprocessor and the free Leon2 processor were tested. The second was a SUZAKU-S board, on which the Xilinx MicroBlaze microprocessor and the free Leon2 processor were tested. Two benchmarks, Dhrystone and Whetstone, were run on these boards to compare the speed of the free and non-free processors. The documentation and ease of use of the processors are also considered.

8

Druyer, Rémy. "Réseau sur puce sécurisé pour applications cryptographiques sur FPGA". Thesis, Montpellier, 2017. http://www.theses.fr/2017MONTS023/document.

Abstract

Que ce soit au travers des smartphones, des consoles de jeux portables ou bientôt des supercalculateurs, les systèmes sur puce (System-on-chip (SoC)) ont vu leur utilisation largement se répandre durant ces deux dernières décennies. Ce phénomène s’explique notamment par leur faible consommation de puissance au regard des performances qu’ils sont capables de délivrer, et du large panel de fonctions qu’ils peuvent intégrer. Les SoC s’améliorant de jour en jour, ils requièrent de la part des systèmes d’interconnexions qui supportent leurs communications, des performances de plus en plus élevées. Pour répondre à cette problématique les réseaux sur puce (Network-on-Chip (NoC)) ont fait leur apparition.En plus des ASIC, les circuit reconfigurables FPGA sont un des choix possibles lors de la réalisation d’un SoC. Notre première contribution a donc été de réaliser et d’étudier les performances du portage du réseau sur puce générique Hermes initialement conçu pour ASIC, sur circuit reconfigurable. Cela nous a permis de confirmer que l’architecture du système d’interconnexions doit être adaptée à celle du circuit pour pouvoir atteindre les meilleures performances possibles. Par conséquent, notre deuxième contribution a été la conception de l’architecture de TrustNoC, un réseau sur puce optimisé pour FPGA à hautes performances en latence, en fréquence de fonctionnement, et en quantité de ressources logiques occupées.Un autre aspect primordial qui concerne les systèmes sur puce, et plus généralement de tous les systèmes numériques est la sécurité. Notre dernière principale contribution a été d’étudier les menaces qui s’exercent sur les SoC durant toutes les phases de leur vie, puis de développer à partir d’un modèle de menaces, des mécanismes matériels de sécurité permettant de lutter contre des détournements d’IP, et des attaques logicielles. 
Nous avons également veillé à limiter au maximum le surcoût qu’engendre les mécanismes de sécurité sur les performances sur réseau sur puce
Whether through smartphones, portable game consoles, or high-performance computing, Systems-on-Chip (SoCs) have seen their use spread widely over the last two decades. This can be explained by the low power consumption of these circuits relative to the performance they are able to deliver, and by the numerous functions they can integrate. Since SoCs are improving every day, they require ever better performance from the interconnects that support their communications. Networks-on-Chip (NoCs) have emerged to address this issue. In addition to ASICs, FPGA circuits are one of the possible choices when designing a SoC. Our first contribution was therefore to port the generic Hermes NoC, initially designed for ASICs, to a reconfigurable circuit and to study its performance. This allowed us to confirm that the architecture of the interconnection system must be adapted to that of the circuit in order to achieve the best possible performance. Thus, our second contribution was to design TrustNoC, a NoC optimized for FPGA platforms, with low latency, a high operating frequency, and a moderate quantity of logic resources required for implementation. Security is also an essential aspect of systems-on-chip and, more generally, of all digital systems. Our last main contribution was to study the threats that target SoCs throughout their life cycle, and then to develop and integrate hardware security mechanisms into TrustNoC in order to counter IP hijacking and software attacks. During the design of the security mechanisms, we tried to limit as much as possible their overhead on NoC performance.

9

Powell, Andrew Andre. "Performance of the Xilinx Zynq System-on-Chip Interconnect with Asymmetric Multiprocessing". Master's thesis, Temple University Libraries, 2014. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/306468.

Abstract

Electrical and Computer Engineering
M.S.E.E.
For many applications, embedded designers need to construct systems that meet real-time constraints and thus require complete information on a processor's performance under specified parameters. An important and limiting factor in any processor's performance is how quickly components are able to intercommunicate over the system's bus. However, another important constraint, specific to real-time systems, is knowing precisely how long the data communication will take. A highly integrated system composed of multiple processing cores, referred to as a System-on-Chip (SoC) device, contains a bus known as an on-chip interconnect. Specifically, this thesis research characterizes how quickly the AMBA AXI on-chip interconnect of the Xilinx Zynq-7000 Extensible Processing Platform (EPP) SoC device operates by measuring the time required to communicate between memory and the two major components of the SoC device. The memory is either internal or external. The two major components are the processing system (PS) and the programmable logic (PL). The PS contains a dual-core ARM Cortex-A9 processor that runs FreeRTOS in an asymmetric multiprocessing configuration. Communication between the PL and memory is through the PS-PL interfaces: the Accelerator Coherency Port AXI interface, the High Performance AXI interface, and the General Purpose AXI interface. The benchmarking is performed under several varying parameters, such as the payload size and the number of devices executing in the PL. The embedded design is implemented with the Xilinx Vivado Design Suite, which includes the Vivado IDE and the SDK, and is executed on the Avnet ZedBoard and the Xilinx ZC702 Evaluation Kit.
Temple University--Theses

10

Vyas, Dhaval N. "FPGA-based hardware accelerator design for performance improvement of a system-on-a-chip application". Diss., Online access via UMI:, 2005.

11

Wahlqvist, Emanuel. "Adaptation of an ARM compatible System on chip as an IP-module in a FPGA". Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-219578.

Abstract

In today's world, fast prototyping and a short time to market are very important factors when developing products. Any effort to improve these parameters, as well as to make systems easier to maintain, is effort well placed. Syntronic is a consulting company dealing in electronics and software development, testing and maintenance. They see the soft-core processor, implemented in a Field Programmable Gate Array, as a step towards more versatile platforms. As a first effort, this thesis presents the specification, implementation and testing of a System on Chip based on an open-source ARMv2a-compatible processor designed in Verilog. The system is aimed at applications where performance is not the highest priority, but rather a small FPGA area and the possibility to connect many different sensor types. The final result is a system that is able to execute both assembler and C code in simulations. There was no hardware available for testing, but the synthesis procedure shows promising results. The final system includes interfaces for UART, SPI and I2C along with support for up to 32 General Purpose Input Output pins. All steps required for modifying and customizing the system are also presented, along with the tools used.

12

Mühlbauer, Felix. "Entwurf, Methoden und Werkzeuge für komplexe Bildverarbeitungssysteme auf Rekonfigurierbaren System-on-Chip-Architekturen". Phd thesis, Universität Potsdam, 2011. http://opus.kobv.de/ubp/volltexte/2012/5992/.

Abstract

Bildverarbeitungsanwendungen stellen besondere Ansprüche an das ausführende Rechensystem. Einerseits ist eine hohe Rechenleistung erforderlich. Andererseits ist eine hohe Flexibilität von Vorteil, da die Entwicklung tendentiell ein experimenteller und interaktiver Prozess ist. Für neue Anwendungen tendieren Entwickler dazu, eine Rechenarchitektur zu wählen, die sie gut kennen, anstatt eine Architektur einzusetzen, die am besten zur Anwendung passt. Bildverarbeitungsalgorithmen sind inhärent parallel, doch herkömmliche bildverarbeitende eingebettete Systeme basieren meist auf sequentiell arbeitenden Prozessoren. Im Gegensatz zu dieser "Unstimmigkeit" können hocheffiziente Systeme aus einer gezielten Synergie aus Software- und Hardwarekomponenten aufgebaut werden. Die Konstruktion solcher System ist jedoch komplex und viele Lösungen, wie zum Beispiel grobgranulare Architekturen oder anwendungsspezifische Programmiersprachen, sind oft zu akademisch für einen Einsatz in der Wirtschaft. Die vorliegende Arbeit soll ein Beitrag dazu leisten, die Komplexität von Hardware-Software-Systemen zu reduzieren und damit die Entwicklung hochperformanter on-Chip-Systeme im Bereich Bildverarbeitung zu vereinfachen und wirtschaftlicher zu machen. Dabei wurde Wert darauf gelegt, den Aufwand für Einarbeitung, Entwicklung als auch Erweiterungen gering zu halten. Es wurde ein Entwurfsfluss konzipiert und umgesetzt, welcher es dem Softwareentwickler ermöglicht, Berechnungen durch Hardwarekomponenten zu beschleunigen und das zu Grunde liegende eingebettete System komplett zu prototypisieren. Hierbei werden komplexe Bildverarbeitungsanwendungen betrachtet, welche ein Betriebssystem erfordern, wie zum Beispiel verteilte Kamerasensornetzwerke. Die eingesetzte Software basiert auf Linux und der Bildverarbeitungsbibliothek OpenCV. 
Die Verteilung der Berechnungen auf Software- und Hardwarekomponenten und die daraus resultierende Ablaufplanung und Generierung der Rechenarchitektur erfolgt automatisch. Mittels einer auf der Antwortmengenprogrammierung basierten Entwurfsraumexploration ergeben sich Vorteile bei der Modellierung und Erweiterung. Die Systemsoftware wird mit OpenEmbedded/Bitbake synthetisiert und die erzeugten on-Chip-Architekturen auf FPGAs realisiert.
Image processing applications place special demands on the computing system that executes them. On the one hand, high computational power is necessary. On the other hand, high flexibility is an advantage because development tends to be an experimental and interactive process. For new applications, developers tend to choose a computing architecture they know well instead of the one that best fits the application. Image processing algorithms are inherently parallel, while common embedded image processing systems are mostly based on sequentially operating processors. In contrast to this "mismatch", highly efficient systems can be built from a targeted synergy of software and hardware components. However, the construction of such systems is complex, and many solutions, such as coarse-grained architectures or application-specific programming languages, are often too academic for commercial use. The present work aims to contribute to reducing the complexity of hardware-software systems and thus to simplify the development of high-performance on-chip systems for image processing and make it more economical. Emphasis was placed on keeping the effort for familiarization, development and extension low. A design flow was developed and implemented which allows the software developer to accelerate calculations with hardware components and to prototype the whole underlying embedded system. Complex image processing applications that require an operating system, such as distributed camera sensor networks, are considered. The software used is based on Linux and the image processing library OpenCV. The distribution of the calculations to software and hardware components and the resulting scheduling and generation of the computing architecture are done automatically. The design space exploration is based on answer set programming, which brings advantages for modelling and extension. 
The software is synthesized with the help of OpenEmbedded/Bitbake and the generated on-chip architectures are implemented on FPGAs.

13

Mollberg, Alexander. "A Resource-Efficient and High-Performance Implementation of Object Tracking on a Programmable System-on-Chip". Thesis, Linköpings universitet, Datorteknik, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-124044.

Abstract

The computer vision problem of object tracking is introduced and explained. An approach to interest-point-based feature detection and tracking using FAST and BRIEF is presented, and the selection of algorithms suitable for implementation on a Xilinx Zynq-7000 with an XC7Z020 field-programmable gate array (FPGA) is detailed. A modification to the smoothing strategy of BRIEF which significantly reduces memory utilization on the FPGA is presented and benchmarked against a reference strategy. Measures of performance and resource efficiency are presented and utilized in an iterative development process. A system for interest-point-based object tracking that uses FAST for feature detection and BRIEF for feature description with the proposed smoothing modification is implemented on the FPGA. The design is described and important design choices are discussed.
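
As a rough illustration of the BRIEF feature description mentioned above, the following Python sketch builds a binary descriptor from pairwise intensity comparisons on a patch and compares descriptors by Hamming distance. It is a simplified sketch under stated assumptions: the smoothing step that the thesis modifies is omitted entirely, the sampling pattern is random with a fixed seed, and all names are hypothetical.

```python
import random


def make_pairs(n_bits, patch_size, seed=0):
    """Random pixel pairs used as the (assumed) BRIEF sampling pattern."""
    rng = random.Random(seed)
    return [((rng.randrange(patch_size), rng.randrange(patch_size)),
             (rng.randrange(patch_size), rng.randrange(patch_size)))
            for _ in range(n_bits)]


def brief_descriptor(patch, pairs):
    """One bit per pair: 1 if the first pixel is darker than the second."""
    bits = 0
    for i, ((r1, c1), (r2, c2)) in enumerate(pairs):
        if patch[r1][c1] < patch[r2][c2]:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Number of differing bits between two descriptors."""
    return bin(a ^ b).count("1")


size = 8
pairs = make_pairs(32, size)
patch = [[r * size + c for c in range(size)] for r in range(size)]   # synthetic ramp
inverted = [[255 - v for v in row] for row in patch]

d_patch = brief_descriptor(patch, pairs)
d_inverted = brief_descriptor(inverted, pairs)
print(hamming(d_patch, d_patch), hamming(d_patch, d_inverted))
```

Because the descriptor is a plain bit string and matching is an XOR followed by a bit count, this style of feature description maps naturally onto FPGA logic.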

14

Beasley, Alexander. "Exploring the benefits and implications of dynamic partial reconfiguration using Field Programmable Gate Array-System on Chip architectures". Thesis, University of Bath, 2019. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.767597.

Abstract

Demands on modern computing are becoming more intensive. Keeping up with these demands has increasing complexity. Moore's Law is in decline. Increasing the number of cores on a device has diminishing returns. Specialised architectures provide more efficient and higher performing processors. However, it is not always practical to include every architecture on every device. Running non-native tasks on architectures often results in a drop in performance. This research examines the benefits and limitations of Field Programmable Gate Arrays - Systems on Chip (FPGA-SoC) devices to provide flexible hardware accelerators for heterogeneous architectures. A number of topics are covered, including hardware acceleration of floating-point mathematical functions, dynamic reconfiguration and high-level synthesis. A number of case studies are presented. Dynamic reconfiguration is used to change the configuration of the FPGA at runtime, allowing the hardware accelerators to be changed depending on the current processor tasks. Changing accelerators at runtime has limitations, such as data perturbation. Context switching techniques are applied to the hardware to prevent loss of data and enable de-fragmentation of the FPGA. High level synthesis techniques are used in conjunction with the presented hardware accelerators to synthesise high-level languages into hardware descriptions with optimisations. Techniques for runtime synthesis of hardware accelerators are presented. These can be combined with dynamic reconfiguration to configure FPGAs with appropriate hardware accelerators from a high-level language at runtime. The research demonstrates that FPGA-SoC devices have the potential for providing reconfigurable accelerators for processors in heterogeneous architectures. Metrics show that the FPGA configurations can perform better than other commercial processors. 
It was demonstrated that it is possible to context switch hardware at runtime, meaning the most can be made of the FPGA-SoC at all times, even as situations change. However, there are many limitations that still need to be overcome, such as management of the implemented hardware, synthesis of new hardware at runtime, reconfiguration times, interfacing of hardware with software and the design of hardware accelerators.

15

Jereb, Alexander Robert. "Design and implementation of a Radio-Frequency detection algorithm for use within A Radio-Frequency System on Chip". University of Dayton / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1608145466947488.

16

Rößler, Marko. "Parallel Hardware- and Software Threads in a Dynamically Reconfigurable System on a Programmable Chip". Universitätsbibliothek Chemnitz, 2013. http://nbn-resolving.de/urn:nbn:de:bsz:ch1-qucosa-129626.

Abstract

Today's embedded systems depend on the availability of hybrid platforms that contain heterogeneous computing resources, such as programmable processor units (CPUs or DSPs) and highly specialized hardware cores. These platforms have been scaled down to integrated embedded systems-on-chip. Modern platform FPGAs enhance such systems with the flexibility of runtime-configurable silicon. One of the major advantages that arises is the ability to use hardware (HW) and software (SW) resources in a time-shared manner. This makes it possible to assign computing resources dynamically, based on decisions taken at runtime.

17

Voigt, Sven-Ole. "Dynamically reconfigurable dataflow architecture for high performance digital signal processing on multi FPGA platforms". Aachen Shaker, 2008. http://d-nb.info/992481694/04.

18

Bonatto, Alexsandro Cristóvão. "Núcleos de interface de memória DDR SDRAM para sistemas-em-chip". reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2009. http://hdl.handle.net/10183/17291.

Abstract

Many integrated System-on-Chip (SoC) devices, especially those dedicated to multimedia applications, process large amounts of data stored in memories. The performance of the memory ports directly affects the performance of the system. Better use of data storage and the reduction of cost and power consumption in electronic systems encourage the development of efficient architectures for memory controllers. This improvement must be achieved for both embedded and external memories. In video processing systems, for example, large memory arrays are needed to store several video frames while compression algorithms search for redundancies. In systems implemented on FPGA technology, it is possible to use the memory blocks available inside the FPGA, but these are limited to a few megabytes of data. To increase data storage capacity it is necessary to use external memory devices, and a memory controller intellectual property (IP) core is required. Nevertheless, developing such a core is a very complex task and it is not always possible to have a custom solution. Using FPGAs for system prototyping allows the developer to integrate modules rapidly. In this case, design verification is an important issue to be considered in the development of a complex system. High-speed memory controllers are very sensitive to gate and routing delays, and the synthesis from a hardware description language (HDL) needs to be verified against the predefined timing specifications. To overcome these problems, this work presents a DDR SDRAM controller IP with an integrated BIST (Built-In Self-Test) function, in which the memory test is used to check the correct functioning of the DDR controller.

19

Zambrano-Mendez, Leandro. "Diseño System on Chip de los centros nerviosos del sistema neurorregulador de los humanos. Aplicación al centro córtico-diencefálico". Doctoral thesis, Universidad de Alicante, 2019. http://hdl.handle.net/10045/97291.

Abstract

The human neuroregulatory system is a complex nervous system made up of a heterogeneous group of nerve centers. These centers are distributed along the spinal cord, act autonomously, communicate through nerve interconnections, and govern and regulate the behavior of organs and systems in human beings. From a study of the operation and composition of the neuroregulatory system of the lower urinary tract (LUT), the centers involved in its behavior have been isolated. The aim is to understand how each center works individually in order to build a general model of the neuroregulatory system that can operate at the level of the nerve center. The model is based on the theory of multi-agent systems (MAS) built from agents able to perceive, deliberate and execute (PDE agents). In our model, each agent represents the behavior of one nerve center. The proposal is an advance over existing models, which are black-box models because they neither allow intervention at the level of the elements that make up the system nor treat aspects such as dysfunctions independently or in isolation. Building on the study carried out over the last twenty years and on the MAS model defined, this research sets out to propose a nerve-center model for the System on Chip (SoC) design of a processor with the structure of a PDE agent, capable of performing the functions of a nerve center of the human neuroregulatory system: specifically, the cortico-diencephalic nerve center. A distinguishing feature of the proposal is that the operation of the center is fully configurable and programmable, so that the proposed design can also be valid for the other centers that make up the neuroregulatory system. This research is a further step towards our goal of creating a parameterizable chip, able to perform any neuroregulatory function, implantable in the body and able to operate in coordination with the biological neuroregulatory system. This dissertation presents the research process chronologically, together with its main achievements: a formal model of the cortico-diencephalic nerve center, the hardware design of a parameterizable nerve-center processor, a prototype of this design on reconfigurable hardware (FPGA), a test and simulation environment for evaluating the proposal and, finally, an analysis of the results and a comparison of the system's behavior with data obtained from real patients, which shows that it matches the behavior expected in a human being. The results therefore amply support the validity of the proposal: they show that the systems obtained can exhibit the behaviors for which they were originally designed and can, in addition, be rapidly adapted, modified, updated, automated and extended with new technologies and new knowledge.

20

Mahmood, Adnan and Zaheer Ahmed Mohammed. "DESIGN AND PROTOTYPE OF RESOURCE NETWORK INTERFACES FOR NETWORK ON CHIP". Thesis, Jönköping University, JTH, Computer and Electrical Engineering, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:hj:diva-11114.

Abstract

Network on Chip (NoC) has emerged as a competitive and efficient communication infrastructure for the core-based design of Systems on Chip. A NoC has three main parts: the resource (core), the router, and the interface between router and core. Each core communicates with the network through this interface, called the Resource Network Interface (RNI). One approach to speeding up the design of NoC-based systems is to develop a standardized RNI. The design of the RNI depends to some extent on the routing technique used in the NoC. Routing algorithms are categorized as source or distributed according to where the route decision is made. In source routing, a complete path to the destination is provided in the packet header at the source, whereas in distributed routing the path is computed dynamically in the routers as the packet moves through the network. Buffering, flitization, deflitization and transfer of data from core to router and vice versa are common responsibilities of the RNI in both types of routing. In source routing, the RNI has the extra functionality of storing complete paths to all destinations in tables, extracting the path to a desired destination and adding it to the header flit. In this thesis, we have made an effort towards designing and prototyping a standardized and efficient RNI for both source and distributed routing. VHDL is used as the design language, and both types of RNI have been prototyped on an Altera DE2 FPGA board. The RNI was tested using a Nios II soft core. Simulation results show that the best-case flit latency for both types of RNI is 4 clock cycles. The RNI design is also resource efficient: it consumes only 2% of the available resources on the target platform.
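
The difference between the two routing styles described in this abstract can be illustrated with a small sketch. It is not taken from the thesis: the 2D mesh topology, the dimension-ordered (XY) algorithm, the port names and all function names below are assumptions chosen only for illustration.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]  # (x, y) of a router in a hypothetical 2D mesh

# Offset applied when a packet leaves through each output port (assumed naming).
MOVES: Dict[str, Coord] = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}


def xy_next_hop(current: Coord, dest: Coord) -> str:
    """Distributed routing: each router decides the next port on its own.

    Uses XY routing (an assumption): correct X first, then Y,
    then deliver on the local port "L".
    """
    cx, cy = current
    dx, dy = dest
    if cx != dx:
        return "E" if dx > cx else "W"
    if cy != dy:
        return "N" if dy > cy else "S"
    return "L"


def build_source_route(src: Coord, dest: Coord) -> List[str]:
    """Source routing: the sending RNI computes the whole path up front,
    as it would before writing it into the header flit."""
    path: List[str] = []
    cur = src
    while True:
        hop = xy_next_hop(cur, dest)
        path.append(hop)
        if hop == "L":
            return path
        mx, my = MOVES[hop]
        cur = (cur[0] + mx, cur[1] + my)


def follow_source_route(src: Coord, path: List[str]) -> Coord:
    """Routers only read the next hop from the header; no route computation."""
    cur = src
    for hop in path:
        if hop == "L":
            break
        mx, my = MOVES[hop]
        cur = (cur[0] + mx, cur[1] + my)
    return cur
```

In this sketch, a source-routing RNI would call `build_source_route` once per destination and keep the results in a table, while a distributed-routing RNI would only place the destination coordinates in the header and leave `xy_next_hop` to each router.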

21

Fons, Lluís Mariano. "Hardware accelerators for embedded fingerprint-based personal recognition systems". Doctoral thesis, Universitat Rovira i Virgili, 2012. http://hdl.handle.net/10803/83493.

Abstract

The development of automatic biometrics-based personal recognition systems is a reality in the current technological age. Not only operations demanding stringent security levels but also many everyday consumer applications require computational platforms that recognize the identity of an individual by analyzing his or her physiological and/or behavioural characteristics. The state of the art points to two main open problems in the implementation of such applications: on the one hand, the need to improve reliability in terms of recognition accuracy, overall security and real-time performance; on the other hand, the need to reduce the cost of the physical platforms in charge of the processing. This work aims to find a system architecture able to address these limitations of current personal recognition applications. Embedded system solutions based on hardware-software co-design techniques and programmable (and run-time reconfigurable) logic devices, in the form of FPGAs or SOPCs, are shown to be an efficient alternative to existing multiprocessor systems based on HPCs, GPUs or PC platforms for developing this kind of high-performance application at low cost.

22

Robino, Francesco. "A model-based design approach for heterogeneous NoC-based MPSoCs on FPGA". Licentiate thesis, KTH, Elektroniksystem, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-145521.

Abstract

Network-on-chip (NoC) based multi-processor systems-on-chip (MPSoCs) are promising candidates for future multi-processor embedded platforms, which are expected to be composed of hundreds of heterogeneous processing elements (PEs) and so potentially provide high performance. However, system complexity will increase together with performance, and new high-level design techniques will be needed to model, simulate, debug and synthesize such systems efficiently. System-level design (SLD) is considered to be the next frontier in electronic design automation (EDA). It enables the description of embedded systems in terms of abstract functions and interconnected blocks. A promising complementary approach to SLD is the use of models of computation (MoCs) to formally describe the execution semantics of functions and blocks through a set of rules. However, even when this formalization is used, there is no clear way to synthesize system-level models into software (SW) and hardware (HW) for a NoC-based MPSoC implementation; that is, there is a lack of system design automation (SDA) techniques to rapidly synthesize and prototype system-level models onto heterogeneous NoC-based MPSoCs. In addition, many of the proposed solutions incur a large overhead in terms of SW components and memory requirements, resulting in complex and customized multi-processor platforms. To tackle the problem, a novel model-based SDA flow has been developed as part of the thesis. It starts from a system-level specification, where functions execute according to the synchronous MoC, and can then rapidly prototype the system onto an FPGA configured as a heterogeneous NoC-based MPSoC. In the first part of the thesis, the HeartBeat model is proposed as a model-based technique that fills the abstraction gap between the abstract system-level representation and its implementation on the multiprocessor prototype. Details are then provided on how this technique is automated to rapidly prototype the modeled system on a flexible platform, allowing the system specification to be adjusted until the designer is satisfied with the results. Finally, the proposed SDA technique is improved by defining a methodology to automatically explore possible design alternatives for the modeled system to be implemented on a heterogeneous NoC-based MPSoC. The goal of the exploration is to find an implementation that satisfies the designer's requirements and can be integrated in the proposed SDA flow. Through the proposed SDA flow, the designer is relieved of implementation details, and the design time of systems targeting heterogeneous NoC-based MPSoCs on FPGA is significantly reduced. In addition, the flow reduces possible design errors by offering a completely automated technique for fast prototyping. Compared to other SDA flows, the proposed technique targets a bare-metal solution, avoiding the use of an operating system (OS). This reduces the memory requirements on the FPGA platform compared to related work targeting MPSoCs on FPGA. At the same time, the performance (throughput) of the modeled applications can be increased when the number of processors of the target platform is increased. This is shown through a wide set of case studies implemented on FPGA.

23

Damez, Lionel. "Approche multi-processeurs homogènes sur System-on-Chip pour le traitement d'image". Phd thesis, Université Blaise Pascal - Clermont-Ferrand II, 2009. http://tel.archives-ouvertes.fr/tel-00724443.

Abstract

Designing prototypes of embedded real-time vision systems is subject to multiple severe and strongly conflicting constraints. For so-called "smart" sensors, enough processing power must be provided to run the algorithms at the frame rate of the image sensors, in a device of minimal size that consumes little energy. Designing a system on chip (SoC) and implementing increasingly complex algorithms is problematic when combined with a rapid-prototyping approach for scientific applications. To reduce design time and the various design costs significantly, the design process is highly automated. Hardware design is based on deriving a generic multiprocessor architecture model so as to meet the processing and communication capacity needs specific to the target application. The main manual steps are reduced to choosing and parameterizing the available synthesizable hardware components. Software design consists of parallelizing the algorithms, which is made easier by the homogeneity and regularity of the parallel processing architecture and by the possibility of using parallelization support tools. Along with the design approach, the first building blocks that make it possible to put it into practice are presented. These mainly concern hardware design. The proposed approach is illustrated by the implementation of real-time video stabilization on SoPC technology.

24

Ramquist, Henrik. "Technologies and design methods for a highly integrated AIS transponder". Thesis, Linköping University, Department of Electrical Engineering, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-2006.

Abstract

The principle of universal shipborne automatic identification system (AIS) is to allow automatic exchange of shipboard information between one vessel and another. Saab TransponderTech AB has an operating AIS transponder on the market and the purpose of this report is to investigate alternative technologies that could result in a highly integrated replacement for the existing hardware.

Design aspects of a system-on-chip are discussed, such as available system-on-chip technologies, intellectual property, on-chip bus structures and development tools. This information is applied to the existing hardware, and the integration possibilities of the various parts of the AIS transponder are investigated.

The focus will be on two main transponder parts that are possible to replace with highly integrated circuits. The first of these parts is the so-called digital part where system-on-chip platforms for different technologies have been investigated with a special interest in a highly integrated FPGA implementation. The second part is the radio frequency receivers where alternatives to the existing superheterodyne receiver are discussed.

The conclusion drawn is that technologies exist for developing a highly integrated AIS transponder. An attractive highly integrated transponder could consist of an FPGA system-on-chip platform with subsampling digital receivers and additional components that are unsuitable for integration.

25

Sigurðsson, Páll Axel. "Predictable Multiprocessor Platform for Safety-Critical Real-Time Systems". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-302139.

Abstract

Multicore systems excel at providing concurrent execution of applications, giving true parallelism in which all cores can execute sequences of machine instructions at the same time. However, multicore systems come with their own set of problems, most notably when the cores (or core tiles) in a system share hardware components such as memory modules or Input/Output (IO) peripherals. This added complexity makes it especially difficult to design and verify safety-critical systems that require real-time operation, such as flight controllers in airplanes and airbag controllers in the automotive industry. Verifying that such systems are predictable is therefore essential, and it requires methods for measuring and determining the Worst-Case Execution Times (WCETs) and Best-Case Execution Times (BCETs). Additionally, the designer must ensure isolation between running applications (indicating that the platform is composable). This thesis work consists of designing a predictable Multiprocessor System-on-Chip (MPSoC) using Qsys and Quartus II, and of providing methods and test benches that can support all claims made about the platform's reported behavior. A shared-memory, loosely coupled multicore design was implemented, which can be scaled horizontally from 2 to 8 core tiles. A high-level Hardware Abstraction Layer (HAL) was written for the platform to simplify its use. Using Nios II/e processors as the logical cores in the platform's core tiles gives predictable (mostly static) latencies when the platform is tested, with no erratic or unexplained timing variations. However, because the arbitration logic in the Avalon Switch Fabric (ASF) is Round Robin (RR), composability was not fully achieved. Groundwork for implementing Time-Division Multiplexing (TDM) arbitration logic is proposed and will ideally be fully implemented in future work.
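
Why round-robin arbitration works against composability, and why TDM is proposed instead, can be seen in a toy model. The sketch below is an illustration only and does not model the Avalon Switch Fabric: it assumes every grant takes exactly one cycle, that TDM slot k belongs to core k, and the function names are hypothetical.

```python
from typing import Set


def rr_wait(requester: int, active: Set[int], last_granted: int, n: int) -> int:
    """Cycles a core waits under round-robin arbitration (toy model).

    The arbiter walks the cores in order after the last granted one and
    serves every core that is currently requesting, so the wait depends
    on what the *other* cores are doing.
    """
    wait = 0
    i = (last_granted + 1) % n
    while i != requester:
        if i in active:
            wait += 1  # one cycle spent serving another core first
        i = (i + 1) % n
    return wait


def tdm_wait(requester: int, now: int, n: int) -> int:
    """Cycles a core waits under TDM arbitration (toy model).

    Slot k of every n-cycle frame is reserved for core k, so the wait
    depends only on the current slot, never on the other cores.
    """
    return (requester - now) % n
```

With four cores, `rr_wait(3, set(), 3, 4)` is 0 but `rr_wait(3, {0, 1, 2}, 3, 4)` is 3, so the latency seen by core 3 changes with the load from its neighbors. `tdm_wait(3, 0, 4)` is 3 in both situations, which is the isolation property the abstract calls composability.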

26

Johansson, Henrik. "Evaluating Vivado High-Level Synthesis on OpenCV Functions for the Zynq-7000 FPGA". Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-29591.

Abstract

Increasingly complex Computer Vision algorithms, combined with higher-resolution image streams, place ever greater demands on processing power. CPU clock frequencies are now pushing the limits of possible speeds, and CPUs have instead started growing in number of cores. The performance of most Computer Vision algorithms responds well to parallel solutions. Dividing an algorithm over 4-8 CPU cores can give a good speed-up, but using chips with Programmable Logic (PL), such as FPGAs, can give even more. An interesting recent addition to the FPGA family is the System on Chip (SoC) that combines a CPU and an FPGA in one chip, such as the Zynq-7000 series from Xilinx. This tight integration between the Programmable Logic and the Processing System (PS) opens the way for designs in which C programs use the programmable logic to accelerate selected parts of an algorithm while still behaving like a C program. On that subject, Xilinx has introduced a new High-Level Synthesis Tool (HLST) called Vivado HLS, which can accelerate C code by synthesizing it to Hardware Description Language (HDL) code. This potentially bridges two otherwise very separate worlds: the ever popular OpenCV library and FPGAs. This thesis focuses on evaluating Vivado HLS from Xilinx, primarily with image processing in mind, for potential use on GIMME-2, a system with a Zynq-7020 SoC and two high-resolution image sensors, tailored for stereo vision.

27

Fons, Lluís Francisco. "Embedded electronic systems driven by run-time reconfigurable hardware". Doctoral thesis, Universitat Rovira i Virgili, 2012. http://hdl.handle.net/10803/83494.

Abstract

This doctoral thesis addresses the design of embedded electronic systems based on run-time reconfigurable hardware technology (available through SRAM-based FPGA/SoC devices), with the aim of helping to improve people's quality of life. The work investigates the system architecture and the reconfiguration engine that gives the FPGA the capability of dynamic partial reconfiguration. The goal is to synthesize, by means of hardware/software co-design, a given application partitioned into processing tasks that are multiplexed in time and space, thereby optimizing its physical implementation (silicon area, processing time, complexity, flexibility, functional density, cost and power consumption) in comparison with alternatives based on static hardware (MCU, DSP, GPU, ASSP, ASIC, etc.). The design flow of this technology is evaluated by prototyping several engineering applications (control systems, mathematical coprocessors, complex image processors, etc.), showing a level of maturity high enough for its exploitation in industry.

28

Heil, Mikael. "Conception architecturale pour la tolérance aux fautes d'un système auto-organisé multi-noeuds en réseau à base de NoC reconfigurables". Thesis, Université de Lorraine, 2015. http://www.theses.fr/2015LORR0351.

Abstract

To meet the growing performance and reliability demands placed on embedded systems-on-chip (SoCs) by increasingly complex applications, new architectural paradigms and self-adaptive, self-organizing communication structures must be developed. These new computing systems integrate several hundred processing elements on a single chip (multiprocessor systems-on-chip, MPSoC) and must deliver sufficient parallel computing power while remaining highly flexible and adaptable. The goal is to accommodate changes in the distributed processing that characterize the evolving operating context of the systems. Today, the performance of such systems relies on autonomy and intelligence that allow compute modules to be deployed and redeployed in real time according to processing demand and available computing power. It also depends on the communication medium between the processing blocks, which must provide high bandwidth and adaptability so that the potential parallelism of the MPSoC's computing power is used efficiently. Moreover, the emergence of dynamically reconfigurable FPGA technology has opened new approaches that allow MPSoCs to adapt their components during operation and to meet growing needs for adaptability and flexibility. This need for flexibility, computing power and bandwidth has led to a new approach to designing communicating, self-organized and self-adaptive systems based on reconfigurable computing nodes. These nodes are built around networks-on-chip (NoCs), which allow optimized interconnection of a large number of processing elements on a single chip while meeting fault-tolerance requirements and striking a compromise between communication performance and interconnection resources.
This thesis aims to provide innovative architectural solutions for the dependability of networked MPSoC systems based on FPGA technology and configured as a distributed, self-organized structure. The objective is to obtain high-performance, reliable systems-on-chip that integrate error detection, localization and correction techniques within their reconfigurable or adaptive NoC structures. The main difficulty lies in distinguishing real errors from the variable or adaptive behaviour of the elements that make up these networked nodes. The work produced a network of reconfigurable FPGA-based nodes integrating dynamic NoC structures that can self-organize and self-test, in order to maximize the maintainability of system operation in a networked context (WSN). A reconfigurable multi-node MPSoC system whose nodes can exchange data and interact was developed, enabling advanced task management and the creation and self-management of fault-tolerance mechanisms. Several techniques are combined to identify and precisely locate the faulty elements of such a structure so that they can be corrected or isolated to prevent system failures. These techniques were validated through numerous hardware simulations to estimate their ability to detect and locate error sources within a network. In addition, logic syntheses of the system integrating the proposed solutions are analyzed in terms of performance and logic resources consumed on FPGA technology.

29

McNichols, John M. "Design and Implementation of an Embedded NIOS II System for JPEG2000 Tier II Encoding". University of Dayton / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1343737032.

30

Ngan, Nicolas. "Etude et conception d'un réseau sur puce dynamiquement adaptable pour la vision embarquée". Thesis, Paris Est, 2011. http://www.theses.fr/2011PEST1040/document.

Abstract

A modern portable device integrates several image sensors of different types, for example a colour sensor, an infrared sensor or a low-light sensor. Such a device must therefore support image sources that are heterogeneous in resolution, pixel granularity and frame rate. This trend towards multiple sensors is driven by application needs for complementarity in sensitivity (image fusion), in position (panoramic imaging) or in field of view. The system must consequently support increasingly complex and varied applications that use one or several image sources. Given this variety of embedded functions, the electronic system must adapt constantly to guarantee latency and processing-time performance for each application while respecting area and low-power constraints. Although many architectural solutions have been proposed over the years to improve the adaptability of processing units, a major problem remains in the interconnection network, which is not adaptable enough, particularly for transferring pixel streams and accessing data.
In this thesis, we propose a new network-on-chip (NoC) for a system-on-chip (SoC) dedicated to vision. The network dynamically manages several types of stream in parallel by self-adapting the datapath between processing elements and memories, so that different applications run efficiently. A new data packet structure, which combines instructions and the data to be processed in the same packet, facilitates the system's adaptation mechanisms in the routers. We also propose a frame storage system with indirect addressing that can dynamically manage several image frames from different image sources. The indirect addressing is implemented by a hardware abstraction layer that translates read and write requests expressed in terms of indicators of the required frame (image source, temporal index and last operation performed). To validate the proposal, we define a new architecture called Multi Data Flow Ring (MDFR), based on our network with a ring topology. The time and area performance of this architecture was evaluated in an implementation on an FPGA target.

31

Sagisi, Joseph Lozano. "HE-MT6D: A Network Security Processor with Hardware Engine for Moving Target IPv6 Defense (MT6D) over 1 Gbps IEEE 802.3 Ethernet". Thesis, Virginia Tech, 2017. http://hdl.handle.net/10919/86789.

Abstract

Traditional static network addressing gives attackers the considerable advantage of time to plan and execute attacks against a network. To counter this, Moving Target IPv6 Defense (MT6D) provides a network host obfuscation technique that dynamically obscures network and transport layer addresses. Software-driven implementations have posed many challenges, namely constant code maintenance to remain compliant with all library and kernel dependencies, less than optimal throughput, and the requirement for dedicated general-purpose hardware. This thesis presents the Network Security Processor and Hardware Engine for MT6D (HE-MT6D) to overcome these challenges. HE-MT6D is a soft core Intellectual Property (IP) block developed in full Register Transfer Level (RTL) and is the first hardware-oriented design of MT6D. Major contributions of HE-MT6D include the complete separation of the data and control planes, the development of a nonlinear Complex Instruction Set Computer (CISC) Network Security Processor for in-flight packet modification, a specialized Packet Assembly language, a configurable, parallelized memory search through a tag-based Hybrid Content Addressable Memory (HCAM) L1 write-through cache, a full RTL Network Time Protocol version 4 hardware module, and a modular crypto engine. HE-MT6D supports multiple nodes and provides a 1,025% throughput increase over the earlier C-based MT6D, at 863 Mbps with full encapsulation and decapsulation, and it matches bare wire throughput for all other traffic. The HE-MT6D IP block can be configured as an independent physical gateway device, built as an embedded Application Specific Integrated Circuit (ASIC), or serve as a System on Chip (SoC) integrated submodule.
Master of Science

32

Bahri, Imen. "Contribution des systèmes sur puce basés sur FPGA pour les applications embarquées d’entraînement électrique". Thesis, Cergy-Pontoise, 2011. http://www.theses.fr/2011CERG0529/document.

Abstract

Designing embedded control systems is becoming increasingly complex because of the algorithms used, rising industrial requirements and the nature of the application domains. One way to handle this complexity is to design the corresponding controllers on powerful, open digital platforms. More specifically, this thesis deals with the use of FPGA System-on-Chip (SoC) platforms to implement AC drive control algorithms for avionic applications. These applications are characterized by stringent technical constraints such as their operating environment (pressure, high temperature) and their performance requirements (high integration, flexibility and efficiency). During this thesis, the author contributed to the design and test of a digital controller for a synchronous variable-speed drive that must operate at an ambient temperature of 200 °C. It is a field-oriented control (FOC) for a Permanent Magnet Synchronous Machine (PMSM) associated with a resolver sensor. A design and validation method was proposed and tested using a ProAsicPlus FPGA board from Actel/Microsemi. The impact of temperature on the operating frequency was also analyzed. A state of the art of FPGA-based SoC technologies is presented, together with a detailed description of recent digital platforms and of the constraints related to embedded applications. The benefit of an SoC-based approach for AC drive applications is thereby demonstrated.
To take full advantage of SoCs, a hardware-software (HW-SW) co-design methodology for AC drive control is also proposed. The method covers all development steps of the control application, from the specifications to the experimental validation. One of its main steps is HW-SW partitioning, whose goal is to find an optimal split between the modules to be implemented in software and those to be implemented in hardware. This multi-objective optimization problem was solved with the Non-Dominated Sorting Genetic Algorithm (NSGA-II), from which a Pareto front of optimal solutions can be deduced. The methodology is illustrated on a sensorless speed controller using the Extended Kalman Filter (EKF), a benchmark that reflects a major trend in embedded control of AC drives. In addition, the SoC-based architecture of the embedded controller is managed by a Real-Time Operating System (RTOS). To accelerate the services of this operating system, a Real-Time Unit (RTU) was developed in VHDL and associated with the RTOS; it moves the scheduling and communication services of the operating system from software to hardware, which yielded a significant acceleration. A digital current controller was validated experimentally on a laboratory test bench. The results demonstrate the value of the proposed approach.

33

Sethuraman, Balasubramanian. "Novel Methodologies for Efficient Networks-on-Chip Implementation on Reconfigurable Devices". Cincinnati, Ohio : University of Cincinnati, 2007. http://www.ohiolink.edu/etd/view.cgi?ucin1196043683.

Abstract

Thesis (Ph. D.)--University of Cincinnati, 2007.
Advisor: Ranga Vemuri. Title from electronic thesis title page (viewed Feb. 18, 2008). Keywords: Networks-on-Chip (NoC), System-on-Chip (SoC), FPGA, Reconfigurable & Platform-Based Design, Light Weight Router Design, Multi Local Port Router, Multicast Router, Low Power Topology Generation & Mapping, Power Issues and IR drop Analysis, Minimum. Includes abstract. Includes bibliographical references.

34

Gantel, Laurent. "Hardware and software architecture facilitating the operation by the industry of dynamically adaptable heterogeneous embedded systems". Phd thesis, Université de Cergy Pontoise, 2014. http://tel.archives-ouvertes.fr/tel-01019909.

Abstract

This thesis aims to define software and hardware mechanisms that help manage Heterogeneous and dynamically Reconfigurable Systems-on-Chip (HRSoC). The heterogeneity is due to the presence of general-purpose processing units and reconfigurable IPs. Our objective is to give application developers an abstracted view of this heterogeneity with regard to mapping tasks onto the available processing elements. First, we homogenize the user interface by defining a hardware thread model. We then homogenize the management of hardware threads: we implemented OS services that save and restore a hardware thread context, and developed design tools to overcome the relocation issue. The last step consisted in extending access to the distributed OS services to every thread running on the platform. This access is provided independently of the thread location and is realized by implementing the MRAPI API. These three steps build a solid basis for providing developers, in future work, with a design flow dedicated to HRSoC that allows precise architectural space exploration. Finally, to validate these mechanisms, we built a demonstration platform on a Virtex 5 FPGA running a dynamic tracking application.

35

Pereira, Fábio Dacêncio. "Proposta e implementação de uma Camada de Integração de Serviços de Segurança (CISS) em SoC e multiplataforma". Universidade de São Paulo, 2009. http://www.teses.usp.br/teses/disponiveis/3/3142/tde-18122009-124154/.

Abstract

Computer networks are increasingly complex environments, equipped with new services, users and infrastructures. Information security and privacy become fundamental to the evolution of these environments. Anonymity, fragility and other factors often encourage malicious individuals to create tools and techniques for attacking information and computer systems, which can cause anything from minor inconveniences to financial and moral damage. Intrusion detection combined with other security tools can protect against and prevent malicious attacks and anomalies in computer systems. However, given the complexity and robustness of such systems, security services are often unable to analyze and audit the entire information flow, leaving security weak points that can be discovered and exploited. In this context, this PhD thesis proposes, designs, implements and analyzes the performance of an Integrated Security Services Layer (ISSL; CISS in the Portuguese title). Security services such as firewall, IDS, antivirus, authentication tools, proprietary tools and cryptography services were implemented and integrated into the ISSL. Its main feature is a common structure for storing information about incidents that occur in a computer system. This information is the source of knowledge that allows the anomaly detection system embedded in the ISSL to act effectively in the prevention and protection of computer systems by detecting and classifying anomalous situations early. To this end, behavioural models were created based on the concepts of the Hidden Markov Model (HMM) and on models for analyzing anomalous sequences. The ISSL was implemented in three versions: (i) System-on-Chip (SoC), (ii) the JCISS software in Java and (iii) a simulator.
Results such as timing performance, occupancy rates, the impact on anomaly detection and implementation details are presented, compared and analyzed in this thesis. The ISSL obtained significant anomaly detection rates using the MHMM model: above 96% for known attacks, above 80% for partial attacks by time, above 96% for partial attacks by sequence and above 54% for unknown attacks. The main contributions of the ISSL are the creation of a structure for integrating security services and the correlation and analysis of anomalous occurrences to reduce false positives, to detect and classify abnormalities early, and to protect computer systems. In addition, solutions were created to improve detection, such as the sequential model, along with features such as subMHMM for real-time learning. Finally, the SoC and Java implementations allowed the ISSL to be evaluated and used in real environments.

36

Chiluvuri, Nayana Teja. "A Trusted Autonomic Architecture to Safeguard Cyber-Physical Control Leaf Nodes and Protect Process Integrity". Thesis, Virginia Tech, 2015. http://hdl.handle.net/10919/56572.

Abstract

Cyber-physical systems are networked through IT infrastructure and susceptible to malware. Threats targeting process control are much more safety-critical than traditional computing systems since they jeopardize the integrity of physical infrastructure. Existing defence mechanisms address security at the network nodes but do not protect the physical infrastructure if network integrity is compromised. An interface guardian architecture is implemented on cyber-physical control leaf nodes to maintain process integrity by enforcing high-level safety and stability policies. Preemptive detection schemes are implemented to monitor process behavior and anticipate malicious activity before process safety and stability are compromised. Autonomic properties are employed to automatically protect process integrity by initiating switch-over to a verified backup controller. Subsystems adhere to strict trust requirements safeguarding them from adversarial intrusion. The preemptive detection schemes, switch-over logic, backup controller, and process communication are all trusted components that are separated from the untrusted production controller. The proposed architecture is applied to a rotary inverted pendulum experiment and implemented on a Xilinx Zynq-7000 configurable SoC. The leaf node implementation is integrated into a cyber-physical control topology. Simulated attack scenarios show strengthened resilience to both network integrity and reconfiguration attacks. Threats attempting to disrupt process behavior are successfully thwarted by having a backup controller maintain process stability. The system ensures both safety and liveness properties even under adversarial conditions.
Master of Science

37

Rullmann, Markus. "Models, Design Methods and Tools for Improved Partial Dynamic Reconfiguration". Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2010. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-61526.

Resumen

Partial dynamic reconfiguration of FPGAs has attracted considerable attention from both academia and industry in recent years. With this technique, the functionality of the programmable devices can be adapted at runtime to changing requirements. The approach allows designers to use FPGAs more efficiently: e.g. FPGA resources can be time-shared between different functions, and the functions themselves can be adapted to changing workloads at runtime. Thus partial dynamic reconfiguration enables a unique combination of software-like flexibility and hardware-like performance. Still, there exists no common understanding of how to assess the overhead introduced by partial dynamic reconfiguration. This dissertation presents a new cost model for both the runtime and the memory overhead that results from partial dynamic reconfiguration. It is shown how the model can be incorporated into all stages of the design optimization for reconfigurable hardware. In particular, digital circuits can be mapped onto FPGAs such that only small fractions of the hardware must be reconfigured at runtime, which saves time, memory, and energy. The design optimization is most efficient if it is applied during high-level synthesis. This book describes how the cost model has been integrated into a new high-level synthesis tool. The tool allows the designer to trade off FPGA resource use against reconfiguration overhead. It is shown that partial reconfiguration causes only small overhead if the design is optimized with regard to reconfiguration cost. A wide range of experimental results is provided that demonstrates the benefits of the applied method.
Partielle dynamische Rekonfiguration von FPGAs hat in den letzten Jahren große Aufmerksamkeit von Wissenschaft und Industrie auf sich gezogen. Die Technik erlaubt es, die Funktionalität von progammierbaren Bausteinen zur Laufzeit an veränderte Anforderungen anzupassen. Dynamische Rekonfiguration erlaubt es Entwicklern, FPGAs effizienter einzusetzen: z.B. können Ressourcen für verschiedene Funktionen wiederverwendet werden und die Funktionen selbst können zur Laufzeit an veränderte Verarbeitungsschritte angepasst werden. Insgesamt erlaubt partielle dynamische Rekonfiguration eine einzigartige Kombination von software-artiger Flexibilität und hardware-artiger Leistungsfähigkeit. Bis heute gibt es keine Übereinkunft darüber, wie der zusätzliche Aufwand, der durch partielle dynamische Rekonfiguration verursacht wird, zu bewerten ist. Diese Dissertation führt ein neues Kostenmodell für Laufzeit und Speicherbedarf ein, welche durch partielle dynamische Rekonfiguration verursacht wird. Es wird aufgezeigt, wie das Modell in alle Ebenen der Entwurfsoptimierung für rekonfigurierbare Hardware einbezogen werden kann. Insbesondere wird gezeigt, wie digitale Schaltungen derart auf FPGAs abgebildet werden können, sodass nur wenig Ressourcen der Hardware zur Laufzeit rekonfiguriert werden müssen. Dadurch kann Zeit, Speicher und Energie eingespart werden. Die Entwurfsoptimierung ist am effektivsten, wenn sie auf der Ebene der High-Level-Synthese angewendet wird. Diese Arbeit beschreibt, wie das Kostenmodell in ein neuartiges Werkzeug für die High-Level-Synthese integriert wurde. Das Werkzeug erlaubt es, beim Entwurf die Nutzung von FPGA-Ressourcen gegen den Rekonfigurationsaufwand abzuwägen. Es wird gezeigt, dass partielle Rekonfiguration nur wenig Kosten verursacht, wenn der Entwurf bezüglich Rekonfigurationskosten optimiert wird. Eine Anzahl von Beispielen und experimentellen Ergebnissen belegt die Vorteile der angewendeten Methodik

38

Dick, Chris. "FPGAs: RE-INVENTING THE SIGNAL PROCESSOR". International Foundation for Telemetering, 2002. http://hdl.handle.net/10150/606348.

Resumen

International Telemetering Conference Proceedings / October 21, 2002 / Town & Country Hotel and Conference Center, San Diego, California
FPGAs are increasingly being employed for building real-time signal processing systems. They have been used extensively for implementing the PHY in software radio architectures. This paper provides a technology and market perspective on the use of FPGAs for signal processing and demonstrates FPGA DSP using an adaptive channel equalizer case study.

39

Makni, Mariem. "Un framework haut niveau pour l'estimation du temps d'exécution, des ressources matérielles et de la consommation d'énergie dans les accélérateurs à base de FPGA". Thesis, Valenciennes, 2018. http://www.theses.fr/2018VALE0042.

Resumen

Systems-on-chip (SoCs) have become increasingly complex thanks to advances in integrated circuit technology. Recent applications require high-performance systems, and FPGAs (Field Programmable Gate Arrays) can meet these needs. FPGAs are found in many application domains: embedded systems, telecommunications, signal and image processing, HPC compute servers, etc. Designers of these applications face many challenges, including the development of complex applications, code verification, and the need to automate the design process to increase productivity and meet time-to-market constraints. Recently, high-level synthesis (HLS) has come to be regarded as an effective solution to these challenges because it works at a higher level of abstraction. This technique automatically transforms a system specification written in C, C++ or SystemC into a register-transfer level (RTL) implementation. HLS tools offer a solution space with a large number of possible code-level optimizations, such as loop unrolling, dataflow and array partitioning. The designer must explore all these alternatives and measure the resulting performance in terms of execution time, hardware resources and energy consumption. In this thesis, we used FPGA-based hardware accelerators and developed the HAPE tool, which helps designers estimate performance, area and energy for various source-level configurations.
The proposed approach comprises four main contributions: (i) a high-level analytical model for estimating communication time and total execution time; (ii) an analytical model for estimating the various FPGA resources (DSPs, LUTs, FFs, BRAMs); (iii) an analytical model for estimating energy consumption based on hardware utilization (BRAMs, FFs, LUTs, etc.) while exploring the solution space for the different optimizations; and (iv) a design environment (HAPE) for exploring the three criteria: time, hardware resources and power consumption. The approach proposed in this thesis is based on a dynamic analysis of the executed code to extract data dependencies. This approach increases the accuracy of the estimates of communication time, hardware resource usage and energy consumption in FPGA-based accelerators. HAPE estimates these parameters with an error of less than 5% compared with RTL implementations.
In recent years, the complexity of system-on-chip (SoC) designs has increased dramatically. As a result, the growing demand for high performance and minimal power/area costs in embedded streaming applications calls for new architectures. The trend towards FPGA-based accelerators offers great potential for the computational power and performance required by diverse applications. The advantages of such architectures come from many sources; the most important is more efficient adaptation to varying application needs. In fact, many compute-intensive applications demand different trade-offs between processing capability and energy consumption, which FPGA-based accelerators may satisfy. Current research in performance, area and power analysis relies on register-transfer level (RTL) synthesis flows to produce accurate estimates. However, the complex hardware programming model (Verilog or VHDL) makes FPGA development a time-consuming process even as time-to-market constraints continue to tighten. Such techniques not only require advanced hardware expertise and time but are also difficult to use, making large design space exploration and time-to-market costly. High-Level Synthesis (HLS) technology has emerged in the last few years as a solution to address these problems and manage design complexity at a more abstract level. This technique aims to bridge the gap between the traditional RTL design process and the ever-increasing complexity of applications. The important advantage of HLS tools is the ability to automatically generate RTL implementations from high-level specifications (e.g., C/C++/SystemC). HLS tools provide various optimization pragmas such as loop unrolling, loop pipelining, dataflow and array partitioning. Unfortunately, the large design space resulting from the various combinations of pragmas makes exhaustive design space exploration prohibitively time-consuming with HLS tools.
In addition, to thoroughly evaluate such architectures, designers must perform large design space exploration to understand the trade-offs across the entire system, which is currently infeasible due to the lack of a fast simulation infrastructure for FPGA-based accelerators. Hence, there is a clear need for a pre-RTL, high-level framework to enable rapid design space exploration for FPGA-based accelerators.

40

Al-Araje, Abdul-Nasser. "Micronetwork based system-on-FPGA (SOFPGA) architecture". Connect to resource, 2005. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1122609799.

41

Gkalea, Salvator. "Fault-Tolerant Nostrum NoC on FPGA for the ForSyDe/NoC System Generator Tool Suite". Thesis, KTH, Elektronik och Inbyggda System, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-163426.

Resumen

Moore's law is the observation that over the years, the transistor density will increase, allowing billions of transistors to be integrated on a single chip. Over the last two decades, Moore's law has enabled the implementation of complex systems on a single chip (SoCs). The challenge of the System-on-Chip (SoC) era was the demand for an efficient communication mechanism between the growing number of processing cores on the chip. The outcome established a new interconnection scheme (among others, like crossbars, rings, buses) based on telecommunication networks, and the Network-on-Chip (NoC) appeared on the scene. The NoC has been developed not only to support systems embedded into a single processor, but also to support a set of processors embedded on a single chip. Therefore, the Multi-Processor System on Chip (MPSoC) has arisen, which incorporates processing elements, memories and I/O with a fixed interconnection infrastructure in a complete integrated system. In such systems, the NoC constitutes the backbone of the communication architecture that targets future SoCs composed of hundreds of processing elements. Besides that, together with the progress of deep sub-micron technology, some drawbacks have arisen. The communication efficiency and the reliability of the systems rely on the proper functionality of the NoC for on-chip data communication. A NoC must deal with the susceptibility of transistors to failure, which indicates the demand for a fault-tolerant communication infrastructure: a mechanism that can deal with the different classes of faults (transient, intermittent and permanent [11]) that can occur in the communication network. In this thesis, different algorithms are investigated that implement fault-tolerant techniques for permanent faults in the NoC. The intended outcome is a fault-tolerant mechanism for the NoC System Generator Tool [29], which is a Network-on-Chip research effort carried out at the Royal Institute of Technology. The fault-tolerant algorithm implemented in the switch to achieve packet rerouting around faulty communication links is described explicitly.
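
To make "rerouting around faulty links" concrete, here is a small Python sketch under stated assumptions. It is not the algorithm from the thesis (which the abstract does not specify): it assumes a 2D mesh with bidirectional links and uses a centralized breadth-first search, whereas a real switch would decide locally per hop. The function name `route` and its parameters are hypothetical.

```python
from collections import deque


def route(width, height, src, dst, faulty_links):
    """Return a shortest hop path from src to dst in a width x height mesh
    that avoids every link in faulty_links, or None if dst is unreachable.
    Nodes are (x, y) tuples; a link is a pair of adjacent nodes."""
    blocked = {frozenset(link) for link in faulty_links}  # links are undirected
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:         # walk back to the source
                path.append(node)
                node = parent[node]
            return path[::-1]
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            if not (0 <= nx < width and 0 <= ny < height):
                continue                    # outside the mesh
            if nxt in parent or frozenset((node, nxt)) in blocked:
                continue                    # already visited, or link is faulty
            parent[nxt] = node
            queue.append(nxt)
    return None


# Usage: a permanent fault on the link between (1, 0) and (2, 0) forces a detour.
detour = route(3, 3, (0, 0), (2, 0), [((1, 0), (2, 0))])
```

With no faults the path follows the top row; with the faulty link it takes two extra hops through the second row.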

42

Fabris, Eric Ericson. "A Modular and digitally programmable interface based on band-pass sigma-delta modulator for mixed-signal systems-on-chip". reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2005. http://hdl.handle.net/10183/6226.

Resumen

O foco desta tese é a descrição e validação de uma arquitetura de interface para processamento de sinais analógicos para SOC de sinais mistos. A abordagem proposta apresenta a possibilidade de cobertura de uma larga faixa de freqüências com performance praticamente constante associada a uma estrutura digital de programação. A premissa é usar uma célula analógica fixa e promover a configuração da aplicação no domínio digital, levando a uma arquitetura de interface de sinais mistos. O emprego de um bloco analógico fixo busca eliminar a perda inerente de performance decorrente da própria estrutura de programação em circuitos reconfiguráveis analógicos. A emprego da programação no domínio digital abre espaço para usos da vasta gama de ferramentas disponíveis para o projeto em alto nível de abstração, simulação e síntese automática para implementar a aplicação alvo com excelente predição do desempenho final. A abordagem proposta baseia-se no conceito de translação em freqüência (mixagem) do sinal de entrada seguida pela sua conversão para o domínio ΣΔ. A estrutura de processamento possibilita o emprego de um bloco analógico constante, e também, um processamento uniforme de sinais de entrada indo de DC até altas freqüências. A aplicação é configurada no domínio ΣΔ onde a performance pode ser predita de acordo com as especificações alvo. Objetivando a exploração do espaço de projeto foi desenvolvido o modelo de performance teórico e de simulação. Os modelos desenvolvidos auxiliam no também no projeto físico da interface proposta. Objetivando, tanto a validação dos modelos propostos, bem como o desenvolvimento de aplicações, foram construídos dois protótipos. São apresentados os usos da interface como um ADC paramétrico multi-banda e como um multiplicador e um somador de sinais analógicos. É proposta também uma arquitetura para uma interface analógica multi-canal. 
Os resultados experimentais empregados para a caracterização da interface proposta suportam as vantagens da mesma.
The focus of this thesis is the development and modeling of an interface architecture for interfacing analog signals in mixed-signal SOCs. We claim that the presented approach achieves a wide frequency range and covers a large range of applications with constant performance, allied to digital configuration compatibility. Our primary assumptions are to use a fixed analog block and to promote application configurability in the digital domain, which leads to a mixed-signal interface. The use of a fixed analog block avoids the performance loss common to configurable analog blocks. Configurability in the digital domain makes it possible to use all existing tools for high-level design, simulation and synthesis to implement the target application, with very good performance prediction. The proposed approach uses the concept of frequency translation (mixing) of the input signal followed by its conversion to the ΣΔ domain, which makes possible the use of a fairly constant analog block and a uniform treatment of input signals from DC to high frequencies. The programmability is performed in the ΣΔ digital domain, where performance can be closely matched to the application specification. Theoretical and simulation models of the interface performance are developed for design space exploration and for physical design support. Two prototypes are built and characterized to validate the proposed model and to implement some application examples. The use of this interface as a multi-band parametric ADC and as a two-channel analog multiplier and adder is shown. The multi-channel analog interface architecture is also presented. The characterization measurements support the main advantages of the proposed approach.

43

Hong, Chuan. "Towards the development of a reliable reconfigurable real-time operating system on FPGAs". Thesis, University of Edinburgh, 2013. http://hdl.handle.net/1842/8948.

Resumen

In the last two decades, Field Programmable Gate Arrays (FPGAs) have developed rapidly from simple “glue-logic” into a powerful platform capable of implementing a System on Chip (SoC). Modern FPGAs achieve not only high performance compared with General Purpose Processors (GPPs), thanks to hardware parallelism and dedication, but also better programming flexibility in comparison to Application Specific Integrated Circuits (ASICs). Moreover, the hardware programming flexibility of FPGAs is further harnessed for both performance and manipulability, which makes Dynamic Partial Reconfiguration (DPR) possible. DPR allows a part or parts of a circuit to be reconfigured at run-time, without interrupting the rest of the chip’s operation. As a result, hardware resources can be more efficiently exploited, since the chip resources can be reused by swapping hardware tasks in or out of the chip in a time-multiplexed fashion. In addition, DPR improves fault tolerance against transient errors and permanent damage: for example, Single Event Upsets (SEUs) can be mitigated by reconfiguring the FPGA to avoid error accumulation. Furthermore, power and heat can be reduced by removing finished or idle tasks from the chip. For all these reasons, DPR has significantly promoted Reconfigurable Computing (RC) and has become a very hot topic. However, since hardware integration is increasing at an exponential rate, and applications are becoming more complex with the growth of user demands, high-level application design and low-level hardware implementation are increasingly separated and layered. As a consequence, users can obtain little advantage from DPR without the support of system-level middleware.
To bridge the gap between the high-level application and the low-level hardware implementation, this thesis presents important contributions towards a Reliable, Reconfigurable and Real-Time Operating System (R3TOS), which facilitates user exploitation of DPR from the application level by managing the complex hardware in the background. In R3TOS, hardware tasks behave just like software tasks, which can be created, scheduled, and mapped to different computing resources on the fly. The novel contributions of this work are: 1) a novel implementation of an efficient task scheduler and allocator; 2) implementation of a novel real-time scheduling algorithm (FAEDF) and two efficacious allocating algorithms (EAC and EVC), which schedule tasks in real-time and circumvent emerging faults while maintaining more compact empty areas; 3) design and implementation of a fault-tolerant microprocessor by harnessing the existing FPGA resources, such as Error Correction Code (ECC) and configuration primitives; 4) a novel symmetric multiprocessing (SMP)-based architecture that supports a shared-memory programming interface; 5) two demonstrations of the integrated system, including a) the K-Nearest Neighbour classifier, which is a non-parametric classification algorithm widely used in various fields of data mining; and b) pairwise sequence alignment, namely the Smith-Waterman algorithm, used for identifying similarities between two biological sequences. R3TOS gives considerably higher flexibility to support scalable multi-user, multitasking applications, whereby resources can be dynamically managed with respect to user requirements and hardware availability. As a result, not only can the hardware resources be used more efficiently, but the system performance can also be significantly increased. Results show that the scheduling and allocating efficiencies have been improved by up to 2x, and the overall system performance is further improved by ~2.5x.
Future work includes the development of Network on Chip (NoC), which is expected to further increase the communication throughput; as well as the standardization and automation of our system design, which will be carried out in line with the enablement of other high-level synthesis tools, to allow application developers to benefit from the system in a more efficient manner.

44

Lotlikar, Swapnil Subhash. "Design, Implementation and Evaluation of a Configurable NoC for AcENoCs FPGA Accelerated Emulation Platform". Thesis, 2010. http://hdl.handle.net/1969.1/ETD-TAMU-2010-08-8381.

Resumen

The heterogeneous nature and the demand for extensive parallel processing in modern applications have resulted in widespread use of Multicore System-on-Chip (SoC) architectures. The emerging Network-on-Chip (NoC) architecture provides an energy-efficient and scalable communication solution for Multicore SoCs, serving as a powerful replacement for traditional bus-based solutions. The key to successful realization of such architectures is a flexible, fast and robust emulation platform for fast design space exploration. In this research, we present the design and evaluation of a highly configurable NoC used in AcENoCs (Accelerated Emulation platform for NoCs), a flexible and cycle-accurate field programmable gate array (FPGA) emulation platform for validating NoC architectures. Along with the implementation details, we also discuss the various design optimizations and tradeoffs, and assess the performance improvements of AcENoCs over existing simulators and emulators. We design a hardware library consisting of routers and links using the Verilog hardware description language (HDL). The router is parameterized and has a configurable number of physical ports, virtual channels (VCs) and pipeline depth. A packet-switched NoC is constructed by connecting the routers in either a 2D-Mesh or a 2D-Torus topology. The NoC is integrated in the AcENoCs platform and prototyped on a Xilinx Virtex-5 FPGA. The NoC was evaluated under various synthetic and realistic workloads generated by AcENoCs' traffic generators implemented on the Xilinx MicroBlaze embedded processor. In order to validate the NoC design, performance metrics like average latency and throughput were measured and compared against the results obtained using standard network simulators. FPGA implementation of the NoC using Xilinx tools indicated a 76% LUT utilization for a 5x5 2D-Mesh network. A VC allocator was found to be the single largest consumer of hardware resources within a router.
The router design synthesized at a frequency of 135MHz, 124MHz and 109MHz for 3-port, 4-port and 5-port configurations, respectively. The operational frequency of the router in the AcENoCs environment was limited only by the software execution latency even though the hardware itself could be clocked at a much higher rate. An AcENoCs emulator showed speedup improvements of 10000-12000X over HDL simulators and 5-15X over software simulators, without sacrificing cycle accuracy.

45

HSU, CHIH-WEI and 許智偉. "Advanced Driver Assistance System on Chip FPGA Prototyping Based on Landmark and Pavement Detections". Thesis, 2018. http://ndltd.ncl.edu.tw/handle/s9x828.

Resumen

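Master's thesis
National Kaohsiung First University of Science and Technology (國立高雄第一科技大學)
Master's Program, Department of Electronic Engineering
Academic year 106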
This paper proposes a prototype of an advanced driver assistance system based on landmark and road detection. The system has three front-vision subsystems: a Lane Departure Warning System (LDWS), a Forward Collision Warning System (FCWS) and an Adaptive Driving Beam System (ADBS). It is implemented as real-time digital circuits on a Field Programmable Gate Array (FPGA). The system can warn the driver before an accident so that the driver can take precautionary measures, or it can directly control the vehicle to minimize damage. The three subsystems are built on digital image processing and recognition technology. Firstly, the Landmark Detector Module is proposed in LDWS to find the left and right lane lines; the driver receives an immediate warning while the vehicle is changing lanes, as a reminder to pay attention to road conditions. Secondly, the Stixel Detector Module is proposed in FCWS to find the road surface; it immediately gives a warning when the distance to the vehicle in front is too small, reminding the driver to take precautions to avoid a collision. Finally, the Connected-component Labeling Algorithm (Labeling) is proposed in ADBS to segment the image, shine light on darker areas at night and prevent the light from shining toward vehicles coming from the opposite lane.
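
For readers unfamiliar with connected-component labeling, the segmentation step mentioned for ADBS can be sketched in Python as follows. This is a software illustration only, assuming a binary image and 4-connectivity; it uses a breadth-first flood fill, which is not necessarily how the FPGA design in the thesis implements labeling. The function name `label_components` is hypothetical.

```python
from collections import deque


def label_components(image):
    """Label 4-connected foreground regions of a binary image.
    image is a list of equal-length rows of 0/1 values.
    Returns (labels, count); background pixels keep label 0."""
    rows = len(image)
    cols = len(image[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 0 or labels[r][c] != 0:
                continue                    # background or already labeled
            count += 1                      # start a new component
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Usage: two bright blobs (e.g. two light sources) in a tiny frame.
frame = [[1, 1, 0, 0],
         [0, 1, 0, 1],
         [0, 0, 0, 1]]
labels, count = label_components(frame)
```

Each labeled region could then be treated as one candidate light source or dark area, which is the kind of per-region decision the abstract describes.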

46

Martins, João Fernando da Silva. "Desenvolvimento de um System-on-Chip basedo em Microblaze para aplicações automóveis". Master's thesis, 2014. http://hdl.handle.net/1822/41914.

Resumen

Integrated master's dissertation in Industrial Electronics and Computer Engineering
Hoje em dia deseja-se implementar num chip o maior número de funções possíveis, o que faz diminuir o número de microcontroladores necessários para uma determinada aplicação e assim a consequente diminuição de custos. O aparecimento das FPGAs de baixo custo nos últimos anos, levou à implementação de sistemas baseados em plataformas reconfiguráveis uma vez que as suas características permitem uma rápida prototipagem de diferentes implementações facilitando o desenvolvimento de vários projetos. A sua flexibilidade permite aos designers criar módulos customizáveis e específicos à aplicação. As FPGAs permitem a implementação de SoCs dedicados a aplicações onde métricas como desempenho, determinismo e time-to-market são muito importantes em sistemas de tempo real. A implementação de um SoC numa FPGA oferece um bom equilíbrio entre a flexibilidade de implementação e um rápido time-to-market. Esta dissertação passa por desenvolver um SoC orientado para aplicações automóveis. O SoC está dotado de um controlador de interrupções baseado no NVIC da ARM que permite o atendimento a interrupções com uma latência muito baixa e um array de timers baseado nos timers presentes no microcontrolador 32-bit TriCore™. A implementação destes periféricos permite a utilização deste SoC em aplicações automóveis devido ao determinismo que o mesmo oferece, bem como o seu desempenho. O processador do SoC desenvolvido é baseado no Microblaze, que segue uma arquitetura de processadores RISC como o DLX, um processador muito utilizado para o ensino ao longo dos anos. O processador implementa um datapath de cinco estágios de pipeline, de forma a aumentar o número de instruções executadas por unidade de tempo, possui uma hazard unit para resolver os problemas inerentes a uma implementação pipelined e um barramento para fazer a comunicação com os seus periféricos. 
O desenvolvimento desta dissertação é feito em paralelo com uma outra, onde foi desenvolvido o compilador que dá suporte ao SoC desenvolvido nesta dissertação. Várias decisões como o ISA foram tomadas em conjunto pelos dois responsáveis das duas dissertações.
Implementing a chip with a large number of features reduces the number of microcontrollers required for a particular application, and thus the project cost is reduced too. The advent of low-cost FPGAs in recent years has led to the implementation of systems based on reconfigurable platforms, since their features allow rapid prototyping of different implementations, facilitating the development of various projects. The flexibility of FPGAs allows designers to create customizable, application-specific modules. FPGAs allow the implementation of application-specific SoCs where metrics such as performance, determinism and time-to-market play a key role in real-time systems. The implementation of a SoC on an FPGA offers a good trade-off between implementation flexibility and fast time-to-market. This dissertation presents a SoC targeting automotive applications. The SoC features an interrupt controller based on the ARM NVIC, which allows interrupts to be serviced with very low latency, and an array of timers based on the timers present in the 32-bit TriCore™ microcontroller. The implementation of these peripherals allows the use of the SoC for automotive applications due to the determinism that it offers, as well as its performance and priority space unification capability. The SoC’s processor is based on the Microblaze, which follows a RISC architecture like the DLX processor, a processor widely used for teaching over the years. The processor implements a five-stage pipelined datapath in order to increase the number of instructions executed per unit of time, a hazard unit to solve the problems inherent to a pipelined implementation, and a bus to communicate with the peripherals. This work was developed in parallel with another dissertation, in which the compiler that supports this SoC was developed. Several decisions, such as the ISA, were taken jointly by the authors of the two dissertations.

47

Radner, Hannes. "Adaptive optische Wellenfrontkorrektur unter Einsatz des Fresnel-Leitsterns und eines hybriden Regelkreises implementiert auf einem Field-Programmable System-on-Chip". 2020. https://tud.qucosa.de/id/qucosa%3A75637.

Resumen

Laser-optical measurement systems are used in a variety of applications, including flow measurement in bubbles and droplets. Flow in droplets is of particular interest for fuel cell research, since water condensate can significantly reduce the performance of the cell. When measuring through a dynamic phase boundary, random refraction of light increases the measurement uncertainty considerably. This thesis therefore investigates how active wavefront correction, an approach widely used in astronomy, can be transferred to laser-optical flow measurement to correct for a randomly and dynamically fluctuating phase boundary with only one optical access through the interface. As a novel guide star, the Fresnel reflection of the surface, called the Fresnel Guide Star (FGS), was investigated and applied; it contains all information about the optical distortion. The new guide star was validated for two laser-optical measurement techniques, Laser Doppler Velocimetry (LDV) and Particle Image Velocimetry (PIV). For the imaging technique PIV, a control system was realized that performs an adaptive optical correction of an oscillating water surface. It consists of a Hartmann-Shack sensor (HSS), a signal processing unit and a 69-element deformable membrane mirror. The signal processing unit must reconstruct the wavefront of the FGS from the Hartmannogram, calculate the control signal and drive the membrane mirror. This complex Multiple-Input Multiple-Output (MIMO) control task places high demands on the system: the water surface oscillates at several hundred hertz, so the system needs a control rate in the kilohertz range to ensure sufficient margin.
To meet these requirements, a Field-Programmable System-on-Chip (FPSoC) was used as a hybrid computing unit. It combines a Central Processing Unit (CPU) and a Field-Programmable Gate Array (FPGA) on a single monolithic chip as a very powerful symbiosis of both architectures. The system achieved a control rate of 3.5 kHz and attenuated the optical distortion with an attenuation bandwidth of up to 150 Hz. In the PIV measurement, the increase in the standard uncertainty of the velocity field caused by the oscillation of the phase boundary was reduced by 67 %. The system could be used, for example, to optimize fuel cells by measuring the internal flow in droplets condensed on the chemically active, opaque membrane with only one optical access through the fluctuating interface. This could help explain how the droplet slides on the membrane surface, so that water can be removed more effectively and the performance of the cell increased. Further applications are flow measurement in Taylor bubbles, raindrops or liquid cooling films with an open surface, where the system expands the field of application for computational laser metrology. In general, the new FGS together with the low-latency control system has the potential to improve optical measurement through dynamically oscillating interfaces or to make such measurement possible at all.
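
The control chain described in this abstract (sensor measurement, wavefront reconstruction, control signal, mirror command) can be illustrated with a toy closed loop. The following is only a minimal sketch under stated assumptions: the integrator control law, the identity reconstructor matrix, the gain and the two-actuator size are all hypothetical and are not taken from the thesis, which uses a 69-element mirror and an FPSoC implementation.

```python
# Illustrative sketch only: an integrator-type adaptive-optics loop.
# The control law, reconstructor matrix, gain and sizes are assumptions
# made for this example, not values or methods from the thesis.

def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def control_step(slopes, reconstructor, commands, gain=0.5):
    """Turn measured slopes into a per-actuator error and integrate it."""
    error = matvec(reconstructor, slopes)
    return [c - gain * e for c, e in zip(commands, error)]


# Toy closed loop: two actuators, two slope measurements, static disturbance.
reconstructor = [[1.0, 0.0], [0.0, 1.0]]
disturbance = [0.8, -0.4]
commands = [0.0, 0.0]
for _ in range(20):
    # The sensor sees the disturbance plus the mirror's current correction.
    slopes = [d + c for d, c in zip(disturbance, commands)]
    commands = control_step(slopes, reconstructor, commands)

# What is left of the disturbance after correction.
residual = [d + c for d, c in zip(disturbance, commands)]
```

In this toy loop the residual halves on every iteration. For scale, the reported 3.5 kHz control rate gives roughly 23 loop iterations per period of a 150 Hz disturbance.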

48

Jäger, Markus. "Bereitstellung eines kompletten System-on-Chip aus AMBA 2.0 Komponenten sowie des LEON3-SPARC-Prozessors im Xilinx-EDK". 2008. https://ul.qucosa.de/id/qucosa%3A16629.

Abstract

New technological developments keep increasing the resources of today's FPGAs, which opens up ever more applications. For example, there is a growing desire to fit a complete system onto a single chip. Such Systems-on-Chip (SoC) consist of a processor, a bus system, and interfaces to external memory and other peripherals. With its EDK software, Xilinx offers an IP core library that makes it possible to synthesize a complete SoC for an FPGA. The Xilinx IP core library uses the MicroBlaze soft processor as its microprocessor (μP). It is not open source, and license fees are charged for its use. This thesis provides a new IP core library that is open source and therefore free to inspect and free to use. The thesis integrates the new library into the Xilinx EDK workflow, which makes it convenient to use. It is based on the IP core library of Gaisler Research, known as the Gaisler Research Library (GRLIB). GRLIB contains a large number of IP cores, among which a replacement was found for every IP core of the Xilinx library. GRLIB uses the LEON3 processor as its μP. The LEON3 was designed to the SPARC specification and is a highly flexible, configurable soft processor. The thesis also evaluates SnapGear Linux, which can run on the LEON3 processor with GRLIB components.

49

Wang, Zhoukun. "Design and Multi-Technology Multi-objective Comparative Analysis of Families of MPSOC". PhD thesis, 2009. http://pastel.archives-ouvertes.fr/pastel-00539555.

Abstract

Multiprocessor systems on chip (MPSOC) have emerged strongly over the past decade in communication, multimedia, networking and other embedded domains, and have become a new paradigm for the design of high-performance embedded applications. This thesis addresses the design and physical implementation of a Network on Chip (NoC) based multiprocessor system on chip. We studied several aspects at different design stages: high-level synthesis, architecture design, FPGA implementation, application evaluation and ASIC physical implementation. We analyze the impact of these aspects on the MPSOC's final performance, power consumption and area cost. We implemented a NoC-based embedded system with 16 processors on an FPGA prototype, in which three NoCs provide different functionalities for the sixteen PE tiles. We also demonstrated the use of our performance monitoring system for software debugging and tuning. Using bi-synchronous FIFOs, our GALS architecture solves the problem of distributing a clock signal over long distances and allows each clock domain to run at its own frequency. We also implemented the AES and TDES block ciphers on this platform, and the results show linear speedup in computation time. The network part of our architecture has been implemented in ASIC technology and explored with different timing constraints and different library categories of STMicroelectronics' 65 nm and 45 nm technologies. The experimental results for ASIC and FPGA are compared, and we discuss the impact of technology change on parallel programming.
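
The abstract does not say how its bi-synchronous FIFO passes state between clock domains. One widely used technique, shown here purely as an assumption and not as the design from the thesis, is to exchange Gray-coded read and write pointers, because only one bit changes per increment. The function names below are hypothetical.

```python
# Illustrative sketch: Gray-coded pointers, a common building block of
# dual-clock FIFOs. Assumed for illustration; the thesis may differ.

def to_gray(value):
    """Convert a binary counter value to its Gray-code equivalent."""
    return value ^ (value >> 1)


def from_gray(gray):
    """Recover the binary counter value from a Gray-code word."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


def bits_changed(a, b):
    """Count the bit positions in which two words differ."""
    return bin(a ^ b).count("1")


# A 4-bit pointer wrapping around a 16-entry address space.
DEPTH = 16
gray_sequence = [to_gray(i) for i in range(DEPTH)]
```

Because consecutive Gray-coded values differ in exactly one bit, including at the wrap-around, a pointer sampled in the other clock domain is either the old or the new value, never a mixture of the two.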

50

Rullmann, Markus. "Models, Design Methods and Tools for Improved Partial Dynamic Reconfiguration". Doctoral thesis, 2009. https://tud.qucosa.de/id/qucosa%3A25391.

Abstract

Partial dynamic reconfiguration of FPGAs has attracted considerable attention from both academia and industry in recent years. With this technique, the functionality of programmable devices can be adapted at runtime to changing requirements. The approach allows designers to use FPGAs more efficiently: for example, FPGA resources can be time-shared between different functions, and the functions themselves can be adapted to changing workloads at runtime. Partial dynamic reconfiguration thus enables a unique combination of software-like flexibility and hardware-like performance. There is still no common understanding of how to assess the overhead introduced by partial dynamic reconfiguration. This dissertation presents a new cost model for both the runtime and the memory overhead that results from partial dynamic reconfiguration. It shows how the model can be incorporated into all stages of design optimization for reconfigurable hardware. In particular, digital circuits can be mapped onto FPGAs such that only small fractions of the hardware must be reconfigured at runtime, which saves time, memory, and energy. The design optimization is most effective when it is applied during high-level synthesis. This work describes how the cost model has been integrated into a new high-level synthesis tool. The tool allows the designer to trade off FPGA resource use against reconfiguration overhead. It is shown that partial reconfiguration causes only small overhead if the design is optimized with regard to reconfiguration cost. A range of examples and experimental results demonstrates the benefits of the applied method.

Contents:
1 Introduction
1.1 Reconfigurable Computing
1.1.1 Reconfigurable System on a Chip (RSOC)
1.1.2 Anatomy of an Application
1.1.3 RSOC Design Characteristics and Trade-offs
1.2 Classification of Reconfigurable Architectures
1.2.1 Partial Reconfiguration
1.2.2 Runtime Reconfiguration (RTR)
1.2.3 Multi-Context Configuration
1.2.4 Fine-Grain Logic
1.2.5 Coarse-Grain Logic
1.3 Reconfigurable Computing Specific Design Issues
1.4 Overview of this Dissertation
2 Reconfigurable Computing Systems – Background
2.1 Examples for RSOCs
2.2 Partially Reconfigurable FPGAs: Xilinx Virtex Device Family
2.2.1 Virtex-II/Virtex-II Pro Logic Architecture
2.2.2 Reconfiguration Architecture and Reconfiguration Control
2.3 Methods for Design Entry
2.3.1 Behavioural Design Entry
2.3.2 Design Entry at Register-Transfer Level (RTL)
2.3.3 Xilinx Early Access Partial Reconfiguration Design Flow
2.4 Task Management in Reconfigurable Computing
2.4.1 Online and Offline Task Management
2.4.2 Task Scheduling
2.4.3 Task Placement
2.4.4 Reconfiguration Runtime Overhead
2.5 Configuration Data Compression
2.6 Evaluation of Reconfigurable Systems
2.6.1 Energy Efficiency Models
2.6.2 Area Efficiency Models
2.6.3 Runtime Efficiency Models
2.7 Similarity Based Reduction of Reconfiguration Overhead
2.7.1 Configuration Data Generation Methods
2.7.2 Device Mapping Methods
2.7.3 Circuit Design Methods
2.7.4 Model for Partial Configuration
2.8 Contributions of this Work
3 Runtime Reconfiguration Cost and Optimization Methods
3.1 Motivation
3.2 Reconfiguration State Graph
3.2.1 Reconfiguration Time Overhead
3.2.2 Dynamic Configuration Data Overhead
3.3 Configuration Cost at Bitstream Level
3.4 Configuration Cost at Structural Level
3.4.1 Definitions
3.4.2 Virtual Architecture
3.4.3 Reconfiguration Costs in the VA Context
3.5 Allocation Functions with Minimal Reconfiguration Costs
3.5.1 Allocation of Node Pairs
3.5.2 Direct Allocation of Nodes
3.5.3 Experiments
3.6 Summary
4 Implementation Tools for Reconfigurable Computing
4.1 Mapping of Netlists to FPGA Resources
4.1.1 Mapping to Device Resources
4.1.2 Connectivity Transformations
4.1.3 Mapping Variants and Reconfiguration Costs
4.1.4 Mapping of Circuit Macros
4.1.5 Global Interconnect
4.1.6 Netlist Hierarchy
4.2 Mapping Aware Allocation
4.2.1 Generalized Node Mapping
4.2.2 Successive Node Allocation
4.2.3 Node Allocation with Ant Colony Optimization
4.2.4 Examples
4.3 Netlist Mapping with Minimized Reconfiguration Cost
4.3.1 Mapping Database
4.3.2 Mapping and Packing of Elements into Logic Blocks
4.3.3 Logic Element Selection
4.3.4 Logic Element Selection for Min. Routing Reconfiguration
4.3.5 Experiments
4.4 Summary
5 High-Level Synthesis for Reconfigurable Computing
5.1 Introduction to HLS
5.1.1 HLS Tool Flow
5.1.2 Realization of the Hardware Tasks
5.2 New Concepts for Task-based Reconfiguration
5.2.1 Multiple Hardware Tasks in one Reconfigurable Module
5.2.2 Multi-Level Reconfiguration
5.2.3 Resource Sharing
5.3 Datapath Synthesis
5.3.1 Task Model
5.3.2 Resource Model
5.3.3 Resource Binding
5.3.4 Scheduling
5.3.5 Constraints for Scheduling and Resource Binding
5.4 Reconfiguration Optimized Datapath Implementation
5.4.1 Effects of Scheduling and Binding on Reconfiguration Costs
5.4.2 Strategies for Resource Type Binding
5.4.3 Strategies for Resource Instance Binding
5.5 Experiments
5.5.1 Summary of Binding Methods and Tool Setup
5.5.2 Cost Factors
5.5.3 Implementation Scenarios
5.5.4 Benchmark Characteristics
5.5.5 Benchmark Results
5.5.6 Discussion
5.6 Summary
6 Summary and Outlook
Bibliography
A Simulated Annealing
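
The idea that only a small fraction of the hardware needs to be rewritten when two configurations are similar can be illustrated at the level of configuration frames. The following is only a rough sketch: the frame size, the byte-level comparison and the function name `changed_frames` are assumptions made for illustration and do not reproduce the cost model defined in the dissertation.

```python
# Illustrative sketch: count the configuration frames that differ between
# two configurations. Frame size and data are invented for this example;
# this is not the cost model defined in the dissertation.

def changed_frames(current, target, frame_size):
    """Return the indices of frames whose contents differ."""
    if len(current) != len(target):
        raise ValueError("configurations must have the same length")
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")
    return [
        start // frame_size
        for start in range(0, len(current), frame_size)
        if current[start:start + frame_size] != target[start:start + frame_size]
    ]


# Two 16-byte toy configurations split into four 4-byte frames.
config_a = bytes(16)
config_b = bytearray(config_a)
config_b[5] = 0xFF   # falls into frame 1
config_b[6] = 0x0F   # also frame 1, so it adds no extra frame
config_b[12] = 0x01  # falls into frame 3

frames = changed_frames(config_a, bytes(config_b), frame_size=4)
cost = len(frames)   # number of frames that would have to be rewritten
```

In this toy case two of the four frames differ, so a frame-granular update would rewrite half of the configuration. A mapping that concentrates differences in fewer frames would lower that count.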

Theses on the topic "FPGA - System-on-Chip"