Dissertations on the topic "Architecture dataflow"

To view other types of publications on this topic, follow the link: Architecture dataflow.

Consult the top 50 dissertations for your research on the topic "Architecture dataflow".

Next to every work in the list of references you will find an "Add to bibliography" button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scholarly publication in .pdf format and read its abstract online, if the corresponding details are present in the metadata.

Browse dissertations on a wide variety of disciplines and compile your bibliography correctly.

1

Iannucci, Robert A. "A dataflow/von Neumann hybrid architecture." Thesis, Massachusetts Institute of Technology, 1988. http://hdl.handle.net/1721.1/14778.

2

Benjamin, Steven I. "Dataflow : overview and simulation /." Online version of thesis, 1988. http://hdl.handle.net/1850/10221.

3

Narayanaswamy, Ramya Priyadharshini. "Design of a Power-aware Dataflow Processor Architecture." Thesis, Virginia Tech, 2010. http://hdl.handle.net/10919/34192.

Abstract:
In a sensor-monitoring embedded computing environment, the data from a sensor is an event that triggers the execution of an application. A sensor node consists of multiple sensors and a general-purpose processor that handles the multiple events by deploying an event-driven software model. The software overheads of general-purpose processors result in energy inefficiency. What is needed is a class of special-purpose processing elements that are more energy efficient for computation. In the past, special-purpose microcontrollers have been designed that are energy efficient for the targeted application space; however, reuse of the same design techniques is not feasible for other application domains. Therefore, this thesis presents a power-aware dataflow processor architecture targeted at the electronic textile computing space. The processor architecture has no instructions and handles multiple events inherently, without deploying software methods. This thesis also shows that the power-aware implementation reduces the overall static power consumption.
Master of Science
4

Moser, Nico, Carsten Gremzow, and Matthias Menge. "Interconnection Optimization for Dataflow Architectures." Universitätsbibliothek Chemnitz, 2007. http://nbn-resolving.de/urn:nbn:de:swb:ch1-200700950.

Abstract:
In this paper we present a dataflow processor architecture based on [1], which is driven by control-flow-generated tokens. We show the special properties of this architecture with regard to scalability, extensibility, and parallelism. In this context we outline the application scope and compare our approach with related work. Advantages and disadvantages are discussed, and we suggest solutions to the disadvantages. Finally, an example implementation of this architecture is given and we take a look at further developments. We believe the features of this basic approach predestine the architecture especially for embedded systems and systems-on-chip.
5

Ruggiero, C. A. "Throttle mechanisms for the Manchester Dataflow Machine." Thesis, University of Manchester, 1987. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.382765.

6

Li, Feng. "Compiling for a multithreaded dataflow architecture : algorithms, tools, and experience." Phd thesis, Université Pierre et Marie Curie - Paris VI, 2014. http://tel.archives-ouvertes.fr/tel-00992753.

Abstract:
Across the wide range of multiprocessor architectures, all seem to share one common problem: they are hard to program. It is a general belief that parallelism is a software problem, and that perhaps we need more sophisticated compilation techniques to partition the application into concurrent threads. Many experts also make the point that the underlying architecture plays an equally important role, and must be addressed before one may expect significant progress in the programmability of multiprocessors. Our approach favors a convergence of these viewpoints. The convergence of dataflow and von Neumann architectures promises latency tolerance, the exploitation of a high degree of parallelism, and light thread-switching cost. Multithreaded dataflow architectures require a high degree of parallelism to tolerate latency; on the other hand, it is error-prone for programmers to partition the program into a large number of fine-grain threads. To reconcile these facts, we aim to advance the state of the art in automatic thread partitioning, in combination with programming-language support for coarse-grain, functionally deterministic concurrency. This thesis presents a general thread-partitioning algorithm for transforming sequential code into a parallel dataflow program targeting a multithreaded dataflow architecture. Our algorithm operates on the program dependence graph and on the static single assignment form, extracting task, pipeline, and data parallelism from arbitrary control flow, and coarsening its granularity using a generalized form of typed fusion. We design a new intermediate representation to ease code generation for an explicit-token-match dataflow execution model, and we implement a GCC-based prototype. We also evaluate coarse-grain dataflow extensions of OpenMP in the context of a large-scale, 1024-core, simulated multithreaded dataflow architecture. These extensions and the simulated architecture allow the exploration of innovative memory models for dataflow computing. We evaluate these tools and models on realistic applications.
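As background for the execution model mentioned in this abstract: in an explicit-token-match dataflow machine, operands travel as tagged tokens, and an instruction fires only once tokens with matching tags have arrived on all of its inputs. The toy interpreter below is a sketch for intuition only; the node names and API are invented here and are not the thesis's intermediate representation or toolchain:

```python
from collections import defaultdict

class DataflowGraph:
    """Fires a node once tokens with the same tag have arrived on all inputs."""

    def __init__(self):
        self.ops = {}                      # node name -> (function, arity)
        self.edges = defaultdict(list)     # node name -> [(dst node, dst port)]
        self.waiting = defaultdict(dict)   # (node, tag) -> {port: value}

    def node(self, name, fn, arity):
        self.ops[name] = (fn, arity)

    def connect(self, src, dst, port):
        self.edges[src].append((dst, port))

    def _send(self, node, port, tag, value, ready):
        fn, arity = self.ops[node]
        slot = self.waiting[(node, tag)]
        slot[port] = value
        if len(slot) == arity:             # explicit token match succeeded: fire
            del self.waiting[(node, tag)]
            ready.append((node, tag, fn(*(slot[p] for p in range(arity)))))

    def run(self, initial_tokens):
        ready, results = [], {}
        for node, port, tag, value in initial_tokens:
            self._send(node, port, tag, value, ready)
        while ready:
            node, tag, value = ready.pop()
            if not self.edges[node]:       # no successors: graph output
                results[(node, tag)] = value
            for dst, port in self.edges[node]:
                self._send(dst, port, tag, value, ready)
        return results

# (a + b) * (a - b), evaluated for two independent tags (activations) at once.
g = DataflowGraph()
g.node("add", lambda x, y: x + y, 2)
g.node("sub", lambda x, y: x - y, 2)
g.node("mul", lambda x, y: x * y, 2)
g.connect("add", "mul", 0)
g.connect("sub", "mul", 1)
tokens = []
for tag, (a, b) in enumerate([(5, 3), (10, 4)]):
    tokens += [("add", 0, tag, a), ("add", 1, tag, b),
               ("sub", 0, tag, a), ("sub", 1, tag, b)]
print(g.run(tokens))   # {('mul', 0): 16, ('mul', 1): 84} (order may vary)
```

The tag plays the role of an activation context (a loop iteration, for example), which is what lets independent activations of the same graph proceed in parallel.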
7

Motiwala, Quaeed. "Optimizations for acyclic dataflow graphs for hardware-software codesign." Thesis, This resource online, 1994. http://scholar.lib.vt.edu/theses/available/etd-06302009-040504/.

8

Savaş, Süleyman. "Linear Algebra for Array Signal Processing on a Massively Parallel Dataflow Architecture." Thesis, Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE), 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-2192.

Abstract:
This thesis presents deliberations on the implementation of the Gentleman-Kung systolic array for QR decomposition using Givens rotations, in the context of radar signal processing. The systolic array of Givens rotations is implemented and analysed on a massively parallel processor array (MPPA), the Ambric Am2045, and the tools dedicated to the MPPA are tested in terms of engineering efficiency. aDesigner, which is built on the Eclipse environment and was produced for the Ambric chip family, is used for programming, simulation, and performance analysis. Two parallel matrix multiplications were implemented to become familiar with the architecture and tools, and systolic arrays of different sizes were implemented and compared with each other. For programming, the ajava and astruct languages are provided; however, they do not support floating-point numbers, so fixed-point arithmetic is used in the systolic-array implementation of the Givens rotations. Stable and precise numerical results are obtained as outputs of the algorithms, but the analysis results are not reliable because of the performance analysis tools.
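For context, a Givens rotation zeroes one matrix entry at a time, and the Gentleman-Kung array pipelines these rotations across a triangular grid of cells. Below is a plain sequential sketch of Givens-rotation QR, illustrative only: it is not the Ambric implementation, and it uses floating point, whereas the thesis had to fall back on fixed-point arithmetic:

```python
import math

def givens(a, b):
    """Rotation (c, s) that zeroes b in the vector [a, b]."""
    if b == 0.0:
        return 1.0, 0.0
    r = math.hypot(a, b)
    return a / r, b / r

def qr_givens(A):
    """QR decomposition of an m x n matrix (list of rows) by Givens rotations."""
    m, n = len(A), len(A[0])
    R = [row[:] for row in A]
    Q = [[float(i == j) for j in range(m)] for i in range(m)]
    for j in range(n):
        for i in range(m - 1, j, -1):        # zero the entries below R[j][j]
            c, s = givens(R[i - 1][j], R[i][j])
            for k in range(n):               # rotate rows i-1 and i of R
                R[i - 1][k], R[i][k] = (c * R[i - 1][k] + s * R[i][k],
                                        -s * R[i - 1][k] + c * R[i][k])
            for k in range(m):               # accumulate Q = Q * G^T
                Q[k][i - 1], Q[k][i] = (c * Q[k][i - 1] + s * Q[k][i],
                                        -s * Q[k][i - 1] + c * Q[k][i])
    return Q, R                              # A = Q R, with R upper triangular

Q, R = qr_givens([[6.0, 5.0], [4.0, 3.0], [2.0, 1.0]])
# Entries of R below the diagonal are now (numerically) zero.
```

In the systolic version, each rotation's (c, s) pair is computed by a boundary cell and streamed to the internal cells of its row, so successive matrix rows enter the array in pipelined fashion.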

9

Savaş, Süleyman. "Linear Algebra for Array Signal Processing on a Massively Parallel Dataflow Architecture." Thesis, Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE), 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-4137.

Abstract:
This thesis presents deliberations on the implementation of the Gentleman-Kung systolic array for QR decomposition using Givens rotations, in the context of radar signal processing. The systolic array of Givens rotations is implemented and analysed on a massively parallel processor array (MPPA), the Ambric Am2045, and the tools dedicated to the MPPA are tested in terms of engineering efficiency. aDesigner, which is built on the Eclipse environment and was produced for the Ambric chip family, is used for programming, simulation, and performance analysis. Two parallel matrix multiplications were implemented to become familiar with the architecture and tools, and systolic arrays of different sizes were implemented and compared with each other. For programming, the ajava and astruct languages are provided; however, they do not support floating-point numbers, so fixed-point arithmetic is used in the systolic-array implementation of the Givens rotations. Stable and precise numerical results are obtained as outputs of the algorithms, but the analysis results are not reliable because of the performance analysis tools.

10

Pradal, Christophe. "Architecture de dataflow pour des systèmes modulaires et génériques de simulation de plante." Thesis, Montpellier, 2019. http://www.theses.fr/2019MONTS034.

Abstract:
Biological modeling, particularly of plant growth and functioning, is a rapidly expanding field that is useful in addressing climate change and food security issues at the global level. Modeling and simulation are essential tools for understanding the complex relationships between plant architecture and the processes that influence plant growth in a changing environment. For plant modeling, a large number of formalisms have been developed in many disciplines and at different scales of representation. The objective of this thesis is to define a modular architecture that makes it possible to simulate structural-functional plant systems by reusing and assembling different existing models. We first study the different approaches to software reuse proposed by Krueger, then blackboard systems and scientific workflow systems. These approaches are used to make software artifacts cooperate and to reuse and assemble them in a modular manner. Based on the observation that these systems provide the abstractions necessary for the integration of various artifacts, our working hypothesis is that a hybrid architecture, based on blackboard systems with dataflow-driven procedural control, would achieve modularity while allowing the modeler to maintain control over execution. In Chapter 2, we describe OpenAlea, a software-component platform with a scientific workflow system that allows the assembly and composition of models through a visual programming interface. In Chapter 3, we propose a data structure for the blackboard, combining a topological representation of plant architecture at different scales, the Multiscale Tree Graph, with its geometric spatialization using the 3D PlantGL library. In Chapter 4, we present lambda-dataflows, an extension of dataflows that makes it possible to couple simulation and analysis. Then, in Chapter 5, we present a first application, which illustrates the use of a generic gramineous leaf model in different plant models. Finally, in Chapter 6, we present the architectural elements used to develop a generic framework for modelling the development of foliar diseases in an architectured canopy. The architecture presented in this thesis and its implementation in OpenAlea are a first step towards open integrative modeling platforms allowing the cooperation of heterogeneous models in biology. The use of the scientific workflow formalism in analysis and simulation makes it possible to envision, in the short term, the development of large-scale collaborative and distributed simulation platforms.
11

Guo, Jinghong. "Distributed, Modular, Open Control Architecture for Power Conversion Systems." Diss., Virginia Tech, 2005. http://hdl.handle.net/10919/27900.

Abstract:
Due to close coupling to hardware and lack of software engineering technologies, the control software in digitally controlled power conversion systems is difficult to design and maintain. This is a natural consequence of a topology- or application-driven design approach. This research work proposes a distributed, modular, open control architecture for power conversion systems to reduce control design complexity, encapsulate and localize design dependencies, reduce unnecessary redesign effort and improve software quality. Dataflow style is chosen as the architectural style for the proposed control architecture based on comparative analysis. The detailed implementation of the dataflow architecture is presented. The resulting dataflow control software is evaluated in comparison to the legacy approach to control design used in industry and academia. The dataflow control software for a 3-phase voltage source inverter is also tested on a real PEBB-based converter system. To further explore the flexibility of control composition that is brought by the dataflow approach, the feasibility of dynamic control reconfiguration is also presented as an important future research direction.
Ph. D.
12

Lesparre, Youen. "Evaluation de l'affectation des tâches sur une architecture à mémoire distribuée pour des modèles flot de données." Thesis, Paris 6, 2017. http://www.theses.fr/2017PA066086/document.

Abstract:
With the increasing use of smartphones, connected objects, and automated vehicles, embedded systems have become ubiquitous in our living environment. These systems are often highly constrained in terms of power consumption and size. They are more and more often implemented with many-core processor arrays, which allow rapid design that meets stringent real-time constraints while operating at relatively low frequency, with reduced power consumption. Running an application on a processor array requires dispatching its tasks on the processors in order to meet capacity and performance constraints. This mapping problem is known to be NP-complete. The contributions of this thesis are threefold. First, we extend important notions from the Cyclo-Static Dataflow Graph to the Phased Computation Graph model, together with two equivalent sufficient conditions of liveness. Second, we present a random dataflow graph generator able to generate Synchronous Dataflow Graphs, Cyclo-Static Dataflow Graphs, and Phased Computation Graphs; the generator can produce live dataflow graphs of up to 10,000 tasks in less than 30 seconds, and it is compared with SDF3 and PREESM. Third, and most important, we propose a new method for evaluating a mapping using the Synchronous Dataflow Graph and Cyclo-Static Dataflow Graph models. The method efficiently evaluates the memory footprint of the communications of a dataflow graph mapped on a distributed-memory architecture. The evaluation comes in two versions: the first guarantees a live mapping, while the second accounts for a throughput constraint. The evaluation method is tested on dataflow graphs generated by Turbine and on real-life applications.
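As a refresher on the models involved: in a Synchronous Dataflow Graph each actor produces and consumes a fixed number of tokens per firing, and a consistent graph has a repetition vector giving how often each actor fires per periodic iteration, from which the memory needed by each communication channel can be bounded. The sketch below is a minimal illustration under invented rates; it is not the thesis's evaluation method or the Turbine generator:

```python
from fractions import Fraction
from math import lcm

# Channels: (producer, tokens produced per firing, consumer, tokens consumed per firing)
channels = [("src", 2, "filter", 3), ("filter", 1, "sink", 2)]

def repetition_vector(channels):
    """Solve prod * q[producer] == cons * q[consumer] for every channel."""
    rates = {channels[0][0]: Fraction(1)}
    changed = True
    while changed:                          # simple fixed-point propagation
        changed = False
        for p, prod, c, cons in channels:
            if p in rates and c not in rates:
                rates[c] = rates[p] * prod / cons; changed = True
            elif c in rates and p not in rates:
                rates[p] = rates[c] * cons / prod; changed = True
            elif p in rates and c in rates:
                assert rates[p] * prod == rates[c] * cons, "inconsistent SDF graph"
    denom = lcm(*(r.denominator for r in rates.values()))
    return {a: int(r * denom) for a, r in rates.items()}

q = repetition_vector(channels)
print(q)                                    # {'src': 3, 'filter': 2, 'sink': 1}
# A trivial (loose) per-channel memory bound for one periodic iteration:
for p, prod, c, cons in channels:
    print(f"{p} -> {c}: {q[p] * prod} tokens per iteration")
```

A mapping-aware evaluation, like the one the thesis proposes, would then count only the channels whose endpoints land on different processors of the distributed-memory target.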
13

Voigt, Sven-Ole. "Dynamically reconfigurable dataflow architecture for high performance digital signal processing on multi FPGA platforms." Aachen Shaker, 2008. http://d-nb.info/992481694/04.

14

Silva, Antonio Carlos Fernandes da. "ChipCflow: tool for convert C code in a static dataflow architecture in reconfigurable hardware." Universidade de São Paulo, 2015. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-30062015-141638/.

Abstract:
A growing search for alternative architectures and software has been noted in recent years. This search is driven by advances in hardware technology, and such advances must be complemented by innovations in design methodologies and in test and verification techniques in order to use the technology effectively. Alternative architectures and software generally exploit the parallelism of applications, unlike the von Neumann model. Among high-performance alternative architectures is the dataflow architecture, in which program execution is driven by data availability, so parallelism is intrinsic to the system. Dataflow architectures have again become a prominent research area due to hardware advances, in particular the advances in reconfigurable computing and Field Programmable Gate Arrays (FPGAs). The ChipCflow project is a tool for executing algorithms as dynamic dataflow graphs in FPGAs. This thesis describes the development of a code-conversion tool that generates applications for a static dataflow architecture, and presents the ChipCflow project of which the conversion tool is part. The algorithm to be converted is specified in the C language and converted into a hardware description language, following the model proposed by the ChipCflow project. The results are a proof of concept of converting high-level language code into a dataflow architecture to be used in an FPGA.
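The kind of translation this abstract describes, from an imperative expression to a static dataflow graph of operator nodes, can be sketched compactly. The toy converter below works on Python expressions via the `ast` module rather than on C, and its netlist format is invented for illustration; the real ChipCflow flow emits a hardware description language:

```python
import ast, itertools

# Toy converter: turn an arithmetic expression into a static dataflow netlist,
# where every operator becomes a node and every value flow becomes an edge.
counter = itertools.count()

def to_dataflow(expr):
    nodes, edges = {}, []
    def visit(n):
        if isinstance(n, ast.BinOp):
            op = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul"}[type(n.op)]
            name = f"{op}{next(counter)}"
            nodes[name] = op
            edges.append((visit(n.left), name))   # data edge: left operand
            edges.append((visit(n.right), name))  # data edge: right operand
            return name
        if isinstance(n, ast.Name):
            nodes[n.id] = "input"
            return n.id
        raise NotImplementedError(ast.dump(n))
    output = visit(ast.parse(expr, mode="eval").body)
    return nodes, edges, output

nodes, edges, out = to_dataflow("a*b + c*d")
print(nodes)   # operator and input nodes, e.g. {'a': 'input', ..., 'add0': 'add'}
print(edges)   # data edges, e.g. ('a', 'mul1'), ('mul1', 'add0'), ...
print("output node:", out)
```

Each netlist node would then map onto a hardware operator with handshake signals, so that a node fires as soon as tokens are present on its input edges.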
15

Ngo, Dinh Thanh. "Runtime mapping of dynamic dataflow applications on heterogeneous multiprocessor platforms." Thesis, Lorient, 2015. http://www.theses.fr/2015LORIS371/document.

Abstract:
Modern multimedia applications are subject to increasing complexity, with widespread standards. This has led to interest in the dataflow approach, which offers a powerful high-level perspective on parallel computations. In the meantime, the emergence of massively parallel architectures has revealed a trend towards heterogeneous Multi-Processor Systems-on-Chip (MPSoCs), which offer a better performance and energy tradeoff than their homogeneous counterparts. However, this also poses challenges for the mapping of multimedia applications onto such complex architectures. This thesis presents an adaptive methodology for mapping dataflow applications onto heterogeneous MPSoCs, focusing on video decoders specified in RVC-CAL, a dedicated dataflow language for video applications. Existing static approaches cannot capture all behaviors of dynamic dataflow applications, so the mapping must be adapted according to the input data. The algorithm offers adaptive parameters combined with our analytical communication model to improve performance while considering load balancing. We evaluate our algorithms on a set of randomly generated benchmarks and on real video decoders such as MPEG4-SP and HEVC. Experimental results reveal that our mapping methodology is fast (in milliseconds) and that runtime remapping significantly improves the initial mapping. In the remapping process, we take the migration cost into account, because the reconfiguration time also contributes to the overall performance.
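To make the mapping problem concrete, here is a toy greedy mapper that places each actor on the core minimizing a combined execution plus inter-core communication cost. All names, costs, and the cost model itself are illustrative assumptions, not the thesis's algorithm:

```python
# Greedy sketch: visit actors in order and pick, for each one, the core with the
# lowest sum of current load, execution cost, and communication penalties to
# already-placed neighbours. Real runtime mappers refine such initial placements.

def greedy_map(actors, cores, exec_cost, edges, comm_cost):
    placement, load = {}, {c: 0.0 for c in cores}
    for a in actors:
        best, best_cost = None, float("inf")
        for c in cores:
            cost = load[c] + exec_cost[a][c]
            for src, dst, tokens in edges:
                other = dst if src == a else src if dst == a else None
                if other in placement and placement[other] != c:
                    cost += tokens * comm_cost   # penalty for crossing cores
            if cost < best_cost:
                best, best_cost = c, cost
        placement[a] = best
        load[best] += exec_cost[a][best]
    return placement

actors = ["parser", "idct", "motion", "merge"]
cores = ["risc0", "risc1", "dsp0"]               # heterogeneous: dsp0 suits idct
exec_cost = {"parser": {"risc0": 4, "risc1": 4, "dsp0": 6},
             "idct":   {"risc0": 9, "risc1": 9, "dsp0": 2},
             "motion": {"risc0": 5, "risc1": 5, "dsp0": 4},
             "merge":  {"risc0": 3, "risc1": 3, "dsp0": 5}}
edges = [("parser", "idct", 2), ("parser", "motion", 2),
         ("idct", "merge", 1), ("motion", "merge", 1)]
print(greedy_map(actors, cores, exec_cost, edges, comm_cost=1.0))
```

In a dynamic dataflow setting the execution costs themselves vary with the input data, which is why the thesis complements the initial mapping with runtime remapping that also weighs the migration cost.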
16

Mandlekar, Anup Shrikant. "An Application Framework for a Power-Aware Processor Architecture." Thesis, Virginia Tech, 2012. http://hdl.handle.net/10919/34484.

Abstract:
Instruction-set-based general-purpose processors are not energy-efficient for event-driven applications. The E-textiles group at Virginia Tech proposed a novel dataflow processor architecture to bridge the gap between event-driven applications and the target architecture. The architecture, although promising in terms of performance and energy efficiency, had been explored for only a limited number of applications. This thesis presents a model-driven approach to the design of an application framework that facilitates the rapid development of software applications for testing the architecture's performance. The application framework is integrated with the prior automation framework, bringing software applications to the right level of abstraction. The processor architecture design is made flexible and scalable, making it suitable for a wide range of applications. Additionally, an embedded-flash-memory-based architecture design is proposed to reduce static power consumption; this thesis estimates a significant reduction in overall power consumption with the incorporation of flash memory.
Master of Science
17

Shelor, Charles F. "Dataflow Processing in Memory Achieves Significant Energy Efficiency." Thesis, University of North Texas, 2018. https://digital.library.unt.edu/ark:/67531/metadc1248478/.

Abstract:
The large difference between processor CPU cycle time and memory access time, often referred to as the memory wall, severely limits the performance of streaming applications. Some data centers have shown servers being idle three out of four clocks. High-performance instruction-sequenced systems are not energy efficient; the execute stage of even a simple pipelined processor uses only 9% of the pipeline's total energy. A hybrid dataflow system within a memory module is shown to have 7.2 times the performance with 368 times better energy efficiency than an Intel Xeon server processor on the analyzed benchmarks. The dataflow implementation exploits the inherent parallelism and pipelining of the application to improve performance without the overhead functions of caching, instruction fetch, instruction decode, instruction scheduling, reorder buffers, and speculative execution used by high-performance out-of-order processors. Coarse-grain reconfigurable logic in an energy-efficient silicon process provides the flexibility to implement multiple algorithms in a low-energy solution. Integrating the logic within a 3D-stacked memory module provides lower-latency, higher-bandwidth access to memory while operating independently from the host system processor.
18

Voigt, Sven O. [Verfasser]. "Dynamically Reconfigurable Dataflow Architecture for High-Performance Digital Signal Processing on Multi-FPGA Platforms / Sven O Voigt." Aachen : Shaker, 2009. http://d-nb.info/116130908X/34.

19

Zheng, Chunfang. "GRAPHICAL MODELING AND SIMULATION OF A HYBRID HETEROGENEOUS AND DYNAMIC SINGLE-CHIP MULTIPROCESSOR ARCHITECTURE." UKnowledge, 2004. http://uknowledge.uky.edu/gradschool_theses/249.

Abstract:
A single-chip, hybrid, heterogeneous, and dynamic shared-memory multiprocessor architecture is being developed which may be used for real-time and non-real-time applications. This architecture can execute any application described by a dataflow (process flow) graph of any topology; it can also dynamically reconfigure its structure at the node and processor-architecture levels and reallocate its resources to maximize performance and to increase reliability and fault tolerance. Dynamic change in the architecture is triggered by changes in parameters such as application input data rates, process execution times, and process request rates. The architecture is a Hybrid Data/Command Driven Architecture (HDCA). It operates as a dataflow architecture, but at the process level rather than the instruction level. This thesis focuses on the development, testing, and evaluation of new graphical software (hdca), which first performs a static resource allocation for the architecture to meet the timing requirements of an application and then simulates the architecture executing the application with the statically assigned resources and parameters. While simulating the architecture executing an application, the software graphically and dynamically displays parameters and mechanisms important to the architecture's operation and performance. The new graphical software is able to show the system- and node-level dynamic capability of the HDCA, can model a fixed or varying input data rate, and also allows fault-tolerance analysis of the architecture.
20

Cavenaghi, Marcos Antônio. "Implementação de um simulador para a arquitetura de dados Wolf." Universidade de São Paulo, 1992. http://www.teses.usp.br/teses/disponiveis/54/54132/tde-08062009-102639/.

Abstract:
This work presents the Proto-WOLF dataflow architecture and the implementation of a simplified event-driven simulator for this architecture. The WOLF project is a proposal for the implementation of a supercomputer based on the dynamic dataflow model with variable granularity. In order to place the work in context, some of the basic concepts involved in simulation are presented, together with a survey of the most relevant work in dataflow. The preliminary simulation results are presented, analyzed, and compared with known results from the Manchester Dataflow Machine. When the simulated Proto-WOLF machine has all its unique features disabled, it is expected to behave in a Manchester-like fashion, and the results obtained fully agree with this. It is therefore possible to conclude that the implemented simulator behaves properly.
21

Cavenaghi, Marcos Antônio. "Implementação e estudo da arquitetura a fluxo de dados Wolf." Universidade de São Paulo, 1997. http://www.teses.usp.br/teses/disponiveis/76/76132/tde-01062009-111139/.

Abstract:
This work presents the WOLF dataflow architecture. WOLF was proposed in the light of some known problems identified in previous work, such as the execution of sequential code and the handling of data structures (vectors and matrices). WOLF is based on the dynamic dataflow model and explores variable granularity, the finest being at instruction level. Some concepts developed in the design of other hybrid architectures, macro-dataflow and multithreading among them, guided the WOLF implementation. To study the WOLF architecture, a time-driven simulator (Saw) was developed in the object-oriented language C++; the code can be compiled by any 32-bit ANSI-standard compiler. This code was exhaustively tested, and the numeric results obtained in the experiments were equal to those obtained with a standard von Neumann architecture. The study identified some problems with the WOLF architecture, and some proposals were implemented in the simulator to try to identify their causes. The results led to an alteration of the WOLF architecture. The newly proposed architecture (WOLF II) is described in the last chapter, but it was not submitted to experiments as WOLF was.
22

White, Joey. "USING DATAFLOW ARCHITECTURE TO SOLVE THE TRANSPORT LAG PROBLEM WHEN INTERFACING WITH AN ENGINEERING MODEL FLIGHT COMPUTER IN A TELEMETRY SIMULATION." International Foundation for Telemetering, 1991. http://hdl.handle.net/10150/613183.

Abstract:
International Telemetering Conference Proceedings / November 04-07, 1991 / Riviera Hotel and Convention Center, Las Vegas, Nevada
One of the most challenging technical problems in the development of a spacecraft telemetry simulation is the interface with a flight computer running real-world flight software. The ability of the simulation to satisfy flight software requests for telemetry data, and to load, mode, and control the flight software along with the simulation, can be constrained or degraded using conventional interface solutions. Telemetry dataflow architecture systems can be utilized to solve the interface problems with fewer constraints. This is an especially attractive solution in a telemetry simulation where the telemetry system can also be used to format and serialize spacecraft telemetry, and to receive and preprocess commands. This paper discusses the concepts developed for such a system for a training simulation of the Orbital Maneuvering Vehicle for NASA at Johnson Space Center.
23

Arumí, Albó Pau. "Real-time multimedia on off-the-shelf operating systems: from timeliness dataflow models to pattern languages." Doctoral thesis, Universitat Pompeu Fabra, 2009. http://hdl.handle.net/10803/7558.

Abstract:
Software-based multimedia systems that deal with real-time audio, video, and graphics processing are pervasive today, not only in desktop workstations but also in ultra-light devices such as smartphones. The fact that most of the processing is done in software, using the high-level hardware abstractions and services offered by the underlying operating systems and library stacks, enables quick application development. In addition to this flexibility and immediacy (compared to hardware-oriented platforms), such platforms also offer soft real-time capabilities with appropriate latency bounds. Nevertheless, experts in the multimedia domain face a serious challenge: the features and complexity of their applications are growing rapidly; meanwhile, real-time requirements (such as low latency) and reliability standards increase. This thesis focuses on providing multimedia domain experts with a workbench of tools they can use to model and prototype multimedia processing systems. Such tools contain platforms and constructs that reflect the requirements of the domain and application, and not accidental properties of the implementation (such as thread synchronization and buffer management). In this context, we address two distinct but related problems: the lack of models of computation that can deal with continuous multimedia stream processing in real time, and the lack of appropriate abstractions and systematic development methods that support such models. Many actor-oriented models of computation exist, and they offer better abstractions than prevailing software engineering techniques (such as object orientation) for building real-time multimedia systems. The family of Process Networks and Dataflow models based on networks of connected processing actors are the best suited for continuous stream processing. Such models allow designs to be expressed close to the problem domain (instead of focusing on implementation details such as thread synchronization), and they enable better modularization and hierarchical composition. This is possible because the model does not over-specify how the actors must run, but only imposes data dependencies in a declarative-language fashion. These models deal with multi-rate processing and hence with complex periodic actor execution schedules. The problem is that the models do not incorporate the concept of time in a useful way and, hence, the periodic schedules do not guarantee real-time and low-latency behavior. This dissertation overcomes this shortcoming by formally describing a new model that we named Time-Triggered Synchronous Dataflow (TTSDF), whose periodic schedules can be interleaved by several time-triggered activations so that the inputs and outputs of the processing graph are regularly serviced. The TTSDF model has the same expressiveness (or equivalent computability) as the Synchronous Dataflow (SDF) model, with the advantage that it guarantees minimum latency and the absence of gaps and jitter in the output. Additionally, it enables run-time load balancing between callback activations, and parallelization. Actor-oriented models are not off-the-shelf solutions and do not suffice for building multimedia systems in a systematic engineering approach. We address this problem by proposing a catalog of domain-specific design patterns organized in a pattern language. This pattern language provides design reuse, paying special attention to the context in which a design solution is applicable, the competing forces it needs to balance, and the implications of its application. The proposed patterns focus on how to organize different kinds of actor connections, transfer tokens between actors, enable human interaction with the dataflow engine and, finally, rapidly prototype user interfaces on top of the dataflow engine, creating complete and extensible applications. As a case study, we present an object-oriented framework (CLAM), and specific applications built upon it, that make extensive use of the contributed TTSDF model and patterns.
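A rough intuition for the time-triggered interleaving (a toy sketch under invented names, not CLAM's actual scheduler): each periodic callback services the graph's input at a fixed instant, runs one period of the static schedule, and services the output, so the stream is produced without gaps as long as each period's work fits within the callback deadline:

```python
from collections import deque

class TTSDFRunner:
    """Toy time-triggered wrapper around a static dataflow schedule: each
    periodic callback services input, fires one period of the schedule in
    order, then services output."""

    def __init__(self, schedule):
        self.schedule = schedule          # list of (fn, src_queue, dst_queue)

    def callback(self, in_queue, out_queue, block):
        in_queue.extend(block)            # input serviced at the period start
        for fn, src, dst in self.schedule:
            while src:                    # fire while tokens are available
                dst.append(fn(src.popleft()))
        result = [out_queue.popleft() for _ in range(len(out_queue))]
        return result                     # output serviced at the period end

# Graph: in -> gain -> clip -> out, run for two callback periods.
a, b, c = deque(), deque(), deque()
runner = TTSDFRunner([(lambda x: 0.5 * x, a, b),
                      (lambda x: max(-1.0, min(1.0, x)), b, c)])
for block in ([0.2, 3.0], [-4.0, 0.1]):
    print(runner.callback(a, c, block))   # [0.1, 1.0] then [-1.0, 0.05]
```

Pinning the I/O to the callback boundaries is what removes gaps and jitter at the output; the firings themselves can be scheduled freely, or in parallel, inside the period.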
24

Amstel, Duco van. "Optimisation de la localité des données sur architectures manycœurs." Thesis, Université Grenoble Alpes (ComUE), 2016. http://www.theses.fr/2016GREAM019/document.

Abstract:
The continuous evolution of computer architectures has been an important driver of research in code optimization and compiler technologies. A trend in this evolution that can be traced back over decades is the growing ratio between the available computational power (IPS, FLOPS, ...) and the corresponding bandwidth between the various levels of the memory hierarchy (registers, cache, DRAM). As a result, the reduction of the amount of memory communications that a given code requires has been an important topic in compiler research. A basic principle for such optimizations is the improvement of temporal data locality: grouping all references to a single data point as close together as possible, so that the data is only required for a short duration and can then be moved to distant memory (DRAM) without any further memory communications. Yet another architectural evolution has been the advent of the multicore era and, in the most recent years, the first generation of manycore designs. These architectures have considerably raised the bar on the amount of parallelism available to programs and algorithms, but this is again limited by the available bandwidth for communications between the cores. This brings issues that previously were the sole preoccupation of distributed computing into the world of compiling and code optimization techniques. In this document we present a first dive into a new optimization technique which promises both a high-level model for data reuse and a large field of potential applications, a technique which we refer to as generalized tiling. It finds its source in the already well-known loop tiling technique, which has been applied with success to improve data locality for both registers and cache memory in the case of nested loops. This new "flavor" of tiling has a much broader perspective and is not limited to the case of nested loops. It is built on a new representation, the memory-use graph, which is tightly linked to a new model of both memory usage and communication requirements and which can be used for all forms of iterated code. Generalized tiling expresses data locality as an optimization problem, for which multiple solutions are proposed. With the abstraction introduced by the memory-use graph it is possible to solve this optimization problem in different environments. For experimental evaluations we show how this new technique can be applied in the context of loops, nested or not, as well as for computer programs expressed in a dataflow language. Anticipating the use of generalized tiling to also distribute computations over the cores of a manycore architecture, we provide some insight into the methods that can be used to model communications and their characteristics on such architectures. As a final point, and in order to show the full expressiveness of the memory-use graph, and even more of the underlying memory usage and communication model, we turn to the topic of performance debugging and the analysis of execution traces. Our goal is to provide feedback on the evaluated code and its potential for further improvement of data locality. Such traces may contain information about memory communications during an execution and show strong similarities with the previously studied optimization problem. This brings us to a short introduction to the algorithmics of directed graphs and the formulation of some new heuristics for the well-studied topic of reachability, as well as for the much less known problem of convex partitioning.
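The classic loop tiling that generalized tiling extends is easy to show on a matrix product: the tiled variant revisits small blocks that fit in cache instead of streaming whole rows and columns, improving temporal locality without changing the result. A minimal sketch, unrelated to the thesis's own benchmarks:

```python
# Classic loop tiling for temporal locality, shown on a matrix product.
# The tiled version touches B in T x T blocks that fit in cache, instead of
# streaming the whole of B for every row of A. T is a tuning parameter.

def matmul(A, B, n):
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C

def matmul_tiled(A, B, n, T=4):
    C = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, T):                 # iterate over tiles
        for jj in range(0, n, T):
            for kk in range(0, n, T):
                for i in range(ii, min(ii + T, n)):   # iterate inside a tile
                    for j in range(jj, min(jj + T, n)):
                        for k in range(kk, min(kk + T, n)):
                            C[i][j] += A[i][k] * B[k][j]
    return C

n = 6
A = [[float(i + j) for j in range(n)] for i in range(n)]
B = [[float(i * j % 5) for j in range(n)] for i in range(n)]
assert matmul(A, B, n) == matmul_tiled(A, B, n)   # same result, better locality
```

Generalized tiling applies the same grouping idea to the memory-use graph of arbitrary iterated code, rather than only to the iteration space of a nested loop.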
25

Suzanne, Aurélie. "Decision Support Query Processing of Spanning Event Streams." Thesis, Nantes Université, 2022. http://www.theses.fr/2022NANU4022.

Abstract:
The Big Data era requires new processing architectures, among which streaming systems, which have become very popular. Those systems are able to summarize infinite data streams with aggregates on the most recent data. However, up to now, only point events have been considered; spanning events, which come with a duration, have been left aside, restricted to the persistent-database world only. In this thesis, a unified framework to deal with such stream mechanisms on spanning events is defined. Then, we develop an engine for Aggregate Continuous Queries (ACQs), which is able to incorporate event lifespans to provide exact aggregate computation, and which provides adapted structures for an efficient computation of sliding windows. This engine is further extended to handle shared computation of simultaneously running ACQs, while properly managing out-of-order events. In order to elaborate at runtime the most efficient query execution plan, a cost-based policy is followed. Throughout this thesis, many experiments have been carried out to show the pertinence and the efficiency of our approaches in a large variety of contexts and stream profiles.
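The distinction between point and spanning events can be made concrete with a toy aggregate. The sketch below is a hypothetical illustration, not the thesis engine: each event carries a lifespan [start, end), and a count aggregate over tumbling windows must credit the event to every window its lifespan overlaps, whereas a point event would fall into exactly one window.

```c
#include <stdio.h>

#define W 10     /* window width (time units): an assumption for the example */
#define NWIN 6

typedef struct { int start, end; } event_t;  /* spanning event, lifespan [start, end) */

int main(void) {
    /* hypothetical stream of spanning events */
    event_t ev[] = { {1, 4}, {8, 23}, {15, 16}, {30, 55} };
    int n = sizeof ev / sizeof ev[0];
    int count[NWIN] = {0};

    /* A point event would increment exactly one window; a spanning event
     * must be counted in every window its lifespan overlaps. */
    for (int i = 0; i < n; i++) {
        int first = ev[i].start / W;
        int last  = (ev[i].end - 1) / W;
        for (int w = first; w <= last && w < NWIN; w++)
            count[w]++;
    }
    for (int w = 0; w < NWIN; w++)
        printf("window [%2d,%2d): %d event(s)\n", w * W, (w + 1) * W, count[w]);
    return 0;
}
```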
26

Friston, S. "Low latency rendering with dataflow architectures." Thesis, University College London (University of London), 2017. http://discovery.ucl.ac.uk/1544925/.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
The research presented in this thesis concerns latency in VR and synthetic environments. Latency is the end-to-end delay experienced by the user of an interactive computer system, between their physical actions and the perceived response to these actions. Latency is a product of the various processing, transport and buffering delays present in any current computer system. For many computer-mediated applications, latency can be distracting, but it is not critical to the utility of the application. Synthetic environments, on the other hand, attempt to facilitate direct interaction with a digitised world. Direct interaction here implies the formation of a sensorimotor loop between the user and the digitised world - that is, the user makes predictions about how their actions affect the world, and sees these predictions realised. By facilitating the formation of this loop, the synthetic environment allows users to directly sense the digitised world, rather than the interface, and induces perceptions such as that of the digital world existing as a distinct physical place. This has many applications for knowledge transfer and efficient interaction through the use of enhanced communication cues. The complication is that the formation of the sensorimotor loop that underpins this is highly dependent on the fidelity of the virtual stimuli, including latency. The main research questions we ask are how the characteristics of dataflow computing can be leveraged to improve the temporal fidelity of the visual stimuli, and what implications this has for other aspects of fidelity. Secondarily, we ask what effects latency itself has on user interaction. We test the effects of latency on physical interaction at levels previously hypothesized but unexplored. We also test for a previously unconsidered effect of latency on higher-level cognitive functions. To do this, we create prototype image generators for interactive systems and virtual reality, using dataflow computing platforms. We integrate these into real interactive systems to gain practical experience of the real, perceptible benefits of alternative rendering approaches, but also of the implications when they are subject to the constraints of real systems. We quantify the differences of our systems compared with traditional systems using latency and objective image fidelity measures. We use our novel systems to perform user studies into the effects of latency. Our high-performance apparatuses allow experimentation at latencies lower than previously tested in comparable studies. The low-latency apparatuses are designed to minimise what is currently the largest delay in traditional rendering pipelines, and we find that the approach is successful in this respect. Our 3D low-latency apparatus achieves lower latencies and higher fidelities than traditional systems; the conditions under which it can do this are, however, highly constrained. We do not foresee dataflow computing shouldering the bulk of the rendering workload in the future, but rather facilitating the augmentation of the traditional pipeline with a very high speed local loop. This may be an image distortion stage or otherwise. Our latency experiments revealed that many predictions about the effects of low latency should be re-evaluated and that experimenting in this range requires great care.
27

Astolfi, Vitor Fiorotto. "ChipCflow - em hardware dinamicamente reconfigurável." Universidade de São Paulo, 2009. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-05032010-203142/.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
Nos últimos anos, houve um grande avanço na computação reconfigurável, em particular em hardware que emprega Field-Programmable Gate Arrays. Porém, esse aumento de capacidade e desempenho aumentou a distância entre a capacidade de projeto e a disponibilidade de tecnologia para o desenvolvimento do projeto. As linguagens de programação imperativas de alto nível, como C, são mais apropriadas para o desenvolvimento de aplicativos complexos que as linguagens de descrição de hardware. Por isso, surgiram diversas ferramentas para o desenvolvimento de hardware a partir de código em C. A ferramenta ChipCflow, da qual faz parte este projeto, é uma delas. A execução dos programas por meio dessa ferramenta será completamente baseada em seu fluxo de dados, seguindo o modelo dinâmico encontrado nas arquiteturas de computadores a fluxo de dados, aproveitando ao máximo o paralelismo considerado natural desse modelo e as características do hardware parcialmente reconfigurável. Neste projeto em particular, o objetivo é a prova de conceito (proof of concept) para a criação de instâncias, em forma de operadores, de um algoritmo ChipCflow em hardware parcialmente reconfigurável, tendo como base a plataforma Virtex da Xilinx
In recent years, reconfigurable computing has become increasingly more advanced, especially in hardware that uses Field-Programmable Gate Arrays. However, this increase in capacity and performance has widened the gap between design capacity and the technology available for design development. Imperative high-level programming languages such as C are more appropriate for the development of complex algorithms than hardware description languages (HDL). For this reason, many ANSI C-like programming tools for the development of hardware came into existence. The ChipCflow tool, of which this project is part, is one of them. The execution of algorithms through this tool will be completely directed by data flow, according to the dynamic model found in dataflow architectures, taking advantage of their naturally high levels of parallelism and of the characteristics of partially reconfigurable hardware. In this project, the objective is a proof of concept for the creation of instances, in the form of operators, of a ChipCflow algorithm on partially reconfigurable hardware, taking the Xilinx Virtex platform as reference.
28

Lopes, Joelmir José. "ChipCflow - uma ferramenta para execução de algoritmos utilizando o modelo a fluxo de dados dinâmico em hardware reconfigurável." Universidade de São Paulo, 2012. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-05122012-154304/.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
Devido à complexidade das aplicações, a demanda crescente por sistemas que usam milhões de transistores e hardware complexo; tem sido desenvolvidas ferramentas que convertem C em Linguagem de Descrição de Hardware, tais como VHDL e Verilog. Neste contexto, esta tese apresenta o projeto ChipCflow, o qual usa arquitetura a fluxo de dados, para implementar lógica de alto desempenho em Field Programmable Gate Array (FPGA). Maquinas a fluxo de dados são computadores programáveis, cujo hardware é otimizado para computação paralela de granularidade fina dirigida por dados. Em outras palavras, a execução de programas é determinado pela disponibilidade dos dados, assim, o paralelismo é intrínseco neste sistema. Por outro lado, com o avanço da tecnologia da microeletrônica, o FPGA tem sido utilizado principalmente devido a sua flexibilidade, facilidade para implementar sistemas complexos e paralelismo intrínseco. Um dos desafios é criar ferramentas para programadores que usam linguagem de alto nível (HLL), como a linguagem C, e produzir hardware diretamente. Essas ferramentas devem usar a máxima experiência dos programadores, o paralelismo das arquiteturas a fluxo de dados dinâmica, a flexibilidade e o paralelismo do FPGA, para produzir um hardware eficiente, otimizado para alto desempenho e baixo consumo de energia. O projeto ChipCflow é uma ferramenta que converte os programas de aplicação escritos em linguagem C para a linguagem VHDL, baseado na arquitetura a fluxo de dados dinâmica. O principal objetivo dessa tese é definir e implementar os operadores do ChipCflow, usando a arquitetura a fluxo de dados dinâmica em FPGA. Esses operadores usam tagged tokens para identificar dados, com base em instâncias de operadores. A implementação dos operadores e das instâncias usam um modelo de implementação assíncrono em FPGA para obter maior velocidade e menor consumo
Due to the complexity of applications and the growing demand for systems using millions of transistors and correspondingly complex hardware, tools that convert C into a Hardware Description Language (HDL), such as VHDL and Verilog, have been developed. In this context, this thesis presents the ChipCflow project, which uses a dataflow architecture to implement high-performance logic in Field-Programmable Gate Arrays (FPGAs). Dataflow machines are programmable computers whose hardware is optimized for fine-grain, data-driven parallel computation. In other words, the execution of programs is determined by data availability, so parallelism is intrinsic in these systems. On the other hand, with the advances in microelectronics technology, the FPGA has been used mainly because of its flexibility, the facilities it offers to implement complex systems, and its intrinsic parallelism. One of the challenges is to create tools for programmers who use a high-level language (HLL), such as C, and produce hardware directly. These tools should use the utmost experience of the programmers, the parallelism of the dynamic dataflow architecture, and the flexibility and parallelism of the FPGA to produce efficient hardware, optimized for high performance and low power consumption. The ChipCflow project is a tool that converts application programs written in C into VHDL, based on the dynamic dataflow architecture. The main goal of this thesis is to define and implement the operators of ChipCflow using the dynamic dataflow architecture in FPGA. These operators use tagged tokens to identify data based on instances of operators; their implementation and instances use an asynchronous implementation model in FPGA to achieve higher speed and lower power consumption.
29

Bhagyanath, Anoop [Verfasser], and Klaus [Akademischer Betreuer] Schneider. "Code Generation for Synchronous Control Asynchronous Dataflow Architectures / Anoop Bhagyanath ; Betreuer: Klaus Schneider." Kaiserslautern : Technische Universität Kaiserslautern, 2021. http://d-nb.info/122615428X/34.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
30

MAHIOUT, ABDERRAHMANE. "Placement et ordonnancement automatiques de programmes dataflow data-paralleles sur les architectures paralleles." Paris 11, 1996. http://www.theses.fr/1996PA112268.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
This thesis work took place within the 8 1/2 project of the parallel architectures team at LRI. The goal of this project is to develop the theoretical and software tools needed to run large numerical simulations on current parallel architectures. The work consisted in designing tools for the automatic mapping and scheduling of programs onto current parallel architectures. Indeed, the current environments of parallel machines do not allow them to be exploited easily: the programmer must have perfect knowledge of the underlying architecture to develop an application, which induces design complexity; moreover, the application is dependent on the target architecture and therefore not portable. We proposed program models (the E-DFG graph) and an architecture model that take into account the data-parallel context of applications. We then built mapping/scheduling tools based on the exploitation of locality and on an as-early-as-possible scheduling policy. These mapping/scheduling tools were experimented on graph benchmarks and then on the SP2 parallel machine, which highlighted the importance of granularity as well as the need to improve the modelling of the architecture.
31

Selva, Manuel. "Performance monitoring of throughput constrained dataflow programs executed on shared-memory multi-core architectures." Thesis, Lyon, INSA, 2015. http://www.theses.fr/2015ISAL0055/document.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
Les progrès continus de la microélectronique couplés au problème de gestion de la puissance dissipée ont conduit les fabricants de processeurs à se tourner vers des puces dites multi-coeurs au début des années 2000. Ces processeurs sont composés de plusieurs unités de calcul indépendantes. Contrairement aux progrès précédents ces architectures multi-coeurs, le logiciel doit être en grande parti repensé pour tirer parti de toutes les unités de calcul. Il faut pouvoir paralléliser une application séquentielle en tâches le plus indépendantes possibles pour pouvoir les exécuter sur différentes unités de calcul. Pour cela, de nombreux modèles de programmations dits concurrents ont été proposés. Dans cette thèse nous nous intéressons aux programmes décrits à l’aide du modèle dataflow. Ce travail porte sur l’évaluation des performances de programmes dataflow (forme que revêtent typiquement des applications de types traitement de flux vidéos ou protocoles de communication) sur des architectures multi-coeurs. Plus particulièrement, le sujet de la thèse porte sur l’extension de modèles de programmation dataflow avec des éléments d’expression de propriétés de qualité de service ainsi que la prise en compte de ces éléments pour détecter, à l’exécution, les goulots d’étranglement de performance au sein des programmes. Les informations concernant les goulots d'étranglements collectées pendant l'exécution sont utilisées à la fois pour faire de l'analyse hors-ligne et pour faire des adaptations pendant l'exécution des programmes. Dans le premier cas, le programmeur utilise ces informations pour savoir quelles parties du programme dataflow il faut optimiser et pour savoir comment distribuer efficacement le programme sur les unités de calcul. Dans le second cas, les informations collectées sont utilisées par des mécanismes d'adaptation automatique afin de redistribuer le travail sur les différentes unités de calcul de façon plus efficace. Nous portons une attention particulière au profiling de l'utilisation faite par les applications dataflow du système mémoire. Les informations sur les échanges de données fournies par le modèle de programmation permettent d'exploiter de façon intelligente les architectures mémoires des machines multi-coeurs. Néanmoins, la complexité de ces dernières ne permet pas de façon générale d'évaluer statiquement l'impact sur les performances des accès mémoires. Nous proposons donc la mise en place d'un système de profiling mémoire pour des applications dataflow basé sur des mécanismes matériels
Because of physical limits, hardware designers have switched to parallel systems to exploit the still growing number of transistors per square millimeter of silicon. These parallel systems are made of several independent computing units. To benefit from these computing units, software must be changed: existing sequential applications have to be split into independent tasks to be executed in parallel on the different computing units. To that end, many concurrent programming models have been proposed and are in use today. We focus in this thesis on the dataflow concurrent programming model. This work is about performance evaluation of dataflow programs on multicore architectures. We propose to extend dataflow programming models with the notion of throughput constraints, and to take this information into account in the compilation tool chain to detect throughput bottlenecks at runtime. The profiling results gathered during the execution are used both for off-line analyses and to adapt the application during its execution. In the former case, the developer uses this information to know which parts of the dataflow program should be optimized and to distribute the program efficiently over the computing units. In the latter case, the profiling information is used by runtime adaptation mechanisms to distribute the work differently over the computing units. We place a particular focus on profiling the usage of the memory subsystem. The data exchange information provided by the programming model allows the memory subsystem of multicore architectures to be used efficiently. Nevertheless, the complexity of modern memory systems does not allow the impact of memory accesses on the global performance of the application to be evaluated statically. We therefore propose memory profiling dedicated to dataflow applications, based on hardware profiling mechanisms.
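A hypothetical flavor of the bottleneck detection described: if the programming model carries a declared throughput constraint per actor, runtime counters can be compared against it after each monitoring period. Everything below (actor names, rates, the sampling scheme) is invented for illustration and is not Selva's actual instrumentation.

```c
#include <stdio.h>

/* Toy post-mortem check: given per-actor firing counts sampled over a
 * monitoring period, flag actors whose observed rate falls below the
 * declared throughput constraint of the dataflow program. */
typedef struct {
    const char *name;
    long firings;        /* firings observed in the period */
    double required_hz;  /* throughput constraint from the program */
} actor_stat_t;

int main(void) {
    double period_s = 10.0;   /* monitoring period: an assumption */
    actor_stat_t a[] = {
        { "decode",  2500, 240.0 },
        { "filter",  2350, 240.0 },
        { "display", 2450, 240.0 },
    };
    for (int i = 0; i < 3; i++) {
        double rate = a[i].firings / period_s;
        printf("%-8s %7.1f Hz (need %6.1f) %s\n", a[i].name, rate,
               a[i].required_hz,
               rate < a[i].required_hz ? "<-- bottleneck candidate" : "");
    }
    return 0;
}
```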
32

Magna, Patrícia. "Redução dos bits de emparelhamento da máquina de fluxo de dados de Manchester." Universidade de São Paulo, 1992. http://www.teses.usp.br/teses/disponiveis/54/54132/tde-17042009-115457/.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
O modelo a fluxo de dados tem grande destaque em pesquisas em arquiteturas de alto desempenho. Neste modelo, o controle de execução é feito apenas pela disponibilidade dos dados, permitindo que seja explorado o máximo de paralelismo implícito em um programa. As propostas que serão expostas neste trabalho visam solucionar um particular problema da máquina de fluxo de dados de Manchester. Esta arquitetura para tratar código reentrante, impõe que as fichas de dados, além da indicação da instrução destino, possuam um rótulo. Estas informações extras, que formam 70% da ficha de dado, fazem com que a implantação da máquina seja complexa. Assim, o hardware impõe um sério limite a velocidade de processamento, impedindo a plena utilização do modelo. Neste trabalho, serão apresentadas propostas para a redução do número de informações necessárias para o correto funcionamento da máquina, possibilitando uma implementação mais simples e mais eficiente.
The dataflow model is especially relevant to research in high-performance architectures. In this model, execution control is driven only by data availability, thus allowing maximum exploitation of the parallelism implicit in programs. The present work is based on the Manchester dataflow machine which, in order to handle reentrant code, requires that data tokens carry, in addition to the destination instruction field, a label. This additional information, which corresponds to 70% of the data token, makes the machine implementation complex: the hardware substantially bounds the execution speed and prevents full utilization of the model. This work presents approaches for reducing the amount of information needed for proper machine operation, in order to achieve a simpler and more effective implementation.
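To see why the label (tag) dominates the token, consider an illustrative Manchester-style token layout and matching store. Field names and widths here are assumptions for the sketch, not the machine's exact encoding; the point visible in the struct is that destination and tag metadata dwarf the 32-bit payload.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative layout of a tagged token in a Manchester-style machine. */
typedef struct {
    uint32_t data;        /* the actual operand value           */
    uint32_t dest_instr;  /* destination instruction address    */
    uint8_t  dest_port;   /* left or right input of that instr. */
    uint32_t iteration;   /* tag: loop iteration level          */
    uint16_t activation;  /* tag: function activation name      */
} token_t;

#define STORE 64
static token_t store[STORE];
static int     used[STORE];

/* Matching store: a token waits until its partner with the same
 * destination and the same tag arrives; then the instruction fires. */
int match(token_t t) {
    for (int i = 0; i < STORE; i++)
        if (used[i] && store[i].dest_instr == t.dest_instr &&
            store[i].dest_port != t.dest_port &&
            store[i].iteration == t.iteration &&
            store[i].activation == t.activation) {
            used[i] = 0;
            return 1;                 /* pair complete: fire */
        }
    for (int i = 0; i < STORE; i++)
        if (!used[i]) { store[i] = t; used[i] = 1; break; }
    return 0;                         /* still waiting */
}

int main(void) {
    token_t left  = { 7, 100, 0, 3, 1 };
    token_t right = { 5, 100, 1, 3, 1 };
    printf("first arrival fires? %d\n", match(left));   /* 0: waits */
    printf("second arrival fires? %d\n", match(right)); /* 1: fires */
    return 0;
}
```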
33

Magna, Patrícia. "Proposta e simulação de uma arquitetura a fluxo de dados de segunda geração." Universidade de São Paulo, 1997. http://www.teses.usp.br/teses/disponiveis/76/76132/tde-06042009-113436/.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
Neste trabalho é apresentada a arquitetura SEED, proposta a partir das experiências adquiridas com as arquiteturas baseadas no modelo a fluxo de dados que foram estudadas até o presente. A arquitetura SEED utiliza o modelo a fluxo de dados para escalonar e executar blocos de instruções, visando aproveitar a principal qualidade apresentada pelo modelo, que consiste em expor o máximo de paralelismo existente nos programas. No entanto, a arquitetura explora paralelismo de granularidade mais grossa que as arquiteturas a fluxo de dados, a fim de reduzir o trafego de fichas de dados na arquitetura. Esta redução tenta resolver ou amenizar problemas como a excessiva ocupação de memória e a grande complexidade exigida do hardware. Além da especificação da funcionalidade de toda a arquitetura SEED, este trabalho apresenta uma proposta para o particionamento do código. A utilização desta proposta permite a geração de blocos de códigos que podem ser executados corretamente pela arquitetura SEED. Alguns benchmarks foram gerados utilizando essa proposta de particionamento de código. Estes benchmarks foram executados no simulador da arquitetura SEED, visando analisar e avaliar o comportamento da arquitetura com diversas configurações de hardware.
This work presents the SEED architecture, which was proposed based on the experience obtained with existing architectures based on the dataflow model. The SEED architecture uses the dataflow model to schedule and execute sets of instructions, called code blocks. This approach tries to exploit the main quality of the dataflow model, which is to expose the maximum parallelism of programs. However, the architecture explores coarser granularity than the one usually considered in dataflow architectures, in order to reduce the data token traffic in the architecture. This reduction tries to solve or mitigate problems such as excessive memory occupation and high hardware complexity. Besides the specification of all units that compose the SEED architecture, this work also proposes a way of partitioning programs, creating code blocks that may be executed by the SEED architecture. Some benchmarks were generated using this program-partitioning proposal. These benchmarks were executed on the SEED architecture simulator in order to analyze and evaluate the behavior of the proposed architecture under several hardware configurations.
34

Savas, Süleyman. "Utilizing Heterogeneity in Manycore Architectures for Streaming Applications." Licentiate thesis, Högskolan i Halmstad, Centrum för forskning om inbyggda system (CERES), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-33792.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
In the last decade, we have seen a transition from single-core to manycore in computer architectures due to performance requirements and limitations in power consumption and heat dissipation. The first manycores had homogeneous architectures consisting of a few identical cores; the transition now continues towards heterogeneous architectures with different types of cores specialized for different purposes. The applications executed on these architectures usually consist of several tasks requiring different hardware resources to be executed efficiently. Therefore, we believe that utilizing heterogeneity in manycores will increase the efficiency of the architectures in terms of performance and power consumption. However, the development of heterogeneous architectures is more challenging, and the transition from homogeneous to heterogeneous architectures increases the difficulty of efficient software development due to the increased complexity of the architecture. In order to increase the efficiency of hardware and software development, new hardware design methods and software development tools are required. Additionally, there is a lack of knowledge on what kind of heterogeneous manycore design is most efficient for different applications, and on how these applications perform when executed on current commercial manycores. This thesis studies manycore architectures in order to reveal possible uses of heterogeneity in manycores and to facilitate the choice of architecture for software and hardware developers. It defines a taxonomy for manycore architectures based on the levels of heterogeneity they contain and discusses the benefits and drawbacks of these levels. Additionally, it evaluates several applications, a dataflow language (CAL), a source-to-source compilation framework (Cal2Many), and a commercial manycore architecture (Epiphany). The compilation framework takes implementations written in the dataflow language as input and generates code targeting different manycore platforms. Based on these evaluations, the thesis identifies the bottlenecks of the architecture. It finally presents a methodology for developing heterogeneous manycore architectures that target specific application domains. Our studies show that using different types of cores in manycore architectures has the potential to increase the performance of streaming applications. If we add specialized hardware blocks to a core, the performance of the target application easily increases by 15x, while the core size increases by 40-50%, which can be optimized further. Other results show that dataflow languages, together with software development tools, decrease software development effort significantly (25-50%) while having a small impact (2-17%) on performance.
HiPEC (High Performance Embedded Computing)
NGES (Towards Next Generation Embedded Systems: Utilizing Parallelism and Reconfigurability)
35

Menon, Suraj S. "Supporting Distributed Fault Tolerance In A Real-Time Micro-Kernel." Thesis, Virginia Tech, 2006. http://hdl.handle.net/10919/35463.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
Research into modular approaches for constructing power electronics control systems has provided a number of benefits, as well as new opportunities. Control systems composed of an interconnected collection of standardized parts make distributed processing a realistic possibility. Unfortunately, current strategies for supporting software on such systems have a number of critical drawbacks. Many existing approaches rely on centralized control strategies, fail to support fault tolerance in the face of failures among processing nodes or communication links, and fail to robustly support live addition or removal of nodes from a running network. In this context, failure of a single element means failure of the entire system. This thesis describes research to extend the Dataflow Architecture Real-time Kernel (DARK) to support distributed, fault-tolerant execution of control algorithms for power electronics control systems. An appropriate scheme for fault-tolerant scheduling of processes on distributed processing nodes is described, added to DARK, and evaluated. Literature indicates that fault-tolerant multiprocessor scheduling of hard real-time tasks with task precedence constraints is an NP-hard problem. The new system is based on an off-line fault-tolerant scheduling strategy that generates a static schedule of tasks for each processing unit to follow. This algorithm handles both the task precedence constraints and the constraints imposed by the underlying network protocol (DRPESNET). Modifications to the underlying daisy-chained, packet-switched, time-triggered ring network protocol to support communication fault tolerance and plug-and-play addition or removal of live nodes from an existing control system are also described.
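Primary/backup duplication is one standard realization of fault-tolerant multiprocessor scheduling. The thesis' off-line algorithm is more elaborate (it also handles precedence and network constraints), but a minimal sketch of the backup-slot idea, with all task and node names invented, looks like this:

```c
#include <stdio.h>

/* Generic primary/backup static schedule: each task is assigned a
 * primary slot and a backup slot on a different node.  At runtime,
 * if the primary node is marked failed, the backup copy runs. */
typedef struct { const char *task; int primary_node; int backup_node; } slot_t;

int main(void) {
    slot_t schedule[] = {
        { "sample",  0, 1 },
        { "control", 1, 2 },
        { "pwm",     2, 0 },
    };
    int failed[3] = { 0, 1, 0 };    /* node 1 has failed in this scenario */

    for (int i = 0; i < 3; i++) {
        int node = failed[schedule[i].primary_node]
                 ? schedule[i].backup_node
                 : schedule[i].primary_node;
        printf("%-8s runs on node %d%s\n", schedule[i].task, node,
               failed[schedule[i].primary_node] ? " (backup)" : "");
    }
    return 0;
}
```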
Master of Science
36

Georgiou, Yiannis. "Contributions for resource and job management in high performance computing." Grenoble, 2010. http://www.theses.fr/2010GRENM079.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
Le domaine du Calcul à Haute Performance (HPC) évolue étroitement avec les dernières avancées technologiques des architectures informatiques et des besoins toujours croissants en demande de puissance de calcul. Cette thèse s'intéresse à l'étude d'un type d'intergiciel particulier appelé gestionnaire de tâches et ressources (RJMS) qui est chargé de distribuer la puissance de calcul aux applications dans les plateformes pour le HPC. Le RJMS joue un rôle central du fait de sa position dans la pile logicielle. Les dernières évolutions dans les couches matérielles et dans les applications ont largement augmenté le niveau de complexité auquel doit faire face ce type d'intergiciel. Des problématiques telles que le passage à l'échelle, la prise en compte d'un taux d'activité irrégulier, la gestion des contraintes liées à la topologie du matériel, l'efficacité énergétique et la tolérance aux pannes doivent être particulièrement pris en considération, afin, entre autres, de fournir une meilleure exploitation des ressources à la fois du point de vue global du système ainsi que de celui des utilisateurs. La première contribution de cette thèse est un état de l'art sur la gestion des tâches et des ressources ainsi qu'une analyse comparative des principaux intergiciels actuels et des différentes problématiques de recherche associées. Une métrique importante pour évaluer l'apport d'un RJMS sur une plate-forme est le niveau d'utilisation de l'ensemble du système. On constate parmi les traces d'activité de plusieurs plateformes qu'un grand nombre d'entre elles présentent un taux d'utilisation significativement inférieure à une pleine utilisation. Ce constat est la principale motivation des autres contributions de cette thèse qui portent sur les méthodes d'exploitations de ces périodes de sous-utilisation au profit de la gestion globale du système ou des applications en court d'exécution. Plus particulièrement cette thèse explore premièrement, les moyens d'accroître le taux de calculs utiles dans le contexte des grilles légères en présence d'une forte variabilité de la disponibilité des ressources de calcul. Deuxièmement, nous avons étudié le cas des tâches dynamiques et proposé différentes techniques s'intégrant au RJMS OAR et troisièmement nous évalués plusieurs modes d'exploitation des ressources en prenant en compte la consommation énergétique. Finalement, les évaluations de cette thèse reposent sur une approche expérimentale pour laquelle nous avons proposés des outils et une méthodologie permettant d'améliorer significativement la maîtrise et la reproductibilité d'expériences complexes propre à ce domaine d'étude
High Performance Computing is characterized by the latest technological evolutions of computing architectures and by the increasing needs of applications for computing power. A particular middleware, called the Resource and Job Management System (RJMS), is responsible for delivering computing power to applications. The RJMS plays an important role in HPC since it occupies a strategic place in the whole software stack, standing between the hardware and application layers. However, the latest evolutions in those layers have brought new levels of complexity to this middleware. Issues like scalability, management of topological constraints, energy efficiency and fault tolerance have to be particularly considered, among others, in order to provide better system exploitation from both the system and the user points of view. This dissertation provides a state of the art of the fundamental concepts and research issues of Resource and Job Management Systems. It provides a multi-level comparison (concepts, functionalities, performance) of some Resource and Job Management Systems in High Performance Computing. An important metric to evaluate the work of an RJMS on a platform is the observed system utilization. However, studies and logs of production platforms show that HPC systems in general suffer from significant under-utilization rates. Our study deals with these under-utilization periods by proposing methods to aggregate otherwise unutilized resources for the benefit of the system or the application. More particularly, this thesis explores RJMS-level mechanisms: 1) for increasing the jobs' useful computation rates in the highly volatile environments of a lightweight grid context, 2) for improving system utilization with malleability techniques, and 3) for providing energy-efficient system management through the exploitation of idle computing machines. Experimentation and evaluation in this type of context involve important complexities due to the inter-dependency of the multiple parameters that have to be kept under control. In this thesis we have developed a methodology based upon real-scale controlled experimentation, with submission of synthetic or real workload traces.
37

Stan, Oana. "Placement of tasks under uncertainty on massively multicore architectures." Thesis, Compiègne, 2013. http://www.theses.fr/2013COMP2116/document.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
Ce travail de thèse de doctorat est dédié à l'étude de problèmes d'optimisation combinatoire du domaine des architectures massivement parallèles avec la prise en compte des données incertaines tels que les temps d'exécution. On s'intéresse aux programmes sous contraintes probabilistes dont l'objectif est de trouver la meilleure solution qui soit réalisable avec un niveau de probabilité minimal garanti. Une analyse quantitative des données incertaines à traiter (variables aléatoires dépendantes, multimodales, multidimensionnelles, difficiles à caractériser avec des lois de distribution usuelles), nous a conduit à concevoir une méthode qui est non paramétrique, intitulée "approche binomiale robuste". Elle est valable quelle que soit la loi jointe et s'appuie sur l'optimisation robuste et sur des tests d'hypothèse statistique. On propose ensuite une méthodologie pour adapter des algorithmes de résolution de type approchée pour résoudre des problèmes stochastiques en intégrant l'approche binomiale robuste afin de vérifier la réalisabilité d'une solution. La pertinence pratique de notre démarche est enfin validée à travers deux problèmes issus de la compilation des applications de type flot de données pour les architectures manycore. Le premier problème traite du partitionnement stochastique de réseaux de processus sur un ensemble fixé de nœuds, en prenant en compte la charge de chaque nœud et les incertitudes affectant les poids des processus. Afin de trouver des solutions robustes, un algorithme par construction progressive à démarrages multiples a été proposé ce qui a permis d'évaluer le coût des solution et le gain en robustesse par rapport aux solutions déterministes du même problème. Le deuxième problème consiste à traiter de manière globale le placement et le routage des applications de type flot de données sur une architecture clustérisée. L'objectif est de placer les processus sur les clusters en s'assurant de la réalisabilité du routage des communications entre les tâches. Une heuristique de type GRASP a été conçue pour le cas déterministe, puis adaptée au cas stochastique clustérisé
This PhD thesis is devoted to the study of combinatorial optimization problems related to massively parallel embedded architectures when taking uncertain data (e.g. execution times) into account. Our focus is on chance-constrained programs, whose objective is to find the best solution that is feasible with a preset probability guarantee. A qualitative analysis of the uncertain data we have to treat (dependent random variables, multimodal, multidimensional, difficult to characterize through classical distributions) has led us to design a non-parametric method, the so-called "robust binomial approach", valid whatever the joint distribution and based on robust optimization and statistical hypothesis testing. We also propose a methodology for adapting approximate algorithms to solve stochastic problems by integrating the robust binomial approach when verifying solution feasibility. The practical relevance of our approach is validated through two problems arising in the compilation of dataflow applications for manycore platforms. The first problem treats the stochastic partitioning of networks of processes on a fixed set of nodes, taking into account the load of each node and the uncertainty affecting the weight of the processes. For finding stochastic solutions, a semi-greedy iterative algorithm has been proposed, which allowed measuring the robustness and cost of the solutions with regard to those of the deterministic version of the problem. The second problem consists in studying the global placement and routing of dataflow applications on a clusterized architecture. The purpose being to place the processes on clusters such that a feasible routing exists, a GRASP heuristic has been conceived first for the deterministic case and afterwards extended for the chance-constrained variant of the problem.
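One minimal reading of the "robust binomial approach": feasibility of a candidate solution is sampled over independent scenarios, and the solution is accepted only if a one-sided lower confidence bound on the feasibility probability clears the preset guarantee. The sketch below uses a normal approximation of the binomial tail as a simplification; the thesis relies on an exact statistical hypothesis test.

```c
#include <math.h>
#include <stdio.h>

/* Decide whether a candidate solution meets a chance constraint
 * P(feasible) >= p0, given n independent scenario evaluations of
 * which k were feasible.  We require the one-sided lower confidence
 * bound on the feasibility probability (normal approximation) to
 * clear p0.  A simplified stand-in for the exact binomial test. */
int accept(int k, int n, double p0) {
    double phat = (double)k / n;
    double z = 1.645;   /* ~ z for a 95% one-sided bound: an assumption */
    double lower = phat - z * sqrt(phat * (1.0 - phat) / n);
    return lower >= p0;
}

int main(void) {
    /* e.g. 970 of 1000 sampled execution-time scenarios met the deadline */
    printf("accept at p0=0.95? %d\n", accept(970, 1000, 0.95));  /* 1 */
    printf("accept at p0=0.97? %d\n", accept(970, 1000, 0.97));  /* 0 */
    return 0;
}
```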
38

Bodin, Bruno. "Analyse d'Applications Flot de Données pour la Compilation Multiprocesseur." Phd thesis, Université Pierre et Marie Curie - Paris VI, 2013. http://tel.archives-ouvertes.fr/tel-00922578.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
Embedded systems are electronic and computing devices, subject to numerous constraints, whose operation must be continuous. Dataflow programming models are often used to define the behaviour of these systems. This choice of model is motivated, on the one hand, by the fact that they can describe the cyclic behaviour required by embedded systems; and on the other hand, by the fact that these models lend themselves to analyses that can provide essential guarantees of correct operation and performance. The Kalray company offers an embedded architecture, the MPPA, accompanied by the ΣC programming language. This language describes applications in the form of an already well-studied dataflow model, the Cyclo-Static Dataflow Graph (CSDFG). However, the CSDFGs generated by this language are often too complex to allow the use of existing analysis techniques. The goal of this thesis is to provide algorithmic tools that solve the different analysis steps required to study a ΣC application, within a reasonable execution time and on large instances. We study three distinct analysis problems: the liveness test, the evaluation of the maximal throughput, and buffer sizing. For each of these problems, we provide fast algorithmic methods whose effectiveness has been verified experimentally. The methods we propose derive from results on periodic schedules; they provide approximate results without any performance guarantee. To overcome this weakness, we also propose new analysis tools based on K-periodic schedules. These schedules generalize our work on periodic scheduling and will allow us, in the near future, to design much more efficient analysis methods.
39

Silva, Bruno de Abreu. "Gerenciamento de tags na arquitetura ChipCflow - uma máquina a fluxo de dados dinâmica." Universidade de São Paulo, 2011. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-17052011-085128/.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
Nos últimos anos, percebeu-se uma crescente busca por softwares e arquiteturas alternativas. Essa busca acontece porque houve avanços na tecnologia do hardware e estes avanços devem ser complementados por inovações nas metodologias de projetos, testes e verificação para que haja um uso eficaz da tecnologia. Muitos dos softwares e arquiteturas alternativas, geralmente partem para modelos que exploram o paralelismo das aplicações, ao contrário do modelo de von Neumann. Dentre as arquiteturas alternativas de alto desempenho, tem-se a arquitetura a fluxo de dados. Nesse tipo de arquitetura, o processo de execução de programas é determinado pela disponibilidade dos dados. Logo, o paralelismo está embutido na própria natureza do sistema. O modelo a fluxo de dados possui a vantagem de expressar o paralelismo de maneira intrínseca, eliminando a necessidade de o programador explicitar em seu código os trechos onde deve haver paralelismo. As arquiteturas a fluxo de dados voltaram a ser um tema de pesquisa devido aos avanços do hardware, em particular, os avanços da Computação Reconfigurável e os FPGAs (Field-Programmable Gate Arrays). O projeto ChipCflow é uma ferramenta para execução de algoritmos usando o modelo a fluxo de dados dinâmico em FPGA. Este trabalho apresenta o formato para os tagged-tokens do ChipCflow, os operadores de manipulação das tags dos tokens e suas implementações a fim de que se tenha a PROVA-DE-CONCEITOS para tais operadores na arquitetura ChipCflow
Research on alternative architectures and software has been growing in recent years. This research is driven by advances in hardware technology, and such advances must be complemented by innovations in design, test and verification methodologies in order to use the technology effectively. Many of the alternative architectures and software models explore the parallelism of applications, in contrast to the von Neumann model. Among high-performance alternative architectures there is the dataflow architecture. In this kind of architecture, the execution of programs is determined by data availability, so parallelism is intrinsic to these systems. The dataflow model has the advantage of expressing parallelism intrinsically, eliminating the need for the programmer to mark explicitly in the code the regions where parallelism should occur. Dataflow architectures became a highlighted research area again due to hardware advances, in particular the advances of Reconfigurable Computing and FPGAs (Field-Programmable Gate Arrays). The ChipCflow project is a tool for the execution of algorithms using dynamic dataflow graphs in FPGA. This work presents the tagged-token format of ChipCflow, the operators that manipulate the tags of tokens, and their implementation, providing a proof of concept for such operators in the ChipCflow architecture.
40

Arnesen, Adam T. "Increasing Design Productivity for FPGAs Through IP Reuse and Meta-Data Encapsulation." BYU ScholarsArchive, 2011. https://scholarsarchive.byu.edu/etd/2614.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
As Moore's law continues to progress, it is becoming increasingly difficult for hardware designers to fully utilize the increasing number of transistors available in semiconductor devices, including FPGAs. This design productivity gap must be addressed to allow designs to take full advantage of the increased logic density that results from rising transistor density. The reuse of previously developed and verified intellectual property (IP) is one approach claimed to narrow the design productivity gap. Reuse, however, has proved difficult to realize in practice because of the complexity of IP and the reluctance of designers to reuse IP that they do not understand. This thesis proposes to narrow the design productivity gap for FPGAs by simplifying the reuse problem through encapsulating IP with extra machine-readable information, or meta-data. This meta-data simplifies reuse by providing a language-independent format for composing complex systems, providing a parameter representation system, defining high-level data types for FPGA IP, and allowing arbitrary IP to be described as actors in the homogeneous synchronous dataflow model of computation. This work implements meta-data in XML and presents two XML schemas that enable reuse. A new XML schema known as CHREC XML is presented, as well as extensions that enable IP-XACT to be used to describe FPGA dataflow IP. Two tools developed in this work are also presented that leverage meta-data to simplify reuse of arbitrary IP. These tools simplify structural composition of IP, allow designers to manipulate parameters, check and validate high-level data types, and automatically synthesize control circuitry for dataflow designs. Productivity improvements are also demonstrated by reusing IP to quickly compose software radio receivers.
41

Arras, Paul-Antoine. "Ordonnancement d'applications à flux de données pour les MPSoC embarqués hybrides comprenant des unités de calcul programmables et des accélérateurs matériels." Thesis, Bordeaux, 2015. http://www.theses.fr/2015BORD0031/document.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
Bien que de nombreux appareils numériques soient aujourd'hui capables de lire des contenus vidéo en temps réel et d'offrir une restitution de grande qualité, le décodage vidéo dans les systèmes embarqués n'en est pas pour autant devenu une opération anodine. En effet, les codecs récents tels que H.264 et HEVC sont d'une complexité telle que le recours à des architectures mixtes logiciel/matériel est presque incontournable. Or les plateformes de ce type sont notoirement difficiles à programmer efficacement. Cette thèse relève le défi du développement d'applications à flux de données pour les cibles embarquées hybrides et de leur exécution efficace, et propose plusieurs contributions. La première est une extension des heuristiques d'ordonnancement de liste pour tenir compte des contraintes mémorielles. La seconde est un modèle d'exécution à flot de données compatible avec la plupart des modèles existants et avec une large classe de plateformes matérielles, ainsi qu'un ordonnanceur dynamique. Enfin, de nombreux développements ont été menés sur une architecture réelle de STMicroelectronics pour démontrer la faisabilité de l'approche
Although numerous electronic devices are nowadays able to play video content in real time and offer high-quality reproduction, video decoding in embedded systems has not become a trivial process yet. As a matter of fact, recent codecs such as H.264 and HEVC exhibit such complexity that resorting to mixed software-hardware architectures is almost unavoidable. However, programming this kind of platform efficiently is well known to be tricky. This thesis addresses the issue of developing streaming applications for hybrid embedded targets and executing them efficiently, and proposes several contributions. The first one is an extension of the classical list-scheduling heuristics to take memory constraints into account. The second one is a dataflow execution model compatible with most existing models and with a large set of hardware platforms, as well as a dynamic scheduler. Lastly, numerous developments have been carried out on a real-world architecture from STMicroelectronics so as to demonstrate the feasibility of the approach.
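A memory-aware variant of list scheduling, the first contribution mentioned, can be sketched in a few lines: tasks are visited in priority order, and a task is only placed on a core whose remaining local memory can hold its working set. This toy version ignores precedence edges, and all task names, durations and memory budgets are invented.

```c
#include <stdio.h>

/* Toy memory-constrained list scheduler. */
typedef struct { const char *name; int time; int mem; } task_t;

#define CORES 2
int main(void) {
    task_t tasks[] = {            /* already sorted by priority */
        { "idct",   5, 60 }, { "mc",  4, 50 },
        { "filter", 3, 30 }, { "vlc", 2, 10 },
    };
    int free_time[CORES] = {0, 0};
    int free_mem[CORES]  = {100, 64};  /* per-core scratchpad budgets (assumed) */

    for (int t = 0; t < 4; t++) {
        int best = -1;
        /* among cores with enough memory, pick the earliest available one */
        for (int c = 0; c < CORES; c++)
            if (free_mem[c] >= tasks[t].mem &&
                (best < 0 || free_time[c] < free_time[best]))
                best = c;
        if (best < 0) { printf("%s: no core fits\n", tasks[t].name); continue; }
        printf("%-6s -> core %d at t=%d\n", tasks[t].name, best, free_time[best]);
        free_time[best] += tasks[t].time;
        free_mem[best]  -= tasks[t].mem;  /* buffers pinned for the whole run */
    }
    return 0;
}
```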
42

Glanon, Philippe Anicet. "Deployment of loop-intensive applications on heterogeneous multiprocessor architectures." Thesis, université Paris-Saclay, 2020. http://www.theses.fr/2020UPASG029.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
Les systèmes cyber-physiques (CPS en anglais) sont des systèmes distribués qui intègrent un large panel d'applications logicielles et de ressources de calcul hétérogènes connectées par divers moyens de communication (filaire ou non-filaire). Ces systèmes ont pour caractéristique de traiter en temps-réel, un volume important de données provenant de processus physiques, chimiques ou biologiques. Une problématique essentielle dans la phase de conception des CPSs est de prédire le comportement temporel des applications logicielles et de fournir des garanties de performances pour ces applications. Afin de répondre à cette problématique, des stratégies d'ordonnancement statique sont nécessaires. Ces stratégies doivent tenir compte de plusieurs contraintes, notamment les contraintes de dépendances cycliques induites par les boucles de calcul des applications ainsi que les contraintes de ressource et de communication des architectures de calcul. En effet, les boucles étant l'une des parties les plus critiques en temps d'exécution pour plusieurs applications de calcul intensif, le comportement temporel et les performances optimales des applications logicielles dépendent de l'ordonnancement optimal des structures de boucles embarqués dans les programmes de calcul. Pour prédire le comportement temporel des applications logicielles et fournir des garanties de performances pour ces applications, les stratégies d'ordonnancement statiques doivent donc explorer et exploiter efficacement le parallélisme embarqué dans les patterns d'exécution des programmes à boucles intensives tout en garantissant le respect des contraintes de ressources et de communication des architectures de calcul. L'ordonnancement d'un programme à boucles intensives sous contraintes ressources et communication est un problème complexe et difficile. Afin de résoudre efficacement ce problème, il est indispensable de concevoir des heuristiques. Cependant, pour concevoir des heuristiques efficaces, il est important de caractériser l'ensemble des solutions optimales pour le problème d'ordonnancement. Une solution optimale pour un problème d'ordonnancement est un ordonnancement qui réalise un objectif optimal de performance. Dans cette thèse, nous nous intéressons au problème d'ordonnancement des programmes à boucles intensives sur des architectures de calcul multiprocesseurs hétérogènes sous des contraintes de ressource et de communication, avec l'objectif d'optimiser le débit de fonctionnement des applications logicielles. Pour ce faire, nous utilisons les modèles de flots de données statiques pour décrire les structures de boucles spécifiées dans les programmes de calcul et nous concevons des stratégies d'ordonnancement périodiques sur la base des propriétés structurelles et mathématiques de ces modèles afin de générer des solutions optimales et approximatives d'ordonnancement
Cyber-physical systems (CPSs) are distributed computing-intensive systems that integrate a wide range of software applications and heterogeneous processing resources, each interacting with the others through different communication resources in order to process a large volume of data sensed from physical, chemical or biological processes. An essential issue in the design stage of these systems is to predict the timing behaviour of software applications and to provide performance guarantees to these applications. In order to tackle this issue, efficient static scheduling strategies are required to deploy the computations of software applications on the processing architectures. These scheduling strategies should deal with several constraints, which include the loop-carried dependency constraints between the computational programs as well as the resource and communication constraints of the processing architectures intended to execute these programs. Actually, loops being among the most time-critical parts of many computing-intensive applications, the optimal timing behaviour and performance of the applications depend on the optimal schedule of the loop structures enclosed in the computational programs executed by the applications. Therefore, to provide performance guarantees for the applications, the scheduling strategies should efficiently explore and exploit the parallelism embedded in the repetitive execution patterns of loops, while ensuring the respect of the resource and communication constraints of the processing architectures of CPSs. Scheduling a loop under resource and communication constraints is a complex problem; to solve it efficiently, heuristics are necessary. However, to design efficient heuristics, it is important to characterize the set of optimal solutions for the scheduling problem, that is, the schedules that achieve an optimal performance goal. In this thesis, we tackle the study of resource- and communication-constrained scheduling of loop-intensive applications on heterogeneous multiprocessor architectures, with the goal of optimizing throughput performance for the applications. In order to characterize the set of optimal scheduling solutions and to design efficient scheduling heuristics, we use the synchronous dataflow (SDF) model of computation to describe the loop structures specified in the computational programs of software applications, and we design software-pipelined scheduling strategies based on the structural and mathematical properties of the SDF model.
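The periodic schedules mentioned here rest on the SDF balance equations: on every edge i between consecutive actors, q[i] * prod[i] = q[i+1] * cons[i], and the smallest positive integer solution q is the repetition vector. A minimal solver for a chain of three actors, with made-up rates:

```c
#include <stdio.h>

/* Balance equations for a chain of SDF actors A0 -> A1 -> A2: solving
 * q[i] * prod[i] == q[i+1] * cons[i] on each edge yields the repetition
 * vector that underlies periodic (and software-pipelined) schedules. */
static long gcd(long a, long b) { return b ? gcd(b, a % b) : a; }

int main(void) {
    long prod[] = { 2, 3 };     /* tokens produced per firing on edge i */
    long cons[] = { 3, 4 };     /* tokens consumed per firing on edge i */
    long q[3]; q[0] = 1;

    /* propagate along the chain, rescaling to keep the vector integral */
    for (int i = 0; i < 2; i++) {
        long g = gcd(q[i] * prod[i], cons[i]);
        long scale = cons[i] / g;
        for (int j = 0; j <= i; j++) q[j] *= scale;
        q[i + 1] = q[i] * prod[i] / cons[i];
    }
    /* expect q = (6, 4, 3): 6*2 == 4*3 and 4*3 == 3*4 */
    printf("repetition vector: q = (%ld, %ld, %ld)\n", q[0], q[1], q[2]);
    return 0;
}
```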
43

Farabet, Clément. "Analyse sémantique des images en temps-réel avec des réseaux convolutifs." Phd thesis, Université Paris-Est, 2013. http://tel.archives-ouvertes.fr/tel-00965622.

Повний текст джерела
Стилі APA, Harvard, Vancouver, ISO та ін.
Анотація:
One of the central questions of computer vision is the design and learning of representations of the visual world. What kind of representation can enable an artificial vision system to detect and classify objects into categories, independently of their pose, scale, illumination, and occlusion? More interestingly, how can such a system learn this representation automatically, the way animals and humans manage to form a representation of the world around them? A related question is that of computational feasibility, and more precisely of computational efficiency: given a visual model, how efficiently can it be trained and applied to new sensory data? This efficiency has several dimensions: energy consumed, computation speed, and memory usage. In this thesis I present three contributions to computer vision: (1) a new multi-scale deep convolutional network architecture that can capture long-range relationships between input variables in image-like data, (2) a tree-based algorithm that explores multiple segmentation candidates in order to produce a semantic segmentation of maximal confidence, and (3) a dataflow processor architecture optimized for computing deep convolutional networks. These three contributions were produced with the goal of improving the state of the art in semantic image analysis, with an emphasis on computational efficiency. Scene parsing consists of labeling each pixel of an image with the category of the object it belongs to. In the first part of this thesis, I propose a method that uses a deep convolutional network, trained directly on pixels, to extract feature vectors that encode regions of multiple resolutions centered on each pixel. This method avoids the use of hand-crafted features. Because these features are multi-scale, they let the model capture relationships both local and global to the scene. In parallel, a tree of segmentation components is computed from the pixel dissimilarity graph. The feature vectors associated with each node of the tree are aggregated and used to train an estimator of the distribution of object categories present in that segment. A subset of the tree's nodes, covering the image, is then selected so as to maximize the average purity of the class distributions. By maximizing this purity, the probability that each component contains a single object is maximized. The complete system achieves record accuracy on several public benchmarks. Computing deep convolutional networks depends on only a few basic operators, which are particularly well suited to a dedicated hardware implementation. In the second part of this thesis, I present a dedicated dataflow processor architecture optimized for computing convolutional-network-based vision systems, neuFlow, together with a compiler, luaFlow, whose role is to compile a high-level (graph-like) description of a convolutional network into a flow of data and computations that is optimal for the architecture. This system was developed to perform real-time detection, categorization, and localization of objects in complex scenes, while consuming only 10 Watts, on a standard FPGA implementation.
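The neuFlow hardware and luaFlow toolchain are described in the thesis itself, but the multi-scale feature idea of the first contribution can be sketched compactly. Below is a minimal illustration in Python, using PyTorch as a stand-in for the original Torch7 code; the layer sizes, scales, and class name are assumptions made for the example, not the thesis's actual model:

```python
# Minimal sketch of multi-scale ConvNet features for scene parsing:
# one shared network is applied to a pyramid of rescaled inputs, and
# each pixel receives a concatenation of local and contextual features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleFeatures(nn.Module):
    def __init__(self, scales=(1.0, 0.5, 0.25), feat_dim=64):
        super().__init__()
        self.scales = scales
        # The same weights are reused at every scale (weight sharing).
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 7, padding=3), nn.Tanh(), nn.MaxPool2d(2),
            nn.Conv2d(16, feat_dim, 7, padding=3), nn.Tanh(),
        )

    def forward(self, img):
        h, w = img.shape[-2:]
        feats = []
        for s in self.scales:
            x = F.interpolate(img, scale_factor=s, mode="bilinear",
                              align_corners=False)
            f = self.net(x)
            # Bring every scale back to a common resolution before stacking.
            feats.append(F.interpolate(f, size=(h, w), mode="bilinear",
                                       align_corners=False))
        return torch.cat(feats, dim=1)   # (N, feat_dim * len(scales), H, W)

x = torch.randn(1, 3, 240, 320)          # a dummy RGB frame
print(MultiScaleFeatures()(x).shape)     # torch.Size([1, 192, 240, 320])
```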
44

Wei, Ching-Jen, and 魏慶仁. "AN EFFICIENT DATAFLOW ARCHITECTURE FOR MACRO ACTOR PROCESSING." Thesis, 1995. http://ndltd.ncl.edu.tw/handle/96458741982320537024.

Full text of the source
APA, Harvard, Vancouver, ISO and other styles
Abstract:
Master's thesis
Tatung Institute of Technology
Graduate Institute of Information Engineering
1994 (ROC year 83)
Dataflow execution, which exploits the parallelism hidden in a program, can achieve speedup efficiently. Instructions fire according to operand availability: an instruction can execute as soon as all of its operands have arrived, without waiting for preceding instructions on which it has no dependency. Although the dataflow execution model can exploit all levels of parallelism, it forgoes the locality advantages of sequential execution. The proposed architecture combines the advantages of locality with the exploitation of parallelism. The processor uses double queues to reduce bubble instructions between the match and execution units, thereby improving utilization; architectural simplicity is another of its features. A software environment for the architecture is built on the high-level dataflow language SISAL and its translator. Following the partitioning criteria, the dataflow graph (DFG) is partitioned and allocated to the processing elements. Simulation results under the Scientific and Engineering Software (SES) environment show that this is an efficient processor.
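The operand-availability firing rule described above can be made concrete with a small toy model. In the sketch below (the queue names and instruction format are invented for illustration, not taken from the thesis), arriving tokens fill operand slots in a match stage, and fully matched instructions buffer in a second queue so the execute stage is not starved:

```python
from collections import deque

# Tokens carry (destination instruction, value). The match stage fills
# operand slots; a fully matched instruction moves to the execute queue.
tokens = deque([("add1", 3), ("mul1", 4), ("add1", 5)])
exec_queue = deque()
needed = {"add1": 2, "mul1": 2}
slots = {k: [] for k in needed}

while tokens:
    dest, val = tokens.popleft()
    slots[dest].append(val)
    if len(slots[dest]) == needed[dest]:   # firing rule: all operands present
        exec_queue.append(dest)

while exec_queue:
    name = exec_queue.popleft()
    print(f"fire {name} with operands {slots[name]}")
# Only add1 fires; mul1 still waits on its second operand.
```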
45

"Constraint extension to dataflow network." 2004. http://library.cuhk.edu.hk/record=b5891959.

Full text of the source
APA, Harvard, Vancouver, ISO and other styles
Abstract:
Tsang Wing Yee.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2004.
Includes bibliographical references (leaves 90-93).
Abstracts in English and Chinese.
Chapter 1 --- Introduction --- p.1
Chapter 2 --- Preliminaries --- p.4
Chapter 2.1 --- Constraint Satisfaction Problems --- p.4
Chapter 2.2 --- Dataflow Networks --- p.5
Chapter 2.3 --- The Lucid Programming Language --- p.9
Chapter 2.3.1 --- Daton Domain --- p.10
Chapter 2.3.2 --- Constants --- p.10
Chapter 2.3.3 --- Variables --- p.10
Chapter 2.3.4 --- Dataflow Operators --- p.11
Chapter 2.3.5 --- Functions --- p.16
Chapter 2.3.6 --- Expression and Statement --- p.17
Chapter 2.3.7 --- Examples --- p.17
Chapter 2.3.8 --- Implementation --- p.19
Chapter 3 --- Extended Dataflow Network --- p.25
Chapter 3.1 --- Assertion Arcs --- p.25
Chapter 3.2 --- Selection Operators --- p.27
Chapter 3.2.1 --- The Discrete Choice Operator --- p.27
Chapter 3.2.2 --- The Discrete Committed Choice Operator --- p.29
Chapter 3.2.3 --- The Range Choice Operators --- p.29
Chapter 3.2.4 --- The Range Committed Choice Operators --- p.32
Chapter 3.3 --- Examples --- p.33
Chapter 3.4 --- E-Lucid --- p.39
Chapter 3.4.1 --- Modified Four Cockroaches Problem --- p.42
Chapter 3.4.2 --- Traffic Light Problem --- p.45
Chapter 3.4.3 --- Old Maid Problem --- p.48
Chapter 4 --- Implementation of E-Lucid --- p.54
Chapter 4.1 --- Overview --- p.54
Chapter 4.2 --- Definition of Terms --- p.56
Chapter 4.3 --- Function ELUCIDinterpreter --- p.57
Chapter 4.4 --- Function Edemand --- p.58
Chapter 4.5 --- Function transformD --- p.59
Chapter 4.5.1 --- Labelling Datastreams of Selection Operators --- p.59
Chapter 4.5.2 --- Removing Committed Choice Operators --- p.62
Chapter 4.5.3 --- Removing asa, wvr, and upon --- p.62
Chapter 4.5.4 --- Labelling Output Datastreams of if-then-else-fi --- p.63
Chapter 4.5.5 --- Transforming Statements to Daton Statements --- p.63
Chapter 4.5.6 --- Transforming Daton Expressions Recursively --- p.65
Chapter 4.5.7 --- An Example --- p.65
Chapter 4.6 --- Functions constructCSP, findC, and transformC --- p.68
Chapter 4.7 --- An Example --- p.75
Chapter 4.8 --- Function backtrack --- p.77
Chapter 5 --- Related Works --- p.83
Chapter 6 --- Conclusion --- p.87
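For readers unfamiliar with Lucid, the stream ("daton") operators named in the chapter list above (fby, wvr, asa) behave roughly as follows. This Python-generator reading is an informal approximation of the usual Lucid semantics, not the thesis's own definitions:

```python
from itertools import count, islice

def fby(x, ys):                 # "x fby ys": x first, then the stream ys
    yield x
    yield from ys

def wvr(xs, bs):                # "xs wvr bs": keep xs[i] whenever bs[i] holds
    for x, b in zip(xs, bs):
        if b:
            yield x

def asa(xs, bs):                # "xs asa bs": first xs[i] at which bs[i] holds
    for x, b in zip(xs, bs):
        if b:
            return x

evens = wvr(count(0), (n % 2 == 0 for n in count(0)))
print(list(islice(fby(-1, evens), 5)))            # [-1, 0, 2, 4, 6]
print(asa(count(0), (n > 3 for n in count(0))))   # 4
```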
46

Chung, Hua-Yuan, and 鍾華堃. "VHDL Implementation of Scheduled Dataflow Architecture and the Register Context." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/79120224172051556308.

Full text of the source
APA, Harvard, Vancouver, ISO and other styles
Abstract:
Master's thesis
Fu Jen Catholic University
Department of Computer Science and Information Engineering
2009 (ROC year 98)
Since the invention of the microprocessor around 1970, improving CPU performance through ILP had been the main focus of the computer industry. Around the year 2000, ILP appeared to reach a limit and, together with power consumption and heat dissipation concerns, the multi-core era emerged. The focus has since shifted from ILP to TLP and the efficient use of multi-core processors. However, RAW hazard detection in current computers relies on complex hardware, which makes CPUs consume more energy and designs more complex. In this research we propose a fundamentally different architecture and a different way to resolve RAW hazards: by using the dataflow paradigm, RAW hazards are eliminated naturally. Moreover, the architecture offers a new paradigm that closely links ILP and TLP by combining the sequential and dataflow paradigms. It is named the Scheduled Dataflow Architecture (SDF). SDF is a non-blocking, multithreaded, decoupled dataflow architecture, since its main engine relies on the dataflow paradigm. Being decoupled, its synchronization processor is responsible for data access while its execution processor is responsible for executing all instructions. SDF was previously simulated in C++ and C [19-20]. To model the hardware complexity more precisely, this work implements SDF in VHDL and simulates it with ModelSim; we have also tested it on Altera DE2 hardware. The main goal of this research is to measure the performance gained by having more register contexts. In a multithreaded architecture, data can be passed between threads through frame memory. If we instead use register contexts to pass data efficiently to the following threads that need a previous thread's results, several memory accesses can be eliminated, improving program performance. To test SDF, we also synthesized it onto the Cyclone II FPGA chip of the DE2 board. SDF uses at least 50% of the Cyclone II's resources, and the Cyclone II can synthesize SDF with at most four register sets. From these syntheses we found that SDF requires at least two register sets to run multithreaded programs concurrently.
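The decoupling the abstract describes, with a synchronization processor (SP) staging data into register contexts and an execution processor (EP) computing only on preloaded registers, can be sketched in a few lines. All names and values below are illustrative assumptions, not the thesis's VHDL design:

```python
# Toy model of SDF-style decoupling: the SP moves a thread's data from
# frame memory into a register context, and the EP then executes without
# any memory access, so it never stalls on loads.
frame_memory = {"threadA": {"r0": 7, "r1": 35}}

def sp_preload(thread_id, register_sets, ctx_slot):
    """SP: copy a thread's frame data into a free register context."""
    register_sets[ctx_slot] = dict(frame_memory[thread_id])

def ep_execute(register_sets, ctx_slot):
    """EP: compute using only the preloaded context."""
    regs = register_sets[ctx_slot]
    regs["r2"] = regs["r0"] + regs["r1"]
    return regs["r2"]

# Two contexts: the minimum the thesis found necessary to run
# multithreaded programs concurrently.
register_sets = [None, None]
sp_preload("threadA", register_sets, 0)
print(ep_execute(register_sets, 0))   # 42
```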
47

CAI, FENG-ZHOU, and 蔡豐洲. "A pipeline bubbles reduction technique for the Monsoon dataflow architecture." Thesis, 1993. http://ndltd.ncl.edu.tw/handle/32228626885367156032.

Full text of the source
APA, Harvard, Vancouver, ISO and other styles
48

Varadarajan, Keshavan. "A Coarse Grained Reconfigurable Architecture Framework Supporting Macro-Dataflow Execution." Thesis, 2012. http://etd.iisc.ernet.in/handle/2005/2302.

Full text of the source
APA, Harvard, Vancouver, ISO and other styles
Abstract:
A Coarse-Grained Reconfigurable Architecture (CGRA) is a processing platform constituted by an interconnection of coarse-grained computation units (viz. Function Units (FUs), Arithmetic Logic Units (ALUs)). These units communicate directly, via send-receive-like primitives, as opposed to the shared-memory based communication used in multi-core processors. CGRAs are a well-researched topic and the design space of a CGRA is quite large. The design space can be represented as a 7-tuple (C, N, T, P, O, M, H) where the terms have the following meanings: C - choice of computation unit, N - choice of interconnection network, T - choice of number of context frames (single or multiple), P - presence of partial reconfiguration, O - choice of orchestration mechanism, M - design of the memory hierarchy, and H - host-CGRA coupling. In this thesis, we develop an architectural framework for a macro-dataflow based CGRA in which we make the following choice for each of these parameters: C - ALU, N - Network-on-Chip (NoC), T - multiple contexts, P - support for partial reconfiguration, O - macro-dataflow based orchestration, M - data memory banks placed at the periphery of the reconfigurable fabric ("reconfigurable fabric" is the name given to the interconnection of computation units), and H - loose coupling between the host processor and the CGRA, enabling our CGRA to execute an application independently of the host processor's intervention. The motivations for developing such a CGRA are: (1) to execute applications efficiently, through a reduction in reconfiguration time (i.e., the time needed to transfer instructions and data to the reconfigurable fabric) and a reduction in execution time through better exploitation of all forms of parallelism: Instruction Level Parallelism (ILP), Data Level Parallelism (DLP), and Thread/Task Level Parallelism (TLP); we choose a macro-dataflow based orchestration framework in combination with partial reconfiguration so as to ease the exploitation of TLP and DLP, with macro-dataflow serving as a lightweight synchronization mechanism; we experiment with two variants of the macro-dataflow orchestration unit, namely a hardware-controlled orchestration unit and a compiler-controlled orchestration unit, and we employ a NoC as it helps reduce the reconfiguration overhead; (2) to permit customization of the CGRA for a particular domain through the use of domain-specific custom Intellectual Property (IP) blocks, which improves both application performance and energy efficiency; and (3) to develop a CGRA that is completely programmable and accepts any program written to the C89 standard; the compiler and the architecture were co-developed to ensure that every feature of the architecture could be programmed automatically from an application by the compiler. In this CGRA framework, the orchestration mechanism (O) and the host-CGRA coupling (H) are kept fixed, and we permit design-space exploration of the other terms in the 7-tuple; the mode of compilation and execution remains invariant under these changes, hence the term "framework". We now elucidate the compilation and execution flow for this CGRA framework. An application written in the C language is compiled and transformed into a set of temporal partitions, referred to as HyperOps in this thesis. The macro-dataflow orchestration unit selects a HyperOp for execution when all its inputs are available, and the instructions and operands of a ready HyperOp are transferred to the reconfigurable fabric for execution.
Each ALU (in the computation unit) is capable of waiting for the availability of its input data prior to issuing instructions. We permit the launch and execution of a temporal partition to progress in parallel, which reduces the reconfiguration overhead, and we further cut launch delays by keeping loops persistent on the fabric, eliminating the need to relaunch their instructions. The CGRA framework has been implemented in Bluespec System Verilog. We evaluate the performance of two CGRA instances: one for cryptographic applications and another for linear algebra kernels. We also run other general-purpose integer and floating-point applications to demonstrate the generic nature of these optimizations. We explore various microarchitectural optimizations, viz. pipeline optimizations (i.e., changing the value of T), different forms of macro-dataflow orchestration (a hardware-controlled versus a compiler-controlled orchestration unit), different execution modes including resident loops and pipeline parallelism, changes to the router, etc. As a result of these optimizations we observe a 2.5x improvement in performance compared to the base version. The reconfiguration overhead was hidden by overlapping the launching of instructions with execution. The perceived reconfiguration overhead is reduced drastically, to about 9-11 cycles per HyperOp, invariant of the size of the HyperOp; this can be attributed mainly to data-dependent instruction execution and the use of the NoC. The overhead of the macro-dataflow execution unit is reduced to a minimum with the compiler-controlled orchestration unit. To benchmark these CGRA instances, we compare their performance against an Intel Core 2 Quad running at 2.66 GHz. On the cryptographic CGRA instance, running at 700 MHz, we observe one to two orders of magnitude improvement in performance for cryptographic applications, and up to one order of magnitude performance degradation on the linear algebra CGRA instance. The relatively poor performance of the linear algebra kernels can be attributed to the inability to exploit ILP across computation units interconnected by the NoC, the long latency of accessing data memory placed at the periphery of the reconfigurable fabric, and the unavailability of pipelined floating-point units (which are critical to the performance of linear algebra kernels). The superior performance of the cryptographic kernels can be attributed to a higher computation-to-load-instruction ratio, a careful choice of custom IP blocks, the ability to construct large HyperOps, which allows a greater portion of the communication to be performed directly (as opposed to through a register file in a general-purpose processor), and the use of the resident-loops execution mode. The power consumption of a computation unit employed in the cryptography CGRA instance, along with its router, is about 76 mW, as estimated by Synopsys Design Vision using the Faraday 90 nm technology library at an activity factor of 0.5; the power of other instances depends on the specific instantiation of the domain-specific units. This implies that for a reconfigurable fabric of size 5 x 6 the total power consumption is about 2.3 W. The area and power (about 84 mW) dissipated by the macro-dataflow orchestration unit, which is common to both instances, are comparable to those of a single computation unit, making it an effective, low-overhead technique for exploiting TLP.
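The macro-dataflow orchestration rule, in which a HyperOp is launched only once all of its inputs have arrived, can be illustrated with a short sketch. The dependence graph and function names below are invented for the example, not taken from the thesis:

```python
# Toy macro-dataflow orchestrator: the firing rule is applied at HyperOp
# granularity, and a completed HyperOp forwards results to its consumers.
class HyperOp:
    def __init__(self, name, n_inputs):
        self.name, self.n_inputs, self.arrived = name, n_inputs, 0
        self.consumers = []        # HyperOps fed by this one's output

def send_input(h, ready_list):
    h.arrived += 1
    if h.arrived == h.n_inputs:    # all inputs available -> ready to launch
        ready_list.append(h)

a, b, c = HyperOp("A", 1), HyperOp("B", 1), HyperOp("C", 2)
a.consumers, b.consumers = [c], [c]

ready = []
send_input(a, ready); send_input(b, ready)
while ready:
    h = ready.pop()
    print(f"launch {h.name} on fabric")  # transfer instructions + operands
    for consumer in h.consumers:
        send_input(consumer, ready)      # completion forwards results
```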
49

Lee, Yen-Lin, and 李彥霖. "Study on Reconfigurable System-On-Chip Architecture Based on Dataflow Computing." Thesis, 2001. http://ndltd.ncl.edu.tw/handle/45120514829399151226.

Full text of the source
APA, Harvard, Vancouver, ISO and other styles
Abstract:
Master's thesis
National Chiao Tung University
Department of Electrical and Control Engineering
2000 (ROC year 89)
Nowadays, the industry of information appliances and communication products is growing rapidly, and System-On-Chip (SOC) design has become the key enabler of this explosive growth. However, state-of-the-art SOC design is largely IP-based and poses several challenges to designers, especially in the transformation of algorithms and the integration of IP cores. To alleviate the difficulty of SOC design, this thesis proposes a novel dataflow architecture for the integration of IP cores. The proposed architecture employs Petri nets to dynamically schedule the operations of DSP algorithms onto processing elements and thus performs dataflow computing efficiently. As a result, the Petri-net-based scheduler is insensitive to the timing of computation and interprocessor communication, and the proposed SOC architecture is reconfigurable for DSP applications.
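The Petri-net scheduling idea can be illustrated with a minimal token-firing loop. The net below (its places, transitions, and labels) is an invented example in the spirit of the abstract, not the thesis's actual schedule:

```python
# A transition (a DSP operation bound to a processing element) fires as
# soon as all its input places hold tokens, so the schedule falls out of
# token availability alone, independent of computation/communication timing.
marking = {"in1": 1, "in2": 1, "tmp": 0, "out": 0}
transitions = {
    "mac":  {"consume": ["in1", "in2"], "produce": ["tmp"]},
    "post": {"consume": ["tmp"],        "produce": ["out"]},
}

fired = True
while fired:
    fired = False
    for name, t in transitions.items():
        if all(marking[p] > 0 for p in t["consume"]):   # transition enabled?
            for p in t["consume"]:
                marking[p] -= 1
            for p in t["produce"]:
                marking[p] += 1
            print(f"fired {name}; marking={marking}")
            fired = True
# "mac" fires first, which enables "post": ordering emerges from the tokens.
```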
50

Alle, Mythri. "Compiling For Coarse-Grained Reconfigurable Architectures Based On Dataflow Execution Paradigm." Thesis, 2012. http://etd.iisc.ernet.in/handle/2005/2453.

Full text of the source
APA, Harvard, Vancouver, ISO and other styles
Abstract:
Coarse-Grained Reconfigurable Architectures (CGRAs) can be employed to accelerate computational workloads that demand both flexibility and performance. A CGRA comprises a set of computation elements interconnected by a network; this interconnection of computation elements is referred to as the reconfigurable fabric. The size of application that can be accommodated on the reconfigurable fabric is limited by the size of the instruction buffers associated with each compute element. When an application cannot be accommodated entirely, it is partitioned such that each partition can be executed on the reconfigurable fabric. These partitions are scheduled by an orchestrator that employs the dynamic dataflow execution paradigm, which has inherent support for synchronization and helps exploit the parallelism that exists across application partitions. In this thesis, we present a compiler that targets such CGRAs. The compiler is capable of accepting applications specified in the C89 standard. To enable architectural design-space exploration, the compiler is designed so that it can be customized, through configuration parameters, for several instances of CGRAs employing the dataflow execution paradigm at the orchestrator. The focus of this thesis is to provide efficient support for various kinds of parallelism while ensuring correctness. The compiler supports fine-grained task-level parallelism that exists across iterations of loops and function calls; additionally, it supports pipeline parallelism, where a loop is split into multiple stages that execute in a pipelined manner. The prototype compiler, which targets multiple instances of a CGRA, is demonstrated in this thesis. We used it to target multiple variants of CGRAs employing the dataflow execution paradigm, varying the reconfigurable fabric, the orchestration mechanism employed, and the size of the instruction buffers. We chose applications from two different domains, viz. cryptography and linear algebra. The execution time of the CGRA (the best among all instances) is compared against an Intel quad-core processor. Cryptography applications show a performance improvement ranging from more than one order of magnitude to close to two orders of magnitude. These applications have large amounts of ILP, and our compiler successfully exposes the ILP available in them; domain customization also played an important role in achieving good performance. We employed two custom functional units to accelerate the cryptography applications, and the compiler was able to use them efficiently. In the linear algebra kernels we observe multiple iterations of a loop executing in parallel, effectively exploiting loop-level parallelism at runtime. In spite of this, we notice close to an order of magnitude performance degradation, which can be attributed to the use of non-pipelined floating-point units and the delays involved in accessing memory. Pipeline parallelism was demonstrated using this compiler for FFT and QR factorization. Thus, the compiler is capable of efficiently supporting different kinds of parallelism, supports the complete C89 standard, and can target different instances of CGRAs employing the dataflow execution paradigm.
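The buffer-size constraint that drives partitioning in this compiler can be sketched very simply. The greedy cut over a topologically ordered instruction schedule and the BUFFER_SIZE constant below are illustrative assumptions, not the compiler's actual partitioning algorithm:

```python
# When the application's dataflow graph exceeds the per-element instruction
# buffers, it is split into partitions (HyperOps) that each fit; the
# dataflow orchestrator then schedules those partitions.
BUFFER_SIZE = 4

def partition(schedule):       # schedule: instructions in topological order
    parts, current = [], []
    for instr in schedule:
        if len(current) == BUFFER_SIZE:   # buffer full -> start a new HyperOp
            parts.append(current)
            current = []
        current.append(instr)
    if current:
        parts.append(current)
    return parts

instrs = [f"i{k}" for k in range(10)]
for n, p in enumerate(partition(instrs)):
    print(f"HyperOp {n}: {p}")
```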

To the bibliography