
Dissertations / Theses on the topic 'Message-Passing Interface (MPI)'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the top 23 dissertations / theses for your research on the topic 'Message-Passing Interface (MPI).'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Träff, Jesper. "Aspects of the efficient implementation of the message passing interface (MPI)." Aachen: Shaker, 2009. http://d-nb.info/994501803/04.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Katti, Amogh. "Epidemic failure detection and consensus for message passing interface (MPI)." Thesis, University of Reading, 2016. http://centaur.reading.ac.uk/69932/.

3

Rattanapoka, Choopan. "P2P-MPI : A fault-tolerant Message Passing Interface Implementation for Grids." Phd thesis, Université Louis Pasteur - Strasbourg I, 2008. http://tel.archives-ouvertes.fr/tel-00724132.

Abstract:
This thesis demonstrates the feasibility of a middleware for grid computing that takes into account the dynamic nature of this type of platform and the requirements of message-passing parallel programs. To this end, we argue for an architecture that is as distributed as possible: we adopt a peer-to-peer infrastructure for organising resources, which in particular eases resource discovery, and we rely on distributed failure detectors to handle fault tolerance. The dynamicity of this kind of environment is also a problem for the execution model underlying MPI, since the failure of a single process brings the whole application to a halt. P2P-MPI's contribution in this area is fault tolerance through replication. We believe replication is the best fit for a peer-to-peer architecture, as the classic techniques based on checkpoint-and-restart require one or more backup servers. Moreover, replication is completely transparent to the user, in line with the goal of ease of use we set for ourselves. We believe that keeping the environment very simple to use, and fully controllable by a single user, is one of the factors that can increase the number of resources available on the grid. Finally, the major contribution of P2P-MPI is the proposed communication library, an implementation of MPJ (MPI adapted to Java) that integrates process replication. This particular aspect of our work argues for close collaboration between the middleware, which knows the state of the grid (failure detection, for example), and the communication layer, which can adapt its behaviour accordingly.
4

Ramesh, Srinivasan. "MPI Performance Engineering with the MPI Tools Information Interface." Thesis, University of Oregon, 2018. http://hdl.handle.net/1794/23779.

Abstract:
The desire for high performance on scalable parallel systems is increasing the complexity of MPI implementations and the need to tune them. The MPI Tools Information Interface (MPI_T), introduced in the MPI 3.0 standard, provides an opportunity for performance tools and external software to introspect and understand MPI runtime behavior at a deeper level and detect scalability issues. The interface also provides a mechanism to fine-tune the performance of the MPI library dynamically at runtime. This thesis describes the motivation, design, and challenges involved in developing an MPI performance engineering infrastructure using MPI_T for two performance toolkits: the TAU Performance System and Caliper. I validate the design of the infrastructure for TAU by developing optimizations for production and synthetic applications, and I show that the MPI_T runtime introspection mechanism in Caliper enables a meaningful analysis of performance data. This thesis includes previously published co-authored material.
5

Poole, Jeffrey Hyatt. "Implementation of a Hardware-Optimized MPI Library for the SCMP Multiprocessor." Thesis, Virginia Tech, 2001. http://hdl.handle.net/10919/10064.

Abstract:
As time progresses, computer architects continue to create faster and more complex microprocessors using techniques such as out-of-order execution, branch prediction, dynamic scheduling, and predication. While these techniques enable greater performance, they also increase the complexity and silicon area of the design, lengthening development and testing times. The shrinking feature sizes associated with newer technology increase wire resistance and signal propagation delays, further complicating large designs. One potential solution is the Single-Chip Message-Passing (SCMP) Parallel Computer, developed at Virginia Tech. SCMP uses an architecture in which a number of simple processors are tiled across a single chip and connected by a fast interconnection network. The system is designed to take advantage of thread-level parallelism and to keep wire traces short in preparation for even smaller integrated circuit feature sizes. This thesis presents the implementation of the MPI (Message-Passing Interface) communications library on top of SCMP's hardware communication support, with emphasis on the specific needs of this system with regard to MPI. For example, MPI is designed to operate between heterogeneous systems; in the SCMP environment such support is unnecessary and wastes resources. The SCMP network is also designed so that messages can be sent with very low latency, but with cooperative multitasking it is difficult to assure a timely response to messages. Finally, the low-level network primitives have no support for send operations that occur before the receiver is prepared, functionality that is necessary for MPI support.
Master of Science
6

Strand, Christian. "A Java Founded LOIS-framework and the Message Passing Interface? : An Exploratory Case Study." Thesis, Växjö University, School of Mathematics and Systems Engineering, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:vxu:diva-916.

Abstract:

In this thesis project we have successfully added an MPI extension layer to the LOIS framework. The framework defines an infrastructure for executing and connecting continuous stream-processing applications. The MPI extension delivers the same volume of stream-based data as the framework's original transport. We assert that an MPI-2 compatible implementation is a candidate for extending the given framework with an adaptive and flexible communication subsystem. Adaptability is required since the communication subsystem has to be resilient to changes, whether due to optimizations or to system requirements.

7

Träff, Jesper Larsson [Verfasser]. "Aspects of the efficient Implementation of the Message Passing Interface (MPI) / Jesper Larsson Träff." Aachen : Shaker, 2009. http://d-nb.info/115651794X/34.

8

Holmes, Daniel John. "McMPI : a managed-code message passing interface library for high performance communication in C#." Thesis, University of Edinburgh, 2012. http://hdl.handle.net/1842/7732.

Abstract:
This work endeavours to achieve technology transfer between established best-practice in academic high-performance computing and current techniques in commercial high-productivity computing. It shows that a credible high-performance message-passing communication library, with semantics and syntax following the Message-Passing Interface (MPI) Standard, can be built in pure C# (one of the .Net suite of computer languages). Message-passing has been the dominant paradigm in high-performance parallel programming of distributed-memory computer architectures for three decades. The MPI Standard originally distilled architecture-independent and language-agnostic ideas from existing specialised communication libraries and has since been enhanced and extended. Object-oriented languages can increase programmer productivity, for example by allowing complexity to be managed through encapsulation. Both the C# computer language and the .Net common language runtime (CLR) were originally developed by Microsoft Corporation but have since been standardised by the European Computer Manufacturers Association (ECMA) and the International Standards Organisation (ISO), which facilitates portability of source-code and compiled binary programs to a variety of operating systems and hardware. Combining these two open and mature technologies enables mainstream programmers to write tightly-coupled parallel programs in a popular standardised object-oriented language that is portable to most modern operating systems and hardware architectures. This work also establishes that a thread-to-thread delivery option increases shared-memory communication performance between MPI ranks on the same node. This suggests that the thread-as-rank threading model should be explicitly specified in future versions of the MPI Standard and then added to existing MPI libraries for use by thread-safe parallel codes. 
This work also ascertains that the C# socket object suffers from characteristics that are detrimental to communication performance, and proposes ways of improving the implementation of this object.
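The thread-to-thread delivery idea described in this abstract can be illustrated outside MPI. The toy sketch below (plain Python with `queue.Queue`; the names `ThreadRank` and `ping_pong` are hypothetical, not McMPI's API) treats each thread as an MPI-style rank and delivers messages directly into the peer thread's mailbox, rather than routing through a network stack, which is the essence of the shared-memory fast path the thesis measures.

```python
import threading
import queue

class ThreadRank:
    """A toy 'thread-as-rank' endpoint: each rank owns a mailbox,
    and a send is a direct enqueue into the peer's mailbox."""
    def __init__(self, rank, mailboxes):
        self.rank = rank
        self.mailboxes = mailboxes  # shared list of queues, one per rank

    def send(self, dest, payload):
        # Thread-to-thread delivery: no serialization, no socket hop.
        self.mailboxes[dest].put((self.rank, payload))

    def recv(self):
        return self.mailboxes[self.rank].get(timeout=5)

def ping_pong(n_ranks=2):
    mailboxes = [queue.Queue() for _ in range(n_ranks)]
    ranks = [ThreadRank(r, mailboxes) for r in range(n_ranks)]
    result = {}

    def worker0():
        ranks[0].send(1, "ping")
        src, msg = ranks[0].recv()
        result["rank0_got"] = (src, msg)

    def worker1():
        src, msg = ranks[1].recv()
        ranks[1].send(0, msg.replace("ping", "pong"))

    threads = [threading.Thread(target=worker0), threading.Thread(target=worker1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result

print(ping_pong())  # {'rank0_got': (1, 'pong')}
```

The design point the sketch makes is that when sender and receiver share an address space, delivery can be a single enqueue; a socket-based path would add copies and kernel transitions for the same exchange.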
9

Chen, Zhezhe. "System Support for Improving the Reliability of MPI Applications and Libraries." The Ohio State University, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=osu1375880144.

10

Radcliffe, Nicholas Ryan. "Adjusting Process Count on Demand for Petascale Global Optimization." Thesis, Virginia Tech, 2011. http://hdl.handle.net/10919/36349.

Abstract:
There are many challenges that need to be met before efficient and reliable computation at the petascale is possible. Many scientific and engineering codes running at the petascale are likely to be memory intensive, which makes thrashing a serious problem for many petascale applications. One way to overcome this challenge is to use a dynamic number of processes, so that the total amount of memory available for the computation can be increased on demand. This thesis describes modifications made to the massively parallel global optimization code pVTdirect in order to allow for a dynamic number of processes. In particular, the modified version of the code monitors memory use and spawns new processes if the amount of available memory is determined to be insufficient. The primary design challenges are discussed, and performance results are presented and analyzed.
Master of Science
11

Fernandes, Cláudio Antônio Costa. "Estudos de algumas ferramentas de coleta e visualização de dados e desempenho de aplicações paralelas no ambiente MPI." Universidade Federal do Rio Grande do Norte, 2003. http://repositorio.ufrn.br:8080/jspui/handle/123456789/15428.

Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Recent years have seen growing acceptance and adoption of parallel processing, both for high-performance scientific computing and for general-purpose applications. This acceptance has been favoured mainly by the development of massively parallel processing (MPP) environments and of distributed computing. A point in common between distributed systems and MPP architectures is the notion of message passing, which allows communication between processes. A message-passing environment consists basically of a communication library which, acting as an extension of programming languages such as C, C++ and Fortran, allows parallel applications to be written. In the development of parallel applications, a fundamental aspect is their performance analysis. Several metrics can be used in this analysis: execution time, efficiency in the use of the processing elements, and scalability of the application with respect to the number of processors or to the size of the problem instance. Establishing models or mechanisms that support this analysis can be a rather complicated task, given the parameters and degrees of freedom involved in implementing a parallel application. One alternative has been the use of tools for collecting and visualising performance data, which allow the user to identify bottlenecks and sources of inefficiency in an application. Efficient visualisation requires identifying and collecting data about the execution of the application, a stage called instrumentation.
This work first presents a study of the main techniques used to collect performance data, and then a detailed analysis of the main available tools that can be used on Beowulf-cluster architectures running Linux on the x86 platform with MPI (Message Passing Interface) communication libraries such as LAM and MPICH. The analysis is validated on parallel applications that train perceptron neural networks using back-propagation. The conclusions show the capabilities and conveniences of the analysed tools.
12

Seth, Umesh Kumar. "Message Passing Interface parallelization of a multi-block structured numerical solver. Application to the numerical simulation of various typical Electro-Hydro-Dynamic flows." Thesis, Poitiers, 2019. http://www.theses.fr/2019POIT2264/document.

Abstract:
Several intricately coupled applications of modern industries fall under the multi-disciplinary domain of Electrohydrodynamics (EHD), where the interactions among charged and neutral particles are studied in the context of both fluid dynamics and electrostatics together. The charged particles in fluids are generated by various physical mechanisms, and they move under the influence of the external electric field and the fluid velocity. Generally, with sufficient electric force magnitudes, momentum is also transferred from the charged species to the neutral particles. This coupled system is solved with the Maxwell equations, charge transport equations and Navier-Stokes equations simulated sequentially in a common time loop. The charge transport is solved considering convection, diffusion, source terms and other relevant mechanisms for the species. Then, the bulk fluid motion is simulated considering the induced electric force as a source term in the Navier-Stokes equations, thus coupling the electrostatic system with the fluid. In this thesis, we numerically investigated some EHD phenomena like unipolar injection, the conduction phenomenon in weakly conducting liquids and flow control with dielectric barrier discharge (DBD) plasma actuators. Solving such complex physical systems numerically requires high-end computing resources and parallel CFD solvers, as these large EHD models are mathematically stiff and highly time-consuming due to the range of time and length scales involved. This thesis contributes towards advancing the capability of numerical simulations carried out within the EFD group at Institut Pprime by developing a high-performance parallel solver with advanced EHD models. Being the most popular technology developed specifically for distributed-memory platforms, the Message Passing Interface (MPI) was used to parallelize our multi-block structured EHD solver.
In the first part the parallelization of our numerical EHD solver with advanced MPI protocols such as Cartesian topology and Inter-Communicators is undertaken. In particular a specific strategy has been designed and detailed to account for the multi-block structured grids feature of the code. The parallel code has been fully validated through several benchmarks, and scalability tests carried out on up to 1200 cores on our local cluster showed excellent parallel speed-ups with our approach. A trustworthy database containing all these validation tests carried out on multiple cores is provided to assist in future developments. The second part of this thesis deals with the numerical simulations of several typical EHD flows. We have examined three-dimensional electroconvection induced by unipolar injection between two planar-parallel electrodes. Unsteady hexagonal cells were observed in our study. 3D flow phenomenon with electro-convective plumes was also studied in the blade-plane electrode configuration considering both autonomous and non-autonomous injection laws. Conduction mechanism based on the dissociation of neutral molecules of a weakly conductive liquid has been successfully simulated. Our results have been validated with some numerical computations undertaken with the commercial code Comsol. Physical implications of Robin boundary condition and Onsager effect on the charge species were highlighted in electro-conduction in a rectangular channel. Finally, flow control using Dielectric Barrier Discharge plasma actuator has been simulated using the Suzen-Huang model. Impacts of dielectric thickness, gap between the electrodes, frequency and waveform of applied voltage etc. were investigated in terms of their effect on the induced maximum ionic wind velocity and average body force. 
Flow control simulations with a backward-facing step showed that a laminar flow separation could be drastically controlled by placing the actuator at the tip of the step with the two electrodes perpendicular to each other.
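The Cartesian-topology bookkeeping this abstract mentions can be illustrated without an MPI installation. The sketch below (plain Python; the helper names are hypothetical, not the thesis's code) reproduces the row-major rank-to-coordinates convention that `MPI_Cart_create` uses by default on a non-periodic 2D process grid, and derives each rank's left/right/up/down neighbours, which is exactly the information a multi-block structured solver needs to set up halo exchanges.

```python
def cart_coords(rank, dims):
    """Row-major rank -> (row, col), matching MPI's default ordering."""
    rows, cols = dims
    return divmod(rank, cols)

def cart_rank(coords, dims):
    """Inverse mapping; None stands in for MPI_PROC_NULL off a non-periodic grid."""
    rows, cols = dims
    r, c = coords
    if 0 <= r < rows and 0 <= c < cols:
        return r * cols + c
    return None

def cart_shift(rank, dims):
    """Neighbour ranks in each direction, None where the grid ends."""
    r, c = cart_coords(rank, dims)
    return {
        "left":  cart_rank((r, c - 1), dims),
        "right": cart_rank((r, c + 1), dims),
        "up":    cart_rank((r - 1, c), dims),
        "down":  cart_rank((r + 1, c), dims),
    }

# Rank 4 on a 3x3 process grid sits in the centre and has all four neighbours.
print(cart_shift(4, (3, 3)))  # {'left': 3, 'right': 5, 'up': 1, 'down': 7}
# A corner rank has None (MPI_PROC_NULL) on its outward-facing sides.
print(cart_shift(0, (3, 3)))
```

In an actual MPI code, sends to a `None`/`MPI_PROC_NULL` neighbour are no-ops, which lets the halo-exchange loop treat interior and boundary ranks uniformly.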
13

Čižek, Martin. "Paralelizace sledování paprsku." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2009. http://www.nusl.cz/ntk/nusl-235485.

Abstract:
Ray tracing is a widely used technique for realistic rendering of computer scenes. Its major drawback is the time needed to compute the image, so it is usually parallelized. This thesis describes parallelization and ray tracing in general, explains how ray tracing can be parallelized, and identifies the problems that may occur during the process. The result is a parallel rendering application that uses selected ray-tracing software, together with a measurement of how successful this application is.
14

Aji, Ashwin M. "Programming High-Performance Clusters with Heterogeneous Computing Devices." Diss., Virginia Tech, 2015. http://hdl.handle.net/10919/52366.

Abstract:
Today's high-performance computing (HPC) clusters are seeing an increase in the adoption of accelerators like GPUs, FPGAs and co-processors, leading to heterogeneity in the computation and memory subsystems. To program such systems, application developers typically employ a hybrid programming model of MPI across the compute nodes in the cluster and an accelerator-specific library (e.g., CUDA, OpenCL, OpenMP, OpenACC) across the accelerator devices within each compute node. Such explicit management of disjoint computation and memory resources leads to reduced productivity and performance. This dissertation focuses on designing, implementing and evaluating a runtime system for HPC clusters with heterogeneous computing devices. This work also explores extending existing programming models to make use of our runtime system for easier code modernization of existing applications. Specifically, we present MPI-ACC, an extension to the popular MPI programming model and runtime system for efficient data movement and automatic task mapping across the CPUs and accelerators within a cluster, and discuss the lessons learned. MPI-ACC's task-mapping runtime subsystem performs fast and automatic device selection for a given task. MPI-ACC's data-movement subsystem includes careful optimizations for end-to-end communication among CPUs and accelerators, which are seamlessly leveraged by the application developers. MPI-ACC provides a familiar, flexible and natural interface for programmers to choose the right computation or communication targets, while its runtime system achieves efficient cluster utilization.
Ph. D.
15

Glück, Olivier. "Optimisations de la bibliothèque de communication MPI pour machines parallèles de type " grappe de PCs " sur une primitive d'écriture distante." Paris 6, 2002. http://www.theses.fr/2002PA066158.

16

Abdelkafi, Omar. "Métaheuristiques hybrides distribuées et massivement parallèles." Thesis, Mulhouse, 2016. http://www.theses.fr/2016MULH9578/document.

Abstract:
Many optimization problems specific to different industrial and academic sectors (energy, chemicals, transportation, etc.) require increasingly effective solution methods. To meet these needs, the aim of this thesis is to develop a library of distributed and massively parallel hybrid metaheuristics. First, we studied the traveling salesman problem and its resolution by the ant colony method in order to establish hybridization and parallelization techniques. Two other optimization problems were then addressed, namely the quadratic assignment problem (QAP) and the zeolite structure problem (ZSP). For the QAP, several variants based on an iterative tabu search with adaptive diversification are proposed. The aim of these proposals is to study the impact of data exchange, diversification strategies and methods of cooperation. Our best variant is compared with six of the leading works in the literature. For the ZSP, two new formulations of the objective function are proposed to evaluate the potential of the zeolite structures found. These formulations are based on reward and penalty evaluation. Two hybrid and parallel genetic algorithms are proposed to generate stable zeolite structures. Our algorithms have so far generated six stable topologies, three of which are not listed on the SC-IZA website or in the Atlas of Prospective Zeolite Structures.
17

Bedez, Mathieu. "Modélisation multi-échelles et calculs parallèles appliqués à la simulation de l'activité neuronale." Thesis, Mulhouse, 2015. http://www.theses.fr/2015MULH9738/document.

Abstract:
Computational neuroscience has produced mathematical and computational tools for building, and then simulating, models that represent the behaviour of certain components of our brain at the cellular level. These models are useful for understanding the physical and biochemical interactions between neurons, rather than for faithfully reproducing cognitive functions as in work on artificial intelligence. Building models that describe the brain as a whole, by homogenising microscopic data, is more recent, because the geometric complexity of the structures making up the brain must be taken into account; a long reconstruction effort is therefore needed before simulations can be run. From a mathematical point of view, the various models are described using systems of ordinary differential equations and partial differential equations. The major problem with these simulations is that the solution time can become very large when high accuracy is required at both the temporal and spatial scales. The purpose of this study is to investigate the various models describing the electrical activity of the brain, using innovative techniques for parallelising the computations, thereby saving time while obtaining highly accurate results. Four major themes address this problem: description of the models, presentation of the parallelisation tools, and applications to two macroscopic models.
APA, Harvard, Vancouver, ISO, and other styles
18

Zhang, Hua. "VCLUSTER: A PORTABLE VIRTUAL COMPUTING LIBRARY FOR CLUSTER COMPUTING." Doctoral diss., Orlando, Fla. : University of Central Florida, 2008. http://purl.fcla.edu/fcla/etd/CFE0002339.

Full text
APA, Harvard, Vancouver, ISO, and other styles
19

Zhang, Hang. "Distributed Support Vector Machine With Graphics Processing Units." ScholarWorks@UNO, 2009. http://scholarworks.uno.edu/td/991.

Full text
Abstract:
Training a Support Vector Machine (SVM) requires the solution of a very large quadratic programming (QP) optimization problem. Sequential Minimal Optimization (SMO) is a decomposition-based algorithm which breaks this large QP problem into a series of smallest-possible QP problems; however, it still costs O(n²) computation time. In our SVM implementation, we can train on huge data sets in a distributed manner: the dataset is broken into chunks, and the Message Passing Interface (MPI) is used to distribute each chunk to a different machine, which then runs SVM training on its chunk. In addition, we moved the kernel-calculation part of SVM classification to a graphics processing unit (GPU), which has zero scheduling overhead for creating concurrent threads. In this thesis, we take advantage of this GPU architecture to improve the classification performance of SVM.
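The chunked distribution described in this abstract can be sketched without any MPI dependency. The helper below (`chunk_for_rank` is a hypothetical name, not from the thesis) mirrors the block partitioning that scattering the training set over MPI ranks would perform, with one near-equal chunk per rank:

```python
# Sketch of the chunked data distribution: the dataset is split into
# near-equal blocks, one per MPI rank; each rank would then run SMO-based
# SVM training on its own block. Plain Python stands in for the MPI calls.

def chunk_for_rank(data, rank, size):
    """Return the slice of `data` that rank `rank` of `size` ranks receives."""
    n = len(data)
    base, extra = divmod(n, size)          # spread the remainder over low ranks
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return data[start:stop]

if __name__ == "__main__":
    dataset = list(range(10))              # toy stand-in for training samples
    size = 4                               # pretend world size (mpirun -np 4)
    chunks = [chunk_for_rank(dataset, r, size) for r in range(size)]
    # Every sample lands in exactly one chunk, so the distributed trainers
    # collectively cover the whole training set.
    assert sum(chunks, []) == dataset
    print(chunks)  # → [[0, 1, 2], [3, 4, 5], [6, 7], [8, 9]]
```

In a real run, each rank would apply the same index arithmetic to receive its block (or an `MPI_Scatterv` would deliver it) before training locally.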
APA, Harvard, Vancouver, ISO, and other styles
20

Chu, Chia-Lin, and 朱家霖. "Distributed Finite-Element Computation Using Message Passing Interface - MPI." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/11531823016343763977.

Full text
Abstract:
Master's thesis
National Yunlin University of Science and Technology
Department of Construction Engineering
90
Static analysis is a basic computation in mechanics and a fundamental, very important task in structural analysis. With the results of static analysis, we can understand the state of a structure when it is subjected to different types of loads. Since the finite element method is generally employed for large-scale structural analysis, the main computational phases in a static analysis consist of the evaluation of element stiffness matrices and load vectors, the assembly of the system stiffness matrix and load vector, the solution of the system equilibrium equations, and the calculation of internal forces or stresses. The calculations involved are primarily matrix computations. If the data partitioning and the corresponding solution process are designed properly, most computational phases of a finite element analysis parallelize well, and the computer time consumed can be greatly reduced by incorporating parallel computation into the analysis. This is extremely useful for analyzing large-scale structures. Since the inception of parallel computers, parallel computation has been a very popular research topic in computational mechanics, and the parallelization of finite element analysis attracts particular attention because of the intensive computation usually involved. The main objective of this study is to develop efficient parallel algorithms for structural static analysis on distributed computer systems using the message-passing standard MPI.
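The computational phases named in this abstract can be illustrated on a toy problem: a 1D bar modelled as two spring elements, where element stiffness matrices are evaluated, assembled into the global matrix, and the equilibrium equations solved after applying a support condition. This is a generic serial sketch of the data flow (all names and numbers are illustrative, not from the thesis); a parallel code would distribute the element loop and the solve across MPI ranks:

```python
# Finite-element phases for a 1D two-spring bar: element stiffness evaluation,
# global assembly, boundary condition, and solution of K u = f.

def assemble(num_nodes, elements):
    """Assemble the global stiffness matrix from (node_i, node_j, k) elements."""
    K = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i, j, k in elements:
        # element stiffness [[k, -k], [-k, k]] scattered into global positions
        K[i][i] += k
        K[i][j] -= k
        K[j][i] -= k
        K[j][j] += k
    return K

if __name__ == "__main__":
    K = assemble(3, [(0, 1, 100.0), (1, 2, 100.0)])  # two springs, k = 100
    # Fix node 0 (u0 = 0) and apply F = 50 at node 2, then solve the reduced
    # 2x2 system for the free nodes by Cramer's rule.
    f1, f2 = 0.0, 50.0
    det = K[1][1] * K[2][2] - K[1][2] * K[2][1]
    u1 = (f1 * K[2][2] - f2 * K[1][2]) / det
    u2 = (K[1][1] * f2 - K[2][1] * f1) / det
    print(u1, u2)  # springs in series: u1 = F/k = 0.5, u2 = 2F/k = 1.0
```

With the displacements known, the remaining phase (internal forces) is just `k * (u_j - u_i)` per element, which also parallelizes element-by-element.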
APA, Harvard, Vancouver, ISO, and other styles
21

Squyres, Jeffrey M. "A component architecture for the message passing interface (MPI) the systems services interface (SSI) of LAM/MPI /." 2004. http://etd.nd.edu/ETD-db/theses/submitted/etd-03312004-160652/.

Full text
Abstract:
Thesis (Ph. D.)--University of Notre Dame, 2003.
Thesis directed by Andrew Lumsdaine for the Department of Computer Science and Engineering. "April 2004." Includes bibliographical references (leaves 301-312).
APA, Harvard, Vancouver, ISO, and other styles
22

Zounmevo, Ayi Judicael. "Scalability-Driven Approaches to Key Aspects of the Message Passing Interface for Next Generation Supercomputing." Thesis, 2014. http://hdl.handle.net/1974/12194.

Full text
Abstract:
The Message Passing Interface (MPI), which dominates the supercomputing programming environment, is used to orchestrate and fulfill communication in High Performance Computing (HPC). How far HPC programs can scale depends in large part on the ability to achieve fast communication and to overlap communication with computation, or communication with communication. This dissertation proposes a new asynchronous solution to the nonblocking Rendezvous protocol used between pairs of processes to transfer large payloads. On top of enforcing communication/computation overlap in a comprehensive way, the proposal improves on existing network-device-agnostic asynchronous solutions by being memory-scalable and by avoiding brute-force strategies. Achieving overlap between communication and computation is important, but each communication is also expected to incur minimal latency. In that respect, the processing of the queues that hold messages pending reception inside the MPI middleware is expected to be fast; currently, though, that processing slows down as program scale grows. This research presents a novel scalability-driven message queue whose processing skips altogether large portions of queue items that are deterministically guaranteed to lead to unfruitful searches. Because it has little sensitivity to program size, the proposed message queue maintains very good performance, on top of displaying a low and flattening memory-footprint growth pattern. Due to the blocking nature of its required synchronizations, the one-sided communication model of MPI creates both communication/computation and communication/communication serializations. This research fixes these issues and the latency-related inefficiencies documented for MPI one-sided communication by proposing completely nonblocking and non-serializing versions of those synchronizations.
The improvements, meant for consideration in a future MPI standard, also allow new classes of programs to be expressed more efficiently in MPI. Finally, a persistent distributed service is designed over MPI to show its impact at large scales beyond communication-only activities. MPI is analyzed in situations of resource exhaustion, partial failure, and heavy use of internal objects for communicating and non-communicating routines. Important scalability issues are revealed and solution approaches are put forth.
Thesis (Ph.D, Electrical & Computer Engineering) -- Queen's University, 2014-05-23 15:08:58.56
APA, Harvard, Vancouver, ISO, and other styles
23

Gupta, Rakhi. "One To Mant And Many To Many Collective Communication Operations On Grids." Thesis, 2006. http://hdl.handle.net/2005/345.

Full text
Abstract:
Collective communication operations are widely used in MPI applications and play an important role in their performance. Hence, various projects have focused on optimizing collective communications for various kinds of parallel computing environments, including LAN settings, heterogeneous networks, and most recently Grid systems. The distinguishing factors of Grids relative to all the other environments are the heterogeneity of hosts and network, and dynamically changing resource characteristics including load and availability. The first part of the thesis develops a solution for MPI broadcast (one-to-many) on Grids. Some current strategies take into consideration static information about network topology for determining an efficient broadcast tree for Grids; some other strategies take into account only transient network characteristics. We combined both strategies and cluster the network dynamically on the basis of link bandwidths. Given a set of network parameters, we use Simulated Annealing (SA) to obtain the best schedule. We can also time-tune individual SAs to adapt the solution-finding process on the basis of the estimated time available before the next broadcast invocation in the application. We also developed a software architecture for updating schedules. We compared our algorithm with the earlier approaches under loaded network conditions and obtained an average performance improvement of 20%. The second part of the thesis extends the work to the MPI allgather (many-to-many) operation. Current popular techniques use strictly hierarchical schemes for this operation, wherein a representative (or coordinator) node is chosen from each cluster and inter-cluster communication is done through these representative nodes. This is non-optimal, as inter-cluster communication usually travels over high-capacity links that can sustain more than one transfer with the same throughput. We developed a cluster-based, incremental heuristic algorithm for allgather on Grids.
We compared the time taken by the allgather schedules determined by this algorithm with current popular implementations. We also compared our algorithm with a strategy where allgather is constructed from a set of broadcast trees. We obtained an average performance improvement of 67% over existing strategies.
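For reference, the baseline allgather semantics being optimized in this thesis can be illustrated with a plain-Python simulation of the classic ring algorithm (no real MPI involved; Grid-aware schedules such as the one described above reorder these pairwise transfers to respect cluster boundaries and link capacities):

```python
# Simulation of a ring allgather among p "processes": each starts with one
# item and, for p - 1 steps, forwards its most recently received item to its
# right neighbour, so every process ends up holding all p items.

def ring_allgather(local_values):
    p = len(local_values)
    buffers = [[v] for v in local_values]         # buffers[i]: what rank i holds
    for _ in range(p - 1):
        in_flight = [buf[-1] for buf in buffers]  # snapshot before the exchange
        for i in range(p):
            buffers[(i + 1) % p].append(in_flight[i])
    return buffers

if __name__ == "__main__":
    result = ring_allgather(["a", "b", "c", "d"])
    # every rank now holds all four items (arrival order differs per rank)
    assert all(sorted(buf) == ["a", "b", "c", "d"] for buf in result)
    print(result)
```

The ring takes p - 1 steps regardless of topology, which is exactly why cluster-aware schedules pay off on Grids: transfers crossing slow inter-cluster links can be batched or rerouted instead of treated like intra-cluster ones.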
APA, Harvard, Vancouver, ISO, and other styles