Dissertations / Theses on the topic 'High performance Computation'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 dissertations / theses for your research on the topic 'High performance Computation.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.
Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.
Reis, Ruy Freitas. "Simulações numéricas 3D em ambiente paralelo de hipertermia com nanopartículas magnéticas." Universidade Federal de Juiz de Fora (UFJF), 2014. https://repositorio.ufjf.br/jspui/handle/ufjf/3499.
CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
This work deals with the numerical modeling of solid tumor treatments with hyperthermia using magnetic nanoparticles, considering the 3D bioheat transfer model proposed by Pennes (1948). Two different possibilities of blood perfusion were compared: the first assumes a constant value, and the second a temperature-dependent function. The living tissue was modeled with skin, fat and muscle layers, in addition to the tumor. The model solution was approximated with the finite difference method (FDM) in a heterogeneous medium. Due to the different blood perfusion parameters, a system of linear equations (constant perfusion) and a system of nonlinear equations (temperature-dependent perfusion) were obtained. To discretize the time domain, two explicit numerical strategies were used: the first using the classical Euler method, and the second a predictor-corrector algorithm originating from the generalized trapezoidal alpha-family of time integration methods. Since the computational time required to solve a three-dimensional model is large, two different parallel strategies were applied to the numerical method. The first one uses the OpenMP parallel programming API, and the second one the CUDA platform. The experimental results showed that the parallelization using OpenMP improves performance by up to 39 times over the sequential execution time, and the CUDA version was also efficient, yielding gains of up to 242 times over the sequential execution time. Thus, this result ensures an execution time about twice as fast as the biological phenomenon.
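To make the numerical scheme summarized above concrete, here is a minimal sketch (not the thesis code) of one explicit Euler step of the Pennes bioheat equation, rho*c*dT/dt = k*lap(T) + wb*rho_b*cb*(Ta - T) + Qm + Qr, on a uniform 3D grid with constant properties (unlike the layered tissue considered in the thesis), parallelized with OpenMP. All names and parameter values are illustrative assumptions.

```cpp
// Minimal sketch: one explicit Euler step of the Pennes bioheat equation
// on a flattened nx*ny*nz grid, parallelized with OpenMP. Values are placeholders.
#include <cstddef>
#include <vector>

struct Params {
    double rho = 1000.0, c = 4000.0;                 // tissue density, specific heat
    double k = 0.5;                                  // thermal conductivity
    double wb = 0.0005, rho_b = 1060.0, cb = 3600.0; // blood perfusion terms
    double Ta = 37.0;                                // arterial blood temperature
    double Qm = 420.0, Qr = 0.0;                     // metabolic and external heating
};

// T and Tnew are flattened nx*ny*nz grids; h is the grid spacing, dt the time step.
void pennes_step(const std::vector<double>& T, std::vector<double>& Tnew,
                 std::size_t nx, std::size_t ny, std::size_t nz,
                 double h, double dt, const Params& p)
{
    auto idx = [=](std::size_t i, std::size_t j, std::size_t k) {
        return (i * ny + j) * nz + k;
    };
    #pragma omp parallel for collapse(2)
    for (std::size_t i = 1; i < nx - 1; ++i)
        for (std::size_t j = 1; j < ny - 1; ++j)
            for (std::size_t k = 1; k < nz - 1; ++k) {
                const double Tc = T[idx(i, j, k)];
                // 7-point finite-difference Laplacian
                const double lap = (T[idx(i + 1, j, k)] + T[idx(i - 1, j, k)] +
                                    T[idx(i, j + 1, k)] + T[idx(i, j - 1, k)] +
                                    T[idx(i, j, k + 1)] + T[idx(i, j, k - 1)] -
                                    6.0 * Tc) / (h * h);
                const double rhs = p.k * lap + p.wb * p.rho_b * p.cb * (p.Ta - Tc)
                                 + p.Qm + p.Qr;
                Tnew[idx(i, j, k)] = Tc + dt * rhs / (p.rho * p.c);
            }
}
```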
Campos, Joventino de Oliveira. "Método de lattice Boltzmann para simulação da eletrofisiologia cardíaca em paralelo usando GPU." Universidade Federal de Juiz de Fora (UFJF), 2015. https://repositorio.ufjf.br/jspui/handle/ufjf/3555.
CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
This work presents the lattice Boltzmann method (LBM) for computational simulations of the cardiac electrical activity using the monodomain model. An optimized implementation of the lattice Boltzmann method is presented which uses a collision model with multiple relaxation parameters, known as multiple relaxation time (MRT), in order to account for the anisotropy of the cardiac tissue. With a focus on fast simulations of cardiac dynamics, and owing to the high level of parallelism present in the LBM, a GPU parallelization was performed and its performance was studied on regular and irregular three-dimensional domains. The results of our optimized LBM GPU implementation for cardiac simulations showed acceleration factors as high as 500x for the overall simulation, and for the LBM a performance of 419 mega lattice updates per second (MLUPS) was achieved. With near real-time simulations on a single computer equipped with a modern GPU, these results show that the proposed framework is a promising approach for application in a clinical workflow.
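For reference, the 419 MLUPS figure above uses the standard lattice Boltzmann throughput metric: lattice-node updates per second divided by one million. The sketch below shows how such a figure is typically derived; the grid size, step count and wall time are placeholder values, not measurements from the thesis.

```cpp
// Illustrative only: deriving a throughput figure in MLUPS (mega lattice
// updates per second) from grid size, time steps, and wall-clock time.
#include <cstdio>

int main() {
    const double nodes     = 256.0 * 256.0 * 256.0;  // lattice nodes in the domain
    const double steps     = 10000.0;                // time steps simulated
    const double wall_time = 400.0;                  // seconds of execution

    const double mlups = nodes * steps / (wall_time * 1.0e6);
    std::printf("Throughput: %.1f MLUPS\n", mlups);
    return 0;
}
```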
Isa, Mohammad Nazrin. "High performance reconfigurable architectures for biological sequence alignment." Thesis, University of Edinburgh, 2013. http://hdl.handle.net/1842/7721.
Nasar-Ullah, Q. A. "High performance parallel financial derivatives computation." Thesis, University College London (University of London), 2014. http://discovery.ucl.ac.uk/1431080/.
Ahrens, James P. "Scientific experiment management with high-performance distributed computation /." Thesis, Connect to this title online; UW restricted, 1996. http://hdl.handle.net/1773/6974.
Pandya, Ajay Kirit. "Performance of multithreaded computations on high-speed networks." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp01/MQ32212.pdf.
Pilkey, Deborah F. "Computation of a Damping Matrix for Finite Element Model Updating." Diss., Virginia Tech, 1998. http://hdl.handle.net/10919/30453.
Full textPh. D.
Steen, Adrianus Jan van der. "Benchmarking of high performance computers for scientific and technical computation." [S.l.] : Utrecht : [s.n.] ; Universiteitsbibliotheek Utrecht [Host], 1997. http://www.ubu.ruu.nl/cgi-bin/grsn2url?01761909.
Full textZhao, Yu. "High performance Monte Carlo computation for finance risk data analysis." Thesis, Brunel University, 2013. http://bura.brunel.ac.uk/handle/2438/8206.
Full textVetter, Jeffrey Scott. "Techniques and optimizations for high performance computational steering." Diss., Georgia Institute of Technology, 1998. http://hdl.handle.net/1853/9242.
Full textChow, Yi-Mei Maria 1974. "Computational fluid dynamics for high performance structural facilities." Thesis, Massachusetts Institute of Technology, 1998. http://hdl.handle.net/1721.1/50366.
Full textIncludes bibliographical references (leaves 104-106).
by Yi-Mei Maria Chow.
M.Eng.
Skjerven, Brian M. "A parallel implementation of an agent-based brain tumor model." Link to electronic thesis, 2007. http://www.wpi.edu/Pubs/ETD/Available/etd-060507-172337/.
Keywords: Visualization; Numerical analysis; Computational biology; Scientific computation; High-performance computing. Includes bibliographical references (p.19).
Henning, Peter Allen. "Computational Parameter Selection and Simulation of Complex Sphingolipid Pathway Metabolism." Thesis, Georgia Institute of Technology, 2006. http://hdl.handle.net/1853/16202.
Arora, Nitin. "High performance algorithms to improve the runtime computation of spacecraft trajectories." Diss., Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/49076.
Cyrus, Sam. "Fast Computation on Processing Data Warehousing Queries on GPU Devices." Scholar Commons, 2016. http://scholarcommons.usf.edu/etd/6214.
Axner, Lilit. "High performance computational hemodynamics with the Lattice Boltzmann method." [S.l. : Amsterdam : s.n.] ; Universiteit van Amsterdam [Host], 2007. http://dare.uva.nl/document/54726.
Kulkarni, Amol S. "Application of computational intelligence to high performance electric drives /." Thesis, Connect to this title online; UW restricted, 1999. http://hdl.handle.net/1773/5897.
Kuhlman, Christopher J. "High Performance Computational Social Science Modeling of Networked Populations." Diss., Virginia Tech, 2013. http://hdl.handle.net/10919/51175.
Ph. D.
Pugaonkar, Aniket Narayan. "A High Performance C++ Generic Benchmark for Computational Epidemiology." Thesis, Virginia Tech, 2015. http://hdl.handle.net/10919/51243.
Master of Science
McFarlane, Ross. "High-performance computing for computational biology of the heart." Thesis, University of Liverpool, 2010. http://livrepository.liverpool.ac.uk/3173/.
Ragan-Kelley, Jonathan Millard. "Decoupling algorithms from the organization of computation for high performance image processing." Thesis, Massachusetts Institute of Technology, 2014. http://hdl.handle.net/1721.1/89996.
Cataloged from PDF version of thesis. "June 2014."
Includes bibliographical references (pages 127-133).
Future graphics and imaging applications, from self-driving cars to 4D light field cameras to pervasive sensing, demand orders of magnitude more computation than we currently have. This thesis argues that the efficiency and performance of an application are determined not only by the algorithm and the hardware architecture on which it runs, but critically also by the organization of computations and data on that architecture. Real graphics and imaging applications appear embarrassingly parallel, but have complex dependencies, and are limited by locality (the distance over which data has to move, e.g., from nearby caches or far away main memory) and synchronization. Increasingly, the cost of communication, both within a chip and over a network, dominates computation and power consumption, and limits the gains realized from shrinking transistors. Driven by these trends, writing high-performance processing code is challenging because it requires global reorganization of computations and data, not simply local optimization of an inner loop. Existing programming languages make it difficult for clear and composable code to express optimized organizations because they conflate the intrinsic algorithms being defined with their organization. To address the challenge of productively building efficient, high-performance programs, this thesis presents the Halide language and compiler for image processing. Halide explicitly separates what computations define an algorithm from the choices of execution structure which determine parallelism, locality, memory footprint, and synchronization. For image processing algorithms with the same complexity (even the exact same set of arithmetic operations and data) executing on the same hardware, the order and granularity of execution and the placement of data can easily change performance by an order of magnitude because of locality and parallelism. I will show that, for data-parallel pipelines common in graphics, imaging, and other data-intensive applications, the organization of computations and data for a given algorithm is constrained by a fundamental tension between parallelism, locality, and redundant computation of shared values. I will present a systematic model of "schedules" which explicitly trade off these pressures by globally reorganizing the computations and data for an entire pipeline, and an optimizing compiler that synthesizes high performance implementations from a Halide algorithm and a schedule. The end result is much simpler programs, delivering performance often many times faster than the best prior hand-tuned C, assembly, and CUDA implementations, while scaling across radically different architectures, from ARM mobile processors to massively parallel GPUs.
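As a concrete illustration of the algorithm/schedule separation described in the abstract above, the following sketch is adapted from the publicly documented Halide blur example (it is not code from the thesis and assumes the Halide library is available): the first two definitions state what is computed, and only the two schedule lines decide tiling, vectorization and parallelism.

```cpp
#include "Halide.h"
using namespace Halide;

int main() {
    // Input image with a boundary condition so the 3x3 stencil never reads out of bounds.
    Buffer<uint16_t> in(800, 600);
    Func input = BoundaryConditions::repeat_edge(in);

    Func blur_x("blur_x"), blur_y("blur_y");
    Var x("x"), y("y"), xi("xi"), yi("yi");

    // Algorithm: what is computed (a separable 3x3 box blur).
    blur_x(x, y) = (input(x - 1, y) + input(x, y) + input(x + 1, y)) / 3;
    blur_y(x, y) = (blur_x(x, y - 1) + blur_x(x, y) + blur_x(x, y + 1)) / 3;

    // Schedule: how it is computed; changing only these lines changes locality
    // and parallelism, not the result.
    blur_y.tile(x, y, xi, yi, 256, 32).vectorize(xi, 8).parallel(y);
    blur_x.compute_at(blur_y, x).vectorize(x, 8);

    Buffer<uint16_t> out = blur_y.realize({800, 600});
    return 0;
}
```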
by Jonathan Ragan-Kelley.
Ph. D.
Livesey, Daria. "High performance computations with Hecke algebras : bilinear forms and Jantzen filtrations." Thesis, University of Aberdeen, 2014. http://digitool.abdn.ac.uk:80/webclient/DeliveryManager?pid=214835.
Aldred, Peter L. "Diffraction studies and computational modelling of high-performance aromatic polymers." Thesis, University of Reading, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.413675.
Kasap, Server. "High performance reconfigurable architectures for bioinformatics and computational biology applications." Thesis, University of Edinburgh, 2010. http://hdl.handle.net/1842/24757.
Chugunov, Svyatoslav. "High-Performance Simulations for Atmospheric Pressure Plasma Reactor." Diss., North Dakota State University, 2012. https://hdl.handle.net/10365/26626.
Palm, Johan. "High Performance FPGA-Based Computation and Simulation for MIMO Measurement and Control Systems." Thesis, Mälardalen University, School of Innovation, Design and Engineering, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-7477.
The Stressometer system is a measurement and control system used in cold rolling to improve the flatness of a metal strip. In order to achieve this goal the system employs a multiple input multiple output (MIMO) control system with a considerable number of sensors and actuators. As a consequence, the computational load on the Stressometer control system becomes very high if too advanced functions are used. Simultaneously, advances in rolling mill mechanical design make it necessary to implement more complex functions in order for the Stressometer system to stay competitive. Most industrial players in this market consider improved computational power, for measurement, control and modeling applications, to be a key competitive factor. Accordingly, there is a need to improve the computational power of the Stressometer system. Several different approaches towards this objective have been identified, e.g. exploiting hardware parallelism in modern general-purpose and graphics processors.
Another approach is to implement different applications in FPGA-based hardware, either tailored to a specific problem or as part of hardware/software co-design. Through the use of a hardware/software co-design approach, the efficiency of the Stressometer system can be increased, lowering the overall demand for processing power since the available resources can be exploited more fully. Hardware-accelerated platforms can be used to increase the computational power of the Stressometer control system without the need for major changes in the existing hardware. Thus hardware upgrades can be as simple as connecting a cable to an accelerator platform, while hardware/software co-design is used to find a suitable hardware/software partition, moving applications between software and hardware.
In order to determine whether this hardware/software co-design approach is realistic or not, the feasibility of implementing simulator, computational and control applications in FPGA-based hardware needs to be determined. This is accomplished by selecting two specific applications for a closer study, determining the feasibility of implementing a Stressometer measuring roll simulator and a parallel Cholesky algorithm in FPGA-based hardware.
Based on these studies, this work has determined that FPGA device technology is well suited for implementing both simulator and computational applications. The Stressometer measuring roll simulator was able to approximate the force and pulse signals of the Stressometer measuring roll at a relatively modest resource consumption, using only 1747 slices and eight DSP slices, while the parallel FPGA-based Cholesky component is able to provide performance in the range of GFLOP/s, exceeding the performance of the personal computer used for comparison in several simulations, although at a very high resource consumption. The result of this thesis, based on the two feasibility studies, indicates that it is possible to increase the processing power of the Stressometer control system using FPGA device technology.
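For context on the Cholesky feasibility study mentioned above, the following is a plain sequential reference sketch of the factorization A = L * L^T, the kind of CPU baseline an FPGA component's GFLOP/s figures would typically be compared against. It is an illustrative assumption, not the thesis implementation.

```cpp
// Reference Cholesky factorization of a symmetric positive definite n x n matrix
// stored row-major in 'a'; on return the lower triangle of 'a' holds L.
#include <cmath>
#include <stdexcept>
#include <vector>

void cholesky(std::vector<double>& a, std::size_t n) {
    for (std::size_t j = 0; j < n; ++j) {
        double d = a[j * n + j];
        for (std::size_t k = 0; k < j; ++k)
            d -= a[j * n + k] * a[j * n + k];
        if (d <= 0.0) throw std::runtime_error("matrix is not positive definite");
        a[j * n + j] = std::sqrt(d);
        for (std::size_t i = j + 1; i < n; ++i) {
            double s = a[i * n + j];
            for (std::size_t k = 0; k < j; ++k)
                s -= a[i * n + k] * a[j * n + k];
            a[i * n + j] = s / a[j * n + j];
        }
    }
}
```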
Lee, Hua, Stephanie Lockwood, James Tandon, and Andrew Brown. "BACKWARD PROPAGATION BASED ALGORITHMS FOR HIGH-PERFORMANCE IMAGE FORMATION." International Foundation for Telemetering, 2000. http://hdl.handle.net/10150/608300.
In this paper, we present the recent results of theoretical development and software implementation of a complete collection of high-performance image reconstruction algorithms designed for high-resolution imaging for various data acquisition configurations.
Sanghvi, Niraj D. "Parallel Computation of the Meddis MATLAB Auditory Periphery Model." The Ohio State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=osu1339092782.
Full textSalavert, Torres José. "Inexact Mapping of Short Biological Sequences in High Performance Computational Environments." Doctoral thesis, Universitat Politècnica de València, 2014. http://hdl.handle.net/10251/43721.
Salavert Torres, J. (2014). Inexact Mapping of Short Biological Sequences in High Performance Computational Environments [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/43721
TESIS
Schneck, Phyllis Adele. "Dynamic management of computation and communication resources to enable secure high-performances applications." Diss., Georgia Institute of Technology, 1999. http://hdl.handle.net/1853/8264.
Bas, Erdeniz Ozgun. "Load-Balancing Spatially Located Computations using Rectangular Partitions." The Ohio State University, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=osu1306909831.
Middleton, Anthony M. "High-Performance Knowledge-Based Entity Extraction." NSUWorks, 2009. http://nsuworks.nova.edu/gscis_etd/246.
Godwin, Jeswin Samuel. "High-Performance Sparse Matrix-Vector Multiplication on GPUs for Structured Grid Computations." The Ohio State University, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=osu1357280824.
Ruddy, John. "Computational Acceleration for Next Generation Chemical Standoff Sensors Using FPGAs." Master's thesis, Temple University Libraries, 2012. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/175971.
M.S.E.E.
This research provides the real-time computational resource for three-dimensional tomographic chemical threat mapping using mobile hyperspectral sensors from sparse input data. The crucial calculation limiting real-time execution of the algorithm is the determination of the projection matrix using the algebraic reconstruction technique (ART). The computation utilizes the inherent parallel nature of ART with an implementation of the algorithm on a field programmable gate array. The MATLAB Fixed-Point Toolbox is used to determine the optimal fixed-point data types in the conversion from the original floating-point algorithm. The computation is then implemented using the Xilinx System Generator, which generates a hardware description language representation from a block diagram design.
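The algebraic reconstruction technique (ART) named above is, at its core, the Kaczmarz row-action update x <- x + lambda * (b_i - a_i . x) / ||a_i||^2 * a_i. The sketch below is a standard floating-point CPU version of one ART sweep, shown only for illustration; the thesis uses a fixed-point FPGA implementation, and the function and parameter names here are assumptions.

```cpp
// One ART / Kaczmarz sweep over all rows of the projection matrix.
#include <vector>

void art_sweep(const std::vector<std::vector<double>>& A,  // projection matrix rows a_i
               const std::vector<double>& b,               // measured projections
               std::vector<double>& x,                     // reconstructed image (flattened)
               double lambda)                              // relaxation factor, e.g. 0.1
{
    for (std::size_t i = 0; i < A.size(); ++i) {
        double dot = 0.0, norm2 = 0.0;
        for (std::size_t j = 0; j < x.size(); ++j) {
            dot   += A[i][j] * x[j];
            norm2 += A[i][j] * A[i][j];
        }
        if (norm2 == 0.0) continue;                 // skip empty rows
        const double scale = lambda * (b[i] - dot) / norm2;
        for (std::size_t j = 0; j < x.size(); ++j)
            x[j] += scale * A[i][j];
    }
}
```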
Temple University--Theses
Herrero, Zaragoza Jose Ramón. "A framework for efficient execution of matrix computations." Doctoral thesis, Universitat Politècnica de Catalunya, 2006. http://hdl.handle.net/10803/5991.
We study some important operations which appear in the solution of real-world problems: some sparse and dense linear algebra codes and a classification algorithm. In particular, we focus our attention on the efficient execution of the following operations: sparse Cholesky factorization; dense matrix multiplication; dense Cholesky factorization; and Nearest Neighbor classification.
A lot of research has been conducted on the efficient parallelization of numerical algorithms. However, the efficiency of a parallel algorithm depends ultimately on the performance obtained from the computations performed on each node. The work presented in this thesis focuses on the sequential execution on a single processor.
There exist a number of data structures for sparse computations which can be used in order to avoid the storage of, and computation on, zero elements. We work with a hierarchical data structure known as a hypermatrix. A matrix is subdivided recursively an arbitrary number of times. Several pointer matrices are used to store the location of submatrices at each level. The last level consists of data submatrices which are dealt with as dense submatrices. When the block size of these dense submatrices is small, the number of zeros can be greatly reduced. However, the performance obtained from BLAS3 routines drops heavily. Consequently, there is a trade-off in the size of the data submatrices used for a sparse Cholesky factorization with the hypermatrix scheme. Our goal is to reduce the overhead introduced by unnecessary operations on zeros when a hypermatrix data structure is used to produce a sparse Cholesky factorization. In this work we study several techniques for reducing such overhead in order to obtain high performance.
One of our goals is the creation of codes which work efficiently on different platforms when operating on dense matrices. To obtain high performance, the resources offered by the CPU must be properly utilized. At the same time, the memory hierarchy must be exploited to tolerate increasing memory latencies. To achieve the former, we produce inner kernels which use the CPU very efficiently. To achieve the latter, we investigate nonlinear data layouts. Such data formats can contribute to the effective use of the memory system.
The use of highly optimized inner kernels is of paramount importance for obtaining efficient numerical algorithms. Often, such kernels are created by hand. However, we want to create efficient inner kernels for a variety of processors using a general approach and avoiding hand-coding in assembly language. In this work, we present an alternative way to produce efficient kernels automatically, based on a set of simple codes written in a high-level language, which can be parameterized at compilation time. The advantage of our method lies in the ability to generate very efficient inner kernels by means of a good compiler. Working on regular codes for small matrices, most of the compilers we used on different platforms created very efficient inner kernels for matrix multiplication. Using the resulting kernels we have been able to produce high-performance sparse and dense linear algebra codes on a variety of platforms.
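A minimal sketch of the parameterized inner-kernel idea described in the paragraph above, assuming nothing from the thesis code itself: a small matrix-multiply kernel written in a high-level language whose block sizes are fixed at compile time, leaving the low-level optimization to the compiler. The template name and block sizes are illustrative.

```cpp
// C (MR x NR) += A (MR x KC) * B (KC x NR), all row-major with the given leading dimensions.
#include <cstddef>

template <std::size_t MR, std::size_t NR, std::size_t KC>
inline void micro_kernel(const double* A, std::size_t lda,
                         const double* B, std::size_t ldb,
                         double* C, std::size_t ldc) {
    for (std::size_t i = 0; i < MR; ++i)
        for (std::size_t k = 0; k < KC; ++k) {
            const double a = A[i * lda + k];
            for (std::size_t j = 0; j < NR; ++j)
                C[i * ldc + j] += a * B[k * ldb + j];
        }
}

// Usage: instantiate once per block size chosen for the target machine, e.g.
//   micro_kernel<4, 8, 64>(Ablock, 64, Bblock, 8, Cblock, 8);
```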
In this work we also show that techniques used in linear algebra codes can be useful in other fields. We present the work we have done in the optimization of the Nearest Neighbor classification focusing on the speed of the classification process.
Tuning several codes for different problems and machines can become a heavy and unbearable task. For this reason we have developed an environment for development and automatic benchmarking of codes which is presented in this thesis.
As a practical result of this work, we have been able to create efficient codes for several matrix operations on a variety of platforms. Our codes are highly competitive with other state-of-art codes for some problems.
Gruener, Charles J. "Design and implementation of a computational cluster for high performance design and modeling of integrated circuits /." Online version of thesis, 2009. http://hdl.handle.net/1850/11204.
Full textJiménez, García Brian. "Development and optimization of high-performance computational tools for protein-protein docking." Doctoral thesis, Universitat de Barcelona, 2016. http://hdl.handle.net/10803/398790.
Full textGràcies als recents avenços en computació, el nostre coneixement de la química que suporta la vida ha incrementat enormement i ens ha conduït a comprendre que la química de la vida és més sofisticada del que mai haguéssim pensat. Les proteïnes juguen un paper fonamental en aquesta química i són descrites habitualment com a les fàbriques de les cèl·lules. A més a més, les proteïnes estan involucrades en gairebé tots els processos fonamentals en els éssers vius. Malauradament, el nostre coneixement de la funció de moltes proteïnes és encara escaig degut a les limitacions actuals de molts mètodes experimentals, que encara no són capaços de proporcionar-nos estructures de cristall per a molts complexes proteïna-proteïna. El desenvolupament de tècniques i eines informàtiques d’acoblament proteïna-proteïna pot ésser crucial per a ajudar-nos a reduir aquest forat. En aquesta tesis, hem presentat un nou mètode computacional de predicció d’acoblament proteïna-proteïna, LightDock, que és capaç de fer servir diverses funcions energètiques definides per l’usuari i incloure un model de flexibilitat de la cadena principal mitjançant la anàlisis de modes normals. Segon, diverses eines d’interès per a la comunitat científica i basades en tecnologia web han sigut desenvolupades: un servidor web de predicció d’acoblament proteïna-proteïna, una eina online per a caracteritzar les interfícies d’acoblament proteïna-proteïna i una eina web per a incloure dades experimentals de tipus SAXS. A més a més, les optimitzacions fetes al protocol pyDock i la conseqüent millora en rendiment han propiciat que el nostre grup de recerca obtingués la cinquena posició entre més de 60 grups en les dues darreres avaluacions de l’experiment internacional CAPRI. Finalment, hem dissenyat i compilat els banc de proves d’acoblament proteïna-proteïna (versió 5) i proteïna-ARN (versió 1), molt importants per a la comunitat ja que permeten provar i desenvolupar nous mètodes i analitzar-ne el rendiment en aquest marc de referència comú.
Ling, Cheng. "High performance bioinformatics and computational biology on general-purpose graphics processing units." Thesis, University of Edinburgh, 2012. http://hdl.handle.net/1842/6260.
Full textKissami, Imad. "High Performance Computational Fluid Dynamics on Clusters and Clouds : the ADAPT Experience." Thesis, Sorbonne Paris Cité, 2017. http://www.theses.fr/2017USPCD019/document.
In this thesis, we present our research work in the field of high-performance computing in computational fluid dynamics (CFD) for cluster and cloud architectures. In general, we propose to develop an efficient solver, called ADAPT, for CFD problems, both in a classical view corresponding to developments in MPI and in a view that leads us to represent ADAPT as a graph of tasks intended to be scheduled on a cloud computing platform. As a first contribution, we propose a parallelization of the diffusion-convection equation coupled to a linear system in 2D and 3D using MPI. A two-level parallelization is used in our implementation to take advantage of current distributed multicore machines. A balanced distribution of the computational load is obtained by decomposing the domain using METIS, as well as a relevant resolution of our very large linear system using the parallel solver MUMPS (MUltifrontal Massively Parallel Solver). Our second contribution illustrates how to imagine the ADAPT framework, as depicted in the first contribution, as a service. We transform the framework (in fact, a part of the framework) into a DAG (Directed Acyclic Graph) in order to see it as a scientific workflow. Then we introduce new policies inside the RedisDG workflow engine in order to schedule the tasks of the DAG in an opportunistic manner. We introduce into RedisDG the possibility to work with dynamic workers (they can leave or enter the computing system as they wish) and a multi-criteria approach to decide on the "best" worker to choose to execute a task. Experiments are conducted on the ADAPT workflow to exemplify how fine the scheduling and the scheduling decisions are in the new RedisDG.
Jiang, Wei. "A Map-Reduce-Like System for Programming and Optimizing Data-Intensive Computations on Emerging Parallel Architectures." The Ohio State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=osu1343677821.
Full textMiotti, Bettanini Alvise. "Welding of high performance metal matrix composite materials: the ICME approach." Thesis, KTH, Metallografi, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-154021.
Full textPulla, Gautam. "High Performance Computing Issues in Large-Scale Molecular Statics Simulations." Thesis, Virginia Tech, 1999. http://hdl.handle.net/10919/33206.
Full textMaster of Science
Hospital, Gasch Adam. "High Throughput Computational Studies of Macromolecular Structure Flexibility." Doctoral thesis, Universitat de Barcelona, 2014. http://hdl.handle.net/10803/284440.
Full textLas estructuras tridimensionales de las macromoléculas, y en particular, su dinámica y flexibilidad, están íntimamente relacionadas con su función biológica. Debido a la tremenda dificultad del estudio experimental de las propiedades dinámicas de las macromoléculas, se han popularizado un conjunto de técnicas teóricas con las que obtener simulaciones de su movimiento. En los últimos años, los grandes y rápidos avances tanto en la computación como en los estudios teóricos de flexibilidad de macromoléculas han abierto la posibilidad de llevar a cabo estudios masivos de alto rendimiento (High throughput). Sin embargo, para lograr realizar este tipo de estudios, no solo se requieren algoritmos potentes y poder computacional, sino también una automatización de los distintos pasos necesarios en el proceso de cálculo de trayectorias así como de su posterior análisis. Casi tan importante como los cálculos, es necesario un sistema de almacenamiento que permita tanto guardar como consultar de manera eficiente la cantidad enorme de datos generados por el estudio masivo. En esta tesis, se han estudiado, diseñado e implementado diferentes sistemas de automatización high throughput de cálculos de dinámica molecular, tanto atomística como de baja resolución, así como herramientas para su posterior análisis. Así mismo, y para acercar estas metodologías complejas a usuarios no expertos, hemos implementado un conjunto de entornos gráficos a partir de servidores web, que directamente, o vía el portal del Instituto Nacional de Bioinformática (INB), permiten su uso por una amplia comunidad científica.
Ozog, David. "High Performance Computational Chemistry: Bridging Quantum Mechanics, Molecular Dynamics, and Coarse-Grained Models." Thesis, University of Oregon, 2017. http://hdl.handle.net/1794/22778.
Full textStefanek, Anton. "A high-level framework for efficient computation of performance : energy trade-offs in Markov population models." Thesis, Imperial College London, 2013. http://hdl.handle.net/10044/1/23931.
Full textGreen, Robert C. II. "Novel Computational Methods for the Reliability Evaluation of Composite Power Systems using Computational Intelligence and High Performance Computing Techniques." University of Toledo / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1338894641.
Full textElango, Venmugil. "Techniques for Characterizing the Data Movement Complexity of Computations." The Ohio State University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=osu1452242436.
Full textMakgata, Katlego Webster. "Computational analysis and optimisation of the inlet system of a high-performance rally engine." Diss., Pretoria : [s.n.], 2005. http://upetd.up.ac.za/thesis/available/etd-01242006-123639.
Full textSetta, Mario. "Multiscale numerical approximation of morphology formation in ternary mixtures with evaporation : Discrete and continuum models for high-performance computing." Thesis, Karlstads universitet, Institutionen för matematik och datavetenskap (from 2013), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kau:diva-85036.
Full textTintó, Prims Oriol. "NEMO: computational challenges in ocean simulation." Doctoral thesis, Universitat Autònoma de Barcelona, 2019. http://hdl.handle.net/10803/669877.
Full textThe ocean plays a very important role in modulating the temperature of the Earth through absorbing, storing and transporting the energy that arrives from the sun. Better understanding the dynamics of the ocean can help us to better predict the weather and to better comprehend the climate, two topics of special relevance for society. Ocean models had become an extremely useful tools, as they became a framework upon with it was possible to build knowledge. Using computers it became possible to numerically solve the fluid equations of the ocean and by improving how ocean models exploit the computational resources, we can reduce the cost of simulation whilst enabling new developments that will increase its skill. By facing the computational challenges of ocean simulation we can contribute to topics that have a direct impact on society whilst helping to reduce the cost of our experiments. Being the major European ocean model and one of the main state-of-the-art ocean models worldwide, this thesis has focused on the Nucleus NEMO. To find a way to improve the computational performance of ocean models, one of the initial goals was to better understand their computational behaviour. To do so, an analysis methodology was proposed, paying special attention to inter-process communication. Used with NEMO, the methodology helped to highlight several implementation inefficiencies, whose optimization led to a 46-49\% gain in the maximum model throughput, increasing the scalability of the model. This result illustrated that this kind of analysis can significantly help model developers to adapt their code highlighting where the problems really are. Another of the issues detected was that the impact of the domain decomposition was alarmingly underestimated, since in certain circumstances the model's algorithm was selecting a sub-optimal decomposition. Taking into account the factors that make a specific decomposition impact the performance, a method to select an optimal decomposition was proposed. The results showed that that by a wise selection of the domain decomposition it was possible not only to save resources but also to increase the maximum model throughput by a 41\% in some cases. After the successes achieved during the first part of the thesis, that allowed an increase of the maximum throughput of the model by a factor of more than two, the attention focused on mixed-precision algorithms. Ideally, a proper usage of numerical precision would allow to improve the computational performance without sacrificing accuracy. In order to achieve that in ocean models, a method to find out the precision required for each one of the real variables in a code was presented. The method was used with NEMO and with the Regional Ocean Modelling System showing that in both models most of the variables could use less than the standard 64-bit without problems. Last but not least, it was found that being ocean models nonlinear it was not straightforward to determine whether a change made into the code was deteriorating the accuracy of the model or not. In order to solve this problem a method to verify the accuracy of a non-linear model was presented. Although the different contributions that gave form to this thesis have been diverse, they helped to identify and tackle computational challenges that affect computational ocean models. These contributions resulted in four peer-reviewed publications and many outreach activities. 
Moreover, the research outcomes have reached NEMO and EC-Earth consortium codes, having already helped model users to save resources and time. These contributions not only have significantly improved the computational performance of the NEMO model but have surpassed the original scope of the thesis and would be easily transferable to other computational models.
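As a rough illustration of why the choice of domain decomposition matters, the sketch below enumerates the 2D factorizations of a given process count and keeps the one with the smallest halo perimeter per subdomain, a simple proxy for communication cost. This is not NEMO's actual selection algorithm (which, per the thesis, weighs additional factors), and all sizes are hypothetical.

```cpp
// Pick a 2D process decomposition px x py that minimizes the halo perimeter
// of each subdomain (illustrative heuristic only).
#include <cstdio>
#include <limits>

int main() {
    const int nx = 1442, ny = 1021;  // hypothetical global grid size
    const int nprocs = 512;          // MPI processes available

    int best_px = 1, best_py = nprocs;
    long best_halo = std::numeric_limits<long>::max();

    for (int px = 1; px <= nprocs; ++px) {
        if (nprocs % px != 0) continue;
        const int py = nprocs / px;
        const long sub_x = (nx + px - 1) / px;   // subdomain width (ceiling division)
        const long sub_y = (ny + py - 1) / py;   // subdomain height (ceiling division)
        const long halo = 2 * (sub_x + sub_y);   // points exchanged per subdomain
        if (halo < best_halo) { best_halo = halo; best_px = px; best_py = py; }
    }
    std::printf("Chosen decomposition: %d x %d (halo ~ %ld points per subdomain)\n",
                best_px, best_py, best_halo);
    return 0;
}
```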