Academic literature on the topic 'Computing clusters'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Computing clusters.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Dissertations / Theses on the topic "Computing clusters"

1. Shum, Kam Hong. "Adaptive parallelism for computing on heterogeneous clusters." Thesis, University of Cambridge, 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.627563.

2. Aji, Ashwin M. "Programming High-Performance Clusters with Heterogeneous Computing Devices." Diss., Virginia Tech, 2015. http://hdl.handle.net/10919/52366.

Abstract:
Today's high-performance computing (HPC) clusters are seeing an increase in the adoption of accelerators like GPUs, FPGAs and co-processors, leading to heterogeneity in the computation and memory subsystems. To program such systems, application developers typically employ a hybrid programming model of MPI across the compute nodes in the cluster and an accelerator-specific library (e.g., CUDA, OpenCL, OpenMP, OpenACC) across the accelerator devices within each compute node. Such explicit management of disjointed computation and memory resources leads to reduced productivity and performance. This dissertation focuses on designing, implementing and evaluating a runtime system for HPC clusters with heterogeneous computing devices. This work also explores extending existing programming models to make use of our runtime system for easier code modernization of existing applications. Specifically, we present MPI-ACC, an extension to the popular MPI programming model and runtime system for efficient data movement and automatic task mapping across the CPUs and accelerators within a cluster, and discuss the lessons learned. MPI-ACC's task-mapping runtime subsystem performs fast and automatic device selection for a given task. MPI-ACC's data-movement subsystem includes careful optimizations for end-to-end communication among CPUs and accelerators, which are seamlessly leveraged by the application developers. MPI-ACC provides a familiar, flexible and natural interface for programmers to choose the right computation or communication targets, while its runtime system achieves efficient cluster utilization.
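To make the hybrid model described above concrete, here is a minimal sketch of the conventional MPI-plus-CUDA staging pattern that a runtime such as MPI-ACC is meant to hide: device data has to be copied into a host buffer before it can cross the network, and back again on the receiver. This is illustrative C only, not MPI-ACC's actual interface; the buffer size and the two-rank exchange are assumptions made for the example.

    /* Illustrative only: conventional MPI + CUDA staging between two ranks.
     * MPI-ACC's goal is to let the device buffer be passed to MPI directly. */
    #include <mpi.h>
    #include <cuda_runtime.h>
    #include <stdlib.h>

    #define N 1024   /* assumed message length */

    int main(int argc, char **argv) {
        int rank = 0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double *host_buf = (double *)malloc(N * sizeof(double));
        double *dev_buf = NULL;
        cudaMalloc((void **)&dev_buf, N * sizeof(double));

        if (rank == 0) {
            /* ...kernels fill dev_buf on the GPU... */
            /* Explicit staging: device -> host, then MPI on the host copy. */
            cudaMemcpy(host_buf, dev_buf, N * sizeof(double), cudaMemcpyDeviceToHost);
            MPI_Send(host_buf, N, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(host_buf, N, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            cudaMemcpy(dev_buf, host_buf, N * sizeof(double), cudaMemcpyHostToDevice);
            /* ...kernels consume dev_buf on the GPU... */
        }

        cudaFree(dev_buf);
        free(host_buf);
        MPI_Finalize();
        return 0;
    }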
3. Chow, Chi-yin Edward (周志賢). "Adaptive recovery with hierarchical checkpointing on workstation clusters." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1999. http://hub.hku.hk/bib/B29812914.

4. Chow, Chi-yin Edward. "Adaptive recovery with hierarchical checkpointing on workstation clusters." Hong Kong: University of Hong Kong, 1999. http://sunzi.lib.hku.hk/hkuto/record.jsp?B20792700.

5. Melas, Panagiotis. "The performance evaluation of workstation clusters." Thesis, University of Southampton, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.326395.

6. Ribeiro, Tiago Filipe Rodrigues. "Developing and evaluating clOpenCL applications for heterogeneous clusters." Master's thesis, Instituto Politécnico de Bragança, Escola Superior de Tecnologia e Gestão, 2012. http://hdl.handle.net/10198/7948.

Abstract:
In the last few years, the processing capabilities of computing systems have increased significantly, changing from single-core to multi-core and even many-core systems. Accompanying this evolution, local networks have also become faster, with multi-gigabit technologies like Infiniband, Myrinet and 10G Ethernet. Parallel/distributed programming tools and standards, like POSIX Threads, OpenMP and MPI, have helped to explore these technologies and have been frequently combined, giving rise to Hybrid Programming Models. Recently, co-processors like GPUs and FPGAs started to be used as accelerators, requiring specialized frameworks (like CUDA for NVIDIA GPUs). Faced with so much heterogeneity, the industry formulated the OpenCL specification as a standard for exploiting heterogeneous systems. However, in the context of cluster computing, one problem surfaces: OpenCL only enables a developer to use the devices that are present in the local machine. With many processing devices scattered across cluster nodes (CPUs, GPUs and other co-processors), it became important to enable software developers to take full advantage of the cluster's entire device set. This dissertation demonstrates and evaluates an OpenCL extension, named clOpenCL, which supports the simple deployment and efficient running of OpenCL-based parallel applications that may span several cluster nodes, thus expanding the original single-node OpenCL model. The main contributions are that clOpenCL i) offers a transparent approach to the porting of traditional OpenCL applications to cluster environments and ii) provides significant performance increases over classical (non-)hybrid parallel approaches.
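The single-node limitation mentioned in the abstract is easy to see in ordinary OpenCL code: device discovery only ever reports the platforms and devices of the local machine. The sketch below is standard OpenCL, not clOpenCL itself; clOpenCL's contribution is that devices hosted on other cluster nodes would appear in this same enumeration.

    /* Illustrative only: plain OpenCL discovery, which lists local devices only. */
    #include <stdio.h>
    #include <CL/cl.h>

    int main(void) {
        cl_platform_id platforms[8];
        cl_uint nplat = 0;
        if (clGetPlatformIDs(8, platforms, &nplat) != CL_SUCCESS)
            return 1;
        if (nplat > 8) nplat = 8;

        for (cl_uint p = 0; p < nplat; ++p) {
            cl_device_id devices[16];
            cl_uint ndev = 0;
            if (clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL,
                               16, devices, &ndev) != CL_SUCCESS)
                continue;                /* e.g. no devices on this platform */
            if (ndev > 16) ndev = 16;

            for (cl_uint d = 0; d < ndev; ++d) {
                char name[256] = "unknown";
                clGetDeviceInfo(devices[d], CL_DEVICE_NAME, sizeof(name), name, NULL);
                /* Only devices physically attached to this node are listed. */
                printf("platform %u, device %u: %s\n", p, d, name);
            }
        }
        return 0;
    }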
7. Rough, Justin. "A Platform for reliable computing on clusters using group communications." Deakin University, School of Computing and Mathematics, 2001. http://tux.lib.deakin.edu.au./adt-VDU/public/adt-VDU20060412.141015.

Abstract:
Shared clusters represent an excellent platform for the execution of parallel applications given their low price/performance ratio and the presence of cluster infrastructure in many organisations. The focus of recent research efforts is on parallelism management, transport and efficient access to resources, and making clusters easy to use. In this thesis, we examine reliable parallel computing on clusters. The aim of this research is to demonstrate the feasibility of developing an operating system facility providing transport fault tolerance using existing, enhanced and newly built operating system services for supporting parallel applications. In particular, we use existing process duplication and process migration services, and synthesise a group communications facility for use in a transparent checkpointing facility. This research is carried out using the methods of experimental computer science. To provide a foundation for the synthesis of the group communications and checkpointing facilities, we survey and review related work in both fields. For group communications, we examine the V Distributed System, the x-kernel and Psync, the ISIS Toolkit, and Horus. We identify a need for services that consider the placement of processes on computers in the cluster. For checkpointing, we examine Manetho, KeyKOS, libckpt, and Diskless Checkpointing. We observe the use of remote computer memories for storing checkpoints, and the use of copy-on-write mechanisms to reduce the time to create a checkpoint of a process. We propose a group communications facility providing two sets of services: user-oriented services and system-oriented services. User-oriented services provide transparency and target applications. System-oriented services supplement the user-oriented services in supporting other operating system services and do not provide transparency. Additional flexibility is achieved by providing delivery and ordering semantics independently. An operating system facility providing transparent checkpointing is synthesised using coordinated checkpointing. To ensure a consistent set of checkpoints is generated by the facility, instead of blindly blocking the processes of a parallel application, only non-deterministic events are blocked. This allows the processes of the parallel application to continue execution during the checkpoint operation. Checkpoints are created by adapting process duplication mechanisms, and checkpoint data is transferred to remote computer memories and disk for storage using the mechanisms of process migration. The services of the group communications facility are used to coordinate the checkpoint operation, and to transport checkpoint data to remote computer memories and disk. Both the group communications facility and the checkpointing facility have been implemented in the GENESIS cluster operating system and provide proof-of-concept. GENESIS uses a microkernel and client-server based operating system architecture, and is demonstrated to provide an appropriate environment for the development of these facilities. We design a number of experiments to test the performance of both the group communications facility and the checkpointing facility, and to provide proof-of-performance. We present our approach to testing, the challenges raised in testing the facilities, and how we overcame them. For group communications, we examine the performance of a number of delivery semantics. Good speed-ups are observed, and system-oriented group communication services are shown to provide significant performance advantages over user-oriented semantics in the presence of packet loss. For checkpointing, we examine the scalability of the facility given different levels of resource usage and a variable number of computers. Low overheads are observed for checkpointing a parallel application. This research makes clear that a microkernel, client-server based cluster operating system provides an ideal environment for the development of a high-performance group communications facility and a transparent checkpointing facility, forming a platform for reliable parallel computing on clusters.
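As a rough illustration of the coordinated checkpointing discussed above, the sketch below shows the simplest blocking variant of the protocol, written with generic MPI calls rather than GENESIS's group communication and process duplication services; the thesis refines this by blocking only non-deterministic events so that application processes keep running during the checkpoint. The file name scheme and the stand-in state array are hypothetical.

    /* Illustrative only: a bare-bones, blocking coordinated checkpoint round. */
    #include <mpi.h>
    #include <stdio.h>

    /* Write this process's state to a per-rank file (hypothetical layout). */
    static void take_checkpoint(int rank, const double *state, int n) {
        char path[64];
        snprintf(path, sizeof(path), "ckpt_rank%d.bin", rank);
        FILE *f = fopen(path, "wb");
        if (f) {
            fwrite(state, sizeof(double), n, f);
            fclose(f);
        }
    }

    int main(int argc, char **argv) {
        int rank = 0;
        double state[256] = {0};   /* stand-in for real application state */

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* The coordinator (rank 0) announces a checkpoint; every process saves
         * its state before any application message crosses the cut, which is
         * what keeps the resulting set of checkpoints mutually consistent. */
        int do_ckpt = (rank == 0) ? 1 : 0;
        MPI_Bcast(&do_ckpt, 1, MPI_INT, 0, MPI_COMM_WORLD);

        if (do_ckpt) {
            take_checkpoint(rank, state, 256);
            MPI_Barrier(MPI_COMM_WORLD);   /* wait until all checkpoints exist */
        }

        MPI_Finalize();
        return 0;
    }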
8. Daillidis, Christos. "Establishing Linux Clusters for high-performance computing (HPC) at NPS." Thesis, Monterey, Calif.: Naval Postgraduate School; Springfield, Va.: available from National Technical Information Service, 2004. http://library.nps.navy.mil/uhtbin/hyperion/04Sept%5FDaillidis.pdf.

9. Nakad, Zahi Samir. "High Performance Applications on Reconfigurable Clusters." Thesis, Virginia Tech, 2000. http://hdl.handle.net/10919/35682.

Abstract:
Many problems faced in the engineering world are computationally intensive; filtering with FIR (Finite Impulse Response) filters is one such example. This thesis discusses the implementation of a fast, reconfigurable, and scalable FIR digital filter. Constant-coefficient multipliers and a fast FIFO implementation are also discussed in connection with the FIR filter. The filter is used in two of its structures: the direct form and the lattice structure. The thesis describes several configurations that can be created with the different components available and reports the testing results of these configurations.
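For readers unfamiliar with the structure named in the abstract, a direct-form FIR filter computes y[n] = sum_k b[k] * x[n-k]. The sketch below is a plain C version of that definition; the thesis itself maps the structure onto reconfigurable (FPGA) hardware, and the 4-tap moving-average coefficients here are only an example.

    /* Illustrative only: software direct-form FIR, y[n] = sum_k b[k] * x[n-k]. */
    #include <stdio.h>

    static void fir_direct(const double *b, int ntaps,
                           const double *x, double *y, int n) {
        for (int i = 0; i < n; ++i) {
            double acc = 0.0;
            /* Dot product of the taps with the most recent input samples;
             * samples before the start of the signal are treated as zero. */
            for (int k = 0; k < ntaps && k <= i; ++k)
                acc += b[k] * x[i - k];
            y[i] = acc;
        }
    }

    int main(void) {
        const double b[4] = {0.25, 0.25, 0.25, 0.25};   /* 4-tap moving average */
        const double x[8] = {1, 2, 3, 4, 5, 6, 7, 8};
        double y[8];

        fir_direct(b, 4, x, y, 8);
        for (int i = 0; i < 8; ++i)
            printf("y[%d] = %.2f\n", i, y[i]);
        return 0;
    }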
10. Rafique, Muhammad Mustafa. "An Adaptive Framework for Managing Heterogeneous Many-Core Clusters." Diss., Virginia Tech, 2011. http://hdl.handle.net/10919/29119.

Abstract:
The computing needs and the input and result datasets of modern scientific and enterprise applications are growing exponentially. To support such applications, High-Performance Computing (HPC) systems need to employ thousands of cores and innovative data management. At the same time, an emerging trend in designing HPC systems is to leverage specialized asymmetric multicores, such as IBM Cell and AMD Fusion APUs, and commodity computational accelerators, such as programmable GPUs, which exhibit an excellent price-to-performance ratio as well as the much needed high energy efficiency. While such accelerators have been studied in detail as stand-alone computational engines, integrating the accelerators into large-scale distributed systems with heterogeneous computing resources for data-intensive computing presents unique challenges and trade-offs. Traditional programming and resource management techniques cannot be directly applied to many-core accelerators in heterogeneous distributed settings, given the complex and custom instruction set architectures, memory hierarchies and I/O characteristics of different accelerators. In this dissertation, we explore the design space of using commodity accelerators, specifically IBM Cell and programmable GPUs, in distributed settings for data-intensive computing and propose an adaptive framework for programming and managing heterogeneous clusters. The proposed framework provides a MapReduce-based extended programming model for heterogeneous clusters, which distributes tasks between asymmetric compute nodes by considering workload characteristics and the capabilities of individual compute nodes. The framework provides efficient data prefetching techniques that leverage general-purpose cores to stage the input data in the private memories of the specialized cores. We also explore the use of an advanced layered-architecture based software engineering approach and provide mixin-layers based reusable software components to enable easy and quick deployment of heterogeneous clusters. The framework also provides multiple resource management and scheduling policies under different constraints, e.g., energy-aware and QoS-aware, to support executing concurrent applications on multi-tenant heterogeneous clusters. When applied to representative applications and benchmarks, our framework yields significantly improved performance in terms of programming efficiency and optimal resource management as compared to conventional, hand-tuned approaches to programming and managing accelerator-based heterogeneous clusters.
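As one concrete picture of the capability-aware distribution described above, the sketch below splits a batch of map tasks in proportion to a per-node capability weight. It is a naive, hypothetical scheme (the node names and weights are invented), not the framework's actual scheduler, which also accounts for workload characteristics and runtime constraints.

    /* Illustrative only: split map tasks in proportion to node capability. */
    #include <stdio.h>

    int main(void) {
        const int nnodes = 3;
        const char  *nodes[]  = {"cpu-node", "cell-node", "gpu-node"};
        const double weight[] = {1.0, 2.5, 4.0};   /* relative throughput (made up) */
        const int total_tasks = 600;

        double wsum = 0.0;
        for (int i = 0; i < nnodes; ++i)
            wsum += weight[i];

        int assigned = 0;
        for (int i = 0; i < nnodes; ++i) {
            /* Last node takes the remainder so every task is assigned once. */
            int share = (i == nnodes - 1)
                            ? total_tasks - assigned
                            : (int)(total_tasks * weight[i] / wsum);
            assigned += share;
            printf("%s: %d map tasks\n", nodes[i], share);
        }
        return 0;
    }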