Academic literature on the topic 'Shared-memory parallel programming'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Shared-memory parallel programming.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Journal articles on the topic "Shared-memory parallel programming"

1

Beck, B. "Shared-memory parallel programming in C++." IEEE Software 7, no. 4 (July 1990): 38–48. http://dx.doi.org/10.1109/52.56449.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Bonetta, Daniele, Luca Salucci, Stefan Marr, and Walter Binder. "GEMs: shared-memory parallel programming for Node.js." ACM SIGPLAN Notices 51, no. 10 (December 5, 2016): 531–47. http://dx.doi.org/10.1145/3022671.2984039.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Deshpande, Ashish, and Martin Schultz. "Efficient Parallel Programming with Linda." Scientific Programming 1, no. 2 (1992): 177–83. http://dx.doi.org/10.1155/1992/829092.

Full text
Abstract:
Linda is a coordination language invented by David Gelernter at Yale University, which, when combined with a computation language (like C), yields a high-level parallel programming language for MIMD machines. Linda is based on a virtual shared associative memory containing objects called tuples. Skeptics have long claimed that Linda programs could not be efficient on distributed memory architectures. In this paper, we address this claim by discussing C-Linda's performance in solving a particular scientific computing problem, the shallow water equations, and make comparisons with alternatives available on various shared and distributed memory parallel machines.
APA, Harvard, Vancouver, ISO, and other styles
4

Quammen, Cory. "Introduction to programming shared-memory and distributed-memory parallel computers." XRDS: Crossroads, The ACM Magazine for Students 8, no. 3 (April 2002): 16–22. http://dx.doi.org/10.1145/567162.567167.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Quammen, Cory. "Introduction to programming shared-memory and distributed-memory parallel computers." XRDS: Crossroads, The ACM Magazine for Students 12, no. 1 (October 2005): 2. http://dx.doi.org/10.1145/1144382.1144384.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Keane, J. A., A. J. Grant, and M. Q. Xu. "Comparing distributed memory and virtual shared memory parallel programming models." Future Generation Computer Systems 11, no. 2 (March 1995): 233–43. http://dx.doi.org/10.1016/0167-739x(94)00065-m.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Redondo, J. L., I. García, and P. M. Ortigosa. "Parallel evolutionary algorithms based on shared memory programming approaches." Journal of Supercomputing 58, no. 2 (December 18, 2009): 270–79. http://dx.doi.org/10.1007/s11227-009-0374-6.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Di Martino, Beniamino, Sergio Briguglio, Gregorio Vlad, and Giuliana Fogaccia. "Workload Decomposition Strategies for Shared Memory Parallel Systems with OpenMP." Scientific Programming 9, no. 2-3 (2001): 109–22. http://dx.doi.org/10.1155/2001/891073.

Full text
Abstract:
A crucial issue in parallel programming (for both distributed and shared memory architectures) is work decomposition. The decomposition task can be accomplished without large programming effort by using high-level parallel programming languages such as OpenMP, but particular care must still be paid to achieving performance goals. In this paper we introduce and compare two decomposition strategies, in the framework of shared memory systems, as applied to a case-study particle-in-cell application. A number of different implementations of them, based on the OpenMP language, are discussed with regard to time efficiency, memory occupancy, and program restructuring effort.
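As a minimal illustration of the directive-based work decomposition discussed in this abstract, the following C sketch parallelizes a particle-update loop with a single OpenMP directive. The particle arrays and update rule are invented for the example and are not taken from the paper; compile with gcc -fopenmp.

/* Illustrative only: loop-level (particle) decomposition with OpenMP.
 * Each thread updates a disjoint block of the particle arrays. */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define NPART 1000000

int main(void) {
    double *x = malloc(NPART * sizeof *x);
    double *v = malloc(NPART * sizeof *v);
    const double dt = 1.0e-3;

    for (long i = 0; i < NPART; i++) { x[i] = 0.0; v[i] = (double)i / NPART; }

    /* Particle decomposition: loop iterations are divided among threads. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < NPART; i++)
        x[i] += v[i] * dt;

    printf("x[NPART-1] = %f (threads: %d)\n", x[NPART - 1], omp_get_max_threads());
    free(x); free(v);
    return 0;
}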
APA, Harvard, Vancouver, ISO, and other styles
9

Alaghband, Gita, and Harry F. Jordan. "Overview of the Force Scientific Parallel Language." Scientific Programming 3, no. 1 (1994): 33–47. http://dx.doi.org/10.1155/1994/632497.

Full text
Abstract:
The Force parallel programming language designed for large-scale shared-memory multiprocessors is presented. The language provides a number of parallel constructs as extensions to the ordinary Fortran language and is implemented as a two-level macro preprocessor to support portability across shared memory multiprocessors. The global parallelism model on which the Force is based provides a powerful parallel language. The parallel constructs, generic synchronization, and freedom from process management supported by the Force have resulted in structured parallel programs that are ported to the many multiprocessors on which the Force is implemented. Two new parallel constructs for looping and functional decomposition are discussed. Several programming examples illustrating parallel programming approaches using the Force are also presented.
APA, Harvard, Vancouver, ISO, and other styles
10

Warren, Karen H. "PDDP, A Data Parallel Programming Model." Scientific Programming 5, no. 4 (1996): 319–27. http://dx.doi.org/10.1155/1996/857815.

Full text
Abstract:
PDDP, the parallel data distribution preprocessor, is a data parallel programming model for distributed memory parallel computers. PDDP implements High Performance Fortran-compatible data distribution directives and parallelism expressed by the use of Fortran 90 array syntax, the FORALL statement, and the WHERE construct. Distributed data objects belong to a global name space; other data objects are treated as local and replicated on each processor. PDDP allows the user to program in a shared memory style and generates code that is portable to a variety of parallel machines. For interprocessor communication, PDDP uses the fastest communication primitives on each platform.
APA, Harvard, Vancouver, ISO, and other styles
More sources

Dissertations / Theses on the topic "Shared-memory parallel programming"

1

Ravela, Srikar Chowdary. "Comparison of Shared memory based parallel programming models." Thesis, Blekinge Tekniska Högskola, Sektionen för datavetenskap och kommunikation, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-3384.

Full text
Abstract:
Parallel programming models are a challenging and emerging topic in the parallel computing era. These models allow a developer to port a sequential application onto a platform with a larger number of processors so that the problem can be solved more easily. Adapting applications in this way is influenced by the type of application, the type of platform, and many other factors. Several parallel programming models have been developed, and the two main variants are shared memory based and distributed memory based models. The recognition of computing applications with immense computing requirements exposed the need for efficient programming models that bridge the gap between the hardware's ability to perform computations and the software's ability to deliver that performance [25][9]. A better programming model is therefore needed that facilitates easy development while still delivering high performance. To answer this challenge, this thesis compares four different shared memory based parallel programming models with respect to the development time of an application under each model and the performance achieved by that application in the same model. The programming models are evaluated by considering data parallel applications and verifying their ability to support data parallelism with respect to the development time of those applications. The data parallel applications are borrowed from the Dense Matrix dwarfs; the dwarfs used are matrix-matrix multiplication, Jacobi iteration, and Laplace heat distribution. The experimental method consists of selecting three data parallel benchmarks and developing them under the four shared memory based parallel programming models considered for the evaluation. The performance of those applications under each programming model is recorded, and the results are then used to compare the parallel programming models analytically. The results show that, by sacrificing development time, better performance is achieved for the chosen data parallel applications developed in Pthreads. Conversely, by sacrificing a little performance, data parallel applications are extremely easy to develop in task based parallel programming models. The directive models are moderate from both perspectives and are rated between the tasking models and the threading models.
From this study it is clear that the threading model, Pthreads, is the dominant programming model in terms of supporting high speedups for two of the three dwarfs, while the tasking models are dominant in development time and in reducing the number of errors, supporting high growth in speedup for applications without communication and less growth in self-relative speedup for applications involving communication. The performance degradation of the tasking models for communication-based problems arises because task based models are designed to execute tasks in parallel without interruptions or preemptions during their computations; introducing communication violates that purpose and thereby reduces performance. The directive model, OpenMP, is moderate in both aspects and stands between these models. In general, the directive and tasking models offer better speedup than the other models for task based problems built on the divide and conquer strategy. For data parallelism, however, the speedup growth they achieve is low (i.e., they are less scalable for data parallel applications), although their execution times are comparable to those of the threading models. Their development times for data parallel applications are also considerably lower because of the ease of development these models support, requiring fewer functional routines to parallelize the applications. This thesis is concerned with comparing shared memory based parallel programming models in terms of speedup, and it can serve as a guide for programmers developing applications under such models. We suggest that this work can be extended in two ways: from the developer's perspective, and as a cross-referential study of the parallel programming models. The former can be done by having a different programmer carry out a similar study and comparing it with this one; the latter by including multiple data points in the same programming model or by using a different set of parallel programming models.
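To make the development-effort contrast described above concrete, the small C sketch below sums an array once with explicit Pthreads workers and once with a single OpenMP reduction directive; the workload is invented for illustration and is not one of the thesis benchmarks. Compile with gcc -fopenmp -pthread.

/* Illustrative contrast between the threading and directive models:
 * explicit chunking with Pthreads versus one OpenMP pragma. */
#include <stdio.h>
#include <pthread.h>
#include <omp.h>

#define N 1000000
#define NTHREADS 4

static double a[N];

struct chunk { long lo, hi; double partial; };

static void *sum_chunk(void *arg) {          /* Pthreads: explicit worker */
    struct chunk *c = arg;
    c->partial = 0.0;
    for (long i = c->lo; i < c->hi; i++) c->partial += a[i];
    return NULL;
}

int main(void) {
    for (long i = 0; i < N; i++) a[i] = 1.0;

    /* Threading model: create, partition, join, and combine by hand. */
    pthread_t tid[NTHREADS];
    struct chunk c[NTHREADS];
    double sum_pt = 0.0;
    for (int t = 0; t < NTHREADS; t++) {
        c[t].lo = (long)t * N / NTHREADS;
        c[t].hi = (long)(t + 1) * N / NTHREADS;
        pthread_create(&tid[t], NULL, sum_chunk, &c[t]);
    }
    for (int t = 0; t < NTHREADS; t++) {
        pthread_join(tid[t], NULL);
        sum_pt += c[t].partial;
    }

    /* Directive model: one pragma expresses the same decomposition. */
    double sum_omp = 0.0;
    #pragma omp parallel for reduction(+ : sum_omp)
    for (long i = 0; i < N; i++) sum_omp += a[i];

    printf("pthreads: %.0f  openmp: %.0f\n", sum_pt, sum_omp);
    return 0;
}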
APA, Harvard, Vancouver, ISO, and other styles
2

Schneider, Scott. "Shared Memory Abstractions for Heterogeneous Multicore Processors." Diss., Virginia Tech, 2010. http://hdl.handle.net/10919/30240.

Full text
Abstract:
We are now seeing diminishing returns from classic single-core processor designs, yet the number of transistors available for a processor is still increasing. Processor architects are therefore experimenting with a variety of multicore processor designs. Heterogeneous multicore processors with Explicitly Managed Memory (EMM) hierarchies are one such experimental design which has the potential for high performance, but at the cost of great programmer effort. EMM processors have cores that are divorced from the normal memory hierarchy, thus the onus is on the programmer to manage locality and parallelism. This dissertation presents the Cellgen source-to-source compiler which moves some of this complexity back into the compiler. Cellgen offers a directive-based programming model with semantics similar to OpenMP for the Cell Broadband Engine, a general-purpose processor with EMM. The compiler implicitly handles locality and parallelism, schedules memory transfers for data parallel regions of code, and provides performance predictions which can be leveraged to make scheduling decisions. We compare this approach to using a software cache, to a different programming model which is task based with explicit data transfers, and to programming the Cell directly using the native SDK. We also present a case study which uses the Cellgen compiler in a comparison across multiple kinds of multicore architectures: heterogeneous, homogeneous and radically data-parallel graphics processors.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
3

Stoker, Michael Allan. "The exploitation of parallelism on shared memory multiprocessors." Thesis, University of Newcastle Upon Tyne, 1990. http://hdl.handle.net/10443/2000.

Full text
Abstract:
With the arrival of many general purpose shared memory multiple processor (multiprocessor) computers into the commercial arena during the mid-1980's, a rift has opened between the raw processing power offered by the emerging hardware and the relative inability of its operating software to effectively deliver this power to potential users. This rift stems from the fact that, currently, no computational model with the capability to elegantly express parallel activity is mature enough to be universally accepted, and used as the basis for programming languages to exploit the parallelism that multiprocessors offer. To add to this, there is a lack of software tools to assist programmers in the processes of designing and debugging parallel programs. Although much research has been done in the field of programming languages, no undisputed candidate for the most appropriate language for programming shared memory multiprocessors has yet been found. This thesis examines why this state of affairs has arisen and proposes programming language constructs, together with a programming methodology and environment, to close the ever widening hardware to software gap. The novel programming constructs described in this thesis are intended for use in imperative languages even though they make use of the synchronisation inherent in the dataflow model by using the semantics of single assignment when operating on shared data, so giving rise to the term shared values. As there are several distinct parallel programming paradigms, matching flavours of shared value are developed to permit the concise expression of these paradigms.
APA, Harvard, Vancouver, ISO, and other styles
4

Karlbom, David. "A Performance Evaluation of MPI Shared Memory Programming." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-188676.

Full text
Abstract:
The thesis investigates the Message Passing Interface (MPI) support for shared memory programming on modern hardware architecture with multiple Non-Uniform Memory Access (NUMA) domains. We investigate its performance in two case studies: the matrix-matrix multiplication and Conway’s game of life. We compare MPI shared memory performance in terms of execution time and memory consumption with the performance of implementations using OpenMP and MPI point-to-point communication, also called "MPI two-sided". We perform strong scaling tests in both test cases. We observe that the MPI two-sided implementation is 21% and 18% faster than the MPI shared and OpenMP implementations, respectively, in the matrix-matrix multiplication when using 32 processes. MPI shared uses less memory space: when compared to MPI two-sided, MPI shared uses 45% less memory. In Conway’s game of life, we find that the MPI two-sided implementation is 10% and 82% faster than the MPI shared and OpenMP implementations, respectively, when using 32 processes. We also observe that not mapping virtual memory to a specific NUMA domain can lead to an increase in execution time of 64% when using 32 processes. The use of MPI shared is viable for intranode communication on modern hardware architecture with multiple NUMA domains.
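For orientation, the following minimal C sketch shows the MPI-3 shared memory window mechanism ("MPI shared") that the thesis evaluates: ranks on one node allocate a shared window and access each other's segments with plain loads and stores. The array size and values are illustrative only and do not reproduce the thesis test cases; build with mpicc and run with mpirun.

/* Minimal MPI-3 shared memory window sketch (illustrative values). */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    MPI_Comm nodecomm;               /* communicator of ranks sharing a node */
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &nodecomm);

    int rank, nranks;
    MPI_Comm_rank(nodecomm, &rank);
    MPI_Comm_size(nodecomm, &nranks);

    /* Each rank contributes a 1024-double segment of one shared window. */
    MPI_Win win;
    double *mine;
    MPI_Win_allocate_shared(1024 * sizeof(double), sizeof(double),
                            MPI_INFO_NULL, nodecomm, &mine, &win);

    MPI_Win_lock_all(0, win);
    for (int i = 0; i < 1024; i++) mine[i] = rank;   /* store into own segment */
    MPI_Win_sync(win);
    MPI_Barrier(nodecomm);
    MPI_Win_sync(win);

    /* Directly read the neighbour's segment through the shared window. */
    MPI_Aint size; int disp;
    double *neighbour;
    MPI_Win_shared_query(win, (rank + 1) % nranks, &size, &disp, &neighbour);
    printf("rank %d sees neighbour value %.0f\n", rank, neighbour[0]);

    MPI_Win_unlock_all(win);
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}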
APA, Harvard, Vancouver, ISO, and other styles
5

Atukorala, G. S. "Porting a distributed operating system to a shared memory parallel computer." Thesis, University of Bath, 1990. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.256756.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Almas, Luís Pedro Parreira Galito Pimenta. "DSM-PM2 adequacy for distributed constraint programming." Master's thesis, Universidade de Évora, 2007. http://hdl.handle.net/10174/16454.

Full text
Abstract:
High-speed networks and rapidly improving microprocessor performance make networks of workstations an increasingly appealing vehicle for parallel computing. No special hardware is required to use this solution as a parallel computer, and the resulting system can be easily maintained, extended and upgraded. Constraint programming is a programming paradigm where relations between variables can be stated in the form of constraints. Constraints differ from the common primitives of other programming languages in that they do not specify a step or sequence of steps to execute but rather the properties of a solution to be found. Constraint programming libraries are useful as they do not require developers to acquire skills for a new language, providing instead declarative programming tools for use within conventional systems. Distributed Shared Memory presents itself as a tool for parallel applications in which individual shared data items can be accessed directly. In systems that support Distributed Shared Memory, data moves between the main memories of different nodes. Distributed Shared Memory spares the programmer the concerns of message passing, where considerable effort would be needed to control the behaviour of the distributed system. We propose an architecture aimed at Distributed Constraint Programming Solving that relies on propagation and local search over a CC-NUMA distributed environment using Distributed Shared Memory.
The main objectives of this thesis can be summarized as: develop a constraint solving system, based on the AJACS [3] system, in the C language, the native language of the parallel library used in the experiments, PM2 [4]; and adapt, test, and evaluate the suitability of the developed constraint solving system for a distributed environment by using DSM-PM2 [1] over a CC-NUMA architecture.
APA, Harvard, Vancouver, ISO, and other styles
7

Cordeiro, Silvio Ricardo. "Code profiling and optimization in transactional memory systems." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2014. http://hdl.handle.net/10183/97866.

Full text
Abstract:
Transactional Memory has shown itself to be a promising paradigm for the implementation of shared-memory concurrent applications that eschew a lock-based model of data synchronization. Rather than conditioning exclusive access on the value of a lock that is shared across concurrent threads, Transactional Memory attempts to execute critical sections optimistically, rolling back the modifications in the event of a data access conflict. However, while the lock-based approach has acquired a significant body of debugging, profiling and automated optimization tools (as one of the oldest and most researched synchronization techniques), the field of Transactional Memory is still comparably recent, and programmers are usually tasked with an unguided manual tuning of their transactional applications when facing efficiency problems. We propose a system in which code profiling in a simulated hardware implementation of Transactional Memory is used to characterize a transactional application, which forms the basis for the automated tuning of the underlying speculative system for the efficient execution of that particular application. We also propose a profile-guided approach to the scheduling of threads in a software-based implementation of Transactional Memory, using collected data to predict the likelihood of conflicts and determine what thread to schedule based on this prediction. We present the results achieved under both designs.
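As background for readers new to the paradigm, the sketch below expresses a critical section with GCC's transactional memory language extension (compile with gcc -fgnu-tm -pthread): the transaction runs optimistically and is rolled back and retried on conflict instead of serializing on a lock. The counter workload is invented for illustration and is unrelated to the profiling and scheduling system proposed in the dissertation.

/* Minimal transactional-memory-style critical section (GNU TM extension). */
#include <stdio.h>
#include <pthread.h>

#define NTHREADS 4
#define ITERS    100000

static long counter = 0;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < ITERS; i++) {
        /* Executed as an atomic transaction; conflicting accesses cause
         * roll-back and retry rather than blocking on a shared lock. */
        __transaction_atomic {
            counter++;
        }
    }
    return NULL;
}

int main(void) {
    pthread_t tid[NTHREADS];
    for (int t = 0; t < NTHREADS; t++) pthread_create(&tid[t], NULL, worker, NULL);
    for (int t = 0; t < NTHREADS; t++) pthread_join(tid[t], NULL);
    printf("counter = %ld (expected %d)\n", counter, NTHREADS * ITERS);
    return 0;
}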
APA, Harvard, Vancouver, ISO, and other styles
8

Farooq, Mohammad Habibur Rahman & Qaisar. "Performance Prediction of Parallel Programs in a Linux Environment." Thesis, Blekinge Tekniska Högskola, Sektionen för datavetenskap och kommunikation, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-1143.

Full text
Abstract:
Context. Today’s parallel systems are widely used in different computational tasks. Developing parallel programs to make maximum use of the computing power of parallel systems is tricky and efficient tuning of parallel programs is often very hard. Objectives. In this study we present a performance prediction and visualization tool named VPPB for a Linux environment, which had already been introduced by Broberg et al. [1] for a Solaris2.x environment. VPPB shows the predicted behavior of a multithreaded program using any number of processors and the behavior is shown on two different graphs. The prediction is based on a monitored uni-processor execution. Methods. An experimental evaluation was carried out to validate the prediction reliability of the developed tool. Results. Validation of prediction is conducted, using an Intel multiprocessor with 8 processors and PARSEC 2.0 benchmark suite application programs. The validation shows that the speed-up predictions are within +/-7% of a real execution. Conclusions. The experimentation of the VPPB tool showed that the prediction of VPPB is reliable and the overhead incurred in the application programs is low.
APA, Harvard, Vancouver, ISO, and other styles
9

Tillenius, Martin. "Scientific Computing on Multicore Architectures." Doctoral thesis, Uppsala universitet, Avdelningen för beräkningsvetenskap, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-221241.

Full text
Abstract:
Computer simulations are an indispensable tool for scientists to gain new insights about nature. Simulations of natural phenomena are usually large, and limited by the available computer resources. By using the computer resources more efficiently, larger and more detailed simulations can be performed, and more information can be extracted to help advance human knowledge. The topic of this thesis is how to make best use of modern computers for scientific computations. The challenge here is the high level of parallelism that is required to fully utilize the multicore processors in these systems. Starting from the basics, the primitives for synchronizing between threads are investigated. Hardware transactional memory is a new construct for this, which is evaluated for a new use of importance for scientific software: atomic updates of floating point values. The evaluation includes experiments on real hardware and comparisons against standard methods. Higher level programming models for shared memory parallelism are then considered. The state of the art for efficient use of multicore systems is dynamically scheduled task-based systems, where tasks can depend on data. In such systems, the software is divided up into many small tasks that are scheduled asynchronously according to their data dependencies. This enables a high level of parallelism, and avoids global barriers. A new system for managing task dependencies is developed in this thesis, based on data versioning. The system is implemented as a reusable software library, and shown to be as efficient or more efficient than other shared-memory task-based systems in experimental comparisons. The developed runtime system is then extended to distributed memory machines, and used for implementing a parallel version of a software for global climate simulations. By running the optimized and parallelized version on eight servers, an equally sized problem can be solved over 100 times faster than in the original sequential version. The parallel version also allowed significantly larger problems to be solved, previously unreachable due to memory constraints.
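The thesis develops its own data-versioning runtime; as a generic illustration of the same idea of tasks scheduled by data dependencies rather than global barriers, the sketch below uses the analogous OpenMP "task depend" mechanism. The two producer tasks and one consumer task are invented for this example; compile with gcc -fopenmp.

/* Dependency-driven tasking sketch: the consumer runs only after both
 * producers have completed, with no global barrier in between. */
#include <stdio.h>
#include <omp.h>

int main(void) {
    double a = 0.0, b = 0.0;

    #pragma omp parallel
    #pragma omp single
    {
        /* Producer task: writes a. */
        #pragma omp task depend(out: a)
        a = 1.0;

        /* Independent producer: writes b, may run concurrently with the first. */
        #pragma omp task depend(out: b)
        b = 2.0;

        /* Consumer task: starts once both in-dependencies are satisfied. */
        #pragma omp task depend(in: a, b)
        printf("a + b = %f\n", a + b);

        #pragma omp taskwait
    }
    return 0;
}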
APA, Harvard, Vancouver, ISO, and other styles
10

Bokhari, Saniyah S. "Parallel Solution of the Subset-sum Problem: An Empirical Study." The Ohio State University, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=osu1305898281.

Full text
APA, Harvard, Vancouver, ISO, and other styles
More sources

Books on the topic "Shared-memory parallel programming"

1

Mueller, Matthias S., Barbara M. Chapman, Bronis R. de Supinski, Allen D. Malony, and Michael Voss, eds. OpenMP Shared Memory Parallel Programming. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008. http://dx.doi.org/10.1007/978-3-540-68555-5.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Voss, Michael J., ed. OpenMP Shared Memory Parallel Programming. Berlin, Heidelberg: Springer Berlin Heidelberg, 2003. http://dx.doi.org/10.1007/3-540-45009-2.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Eigenmann, Rudolf, and Michael J. Voss, eds. OpenMP Shared Memory Parallel Programming. Berlin, Heidelberg: Springer Berlin Heidelberg, 2001. http://dx.doi.org/10.1007/3-540-44587-0.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Chapman, Barbara. Using OpenMP: Portable shared memory parallel programming. Cambridge, Mass: The MIT Press, 2008.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
5

Chapman, Barbara. Using OpenMP: Portable shared memory parallel programming. Cambridge, MA: The MIT Press, 2006.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
6

Chapman, Barbara M., ed. Shared Memory Parallel Programming with Open MP. Berlin, Heidelberg: Springer Berlin Heidelberg, 2005. http://dx.doi.org/10.1007/b105895.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Scalable parallel sparse LU factorization methods on shared memory multiprocessors. Konstanz: Hartung-Gorre, 2000.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
8

Chung, Ki-Sung. A parallel, virtual shared memory implementation of the architecture-independent programming language UNITY. Manchester: University of Manchester, 1995.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
9

1973-, Voss Michael J., ed. OpenMP shared memory parallel programming: International Workshop on OpenMP Applications and Tools, WOMPAT 2003, Toronto, Canada, June 26-27, 2003 : proceedings. New York: Springer, 2003.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
10

Musgrave, Jeffrey L. Shared direct memory access on the Explorer II-LX. [Washington, DC]: National Aeronautics and Space Administration, 1990.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
More sources

Book chapters on the topic "Shared-memory parallel programming"

1

Hoeflinger, Jay P., and Bronis R. de Supinski. "The OpenMP Memory Model." In OpenMP Shared Memory Parallel Programming, 167–77. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008. http://dx.doi.org/10.1007/978-3-540-68555-5_14.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Jamieson, Peter, and Angelos Bilas. "CableS : Thread Control and Memory System Extensions for Shared Virtual Memory Clusters." In OpenMP Shared Memory Parallel Programming, 170–84. Berlin, Heidelberg: Springer Berlin Heidelberg, 2001. http://dx.doi.org/10.1007/3-540-44587-0_15.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Aslot, Vishal, Max Domeika, Rudolf Eigenmann, Greg Gaertner, Wesley B. Jones, and Bodo Parady. "SPEComp: A New Benchmark Suite for Measuring Parallel Computer Performance." In OpenMP Shared Memory Parallel Programming, 1–10. Berlin, Heidelberg: Springer Berlin Heidelberg, 2001. http://dx.doi.org/10.1007/3-540-44587-0_1.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Nikolopoulos, Dimitrios S., and Eduard Ayguadé. "A Study of Implicit Data Distribution Methods for OpenMP Using the SPEC Benchmarks." In OpenMP Shared Memory Parallel Programming, 115–29. Berlin, Heidelberg: Springer Berlin Heidelberg, 2001. http://dx.doi.org/10.1007/3-540-44587-0_11.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Sato, Mitsuhisa, Motonari Hirano, Yoshio Tanaka, and Satoshi Sekiguchi. "OmniRPC: A Grid RPC Facility for Cluster and Global Computing in OpenMP." In OpenMP Shared Memory Parallel Programming, 130–36. Berlin, Heidelberg: Springer Berlin Heidelberg, 2001. http://dx.doi.org/10.1007/3-540-44587-0_12.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Gonzalez, M., E. Ayguadé, X. Martorell, and J. Labarta. "Defining and Supporting Pipelined Executions in OpenMP." In OpenMP Shared Memory Parallel Programming, 155–69. Berlin, Heidelberg: Springer Berlin Heidelberg, 2001. http://dx.doi.org/10.1007/3-540-44587-0_14.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Min, Seung Jai, Seon Wook Kim, Michael Voss, Sang Ik Lee, and Rudolf Eigenmann. "Portable Compilers for OpenMP." In OpenMP Shared Memory Parallel Programming, 11–19. Berlin, Heidelberg: Springer Berlin Heidelberg, 2001. http://dx.doi.org/10.1007/3-540-44587-0_2.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Kusano, Kazuhiro, Mitsuhisa Sato, Takeo Hosomi, and Yoshiki Seo. "The Omni OpenMP Compiler on the Distributed Shared Memory of Cenju-4." In OpenMP Shared Memory Parallel Programming, 20–30. Berlin, Heidelberg: Springer Berlin Heidelberg, 2001. http://dx.doi.org/10.1007/3-540-44587-0_3.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Müller, Matthias. "Some Simple OpenMP Optimization Techniques." In OpenMP Shared Memory Parallel Programming, 31–39. Berlin, Heidelberg: Springer Berlin Heidelberg, 2001. http://dx.doi.org/10.1007/3-540-44587-0_4.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Caubet, Jordi, Judit Gimenez, Jesus Labarta, Luiz DeRose, and Jeffrey Vetter. "A Dynamic Tracing Mechanism for Performance Analysis of OpenMP Applications." In OpenMP Shared Memory Parallel Programming, 53–67. Berlin, Heidelberg: Springer Berlin Heidelberg, 2001. http://dx.doi.org/10.1007/3-540-44587-0_6.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Conference papers on the topic "Shared-memory parallel programming"

1

Bonetta, Daniele, Luca Salucci, Stefan Marr, and Walter Binder. "GEMs: shared-memory parallel programming for Node.js." In SPLASH '16: Conference on Systems, Programming, Languages, and Applications: Software for Humanity. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2983990.2984039.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Zhang, Yu, and Wei Hu. "Exploring Deterministic Shared Memory Programming Model." In 2012 13th International Conference on Parallel and Distributed Computing Applications and Technologies (PDCAT). IEEE, 2012. http://dx.doi.org/10.1109/pdcat.2012.74.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Shibu, P. S., Atul, Balamati Choudhury, and Raveendranath U. Nair. "Shared Memory Architecture based Parallel Programming for RCS Estimation." In 2018 International Conference on Applied Electromagnetics, Signal Processing and Communication (AESPC). IEEE, 2018. http://dx.doi.org/10.1109/aespc44649.2018.9033194.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Senghor, Abdourahmane, and Karim Konate. "A Java Hybrid Compiler for Shared Memory Parallel Programming." In 2012 13th International Conference on Parallel and Distributed Computing Applications and Technologies (PDCAT). IEEE, 2012. http://dx.doi.org/10.1109/pdcat.2012.21.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Ohno, Kazuhiko, Dai Michiura, Masaki Matsumoto, Takahiro Sasaki, and Toshio Kondo. "A GPGPU Programming Framework based on a Shared-Memory Model." In Parallel and Distributed Computing and Systems. Calgary,AB,Canada: ACTAPRESS, 2012. http://dx.doi.org/10.2316/p.2012.757-097.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Ohno, Kazuhiko, Dai Michiura, Masaki Matsumoto, Takahiro Sasaki, and Toshio Kondo. "A GPGPU Programming Framework based on a Shared-Memory Model." In Parallel and Distributed Computing and Systems. Calgary,AB,Canada: ACTAPRESS, 2011. http://dx.doi.org/10.2316/p.2011.757-097.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Chapman, B. "Scalable Shared Memory Parallel Programming: Will One Size Fit All?" In 14th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP'06). IEEE, 2006. http://dx.doi.org/10.1109/pdp.2006.64.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Karantasis, Konstantinos I., and Eleftherios D. Polychronopoulos. "Programming GPU Clusters with Shared Memory Abstraction in Software." In 2011 19th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP). IEEE, 2011. http://dx.doi.org/10.1109/pdp.2011.91.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Bättig, Martin, and Thomas R. Gross. "Synchronized-by-Default Concurrency for Shared-Memory Systems." In PPoPP '17: 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. New York, NY, USA: ACM, 2017. http://dx.doi.org/10.1145/3018743.3018747.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Hayashi, Koby, Grey Ballard, Yujie Jiang, and Michael J. Tobia. "Shared-memory parallelization of MTTKRP for dense tensors." In PPoPP '18: 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. New York, NY, USA: ACM, 2018. http://dx.doi.org/10.1145/3178487.3178522.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Reports on the topic "Shared-memory parallel programming"

1

Goudy, Susan Phelps, Jonathan Leighton Brown, Zhaofang Wen, Michael Allen Heroux, and Shan Shan Huang. BEC :a virtual shared memory parallel programming environment. Office of Scientific and Technical Information (OSTI), January 2006. http://dx.doi.org/10.2172/882923.

Full text
APA, Harvard, Vancouver, ISO, and other styles

To the bibliography