Academic literature on the topic 'Parallel code mapping'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Parallel code mapping.'

Journal articles on the topic "Parallel code mapping"

1

Proctor, Robert W., Huifang Wang, and Kim-Phuong L. Vu. "Influences of different combinations of conceptual, perceptual, and structural similarity on stimulus-response compatibility." Quarterly Journal of Experimental Psychology Section A 55, no. 1 (February 2002): 59–74. http://dx.doi.org/10.1080/02724980143000163.

Abstract:
This study evaluated the hypothesis that an increase in set-level stimulus-response compatibility produces facilitation for congruent mappings and interference for incongruent mappings. The degree of set-level compatibility was manipulated by varying combinations of conceptual, perceptual, and structural similarity. Experiment 1 varied perceptual similarity, by combining two stimulus codes (spatial, verbal) with two response modalities (manual, vocal) for orthogonal spatial dimensions, which have structural similarity. The element-level mapping effect did not vary as a function of the code-modality relation, in contrast to findings obtained with parallel spatial dimensions, which also have conceptual similarity. Experiment 2 manipulated combinations of conceptual and perceptual similarity by combining vertical and horizontal stimulus and response orientations, using verbal or spatial stimuli and vocal responses. The element-level mapping effect was larger for parallel than orthogonal orientations, with congruent mappings showing facilitation and incongruent mappings showing interference. The largest effect was facilitation for parallel orientations with the verbal-vocal set, consistent with the view that perceptual similarity contributes to performance primarily when responding with the identity of the stimulus. Our results indicate that conceptual similarity, but not perceptual similarity, produces the facilitation/interference pattern suggestive of automatic activation of the corresponding response regardless of mapping.
2

GRÉWAL, GARY WILLIAM, and CHARLES THOMAS WILSON. "MAPPING REFERENCE CODE TO IRREGULAR DSPS WITHIN THE RETARGETABLE, OPTIMIZING COMPILER COGEN(T)." International Journal of Computational Intelligence and Applications 03, no. 01 (March 2003): 45–64. http://dx.doi.org/10.1142/s146902680300080x.

Abstract:
Generating high quality code for embedded processors is made difficult by irregular architectures and highly encoded parallel instructions. Rather than dealing with the target machine at every stage of the compilation, a promising new methodology employs generic algorithms to optimize code for an idealized abstraction of the true target machine. This code, called reference code, is then mapped to the real instruction set by enhanced genetic algorithms. One perturbs the original schedule to find a number of alternative (parallel) instruction sequences, and the other evolves feasible register assignments, if possible, for each sequence. This paper describes the strategy for mapping idealized code into actual code. The COGEN(T) system employs this methodology to produce good code for different commercial DSPs and ASIPs.
3

Fu, Zuohui, Yikun Xian, Shijie Geng, Yingqiang Ge, Yuting Wang, Xin Dong, Guang Wang, and Gerard De Melo. "ABSent: Cross-Lingual Sentence Representation Mapping with Bidirectional GANs." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 7756–63. http://dx.doi.org/10.1609/aaai.v34i05.6279.

Abstract:
A number of cross-lingual transfer learning approaches based on neural networks have been proposed for the case when large amounts of parallel text are at our disposal. However, in many real-world settings, the size of parallel annotated training data is restricted. Additionally, prior cross-lingual mapping research has mainly focused on the word level. This raises the question of whether such techniques can also be applied to effortlessly obtain cross-lingually aligned sentence representations. To this end, we propose an Adversarial Bi-directional Sentence Embedding Mapping (ABSent) framework, which learns mappings of cross-lingual sentence representations from limited quantities of parallel data. The experiments show that our method outperforms several technically more powerful approaches, especially under challenging low-resource circumstances. The source code is available from https://github.com/zuohuif/ABSent along with relevant datasets.
4

Ikuta, Kai, Hiroyuki Maehara, Yuta Notsu, Kosuke Namekata, Taichi Kato, Shota Notsu, Soshi Okamoto, Satoshi Honda, Daisaku Nogami, and Kazunari Shibata. "Starspot Mapping with Adaptive Parallel Tempering. I. Implementation of Computational Code." Astrophysical Journal 902, no. 1 (October 13, 2020): 73. http://dx.doi.org/10.3847/1538-4357/abae5f.

5

NITSCHE, THOMAS. "LIFTING SEQUENTIAL FUNCTIONS TO PARALLEL SKELETONS." Parallel Processing Letters 12, no. 02 (June 2002): 267–84. http://dx.doi.org/10.1142/s0129626402000963.

Abstract:
This paper describes the transformation of (almost arbitrary) sequential functions on covers to parallel, collective operations (skeletons). This allows the direct re-use of existing, but sequential, code on parallel machines without the necessity to hand-code the desired parallel operations. A necessary pre-requisite for this skeleton lifting is the availability of a cover which holds the mapping information of the local subobjects, including their topology information. The lifting transformation distinguishes sequential values, which are available on each processor, from parallel values, which are only stored once. Accesses to parallel values are either ignored locally if they will be computed on another processor, or they induce communication messages to transfer the necessary data, depending on the behavior of the original function.
6

Știrb, Iulia. "Extending NUMA-BTLP Algorithm with Thread Mapping Based on a Communication Tree." Computers 7, no. 4 (December 3, 2018): 66. http://dx.doi.org/10.3390/computers7040066.

Abstract:
The paper presents a Non-Uniform Memory Access (NUMA)-aware compiler optimization for task-level parallel code. The optimization is based on the Non-Uniform Memory Access - Balanced Task and Loop Parallelism (NUMA-BTLP) algorithm (Ştirb, 2018). The algorithm determines the type of each thread in the source code through static analysis. After assigning a type to each thread, NUMA-BTLP calls the NUMA-BTDM mapping algorithm (Ştirb, 2016), which uses the PThreads routine pthread_setaffinity_np to set the CPU affinities of the threads (i.e., thread-to-core associations) based on their type. The algorithms improve thread mapping on NUMA systems by mapping threads that share data onto the same core(s), allowing fast access to L1 cache data. The paper shows that PThreads-based task-level parallel code optimized at compile time by NUMA-BTLP and NUMA-BTDM runs efficiently in both time and energy on NUMA systems. The results show that energy consumption is reduced by up to 5% at the same execution time for one of the tested real benchmarks, and by up to 15% for another benchmark running in an infinite loop. The algorithms can be used in real-time control systems such as client/server-based applications, which require efficient access to shared resources. Most often, task parallelism is used in the implementation of the server and loop parallelism in the client.
7

Di Martino, Beniamino, and Antonio Esposito. "Automatic Dynamic Data Structures Recognition to Support the Migration of Applications to the Cloud." International Journal of Grid and High Performance Computing 7, no. 3 (July 2015): 1–22. http://dx.doi.org/10.4018/ijghpc.2015070101.

Abstract:
The work presented in this manuscript describes a methodology for the recognition of Dynamic Data structures, with a focus on Queues, Pipes and Lists. The recognition of such structures is used as a basis for the mapping of sequential code to Cloud Services, in order to support the semi-automatic restructuring of source software. The goal is to develop a complete methodology and a framework based on it to ease the efforts needed to port native applications to a Cloud Platform and simplify the relative complex processes. In order to achieve such an objective, the proposed technique exploits an intermediate representation of the code, consisting in parallel Skeletons and Cloud Patterns. Logical inference rules act on a knowledge base, built during the analysis of the source code, to guide the recognition and mapping processes. Both the inference rules and knowledge base are expressed in Prolog. A prototype tool for the automatic analysis of sequential source code and its mapping to a Cloud Pattern is also presented.
8

Bonati, Claudio, Enrico Calore, Simone Coscetti, Massimo D’Elia, Michele Mesiti, Francesco Negro, Sebastiano Fabio Schifano, Giorgio Silvi, and Raffaele Tripiccione. "Portable LQCD Monte Carlo code using OpenACC." EPJ Web of Conferences 175 (2018): 09008. http://dx.doi.org/10.1051/epjconf/201817509008.

Abstract:
Varying from multi-core CPU processors to many-core GPUs, the present scenario of HPC architectures is extremely heterogeneous. In this context, code portability is increasingly important for easy maintainability of applications; this is relevant in scientific computing where code changes are numerous and frequent. In this talk we present the design and optimization of a state-of-the-art production level LQCD Monte Carlo application, using the OpenACC directives model. OpenACC aims to abstract parallel programming to a descriptive level, where programmers do not need to specify the mapping of the code on the target machine. We describe the OpenACC implementation and show that the same code is able to target different architectures, including state-of-the-art CPUs and GPUs.
9

Jeong, Eunjin, Dowhan Jeong, and Soonhoi Ha. "Dataflow Model–based Software Synthesis Framework for Parallel and Distributed Embedded Systems." ACM Transactions on Design Automation of Electronic Systems 26, no. 5 (June 5, 2021): 1–38. http://dx.doi.org/10.1145/3447680.

Abstract:
Existing software development methodologies mostly assume that an application runs on a single device without concern about the non-functional requirements of an embedded system such as latency and resource consumption. Besides, embedded software is usually developed after the hardware platform is determined, since a non-negligible portion of the code depends on the hardware platform. In this article, we present a novel model-based software synthesis framework for parallel and distributed embedded systems. An application is specified as a set of tasks with the given rules for execution and communication. Having such rules enables us to perform static analysis to check some software errors at compile-time to reduce the verification difficulty. Platform-specific programs are synthesized automatically after the mapping of tasks onto processing elements is determined. The proposed framework is expandable to support new hardware platforms easily. The proposed communication code synthesis method is extensible and flexible to support various communication methods between devices. In addition, the fault-tolerant feature can be added by modifying the task graph automatically according to the selected fault-tolerance configurations by the user. The viability of the proposed software development methodology is evaluated with a real-life surveillance application that runs on six processing elements.
10

Wang, H. C., and C. K. Yuen. "A general framework to build new CPUs by mapping abstract machine code to instruction level parallel execution hardware." ACM SIGARCH Computer Architecture News 33, no. 4 (November 2005): 113–20. http://dx.doi.org/10.1145/1105734.1105750.

Dissertations / Theses on the topic "Parallel code mapping"

1

Bengtsson, Jerker. "Models and Methods for Development of DSP Applications on Manycore Processors." Doctoral thesis, Högskolan i Halmstad, Centrum för forskning om inbyggda system (CERES), 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-14706.

Abstract:
Advanced digital signal processing systems require specialized high-performance embedded computer architectures. The term high-performance translates to large amounts of data and computations per time unit. The term embedded further implies requirements on physical size and power efficiency. Thus the requirements are of both functional and non-functional nature. This thesis addresses the development of high-performance digital signal processing systems relying on manycore technology. We propose building two-level hierarchical computer architectures for this domain of applications. Further, we outline a tool flow based on methods and analysis techniques for automated, multi-objective mapping of such applications on distributed memory manycore processors. In particular, the focus is put on how to provide a means for tunable strategies for mapping of task graphs on array structured distributed memory manycores, with respect to given application constraints. We argue for code mapping strategies based on predicted execution performance, which can be used in an auto-tuning feedback loop or to guide manual tuning directed by the programmer. Automated parallelization, optimisation and mapping to a manycore processor benefits from the use of a concurrent programming model as the starting point. Such a model allows the programmer to express different types and granularities of parallelism as well as computation characteristics of importance in the addressed class of applications. The programming model should also abstract away machine dependent hardware details. The analytical study of WCDMA baseband processing in radio base stations, presented in this thesis, suggests dataflow models as a good match to the characteristics of the application and as execution model abstracting computations on a manycore. Construction of portable tools further requires a manycore machine model and an intermediate representation. 
The models are needed in order to decouple algorithms, used to transform and map application software, from hardware. We propose a manycore machine model that captures common hardware resources, as well as resource dependent performance metrics for parallel computation and communication. Further, we have developed a multifunctional intermediate representation, which can be used as source for code generation and for dynamic execution analysis. Finally, we demonstrate how we can dynamically analyse execution using abstract interpretation on the intermediate representation. It is shown that the performance predictions can be used to accurately rank different mappings by best throughput or shortest end-to-end computation latency.
2

Grewe, Dominik. "Mapping parallel programs to heterogeneous multi-core systems." Thesis, University of Edinburgh, 2014. http://hdl.handle.net/1842/8852.

Abstract:
Heterogeneous computer systems are ubiquitous in all areas of computing, from mobile to high-performance computing. They promise to deliver increased performance at lower energy cost than purely homogeneous, CPU-based systems. In recent years GPU-based heterogeneous systems have become increasingly popular. They combine a programmable GPU with a multi-core CPU. GPUs have become flexible enough to not only handle graphics workloads but also various kinds of general-purpose algorithms. They are thus used as a coprocessor or accelerator alongside the CPU. Developing applications for GPU-based heterogeneous systems involves several challenges. Firstly, not all algorithms are equally suited for GPU computing. It is thus important to carefully map the tasks of an application to the most suitable processor in a system. Secondly, current frameworks for heterogeneous computing, such as OpenCL, are low-level, requiring a thorough understanding of the hardware by the programmer. This high barrier to entry could be lowered by automatically generating and tuning this code from a high-level and thus more user-friendly programming language. Both challenges are addressed in this thesis. For the task mapping problem a machine learning-based approach is presented in this thesis. It combines static features of the program code with runtime information on input sizes to predict the optimal mapping of OpenCL kernels. This approach is further extended to also take contention on the GPU into account. Both methods are able to outperform competing mapping approaches by a significant margin. Furthermore, this thesis develops a method for targeting GPU-based heterogeneous systems from OpenMP, a directive-based framework for parallel computing. OpenMP programs are translated to OpenCL and optimized for GPU performance. At runtime a predictive model decides whether to execute the original OpenMP code on the CPU or the generated OpenCL code on the GPU. 
This approach is shown to outperform both a competing approach as well as hand-tuned code.
3

Jones, Beryl Wyn. "Mapping unstructured mesh codes onto local memory parallel architectures." Thesis, University of Greenwich, 1994. http://gala.gre.ac.uk/6201/.

Abstract:
Initial work on mapping CFD codes onto parallel systems focused upon software which employed structured meshes. Increasingly, many large scale CFD codes are being based upon unstructured meshes. One of the key problems when implementing such large scale unstructured problems on a distributed memory machine is the question of how to partition the underlying computational domain efficiently. It is important that all processors are kept busy for as large a proportion of the time as possible and that the amount, level and frequency of communication should be kept to a minimum. Proposed techniques for solving the mapping problem have separated out the solution into two distinct phases. The first phase is to partition the computational domain into cohesive sub-regions. The second phase consists of embedding these sub-regions onto the processors. However, it has been shown that performing these two operations in isolation can lead to poor mappings and much less optimal communication time. In this thesis we develop a technique which simultaneously takes account of the processor topology whilst identifying the cohesive sub-regions. Our approach is based on an unstructured mesh decomposition method that was originally developed by Sadayappan et al [SER90] for a hypercube. This technique forms a basis for a method which enables a decomposition to an arbitrary number of processors on a specified processor network topology. Whilst partitioning the mesh, the optimisation method takes into account the processor topology by minimising the total interprocessor communication. The problem with this technique is that it is not suitable for dealing with very large meshes since the calculations often require prodigious amounts of computing processing power.
4

Hashmi, Jahanzeb Maqbool. "Designing High Performance Shared-Address-Space and Adaptive Communication Middlewares for Next-Generation HPC Systems." The Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu1588038721555713.

Books on the topic "Parallel code mapping"

1

Gao, Guang R. A code mapping scheme for dataflow software pipelining. Boston: Kluwer Academic Publishers, 1991.

2

Hu, Xuhui. Encoding Events. Oxford University Press, 2018. http://dx.doi.org/10.1093/oso/9780198808466.001.0001.

Abstract:
This book presents theoretical and empirical research on the syntax of events within the broader framework of generative grammar. A central theoretical concern is how conceptual meaning interacts with narrow syntactic computation in the derivation of the information of an event. A set of Integration Conditions are proposed. Building on the Conceptual-Intentional Interface Conditions proposed in Chomsky’s (1995, 2000, 2001) Minimalist Programme, the Integration Conditions require that the content of the predicate be licensed by theta-role information generated by narrow syntax. Another theoretical component concerns the functional structure of events, which is related to such issues as the parallel between the event and nominal domains, the mapping of a predicate onto an entity, as well as the grammatical foundation of verb classification. The theoretical framework is applied in three areas: (1) the syntax of resultatives in English and Chinese, which exhibits how a theory of the syntax of events can address the thematic relationship between core arguments and predicates; (2) variation of resultatives at cross-linguistic and diachronic levels, which shows how the universal functional structure of events can be compatible with, and even contribute to, the theory of parametric variation in the generative tradition; and (3) applicative constructions, which extend the analysis of core arguments to non-core arguments, and shed light on the typology of verb/satellite-framed languages (Talmy 1991, 2000) and the analyticity parameter proposed in Huang (2015).

Book chapters on the topic "Parallel code mapping"

1

Reinders, James, Ben Ashbaugh, James Brodman, Michael Kinsner, John Pennycook, and Xinmin Tian. "Programming for CPUs." In Data Parallel C++, 387–418. Berkeley, CA: Apress, 2020. http://dx.doi.org/10.1007/978-1-4842-5574-2_16.

Abstract:
Kernel programming originally became popular as a way to program GPUs. As kernel programming is generalized, it is important to understand how our style of programming affects the mapping of our code to a CPU.
2

Reinders, James, Ben Ashbaugh, James Brodman, Michael Kinsner, John Pennycook, and Xinmin Tian. "Programming for FPGAs." In Data Parallel C++, 419–69. Berkeley, CA: Apress, 2020. http://dx.doi.org/10.1007/978-1-4842-5574-2_17.

Abstract:
Kernel-based programming originally became popular as a way to access GPUs. Since it has now been generalized across many types of accelerators, it is important to understand how our style of programming affects the mapping of code to an FPGA as well.
3

Weichslgartner, Andreas, Stefan Wildermann, Michael Glaß, and Jürgen Teich. "Hybrid Application Mapping." In Invasive Computing for Mapping Parallel Programs to Many-Core Architectures, 85–135. Singapore: Springer Singapore, 2017. http://dx.doi.org/10.1007/978-981-10-7356-4_5.

4

Weichslgartner, Andreas, Stefan Wildermann, Michael Glaß, and Jürgen Teich. "Hybrid Mapping for Increased Security." In Invasive Computing for Mapping Parallel Programs to Many-Core Architectures, 137–56. Singapore: Springer Singapore, 2017. http://dx.doi.org/10.1007/978-981-10-7356-4_6.

5

Weichslgartner, Andreas, Stefan Wildermann, Michael Glaß, and Jürgen Teich. "Introduction." In Invasive Computing for Mapping Parallel Programs to Many-Core Architectures, 1–7. Singapore: Springer Singapore, 2017. http://dx.doi.org/10.1007/978-981-10-7356-4_1.

6

Weichslgartner, Andreas, Stefan Wildermann, Michael Glaß, and Jürgen Teich. "Invasive Computing." In Invasive Computing for Mapping Parallel Programs to Many-Core Architectures, 9–43. Singapore: Springer Singapore, 2017. http://dx.doi.org/10.1007/978-981-10-7356-4_2.

7

Weichslgartner, Andreas, Stefan Wildermann, Michael Glaß, and Jürgen Teich. "Fundamentals." In Invasive Computing for Mapping Parallel Programs to Many-Core Architectures, 45–56. Singapore: Springer Singapore, 2017. http://dx.doi.org/10.1007/978-981-10-7356-4_3.

8

Weichslgartner, Andreas, Stefan Wildermann, Michael Glaß, and Jürgen Teich. "Self-embedding." In Invasive Computing for Mapping Parallel Programs to Many-Core Architectures, 57–83. Singapore: Springer Singapore, 2017. http://dx.doi.org/10.1007/978-981-10-7356-4_4.

9

Weichslgartner, Andreas, Stefan Wildermann, Michael Glaß, and Jürgen Teich. "Conclusions and Future Work." In Invasive Computing for Mapping Parallel Programs to Many-Core Architectures, 157–61. Singapore: Springer Singapore, 2017. http://dx.doi.org/10.1007/978-981-10-7356-4_7.

10

Glatzmaier, Gary A. "Spatial Discretizations." In Introduction to Modeling Convection in Planets and Stars. Princeton University Press, 2013. http://dx.doi.org/10.23943/princeton/9780691141725.003.0009.

Abstract:
This chapter considers two ways of employing a spatial resolution that varies with position within a finite-difference method: using a nonuniform grid and mapping to a new coordinate variable. It first provides an overview of nonuniform grids before discussing coordinate mapping as an alternative way of achieving spatial discretization. It then describes an approach for treating both the vertical and horizontal directions with simple finite-difference methods: defining a streamfunction, which automatically satisfies mass conservation, and solving for vorticity via the curl of the momentum conservation equation. It also explains the use of the Chebyshev–Fourier method to simulate the convection or gravity wave problem by employing spectral methods in both the horizontal and vertical directions. Finally, it looks at the basic ideas and some issues that need to be addressed with respect to parallel processing as well as choices that need to be made when designing a parallel code.

Conference papers on the topic "Parallel code mapping"

1

Grewe, D., Zheng Wang, and M. F. P. O'Boyle. "Portable mapping of data parallel programs to OpenCL for heterogeneous systems." In 2013 IEEE/ACM International Symposium on Code Generation and Optimization (CGO). IEEE, 2013. http://dx.doi.org/10.1109/cgo.2013.6494993.

2

Tan, Xiangchen, Tie Feng, and Jiachen Zhang. "Mapping Software Design Changes to Source Code Changes." In Eighth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing (SNPD 2007). IEEE, 2007. http://dx.doi.org/10.1109/snpd.2007.293.

3

Duncan, Ralph, Nima Gougol, and Jim Frandeen. "Mapping Exceptions to High-Level Source Code on a Heterogeneous Architecture." In 2018 9th International Symposium on Parallel Architectures, Algorithms and Programming (PAAP). IEEE, 2018. http://dx.doi.org/10.1109/paap.2018.00017.

4

Mastoras, Aristeidis, and Thomas R. Gross. "Unifying fixed code and fixed data mapping of load-imbalanced pipelined loops." In PPoPP '16: 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2851141.2851172.

5

Chen, Xiaona, Chuanshumin Hu, and Xing Chen. "Class Mapping between Different Versions of Android Applications without Source Code." In 2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom). IEEE, 2019. http://dx.doi.org/10.1109/ispa-bdcloud-sustaincom-socialcom48970.2019.00222.

6

Liu, Yu, Hong Zhang, and Baoshan Jia. "Development and Preliminary Verification of a Thermal-Hydraulic Multi-Scale Coupled Code." In 18th International Conference on Nuclear Engineering. ASMEDC, 2010. http://dx.doi.org/10.1115/icone18-29239.

Abstract:
On the basis of the best-estimate thermal-hydraulic system code RELAP5, the sub-channel code COBRA-IV, and the commercial Computational Fluid Dynamics (CFD) code CFX, a thermal-hydraulic multi-scale coupled code, RECOX, has been developed. The coupling strategy was designed to keep the integral structure of each code and minimize modifications to the source code. Under the Parallel Virtual Machine (PVM) environment, an external control code has been developed to handle code spawning, data exchange and mapping, time step coordination, etc. Two test cases, a single-phase blowdown and a temperature fluctuation transient, were carried out to evaluate the coupling between codes. Very good agreement with stand-alone simulations was achieved. Then, to demonstrate the coupled analysis capability of RECOX, an asymmetric transient in a simple two-loop system similar to a nuclear power plant was simulated. The result is correct and reliable, although further verification of the coupled code against related experiments is needed. Finally, some potential improvements to the coupling and future work are presented.
7

Belardinelli, Pierpaolo, and Stefano Lenci. "HPC Methods for Domains of Attraction Computation." In ASME 2015 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. American Society of Mechanical Engineers, 2015. http://dx.doi.org/10.1115/detc2015-46095.

Abstract:
The work is devoted to the development of efficient parallel algorithms for the computation of large-scale basins of attraction. Since the required computational resources increase exponentially with the dimension of a dynamical system, it is common to run into memory saturation or excessively long computation times. This paper presents a code, based on a cell mapping method, that evaluates basins of attraction for high-dimensional systems by exploiting parallel programming. The proposed approach, using a double-step algorithm, makes it possible (i) to fully determine the basins in all dimensions and (ii) to evaluate 2D Poincaré sections of the system. The code is described in all its parts: the shell, in charge of core management, splits the computing domain over a multi-core environment and makes efficient use of memory. A preliminary performance analysis is also undertaken for grids of different dimensions; the optimal balance between computing cores and memory management cores is studied.
APA, Harvard, Vancouver, ISO, and other styles
8

Luo, Fuli, Peng Li, Jie Zhou, Pengcheng Yang, Baobao Chang, Xu Sun, and Zhifang Sui. "A Dual Reinforcement Learning Framework for Unsupervised Text Style Transfer." In Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19). California: International Joint Conferences on Artificial Intelligence Organization, 2019. http://dx.doi.org/10.24963/ijcai.2019/711.

Abstract:
Unsupervised text style transfer aims to transfer the underlying style of text while keeping its main content unchanged, without parallel data. Most existing methods typically follow two steps: first separating the content from the original style, and then fusing the content with the desired style. However, the separation in the first step is challenging because the content and style interact in subtle ways in natural language. Therefore, in this paper, we propose a dual reinforcement learning framework to directly transfer the style of the text via a one-step mapping model, without any separation of content and style. Specifically, we consider the learning of the source-to-target and target-to-source mappings as a dual task, and two rewards are designed based on such a dual structure to reflect the style accuracy and content preservation, respectively. In this way, the two one-step mapping models can be trained via reinforcement learning, without any use of parallel data. Automatic evaluations show that our model outperforms the state-of-the-art systems by a large margin, with an average improvement of more than 10 BLEU points on two benchmark datasets. Human evaluations also validate the effectiveness of our model in terms of style accuracy, content preservation and fluency. Our code and data, including outputs of all baselines and our model, are available at https://github.com/luofuli/DualRL.
9

Shu, Xinwei, Chuangang Gu, Tong Wang, and Bo Yang. "Optimum Design and Experimental Study of a Very Low-Specific-Speed Centrifugal Blower Blade." In ASME Turbo Expo 2009: Power for Land, Sea, and Air. ASMEDC, 2009. http://dx.doi.org/10.1115/gt2009-59823.

Abstract:
Low-specific-speed centrifugal blowers are widely used in industry for compressing gases to high pressures at low flows. They have relatively low manufacturing and operating costs. However, their efficiency is fairly low as a result of relatively high leakage, disc friction and passage friction pressure losses. The purpose of this paper is to show how their performance can be improved by properly reshaping the blade profile using a newly developed multipoint optimization approach. The core of this approach is to build an approximate model mapping the correlation between certain blade shape parameters and the corresponding aerodynamic performance. The approach is implemented by combining a blade parameterization code, a three-dimensional viscous flow solver and several improved numerical optimization algorithms. A very low-specific-speed industrial centrifugal blower with parallel hub and shroud has been selected as a reference case. The superior performance of the optimized impeller blade is demonstrated by comparison with the original blade. This is then confirmed by experimental studies.
10

Kennings, Andrew, and Chirag Ravishankar. "Parallel FPGA technology mapping using multi-core architectures." In 2011 24th IEEE Canadian Conference on Electrical and Computer Engineering (CCECE). IEEE, 2011. http://dx.doi.org/10.1109/ccece.2011.6030453.
