Academic literature on the topic "CPU-GPU Partitioning"

Browse the thematic lists of articles, books, theses, conference proceedings, and other academic sources on the topic "CPU-GPU Partitioning".

Journal articles on the topic "CPU-GPU Partitioning"

1

Benatia, Akrem, Weixing Ji, Yizhuo Wang, and Feng Shi. "Sparse matrix partitioning for optimizing SpMV on CPU-GPU heterogeneous platforms." International Journal of High Performance Computing Applications 34, no. 1 (2019): 66–80. http://dx.doi.org/10.1177/1094342019886628.

Abstract
Sparse matrix–vector multiplication (SpMV) kernel dominates the computing cost in numerous applications. Most of the existing studies dedicated to improving this kernel have been targeting just one type of processing units, mainly multicore CPUs or graphics processing units (GPUs), and have not explored the potential of the recent, rapidly emerging, CPU-GPU heterogeneous platforms. To take full advantage of these heterogeneous systems, the input sparse matrix has to be partitioned on different available processing units. The partitioning problem is more challenging with the existence of many s
2

Narayana, Divyaprabha Kabbal, and Sudarshan Tekal Subramanyam Babu. "Optimal task partitioning to minimize failure in heterogeneous computational platform." International Journal of Electrical and Computer Engineering (IJECE) 15, no. 1 (2025): 1079–88. http://dx.doi.org/10.11591/ijece.v15i1.pp1079-1088.

Abstract
The increased energy consumption by heterogeneous cloud platforms surges the carbon emissions and reduces system reliability, thus, making workload scheduling an extremely challenging process. The dynamic voltage-frequency scaling (DVFS) technique provides an efficient mechanism in improving the energy efficiency of cloud platform; however, employing DVFS reduces reliability and increases the failure rate of resource scheduling. Most of the current workload scheduling methods have failed to optimize the energy and reliability together under a central processing unit - graphical processing uni
3

Yang, Huijing, and Tingwen Yu. "Two novel cache management mechanisms on CPU-GPU heterogeneous processors." Research Briefs on Information and Communication Technology Evolution 7 (June 15, 2021): 1–8. http://dx.doi.org/10.56801/rebicte.v7i.113.

Abstract
Heterogeneous multicore processors that take full advantage of CPUs and GPUs within the same chip raise an emerging challenge for sharing a series of on-chip resources, particularly Last-Level Cache (LLC) resources. Since the GPU core has good parallelism and memory latency tolerance, the majority of the LLC space is utilized by GPU applications. Under the current cache management policies, the LLC sharing of CPU applications can be remarkably decreased due to the existence of GPU workloads, thus seriously affecting the overall performance. To alleviate the unfair contention within CPUs and GPUs for
4

Narayana, Divyaprabha Kabbal, and Sudarshan Tekal Subramanyam Babu. "Optimal task partitioning to minimize failure in heterogeneous computational platform." International Journal of Electrical and Computer Engineering (IJECE) 15 (February 1, 2025): 1079–88. https://doi.org/10.11591/ijece.v15i1.pp1079-1088.

Abstract
The increased energy consumption by heterogeneous cloud platforms surges the carbon emissions and reduces system reliability, thus, making workload scheduling an extremely challenging process. The dynamic voltage-frequency scaling (DVFS) technique provides an efficient mechanism in improving the energy efficiency of cloud platform; however, employing DVFS reduces reliability and increases the failure rate of resource scheduling. Most of the current workload scheduling methods have failed to optimize the energy and reliability together under a central processing un
5

Fang, Juan, Mengxuan Wang, and Zelin Wei. "A memory scheduling strategy for eliminating memory access interference in heterogeneous system." Journal of Supercomputing 76, no. 4 (2020): 3129–54. http://dx.doi.org/10.1007/s11227-019-03135-7.

Abstract
Multiple CPUs and GPUs are integrated on the same chip to share memory, and access requests between cores are interfering with each other. Memory requests from the GPU seriously interfere with the CPU memory access performance. Requests between multiple CPUs are intertwined when accessing memory, and its performance is greatly affected. The difference in access latency between GPU cores increases the average latency of memory accesses. In order to solve the problems encountered in the shared memory of heterogeneous multi-core systems, we propose a step-by-step memory scheduling strateg
6

Merrill, Duane, and Andrew Grimshaw. "High Performance and Scalable Radix Sorting: A Case Study of Implementing Dynamic Parallelism for GPU Computing." Parallel Processing Letters 21, no. 02 (2011): 245–72. http://dx.doi.org/10.1142/s0129626411000187.

Abstract
The need to rank and order data is pervasive, and many algorithms are fundamentally dependent upon sorting and partitioning operations. Prior to this work, GPU stream processors have been perceived as challenging targets for problems with dynamic and global data-dependences such as sorting. This paper presents: (1) a family of very efficient parallel algorithms for radix sorting; and (2) our allocation-oriented algorithmic design strategies that match the strengths of GPU processor architecture to this genre of dynamic parallelism. We demonstrate multiple factors of speedup (up to 3.8x) compar
7

Vilches, Antonio, Rafael Asenjo, Angeles Navarro, Francisco Corbera, Rubén Gran, and María Garzarán. "Adaptive Partitioning for Irregular Applications on Heterogeneous CPU-GPU Chips." Procedia Computer Science 51 (2015): 140–49. http://dx.doi.org/10.1016/j.procs.2015.05.213.

8

Sung, Hanul, Hyeonsang Eom, and HeonYoung Yeom. "The Need of Cache Partitioning on Shared Cache of Integrated Graphics Processor between CPU and GPU." KIISE Transactions on Computing Practices 20, no. 9 (2014): 507–12. http://dx.doi.org/10.5626/ktcp.2014.20.9.507.

9

Wang, Shunjiang, Baoming Pu, Ming Li, Weichun Ge, Qianwei Liu, and Yujie Pei. "State Estimation Based on Ensemble DA–DSVM in Power System." International Journal of Software Engineering and Knowledge Engineering 29, no. 05 (2019): 653–69. http://dx.doi.org/10.1142/s0218194019400023.

Abstract
This paper investigates the state estimation problem of power systems. A novel, fast and accurate state estimation algorithm is presented to solve this problem based on the one-dimensional denoising autoencoder and deep support vector machine (1D DA–DSVM). Besides, for further reducing the computation burden, a partitioning method is presented to divide the power system into several sub-networks and the proposed algorithm can be applied to each sub-network. A hybrid computing architecture of Central Processing Unit (CPU) and Graphics Processing Unit (GPU) is employed in the overall state estim
10

Park, Sungwoo, Seyeon Oh, and Min-Soo Kim. "cuMatch: A GPU-based Memory-Efficient Worst-case Optimal Join Processing Method for Subgraph Queries with Complex Patterns." Proceedings of the ACM on Management of Data 3, no. 3 (2025): 1–28. https://doi.org/10.1145/3725398.

Abstract
Subgraph queries are widely used but face significant challenges due to complex patterns such as negative and optional edges. While worst-case optimal joins have proven effective for subgraph queries with regular patterns, no method has been proposed that can process queries involving complex patterns in a single multi-way join. Existing CPU-based and GPU-based methods experience intermediate data explosion when processing complex patterns following regular patterns. In addition, GPU-based methods struggle with issues of wasted GPU memory and redundant computation. In this paper, we propose cu

Theses on the topic "CPU-GPU Partitioning"

1

Öhberg, Tomas. "Auto-tuning Hybrid CPU-GPU Execution of Algorithmic Skeletons in SkePU." Thesis, Linköpings universitet, Programvara och system, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-149605.

Abstract
The trend in computer architectures has for several years been heterogeneous systems consisting of a regular CPU and at least one additional, specialized processing unit, such as a GPU. The different characteristics of the processing units and the requirement of multiple tools and programming languages make programming of such systems a challenging task. Although there exist tools for programming each processing unit, utilizing the full potential of a heterogeneous computer still requires specialized implementations involving multiple frameworks and hand-tuning of parameters. To fully exploit t
2

Thomas, Béatrice. "Adéquation Algorithme Architecture pour la gestion des réseaux électriques." Electronic Thesis or Diss., université Paris-Saclay, 2024. http://www.theses.fr/2024UPASG104.

Abstract
The growth in decentralized renewable generation required for the energy transition will make managing the power grid more complex. A rich literature proposes decentralizing management to avoid overloading the central operator during real-time operation. However, decentralization exacerbates scalability problems in the preliminary simulations used to validate the performance and robustness of the management scheme or the sizing of the future grid. An Algorithm-Architecture Adequation approach was followed in this thesis for a peer-to-peer market
3

Li, Cheng-Hsuan (李承軒). "Weighted LLC Latency-Based Run-Time Cache Partitioning for Heterogeneous CPU-GPU Architecture." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/33311478280299879988.

Abstract
Master's thesis, National Taiwan University, Graduate Institute of Computer Science and Information Engineering, 102. Integrating the CPU and GPU on the same chip has become the development trend for microprocessor design. In integrated CPU-GPU architecture, utilizing the shared last-level cache (LLC) is a critical design issue due to the pressure on shared resources and the different characteristics of CPU and GPU applications. Because of the latency-hiding capability provided by the GPU and the huge discrepancy in concurrent executing threads between the CPU and GPU, LLC partitioning can no longer be achieved by simply minimizing the overall cache misses as in homogeneous
4

Mishra, Ashirbad. "Efficient Betweenness Centrality Computations on Hybrid CPU-GPU Systems." Thesis, 2016. http://etd.iisc.ac.in/handle/2005/2718.

Abstract
Analysis of networks is quite interesting, because they can be interpreted for several purposes. Various features require different metrics to measure and interpret them. Measuring the relative importance of each vertex in a network is one of the most fundamental building blocks in network analysis. Betweenness Centrality (BC) is one such metric that plays a key role in many real-world applications. BC is an important graph analytics application for large-scale graphs. However, it is one of the most computationally intensive kernels to execute, and measuring centrality in billion-scale graphs is

Book chapters on the topic "CPU-GPU Partitioning"

1

Clarke, David, Aleksandar Ilic, Alexey Lastovetsky, and Leonel Sousa. "Hierarchical Partitioning Algorithm for Scientific Computing on Highly Heterogeneous CPU + GPU Clusters." In Euro-Par 2012 Parallel Processing. Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-32820-6_49.

2

Saba, Issa, Eishi Arima, Dai Liu, and Martin Schulz. "Orchestrated Co-scheduling, Resource Partitioning, and Power Capping on CPU-GPU Heterogeneous Systems via Machine Learning." In Architecture of Computing Systems. Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-21867-5_4.

3

Fei, Xiongwei, Kenli Li, Wangdong Yang, and Keqin Li. "CPU-GPU Computing." In Innovative Research and Applications in Next-Generation High Performance Computing. IGI Global, 2016. http://dx.doi.org/10.4018/978-1-5225-0287-6.ch007.

Abstract
Heterogeneous and hybrid computing has been heavily studied in the field of parallel and distributed computing in recent years. It can work on a single computer, or in a group of computers connected by a high-speed network. The former is the topic of this chapter. Its key points are how to cooperatively use devices that are different in performance and architecture to satisfy various computing requirements, and how to make the whole program achieve the best performance possible when executed. CPUs and GPUs have fundamentally different design philosophies, but combining their characteristics co
4

"Topology-Aware Load-Balance Schemes for Heterogeneous Graph Processing." In Advances in Computer and Electrical Engineering. IGI Global, 2018. http://dx.doi.org/10.4018/978-1-5225-3799-1.ch005.

Abstract
Inspired by the insights presented in Chapters 2, 3, and 4, in this chapter the authors present the KCMAX (K-Core MAX) and the KCML (K-Core Multi-Level) frameworks: novel k-core-based graph partitioning approaches that produce unbalanced partitions of complex networks that are suitable for heterogeneous parallel processing. Then they use KCMAX and KCML to explore the configuration space for accelerating BFSs on large complex networks in the context of TOTEM, a BSP heterogeneous GPU + CPU HPC platform. They study the feasibility of the heterogeneous computing approach by systematically studying

Conference proceedings on the topic "CPU-GPU Partitioning"

1

Qiu, Jingbao, Huawei Zhai, Xiaodong Yuan, and Licheng Cui. "CPU-GPU Heterogeneous Stencil Computation Algorithm Based on Dynamic Hybrid Fragmentation Partitioning." In 2024 6th International Conference on Frontier Technologies of Information and Computer (ICFTIC). IEEE, 2024. https://doi.org/10.1109/icftic64248.2024.10913103.

2

Goodarzi, Bahareh, Martin Burtscher, and Dhrubajyoti Goswami. "Parallel Graph Partitioning on a CPU-GPU Architecture." In 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). IEEE, 2016. http://dx.doi.org/10.1109/ipdpsw.2016.16.

3

Cho, Younghyun, Florian Negele, Seohong Park, Bernhard Egger, and Thomas R. Gross. "On-the-fly workload partitioning for integrated CPU/GPU architectures." In PACT '18: International conference on Parallel Architectures and Compilation Techniques. ACM, 2018. http://dx.doi.org/10.1145/3243176.3243210.

4

Kim, Dae Hee, Rakesh Nagi, and Deming Chen. "Thanos: High-Performance CPU-GPU Based Balanced Graph Partitioning Using Cross-Decomposition." In 2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC). IEEE, 2020. http://dx.doi.org/10.1109/asp-dac47756.2020.9045588.

5

Wang, Xin, and Wei Zhang. "Cache locking vs. partitioning for real-time computing on integrated CPU-GPU processors." In 2016 IEEE 35th International Performance Computing and Communications Conference (IPCCC). IEEE, 2016. http://dx.doi.org/10.1109/pccc.2016.7820644.

6

Fang, Juan, Shijian Liu, and Xibei Zhang. "Research on Cache Partitioning and Adaptive Replacement Policy for CPU-GPU Heterogeneous Processors." In 2017 16th International Symposium on Distributed Computing and Applications to Business, Engineering and Science (DCABES). IEEE, 2017. http://dx.doi.org/10.1109/dcabes.2017.12.

7

Wachter, Eduardo Weber, Geoff V. Merrett, Bashir M. Al-Hashimi, and Amit Kumar Singh. "Reliable mapping and partitioning of performance-constrained OpenCL applications on CPU-GPU MPSoCs." In ESWEEK'17: Thirteenth Embedded System Week. ACM, 2017. http://dx.doi.org/10.1145/3139315.3157088.

8

Xiao, Chunhua, Wei Ran, Fangzhu Lin, and Lin Zhang. "Dynamic Fine-Grained Workload Partitioning for Irregular Applications on Discrete CPU-GPU Systems." In 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom). IEEE, 2021. http://dx.doi.org/10.1109/ispa-bdcloud-socialcom-sustaincom52081.2021.00148.

9

Magalhães, W. F., H. M. Gomes, L. B. Marinho, G. S. Aguiar, and P. Silveira. "Investigating Mobile Edge-Cloud Trade-Offs of Object Detection with YOLO." In VII Symposium on Knowledge Discovery, Mining and Learning. Sociedade Brasileira de Computação - SBC, 2019. http://dx.doi.org/10.5753/kdmile.2019.8788.

Abstract
With the advent of smart IoT applications empowered with AI, together with the democratization of mobile devices, moving the computation from cloud to edge is a natural trend in both academia and industry. A major challenge in this direction is enabling the deployment of Deep Neural Networks (DNNs), which usually demand lots of computational resources (i.e. memory, disk, CPU/GPU, and power), in resource limited edge devices. Among the possible strategies to tackle this challenge are: (i) running the entire DNN on the edge device (sometimes not feasible), (ii) distributing the computation betwe
10

Negrut, Dan, Toby Heyn, Andrew Seidl, Dan Melanz, David Gorsich, and David Lamb. "Enabling Computational Dynamics in Distributed Computing Environments Using a Heterogeneous Computing Template." In 2024 NDIA Michigan Chapter Ground Vehicle Systems Engineering and Technology Symposium. National Defense Industrial Association, 2024. http://dx.doi.org/10.4271/2024-01-3314.

Abstract
This paper describes a software infrastructure made up of tools and libraries designed to assist developers in implementing computational dynamics applications running on heterogeneous and distributed computing environments. Together, these tools and libraries compose a so-called Heterogeneous Computing Template (HCT). The underlying theme of the solution approach embraced by HCT is that of partitioning the domain of interest into a number of sub-domains that are each managed by a separate core/accelerator (CPU/GPU) pair. The five components at the