Journal articles on the topic 'Architecture of GPU'

Consult the top 50 journal articles for your research on the topic 'Architecture of GPU.'

1

Agibalov, Oleg, and Nikolay Ventsov. "On the issue of fuzzy timing estimations of the algorithms running at GPU and CPU architectures." E3S Web of Conferences 135 (2019): 01082. http://dx.doi.org/10.1051/e3sconf/201913501082.

Abstract:
We consider the task of comparing fuzzy estimates of the execution parameters of genetic algorithms implemented at GPU (graphics processing unit) and CPU (central processing unit) architectures. Fuzzy estimates are calculated based on the averaged dependences of the genetic algorithms' running time at GPU and CPU architectures on the number of individuals in the populations processed by the algorithm. The analysis of these averaged dependences showed that it is possible to process 10,000 chromosomes at the GPU architecture or
2

Payvar, Saman, Maxime Pelcat, and Timo D. Hämäläinen. "A model of architecture for estimating GPU processing performance and power." Design Automation for Embedded Systems 25, no. 1 (2021): 43–63. http://dx.doi.org/10.1007/s10617-020-09244-4.

Abstract:
Efficient usage of heterogeneous computing architectures requires distribution of the workload on available processing elements. Traditionally, the mapping is based on information acquired from application profiling and utilized in architecture exploration. To reduce the amount of manual work required, statistical application modeling and architecture modeling can be combined with exploration heuristics. While the application modeling side of the problem has been studied extensively, architecture modeling has received less attention. Linear System Level Architecture (LSLA) is a Model o
3

Wittenbrink, Craig M., Emmett Kilgariff, and Arjun Prabhu. "Fermi GF100 GPU Architecture." IEEE Micro 31, no. 2 (2011): 50–59. http://dx.doi.org/10.1109/mm.2011.24.

4

Kim, Youngsok, Jaewon Lee, Donggyu Kim, and Jangwoo Kim. "ScaleGPU: GPU Architecture for Memory-Unaware GPU Programming." IEEE Computer Architecture Letters 13, no. 2 (2014): 101–4. http://dx.doi.org/10.1109/l-ca.2013.19.

5

Ang, Li Minn, and Kah Phooi Seng. "GPU-Based Embedded Intelligence Architectures and Applications." Electronics 10, no. 8 (2021): 952. http://dx.doi.org/10.3390/electronics10080952.

Abstract:
This paper presents contributions to the state of the art in graphics processing unit (GPU)-based embedded intelligence (EI) research for architectures and applications. The paper gives a comprehensive review and representative studies of the emerging and current paradigms for GPU-based EI, with a focus on architectures, technologies and applications: (1) First, the overview and classifications of GPU-based EI research are presented to give the full spectrum in this area, which also serves as a concise summary of the scope of the paper; (2) Second, various architecture technologies for GPU-
6

Arafa, Yehia, Abdel-Hameed A. Badawy, Gopinath Chennupati, Nandakishore Santhi, and Stephan Eidenbenz. "PPT-GPU: Scalable GPU Performance Modeling." IEEE Computer Architecture Letters 18, no. 1 (2019): 55–58. http://dx.doi.org/10.1109/lca.2019.2904497.

7

Power, Jason, Joel Hestness, Marc S. Orr, Mark D. Hill, and David A. Wood. "gem5-gpu: A Heterogeneous CPU-GPU Simulator." IEEE Computer Architecture Letters 14, no. 1 (2015): 34–36. http://dx.doi.org/10.1109/lca.2014.2299539.

8

Abdellah, Marwan, Ayman Eldeib, and Amr Sharawi. "High Performance GPU-Based Fourier Volume Rendering." International Journal of Biomedical Imaging 2015 (2015): 1–13. http://dx.doi.org/10.1155/2015/590727.

Abstract:
Fourier volume rendering (FVR) is a significant visualization technique that has been used widely in digital radiography. As a result of its O(N² log N) time complexity, it provides a faster alternative to spatial domain volume rendering algorithms that are O(N³) computationally complex. Relying on the Fourier projection-slice theorem, this technique operates on the spectral representation of a 3D volume instead of processing its spatial representation to generate attenuation-only projections that look like X-ray radiographs. Due to the rapid evolution of its underlying architecture, the graphics pro
9

Venstad, Jon Marius. "Industry-scale finite-difference elastic wave modeling on graphics processing units using the out-of-core technique." GEOPHYSICS 81, no. 2 (2016): T35–T43. http://dx.doi.org/10.1190/geo2015-0267.1.

Abstract:
The difference in computational power between the few- and multicore architectures represented by central processing units (CPUs) and graphics processing units (GPUs) is significant today, and this difference is likely to increase in the years ahead. GPUs are, therefore, ever more popular for applications in computational physics, such as wave modeling. Finite-difference methods are popular for wave modeling and are well suited for the GPU architecture, but developing an efficient and capable GPU implementation is hindered by the limited size of the GPU memory. I revealed how the out-of-core t
10

Steinberger, Markus, Michael Kenzel, Bernhard Kainz, Jörg Müller, Wonka Peter, and Dieter Schmalstieg. "Parallel generation of architecture on the GPU." Computer Graphics Forum 33, no. 2 (2014): 73–82. http://dx.doi.org/10.1111/cgf.12312.

11

Nikam, V. B. "PARALLEL KNN ON GPU ARCHITECTURE USING OPENCL." International Journal of Research in Engineering and Technology 03, no. 10 (2014): 367–72. http://dx.doi.org/10.15623/ijret.2014.0310059.

12

Kim, Yeongseok, and Youngjin Park. "CPU-GPU architecture for active noise control." Applied Acoustics 153 (October 2019): 1–13. http://dx.doi.org/10.1016/j.apacoust.2019.04.002.

13

Yang, Bo, Hui Liu, and Zhangxin Chen. "Preconditioned GMRES solver on multiple-GPU architecture." Computers & Mathematics with Applications 72, no. 4 (2016): 1076–95. http://dx.doi.org/10.1016/j.camwa.2016.06.027.

14

Jahanshahi, Ali, Hadi Zamani Sabzi, Chester Lau, and Daniel Wong. "GPU-NEST: Characterizing Energy Efficiency of Multi-GPU Inference Servers." IEEE Computer Architecture Letters 19, no. 2 (2020): 139–42. http://dx.doi.org/10.1109/lca.2020.3023723.

15

Gan, Xin Biao, Li Shen, Quan Yuan Tan, Cong Liu, and Zhi Ying Wang. "Performance Evaluation and Optimization on GPU." Advanced Materials Research 219-220 (March 2011): 1445–49. http://dx.doi.org/10.4028/www.scientific.net/amr.219-220.1445.

Abstract:
GPU provides higher peak performance with hundreds of cores than its CPU counterpart. However, it is a big challenge to take full advantage of this computing power. In order to understand the performance bottlenecks of applications on many-core GPUs and then optimize parallel programs on GPU architectures, we propose a performance evaluation model based on the memory wall and classify applications into AbM (Application bound-in Memory) and AbC (Application bound-in Computing). Furthermore, we optimize kernels characterized by low memory bandwidth, including matrix multiplication and FFT (Fast Fourie
16

Deniz, Etem, and Alper Sen. "MINIME-GPU." ACM Transactions on Architecture and Code Optimization 12, no. 4 (2016): 1–25. http://dx.doi.org/10.1145/2818693.

17

Braak, Gert-Jan Van Den, and Henk Corporaal. "R-GPU." ACM Transactions on Architecture and Code Optimization 13, no. 1 (2016): 1–24. http://dx.doi.org/10.1145/2890506.

18

Hung, Che-Lun, and Guan-Jie Hua. "Local Alignment Tool Based on Hadoop Framework and GPU Architecture." BioMed Research International 2014 (2014): 1–7. http://dx.doi.org/10.1155/2014/541490.

Abstract:
With the rapid growth of next-generation sequencing technologies, such as Slex, more and more data have been discovered and published. To analyze such huge data, computational performance is an important issue. Recently, many tools, such as SOAP, have been implemented on Hadoop and GPU parallel computing architectures. BLASTP is an important tool, implemented on GPU architectures, for biologists to compare protein sequences. To deal with big biology data, it is hard to rely on a single GPU. Therefore, we implement a distributed BLASTP by combining Hadoop and multi-GPUs. The experimental r
19

Xu, Yunlong, Rui Wang, Nilanjan Goswami, Tao Li, and Depei Qian. "Software Transactional Memory for GPU Architectures." IEEE Computer Architecture Letters 13, no. 1 (2014): 49–52. http://dx.doi.org/10.1109/l-ca.2013.4.

20

Singh, Inderpreet, Arrvindh Shriraman, Wilson W. L. Fung, Mike O'Connor, and Tor M. Aamodt. "Cache Coherence for GPU Architectures." IEEE Micro 34, no. 3 (2014): 69–79. http://dx.doi.org/10.1109/mm.2014.4.

21

EMMART, NIALL, and CHARLES WEEMS. "SEARCH-BASED AUTOMATIC CODE GENERATION FOR MULTIPRECISION MODULAR EXPONENTIATION ON MULTIPLE GENERATIONS OF GPU." Parallel Processing Letters 23, no. 04 (2013): 1340009. http://dx.doi.org/10.1142/s0129626413400094.

Abstract:
Multiprecision modular exponentiation has a variety of uses, including cryptography, prime testing and computational number theory. It is also a very costly operation to compute. GPU parallelism can be used to accelerate these computations, but to use the GPU efficiently, a problem must involve many simultaneous exponentiation operations. Handling a large number of TLS/SSL encrypted sessions in a data center is an important problem that fits this profile. We are developing a framework that enables generation of highly efficient implementations of exponentiation operations for different NVIDIA
22

Obrecht, Christian, Bernard Tourancheau, and Frédéric Kuznik. "Performance Evaluation of an OpenCL Implementation of the Lattice Boltzmann Method on the Intel Xeon Phi." Parallel Processing Letters 25, no. 03 (2015): 1541001. http://dx.doi.org/10.1142/s0129626415410017.

Abstract:
A portable OpenCL implementation of the lattice Boltzmann method targeting emerging many-core architectures is described. The main purpose of this work is to evaluate and compare the performance of this code on three mainstream hardware architectures available today, namely an Intel CPU, an Nvidia GPU, and the Intel Xeon Phi. Because of the similarities between OpenCL and CUDA, we chose to follow some of the strategies devised to implement efficient lattice Boltzmann solvers on Nvidia GPUs, while remaining as generic as possible. Being fairly configurable, this program makes it possible to ascerta
23

Navarro, Cristóbal A., Nancy Hitschfeld-Kahler, and Luis Mateu. "A Survey on Parallel Computing and its Applications in Data-Parallel Problems Using GPU Architectures." Communications in Computational Physics 15, no. 2 (2014): 285–329. http://dx.doi.org/10.4208/cicp.110113.010813a.

Abstract:
Parallel computing has become an important subject in the field of computer science and has proven to be critical when researching high performance solutions. The evolution of computer architectures (multi-core and many-core) towards a higher number of cores can only confirm that parallelism is the method of choice for speeding up an algorithm. In the last decade, the graphics processing unit, or GPU, has gained an important place in the field of high performance computing (HPC) because of its low cost and massive parallel processing power. Super-computing has become, for the first time,
24

Dowty, Micah, and Jeremy Sugerman. "GPU virtualization on VMware's hosted I/O architecture." ACM SIGOPS Operating Systems Review 43, no. 3 (2009): 73–82. http://dx.doi.org/10.1145/1618525.1618534.

25

Gutiérrez-Aguado, Juan, Jose M. Claver, and Raúl Peña-Ortiz. "Toward a transparent and efficient GPU cloudification architecture." Journal of Supercomputing 75, no. 7 (2018): 3640–72. http://dx.doi.org/10.1007/s11227-018-2720-z.

26

Gonzalez Clua, Esteban Walter, and Marcelo Panaro Zamith. "Programming in CUDA for Kepler and Maxwell Architecture." Revista de Informática Teórica e Aplicada 22, no. 2 (2015): 233. http://dx.doi.org/10.22456/2175-2745.56384.

Abstract:
Since the first version of CUDA was launched, many improvements have been made in GPU computing. Every new CUDA version included important novel features, bringing this architecture closer and closer to a typical parallel high-performance language. This tutorial presents the GPU architecture and CUDA principles, and conceptualizes novel features introduced by NVIDIA, such as dynamic parallelism, unified memory and concurrent kernels. This text also includes some optimization remarks for CUDA programs.
27

Lim, Hyunyul, Tae Hyun Kim, and Sungho Kang. "Prediction-Based Error Correction for GPU Reliability with Low Overhead." Electronics 9, no. 11 (2020): 1849. http://dx.doi.org/10.3390/electronics9111849.

Abstract:
Scientific and simulation applications are continuously gaining importance in many fields of research and industries. These applications require massive amounts of memory and substantial arithmetic computation. Therefore, general-purpose computing on graphics processing units (GPGPU), which combines the computing power of graphics processing units (GPUs) and general CPUs, have been used for computationally intensive scientific and big data processing applications. Because current GPU architectures lack hardware support for error detection in computation logic, GPGPU has low reliability. Unlike
28

Xu, S., X. Huang, Y. Zhang, et al. "gpuPOM: a GPU-based Princeton Ocean Model." Geoscientific Model Development Discussions 7, no. 6 (2014): 7651–91. http://dx.doi.org/10.5194/gmdd-7-7651-2014.

Abstract:
Rapid advances in the performance of the graphics processing unit (GPU) have made the GPU a compelling solution for a series of scientific applications. However, most existing GPU acceleration work for climate models ports only partial code for certain hot spots, and can only achieve limited speedup for the entire model. In this work, we take the mpiPOM (a parallel version of the Princeton Ocean Model) as our starting point, and design and implement a GPU-based Princeton Ocean Model. By carefully considering the architectural features of state-of-the-art GPU devices, we rewri
29

Saikia, Manob Jyoti, Rajan Kanhirodan, and Ram Mohan Vasu. "High-Speed GPU-Based Fully Three-Dimensional Diffuse Optical Tomographic System." International Journal of Biomedical Imaging 2014 (2014): 1–13. http://dx.doi.org/10.1155/2014/376456.

Abstract:
We have developed a graphics processor unit (GPU)-based high-speed fully 3D system for diffuse optical tomography (DOT). The reduction in execution time of the 3D DOT algorithm, a severely ill-posed problem, is made possible through the use of (1) an algorithmic improvement that uses the Broyden approach for updating the Jacobian matrix and thereby updating the parameter matrix and (2) the multinode multithreaded GPU and CUDA (Compute Unified Device Architecture) software architecture. Two different GPU implementations of DOT programs are developed in this study: (1) conventional C language program a
30

Lee, Sangpil, and Won Woo Ro. "Parallel GPU Architecture Simulation Framework Exploiting Architectural-Level Parallelism with Timing Error Prediction." IEEE Transactions on Computers 65, no. 4 (2016): 1253–65. http://dx.doi.org/10.1109/tc.2015.2444848.

31

Zhu, Weihang, Ashraf Yaseen, and Yaohang Li. "DEMCMC-GPU: An Efficient Multi-Objective Optimization Method with GPU Acceleration on the Fermi Architecture." New Generation Computing 29, no. 2 (2011): 163–84. http://dx.doi.org/10.1007/s00354-010-0103-y.

32

Cuomo, Salvatore, Pasquale De Michele, and Francesco Piccialli. "3D Data Denoising via Nonlocal Means Filter by Using Parallel GPU Strategies." Computational and Mathematical Methods in Medicine 2014 (2014): 1–14. http://dx.doi.org/10.1155/2014/523862.

Abstract:
The Nonlocal Means (NLM) algorithm is widely considered a state-of-the-art denoising filter in many research fields. Its high computational complexity has led researchers to develop parallel programming approaches and to use massively parallel architectures such as GPUs. In recent years, GPU devices have achieved reasonable running times by filtering 3D datasets slice-by-slice with a 2D NLM algorithm. In our approach we design and implement a fully 3D Nonlocal Means parallel approach, adopting different algorithm mapping strategies on GPU architecture and m
33

Jeon, Hyeran, Hodjat Asghari Esfeden, Nael B. Abu-Ghazaleh, Daniel Wong, and Sindhuja Elango. "Locality-Aware GPU Register File." IEEE Computer Architecture Letters 18, no. 2 (2019): 153–56. http://dx.doi.org/10.1109/lca.2019.2959298.

34

Cao, Wei, Zheng Hua Wang, and Chuan Fu Xu. "A Survey of General Purpose Computation of GPU for Computational Fluid Dynamics." Advanced Materials Research 753-755 (August 2013): 2731–35. http://dx.doi.org/10.4028/www.scientific.net/amr.753-755.2731.

Abstract:
The graphics processing unit (GPU) has evolved from a configurable graphics processor to a powerful engine for high-performance computing. In this paper, we describe the graphics pipeline of the GPU, and introduce the history and evolution of GPU architecture. We also provide a summary of the software environments used on GPU, from graphics APIs to non-graphics APIs. Finally, we present GPU computing in computational fluid dynamics applications, including GPGPU computing for Navier-Stokes equation methods and GPGPU computing for the lattice Boltzmann method.
35

Wilton, Richard, and Alexander S. Szalay. "Arioc: High-concurrency short-read alignment on multiple GPUs." PLOS Computational Biology 16, no. 11 (2020): e1008383. http://dx.doi.org/10.1371/journal.pcbi.1008383.

Abstract:
In large DNA sequence repositories, archival data storage is often coupled with computers that provide 40 or more CPU threads and multiple GPU (general-purpose graphics processing unit) devices. This presents an opportunity for DNA sequence alignment software to exploit high-concurrency hardware to generate short-read alignments at high speed. Arioc, a GPU-accelerated short-read aligner, can compute WGS (whole-genome sequencing) alignments ten times faster than comparable CPU-only alignment software. When two or more GPUs are available, Arioc's speed increases proportionately because the softw
36

Wasiljew, A., and K. Murawski. "A new CUDA-based GPU implementation of the two-dimensional Athena code." Bulletin of the Polish Academy of Sciences: Technical Sciences 61, no. 1 (2013): 239–50. http://dx.doi.org/10.2478/bpasts-2013-0023.

Abstract:
We present a new version of the Athena code, which solves magnetohydrodynamic equations in two-dimensional space. This new implementation, which we have named Athena-GPU, uses the CUDA architecture to allow code execution on a Graphical Processor Unit (GPU). The Athena-GPU code is an unofficial, modified version of the Athena code, which was originally designed for Central Processor Unit (CPU) architecture. We perform numerical tests based on the original Athena-CPU code and its GPU counterpart to make a performance analysis, which includes execution time, precision differences and accur
37

Choi, Hong-Jun, and Cheol-Hong Kim. "Performance Evaluation of the GPU Architecture Executing Parallel Applications." Journal of the Korea Contents Association 12, no. 5 (2012): 10–21. http://dx.doi.org/10.5392/jkca.2012.12.05.010.

38

Chu, A., Chi-Wing Fu, A. Hanson, and Pheng-Ann Heng. "GL4D: A GPU-based Architecture for Interactive 4D Visualization." IEEE Transactions on Visualization and Computer Graphics 15, no. 6 (2009): 1587–94. http://dx.doi.org/10.1109/tvcg.2009.147.

39

Santos, Lucana, Enrico Magli, Raffaele Vitulli, Jose F. Lopez, and Roberto Sarmiento. "Highly-Parallel GPU Architecture for Lossy Hyperspectral Image Compression." IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 6, no. 2 (2013): 670–81. http://dx.doi.org/10.1109/jstars.2013.2247975.

40

Huang, Qinghua, and Naida Lu. "Optimized Real-Time MUSIC Algorithm With CPU-GPU Architecture." IEEE Access 9 (2021): 54067–77. http://dx.doi.org/10.1109/access.2021.3070980.

41

Li, Zheng, Shuhong Wu, Jinchao Xu, and Chensong Zhang. "Toward Cost-Effective Reservoir Simulation Solvers on GPUs." Advances in Applied Mathematics and Mechanics 8, no. 6 (2016): 971–91. http://dx.doi.org/10.4208/aamm.2015.m1138.

Abstract:
In this paper, we focus on the graphical processing unit (GPU) and discuss how its architecture affects the choice of algorithm and implementation of fully-implicit petroleum reservoir simulation. In order to obtain satisfactory performance on new many-core architectures such as GPUs, simulator developers must know a great deal about the specific hardware and spend a lot of time on fine tuning the code. Porting a large petroleum reservoir simulator to emerging hardware architectures is expensive and risky. We analyze major components of an in-house reservoir simulator and investigate how
42

WEIGEL, MARTIN. "SIMULATING SPIN MODELS ON GPU: A TOUR." International Journal of Modern Physics C 23, no. 08 (2012): 1240002. http://dx.doi.org/10.1142/s0129183112400025.

Abstract:
The use of graphics processing units (GPUs) in scientific computing has gathered considerable momentum in the past five years. While GPUs in general promise high performance and excellent performance-per-watt ratios, not every class of problems is equally well suited to the massively parallel architecture they provide. Lattice spin models appear to be prototypic examples of problems suitable for this architecture, at least as long as local update algorithms are employed. In this review, I summarize our recent experience with the simulation of a wide range of spin models on GPU em
43

Chen, Dong, Hua You Su, Wen Mei, Li Xuan Wang, and Chun Yuan Zhang. "Scalable Parallel Motion Estimation on Muti-GPU System." Applied Mechanics and Materials 347-350 (August 2013): 3708–14. http://dx.doi.org/10.4028/www.scientific.net/amm.347-350.3708.

Abstract:
With NVIDIA's parallel computing architecture CUDA, using GPUs to speed up compute-intensive applications has become a research focus in recent years. In this paper, we propose a scalable method for multi-GPU systems to accelerate the motion estimation algorithm, which is the most time-consuming process in video encoding. Based on an analysis of data dependency and the multi-GPU architecture, a parallel computing model and a communication model are designed. We tested our parallel algorithm and analyzed its performance with 10 standard video sequences in different resolutions using 4 NVIDIA GTX460 GPU
44

Bai, Hong Tao, Yu Gang Li, Li Ying Chen, and Yan Ling Wang. "Parallel Optimization of Geometric Correction Algorithm Based on CPU-GPU Hybrid Architecture." Applied Mechanics and Materials 543-547 (March 2014): 2804–8. http://dx.doi.org/10.4028/www.scientific.net/amm.543-547.2804.

Abstract:
Geometric correction is an essential procedure in remote sensing image processing. The algorithms used in geometric correction are time intensive and the size of remote sensing images is very large. Meanwhile, the data to be calculated is huge in size and is accumulating rapidly every day. Hence, fast processing of geometric correction of remote sensing images has become an urgent research problem. Through its rapid development, the current GPU has a great advantage in processing speed and memory bandwidth over the CPU. It provides a new way for high-performance computing. In this
45

El-Naggar, Sabry Ali, Karim Samy El-Said, Mona Elwan, et al. "Toxicity of bean cooking media containing EDTA in mice." Toxicology and Industrial Health 36, no. 6 (2020): 436–45. http://dx.doi.org/10.1177/0748233719893178.

Abstract:
The possible renal and hepatic toxicities of ethylenediaminetetraacetic acid (EDTA) in bean cooking media were studied using 100 male albino mice. Two sublethal doses of EDTA were used to explore their toxic effects: 20 mg/kg and 200 mg/kg, which corresponded to 1/100th and 1/10th of the LD50, respectively. Accordingly, the toxicity study was performed using 50 mice, divided into five groups (n = 10/group) as follows: group 1 (Gp1) served as a negative control and was orally administered normal saline; group 2 (Gp2) was administered the bean cooking medium; group 3 (Gp3) was administered EDTA (20
46

Fernandez Declara, Placido, and J. Daniel Garcia. "Compass SPMD: a SPMD vectorized tracking algorithm." EPJ Web of Conferences 245 (2020): 01006. http://dx.doi.org/10.1051/epjconf/202024501006.

Abstract:
Compass is a SPMD (Single Program Multiple Data) tracking algorithm for the upcoming LHCb upgrade in 2021. 40 Tb/s need to be processed in real-time to select events. Alternative frameworks, algorithms and architectures are being tested to cope with the deluge of data. Allen is a research and development project aiming to run the full HLT1 (High Level Trigger) on GPUs (Graphics Processing Units). Allen’s architecture focuses on data-oriented layout and algorithms to better exploit parallel architectures. GPUs already proved to exploit the framework efficiently with the algorithms developed for
47

Wang, Lu, Magnus Jahre, Almutaz Adileh, Zhiying Wang, and Lieven Eeckhout. "Modeling Emerging Memory-Divergent GPU Applications." IEEE Computer Architecture Letters 18, no. 2 (2019): 95–98. http://dx.doi.org/10.1109/lca.2019.2923618.

48

Yan, Mingyu, Zhaodong Chen, Lei Deng, et al. "Characterizing and Understanding GCNs on GPU." IEEE Computer Architecture Letters 19, no. 1 (2020): 22–25. http://dx.doi.org/10.1109/lca.2020.2970395.

49

MITTAL, SPARSH. "A SURVEY OF TECHNIQUES FOR MANAGING AND LEVERAGING CACHES IN GPUs." Journal of Circuits, Systems and Computers 23, no. 08 (2014): 1430002. http://dx.doi.org/10.1142/s0218126614300025.

Abstract:
Initially introduced as special-purpose accelerators for graphics applications, graphics processing units (GPUs) have now emerged as general purpose computing platforms for a wide range of applications. To address the requirements of these applications, modern GPUs include sizable hardware-managed caches. However, several factors, such as unique architecture of GPU, rise of CPU–GPU heterogeneous computing, etc., demand effective management of caches to achieve high performance and energy efficiency. Recently, several techniques have been proposed for this purpose. In this paper, we survey seve
50

Kim, Do-Hyun, and Chi-Yong Kim. "Design of a SIMT architecture GP-GPU Using Tile based on Graphic Pipeline Structure." Journal of IKEEE 20, no. 1 (2016): 75–81. http://dx.doi.org/10.7471/ikeee.2016.20.1.075.
