
Journal articles on the topic 'Irregular Memory Accesses'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 26 journal articles for your research on the topic 'Irregular Memory Accesses.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Lin, Yuan, and David Padua. "Compiler analysis of irregular memory accesses." ACM SIGPLAN Notices 35, no. 5 (2000): 157–68. http://dx.doi.org/10.1145/358438.349322.

2

Fraguela, Basilio B., Ramón Doallo, and Emilio L. Zapata. "Memory Hierarchy Performance Prediction for Blocked Sparse Algorithms." Parallel Processing Letters 9, no. 3 (1999): 347–60. http://dx.doi.org/10.1142/s0129626499000323.

Abstract:
Nowadays the performance gap between processors and main memory makes an efficient usage of the memory hierarchy necessary for good program performance. Several techniques have been proposed for this purpose. Nevertheless most of them consider only regular access patterns, while many scientific and numerical applications give place to irregular patterns. A typical case is that of indirect accesses due to the use of compressed storage formats for sparse matrices. This paper describes an analytic approach to model both regular and irregular access patterns. The application modeled is an optimize…
3

Wang, Haomiao, Prabu Thiagaraj, and Oliver Sinnen. "Harmonic-Summing Module of SKA on FPGA—Optimizing the Irregular Memory Accesses." IEEE Transactions on Very Large Scale Integration (VLSI) Systems 27, no. 3 (2019): 624–36. http://dx.doi.org/10.1109/tvlsi.2018.2882238.

4

Guo, Hui, Libo Huang, Yashuai Lu, Sheng Ma, and Zhiying Wang. "DyCache: Dynamic Multi-Grain Cache Management for Irregular Memory Accesses on GPU." IEEE Access 6 (2018): 38881–91. http://dx.doi.org/10.1109/access.2018.2818193.

5

Srikanth, Sriseshan, Anirudh Jain, Thomas M. Conte, Erik P. Debenedictis, and Jeanine Cook. "SortCache." ACM Transactions on Architecture and Code Optimization 18, no. 4 (2021): 1–24. http://dx.doi.org/10.1145/3473332.

Abstract:
Sparse data applications have irregular access patterns that stymie modern memory architectures. Although hyper-sparse workloads have received considerable attention in the past, moderately-sparse workloads prevalent in machine learning applications, graph processing and HPC have not. Where the former can bypass the cache hierarchy, the latter fit in the cache. This article makes the observation that intelligent, near-processor cache management can improve bandwidth utilization for data-irregular accesses, thereby accelerating moderately-sparse workloads. We propose SortCache, a processor-cent…
6

Asiatici, Mikhail, and Paolo Ienne. "Request, Coalesce, Serve, and Forget: Miss-Optimized Memory Systems for Bandwidth-Bound Cache-Unfriendly Applications on FPGAs." ACM Transactions on Reconfigurable Technology and Systems 15, no. 2 (2022): 1–33. http://dx.doi.org/10.1145/3466823.

Abstract:
Applications such as large-scale sparse linear algebra and graph analytics are challenging to accelerate on FPGAs due to the short irregular memory accesses, resulting in low cache hit rates. Nonblocking caches reduce the bandwidth required by misses by requesting each cache line only once, even when there are multiple misses corresponding to it. However, such reuse mechanism is traditionally implemented using an associative lookup. This limits the number of misses that are considered for reuse to a few tens, at most. In this article, we present an efficient pipeline that can process and store…
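The "request, coalesce, serve" idea described in this abstract can be illustrated with a small software analogue. The sketch below is only an analogy under assumed names (kLineBytes, coalesce); the article itself describes a hardware pipeline on FPGAs, not this code. It groups outstanding requests by cache-line address so that each line is fetched once and its data forwarded to every request waiting on it.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

constexpr std::uint64_t kLineBytes = 64;  // illustrative cache-line size

// Group byte addresses by the cache line they fall in. Each line is then
// "served" (fetched) once, no matter how many requests hit it -- the same
// reuse that a nonblocking cache's miss handling provides in hardware.
std::unordered_map<std::uint64_t, std::vector<std::size_t>>
coalesce(const std::vector<std::uint64_t>& addrs) {
    std::unordered_map<std::uint64_t, std::vector<std::size_t>> pending;
    for (std::size_t i = 0; i < addrs.size(); ++i) {
        pending[addrs[i] / kLineBytes].push_back(i);  // request i waits on this line
    }
    return pending;  // one fetch per key, then forward the data to each waiter
}
```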
7

Wei, Zheng, and Joseph JaJa. "Optimization of Linked List Prefix Computations on Multithreaded GPUs Using CUDA." Parallel Processing Letters 22, no. 4 (2012): 1250012. http://dx.doi.org/10.1142/s0129626412500120.

Abstract:
We present a number of optimization techniques to compute prefix sums on linked lists and implement them on the multithreaded GPUs Tesla C1060, Tesla C2050, and GTX480 using CUDA. Prefix computations on linked structures involve in general highly irregular fine grain memory accesses that are typical of many computations on linked lists, trees, and graphs. While the current generation of GPUs provides substantial computational power and extremely high bandwidth memory accesses, they may appear at first to be primarily geared toward streamed, highly data parallel computations. In this paper, we…
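The "highly irregular fine grain memory accesses" mentioned in this abstract come from pointer chasing: each step of the traversal depends on the value loaded in the previous step. A minimal sequential C++ sketch of a prefix sum over a linked list stored as arrays (illustrative only; the cited paper's GPU kernels are far more elaborate):

```cpp
#include <cstddef>
#include <vector>

// A linked list stored as arrays: next[i] is the successor of node i,
// or (size_t)-1 for the tail. Nodes are scattered in memory order, so
// each value[cur] / next[cur] load lands on an unpredictable cache line.
void list_prefix_sum(std::vector<double>& value,
                     const std::vector<std::size_t>& next,
                     std::size_t head) {
    double running = 0.0;
    std::size_t cur = head;
    while (cur != static_cast<std::size_t>(-1)) {
        running += value[cur];  // irregular read
        value[cur] = running;   // prefix sum written in place
        cur = next[cur];        // pointer chase: the next address depends on this load
    }
}
```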
8

Yang, Tao, Zhezhi He, Tengchuan Kou, et al. "BISWSRBS: A Winograd-based CNN Accelerator with a Fine-grained Regular Sparsity Pattern and Mixed Precision Quantization." ACM Transactions on Reconfigurable Technology and Systems 14, no. 4 (2021): 1–28. http://dx.doi.org/10.1145/3467476.

Abstract:
Field-programmable Gate Array (FPGA) is a high-performance computing platform for Convolution Neural Networks (CNNs) inference. Winograd algorithm, weight pruning, and quantization are widely adopted to reduce the storage and arithmetic overhead of CNNs on FPGAs. Recent studies strive to prune the weights in the Winograd domain, however, resulting in irregular sparse patterns and leading to low parallelism and reduced utilization of resources. Besides, there are few works to discuss a suitable quantization scheme for Winograd. In this article, we propose a regular sparse pruning pattern in the…
9

He, Guixia, and Jiaquan Gao. "A Novel CSR-Based Sparse Matrix-Vector Multiplication on GPUs." Mathematical Problems in Engineering 2016 (2016): 1–12. http://dx.doi.org/10.1155/2016/8471283.

Abstract:
Sparse matrix-vector multiplication (SpMV) is an important operation in scientific computations. Compressed sparse row (CSR) is the most frequently used format to store sparse matrices. However, CSR-based SpMVs on graphic processing units (GPUs), for example, CSR-scalar and CSR-vector, usually have poor performance due to irregular memory access patterns. This motivates us to propose a perfect CSR-based SpMV on the GPU that is called PCSR. PCSR involves two kernels and accesses CSR arrays in a fully coalesced manner by introducing a middle array, which greatly alleviates the deficiencies of CS…
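The irregular access pattern behind CSR-based SpMV is easiest to see in code. Below is a minimal single-threaded C++ sketch (array names are illustrative and not taken from the cited paper): the load x[col[j]] is indirect, so its address depends on the matrix structure rather than on the loop index.

```cpp
#include <cstddef>
#include <vector>

// Minimal CSR sparse matrix-vector product y = A * x.
// row_ptr has n_rows + 1 entries; col/val hold the nonzeros row by row.
// The load x[col[j]] is data-dependent, so consecutive iterations can
// touch unrelated cache lines (irregular access).
void spmv_csr(const std::vector<std::size_t>& row_ptr,
              const std::vector<std::size_t>& col,
              const std::vector<double>& val,
              const std::vector<double>& x,
              std::vector<double>& y) {
    const std::size_t n_rows = row_ptr.size() - 1;
    for (std::size_t i = 0; i < n_rows; ++i) {
        double sum = 0.0;
        for (std::size_t j = row_ptr[i]; j < row_ptr[i + 1]; ++j) {
            sum += val[j] * x[col[j]];  // indirect, irregular load
        }
        y[i] = sum;
    }
}
```

On a GPU, the CSR-scalar variant assigns one thread per row, so the threads of a warp issue these indirect loads to scattered addresses; that lack of coalescing is the deficiency the proposed PCSR kernel works around.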
10

Natarajan, Ragavendra, Vineeth Mekkat, Wei-Chung Hsu, and Antonia Zhai. "Effectiveness of Compiler-Directed Prefetching on Data Mining Benchmarks." Journal of Circuits, Systems and Computers 21, no. 2 (2012): 1240006. http://dx.doi.org/10.1142/s0218126612400063.

Abstract:
For today's increasingly power-constrained multicore systems, integrating simpler and more energy-efficient in-order cores becomes attractive. However, since in-order processors lack complex hardware support for tolerating long-latency memory accesses, developing compiler technologies to hide such latencies becomes critical. Compiler-directed prefetching has been demonstrated effective on some applications. On the application side, a large class of data centric applications has emerged to explore the underlying properties of the explosively growing data. These applications, in contrast to trad…
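Compiler-directed prefetching of the kind evaluated in this article can be imitated by hand to show the basic idea. A hedged sketch for an indirect-gather loop: __builtin_prefetch is a GCC/Clang builtin, and the prefetch distance of 16 iterations is an illustrative guess rather than a tuned value.

```cpp
#include <cstddef>

// Gather through an index array, prefetching the element that will be
// needed `dist` iterations ahead so its latency overlaps with useful work.
double gather_sum(const double* a, const int* idx, std::size_t n) {
    constexpr std::size_t dist = 16;  // illustrative prefetch distance
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        if (i + dist < n) {
            __builtin_prefetch(&a[idx[i + dist]], /*rw=*/0, /*locality=*/1);
        }
        sum += a[idx[i]];  // irregular, data-dependent load
    }
    return sum;
}
```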
11

Choudhury, Dwaipayan, Aravind Sukumaran Rajam, Ananth Kalyanaraman, and Partha Pratim Pande. "High-Performance and Energy-Efficient 3D Manycore GPU Architecture for Accelerating Graph Analytics." ACM Journal on Emerging Technologies in Computing Systems 18, no. 1 (2022): 1–19. http://dx.doi.org/10.1145/3482880.

Abstract:
Recent advances in GPU-based manycore accelerators provide the opportunity to efficiently process large-scale graphs on chip. However, real world graphs have a diverse range of topology and connectivity patterns (e.g., degree distributions) that make the design of input-agnostic hardware architectures a challenge. Network-on-Chip (NoC)-based architectures provide a way to overcome this challenge as the architectural topology can be used to approximately model the expected traffic patterns that emerge from graph application workloads. In this paper, we first study the mix of long- and short-ra…
12

Chen, Yuedan, Guoqing Xiao, Kenli Li, Francesco Piccialli, and Albert Y. Zomaya. "fgSpMSpV: A Fine-grained Parallel SpMSpV Framework on HPC Platforms." ACM Transactions on Parallel Computing 9, no. 2 (2022): 1–29. http://dx.doi.org/10.1145/3512770.

Abstract:
Sparse matrix-sparse vector (SpMSpV) multiplication is one of the fundamental and important operations in many high-performance scientific and engineering applications. The inherent irregularity and poor data locality lead to two main challenges to scaling SpMSpV over high-performance computing (HPC) systems: (i) a large amount of redundant data limits the utilization of bandwidth and parallel resources; (ii) the irregular access pattern limits the exploitation of computing resources. This paper proposes a fine-grained parallel SpMSpV (fgSpMSpV) framework on Sunway TaihuLight supercomputer t…
13

Yin, Qiu-Shi, and Tae-Hee Han. "In-memory Accelerator for Irregular Memory Access to Linked Data Structures." Journal of the Institute of Electronics and Information Engineers 57, no. 5 (2020): 37–44. http://dx.doi.org/10.5573/ieie.2020.57.5.37.

14

Tamizharasan, P. S., and N. Ramasubramanian. "Enhanced Data Parallelism for Irregular Memory Access Optimization on GPU." Applied Mathematics & Information Sciences 13, no. 4 (2019): 595–602. http://dx.doi.org/10.18576/amis/130411.

15

Zheng, Ran, Yuan-dong Liu, and Hai Jin. "Optimizing non-coalesced memory access for irregular applications with GPU computing." Frontiers of Information Technology & Electronic Engineering 21, no. 9 (2020): 1285–301. http://dx.doi.org/10.1631/fitee.1900262.

16

Brandes, Thomas. "HPF Library and Compiler Support for Halos in Data Parallel Irregular Computations." Parallel Processing Letters 10, nos. 2–3 (2000): 189–200. http://dx.doi.org/10.1142/s0129626400000196.

Abstract:
On distributed memory architectures data parallel compilers emulate the global address space by distributing the data onto the processors according to the mapping directives of the user and by generating automatically explicit inter-processor communication. A shadow is additionally allocated local memory to keep on one processor also non-local values of the data that is accessed or defined by this processor. While shadow edges are already well studied for structured grids, this paper focuses on its use for applications with unstructured grids where updates on the shadow edges involve unstructu…
17

Li, Bingchao, Jizeng Wei, Jizhou Sun, Murali Annavaram, and Nam Sung Kim. "An Efficient GPU Cache Architecture for Applications with Irregular Memory Access Patterns." ACM Transactions on Architecture and Code Optimization 16, no. 3 (2019): 1–24. http://dx.doi.org/10.1145/3322127.

18

Labarta, J., E. Ayguadé, J. Oliver, and D. S. Henty. "New OpenMP Directives for Irregular Data Access Loops." Scientific Programming 9, no. 2-3 (2001): 175–83. http://dx.doi.org/10.1155/2001/798505.

Abstract:
Many scientific applications involve array operations that are sparse in nature, ie array elements depend on the values of relatively few elements of the same or another array. When parallelised in the shared-memory model, there are often inter-thread dependencies which require that the individual array updates are protected in some way. Possible strategies include protecting all the updates, or having each thread compute local temporary results which are then combined globally across threads. However, for the extremely common situation of sparse array access, neither of these approaches is pa…
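The two strategies contrasted in this abstract (protecting every update, versus per-thread temporaries combined across threads) can be written down in a few lines of C++ with standard OpenMP. This is a generic sketch of that trade-off, not the new directives the paper proposes; the array-section reduction form assumes OpenMP 4.5 or later.

```cpp
#include <cstddef>
#include <vector>

// Histogram-style sparse update: each iteration scatters into hist[idx[i]].

// Strategy 1: protect every update with an atomic. Simple, but contended
// elements serialize.
void scatter_atomic(const std::vector<int>& idx,
                    const std::vector<double>& w,
                    std::vector<double>& hist) {
    #pragma omp parallel for
    for (long i = 0; i < static_cast<long>(idx.size()); ++i) {
        #pragma omp atomic
        hist[idx[i]] += w[i];
    }
}

// Strategy 2: per-thread temporaries combined afterwards, expressed here
// as an OpenMP array-section reduction. Memory cost grows with thread count.
void scatter_reduction(const std::vector<int>& idx,
                       const std::vector<double>& w,
                       std::vector<double>& hist) {
    double* h = hist.data();
    const std::size_t n = hist.size();
    #pragma omp parallel for reduction(+ : h[0:n])
    for (long i = 0; i < static_cast<long>(idx.size()); ++i) {
        h[idx[i]] += w[i];
    }
}
```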
19

Spring, Carl, and John M. Davis. "Relations of digit naming speed with three components of reading." Applied Psycholinguistics 9, no. 4 (1988): 315–34. http://dx.doi.org/10.1017/s0142716400008031.

Abstract:
The goal of the present investigation was to identify the reading processes that are impaired in children whose digit naming speeds are slow. Continuous digit naming speed was assumed to measure the automaticity with which character codes may be accessed in memory, and the automaticity of this process was assumed to be a prerequisite for the accurate performance of higher level reading processes. In Study I it was found, for children in grades 1 through 3, that digit naming speed was reliably correlated with reading of both irregularly spelled words and pronounceable nonsense words, an…
20

Abed, Khalid H., and Gerald R. Morris. "Improving performance of codes with large/irregular stride memory access patterns via high performance reconfigurable computers." Journal of Parallel and Distributed Computing 73, no. 11 (2013): 1430–38. http://dx.doi.org/10.1016/j.jpdc.2012.07.011.

21

Jones, Dylan, Clare Madden, and Chris Miles. "Privileged Access by Irrelevant Speech to Short-term Memory: The Role of Changing State." Quarterly Journal of Experimental Psychology Section A 44, no. 4 (1992): 645–69. http://dx.doi.org/10.1080/14640749208401304.

Abstract:
Memory for visually presented items is impaired by speech that is played as an irrelevant background. The paper presents the view that changing state of the auditory material is an important prerequisite for this disruption. Four experiments studied the effects of sounds varying in complexity in an attempt to establish which features of changing state in the auditory signal lead to diminished recall. Simple unvarying or repetitive speech sounds were not sufficient to induce the irrelevant speech effect (Experiment 1): in addition, simple analogues of speech, possessing regular or irregular env…
22

Segura, Albert, Jose Maria Arnau, and Antonio Gonzalez. "Irregular accesses reorder unit: improving GPGPU memory coalescing for graph-based workloads." Journal of Supercomputing, July 18, 2022. http://dx.doi.org/10.1007/s11227-022-04621-1.

Abstract:
GPGPU architectures have become the dominant platform for massively parallel workloads, delivering high performance and energy efficiency for popular applications such as machine learning, computer vision or self-driving cars. However, irregular applications, such as graph processing, fail to fully exploit GPGPU resources due to their divergent memory accesses that saturate the memory hierarchy. To reduce the pressure on the memory subsystem for divergent memory-intensive applications, programmers must take into account SIMT execution model and memory coalescing in GPGPUs, devoting sig…
23

Choudhury, Dwaipayan, Lizhi Xiang, Aravind Sukumaran Rajam, Ananth Kalyanaraman, and Partha Pratim Pande. "Accelerating Graph Computations on 3D NoC-enabled PIM Architectures." ACM Transactions on Design Automation of Electronic Systems, October 7, 2022. http://dx.doi.org/10.1145/3564290.

Abstract:
Graph application workloads are dominated by random memory accesses with poor locality. To tackle the irregular and sparse nature of computation, ReRAM-based Processing-in-Memory (PIM) architectures have been proposed recently. Most of these ReRAM architecture designs have focused on mapping graph computations into a set of multiply-and-accumulate (MAC) operations. ReRAMs also offer a key advantage in reducing memory latency between cores and memory by allowing for processing-in-memory (PIM). However, when implemented on a ReRAM-based manycore architecture, graph applications still pose two ke…
24

Barreda, Maria, Manuel F. Dolz, and M. Asunción Castaño. "Convolutional neural nets for estimating the run time and energy consumption of the sparse matrix-vector product." International Journal of High Performance Computing Applications, August 26, 2020, 109434202095319. http://dx.doi.org/10.1177/1094342020953196.

Abstract:
Modeling the performance and energy consumption of the sparse matrix-vector product (SpMV) is essential to perform off-line analysis and, for example, choose a target computer architecture that delivers the best performance-energy consumption ratio. However, this task is especially complex given the memory-bounded nature and irregular memory accesses of the SpMV, mainly dictated by the input sparse matrix. In this paper, we propose a Machine Learning (ML)-driven approach that leverages Convolutional Neural Networks (CNNs) to provide accurate estimations of the performance and energy consumptio…
25

Liu, Wenjie, Xubin He, and Qing Liu. "Exploring Memory Access Similarity to Improve Irregular Application Performance for Distributed Hybrid Memory Systems." IEEE Transactions on Parallel and Distributed Systems, 2022, 1–12. http://dx.doi.org/10.1109/tpds.2022.3227544.

26

Brabazon, Tara. "A Red Light Sabre to Go, and Other Histories of the Present." M/C Journal 2, no. 4 (1999). http://dx.doi.org/10.5204/mcj.1761.

Abstract:
If I find out that you have bought a $90 red light sabre, Tara, well there's going to be trouble. -- Kevin Brabazon A few Saturdays ago, my 71-year old father tried to convince me of imminent responsibilities. As I am considering the purchase of a house, there are mortgages, bank fees and years of misery to endure. Unfortunately, I am not an effective Big Picture Person. The lure of the light sabre is almost too great. For 30 year old Generation Xers like myself, it is more than a cultural object. It is a textual anchor, and a necessary component to any future history of the present. Revelling…