Contents

Journal articles
Dissertations / Theses
Books
Conference papers

Academic literature on the topic 'Systolic array circuits'

Author: Grafiati

Published: 4 June 2021

Last updated: 25 January 2023

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Systolic array circuits.'

Journal articles on the topic "Systolic array circuits"

1

Al-Rabadi, Anas N. "Reversible Systolic Arrays: m-ary Bijective Single-Instruction Multiple-Data (SIMD) Architectures and Their Quantum Circuits." Journal of Circuits, Systems and Computers 17, no. 04 (2008): 729–71. http://dx.doi.org/10.1142/s0218126608004472.

Abstract:

A new type of m-ary systolic array, called the reversible systolic array, is introduced in this paper. The realizations and computations of the new type of systolic array in m-ary quantum systolic architectures are also introduced. A systolic array is an example of a single-instruction multiple-data (SIMD) machine in which each processing element (PE) performs a single simple operation. Systolic devices provide inexpensive but massive computation power; they are cost-effective, high-performance, special-purpose systems with a wide range of applications, such as solving regular, compute-bound problems that involve repetitive operations on large arrays of data. As in the classical case, information in a reversible and quantum systolic circuit flows between cells in a pipelined fashion, and communication with the outside world occurs only at the boundary cells. Since the basic PEs used in the construction of arithmetic systolic arrays are add–multiply cells, the results introduced in this paper are general and apply to a very wide range of add–multiply-based systolic arrays. Since reducing power consumption is a major requirement for circuit design in future technologies, such as quantum computing, the main features of several future technologies will include reversibility. Consequently, the new systolic circuits can play an important role in the design of future circuits that consume minimal power. It is also shown that the new systolic arrays maintain a high level of regularity while exhibiting the new fundamental properties of bijectivity (reversibility) and quantum superposition. These new properties will be essential in performing super-fast arithmetic-intensive computations that are fundamental in several future applications, such as multi-dimensional quantum signal processing (QSP).

2

Ho, H., V. Szwarc, and T. Kwasniewski. "A Reconfigurable Systolic Array Architecture for Multicarrier Wireless and Multirate Applications." International Journal of Reconfigurable Computing 2009 (2009): 1–14. http://dx.doi.org/10.1155/2009/529512.

Abstract:

A reconfigurable systolic array (RSA) architecture that supports the realization of DSP functions for multicarrier wireless and multirate applications is presented. The RSA consists of coarse-grained processing elements that can be configured as complex DSP functions that are the basic building blocks of Polyphase-FIR filters, phase shifters, DFTs, and Polyphase-DFT circuits. The homogeneous characteristic of the RSA architecture, where each reconfigurable processing element (PE) cell is connected to its nearest neighbors via configurable switch (SW) elements, enables array expansion for parallel processing and facilitates time sharing computation of high-throughput data by individual PEs. For DFT circuit configurations, an algorithmic optimization technique has been employed to reduce the overall number of vector-matrix products to be mapped on the RSA. The hardware complexity and throughput of the RSA-based DFT structures have been evaluated and compared against several conventional modular FFT realizations. Designs and circuit implementations of the PE cell and several RSAs configured as DFT and Polyphase filter circuits are also presented. The RSA architecture offers significant flexibility and computational capacity for applications that require real time reconfiguration and high-density computing.

3

Rayudu, Kurada Verra Bhoga Vasantha, Dhananjay Ramachandra Jahagirdar, and Patri Srihari Rao. "Design and testing of systolic array multiplier using fault injecting schemes." Computer Science and Information Technologies 3, no. 1 (2022): 1–9. http://dx.doi.org/10.11591/csit.v3i1.p1-9.

Abstract:

Low-power circuit design is now of major importance for data transmission and information processing across a variety of system designs. The systolic array multiplier is one of the main multipliers used for synchronizing data transmission, and low-power designs are mostly used to increase performance and reduce hardware complexity. Among all mathematical operations, the multiplier plays a major role because it processes more information, and it brings high circuit complexity in existing irreversible designs. We develop a systolic array multiplier using reversible gates for low-power appliances, and we calculate the faults and coverage of the reversible logic. To improve the design further, we introduce a reversible logic gate and test the reversible systolic array multiplier using the fault-injection method of the built-in self-test block observer (BILBO), in which all corner cases are covered, showing 97% coverage compared with existing designs. Finally, Xilinx ISE 14.7 was used for synthesis and simulation, and the resulting parameters were compared with those of existing designs, showing higher efficiency.

4

Yamazaki, Ichitaro, Jakub Kurzak, Piotr Luszczek, and Jack Dongarra. "Design and Implementation of a Large Scale Tree-Based QR Decomposition Using a 3D Virtual Systolic Array and a Lightweight Runtime." Parallel Processing Letters 24, no. 04 (2014): 1442004. http://dx.doi.org/10.1142/s0129626414420043.

Abstract:

A systolic array provides an alternative computing paradigm to the von Neumann architecture. Though its hardware implementation has failed as a paradigm to design integrated circuits in the past, we are now discovering that the systolic array as a software virtualization layer can lead to an extremely scalable execution paradigm. To demonstrate this scalability, in this paper, we design and implement a 3D virtual systolic array to compute a tile QR decomposition of a tall-and-skinny dense matrix. Our implementation is based on a state-of-the-art algorithm that factorizes a panel based on a tree-reduction. Freed from the constraint of a planar layout, we present a three-dimensional virtual systolic array architecture for this algorithm. Using a runtime developed as a part of the Parallel Ultra Light Systolic Array Runtime (PULSAR) project, we demonstrate on a Cray-XT5 machine how our virtual systolic array can be mapped to a large-scale machine and obtain excellent parallel performance. This is an important contribution since such a QR decomposition is used, for example, to compute a least squares solution of an overdetermined system, which arises in many scientific and engineering problems.

5

Raut, R., B. B. Bhattacharyya, and S. M. Faruque. "A discrete Fourier transform using switched capacitor circuits in systolic array architecture." IEEE Transactions on Circuits and Systems 37, no. 12 (1990): 1578–80. http://dx.doi.org/10.1109/31.101284.

6

Torralba, A. "A systolic array with applications to image processing and wire-routing in VLSI circuits." Parallel Computing 17, no. 1 (1991): 85–93. http://dx.doi.org/10.1016/s0167-8191(05)80020-1.

7

Kondapalli, Soumya, Arjuna Madanayake, and Len Bruton. "Digital Architectures for UWB Beamforming Using 2D IIR Spatio-Temporal Frequency-Planar Filters." International Journal of Antennas and Propagation 2012 (2012): 1–19. http://dx.doi.org/10.1155/2012/234263.

Abstract:

A design method and an FPGA-based prototype implementation of massively parallel systolic-array VLSI architectures for 2nd-order and 3rd-order frequency-planar beam plane-wave filters are proposed. Frequency-planar beamforming enables highly-directional UWB RF beams at low computational complexity compared to digital phased-array feed techniques. The array factors of the proposed realizations are simulated and both high-directional selectivity and UWB performance are demonstrated. The proposed architectures operate using 2's complement finite precision digital arithmetic. The real-time throughput is maximized using look-ahead optimization applied locally to each processor in the proposed massively-parallel realization of the filter. From sensitivity theory, it is shown that 15 and 19-bit precision for filter coefficients results in better than 3% error for 2nd- and 3rd-order beam filters. Folding together with K-times multiplexing is applied to the proposed beam architectures such that throughput can be traded for K-fold lower complexity for realizing the 2-D fan filter banks. Prototype FPGA circuit implementations of these filters are proposed using a Virtex 6 xc6vsx475t-2ff1759 device. The FPGA-prototyped architectures are evaluated using area (A), critical path delay (T), and metrics AT and AT^2. The L2 error energy is used as a metric for evaluating fixed-point noise levels and the accuracy of the finite precision digital arithmetic circuits.

8

Andonov, R., P. Quinton, S. Rajopadhye, and D. Wilde. "A Shift Register-Based Systolic Array for the Unbounded Knapsack Problem." Parallel Processing Letters 05, no. 02 (1995): 251–62. http://dx.doi.org/10.1142/s0129626495000230.

Abstract:

We present a shift register-based systolic array for a class of recurrences with dynamic dependencies, called knapsack problem recurrences. All previous arrays or parallel implementations led either to low efficiency or to complicated control. To the best of our knowledge, the proposed design is the first realistic, purely systolic, and optimal array for this pseudo-polynomial, NP-hard problem. The key feature of the array is that it requires almost no control circuitry.

9

Rasoulinezhad, Seyedramin, Esther Roorda, Steve Wilton, Philip H. W. Leong, and David Boland. "Rethinking Embedded Blocks for Machine Learning Applications." ACM Transactions on Reconfigurable Technology and Systems 15, no. 1 (2022): 1–30. http://dx.doi.org/10.1145/3491234.

Abstract:

The underlying goal of FPGA architecture research is to devise flexible substrates that implement a wide variety of circuits efficiently. Contemporary FPGA architectures have been optimized to support networking, signal processing, and image processing applications through high-precision digital signal processing (DSP) blocks. The recent emergence of machine learning has created a new set of demands characterized by: (1) higher computational density and (2) low precision arithmetic requirements. With the goal of exploring this new design space in a methodical manner, we first propose a problem formulation involving computing nested loops over multiply-accumulate (MAC) operations, which covers many basic linear algebra primitives and standard deep neural network (DNN) kernels. A quantitative methodology for deriving efficient coarse-grained compute block architectures from benchmarks is then proposed together with a family of new embedded blocks, called MLBlocks. An MLBlock instance includes several multiply-accumulate units connected via flexible routing, where each configuration performs a few parallel dot-products in a systolic array fashion. This architecture is parameterized with support for different data movements, reuse, and precisions, utilizing a columnar arrangement that is compatible with existing FPGA architectures. On synthetic benchmarks, we demonstrate that for 8-bit arithmetic, MLBlocks offer 6× improved performance over the commercial Xilinx DSP48E2 architecture with smaller area and delay; and for time-multiplexed 16-bit arithmetic, they achieve 2× higher performance per area with the same area and frequency. All source codes and data, along with documents to reproduce all the results in this article, are available at http://github.com/raminrasoulinezhad/MLBlocks.

10

Roorda, Esther, Seyedramin Rasoulinezhad, Philip H. W. Leong, and Steven J. E. Wilton. "FPGA Architecture Exploration for DNN Acceleration." ACM Transactions on Reconfigurable Technology and Systems 15, no. 3 (2022): 1–37. http://dx.doi.org/10.1145/3503465.

Abstract:

Recent years have seen an explosion of machine learning applications implemented on Field-Programmable Gate Arrays (FPGAs). FPGA vendors and researchers have responded by updating their fabrics to more efficiently implement machine learning accelerators, including innovations such as enhanced Digital Signal Processing (DSP) blocks and hardened systolic arrays. Evaluating architectural proposals is difficult, however, due to the lack of publicly available benchmark circuits. This paper addresses this problem by presenting an open-source benchmark circuit generator that creates realistic DNN-oriented circuits for use in FPGA architecture studies. Unlike previous generators, which create circuits that are agnostic of the underlying FPGA, our circuits explicitly instantiate embedded blocks, allowing for meaningful comparison of recent architectural proposals without the need for a complete inference computer-aided design (CAD) flow. Our circuits are compatible with the VTR CAD suite, allowing for architecture studies that investigate routing congestion and other low-level architectural implications. In addition to addressing the lack of machine learning benchmark circuits, the architecture exploration flow that we propose allows for a more comprehensive evaluation of FPGA architectures than traditional static benchmark suites. We demonstrate this through three case studies which illustrate how realistic benchmark circuits can be generated to target different heterogeneous FPGAs.

Dissertations / Theses on the topic "Systolic array circuits"

1

Diamond, Mitchell S. "A self-timed implementation of the bi-way sorter systolic array processor." Online version of thesis, 1993. http://hdl.handle.net/1850/11957.

2

Ismailoglu, Ayse Neslin. "Asynchronous Design of Systolic Array Architectures in CMOS." PhD thesis, METU, 2008. http://etd.lib.metu.edu.tr/upload/12609443/index.pdf.

Abstract:

In this study, delay-insensitive asynchronous circuit design style has been adopted to systolic array architectures to exploit the benefits of both techniques for improved throughput. A delay-insensitivity verification analysis method employing symbolic delays is proposed for bit-level pipelined asynchronous circuits. The proposed verification method allows data-dependent early output evaluation to co-exist with robust delay-insensitive circuit behavior in pipelined architectures such as systolic arrays. Regardless of the length of the pipeline, delay-insensitivity verification of a systolic array with early output evaluation paths in one dimension is reduced to analysis of three adjacent systoles for eight possible early/late output evaluation scenarios. Analyzing both combinational and sequential parts concurrently, delay-insensitivity violations are located and corrected at structural level, without diminishing the early output evaluation benefits. Since symbolic delays are used without imposing any timing constraints on the environment, the method is technology independent and robust against all physical and environmental variations. To demonstrate the verification method, adders are selected for being at the core of data processing systems. Two asynchronous adder topologies in the delay-insensitive dual-rail threshold logic style, having data-dependent early carry evaluation paths, are converted into bit-level pipelined systolic arrays. On these adders, data-dependent delay-insensitivity violations are detected and resolved using the proposed verification technique. The modified adders achieved the targeted O(log₂ n) average completion time and, as a result of bit-level pipelining, nearly constant throughput against increased bit-length. The delay-insensitivity verification method could further be extended to handle more early output evaluation paths in multi-dimension.

3

Cirovic, Branislav. "Equivalence relations of synchronous schemes." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape3/PQDD_0031/NQ62448.pdf.

4

Zhou, Bing Bing. "Systolic architectures for parallel implementation of digital filters." PhD thesis, 1988. http://hdl.handle.net/1885/138419.

5

Strazdins, Peter Edward. "Control structures for mesh-connected networks." PhD thesis, 1990. http://hdl.handle.net/1885/138431.

6

Lee, Louis Wai-Fung. "Fully efficient pipelined VLSI arrays for solving toeplitz matrices." Thesis, 1991. http://hdl.handle.net/1957/37297.

Abstract:

Fully efficient systolic arrays for the solution of Toeplitz matrices using the Schur algorithm [1] have been obtained. By applying the clustering mapping method [2], the complexity of the algorithm is O(n) and it requires n/2 processing elements, as opposed to the n processing elements of arrays developed elsewhere [1]. The motivation of this thesis is to obtain efficient pipeline arrays by using the synthesis procedure to implement Toeplitz matrix solution. Furthermore, we will examine pipeline structures for the Toeplitz system factorization and back-substitution by obtaining clustering and Multi-Rate Array structures. These methods reduce the number of processing elements and enhance the computational speed. A comparison of these methods with other methods, and their advantages, will be presented.

Graduation date: 1992

7

Badyal, Rajeev. "VLSI implementation of adaptive BIT/serial IIR filters." Thesis, 1992. http://hdl.handle.net/1957/36521.

Abstract:

A new structure for the implementation of a bit/serial adaptive IIR filter is presented. The bit-level system consists of gated full adders for the arithmetic unit and data latches for the data path. This approach allows the recursive operation of the IIR filter to be implemented without any global interconnections and with minimal delay time, chip area, and I/O pins. The coefficients of the filter can be updated serially in real time for time-invariant and adaptive filtering. A fourth-order bit/serial IIR filter is implemented in a 2 micron CMOS technology clocked at 55 MHz.

Graduation date: 1992

Books on the topic "Systolic array circuits"

1

Schreiber, Robert. Bidiagonalization and symmetric tridiagonalization by systolic arrays. Research Institute for Advanced Computer Science, NASA Ames Research Center, 1988.

2

A systolic array optimizing compiler. Kluwer Academic, 1989.

3

Systolic computations. Kluwer Academic Publishers, 1992.

4

International Workshop on Systolic Arrays (1st 1986 Oxford, England). Sistolicheskie struktury. "Radio i svi͡a︡zʹ", 1993.

5

Megson, G. M. An introduction to systolic algorithm design. OUP, 1992.

6

An introduction to systolic algorithm design. Clarendon Press, 1992.

7

Tomás, Lang, ed. Matrix computations on systolic-type arrays. Kluwer Academic Publishers, 1992.

8

Al-Rabadi, Anas N. Parallel computing using reversible quantum systolic networks and their super-fast array entanglement. Nova Science Publishers, 2011.

9

International Workshop on Systolic Arrays (1st 1986 Oxford). Systolic arrays: Papers presented at the first International Workshop on Systolic Arrays, Oxford, 2-4 July 1986. Hilger, 1987.

10

International Workshop on Systolic Arrays (1st 1986 Oxford). Systolic arrays: Papers presented at the first International Workshop on Systolic Arrays, Oxford, 2-4 July 1986. Hilger, 1987.

Conference papers on the topic "Systolic array circuits"

1

Moore, W. R., and V. Bawa. "Testability of a VLSI Systolic Array." In 11th European Solid State Circuits Conference. IEEE, 1985. http://dx.doi.org/10.1109/esscirc.1985.5468108.

2

Madanayake, Arjuna, and Len T. Bruton. "Systolic-array 3D wave-digital beam filters." In APCCAS 2010-2010 IEEE Asia Pacific Conference on Circuits and Systems. IEEE, 2010. http://dx.doi.org/10.1109/apccas.2010.5774972.

3

Ni, Ziying, Dur-E.-Shahwar Kundi, Maire O'Neill, and Weiqiang Liu. "High-Performance Systolic Array Montgomery Multiplier for SIKE." In 2021 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, 2021. http://dx.doi.org/10.1109/iscas51556.2021.9401062.

4

Senoo, Takeshi, Akira Jinguji, Ryosuke Kuramochi, and Hiroki Nakahara. "A Multilayer Perceptron Training Accelerator using Systolic Array." In 2021 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS). IEEE, 2021. http://dx.doi.org/10.1109/apccas51387.2021.9687773.

5

Zhang, Jiaxi, Wentai Zhang, Guojie Luo, Xuechao Wei, Yun Liang, and Jason Cong. "Frequency Improvement of Systolic Array-Based CNNs on FPGAs." In 2019 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, 2019. http://dx.doi.org/10.1109/iscas.2019.8702071.

6

Ho, H., V. Szwarc, and T. Kwasniewski. "A reconfigurable systolic array SoC design for multicarrier wireless applications." In 2008 51st IEEE International Midwest Symposium on Circuits and Systems (MWSCAS). IEEE, 2008. http://dx.doi.org/10.1109/mwscas.2008.4616886.

7

El ghany, Mohamed A. Abd, Aly E. Salama, and Ahmed H. Khalil. "Design and Implementation of FPGA-based Systolic Array for LZ Data Compression." In 2007 IEEE International Symposium on Circuits and Systems. IEEE, 2007. http://dx.doi.org/10.1109/iscas.2007.378644.

8

Meher, Pramod K. "Merged-Cascaded Systolic Array for VLSI Implementation of Discrete Wavelet Transform." In APCCAS 2006. 2006 IEEE Asia Pacific Conference on Circuits and Systems. IEEE, 2006. http://dx.doi.org/10.1109/apccas.2006.342489.

9

Zeng, Yixuan, Heming Sun, Jiro Katto, and Yibo Fan. "Accelerating Convolutional Neural Network Inference Based on a Reconfigurable Sliced Systolic Array." In 2021 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, 2021. http://dx.doi.org/10.1109/iscas51556.2021.9401287.

10

Liu, Wenjian, Jun Lin, and Zhongfeng Wang. "USCA: A Unified Systolic Convolution Array Architecture for Accelerating Sparse Neural Network." In 2019 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, 2019. http://dx.doi.org/10.1109/iscas.2019.8702132.
