Journal articles: "Architecture dataflow"

1

Kavi, K. M., and B. Shirazi. "Dataflow architecture." IEEE Potentials 11, no. 3 (October 1992): 27–30. http://dx.doi.org/10.1109/45.207108.

2

Veen, Arthur H. "Dataflow machine architecture." ACM Computing Surveys 18, no. 4 (December 11, 1986): 365–96. http://dx.doi.org/10.1145/27633.28055.

3

Rockey, Mark. "The dataflow architecture." ACM SIGARCH Computer Architecture News 13, no. 4 (September 1985): 8–14. http://dx.doi.org/10.1145/381752.381754.

4

Šilc, Jurij, and Borut Robič. "Synchronous dataflow-based architecture." Microprocessing and Microprogramming 27, no. 1-5 (August 1989): 315–22. http://dx.doi.org/10.1016/0165-6074(89)90065-3.

5

Kao, Hsu-Yu, Xin-Jia Chen, and Shih-Hsu Huang. "Convolver Design and Convolve-Accumulate Unit Design for Low-Power Edge Computing." Sensors 21, no. 15 (July 27, 2021): 5081. http://dx.doi.org/10.3390/s21155081.

Abstract:

Convolution operations have a significant influence on the overall performance of a convolutional neural network, especially in edge-computing hardware design. In this paper, we propose a low-power signed convolver hardware architecture that is well suited for low-power edge computing. The basic idea of the proposed convolver design is to combine all multipliers’ final additions and their corresponding adder tree to form a partial product matrix (PPM) and then to use the reduction tree algorithm to reduce this PPM. As a result, compared with the state-of-the-art approach, our convolver design not only saves a lot of carry propagation adders but also saves one clock cycle per convolution operation. Moreover, the proposed convolver design can be adapted for different dataflows (including input stationary dataflow, weight stationary dataflow, and output stationary dataflow). According to dataflows, two types of convolve-accumulate units are proposed to perform the accumulation of convolution results. The results show that, compared with the state-of-the-art approach, the proposed convolver design can save 15.6% power consumption. Furthermore, compared with the state-of-the-art approach, on average, the proposed convolve-accumulate units can reduce 15.7% power consumption.

6

Teifel, J., and R. Manohar. "An asynchronous dataflow FPGA architecture." IEEE Transactions on Computers 53, no. 11 (November 2004): 1376–92. http://dx.doi.org/10.1109/tc.2004.88.

7

Mihelič, Jurij, and Uroš Čibej. "EXPERIMENTAL COMPARISON OF MATRIX ALGORITHMS FOR DATAFLOW COMPUTER ARCHITECTURE." Acta Electrotechnica et Informatica 18, no. 3 (September 27, 2018): 47–56. http://dx.doi.org/10.15546/aeei-2018-0025.

8

Fabiani, Erwan. "Experiencing a Problem-Based Learning Approach for Teaching Reconfigurable Architecture Design." International Journal of Reconfigurable Computing 2009 (2009): 1–11. http://dx.doi.org/10.1155/2009/923415.

Abstract:

This paper presents the “reconfigurable computing” teaching part of a computer science master course (first year) on parallel architectures. The practical work sessions of this course rely on active pedagogy using problem-based learning, focused on designing a reconfigurable architecture for the implementation of an application class of image processing algorithms. We show how the successive steps of this project permit the student to experiment with several fundamental concepts of reconfigurable computing at different levels. Specific experiments include exploitation of architectural parallelism, dataflow and communicating component-based design, and configurability-specificity tradeoffs.

9

Guo, Jia Rong, Ran Feng, Zhuo Bi, and Mei Hua Xu. "A Compiler for Ladder Diagram to Multi-Core Dataflow Architecture." Advanced Materials Research 462 (February 2012): 368–74. http://dx.doi.org/10.4028/www.scientific.net/amr.462.368.

Abstract:

Multi-core and dataflow architectures, a recent focus of parallel-computing research, can satisfy the high-performance requirements of PLC processors by exploiting parallelism in the program. However, a compiler that translates ladder diagram programs into the instructions of such an architecture has not yet been developed. To address this problem, the paper presents a compiler for editing a ladder diagram, one of the PLC programming languages, and compiling it into instructions for a multi-core function-level dataflow architecture. The compiler, written in Java, uses a row doubly linked list as the internal representation of a ladder diagram and a logic binary tree as the intermediate representation during compilation, because the binary tree resembles a function-level dataflow graph.

10

Hu, Weiming. "Dataflow architecture for EEG patient monitor." ACM SIGARCH Computer Architecture News 13, no. 2 (June 1985): 3–10. http://dx.doi.org/10.1145/1296935.1296936.

11

Carlstrom, J., and T. Boden. "Synchronous dataflow architecture for network processors." IEEE Micro 24, no. 5 (September 2004): 10–18. http://dx.doi.org/10.1109/mm.2004.57.

12

Gao, G. R. "An Efficient Hybrid Dataflow Architecture Model." Journal of Parallel and Distributed Computing 19, no. 4 (December 1993): 293–307. http://dx.doi.org/10.1006/jpdc.1993.1113.

13

Ghosal, D., and L. N. Bhuyan. "Performance evaluation of a dataflow architecture." IEEE Transactions on Computers 39, no. 5 (May 1990): 615–27. http://dx.doi.org/10.1109/12.53575.

14

Vasilev, Vladimir S., Alexander I. Legalov, and Sergey V. Zykov. "The System for Transforming the Code of Dataflow Programs into Imperative." Modeling and Analysis of Information Systems 28, no. 2 (June 11, 2021): 198–214. http://dx.doi.org/10.18255/1818-1015-2021-2-198-214.

Abstract:

Functional dataflow programming languages are designed to create parallel portable programs. The source code of such programs is translated into a set of graphs that reflect information and control dependencies. Such programs are mainly executed by interpretation, which prevents efficient computation on real parallel computing systems and leads to poor performance. Running programs directly on existing computing systems requires specific optimization and transformation methods that take into account the features of both the programming language and the architecture of the system. The von Neumann architecture is currently the most common; however, parallel programming for it is in most cases carried out using imperative languages with a static type system. For different architectures of parallel computing systems, there are various approaches to writing parallel programs. Transforming dataflow parallel programs into imperative programs makes it possible to form a framework of imperative code fragments that directly express sequential calculations. In the future, this framework can be adapted to a specific parallel architecture. The paper considers an approach to this type of transformation, which consists of identifying fragments of dataflow parallel programs as templates that are subsequently replaced by equivalent fragments of imperative languages. The proposed transformation methods generate program code to which various optimizing transformations can later be applied, including parallelization that takes the target architecture into account.

15

Tibaldi, Mattia, Gianluca Palermo, and Christian Pilato. "Dynamically-Tunable Dataflow Architectures Based on Markov Queuing Models." Electronics 11, no. 4 (February 12, 2022): 555. http://dx.doi.org/10.3390/electronics11040555.

Abstract:

Dataflow architectures are fundamental to achieve high performance in data-intensive applications. They must be optimized to elaborate input data arriving at an expected rate, which is not always constant. While worst-case designs can significantly increase hardware resources, more optimistic solutions fail to sustain execution phases with high throughput, leading to system congestion or even computational errors. We present an architecture to monitor and control dataflow architectures that leverage approximate variants to trade off accuracy and latency of the computational processes. Our microarchitecture features online prediction based on queuing models to estimate the response time of the system and select the proper variant to meet the target throughput, enabling the creation of dynamically-tunable systems.
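
The variant-selection idea in this abstract can be illustrated with a minimal sketch. It assumes the simplest Markov queue (M/M/1), whose mean response time is 1 / (mu - lambda); the paper's actual queuing model, its predictor, and its hardware variants are not given in the abstract, so the `Variant` type, the rates, and the accuracy figures below are all hypothetical.

```python
import math
from typing import NamedTuple, Optional, Sequence


class Variant(NamedTuple):
    """One implementation of a dataflow stage (all figures are hypothetical)."""
    name: str
    service_rate: float  # mu: items processed per second
    accuracy: float      # relative output accuracy in [0, 1]


def mm1_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean time an item spends in an M/M/1 system: 1 / (mu - lambda).

    Returns infinity when the queue is unstable (lambda >= mu).
    """
    if arrival_rate >= service_rate:
        return math.inf
    return 1.0 / (service_rate - arrival_rate)


def select_variant(
    variants: Sequence[Variant],
    arrival_rate: float,
    max_response_time: float,
) -> Optional[Variant]:
    """Pick the most accurate variant whose predicted response time fits the budget."""
    feasible = [
        v for v in variants
        if mm1_response_time(arrival_rate, v.service_rate) <= max_response_time
    ]
    # No feasible variant means even the fastest one would congest the system.
    return max(feasible, key=lambda v: v.accuracy, default=None)


# Hypothetical variants: slower means more accurate.
VARIANTS = [
    Variant("exact", service_rate=100.0, accuracy=1.00),
    Variant("approx_a", service_rate=150.0, accuracy=0.95),
    Variant("approx_b", service_rate=300.0, accuracy=0.80),
]
```

With these assumed figures, a rising arrival rate pushes the selection from the exact variant toward the faster approximate ones, which is the accuracy-for-latency trade-off the abstract describes.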

16

Mazloom, Bita, Shashidhar Mysore, Mohit Tiwari, Banit Agrawal, and Tim Sherwood. "Dataflow Tomography." ACM Transactions on Architecture and Code Optimization 9, no. 1 (March 2012): 1–26. http://dx.doi.org/10.1145/2133382.2133385.

17

Geilen, Marc. "Synchronous dataflow scenarios." ACM Transactions on Embedded Computing Systems 10, no. 2 (December 2010): 1–31. http://dx.doi.org/10.1145/1880050.1880052.

18

Edwards, Stephen A., Richard Townsend, Martha Barker, and Martha A. Kim. "Compositional Dataflow Circuits." ACM Transactions on Embedded Computing Systems 18, no. 1 (February 28, 2019): 1–27. http://dx.doi.org/10.1145/3274280.

19

Kavi, Krishna M., and A. R. Hurson. "Design of cache memories for dataflow architecture." Journal of Systems Architecture 44, no. 9-10 (June 1998): 657–74. http://dx.doi.org/10.1016/s1383-7621(97)00012-x.

20

Canning, James T. "A hands-on dataflow architecture/programming course." ACM SIGCSE Bulletin 23, no. 2 (May 1991): 29–32. http://dx.doi.org/10.1145/122106.122112.

21

Iannucci, R. A. "Toward a dataflow/von Neumann hybrid architecture." ACM SIGARCH Computer Architecture News 16, no. 2 (May 17, 1988): 131–40. http://dx.doi.org/10.1145/633625.52416.

22

Mercaldi, Martha, Steven Swanson, Andrew Petersen, Andrew Putnam, Andrew Schwerin, Mark Oskin, and Susan J. Eggers. "Instruction scheduling for a tiled dataflow architecture." ACM SIGOPS Operating Systems Review 40, no. 5 (October 20, 2006): 141–50. http://dx.doi.org/10.1145/1168917.1168876.

23

Mercaldi, Martha, Steven Swanson, Andrew Petersen, Andrew Putnam, Andrew Schwerin, Mark Oskin, and Susan J. Eggers. "Instruction scheduling for a tiled dataflow architecture." ACM SIGPLAN Notices 41, no. 11 (November 2006): 141–50. http://dx.doi.org/10.1145/1168918.1168876.

24

Mercaldi, Martha, Steven Swanson, Andrew Petersen, Andrew Putnam, Andrew Schwerin, Mark Oskin, and Susan J. Eggers. "Instruction scheduling for a tiled dataflow architecture." ACM SIGARCH Computer Architecture News 34, no. 5 (October 20, 2006): 141–50. http://dx.doi.org/10.1145/1168919.1168876.

25

Wang, Lei, and Ying Tan. "The researches in fault-tolerant dataflow architecture." Journal of Computer Science and Technology 6, no. 4 (October 1991): 395–98. http://dx.doi.org/10.1007/bf02948401.

26

Figueroa, Pablo. "Insights on the Design of InTml." Presence: Teleoperators and Virtual Environments 19, no. 2 (April 1, 2010): 118–30. http://dx.doi.org/10.1162/pres.19.2.118.

Abstract:

This paper describes some details about the design of InTml, the interaction techniques markup language. We explain three main elements in its architecture: a simple mixed reality (MR) based component model, a communication model between components that allows fusion of multimodal information at a fine level of granularity, and an indirection mechanism for dataflows that is useful to keep state inside a dataflow. We also briefly discuss the advantages we have found in the use of formal methods, model driven development, and encapsulation mechanisms. The purpose of this description is to make explicit the design rationale of these mechanisms, which may be fruitful for other developments in our field.

27

Yazdanpanah, Fahimeh, Carlos Alvarez-Martinez, Daniel Jimenez-Gonzalez, and Yoav Etsion. "Hybrid Dataflow/von-Neumann Architectures." IEEE Transactions on Parallel and Distributed Systems 25, no. 6 (June 2014): 1489–509. http://dx.doi.org/10.1109/tpds.2013.125.

28

Lee, Edward Ashford, and Jeffery C. Bier. "Architectures for statically scheduled dataflow." Journal of Parallel and Distributed Computing 10, no. 4 (December 1990): 333–48. http://dx.doi.org/10.1016/0743-7315(90)90034-m.

29

Park, Jaeyun, Naehyuck Chang, and Wook Hyun Kwon. "An architecture of dataflow LSP for programmable controllers." IFAC Proceedings Volumes 24, no. 7 (September 1991): 243–48. http://dx.doi.org/10.1016/b978-0-08-041699-1.50045-8.

30

Vo, Huy T., Daniel K. Osmari, Brian Summa, João L. D. Comba, Valerio Pascucci, and Cláudio T. Silva. "Streaming-Enabled Parallel Dataflow Architecture for Multicore Systems." Computer Graphics Forum 29, no. 3 (August 12, 2010): 1073–82. http://dx.doi.org/10.1111/j.1467-8659.2009.01704.x.

31

Kavi, K. M., R. Giorgi, and J. Arul. "Scheduled dataflow: execution paradigm, architecture, and performance evaluation." IEEE Transactions on Computers 50, no. 8 (August 2001): 834–46. http://dx.doi.org/10.1109/tc.2001.947011.

32

Jo, Jihyuck, Suchang Kim, and In-Cheol Park. "Energy-Efficient Convolution Architecture Based on Rescheduled Dataflow." IEEE Transactions on Circuits and Systems I: Regular Papers 65, no. 12 (December 2018): 4196–207. http://dx.doi.org/10.1109/tcsi.2018.2840092.

33

Sakai, S., Y. Yamaguchi, K. Hiraki, Y. Kodama, and T. Yuba. "An architecture of a dataflow single chip processor." ACM SIGARCH Computer Architecture News 17, no. 3 (June 1989): 46–53. http://dx.doi.org/10.1145/74926.74931.

34

Kavi, K. M., R. Giorgi, and J. Arul. "Scheduled dataflow: execution paradigm, architecture, and performance evaluation." IEEE Transactions on Computers 50, no. 8 (2001): 834–46. http://dx.doi.org/10.1109/12.947003.

35

Tan, Xu, Xiao-Chun Ye, Xiao-Wei Shen, Yuan-Chao Xu, Da Wang, Lunkai Zhang, Wen-Ming Li, Dong-Rui Fan, and Zhi-Min Tang. "A Pipelining Loop Optimization Method for Dataflow Architecture." Journal of Computer Science and Technology 33, no. 1 (January 2018): 116–30. http://dx.doi.org/10.1007/s11390-017-1748-5.

36

Liu, Guizhong, and Yungui Ci. "Architecture of the synchronous dataflow system SDS-1." Journal of Computer Science and Technology 1, no. 1 (March 1986): 19–25. http://dx.doi.org/10.1007/bf02943297.

37

Emani, Murali, Venkatram Vishwanath, Corey Adams, Michael E. Papka, Rick Stevens, Laura Florescu, Sumti Jairath, et al. "Accelerating Scientific Applications With SambaNova Reconfigurable Dataflow Architecture." Computing in Science & Engineering 23, no. 2 (March 1, 2021): 114–19. http://dx.doi.org/10.1109/mcse.2021.3057203.

38

Yazar, Tuğrul. "Design of Dataflow." Nexus Network Journal 17, no. 1 (December 5, 2014): 311–25. http://dx.doi.org/10.1007/s00004-014-0222-8.

39

Bourke, Timothy, Paul Jeanmaire, Basile Pesin, and Marc Pouzet. "Verified Lustre Normalization with Node Subsampling." ACM Transactions on Embedded Computing Systems 20, no. 5s (October 31, 2021): 1–25. http://dx.doi.org/10.1145/3477041.

Abstract:

Dataflow languages allow the specification of reactive systems by mutually recursive stream equations, functions, and boolean activation conditions called clocks. Lustre and Scade are dataflow languages for programming embedded systems. Dataflow programs are compiled by a succession of passes. This article focuses on the normalization pass which rewrites programs into the simpler form required for code generation. Vélus is a compiler from a normalized form of Lustre to CompCert’s Clight language. Its specification in the Coq interactive theorem prover includes an end-to-end correctness proof that the values prescribed by the dataflow semantics of source programs are produced by executions of generated assembly code. We describe how to extend Vélus with a normalization pass and to allow subsampled node inputs and outputs. We propose semantic definitions for the unrestricted language, divide normalization into three steps to facilitate proofs, adapt the clock type system to handle richer node definitions, and extend the end-to-end correctness theorem to incorporate the new features. The proofs require reasoning about the relation between static clock annotations and the presence and absence of values in the dynamic semantics. The generalization of node inputs requires adding a compiler pass to ensure the initialization of variables passed in function calls.

40

Cheng, Wei-Kai, Xiang-Yi Liu, Hsin-Tzu Wu, Hsin-Yi Pai, and Po-Yao Chung. "Reconfigurable Architecture and Dataflow for Memory Traffic Minimization of CNNs Computation." Micromachines 12, no. 11 (November 5, 2021): 1365. http://dx.doi.org/10.3390/mi12111365.

Abstract:

Computing a convolutional neural network (CNN) requires a significant amount of memory access, which consumes a lot of energy. As neural networks scale up, this effect becomes more pronounced: the energy consumed by memory access and by data migration between the on-chip buffer and off-chip DRAM far exceeds the computation energy of the processing element array (PE array). To reduce the energy consumption of memory access, it is important to find a better dataflow that maximizes data reuse and minimizes data migration between the on-chip buffer and external DRAM. In particular, the dimensions of the input feature map (ifmap) and the filter weights differ considerably from layer to layer. If the array architecture and dataflow cannot be reconfigured layer by layer according to the ifmap and filter dimensions, hardware resources may not be used effectively, resulting in a large amount of data migration on certain layers. However, a thorough exploration of all possible configurations is time consuming and meaningless. In this paper, we propose a quick and efficient methodology to adapt the configuration of the PE array architecture, buffer assignment, dataflow, and reuse methodology layer by layer for a given CNN architecture and hardware resources. In addition, we explore different combinations of configuration issues to investigate their effectiveness; the results can be used as a guide to speed up the thorough exploration process.

41

Michalska, Małgorzata, Nicolas Zufferey, and Marco Mattavelli. "Performance Estimation Based Multicriteria Partitioning Approach for Dynamic Dataflow Programs." Journal of Electrical and Computer Engineering 2016 (2016): 1–15. http://dx.doi.org/10.1155/2016/8536432.

Abstract:

The problem of partitioning a dataflow program onto a target architecture is a difficult challenge for any application design. In general, since the problem is NP-complete, the task consists of looking for high-quality solutions in terms of maximizing the achievable data throughput. The difficulty lies in exploring the design space, which is extremely large for parallel platforms. The paper describes a heuristic partitioning methodology applicable to dynamic dataflow programs. The methodology is based on two elements: an execution model of the dynamic dataflow program, which is used to estimate performance while exploring the large design space, and several partitioning algorithms that compete to produce specific high-quality solutions. Experimental results are validated with executions on a virtual platform.

42

Ma, Mingze, and Rizos Sakellariou. "Code-size-aware Scheduling of Synchronous Dataflow Graphs on Multicore Systems." ACM Transactions on Embedded Computing Systems 20, no. 3 (April 2021): 1–24. http://dx.doi.org/10.1145/3440034.

Abstract:

Synchronous dataflow graphs are widely used to model digital signal processing and multimedia applications. Self-timed execution is an efficient methodology for the analysis and scheduling of synchronous dataflow graphs. In this article, we propose a communication-aware self-timed execution approach to solve the problem of scheduling synchronous dataflow graphs on multicore systems with communication delays. Based on this communication-aware self-timed execution approach, four communication-aware scheduling algorithms are proposed using different allocation rules. Furthermore, a code-size-aware mapping heuristic is proposed and jointly used with a proposed scheduling algorithm to reduce the code size of SDFGs on multicore systems. The proposed scheduling algorithms are experimentally evaluated and found to perform better than existing algorithms in terms of throughput and runtime for several applications. The experiments also show that the proposed code-size-aware mapping approach can achieve significant code size reduction with limited throughput degradation in most cases.
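
Plain self-timed execution, the starting point named in this abstract, can be sketched as a small discrete-time simulation: every idle actor fires as soon as all of its input edges hold enough tokens. The sketch is a simplification under stated assumptions. It gives each actor its own processor, ignores the communication delays and code-size-aware mapping that the article adds, and uses a hypothetical edge format and function name.

```python
from typing import Dict, List, Tuple

# Hypothetical edge format:
# (producer, consumer, tokens produced per firing,
#  tokens consumed per firing, initial tokens)
Edge = Tuple[str, str, int, int, int]


def self_timed_trace(
    exec_time: Dict[str, int],
    edges: List[Edge],
    horizon: int,
) -> List[Tuple[int, str]]:
    """Simulate self-timed execution and return (start_time, actor) firings.

    Assumes integer execution times >= 1, one processor per actor, and no
    overlapping firings of the same actor.
    """
    tokens = [edge[4] for edge in edges]
    busy_until = {actor: 0 for actor in exec_time}
    pending: List[Tuple[int, str]] = []  # (finish_time, actor) firings in flight
    trace: List[Tuple[int, str]] = []

    for t in range(horizon):
        # Firings that finish now deliver tokens on their output edges.
        for _, actor in [p for p in pending if p[0] == t]:
            for i, edge in enumerate(edges):
                if edge[0] == actor:
                    tokens[i] += edge[2]
        pending = [p for p in pending if p[0] != t]

        # Every idle actor with enough tokens on all inputs starts immediately.
        for actor in exec_time:
            if busy_until[actor] > t:
                continue
            inputs = [i for i, edge in enumerate(edges) if edge[1] == actor]
            if all(tokens[i] >= edges[i][3] for i in inputs):
                for i in inputs:
                    tokens[i] -= edges[i][3]
                busy_until[actor] = t + exec_time[actor]
                pending.append((busy_until[actor], actor))
                trace.append((t, actor))
    return trace


# Two-actor cycle: A produces 2 tokens per firing, B consumes 3; the back
# edge returns 3 tokens per firing of B and starts with 4 tokens.
example = self_timed_trace(
    exec_time={"A": 1, "B": 2},
    edges=[("A", "B", 2, 3, 0), ("B", "A", 3, 2, 4)],
    horizon=6,
)
```

In this example the back edge with its initial tokens bounds how far A can run ahead of B, so the trace settles into a repeating pattern instead of accumulating tokens without limit.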

43

Lakshmi Narasimhan, V., and T. Downs. "Fault tolerant aspects of a dynamic dataflow architecture — PATTSY." Microprocessing and Microprogramming 32, no. 1-5 (August 1991): 243–52. http://dx.doi.org/10.1016/0165-6074(91)90354-v.

44

Kavi, Krishna M., A. R. Hurson, Phenil Patadia, Elizabeth Abraham, and Ponnarasu Shanmugam. "Design of cache memories for multi-threaded dataflow architecture." ACM SIGARCH Computer Architecture News 23, no. 2 (May 1995): 253–64. http://dx.doi.org/10.1145/225830.224436.

45

Stancu, S., M. Ciobotaru, and K. Korcyl. "ATLAS TDAQ DataFlow network architecture analysis and upgrade proposal." IEEE Transactions on Nuclear Science 53, no. 3 (June 2006): 826–33. http://dx.doi.org/10.1109/tns.2006.873302.

46

Gan, Lin, Haohuan Fu, Wayne Luk, Chao Yang, Wei Xue, and Guangwen Yang. "Solving Mesoscale Atmospheric Dynamics Using a Reconfigurable Dataflow Architecture." IEEE Micro 37, no. 4 (2017): 40–50. http://dx.doi.org/10.1109/mm.2017.3211107.

47

Shen, Xiao-Wei, Xiao-Chun Ye, Xu Tan, Da Wang, Lunkai Zhang, Wen-Ming Li, Zhi-Min Zhang, Dong-Rui Fan, and Ning-Hui Sun. "An Efficient Network-on-Chip Router for Dataflow Architecture." Journal of Computer Science and Technology 32, no. 1 (January 2017): 11–25. http://dx.doi.org/10.1007/s11390-017-1703-5.

48

Tan, Xu, Xiao-Wei Shen, Xiao-Chun Ye, Da Wang, Dong-Rui Fan, Lunkai Zhang, Wen-Ming Li, Zhi-Min Zhang, and Zhi-Min Tang. "A Non-Stop Double Buffering Mechanism for Dataflow Architecture." Journal of Computer Science and Technology 33, no. 1 (January 2018): 145–57. http://dx.doi.org/10.1007/s11390-017-1747-6.

49

Alves, Tiago A. O., Leandro A. J. Marzulo, Felipe M. G. Franca, and Vitor Santos Costa. "Trebuchet: exploring TLP with dataflow virtualisation." International Journal of High Performance Systems Architecture 3, no. 2/3 (2011): 137. http://dx.doi.org/10.1504/ijhpsa.2011.040466.

50

Xu, Rui, Sheng Ma, Yaohua Wang, Xinhai Chen, and Yang Guo. "Configurable Multi-directional Systolic Array Architecture for Convolutional Neural Networks." ACM Transactions on Architecture and Code Optimization 18, no. 4 (December 31, 2021): 1–24. http://dx.doi.org/10.1145/3460776.

Abstract:

The systolic array architecture is one of the most popular choices for convolutional neural network hardware accelerators. The biggest advantage of the systolic array architecture is its simple and efficient design principle. Without complicated control and dataflow, hardware accelerators with the systolic array can calculate traditional convolution very efficiently. However, this advantage also brings new challenges to the systolic array. When computing special types of convolution, such as the small-scale convolution or depthwise convolution, the processing element (PE) utilization rate of the array decreases sharply. The main reason is that the simple architecture design limits the flexibility of the systolic array. In this article, we design a configurable multi-directional systolic array (CMSA) to address these issues. First, we added a data path to the systolic array. It allows users to split the systolic array through configuration to speed up the calculation of small-scale convolution. Second, we redesigned the PE unit so that the array has multiple data transmission modes and dataflow strategies. This allows users to switch the dataflow of the PE array to speed up the calculation of depthwise convolution. In addition, unlike other works, we only make a few changes and modifications to the existing systolic array architecture. It avoids additional hardware overheads and can be easily deployed in application scenarios that require small systolic arrays such as mobile terminals. Based on our evaluation, CMSA can increase the PE utilization rate by up to 1.6 times compared to the typical systolic array when running the last layers of ResNet-18. When running depthwise convolution in MobileNet, CMSA can increase the utilization rate by up to 14.8 times. At the same time, CMSA and the traditional systolic arrays are similar in area and energy consumption.
