Journal articles on the topic 'Performance Optimization in Software and Hardware'

Consult the top 50 journal articles for your research on the topic 'Performance Optimization in Software and Hardware.'


1

Zhang, Tao, Changfu Yang, and Xin Zhao. "Using Improved Brainstorm Optimization Algorithm for Hardware/Software Partitioning." Applied Sciences 9, no. 5 (February 28, 2019): 866. http://dx.doi.org/10.3390/app9050866.

Abstract:
Today, increasingly complex tasks are emerging. To finish these tasks within a reasonable time, complex embedded systems with multiple processing units are necessary. Hardware/software partitioning is one of the key technologies in designing complex embedded systems; it is usually formulated as an optimization problem and solved with different optimization methods. Among these, swarm intelligence (SI) algorithms are easy to apply and offer strong robustness and excellent global search ability. Due to the high complexity of hardware/software partitioning problems, SI algorithms are ideal methods for solving them. In this paper, a new SI algorithm, called brainstorm optimization (BSO), is applied to hardware/software partitioning. To improve the performance of BSO, we analyzed its optimization process when solving the hardware/software partitioning problem and identified weaknesses in its clustering method and updating strategy. We then proposed the improved brainstorm optimization (IBSO), which ameliorates the original clustering method by setting cluster points and improves the updating strategy by decreasing the number of updated individuals in each iteration. Based on the simulation methods usually used to evaluate hardware/software partitioning algorithms, we generated eight benchmarks representing tasks of different scales to test the performance of IBSO, BSO, four original heuristic algorithms, and two improved BSO variants. Simulation results show that the IBSO algorithm achieves the highest-quality solutions within the shortest running time among these algorithms.
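The partitioning task described above reduces to a search over binary vectors, one bit per task (software or hardware). As a rough illustration of the kind of population-based search that SI algorithms perform on this problem, here is a minimal sketch; the task times, areas, area budget, and update rule are all invented for illustration and are not the paper's IBSO algorithm:

```python
import random

# Hypothetical task set: (software_time, hardware_time, hardware_area) per task.
# All numbers are illustrative; they are not taken from the paper.
TASKS = [(8.0, 2.0, 5.0), (6.0, 1.5, 4.0), (9.0, 3.0, 6.0), (4.0, 1.0, 3.0)]
AREA_BUDGET = 10.0  # assumed maximum hardware area

def cost(partition):
    """Total execution time of a 0/1 partition (1 = hardware), with a
    penalty when the mapped tasks exceed the hardware area budget."""
    time = sum(hw if bit else sw for bit, (sw, hw, _) in zip(partition, TASKS))
    area = sum(a for bit, (_, _, a) in zip(partition, TASKS) if bit)
    return time + 100.0 * max(0.0, area - AREA_BUDGET)

def swarm_search(pop_size=20, iters=200, seed=0):
    """Generic population-based search over bit vectors: individuals drift
    toward the best-so-far solution and mutate by single bit flips."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in TASKS] for _ in range(pop_size)]
    best = min(pop, key=cost)
    for _ in range(iters):
        for i, ind in enumerate(pop):
            # blend the individual with the current best, then flip one bit
            cand = [b if rng.random() < 0.7 else best[j]
                    for j, b in enumerate(ind)]
            cand[rng.randrange(len(cand))] ^= 1
            if cost(cand) < cost(ind):
                pop[i] = cand
        best = min(pop + [best], key=cost)
    return best, cost(best)
```

Real formulations also model communication costs between the two partitions, and the clustering and update strategies are far more elaborate, which is precisely where IBSO's improvements lie.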
APA, Harvard, Vancouver, ISO, and other styles
2

Yang, Fu, Liu Xin, and Pei Yuan Guo. "A Multi-Objective Optimization Genetic Algorithm for SOPC Hardware-Software Partitioning." Advanced Materials Research 457-458 (January 2012): 1142–48. http://dx.doi.org/10.4028/www.scientific.net/amr.457-458.1142.

Abstract:
Hardware-software partitioning is the key technology in hardware-software co-design; its results directly determine the design of the system. The genetic algorithm is a classical search algorithm for solving such combinatorial optimization problems. A multi-objective genetic algorithm for hardware-software partitioning is presented in this paper. The method balances system performance against indicators such as time, power, area, and cost, achieving multi-objective optimization on a system on programmable chip (SOPC). Simulation results show that the method solves the SOPC hardware-software partitioning problem effectively.
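A common way to fold several indicators into one GA fitness value is a normalized weighted sum. The sketch below is a hypothetical example of such a fitness function; the metric values and weights are invented, and the paper's actual objective model may differ:

```python
# Hypothetical normalized per-task metrics for a software (0) or hardware (1)
# mapping: (time, power, area, cost). Values and weights are illustrative.
METRICS = {
    0: (1.0, 0.3, 0.0, 0.1),  # software: slow but uses no silicon area
    1: (0.3, 0.6, 1.0, 0.8),  # hardware: fast but costs power/area/money
}
WEIGHTS = (0.4, 0.2, 0.2, 0.2)  # importance of time, power, area, cost

def fitness(partition):
    """Weighted sum of the averaged objectives; lower is better."""
    n = len(partition)
    totals = [0.0, 0.0, 0.0, 0.0]
    for bit in partition:
        for k, value in enumerate(METRICS[bit]):
            totals[k] += value
    return sum(w * t / n for w, t in zip(WEIGHTS, totals))
```

A GA would then evolve partitions to minimize this value; Pareto-based selection (e.g., NSGA-II-style ranking) is a common alternative to fixed weights.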
3

Mhadhbi, Imene, Slim Ben Othman, and Slim Ben Saoud. "An Efficient Technique for Hardware/Software Partitioning Process in Codesign." Scientific Programming 2016 (2016): 1–11. http://dx.doi.org/10.1155/2016/6382765.

Abstract:
Codesign methodology deals with the problem of designing complex embedded systems, where automatic hardware/software partitioning is one key issue. Research efforts on this issue have focused on exploring new automatic partitioning methods that consider only binary or extended partitioning problems. The main contribution of this paper is a hybrid FCMPSO partitioning technique, based on the Fuzzy C-Means (FCM) and Particle Swarm Optimization (PSO) algorithms, suitable for mapping embedded applications onto both binary and multicore target architectures. Our FCMPSO optimization technique has been compared using different graphical models with a large number of instances. Performance analysis reveals that FCMPSO outperforms the PSO algorithm as well as the Genetic Algorithm (GA), Simulated Annealing (SA), Ant Colony Optimization (ACO), and standard FCM metaheuristic-based techniques, and also hybrid solutions including PSO then GA, GA then SA, GA then ACO, ACO then SA, FCM then GA, FCM then SA, and finally ACO followed by FCM.
4

Umesh, I. M., and G. N. Srinivasan. "Optimum Software Aging Prediction and Rejuvenation Model for Virtualized Environment." Indonesian Journal of Electrical Engineering and Computer Science 3, no. 3 (September 1, 2016): 572. http://dx.doi.org/10.11591/ijeecs.v3.i3.pp572-578.

Abstract:
Advancements in electronics and hardware have resulted in multiple software applications running on the same hardware, yielding multiuser, multitasking, and virtualized environments. The reliability of such high-performance computing systems depends on both hardware and software. Hardware aging can be dealt with by replacement, but software aging must be addressed in software. For aging detection, this paper proposes a new approach using a machine learning framework. For rejuvenation, an Adaptive Genetic Algorithm (A-GA) has been developed to perform live migration, avoiding downtime and SLA violations. The proposed A-GA-based rejuvenation controller (A-GARC) outperforms other heuristic migration techniques such as Ant Colony Optimization (ACO) and Best Fit Decreasing (BFD). Results reveal that the proposed aging forecasting method and A-GA-based rejuvenation outperform other approaches, ensuring optimal system availability and minimal task migration, performance degradation, and SLA violations.
5

Tomecek, Jozef. "Hardware optimizations of stream cipher rabbit." Tatra Mountains Mathematical Publications 50, no. 1 (December 1, 2011): 87–101. http://dx.doi.org/10.2478/v10127-011-0039-8.

Abstract:
Stream ciphers form part of the cryptographic primitives focused on privacy. Rabbit, a synchronous, symmetric, software-oriented stream cipher, is a member of the final portfolio of the European Union's eStream project. Although it was designed to perform well in software, its operations also appear to compute efficiently in hardware. Rabbit's designers claim 128-bit security, with no known security weaknesses. Since Rabbit's hardware performance was only estimated in the algorithm's proposal, this paper presents a comparison of direct and optimized FPGA implementations of the Rabbit stream cipher, identifying algorithm bottlenecks and discussing the optimization techniques applied to the algorithm's computations, along with key area/time trade-offs.
6

Bezemer, Cor-Paul, and Andy Zaidman. "Performance optimization of deployed software-as-a-service applications." Journal of Systems and Software 87 (January 2014): 87–103. http://dx.doi.org/10.1016/j.jss.2013.09.013.

7

Algarni, Sultan Abdullah, Mohammad Rafi Ikbal, Roobaea Alroobaea, Ahmed S. Ghiduk, and Farrukh Nadeem. "Performance Evaluation of Xen, KVM, and Proxmox Hypervisors." International Journal of Open Source Software and Processes 9, no. 2 (April 2018): 39–54. http://dx.doi.org/10.4018/ijossp.2018040103.

Abstract:
Hardware virtualization plays a major role in IT infrastructure optimization in private data centers and public cloud platforms. Despite recent advancements in CPU architectures and hypervisors, overhead still exists because a virtualization layer sits between the guest operating system and the physical hardware, particularly when multiple virtual guests compete for resources on the same physical host. Understanding the performance of the virtualization layer is crucial, as it has a major impact on the entire IT infrastructure. This article presents an extensive study comparing the performance of three hypervisors: KVM, Xen, and Proxmox VE. The experiments showed that KVM delivers the best performance on most of the selected parameters, while Xen excels in file system and application performance; Proxmox delivered the best performance only in the CPU throughput sub-category. The article suggests the best-suited hypervisors for targeted applications.
8

Wang, Xin. "Research on Software Optimization Solutions of E-Commerce Site." Applied Mechanics and Materials 198-199 (September 2012): 626–30. http://dx.doi.org/10.4028/www.scientific.net/amm.198-199.626.

Abstract:
E-commerce platform optimization programs generally fall into two types: hardware optimization and software optimization. This paper first analyzes system-level software optimization techniques, including dynamic load optimization and cluster technology; it then studies database performance optimization methods covering table design, connection pooling, queries, and several other aspects; finally, it examines the cache technology used to optimize e-commerce platforms. The paper proposes a generally applicable software optimization solution for e-commerce platforms. These studies offer a reference for e-commerce website designers and maintainers, and provide a strategy for e-commerce enterprises to optimize their platform environments.
9

Koltakov, S. A., and A. A. Cherepnev. "HARDWARE-SOFTWARE COMPLEX FOR DIGITAL PROCESSING OF HYDROACOUSTIC SIGNALS." Issues of radio electronics, no. 5 (June 8, 2019): 60–63. http://dx.doi.org/10.21778/2218-5453-2019-5-60-63.

Abstract:
The article describes a hardware-software complex (HSC) based on a debugging stand, including its composition, modules, and operation. A method for synthesizing the output signal is described, and a formula and a table of parameters for its calculation are given. Signals and spectra at the input and output of the developed HSC are shown. Performance parameters were obtained for several HSC variants: one based on a signal processor combined with a general-purpose processor, and two based on general-purpose processors alone. The proposed HSC variant outperforms the HSC based on a general-purpose Intel processor by a factor of 2-3. This is achieved through the use of modern programming methods and tools, digital signal processing modules, and optimization of the executable code. Recommendations for possible further improvement of the proposed complex are given, made possible by the use of modern FPGAs and high-speed interfaces.
10

Rahim, N. H. A., A. M. Kassim, M. F. Miskon, A. H. Azahar, and H. Sakidin. "Optimization of One Legged Hopping Robot Hardware Parameters via Solidworks." Applied Mechanics and Materials 393 (September 2013): 544–49. http://dx.doi.org/10.4028/www.scientific.net/amm.393.544.

Abstract:
This paper discusses the simulation of a one-legged hopping robot in the SolidWorks software in order to determine the robot's optimum hardware parameters. Simulations were run over different variables set up earlier: crank bar length, spring length, and spring coefficient. The best parameters were chosen for higher and more stable hopping performance. In addition, an experiment was conducted to validate the parameters obtained from the simulation. The average hopping height is discussed, and the overall stability of the hopping height is demonstrated with a normal distribution graph. As a result, the optimum hardware parameter values for the one-legged hopping robot are validated.
11

Serpa, Matheus S., Eduardo HM Cruz, Matthias Diener, Arthur M. Krause, Philippe OA Navaux, Jairo Panetta, Albert Farrés, Claudia Rosas, and Mauricio Hanzich. "Optimization strategies for geophysics models on manycore systems." International Journal of High Performance Computing Applications 33, no. 3 (January 17, 2019): 473–86. http://dx.doi.org/10.1177/1094342018824150.

Abstract:
Many software mechanisms for geophysics exploration in the oil and gas industries are based on wave propagation simulation. To perform such simulations, state-of-the-art high-performance computing architectures are employed, generating results faster and with more accuracy at each generation. The software must evolve to support the new features of each design to keep performance scaling, and it is important to understand the impact of each change applied to the software so as to improve performance as much as possible. In this article, we propose several optimization strategies for a wave propagation model on six architectures: Intel Broadwell, Intel Haswell, Intel Knights Landing, Intel Knights Corner, NVIDIA Pascal, and NVIDIA Kepler. We focus on improving cache memory usage, vectorization, load balancing, portability, and locality in the memory hierarchy. We analyze the hardware impact of the optimizations, providing insights into how each strategy can improve performance. The results show that NVIDIA Pascal outperforms the other considered architectures by up to 8.5×.
12

Rabe, Robert, Anastasiia Izycheva, and Eva Darulova. "Regime Inference for Sound Floating-Point Optimizations." ACM Transactions on Embedded Computing Systems 20, no. 5s (October 31, 2021): 1–23. http://dx.doi.org/10.1145/3477012.

Abstract:
Efficient numerical programs are required for proper functioning of many systems. Today’s tools offer a variety of optimizations to generate efficient floating-point implementations that are specific to a program’s input domain. However, sound optimizations are of an “all or nothing” fashion with respect to this input domain—if an optimizer cannot improve a program on the specified input domain, it will conclude that no optimization is possible. In general, though, different parts of the input domain exhibit different rounding errors and thus have different optimization potential. We present the first regime inference technique for sound optimizations that automatically infers an effective subdivision of a program’s input domain such that individual sub-domains can be optimized more aggressively. Our algorithm is general; we have instantiated it with mixed-precision tuning and rewriting optimizations to improve performance and accuracy, respectively. Our evaluation on a standard benchmark set shows that with our inferred regimes, we can, on average, improve performance by 65% and accuracy by 54% with respect to whole-domain optimizations.
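The regime idea, illustrated under invented assumptions, can be sketched as follows: measure a worst-case error per sub-domain and assign each sub-domain the cheapest precision that still meets the bound. The toy computation and error model below are ours, not the paper's tool:

```python
import numpy as np

def max_error(lo, hi, dtype, samples=1000):
    """Worst observed error of the toy computation f(x) = (x + 1e-7) - x
    (exact value 1e-7) over [lo, hi] at the given precision."""
    xs = np.linspace(lo, hi, samples)
    approx = (xs.astype(dtype) + dtype(1e-7)) - xs.astype(dtype)
    return float(np.max(np.abs(approx - 1e-7)))

def infer_regimes(lo, hi, bound, pieces=4):
    """Split [lo, hi] into sub-domains and give each one the cheapest
    precision whose worst error stays under `bound` (float64 otherwise)."""
    edges = np.linspace(lo, hi, pieces + 1)
    regimes = []
    for a, b in zip(edges[:-1], edges[1:]):
        dtype = np.float32 if max_error(a, b, np.float32) <= bound else np.float64
        regimes.append(((float(a), float(b)), dtype))
    return regimes
```

The paper's technique is sound rather than sampling-based: it infers sub-domains with verified error bounds, and plugs in mixed-precision tuning and rewriting as the per-regime optimizations.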
13

Smith, Melissa C., and Gregory D. Peterson. "Optimization of Shared High-Performance Reconfigurable Computing Resources." ACM Transactions on Embedded Computing Systems 11, no. 2 (July 2012): 1–22. http://dx.doi.org/10.1145/2220336.2220348.

14

Lin, Ye-Jyun, Chia-Lin Yang, Jiao-We Huang, Tay-Jyi Lin, Chih-Wen Hsueh, and Naehyuck Chang. "System-Level Performance and Power Optimization for MPSoC." ACM Transactions on Embedded Computing Systems 14, no. 1 (January 21, 2015): 1–26. http://dx.doi.org/10.1145/2656339.

15

Steiger, Damian S., Thomas Häner, and Matthias Troyer. "ProjectQ: an open source software framework for quantum computing." Quantum 2 (January 31, 2018): 49. http://dx.doi.org/10.22331/q-2018-01-31-49.

Abstract:
We introduce ProjectQ, an open source software effort for quantum computing. The first release features a compiler framework capable of targeting various types of hardware, a high-performance simulator with emulation capabilities, and compiler plug-ins for circuit drawing and resource estimation. We introduce our Python-embedded domain-specific language, present the features, and provide example implementations for quantum algorithms. The framework allows testing of quantum algorithms through simulation and enables running them on actual quantum hardware using a back-end connecting to the IBM Quantum Experience cloud service. Through extension mechanisms, users can provide back-ends to further quantum hardware, and scientists working on quantum compilation can provide plug-ins for additional compilation, optimization, gate synthesis, and layout strategies.
16

Gallardo, Esthela, Jérôme Vienne, Leonardo Fialho, Patricia Teller, and James Browne. "Employing MPI_T in MPI Advisor to optimize application performance." International Journal of High Performance Computing Applications 32, no. 6 (January 31, 2017): 882–96. http://dx.doi.org/10.1177/1094342016684005.

Abstract:
MPI_T, the MPI Tool Information Interface, was introduced in the MPI 3.0 standard with the aim of enabling the development of more effective tools to support the Message Passing Interface (MPI), a standardized and portable message-passing system that is widely used in parallel programs. Most MPI optimization tools do not yet employ MPI_T and only describe the interactions between an application and an MPI library, thus requiring that users have expert knowledge to translate this information into optimizations. In contrast, MPI Advisor, a recently developed, easy-to-use methodology and tool for MPI performance optimization, pioneered the use of information provided by MPI_T to characterize the communication behaviors of an application and identify an MPI configuration that may enhance application performance. In addition to enabling the recommendation of performance optimizations, MPI_T has the potential to enable automatic runtime application of these optimizations. Optimization of MPI configurations is important because: (1) the vast majority of parallel applications executed on high-performance computing clusters use MPI for communication among processes, (2) most users execute their programs using the cluster's default MPI configuration, and (3) while default configurations may give adequate performance, it is well known that optimizing the MPI runtime environment can significantly improve application performance, in particular, when the way in which the application is executed and/or the application's input changes. This paper provides an overview of MPI_T, describes how it can be used to develop more effective MPI optimization tools, and demonstrates its use within an extended version of MPI Advisor. In doing the latter, it presents several MPI configuration choices that can significantly impact performance, shows how information collected at runtime with MPI_T and PMPI can be used to enhance performance, and presents MPI Advisor case studies of these configuration optimizations with performance gains of up to 40%.
17

Azar, Celine. "A Flexible Software/Hardware Adaptive Network for Embedded Distributed Architectures." Circuits and Systems: An International Journal 08, no. 03 (July 31, 2021): 01–15. http://dx.doi.org/10.5121/csij.2021.8301.

Abstract:
Embedded platforms are projected to integrate hundreds of cores in the near future, and expanding the interconnection network remains a key challenge. We propose SNet, a new Scalable NETwork paradigm that extends the NoCs area to include a software/hardware dynamic routing mechanism. To design routing pathways among communicating processes, it uses a distributed, adaptive, non-supervised routing method based on the ACO algorithm (Ant Colony Optimization). A small footprint hardware unit called DMC speeds up data transfer (Direct Management of Communications). SNet has the benefit of being extremely versatile, allowing for the creation of a broad range of routing topologies to meet the needs of various applications. We provide the DMC module in this work and assess SNet performance by executing a large number of test cases.
18

Liu, Lun, Todd Millstein, and Madanlal Musuvathi. "Safe-by-default Concurrency for Modern Programming Languages." ACM Transactions on Programming Languages and Systems 43, no. 3 (September 30, 2021): 1–50. http://dx.doi.org/10.1145/3462206.

Abstract:
Modern “safe” programming languages follow a design principle that we call safety by default and performance by choice. By default, these languages enforce important programming abstractions, such as memory and type safety, but they also provide mechanisms that allow expert programmers to explicitly trade some safety guarantees for increased performance. However, these same languages have adopted the inverse design principle in their support for multithreading. By default, multithreaded programs violate important abstractions, such as program order and atomic access to individual memory locations, to admit compiler and hardware optimizations that would otherwise need to be restricted. Not only does this approach conflict with the design philosophy of safe languages, but very little is known about the practical performance cost of providing a stronger default semantics. In this article, we propose a safe-by-default and performance-by-choice multithreading semantics for safe languages, which we call volatile-by-default. Under this semantics, programs have sequential consistency (SC) by default, which is the natural “interleaving” semantics of threads. However, the volatile-by-default design also includes annotations that allow expert programmers to avoid the associated overheads in performance-critical code. We describe the design, implementation, optimization, and evaluation of the volatile-by-default semantics for two different safe languages: Java and Julia. First, we present VBD-HotSpot and VBDA-HotSpot, modifications of Oracle's HotSpot JVM that enforce the volatile-by-default semantics on Intel x86-64 hardware and ARM-v8 hardware. Second, we present SC-Julia, a modification to the just-in-time compiler within the standard Julia implementation that provides best-effort enforcement of the volatile-by-default semantics on x86-64 hardware for the purpose of performance evaluation. We also detail two different implementation techniques: a baseline approach that simply reuses existing mechanisms in the compilers for handling atomic accesses, and a speculative approach that avoids the overhead of enforcing the volatile-by-default semantics until there is the possibility of an SC violation. Our results show that the cost of enforcing SC is significant but arguably still acceptable for some use cases today. Further, we demonstrate that compiler optimizations as well as programmer annotations can reduce the overhead considerably.
19

France-Pillois, Maxime, Jérôme Martin, and Frédéric Rousseau. "A Non-Intrusive Tool Chain to Optimize MPSoC End-to-End Systems." ACM Transactions on Architecture and Code Optimization 18, no. 2 (March 2021): 1–22. http://dx.doi.org/10.1145/3445030.

Abstract:
Multi-core systems are now found in many electronic devices. But does current software design fully leverage their capabilities? The complexity of the hardware and software stacks in these platforms requires software optimization with end-to-end knowledge of the system. To optimize software performance, we must have accurate information about system behavior and time losses. Standard monitoring engines impose tradeoffs on profiling tools, making it impossible to reconcile all the expected requirements: accurate hardware views, fine-grain measurements, speed, and so on. Subsequently, new approaches have to be examined. In this article, we propose a non-intrusive, accurate tool chain, which can reveal and quantify slowdowns in low-level software mechanisms. Based on emulation, this tool chain extracts behavioral information (time, contention) through hardware side channels, without distorting the software execution flow. This tool consists of two parts. (1) An online acquisition part that dumps hardware platform signals. (2) An offline processing part that consolidates meaningful behavioral information from the dumped data. Using our tool chain, we studied and propose optimizations to MultiProcessor System on Chip (MPSoC) support in the Linux kernel, saving about 60% of the time required for the release phase of the GNU OpenMP synchronization barrier when running on a 64-core MPSoC.
20

Schneider, Florian T., Mathias Payer, and Thomas R. Gross. "Online optimizations driven by hardware performance monitoring." ACM SIGPLAN Notices 42, no. 6 (June 10, 2007): 373–82. http://dx.doi.org/10.1145/1273442.1250777.

21

Ramakrishna, Laxmikant, Abdulfattah Mohamed Ali, and Hani Baniodeh. "Interfacing PMDC Motor to Data Port of Personal Computer." Conference Papers in Engineering 2013 (June 11, 2013): 1–6. http://dx.doi.org/10.1155/2013/218127.

Abstract:
Procedures and techniques are discussed for interfacing hardware to a personal computer through the parallel data port to control a permanent magnet DC (PMDC) motor, and for creating Virtual Instrument (VI) software in the LabVIEW integrated development environment (IDE). To test the designed VI diagram, the authors constructed the interface hardware without relying on any commercially available DAQ boards. Hardware resource utilization and performance optimization through the VI are discussed. The design (hardware and VI) was tested by varying the set-point speed of the motor; the motor speed is observed to gradually approach and lock to the desired speed.
22

KAPLOW, WESLEY K., and BOLESLAW K. SZYMANSKI. "PROGRAM OPTIMIZATION BASED ON COMPILE-TIME CACHE PERFORMANCE PREDICTION." Parallel Processing Letters 06, no. 01 (March 1996): 173–84. http://dx.doi.org/10.1142/s0129626496000170.

Abstract:
We present a novel, compile-time method for determining the cache performance of the loop nests in a program. The cache hit rates are produced by applying the reference string, determined during compilation, to an architecturally parameterized cache simulator. We also describe a heuristic that uses this method for compile-time optimization of loop ranges in iteration-space blocking. The results of the loop optimizations are presented for different parallel program benchmarks and various processor architectures, such as the IBM SP1 RS/6000, the SuperSPARC, and the Intel i860.
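Iteration-space blocking, whose block sizes the heuristic tunes, restructures a loop nest so that each tile's working set fits in cache. A minimal sketch (in Python for readability; the paper's targets are compiled loop nests, and the block size would be chosen from the predicted hit rate rather than fixed):

```python
def transpose_blocked(a, n, block):
    """Cache-blocked in-place transpose of an n x n row-major matrix stored
    in the flat list `a`. Processing block x block tiles keeps both the
    a[i][j] and a[j][i] access streams inside a cache-sized working set."""
    for ii in range(0, n, block):
        for jj in range(ii, n, block):  # upper-triangular tiles only
            for i in range(ii, min(ii + block, n)):
                for j in range(max(jj, i + 1), min(jj + block, n)):
                    a[i * n + j], a[j * n + i] = a[j * n + i], a[i * n + j]
    return a
```

Without blocking, the column-wise `a[j][i]` stream touches a new cache line on every iteration once `n` exceeds the cache size; blocking bounds that stream to `block` lines at a time.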
23

Roth, Jonathan, Naraig Manjikian, and Subramania Sudharsanan. "Performance optimization and parallelization of turbo decoding for software-defined radio." Canadian Journal of Electrical and Computer Engineering 34, no. 3 (2009): 115–23. http://dx.doi.org/10.1109/cjece.2009.5443859.

24

Muškinja, Miha, John Derek Chapman, and Heather Gray. "Geant4 performance optimization in the ATLAS experiment." EPJ Web of Conferences 245 (2020): 02036. http://dx.doi.org/10.1051/epjconf/202024502036.

Abstract:
Software improvements in the ATLAS Geant4-based simulation are critical to keep up with evolving hardware and increasing luminosity. Geant4 simulation currently accounts for about 50% of CPU consumption in ATLAS and it is expected to remain the leading CPU load during Run 4 (HL-LHC upgrade) with an approximately 25% share in the most optimistic computing model. The ATLAS experiment recently developed two algorithms for optimizing Geant4 performance: Neutron Russian Roulette (NRR) and range cuts for electromagnetic processes. The NRR randomly terminates a fraction of low energy neutrons in the simulation and weights energy deposits of the remaining neutrons to maintain physics performance. Low energy neutrons typically undergo many interactions with the detector material and their path becomes uncorrelated with the point of origin. Therefore, the response of neutrons can be efficiently estimated only with a subset of neutrons. Range cuts for electromagnetic processes exploit a built-in feature of Geant4 and terminate low energy electrons that originate from physics processes including conversions, the photoelectric effect, and Compton scattering. Both algorithms were tuned to maintain physics performance in ATLAS and together they bring about a 20% speed-up of the ATLAS Geant4 simulation. Additional ideas for improvements, currently under investigation, will also be discussed in this paper. Lastly, this paper presents how the ATLAS experiment utilizes software packages such as Intel’s VTune to identify and resolve hot-spots in simulation.
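The statistical trick behind NRR is standard Russian roulette: terminate each candidate particle with probability 1 − p and multiply the weight of each survivor by 1/p, leaving the expected energy deposit unchanged. A hypothetical sketch (not ATLAS code; the deposits and survival probability are illustrative):

```python
import random

def russian_roulette(deposits, survive_prob, rng):
    """Terminate each particle with probability (1 - survive_prob); each
    survivor's deposit is reweighted by 1/survive_prob, so the expected
    total energy deposit is unchanged (unbiased)."""
    return [e / survive_prob for e in deposits if rng.random() < survive_prob]

def mean_total(trials=2000, survive_prob=0.25, seed=1):
    """Average reweighted total over many trials for 100 particles of
    0.1 units each (true total 10.0) -- a sanity check of unbiasedness."""
    rng = random.Random(seed)
    deposits = [0.1] * 100
    return sum(sum(russian_roulette(deposits, survive_prob, rng))
               for _ in range(trials)) / trials
```

The speed-up comes from simulating only the surviving fraction of neutrons; the price is added variance in the energy response, which is why the thresholds were tuned to preserve ATLAS physics performance.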
25

Bédard, Anne-Catherine, Andrea Adamo, Kosi C. Aroh, M. Grace Russell, Aaron A. Bedermann, Jeremy Torosian, Brian Yue, Klavs F. Jensen, and Timothy F. Jamison. "Reconfigurable system for automated optimization of diverse chemical reactions." Science 361, no. 6408 (September 20, 2018): 1220–25. http://dx.doi.org/10.1126/science.aat0650.

Abstract:
Chemical synthesis generally requires labor-intensive, sometimes tedious trial-and-error optimization of reaction conditions. Here, we describe a plug-and-play, continuous-flow chemical synthesis system that mitigates this challenge with an integrated combination of hardware, software, and analytics. The system software controls the user-selected reagents and unit operations (reactors and separators), processes reaction analytics (high-performance liquid chromatography, mass spectrometry, vibrational spectroscopy), and conducts automated optimizations. The capabilities of this system are demonstrated in high-yielding implementations of C-C and C-N cross-coupling, olefination, reductive amination, nucleophilic aromatic substitution (SNAr), photoredox catalysis, and a multistep sequence. The graphical user interface enables users to initiate optimizations, monitor progress remotely, and analyze results. Subsequent users of an optimized procedure need only download an electronic file, comparable to a smartphone application, to implement the protocol on their own apparatus.
26

Ahmed, O., S. Areibi, R. Collier, and G. Grewal. "An Impulse-C Hardware Accelerator for Packet Classification Based on Fine/Coarse Grain Optimization." International Journal of Reconfigurable Computing 2013 (2013): 1–23. http://dx.doi.org/10.1155/2013/130765.

Abstract:
Current software-based packet classification algorithms exhibit relatively poor performance, prompting many researchers to concentrate on novel frameworks and architectures that employ both hardware and software components. The Packet Classification with Incremental Update (PCIU) algorithm, Ahmed et al. (2010), is a novel and efficient packet classification algorithm with a unique incremental update capability that demonstrated excellent results and was shown to be scalable for many different tasks and clients. While a pure software implementation can generate powerful results on a server machine, an embedded solution may be more desirable for some applications and clients. Embedded, specialized hardware accelerator based solutions are typically much more efficient in speed, cost, and size than solutions that are implemented on general-purpose processor systems. This paper seeks to explore the design space of translating the PCIU algorithm into hardware by utilizing several optimization techniques, ranging from fine grain to coarse grain and parallel coarse grain approaches. The paper presents a detailed implementation of a hardware accelerator of the PCIU based on an Electronic System Level (ESL) approach. Results obtained indicate that the hardware accelerator achieves on average 27x speedup over a state-of-the-art Xeon processor.
27

Li, Guihong, Sumit K. Mandal, Umit Y. Ogras, and Radu Marculescu. "FLASH: Fast Neural Architecture Search with Hardware Optimization." ACM Transactions on Embedded Computing Systems 20, no. 5s (October 31, 2021): 1–26. http://dx.doi.org/10.1145/3476994.

Abstract:
Neural architecture search (NAS) is a promising technique to design efficient and high-performance deep neural networks (DNNs). As the performance requirements of ML applications grow continuously, the hardware accelerators start playing a central role in DNN design. This trend makes NAS even more complicated and time-consuming for most real applications. This paper proposes FLASH, a very fast NAS methodology that co-optimizes the DNN accuracy and performance on a real hardware platform. As the main theoretical contribution, we first propose the NN-Degree, an analytical metric to quantify the topological characteristics of DNNs with skip connections (e.g., DenseNets, ResNets, Wide-ResNets, and MobileNets). The newly proposed NN-Degree allows us to do training-free NAS within one second and build an accuracy predictor by training as few as 25 samples out of a vast search space with more than 63 billion configurations. Second, by performing inference on the target hardware, we fine-tune and validate our analytical models to estimate the latency, area, and energy consumption of various DNN architectures while executing standard ML datasets. Third, we construct a hierarchical algorithm based on simplicial homology global optimization (SHGO) to optimize the model-architecture co-design process, while considering the area, latency, and energy consumption of the target hardware. We demonstrate that, compared to the state-of-the-art NAS approaches, our proposed hierarchical SHGO-based algorithm enables more than four orders of magnitude speedup (specifically, the execution time of the proposed algorithm is about 0.1 seconds). Finally, our experimental evaluations show that FLASH is easily transferable to different hardware architectures, thus enabling us to do NAS on a Raspberry Pi-3B processor in less than 3 seconds.
28

Bachmat, Eitan, and Sveinung Erland. "Performance analysis, Optimization and Optics." ACM SIGMETRICS Performance Evaluation Review 48, no. 2 (November 23, 2020): 12–14. http://dx.doi.org/10.1145/3439602.3439608.

29

Zhao, Jinyuan, Zhigang Hu, Bing Xiong, Liu Yang, and Keqin Li. "Modeling and optimization of packet forwarding performance in software-defined WAN." Future Generation Computer Systems 106 (May 2020): 412–25. http://dx.doi.org/10.1016/j.future.2019.12.010.

30

Lagravière, Jérémie, Johannes Langguth, Martina Prugger, Lukas Einkemmer, Phuong Hoai Ha, and Xing Cai. "Performance Optimization and Modeling of Fine-Grained Irregular Communication in UPC." Scientific Programming 2019 (March 3, 2019): 1–20. http://dx.doi.org/10.1155/2019/6825728.

Abstract:
The Unified Parallel C (UPC) programming language offers parallelism via logically partitioned shared memory, which typically spans physically disjoint memory subsystems. One convenient feature of UPC is its ability to automatically execute between-thread data movement, such that the entire content of a shared data array appears to be freely accessible by all the threads. The programmer friendliness, however, can come at the cost of substantial performance penalties. This is especially true when indirectly indexing the elements of a shared array, for which the induced between-thread data communication can be irregular and have a fine-grained pattern. In this paper, we study performance enhancement strategies specifically targeting such fine-grained irregular communication in UPC. Starting from explicit thread privatization, continuing with block-wise communication, and arriving at message condensing and consolidation, we obtained considerable performance improvement of UPC programs that originally require fine-grained irregular communication. Besides the performance enhancement strategies, the main contribution of the present paper is to propose performance models for the different scenarios, in the form of quantifiable formulas that hinge on the actual volumes of various data movements plus a small number of easily obtainable hardware characteristic parameters. These performance models help to verify the enhancements obtained, while also providing insightful predictions of similar parallel implementations, not limited to UPC, that also involve between-thread or between-process irregular communication. As a further validation, we also apply our performance modeling methodology and hardware characteristic parameters to an existing UPC code for solving a 2D heat equation on a uniform mesh.
31

Liu, Wei, Quan Wang, Yunlong Zhu, and Hanning Chen. "GRU: optimization of NPI performance." Journal of Supercomputing 76, no. 5 (October 19, 2018): 3542–54. http://dx.doi.org/10.1007/s11227-018-2634-9.

32

Mou, Xin Gang, Guo Hua Wei, and Xiao Zhou. "Parallel Programming and Optimization Based on TMS320C6678." Applied Mechanics and Materials 615 (August 2014): 259–64. http://dx.doi.org/10.4028/www.scientific.net/amm.615.259.

Abstract:
The development of multi-core processors has provided a good solution for applications that require real-time processing and a large number of calculations. However, simply exploiting parallelism in software is not enough to make full use of the hardware's performance. This paper studies parallel programming and optimization techniques on TMS320C6678 multicore digital signal processors. We first illustrate an implementation of a selected parallel image convolution algorithm using OpenMP. Several optimization techniques, such as compiler intrinsics, cache management, and DMA, are then used to further enhance application performance and achieve good execution times according to the test results.
33

Cong, Jason, Lei He, Cheng-Kok Koh, and Patrick H. Madden. "Performance optimization of VLSI interconnect layout." Integration 21, no. 1-2 (November 1996): 1–94. http://dx.doi.org/10.1016/s0167-9260(96)00008-9.

34

Zheng, Xin, Xianghong Hu, Jinglong Zhang, Jian Yang, Shuting Cai, and Xiaoming Xiong. "An Efficient and Low-Power Design of the SM3 Hash Algorithm for IoT." Electronics 8, no. 9 (September 14, 2019): 1033. http://dx.doi.org/10.3390/electronics8091033.

Abstract:
Security of the Internet-of-Things (IoT) has become an increasingly significant problem. A new architecture of SM3 that can be implemented in IoT devices is proposed in this paper. A software/hardware co-design approach is put forward to implement the new architecture to achieve high performance and low costs. To facilitate software/hardware co-design, an AHB-SM3 interface controller (AHB-SIC) is designed as an AHB slave interface IP to exchange data with the embedded CPU. Task scheduling and hardware resource optimization techniques are adopted in the design of the expansion modules, and task scheduling and critical path optimization techniques are utilized in the compression module design. The proposed architecture is implemented as an ASIC using SMIC 130 nm technology. For the purpose of comparison, it is also implemented on a Virtex 7 FPGA with a 36 MHz system clock. Compared with the standard implementation of SM3, the proposed architecture reduces the number of registers by a factor of approximately 3.11, and a throughput of 263 Mbps is achieved under the 36 MHz clock. This design represents an excellent trade-off between performance and hardware area, and thus accommodates resource-limited IoT security devices very well. The proposed architecture is applied to an intelligent security gateway device.
35

Walton, M., O. Ahmed, G. Grewal, and S. Areibi. "An Empirical Investigation on System and Statement Level Parallelism Strategies for Accelerating Scatter Search Using Handel-C and Impulse-C." VLSI Design 2012 (February 1, 2012): 1–11. http://dx.doi.org/10.1155/2012/793196.

Abstract:
Scatter Search is an effective and established population-based metaheuristic that has been used to solve a variety of hard optimization problems. However, the time required to find high-quality solutions can become prohibitive as problem sizes grow. In this paper, we present a hardware implementation of Scatter Search on a field-programmable gate array (FPGA). Our objective is to improve the run time of Scatter Search by exploiting the potentially massive performance benefits that are available through the native parallelism in hardware. When implementing Scatter Search we employ two different high-level languages (HLLs): Handel-C and Impulse-C. Our empirical results show that by effectively exploiting source-code optimizations, data parallelism, and pipelining, a 28x speedup over software can be achieved.
36

Iguider, Adil, Oussama Elissati, Abdeslam En-Nouaary, and Mouhcine Chami. "Shortest Path Method for Hardware/Software Partitioning Problems." International Journal of Information Systems and Social Change 12, no. 3 (July 2021): 40–57. http://dx.doi.org/10.4018/ijissc.2021070104.

Abstract:
Smart systems are becoming more present in every aspect of our daily lives. The main component of such a system is an embedded system, which ensures the collection, treatment, and transmission of accurate information at the right time and to the right component. Modern embedded systems face several challenges; the objective is to design a system with high performance while decreasing cost and development time. Consequently, robust methodologies such as Codesign were developed to fulfill those requirements. The most important step of Codesign is the partitioning of the system's functionalities between a hardware set and a software set. This article deals with this problem and uses a heuristic approach based on shortest-path optimizations to solve it. The aim is to minimize the total hardware area while respecting a constraint on the overall execution time of the system. Experimental results demonstrate that the proposed method is very fast and gives better results than the genetic algorithm.
37

Ben Haj Hassine, Siwar, Mehdi Jemai, and Bouraoui Ouni. "Power and Execution Time Optimization through Hardware Software Partitioning Algorithm for Core Based Embedded System." Journal of Optimization 2017 (2017): 1–11. http://dx.doi.org/10.1155/2017/8624021.

Abstract:
Shortening a product's time to market and accelerating its development efficiency have become vital concerns in the field of embedded system design. Hardware/software partitioning has therefore become one of the mainstream technologies of embedded system development, since it affects overall system performance. Given today's demand for great efficiency necessarily accompanied by high speed, our new algorithm presents a version that can meet such unprecedented requirements. Specifically, we describe in this paper an algorithm based on HW/SW partitioning that aims to find the best trade-off between the power and latency of a system, taking the dark silicon problem into consideration. Moreover, it has been tested and has shown its efficiency compared to other well-known heuristic algorithms, namely Simulated Annealing, Tabu Search, and Genetic algorithms.
38

Van der Wijngaart, Rob F., Sekhar R. Sarukkai, and Pankaj Mehra. "Analysis and Optimization of Software Pipeline Performance on MIMD Parallel Computers." Journal of Parallel and Distributed Computing 38, no. 1 (October 1996): 37–50. http://dx.doi.org/10.1006/jpdc.1996.0127.

39

Teixeira, Thiago SFX, William Gropp, and David Padua. "Managing code transformations for better performance portability." International Journal of High Performance Computing Applications 33, no. 6 (August 4, 2019): 1290–306. http://dx.doi.org/10.1177/1094342019865606.

Abstract:
Code optimization is an intricate task that is getting more complex as computing systems evolve. Managing the program optimization process, including the implementation and evaluation of code variants, is tedious, inefficient, and errors are likely to be introduced in the process. Moreover, because each platform typically requires a different sequence of transformations to fully harness its computing power, the optimization process complexity grows as new platforms are adopted. To address these issues, systems and frameworks have been proposed to automate the code optimization process. They, however, have not been widely adopted and are primarily used by experts with deep knowledge about underlying architecture and compiler intricacies. This article describes the requirements that we believe necessary for making automatic performance tuning more broadly used, especially in complex, long-lived high-performance computing applications. Besides discussing limitations of current systems and strategies to overcome these, we describe the design of a system that is able to semi-automatically generate efficient platform-specific code. In the proposed system, the code optimization is programmer-guided, separately from application code, on an external file in what we call optimization programming. The language to program the optimization process is able to represent complex collections of transformations and, as a result, generate efficient platform-specific code. A database manages different optimized versions of code regions, providing a pragmatic approach to performance portability, and the framework itself has separate components, allowing the optimized code to be used on systems without installing all of the modules required for the code generation. We present experiments on two different platforms to illustrate the generation of efficient platform-specific code that performs comparable to hand-optimized, vendor-provided code.
40

Su, Yu-Shih, Da-Chung Wang, Shih-Chieh Chang, and Malgorzata Marek-Sadowska. "Performance Optimization Using Variable-Latency Design Style." IEEE Transactions on Very Large Scale Integration (VLSI) Systems 19, no. 10 (October 2011): 1874–83. http://dx.doi.org/10.1109/tvlsi.2010.2058874.

41

Chang, Cheng, Ligang He, Nadeem Chaudhary, Songling Fu, Hao Chen, Jianhua Sun, Kenli Li, Zhangjie Fu, and Ming-Liang Xu. "Performance analysis and optimization for workflow authorization." Future Generation Computer Systems 67 (February 2017): 194–205. http://dx.doi.org/10.1016/j.future.2016.09.011.

42

ZAVANELLA, ANDREA. "SKELETONS, BSP AND PERFORMANCE PORTABILITY." Parallel Processing Letters 11, no. 04 (December 2001): 393–407. http://dx.doi.org/10.1142/s0129626401000683.

Abstract:
The Skeletal approach to parallel programming combines a high-level compositional style with efficiency. A second advantage of Skeletal programming is portability, since implementation decisions are usually taken at compile time. The paper claims that an intermediate model embedding the main performance features of the target architecture facilitates performance portability across parallel architectures. This claim is motivated by describing the Skel-BSP framework, which implements a skeleton system on top of a BSP computer. A prototype compiler based on a set of BSP templates is presented, together with a set of performance models for each skeleton that allow local optimization. The paper also introduces a global optimization strategy using a set of transformation rules. This local+global approach seems a viable solution for writing parallel software in a machine-independent way (Writing Once and Compiling Everywhere).
43

Wang, Ying, Ri Bo Ge, and Mei Hua Li. "Design of Resistance Touch Screen Based on S3C6410 Embedded System." Applied Mechanics and Materials 556-562 (May 2014): 1491–94. http://dx.doi.org/10.4028/www.scientific.net/amm.556-562.1491.

Abstract:
This design uses the Samsung ARM11 S3C6410 microprocessor and a 4-wire resistive touch screen as the hardware foundation. Based on this hardware structure, a touch screen application is developed. System performance is improved by means of algorithmic optimization of the sampled data and a software filtering method, leading to strong practicality. Finally, a three-point calibration algorithm is introduced to address the offset problem of screen coordinates: the calibration matrix is determined by the selection of three points, the calibration of the point-to-point mapping relation is realized in software, and the result is more accurate in practical applications.
44

Izosimov, Viacheslav, Paul Pop, Petru Eles, and Zebo Peng. "Scheduling and Optimization of Fault-Tolerant Embedded Systems with Transparency/Performance Trade-Offs." ACM Transactions on Embedded Computing Systems 11, no. 3 (September 2012): 1–35. http://dx.doi.org/10.1145/2345770.2345773.

45

Leech, Charles, Charan Kumar, Amit Acharyya, Sheng Yang, Geoff V. Merrett, and Bashir M. Al-Hashimi. "Runtime Performance and Power Optimization of Parallel Disparity Estimation on Many-Core Platforms." ACM Transactions on Embedded Computing Systems 17, no. 2 (April 26, 2018): 1–19. http://dx.doi.org/10.1145/3133560.

46

Barve, Yogesh D., Himanshu Neema, Zhuangwei Kang, Harsh Vardhan, Hongyang Sun, and Aniruddha Gokhale. "EXPPO: EXecution Performance Profiling and Optimization for CPS Co-simulation-as-a-Service." Journal of Systems Architecture 118 (September 2021): 102189. http://dx.doi.org/10.1016/j.sysarc.2021.102189.

47

Cremonesi, Francesco, Georg Hager, Gerhard Wellein, and Felix Schürmann. "Analytic performance modeling and analysis of detailed neuron simulations." International Journal of High Performance Computing Applications 34, no. 4 (April 3, 2020): 428–49. http://dx.doi.org/10.1177/1094342020912528.

Abstract:
Big science initiatives are trying to reconstruct and model the brain by attempting to simulate brain tissue at larger scales and with increasingly more biological detail than previously thought possible. The exponential growth of parallel computer performance has been supporting these developments, and at the same time maintainers of neuroscientific simulation code have strived to optimally and efficiently exploit new hardware features. Current state-of-the-art software for the simulation of biological networks has so far been developed using performance engineering practices, but a thorough analysis and modeling of the computational and performance characteristics, especially in the case of morphologically detailed neuron simulations, is lacking. Other computational sciences have successfully used analytic performance engineering, which is based on “white-box,” that is, first-principles performance models, to gain insight on the computational properties of simulation kernels, aid developers in performance optimizations and eventually drive codesign efforts, but to our knowledge a model-based performance analysis of neuron simulations has not yet been conducted. We present a detailed study of the shared-memory performance of morphologically detailed neuron simulations based on the Execution-Cache-Memory performance model. We demonstrate that this model can deliver accurate predictions of the runtime of almost all the kernels that constitute the neuron models under investigation. The gained insight is used to identify the main governing mechanisms underlying performance bottlenecks in the simulation. The implications of this analysis on the optimization of neural simulation software and eventually codesign of future hardware architectures are discussed. In this sense, our work represents a valuable conceptual and quantitative contribution to understanding the performance properties of biological networks simulations.
48

Beard, Jonathan C., Peng Li, and Roger D. Chamberlain. "RaftLib: A C++ template library for high performance stream parallel processing." International Journal of High Performance Computing Applications 31, no. 5 (October 19, 2016): 391–404. http://dx.doi.org/10.1177/1094342016672542.

Abstract:
Stream processing is a compute paradigm that has been around for decades, yet until recently has failed to garner the same attention as other mainstream languages and libraries (e.g. C++, OpenMP, MPI). Stream processing has great promise: the ability to safely exploit extreme levels of parallelism to process huge volumes of streaming data. There have been many implementations, both libraries and full languages. The full languages implicitly assume that the streaming paradigm cannot be fully exploited in legacy languages, while library approaches are often preferred for being integrable with the vast expanse of extant legacy code. Libraries, however are often criticized for yielding to the shape of their respective languages. RaftLib aims to fully exploit the stream processing paradigm, enabling a full spectrum of streaming graph optimizations, while providing a platform for the exploration of integrability with legacy C/C++ code. RaftLib is built as a C++ template library, enabling programmers to utilize the robust C++ standard library, and other legacy code, along with RaftLib’s parallelization framework. RaftLib supports several online optimization techniques: dynamic queue optimization, automatic parallelization, and real-time low overhead performance monitoring.
49

Kremer, Ulrich. "Optimal and Near–Optimal Solutions for Hard Compilation Problems." Parallel Processing Letters 07, no. 04 (December 1997): 371–78. http://dx.doi.org/10.1142/s0129626497000371.

Abstract:
An optimizing compiler typically uses multiple program representations at different levels of program and performance abstraction in order to be able to perform transformations that, at least in the majority of cases, will lead to an overall improvement in program performance. The complexities of the program and performance abstractions used to formulate compiler optimization problems have to match the complexities of the high-level programming model and of the underlying target system. Scalable parallel systems typically have multi-level memory hierarchies and are able to exploit coarse-grain and fine-grain parallelism. Most likely, future systems will have even deeper memory hierarchies and more granularities of parallelism. As a result, future compiler optimizations will have to use ever more complex, multi-level computation and performance models in order to keep up with the complexities of their future target systems. Most of the optimization problems encountered in highly optimizing compilers are already NP-hard, and there is little hope that most newly encountered optimization formulations will not be at least NP-hard as well. To face this "complexity crisis", new methods are needed to evaluate the benefits of a compiler optimization formulation. A crucial step in this evaluation process is to compute the optimal solution of the formulation. Using ad-hoc methods to compute optimal solutions to NP-complete problems may be prohibitively expensive. Recent improvements in mixed integer and 0-1 integer programming suggest that this technology may provide the key to efficient, optimal and near-optimal solutions to NP-complete compiler optimization problems. In fact, early results indicate that integer programming formulations may be efficient enough to be included not only in evaluation prototypes, but in production programming environments or even production compilers.
This paper discusses the potential benefits of integer programming as a tool to deal with NP-complete compiler optimization formulations in compilers and programming environments.
50

Sznura, Marek, and Piotr Przystałka. "Development of a Power and Communication Bus Using HIL and Computational Intelligence." Applied Sciences 11, no. 18 (September 18, 2021): 8709. http://dx.doi.org/10.3390/app11188709.

Abstract:
This paper deals with the development of a power and communication bus named DLN (Device Lightweight Network) that can be seen as a new interface with auto-addressing functionality to transfer power and data by means of two wires in modern cars. The main research goal of this paper is to elaborate a new method based on a hardware in the loop technique aided by computational intelligence algorithms in order to search for the optimal structure of the communication modules, as well as optimal features of hardware parts and the values of software parameters. The desired properties of communication modules, which have a strong influence on the performance of the bus, cannot be found using a classical engineering approach due to the large number of possible combinations of configuration of the hardware and software parts of the whole system. Therefore, an HIL-based optimization method for bus prototyping is proposed, in which the optimization task is formulated as a multi-criteria optimization problem. Several criterion functions are proposed, corresponding to the automotive objectives and requirements. Different soft computing optimization algorithms, such as a single-objective/multi-objectives evolutionary algorithm and a particle swarm optimization algorithm, are applied to searching for the optimal solution. The verification study was carried out in order to show the merits and limitations of the proposed approach. Attention was also paid to the problem of the selection of the behavioural parameters of the heuristic algorithms. The overall results proved the high practical potential of the DLN, which was developed using the proposed optimization method.