
Dissertations / Theses on the topic 'High-power computing'

Consult the top 43 dissertations / theses for your research on the topic 'High-power computing.'

1

Choi, Jee Whan. "Power and performance modeling for high-performance computing algorithms." Diss., Georgia Institute of Technology, 2015. http://hdl.handle.net/1853/53561.

Abstract:
The overarching goal of this thesis is to provide an algorithm-centric approach to analyzing the relationship between time, energy, and power. This research is aimed at algorithm designers and performance tuners so that they may be able to make decisions on how algorithms should be designed and tuned depending on whether the goal is to minimize time or to minimize energy on current and future systems.

First, we present a simple analytical cost model for energy and power. Assuming a simple von Neumann architecture with a two-level memory hierarchy, this model predicts energy and power for algorithms using just a few simple parameters, such as the number of floating point operations (FLOPs or flops) and the amount of data moved (bytes or words). Using highly optimized microbenchmarks and a small number of test platforms, we show that although this model uses only a few simple parameters, it is, nevertheless, accurate. We can also visualize this model using energy “arch lines,” analogous to the “rooflines” in time. These “rooflines in energy” allow users to easily assess and compare different algorithms’ intensities in energy and time against various target systems’ balances in energy and time. This visualization of our model gives us many interesting insights, and as such, we refer to our analytical model as the energy roofline model.

Second, we present the results of our microbenchmarking study of the time, energy, and power costs of computation and memory access on several candidate compute-node building blocks of future high-performance computing (HPC) systems. Over a dozen server-, desktop-, and mobile-class platforms that span a range of compute and power characteristics were evaluated, including x86 (both conventional and Xeon Phi accelerator), ARM, graphics processing units (GPU), and hybrid (AMD accelerated processing units (APU) and other system-on-chip (SoC)) processors. The purpose of this study was twofold: first, to extend the validation of the energy roofline model to a more comprehensive set of target systems and show that the model works well independent of system hardware and microarchitecture; second, to improve the model by uncovering and remedying potential shortcomings, such as incorporating the effects of power “capping,” multi-level memory hierarchy, and different implementation strategies on power and performance.

Third, we incorporate dynamic voltage and frequency scaling (DVFS) into the energy roofline model to explore its potential for saving energy. Rather than the more traditional approach of using DVFS to reduce energy, whereby a “slack” in computation is used as an opportunity to dynamically cycle down the processor clock, the energy roofline model can be used to determine precisely how the time and energy costs of different operations, both compute and memory, change with respect to frequency and voltage settings. This information can be used to target a specific optimization goal, whether that be time, energy, or a combination of both.

In the final chapter of this thesis, we use our model to predict the energy dissipation of a real application running on a real system. The fast multipole method (FMM) kernel was executed on the GPU component of the Tegra K1 SoC under various frequency and voltage settings, and a breakdown of instructions and data access patterns was collected via performance counters. The total energy dissipation of FMM was then calculated as a weighted sum of these instructions and the associated costs in energy. On eight different voltage and frequency settings and eight different algorithm-specific input parameters per setting, for a total of 64 test cases, the accuracy of the energy roofline model for predicting total energy dissipation was within 6.2%, with a standard deviation of 4.7%, when compared to actual energy measurements. Despite its simplicity and its foundation on the first principles of algorithm analysis, the energy roofline model has proven to be both practical and accurate for real applications running on a real system. As such, it can be an invaluable tool for algorithm designers and performance tuners with which they can more precisely analyze the impact of their design decisions on both performance and energy efficiency.
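To make the model's shape concrete, the following minimal sketch computes roofline time and energy from flop and byte counts, assuming per-flop and per-byte energy costs plus a constant-power term; parameter names and numbers are illustrative, not the thesis' notation or measurements.

    # Minimal energy-roofline sketch: time follows the classic roofline;
    # energy is a weighted sum of flops, bytes moved, and a constant-power
    # term integrated over the run time.
    def roofline(flops, bytes_moved, peak_flops, peak_bw, e_flop, e_byte, p_const):
        time = max(flops / peak_flops, bytes_moved / peak_bw)       # seconds
        energy = flops * e_flop + bytes_moved * e_byte + p_const * time
        return time, energy

    # Arithmetic intensity (flops/byte) decides which term dominates time;
    # the analogous energy balance (e_byte/e_flop) plays that role for energy.
    t, e = roofline(flops=1e12, bytes_moved=2e11, peak_flops=1e12,
                    peak_bw=1e11, e_flop=100e-12, e_byte=1e-9, p_const=50.0)
    print(f"time={t:.2f} s, energy={e:.0f} J, avg power={e/t:.0f} W")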
2

Borghesi, Andrea <1988>. "Power-Aware Job Dispatching in High Performance Computing Systems." Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2017. http://amsdottorato.unibo.it/7982/1/master.pdf.

Abstract:
This work deals with the power-aware job dispatching problem in supercomputers; broadly speaking, dispatching consists of assigning finite capacity resources to a set of activities, with special concern for power- and energy-efficient solutions. We introduce novel optimization approaches to address its multiple aspects. The proposed techniques have a broad application range but are aimed at applications in the field of High Performance Computing (HPC) systems. Devising a power-aware HPC job dispatcher is a complex task, where contrasting goals must be satisfied. Furthermore, the online nature of the problem requires that solutions be computed in real time within stringent limits. This aspect historically discouraged the use of exact methods and favoured instead the adoption of heuristic techniques. The application of optimization approaches to the dispatching task is still a largely unexplored area of research and can drastically improve the performance of HPC systems. In this work we tackle the job dispatching problem on a real HPC machine, the Eurora supercomputer hosted at the Cineca research center, Bologna. We propose a Constraint Programming (CP) model that outperforms the dispatching software currently in use. An essential element for taking power-aware decisions during the job dispatching phase is the ability to estimate a job's power consumption before its execution. To this end, we applied Machine Learning techniques to create a prediction model that was trained and tested on the Eurora supercomputer, showing great prediction accuracy. Finally, we develop power-aware solutions for the same target machine, devising different approaches to solve the dispatching problem while keeping the power consumption of the whole system under a given threshold. We propose a heuristic technique and a CP/heuristic hybrid method, both able to solve practical-size instances and to outperform the current state-of-the-art techniques.
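To give a flavor of the CP formulation (a toy sketch using Google OR-Tools CP-SAT, not the thesis' actual Eurora model), jobs can be modeled as interval variables with cores and watts as cumulative resources:

    # Toy power-aware dispatching as constraint programming: jobs are
    # intervals; core capacity and the system power cap are both
    # cumulative constraints. Numbers are illustrative.
    from ortools.sat.python import cp_model

    jobs = [  # (duration, cores, watts)
        (30, 16, 900), (20, 8, 400), (45, 32, 1500), (10, 8, 350),
    ]
    CORES, POWER_CAP, HORIZON = 48, 2000, 200

    model = cp_model.CpModel()
    starts, intervals = [], []
    for i, (dur, cores, watts) in enumerate(jobs):
        s = model.NewIntVar(0, HORIZON, f"start{i}")
        intervals.append(model.NewIntervalVar(s, dur, s + dur, f"job{i}"))
        starts.append(s)

    model.AddCumulative(intervals, [j[1] for j in jobs], CORES)
    model.AddCumulative(intervals, [j[2] for j in jobs], POWER_CAP)

    makespan = model.NewIntVar(0, HORIZON, "makespan")
    for s, (dur, _, _) in zip(starts, jobs):
        model.Add(s + dur <= makespan)
    model.Minimize(makespan)

    solver = cp_model.CpSolver()
    if solver.Solve(model) in (cp_model.OPTIMAL, cp_model.FEASIBLE):
        print("makespan:", solver.Value(makespan))
        print("starts:", [solver.Value(s) for s in starts])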
3

MA, LIANG. "Low power and high performance heterogeneous computing on FPGAs." Doctoral thesis, Politecnico di Torino, 2019. http://hdl.handle.net/11583/2727228.

4

ROOZMEH, MEHDI. "High Performance Computing via High Level Synthesis." Doctoral thesis, Politecnico di Torino, 2018. http://hdl.handle.net/11583/2710706.

Abstract:
As more and more powerful integrated circuits appear on the market, more and more applications, with very different requirements and workloads, are making use of the available computing power. This thesis is in particular devoted to High Performance Computing applications, where those trends are carried to the extreme. In this domain, the primary aspects to be taken into consideration are (1) performance (by definition) and (2) energy consumption (since operational costs dominate over procurement costs). These requirements can be satisfied more easily by deploying heterogeneous platforms, which include CPUs, GPUs and FPGAs to provide a broad range of performance and energy-per-operation choices. In particular, as we will see, FPGAs clearly dominate both CPUs and GPUs in terms of energy, and can provide comparable performance. An important aspect of this trend is of course design technology, because these applications were traditionally programmed in high-level languages, while FPGAs required low-level RTL design. OpenCL (Open Computing Language), developed by the Khronos group, enables developers to program CPUs, GPUs and, recently, FPGAs using functionally portable (but sadly not performance portable) source code, which creates new possibilities and challenges both for research and industry. FPGAs have always been used for mid-size designs and ASIC prototyping thanks to their energy-efficient and flexible hardware architecture, but their usage requires hardware design knowledge and laborious design cycles. Several approaches have been developed and deployed to address this issue and shorten the gap between software and hardware in the FPGA design flow, in order to enable FPGAs to capture a larger portion of the hardware acceleration market in data centers. Moreover, FPGA usage in data centers is growing already, regardless of and in addition to their use as computational accelerators, because they can be used as high-performance, low-power and secure switches inside data centers. High-Level Synthesis (HLS) is the methodology that enables designers to map their applications onto FPGAs (and ASICs). It synthesizes parallel hardware from a model originally written in C-based programming languages, e.g. C/C++, SystemC and OpenCL. Design space exploration of the variety of implementations that can be obtained from this C model is possible through a wide range of optimization techniques and directives, e.g. to pipeline loops and partition memories into multiple banks, which guide RTL generation toward application-dependent hardware and let designers benefit from the flexible parallel architecture of FPGAs. Model Based Design (MBD) is a high-level and visual process used to generate implementations that solve mathematical problems through a varied set of IP blocks. MBD enables developers with different expertise, e.g. control theory, embedded software development, and hardware design, to share a common design framework and contribute to a shared design using the same tool. Simulink, developed by MathWorks, is a model-based design tool for simulation and development of complex dynamical systems. Moreover, Simulink embedded code generators can produce verified C/C++ and HDL code from the graphical model. This code can be used to program micro-controllers and FPGAs. This PhD thesis work presents a study using the automatic code generator of Simulink to target Xilinx FPGAs using both HDL and C/C++ code, to demonstrate the capabilities and challenges of the high-level synthesis process.

To do so, firstly, the digital signal processing unit of a real-time radar application is developed using Simulink blocks. Secondly, the generated C-based model is used for the high-level synthesis process, and finally the implementation cost of HLS is compared to traditional HDL synthesis using the Xilinx tool chain. As an alternative to the model-based design approach, this work also presents an analysis of FPGA programming via high-level synthesis techniques for computationally intensive algorithms, and demonstrates the importance of HLS by comparing the performance-per-watt of GPUs (NVIDIA) and FPGAs (Xilinx) manufactured in the same node running standard OpenCL benchmarks. We conclude that generation of high-quality RTL from an OpenCL model requires a stronger hardware background than the MBD approach; however, the availability of fast and broad design space exploration and the portability of the OpenCL code, e.g. to CPUs and GPUs, motivates FPGA industry leaders to provide users with an OpenCL software development environment that promises FPGA programming in CPU/GPU-like fashion. Our experiments, through extensive design space exploration (DSE), suggest that FPGAs have higher performance-per-watt than two high-end GPUs manufactured in the same technology (28 nm). Moreover, FPGAs with more available resources and a more modern process (20 nm) can outperform the tested GPUs while consuming much less power, at the cost of more expensive devices.
5

Bartolini, Andrea <1981>. "Dynamic power management: from portable devices to high performance computing." Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2011. http://amsdottorato.unibo.it/3558/1/bartolini_andrea_tesi.pdf.

Abstract:
Electronic applications are nowadays converging under the umbrella of the cloud computing vision. The future ecosystem of information and communication technology is going to integrate clouds of portable clients and embedded devices exchanging information, through the internet layer, with processing clusters of servers, data-centers and high performance computing systems. Even as the whole of society waits to embrace this revolution, there is a backside to the story. Portable devices require batteries to work far from power plugs, and battery capacity does not scale as the increasing power requirements do. At the other end, processing clusters, such as data-centers and server farms, are built upon the integration of thousands of multiprocessors. For each of them, technology scaling during the last decade has produced a dramatic increase in power density with significant spatial and temporal variability. This leads to power and temperature hot-spots, which may cause non-uniform ageing and accelerated chip failure. Moreover, all the heat removed from the silicon translates into high cooling costs. Trends in the ICT carbon footprint also show that the run-time power consumption of the whole spectrum of devices accounts for a significant slice of entire world carbon emissions. This thesis work embraces the full ICT ecosystem and its dynamic power consumption concerns by describing a set of new and promising system-level resource management techniques to reduce power consumption and related issues for two corner cases: Mobile Devices and High Performance Computing.
6

Bartolini, Andrea <1981>. "Dynamic power management: from portable devices to high performance computing." Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2011. http://amsdottorato.unibo.it/3558/.

7

Ge, Rong. "Theories and Techniques for Efficient High-End Computing." Diss., Virginia Tech, 2007. http://hdl.handle.net/10919/28863.

Abstract:
Today, power consumption costs supercomputer centers millions of dollars annually, and the heat produced can reduce system reliability and availability. Achieving high performance while reducing power consumption is challenging since power and performance are inextricably interwoven; reducing power often results in degradation in performance. This thesis aims to address these challenges by providing theories, techniques, and tools to 1) accurately predict performance and improve it in systems with advanced hierarchical memories, 2) understand and evaluate power and its impacts on performance, and 3) control power and performance for maximum efficiency. Our theories, techniques, and tools have been applied to high-end computing systems. Our theoretical models can improve algorithm performance by up to 59% and accurately predict the impacts of power on performance. Our techniques can evaluate the power consumption of high-end computing systems and their applications with fine granularity, and save up to 36% energy with little performance degradation.
8

Zhang, Ziming. "Adaptive Power Management for Autonomic Resource Configuration in Large-scale Computer Systems." Thesis, University of North Texas, 2015. https://digital.library.unt.edu/ark:/67531/metadc804939/.

Abstract:
In order to run and manage resource-intensive high-performance applications, large-scale computing and storage platforms have been evolving rapidly in various domains in both academia and industry. The energy expenditure consumed to operate and maintain these cloud computing infrastructures is a major factor influencing the overall profit and efficiency for most cloud service providers. Moreover, considering the mitigation of environmental damage from excessive carbon dioxide emission, the amount of power consumed by enterprise-scale data centers should be constrained for protection of the environment. Generally speaking, there exists a trade-off between power consumption and application performance in large-scale computing systems, and how to balance these two factors has become an important topic for researchers and engineers in the cloud and HPC communities. Therefore, minimizing power usage while satisfying Service Level Agreements has become one of the most desirable objectives in cloud computing research and implementation. Since the fundamental feature of the cloud computing platform is hosting workloads with a variety of characteristics in a consolidated and on-demand manner, it is essential to explore the inherent relationships between power usage and machine configurations. With an understanding of these relationships, researchers are able to develop effective power management policies to optimize productivity by balancing power usage and system performance. In this dissertation, we develop an autonomic power-aware system management framework for large-scale computer systems. We propose a series of techniques including coarse-grain power profiling, VM power modelling, power-aware resource auto-configuration and a full-system power usage simulator. These techniques help us to understand the characteristics of the power consumption of various system components. Based on these techniques, we are able to test various job scheduling strategies and develop resource management approaches to enhance the systems' power efficiency.
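As one concrete flavor of VM power modelling (an assumption on our part; the dissertation's actual model may differ), per-resource coefficients can be fitted so that predicted power is an idle term plus utilization-weighted terms:

    # Sketch of utilization-based VM power modelling: fit coefficients so
    # that P ~ p_idle + c_cpu*u_cpu + c_mem*u_mem + c_disk*u_disk.
    import numpy as np

    # Columns: CPU, memory, disk utilization in [0, 1]; y: measured watts.
    U = np.array([[0.1, 0.2, 0.0], [0.5, 0.3, 0.1],
                  [0.9, 0.6, 0.2], [0.7, 0.8, 0.5]])
    y = np.array([110.0, 160.0, 220.0, 210.0])

    X = np.hstack([np.ones((len(U), 1)), U])      # prepend idle-power term
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    p_idle, c_cpu, c_mem, c_disk = coef

    def predict_power(u_cpu, u_mem, u_disk):
        return p_idle + c_cpu * u_cpu + c_mem * u_mem + c_disk * u_disk

    print(round(predict_power(0.6, 0.4, 0.1), 1), "W (predicted)")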
9

Zheng, Li. "Power distribution network modeling and microfluidic cooling for high-performance computing systems." Diss., Georgia Institute of Technology, 2015. http://hdl.handle.net/1853/54449.

Abstract:
A silicon interposer platform with microfluidic cooling is proposed for high-performance computing systems. The key components and technologies for the proposed platform, including electrical and fluidic microbumps, microfluidic vias and heat sinks, and simultaneous flip-chip bonding of the electrical and fluidic microbumps, are developed and demonstrated. Fine-pitch electrical microbumps of 25 µm diameter and 50 µm pitch, fluidic vias of 100 µm diameter, and annular-shaped fluidic microbumps of 150 µm inner diameter and 210 µm outer diameter were fabricated and bonded. Electrical and fluidic tests were conducted to verify the bonding results. Moreover, the thermal and signaling benefits of the proposed platform were evaluated based on thermal measurements and simulations, and signaling simulations. Compared to conventional air cooling, significant reductions in system temperature and thermal coupling are achieved with the proposed platform. In addition, the signaling performance is improved due to the reduced temperature, especially for long interconnects on the silicon interposer. A numerical power distribution network (PDN) simulator is developed based on distributed circuit models for on-die power/ground grids and package- and board-level power/ground planes, and the finite difference method. The simulator enables power supply noise simulation, including IR-drop and simultaneous switching noise, for a full chip with multiple blocks of different power, decoupling capacitor, and power/ground pad densities. The distributed circuit model is further extended to include TSVs to enable simulations for 3D PDNs. The integration of package- and board-level power/ground planes enables co-simulation of the die-package-board PDN and exploration of new PDN configurations.
10

Tang, Kun. "Improving the Performance and Energy Efficiency for Power-constrained High Performance Computing." Diss., Temple University Libraries, 2017. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/467325.

Abstract:
The continuous growth in computing capability has expedited scientific discovery and enabled scientific applications to simulate physical phenomena at increased problem sizes. However, as computing capability escalates, power constraints are becoming a first-order concern for high performance computing (HPC) facilities. For example, the U.S. Department of Energy has set a power constraint of 20 MW for each exascale machine. How to achieve the target performance under power constraints remains an open issue. Efficient operation of these facilities therefore requires power constraints to be taken into account at all layers, which potentially impacts performance and energy efficiency. In order to improve the performance and energy efficiency of computing and storage resources under power constraints, I propose the following three techniques. First, I developed a power-aware checkpointing model by exploring the interplay among power capping, temperature, reliability, performance, and energy efficiency. Applying the model leads to maximized performance and energy efficiency, and minimized data movement over storage systems. Second, I characterized the performance and energy efficiency of HPC workflows on heterogeneous processors. In addition, I characterized how scientific simulation and analysis react differently to power capping and how they vary based on error resilience. Based on this characterization of HPC workflows, I developed a reliability-aware platform configuration model to determine the optimal platform configuration, including power allocation and distribution, power capping levels, and computing scales, for power-constrained HPC workflows. Third, I developed a proactive burst buffer draining scheme to minimize the I/O provisioning requirement of permanent storage systems while preserving system I/O performance. Facing power constraints, reducing the storage provisioning level directly decreases the power consumption of storage systems. Applying the proactive burst buffer draining scheme minimizes the storage provisioning level and power consumption without compromising storage I/O performance.
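The interplay the first technique explores can be made concrete with the classic first-order checkpointing rule (Young's approximation, shown here for illustration; the dissertation's model additionally couples power capping, temperature, and reliability, which this sketch only hints at through the MTBF parameter):

    # Young's first-order optimal checkpoint interval: tau = sqrt(2 * C * M),
    # where C is the checkpoint cost and M the mean time between failures.
    # Power capping typically lowers temperature and failure rate (raising M)
    # while stretching compute time, which is the trade-off being modeled.
    import math

    def optimal_interval(checkpoint_cost_s, mtbf_s):
        return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)

    for mtbf_h in (12, 24, 48):  # illustrative MTBFs under rising power caps
        tau = optimal_interval(checkpoint_cost_s=600, mtbf_s=mtbf_h * 3600)
        print(f"MTBF {mtbf_h:>2} h -> checkpoint every {tau/60:.0f} min")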
11

Amsler, Christopher. "The effects of hardware acceleration on power usage in basic high-performance computing." Thesis, Kansas State University, 2012. http://hdl.handle.net/2097/13742.

Abstract:
Power consumption has become a large concern in many systems, including portable electronics and supercomputers. Creating efficient hardware that can do more computation with less power is highly desirable. This project proposes a possible avenue to this goal by hardware-accelerating a conjugate gradient solver using a Field Programmable Gate Array (FPGA). The method uses three basic operations frequently: dot product, weighted vector addition, and sparse matrix-vector multiply. Each operation was accelerated on the FPGA. A power monitor was also implemented to measure the power consumption of the FPGA during each operation with several different implementations. Results showed that hardware-accelerating the dot product reduces runtime relative to a software-only approach. However, the more memory-intensive operations were slowed using the current architecture for hardware acceleration.
12

Khasawneh, Shadi Turki. "Low-power high-performance register file design for chip multiprocessors." Diss., Online access via UMI:, 2006.

13

Ahmed, Kishwar. "Energy Demand Response for High-Performance Computing Systems." FIU Digital Commons, 2018. https://digitalcommons.fiu.edu/etd/3569.

Abstract:
The growing computational demand of scientific applications has greatly motivated the development of large-scale high-performance computing (HPC) systems in the past decade. To accommodate the increasing demand of applications, HPC systems have been going through dramatic architectural changes (e.g., introduction of many-core and multi-core systems, rapid growth of complex interconnection networks for efficient communication between thousands of nodes), as well as significant increases in size (e.g., modern supercomputers consist of hundreds of thousands of nodes). With such changes in architecture and size, the energy consumption of these systems has increased significantly. With the advent of exascale supercomputers in the next few years, power consumption of HPC systems will surely increase; some systems may even consume hundreds of megawatts of electricity. Demand response programs are designed to help energy service providers stabilize the power system by reducing the energy consumption of participating systems during time periods of high power demand or temporary shortage in power supply. This dissertation focuses on developing energy-efficient demand-response models and algorithms to enable HPC systems' demand response participation. In the first part, we present interconnection network models for performance prediction of large-scale HPC applications. They are based on interconnection topologies widely used in HPC systems: dragonfly, torus, and fat-tree. Our interconnect models are fully integrated with an implementation of the message-passing interface (MPI) that can mimic most of its functions with packet-level accuracy. Extensive experiments show that our integrated models provide good accuracy for predicting the network behavior, while at the same time allowing for good parallel scaling performance. In the second part, we present an energy-efficient demand-response model to reduce HPC systems' energy consumption during demand response periods. We propose HPC job scheduling and resource provisioning schemes to enable HPC systems' emergency demand response participation. In the final part, we propose an economic demand-response model to allow both the HPC operator and HPC users to jointly reduce the HPC system's energy cost. Our proposed model allows the participation of HPC systems in economic demand-response programs through a contract-based rewarding scheme that can incentivize HPC users to participate in demand response.
14

Song, Shuaiwen. "Power, Performance and Energy Models and Systems for Emergent Architectures." Diss., Virginia Tech, 2013. http://hdl.handle.net/10919/19316.

Abstract:
Massive parallelism combined with complex memory hierarchies and heterogeneity in high-performance computing (HPC) systems form a barrier to efficient application and architecture design. The performance achievements of the past must continue over the next decade to address the needs of scientific simulations. However, building an exascale system by 2022 that uses less than 20 megawatts will require significant innovations in power and performance efficiency.

A key limitation of past approaches is a lack of power-performance policies allowing users to quantitatively bound the effects of power management on the performance of their applications and systems. Existing controllers and predictors use policies fixed by a knowledgeable user to opportunistically save energy and minimize performance impact. While the qualitative effects are often good and the aggressiveness of a controller can be tuned to try to save more or less energy, the quantitative effects of tuning and setting opportunistic policies on performance and power are unknown. In other words, the controller will save energy and minimize performance loss in many cases but we have little understanding of the quantitative effects of controller tuning. This makes setting power-performance policies a manual trial and error process for domain experts and a black art for practitioners. To improve upon past approaches to high-performance power management, we need to quantitatively understand the effects of power and performance at scale.

In this work, I have developed theories and techniques to quantitatively understand the relationship between power and performance for high performance systems at scale. For instance, our system-level, iso-energy-efficiency model analyzes, evaluates and predicts the performance and energy use of data intensive parallel applications on multi-core systems. This model allows users to study the effects of machine and application dependent characteristics on system energy efficiency. Furthermore, this model helps users isolate root causes of energy or performance inefficiencies and develop strategies for scaling systems to maintain or improve efficiency. I have also developed methodologies which can be extended and applied to model modern heterogeneous architectures such as GPU-based clusters to improve their efficiency at scale.
15

Shah, Ankur Savailal. "Prediction Models for Multi-dimensional Power-Performance Optimization on Many Cores." Thesis, Virginia Tech, 2008. http://hdl.handle.net/10919/31826.

Abstract:
Power has become a primary concern for HPC systems. Dynamic voltage and frequency scaling (DVFS) and dynamic concurrency throttling (DCT) are two software tools (or knobs) for reducing the dynamic power consumption of HPC systems. To date, few works have considered the synergistic integration of DVFS and DCT in performance-constrained systems, and, to the best of our knowledge, no prior research has developed application-aware simultaneous DVFS and DCT controllers in real systems and parallel programming frameworks. We present a multi-dimensional, online performance prediction framework, which we deploy to address the problem of simultaneous runtime optimization of DVFS, DCT, and thread placement on multi-core systems. We present results from an implementation of the prediction framework in a runtime system linked to the Intel OpenMP runtime environment and running on a real dual-processor quad-core system as well as a dual-processor dual-core system. We show that the prediction framework derives near-optimal settings of the three power-aware program adaptation knobs that we consider. Our overall runtime optimization framework achieves significant reductions in energy (12.27% mean) and ED2 (29.6% mean), through simultaneous power savings (3.9% mean) and performance improvements (10.3% mean). Our prediction and adaptation framework outperforms earlier solutions that adapt only DVFS or DCT, as well as one that sequentially applies DCT then DVFS. Further, our results indicate that prediction-based schemes for runtime adaptation compare favorably and typically improve upon heuristic search-based approaches in both performance and energy savings.
16

Cao, Zhenwei. "Power Saving Analysis and Experiments for Large Scale Global Optimization." Thesis, Virginia Tech, 2009. http://hdl.handle.net/10919/33944.

Abstract:
Green computing, an emerging field of research that seeks to reduce excess power consumption in high performance computing (HPC), is gaining popularity among researchers. Research in this field often relies on simulation or only uses a small cluster, typically 8 or 16 nodes, because of the lack of hardware support. In contrast, System G at Virginia Tech is a 2592-processor supercomputer equipped with power-aware components suitable for large-scale green computing research. DIRECT is a deterministic global optimization algorithm, implemented in the mathematical software package VTDIRECT95. This thesis explores the potential energy savings for the parallel implementation of DIRECT, called pVTdirect, when used with a large-scale computational biology application, parameter estimation for a budding yeast cell cycle model, on System G. Two power-aware approaches for pVTdirect are developed and compared against the CPUSPEED power saving system tool. The results show that knowledge of the parallel workload of the underlying application is beneficial for power management.
17

Ceccolini, Enrico. "A Machine Learning and Constraint Programming Approach to Power-Aware Job Dispatching in HPC Systems." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/20585/.

Abstract:
The demand for more powerful supercomputers continues to increase, along with the variety of applications submitted by researchers. Recent developments in the analysis of big data and new programming paradigms such as map-reduce put further emphasis on short jobs. Hence, HPC job dispatchers have to rapidly process a large number of short jobs in real time, taking into account their energy needs. Since successive jobs can be similar, it is effective to exploit information from past executions quickly. Existing Machine Learning methods need to be retrained on new data, so we build a very simple data-driven model of power consumption that can be updated online. Using data from Eurora's monitoring framework, we built a data set to develop our ML-based model. We integrated our job power predictor into an existing CP-based dispatcher to obtain a power-aware job dispatcher. We achieved excellent results without affecting the HPC system's quality of service (QoS).
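The abstract does not spell out the model's form; one minimal kind of online-updatable, data-driven power predictor, assuming jobs are keyed by user and application name, is an exponential moving average refreshed after every finished job:

    # Sketch of an online-updatable power predictor (assumed structure; the
    # thesis builds its model from Eurora monitoring data): keep a running
    # average of measured power per (user, application) key.
    class OnlinePowerModel:
        def __init__(self, alpha=0.3, default_watts=200.0):
            self.alpha = alpha          # weight of the newest observation
            self.default = default_watts
            self.avg = {}

        def predict(self, user, app):
            return self.avg.get((user, app), self.default)

        def update(self, user, app, measured_watts):
            key = (user, app)
            old = self.avg.get(key, measured_watts)
            self.avg[key] = (1 - self.alpha) * old + self.alpha * measured_watts

    model = OnlinePowerModel()
    model.update("alice", "cfd_solver", 310.0)
    model.update("alice", "cfd_solver", 290.0)
    print(model.predict("alice", "cfd_solver"))  # refined estimate, 304.0 W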
18

Kailasam, Umadevi. "High level VHDL modeling of a low-power ASIC for a tour guide." [Tampa, Fla.] : University of South Florida, 2004. http://purl.fcla.edu/fcla/etd/SFE0000262.

19

Etinski, Maja. "DVFS power management in HPC systems." Doctoral thesis, Universitat Politècnica de Catalunya, 2012. http://hdl.handle.net/10803/96192.

Abstract:
Recent increases in the performance of High Performance Computing (HPC) systems have been followed by even higher increases in power consumption. The power draw of modern supercomputers leads to very high operating costs and reliability concerns, and it has negative consequences for the environment. Accordingly, over the last decade there have been many works dealing with power/energy management in HPC systems. Since the CPU accounts for a high portion of the total system power consumption, our work aims at CPU power reduction. Dynamic Voltage Frequency Scaling (DVFS) is a widely used technique for CPU power management. Running an application at lower frequency/voltage reduces its power consumption. However, frequency scaling should be used carefully since it has negative effects on the application performance. We argue that the job scheduler level presents a good place for power management in an HPC center, bearing in mind that a parallel job scheduler has a global overview of the entire system. In this thesis we propose power-aware parallel job scheduling policies where the scheduler determines the job CPU frequency, besides the job execution order. Based on their goal, the proposed policies can be classified into two groups: energy saving and power budgeting policies. The energy saving policies aim to reduce CPU energy consumption with a minimal job performance penalty. The first of the energy saving policies assigns the job frequency based on system utilization, while the other makes job performance predictions. While these policies achieve energy savings for less loaded workloads, highly loaded workloads suffer from substantial performance degradation because of higher job wait times, due to an increase in load caused by longer job run times. Our results show higher potential of the DVFS technique when applied for power budgeting. The second group contains policies for power-constrained systems. In contrast to systems without a power limitation, in the case of a given power budget the DVFS technique even improves overall job performance, reducing the average job wait time. This comes from lower job power consumption, which allows more jobs to run simultaneously. The first proposed policy from this group assigns CPU frequency using the job's predicted performance and the current power draw of already running jobs. The other power budgeting policy is based on an optimization problem whose solution determines the job execution order, as well as the power distribution among jobs selected for execution. This policy fully exploits the available power and leads to further performance improvements. The last contribution of the thesis is an analysis of the potential of the DVFS technique for the energy-performance trade-off in current and future HPC systems. Ongoing changes in technology decrease the applicability of DVFS for energy savings, but the technique still reduces power consumption, making it useful for power-constrained systems. In order to analyze the potential of DVFS, a model of the impact of frequency scaling on MPI application execution time has been proposed and validated against measurements on a large-scale system. This parametric analysis showed for which application and platform characteristics frequency scaling leads to energy savings.
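The frequency-scaling model mentioned in the last paragraph is, in the form published by Etinski et al. (cited here as a reasonable reading of the abstract, not a quotation of the thesis), a one-parameter model where beta in [0, 1] measures how sensitive an application's run time is to CPU frequency:

    % Predicted run time at frequency f, given measured time at f_max:
    T(f) = T(f_{\max}) \left[ \beta \left( \frac{f_{\max}}{f} - 1 \right) + 1 \right],
    \qquad 0 \le \beta \le 1 .

A fully CPU-bound code has beta = 1, so run time grows in direct proportion to the slowdown; a memory- or communication-bound code has beta close to 0, so frequency, and CPU power with it, can be lowered at little cost in time.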
20

Tunc, Cihan. "Autonomic Cloud Resource Management." Diss., The University of Arizona, 2015. http://hdl.handle.net/10150/347144.

Abstract:
The power consumption of data centers and cloud systems increased almost three times between 2007 and 2012. Traditional resource allocation methods are typically designed for high performance as the primary objective, to support peak resource requirements. However, it has been shown that server utilization is between 12% and 18%, while the power consumption is close to that at peak loads. Hence, there is a pressing need for devising sophisticated resource management approaches. State-of-the-art dynamic resource management schemes typically rely on only a single resource such as core number, core speed, memory, disk, or network. There is a lack of fundamental research on methods addressing dynamic management of multiple resources and properties with the objective of allocating just enough resources for each workload to meet quality of service requirements while optimizing for power consumption. The main focus of this dissertation is to simultaneously manage power and performance for large cloud systems. The objective of this research is to develop a framework for performance and power management and investigate a general methodology for integrated autonomic cloud management. In this dissertation, we developed an autonomic management framework based on a novel data structure, AppFlow, used for modeling current and near-term future cloud application behavior. We have developed the following capabilities for the performance and power management of cloud computing systems: 1) online modeling and characterization of cloud application behavior and resource requirements; 2) prediction of application behavior to proactively optimize its operations at runtime; 3) a holistic optimization methodology for performance and power using the number of cores, CPU frequency, and memory amount; and 4) an autonomic cloud management scheme to support dynamic changes in VM configurations at runtime to simultaneously optimize multiple objectives including performance, power, and availability. We validated our approach using the RUBiS benchmark (emulating eBay) on an IBM HS22 blade server. Our experimental results showed that our approach can lead to a significant reduction in power consumption: up to 87% when compared to a static resource allocation strategy, 72% when compared to an adaptive frequency scaling strategy, and 66% when compared to a multi-resource management strategy.
21

Curtis-Maury, Matthew. "Improving the Efficiency of Parallel Applications on Multithreaded and Multicore Systems." Diss., Virginia Tech, 2008. http://hdl.handle.net/10919/26697.

Abstract:
The scalability of parallel applications executing on multithreaded and multicore multiprocessors is often quite limited due to large degrees of contention over shared resources on these systems. In fact, negative scalability frequently occurs, such that a non-negligible performance loss is observed through the use of more processors and cores. In this dissertation, we present a prediction model for identifying efficient operating points of concurrency in multithreaded scientific applications, in terms of both performance as a primary objective and power secondarily. We also present a runtime system that uses live analysis of hardware event rates through the prediction model to optimize applications dynamically. We discuss a dynamic, phase-aware performance prediction model (DPAPP), which combines statistical learning techniques, including multivariate linear regression and artificial neural networks, with runtime analysis of data collected from hardware event counters to locate optimal operating points of concurrency. We find that the scalability model achieves accuracy approaching 95%, sufficiently accurate to identify improved concurrency levels and thread placements from within real parallel scientific applications. Using DPAPP, we develop a prediction-driven runtime optimization scheme, called ACTOR, which throttles concurrency so that power consumption can be reduced and performance can be set at the knee of the scalability curve of each parallel execution phase in an application. ACTOR successfully identifies and exploits program phases where limited scalability results in a performance loss through the use of more processing elements, providing simultaneous reductions in execution time by 5%-18% and power consumption by 0%-11% across a variety of parallel applications and architectures. Further, we extend DPAPP and ACTOR to include support for runtime adaptation of DVFS, allowing for the synergistic exploitation of concurrency throttling and DVFS from within a single, autonomically-acting library, providing improved energy-efficiency compared to either approach in isolation.
22

Li, Dong. "Scalable and Energy Efficient Execution Methods for Multicore Systems." Diss., Virginia Tech, 2011. http://hdl.handle.net/10919/26098.

Abstract:
Multicore architectures impose great pressure on resource management. The exploration spaces available for resource management increase explosively, especially for large-scale high-end computing systems. The availability of abundant parallelism causes scalability concerns at all levels. Multicore architectures also impose pressure on power management. Growth in the number of cores causes continuous growth in power. In this dissertation, we introduce methods and techniques to enable scalable and energy-efficient execution of parallel applications on multicore architectures. We study strategies and methodologies that combine DCT and DVFS for the hybrid MPI/OpenMP programming model. Our algorithms yield substantial energy savings (8.74% on average and up to 13.8%) with either negligible performance loss or performance gain (up to 7.5%). To save additional energy for high-end computing systems, we propose a power-aware MPI task aggregation framework. The framework predicts the performance effect of task aggregation in both computation and communication phases and its impact in terms of execution time and energy of MPI programs. Our framework provides accurate predictions that lead to substantial energy savings through aggregation (64.87% on average and up to 70.03%) with tolerable performance loss (under 5%). As we aggregate multiple MPI tasks within the same node, we have the scalability concern of memory registration for high performance networking. We propose a new memory registration/deregistration strategy to reduce registered memory on multicore architectures with helper threads. We investigate design policies and performance implications of the helper thread approach. Our method efficiently reduces registered memory (23.62% on average and up to 49.39%) and avoids memory registration/deregistration costs for reused communication memory. Our system enables the execution of application input sets that could not run to completion with the memory registration limitation.
23

Woo, Dong Hyuk. "Designing heterogeneous many-core processors to provide high performance under limited chip power budget." Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/37294.

Abstract:
This thesis describes the efficient design of a future many-core processor that can provide higher performance under the limited chip power budget. To achieve such a goal, this thesis first develops an analytical framework within which computer architects can estimate achievable performance improvement of different many-core architectures given the same power budget. From this study, this thesis found that a future many-core processor needs (1) energy-efficient parallel cores and (2) a high-performance sequential core. Based on these observations, this thesis proposes an energy-efficient broad-purpose acceleration layer that can be snapped on top of a conventional general-purpose processor. In addition to such an energy-efficient parallel cores, this thesis also proposes different architectural techniques to further boost the performance of sequential computation while those parallel cores are idle. In particular, this thesis develops low-cost architectural techniques to enhance the memory performance of a host core by utilizing those idle parallel cores. This idea is evaluated in two different system architectures: one with the aforementioned acceleration layer and the other with an emerging integrated CPU and GPU chip.
24

Green, Robert C. II. "Novel Computational Methods for the Reliability Evaluation of Composite Power Systems using Computational Intelligence and High Performance Computing Techniques." University of Toledo / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1338894641.

25

Schultek, Brian Robert. "Design and Implementation of the Heterogeneous Computing Device Management Architecture." University of Dayton / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1417801414.

26

Shang, Pengju. "Research in high performance and low power computer systems for data-intensive environment." Doctoral diss., University of Central Florida, 2011. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/5033.

Abstract:
The evolution of computer science and engineering is always motivated by the requirements for better performance, power efficiency, security, user interface (UI), etc. {CM02}. The first two factors are potential tradeoffs: better performance usually requires better hardware, e.g., CPUs with a larger number of transistors or disks with higher rotation speed; however, the increasing number of transistors on a single die or chip exhibits super-linear growth in CPU power consumption {FAA08a}, and a change in disk rotation speed has a quadratic effect on disk power consumption {GSK03}. We propose three new systematic approaches, as shown in Figure 1.1 (Research Work Overview): Transactional RAID, data-affinity-aware data placement (DAFA), and modeless power management, to tackle the performance problem in database systems and large-scale clusters or cloud platforms, and the power management problem in Chip Multi-Processors, respectively. The first design, Transactional RAID (TRAID), is motivated by the fact that in recent years more storage system applications have employed transaction processing techniques to ensure data integrity and consistency. In transaction processing systems (TPS), the log is a form of redundancy that ensures the transaction ACID properties (atomicity, consistency, isolation, durability) and data recoverability. Furthermore, highly reliable storage systems, such as redundant arrays of inexpensive disks (RAID), are widely used as the underlying storage for databases to guarantee system reliability and availability with high I/O performance. However, databases and storage systems tend to implement their fault-tolerance mechanisms independently {GR93, Tho05}, each from its own perspective, leading to potentially high overhead. We observe the overlapping redundancies between TPS and RAID systems and propose a novel reliable storage architecture called Transactional RAID (TRAID). TRAID deduplicates this overlap by logging only one compact version (the XOR result) of the recovery references for the updated data. It minimizes the amount of log content as well as the log-flushing overhead, thereby boosting overall transaction processing performance. At the same time, TRAID guarantees comparable RAID reliability and the same recovery correctness and ACID semantics as traditional transaction processing systems. On the other hand, the emerging myriad of data-intensive applications places a demand on high-performance computing resources with massive storage. Academia and industry pioneers have been developing big-data parallel computing frameworks and large-scale distributed file systems (DFS), widely used to facilitate high-performance runs of data-intensive applications such as bioinformatics {Sch09}, astronomy {RSG10}, and high-energy physics {LGC06}. Our recent work {SMW10} reported that data distribution in a DFS can significantly affect the efficiency of data processing and hence overall application performance, especially for applications with sophisticated access patterns. For example, Yahoo's Hadoop clusters {refg} employ a random data placement strategy for load balance and simplicity {reff}, which allows MapReduce {DG08} programs to access all the data (without distinguishing interest locality) at full parallelism. Our work focuses on Hadoop systems. We observed that data distribution is one of the most important factors affecting parallel programming performance, yet default Hadoop adopts a random data distribution strategy that does not consider data semantics, specifically data affinity. We propose a Data-Affinity-Aware (DAFA) data placement scheme to address this problem. DAFA builds a historical data-access graph to exploit the data affinity. According to the data affinity, DAFA reorganizes data to maximize the parallelism of affinitive data, subject to overall load balance. This enables DAFA to realize the maximum number of map tasks with data locality. Besides system performance, power consumption is another important concern of current computer systems. In the U.S. alone, the energy that could be saved by servers comes to 3.17 million tons of carbon dioxide, or the output of 580,678 cars {Kar09}. However, the goals of high performance and low energy consumption are at odds with each other. An ideal power management strategy should be able to respond dynamically to changes (linear, nonlinear, or model-free) in workloads and system configuration without violating performance requirements. We propose a novel power management scheme called MAR (modeless, adaptive, rule-based) for multiprocessor systems to minimize CPU power consumption under performance constraints. By using richer feedback factors, e.g., I/O wait, MAR is able to accurately describe the relationships among core frequencies, performance, and power consumption. We adopt a modeless control model to reduce the complexity of system modeling. MAR is designed for CMP (Chip Multi-Processor) systems, employing multi-input/multi-output (MIMO) control theory and per-core DVFS (Dynamic Voltage and Frequency Scaling).
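The redundancy overlap TRAID removes can be illustrated in a few lines of Python. This is a toy sketch, not TRAID itself: the buffer contents are invented, and a real implementation operates on disk blocks inside a RAID layer and a database log manager. It shows only why a single XOR record is enough to serve as both an undo and a redo reference:

```python
# Toy illustration of the redundancy TRAID exploits: the XOR of old and
# new data is one compact record from which either version is recoverable,
# so separate undo/redo log images and the RAID parity delta overlap.
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

old, new = b'ACCT=100', b'ACCT=175'     # invented before/after images
delta = xor_bytes(old, new)             # the single logged record
assert xor_bytes(new, delta) == old     # rollback (undo)
assert xor_bytes(old, delta) == new     # replay (redo)
```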
APA, Harvard, Vancouver, ISO, and other styles
27

Savoie, Lee, David K. Lowenthal, Bronis R. de Supinski, et al. "I/O Aware Power Shifting." IEEE, 2016. http://hdl.handle.net/10150/622666.

Full text
Abstract:
Power limits on future high-performance computing (HPC) systems will constrain applications. However, HPC applications do not consume constant power over their lifetimes. Thus, applications assigned a fixed power bound may be forced to slow down during high-power computation phases, but may not consume their full power allocation during low-power I/O phases. This paper explores algorithms that leverage application semantics (phase frequency, duration, and power needs) to shift unused power from applications in I/O phases to applications in computation phases, thus improving system-wide performance. We design novel techniques that include explicit staggering of applications to improve power shifting. Compared to executing without power shifting, our algorithms can improve average performance by up to 8% or improve performance of a single, high-priority application by up to 32%.
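The shifting idea can be sketched in a few lines of Python. This is a minimal illustration, not the paper's algorithm: the job fields, the even split of surplus watts, and all numbers are invented, and the paper's staggering technique is not modeled:

```python
# Hedged sketch of I/O-aware power shifting: jobs in I/O phases keep only
# what an I/O phase needs; their unused watts are pooled and handed to
# jobs in compute phases, keeping the total within the system bound.
def shift_power(jobs, system_bound):
    caps, surplus = {}, 0.0
    for j in jobs:
        if j['phase'] == 'io':
            caps[j['name']] = j['io_power']
            surplus += j['fair_share'] - j['io_power']
    compute = [j for j in jobs if j['phase'] == 'compute']
    for j in compute:  # surplus split evenly on top of the fair share
        caps[j['name']] = j['fair_share'] + surplus / max(len(compute), 1)
    assert sum(caps.values()) <= system_bound + 1e-9
    return caps

jobs = [{'name': 'A', 'phase': 'compute', 'fair_share': 100, 'io_power': 60},
        {'name': 'B', 'phase': 'io',      'fair_share': 100, 'io_power': 60}]
print(shift_power(jobs, system_bound=200))   # {'B': 60, 'A': 140.0}
```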
APA, Harvard, Vancouver, ISO, and other styles
28

Patki, Tapasya. "The Case For Hardware Overprovisioned Supercomputers." Diss., The University of Arizona, 2015. http://hdl.handle.net/10150/577307.

Full text
Abstract:
Power management is one of the most critical challenges on the path to exascale supercomputing. High Performance Computing (HPC) centers today are designed to be worst-case power provisioned, leading to two main problems: limited application performance and under-utilization of procured power. In this dissertation we introduce hardware overprovisioning: a novel, flexible design methodology for future HPC systems that addresses the aforementioned problems and leads to significant improvements in application and system performance under a power constraint. We first establish that choosing the right configuration based on application characteristics when using hardware overprovisioning can improve application performance under a power constraint by up to 62%. We conduct a detailed analysis of the infrastructure costs associated with hardware overprovisioning and show that it is an economically viable supercomputing design approach. We then develop RMAP (Resource MAnager for Power), a power-aware, low-overhead, scalable resource manager for future hardware overprovisioned HPC systems. RMAP addresses the issue of under-utilized power by using power-aware backfilling and improves job turnaround times by up to 31%. This dissertation opens up several new avenues for research in power-constrained supercomputing as we venture toward exascale, and we conclude by enumerating these.
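The power-aware backfilling idea behind RMAP can be caricatured in a few lines (an assumption-laden sketch, not RMAP: job power requests are taken at face value, and the reservation bookkeeping that keeps backfilled jobs from delaying the head job is omitted):

```python
# Sketch of a power-aware backfill pass: a queued job may start early
# only if both free nodes AND unallocated watts suffice, so procured
# power, not just node count, becomes a first-class schedulable resource.
def backfill(queue, free_nodes, free_watts):
    scheduled, still_queued = [], []
    for job in queue:                    # ordered by priority/arrival
        if job['nodes'] <= free_nodes and job['watts'] <= free_watts:
            free_nodes -= job['nodes']
            free_watts -= job['watts']
            scheduled.append(job['id'])
        else:
            still_queued.append(job['id'])
    return scheduled, still_queued

queue = [{'id': 1, 'nodes': 64, 'watts': 9000},
         {'id': 2, 'nodes': 16, 'watts': 2000},
         {'id': 3, 'nodes': 8,  'watts': 1500}]
print(backfill(queue, free_nodes=32, free_watts=4000))   # ([2, 3], [1])
```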
APA, Harvard, Vancouver, ISO, and other styles
29

Schöne, Robert. "A Unified Infrastructure for Monitoring and Tuning the Energy Efficiency of HPC Applications." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2017. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-230356.

Full text
Abstract:
High Performance Computing (HPC) has become an indispensable tool for the scientific community to perform simulations on models whose complexity would exceed the limits of a standard computer. An unfortunate trend concerning HPC systems is that their power consumption under demanding workloads keeps increasing. To counter this trend, hardware vendors have implemented power saving mechanisms in recent years, which has increased the variability in the power demands of single nodes. These capabilities provide an opportunity to increase the energy efficiency of HPC applications. To utilize these hardware power saving mechanisms efficiently, their overhead must be analyzed. Furthermore, applications have to be examined for performance and energy efficiency issues, which can give hints for optimizations. This requires an infrastructure that is able to capture both performance and power consumption information concurrently. The mechanisms that such an infrastructure would inherently support could further be used to implement a tool that can both measure and tune energy efficiency. This thesis targets all steps in this process by making the following contributions: First, I provide a broad overview of related fields. I list common performance measurement tools, power measurement infrastructures, hardware power saving capabilities, and tuning tools. Second, I lay out a model that can be used to define and describe energy efficiency tuning at program region scale. This model includes hardware- and software-dependent parameters. Hardware parameters include the runtime overhead and delay of switching power saving mechanisms, as well as a consideration of their scopes and their possible influence on application performance. Thus, in a third step, I present methods to evaluate common power saving mechanisms and list findings for different x86 processors. Software parameters include an application's performance and power consumption characteristics as well as the influence of power-saving mechanisms on these. To capture software parameters, an infrastructure for measuring performance and power consumption is necessary. With minor additions, the same infrastructure can later be used to tune software and hardware parameters. Thus, I lay out the structure of such an infrastructure and describe the common components required for measuring and tuning. Based on that, I implement adequate interfaces that extend the functionality of contemporary performance measurement tools. Furthermore, I use these interfaces to conflate performance and power measurements and further process the gathered information for tuning. I conclude this work by demonstrating that the infrastructure can be used to manipulate power-saving mechanisms of contemporary x86 processors and increase the energy efficiency of HPC applications.
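The basic fusion of performance and power data that such an infrastructure performs can be shown concretely (a minimal sketch under invented numbers: a sampled power trace is integrated over program-region enter/exit timestamps to attribute energy to regions):

```python
# Attribute energy to program regions by integrating a power trace
# (trapezoidal rule) between each region's enter and exit timestamps.
import numpy as np

t = np.array([0.0, 0.1, 0.2, 0.3, 0.4, 0.5])    # sample times (s)
p = np.array([40., 42., 85., 90., 88., 41.])     # power samples (W)
regions = {'init': (0.0, 0.2), 'solver': (0.2, 0.5)}   # enter/exit (s)

for name, (t0, t1) in regions.items():
    m = (t >= t0) & (t <= t1)
    ts, ps = t[m], p[m]
    joules = float(np.sum(0.5 * (ps[1:] + ps[:-1]) * np.diff(ts)))
    print(f'{name}: {joules:.2f} J over {t1 - t0:.1f} s')
```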
APA, Harvard, Vancouver, ISO, and other styles
30

Adhinarayanan, Vignesh. "Models and Techniques for Green High-Performance Computing." Diss., Virginia Tech, 2020. http://hdl.handle.net/10919/98660.

Full text
Abstract:
High-performance computing (HPC) systems have become power limited. For instance, the U.S. Department of Energy set a power envelope of 20MW in 2008 for the first exascale supercomputer now expected to arrive in 2021-22. Toward this end, we seek to improve the greenness of HPC systems by improving their performance per watt at the allocated power budget. In this dissertation, we develop a series of models and techniques to manage power at micro-, meso-, and macro-levels of the system hierarchy, specifically addressing data movement and heterogeneity. We target the chip interconnect at the micro-level, heterogeneous nodes at the meso-level, and a supercomputing cluster at the macro-level. Overall, our goal is to improve the greenness of HPC systems by intelligently managing power. The first part of this dissertation focuses on measurement and modeling problems for power. First, we study how to infer chip-interconnect power by observing the system-wide power consumption. Our proposal is to design a novel micro-benchmarking methodology based on data-movement distance by which we can properly isolate the chip interconnect and measure its power. Next, we study how to develop software power meters to monitor a GPU's power consumption at runtime. Our proposal is to adapt performance counter-based models for their use at runtime via a combination of heuristics, statistical techniques, and application-specific knowledge. In the second part of this dissertation, we focus on managing power. First, we propose to reduce the chip-interconnect power by proactively managing its dynamic voltage and frequency (DVFS) state. Toward this end, we develop a novel phase predictor that uses approximate pattern matching to forecast future requirements and in turn, proactively manage power. Second, we study the problem of applying a power cap to a heterogeneous node. Our proposal proactively manages the GPU power using phase prediction and a DVFS power model but reactively manages the CPU. The resulting hybrid approach can take advantage of the differences in the capabilities of the two devices. Third, we study how in-situ techniques can be applied to improve the greenness of HPC clusters. Overall, in our dissertation, we demonstrate that it is possible to infer power consumption of real hardware components without directly measuring them, using the chip interconnect and GPU as examples. We also demonstrate that it is possible to build models of sufficient accuracy and apply them for intelligently managing power at many levels of the system hierarchy.

Past research in green high-performance computing (HPC) mostly focused on managing the power consumed by general-purpose processors, known as central processing units (CPUs) and to a lesser extent, memory. In this dissertation, we study two increasingly important components: interconnects (predominantly focused on those inside a chip, but not limited to them) and graphics processing units (GPUs). Our contributions in this dissertation include a set of innovative measurement techniques to estimate the power consumed by the target components, statistical and analytical approaches to develop power models and their optimizations, and algorithms to manage power statically and at runtime. Experimental results show that it is possible to build models of sufficient accuracy and apply them for intelligently managing power on multiple levels of the system hierarchy: chip interconnect at the micro-level, heterogeneous nodes at the meso-level, and a supercomputing cluster at the macro-level.
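The hybrid cap-splitting idea for a heterogeneous node can be sketched as follows (illustrative only: the DVFS power table, the CPU floor, and the phase-intensity scaling are invented placeholders, not measured values or the dissertation's model):

```python
# Split a node power cap: proactively choose the highest GPU clock whose
# predicted power for the upcoming phase leaves room for a CPU floor,
# then hand the CPU whatever remains (to be enforced reactively).
GPU_DVFS = {852: 180.0, 705: 140.0, 540: 100.0}   # MHz -> peak watts

def apply_cap(node_cap, predicted_phase, cpu_floor=60.0):
    for mhz in sorted(GPU_DVFS, reverse=True):
        gpu_w = GPU_DVFS[mhz] * predicted_phase['intensity']
        if gpu_w + cpu_floor <= node_cap:
            return {'gpu_mhz': mhz, 'cpu_cap_w': node_cap - gpu_w}
    return {'gpu_mhz': min(GPU_DVFS), 'cpu_cap_w': cpu_floor}

print(apply_cap(250.0, {'intensity': 0.9}))
# {'gpu_mhz': 852, 'cpu_cap_w': 88.0}
```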
APA, Harvard, Vancouver, ISO, and other styles
31

Huang, Song. "Enhancing Storage Dependability and Computing Energy Efficiency for Large-Scale High Performance Computing Systems." Thesis, University of North Texas, 2019. https://digital.library.unt.edu/ark:/67531/metadc1505142/.

Full text
Abstract:
With the advent of the information explosion age, larger-capacity disk drives are used to store data and powerful devices are used to process big data. As the scale and complexity of computer systems increase, we expect these systems to provide dependable and energy-efficient services and computation. Although hard drives are reliable in general, they are the most commonly replaced hardware components. Disk failures cause data corruption and even data loss, which can significantly degrade system performance and cause financial losses. In this dissertation research, I analyze different manifestations of disk failures in production data centers and explore data mining techniques combined with statistical analysis methods to discover categories of disk failures and their distinctive properties. I use similarity measures to quantify the degradation process of each failure type and derive its degradation signature. The derived degradation signatures are further leveraged to forecast when future disk failures may happen. This dissertation also studies the energy efficiency of high performance computers. Specifically, I characterize the power and energy consumption of Haswell processors, which are used in multiple supercomputers, and analyze the power and energy consumption of Legion, a data-centric programming model and runtime system, and of Legion applications. We find that power and energy efficiency can be improved significantly by optimizing the settings and runtime scheduling of processors, and that the Legion runtime performs well for larger-scale computation in terms of power and energy consumption.
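The signature-matching step can be pictured with a toy similarity computation (fabricated vectors; the dissertation's actual features, similarity measure, and failure categories may differ):

```python
# Match a disk's recent attribute trend against stored degradation
# signatures using cosine similarity; the best match labels the disk.
import numpy as np

signatures = {                       # invented per-category signatures
    'media_wear':   np.array([0.9, 0.1, 0.0, 0.3]),
    'head_failure': np.array([0.1, 0.8, 0.5, 0.0]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def classify(trend):
    return max(signatures, key=lambda k: cosine(trend, signatures[k]))

trend = np.array([0.7, 0.2, 0.1, 0.2])   # e.g. slopes of SMART counters
print(classify(trend))                    # 'media_wear'
```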
APA, Harvard, Vancouver, ISO, and other styles
32

Rubeck, Christophe. "Calcul hautes performances pour les formulations intégrales en électromagnétisme basses fréquences." Phd thesis, Université de Grenoble, 2012. http://tel.archives-ouvertes.fr/tel-00793505.

Full text
Abstract:
Integral equation methods are particularly well suited to the modeling of electromagnetic systems because, unlike finite element methods, they do not require meshing inactive materials such as air. The resulting models are therefore light in terms of the number of degrees of freedom. However, they are full-interaction methods that generate dense system matrices, which are time-consuming to compute and costly to store in the computer's main memory. In this work, we reduce computation times through parallelism, i.e. the use of multiple processors, notably on graphics cards (GPGPU). We also reduce the memory storage cost via wavelet-based matrix compression (an algorithm close to image compression). Since the compression is lossy, we have developed a criterion to control the error it introduces. The methods developed are applied to an electrostatic formulation for capacitance computation, but they are in principle also applicable to other formulations.
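The lossy wavelet compression can be demonstrated on a small dense matrix (a sketch only: one Haar level with a crude max-based threshold, whereas the thesis applies a proper multi-level transform and a derived error-control criterion):

```python
# One Haar wavelet level applied to rows and columns of a dense matrix,
# followed by thresholding. The transform is orthonormal, so the
# Frobenius error of the dropped coefficients equals the matrix error.
import numpy as np

def haar_rows(a):                         # pairwise sums/differences
    s = (a[:, 0::2] + a[:, 1::2]) / np.sqrt(2)
    d = (a[:, 0::2] - a[:, 1::2]) / np.sqrt(2)
    return np.hstack([s, d])

def haar2(a):                             # rows, then columns
    return haar_rows(haar_rows(a).T).T

n = 64                                    # toy 1/(1+|i-j|) interaction matrix
x = np.fromfunction(lambda i, j: 1.0 / (1.0 + abs(i - j)), (n, n))
w = haar2(x)
w_c = np.where(np.abs(w) > 1e-3 * np.abs(w).max(), w, 0.0)
kept = np.count_nonzero(w_c) / w.size
err = np.linalg.norm(w - w_c) / np.linalg.norm(w)
print(f'kept {kept:.1%} of coefficients, relative error {err:.2e}')
```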
APA, Harvard, Vancouver, ISO, and other styles
33

Tsafack, Chetsa Ghislain Landry. "System Profiling and Green Capabilities for Large Scale and Distributed Infrastructures." Phd thesis, Ecole normale supérieure de lyon - ENS LYON, 2013. http://tel.archives-ouvertes.fr/tel-00946583.

Full text
Abstract:
Nowadays, reducing the energy consumption of large scale and distributed infrastructures has truly become a challenge for both industry and academia. This is corroborated by the many efforts aiming to reduce the energy consumption of those systems. Initiatives for reducing the energy consumption of large scale and distributed infrastructures can, without loss of generality, be broken into hardware and software initiatives. Unlike their hardware counterparts, software solutions to the energy reduction problem in large scale and distributed infrastructures hardly result in real deployments. On the one hand, this can be justified by the fact that they are application oriented. On the other hand, their failure can be attributed to their complex nature, which often requires vast technical knowledge behind the proposed solutions and/or a thorough understanding of the applications at hand. This restricts their use to a limited number of experts, because users usually lack adequate skills. In addition, although subsystems including the memory are becoming more and more power hungry, current software energy reduction techniques fail to take them into account. This thesis proposes a methodology for reducing the energy consumption of large scale and distributed infrastructures. Broken into three steps, (i) phase identification, (ii) phase characterization, and (iii) phase prediction and system reconfiguration, our methodology abstracts away from any individual application as it focuses on the infrastructure, whose runtime behaviour it analyses in order to take reconfiguration decisions accordingly. The proposed methodology is implemented and evaluated in high performance computing (HPC) clusters of varied sizes through a Multi-Resource Energy Efficient Framework (MREEF). MREEF implements the proposed energy reduction methodology so as to leave users with the choice of implementing their own system reconfiguration decisions depending on their needs. Experimental results show that our methodology reduces the energy consumption of the overall infrastructure by up to 24% with less than 7% performance degradation. By taking into account all subsystems, our experiments demonstrate that the energy reduction problem in large scale and distributed infrastructures can benefit from more than "the traditional" processor frequency scaling. Experiments in clusters of varied sizes demonstrate that MREEF, and therefore our methodology, can easily be extended to a large number of energy-aware clusters. The extension of MREEF to virtualized environments like the cloud shows that the proposed methodology goes beyond HPC systems and can be used in many other computing environments.
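The three-step loop can be summarized schematically (thresholds, sensor names, and knob settings below are invented; MREEF's real characterization and reconfiguration logic is far richer and covers more subsystems):

```python
# Identify the current phase from system-wide sensors, characterize it,
# and reconfigure several subsystems (not just CPU frequency) to match.
def characterize(sample):
    if sample['io_mb_s'] > 200:
        return 'io'
    if sample['mem_bw_gb_s'] > 20:
        return 'memory'
    return 'compute'

ACTIONS = {                          # phase -> reconfiguration decision
    'io':      {'cpu_freq': 'min', 'disk': 'active'},
    'memory':  {'cpu_freq': 'mid', 'disk': 'spin_down'},
    'compute': {'cpu_freq': 'max', 'disk': 'spin_down'},
}

trace = [{'io_mb_s': 300, 'mem_bw_gb_s': 5},
         {'io_mb_s': 10,  'mem_bw_gb_s': 30},
         {'io_mb_s': 5,   'mem_bw_gb_s': 8}]
for sample in trace:
    phase = characterize(sample)
    print(phase, '->', ACTIONS[phase])
```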
APA, Harvard, Vancouver, ISO, and other styles
34

Gnanavignesh, R. "Parallel Computing Techniques for High Speed Power System Solutions." Thesis, 2020. https://etd.iisc.ac.in/handle/2005/4981.

Full text
Abstract:
Modern power systems are enormously large and complex entities. Planning, maintaining and operating such a system would be cumbersome if it were not for the wide assortment of analytical methods available to assist the power engineer. With the advent of interconnected systems came the necessity of developing techniques that enable the power system operator to determine the electrical state of the network and to predict how it would respond to different disturbances, such that reliability and other economic criteria are always met. Increases in system size, the introduction of complex controls, uncertainties in forecasting, etc., necessitate faster software tools to handle power system planning, operation and operator training. This thesis aims to improve the performance of power system software tools by proposing parallel algorithms with the objective of reducing their execution time. The solution of a sparse set of linear algebraic equations is one of the most essential modules used in almost all power system software tools. The thesis addresses the issue of reducing the execution time of the sparse linear algebraic solver by parallelizing sparse matrix factorization. An LU factorization algorithm that is more amenable to parallelization is identified and chosen. In this work, the structural symmetry property of power system sparse matrices is exploited to maximize column- or node-level parallelism. Results obtained from the implementation of the proposed algorithm on Graphics Processing Units (GPUs) corroborate its efficacy by achieving a significant reduction in solution time when compared with state-of-the-art CPU-based sequential sparse linear solvers. The power flow algorithm is one of the most frequently executed algorithms in the steady-state realm of the power system. Its output is the phasor bus voltages and line flows for a given load-generation pattern. Reducing the solution time of the power flow algorithm would further boost other applications like contingency analysis, optimal power flow, dynamic studies, etc. This thesis proposes a parallel power flow algorithm based on the Newton-Raphson method. Inclusion of reactive power limit constraints at generator buses in the problem formulation stage itself eliminates the need for heuristic techniques. In this work, the power system network for which the power flow solution is desired is decomposed into smaller sub-networks that are processed independently and concurrently. Partial results from the sub-networks are consolidated to arrive at the solution of the original network. The proposed algorithm is implemented on a computer architecture comprising multiple cores. The results indicate preservation of the superior convergence property of the Newton-Raphson method and a significant reduction in the solution time of the parallel power flow compared with the sequential version. Transient stability assessment is an important module within the Dynamic Security Assessment application. Its objective is to capture the dynamic, low-frequency electromechanical phenomena and determine whether the power system will be able to maintain synchronism after an electrical disturbance. Time domain simulation for stability assessment by solving thousands of Differential Algebraic Equations (DAEs), though it is the preferred method, is computationally intensive and becomes a major computing challenge as system size increases. The thesis proposes a parallel algorithm based on spatial domain decomposition employing relaxation to speed up transient stability simulation and handle this challenge. A convergence-enhancing mechanism is derived through the selection of appropriate admittance parameters for the fictitious buses that emulate the remainder of the system for each sub-network. A technique of port dependency reduction, which guarantees convergence for any general network, is also presented. Results obtained from implementation on a multicore parallel architecture corroborate the scalability and improved speedup of the methodology, which achieves a significant reduction in simulation execution time and would greatly aid in reliably operating the power system.
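The flavor of the sub-network relaxation can be conveyed with a linear toy problem (a block-Jacobi sketch on a small diagonally dominant system; the thesis works on the nonlinear DAEs of transient stability and adds fictitious-bus admittances and port-dependency reduction, none of which is modeled here):

```python
# Split a network equation Ax = b into two sub-network blocks; each
# sweep solves the blocks independently (parallelizable) while the
# off-diagonal coupling enters through the previous iterate.
import numpy as np

A = np.array([[4., 1., 0., 0.],
              [1., 5., 1., 0.],
              [0., 1., 6., 1.],
              [0., 0., 1., 4.]])
b = np.array([1., 2., 3., 4.])
blocks = [slice(0, 2), slice(2, 4)]      # two "sub-networks"

x = np.zeros(4)
for sweep in range(50):
    x_new = x.copy()
    for blk in blocks:                   # independent block solves
        rhs = b[blk] - A[blk, :] @ x + A[blk, blk] @ x[blk]
        x_new[blk] = np.linalg.solve(A[blk, blk], rhs)
    if np.linalg.norm(x_new - x) < 1e-12:
        break
    x = x_new
print(x, np.allclose(A @ x, b))
```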
APA, Harvard, Vancouver, ISO, and other styles
35

Luo, Fengji. "Advanced analysis methods for power system with high renewable penetrations." Thesis, 2014. http://hdl.handle.net/1959.13/1042442.

Full text
Abstract:
In recent years, the power system has been undergoing rapid development. One of the most significant trends in modern power systems is the increasing penetration of renewable energy sources; the most common are wind and solar power. The integration of renewable energy can lower CO2 emissions and relieve the pressure of global warming. However, the stochastic nature and uncertain power output of many kinds of renewable energy sources challenge the grid in the areas of effective data collection and analysis, system dispatch, and system stability and security. This research focuses on advanced analysis methods to address the challenges of power systems with high penetration of renewable energy, mainly in three parts: the architecture of a new information infrastructure, new system dispatch methods, and new system stability and security analysis methods. In the first part, this work proposes a cloud computing-based information infrastructure for the next-generation power system with high penetration of renewable energy and deployment of the smart grid. The operation model of the infrastructure and the structure of the power cloud data center are discussed in detail. The major benefits of the proposed infrastructure to the power system are analyzed as well. To demonstrate how the proposed information infrastructure can be used to integrate renewable energy into system operation, a cloud computing-based distributed bargaining framework is proposed for micro grids and distribution utilities. Compared with the current centralized computing mode, the proposed cloud-based bargaining framework is more secure, data-centric and scalable. In the second part, this research studies new methods for system dispatch with high penetration of renewable energy, focusing on two problems: the dispatch of a wind farm with a battery energy storage system (BESS), and the unit commitment (UC) model considering probabilistic wind generation. For the wind farm dispatch problem, a novel short-term dispatch scheme is proposed for dispatching the charge/discharge behaviors of the BESS to better mitigate wind power forecast uncertainties than the traditional method. The methodology for determining the necessary power and energy capacities of the BESS under the proposed dispatch scheme is also discussed. For the UC problem, a UC framework is proposed that integrates a stochastic model of wind power generation. A novel algorithm called Fuzzy Adaptive Particle Swarm Optimization (FAPSO) is developed to solve the model. In order to enhance the performance of the FAPSO algorithm, a cloud computing-based parallel computational framework is developed for it, implemented on top of the Amazon Elastic Compute Cloud (Amazon EC2). In the third part, this research studies new methods for fast and robust online system security and stability analysis, a major challenge caused by the highly fluctuating power output of renewable energy sources. Two problems are studied: online dynamic security assessment (DSA), and distributed transient stability analysis based on PSS/E. For the DSA problem, an improved pattern discovery (PD) algorithm and a fuzzy control-based classification method are proposed. In order to overcome the inherent performance bottleneck of the PD algorithm, a cloud-based parallel processing and online deployment framework is developed based on the Google App Engine (GAE). For the transient stability analysis problem, distributed computing technology is used to construct a distributed PSS/E simulation framework for distributed transient stability simulations based on the EnFuzion platform. Four different types of analysis are considered in this research.
APA, Harvard, Vancouver, ISO, and other styles
36

Zamani, Reza. "Run-time Predictive Modeling of Power and Performance via Time-Series in High Performance Computing." Thesis, 2012. http://hdl.handle.net/1974/7639.

Full text
Abstract:
Pressing demands for lower power consumption of processors while delivering higher performance levels have put extra attention on the efficiency of computing systems. Efficient management of resources in current computing systems, given their increasing number of entities and complexity, requires accurate predictive models that can easily adapt to system and application changes. Through performance monitoring counter (PMC) events in modern processors, a vast amount of information can be obtained from the system. This thesis provides a methodology to efficiently choose such events for power modeling purposes. In addition, exploiting the time-dependence of the data measured through PMCs and multi-meters, we build predictive multivariate time-series models that estimate the run-time power consumption of a system. In particular, we find an autoregressive moving average with exogenous inputs (ARMAX) model combined with a recursive least squares (RLS) algorithm to be a good candidate for such purposes. Many of the available estimation or prediction models avoid using metrics that are affected by changes in processor frequency. This thesis proposes a method to mitigate the impact of frequency scaling on power and PMC metrics in a run-time model. The method is based on a practical Gaussian approximation: different segments of the trend of a metric that are associated with different frequencies are scaled and offset into a zero-mean, unit-variance signal, in an attempt to transform the variable-frequency trend into a weakly stationary time-series. Using this approach, we have shown that power estimation of a system using PMCs can be done in a variable frequency environment. We extend the ARMAX-RLS model to predict the near-future power consumption and PMCs of different applications in a variable frequency environment. The proposed method is adaptive and independent of the system and applications. We have shown that run-time per-core or aggregate system PMC event prediction, multiple steps ahead of time, is feasible using an ARMAX-RLS model. This is crucial for progressing from reactive power and performance management methods to more proactive algorithms.
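The run-time adaptivity comes from the recursive least squares update, which can be written compactly (a generic RLS sketch with a made-up regressor of PMC values; the thesis's ARMAX regressor also feeds back lagged power outputs and uses carefully selected PMC events):

```python
# Recursive least squares with a forgetting factor: each new sample
# (regressor x, measured power y) refines the coefficients w online.
import numpy as np

class RLS:
    def __init__(self, n, lam=0.99):
        self.w = np.zeros(n)             # model coefficients
        self.P = np.eye(n) * 1e3         # inverse correlation matrix
        self.lam = lam                   # forgetting factor

    def update(self, x, y):
        Px = self.P @ x
        k = Px / (self.lam + x @ Px)     # gain vector
        self.w += k * (y - self.w @ x)   # correct by prediction error
        self.P = (self.P - np.outer(k, Px)) / self.lam
        return self.w @ x                # updated power estimate

est = RLS(3)
for x, y in [([1, 10, 2], 35.0), ([1, 12, 3], 40.0), ([1, 8, 1], 30.0)]:
    print(round(est.update(np.array(x, float), y), 2))
```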
APA, Harvard, Vancouver, ISO, and other styles
37

Mitra, Gaurav. "Low-power System-on-Chip Processors for Energy Efficient High Performance Computing: The Texas Instruments Keystone II." Phd thesis, 2017. http://hdl.handle.net/1885/132717.

Full text
Abstract:
The High Performance Computing (HPC) community recognizes energy consumption as a major problem. Extensive research is underway to identify means to increase the energy efficiency of HPC systems, including consideration of alternative building blocks for future systems. This thesis considers one such system, the Texas Instruments Keystone II, a heterogeneous Low-Power System-on-Chip (LPSoC) processor that combines a quad-core ARM CPU with an octa-core Digital Signal Processor (DSP). It was first released in 2012. Four issues are considered: i) maximizing the Keystone II ARM CPU performance; ii) implementation and extension of the OpenMP programming model for the Keystone II; iii) simultaneous use of ARM and DSP cores across multiple Keystone SoCs; and iv) an energy model for applications running on LPSoCs like the Keystone II and heterogeneous systems in general. Maximizing the performance of the ARM CPU on the Keystone II system is fundamental to the adoption of this system by the HPC community and of the ARM architecture more broadly. Key to achieving good performance is exploitation of the ARM vector instructions. This thesis presents the first detailed comparison of the use of ARM compiler intrinsic functions with automatic compiler vectorization across four generations of ARM processors. Comparisons are also made with x86-based platforms and the use of equivalent Intel vector instructions. Implementation of the OpenMP programming model on the Keystone II system presents both challenges and opportunities. Challenges in that the OpenMP model was originally developed for a homogeneous programming environment with a common instruction set architecture, and in 2012 work had only just begun to consider how OpenMP might work with accelerators. Opportunities in that shared memory is accessible to all processing elements on the LPSoC, offering performance advantages over what typically exists with attached accelerators. This thesis presents an analysis of a prototype version of OpenMP implemented as a bare-metal runtime on the DSP of a Keystone I system. An implementation for the Keystone II that maps OpenMP 4.0 accelerator directives to OpenCL runtime library operations is presented and evaluated. Exploitation of some of the underlying hardware features of the Keystone II is also discussed. Simultaneous use of the ARM and DSP cores across multiple Keystone II boards is fundamental to the creation of commercially viable HPC offerings based on Keystone technology. The nCore BrownDwarf and HPE Moonshot systems represent two such systems. This thesis presents a proof-of-concept implementation of matrix multiplication (GEMM) for the BrownDwarf system. The BrownDwarf utilizes both Keystone II and Keystone I SoCs through a point-to-point interconnect called Hyperlink. Details of how a novel message passing communication framework across Hyperlink was implemented to support this complex environment are provided. An energy model that can predict energy usage as a function of what fraction of a particular computation is performed on each of the available compute devices offers the opportunity to make runtime decisions on how best to minimize energy usage. This thesis presents a basic energy usage model that considers the rate of execution on each device and its active and idle power usage. Using this model, it is shown that only under certain conditions does there exist an energy-optimal work partition that uses multiple compute devices.
To validate the model a high resolution energy measurement environment is developed and used to gather energy measurements for a matrix multiplication benchmark running on a variety of systems. Results presented support the model. Drawing on the four issues noted above and other developments that have occurred since the Keystone II system was first announced, the thesis concludes by making comments regarding the future of LPSoCs as building blocks for HPC systems.
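My reading of the basic model can be written out directly (all rates and powers below are invented; the thesis's model and measurements are the authoritative version). Run a fraction f of the work on device A and the rest on device B in parallel; each device draws active power while busy and idle power while waiting for the other to finish:

```python
# Energy of a work partition f across two devices, including the idle
# power each burns while waiting; scan f to find the energy minimum.
def energy(f, rate_a, rate_b, act_a, act_b, idle_a, idle_b, work=1.0):
    ta, tb = f * work / rate_a, (1 - f) * work / rate_b
    t = max(ta, tb)                      # both devices must finish
    return (act_a * ta + idle_a * (t - ta) +
            act_b * tb + idle_b * (t - tb))

params = dict(rate_a=10.0, rate_b=4.0,   # e.g. CPU vs DSP throughput
              act_a=5.0, act_b=2.0, idle_a=1.0, idle_b=0.5)
fs = [i / 100 for i in range(101)]
best = min(fs, key=lambda f: energy(f, **params))
print(best, round(energy(best, **params), 3))   # interior optimum here
```

With these particular numbers the minimum lies strictly between 0 and 1, i.e. splitting the work across both devices saves energy; with other numbers the minimum sits at an endpoint, matching the thesis's observation that a multi-device energy optimum exists only under certain conditions.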
APA, Harvard, Vancouver, ISO, and other styles
38

Biagioni, Andrea. "Characterization and optimization of network traffic in cortical simulation." Doctoral thesis, 2018. http://hdl.handle.net/11573/1082376.

Full text
Abstract:
Considering the great variety of obstacles that Exascale systems will have to face in the near future, particular attention is given in this thesis to the interconnect and to power consumption. The data movement challenge involves the whole hierarchical organization of components in HPC systems, i.e. registers, cache, memory, disks. Running scientific applications requires the most effective methods of data transport among the levels of this hierarchy. On current petaflop systems, memory access at all levels is the limiting factor in almost all applications. This drives the requirement for an interconnect achieving adequate rates of data transfer, or throughput, and reducing time delays, or latency, between the levels. Power consumption is identified as the largest hardware research challenge: the annual power cost to operate an Exascale system built with current technology would exceed 2.5 B$ per year. Research into alternative power-efficient computing devices is mandatory for the procurement of future HPC systems. In this thesis, a preliminary approach is offered to the critical process of co-design. Co-design is defined as the simultaneous design of both hardware and software to implement a desired function. This process both integrates all components of the Exascale initiative and illuminates the trade-offs that must be made within this complex undertaking.
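The throughput/latency requirement can be grounded in the classic first-order alpha-beta model of a message transfer (placeholder constants; not figures from the thesis or from any specific interconnect):

```python
# alpha-beta model: transfer time = startup latency + size / bandwidth.
# Small messages are latency-bound, large ones bandwidth-bound.
def transfer_time(msg_bytes, alpha_s=1e-6, beta_s_per_byte=1 / 10e9):
    return alpha_s + msg_bytes * beta_s_per_byte

for size in (64, 4096, 1 << 20):
    t = transfer_time(size)
    print(f'{size:>8} B: {t * 1e6:8.2f} us, '
          f'effective {(size / t) / 1e9:5.2f} GB/s')
```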
APA, Harvard, Vancouver, ISO, and other styles
39

Gupta, Abhishek. "A Study On High Voltage AC Power Transmission Line Electric And Magnetic Field Coupling With Nearby Metallic Pipelines." Thesis, 2006. https://etd.iisc.ac.in/handle/2005/369.

Full text
Abstract:
In recent years, there has been a trend to run metallic pipelines carrying petroleum products and high voltage AC power lines parallel to each other in a relatively narrow strip of land. Electromagnetic interference between high voltage transmission lines and metallic pipelines has been a topic of major concern since the early 1960s. The main reasons for this are: • The ever-increasing cost of rights-of-way suitable for power lines and pipelines, along with recent environmental regulations aiming to protect nature and wildlife, has forced various utilities to share common corridors for both high voltage power lines and pipelines. Therefore, situations where a pipeline is laid at close distance from a transmission line for several kilometers have become very frequent. • The rapid increase in energy consumption, which has led to the adoption of higher load and short circuit current levels, thus making the problem more acute. Due to this sharing of the right-of-way, the overhead AC power line field may induce voltages on metallic pipelines running in close vicinity, leading to serious adverse effects. This electromagnetic interference is present both during normal operating conditions and during faults. The coupling of the field with the pipeline takes place either through the capacitive path or through the inductive or conductive paths. In the present work, the induced voltages due to capacitive and inductive coupling on metallic pipelines running in close vicinity of high voltage power transmission lines have been computed. The conductor surface field gradients calculated for the various phase configurations are presented in the thesis. The electric fields under transmission lines, for both single circuit and double circuit lines (various phase arrangements), have also been analysed. Based on the above results, an optimum configuration giving the lowest field under the power line as well as the lowest conductor surface gradient has been arrived at, and for this configuration the induced voltage on the pipeline has been computed using the Charge Simulation Method (CSM). For comparison, induced voltages on the pipeline have been computed for the various other phase configurations as well. A very interesting result is that the induced voltage on the pipeline becomes almost negligible at a critical lateral distance from the center of the power line, beyond which the induced voltage increases again. This critical distance depends on the conductor configuration. Hence it is suggested that the pipeline be located close to the critical distance so that the induced voltage would be close to zero. For calculating the induced voltage due to inductive coupling, the electromotive force (EMF) induced along the pipeline by the magnetic field of the transmission line has been calculated. The potential difference between the pipeline and the earth, due to the above induced EMFs, is then calculated. As the zones of influence are generally formed by parallelism, approaches, crossings and removals, the computation involves subdividing the zone into several sections corresponding to these configurations. The voltages are calculated at both ends of each section. Each section is represented by an equivalent π electrical network, which is driven by the induced EMF. The induced EMF is calculated during faulted conditions as well as during steady state conditions.
Inductive coupling calculations have been carried out for the following cases: • perfect parallelism between the power line and the pipeline; • zones of influence formed by parallelism, approaches, crossings and removals. It has been observed that when the pipeline approaches the HV transmission line at an angle, then runs parallel for a certain distance and finally deviates away, the induced voltage is maximum at the point of approach or removal of the pipeline from the transmission line corridor. The induced voltage is almost negligible near the midpoint of the zone of influence. The profile of the induced voltage also depends on whether the pipeline is grounded or left open circuited at the extremities of the zone of influence. The effect of earth resistivity and anti-corrosive coatings on the induced voltage has also been studied. For mitigating the induced voltage on the pipeline, numerous low-resistance earthings have been suggested. Results show that a significant reduction in the induced voltage can be achieved as the number of earthing points is increased.
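The capacitive-coupling computation rests on the Charge Simulation Method, whose core is small enough to sketch (a bare-bones 2-D version with one line charge per conductor and image charges for the earth; the geometry and voltage level are invented, and the thesis's CSM model is far more detailed):

```python
# CSM for parallel conductors above a conducting earth: build the
# potential-coefficient matrix P from conductor/image distances and
# solve P q = V for the line charge densities q (C/m).
import numpy as np

EPS0 = 8.854e-12
xy = [(-3.0, 12.0), (0.0, 12.0), (3.0, 12.0)]   # conductor (x, y) in m
r = 0.015                                        # conductor radius, m
V = 230e3 / np.sqrt(3) * np.exp(1j * np.deg2rad([0., -120., 120.]))

n = len(xy)
P = np.empty((n, n))
for i, (xi, yi) in enumerate(xy):
    for j, (xj, yj) in enumerate(xy):
        d_img = np.hypot(xi - xj, yi + yj)       # distance to image charge
        d = r if i == j else np.hypot(xi - xj, yi - yj)
        P[i, j] = np.log(d_img / d) / (2 * np.pi * EPS0)

q = np.linalg.solve(P.astype(complex), V)        # phasor charges, C/m
print(np.abs(q))
```

Once the charges are known, the electric field (and hence the capacitively induced voltage on a pipeline at any lateral distance) follows by superposing the fields of the charges and their images.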
APA, Harvard, Vancouver, ISO, and other styles
40

Gupta, Abhishek. "A Study On High Voltage AC Power Transmission Line Electric And Magnetic Field Coupling With Nearby Metallic Pipelines." Thesis, 2006. http://hdl.handle.net/2005/369.

Full text
Abstract:
In recent years, there has been a trend to run metallic pipelines carrying petroleum products and high voltage AC power lines parallel to each other in a relatively narrow strip of land. Electromagnetic interference between high voltage transmission lines and metallic pipelines has been a topic of major concern since the early 1960s. The main reasons for this are: • The ever-increasing cost of rights-of-way suitable for power lines and pipelines, along with recent environmental regulations aiming to protect nature and wildlife, has forced various utilities to share common corridors for both high voltage power lines and pipelines. Therefore, situations where a pipeline is laid at close distance from a transmission line for several kilometers have become very frequent. • The rapid increase in energy consumption, which has led to the adoption of higher load and short circuit current levels, thus making the problem more acute. Due to this sharing of the right-of-way, the overhead AC power line field may induce voltages on metallic pipelines running in close vicinity, leading to serious adverse effects. This electromagnetic interference is present both during normal operating conditions and during faults. The coupling of the field with the pipeline takes place either through the capacitive path or through the inductive or conductive paths. In the present work, the induced voltages due to capacitive and inductive coupling on metallic pipelines running in close vicinity of high voltage power transmission lines have been computed. The conductor surface field gradients calculated for the various phase configurations are presented in the thesis. The electric fields under transmission lines, for both single circuit and double circuit lines (various phase arrangements), have also been analysed. Based on the above results, an optimum configuration giving the lowest field under the power line as well as the lowest conductor surface gradient has been arrived at, and for this configuration the induced voltage on the pipeline has been computed using the Charge Simulation Method (CSM). For comparison, induced voltages on the pipeline have been computed for the various other phase configurations as well. A very interesting result is that the induced voltage on the pipeline becomes almost negligible at a critical lateral distance from the center of the power line, beyond which the induced voltage increases again. This critical distance depends on the conductor configuration. Hence it is suggested that the pipeline be located close to the critical distance so that the induced voltage would be close to zero. For calculating the induced voltage due to inductive coupling, the electromotive force (EMF) induced along the pipeline by the magnetic field of the transmission line has been calculated. The potential difference between the pipeline and the earth, due to the above induced EMFs, is then calculated. As the zones of influence are generally formed by parallelism, approaches, crossings and removals, the computation involves subdividing the zone into several sections corresponding to these configurations. The voltages are calculated at both ends of each section. Each section is represented by an equivalent π electrical network, which is driven by the induced EMF. The induced EMF is calculated during faulted conditions as well as during steady state conditions.
Inductive coupling calculations have been carried out for the following cases: • perfect parallelism between the power line and the pipeline; • zones of influence formed by parallelism, approaches, crossings and removals. It has been observed that when the pipeline approaches the HV transmission line at an angle, then runs parallel for a certain distance and finally deviates away, the induced voltage is maximum at the point of approach or removal of the pipeline from the transmission line corridor. The induced voltage is almost negligible near the midpoint of the zone of influence. The profile of the induced voltage also depends on whether the pipeline is grounded or left open circuited at the extremities of the zone of influence. The effect of earth resistivity and anti-corrosive coatings on the induced voltage has also been studied. For mitigating the induced voltage on the pipeline, numerous low-resistance earthings have been suggested. Results show that a significant reduction in the induced voltage can be achieved as the number of earthing points is increased.
APA, Harvard, Vancouver, ISO, and other styles
41

Wijesinghe, Parami. "Neuro-inspired computing enhanced by scalable algorithms and physics of emerging nanoscale resistive devices." 2019.

Find full text
Abstract:
Deep 'Analog Artificial Neural Networks' (AANNs) perform complex classification problems with high accuracy. However, they rely on humongous amounts of power to perform the calculations, veiling the accuracy benefits. The biological brain, on the other hand, is significantly more powerful than such networks and consumes orders of magnitude less power, indicating some conceptual mismatch. Given that biological neurons are locally connected, communicate using energy-efficient trains of spikes, and behave non-deterministically, incorporating these effects in Artificial Neural Networks (ANNs) may drive us a few steps towards more realistic neural networks. Emerging devices can offer a plethora of benefits, including power efficiency, faster operation, and low area, in a vast array of applications. For example, memristors and Magnetic Tunnel Junctions (MTJs) are suitable for high-density, non-volatile Random Access Memories when compared with CMOS implementations. In this work, we analyze the possibility of harnessing the characteristics of such emerging devices to achieve neuro-inspired solutions to intricate problems. We propose how the inherent stochasticity of nano-scale resistive devices can be utilized to realize the functionality of spiking neurons and synapses that can be incorporated in deep stochastic Spiking Neural Networks (SNNs) for image classification problems. While ANNs mainly dwell in the aforementioned classification domain, they can be adapted for a variety of other applications. One such neuro-inspired solution is the Cellular Neural Network (CNN) based Boolean satisfiability solver. Boolean satisfiability (k-SAT) is an NP-complete (k≥3) problem that constitutes one of the hardest classes of constraint satisfaction problems. We provide a proof-of-concept hardware-based analog k-SAT solver built using MTJs. The inherent physics of MTJs, enhanced by device-level modifications, is harnessed here to emulate the intricate dynamics of an analog, CNN-based satisfiability (SAT) solver. Furthermore, in the effort of reaching human-level performance in terms of accuracy, increasing the complexity and size of ANNs is crucial. Efficient algorithms for evaluating neural network performance are of significant importance to improve the scalability of networks, in addition to designing hardware accelerators. We propose a scalable approach for evaluating Liquid State Machines: a bio-inspired computing model where the inputs are sparsely connected to a randomly interlinked reservoir (or liquid). It has been shown that biological neurons are more likely to be connected to other neurons in close proximity, and tend to be disconnected as the neurons grow spatially far apart. Inspired by this, we propose a group of locally connected neuron reservoirs, or an ensemble-of-liquids approach, for LSMs. We analyze how the segmentation of a single large liquid into an ensemble of multiple smaller liquids affects the latency and accuracy of an LSM. In our analysis, we quantify the ability of the proposed ensemble approach to provide an improved representation of the input using the Separation Property (SP) and Approximation Property (AP). Our results illustrate that the ensemble approach enhances class discrimination (quantified as the ratio between the SP and AP), leading to improved accuracy in speech and image recognition tasks when compared to a single large liquid. Furthermore, we obtain performance benefits in terms of improved inference time and reduced memory requirements, due to the lower number of connections and the freedom to parallelize the liquid evaluation process.
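The ensemble idea can be prototyped with rate-based reservoirs standing in for spiking liquids (an echo-state-style sketch with arbitrary sizes; a faithful LSM uses spiking neurons, sparse input wiring, and a trained readout, all omitted here):

```python
# Ensemble of liquids: four small independent reservoirs instead of one
# large one; states are concatenated. Independence means fewer recurrent
# connections (4 * 25^2 vs 100^2) and trivially parallel evaluation.
import numpy as np

rng = np.random.default_rng(1)

def make_reservoir(n_in, n_res, rho=0.9):
    w = rng.normal(size=(n_res, n_res))
    w *= rho / np.max(np.abs(np.linalg.eigvals(w)))  # set spectral radius
    return rng.normal(size=(n_res, n_in)), w

def run(w_in, w, u_seq):
    x = np.zeros(w.shape[0])
    for u in u_seq:                                  # tanh state update
        x = np.tanh(w_in @ u + w @ x)
    return x

n_in, T = 8, 50
u_seq = rng.normal(size=(T, n_in))
ensemble = [make_reservoir(n_in, 25) for _ in range(4)]
state = np.concatenate([run(w_in, w, u_seq) for w_in, w in ensemble])
print(state.shape)    # (100,): same readout size as one 100-neuron liquid
```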
APA, Harvard, Vancouver, ISO, and other styles
42

Khubaib. "Performance and energy efficiency via an adaptive MorphCore architecture." Thesis, 2014. http://hdl.handle.net/2152/25092.

Full text
Abstract:
The level of Thread-Level Parallelism (TLP), Instruction-Level Parallelism (ILP), and Memory-Level Parallelism (MLP) varies across programs and across program phases. Hence, every program requires different underlying core microarchitecture resources for high performance and/or energy efficiency. Current core microarchitectures are inefficient because they are fixed at design time and do not adapt to variable TLP, ILP, or MLP. I show that if a core microarchitecture can adapt to the variation in TLP, ILP, and MLP, significantly higher performance and/or energy efficiency can be achieved. I propose MorphCore, a low-overhead adaptive microarchitecture built from a traditional OOO core with small changes. MorphCore adapts to TLP by operating in two modes: (a) as a wide-width large-OOO-window core when TLP is low and ILP is high, and (b) as a high-performance low-energy highly-threaded in-order SMT core when TLP is high. MorphCore adapts to ILP and MLP by varying the superscalar width and the out-of-order (OOO) window size, operating in four modes: (1) as a wide-width large-OOO-window core, (2) as a wide-width medium-OOO-window core, (3) as a medium-width large-OOO-window core, and (4) as a medium-width medium-OOO-window core. My evaluation with single-thread and multi-thread benchmarks shows that when highest single-thread performance is desired, MorphCore achieves performance similar to a traditional out-of-order core. When energy efficiency is desired on single-thread programs, MorphCore reduces energy by up to 15% (on average 8%) over an out-of-order core. When high multi-thread performance is desired, MorphCore increases performance by 21% and reduces energy consumption by 20% over an out-of-order core. Thus, for multi-thread programs, MorphCore's energy efficiency is similar to highly-threaded throughput-optimized small and medium core architectures, and its performance is two-thirds of their potential.
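A guess at the flavor of the mode-selection policy, reduced to a few lines (thresholds and mode names are illustrative, not MorphCore's actual decision logic, which is implemented in hardware):

```python
# Periodically sample TLP (runnable threads) and per-thread ILP and pick
# the core configuration accordingly.
def select_mode(runnable_threads, ipc_per_thread):
    if runnable_threads >= 4:                   # ample TLP: go wide in threads
        return 'highly-threaded in-order SMT'
    if ipc_per_thread > 1.5:                    # high ILP: big OOO engine
        return 'wide-width large-OOO-window'
    return 'medium-width medium-OOO-window'     # save energy otherwise

for sample in [(8, 0.7), (1, 2.1), (1, 0.6)]:
    print(sample, '->', select_mode(*sample))
```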
APA, Harvard, Vancouver, ISO, and other styles
43

Κεραμίδας, Γεώργιος. "Αρχιτεκτονικές επεξεργαστών και μνημών ειδικού σκοπού για την υποστήριξη φερέγγυων (ασφαλών) δικτυακών υπηρεσιών". Thesis, 2008. http://nemertes.lis.upatras.gr/jspui/handle/10889/1037.

Full text
Abstract:
Data security concerns have recently become very important, and it can be expected that security will join performance, power and cost as a key distinguishing factor in computer systems. Trusted platforms have been proposed as a promising approach to enhance the security of modern computer systems and prevent unauthorized access to and modification of the sensitive information stored in the system. Unfortunately, previous approaches only provide a level of security against software-based attacks and leave the system wide open to hardware attacks. This dissertation proposes six design methodologies to shield a uniprocessor or a multiprocessor system against a number of Denial of Service (DoS) attacks at the architectural and operating system level. Specific focus is given to the memory subsystem (i.e. cache memories). Cache memories account for a large portion of the silicon area, they are greedy power consumers, and they seriously determine system performance due to the ever-growing gap between processor speed and main memory access latency. As a result, this thesis proposes methodologies to optimize the functionality and lower the power consumption of the cache memories. The goal in all cases is to increase the performance of the system and the achieved packet throughput, and to enhance the protection against a number of passive and Denial of Service attacks.
APA, Harvard, Vancouver, ISO, and other styles