Dissertations / Theses on the topic 'Cache memory – Design'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 37 dissertations / theses for your research on the topic 'Cache memory – Design.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.
Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.
Gieske, Edmund Joseph. "Critical Words Cache Memory." University of Cincinnati / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1208368190.
Jakšić, Zoran. "Cache memory design in the FinFET era." Doctoral thesis, Universitat Politècnica de Catalunya, 2015. http://hdl.handle.net/10803/316394.
The main problem with technology scaling is variation in design parameters (imperfections) during the manufacturing process. Devices are also increasingly sensitive to environmental changes in temperature and supply voltage, as well as to aging. All of these influences manifest themselves in integrated circuits as increased power consumption, reduced maximum operating frequency, and a growing number of discarded chips. These effects have been partially overcome with the introduction of FinFET technology, which solved the variability problem caused by random dopant fluctuations. However, over the next ten years the channel width is expected to shrink to 10nm, where the variability generated by line-edge roughness will dominate and its effect on threshold-voltage variation will grow. Embedded memories, with the memory cell as their basic building unit, are the most prone to these effects because of their smaller dimensions. Memories must therefore be designed with special care to make further technology scaling possible. This thesis explores 10nm FinFET technology and the problems involved in designing memories with it, and presents original techniques at different levels of design abstraction for mitigating the effects of both process and environmental variations. First, we present an original method for simulating the variability of Tri-Gate FinFETs using a conventional HSPICE simulation environment and BSIMCMG technology models. Then, traditional SRAM cells (6T and 8T) are fully characterized, together with the use of gate-independent FinFETs to increase cell stability.
Pendyala, Ragini. "Cache memory design with embedded LRU replacement policy /." Available to subscribers only, 2006. http://proquest.umi.com/pqdweb?did=1240704191&sid=10&Fmt=2&clientId=1509&RQT=309&VName=PQD.
Rasquinha, Mitchelle. "An energy efficient cache design using spin torque transfer (STT) RAM." Thesis, Georgia Institute of Technology, 2011. http://hdl.handle.net/1853/42715.
Karlsson, Martin. "Cache memory design trade-offs for current and emerging workloads." Licentiate thesis, Uppsala universitet, Avdelningen för datorteknik, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-86156.
Lodde, Mario. "Smart Memory and Network-On-Chip Design for High-Performance Shared-Memory Chip Multiprocessors." Doctoral thesis, Universitat Politècnica de València, 2014. http://hdl.handle.net/10251/35325.
Lodde, M. (2014). Smart Memory and Network-On-Chip Design for High-Performance Shared-Memory Chip Multiprocessors [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/35325
Bond, Paul Joseph. "Design and analysis of reconfigurable and adaptive cache structures." Diss., Georgia Institute of Technology, 1995. http://hdl.handle.net/1853/14983.
Chandran, Pravin Chander. "Design of ALU and Cache memory for an 8 bit microprocessor." Connect to this title online, 2007. http://etd.lib.clemson.edu/documents/1202498822/.
Tan, Yudong. "Cache design and timing analysis for preemptive multi-tasking real-time uniprocessor systems." Diss., available online, Georgia Institute of Technology, 2005. http://etd.gatech.edu/theses/available/etd-04132005-212947/unrestricted/yudong%5Ftan%5F200505%5Fphd.pdf.
Schimmel, David, Committee Member; Meliopoulos, A. P. Sakis, Committee Member; Mooney, Vincent, Committee Chair; Prvulovic, Milos, Committee Member; Yalamanchili, Sudhakar, Committee Member. Includes bibliographical references.
Rabbah, Rodric Michel. "Design Space Exploration and Optimization of Embedded Memory Systems." Diss., Georgia Institute of Technology, 2006. http://hdl.handle.net/1853/11605.
Chae, Youngsu. "Algorithms, protocols and services for scalable multimedia streaming." Diss., Georgia Institute of Technology, 2002. http://hdl.handle.net/1853/8148.
Akgul, Bilge Ebru Saglam. "The System-on-a-Chip Lock Cache." Diss., Georgia Institute of Technology, 2004. http://hdl.handle.net/1853/5253.
Carlson, Ingvar. "Design and Evaluation of High Density 5T SRAM Cache for Advanced Microprocessors." Thesis, Linköping University, Department of Electrical Engineering, 2004. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-2286.
This thesis presents a five-transistor SRAM intended for the advanced microprocessor cache market. The goal is to reduce the area of the cache memory array while maintaining competitive performance. Various existing technologies are briefly discussed along with their strengths and weaknesses. The design metrics for the five-transistor cell are discussed in detail, and performance and stability are evaluated. Finally, a comparison is made between a 128Kb memory built from an existing six-transistor technology and the proposed technology, covering area, performance, and stability. It is shown that the area of the memory array can be reduced by 23% while maintaining comparable performance. The new cell also has 43% lower total leakage current. As a trade-off, some of the stability margin is lost, but the cell remains stable in all process corners. The performance and stability have been validated through post-layout simulations using Cadence Spectre.
Fazli, Yeknami Ali. "Design and Evaluation of A Low-Voltage, Process-Variation-Tolerant SRAM Cache in 90nm CMOS Technology." Thesis, Linköping University, Department of Electrical Engineering, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-12260.
This thesis presents a novel six-transistor SRAM intended for advanced microprocessor cache applications. The objectives are to reduce power consumption by scaling the supply voltage and to design an SRAM that is fully process-variation-tolerant, utilizing separate read and write access ports as well as exploiting asymmetry. A traditional six-transistor SRAM is designed and its strengths and weaknesses are discussed in detail. Afterwards, a new SRAM technology developed in the Division of Electronic Devices, Linköping University is proposed and its capabilities and drawbacks are examined in depth. Subsequently, the impact of mismatch and process variation on both the standard 6T and the proposed asymmetric 6T SRAM cells is investigated. Finally, the cells are compared with regard to voltage scalability, stability, and tolerance to variations in process parameters. It is shown that the new cell functions at 430mV while maintaining an acceptable SNM margin in all process corners, and it is demonstrated that the proposed SRAM is fully process-variation-tolerant. Additionally, a dual-Vt asymmetric 6T cell is introduced with a wide SNM margin comparable to that of the conventional 6T cell, such that it is capable of functioning at 580mV.
Giordano, Omar. "Design and Implementation of an Architecture-aware In-memory Key-Value Store." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-291213.
Key-Value Stores (KVSs) are a type of non-relational database in which data is represented as key-value pairs; they are widely used for cache and session storage. Among them, Memcached is one of the most popular, being used in Internet services such as social networks and streaming platforms. Given the continuous and ever faster growth of networked devices that use these services, the commodity hardware on which the databases run must process packets faster to meet market demand. In recent years, however, the performance improvements delivered by new hardware have grown thinner and thinner. Since purchasing new products is therefore no longer synonymous with significant performance gains, companies must exploit the full potential of the hardware already in their possession, postponing the purchase of newer hardware. One of the most recent ideas for increasing the performance of commodity hardware is slice-aware memory management. This technique exploits the Last-Level Cache (LLC) by ensuring that individual cores fetch data from memory locations that are mapped to their respective cache portions (i.e., LLC slices). This thesis focuses on the implementation of a KVS prototype, based on the Intel Haswell microarchitecture and built on top of the Data Plane Development Kit (DPDK), to which the principles of slice-aware memory management are applied. To test its performance, and given that no DPDK-based traffic generator supporting the Memcached protocol was available, a prototype of such a traffic generator was also developed. Performance was measured with two different machines, one for the traffic generator and one for the KVS. The 'vanilla' KVS prototype was tested first, then the slice-aware one, to establish the actual benefit. Both KVS prototypes were subjected to two types of traffic: (i) uniform traffic, where the keys always differ from one another, and (ii) skewed traffic, where keys repeat and some keys are more likely to repeat than others. The experiments show that in realistic scenarios (i.e., with skewed key distributions) slice-aware memory management can improve a KVS's end-to-end performance by ~2%. Moreover, the technique strongly affects the lookup time the CPU needs to find a key and its corresponding value in the database, reducing the mean time by ~22.5% and improving the 99th percentile by ~62.7%.
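The core idea of slice-aware memory management described in this abstract can be illustrated with a small sketch. The snippet below is a simplified model, not Giordano's DPDK implementation: real Intel CPUs map physical addresses to LLC slices with an undocumented hash, so the `slice_of` function here is a stand-in and the buffer-pool logic is hypothetical.

```python
# Toy model of slice-aware placement: keep each core's working data in
# buffers whose (simulated) LLC slice matches the slice local to that core.
# The real Intel slice hash is undocumented; slice_of() is a stand-in.

N_SLICES = 8  # assumed: one LLC slice per core

def slice_of(addr: int) -> int:
    """Stand-in for the CPU's physical-address-to-LLC-slice hash."""
    return (addr >> 6) % N_SLICES  # pretend: cache-line granularity, modulo

def build_slice_local_pool(buffers: list[int], core: int) -> list[int]:
    """Select only the buffers that map to `core`'s local slice."""
    return [b for b in buffers if slice_of(b) == core]

if __name__ == "__main__":
    # Candidate buffer addresses, 64-byte aligned.
    candidates = [base * 64 for base in range(1024)]
    pool_core0 = build_slice_local_pool(candidates, core=0)
    print(f"core 0 keeps {len(pool_core0)} of {len(candidates)} buffers")
```

With placement restricted this way, a core's lookups stay in its local slice, which is the mechanism behind the mean and tail latency reductions reported above.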
Kofuji, Jussara Marândola. "Método otimizado de arquitetura de coerência de cache baseado em sistemas embarcados multinúcleos." Universidade de São Paulo, 2011. http://www.teses.usp.br/teses/disponiveis/3/3142/tde-03042012-082623/.
This thesis presents an optimized cache-coherent architecture method for embedded multicore systems. Its main contribution is a CMP shared-memory architecture guided by memory access patterns, together with a hybrid cache coherence protocol. The architecture provides a hardware structure called the pattern table, which is validated by a formal representation and a first implementation. Based on the pattern table, a message-transaction model for the hybrid protocol was developed that distinguishes classical from speculative messages. The final contribution is an analytical model of the effective cost of the hybrid protocol's performance.
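A rough sketch of the message-classification idea in this abstract (classical vs. speculative messages driven by a pattern table) might look like the following. The table contents, region size, and classification rule are hypothetical illustrations, since the thesis specifies the pattern table in hardware.

```python
# Hypothetical sketch: a pattern table records which address regions have a
# recognized sharing pattern; requests that hit the table can be sent as
# speculative messages, all others fall back to classical coherence messages.

pattern_table = {
    # region base address -> recorded access pattern (illustrative values)
    0x1000: "producer-consumer",
    0x2000: "migratory",
}

REGION_MASK = ~0xFFF  # assume 4 KiB regions for the sketch

def classify_message(addr: int) -> str:
    """Return 'speculative' if the region has a known pattern, else 'classical'."""
    region = addr & REGION_MASK
    return "speculative" if region in pattern_table else "classical"

if __name__ == "__main__":
    for a in (0x1040, 0x2008, 0x3000):
        print(hex(a), "->", classify_message(a))
```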
Pan, Xiang. "Designing Future Low-Power and Secure Processors with Non-Volatile Memory." The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1492631536670669.
Guo, Lei. "Insights into access patterns of Internet media systems: measurements, analysis, and system design." Columbus, Ohio: Ohio State University, 2008. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1198696679.
Liang, Shuang. "Algorithms Designs and Implementations for Page Allocation in SSD Firmware and SSD Caching in Storage Systems." The Ohio State University, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=osu1268420517.
Kwon, Woo Cheol. "Co-design of on-chip caches and networks for scalable shared-memory many-core CMPs." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/118084.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 169-180).
Chip Multi-Processors (CMPs) have become mainstream in recent years, providing increased parallelism as core counts scale. While a tiled CMP is widely accepted to be a scalable architecture for the many-core era, on-chip cache organization and coherence are far from solved problems. As the on-chip interconnect directly influences the latency and bandwidth of the on-chip cache, a scalable interconnect is an essential part of on-chip cache design. On the other hand, the optimal design of the interconnect is determined by the traffic it must handle. Thus, on-chip cache organization is inherently intertwined with on-chip interconnect design and vice versa. This dissertation aims to motivate the need for re-organization of on-chip caches to leverage the advancement of on-chip network technology and harness the full potential of future many-core CMPs. Conversely, we argue that the on-chip network should also be designed to support specific functionalities required by the on-chip cache. We propose such co-design techniques to offer significant improvement of on-chip cache performance, and thus to provide scalable CMP cache solutions for future many-core CMPs. The dissertation starts with the problem of remote on-chip cache access latency. Prior locality-aware approaches fundamentally attempt to keep data as close as possible to the requesting cores. In this dissertation, we challenge this design approach by introducing a new cache organization that leverages a co-designed on-chip network allowing multi-hop single-cycle traversals. Next, the dissertation moves to cache coherence request ordering. Without built-in ordering capability within the interconnect, cache coherence protocols have to rely on external ordering points. This dissertation proposes a scalable ordered Network-on-Chip which supports ordering of requests for snoopy cache coherence. Lastly, we describe the development of a 36-core research prototype chip to demonstrate that the proposed Network-on-Chip enables shared-memory CMPs to be readily scalable to many-core platforms.
by Woo Cheol Kwon.
Ph. D.
Kong, Jingfei. "ARCHITECTURAL SUPPORT FOR IMPROVING COMPUTER SECURITY." Doctoral diss., University of Central Florida, 2010. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/2610.
Ph.D.
School of Electrical Engineering and Computer Science
Engineering and Computer Science
Computer Science PhD
Agarwal, Vikas. "Scalable primary cache memory architectures." Thesis, 2004. http://hdl.handle.net/2152/1862.
"Real-time cache design." Chinese University of Hong Kong, 1996. http://library.cuhk.edu.hk/record=b5888780.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1996.
Includes bibliographical references (leaves 102-105).
Abstract --- p.i
Acknowledgement --- p.iii
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Overview --- p.1
Chapter 1.2 --- Scheduling In Real-time Systems --- p.4
Chapter 1.3 --- Cache Memories --- p.5
Chapter 1.4 --- Outline Of The Dissertation --- p.8
Chapter 2 --- Related Work --- p.9
Chapter 2.1 --- Introduction --- p.9
Chapter 2.2 --- Predictable Cache Designs --- p.9
Chapter 2.2.1 --- Locking Cache Lines Design --- p.9
Chapter 2.2.2 --- Partially Dynamic And Static Cache Partition Allocation Design --- p.10
Chapter 2.2.3 --- SMART (Strategic Memory Allocation for Real Time) Cache Design --- p.10
Chapter 2.3 --- Prefetching --- p.11
Chapter 2.3.1 --- Introduction --- p.11
Chapter 2.3.2 --- Hardware Support Prefetching --- p.12
Chapter 2.3.3 --- Software Assisted Prefetching --- p.12
Chapter 2.3.4 --- Partial Cache Hit --- p.13
Chapter 2.3.5 --- Cache Pollution Problems --- p.13
Chapter 2.4 --- Cache Line Replacement Policies --- p.13
Chapter 2.5 --- Main Memory Update Policies --- p.14
Chapter 2.6 --- Summaries --- p.15
Chapter 3 --- Problems And Motivations --- p.16
Chapter 3.1 --- Introduction --- p.16
Chapter 3.2 --- Problems --- p.16
Chapter 3.2.1 --- Modern Cache Architecture Is Inappropriate For Real-time Systems --- p.16
Chapter 3.2.2 --- Intertask Interference: The Effects Of Preemption --- p.17
Chapter 3.2.3 --- Intratask Interference: Cache Line Collision --- p.20
Chapter 3.3 --- Motivations --- p.21
Chapter 3.3.1 --- Improvement Of The Cache Performance In Real-time Systems --- p.21
Chapter 3.3.2 --- Hiding of Preemption Effects --- p.22
Chapter 3.4 --- Conclusions --- p.25
Chapter 4 --- Proposed Real-Time Cache Design --- p.26
Chapter 4.1 --- Introduction --- p.26
Chapter 4.2 --- Concepts Definition --- p.26
Chapter 4.2.1 --- Tasks Definition --- p.26
Chapter 4.2.2 --- Cache Performance Values --- p.27
Chapter 4.3 --- Issues Related To Proposed Real-Time Cache Design --- p.28
Chapter 4.3.1 --- A Task Serving Policy --- p.30
Chapter 4.3.2 --- Number Of Private And Shared Cache Partitions --- p.31
Chapter 4.3.3 --- Controlling The Cache Partitions: Cache Partition Table And Process Info Table --- p.32
Chapter 4.3.4 --- Re-organization Of Task Owns Cache Partition(s) --- p.34
Chapter 4.3.5 --- Handling The Bus Bandwidth: Memory Requests Queue ( MRQ ) --- p.35
Chapter 4.3.6 --- How To Address The Cache Models --- p.37
Chapter 4.3.7 --- Data Coherence Problems For Partitioned Cache Model And Non-partitioned Cache Model --- p.39
Chapter 4.4 --- Mechanism For Proposed Real-Time Cache Design --- p.43
Chapter 4.4.1 --- Basic Operation Of Proposed Real-Time Cache Design --- p.43
Chapter 4.4.2 --- Assumptions And Rules --- p.43
Chapter 4.4.3 --- First Round Dynamic Cache Partition Re-allocation --- p.44
Chapter 4.4.4 --- Later Round Dynamic Cache Partition Re-allocation --- p.45
Chapter 5 --- Simulation Environments --- p.56
Chapter 5.1 --- Proposed Architectural Model --- p.56
Chapter 5.2 --- Working Environment For Proposed Real-time Cache Models --- p.57
Chapter 5.2.1 --- Cost Model --- p.57
Chapter 5.2.2 --- System Model --- p.64
Chapter 5.2.3 --- Fair Comparison Between The Unified Cache And The Separate Caches --- p.64
Chapter 5.2.4 --- Operations Within The Preemption --- p.65
Chapter 5.3 --- Benchmark Programs --- p.65
Chapter 5.3.1 --- The NASA7 Benchmark --- p.66
Chapter 5.3.2 --- The SU2COR Benchmark --- p.66
Chapter 5.3.3 --- The TOMCATV Benchmark --- p.66
Chapter 5.3.4 --- The WAVE5 Benchmark --- p.67
Chapter 5.3.5 --- The COMPRESS Benchmark --- p.67
Chapter 5.3.6 --- The ESPRESSO Benchmark --- p.68
Chapter 5.4 --- Simulations Parameters --- p.68
Chapter 6 --- Analysis Of Simulations --- p.71
Chapter 6.1 --- Introduction --- p.71
Chapter 6.2 --- Trace Files Statistics --- p.71
Chapter 6.3 --- Interpretation Of Partial Cache Hit --- p.72
Chapter 6.4 --- The Effects Of Cache Size --- p.72
Chapter 6.4.1 --- Performances Of Model 1, Model 2, Model 3 And Model 4 --- p.72
Chapter 6.5 --- The Effects Of Cache Partition Size --- p.76
Chapter 6.5.1 --- Performance Of Model 3 --- p.79
Chapter 6.5.2 --- Performance Of Model 1 --- p.79
Chapter 6.6 --- The Effects Of Line Size --- p.80
Chapter 6.6.1 --- Performance Of Model 1, Model 2, Model 3 And Model 4 --- p.80
Chapter 6.7 --- The Effects Of Set Associativity --- p.83
Chapter 6.7.1 --- Performance Of Model 1, Model 2, Model 3 And Model 4 --- p.83
Chapter 6.8 --- The Effects Of The Best-expected Cache Performance --- p.84
Chapter 6.8.1 --- Performance of Model 1 --- p.87
Chapter 6.8.2 --- Performance of Model 3 --- p.88
Chapter 6.9 --- The Effects Of The Standard-expected Cache Performance --- p.89
Chapter 6.9.1 --- Performance Of Model 1 --- p.89
Chapter 6.9.2 --- Performance Of Model 3 --- p.91
Chapter 6.10 --- The Effects Of Cycle Execution Time/Cycle Deadline Period --- p.92
Chapter 6.10.1 --- Performances Of Model 1, Model 2, Model 3 And Model 4 --- p.92
Chapter 7 --- Conclusions And Future Work --- p.95
Chapter 7.1 --- Conclusions --- p.95
Chapter 7.1.1 --- Unified Cache Model Is More Suitable In Real-time Systems --- p.99
Chapter 7.1.2 --- Comments On Aperiodic Tasks --- p.100
Chapter 7.2 --- Future Work --- p.100
Lin, Ya-Ching, and 林雅清. "Design of Efficient Cache Memory Systems for Multimedia Applications." Thesis, 1999. http://ndltd.ncl.edu.tw/handle/52464734104099086424.
National Chung Cheng University
Graduate Institute of Computer Science and Information Engineering
87
As computers have become more and more popular, multimedia applications have clearly become increasingly common and important. Thanks to rapidly increasing CPU speeds, we can now enjoy smooth, real-time audio/video entertainment on a computer. However, when the volume of multimedia data is large, an application may still be unable to run smoothly because of memory latency: if memory access time cannot be improved effectively, memory latency remains the bottleneck of execution time no matter how fast the CPU is. Based on the program and data characteristics of multimedia applications, this thesis describes how a Reference Contribution Table (RCT) can keep data with little reuse from being cached. To hide memory latency in multimedia applications, a Packed Cache, which stores prefetched and packed data, is also discussed. Multiple expected data items can be prefetched into the Packed Cache; this not only improves the miss rate of the normal cache but also hides a portion of the memory latency.
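The RCT idea in this abstract, letting observed reuse decide whether a block is worth caching, can be sketched as follows. The counter management and the threshold are assumptions made for illustration, not the design in the thesis.

```python
# Sketch of a Reference Contribution Table (RCT) gating cache fills:
# blocks that historically contribute few re-references bypass the cache.
# The threshold and table management here are illustrative assumptions.

from collections import defaultdict

REUSE_THRESHOLD = 2  # assumed: cache a block only once it has shown reuse

class ReferenceContributionTable:
    def __init__(self):
        self.reuse_count = defaultdict(int)  # block address -> references seen

    def record_reference(self, block: int) -> None:
        self.reuse_count[block] += 1

    def should_cache(self, block: int) -> bool:
        """Bypass the cache for blocks with little observed reuse."""
        return self.reuse_count[block] >= REUSE_THRESHOLD

if __name__ == "__main__":
    rct = ReferenceContributionTable()
    trace = [0x100, 0x200, 0x100, 0x100, 0x200]  # toy reference stream
    for blk in trace:
        rct.record_reference(blk)
        print(hex(blk), "cache" if rct.should_cache(blk) else "bypass")
```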
"Design of disk cache for high performance computing." Chinese University of Hong Kong, 1995. http://library.cuhk.edu.hk/record=b5888561.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1995.
Includes bibliographical references (leaves 123-127).
Abstract --- p.i
Acknowledgement --- p.ii
List of Tables --- p.vii
List of Figures --- p.viii
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- I/O System --- p.2
Chapter 1.2 --- Disk Cache --- p.4
Chapter 1.3 --- Dissertation Outline --- p.5
Chapter 2 --- Related Work --- p.7
Chapter 2.1 --- Prefetching --- p.7
Chapter 2.2 --- Cache Partitioning --- p.9
Chapter 2.2.1 --- Hardware Assisted Mechanism --- p.9
Chapter 2.2.2 --- Software Assisted Mechanism --- p.10
Chapter 2.3 --- Replacement Policy --- p.12
Chapter 2.4 --- Caching Write Operation --- p.13
Chapter 2.5 --- Others --- p.14
Chapter 2.6 --- Summary --- p.15
Chapter 3 --- Methodology and Models --- p.17
Chapter 3.1 --- Performance Measurement --- p.17
Chapter 3.1.1 --- Partial Hit --- p.17
Chapter 3.1.2 --- Time Model --- p.17
Chapter 3.2 --- Terminology --- p.19
Chapter 3.2.1 --- Transfer Block --- p.19
Chapter 3.2.2 --- Multiple-sector Request --- p.19
Chapter 3.2.3 --- Dynamic Block, Heading Sectors and Content Sectors --- p.20
Chapter 3.2.4 --- Heading Reuse and Non-heading Reuse --- p.22
Chapter 3.3 --- New Models --- p.23
Chapter 3.3.1 --- Unified Cache with Always Prefetch --- p.24
Chapter 3.3.2 --- Partitioned Cache: Branch Target Cache and Prefetch Buffer --- p.25
Chapter 3.3.3 --- BTC + PB with Alternative Storing Sector Technique --- p.29
Chapter 3.3.4 --- BTC + PB with ASST Applying to Dynamic Block --- p.34
Chapter 3.3.5 --- BTC + PB with Storing Enough Head Technique --- p.35
Chapter 3.4 --- Impact of Block Size --- p.38
Chapter 4 --- Trace Driven Simulation --- p.41
Chapter 4.1 --- Simulation Environment --- p.41
Chapter 4.2 --- Two Kinds Of Disk --- p.43
Chapter 4.3 --- Control Models --- p.43
Chapter 4.3.1 --- Model 1: No Cache --- p.43
Chapter 4.3.2 --- Model 2: Unified Cache without Prefetch --- p.44
Chapter 4.3.3 --- Model 3: Unified Cache with Prefetch on Miss --- p.44
Chapter 4.4 --- Two Comparison Standards --- p.45
Chapter 4.5 --- Trace Properties --- p.46
Chapter 5 --- Performance Evaluation of Common Disk --- p.54
Chapter 5.1 --- The Effect Of Cache Size --- p.54
Chapter 5.1.1 --- Trends of Absolute Reduction in Time --- p.55
Chapter 5.1.2 --- Trends of Relative Reduction in Time --- p.55
Chapter 5.2 --- The Effect Of Block Size --- p.68
Chapter 5.2.1 --- Trends of Absolute Reduction in Time --- p.68
Chapter 5.2.2 --- Trends of Relative Reduction in Time --- p.73
Chapter 5.3 --- The Effect Of Set Associativity --- p.77
Chapter 5.3.1 --- Trends of Absolute Reduction in Time --- p.77
Chapter 5.4 --- The Effect Of Start-up Time C1 --- p.79
Chapter 5.4.1 --- Trends of Absolute Reduction in Time --- p.80
Chapter 5.4.2 --- Trends of Relative Reduction in Time --- p.80
Chapter 5.5 --- The Effect Of Transfer Time C2 --- p.83
Chapter 5.5.1 --- Trends of Absolute Reduction in Time --- p.83
Chapter 5.5.2 --- Trends of Relative Reduction in Time --- p.83
Chapter 5.5.3 --- Impact of C2=0.5 on Cache Size --- p.86
Chapter 5.5.4 --- Impact of C2=0.5 on Block Size --- p.87
Chapter 5.6 --- The Effect Of Prefetch Buffer Size --- p.90
Chapter 5.7 --- Others --- p.93
Chapter 5.7.1 --- In The Case of Very Small Cache with Large Block Size --- p.93
Chapter 5.7.2 --- Comparing Performance of Model 6 and Model 7 --- p.94
Chapter 5.8 --- Conclusion --- p.95
Chapter 5.8.1 --- The Number of Actual Sectors Transferred between Disk and Cache --- p.95
Chapter 5.8.2 --- The Efficiency of Our Models on Common Disk --- p.96
Chapter 6 --- Performance Evaluation of High Performance Disk --- p.98
Chapter 6.1 --- Difference Between Common Disk And High Performance Disk --- p.98
Chapter 6.2 --- The Effect Of Cache Size --- p.99
Chapter 6.2.1 --- Trends of Absolute Reduction in Time --- p.99
Chapter 6.2.2 --- Trends of Relative Reduction in Time --- p.99
Chapter 6.3 --- The Effect Of Block Size --- p.103
Chapter 6.3.1 --- Trends of Absolute Reduction in Time --- p.105
Chapter 6.3.2 --- Trends of Relative Reduction in Time --- p.105
Chapter 6.4 --- The Effect Of Start-up Time C1 --- p.110
Chapter 6.4.1 --- Trends of Relative Reduction in Time --- p.110
Chapter 6.5 --- The Effect Of Transfer Time C2 --- p.110
Chapter 6.5.1 --- Trends of Relative Reduction in Time --- p.112
Chapter 6.5.2 --- Impact of C2=0.5 on Cache Size --- p.112
Chapter 6.5.3 --- Impact of C2=0.5 on Block Size --- p.116
Chapter 6.6 --- Conclusion --- p.117
Chapter 7 --- Conclusions and Future Work --- p.119
Chapter 7.1 --- Conclusions --- p.119
Chapter 7.2 --- Future Work --- p.122
Bibliography --- p.123
Hung, Chi-Chao, and 洪啟超. "Efficient Cache Bypassing and Adaptive Threads Controlling of Cache Memory for High Performance GPU Design." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/gy2437.
Ting, Kuo-Chang, and 丁國章. "A Range-Based Cache Design For Shared-Memory Multiprocessors System." Thesis, 1995. http://ndltd.ncl.edu.tw/handle/40544672647805802592.
National Chung Cheng University
Graduate Institute of Computer Science and Information Engineering
83
An efficient memory system is crucial to the performance of a shared-memory multiprocessor. A private cache for each processor is generally used in shared-memory multiprocessors to achieve good performance; however, the private caches introduce cache coherence problems. In multilevel hierarchical caches, the multi-level inclusion (MLI) property was proposed to solve the memory-block invalidation problem, but the higher the tree level, the more serious the associativity constraint becomes. We propose a range-based cache to overcome the associativity-overflow problem of the MLI cache. A start variation tag and an end variation tag indicate the range of block addresses owned by the processor, and the vector field of each cache-directory entry records which processors own data blocks in this range. To solve the null-entry problem, we add a counter to each entry of every set to indicate the number of active blocks maintained in that entry; when the counter is zero, the entry is known to be empty and all of its fields can be cleared. We use Cachemire, a program-driven simulation tool, to evaluate the proposed scheme. The simulation results show that the range-based cache outperforms the MLI cache: with only a small cache (e.g., 8K), the range-based cache can do better than a large (e.g., 64K) MLI cache. We conclude that the range-based cache is more cost-effective than the MLI cache.
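The directory entry described in this abstract (start/end variation tags, an owner vector, and an active-block counter) can be modeled roughly as below; the field widths and the lookup rule are assumptions made for illustration.

```python
# Rough model of a range-based cache directory entry: a start tag and an end
# tag bound the block addresses covered, a bit vector names owning processors,
# and a counter of active blocks lets an all-zero entry be reclaimed.

class RangeDirectoryEntry:
    def __init__(self, start_tag: int, end_tag: int, n_procs: int):
        self.start_tag = start_tag             # lowest block address in range
        self.end_tag = end_tag                 # highest block address in range
        self.owner_vector = [False] * n_procs  # which processors hold blocks
        self.active_blocks = 0                 # when 0, the entry is empty

    def covers(self, block_addr: int) -> bool:
        return self.start_tag <= block_addr <= self.end_tag

    def add_block(self, proc: int) -> None:
        self.owner_vector[proc] = True
        self.active_blocks += 1

    def remove_block(self) -> None:
        self.active_blocks -= 1
        if self.active_blocks == 0:
            # Null-entry handling from the abstract: clear every field.
            self.owner_vector = [False] * len(self.owner_vector)

if __name__ == "__main__":
    entry = RangeDirectoryEntry(start_tag=0x100, end_tag=0x1FF, n_procs=4)
    entry.add_block(proc=2)
    print(entry.covers(0x180), entry.active_blocks)  # True 1
```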
Mohammad, Baker Shehadah. "Cache design for low power and yield enhancement." 2008. http://hdl.handle.net/2152/17884.
Ma, Jing-Hua, and 馬靖華. "A design of memory management unit and cache controller for the MARS system." Thesis, 1992. http://ndltd.ncl.edu.tw/handle/97824129942546114382.
Huh, Jaehyuk. "Hardware techniques to reduce communication costs in multiprocessors." Thesis, 2006. http://hdl.handle.net/2152/2533.
"An Analysis of the Memory Bottleneck and Cache Performance of Most Apparent Distortion Image Quality Assessment Algorithm on GPU." Master's thesis, 2016. http://hdl.handle.net/2286/R.I.41229.
Dissertation/Thesis
Master's Thesis, Computer Science, 2016
Dai, Zefu. "Application-driven Memory System Design on FPGAs." Thesis, 2013. http://hdl.handle.net/1807/43538.
Shastri, Vijnan. "Caching Strategies And Design Issues In CD-ROM Based Multimedia Storage." Thesis, 1997. http://etd.iisc.ernet.in/handle/2005/1805.
Dwarakanath, Nagendra Gulur. "Multi-Core Memory System Design: Developing and using Analytical Models for Performance Evaluation and Enhancements." Thesis, 2015. http://etd.iisc.ernet.in/2005/3935.
(9179468), Timothy A. Pritchett. "Workload Driven Designs for Cost-Effective Non-Volatile Memory Hierarchies." Thesis, 2020.
My first contribution is the design and implementation of WriteGuard, a self-tuning sieving write-buffer algorithm that filters writes as effectively as highly-effective (but computationally-expensive) algorithms while requiring only lightweight computation comparable to a simple LRU-based write-buffer. While WriteGuard reduces the capacity needed for DRAM buffering (to approx. 64 MB), it does not eliminate the need for DRAM buffers (and the corresponding power backup).
For my second thrust, I identify two specific application characteristics: (1) the vast majority of the write-buffer's contents consists of write-dominant blocks, and (2) the vast majority of blocks in the write-buffer are overwritten within a period of 28 hours. I show that these characteristics enable a high-density, optimized STT-MRAM as a replacement for DRAM, which in turn enables durable write-buffers (thus eliminating the cost of power backup for the write-buffer). My optimized STT-MRAM-based write buffer achieves higher density by (a) trading off superfluous durability by exploiting characteristic (2), and (b) de-optimizing the read performance of STT-MRAM by leveraging characteristic (1). Together, the techniques increase the density of STT-MRAM by 20% with little or no impact on write-buffer performance.
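As a rough illustration of what a sieving write buffer does, absorbing repeated writes to write-dominant blocks so that only a filtered stream reaches the backing store, here is a minimal LRU-based sketch. It is not the WriteGuard algorithm itself (which is self-tuning); the capacity and flush behavior are assumptions.

```python
# Minimal write-buffer sketch: repeated writes to the same block are coalesced
# in the buffer, and only evicted blocks reach the backing store. WriteGuard's
# self-tuning sieving is not reproduced here; this is plain LRU coalescing.

from collections import OrderedDict

class WriteBuffer:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.buf = OrderedDict()   # block address -> latest data
        self.flushed = 0           # writes that actually reached the store

    def write(self, block: int, data: bytes) -> None:
        if block in self.buf:
            self.buf.move_to_end(block)  # overwrite absorbed, refresh LRU rank
        self.buf[block] = data
        if len(self.buf) > self.capacity:
            self.buf.popitem(last=False)  # evict LRU block to backing store
            self.flushed += 1

if __name__ == "__main__":
    wb = WriteBuffer(capacity=2)
    for blk in (1, 2, 1, 1, 3, 2):  # block 1 is write-dominant in this trace
        wb.write(blk, b"x")
    print(f"{wb.flushed} of 6 writes reached the backing store")
```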
Chang, Da-Wei, and 張大緯. "A Design and Implementation of Memory Caches in World Wide Web Servers." Thesis, 1997. http://ndltd.ncl.edu.tw/handle/80244474560521073343.
National Chiao Tung University
Department of Computer and Information Science
86
With the popularity of the World Wide Web (WWW), web traffic has become the fastest-growing component of all Internet traffic. However, such large traffic volumes lengthen the document-retrieval latency perceived by web users. Many researchers have noticed this problem and made efforts to improve WWW latency, which can be reduced in two ways: reducing network delay and improving web-server throughput. Our research aims at improving a web server's throughput by keeping a memory cache in the server's address space. In this thesis, we focus on the design and implementation of such a memory cache and propose a novel cache-management policy. The experimental results show three things. First, our memory cache is beneficial: by keeping a cache whose size is only 1.8% of the total document size, throughput improves by 16.9%. Second, our cache-management policy is suitable for current web traffic. Third, with the increasing popularity of multimedia files, it is very likely that a single file will be larger than the total size of the memory cache; under this condition, our policy outperforms others currently used in the WWW.
Ting-Jyun Lin and 林霆鈞. "Distributed In-Memory Caches for NoSQL Persistent Stores: Design Considerations and Performance Impacts." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/hx6fsd.
National Cheng Kung University
Department of Computer Science and Information Engineering (Master's and Doctoral Programs)
101
NoSQL key-value persistent data stores are emerging; they are designed to accommodate potentially large volumes of data arriving rapidly from a variety of sources. Examples of such key-value stores include Google Bigtable/Spanner, Hadoop HBase, Cassandra, MongoDB, Couchbase, etc. Because typical key-value stores manipulate data objects stored on disk, accesses are inefficient. We present in this paper a caching model that stores data objects tentatively in memories distributed across a number of cache servers. We reveal a number of design issues involved in developing such a caching model, relevant to consistency, scalability, and availability. Specifically, our study notes that developing a cache cluster requires an in-depth understanding of the backend persistent key-value store. We then present a caching model that assumes Hadoop HBase as the backend, and we illustrate how range search, one of the major operations offered by most NoSQL key-value stores, is performed in our caching model together with the HBase backend. Computer simulations demonstrate our performance results.
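A bare-bones sketch of the read path in such a caching model, checking a distributed in-memory tier first and falling back to the persistent key-value store on a miss, is shown below. The backend interface is a stand-in for an HBase client, the server-selection hash is an assumption, and the paper's consistency machinery is omitted.

```python
# Bare-bones cache-aside read path for a NoSQL store: a key hashes to one of
# several cache servers; on a miss, the value is fetched from the persistent
# backend (an HBase stand-in here) and installed in the cache.

class CacheCluster:
    def __init__(self, n_servers: int, backend: dict):
        self.servers = [dict() for _ in range(n_servers)]  # in-memory tiers
        self.backend = backend  # stand-in for an HBase table

    def _server_for(self, key: str) -> dict:
        return self.servers[hash(key) % len(self.servers)]

    def get(self, key: str):
        server = self._server_for(key)
        if key in server:              # cache hit
            return server[key]
        value = self.backend.get(key)  # miss: read the persistent store
        if value is not None:
            server[key] = value        # install for later hits
        return value

if __name__ == "__main__":
    hbase_like = {"row1": "v1", "row2": "v2"}
    cluster = CacheCluster(n_servers=3, backend=hbase_like)
    print(cluster.get("row1"), cluster.get("row1"))  # miss then hit
```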