Dissertations / Theses on the topic 'CELL Broadband Engine'
Consult the top 19 dissertations / theses for your research on the topic 'CELL Broadband Engine.'
Ålind, Markus. "A Skeleton library for Cell Broadband Engine." Thesis, Linköping University, Department of Computer and Information Science, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-54476.
The Cell Broadband Engine is a powerful processor capable of over 220 GFLOPS. It is highly specialized and can be controlled in detail by the programmer, but it is significantly more complicated to program than a standard homogeneous multi-core processor such as the Intel Core 2 Duo or Quad. This thesis explores the possibility of abstracting away some of the complexity of Cell programming while maintaining high performance. The abstraction is achieved through a library of parallel skeletons implemented in the bulk synchronous parallel programming environment NestStep. The library includes constructs for user-defined, SIMD-optimized data-parallel skeletons such as map, reduce and more. The evaluation of the library includes porting a vector-based scientific computation program from sequential C code to the Cell using the library and the NestStep environment. The ported program performs well compared with the original sequential code run on a high-end x86 processor. The evaluation also shows that, when more than two slave processors are used, a dot product implemented with the skeleton library is faster than the dot product in the IBM BLAS library for the Cell processor.
Lundberg, Marcus. "A Parallel Monte Carlo Implementation on the Cell Broadband Engine." Thesis, Uppsala University, Department of Information Technology, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-108035.
The Cell Broadband Engine is a heterogeneous multi-core processor architecture that trades ease-of-programming for high performance. While primarily featured in the Sony PlayStation 3 (PS3) for high-end games, it is a promising technology for scientists working with computationally heavy numerical methods. This paper presents three implementations of a Monte Carlo simulation of a system of charged particles on the PS3. The first method, while easy to implement and use, did not yield any performance advantage over conventional x86 processors. The second method ran more than twice as fast on the PS3 as a comparable code on a 1.86 GHz Intel Xeon machine but could run only a limited problem size. The third program ran over six times faster than the x86 reference system and could handle any problem up to the saturation of the PS3 main memory. The final program is also suitable for a cluster of PlayStations and is easily adaptable to work on a distributed computing framework.
Rajamohan, Srijith Datta Suman Narayanan Vijaykrishnan. "A neural network based classifier on the cell broadband engine." [University Park, Pa.] : Pennsylvania State University, 2009. http://etda.libraries.psu.edu/theses/approved/WorldWideIndex/ETD-4512/index.html.
Lopes, André Filipe da Rocha. "tlCell: a software transactional memory for the cell broadband engine architecture." Master's thesis, Faculdade de Ciências e Tecnologia, 2010. http://hdl.handle.net/10362/4110.
Computers have evolved exponentially over the last decade. Performance has been the main goal, driving up processor clock frequencies, an approach that is no longer feasible because of the excessive power consumption of current processors. The Cell Broadband Engine architecture was created with the goal of providing high computational capacity at low power consumption. The result is an architecture with heterogeneous multiprocessors and a unique memory distribution, aimed at high performance and at reducing hardware complexity to lower production cost. Concurrency and parallelism techniques are expected to increase the performance of this architecture; however, the high-performance solutions presented so far are always very specific, and because of its innovative architecture and memory distribution it is still difficult to provide tools that exploit concurrency and parallelism as an abstraction layer. Software Transactional Memory is a programming model that offers this level of abstraction and has been gaining popularity, with several implementations already achieving performance close to that of specific fine-grained solutions. The possibility of using Software Transactional Memory on this innovative architecture, by developing a tool that abstracts the programmer from memory consistency and management, is appealing. This document specifies a deferred-update Software Transactional Memory platform for the Cell Broadband Engine architecture that takes advantage of the computational capacity of the Synergistic Processing Elements (SPEs) and uses commit-time locking. Two different models are proposed, fully local and multi-buffered, so that the implications of the choices made in the platform design can be studied.
Azuelos, Nathaniel. "An integrated functional solution for multi-core programming on the cell broadband engine." Thesis, McGill University, 2009. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=32276.
Recent efforts in microprocessor development tend toward the coexistence of several central processing units (CPUs) on a single chip. The Cell Broadband Engine (CBE), the result of a collaboration between Sony, Toshiba, and IBM, integrates IBM's legacy PowerPC CPU with a new set of simple units that communicate with one another over a high-speed bus. The many units in the CBE allow users to exploit the parallel nature of their programs. However, it is often difficult to extract parallelism from an application and to distribute tasks appropriately. We therefore propose approaching CBE programming from a dataflow perspective, in which the compiler is responsible for partitioning tasks and for the task-distribution infrastructure. In this work, we present the NCC programming language and the Squid compiler and runtime environment. NCC is a strict functional dataflow language that forces the dependencies between variables to be explicit in order to expose an application's parallelism. NCC code is therefore written by the user without specifying parallelism explicitly. The Squid compiler builds a virtual dataflow graph from the NCC code. This graph is partitioned into tasks and supertasks according to implementation-specific criteria. Each task is then translated into ANSI C, and the supertasks are analyzed and transformed into scheduling structures. All tasks are executed by the simple units of the CBE. The Squid Runtime Environment interacts with the scheduler to schedule [...]
Aji, Ashwin Mandayam. "Exploiting Multigrain Parallelism in Pairwise Sequence Search on Emergent CMP Architectures." Thesis, Virginia Tech, 2008. http://hdl.handle.net/10919/33606.
Master of Science
Cox, Guilherme Mota Cavalcanti de Albuquerque. "Implementação de Visualização de Dados Tridimensionais de Malhas Irregulares no Processador Cell Broadband Engine." Universidade do Estado do Rio de Janeiro, 2009. http://www.bdtd.uerj.br/tde_busca/arquivo.php?codArquivo=8269.
Direct volume rendering has become a popular technique for volumetric visualization of data obtained from sources such as scientific simulations, analytic functions, and medical scanners, among others. Volume rendering algorithms such as raycasting produce high-quality images. Their use, however, is limited by high computational processing demands and high memory usage. In this work, we propose a new implementation of the raycasting algorithm that takes advantage of the highly parallel architecture of the Cell Broadband Engine processor, with its 9 heterogeneous cores, which allow efficient rendering of irregular data meshes. The computational power of the Cell BE processor demands a different programming model. Applications need to be rewritten to exploit the full potential of the Cell processor, which requires the use of multithreading and vectorized code. In our approach, we tackle this problem by distributing the computation of each ray incident on the visible faces of the volume among the processor cores, and by vectorizing the operations of the illumination integral on each one. The experimental results show that we can obtain good speedups, significantly reducing the total rendering time.
Li, Yi-Hsien. "Real-Time Space-Time Adaptive Processing on the STI CELL Multiprocessor." Thesis, Linköping University, Department of Electrical Engineering, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-8933.
Space-Time Adaptive Processing (STAP) is widely used in modern radar systems, such as Ground Moving Target Indication (GMTI) systems, to suppress jamming and interference. However, this high performance comes at the price of higher computational complexity, which requires extensive, powerful hardware.
The new STI Cell Broadband Engine (CBE) processor, which combines a PowerPC core with eight streamlined high-performance SIMD processing engines, offers an opportunity to implement STAP baseband signal processing without any full-custom hardware. This paper presents the implementation of a STAP baseband signal processing flow on the state-of-the-art STI CELL multiprocessor, which enables the concept of Software-Defined Radar (SDR). The potential of the Cell BE processor is studied by mapping STAP kernel subroutines such as QR decomposition, the Fast Fourier Transform (FFT), and FIR filtering to the SPE co-processors of the Cell BE using a variety of architecture-specific optimization techniques.
This report starts with an overview of airborne radar techniques and then introduces standard STAP, specifically third-order Doppler-factored STAP. Next comes a thorough description of the Cell BE architecture, its programming tool chain, and parallel programming methods for the Cell BE. A later chapter discusses how STAP is implemented on the Cell BE processor and presents the simulation results. Finally, based on the results of earlier benchmarking, an optimized task partitioning and scheduling method is proposed to improve overall performance.
Schmuland, Todd E. "Exploiting Parallel Processing Techniques for Implementation of Wideband MUSIC Algorithm on the IBM Cell Broadband Engine Processor." University of Toledo / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1271273869.
Jakobsson, Teodor. "Parallelization of Animation Blending on the PlayStation®3." Thesis, Linköpings universitet, Informationskodning, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-79409.
Zhang, Zikai. "Hardware acceleration on IBM cell broadband engine for simulation of coupled interconnects using waveform relaxation and transverse partitioning." Thesis, McGill University, 2009. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=32420.
In recent years, the trend in microprocessor design has shifted from increasing clock frequency to multi-core designs that integrate multiple processing cores on the same chip. This means that we can no longer rely on rising clock frequencies to improve the performance of electronic design automation (EDA) tools. Instead, to take advantage of advances in microprocessor design, these tools must be adapted to use parallel computing architectures. In this thesis, we parallelized and implemented on the IBM Cell Broadband Engine (Cell BE) an algorithm based on waveform relaxation and transverse partitioning techniques for efficiently simulating high-speed coupled interconnect circuits. Several strategies are used in the Cell BE programs to achieve high performance. The Cell BE processor achieves its best performance, a 10x speedup, when the number of transmission lines is a multiple of the maximum number of synergistic processing elements (SPEs) running concurrently.
Paiva, Pedro Emanuel Pinto de. "Utilização do processador Cell para o processamento de dados obtidos por tomografia aplicada a materiais compósitos." Master's thesis, Faculdade de Ciências e Tecnologia, 2011. http://hdl.handle.net/10362/6095.
Composite materials, in which particles (reinforcements) are dispersed in a base (matrix), are widely used in areas such as aeronautics. When materials engineers test new ways of manufacturing these materials, they use data obtained from X-ray tomographs to characterize the population of reinforcements. The data generated by tomographs demand considerable processing capacity, both because of their volume (on the order of 1 GB) and because of the computational complexity of some algorithms. The execution times of some stages of tomographic data processing can be reduced by parallelizing the corresponding algorithms. In previous work, distributed-memory and shared-memory multiprocessors were used as execution platforms for these versions of the algorithms. The Cell Broadband Engine (Cell BE) is a heterogeneous multiprocessor designed to offer high processing capacity with greater energy efficiency than conventional CPUs. These characteristics make the Cell BE widely used in the development of programs for computational science and engineering. In this thesis, versions of some tomographic data processing operations are developed specifically for the Cell BE. The Cell BE is a heterogeneous multiprocessor in which a conventional processor (PPU), 8 processors specialized in “number crunching” (SPUs), and an interconnect bus coexist on the same chip. Some authors call the Cell BE a “cluster on a chip” to emphasize that there are multiple address spaces, which forces the programmer or the runtime environment to manage data transfers between the various memories explicitly. This organization suggests that, to build parallel versions of the processing algorithms, one should consider geometric parallelization strategies similar to those used on a cluster of conventional machines.
Experience showed that the scarce local memory available in the SPUs means this strategy has to be complemented by others. Despite these limitations, the thesis shows that the Cell BE achieves significant reductions in the execution times of some tomographic data processing algorithms, even relative to previous work in which conventional multiprocessors were used.
SHI, YU. "Enhanced SAR Image Processing Using A Heterogeneous Multiprocessor." Thesis, Linköping University, Department of Computer and Information Science, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-11517.
Synthetic aperture radar (SAR) is a pulse-focusing airborne radar that can produce high-resolution radar images. A number of image processing algorithms have been developed for this kind of radar, but the computational burden is still heavy, so SAR image processing is normally performed “off-line”.
The Fast Factorized Back Projection (FFBP) algorithm is considered a computationally efficient algorithm for SAR image formation, and several applications have been implemented that try to make the process “on-line”.
The CELL Broadband Engine is one of the newest multi-core processors, jointly developed by Sony, Toshiba and IBM. CELL is good at parallel and floating-point computation, both of which fit the demands of SAR image formation.
This thesis implements the FFBP algorithm on the CELL Broadband Engine and compares the results with those of previous projects. In this project, we try to make it possible to perform SAR image formation in real time.
"A Tiger Compiler for the Cell Broadband Engine Architecture." Thesis, 2013. http://hdl.handle.net/10388/ETD-2013-08-1238.
Johnson, Jacob Raghavan Padma. "Power efficiency and scaling of the cell broadband engine." 2009. http://etda.libraries.psu.edu/theses/approved/WorldWideIndex/ETD-3966/index.html.
Shaffer, Andrew P. Raghavan Padma. "Pfftc an improved fast fourier transform for the ibm cell broadband engine." 2009. http://etda.libraries.psu.edu/theses/approved/WorldWideIndex/ETD-4024/index.html.
Chien, Jung-Yin, and 簡榮胤. "A Development Environment of Dataflow Programming Model with Application to IBM Cell Broadband Engine." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/56951943052958810030.
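National Cheng Kung University (國立成功大學)
Department of Computer Science and Information Engineering, Master's and Doctoral Program (資訊工程學系碩博士班)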
97
Multicore processors provide large computational capability but also involve complicated parallel programming. One of the major considerations in parallel programming is performance. Traditional design methodologies, which start a design on a selected platform, usually spend a lot of effort and time on performance tuning and debugging. When the platform changes, the entire design flow may have to be repeated, which is very time-consuming. Hence a flexible design methodology is necessary. In this thesis, we present a dataflow design methodology and use it to program the Cell processor. The dataflow model provides a high-level abstraction of the underlying hardware. Computation and communication of the target application are separated and represented as modules and channels, respectively. To demonstrate the proposed programming model, an MPEG-4 SP decoder is used as an example. The parallelism in the MPEG-4 decoder is discussed and exposed with the dataflow model. To map the high-level dataflow model to the Cell processor, a mapping flow, including offline profiling, task allocation and runtime libraries, is developed. Based on the profiled data, the allocation algorithm allocates tasks across the processors as evenly as possible. An efficient synchronization mechanism on the Cell processor is also proposed. We also discuss the impact of the models and the mapping flow on decoding speed. The results show that the proposed methodology gets a considerable performance boost as the number of cores is increased. It is possible to synthesize the model targeting either dedicated hardware or software on a multiprocessor once the original tool chain of the new platform is modified. For example, the proposed model can be translated into a SystemC model to facilitate a system-level design methodology.
Girard, Natalie. "CellPilot: An extension of the Pilot library for Cell Broadband Engine processors and heterogeneous clusters." Thesis, 2012. http://hdl.handle.net/10214/3279.
Xu, Meilian. "Exploiting parallelism of irregular problems and performance evaluation on heterogeneous multi-core architectures." 2012. http://hdl.handle.net/1993/9236.