
Dissertations / Theses on the topic 'Encoder-decoder'


Consult the top 50 dissertations / theses for your research on the topic 'Encoder-decoder.'


1

Kalchbrenner, Nal. "Encoder-decoder neural networks." Thesis, University of Oxford, 2017. http://ora.ox.ac.uk/objects/uuid:d56e48db-008b-4814-bd82-a5d612000de9.

Full text
Abstract:
This thesis introduces the concept of an encoder-decoder neural network and develops architectures for the construction of such networks. Encoder-decoder neural networks are probabilistic conditional generative models of high-dimensional structured items such as natural language utterances and natural images. Encoder-decoder neural networks estimate a probability distribution over structured items belonging to a target set conditioned on structured items belonging to a source set. The distribution over structured items is factorized into a product of tractable conditional distributions over individual elements that compose the items. The networks estimate these conditional factors explicitly. We develop encoder-decoder neural networks for core tasks in natural language processing and natural image and video modelling. In Part I, we tackle the problem of sentence modelling and develop deep convolutional encoders to classify sentences; we extend these encoders to models of discourse. In Part II, we go beyond encoders to study the longstanding problem of translating from one human language to another. We lay the foundations of neural machine translation, a novel approach that views the entire translation process as a single encoder-decoder neural network. We propose a beam search procedure to search over the outputs of the decoder to produce a likely translation in the target language. Besides known recurrent decoders, we also propose a decoder architecture based solely on convolutional layers. Since the publication of these new foundations for machine translation in 2013, encoder-decoder translation models have been richly developed and have displaced traditional translation systems both in academic research and in large-scale industrial deployment. In services such as Google Translate these models process in the order of a billion translation queries a day. In Part III, we shift from the linguistic domain to the visual one to study distributions over natural images and videos. We describe two- and three-dimensional recurrent and convolutional decoder architectures and address the longstanding problem of learning a tractable distribution over high-dimensional natural images and videos, where the likely samples from the distribution are visually coherent. The empirical validation of encoder-decoder neural networks as state-of-the-art models of tasks ranging from machine translation to video prediction has a two-fold significance. On the one hand, it validates the notions of assigning probabilities to sentences or images and of learning a distribution over a natural language or a domain of natural images; it shows that a probabilistic principle of compositionality, whereby a high-dimensional item is composed from individual elements at the encoder side and whereby a corresponding item is decomposed into conditional factors over individual elements at the decoder side, is a general method for modelling cognition involving high-dimensional items; and it suggests that the relations between the elements are best learnt in an end-to-end fashion as non-linear functions in distributed space. On the other hand, the empirical success of the networks on the tasks characterizes the underlying cognitive processes themselves: a cognitive process as complex as translating from one language to another that takes a human a few seconds to perform correctly can be accurately modelled via a learnt non-linear deterministic function of distributed vectors in high-dimensional space.
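The beam search mentioned above can be sketched in a few lines. The snippet below is only an illustration of the idea, with a made-up cond_prob standing in for the decoder's per-step conditional distribution; it is not code from the thesis.

```python
import math

# Toy vocabulary and a stand-in for the decoder's per-step conditional
# p(next_token | source, prefix).  In a real encoder-decoder network this
# would be the softmax over the decoder's output layer.
VOCAB = ["<eos>", "a", "b", "c"]

def cond_prob(source, prefix):
    # Hypothetical distribution that prefers to end once the output
    # is at least as long as the source.
    bias = 0.7 if len(prefix) >= len(source) else 0.05
    rest = (1.0 - bias) / (len(VOCAB) - 1)
    return {tok: (bias if tok == "<eos>" else rest) for tok in VOCAB}

def beam_search(source, beam_width=3, max_len=10):
    """Keep the beam_width highest-scoring prefixes at every step."""
    beams = [([], 0.0)]                      # (prefix, log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for tok, p in cond_prob(source, prefix).items():
                cand = (prefix + [tok], score + math.log(p))
                (finished if tok == "<eos>" else candidates).append(cand)
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    finished.extend(beams)
    return max(finished, key=lambda c: c[1])

print(beam_search(["x", "y"]))
```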
APA, Harvard, Vancouver, ISO, and other styles
2

Padinjare, Sainath. "VLSI implementation of a turbo encoder/decoder /." Internet access available to MUN users only, 2003. http://collections.mun.ca/u?/theses,162832.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Weitzman, Jonathan M. "SELECTABLE PERMUTATION ENCODER/DECODER FOR A QPSK MODEM." International Foundation for Telemetering, 2003. http://hdl.handle.net/10150/605817.

Full text
Abstract:
International Telemetering Conference Proceedings / October 20-23, 2003 / Riviera Hotel and Convention Center, Las Vegas, Nevada
An artifact of QPSK modems is ambiguity of the recovered data. There are four variations of the output data for a given input data stream. All are equally probable. To resolve this ambiguity, the QPSK data streams can be differentially encoded before modulation and differentially decoded after demodulation. The encoder maps each input data pair to a phase angle change of the QPSK carrier. In the demodulator, the inverse is performed - each phase change of the input QPSK carrier is mapped to an output data pair. This paper discusses a very simple and unique differential encoder/decoder that handles all possible data pair/phase change permutations.
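The encoder/decoder idea can be illustrated with a toy differential mapping over data pairs. The Gray mapping below is an illustrative assumption (the paper's design makes the permutation selectable); the point is that a constant phase rotation of the received carrier, the QPSK ambiguity, leaves the decoded data pairs unchanged after the first symbol.

```python
# One possible Gray mapping from a data pair to a phase change, in multiples
# of 90 degrees.  This particular table is only an illustrative assumption.
PAIR_TO_STEP = {(0, 0): 0, (0, 1): 1, (1, 1): 2, (1, 0): 3}
STEP_TO_PAIR = {v: k for k, v in PAIR_TO_STEP.items()}

def diff_encode(pairs, start_phase=0):
    """Map each data pair to a phase change and accumulate the carrier phase."""
    phases, phase = [], start_phase
    for pair in pairs:
        phase = (phase + PAIR_TO_STEP[pair]) % 4
        phases.append(phase)
    return phases

def diff_decode(phases, start_phase=0):
    """Recover data pairs from successive phase differences."""
    pairs, prev = [], start_phase
    for phase in phases:
        pairs.append(STEP_TO_PAIR[(phase - prev) % 4])
        prev = phase
    return pairs

data = [(0, 0), (1, 0), (1, 1), (0, 1)]
tx = diff_encode(data)
for offset in range(4):
    rx = [(p + offset) % 4 for p in tx]       # constant phase rotation of every symbol
    # The phase *differences* after the first symbol are unaffected by the rotation.
    assert diff_decode(rx)[1:] == data[1:]
```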
APA, Harvard, Vancouver, ISO, and other styles
4

Mejdi, Sami. "Encoder-Decoder Networks for Cloud Resource Consumption Forecasting." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-291546.

Full text
Abstract:
Excessive resource allocation in telecommunications networks can be prevented by forecasting the resource demand when dimensioning the networks and then allocating the necessary resources accordingly, which is an ongoing effort to achieve a more sustainable development. In this work, traffic data from cloud environments that host deployed virtualized network functions (VNFs) of an IP Multimedia Subsystem (IMS) has been collected along with the computational resource consumption of the VNFs. A supervised learning approach was adopted to address the forecasting problem by considering encoder-decoder networks. These networks were applied to forecast future resource consumption of the VNFs by regarding the problem as a time series forecasting problem, and recasting it as a sequence-to-sequence (seq2seq) problem. Different encoder-decoder network architectures were then utilized to forecast the resource consumption. The encoder-decoder networks were compared against a widely deployed classical time series forecasting model that served as a baseline model. The results show that while the considered encoder-decoder models failed to outperform the baseline model in overall Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE), the forecasting capabilities were more resilient to degradation over time. This suggests that the encoder-decoder networks are more appropriate for long-term forecasting, which is in agreement with related literature. Furthermore, the encoder-decoder models achieved competitive performance when compared to the baseline, despite being treated with limited hyperparameter-tuning and the absence of more sophisticated functionality such as attention. This work has shown that there is indeed potential for deep learning applications in forecasting of cloud resource consumption.
Överflödig allokering av resurser i telekommunikationsnätverk kan förhindras genom att prognosera resursbehoven vid dimensionering av dessa nätverk. Detta görs i syfte att bidra till en mer hållbar utveckling. Inför detta projekt har trafikdata från molnmiljön som hyser aktiva virtuella komponenter (VNFs) till ett IP Multimedia Subsystem (IMS) samlats in tillsammans med resursförbrukningen av dessa komponenter. Detta examensarbete avhandlar hur effektivt övervakad maskininlärning i form av encoder-decoder nätverk kan användas för att prognosera resursbehovet hos ovan nämnda VNFs. Encoder-decoder nätverken appliceras genom att betrakta den samlade datan som en tidsserie. Problemet med att förutspå utvecklingen av tidsserien formuleras sedan som ett sequence-to-sequence (seq2seq) problem. I detta arbete användes en samling encoder-decoder nätverk med olika arkitekturer för att prognosera resursförbrukningen och dessa jämfördes med en populär modell hämtad från klassisk tidsserieanalys. Resultaten visar att encoder-decoder nätverken misslyckades med att överträffa den klassiska tidsseriemodellen med avseende på Root Mean Squared Error (RMSE) och Mean Absolute Error (MAE). Dock visade encoder-decoder nätverken en betydlig motståndskraft mot prestandaförfall över tid i jämförelse med den klassiska tidsseriemodellen. Detta indikerar att encoder-decoder nätverk är lämpliga för prognosering över en längre tidshorisont. Utöver detta visade encoder-decoder nätverken en konkurrenskraftig förmåga att förutspå det korrekta resursbehovet, trots en begränsad justering av disponeringsparametrarna och utan mer sofistikerad funktionalitet implementerad som exempelvis attention.
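For readers unfamiliar with the seq2seq recasting described above, the following is a minimal sketch of one common encoder-decoder layout for multi-step time-series forecasting in Keras. The window sizes, layer widths and toy data are illustrative assumptions, not the configurations evaluated in the thesis.

```python
import numpy as np
import tensorflow as tf

lookback, horizon, n_features = 48, 12, 1

# Encoder LSTM summarises the history; the summary is repeated for every
# forecast step and unrolled by the decoder LSTM.
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64),                              # encoder
    tf.keras.layers.RepeatVector(horizon),                 # bridge to the decoder
    tf.keras.layers.LSTM(64, return_sequences=True),       # decoder
    tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(n_features)),
])
model.compile(optimizer="adam", loss="mse")

# Toy data: sliding windows over a noisy sine wave standing in for resource load.
t = np.arange(3000)
series = np.sin(2 * np.pi * t / 96) + 0.1 * np.random.randn(t.size)
X = np.stack([series[i:i + lookback]
              for i in range(len(series) - lookback - horizon)])
y = np.stack([series[i + lookback:i + lookback + horizon]
              for i in range(len(series) - lookback - horizon)])

model.fit(X[..., None], y[..., None], epochs=2, batch_size=64, verbose=0)
forecast = model.predict(X[-1:][..., None])                # shape (1, horizon, 1)
```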
APA, Harvard, Vancouver, ISO, and other styles
5

Mejdi, Sami. "Encoder-Decoder Networks for Cloud Resource Consumption Forecasting." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-294066.

Full text
Abstract:
Excessive resource allocation in telecommunications networks can be prevented by forecasting the resource demand when dimensioning the networks and then allocating the necessary resources accordingly, which is an ongoing effort to achieve a more sustainable development. In this work, traffic data from cloud environments that host deployed virtualized network functions (VNFs) of an IP Multimedia Subsystem (IMS) has been collected along with the computational resource consumption of the VNFs. A supervised learning approach was adopted to address the forecasting problem by considering encoder-decoder networks. These networks were applied to forecast future resource consumption of the VNFs by regarding the problem as a time series forecasting problem, and recasting it as a sequence-to-sequence (seq2seq) problem. Different encoder-decoder network architectures were then utilized to forecast the resource consumption. The encoder-decoder networks were compared against a widely deployed classical time series forecasting model that served as a baseline model. The results show that while the considered encoder-decoder models failed to outperform the baseline model in overall Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE), the forecasting capabilities were more resilient to degradation over time. This suggests that the encoder-decoder networks are more appropriate for long-term forecasting, which is in agreement with related literature. Furthermore, the encoder-decoder models achieved competitive performance when compared to the baseline, despite being treated with limited hyperparameter-tuning and the absence of more sophisticated functionality such as attention. This work has shown that there is indeed potential for deep learning applications in forecasting of cloud resource consumption.
Överflödig allokering av resurser i telekommunikationsnätverk kan förhindras genom att prognosera resursbehoven vid dimensionering av dessa nätverk. Detta görs i syfte att bidra till en mer hållbar utveckling. Inför detta projekt har trafikdata från molnmiljön som hyser aktiva virtuella komponenter (VNFs) till ett IP Multimedia Subsystem (IMS) samlats in tillsammans med resursförbrukningen av dessa komponenter. Detta examensarbete avhandlar hur effektivt övervakad maskininlärning i form av encoder-decoder nätverk kan användas för att prognosera resursbehovet hos ovan nämnda VNFs. Encoder-decoder nätverken appliceras genom att betrakta den samlade datan som en tidsserie. Problemet med att förutspå utvecklingen av tidsserien formuleras sedan som ett sequence-to-sequence (seq2seq) problem. I detta arbete användes en samling encoder-decoder nätverk med olika arkitekturer för att prognosera resursförbrukningen och dessa jämfördes med en populär modell hämtad från klassisk tidsserieanalys. Resultaten visar att encoder-decoder nätverken misslyckades med att överträffa den klassiska tidsseriemodellen med avseende på Root Mean Squared Error (RMSE) och Mean Absolute Error (MAE). Dock visade encoder-decoder nätverken en betydlig motståndskraft mot prestandaförfall över tid i jämförelse med den klassiska tidsseriemodellen. Detta indikerar att encoder-decoder nätverk är lämpliga för prognosering över en längre tidshorisont. Utöver detta visade encoder-decoder nätverken en konkurrenskraftig förmåga att förutspå det korrekta resursbehovet, trots en begränsad justering av disponeringsparametrarna och utan mer sofistikerad funktionalitet implementerad som exempelvis attention.
APA, Harvard, Vancouver, ISO, and other styles
6

Correia, Tiago Miguel Pina. "FPGA implementation of Alamouti encoder/decoder for LTE." Master's thesis, Universidade de Aveiro, 2013. http://hdl.handle.net/10773/12679.

Full text
Abstract:
Master's degree in Electronics and Telecommunications Engineering
Motivados por transmissões mais rápidas e mais fiáveis num canal sem fios, os sistemas da 4G devem proporcionar processamento de dados mais rápido a baixa complexidade, elevadas taxas de dados, assim como robustez na performance reduzindo também, a latência e os custos de operação. LTE apresenta, na sua camada física, tecnologias como OFDM e MIMO que prometem alcançar elevadas taxas de dados e aumentar a eficiência espectral. Especificamente a camada física do LTE emprega OFDMA para downlink e SC-FDMA para uplink. A tecnologia MIMO permite também melhorar significativamente o desempenho dos sistemas OFDM com as vantagens de multiplexação e diversidade espacial diminuindo o efeito de desvanecimento de multi-percurso no canal. Nesta dissertação são implementados um codificador e um descodificador com base no algoritimo de Alamouti num sistema MISO nomeadamente para serem incluídos num OFDM transceiver que segue as especificações da camada física do LTE. A codificação/descodificação de Alamouti realiza-se no espaço e frequência e os blocos foram projetados e simulados em Matlab através do ambiente Simulink com o auxílio dos blocos da Xilinx inseridos no seu software System Generator para DSP. Pode-se concluir que os blocos baseados no algoritmo de Alamouti foram implementados em hardware com sucesso.
Motivated by the need for faster and more reliable transmission over wireless channels, future 4G systems should provide faster data processing at low complexity, high data rates and robust performance, while also reducing latency and operating costs. The LTE physical layer includes technologies such as OFDM and MIMO that promise to achieve high data rates and increase spectral efficiency. Specifically, the physical layer of LTE employs OFDMA on the downlink and SC-FDMA on the uplink. MIMO technology also significantly improves the performance of OFDM systems through the benefits of multiplexing and spatial diversity, reducing the effect of multipath fading in the channel. In this thesis we implemented an encoder and a decoder based on the Alamouti algorithm in a MISO system, intended to be included in an OFDM transceiver that closely follows the LTE physical layer specifications. Alamouti coding/decoding is performed in space and frequency, and the blocks were designed and simulated in Matlab using the Simulink environment with the Xilinx blocks provided in the System Generator for DSP. One can conclude that the blocks based on the Alamouti algorithm were successfully implemented in hardware.
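A minimal numerical sketch of 2x1 Alamouti space-time encoding and linear combining (noise omitted) is given below; it illustrates the algorithm itself, not the FPGA blocks developed in the thesis.

```python
import numpy as np

def alamouti_encode(s1, s2):
    """Two symbols sent over two antennas in two time slots."""
    #            antenna 1        antenna 2
    # slot 1:       s1               s2
    # slot 2:     -s2*              s1*
    return np.array([[s1, s2],
                     [-np.conj(s2), np.conj(s1)]])

def alamouti_combine(r1, r2, h1, h2):
    """Linear combining at the single receive antenna (MISO 2x1)."""
    s1_hat = np.conj(h1) * r1 + h2 * np.conj(r2)
    s2_hat = np.conj(h2) * r1 - h1 * np.conj(r2)
    gain = abs(h1) ** 2 + abs(h2) ** 2
    return s1_hat / gain, s2_hat / gain

rng = np.random.default_rng(0)
qpsk = np.exp(1j * np.pi / 4 * np.array([1, 3, 5, 7]))     # unit-energy QPSK symbols
s1, s2 = rng.choice(qpsk, 2)
h1, h2 = (rng.normal(size=2) + 1j * rng.normal(size=2)) / np.sqrt(2)  # flat fading, constant over two slots

tx = alamouti_encode(s1, s2)
r1 = h1 * tx[0, 0] + h2 * tx[0, 1]        # received signal, slot 1 (noise omitted)
r2 = h1 * tx[1, 0] + h2 * tx[1, 1]        # received signal, slot 2
est1, est2 = alamouti_combine(r1, r2, h1, h2)
assert np.allclose([est1, est2], [s1, s2])
```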
APA, Harvard, Vancouver, ISO, and other styles
7

Leivas, Oliveira Gabriel [Verfasser], Thomas [Akademischer Betreuer] Brox, and Wolfram [Akademischer Betreuer] Burgard. "Encoder-decoder methods for semantic segmentation: efficiency and robustness aspects." Freiburg : Universität, 2019. http://d-nb.info/1191689476/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Kopparthi, Sunitha. "Flexible encoder and decoder designs for low-density parity-check codes." Diss., Manhattan, Kan. : Kansas State University, 2010. http://hdl.handle.net/2097/4190.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Pisacane, Claudia. "Skopos Theory La figura del traduttore come decoder e re-encoder." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2015. http://amslaurea.unibo.it/8926/.

Full text
Abstract:
Skopos theory is a theory introduced into the world of translation by the German linguist Hans Joseph Vermeer. Skopos is a word of Greek origin meaning "aim" or "purpose". The theory developed by Vermeer is based on the idea that every text has a skopos that determines the methods and strategies according to which it should be translated. In addition to skopos theory, which forms the basis of the thesis, the texts that follow are analysed according to other authors, such as Mona Baker and Laurence Venuti, who draw on the idea of skopos and analyse in great detail the figure of the translator as de-coder and re-encoder of the text.
APA, Harvard, Vancouver, ISO, and other styles
10

Nina, Oliver A. Nina. "A Multitask Learning Encoder-N-Decoder Framework for Movie and Video Description." The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu1531996548147165.

Full text
APA, Harvard, Vancouver, ISO, and other styles
11

Sari, Mehmet. "Designing fast Golay encoder/decoder in Xilinx XACT with Mentor Graphics CAD interface." Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 1997. http://handle.dtic.mil/100.2/ADA331926.

Full text
APA, Harvard, Vancouver, ISO, and other styles
12

Kumbala, Bharadwaj Reddy. "Predictive Maintenance of NOx Sensor using Deep Learning : Time series prediction with encoder-decoder LSTM." Thesis, Blekinge Tekniska Högskola, Institutionen för tillämpad signalbehandling, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-18668.

Full text
Abstract:
In the automotive industry there is a growing need to predict the failure of a component in order to achieve cost savings and customer satisfaction, since the failure of a component leads to downtime for the customer. This thesis describes an effort to build a failure-prediction monitoring model for the NOx sensor in trucks, a component used to measure the level of nitrogen oxide emissions from the truck. The NOx sensor was chosen because its failure degrades engine efficiency and because it is fragile and costly to replace. Data from good and contaminated NOx sensors, collected from test rigs, is used as input to the model. This work shows an approach that complements deep learning models with a classical machine learning algorithm. LSTMs are used to detect the gain in the NOx sensor, an encoder-decoder LSTM is used to predict the variables, and a multiple linear regression model on top of them produces the end results. The performance of the monitoring model is promising. The approach described here is a general model rather than one specific to this component, and it can also be used for other sensors.
APA, Harvard, Vancouver, ISO, and other styles
13

Sozen, Serkan. "A Viterbi Decoder Using System C For Area Efficient Vlsi Implementation." Master's thesis, METU, 2006. http://etd.lib.metu.edu.tr/upload/12607567/index.pdf.

Full text
Abstract:
In this thesis, the VLSI implementation of a Viterbi decoder using a design and simulation platform called SystemC is studied. For this purpose, the architecture of the Viterbi decoder is optimized for VLSI implementation, and two novel area-efficient structures for reconfigurable Viterbi decoders are suggested. The traditional and SystemC design cycles are compared to show the advantages of SystemC, and the C++ platforms supporting SystemC are listed, with installation issues and examples discussed. The Viterbi decoder is widely used to estimate the message encoded by a convolutional encoder. Implementations in the literature use special structures called trellises to decrease complexity and area. In this thesis, two new area-efficient reconfigurable Viterbi decoder approaches are suggested, based on rearranging the states of the trellis structures to eliminate switching and memory-addressing complexity. The first suggested architecture, based on a reconfigurable Viterbi decoder, reduces switching and memory-addressing complexity. In these architectures, the states are reorganized and the trellis structures are realized by reusing the same structures in subsequent instances. As a result, the area is minimized and power consumption is reduced; since the addressing complexity is reduced, the speed is expected to increase. The second area-efficient Viterbi decoder is an improved version of the first and can configure the constraint length, code rate, transition probabilities, trace-back depth and generator polynomials.
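For reference, the trellis search that such architectures implement can be sketched in software. The snippet below is a plain hard-decision Viterbi decoder for the classic rate-1/2, constraint-length-3 code with generators (7, 5) octal; it illustrates the algorithm only, not the area-efficient reconfigurable structures proposed in the thesis.

```python
G = [0b111, 0b101]          # generator polynomials (7, 5) octal
K = 3                       # constraint length

def conv_encode(bits):
    state, out = 0, []
    for b in bits:
        reg = (b << (K - 1)) | state
        out.extend(bin(reg & g).count("1") % 2 for g in G)
        state = reg >> 1
    return out

def viterbi_decode(received):
    metrics = {0: 0}        # path metric per trellis state
    paths = {0: []}         # surviving information bits per state
    for i in range(0, len(received), 2):
        chunk = received[i:i + 2]
        new_metrics, new_paths = {}, {}
        for state, metric in metrics.items():
            for b in (0, 1):
                reg = (b << (K - 1)) | state
                expected = [bin(reg & g).count("1") % 2 for g in G]
                next_state = reg >> 1
                cost = metric + sum(x != y for x, y in zip(chunk, expected))
                if next_state not in new_metrics or cost < new_metrics[next_state]:
                    new_metrics[next_state] = cost
                    new_paths[next_state] = paths[state] + [b]
        metrics, paths = new_metrics, new_paths
    best = min(metrics, key=metrics.get)
    return paths[best]

msg = [1, 0, 1, 1, 0, 0, 1, 0]
coded = conv_encode(msg)
coded[3] ^= 1               # flip one channel bit
assert viterbi_decode(coded) == msg
```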
APA, Harvard, Vancouver, ISO, and other styles
14

Ferreira, Nathan. "An Assessment of Available Software Defined Radio Platforms Utilizing Iterative Algorithms." Digital WPI, 2015. https://digitalcommons.wpi.edu/etd-theses/728.

Full text
Abstract:
As the demands of communication systems have become more complex and varied, software-defined radios (SDRs) have become increasingly popular. With behavior that can be modified in software, SDRs provide a highly flexible and configurable development environment. Despite their programmable behavior, the maximum performance of an SDR is still rooted in its hardware. This limitation and the desire to use SDRs in different applications have led to the rise of various pieces of hardware to serve as SDR platforms. These platforms vary in aspects such as their performance limitations, implementation details, and cost. Thus the choice of SDR platform is not solely based on the cost of the hardware and should be closely examined before making a final decision. This thesis examines the various SDR platform families available on the market today and compares the advantages and disadvantages present for each during development. As many different types of hardware can be considered an option to successfully implement an SDR, this thesis specifically focuses on general-purpose processor, system-on-chip, and field-programmable gate array implementations. When examining these SDR families, the Freescale BSC9131 is chosen to represent the system-on-chip implementation, while the Nutaq PicoSDR 2x2 Embedded with Virtex6 SX315 is used for the remaining two options. In order to test each of these platforms, a Viterbi algorithm is implemented on each and the performance measured. This performance measurement considers both how quickly the platform is able to perform the decoding and its bit error rate, in order to ascertain the implementation's accuracy. Other factors considered when comparing each platform are its flexibility and the amount of options available for development. After testing, the details of each implementation are discussed and guidelines for choosing a platform are suggested.
APA, Harvard, Vancouver, ISO, and other styles
15

Rajan, Rachel. "Semi Supervised Learning for Accurate Segmentation of Roughly Labeled Data." University of Dayton / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1597082270750151.

Full text
APA, Harvard, Vancouver, ISO, and other styles
16

Grúbel, Michal. "Implementace metriky pro hodnocení kvality videosekvencí do dekodéru H.264/AVC." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2010. http://www.nusl.cz/ntk/nusl-218404.

Full text
Abstract:
In this diploma thesis, an algorithm for evaluating the picture quality of H.264-coded video sequences is introduced and applied. As an objective picture-quality metric, the peak signal-to-noise ratio (PSNR) is used. While the computation of the PSNR usually requires a reference signal to compare against the distorted video sequence, this algorithm estimates the PSNR from the coded transform coefficients, so no reference signal is needed.
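For context, the conventional full-reference PSNR that the thesis's no-reference estimator approximates is defined as 10*log10(peak^2 / MSE); a minimal reference implementation is sketched below (the no-reference estimation from transform coefficients is not reproduced here).

```python
import numpy as np

def psnr(reference, distorted, peak=255.0):
    """Conventional full-reference PSNR in dB for 8-bit frames."""
    mse = np.mean((reference.astype(np.float64) - distorted.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(1)
frame = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
noisy = np.clip(frame + rng.normal(0, 5, frame.shape), 0, 255).astype(np.uint8)
print(f"PSNR: {psnr(frame, noisy):.2f} dB")
```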
APA, Harvard, Vancouver, ISO, and other styles
17

Zhao, Chenyuan. "Spike Processing Circuit Design for Neuromorphic Computing." Diss., Virginia Tech, 2019. http://hdl.handle.net/10919/93591.

Full text
Abstract:
The Von Neumann bottleneck, which refers to the limited throughput between the CPU and memory, has become the major factor hindering the technical advances of computing systems. In recent years, neuromorphic systems have started to gain increasing attention as compact and energy-efficient computing platforms. Spike-based neuromorphic computing systems require high-performance and low-power neural encoders and decoders to emulate the spiking behavior of neurons. These two interfaces, which convert between spikes and analog signals, determine the performance of the whole spiking neuromorphic computing system, and in particular its peak performance. Many state-of-the-art neuromorphic systems typically operate in the frequency range between 10^0 kHz and 10^2 kHz due to the limited encoding/decoding speed. In this dissertation, the popular encoding and decoding schemes, i.e. rate encoding, latency encoding and ISI encoding, together with related hardware implementations, are discussed and analyzed. The contributions of this dissertation fall into three main parts: neuron improvement, three kinds of ISI encoder design, and two types of ISI decoder design. A two-path leakage LIF neuron has been fabricated and a modular design methodology has been developed. Three ISI encoding schemes, namely parallel signal encoding, full signal iteration encoding, and partial signal encoding, are discussed. The first two ISI encoders have been fabricated successfully and the last will be taped out by the end of 2019. The two types of ISI decoders adopt different techniques: a sample-and-hold based mixed-signal design and a spike-timing-dependent-plasticity (STDP) based analog design, respectively. Both ISI decoders have been evaluated successfully through post-layout simulations, and the STDP-based ISI decoder will be taped out by the end of 2019. A test bench based on correlation inspection has been built to evaluate the information-recovery capability of the proposed spike processing link.
Doctor of Philosophy
Neuromorphic computing refers to a class of electronic systems that mimic the behavior of biological neural systems. In most cases, a neuromorphic computing system is built with analog circuits, which offer benefits in power efficiency and low thermal radiation. One of the most important components of a neuromorphic computing system is the signal-processing interface, i.e. the encoder/decoder. To increase the performance of the whole system, novel encoders and decoders are proposed in this dissertation: three kinds of temporal encoders, one rate encoder, one latency encoder, one temporal decoder, and one general spike decoder. These designs can be combined to build a highly efficient spike-based data link that guarantees the processing performance of the whole neuromorphic computing system.
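As a rough software illustration of two of the encoding schemes discussed above (rate and ISI encoding), the sketch below maps normalized values to spike trains and back. The parameters are arbitrary illustrative choices, not the transfer functions of the fabricated circuits.

```python
import numpy as np

def rate_encode(value, window=1.0, max_rate=100.0, rng=None):
    """Encode a value in [0, 1] as a Poisson spike train whose rate grows with the value."""
    rng = rng or np.random.default_rng()
    n_spikes = rng.poisson(value * max_rate * window)
    return np.sort(rng.uniform(0.0, window, n_spikes))

def isi_encode(values, t_min=1e-3, t_max=10e-3):
    """Encode each value in [0, 1] as the interval to the next spike (larger value -> shorter ISI)."""
    isis = t_max - np.asarray(values) * (t_max - t_min)
    return np.cumsum(isis)                           # absolute spike times

def isi_decode(spike_times, t_min=1e-3, t_max=10e-3):
    isis = np.diff(np.concatenate([[0.0], spike_times]))
    return (t_max - isis) / (t_max - t_min)

vals = [0.2, 0.9, 0.5]
assert np.allclose(isi_decode(isi_encode(vals)), vals)
print(rate_encode(0.8))                              # spike times within the window
```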
APA, Harvard, Vancouver, ISO, and other styles
18

Ploštica, Stanislav. "Turbo kódy a jejich aplikace." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2009. http://www.nusl.cz/ntk/nusl-218201.

Full text
Abstract:
This diploma thesis explains data coding using turbo codes, which belong to the group of error-correction codes and can achieve high efficiency. The first part describes the encoding and decoding process and the components of the encoder and decoder; the principle of encoding and decoding is demonstrated with a simple example. The end of this part describes the two most frequently used decoding algorithms (SOVA and MAP). The second part describes a computer program, created in Matlab GUI, intended as a teaching aid. The program provides a graphical interface with many options and displayed results, and allows the user to step through the error-correction process. The third part describes a program created in Matlab Simulink that was implemented on the TMS320C6713 kit, together with the measurement procedure. To verify the efficiency of the turbo codes, several parameters were measured, among them the number of decoding iterations, the generator polynomials and the use of puncturing. The last part contains the measured values and an evaluation of the results.
APA, Harvard, Vancouver, ISO, and other styles
19

Öjerteg, Theo. "Design and implementation of test a tool for the GSM traffic channel." Thesis, Linköping University, Department of Electrical Engineering, 2002. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-1240.

Full text
Abstract:

Today's telecommunication systems are becoming more and more complex, and automatic testing is required to guarantee the quality of the systems produced. A current example is the introduction of GPRS traffic in the GSM network nodes. This thesis investigates the needs and demands for such automatic testing of the traffic channels in the GSM system, and a solution intended to be part of the Ericsson TSS is proposed. One problem to be solved is that today's testing tools do not support testing of speech channels with the speech transcoder unit installed. As part of the investigation, a speech codec was implemented for execution on the current hardware used in the test platform. The selected speech codec is the enhanced full rate codec, which generates a bitstream of 12.2 kbit/s and gives a good trade-off between compression and speech quality. The report covers the design of the test tool and the implementation of the speech codec; in particular, performance problems in the implementation of the encoder are addressed.

APA, Harvard, Vancouver, ISO, and other styles
20

Monsen, Julius. "Building high-quality datasets for abstractive text summarization : A filtering‐based method applied on Swedish news articles." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-176352.

Full text
Abstract:
With an increasing amount of information on the internet, automatic text summarization could potentially make content more readily available for a larger variety of people. Training and evaluating text summarization models require datasets of sufficient size and quality. Today, most such datasets are in English, and for minor languages such as Swedish, it is not easy to obtain corresponding datasets with handwritten summaries. This thesis proposes methods for compiling high-quality datasets suitable for abstractive summarization from a large amount of noisy data through characterization and filtering. The data used consists of Swedish news articles and their preambles, which are here used as summaries. Different filtering techniques are applied, yielding five different datasets. Furthermore, summarization models are implemented by warm-starting an encoder-decoder model with BERT checkpoints and fine-tuning it on the different datasets. The fine-tuned models are evaluated with ROUGE metrics and BERTScore. All models achieve significantly better results when evaluated on filtered test data than when evaluated on unfiltered test data. Moreover, the models trained on the most filtered dataset with the smallest size achieve the best results on the filtered test data. The trade-off between dataset size and quality and other methodological implications of the data characterization, the filtering and the model implementation are discussed, leading to suggestions for future research.
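One simple filtering heuristic of the kind described above is to discard article-preamble pairs whose preamble shares almost nothing with its article. The thresholds and the bigram-novelty measure below are illustrative assumptions, not the exact characterization used in the thesis.

```python
def ngrams(tokens, n):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def novelty(article, summary, n=2):
    """Fraction of summary n-grams that never appear in the article."""
    art = ngrams(article.lower().split(), n)
    summ = ngrams(summary.lower().split(), n)
    return len(summ - art) / max(len(summ), 1)

def keep_pair(article, summary, min_len=20, max_novelty=0.8):
    """Very simple quality filter: drop tiny articles and preambles that are
    almost unrelated to their article (likely teasers or noise)."""
    return len(article.split()) >= min_len and novelty(article, summary) <= max_novelty

article = "the municipal board voted on tuesday to approve the new cycle path " * 3
preamble = "the municipal board approved the new cycle path on tuesday"
print(novelty(article, preamble), keep_pair(article, preamble))
```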
APA, Harvard, Vancouver, ISO, and other styles
21

Larsson, Susanna. "Monocular Depth Estimation Using Deep Convolutional Neural Networks." Thesis, Linköpings universitet, Datorseende, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-159981.

Full text
Abstract:
For a long time stereo-cameras have been deployed in visual Simultaneous Localization And Mapping (SLAM) systems to gain 3D information. Even though stereo-cameras show good performance, the main disadvantage is the complex and expensive hardware setup they require, which limits the use of the system. A simpler and cheaper alternative is monocular cameras; however, monocular images lack the important depth information. Recent works have shown that having access to depth maps in a monocular SLAM system is beneficial since they can be used to improve the 3D reconstruction. This work proposes a deep neural network that predicts dense high-resolution depth maps from monocular RGB images by casting the problem as a supervised regression task. The network architecture follows an encoder-decoder structure in which multi-scale information is captured and skip-connections are used to recover details. The network is trained and evaluated on the KITTI dataset, achieving results comparable to state-of-the-art methods. With further development, this network shows good potential to be incorporated in a monocular SLAM system to improve the 3D reconstruction.
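A heavily reduced sketch of the encoder-decoder-with-skip-connections layout described above is shown below in PyTorch; the depth and channel counts are illustrative and bear no relation to the actual network trained on KITTI.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyDepthNet(nn.Module):
    """Minimal encoder-decoder with one skip connection, regressing a dense
    depth map from an RGB image (a sketch of the general layout only)."""
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.dec1 = nn.Sequential(nn.Conv2d(64, 32, 3, padding=1), nn.ReLU())
        self.dec2 = nn.Conv2d(64, 1, 3, padding=1)   # 64 = 32 decoder + 32 skip channels

    def forward(self, x):
        s1 = self.enc1(x)                            # 1/2 resolution, 32 channels
        s2 = self.enc2(s1)                           # 1/4 resolution, 64 channels
        d1 = F.interpolate(self.dec1(s2), scale_factor=2,
                           mode="bilinear", align_corners=False)
        d1 = torch.cat([d1, s1], dim=1)              # skip connection recovers detail
        d2 = F.interpolate(self.dec2(d1), scale_factor=2,
                           mode="bilinear", align_corners=False)
        return d2                                    # dense depth map, full resolution

img = torch.randn(1, 3, 128, 384)                   # KITTI-like aspect ratio
depth = TinyDepthNet()(img)
assert depth.shape == (1, 1, 128, 384)
```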
APA, Harvard, Vancouver, ISO, and other styles
22

Holcner, Jonáš. "Strojový překlad pomocí umělých neuronových sítí." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2018. http://www.nusl.cz/ntk/nusl-386020.

Full text
Abstract:
The goal of this thesis is to describe and build a system for neural machine translation. The system is built with recurrent neural networks, in particular the encoder-decoder architecture. The result is an nmt library used to conduct experiments with different model parameters. The results of the experiments are compared with a system built with the statistical tool Moses.
APA, Harvard, Vancouver, ISO, and other styles
23

Ranalli, Lorenzo. "Studio ed implementazione di un modello di Action Recognition. Classificazione delle azioni di gioco e della tipologia di colpi durante un match di Tennis." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021.

Find full text
Abstract:
Machine learning and sport are increasingly consolidating their marriage. Whether in individual sports, team sports, or sports at more or less professional levels, a smart component is ever more present, emerging both in refereeing and in virtual coaching. It is precisely in the field of virtual coaching that the idea of IConsulting is situated: with mAIcoach it seeks to redefine the rules of tennis training, assist the athlete and guide them in the correct execution of movements. More specifically, the idea is to convey a mathematical method through a smart system of evaluations of the tennis player. The user can submit videos of their own training sessions and receive advice and constructive criticism in order to improve their posture and strokes.
APA, Harvard, Vancouver, ISO, and other styles
24

Němec, Jaroslav. "Bezeztrátová komprese videa." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2012. http://www.nusl.cz/ntk/nusl-236537.

Full text
Abstract:
This master's thesis deals with lossless video compression. It covers the basic concepts and techniques used in image representation, explains the basic difference between lossless and lossy video compression as well as the limitations of lossless video compression, and describes in detail the basic blocks forming the video codec, including the possible variants of each block. The implemented lossless video codec was compared with common lossless video codecs.
APA, Harvard, Vancouver, ISO, and other styles
25

Nawaz, Sabeen. "Analysis of Transactional Data with Long Short-Term Memory Recurrent Neural Networks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-281282.

Full text
Abstract:
An issue authorities and banks face is fraud related to payments and transactions, where huge monetary losses occur to a party or where money laundering schemes are carried out. Previous work in the field of machine learning for fraud detection has addressed the issue as a supervised learning problem. In this thesis, we propose a model which can be used in a fraud detection system with transactions and payments that are unlabeled. The proposed model is a Long Short-Term Memory in an auto-encoder decoder network (LSTM-AED), which is trained and tested on transformed data. The data is transformed by reducing it to principal components and clustering it with K-means. The model is trained to reconstruct the sequence with high accuracy. Our results indicate that the LSTM-AED performs better than a random sequence generating process in learning and reconstructing a sequence of payments. We also found that a huge loss of information occurs in the pre-processing stages.
Obehöriga transaktioner och bedrägerier i betalningar kan leda till stora ekonomiska förluster för banker och myndigheter. Inom maskininlärning har detta problem tidigare hanterats med hjälp av klassifierare via supervised learning. I detta examensarbete föreslår vi en modell som kan användas i ett system för att upptäcka bedrägerier. Modellen appliceras på omärkt data med många olika variabler. Modellen som används är en Long Short-term memory i en auto-encoder decoder nätverk. Datan transformeras med PCA och klustras med K-means. Modellen tränas till att rekonstruera en sekvens av betalningar med hög noggrannhet. Vår resultat visar att LSTM-AED presterar bättre än en modell som endast gissar nästa punkt i sekvensen. Resultatet visar också att mycket information i datan går förlorad när den förbehandlas och transformeras.
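A sketch of the pre-processing pipeline described above (PCA followed by K-means) is given below using scikit-learn; feature dimensions, component counts and the number of clusters are illustrative assumptions, and the LSTM auto-encoder itself is not reproduced.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Transaction features are reduced to principal components and clustered;
# the cluster index becomes a compact token that the LSTM auto-encoder
# decoder network is later trained to reconstruct in sequence.
rng = np.random.default_rng(0)
transactions = rng.normal(size=(5000, 30))           # 5000 payments, 30 raw features

pca = PCA(n_components=5).fit(transactions)
components = pca.transform(transactions)             # dimensionality reduction
print("explained variance:", pca.explained_variance_ratio_.sum())

kmeans = KMeans(n_clusters=16, n_init=10, random_state=0).fit(components)
tokens = kmeans.labels_                               # one discrete token per payment

# Sequences of tokens (e.g. per account, in time order) would then be fed
# to the LSTM encoder-decoder, which is trained to reconstruct them.
sequence = tokens[:50]
```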
APA, Harvard, Vancouver, ISO, and other styles
26

Daliparthi, Venkata Satya Sai Ajay. "Semantic Segmentation of Urban Scene Images Using Recurrent Neural Networks." Thesis, Blekinge Tekniska Högskola, Institutionen för datavetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-20651.

Full text
Abstract:
Background: In Autonomous Driving Vehicles, the vehicle receives pixel-wise sensor data from RGB cameras, point-wise depth information from the cameras, and sensors data as input. The computer present inside the Autonomous Driving vehicle processes the input data and provides the desired output, such as steering angle, torque, and brake. To make an accurate decision by the vehicle, the computer inside the vehicle should be completely aware of its surroundings and understand each pixel in the driving scene. Semantic Segmentation is the task of assigning a class label (Such as Car, Road, Pedestrian, or Sky) to each pixel in the given image. So, a better performing Semantic Segmentation algorithm will contribute to the advancement of the Autonomous Driving field. Research Gap: Traditional methods, such as handcrafted features and feature extraction methods, were mainly used to solve Semantic Segmentation. Since the rise of deep learning, most of the works are using deep learning to dealing with Semantic Segmentation. The most commonly used neural network architecture to deal with Semantic Segmentation was the Convolutional Neural Network (CNN). Even though some works made use of Recurrent Neural Network (RNN), the effect of RNN in dealing with Semantic Segmentation was not yet thoroughly studied. Our study addresses this research gap. Idea: After going through the existing literature, we came up with the idea of “Using RNNs as an add-on module, to augment the skip-connections in Semantic Segmentation Networks through residual connections.” Objectives and Method: The main objective of our work is to improve the Semantic Segmentation network’s performance by using RNNs. The Experiment was chosen as a methodology to conduct our study. In our work, We proposed three novel architectures called UR-Net, UAR-Net, and DLR-Net by implementing our idea to the existing networks U-Net, Attention U-Net, and DeepLabV3+ respectively. Results and Findings: We empirically showed that our proposed architectures have shown improvement in efficiently segmenting the edges and boundaries. Through our study, we found that there is a trade-off between using RNNs and Inference time of the model. Suppose we use RNNs to improve the performance of Semantic Segmentation Networks. In that case, we need to trade off some extra seconds during the inference of the model. Conclusion: Our findings will not contribute to the Autonomous driving field, where we need better performance in real-time. But, our findings will contribute to the advancement of Bio-medical Image segmentation, where doctors can trade-off those extra seconds during inference for better performance.
APA, Harvard, Vancouver, ISO, and other styles
27

Nishikimi, Ryo. "Generative, Discriminative, and Hybrid Approaches to Audio-to-Score Automatic Singing Transcription." Doctoral thesis, Kyoto University, 2021. http://hdl.handle.net/2433/263772.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Šedý, Jakub. "Turbo konvoluční a turbo blokové kódy." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2011. http://www.nusl.cz/ntk/nusl-219287.

Full text
Abstract:
The aim of this thesis is to explain turbo convolutional and turbo block codes and the decoding of the protected message. The practical part focuses on the design of a demonstration program in Matlab. The work is divided into four parts. The first two deal with the theoretical analysis of coding and decoding. The third section describes the demonstration program that was created, which allows the user to follow the process of encoding and decoding step by step. The fourth is devoted to simulation and the performance of turbo codes.
APA, Harvard, Vancouver, ISO, and other styles
29

Závorka, Radek. "Program pro demonstraci kanálového kódování." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2020. http://www.nusl.cz/ntk/nusl-413009.

Full text
Abstract:
The main subject of this thesis is the creation of a program for demonstrating channel coding, to be used for teaching purposes. The program contains various codes, from simple ones to codes that come close to the Shannon channel capacity limit: specifically the Hamming code, a cyclic code, a convolutional code and an LDPC code. The functions are based on the theoretical background described in this thesis and have been programmed in Matlab. The practical output of this thesis is a user interface where the user can input an information word, simulate transmission through the channel, and observe encoding and decoding for each code. The thesis also contains a comparison of the individual codes in terms of bit error rate as a function of SNR and various parameters. A computer lab exercise with theoretical background, an assignment and worksheets for convenient completion of each task is also included.
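As a flavour of the simplest code in such a teaching program, the sketch below encodes four data bits with a systematic Hamming(7,4) code and corrects a single flipped bit via the syndrome; it is an illustration in Python, not the thesis's Matlab implementation.

```python
import numpy as np

# Systematic Hamming(7,4) generator and parity-check matrices: a single
# flipped bit is located by matching the syndrome against a column of H.
G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])
H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])

def encode(data):
    return data @ G % 2

def decode(received):
    syndrome = H @ received % 2
    if syndrome.any():                               # non-zero syndrome -> locate the error
        error_pos = np.where((H.T == syndrome).all(axis=1))[0][0]
        received = received.copy()
        received[error_pos] ^= 1
    return received[:4]                              # systematic code: data bits come first

data = np.array([1, 0, 1, 1])
codeword = encode(data)
codeword[5] ^= 1                                     # single-bit channel error
assert (decode(codeword) == data).all()
```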
APA, Harvard, Vancouver, ISO, and other styles
30

Kašpar, Jaroslav. "Zabezpečení přenosu dat BCH kódy." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2008. http://www.nusl.cz/ntk/nusl-217733.

Full text
Abstract:
The thesis Data transmission error-protection with BCH codes deals with a large class of random-error-correcting cyclic codes which are able to protect binary data and can be used, for example, in data storage and high-speed modems. Bose, Chaudhuri and Hocquenghem (BCH) codes operate over algebraic structures called Galois fields. BCH encoding is the same as cyclic encoding and can be done with a linear feedback shift register, but decoding is more complex and can be done with different algorithms; in this thesis two decoding algorithms, Peterson and Berlekamp-Massey, are mentioned. The aim of this thesis is to find a BCH code able to correct t = 6 independent errors in a data sequence of up to n = 150 bits, then to survey possible realizations of the codecs, set criteria for the best realization, and design and test this realization. The thesis is split into three main parts. In the first part, the encoding and decoding methods of BCH codes are described in general. The second part deals with selecting the right code and realization; the BCH (63,30) code and a realization with an FPGA chip were chosen. The last part describes the design of the BCH encoder and decoder and their compilation in the Altera design software.
APA, Harvard, Vancouver, ISO, and other styles
31

Lindblad, Maria. "A Comparative Study of the Quality between Formality Style Transfer of Sentences in Swedish and English, leveraging the BERT model." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-299932.

Full text
Abstract:
Formality Style Transfer (FST) is the task of automatically transforming a piece of text from one level of formality to another. Previous research has investigated different methods of performing FST on text in English, but at the time of this project there were to the author’s knowledge no previous studies analysing the quality of FST on text in Swedish. The purpose of this thesis was to investigate how a model trained for FST in Swedish performs. This was done by comparing the quality of a model trained on text in Swedish for FST, to an equivalent model trained on text in English for FST. Both models were implemented as encoder-decoder architectures, warm-started using two pre-existing Bidirectional Encoder Representations from Transformers (BERT) models, pre-trained on Swedish and English text respectively. The two FST models were fine-tuned for both the informal to formal task as well as the formal to informal task, using the Grammarly’s Yahoo Answers Formality Corpus (GYAFC). The Swedish version of GYAFC was created through automatic machine translation of the original English version. The Swedish corpus was then evaluated on the three criteria meaning preservation, formality preservation and fluency preservation. The results of the study indicated that the Swedish model had the capacity to match the quality of the English model but was held back by the inferior quality of the Swedish corpus. The study also highlighted the need for task specific corpus in Swedish.
Överföring av formalitetsstil syftar på uppgiften att automatiskt omvandla ett stycke text från en nivå av formalitet till en annan. Tidigare forskning har undersökt olika metoder för att utföra uppgiften på engelsk text men vid tiden för detta projekt fanns det enligt författarens vetskap inga tidigare studier som analyserat kvaliteten för överföring av formalitetsstil på svensk text. Syftet med detta arbete var att undersöka hur en modell tränad för överföring av formalitetsstil på svensk text presterar. Detta gjordes genom att jämföra kvaliteten på en modell tränad för överföring av formalitetsstil på svensk text, med en motsvarande modell tränad på engelsk text. Båda modellerna implementerades som kodnings-avkodningsmodeller, vars vikter initierats med hjälp av två befintliga Bidirectional Encoder Representations from Transformers (BERT)-modeller, förtränade på svensk respektive engelsk text. De två modellerna finjusterades för omvandling både från informell stil till formell och från formell stil till informell. Under finjusteringen användes en svensk och en engelsk version av korpusen Grammarly’s Yahoo Answers Formality Corpus (GYAFC). Den svenska versionen av GYAFC skapades genom automatisk maskinöversättning av den ursprungliga engelska versionen. Den svenska korpusen utvärderades sedan med hjälp av de tre kriterierna betydelse-bevarande, formalitets-bevarande och flödes-bevarande. Resultaten från studien indikerade att den svenska modellen hade kapaciteten att matcha kvaliteten på den engelska modellen men hölls tillbaka av den svenska korpusens sämre kvalitet. Studien underströk också behovet av uppgiftsspecifika korpusar på svenska.
APA, Harvard, Vancouver, ISO, and other styles
32

Špaček, Milan. "Porovnání možností komprese multimediálních signálů." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2013. http://www.nusl.cz/ntk/nusl-220319.

Full text
Abstract:
This thesis deals with a comparison of compression options for multimedia signals, focused on video and advanced codecs. Specifically, it describes the encoding and decoding of video recordings according to the MPEG standard. The theoretical part describes the characteristic properties of the video signal and justifies the need for compression in recording and transmission. Methods for eliminating redundancy and irrelevance in the encoded video signal are also described, as are ways of measuring video signal quality. A separate chapter is focused on the characteristics of currently used and promising codecs. In the practical part of the thesis, functions were created in the Matlab environment and implemented into a graphical user interface that simulates the activity of the functional blocks of the encoder and decoder. Based on user-specified input parameters, it performs encoding and decoding of any given picture composed of images in RGB format and displays the outputs of the individual functional blocks. Algorithms are implemented for the initial processing of the input sequence, including sub-sampling, as well as DCT, quantization, motion compensation and their inverse operations. Separate chapters are dedicated to the realisation of the codec in the Matlab environment and to the outputs of the individual processing steps. Comparisons of the compression algorithms and the impact of parameter changes on the final signal are also discussed. The findings are summarized in the conclusion.
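The core transform-and-quantize step of such an encoder/decoder pair can be illustrated on a single 8x8 block, as sketched below with SciPy; the flat quantization step is an illustrative simplification of the standard quantization matrices.

```python
import numpy as np
from scipy.fftpack import dct, idct

# One 8x8 block through the forward DCT, uniform quantization and the inverse
# path; quantization is the only lossy step in this miniature pipeline.
def dct2(block):
    return dct(dct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

def idct2(block):
    return idct(idct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(8, 8)).astype(float) - 128   # level-shifted pixels

coeffs = dct2(block)
q_step = 16.0
quantised = np.round(coeffs / q_step)             # information is lost here
reconstructed = idct2(quantised * q_step)

error = np.abs(block - reconstructed).mean()
print(f"mean absolute reconstruction error: {error:.2f}")
```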
APA, Harvard, Vancouver, ISO, and other styles
33

Hardy, Clément. "Architectures multi-échelles de type encοdeur-décοdeur pοur la stéréοphοtοmétrie." Electronic Thesis or Diss., Normandie, 2024. http://www.theses.fr/2024NORMC222.

Full text
Abstract:
La stéréophotométrie est une technique de reconstruction 3D de la surface d'un objet. De plus en plus de recherches s'intéressent à ce problème qui se veut prometteur dans le monde industriel. En effet, la stéréophotométrie peut être utilisée pour détecter les défauts d'usinage de pièces mécaniques ou pour de la reconnaissance faciale par exemple. Cette thèse explore les méthodes d'apprentissage profond pour la stéréophotométrie, notamment les différents aspects liés aux bases de données d'entraînement et aux architectures considérées.De manière générale, la sur-paramétrisation d'un réseau de neurones est souvent suffisante pour supporter la diversité des problèmes rencontrés. La base de données d'entraînement est alors considérée comme le point clé permettant de conditionner le réseau au problème traité. Par conséquent, pour répondre à ce besoin, nous proposons une nouvelle base de données d'entraînement synthétique. Cette base de données considère une très grande variété de géométries, de textures, de directions ou conditions lumineuses mais également d'environnements, permettant donc de générer un nombre de situation quasiment infini.Le second point décisif d'une bonne reconstruction concerne le choix de l'architecture. L'architecture d'un réseau doit assurer une bonne capacité de généralisation sur de nouvelles données pour générer de très bons résultats sur des données inédites. Et ce, quelle que soit l'application. En particulier, pour la stéréophotométrie, l'enjeu est d'être capable de reconstruire des images très haute résolution afin de ne pas perdre de détails. Nous proposons alors une architecture multi-échelles de type encodeur-décodeur afin de répondre à ce problème.Dans un premier temps, nous proposons une architecture fondée sur les réseaux convolutionnels pour répondre au problème de stéréophotométrie calibrée, i.e. quand la direction lumineuse est connue. Dans un second temps, nous proposons une version fondé sur les Transformers afin de répondre au problème de stéréophotométrie universelle. C'est-à-dire que nous sommes en capacité de gérer n'importe quel environnement, direction lumineuse, etc., sans aucune information préalable. Finalement, pour améliorer les reconstructions sur des matériaux difficiles (translucides ou brillants par exemple), nous proposons une nouvelle approche que nous appelons ``faiblement calibrée'' pour la stéréophotométrie. Dans ce contexte, nous n'avons qu'une connaissance approximative de la direction d'éclairage.L'ensemble des pistes que nous avons explorées ont conduit à des résultats convaincants, à la fois quantitatifs et visuels sur l'ensemble des bases de données de l'état-de-l'art. En effet, nous avons pu observer une amélioration notable de la précision de reconstruction des cartes de normales, contribuant ainsi à avancer l'état de l'art dans ce domaine
Photometric stereo is a technique for 3D surface reconstruction of objects. This field has seen a surge in research interest due to its potential applications in industry. Specifically, photometric stereo can be employed for tasks such as detecting machining defects in mechanical components or facial recognition. This thesis delves into deep learning methods for photometric stereo, with a particular focus on training data and network architectures. While neural network over-parameterization is often adequate, the training dataset plays a pivotal role in task adaptation. To generate a highly diverse and extensible training set, we propose a new synthetic dataset. This dataset incorporates a broad spectrum of geometric, textural, lighting, and environmental variations, allowing for the creation of nearly infinite training instances. The second decisive point of a good reconstruction concerns the choice of architecture. The architecture of a network must ensure a good generalization capacity in order to produce very good results on unseen data, regardless of the application. In particular, for the photometric stereo problem, the challenge is to be able to reconstruct very high-resolution images so as not to lose any details. We therefore propose a multi-scale encoder-decoder architecture to address this problem. We first introduce a convolutional neural network architecture for calibrated photometric stereo, where the lighting direction is known. To handle unconstrained environments, we propose a Transformer-based approach for universal photometric stereo. Lastly, for challenging materials such as translucent or shiny surfaces, we introduce a "weakly calibrated" approach that assumes only approximate knowledge of the lighting direction. The approaches we have investigated have consistently demonstrated strong performance on standard benchmarks, as evidenced by both quantitative metrics and visual assessments. Our results, particularly the improved accuracy of reconstructed normal maps, represent a significant advancement in photometric stereo.
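For context, the classical calibrated Lambertian baseline that learning-based photometric stereo builds on can be written as a small least-squares problem; the sketch below is that textbook baseline under idealized assumptions (known lights, no shadows or noise), not the thesis's networks.

```python
import numpy as np

# With known unit light directions L and observed intensities I, the
# albedo-scaled normals follow from the least-squares solution of I = L (albedo * n).
rng = np.random.default_rng(0)
n_lights, n_pixels = 8, 5

L = rng.normal(size=(n_lights, 3))
L /= np.linalg.norm(L, axis=1, keepdims=True)            # unit lighting directions

true_n = rng.normal(size=(n_pixels, 3))
true_n /= np.linalg.norm(true_n, axis=1, keepdims=True)  # unit surface normals
albedo = rng.uniform(0.3, 1.0, size=n_pixels)

I = L @ (albedo * true_n.T)                               # ideal Lambertian image formation

g, *_ = np.linalg.lstsq(L, I, rcond=None)                 # g[:, j] = albedo_j * n_j
est_albedo = np.linalg.norm(g, axis=0)
est_normals = (g / est_albedo).T

assert np.allclose(est_normals, true_n) and np.allclose(est_albedo, albedo)
```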
APA, Harvard, Vancouver, ISO, and other styles
34

Trčka, Tomáš. "Turbokódy a jejich použití ve sdělovacích systémech." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2008. http://www.nusl.cz/ntk/nusl-217561.

Full text
Abstract:
This diploma thesis deals with turbo codes, which belong to the group of error-correction codes, sometimes referred to as forward error correcting (FEC) codes or channel codes. The thesis can be thematically divided into two basic parts. The first part describes the turbo code encoder and decoder block diagrams, with an illustration of the two most frequently used iterative decoding algorithms (SOVA and MAP). The end of this part lists the best-known turbo codes used in present communication systems. The second part presents simulation results for turbo codes using Binary Phase Shift Keying (BPSK) over Additive White Gaussian Noise (AWGN) channels. These simulations were created in the MATLAB/SIMULINK computer program. It is shown that many different parameters greatly affect turbo code performance, among them the number of decoding iterations used, the input data frame length, the generator polynomials and constraint lengths of the RSC encoders, a properly designed interleaving block, and the decoding algorithm used.
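As a minimal illustration of the simulation setup mentioned above, the snippet below measures the bit error rate of uncoded BPSK over an AWGN channel and compares it with theory; a turbo-coded simulation would wrap the same channel around the encoder, interleaver and iterative decoder, which is not reproduced here.

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(0)
n_bits = 200_000

for ebn0_db in (0, 2, 4, 6):
    ebn0 = 10 ** (ebn0_db / 10)
    bits = rng.integers(0, 2, n_bits)
    symbols = 1 - 2 * bits                            # bit 0 -> +1, bit 1 -> -1
    noise = rng.normal(scale=np.sqrt(1 / (2 * ebn0)), size=n_bits)
    decided = (symbols + noise) < 0                   # hard decision back to bits
    ber = np.mean(decided != bits)
    theory = 0.5 * erfc(sqrt(ebn0))                   # Q(sqrt(2*Eb/N0))
    print(f"Eb/N0 = {ebn0_db} dB: simulated BER {ber:.4f}, theory {theory:.4f}")
```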
APA, Harvard, Vancouver, ISO, and other styles
35

Nilsson, Mårten. "Augmenting High-Dimensional Data with Deep Generative Models." Thesis, KTH, Robotik, perception och lärande, RPL, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-233969.

Full text
Abstract:
Data augmentation is a technique that can be performed in various ways to improve the training of discriminative models. The recent developments in deep generative models offer new ways of augmenting existing data sets. In this thesis, a framework for augmenting annotated data sets with deep generative models is proposed, together with a method for quantitatively evaluating the quality of the generated data sets. Using this framework, two data sets for pupil localization were generated with different generative models, including both well-established models and a novel model proposed for this purpose. The novel model was shown, both qualitatively and quantitatively, to generate the best data sets. A set of smaller experiments on standard data sets also revealed cases where this generative model could improve the performance of an existing discriminative model. The results indicate that generative models can be used to augment or replace existing data sets when training discriminative models.
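To make the proposed workflow concrete, the sketch below shows the generic augment-and-evaluate loop that such a framework implies: sample labelled synthetic data from a trained generative model, merge it with the real set, and compare classifiers trained with and without the synthetic data. It is an illustration only; the `generator.sample` and scikit-learn-style `fit`/`score` interfaces are assumptions, not the thesis's actual API.

```python
import numpy as np

def augment_and_evaluate(train_x, train_y, test_x, test_y,
                         generator, classifier_factory, n_synthetic=10_000):
    """Illustrative augmentation loop: sample labelled data from a trained
    generative model, merge it with the real set, and compare a classifier
    trained on the original data with one trained on the augmented data."""
    synth_x, synth_y = generator.sample(n_synthetic)   # assumed generator interface
    aug_x = np.concatenate([train_x, synth_x])
    aug_y = np.concatenate([train_y, synth_y])

    baseline = classifier_factory().fit(train_x, train_y)
    augmented = classifier_factory().fit(aug_x, aug_y)
    return baseline.score(test_x, test_y), augmented.score(test_x, test_y)
```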
APA, Harvard, Vancouver, ISO, and other styles
36

Djikic, Addi. "Segmentation and Depth Estimation of Urban Road Using Monocular Camera and Convolutional Neural Networks." Thesis, KTH, Robotik, perception och lärande, RPL, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-235496.

Full text
Abstract:
Deep learning for safe autonomous transport is rapidly emerging. Fast and robust perception for autonomous vehicles will be crucial for future navigation in urban areas with high traffic and human interplay. Previous work focuses on extracting full-image depth maps or on finding specific road features such as lanes. However, in urban environments lanes are not always present, and sensors such as LiDAR with 3D point clouds provide a rather sparse depth perception of the road and demand complex algorithmic approaches. In this thesis we derive a novel convolutional neural network that we call AutoNet. It is designed as an encoder-decoder network for pixel-wise depth estimation of the drivable free-space of an urban road, using only a monocular camera, and handled as a supervised regression problem. AutoNet is also constructed as a classification network that solely classifies and segments the drivable free-space in real time with monocular vision, handled as a supervised classification problem, which proves to be a simpler and more robust solution than the regression approach. We also implement the state-of-the-art neural network ENet for comparison, which is designed for fast real-time semantic segmentation and fast inference. The evaluation shows that AutoNet outperforms ENet on every performance metric, but is slower in terms of frame rate. Optimization techniques are proposed as future work to increase the frame rate of the network while maintaining its robustness and performance. All training and evaluation is done on the Cityscapes dataset. New ground-truth labels for road depth perception are created for training with a novel approach of fusing pre-computed depth maps with semantic labels. Data collection is conducted with a Scania vehicle mounted with a monocular camera to test the final derived models. The proposed AutoNet shows promising state-of-the-art performance in regard to road depth estimation as well as road classification.
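The exact AutoNet layer configuration is not given in the abstract, but the encoder-decoder pattern it builds on can be sketched generically. The PyTorch model below is illustrative only; layer sizes and names are ours, not AutoNet's. It downsamples with an encoder, upsamples with a decoder, and ends in a one-channel head that can serve either the depth-regression or the free-space-classification variant, depending on the loss attached to it.

```python
import torch
import torch.nn as nn

class TinyEncoderDecoder(nn.Module):
    """Generic encoder-decoder for dense per-pixel prediction: a single output
    channel can be read as a depth map (regression) or, after a sigmoid, as a
    drivable free-space mask (binary classification)."""
    def __init__(self, out_channels=1):
        super().__init__()
        def block(c_in, c_out):
            return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1),
                                 nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))
        self.enc1, self.enc2, self.enc3 = block(3, 32), block(32, 64), block(64, 128)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.dec2, self.dec1 = block(128, 64), block(64, 32)
        self.head = nn.Conv2d(32, out_channels, 1)

    def forward(self, x):
        e1 = self.enc1(x)                 # full resolution
        e2 = self.enc2(self.pool(e1))     # 1/2 resolution
        e3 = self.enc3(self.pool(e2))     # 1/4 resolution
        d2 = self.dec2(self.up(e3))       # back to 1/2 resolution
        d1 = self.dec1(self.up(d2))       # back to full resolution
        return self.head(d1)

# Regression (depth): train the raw output with nn.L1Loss or nn.MSELoss.
# Classification (free space): train the raw output with nn.BCEWithLogitsLoss.
```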
APA, Harvard, Vancouver, ISO, and other styles
37

Benetka, Miroslav. "Modul digitálního signálového procesoru pro ruční RFID čtečku." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2008. http://www.nusl.cz/ntk/nusl-217543.

Full text
Abstract:
This diploma thesis deals with the design and realization of a digital signal processor module for a handheld RFID reader working in the UHF band. It utilises the special chip EM4298 for RFID signal processing. The module is controlled by an ATmega32L microcontroller, which communicates with a PC through the USB bus. The EM4298 is configured by a service program, which also processes the identification data received from tags. The source code for the microcontroller is created in AVR Studio 4.13, while the PC service program is created in C++ Builder 6.0. The thesis further covers the design and realization of an analog interface and a UHF transceiver for wireless communication with tags. The Webench program, freely available on the internet, was used for the analog interface design, and PSpice 10.0 was used to verify the parameters of the analog interface. The UHF transceiver is built around a MAX2903 chip (transmitter), an AD8347 chip (receiver), and transmitting and receiving antennas.
APA, Harvard, Vancouver, ISO, and other styles
38

Balestri, Roberto. "Intelligenza artificiale e industrie culturali storia, tecnologie e potenzialità dell’ia nella produzione cinematografica." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2022. http://amslaurea.unibo.it/25176/.

Full text
Abstract:
In recent years we have been witnessing, in a wide range of fields, an ever broader use of technologies based on what is commonly called "artificial intelligence". The audiovisual sector, which has always been receptive to novelty and inclined to evolve continuously, is already experiencing the processes that will lead it to be revolutionized by this kind of technology. In a period of frenetic scientific progress it is difficult to pin down, in time and on paper, the current state of technological development, since what is considered new today may already be superseded tomorrow. A tool is therefore needed that can catalogue, if not all, at least the most important revolutionary artificial-intelligence technologies that have swept through the world of artistic production and the cultural industries. An in-depth study is devoted, in particular, to the film industry. After a brief historical introduction, the main types of artificial neural network and their evolution are described. The principal AI technologies applied to the automatic or assisted processing, understanding and generation of images and text are then outlined and described. In even greater detail, a number of technological solutions are examined that concern the various phases of the film production process, such as scriptwriting and script analysis, editing and video assembly, as well as the implementation of visual effects and music composition. The text is thus, on the one hand, a snapshot of the past development of AI technologies and, on the other, a tool that illustrates the present, so as to help us, if not to predict, at least not to find ourselves completely unprepared in the face of the future developments that will affect both audiovisual production and, more broadly, our everyday lives.
APA, Harvard, Vancouver, ISO, and other styles
39

Chang, Hao-Ming, and 張浩銘. "Implementation of AES encoder/decoder with FPGA." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/21005168836243667774.

Full text
Abstract:
Master's thesis
Lunghwa University of Science and Technology
Institute of Electronic Engineering
96
In this thesis, we report the implementation of an AES encoder/decoder with an Altera FPGA. The theoretical background, data flow, round-key transformations/generation, and the encoding/decoding process are first reviewed, and the corresponding circuit architectures are introduced subsequently. Simulation and synthesis are performed on an Altera Stratix II EPS60F1020C5 device, and the figures of merit are as follows: the highest clock rate is 90.03 MHz, and the latencies for AES-128, AES-192 and AES-256 are 21, 24 and 29 clock cycles, respectively. The corresponding throughputs are 548.75 Mbps, 480.16 Mbps and 397.37 Mbps. The three options of AES-128, -192 and -256 encoding/decoding are integrated in our module so that it can meet the needs of modern broadband communications.
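The quoted throughputs are consistent with processing one 128-bit block every `latency` clock cycles at the stated 90.03 MHz clock, which can be checked directly:

```python
# throughput = f_clk * 128-bit block / latency (in clock cycles)
f_clk = 90.03e6                      # highest clock rate reported, in Hz
for name, latency in [("AES-128", 21), ("AES-192", 24), ("AES-256", 29)]:
    mbps = f_clk * 128 / latency / 1e6
    print(f"{name}: {mbps:.2f} Mbps")   # 548.75, 480.16, 397.37
```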
APA, Harvard, Vancouver, ISO, and other styles
40

Su, Chuan-Ming, and 蘇筌銘. "The CAVLC Encoder/Decoder for H.264." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/70347252506009426331.

Full text
Abstract:
Master's thesis
Southern Taiwan University of Science and Technology
Department of Electronic Engineering
94
The representation of audio, image and video signals involves a vast amount of data, so signal compression is indispensable. Recently, variable-length coding (VLC) has been widely used in multimedia and lossless compression standards such as JPEG and MPEG, and it is also an important part of the MPEG-4 video compression standard. The CAVLC (context-adaptive variable-length coding) in the H.264 standard (also called MPEG-4 Part 10) is an adaptive VLC. This thesis describes a low-memory, class-based CAVLC encoder/decoder algorithm for H.264. The computational complexity of the proposed CAVLC is quite low and its memory requirement is small; hence, it is easily implemented in VLSI. The CAVLC architecture has been synthesized on Quartus II (FPGA software). The CAVLC encoder/decoder requires 4144 logic elements and its clock frequency is 12 MHz.
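For readers unfamiliar with CAVLC, the first step of encoding a zig-zag-scanned 4x4 block is to extract the TotalCoeffs and TrailingOnes syntax elements, which then select the variable-length table for the coeff_token codeword. A minimal sketch of that step (ours, not the thesis's implementation):

```python
def cavlc_stats(zigzag_coeffs):
    """Compute two of the CAVLC syntax elements for a zig-zag-scanned 4x4 block:
    TotalCoeffs (number of non-zero coefficients) and TrailingOnes (up to three
    trailing +/-1 coefficients at the high-frequency end of the scan)."""
    nonzero = [c for c in zigzag_coeffs if c != 0]
    total_coeffs = len(nonzero)
    trailing_ones = 0
    for c in reversed(nonzero):            # walk back from the last non-zero level
        if abs(c) == 1 and trailing_ones < 3:
            trailing_ones += 1
        else:
            break
    return total_coeffs, trailing_ones

# Example block often used in CAVLC tutorials, already in zig-zag order:
print(cavlc_stats([0, 3, 0, 1, -1, -1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]))  # -> (5, 3)
```

The scheme is context-adaptive because the table used to code this (TotalCoeffs, TrailingOnes) pair is itself chosen according to the number of non-zero coefficients in the neighbouring blocks.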
APA, Harvard, Vancouver, ISO, and other styles
41

Lei, Chao-Sheng, and 雷朝聖. "The VLC encoder/decoder for MPEG-4." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/79750329993411057859.

Full text
Abstract:
Master's thesis
Southern Taiwan University of Science and Technology
Department of Electronic Engineering
92
The representation of audio, image and video signals involves a vast amount of data, so signal compression is indispensable. Recently, variable-length coding (VLC) has been widely used in multimedia and lossless compression such as JPEG and MPEG. This thesis describes an efficient class-based VLC encoder/decoder and its VLSI architecture for MPEG-4. It takes advantage of logic optimization techniques and achieves high throughput. The computational complexity of the proposed VLC is quite low and its memory requirement is small; hence, it is easily implemented in VLSI and very suitable for real-time MPEG-4 applications. The VLC architecture has been synthesized with Synopsys Design Compiler using standard cells from the TSMC 0.35-μm cell library. Finally, the layout for the design was generated with the Avant! tool Apollo (for floorplan, placement and routing). The gate counts of the VLC encoder and decoder are 9624 and 9486, respectively. In simulation, its clock frequency reaches 50 MHz.
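The abstract does not spell out the classes used, but class-based VLC schemes of this kind typically follow the JPEG pattern of coding each value as a size class (sent with a short VLC) followed by that many raw amplitude bits. A small illustrative sketch of that pattern, assuming the JPEG size-category convention rather than the thesis's exact class definitions:

```python
def size_category(value):
    """Return the JPEG-style size class of a coefficient: the number of bits
    needed to represent |value| (class 0 is reserved for the value 0)."""
    size, mag = 0, abs(value)
    while mag:
        size += 1
        mag >>= 1
    return size

def amplitude_bits(value, size):
    """Extra bits appended after the class code; negative values use the
    one's-complement convention of JPEG."""
    if size == 0:
        return ""
    if value < 0:
        value += (1 << size) - 1
    return format(value, f"0{size}b")

# A class-based encoder emits a short VLC for the class, then `size` raw bits:
for v in (0, 3, -3, 12):
    s = size_category(v)
    print(v, "-> class", s, "+ bits", amplitude_bits(v, s))
```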
APA, Harvard, Vancouver, ISO, and other styles
42

Shao, Chi-Yung, and 邵志勇. "An Efficient Class-Based VLC Encoder/Decoder." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/02310155774295126365.

Full text
Abstract:
Master's thesis
Southern Taiwan University of Science and Technology
Department of Electronic Engineering
90
The representation of audio, image and video signals involves a vast amount of data, so signal compression is indispensable. Recently, variable-length coding (VLC) has been widely used in multimedia and lossless compression. This thesis describes an efficient class-based VLC encoder/decoder and its VLSI architecture. It takes advantage of logic optimization techniques and achieves high throughput. In addition, it requires a smaller memory size and is very suitable for many international image standards, such as JPEG, H.261 and MPEG. The computational complexity of the VLC is quite low and its memory requirement is small; hence, it is easily implemented in VLSI and very suitable for real-time applications. An efficient VLSI architecture for the VLC is designed and implemented with an FPGA chip and synthesized with standard cells from TSMC's 0.35-μm cell library. The chip areas are about 1645.2 μm x 1645.2 μm and 1646 μm x 1646 μm for the encoder and the decoder, respectively. It can achieve an encoding/decoding rate of about 50 mega symbols/sec.
APA, Harvard, Vancouver, ISO, and other styles
43

Chen, Jung-Yu, and 陳榮裕. "Study of Parallel Encoder and Decoder of BCH." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/38970291157601272195.

Full text
Abstract:
Master's thesis
National Kaohsiung First University of Science and Technology
Institute of Computer and Communication Engineering
98
With the development of error control coding, from the early Hamming codes and Reed-Solomon codes to the currently popular BCH codes, error control coding has become a mature technology. A conventional serial BCH codec must wait until the full code length has been received before encoding or decoding can start, which wastes considerable time in a parallel transmission structure. This thesis therefore develops a parallel architecture for BCH codes and implements it in hardware, so as to meet the demand for immediate multi-bit transfer and the high-speed operation of NAND flash memories. The (7,4) BCH code, capable of correcting one bit, is used as the prototype for the parallel realization, and the flow of every bit is observed and recorded. On the encoding side, the LFSR encoder is parallelized; on the decoding side, a parallel syndrome circuit is derived first, and the syndrome vector is then used to locate and correct the erroneous bits. We observe and record the bit flow of the serial (7,4) BCH code and simulate the parallel BCH architecture on a Xilinx platform. The experimental results show that the 4-bit parallel encoder is 4 times faster than the conventional serial design at only 1.667 times the cost, and that the 7-bit parallel decoder is 7 times faster than the serial design at only 1.276 times the cost. Parallel encoding reaches 50 Mb/s and parallel decoding 10 Mb/s.
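The (7,4) code used as the prototype is small enough to work through in a few lines. The sketch below is illustrative and mirrors the serial behaviour rather than the thesis's parallel circuits: it performs the systematic encoding that an LFSR realizes as polynomial division by g(x) = x^3 + x + 1, and corrects a single error from the syndrome.

```python
G = 0b1011   # generator polynomial x^3 + x + 1 of the single-error-correcting (7,4) code

def encode_7_4(msg4):
    """Systematic (7,4) encoding: codeword = message shifted left by 3 plus the
    remainder modulo g(x), i.e. the operation an LFSR encoder performs serially."""
    reg = msg4 << 3
    for i in range(6, 2, -1):            # long division by g(x) over GF(2)
        if reg & (1 << i):
            reg ^= G << (i - 3)
    return (msg4 << 3) | reg             # 4 message bits followed by 3 parity bits

def syndrome(word7):
    """Remainder of the received word modulo g(x); zero means no detectable error."""
    reg = word7
    for i in range(6, 2, -1):
        if reg & (1 << i):
            reg ^= G << (i - 3)
    return reg

def correct(word7):
    """Single-error correction: the syndrome of a one-bit error at position i equals
    the syndrome of x^i, so a small lookup maps syndromes back to error positions."""
    table = {syndrome(1 << i): 1 << i for i in range(7)}
    return word7 ^ table.get(syndrome(word7), 0)

cw = encode_7_4(0b1101)
corrupted = cw ^ (1 << 4)                # flip one bit
assert correct(corrupted) == cw
```

The idea behind the parallel version studied in the thesis is to absorb several of these bits per clock instead of one, at the cost of wider combinational logic.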
APA, Harvard, Vancouver, ISO, and other styles
44

Kuo, Shih-Shan, and 郭仕杉. "A Chip Design for TCM Encoder/Decoder system." Thesis, 1994. http://ndltd.ncl.edu.tw/handle/84882337170197028063.

Full text
Abstract:
Master's thesis
National Cheng Kung University
Institute of Electrical Engineering
82
Trellis-coded modulation (TCM) is a scheme combining coding and modulation. It can achieve significant coding gains without increasing the transmitted power or the required bandwidth, but the complexity of the Viterbi decoder increases correspondingly. For the TCM encoder proposed in this thesis, there are eight paths in all parallel transitions of the four-state trellis diagram. The VLSI architecture proposed in this thesis has the advantage of the table-lookup (TL) method in that it needs only one clock cycle to decide the minimum-branch-metric path among the parallel transitions and finish the branch metric calculation; on the other hand, it needs much smaller area than the TL method. The chip has two working modes, encoding and decoding. A two-phase pipelining architecture implementing both encoding and decoding improves the speed performance of the chip. We used the CCL standard cells provided by CIC (Chip Implementation Center) to finish the chip design on CADENCE OPUS3. The chip was fabricated using TSMC 0.8 μm technology. The chip size is 0.43*0.46 cm*cm, it contains 48 pads, and the gate count is 15440. The working frequency is 30 MHz and the data rate is 13.3 Mbits/s.
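The operation completed in a single clock here, choosing the closest signal point among the parallel transitions of a branch and taking its distance as the branch metric, can be illustrated as follows. This is a generic sketch with a hypothetical signal subset, not the chip's actual constellation mapping.

```python
def best_parallel_transition(received, candidates):
    """Among the signal points labelling the parallel transitions between one pair
    of trellis states, pick the one closest to the received point (squared Euclidean
    distance); its distance becomes the branch metric used by the Viterbi decoder."""
    def metric(point):
        return (received[0] - point[0]) ** 2 + (received[1] - point[1]) ** 2
    best = min(candidates, key=metric)
    return best, metric(best)

# Hypothetical constellation subset assigned to one trellis branch:
subset = [(1, 1), (-1, -1), (1, -1), (-1, 1)]
print(best_parallel_transition((0.9, 0.7), subset))   # -> ((1, 1), ~0.10)
```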
APA, Harvard, Vancouver, ISO, and other styles
45

Lee, Yu-jen, and 李友仁. "Implementation of MPEG-4 Video Encoder/Decoder on Microprocessors." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/54437330710388387590.

Full text
Abstract:
Master's thesis
National Sun Yat-sen University
Department of Computer Science and Engineering
92
Digital image data requires a large compression ratio in applications such as the internet, communication and audio-visual environments. In this thesis, we realize the MPEG-4 codec standard on an ARM9-based platform and improve the execution performance through efficient implementations of the core operations such as motion estimation and the DCT. In the assembly code obtained by directly compiling the C code, there exists a lot of redundant checking, which wastes a large amount of execution time. We rewrite some of the compiled assembly code to improve the execution efficiency using a variety of techniques such as loop unrolling and data-type optimization. We also analyze the experimental results using several benchmark video sequences with different modes.
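Motion estimation, one of the core operations mentioned, is dominated by the sum-of-absolute-differences (SAD) kernel inside a block search; it is this kind of loop that benefits from rewriting in assembly with loop unrolling. A plain reference version is sketched below (illustrative only, not the thesis's code).

```python
import numpy as np

def sad(block, candidate):
    """Sum of absolute differences, the inner kernel of block-based motion estimation."""
    return int(np.abs(block.astype(int) - candidate.astype(int)).sum())

def full_search(cur, ref, bx, by, n=16, search=8):
    """Exhaustive full-search motion estimation for the n x n block at (bx, by):
    try every displacement within +/-search pixels and keep the lowest-SAD vector."""
    block = cur[by:by + n, bx:bx + n]
    best = (0, 0, sad(block, ref[by:by + n, bx:bx + n]))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = by + dy, bx + dx
            if 0 <= y and y + n <= ref.shape[0] and 0 <= x and x + n <= ref.shape[1]:
                cost = sad(block, ref[y:y + n, x:x + n])
                if cost < best[2]:
                    best = (dx, dy, cost)
    return best   # (mvx, mvy, SAD)
```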
APA, Harvard, Vancouver, ISO, and other styles
46

Chen, Jia-Wei, and 陳嘉偉. "Power Efficient H.264 Video Encoder/Decoder Chip Design." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/41858166558877613536.

Full text
Abstract:
Doctoral dissertation
National Chung Cheng University
Institute of Electrical Engineering
100
With the advances in video coding algorithms, battery-operated devices face ever-increasing computational complexity and, consequently, power consumption. In this dissertation, several design techniques with a low-operational-voltage scheme are proposed to realize power-efficient H.264 video coding designs for battery-operated devices. The proposed techniques, which include a quality-adjustable search algorithm, an energy-efficient CMOS scheme, and a multiple-power-domain CMOS scheme, reduce the operational voltage to cut power consumption while providing better processing performance at low operational voltage. In addition, this dissertation provides optimizations at the algorithmic, architectural, and logic levels to reduce memory bandwidth, hardware cost, and computational complexity, achieving the design goal of power-efficient H.264 video design. Using these techniques, this dissertation proposes three power-efficient designs for different battery-driven device applications. First, to achieve high throughput rates, low power consumption, and power-aware features, we propose a dynamic quality-adjustable H.264 baseline profile (BP) video encoder that achieves real-time H.264 video encoding of CIF, D1, and HD720 at 30 fps with 7 mW-to-25 mW, 27 mW-to-162 mW, and 122 mW-to-183 mW power dissipation in different quality modes; we also propose a dynamic quality-adjustable H.264 intra coder that encodes H.264 intra video sequences at D1, HD720 and HD1080 with 10 mW to 16 mW, 27 mW to 45 mW, and 60 mW power consumption under different quality modes, respectively. Second, for low operational voltage and high processing performance, a test chip supporting low-voltage (LV) H.264/AVC high profile (HP) video decoding with the MBAFF coding tool is fabricated in a 90 nm CMOS technology. It delivers a maximum throughput of D1@35fps at 0.5 V, which outperforms a 65 nm video design at 0.5 V by a 28x improvement in throughput, and provides a minimum energy consumption of 280 pJ/pixel at 0.5 V compared with state-of-the-art H.264 video decoders. Finally, we propose a 0.4-V ultra-low-energy (ULE) H.264 image/video encoder to achieve high throughput rates and low power for wireless capsule endoscope (WCE) applications. Designed in a 65 nm CMOS technology at a supply voltage of 0.4 V, the proposed design achieves a 2.5x improvement in processing throughput with 0.0196 mW. Furthermore, its energy consumption is 14.2 pJ/pixel, an order-of-magnitude reduction compared with state-of-the-art implementations. We also realize the H.264 video encoder with the same design concept, featuring low voltage and low power. Compared with previous compressors used in WCE applications, the proposed video encoder obtains an 18% reduction in energy consumption at 0.4 V.
APA, Harvard, Vancouver, ISO, and other styles
47

Cheng, Wei-Cheng, and 程偉政. "Bus Arbiters of the VC-1 Encoder and Decoder." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/89780672286158088262.

Full text
Abstract:
Master's thesis
National Chung Cheng University
Institute of Electrical Engineering
97
In this dissertation, bus arbiters for Video Codec 1 (VC-1) are successfully implemented on a field-programmable gate array (FPGA) platform. In particular, there are three versions of the bus arbiter for the VC-1 decoder and one version for the VC-1 encoder. First, the data flows, memory organization and timing arrangement of the bus arbiter are introduced in detail, where the addressing mechanism of the bus arbiter depends on the memory types. Second, the implementation of the bus arbiters is clearly illustrated: the Verilog code of the bus arbiters is synthesized with the Xilinx synthesis tool and then simulated with ModelSim to verify its functionality. Third, the proposed bus arbiters integrated with the other modules of the VC-1 codec at different profiles are demonstrated, and the total equivalent gate counts of the four versions of the bus arbiters are explored. The practical results show that the proposed bus arbiters can effectively and correctly handle data access and manipulation for the VC-1 codec in various multimedia applications.
APA, Harvard, Vancouver, ISO, and other styles
48

Song, Wu Chin, and 吳錦松. "Design of an Encoder/Decoder of Reed Solomon code." Thesis, 1995. http://ndltd.ncl.edu.tw/handle/88182428915938709677.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

Banakar, Rajeshwari M. "Low power design methodology for turbo encoder and decoder." Thesis, 2004. http://localhost:8080/xmlui/handle/12345678/5557.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Liao, Wen-Hsien, and 廖文賢. "Real-time Implementation for H.264 CAVLC Encoder and Decoder." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/65547232605630931189.

Full text
Abstract:
Master's thesis
National Kaohsiung First University of Science and Technology
Institute of Computer and Communication Engineering
95
In this thesis, we design several IPs for the H.264/AVC codec, especially for intra-mode coding. In the encoder, the main modules, namely the discrete cosine transform (DCT), quantizer, zigzag scan converter, CAVLC encoder and packer, are designed. With module-by-module connection, the circuit can read 4 pixels per cycle and compress them into the video bit-stream in real time. For the CAVLC encoder, we design a parallel architecture that can process a 4x4 block within 16 cycles. For the VLC codebook, equations are used rather than ROM tables to reduce the ROM size. The simulations check each module's output step by step to verify its function. The encoder uses about 30K gates in a cell-based design, and its operation speed can achieve 52 MHz. In the decoder, the design flow consists of the corresponding inverse modules of the encoder; we also include a timing control module and an input-buffer control module. With the parallel structure, the CAVLC decoder can decode one codeword per cycle, and condition equations are again used rather than ROM tables. The decoder is implemented with 63K gates, and its operation speed can achieve 40 MHz. The output of the encoder is sent to the decoder for whole-system simulation. The decoder reconstructs the original block in real time, and the results meet our expectations.
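The zigzag scan converter mentioned above reorders each 4x4 coefficient block before CAVLC coding. A minimal sketch of that reordering, using the standard H.264 4x4 frame-scan order (illustrative, not the thesis's hardware):

```python
# H.264 zig-zag scan order for a 4x4 block of transform coefficients (frame scan).
ZIGZAG_4x4 = [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2), (0, 3), (1, 2),
              (2, 1), (3, 0), (3, 1), (2, 2), (1, 3), (2, 3), (3, 2), (3, 3)]

def zigzag_scan(block4x4):
    """Reorder a 4x4 coefficient block into the 1-D sequence fed to the CAVLC encoder,
    so that low-frequency coefficients come first and trailing zeros cluster at the end."""
    return [block4x4[r][c] for r, c in ZIGZAG_4x4]

block = [[5, 3, 0, 0],
         [2, 1, 0, 0],
         [1, 0, 0, 0],
         [0, 0, 0, 0]]
print(zigzag_scan(block))   # [5, 3, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```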
APA, Harvard, Vancouver, ISO, and other styles