
Dissertations / Theses on the topic 'Transformée en cosinus discrète (DCT)'


Consult the top 45 dissertations / theses for your research on the topic 'Transformée en cosinus discrète (DCT).'


1

Qiu, Han. "Une architecture de protection des données efficace basée sur la fragmentation et le cryptage." Electronic Thesis or Diss., Paris, ENST, 2017. http://www.theses.fr/2017ENST0049.

Full text
Abstract:
In this thesis, a completely revisited data protection scheme based on selective encryption is presented. First, this new scheme is agnostic in terms of data format; second, it has a parallel architecture using a GPGPU, allowing performance at least comparable to that of full encryption algorithms. Bitmap, as an uncompressed multimedia format, is addressed as a first use case. The Discrete Cosine Transform (DCT) is the first transformation used for splitting fragments, protecting the data, and storing the fragments separately on a local device and on cloud servers. This work largely improves on previously published results for bitmap protection by providing new designs and practical experimentation. A general-purpose graphics processing unit (GPGPU) is exploited as an accelerator to guarantee computational efficiency compared with traditional full encryption algorithms (such as AES). Then, an agnostic selective encryption scheme based on the lossless Discrete Wavelet Transform (DWT) is presented. This design, with practical experimentation on different hardware configurations, provides both a strong level of protection and good performance, together with flexible storage dispersion schemes. Our agnostic data protection and transmission solution combining fragmentation, encryption, and dispersion is thus applicable to a wide range of end-user applications. Finally, a complete set of security analyses is deployed to verify the level of protection provided, even for the least-protected fragments.
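The fragmentation step the abstract describes, splitting DCT coefficients into a small sensitive fragment and a larger public one, can be sketched as follows. This is a minimal illustration under assumed parameters (8 × 8 blocks, a k × k low-frequency private zone); the thesis's actual architecture, GPGPU implementation and encryption step are not reproduced here:

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II matrix.
    u = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

def fragment_block(block, k=2):
    # Split the DCT coefficients of an 8x8 block into a small "private"
    # low-frequency fragment (to be encrypted and kept on trusted storage)
    # and a "public" remainder (dispersed to untrusted storage).
    C = dct_matrix(block.shape[0])
    d = C @ block @ C.T
    private = d[:k, :k].copy()
    public = d.copy()
    public[:k, :k] = 0.0
    return private, public

def reassemble(private, public):
    # Recombining both fragments inverts the transform exactly.
    k = private.shape[0]
    d = public.copy()
    d[:k, :k] = private
    C = dct_matrix(d.shape[0])
    return C.T @ d @ C

rng = np.random.default_rng(0)
block = rng.uniform(0, 255, (8, 8))
private, public = fragment_block(block, k=2)
restored = reassemble(private, public)
```

With k = 2, only 4 of the 64 coefficients per block need strong protection, and the round trip is lossless once the fragments are recombined.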
APA, Harvard, Vancouver, ISO, and other styles
2

Makkaoui, Leila. "Compression d'images dans les réseaux de capteurs sans fil." Phd thesis, Université de Lorraine, 2012. http://tel.archives-ouvertes.fr/tel-00795503.

Full text
Abstract:
This thesis is a contribution to the problem of energy conservation in the specific case of image sensor networks, where some or even all of the network nodes are equipped with a small CMOS camera. Images involve data volumes far greater than classical scalar measurements such as temperature, and therefore higher energy expenditure. Since the radio transmitter is one of the most energy-hungry components, compressing the image at the source can clearly reduce the energy spent on image transmission, both at the camera node and at the nodes along the path to the sink. However, well-known compression methods (JPEG, JPEG2000, SPIHT) are poorly suited to the limited computing and memory resources characteristic of sensor nodes. On some hardware platforms, these algorithms even have an energy cost greater than the savings they bring on transmission; in other words, the camera node drains its battery faster by sending compressed images than uncompressed ones. The complexity of the compression algorithm is therefore a performance criterion as important as the rate-distortion trade-off. The contributions of this thesis are threefold:
- First, we proposed a reduced-complexity compression algorithm based on the 8-point Discrete Cosine Transform (DCT), combining the most efficient fast DCT method in the literature (the Cordic-Loeffler DCT) with a computation restricted to the coefficients inside a square zone of size k < 8, which matter most for visual reconstruction. With this zonal approach, the number of coefficients to compute, and also to quantize and encode, per 8x8-pixel block is reduced to k^2 instead of 64, which mechanically lowers the cost of compression.
- We then studied the impact of k, i.e. of the number of selected coefficients, on the quality of the final image. The study was carried out on a set of some sixty reference images, with image quality assessed using several metrics: PSNR, PSNR-HVS and MSSIM. The results were used to identify, for a given bit rate, the limit value of k that can (statistically) be chosen without perceptible quality degradation, and consequently the bounds on energy-consumption reduction at constant rate and quality.
- Finally, we report the performance results of experiments on a real platform composed of a Mica2 node and a Cyclops camera, to demonstrate the validity of our proposals. In a scenario with 128x128-pixel images encoded at 0.5 bpp, for example, the energy expenditure of the camera node (including compression and transmission) is divided by 6 compared with the uncompressed case, and by 2 compared with the standard JPEG algorithm.
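The zonal approach of the first contribution can be sketched as follows. A plain matrix DCT is used for clarity; the thesis relies on the fast Cordic-Loeffler DCT, which computes the same coefficients at lower cost:

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II matrix.
    u = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

def zonal_dct(block, k):
    # 2-D DCT of an 8x8 block, keeping only the k x k low-frequency zone:
    # k*k coefficients to quantize and encode instead of 64.
    C = dct_matrix(block.shape[0])
    d = C @ block @ C.T
    zone = np.zeros_like(d)
    zone[:k, :k] = d[:k, :k]
    return zone

def inverse_dct(zone):
    C = dct_matrix(zone.shape[0])
    return C.T @ zone @ C

rng = np.random.default_rng(0)
block = rng.uniform(0, 255, (8, 8))
approx = inverse_dct(zonal_dct(block, 4))   # zonal reconstruction, k = 4
exact = inverse_dct(zonal_dct(block, 8))    # k = 8 keeps all 64 coefficients
```

In a real encoder only the k × k coefficients would be computed in the first place (the zeroed array above just makes the selection explicit), which is where the complexity saving comes from.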
3

Makkaoui, Leila. "Compression d'images dans les réseaux de capteurs sans fil." Electronic Thesis or Diss., Université de Lorraine, 2012. http://www.theses.fr/2012LORR0416.

Full text
Abstract:
The increasing development of wireless camera sensor networks today allows a wide variety of applications with different objectives and constraints. However, the common problem of all sensor network applications remains the vulnerability of sensor nodes due to their limited hardware resources, the most restrictive being energy. Indeed, the wireless technologies available in this type of network are usually low-power and short-range, and the hardware resources (CPU, battery) are also low-power. We must therefore meet a twofold objective: an energy-efficient solution that still delivers good image quality at the receiver. This thesis concentrates mainly on the study and evaluation of compression methods dedicated to transmission over wireless camera sensor networks. We propose a new image compression method that decreases the energy consumption of sensors and thus extends the network lifetime. We evaluate its implementation through experiments on real camera sensor platforms, measuring aspects such as the amount of memory required by the software implementation of our algorithms, their energy consumption, and their execution time. We then study the hardware characteristics of the proposed compression chain when synthesized on FPGA and ASIC chip prototypes.
4

Auclair, Beaudry Jean-Sébastien. "Modelage de contexte simplifié pour la compression basée sur la transformée en cosinus discrète." Mémoire, Université de Sherbrooke, 2009. http://savoirs.usherbrooke.ca/handle/11143/1511.

Full text
Abstract:
The growing shortage of medical specialists outside large urban centres negatively affects the quality of care that patients receive. One possible solution to this problem is the supervision of general practitioners in remote regions by specialists available in large centres. This remote supervision requires the development of technologies meeting its specific needs. Within this research project, image transmission is considered. With a view to developing a suitable video codec for this application in the future, the intra-frame codec is studied. More precisely, the goal is to simplify the AGU codec [Ponomarenko et al., 2005] and make it parallelizable without reducing its performance below that of JPEG2000 [Skodras et al., 2001]. These improvements ease the hardware realization of the codec by reducing the latency that is so critical for remote-supervision applications. To achieve these objectives, the context modeling of the AGU codec must be modified. The proposed methodology involves implementing the AGU codec, studying the data source, and modifying the context modeling. The modification in question is replacing an adaptive method based on a tree of conditions with a neural network. At the end of this research, the neural network used as a context modeler proves to be a success. A structure with nine inputs and no hidden layer is used, making the context-modeling operation almost trivial while keeping performance better than JPEG2000 on average; performance is below JPEG2000 for only one of the five test images. In the future, this intra-frame codec could be further improved through a better neural network or a different transform. It would also be worthwhile to study how to evolve the codec into an inter-frame codec.
5

Eude, Thierry. "Compression d'images médicales pour la transmission et l'archivage, par la transformée en cosinus discrète." Rouen, 1993. http://www.theses.fr/1993ROUES056.

Full text
Abstract:
In hospitals, imaging plays an increasingly important role, but the amount of data that images represent raises serious concerns for digital archiving and transmission. One answer is to compress these images. Many methods exist; the most widely used is based on the Discrete Cosine Transform (DCT), which forms the core of the JPEG still-image compression standard. The work presented in this manuscript consists in devising tools that can be integrated into this standard and adapted specifically to medical images. An extensive statistical study was therefore carried out to determine the distributions followed by the DCT coefficients of these images. The results obtained are then used to adapt the compression.
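As a generic illustration of this kind of statistical study (not the distributions actually fitted in the thesis), AC coefficients of block DCTs are often modelled as zero-mean Laplacian, whose scale parameter has a simple maximum-likelihood estimate:

```python
import numpy as np

def laplacian_scale_mle(samples):
    # For a zero-mean Laplacian, the maximum-likelihood estimate of the
    # scale b is simply the mean absolute value of the samples.
    return float(np.mean(np.abs(samples)))

rng = np.random.default_rng(0)
true_b = 4.0
# Synthetic stand-in for one AC coefficient collected over many blocks.
coeffs = rng.laplace(loc=0.0, scale=true_b, size=100_000)
b_hat = laplacian_scale_mle(coeffs)
```

Estimates like `b_hat` can then inform quantizer design, which is one sense in which coefficient statistics can "adapt the compression".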
6

Dugas, Alexandre. "Architecture de transformée de cosinus discrète sur deux dimensions sans multiplication et mémoire de transposition." Mémoire, Université de Sherbrooke, 2012. http://hdl.handle.net/11143/6174.

Full text
Abstract:
Over the past ten years, video transmission technology has made a wide range of telehealth applications possible. This medium allows specialist physicians to take part in medical procedures occurring at distant locations. However, when these procedures take place far from large centres, the telecommunication infrastructure does not offer enough bandwidth for smooth, good-quality image transmission. One way to address this problem is to use video encoders and decoders (codecs) that compress images before transmission and decompress them on reception. A good number of video codecs exist, offering different trade-offs between image quality, compression speed, initial processing latency, and robustness of the transmission protocol. Unfortunately, none can meet all telehealth requirements simultaneously. One of the major problems lies in the initial processing delay caused by the codec's lossy compression. This research therefore focuses on two codecs that meet the telehealth requirements for processing delay and image quality, and more particularly on a remote-assistance application in an emergency room. Emphasis is placed on the quantization modules of codecs that use the Discrete Cosine Transform. This transform limits smooth, near-instant video transmission because of the initial processing delays caused by the many arithmetic operations it requires. As an outcome of this research, an efficient structure for the cosine transform is proposed to address codec latency and thereby meet telehealth telecommunication requirements. This solution is implemented in a JPEG codec developed in VHDL in order to simulate a realistic application context.
7

Hmida, Hedi. "Étude et comparaison d'algorithmes de transformée en cosinus discrète en vue de leur intégration en VLSI." Paris 11, 1988. http://www.theses.fr/1988PA112133.

Full text
Abstract:
The Discrete Cosine Transform (DCT) is currently being considered for data compression of still images (fax, videotex, etc.) and moving images (TV, videoconferencing, etc.). Fast algorithms for computing this transform exist, but problems related to the structure of these algorithms arise in their VLSI integration. This work is positioned at the algorithm-architecture interface at several levels. On the one hand, the study of two classical DCT algorithms allowed us to select a number of operators specific to fast two-dimensional algorithms. Then, after a general overview of VLSI circuit architectures (serial, parallel, pipelined, systolic, etc.), we propose a set of new arithmetic operators. The novelty lies either in reducing the complexity of classical operators (simplified adder schemes) or in the originality of their function (adder-subtractor, or "butterfly"). We then apply all of these tools to the implementation of classical algorithms, trying to identify the points that help or hinder their VLSI implementation. This results in several architecture proposals, but also (and above all) in an attempt to highlight the problems inherent in these algorithms, which led us to propose several new algorithms correcting the main flaws of the classical ones.
8

Haque, S. M. Rafizul. "Singular Value Decomposition and Discrete Cosine Transform based Image Watermarking." Thesis, Blekinge Tekniska Högskola, Avdelningen för för interaktion och systemdesign, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-5269.

Full text
Abstract:
Rapid evolution of digital technology has improved the ease of access to digital information, enabling reliable, faster and more efficient storage, transfer and processing of digital data. It also has the consequence of making the illegal production and redistribution of digital media easy and undetectable. Hence, the risk of copyright violation of multimedia data has increased due to the enormous growth of computer networks, which provide fast and error-free transmission of any unauthorized, duplicated and possibly manipulated copy of multimedia information. One possible solution is to embed into the image a secondary signal or pattern that is not perceivable and is mixed so well with the original digital data that it is inseparable and remains unaffected by any kind of multimedia signal processing. This embedded secondary information is the digital watermark: in general, a visible or invisible identification code that may contain information about the intended recipient, the lawful owner or author of the original data, its copyright, and so on, in the form of textual data or an image. To be effective for copyright protection, digital watermarks must be robust, that is, difficult to remove from the object in which they are embedded despite a variety of possible attacks. Several types of watermarking algorithms have been developed so far, each with its own advantages and limitations. Among these, Singular Value Decomposition (SVD) based watermarking algorithms have recently attracted researchers due to their simplicity and some attractive mathematical properties of the SVD. Here, a number of pure and hybrid SVD-based watermarking schemes have been investigated, and finally an RST-invariant algorithm based on a modified SVD and the Discrete Cosine Transform (DCT) has been developed. A preprocessing step before watermark extraction is proposed which makes the algorithm resilient to geometric (RST) attacks.
The performance of this watermarking scheme has been analyzed by evaluating the robustness of the algorithm against geometric attacks, including rotation, scaling and translation (RST), and several other attacks. Experimental results have been compared with an existing algorithm and appear promising.
9

Hantehzadeh, Neda. "3-D Face Recognition using the Discrete Cosine Transform (DCT)." Available to subscribers only, 2009. http://proquest.umi.com/pqdweb?did=1964658571&sid=3&Fmt=2&clientId=1509&RQT=309&VName=PQD.

Full text
10

Aimer, Younes. "Étude des performances d'un système de communication sans fil à haut débit." Thesis, Poitiers, 2019. http://www.theses.fr/2019POIT2269.

Full text
Abstract:
The demand from users in terms of data rate, coverage and quality of service is growing exponentially, along with an increasing demand for electrical energy to sustain network links. In this context, new waveforms based on OFDM modulation have become widely popular and are used intensively in recent radio communication architectures. However, these signals are sensitive to power amplifier nonlinearities because of their high envelope fluctuations, characterized by a high PAPR, which degrades the energy consumption and the transmitter efficiency. In this thesis, we first present a state of the art of PAPR reduction techniques. This overview allowed us to propose a new method based on interleaving and coding techniques. The first contribution consists in using the interleaving technique with null subcarriers for transmitting the side information, while respecting the frequency specifications of the standard in use. The second is based on combining the shaping technique with the Discrete Cosine Transform (DCT), with the aim of further improving system performance. Simulation results show that the use of these two techniques yields a significant gain in PAPR reduction, which translates into improved efficiency. Finally, we present an experimental study of the proposed techniques using an RF test bench with a commercial 20 W LDMOS power amplifier operating in class AB at 3.7 GHz. The results obtained for the IEEE 802.11 standards show that the proposed approaches guarantee transmission robustness, improve link quality, and optimize power consumption.
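The PAPR metric targeted by these techniques can be computed as follows; the subcarrier count and QPSK mapping below are illustrative choices, not the thesis's configuration:

```python
import numpy as np

def papr_db(x):
    # Peak-to-average power ratio of a complex baseband signal, in dB.
    p = np.abs(x) ** 2
    return 10.0 * np.log10(p.max() / p.mean())

rng = np.random.default_rng(1)
n = 64                                    # illustrative subcarrier count
qpsk = (rng.choice([-1.0, 1.0], n)
        + 1j * rng.choice([-1.0, 1.0], n)) / np.sqrt(2.0)
ofdm = np.fft.ifft(qpsk) * np.sqrt(n)     # one time-domain OFDM symbol
symbol_papr = papr_db(ofdm)
```

A constant-envelope signal has a PAPR of 0 dB, while summing many independently modulated subcarriers produces the large envelope fluctuations that PAPR-reduction techniques try to tame.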
11

Pagliari, Carla Liberal. "Perspective-view image matching in the DCT domain." Thesis, University of Essex, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.298594.

Full text
12

Maschio, Nicole. "Contribution à la compression d'images numériques par codage prédictif et transformée en cosinus discrète avec utilisation de codes arithmétiques." Nice, 1989. http://www.theses.fr/1989NICE4281.

Full text
Abstract:
Improvement, without degrading image quality, of the efficiency of image compression systems based respectively on predictive coding and on the Discrete Cosine Transform. The signal at the quantizer output is processed.
13

Bhardwaj, Divya Anshu. "Inverse Discrete Cosine Transform by Bit Parallel Implementation and Power Comparision." Thesis, Linköping University, Department of Electrical Engineering, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-2447.

Full text
Abstract:

The goal of this project was to implement and compare the Inverse Discrete Cosine Transform using three methods: bit-parallel, digit-serial and bit-serial. This thesis describes a one-dimensional Inverse Discrete Cosine Transform implemented by the bit-parallel method in a 0.35 µm technology. When implementing the design, several considerations, such as word length, were taken into account. The code was written in VHDL, and some of the calculations were done in MATLAB. The VHDL code was then synthesized using Synopsys Design Analyzer; power was calculated and the results were compared.

14

Urbano, Rodriguez Luis Alberto. "Contribution à la compression d'images par transformée en cosinus discrète en imagerie médicale, et évaluation sur une base d'images multi-modalités." Tours, 1991. http://www.theses.fr/1991TOUR3307.

Full text
15

Faridi, Imran Zafar. "Image Compression Using Bidirectional DCT to Remove Blocking Artifacts." Digital Archive @ GSU, 2005. http://digitalarchive.gsu.edu/cs_theses/9.

Full text
Abstract:
The Discrete Cosine Transform (DCT) is a widely used transform in many areas of the current information age. It is used in signal compression, such as voice and shape recognition, and also in the FBI fingerprint system. The DCT is the standard compression system used in the JPEG format. DCT quality deteriorates at low bit rates, and the deterioration is due to the blocking artifacts inherent in the block DCT. One of the successful attempts to reduce these blocking artifacts was the conversion of the Block-DCT into the Line-DCT. In this thesis we explore the Line-DCT and introduce a new form of Line-DCT called the Bidirectional-DCT, which retains the properties of the Line-DCT while improving computational efficiency. The results obtained in this thesis show a significant reduction in processing time for both one-dimensional and two-dimensional DCTs in comparison with the traditional Block-DCT. The quality analysis also shows that the least-mean-square error is considerably lower than with the traditional Block-DCT, a consequence of removing the blocking artifacts. Finally, unlike the traditional Block-DCT, the Bidirectional-DCT enables compression with very low bit rates and very low blocking artifacts.
16

Al-Gindy, Ahmed M. N. "Design and analysis of Discrete Cosine Transform-based watermarking algorithms for digital images. Development and evaluation of blind Discrete Cosine Transform-based watermarking algorithms for copyright protection of digital images using handwritten signatures and mobile phone numbers." Thesis, University of Bradford, 2011. http://hdl.handle.net/10454/5450.

Full text
Abstract:
This thesis deals with the development and evaluation of blind discrete cosine transform-based watermarking algorithms for copyright protection of digital still images using handwritten signatures and mobile phone numbers. The new algorithms take into account the perceptual capacity of each low-frequency coefficient inside the Discrete Cosine Transform (DCT) blocks before embedding the watermark information. They are suitable for grey-scale and colour images. Handwritten signatures are used instead of pseudo-random numbers. The watermark is inserted in the green channel of RGB colour images and in the luminance channel of YCrCb images. Mobile phone numbers are used as watermarks for images captured by mobile phone cameras. The information is embedded multiple times, and a shuffling scheme is applied to ensure that no spatial correlation exists between the original host image and the multiple watermark copies. Multiple embedding increases the robustness of the watermark against attacks, since each watermark copy is individually reconstructed and verified before an averaging process is applied. The averaging process reduces the number of errors in the extracted information. The developed watermarking methods are shown to be robust against JPEG compression, removal attacks, additive noise, cropping, scaling, small rotations, affine transformations, contrast enhancement, low-pass and median filtering, and StirMark attacks. The algorithms have been examined using a library of approximately 40 colour images of size 512 × 512 with 24 bits per pixel, together with their grey-scale versions. Several evaluation techniques were used in the experiments with different watermarking strengths and different signature sizes, including the peak signal-to-noise ratio, normalized correlation and structural similarity index measures.
The performance of the proposed algorithms has been compared to that of other algorithms, and better invisibility with stronger robustness has been achieved.
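The core operation the abstract describes, embedding in low-frequency coefficients of 8 × 8 DCT blocks, can be sketched as follows. The coefficient position (2, 1) and strength `alpha` are hypothetical choices, and the perceptual weighting, shuffling and multiple-copy averaging of the actual algorithms are omitted:

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II matrix.
    u = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

C8 = dct_matrix()

def embed_bit(block, bit, alpha=8.0):
    # Push one low-frequency coefficient positive or negative by alpha.
    d = C8 @ block @ C8.T
    d[2, 1] += alpha if bit else -alpha
    return C8.T @ d @ C8

def extract_bit(block):
    # Blind extraction: read back the sign of the marked coefficient.
    d = C8 @ block @ C8.T
    return int(d[2, 1] > 0)

smooth = np.full((8, 8), 128.0)   # smooth block: its AC coefficients are 0
marked = embed_bit(smooth, 1)
```

Sign-based extraction like this only works when `alpha` dominates the host coefficient; the capacity-aware embedding strength described in the abstract addresses exactly that trade-off between robustness and invisibility.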
17

Coudoux, François-Xavier. "Evaluation de la visibilité des effets de blocs dans les images codées par transformée : application à l'amélioration d'images." Valenciennes, 1994. https://ged.uphf.fr/nuxeo/site/esupversions/a0a7cc38-609d-4d86-9c3a-a018590bc012.

Full text
Abstract:
Lossy coding methods based on the Discrete Cosine Transform form the basis of most current digital image compression standards. This type of transform coding requires a prior segmentation of the image into disjoint blocks of pixels. The blocking effect is the main defect of this type of coding: the boundaries between adjacent blocks become visible at high compression ratios. This defect is particularly annoying to the observer and severely affects the visual quality of the reconstructed image. The goal of our study is to propose a method for local detection of these artifacts, together with a measure of their visual importance. This measure, which takes several properties of the human visual system into account, characterizes the degradation introduced into the image by the blocking distortion. It is used to establish a global quality criterion for JPEG-coded images. This criterion quantifies the quality of the reconstructed images by assigning a quality score to the degraded image. A direct application of the block-effect visibility measure concerns the detection and correction of these defects within a block-coded image. We present an original method for reducing blocking effects; it consists of local filtering adapted to the visibility of the blocking artifacts in the image. The correction significantly reduces the most visible defects without degrading the rest of the image. The method is validated on still images coded with JPEG; its immediate extension to the MPEG-1 and MPEG-2 video compression standards remains possible, possibly by taking the temporal properties of vision into account.
A hardware implementation is envisaged in the form of an electronic circuit, which could be used in multimedia consultation terminals to improve the visual quality of the image before final display.
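The kind of visibility-adaptive correction described (filter a block boundary only where the blocking step is visible, leaving textured, masked areas alone) can be sketched in one dimension. The constants, names and masking rule below are illustrative inventions, not the thesis's actual model:

```python
def deblock_row(row, block=8, strength=0.5, t_mask=4.0):
    """Toy 1D deblocking: attenuate the step at each block boundary
    only when it exceeds the local activity plus a visibility
    threshold (a crude stand-in for visual masking)."""
    out = list(row)
    for b in range(block, len(row) - 1, block):
        left, right = row[b - 1], row[b]
        step = right - left
        # activity just inside each block: texture masks the artifact
        act = (abs(row[b - 1] - row[b - 2]) + abs(row[b + 1] - row[b])) / 2
        if abs(step) > act + t_mask:      # boundary step judged visible
            out[b - 1] = left + strength * step / 2
            out[b] = right - strength * step / 2
    return out

flat = [100] * 8 + [110] * 8      # visible 10-level step between two flat blocks
ramp = list(range(16))            # smooth gradient: the step is masked
```

On the flat pair the two boundary pixels move toward each other (102.5 and 107.5), while the ramp passes through unchanged, mimicking the idea of correcting only the visible artifacts.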
APA, Harvard, Vancouver, ISO, and other styles
18

Ahmed, Kamal Ali. "Digital watermarking of still images." Thesis, University of Manchester, 2013. https://www.research.manchester.ac.uk/portal/en/theses/digital-watermarking-of-still-images(0dc4b146-3d97-458f-9506-8c67bc3a155b).html.

Full text
Abstract:
This thesis presents novel research work on copyright protection of grey scale and colour digital images. New blind frequency domain watermarking algorithms using one dimensional and two dimensional Walsh coding were developed. Handwritten signatures and mobile phone numbers were used in this project as watermarks. In this research eight algorithms were developed based on the DCT using 1D and 2D Walsh coding. These algorithms used the low frequency coefficients of the 8 × 8 DCT blocks for embedding. A shuffle process was used in the watermarking algorithms to increase the robustness against the cropping attacks. All algorithms are blind since they do not require the original image. All algorithms caused minimum distortion to the host images and the watermarking is invisible. The watermark is embedded in the green channel of the RGB colour images. The Walsh coded watermark is inserted several times by using the shuffling process to improve its robustness. The effect of changing the Walsh lengths and the scaling strength of the watermark on the robustness and image quality were studied. All algorithms are examined by using several grey scale and colour images of sizes 512 × 512. The fidelity of the images was assessed by using the peak signal to noise ratio (PSNR), the structural similarity index measure (SSIM), normalized correlation (NC) and StirMark benchmark tools. The new algorithms were tested on several grey scale and colour images of different sizes. Evaluation techniques using several tools with different scaling factors have been considered in the thesis to assess the algorithms. Comparisons carried out against other methods of embedding without coding have shown the superiority of the algorithms. The results have shown that use of 1D and 2D Walsh coding with DCT Blocks offers significant improvement in the robustness against JPEG compression and some other image processing operations compared to the method of embedding without coding. 
The originality of the schemes enables them to achieve significant robustness compared to conventional non-coded watermarking methods. The new algorithms offer an optimal trade-off between perceptual distortion caused by embedding and robustness against certain attacks. The new techniques could offer significant advantages to the digital watermark field and provide additional benefits to the copyright protection industry.
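The embedding and blind detection principle (a Walsh codeword added to low-frequency DCT coefficients, recovered by correlation without the original image) can be sketched as follows; the coefficient values and the scaling strength `alpha` are illustrative, not the parameters studied in the thesis:

```python
def walsh_codes(order):
    """Rows of the 2**order Sylvester-Hadamard matrix, used as +/-1 Walsh codewords."""
    h = [[1]]
    for _ in range(order):
        h = [r + r for r in h] + [r + [-v for v in r] for r in h]
    return h

def embed(coeffs, bit, code, alpha=4.0):
    """Additively embed one coded bit into low-frequency DCT coefficients."""
    s = 1 if bit else -1
    return [c + alpha * s * w for c, w in zip(coeffs, code)]

def detect(coeffs, code):
    """Blind detection: the sign of the correlation with the codeword."""
    return sum(c * w for c, w in zip(coeffs, code)) > 0

code = walsh_codes(3)[5]                        # one length-8 codeword
low_freq = [12.0, -7.5, 3.1, 0.4, -1.2, 2.2, 0.0, -5.0]
marked = embed(low_freq, True, code)
```

Detection compares the host-plus-mark correlation against zero; `alpha` must be large enough for the mark term (`alpha` times the code length) to dominate the host correlation, which is exactly the robustness-versus-distortion trade-off studied in the thesis.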
APA, Harvard, Vancouver, ISO, and other styles
19

Nortershauser, David. "Résolution de problèmes inverses tridimensionnels instationnaires de conduction de la chaleur." Toulouse, ENSAE, 2000. http://www.theses.fr/2000ESAE0017.

Full text
Abstract:
The objective of this study is the estimation of transient surface heat exchanges in harsh environments, that is, ones inaccessible to direct measurement, in both linear and non-linear cases. The fields of application are vast: combustion chambers, anti-icing systems, characterization of exothermic chemical reactions. An inverse method is used to solve these problems. An observation equation formed from surface temperature measurements on one face is required to estimate the exchanges. Since these inverse methods are very sensitive to measurement noise, an effective solution-stabilization strategy had to be adopted. For this reason, the use of a Discrete Cosine Transform (DCT) to filter and/or compact the problem data constitutes a key point of the study. Each tool presented is subjected to numerical tests in order to determine its domain of validity. In addition, two laboratory experiments make it possible to assess the main sources of error encountered when confronting the models with reality.
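The DCT's role as a filtering/compaction step can be sketched in one dimension: transform the noisy temperature trace, keep only the leading coefficients, and transform back. This is a minimal illustration assuming an orthonormal DCT-II/DCT-III pair; the thesis's actual stabilization strategy is more elaborate:

```python
import math

def dct(x):
    """Orthonormal DCT-II."""
    N = len(x)
    return [(math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N))
            * sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N) for n in range(N))
            for k in range(N)]

def idct(X):
    """Orthonormal DCT-III, the inverse of dct() above."""
    N = len(X)
    return [sum((math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N))
                * X[k] * math.cos(math.pi * (n + 0.5) * k / N) for k in range(N))
            for n in range(N)]

def dct_lowpass(samples, keep):
    """Keep only the first `keep` coefficients: truncation acts as the
    regularizing filter, `keep` playing the role of a tuning parameter."""
    X = dct(samples)
    return idct(X[:keep] + [0.0] * (len(X) - keep))
```

Choosing `keep` trades noise rejection against bias, which is the classic regularization dilemma of inverse heat conduction problems.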
APA, Harvard, Vancouver, ISO, and other styles
20

Abdallah, Abdallah Sabry. "Investigation of New Techniques for Face detection." Thesis, Virginia Tech, 2007. http://hdl.handle.net/10919/33191.

Full text
Abstract:
The task of detecting human faces within either a still image or a video frame is one of the most popular object detection problems. For the last twenty years researchers have shown great interest in this problem because it is an essential pre-processing stage for computing systems that process human faces as input data. Example applications include face recognition systems, vision systems for autonomous robots, human computer interaction systems (HCI), surveillance systems, biometric based authentication systems, video transmission and video compression systems, and content based image retrieval systems. In this thesis, non-traditional methods are investigated for detecting human faces within color images or video frames. The attempted methods are chosen such that the required computing power and memory consumption are adequate for real-time hardware implementation. First, a standard color image database is introduced in order to accomplish fair evaluation and benchmarking of face detection and skin segmentation approaches. Next, a new pre-processing scheme based on skin segmentation is presented to prepare the input image for feature extraction. The presented pre-processing scheme requires relatively low computing power and memory needs. Then, several feature extraction techniques are evaluated. This thesis introduces feature extraction based on Two Dimensional Discrete Cosine Transform (2D-DCT), Two Dimensional Discrete Wavelet Transform (2D-DWT), geometrical moment invariants, and edge detection. It also attempts to construct a hybrid feature vector by the fusion between 2D-DCT coefficients and edge information, as well as the fusion between 2D-DWT coefficients and geometrical moments. A self organizing map (SOM) based classifier is used within all the experiments to distinguish between facial and non-facial samples. Two strategies are tried to make the final decision from the output of a single SOM or multiple SOM. 
Finally, an FPGA based framework that implements the presented techniques is presented, as well as a partial implementation. Every presented technique has been evaluated consistently using the same dataset. The experiments show very promising results. The highest detection rate of 89.2% was obtained when using a fusion between DCT coefficients and edge information to construct the feature vector. A second highest rate of 88.7% was achieved by using a fusion between DWT coefficients and geometrical moments. Finally, a third highest rate of 85.2% was obtained by calculating the moments of edges.
Master of Science
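The 2D-DCT feature extraction can be sketched as taking the first k coefficients of a transformed block in JPEG-style zig-zag order, so the low-frequency content comes first; the function names and the feature length are illustrative:

```python
def zigzag_indices(n):
    """(row, col) visiting order of the classic zig-zag scan of an n x n block:
    diagonals of constant row+col, alternating direction."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))

def dct_feature(block_dct, k):
    """First k 2D-DCT coefficients in zig-zag order, e.g. as input to a SOM."""
    return [block_dct[r][c] for r, c in zigzag_indices(len(block_dct))[:k]]
```

Because the DCT compacts most of the energy into the upper-left corner of the block, a short zig-zag prefix already carries most of the discriminative information, which keeps the feature vector small.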
APA, Harvard, Vancouver, ISO, and other styles
21

Akhtar, Mahmood Electrical Engineering &amp Telecommunications Faculty of Engineering UNSW. "Genomic sequence processing: gene finding in eukaryotes." Publisher:University of New South Wales. Electrical Engineering & Telecommunications, 2008. http://handle.unsw.edu.au/1959.4/40912.

Full text
Abstract:
Of the many existing eukaryotic gene finding software programs, none are able to guarantee accurate identification of genomic protein coding regions and other biological signals central to the pathway from DNA to protein. Eukaryotic gene finding is difficult mainly due to the noncontiguous and non-continuous nature of genes. Existing approaches are heavily dependent on the compositional statistics of the sequences they learn from and are not equally suitable for all types of sequences. This thesis firstly develops efficient digital signal processing-based methods for the identification of genomic protein coding regions, and then combines the optimum signal processing-based non-data-driven technique with an existing data-driven statistical method in a novel system demonstrating improved identification of acceptor splice sites. Most existing well-known DNA symbolic-to-numeric representations map the DNA information into three or four numerical sequences, potentially increasing the computational requirement of the sequence analyzer. Proposed mapping schemes, to be used for signal processing-based gene and exon prediction, incorporate DNA structural properties in the representation, in addition to reducing complexity in subsequent processing. A detailed comparison of all DNA representations, in terms of computational complexity and relative accuracy for the gene and exon prediction problem, reveals the newly proposed 'paired numeric' to be the best DNA representation. Existing signal processing-based techniques rely mostly on the period-3 behaviour of exons to obtain one-dimensional gene and exon prediction features, and are not well equipped to capture the complementary properties of exonic / intronic regions and deal with the background noise in detection of exons at their nucleotide levels. These issues have been addressed in this thesis, by proposing six one-dimensional and three multi-dimensional signal processing-based gene and exon prediction features.
All one-dimensional and multi-dimensional features have been evaluated using standard datasets such as Burset/Guigo1996, HMR195, and the GENSCAN test set. This is the first time that different gene and exon prediction features have been compared using substantial databases and using nucleotide-level metrics. Furthermore, the first investigation of the suitability of different window sizes for period-3 exon detection is performed. Finally, the optimum signal processing-based gene and exon prediction scheme from our evaluations is combined with a data-driven statistical technique for the recognition of acceptor splice sites. The proposed DSP-statistical hybrid is shown to achieve 43% reduction in false positives over WWAM, as used in GENSCAN.
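The period-3 behaviour of exons that most signal-processing gene finders exploit can be sketched with a sliding-window DFT evaluated at one third of the sampling frequency. The A/T versus C/G mapping below is only loosely inspired by the 'paired numeric' idea, not its actual definition:

```python
import cmath

PAIRED = {'A': 1.0, 'T': 1.0, 'C': -1.0, 'G': -1.0}   # illustrative pairing

def period3_score(seq, win=33):
    """|X[N/3]|^2 of a numeric DNA mapping over a sliding window;
    protein-coding regions tend to show a strong period-3 peak."""
    x = [PAIRED[b] for b in seq]
    scores = []
    for s in range(len(x) - win + 1):
        X = sum(x[s + n] * cmath.exp(-2j * cmath.pi * n / 3) for n in range(win))
        scores.append(abs(X) ** 2)
    return scores
```

A repeated codon such as "ATG" scores high in every window, while a homogeneous run of "A"s cancels to (numerically) zero; the window size is itself a tuning knob, which is why its suitability is investigated in the thesis.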
APA, Harvard, Vancouver, ISO, and other styles
22

Muller, Rikus. "Applying the MDCT to image compression." Thesis, Stellenbosch : University of Stellenbosch, 2009. http://hdl.handle.net/10019.1/1197.

Full text
Abstract:
Thesis (DSc (Mathematical Sciences. Applied Mathematics))--University of Stellenbosch, 2009.
The replacement of the standard discrete cosine transform (DCT) of JPEG with the windowed modified DCT (MDCT) is investigated to determine whether improvements in numerical quality can be achieved. To this end, we employ an existing algorithm for optimal quantisation, for which we also propose improvements. This involves the modelling and prediction of quantisation tables to initialise the algorithm, a strategy that is also thoroughly tested. Furthermore, the effects of various window functions on the coding results are investigated, and we find that improved quality can indeed be achieved by modifying JPEG in this fashion.
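The windowed MDCT and its perfect-reconstruction property over 50%-overlapped frames can be sketched as follows (sine window, standard TDAC formulation; the JPEG-specific quantisation machinery of the thesis is omitted):

```python
import math

def _win(N):
    # sine window: satisfies the Princen-Bradley condition w[n]^2 + w[n+N]^2 = 1
    return [math.sin(math.pi / (2 * N) * (n + 0.5)) for n in range(2 * N)]

def mdct(frame, N):
    """N coefficients from a 2N-sample windowed frame."""
    w = _win(N)
    return [sum(w[n] * frame[n]
                * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(X, N):
    """2N time-aliased samples; overlap-adding consecutive frames cancels the aliasing."""
    w = _win(N)
    return [(2.0 / N) * w[n]
            * sum(X[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                  for k in range(N))
            for n in range(2 * N)]
```

Overlap-adding the inverse transforms of frames hopped by N samples reconstructs every interior sample exactly, even though each frame alone is time-aliased; this aliasing cancellation is what makes the MDCT attractive as a drop-in replacement for the block DCT.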
APA, Harvard, Vancouver, ISO, and other styles
23

Virette, David. "Étude de transformées temps-fréquence pour le codage audio faible retard en haute qualité." Rennes 1, 2012. http://www.theses.fr/2012REN1E014.

Full text
Abstract:
In recent years there has been a phenomenal increase in the number of products and applications which make use of audio coding formats. Among the most successful audio coding schemes, we can list MPEG-1 Layer III (mp3), MPEG-2 Advanced Audio Coding (AAC) and its evolution, MPEG-4 High Efficiency-Advanced Audio Coding (HE-AAC). More recently, perceptual audio coding has been adapted to achieve low delay audio coding and to become suitable for conversational applications. Traditionally, the use of a filter bank such as the Modified Discrete Cosine Transform (MDCT) is a central component of perceptual audio coding, and its adaptation to low delay audio coding has become a very popular research topic. Low delay transforms have been developed in order to maintain the performance of this main component while dramatically reducing the associated algorithmic delay. This work presents a low delay block switching tool which allows the direct transition between long transforms and short transforms without the insertion of a transition window. The same principle has been extended to define new perfect reconstruction conditions for the MDCT with relaxed constraints compared to the original definition. A seamless reconstruction method has been derived, increasing the flexibility of transform coding schemes with the possibility to select a transform window independently from the previous and the following frames. Additionally, based on this new approach, a new low delay window design procedure has been derived, allowing an analytic definition to be obtained. These new approaches have been successfully applied to the newly developed MPEG low delay audio codecs (LD-AAC and ELD-AAC), significantly improving the quality for transient signals. Moreover, the low delay window design has been adopted in G.718, a scalable speech and audio codec standardized in ITU-T, and has demonstrated its benefit in terms of delay reduction while maintaining the audio quality of a traditional MDCT.
APA, Harvard, Vancouver, ISO, and other styles
24

Krejčí, Michal. "Komprese dat." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2009. http://www.nusl.cz/ntk/nusl-217934.

Full text
Abstract:
This thesis deals with lossless and lossy methods of data compression and their possible applications in measurement engineering. The first part of the thesis is a theoretical elaboration which informs the reader about the basic terminology, the reasons for data compression, the use of data compression in standard practice and the classification of compression algorithms. The practical part of the thesis deals with the realization of the compression algorithms in Matlab and LabWindows/CVI.
APA, Harvard, Vancouver, ISO, and other styles
25

Špaček, Milan. "Porovnání možností komprese multimediálních signálů." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2013. http://www.nusl.cz/ntk/nusl-220319.

Full text
Abstract:
The thesis compares compression options for multimedia signals, with a focus on video and advanced codecs. Specifically, it describes the encoding and decoding of video recordings according to the MPEG standard. The theoretical part of the thesis describes the characteristic properties of the video signal and the justification for the need to use compression for recording and transmission. Methods for eliminating redundancy and irrelevance in the encoded video signal are also described, followed by a discussion of ways of measuring video signal quality. A separate chapter is focused on the characteristics of currently used and promising codecs. In the practical part of the thesis, functions were created in the Matlab environment. These functions were implemented into a graphical user interface that simulates the activity of the functional blocks of the encoder and decoder. Based on user-specified input parameters, it performs encoding and decoding of any given picture, composed of images in RGB format, and displays the outputs of the individual functional blocks. Algorithms are implemented for the initial processing of the input sequence, including sub-sampling, as well as DCT, quantization, motion compensation and their inverse operations. Separate chapters are dedicated to the realisation of the codec in the Matlab environment and to the outputs of the individual processing steps. Compression algorithm comparisons and the impact of parameter changes on the final signal are also discussed. The findings are summarized in the conclusion.
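Among the functional blocks mentioned (sub-sampling, DCT, quantization, motion compensation), the quantization/dequantization pair is where the loss is actually introduced. A minimal uniform-quantizer sketch (the step sizes are illustrative, not MPEG quantization tables):

```python
def quantize(coeffs, steps):
    """Quantize each DCT coefficient with its own step size; coarser
    steps for higher frequencies discard visually irrelevant detail."""
    return [round(c / s) for c, s in zip(coeffs, steps)]

def dequantize(levels, steps):
    """Decoder side: reconstruction error per coefficient is at most step/2."""
    return [l * s for l, s in zip(levels, steps)]

steps = [4, 8, 16, 32]              # illustrative, increasing with frequency
coeffs = [103.0, -21.0, 9.0, 3.0]   # a few DCT coefficients, DC first
rec = dequantize(quantize(coeffs, steps), steps)
```

Only the integer levels are entropy-coded and transmitted; the larger the step, the fewer distinct levels occur and the cheaper the coefficient is to code, at the price of a larger reconstruction error.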
APA, Harvard, Vancouver, ISO, and other styles
26

Dvořák, Martin. "Výukový video kodek." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2012. http://www.nusl.cz/ntk/nusl-219882.

Full text
Abstract:
The first goal of the diploma thesis is to study the basic principles of video signal compression and to introduce the techniques used to reduce irrelevancy and redundancy in the video signal. The second goal is, on the basis of this information about compression tools, to implement the individual compression tools in the Matlab programming environment and assemble a simple model of a video codec. The diploma thesis contains a description of the three basic blocks, namely interframe coding, intraframe coding and variable-length coding, according to the MPEG-2 standard.
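Interframe coding rests on motion estimation; a minimal exhaustive block-matching sketch using the sum of absolute differences (block size and search range here are illustrative, not MPEG-2 parameters):

```python
def sad(ref, blk, ry, rx):
    """Sum of absolute differences between a reference-frame region and a block."""
    n = len(blk)
    return sum(abs(ref[ry + i][rx + j] - blk[i][j])
               for i in range(n) for j in range(n))

def motion_vector(ref, blk, by, bx, search=2):
    """Exhaustive search in a +/-search window for the displacement (dy, dx)
    minimising the SAD; the vector (plus a residual) is coded instead of
    the raw block in interframe coding."""
    n = len(blk)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = by + dy, bx + dx
            if 0 <= ry <= len(ref) - n and 0 <= rx <= len(ref[0]) - n:
                cost = sad(ref, blk, ry, rx)
                if best is None or cost < best[0]:
                    best = (cost, (dy, dx))
    return best[1]
```

Real encoders replace the exhaustive scan with fast search patterns and sub-pixel refinement, but the cost function and the idea are the same.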
APA, Harvard, Vancouver, ISO, and other styles
27

Mammeri, Abdelhamid. "Compression et transmission d'images avec énergie minimale application aux capteurs sans fil." Thèse, Université de Sherbrooke, 2010. http://hdl.handle.net/11143/5800.

Full text
Abstract:
A wireless image sensor network (WISN) is an ad hoc network formed by a set of autonomous nodes, each equipped with a small camera, communicating with each other without wired links and without an established infrastructure or centralized network management. Their usefulness appears major in several domains, notably medicine and the environment. Designing a compression and wireless transmission chain for a WISN poses real challenges, mainly because of the limited resources of the sensors (weak battery, limited processing capacity and memory). The objective of this thesis is to explore strategies for improving the energy efficiency of WISNs, particularly during image compression and transmission. Inevitably, applying the usual standards such as JPEG or JPEG2000 is energy-intensive and thus limits the longevity of WISNs; they must therefore be adapted to the constraints imposed by WISNs. To this end, we first analyzed the feasibility of adapting JPEG to a context where energy resources are very limited. The work carried out on this aspect allows us to propose three solutions. The first solution is based on the energy compaction property of the Discrete Cosine Transform (DCT). This property makes it possible to eliminate redundancy in an image without excessively altering its quality, while saving energy. Reducing energy through the use of regions of interest represents the second solution explored in this thesis. Finally, we proposed a scheme based on progressive compression and transmission, making it possible to obtain a general idea of the target image without sending its entire content. In addition, for energy-efficient transmission, we opted for the following solution: reliably send only the low frequencies and the regions of interest of an image.
The high frequencies and the regions of lesser interest are sent "unreliably", since their loss only slightly alters the image quality. For this purpose, prioritization models were compared and then adapted to our needs. Secondly, we studied the wavelet approach. More precisely, we analyzed several wavelet filters and determined the most suitable wavelets to ensure low energy consumption while maintaining good quality of the image reconstructed at the base station. To estimate the energy consumed by a sensor during each compression step, a mathematical model is developed for each transform (DCT or wavelet). These models, which do not take implementation complexity into account, are based on the number of basic operations executed at each compression step.
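The operation-counting energy model mentioned at the end can be sketched as follows; the per-operation costs and the operation counts for a direct versus a separable (row-column) 2D DCT are illustrative assumptions, not the thesis's calibrated values:

```python
E_ADD, E_MUL = 1.0, 3.0    # hypothetical energy per basic operation

def energy(n_add, n_mul):
    """Energy estimate from operation counts, ignoring implementation details."""
    return n_add * E_ADD + n_mul * E_MUL

def dct2d_direct_ops(n=8):
    """Naive 2D DCT: each of the n*n outputs sums n*n products."""
    return ((n * n - 1) * n * n, n * n * n * n)

def dct2d_separable_ops(n=8):
    """Row-column 2D DCT: 2n one-dimensional n-point DCTs."""
    return (2 * n * n * (n - 1), 2 * n * n * n)
```

Even this crude count shows why the choice of transform and algorithm matters for sensor lifetime: for 8x8 blocks the separable form needs roughly a quarter of the energy of the direct form under these weights.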
APA, Harvard, Vancouver, ISO, and other styles
28

Reeves, Robert William. "Image matching in the compressed domain." Thesis, Queensland University of Technology, 1999.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
29

Zhu, Zuowei. "Modèles géométriques avec defauts pour la fabrication additive." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLN021/document.

Full text
Abstract:
The intricate error sources within different stages of the Additive Manufacturing (AM) process have brought about major issues regarding the dimensional and geometrical accuracy of the manufactured product. Therefore, effective modeling of the geometric deviations is critical for AM. The Skin Model Shapes (SMS) paradigm offers a comprehensive framework aiming at addressing the deviation modeling problem at different stages of product lifecycle, and is thus a promising solution for deviation modeling in AM. In this thesis, considering the layer-wise characteristic of AM, a new SMS framework is proposed which characterizes the deviations in AM with in-plane and out-of-plane perspectives. The modeling of in-plane deviation aims at capturing the variability of the 2D shape of each layer. A shape transformation perspective is proposed which maps the variational effects of deviation sources into affine transformations of the nominal shape. With this assumption, a parametric deviation model is established based on the Polar Coordinate System which manages to capture deviation patterns regardless of the shape complexity. This model is further enhanced with a statistical learning capability to simultaneously learn from deviation data of multiple shapes and improve the performance on all shapes. Out-of-plane deviation is defined as the deformation of layer in the build direction. A layer-level investigation of out-of-plane deviation is conducted with a data-driven method. Based on the deviation data collected from a number of Finite Element simulations, two modal analysis methods, Discrete Cosine Transform (DCT) and Statistical Shape Analysis (SSA), are adopted to identify the most significant deviation modes in the layer-wise data. The effect of part and process parameters on the identified modes is further characterized with a Gaussian Process (GP) model.
The discussed methods are finally used to obtain high-fidelity SMSs of AM products by deforming the nominal layer contours with predicted deviations and rebuilding the complete non-ideal surface model from the deformed contours. A toolbox is developed in the MATLAB environment to demonstrate the effectiveness of the proposed methods
APA, Harvard, Vancouver, ISO, and other styles
30

Atrevi, Dieudonne Fabrice. "Détection et analyse des évènements rares par vision, dans un contexte urbain ou péri-urbain." Thesis, Orléans, 2019. http://www.theses.fr/2019ORLE2008.

Full text
Abstract:
The main objective of this thesis is the development of complete methods for rare event detection. The work can be summarized in two parts. The first part is devoted to the study of state-of-the-art shape descriptors. On the one hand, the robustness of some descriptors to varying light conditions was studied. On the other hand, the ability of geometric moments to describe the human shape was also studied through a 3D human pose estimation application based on 2D images. From this study, we have shown that through a shape retrieval application, geometric moments can be used to estimate a human pose through an exhaustive search in a pose database. This kind of application can be used in a human action recognition system, which may be the final step of an event analysis system. In the second part of this report, three main contributions to rare event detection are presented. The first contribution concerns the development of a global scene analysis method for crowd event detection. In this method, global scene modeling is done based on spatiotemporal interest points filtered from the saliency map of the scene. The characteristics used are the histogram of optical flow orientations and a set of shape descriptors studied in the first part. The Latent Dirichlet Allocation algorithm is used to create event models by using a visual document representation of image sequences (video clips). The second contribution is the development of a method for salient motion detection in video. This method is totally unsupervised and relies on the properties of the discrete cosine transform to explore the optical flow information of the scene. Local modeling for event detection and localization is at the core of the latest contribution of this thesis. The method is based on the saliency score of movements and the one-class SVM algorithm to create the event model. The methods have been tested on different public databases and the results obtained are promising.
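The unsupervised DCT-based salient-motion idea can be illustrated in one dimension with a signature-style computation: keep only the sign of the DCT spectrum and invert. This is a loose sketch inspired by DCT image-signature saliency, applied here to a toy motion-magnitude signal; it is not the thesis's exact algorithm:

```python
import math

def dct(x):
    """Unnormalised DCT-II."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N) for n in range(N))
            for k in range(N)]

def idct(X):
    """Inverse of the unnormalised DCT-II above."""
    N = len(X)
    return [(X[0] / 2 + sum(X[k] * math.cos(math.pi * (n + 0.5) * k / N)
                            for k in range(1, N))) * 2 / N
            for n in range(N)]

def saliency(signal):
    """Sign of the spectrum, inverted and squared: isolated structure
    stands out against a repetitive or empty background."""
    recon = idct([(v > 0) - (v < 0) for v in dct(signal)])
    return [r * r for r in recon]
```

On a flow-magnitude trace that is flat except for one isolated burst, the saliency map peaks at the burst, because only there do all sign-limited spectral components add coherently.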
APA, Harvard, Vancouver, ISO, and other styles
31

Chung, Ming-Shen, and 鐘明聲. "FPGA Implementation of the Discrete Fourier Transform (DFT) and the Discrete Cosine Transform (DCT)." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/01456280849939764692.

Full text
Abstract:
Master's
National Kaohsiung First University of Science and Technology
Institute of Computer and Communication Engineering
90
The Discrete Fourier Transform (DFT) has been widely applied in communication, speech processing, image processing, radar and sonar systems, etc. Architectures for implementing the DFT can be classified into two fields: (1) pipelined systolic architectures, and (2) memory-based architectures. The Discrete Cosine Transform (DCT) has been commonly adopted in various standards for image compression, while FPGA has become a new trend in ASIC design, so we apply FPGA techniques to implement the DFT and the DCT. This thesis deals with how to use FPGA techniques to implement: (1) a pipelined systolic array architecture that requires log2N complex multipliers, 2log2N complex adders, 2log2N multiplexers and N delay elements, and is able to provide a throughput of one transform sample per clock cycle; (2) a memory-based architecture that consists of three two-port RAMs, one ROM, one complex multiplier, two complex adders and one multiplexer, and is capable of computing one transform sample every log2N+1 clock cycles on average; (3) an improved version of the architecture in (2) which, at the cost of a small amount of additional hardware, halves the run time, i.e. N(log2N)/2; (4) the 2D-DFT, using the 1D-DFT architecture in (2); (5) the DCT and 2D-DCT operations.
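The log2N complex-multiplier stages of the pipelined architecture correspond to the log2N butterfly stages of the radix-2 FFT; a software sketch of that decomposition (this models the dataflow only, not the FPGA circuit itself):

```python
import cmath

def fft_radix2(x):
    """Iterative radix-2 decimation-in-time FFT: bit-reversal followed by
    log2(N) butterfly stages, one twiddle multiplication per butterfly."""
    n = len(x)
    stages = n.bit_length() - 1          # log2(N), assuming N is a power of two
    # bit-reversal permutation of the input
    out = [x[int(format(i, f'0{stages}b')[::-1], 2)] for i in range(n)]
    size = 2
    while size <= n:                     # one pass per hardware stage
        half = size // 2
        w = cmath.exp(-2j * cmath.pi / size)
        for start in range(0, n, size):
            f = 1.0
            for k in range(half):
                a = out[start + k]
                b = out[start + k + half] * f
                out[start + k] = a + b   # butterfly add
                out[start + k + half] = a - b
                f *= w                   # next twiddle factor
        size *= 2
    return out
```

In a pipelined FPGA realisation each `while` iteration becomes a physical stage with its own complex multiplier and delay line, which is where the log2N multiplier count in the abstract comes from.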
APA, Harvard, Vancouver, ISO, and other styles
32

Liu, Jian-Cheng, and 劉建成. "Multi-dimentional Discrete Cosine Transform (DCT) Chip Design." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/56071432152209136424.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

"Postprocessing of images coded using block DCT at low bit rates." 2007. http://library.cuhk.edu.hk/record=b5893316.

Full text
Abstract:
Sun, Deqing.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2007.
Includes bibliographical references (leaves 86-91).
Abstracts in English and Chinese.
Abstract --- p.i
摘要 --- p.iii
Contributions --- p.iv
Acknowledgement --- p.vi
Abbreviations --- p.xviii
Notations --- p.xxi
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Image compression and postprocessing --- p.1
Chapter 1.2 --- A brief review of postprocessing --- p.3
Chapter 1.3 --- Objective and methodology of the research --- p.7
Chapter 1.4 --- Thesis organization --- p.8
Chapter 1.5 --- A note on publication --- p.9
Chapter 2 --- Background Study --- p.11
Chapter 2.1 --- Image models --- p.11
Chapter 2.1.1 --- Minimum edge difference (MED) criterion for block boundaries --- p.12
Chapter 2.1.2 --- van Beek's edge model for an edge --- p.15
Chapter 2.1.3 --- Fields of experts (FoE) for an image --- p.16
Chapter 2.2 --- Degradation models --- p.20
Chapter 2.2.1 --- Quantization constraint set (QCS) and uniform noise --- p.21
Chapter 2.2.2 --- Narrow quantization constraint set (NQCS) --- p.22
Chapter 2.2.3 --- Gaussian noise --- p.22
Chapter 2.2.4 --- Edge width enlargement after quantization --- p.25
Chapter 2.3 --- Use of these models for postprocessing --- p.27
Chapter 2.3.1 --- MED and edge models --- p.27
Chapter 2.3.2 --- The FoE prior model --- p.27
Chapter 3 --- Postprocessing using MED and edge models --- p.28
Chapter 3.1 --- Blocking artifacts suppression by coefficient restoration --- p.29
Chapter 3.1.1 --- AC coefficient restoration by MED --- p.29
Chapter 3.1.2 --- General derivation --- p.31
Chapter 3.2 --- Detailed algorithm --- p.34
Chapter 3.2.1 --- Edge identification --- p.36
Chapter 3.2.2 --- Region classification --- p.36
Chapter 3.2.3 --- Edge reconstruction --- p.37
Chapter 3.2.4 --- Image reconstruction --- p.37
Chapter 3.3 --- Experimental results --- p.38
Chapter 3.3.1 --- Results of the proposed method --- p.38
Chapter 3.3.2 --- Comparison with one wavelet-based method --- p.39
Chapter 3.4 --- On the global minimum of the edge difference --- p.41
Chapter 3.4.1 --- The constrained minimization problem --- p.41
Chapter 3.4.2 --- Experimental examination --- p.42
Chapter 3.4.3 --- Discussions --- p.43
Chapter 3.5 --- Conclusions --- p.44
Chapter 4 --- Postprocessing by the MAP criterion using FoE --- p.49
Chapter 4.1 --- The proposed method --- p.49
Chapter 4.1.1 --- The MAP criterion --- p.49
Chapter 4.1.2 --- The optimization problem --- p.51
Chapter 4.2 --- Experimental results --- p.52
Chapter 4.2.1 --- Setting algorithm parameters --- p.53
Chapter 4.2.2 --- Results --- p.56
Chapter 4.3 --- Investigation on the quantization noise model --- p.58
Chapter 4.4 --- Conclusions --- p.61
Chapter 5 --- Conclusion --- p.71
Chapter 5.1 --- Contributions --- p.71
Chapter 5.1.1 --- Extension of the DCCR algorithm --- p.71
Chapter 5.1.2 --- Examination of the MED criterion --- p.72
Chapter 5.1.3 --- Use of the FoE prior in postprocessing --- p.72
Chapter 5.1.4 --- Investigation on the quantization noise model --- p.73
Chapter 5.2 --- Future work --- p.73
Chapter 5.2.1 --- Degradation model --- p.73
Chapter 5.2.2 --- Efficient implementation of the MAP method --- p.74
Chapter 5.2.3 --- Postprocessing of compressed video --- p.75
Chapter A --- Detailed derivation of coefficient restoration --- p.76
Chapter B --- Implementation details of the FoE prior --- p.81
Chapter B.1 --- The FoE prior model --- p.81
Chapter B.2 --- Energy function and its gradient --- p.83
Chapter B.3 --- Conjugate gradient descent method --- p.84
Bibliography --- p.86
APA, Harvard, Vancouver, ISO, and other styles
34

Suresh, K. "MDCT Domain Enhancements For Audio Processing." Thesis, 2010. https://etd.iisc.ac.in/handle/2005/1184.

Full text
Abstract:
Modified discrete cosine transform (MDCT), derived from the DCT-IV, has emerged as the most suitable choice for transform domain audio coding applications due to its time domain alias cancellation property and de-correlation capability. In the present research work, we focus on MDCT domain analysis of audio signals for compression and other applications. We have derived algorithms for linear filtering in the DCT-IV and DST-IV domains for symmetric and non-symmetric filter impulse responses. These results are also extended to the MDCT and MDST domains, which have the special property of time domain alias cancellation. We also derive filtering algorithms for the DCT-II and DCT-III domains. Comparison with other methods in the literature shows that the newly developed algorithm is efficient in terms of multiply-accumulate (MAC) operations. These results are useful for MDCT domain audio processing such as reverb synthesis, without having to reconstruct the time domain signal and then perform the necessary filtering operations. In audio coding, the psychoacoustic model plays a crucial role and is used to estimate the masking thresholds for adaptive bit allocation. Transparent quality audio coding is possible if the quantization noise is kept below the masking threshold for each frame. In the existing methods, the masking threshold is calculated using the DFT of the signal frame separately for MDCT domain adaptive quantization. We have extended the spectral integration based psychoacoustic model, proposed for sinusoidal modeling of audio signals, to the MDCT domain. This has been possible because of the detailed analysis of the relation between the DFT and the MDCT; we interpret the MDCT coefficients as co-sinusoids and then apply the sinusoidal masking model. The validity of the masking threshold so derived is verified through listening tests as well as objective measures. Parametric coding techniques are used for low bit rate encoding of multi-channel audio such as 5.1 format surround audio.
In these techniques, the surround channels are synthesized at the receiver using the analysis parameters of the parametric model. We develop algorithms for MDCT domain analysis and synthesis of reverberation. Integrating these ideas, a parametric audio coder is developed in the MDCT domain. For the parameter estimation, we use a novel analysis by synthesis scheme in the MDCT domain which results in better modeling of the spatial audio. The resulting parametric stereo coder is able to synthesize acceptable quality stereo audio from the mono audio channel and a side information of approximately 11 kbps. Further, an experimental audio coder is developed in the MDCT domain incorporating the new psychoacoustic model and the parametric model.
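The time domain alias cancellation property central to this abstract can be demonstrated numerically. Below is a minimal MDCT/IMDCT pair with the sine window, the standard textbook formulation (not the thesis's own code): the inverse of a single frame is still time-aliased, but overlap-adding consecutive half-overlapping frames cancels the aliasing exactly.

```python
import math

def sine_window(N):
    """Sine window of length 2N; satisfies the Princen-Bradley condition
    w[n]^2 + w[n+N]^2 = 1, which is what makes alias cancellation work."""
    return [math.sin(math.pi / (2 * N) * (n + 0.5)) for n in range(2 * N)]

def mdct(frame, window):
    """MDCT of a length-2N frame -> N coefficients."""
    N = len(frame) // 2
    return [sum(window[n] * frame[n] *
                math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(coeffs, window):
    """Inverse MDCT -> length-2N frame; time-aliased until overlap-add."""
    N = len(coeffs)
    return [window[n] * (2.0 / N) *
            sum(coeffs[k] *
                math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for k in range(N))
            for n in range(2 * N)]
```

With a hop size of N samples, the second half of one inverse-transformed frame plus the first half of the next reconstructs the input exactly, which is why the MDCT is critically sampled despite its 50% frame overlap.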
APA, Harvard, Vancouver, ISO, and other styles
35

Suresh, K. "MDCT Domain Enhancements For Audio Processing." Thesis, 2010. http://etd.iisc.ernet.in/handle/2005/1184.

Full text
Abstract:
Modified discrete cosine transform (MDCT), derived from the DCT-IV, has emerged as the most suitable choice for transform domain audio coding applications due to its time domain alias cancellation property and de-correlation capability. In the present research work, we focus on MDCT domain analysis of audio signals for compression and other applications. We have derived algorithms for linear filtering in the DCT-IV and DST-IV domains for symmetric and non-symmetric filter impulse responses. These results are also extended to the MDCT and MDST domains, which have the special property of time domain alias cancellation. We also derive filtering algorithms for the DCT-II and DCT-III domains. Comparison with other methods in the literature shows that the newly developed algorithm is efficient in terms of multiply-accumulate (MAC) operations. These results are useful for MDCT domain audio processing such as reverb synthesis, without having to reconstruct the time domain signal and then perform the necessary filtering operations. In audio coding, the psychoacoustic model plays a crucial role and is used to estimate the masking thresholds for adaptive bit allocation. Transparent quality audio coding is possible if the quantization noise is kept below the masking threshold for each frame. In the existing methods, the masking threshold is calculated using the DFT of the signal frame separately for MDCT domain adaptive quantization. We have extended the spectral integration based psychoacoustic model, proposed for sinusoidal modeling of audio signals, to the MDCT domain. This has been possible because of the detailed analysis of the relation between the DFT and the MDCT; we interpret the MDCT coefficients as co-sinusoids and then apply the sinusoidal masking model. The validity of the masking threshold so derived is verified through listening tests as well as objective measures. Parametric coding techniques are used for low bit rate encoding of multi-channel audio such as 5.1 format surround audio.
In these techniques, the surround channels are synthesized at the receiver using the analysis parameters of the parametric model. We develop algorithms for MDCT domain analysis and synthesis of reverberation. Integrating these ideas, a parametric audio coder is developed in the MDCT domain. For the parameter estimation, we use a novel analysis by synthesis scheme in the MDCT domain which results in better modeling of the spatial audio. The resulting parametric stereo coder is able to synthesize acceptable quality stereo audio from the mono audio channel and a side information of approximately 11 kbps. Further, an experimental audio coder is developed in the MDCT domain incorporating the new psychoacoustic model and the parametric model.
APA, Harvard, Vancouver, ISO, and other styles
36

Meghanani, Amit. "Pitch-Synchronous Discrete Cosine Transform Features for Speaker Recognition and Other Applications." Thesis, 2020. https://etd.iisc.ac.in/handle/2005/4822.

Full text
Abstract:
Extracting speaker-specific information from speech is of great interest, since speaker recognition technology finds application in a wide range of areas such as forensics and biometric security systems. In this thesis, we propose a new feature named pitch-synchronous discrete cosine transform (PS-DCT), derived from the voiced part of speech, for speaker identification (SID) and verification (SV) tasks. Variants of the proposed PS-DCT features are explored for other speech-based applications. PS-DCT features are based on the "time-domain, quasi-periodic waveform shape" of the voiced sounds, which is captured by the discrete cosine transform (DCT). We test our PS-DCT feature on the TIMIT, Mandarin and YOHO datasets for text-independent SID and SV studies. On TIMIT with 168 speakers and Mandarin with 855 speakers, we obtain text-independent SID accuracies of 99.4% and 96.1%, respectively, using a Gaussian mixture model-based classifier. SV studies are performed using the i-vector based framework. To ensure good performance in text-independent speaker verification, sufficient training data for enrollment and longer utterances as test data are required. In i-vector based SV, fusing the "PS-DCT based system" with the "MFCC-based system" at the score level using a convex combination scheme reduces the equal error rate (EER) for both the YOHO and Mandarin datasets. In the case of limited test data along with session variability, we obtain a significant reduction in EER: up to 5.8% on the YOHO database and 3.2% on the Mandarin dataset for test data of duration < 3 sec. Thus, our experiments demonstrate that the proposed features supplement handcrafted classical features such as MFCC. The improvement in performance is more prominent in the limited-test-data speaker verification task. As mentioned earlier, we have also explored the efficacy of the proposed features for other speech-based tasks.
Emotions influence both the voice characteristics and the linguistic content of speech. Speech emotion recognition (SER) is the task of extracting emotions from the voice characteristics of speech. Since variations in pitch play an important role in expressing emotion through speech, we have tested PS-DCT features for emotion recognition. For the SER experiments, we have used the Berlin emotional speech database (EmoDB), which contains 535 utterances spoken by 10 actors (5 female, 5 male) in 7 simulated emotions (anger, boredom, disgust, fear, joy, sadness and neutral). Bi-directional long short-term memory (BiLSTM) has the ability to model the temporal dependencies of sequential data such as speech, and it has already been used for the emotion recognition task. Hence, we have used a BiLSTM network trained with conventional MFCC features at the front end as a baseline. To provide an accurate assessment of the model, we train and validate the model using leave-one-speaker-out (LOSO) k-fold (k = 10 in this case) cross-validation. In this method, we train using k-1 speakers, validate on the left-out speaker, and repeat this procedure k times. The final validation accuracy is computed as the average over the k folds. Experiments show that the BiLSTM network trained with combined PS-DCT and PS-MFCC features gives improved performance over a network trained on regular MFCC. The absolute improvement in 10-fold cross-validation accuracy is 3.5% using the fused (PS-DCT + PS-MFCC) features. Since every class of vowels has fixed gross-level temporal dynamics across different speakers and has a quasi-stationary structure, we have also explored the usefulness of PS-DCT for vowel recognition. The task of vowel recognition from speech can be viewed as a subset of the phone recognition task. Since PS-DCT features are defined only for voiced sounds, we have focused on vowel recognition, since all vowels are inherently voiced in nature.
For this study, we have used the vowel dataset available at https://homepages.wmich.edu/~hillenbr/voweldata.html. This dataset has 12 vowels (/ae/ as in "had", /ah/ as in "hod", /aw/ as in "hawed", /eh/ as in "head", /er/ as in "heard", /ey/ as in "hayed", /ih/ as in "hid", /iy/ as in "heed", /oa/ as in "hoed", /oo/ as in "hood", /uh/ as in "hud", /uw/ as in "who'd") recorded from 139 subjects covering both genders and a wide range of ages. It has utterances recorded from 45 men, 48 women and 46 children (both boys and girls). In total, it has 1668 utterances covering all 12 vowels (words like "had" and "hod"). The results show that PS-DCT alone is able to classify vowels with a 5-fold cross-validation average accuracy of 73.3%. Using MFCC features at the front end of the BiLSTM network, the obtained 5-fold cross-validation average accuracy is 88.5%, which is much better than that obtained with PS-DCT. Training the network with combined PS-DCT and PS-MFCC has not led to improved performance, unlike the case of emotion recognition.
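The core idea of this abstract, using DCT coefficients of a single pitch period as a gross waveform-shape descriptor, can be sketched as follows. This is an illustrative toy (period boundaries are assumed known, and the coefficient count is arbitrary), not the thesis's exact feature pipeline:

```python
import math

def ps_dct_features(period, num_coeffs):
    """Orthonormal DCT-II of one pitch period, truncated to the first few
    coefficients, which capture the gross shape of the waveform."""
    N = len(period)
    feats = []
    for k in range(num_coeffs):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        feats.append(scale * sum(period[n] *
                                 math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                                 for n in range(N)))
    return feats
```

For a perfectly periodic synthetic "voiced" waveform, any two pitch periods yield identical descriptors, which is the invariance that makes the feature pitch-synchronous.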
APA, Harvard, Vancouver, ISO, and other styles
37

Mehrotra, Abhishek. "Shape Adaptive Integer Wavelet Transform Based Coding Scheme For 2-D/3-D Brain MR Images." Thesis, 2004. https://etd.iisc.ac.in/handle/2005/1171.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Bittens, Sina Vanessa. "Sparse Fast Trigonometric Transforms." Doctoral thesis, 2019. http://hdl.handle.net/21.11130/00-1735-0000-0003-C16D-9.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Mehrotra, Abhishek. "Shape Adaptive Integer Wavelet Transform Based Coding Scheme For 2-D/3-D Brain MR Images." Thesis, 2004. http://etd.iisc.ernet.in/handle/2005/1171.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Chen, Chien-Feng, and 陳建豐. "The Application of Discrete Cosine Transform (DCT) Combined with the Nonlinear Regression Analysis on Optical Auto-Focusing." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/15902479623677292509.

Full text
Abstract:
Master's thesis
National Kaohsiung Normal University
Department of Physics
Academic year 96 (ROC calendar)
This research presents a fast and accurate real-time optical auto-focusing system, which utilizes a frequency component of the discrete cosine transform (DCT) as the focus measure. In addition, a nonlinear regression routine is incorporated into the algorithm to quickly move a rotational stepper motor to the best focus. The concise and effective algorithm can be applied to digital cameras, microscopes and optical inspection instruments.
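A DCT-based focus measure of the kind this abstract describes can be illustrated with a toy version: sum the magnitudes of high-frequency DCT components, which drop when the image is defocused. The band choice and blur model below are hypothetical stand-ins, not the thesis's actual selection:

```python
import math

def dct2d(block):
    """Naive (unnormalized) 2-D DCT-II of a square block."""
    N = len(block)
    def basis(i, u):
        return math.cos(math.pi * (2 * i + 1) * u / (2 * N))
    return [[sum(block[i][j] * basis(i, u) * basis(j, v)
                 for i in range(N) for j in range(N))
             for v in range(N)]
            for u in range(N)]

def focus_measure(block):
    """Sum of high-frequency DCT magnitudes: larger for sharper content."""
    N = len(block)
    F = dct2d(block)
    return sum(abs(F[u][v]) for u in range(N) for v in range(N) if u + v >= N)

def box_blur(block):
    """3x3 box blur with edge clamping, simulating defocus."""
    N = len(block)
    out = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            vals = [block[min(max(i + di, 0), N - 1)][min(max(j + dj, 0), N - 1)]
                    for di in (-1, 0, 1) for dj in (-1, 0, 1)]
            out[i][j] = sum(vals) / 9.0
    return out
```

An auto-focus loop would evaluate this measure at several lens positions and fit a curve (the abstract's nonlinear regression step) to jump straight to the peak.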
APA, Harvard, Vancouver, ISO, and other styles
41

Gupta, Pradeep Kumar. "Denoising And Inpainting Of Images : A Transform Domain Based Approach." Thesis, 2007. https://etd.iisc.ac.in/handle/2005/515.

Full text
Abstract:
Many scientific data sets are contaminated by noise, either because of the data acquisition process or because of naturally occurring phenomena. A first step in analyzing such data sets is denoising, i.e., removing additive noise from a noisy image. For images, noise suppression is a delicate and difficult task. A trade-off between noise reduction and the preservation of actual image features has to be made in a way that enhances the relevant image content. The opening chapter of this thesis is introductory in nature and discusses popular denoising techniques in the spatial and frequency domains. The wavelet transform has wide applications in image processing, especially in the denoising of images. Wavelet systems are a set of building blocks that represent a signal in an expansion set involving indices for time and scale. These systems allow the multi-resolution representation of signals. Several well-known denoising algorithms exist in the wavelet domain which penalize the noisy coefficients by thresholding them. We discuss wavelet transform based denoising of images using bit planes. This approach preserves the edges in an image. The proposed approach relies on the fact that the wavelet transform allows the denoising strategy to adapt itself according to the directional features of coefficients in the respective sub-bands. Further, issues related to a low-complexity implementation of this algorithm are discussed. The proposed approach has been tested on different sets of images under different noise intensities. Studies have shown that this approach provides a significant reduction in normalized mean square error (NMSE). The denoised images are visually pleasing. Many image compression techniques still use the redundancy reduction property of the discrete cosine transform (DCT), so the development of a denoising algorithm in the DCT domain has practical significance. In chapter 3, a DCT based denoising algorithm is presented.
In general, the design of filters largely depends on a priori knowledge about the type of noise corrupting the image and about the image features. This makes standard filters application- and image-specific. The most popular filters, such as the average, Gaussian and Wiener filters, reduce noisy artifacts by smoothing. However, this operation normally results in smoothing of the edges as well. On the other hand, sharpening filters enhance the high-frequency details, making the image non-smooth. An integrated approach to designing filters based on the DCT is proposed in chapter 3. This algorithm reorganizes DCT coefficients in a wavelet transform manner to get better energy clustering at the desired spatial locations. An adaptive threshold is chosen because such adaptivity can improve the wavelet threshold performance, as it allows additional local information of the image to be incorporated in the algorithm. Evaluation results show that the proposed filter is robust under various noise distributions and does not require any a priori knowledge about the image. Inpainting is another application that comes under the category of image processing. Inpainting provides a way to reconstruct small damaged portions of an image. Filling in missing data in digital images has a number of applications, such as image coding and wireless image transmission (for recovering lost blocks), special effects (e.g., removal of objects) and image restoration (e.g., removal of solid lines, scratches, and noise). In chapter 4, a wavelet-based inpainting algorithm is presented for the reconstruction of small missing and damaged portions of an image while preserving the overall image quality. This approach exploits the directional features that exist in wavelet coefficients in the respective sub-bands. The concluding chapter presents a brief review of the three new approaches: the wavelet and DCT based denoising schemes and the wavelet-based inpainting method.
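The transform-domain denoising idea running through this abstract, penalizing small coefficients by thresholding, can be sketched in one dimension. This is a minimal hard-threshold version with a fixed threshold, whereas the thesis uses an adaptive, locally informed one:

```python
import math

def dct(x):
    """Orthonormal DCT-II."""
    N = len(x)
    out = []
    for k in range(N):
        s = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        out.append(s * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                           for n in range(N)))
    return out

def idct(X):
    """Inverse of the orthonormal DCT-II above."""
    N = len(X)
    return [sum((math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)) * X[k] *
                math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N))
            for n in range(N)]

def denoise(x, threshold):
    """Zero out small DCT coefficients (assumed to be noise), transform back."""
    return idct([c if abs(c) > threshold else 0.0 for c in dct(x)])
```

A smooth signal concentrates its energy in a few large low-frequency coefficients, while broadband noise spreads thinly across all of them, so thresholding removes mostly noise.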
APA, Harvard, Vancouver, ISO, and other styles
42

Gupta, Pradeep Kumar. "Denoising And Inpainting Of Images : A Transform Domain Based Approach." Thesis, 2007. http://hdl.handle.net/2005/515.

Full text
Abstract:
Many scientific data sets are contaminated by noise, either because of the data acquisition process or because of naturally occurring phenomena. A first step in analyzing such data sets is denoising, i.e., removing additive noise from a noisy image. For images, noise suppression is a delicate and difficult task. A trade-off between noise reduction and the preservation of actual image features has to be made in a way that enhances the relevant image content. The opening chapter of this thesis is introductory in nature and discusses popular denoising techniques in the spatial and frequency domains. The wavelet transform has wide applications in image processing, especially in the denoising of images. Wavelet systems are a set of building blocks that represent a signal in an expansion set involving indices for time and scale. These systems allow the multi-resolution representation of signals. Several well-known denoising algorithms exist in the wavelet domain which penalize the noisy coefficients by thresholding them. We discuss wavelet transform based denoising of images using bit planes. This approach preserves the edges in an image. The proposed approach relies on the fact that the wavelet transform allows the denoising strategy to adapt itself according to the directional features of coefficients in the respective sub-bands. Further, issues related to a low-complexity implementation of this algorithm are discussed. The proposed approach has been tested on different sets of images under different noise intensities. Studies have shown that this approach provides a significant reduction in normalized mean square error (NMSE). The denoised images are visually pleasing. Many image compression techniques still use the redundancy reduction property of the discrete cosine transform (DCT), so the development of a denoising algorithm in the DCT domain has practical significance. In chapter 3, a DCT based denoising algorithm is presented.
In general, the design of filters largely depends on a priori knowledge about the type of noise corrupting the image and about the image features. This makes standard filters application- and image-specific. The most popular filters, such as the average, Gaussian and Wiener filters, reduce noisy artifacts by smoothing. However, this operation normally results in smoothing of the edges as well. On the other hand, sharpening filters enhance the high-frequency details, making the image non-smooth. An integrated approach to designing filters based on the DCT is proposed in chapter 3. This algorithm reorganizes DCT coefficients in a wavelet transform manner to get better energy clustering at the desired spatial locations. An adaptive threshold is chosen because such adaptivity can improve the wavelet threshold performance, as it allows additional local information of the image to be incorporated in the algorithm. Evaluation results show that the proposed filter is robust under various noise distributions and does not require any a priori knowledge about the image. Inpainting is another application that comes under the category of image processing. Inpainting provides a way to reconstruct small damaged portions of an image. Filling in missing data in digital images has a number of applications, such as image coding and wireless image transmission (for recovering lost blocks), special effects (e.g., removal of objects) and image restoration (e.g., removal of solid lines, scratches, and noise). In chapter 4, a wavelet-based inpainting algorithm is presented for the reconstruction of small missing and damaged portions of an image while preserving the overall image quality. This approach exploits the directional features that exist in wavelet coefficients in the respective sub-bands. The concluding chapter presents a brief review of the three new approaches: the wavelet and DCT based denoising schemes and the wavelet-based inpainting method.
APA, Harvard, Vancouver, ISO, and other styles
43

Jiang, Jianmin, and G. C. Feng. "The spatial relationship of DCT coefficients between a block and its sub-blocks." 2002. http://hdl.handle.net/10454/3446.

Full text
Abstract:
No
At present, almost all digital images are stored and transferred in their compressed format, in which discrete cosine transform (DCT)-based compression remains one of the most important data compression techniques due to the efforts of JPEG. In order to save computation and memory cost, it is desirable to have image processing operations such as feature extraction, image indexing, and pattern classification implemented directly in the DCT domain. To this end, we present in this paper a generalized analysis of the spatial relationships between the DCT of any block and those of its sub-blocks. The results reveal that the DCT coefficients of any block can be directly obtained from the DCT coefficients of its sub-blocks and that the interblock relationship remains linear. This is useful for extracting global features in the compressed domain for general image processing tasks such as those widely used in pyramid algorithms and image indexing. In addition, due to the fact that the corresponding coefficient matrix of the linear combination is sparse, the computational complexity of the proposed algorithms is significantly lower than that of existing methods.
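The linear interblock relationship the abstract describes can be checked numerically in one dimension: with orthonormal DCT matrices, the DCT of a length-N block equals a fixed matrix times the stacked DCTs of its two halves. The sketch below builds that matrix by composition (inverse sub-block DCTs followed by the full-block DCT); the paper derives the same linear map directly, without passing through the spatial domain:

```python
import math

def dct_matrix(N):
    """Orthonormal DCT-II matrix; rows are basis vectors, so C @ C.T = I."""
    M = []
    for k in range(N):
        s = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        M.append([s * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                  for n in range(N)])
    return M

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def conversion_matrix(N):
    """A such that DCT_N(x) = A @ [DCT_{N/2}(x_left); DCT_{N/2}(x_right)].

    A = C_N @ blockdiag(C_{N/2}^T, C_{N/2}^T): undo the sub-block DCTs
    (transpose = inverse, by orthonormality), then transform the whole block.
    """
    h = N // 2
    Ch = dct_matrix(h)
    blk = [[0.0] * N for _ in range(N)]
    for i in range(h):
        for j in range(h):
            blk[i][j] = Ch[j][i]          # C_{N/2}^T in the top-left corner
            blk[h + i][h + j] = Ch[j][i]  # C_{N/2}^T in the bottom-right corner
    return matmul(dct_matrix(N), blk)
```

Since the map is a fixed matrix, full-block DCT coefficients can be assembled from already-decoded sub-block coefficients without an inverse transform, which is the computational point of the paper.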
APA, Harvard, Vancouver, ISO, and other styles
44

Abhiram, B. "Characterization of the Voice Source by the DCT for Speaker Information." Thesis, 2014. http://etd.iisc.ac.in/handle/2005/2894.

Full text
Abstract:
Extracting speaker-specific information from speech is of great interest to both researchers and developers alike, since speaker recognition technology finds application in a wide range of areas, primary among them being forensics and biometric security systems. Several models and techniques have been employed to extract speaker information from the speech signal. Speech production is generally modeled as an excitation source followed by a filter. Physiologically, the source corresponds to the vocal fold vibrations and the filter corresponds to the spectrum-shaping vocal tract. Vocal tract-based features like the mel-frequency cepstral coefficients (MFCCs) and linear prediction cepstral coefficients have been shown to contain speaker information. However, high-speed videos of the larynx show that the vocal folds of different individuals vibrate differently. Voice source (VS)-based features have also been shown to perform well in speaker recognition tasks, thereby revealing that the VS does contain speaker information. Moreover, a combination of the vocal tract and VS-based features has been shown to give an improved performance, showing that the latter contains supplementary speaker information. In this study, the focus is on extracting speaker information from the VS. The existing techniques for the same are reviewed, and it is observed that the features which are obtained by fitting a time-domain model on the VS perform more poorly than those obtained by simple transformations of the VS. Here, an attempt is made to propose an alternate way of characterizing the VS to extract speaker information, and to study the merits and shortcomings of the proposed speaker-specific features. The VS cannot be measured directly. Thus, to characterize the VS, we first need an estimate of the VS, and the integrated linear prediction residual (ILPR) extracted from the speech signal is used as the VS estimate in this study.
The voice source linear prediction model, which was proposed in an earlier study to obtain the ILPR, is used in this work. It is hypothesized here that a speaker’s voice may be characterized by the relative proportions of the harmonics present in the VS. The pitch synchronous discrete cosine transform (DCT) is shown to capture these, and the gross shape of the ILPR in a few coefficients. The ILPR and hence its DCT coefficients are visually observed to distinguish between speakers. However, it is also observed that they do have intra-speaker variability, and thus it is hypothesized that the distribution of the DCT coefficients may capture speaker information, and this distribution is modeled by a Gaussian mixture model (GMM). The DCT coefficients of the ILPR (termed the DCTILPR) are directly used as a feature vector in speaker identification (SID) tasks. Issues related to the GMM, like the type of covariance matrix, are studied, and it is found that diagonal covariance matrices perform better than full covariance matrices. Thus, mixtures of Gaussians having diagonal covariances are used as speaker models, and by conducting SID experiments on three standard databases, it is found that the proposed DCTILPR features fare comparably with the existing VS-based features. It is also found that the gross shape of the VS contains most of the speaker information, and the very fine structure of the VS does not help in distinguishing speakers, and instead leads to more confusion between speakers. The major drawbacks of the DCTILPR are the session and handset variability, but they are also present in existing state-of-the-art speaker-specific VS-based features and the MFCCs, and hence seem to be common problems. There are techniques to compensate these variabilities, which need to be used when the systems using these features are deployed in an actual application. 
The DCTILPR is found to improve the SID accuracy of a system trained with MFCC features by 12%, indicating that the DCTILPR features capture speaker information which is missed by the MFCCs. It is also found that a combination of MFCC and DCTILPR features on a speaker verification task gives significant performance improvement in the case of short test utterances. Thus, on the whole, this study proposes an alternate way of extracting speaker information from the VS, and adds to the evidence for speaker information present in the VS.
APA, Harvard, Vancouver, ISO, and other styles
45

Abhiram, B. "Characterization of the Voice Source by the DCT for Speaker Information." Thesis, 2014. http://etd.iisc.ernet.in/handle/2005/2894.

Full text
Abstract:
Extracting speaker-specific information from speech is of great interest to both researchers and developers alike, since speaker recognition technology finds application in a wide range of areas, primary among them being forensics and biometric security systems. Several models and techniques have been employed to extract speaker information from the speech signal. Speech production is generally modeled as an excitation source followed by a filter. Physiologically, the source corresponds to the vocal fold vibrations and the filter corresponds to the spectrum-shaping vocal tract. Vocal tract-based features like the mel-frequency cepstral coefficients (MFCCs) and linear prediction cepstral coefficients have been shown to contain speaker information. However, high-speed videos of the larynx show that the vocal folds of different individuals vibrate differently. Voice source (VS)-based features have also been shown to perform well in speaker recognition tasks, thereby revealing that the VS does contain speaker information. Moreover, a combination of the vocal tract and VS-based features has been shown to give an improved performance, showing that the latter contains supplementary speaker information. In this study, the focus is on extracting speaker information from the VS. The existing techniques for the same are reviewed, and it is observed that the features which are obtained by fitting a time-domain model on the VS perform more poorly than those obtained by simple transformations of the VS. Here, an attempt is made to propose an alternate way of characterizing the VS to extract speaker information, and to study the merits and shortcomings of the proposed speaker-specific features. The VS cannot be measured directly. Thus, to characterize the VS, we first need an estimate of the VS, and the integrated linear prediction residual (ILPR) extracted from the speech signal is used as the VS estimate in this study.
The voice source linear prediction model, which was proposed in an earlier study to obtain the ILPR, is used in this work. It is hypothesized here that a speaker’s voice may be characterized by the relative proportions of the harmonics present in the VS. The pitch synchronous discrete cosine transform (DCT) is shown to capture these, and the gross shape of the ILPR in a few coefficients. The ILPR and hence its DCT coefficients are visually observed to distinguish between speakers. However, it is also observed that they do have intra-speaker variability, and thus it is hypothesized that the distribution of the DCT coefficients may capture speaker information, and this distribution is modeled by a Gaussian mixture model (GMM). The DCT coefficients of the ILPR (termed the DCTILPR) are directly used as a feature vector in speaker identification (SID) tasks. Issues related to the GMM, like the type of covariance matrix, are studied, and it is found that diagonal covariance matrices perform better than full covariance matrices. Thus, mixtures of Gaussians having diagonal covariances are used as speaker models, and by conducting SID experiments on three standard databases, it is found that the proposed DCTILPR features fare comparably with the existing VS-based features. It is also found that the gross shape of the VS contains most of the speaker information, and the very fine structure of the VS does not help in distinguishing speakers, and instead leads to more confusion between speakers. The major drawbacks of the DCTILPR are the session and handset variability, but they are also present in existing state-of-the-art speaker-specific VS-based features and the MFCCs, and hence seem to be common problems. There are techniques to compensate these variabilities, which need to be used when the systems using these features are deployed in an actual application. 
The DCTILPR is found to improve the SID accuracy of a system trained with MFCC features by 12%, indicating that the DCTILPR features capture speaker information which is missed by the MFCCs. It is also found that a combination of MFCC and DCTILPR features on a speaker verification task gives significant performance improvement in the case of short test utterances. Thus, on the whole, this study proposes an alternate way of extracting speaker information from the VS, and adds to the evidence for speaker information present in the VS.
APA, Harvard, Vancouver, ISO, and other styles
We offer discounts on all premium plans for authors whose works are included in thematic literature selections. Contact us to get a unique promo code!

To the bibliography