Dissertations / Theses on the topic 'Speech Activity Detection (SAD)'

Author: Grafiati

Published: 24 June 2021

Last updated: 29 July 2025

Consult the top 26 dissertations / theses for your research on the topic 'Speech Activity Detection (SAD).'

1

Näslund, Anton, and Charlie Jeansson. "Robust Speech Activity Detection and Direction of Arrival Using Convolutional Neural Networks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-297756.

Abstract:

Social robots are becoming more and more common in our everyday lives. In the field of conversational robotics, development is moving towards socially engaging robots with humanlike conversation. This project looked into one of the technical aspects of recognizing speech, namely speech activity detection (SAD). The presented solution uses a convolutional neural network (CNN) based system to detect speech in a forward azimuth area. The project used a dataset from FestVox called CMU Arctic, which was complemented with added recorded noises. A library called Pyroomacoustics was used to simulate

2

Wejdelind, Marcus, and Nils Wägmark. "Multi-speaker Speech Activity Detection From Video." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-297701.

Abstract:

A conversational robot will in many cases have to deal with multi-party spoken interaction in which one or more people could be speaking simultaneously. To do this, the robot must be able to identify the speakers in order to attend to them. Our project has approached this problem from a visual point of view where a Convolutional Neural Network (CNN) was implemented and trained using video stream input containing one or more faces from an already existing data set (AVA-Speech). The goal for the network has then been to, for each face and at each point in time, detect the probability of that person speak

3

Murrin, Paul. "Objective measurement of voice activity detectors." Thesis, University of York, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.325647.

4

Laverty, Stephen William. "Detection of Nonstationary Noise and Improved Voice Activity Detection in an Automotive Hands-free Environment." Link to electronic thesis, 2005. http://www.wpi.edu/Pubs/ETD/Available/etd-051105-110646/.

5

Minotto, Vicente Peruffo. "Audiovisual voice activity detection and localization of simultaneous speech sources." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2013. http://hdl.handle.net/10183/77231.

Abstract:

Given the trend toward creating interfaces between humans and machines that increasingly allow simple means of interaction, it is natural that research is carried out on techniques that seek to simulate the most conventional means of communication that humans use: speech. In the human auditory system, voice is automatically processed by the brain effectively and easily, and this is also commonly aided by visual information, such as lip movement and the location of the speakers. This processing performed by the brain includes two important components that speech-based communication requires: Det

6

Ent, Petr. "Voice Activity Detection." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2009. http://www.nusl.cz/ntk/nusl-235483.

Abstract:

This thesis deals with the use of support vector machines in voice activity detection. The first part examines various kinds of features, their extraction and processing, and finds the optimal combination that gives the best results. The second part presents the voice activity detection system itself and the tuning of its parameters. Finally, the results are compared with two other systems based on different principles. The ERT broadcast news database was used for testing and tuning. The comparison between the systems was then carried out on the database from the NIST06 Rich Test Evaluations.

7

Cho, Yong Duk. "Speech detection, enhancement and compression for voice communications." Thesis, University of Surrey, 2001. http://epubs.surrey.ac.uk/842991/.

Abstract:

Speech signal processing for voice communications can be characterised in terms of silence compression, noise reduction, and speech compression. The limit in the channel bandwidth of voice communication systems requires efficient compression of speech and silence signals while retaining the voice quality. Silence compression by means of both voice activity detection (VAD) and comfort noise generation could present transparent speech-quality while substantially lowering the transmission bit-rate, since pause regions between talk spurts do not include any voice information. Thus, this thesis pro

8

Doukas, Nikolaos. "Voice activity detection using energy based measures and source separation." Thesis, Imperial College London, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.245220.

9

Sinclair, Mark. "Speech segmentation and speaker diarisation for transcription and translation." Thesis, University of Edinburgh, 2016. http://hdl.handle.net/1842/20970.

Abstract:

This dissertation outlines work related to Speech Segmentation – segmenting an audio recording into regions of speech and non-speech, and Speaker Diarization – further segmenting those regions into those pertaining to homogeneous speakers. Knowing not only what was said but also who said it and when, has many useful applications. As well as providing a richer level of transcription for speech, we will show how such knowledge can improve Automatic Speech Recognition (ASR) system performance and can also benefit downstream Natural Language Processing (NLP) tasks such as machine translation and p

10

VECCHIOTTI, PAOLO. "Deep neural networks for speech detection and speaker localization in reverberant environments." Doctoral thesis, Università Politecnica delle Marche, 2019. http://hdl.handle.net/11566/263049.

Abstract:

This thesis addresses Voice Activity Detection (VAD) and Speaker LOCalization (SLOC) in reverberant environments. The work is characterized by a data-driven approach, and for this reason deep neural networks are extensively exploited and analyzed. Although several classical algorithms have long been used for VAD and SLOC, recent advances in machine learning applied to audio have led to encouraging results for both tasks. Consequently, this thesis proposes numerous successful strategies for VAD and SLOC based on

11

Thorell, Hampus. "Voice Activity Detection in the Tiger Platform." Thesis, Linköping University, Department of Electrical Engineering, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-6586.

Abstract:

Sectra Communications AB has developed a terminal for encrypted communication called the Tiger platform. During voice communication, delays have sometimes been experienced, resulting in conversational complications.

A solution to this problem, as was proposed by Sectra, would be to introduce voice activity detection, which means a separation of speech parts and non-speech parts of the input signal, to the Tiger platform. By only transferring the speech parts to the receiver, the bandwidth needed should be dramatically decreased. A lower bandwidth needed implies that the delays slowly sh

12

Mancini, Eleonora. "Disruptive Situations Detection on Public Transports through Speech Emotion Recognition." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/24721/.

Abstract:

In this thesis, we describe a study on the application of Machine Learning and Deep Learning methods for Voice Activity Detection (VAD) and Speech Emotion Recognition (SER). The study is in the context of a European project whose objective is to detect disruptive situations in public transports. To this end, we developed an architecture, implemented a prototype and ran validation tests on a variety of options. The architecture consists of several modules. The denoising module was realized through the use of a filter and the VAD module through an open-source toolkit, while the SER system was

13

Cooper, Douglas. "Speech Detection using Gammatone Features and One-Class Support Vector Machine." Master's thesis, University of Central Florida, 2013. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/5923.

Abstract:

A network gateway is a mechanism which provides protocol translation and/or validation of network traffic using the metadata contained in network packets. For media applications such as Voice-over-IP, the portion of the packets containing speech data cannot be verified and can provide a means of maliciously transporting code or sensitive data undetected. One solution to this problem is Voice Activity Detection (VAD). Many VADs rely on time-domain features and simple thresholds for efficient speech detection; however, this doesn't say much about the signal being passed. More sophisticate

14

Temko, Andriy. "Acoustic event detection and classification." Doctoral thesis, Universitat Politècnica de Catalunya, 2007. http://hdl.handle.net/10803/6880.

Abstract:

Human activity taking place in meeting rooms or classrooms is reflected in a rich variety of acoustic events, produced either by the human body or by objects that people handle. Determining the identity of sounds and their position in time can therefore help to detect and describe the human activity taking place in the room. Furthermore, detecting sounds other than speech can help improve the robustness of speech technologies, such as automatic recognition, under adverse working conditions. The goal of this thesis is the detection and classification

15

Danko, Michal. "Identifikace hudby, řeči, křiku, zpěvu v audio (video) záznamu." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2016. http://www.nusl.cz/ntk/nusl-255309.

Abstract:

This thesis follows the trend of recent decades in using neural networks to detect speech in noisy data. The text begins with basic knowledge about the topics discussed, such as audio features, machine learning and neural networks. The network parameters are examined in order to provide the most suitable background for the experiments. The main focus of the experiments is to observe the influence of various sound events on speech detection on a small, diverse database. The sound events correlated with speech proved to be the most beneficial. In addition, the accuracy of the acou

16

Podloucká, Lenka. "Identifikace pauz v rušeném řečovém signálu." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2008. http://www.nusl.cz/ntk/nusl-217266.

Abstract:

This diploma thesis deals with pause identification in degraded speech signals. It describes the characteristics of speech and the concept of speech signal processing. The aim of the work was to create a reliable method for recognizing speech and non-speech segments of a speech signal, both with and without degradation. Five empty-pause detectors were implemented in the MATLAB computing environment: an energy detector in the time domain, a two-step detector in the spectral domain, a one-step integral detector, a two-step integral detector, and a differential detector in the cepstrum. The s

17

Min-Chang, Chang, and 張民昌. "Voice Activity Detection and Its Application to Speech Coding." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/86106727484751378312.

Abstract:

Master's thesis, National Taipei University of Technology, Department of Electrical Engineering (master's program), academic year 91.

A voice activity detector usually serves as the preprocessor of a speech encoder in order to determine whether the incoming signal is a speech segment or not. If it is, a normal speech coder is used to encode the speech segment. If it is not, fewer parameters, called a silence insertion descriptor (SID), need to be transmitted to the decoder, and a comfort noise generator (CNG) is then used to mimic the background noise. According to statistics on how people talk, more than 40% and even as much as 60% of the time is silence between talk spurts, so lots of bit ra

18

Lai, Chen-Wei, and 賴辰瑋. "The Research on the Voice Activity Detection and Speech Enhancement for Noisy Speech Recognition." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/06070640933840270072.

Abstract:

Master's thesis, National Chi Nan University, Department of Electrical Engineering, academic year 93.

When a speech recognizer is applied in a real environment, its performance is often seriously degraded by additive noise. In order to improve the robustness of the recognition system under noisy conditions, various approaches have been proposed. One direction among these approaches is to detect the presence of noise, estimate its characteristics, and then remove or alleviate the noise in the speech signals. In the thesis, we first study several voice activity detection (endpoint detection) approaches, which

19

WU, DONG-HAN, and 吳東翰. "Reduced Computation of Speech Coder Using a Voice Activity Detection Algorithm." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/83961092562227492103.

Abstract:

Master's thesis, Southern Taiwan University of Science and Technology, Department of Computer Science and Information Engineering, academic year 105.

With the explosive growth of Internet use and multimedia technology, multimedia communication is nowadays integrated into a personal information machine, and due to the latter’s limited computational capability, the need has arisen for a coder with low computational complexity that can match different hardware platforms and integrate the services of media sources. For an Internet or wireless speech communicator, heavy computation uses more power and contributes to higher pricing of the communicator or reduced battery life. In order to achieve the real-time and continuity o

20

Tu, Wen Hsiang, and 杜文祥. "Study on the Voice Activity Detection Techniques for Robust Speech Feature Extraction." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/76966247400637028949.

Abstract:

Master's thesis, National Chi Nan University, Department of Electrical Engineering, academic year 95.

The performance of a speech recognition system is often degraded due to the mismatch between the environments of development and application. One of the major sources that gives rise to this mismatch is additive noise. The approaches for handling the problem of additive noise can be divided into three classes: speech enhancement, robust speech feature extraction, and compensation of speech models. In this thesis, we focus on the second class, robust speech feature extraction. The approaches of speech robust feature extraction are often together with the

21

楊佳興. "A Real-time Speech Purification and Voice Activity Detection System Using Microphone Array." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/qy6qq9.

22

Hsuei, Yan-Jung, and 許晏榮. "SOPC Implementation of Speech Purification and Voice Activity Detection System Using Microphone Array." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/mkjwd4.

Abstract:

Master's thesis, National Chiao Tung University, Department of Electrical and Control Engineering, academic year 94.

A real-time speech purification and voice activity detection (VAD) system for noisy indoor environments is proposed in this thesis. The system contains a real-time eight-channel microphone array signal processing platform. An adaptive spatial filter is also designed on the platform to give the system the ability to adapt to environmental characteristics and noise. All the algorithms are realized on a Nios embedded system-on-programmable-chip (SOPC) platform. The VAD algorithm is executed by the Nios processor and the adaptive filter is accelerated by

23

Chen, Hung-Bin, and 陳鴻彬. "On the Study of Energy-Based Speech Feature Normalization and Application to Voice Activity Detection." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/41039482721804356460.

Abstract:

Master's thesis, National Taiwan Normal University, Graduate Institute of Computer Science and Information Engineering, academic year 95.

This thesis considered robust speech recognition in various noise environments, with a special focus on investigating ways to reconstruct the clean time-domain log-energy features from the noise-contaminated ones. Based on the distribution characteristics of the log-energy features of each speech utterance, we aimed to develop an efficient approach to rescale the log-energy features of the noisy speech utterance so as to alleviate the mismatch caused by environmental noises for better speech recognition performance. As the time-domain phenomena of the spe

24

ZHENG, SU-XING, and 鄭素幸. "A study on wireless digital subscriber loop and the channel sharing efficiency through speech activity detection." Thesis, 1992. http://ndltd.ncl.edu.tw/handle/03260404361617659124.

25

Naini, Abinay Reddy. "Speaker verification using whispered speech." Thesis, 2020. https://etd.iisc.ac.in/handle/2005/5516.

Abstract:

Like neutral speech, whispered speech is one of the natural modes of speech production, and it is often used by speakers in their day-to-day life. For some people, such as laryngectomees, whispered speech is the only mode of communication. Despite the absence of voicing in whispered speech and difference in characteristics compared to the neutral speech, previous works in the literature demonstrated that whispered speech contains adequate information about the content and the speaker. In recent times, virtual assistants have become more natural and widespread. This led to an increase in

26

Venter, Petrus Jacobus. "Recording and automatic detection of African elephant (Loxodonta africana) infrasonic rumbles." Diss., 2008. http://hdl.handle.net/2263/28329.

Abstract:

The value of studying elephant vocalizations lies in the abundant information that can be retrieved from them. Recordings of elephant rumbles can be used by researchers to determine the size and composition of the herd, the sexual state, as well as the emotional condition of an elephant. It is a difficult task for researchers to obtain large volumes of continuous recordings of elephant vocalizations. Recordings are normally analysed manually to identify the location of rumbles via the tedious and time-consuming methods of sped-up listening and the visual evaluation of spectrograms. The applicati
