Academic literature on the topic 'Generative audio models'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Generative audio models.'

Journal articles on the topic "Generative audio models"

1

Evans, Zach, Scott H. Hawley, and Katherine Crowson. "Musical audio samples generated from joint text embeddings." Journal of the Acoustical Society of America 152, no. 4 (2022): A178. http://dx.doi.org/10.1121/10.0015956.

Abstract:
The field of machine learning has benefited from the appearance of diffusion-based generative models for images and audio. While text-to-image models have become increasingly prevalent, text-to-audio generative models are currently an active area of research. We present work on generating short samples of musical instrument sounds generated by a model which was conditioned on text descriptions and the file structure labels of large sample libraries. Preliminary findings indicate that generation of wide-spectrum sounds such as percussion is not difficult, while the generation of harmonic music
2

Kang, Hyunju, Geonhee Han, Yoonjae Jeong, and Hogun Park. "AudioGenX: Explainability on Text-to-Audio Generative Models." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 17 (2025): 17733–41. https://doi.org/10.1609/aaai.v39i17.33950.

Abstract:
Text-to-audio generation models (TAG) have achieved significant advances in generating audio conditioned on text descriptions. However, a critical challenge lies in the lack of transparency regarding how each textual input impacts the generated audio. To address this issue, we introduce AudioGenX, an Explainable AI (XAI) method that provides explanations for text-to-audio generation models by highlighting the importance of input tokens. AudioGenX optimizes an Explainer by leveraging factual and counterfactual objective functions to provide faithful explanations at the audio token level. This m
3

Samson, Grzegorz. "Perspectives on Generative Sound Design: A Generative Soundscapes Showcase." Arts 14, no. 3 (2025): 67. https://doi.org/10.3390/arts14030067.

Abstract:
Recent advancements in generative neural networks, particularly transformer-based models, have introduced novel possibilities for sound design. This study explores the use of generative pre-trained transformers (GPT) to create complex, multilayered soundscapes from textual and visual prompts. A custom pipeline is proposed, featuring modules for converting the source input into structured sound descriptions and subsequently generating cohesive auditory outputs. As a complementary solution, a granular synthesizer prototype was developed to enhance the usability of generative audio samples by ena
4

Jeong, Yujin, Yunji Kim, Sanghyuk Chun, and Jiyoung Lee. "Read, Watch and Scream! Sound Generation from Text and Video." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 17 (2025): 17590–98. https://doi.org/10.1609/aaai.v39i17.33934.

Abstract:
Despite the impressive progress of multimodal generative models, video-to-audio generation still suffers from limited performance and limits the flexibility to prioritize sound synthesis for specific objects within the scene. Conversely, text-to-audio generation methods generate high-quality audio but pose challenges in ensuring comprehensive scene depiction and time-varying control. To tackle these challenges, we propose a novel video-and-text-to-audio generation method, called ReWaS, where video serves as a conditional control for a text-to-audio generation model. Especially, our method esti
5

Wang, Heng, Jianbo Ma, Santiago Pascual, Richard Cartwright, and Weidong Cai. "V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 14 (2024): 15492–501. http://dx.doi.org/10.1609/aaai.v38i14.29475.

Abstract:
Building artificial intelligence (AI) systems on top of a set of foundation models (FMs) is becoming a new paradigm in AI research. Their representative and generative abilities learnt from vast amounts of data can be easily adapted and transferred to a wide range of downstream tasks without extra training from scratch. However, leveraging FMs in cross-modal generation remains under-researched when audio modality is involved. On the other hand, automatically generating semantically-relevant sound from visual input is an important problem in cross-modal generation studies. To solve this vision-
6

Ji, Wenliang, Ming Jin, and Yixin Chen. "Optimization of Digital Media Content Generation and Communication Effect Combined with Deep Learning Technology." Journal of Combinatorial Mathematics and Combinatorial Computing 127a (April 15, 2025): 1449–66. https://doi.org/10.61091/jcmcc127a-084.

Abstract:
The combination of deep learning and digital media technology provides great scope for content creation. The article uses Generative Adversarial Network (GAN) in deep learning for content generation. Based on the three major forms of digital media content (image, audio, and video), image, audio, and video are generated by U-Net_GAN model, MAS-GAN model, and SSFLVGAN model, respectively, to construct a digital media content generation model based on generative adversarial networks. Subsequently, the model is validated for performance and the generated images, audio and video are evaluated for e
7

Sakirin, Tam, and Siddartha Kusuma. "A Survey of Generative Artificial Intelligence Techniques." Babylonian Journal of Artificial Intelligence 2023 (March 10, 2023): 10–14. http://dx.doi.org/10.58496/bjai/2023/003.

Abstract:
Generative artificial intelligence (AI) refers to algorithms capable of creating novel, realistic digital content autonomously. Recently, generative models have attained groundbreaking results in domains like image and audio synthesis, spurring vast interest in the field. This paper surveys the landscape of modern techniques powering the rise of creative AI systems. We structurally examine predominant algorithmic approaches including generative adversarial networks (GANs), variational autoencoders (VAEs), and autoregressive models. Architectural innovations and illustrations of generated outpu
8

Broad, Terence, Frederic Fol Leymarie, and Mick Grierson. "Network Bending: Expressive Manipulation of Generative Models in Multiple Domains." Entropy 24, no. 1 (2021): 28. http://dx.doi.org/10.3390/e24010028.

Abstract:
This paper presents the network bending framework, a new approach for manipulating and interacting with deep generative models. We present a comprehensive set of deterministic transformations that can be inserted as distinct layers into the computational graph of a trained generative neural network and applied during inference. In addition, we present a novel algorithm for analysing the deep generative model and clustering features based on their spatial activation maps. This allows features to be grouped together based on spatial similarity in an unsupervised fashion. This results in the mean
9

Cao, Yongnian, Xuechun Yang, and Rui Sun. "Generative AI Models Theoretical Foundations and Algorithmic Practices." Journal of Industrial Engineering and Applied Science 3, no. 1 (2025): 1–9. https://doi.org/10.70393/6a69656173.323633.

Abstract:
Generative models in AI are an entirely new paradigm for machine learning, allowing computers to create realistic data in all kinds of categories, like text (NLP), images, and even physics simulations. In this paper this formalism is used to guide the theory, algorithms and applications of generative models, with particular focus on a few well established techniques like VAEs, GANs, and diffusion models. It stresses the importance of probabilistic generative modelling and information theory (i.e., KL divergence, ELBO, adversarial optimization, etc.). We cover algorithmic practices such as optimi
10

Aldausari, Nuha, Arcot Sowmya, Nadine Marcus, and Gelareh Mohammadi. "Video Generative Adversarial Networks: A Review." ACM Computing Surveys 55, no. 2 (2023): 1–25. http://dx.doi.org/10.1145/3487891.

Abstract:
With the increasing interest in the content creation field in multiple sectors such as media, education, and entertainment, there is an increased trend in the papers that use AI algorithms to generate content such as images, videos, audio, and text. Generative Adversarial Networks (GANs) is one of the promising models that synthesizes data samples that are similar to real data samples. While the variations of GANs models in general have been covered to some extent in several survey papers, to the best of our knowledge, this is the first paper that reviews the state-of-the-art video GANs models

Dissertations / Theses on the topic "Generative audio models"

1

Douwes, Constance. "On the Environmental Impact of Deep Generative Models for Audio." Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS074.

Abstract:
This thesis studies the environmental impact of deep learning models for audio generation and aims to put computational cost at the heart of the evaluation process. In particular, we focus on different types of deep learning models specialized in the synthesis of raw audio waveforms. These models are now a key component of modern audio systems, and their use has increased considerably in recent years. Their flexibility and generalization capabilities make them powerful tools in many contexts, from the synthesis of
2

Caillon, Antoine. "Hierarchical temporal learning for multi-instrument and orchestral audio synthesis." Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS115.

Abstract:
Recent progress in machine learning has enabled the emergence of new types of models suited to many tasks, through the optimization of a set of parameters aimed at minimizing a cost function. Among these techniques, probabilistic generative models have enabled notable advances in the generation of text, images, and sound. However, the generation of musical audio signals remains a challenge. This stems from the intrinsic complexity of audio signals, a single second of raw audio comprising tens of thousands of individual samples.
3

Nishikimi, Ryo. "Generative, Discriminative, and Hybrid Approaches to Audio-to-Score Automatic Singing Transcription." Doctoral thesis, Kyoto University, 2021. http://hdl.handle.net/2433/263772.

4

Chemla Romeu Santos, Axel Claude André. "Manifold Representations of Musical Signals and Generative Spaces." Doctoral thesis, Università degli Studi di Milano, 2020. http://hdl.handle.net/2434/700444.

Abstract:
Among the various research fields within computer music, the synthesis and generation of audio signals embodies the multidisciplinary nature of this area, having nourished both scientific and musical practice since its creation. Inherent to computer science since its creation, audio generation has inspired numerous approaches, evolving with musical practice and with technological and scientific progress. Moreover, some synthesis processes also allow the inverse process, called analysis, so that the synthesis parameters can also be partially or totally extracted
5

Guenebaut, Boris. "Automatic Subtitle Generation for Sound in Videos." Thesis, University West, Department of Economics and IT, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:hv:diva-1784.

Abstract:
The last ten years have been the witnesses of the emergence of any kind of video content. Moreover, the appearance of dedicated websites for this phenomenon has increased the importance the public gives to it. In the same time, certain individuals are deaf and occasionally cannot understand the meanings of such videos because there is not any text transcription available. Therefore, it is necessary to find solutions for the purpose of making these media artefacts accessible for most people. Several software propose utilities to create subtitles for videos but all require an extensive partic
6

Scarlato, Michele. "Sicurezza di rete, analisi del traffico e monitoraggio." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2012. http://amslaurea.unibo.it/3223/.

Abstract:
The work was divided into three macro-areas. The first concerns a theoretical analysis of how intrusions work, which software is used to carry them out, and how to protect against them (using the devices that can generically be described as firewalls). The second macro-area analyzes an intrusion carried out from outside against sensitive servers on a LAN. This analysis is conducted on the files captured by the two network interfaces configured in promiscuous mode on a probe located in the LAN. There are two interfaces so as to be able to interface
7

Mehri, Soroush. "Sequential modeling, generative recurrent neural networks, and their applications to audio." Thèse, 2016. http://hdl.handle.net/1866/18762.


Books on the topic "Generative audio models"

1

Osipov, Vladimir. Control and audit of the activities of a commercial organization: external and internal. INFRA-M Academic Publishing LLC., 2021. http://dx.doi.org/10.12737/1137320.

Abstract:
The textbook reveals the role of control in ensuring the effective operation of a commercial organization, and sets its purpose and objectives. The main directions of external and internal control of the activities of a commercial organization are defined and the characteristics of the functions performed by them are given. The basic principles of external and internal audit are formulated, their purpose is defined, and the procedure for regulatory and legal regulation of audit activities in the Russian Federation is considered. The features of control over the activities of a commercial organ
2

Kazimagomedov, Abdulla, Aida Abdulsalamova, M. Mel'nikov, and N. Gadzhiev. Analysis of the activities of a commercial bank. INFRA-M Academic Publishing LLC., 2022. http://dx.doi.org/10.12737/1831614.

Abstract:
The textbook presents modern ideas about the analysis of the activities of a commercial bank, in particular, the theoretical and practical issues related to the organization of internal control and audit, analysis of banking operations and services, customer base and creditworthiness of borrowers, banking risks, regulatory requirements of the Central Bank of the Russian Federation and interest rates, financial condition and financial results of a commercial bank are comprehensively disclosed et al. Meets the requirements of the federal state educational standards of higher education of
3

Nikiforova, Elena, Lyudmila Kupriyanova, and Ol'ga Shnayder. Management accounting and analysis. INFRA-M Academic Publishing LLC., 2025. https://doi.org/10.12737/2122904.

Abstract:
The conceptual foundations of management accounting and analysis are revealed, the directions of theoretical and practical skills in assessing and determining effective management decisions are determined. The textbook material forms and consolidates the knowledge of the conceptual framework of managerial accounting and analysis. The main approaches to information support are described; applied tools aimed at ensuring the financial stability of a business are presented, practical situations that reveal its work, approaches to assessing the effectiveness of financial and economic activities bas
4

Kerouac, Jack. Big-Sur: Roman. Azbuka, 2013.

5

Colmeiro, José. Peripheral Visions / Global Sounds. Liverpool University Press, 2018. http://dx.doi.org/10.5949/liverpool/9781786940308.001.0001.

Abstract:
Galician audio/visual culture has experienced an unprecedented period of growth following the process of political and cultural devolution in post-Franco Spain. This creative explosion has occurred in a productive dialogue with global currents and with considerable projection beyond the geopolitical boundaries of the nation and the state, but these seismic changes are only beginning to be the subject of attention of cultural and media studies. This book examines contemporary audio/visual production in Galicia as privileged channels through which modern Galician cultural identities have been im
6

Aguayo, Angela J. Documentary Resistance. Oxford University Press, 2019. http://dx.doi.org/10.1093/oso/9780190676216.001.0001.

Abstract:
The potential of documentary moving images to foster democratic exchange has been percolating within media production culture for the last century, and now, with mobile cameras at our fingertips and broadcasts circulating through unpredictable social networks, the documentary impulse is coming into its own as a political force of social change. The exploding reach and power of audio and video are multiplying documentary modes of communication. Once considered an outsider media practice, documentary is finding mass appeal in the allure of moving images, collecting participatory audiences that c
7

Kerouac, Jack. Big Sur. Penguin Publishing Group, 2013.

8

Kerouac, Jack. Big Sur (Flamingo Modern Classics). Flamingo, 2001.

9

Kerouac, Jack. Big Sur. Blackstone Audiobooks, 2002.

10

Kerouac, Jack. Big Sur. McGraw-Hill Companies, 1990.


Book chapters on the topic "Generative audio models"

1

Huzaifah, Muhammad, and Lonce Wyse. "Deep Generative Models for Musical Audio Synthesis." In Handbook of Artificial Intelligence for Music. Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-72116-9_22.

2

Gallagher, Sean, Ben Gelman, Salma Taoufiq, et al. "Phishing and Social Engineering in the Age of LLMs." In Large Language Models in Cybersecurity. Springer Nature Switzerland, 2024. http://dx.doi.org/10.1007/978-3-031-54827-7_8.

Abstract:
The human factor remains a major vulnerability in cybersecurity. This chapter explores the escalating threats that Large Language Models (LLMs) pose in the field of cybercrime, particularly in phishing and social engineering. Due to their ability to generate highly convincing and individualized content, LLMs enhance the effectiveness and scale of phishing attacks, making them increasingly difficult to detect. The integration of multimodal generative models allows malicious actors to leverage AI-generated text, images, and audio, increasing attack avenues and making attacks more convinc
3

Boll, Antônio Oss, Letícia Maria Puttlitz, Heloísa Oss Boll, and Rodrigo Mor Malossi. "Beyond Audio Signals: Generative Model-Based Speaker Diarization in Portuguese." In Lecture Notes in Computer Science. Springer Nature Switzerland, 2025. https://doi.org/10.1007/978-3-031-79029-4_17.

4

Ye, Sheng, Yu-Hui Wen, Yanan Sun, et al. "Audio-Driven Stylized Gesture Generation with Flow-Based Model." In Lecture Notes in Computer Science. Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-20065-6_41.

5

Wyse, Lonce, Purnima Kamath, and Chitralekha Gupta. "Sound Model Factory: An Integrated System Architecture for Generative Audio Modelling." In Artificial Intelligence in Music, Sound, Art and Design. Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-03789-4_20.

6

Farkas, Michal, and Peter Lacko. "Using Advanced Audio Generating Techniques to Model Electrical Energy Load." In Engineering Applications of Neural Networks. Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-65172-9_4.

7

Golani, Mati, and Shlomit S. Pinter. "Generating a Process Model from a Process Audit Log." In Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2003. http://dx.doi.org/10.1007/3-540-44895-0_10.

8

Ma, Bin, Weixun Li, Huifeng Li, et al. "Generate Unnoticeable Adversarial Examples on Audio Classification Models with Multi-perspective Objectives." In Lecture Notes in Networks and Systems. Springer Nature Switzerland, 2024. https://doi.org/10.1007/978-3-031-74443-3_20.

9

de Berardinis, Jacopo, Valentina Anita Carriero, Nitisha Jain, et al. "The Polifonia Ontology Network: Building a Semantic Backbone for Musical Heritage." In The Semantic Web – ISWC 2023. Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-47243-5_17.

Abstract:
In the music domain, several ontologies have been proposed to annotate musical data, in both symbolic and audio form, and generate semantically rich Music Knowledge Graphs. However, current models lack interoperability and are insufficient for representing music history and the cultural heritage context in which it was generated; risking the propagation of recency and cultural biases to downstream applications. In this article, we propose the Polifonia Ontology Network (PON) for music cultural heritage, centred around four modules: Music Meta (metadata), Representation (content), Sourc
10

Kim, Sang-Kyun, Doo Sun Hwang, Ji-Yeun Kim, and Yang-Seock Seo. "An Effective News Anchorperson Shot Detection Method Based on Adaptive Audio/Visual Model Generation." In Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2005. http://dx.doi.org/10.1007/11526346_31.


Conference papers on the topic "Generative audio models"

1

Roman, Robin San, Pierre Fernandez, Antoine Deleforge, Yossi Adi, and Romain Serizel. "Latent Watermarking of Audio Generative Models." In ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2025. https://doi.org/10.1109/icassp49660.2025.10889782.

2

Akman, Alican, Qiyang Sun, and Björn W. Schuller. "Audio Explanation Synthesis with Generative Foundation Models." In ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2025. https://doi.org/10.1109/icassp49660.2025.10890370.

3

Yang, Qian, Jin Xu, Wenrui Liu, et al. "AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension." In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, 2024. http://dx.doi.org/10.18653/v1/2024.acl-long.109.

4

Heydari, Mojtaba, Mehrez Souden, Bruno Conejo, and Joshua Atkins. "ImmerseDiffusion: A Generative Spatial Audio Latent Diffusion Model." In ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2025. https://doi.org/10.1109/icassp49660.2025.10889311.

5

Kushwaha, Saksham Singh, Jianbo Ma, Mark R. P. Thomas, Yapeng Tian, and Avery Bruni. "Diff-SAGe: End-to-End Spatial Audio Generation Using Diffusion Models." In ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2025. https://doi.org/10.1109/icassp49660.2025.10888882.

6

Liang, Yiwei, and Ming Li. "Vivid Background Audio Generation Based on Large Language Models and AudioLDM." In 2024 IEEE 14th International Symposium on Chinese Spoken Language Processing (ISCSLP). IEEE, 2024. https://doi.org/10.1109/iscslp63861.2024.10800334.

7

Gao, Yiming. "A systematic research of text-to-audio generation with diffusion models." In Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), edited by Haiquan Zhao and Lei Chen. SPIE, 2025. https://doi.org/10.1117/12.3053123.

8

Kyaw, Kaung Myat, and Jonathan Hoyin Chan. "A Framework for Synthetic Audio Conversations Generation Using Large Language Models." In 2024 IEEE/WIC International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT). IEEE, 2024. https://doi.org/10.1109/wi-iat62293.2024.00056.

9

Li, Jiaqi, Dongmei Wang, Xiaofei Wang, et al. "Investigating Neural Audio Codecs For Speech Language Model-Based Speech Generation." In 2024 IEEE Spoken Language Technology Workshop (SLT). IEEE, 2024. https://doi.org/10.1109/slt61566.2024.10832266.

10

Yang, Jie, and Feilong Bao. "MDG: Multilingual Co-speech Gesture Generation with Low-level Audio Representation and Diffusion Models." In 2024 International Conference on Asian Language Processing (IALP). IEEE, 2024. http://dx.doi.org/10.1109/ialp63756.2024.10661182.


Reports on the topic "Generative audio models"

1

Decleir, Cyril, Mohand-Saïd Hacid, and Jacques Kouloumdjian. A Database Approach for Modeling and Querying Video Data. Aachen University of Technology, 1999. http://dx.doi.org/10.25368/2022.90.

Abstract:
Indexing video data is essential for providing content based access. In this paper, we consider how database technology can offer an integrated framework for modeling and querying video data. As many concerns in video (e.g., modeling and querying) are also found in databases, databases provide an interesting angle to attack many of the problems. From a video applications perspective, database systems provide a nice basis for future video systems. More generally, database research will provide solutions to many video issues even if these are partial or fragmented. From a database perspective, v
2

Vakaliuk, Tetiana A., Valerii V. Kontsedailo, Dmytro S. Antoniuk, Olha V. Korotun, Iryna S. Mintii, and Andrey V. Pikilnyak. Using game simulator Software Inc in the Software Engineering education. [n.p.], 2020. http://dx.doi.org/10.31812/123456789/3762.

Abstract:
The article presents the possibilities of using the game simulator Software Inc in the training of future software engineers in higher education. Attention is drawn to some specific settings that need to be taken into account when training in the course of training future software engineers. More and more educational institutions are introducing new teaching methods, which result in the use of engineering students, in particular, future software engineers, to deal with real professional situations in the learning process. The use of modern ICT, including game simulators, in the educational process,