Academic literature on the topic "Generative audio models"

Topical lists of articles, books, theses, conference proceedings, and other academic sources on the topic "Generative audio models".

Journal articles on the topic "Generative audio models"

1

Evans, Zach, Scott H. Hawley, and Katherine Crowson. "Musical audio samples generated from joint text embeddings." Journal of the Acoustical Society of America 152, no. 4 (2022): A178. http://dx.doi.org/10.1121/10.0015956.

Abstract
The field of machine learning has benefited from the appearance of diffusion-based generative models for images and audio. While text-to-image models have become increasingly prevalent, text-to-audio generative models are currently an active area of research. We present work on generating short samples of musical instrument sounds generated by a model which was conditioned on text descriptions and the file structure labels of large sample libraries. Preliminary findings indicate that generation of wide-spectrum sounds such as percussion are not difficult, while the generation of harmonic music
2

Kang, Hyunju, Geonhee Han, Yoonjae Jeong, and Hogun Park. "AudioGenX: Explainability on Text-to-Audio Generative Models." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 17 (2025): 17733–41. https://doi.org/10.1609/aaai.v39i17.33950.

Abstract
Text-to-audio generation models (TAG) have achieved significant advances in generating audio conditioned on text descriptions. However, a critical challenge lies in the lack of transparency regarding how each textual input impacts the generated audio. To address this issue, we introduce AudioGenX, an Explainable AI (XAI) method that provides explanations for text-to-audio generation models by highlighting the importance of input tokens. AudioGenX optimizes an Explainer by leveraging factual and counterfactual objective functions to provide faithful explanations at the audio token level. This m
3

Samson, Grzegorz. "Perspectives on Generative Sound Design: A Generative Soundscapes Showcase." Arts 14, no. 3 (2025): 67. https://doi.org/10.3390/arts14030067.

Abstract
Recent advancements in generative neural networks, particularly transformer-based models, have introduced novel possibilities for sound design. This study explores the use of generative pre-trained transformers (GPT) to create complex, multilayered soundscapes from textual and visual prompts. A custom pipeline is proposed, featuring modules for converting the source input into structured sound descriptions and subsequently generating cohesive auditory outputs. As a complementary solution, a granular synthesizer prototype was developed to enhance the usability of generative audio samples by ena
4

Jeong, Yujin, Yunji Kim, Sanghyuk Chun, and Jiyoung Lee. "Read, Watch and Scream! Sound Generation from Text and Video." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 17 (2025): 17590–98. https://doi.org/10.1609/aaai.v39i17.33934.

Abstract
Despite the impressive progress of multimodal generative models, video-to-audio generation still suffers from limited performance and limits the flexibility to prioritize sound synthesis for specific objects within the scene. Conversely, text-to-audio generation methods generate high-quality audio but pose challenges in ensuring comprehensive scene depiction and time-varying control. To tackle these challenges, we propose a novel video-and-text-to-audio generation method, called ReWaS, where video serves as a conditional control for a text-to-audio generation model. Especially, our method esti
5

Wang, Heng, Jianbo Ma, Santiago Pascual, Richard Cartwright, and Weidong Cai. "V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 14 (2024): 15492–501. http://dx.doi.org/10.1609/aaai.v38i14.29475.

Abstract
Building artificial intelligence (AI) systems on top of a set of foundation models (FMs) is becoming a new paradigm in AI research. Their representative and generative abilities learnt from vast amounts of data can be easily adapted and transferred to a wide range of downstream tasks without extra training from scratch. However, leveraging FMs in cross-modal generation remains under-researched when audio modality is involved. On the other hand, automatically generating semantically-relevant sound from visual input is an important problem in cross-modal generation studies. To solve this vision-
6

Ji, Wenliang, Ming Jin, and Yixin Chen. "Optimization of Digital Media Content Generation and Communication Effect Combined with Deep Learning Technology." Journal of Combinatorial Mathematics and Combinatorial Computing 127a (April 15, 2025): 1449–66. https://doi.org/10.61091/jcmcc127a-084.

Abstract
The combination of deep learning and digital media technology provides great scope for content creation. The article uses Generative Adversarial Network (GAN) in deep learning for content generation. Based on the three major forms of digital media content (image, audio, and video), image, audio, and video are generated by U-Net_GAN model, MAS-GAN model, and SSFLVGAN model, respectively, to construct a digital media content generation model based on generative adversarial networks. Subsequently, the model is validated for performance and the generated images, audio and video are evaluated for e
7

Sakirin, Tam, and Siddartha Kusuma. "A Survey of Generative Artificial Intelligence Techniques." Babylonian Journal of Artificial Intelligence 2023 (March 10, 2023): 10–14. http://dx.doi.org/10.58496/bjai/2023/003.

Abstract
Generative artificial intelligence (AI) refers to algorithms capable of creating novel, realistic digital content autonomously. Recently, generative models have attained groundbreaking results in domains like image and audio synthesis, spurring vast interest in the field. This paper surveys the landscape of modern techniques powering the rise of creative AI systems. We structurally examine predominant algorithmic approaches including generative adversarial networks (GANs), variational autoencoders (VAEs), and autoregressive models. Architectural innovations and illustrations of generated outpu
8

Broad, Terence, Frederic Fol Leymarie, and Mick Grierson. "Network Bending: Expressive Manipulation of Generative Models in Multiple Domains." Entropy 24, no. 1 (2021): 28. http://dx.doi.org/10.3390/e24010028.

Abstract
This paper presents the network bending framework, a new approach for manipulating and interacting with deep generative models. We present a comprehensive set of deterministic transformations that can be inserted as distinct layers into the computational graph of a trained generative neural network and applied during inference. In addition, we present a novel algorithm for analysing the deep generative model and clustering features based on their spatial activation maps. This allows features to be grouped together based on spatial similarity in an unsupervised fashion. This results in the mean
9

Cao, Yongnian, Xuechun Yang, and Rui Sun. "Generative AI Models Theoretical Foundations and Algorithmic Practices." Journal of Industrial Engineering and Applied Science 3, no. 1 (2025): 1–9. https://doi.org/10.70393/6a69656173.323633.

Abstract
Generative models in AI are an entirely new paradigm for machine learning, allowing computers to create realistic data in all kinds of categories, like text (NLP), images, and even physics simulations. In this paper this formalism is used to guide the theory, algorithms and applications of generative models, with particular focus on a few well established techniques like VAEs, GANs, and diffusion models. It stresses the importance of probabilistic generative modelling and information theory (i.e. KL divergence, ELBO, adversarial optimization, etc.). We cover algorithmic practices such as optimi
10

Aldausari, Nuha, Arcot Sowmya, Nadine Marcus, and Gelareh Mohammadi. "Video Generative Adversarial Networks: A Review." ACM Computing Surveys 55, no. 2 (2023): 1–25. http://dx.doi.org/10.1145/3487891.

Abstract
With the increasing interest in the content creation field in multiple sectors such as media, education, and entertainment, there is an increased trend in the papers that use AI algorithms to generate content such as images, videos, audio, and text. Generative Adversarial Networks (GANs) is one of the promising models that synthesizes data samples that are similar to real data samples. While the variations of GANs models in general have been covered to some extent in several survey papers, to the best of our knowledge, this is the first paper that reviews the state-of-the-art video GANs models

Theses on the topic "Generative audio models"

1

Douwes, Constance. "On the Environmental Impact of Deep Generative Models for Audio." Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS074.

Abstract
This thesis studies the environmental impact of deep learning models for audio generation and aims to put computational cost at the heart of the evaluation process. In particular, we focus on different types of deep learning models specialized in raw-waveform audio synthesis. These models are now a key component of modern audio systems, and their use has increased considerably in recent years. Their flexibility and generalization capabilities make them powerful tools in many contexts, from the synthesis of t
2

Caillon, Antoine. "Hierarchical temporal learning for multi-instrument and orchestral audio synthesis." Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS115.

Abstract
Recent progress in machine learning has enabled the emergence of new types of models suited to many tasks, by optimizing a set of parameters to minimize a cost function. Among these techniques, probabilistic generative models have enabled notable advances in the generation of text, images, and sound. However, generating musical audio signals remains a challenge. This stems from the intrinsic complexity of audio signals, a single second of raw audio comprising tens of thousands of individual samples.
3

Nishikimi, Ryo. "Generative, Discriminative, and Hybrid Approaches to Audio-to-Score Automatic Singing Transcription." Doctoral thesis, Kyoto University, 2021. http://hdl.handle.net/2433/263772.

4

Chemla-Romeu-Santos, Axel Claude André. "Manifold Representations of Musical Signals and Generative Spaces." Doctoral thesis, Università degli Studi di Milano, 2020. http://hdl.handle.net/2434/700444.

Abstract
Among the various research fields within music computing, the synthesis and generation of audio signals embodies the multidisciplinary nature of this domain, having nourished both scientific and musical practices since its inception. Inherent to computing since its inception, audio generation has inspired numerous approaches, evolving alongside musical practices and technological and scientific progress. Moreover, some synthesis processes also allow the inverse process, called analysis, so that the synthesis parameters can also be partially or totally extracted
5

Guenebaut, Boris. "Automatic Subtitle Generation for Sound in Videos." Thesis, University West, Department of Economics and IT, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:hv:diva-1784.

Abstract
The last ten years have been the witnesses of the emergence of any kind of video content. Moreover, the appearance of dedicated websites for this phenomenon has increased the importance the public gives to it. In the same time, certain individuals are deaf and occasionally cannot understand the meanings of such videos because there is not any text transcription available. Therefore, it is necessary to find solutions for the purpose of making these media artefacts accessible for most people. Several software propose utilities to create subtitles for videos but all require an extensive partic
6

Scarlato, Michele. "Sicurezza di rete, analisi del traffico e monitoraggio." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2012. http://amslaurea.unibo.it/3223/.

Abstract
The work was divided into three macro-areas. The first concerns a theoretical analysis of how intrusions work, which software is used to carry them out, and how to protect against them (using the devices generically known as firewalls). The second macro-area analyzes an intrusion carried out from outside against sensitive servers on a LAN. This analysis is conducted on the files captured by the two network interfaces configured in promiscuous mode on a probe present in the LAN. There are two interfaces in order to interfac
7

Mehri, Soroush. "Sequential modeling, generative recurrent neural networks, and their applications to audio." Thesis, 2016. http://hdl.handle.net/1866/18762.

Books on the topic "Generative audio models"

1

Osipov, Vladimir. Control and audit of the activities of a commercial organization: external and internal. INFRA-M Academic Publishing LLC., 2021. http://dx.doi.org/10.12737/1137320.

Abstract
The textbook reveals the role of control in ensuring the effective operation of a commercial organization, and sets its purpose and objectives. The main directions of external and internal control of the activities of a commercial organization are defined and the characteristics of the functions performed by them are given. The basic principles of external and internal audit are formulated, their purpose is defined, and the procedure for regulatory and legal regulation of audit activities in the Russian Federation is considered. The features of control over the activities of a commercial organ
2

Kazimagomedov, Abdulla, Aida Abdulsalamova, M. Mel'nikov, and N. Gadzhiev. Analysis of the activities of a commercial bank. INFRA-M Academic Publishing LLC., 2022. http://dx.doi.org/10.12737/1831614.

Abstract
The textbook presents modern ideas about the analysis of the activities of a commercial bank, in particular, the theoretical and practical issues related to the organization of internal control and audit, analysis of banking operations and services, customer base and creditworthiness of borrowers, banking risks, regulatory requirements of the Central Bank of the Russian Federation and interest rates, financial condition and financial results of a commercial bank are comprehensively disclosed et al. Meets the requirements of the federal state educational standards of higher education of
3

Nikiforova, Elena, Lyudmila Kupriyanova, and Ol'ga Shnayder. Management accounting and analysis. INFRA-M Academic Publishing LLC., 2025. https://doi.org/10.12737/2122904.

Abstract
The conceptual foundations of management accounting and analysis are revealed, the directions of theoretical and practical skills in assessing and determining effective management decisions are determined. The textbook material forms and consolidates the knowledge of the conceptual framework of managerial accounting and analysis. The main approaches to information support are described; applied tools aimed at ensuring the financial stability of a business are presented, practical situations that reveal its work, approaches to assessing the effectiveness of financial and economic activities bas
4

Kerouac, Jack. Big-Sur: Roman. Azbuka, 2013.

5

Colmeiro, José. Peripheral Visions / Global Sounds. Liverpool University Press, 2018. http://dx.doi.org/10.5949/liverpool/9781786940308.001.0001.

Abstract
Galician audio/visual culture has experienced an unprecedented period of growth following the process of political and cultural devolution in post-Franco Spain. This creative explosion has occurred in a productive dialogue with global currents and with considerable projection beyond the geopolitical boundaries of the nation and the state, but these seismic changes are only beginning to be the subject of attention of cultural and media studies. This book examines contemporary audio/visual production in Galicia as privileged channels through which modern Galician cultural identities have been im
6

Aguayo, Angela J. Documentary Resistance. Oxford University Press, 2019. http://dx.doi.org/10.1093/oso/9780190676216.001.0001.

Abstract
The potential of documentary moving images to foster democratic exchange has been percolating within media production culture for the last century, and now, with mobile cameras at our fingertips and broadcasts circulating through unpredictable social networks, the documentary impulse is coming into its own as a political force of social change. The exploding reach and power of audio and video are multiplying documentary modes of communication. Once considered an outsider media practice, documentary is finding mass appeal in the allure of moving images, collecting participatory audiences that c
7

Kerouac, Jack. Big Sur. Penguin Publishing Group, 2013.

8

Kerouac, Jack. Big Sur (Flamingo Modern Classics). Flamingo, 2001.

9

Kerouac, Jack. Big Sur. Blackstone Audiobooks, 2002.

10

Kerouac, Jack. Big Sur. McGraw-Hill Companies, 1990.

Book chapters on the topic "Generative audio models"

1

Huzaifah, Muhammad, and Lonce Wyse. "Deep Generative Models for Musical Audio Synthesis." In Handbook of Artificial Intelligence for Music. Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-72116-9_22.

2

Gallagher, Sean, Ben Gelman, Salma Taoufiq, et al. "Phishing and Social Engineering in the Age of LLMs." In Large Language Models in Cybersecurity. Springer Nature Switzerland, 2024. http://dx.doi.org/10.1007/978-3-031-54827-7_8.

Abstract
The human factor remains a major vulnerability in cybersecurity. This chapter explores the escalating threats that Large Language Models (LLMs) pose in the field of cybercrime, particularly in phishing and social engineering. Due to their ability to generate highly convincing and individualized content, LLMs enhance the effectiveness and scale of phishing attacks, making them increasingly difficult to detect. The integration of multimodal generative models allows malicious actors to leverage AI-generated text, images, and audio, increasing attack avenues and making attacks more convincing. Two case studies provide a comprehensive look, examining how AI technology orchestrates a phishing attack posing as a typical e-commerce transaction and how an LLM was used in a romance-themed cryptocurrency scam. Both scenarios underline the need for increased awareness and improved defenses against these novel and sophisticated cyber threats.
3

Boll, Antônio Oss, Letícia Maria Puttlitz, Heloísa Oss Boll, and Rodrigo Mor Malossi. "Beyond Audio Signals: Generative Model-Based Speaker Diarization in Portuguese." In Lecture Notes in Computer Science. Springer Nature Switzerland, 2025. https://doi.org/10.1007/978-3-031-79029-4_17.

4

Ye, Sheng, Yu-Hui Wen, Yanan Sun, et al. "Audio-Driven Stylized Gesture Generation with Flow-Based Model." In Lecture Notes in Computer Science. Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-20065-6_41.

5

Wyse, Lonce, Purnima Kamath, and Chitralekha Gupta. "Sound Model Factory: An Integrated System Architecture for Generative Audio Modelling." In Artificial Intelligence in Music, Sound, Art and Design. Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-03789-4_20.

6

Farkas, Michal, and Peter Lacko. "Using Advanced Audio Generating Techniques to Model Electrical Energy Load." In Engineering Applications of Neural Networks. Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-65172-9_4.

7

Golani, Mati, and Shlomit S. Pinter. "Generating a Process Model from a Process Audit Log." In Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2003. http://dx.doi.org/10.1007/3-540-44895-0_10.

8

Ma, Bin, Weixun Li, Huifeng Li, et al. "Generate Unnoticeable Adversarial Examples on Audio Classification Models with Multi-perspective Objectives." In Lecture Notes in Networks and Systems. Springer Nature Switzerland, 2024. https://doi.org/10.1007/978-3-031-74443-3_20.

9

de Berardinis, Jacopo, Valentina Anita Carriero, Nitisha Jain, et al. "The Polifonia Ontology Network: Building a Semantic Backbone for Musical Heritage." In The Semantic Web – ISWC 2023. Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-47243-5_17.

Abstract
In the music domain, several ontologies have been proposed to annotate musical data, in both symbolic and audio form, and generate semantically rich Music Knowledge Graphs. However, current models lack interoperability and are insufficient for representing music history and the cultural heritage context in which it was generated; risking the propagation of recency and cultural biases to downstream applications. In this article, we propose the Polifonia Ontology Network (PON) for music cultural heritage, centred around four modules: Music Meta (metadata), Representation (content), Source (provenance) and Instrument (cultural objects). We design PON with a strong accent on cultural stakeholder requirements and competency questions (CQs), contributing an NLP-based toolkit to support knowledge engineers in generating, validating, and analysing them; and a novel, high-quality CQ dataset produced as a result. We show current and future use of these resources by internal project pilots, early adopters in the music industry, and opportunities for the Semantic Web and Music Information Retrieval communities.
10

Kim, Sang-Kyun, Doo Sun Hwang, Ji-Yeun Kim, and Yang-Seock Seo. "An Effective News Anchorperson Shot Detection Method Based on Adaptive Audio/Visual Model Generation." In Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2005. http://dx.doi.org/10.1007/11526346_31.

Conference proceedings on the topic "Generative audio models"

1

Roman, Robin San, Pierre Fernandez, Antoine Deleforge, Yossi Adi, and Romain Serizel. "Latent Watermarking of Audio Generative Models." In ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2025. https://doi.org/10.1109/icassp49660.2025.10889782.

2

Akman, Alican, Qiyang Sun, and Björn W. Schuller. "Audio Explanation Synthesis with Generative Foundation Models." In ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2025. https://doi.org/10.1109/icassp49660.2025.10890370.

3

Yang, Qian, Jin Xu, Wenrui Liu, et al. "AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension." In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, 2024. http://dx.doi.org/10.18653/v1/2024.acl-long.109.

4

Heydari, Mojtaba, Mehrez Souden, Bruno Conejo, and Joshua Atkins. "ImmerseDiffusion: A Generative Spatial Audio Latent Diffusion Model." In ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2025. https://doi.org/10.1109/icassp49660.2025.10889311.

5

Kushwaha, Saksham Singh, Jianbo Ma, Mark R. P. Thomas, Yapeng Tian, and Avery Bruni. "Diff-SAGe: End-to-End Spatial Audio Generation Using Diffusion Models." In ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2025. https://doi.org/10.1109/icassp49660.2025.10888882.

6

Liang, Yiwei, and Ming Li. "Vivid Background Audio Generation Based on Large Language Models and AudioLDM." In 2024 IEEE 14th International Symposium on Chinese Spoken Language Processing (ISCSLP). IEEE, 2024. https://doi.org/10.1109/iscslp63861.2024.10800334.

7

Gao, Yiming. "A systematic research of text-to-audio generation with diffusion models." In Fifth International Conference on Signal Processing and Computer Science (SPCS 2024), edited by Haiquan Zhao and Lei Chen. SPIE, 2025. https://doi.org/10.1117/12.3053123.

8

Kyaw, Kaung Myat, and Jonathan Hoyin Chan. "A Framework for Synthetic Audio Conversations Generation Using Large Language Models." In 2024 IEEE/WIC International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT). IEEE, 2024. https://doi.org/10.1109/wi-iat62293.2024.00056.

9

Li, Jiaqi, Dongmei Wang, Xiaofei Wang, et al. "Investigating Neural Audio Codecs For Speech Language Model-Based Speech Generation." In 2024 IEEE Spoken Language Technology Workshop (SLT). IEEE, 2024. https://doi.org/10.1109/slt61566.2024.10832266.

10

Yang, Jie, and Feilong Bao. "MDG: Multilingual Co-speech Gesture Generation with Low-level Audio Representation and Diffusion Models." In 2024 International Conference on Asian Language Processing (IALP). IEEE, 2024. http://dx.doi.org/10.1109/ialp63756.2024.10661182.

Reports on the topic "Generative audio models"

1

Decleir, Cyril, Mohand-Saïd Hacid, and Jacques Kouloumdjian. A Database Approach for Modeling and Querying Video Data. Aachen University of Technology, 1999. http://dx.doi.org/10.25368/2022.90.

Abstract
Indexing video data is essential for providing content based access. In this paper, we consider how database technology can offer an integrated framework for modeling and querying video data. As many concerns in video (e.g., modeling and querying) are also found in databases, databases provide an interesting angle to attack many of the problems. From a video applications perspective, database systems provide a nice basis for future video systems. More generally, database research will provide solutions to many video issues even if these are partial or fragmented. From a database perspective, v
2

Vakaliuk, Tetiana A., Valerii V. Kontsedailo, Dmytro S. Antoniuk, Olha V. Korotun, Iryna S. Mintii, and Andrey V. Pikilnyak. Using game simulator Software Inc in the Software Engineering education. [s.n.], 2020. http://dx.doi.org/10.31812/123456789/3762.

Abstract
The article presents the possibilities of using game simulator Software Inc in the training of future software engineer in higher education. Attention is drawn to some specific settings that need to be taken into account when training in the course of training future software engineers. More and more educational institutions are introducing new teaching methods, which result in the use of engineering students, in particular, future software engineers, to deal with real professional situations in the learning process. The use of modern ICT, including game simulators, in the educational process,