Journal articles on the topic 'AI video synthesis'

Consult the top 50 journal articles for your research on the topic 'AI video synthesis.'

1

A P, Aiswarya. "Text to Video Generation Using Generative AI for Interior Design Visualization." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 03 (2025): 1–9. https://doi.org/10.55041/ijsrem42124.

Abstract:
The emerging discipline of text-to-video synthesis combines computer vision and natural language understanding to create coherent, realistic videos that are based on written descriptions. The research is an endeavour to provide a bridge between the fields of computer vision and natural language processing by using a robust text-to-video production system. The system's main goal is to convert text prompts into visually appealing videos using pre-trained models and style transfer techniques, providing a fresh approach to content development. The method demonstrates flexibility and effectiveness
2

Rana, Jignesh. "Ai-Studios." International Journal for Research in Applied Science and Engineering Technology 13, no. 2 (2025): 563–68. https://doi.org/10.22214/ijraset.2025.66893.

Abstract:
Ai-Studios, a system that combines large language models with Stable Diffusion techniques to craft captivating poems and stories based on user prompts. This innovative system begins with user-provided prompts and offers the choice between poetry and narratives. Advanced language models generate rich textual content, forming the foundation of our creative journey. To translate this text into visually stunning experiences, Stable Diffusion models transform each sentence into vivid images with high accuracy. By using cross-attention layers, these models offer flexibility in responding to differen
3

Li, Yaowei, Xintao Wang, Zhaoyang Zhang, et al. "Image Conductor: Precision Control for Interactive Video Synthesis." Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 5 (2025): 5031–38. https://doi.org/10.1609/aaai.v39i5.32533.

Abstract:
Filmmaking and animation production often require sophisticated techniques for coordinating camera transitions and object movements, typically involving labor-intensive real-world capturing. Despite advancements in generative AI for video creation, achieving precise control over motion for interactive video asset generation remains challenging. To this end, we propose Image Conductor, a method for precise control of camera transitions and object movements to generate video assets from a single image. A well-cultivated training strategy is proposed to separate distinct camera and object motion
4

KJ, Karthik. "A SURVEY ON AI-CONTENT GENERATOR." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 01 (2025): 1–9. https://doi.org/10.55041/ijsrem41078.

Abstract:
This survey provides an in-depth exploration of AI-driven content generation, covering key areas such as text creation, image synthesis, video generation, and automated coding. By examining advancements in technologies like Large Language Models (LLMs), GANs, and diffusion models, the study highlights AI's role in transforming diverse fields. Text generation technologies are enabling structured, creative, and conversational outputs, while image and video synthesis models like Imagen and Phenaki are setting new benchmarks for visual quality and realism. In code generation, tools like ChatGPT an
5

Azieiev, Serhii. "ARTIFICIAL INTELLIGENCE TOOLS IN JOURNALISTS’ WORK WITH AUDIOVISUAL CONTENT." Dialog: media studios, no. 30 (December 13, 2024): 7–22. https://doi.org/10.18524/2308-3255.2024.30.318416.

Abstract:
The rapid development of artificial intelligence (AI) is significantly transforming the creation, processing, and analysis of audiovisual content in journalism. Neural networks, machine learning algorithms, and other intelligent technologies enable the automation of many routine tasks – from information gathering and processing to video editing and voice synthesis – enhancing the efficiency of journalists’ work. This article examines key AI tools used in journalistic activities, including automatic speech transcription systems, AI-driven text and image generation, facial and object recognition
6

Dey, Biswajit, and Rajdeep Paul. "Integrating Indian Sign Language Recognition with Real-Time Speech Synthesis for video conferences." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 03 (2025): 1–9. https://doi.org/10.55041/ijsrem42865.

Abstract:
Both hearing and deaf people commonly face major communication hurdles in their daily lives. To address this, this study presents a real-time video calling system that uses an AI model to recognize Indian Sign Language (ISL). Peers are connected via WebSockets, and video data is shared with the AI model for identification. Our approach captures 30 frames a second and buffers them as groups of 3 seconds that a backend AI model interprets. The application, using grid fragmentation-based splitting and k-NN prediction, detects the hand movements very accurately and translates these movements to textual equiva
7

Luo, Ziqian, Feiyang Chen, Xiaoyang Chen, and Xueting Pan. "A Novel Framework for Text-Image Pair to Video Generation in Music Anime Douga (MAD) Production." Artificial Intelligence Advances 6, no. 1 (2024): 25–33. http://dx.doi.org/10.30564/aia.v6i1.6848.

Abstract:
The rapid growth of digital media has driven advancements in multimedia generation, notably in Music Anime Douga (MAD), which blends animation with music. Creating MADs currently requires extensive manual labor, particularly for designing critical frames. Existing methods like GANs and transformers excel at text-to-video synthesis but lack the precision needed for artistic control in MADs. They often neglect the crucial hand-drawn frames that form the visual foundation of these videos. This paper introduces a novel framework for generating high-quality videos from text-image pairs, addressing
8

Meshram, Sahil. "Genius AI A Unified Platform for Text, Image, Audio, Video, and Code AI." International Journal for Research in Applied Science and Engineering Technology 13, no. 6 (2025): 825–29. https://doi.org/10.22214/ijraset.2025.71461.

Abstract:
The rapid evolution of artificial intelligence (AI) has led to the development of specialized models across different modalities such as text, image, video, audio, and program code. This paper presents the design and conceptual framework for a multimodal AI platform that harmoniously brings together multiple AI systems into a single, user-friendly. The proposed platform leverages state-of-the-art AI models, each tailored for a specific modality—Natural Language Processing (NLP) models for text understanding and generation, Computer Vision models for image analysis and synthesis, Generative Vid
9

Valcheva, Donika, Teodor Kalushkov, and Georgi Shipkovenski. "Research on Motion Capture Technologies and AI Video Synthesis for Creating Digital Bulgarian Folk Choreographies." BRAIN. Broad Research in Artificial Intelligence and Neuroscience 16, Special Issue 1 (2025): 117–26. https://doi.org/10.70594/brain/16.S1/10.

Abstract:
The article is focused on the following three scientific and applied activities: research into motion capture technologies, analysis of applications for AI video synthesis, and developing a methodology for creating digital choreographies of Bulgarian folk dances. The possibilities for automation through AI and MoCap were assessed, including extraction, adaptation, and synchronisation of choreographic movements with virtual avatars using retargeting technologies. When analysing the leading AI-based applications for markerless MoCap, it was found that they differ in the detail of motion capture,
10

Uppin, Mr Rohit B. "Introduction to Generative AI and its application in Education." International Journal for Research in Applied Science and Engineering Technology 12, no. 1 (2024): 861–66. http://dx.doi.org/10.22214/ijraset.2024.57563.

Abstract:
Generative AI has made significant progress in recent years, with a growing range of applications in a variety of fields. Generative AI applications have catalyzed a new era in the synthesis and manipulation of digital content. Generative AI is very recent technology which changed the way traditional search engines work. The search engines work on the principles of information retrieval. However, openGL came up with use of Artificial Intelligence (AI) for synthesis of digital content and launched well known as ChatGPT. The Generative AI differs from traditional AL as it takes text, a
11

Shimpi, A. N. "Deep Fake Detection." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 05 (2025): 1–9. https://doi.org/10.55041/ijsrem47392.

Abstract:
With the rapid advancement of generative adversarial networks (GANs) and other AI-driven synthesis techniques, deepfake videos have emerged as a significant threat to digital media integrity, enabling the creation of highly realistic but fake video content. These manipulated videos can be used maliciously in disinformation campaigns, identity theft, and other cybercrimes, making their detection a critical challenge. This paper presents a deep learning-based approach for deepfake video detection that leverages both spatial artifacts and temporal inconsistencies introduced during the manipul
12

Xiao, Wenxin. "The Evolution of Multi-modal Recommendation Algorithms for Short Videos." Applied and Computational Engineering 154, no. 1 (2025): 95–101. https://doi.org/10.54254/2755-2721/2025.tj23119.

Abstract:
The rapid proliferation of short video platforms has necessitated the evolution of recommendation systems beyond traditional unimodal approaches. This survey comprehensively analyzes the advancements, challenges, and future directions of multi-modal recommendation algorithms tailored for short videos. Unlike conventional methods reliant on singular data sources (e.g., user logs or text), multi-modal systems integrate visual, audio, textual, and behavioral signals to address critical limitations such as data sparsity, cold starts, and dynamic user intent. We systematically categorize multi-moda
13

Benson, Chigozie Emmanuel, Chinelo Harriet Okolo, and Olatunji Oke. "Redefining the Creative Process in Media Production: A Conceptual Framework for AI-Enhanced Video Editing and Automated Content Creation." Journal of Frontiers in Multidisciplinary Research 3, no. 2 (2022): 29–34. https://doi.org/10.54660/.ijfmr.2022.3.2.29-34.

Abstract:
Integrating artificial intelligence (AI) into media production redefines traditional creative processes, offering unprecedented efficiency and innovation. This paper explores the transformative role of AI in video editing and automated content creation, highlighting its benefits such as enhanced productivity, creativity, and scalability. Key AI features, including automated cutting, scene detection, and effects application, are examined for their impact on video editing. Furthermore, the paper delves into the applications of AI-driven tools in content generation, such as scriptwriting and voic
14

Swami, Shivam. "CNG – GPS Tracking and Booking using AI." International Scientific Journal of Engineering and Management 04, no. 05 (2025): 1–7. https://doi.org/10.55041/isjem03389.

Abstract:
This project introduces a CNG GPS Tracking and Booking System powered by Artificial Intelligence (AI), designed to minimize waiting times and enhance the efficiency of the CNG refueling process. The system offers a web-based platform where users can conveniently book CNG refueling appointments in advance, thereby reducing idle time and improving overall user experience. By integrating AI, GPS technology, and real-time data analytics, users can easily locate nearby CNG stations, check fuel availability, reserve time slots, and receive timely notifications. Additionally, pump owners be
15

Plekhanova, Tatyana, Volodymyr Tarasiuk, Gaiana Iuksel, Iryna Putsiata, and Olena Kulykova. "Internet journalism in modern society: an overview of mechanisms for resisting media manipulation." Revista Amazonia Investiga 12, no. 61 (2023): 103–11. http://dx.doi.org/10.34069/ai/2023.61.01.11.

Abstract:
At the current stage of the development of the information society, the influence of Internet journalism on the formation of public opinion (in particular, if we are talking about outright manipulation) is extremely noticeable. The purpose of the article is to analyze these influences in modern society in terms of the presence and use of media manipulation mechanisms and ways to counter them. The main research methods were general scientific (analysis, synthesis) and special scientific (abstraction and concretization). Manifestations of the manipulative influence of Internet journalism on huma
16

B, Sreekantha, Arfath Khan, Mohamed Sajid B, Mohammed Anas Asif, and Anish M. Morya. "Advancements in Text-to-Video Creation through AI Models: A Comprehensive Review." Journal of Knowledge in Data Science and Information Management 1, no. 1 (2024): 22–29. http://dx.doi.org/10.46610/jokdsim.2024.v01i01.003.

Abstract:
This review paper explores the dynamic landscape of text-to-video creation facilitated by Artificial Intelligence (AI) models. Examining the intersection of Natural Language Processing (NLP) and Computer Vision (CV), we delve into methodologies, challenges, and advancements shaping this evolving field. From traditional rule-based systems to advanced deep learning architectures like GPT and CLIP, the paper navigates through the diverse spectrum of AI models driving text-to-video synthesis. Challenges, such as context preservation and ethical considerations, are discussed, along with practical a
17

Rulyan, Gautam. "The Convergence of Artificial Intelligence and Visual Art: Tools, Ethics, and the Future of Creativity." International Journal for Research in Applied Science and Engineering Technology 13, no. 7 (2025): 319–20. https://doi.org/10.22214/ijraset.2025.72993.

Abstract:
Artificial Intelligence (AI) has rapidly emerged as a transformative force in the visual arts. From neural networks capable of generating images to real-time video synthesis, AI-powered tools such as DALL·E, Midjourney, and Sora are redefining how we conceptualize and create art. This paper explores the intersection of AI and visual art by evaluating the tools, methods, and ethical dilemmas surrounding generative art. It also examines authorship, copyright challenges, and how emerging AI platforms are reshaping the creative process. Through a multidisciplinary lens, the study underscores the g
18

Chen, Sinan, Liuyi Yang, Yue Zhang, et al. "Digital Human Technology in E-Learning: Custom Content Solutions." Applied Sciences 15, no. 7 (2025): 3807. https://doi.org/10.3390/app15073807.

Abstract:
With advances in digital transformation (DX) in education and digital technologies becoming more deeply integrated into educational settings, global demand for video-based learning materials continues to rise, resulting in substantial effort being required from teachers to create e-learning videos. Furthermore, while many existing services offer visual content, they primarily rely on templates, making it challenging to design custom content that addresses specific needs. In this study, we develop a web service that facilitates e-learning video creation through integrated artificial intelligenc
19

K, Tresha, Kavya, PB Medhaa, and T. Pragathi. "Automatic Video Generator." International Journal of Innovative Science and Research Technology (IJISRT) 9, no. 12 (2024): 104–8. https://doi.org/10.5281/zenodo.14470731.

Abstract:
Text-to-video (T2V) generation is an emerging field in artificial intelligence, gaining traction with advances in deep learning models like generative adversarial networks (GANs), diffusion models, and hybrid architectures. This paper provides a comprehensive survey of recent T2V methodologies, exploring models such as GAN-based frameworks, VEGAN-CLIP, IRC-GAN, Sora OpenAI, and CogVideoX, which aim to transform textual descriptions into coherent video content. These models face challenges in maintaining semantic coherence, temporal consistency, and realistic motion across generated frames. We
20

Valcheva, Donika, Teodor Kalushkov, and Georgi Shipkovenski. "Research on Motion Capture Technologies and AI Video Synthesis for Creating Digital Bulgarian Folk Choreographies." BRAIN. Broad Research in Artificial Intelligence and Neuroscience 16, no. 1 Sup1 (2025): 117. https://doi.org/10.70594/brain/16.s1/10.

Abstract:
The article is focused on the following three scientific and applied activities: research into motion capture technologies, analysis of applications for AI video synthesis, and developing a methodology for creating digital choreographies of Bulgarian folk dances. The possibilities for automation through AI and MoCap were assessed, including extraction, adaptation, and synchronisation of choreographic movements with virtual avatars using retargeting technologies. When analysing the leading AI-based applications for markerless MoCap, it was found that they differ i
21

Thorat, Ms Madhuri. "From Words to Wonders: AI-Generated Multimedia for Poetry Learning." International Journal for Research in Applied Science and Engineering Technology 13, no. 5 (2025): 3382–94. https://doi.org/10.22214/ijraset.2025.70946.

Abstract:
The rise of Generative AI has led to the development of various tools that present new opportunities for businesses and professionals engaged in content creation. The education sector is undergoing a significant transformation in the methods of content development and delivery. AI models and tools facilitate the creation of customized learning materials and effective visuals that enhance and simplify the educational experience. The advent of Large Language Models (LLMs) such as GPT and Text-to-Image models like Stable Diffusion, Flux-Schnell has fundamentally changed and expedited the content
22

Kryvoruchko, Larysa, Oleksii Kucher, Vlada Husieva, Iryna Timush, and Diana Timush. "Legal and organizational principles of person identification by appearance during the investigation of criminal offenses in Ukraine." Revista Amazonia Investiga 11, no. 56 (2022): 82–90. http://dx.doi.org/10.34069/ai/2022.56.08.9.

Abstract:
The purpose of the study is to determine the legal and organizational basis for identifying a person based on appearance during the investigation of criminal offenses in Ukraine. In order to achieve the goal of the article, the authors used methods of synthesis and analysis. Statistical methods were also used, with the help of which the problems that make it impossible to carry out portrait examinations based on the materials of video recordings and photographs, as well as other revealing ones, are defined and displayed in percentage form. The logical method and the method of generalization we
23

Bhargavi, J. "Creative Mind AI: Automated Multimodal Ad Synthesis for Enhanced Brand Engagement." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 05 (2025): 1–9. https://doi.org/10.55041/ijsrem47603.

Abstract:
CreativeMindAI is a platform driven by artificial intelligence that enables companies to design creative ads in image form. It also uses DALL·E, an advanced AI model by OpenAI capable of generating high-quality images from text descriptions, ensuring visually appealing and contextually relevant ads. The current AI-powered tools are able to create images or text but do not have a single system to maintain brand consistency and engaged targeting. It also integrates machine learning based audience targeting. The model analyzes factors such as product category, language style, and past co
24

Tian, Yuang. "The Application and Practice of Artificial Intelligence in the Entertainment Field." Applied and Computational Engineering 110, no. 1 (2024): 50–54. http://dx.doi.org/10.54254/2755-2721/110/2024melb0098.

Abstract:
Artificial intelligence (AI) technology has witnessed unprecedented advancements and a gradual penetration into civilian applications. This paper aims to thoroughly investigate the application of AI in the entertainment industry, with a particular focus on the principles and cross-disciplinary implementations of 3D real-life scanning, AI for non-player characters (NPCs), and AI video generation. By synthesizing how these technologies streamline content creation processes, lower technical barriers, and inspire novel approaches to game design, we observe that AI is not only reshaping the ecosyst
25

Negassi, Misgana, Rodrigo Suarez-Ibarrola, Simon Hein, Arkadiusz Miernik, and Alexander Reiterer. "Application of artificial neural networks for automated analysis of cystoscopic images: a review of the current status and future prospects." World Journal of Urology 38, no. 10 (2020): 2349–58. http://dx.doi.org/10.1007/s00345-019-03059-0.

Abstract:
Background: Optimal detection and surveillance of bladder cancer (BCa) rely primarily on the cystoscopic visualization of bladder lesions. AI-assisted cystoscopy may improve image recognition and accelerate data acquisition. Objective: To provide a comprehensive review of machine learning (ML), deep learning (DL) and convolutional neural network (CNN) applications in cystoscopic image recognition. Evidence acquisition: A detailed search of original articles was performed using the PubMed-MEDLINE database to identify recent English literature relevant to ML, DL and CNN applications in cys
26

Tejaskumar Pujari, Anshul Goel, and Deepak Kejriwal. "Ethical and Responsible AI in the Age of Adversarial Diffusion Models: Challenges, Risks, and Mitigation Strategies." International Journal Science and Technology 1, no. 3 (2022): 54–68. https://doi.org/10.56127/ijst.v1i3.1963.

Abstract:
The rapid pace of diffusion models in generative AI has completely restructured many fields, particularly with respect to image synthesis, video generation, and creative data enhancement. However, promising developments remain tinged with ethical questions in view of diffusion-based model dual-use. By misusing these models, purveyors could think up deepfaked videos, unpredictable forms of misinformation, instead outing cyber warfare-related attacks over the Internet, therefore aggravating societal vulnerabilities. This paper explores and analyzes these potential ethical risks and adversarial t
27

P. Jayanth, K. Lakshmi Sree, K. Karthik Kumar Reddy, G. Om Prakash, and G. Reddy Prasad. "Vision-to-Voice: AI for generating Description & Audio of Visual Content." International Research Journal of Innovations in Engineering and Technology 09, Special Issue ICCIS (2025): 206–13. https://doi.org/10.47001/irjiet/2025.iccis-202533.

Abstract:
The seamless transformation of visual content into descriptive text and naturalistic speech, termed Vision-to-Voice, represents a significant interdisciplinary advancement at the intersection of computer vision, natural language processing (NLP), and speech synthesis. This paper explores the development of an end-to-end Vision-to-Voice pipeline, encompassing visual scene understanding, semantic description generation, and high-quality speech synthesis, thereby enabling AI systems to narrate visual content for human users. The proposed methodology integrates Transformer-based image ca
28

Bertovsky, Lev V., Inna A. Ryzhkova, and Sergey A. Ryzhkov. "Innovative technologies and principles of criminal proceeding when conducting investigative actions." Revista Amazonia Investiga 10, no. 48 (2021): 18–25. http://dx.doi.org/10.34069/ai/2021.48.12.2.

Abstract:
The discrepancy between the introduction of consumer innovations at the modern level of the criminal procedure minimizes the operation of the principle of criminal proceedings. Purpose of the study is to analyze the use of the problems of introducing innovations in criminal proceedings, as well as the development of norms aimed at improving the mechanism for exercising rights during investigative actions using modern technical means. Methodology: the study of the problem of non-use in the preliminary investigation of modern technologies was carried out using formal-logical research methods, an
29

Poo Hernandez, Sergio, and Vadim Bulitko. "Speeding Up Heuristic Function Synthesis via Extending the Formula Grammar." Proceedings of the International Symposium on Combinatorial Search 12, no. 1 (2021): 233–35. http://dx.doi.org/10.1609/socs.v12i1.18594.

Abstract:
Heuristic search algorithms have long been used in video-game AI for unit navigation and planning. The quality of the solution they produce depends substantially on the quality of the heuristic function they use. Recent work automatically synthesized human-readable heuristic functions for a given pathfinding map. This enables tailoring a heuristic to the map but is expensive since each map requires an independent synthesis run. In this paper we propose and evaluate re-using elements of heuristics synthesized for one map in synthesizing heuristics for another map. We do so by adding parts of a
30

Lu, Min, and Huamin Wang. "The practice and reflection of generative AI in the cultivation of aesthetic education in colleges and universities: Centred on environmental design major." MATEC Web of Conferences 395 (2024): 01021. http://dx.doi.org/10.1051/matecconf/202439501021.

Abstract:
Through a comprehensive analysis of the development history of generative AI and its application in the aesthetic education of environmental design majors, this paper aims to reveal its potential significance and revelations in the field of aesthetic education. This paper first outlines the concept and development history of generative AI, and then delves into its practice in the aesthetic education of environmental design majors. For different application scenarios, including natural language processing, image recognition, audio processing and video synthesis, its specific applications and ef
31

Baig, Prof Mirza Moiz. "An Automated Video Language Translator using STT-TTT-TTS Translation." International Journal for Research in Applied Science and Engineering Technology 13, no. 4 (2025): 5935–40. https://doi.org/10.22214/ijraset.2025.69786.

Abstract:
Advancements in Natural Language Processing (NLP) have significantly improved multilingual communication through machine translation, text-to-speech conversion, and cross-language information retrieval (CLIR) [1]-[5]. Various approaches, including rule-based and statistical models, enhance translation accuracy and language identification [6]-[8]. Neural machine translation (NMT) and deep learning techniques further refine speech recognition and sentiment analysis [9]- [12]. Structural differences in languages, such as Subject-Verb-Object (SVO) versus Subject-Object-Verb (SOV) order, influence
32

Mkrtchyan, Koryun Aharon, Tatiana Mikhailovna Yamnenko, Tetiana Georgievna Holovan, and Mykola Dmytrovych Zhdan. "Improvement of organizational measures to ensure public security and order during the mass events by the National Police of Ukraine." Revista Amazonia Investiga 9, no. 26 (2020): 167–73. http://dx.doi.org/10.34069/ai/2020.26.02.18.

Abstract:
The purpose of this article is to identify ways to improve organizational measures to ensure public security and order during the mass events by the National Police. Methods such as structural-functional, formal-logical, modeling, analysis, and synthesis were used in the writing of the article. The successful implementation of this measure depends largely on organization, relationships between each police authority and unit, rational distribution of forces and resources, early response to the detection of violations and other events, effective coordination, placement of patrol posts and routes
33

Toktarova, V. I., and O. V. Rebko. "Educational audio and video content in the practice of university lecturers: Intelligent tools and approaches to development and implementation." Informatics and education 40, no. 2 (2025): 5–15. https://doi.org/10.32517/0234-0453-2025-40-2-5-15.

Abstract:
The rapid development of artificial intelligence (AI) technologies in the field of education has led to the emergence of innovative approaches such as podcast pedagogy, micro- and nano-learning, the use of digital avatars. Intelligent tools and services provide personalization and inclusiveness of learning, increase student engagement and save teachers’ time. The purpose of the work presented in the article is to study the educational potential of audio and video content in the activities of a higher education teacher and modern intellectual tools for the development and implementation of such
34

Patil, Samarth. "Integrating AI and Encryption in WebRTC: A Smart Video Conferencing Platform with Real-Time Transcription and Translation." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 05 (2025): 1–9. https://doi.org/10.55041/ijsrem47811.

Abstract:
As the demand for seamless, real-time communication tools grows, traditional video conferencing platforms need help to balance simplicity with advanced features. BabelTalk aims to address this challenge by integrating AI-powered capabilities into a scalable, secure, real-time communication platform built on WebRTC. WebRTC relies on basic peer-to-peer architectures, which limit scalability and also lack pre-processing of media streams; BabelTalk takes inspiration from a quasi-peer architecture to enable advanced features such as real-time transcription, translation into captions or s
35

Thaseen Ikram, Sumaiya, Priya V, Shourya Chambial, Dhruv Sood, and Arulkumar V. "A Performance Enhancement of Deepfake Video Detection through the use of a Hybrid CNN Deep Learning Model." International journal of electrical and computer engineering systems 14, no. 2 (2023): 169–78. http://dx.doi.org/10.32985/ijeces.14.2.6.

Abstract:
In the current era, many fake videos and images are created with the help of various software and new AI (Artificial Intelligence) technologies, which leave a few hints of manipulation. There are many unethical ways videos can be used to threaten, fight, or create panic among people. It is important to ensure that such methods are not used to create fake videos. An AI-based technique for the synthesis of human images is called Deep Fake. They are created by combining and superimposing existing videos onto the source videos. In this paper, a system is developed that uses a hybrid Convolutional
36

Marcinkevage, Carrie, and Akhil Kumar. "Generative AI in Higher Education Constituent Relationship Management (CRM): Opportunities, Challenges, and Implementation Strategies." Computers 14, no. 3 (2025): 101. https://doi.org/10.3390/computers14030101.

Abstract:
This research explores opportunities for generative artificial intelligence (GenAI) in higher education constituent (customer) relationship management (CRM) to address the industry’s need for digital transformation driven by demographic shifts, economic challenges, and technological advancements. Using a qualitative research approach grounded in the principles of grounded theory, we conducted semi-structured interviews and an open-ended qualitative data collection instrument with technology vendors, implementation consultants, and HEI professionals that are actively exploring GenAI application
37

Bernard, Dianala M., and Maren A. Benn. "REVITALIZATION OR RECLAMATION? REFRAMING THE RECOVERY OF INDIGENOUS LANGUAGES IN LATIN AMERICA: A HISTORICAL AND AI-DRIVEN APPROACH." International Journal of Language, Linguistics, Literature and Culture 04, no. 01 (2025): 104–31. https://doi.org/10.59009/ijlllc.2025.0103.

Abstract:
Indigenous languages of Latin America have faced significant decline due to colonization, globalization, and sociopolitical factors. While some languages remain endangered, others have entirely disappeared, leaving behind limited historical records or, in some cases, none at all. This study explores the historical transmission of these languages, the current state of documentation, and the role of artificial intelligence (AI) in their recovery, including revitalization and reclamation. Focusing on endangered languages such as Bribri, Cabécar, Maléku, Ngäbere, and Kuna, alongside extinct langua
38

Yan, Jielu, Zhengli Chen, Jianxiu Cai, et al. "Video-Driven Artificial Intelligence for Predictive Modelling of Antimicrobial Peptide Generation: Literature Review on Advances and Challenges." Applied Sciences 15, no. 13 (2025): 7363. https://doi.org/10.3390/app15137363.

Abstract:
How video-based methodologies and advanced computer vision algorithms can facilitate the development of antimicrobial peptide (AMP) generation models should be further reviewed, structural and functional patterns should be elucidated, and the generative power of in silico pipelines should be enhanced. AMPs have drawn significant interest as promising therapeutic agents because of their broad-spectrum efficacy, low resistance profile, and membrane-disrupting mechanisms. However, traditional discovery methods are hindered by high costs, lengthy synthesis processes, and difficulty in accessing th
39

G C, Shwethashree. "Inclusive Communication: Leveraging AI for Sign Language Translation and Real-Time Audio Transcription." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 05 (2025): 1–9. https://doi.org/10.55041/ijsrem48555.

Abstract:
Humans communicate through both natural language and body language, including gestures, facial expressions, and lip movements. While understanding spoken language is essential, recognizing sign language is equally important, especially for individuals with hearing impairments. Deaf individuals often struggle to communicate with those unfamiliar with sign language, making real-time translation systems invaluable. This paper proposes a real-time meeting platform that recognizes Indian Sign Language (ISL) gestures and converts them into text and speech, enabling smooth interaction betw
40

Barabanschikov, V. A., and M. M. Marinova. "Deepfake in Face Perception Research." Experimental Psychology (Russia) 14, no. 1 (2021): 4–19. http://dx.doi.org/10.17759/exppsy.2021000001.

Abstract:
Presents the state-of-the-art Deepfake face replacement image collage method, an artificial intelligence (AI) product that can be used to create high-quality, realistic videos with a fake or replaced face, with no obvious signs of manipulation. Based on the DeepFaceLab (DFL) application, the process of creating video images of an “impossible face” is described step by step. The results of the experiments of studying the perception patterns of the moving “impossible face” and their differences in statics and dynamics are presented. The stimuli were two DFL-generated models of virtual sitters wi
41

Zhang, Zhen, Zhichu Ren, and Ju Li. "(Invited) Human-AI-Robotics Collaboration for Catalysts Discovery in Electrochemical Reactions." ECS Meeting Abstracts MA2024-02, no. 69 (2024): 4820. https://doi.org/10.1149/ma2024-02694820mtgabs.

Abstract:
The exploration and discovery of materials for electrochemical reactions have traditionally been a tedious and time-intensive process. The complexity inherent in the design of materials with high-dimensional parameters and the exhaustive nature of data acquisition are the primary causes of this bottleneck [1]. To overcome this challenge, we introduce the Copilot for Real-world Experimental Scientist (CRESt). CRESt employs a large multimodal model (LMM) to guide a robotic system that is adept at active learning (driven by the Gaussian process-Bayesian optimization process), thereby streamlining
42

Sakirin, Tam, and Siddartha Kusuma. "A Survey of Generative Artificial Intelligence Techniques." Babylonian Journal of Artificial Intelligence 2023 (March 10, 2023): 10–14. http://dx.doi.org/10.58496/bjai/2023/003.

Abstract:
Generative artificial intelligence (AI) refers to algorithms capable of creating novel, realistic digital content autonomously. Recently, generative models have attained groundbreaking results in domains like image and audio synthesis, spurring vast interest in the field. This paper surveys the landscape of modern techniques powering the rise of creative AI systems. We structurally examine predominant algorithmic approaches including generative adversarial networks (GANs), variational autoencoders (VAEs), and autoregressive models. Architectural innovations and illustrations of generated outpu
43

Syriopoulou-Delli, Christine K. "Advances in Autism Spectrum Disorder (ASD) Diagnostics: From Theoretical Frameworks to AI-Driven Innovations." Electronics 14, no. 5 (2025): 951. https://doi.org/10.3390/electronics14050951.

Abstract:
This study provides a comprehensive analysis of the evolution of Autism Spectrum Disorder (ASD) diagnostics, tracing its progression from psychoanalytic origins to the integration of advanced artificial intelligence (AI) technologies. The study explores, through scientific databases such as PubMed, Scopus, and Google Scholar, how theoretical frameworks, including psychoanalysis, behavioral psychology, cognitive development, and neurobiological paradigms, have shaped diagnostic methodologies over time. Each paradigm’s associated assessment tools, such as the Autism Diagnostic Observation Schedul
44

B Meenakshi, Mohammed Wajahat Hussain, and Mittapalli Arvind Sai. "Real-Time Multilingual Speech Translation for Peer Communication." International Research Journal on Advanced Engineering Hub (IRJAEH) 3, no. 06 (2025): 2893–98. https://doi.org/10.47392/irjaeh.2025.0427.

Abstract:
Language continues to be a major obstacle to effective communication in a world that is becoming more interconnected by the day. This paper presents a real-time audio translation system that facilitates multilingual communication during peer-to-peer video calls. The application enables natural communication in the user’s preferred language by utilizing WebRTC for low-latency media transmission and incorporating sophisticated AI models such as Whisper for speech-to-text, GPT for language translation, and gTTS for text-to-speech synthesis. In addition to allowing real-time subtitle overlays and
45

Li, Lianghao. "Overview of Multimodal Generative Models in Natural Language Processing and Computer Vision." Journal of Computer Technology and Applied Mathematics 1, no. 4 (2024): 69–78. https://doi.org/10.5281/zenodo.13988327.

Abstract:
Multimodal generative models have become essential in the deep learning renaissance, as they provide unparalleled flexibility across a diverse range of applications within Natural Language Processing (NLP) and Computer Vision (CV). In this paper, we systematically review the basic concepts and technical improvements in multimodal generative models by discussing their applications across different modalities such as text, images, audio, and video. These models augment the strength of AI to comprehend and perform complicated tasks by coalescing data from various modalities. In this paper,
46

Rakhimova, Diana, Aidana Karibayeva, Vladislav Karyukin, Assem Turarbek, Zhansaya Duisenbekkyzy, and Rashid Aliyev. "Development of a Children’s Educational Dictionary for a Low-Resource Language Using AI Tools." Computers 13, no. 10 (2024): 253. http://dx.doi.org/10.3390/computers13100253.

Abstract:
Today, various interactive tools or partially available artificial intelligence applications are actively used in educational processes to solve multiple problems for resource-rich languages, such as English, Spanish, French, etc. Unfortunately, the situation is different and more complex for low-resource languages, like Kazakh, Uzbek, Mongolian, and others, due to the lack of qualitative and accessible resources, morphological complexity, and the semantics of agglutinative languages. This article presents research on early childhood learning resources for the low-resource Kazakh language. Gen
47

Llorca, Allen A. "AI-WEAR: SMART TEXT READER FOR BLIND/VISUALLY IMPAIRED STUDENTS USING RASPBERRY PI WITH AUDIO-VISUAL CALL AND GOOGLE ASSISTANCE." International Journal of Advanced Research in Computer Science 14, no. 03 (2023): 119–29. http://dx.doi.org/10.26483/ijarcs.v14i3.6997.

Abstract:
The goal of this research is to create a prototype referred to as AI-WEAR: Smart Text Reader for Blind or Visually Impaired Students, which utilizes Raspberry Pi equipped with Audio-Visual Call and Google Assistance. The prototype incorporates various functionalities including text-to-speech capability for reading, Google assistance for online support, and video streaming through Jitsi Meet, enabling students to interact with their teachers. The device offers two modes of control: voice commands and user-friendly buttons with Braille letters engraved on them. OCR (Optical Character Recognition
48

Debnath, Minakshi, Sana Alamgeer, Md Shahriar Kabir, and Anne H. Ngu. "Enhancing Wearable Fall Detection System via Synthetic Data." Sensors 25, no. 15 (2025): 4639. https://doi.org/10.3390/s25154639.

Abstract:
Deep learning models rely heavily on extensive training data, but obtaining sufficient real-world data remains a major challenge in clinical fields. To address this, we explore methods for generating realistic synthetic multivariate fall data to supplement limited real-world samples collected from three fall-related datasets: SmartFallMM, UniMib, and K-Fall. We apply three conventional time-series augmentation techniques, a Diffusion-based generative AI method, and a novel approach that extracts fall segments from public video footage of older adults. A key innovation of our work is the explor
49

Iksanov, Radmir, Igor Vladimirov, and Ravil Gizzatullin. "THE IMPACT OF IMPLEMENTING ARTIFICIAL INTELLIGENCE TECHNOLOGIES ON RESOURCE SAVING IN AGRICULTURE." Bulletin of KSAU, no. 3 (March 10, 2025): 131–39. https://doi.org/10.36718/1819-4036-2025-3-131-139.

Abstract:
The objective of the study is to examine the concept of artificial intelligence (AI), its properties, areas of application and to determine the degree of influence of the implemented artificial intelligence technologies on resource conservation in agriculture, as well as the prospects for further use of AI in agriculture. Objectives: to determine the types of artificial intelligence technologies used in agriculture, to determine the status and types of artificial intelligence, as well as the areas of influence of artificial intelligence technologies on resource conservation in agriculture. Gen
50

Galyashina, E. I., and V. D. Nikishin. "The protection of megascience projects from deepfake technologies threats: information law aspects." Journal of Physics: Conference Series 2210, no. 1 (2022): 012007. http://dx.doi.org/10.1088/1742-6596/2210/1/012007.

Abstract:
The paper examines the potential threats of the malicious use of deepfake technology to destabilize and discredit megascience projects in the global information space. The phenomenology of using artificial intelligence (AI) to create video recordings and voice messages, in which people do and say something that did not take place in reality, is considered. Special attention is paid to speech synthesis technologies based on arbitrary text and spoofing, i.e. replacing the speaker’s personality while preserving the content of the original speech message. The authors’ definition of a voic