To see the other types of publications on this topic, follow the link: AI model deployment.

Journal articles on the topic 'AI model deployment'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 journal articles for your research on the topic 'AI model deployment.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Swamy, Prasadarao Velaga. "Continuous Deployment of AI Systems: Strategies for Seamless Updates and Rollbacks." International Journal of Innovative Research in Engineering & Multidisciplinary Physical Sciences 6, no. 6 (2018): 1–8. https://doi.org/10.5281/zenodo.12805458.

Full text
Abstract:
The deployment of artificial intelligence (AI) systems poses unique challenges compared to traditional software applications, primarily due to the dynamic nature of AI models and their sensitivity to data changes. Continuous deployment (CD) strategies play a crucial role in managing these complexities by enabling organizations to deploy, update, and manage AI models seamlessly and efficiently. This paper reviews key strategies for implementing CD in AI systems, focusing on seamless updates and robust rollback mechanisms. Strategies discussed include incremental deployment, A/B testing, canary releases, and automated rollback procedures, each designed to minimize disruption and optimize performance during model updates. Additionally, the importance of monitoring and feedback loops in ensuring ongoing performance and reliability is highlighted, emphasizing their role in detecting anomalies and integrating user feedback for continuous model improvement. The paper concludes with a discussion on future research directions, including advanced testing methodologies for AI models, scalable deployment strategies across heterogeneous environments, and ethical considerations in AI deployment practices. By addressing these challenges and embracing innovative approaches, organizations can enhance the agility, reliability, and effectiveness of AI deployments, paving the way for broader adoption and impactful application across various domains.
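The automated rollback mechanisms surveyed above reduce, in the simplest case, to a guarded comparison between the canary's live error rate and the baseline's. A minimal sketch (the function name, metric, and 10% tolerance are illustrative assumptions, not taken from the paper):

```python
def should_rollback(baseline_errors, canary_errors, max_relative_increase=0.10):
    """Decide whether to roll back a canary model version.

    baseline_errors / canary_errors are lists of 0/1 flags, one per request
    (1 = the request failed or its prediction was rejected downstream).
    Roll back if the canary's error rate exceeds the baseline's by more
    than max_relative_increase.
    """
    if not canary_errors:
        return False  # no canary traffic observed yet; keep waiting
    base_rate = sum(baseline_errors) / max(len(baseline_errors), 1)
    canary_rate = sum(canary_errors) / len(canary_errors)
    return canary_rate > base_rate * (1 + max_relative_increase)
```

In practice the comparison would use a statistical test over a monitoring window rather than raw rates, but the control flow (observe, compare, roll back automatically) is the same.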
APA, Harvard, Vancouver, ISO, and other styles
2

Vijayan, Naveen Edapurath. "Building Scalable MLOps: Optimizing Machine Learning Deployment and Operations." INTERANTIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 08, no. 10 (2024): 1–9. http://dx.doi.org/10.55041/ijsrem37784.

Full text
Abstract:
As machine learning (ML) models become increasingly integrated into mission-critical applications and production systems, the need for robust and scalable MLOps (Machine Learning Operations) practices has grown significantly. This paper explores key strategies and best practices for building scalable MLOps pipelines to optimize the deployment and operation of machine learning models at an enterprise scale. It delves into the importance of automating the end-to-end lifecycle of ML models, from data ingestion and model training to testing, deployment, and monitoring. Approaches for implementing continuous integration and continuous deployment (CI/CD) pipelines tailored for ML workflows are discussed, enabling efficient and repeatable model updates and deployments. The paper emphasizes the criticality of implementing comprehensive monitoring and observability mechanisms to track model performance, detect drift, and ensure the reliability and trustworthiness of deployed models. The paper also addresses the challenges of managing model versioning and governance at scale, including techniques for maintaining a centralized model registry, enforcing access controls, and ensuring compliance with regulatory requirements. The paper aims to provide a comprehensive guide for organizations seeking to establish scalable and robust MLOps practices, enabling them to unlock the full potential of machine learning while mitigating risks and ensuring responsible AI deployment. Keywords—Machine Learning Operations (MLOps), Scalable AI Deployment, Continuous Integration and Continuous Deployment (CI/CD) for ML, ML Monitoring and Observability, Model Reproducibility, Model Versioning and Governance, Centralized Model Registry, Responsible AI Deployment, Ethical AI Practices, Enterprise MLOps
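A concrete building block of such a CI/CD pipeline is a promotion gate that blocks deployment when a candidate model regresses on required metrics. The metric names and return shape below are assumptions for illustration, not the paper's design:

```python
def promotion_gate(candidate_metrics, production_metrics,
                   min_gain=0.0, required=("accuracy", "auc")):
    """CI gate: promote the candidate model only if every required metric
    is present and at least matches production plus min_gain.

    Returns (promote, reasons), where reasons lists each failure.
    """
    reasons = []
    for m in required:
        if m not in candidate_metrics:
            reasons.append(f"missing metric: {m}")
        elif candidate_metrics[m] < production_metrics.get(m, 0.0) + min_gain:
            reasons.append(f"{m} regressed: {candidate_metrics[m]:.3f}")
    return (not reasons, reasons)
```

A CI job would call this after offline evaluation and fail the build when `promote` is false, keeping model updates repeatable and auditable.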
3

Sudheer Obbu. "Building a Robust CI/CD Pipeline for AI-Powered Cloud Applications." Journal of Computer Science and Technology Studies 7, no. 3 (2025): 215–25. https://doi.org/10.32996/jcsts.2025.7.3.25.

Full text
Abstract:
The deployment of AI applications in cloud environments presents unique challenges that traditional CI/CD pipelines fail to address, particularly in model versioning, data quality management, and system integration. This paper presents a comprehensive framework for building AI-specific CI/CD pipelines that effectively bridge these gaps. Through empirical analysis of successful implementations, we demonstrate how specialized pipeline architectures incorporating automated testing, intelligent resource allocation, and continuous monitoring can reduce deployment incidents by 37% while improving model reliability by 42%. Our findings show that organizations adopting these practices achieve 65% higher success rates in production deployments and reduce operational overhead by 41%. The proposed approach provides a practical roadmap for organizations seeking to streamline their AI deployment processes while maintaining robust security and performance standards.
4

Sharma, Ankush. "Green AI: Minimizing Environmental Cost of AI Model Training and Deployment." ADHYAYAN: A JOURNAL OF MANAGEMENT SCIENCES 14, no. 02 (2024): 28–30. https://doi.org/10.21567/adhyayan.v14i2.06.

Full text
Abstract:
The rapid development of artificial intelligence (AI), particularly deep learning models, has contributed to transformative innovations across various industries. The environmental influence of AI model training and deployment, especially energy consumption and carbon emissions through large-scale computational tasks, has gained increasing attention. This paper explores the concept of “Green AI,” a framework that emphasises minimizing the environmental costs of AI without sacrificing performance. By examining current practices in model development, energy consumption during training, and the role of sustainable deployment strategies, this research highlights practical solutions to mitigate AI’s environmental footprint while encouraging more efficient and eco-friendly models.
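The environmental cost the paper discusses is commonly estimated as energy drawn during training multiplied by the grid's carbon intensity. A back-of-the-envelope sketch (the PUE and intensity defaults are illustrative placeholders, not figures from the paper):

```python
def training_carbon_kg(gpu_count, avg_power_watts, hours,
                       pue=1.5, grid_kg_co2_per_kwh=0.4):
    """Estimate the CO2 emissions of a training run.

    energy (kWh) = GPUs x average draw (kW) x hours x data-centre PUE;
    emissions = energy x grid carbon intensity.
    """
    energy_kwh = gpu_count * (avg_power_watts / 1000.0) * hours * pue
    return energy_kwh * grid_kg_co2_per_kwh
```

For example, 8 GPUs averaging 300 W for 100 hours at PUE 1.0 and 0.5 kg CO2/kWh comes to 240 kWh, or roughly 120 kg of CO2; lowering any factor in the product is a "Green AI" lever.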
5

Sachin, Samrat Medavarapu. "Demystifying AI: A Comprehensive Review of Explainable AI Techniques and Applications." European Journal of Advances in Engineering and Technology 10, no. 6 (2023): 49–52. https://doi.org/10.5281/zenodo.13627267.

Full text
Abstract:
Explainable Artificial Intelligence (XAI) seeks to make AI systems more transparent and understandable to users. This review examines the various techniques developed to achieve explainability in AI models and their applications across different domains. We discuss methods such as feature attribution, model simplification, and example-based explanations, highlighting their strengths and limitations. Additionally, we explore the importance of XAI in critical fields like healthcare, finance, and law. The findings underscore the necessity of explainability for trust, accountability, and ethical AI deployment, pointing towards future directions in the field.
6

Isabirye Edward Kezron. "Securing the AI supply chain: Mitigating vulnerabilities in AI model development and deployment." World Journal of Advanced Research and Reviews 22, no. 2 (2024): 2336–46. https://doi.org/10.30574/wjarr.2024.22.2.1394.

Full text
Abstract:
The rapid advancement and integration of Artificial Intelligence (AI) across critical sectors — including healthcare, finance, defense, and infrastructure — have exposed an often-overlooked risk: vulnerabilities within the AI supply chain. This research examines the security challenges and potential threats affecting AI model development and deployment, focusing on adversarial attacks, data poisoning, model theft, and compromised third-party components. By dissecting the AI supply chain into its core stages — data sourcing, model training, deployment, and maintenance — this study identifies key entry points for malicious actors. The paper proposes a multi-layered security framework combining blockchain-based data provenance, federated learning for decentralized model training, and zero-trust architecture to ensure secure deployment. Additionally, it explores how adversarial training, model watermarking, and real-time anomaly detection can mitigate risks without sacrificing model performance. Case studies of high-profile AI breaches are analyzed to demonstrate the consequences of unsecured pipelines, emphasizing the urgency of securing AI systems.
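The data-provenance idea above rests on recording a cryptographic digest of each artifact when it is published and re-checking it before deployment. A minimal sketch using SHA-256 (the registry-entry shape is a hypothetical simplification; the paper's blockchain-based design would add signatures and a tamper-evident log):

```python
import hashlib

def artifact_digest(data: bytes) -> str:
    """SHA-256 digest of a serialized model artifact."""
    return hashlib.sha256(data).hexdigest()

def verify_artifact(data: bytes, registry_entry: dict) -> bool:
    """Accept a third-party model only if its digest matches the one
    recorded at publication time."""
    return artifact_digest(data) == registry_entry.get("sha256")
```

This catches silent substitution of a compromised third-party component between publication and deployment, one of the supply-chain entry points the paper identifies.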
7

Toluwase Peter Gbenle, Abraham Ayodeji Abayomi, Abel Chukwuemeke Uzoka, Oyejide Timothy Odofin, Oluwasanmi Segun Adanigbo, and Jeffrey Chidera Ogeawuchi. "Developing an AI Model Registry and Lifecycle Management System for Cross-Functional Tech Teams." International Journal of Scientific Research in Science, Engineering and Technology 11, no. 4 (2024): 442–56. https://doi.org/10.32628/ijsrset25121179.

Full text
Abstract:
This paper presents a comprehensive solution for managing AI models across their lifecycle through the development of an AI model registry and lifecycle management system. As AI continues to play a crucial role across industries, the complexity of managing models—from development to deployment—presents significant challenges, especially within cross-functional teams. These challenges include issues such as model versioning, metadata management, deployment inconsistencies, and communication breakdowns among data scientists, engineers, and business stakeholders. The proposed system addresses these challenges by providing a centralized platform that integrates features such as version control, metadata management, and automated deployment, thereby improving transparency and reducing the risk of deployment errors. Furthermore, the system fosters enhanced collaboration by integrating widely-used project management tools like GitHub, Jira, and Slack, ensuring that teams remain aligned throughout the model's lifecycle. By enabling continuous monitoring and incorporating automated model drift detection, the system ensures that AI models remain accurate and efficient post-deployment. This paper also explores the technical implementation strategy for the system, including the use of containerization, cloud-native infrastructure, and microservices architecture to ensure scalability and flexibility. The implications of this work extend beyond technical considerations, as it enhances collaboration, improves model quality, and accelerates deployment cycles. Future research directions include exploring automation in model updates, scalability in large enterprises, and the integration of additional tools and frameworks. This work provides a critical step toward optimizing AI model management, offering a scalable, efficient, and secure approach to managing AI models throughout their lifecycle.
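The registry's core responsibilities, immutable versions with metadata plus a pointer to the production version, can be sketched in a few lines. The API below is a hypothetical simplification for illustration, not the system described in the paper:

```python
class ModelRegistry:
    """Minimal in-memory model registry: append-only versions with
    metadata, and one version per model marked as in production."""

    def __init__(self):
        self._models = {}      # name -> list of {"version", "metadata"}
        self._production = {}  # name -> production version number

    def register(self, name, metadata):
        versions = self._models.setdefault(name, [])
        version = len(versions) + 1
        versions.append({"version": version, "metadata": dict(metadata)})
        return version

    def promote(self, name, version):
        if not any(v["version"] == version for v in self._models.get(name, [])):
            raise ValueError(f"unknown version {version} for {name}")
        self._production[name] = version

    def production_version(self, name):
        return self._production.get(name)
```

A real registry would persist this state, enforce access controls, and emit events for the deployment pipeline, as the paper discusses.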
8

Prudhvi, Naayini, and Bura Chiranjeevi. "Optimizing AI Model Inference on Serverless Cloud Platforms: A Scalable Approach." International Journal of Current Science Research and Review 08, no. 05 (2025): 1927–35. https://doi.org/10.5281/zenodo.15323189.

Full text
Abstract:
The increasing prevalence of Artificial Intelligence (AI) and Machine Learning (ML) models across various industries has highlighted the critical need for efficient and scalable deployment strategies. Traditional deployment methods often struggle with adapting to fluctuating demands and maintaining cost-effectiveness. Serverless computing has emerged as a promising solution to address these challenges. This paper investigates the deployment of AI models within serverless architectures on Amazon Web Services (AWS), specifically focusing on AWS Lambda and Knative. The study analyzes the limitations of conventional deployment approaches and proposes innovative strategies leveraging the capabilities of serverless technologies. Furthermore, it presents a rigorous evaluation of the performance characteristics of these serverless deployment strategies, discusses crucial security and privacy considerations, incorporates illustrative real-world case studies, and outlines potential future research directions.
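On AWS Lambda, the standard pattern for AI inference is to load the model once per container so that warm invocations skip deserialization. A self-contained sketch (the stub model and event shape are illustrative assumptions, not from the paper):

```python
import json

_MODEL = None  # module-level state survives across warm Lambda invocations

def _load_model():
    # Stand-in for downloading and deserializing real weights; kept
    # trivial so the sketch runs anywhere.
    return lambda features: sum(features) > 1.0

def handler(event, context=None):
    """AWS-Lambda-style entry point: load the model once per container
    (mitigating cold-start cost on later requests), then predict."""
    global _MODEL
    if _MODEL is None:
        _MODEL = _load_model()
    features = json.loads(event["body"])["features"]
    return {"statusCode": 200,
            "body": json.dumps({"prediction": bool(_MODEL(features))})}
```

The first invocation in a fresh container pays the load cost; every subsequent request on that container reuses `_MODEL`, which is why cold-start mitigation dominates serverless inference tuning.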
9

Researcher. "CLOUD-BASED AI/ML MODEL DEPLOYMENT: A COMPARATIVE ANALYSIS OF MANAGED AND SELF-MANAGED PLATFORMS." International Journal of Computer Engineering and Technology (IJCET) 15, no. 6 (2024): 1380–96. https://doi.org/10.5281/zenodo.14500931.

Full text
Abstract:
The widespread adoption of artificial intelligence and machine learning (AI/ML) technologies has created an urgent need for efficient and scalable deployment solutions across industries. This article presents a comprehensive analysis of cloud-based AI/ML model deployment strategies, examining both managed platforms offered by major cloud providers (AWS SageMaker, Google Vertex AI, and Microsoft Azure Machine Learning) and self-managed infrastructure solutions. Through systematic evaluation of platform capabilities, infrastructure requirements, and organizational considerations, the article develops a decision framework to guide enterprises in selecting appropriate deployment architectures. The article's analysis reveals that while managed platforms offer significant advantages in terms of reduced complexity, automated infrastructure management, and faster time-to-market, self-managed solutions provide superior customization capabilities and potential cost benefits at scale for organizations with sufficient technical expertise. The article synthesizes implementation data from multiple enterprise case studies to identify critical success factors in AI/ML deployment, including infrastructure scalability, monitoring capabilities, and resource optimization. Furthermore, the article proposes a novel evaluation matrix for assessing the total cost of ownership across different deployment scenarios, incorporating both direct infrastructure costs and indirect expenses related to expertise and maintenance. These findings contribute to the growing body of knowledge on enterprise AI/ML operations while providing practical guidance for organizations navigating the complex landscape of cloud-based model deployment strategies.
10

Gaurav Samdani, Kabita Paul, and Flavia Saldanha. "Serverless architectures for agentic AI deployment." World Journal of Advanced Engineering Technology and Sciences 7, no. 2 (2022): 320–33. https://doi.org/10.30574/wjaets.2022.7.2.0144.

Full text
Abstract:
This paper presents directions for improving scalability, cost, and flexibility in serverless architectures for agentic AI deployment. With its event-driven, pay-as-you-go model, serverless computing is shown to be an optimal way to deploy agentic AI systems given their need for flexibility. The research objectives include assessing the capabilities of serverless platforms, evaluating the effectiveness of case applications, and developing a solid methodology for real-world application. The methodology uses case studies, comparative analysis, and evaluation metrics to determine the benefits of serverless computing for AI workloads. The main findings emphasize latency optimization, cost-effectiveness, and operational flexibility. These insights are broadly applicable to businesses, developers, and cloud providers interested in AI effectiveness and deployment. Consequently, this research finds that linking serverless architectures to agentic AI opens up new possibilities for innovation in AI deployment.
11

Banerjee, Joyanta. "Scalable AI Model Deployment with AWS SageMaker and EKS." International Journal of Computer Trends and Technology 72, no. 11 (2024): 135–42. https://doi.org/10.14445/22312803/ijctt-v72i11p114.

Full text
12

Ganesh, Prakhar. "Model Multiplicity for Responsible AI." Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society 7, no. 2 (2025): 14–17. https://doi.org/10.1609/aies.v7i2.31896.

Full text
Abstract:
Machine learning has experienced a remarkable rise, with highly sophisticated over-parameterized models leading the way. Consequently, these cutting-edge models find application across diverse domains. Their increasing deployment has sparked concerns about their real-world impact, studied under the umbrella of responsible AI. A crucial aspect of building responsible AI models is the idea of model multiplicity. If managed well, model multiplicity gives us the freedom to prioritize several metrics, including those associated with responsible AI, and select the best models to minimize harm. However, the existence of multiplicity also marks the unavoidable presence of arbitrariness in model selection that can impact individual-level decisions, necessitating a broader discussion on the role and expectations of AI decision makers in our society.
13

R, Antony Roshan, Barath V, Deva Dharshini D, Dhilak M, and Saraswathi R. "LLM ENHANCED AI CHATBOT." INTERANTIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 08, no. 12 (2024): 1–5. https://doi.org/10.55041/ijsrem40033.

Full text
Abstract:
The development and deployment of a Large Language Model (LLM)-based tool designed to generate human-like responses to natural language inputs in a network-isolated environment presents unique technical and logistical challenges. Such a tool leverages state-of-the-art Natural Language Processing (NLP) and machine learning techniques to simulate real-time, coherent, and contextually appropriate interactions without relying on an active internet connection. This approach involves training or fine-tuning a pre-existing LLM on domain-specific datasets and configuring the model to operate efficiently on limited resources, addressing the constraints of an offline deployment. The solution provides an autonomous conversational agent that supports text summarization and the upload of CSV and PDF files. The tools used for this project include Natural Language Processing (NLP) and Hugging Face. Key components include model optimization to reduce computational overhead, storage requirements, and latency in response generation. Future scope includes improving model robustness and incorporating ethical guardrails to ensure the responsible use of LLMs, especially in sensitive fields such as healthcare, education, and customer support.
14

Nikhila Pothukuchi. "Hardware-aware neural network training: A comprehensive framework for Efficient AI model deployment." World Journal of Advanced Engineering Technology and Sciences 15, no. 1 (2025): 1831–38. https://doi.org/10.30574/wjaets.2025.15.1.0344.

Full text
Abstract:
This article presents a comprehensive guide to hardware-aware training techniques for artificial intelligence models, addressing the critical balance between performance optimization and resource efficiency. The discussion encompasses key strategies including quantization methods for precision reduction, systematic network pruning for architecture refinement, sparsity implementation for model optimization, and hardware-specific adaptations. Through detailed exploration of these techniques, the article demonstrates how integrating hardware considerations during the training process leads to substantial improvements in deployment efficiency, energy consumption, and overall model performance. The framework outlined offers practical solutions for organizations seeking to optimize their AI deployments across various platforms, from edge devices to cloud infrastructure, while maintaining competitive accuracy levels.
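Of the techniques listed, post-training quantization is the most compact to illustrate: weights are rescaled so the largest magnitude maps to 127 and stored as 8-bit integers. A minimal symmetric-quantization sketch (an assumption-level illustration, not the article's exact scheme):

```python
def quantize_int8(weights):
    """Symmetric post-training quantization of a weight list to int8.

    The scale maps the largest-magnitude weight to 127; returns the scale
    and the quantized integers."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return 1.0, [0] * len(weights)
    scale = max_abs / 127.0
    return scale, [round(w / scale) for w in weights]

def dequantize(scale, quantized):
    """Recover approximate float weights from the shared scale."""
    return [scale * q for q in quantized]
```

Storing int8 values with one shared scale cuts weight storage roughly 4x versus float32, at the cost of a bounded rounding error of at most half the scale per weight.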
15

Prabu, Arjunan. "AI Model Management with AWS Cloud Infrastructure." International Journal on Science and Technology 15, no. 4 (2024): 1–5. https://doi.org/10.5281/zenodo.14514123.

Full text
Abstract:
State-of-the-art AI development requires a strong infrastructure that can handle everything from initial experimentation to the production deployment of a model. AWS offers an end-to-end suite of services that allows enterprises to build, train, deploy, and manage machine learning models at scale. This whitepaper details an enterprise approach to managing AI models using AWS Cloud Infrastructure with version control, reproducibility, and operational efficiency in mind.
16

Maddala, Suresh Kumar. "Understanding Explainability in Enterprise AI Models." International Journal of Management Technology 12, no. 1 (2025): 58–68. https://doi.org/10.37745/ijmt.2013/vol12n25868.

Full text
Abstract:
This article examines the critical role of explainability in enterprise AI deployments, where algorithmic transparency has emerged as both a regulatory necessity and a business imperative. As organizations increasingly rely on sophisticated machine learning models for consequential decisions, the "black box" problem threatens stakeholder trust, regulatory compliance, and effective model governance. We explore the multifaceted business case for explainable AI across regulated industries, analyze the spectrum of interpretability techniques—from inherently transparent models to post-hoc explanation methods for complex neural networks—and investigate industry-specific applications in finance, healthcare, fraud detection, and human resources. The article addresses practical implementation challenges, including the accuracy-interpretability tradeoff, computational constraints, and ethical considerations around data bias. Looking forward, the article examines emerging developments in regulatory frameworks, hybrid model architectures, causal inference approaches, and integrated explanation interfaces. Throughout the analysis, the article demonstrates that explainability is not merely a technical consideration but a foundational element of responsible AI deployment that allows organizations to balance innovation with accountability in an increasingly algorithm-driven business landscape.
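Among the post-hoc explanation methods the article surveys, permutation importance is simple enough to sketch fully: shuffle one feature column and measure how much a performance metric drops. The interface below is an illustrative assumption, not the article's implementation:

```python
import random

def permutation_importance(predict, X, y, feature_idx, metric, seed=0):
    """Model-agnostic post-hoc explanation: shuffle one feature and report
    the metric drop (a larger drop means the feature matters more)."""
    rng = random.Random(seed)
    baseline = metric(y, [predict(row) for row in X])
    column = [row[feature_idx] for row in X]
    rng.shuffle(column)
    X_perm = [row[:feature_idx] + [v] + row[feature_idx + 1:]
              for row, v in zip(X, column)]
    permuted = metric(y, [predict(row) for row in X_perm])
    return baseline - permuted
```

Because it only needs a `predict` callable, this works on black-box models, which is exactly the enterprise setting where inherently transparent models are unavailable.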
17

Savita Nuguri, Rahul Saoji, Krishnateja Shiva, Pradeep Etikani, and Vijaya Venkata Sri Rama Bhaskar. "OPTIMIZING AI MODEL DEPLOYMENT IN CLOUD ENVIRONMENTS: CHALLENGES AND SOLUTIONS." International Journal for Research Publication and Seminar 12, no. 2 (2021): 159–68. http://dx.doi.org/10.36676/jrps.v12.i2.1461.

Full text
Abstract:
Among the studies related to the use of artificial intelligence in cloud computing, this research seeks to identify techniques that may help in the effective implementation of models in cloud-based systems. The main questions addressed include cost control, working with multiple cloud services, achieving higher speed, preserving the privacy of information and creating conditions for its safe storage, and provider migration. Possible solutions include autoscaling, model compression, secure enclaves, and containerization for scalability, with a range of approaches considered and Android-specific solutions compared. A reference architectural model of cloud and edge systems is described. The findings establish the effectiveness of and need for such methodologies, since artificial intelligence initiatives can be easily and securely implemented and sustained through cloud technologies.
18

Naayini, Prudhvi. "Scalable AI Model Deployment and Management on Serverless Cloud Architecture." International Journal of Electrical, Electronics and Computers 9, no. 1 (2024): 1–12. https://doi.org/10.22161/eec.91.1.

Full text
Abstract:
Scalable deployment of deep learning models in the cloud faces challenges in balancing performance, cost, and manageability. This paper investigates serverless cloud architecture for AI model inference, focusing on AWS technologies such as AWS Lambda, API Gateway, and Kubernetes-based serverless extensions (e.g., AWS EKS with Knative). We first outline the limitations of traditional, server-based model hosting to motivate the serverless approach. Then, we present novel strategies for scalable model serving: an adaptive resource provisioning algorithm, intelligent model caching, and efficient model sharding. Our methodology includes pseudo-code and architectural diagrams that illustrate these techniques on AWS. Analytical modeling and simulation using AWS performance and cost metrics validate that the proposed system can automatically scale to thousands of concurrent requests while maintaining low latency. In addition, an in-depth threat model is developed to address security and privacy concerns. Finally, real-world case studies (e.g., real-time video analytics, recommendation engines, and fraud detection) are described to demonstrate the practical viability of the approach, and a detailed cost analysis is presented. Future research directions include advanced scheduling algorithms and serverless training frameworks.
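The intelligent model caching the paper proposes can be sketched as an LRU cache of deserialized models, useful when a serverless host serves many models but loading any one of them dominates latency. The class below is a hypothetical simplification, not the paper's algorithm:

```python
from collections import OrderedDict

class ModelCache:
    """Keep at most `capacity` deserialized models warm, evicting the
    least-recently-used one when the limit is exceeded."""

    def __init__(self, loader, capacity=2):
        self.loader = loader       # model_id -> model (the expensive step)
        self.capacity = capacity
        self._cache = OrderedDict()
        self.loads = 0             # counts cache misses, for inspection

    def get(self, model_id):
        if model_id in self._cache:
            self._cache.move_to_end(model_id)  # mark as most recently used
            return self._cache[model_id]
        self.loads += 1
        model = self.loader(model_id)
        self._cache[model_id] = model
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)    # evict the LRU entry
        return model
```

Sizing `capacity` to the container's memory turns the cache into the bridge between cold-start latency and per-request cost, the trade-off the paper's analytical model explores.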
19

Mistry, Het. "Mastering Model Selection for AI/ML Models." European Journal of Computer Science and Information Technology 13, no. 14 (2025): 55–67. https://doi.org/10.37745/ejcsit.2013/vol13n145567.

Full text
Abstract:
This article presents a comprehensive framework for mastering model selection in artificial intelligence and machine learning applications across diverse domains. The article addresses the fundamental challenge of selecting models that optimally balance complexity with generalization capability, navigating the classic bias-variance tradeoff that underpins predictive performance. Beginning with theoretical foundations of regularization approaches and complexity measures, the article proceeds through data-driven selection strategies, including cross-validation techniques and advanced hyperparameter optimization methods. The article incorporates robust evaluation metrics for both classification and regression tasks, emphasizing the importance of multi-metric assessment in capturing various performance dimensions. The article extends beyond initial model selection to address the critical yet often overlooked dimension of post-deployment maintenance, including concept drift detection and retraining strategies that ensure sustained model performance over time. The article demonstrates the practical application of these principles in high-stakes environments with domain-specific constraints. The article's integrated framework offers decision support for strategy selection based on data characteristics, with implementation guidance across common machine learning platforms. By synthesizing theoretical insights with practical considerations, this article provides researchers and practitioners with a structured approach to model selection throughout the complete machine learning lifecycle, ultimately enhancing the reliability and sustainability of AI applications in production environments.
20

Satyam, Chauhan. "Intelligent Edge Computing for IoT Data Processing and AI Model Deployment." INTERNATIONAL JOURNAL OF INNOVATIVE RESEARCH AND CREATIVE TECHNOLOGY 9, no. 4 (2023): 1–16. https://doi.org/10.5281/zenodo.14613685.

Full text
Abstract:
The exponential growth of IoT devices and the rising demand for AI-driven applications have introduced significant challenges in data processing, scalability, and latency. Intelligent Edge Computing (IEC) emerges as a transformative solution by processing data closer to its source, thus addressing these challenges while enhancing privacy and reducing bandwidth usage. This paper explores the architecture, techniques, and strategies of IEC for IoT data processing and AI model deployment. Key topics include edge architecture, lightweight AI algorithms, federated learning, and transfer learning for real-time decision-making. The study also evaluates its application in financial services, showcasing use cases in high-frequency trading, equity research, and fixed income analysis. Insights drawn emphasize the potential of IEC in shaping next-generation IoT ecosystems and its significance across multiple sectors, paving the way for a more distributed and efficient computational paradigm.
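Federated learning, one of the techniques highlighted, aggregates locally trained models instead of raw data, so IoT measurements never leave the edge device. The canonical aggregation step is federated averaging, sketched here for flat weight vectors (an illustrative simplification):

```python
def federated_average(client_weights, client_sizes):
    """One round of federated averaging: combine client model weights
    (lists of floats), weighting each client by its local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(dim)]
```

The server repeats this each round: broadcast the averaged model, let clients train locally, then re-aggregate; only weight vectors cross the network.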
21

Biswas, Jeet. "NICKONN – AN AI-POWERED SEARCH ENGINE POWERED BY LLAMA MODEL." International Scientific Journal of Engineering and Management 04, no. 05 (2025): 1–9. https://doi.org/10.55041/isjem03638.

Full text
Abstract:
With the growing volume of information on the internet, retrieving accurate, relevant, and context-aware content has become a significant challenge. Traditional search engines rely heavily on keyword matching and static ranking, often overlooking the semantic context behind user queries. This project presents Nickonn – An AI-Powered Search Engine, a modular, AI-enhanced, and privacy-first platform that delivers summarized, intelligent, and referenced responses by integrating open-source metasearch technology (SearxNG) with transformer-based language models such as GPT, LLaMA, and Mixtral. Nickonn offers complete offline functionality, enabling secure deployment in educational and enterprise environments. Built with React, Next.js, and Node.js, the system efficiently interprets user intent, aggregates data from trusted sources, and presents answers with citation tracing. Performance evaluation and a comparative study with existing AI search engines reveal Nickonn’s superiority in contextual accuracy, privacy assurance, and user trust. Keywords- AI Search Engine, Semantic Search, Natural Language Processing, Large Language Models, SearxNG, Open Source, Local Deployment, Information Retrieval, GPT, Data Privacy
22

Dr., Juby Mathew, Sen Easow Neil, Shankar Rajalakshmi, Babu Nandhu, and Pratap Singh Rudra. "Career Finder: AI powered career guider." International Journal on Emerging Research Areas (IJERA) 05, no. 01 (2025): 174–77. https://doi.org/10.5281/zenodo.15187120.

Full text
Abstract:
This paper discusses the overall design, development, and deployment of an AI-based career recommendation system, organized into four interdependent modules: User Interface (UI) Design and Development, Backend Development and API Management, AI Model Integration and Recommendation Engine, and Database and Deployment. The platform leverages cutting-edge technologies such as React.js for a dynamic front-end, Flask for robust backend API development, OpenAI GPT-based models (or alternatives like Hugging Face Transformers or LLaMA) for personalized career insights, and MongoDB for scalable data storage. The UI module prioritizes creating an intuitive and responsive user experience, incorporating features like dynamic forms and skill gap analysis dashboards. The Backend module focuses on developing secure and efficient RESTful APIs to handle user data processing and AI model communication. The AI Model Integration module delves into natural language processing (NLP) techniques to analyze user inputs, match skills, and generate tailored career recommendations. The Database and Deployment module is focused on data management that scales and is secure on cloud systems such as AWS, Azure, or Google Cloud, with authentication through Firebase and CI/CD through GitHub Actions. The project focuses on team collaboration, with tools such as Jira and GitHub used for easy integration between modules and effective development practices. This paper lays out the development process, challenges encountered, and possible future improvements to the platform. Keywords— Career recommendation, AI, React.js, Flask, OpenAI GPT, MongoDB, user interface, backend development, machine learning, NLP, cloud deployment, REST API, skill gap analysis.
APA, Harvard, Vancouver, ISO, and other styles
23

Nakayama, Luis Filipe, João Matos, Justin Quion, et al. "Unmasking biases and navigating pitfalls in the ophthalmic artificial intelligence lifecycle: A narrative review." PLOS Digital Health 3, no. 10 (2024): e0000618. http://dx.doi.org/10.1371/journal.pdig.0000618.

Full text
Abstract:
Over the past 2 decades, exponential growth in data availability, computational power, and newly available modeling techniques has led to an expansion in interest, investment, and research in Artificial Intelligence (AI) applications. Ophthalmology is one of many fields that seek to benefit from AI given the advent of telemedicine screening programs and the use of ancillary imaging. However, before AI can be widely deployed, further work must be done to avoid the pitfalls within the AI lifecycle. This review article breaks down the AI lifecycle into seven steps—data collection; defining the model task; data preprocessing and labeling; model development; model evaluation and validation; deployment; and finally, post-deployment evaluation, monitoring, and system recalibration—and delves into the risks for harm at each step and strategies for mitigating them.
APA, Harvard, Vancouver, ISO, and other styles
24

Balaji, Soundararajan. "Engineering Systems for Dynamic Retraining and Deployment of AI Models." International Journal of Innovative Research in Engineering & Multidisciplinary Physical Sciences 11, no. 2 (2023): 1–9. https://doi.org/10.5281/zenodo.15054625.

Full text
Abstract:
The increasing reliance on artificial intelligence (AI) in dynamic business environments calls for adaptive model management systems that mitigate performance degradation caused by evolving data patterns, operational shifts, and market changes. Traditional retraining methods are resource-intensive and struggle to maintain consistency, prompting the need for innovative approaches such as Just-in-Time (JIT) retraining, real-time monitoring, and automated deployment pipelines. We examine the engineering challenges of designing adaptive AI systems, including scalability, computational costs, privacy concerns, and integration complexities, and the role of continuous learning frameworks, transfer learning, and ensemble techniques in enabling efficient model recalibration. Case studies from urban automation, conversational AI, and industrial applications illustrate practical implementations of dynamic retraining systems, emphasizing reduced technical debt and improved operational resilience. We also focus on best practices for monitoring model drift, deploying CI/CD pipelines, and balancing human oversight with automation. By synthesizing research and real-world applications, this work provides a roadmap for organizations to enhance AI reliability, adaptability, and trustworthiness in production environments while addressing ethical and compliance risks.
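The Just-in-Time retraining described in this abstract hinges on detecting input drift before recalibrating. As a minimal sketch of that idea (not the paper's system), the code below uses a Population Stability Index over a single feature to decide when retraining should fire; the function names, the 0.2 threshold, and the synthetic distributions are all illustrative assumptions:

```python
import numpy as np

def psi(expected, actual, bins=10):
    """Population Stability Index between a reference and a live sample."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # floor the proportions to avoid log(0)
    e_pct = np.clip(e_pct, 1e-6, None)
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

def should_retrain(reference, live, threshold=0.2):
    """Trigger JIT retraining when drift on a monitored feature exceeds the threshold."""
    return psi(reference, live) > threshold

rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, 5000)   # training-time feature distribution
stable    = rng.normal(0.0, 1.0, 5000)   # live traffic, same distribution
shifted   = rng.normal(1.5, 1.0, 5000)   # live traffic after a market shift

print(should_retrain(reference, stable))   # False: no drift, keep serving
print(should_retrain(reference, shifted))  # True: schedule recalibration
```

In a production pipeline this check would run per feature on a schedule, with the retraining job and rollout handled by the CI/CD automation the abstract describes.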
APA, Harvard, Vancouver, ISO, and other styles
25

D, Aishwaya. "AI Driven Phishing Detection Model." International Journal for Research in Applied Science and Engineering Technology 13, no. 4 (2025): 1023–25. https://doi.org/10.22214/ijraset.2025.70029.

Full text
Abstract:
Phishing attacks are a significant cybersecurity threat, as they trick people into revealing personal information through fake websites. This project introduces an integrated CNN-LSTM model to detect phishing URLs. It uses Convolutional Neural Networks (CNNs) to detect local patterns and Long Short-Term Memory (LSTM) networks to analyze the sequential structure of URLs. For interpretability, SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) are implemented, giving insights into how the model makes predictions. The trained model is served as a FastAPI/Flask web service, enabling real-time URL analysis. A browser extension is also created to communicate with the API, facilitating on-the-fly phishing detection as users surf the web. The system offers predictions as well as explanations, enhancing user trust and security awareness. By integrating deep learning, explainability, and web deployment, this project provides a practical and scalable cybersecurity solution, with potential for further improvements using Graph Neural Networks (GNNs), Reinforcement Learning, or Meta-Learning.
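As a rough companion to this abstract, the sketch below shows only the character-level URL encoding a CNN-LSTM front-end typically consumes, plus a few lexical flags often cited in phishing-URL work. It is not the authors' model; the vocabulary construction, `max_len=80`, and the chosen flags are illustrative assumptions:

```python
import string

# Character vocabulary a char-level CNN-LSTM would typically consume; 0 = padding.
VOCAB = {ch: i + 1 for i, ch in enumerate(string.printable.strip())}

def encode_url(url, max_len=80):
    """Map a URL to a fixed-length sequence of integer character ids."""
    ids = [VOCAB.get(ch, 0) for ch in url[:max_len]]
    return ids + [0] * (max_len - len(ids))

def lexical_flags(url):
    """A few hand-crafted lexical signals (illustrative, not the paper's feature set)."""
    return {
        "length": len(url),
        "digits": sum(ch.isdigit() for ch in url),
        "has_at": "@" in url,                 # '@' often hides the real host
        "hyphens": url.count("-"),
        "subdomains": url.split("//")[-1].split("/")[0].count("."),
    }

url = "http://secure-login.example.com@198.51.100.7/verify"
seq = encode_url(url)
print(len(seq))                      # 80: padded/truncated to the model input size
print(lexical_flags(url)["has_at"])  # True
```

The integer sequence would feed an embedding layer ahead of the CNN and LSTM blocks, while the lexical flags are the kind of tabular signal SHAP and LIME can attribute directly.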
APA, Harvard, Vancouver, ISO, and other styles
26

Kshirsagar, Meghana, Krishn Kumar Gupt, Gauri Vaidya, Conor Ryan, Joseph P. Sullivan, and Vivek Kshirsagar. "Insights Into Incorporating Trustworthiness and Ethics in AI Systems With Explainable AI." International Journal of Natural Computing Research 11, no. 1 (2022): 1–23. http://dx.doi.org/10.4018/ijncr.310006.

Full text
Abstract:
Over the past seven decades since the advent of artificial intelligence (AI) technology, researchers have demonstrated and deployed systems incorporating AI in various domains. The absence of model explainability in critical systems such as medical AI and credit risk assessment among others has led to neglect of key ethical and professional principles which can cause considerable harm. With explainability methods, developers can check their models beyond mere performance and identify errors. This leads to increased efficiency in time and reduces development costs. The article summarizes that steering the traditional AI systems toward responsible AI engineering can address concerns raised in the deployment of AI systems and mitigate them by incorporating explainable AI methods. Finally, the article concludes with the societal benefits of the futuristic AI systems and the market shares for revenue generation possible through the deployment of trustworthy and ethical AI systems.
APA, Harvard, Vancouver, ISO, and other styles
27

Harshavardhan, Polamarasetty. "Design and Implementation of a Fine-Tuned Llama-Based AI Chatbot with Voice and Text Interaction Using Streamlit and Ollama." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 03 (2025): 1–9. https://doi.org/10.55041/ijsrem42961.

Full text
Abstract:
Natural language processing (NLP) drives artificial intelligence (AI)-driven chatbots that have gained great popularity in many industries in recent years [15], thereby improving human-computer interactions. Optimized for quick conversational reactions, this paper describes an AI-driven chatbot powered by a fine-tuned Llama 3.2 model. Using Streamlit for an interactive user interface, Ollama for model deployment, and SpeechRecognition and pyttsx3 for smooth voice input and text-to-speech (TTS) output [5][10][11], the chatbot combines voice- and text-based communication [2][4]. Using Sloth, a dedicated framework for fine-tuning large language models (LLMs), the chatbot model is trained in a Google Colab environment. Exported in GGUF format and deployed using the Ollama runtime, the trained model runs inference efficiently. Streamlit provides a strong user interface including chat history management, a voice-based interaction toggle, live response streaming, and downloadable conversation logs [7][8]. The techniques used for chatbot fine-tuning, model deployment, and improvement of real-time interaction are brought forward in this study, along with response accuracy, user experience evaluations, and performance improvements. The research shows how connecting cutting-edge NLP models with current deployment platforms improves chatbot usability and performance [1][9][12][15]. Keywords—Artificial Intelligence (AI), Natural Language Processing (NLP), Conversational AI, AI Chatbots, Large Language Models (LLMs), Llama 3.2, Fine-Tuning, Sloth Framework, Google Colab, Ollama Deployment, GGUF Format, Streamlit UI, Speech-to-Text (STT), Text-to-Speech (TTS), SpeechRecognition, pyttsx3, Model Optimization, User Interaction, Real-time Response, Voice-Based Chatbot
APA, Harvard, Vancouver, ISO, and other styles
28

Krones, Felix, and Benjamin Walker. "From theoretical models to practical deployment: A perspective and case study of opportunities and challenges in AI-driven cardiac auscultation research for low-income settings." PLOS Digital Health 3, no. 12 (2024): e0000437. https://doi.org/10.1371/journal.pdig.0000437.

Full text
Abstract:
This article includes a literature review and a case study of artificial intelligence (AI) heart murmur detection models to analyse the opportunities and challenges in deploying AI in cardiovascular healthcare in low- or medium-income countries (LMICs). This study has two parallel components: (1) The literature review assesses the capacity of AI to aid in addressing the observed disparity in healthcare between high- and low-income countries. Reasons for the limited deployment of machine learning models are discussed, as well as model generalisation. Moreover, the literature review discusses how emerging human-centred deployment research is a promising avenue for overcoming deployment barriers. (2) A predictive AI screening model is developed and tested in a case study on heart murmur detection in rural Brazil. Our binary Bayesian ResNet model leverages overlapping log mel spectrograms of patient heart sound recordings and integrates demographic data and signal features via XGBoost to optimise performance. This is followed by a discussion of the model’s limitations, its robustness, and the obstacles preventing its practical application. The difficulty with which this model, and other state-of-the-art models, generalise to out-of-distribution data is also discussed. By integrating the results of the case study with those of the literature review, the NASSS framework was applied to evaluate the key challenges in deploying AI-supported heart murmur detection in low-income settings. The research accentuates the transformative potential of AI-enabled healthcare, particularly for affordable point-of-care screening systems in low-income settings. It also emphasises the necessity of effective implementation and integration strategies to guarantee the successful deployment of these technologies.
APA, Harvard, Vancouver, ISO, and other styles
29

Fauveau, Valentin, Sean Sun, Zelong Liu, et al. "Discovery Viewer (DV): Web-Based Medical AI Model Development Platform and Deployment Hub." Bioengineering 10, no. 12 (2023): 1396. http://dx.doi.org/10.3390/bioengineering10121396.

Full text
Abstract:
The rapid rise of artificial intelligence (AI) in medicine in the last few years highlights the importance of developing bigger and better systems for data and model sharing. However, the presence of Protected Health Information (PHI) in medical data poses a challenge when it comes to sharing. One potential solution to mitigate the risk of PHI breaches is to exclusively share pre-trained models developed using private datasets. Despite the availability of these pre-trained networks, there remains a need for an adaptable environment to test and fine-tune specific models tailored for clinical tasks. This environment should be open for peer testing, feedback, and continuous model refinement, allowing dynamic model updates that are especially important in the medical field, where diseases and scanning techniques evolve rapidly. In this context, the Discovery Viewer (DV) platform was developed in-house at the Biomedical Engineering and Imaging Institute at Mount Sinai (BMEII) to facilitate the creation and distribution of cutting-edge medical AI models that remain accessible after their development. The all-in-one platform offers a unique environment for non-AI experts to learn, develop, and share their own deep learning (DL) concepts. This paper presents various use cases of the platform, with its primary goal being to demonstrate how DV holds the potential to empower individuals without expertise in AI to create high-performing DL models. We tasked three non-AI experts to develop different musculoskeletal AI projects that encompassed segmentation, regression, and classification tasks. In each project, 80% of the samples were provided with a subset of these samples annotated to aid the volunteers in understanding the expected annotation task. Subsequently, they were responsible for annotating the remaining samples and training their models through the platform’s “Training Module”. 
The resulting models were then tested on the separate 20% hold-off dataset to assess their performance. The classification model achieved an accuracy of 0.94, a sensitivity of 0.92, and a specificity of 1. The regression model yielded a mean absolute error of 14.27 pixels. And the segmentation model attained a Dice Score of 0.93, with a sensitivity of 0.9 and a specificity of 0.99. This initiative seeks to broaden the community of medical AI model developers and democratize the access of this technology to all stakeholders. The ultimate goal is to facilitate the transition of medical AI models from research to clinical settings.
APA, Harvard, Vancouver, ISO, and other styles
30

Sherman, Eli, and Ian Eisenberg. "AI Risk Profiles: A Standards Proposal for Pre-deployment AI Risk Disclosures." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 21 (2024): 23047–52. http://dx.doi.org/10.1609/aaai.v38i21.30348.

Full text
Abstract:
As AI systems’ sophistication and proliferation have increased, awareness of the risks has grown proportionally. The AI industry is increasingly emphasizing the need for transparency, with proposals ranging from standardizing use of technical disclosures, like model cards, to regulatory licensing regimes. Since the AI value chain is complicated, with actors bringing varied expertise, perspectives, and values, it is crucial that consumers of transparency disclosures be able to understand the risks of the AI system in question. In this paper we propose a risk profiling standard which can guide downstream decision-making, including triaging further risk assessment, informing procurement and deployment, and directing regulatory frameworks. The standard is built on our proposed taxonomy of AI risks, which distills the wide variety of risks proposed in the literature into a high-level categorization. We outline the myriad data sources needed to construct informative Risk Profiles and propose a template and methodology for collating risk information into a standard, yet flexible, structure. We apply this methodology to a number of prominent AI systems using publicly available information. To conclude, we discuss design decisions for the profiles and future work.
APA, Harvard, Vancouver, ISO, and other styles
31

Soundararajan, Balaji. "Developing New AI Model Compression Techniques." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 09, no. 03 (2025): 1–8. https://doi.org/10.55041/ijsrem43474.

Full text
Abstract:
The rapid growth of artificial intelligence (AI) model complexity has created significant challenges for deployment on resource-constrained devices and customization by developers. Model compression techniques, such as pruning, quantization, and knowledge distillation, have emerged as critical solutions to reduce computational and memory demands while preserving accuracy. This work explores foundational and state-of-the-art approaches to AI model compression, emphasizing their role in enabling efficient edge computing, lowering energy consumption, and democratizing access to advanced AI capabilities. We discuss the trade-offs between model size, inference speed, and accuracy, and evaluate methods for deploying compressed models on mobile and IoT devices. Case studies on architectures like MobileNet and SqueezeNet demonstrate practical successes, while benchmark datasets and evaluation metrics highlight the need for standardized methodologies to assess compression efficacy. The paper concludes with guidelines for reliable experimentation and future directions in optimizing model efficiency without compromising performance. Keywords: AI Model Compression, Pruning and Sparsity, Quantization, Knowledge Distillation, MobileNet, Edge Computing, Resource-Constrained Devices, Inference Optimization.
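Of the techniques this abstract lists, quantization is the easiest to show in a few lines. The sketch below is a generic symmetric int8 post-training scheme, not the paper's method; the per-tensor scale and the synthetic weight matrix are assumptions:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric post-training quantization of a weight tensor to int8."""
    scale = np.abs(w).max() / 127.0          # one scale per tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights for inference or error analysis."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(42)
w = rng.normal(0, 0.1, size=(256, 256)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# 4x smaller (float32 -> int8), with rounding error bounded by half a step
print(q.nbytes / w.nbytes)                                  # 0.25
print(float(np.abs(w - w_hat).max()) <= scale / 2 + 1e-6)   # True
```

Production toolchains refine this with per-channel scales, calibration data, and quantization-aware training, which is where the accuracy/size trade-offs the paper evaluates come in.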
APA, Harvard, Vancouver, ISO, and other styles
32

Voruganti, Kiran Kumar. "Edge-AI and IoT DevOps: Managing Deployment Pipelines for Real-Time Analytics." Journal of Scientific and Engineering Research 9, no. 6 (2022): 84–94. https://doi.org/10.5281/zenodo.12666911.

Full text
Abstract:
The integration of Edge-AI and IoT within DevOps practices is revolutionizing data processing and real-time analytics, enabling immediate insights and decision-making across various industries. This paper explores the deployment of Edge-AI and IoT in DevOps environments, focusing on system architecture, automation, AI model training, real-time data processing, and security mechanisms. By examining the roles of edge computing nodes, AI model deployment, and real-time analytics, the study highlights the benefits of reduced latency, enhanced data privacy, and efficient resource utilization. Through detailed case studies in smart manufacturing and healthcare monitoring systems, this research demonstrates the practical applications and effectiveness of the proposed framework. The findings provide valuable insights for developing robust, scalable, and secure Edge-AI and IoT systems, addressing challenges such as data volume, latency, and security. This study contributes to the evolving field of IoT DevOps, offering a comprehensive framework and best practices for enhancing real-time analytics capabilities in cloud computing environments.
APA, Harvard, Vancouver, ISO, and other styles
33

Bhaskar Goyal. "Understanding cloud-native AI: The foundation of scalable platform architecture." World Journal of Advanced Engineering Technology and Sciences 15, no. 1 (2025): 822–27. https://doi.org/10.30574/wjaets.2025.15.1.0251.

Full text
Abstract:
Cloud-native AI represents a transformative paradigm shift in enterprise artificial intelligence deployment, fundamentally reimagining how organizations architect, deploy, and manage AI systems. By embracing containerization, microservices architecture, and declarative configuration, this approach enables unprecedented levels of scalability, resilience, and operational efficiency. The integration of Kubernetes orchestration with specialized hardware management creates a foundation for dynamically scaling AI workloads while optimizing resource utilization. Organizations implementing these architectural patterns have demonstrated substantial improvements across deployment velocity, infrastructure costs, and system reliability metrics. The layered platform design, separation of training and inference environments, and implementation of feature stores collectively address the unique challenges of enterprise AI deployment. Furthermore, the extension of DevOps practices into machine learning through MLOps automation accelerates the path from model development to production while maintaining robust governance and quality assurance. This architectural approach positions organizations to fully leverage AI capabilities while maintaining the scalability, reliability, and efficiency demanded by enterprise environments.
APA, Harvard, Vancouver, ISO, and other styles
34

Sharipov, Rinat. "Analysis and Reduction of Errors in AI Models." American Journal of Engineering and Technology 07, no. 05 (2025): 202–10. https://doi.org/10.37547/tajet/volume07issue05-20.

Full text
Abstract:
The issue of errors in artificial intelligence (AI) models is a critical aspect that requires systematic analysis and the application of effective methods for their reduction. Errors in AI models can occur at various stages of development and deployment, including data collection, model training, and operation phases. The key tasks in this field involve identifying error sources and applying approaches aimed at eliminating them. Methods such as cross-validation, regularization, and the use of ensemble models play a significant role in reducing errors and improving prediction accuracy. Therefore, for the successful use of AI technologies in various domains, continuous attention to model monitoring, parameter adjustment, and the implementation of innovative methods to minimize risks is necessary.
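One of the error-reduction methods this abstract names, ensembling, can be demonstrated numerically. The toy below is our construction rather than anything from the paper: it averages 25 independently noisy predictors and compares mean-squared error against a single one:

```python
import numpy as np

truth = np.sin(np.linspace(0, 3, 200))      # ground-truth signal to be predicted

def noisy_model(seed):
    """Stand-in for one trained model: the truth plus independent estimation noise."""
    return truth + np.random.default_rng(seed).normal(0, 0.3, truth.shape)

def mse(pred):
    return float(np.mean((pred - truth) ** 2))

single = mse(noisy_model(0))
ensemble = mse(np.mean([noisy_model(s) for s in range(25)], axis=0))

# Averaging k models with independent errors cuts the variance term by ~1/k
print(ensemble < single)   # True
```

The same variance argument is why bagging and model averaging appear alongside cross-validation and regularization in the error-reduction toolbox the abstract surveys.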
APA, Harvard, Vancouver, ISO, and other styles
35

Malali, Nihar. "THE ROLE OF DEVSECOPS IN FINANCIAL AI MODELS: INTEGRATING SECURITY AT EVERY STAGE OF AI/ML MODEL DEVELOPMENT IN BANKING AND INSURANCE." International Journal of Engineering Technology Research & Management (IJETRM) 06, no. 11 (2022): 218–25. https://doi.org/10.5281/zenodo.15239176.

Full text
Abstract:
Artificial intelligence (AI) and machine learning (ML) technologies have brought revolutionary changes to financial institutions such as banks and insurers. The financial industry relies heavily on AI models for automated underwriting policies, personalized recommendation services, fraudulent-activity discovery, and credit scoring assessments. The deeper financial institutions incorporate these models into their systems, the more vulnerable they become to cyberattacks. DevSecOps represents a revolutionary method which includes security measures during every stage of the AI/ML model lifecycle, from the initial phase through the final phase. This paper investigates how DevSecOps secures financial AI models while discussing the special digital threats faced by banking and insurance institutions. Security considerations need to be embedded throughout the entire cycle, from data preprocessing to deployment, due to their roles in ensuring regulatory adherence, safeguarding sensitive information, and preserving dependability. The document utilizes five sections to present information about DevSecOps practices in financial AI systems, development pipeline security approaches, deployment risk reduction techniques, and compliance requirements, together with successful implementation examples. This work provides organizations with a real-world approach for developing resilient and compliant AI systems, which benefits data specialists along with practitioners from financial IT teams. Minimizing risk takes precedence over convenience because modern financial institutions operate with extensive data sets and serious consequences, making DevSecOps implementation mandatory. Treating security as a baseline framework before anything else allows institutions to protect both customer trust and institutional integrity while ensuring their AI initiatives have enduring value.
APA, Harvard, Vancouver, ISO, and other styles
36

Venkata Krishna Koganti. "Autonomous CI/CD Meshes: Self-healing deployment architectures with AI-ML Orchestration." World Journal of Advanced Engineering Technology and Sciences 15, no. 2 (2025): 2731–45. https://doi.org/10.30574/wjaets.2025.15.2.0777.

Full text
Abstract:
This article introduces a novel architecture for autonomous continuous integration and continuous deployment (CI/CD) systems capable of self-healing and self-optimization without human intervention. The article presents intelligent deployment meshes that integrate deep anomaly detection using LSTM networks with Bayesian change-point detection to identify deployment anomalies before they impact production environments. The proposed framework leverages causal CI/CD graphs to model complex interdependencies between microservices, enabling context-aware remediation strategies including automated rollbacks and intelligent canary analysis. The article's approach unifies machine learning metadata tracking (MLMD) with traditional software observability stacks, creating dual-aspect visibility that optimizes for both model-aware and application-aware pipeline configurations. The article demonstrates how semantic diffing engines can perform version-aware auto-validation, significantly reducing false positives in anomaly detection while improving remediation accuracy in multi-tenant environments. The resulting autonomous CI/CD architecture represents a paradigm shift from reactive to predictive deployment strategies, enabling organizations to maintain high availability while accelerating release velocity in complex microservice ecosystems.
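The automated rollback and canary analysis this abstract describes can be reduced to a small decision loop. The sketch below is a deliberately simple stand-in for the paper's LSTM and Bayesian change-point detectors: a sliding-window error-rate gate; the class name, window size, and 5% threshold are all assumptions:

```python
from collections import deque

class CanaryGate:
    """Watches canary traffic and decides whether to promote or roll back.

    A toy detector: a fixed error-rate threshold over a sliding window,
    standing in for the learned anomaly detectors described in the article.
    """

    def __init__(self, window=50, max_error_rate=0.05):
        self.window = deque(maxlen=window)
        self.max_error_rate = max_error_rate

    def observe(self, request_failed):
        self.window.append(bool(request_failed))

    def decision(self):
        if len(self.window) < self.window.maxlen:
            return "collecting"                  # not enough evidence yet
        rate = sum(self.window) / len(self.window)
        return "rollback" if rate > self.max_error_rate else "promote"

gate = CanaryGate(window=50, max_error_rate=0.05)
for i in range(50):
    gate.observe(request_failed=(i % 10 == 0))   # canary failing 10% of requests
print(gate.decision())                           # rollback
```

A real mesh would replace the threshold with learned detectors and consult the causal service graph before remediating, but the promote/rollback contract at the pipeline boundary looks the same.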
APA, Harvard, Vancouver, ISO, and other styles
37

Tarafdar, Rajarshi. "SELF-HEALING AI MODEL INFRASTRUCTURE: AN AUTOMATED APPROACH TO MODEL DEPLOYMENT MAINTENANCE AND RELIABILITY." INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY AND MANAGEMENT INFORMATION SYSTEMS 16, no. 1 (2025): 992–1004. https://doi.org/10.34218/ijitmis_16_01_071.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Gupta, Shreya. "The Rise of Serverless AI: Transforming Machine Learning Deployment." European Journal of Computer Science and Information Technology 13, no. 5 (2025): 45–67. https://doi.org/10.37745/ejcsit.2013/vol13n54567.

Full text
Abstract:
Serverless computing has revolutionized artificial intelligence deployment by introducing a paradigm shift in infrastructure management and resource utilization. The technology enables organizations to deploy AI solutions without managing underlying infrastructure, offering automatic scaling and pay-per-use pricing models. Function-as-a-Service dominates the market share, particularly in the Banking, Financial Services and Insurance sector, while Backend-as-a-Service gains traction in AI applications. Organizations achieve significant reductions in total cost of ownership while maintaining high service availability. The geographical distribution showcases North American leadership, with Asia Pacific regions demonstrating substantial growth potential. Technical advancements in serverless AI platforms support diverse ML frameworks and model architectures, enabling efficient resource utilization and rapid deployment capabilities. While cold start latency and resource constraints present challenges, continuous platform optimization and framework development address these issues. The integration of edge computing with serverless principles enhances distributed AI applications, reducing data transfer requirements and improving overall system performance.
APA, Harvard, Vancouver, ISO, and other styles
39

Li, Xinnuo. "Analysis of the Designs and Applications of AI Chip." Highlights in Science, Engineering and Technology 76 (December 31, 2023): 168–80. http://dx.doi.org/10.54097/k1p7yk27.

Full text
Abstract:
The rapid evolution of deep learning model architectures and the increasing scale of model parameters have imposed heightened demands on deep learning training, inference, and deployment, leading to the swift advancement and unprecedented prosperity of AI chips. Therefore, this study sets out to analyze the designs and applications of AI chips by considering their unique requirements compared to conventional chips, and by combining software and hardware aspects. The paper delineates the classification of common AI chips along with their distinct design strategies and optimization algorithms. It commences with the fundamental hardware design of AI chips, elucidating the basic design process and addressing the specialized demands of AI computation, particularly data parallelism and storage optimization. Subsequently, transitioning to the manufacturing process, it examines how current AI chips circumvent fabrication bottlenecks and achieve significant breakthroughs in architecture and performance through chip stacking techniques. The paper then bridges hardware and software through the AI compiler, expounding on model optimization approaches, e.g., quantization and pruning, completing the comprehensive journey from AI chip design to deployment. It identifies current developmental challenges in the AI chip realm and provides a glimpse into future prospects. Through a holistic perspective spanning design, manufacturing, algorithms, and applications of AI chips, this paper offers insights that steer upcoming innovations and practical implementations in artificial intelligence, paving the way for a dynamic future in AI chip development.
APA, Harvard, Vancouver, ISO, and other styles
40

Bhanu Prakash Kolli. "The rise of AI-Augmented DevOps: How human engineers and AI Co-manage cloud infrastructure." World Journal of Advanced Engineering Technology and Sciences 15, no. 1 (2025): 1577–88. https://doi.org/10.30574/wjaets.2025.15.1.0270.

Full text
Abstract:
The integration of artificial intelligence into DevOps practices represents a paradigm shift in cloud infrastructure management. As cloud environments grow increasingly complex with microservices architectures and multi-cloud deployments, traditional operational approaches are proving insufficient. Rather than replacing human engineers, AI-augmented DevOps serves as a collaborative force that enhances decision-making capabilities, automates routine tasks, and provides insights that are impossible to derive manually. This article explores several key dimensions of this emerging paradigm: AI-powered observability systems that dramatically reduce false positives while improving anomaly detection; intelligent CI/CD pipelines that optimize code quality, deployment strategies, and rollback procedures; the critical balance between human expertise and AI automation; and practical implementation frameworks for organizations at various maturity levels. Through case studies from financial services and e-commerce sectors, the article demonstrates how thoughtful integration of AI capabilities with human workflows creates a new operational model that achieves unprecedented levels of reliability, performance, and security at scale while enabling engineering teams to focus on innovation rather than firefighting.
APA, Harvard, Vancouver, ISO, and other styles
41

Khoroshylov, S. V., and V. K. Shamakhanov. "Deployment control of transformable rod structures using reinforcement learning." Technical mechanics 2025, no. 1 (2025): 63–76. https://doi.org/10.15407/itm2025.01.063.

Full text
Abstract:
The task of controlling the deployment of transformable rod structures for space applications is studied. An example of such structures is a mesh antenna truss, which is deployed using a cable-pulley system. The aim of the study is to develop an intelligent agent (IA) based on the reinforcement learning (RL) methodology, which ensures the deployment and maintenance of the structure under consideration in the deployed position, taking into account the specified requirements. The main requirements are the deployment time and the minimum angular velocities of the V-folding rods at the final stage of the structure deployment. During the research, methods of dynamic modeling of multibody systems, control theory, reinforcement learning, and computer simulation were used. The possibility of using the RL methodology to overcome a number of difficulties inherent in traditional approaches to controlling the deployment of transformable rod structures is demonstrated. In particular, RL allows optimizing the deployment system using models obtained with specialized software for modeling multibody dynamics, taking into account the necessary criteria and constraints. The features of this approach to controlling the deployment of rod structures were investigated using a simplified model of one section of a transformable mesh antenna. The IA was designed on the basis of the actor-critic architecture. A structure for the IA's neural networks was proposed which ensures the implementation of constraints on control actions and the stability of the learning process. The proximal policy optimization algorithm is used for training the IA. Various cases are investigated, which differ in cost functions, actor activation functions, and friction parameters of the joints. In cases where the dynamic properties of the model and the real structure differ significantly, the IA can be fine-tuned. 
This operation can be implemented by deploying the real structure, since the IA requires significantly fewer attempts for final fine-tuning than for preliminary training. The practical value of the obtained results is that they facilitate the development of space structure deployment control systems and improve their performance according to different specified criteria.
APA, Harvard, Vancouver, ISO, and other styles
42

Park, Jeman, Misun Yu, Jinse Kwon, Junmo Park, Jemin Lee, and Yongin Kwon. "NEST‐C: A deep learning compiler framework for heterogeneous computing systems with artificial intelligence accelerators." ETRI Journal 46, no. 5 (2024): 851–64. http://dx.doi.org/10.4218/etrij.2024-0139.

Abstract:
Deep learning (DL) has significantly advanced artificial intelligence (AI); however, frameworks such as PyTorch, ONNX, and TensorFlow are optimized for general-purpose GPUs, leading to inefficiencies on specialized accelerators such as neural processing units (NPUs) and processing-in-memory (PIM) devices. These accelerators are designed to optimize both throughput and energy efficiency, but they require more tailored optimizations. To address these limitations, we propose the NEST compiler (NEST-C), a novel DL framework that improves the deployment and performance of models across various AI accelerators. NEST-C leverages profiling-based quantization, dynamic graph partitioning, and multi-level intermediate representation (IR) integration for efficient execution on diverse hardware platforms. Our results show that NEST-C significantly enhances computational efficiency and adaptability across various AI accelerators, achieving higher throughput, lower latency, improved resource utilization, and greater model portability. These benefits contribute to more efficient DL model deployment in modern AI applications.
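The profiling-based quantization that NEST-C leverages can be illustrated generically: calibration data is profiled for its dynamic range, and an affine int8 mapping is derived from it. The following sketch uses assumed function names and a plain numpy implementation; it is not NEST-C's API:

```python
import numpy as np

def profile_range(calibration_batches):
    """Profile the min/max activation values seen over calibration data."""
    lo = min(float(b.min()) for b in calibration_batches)
    hi = max(float(b.max()) for b in calibration_batches)
    return lo, hi

def quantize_int8(x, lo, hi):
    """Affine int8 quantization of x using the profiled range [lo, hi]."""
    scale = (hi - lo) / 255.0                      # one int8 step in float units
    zero_point = round(-lo / scale) - 128          # integer offset so lo -> -128
    q = np.clip(np.round(x / scale) + zero_point, -128, 127)
    return q.astype(np.int8), scale, zero_point

def dequantize(q, scale, zero_point):
    """Map int8 values back to approximate float values."""
    return (q.astype(np.float32) - zero_point) * scale

# Profile sample activations, then quantize a tensor within that range.
calib = [np.random.default_rng(i).uniform(-1.0, 3.0, 64) for i in range(4)]
lo, hi = profile_range(calib)
x = np.linspace(lo, hi, 8).astype(np.float32)
q, scale, zp = quantize_int8(x, lo, hi)
x_hat = dequantize(q, scale, zp)
assert np.max(np.abs(x - x_hat)) <= scale  # round-trip error within one step
```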
43

Thota, Ravi Chandra. "AI-driven infrastructure automation: Enhancing cloud efficiency with MLOps and DevOps." INTERNATIONAL JOURNAL OF NOVEL RESEARCH AND DEVELOPMENT 6, no. 9 (2021): 1–11. https://doi.org/10.5281/zenodo.15041132.

Abstract:
Cloud infrastructure management has achieved a reliable system structure combining operational efficiency and scalability thanks to AI-driven automation. The research outlines strategies for integrating AI into DevOps and MLOps operations to enhance cloud management performance. The automation solution combines predictive analytics, workload adaptation, and automatic recovery capabilities for operating cloud infrastructure. AI deployment within DevOps results in deployment acceleration of 40-60%, while resource consumption improves by 30-50%. Anomaly detection algorithms protect cloud infrastructure by automatically halting system failures at a 45% level, increasing cloud system reliability. AI automation enhances every area of cloud management, surpassing conventional strategies across cost reduction, system security, and operational achievement metrics. Problems that come with implementation, such as AI model drift and complex system integration, should not prevent the development of autonomous intelligent cloud environments through AI infrastructure automation. Organizations experience improved infrastructure management through AI because the solution provides advanced digital operation capabilities that guide businesses toward enhanced solutions.
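The anomaly-detection idea described in this abstract can be sketched with a simple z-score detector over deployment metrics; the latency values and 3-sigma threshold below are illustrative assumptions, not the paper's method:

```python
import statistics

def detect_anomalies(samples, threshold=3.0):
    """Return indices of samples deviating from the mean by > threshold sigma.

    A rollout controller could halt a deployment as soon as any
    post-deploy metric sample is flagged.
    """
    mean = statistics.fmean(samples)
    stdev = statistics.pstdev(samples)
    if stdev == 0:
        return []  # perfectly flat series: nothing to flag
    return [i for i, v in enumerate(samples)
            if abs(v - mean) / stdev > threshold]

# Ten healthy latency samples (ms) followed by one post-deploy spike.
latencies = [101, 99, 102, 98, 100, 103, 97, 100, 99, 101, 450]
print(detect_anomalies(latencies))  # → [10]
```

Production systems would use streaming baselines and per-metric thresholds rather than a single batch statistic, but the halt-on-deviation principle is the same.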
44

Lahlali, Mustapha, Naoual Berbiche, and Jamila El Alami. "Artificial Intelligence Operating Model: A Proposal Framework for AI Operationalization and Deployment." Journal of Computer Science 18, no. 11 (2022): 1100–1109. http://dx.doi.org/10.3844/jcssp.2022.1100.1109.

45

Xu, Kan, Zhe Chen, Fu Xiao, Jing Zhang, Hanbei Zhang, and Tianyou Ma. "Semantic model-based large-scale deployment of AI-driven building management applications." Automation in Construction 165 (September 2024): 105579. http://dx.doi.org/10.1016/j.autcon.2024.105579.

46

Malik, Rohan. "DevOps and MLOps: Integrating CI/CD Pipelines for Scalable AI Model Deployment." International Journal of Emerging Trends in Computer Science and Information Technology 3, no. 1 (2022): 1–7. https://doi.org/10.63282/3050-9246.ijetcsit-v3i4p101.

47

Chaudhuri, Ranjan, Sheshadri Chatterjee, Demetris Vrontis, and Sumana Chaudhuri. "Innovation in SMEs, AI Dynamism, and Sustainability: The Current Situation and Way Forward." Sustainability 14, no. 19 (2022): 12760. http://dx.doi.org/10.3390/su141912760.

Abstract:
The purpose of this study is to examine artificial intelligence (AI) dynamism and its impact on sustainability of firms, including small and medium enterprises (SMEs). In addition, this study investigates the moderating effects of technological and leadership support for AI technology deployment and sustainability for manufacturing and production firms. We developed a theoretical model through the lenses of expectation disconfirmation theory (EDT), technology–trust–fit (TTF) theory, contingency theory, and the knowledge contained in the existing literature. We tested the proposed theoretical model using the factor-based PLS-SEM technique by analyzing data from 343 managers of SMEs. The findings of this study demonstrate that organizational characteristics, situational characteristics, technological characteristics, and individual characteristics all impacted SMEs’ deployment of AI technologies for the purpose of achieving sustainability, with technological and leadership support acting as moderators.
48

Symeonides, Moysis, Demetris Trihinas, and Fotis Nikolaidis. "FedMon: A Federated Learning Monitoring Toolkit." IoT 5, no. 2 (2024): 227–49. http://dx.doi.org/10.3390/iot5020012.

Abstract:
Federated learning (FL) is rapidly shaping into a key enabler for large-scale Artificial Intelligence (AI) where models are trained in a distributed fashion by several clients without sharing local and possibly sensitive data. For edge computing, sharing the computational load across multiple clients is ideal, especially when the underlying IoT and edge nodes encompass limited resource capacity. Despite its wide applicability, monitoring FL deployments comes with significant challenges. AI practitioners are required to invest a vast amount of time (and labor) in manually configuring state-of-the-art monitoring tools. This entails addressing the unique characteristics of the FL training process, including the extraction of FL-specific and system-level metrics, aligning metrics to training rounds, pinpointing performance inefficiencies, and comparing current to previous deployments. This work introduces FedMon, a toolkit designed to ease the burden of monitoring FL deployments by seamlessly integrating the probing interface with the FL deployment, automating the metric extraction, providing a rich set of system, dataset, model, and experiment-level metrics, and providing the analytic means to assess trade-offs and compare different model and training configurations.
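The FL training rounds that FedMon aligns its metrics to typically end in a server-side aggregation step. A minimal FedAvg-style sketch (a generic FL building block, not FedMon's own API) looks like:

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Aggregate client model weights, weighted by local dataset size.

    client_weights: one list of numpy arrays (layers) per client.
    client_sizes:   number of local training samples per client.
    """
    total = sum(client_sizes)
    n_layers = len(client_weights[0])
    return [
        sum(w[layer] * (n / total)
            for w, n in zip(client_weights, client_sizes))
        for layer in range(n_layers)
    ]

# Two clients, one-layer model: client 0 has 3x the data of client 1.
c0 = [np.array([4.0, 8.0])]
c1 = [np.array([0.0, 0.0])]
global_model = fedavg([c0, c1], client_sizes=[3, 1])
# Weighted mean: 0.75 * [4, 8] + 0.25 * [0, 0] = [3, 6]
```

A monitoring toolkit such as FedMon would record per-round metrics (round duration, client participation, model deltas) around exactly this kind of aggregation boundary.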
49

Birru, Naveen Kumar. "Secure AI Infrastructure: Building Trustworthy AI Systems in Distributed Environments." World Journal of Advanced Engineering Technology and Sciences 15, no. 2 (2025): 2756–67. https://doi.org/10.30574/wjaets.2025.15.2.0748.

Abstract:
As enterprises increasingly deploy artificial intelligence to drive customer experiences, business intelligence, and automation, ensuring the security of AI infrastructure has become paramount. Distributed AI systems must not only be scalable and performant; they must also be trustworthy, protecting sensitive data and model integrity across dynamic, cloud-native environments. This article explores critical components of secure AI infrastructure, highlighting strategies and technologies for building resilient systems that withstand sophisticated threats. From securing data pipelines with encryption and access controls to protecting model training environments and inference endpoints, a comprehensive defense-in-depth approach addresses the unique security challenges of AI systems. Privacy-preserving techniques like federated learning and differential privacy enable organizations to balance utility with data protection requirements. Proper governance frameworks incorporating model inventories, version control, and ethical considerations establish the foundation for responsible AI deployment. Through practical implementation examples, including a case study from the financial services sector, this article demonstrates how organizations can create AI systems that protect against emerging threats while maintaining operational effectiveness across diverse computing environments.
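Among the privacy-preserving techniques named in this abstract, differential privacy is commonly realized by adding calibrated noise to aggregate statistics. A minimal Laplace-mechanism sketch follows; the sensitivity and epsilon values are illustrative, not drawn from the article:

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Epsilon-differentially-private release of a numeric query result.

    Noise scale b = sensitivity / epsilon: a smaller epsilon gives
    stronger privacy at the cost of a noisier answer.
    """
    if rng is None:
        rng = np.random.default_rng()
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Counting query (sensitivity 1): release a private record count.
rng = np.random.default_rng(42)
private_count = laplace_mechanism(1000, sensitivity=1.0, epsilon=0.5, rng=rng)
```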
50

Ruchita, Singhania. "Medibuddy- A Healthcare Chatbot using AI." International Journal of Soft Computing and Engineering (IJSCE) 14, no. 3 (2024): 14–19. https://doi.org/10.35940/ijsce.G9902.14030724.

Abstract:
This paper presents the development of a Flask-based web application designed to predict diseases based on user-reported symptoms and provide relevant health information. Leveraging machine learning techniques, the system utilizes a dataset of diseases and their associated symptoms to generate predictions through cosine similarity and a pre-trained Random Forest model. The application features a user-friendly interface for registration, login, and symptom reporting. Additionally, it integrates the DuckDuckGo search API to fetch detailed information about predicted diseases, enhancing the user experience with comprehensive health insights. The application also includes an interactive chatbot to guide users through the symptom input process, ensuring accurate data collection for reliable disease prediction. The system is built with Python, utilizing libraries such as pandas, numpy, and scikit-learn for data processing and model deployment, and is powered by SQLAlchemy for database management. This work aims to provide an accessible tool for preliminary health assessment, potentially aiding in early diagnosis and prompt medical attention.
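The cosine-similarity matching described in this abstract can be sketched over binary symptom vectors; the disease table and symptom vocabulary below are invented placeholders, not the paper's dataset:

```python
import numpy as np

# Hypothetical disease-symptom matrix: one binary row per disease.
SYMPTOMS = ["fever", "cough", "headache", "fatigue", "rash"]
DISEASES = {
    "flu":      [1, 1, 1, 1, 0],
    "measles":  [1, 0, 0, 1, 1],
    "migraine": [0, 0, 1, 1, 0],
}

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def predict(user_symptoms):
    """Rank diseases by cosine similarity to the reported symptom set."""
    vec = [1 if s in user_symptoms else 0 for s in SYMPTOMS]
    scores = {d: cosine(vec, row) for d, row in DISEASES.items()}
    return max(scores, key=scores.get)

best = predict({"fever", "cough", "fatigue"})  # → "flu"
```

In the described system, this similarity ranking would complement the pre-trained Random Forest classifier rather than replace it.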