Journal articles on the topic 'DeepSeek-R1'

Author: Grafiati

Published: 7 June 2025

Last updated: 2 August 2025

Consult the top 50 journal articles for your research on the topic 'DeepSeek-R1.'

1

Hayder, Wrya Anwar. "Highlighting DeepSeek-R1: Architecture, Features and Future Implications." International Journal of Computer Science and Mobile Computing 14, no. 2 (2025): 1–13. https://doi.org/10.47760/ijcsmc.2025.v14i02.001.

Abstract:

Large language models have taken center stage in artificial intelligence, but they face challenges such as high computational costs, hard limits on scaling, and difficulty adapting to new tasks. DeepSeek-R1, which is now widely used, addresses these issues through architectural insights, novel learning paradigms, and optimization approaches. This paper presents a high-level comparison of DeepSeek-R1 with generic LLMs. Here, generic LLMs are popular models predating DeepSeek-R1, such as OpenAI's GPT-4, Meta's Llama, and Google's PaLM, which rely on other archit

2

李昌奎. "ChatGPT与DeepSeek-R1比较研究：架构、推理能力与应用场景分析" [A Comparative Study of ChatGPT and DeepSeek-R1: Analysis of Architecture, Reasoning Capabilities, and Application Scenarios]. Theory and Practice of Social Science 7, no. 2 (2025): 18–31. https://doi.org/10.6914/tpss.070202.

Abstract:

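The rapid development of artificial intelligence has driven the continuous advancement of large language models (LLMs). Among the many LLMs, ChatGPT, released by OpenAI, and DeepSeek-R1, developed by DeepSeek-AI, stand out. ChatGPT is based on the GPT-4 architecture and offers strong natural language understanding and a broad range of application scenarios, whereas DeepSeek-R1 optimizes its reasoning ability through reinforcement learning and is highly competitive on mathematical reasoning and programming tasks. Drawing on the latest research on DeepSeek-R1, this paper comprehensively compares ChatGPT and DeepSeek-R1 in terms of model architecture, training methods, reasoning ability, application scenarios, and openness. The study finds that ChatGPT relies on supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) and performs particularly well on natural language processing tasks, while DeepSeek-R1 leans toward optimizing reasoning through reinforcement learning and excels at tasks such as mathematical reasoning and code generation. In addition, ChatGPT follows a closed-source strategy and is used mainly in commercial applications, whereas DeepSeek-R1 is open source, giving the research community and developers greater flexibility. The findings offer an important reference for AI researchers and developers, with the aim of advancing LLM technology and suggesting new directions for future large-model optimization.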

3

Chan, Lining, Xinjie Xu, and Kaiyang Lv. "DeepSeek-R1 and GPT-4 are comparable in a complex diagnostic challenge: a historical control study." International Journal of Surgery 111, no. 6 (2025): 4056–59. https://doi.org/10.1097/js9.0000000000002386.

Abstract:

Background: Large language models (LLMs) have demonstrated potential in medical diagnostics, but their accuracy in complex cases remains a subject of investigation. DeepSeek-R1, an open-source model with advanced reasoning capabilities, has gained global attention. This study evaluates the diagnostic performance of DeepSeek-R1 compared to GPT-4 in complex clinical cases. Materials and methods: A historical control study was conducted using 100 clinicopathologic cases from the New England Journal of Medicine (NEJM), published between 18 August 2022, and 30 January 2025. Each case was processed

4

ZHANG, Huimin. "How DeepSeek-R1 was created?" Journal of Shenzhen University Science and Engineering 42, no. 2 (2025): 226–32. https://doi.org/10.3724/sp.j.1249.2025.02226.

5

Qin, Wenting, Lijie Suo, Liangchen Li, and Fan Yang. "Advancing Software Vulnerability Detection with Reasoning LLMs: DeepSeek-R1's Performance and Insights." Applied Sciences 15, no. 12 (2025): 6651. https://doi.org/10.3390/app15126651.

Abstract:

The increasing complexity of software systems has heightened the need for efficient and accurate vulnerability detection. Large Language Models have emerged as promising tools in this domain; however, their reasoning capabilities and limitations remain insufficiently explored. This study presents a systematic evaluation of different Large Language Models with and without explicit reasoning mechanisms, including Claude-3.5-Haiku, GPT-4o-Mini, DeepSeek-V3, O3-Mini, and DeepSeek-R1. Experimental results demonstrate that reasoning-enabled models, particularly DeepSeek-R1, outperform their non-reas

6

Meo, Sultan Ayoub, Farah A. Abukhalaf, Riham A. ElToukhy, and Kamran Sattar. "Exploring the role of DeepSeek-R1, ChatGPT-4, and Google Gemini in medical education: How valid and reliable are they?" Pakistan Journal of Medical Sciences 41, no. 7 (2025): 1887–92. https://doi.org/10.12669/pjms.41.7.12183.

Abstract:

Objective: In recent years, Artificial Intelligence (AI) has led to rapid advancements in science, technology, industries, healthcare settings, and medical education. A Chinese-built large language model, DeepSeek-R1, inspires the scientific community as an affordable and open alternative to earlier established US-based AI models, ChatGPT-4 and Google Gemini 1.5 Pro. This study aimed to explore the role of “DeepSeek-R1, ChatGPT-4 and Google Gemini 1.5 Pro” and to assess the validity and reliability of these AI tools in medical education. Methods: The current cross-sectional study was performed

7

Sallam, Malik, Israa M. Alasfoor, Shahad W. Khalid, et al. "Chinese generative AI models (DeepSeek and Qwen) rival ChatGPT-4 in ophthalmology queries with excellent performance in Arabic and English." Narra J 5, no. 1 (2025): e2371. https://doi.org/10.52225/narra.v5i1.2371.

Abstract:

The rapid evolution of generative artificial intelligence (genAI) has ushered in a new era of digital medical consultations, with patients turning to AI-driven tools for guidance. The emergence of Chinese-developed genAI models such as DeepSeek-R1 and Qwen-2.5 presented a challenge to the dominance of OpenAI’s ChatGPT. The aim of this study was to benchmark the performance of Chinese genAI models against ChatGPT-4o and to assess disparities in performance across English and Arabic. Following the METRICS checklist for genAI evaluation, Qwen-2.5, DeepSeek-R1, and ChatGPT-4o were assessed for com

8

Jiao, Cheng, Erik Rosas, Hassan Asadigandomani, et al. "Diagnostic Performance of Publicly Available Large Language Models in Corneal Diseases: A Comparison with Human Specialists." Diagnostics 15, no. 10 (2025): 1221. https://doi.org/10.3390/diagnostics15101221.

Abstract:

Background/Objectives: This study evaluated the diagnostic accuracy of seven publicly available large language models (LLMs)—GPT-3.5, GPT-4.o Mini, GPT-4.o, Gemini 1.5 Flash, Claude 3.5 Sonnet, Grok3, and DeepSeek R1—in diagnosing corneal diseases, comparing their performance to human specialists. Methods: Twenty corneal disease cases from the University of Iowa’s EyeRounds were presented to each LLM. Diagnostic accuracy was determined by comparing LLM-generated diagnoses to the confirmed case diagnoses. Four human cornea specialists evaluated the same cases to establish a benchmark and assess

9

Papaioannou, Ioannis, Christos Korkas, and Elias Kosmatopoulos. "Smart Building Recommendations with LLMs: A Semantic Comparison Approach." Buildings 15, no. 13 (2025): 2303. https://doi.org/10.3390/buildings15132303.

Abstract:

The increasing need for sustainable energy management in smart buildings calls for cost-effective solutions that balance energy efficiency and occupant comfort. This article presents a Large Language Model (LLM)-based recommendation system capable of generating proactive, context-aware suggestions from dynamic building conditions. The system was trained on a combination of real-world data and Sinergym simulations, capturing inputs such as weather conditions, forecasts, energy usage, electricity prices, and detailed zone parameters. Five models were fine-tuned and evaluated: GPT-2-Small, GPT-2-

10

Han, Zongshuo. "Silicon Disruption: An Event Study of DeepSeek R1’s Breakthrough Impact on Semiconductor Markets." SHS Web of Conferences 218 (2025): 01030. https://doi.org/10.1051/shsconf/202521801030.

Abstract:

This paper conducts an event study to examine the US stock market's response to the launch of the DeepSeek R1 model by a Chinese competitor, and to assess how US semiconductor manufacturers reacted to the launch. DeepSeek R1 is a new large language model designed to challenge the performance of existing models such as Hudy, Claude-3 and o1-mini. The results from the event study show a significant negative reaction from investors to US semiconductor stocks in response to the release of the DeepSeek R1 model. Furthermore, the effect is stronger among specialized AI servic

11

Wang, Runchen, Jianxing He, and Hengrui Liang. "Medicine’s J.A.R.V.I.S. moment: how DeepSeek-R1 transforms clinical practice." Journal of Thoracic Disease 17, no. 3 (2025): 1784–87. https://doi.org/10.21037/jtd-2025b-05.

12

Di Palma, Luca, Fatemeh Darvizeh, Marco Alì, and Deborah Fazzini. "Structured Transformation of Unstructured Prostate MRI Reports Using Large Language Models." Tomography 11, no. 6 (2025): 69. https://doi.org/10.3390/tomography11060069.

Abstract:

Objectives: to assess the ability of high-performing open-weight large language models (LLMs) in extracting key radiological features from prostate MRI reports. Methods: Five LLMs (Llama3.3, DeepSeek-R1-Llama3.3, Phi4, Gemma-2, and Qwen2.5-14B) were used to analyze free-text MRI reports retrieved from clinical practice. Each LLM processed reports three times using specialized prompts to extract (1) dimensions, (2) volume and PSA density, and (3) lesion characteristics. An experienced radiologist manually annotated the dataset, defining entities (Exam) and sub-entities (Lesion, Dimension). Feat

13

Huang, Huan. "The Transformation of Discourse Power in the AI Era: An Analysis of News Communication Power with DeepSeek as an Example." Advances in Education, Humanities and Social Science Research 14, no. 1 (2025): 97. https://doi.org/10.56028/aehssr.14.1.97.2025.

Abstract:

The iteration of artificial intelligence technology is reshaping the global communication power pattern, and the media characteristics of generative AI profoundly reconstruct the production and distribution mechanism of discourse power. This paper takes DeepSeek as the research object, combines the theoretical backgrounds of Actor-Network Theory, the sociology of news production, and technological affordance, and systematically explores how the DeepSeek-R1 model, a masterwork of generative AI, promotes the reengineering of the news production process and the transformation of discourse power t

14

Ajmani, Prerna, Garima Saini, Snehlata Sheoran, et al. "DeepSeek-R1 vs. ChatGPT: Assessing the Titans of Next-Generation AI Linguistic Models." Journal of Neonatal Surgery 14, no. 13S (2025): 1226–44. https://doi.org/10.63682/jns.v14i13s.4153.

Abstract:

Artificial intelligence models have rapidly evolved, leading to the development of advanced large language models (LLMs) like DeepSeek R1 and ChatGPT. These models represent significant advancements in natural language processing and generative tasks, each offering unique features and capabilities. This study provides a comprehensive comparison of the two, focusing on their architectures, functionalities, and applications across various domains. It highlights the strengths of DeepSeek R1, such as its versatility in handling multiple types of content, and contrasts them with ChatGPT’s conversat

15

Impito, Pinto Francisco. "Content Analysis of DeepSeek’s Disruption in the Technology Industry." Journal of Technology and Systems 7, no. 2 (2025): 7–21. https://doi.org/10.47941/jts.2573.

Abstract:

Purpose: Artificial Intelligence (AI) has become a powerful catalyst for change in the modern world, transforming industries, economies, and daily life. Through methods like machine learning, deep learning, and neural networks, AI systems excel at processing vast amounts of data, enabling predictive analytics, automation, and sophisticated decision-making. This paper examines feedback given in response to the launch of the DeepSeek AI model, collected from accessible sources such as YouTube. Methodology: This research is based on video content analysis. Our search strategy led to the identific

16

Hussain, Manowar, and Joyanta Ghosh. "The economic impact of artificial intelligence: A case study of DeepSeek R1." International Journal of Research in Finance and Management 8, no. 1 (2025): 80–84. https://doi.org/10.33545/26175754.2025.v8.i1a.430.

17

Sallam, Malik, Kholoud Al-Mahzoum, Mohammed Sallam, and Maad M. Mijwil. "DeepSeek: Is it the End of Generative AI Monopoly or the Mark of the Impending Doomsday?" Mesopotamian Journal of Big Data 2025 (January 30, 2025): 26–34. https://doi.org/10.58496/mjbd/2025/002.

Abstract:

The rise of superintelligent open-source generative AI (genAI) heralds both extraordinary potential and unprecedented risk, exemplified by the rapid emergence of DeepSeek as a global AI innovator. This perspective article examines the dual-edged nature of open source genAI technologies, highlighting their capacity to democratize innovation while exposing critical vulnerabilities. By providing affordable, high-performing, and openly available models like DeepSeek-R1 and DeepSeek-V3, this Chinese AI company has disrupted the proprietary dominance of Western AI giants. These advancements are expe

18

Tamez, Valdemar, and Gbolahan Solomon Osho. "AI Disruption at Scale: DeepSeek’s Open-Source Model and Its Macroeconomic Impact on Markets, Labor, and Global Growth." International Journal of Business Management and Finance Research 8, no. 3 (2025): 1–11. https://doi.org/10.53935/26415313.v8i3.398.

Abstract:

This study explores the macroeconomic implications of DeepSeek, a Chinese artificial intelligence (AI) startup that disrupted the global AI landscape through the release of DeepSeek-R1, an open-source, cost-efficient large language model. Developed with significantly lower training costs and released under an MIT license, DeepSeek’s model rivaled proprietary offerings from Western firms like OpenAI and Google, triggering financial market turbulence and investor realignment. The study analyzes how DeepSeek’s innovation challenged prevailing norms in AI infrastructure, democratized access to adv

19

Wang, Qiang. "DeepSeek Hits Hard: Helping to Revolutionize Higher Education in the Era of Artificial Intelligence." International Journal of Higher Education 14, no. 2 (2025): 26. https://doi.org/10.5430/ijhe.v14n2p26.

Abstract:

DeepSeek is reshaping the higher education model in all aspects by assisting differentiated teaching and personalized learning, promoting flat teaching management processes, creating interactive learning modes with deep participation, creating intelligent paths for boundaryless learning, providing accurate and comprehensive data-driven feedback, and enhancing global education equity and inclusiveness. For better applying artificial intelligence to promote the development of higher education, this study summarizes the main scenarios in which DeepSeek R1 contributes to the sustainable developmen

20

Gupta, Anamika, Sakshi Garg, and Harsh Bamotra. "Evaluation of Prompting Strategies for Cyberbullying Detection Using Various Large Language Models." Advances in Knowledge-Based Systems, Data Science, and Cybersecurity 02, no. 01 (2025): 184–96. https://doi.org/10.54364/cybersecurityjournal.2025.1109.

Abstract:

Sentiment analysis detects toxic language for safer online spaces and helps businesses refine strategies through customer feedback analysis [1, 2]. Advancements in Large Language Models (LLMs) and prompt engineering have introduced novel approaches to sentiment analysis, cyberbullying detection, and toxicity classification. However, several challenges persist, particularly in handling text ambiguity, sarcasm, multilingual contexts, and nuanced emotional comprehension, which limit the ability to achieve accurate and human-aligned results. This study uses the CYBY23 dataset, which contains 112 h

21

Li, Dongjun. "Impact of the DeepSeek-R1 Model Launch on the Value of Chinese AI Concept Companies." SHS Web of Conferences 218 (2025): 01027. https://doi.org/10.1051/shsconf/202521801027.

Abstract:

On January 20, 2025, the open-source large model DeepSeek-R1 was launched in China, marking another major technological breakthrough in generative AI. As a representative of domestic large models, it quickly attracted significant market attention. This study selects 152 A-share listed companies from the artificial intelligence industry chain as the observation sample, and employs an event study method to quantitatively analyze abnormal stock price movements before and after the technology announcement. The results indicate that the release of new technology produced a significant positive impa

22

Wu, Haodong, Shuxin Yao, Huanli Bao, Yishun Guo, Chao Xu, and Jianbing Ma. "ChatGPT-4.0 and DeepSeek-R1 does not yet provide clinically supported answers for knee osteoarthritis." Knee 56 (October 2025): 386–96. https://doi.org/10.1016/j.knee.2025.06.007.

23

Spennemann, Dirk H. R. "The Origins and Veracity of References ‘Cited’ by Generative Artificial Intelligence Applications: Implications for the Quality of Responses." Publications 13, no. 1 (2025): 12. https://doi.org/10.3390/publications13010012.

Abstract:

The public release of ChatGPT in late 2022 has resulted in considerable publicity and has led to widespread discussion of the usefulness and capabilities of generative Artificial intelligence (Ai) language models. Its ability to extract and summarise data from textual sources and present them as human-like contextual responses makes it an eminently suitable tool to answer questions users might ask. Expanding on a previous analysis of the capabilities of ChatGPT3.5, this paper tested what archaeological literature appears to have been included in the training phase of three recent generative Ai

24

Pogorilyy, S., and P. V. Biletsky. "The Development of Methods for Coreference Resolution in Ukrainian Texts Based on Large Language Models." Scientific papers of Donetsk National Technical University. Series: Informatics, Cybernetics and Computer Science 1, no. 40 (2025): 15–27. https://doi.org/10.31474/1996-1588-2025-1-40-15-27.

Abstract:

This work investigates the task of automated Coreference Resolution in Ukrainian texts – a key problem in Natural Language Processing (NLP) necessary for deep semantic analysis, information extraction, machine translation, and other applications. Extended examples of coreference are provided, emphasizing the importance of the task and the specific difficulties of solving it for the Ukrainian language, primarily due to its free word order. The Transformer architecture, which underlies modern Large Language Models (LLMs), is discussed. The key characteristics of several state-of-the-art large l

25

Chen, HuanChang. "Research on Korean Translation of Chinese Resultative Object Sentences Based on DeepSeek R1: AI Translation Quality Evaluation and Optimization." Journal of Chinese Literature 99 (May 30, 2025): 211–35. https://doi.org/10.31985/jcl.99.9.

26

Li, Chenkuan, Reza Saadati, Safoura Rezaei Aderyani, and Min-Jie Luo. "On the Generalized Fractional Convection–Diffusion Equation with an Initial Condition in Rn." Fractal and Fractional 9, no. 6 (2025): 347. https://doi.org/10.3390/fractalfract9060347.

Abstract:

Time-fractional convection–diffusion equations are significant for their ability to model complex transport phenomena that deviate from classical behavior, with numerous applications in anomalous diffusion, memory effects, and nonlocality. This paper derives, for the first time, a unique series solution to a multiple time-fractional convection–diffusion equation with a non-homogenous source term, based on an inverse operator, a newly-constructed space, and the multivariate Mittag–Leffler function. Several illustrative examples are provided to show the power and simplicity of our main theorems

27

Xiao, Jin, Buhong Wang, Ruochen Dong, Zhengyang Zhao, and Bofu Zhao. "SatGuard: Satellite Networks Penetration Testing and Vulnerability Risk Assessment Methods." Aerospace 12, no. 5 (2025): 431. https://doi.org/10.3390/aerospace12050431.

Abstract:

Satellite networks face escalating cybersecurity threats from evolving attack vectors and systemic complexities. This paper proposes SatGuard, a novel framework integrating a three-dimensional penetration testing methodology and a nonlinear risk assessment mechanism tailored for satellite security. To address limitations of conventional tools in handling satellite-specific vulnerabilities, SatGuard employs large language models (LLMs) like GPT-4 and DeepSeek-R1. By leveraging their contextual reasoning and code-generation abilities, SatGuard enables semi-automated vulnerability analysis and ex

28

Tanwir, Tanwir, Khasnur Hidjah, and Dyah Susilowati. "Implementasi Konsultasi Stunting Balita Menggunakan Large Language Models (LLMs)." Reputasi: Jurnal Rekayasa Perangkat Lunak 6, no. 1 (2025): 13–20. https://doi.org/10.31294/reputasi.v6i1.8961.

Abstract:

Stunting in children under five is a critical health problem in Indonesia that calls for technology-based interventions to improve access to nutrition information. This study aims to develop a stunting consultation chatbot based on Large Language Models (LLMs) to provide accurate and easily accessible health recommendations. In the method used, a LLaMA 3 model was fine-tuned on a stunting-specific Q&A dataset of 7,642 entries and then evaluated with ROUGE metrics to measure the semantic fit of its responses. The results show that the Stunting model achieved a ROUGE-1 score (72

29

Shao, Shiyu. "Research on Story Text Generation Based on Transformer Model." Applied and Computational Engineering 175, no. 1 (2025): 8–17. https://doi.org/10.54254/2755-2721/2025.ast24685.

Abstract:

A transformer model was trained to generate story text because certain parts or endings of the original story were unsatisfactory. This study uses model training to obtain alternative story paths. The main purpose is to study two approaches: how to fine-tune a pre-trained model to achieve the desired effect, and how to build a model trained from scratch to achieve the desired effect. DeepSeek R1 is used as a control group to evaluate the generation quality. According to the results, the pre-trained model performs better on smaller datasets,

30

Roumeliotis, Konstantinos I., Nikolaos D. Tselikas, and Dimitrios K. Nasiopoulos. "Think Before You Classify: The Rise of Reasoning Large Language Models for Consumer Complaint Detection and Classification." Electronics 14, no. 6 (2025): 1070. https://doi.org/10.3390/electronics14061070.

Abstract:

Large language models (LLMs) have demonstrated remarkable capabilities in various natural language processing (NLP) tasks, but their effectiveness in real-world consumer complaint classification without fine-tuning remains uncertain. Zero-shot classification offers a promising solution by enabling models to categorize consumer complaints without prior exposure to labeled training data, making it valuable for handling emerging issues and dynamic complaint categories in finance. However, this task is particularly challenging, as financial complaint categories often overlap, requiring a deep unde

31

Young, Richard J., Alice M. Matthews, and Brach Poston. "Benchmarking Multiple Large Language Models for Automated Clinical Trial Data Extraction in Aging Research." Algorithms 18, no. 5 (2025): 296. https://doi.org/10.3390/a18050296.

Abstract:

Large-language models (LLMs) show promise for automating evidence synthesis, yet head-to-head evaluations remain scarce. We benchmarked five state-of-the-art LLMs—openai/o1-mini, x-ai/grok-2-1212, meta-llama/Llama-3.3-70B-Instruct, google/Gemini-Flash-1.5-8B, and deepseek/DeepSeek-R1-70B-Distill—on extracting protocol details from transcranial direct-current stimulation (tDCS) trials enrolling older adults. A multi-LLM ensemble pipeline ingested ClinicalTrials.gov records, applied a structured JSON schema, and generated comparable outputs from unstructured text. The pipeline retrieved 83 aging

32

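魏榕. "人工智能推理模型应用于法律文本汉英翻译的效能研究" [A Study of the Effectiveness of AI Reasoning Models in Chinese-English Translation of Legal Texts]. 人文与社会科学学刊 1, no. 6 (2025): 65–71. https://doi.org/10.70693/rwsk.v1i6.682.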

Abstract:

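As artificial intelligence technology continues to evolve, AI reasoning models are gradually being introduced into legal translation practice. Systematic empirical research in this area remains scarce, however, and quantitative and comparative analysis of model output quality in particular needs further work. Using the General Provisions of the Criminal Law of the People's Republic of China as the sample text, this study selects four mainstream reasoning models (ChatGPT o1, DeepSeek R1, Grok 3 with Think, and Gemini 2.0 Flash Thinking Experimental), scores the translations they generate with BLEU, and uses Python and SPSS for statistical testing and multidimensional comparison. The data show significant differences among the four reasoning models in legal text translation: Grok 3 and Gemini 2.0 perform comparatively better on mean BLEU score, frequency of high-quality translations, and low-score rate, while ChatGPT o1 and DeepSeek R1 have considerable room for improvement in terminology handling, syntactic structure, and stability. Further analysis shows that reasoning models still face accuracy challenges when dealing with legal terminology, complex sentence structures, and legal logic. Although AI reasoning models have the potential to empower legal translation, they still cannot replace human translators in deep understanding and expression within specialized registers. To improve translation quality, the study recommends strengthening the construction of local legal corpora, optimizing mechanisms for terminological consistency, introducing post-translation human proofreading, and promoting the development of legal translation models with Chinese characteristics.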

33

Liu, Yicen. "Research on AI-Powered Essay Writing Assistant with Contextual Revision System." Applied and Computational Engineering 142, no. 1 (2025): 109–15. https://doi.org/10.54254/2755-2721/2025.kl22313.

Abstract:

This report presents an AI-powered essay writing assistant that combines three advanced modules: a fine-tuned DeepSeek-R1-Distill-Llama-8B language model for contextual revisions, retrieval-augmented generation (RAG) for evidence-based suggestions, and a DeBERTaV3-based ordinal regression model for essay scoring. By combining the two models, the system demonstrates high grammatical accuracy in revisions and achieves a satisfactory value of quadratic weighted Kappa in scoring while providing detailed suggestions for further improvement, significantly outperforming baseline approaches. Through s

34

Mo, Guohui, and Poonpilas Asavisanu. "Research Progress on the Application of Generative AI in Nursing Education in China: Potential and Challenges." EUrASEANs: journal on global socio-economic dynamics, no. 3(52) (May 19, 2025): 520–31. https://doi.org/10.35678/2539-5645.3(52).2025.520-531.

Abstract:

With the rapid advancement of artificial intelligence, generative AI (such as ChatGPT, DeepSeek R1, etc.) has been increasingly applied in the field of medical and nursing education. This paper looks at past research on how AI has been used in medical education over the last ten years and explores how generative AI is currently used in nursing education, particularly in areas like teaching support, virtual clinical simulations, and personalized learning feedback. The findings reveal that generative AI offers notable advantages in enriching learning resources, enhancing student efficiency, and

35

Rodrigues, Gabriel Arquelau Pimenta, André Luiz Marques Serrano, Guilherme Dantas Bispo, Geraldo Pereira Rocha Filho, Vinícius Pereira Gonçalves, and Rodolfo Ipolito Meneguette. "IHRAS: Automated Medical Report Generation from Chest X-Rays via Classification, Segmentation, and LLMs." Bioengineering 12, no. 8 (2025): 795. https://doi.org/10.3390/bioengineering12080795.

Abstract:

The growing demand for accurate and efficient Chest X-Ray (CXR) interpretation has prompted the development of AI-driven systems to alleviate radiologist workload and reduce diagnostic variability. This paper introduces the Intelligent Humanized Radiology Analysis System (IHRAS), a modular framework that automates the end-to-end process of CXR analysis and report generation. IHRAS integrates four core components: (i) deep convolutional neural networks for multi-label classification of 14 thoracic conditions; (ii) Grad-CAM for spatial visualization of pathologies; (iii) SAR-Net for anatomical s

36

Dias, Rachael, and Kayvan Karim. "Translation of User Crochet Patterns to CrochetPARADE Syntax Using Large Language Models." Proceedings of the AAAI Symposium Series 6, no. 1 (2025): 200–208. https://doi.org/10.1609/aaaiss.v6i1.36054.

Abstract:

Crochet, with its rich history and popularity, provides a creative and therapeutic outlet for millions across the globe, from many walks of life. However, crochet pattern creation and modification can be challenging for novice users, due to the spatial reasoning and structural understanding of stitches required. CrochetPARADE is a tool created to ease this process through pattern visualisation, but it uses a syntax that differs from standard notation and may not be intuitive to the average crocheter. This study explores the use of Large Language Models (LLMs) to translate user-generated croche

37

Rasool, Abdur, Muhammad Irfan Shahzad, Hafsa Aslam, Vincent Chan, and Muhammad Ali Arshad. "Emotion-Aware Embedding Fusion in Large Language Models (Flan-T5, Llama 2, DeepSeek-R1, and ChatGPT 4) for Intelligent Response Generation." AI 6, no. 3 (2025): 56. https://doi.org/10.3390/ai6030056.

Abstract:

Empathetic and coherent responses are critical in automated chatbot-facilitated psychotherapy. This study addresses the challenge of enhancing the emotional and contextual understanding of large language models (LLMs) in psychiatric applications. We introduce Emotion-Aware Embedding Fusion, a novel framework integrating hierarchical fusion and attention mechanisms to prioritize semantic and emotional features in therapy transcripts. Our approach combines multiple emotion lexicons, including NRC Emotion Lexicon, VADER, WordNet, and SentiWordNet, with state-of-the-art LLMs such as Flan-T5, Llama

38

Altermatt, Fernando R., Andres Neyem, Nicolás I. Sumonte, Ignacio Villagrán, Marcelo Mendoza, and Hector J. Lacassie. "Evaluating the Performance of Large Language Models on the CONACEM Anesthesiology Certification Exam: A Comparison with Human Participants." Applied Sciences 15, no. 11 (2025): 6245. https://doi.org/10.3390/app15116245.

Abstract:

Large Language Models (LLMs) have demonstrated strong performance on English-language medical exams, but their effectiveness in non-English, high-stakes environments is less understood. This study benchmarks nine LLMs against human examinees on the Chilean Anesthesiology Certification Exam (CONACEM), a Spanish-language board examination. A curated set of 63 multiple-choice questions was used, categorized by Bloom’s taxonomy into four cognitive levels. Model responses were assessed using Item Response Theory and Classical Test Theory, complemented by additional error analysis, categorizing erro

39

Shadbahr, Tolou, Antti S. Rannikko, Tuomas Mirtti, and Teemu D. Laajala. "Abstract B018: Practical benchmarking of large language models for structuring synthetic prostate cancer histopathology statements in English and Finnish." Clinical Cancer Research 31, no. 13_Supplement (2025): B018. https://doi.org/10.1158/1557-3265.aimachine-b018.

Abstract:

Large Language Models (LLMs) are increasingly applied to clinical oncology tasks, such as structuring free-text format histopathology statements. Given the rapid pace of LLM development, guidance is needed for optimal model, parameter, and prompt design choices, especially in minority languages. A total of 100 novel synthetic histopathological prostate cancer (PCa) statements were manually generated to mimic real-life data of Helsinki University Hospital. First, 25 statements were generated in Finnish and manually translated to English. Manually censored versions were then generated,

40

Wu, Siyuan, and Hong Li. "Evaluation of Large Language Models in the Full Process of Battery Research and Development and Inorganic Solid Electrolyte Materials Database." Acta Physica Sinica 74, no. 16 (2025): 0. https://doi.org/10.7498/aps.74.20250572.

Abstract:

The emergence of large language models has significantly advanced scientific research. Representative models such as ChatGPT and DeepSeek R1 have brought notable transformations to the paradigm of scientific research. While these models are general-purpose, they have demonstrated strong generalization capabilities in the field of batteries, particularly in solid-state battery research. In this study, we systematically screened 5,309,268 articles from key journals up to 2024, accurately extracting 124,021 relevant battery-related papers. Additionally, we comprehensively searched through 17,559,7

41

Uldin, Hasaam, Sonal Saran, Girish Gandikota, et al. "A comparison of performance of DeepSeek-R1 model-generated responses to musculoskeletal radiology queries against ChatGPT-4 and ChatGPT-4o – A feasibility study." Clinical Imaging 123 (July 2025): 110506. https://doi.org/10.1016/j.clinimag.2025.110506.

42

Xia, Chunqiu Steven, Yinlin Deng, Soren Dunn, and Lingming Zhang. "Demystifying LLM-Based Software Engineering Agents." Proceedings of the ACM on Software Engineering 2, FSE (2025): 801–24. https://doi.org/10.1145/3715754.

Abstract:

Recent advancements in large language models (LLMs) have significantly advanced the automation of software development tasks, including code synthesis, program repair, and test generation. More recently, researchers and industry practitioners have developed various autonomous LLM agents to perform end-to-end software development tasks. These agents are equipped with the ability to use tools, run commands, observe feedback from the environment, and plan for future actions. However, the complexity of these agent-based approaches, together with the limited abilities of current LLMs, raises the fo

43

Zhong, Wei, YiFan Liu, Yan Liu, et al. "Performance of ChatGPT-4o and Four Open-Source Large Language Models in Generating Diagnoses Based on China’s Rare Disease Catalog: Comparative Study." Journal of Medical Internet Research 27 (June 18, 2025): e69929. https://doi.org/10.2196/69929.

Abstract:

Background: Diagnosing rare diseases remains challenging due to their inherent complexity and limited physician knowledge. Large language models (LLMs) offer new potential to enhance diagnostic workflows. Objective: This study aimed to evaluate the diagnostic accuracy of ChatGPT-4o and 4 open-source LLMs (qwen2.5:7b, Llama3.1:8b, qwen2.5:72b, and Llama3.1:70b) for rare diseases, assess the language effect on diagnostic performance, and explore retrieval augmented generation (RAG) and chain-of-thought (CoT) reasoning. Methods: We extracted clinical manifestations of 121 rare diseases fr

44

Peng, Xiaohong, Hongbin Jiang, Jing Chen, Mingxin Liu, and Xiao Chen. "Research and Construction of Knowledge Map of Golden Pomfret Based on LA-CANER Model." Journal of Marine Science and Engineering 13, no. 3 (2025): 400. https://doi.org/10.3390/jmse13030400.

Abstract:

To address the issues of fragmented species information, low knowledge extraction efficiency, and insufficient utilization in the aquaculture domain, the main objective of this study is to construct the first knowledge graph for the Golden Pomfret aquaculture field and optimize the named entity recognition (NER) methods used in the construction process. The dataset contains challenges such as long text processing, strong local context dependencies, and entity sample imbalance, which result in low information extraction efficiency, recognition errors or omissions, and weak model generalization.

45

McCoy, Thomas H., and Roy H. Perlis. "Reasoning language models for more transparent prediction of suicide risk." BMJ Mental Health 28, no. 1 (2025): e301654. https://doi.org/10.1136/bmjment-2025-301654.

Abstract:

Background: We previously demonstrated that a large language model could estimate suicide risk using hospital discharge notes. Objective: With the emergence of reasoning models that can be run on consumer-grade hardware, we investigated whether these models can approximate the performance of much larger and costlier models. Methods: From 458 053 adults hospitalised at one of two academic medical centres between 4 January 2005 and 2 January 2014, we identified 1995 who died by suicide or accident, and matched them with 5 control individuals. We used Llama-DeepSeek-R1 8B to generate predictions of risk.

46

Yan, Jin, and Yuling Huang. "MambaLLM: Integrating Macro-Index and Micro-Stock Data for Enhanced Stock Price Prediction." Mathematics 13, no. 10 (2025): 1599. https://doi.org/10.3390/math13101599.

Abstract:

Accurate stock price prediction requires the integration of heterogeneous data streams, yet conventional techniques struggle to simultaneously leverage fine-grained micro-stock features and broader macroeconomic indicators. To address this gap, we propose MambaLLM, a novel framework that fuses macro-index and micro-stock inputs through the synergistic use of state-space models (SSMs) and large language models (LLMs). Our two-branch architecture comprises (i) Micro-Stock Encoder, a Mamba-based temporal encoder for processing granular stock-level data (prices, volumes, and technical indicators),

47

Wasif, Mohammad. "AI-Teaching Chatbot: Personalized Learning Platform with an AI Teaching Bot." International Journal of Scientific Research in Engineering and Management 09, no. 05 (2025): 1–9. https://doi.org/10.55041/ijsrem48384.

Abstract:

Over the past few years, educators have faced mounting pressure to deliver personalized support to ever-larger and more diverse student cohorts—a challenge that traditional methods struggle to meet effectively. AI-powered teaching chatbots have emerged as a promising solution, offering students instant, tailored explanations and practice exercises around the clock. In a recent pilot involving 100 varied academic queries, our chatbot achieved 92% factual accuracy, mirroring expert-verified answers in most cases. Meanwhile, average response times consistently fell below five seconds, ensuring le

48

Jonnala, Sridhar, Basavaraj Swamy, and Nisha Mary Thomas. "Geopolitical Bias in Sovereign Large Language Models: A Comparative Mixed-Methods Study." Journal of Research, Innovation and Technologies (JoRIT) 4, no. 16 (2025): 173. https://doi.org/10.57017/jorit.v4.2(8).04.

Abstract:

Sovereign large language models (LLMs), emerging as strategic assets in global information ecosystems, represent advanced AI systems developed under distinct national governance regimes. This study examines how model origin and governance context influence AI-generated narratives on international territorial disputes. The study compares outputs from three prominent sovereign LLMs, OpenAI’s GPT-4o (United States), DeepSeek-R1 (China), and Mistral (European Union), across 12 high-profile territorial conflicts. Statistically significant differences in each model's sentiment distribution and geopo

49

Wang, Zhuorui, Xiaoyu Zheng, Fanyun Meng, Kang Wang, Xincheng Wu, and Dexin Yu. "Exploring the Joint Influence of Built Environment Factors on Urban Rail Transit Peak-Hour Ridership Using DeepSeek." Buildings 15, no. 10 (2025): 1744. https://doi.org/10.3390/buildings15101744.

Abstract:

Modern cities are facing increasing challenges such as traffic congestion, high energy consumption, and poor air quality, making rail transit systems, known for their high capacity and low emissions, essential components of sustainable urban infrastructure. While numerous studies have examined how the built environment impacts transit ridership, the complex interactions among these factors warrant further investigation. Recent advancements in the reasoning capabilities of large language models (LLMs) offer a robust methodological foundation for analyzing the complex joint influence of multiple

50

Bessa, Renato Freitas, Adonias Caetano de Oliveira, Rafael Freitas Bessa, et al. "Performance Comparison of Large Language Models on Brazil’s Medical Revalidation Exam for Foreign-Trained Graduates." Applied Sciences 15, no. 13 (2025): 7134. https://doi.org/10.3390/app15137134.

Abstract:

This study aimed to compare the performance of various Large Language Models (LLMs) in answering multiple-choice questions from the last six editions (2017 to 2024) of the Revalida exam. The evaluation focused on models capable of processing content in Brazilian Portuguese (PT-BR), including open-source models, namely LLaMA 3.1 (8B parameters), Qwen 2.5 (7B parameters), and their reasoning-oriented distilled variants based on the DeepSeek-R1 architecture, as well as open-access commercial models such as GPT-3.5, GPT-4o, and Gemini. After evaluating the models’ accuracy against the official answer
