DeepSeek-R1 and GPT-4 are comparable in a complex diagnostic challenge: a historical control study.
Int J Surg 2025-06-12
Preliminary analysis of the impact of lab results on large language model generated differential diagnoses.
NPJ Digit Med 2025-03-19
Comparative benchmarking of the DeepSeek large language model on medical tasks and clinical reasoning.
Nat Med 2025-04-23
A comparison of performance of DeepSeek-R1 model-generated responses to musculoskeletal radiology queries against ChatGPT-4 and ChatGPT-4o - A feasibility study.
Clin Imaging 2025-05-17
A large language model improves clinicians' diagnostic performance in complex critical illness cases.
Crit Care 2025-06-06
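This study found that the DeepSeek-R1 AI effectively helped intensive care unit residents diagnose complex critical illness, raising their diagnostic accuracy from 27% to 58%; the AI's own accuracy was 60%. With AI assistance, residents were not only more accurate but also faster, and the quality of their differential diagnoses improved. Overall, this kind of AI has good potential to become an important aid for ICU physicians.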
Performance analysis of large language models in multi-disease detection from chest computed tomography reports: a comparative study: Experimental Research.
Int J Surg 2025-06-11
Comparative analysis of large language models in clinical diagnosis: performance evaluation across common and complex medical cases.
JAMIA Open 2025-06-13
GPT-4 vs. Radiologists: who advances mediastinal tumor classification better across report quality levels? A cohort study.
Int J Surg 2025-08-11