Evaluating the o1 reasoning large language model for cognitive bias: a vignette study.
Crit Care 2025-08-22
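The o1 reasoning model is a new-generation AI system. The study found that it showed less cognitive bias in clinical decision-making than GPT-4 and human physicians, and that its performance was more consistent. It showed no bias at all in 7 of 10 cases, but clear bias still appeared under certain prompts. Although o1 helps reduce bias and inconsistency, it cannot eliminate them entirely, and further research is needed on its limitations as a clinical decision aid.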
In-Context Learning with Large Language Models: A Simple and Effective Approach to Improve Radiology Report Labeling.
Healthc Inform Res 2025-08-21
Public Perceptions and Barriers to Tuberculosis Treatment in Korea: A Large Language Model-Based Analysis of Naver Knowledge-iN Data from 2002 to 2024.
Healthc Inform Res 2025-08-21
Development and Evaluation of a Retrieval-Augmented Generation-Based Electronic Medical Record Chatbot System.
Healthc Inform Res 2025-08-21
Transforming Cancer Nanotechnology Data Analysis and User Experience. Part II: Providing Future Solutions Using Large Language Models.
Wiley Interdiscip Rev Nanomed Nanobiotechnol 2025-08-21
Leveraging Retrieval-Augmented Large Language Models for Dietary Recommendations With Traditional Chinese Medicine's Medicine Food Homology: Algorithm Development and Validation.
JMIR Med Inform 2025-08-21
A comparative evaluation of publicly available large language models in the assessment of CTG traces according to the FIGO criteria.
Arch Gynecol Obstet 2025-08-21
Trends in the Distribution of P Values in Epidemiology Journals: A Statistical, P-Curve, and Simulation Study.
Am J Epidemiol 2025-08-21