Evaluating and leveraging large language models in clinical pharmacology and therapeutics assessment: From exam takers to exam shapers.
Br J Clin Pharmacol 2025-06-10
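Recent research finds that large language models such as ChatGPT-4 Omni perform on par with, or even better than, medical students on clinical pharmacology and therapeutics (CPT) and European prescribing exams, particularly in knowledge and prescribing skills. These models can also flag ambiguously worded questions, which makes them useful not only as teaching tools but also for improving exam quality.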
Related articles on this site
Performance of Language Models on the Family Medicine In-Training Exam.
Fam Med 2024-08-29
Aligning Large Language Models with Humans: A Comprehensive Survey of ChatGPT's Aptitude in Pharmacology.
Drugs 2024-12-20
Evaluating accuracy and reproducibility of large language model performance on critical care assessments in pharmacy education.
Front Artif Intell 2025-01-24
Comparison of Large Language Models' Performance on 600 Nuclear Medicine Technology Board Examination-Style Questions.
J Nucl Med Technol 2025-05-09
Performance of large language models on Thailand's national medical licensing examination: a cross-sectional study.
J Educ Eval Health Prof 2025-05-12
Comparison of a generative large language model to pharmacy student performance on therapeutics examinations.
Curr Pharm Teach Learn 2025-05-23
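ChatGPT-3.5 performed markedly worse than pharmacy students on therapeutics exams, scoring only 53% against a student average of 82%. It struggled most with application and case-analysis questions and did better only on recall questions, which points to the limits generative AI still has on complex medical education tasks.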
Evaluating Large Language Models for Enhancing Radiology Specialty Examination: A Comparative Study with Human Performance.
Acad Radiol 2025-05-28
Evaluating Large Language Models on American Board of Anesthesiology-style Anesthesiology Questions: Accuracy, Domain Consistency, and Clinical Implications.
J Cardiothorac Vasc Anesth 2025-06-15