The Advanced Reasoning Capabilities of Large Language Models for Detecting Contraindicated Options in Medical Exams.
JMIR Med Inform 2025-05-12
Evaluating the Effectiveness of advanced large language models in medical Knowledge: A Comparative study using Japanese national medical examination.
Int J Med Inform 2024-10-29
Performance Evaluation and Implications of Large Language Models in Radiology Board Exams: Prospective Comparative Analysis.
JMIR Med Educ 2025-01-17
An Evaluation of the Performance of OpenAI-o1 and GPT-4o in the Japanese National Examination for Physical Therapists.
Cureus 2025-02-06
Evaluating the performance of GPT-3.5, GPT-4, and GPT-4o in the Chinese National Medical Licensing Examination.
Sci Rep 2025-04-24
Evaluating the Performance of Reasoning Large Language Models on Japanese Radiology Board Examination Questions.
Acad Radiol 2025-05-18
Evaluating Large Language Models for Enhancing Radiology Specialty Examination: A Comparative Study with Human Performance.
Acad Radiol 2025-05-28
Advancing medical AI: GPT-4 and GPT-4o surpass GPT-3.5 in Taiwanese medical licensing exams.
PLoS One 2025-06-04
A recent evaluation on the performance of LLMs on radiation oncology physics using questions of randomly shuffled options.
Front Oncol 2025-06-09