Ophthalmological Question Answering and Reasoning Using OpenAI o1 vs Other Large Language Models.
JAMA Ophthalmol 2025-07-31
Large language models for extraction of OPS-codes from operative reports in meningioma surgery.
Acta Neurochir (Wien) 2025-07-31
Information Extraction and Summarization for Neurovascular Consultations with GPT-4o: A Clinical Case Study.
Clin Neuroradiol 2025-07-31
Are clinical improvements in large language models a reality? Longitudinal comparisons of ChatGPT models and DeepSeek-R1 for psychiatric assessments and interventions.
Int J Soc Psychiatry 2025-07-31
Chatbots in urology: accuracy, calibration, and comprehensibility; is DeepSeek taking over the throne?
BJU Int 2025-07-31
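We tested five mainstream chatbots and found that ChatGPT-4o, DeepSeek-R1, and Grok-2 had the highest accuracy (80%), with ChatGPT-4o the best calibrated. DeepSeek-R1 produced the most readable content, while residents found Claude 3.5 the easiest to understand. Overall, each AI has its own strengths and weaknesses, and further optimization is needed before practical use in urology.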
Leveraging MDS2 and SBOM data for LLM-assisted vulnerability analysis of medical devices.
Comput Struct Biotechnol J 2025-07-31
Correction: Diagnostic efficacy of large language models in the pediatric emergency department: a pilot study.
Front Digit Health 2025-07-31