Investigating the Impact of Prompt Engineering on the Performance of Large Language Models for Standardizing Obstetric Diagnosis Text: Comparative Study.
JMIR Form Res 2024-02-25
Hallucination Rates and Reference Accuracy of ChatGPT and Bard for Systematic Reviews: Comparative Analysis.
J Med Internet Res 2024-05-22
Large Language Models, scientific knowledge and factuality: A framework to streamline human expert evaluation.
J Biomed Inform 2024-09-14
Empowering large language models for automated clinical assessment with generation-augmented retrieval and hierarchical chain-of-thought.
Artif Intell Med 2025-02-20
An active inference strategy for prompting reliable responses from large language models in medical practice.
NPJ Digit Med 2025-02-22