OpenMedLM: prompt engineering can out-perform fine-tuning in medical question-answering with open-source large language models.
Sci Rep 2024-06-19
Data Set and Benchmark (MedGPTEval) to Evaluate Responses From Large Language Models in Medicine: Evaluation Development and Validation.
JMIR Med Inform 2024-07-02
MedExpQA: Multilingual benchmarking of Large Language Models for Medical Question Answering.
Artif Intell Med 2024-08-09
MedFrenchmark, a Small Set for Benchmarking Generative LLMs in Medical French.
Stud Health Technol Inform 2024-08-23
Performance of Publicly Available Large Language Models on Internal Medicine Board-style Questions.
PLOS Digit Health 2024-09-17