MedExpQA: Multilingual benchmarking of Large Language Models for Medical Question Answering.
Artif Intell Med 2024-08-09
Large Language Model-Based Evaluation of Medical Question Answering Systems: Algorithm Development and Case Study.
Stud Health Technol Inform 2024-04-29
Large language models leverage external knowledge to extend clinical insight beyond language boundaries.
J Am Med Inform Assoc 2024-04-29
Data Set and Benchmark (MedGPTEval) to Evaluate Responses From Large Language Models in Medicine: Evaluation Development and Validation.
JMIR Med Inform 2024-07-02
MedFrenchmark, a Small Set for Benchmarking Generative LLMs in Medical French.
Stud Health Technol Inform 2024-08-23