Large language models in methodological quality evaluation of radiomics research based on METRICS: ChatGPT vs NotebookLM vs radiologist.
Eur J Radiol 2025-02-12
Comparing large language models and human annotators in latent content analysis of sentiment, political leaning, emotional intensity and sarcasm.
Sci Rep 2025-04-03
Evaluating Large Language Models for Enhancing Radiology Specialty Examination: A Comparative Study with Human Performance.
Acad Radiol 2025-05-28
Evaluation of a large language model (ChatGPT) versus human researchers in assessing risk-of-bias and community engagement levels: a systematic review use-case analysis.
Eur J Public Health 2025-06-10
Do Language Model Agents Align with Humans in Rating Visualizations? An Empirical Study.
IEEE Comput Graph Appl 2025-07-09