Leaderboard

Results for each exam type and stage, and for the clinical case consultations, are shown below.

Submissions of model test results to CMB are always welcome. For submission guidelines, please refer to the link.


CMB-Exam

The accuracy listed for each model is the best result across four generation strategies: zero-shot and few-shot, each with and without chain-of-thought (CoT) prompting.
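The selection of the best result across strategies can be sketched as follows. This is a minimal illustration, not the CMB evaluation code: the strategy keys, the `best_accuracy` helper, and the example numbers are all hypothetical, and whether the selection is made per subcategory or per model is an assumption not stated here.

```python
# Hypothetical strategy labels; the actual CMB pipeline may name them differently.
STRATEGIES = ("zero-shot", "zero-shot-cot", "few-shot", "few-shot-cot")


def best_accuracy(acc_by_strategy):
    """Return (strategy, accuracy) for the highest-scoring generation strategy."""
    missing = [s for s in STRATEGIES if s not in acc_by_strategy]
    if missing:
        raise ValueError(f"missing strategies: {missing}")
    best = max(STRATEGIES, key=lambda s: acc_by_strategy[s])
    return best, acc_by_strategy[best]


# Illustrative accuracies for one model on one exam; NOT actual CMB results.
example = {
    "zero-shot": 61.25,
    "zero-shot-cot": 58.50,
    "few-shot": 63.00,
    "few-shot-cot": 60.75,
}
strategy, accuracy = best_accuracy(example)
print(strategy, accuracy)
```

Taking the maximum per model means different models may be represented by different strategies in the same column.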

For detailed information on generation and evaluation, please refer to the link.

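Each row lists, left to right: Model; Institution; Avg. (overall); the six category averages (Physician, Nursing, Pharmacist, Medical Technician, Professional Knowledge, Graduate Entrance); then the accuracy for each subcategory in the order below. TCM stands for traditional Chinese medicine.
Physician: Resident Physician, Licensed Assistant Physician, Licensed Physician, Intermediate Professional Title, Senior Professional Title
Nursing: Practicing Nurse, Licensed Practical Nurse, Charge Nurse, Advanced Practice Nurse
Pharmacist: Licensed Pharmacist (Western Medicine), Licensed Pharmacist (TCM), Junior Pharmacist Assistant, Junior Pharmacist, Junior TCM Pharmacist Assistant, Junior TCM Pharmacist, Pharmacist-in-Charge, TCM Pharmacist-in-Charge
Medical Technician: Medical Technician, Medical Technologist, Supervising Technologist
Professional Knowledge: Basic Medicine, Clinical Medicine, Preventive Medicine and Public Health, TCM and Chinese Materia Medica, Nursing
Graduate Entrance: Political Science, Western Medicine Comprehensive, TCM Comprehensive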
墨融AI 苏州墨融科技有限公司 87.21 86.05 92.12 86.46 87.33 83.56 87.75 87.00 89.50 89.75 83.50 80.50 96.00 95.50 92.25 84.75 80.75 81.00 92.25 90.50 82.25 86.75 88.50 89.75 89.75 87.25 85.00 84.25 83.25 85.75 81.00 83.00 96.25 89.00 82.75
RobotGPT-30B 达闼机器人(成都), 解放军总医院医学创新研究部 86.80 86.45 91.69 86.09 89.08 82.69 84.81 86.50 92.00 91.50 83.50 78.75 96.00 94.75 94.25 81.75 78.75 81.75 92.75 90.25 81.50 87.50 86.75 89.50 92.25 87.00 88.00 84.00 79.50 86.75 80.50 81.50 96.75 86.75 74.25
中国电子云医疗大模型 深圳陆兮科技有限公司, 中国电子云-中国信创云 83.09 81.95 89.13 84.38 84.08 77.63 81.37 90.75 90.00 82.00 72.75 74.25 95.25 94.00 88.75 78.50 75.25 81.75 92.00 88.00 84.00 83.00 84.00 87.00 83.00 85.75 83.50 79.75 76.75 82.75 71.25 78.75 92.50 84.50 69.75
砭石 智慧眼科技股份有限公司 82.56 81.85 88.44 81.97 78.33 78.13 84.69 82.75 85.50 85.25 80.50 75.25 94.50 91.25 89.25 78.75 75.75 78.25 87.25 84.50 79.25 83.25 81.50 86.00 80.25 76.25 78.50 73.75 77.25 83.25 78.25 81.25 96.25 83.00 78.25
jianpeiGPT 健培科技 81.78 81.70 86.38 81.72 81.42 77.44 82.00 80.50 87.75 88.50 78.75 73.00 92.00 92.00 87.50 74.00 74.00 79.25 89.00 87.75 76.50 82.25 83.00 82.00 81.50 80.00 82.75 78.50 75.50 83.50 72.25 77.00 94.00 80.50 76.50
HuatuoGPTII-34B (华佗) CUHKSZ-NLP 76.80 75.65 82.31 76.81 76.17 74.38 75.56 73.75 77.75 82.50 75.50 68.75 87.75 87.50 81.50 72.50 68.25 73.50 85.25 82.25 73.25 77.00 74.75 80.25 77.25 74.00 77.25 72.00 74.25 78.75 72.50 73.25 79.75 77.00 72.25
Qwen-72B-Chat Qwen 74.38 78.55 83.56 79.78 77.92 68.25 58.19 78.00 86.00 88.00 75.00 65.75 94.25 92.50 87.75 59.75 69.50 73.75 88.25 87.75 77.00 79.50 78.25 84.25 77.50 79.00 77.25 66.25 65.00 77.50 64.25 70.25 50.00 65.50 47.00
CollectiveSFT CAS·SIAT-NLP 72.45 74.20 83.38 74.09 73.42 67.69 61.88 70.50 82.25 84.50 70.25 63.50 91.75 90.25 85.00 66.50 64.25 70.00 85.25 81.50 67.00 75.25 73.75 75.75 72.25 73.00 75.00 67.25 66.50 74.25 62.75 69.00 58.50 65.50 54.50
BrainAuGPT 脑动极光 71.12 70.30 81.31 70.31 66.50 68.50 69.81 70.75 74.25 81.00 64.75 60.75 87.50 87.25 82.50 68.00 61.50 68.00 79.00 77.00 65.00 68.75 71.75 71.50 66.75 64.50 68.25 69.50 67.00 76.75 60.75 69.00 91.00 65.50 53.75
Yi-34B-Chat Yi 69.17 71.10 77.56 73.16 73.67 66.56 52.94 68.00 78.75 78.75 69.50 60.50 87.00 84.00 82.25 57.00 59.00 66.00 84.00 80.75 69.75 71.50 75.00 79.25 73.25 74.25 73.50 66.00 64.75 73.00 62.50 63.75 45.25 56.50 46.25
HuatuoGPTII-13B (华佗) CUHKSZ-NLP 67.85 62.75 66.13 64.91 62.00 61.94 53.69 66.25 71.00 74.50 65.75 61.75 71.50 71.75 65.75 55.50 61.50 63.25 70.00 64.75 59.75 68.00 62.50 69.50 63.75 60.75 61.50 58.75 62.00 64.00 63.00 59.75 43.00 55.75 56.25
Yi-6B-Chat Yi 65.87 67.25 76.38 68.50 67.83 61.75 53.50 64.50 74.00 78.75 63.25 55.75 86.50 86.50 80.00 52.50 56.75 65.75 78.25 74.75 63.50 71.00 68.75 79.25 65.50 68.50 69.50 62.75 59.00 68.50 56.75 64.00 45.00 58.25 46.75
ShuKunGPT 数坤科技 64.44 68.65 71.44 70.78 61.92 62.81 51.06 63.00 76.50 81.00 64.50 58.25 77.50 78.50 74.75 55.00 73.75 70.50 76.00 73.75 66.25 65.50 67.25 73.25 59.25 61.25 65.25 58.50 62.25 68.75 61.75 56.25 57.00 49.75 41.25
AntGLM-Med-10B AntGroup 64.09 66.85 71.75 69.44 55.92 61.19 59.38 63.25 71.00 81.75 62.00 56.25 82.50 74.75 70.50 59.25 70.50 63.50 76.50 75.75 63.75 71.50 65.50 68.50 53.00 53.00 61.75 58.50 61.50 63.50 61.25 60.25 57.75 64.50 55.00
GPT-4 OpenAI 59.46 59.90 69.31 52.19 61.50 59.69 54.19 59.75 58.50 64.50 60.75 56.00 77.50 72.50 68.75 58.50 54.75 47.00 60.00 63.25 39.50 47.25 59.50 46.25 58.50 60.75 65.25 63.00 65.50 68.25 42.00 57.00 61.00 61.25 37.50
HuatuoGPTII-7B (华佗) CUHKSZ-NLP 59.00 64.55 63.75 61.06 56.25 56.63 51.81 64.75 67.00 70.75 64.75 55.50 70.25 68.50 64.75 51.50 53.75 56.75 66.25 62.25 60.00 63.50 59.50 66.50 55.75 53.75 59.25 55.00 56.50 59.50 55.50 57.75 43.75 53.75 52.00
Qwen-14B-Chat Qwen 57.64 60.40 65.63 60.94 58.83 54.50 45.56 58.25 65.00 69.00 60.50 49.25 73.00 74.00 68.25 47.25 61.00 63.00 71.00 66.75 51.25 57.00 58.25 59.25 58.25 57.75 60.50 51.50 54.25 65.25 46.00 51.75 43.00 50.00 37.50
Deepseek-llm-67B-Chat Deepseek-llm 51.99 52.90 61.50 54.28 51.42 51.19 40.62 49.75 57.25 60.25 54.50 42.75 71.25 67.00 65.50 42.25 48.75 50.50 63.25 56.25 49.75 57.25 52.75 55.75 51.25 52.75 50.25 47.00 48.75 59.25 49.75 47.25 35.50 43.50 36.25
Baichuan2-13B-Chat Baichuan-inc 48.87 49.55 56.75 49.41 50.08 48.25 39.19 51.25 51.25 56.50 47.75 41.00 63.25 63.00 59.75 41.00 41.25 46.75 56.50 52.25 44.50 53.50 45.75 54.75 46.00 51.00 53.25 44.25 50.00 48.75 50.00 45.50 36.00 39.25 36.00
Qwen-7B-Chat Qwen 46.58 48.00 54.25 48.34 48.08 44.88 35.94 50.25 48.50 56.25 46.00 39.00 63.50 60.00 54.00 39.50 46.00 45.50 53.50 55.75 42.00 46.50 48.50 49.00 47.50 47.50 49.25 43.75 43.25 53.50 39.00 37.25 36.25 39.50 30.75
ChatGLM2-6B THUDM 45.05 44.40 53.88 43.41 41.58 42.13 44.88 43.50 43.75 48.25 47.25 39.25 54.25 63.50 51.50 46.25 36.50 38.25 49.50 48.50 43.75 40.75 43.75 46.25 43.50 38.50 42.75 39.25 43.00 50.25 36.00 43.00 55.75 42.25 38.50
Baichuan2-7B-Chat Baichuan-inc 43.33 42.55 51.75 44.59 45.50 43.00 32.56 40.25 44.50 50.75 44.25 33.00 59.50 54.50 53.75 39.25 39.75 40.75 49.75 46.25 45.50 46.75 41.00 47.00 43.50 48.75 44.25 42.00 40.00 49.00 41.00 37.00 32.75 31.50 29.00
Baichuan-13B-chat Baichuan-inc 41.40 39.00 48.31 42.94 38.50 41.06 38.56 38.50 39.00 43.75 38.75 35.00 53.75 54.75 46.25 38.50 37.50 37.75 53.00 43.25 41.25 44.25 40.75 45.75 37.75 38.00 39.75 39.25 37.75 47.00 40.25 39.75 40.75 35.00 38.75
Qwen-1.8B Qwen 40.00 44.15 50.63 39.78 39.25 36.56 33.75 43.25 45.50 49.00 45.50 37.50 62.75 52.25 50.25 37.25 34.50 33.50 48.25 46.75 35.25 41.75 36.00 42.25 38.50 37.25 42.00 33.75 36.50 45.75 30.25 38.50 34.50 36.50 25.50
ChatGLM3-6B THUDM 40.00 42.55 47.31 39.56 41.08 37.44 32.06 40.25 43.25 49.00 44.75 35.50 52.25 54.25 46.25 36.50 35.25 35.50 46.00 45.00 34.25 41.75 40.50 38.25 40.75 38.25 44.25 35.75 38.50 44.00 31.50 37.25 30.25 33.00 27.75
DISC-MedLLM-13B FudanDISC 39.76 42.25 46.88 38.44 38.83 40.75 31.44 42.25 42.00 47.25 44.25 35.50 49.75 54.25 49.75 33.75 31.25 35.50 44.00 44.25 34.75 40.00 36.50 41.25 37.25 39.75 39.50 38.50 40.25 47.25 37.00 37.25 26.75 34.25 27.50
IvyGPT Macao Polytechnic University 38.54 37.70 43.56 40.47 38.08 35.31 36.13 36.75 37.75 42.00 42.00 30.00 51.00 47.00 46.25 30.00 36.75 40.50 47.25 38.75 35.00 43.50 39.50 42.50 37.00 37.00 40.25 34.50 34.75 35.75 36.25 32.25 40.25 39.75 32.25
Sunsimiao X-D Lab 38.57 38.75 44.37 38.81 38.33 37.50 33.31 37.00 37.25 46.00 39.75 33.75 53.25 47.75 44.00 32.50 33.75 40.25 43.75 42.25 32.50 37.25 39.50 41.25 35.50 38.50 41.00 33.50 34.00 45.25 37.25 30.00 39.75 36.25 27.25
Internlm-Chat-20B Internlm 38.17 39.35 45.44 38.53 37.92 38.13 29.63 36.75 40.25 44.75 38.75 36.25 54.25 50.50 45.75 31.25 34.25 38.00 44.25 43.75 33.75 39.00 39.00 36.25 38.00 38.00 37.75 33.50 36.25 51.00 31.75 31.75 31.75 31.25 23.75
ChatGPT OpenAI 38.09 40.75 45.69 36.59 40.08 37.94 28.81 42.75 39.50 43.25 41.00 37.25 53.25 46.75 46.25 36.50 35.75 33.00 43.00 44.75 26.50 37.50 39.25 33.00 38.50 41.25 40.50 35.25 40.00 50.25 26.25 36.50 20.50 39.25 19.00
Mixtral-8x7B Mistral 36.37 39.00 41.87 33.12 39.50 36.44 28.25 40.00 35.50 43.50 41.00 35.00 46.50 42.00 45.25 33.75 30.50 32.75 37.50 41.25 28.25 32.50 32.75 29.50 36.00 39.25 43.25 35.25 36.75 47.50 26.25 31.25 24.25 35.00 22.50
Internlm-Chat-7B Internlm 34.91 34.45 42.13 33.69 37.50 33.75 27.94 34.50 33.50 40.25 34.50 29.50 47.00 46.25 40.00 35.25 32.75 30.25 38.50 38.75 26.50 33.75 34.75 34.25 34.00 39.00 39.50 28.00 35.25 46.25 25.50 33.75 29.50 27.00 22.50
HuatuoGPT (华佗) CUHK(SZ)-NLP 31.38 31.85 35.00 30.56 31.38 35.00 28.25 32.00 34.50 33.25 31.25 28.25 37.00 38.00 36.00 29.00 28.00 28.75 36.50 31.50 26.00 31.75 31.50 30.50 30.75 29.50 36.25 29.50 29.00 38.25 28.50 28.25 28.50 30.25 26.00
MedicalGPT Xu Ming 26.15 26.56 30.94 24.72 27.17 25.44 21.50 25.00 25.75 29.25 29.50 25.75 34.75 35.75 32.25 21.25 23.75 24.00 27.50 25.75 24.25 26.25 26.50 27.00 28.00 25.75 27.75 22.00 29.50 32.00 20.75 22.50 22.25 24.50 21.25
Mistral-7B Mistral 22.26 23.75 22.19 20.97 25.83 21.94 18.88 24.75 22.50 27.25 25.75 18.50 22.50 23.50 23.25 19.50 21.00 19.25 20.25 22.75 17.75 20.75 21.25 24.75 24.00 24.25 29.25 21.50 23.00 23.75 19.50 25.25 17.75 16.25 16.25
ChatMed-Consult 华东师范大学 21.58 21.41 23.48 21.58 23.55 21.36 18.08 20.50 24.00 21.25 22.50 18.75 24.75 25.50 23.00 17.25 19.25 23.75 24.25 20.25 19.50 21.25 20.75 22.50 23.75 22.50 24.75 20.75 18.00 23.50 19.25 20.00 24.50 21.25 20.75
Bentsao (本草) SCIR-HI 20.39 21.67 19.99 21.07 22.85 19.83 16.93 20.39 23.00 24.25 20.50 18.25 24.00 23.25 22.25 15.25 18.25 20.75 25.00 22.00 18.25 22.00 19.50 24.75 23.50 22.75 26.00 18.25 20.00 21.00 19.25 20.50 16.75 21.50 19.75
ChatGLM-Med SCIR-HI 21.56 23.59 23.37 22.67 21.85 19.83 16.93 21.50 22.50 23.75 21.50 19.50 23.50 25.75 24.00 15.00 19.25 19.75 20.75 18.50 24.50 24.25 21.75 26.25 18.75 23.75 20.50 17.00 18.75 21.50 16.50 20.50 14.25 19.75 15.50
DoctorGLM 上海科技大学 7.63 6.95 7.31 8.28 9.75 7.50 6.06 6.00 8.00 6.00 9.00 5.75 6.00 11.50 6.75 5.00 7.00 6.00 9.50 10.25 10.50 6.75 9.25 7.00 8.00 12.50 8.75 8.25 7.75 6.75 7.25 5.00 6.25 7.50 5.50
BianQue-2 (扁鹊) 华东师范大学 7.38 6.95 7.31 7.25 9.75 6.94 6.06 7.50 9.00 8.00 8.75 6.00 6.75 7.00 7.50 5.25 7.75 6.75 6.75 6.00 8.25 8.00 7.25 9.00 6.00 9.00 10.00 6.50 5.75 6.25 9.75 5.25 4.00 6.25 9.00

CMB-Clin

The scores below were assigned by an AI grading model that compares each response against the ground truth on four dimensions: fluency, relevance, completeness, and proficiency. Scores range from 1 to 5.
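As a rough illustration of how the four dimension scores relate to the Avg. column, the sketch below assumes Avg. is the unweighted mean of the four scores. That is an assumption, not something stated here, and the `dimension_average` helper is hypothetical. Values recomputed from the rounded scores may differ from the listed Avg. in the last digit, possibly because the listed values were derived from unrounded scores.

```python
def dimension_average(fluency, relevance, completeness, proficiency):
    """Unweighted mean of the four dimension scores (assumed aggregation)."""
    scores = (fluency, relevance, completeness, proficiency)
    for score in scores:
        # Scores are stated to range from 1 to 5.
        if not 1 <= score <= 5:
            raise ValueError(f"score out of range [1, 5]: {score}")
    return sum(scores) / len(scores)


# GPT-4 row from the table: Fluency, Relevance, Completeness, Proficiency.
gpt4_avg = dimension_average(4.95, 4.71, 4.35, 4.66)
print(f"{gpt4_avg:.4f}")
```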

Model Fluency Relevance Completeness Proficiency Avg.
GPT-4 4.95 4.71 4.35 4.66 4.67
Yi-34B-Chat 4.99 4.69 4.34 4.64 4.67
HuatuoGPTII-34B 4.96 4.61 4.31 4.53 4.60
Qwen-72B-Chat 4.96 4.58 4.12 4.55 4.55
ChatGPT 4.97 4.49 4.17 4.53 4.53
Baichuan2-13B-chat 4.93 4.41 4.03 4.36 4.43
Baichuan-13B-chat 4.96 4.19 3.97 4.23 4.34
ChatGLM3-6B 4.92 4.11 3.74 4.23 4.25
Internlm-Chat-20B 4.90 3.91 3.25 4.14 4.05
ChatGLM2-6B 4.86 3.76 3.51 4.00 4.03
HuatuoGPT 4.89 3.75 3.38 3.86 3.97
Deepseek-llm-67B-Chat 4.78 4.04 2.62 4.16 3.90
BianQue-2 4.86 3.52 3.02 3.60 3.75
DISC-MedLLM 4.86 3.08 2.67 3.30 3.48
ChatMed-Consult 4.88 3.08 2.67 3.30 3.48
MedicalGPT 4.48 2.64 2.19 2.89 3.05
DoctorGLM 4.74 2.00 1.65 2.30 2.67
Bentsao 3.88 2.05 1.71 2.58 2.55
ChatGLM-Med 3.55 1.97 1.61 2.37 2.38
Mixtral-8x7B 2.53 2.28 1.54 3.04 2.35