-| Model | Size | Visual Tokens | MME | MMB dev (en) | MMB dev (zh) | MMMU val | CMMMU val |
-|---|---|---|---|---|---|---|---|
-| LLaVA-Phi | 3B | 576 | 1335 | 59.8 | - | - | - |
-| MobileVLM | 3B | 144 | 1289 | 59.6 | - | - | - |
-| Imp-v1 | 3B | 576 | 1434 | 66.5 | - | - | - |
-| Qwen-VL-Chat | 9.6B | 256 | 1487 | 60.6 | 56.7 | 35.9 | 30.7 |
-| CogVLM | 17.4B | 1225 | 1438 | 63.7 | 53.8 | 32.1 | - |
-| MiniCPM-V(3B) | 3B | 64 | 1452 | 67.3 | 61.9 | 34.7 | 32.1 |
+| Model | Size | TextVQA val | DocVQA test | OCRBench | OpenCompass | MME | MMB dev (en) | MMB dev (zh) | MMMU val | MathVista | LLaVA Bench | Object HalBench |
+|---|---|---|---|---|---|---|---|---|---|---|---|---|
+| **Proprietary models** | | | | | | | | | | | | |
+| Gemini Pro Vision | - | 74.6 | 88.1 | 680 | 63.8 | 2148.9 | 75.2 | 74.0 | 48.9 | 45.8 | 79.9 | - |
+| GPT-4V | - | 78.0 | 88.4 | 645 | 63.2 | 1771.5 | 75.1 | 75.0 | 53.8 | 47.8 | 93.1 | 86.4 / 92.7 |
+| **Open-source models 6B~34B** | | | | | | | | | | | | |
+| Yi-VL-6B | 6.7B | 45.5* | 17.1* | 290 | 49.3 | 1915.1 | 68.6 | 68.3 | 40.3 | 28.8 | 51.9 | - |
+| Qwen-VL-Chat | 9.6B | 61.5 | 62.6 | 488 | 52.1 | 1860.0 | 60.6 | 56.7 | 37.0 | 33.8 | 67.7 | 56.2 / 80.0 |
+| Yi-VL-34B | 34B | 43.4* | 16.9* | 290 | 52.6 | 2050.2 | 71.1 | 71.4 | 45.1 | 30.7 | 62.3 | - |
+| DeepSeek-VL-7B | 7.3B | 64.7* | 47.0* | 435 | 55.6 | 1765.4 | 74.1 | 72.8 | 38.3 | 36.8 | 77.8 | - |
+| TextMonkey | 9.7B | 64.3 | 66.7 | 558 | - | - | - | - | - | - | - | - |
+| CogVLM-Chat | 17.4B | 70.4 | 33.3* | 590 | 52.5 | 1736.6 | 63.7 | 53.8 | 37.3 | 34.7 | 73.9 | 73.6 / 87.4 |
+| **Open-source models 1B~3B** | | | | | | | | | | | | |
+| DeepSeek-VL-1.3B | 1.7B | 58.4* | 37.9* | 413 | 46.0 | 1531.6 | 64.0 | 61.2 | 33.8 | 29.4 | 51.1 | - |
+| MobileVLM V2 | 3.1B | 57.5 | 19.4* | - | - | 1440.5(P) | 63.2 | - | - | - | - | - |
+| Mini-Gemini | 2.2B | 56.2 | 34.2* | - | - | 1653.0 | 59.8 | - | 31.7 | - | - | - |
+| MiniCPM-V | 2.8B | 60.6 | 38.2 | 366 | 47.6 | 1650.2 | 67.9 | 65.3 | 38.3 | 28.9 | 51.3 | 78.4 / 88.5 |
+| MiniCPM-V 2.0 | 2.8B | 74.1 | 71.9 | 605 | 55.0 | 1808.6 | 69.6 | 68.1 | 38.2 | 38.7 | 69.2 | 85.5 / 92.2 |
+* We evaluate the officially released checkpoint by ourselves.
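For the rows marked with *, scores come from running the officially released checkpoints ourselves. As a point of reference, the sketch below shows one way such a released checkpoint could be loaded and queried for evaluation. It is not the project's evaluation harness: the repo id simply follows the MiniCPM-V-2.0 link in the changelog above, and the `chat()` call mirrors the interface documented on the Hugging Face model card; treat both as assumptions and verify them against the current model card before use.

```python
# Minimal sketch, assuming the repo id from the changelog link above and the
# chat() interface documented on the model card (verify both before relying on this).
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

repo_id = "openbmb/MiniCPM-V-2.0"  # taken from the link above; adjust if the hosted id differs

# trust_remote_code is needed because the vision-language wrapper ships with the checkpoint
model = AutoModel.from_pretrained(repo_id, trust_remote_code=True, torch_dtype=torch.bfloat16)
model = model.to(device="cuda", dtype=torch.bfloat16).eval()
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)

image = Image.open("sample_doc.png").convert("RGB")  # any local test image
msgs = [{"role": "user", "content": "What text is written in this image?"}]

# Single-turn query; sampling parameters are arbitrary illustration values
res, context, _ = model.chat(
    image=image,
    msgs=msgs,
    context=None,
    tokenizer=tokenizer,
    sampling=True,
    temperature=0.7,
)
print(res)
```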
#### DPO evaluation
diff --git a/README.md b/README.md
index 3743c5f..7a512ca 100644
--- a/README.md
+++ b/README.md
@@ -15,7 +15,7 @@
MiniCPM Technical Blog | OmniLMM Multimodal Model | CPM-C 100B Model Trial |
-Join our discord and wechat
+Join our discord and WeChat group
@@ -61,10 +61,10 @@ MiniCPM 是面壁智能与清华大学自然语言处理实验室共同开源的
## Changelog
-- 2024/04/11 Open-sourced [MiniCPM-V-2.0](https://huggingface.co/openbmb/MiniCPM-V-2.0), [MiniCPM-2B-128k](https://huggingface.co/openbmb/MiniCPM-2B-128k), [MiniCPM-MoE-8x2B](https://huggingface.co/openbmb/MiniCPM-MoE-8x2B) and [MiniCPM-1B-sft-bf16](https://huggingface.co/openbmb/MiniCPM-1B-sft-bf16)!
+- 2024/04/11 Open-sourced [MiniCPM-V-2.0](https://huggingface.co/openbmb/MiniCPM-V-2.0), [MiniCPM-2B-128k](https://huggingface.co/openbmb/MiniCPM-2B-128k), [MiniCPM-MoE-8x2B](https://huggingface.co/openbmb/MiniCPM-MoE-8x2B) and [MiniCPM-1B](https://huggingface.co/openbmb/MiniCPM-1B-sft-bf16)!
- 2024/03/16 Released more than 30 intermediate checkpoints of MiniCPM-2B.
- 2024/02/13 Added support for llama.cpp.
-- 2024/02/09 Added an [open-source community](#community) section to the readme to collect community support for MiniCPM.
+- 2024/02/09 Added an [open-source community](#community) section to the README to collect community support for MiniCPM.
- 2024/02/08 Updated the [llama-format model weights](#llamaformat) so our models are easier to use.
- 2024/02/01 Initial release.
@@ -437,86 +437,236 @@ print(model.response("<用户>山东省最高的山是哪座山, 它比黄山高
#### Multimodal model evaluation
-| Model | Size | Visual Tokens | MME | MMB dev (en) | MMB dev (zh) | MMMU val | CMMMU val |
-|---|---|---|---|---|---|---|---|
-| LLaVA-Phi | 3B | 576 | 1335 | 59.8 | - | - | - |
-| MobileVLM | 3B | 144 | 1289 | 59.6 | - | - | - |
-| Imp-v1 | 3B | 576 | 1434 | 66.5 | - | - | - |
-| Qwen-VL-Chat | 9.6B | 256 | 1487 | 60.6 | 56.7 | 35.9 | 30.7 |
-| CogVLM | 17.4B | 1225 | 1438 | 63.7 | 53.8 | 32.1 | - |
-| MiniCPM-V(3B) | 3B | 64 | 1452 | 67.3 | 61.9 | 34.7 | 32.1 |
+| Model | Size | TextVQA val | DocVQA test | OCRBench | OpenCompass | MME | MMB dev (en) | MMB dev (zh) | MMMU val | MathVista | LLaVA Bench | Object HalBench |
+|---|---|---|---|---|---|---|---|---|---|---|---|---|
+| **Proprietary models** | | | | | | | | | | | | |
+| Gemini Pro Vision | - | 74.6 | 88.1 | 680 | 63.8 | 2148.9 | 75.2 | 74.0 | 48.9 | 45.8 | 79.9 | - |
+| GPT-4V | - | 78.0 | 88.4 | 645 | 63.2 | 1771.5 | 75.1 | 75.0 | 53.8 | 47.8 | 93.1 | 86.4 / 92.7 |
+| **Open-source models 6B~34B** | | | | | | | | | | | | |
+| Yi-VL-6B | 6.7B | 45.5* | 17.1* | 290 | 49.3 | 1915.1 | 68.6 | 68.3 | 40.3 | 28.8 | 51.9 | - |
+| Qwen-VL-Chat | 9.6B | 61.5 | 62.6 | 488 | 52.1 | 1860.0 | 60.6 | 56.7 | 37.0 | 33.8 | 67.7 | 56.2 / 80.0 |
+| Yi-VL-34B | 34B | 43.4* | 16.9* | 290 | 52.6 | 2050.2 | 71.1 | 71.4 | 45.1 | 30.7 | 62.3 | - |
+| DeepSeek-VL-7B | 7.3B | 64.7* | 47.0* | 435 | 55.6 | 1765.4 | 74.1 | 72.8 | 38.3 | 36.8 | 77.8 | - |
+| TextMonkey | 9.7B | 64.3 | 66.7 | 558 | - | - | - | - | - | - | - | - |
+| CogVLM-Chat | 17.4B | 70.4 | 33.3* | 590 | 52.5 | 1736.6 | 63.7 | 53.8 | 37.3 | 34.7 | 73.9 | 73.6 / 87.4 |
+| **Open-source models 1B~3B** | | | | | | | | | | | | |
+| DeepSeek-VL-1.3B | 1.7B | 58.4* | 37.9* | 413 | 46.0 | 1531.6 | 64.0 | 61.2 | 33.8 | 29.4 | 51.1 | - |
+| MobileVLM V2 | 3.1B | 57.5 | 19.4* | - | - | 1440.5(P) | 63.2 | - | - | - | - | - |
+| Mini-Gemini | 2.2B | 56.2 | 34.2* | - | - | 1653.0 | 59.8 | - | 31.7 | - | - | - |
+| MiniCPM-V | 2.8B | 60.6 | 38.2 | 366 | 47.6 | 1650.2 | 67.9 | 65.3 | 38.3 | 28.9 | 51.3 | 78.4 / 88.5 |
+| MiniCPM-V 2.0 | 2.8B | 74.1 | 71.9 | 605 | 55.0 | 1808.6 | 69.6 | 68.1 | 38.2 | 38.7 | 69.2 | 85.5 / 92.2 |
+* We evaluate the officially released checkpoint by ourselves.