update readme
parent 7f55e1626f
commit 2d3212613d

README-en.md (22 changed lines)
@@ -30,6 +30,8 @@ We release all model parameters for research and limited commercial use.
- SFT and DPO version based on MiniCPM-2B and human preference: **MiniCPM-2B-SFT/DPO**
- The multi-modal model **MiniCPM-V** based on MiniCPM-2B, which outperforms models of similar size, e.g., Phi-2
- The INT4 quantized version **MiniCPM-2B-SFT/DPO-Int4** based on MiniCPM-2B-SFT/DPO
- The 128k long context version of MiniCPM-2B: **MiniCPM-2B-128k**.
- The MoE version of MiniCPM-2B: **MiniCPM-MoE-8x2B**.
- Mobile phone application based on MLC-LLM and LLMFarm. Both the language model and the multimodal model can run inference on smartphones.
- 30 intermediate [checkpoints](https://huggingface.co/openbmb/MiniCPM-2B-history) for academic purposes.

@@ -57,7 +59,7 @@ We release all model parameters for research and limited commercial use.
<p id="0"></p>

## Update Log
- 2024/04/11 We release [MiniCPM-V-v2.0](https://huggingface.co/openbmb/MiniCPM-V-v2.0), [MiniCPM-2B-128k](https://huggingface.co/openbmb/MiniCPM-2B-128k) and[MiniCPM-MoE-8x2B](https://huggingface.co/openbmb/MiniCPM-MoE-8x2B).
- 2024/04/11 We release [MiniCPM-V-v2.0](https://huggingface.co/openbmb/MiniCPM-V-v2.0), [MiniCPM-2B-128k](https://huggingface.co/openbmb/MiniCPM-2B-128k) and [MiniCPM-MoE-8x2B](https://huggingface.co/openbmb/MiniCPM-MoE-8x2B).
- 2024/03/16 Intermediate checkpoints were released [here](https://huggingface.co/openbmb/MiniCPM-2B-history)!
- 2024/02/13 We support llama.cpp
- 2024/02/09 We have included a [Community](#community) section in the README to encourage support for MiniCPM from the open-source community.
@@ -70,22 +72,16 @@ We release all model parameters for research and limited commercial use.

* Language Model

| HuggingFace | ModelScope | WiseModel | Replicate |
|-------------|------------|-----------|-----------|
| HuggingFace | ModelScope | WiseModel |
|-------------|------------|-----------|
|[MiniCPM-2B-sft-bf16](https://huggingface.co/openbmb/MiniCPM-2B-sft-bf16)|[MiniCPM-2B-sft-bf16](https://modelscope.cn/models/OpenBMB/miniCPM-bf16)|[MiniCPM-2B-sft-bf16](https://wisemodel.cn/models/OpenBMB/miniCPM-bf16)|
|[MiniCPM-2B-dpo-bf16](https://huggingface.co/openbmb/MiniCPM-2B-dpo-bf16)|[MiniCPM-2B-dpo-bf16](https://modelscope.cn/models/OpenBMB/MiniCPM-2B-dpo-bf16/summary)|[MiniCPM-2B-dpo-bf16](https://wisemodel.cn/models/OpenBMB/MiniCPM-2B-dpo-bf16)|[MiniCPM-2B-dpo-bf16](https://replicate.com/tuantuanzhang/minicpm)|
|[MiniCPM-2B-128k](https://huggingface.co/openbmb/MiniCPM-2B-128k)|[MiniCPM-2B-128k](https://modelscope.cn/models/openbmb/MiniCPM-2B-128k/summary)|
|[MiniCPM-MoE-8x2B]()|[MiniCPM-MoE-8x2B]()|
|[MiniCPM-2B-sft-fp32-llama-format](https://huggingface.co/openbmb/MiniCPM-2B-sft-fp32-llama-format)|
|[MiniCPM-2B-sft-bf16-llama-format](https://huggingface.co/openbmb/MiniCPM-2B-sft-bf16-llama-format)|
|[MiniCPM-2B-dpo-bf16-llama-format](https://huggingface.co/openbmb/MiniCPM-2B-dpo-bf16-llama-format)|
|[MiniCPM-2B-dpo-fp16-gguf](https://huggingface.co/runfuture/MiniCPM-2B-dpo-fp16-gguf)|
|[MiniCPM-2B-dpo-q4km-gguf](https://huggingface.co/runfuture/MiniCPM-2B-dpo-q4km-gguf)|

Note:
1. The model training was conducted in bf16 format, so inference using bf16 will yield the best results. Other formats might experience a slight performance decline due to precision issues.
2. The models with a '-llama-format' suffix are those where we have transformed the MiniCPM structure into the Llama structure (primarily integrating the parameterization scheme of mup into the model's own parameters). This enables users of the Llama model to try out MiniCPM at no extra cost. [See details](#llamaformat)
3. Thanks to [the contributor](https://github.com/runfuture) for adapting MiniCPM to [llama.cpp](https://github.com/ggerganov/llama.cpp) and [ollama](https://github.com/ollama/ollama).
2. More model versions can be found [here](https://huggingface.co/collections/openbmb/minicpm-2b-65d48bf958302b9fd25b698f).

* Multimodal Model

@@ -131,7 +127,7 @@ The capital city of China is Beijing. Beijing is not only the political center o
<p id="llamaformat"></p>

##### MiniCPM-2B (Llama Format)
We have converted the model weights of MiniCPM into a format that can be directly called by Llama code, for everyone to try:
To facilitate ease of use, we have converted the model weights of MiniCPM to adapt to the structure of the LLaMA model:
```python
import torch
from transformers import LlamaTokenizerFast, LlamaForCausalLM
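# A hedged, illustrative continuation (not necessarily the repo's exact snippet): load one of
# the "-llama-format" checkpoints from the download table with plain Llama classes. The
# checkpoint id comes from that table; the chat markers (<用户>/<AI>) and sampling values
# below are assumptions.
path = "openbmb/MiniCPM-2B-dpo-bf16-llama-format"
tokenizer = LlamaTokenizerFast.from_pretrained(path)
model = LlamaForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16, device_map="cuda")

prompt = "<用户>Which city is the capital of China?<AI>"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
outputs = model.generate(input_ids, max_new_tokens=128, do_sample=True, top_p=0.8, temperature=0.5)
res = tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True)
print(res)
```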
@@ -184,7 +180,7 @@ print(res)


#### llama.cpp, Ollama, and fastllm Inference
We support inference with [llama.cpp](https://github.com/ggerganov/llama.cpp/), [ollama](https://github.com/ollama/ollama) and [fastllm](https://github.com/ztxz16/fastllm).
We support inference with [llama.cpp](https://github.com/ggerganov/llama.cpp/), [ollama](https://github.com/ollama/ollama) and [fastllm](https://github.com/ztxz16/fastllm). Thanks to [@runfuture](https://github.com/runfuture) for the adaptation of llama.cpp and ollama.


**llama.cpp**
@@ -207,7 +203,7 @@ Solving [this issue](https://github.com/ollama/ollama/issues/2383)
- [ChatLLM](https://github.com/foldl/chatllm.cpp): [Run MiniCPM on CPU](https://huggingface.co/openbmb/MiniCPM-2B-dpo-bf16/discussions/2#65c59c4f27b8c11e43fc8796)

**fastllm**
1. [install fastllm]([fastllm](https://github.com/ztxz16/fastllm)
1. install [fastllm](https://github.com/ztxz16/fastllm)
2. inference
```
import torch

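# A hedged, illustrative continuation (not necessarily the repo's exact snippet): convert a
# MiniCPM checkpoint to fastllm and chat with it. This assumes fastllm's Python bindings
# (fastllm_pytools) are built and installed; the model id, chat markers and sampling values
# are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from fastllm_pytools import llm

path = "openbmb/MiniCPM-2B-dpo-fp16"
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.float16, trust_remote_code=True)

# Wrap the Hugging Face model as a fastllm model, then generate a reply to a MiniCPM-style prompt.
model = llm.from_hf(model, tokenizer, dtype="float16")
print(model.response("<用户>Which mountain is the highest in Shandong Province?<AI>",
                     top_p=0.8, temperature=0.5))
```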

README.md (24 changed lines)
@@ -28,9 +28,11 @@ MiniCPM is jointly open-sourced by 面壁智能 (ModelBest) and the Natural Language Processing Lab of Tsinghua University

We fully open-source the model parameters of MiniCPM-2B for academic research and limited commercial use.
Specifically, we have released the following models; see the [Model Download](#1) section for links.
- **MiniCPM-2B-SFT/DPO**, the instruction fine-tuned and human preference version of MiniCPM-2B.
- **MiniCPM-2B-SFT/DPO**, the instruction fine-tuned and human preference aligned version of MiniCPM-2B.
- **MiniCPM-V**, a multimodal model based on MiniCPM-2B that outperforms multimodal models of the same parameter scale built on Phi-2.
- **MiniCPM-2B-SFT/DPO-Int4**, the Int4 quantized version of MiniCPM-2B-SFT/DPO.
- **MiniCPM-2B-128k**, the 128k long-context version of MiniCPM-2B.
- **MiniCPM-MoE-8x2B**, the MoE version of MiniCPM-2B.
- MiniCPM mobile apps built on MLC-LLM and LLMFarm; **both the text and multimodal models can run inference on smartphones**.
- [30 checkpoints](https://huggingface.co/openbmb/MiniCPM-2B-history) from the training process, for research on model mechanisms.

@@ -59,7 +61,7 @@ MiniCPM is jointly open-sourced by 面壁智能 (ModelBest) and the Natural Language Processing Lab of Tsinghua University

## Update Log
- 2024/04/11 We open-sourced [MiniCPM-V-v2.0](https://huggingface.co/openbmb/MiniCPM-V-v2.0), [MiniCPM-2B-128k](https://huggingface.co/openbmb/MiniCPM-2B-128k) and [MiniCPM-MoE-8x2B](https://huggingface.co/openbmb/MiniCPM-MoE-8x2B).
- 2024/03/16 More than 30 intermediate checkpoints of minicpm-2b were released
- 2024/03/16 More than 30 intermediate checkpoints of MiniCPM-2B were released
- 2024/02/13 Added llama.cpp support
- 2024/02/09 We added an [Open-Source Community](#community) section to the README to collect examples of community support for MiniCPM.
- 2024/02/08 We updated the [llama-format model weights](#llamaformat) to make our models easier to use.
@@ -71,22 +73,16 @@ MiniCPM is jointly open-sourced by 面壁智能 (ModelBest) and the Natural Language Processing Lab of Tsinghua University

* Language Model

| HuggingFace | ModelScope | WiseModel | Replicate |
|-------------|------------|-----------|-----------|
| HuggingFace | ModelScope | WiseModel |
|-------------|------------|-----------|
|[MiniCPM-2B-sft-bf16](https://huggingface.co/openbmb/MiniCPM-2B-sft-bf16)|[MiniCPM-2B-sft-bf16](https://modelscope.cn/models/OpenBMB/miniCPM-bf16)|[MiniCPM-2B-sft-bf16](https://wisemodel.cn/models/OpenBMB/miniCPM-bf16)|
|[MiniCPM-2B-dpo-bf16](https://huggingface.co/openbmb/MiniCPM-2B-dpo-bf16)|[MiniCPM-2B-dpo-bf16](https://modelscope.cn/models/OpenBMB/MiniCPM-2B-dpo-bf16/summary)|[MiniCPM-2B-dpo-bf16](https://wisemodel.cn/models/OpenBMB/MiniCPM-2B-dpo-bf16)|[MiniCPM-2B-dpo-bf16](https://replicate.com/tuantuanzhang/minicpm)|
|[MiniCPM-2B-128k](https://huggingface.co/openbmb/MiniCPM-2B-128k)|[MiniCPM-2B-128k](https://modelscope.cn/models/openbmb/MiniCPM-2B-128k/summary)|
|[MiniCPM-MoE-8x2B]()|[MiniCPM-MoE-8x2B]()|
|[MiniCPM-2B-sft-fp32-llama-format](https://huggingface.co/openbmb/MiniCPM-2B-sft-fp32-llama-format)|
|[MiniCPM-2B-sft-bf16-llama-format](https://huggingface.co/openbmb/MiniCPM-2B-sft-bf16-llama-format)|
|[MiniCPM-2B-dpo-bf16-llama-format](https://huggingface.co/openbmb/MiniCPM-2B-dpo-bf16-llama-format)|
|[MiniCPM-2B-dpo-fp16-gguf](https://huggingface.co/runfuture/MiniCPM-2B-dpo-fp16-gguf)|
|[MiniCPM-2B-dpo-q4km-gguf](https://huggingface.co/runfuture/MiniCPM-2B-dpo-q4km-gguf)|

Note:
1. The models were trained in bf16, so bf16 inference gives the best results; other formats may see a slight performance drop due to precision issues.
2. Models with the -llama-format suffix are MiniCPM models converted to the Llama structure (mainly by folding the mup parameterization scheme into the model's own parameters), so that Llama users can try MiniCPM at no extra cost. [See details](#llamaformat)
3. Thanks to [@runfuture](https://github.com/runfuture) for adapting MiniCPM to [llama.cpp](https://github.com/ggerganov/llama.cpp) and [ollama](https://github.com/ollama/ollama).
2. More model versions are available [here](https://huggingface.co/collections/openbmb/minicpm-2b-65d48bf958302b9fd25b698f).


* Multimodal Model
@@ -133,7 +129,7 @@ print(responds)
<p id="llamaformat"></p>

##### MiniCPM-2B (Llama Format)
We have converted the MiniCPM model weights into a form that Llama code can call directly, for everyone to try:
We have converted the MiniCPM model weights into a [format](https://huggingface.co/openbmb/MiniCPM-2B-sft-bf16-llama-format) that Llama code can call directly, for everyone to try:
```python
import torch
from transformers import LlamaTokenizerFast, LlamaForCausalLM
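# A hedged, illustrative continuation, mirroring the English section above: load the
# llama-format checkpoint linked in the sentence above and generate. The chat markers and
# sampling values are assumptions, not the repo's exact example.
path = "openbmb/MiniCPM-2B-sft-bf16-llama-format"
tokenizer = LlamaTokenizerFast.from_pretrained(path)
model = LlamaForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16, device_map="cuda")

input_ids = tokenizer("<用户>Which city is the capital of China?<AI>", return_tensors="pt").input_ids.to(model.device)
out = model.generate(input_ids, max_new_tokens=128, do_sample=True, top_p=0.8, temperature=0.5)
responds = tokenizer.decode(out[0][input_ids.shape[1]:], skip_special_tokens=True)
print(responds)
```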
@@ -192,7 +188,7 @@ python inference/inference_vllm.py --model_path <hf_repo_path> --prompt_path pro
```

#### llama.cpp, Ollama, and fastllm Inference
We support [llama.cpp](https://github.com/ggerganov/llama.cpp/) inference, [ollama](https://github.com/ollama/ollama) inference, and [fastllm](https://github.com/ztxz16/fastllm) inference.
MiniCPM supports [llama.cpp](https://github.com/ggerganov/llama.cpp/), [ollama](https://github.com/ollama/ollama) and [fastllm](https://github.com/ztxz16/fastllm) inference. Thanks to [@runfuture](https://github.com/runfuture) for adapting llama.cpp and ollama.

**llama.cpp**
1. [Install llama.cpp](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#build)
@@ -227,7 +223,7 @@ print(model.response("<用户>山东省最高的山是哪座山, 它比黄山高

## Open-Source Community

- [ChatLLM framework](https://github.com/foldl/chatllm.cpp): [Run MiniCPM on CPU](https://huggingface.co/openbmb/MiniCPM-2B-dpo-bf16/discussions/2#65c59c4f27b8c11e43fc8796)