README.md

@@ -46,19 +46,28 @@
## Table of Contents

- [Changelog](#0)
- [Model Download](#1)
- [Quick Start](#2)
- [Model Quantization](#quantize)
- [Open-Source Community](#community)
- [Evaluation Results](#3)
- [Mobile Deployment](#4)
- [Demo & API Deployment](#5)
- [Secondary Development](#6)
- [License](#7)
- [Citation](#8)
- [Typical Examples](#9)
## Quick Module Navigation

| [Inference](#2) | [Fine-tuning](#6) | [Mobile Deployment](#4) | [Quantization](#quantize) |
|-------------|------------|-----------|-----------|
| [Transformers](#Huggingface模型) | [Transformers](#transformer_finetune) | [MLC deployment](#MLC) | [GPTQ](#gptq) |
| [vLLM](#vllm-推理) | [mlx_finetune](#mlx) | [llama.cpp](#llama.cpp) | [AWQ](#awq) |
| [llama.cpp](#llama.cpp) | [llama_factory](https://github.com/OpenBMB/MiniCPM/tree/main/finetune/llama_factory_example/README.md) | | [Perplexity test](#quantize_test) |
| [ollama](#ollama) | | | |
| [fastllm](#fastllm) | | | |
| [mlx_lm](#mlx_lm) | | | |
<p id="0"></p>
|
<p id="0"></p>
|
||||||
|
|
||||||
## 更新日志
|
## 更新日志
|
||||||
@@ -104,6 +113,8 @@
- [Colab](https://colab.research.google.com/drive/1tJcfPyWGWA5HezO7GKLeyeIso0HyOc0l?usp=sharing)
<p id="Huggingface模型"></p>

#### Huggingface Models

##### MiniCPM-2B
@@ -195,7 +206,9 @@

#### Inference with llama.cpp, Ollama, fastllm, and mlx_lm

MiniCPM supports inference with [llama.cpp](https://github.com/ggerganov/llama.cpp/), [ollama](https://github.com/ollama/ollama), [fastllm](https://github.com/ztxz16/fastllm), and [mlx_lm](https://github.com/ml-explore/mlx-examples). Thanks to [@runfuture](https://github.com/runfuture) for adapting llama.cpp and ollama.
<p id="llama.cpp"></p>

#### llama.cpp

1. [Install llama.cpp](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#build)
2. Download a model in GGUF format: [fp16 version](https://huggingface.co/runfuture/MiniCPM-2B-dpo-fp16-gguf), [q4km version](https://huggingface.co/runfuture/MiniCPM-2B-dpo-q4km-gguf)
3. Run the example command in a terminal:
@@ -204,8 +217,9 @@
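The exact run command falls inside the elided hunk. A minimal sketch, assuming the q4km GGUF file downloaded above sits in the current directory; the binary is `./main` in older llama.cpp builds and `./llama-cli` in newer ones, and the file name and prompt are placeholders:

```bash
# illustrative only: model file name and prompt are placeholders
./main -m ./MiniCPM-2B-dpo-q4km-gguf.gguf \
       -p "<用户>推荐五个北京的景点。<AI>" \
       -n 256 --temp 0.5
```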
For more parameter options, see the [llama.cpp docs](https://github.com/ggerganov/llama.cpp/blob/master/examples/main/README.md).
<p id="ollama"></p>

#### ollama

***Automatic model installation with ollama***
1. [Install ollama](https://github.com/ollama/ollama)
2. Run in a terminal:
@@ -233,8 +247,9 @@

```
ollama run ollama_model_name
```
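The manual-installation steps are elided in this excerpt. Judging from the `ollama create ollama_model_name -f model_name.Modelfile` command visible in the hunk context, a minimal sketch might look like the following; the GGUF path, template, and parameter value are illustrative assumptions, not the repository's own Modelfile:

```bash
# illustrative sketch: build an ollama model from a local GGUF file
cat > model_name.Modelfile <<'EOF'
FROM ./MiniCPM-2B-dpo-q4km-gguf.gguf
TEMPLATE """<用户>{{ .Prompt }}<AI>"""
PARAMETER temperature 0.8
EOF
ollama create ollama_model_name -f model_name.Modelfile
ollama run ollama_model_name
```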
<p id="fastllm"></p>
|
||||||
|
|
||||||
**fastllm**
|
#### fastllm
|
||||||
1. [Build and install fastllm](https://github.com/ztxz16/fastllm)
2. Model inference:
@@ -248,8 +263,9 @@

```python
model = llm.from_hf(model, tokenizer, dtype="float16")  # dtype supports "float16", "int8", "int4"
print(model.response("<用户>山东省最高的山是哪座山, 它比黄山高还是矮?差距多少?<AI>", top_p=0.8, temperature=0.5, repeat_penalty=1.02))
```
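The setup lines of this example fall inside the elided hunk (its context shows `llm.set_device_map("cpu")`). A minimal self-contained sketch, assuming the standard transformers loading path and the fastllm_pytools bindings; the model id is illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from fastllm_pytools import llm  # fastllm's Python bindings

path = "openbmb/MiniCPM-2B-dpo-fp16"  # illustrative model id or local path
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(path, trust_remote_code=True)

llm.set_device_map("cpu")  # run on CPU, as in the elided context line
model = llm.from_hf(model, tokenizer, dtype="float16")  # convert to a fastllm model
print(model.response("<用户>山东省最高的山是哪座山?<AI>", top_p=0.8, temperature=0.5, repeat_penalty=1.02))
```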
<p id="mlx_lm"></p>
|
||||||
|
|
||||||
**mlx_lm**
|
#### mlx_lm
|
||||||
1. 安装mlx_lm库
|
1. 安装mlx_lm库
|
||||||
```shell
|
```shell
|
||||||
pip install mlx_lm
|
pip install mlx_lm
|
||||||
@@ -259,9 +275,11 @@

```shell
python -m mlx_lm.generate --model mlx-community/MiniCPM-2B-sft-bf16-llama-format-mlx --prompt "hello, tell me a joke." --trust-remote-code
```
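mlx_lm also exposes a small Python API; a minimal sketch equivalent to the command above, assuming the `load` and `generate` helpers are available in your installed mlx_lm version:

```python
from mlx_lm import load, generate

# load the converted MiniCPM checkpoint and run a short generation
model, tokenizer = load("mlx-community/MiniCPM-2B-sft-bf16-llama-format-mlx")
print(generate(model, tokenizer, prompt="hello, tell me a joke.", max_tokens=256))
```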
<p id="community"></p>
|
<p id="quantize"></p>
|
||||||
|
|
||||||
## 模型量化
|
## 模型量化
|
||||||
|
<p id="gptq"></p>
|
||||||
|
|
||||||
**gptq量化**
|
**gptq量化**
|
||||||
1. 首先git获取[minicpm_gptqd代码](https://github.com/LDLINGLINGLING/AutoGPTQ/tree/minicpm_gptq)
|
1. 首先git获取[minicpm_gptqd代码](https://github.com/LDLINGLINGLING/AutoGPTQ/tree/minicpm_gptq)
|
||||||
2. 进入minicpm_gptqd主目录./AutoGPTQ,命令行输入:
|
2. 进入minicpm_gptqd主目录./AutoGPTQ,命令行输入:
|
||||||
@@ -275,14 +293,37 @@
5. You can run inference with ./AutoGPTQ/examples/quantization/inference.py, or serve the quantized model with vLLM as described above; on a single 4090, vLLM inference with the MiniCPM-1B int4 model reaches roughly 2000 tokens/s.
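As a hedged illustration of serving the quantized checkpoint with vLLM's offline API (the checkpoint path and prompt are placeholders; the repository's own inference/inference_vllm.py remains the canonical entry point):

```python
from vllm import LLM, SamplingParams

# placeholder path to the GPTQ-quantized checkpoint directory
llm = LLM(model="path/to/MiniCPM-1B-int4", trust_remote_code=True)
params = SamplingParams(top_p=0.8, temperature=0.5, max_tokens=256)
outputs = llm.generate(["<用户>山东省最高的山是哪座山?<AI>"], params)
print(outputs[0].outputs[0].text)
```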
<p id="awq"></p>
|
||||||
|
|
||||||
**awq量化**
|
**awq量化**
|
||||||
1. 在quantize/awq_quantize.py 文件中修改根据注释修改配置参数:model_path , quant_path, quant_data_path , quant_config, quant_samples, 如需自定数据集则需要修改 custom_data。
|
1. 在quantize/awq_quantize.py 文件中修改根据注释修改配置参数:
|
||||||
```python
model_path = '/root/ld/ld_model_pretrained/MiniCPM-1B-sft-bf16'  # model_path or model_id
quant_path = '/root/ld/ld_project/pull_request/MiniCPM/quantize/awq_cpm_1b_4bit'  # quant_save_path
quant_data_path = '/root/ld/ld_project/pull_request/MiniCPM/quantize/quantize_data/wikitext'  # path to a bundled calibration set: alpaca or wikitext under quantize_data
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM" }  # "w_bit": 4 or 8
quant_samples = 512  # how many samples to use for calibration
custom_data = [{'question': '你叫什么名字。', 'answer': '我是openmbmb开源的小钢炮minicpm。'},  # a custom dataset can also be used
               {'question': '你有什么特色。', 'answer': '我很小,但是我很强。'}]
```
2. The quantize/quantize_data directory already provides two calibration sets, alpaca and wiki_text; point the quant_data_path above at one of these folders.
3. If you need a custom calibration set, modify the custom_data variable in quantize/awq_quantize.py, for example:
```python
custom_data = [{'question': '过敏性鼻炎有什么症状?', 'answer': '过敏性鼻炎可能鼻塞,流鼻涕,头痛等症状反复发作,严重时建议及时就医。'},
               {'question': '1+1等于多少?', 'answer': '等于2'}]
```
4. Depending on the calibration set you chose, replace line 38 of quantize/awq_quantize.py with one of the following lines:
```python
# quantize with wikitext
model.quantize(tokenizer, quant_config=quant_config, calib_data=load_wikitext(quant_data_path=quant_data_path))
# quantize with alpaca
model.quantize(tokenizer, quant_config=quant_config, calib_data=load_alpaca(quant_data_path=quant_data_path))
# quantize with a custom dataset
model.quantize(tokenizer, quant_config=quant_config, calib_data=load_cust_data(quant_data_path=quant_data_path))
```
5. Run quantize/awq_quantize.py; the AWQ-quantized model will be written to the quant_path directory you set.
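For orientation, a hedged, self-contained sketch of the flow that quantize/awq_quantize.py implements with the AutoAWQ library; the paths, the toy calibration list, and the final save calls are illustrative assumptions rather than lines copied from the script:

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "openbmb/MiniCPM-1B-sft-bf16"   # placeholder model id or local path
quant_path = "./awq_minicpm_1b_4bit"         # placeholder output directory
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# toy calibration corpus as plain strings; the repo's script builds this via
# load_wikitext / load_alpaca / load_cust_data, and real calibration needs a proper dataset
calib_data = ["MiniCPM 是面壁智能与清华大学自然语言处理实验室共同开源的端侧大模型。"] * 512
model.quantize(tokenizer, quant_config=quant_config, calib_data=calib_data)

model.save_quantized(quant_path)       # assumed AutoAWQ save call
tokenizer.save_pretrained(quant_path)
```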
<p id="quantize_test"></p>
|
||||||
|
|
||||||
**量化测试**
|
**量化测试**
|
||||||
1. 命令行进入到 MiniCPM/quantize 目录下
|
1. 命令行进入到 MiniCPM/quantize 目录下
|
||||||
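The remaining steps are elided in this excerpt. As a generic illustration of what a perplexity comparison measures (this is not the repository's own evaluation script), a minimal sketch over a list of texts:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "openbmb/MiniCPM-2B-sft-bf16"  # or a quantized checkpoint directory
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(path, trust_remote_code=True).eval()

texts = ["MiniCPM 是面壁智能与清华大学自然语言处理实验室共同开源的端侧大模型。"]  # evaluation texts
total_nll, total_tokens = 0.0, 0
with torch.no_grad():
    for text in texts:
        ids = tokenizer(text, return_tensors="pt").input_ids
        loss = model(ids, labels=ids).loss            # mean NLL over predicted tokens
        total_nll += loss.item() * (ids.size(1) - 1)
        total_tokens += ids.size(1) - 1
print("perplexity:", torch.exp(torch.tensor(total_nll / total_tokens)).item())
```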
@@ -750,6 +791,7 @@

<p id="4"></p>
## Mobile Deployment

<p id="MLC"></p>

#### Deployment Steps
@@ -821,14 +863,17 @@

<p id="6"></p>
## Secondary Development

<p id="transformer_finetune"></p>

* Parameter-efficient fine-tuning
  * Parameter-efficient fine-tuning is possible on a single 1080/2080 GPU
  * [Parameter-efficient fine-tuning code](https://github.com/OpenBMB/MiniCPM/tree/main/finetune)
<p id="BMTrain"></p>
|
||||||
|
|
||||||
* 全参数微调 or 持续训练
|
* 全参数微调 or 持续训练
|
||||||
* 使用[BMTrain](https://github.com/OpenBMB/BMTrain),借助重计算和ZeRO-3,一张3090/4090可实现全参数微调,一台机器可实现持续训练
|
* 使用[BMTrain](https://github.com/OpenBMB/BMTrain),借助重计算和ZeRO-3,一张3090/4090可实现全参数微调,一台机器可实现持续训练
|
||||||
* 相关代码也将陆续推出
|
* 相关代码也将陆续推出
|
||||||
|
<p id="mlx"></p>
|
||||||
|
|
||||||
* mlx高效参数微调
|
* mlx高效参数微调
|
||||||
* 环境准备
|
* 环境准备
|
||||||
@@ -842,7 +887,7 @@

```shell
# test
python mlx_finetune.py --model MiniCPM-2B-sft-bf16-llama-format-mlx --data data/AdvertiseGen --test --seed 2024
```
* [llama_factory fine-tuning](https://github.com/OpenBMB/MiniCPM/tree/main/finetune/llama_factory_example/README.md)

<p id="9"></p>
finetune/llama_factory_example/README.md (new file)
# Fine-Tuning MiniCPM with LLaMA-Factory

MiniCPM now supports fine-tuning with llama_factory, which covers continued pretraining, SFT, PPO, DPO, KTO, ORPO, and other fine-tuning methods.

llama_factory is powerful but can be hard for beginners to pick up, so we have recorded a fine-tuning tutorial.

**We provide the llama_factory_example folder for fine-tuning the MiniCPM-1B and MiniCPM-2B models.**

1. First, install the llama_factory dependencies:
```bash
git clone https://github.com/hiyouga/LLaMA-Factory
cd LLaMA-Factory
pip install -r requirements.txt
```
2. Convert your dataset to the format used in the MiniCPM/finetune/llama_factory_example/llama_factory_data folder (examples are provided for the dpo, kto, and sft fine-tuning methods) and place it under llama_factory/data. Taking DPO as an example:

```json
[
  {
    "conversations": [
      {
        "from": "human",
        "value": "Hi! I'd like to create a new language game simulating the first person perspective of a character named Angela."
      }
    ],
    "chosen": {
      "from": "gpt",
      "value": "That sounds like a fun and engaging idea! Here are some tips to help you create the game:\n1. ......"
    },
    "rejected": {
      "from": "gpt",
      "value": "Hello! I'd be happy to help you create a language game simulating the first-person perspective ....."
    }
  }
]
```
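If your preference data starts as plain (prompt, chosen, rejected) triples, a small hedged sketch of converting it into the ShareGPT-style layout above (file and variable names are illustrative):

```python
import json

# illustrative triples; replace with your own data
triples = [
    ("Hi! I'd like to create a new language game.", "Great idea! Here are some tips ...", "Hello! I'd be happy to help ..."),
]

records = [
    {
        "conversations": [{"from": "human", "value": prompt}],
        "chosen": {"from": "gpt", "value": chosen},
        "rejected": {"from": "gpt", "value": rejected},
    }
    for prompt, chosen, rejected in triples
]

with open("my_dpo_data.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)
```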
3. Add your dataset's information to llama_factory/data/dataset_info.json so that your dataset can be found there, for example:

```json
{
  "identity": {
    "file_name": "identity.json"
  },
  "sft_zh_demo": {
    "file_name": "alpaca_zh_demo.json"
  },
  "kto_en_demo": {
    "file_name": "kto_en_demo.json",
    "formatting": "sharegpt",
    "columns": {
      "messages": "messages",
      "kto_tag": "label"
    },
    "tags": {
      "role_tag": "role",
      "content_tag": "content",
      "user_tag": "user",
      "assistant_tag": "assistant"
    }
  },
  "dpo_en_demo": {
    "file_name": "dpo_en_demo.json",
    "ranking": true,
    "formatting": "sharegpt",
    "columns": {
      "messages": "conversations",
      "chosen": "chosen",
      "rejected": "rejected"
    }
  }
}
```
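For the hypothetical my_dpo_data.json written in the sketch above, the corresponding entry would mirror the dpo_en_demo one (the key and file name are illustrative):

```json
"my_dpo_data": {
  "file_name": "my_dpo_data.json",
  "ranking": true,
  "formatting": "sharegpt",
  "columns": {
    "messages": "conversations",
    "chosen": "chosen",
    "rejected": "rejected"
  }
}
```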
4. Copy the files in MiniCPM/finetune/llama_factory_example to the LLaMA-Factory/examples directory:

```bash
cd LLaMA-Factory/examples
mkdir minicpm
# replace /your/path below with your own MiniCPM and LLaMA-Factory paths
cp -r /your/path/MiniCPM/finetune/llama_factory_example/* /your/path/LLaMA-Factory/examples/minicpm
```
5. Taking DPO as an example, first edit minicpm_dpo.yaml. The fields to modify:

```yaml
model_name_or_path: openbmb/MiniCPM-2B-sft-bf16 # or your local save path
dataset: dpo_en_demo # the key name from dataset_info.json
output_dir: your/finetune_minicpm/save/path
bf16: true # if your device supports bf16, otherwise false
deepspeed: examples/deepspeed/ds_z2_config.json # switch to ds_z3_config.json if GPU memory is insufficient
```
6. In single_node.sh, make the following changes:

- 1. If you are on an A100 or higher-end server, delete the following two lines:

```bash
export NCCL_P2P_DISABLE=1
export NCCL_IB_DISABLE=1
```

- 2. Set the GPUs you want to use for fine-tuning; the example below uses all eight cards (devices 0-7):

```bash
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
```

- 3. Change the argument after src/train.py in the line below to the absolute path of minicpm_dpo.yaml inside llama_factory:

```bash
src/train.py /root/ld/ld_project/LLaMA-Factory/examples/minicpm/minicpm_sft.yaml
```
7. Run:

```bash
cd LLaMA-Factory
bash single_node.sh
```
New data files (diffs suppressed): finetune/llama_factory_example/llama_factory_data/dpo_en_demo.json, finetune/llama_factory_example/llama_factory_data/kto_en_demo.json, finetune/llama_factory_example/llama_factory_data/sft_zh_demo.json
finetune/llama_factory_example/minicpm_dpo.yaml (new file)
```yaml
### model
model_name_or_path: /root/ld/ld_project/LLaMA-Factory/saves/minicpm/full/sft/

### method
stage: dpo
do_train: true
finetuning_type: full

### ddp
ddp_timeout: 180000000
deepspeed: examples/deepspeed/ds_z2_config.json

### dataset
dataset: dpo_en_demo
template: cpm
cutoff_len: 1200
max_samples: 50000000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/minicpm/dpo
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true
save_strategy: epoch

### train
per_device_train_batch_size: 2
gradient_accumulation_steps: 4
learning_rate: 0.00001
num_train_epochs: 2.0
lr_scheduler_type: cosine
warmup_steps: 0.1
bf16: true

### eval
val_size: 0.1
per_device_eval_batch_size: 4
evaluation_strategy: steps
eval_steps: 500
```
finetune/llama_factory_example/minicpm_kto.yaml (new file)
```yaml
### model
model_name_or_path: /root/ld/ld_model_pretrain/MiniCPM-1B-sft-bf16/

### method
stage: kto
do_train: true
finetuning_type: full
kto_ftx: 0.1

### ddp
ddp_timeout: 180000000
deepspeed: examples/deepspeed/ds_z2_config.json

### dataset
dataset: kto_harmless
template: cpm
cutoff_len: 1200
max_samples: 500000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/minicpm/kto
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 4
gradient_accumulation_steps: 4
learning_rate: 0.000005
num_train_epochs: 1.0
lr_scheduler_type: cosine
warmup_steps: 0.1
bf16: true

### eval
val_size: 0.1
per_device_eval_batch_size: 16
evaluation_strategy: steps
eval_steps: 500
```
finetune/llama_factory_example/minicpm_sft.yaml (new file)
```yaml
### model
model_name_or_path: /root/ld/ld_model_pretrained/miniCPM-bf16/

### method
stage: sft
do_train: true
finetuning_type: full

### ddp
ddp_timeout: 180000000
deepspeed: examples/deepspeed/ds_z2_config.json

### dataset
dataset: glaive_toolcall_en,glaive_toolcall_zh
template: cpm
cutoff_len: 1800
max_samples: 500000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/minicpm/fuction_call
logging_steps: 10
save_strategy: epoch
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 2
gradient_accumulation_steps: 4
learning_rate: 0.0001
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_steps: 0.1
bf16: true

### eval
val_size: 0.1
per_device_eval_batch_size: 4
evaluation_strategy: steps
eval_steps: 500
```
finetune/llama_factory_example/single_node.sh (new file)
```bash
#!/bin/bash

NPROC_PER_NODE=8
NNODES=1
RANK=0
MASTER_ADDR=127.0.0.1
MASTER_PORT=29500

export NCCL_P2P_DISABLE=1
export NCCL_IB_DISABLE=1

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun \
  --nproc_per_node $NPROC_PER_NODE \
  --nnodes $NNODES \
  --node_rank $RANK \
  --master_addr $MASTER_ADDR \
  --master_port $MASTER_PORT \
  src/train.py /root/ld/ld_project/LLaMA-Factory/examples/minicpm/minicpm_sft.yaml
```
quantize/awq_quantize.py

@@ -7,10 +7,10 @@ import os

```python
model_path = '/root/ld/ld_model_pretrained/MiniCPM-1B-sft-bf16'  # model_path or model_id
quant_path = '/root/ld/ld_project/pull_request/MiniCPM/quantize/awq_cpm_1b_4bit'  # quant_save_path
quant_data_path = '/root/ld/ld_project/pull_request/MiniCPM/quantize/quantize_data/wikitext'  # path to a bundled calibration dataset
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM" }  # "w_bit": 4 or 8
quant_samples = 512  # how many samples to use for calibration
custom_data = [{'question': '你叫什么名字。', 'answer': '我是openmbmb开源的小钢炮minicpm。'},  # a custom dataset can also be used
               {'question': '你有什么特色。', 'answer': '我很小,但是我很强。'}]
# Load model
model = AutoAWQForCausalLM.from_pretrained(model_path)
```