增加了llama_facoty的微调示例

2026-01-24 15:33:15 +08:00 · 2024-06-27 09:04:43 +08:00 · 2024-06-27 09:04:43 +08:00 · d58e892a98
commit d58e892a98
parent c3206bad3f
8 changed files with 17840 additions and 0 deletions
--- a/finetune/llama_factory_example/README.md
+++ b/finetune/llama_factory_example/README.md
@ -0,0 +1,73 @@
+# MiniCPM_llama_factory 微调
+MiniCPM已经支持llama_factory微调，llama_factory支持continue_pretrain,sft,ppo,dpo,kto,orpo等等微调方式。
+由于llama_factory功能强大，但初学者较难上手，我们录制了微调教程
+**我们提供了 llama_factory_example文件夹，用来微调minicpm1b，minicpm2b模型。**
+1.首先安装llama_factory依赖。
+```bash
+git clone https://github.com/hiyouga/LLaMA-Factory
+cd LLaMA-Factory
+pip install -r requirements.txt
+```
+2.将数据集处理成Minicpm/finetune/llama_factory_example/llama_factory_data文件夹中的格式,示例包括dpo,kto,sft三种微调方式并放置到llama_factory/data目录下.
+3.在llama_factory/data/dataset_info.json中添加数据集信息,保证dataset_info.json中能找到你的数据集，如下例：
+``` json
+  {"identity": {
+    "file_name": "identity.json"
+  },
+    "alpaca_zh_demo": {
+      "file_name": "alpaca_zh_demo.json"
+    },
+    "kto_en_demo": {
+      "file_name": "kto_en_demo.json",
+      "formatting": "sharegpt",
+      "columns": {
+        "messages": "messages",
+        "kto_tag": "label"
+      },
+      "tags": {
+        "role_tag": "role",
+        "content_tag": "content",
+        "user_tag": "user",
+        "assistant_tag": "assistant"
+      }
+    },
+    "dpo_en_demo": {
+      "file_name": "dpo_en_demo.json",
+      "ranking": true,
+      "formatting": "sharegpt",
+      "columns": {
+        "messages": "conversations",
+        "chosen": "chosen",
+        "rejected": "rejected"
+      }
+    }
+  }
+```
+4.将MiniCPM/finetune/llama_factory_example中文件复制到LLaMA-Factory/examples目录下。
+5.以dpo为例，首先修改minicpm_dpo.yaml,需要修改的：
+```bash
+  model_name_or_path: openbmb/MiniCPM-2B-sft-bf16 #或者你本地保存的地址
+  dataset: dpo_en_demo #这里写dataset_info.json中的键名
+  output_dir: your/finetune_minicpm/save/path
+  bf16: true #如果你的设备支持bf16，否则false
+  deepspeed: examples/deepspeed/ds_z2_config.json #如果显存不够可以改成ds_z3_config.json
+```
+6.修改single_node.sh文件中：
+  1.如果是a100以及更高端服务器，删除以下两行
+  ```bash
+    export NCCL_P2P_DISABLE=1
+    export NCCL_IB_DISABLE=1 
+  ```
+  2.设置你希望参与微调的卡，以下示例为第1张到第8张卡都参与微调
+  ```bash
+    CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
+  ```
+  3.将以下代码src/train.py空格后方参数改为llama_facoty中minicpm_dpo.yaml的绝对路径
+  ```bash
+    src/train.py /root/ld/ld_project/LLaMA-Factory/examples/minicpm/minicpm_sft.yaml
+  ```
+7.执行：
+```bash
+  cd LLaMA-Factory
+  bash single_node.sh
+```
--- a/finetune/llama_factory_example/llama_factory_data/dpo_en_demo.json
+++ b/finetune/llama_factory_example/llama_factory_data/dpo_en_demo.json
--- a/finetune/llama_factory_example/llama_factory_data/kto_en_demo.json
+++ b/finetune/llama_factory_example/llama_factory_data/kto_en_demo.json
--- a/finetune/llama_factory_example/llama_factory_data/sft_zh_demo.json
+++ b/finetune/llama_factory_example/llama_factory_data/sft_zh_demo.json
--- a/finetune/llama_factory_example/minicpm_dpo.yaml
+++ b/finetune/llama_factory_example/minicpm_dpo.yaml
@ -0,0 +1,42 @@
+### model
+model_name_or_path: /root/ld/ld_project/LLaMA-Factory/saves/minicpm/full/sft/
+
+### method
+stage: dpo
+do_train: true
+finetuning_type: full
+
+### ddp
+ddp_timeout: 180000000
+deepspeed: examples/deepspeed/ds_z2_config.json
+
+### dataset
+dataset: dpo_en_demo
+template: cpm
+cutoff_len: 1200
+max_samples: 50000000
+overwrite_cache: true
+preprocessing_num_workers: 16
+
+
+### output
+output_dir: saves/minicpm/dpo
+logging_steps: 10
+save_steps: 500
+plot_loss: true
+overwrite_output_dir: true
+save_strategy: epoch
+### train
+per_device_train_batch_size: 2
+gradient_accumulation_steps: 4
+learning_rate: 0.00001
+num_train_epochs: 2.0
+lr_scheduler_type: cosine
+warmup_steps: 0.1
+bf16: true
+
+### eval
+val_size: 0.1
+per_device_eval_batch_size: 4
+evaluation_strategy: steps
+eval_steps: 500
--- a/finetune/llama_factory_example/minicpm_kto.yaml
+++ b/finetune/llama_factory_example/minicpm_kto.yaml
@ -0,0 +1,42 @@
+### model
+model_name_or_path: /root/ld/ld_model_pretrain/MiniCPM-1B-sft-bf16/
+
+### method
+stage: kto
+do_train: true
+finetuning_type: full
+kto_ftx: 0.1
+
+### ddp
+ddp_timeout: 180000000
+deepspeed: examples/deepspeed/ds_z2_config.json
+
+### dataset
+dataset: kto_harmless
+template: cpm
+cutoff_len: 1200
+max_samples: 500000
+overwrite_cache: true
+preprocessing_num_workers: 16
+
+### output
+output_dir: saves/minicpm/kto
+logging_steps: 10
+save_steps: 500
+plot_loss: true
+overwrite_output_dir: true
+
+### train
+per_device_train_batch_size: 4
+gradient_accumulation_steps: 4
+learning_rate: 0.000005
+num_train_epochs: 1.0
+lr_scheduler_type: cosine
+warmup_steps: 0.1
+bf16: true
+
+### eval
+val_size: 0.1
+per_device_eval_batch_size: 16
+evaluation_strategy: steps
+eval_steps: 500
--- a/finetune/llama_factory_example/minicpm_sft.yaml
+++ b/finetune/llama_factory_example/minicpm_sft.yaml
@ -0,0 +1,41 @@
+### model
+model_name_or_path: /root/ld/ld_model_pretrained/miniCPM-bf16/
+
+### method
+stage: sft
+do_train: true
+finetuning_type: full
+
+### ddp
+ddp_timeout: 180000000
+deepspeed: examples/deepspeed/ds_z2_config.json
+
+### dataset
+dataset: glaive_toolcall_en,glaive_toolcall_zh
+template: cpm
+cutoff_len: 1800
+max_samples: 500000
+overwrite_cache: true
+preprocessing_num_workers: 16
+
+### output
+output_dir: saves/minicpm/fuction_call
+logging_steps: 10
+save_strategy: epoch
+plot_loss: true
+overwrite_output_dir: true
+
+### train
+per_device_train_batch_size: 2
+gradient_accumulation_steps: 4
+learning_rate: 0.0001
+num_train_epochs: 3.0
+lr_scheduler_type: cosine
+warmup_steps: 0.1
+bf16: true
+
+### eval
+val_size: 0.1
+per_device_eval_batch_size: 4
+evaluation_strategy: steps
+eval_steps: 500
--- a/finetune/llama_factory_example/single_node.sh
+++ b/finetune/llama_factory_example/single_node.sh
@ -0,0 +1,16 @@
+#!/bin/bash
+
+NPROC_PER_NODE=8
+NNODES=1
+RANK=0
+MASTER_ADDR=127.0.0.1
+MASTER_PORT=29500
+export NCCL_P2P_DISABLE=1
+export NCCL_IB_DISABLE=1 
+CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 torchrun \
+    --nproc_per_node $NPROC_PER_NODE \
+    --nnodes $NNODES \
+    --node_rank $RANK \
+    --master_addr $MASTER_ADDR \
+    --master_port $MASTER_PORT \
+    src/train.py /root/ld/ld_project/LLaMA-Factory/examples/minicpm/minicpm_sft.yaml