# llama_mindspore_npu_aicc

**Repository Path**: aicc_repo/llama_mindspore_npu_aicc

## Basic Information

- **Project Name**: llama_mindspore_npu_aicc
- **Description**: Training, fine-tuning, and inference of the MindFormers version of LLaMA on AICC
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2023-06-26
- **Last Updated**: 2023-09-04

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# LLaMA MindFormers Training and Inference Guide


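Constraints: this document only covers inference and training of MindSpore LLaMA in the AICC environment. It targets AICC driver version C85; other versions have not been verified. The weights actually used for inference are open-llama.

Currently supported: single-card inference, pre-training, full fine-tuning, and LoRA fine-tuning. So far only the 7B version has been verified.

**This document currently applies only to the Ningbo AICC.**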


## Preparation

### Prepare the image

On the Ningbo AICC, register the following image directly: `swr.cn-east-321.nbaicc.com/nbaicc_pub/mindformers_dev_mindspore_2_0_0:20230707_py39`

#### Component list

| Component     | Version         | Included in image                   |
| ------------- | --------------- | ----------------------------------- |
| Fastchat      | 0.1.2           | √                                   |
| mindformers   | 1.0.0.dev202307 | √                                   |
| torch         | 1.13.1          | √                                   |
| tensorflow    | 1.15.0          | √                                   |
| transformers  | 4.28.1          | √                                   |
| MindSpore     | 2.0.0           | √                                   |
| Python        | 3.7.10          | √                                   |
| CANN          | 6.3 RC1         | √                                   |
| Driver        | 23.0.rc1        | Not included; internal code name C85 |

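**Note**: The notebook must reserve 500 GB of disk space and use 8 NPU cards. When using the image, do not reinstall packages over the preinstalled ones.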
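The disk requirement can be checked before downloading weights and datasets. The following is a minimal sketch using only the Python standard library. The helper names `free_space_gb` and `has_enough_space` are hypothetical, and the sketch assumes the 500 GB figure refers to free space on the volume that holds the working directory.

```python
import shutil


def free_space_gb(path: str) -> float:
    """Return the free space, in GiB, on the volume that contains `path`."""
    return shutil.disk_usage(path).free / 1024 ** 3


def has_enough_space(path: str, required_gb: float = 500.0) -> bool:
    """Return True when the volume containing `path` has at least `required_gb` GiB free."""
    # Assumption: the 500 GB requirement is interpreted as free space, not total capacity.
    return free_space_gb(path) >= required_gb
```

In the notebook this could be called as `has_enough_space("/home/ma-user/work")` before starting any download.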

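Different stages (inference, pre-training or retraining, and fine-tuning) use different datasets. This document prefers to prepare the pre-training and fine-tuning datasets ahead of time.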


### Prepare the code

All code in this document comes from the mindformers repository on Gitee.

Run the following git command to clone the mindformers repository into the notebook:

```shell
cd /home/ma-user/work
git clone -b dev https://gitee.com/mindspore/mindformers.git
```


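### Prepare the weights

The weights are used for inference and fine-tuning; pre-training does not need them.

This guide uses the open-llama weights from Hugging Face. To simplify the process, converted weights are provided for download.

Download the open-llama 7B weights: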

```shell
mkdir /home/ma-user/work/open-llama-ms
cd /home/ma-user/work/open-llama-ms
wget https://nbfae.obs.cn-east-321.nbaicc.com/weights/open-llama-ms/ms-llama-7b.ckpt
```

![image-20230626193440680](media/image-20230626193440680.png)

 
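### Prepare the tokenizer

Later steps expect the tokenizer at `/home/ma-user/work/open-llama-ms/tokenizer.model`. The command below, as given in the source, repeats the weight download URL rather than pointing at a tokenizer file, so replace the URL with the location of the open-llama `tokenizer.model`.

```shell
cd /home/ma-user/work/open-llama-ms
wget https://nbfae.obs.cn-east-321.nbaicc.com/weights/open-llama-ms/ms-llama-7b.ckpt
```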
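Before moving on, it can help to confirm that the files referenced by later commands are in place. The following is a minimal sketch: the function name `missing_files` and the `REQUIRED_FILES` mapping are hypothetical, and the relative paths are taken from the commands used elsewhere in this guide.

```python
from pathlib import Path

# Paths relative to the working directory, as used by the commands in this guide.
REQUIRED_FILES = {
    "weights": "open-llama-ms/ms-llama-7b.ckpt",
    "tokenizer": "open-llama-ms/tokenizer.model",
    "launcher": "mindformers/run_mindformer.py",
}


def missing_files(work_dir, required=REQUIRED_FILES):
    """Return the names of required files that do not exist under `work_dir`."""
    root = Path(work_dir)
    return [name for name, rel in required.items() if not (root / rel).is_file()]
```

Calling `missing_files("/home/ma-user/work")` should return an empty list once the code, weights, and tokenizer have all been prepared.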


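## Inference

Once the weights and code are ready, run inference first to check the model output. There are two ways to do this:

1. Launch inference with the `run_mindformer.py` script bundled with mindformers.
2. Launch inference from a script that calls the Trainer interface.

### Command-line inference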

```shell
cd /home/ma-user/work/mindformers
python run_mindformer.py --config /home/ma-user/work/mindformers/configs/llama/run_llama_7b_lora.yaml \
                         --run_mode predict \
                         --device_target Ascend \
                         --use_parallel False \
                         --load_checkpoint /home/ma-user/work/open-llama-ms/ms-llama-7b.ckpt \
                         --predict_data "hello"
```

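### Script inference

Create a `predict.py` inference script with the following code:

```python
from mindformers.trainer import Trainer
# Build a text-generation trainer, then generate from the converted open-llama checkpoint.
trainer = Trainer(task='text_generation', model='llama_7b_lora')
res = trainer.predict(predict_checkpoint="/home/ma-user/work/open-llama-ms/ms-llama-7b.ckpt",
                      input_data="nice to meet you")
print(res)
```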
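To try several prompts in one run, the prediction call can be wrapped in a small loop. The sketch below is deliberately independent of mindformers: `run_prompts` is a hypothetical helper that accepts any callable mapping a prompt to an output, so the same `trainer` object can be reused instead of being rebuilt for each prompt.

```python
from typing import Callable, Dict, Iterable


def run_prompts(predict: Callable[[str], object], prompts: Iterable[str]) -> Dict[str, object]:
    """Call `predict` once per prompt and collect the outputs keyed by prompt."""
    results: Dict[str, object] = {}
    for prompt in prompts:
        results[prompt] = predict(prompt)
    return results


# Hypothetical usage with the trainer from predict.py (not executed here):
# ckpt = "/home/ma-user/work/open-llama-ms/ms-llama-7b.ckpt"
# outputs = run_prompts(
#     lambda p: trainer.predict(predict_checkpoint=ckpt, input_data=p),
#     ["hello", "nice to meet you"],
# )
```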


## Pre-training (retraining)

Pre-training first requires the wikitext2 dataset.

To prepare wikitext, open a terminal window in the notebook, then download and unzip the dataset:

```shell
cd /home/ma-user/work/
wget https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-v1.zip
unzip wikitext-2-v1.zip
```

Convert wikitext2 to MindRecord format:

```shell
# Preprocess the data with tools/dataset_preprocess/llama/llama_preprocess.py
cd /home/ma-user/work/mindformers/mindformers/tools/dataset_preprocess/llama
# Preprocess the training set and generate the MindRecord file used for training
python llama_preprocess.py --input_glob /home/ma-user/work/wikitext-2/wiki.train.tokens --model_file /home/ma-user/work/open-llama-ms/tokenizer.model --seq_length 2048 --output_file /home/ma-user/work/wikitext-2/wiki2048.train.mindrecord
```
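The `--seq_length 2048` option implies that the tokenized text is turned into fixed-length samples. The sketch below illustrates the general idea of packing token ids into fixed-length chunks. It is a conceptual illustration only and is not taken from `llama_preprocess.py`, which may differ in details such as the exact sample length or padding; `pack_tokens` is a hypothetical name.

```python
from typing import Iterable, Iterator, List


def pack_tokens(token_streams: Iterable[List[int]], seq_length: int) -> Iterator[List[int]]:
    """Concatenate token id lists and yield consecutive chunks of exactly `seq_length` ids."""
    buffer: List[int] = []
    for tokens in token_streams:
        buffer.extend(tokens)
        # Emit as many full chunks as the buffer currently holds.
        while len(buffer) >= seq_length:
            yield buffer[:seq_length]
            buffer = buffer[seq_length:]
    # Assumption for this sketch: a trailing remainder shorter than seq_length is dropped.
```

For example, packing `[[1, 2, 3], [4, 5], [6, 7, 8, 9]]` with `seq_length=4` yields `[1, 2, 3, 4]` and `[5, 6, 7, 8]`, and the leftover `9` is discarded.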


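Edit the run configuration file `/home/ma-user/work/mindformers/configs/llama/run_llama_7b.yaml` and set the dataset path to the location of the converted MindRecord file:

![image-20230707130217707](media/image-20230707130217707.png)

Change to the scripts directory and launch the training script: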

```shell
cd /home/ma-user/work/mindformers/scripts
bash run_distribute.sh /user/config/nbstart_hccl.json /home/ma-user/work/mindformers/configs/llama/run_llama_7b.yaml [0,8] train
```
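The `[0,8]` argument selects the devices to use. The sketch below assumes it is a half-open range (start included, end excluded), which is consistent with the 8 cards and the `rank_7` log directory mentioned in this guide; `parse_device_range` is a hypothetical helper, not part of mindformers.

```python
import re
from typing import List


def parse_device_range(spec: str) -> List[int]:
    """Expand a device range such as "[0,8]" into device ids, assuming a half-open range."""
    match = re.fullmatch(r"\[(\d+),(\d+)\]", spec.strip())
    if match is None:
        raise ValueError(f"unrecognized device range: {spec!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if end <= start:
        raise ValueError(f"empty device range: {spec!r}")
    return list(range(start, end))
```

Under this assumption, `parse_device_range("[0,8]")` gives the ids 0 through 7, so the last card is device 7.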

Training logs are stored under /home/ma-user/work/mindformers/output/log/, with a separate log for each of the 8 cards.

![image-20230707130436917](media/image-20230707130436917.png)

View the log generated by the last card:

```shell
tail -f /home/ma-user/work/mindformers/output/log/rank_7/mindformer.log
```
```
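When a run fails on a card other than the last one, it can be useful to look at the end of every rank's log at once. The following is a minimal sketch using only the standard library; `tail_rank_logs` is a hypothetical helper, and the `rank_<n>/mindformer.log` layout is taken from the paths shown in this guide.

```python
from collections import deque
from pathlib import Path
from typing import Dict, List


def tail_rank_logs(log_root, num_ranks: int = 8, lines: int = 5,
                   filename: str = "mindformer.log") -> Dict[int, List[str]]:
    """Return the last `lines` lines of each existing rank log under `log_root`."""
    tails: Dict[int, List[str]] = {}
    for rank in range(num_ranks):
        log_file = Path(log_root) / f"rank_{rank}" / filename
        if not log_file.is_file():
            continue  # skip ranks that have not written a log yet
        with log_file.open(encoding="utf-8", errors="replace") as handle:
            # A bounded deque keeps only the final `lines` lines while streaming the file.
            tails[rank] = [line.rstrip("\n") for line in deque(handle, maxlen=lines)]
    return tails
```

For this guide's layout the call would be `tail_rank_logs("/home/ma-user/work/mindformers/output/log")`.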


## Full-parameter fine-tuning

### Dataset

Download the alpaca dataset used for fine-tuning:

```shell
mkdir /home/ma-user/work/alpaca
cd /home/ma-user/work/alpaca
wget https://nbfae.obs.cn-east-321.nbaicc.com/dataset/alpaca/alpaca-data-conversation.json
```

Run the LLaMA preprocessing script to convert the JSON dataset into MindRecord format. MindRecord is a binary format, which makes dataset loading more efficient during training.

```shell
python /home/ma-user/work/mindformers/mindformers/tools/dataset_preprocess/llama/llama_preprocess.py --input_glob /home/ma-user/work/alpaca/alpaca-data-conversation.json  --dataset_type qa --model_file /home/ma-user/work/open-llama-ms/tokenizer.model  --seq_length 2048 --output_file /home/ma-user/work/alpaca/alpaca-fastchat2048.mindrecord  
```

![image-20230707133302147](media/image-20230707133302147.png)
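Before converting, a quick look at the downloaded JSON file can catch a truncated or malformed download. The sketch below assumes the file is a single top-level JSON array of records; `summarize_json_dataset` is a hypothetical helper and makes no claim about the exact record schema.

```python
import json
from pathlib import Path


def summarize_json_dataset(path):
    """Return (record count, sorted keys of the first record) for a JSON array file."""
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON list of records")
    # Report the keys of the first record only, as a cheap sanity check.
    first_keys = sorted(records[0].keys()) if records and isinstance(records[0], dict) else []
    return len(records), first_keys
```

For example, `summarize_json_dataset("/home/ma-user/work/alpaca/alpaca-data-conversation.json")` reports how many records were downloaded and which fields the first one carries.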

### Modify the task configuration file

Task configuration file path: /home/ma-user/work/mindformers/configs/llama/run_llama_7b.yaml

- Add the path of the pre-trained weights to load

  ```yaml
  load_checkpoint: '/home/ma-user/work/open-llama-ms/ms-llama-7b.ckpt'
  ```

  ![image-20230707133557040](media/image-20230707133557040.png)

- Modify the dataset path

  ```yaml
  train_dataset: &train_dataset
    data_loader:
      type: MindDataset
      dataset_dir: "/home/ma-user/work/alpaca/alpaca-fastchat2048.mindrecord"
      shuffle: True
  ```

  ![image-20230707133912669](media/image-20230707133912669.png)

- Modify the learning-rate hyperparameters

  ```yaml
  lr_schedule:
    type: CosineWithWarmUpLR
    learning_rate: 3.e-4
    lr_end: 3.e-5
    warmup_ratio: 0.03
    total_steps: -1 # -1 means it will load the total steps of the dataset
  ```

  ![image-20230707133745184](media/image-20230707133745184.png)
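To get a feel for what these schedule values produce, the sketch below computes a generic cosine schedule with linear warmup from `learning_rate`, `lr_end`, and `warmup_ratio`. It is a common textbook formulation, and the `CosineWithWarmUpLR` implementation in mindformers may differ in details. The function name and the explicit `total_steps` argument are assumptions made for illustration.

```python
import math


def cosine_with_warmup(step: int, total_steps: int, learning_rate: float = 3e-4,
                       lr_end: float = 3e-5, warmup_ratio: float = 0.03) -> float:
    """Generic cosine decay with linear warmup; not the mindformers implementation."""
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to the peak learning rate.
        return learning_rate * step / warmup_steps
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    # The cosine factor goes from 1 (end of warmup) down to 0 (end of training).
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return lr_end + (learning_rate - lr_end) * cosine
```

With 1000 total steps, this sketch warms up over the first 30 steps, reaches the peak of 3e-4 at step 30, and decays to 3e-5 by step 1000.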


### Launch the training script

```shell
cd /home/ma-user/work/mindformers/scripts
bash run_distribute.sh /user/config/nbstart_hccl.json /home/ma-user/work/mindformers/configs/llama/run_llama_7b.yaml [0,8] finetune
```

View the log generated by the last card:

```shell
tail -f /home/ma-user/work/mindformers/output/log/rank_7/mindformer.log
```


## LoRA fine-tuning

### Modify the task configuration file

Task configuration file path: /home/ma-user/work/mindformers/configs/llama/run_llama_7b_lora.yaml


- Add the path of the pre-trained weights to load

  ```yaml
  load_checkpoint: '/home/ma-user/work/open-llama-ms/ms-llama-7b.ckpt'
  ```

  ![image-20230707133557040](media/image-20230707133557040.png)

- Modify the dataset path

  ```yaml
  train_dataset: &train_dataset
    data_loader:
      type: MindDataset
      dataset_dir: "/home/ma-user/work/alpaca/alpaca-fastchat2048.mindrecord"
      shuffle: True
  ```

  ![image-20230707133912669](media/image-20230707133912669.png)

- Modify the learning-rate hyperparameters

  ```yaml
  # optimizer
  optimizer:
    type: FP32StateAdamWeightDecay
    beta1: 0.9
    beta2: 0.999
    eps: 1.e-8
    learning_rate: 1.e-5
  
  # lr schedule
  lr_schedule:
    type: CosineWithWarmUpLR
    learning_rate: 1.e-5
    warmup_ratio: 0.03
    total_steps: -1 # -1 means it will load the total steps of the dataset
  ```
  
  ![image-20230707135831014](media/image-20230707135831014.png)


### Launch the training script

```shell
cd /home/ma-user/work/mindformers/scripts
bash run_distribute.sh /user/config/nbstart_hccl.json /home/ma-user/work/mindformers/configs/llama/run_llama_7b_lora.yaml [0,8] finetune
```

View the log generated by the last card:

```shell
tail -f /home/ma-user/work/mindformers/output/log/rank_7/mindformer.log
```


Weight merging


Evaluation


## Log and output paths

Log output path (check the log of the last card):

/home/ma-user/work/mindformers/output/log/rank_7/mindformer.log

Checkpoint output path:

/home/ma-user/work/mindformers/output/checkpoint

Parallel strategy file output path:

/home/ma-user/work/mindformers/scripts/mf_parallel*/ckpt_strategy.ckpt
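After a run, the most recent checkpoint is usually the one to load for inference or further fine-tuning. The following is a minimal sketch that searches the checkpoint output directory recursively and picks the newest `.ckpt` file by modification time. `latest_checkpoint` is a hypothetical helper, and the recursive search is an assumption that tolerates per-rank subdirectories.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(checkpoint_root) -> Optional[Path]:
    """Return the most recently modified *.ckpt file under `checkpoint_root`, or None."""
    candidates = list(Path(checkpoint_root).rglob("*.ckpt"))
    if not candidates:
        return None
    # Modification time is used as a proxy for "latest saved".
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

For this guide the call would be `latest_checkpoint("/home/ma-user/work/mindformers/output/checkpoint")`.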