# LLaMa Deployment
## Overview

## Training

Hardware configuration for LLaMA-33B/65B training: 32 x Ascend NPUs

### Clone the repository

```
git clone https://gitee.com/ascend/AscendSpeed.git
cd AscendSpeed
mkdir logs
mkdir ckpt
```

### Set up the environment

```
# python3.8
conda create -n llama python=3.8
conda activate llama

# Install torch and torch_npu
# ARM
wget https://download.pytorch.org/whl/cpu/torch-2.1.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
wget https://gitee.com/ascend/pytorch/releases/download/v6.0.rc2-pytorch2.1.0/torch_npu-2.1.0.post6-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
pip install torch-2.1.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
pip install torch_npu-2.1.0.post6-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl

# Install megatron-core
pip3 install --no-use-pep517 -e git+https://github.com/NVIDIA/Megatron-LM.git@23.06#egg=megatron-core

# Install deepspeed and deepspeed_npu
pip install deepspeed==0.9.2
git clone https://gitee.com/ascend/DeepSpeed.git -b v0.9.2 deepspeed_npu
cd deepspeed_npu
pip3 install -e ./
cd ..

# Install the remaining dependencies
pip install -r requirements.txt
```

### Download the weights

Install git-lfs:

```
apt install git-lfs
git lfs install
mkdir tokenizer
```

llama-33B weights:

```
cd ./tokenizer
git lfs clone https://huggingface.co/pinkmanlove/llama-33b-hf
```

llama-65B weights:

```
cd ./tokenizer
git lfs clone https://huggingface.co/pinkmanlove/llama-65b-hf
```

### Convert the pretrained weights from Hugging Face format to AscendSpeed format

llama-33B:

```
mkdir model_weights
SCRIPT_PATH=./tools/ckpt_convert/llama/convert_weights_from_huggingface.py
python $SCRIPT_PATH \
    --input-model-dir ./tokenizer \
    --output-model-dir ./model_weights \
    --tensor-model-parallel-size 4 \
    --pipeline-model-parallel-size 4 \
    --merge-mlp \
    --type 30B
```

llama-65B:

```
mkdir model_weights
SCRIPT_PATH=./tools/ckpt_convert/llama/convert_weights_from_huggingface.py
python $SCRIPT_PATH \
    --input-model-dir ./tokenizer \
    --output-model-dir ./model_weights \
    --tensor-model-parallel-size 8 \
    --pipeline-model-parallel-size 4 \
    --type 65B
```

### Download the dataset

```
# Download the alpaca dataset
wget http://github.com/tatsu-lab/stanford_alpaca/blob/main/alpaca_data.json

# Download the tokenizer config and (optionally) the weights:
# http://huggingface.co/pinkmanlove/llama-33b-hf
# http://huggingface.co/pinkmanlove/llama-65b-hf

# Change "LLaMATokenizer" to "LlamaTokenizer" in tokenizer_config.json
# (this works around an HF naming bug); a patch script is sketched below.

mkdir dataset
python tools/preprocess_data.py --input alpaca_data.json \
    --output-prefix dataset/alpaca \
    --tokenizer-type PretrainedFromHF \
    --tokenizer-name-or-path llama-33b-hf \
    --tokenizer-not-use-fast \
    --handler-name GeneralInstructionHandler
# For llama-65B, use --tokenizer-name-or-path llama-65b-hf instead
```
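The tokenizer_config.json edit mentioned above can be done by hand, but a small script avoids typos. This is a minimal sketch, assuming the Hugging Face tokenizer was cloned into ./tokenizer/llama-33b-hf (adjust the path for llama-65b-hf); tokenizer_class is the standard Hugging Face config key:

```
import json
from pathlib import Path

# Assumed download location; adjust to wherever the HF tokenizer was cloned.
config_path = Path("./tokenizer/llama-33b-hf/tokenizer_config.json")

config = json.loads(config_path.read_text())
# Older LLaMA configs ship "LLaMATokenizer", while transformers expects "LlamaTokenizer".
if config.get("tokenizer_class") == "LLaMATokenizer":
    config["tokenizer_class"] = "LlamaTokenizer"
    config_path.write_text(json.dumps(config, indent=2, ensure_ascii=False))
    print("patched", config_path)
else:
    print("nothing to patch:", config.get("tokenizer_class"))
```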
### Configure the llama-33B/65B pretraining scripts

AscendSpeed/examples/llama/pretrain_llama_33B_ptd_32p.sh

AscendSpeed/examples/llama/pretrain_llama_65B_ptd_32p.sh

```
# Modify the ascend-toolkit path
export LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib:/root/miniconda3/lib:$LD_LIBRARY_PATH
export HCCL_CONNECT_TIMEOUT=1200
export COMBINED_ENABLE=1

# Configure the tokenizer and data paths, etc.
TOKENIZER_PATH=./dataset/llama_tokenizer  # line 16
DATA_PATH=./dataset/llama_text_document  # line 17
```

### Launch the pretraining scripts

Launch the llama-33B pretraining script: AscendSpeed/examples/llama/pretrain_llama_33B_ptd_32p.sh

```
bash examples/llama/pretrain_llama_33B_ptd_32p.sh
```

Launch the llama-65B pretraining script: AscendSpeed/examples/llama/pretrain_llama_65B_ptd_32p.sh

```
bash examples/llama/pretrain_llama_65B_ptd_32p.sh
```

For multiple nodes, configure the llama-33B/65B pretraining scripts as follows and launch the script on every node in the cluster. On a real cluster, set MASTER_ADDR to the master node's IP and give each node a distinct NODE_RANK from 0 to NNODES-1:

```
MASTER_ADDR=localhost
MASTER_PORT=6001
NNODES=4
NODE_RANK=0
```

The training log looks like this:

```
iteration 11/50000 | consumed samples: 5632 | consumed tokens: 11534336 | elapsed time per iteration (ms): 52728.1 | learning rate: 1.499E-05 | global batch size: 512 | lm loss: 1.376514E+01 | loss scale: 65536.0 | grad norm: 459.628 | actual seqlen: 2048 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 9.710 | TFLOPs: 167.52 | time (ms)
```

## Inference

Text-generation inference is supported for LLaMA-33B and LLaMA-65B. Inference differs from pretraining; for example, we need to load the pretrained weights and set the length of the output samples:

Configure the LLaMA-33B inference script examples/llama/generate_llama_33B_ptd.sh.

Configure the LLaMA-65B inference script examples/llama/generate_llama_65B_tp8_pp1.sh.

```
# Modify the model weight path and the tokenizer path
CHECKPOINT=<checkpoint-path>
VOCAB_FILE=<vocabfile-path>
```

A quick sanity check for the checkpoint directory is sketched at the end of this section.

LLaMA-33B:

```
bash ./examples/llama/generate_llama_33B_ptd.sh
```

LLaMA-65B:

```
bash ./examples/llama/generate_llama_65B_tp8_pp1.sh
```
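Before launching the inference scripts, it can help to confirm that CHECKPOINT points at a complete set of model-parallel shards. This is a minimal sketch, assuming a Megatron-style layout with a latest_checkpointed_iteration.txt tracker file and mp_rank_* subdirectories; the exact layout can differ between AscendSpeed versions, so treat it as a starting point:

```
import sys
from pathlib import Path

# Point this at the directory used for CHECKPOINT (the default path is an assumption).
ckpt_root = Path(sys.argv[1] if len(sys.argv) > 1 else "./ckpt")

# Megatron-style checkpoints usually record the latest iteration in this tracker file.
tracker = ckpt_root / "latest_checkpointed_iteration.txt"
if tracker.exists():
    iteration = tracker.read_text().strip()
    step_dir = ckpt_root / (f"iter_{int(iteration):07d}" if iteration.isdigit() else iteration)
else:
    step_dir = ckpt_root

# Each tensor/pipeline-parallel rank should have saved its own shard directory.
shards = sorted(step_dir.glob("mp_rank_*"))
print(f"checkpoint dir: {step_dir}")
print(f"model-parallel shards found: {len(shards)}")
for shard in shards:
    print(" ", shard.name)
```

Save it as, say, check_ckpt.py (hypothetical name) and run `python check_ckpt.py ./ckpt` before editing CHECKPOINT in the generation scripts.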
zhangyuheng
November 14, 2024, 16:29