Junki
Published on 2025-01-17

Fine-tuning qwen2.5-0.5B with MLX on macOS

1. Environment Setup

Install Python

The detailed steps are omitted here; official site: https://www.python.org/

Install the huggingface_hub dependency

pip install -U huggingface_hub

Set the Hugging Face mirror environment variable so model downloads go through hf-mirror.com:

export HF_ENDPOINT=https://hf-mirror.com
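This only affects the current shell session. If you want the mirror to stick across sessions, you can append the same export to your shell profile (an optional convenience step; ~/.zshrc assumes the default macOS zsh shell):

echo 'export HF_ENDPOINT=https://hf-mirror.com' >> ~/.zshrc
source ~/.zshrc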

2. Create a Project

Clone the MLX examples repository

git clone https://github.com/ml-explore/mlx-examples.git

Download the model

huggingface-cli download --resume-download Qwen/Qwen2.5-0.5B-Instruct --local-dir qwen2.5-0.5B

The resulting project directory structure looks like this:

.
├── lora
│   └── data
├── mlx-examples
│   ├── ...
└── qwen2.5-0.5B
    ├── ...

The ./lora/data directory is where the training data goes; create it yourself.
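For example, from the project root:

mkdir -p lora/data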

mlx-examples is the cloned MLX examples project.

qwen2.5-0.5B is the directory containing the downloaded model files.

Install the MLX dependencies

pip install mlx-lm
pip install transformers
pip install torch
pip install numpy
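To confirm everything is in place, a quick sanity check (version numbers will differ on your machine):

pip show mlx-lm transformers torch numpy
mlx_lm.lora --help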

3. Prepare the Dataset

Under the ./lora/data directory, add a training data file train.jsonl and a validation data file valid.jsonl.

The contents are as follows (just a few samples for testing; adjust to your actual needs):

{"prompt":"你好","completion":"我不好"}
{"prompt":"你好啊","completion":"我不好啊"}
{"prompt":"你好吗","completion":"我不好呀"}
{"prompt":"你好帅","completion":"没错"}

4. Fine-tune the Model

From the project root, run:

mlx_lm.lora --model ./qwen2.5-0.5B --train --data ./lora/data

The log output looks like this:

Loading pretrained model
Loading datasets
Training
Trainable parameters: 0.109% (0.541M/494.033M)
Starting training..., iters: 1000
Iter 1: Val loss 7.076, Val took 0.305s
Iter 10: Train loss 3.930, Learning Rate 1.000e-05, It/sec 1.927, Tokens/sec 705.158, Trained Tokens 3660, Peak mem 3.566 GB
Iter 20: Train loss 2.575, Learning Rate 1.000e-05, It/sec 2.046, Tokens/sec 748.911, Trained Tokens 7320, Peak mem 3.566 GB
Iter 30: Train loss 1.734, Learning Rate 1.000e-05, It/sec 2.040, Tokens/sec 746.620, Trained Tokens 10980, Peak mem 3.566 GB
Iter 40: Train loss 1.183, Learning Rate 1.000e-05, It/sec 2.059, Tokens/sec 753.587, Trained Tokens 14640, Peak mem 3.566 GB
Iter 50: Train loss 0.772, Learning Rate 1.000e-05, It/sec 2.046, Tokens/sec 748.964, Trained Tokens 18300, Peak mem 3.566 GB
Iter 60: Train loss 0.464, Learning Rate 1.000e-05, It/sec 2.021, Tokens/sec 739.788, Trained Tokens 21960, Peak mem 3.566 GB
Iter 70: Train loss 0.226, Learning Rate 1.000e-05, It/sec 2.073, Tokens/sec 758.894, Trained Tokens 25620, Peak mem 3.566 GB
Iter 80: Train loss 0.113, Learning Rate 1.000e-05, It/sec 2.024, Tokens/sec 740.715, Trained Tokens 29280, Peak mem 3.566 GB
Iter 90: Train loss 0.070, Learning Rate 1.000e-05, It/sec 2.032, Tokens/sec 743.695, Trained Tokens 32940, Peak mem 3.566 GB
Iter 100: Train loss 0.052, Learning Rate 1.000e-05, It/sec 2.029, Tokens/sec 742.616, Trained Tokens 36600, Peak mem 3.566 GB
Iter 100: Saved adapter weights to adapters/adapters.safetensors and adapters/0000100_adapters.safetensors.

...

You can see the train loss steadily decreasing, which means the model is fitting the training data better and better.
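The run above uses the defaults (1000 iterations, learning rate 1e-5, as shown in the log). If you want to tune the run, flags like the ones below are commonly available; check mlx_lm.lora --help for the exact options in your installed version:

# example: a shorter run with a larger batch size (flag names assume a recent mlx-lm)
mlx_lm.lora --model ./qwen2.5-0.5B --train --data ./lora/data \
  --iters 600 --batch-size 4 --learning-rate 1e-5 --adapter-path ./adapters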

When training finishes, an ./adapters directory is created in the project root, containing the trained adapter weights.
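Before fusing, you can try the adapter directly on top of the base model (a quick spot check; mlx_lm.generate accepts an --adapter-path flag for this):

mlx_lm.generate --model ./qwen2.5-0.5B --adapter-path ./adapters --prompt "你好"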

5. Fuse the Model

mlx_lm.fuse --model ./qwen2.5-0.5B --adapter-path ./adapters --save-path qwen2.5-0.5B-junki

When fusing completes, a ./qwen2.5-0.5B-junki directory is created in the project root, containing the merged model files.
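The fused model is saved in the usual Hugging Face layout, so listing the directory should show config.json, the safetensors weights, and the tokenizer files (exact file names may vary with the mlx-lm version):

ls qwen2.5-0.5B-junki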

6. Verify the Fine-tuning Result

Ask the model a question:

mlx_lm.generate --model qwen2.5-0.5B-junki --prompt "你好"

The model replies:

==========
Prompt: <|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>
<|im_start|>user
你好<|im_end|>
<|im_start|>assistant

我不好
==========
Prompt: 30 tokens, 206.817 tokens-per-sec
Generation: 12 tokens, 112.638 tokens-per-sec
Peak memory: 1.003 GB
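For comparison, you can run the same prompt against the original base model, which has not learned the "我不好" reply, to see the effect of fine-tuning:

mlx_lm.generate --model ./qwen2.5-0.5B --prompt "你好"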

7. Deploy the Fine-tuned Model with ollama

For ollama installation, refer to the official repository: https://github.com/ollama/ollama

Create a model deployment file (Modelfile) in the project directory.

The contents of ./ollama-modelfiles/qwen2.5-0.5B are as follows:

FROM /...absolute path omitted.../qwen2.5-0.5B-junki

TEMPLATE """
Prompt: <|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""

PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"

The TEMPLATE here mirrors the console output from the model test above, with {{ .Prompt }} as the dynamic user input. The PARAMETER stop lines define the stop tokens.

Deploy with ollama

ollama create qwen2.5-0.5B-junki -f ./ollama-modelfiles/qwen2.5-0.5B

The log output looks like this:

gathering model components 
copying file sha256:7e88129d9769a0b14b1587a7d5e829fe93ac0e1511636471fdfc0811951418e6 100% 
copying file sha256:ca10d7e9fb3ed18575dd1e277a2579c16d108e32f27439684afa0e10b1440910 100% 
copying file sha256:52c5b9c556374ab5dcc986214111404ddc890452efb07e2d578e6f53ffeb56b3 100% 
copying file sha256:58b54bbe36fc752f79a24a271ef66a0a0830054b4dfad94bde757d851968060b 100% 
copying file sha256:bc8d587c364e4905e8510be14d07c5e69c84347c17f7c7607d5ee4470e72cb50 100% 
copying file sha256:db341d98a68822279de81d9fbe29cb9bf0077ad032ce7bd10ed0a9f4c24f68fa 100% 
copying file sha256:76862e765266b85aa9459767e33cbaf13970f327a0e88d1c65846c2ddd3a1ecd 100% 
copying file sha256:9c5ae00e602b8860cbd784ba82a8aa14e8feecec692e7076590d014d7b7fdafa 100% 
converting model 
creating new layer sha256:4c2c6dfeb002488d729183960c5e472f627bc1309a00b7dc2afecb9cfc1fe455 
writing manifest 
success 
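You can give the new model a quick interactive check from the command line before calling the HTTP API (ollama run accepts a one-shot prompt as an argument):

ollama list
ollama run qwen2.5-0.5B-junki "你好"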

API test

Request:

curl --location --request POST 'http://127.0.0.1:11434/api/generate' \
--data-raw '{
    "model": "qwen2.5-0.5B-junki",
    "prompt": "你好",
    "stream": false
}'

Response:

{
    "model": "qwen2.5-0.5B-junki",
    "created_at": "2025-01-17T08:57:52.472803Z",
    "response": "我不好",
    "done": true,
    "done_reason": "stop",
    "context": [
        198,
        54615,
        25,
        220,
        151644,
        8948,
        198,
        2610,
        525,
        1207,
        16948,
        11,
        3465,
        553,
        54364,
        14817,
        13,
        1446,
        525,
        264,
        10950,
        17847,
        13,
        151645,
        198,
        151644,
        872,
        198,
        108386,
        151645,
        198,
        151644,
        77091,
        198,
        35946,
        101132
    ],
    "total_duration": 114114625,
    "load_duration": 32461375,
    "prompt_eval_count": 34,
    "prompt_eval_duration": 54000000,
    "eval_count": 3,
    "eval_duration": 26000000
}
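ollama also exposes a chat-style endpoint. A minimal sketch of the same question through /api/chat (whether it renders exactly the same prompt depends on how ollama maps chat messages onto the {{ .Prompt }} template above):

curl --location --request POST 'http://127.0.0.1:11434/api/chat' \
--data-raw '{
    "model": "qwen2.5-0.5B-junki",
    "messages": [{"role": "user", "content": "你好"}],
    "stream": false
}'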
