{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Modelscope微调Stable Diffusion教程\n",
"## 原理讲解\n",
"\n",
"从头训练stable diffusion需要数十万美元和一个月以上的时间巨额的算力和时间成本让普通人难以承受。因此最理想的手段是利用开源的stable diffusion模型通过微调手段定制化属于自己的模型。近年涌现出很多有效的微调stable diffusion手段如[Textual Inversion](https://arxiv.org/abs/2208.01618)、[Dreambooth](https://arxiv.org/pdf/2208.12242.pdf)、[Lora](https://arxiv.org/abs/2106.09685)、[Custom Diffusion](https://arxiv.org/pdf/2302.05543.pdf)等Modelscope目前已经支持了Dreambooth和Lora两种方法。\n",
"\n",
"### Dreambooth\n",
"如果我们直接使用几张图片微调Stable Diffusion模型很容易陷入“过拟合”的状态通常的表现为模型生成的结果同质化且损失了泛化能力。除此之外还容易遇到语言漂移的问题严重影响了模型性能。Dreambooth提出了重建损失和特定类别先验保留损失相结合的方法来解决这一问题。\n",
"\n",
"### Lora\n",
"Lora的全称是Low-Rank Adaptation是一种低阶自适应技术。这项技术起源于微调大型语言模型在stable diffusion上也能取得非常好的效果。因为大模型是一般是过参数化的它们有更小的内在维度Lora模型主要依赖于这个低的内在维度去做任务适配。通过低秩分解(先降维再升维)来模拟参数的改变量,从而以极小的参数量来实现大模型的间接训练。\n",
"\n",
"如下图所示Lora在原先的模型层中并行插入了可训练的排序分解矩阵层这个矩阵层是由一个降维矩阵A和一个升维矩阵B组成的。降维矩阵A采用高斯分布初始化升维矩阵B初始化为全0保证训练开始时旁路为0矩阵。在训练的时候原模型固定只训练降维矩阵A和升维矩阵B在推理的时候将矩阵层加到原参数上。大量实验表明对于stable diffusion我们用Lora微调Unet网络注意力层可以取得良好的效果。\n",
"\n",
"## 动手实践\n",
"\n",
"首先我们需要下载代码和安装环境。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"git clone https://github.com/modelscope/modelscope.git\n",
"cd modelscope"
]
},
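{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"The environment setup itself is not spelled out above. A typical way to install the dependencies of the freshly cloned repository is shown below (this is an assumption for illustration; follow the official ModelScope installation guide if it differs):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"# assumed setup step: install the repository's Python requirements\n",
"pip install -r requirements.txt"
]
},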
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"然后我们执行脚本开始dreambooth和lora的训练和推理。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"bash examples/pytorch/stable_diffusion/dreambooth/run_train_dreambooth.sh"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"bash examples/pytorch/stable_diffusion/lora/run_train_lora.sh"
]
}
],
"metadata": {
"language_info": {
"name": "python"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}