Yuwei Guo
2023-12-16 01:23:33 +08:00
parent 212ebb6a80
commit 57e7d14ede


@@ -66,11 +66,12 @@ Manually download the AnimateDiff modules. The download links can be found in ea
In this version, we finetune the image model through **Domain Adapter LoRA** for more flexibility at inference time.
Additionally, we implement two (RGB image/scribble) [SparseCtrl](https://arxiv.org/abs/2311.16933) Encoders, which can take an arbitrary number of condition maps to control the generation process.
- **Explanation:** The Domain Adapter is a LoRA module trained on static frames of the training video dataset. It is trained before the motion module and helps the motion module focus on motion modeling, as shown in the figure below. At inference, by adjusting the LoRA scale of the Domain Adapter, some visual attributes of the training video, e.g., the watermarks, can be removed. To utilize the SparseCtrl encoder, it's necessary to use a full Domain Adapter in the pipeline.
<div style="text-align:center"><img src="__assets__/figs/adapter_explain.png" style="width:60%"></div>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<img src="__assets__/figs/adapter_explain.png" style="width:60%">
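
To illustrate what adjusting the LoRA scale means, here is a minimal sketch assuming the standard LoRA formulation, where the adapted weight is `W + scale * (up @ down)`. The function `merge_lora` and the toy matrices are hypothetical and are not taken from this repository's code; they only show that a scale of `0.0` recovers the base weights while `1.0` applies the adapter in full.

```python
# Hypothetical illustration of LoRA scaling; not this repository's implementation.
def merge_lora(weight, lora_down, lora_up, scale):
    """Return weight + scale * (lora_up @ lora_down), using plain nested lists."""
    rows, cols, rank = len(weight), len(weight[0]), len(lora_down)
    merged = []
    for i in range(rows):
        row = []
        for j in range(cols):
            # Low-rank update for entry (i, j): sum over the rank dimension.
            delta = sum(lora_up[i][r] * lora_down[r][j] for r in range(rank))
            row.append(weight[i][j] + scale * delta)
        merged.append(row)
    return merged


base = [[1.0, 0.0], [0.0, 1.0]]  # toy base weight (out_features x in_features)
down = [[1.0, 2.0]]              # toy LoRA down matrix (rank x in_features)
up = [[0.5], [-0.5]]             # toy LoRA up matrix (out_features x rank)

full = merge_lora(base, down, up, scale=1.0)  # adapter fully applied
off = merge_lora(base, down, up, scale=0.0)   # adapter disabled, base weights only
```

Intermediate scales interpolate between these two cases, which is the sense in which lowering the Domain Adapter scale can weaken attributes learned from the training videos.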
Additionally, we implement two (RGB image/scribble) [SparseCtrl](https://arxiv.org/abs/2311.16933) Encoders, which can take an arbitrary number of condition maps to control the generation process.
Technical details of SparseCtrl can be found in this research paper: