708 Commits

Author SHA1 Message Date
Wang Qiang
78f5e6a8bf Switching model from huggingface to modelscope hub of efficient tuning (#479) 2023-08-18 20:30:04 +08:00
Wang Qiang
f321804ab2 Merge pull request #472 from kangzhao2/baishao_test
Add image2video
2023-08-18 20:29:36 +08:00
kangzhao2
2643d985dc update test_image2video.py 2023-08-18 20:13:20 +08:00
kangzhao2
2605935797 fix pre-commit 2023-08-18 11:47:45 +08:00
kangzhao2
4ca76b0a85 fix comments again 2023-08-17 20:34:43 +08:00
kangzhao2
b8c76a426f fix comments 2023-08-17 20:03:05 +08:00
Wang Qiang
4ed1111d70 Fix bugs of configs file path and duration (#476)
* fix bugs of configs file path and duration

* pre commit

* delete configs

* test videocomposer model version
2023-08-16 21:03:11 +08:00
kangzhao2
90f7a5c6c0 update files 2023-08-16 11:35:15 +08:00
kangzhao2
037e73fe6e baishao 2023-08-15 21:32:30 +08:00
Wang Qiang
ee8afd2d62 VideoComposer: Compositional Video Synthesis with Motion Controllability (#431)
* VideoComposer: Compositional Video Synthesis with Motion Controllability

* videocomposer pipeline

* pre commit

* delete xformers
2023-08-15 12:01:03 +08:00
wenmeng zhou
74d8317bb0 fix pipeline check error (#455)
* fix pipeline check error

* update
2023-08-11 15:52:53 +08:00
Ran Zhou
026a9ef227 Add machine reading comprehension model, preprocessor and pipeline (#451)
* Add machine reading comprehension model, preprocessor and pipeline

* fix precommit errors

* Optimize mrc preprocessor, add mrc input output definition, add mrc pipeline docstr

---------

Co-authored-by: seadamo <ran.zhou@alibaba-inc.com>
2023-08-11 13:47:26 +08:00
wenmeng.zwm
725521a2af skip test_text_to_360panorama_image test 2023-08-08 16:41:19 +08:00
lukeming.lkm
b3a61ef6f4 update LICENSE
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13516119
2023-08-03 11:07:34 +08:00
lukeming.lkm
bd2f70a6eb add quantization in qwen pipelines and relevant unittests
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13499600

* add quant features

* resolve import

* resolve format

* fix save vocab
2023-08-02 14:05:13 +08:00
lukeming.lkm
33bd74a7be add qwen 7b base and chat
Add QWen 7b base and chat models and the related pipelines
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13482235

* add qwen 7b base and chat

* fix logger

* update examples, lint test

* add unittest for qwen base and chat

* rename qwen to qwen-7b

* resolve imports and add a registry to text-generation

* reset load model from pretrained

* fix precheck

* skip qwen test case now

* remove strange file
2023-08-02 09:25:21 +08:00
suluyan.sly
05e1357c32 Merge branch 'master-github' into master-merge-github-230728 2023-07-28 16:40:34 +08:00
wenmeng.zwm
3b485d5835 fix plugin python module missing files
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13453749
* fix plugin python module missing files
2023-07-28 16:14:56 +08:00
frozoul
2566d028cd cv/cv nerf 3d reconstruction 4k nerf damo (#389)
* add 4k-nerf core files

* update configure file

* update dataloader and model path

* update unittest

* Delete test_4k.py

* update unittest

* update unittest

* update pre-commit

* update dataloader

* update cuda code path

* check with pre-commit

---------

Co-authored-by: zhongshu.wzs <zhongshu.wzs@alibaba-inc.com>
Co-authored-by: wenmeng zhou <wenmeng.zwm@alibaba-inc.com>
2023-07-28 10:37:13 +08:00
tongmu.wh
475924a421 correct language recognition task name
The ModelScope platform team settled on speech-language-recognition as the name of the language recognition task; the code is updated accordingly.
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13444517
* correct language recognition task name
2023-07-27 21:03:04 +08:00
suluyan.sly
1c6f5fe775 Merge branch 'master-github' into master-merge-github-230727
Conflicts:
       examples/pytorch/baichuan/finetune_baichuan.py
       examples/pytorch/chatglm6b/finetune.py
2023-07-27 17:29:27 +08:00
mengyang.fmy
18f998a85c add text-to-360pano-image pipeline, mod cv requirements
In-house 360 panorama image generation model, scheduled for release in July.

Model weights: https://www.modelscope.cn/models/damo/cv_diffusion_text-to-360panorama-image_generation/summary


#### Dependencies

##### Because xformers is required, torch 1.13.1 is the recommended version
```
pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu116
```
##### The matching diffusers and xformers versions are as follows
```
pip install -U diffusers==0.18.0
pip install xformers==0.0.16
pip install triton accelerate transformers
```

##### The ModelScope Library must be installed with the cv extra
```
pip install modelscope
pip install "modelscope[cv]" -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html
```

##### In addition, the third-party library Real-ESRGAN must be installed, as follows
```
# Install basicsr - https://github.com/xinntao/BasicSR
# We use BasicSR for both training and inference
pip install basicsr
# facexlib and gfpgan are for face enhancement
pip install facexlib
pip install gfpgan
pip install Pillow
pip install tqdm
pip install realesrgan==0.3.0
```
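A quick way to confirm that an environment matches the pins above is to compare installed versions against them. The following is only a minimal sketch: the `PINNED` table is copied from the install commands in this message, while the helper names `installed_version` and `find_mismatches` are hypothetical and not part of ModelScope.

```python
from importlib import metadata

# Pins copied from the install commands above.
PINNED = {
    "torch": "1.13.1",
    "diffusers": "0.18.0",
    "xformers": "0.0.16",
    "realesrgan": "0.3.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package that deviates from its pin to (expected, found)."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        # Drop a local build tag such as "+cu116" before comparing.
        base = found.split("+", 1)[0] if found else None
        if base != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

The `lookup` parameter exists only so the comparison can be exercised without the packages installed; in normal use the default reads the real environment.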
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13346430
* add text-to-360pano-image pipeline

* add text-to-360pano-image pipeline, mod cv requirements

* rm redundant files and cv requirements; add standard input and output definitions

* fix diffusers==0.18.0 and run test

* fix diffusers==0.18.0 in multi-modal and run test again

* add model_revision='v1.0.0'

* fix yapf

* add trycatch for enabling xformers

* fix key error

* add install xformers in test/setup

* skip highres.fix in ci

* feat: Fix conflict, auto commit by WebIDE
2023-07-27 11:33:39 +08:00
Zackary Shen
ba4db97507 upload cv_nerf_3d-reconstruction_vector-quantize-compression (#407)
* add vq_compression model

* add vq_compression model

* check pre-commit for lint test

* fix by flake8

* update

* update

* update

* the last update

* the laast update

* update test_level>=0

---------

Co-authored-by: 剑匣 <zackary.sz@alibaba-inc.com>
2023-07-26 17:20:13 +08:00
tongmu.wh
ba1a333ba6 add language recognition pipelines and models
Add language recognition pipeline and model
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13385083
* add language recognition pipelines and models

* add a clustering method for speaker diarization

* define input and output type for language recognition
2023-07-25 21:07:56 +08:00
zeyinzi.jzyz
672c4899e9 add sd swift tuner
SD-Tuner based on Swift (LoRA/Adapter/Prompt)
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13380798

* sd swift tuner

* fix pre-checker
2023-07-25 19:00:49 +08:00
shuli.cly
526e1371f5 Merge the speaker-turn-detection codes, local test finished
# Speaker Diarization Speaker-Turn Detection CR

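Like Dialogue-Detection, this model is a submodule of the Speaker Diarization task (`audio/speaker diarization`).

This commit adds a model that makes its decisions from text. The local model's initial checkpoint was trained with huggingface, and this commit reuses part of the `nlp/token-classification` model code. To ease later maintenance and decouple it from the nlp code, the corresponding modules were created **separately** in model, pipeline, and preprocessor and re-registered.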
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13364720
* std first commit

* local test pass for speaker-turn-detection

* update speaker-turn-detection pipeline task outputs format; update pipeline outputs; update test scripts
2023-07-25 18:57:47 +08:00
hemu.zp
80f76ca475 Support stream output for transformers model
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13271136
* support stream for transformers model

* set test_level >= 2

* support hf model and chatglm2

* remove streaming_output for chatglm2
2023-07-25 17:41:32 +08:00
tingwei.gtw
d16522723a [to #42322933] add files
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13158565
* [to #42322933] add files

* [to #42322933] add files

* [to #42322933] add files

* [to #42322933] add files

* [to #42322933] add files

* update test data

* [to #42322933] add files

* Merge remote-tracking branch 'origin' into feature/sal_try_on

* [to #42322933] add files

* Merge remote-tracking branch 'origin' into feature/sal_try_on
2023-07-24 10:16:29 +08:00
mushenL
f77237b049 add llama2 pipeline (#399)
* Modify the parameter passing of the text_generation_pipeline class

* add llama2 pipeline

* add llama pipeline v1.1

* add llama pipeline v1.2

* add llama pipeline v1.3

* add llama pipeline v1.0.4
2023-07-22 21:53:04 +08:00
shuli.cly
13e345f6d9 add sv/speaker_diarization_dialogue_detection to branch sv/semantic-dialogue-detection
# Speaker Diarization Dialogue Detection CR

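This model is a submodule of the Speaker Diarization task (`audio/speaker diarization`).

This commit adds a model that makes its decisions from text. Its I/O and intermediate processing closely resemble `nlp/text-classification`, and the local model's initial checkpoint was also trained with huggingface, so this commit reuses part of the `nlp/text-classification` model code. To ease later maintenance and decouple it from the nlp code, the corresponding modules were created **separately** in model, pipeline, and preprocessor and re-registered.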
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13269649
* start to add speaker_diarization_dialogue_detection files; Need to change constant and test

* add sv/speaker_diarization_dialogue_detection to branch sv/semantic-dialogue-detection

* update test case

* add comments for speaker diarization dialogue detection pipelines

* add outputs type and inputs type for speaker_diarization_dialogue_detection
2023-07-20 19:29:59 +08:00
shenweichao.swc
05c65ba225 add s2net for panorama_depth_estimation
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13310819
* add s2net codes

* fix sphdecoder

* Merge branch 'master' into dev/cv_s2net_panorama_depth_estimation

* revise comments in the pipeline

* revise the code
2023-07-20 19:28:03 +08:00
xiangpeng.wxp
4085d821f3 [to #42322933] add polylm, a polyglot large language model
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13339595
2023-07-20 18:21:07 +08:00
baiguan.yt
ceac129c6b add parameters height and width for text-to-video
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13171907
2023-07-14 16:22:10 +08:00
yeqinghao.yqh
41cbb8e393 Fix mPLUG-Owl generation length bug
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13209284
2023-07-10 18:56:12 +08:00
tongmu.wh
a7f7a67855 fix details of speaker models
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13203011
* fix details of speaker models
2023-07-10 18:54:26 +08:00
chenyafeng.cyf
543d03e32b 3dspeaker
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13199180
2023-07-07 15:00:35 +08:00
wenmeng.zwm
0271b9c256 Merge branch 'master-github' into merge_master_github_0628 2023-06-28 20:27:34 +08:00
Wang Qiang
a018cd6107 Dreambooth method for finetuning stable diffusions (#339)
* Copyright

* dreambooth

* dreambooth test trainer

* fix bugs

* pre-commit

---------

Co-authored-by: 翊靖 <yijing.wq@alibaba-inc.com>
2023-06-28 20:10:28 +08:00
mulin.lyh
1ea9b58447 fix torch2.0 compatible issue
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13086361

* fix face-alignment compatible

* fix torch2.0 compatible issue
2023-06-28 14:15:48 +08:00
wenmeng.zwm
9e51920fdb Merge branch 'master-github' into merge_master_github_0626 2023-06-26 21:04:05 +08:00
wenmeng zhou
6dea1d5646 Fix/citest timeout (#308)
* timeout for citest set to 240min

* update docker image

* fix ci template not packed in whl

* update docker image version to 1.6.1 and add python3.8 support

* randomly choose a model for controlnet to avoid oom
2023-06-26 11:23:10 +08:00
chenyafeng.cyf
29062d9f94 eres2net_aug v2
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13046524

* eres2net_aug v2
2023-06-25 18:07:04 +08:00
tongmu.wh
f03c93cda5 add speaker diarization pipeline and improve some speaker pipelines
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/12891685

* add new speaker diarization pipeline. improve the capability of speaker pipelines
2023-06-21 17:56:05 +08:00
xingjun.wxj
0db0ec5586 Merge code from github
1. Merge(add) daily regression from github PR (daily_regression.yaml)
2. Add lora stable diffusion from github PR
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13010802
* fix: device arg not work, rename device to ngpu (#272)

* Correcting the lora stable diffusion example script (#300)

* add vad model and punc model in README.md 

add vad model and punc model

* Merge pull request #302 from modelscope/langgz-patch-1

add vad model and punc model in README.md

* add 1.6

* modify ignore

* Merge pull request #307 from modelscope/dev_rs_16

Merge release 1.6

* undo datetime to 2099

* Merge pull request #311 from modelscope/fix_master_version

undo datetime to 2099

* add daily regression workflow

* modify workflow name

* fix cron format issue

* lora trainer

* Merge pull request #315 from liuyhwangyh/add_regression_workflow

add daily regression workflow
2023-06-21 10:22:06 +08:00
xingjun.wxj
cc3c384d5e Fix issues for downloading mplug-youku dataset
1. Optimize downloading meta-csv files for large-scale dataset like mPLUG-youku (> 1GB for meta csv mapping)
2. Add head and overall progress bar for NativeIterableDataset
3. Modify the try-catch info for oss_utils
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/12952842
2023-06-15 15:42:21 +08:00
hemu.zp
96c2d42f09 Add StreamingMixin
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/12445731
* StreamingMixin poc

* update design

* Merge branch 'master' into feat/StreamingMixin

* add docstr

* make postprocessor input consistent
2023-06-08 19:40:14 +08:00
xixing.tj
1b7e0f50f4 add ocr detection new model db-nas
Add the 5M DB-NAS OCR text detection model
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/12760623
* add ocr detection new model db-nas

* add comment
2023-05-31 21:32:46 +08:00
yuanzhi.zyz
10c39b5ce1 add new ocr recognition model (LightweightEdge) and some functions
1. Add the new lightweight on-device recognition model LightweightEdge and tidy up the existing CRNN and ConvNextViT code
2. Add batch inference support
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/12787905
2023-05-31 21:16:22 +08:00
chenyafeng.cyf
f6ea3eadea eres2net
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/12640199
2023-05-16 22:28:20 +08:00
yeqinghao.yqh
b9c8c99776 Support mPLUG-Owl model.
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/12610417
2023-05-15 16:32:46 +08:00