Commit Graph

854 Commits

Author SHA1 Message Date
翎航
3b21ff10ec fix ocr preprocess 2022-10-31 16:57:49 +08:00
翎航
2299f8fa65 fix conflict 2022-10-26 22:41:13 +08:00
hemu.zp
d0f8547e7e [to #42322933] Fix gpt3 loading checkpoint after finetuning.
1. Fix the issue where the GPT-3 model could not load a checkpoint saved by finetuning
2. Add ut for the GPT-3 poetry generation model
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10537209
2022-10-26 20:58:00 +08:00
翎航
022fa4948a fix ocr-finetune acc 2022-10-26 19:44:54 +08:00
jiaqi.sjq
7b84adc914 [to #42322933]Fix removing files in the local model not taking effect in the remote repo after push_model
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10533214
2022-10-26 19:15:43 +08:00
hemu.zp
e4a0e046f9 [to #42322933] Add ut for mplug and bloom
Add ut for the three newly launched models langboat/bloom-1b4-zh, damo/mplug_visual-question-answering_coco_base_zh, and damo/mplug_image-captioning_coco_base_zh, with test_level set to 2
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10524221
2022-10-26 16:19:20 +08:00
wenshen.xws
2c994ed760 [to #42322933]fix tokenizer for faq
Multilingual FAQ: add type detection to the Tokenizer
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10530690
2022-10-26 16:18:27 +08:00
caorongyu.cry
3b8fb92c13 [to #42322933] debug header ids and header names
Fix header_ids and header_names being named the wrong way round
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10516557
2022-10-26 16:04:14 +08:00
ran.zhou
13f7e9ceca [to #42322933]SEA multilingual NLP (NER & word segmentation)
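Add NLP support for low-resource Southeast Asian languages, including:
1. Preprocessing for Thai and Vietnamese NER
2. A word segmentation model and pipeline based on the XLMR-CRF architecture
3. Preprocessing for Thai word segmentation

Added unittests for the corresponding pipelines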
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10492404
2022-10-26 14:52:22 +08:00
mulin.lyh
384377b8f5 * [to #45486649]feat: modelscope model version uses the model repo tag, branch or commit id is not supported, client user-agent header unified 2022-10-26 13:55:51 +08:00
jiaqi.sjq
5190c7de11 [to #41669377] tts using default master revision model in UT
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10526747
2022-10-26 11:53:52 +08:00
翎航
5dd9698a33 fix ocr-finetune ned 2022-10-26 11:48:22 +08:00
翎航
9d45274fbf add ocr-finetune ned 2022-10-26 11:47:21 +08:00
翎航
90d47832c0 add ocr-finetune ned 2022-10-26 11:45:50 +08:00
翎航
c077dea072 add ocr-finetune 2022-10-26 10:52:10 +08:00
zhangyanzhao.zyz
781fe49d63 [to #42322933]Fix finetune text ranking bugs
Previously the finetune code failed when the last batch of the dataset was shorter than the specified batch size; this is now fixed
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10524066
2022-10-26 09:44:25 +08:00
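The failure mode described in 781fe49d63 (a final batch shorter than the configured batch size) can be illustrated with a minimal sketch. This is not the repository's finetune code: `iter_batches` and `mean_scores` are hypothetical names, and the only point is that per-batch logic should rely on the actual batch length rather than the configured one.

```python
from typing import Iterator, List, Sequence


def iter_batches(items: Sequence[float], batch_size: int) -> Iterator[List[float]]:
    """Yield consecutive batches; the last one may be shorter than batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def mean_scores(items: Sequence[float], batch_size: int) -> List[float]:
    """Average each batch using its real length, not the configured batch_size."""
    return [sum(batch) / len(batch) for batch in iter_batches(items, batch_size)]
```

With five items and a batch size of two, the last batch holds a single item and is still processed without error.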
yuanzheng.yuanzhen
bab54bbce8 [to #42322933]support uni fold
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10481410
2022-10-25 22:59:19 +08:00
siyang.ssy
ba3db0f552 [to #42322933] fix video embedding output
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10525516
2022-10-25 22:56:14 +08:00
tingwei.gtw
d40cc98994 [to #42322933] update IO for demo services
Modified the I/O code to support modelscope's demo services
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10518318
2022-10-25 22:49:15 +08:00
yuze.zyz
c2da44b371 [to #42322933] remove dev model inference and fix some bugs
1. Change structbert dev revision to master revision
2. Fix bug: Sample code failed because of the model configuration update
3. Fix bug: Continue training regression failed
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10519992
2022-10-25 22:38:49 +08:00
lllcho.lc
41b35619e8 [to #42322933] Fix bug for demo service
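In the demo service scenario, requests that use the same video file at the same time cause a conflict when ffmpeg processes videos with the same name. Generating a unique file name with uuid resolves the conflict.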
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10518178
2022-10-25 20:31:53 +08:00
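The uuid-based naming mentioned in 41b35619e8 can be sketched as follows. This is an illustration under assumptions, not the committed code: `unique_video_path` is a hypothetical helper, and the temporary directory stands in for whatever working directory the service uses.

```python
import os
import tempfile
import uuid


def unique_video_path(original_name: str, work_dir: str) -> str:
    """Return a practically unique path in work_dir, keeping the original extension."""
    _, ext = os.path.splitext(original_name)
    # uuid4 is random, so two concurrent requests for "demo.mp4" get different names.
    return os.path.join(work_dir, f"{uuid.uuid4().hex}{ext}")


work_dir = tempfile.mkdtemp()
first = unique_video_path("demo.mp4", work_dir)
second = unique_video_path("demo.mp4", work_dir)
```

Each request then writes its intermediate file to its own path, so concurrent ffmpeg runs on the same source video no longer overwrite each other.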
yichang.zyc
62339161cd revert args of metric init
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10521235
2022-10-25 19:26:44 +08:00
zhangzhicheng.zzc
e1ab73b7d8 [to #42322933]support type str for zero-shot labels' input
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10506320
2022-10-25 13:55:09 +08:00
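One way to accept labels as either a string or a list, as e1ab73b7d8 describes, is to normalize the input up front. The sketch below assumes a comma-separated string format, which the commit message does not state; `normalize_labels` is a hypothetical name.

```python
from typing import List, Union


def normalize_labels(labels: Union[str, List[str]]) -> List[str]:
    """Accept a comma-separated string (assumed format) or a list of labels."""
    if isinstance(labels, str):
        labels = labels.split(",")
    # Strip whitespace and drop empty entries such as a trailing comma.
    return [label.strip() for label in labels if label.strip()]
```

Downstream code can then always work with a list, whichever form the caller passed.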
hemu.zp
ffd834fc25 [to #42322933] Add bloom model
Add the bloom model
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10509187
2022-10-25 12:58:02 +08:00
yichang.zyc
6ddafb3218 [to #42322933]caption finetune done, add bleu metric
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10318299
2022-10-25 12:55:41 +08:00
yuze.zyz
605cd7f44a [to #42322933] NLP 1030 Refactor
Features:
1. Refactor the directory structure of nlp models. All model files are placed into either the model folder or the task_model folder
2. Refactor all the comments to google style
3. Add detailed comments to important tasks and nlp models, describing each model and its preprocessor & trainer
4. Model Exporting now supports a direct call to TorchModelExporter (no need to derive from it)
5. Refactor the model save_pretrained method to support direct running (independent from the trainer)
6. Remove the Model type check in the pipeline base class, so externally registered models can run in our pipelines
7. The NLP trainer now has an NLPTrainingArguments class; users can pass arguments into the dataclass and use it as a normal cfg_modify_fn, which simplifies modifying the cfg.
8. Merge the BACKBONES and the MODELS, so users can get a backbone with the Model.from_pretrained call
9. Model.from_pretrained now supports a task argument, so users can load a backbone with a specific task class.
10. Support the Preprocessor.from_pretrained method
11. Add standard return classes to important nlp tasks, so some of the pipelines and models are now independent: models always return tensors, and pipelines take care of the conversion to numpy and any later processing.
12. Split the nlp preprocessors file to make the directory structure clearer.

Bug Fixes:
1. Fix a bug where lr_scheduler could be called before the optimizer's step
2. Fix a bug where calling Pipelines directly (not via pipeline(xxx)) throws an error
3. Fix a bug where the trainer does not call the correct TaskDataset class
4. Fix a bug where the internal loading of the dataset throws an error in the trainer class
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10490585
2022-10-25 12:26:25 +08:00
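Feature 7 of 605cd7f44a describes a dataclass of training arguments that doubles as a cfg_modify_fn. The sketch below only illustrates that pattern: `TrainingArgsSketch` is a hypothetical stand-in, not the real NLPTrainingArguments, and the nested cfg layout (`train.max_epochs`, `train.optimizer.lr`) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class TrainingArgsSketch:
    """Hypothetical stand-in; NOT the real NLPTrainingArguments."""
    max_epochs: Optional[int] = None
    lr: Optional[float] = None

    def __call__(self, cfg: Dict[str, Any]) -> Dict[str, Any]:
        """Act as a cfg_modify_fn: override only the fields that were set."""
        train = cfg.setdefault("train", {})
        if self.max_epochs is not None:
            train["max_epochs"] = self.max_epochs
        if self.lr is not None:
            train.setdefault("optimizer", {})["lr"] = self.lr
        return cfg


# Assumed cfg layout, for illustration only.
cfg = {"train": {"max_epochs": 1, "optimizer": {"type": "AdamW", "lr": 1e-3}}}
cfg = TrainingArgsSketch(max_epochs=5)(cfg)
```

Because the instance is callable, it can be passed anywhere a plain function that takes and returns a cfg is expected, and fields left unset keep their values from the original cfg.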
siyang.ssy
6d51f44dc7 [to #42322933]fix input type for video embedding
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10506601
2022-10-25 12:11:28 +08:00
bin.xue
525fa3ea89 [to #42322933]test: use 'master' branch in training test
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10509580
2022-10-25 12:10:07 +08:00
行嗔
cc8b78eac8 update rdrop 2022-10-25 11:55:37 +08:00
行嗔
0c64d3fca5 Merge remote-tracking branch 'origin/ofa/finetune_loss' into ofa/finetune
# Conflicts:
#	tests/trainers/test_ofa_trainer.py
2022-10-25 11:48:03 +08:00
翎航
1bb1eeec77 fix ut 2022-10-25 10:55:24 +08:00
行嗔
2288a0fdf3 fix all comments 2022-10-25 10:18:33 +08:00
行嗔
df5bd86048 fix a ut bug 2022-10-25 10:15:06 +08:00
行嗔
4276c5434f Merge remote-tracking branch 'origin/ofa/finetune' into ofa/finetune 2022-10-25 10:14:34 +08:00
行嗔
9e3f035fa7 fix a ut bug 2022-10-25 10:13:48 +08:00
行嗔
d5b2dabaf5 Merge remote-tracking branch 'origin/master' into ofa/finetune 2022-10-25 10:13:02 +08:00
huizheng.hz
a1738690c9 [to #42322933]test_image_denoise_trainer
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10465138
2022-10-25 10:08:57 +08:00
caorongyu.cry
6178f46910 [to #42322933] add ut for multi threads
1. 修复multi thread引起的问题
2. 增加multi thread的unittest
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10502008
2022-10-25 09:49:02 +08:00
yingda.chen
de7b6a06e9 [to #42322933] remove revision usage for face detection
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10507910

2022-10-25 09:28:01 +08:00
Yingda Chen
7714e0f2f4 Merge remote-tracking branch 'origin' into ofa/finetune 2022-10-25 09:18:05 +08:00
mulin.lyh
b41d275c70 [to #45703335]feat: refactor deploy
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10501307
2022-10-25 09:05:33 +08:00
行嗔
cb9a2b9d10 Merge remote-tracking branch 'origin/master' into ofa/finetune 2022-10-25 00:52:52 +08:00
行嗔
85a7832d57 fix a typo 2022-10-25 00:52:35 +08:00
zhangyanzhao.zyz
c4dbb69d65 [to #42322933]Add unit tests for the Chinese models of the text-ranking task, so the models can more easily receive the official model tag.
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10492754
2022-10-24 23:41:20 +08:00
yichang.zyc
35c612a642 [to #42322933]Remove the dev revision from the clip ut
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10507748

    * remove clip ut dev revision
2022-10-24 23:40:38 +08:00
行嗔
46c3bdcfe8 fix a bug 2022-10-24 23:19:23 +08:00
行嗔
428599f3e5 update finetune 2022-10-24 21:38:31 +08:00
行嗔
1ecf588c86 update finetune 2022-10-24 20:56:58 +08:00
翎航
73469b8400 fix loss&log 2022-10-24 19:44:04 +08:00
ashui.cbh
e223c1b008 [to #42322933]merge master after demo service support
Integrate with the demo service; change the input interface to a callable form
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10502169
2022-10-24 18:47:01 +08:00