Commit Graph

562 Commits

Author SHA1 Message Date
shuying.shu
085acc64c8 fix bug and change unittest mode
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10680402
2022-11-10 13:09:56 +08:00
caorongyu.cry
e2a9695f93 [to #42322933] add synonym
Main changes:
1. Added a synonym dictionary
2. Post-process the SQL: if it contains sorting, convert empty columns to the Primary column
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10670121
2022-11-08 22:20:03 +08:00
hemu.zp
0f0fdcae6f [to #42322933] Fix bug for mplug evaluation
Fixed mplug evaluation using the wrong metrics, moved some Chinese-text processing code into utils, and added a trainer for mplug
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10612875
2022-11-08 17:58:03 +08:00
zhangzhicheng.zzc
d3519bcbca [to #42322933]token preprocess bug fix
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10608664
2022-11-08 15:42:08 +08:00
翎航
fd4276ad1a add five finetune task & merge master 2022-11-08 10:57:22 +08:00
翎航
3534970709 add five finetune task & merge master 2022-11-08 10:39:59 +08:00
翎航
a02b2409d8 add five finetune task & merge master 2022-11-08 10:39:45 +08:00
翎航
04377cfd79 Merge branch 'master' into tmp
the new master
2022-11-08 10:36:27 +08:00
翎航
eb82ba9c6f add finetune & merge master 2022-11-07 20:30:18 +08:00
翎航
0418786cbe add five task finetune 2022-11-07 20:23:17 +08:00
yzhao
3f75fcdb79 fix bug 2022-11-02 20:02:18 +08:00
mulin.lyh
4429991646 Merge branch dev/msdataset_event_tracking into master
Title: [to #42322933] add event tracking 

1. add event tracking for dataset downloading pv/uv
2. change datasets version: <=2.5.2
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10593016
2022-11-01 19:36:49 +08:00
班扬
e45ab2c32d add event tracking 2022-11-01 15:51:00 +08:00
班扬
63a08e7be6 add event tracking 2022-11-01 15:49:21 +08:00
干劲
2759d538bb fix ut level for unifold 2022-11-01 14:59:45 +08:00
liugao.lg
40b6770956 [to #42322933]fix ocr preprocess & conflict
Fix inconsistent OCR preprocessing logic
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10581697
2022-11-01 10:22:11 +08:00
zhangzhicheng.zzc
06abae4dc6 [to #42322933]add token-cls test cases and bug fix
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10585502
2022-11-01 09:56:15 +08:00
zhangzhicheng.zzc
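0d3b7b0df2 [to #42322933]fix bugs related to token cls
1. Fix incorrect finetune results from the token classification preprocessor
2. Fix unused attributes in the word segmentation output
3. Fix the nlp preprocessor passing use_fast incorrectly
4. Fix a torch model exporter bug
5. Fix trainer-related bugs found while writing the documentation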
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10573269
2022-10-31 20:52:27 +08:00
shouzhou.bx
e72988c2ba add face detection to face_2d_keypoints_pipeline 2022-10-31 20:46:49 +08:00
翎航
11cebb8d64 fix ocr preprocess & conflict 2022-10-31 17:03:18 +08:00
翎航
3b21ff10ec fix ocr preprocess 2022-10-31 16:57:49 +08:00
yichang.zyc
e2d35fbb14 [to #42322933]add finetune support for clip
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10572842
2022-10-30 21:51:11 +08:00
Yingda Chen
9f7b8b86a3 [to #42322933] disable 2dkeypoints training since face_2d_keypoints_dataset is set to be private 2022-10-30 13:59:12 +08:00
Yingda Chen
902019c2e0 [to #42322933] disable vgg19_fer 2022-10-30 13:55:49 +08:00
Yingda Chen
29448c0f57 [to #42322933] disable vit 2022-10-30 11:15:52 +08:00
mulin.lyh
3791ee7ad2 [to #45821936]fix: fix block user specify revision after release_datetime
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10572162
2022-10-29 13:44:47 +08:00
yuze.zyz
4b7e8e89aa [to #42322933] Fix some bugs when downgrading the version of some dependencies
1. Fix bug in model exporting
2. Skip some long trainings in test level 2
3. Refine some comments
4. Fix a bug that mode is not correct when saving checkpoints
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10564716
2022-10-28 21:44:33 +08:00
Yufeng
261c04b8b5 add Mglm (#5)
* mglm init

* add mglm requirements

Co-authored-by: Yufeng <zhuyufeng@gmail.com>
Co-authored-by: wenmeng.zwm <wenmeng.zwm@alibaba-inc.com>
2022-10-28 17:12:47 +08:00
Yingda Chen
46cfa177aa [to #42322933]skip timeconsuming test 2022-10-28 09:34:29 +08:00
xianzhe.xxz
88e8d4291a [to #42322933]"fix: set the eps and momentum of BN consistent with training"
To keep consistent between training and evaluation, change the eps and momentum of BN.
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10554451
2022-10-28 09:27:55 +08:00
menrui.mr
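c7b0787049 Fix parameters not taking effect during initialization
Previously the text-to-image model did not load the parameters in configuration.json, which affected the default configuration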
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10558026
2022-10-27 23:29:08 +08:00
hemu.zp
fa415d8720 [to #42322933] Fix bug for bloom and gpt_neo
1. Fix post-processing errors in the bloom and gpt_neo models after updating to transformers 4.23
2. Use ModelOutput uniformly as the model output
3. The gpt_neo checkpoint is now online; change its ut to level2
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10553103
2022-10-27 23:27:28 +08:00
xingjun.wxj
78f29cf999 [to #42322933] Add delete datasets files and upload mode.
1. Add: MsDataset.delete(), which supports deleting a dataset file or dir.

2. Add: upload mode, MsDataset.upload(xx, upload_mode=UploadMode.FORCE_UPLOAD) or MsDataset.upload(xx, upload_mode=UploadMode.APPEND_UPLOAD).
     If upload_mode = UploadMode.APPEND_UPLOAD, an object is skipped when it already exists.

3. Add: support reloading the sts token automatically to avoid expiration (current expiration: 24h).

4. Fix: add cookies in api.py for downloading private datasets.
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10524449
2022-10-27 20:30:35 +08:00
Yingda Chen
374fd3090e [to #42322933]skip referring video tests since model is private 2022-10-27 20:23:51 +08:00
yuze.zyz
212cf53318 [to #42322933] Fix some bugs
1. Add F1 score to sequence classification metric
2. Fix a bug that the evaluate method in trainer does not support a pure pytorch_model.bin
3. Fix a bug in evaluation of veco trainer 
4. Add some tips if lr_scheduler in the trainer needs a higher version torch
5. Add some comments
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10532230
2022-10-27 19:49:21 +08:00
shuying.shu
ddcb57440d [to #42322933]add fine-tune code for referring video object segmentation
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10539423
2022-10-27 19:43:54 +08:00
mulin.lyh
3b75623be4 [to #45773874]fix: get_model revision=None bug, and hub case occasionally delete test model failed
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10549680
2022-10-27 17:06:18 +08:00
eniac.xcw
8886c3c1ae [to #42322933]fine tune team on caltech-101
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10525413
2022-10-27 12:00:14 +08:00
yingda.chen
de708dd518 add basic remap column wrapper
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10539917

    * add basic remap column wrapper
2022-10-27 10:12:05 +08:00
hemu.zp
69104c0f8a [to #42322933] Refactor text generation model outputs and fix some bugs
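1. Split the model.forward part of single_gpu_test and multi_gpu_test out into evaluation_step in EpochBasedTrainer, giving more flexibility to models that do not call forward during evaluation
2. Refactor so that the inputs and outputs of text generation models at the Model layer are uniformly Tensor; the Tensor-to-str decode step moves into the pipeline
3. Pipeline post-processing now handles spaces where Chinese text and Chinese punctuation are mixed with English, so that decoded mixed Chinese-English output is correct
4. Add TextGenerationTrainer, fixing the problem that for some models forward output a single token for computing metrics during evaluation
5. Fix rouge being unable to accept empty strings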
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10473768
2022-10-27 09:52:05 +08:00
liugao.lg
0605376135 [to #42322933]add ofa finetune
Add finetune capability for ofa
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10540701
2022-10-27 09:29:06 +08:00
翎航
2299f8fa65 fix conflict 2022-10-26 22:41:13 +08:00
hemu.zp
d0f8547e7e [to #42322933] Fix gpt3 loading checkpoint after finetuning.
1. Fix the GPT-3 model being unable to load checkpoints saved by finetune
2. Add ut for the GPT-3 poetry generation model
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10537209
2022-10-26 20:58:00 +08:00
翎航
022fa4948a fix ocr-finetune acc 2022-10-26 19:44:54 +08:00
jiaqi.sjq
7b84adc914 [to #42322933]Fix removal of files in local model not taking effect in remote repo after push_model
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10533214
2022-10-26 19:15:43 +08:00
hemu.zp
e4a0e046f9 [to #42322933] Add ut for mplug and bloom
Add ut for the three newly released models langboat/bloom-1b4-zh, damo/mplug_visual-question-answering_coco_base_zh, and damo/mplug_image-captioning_coco_base_zh, with test_level set to 2
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10524221
2022-10-26 16:19:20 +08:00
caorongyu.cry
3b8fb92c13 [to #42322933] debug header ids and header names
Fix header_ids and header_names being named the wrong way round
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10516557
2022-10-26 16:04:14 +08:00
ran.zhou
13f7e9ceca [to #42322933]SEA multilingual NLP (NER & word segmentation)
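Add NLP support for Southeast Asian languages, including:
1. NER preprocessing for Thai and Vietnamese
2. A word segmentation model and pipeline based on the XLMR-CRF architecture
3. Preprocessing for Thai word segmentation

Added unittests for the corresponding pipelines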
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10492404
2022-10-26 14:52:22 +08:00
mulin.lyh
384377b8f5 * [to #45486649]feat: modelscope model version use model repo tag, unsupport branch or commit it, client user-agent header unified 2022-10-26 13:55:51 +08:00
jiaqi.sjq
5190c7de11 [to #41669377] tts using default master revision model in UT
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10526747
2022-10-26 11:53:52 +08:00