212 Commits

Author SHA1 Message Date
yuze.zyz
34a5619285 Refactor hooks
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11651547
2023-03-02 20:12:01 +08:00
ryan.yy
6924c1583a NeRF reconstruction acceleration model: add trainer training module
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11794296
2023-03-01 12:06:21 +08:00
hemu.zp
e080067a96 [to #42322933] Support multi-machine data and tensor parallel finetuning
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11682479
2023-02-28 18:48:20 +08:00
myf272609
e63593f3bb [to #42322933] add finetune support for cartoon task
Add training support for the portrait cartoonization model

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11675597

* add finetune support for cartoon
2023-02-28 18:48:13 +08:00
lee.lcy
a04b9c6ec9 fix(damoyolo): fix FileNotFoundError when using trainer.evaluate() && add work_dir and exp_name to kwargs
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11793714

* fix(damoyolo): fix FileNotFoundError when using trainer.evaluate() && add work_dir and exp_name to kwargs

* style(damoyolo): add code annotation to ImageDetectionDamoyoloTrainer
2023-02-28 16:34:28 +08:00
yuze.zyz
70c9fd322a [to #47563396]Fix bug: two ckpt hooks save in the same dir
1. Support two checkpoint hooks saving final checkpoints in two different folders
2. Remove the check of checkpoint hooks
3. Fix an incorrect modification in UT
4. Fix bug: Checkpoint.load_checkpoint has been moved out
5. Add UT for new style configuration
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11630170

(cherry picked from commit 90af43f749)
2023-02-14 08:39:19 +08:00
fuhaomin.fhm
e0edbf135c [to #42322933] Doc2Bot documentation with retrieval rerank, generation
(cherry picked from commit 2fced1c06f)
2023-02-12 11:10:40 +08:00
shimin.ysm
9b0e302a66 refine cv_image_defrcn trainer to avoid failures
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11622570
2023-02-10 07:10:59 +00:00
yuze.zyz
ca1321f53f Support trainer prediction and fix some bugs
1. Support trainer prediction
2. Fix bug in text classification metric
3. Move load checkpoint out of checkpointhook
4. Fix bug in training progress (inner_iter variable not correct)

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11560269
2023-02-10 06:19:37 +00:00
zhangyanzhao.zyz
e6c05a2931 sentence-embedding support finetune
The sentence-embedding model now supports finetune

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11537009
2023-02-10 06:07:38 +00:00
hemu.zp
82482b3e96 update training args
Based on feat/0131/nlp_args branch, the original code review: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11408570

Support running finetuning from the command line with training args; compatible with the configuration optimization.
2023-02-10 05:32:21 +00:00
yuze.zyz
4dca4773db Support csanmt exporting and refactor some code
1. Support csanmt exporting to savedmodel format
2. Create a new base class for text-ranking preprocessors, and move some parameters of mgeo_ranking_preprocessor to init method
3. Avoid Model & Preprocessor classes coupled with pytorch
4. Regression test supports comparing only model output
5. Support zero-shot exporting to onnx and torchscript

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11522461
2023-02-10 05:15:04 +00:00
mulin.lyh
fd7fd38da0 fix failed case
2023-02-10 10:14:24 +08:00
shimin.ysm
2535866443 cv/image-fewshot-detection-defrcn support finetune and evaluation
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11486763

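* improve defrcn evaluation, support COCO format

* fix formatting issues

* optimize model loading

* optimize training and test scripts

* fix inference depending on the dataset

* specify model version

* specify model revision

* address review comments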
2023-02-09 10:43:08 +00:00
lanjinpeng.ljp
cffc1ba0e5 support DINO detection using EasyCV
Support the DINO high-accuracy object detection model

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11518805
2023-02-09 09:39:08 +00:00
leyuan.hjy
2684111bd7 Real-time object detection finetune support using easycv
Real-time object detection finetune support via EasyCV

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11554870

* add finetune support 

* implementation of trainer and pipeline switched to easycv

* remove old yolox code
2023-02-09 08:45:05 +00:00
tanfan.zjh
bb174351b3 refactor faq model and add MGIMN model
Refactor the FAQ model code and add the FAQ MGIMN model

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11595371
2023-02-09 08:29:19 +00:00
hemu.zp
ce4199a783 Fix data parallel bug for mgeo evaluation
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11584808
2023-02-09 08:26:52 +00:00
wenmeng.zwm
d5ae8ae43b remove tensorboard hook as default
tensorboard has been removed from the requirements of framework.txt, so we remove tensorboard hook from default config
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11519980

* remove tensorboard hook as default

* Merge branch 'master' into fix/remove_default_tensorboard_hook
2023-02-08 10:07:07 +00:00
xianzhe.xxz
0967ece5a0 fix damoyolo evaluator checkpoint loading mismatch
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11583722
2023-02-08 06:50:47 +00:00
ada.drx
7298bd2bb4 mgeo fix finetune for rerank test case and reduce UT time
* reduce UT time 
* fix finetune for rerank test case

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11563740
2023-02-07 02:55:33 +00:00
dawei.fdw
310e9c7dbf add plug mental model
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11549696

* add plug mental model code

* add test pipeline and fix annotation format bugs
2023-02-06 10:57:20 +00:00
mulin.lyh
e54694690f [to #46993990]feat: run ci cases based on code diff to reduce ci test time
2023-02-06 08:00:19 +00:00
pengteng.spt
e502e89c61 Split training and evaluating code for nearfield kws trainer
* fix judgement of fa case for certain keywords in det
* split code so that train and evaluate can be used independently
* fix pre-commit errors

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11453810
2023-01-31 09:43:19 +00:00
shouzhou.bx
f6c884b5ec [to #42322933][BUG FIX]bug fix for hand detect ft
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11439551
2023-01-16 05:07:25 +00:00
bin.xue
854c1e6cbf [to #42322933] bugfix: separation.evaluate() failed
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11426908
2023-01-13 09:19:31 +00:00
shimin.ysm
f7930c23a0 add cv/image-defrcn-fewshot-detection
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11364804

* add model defrcn-fewshot-detection

* add requirements check
2023-01-12 12:48:38 +00:00
ada.drx
2309596161 add mgeo finetune and pipeline
MGeo is a multi-modal multi-task geographic language model.
We support 5 pipeline tasks and 1 pretrained model MGeo on maas.
At the same time, we propose GeoGLUE, a geographic evaluation benchmark. MGeo can be finetuned on GeoGLUE tasks.

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11273012


* add prov city dist feature to gis encoder

* finish mgeo finetune and pipeline

* text classification add token type id

* to_device support ModelOutput class

* update token classification model label mask logic
2023-01-12 17:55:14 +08:00
jiangyu.xzy
c8c1b7f1a8 add asr finetune & change inference
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11403205

* support asr new models & vad-punc models
2023-01-12 16:01:54 +08:00
hemu.zp
06296c1819 [to #42322933] Fix evaluation oom
Add a merge method for all metrics so that parallel metrics can be merged when using data parallel. The evaluation process no longer saves all data, which avoids OOM.
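
The merge idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the ModelScope implementation: the class `AccuracyMetric` and its `add`, `merge`, and `evaluate` methods are hypothetical names chosen for this example. Each worker keeps only running counters, and the counters are combined at the end instead of gathering every prediction.

```python
from dataclasses import dataclass


@dataclass
class AccuracyMetric:
    """Hypothetical metric that stores running counts, not raw samples."""
    correct: int = 0
    total: int = 0

    def add(self, predictions, labels):
        # Update the counters for one batch; the batch itself is not kept.
        for pred, label in zip(predictions, labels):
            self.correct += int(pred == label)
            self.total += 1

    def merge(self, other):
        # Fold in the state accumulated by another data-parallel worker.
        self.correct += other.correct
        self.total += other.total

    def evaluate(self):
        accuracy = self.correct / self.total if self.total else 0.0
        return {'accuracy': accuracy}


# Simulate two workers that each evaluate a different shard of the data.
rank0 = AccuracyMetric()
rank0.add([1, 0, 1], [1, 1, 1])
rank1 = AccuracyMetric()
rank1.add([0, 0], [0, 1])

# Merge the per-worker state on one rank and compute the final result.
rank0.merge(rank1)
result = rank0.evaluate()
```

Because only two integers per worker need to be exchanged, memory use stays constant regardless of the evaluation set size.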

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11399082
2023-01-12 13:02:54 +08:00
xianzhe.xxz
393aa01e2b Support finetuning for the DAMO-YOLO model series.
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11249980

* add tinynas-detection trainer, evaluator and dataloader.

* add timer and general torch dist tools.

* replace loguru with modelscope standard logger.

* merge duplicate tinynas-detection model files.

* add compatibility of json config files.
2023-01-12 11:08:17 +08:00
bin.xue
78f812dbb6 [to #42322933] add speech separation finetune
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11379892
2023-01-12 07:02:46 +08:00
huizheng.hz
466200f355 NAFNet Image Deblurring pipeline and finetune support
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11300932

* fix psnr/ssim metrics for NAFNet (image denoise)

* add subset_name when loading dataset (NAFNet image denoising)
2023-01-11 22:18:03 +08:00
hemu.zp
a277b343af [to #42322933] Add beam search and pair finetune for GPT-3
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11397726

* test finetune weather

* support ppl and generation metrics
2023-01-11 22:04:11 +08:00
wenmeng.zwm
ed859e5274 Title: merge master-github and fix conflict
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11370549
2023-01-10 14:03:08 +08:00
tanfan.zjh
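62e575a376 FAQ question-answering model: support finetune and multiple languages
- FAQ question-answering model supports finetune
- FAQ question-answering model supports multiple languages
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11343276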
2023-01-10 13:59:40 +08:00
wenmeng.zwm
9ce750f4a9 merge master-github and fix conflict
2023-01-10 11:12:37 +08:00
james.wjg
c0c14177bc Add a trainer unit test
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11277400
2023-01-09 07:38:38 +08:00
ziyuan.tw
9552a8533e add ConvNeXt model
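Add the ConvNeXt model and fix a bug: the model expects BGR images, but the image-reading code outputs RGB by default, which made the normalization preprocessing incorrect and lowered model accuracy.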
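
The channel-order issue can be illustrated with a small sketch. It is not the code from this commit: the helper names `rgb_to_bgr` and `normalize_bgr` are hypothetical, and the per-channel statistics are commonly used ImageNet values written in BGR order, assumed here only for illustration.

```python
import numpy as np

# Assumed per-channel statistics in BGR order (illustrative values only).
MEAN_BGR = np.array([103.53, 116.28, 123.675], dtype=np.float32)
STD_BGR = np.array([57.375, 57.12, 58.395], dtype=np.float32)


def rgb_to_bgr(image):
    # Reverse the last axis so the channels go from R, G, B to B, G, R.
    return image[..., ::-1]


def normalize_bgr(image_bgr):
    # Normalize an H x W x 3 image whose channels are already in BGR order.
    return (image_bgr.astype(np.float32) - MEAN_BGR) / STD_BGR


# A 2 x 2 pure-red image as an RGB reader would return it.
rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rgb[..., 0] = 255

# Convert to the order the model expects before normalizing.
bgr = rgb_to_bgr(rgb)
normalized = normalize_bgr(bgr)
```

Skipping the `rgb_to_bgr` step would apply the blue-channel statistics to the red channel and vice versa, which is the kind of mismatch the message describes.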
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11192762
2023-01-09 06:56:05 +08:00
wenmeng.zwm
8f6a0f64e2 add support for eval configuration and fix logger problem
1. add configuration support for gpu_collect and for cache_dir, which is used for CPU result gathering. Configuration example:

```json
"evaluation": {
    "gpu_collect":  false,
    "cache_dir": "path/to/your/local/cache"
}
```

2. fix missing logger file when log_file is passed to get_logger, and add log_file for trainer
3. automatically create work_dir in rank0 worker
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11342068

* add support for configuration for tmpdir and gpu_collect
2023-01-09 02:51:35 +08:00
mulin.lyh
53a9342a29 skip tests/trainers/test_dialog_modeling_trainer.py
2023-01-06 14:41:49 +08:00
mulin.lyh
1ec601aea2 skip tests/trainers/test_dialog_intent_trainer.py for list model file 500 error
2023-01-06 09:15:50 +08:00
caorongyu.cry
72c39fb161 add space-t trainer
1. Add the fine-tuning workflow
2. Add the evaluation workflow
3. Link the dataset nlp_convai_text2sql_pretrain_cn_trainset
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11276053

* add space-t trainer

* revise for trainer

* Merge branch 'master' into dev/tableqa_finetune

* revise for trainer

* Merge remote-tracking branch 'origin' into dev/tableqa_finetune
2023-01-04 09:46:37 +08:00
mulin.lyh
41cd220e01 temp skip failed case
2022-12-30 19:24:19 +08:00
wenmeng.zwm
b8ec677739 add training args support and image classification finetune example
design doc: https://yuque.antfin.com/pai/rwqgvl/khy4uw5dgi39s6ke

usage:
```python

from modelscope.metainfo import Trainers
from modelscope.msdatasets import MsDataset
from modelscope.trainers import build_trainer
from modelscope.trainers.training_args import (ArgAttr, MSArgumentParser,
                                               training_args)


def create_dataset(name, split):
    # Helper not defined in the original message; assumed here to load a
    # '<namespace>/<dataset>' id through MsDataset.
    namespace, dataset_name = name.split('/')
    return MsDataset.load(dataset_name, namespace=namespace, split=split)


# Register extra command-line arguments and map them to config nodes.
training_args.topk = ArgAttr(
    cfg_node_name=['train.evaluation.metric_options.topk',
                   'evaluation.metric_options.topk'],
    default=(1,),
    help='evaluation using topk, tuple format, eg (1,), (1,5)')
training_args.train_data = ArgAttr(
    type=str, default='tany0699/cats_and_dogs', help='train dataset')
training_args.validation_data = ArgAttr(
    type=str, default='tany0699/cats_and_dogs', help='validation dataset')
training_args.model_id = ArgAttr(
    type=str,
    default='damo/cv_vit-base_image-classification_ImageNet-labels',
    help='model name')

parser = MSArgumentParser(training_args)
cfg_dict = parser.get_cfg_dict()
args = parser.args

train_dataset = create_dataset(args.train_data, split='train')
val_dataset = create_dataset(args.validation_data, split='validation')


def cfg_modify_fn(cfg):
    # Overlay the parsed command-line values on the model configuration.
    cfg.merge_from_dict(cfg_dict)
    return cfg


kwargs = dict(
    model=args.model_id,          # model id
    train_dataset=train_dataset,  # training dataset
    eval_dataset=val_dataset,     # validation dataset
    cfg_modify_fn=cfg_modify_fn   # callback to modify configuration
)

trainer = build_trainer(name=Trainers.image_classification, default_args=kwargs)
# start to train
trainer.train()
```
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11225071
2022-12-30 07:35:15 +08:00
pengteng.spt
cddebf567f add kws nearfield finetune
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11179425

* add kws nearfield finetune

* work on rank-0 only if evaluating

* split kaldi relevant code into runtime utils

* add evaluate but not files checking

* test evaluate on cpu

* add default value for cmvn_file
2022-12-29 10:14:41 +08:00
yichang.zyc
0c79b57fcc support batch infer
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11170755
2022-12-28 12:17:36 +08:00
shuying.shu
048207d79b fix memory leak bug in eval
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11195714
2022-12-25 08:40:19 +08:00
wenmeng.zwm
0ad43911c4 merge gitlab master and fix conflict
2022-12-24 10:12:56 +08:00
jiaqi.sjq
8896087034 [to #42322933] support kantts infer and finetune
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11111331#tab=detail
2022-12-20 10:45:34 +08:00