188 Commits

Author SHA1 Message Date
shouzhou.bx
9eb8ad5fc9 [to #42322933][BUG FIX]bug fix for hand detect ft
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11439551
2023-01-16 13:08:33 +08:00
bin.xue
854c1e6cbf [to #42322933] bugfix: separation.evaluate() failed
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11426908
2023-01-13 09:19:31 +00:00
shimin.ysm
f7930c23a0 add cv/image-defrcn-fewshot-detection
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11364804

* add model defrcn-fewshot-detection

* add requirements check
2023-01-12 12:48:38 +00:00
ada.drx
2309596161 add mgeo finetune and pipeline
MGeo is a multi-modal multi-task geographic language model.
We support 5 pipeline tasks and 1 pretrained model MGeo on maas.
At the same time, we propose GeoGLUE, a geographic evaluation benchmark. MGeo can be finetuned on GeoGLUE tasks.

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11273012


* add prov city dist feature to gis encoder

* finish mgeo finetune and pipeline

* text classification add token type id

* to_device support ModelOutput class

* update token classification model label mask logic
2023-01-12 17:55:14 +08:00
jiangyu.xzy
c8c1b7f1a8 add asr finetune & change inference
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11403205

* support asr new models & vad-punc models
2023-01-12 16:01:54 +08:00
hemu.zp
06296c1819 [to #42322933] Fix evaluation oom
Add a merge method to all metrics so that per-worker metrics can be merged when using data parallel. Evaluation no longer keeps all data in memory, which avoids OOM.
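
The merge idea described above can be illustrated with a minimal sketch. The class name `AccuracyMetric` and its methods are hypothetical and are not taken from the repository; the sketch only shows how a metric can keep running counters and be merged across data-parallel workers instead of storing every prediction.

```python
from dataclasses import dataclass


@dataclass
class AccuracyMetric:
    """Hypothetical metric that keeps running counts instead of raw data."""
    correct: int = 0
    total: int = 0

    def add(self, predictions, labels):
        # Update counters batch by batch; the batch itself is not retained.
        for pred, label in zip(predictions, labels):
            self.correct += int(pred == label)
            self.total += 1

    def merge(self, other: "AccuracyMetric") -> None:
        # Fold in the counts accumulated by another data-parallel worker.
        self.correct += other.correct
        self.total += other.total

    def evaluate(self) -> dict:
        return {"accuracy": self.correct / self.total if self.total else 0.0}


# Two workers each see part of the evaluation data.
worker0, worker1 = AccuracyMetric(), AccuracyMetric()
worker0.add([1, 0, 1], [1, 1, 1])
worker1.add([0, 0], [0, 0])

# Merging the counters gives the same result as evaluating on all data at once.
worker0.merge(worker1)
result = worker0.evaluate()
print(result)
```

Because only two integers are kept per worker, memory use stays constant regardless of the size of the evaluation set.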

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11399082
2023-01-12 13:02:54 +08:00
xianzhe.xxz
393aa01e2b Support finetuning for the DAMO-YOLO series of models.
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11249980

* add tinynas-detection trainer, evaluator and dataloader.

* add timer and general torch dist tools.

* replace loguru with modelscope standard logger.

* merge duplicate tinynas-detection model files.

* add compatibility of json config files.
2023-01-12 11:08:17 +08:00
bin.xue
78f812dbb6 [to #42322933] add speech separation finetune
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11379892
2023-01-12 07:02:46 +08:00
huizheng.hz
466200f355 NAFNet Image Deblurring pipeline and finetune support
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11300932

* fix psnr/ssim metrics for NAFNet (image denoise)

* add subset_name when loading dataset (NAFNet image denoising)
2023-01-11 22:18:03 +08:00
hemu.zp
a277b343af [to #42322933] Add beam search and pair finetune for GPT-3
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11397726

* test finetune weather

* support ppl and generation metrics
2023-01-11 22:04:11 +08:00
wenmeng.zwm
ed859e5274 Title: merge master-github and fix conflict
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11370549
2023-01-10 14:03:08 +08:00
tanfan.zjh
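62e575a376 FAQ question-answering model supports finetune / FAQ question-answering model supports multiple languages
- FAQ question-answering model supports finetune
- FAQ question-answering model supports multiple languages
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11343276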
2023-01-10 13:59:40 +08:00
wenmeng.zwm
9ce750f4a9 merge master-github and fix conflict 2023-01-10 11:12:37 +08:00
james.wjg
c0c14177bc Add a trainer unit test
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11277400
2023-01-09 07:38:38 +08:00
ziyuan.tw
9552a8533e add ConvNeXt model
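Add the ConvNeXt model and fix a code bug: the model requires BGR-format input images, but the image-reading code outputs RGB by default, which caused incorrect normalization during preprocessing and degraded model accuracy.
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11192762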
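
The channel-order bug described in this commit can be reproduced with a small, self-contained sketch. The mean values and helper names below are assumptions chosen for illustration (they are not taken from the repository), and plain nested lists stand in for image arrays.

```python
# Hypothetical per-channel means, listed in BGR order as the model expects.
MEAN_BGR = (103.53, 116.28, 123.675)


def rgb_to_bgr(image):
    """Reverse the channel order of every pixel in an H x W x 3 nested list."""
    return [[pixel[::-1] for pixel in row] for row in image]


def subtract_mean(image, mean):
    """Subtract a per-channel mean from every pixel."""
    return [[[c - m for c, m in zip(pixel, mean)] for pixel in row]
            for row in image]


# A 1x1 "image" holding one pure-red pixel, as an RGB loader would return it.
rgb_image = [[[255, 0, 0]]]

# Bug: RGB data combined with BGR means, so red is normalized with the blue mean.
wrong = subtract_mean(rgb_image, MEAN_BGR)

# Fix: reorder the channels first, then normalize.
right = subtract_mean(rgb_to_bgr(rgb_image), MEAN_BGR)

print(wrong[0][0])
print(right[0][0])
```

The two results differ in every channel that is swapped, which is why a model trained on BGR input loses accuracy when fed RGB data with BGR statistics.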
2023-01-09 06:56:05 +08:00
wenmeng.zwm
8f6a0f64e2 add support for eval configuration and fix logger problem
1. add configuration support for gpu_collect and for cache_dir, which is used for CPU result gathering; configuration example:

```json
"evaluation": {
    "gpu_collect":  false,
    "cache_dir": "path/to/your/local/cache"
}
```

2. fix the logger file going missing when log_file is passed to get_logger, and add log_file for trainer
3. automatically create work_dir in the rank0 worker
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11342068

* add support for configuration for tmpdir and gpu_collect
2023-01-09 02:51:35 +08:00
mulin.lyh
53a9342a29 skip tests/trainers/test_dialog_modeling_trainer.py 2023-01-06 14:41:49 +08:00
mulin.lyh
1ec601aea2 skip tests/trainers/test_dialog_intent_trainer.py for list model file 500 error 2023-01-06 09:15:50 +08:00
caorongyu.cry
72c39fb161 add space-t trainer
1. add the fine-tuning workflow
2. add the evaluation workflow
3. associate the dataset nlp_convai_text2sql_pretrain_cn_trainset
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11276053

* add space-t trainer

* revise for trainer

* Merge branch 'master' into dev/tableqa_finetune

* revise for trainer

* Merge remote-tracking branch 'origin' into dev/tableqa_finetune
2023-01-04 09:46:37 +08:00
mulin.lyh
41cd220e01 temp skip failed case 2022-12-30 19:24:19 +08:00
wenmeng.zwm
b8ec677739 add training args support and image classification finetune example
design doc: https://yuque.antfin.com/pai/rwqgvl/khy4uw5dgi39s6ke

usage:
```python
from modelscope.metainfo import Trainers
from modelscope.trainers import build_trainer
from modelscope.trainers.training_args import (ArgAttr, MSArgumentParser,
                                               training_args)

# NOTE: create_dataset is a user-supplied helper that is not shown in this
# message; it is expected to return the dataset for the given name and split.

# register extra command-line arguments and map them onto config nodes
training_args.topk = ArgAttr(
    cfg_node_name=['train.evaluation.metric_options.topk',
                   'evaluation.metric_options.topk'],
    default=(1,),
    help='evaluation using topk, tuple format, eg (1,), (1,5)')
training_args.train_data = ArgAttr(
    type=str, default='tany0699/cats_and_dogs', help='train dataset')
training_args.validation_data = ArgAttr(
    type=str, default='tany0699/cats_and_dogs', help='validation dataset')
training_args.model_id = ArgAttr(
    type=str,
    default='damo/cv_vit-base_image-classification_ImageNet-labels',
    help='model name')

# parse the command line into a config dict and an args namespace
parser = MSArgumentParser(training_args)
cfg_dict = parser.get_cfg_dict()
args = parser.args

train_dataset = create_dataset(args.train_data, split='train')
val_dataset = create_dataset(args.validation_data, split='validation')


def cfg_modify_fn(cfg):
    # merge the parsed command-line values into the model configuration
    cfg.merge_from_dict(cfg_dict)
    return cfg


kwargs = dict(
    model=args.model_id,          # model id
    train_dataset=train_dataset,  # training dataset
    eval_dataset=val_dataset,     # validation dataset
    cfg_modify_fn=cfg_modify_fn   # callback to modify configuration
)

trainer = build_trainer(name=Trainers.image_classification, default_args=kwargs)
# start to train
trainer.train()
```
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11225071
2022-12-30 07:35:15 +08:00
pengteng.spt
cddebf567f add kws nearfield finetune
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11179425

* add kws nearfield finetune

* work on rank-0 only if evaluating

* split kaldi relevant code into runtime utils

* add evaluate but not files checking

* test evaluate on cpu

* add default value for cmvn_file
2022-12-29 10:14:41 +08:00
yichang.zyc
0c79b57fcc support batch infer
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11170755
2022-12-28 12:17:36 +08:00
shuying.shu
048207d79b fix memory leak bug in eval
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11195714
2022-12-25 08:40:19 +08:00
wenmeng.zwm
0ad43911c4 merge gitlab master and fix conflict 2022-12-24 10:12:56 +08:00
jiaqi.sjq
8896087034 [to #42322933] support kantts infer and finetune
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11111331#tab=detail
2022-12-20 10:45:34 +08:00
jerry.lp
906fa673b4 add gpt-moe model for modelscope finetune
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11085918
2022-12-17 05:52:57 +08:00
shouzhou.bx
95ede6378e [to #42322933] 1230: add hand detection 2022-12-16 13:24:02 +08:00
wenmeng.zwm
c8dcdd93da broadcast metric values across all workers in distributed training
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10980488
2022-12-08 10:22:47 +08:00
wenmeng.zwm
5a3d58ad49 Merge branch 'master-gitlab' into merge_master_internal_1207 2022-12-07 19:59:07 +08:00
shiyi.zxh
c3a494e46d [to #42322933]
enable finetune of ofa-mmspeech
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10981972
2022-12-06 20:58:49 +08:00
baiguan.yt
ce0480f7ed update image-portrait-enhancement trainer
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10717891
2022-12-05 11:43:52 +08:00
hemu.zp
941dbe75cf [to #42322933] Add GPT-3 tensor parallel finetuning
Add GPT-3 tensor parallel finetuning and adjust some distributed code to make tensor and data parallel compatible.
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10949507
2022-12-05 10:01:32 +08:00
hemu.zp
346da3d489 [to #42322933] Add mplug pretrained model
Add pre-trained models for mplug finetuning.
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10963691
2022-12-04 15:27:50 +08:00
chenxujun
99507a5cc6 Fix some words 2022-12-03 14:39:55 +08:00
ly119399
2f17daa23f [to #42322933] reduce the GPU usage of dialog trainer
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10955485
2022-12-02 17:32:26 +08:00
wenmeng.zwm
c9a6b887a2 add tensorboard hook for visualization
1. add tensorboard hook to default config
2. add image visualization support to tensorboard hook and trainer
3. move evaluation logic out of single_gpu_test and multi_gpu_test to make prediction results available for further processing such as result saving and visualization.

visualization results are as follows:
![image.png](https://cn-hangzhou.oss-cdn.aliyun-inc.com/git/force/uploads/comment/29212/38448470860386707/image.png)
![image.png](https://cn-hangzhou.oss-cdn.aliyun-inc.com/git/force/uploads/comment/29212/38437794200606734/image.png)
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10894813
2022-12-02 15:13:24 +08:00
ziyuan.tw
31316b8d29 add nextvit-small_image-classification_Dailylife-labels model
Support the newly launched 1130 model.
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10886253
2022-12-02 14:46:49 +08:00
ly119399
5ae1e08db6 [to #42322933] fix bug of tableQA on gpu
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10943053
2022-12-02 10:38:30 +08:00
suluyan.sly
1394019102 [to #42322933] plug finetune
plug finetune: regression-tested to the best result on the DuReader-robust dataset
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10916382
2022-12-01 19:31:15 +08:00
james.wjg
9b3a92e65d add finetune for cv/language_guided_video_summarization
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10790262
2022-12-01 19:16:56 +08:00
yuze.zyz
bb5512d1ab [to #42322933] Refactor NLP and fix some user feedbacks
1. Abstract keys of dicts needed by nlp metric classes into the init method
2. Add Preprocessor.save_pretrained to save preprocessor information
3. Abstract the config-saving function, so that the config is saved correctly both when from_pretrained is called directly and when cfg is modified step by step during training.
4. Remove SbertTokenizer and VecoTokenizer, use transformers' tokenizers instead
5. Use model/preprocessor's from_pretrained in all nlp pipeline classes.
6. Add model_kwargs and preprocessor_kwargs in all nlp pipeline classes
7. Add base classes for fill-mask and text-classification preprocessor, as a demo for later changes
8. Fix user feedback: Re-train the model in continue training scenario
9. Fix user feedback: Too many checkpoints saved
10. Simplify the nlp-trainer
11. Fix user feedback: Split the default trainer's __init__ method, making it easier for users to override
12. Add safe_get to Config class
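
As a rough illustration of item 12, a safe getter over a nested configuration might look like the sketch below. This is a standalone function over plain dicts written for this example; its name, signature, and dotted-key convention are assumptions and it is not the actual `Config.safe_get` implementation.

```python
def safe_get(cfg: dict, key_chain: str, default=None):
    """Walk a dotted key chain through nested dicts.

    Returns `default` as soon as any key along the chain is missing,
    instead of raising KeyError.
    """
    node = cfg
    for key in key_chain.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


cfg = {"train": {"optimizer": {"lr": 0.001}}}

lr = safe_get(cfg, "train.optimizer.lr")
missing = safe_get(cfg, "train.scheduler.type", default="none")
print(lr, missing)
```

A getter of this kind lets calling code read optional config nodes without wrapping every access in a try/except block.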

----------------------------  Another refactor from version 36 -------------------------

13. Rename all nlp transformers' preprocessors from TaskNamePreprocessor to TaskNameTransformersPreprocessor, for example:
      TextClassificationPreprocessor -> TextClassificationTransformersPreprocessor
14. Add a base class per task for all nlp tasks' preprocessors which has at least two sub-preprocessors
15. Add output classes of nlp models
16. Refactor the logic for token-classification
17. Fix bug: checkpoint_hook does not support pytorch_model.pt
18. Fix bug: Pipeline name does not match with task name, so inference will not succeed after training
       NOTE: This is only a stopgap solution; the root cause is the unclear relationship between models and pipelines
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10723513

* add save_pretrained to preprocessor

* save preprocessor config in hook

* refactor label-id mapping fetching logic

* test ok on sentence-similarity

* run on finetuning

* fix bug

* pre-commit passed

* fix bug

* Merge branch 'master' into feat/refactor_config

# Conflicts:
#	modelscope/preprocessors/nlp/nlp_base.py

* add params to init

* 1. support max ckpt num 2. support ignoring others but bin file in continue training 3. add arguments to some nlp metrics

* Split trainer init impls to overridable methods

* remove some obsolete tokenizers

* unfinished

* support input params in pipeline

* fix bugs

* fix ut bug

* fix bug

* fix ut bug

* fix ut bug

* fix ut bug

* add base class for some preprocessors

* Merge commit '379867739548f394d0fa349ba07afe04adf4c8b6' into feat/refactor_config

* compatible with old code

* fix ut bug

* fix ut bugs

* fix bug

* add some comments

* fix ut bug

* add a requirement

* fix pre-commit

* Merge commit '0451b3d3cb2bebfef92ec2c227b2a3dd8d01dc6a' into feat/refactor_config

* fixbug

* Support function type in registry

* fix ut bug

* fix bug

* Merge commit '5f719e542b963f0d35457e5359df879a5eb80b82' into feat/refactor_config

# Conflicts:
#	modelscope/pipelines/nlp/multilingual_word_segmentation_pipeline.py
#	modelscope/pipelines/nlp/named_entity_recognition_pipeline.py
#	modelscope/pipelines/nlp/word_segmentation_pipeline.py
#	modelscope/utils/hub.py

* remove obsolete file

* rename init args

* rename params

* fix merge bug

* add default preprocessor config for ner-model

* move a method to a util file

* remove unused config

* Fix a bug in pbar

* bestckptsaver:change default ckpt numbers to 1

* 1. Add assert to max_epoch 2. split init_dist and get_device 3. change cmp func name

* Fix bug

* fix bug

* fix bug

* unfinished refactoring

* unfinished

* uw

* uw

* uw

* uw

* Merge branch 'feat/refactor_config' into feat/refactor_trainer

# Conflicts:
#	modelscope/preprocessors/nlp/document_segmentation_preprocessor.py
#	modelscope/preprocessors/nlp/faq_question_answering_preprocessor.py
#	modelscope/preprocessors/nlp/relation_extraction_preprocessor.py
#	modelscope/preprocessors/nlp/text_generation_preprocessor.py

* uw

* uw

* unify nlp task outputs

* uw

* uw

* uw

* uw

* change the order of text cls pipeline

* refactor t5

* refactor tg task preprocessor

* fix

* unfinished

* temp

* refactor code

* unfinished

* unfinished

* unfinished

* unfinished

* uw

* Merge branch 'feat/refactor_config' into feat/refactor_trainer

* smoke test pass

* ut testing

* pre-commit passed

* Merge branch 'master' into feat/refactor_config

# Conflicts:
#	modelscope/models/nlp/bert/document_segmentation.py
#	modelscope/pipelines/nlp/__init__.py
#	modelscope/pipelines/nlp/document_segmentation_pipeline.py

* merge master

* unfinished

* Merge branch 'feat/fix_bug_pipeline_name' into feat/refactor_config

* fix bug

* fix ut bug

* support ner batch inference

* fix ut bug

* fix bug

* support batch inference on three nlp tasks

* unfinished

* fix bug

* fix bug

* Merge branch 'master' into feat/refactor_config

# Conflicts:
#	modelscope/models/base/base_model.py
#	modelscope/pipelines/nlp/conversational_text_to_sql_pipeline.py
#	modelscope/pipelines/nlp/dialog_intent_prediction_pipeline.py
#	modelscope/pipelines/nlp/dialog_modeling_pipeline.py
#	modelscope/pipelines/nlp/dialog_state_tracking_pipeline.py
#	modelscope/pipelines/nlp/document_segmentation_pipeline.py
#	modelscope/pipelines/nlp/faq_question_answering_pipeline.py
#	modelscope/pipelines/nlp/feature_extraction_pipeline.py
#	modelscope/pipelines/nlp/fill_mask_pipeline.py
#	modelscope/pipelines/nlp/information_extraction_pipeline.py
#	modelscope/pipelines/nlp/named_entity_recognition_pipeline.py
#	modelscope/pipelines/nlp/sentence_embedding_pipeline.py
#	modelscope/pipelines/nlp/summarization_pipeline.py
#	modelscope/pipelines/nlp/table_question_answering_pipeline.py
#	modelscope/pipelines/nlp/text2text_generation_pipeline.py
#	modelscope/pipelines/nlp/text_classification_pipeline.py
#	modelscope/pipelines/nlp/text_error_correction_pipeline.py
#	modelscope/pipelines/nlp/text_generation_pipeline.py
#	modelscope/pipelines/nlp/text_ranking_pipeline.py
#	modelscope/pipelines/nlp/token_classification_pipeline.py
#	modelscope/pipelines/nlp/word_segmentation_pipeline.py
#	modelscope/pipelines/nlp/zero_shot_classification_pipeline.py
#	modelscope/trainers/nlp_trainer.py

* pre-commit passed

* fix bug

* Merge branch 'master' into feat/refactor_config

# Conflicts:
#	modelscope/preprocessors/__init__.py

* fix bug

* fix bug

* fix bug

* fix bug

* fix bug

* fixbug

* pre-commit passed

* fix bug

* fixbug

* fix bug

* fix bug

* fix bug

* fix bug

* self review done

* fixbug

* fix bug

* fix bug

* fix bugs

* remove sub-token offset mapping

* fix name bug

* add some tests

* 1. support batch inference of text-generation,text2text-generation,token-classification,text-classification 2. add corresponding UTs

* add old logic back

* tmp save

* add tokenize by words logic back

* move outputs file back

* revert veco token-classification back

* fix typo

* Fix description

* Merge commit '4dd99b8f6e4e7aefe047c68a1bedd95d3ec596d6' into feat/refactor_config

* Merge branch 'master' into feat/refactor_config

# Conflicts:
#	modelscope/pipelines/builder.py
2022-11-30 23:52:17 +08:00
yuze.zyz
fde8644883 Fix a bug where the log file records the lr as zero instead of its actual value
This bug, reported by a user, is caused by float rounding when key-value pairs are saved to log files.
The fix removes the rounding of all values rather than only the lr value, since special-casing lr seemed too specific.
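
The effect of the rounding described above is easy to reproduce. In this sketch the log entry, the five-decimal precision, and the JSON line format are all assumptions made for illustration; they are not taken from the repository's logging code.

```python
import json

log_entry = {"epoch": 1, "lr": 2e-06, "loss": 0.123456789}

# Rounding every float to a fixed number of places (5 here, an assumed value)
# silently turns a small learning rate into zero.
rounded = {
    key: round(value, 5) if isinstance(value, float) else value
    for key, value in log_entry.items()
}

rounded_line = json.dumps(rounded)   # what a rounding logger would write
raw_line = json.dumps(log_entry)     # what is written once rounding is removed

print(rounded_line)
print(raw_line)
```

Writing the raw values keeps the learning rate recoverable from the log, at the cost of longer float representations for the other fields.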

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10684029
2022-11-30 21:59:02 +08:00
xiangpeng.wxp
2536f9ec9b [to #42322933] add en-zh en-es es-en base translation models
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10895782

* add en-zh/en-es/es-en base machine translation models
2022-11-29 13:44:06 +08:00
mulin.lyh
90a5efa1c2 [to #46106568]feat: parallel run ci case 2022-11-17 08:51:23 +08:00
xiangpeng.wxp
d6ea41fb70 [to #42322933] solve memory error for translation finetune
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10713843

    * [to #42322933] solve memory error for translation finetune
2022-11-14 20:31:29 +08:00
yingda.chen
4e4faa9a30 specify file encoding when opening text files for reading
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10708723
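
The kind of change this commit makes can be sketched as follows. The temporary file and sample text are made up for the example; the point is only that passing `encoding` explicitly to `open` makes the read independent of the platform's default locale encoding.

```python
import os
import tempfile

text = "模型 model"

# Write a small UTF-8 file to read back.
with tempfile.NamedTemporaryFile(
        "w", encoding="utf-8", suffix=".txt", delete=False) as f:
    f.write(text)
    path = f.name

# Without encoding=..., open() falls back to the locale's preferred encoding,
# which may not be UTF-8 on every platform.
with open(path, encoding="utf-8") as f:
    loaded = f.read()

os.remove(path)
print(loaded == text)
```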
2022-11-14 14:16:08 +08:00
huizheng.hz
2fe5203571 reduce image denoising test time
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10712244
2022-11-14 12:30:36 +08:00
shuying.shu
085acc64c8 fix bug and change unittest mode
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10680402
2022-11-10 13:09:56 +08:00
hemu.zp
0f0fdcae6f [to #42322933] Fix bug for mplug evaluation
Fix mplug evaluation using the wrong metrics, move some Chinese text-processing code into utils, and add a trainer for mplug.
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10612875
2022-11-08 17:58:03 +08:00