Commit Graph

1197 Commits

Author SHA1 Message Date
wenmeng.zwm
ed859e5274 Title: merge master-github and fix conflict
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11370549
2023-01-10 14:03:08 +08:00
tanfan.zjh
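62e575a376 FAQ question-answering model: support finetune and multiple languages
- FAQ question-answering model supports finetune
- FAQ question-answering model supports multiple languages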
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11343276
2023-01-10 13:59:40 +08:00
guangpan.cd
03cce308c7 Integrate StableDiffusionPipeline from diffusers into MaaS-lib
Integrate `StableDiffusionPipeline` from [`diffusers`](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py) into MaaS-lib.
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11344883
2023-01-10 13:54:52 +08:00
wenmeng.zwm
9ce750f4a9 merge master-github and fix conflict 2023-01-10 11:12:37 +08:00
maojialiang.mjl
73066fe04c [to #42322933] Add cv-bnext-image-classification-pipeline to maas lib
add binary quantization classification to maas lib
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11257734
2023-01-10 07:42:24 +08:00
xingjun.wxj
43edddd31f [to #42322933] msdataset module refactor and add 1230 features
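1. Optimize the local dataset loading path
2. Decouple local and remote, so the SDK can also be used without network access
3. Upgrade hf datasets and its related dependencies to the latest version (2.7.0+)
4. Fix the issue where metadata did not detect changes to data files
5. Layered system design
6. Local cache management issues
7. Improve the error log output
8. Support streaming load
* a. Support streaming for data files in zip format
* b. Support iteration over datasets in Image/Text/Audio/Biodata and similar formats
* c. Stay compatible with streaming load for legacy datasets whose training data is stored in meta
* d. Support streaming load for data files in folder format

9. Further standardize how finetune tasks are chained together
* a. Avoid usages such as to_hf_dataset; wrap the commonly used tf-related funcs
* b. Remove some logic that was mixed with hf and wrap it uniformly inside MsDataset

10. Optimizations for very large datasets
* a. list oss objects: fetch the csv mapping in meta directly, with no list_oss_objects API call (implemented in an earlier commit)
* b. Improve loading when the STS credentials expire (implemented in an earlier commit)

11. Support dataset_name input in the form namespace/dataset_name

Reference Aone link: https://aone.alibaba-inc.com/v2/project/1162242/task/46262894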
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11264406
2023-01-10 07:01:34 +08:00
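Note on item 11 above: the `namespace/dataset_name` input form can be illustrated with a small parser. This is only a sketch under stated assumptions, not the SDK's actual code: the helper name `split_dataset_name` and the default namespace value are hypothetical.

```python
from typing import Tuple

# Hypothetical default; the namespace the real SDK falls back to may differ.
DEFAULT_NAMESPACE = "modelscope"


def split_dataset_name(dataset_name: str,
                       default_namespace: str = DEFAULT_NAMESPACE) -> Tuple[str, str]:
    """Split 'namespace/dataset_name' into a (namespace, name) pair.

    A bare 'dataset_name' falls back to the default namespace.
    """
    parts = dataset_name.strip().split("/")
    if len(parts) == 1 and parts[0]:
        return default_namespace, parts[0]
    if len(parts) == 2 and all(parts):
        return parts[0], parts[1]
    raise ValueError(
        f"expected 'name' or 'namespace/name', got {dataset_name!r}")


# Both input forms resolve to an explicit (namespace, name) pair.
print(split_dataset_name("tany0699/cats_and_dogs"))
print(split_dataset_name("cats_and_dogs"))
```

Rejecting anything with more than one slash keeps ambiguous inputs from being silently misread as a namespace.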
liaojie.laj
fcf6e6431f submit video frame interpolation model
Add a video frame interpolation model
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11188339
2023-01-10 06:57:19 +08:00
ryan.yy
c77213d919 Add image face-swap model to MaaS
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11347556
2023-01-10 05:45:55 +08:00
huizheng.hz
340a14a456 save a video with h264 vcodec for video_super_resolution pipeline
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11327516
2023-01-09 22:23:55 +08:00
yuze.zyz
672a4ba107 Refactor tinynas objectdetection & img-classification
Refactor tinynas model & pipeline:
1. Move preprocess method out of model to image.py
2. Pipeline calls the model.__call__ method instead of inference method
3. Remove some obsolete code
4. Add a default preprocessor to preprocessor.py instead of changing the config in modelhub.
5. Standardize the return value of model

Refactor general image classification pipeline:
1. Change the preprocessor build method of ofa to avoid dependencies between multi-modal and cv.
2. Move preprocess method out of pipeline to image.py
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11185418
2023-01-09 21:33:42 +08:00
zhicheng.sc
2cb89609f0 Add video stabilization model
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11204574

* add video stabilization model
2023-01-09 21:23:26 +08:00
yuze.zyz
e6320f29d3 Small features:
1. Exporting: Support text-classification of bert and tensorflow2.0 models, test cases have been added.
2. Downloading in preprocessor.from_pretrained now ignores some large files that are not needed, selected by file extension.
3. Move sentence-piece-preprocessor to the subclass of text-generation-preprocessor and keep the original name for compatibility.
4. Remove some unused code in nlp-trainer and trainer.
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11206922
2023-01-09 21:22:07 +08:00
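Note on item 2 above: skipping files by extension before download could look roughly like the following. This is a sketch only; the function name `files_to_download` and the extension list are hypothetical and do not reflect which files the library actually ignores.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Set

# Hypothetical list of extensions treated as "large files a preprocessor
# does not need"; the real list in the library may differ.
IGNORED_EXTENSIONS: Set[str] = {".bin", ".pt", ".ckpt", ".h5"}


def files_to_download(file_names: Iterable[str],
                      ignored_extensions: Set[str] = IGNORED_EXTENSIONS) -> List[str]:
    """Return the file names whose extension is not in the ignore set."""
    return [
        name for name in file_names
        if PurePosixPath(name).suffix.lower() not in ignored_extensions
    ]


repo_files = ["configuration.json", "pytorch_model.bin", "vocab.txt"]
print(files_to_download(repo_files))
```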
jiaqi.sjq
453ff1dae3 [to #42322933] support byte input feature and refine fp implementations
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11338137
2023-01-09 20:56:52 +08:00
yichang.zyc
d78fe495ee add structure tasks: sudoku & text2sql
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11314581
2023-01-09 15:53:49 +08:00
chong.zhang
f56742b9af add new asr model_id: speech_UniASR_asr_2pass-pt-16k-common-vocab1617-tensorflow1-offline, speech_UniASR_asr_2pass-pt-16k-com...
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11311291
2023-01-09 09:32:37 +08:00
hemu.zp
871b345e79 [to #42322933] GPT-3 model supports batch input
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11322820
2023-01-09 09:31:44 +08:00
james.wjg
c0c14177bc Add a trainer unit test
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11277400
2023-01-09 07:38:38 +08:00
lanjinpeng.ljp
b2a78b5ad0 Support video multi-object tracking model
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11249098
2023-01-09 07:11:15 +08:00
zhangyanzhao.zyz
8f5dc7aea4 Text embedding pipeline supports source-sentences-only input; add UTs for the medical/ecom domain semantic relevance and text embedding models.
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11283935
2023-01-09 07:02:56 +08:00
wenmeng.zwm
68534bf554 Revert "skip unifold ut due to MMseqs2 api error"
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11334131
2023-01-09 06:56:30 +08:00
ziyuan.tw
9552a8533e add ConvNeXt model
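Add the ConvNeXt model and fix a bug: the model requires BGR input images, but the image-loading code outputs RGB by default, which made the normalization step in preprocessing incorrect and reduced model accuracy.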
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11192762
2023-01-09 06:56:05 +08:00
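Note on the bug described above: per-channel normalization silently goes wrong when the channel order of the image does not match the order of the statistics. The sketch below illustrates the effect on a single pixel; the mean and standard deviation values are hypothetical placeholders, not the ones used by the ConvNeXt model in this commit.

```python
from typing import Sequence, Tuple

# Hypothetical per-channel statistics, listed in B, G, R order.
MEAN_BGR = (103.5, 116.3, 123.7)
STD_BGR = (57.4, 57.1, 58.4)


def rgb_to_bgr(pixel: Sequence[float]) -> Tuple[float, float, float]:
    """Reverse the channel order of a single (R, G, B) pixel."""
    return (pixel[2], pixel[1], pixel[0])


def normalize(pixel: Sequence[float],
              mean: Sequence[float],
              std: Sequence[float]) -> Tuple[float, ...]:
    """Apply (value - mean) / std channel by channel."""
    return tuple((v - m) / s for v, m, s in zip(pixel, mean, std))


rgb_pixel = (200.0, 120.0, 40.0)  # as produced by an RGB image reader

# Bug: RGB data normalized with BGR statistics (red paired with the blue mean).
wrong = normalize(rgb_pixel, MEAN_BGR, STD_BGR)
# Fix: convert to BGR first so each channel meets its own statistics.
right = normalize(rgb_to_bgr(rgb_pixel), MEAN_BGR, STD_BGR)

print(wrong)
print(right)
```

Only the green channel agrees between the two results, which is why the error shows up as an accuracy drop rather than a crash.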
wenmeng.zwm
8f6a0f64e2 add support for eval configuration and fix logger problem
1. Add configuration support for gpu_collect and cache_dir (cache_dir is used when results are gathered on CPU). Configuration example:

```json
"evaluation": {
    "gpu_collect":  false,
    "cache_dir": "path/to/your/local/cache"
}
```

2. Fix the missing log file when log_file is passed to get_logger, and add log_file for the trainer
3. Automatically create work_dir in the rank-0 worker
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11342068

    * add support for configuration for tmpdir and gpu_collect
2023-01-09 02:51:35 +08:00
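Note on the `evaluation` options in this commit: the fragment shown in the message needs enclosing braces before it parses as a JSON document. The sketch below reads the two keys from such a document. The function name `read_evaluation_options`, the default of `False` for `gpu_collect`, and the temp-directory fallback are assumptions made for illustration, not the trainer's actual behavior.

```python
import json
import tempfile
from typing import Any, Dict


def read_evaluation_options(config_text: str) -> Dict[str, Any]:
    """Extract gpu_collect and cache_dir from a JSON configuration string."""
    cfg = json.loads(config_text)
    evaluation = cfg.get("evaluation", {})
    # Assumed default: gather results on CPU unless gpu_collect is set.
    gpu_collect = bool(evaluation.get("gpu_collect", False))
    cache_dir = evaluation.get("cache_dir")
    if not gpu_collect and cache_dir is None:
        # Hypothetical fallback: CPU gathering needs some directory to write to.
        cache_dir = tempfile.gettempdir()
    return {"gpu_collect": gpu_collect, "cache_dir": cache_dir}


example = """
{
    "evaluation": {
        "gpu_collect": false,
        "cache_dir": "path/to/your/local/cache"
    }
}
"""
print(read_evaluation_options(example))
```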
mulin.lyh
aa541468d1 temp skip tests/pipelines/test_video_depth_estimation.py for download demo_kitti.gif failed 2023-01-08 10:28:35 +08:00
xuangen.hlh
db654dab5b fix error of 'unexpected keyword argument device_id'
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11347364
2023-01-07 03:25:54 +08:00
mulin.lyh
53a9342a29 skip tests/trainers/test_dialog_modeling_trainer.py 2023-01-06 14:41:49 +08:00
mulin.lyh
1ec601aea2 skip tests/trainers/test_dialog_intent_trainer.py for list model file 500 error 2023-01-06 09:15:50 +08:00
ni.chongjia
ac53ce3e36 modify format of itn_pipeline
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11257394

    * dev for asr itn inference pipeline

* add task interface

* add pipeline input

* add modelscope/pipelines/audio/itn_inference_pipeline.py

* add modelscope/pipelines/audio/itn_inference_pipeline.py

* modelscope/pipelines/audio/itn_inference_pipeline.py

* update modelscope/pipelines/audio/itn_inference_pipeline.py

* modify itn_inference_pipeline.py

* modify itn_inference_pipeline.py

* modify itn_inference_pipeline.py

* remove itn.py

* modify some names

* add modify itn_inference_pipeline.py

* modify itn_inference_pipeline.py

* modify itn_inference_pipeline.py

* modify itn_inference_pipeline.py

* modify itn

* add tests/pipelines/test_inverse_text_processing.py

* modify asr_inference_pipeline.py for the original files

* modify format

* add commits files

* Merge remote-tracking branch 'origin' into remotes/origin/asr/itn_nichongjia

* Merge remote-tracking branch 'origin' into remotes/origin/asr/itn_nichongjia

* modify the pipelines

* Merge branch 'master' into remotes/origin/asr/itn_nichongjia

* [to #47031187]fix: hub test suites can not parallel 
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11276872

    * [to #47031187]fix: hub test suites can not parallel

* google style docs and selected file generator 

ref: https://yuque.alibaba-inc.com/pai/rwqgvl/go8sc8tqzeqqfmsz
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11150212

    * google style docs and selected file generator

* merge

* Merge remote-tracking branch 'origin' into remotes/origin/asr/itn_nichongjia

* Merge branch 'master' into remotes/origin/asr/itn_nichongjia

* add requirements for fun_text_processing
2023-01-05 16:36:17 +08:00
kangxiaoyang.kxy
2260dd45fa 1230-image-colorization
submit new algorithm for image colorization and corresponding pipeline.
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11179627
2023-01-05 15:02:49 +08:00
kaisong.sks
2d68c6772a User satisfaction analysis inference
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11235766

    * add

* add

* Merge remote-tracking branch 'origin' into nlp_user-satisfaction-estimation

* add

* add

* add

* Merge branch 'master' into nlp_user-satisfaction-estimation

* add

* Merge branch 'master' into nlp_user-satisfaction-estimation

* add

* add

* Merge branch 'master' into nlp_user-satisfaction-estimation

* Merge branch 'master' into nlp_user-satisfaction-estimation

* Merge branch 'master' into nlp_user-satisfaction-estimation

* Merge remote-tracking branch 'origin' into nlp_user-satisfaction-estimation

* add

* Merge branch 'master' into nlp_user-satisfaction-estimation
2023-01-05 10:58:36 +08:00
changxu.ccx
60bd40742a [to #42322933] Add vldoc to maas lib
Test
```shell
python tests/pipelines/test_document_vl_embedding.py
```
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11199555
2023-01-04 19:11:30 +08:00
caorongyu.cry
72c39fb161 add space-t trainer
1. Add the fine-tuning flow
2. Add the evaluation flow
3. Link the dataset nlp_convai_text2sql_pretrain_cn_trainset
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11276053

    * add space-t trainer

* revise for trainer

* Merge branch 'master' into dev/tableqa_finetune

* revise for trainer

* Merge remote-tracking branch 'origin' into dev/tableqa_finetune
2023-01-04 09:46:37 +08:00
pangda
2dbc93a931 [to #42322933] add UT for chunking model
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11289061
2023-01-04 04:41:00 +08:00
mulin.lyh
ab07dc5b5a google style docs and selected file generator
ref: https://yuque.alibaba-inc.com/pai/rwqgvl/go8sc8tqzeqqfmsz
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11150212

    * google style docs and selected file generator
2023-01-03 16:27:29 +08:00
mulin.lyh
0675bd5c88 [to #47031187]fix: hub test suites can not parallel
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11276872

    * [to #47031187]fix: hub test suites can not parallel
2023-01-03 16:26:59 +08:00
bin.xue
0fdf37312f [to #42322933] feat:add speech separation pipeline
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11255740
2023-01-03 13:18:44 +08:00
dadong.gxd
01c498cd14 add cv_casmvs_multi-view-depth-esimation_general
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11204285
2023-01-03 08:24:41 +08:00
shouzhou.bx
4698051fa5 update task of hand detect
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11271647
2023-01-02 12:14:49 +08:00
james.wjg
a12ef720a6 Panoptic segmentation (easycv integration) supports finetune (mask2former-r50) - 12.30
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11179263
2023-01-02 11:27:11 +08:00
mulin.lyh
41cd220e01 temp skip failed case 2022-12-30 19:24:19 +08:00
lee.lcy
ed28b849eb [to #42322933] add domain specific object detection models
Add domain-specific object detection models.
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11265502
2022-12-30 14:19:16 +08:00
hejunjie.hjj
d560291525 [to #42322933] add maskdino model (1230)
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11187240
2022-12-30 07:43:57 +08:00
wenmeng.zwm
b8ec677739 add training args support and image classification finetune example
design doc: https://yuque.antfin.com/pai/rwqgvl/khy4uw5dgi39s6ke

usage:
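```python
# The first three imports and create_dataset() are not part of the original
# snippet; they are assumed here so that the example is self-contained.
from modelscope.metainfo import Trainers
from modelscope.msdatasets import MsDataset
from modelscope.trainers import build_trainer
from modelscope.trainers.training_args import (ArgAttr, MSArgumentParser,
                                               training_args)


def create_dataset(name, split):
    # Assumed helper: load a dataset given as 'namespace/dataset_name'.
    namespace, dataset_name = name.split('/')
    return MsDataset.load(dataset_name, namespace=namespace, split=split)


# Register extra arguments; cfg_node_name maps an argument to config nodes.
training_args.topk = ArgAttr(cfg_node_name=['train.evaluation.metric_options.topk',
                                            'evaluation.metric_options.topk'],
                             default=(1,), help='evaluation using topk, tuple format, eg (1,), (1,5)')
training_args.train_data = ArgAttr(type=str, default='tany0699/cats_and_dogs', help='train dataset')
training_args.validation_data = ArgAttr(type=str, default='tany0699/cats_and_dogs', help='validation dataset')
training_args.model_id = ArgAttr(type=str, default='damo/cv_vit-base_image-classification_ImageNet-labels', help='model name')

# Parse the command line; cfg_dict holds the values to merge into the config.
parser = MSArgumentParser(training_args)
cfg_dict = parser.get_cfg_dict()
args = parser.args

train_dataset = create_dataset(args.train_data, split='train')
val_dataset = create_dataset(args.validation_data, split='validation')


def cfg_modify_fn(cfg):
    # Overlay the parsed arguments on the model's default configuration.
    cfg.merge_from_dict(cfg_dict)
    return cfg


kwargs = dict(
    model=args.model_id,          # model id
    train_dataset=train_dataset,  # training dataset
    eval_dataset=val_dataset,     # validation dataset
    cfg_modify_fn=cfg_modify_fn   # callback to modify configuration
)

trainer = build_trainer(name=Trainers.image_classification, default_args=kwargs)
# start to train
trainer.train()
```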
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11225071
2022-12-30 07:35:15 +08:00
hemu.zp
fd0099c92d [to #42322933] Refactor megatron-util
Rename the imported library 'megatron' to 'megatron_util' and show users an error message when the import fails.

Use initialize_megatron as a unified initialization entry in megatron-util, which can accept configuration input of ConfigDict in MaaS-lib.

Wrap the initialization process into the utils/megatron_utils.py file, add default parameters for the existing large model to be compatible with the uploaded configuration file.

The version of megatron_cfg currently supports v3 (default), v1 and moe.
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11231840
2022-12-29 15:01:45 +08:00
ly261666
6583e6f398 [to #42322933] Add FLIR Face Liveness Model
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11250177
2022-12-29 14:55:51 +08:00
pengteng.spt
cddebf567f add kws nearfield finetune
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11179425

* add kws nearfield finetune

* work on rank-0 only if evaluating

* split kaldi relevant code into runtime utils

* add evaluate but not files checking

* test evaluate on cpu

* add default value for cmvn_file
2022-12-29 10:14:41 +08:00
dadong.gxd
42557b0867 add cv_pointnet2_sceneflow-estimation_general
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11201880
2022-12-29 08:09:57 +08:00
yeqinghao.yqh
f7a7504782 Add HiTeA model for VideoQA and Caption (12.30)
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11201652
2022-12-29 08:06:34 +08:00
hemu.zp
f58060b140 [to #42322933] add GPT-2 model
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11194200
2022-12-29 07:59:40 +08:00
lee.lcy
e8a354d226 [to #42322933] add real-time human detection model
add real-time human detection model
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11191113
2022-12-29 07:58:45 +08:00
yichang.zyc
0c79b57fcc support batch infer
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11170755
2022-12-28 12:17:36 +08:00