1070 Commits

Author SHA1 Message Date
ly119399
2f17daa23f [to #42322933] reduce the GPU usage of dialog trainer
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10955485
2022-12-02 17:32:26 +08:00
suluyan.sly
2863a8f7fa [to #42322933] fix hook.__init__
Link: https://code.aone.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10957489

* fix hook.__init__
2022-12-02 17:09:06 +08:00
yuze.zyz
348e87e697 change sequence_length to max_length
For consistency with the other tokenizing args, rename sequence_length to max_length, while keeping the input args compatible with the old 'sequence_length' arg.
2022-12-02 16:57:09 +08:00
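A rename like the one in 348e87e697 is usually done with a small compatibility shim. The following is only a minimal sketch of that pattern, not the code from the commit: the helper name `resolve_max_length`, the default of 128, and the warning text are all assumptions made for illustration.

```python
import warnings


def resolve_max_length(max_length=None, default=128, **kwargs):
    """Return (effective max_length, remaining kwargs).

    Hypothetical helper: honors the legacy 'sequence_length' key when the
    caller has not passed the new 'max_length' argument.
    """
    # Always remove the legacy key so it is not forwarded to the tokenizer.
    legacy = kwargs.pop('sequence_length', None)
    if legacy is not None:
        warnings.warn("'sequence_length' is deprecated; use 'max_length'",
                      DeprecationWarning, stacklevel=2)
        # The new argument wins if both are supplied.
        if max_length is None:
            max_length = legacy
    if max_length is None:
        max_length = default  # assumed default, for illustration only
    return max_length, kwargs
```

Called as `resolve_max_length(sequence_length=64)`, the sketch returns the legacy value under the new name and strips the old key from the remaining arguments, so downstream code only ever sees `max_length`.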
ly261666
4208d51e23 substitute face detection model in skin_retouching_pipeline.py
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10909902
2022-12-02 15:41:08 +08:00
wenmeng.zwm
c9a6b887a2 add tensorboard hook for visualization
1. add tensorboard hook to default config
2. add image visualization support to tensorboard hook and trainer
3. move evaluation logic out of single_gpu_test and multi_gpu_test to make prediction results available for further processing such as result saving and visualization.

visualization results are as follows:
![image.png](https://cn-hangzhou.oss-cdn.aliyun-inc.com/git/force/uploads/comment/29212/38448470860386707/image.png)
![image.png](https://cn-hangzhou.oss-cdn.aliyun-inc.com/git/force/uploads/comment/29212/38437794200606734/image.png)
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10894813
2022-12-02 15:13:24 +08:00
ziyuan.tw
31316b8d29 add nextvit-small_image-classification_Dailylife-labels model
Support the model newly launched on 11/30.
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10886253
2022-12-02 14:46:49 +08:00
ly119399
5ae1e08db6 [to #42322933] fix bug of tableQA on gpu
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10943053
2022-12-02 10:38:30 +08:00
zhangzhicheng.zzc
a318f27247 [to #42322933] speed up the ast indexing during editing
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10907357
2022-12-02 10:06:24 +08:00
yuze.zyz
0e4766f41d Fix bugs in testlevel1 & 2
1. Fix: ws regression failed.
2. Fix: label2id missing in text_classification_pipeline when preprocessor is passed in through args.
3. Fix: remove obsolete imports
4. Fix: incomplete modification
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10936431
2022-12-01 21:16:55 +08:00
rujiao.lrj
9d8eb5b0b3 support license plate detection
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10917315
2022-12-01 19:48:06 +08:00
mulin.lyh
f663f420c4 [to #46480415]feat: ci command custom support regression case run all case in subprocess
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10936241
2022-12-01 19:33:25 +08:00
suluyan.sly
1394019102 [to #42322933] plug finetune
plug finetune: regression-tested on the DuReader-robust dataset, reaching the best result
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10916382
2022-12-01 19:31:15 +08:00
james.wjg
9b3a92e65d add finetune to cv/language_guided_video_summarization
add finetune to cv/language_guided_video_summarization
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10790262
2022-12-01 19:16:56 +08:00
lllcho.lc
b8dba17543 [to #42322933] action-detection model predownload video before inference
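1. Download the video before the model processes it, so that network jitter cannot make ffmpeg fail to read the remote video and in turn fail the model run
2. Refine the control parameters used during model inference
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10906373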
2022-12-01 18:13:08 +08:00
mulin.lyh
7039e93c99 skip temp failed case
2022-12-01 16:50:09 +08:00
yuze.zyz
bb5512d1ab [to #42322933] Refactor NLP and address some user feedback
1. Abstract keys of dicts needed by nlp metric classes into the init method
2. Add Preprocessor.save_pretrained to save preprocessor information
3. Abstract the config saving function, so that saving works normally both when from_pretrained is called directly and when the cfg is modified step by step during training.
4. Remove SbertTokenizer and VecoTokenizer, use transformers' tokenizers instead
5. Use model/preprocessor's from_pretrained in all nlp pipeline classes.
6. Add model_kwargs and preprocessor_kwargs in all nlp pipeline classes
7. Add base classes for fill-mask and text-classification preprocessor, as a demo for later changes
8. Fix user feedback: Re-train the model in the continued-training scenario
9. Fix user feedback: Too many checkpoints saved
10. Simplify the nlp-trainer
11. Fix user feedback: Split the default trainer's __init__ method, making it easier for users to override
12. Add safe_get to Config class

----------------------------  Another refactor from version 36 -------------------------

13. Rename all nlp transformers' preprocessors from TaskNamePreprocessor to TaskNameTransformersPreprocessor, for example:
      TextClassificationPreprocessor -> TextClassificationTransformersPreprocessor
14. Add a base class per task for all nlp tasks' preprocessors which have at least two sub-preprocessors
15. Add output classes of nlp models
16. Refactor the logic for token-classification
17. Fix bug: checkpoint_hook does not support pytorch_model.pt
18. Fix bug: Pipeline name does not match the task name, so inference will not succeed after training
       NOTE: This is just a stopgap solution; the root cause is the uncertain relationship between models and pipelines
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10723513

* add save_pretrained to preprocessor

* save preprocessor config in hook

* refactor label-id mapping fetching logic

* test ok on sentence-similarity

* run on finetuning

* fix bug

* pre-commit passed

* fix bug

* Merge branch 'master' into feat/refactor_config

# Conflicts:
#	modelscope/preprocessors/nlp/nlp_base.py

* add params to init

* 1. support max ckpt num 2. support ignoring others but bin file in continue training 3. add arguments to some nlp metrics

* Split trainer init impls to overridable methods

* remove some obsolete tokenizers

* unfinished

* support input params in pipeline

* fix bugs

* fix ut bug

* fix bug

* fix ut bug

* fix ut bug

* fix ut bug

* add base class for some preprocessors

* Merge commit '379867739548f394d0fa349ba07afe04adf4c8b6' into feat/refactor_config

* compatible with old code

* fix ut bug

* fix ut bugs

* fix bug

* add some comments

* fix ut bug

* add a requirement

* fix pre-commit

* Merge commit '0451b3d3cb2bebfef92ec2c227b2a3dd8d01dc6a' into feat/refactor_config

* fixbug

* Support function type in registry

* fix ut bug

* fix bug

* Merge commit '5f719e542b963f0d35457e5359df879a5eb80b82' into feat/refactor_config

# Conflicts:
#	modelscope/pipelines/nlp/multilingual_word_segmentation_pipeline.py
#	modelscope/pipelines/nlp/named_entity_recognition_pipeline.py
#	modelscope/pipelines/nlp/word_segmentation_pipeline.py
#	modelscope/utils/hub.py

* remove obsolete file

* rename init args

* rename params

* fix merge bug

* add default preprocessor config for ner-model

* move a method to a util file

* remove unused config

* Fix a bug in pbar

* bestckptsaver:change default ckpt numbers to 1

* 1. Add assert to max_epoch 2. split init_dist and get_device 3. change cmp func name

* Fix bug

* fix bug

* fix bug

* unfinished refactoring

* unfinished

* uw

* uw

* uw

* uw

* Merge branch 'feat/refactor_config' into feat/refactor_trainer

# Conflicts:
#	modelscope/preprocessors/nlp/document_segmentation_preprocessor.py
#	modelscope/preprocessors/nlp/faq_question_answering_preprocessor.py
#	modelscope/preprocessors/nlp/relation_extraction_preprocessor.py
#	modelscope/preprocessors/nlp/text_generation_preprocessor.py

* uw

* uw

* unify nlp task outputs

* uw

* uw

* uw

* uw

* change the order of text cls pipeline

* refactor t5

* refactor tg task preprocessor

* fix

* unfinished

* temp

* refactor code

* unfinished

* unfinished

* unfinished

* unfinished

* uw

* Merge branch 'feat/refactor_config' into feat/refactor_trainer

* smoke test pass

* ut testing

* pre-commit passed

* Merge branch 'master' into feat/refactor_config

# Conflicts:
#	modelscope/models/nlp/bert/document_segmentation.py
#	modelscope/pipelines/nlp/__init__.py
#	modelscope/pipelines/nlp/document_segmentation_pipeline.py

* merge master

* unfinished

* Merge branch 'feat/fix_bug_pipeline_name' into feat/refactor_config

* fix bug

* fix ut bug

* support ner batch inference

* fix ut bug

* fix bug

* support batch inference on three nlp tasks

* unfinished

* fix bug

* fix bug

* Merge branch 'master' into feat/refactor_config

# Conflicts:
#	modelscope/models/base/base_model.py
#	modelscope/pipelines/nlp/conversational_text_to_sql_pipeline.py
#	modelscope/pipelines/nlp/dialog_intent_prediction_pipeline.py
#	modelscope/pipelines/nlp/dialog_modeling_pipeline.py
#	modelscope/pipelines/nlp/dialog_state_tracking_pipeline.py
#	modelscope/pipelines/nlp/document_segmentation_pipeline.py
#	modelscope/pipelines/nlp/faq_question_answering_pipeline.py
#	modelscope/pipelines/nlp/feature_extraction_pipeline.py
#	modelscope/pipelines/nlp/fill_mask_pipeline.py
#	modelscope/pipelines/nlp/information_extraction_pipeline.py
#	modelscope/pipelines/nlp/named_entity_recognition_pipeline.py
#	modelscope/pipelines/nlp/sentence_embedding_pipeline.py
#	modelscope/pipelines/nlp/summarization_pipeline.py
#	modelscope/pipelines/nlp/table_question_answering_pipeline.py
#	modelscope/pipelines/nlp/text2text_generation_pipeline.py
#	modelscope/pipelines/nlp/text_classification_pipeline.py
#	modelscope/pipelines/nlp/text_error_correction_pipeline.py
#	modelscope/pipelines/nlp/text_generation_pipeline.py
#	modelscope/pipelines/nlp/text_ranking_pipeline.py
#	modelscope/pipelines/nlp/token_classification_pipeline.py
#	modelscope/pipelines/nlp/word_segmentation_pipeline.py
#	modelscope/pipelines/nlp/zero_shot_classification_pipeline.py
#	modelscope/trainers/nlp_trainer.py

* pre-commit passed

* fix bug

* Merge branch 'master' into feat/refactor_config

# Conflicts:
#	modelscope/preprocessors/__init__.py

* fix bug

* fix bug

* fix bug

* fix bug

* fix bug

* fixbug

* pre-commit passed

* fix bug

* fixbug

* fix bug

* fix bug

* fix bug

* fix bug

* self review done

* fixbug

* fix bug

* fix bug

* fix bugs

* remove sub-token offset mapping

* fix name bug

* add some tests

* 1. support batch inference of text-generation,text2text-generation,token-classification,text-classification 2. add corresponding UTs

* add old logic back

* tmp save

* add tokenize by words logic back

* move outputs file back

* revert veco token-classification back

* fix typo

* Fix description

* Merge commit '4dd99b8f6e4e7aefe047c68a1bedd95d3ec596d6' into feat/refactor_config

* Merge branch 'master' into feat/refactor_config

# Conflicts:
#	modelscope/pipelines/builder.py
2022-11-30 23:52:17 +08:00
qianmu.ywh
bca6da3b56 update pipeline according to online demo requirements
Per the requirements of the online demo front end, output an additional color image for display
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10926624
2022-11-30 22:19:11 +08:00
yuze.zyz
fde8644883 Fix a bug where the log file records the lr as zero instead of its correct value
This bug, reported by a user, is caused by float rounding when saving key-value pairs to log files.
The fix removes the rounding of all values rather than only the lr value, since special-casing lr seemed too specific.

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10684029
2022-11-30 21:59:02 +08:00
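The rounding failure described in fde8644883 is easy to reproduce in isolation. The sketch below is an illustration only: the function name `format_log`, the 4-digit precision, and the sample values are assumptions, not the logger code touched by the commit.

```python
import json


def format_log(values, ndigits=None):
    """Serialize a dict of metrics; round floats only if ndigits is given."""
    out = {}
    for key, val in values.items():
        if ndigits is not None and isinstance(val, float):
            val = round(val, ndigits)
        out[key] = val
    return json.dumps(out)


record = {'lr': 1e-05, 'loss': 0.73215}  # sample values, assumed
rounded = format_log(record, ndigits=4)  # a small lr collapses to 0.0
unrounded = format_log(record)           # lr keeps its actual value
```

Any learning rate below half of the last retained digit is written as zero once rounding is applied, which matches the symptom in the commit message. Dropping the rounding for every value avoids a special case for one key.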
wenmeng.zwm
a4e6c5226c remove get_pipeline_by_model_name
* remove some logic which may result in strange error when get hub info failed

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10924091
2022-11-30 21:53:02 +08:00
wenmeng.zwm
4dd99b8f6e Revert "move opencv dependency from framwork to cv "
This reverts commit e970a6eb43.
2022-11-30 18:29:03 +08:00
xiangpeng.wxp
2c4dc8c660 [to #42322933] nlp csanmt translation fix finetuning bug
nlp csanmt translation fix finetuning bug
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10923166

* [to #42322933] nlp csanmt translation fix finetuning bug
2022-11-30 17:49:55 +08:00
jiangyu.xzy
9bfc77c178 support asr new models
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10919277

* support new asr paraformer model

* support asr conformer model
2022-11-30 17:08:35 +08:00
qianmu.ywh
cc27e3a25e update pipeline according to online demo requirements
Per the requirements of the online demo front end, change the output to a single image in numpy format
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10912907
2022-11-30 11:53:40 +08:00
hemu.zp
cdb485b554 [to #42322933] Fix bug for DistributedPipeline
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10913762
2022-11-30 11:51:35 +08:00
jerry.lp
177d70829b add gpt-moe model for modelscope pipeline inference
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10836131
2022-11-29 20:54:32 +08:00
shuying.shu
9229a9b12b fix interpolate value error for vitadapter semantic segmentation
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10894248
2022-11-29 17:46:03 +08:00
shuying.shu
6baf602bc2 adjust input and output format for demo service
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10873454
2022-11-29 13:57:09 +08:00
xiangpeng.wxp
2536f9ec9b [to #42322933] add en-zh en-es es-en base translation models
* add en-zh en-es es-en base translation models
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10895782

* add English-Chinese/English-Spanish/Spanish-English base machine translation models
2022-11-29 13:44:06 +08:00
wenmeng.zwm
64516eb734 Merge branch merge_master_github_1128 into master
Title: merge master github to gitlab 

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10884192
2022-11-29 11:55:54 +08:00
yichang.zyc
ebb9636179 fix unnecessary init and optimize the vqa preprocessor
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10868091
2022-11-28 23:12:23 +08:00
xingjun.wxj
1878500cb4 [to #42322933] fix log print and extensions issue for datasets==2.5.2
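1. In the init of ExternalDataset, importing _EXTENSION_TO_MODULE from the datasets package causes version compatibility problems; for example, version 2.5.2 changed the data structure and is incompatible with older versions;
2. Skip printing logger.error for some cv datasets
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10893702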
2022-11-28 23:09:49 +08:00
wenmeng.zwm
261cbb78ce Merge branch 'master-gitlab' into merge_master_github_1128
2022-11-28 22:57:44 +08:00
mulin.lyh
2a8e653169 [to #46408569]fix: pipeline and trainer user-agent add not replacement.
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10890156

* [to #46408569]fix: pipeline and trainer user-agent add not replacement.
2022-11-28 19:45:58 +08:00
wenmeng.zwm
3b78421236 fix: torch.concat compatibility with torch1.8
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10885659
2022-11-28 19:24:34 +08:00
qianmu.ywh
fc6d0c64bc add image_depth_estimation: model, pipeline, test
Integrate the image depth estimation model; add model, pipeline, and test
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10857764
2022-11-28 18:00:48 +08:00
shiyi.zxh
b386a4ee50 adapt to different wav input
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10886461
2022-11-28 17:48:10 +08:00
mulin.lyh
a4c36a2920 [to #46273042]feat: pipeline trainer stat information from snapshot_download
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10837490
2022-11-28 15:06:03 +08:00
Yingda Chen
d5ee8aa66d move long-running test to level 2
2022-11-28 13:50:28 +08:00
wenmeng.zwm
b2dd4af2ae fix conflict
2022-11-28 13:34:54 +08:00
Yingda Chen
a82dbb8f97 Merge pull request #33 from modelscope/codegeex_code_translation
CodeGeex code translation and generation

ut failed due to a known run.py environment setup issue that is being fixed. nothing to do with the change itself.
2022-11-28 11:55:42 +08:00
Yingda Chen
355b1f336e Merge pull request #35 from pengzhendong/master
[pipelines] support wenet

note: the ut failure is due to a run.py environment setup issue that is being fixed. nothing to do with the change.
2022-11-28 11:52:32 +08:00
shichen.fsc
acb8d36699 [to #42322933] add extractive-summarization and topic-segmentation
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10856839
2022-11-25 19:29:02 +08:00
jiangyu.xzy
2b62084146 add funasr based asr inference
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10868583
2022-11-25 17:49:24 +08:00
xingjun.wxj
7b167861a4 [to #42322933] add features for alimeeting competition dataset
1. add ExternalDataset methods for csv/txt/json/jsonl files on the oss storage
2. add user-defined delimiter for csv in meta.
3. support internal datasets
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib
2022-11-25 17:48:19 +08:00
bin.xue
1969c3a1db test: add new demo data
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10872422
2022-11-25 17:31:59 +08:00
shuaigezhu
028551cd62 add code_generation files
2022-11-25 16:41:44 +08:00
shuaigezhu
c9064caa58 add code_generation
2022-11-25 16:35:19 +08:00
pengzhendong
02d2469e55 check wenetruntime
2022-11-25 15:59:28 +08:00
shiyi.zxh
7661470350 add asr task to ofa
add asr task inference to ofa
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10761019
2022-11-25 12:16:33 +08:00
shuaigezhu
65adde14d8 remove uttest
2022-11-25 11:55:53 +08:00