Commit Graph

65 Commits

Author SHA1 Message Date
tastelikefeet
b3808b7c20 Support kernels downloading (#1697) 2026-04-28 10:29:19 +08:00
tastelikefeet
918a808dc2 compat with tf5.0 (#1618) 2026-03-07 22:40:43 +08:00
tastelikefeet
dbbf268e70 Fix downloading repos in automap (#1630) 2026-03-02 11:33:50 +08:00
Xingjun.Wang
4f867bfe80 Fix: deprecate delete_repo, delete_model and delete_dataset due to token a… (#1588) 2026-01-12 11:51:46 +08:00
Xingjun.Wang
055496c597 Fix CI 2025-08-07 19:26:32 +08:00
suluyana
53ceca4df4 feat: sentence_embedding pipeline (#1435) 2025-08-06 15:43:36 +08:00
tastelikefeet
556aa76dc3 fix pyyaml according to: https://github.com/Anchor0221/CVE-2025-50460 (#1428) 2025-08-01 22:04:44 +08:00
co63oc
8323fc5185 Fix typos in multiple files (#1357) 2025-06-05 14:04:29 +08:00
tastelikefeet
a91f19ea54 Support downloading exact file for hf wrapper (#1323) 2025-04-30 14:57:59 +08:00
xingjun.wxj
10aa2c6bd8 add login for HFUtilTest 2025-03-27 22:42:48 +08:00
suluyana
57044b9c88 feat: compatible with hf_pipeline (#1221)
compatible with hf_pipeline
2025-02-21 15:49:39 +08:00
tastelikefeet
1cf7f4ff52 fix create_commit login (#1210) 2025-02-06 18:22:29 +08:00
tastelikefeet
f74433f6b2 Add more patches for hf (#1160) 2025-02-06 11:09:37 +08:00
tastelikefeet
96e33878b4 lint code (#970) 2024-09-02 10:16:35 +08:00
yuze.zyz
8b5a5bd1e3 fix 2024-09-01 18:37:01 +08:00
mulin.lyh
d30ef8b202 fix huggingface position_ids compatible issue
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/14406558
* fix compatible issues

* fix transformer compatible issue

* skip case for huggingface link issue

* fix hf autotokenizer case

* Merge branch 'fix_ci_issue' of http://gitlab.alibaba-inc.com/Ali-MaaS/MaaS-lib into fix_ci_issue
2023-10-24 15:18:55 +08:00
mulin.lyh
4ac59e29b0 refactor ci to analyze file dependency
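Import analysis does not need to actually import the files: a static scan collects the symbols defined in each file, and each import is then resolved to the file that defines the imported symbol, which establishes the association between files.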
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/14281794
* refactor ci

* fix typo

* fix bug

* test

* add indirect import name resolve

* remove test code
2023-10-13 14:11:30 +08:00
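
The static import analysis described in 4ac59e29b0 can be illustrated with a minimal sketch. This is not the code from that commit. It assumes Python's standard `ast` module, matches symbols by bare name only (ignoring the module path in the `from ... import` statement), and all file names and helper functions below are hypothetical.

```python
import ast


def defined_symbols(source: str) -> set[str]:
    """Collect top-level names a file defines, without importing it."""
    names = set()
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
    return names


def imported_symbols(source: str) -> set[str]:
    """Collect names brought in by `from ... import name` statements."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom):
            names.update(alias.name for alias in node.names)
    return names


def build_dependencies(sources: dict[str, str]) -> dict[str, set[str]]:
    """Map each file to the files that define the symbols it imports."""
    owner = {}
    for path, src in sources.items():
        for name in defined_symbols(src):
            owner.setdefault(name, path)  # first definition wins (simplification)
    deps = {}
    for path, src in sources.items():
        deps[path] = {
            owner[name]
            for name in imported_symbols(src)
            if name in owner and owner[name] != path
        }
    return deps


# Hypothetical in-memory "files"; a real run would read them from disk.
sources = {
    "pkg/tokenizer.py": "class Tokenizer:\n    pass\n",
    "pkg/pipeline.py": (
        "from pkg.tokenizer import Tokenizer\n\n"
        "def run():\n    return Tokenizer()\n"
    ),
    "tests/test_pipeline.py": "from pkg.pipeline import run\n",
}
deps = build_dependencies(sources)
```

With a map like `deps`, a diff-based CI could in principle select only the test files that depend, directly or transitively, on the files changed in a commit.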
mulin.lyh
23f1f474bf Merge branch 'master-github' into master-merge-github925
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/14164566
2023-09-26 21:15:41 +08:00
suluyan.sly
d7c2a91e2c swing deploy api
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/14103985
* tester

* bug fixed: download_file not import

* input example analyzer

* api: get_task_input_examples

* update chat

* 43 task

* fix decode_base64

* json format

* schema
2023-09-22 19:18:57 +08:00
Jintao
18d33a4825 fix copytree python37 bug (#464)
* fix copytree python37 bug

* add copytree_py37 function
2023-08-14 11:45:33 +08:00
suluyana
b68b90ba15 skip plugin 2023-07-30 00:30:30 +08:00
suluyana
9ece90ee84 skip plugin test case 2023-07-29 21:35:21 +08:00
wenmeng zhou
64203e89ee Compatibility for huggingface transformers (#391) 2023-07-24 20:53:27 +08:00
yuze.zyz
a58be34384 Add Lora/Adapter/Prompt and support for chatglm6B and chatglm2-6B
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/12770413

* add prompt and lora

* add adapter

* add prefix

* add tests

* adapter smoke test passed

* prompt test passed

* support model id in petl

* migrate chatglm6b

* add train script for chatglm6b

* move gen_kwargs to finetune.py

* add chatglm2

* add model definition
2023-06-27 14:38:18 +08:00
mulin.lyh
698c794070 [to #50537864]fix: fix select case issue
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13055098
2023-06-25 22:44:29 +08:00
hemu.zp
96c2d42f09 Add StreamingMixin
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/12445731
* StreamingMixin poc

* update design

* Merge branch 'master' into feat/StreamingMixin

* add docstring

* make postprocessor input consistent
2023-06-08 19:40:14 +08:00
mulin.lyh
7b14a0e11f Pipeline input, output and parameter normalization. 2023-05-11 11:20:01 +08:00
zhangzhicheng.zzc
04e8ddc41e fix update ast not remove origin information
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/12319197
2023-04-13 16:10:07 +08:00
hemu.zp
aa561a1818 Support split and merge for megatron_base model
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/12288423
2023-04-12 16:23:35 +08:00
yzhao
99e94bc2c2 Merge branch 'master-github' into master-merge-github20230310 2023-03-10 13:52:31 +08:00
zhangzhicheng.zzc
8a19e9645d [to #47860410]plugin with cli tool
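1. Support integrating external repos, GitHub repos, and local repos as plugins, with management of these external plugins
2. Support integrating modelhub repos via allow_remote; these count as models and get no additional plugin management
3. Support installing plugins through the CLI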

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11775456
2023-03-09 23:07:13 +08:00
chenxujun
20b3a679e7 Fix some words (#141) 2023-03-02 11:06:56 +08:00
wenmeng.zwm
677e49eaf3 update api doc
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11582587
2023-02-10 07:48:11 +00:00
zhangzhicheng.zzc
5c73ee9f6f skip ast update test
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11609057
2023-02-09 08:36:44 +00:00
mulin.lyh
71f832da35 [to #47671666]fix: diff based ci optimize
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11574741

    * [to #47671666]fix: diff based ci optimize
2023-02-07 10:45:52 +00:00
mulin.lyh
e54694690f [to #46993990]feat: run ci cases based on code diff to reduce ci test time 2023-02-06 08:00:19 +00:00
zhangzhicheng.zzc
e20a72be07 remove function level imports index
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11537482
2023-02-04 10:26:00 +00:00
zhangzhicheng.zzc
42898badf7 [to #42322933] update ast_index logic 2023-01-11 10:43:56 +08:00
pangda
346af6773f support plugin mechanism for second-party/third-party modules 2023-01-11 10:35:09 +08:00
zhangzhicheng.zzc
a318f27247 [to #42322933] speed up the ast indexing during editing
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10907357
2022-12-02 10:06:24 +08:00
yuze.zyz
bb5512d1ab [to #42322933] Refactor NLP and fix some user feedbacks
1. Abstract keys of dicts needed by nlp metric classes into the init method
2. Add Preprocessor.save_pretrained to save preprocessor information
3. Abstract the config saving function, so that the config is saved correctly both when from_pretrained is called directly and when cfg is modified step by step during training.
4. Remove SbertTokenizer and VecoTokenizer, use transformers' tokenizers instead
5. Use model/preprocessor's from_pretrained in all nlp pipeline classes.
6. Add model_kwargs and preprocessor_kwargs in all nlp pipeline classes
7. Add base classes for fill-mask and text-classification preprocessor, as a demo for later changes
8. Fix user feedback: Re-train the model in continue training scenario
9. Fix user feedback: Too many checkpoint saved
10. Simplify the nlp-trainer
11. Fix user feedback: Split the default trainer's __init__ method, which makes it easier for users to override
12. Add safe_get to Config class

----------------------------  Another refactor from version 36 -------------------------

13. Name all nlp transformers' preprocessors from TaskNamePreprocessor to TaskNameTransformersPreprocessor, for example:
      TextClassificationPreprocessor -> TextClassificationTransformersPreprocessor
14. Add a base class per task for all nlp tasks' preprocessors which has at least two sub-preprocessors
15. Add output classes of nlp models
16. Refactor the logic for token-classification
17. Fix bug: checkpoint_hook does not support pytorch_model.pt
18. Fix bug: Pipeline name does not match with task name, so inference will not succeed after training
       NOTE: This is just a stopgap solution; the root cause is the uncertain relationship between models and pipelines
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10723513

    * add save_pretrained to preprocessor

* save preprocessor config in hook

* refactor label-id mapping fetching logic

* test ok on sentence-similarity

* run on finetuning

* fix bug

* pre-commit passed

* fix bug

* Merge branch 'master' into feat/refactor_config

# Conflicts:
#	modelscope/preprocessors/nlp/nlp_base.py

* add params to init

* 1. support max ckpt num 2. support ignoring others but bin file in continue training 3. add arguments to some nlp metrics

* Split trainer init impls to overridable methods

* remove some obsolete tokenizers

* unfinished

* support input params in pipeline

* fix bugs

* fix ut bug

* fix bug

* fix ut bug

* fix ut bug

* fix ut bug

* add base class for some preprocessors

* Merge commit '379867739548f394d0fa349ba07afe04adf4c8b6' into feat/refactor_config

* compatible with old code

* fix ut bug

* fix ut bugs

* fix bug

* add some comments

* fix ut bug

* add a requirement

* fix pre-commit

* Merge commit '0451b3d3cb2bebfef92ec2c227b2a3dd8d01dc6a' into feat/refactor_config

* fixbug

* Support function type in registry

* fix ut bug

* fix bug

* Merge commit '5f719e542b963f0d35457e5359df879a5eb80b82' into feat/refactor_config

# Conflicts:
#	modelscope/pipelines/nlp/multilingual_word_segmentation_pipeline.py
#	modelscope/pipelines/nlp/named_entity_recognition_pipeline.py
#	modelscope/pipelines/nlp/word_segmentation_pipeline.py
#	modelscope/utils/hub.py

* remove obsolete file

* rename init args

* rename params

* fix merge bug

* add default preprocessor config for ner-model

* move a method to a util file

* remove unused config

* Fix a bug in pbar

* bestckptsaver:change default ckpt numbers to 1

* 1. Add assert to max_epoch 2. split init_dist and get_device 3. change cmp func name

* Fix bug

* fix bug

* fix bug

* unfinished refactoring

* unfinished

* uw

* uw

* uw

* uw

* Merge branch 'feat/refactor_config' into feat/refactor_trainer

# Conflicts:
#	modelscope/preprocessors/nlp/document_segmentation_preprocessor.py
#	modelscope/preprocessors/nlp/faq_question_answering_preprocessor.py
#	modelscope/preprocessors/nlp/relation_extraction_preprocessor.py
#	modelscope/preprocessors/nlp/text_generation_preprocessor.py

* uw

* uw

* unify nlp task outputs

* uw

* uw

* uw

* uw

* change the order of text cls pipeline

* refactor t5

* refactor tg task preprocessor

* fix

* unfinished

* temp

* refactor code

* unfinished

* unfinished

* unfinished

* unfinished

* uw

* Merge branch 'feat/refactor_config' into feat/refactor_trainer

* smoke test pass

* ut testing

* pre-commit passed

* Merge branch 'master' into feat/refactor_config

# Conflicts:
#	modelscope/models/nlp/bert/document_segmentation.py
#	modelscope/pipelines/nlp/__init__.py
#	modelscope/pipelines/nlp/document_segmentation_pipeline.py

* merge master

* unfinished

* Merge branch 'feat/fix_bug_pipeline_name' into feat/refactor_config

* fix bug

* fix ut bug

* support ner batch inference

* fix ut bug

* fix bug

* support batch inference on three nlp tasks

* unfinished

* fix bug

* fix bug

* Merge branch 'master' into feat/refactor_config

# Conflicts:
#	modelscope/models/base/base_model.py
#	modelscope/pipelines/nlp/conversational_text_to_sql_pipeline.py
#	modelscope/pipelines/nlp/dialog_intent_prediction_pipeline.py
#	modelscope/pipelines/nlp/dialog_modeling_pipeline.py
#	modelscope/pipelines/nlp/dialog_state_tracking_pipeline.py
#	modelscope/pipelines/nlp/document_segmentation_pipeline.py
#	modelscope/pipelines/nlp/faq_question_answering_pipeline.py
#	modelscope/pipelines/nlp/feature_extraction_pipeline.py
#	modelscope/pipelines/nlp/fill_mask_pipeline.py
#	modelscope/pipelines/nlp/information_extraction_pipeline.py
#	modelscope/pipelines/nlp/named_entity_recognition_pipeline.py
#	modelscope/pipelines/nlp/sentence_embedding_pipeline.py
#	modelscope/pipelines/nlp/summarization_pipeline.py
#	modelscope/pipelines/nlp/table_question_answering_pipeline.py
#	modelscope/pipelines/nlp/text2text_generation_pipeline.py
#	modelscope/pipelines/nlp/text_classification_pipeline.py
#	modelscope/pipelines/nlp/text_error_correction_pipeline.py
#	modelscope/pipelines/nlp/text_generation_pipeline.py
#	modelscope/pipelines/nlp/text_ranking_pipeline.py
#	modelscope/pipelines/nlp/token_classification_pipeline.py
#	modelscope/pipelines/nlp/word_segmentation_pipeline.py
#	modelscope/pipelines/nlp/zero_shot_classification_pipeline.py
#	modelscope/trainers/nlp_trainer.py

* pre-commit passed

* fix bug

* Merge branch 'master' into feat/refactor_config

# Conflicts:
#	modelscope/preprocessors/__init__.py

* fix bug

* fix bug

* fix bug

* fix bug

* fix bug

* fixbug

* pre-commit passed

* fix bug

* fixbug

* fix bug

* fix bug

* fix bug

* fix bug

* self review done

* fixbug

* fix bug

* fix bug

* fix bugs

* remove sub-token offset mapping

* fix name bug

* add some tests

* 1. support batch inference of text-generation,text2text-generation,token-classification,text-classification 2. add corresponding UTs

* add old logic back

* tmp save

* add tokenize by words logic back

* move outputs file back

* revert veco token-classification back

* fix typo

* Fix description

* Merge commit '4dd99b8f6e4e7aefe047c68a1bedd95d3ec596d6' into feat/refactor_config

* Merge branch 'master' into feat/refactor_config

# Conflicts:
#	modelscope/pipelines/builder.py
2022-11-30 23:52:17 +08:00
hemu.zp
0f0fdcae6f [to #42322933] Fix bug for mplug evaluation
Fixed mplug evaluation using the wrong metrics, moved part of the Chinese-text processing code into utils, and added a trainer for mplug
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10612875
2022-11-08 17:58:03 +08:00
wenmeng.zwm
535acaef5b [to #42322933]add test case to check xtcocotools availability
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10462622

    * add test case to check xtcocotools availability
2022-10-20 12:13:19 +08:00
hemu.zp
271e2a2a99 [to #42322933] Add gpt_neo model
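1. Add the gpt_neo model; the checkpoint belongs to Langboat and has not yet been uploaded to the model hub, so testing was completed offline
2. Add text-generation task models and head; already-released text generation models such as gpt3 and palm will later be unified into task models with a backbone + head structure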
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10404249
2022-10-17 20:54:29 +08:00
zhangzhicheng.zzc
d721fabb34 [to #42322933]bert with sequence classification / token classification/ fill mask refactor
1. Add support for the original bert model (not the easynlp backbone prefix version)
2. Support the backbone head form of bert for sequence classification / fill mask / token classification
3. Unify the pipelines of the several sequence classification tasks into one class
4. fill mask supports the backbone head form
5. The preprocessors of the token classification subtasks (ner, word seg, part of speech) are unified into TokenClassificationPreprocessor
6. The preprocessors of the sequence classification subtasks (single classification, pair classification) are unified into SequenceClassificationPreprocessor
7. Change where the group_key of cls is assigned in register; previously, with multiple decorators, group_key was overwritten and the group_key information on obj_cls was incorrect
8. Based on the backbone head form, try to adjust the cases where group_key and module had the same name, for example in modelscope/pipelines/nlp/sequence_classification_pipeline.py
Before:
@PIPELINES.register_module(
    Tasks.sentiment_classification, module_name=Pipelines.sentiment_classification)
After:
@PIPELINES.register_module(
    Tasks.text_classification, module_name=Pipelines.sentiment_classification)
The corresponding configuration.json is changed as well; this better reflects the relationship between a task and its pipelines (subtasks).
9. Other corresponding changes to support the features above
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10041463
2022-09-27 23:08:33 +08:00
wenmeng.zwm
6808e9a301 [to #44902099] add license for framework files
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10189613
2022-09-20 17:49:31 +08:00
wenmeng.zwm
fabb4716d4 [to #44610931] fix: add device usage when device is None or empty
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10039848

    * add device usage when device is None or empty

    * update docker env
2022-09-06 21:47:59 +08:00
jiangnana.jnn
930d55d9ad support EasyCV framework and add Segformer model
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/9781849

    * support EasyCV
2022-08-26 13:58:50 +08:00
wenmeng.zwm
c72e5f4ae8 [to #43878347] skip device placement test
Skip this test, which produces too much placement debug logging even though the debug level is reset after the test case

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/9875987
2022-08-24 15:08:22 +08:00
zhangzhicheng.zzc
5b0b54633b [to #42322933]compatible with windows path on only core parts
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/9855254
2022-08-24 13:35:42 +08:00