1. Fix the conflict in MsDataset.load() between a local path and a remote dataset name given as dataset_name='namespace/dataset_name'.
2. Change the prefix passed to obj_key.startswith in get_split_objects_map so that it handles directory names in the 'xxx/' format.
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11820290
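The two fixes above can be illustrated with a minimal sketch. This is not the actual MsDataset implementation: `resolve_dataset_name`, `group_objects_by_split`, and the rule that an existing local path takes priority over a remote name are assumptions made for illustration only.

```python
import os


def resolve_dataset_name(dataset_name, default_namespace="default_namespace"):
    """Classify a dataset_name as a local path or a remote (namespace, name).

    Hypothetical helper. Assumption: an existing local path wins over a
    remote name, even when the string looks like 'namespace/dataset_name'.
    Returns a tuple (kind, namespace, name_or_path).
    """
    if os.path.exists(dataset_name):
        return ("local", None, os.path.abspath(dataset_name))
    parts = dataset_name.split("/")
    if len(parts) == 2 and all(parts):
        return ("remote", parts[0], parts[1])
    return ("remote", default_namespace, dataset_name)


def group_objects_by_split(object_keys, split_dirs):
    """Group object keys by split, matching on a 'dir/' prefix.

    Hypothetical helper. Normalizing the directory name to end with '/'
    keeps 'train' from also matching keys under 'train_extra/'.
    """
    result = {split: [] for split in split_dirs}
    for split, dir_name in split_dirs.items():
        prefix = dir_name.rstrip("/") + "/"
        for key in object_keys:
            if key.startswith(prefix):
                result[split].append(key)
    return result
```

With the trailing slash, a split directory given as either 'train' or 'train/' matches only keys inside that directory.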
* fix the conflict between a local path and the namespace/dataset_name form of dataset_name
* fix function: get_split_objects_map
* add UT for loading local csv file
* add new test case for test_load_local_csv function
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11697239
* add ControlNet for scribble2image
* update code comments
* support scribble input
* update scribble input for demo service
* support all models of ControlNet
* add requirements
* fix code style bug
* update model id
1. Support two checkpoint hooks saving final checkpoints in two different folders
2. Remove the check of checkpoint hooks
3. Fix an incorrect modification in UT
4. Fix bug: Checkpoint.load_checkpoint has been moved out
5. Add UT for new style configuration
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11630170
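As a rough illustration of item 1, the sketch below registers two hooks that each write the final state into their own folder. `SketchCheckpointHook`, `run_training`, and the JSON file layout are hypothetical names and choices for this example, not the library's hook API.

```python
import json
import os


class SketchCheckpointHook:
    """Hypothetical hook: writes the final training state into its own folder."""

    def __init__(self, save_dir, file_name="final.json"):
        self.save_dir = save_dir
        self.file_name = file_name

    def after_run(self, state):
        # Each hook owns its save_dir, so two hooks never overwrite each other
        # as long as they are configured with different folders.
        os.makedirs(self.save_dir, exist_ok=True)
        path = os.path.join(self.save_dir, self.file_name)
        with open(path, "w", encoding="utf-8") as f:
            json.dump(state, f)
        return path


def run_training(hooks, num_epochs=2):
    """Toy training loop: only tracks the epoch, then lets every hook save."""
    state = {"epoch": 0}
    for epoch in range(1, num_epochs + 1):
        state["epoch"] = epoch
    return [hook.after_run(state) for hook in hooks]
```

Calling `run_training` with two hooks that point at different `save_dir` values yields two independent copies of the final checkpoint.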
The original backbone-head abstraction was not well architected: the input and output parameters of the backbone and head were passed as **kwargs, which was implicit and could cause confusion, and many features could not be extended. Therefore, the following adjustments were made:
1. Divide the basic models by structure into encoder-only, decoder-only, single-stage, two-stage models, etc. The encoder-only model is implemented; the others are still under design.
2. Derive structured task models from the basic model structures above. A structured task model mainly parses the backbone/head cfg in order to apply the correct backbone or head components; some models may also adjust the forward method inherited from the basic model.
3. Add explicit initialization, input, and output parameters to the head and backbone classes to make them easier to understand.
4. Remove the original nncrf class and change it to the backbone-head form, with an lstm backbone and a crf head.
5. Support `model = Model.from_pretrained('bert-based-fill-mask', task='text-classification')`. This call correctly loads the backbone even when the task differs from the original one in the configuration.
6. Support loading the model through the transformers AutoModel, so that a backbone model can be integrated quickly without writing code.
7. Unify the original task classes in each NLP model with the structured task-model classes; the structured task models largely remove the redundant code in the original task classes. This is still being refactored.
8. Support loading the model configuration from the HF transformers config.json when the model-related configuration is missing. Only NLP models are supported.
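Items 1 to 3 can be pictured with a small sketch of an encoder-only model whose backbone and head take explicit parameters instead of **kwargs, built from a backbone/head cfg. All class names, the registries, and the cfg keys below are hypothetical and use plain Python lists rather than tensors; this is not the refactored library code.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class BackboneOutput:
    """Explicit output type for the backbone (instead of an opaque dict)."""
    hidden_states: List[List[float]]


class ToyBackbone:
    def __init__(self, hidden_size: int):
        self.hidden_size = hidden_size

    def forward(self, input_ids: List[int]) -> BackboneOutput:
        # Toy "embedding": every token id becomes a constant vector.
        hidden = [[float(t)] * self.hidden_size for t in input_ids]
        return BackboneOutput(hidden_states=hidden)


class ToyClassificationHead:
    def __init__(self, hidden_size: int, num_labels: int):
        self.hidden_size = hidden_size
        self.num_labels = num_labels

    def forward(self, hidden_states: List[List[float]]) -> List[float]:
        if not hidden_states:
            raise ValueError("hidden_states must not be empty")
        # Mean-pool over tokens, then produce one toy logit per label.
        pooled = [sum(col) / len(hidden_states) for col in zip(*hidden_states)]
        total = sum(pooled)
        return [total * (k + 1) for k in range(self.num_labels)]


class EncoderOnlyModel:
    """Basic structure: one backbone followed by one head."""

    def __init__(self, backbone: ToyBackbone, head: ToyClassificationHead):
        self.backbone = backbone
        self.head = head

    def forward(self, input_ids: List[int]) -> List[float]:
        return self.head.forward(self.backbone.forward(input_ids).hidden_states)


# Hypothetical registries used to pick components from the cfg "type" field.
BACKBONES: Dict[str, type] = {"toy": ToyBackbone}
HEADS: Dict[str, type] = {"toy-classification": ToyClassificationHead}


def build_encoder_only_model(cfg: dict) -> EncoderOnlyModel:
    """Parse the backbone/head cfg and assemble the model."""
    backbone_cfg = dict(cfg["backbone"])
    head_cfg = dict(cfg["head"])
    backbone = BACKBONES[backbone_cfg.pop("type")](**backbone_cfg)
    head = HEADS[head_cfg.pop("type")](**head_cfg)
    return EncoderOnlyModel(backbone, head)
```

Because each component declares its own parameters, a mismatch between the cfg and the class signature fails at construction time rather than being silently swallowed by **kwargs.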
1. Support exporting csanmt to the SavedModel format
2. Create a new base class for text-ranking preprocessors, and move some parameters of mgeo_ranking_preprocessor to the init method
3. Decouple the Model and Preprocessor classes from PyTorch
4. Regression tests support comparing only the model output
5. Support exporting zero-shot models to ONNX and TorchScript
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11522461