* fix #845
Support resuming interrupted downloads from the point where they stopped (breakpoint resume), optimize the download progress bar with finer display granularity for a better experience under low bandwidth, and add support for downloading specified directories.
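A minimal sketch of how resuming a download is commonly done, using an HTTP `Range` header computed from the bytes already on disk. The helper name `resume_headers` is hypothetical and not taken from this change; the actual implementation may differ.

```python
import os
from typing import Dict


def resume_headers(partial_path: str) -> Dict[str, str]:
    """Return request headers asking only for the bytes not yet downloaded.

    Hypothetical helper for illustration; not the library's API.
    """
    # Size of whatever an earlier, interrupted attempt left on disk.
    done = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    if done == 0:
        # Nothing to resume from, so request the whole file.
        return {}
    # Standard HTTP range request: bytes from offset `done` to the end.
    return {"Range": f"bytes={done}-"}
```

The caller would then open the partial file in append mode and write the response body after the existing bytes, provided the server answers with status 206 (Partial Content).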
* restore push to hub
* fix merge issue
* fix unit test issue
---------
Co-authored-by: mulin.lyh <mulin.lyh@taobao.com>
1. Refactor training_args
2. Refactor hooks
3. Add train_id for push_to_hub
4. Support both output_dir/output_sub_dir for checkpoint_hooks
5. Fall back to copying when hardlinking fails during checkpointing
6. Support mixed dataset config file as a CLI argument
7. Add eval txt in output folder
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/12384253
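Item 5 above (copy when hardlink fails) can be sketched roughly as follows. The function name `link_or_copy` is an assumption made for illustration; the checkpoint hook in this change may be structured differently.

```python
import os
import shutil


def link_or_copy(src: str, dst: str) -> str:
    """Hard-link src to dst, falling back to a full copy if linking fails.

    Hypothetical helper; returns "link" or "copy" to say which path was taken.
    """
    # Remove a stale destination so os.link does not fail with "file exists".
    if os.path.lexists(dst):
        os.remove(dst)
    try:
        os.link(src, dst)
        return "link"
    except OSError:
        # Typical causes: src and dst are on different devices, or the
        # filesystem does not support hard links.
        shutil.copy2(src, dst)
        return "copy"
```

Hard links avoid duplicating large checkpoint files on disk; the copy fallback trades disk space for working across filesystems.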
* support ignoring file patterns
Features:
1. Refactor the directory structure of nlp models. All model files are now placed in either the model folder or the task_model folder
2. Refactor all comments to Google style
3. Add detailed comments to important tasks and nlp models, describing each model and its preprocessor and trainer
4. Model exporting now supports a direct call to TorchModelExporter (no need to derive from it)
5. Refactor the model save_pretrained method so it can run directly (independent of the trainer)
6. Remove the Model check in the pipeline base class, so that externally registered models can run in our pipelines
7. The nlp trainer now has an NLPTrainingArguments class. Users can pass arguments into the dataclass and use it as a normal cfg_modify_fn, which simplifies modifying the cfg.
8. Merge BACKBONES and MODELS, so users can get a backbone with the Model.from_pretrained call
9. Model.from_pretrained now supports a task argument, so users can take a backbone and load it with a specific task class.
10. Support the Preprocessor.from_pretrained method
11. Add standard return classes to important nlp tasks, so some pipelines and models are now independent of each other. Models always return tensors, and the pipelines handle the conversion to numpy and any subsequent processing.
12. Split the nlp preprocessors file into several files to make the directory structure clearer.
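As an illustration of item 7, a dataclass can act as a cfg_modify_fn by being callable on the config. This is only a sketch under assumptions: the class name `SketchTrainingArguments`, its fields, and the nested-dict config layout are all hypothetical and do not reflect the real NLPTrainingArguments.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class SketchTrainingArguments:
    """Hypothetical stand-in for a training-arguments dataclass."""

    # None means "leave the value in the config unchanged".
    max_epochs: Optional[int] = None
    lr: Optional[float] = None

    def __call__(self, cfg: Dict[str, Any]) -> Dict[str, Any]:
        # Behaves like a cfg_modify_fn: takes a config, returns the modified one.
        train = cfg.setdefault("train", {})
        if self.max_epochs is not None:
            train["max_epochs"] = self.max_epochs
        if self.lr is not None:
            train.setdefault("optimizer", {})["lr"] = self.lr
        return cfg
```

An instance such as `SketchTrainingArguments(max_epochs=3)` could then be passed wherever a plain cfg_modify_fn function is expected.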
Bug Fixes:
1. Fix a bug where lr_scheduler could be called before the optimizer's step
2. Fix a bug where calling Pipelines directly (not via pipeline(xxx)) throws an error
3. Fix a bug where the trainer does not call the correct TaskDataset class
4. Fix a bug where the internal loading of the dataset throws an error in the trainer class
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10490585
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10434107
* [Update] update finetune_result_upload
* [Update] rename finetune_result_upload to model_dir_upload
* Merge branch 'master' into feat/upload_ckpt
* Merge branch 'master' into feat/upload_ckpt
* [Fix] fix import error
* [Fix] fix import error
* Merge branch 'master' into feat/upload_ckpt
* [Update] change name to upload_folder and use tempfile to save repo
* Merge branch 'master' into feat/upload_ckpt
* [Fix] fix commit
* Merge branch 'master' into feat/upload_ckpt
* [Fix] fix format
* Merge branch 'master' into feat/upload_ckpt
* [Fix] add uuid after model created from upload ut