1. Add getting the labels from the dataset in "text_classification/finetune_text_classification.py" to simplify the user's setup for flex training (see the sketch below the link). The "--num_labels" and "--labels" parameters were removed from "run_train.sh".
2. In "chatglm6b/finetune.py", build the dataset from a file, which is needed to support flex training.
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13382745
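A minimal sketch of the label-inference idea, assuming the training split yields dict-like examples with a "label" field; the function and field names are illustrative, not the actual MaaS-lib API:

```python
# Hypothetical sketch: derive the label set from the training data instead of
# requiring --num_labels / --labels on the command line.
def infer_labels(train_dataset, label_field='label'):
    # A set comprehension drops duplicate labels concisely; sort for stable order.
    labels = sorted({str(example[label_field]) for example in train_dataset})
    return labels, len(labels)

# Usage (assumed): labels, num_labels = infer_labels(train_dataset)
# The model config can then be filled in without the user passing them explicitly.
```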
* support getting labels from dataset in sbert text classification and building dataset from file in chatglm-6b
* remove duplicate labels concisely by using a set
* keep the labels parameter in finetune_text_classification
* Merge branch 'master' of http://gitlab.alibaba-inc.com/Ali-MaaS/MaaS-lib
* Merge branch 'support_text_cls_labels_chatglm_json'
1. Add the model_revision parameter to training_args.py (see the sketch below the link).
2. Pass model_revision through the kwargs in finetune_text_classification.py and finetune_text_generation.py.
3. Modify dataset loading in finetune_text_classification.py for flex training.
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/12869552
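A minimal sketch of threading a model_revision argument from the CLI into the trainer kwargs; the dataclass and field names mirror the description but are assumptions, not the exact MaaS-lib definitions:

```python
from dataclasses import dataclass, field

@dataclass
class TrainingArgs:
    model: str = field(default=None, metadata={'help': 'model id or path'})
    model_revision: str = field(
        default='master', metadata={'help': 'model revision to load'})

def build_trainer_kwargs(args: TrainingArgs, train_dataset, eval_dataset):
    # model_revision is forwarded so the trainer pulls the intended model snapshot.
    return dict(
        model=args.model,
        model_revision=args.model_revision,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
    )
```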
* add model revision in training_args and modify dataset loading in finetune text classification
1. Refactor training_args
2. Refactor hooks
3. Add train_id for push_to_hub
4. Support both output_dir and output_sub_dir for checkpoint_hooks
5. Fall back to copying when hardlinking fails during checkpointing (sketch below the link)
6. Support a mixed dataset config file as a CLI argument
7. Add an eval txt file to the output folder
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/12384253
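A short sketch of the hardlink-with-copy-fallback idea from item 5: try a hardlink first (cheap, no extra disk), and copy the bytes when the filesystem refuses (e.g. a cross-device link). The function name is illustrative only:

```python
import os
import shutil

def link_or_copy(src: str, dst: str) -> None:
    try:
        os.link(src, dst)          # hardlink when src and dst are on the same device
    except OSError:
        shutil.copyfile(src, dst)  # otherwise fall back to a full copy
```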
* support ignoring files by pattern
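An illustrative sketch of ignoring files by glob pattern (e.g. when collecting checkpoint files); the function name and default patterns are assumptions:

```python
import fnmatch
import os

def list_files(root: str, ignore_patterns=('*.lock', '.git*')):
    kept = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if any(fnmatch.fnmatch(name, pat) for pat in ignore_patterns):
                continue  # skip files matching an ignore pattern
            kept.append(os.path.join(dirpath, name))
    return kept
```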