modelscope/examples/pytorch/text_classification/run_train.sh
zsl01670416 9926ad685b support getting labels from dataset in sbert text classification and building dataset from file in chatglm-6b
1. Derive the labels from the dataset in "text_classification/finetune_text_classification.py" to simplify setup for flex training; the "--num_labels" and "--labels" parameters were removed from "run_train.sh" (a sketch follows the commit log below).
2. In "chatglm6b/finetune.py", building the dataset from a local file is necessary to support flex training (see the second sketch below).
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13382745
* support getting labels from dataset in sbert text classification and building dataset from file in chatglm-6b

* remove duplicate labels concisely by using a set

* keep the labels parameter in finetune_text_classification

* Merge branch 'master' of http://gitlab.alibaba-inc.com/Ali-MaaS/MaaS-lib

* Merge branch 'support_text_cls_labels_chatglm_json'
2023-07-25 19:02:32 +08:00
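
A minimal sketch of how the labels might be derived from the training split, assuming a Hugging Face-style dataset whose rows expose the column named by --label (here 'label'); the helper name and the commented hookup step are hypothetical, not the verbatim ModelScope code:

def get_labels_from_dataset(train_dataset, label_column='label'):
    # Collect the unique label values with a set (this also removes
    # repeated labels), then sort for a stable label-to-id mapping.
    labels = sorted({str(row[label_column]) for row in train_dataset})
    return labels, len(labels)

# labels, num_labels = get_labels_from_dataset(train_dataset)
# num_labels stands in for the removed --num_labels flag, labels for --labels.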
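
A second sketch, building a dataset from local files in the spirit of the chatglm6b/finetune.py change, here using the Hugging Face datasets JSON loader; the actual script may load files differently (e.g. via ModelScope's MsDataset), and the file names are placeholders:

from datasets import load_dataset

# One example per record; train/validation files are supplied by the user.
data_files = {'train': 'train.json', 'validation': 'dev.json'}
raw_datasets = load_dataset('json', data_files=data_files)

train_dataset = raw_datasets['train']
val_dataset = raw_datasets['validation']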


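# Fine-tune StructBERT (damo/nlp_structbert_backbone_base_std) for text
# classification on the CLUE/TNEWS dataset. Labels are now read from the
# dataset itself, so --num_labels and --labels are no longer passed.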
PYTHONPATH=. python examples/pytorch/text_classification/finetune_text_classification.py \
--task 'text-classification' \
--model 'damo/nlp_structbert_backbone_base_std' \
--train_dataset_name 'clue' \
--val_dataset_name 'clue' \
--train_subset_name 'tnews' \
--val_subset_name 'tnews' \
--train_split 'train' \
--val_split 'validation' \
--first_sequence 'sentence' \
--label 'label' \
--preprocessor 'sen-cls-tokenizer' \
--use_model_config True \
--max_epochs 1 \
--per_device_train_batch_size 16 \
--per_device_eval_batch_size 16 \
--eval_interval 100 \
--eval_strategy by_step \
--work_dir './tmp' \
--train_data_worker 0 \
--eval_data_worker 0 \
--lr 1e-5 \
--eval_metrics 'seq-cls-metric'