* add an example for qwen doc QA with langchain + llamaindex
* change comments to English; clear output and add URLs
* add helper in Markdown; add wget for data file download
Fix stable_diffusion_cones2: the parameter model_id was renamed to model. The following file was changed:
1. examples/pytorch/stable_diffusion/cones2/finetune_stable_diffusion_cones2.py
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13886743
* rename parameter model_id to model
* support float16 training and pipeline for stable diffusion
* pre commit
* fix bugs
* add torch type example
* fix torch type bugs
* support type float16
* fix bugs in pipeline loading
* change type to fp16
* lora rank
---------
Co-authored-by: 翊靖 <yijing.wq@alibaba-inc.com>
Support loading datasets for llama:
1. Load the dataset with MsDataset when both the train dataset name and the val dataset name parameters are set. Note that there is currently no suitable dataset in the hub.
2. Load the dataset with MsDataset when only the train dataset name parameter is set, then split it into a training dataset and a validation dataset.
3. Load the dataset with MsDataset when the user sets the parameter src_txt, which is a file path such as 'alpaca_data.json', then split it into a training dataset and a validation dataset.
4. Load the dataset by building it from a file in flex training.
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13505335
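Illustrative sketch of the "load from a file, then split" cases in the list above. This is not the code from the change: it uses only the Python standard library instead of MsDataset, and the helper names `load_json_records` and `split_train_val`, the `val_ratio` default, and the seed are assumptions made for the example.

```python
import json
import random
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def load_json_records(path: str) -> List[Record]:
    """Read a JSON file that holds a list of records (hypothetical helper)."""
    with open(path, encoding='utf-8') as f:
        return json.load(f)


def split_train_val(records: List[Record],
                    val_ratio: float = 0.1,
                    seed: int = 42) -> Tuple[List[Record], List[Record]]:
    """Shuffle a copy of the records and split it into (train, val)."""
    if not 0.0 < val_ratio < 1.0:
        raise ValueError('val_ratio must be between 0 and 1')
    if len(records) < 2:
        raise ValueError('need at least two records to split')
    shuffled = list(records)  # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n_val = max(1, int(len(shuffled) * val_ratio))  # keep at least one val item
    return shuffled[n_val:], shuffled[:n_val]


# In-memory stand-in for the contents of a file such as 'alpaca_data.json'.
records = [{'instruction': f'task {i}', 'output': f'answer {i}'}
           for i in range(10)]
train_records, val_records = split_train_val(records, val_ratio=0.2, seed=0)
```

With a real file, `records = load_json_records('alpaca_data.json')` would replace the in-memory list. The fixed seed keeps the split stable across runs, which helps when comparing finetuning results.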
Add the QWen 7b base model, the chat model, and related pipelines
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13482235
* add qwen 7b base and chat
* fix logger
* update examples, lint test
* add unittest for qwen base and chat
* rename qwen to qwen-7b
* resolve imports and add a registry to text-generation
* reset load model from pretrained
* fix precheck
* skip qwen test case for now
* remove strange file
1. Get labels from the dataset in "text_classification/finetune_text_classification.py" to simplify the user's work in flex training. The parameters "--num_labels" and "--labels" were removed from "run_train.sh".
2. In "chatglm6b/finetune.py", building the dataset from a file is necessary to support flex training.
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/13382745
* support getting labels from dataset in sbert text classification and building dataset from file in chatglm-6b
* remove duplicate labels concisely by using a set
* retain the labels parameter in finetune_text_classification
* Merge branch 'master' of http://gitlab.alibaba-inc.com/Ali-MaaS/MaaS-lib
retain the labels parameter in finetune_text_classification
* Merge branch 'support_text_cls_labels_chatglm_json'
retain the labels parameter in finetune_text_classification
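Illustrative sketch of getting labels from the dataset and de-duplicating them with a set, as described in the notes above. This is not the code from finetune_text_classification.py: the helper name `collect_labels`, the `label` column name, and the sample rows are assumptions made for the example.

```python
from typing import Any, Dict, Iterable, List


def collect_labels(rows: Iterable[Dict[str, Any]],
                   label_key: str = 'label') -> List[str]:
    """Return the distinct labels found in the rows, in sorted order."""
    # A set comprehension drops repeated labels; sorting makes the order
    # deterministic, since set iteration order is not guaranteed.
    return sorted({row[label_key] for row in rows})


# Hypothetical training rows; a real run would read them from the dataset.
rows = [
    {'text': 'great product', 'label': 'positive'},
    {'text': 'arrived broken', 'label': 'negative'},
    {'text': 'works well', 'label': 'positive'},
]

labels = collect_labels(rows)
num_labels = len(labels)
label2id = {label: idx for idx, label in enumerate(labels)}
```

Deriving `labels` and `num_labels` this way is what would let a launch script drop explicit label arguments, while an explicit labels parameter can still override the derived list.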