mirror of
https://github.com/modelscope/modelscope.git
synced 2026-05-18 05:05:00 +02:00
[to #43726282] fix bugs and refine docs
1. remove pai-easynlp temporarily due to its hard dependency on scipy==1.5.4
2. fix sentiment classification output
3. update quickstart and trainer doc

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/9646399
@@ -1,7 +1,7 @@
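 # Quick Start
-ModelScope Library currently supports model training and inference with the TensorFlow and PyTorch deep learning frameworks. It has been tested on Python 3.7+, PyTorch 1.8+, TensorFlow 1.13-1.15, and TensorFlow 2.x.
+ModelScope Library currently supports model training and inference with the TensorFlow and PyTorch deep learning frameworks. It has been tested on Python 3.7+, PyTorch 1.8+, TensorFlow 1.15, and TensorFlow 2.x.

-Note: in the current (630) release, `speech-related` features are supported only on `linux` with Python 3.7 and TensorFlow 1.13-1.15. Other features can be installed and used on Windows and macOS.
+Note: `speech-related` features are supported only on `linux` with Python 3.7 and TensorFlow 1.15. Other features can be installed and used on Windows and macOS.

 ## Python environment setup
 First, install and configure Anaconda by following the [documentation](https://docs.anaconda.com/anaconda/install/).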
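The version requirements in this hunk can be checked before installation. The following is only a minimal sketch using the standard library: the helper names `is_installed` and `environment_report` are hypothetical and not part of ModelScope, and the check covers only the Python floor and whether the frameworks can be found, not the exact framework versions listed above.

```python
import importlib.util
import sys

# Lower bound taken from the quickstart text: tested on Python 3.7+.
MIN_PYTHON = (3, 7)


def is_installed(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def environment_report():
    """Summarize whether the interpreter and frameworks look usable."""
    return {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "torch_installed": is_installed("torch"),
        "tensorflow_installed": is_installed("tensorflow"),
    }


if __name__ == "__main__":
    for name, value in environment_report().items():
        print(f"{name}: {value}")
```

Using `find_spec` avoids importing the frameworks themselves, so the check stays fast even when TensorFlow or PyTorch is present.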
@@ -8,22 +8,10 @@ Modelscope提供了众多预训练模型,你可以使用其中任意一个,
 Before fine-tuning, prepare a dataset for training and evaluation; see the dataset tutorial for details.

 `Temporary approach`: we create a dummy dataset through the dataset interface.
 ```python
-from datasets import Dataset
-dataset_dict = {
-    'sentence1': [
-        'This is test sentence1-1', 'This is test sentence2-1',
-        'This is test sentence3-1'
-    ],
-    'sentence2': [
-        'This is test sentence1-2', 'This is test sentence2-2',
-        'This is test sentence3-2'
-    ],
-    'label': [0, 1, 1]
-}
-train_dataset = MsDataset.from_hf_dataset(Dataset.from_dict(dataset_dict))
-eval_dataset = MsDataset.from_hf_dataset(Dataset.from_dict(dataset_dict))
+train_dataset = MsDataset.load('afqmc_small', namespace='modelscope', split='train')
+eval_dataset = MsDataset.load('afqmc_small', namespace='modelscope', split='validation')
 ```
 ### Training
 ModelScope keeps all training-related configuration in the `configuration.json` file in the model repository, so training only requires creating a Trainer, loading the configuration file, and passing in the datasets.
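The snippet in this hunk uses `MsDataset` without showing its import. A minimal sketch of the same two calls wrapped in a function follows. It assumes the class is importable as `modelscope.msdatasets.MsDataset`; the `load_fn` parameter and the function name are hypothetical, added only so the wrapper can be exercised without the library or network access.

```python
def load_finetune_splits(load_fn=None, name='afqmc_small', namespace='modelscope'):
    """Return (train_dataset, eval_dataset) for the dataset used in the doc.

    load_fn is a hypothetical injection point for this sketch; when omitted,
    MsDataset.load from the ModelScope library is used.
    """
    if load_fn is None:
        # Assumed import path; requires the modelscope package to be installed.
        from modelscope.msdatasets import MsDataset
        load_fn = MsDataset.load
    # Same arguments as the two added lines in the diff above.
    train_dataset = load_fn(name, namespace=namespace, split='train')
    eval_dataset = load_fn(name, namespace=namespace, split='validation')
    return train_dataset, eval_dataset
```

Calling `load_finetune_splits()` with no arguments performs the real download, so it needs `modelscope` installed and network access.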
@@ -141,7 +141,7 @@ class Trainers(object):
     Holds the standard trainer names used to identify different trainers.
     This should be used to register trainers.

-    For a general Trainer, you can use easynlp-trainer/ofa-trainer.
+    For a general Trainer, you can use EpochBasedTrainer.
     For a model specific Trainer, you can use ${ModelName}-${Task}-trainer.
     """
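The `${ModelName}-${Task}-trainer` pattern in the docstring can be illustrated with a small helper. This is a sketch only: the function is hypothetical and not part of the `Trainers` class, and the example arguments are made-up values rather than registered trainer names.

```python
def model_specific_trainer_name(model_name, task):
    """Build a name following the ${ModelName}-${Task}-trainer pattern."""
    return f"{model_name}-{task}-trainer"


# Hypothetical values, used only to show the resulting shape of the name.
print(model_specific_trainer_name("bert", "sentiment-analysis"))
```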
@@ -214,10 +214,10 @@ TASK_OUTPUTS = {
     Tasks.nli: [OutputKeys.SCORES, OutputKeys.LABELS],

     # sentiment classification result for single sample
-    # {
-    #     "labels": ["happy", "sad", "calm", "angry"],
-    #     "scores": [0.9, 0.1, 0.05, 0.05]
-    # }
+    # {
+    #     'scores': [0.07183828949928284, 0.9281617403030396],
+    #     'labels': ['1', '0']
+    # }
     Tasks.sentiment_classification: [OutputKeys.SCORES, OutputKeys.LABELS],

     # zero-shot classification result for single sample
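A consumer of the new sentiment classification output typically wants the single most likely label. The sketch below assumes that `scores` and `labels` are index-aligned, which the sample in the comment suggests but the diff does not state. The helper name `top_prediction` is hypothetical.

```python
def top_prediction(result):
    """Return the (label, score) pair with the highest score.

    Assumes result['scores'][i] belongs to result['labels'][i].
    """
    scores = result['scores']
    labels = result['labels']
    if len(scores) != len(labels):
        raise ValueError('scores and labels must have the same length')
    if not scores:
        raise ValueError('result contains no predictions')
    # Index of the largest score, then look up the matching label.
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best], scores[best]


# Sample taken from the updated comment in the diff above.
sample = {
    'scores': [0.07183828949928284, 0.9281617403030396],
    'labels': ['1', '0'],
}
label, score = top_prediction(sample)
print(label, score)
```

For the sample above, the second score is the larger one, so the helper returns the label `'0'`.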
@@ -1,6 +1,8 @@
 en_core_web_sm>=2.3.5
 fairseq>=0.10.2
-pai-easynlp
+# temporarily remove pai-easynlp due to its hard dependency on scipy==1.5.4
+# will be added back
+# pai-easynlp
 # rouge-score was just recently updated from 0.0.4 to 0.0.7
 # which introduced compatibility issues that are being investigated
 rouge_score<=0.0.4