[to #43726282] fix bugs and refine docs

1. remove pai-easynlp temporarily due to its hard dependency on scipy==1.5.4
2. fix sentiment classification output
3. update quickstart and trainer doc

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/9646399
2026-05-18 05:05:00 +02:00 · 2022-08-04 22:38:31 +08:00
parent 845cc869ca
commit 49192f94be
5 changed files with 12 additions and 22 deletions
--- a/docs/source/quick_start.md
+++ b/docs/source/quick_start.md
@@ -1,7 +1,7 @@
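 # Quick Start
-ModelScope Library currently supports model training and inference with the TensorFlow and PyTorch deep learning frameworks. It has been tested on Python 3.7+, PyTorch 1.8+, TensorFlow 1.13-1.15, and TensorFlow 2.x.
+ModelScope Library currently supports model training and inference with the TensorFlow and PyTorch deep learning frameworks. It has been tested on Python 3.7+, PyTorch 1.8+, TensorFlow 1.15, and TensorFlow 2.x.

-Note: in the current (630) release, `speech-related` features are supported only on `linux` with Python 3.7 and TensorFlow 1.13-1.15. Other features can be installed and used on Windows and macOS.
+Note: `speech-related` features are supported only on `linux` with Python 3.7 and TensorFlow 1.15. Other features can be installed and used on Windows and macOS.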

 ## Python environment setup
 First, follow the [documentation](https://docs.anaconda.com/anaconda/install/) to install and configure an Anaconda environment.
--- a/docs/source/tutorials/trainer.md
+++ b/docs/source/tutorials/trainer.md
@@ -8,22 +8,10 @@ Modelscope提供了众多预训练模型，你可以使用其中任意一个，

 Before starting fine-tuning, prepare a dataset for training and evaluation; see the dataset tutorial for details.

-`Temporary approach`: we create a dummy dataset through the dataset interface.
 ```python
 from datasets import Dataset
-dataset_dict = {
-    'sentence1': [
-        'This is test sentence1-1', 'This is test sentence2-1',
-        'This is test sentence3-1'
-    ],
-    'sentence2': [
-        'This is test sentence1-2', 'This is test sentence2-2',
-        'This is test sentence3-2'
-    ],
-    'label': [0, 1, 1]
-}
-train_dataset = MsDataset.from_hf_dataset(Dataset.from_dict(dataset_dict))
-eval_dataset = MsDataset.from_hf_dataset(Dataset.from_dict(dataset_dict))
+train_dataset = MsDataset.load('afqmc_small', namespace='modelscope', split='train')
+eval_dataset = MsDataset.load('afqmc_small', namespace='modelscope', split='validation')
 ```
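 ### Training
 ModelScope keeps all training-related configuration in the `configuration.json` file in the model repository, so we only need to create a Trainer, load the configuration file, and pass in the datasets to run training.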
--- a/modelscope/metainfo.py
+++ b/modelscope/metainfo.py
@@ -141,7 +141,7 @@ class Trainers(object):
        Holds the standard trainer name to use for identifying different trainer.
    This should be used to register trainers.

-        For a general Trainer, you can use easynlp-trainer/ofa-trainer.
+        For a general Trainer, you can use EpochBasedTrainer.
        For a model specific Trainer, you can use ${ModelName}-${Task}-trainer.
    """

--- a/modelscope/outputs.py
+++ b/modelscope/outputs.py
@@ -214,10 +214,10 @@ TASK_OUTPUTS = {
    Tasks.nli: [OutputKeys.SCORES, OutputKeys.LABELS],

    # sentiment classification result for single sample
-    #   {
-    #       "labels": ["happy", "sad", "calm", "angry"],
-    #       "scores": [0.9, 0.1, 0.05, 0.05]
-    #   }
+    # {
+    #     'scores': [0.07183828949928284, 0.9281617403030396],
+    #     'labels': ['1', '0']
+    # }
    Tasks.sentiment_classification: [OutputKeys.SCORES, OutputKeys.LABELS],

    # zero-shot classification result for single sample
--- a/requirements/nlp.txt
+++ b/requirements/nlp.txt
@@ -1,6 +1,8 @@
 en_core_web_sm>=2.3.5
 fairseq>=0.10.2
-pai-easynlp
+# temporarily remove pai-easynlp due to its hard dependency on scipy==1.5.4
+# will be added back
+# pai-easynlp
 # rouge_score was just recently updated from 0.0.4 to 0.0.7
 # which introduced compatibility issues that are being investigated
 rouge_score<=0.0.4
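
Note on the `modelscope/outputs.py` change: the updated comment documents the sentiment classification result as two lists, `scores` and `labels`. The sketch below shows one way a caller might pick the top prediction from such a result. It assumes the two lists are parallel (that `scores[i]` belongs to `labels[i]`), which the comment suggests but does not state. `top_prediction` is a hypothetical helper, not part of ModelScope, and the sample values are copied from the comment rather than produced by a pipeline.

```python
from typing import Dict, List, Tuple


def top_prediction(result: Dict[str, List]) -> Tuple[str, float]:
    """Return the (label, score) pair with the highest score.

    Hypothetical helper: assumes result['scores'] and result['labels']
    are parallel lists of equal length.
    """
    scores = result['scores']
    labels = result['labels']
    if not scores:
        raise ValueError('result contains no scores')
    if len(scores) != len(labels):
        raise ValueError('scores and labels must have the same length')
    # Index of the largest score; ties resolve to the first occurrence.
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best], scores[best]


# Sample values copied from the comment in modelscope/outputs.py.
example = {
    'scores': [0.07183828949928284, 0.9281617403030396],
    'labels': ['1', '0'],
}

label, score = top_prediction(example)
print(label, round(score, 4))
```

Because the labels here are the strings `'1'` and `'0'` rather than human-readable names, downstream code would likely need its own mapping from label id to class name; that mapping is model-specific and is not shown in this commit.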