18 Commits

Author SHA1 Message Date
hemu.zp
2b1af959d5 Convert cfg during training
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11900238
2023-03-09 22:27:44 +08:00
yuze.zyz
4dca4773db Support csanmt exporting and refactor some code
1. Support csanmt exporting to savedmodel format
2. Create a new base class for text-ranking preprocessors, and move some parameters of mgeo_ranking_preprocessor to init method
3. Avoid Model & Preprocessor classes coupled with pytorch
4. Regression test supports comparing only model output
5. Support zero-shot exporting to onnx and torchscript

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11522461
2023-02-10 05:15:04 +00:00
wenmeng.zwm
d5ae8ae43b remove tensorboard hook as default
tensorboard has been removed from the framework.txt requirements, so the tensorboard hook is removed from the default config
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11519980

    * remove tensorboard hook as default

* Merge branch 'master' into fix/remove_default_tensorboard_hook
2023-02-08 10:07:07 +00:00
wenmeng.zwm
8f6a0f64e2 add support for eval configuration and fix logger problem
1. add support for configuring `gpu_collect` and `cache_dir` (the latter is used for CPU result gathering); example configuration:

```json
"evaluation": {
    "gpu_collect":  false,
    "cache_dir": "path/to/your/local/cache"
}
```

2. fix the missing log file when `log_file` is passed to `get_logger`, and add `log_file` for the trainer
3. automatically create `work_dir` in the rank0 worker
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/11342068

    * add support for configuration for tmpdir and gpu_collect
2023-01-09 02:51:35 +08:00
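
For illustration, the `evaluation` options introduced in commit 8f6a0f64e2 might be read as in the sketch below. The helper name `resolve_eval_options` and the fallback values are assumptions made for this example; they are not the library's actual API or its confirmed defaults.

```python
import json

# Example configuration, mirroring the fragment in the commit message.
CONFIG_TEXT = """
{
    "evaluation": {
        "gpu_collect": false,
        "cache_dir": "path/to/your/local/cache"
    }
}
"""


def resolve_eval_options(cfg: dict) -> dict:
    """Hypothetical helper: extract result-gathering options with fallbacks.

    The fallbacks (gpu_collect=False, cache_dir=None) are assumptions for
    this sketch, not values confirmed by the commit.
    """
    evaluation = cfg.get("evaluation", {})
    return {
        "gpu_collect": bool(evaluation.get("gpu_collect", False)),
        "cache_dir": evaluation.get("cache_dir"),
    }


options = resolve_eval_options(json.loads(CONFIG_TEXT))
print(options)
```

Reading both keys through one helper keeps the fallback behaviour in a single place when the `evaluation` section is absent from a config file.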
wenmeng.zwm
c9a6b887a2 add tensorboard hook for visualization
1. add tensorboard hook to default config
2. add image visualization support to tensorboard hook and trainer
3. move evaluation logic out of single_gpu_test and multi_gpu_test to make prediction results available for further processing such as result saving and visualization.

visualization results are as follows:
![image.png](https://cn-hangzhou.oss-cdn.aliyun-inc.com/git/force/uploads/comment/29212/38448470860386707/image.png)
![image.png](https://cn-hangzhou.oss-cdn.aliyun-inc.com/git/force/uploads/comment/29212/38437794200606734/image.png)
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10894813
2022-12-02 15:13:24 +08:00
yingda.chen
4e4faa9a30 specify file encoding when opening text files for reading
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/10708723
2022-11-14 14:16:08 +08:00
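
A minimal sketch of the pattern commit 4e4faa9a30 describes: passing an explicit `encoding` to `open()` so that reads do not depend on the platform's default encoding. The helper name `read_text` and the choice of UTF-8 are assumptions for this example; the commit message does not state which encoding was used.

```python
import os
import tempfile


def read_text(path: str, encoding: str = "utf-8") -> str:
    """Read a text file with an explicit encoding (UTF-8 assumed here)."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()


# Round-trip some non-ASCII text to show the explicit encoding in action.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.txt")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write("模型 config")
    content = read_text(sample_path)
```

Without the `encoding` argument, `open()` falls back to a locale-dependent default, which can raise `UnicodeDecodeError` or silently misread text on some systems.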
jiangnana.jnn
5e176da3a1 adapt to msdataset for EasyCV
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/9935664

    * adapt to msdataset for EasyCV
2022-09-09 10:01:51 +08:00
jiangnana.jnn
1a22fa0222 fix trainer unittest
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/9970626

    * fix trainer unittest
2022-09-02 14:06:08 +08:00
mulin.lyh
12698b31a0 [to #44340132] fix: ci case run out of gpu memory
2022-08-30 17:59:15 +08:00
zhangzhicheng.zzc
b94bb74f66 [to #42322933] Add model.save_pretrained method and allow finetune results used by pipeline
2022-08-24 21:39:08 +08:00
jiangnana.jnn
cfc3d1eed7 fix trainer about iters_per_epoch
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/9791200

    * fix trainer about iters_per_epoch
2022-08-17 20:06:25 +08:00
jiangnana.jnn
76482cc3ea [to #43850241] fix processor and collate_fn
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/9644184

    * fix distributed training and eval
2022-08-16 12:04:07 +08:00
jiangnana.jnn
6f5b864735 [to #43850241] fix unittest
Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/9660779

    * fix unittest
2022-08-05 18:39:59 +08:00
zhangzhicheng.zzc
9d0b38b4e4 [to #42322933] lazy load on trainer
2022-08-04 14:07:14 +08:00
jiangnana.jnn
34840fc5d8 [to #43627720] support ReduceLROnPlateau and fix lr scheduler
1. Support the `ReduceLROnPlateau` lr scheduler, and add `PlateauLrSchedulerHook` for it
2. Support custom `optimizer_hook` and `lr_scheduler_hook`
3. Remove best-checkpoint saving from `EvaluationHook`; `BestCkptSaverHook` replaces it
4. `evaluation_loop` returns metric values directly; metric computation moves to `single_gpu_test` and `multi_gpu_test`
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/9584322

    * [to #43627720] support ReduceLROnPlateau and fix lr scheduler
2022-08-02 14:49:48 +08:00
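
The reduce-on-plateau idea behind commit 34840fc5d8 can be sketched in plain Python as below. This is a simplified illustration, not PyTorch's `ReduceLROnPlateau` and not the repository's `PlateauLrSchedulerHook`: the class name `PlateauScheduler` is hypothetical, a lower metric is assumed to be better, and options such as threshold, cooldown and minimum learning rate are omitted.

```python
class PlateauScheduler:
    """Simplified reduce-on-plateau logic (assumes a lower metric is better)."""

    def __init__(self, lr: float, factor: float = 0.5, patience: int = 2):
        self.lr = lr
        self.factor = factor      # multiplier applied when the metric plateaus
        self.patience = patience  # non-improving steps tolerated before decay
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, metric: float) -> float:
        """Record one evaluation metric and return the current learning rate."""
        if metric < self.best:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        if self.bad_epochs > self.patience:
            self.lr *= self.factor
            self.bad_epochs = 0
        return self.lr


scheduler = PlateauScheduler(lr=0.1, factor=0.5, patience=2)
history = [scheduler.step(loss) for loss in [1.0, 0.9, 0.9, 0.9, 0.9, 0.8]]
```

Because a scheduler of this kind needs the evaluation metric at every step, it has to be driven differently from schedulers that depend only on the epoch or iteration count, which is consistent with the commit adding a dedicated hook for it.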
feiwu.yfw
2c3875c0e1 [to #43299989] Fix msdataset
* fix msdataset
        Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/9436292

    * fix msdataset
2022-07-20 16:38:15 +08:00
jiangnana.jnn
f3d739bea7 [to #43105545] add default config and new hooks
2022-07-19 17:41:25 +08:00
wenmeng.zwm
231f400133 [to #43112534] finetune support and first case
co-contributed with 夕陌&雨泓

 * add torch epoch based trainer and dis utils
 * add hooks including optimizer, lrscheduler, logging, checkpoint, evaluation, time profiling
 * add torch model base and test
 * add optimizer and lrscheduler module
 * add sbert for text classification example
 * add task_dataset for dataset-level processor

Link: https://code.alibaba-inc.com/Ali-MaaS/MaaS-lib/codereview/9338412
2022-07-14 16:25:55 +08:00