Merge branch 'master-github' into master-merge-github0406

yuze.zyz
2023-04-06 15:15:37 +08:00
29 changed files with 2050 additions and 44 deletions

34
.github/ISSUE_TEMPLATE/bug_report.md vendored Normal file

@@ -0,0 +1,34 @@
---
name: Bug report
about: Create a bug report to help us improve
title: ''
labels: ''
assignees: Firmament-cyou, tastelikefeet, wangxingjun778, wenmengzhou, zzclynn
---
Thanks for your error report; we appreciate it a lot.
**Checklist**
* I have searched the tutorials on the ModelScope [doc-site](https://modelscope.cn/docs)
* I have searched related issues but could not get the expected help.
* The bug has not been fixed in the latest version.
**Describe the bug**
A clear and concise description of what the bug is.
**To Reproduce**
* What command or script did you run?
> A placeholder for the command.
* Did you make any modifications to the code or config? Do you understand what you have modified?
* What dataset did you use?
**Your Environments (__required__)**
* OS: `uname -a`
* CPU: `lscpu`
* Commit id (e.g. `a3ffc7d8`)
* You may add any additional information that may help locate the problem, such as
* How you installed PyTorch [e.g., pip, conda, source]
* Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)
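For convenience when filling in the environment section, here is a small optional Python sketch (not part of this template) that prints the requested details; the PyTorch and git lookups are wrapped so it still runs when either is unavailable:
```python
# Optional helper (not part of the template) for gathering the details requested above.
import platform
import subprocess
import sys


def collect_environment() -> None:
    print('OS        :', platform.platform())          # rough equivalent of `uname -a`
    print('Python    :', sys.version.split()[0])
    try:
        import torch
        print('PyTorch   :', torch.__version__)
    except ImportError:
        print('PyTorch   : not installed')
    try:
        commit = subprocess.check_output(
            ['git', 'rev-parse', '--short', 'HEAD'], text=True).strip()
        print('Commit id :', commit)                    # e.g. a3ffc7d8
    except Exception:
        print('Commit id : unknown (not run inside a git checkout?)')


if __name__ == '__main__':
    collect_environment()
```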


@@ -0,0 +1,20 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: ''
assignees: tastelikefeet, wangxingjun778, wenmengzhou, yingdachen, zzclynn
---
**Describe the feature**
A clear and concise description of the feature.
**Motivation**
A clear and concise description of the motivation of the feature. Ex1. It is inconvenient when [....]. Ex2. There is a recent paper [....], which is very helpful for [....].
**Related resources**
If there is an official code release or third-party implementations, please also provide the information here, which would be very helpful.
**Additional context**
Add any other context or screenshots about the feature request here. If you would like to implement the feature and create a PR, please leave a comment here and that would be much appreciated.

17
.github/ISSUE_TEMPLATE/question.md vendored Normal file

@@ -0,0 +1,17 @@
---
name: Question
about: Ask a general question about ModelScope.
title: ''
labels: ''
assignees: zzclynn
---
**General Question**
Before asking a question, make sure you have:
* Searched the tutorials on the ModelScope [doc-site](https://modelscope.cn/docs)
* Googled your question.
* Searched related issues but could not get the expected help.
* Checked that the issue has not been fixed in the latest version.

133
CODE_OF_CONDUCT.md Normal file

@@ -0,0 +1,133 @@
# Contributor Covenant Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, caste, color, religion, or sexual
identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the overall
community
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or advances of
any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email address,
without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
## Scope
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
feedback@huggingface.co.
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series of
actions.
**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or permanent
ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within the
community.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.1, available at
[https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].
Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder][Mozilla CoC].
For answers to common questions about this code of conduct, see the FAQ at
[https://www.contributor-covenant.org/faq][FAQ]. Translations are available at
[https://www.contributor-covenant.org/translations][translations].
[homepage]: https://www.contributor-covenant.org
[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
[Mozilla CoC]: https://github.com/mozilla/diversity
[FAQ]: https://www.contributor-covenant.org/faq
[translations]: https://www.contributor-covenant.org/translations


@@ -117,25 +117,35 @@ sudo apt-get install git-lfs
git lfs install
```
2. We use a public-read model repository on ModelScope to store test data. It has been added as a git submodule at the path data/test by default. To clone it together with the main repository, use:
```shell
git clone git@github.com:modelscope/modelscope.git --recursive
```
3. Each time you add new data, go to the data/test directory (note that you are now in the submodule's git directory), check that you are on the master branch, and pull the latest master branch:
```shell
git branch
git checkout master
git pull origin master
```
4. Track your new test data type, then add and commit the new files on the master branch:
```shell
cd data/test/
git lfs track "*.png"
git add test.png
git commit -m "add test.png"
git push origin master
```
5. Return to the modelscope directory and commit the submodule update:
```shell
cd ../../
git add data/test
git commit -m "update test data"
```
Note: By default, all members of the ModelScope organization are granted write permission. If you run into permission issues, please send an email to ModelScope's official address ([contact@modelscope.cn](mailto:contact@modelscope.cn)) and a dedicated person will follow up with you by email.

152
docs/source/develop_cn.md Normal file

@@ -0,0 +1,152 @@
# Development
## 1. Code Style
We adopt [PEP8](https://www.python.org/dev/peps/pep-0008/) as the preferred code style.
We use the following tools for linting and formatting:
- [flake8](http://flake8.pycqa.org/en/latest/): linter
- [yapf](https://github.com/google/yapf): formatter
- [isort](https://github.com/timothycrosley/isort): import sorting
The style configurations for yapf and isort can be found in setup.cfg. We use the [pre-commit hook](https://pre-commit.com/) to automatically check and format **flake8**, **yapf**, **seed-isort-config**, **isort** and **trailing whitespaces**, fix **end-of-files**, and sort **requirements.txt** on every commit. The pre-commit hook configuration is stored in .pre-commit-config.yaml. After cloning the repository, you need to install and initialize the pre-commit hook.
```bash
pip install -r requirements/tests.txt
```
From the repository folder, run
```bash
pre-commit install
```
After that, the linters and formatter will run on every commit.
If you want to run the pre-commit hook on all files, run
```bash
pre-commit run --all-files
```
If you only want to format and lint your code, run
```bash
make linter
```
## 2. Testing
### 2.1 Test levels
There are three main test levels:
- Level 0: tests of the framework's basic interfaces and functionality, e.g. **tests/trainers/test_trainer_base.py**
- Level 1: important feature tests covering end-to-end workflows, e.g. **tests/pipelines/test_image_matting.py**
- Level 2: scenario tests for all implemented modules (such as models and pipelines) across the different algorithm fields.
The default test level is 0, which runs only level-0 test cases; you can set the test level via the environment variable **TEST_LEVEL**.
```bash
# run all tests
TEST_LEVEL=2 make test
# run important feature tests
TEST_LEVEL=1 make test
# run core unit tests and basic feature tests
make test
```
When writing test cases, you should assign a test level to each case, as shown below. If you keep the default, the test level will be 0 and the case will run in every test stage.
test_module.py
```python
import unittest

from modelscope.utils.test_utils import test_level


class ImageCartoonTest(unittest.TestCase):

    @unittest.skipUnless(test_level() >= 1, 'skip test in current test level')
    def test_run_by_direct_model_download(self):
        pass
```
### 2.2 Running tests
1. Run your own single test case to verify the functionality you implemented. You can run the test file directly; if it does not run, check whether the environment variable **TEST_LEVEL** is set, and unset it if so.
```bash
python tests/path/to/your_test.py
```
2. Before requesting code review, remember to run the core tests in your local environment; by default only level-0 test cases will run.
```bash
make tests
```
3. After your code review starts, continuous-integration tests are triggered, which run the level-1 test cases.
4. At midnight every day, a daily regression test runs against the master branch, covering all test cases.
### 2.3 Test data storage
Since we need a large amount of test data, including images, videos, and models, we use git lfs to store these large files.
1. Install git-lfs (version >= 2.5.0). For Mac:
```bash
brew install git-lfs
git lfs install
```
For CentOS, download the rpm file from the git-lfs GitHub releases [page](https://github.com/git-lfs/git-lfs/releases/tag/v3.2.0), then run
```bash
sudo rpm -ivh your_rpm_file_name.rpm
git lfs install
```
For Ubuntu:
```bash
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install git-lfs
git lfs install
```
2. We use a public-read model repository on ModelScope to store test data. It has been added as a git submodule at the path data/test by default. To clone it, use the following command:
```shell
git clone git@github.com:modelscope/modelscope.git --recursive
```
3. Each time you add new data, go to the data/test directory (note that you are now in the submodule's git directory), check that you are on the master branch, and pull the latest master branch:
```shell
git branch
git checkout master
git pull origin master
```
4. Track your new test data type, then add and commit the new files on the master branch:
```shell
cd data/test/
git lfs track "*.png"
git add test.png
git commit -m "add test.png"
git push origin master
```
5. Return to the modelscope directory and commit the submodule update:
```shell
cd ../../
git add data/test
git commit -m "update test data"
```
Note: By default, all members of the ModelScope organization are granted write permission. If you run into permission issues, please send an email to ModelScope's official address ([contact@modelscope.cn](mailto:contact@modelscope.cn)) and a dedicated person will follow up with you by email.
## Development and Code Review
1. Fetch the latest master code and check out a new branch for local development.
```shell
git pull origin master --rebase
git checkout -b dev/my-dev-branch
```
Note: replace "dev/my-dev-branch" with a meaningful branch name. We recommend using a new dev branch for every change.
2. Make your local changes.
3. Commit your local changes.
```shell
git add .
git commit -m "[to #42322933] my commit message"
```
4. Push the changes:
```shell
git push --set-upstream origin dev/my-dev-branch
```
Afterwards, you can push to the same branch multiple times with 'git push'.
5. Create a pull request on GitHub to merge your code into the master branch.
## Build the pip package
```bash
make whl
```


@@ -160,7 +160,7 @@ def model_file_download(
def get_file_download_url(model_id: str, file_path: str, revision: str):
"""Format file download url according to `model_id`, `revision` and `file_path`.
e.g., Given `model_id=john/bert`, `revision=master`, `file_path=README.md`,
    the resulting download url is: https://modelscope.cn/api/v1/models/john/bert/repo?Revision=master&FilePath=README.md
Args:
model_id (str): The model_id.


@@ -154,6 +154,7 @@ class Models(object):
T5 = 'T5'
mglm = 'mglm'
codegeex = 'codegeex'
glm130b = 'glm130b'
bloom = 'bloom'
unite = 'unite'
megatron_bert = 'megatron-bert'
@@ -445,6 +446,7 @@ class Pipelines(object):
mglm_text_summarization = 'mglm-text-summarization'
codegeex_code_translation = 'codegeex-code-translation'
codegeex_code_generation = 'codegeex-code-generation'
glm130b_text_generation = 'glm130b-text-generation'
translation_en_to_de = 'translation_en_to_de' # keep it underscore
translation_en_to_ro = 'translation_en_to_ro' # keep it underscore
translation_en_to_fr = 'translation_en_to_fr' # keep it underscore


@@ -18,6 +18,7 @@ if TYPE_CHECKING:
)
from .bloom import BloomModel
from .codegeex import CodeGeeXForCodeTranslation, CodeGeeXForCodeGeneration
from .glm_130b import GLM130bForTextGeneration
from .csanmt import CsanmtForTranslation
from .deberta_v2 import DebertaV2ForMaskedLM, DebertaV2Model
from .gpt_neo import GPTNeoModel
@@ -89,6 +90,7 @@ else:
'csanmt': ['CsanmtForTranslation'],
'codegeex':
['CodeGeeXForCodeTranslation', 'CodeGeeXForCodeGeneration'],
'glm_130b': ['GLM130bForTextGeneration'],
'deberta_v2': ['DebertaV2ForMaskedLM', 'DebertaV2Model'],
'heads': ['TextClassificationHead'],
'hf_transformers': ['TransformersModel'],


@@ -5,7 +5,7 @@ from modelscope.utils.import_utils import LazyImportModule
if TYPE_CHECKING:
from .document_grounded_dialog_generate import DocumentGroundedDialogGenerateModel
    from .document_grounded_dialog_rerank import DocumentGroundedDialogRerankModel
from .document_grounded_dialog_retrieval import DocumentGroundedDialogRetrievalModel
else:
_import_structure = {


@@ -13,7 +13,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""PyTorch BERT model."""
from __future__ import absolute_import, division, print_function
import os.path


@@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright Aohan Zeng
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


@@ -0,0 +1,33 @@
The GLM-130B License
1. Definitions
“Licensor” means the GLM-130B Model Team that distributes its Software.
“Software” means the GLM-130B model parameters made available under this license.
2. License Grant
Subject to the terms and conditions of this License, the Licensor hereby grants to you a non-exclusive, worldwide, non-transferable, non-sublicensable, revocable, royalty-free copyright license to use the Software solely for your non-commercial research purposes.
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
3. Restriction
You will not use, copy, modify, merge, publish, distribute, reproduce, or create derivative works of the Software, in whole or in part, for any commercial, military, or illegal purposes.
You will not use the Software for any act that may undermine China's national security and national unity, harm the public interest of society, or infringe upon the rights and interests of human beings.
4. Disclaimer
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
5. Limitation of Liability
EXCEPT TO THE EXTENT PROHIBITED BY APPLICABLE LAW, IN NO EVENT AND UNDER NO LEGAL THEORY, WHETHER BASED IN TORT, NEGLIGENCE, CONTRACT, LIABILITY, OR OTHERWISE WILL ANY LICENSOR BE LIABLE TO YOU FOR ANY DIRECT, INDIRECT, SPECIAL, INCIDENTAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES, OR ANY OTHER COMMERCIAL LOSSES, EVEN IF THE LICENSOR HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
6. Dispute Resolution
This license shall be governed and construed in accordance with the laws of the People's Republic of China. Any dispute arising from or in connection with this License shall be submitted to Haidian District People's Court in Beijing.
Note that the license is subject to update to a more comprehensive version. For any questions related to the license and copyright, please contact us at glm-130b@googlegroups.com.


@@ -0,0 +1,20 @@
# Modified by Zhipu.AI
# Original Copyright (c) Alibaba, Inc. and its affiliates.
from typing import TYPE_CHECKING, Union
from modelscope.utils.import_utils import LazyImportModule
if TYPE_CHECKING:
from .text_generation import GLM130bForTextGeneration
else:
_import_structure = {'text_generation': ['GLM130bForTextGeneration']}
import sys
sys.modules[__name__] = LazyImportModule(
__name__,
globals()['__file__'],
_import_structure,
module_spec=__spec__,
extra_objects={},
)


@@ -0,0 +1,2 @@
# Copyright (c) 2022 Zhipu.AI
from .strategies import BaseStrategy, BeamSearchStrategy


@@ -0,0 +1,240 @@
# Copyright (c) 2022 Zhipu.AI
import numpy as np
import torch
import torch.nn.functional as F
from SwissArmyTransformer.generation.sampling_strategies.base_strategy import \
top_k_logits
class BaseStrategy:
def __init__(self,
batch_size,
invalid_slices=[],
temperature=1.,
top_k=200,
eps=1e-4,
top_p=0.0,
end_tokens=None):
self.batch_size = batch_size
self.invalid_slices = invalid_slices
self.temperature = temperature
self.topk = top_k
self.top_p = top_p
self.eps = eps
if end_tokens is None:
end_tokens = []
self.end_tokens = end_tokens
self._is_done = np.zeros(self.batch_size, dtype=bool)
@property
def is_done(self) -> bool:
return self._is_done.all()
def forward(self, logits, tokens, mems, temperature=None):
logits = logits.view(-1, logits.size(-1))
batch_size = tokens.shape[0]
if temperature is None:
temperature = self.temperature
logits = logits / temperature
for invalid_slice in self.invalid_slices:
logits[..., invalid_slice] = -65504
logits = top_k_logits(logits, self.topk, self.top_p)
probs = F.softmax(
logits.float(),
            dim=-1)  # float is essential, due to a bug in PyTorch
pred = torch.multinomial(probs, num_samples=1)
for i in range(self.batch_size):
if i >= batch_size:
self._is_done[i] = True
elif self._is_done[i]:
pred[i] = -1
elif pred[i].item() in self.end_tokens:
self._is_done[i] = True
tokens = torch.cat((tokens, pred.view(tokens.shape[:-1] + (1, ))),
dim=-1)
return tokens, mems
def finalize(self, tokens, mems):
self._is_done = np.zeros(self.batch_size, dtype=bool)
return tokens, mems
class BeamSearchStrategy:
def __init__(
self,
batch_size,
num_beams,
length_penalty=1.0,
consider_end=False,
end_tokens=[],
invalid_slices=[],
no_repeat_ngram_size=0,
min_gen_length=0,
deterministic=False,
):
self.batch_size = batch_size
self.num_beams = num_beams
self.length_penalty = length_penalty
self.end_tokens = end_tokens
self.ngram = no_repeat_ngram_size
self.min_gen_length = min_gen_length
self.invalid_slices = invalid_slices
self.consider_end = consider_end
self.deterministic = deterministic
self._init_cache()
def _init_cache(self):
self.end_beams = [[] for _ in range(self.batch_size)
] # list of LongTensors
self.end_beams_penalized_scores = [[] for _ in range(self.batch_size)
] # list of LongTensors
self.cached_beam_scores = 0 # [batch_size]
self.cached_beam_ngram_bans = [[{} for _ in range(self.num_beams)]
for _ in range(self.batch_size)]
self.length_generated = 0
self._is_done = np.zeros(self.batch_size, dtype=bool)
def _add_end_beams(self, score, beam, batch_idx):
score = score / ((5.0 + len(beam))
/ 6)**self.length_penalty # Magic number for OpenNMT
for i in range(len(self.end_beams[batch_idx]), -1, -1):
if i == 0 or score < self.end_beams_penalized_scores[batch_idx][
i - 1]:
break
self.end_beams[batch_idx].insert(i, beam)
self.end_beams_penalized_scores[batch_idx].insert(i, score)
self.end_beams[batch_idx] = self.end_beams[batch_idx][:self.num_beams]
self.end_beams_penalized_scores[
batch_idx] = self.end_beams_penalized_scores[batch_idx][:self.
num_beams]
@property
def is_done(self) -> bool:
return self._is_done.all()
def forward(self, logits, tokens, mems):
batch_size, num_beams, vocab_size = logits.shape
seq_len = tokens.shape[-1]
logits = logits.float()
for invalid_slice in self.invalid_slices:
logits[..., invalid_slice] = -65504
if self.min_gen_length > self.length_generated:
for end_token in self.end_tokens:
logits[..., end_token] = -65504
if self.ngram > 0 and seq_len > self.ngram:
for batch_idx in range(batch_size):
for i in range(num_beams):
ngram_prefix = tokens[batch_idx, i,
-(self.ngram
- 1):].tolist() # TODO ngram=1
for banned_index in self.cached_beam_ngram_bans[batch_idx][
i].get(tuple(ngram_prefix), []):
logits[batch_idx, i, banned_index] = -65504
next_token_scores = F.log_softmax(
logits, dim=-1) # [batch_size, vocab_size]
prev_scores = self.cached_beam_scores
if isinstance(prev_scores, torch.Tensor):
prev_scores = prev_scores[..., None].expand_as(next_token_scores)
next_token_scores = next_token_scores + prev_scores
next_token_scores = next_token_scores.view(batch_size,
num_beams * vocab_size)
probs = F.softmax(next_token_scores, dim=-1)
if num_beams < self.num_beams: # First token
probs = probs[..., :vocab_size]
if self.deterministic:
next_tokens = torch.topk(
probs, k=(max(1, len(self.end_tokens)) + 1)
* self.num_beams).indices # [2*nb]
else:
next_tokens = torch.multinomial(
probs,
num_samples=(max(1, len(self.end_tokens)) + 1)
* self.num_beams) # [2*nb]
next_token_scores = next_token_scores[
torch.arange(batch_size).unsqueeze(1), next_tokens]
next_token_scores, _indices = torch.sort(
next_token_scores, descending=True, dim=1)
next_tokens = next_tokens[torch.arange(batch_size).unsqueeze(1),
_indices]
next_indices = torch.div(
next_tokens, vocab_size, rounding_mode='trunc')
next_tokens = next_tokens % vocab_size
# select out end beams or continue beams
beam_continue_batch, score_continue_batch, mems_continue_batch = [], [], []
for batch_idx in range(batch_size):
beam_continue = []
scores_continue = []
bans_continue = []
mems_contiue = []
for i in range(len(next_tokens[batch_idx])):
beam = torch.cat(
(tokens[batch_idx, next_indices[batch_idx,
i]], next_tokens[batch_idx,
i:i + 1]))
if not self._is_done[batch_idx] and int(
next_tokens[batch_idx, i]) in self.end_tokens:
self._add_end_beams(next_token_scores[batch_idx, i], beam,
batch_idx)
elif len(beam_continue) < self.num_beams:
beam_continue.append(beam)
mems_contiue.append(mems[:, batch_idx,
next_indices[batch_idx, i]])
# update caches
scores_continue.append(next_token_scores[batch_idx, i])
if self.ngram > 0:
bans = self.cached_beam_ngram_bans[batch_idx][
next_indices[batch_idx, i]].copy()
# TODO ngram=1
ngram_prefix = tuple(
tokens[batch_idx, next_indices[batch_idx, i],
-(self.ngram - 1):].tolist())
bans[ngram_prefix] = bans.get(
ngram_prefix, tuple()) + (next_tokens[batch_idx,
i], )
bans_continue.append(bans)
else:
break
beam_continue_batch.append(torch.stack(beam_continue))
mems_continue_batch.append(torch.stack(mems_contiue, dim=1))
score_continue_batch.append(scores_continue)
self.cached_beam_ngram_bans[batch_idx] = bans_continue
tokens = torch.stack(beam_continue_batch)
mems = torch.stack(mems_continue_batch, dim=1)
self.cached_beam_scores = torch.tensor(
score_continue_batch, device=logits.device)
self.length_generated += 1
for batch_idx in range(self.batch_size):
if batch_idx >= batch_size:
self._is_done[batch_idx] = True
elif (len(self.end_beams[batch_idx]) == self.num_beams
and self.end_beams_penalized_scores[batch_idx][-1] >= # noqa
self.cached_beam_scores[batch_idx].max() / # noqa
((5.0 + (seq_len + 1)) / 6)**self.length_penalty): # noqa
self._is_done[batch_idx] = True
return tokens, mems
def finalize(self, tokens, mems):
if self.consider_end:
batch_size, num_beams = tokens.shape[:2]
for batch_idx in range(batch_size):
if not self._is_done[batch_idx]:
for i in range(num_beams):
self._add_end_beams(
self.cached_beam_scores[batch_idx, i],
tokens[batch_idx, i], batch_idx)
mems = None
ret = self.end_beams[:batch_size]
else:
ret = tokens
self._init_cache()
return ret, mems
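The finished-beam ranking above hinges on the OpenNMT-style length penalty applied in `_add_end_beams` (the "magic number" comment). A minimal standalone sketch of that normalization, using made-up scores purely for illustration:
```python
# Standalone sketch of the length penalty used in BeamSearchStrategy._add_end_beams:
# the summed log-probability of a finished beam is divided by ((5 + len) / 6) ** alpha,
# so longer beams are penalized less for alpha > 0.
def length_penalized_score(log_prob_sum: float, beam_len: int, alpha: float = 1.0) -> float:
    return log_prob_sum / (((5.0 + beam_len) / 6.0) ** alpha)


if __name__ == '__main__':
    # Two finished beams with the same raw score: the longer one ranks higher.
    print(length_penalized_score(-12.0, beam_len=10))  # -4.8
    print(length_penalized_score(-12.0, beam_len=20))  # approximately -2.88
```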


@@ -0,0 +1,161 @@
# Copyright (c) 2022 Zhipu.AI
import argparse
import time
import torch
from SwissArmyTransformer import get_args, get_tokenizer
from SwissArmyTransformer.arguments import initialize_distributed
from SwissArmyTransformer.model import GLM130B
from SwissArmyTransformer.mpu import (get_model_parallel_group,
get_model_parallel_rank,
get_model_parallel_world_size)
from SwissArmyTransformer.training import load_checkpoint
from .quantization import quantize
def add_bminf_args(parser):
"""Arguments for BMInf"""
group = parser.add_argument_group('BMInf')
group.add_argument(
'--bminf',
action='store_true',
help='Use BMInf to support low resource evaluation')
group.add_argument(
'--bminf-memory-limit',
type=int,
default=20,
help='Max memory for model per GPU (in GB)')
return parser
def add_quantization_args(parser):
group = parser.add_argument_group('Quantization')
group.add_argument('--quantization-bit-width', type=int, default=4)
group.add_argument(
'--from-quantized-checkpoint',
type=bool,
default=True,
help='Loading from a quantized checkpoint')
def add_initialization_args(parser):
group = parser.add_argument_group('Initialization')
group.add_argument(
'--sequential-initialization',
action='store_true',
help=
'Initialize sequentially in tensor parallel group (reduce CPU RAM for initialization)',
)
def set_up_model_args(args):
args.model_parallel_size = 4
args.num_layers = 70
args.hidden_size = 12288
args.inner_hidden_size = 32768
args.vocab_size = 150528
args.num_attention_heads = 96
args.max_sequence_length = 2048
args.tokenizer_type = 'icetk-glm-130B'
args.layernorm_order = 'post'
args.skip_init = True
args.fp16 = True
args.mode = 'inference'
return args
def initialize(extra_args_provider):
parser = argparse.ArgumentParser(add_help=False)
add_bminf_args(parser)
add_quantization_args(parser)
add_initialization_args(parser)
GLM130B.add_model_specific_args(parser)
extra_args_provider(parser)
known, args_list = parser.parse_known_args()
args_list += ['--model-parallel-size', '4', '--mode', 'inference']
args = get_args(args_list)
args = set_up_model_args(args)
args = argparse.Namespace(**vars(args), **vars(known))
args.do_train = False
initialize_distributed(args)
return args
def initialize_model_and_tokenizer(args):
tokenizer = get_tokenizer(args)
torch.distributed.barrier()
start = time.time()
for i in range(get_model_parallel_world_size()):
if get_model_parallel_rank() == i:
# Initialize model
model = GLM130B(args).half()
if args.from_quantized_checkpoint:
assert args.quantization_bit_width is not None
# Quantize model before moving to GPU
model = quantize(model, args.quantization_bit_width)
# Load checkpoint
load_checkpoint(model, args)
if args.quantization_bit_width is not None and not args.from_quantized_checkpoint:
# Quantize model before moving to GPU
model = quantize(model, args.quantization_bit_width)
if args.bminf:
import bminf
if torch.distributed.get_rank() == 0:
print(
f'> BMInf activated, memory limit: {args.bminf_memory_limit} GB'
)
with torch.cuda.device(args.device):
model = bminf.wrapper(
model,
quantization=False,
memory_limit=args.bminf_memory_limit << 30)
else:
model = model.to(args.device)
if args.sequential_initialization:
torch.distributed.barrier(group=get_model_parallel_group())
torch.distributed.barrier()
if torch.distributed.get_rank() == 0:
print(f'> Model initialized in {time.time() - start:.1f}s')
torch.cuda.empty_cache()
model.eval()
# generate rotary embedding cache
original_parallel_output = model.transformer.parallel_output
model.transformer.parallel_output = True
with torch.no_grad():
_, *_ = model(
torch.ones(
1,
args.max_sequence_length,
device=torch.cuda.current_device(),
dtype=torch.int64),
torch.arange(
args.max_sequence_length,
device=torch.cuda.current_device(),
dtype=torch.int64).view(1, -1),
torch.randn(
1,
1,
args.max_sequence_length,
args.max_sequence_length,
device=torch.cuda.current_device(),
) < 0.5,
)
model.transformer.parallel_output = original_parallel_output
torch.distributed.barrier()
return model, tokenizer


@@ -0,0 +1,111 @@
# Copyright (c) 2022 Zhipu.AI
import ctypes
from typing import List
import pkg_resources
import torch
from cpm_kernels.kernels.base import (KernelFunction, LazyKernelCModule,
round_up)
RESOURCE_PACKAGE_NAME = __name__
class Kernel:
def __init__(self, filename: str, function_names: List[str]):
filename = filename + '.fatbin'
if not pkg_resources.resource_exists(RESOURCE_PACKAGE_NAME, filename):
raise RuntimeError('File `%s` not found in `%s`' %
(filename, RESOURCE_PACKAGE_NAME))
self.filename = filename
self.code = pkg_resources.resource_string(RESOURCE_PACKAGE_NAME,
filename)
self._function_names = function_names
self._cmodule = LazyKernelCModule(self.code)
for name in self._function_names:
setattr(self, name, KernelFunction(self._cmodule, name))
kernels = Kernel(
'quantization',
[
'int4WeightCompression',
'int4WeightExtractionFloat',
'int4WeightExtractionHalf',
'int8WeightExtractionFloat',
'int8WeightExtractionHalf',
],
)
def compress_int4_weight(weight: torch.Tensor): # (n, m)
with torch.cuda.device(weight.device):
n, m = weight.size(0), weight.size(1)
assert m % 2 == 0
m = m // 2
out = torch.empty(n, m, dtype=torch.int8, device='cuda')
stream = torch.cuda.current_stream()
gridDim = (n, 1, 1)
blockDim = (min(round_up(m, 32), 1024), 1, 1)
kernels.int4WeightCompression(
gridDim,
blockDim,
0,
stream,
[
ctypes.c_void_p(weight.data_ptr()),
ctypes.c_void_p(out.data_ptr()),
ctypes.c_int32(n),
ctypes.c_int32(m)
],
)
return out
def extract_weight_to_half(weight: torch.Tensor, scale_list: torch.Tensor,
source_bit_width: int):
if source_bit_width == 8:
func = kernels.int8WeightExtractionHalf
elif source_bit_width == 4:
func = kernels.int4WeightExtractionHalf
else:
assert False, 'Unsupported bit-width'
with torch.cuda.device(weight.device):
n, m = weight.size(0), weight.size(1)
out = torch.empty(
n, m * (8 // source_bit_width), dtype=torch.half, device='cuda')
stream = torch.cuda.current_stream()
gridDim = (n, 1, 1)
blockDim = (min(round_up(m, 32), 1024), 1, 1)
func(
gridDim,
blockDim,
0,
stream,
[
ctypes.c_void_p(weight.data_ptr()),
ctypes.c_void_p(scale_list.data_ptr()),
ctypes.c_void_p(out.data_ptr()),
ctypes.c_int32(n),
ctypes.c_int32(m),
],
)
return out
if __name__ == '__main__':
weight = torch.randn(4, 32).to(torch.int8).cuda()
scale = torch.ones(weight.size(0)).to(torch.half).cuda()
print(weight)
b = compress_int4_weight(weight)
print(b)
a = extract_weight_to_half(b, scale, source_bit_width=4)
print(a)


@@ -0,0 +1,67 @@
# Copyright (c) 2022 Zhipu.AI
import torch
from .layers import QuantizedColumnParallelLinear, QuantizedRowParallelLinear
def quantize(model, weight_bit_width):
"""Replace fp16 linear with quantized linear"""
if torch.distributed.get_rank() == 0:
print(f'> Quantizing model weight to {weight_bit_width} bits')
for layer in model.transformer.layers:
layer.attention.query_key_value = QuantizedColumnParallelLinear(
weight_bit_width=weight_bit_width,
weight=layer.attention.query_key_value.weight.to(
torch.cuda.current_device()),
input_size=layer.attention.query_key_value.input_size,
output_size=layer.attention.query_key_value.output_size,
bias=True,
gather_output=False,
params_dtype=torch.half,
name='query_key_value',
skip_init=True,
device=layer.attention.query_key_value.weight.device,
)
layer.attention.dense = QuantizedRowParallelLinear(
weight_bit_width=weight_bit_width,
weight=layer.attention.dense.weight.to(
torch.cuda.current_device()),
input_size=layer.attention.dense.input_size,
output_size=layer.attention.dense.output_size,
bias=True,
input_is_parallel=True,
params_dtype=torch.half,
name='dense',
skip_init=True,
device=layer.attention.dense.weight.device,
)
layer.mlp.dense_h_to_4h = QuantizedColumnParallelLinear(
weight_bit_width=weight_bit_width,
weight=layer.mlp.dense_h_to_4h.weight.to(
torch.cuda.current_device()),
input_size=layer.mlp.dense_h_to_4h.input_size,
output_size=layer.mlp.dense_h_to_4h.output_size,
bias=True,
gather_output=False,
params_dtype=torch.half,
name='dense_h_to_4h',
skip_init=True,
device=layer.mlp.dense_h_to_4h.weight.device,
)
layer.mlp.dense_4h_to_h = QuantizedRowParallelLinear(
weight_bit_width=weight_bit_width,
weight=layer.mlp.dense_4h_to_h.weight.to(
torch.cuda.current_device()),
input_size=layer.mlp.dense_4h_to_h.input_size,
output_size=layer.mlp.dense_4h_to_h.output_size,
bias=True,
input_is_parallel=True,
params_dtype=torch.half,
name='dense_h_to_4h',
skip_init=True,
device=layer.mlp.dense_4h_to_h.weight.device,
)
return model


@@ -0,0 +1,30 @@
# Copyright (c) 2022 Zhipu.AI
import torch
from ..kernels import extract_weight_to_half
class W8A16Linear(torch.autograd.Function):
@staticmethod
def forward(ctx, inp: torch.Tensor, quant_w: torch.Tensor,
scale_w: torch.Tensor, weight_bit_width):
ctx.inp_shape = inp.size()
ctx.weight_shape = quant_w.size()
ctx.weight_bit_width = weight_bit_width
out_features = quant_w.size(0)
inp = inp.contiguous().view(-1, inp.size(-1))
weight = extract_weight_to_half(quant_w, scale_w, weight_bit_width)
output = inp.mm(weight.t())
ctx.save_for_backward(inp, quant_w, scale_w)
return output.view(*(ctx.inp_shape[:-1] + (out_features, )))
@staticmethod
def backward(ctx, grad_output: torch.Tensor):
inp, quant_w, scale_w = ctx.saved_tensors
weight = extract_weight_to_half(quant_w, scale_w, ctx.weight_bit_width)
grad_output = grad_output.contiguous().view(-1, weight.size(0))
grad_input = grad_output.mm(weight)
grad_weight = grad_output.t().mm(inp)
return grad_input.view(ctx.inp_shape), grad_weight.view(
ctx.weight_shape), None


@@ -0,0 +1,113 @@
# Copyright (c) 2022 Zhipu.AI
import torch
from SwissArmyTransformer.mpu import (ColumnParallelLinear, RowParallelLinear,
copy_to_model_parallel_region,
gather_from_model_parallel_region,
reduce_from_model_parallel_region,
scatter_to_model_parallel_region)
from torch.nn.parameter import Parameter
from ..kernels import compress_int4_weight
from .functional import W8A16Linear
class QuantizedColumnParallelLinear(ColumnParallelLinear):
def __init__(self, weight_bit_width: int, weight=None, *args, **kwargs):
super(QuantizedColumnParallelLinear, self).__init__(*args, **kwargs)
self.weight_bit_width = weight_bit_width
shape = self.weight.shape
del self.weight
if weight is None:
self.weight = torch.empty(
shape[0],
shape[1] * weight_bit_width // 8,
dtype=torch.int8,
device=kwargs['device'])
self.weight_scale = torch.empty(
shape[0],
dtype=kwargs['params_dtype'],
device=kwargs['device'])
else:
self.weight_scale = (
weight.abs().max(dim=-1).values / ( # noqa
(2**(weight_bit_width - 1)) - 1)).half() # noqa
self.weight = torch.round(weight / self.weight_scale[:, None]).to(
torch.int8)
if weight_bit_width == 4:
self.weight = compress_int4_weight(self.weight)
self.weight = Parameter(
self.weight.to(kwargs['device']), requires_grad=False)
self.weight_scale = Parameter(
self.weight_scale.to(kwargs['device']), requires_grad=False)
def forward(self, input_):
# Set up backprop all-reduce.
input_parallel = copy_to_model_parallel_region(input_)
# Matrix multiply.
output_parallel = W8A16Linear.apply(input_parallel, self.weight,
self.weight_scale,
self.weight_bit_width)
if self.bias is not None:
output_parallel = output_parallel + self.bias
if self.gather_output:
# All-gather across the partitions.
output = gather_from_model_parallel_region(output_parallel)
else:
output = output_parallel
return output
class QuantizedRowParallelLinear(RowParallelLinear):
def __init__(self, weight_bit_width: int, weight=None, *args, **kwargs):
super(QuantizedRowParallelLinear, self).__init__(*args, **kwargs)
self.weight_bit_width = weight_bit_width
shape = self.weight.shape
del self.weight
if weight is None:
self.weight = torch.empty(
shape[0],
shape[1] * weight_bit_width // 8,
dtype=torch.int8,
device=kwargs['device'])
self.weight_scale = torch.empty(
shape[0],
dtype=kwargs['params_dtype'],
device=kwargs['device'])
else:
self.weight_scale = (
weight.abs().max(dim=-1).values / ( # noqa
(2**(weight_bit_width - 1)) - 1)).half() # noqa
self.weight = torch.round(weight / self.weight_scale[:, None]).to(
torch.int8)
if weight_bit_width == 4:
self.weight = compress_int4_weight(self.weight)
self.weight = Parameter(
self.weight.to(kwargs['device']), requires_grad=False)
self.weight_scale = Parameter(
self.weight_scale.to(kwargs['device']), requires_grad=False)
def forward(self, input_):
# Set up backprop all-reduce.
if self.input_is_parallel:
input_parallel = input_
else:
input_parallel = scatter_to_model_parallel_region(input_)
# Matrix multiply.
output_parallel = W8A16Linear.apply(input_parallel, self.weight,
self.weight_scale,
self.weight_bit_width)
# All-reduce across all the partitions.
output_ = reduce_from_model_parallel_region(output_parallel)
if self.bias is not None:
output = output_ + self.bias
else:
output = output_
return output
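Both quantized linear layers above compute a per-row symmetric scale (`abs-max / (2**(bit_width - 1) - 1)`), round the weights to int8, and later dequantize them back to floating point inside `W8A16Linear`. A CPU-only sketch of that round trip (int8 path only; the int4 packing done by `compress_int4_weight` is omitted):
```python
import torch


def quantize_rowwise(weight: torch.Tensor, bit_width: int = 8):
    # Per-row symmetric scale, mirroring the weight_scale computation above.
    scale = weight.abs().max(dim=-1).values / ((2 ** (bit_width - 1)) - 1)
    quantized = torch.round(weight / scale[:, None]).to(torch.int8)
    return quantized, scale


def dequantize_rowwise(quantized: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # CPU stand-in for what extract_weight_to_half computes on the GPU (int8 case).
    return quantized.float() * scale[:, None]


if __name__ == '__main__':
    w = torch.randn(4, 16)
    q, s = quantize_rowwise(w)
    print((w - dequantize_rowwise(q, s)).abs().max())  # small reconstruction error
```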


@@ -0,0 +1,354 @@
# Copyright (c) 2022 Zhipu.AI
import copy
import os
import random
import re
import stat
import sys
import time
from functools import partial
from typing import Any, Dict, List, Tuple
import torch
from SwissArmyTransformer import mpu
from SwissArmyTransformer.generation.autoregressive_sampling import (
get_masks_and_position_ids_default, update_mems)
from SwissArmyTransformer.generation.utils import (generate_continually,
timed_name)
from modelscope.metainfo import Models
from modelscope.models.base import TorchModel
from modelscope.models.builder import MODELS
from modelscope.outputs import OutputKeys
from modelscope.utils.config import Config
from modelscope.utils.constant import ModelFile, Tasks
from modelscope.utils.logger import get_logger
from .generation import BaseStrategy, BeamSearchStrategy
from .initialize import initialize, initialize_model_and_tokenizer
torch.set_num_threads(24)
logger = get_logger()
def batch_filling_sequence(
model,
seqs,
context_lengths,
strategy,
max_memory_length=100000,
get_masks_and_position_ids=get_masks_and_position_ids_default,
mems=None,
**kw_args):
'''
seq: [2, 3, 5, ..., -1(to be generated), -1, ...]
mems: [num_layers, batch_size, len_mems(index), mem_hidden_size]
cache, should be first mems.shape[1] parts of context_tokens.
mems are the first-level citizens here, but we don't assume what is memorized.
input mems are used when multi-phase generation.
'''
assert len(seqs.shape) == 2
# building the initial tokens, attention_mask, and position_ids
batch_size, context_length = seqs.shape
seqs, attention_mask, position_ids = get_masks_and_position_ids(seqs)
tokens = seqs[..., :context_length]
if attention_mask.dtype != torch.bool:
attention_mask = attention_mask.type_as(next(
model.parameters())) # if fp16
# initialize generation
counter = context_length - 1 # Last fixed index is ``counter''
index = 0 if mems is None else mems.shape[
2] # Next forward starting index, also the length of cache.
num_beams = 1
# step-by-step generation
while counter < seqs.shape[1] - 1:
# Now, we want to generate seq[counter + 1],
# token[:, index: counter+1] needs forwarding.
# forward
tokens = tokens.reshape(batch_size * num_beams, -1)
mems = mems.reshape(mems.shape[0], batch_size
* num_beams, mems.shape[-2],
mems.shape[-1]) if mems is not None else None
logits, *output_per_layers = model(
tokens[:, index:],
position_ids[..., index:counter + 1],
attention_mask[...,
index:counter + 1, :counter + 1], # TODO memlen
mems=mems,
**kw_args)
mem_kv = [o['mem_kv'] for o in output_per_layers]
mems = update_mems(mem_kv, mems, max_memory_length=max_memory_length)
if counter == context_length - 1:
logits = logits[torch.arange(batch_size), context_lengths - 1]
else:
logits = logits[:, -1]
counter += 1
index = counter
# sampling
logits = logits.reshape(batch_size, num_beams, -1)
tokens = tokens.reshape(batch_size, num_beams, -1)
mems = mems.reshape(mems.shape[0], batch_size, num_beams,
mems.shape[-2], mems.shape[-1])
tokens, mems = strategy.forward(logits, tokens, mems)
if len(tokens.shape) == 3 and num_beams == 1:
num_beams = tokens.shape[1]
position_ids = position_ids.unsqueeze(1).expand(
batch_size, num_beams, -1).reshape(batch_size * num_beams, -1)
attention_mask_shape = attention_mask.shape[-3:]
attention_mask = attention_mask.unsqueeze(1).expand(
batch_size, num_beams, -1, -1,
-1).reshape(batch_size * num_beams, *attention_mask_shape)
if strategy.is_done:
break
return strategy.finalize(tokens, mems)
def add_generation_specific_args(parser):
parser.add_argument(
'--sampling-strategy',
type=str,
default='BaseStrategy',
help='Type of sampling strategy.')
parser.add_argument(
'--min-gen-length',
type=int,
default=0,
help='The minimum length each blank should generate.')
parser.add_argument(
'--print-all-beams',
action='store_true',
help='Print all output generated by beam search strategy.')
def isEnglish(s):
try:
s.encode(encoding='utf-8').decode('ascii')
except UnicodeDecodeError:
return False
else:
return True
def get_masks_and_position_ids(seq,
mask_position,
max_gen_length,
gmask=False):
context_length = seq.shape[1]
tokens = torch.nn.functional.pad(
seq, (0, max_gen_length), mode='constant', value=-1)
attention_mask = torch.ones((1, tokens.shape[-1], tokens.shape[-1]),
device=tokens.device)
attention_mask.tril_()
attention_mask[..., :context_length - 1] = 1
attention_mask.unsqueeze_(1)
attention_mask = (attention_mask < 0.5).bool()
position_ids = torch.arange(
tokens.shape[-1], dtype=torch.long, device=tokens.device)
if not gmask:
position_ids[context_length - 1:] = mask_position
position_ids = position_ids.unsqueeze(0)
return tokens, attention_mask, position_ids
def fill_blanks(args, raw_text: str, model, tokenizer,
strategy) -> Tuple[List[str], List[str], List[List[str]]]:
# add MASK
generation_mask = '[gMASK]'
if '[MASK]' in raw_text:
generation_mask = '[MASK]'
elif '[sMASK]' in raw_text:
generation_mask = '[sMASK]'
use_gmask = '[MASK]' not in raw_text and '[sMASK]' not in raw_text
mask_pattern = r'\[[sg]?MASK\]'
text_list = re.split(mask_pattern, raw_text)
pattern_list = re.compile(mask_pattern).findall(raw_text)
seq = []
for i in range(len(pattern_list)):
pattern = pattern_list[i]
sub_text = text_list[i]
seq.extend(tokenizer.tokenize(sub_text))
seq.append(tokenizer.get_command(pattern))
seq.extend(tokenizer.tokenize(text_list[-1]))
if 'MASK]' not in raw_text:
seq += [tokenizer.get_command(generation_mask)]
raw_text += ' ' + generation_mask
if not raw_text.endswith('MASK]'):
seq = seq + [tokenizer.get_command('eos')]
if mpu.get_model_parallel_rank() == 0:
logger.info('\nInput: {}\n'.format(raw_text))
if len(seq) > args.max_sequence_length:
raise ValueError('text too long.')
# generation
is_english = isEnglish(raw_text)
output_list = [seq]
num_output = args.num_beams if args.sampling_strategy == 'BeamSearchStrategy' else 1
last_pos, answers, answers_with_style, blanks = (
[0] * num_output,
['' for _ in range(num_output)],
['' for _ in range(num_output)],
[[] for _ in range(num_output)],
)
# continually detect the first mark position
while True:
seq = output_list[0]
# detect mask position
mask_token = tokenizer.get_command(generation_mask)
if mask_token not in seq:
break
mask_position = seq.index(mask_token)
output_list = []
input_seq = torch.cuda.LongTensor(
[seq + [tokenizer.get_command('sop')]],
device=args.device,
)
output, _ = batch_filling_sequence(
model,
input_seq,
torch.cuda.LongTensor([input_seq.shape[-1]], device=args.device),
strategy=strategy,
get_masks_and_position_ids=partial(
get_masks_and_position_ids,
mask_position=mask_position,
max_gen_length=args.out_seq_length - input_seq.shape[-1],
gmask=use_gmask,
),
)
if isinstance(output, torch.Tensor): # different strategies
output = output.tolist()
output = output[0] # batch_size = 1
output_list.extend(output)
# clip -1s and fill back generated things into seq
for i in range(len(output_list)):
output = output_list[i].tolist() if isinstance(
output_list[i], torch.Tensor) else output_list[i]
try:
unfinished = output.index(-1)
except ValueError:
unfinished = len(output)
if output[unfinished - 1] in strategy.end_tokens:
unfinished -= 1
bog = output.index(tokenizer.get_command('sop'))
prefix = tokenizer.detokenize(output[last_pos[i]:mask_position])
blank = tokenizer.detokenize(output[bog + 1:unfinished])
answers_with_style[i] += (
prefix + (' ' if is_english else '') + # noqa
('\033[4m' if use_gmask else '\x1b[0;32m\033[4m') + blank
+ # noqa
('\033[0m' if use_gmask else '\033[0m\x1b[0m') + # noqa
(' ' if is_english else '')) # noqa
blanks[i].append(blank)
last_pos[i] = mask_position + unfinished - (bog + 1)
output_list[i] = output[:mask_position] + output[
bog + 1:unfinished] + output[mask_position + 1:bog]
for i, output in enumerate(output_list):
if output[-1] == tokenizer.get_command('eos'):
output = output[:-1]
answers_with_style[i] += tokenizer.detokenize(output[last_pos[i]:])
answers[i] = tokenizer.detokenize(output)
return answers, answers_with_style, blanks
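# A minimal usage sketch (the prompt is illustrative; args, model, tokenizer and
# strategy come from the initialization code in GLM130bForTextGeneration below):
#     answers, styled, blanks = fill_blanks(
#         args, 'Tsinghua University is located in [MASK].',
#         model, tokenizer, strategy)
#     print(answers[0])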
@MODELS.register_module(Tasks.text_generation, module_name=Models.glm130b)
class GLM130bForTextGeneration(TorchModel):
def __init__(self, model_dir: str, *args, **kwargs):
# """initialize the glm130b model from the `model_dir` path.
# Args:
# model_dir (str): the model path.
# """
super().__init__(model_dir, *args, **kwargs)
self.cfg = Config.from_file(model_dir + '/' + ModelFile.CONFIGURATION)
args = initialize(extra_args_provider=add_generation_specific_args)
args.seed = random.randint(1, sys.maxsize - 1)
args.sampling_strategy = self.cfg.model.sampling_strategy
args.out_seq_length = self.cfg.model.out_seq_length
args.min_gen_length = self.cfg.model.min_gen_length
args.num_beams = self.cfg.model.num_beams
args.length_penalty = self.cfg.model.length_penalty
args.no_repeat_ngram_size = self.cfg.model.no_repeat_ngram_size
args.temperature = self.cfg.model.temperature
args.top_k = self.cfg.model.top_k
args.top_p = self.cfg.model.top_p
args.load = model_dir
logger.info('Loading model and tokenizer ...')
self.model, self.tokenizer = initialize_model_and_tokenizer(args)
end_tokens = [
self.tokenizer.get_command('eop'),
self.tokenizer.get_command('eos')
]
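# Generation terminates on either 'eop' (end of piece) or 'eos', whichever the
# chosen strategy produces first.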
if args.sampling_strategy == 'BaseStrategy':
self.strategy = BaseStrategy(
batch_size=1,
temperature=args.temperature,
top_k=args.top_k,
top_p=args.top_p,
end_tokens=end_tokens)
elif args.sampling_strategy == 'BeamSearchStrategy':
self.strategy = BeamSearchStrategy(
1,
args.num_beams,
length_penalty=args.length_penalty,
consider_end=True,
end_tokens=end_tokens,
no_repeat_ngram_size=args.no_repeat_ngram_size,
min_gen_length=args.min_gen_length,
)
else:
raise ValueError(f'unknown strategy {args.sampling_strategy}')
self.args = args
def func(self, raw_text):
answers, answers_with_style, blanks = fill_blanks(
self.args, raw_text, self.model, self.tokenizer, self.strategy)
if mpu.get_model_parallel_rank() == 0:
logger.info('Output:' + str(answers_with_style[0]))
return str(answers_with_style[0])
def forward(self, input: str) -> Dict[str, str]:
raw_text, is_stop = '', False
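# Only rank 0 receives the user input; it is broadcast below so that every
# model-parallel rank runs the same generation step.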
if torch.distributed.get_rank() == 0:
raw_text = input
if not raw_text:
return {OutputKeys.TEXT: 'Query should not be empty!'}
if raw_text == 'stop':
is_stop = True
torch.distributed.broadcast_object_list([raw_text, is_stop])
else:
info = [raw_text, is_stop]
torch.distributed.broadcast_object_list(info)
raw_text, is_stop = info
if is_stop:
return
try:
start_time = time.time()
res = self.func(raw_text)
if torch.distributed.get_rank() == 0:
logger.info('\nTime taken: {:.2f}s\n'.format(time.time() - start_time))
except (ValueError, FileNotFoundError) as e:
return {OutputKeys.TEXT: str(e)}
logger.info('Generation finished.')
return {OutputKeys.TEXT: res}

@@ -35,6 +35,7 @@ if TYPE_CHECKING:
from .mglm_text_summarization_pipeline import MGLMTextSummarizationPipeline
from .codegeex_code_translation_pipeline import CodeGeeXCodeTranslationPipeline
from .codegeex_code_generation_pipeline import CodeGeeXCodeGenerationPipeline
from .glm130b_text_generation_pipeline import GLM130bTextGenerationPipeline
from .translation_evaluation_pipeline import TranslationEvaluationPipeline
from .user_satisfaction_estimation_pipeline import UserSatisfactionEstimationPipeline
from .siamese_uie_pipeline import SiameseUiePipeline
@@ -89,6 +90,7 @@ else:
['CodeGeeXCodeTranslationPipeline'],
'codegeex_code_generation_pipeline':
['CodeGeeXCodeGenerationPipeline'],
'glm130b_text_generation_pipeline': ['GLM130bTextGenerationPipeline'],
'translation_evaluation_pipeline': ['TranslationEvaluationPipeline'],
'user_satisfaction_estimation_pipeline':
['UserSatisfactionEstimationPipeline'],

@@ -0,0 +1,29 @@
# Copyright (c) 2022 Zhipu.AI
from typing import Any, Dict, Union
from modelscope.metainfo import Pipelines
from modelscope.models.nlp import GLM130bForTextGeneration
from modelscope.pipelines.base import Pipeline
from modelscope.pipelines.builder import PIPELINES
from modelscope.utils.constant import Tasks
@PIPELINES.register_module(
group_key=Tasks.text_generation,
module_name=Pipelines.glm130b_text_generation)
class GLM130bTextGenerationPipeline(Pipeline):
def __init__(self, model: Union[GLM130bForTextGeneration, str], *args,
**kwargs):
model = GLM130bForTextGeneration(model) if isinstance(model,
str) else model
self.model = model
def __call__(self, input: str, **forward_params) -> Dict[str, Any]:
return self.model(input)
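# A minimal usage sketch (the model id is a placeholder, resolved through the
# standard pipeline factory):
#     from modelscope.pipelines import pipeline
#     pipe = pipeline(Tasks.text_generation, model='<glm130b-model-id>')
#     print(pipe('Who is the greatest artist? The greatest artist is [MASK].'))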
def postprocess(self, input, **kwargs) -> Dict[str, Any]:
"""This method will not be called.
"""
return input

@@ -34,7 +34,7 @@ if TYPE_CHECKING:
from .siamese_uie_preprocessor import SiameseUiePreprocessor
from .document_grounded_dialog_generate_preprocessor import DocumentGroundedDialogGeneratePreprocessor
from .document_grounded_dialog_retrieval_preprocessor import DocumentGroundedDialogRetrievalPreprocessor
from .document_grounded_dialog_retrieval_preprocessor import DocumentGroundedDialogRerankPreprocessor
from .document_grounded_dialog_rerank_preprocessor import DocumentGroundedDialogRerankPreprocessor
else:
_import_structure = {
'bert_seq_cls_tokenizer': ['Tokenize'],

@@ -1,9 +1,12 @@
# Copyright (c) Alibaba, Inc. and its affiliates.
# This file is adapted from the AllenNLP library at https://github.com/allenai/allennlp
# Part of the implementation is borrowed from wimglenn/johnnydep
import copy
import importlib
import os
import pkgutil
import shutil
import sys
import venv
from contextlib import contextmanager
@@ -263,14 +266,22 @@ def install_module_from_requirements(requirement_path, ):
"""
install_args = ['-r', requirement_path]
status_code, _, args = PluginsManager.pip_command(
'install',
install_args,
)
if status_code != 0:
raise ImportError(
f'Failed to install requirements from {requirement_path}')
install_list = []
with open(requirement_path, 'r', encoding='utf-8') as f:
requirements = f.read().splitlines()
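# Only hand pip the requirements that are not already satisfied, instead of
# passing the whole requirements file unconditionally.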
for req in requirements:
installed, _ = PluginsManager.check_plugin_installed(req)
if not installed:
install_list.append(req)
if len(install_list) > 0:
status_code, _, args = PluginsManager.pip_command(
'install',
install_list,
)
if status_code != 0:
raise ImportError(
f'Failed to install requirements from {requirement_path}')
def import_module_from_file(module_name, file_path):
@@ -298,18 +309,6 @@ def import_module_from_model_dir(model_dir):
import_module_from_file(module_name, file)
def install_modelscope_if_need():
plugin_installed, version = PluginsManager.check_plugin_installed(
'modelscope')
if not plugin_installed:
status_code, _, args = PluginsManager.pip_command(
'install',
['modelscope'],
)
if status_code != 0:
raise ImportError('Failed to install package modelscope')
def install_requirements_by_names(plugins: List[str]):
plugins_manager = PluginsManager()
uninstalled_plugins = []
@@ -324,20 +323,21 @@ def install_requirements_by_names(plugins: List[str]):
f'The required packages {",".join(uninstalled_plugins)} are not installed.',
f'Please run the command `modelscope plugin install {" ".join(uninstalled_plugins)}` to install them.'
)
install_modelscope_if_need()
def install_requirements_by_files(requirements: List[str]):
for requirement in requirements:
install_module_from_requirements(requirement)
install_modelscope_if_need()
def register_plugins_repo(plugins: List[str]) -> None:
""" Try to install and import plugins from repo"""
if plugins is not None:
install_requirements_by_names(plugins)
import_plugins(plugins)
modules = []
for plugin in plugins:
modules.extend(get_modules_from_package(plugin))
import_plugins(modules)
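# e.g. register_plugins_repo(['adaseq']) installs the distribution if it is
# missing and then imports the top-level modules its wheel actually exposes
# (the import name can differ from the pip name, as with 'pai-easycv' -> 'easycv').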
def register_modelhub_repo(model_dir, allow_remote=False) -> None:
@@ -351,6 +351,256 @@ def register_modelhub_repo(model_dir, allow_remote=False) -> None:
pass
DEFAULT_INDEX = 'https://pypi.org/simple/'
def get_modules_from_package(package):
""" to get the modules from a installed package
Args:
package: The distribution name or package name
Returns:
"""
from zipfile import ZipFile
from tempfile import mkdtemp
from subprocess import CalledProcessError, check_output, STDOUT
from glob import glob
import hashlib
from urllib.parse import urlparse
from urllib import request as urllib2
from pip._internal.utils.packaging import get_requirement
req = get_requirement(package)
package = req.name
def urlretrieve(url, filename, data=None, auth=None):
if auth is not None:
# https://docs.python.org/2.7/howto/urllib2.html#id6
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
# Add the username and password.
# If we knew the realm, we could use it instead of None.
username, password = auth
top_level_url = urlparse(url).netloc
password_mgr.add_password(None, top_level_url, username, password)
handler = urllib2.HTTPBasicAuthHandler(password_mgr)
# create "opener" (OpenerDirector instance)
opener = urllib2.build_opener(handler)
else:
opener = urllib2.build_opener()
res = opener.open(url, data=data)
headers = res.info()
with open(filename, 'wb') as fp:
fp.write(res.read())
return filename, headers
def compute_checksum(target, algorithm='sha256', blocksize=2**13):
hashtype = getattr(hashlib, algorithm)
hash_ = hashtype()
logger.debug('computing checksum for %s with %s', target, algorithm)
with open(target, 'rb') as f:
for chunk in iter(lambda: f.read(blocksize), b''):
hash_.update(chunk)
result = hash_.hexdigest()
logger.debug('computed checksum: %s', result)
return result
def _get_pip_version():
# try to get pip version without actually importing pip
# setuptools gets upset if you import pip before importing setuptools..
try:
import importlib.metadata # Python 3.8+
return importlib.metadata.version('pip')
except Exception:
pass
import pip
return pip.__version__
def _download_dist(url, scratch_file, index_url, extra_index_url):
auth = None
if index_url:
parsed = urlparse(index_url)
if parsed.username and parsed.password and parsed.hostname == urlparse(
url).hostname:
# handling private PyPI credentials in index_url
auth = (parsed.username, parsed.password)
if extra_index_url:
parsed = urlparse(extra_index_url)
if parsed.username and parsed.password and parsed.hostname == urlparse(
url).hostname:
# handling private PyPI credentials in extra_index_url
auth = (parsed.username, parsed.password)
target, _headers = urlretrieve(url, scratch_file, auth=auth)
return target, _headers
def _get_wheel_args(index_url, env, extra_index_url):
args = [
sys.executable,
'-m',
'pip',
'wheel',
'-vvv', # --verbose x3
'--no-deps',
'--no-cache-dir',
'--disable-pip-version-check',
]
if index_url is not None:
args += ['--index-url', index_url]
if index_url != DEFAULT_INDEX:
hostname = urlparse(index_url).hostname
if hostname:
args += ['--trusted-host', hostname]
if extra_index_url is not None:
args += [
'--extra-index-url', extra_index_url, '--trusted-host',
urlparse(extra_index_url).hostname
]
if env is None:
pip_version = _get_pip_version()
else:
pip_version = dict(env)['pip_version']
args[0] = dict(env)['python_executable']
pip_major, pip_minor = pip_version.split('.')[0:2]
pip_major = int(pip_major)
pip_minor = int(pip_minor)
if pip_major >= 10:
args.append('--progress-bar=off')
if (20, 3) <= (pip_major, pip_minor) < (21, 1):
# See https://github.com/pypa/pip/issues/9139#issuecomment-735443177
args.append('--use-deprecated=legacy-resolver')
return args
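# With no index overrides the assembled command is roughly (interpreter path
# illustrative):
#   /usr/bin/python -m pip wheel -vvv --no-deps --no-cache-dir \
#       --disable-pip-version-check --progress-bar=off
# get() appends the requested distribution name to this list.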
def get(dist_name,
index_url=None,
env=None,
extra_index_url=None,
tmpdir=None,
ignore_errors=False):
args = _get_wheel_args(index_url, env, extra_index_url) + [dist_name]
scratch_dir = mkdtemp(dir=tmpdir)
logger.debug('wheeling and dealing in %s: %s',
os.path.abspath(scratch_dir), ' '.join(args))
try:
out = check_output(
args, stderr=STDOUT, cwd=scratch_dir).decode('utf-8')
except CalledProcessError as err:  # raised by check_output when pip wheel fails
out = getattr(err, 'output', b'').decode('utf-8')
logger.warning(out)
if not ignore_errors:
raise
logger.debug('wheel command completed ok for %s', dist_name)
links = []
local_links = []
lines = out.splitlines()
for i, line in enumerate(lines):
line = line.strip()
if line.startswith('Downloading from URL '):
parts = line.split()
link = parts[3]
links.append(link)
elif line.startswith('Downloading '):
parts = line.split()
last = parts[-1]
if len(parts) == 3 and last.startswith('(') and last.endswith(
')'):
link = parts[-2]
elif len(parts) == 4 and parts[-2].startswith(
'(') and last.endswith(')'):
link = parts[-3]
if not urlparse(link).scheme:
# newest pip versions have changed to not log the full url
# in the download event. it is becoming more and more annoying
# to preserve compatibility across a wide range of pip versions
next_line = lines[i + 1].strip()
if next_line.startswith(
'Added ') and ' to build tracker' in next_line:
link = next_line.split(
' to build tracker')[0].split()[-1]
else:
link = last
links.append(link)
elif line.startswith(
'Source in ') and 'which satisfies requirement' in line:
link = line.split()[-1]
links.append(link)
elif line.startswith('Added ') and ' from file://' in line:
[link] = [x for x in line.split() if x.startswith('file://')]
local_links.append(link)
if not links:
# prefer http scheme over file
links += local_links
links = list(dict.fromkeys(links)) # order-preserving dedupe
if not links:
logger.warning('could not find download link in pip output:\n%s', out)
raise Exception('failed to collect dist')
if len(links) == 2:
# sometimes we collect the same link, once with a url fragment/checksum and once without
first, second = links
if first.startswith(second):
del links[1]
elif second.startswith(first):
del links[0]
if len(links) > 1:
logger.debug('more than 1 link collected: %s\n%s', links, out)
# Since PEP 517, maybe an sdist will also need to collect other distributions
# for the build system, even with --no-deps specified. pendulum==1.4.4 is one
# example, which uses poetry and doesn't publish any python37 wheel to PyPI.
# However, the dist itself should still be the first one downloaded.
link = links[0]
whls = glob(os.path.join(os.path.abspath(scratch_dir), '*.whl'))
try:
[whl] = whls
except ValueError:
if ignore_errors:
whl = ''
else:
raise
url, _sep, checksum = link.partition('#')
url = url.replace(
'/%2Bf/', '/+f/'
) # some versions of pip did not unquote this fragment in the log
if not checksum.startswith('md5=') and not checksum.startswith(
'sha256='):
# PyPI gives you the checksum in url fragment, as a convenience. But not all indices are so kind.
algorithm = 'md5'
if os.path.basename(whl).lower() == url.rsplit('/', 1)[-1].lower():
target = whl
else:
scratch_file = os.path.join(scratch_dir, os.path.basename(url))
target, _headers = _download_dist(url, scratch_file, index_url,
extra_index_url)
checksum = compute_checksum(target=target, algorithm=algorithm)
checksum = '='.join([algorithm, checksum])
result = {'path': whl, 'url': url, 'checksum': checksum}
return result
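# The returned mapping looks like (values illustrative):
#   {'path': '/tmp/.../<name>-<version>-py3-none-any.whl',
#    'url': 'https://files.pythonhosted.org/.../<name>-<version>-py3-none-any.whl',
#    'checksum': 'sha256=<hex digest>'}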
def discover_import_names(whl_file):
logger.debug('finding import names')
zipfile = ZipFile(file=whl_file)
namelist = zipfile.namelist()
[top_level_fname
] = [x for x in namelist if x.endswith('top_level.txt')]
all_names = zipfile.read(top_level_fname).decode(
'utf-8').strip().splitlines()
public_names = [n for n in all_names if not n.startswith('_')]
return public_names
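# top_level.txt lists the wheel's importable top-level packages, which is what
# import_plugins() ultimately needs rather than the distribution name.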
tmpdir = mkdtemp()
data = get(package, tmpdir=tmpdir)
import_names = discover_import_names(data['path'])
shutil.rmtree(tmpdir)
return import_names
class PluginsManager(object):
def __init__(self,
@@ -370,11 +620,31 @@ class PluginsManager(object):
@staticmethod
def check_plugin_installed(package):
""" Check if the plugin is installed, and if the version is valid
Args:
package: the package name need to be installed
Returns:
"""
from pip._internal.utils.packaging import get_requirement, specifiers
req = get_requirement(package)
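# `package` may carry a version specifier (e.g. 'adaseq>=0.6.0', illustrative);
# req.name is the bare distribution name used as the key into the working set.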
try:
importlib.reload(pkg_resources)
package_meta_info = pkg_resources.working_set.by_key[package]
package_meta_info = pkg_resources.working_set.by_key[req.name]
version = package_meta_info.version
# To test if the package is installed
installed = True
# If installed, test if the version is correct
for spec in req.specifier:
installed_valid_version = spec.contains(version)
if not installed_valid_version:
installed = False
break
except KeyError:
version = ''
installed = False
@@ -402,6 +672,10 @@ class PluginsManager(object):
options, args = command.parse_args(command_args)
status_code = command.main(command_args)
# reload pkg_resources so that the latest package information is picked up
importlib.reload(pkg_resources)
return status_code, options, args
def install_plugins(self,
@@ -722,3 +996,4 @@ class EnvsManager(object):
if __name__ == '__main__':
install_requirements_by_files(['adaseq'])
import_name = get_modules_from_package('pai-easycv')

@@ -0,0 +1 @@
{"framework":"pytorch","task":"bilibili","model":{"type":"my-custom-model","scale":2,"weight_path":"weights_v3/up2x-latest-denoise3x.pth","half":true},"pipeline":{"type":"my-custom-pipeline"}}

@@ -15,8 +15,6 @@ class PluginModelTest(unittest.TestCase, DemoCompatibilityCheck):
def tearDown(self):
# make sure uninstalled after installing
uninstall_args = [self.package, '-y']
PluginsManager.pip_command('uninstall', uninstall_args)
super().tearDown()
import subprocess
result = subprocess.run(