update

2025-12-16 11:57:58 +01:00 · 2023-03-23 23:19:33 +08:00
parent 5c1b4bc63d
commit b710c8a7d1
3 changed files with 180 additions and 12 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1,143 @@
 # JetBrains PyCharm IDE
 .idea/
 .github/
 .circleci/
 # Byte-compiled / optimized / DLL files
 __pycache__/
 *.py[cod]
 *$py.class
 # C extensions
 *.so
 # macOS dir files
 .DS_Store
 # Distribution / packaging
 .Python
 env/
 build/
 develop-eggs/
 dist/
 downloads/
 eggs/
 .eggs/
 lib/
 lib64/
 parts/
 sdist/
 var/
 wheels/
 *.egg-info/
 .installed.cfg
 *.egg
 # Checkpoints
 checkpoints
 # PyInstaller
 #  Usually these files are written by a python script from a template
 #  before PyInstaller builds the exe, so as to inject date/other infos into it.
 *.manifest
 *.spec
 # Installer logs
 pip-log.txt
 pip-delete-this-directory.txt
 # Unit test / coverage reports
 htmlcov/
 .tox/
 .coverage
 .coverage.*
 .cache
 nosetests.xml
 coverage.xml
 *.cover
 .hypothesis/
 # Translations
 *.mo
 *.pot
 # Django stuff:
 *.log
 local_settings.py
 # Flask stuff:
 instance/
 .webassets-cache
 # Scrapy stuff:
 .scrapy
 # Sphinx documentation
 docs/_build/
 # PyBuilder
 target/
 # Jupyter Notebook
 .ipynb_checkpoints
 # pyenv
 .python-version
 # celery beat schedule file
 celerybeat-schedule
 # SageMath parsed files
 *.sage.py
 # dotenv
 .env
 # virtualenv
 .venv
 venv/
 ENV/
 # Spyder project settings
 .spyderproject
 .spyproject
 # Rope project settings
 .ropeproject
 # mkdocs documentation
 /site
 # mypy
 .mypy_cache/
 # Generated files
 /fairseq/temporal_convolution_tbc
 /fairseq/modules/*_layer/*_forward.cu
 /fairseq/modules/*_layer/*_backward.cu
 /fairseq/version.py
 # data
 data-bin/
 # reranking
 /examples/reranking/rerank_data
 # Cython-generated C++ source files
 /fairseq/data/data_utils_fast.cpp
 /fairseq/data/token_block_utils_fast.cpp
 # VSCODE
 .vscode/ftp-sync.json
 .vscode/settings.json
 # Experimental Folder
 experimental/*
 # Weights and Biases logs
 wandb/
 # Hydra artifacts
 nohup.out
 multirun
 outputs
--- a/README.md
+++ b/README.md
@@ -1,13 +1,38 @@
 ---
 title: Make An Audio
 emoji: 😻
 colorFrom: green
 colorTo: indigo
 sdk: gradio
 sdk_version: 3.17.0
 app_file: app.py
 pinned: false
 ---
 Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 # AudioGPT
 **AudioGPT** connects ChatGPT with a series of audio foundation models so that users can **send** and **receive** speech, singing, and audio while chatting.
 ## Capability
 The table below lists AudioGPT's current capabilities. More supported models and tasks are coming soon.
 |           Task            |         Foundation Model         | Status |
 |:-------------------------:|:--------------------------------:|:------:|
 | ----------Speech--------- |                /                 |   /    |
 |      Text-to-Speech       | [FastSpeech](), [SyntaSpeech]()  |  WIP   |
 |      Neural Vocoding      |    [BigVGAN](), [FastDiff]()     |  WIP   |
 |      Style Transfer       |         [GenerSpeech]()          |  WIP   |
 |    Speech Recognition     |           [Whisper]()            |  Yes   |
 |  ----------Sing---------  |                /                 |   /    |
 |       Text-to-Sing        |          [DiffSinger]()          |  Yes   |
 | ----------Audio---------  |                /                 |   /    |
 |       Text-to-Audio       |        [Make-An-Audio]()         |  Yes   |
 |     Audio Inpainting      |        [Make-An-Audio]()         |  WIP   |
 |      Image-to-Audio       |        [Make-An-Audio]()         |  Yes   |
 ## Internal Version Updates
 3.23 Support Text-to-Sing\
 3.21 Support Image-to-Audio\
 3.19 Support Speech Recognition\
 3.17 Support Text-to-Audio
 ## Acknowledgement
 We thank the following open-source projects:
 [Visual ChatGPT](https://github.com/microsoft/visual-chatgpt) &#8194;
 [Hugging Face](https://github.com/huggingface) &#8194;
 [LangChain](https://github.com/hwchase17/langchain) &#8194;
 [Stable Diffusion](https://github.com/CompVis/stable-diffusion) &#8194;
--- a/assets/7ef0ec0b.wav
+++ b/assets/7ef0ec0b.wav