Rongjiehuang
2023-03-23 23:19:33 +08:00
parent 5c1b4bc63d
commit b710c8a7d1
3 changed files with 180 additions and 12 deletions

.gitignore (new executable file, 143 lines)

@@ -0,0 +1,143 @@
# JetBrains PyCharm IDE
.idea/
.github/
.circleci/
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# macOS dir files
.DS_Store
# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
# Checkpoints
checkpoints
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
target/
# Jupyter Notebook
.ipynb_checkpoints
# pyenv
.python-version
# celery beat schedule file
celerybeat-schedule
# SageMath parsed files
*.sage.py
# dotenv
.env
# virtualenv
.venv
venv/
ENV/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
# Generated files
/fairseq/temporal_convolution_tbc
/fairseq/modules/*_layer/*_forward.cu
/fairseq/modules/*_layer/*_backward.cu
/fairseq/version.py
# data
data-bin/
# reranking
/examples/reranking/rerank_data
# Cython-generated C++ source files
/fairseq/data/data_utils_fast.cpp
/fairseq/data/token_block_utils_fast.cpp
# VSCODE
.vscode/ftp-sync.json
.vscode/settings.json
# Experimental Folder
experimental/*
# Weights and Biases logs
wandb/
# Hydra artifacts
nohup.out
multirun
outputs
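
As a rough illustration of how the glob-style entries above behave, the sketch below checks a few bare file names against three of the patterns using Python's `fnmatch` module. This is only an approximation: Git's own matching adds rules for slashes, directory-only entries, and anchoring that `fnmatch` does not model, and the helper name `matches_any` is hypothetical.

```python
from fnmatch import fnmatchcase

# Three patterns copied from the ignore list above.
PATTERNS = ["*.py[cod]", "*.so", "*.log"]


def matches_any(name, patterns):
    """Return True if a bare file name matches at least one glob pattern.

    Uses case-sensitive fnmatch semantics, which only approximate
    Git's ignore rules (no path or directory handling here).
    """
    return any(fnmatchcase(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    for name in ["model.pyc", "model.pyo", "model.py", "ops.so", "train.log"]:
        print(name, matches_any(name, PATTERNS))
```

The character class `[cod]` is what lets one entry cover `.pyc`, `.pyo`, and `.pyd` files while leaving plain `.py` sources tracked.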


@@ -1,13 +1,38 @@
---
title: Make An Audio
emoji: 😻
colorFrom: green
colorTo: indigo
sdk: gradio
sdk_version: 3.17.0
app_file: app.py
pinned: false
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# AudioGPT
**AudioGPT** connects ChatGPT with a series of audio foundation models to enable **sending** and **receiving** speech, singing, and audio during a chat.
## Capability
The table below lists AudioGPT's current capabilities. More supported models and tasks are coming soon.
| Task                      | Foundation Model                 | Status |
|:-------------------------:|:--------------------------------:|:------:|
| ----------Speech--------- | /                                | /      |
| Text-to-Speech            | [FastSpeech](), [SyntaSpeech]()  | WIP    |
| Neural Vocoding           | [BigVGAN](), [FastDiff]()        | WIP    |
| Style Transfer            | [GenerSpeech]()                  | WIP    |
| Speech Recognition        | [whisper]()                      | Yes    |
| ----------Sing----------- | /                                | /      |
| Text-to-Sing              | [DiffSinger]()                   | Yes    |
| ----------Audio---------- | /                                | /      |
| Text-to-Audio             | [Make-An-Audio]()                | Yes    |
| Audio Inpainting          | [Make-An-Audio]()                | WIP    |
| Image-to-Audio            | [Make-An-Audio]()                | Yes    |
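
For readers who want to work with this list programmatically, the table can be mirrored as a small lookup structure. The sketch below is illustrative only and is not taken from the AudioGPT code base: the `Capability` class, the `CAPABILITIES` list, and the `ready_tasks` helper are hypothetical names, and the entries simply restate the rows above (`ready` is true for "Yes" and false for "WIP").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """One row of the capability table (hypothetical structure)."""
    category: str   # "Speech", "Sing", or "Audio"
    task: str
    models: tuple   # foundation model names as listed in the table
    ready: bool     # True for "Yes", False for "WIP"


# Entries restate the table above; nothing here is read from AudioGPT itself.
CAPABILITIES = [
    Capability("Speech", "Text-to-Speech", ("FastSpeech", "SyntaSpeech"), False),
    Capability("Speech", "Neural Vocoding", ("BigVGAN", "FastDiff"), False),
    Capability("Speech", "Style Transfer", ("GenerSpeech",), False),
    Capability("Speech", "Speech Recognition", ("whisper",), True),
    Capability("Sing", "Text-to-Sing", ("DiffSinger",), True),
    Capability("Audio", "Text-to-Audio", ("Make-An-Audio",), True),
    Capability("Audio", "Audio Inpainting", ("Make-An-Audio",), False),
    Capability("Audio", "Image-to-Audio", ("Make-An-Audio",), True),
]


def ready_tasks(capabilities):
    """Return the task names whose status is "Yes", in table order."""
    return [c.task for c in capabilities if c.ready]


if __name__ == "__main__":
    for task in ready_tasks(CAPABILITIES):
        print(task)
```

Keeping the status as a boolean makes it easy to filter out work-in-progress tasks before offering them to a user.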
## Internal Version Updates
3.23 Support Text-to-Sing\
3.21 Support Image-to-Sing\
3.19 Support Speech Recognition\
3.17 Support Text-to-Audio
## Acknowledgement
We thank the following open-source projects:
[Visual ChatGPT](https://github.com/microsoft/visual-chatgpt)  
[Hugging Face](https://github.com/huggingface)  
[LangChain](https://github.com/hwchase17/langchain)  
[Stable Diffusion](https://github.com/CompVis/stable-diffusion)