mirror of
https://github.com/AIGC-Audio/AudioGPT.git
synced 2025-12-16 03:47:55 +01:00
update
.gitignore: 143 lines changed (vendored, executable file)
@@ -0,0 +1,143 @@
# JetBrains PyCharm IDE
.idea/
.github/
.circleci/

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# macOS dir files
.DS_Store

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# Checkpoints
checkpoints

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# dotenv
.env

# virtualenv
.venv
venv/
ENV/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/

# Generated files
/fairseq/temporal_convolution_tbc
/fairseq/modules/*_layer/*_forward.cu
/fairseq/modules/*_layer/*_backward.cu
/fairseq/version.py

# data
data-bin/

# reranking
/examples/reranking/rerank_data

# Cython-generated C++ source files
/fairseq/data/data_utils_fast.cpp
/fairseq/data/token_block_utils_fast.cpp

# VSCODE
.vscode/ftp-sync.json
.vscode/settings.json

# Experimental Folder
experimental/*

# Weights and Biases logs
wandb/

# Hydra artifacts
nohup.out
multirun
outputs
README.md: 49 lines changed
@@ -1,13 +1,38 @@
---
title: Make An Audio
emoji: 😻
colorFrom: green
colorTo: indigo
sdk: gradio
sdk_version: 3.17.0
app_file: app.py
pinned: false
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# AudioGPT

**AudioGPT** connects ChatGPT and a series of Audio Foundation Models to enable **sending** and **receiving** speech, singing, and audio during a chat.

## Capability

Here we list the capabilities of AudioGPT at this time. More supported models and tasks are coming soon.

| Task | Foundation Model | Status |
|:-------------------------:|:--------------------------------:|:------:|
| ----------Speech--------- | / | / |
| Text-to-Speech | [FastSpeech](), [SyntaSpeech]() | WIP |
| Neural Vocoding | [BigVGAN](), [FastDiff]() | WIP |
| Style Transfer | [GenerSpeech]() | WIP |
| Speech Recognition | [whisper]() | Yes |
| ----------Sing--------- | / | |
| Text-to-Sing | [DiffSinger]() | Yes |
| ----------Audio--------- | / | |
| Text-to-Audio | [Make-An-Audio]() | Yes |
| Audio Inpainting | [Make-An-Audio]() | WIP |
| Image-to-Audio | [Make-An-Audio]() | Yes |

## Internal Version Updates

3.23 Support Text-to-Sing\
3.21 Support Image-to-Sing\
3.19 Support Speech Recognition\
3.17 Support Text-to-Audio

## Acknowledgement
We appreciate the following open-source projects:

[Visual ChatGPT](https://github.com/microsoft/visual-chatgpt)  
[Hugging Face](https://github.com/huggingface)  
[LangChain](https://github.com/hwchase17/langchain)  
[Stable Diffusion](https://github.com/CompVis/stable-diffusion)  