Mirror of https://github.com/AIGC-Audio/AudioGPT.git, synced 2025-12-16 11:57:58 +01:00, revision 84ff577c8520182fcd8cd352a1f62b5f9efb236f
AudioGPT
AudioGPT connects ChatGPT with a series of audio foundation models so that speech, singing, and general audio can be sent and received during a chat.
Capability
The table below lists the current capabilities of AudioGPT. Support for more models and tasks is coming soon.
| Task | Foundation Model | Status |
|---|---|---|
| ----------Speech--------- | / | / |
| Text-to-Speech | FastSpeech, SyntaSpeech | WIP |
| Neural Vocoding | BigVGAN, FastDiff | WIP |
| Style Transfer | GenerSpeech | WIP |
| Speech Recognition | Whisper | Yes |
| ----------Sing--------- | / | / |
| Text-to-Sing | DiffSinger | Yes |
| ----------Audio--------- | / | / |
| Text-to-Audio | Make-An-Audio | Yes |
| Audio Inpainting | Make-An-Audio | WIP |
| Image-to-Audio | Make-An-Audio | Yes |
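The capability table can also be read as a small lookup structure. The sketch below is only an illustration of that reading, not code from the AudioGPT repository: the names `Capability`, `CAPABILITIES`, and `ready_tasks` are hypothetical, and the entries are copied from the table above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Capability:
    """One row of the capability table (hypothetical structure)."""
    task: str
    models: Tuple[str, ...]
    ready: bool  # True for "Yes", False for "WIP"


# Rows copied from the table above, grouped by its section rows.
CAPABILITIES: Dict[str, List[Capability]] = {
    "Speech": [
        Capability("Text-to-Speech", ("FastSpeech", "SyntaSpeech"), False),
        Capability("Neural Vocoding", ("BigVGAN", "FastDiff"), False),
        Capability("Style Transfer", ("GenerSpeech",), False),
        Capability("Speech Recognition", ("Whisper",), True),
    ],
    "Sing": [
        Capability("Text-to-Sing", ("DiffSinger",), True),
    ],
    "Audio": [
        Capability("Text-to-Audio", ("Make-An-Audio",), True),
        Capability("Audio Inpainting", ("Make-An-Audio",), False),
        Capability("Image-to-Audio", ("Make-An-Audio",), True),
    ],
}


def ready_tasks(domain: Optional[str] = None) -> List[str]:
    """Return the tasks marked "Yes", optionally limited to one section."""
    domains = [domain] if domain is not None else list(CAPABILITIES)
    return [
        cap.task
        for name in domains
        for cap in CAPABILITIES[name]
        if cap.ready
    ]


if __name__ == "__main__":
    print(ready_tasks())
    print(ready_tasks("Audio"))
```

Keeping the status as a boolean makes it easy to filter out the WIP rows; when a task moves from WIP to Yes, only the corresponding entry needs to change.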
Internal Version Updates
3.23 Support Text-to-Sing
3.21 Support Image-to-Audio
3.19 Support Speech Recognition
3.17 Support Text-to-Audio
Acknowledgement
We are grateful to the following open-source projects:
Description
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
https://huggingface.co/spaces/AIGC-Audio/AudioGPT