Mirrors/AudioGPT


mirror of https://github.com/AIGC-Audio/AudioGPT.git synced 2025-12-16 03:47:55 +01:00


lmzjms 84a5493253 update

2023-04-11 22:56:35 +08:00

Name	Last commit message	Date
assets	update	2023-04-09 17:05:42 +08:00
audio_detection	fix e	2023-04-06 22:18:47 +08:00
audio_to_text	update	2023-03-28 23:30:18 +08:00
mono2binaural/src	detection and extraction	2023-04-06 00:11:23 +08:00
NeuralSeq	update	2023-04-09 17:02:38 +08:00
sound_extraction	detection and extraction	2023-04-06 00:11:23 +08:00
text_to_audio/MakeAnAudio	update	2023-03-27 19:54:59 +08:00
.gitignore	delect cache	2023-04-02 20:05:01 +07:00
audio-chatgpt.py	update	2023-04-11 22:56:35 +08:00
download.sh	update	2023-04-09 17:02:38 +08:00
LICENSE	update	2023-04-09 17:02:38 +08:00
README.md	update huggingface	2023-04-07 00:24:23 +08:00
requirements.txt	add enh / ss	2023-04-11 08:07:49 -04:00
run.md	Add files via upload	2023-03-24 13:43:06 +08:00

README.md

AudioGPT

AudioGPT connects ChatGPT with a series of audio foundation models so that users can send and receive speech, singing, audio, and talking-head content while chatting.

Capabilities

Up-to-date link: https://cdb7b543afd1c8e8.gradio.app

The tables below list AudioGPT's current capabilities. More supported models and tasks are coming soon. For prompt examples, refer to assets.

Speech

Task	Supported Foundation Models	Status
Text-to-Speech	FastSpeech, SyntaSpeech, VITS	Yes (WIP)
Style Transfer	GenerSpeech	Yes
Speech Recognition	whisper, Conformer	Yes
Speech Enhancement	ConvTasNet	WIP
Speech Separation	TF-GridNet	WIP
Speech Translation	Multi-decoder	WIP
Mono-to-Binaural	NeuralWarp	Yes

Sing

Task	Supported Foundation Models	Status
Text-to-Sing	DiffSinger, VISinger	Yes (WIP)

Audio

Task	Supported Foundation Models	Status
Text-to-Audio	Make-An-Audio	Yes
Audio Inpainting	Make-An-Audio	Yes
Image-to-Audio	Make-An-Audio	Yes
Sound Detection	Audio-transformer	Yes
Target Sound Detection	TSDNet	Yes
Sound Extraction	LASSNet	Yes

Talking Head

Task	Supported Foundation Models	Status
Talking Head Synthesis	GeneFace	Yes (WIP)
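
The capability tables above can also be read as a simple lookup from task to supported models and status. The following is a minimal sketch of that idea, assuming a plain dictionary registry; `CAPABILITIES`, `models_for`, and `tasks_with_status` are hypothetical names chosen for illustration and are not taken from the AudioGPT code. The entries are copied from the tables.

```python
# Hypothetical registry built from the capability tables above.
# The structure and all names here are illustrative assumptions,
# not part of the AudioGPT repository.
CAPABILITIES = {
    "Speech": {
        "Text-to-Speech": (["FastSpeech", "SyntaSpeech", "VITS"], "Yes (WIP)"),
        "Style Transfer": (["GenerSpeech"], "Yes"),
        "Speech Recognition": (["whisper", "Conformer"], "Yes"),
        "Speech Enhancement": (["ConvTasNet"], "WIP"),
        "Speech Separation": (["TF-GridNet"], "WIP"),
        "Speech Translation": (["Multi-decoder"], "WIP"),
        "Mono-to-Binaural": (["NeuralWarp"], "Yes"),
    },
    "Sing": {
        "Text-to-Sing": (["DiffSinger", "VISinger"], "Yes (WIP)"),
    },
    "Audio": {
        "Text-to-Audio": (["Make-An-Audio"], "Yes"),
        "Audio Inpainting": (["Make-An-Audio"], "Yes"),
        "Image-to-Audio": (["Make-An-Audio"], "Yes"),
        "Sound Detection": (["Audio-transformer"], "Yes"),
        "Target Sound Detection": (["TSDNet"], "Yes"),
        "Sound Extraction": (["LASSNet"], "Yes"),
    },
    "Talking Head": {
        "Talking Head Synthesis": (["GeneFace"], "Yes (WIP)"),
    },
}


def models_for(task):
    """Return the list of supported models for a task name from the tables."""
    for tasks in CAPABILITIES.values():
        if task in tasks:
            return tasks[task][0]
    raise KeyError(f"unknown task: {task}")


def tasks_with_status(status):
    """Return all task names whose Status column equals `status`, sorted by name."""
    return sorted(
        task
        for tasks in CAPABILITIES.values()
        for task, (_models, task_status) in tasks.items()
        if task_status == status
    )
```

For example, `models_for("Sound Extraction")` looks up the Audio table, and `tasks_with_status("WIP")` collects the tasks that are still in progress.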

Internal Version Updates

4.6 Support Sound Extraction/Detection
4.3 Support huggingface demo space
4.1 Support Audio Inpainting and clean up code
3.27 Support Style Transfer/Talking head Synthesis
3.23 Support Text-to-Sing
3.21 Support Image-to-Audio
3.19 Support Speech Recognition
3.17 Support Text-to-Audio

Todo

clean text to sing/speech code
merge talking head synthesis into main
change audio/video log output
support huggingface space

Acknowledgement

We appreciate the following open-source projects:

Visual ChatGPT, Hugging Face, LangChain, Stable Diffusion
