mirror of
https://github.com/jasonppy/VoiceCraft.git
synced 2025-12-16 20:07:43 +01:00
122 lines
304 KiB
Plaintext
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": [],
      "gpuType": "T4",
      "authorship_tag": "ABX9TyPEhMt0mIcJv2BbaCwogF07",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    },
    "accelerator": "GPU"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/Sewlell/VoiceCraft-gradio-colab/blob/master/voicecraft.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Y87ixxsUVIhM"
      },
      "outputs": [],
      "source": [
        "!git clone https://github.com/Sewlell/VoiceCraft-gradio-colab"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "!pip install tensorboard\n",
        "!pip install phonemizer\n",
        "!pip install datasets\n",
        "!pip install torchmetrics\n",
        "\n",
        "!apt-get install -y espeak espeak-data libespeak1 libespeak-dev\n",
        "!apt-get install -y festival*\n",
        "!apt-get install -y build-essential\n",
        "!apt-get install -y flac libasound2-dev libsndfile1-dev vorbis-tools\n",
        "!apt-get install -y libxml2-dev libxslt-dev zlib1g-dev\n",
        "\n",
        "!pip install -e git+https://github.com/facebookresearch/audiocraft.git@c5157b5bf14bf83449c17ea1eeb66c19fb4bc7f0#egg=audiocraft\n",
        "\n",
        "!pip install -r \"/content/VoiceCraft-gradio-colab/gradio_requirements.txt\""
      ],
      "metadata": {
        "id": "-w3USR91XdxY"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Let the runtime restart; it won't wipe out your installation."
      ],
      "metadata": {
        "id": "jNuzjrtmv2n1"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "# Notes before launching `gradio_app.py`\n",
        "\n",
        "***You will get a JSON warning if you change anything besides `sample_batch_size`, `stop_repetition` and `seed`.*** This means most advanced settings, such as `kvcache` and `temperature`, cannot be set to a different value.\n",
        "\n",
        "You will get an fp16 compatibility issue if you set `whisper backend` to `whisperX`. For whatever reason, setting `forced alignment model` to `whisperX` doesn't do anything.\n",
        "\n",
        "For some reason, downloading the Output Audio gives you a `.file` file. You will need to **convert the file from .snd to .wav/.mp3 manually**. Alternatively, if your system (Windows or otherwise) shows file extensions, rename the file to \"xxx.wav\" or \"xxx.mp3\". (Know the solution? Send a pull request to my repository.)\n",
        "\n",
        "The frequent VRAM spikes are also gone as of the April 5 update.\n",
        "\n",
        "# **To those who want to do voice cloning**\n",
        " can do 6-10s on 6s-8s `prompt end time`.\n",
        "\n",
        "I haven't tested the Edit mode yet, as it is not my focus, but you can try it."
      ],
      "metadata": {
        "id": "nnu2cY4t8P6X"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "!python \"/content/VoiceCraft-gradio-colab/gradio_app.py\""
      ],
      "metadata": {
        "id": "NDt4r4DiXAwG"
      },
      "execution_count": null,
      "outputs": []
    }
  ]
}