
# 🐶 BARK AI: but with the ability to use voice cloning on custom audio/text pairs

To clone a voice, follow the `clone_voice.ipynb` notebook. To generate audio from text, follow the `generate.ipynb` notebook.

To create a voice clone sample, you need an audio/text pair of less than 13 seconds.
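
As a rough pre-check of the length limit, the sketch below measures a clip's duration with Python's standard `wave` module. It assumes the sample is an uncompressed PCM WAV file; the helper names and the `MAX_CLONE_SECONDS` constant are illustrative and not part of this repository.

```python
import wave

MAX_CLONE_SECONDS = 13.0  # limit stated in this README


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_usable_clone_sample(path, max_seconds=MAX_CLONE_SECONDS):
    """True if the clip is non-empty and shorter than the limit."""
    duration = wav_duration_seconds(path)
    return 0.0 < duration < max_seconds
```

Other formats (MP3, WebM, ...) would need to be converted to WAV first, or measured with a different library.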

You will get the best results by generating with your cloned voice until one output sounds really close to the source, then using that output as the new history prompt. Because it comes from the model itself, it should theoretically be more consistent.

- [BARK text to speech @ SERP AI](https://serp.ai/tools/bark-text-to-speech-ai-voice-clone-app/)

-------------------------------------------------------------------
# Original README.md

<a href="http://www.repostatus.org/#active"><img src="http://www.repostatus.org/badges/latest/active.svg" /></a>
[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/OnusFM.svg?style=social&label=@OnusFM)](https://twitter.com/OnusFM)
[![](https://dcbadge.vercel.app/api/server/J2B2vsjKuE?compact=true&style=flat&)](https://discord.gg/J2B2vsjKuE)


[Examples](https://suno-ai.notion.site/Bark-Examples-5edae8b02a604b54a42244ba45ebc2e2) | [Model Card](./model-card.md) | [Playground Waitlist](https://3os84zs17th.typeform.com/suno-studio)

Bark is a transformer-based text-to-audio model created by [Suno](https://suno.ai). Bark can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects. The model can also produce nonverbal communications like laughing, sighing and crying. To support the research community, we are providing access to pretrained model checkpoints ready for inference.

<p align="center">
<img src="https://user-images.githubusercontent.com/5068315/230698495-cbb1ced9-c911-4c9a-941d-a1a4a1286ac6.png" width="500"></img>
</p>

## 🔊 Demos

[![Open in Spaces](https://img.shields.io/badge/🤗-Open%20In%20Spaces-blue.svg)](https://huggingface.co/spaces/suno/bark)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1eJfA2XUa-mXwdMy7DoYKVYHI1iTd9Vkt?usp=sharing)

## 🤖 Usage

```python
from bark import SAMPLE_RATE, generate_audio, preload_models
from IPython.display import Audio

# download and load all models
preload_models()

# generate audio from text
text_prompt = """
     Hello, my name is Suno. And, uh — and I like pizza. [laughs] 
     But I also have other interests such as playing tic tac toe.
"""
audio_array = generate_audio(text_prompt)

# play text in notebook
Audio(audio_array, rate=SAMPLE_RATE)
```

[pizza.webm](https://user-images.githubusercontent.com/5068315/230490503-417e688d-5115-4eee-9550-b46a2b465ee3.webm)


To save `audio_array` as a WAV file:

```python
from scipy.io.wavfile import write as write_wav

write_wav("/path/to/audio.wav", SAMPLE_RATE, audio_array)
```
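
If a player or tool expects 16-bit PCM rather than floating-point samples, a dependency-free alternative is sketched below using only the standard library. It assumes `audio_array` is a one-dimensional sequence of floats roughly within [-1, 1]; `write_pcm16_wav` is a hypothetical helper, not part of Bark.

```python
import struct
import wave


def write_pcm16_wav(path, samples, sample_rate):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))  # guard against overshoot
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))
```

It would be called as `write_pcm16_wav("/path/to/audio.wav", audio_array, SAMPLE_RATE)`. The per-sample loop is slow for long clips, so a vectorized conversion is preferable when NumPy is available.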

### 🌎 Foreign Language

Bark supports various languages out-of-the-box and automatically determines language from input text. When prompted with code-switched text, Bark will attempt to employ the native accent for the respective languages. English quality is best for the time being, and we expect other languages to further improve with scaling. 

```python
text_prompt = """
    Buenos días Miguel. Tu colega piensa que tu alemán es extremadamente malo. 
    But I suppose your english isn't terrible.
"""
audio_array = generate_audio(text_prompt)
```

[miguel.webm](https://user-images.githubusercontent.com/5068315/230684752-10baadfe-1e7c-46a2-8323-43282aef2c8c.webm)

### 🎶 Music

Bark can generate all types of audio, and, in principle, doesn't see a difference between speech and music. Sometimes Bark chooses to generate text as music, but you can help it out by adding music notes around your lyrics.

```python
text_prompt = """
    ♪ In the jungle, the mighty jungle, the lion barks tonight ♪
"""
audio_array = generate_audio(text_prompt)
```

[lion.webm](https://user-images.githubusercontent.com/5068315/230684766-97f5ea23-ad99-473c-924b-66b6fab24289.webm)
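
For longer lyrics, the note markers can be added programmatically. The helper below is an illustrative sketch; wrapping every line separately (rather than the whole prompt once) is an assumption, and Bark may still choose to speak rather than sing.

```python
def as_song_prompt(lyrics, note="♪"):
    """Wrap each non-empty lyric line in music notes."""
    wrapped = []
    for line in lyrics.splitlines():
        stripped = line.strip()
        if stripped:
            wrapped.append(f"{note} {stripped} {note}")
    return "\n".join(wrapped)
```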

### 🎤 Voice Presets and Voice/Audio Cloning

Bark can fully clone voices, including tone, pitch, emotion and prosody. The model also attempts to preserve music, ambient noise, etc. from input audio. However, to mitigate misuse of this technology, we restrict audio history prompts to a small set of Suno-provided, fully synthetic options for each language. Specify one with the pattern `{lang_code}_speaker_{0-9}`.

```python
text_prompt = """
    I have a silky smooth voice, and today I will tell you about 
    the exercise regimen of the common sloth.
"""
audio_array = generate_audio(text_prompt, history_prompt="en_speaker_1")
```


[sloth.webm](https://user-images.githubusercontent.com/5068315/230684883-a344c619-a560-4ff5-8b99-b4463a34487b.webm)

*Note: since Bark recognizes languages automatically from input text, it is possible to use, for example, a German history prompt with English text. This usually leads to English audio with a German accent.*
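
To avoid typos in preset names, the `{lang_code}_speaker_{0-9}` pattern can be built and validated in code. This is a minimal sketch: `speaker_preset` is a hypothetical helper, and the language codes are copied from the "Supported Languages" table in this README rather than read from the library.

```python
# Language codes copied from the "Supported Languages" table in this README.
SUPPORTED_LANG_CODES = (
    "en", "de", "es", "fr", "hi", "it", "ja",
    "ko", "pl", "pt", "ru", "tr", "zh",
)


def speaker_preset(lang_code, speaker_index):
    """Build a preset name following `{lang_code}_speaker_{0-9}`."""
    if lang_code not in SUPPORTED_LANG_CODES:
        raise ValueError(f"unsupported language code: {lang_code!r}")
    if not isinstance(speaker_index, int) or not 0 <= speaker_index <= 9:
        raise ValueError(f"speaker index must be an integer 0-9, got {speaker_index!r}")
    return f"{lang_code}_speaker_{speaker_index}"
```

The result can be passed as `history_prompt`, for example `generate_audio(text_prompt, history_prompt=speaker_preset("de", 3))`.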

### 👥 Speaker Prompts

You can provide certain speaker prompts such as NARRATOR, MAN, WOMAN, etc. Please note that these are not always respected, especially if a conflicting audio history prompt is given.

```python
text_prompt = """
    WOMAN: I would like an oatmilk latte please.
    MAN: Wow, that's expensive!
"""
audio_array = generate_audio(text_prompt)
```

[latte.webm](https://user-images.githubusercontent.com/5068315/230684864-12d101a1-a726-471d-9d56-d18b108efcb8.webm)
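
When a dialogue comes from structured data, the `SPEAKER: line` layout can be assembled with a small helper. The function below is an illustrative sketch, not part of Bark, and upper-casing the speaker labels is an assumption based on the examples above.

```python
def build_dialogue_prompt(turns):
    """Join (speaker, line) pairs into a 'SPEAKER: line' prompt, one turn per line."""
    lines = []
    for speaker, line in turns:
        lines.append(f"{speaker.strip().upper()}: {line.strip()}")
    return "\n".join(lines)
```

For example, `build_dialogue_prompt([("woman", "I would like an oatmilk latte please."), ("man", "Wow, that's expensive!")])` reproduces the prompt text shown above.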


## 💻 Installation

```
pip install git+https://github.com/suno-ai/bark.git
```

or

```
git clone https://github.com/suno-ai/bark
cd bark && pip install . 
```

## 🛠️ Hardware and Inference Speed

Bark has been tested and works on both CPU and GPU (`pytorch 2.0+`, CUDA 11.7 and CUDA 12.0).
Running Bark requires running >100M parameter transformer models.
On modern GPUs and PyTorch nightly, Bark can generate audio in roughly real time. On older GPUs, the default Colab runtime, or CPU, inference might be 10-100x slower.
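
To see where a given machine falls in that range, compare the wall-clock generation time with the duration of the audio produced. The sketch below computes this real-time factor; the function name is illustrative, and the timing itself (for example with `time.perf_counter()` around `generate_audio`) is left to the caller.

```python
def real_time_factor(generation_seconds, num_samples, sample_rate):
    """Seconds of compute spent per second of generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    audio_seconds = num_samples / sample_rate
    return generation_seconds / audio_seconds
```

A value near 1 corresponds to "roughly real time", while values between 10 and 100 match the slower cases described above. In practice it would be called as `real_time_factor(elapsed, len(audio_array), SAMPLE_RATE)`.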

If you don't have new hardware available or if you want to play with bigger versions of our models, you can also sign up for early access to our model playground [here](https://3os84zs17th.typeform.com/suno-studio).

## ⚙️ Details

Similar to [Vall-E](https://arxiv.org/abs/2301.02111) and some other amazing work in the field, Bark uses GPT-style models to generate audio from scratch. Unlike Vall-E, the initial text prompt is embedded into high-level semantic tokens without the use of phonemes. It can therefore generalize to arbitrary instructions beyond speech that occur in the training data, such as music lyrics, sound effects or other non-speech sounds. A second model then converts the generated semantic tokens into audio codec tokens to generate the full waveform. To enable the community to use Bark via public code, we used the fantastic [EnCodec codec](https://github.com/facebookresearch/encodec) from Facebook as the audio representation.

Below is a list of some known non-speech sounds, but we are finding more every day. Please let us know on [Discord](https://discord.gg/J2B2vsjKuE) if you find patterns that work particularly well!

- `[laughter]`
- `[laughs]`
- `[sighs]`
- `[music]`
- `[gasps]`
- `[clears throat]`
- `—` or `...` for hesitations
- `♪` for song lyrics
- capitalization for emphasis of a word
- `MAN/WOMAN:` for bias towards speaker
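
Since unlisted cues may or may not work, it can help to flag bracketed tags in a prompt that are not on the list above. This is a small sketch using the standard `re` module; `KNOWN_TAGS` mirrors the bracketed entries in this README and is, like the list itself, incomplete.

```python
import re

# Bracketed cues listed in this README; the list is known to be incomplete.
KNOWN_TAGS = {
    "[laughter]", "[laughs]", "[sighs]",
    "[music]", "[gasps]", "[clears throat]",
}

TAG_PATTERN = re.compile(r"\[[^\[\]]+\]")


def unknown_tags(text_prompt):
    """Return bracketed cues in the prompt that are not in KNOWN_TAGS."""
    found = TAG_PATTERN.findall(text_prompt)
    return [tag for tag in found if tag.lower() not in KNOWN_TAGS]
```

An unknown tag is not necessarily an error; it only means the cue has not been reported to work.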

**Supported Languages**

| Language | Status |
| --- | --- |
| English (en) | ✅ |
| German (de) | ✅ |
| Spanish (es) | ✅ |
| French (fr) | ✅ |
| Hindi (hi) | ✅ |
| Italian (it) | ✅ |
| Japanese (ja) | ✅ |
| Korean (ko) | ✅ |
| Polish (pl) | ✅ |
| Portuguese (pt) | ✅ |
| Russian (ru) | ✅ |
| Turkish (tr) | ✅ |
| Chinese, simplified (zh) | ✅ |
| Arabic  | Coming soon! |
| Bengali | Coming soon! |
| Telugu | Coming soon! |

## 🙏 Appreciation

- [nanoGPT](https://github.com/karpathy/nanoGPT) for a dead-simple and blazing fast implementation of GPT-style models
- [EnCodec](https://github.com/facebookresearch/encodec) for a state-of-the-art implementation of a fantastic audio codec
- [AudioLM](https://github.com/lucidrains/audiolm-pytorch) for very related training and inference code
- [Vall-E](https://arxiv.org/abs/2301.02111), [AudioLM](https://arxiv.org/abs/2209.03143) and many other ground-breaking papers that enabled the development of Bark

## © License

Bark is licensed under a non-commercial license: CC BY-NC 4.0. The Suno models themselves may be used commercially. However, this version of Bark uses `EnCodec` as a neural codec backend, which is licensed under a [non-commercial license](https://github.com/facebookresearch/encodec/blob/main/LICENSE).

Please contact us at `bark@suno.ai` if you need access to a larger version of the model and/or a version of the model you can use commercially.  

## 📱 Community

- [Twitter](https://twitter.com/OnusFM)
- [Discord](https://discord.gg/J2B2vsjKuE)

## 🎧 Suno Studio (Early Access)

We’re developing a playground for our models, including Bark. 

If you are interested, you can sign up for early access [here](https://3os84zs17th.typeform.com/suno-studio).

## FAQ

#### How do I specify where models are downloaded and cached?

Use the `XDG_CACHE_HOME` environment variable to override where models are downloaded and cached (by default, a subdirectory of `~/.cache`).
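
The lookup described above can be mirrored in a few lines to check which cache root will apply. This is a sketch of the general `XDG_CACHE_HOME` convention, not Bark's actual code; `cache_root` is a hypothetical helper, and the name of the subdirectory Bark creates under the root is not shown here.

```python
import os


def cache_root():
    """Return XDG_CACHE_HOME if set and non-empty, else ~/.cache."""
    default_root = os.path.join(os.path.expanduser("~"), ".cache")
    return os.environ.get("XDG_CACHE_HOME") or default_root
```

When setting the variable from Python rather than the shell, assign `os.environ["XDG_CACHE_HOME"]` before importing `bark`, on the assumption that the library reads the variable at import time.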

#### Bark's generations sometimes differ from my prompts. What's happening?

Bark is a GPT-style model. As such, it may take some creative liberties in its generations, resulting in higher-variance model outputs than traditional text-to-speech approaches.