From 2b686917daa224bc237dea52e548335ea96ca257 Mon Sep 17 00:00:00 2001
From: Francis LaBounty <73464335+francislabountyjr@users.noreply.github.com>
Date: Thu, 25 May 2023 16:27:35 -0600
Subject: [PATCH] Update README.md
---
README.md | 57 +------------------------------------------------------
1 file changed, 1 insertion(+), 56 deletions(-)
diff --git a/README.md b/README.md
index b53476d..abbb62a 100644
--- a/README.md
+++ b/README.md
@@ -13,25 +13,6 @@ You will get the best results by making generations with your cloned voice until
-------------------------------------------------------------------
# Original README.md
-
-
-[](https://twitter.com/OnusFM)
-[](https://discord.gg/J2B2vsjKuE)
-
-
-[Examples](https://suno-ai.notion.site/Bark-Examples-5edae8b02a604b54a42244ba45ebc2e2) | [Model Card](./model-card.md) | [Playground Waitlist](https://3os84zs17th.typeform.com/suno-studio)
-
-Bark is a transformer-based text-to-audio model created by [Suno](https://suno.ai). Bark can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects. The model can also produce nonverbal communications like laughing, sighing and crying. To support the research community, we are providing access to pretrained model checkpoints ready for inference.
-
-
-
-
-
-## 🔊 Demos
-
-[](https://huggingface.co/spaces/suno/bark)
-[](https://colab.research.google.com/drive/1eJfA2XUa-mXwdMy7DoYKVYHI1iTd9Vkt?usp=sharing)
-
## 🤖 Usage
```python
@@ -141,15 +122,13 @@ Bark has been tested and works on both CPU and GPU (`pytorch 2.0+`, CUDA 11.7 an
Running Bark requires running >100M parameter transformer models.
On modern GPUs and PyTorch nightly, Bark can generate audio in roughly realtime. On older GPUs, default colab, or CPU, inference time might be 10-100x slower.
-If you don't have new hardware available or if you want to play with bigger versions of our models, you can also sign up for early access to our model playground [here](https://3os84zs17th.typeform.com/suno-studio).
-
## ⚙️ Details
Similar to [Vall-E](https://arxiv.org/abs/2301.02111) and some other amazing work in the field, Bark uses GPT-style
models to generate audio from scratch. Different from Vall-E, the initial text prompt is embedded into high-level semantic tokens without the use of phonemes. It can therefore generalize to arbitrary instructions beyond speech that occur in the training data, such as music lyrics, sound effects or other non-speech sounds. A subsequent second model is used to convert the generated semantic tokens into audio codec tokens to generate the full waveform. To enable the community to use Bark via public code we used the fantastic
[EnCodec codec](https://github.com/facebookresearch/encodec) from Facebook to act as an audio representation.
-Below is a list of some known non-speech sounds, but we are finding more every day. Please let us know if you find patterns that work particularly well on [Discord](https://discord.gg/J2B2vsjKuE)!
+Below is a list of some known non-speech sounds:
- `[laughter]`
- `[laughs]`
@@ -182,37 +161,3 @@ Below is a list of some known non-speech sounds, but we are finding more every d
| Arabic | Coming soon! |
| Bengali | Coming soon! |
| Telugu | Coming soon! |
-
-## 🙏 Appreciation
-
-- [nanoGPT](https://github.com/karpathy/nanoGPT) for a dead-simple and blazing fast implementation of GPT-style models
-- [EnCodec](https://github.com/facebookresearch/encodec) for a state-of-the-art implementation of a fantastic audio codec
-- [AudioLM](https://github.com/lucidrains/audiolm-pytorch) for very related training and inference code
-- [Vall-E](https://arxiv.org/abs/2301.02111), [AudioLM](https://arxiv.org/abs/2209.03143) and many other ground-breaking papers that enabled the development of Bark
-
-## © License
-
-Bark is licensed under a non-commercial license: CC-BY 4.0 NC. The Suno models themselves may be used commercially. However, this version of Bark uses `EnCodec` as a neural codec backend, which is licensed under a [non-commercial license](https://github.com/facebookresearch/encodec/blob/main/LICENSE).
-
-Please contact us at `bark@suno.ai` if you need access to a larger version of the model and/or a version of the model you can use commercially.
-
-## 📱 Community
-
-- [Twitter](https://twitter.com/OnusFM)
-- [Discord](https://discord.gg/J2B2vsjKuE)
-
-## 🎧 Suno Studio (Early Access)
-
-We’re developing a playground for our models, including Bark.
-
-If you are interested, you can sign up for early access [here](https://3os84zs17th.typeform.com/suno-studio).
-
-## FAQ
-
-#### How do I specify where models are downloaded and cached?
-
-Use the `XDG_CACHE_HOME` env variable to override where models are downloaded and cached (otherwise defaults to a subdirectory of `~/.cache`).
-
-#### Bark's generations sometimes differ from my prompts. What's happening?
-
-Bark is a GPT-style model. As such, it may take some creative liberties in its generations, resulting in higher-variance model outputs than traditional text-to-speech approaches.