TTS

SillyTavern supports a wide range of TTS (text-to-speech) options that let a voice narrate parts of your chat. This page explains how to set up and use them.

Configuring TTS

TTS Provider Selection

Used to select which TTS service you want to use. Some of the options are free, some require a paid subscription, and some run locally on your PC.

Available options (list may change over time):

  • AllTalk - free, open source local installation, offers a variety of TTS engines. See the AllTalk page for setup instructions.
  • Azure TTS - same voices as Microsoft Edge. Requires an Azure account and a paid subscription.
  • Coqui-TTS (deprecated) - free, requires Extras API to run. High-performance Text2Speech models (Tacotron, Tacotron2, Glow-TTS, SpeedySpeech) as well as Bark.
  • Edge - free, runs via Azure. When "Plugin" is selected as the provider, you also need to install the corresponding server plugin. The other option requires the Extras API (deprecated) to run.
  • Electron Hub - reuses your Electron Hub API key to access cloud voices (GPT-4o Mini TTS, Microsoft neural voices, etc.) with per-model controls.
  • ElevenLabs - paid subscription required. Get an API key from ElevenLabs.
  • Google Translate - a free voice provided by Google, one per language, quality can vary widely.
  • Google Gemini TTS - requires an API key from either Vertex AI or AI Studio, uses Gemini TTS models.
  • Kokoro - free, uses kokoro.js to run the model locally in your browser. However, some browsers may not support WebGPU for the device option.
  • MiniMax - requires an API key from MiniMax. See the MiniMax TTS page for setup instructions.
  • Novel - requires a paid NovelAI subscription; audio is generated by NovelAI's TTS engine.
  • OpenAI - paid API key required, uses OpenAI's TTS models.
  • Pollinations - free access to OpenAI TTS models, subject to a rate limit. See the Pollinations website.
  • Silero - free, runs on your PC, quality can vary widely. Requires a dedicated API server installation or Extras API (deprecated).
  • System - uses your OS TTS engine, if one exists. Quality can vary widely depending on the OS.
  • XTTS - free, requires a dedicated API server installation. See the XTTS page for setup instructions.

Checkboxes

  • Enabled - turns TTS playback on/off
  • Auto Generation - lets TTS start playing automatically when a new message enters the chat
  • Only narrate "quotes" - Limits TTS playback to only include text within "quotation marks". This will *include "quotes" within asterisk lines* (internal variable name = narrate_quoted_only)
  • Ignore *text, even "quotes", inside asterisks* - TTS will not play any text within *asterisks*, even "quotes" (internal variable name = narrate_dialogues_only)
  • If both Only narrate "quotes" and Ignore *text, even "quotes", inside asterisks* are checked, the TTS reads only "quotes" that are not inside asterisks and ignores everything else.
  • Narrate only the translated text - this will make the TTS only narrate the translated text.
  • Apply regex - applies a provided regex pattern to the text before sending it to the TTS provider. Useful for removing unwanted parts from the input text, such as emojis or non-native language characters that the TTS engine doesn't handle well.
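
As a rough illustration of what Apply regex can achieve, the sketch below strips emoji from a message before it would be handed to a TTS engine. It is written in Python for demonstration only; the pattern, the code-point ranges, and the function name strip_for_tts are assumptions, and the exact pattern syntax that SillyTavern expects in the regex field may differ.

```python
import re

# Hypothetical pattern covering common emoji and pictograph code points,
# plus the variation selector that often follows them.
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F]+")


def strip_for_tts(text: str) -> str:
    """Remove emoji, then collapse the whitespace they leave behind."""
    cleaned = EMOJI_PATTERN.sub("", text)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


if __name__ == "__main__":
    print(strip_for_tts("Good evening 😊 senpai ✨"))
```

The same idea extends to other unwanted input, such as scripts the selected voice cannot pronounce, by widening the character class.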

Given the example text: *Cohee approaches you with a faint "nya"* "Good evening, senpai", she says. Here's a table showing how the text will be modified based on the boolean states of Ignore *text, even "quotes", inside asterisks* and Only narrate "quotes":

| Ignore *text, even "quotes", inside asterisks* | Only narrate "quotes" | Output |
|---|---|---|
| Disabled | Disabled | Cohee approaches you with a faint "nya" "Good evening, senpai", she says. |
| Disabled | Enabled | "nya"... "Good evening, senpai" |
| Enabled | Disabled | "Good evening, senpai", she says. |
| Enabled | Enabled | "Good evening, senpai" |
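
The four combinations in the table above can be reproduced with a small filter. This is a minimal sketch, not SillyTavern's actual implementation: the function name filter_for_tts, its parameters, and the "... " joiner between quoted fragments are assumptions chosen to match the table's output.

```python
import re


def filter_for_tts(text: str, ignore_asterisks: bool, quotes_only: bool,
                   joiner: str = "... ") -> str:
    """Apply the two narration checkboxes to a chat message (illustrative)."""
    if ignore_asterisks:
        # Drop every *asterisk-delimited* span, including quotes inside it.
        text = re.sub(r"\*[^*]*(\*|$)", "", text)
    # Any remaining asterisks are formatting only and are never read aloud.
    text = text.replace("*", "").strip()
    if quotes_only:
        # Keep only the "quoted" fragments, joined by a short pause marker.
        quoted = re.findall(r'"[^"]*"', text)
        if quoted:
            text = joiner.join(quoted)
    return text


if __name__ == "__main__":
    example = '*Cohee approaches you with a faint "nya"* "Good evening, senpai", she says.'
    for ignore in (False, True):
        for quotes in (False, True):
            print(ignore, quotes, "->", filter_for_tts(example, ignore, quotes))
```

Running the asterisk filter before the quote filter is what makes the combined setting skip quotes that sit inside asterisks.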

Sliders

These will change depending on the API you select.

Buttons

  • Apply - this must be clicked after setting a TTS API and after editing the voice map.
  • Refresh - reloads the list of voices from the selected TTS API.
  • Available voices - loads a popup with all voices available for your selected API, and lets you preview them with sample dialogues.

Using TTS

  1. Check the "Enabled" checkbox; without it, TTS will not play at all.
  2. Check the "Auto Generation" checkbox if you want TTS to start automatically every time a new message arrives in chat.
  3. Optionally, click the megaphone icon at the top right of any message to play it back on demand.
  4. Click the lower right "Stop" button (found inside the wand menu) to stop any playback.

Voice Map

You must provide a voice map; otherwise, the TTS won't know which voice to use for each character. To set up the voice map, first open a chat with the character you'd like to assign a voice to, and/or select the user persona you'd like to assign a voice to. Then select one of the voices listed by the TTS provider from the dropdown. If you don't see a list of voices and/or characters, make sure that your TTS provider is configured correctly and click "Refresh". Some providers (like OpenAI-compatible or NovelAI) require you to populate the voice list manually.
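
Conceptually, a voice map is a lookup from a character or persona name to a voice offered by the selected provider. The sketch below illustrates that idea only; the names, the voice IDs, and the resolve_voice helper are hypothetical and do not reflect SillyTavern's internal storage format.

```python
from typing import Dict, Optional

# Hypothetical voice map: character or persona name -> provider voice ID.
voice_map: Dict[str, str] = {
    "Cohee": "voice_a",
    "User": "voice_b",
}


def resolve_voice(speaker: str, mapping: Dict[str, str]) -> Optional[str]:
    """Return the voice assigned to a speaker, or None if none is mapped."""
    return mapping.get(speaker)


if __name__ == "__main__":
    print(resolve_voice("Cohee", voice_map))    # mapped speaker
    print(resolve_voice("Unknown", voice_map))  # unmapped speaker: no voice to use
```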