# Stable Diffusion

# Image Generation extension

Use local or cloud-based Stable Diffusion APIs to generate images. The free mode is also supported via the /sd (anything_here) command in the chat input bar. Most common Stable Diffusion generation settings are customizable within the SillyTavern UI.

# Generation modes

| Wand menu item | Slash command argument | Description | Remarks |
| --- | --- | --- | --- |
| "Yourself" | you | A full-body portrait of the current character. | - |
| "Your Face" | face | A close-up portrait of the current character. | Forces a portrait aspect ratio. |
| "Me" | me | A portrait of the user persona. | - |
| "The Whole Story" | scene | A visual recap of the chat events. | - |
| "The Last Message" | last | A visual recap of the last chat message. | - |
| "Raw Last Message" | raw_last | Last message used as a prompt verbatim. | - |
| "Background" | background | A chat background based on story context. | Forces a wide landscape aspect ratio. |

# How to generate an image

  1. Use the "Image Generation" item in the extensions context menu (wand).
  2. Type a /sd (argument) slash command with an argument from the Generation modes table. Any other text triggers "free mode", which sends that text to SD as the prompt. Example: /sd apple tree generates a picture of an apple tree.
  3. Look for a paintbrush icon in the context actions for chat messages. This will force the "Raw Message" mode for the selected message.

Every generation mode besides raw message and free mode will trigger a prompt generation using your currently selected main generation API to convert chat context into the SD prompt. You can configure the instruction template for generating prompts for every generation mode using the "SD Prompt Templates" settings drawer in the extensions panel.
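The dispatch rule described above can be sketched as follows. This is a minimal illustration, not the extension's actual code: the function name `resolve_sd_command`, the `"free"` label, and the case-insensitive matching are assumptions made for the example.

```python
# Arguments from the Generation modes table, mapped to their wand menu labels.
GENERATION_MODES = {
    "you": "Yourself",
    "face": "Your Face",
    "me": "Me",
    "scene": "The Whole Story",
    "last": "The Last Message",
    "raw_last": "Raw Last Message",
    "background": "Background",
}

# Modes whose text goes to SD as-is, without asking the main API for a prompt.
# "free" is a label chosen for this sketch.
VERBATIM_MODES = {"raw_last", "free"}


def resolve_sd_command(argument: str) -> tuple:
    """Return (mode, needs_prompt_generation) for the text typed after /sd."""
    key = argument.strip().lower()  # assumption: matching ignores case
    mode = key if key in GENERATION_MODES else "free"
    return mode, mode not in VERBATIM_MODES


print(resolve_sd_command("face"))        # a known mode, prompt is generated
print(resolve_sd_command("apple tree"))  # anything else falls back to free mode
```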

# Supported Sources

# Options

# Edit prompts before generation

Allows you to manually edit the automatically generated prompts before they are sent to the Stable Diffusion API.

# Interactive mode

Triggers an image generation instead of a text reply when a user message matches the following pattern:

  1. Contains one of the following verbs: send, mail, imagine, generate, make, create, draw, paint, render
  2. Followed by one of the following nouns (not further than 10 characters away): pic, picture, image, drawing, painting, photo, photograph
  3. Followed by the target subject of the image generation, optionally preceded by a phrase like "of a" or "of this".

Examples of valid requests and captured subjects:

  • Can you please send me a picture of a cat => cat
  • Generate a picture of the Eiffel tower => Eiffel tower
  • Let's draw a painting of Mona Lisa => Mona Lisa
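One way to express the three-part pattern is a single regular expression. The sketch below is an approximation written for illustration, not the pattern the extension ships with; `extract_subject` is a hypothetical helper, and the set of optional articles after "of" (a, an, the, this) is an assumption.

```python
import re
from typing import Optional

VERBS = "send|mail|imagine|generate|make|create|draw|paint|render"
NOUNS = "pic|picture|image|drawing|painting|photo|photograph"

PATTERN = re.compile(
    rf"\b(?:{VERBS})\b"                       # 1. a trigger verb
    rf".{{0,10}}?"                            # at most 10 characters in between
    rf"\b(?:{NOUNS})\b"                       # 2. an image noun
    rf"(?:\s+of)?(?:\s+(?:a|an|the|this))?"   # optional "of a" / "of this"
    rf"\s+(.+)",                              # 3. the subject
    re.IGNORECASE,
)


def extract_subject(message: str) -> Optional[str]:
    """Return the captured subject, or None if the message is not a request."""
    match = PATTERN.search(message)
    return match.group(1).strip() if match else None


for text in (
    "Can you please send me a picture of a cat",
    "Generate a picture of the Eiffel tower",
    "Let's draw a painting of Mona Lisa",
):
    print(text, "=>", extract_subject(text))
```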

Some special subjects trigger a predefined generation mode:

  • 'you', 'yourself' => "Yourself"
  • 'your face', 'your portrait', 'your selfie' => "Your Face"
  • 'me', 'myself' => "Me"
  • 'story', 'scenario', 'whole story' => "The Whole Story"
  • 'last message' => "The Last Message"
  • 'background', 'scene background', 'scene', 'scenery', 'surroundings', 'environment' => "Background"
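The list above amounts to a lookup from subject to mode. A minimal sketch, assuming an exact, case-insensitive match on the whole subject and a fallback to using the subject as a free-mode prompt (the source does not spell out either detail); `subject_to_mode` is a hypothetical name:

```python
from typing import Optional

# Special subjects and the generation mode each one triggers.
SPECIAL_SUBJECTS = {
    "Yourself": ["you", "yourself"],
    "Your Face": ["your face", "your portrait", "your selfie"],
    "Me": ["me", "myself"],
    "The Whole Story": ["story", "scenario", "whole story"],
    "The Last Message": ["last message"],
    "Background": [
        "background", "scene background", "scene",
        "scenery", "surroundings", "environment",
    ],
}


def subject_to_mode(subject: str) -> Optional[str]:
    """Return the predefined mode for a subject, or None for a plain subject."""
    key = subject.strip().lower()  # assumption: whole-subject, case-insensitive
    for mode, triggers in SPECIAL_SUBJECTS.items():
        if key in triggers:
            return mode
    return None  # caller would treat the subject as a free-mode prompt


print(subject_to_mode("your selfie"))
print(subject_to_mode("cat"))
```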

# Auto-enhance prompts

This option uses an additional GPT-2 text generation model to add more details to the prompt generated by the main API. Works best for SDXL image models. It may not work well with other models, and it is recommended to manually edit prompts in this case.

Default GPT-2 model: Cohee/fooocus_expansion-onnx

# Snap auto-adjusted resolutions

Snap image generation requests with a forced aspect ratio (portraits, backgrounds) to the nearest known resolution, while trying to preserve the absolute pixel counts. Refer to the "Resolution" dropdown for the list of possible options.

Recommended for SDXL models.
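As a rough illustration of snapping, the sketch below picks the candidate whose aspect ratio is closest to the request and breaks ties by pixel count. Both the distance measure and the candidate list are assumptions made for this example: the list holds a few resolutions commonly used with SDXL and is not the content of the "Resolution" dropdown.

```python
import math

# Illustrative candidates only; the real list is the "Resolution" dropdown.
KNOWN_RESOLUTIONS = [
    (1024, 1024),
    (1152, 896), (896, 1152),
    (1216, 832), (832, 1216),
    (1344, 768), (768, 1344),
]


def snap_resolution(width, height, candidates=KNOWN_RESOLUTIONS):
    """Return the candidate closest in aspect ratio, then in pixel count."""
    target_ratio = width / height
    target_pixels = width * height

    def distance(resolution):
        w, h = resolution
        # Log-scale errors treat "twice as wide" and "half as wide" alike.
        ratio_error = abs(math.log((w / h) / target_ratio))
        pixel_error = abs(math.log((w * h) / target_pixels))
        return (ratio_error, pixel_error)

    return min(candidates, key=distance)


print(snap_resolution(1280, 720))  # a wide request
print(snap_resolution(720, 1280))  # a tall request
```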

# Common prompt prefix

Added before every generated or free-mode prompt. Commonly used for setting the overall style of the picture.

Example: best quality, anime lineart.

Pro tip: Use the {prompt} macro to specify exactly where the generated prompt is inserted.
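The effect of the macro can be sketched like this. The helper name `apply_common_prefix` and the ", " separator used when no macro is present are assumptions for the example, not confirmed behavior.

```python
def apply_common_prefix(prefix: str, prompt: str) -> str:
    """Combine the common prefix with a generated or free-mode prompt."""
    if "{prompt}" in prefix:
        # The macro marks the exact insertion point.
        return prefix.replace("{prompt}", prompt)
    if not prefix.strip():
        return prompt
    # Without the macro, the prefix simply goes first (separator is assumed).
    return f"{prefix}, {prompt}"


print(apply_common_prefix("best quality, anime lineart", "apple tree"))
print(apply_common_prefix("best quality, {prompt}, anime lineart", "apple tree"))
```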

# Negative prompt

Characteristics of the image you don't want to be present in the output.

Example: bad quality, watermark.

# Character-specific prompt prefix

Any characteristics that describe the currently selected character. These are added after the common prompt prefix.

Example: female, green eyes, brown hair, pink shirt.

Limitations:

  1. Works only in 1-to-1 chats. Will not be used in groups.
  2. Won't be used for backgrounds and free mode generations.

Pro tip: If supported by the generation source, you can also use LoRAs/embeddings here, for example: <lora:DonaldDuck:1>.
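The ordering and the two limitations can be summarized in a short sketch. The function `build_prefix`, the mode labels, and the ", " separator are hypothetical and chosen only for illustration.

```python
def build_prefix(common: str, character: str, mode: str, is_group_chat: bool) -> str:
    """Join the common and character-specific prefixes, honoring the limits."""
    # The character prefix is skipped in groups, backgrounds, and free mode.
    skip_character = is_group_chat or mode in {"background", "free"}
    parts = [common] if skip_character else [common, character]
    # Drop empty parts so no stray separators appear (separator is assumed).
    return ", ".join(part.strip() for part in parts if part.strip())


common = "best quality, anime lineart"
character = "female, green eyes, brown hair, pink shirt"
print(build_prefix(common, character, mode="face", is_group_chat=False))
print(build_prefix(common, character, mode="background", is_group_chat=False))
```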

# Styles

Use this to quickly save and restore your favorite style/quality presets, either for later use or when switching between models. A Style preset includes the following:

  1. Common Prompt Prefix
  2. Negative Prompt