A brief guide to using Stable Diffusion

After I posted some of my results using Stable Diffusion, a few people asked me for tips on how I got there. Hence, this guide. Stable Diffusion (SD) is quite a bit different from MidJourney (MJ), NovelAI, and most other paid/hosted services out there, both in what it allows you to do and in how you control it. BUT, once you learn how it works, the world is your oyster.

(Buyer beware: SD gets used, A LOT, for nsfw image generation. That’s not my thing and I try to stay away from it as much as possible, but you WILL likely bump into inappropriate images while setting up SD and downloading the add-ons you are looking for to get the images you want. I’ll mark how to sanitize your experience as much as possible, but please go into this with a very, very wary eye.)

The main benefit of SD over MJ is that YOU control the model used to generate your art. Want to switch from a realistic style to one more suited to early 90s HeMan/SheRa style animation? No problem! Want to create a consistent character that you can then pose / use in scenes over the course of a whole story? Sure! Want to turn your character into a pixel image spritesheet for a video game, or a full character rotation for a 3D model, or a lineart drawing for a coloring book? Say no more!

First, you need to install Stable Diffusion. This is not too bad, but you need to follow these steps (and the software instructions) CAREFULLY. Note: this DOES require a pretty decent graphics card, with AT LEAST 4GB of video memory (VRAM) on the card. (4GB will function most of the time, but 6, 8, 10, or 12GB is MUCH better.) Realistically, this means you probably need an NVIDIA 3060 (or AMD equivalent) or better. There ARE ways to run this on a hosted server (so you don’t need to spend a bunch on a computer upgrade), but it’s a different process – I’ll try to post a guide to that later. (I plan to spin one of those up myself to create some consistent characters in DreamBooth.)

Here’s a step-by-step guide (not mine) for Windows: https://stable-diffusion-art.com/install-windows/

There’s an option listed here to install SD somewhere OTHER than in your Users directory – I REALLY wish I had done that when I first installed. I could change it now, but I’m in the middle of using the tool; I’ll get to it when I’ve finished this project.

Walk through all of those steps, then come back here.

Ok, to start up Stable Diffusion, click the “webui” Windows Batch File in your stable-diffusion-webui folder. A command prompt will run for a while – a lot longer the first time, up to 10-15 minutes – and then pause on a line of text that should say something like:

“Running on local URL: http://127.0.0.1:7860”

You MUST leave this command line window open; that is the stable diffusion software running. Highlight and copy the http:// address in your command line, and then go back to your web browser. Paste that address, and you should see something like the attached picture.
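
If you would rather not hunt for the address by eye, here is a minimal Python sketch that pulls it out of the console text. It assumes you have copied the console output into a string, and the function name find_local_url is made up for this example; it only looks for the “Running on local URL” line quoted above.

```python
import re

# Matches the address in a line such as:
#   Running on local URL: http://127.0.0.1:7860
LOCAL_URL_PATTERN = re.compile(
    r"Running on local URL:\s*(https?://[\w.\-]+(?::\d+)?)"
)


def find_local_url(console_text):
    """Return the first local URL announced in the console text, or None."""
    match = LOCAL_URL_PATTERN.search(console_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = "Running on local URL: http://127.0.0.1:7860"
    print(find_local_url(sample))
```

If the function returns None, the web UI has probably not finished starting yet.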

If you don’t, you probably forgot to download the model into the stable-diffusion-webui/models/Stable-diffusion folder. Double check that, and try again.
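
A quick way to double check is to list what is actually in that folder. This is a rough sketch, not part of SD itself: the function name list_checkpoints is invented here, the install path in the usage line is a placeholder, and the extension list (.ckpt and .safetensors) is my assumption about what model files usually look like.

```python
from pathlib import Path

# Extensions assumed to mark model checkpoint files (an assumption, not an official list).
CHECKPOINT_EXTENSIONS = {".ckpt", ".safetensors"}


def list_checkpoints(webui_dir):
    """Return the sorted names of checkpoint files in models/Stable-diffusion."""
    model_dir = Path(webui_dir) / "models" / "Stable-diffusion"
    if not model_dir.is_dir():
        return []  # folder missing: nothing has been installed there yet
    return sorted(
        path.name
        for path in model_dir.iterdir()
        if path.is_file() and path.suffix.lower() in CHECKPOINT_EXTENSIONS
    )


if __name__ == "__main__":
    # Placeholder path: point this at your own stable-diffusion-webui folder.
    found = list_checkpoints("stable-diffusion-webui")
    print(found if found else "No checkpoints found - download a model first.")
```

An empty list means the model never made it into the folder, which matches the symptom above.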

Once you get the UI from the picture to show up in your browser, you’re ready to get started. The SINGLE most important thing about SD, and the biggest difference from all of those online tools, is that SD needs “negative prompts” to work correctly. The prompt is where you say what you want to see; the negative prompt tells SD what NOT to give you.

To get started with that, here is a starter negative prompt you can copy:

(bad_prompt_version2:0.7), flat color, flat shading, bad anatomy, disfigured, deformed, malformed, mutant, gross, disgusting, out of frame, nsfw, poorly drawn, extra limbs, extra fingers, missing limbs, blurry, out of focus, lowres, bad hands, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username,

There are some default things I put in the prompt as well; for example: “high detail, 4k, 3d, 2d, trending on artstation”. This is definitely something you can play with more; I haven’t nailed this down for myself yet, but getting consistent in this end-of-prompt stuff will really help you get a consistent style.

When you are writing what you WANT to see, I get better results with building up prompts bit by bit; start with something that works, then get more specific. For example, I started with “cactus”, then didn’t like the background, so went to “Cactus in a New Mexico desert”, then “Cactus in New Mexico desert in the Rocky Mountains”, then “Cactus under a moonlit sky in New Mexico desert in the Rocky Mountains.”
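
If you like to keep track of that build-up, here is a small Python sketch of the idea: the subject changes step by step while the end-of-prompt style tags stay the same. The names STYLE_SUFFIX, PROMPT_STEPS, and with_style are just for this example; the text itself is copied from the prompts above.

```python
# End-of-prompt style tags taken from this guide's example; adjust to taste.
STYLE_SUFFIX = "high detail, 4k, 3d, 2d, trending on artstation"

# Each step adds one detail to a prompt that already worked.
PROMPT_STEPS = [
    "cactus",
    "Cactus in a New Mexico desert",
    "Cactus in New Mexico desert in the Rocky Mountains",
    "Cactus under a moonlit sky in New Mexico desert in the Rocky Mountains",
]


def with_style(subject, style_suffix=STYLE_SUFFIX):
    """Append the shared style tags to one subject description."""
    return f"{subject.strip().rstrip('.')}, {style_suffix}"


if __name__ == "__main__":
    for step in PROMPT_STEPS:
        print(with_style(step))
```

Keeping the suffix in one place is what makes the style consistent from one image to the next; only the subject line moves.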

It also helps to define your camera angle, like “wide angle shot”, “low angle shot”, or “cowboy shot”; here’s a good website with a list of shot types to include in your prompt: https://www.studiobinder.com/…/ultimate-guide-to…/

Also: see that “(bad_prompt_version2:0.7)” in the negative prompt? This is one of the powerful things about SD: people can create what are called “textual inversions,” which are basically packets of prompts that achieve a specific result. One that is commonly used is the “bad_prompt” textual inversion (they were previously called “Styles”), so let’s go download and use that. (For reference, you can control the strength of the added style by wrapping it in parentheses and adding a colon and a number, e.g. “:0.7” or “:1.3” or “:2.1”, where a number between 0 and 1 reduces the strength to a percentage, and anything over 1 increases the strength.)
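
To make the strength syntax concrete, here is a tiny Python sketch that produces the same “(name:number)” form used in the starter negative prompt. The helper names weighted and join_prompt are invented for this example; nothing here talks to SD, it only builds the text you would paste into the prompt box.

```python
def weighted(term, weight):
    """Wrap a prompt term as (term:weight), e.g. (bad_prompt_version2:0.7)."""
    # The "g" format drops trailing zeros, so 0.70 is written as 0.7.
    return f"({term}:{weight:g})"


def join_prompt(parts):
    """Join prompt pieces with commas, skipping any empty pieces."""
    return ", ".join(part.strip() for part in parts if part.strip())


if __name__ == "__main__":
    negative = join_prompt([
        weighted("bad_prompt_version2", 0.7),  # below 1: weaker
        "flat color",
        "bad anatomy",
    ])
    print(negative)
```

Nudging the number up or down between runs is an easy way to see how much a style is actually contributing.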

Go to the following link, create a HuggingFace account (it’s free), and download bad_prompt_version2.pt into your stable-diffusion-webui/embeddings folder:

https://huggingface.co/data…/Nerfgun3/bad_prompt/tree/main

HuggingFace has hundreds of those textual inversions you can download; these all go into your “embeddings” folder as .pt or .bin files. Here’s a link to HuggingFace’s library: https://huggingface.co/sd-concepts-library

You can see samples of a lot of these using the Conceptualizer link on that page, but about half of the inversions (Concepts) don’t have demos, so do scroll through the list at the bottom of the page as well.

When you want to use these in your prompts, you activate the style/concept/inversion by dropping the NAME OF THE FILE IN THE FOLDER, without the .pt or .bin ending, into your prompt (e.g. bad_prompt_version2). I would set that off with commas myself, though you can play with it both ways.
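
Once you have collected a pile of these, it is easy to forget their exact names. Here is a short Python sketch that lists the names you can type, assuming (as described above) that the name is simply the file name in the embeddings folder without its .pt or .bin ending. The function name embedding_names and the path in the usage line are placeholders.

```python
from pathlib import Path

# File types this guide says belong in the embeddings folder.
EMBEDDING_EXTENSIONS = {".pt", ".bin"}


def embedding_names(webui_dir):
    """Return the sorted prompt names of the files in the embeddings folder."""
    folder = Path(webui_dir) / "embeddings"
    if not folder.is_dir():
        return []
    return sorted(
        path.stem  # file name without the extension
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in EMBEDDING_EXTENSIONS
    )


if __name__ == "__main__":
    # Placeholder path: point this at your own stable-diffusion-webui folder.
    for name in embedding_names("stable-diffusion-webui"):
        print(name)
```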

And now, the biggest part:

Unfortunately, the base model of SD is… not so great. Fortunately, you can swap out that model; there are dozens to choose from. Unfortunately, this is the first place where you REALLY need to be careful about running into nsfw content. Go to https://civitai.com/ – it’s sfw by default, but be warned: even the sfw models here often have nsfw images (hidden by blur; remove at your own risk). Open the filters (set Safe browsing if you like), and then search only for checkpoints. These are all different art-generation models you can use, and each will give an entirely different flavor to all of your generated art. Download them into the stable-diffusion-webui/models/Stable-diffusion folder; once you refresh your SD web browser page, they should be available in the dropdown under “Stable Diffusion Checkpoint.”

Find some models that look interesting, download them, and try them out! Some of my favorites are Anything V3 and Midjourneyv4, but the InkPunk models, and the 90s monochrome RPG book model, and the comicbook models… nvm, there’s a LOT of fun ones.

There’s one last thing you need to do before you are *fully* up and running, and that is to enhance the colors generated in SD. Download the vae-ft-mse-840000-ema-pruned.ckpt file into your stable-diffusion-webui/models/VAE folder from here: https://huggingface.co/…/sd-vae-ft-mse-original/tree/main

Then, go into your SD (you may have to turn it off and start it up again) and click Settings -> Stable Diffusion. In the VAE dropdown, select that same 840,000 VAE, and hit Apply Settings. This will make your colors richer and more vibrant. There are other VAEs you can use – I’m using the kl-f8-anime.ckpt one – but I don’t remember where I found it, and I don’t have a good single-source repository for you in that regard.

Ok, you’re up and running! Go crank out some artwork!

When you want to see the files you’ve produced (every one is saved by default!), go to stable-diffusion-webui/outputs/txt2img-images; all the images from your prompts should be there. Grids and img2img files are in their respective folders; poke around and it should all be there.
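
The saved PNG files can also carry the prompt that made them. The sketch below reads the plain-text chunks out of a PNG using only the Python standard library. It rests on an assumption I have not verified for every setup: that the web UI is writing its generation settings into a text chunk with the keyword “parameters” (I believe this is a setting that is on by default). The function name read_png_text and the file path are placeholders, and prompts with unusual characters may be stored in a different chunk type that this sketch does not read.

```python
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text(png_path):
    """Return a dict of keyword -> text for every tEXt chunk in a PNG file."""
    data = Path(png_path).read_bytes()
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    texts = {}
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(data):
        # Each chunk starts with a 4-byte length and a 4-byte type.
        length, chunk_type = struct.unpack(">I4s", data[offset:offset + 8])
        body = data[offset + 8:offset + 8 + length]
        if chunk_type == b"tEXt":
            # tEXt body layout: keyword, a zero byte, then the text.
            keyword, _, text = body.partition(b"\x00")
            texts[keyword.decode("latin-1")] = text.decode("latin-1")
        elif chunk_type == b"IEND":
            break
        offset += 8 + length + 4  # header + body + 4-byte checksum
    return texts


if __name__ == "__main__":
    # Placeholder file name: use one of your own images here.
    info = read_png_text("outputs/txt2img-images/example.png")
    # "parameters" is the keyword assumed above; it may be absent.
    print(info.get("parameters", "no saved prompt found"))
```

If nothing comes back, the image may have been saved as a JPG or with that setting turned off.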

Happy generating!
