Stable Diffusion changed everything when it launched in 2022. For the first time, anyone could run a state-of-the-art image generation model on their own computer, completely free. No subscriptions, no usage limits, no corporate terms of service deciding what you can create.
In the years since, the ecosystem has exploded. ComfyUI has largely replaced AUTOMATIC1111 as the interface of choice. Civitai hosts over 100,000 custom models. And with Stable Video Diffusion and newer models, video generation is now accessible to hobbyists. If you tried Stable Diffusion in 2023 and bounced off, it’s time to revisit.
This tutorial walks you through everything: installation options, ComfyUI basics, custom models, LoRAs, ControlNet, and video generation. By the end, you’ll have a working setup and understand the workflows that professionals use.
Why Stable Diffusion Over Midjourney or DALL-E?
Before diving into setup, let’s address why you’d choose Stable Diffusion over simpler alternatives like Midjourney or DALL-E 3.
| Factor | Stable Diffusion | Midjourney | DALL-E 3 |
|---|---|---|---|
| Cost | Free (local) or approximately $0.40/hr (cloud) | $10-60/month | $20/month (ChatGPT Plus) |
| Privacy | 100% local, data never leaves your machine | Cloud-based | Cloud-based |
| Customization | Full control: custom models, LoRAs, ControlNet | Limited style references | Minimal |
| NSFW/Unrestricted | No content filters | Strict policies | Strict policies |
| Learning Curve | Steep | Easy | Very easy |
| Best For | Power users, developers, specific styles | Quick beautiful images | Conversational generation |
Choose Stable Diffusion if you:
- Want complete creative freedom without content restrictions
- Need to generate hundreds or thousands of images
- Have a specific style that requires custom training
- Value privacy and local processing
- Enjoy tinkering and optimizing workflows
Stick with Midjourney/DALL-E if you:
- Need beautiful images fast with minimal setup
- Prefer paying monthly over hardware investment
- Don’t require custom models or advanced techniques

Installation Options: Local vs Cloud
Your hardware determines which path to take. Stable Diffusion requires a decent GPU for reasonable performance.
Hardware Requirements
| Setup | Minimum | Recommended |
|---|---|---|
| VRAM | 6GB (slow, limited) | 12GB+ (RTX 3060/4070 or better) |
| RAM | 16GB | 32GB |
| Storage | 50GB free | 200GB+ (models are large) |
Reality check: If you have an RTX 3060 12GB or better, local installation is worth it. If you’re on a laptop GPU, integrated graphics, or Mac (even M1/M2), cloud services are more practical.
Option 1: Local Installation with ComfyUI
ComfyUI is a node-based interface that’s become the standard for serious Stable Diffusion users. It’s more powerful than AUTOMATIC1111 and allows visual workflow creation.
Step 1: Install ComfyUI
# Clone the repository
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install PyTorch with CUDA support
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
# Install ComfyUI dependencies
pip install -r requirements.txt
Step 2: Download a Model
Download Stable Diffusion 3.5 Medium (the best balance of quality and speed) from Hugging Face:
# Place in ComfyUI/models/checkpoints/
# File: sd3.5_medium.safetensors (~5GB)
Step 3: Launch ComfyUI
python main.py
# Opens at http://127.0.0.1:8188
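Once the server is running, you can also drive it programmatically instead of through the browser. ComfyUI exposes an HTTP API on the same port; the sketch below (using only the standard library) wraps an API-format workflow in the envelope the `/prompt` endpoint expects. The endpoint and payload shape match current ComfyUI builds, but verify against the version you installed.

```python
# Minimal sketch of queueing a workflow against a running ComfyUI
# server (default: 127.0.0.1:8188). Assumes the /prompt endpoint and
# {"prompt": ..., "client_id": ...} envelope of current ComfyUI.
import json
import urllib.request

SERVER = "http://127.0.0.1:8188"

def build_payload(workflow: dict, client_id: str = "tutorial") -> bytes:
    """Wrap an API-format workflow dict in the envelope /prompt expects."""
    return json.dumps({"prompt": workflow, "client_id": client_id}).encode("utf-8")

def queue_prompt(workflow: dict) -> dict:
    """POST the workflow to the server and return its JSON response."""
    req = urllib.request.Request(
        f"{SERVER}/prompt",
        data=build_payload(workflow),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

This is handy later for batch generation: build one workflow dict, tweak a field (seed, prompt), and queue it again.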
Option 2: Cloud GPU Services
No GPU? Cloud services provide pre-configured environments at hourly rates.
| Service | Cost | Setup Time | Best For |
|---|---|---|---|
| RunPod | $0.40-0.80/hr | 5 min | Most popular, ComfyUI templates |
| Vast.ai | $0.20-0.50/hr | 10 min | Budget option, variable quality |
| Google Colab | Free-$10/mo | 15 min | Testing, limited runtime |
| ThinkDiffusion | $0.50/hr | Instant | Zero setup, browser-based |
RunPod Quick Start:
- Create account at runpod.io
- Select “Templates” and search “ComfyUI”
- Choose a GPU (RTX 4090 recommended for speed)
- Deploy and access via browser
Cloud costs add up. At 20 hours/month usage, you’re paying $8-16/month, which approaches Leonardo AI subscription prices. But you get full customization that managed platforms can’t match.
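The break-even arithmetic is worth sketching out for your own usage pattern:

```python
# Back-of-the-envelope cloud cost: hourly rate x hours per month.
# Rates are the RunPod range quoted above; substitute your provider's.
def monthly_cloud_cost(hourly_rate: float, hours_per_month: float) -> float:
    return hourly_rate * hours_per_month

low = monthly_cloud_cost(0.40, 20)   # $8/month at the low end
high = monthly_cloud_cost(0.80, 20)  # $16/month at the high end
```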
ComfyUI Basics: Your First Workflow
ComfyUI uses a node-based system where you connect components visually. Think of it like wiring a synthesizer: data flows from left to right through nodes.

Core Nodes You’ll Use
| Node | Purpose |
|---|---|
| Load Checkpoint | Loads your SD model (.safetensors file) |
| CLIP Text Encode | Converts text prompts to embeddings |
| KSampler | The actual image generation (denoising) |
| VAE Decode | Converts latent space to viewable image |
| Save Image | Outputs final image |
Basic Text-to-Image Workflow
- Load Checkpoint → Connect MODEL, CLIP, VAE outputs
- CLIP Text Encode (Positive) → Your main prompt
- CLIP Text Encode (Negative) → What to avoid
- Empty Latent Image → Set resolution (1024x1024 for SD3.5)
- KSampler → Connect all inputs, set steps (20-30), CFG scale (4-7)
- VAE Decode → Converts to RGB image
- Save Image → Outputs to ComfyUI/output/
Example Prompt:
Positive: "a majestic owl perched on ancient ruins, golden hour lighting,
photorealistic, 8k detail, volumetric fog, depth of field"
Negative: "blurry, low quality, text, watermark, distorted, deformed"
Key Settings:
- Steps: 20-30 (more = better quality, slower)
- CFG Scale: 4-7 for SD3.5 (controls prompt adherence)
- Sampler: euler, dpmpp_2m_sde (experiment to find preference)
- Scheduler: karras or normal
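The seven-step workflow above maps onto ComfyUI's API (JSON) format roughly as follows: a dict of node-id → {class_type, inputs}, where `["node_id", output_index]` pairs wire one node's output into another's input. The class and input names (`CheckpointLoaderSimple`, `KSampler`, etc.) follow current ComfyUI builds, but treat this as a sketch and use "Save (API Format)" on a real workflow to confirm.

```python
# Minimal text-to-image workflow in ComfyUI's API format (a sketch;
# export a real workflow via "Save (API Format)" to verify names).
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "sd3.5_medium.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",   # positive prompt
          "inputs": {"clip": ["1", 1],
                     "text": "a majestic owl perched on ancient ruins, "
                             "golden hour lighting, photorealistic"}},
    "3": {"class_type": "CLIPTextEncode",   # negative prompt
          "inputs": {"clip": ["1", 1],
                     "text": "blurry, low quality, text, watermark"}},
    "4": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "5": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["2", 0],
                     "negative": ["3", 0], "latent_image": ["4", 0],
                     "seed": 42, "steps": 25, "cfg": 5.0,
                     "sampler_name": "euler", "scheduler": "normal",
                     "denoise": 1.0}},
    "6": {"class_type": "VAEDecode",
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveImage",
          "inputs": {"images": ["6", 0], "filename_prefix": "tutorial"}},
}
```

Note the settings from the list above baked in: 25 steps, CFG 5.0, and the checkpoint's MODEL/CLIP/VAE outputs (indices 0/1/2) fanned out to the nodes that need them.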
Using Custom Models from Civitai
Civitai is the community hub for Stable Diffusion models. Over 100,000 checkpoints, LoRAs, and embeddings are available, from photorealistic to anime to specific art styles.

Finding the Right Model
Popular Model Types:
| Type | Examples | Best For |
|---|---|---|
| Photorealistic | Juggernaut XL, RealVisXL | Product photos, portraits |
| Anime/Illustration | Pony Diffusion, Animagine | Anime art, character design |
| Artistic | DreamShaper, SDXL Unstable | Creative, painterly styles |
| Specialized | Architecture, Fashion | Industry-specific needs |
Installing Civitai Models
- Find a model on civitai.com (check for SDXL or SD3.5 compatibility)
- Download the .safetensors file
- Place in ComfyUI/models/checkpoints/
- Reload ComfyUI (Ctrl+R) or restart
- Select in Load Checkpoint node
Pro Tip: Read the model card. Creators specify optimal settings (CFG scale, samplers, trigger words) that dramatically improve results.
LoRA and ControlNet: Advanced Techniques
LoRAs and ControlNet transform Stable Diffusion from “generic image generator” to “precision creative tool.”
LoRA (Low-Rank Adaptation)
LoRAs are small adapter files (10-200MB) that modify model behavior without changing the base model. Use them to add:
- Styles: Specific artistic styles, lighting, compositions
- Characters: Consistent characters across images
- Concepts: Objects, poses, environments
Using LoRAs in ComfyUI:
- Download LoRA from Civitai
- Place in ComfyUI/models/loras/
- Add “Load LoRA” node after Load Checkpoint
- Connect MODEL and CLIP through the LoRA node
- Set strength (0.5-1.0 typical)
Example: Using a “cinematic lighting” LoRA at 0.7 strength adds Hollywood-style lighting to any prompt.
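In API format, splicing the LoRA node into the chain looks like the sketch below. `LoraLoader` and its input names follow current ComfyUI, and node "1" is assumed to be the checkpoint loader; export a real workflow to confirm.

```python
# Sketch: a LoraLoader node patched between the checkpoint loader
# (node "1") and everything downstream. It takes MODEL and CLIP in,
# and re-exports patched versions on outputs 0 and 1.
lora_node = {
    "10": {"class_type": "LoraLoader",
           "inputs": {"model": ["1", 0],   # MODEL from checkpoint
                      "clip": ["1", 1],    # CLIP from checkpoint
                      "lora_name": "cinematic_lighting.safetensors",
                      "strength_model": 0.7,   # 0.5-1.0 is typical
                      "strength_clip": 0.7}},
}
# Downstream rewiring: KSampler's model input becomes ["10", 0], and
# the CLIP Text Encode nodes take their clip from ["10", 1].
```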
ControlNet: Precise Composition Control
ControlNet lets you guide image generation using reference images. Instead of hoping the AI positions elements correctly, you specify exact poses, edges, or depth maps.
ControlNet Types:
| Type | Input | Use Case |
|---|---|---|
| Canny Edge | Line drawing/edges | Maintain structure from sketch |
| Depth | Depth map | Control 3D positioning |
| OpenPose | Pose skeleton | Character poses |
| Scribble | Rough sketch | Quick concept art |
| IP-Adapter | Reference image | Style transfer |
Basic ControlNet Workflow:
- Install ControlNet models from Hugging Face
- Add “Load ControlNet Model” node
- Add “Apply ControlNet” node
- Connect your preprocessed image (edge detection, pose extraction)
- Connect to KSampler conditioning
This technique is essential for professional work where specific compositions are required.
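The ControlNet steps above look like this in API format. `ControlNetLoader`/`ControlNetApply` and their input names follow current ComfyUI, and the node IDs ("2" for the positive prompt, "30" for a preprocessed image) are assumptions carried over for illustration; verify against your own exported workflow.

```python
# Sketch: load ControlNet weights, then condition the positive prompt
# on a preprocessed reference image (e.g. a Canny edge map).
controlnet_nodes = {
    "20": {"class_type": "ControlNetLoader",
           "inputs": {"control_net_name": "control_canny_sdxl.safetensors"}},
    "21": {"class_type": "ControlNetApply",
           "inputs": {"conditioning": ["2", 0],   # positive CLIP encode
                      "control_net": ["20", 0],
                      "image": ["30", 0],         # preprocessed edge map
                      "strength": 0.8}},          # how hard to enforce it
}
# KSampler's positive input then becomes ["21", 0].
```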
Video Generation with Stable Video Diffusion
Stable Diffusion isn’t just for images anymore. Stability AI’s video models enable short-form video generation.
Current Video Models (2026)
| Model | Input | Output | Best For |
|---|---|---|---|
| Stable Video Diffusion | Single image | 2-4 sec clip | Image animation |
| Stable Video 4D 2.0 | Image | Multi-view video | 3D object rotation |
| Stable Virtual Camera | 2D video | Immersive video | Adding camera motion |
Image-to-Video Workflow
- Generate or select a high-quality image
- Use SVD model in ComfyUI (requires separate download)
- Set motion parameters (motion bucket, fps)
- Generate frames (14-25 typical)
- Export as video
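The motion parameters in step 3 live on SVD's conditioning node. The field names below (`motion_bucket_id`, `fps`, `augmentation_level`) are a sketch based on ComfyUI's `SVD_img2vid_Conditioning` node; check the exact names against your build.

```python
# Sketch of typical SVD conditioning settings. Higher motion_bucket_id
# means more motion; augmentation_level adds noise to the input image
# (useful when the source image is very clean or synthetic).
svd_settings = {
    "width": 1024, "height": 576,   # SVD's native resolution
    "video_frames": 14,             # 14 or 25, depending on checkpoint
    "motion_bucket_id": 127,        # ~127 is a common starting point
    "fps": 6,
    "augmentation_level": 0.0,
}
```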
Hardware Note: Video generation is significantly more VRAM-intensive. Expect 12GB+ for basic SVD, 24GB+ for higher quality.
For more accessible video generation, consider dedicated platforms like Runway or HeyGen which offer more polished workflows at the cost of flexibility.
Tips for Better Results
After thousands of generations, these practices consistently improve output quality.
Prompt Engineering
Structure your prompts:
[Subject], [Style/Medium], [Lighting], [Quality Keywords], [Artist Reference]
Example: "portrait of a cyberpunk hacker, digital painting,
neon rim lighting, intricate details, 8k, in the style of Simon Stalenhag"
Quality boosters that work:
- “highly detailed, 8k, intricate”
- “professional photography, DSLR”
- “masterpiece, best quality” (for anime models)
- Specific lighting: “golden hour, studio lighting, volumetric”
Negative prompts matter:
"blurry, low quality, text, watermark, signature, worst quality,
jpeg artifacts, deformed, distorted, extra limbs"
Workflow Optimization
- Start low, scale up: Generate at 512x512 first, upscale winners
- Use Hi-Res Fix: Two-pass generation for sharper large images
- Batch generate: Create 4-8 variations, pick the best
- Save workflows: ComfyUI saves workflows in image metadata
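The "batch generate, pick the best" tip works better with reproducible seeds: derive several seeds from one base seed, then rerun the same workflow with only the seed changed. A minimal sketch:

```python
# Derive N reproducible seeds from one base seed, so a batch run can
# be repeated exactly later (e.g. to regenerate only the winner).
import random

def seed_batch(base_seed: int, n: int = 4) -> list[int]:
    rng = random.Random(base_seed)
    return [rng.randrange(2**32) for _ in range(n)]

seeds = seed_batch(42, n=4)  # same base seed -> same four variants
```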
Common Mistakes to Avoid
| Mistake | Solution |
|---|---|
| CFG scale too high | SD3.5 works best at 4-7, not 7-12 like older models |
| Wrong resolution | Match model’s training resolution (1024x1024 for SDXL/SD3.5) |
| Ignoring model cards | Read recommended settings on Civitai |
| Too many LoRAs | Stack 1-3 max, reduce strength when combining |
| Skipping negative prompts | Always specify what to avoid |
Stable Diffusion vs Alternatives Comparison
How does Stable Diffusion stack up against commercial alternatives for different use cases?
- Stable Diffusion: Free, unlimited, full control
- Midjourney: Best aesthetics, $10/mo
- DALL-E 3: Best text rendering, pay-per-use
- Leonardo AI: Best free cloud option
| Use Case | Best Choice | Why |
|---|---|---|
| Quick beautiful images | Midjourney | Aesthetic defaults, minimal prompting |
| Conversational generation | DALL-E 3 | Natural language understanding |
| Specific style consistency | Stable Diffusion | Custom models, LoRAs |
| High volume generation | Stable Diffusion | No per-image costs |
| Video generation | Runway or SD | Depends on control needs |
| Managed custom training | Leonardo AI | Guided workflow, no setup |
Getting Started Checklist
Ready to begin? Here’s your action plan:
Week 1: Setup
- Assess hardware (GPU VRAM check)
- Install ComfyUI locally or sign up for RunPod
- Download SD 3.5 Medium checkpoint
- Generate first images with basic workflow
Week 2: Exploration
- Browse Civitai for models matching your needs
- Try 2-3 different checkpoints
- Experiment with LoRAs
- Practice prompt engineering
Week 3: Advanced
- Install ControlNet models
- Create pose-controlled generations
- Try image-to-video with SVD
- Build and save custom workflows
The learning curve is real, but the payoff is complete creative control. Unlike subscription services that can change policies or pricing overnight, your local Stable Diffusion setup is yours forever.
For more AI image generation techniques, see our guides on custom model training and AI image generation tips.
Related Reading
- Best AI Image Generators for Professional Marketing in 2026
- How to Train Custom AI Models for Brand Consistency
- Best AI Image Generators 2026: Leonardo vs Midjourney vs DALL-E
- Midjourney vs DALL-E 3: Complete Comparison for 2026
- 10 Prompt Engineering Tips for Better AI Images
External Resources
For official documentation and updates from these tools:
- Stable Diffusion — Official website
- Midjourney — Official website
- DALL-E 3 — Official website
- Leonardo AI — Official website