Tutorials

Stable Diffusion 2026: ComfyUI Workflows and Video Generation Guide

Published Jan 25, 2026
Read time: 10 min
Author: AI Productivity

This post contains affiliate links. I may earn a commission if you purchase through these links, at no extra cost to you.

Stable Diffusion changed everything when it launched in 2022. For the first time, anyone could run a state-of-the-art image generation model on their own computer, completely free. No subscriptions, no usage limits, no corporate terms of service deciding what you can create.

Three years later, the ecosystem has exploded. ComfyUI replaced AUTOMATIC1111 as the interface of choice. Civitai hosts over 100,000 custom models. And with Stable Video Diffusion and newer models, video generation is now accessible to hobbyists. If you tried Stable Diffusion in 2023 and bounced off, it’s time to revisit.

This tutorial walks you through everything: installation options, ComfyUI basics, custom models, LoRAs, ControlNet, and video generation. By the end, you’ll have a working setup and understand the workflows that professionals use.

Why Stable Diffusion Over Midjourney or DALL-E?

Before diving into setup, let’s address why you’d choose Stable Diffusion over simpler alternatives like Midjourney or DALL-E 3.

Factor | Stable Diffusion | Midjourney | DALL-E 3
------ | ---------------- | ---------- | --------
Cost | Free (local) or approximately $0.40/hr (cloud) | $10-60/month | $20/month (ChatGPT Plus)
Privacy | 100% local, data never leaves your machine | Cloud-based | Cloud-based
Customization | Full control: custom models, LoRAs, ControlNet | Limited style references | Minimal
NSFW/Unrestricted | No content filters | Strict policies | Strict policies
Learning Curve | Steep | Easy | Very easy
Best For | Power users, developers, specific styles | Quick beautiful images | Conversational generation

Choose Stable Diffusion if you:

  • Want complete creative freedom without content restrictions
  • Need to generate hundreds or thousands of images
  • Have a specific style that requires custom training
  • Value privacy and local processing
  • Enjoy tinkering and optimizing workflows

Stick with Midjourney/DALL-E if you:

  • Need beautiful images fast with minimal setup
  • Prefer paying monthly over hardware investment
  • Don’t require custom models or advanced techniques
[Image: Stability AI's homepage featuring the latest SD 3.5 models and enterprise solutions]

Installation Options: Local vs Cloud

Your hardware determines which path to take. Stable Diffusion requires a decent GPU for reasonable performance.

Hardware Requirements

Component | Minimum | Recommended
--------- | ------- | -----------
VRAM | 6GB (slow, limited) | 12GB+ (RTX 3060/4070 or better)
RAM | 16GB | 32GB
Storage | 50GB free | 200GB+ (models are large)

Reality check: If you have an RTX 3060 12GB or better, local installation is worth it. If you’re on a laptop GPU, integrated graphics, or Mac (even M1/M2), cloud services are more practical.
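If you're unsure what your card has, nvidia-smi (bundled with the NVIDIA driver) reports it directly:

nvidia-smi --query-gpu=name,memory.total --format=csv
# e.g. "NVIDIA GeForce RTX 3060, 12288 MiB" -- 12GB, fine for local use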

Option 1: Local Installation with ComfyUI

ComfyUI is a node-based interface that’s become the standard for serious Stable Diffusion users. It’s more powerful than AUTOMATIC1111 and allows visual workflow creation.

Step 1: Install ComfyUI

# Clone the repository
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI

# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install PyTorch with CUDA support
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# Install ComfyUI dependencies
pip install -r requirements.txt
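Before downloading models, it's worth confirming PyTorch actually sees your GPU; a CPU-only install will run, just painfully slowly:

python -c "import torch; print(torch.cuda.is_available())"
# Should print True; if False, reinstall PyTorch with the CUDA index URL above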

Step 2: Download a Model

Download Stable Diffusion 3.5 Medium (the best balance of quality and speed) from Hugging Face:

# Place in ComfyUI/models/checkpoints/
# File: sd3.5_medium.safetensors (~5GB)
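One scripted way to fetch it, assuming you've accepted the model license on Hugging Face (SD 3.5 is gated) and that the file name still matches the stabilityai/stable-diffusion-3.5-medium repo layout:

pip install -U "huggingface_hub[cli]"
huggingface-cli login   # paste a token from huggingface.co/settings/tokens
huggingface-cli download stabilityai/stable-diffusion-3.5-medium \
  sd3.5_medium.safetensors --local-dir ComfyUI/models/checkpoints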

Step 3: Launch ComfyUI

python main.py
# Opens at http://127.0.0.1:8188
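A few launch flags are worth knowing; these exist in current ComfyUI builds, but run python main.py --help to confirm on yours:

python main.py --listen 0.0.0.0 --port 8188   # expose the UI on your LAN
python main.py --lowvram                      # reduces VRAM use on 6-8GB cards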

Option 2: Cloud GPU Services

No GPU? Cloud services provide pre-configured environments at hourly rates.

Service | Cost | Setup Time | Best For
------- | ---- | ---------- | --------
RunPod | $0.40-0.80/hr | 5 min | Most popular, ComfyUI templates
Vast.ai | $0.20-0.50/hr | 10 min | Budget option, variable quality
Google Colab | Free-$10/mo | 15 min | Testing, limited runtime
ThinkDiffusion | $0.50/hr | Instant | Zero setup, browser-based

RunPod Quick Start:

  1. Create account at runpod.io
  2. Select “Templates” and search “ComfyUI”
  3. Choose a GPU (RTX 4090 recommended for speed)
  4. Deploy and access via browser

Cloud costs add up. At 20 hours/month usage, you’re paying $8-16/month, which approaches Leonardo AI subscription prices. But you get full customization that managed platforms can’t match.

ComfyUI Basics: Your First Workflow

ComfyUI uses a node-based system where you connect components visually. Think of it like wiring a synthesizer: data flows from left to right through nodes.

[Image: ComfyUI's visual workflow system connects nodes for text-to-image generation]

Core Nodes You’ll Use

Node | Purpose
---- | -------
Load Checkpoint | Loads your SD model (.safetensors file)
CLIP Text Encode | Converts text prompts to embeddings
KSampler | The actual image generation (denoising)
VAE Decode | Converts latent space to viewable image
Save Image | Outputs final image

Basic Text-to-Image Workflow

  1. Load Checkpoint → Connect MODEL, CLIP, VAE outputs
  2. CLIP Text Encode (Positive) → Your main prompt
  3. CLIP Text Encode (Negative) → What to avoid
  4. Empty Latent Image → Set resolution (1024x1024 for SD3.5)
  5. KSampler → Connect all inputs, set steps (20-30), CFG scale (4-7)
  6. VAE Decode → Converts to RGB image
  7. Save Image → Outputs to ComfyUI/output/

Example Prompt:

Positive: "a majestic owl perched on ancient ruins, golden hour lighting,
photorealistic, 8k detail, volumetric fog, depth of field"

Negative: "blurry, low quality, text, watermark, distorted, deformed"

Key Settings:

  • Steps: 20-30 (more = better quality, slower)
  • CFG Scale: 4-7 for SD3.5 (controls prompt adherence)
  • Sampler: euler, dpmpp_2m_sde (experiment to find preference)
  • Scheduler: karras or normal
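If you'd rather drive this same graph from a script, ComfyUI also accepts workflows over a local HTTP endpoint (POST /prompt on the default port). Below is a minimal sketch in the API JSON format using only the Python standard library; the node IDs, seed, and prompts are illustrative, and the most reliable way to get a graph matching your install is to export one with the UI's "Save (API Format)" option:

import json
import urllib.request

# Each key is a node ID; values wire node outputs (["node_id", output_index]) to inputs.
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "sd3.5_medium.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",   # positive prompt
          "inputs": {"clip": ["1", 1],
                     "text": "a majestic owl perched on ancient ruins, golden hour lighting"}},
    "3": {"class_type": "CLIPTextEncode",   # negative prompt
          "inputs": {"clip": ["1", 1], "text": "blurry, low quality, watermark"}},
    "4": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "5": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["2", 0], "negative": ["3", 0],
                     "latent_image": ["4", 0], "seed": 42, "steps": 25, "cfg": 5.0,
                     "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0}},
    "6": {"class_type": "VAEDecode", "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveImage", "inputs": {"images": ["6", 0], "filename_prefix": "owl"}},
}

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
print(urllib.request.urlopen(req).read().decode())  # returns a prompt_id on success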

Using Custom Models from Civitai

Civitai is the community hub for Stable Diffusion models. Over 100,000 checkpoints, LoRAs, and embeddings are available, from photorealistic to anime to specific art styles.

[Image: Civitai hosts thousands of custom models, LoRAs, and embeddings for Stable Diffusion]

Finding the Right Model

Popular Model Types:

Type | Examples | Best For
---- | -------- | --------
Photorealistic | Juggernaut XL, RealVisXL | Product photos, portraits
Anime/Illustration | Pony Diffusion, Animagine | Anime art, character design
Artistic | DreamShaper, SDXL Unstable | Creative, painterly styles
Specialized | Architecture, Fashion | Industry-specific needs

Installing Civitai Models

  1. Find a model on civitai.com (check for SDXL or SD3.5 compatibility)
  2. Download the .safetensors file
  3. Place in ComfyUI/models/checkpoints/
  4. Reload ComfyUI (Ctrl+R) or restart
  5. Select in Load Checkpoint node

Pro Tip: Read the model card. Creators specify optimal settings (CFG scale, samplers, trigger words) that dramatically improve results.
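If you download models often, Civitai also exposes a direct download URL per model version; the version ID below is a placeholder taken from a model page URL, and some files require an API key from your account settings:

# 123456 is a hypothetical model version ID
curl -L -o ComfyUI/models/checkpoints/my_model.safetensors \
  "https://civitai.com/api/download/models/123456?token=YOUR_API_KEY"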

LoRA and ControlNet: Advanced Techniques

LoRAs and ControlNet transform Stable Diffusion from “generic image generator” to “precision creative tool.”

LoRA (Low-Rank Adaptation)

LoRAs are small adapter files (10-200MB) that modify model behavior without changing the base model. Use them to add:

  • Styles: Specific artistic styles, lighting, compositions
  • Characters: Consistent characters across images
  • Concepts: Objects, poses, environments

Using LoRAs in ComfyUI:

  1. Download LoRA from Civitai
  2. Place in ComfyUI/models/loras/
  3. Add “Load LoRA” node after Load Checkpoint
  4. Connect MODEL and CLIP through the LoRA node
  5. Set strength (0.5-1.0 typical)

Example: Using a “cinematic lighting” LoRA at 0.7 strength adds Hollywood-style lighting to any prompt.
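Expressed in the same API JSON style as the earlier script, the LoRA slots in as one extra node; the node ID and file name here are illustrative:

# Added after the CheckpointLoaderSimple node ("1") from the earlier sketch
"8": {"class_type": "LoraLoader",
      "inputs": {"model": ["1", 0], "clip": ["1", 1],
                 "lora_name": "cinematic_lighting.safetensors",   # hypothetical file
                 "strength_model": 0.7, "strength_clip": 0.7}},

# Downstream nodes then take MODEL from ["8", 0] and CLIP from ["8", 1]
# instead of reading them from the checkpoint loader directly.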

ControlNet: Precise Composition Control

ControlNet lets you guide image generation using reference images. Instead of hoping the AI positions elements correctly, you specify exact poses, edges, or depth maps.

ControlNet Types:

Type | Input | Use Case
---- | ----- | --------
Canny Edge | Line drawing/edges | Maintain structure from sketch
Depth | Depth map | Control 3D positioning
OpenPose | Pose skeleton | Character poses
Scribble | Rough sketch | Quick concept art
IP-Adapter | Reference image | Style transfer

Basic ControlNet Workflow:

  1. Install ControlNet models from Hugging Face
  2. Add “Load ControlNet Model” node
  3. Add “Apply ControlNet” node
  4. Connect your preprocessed image (edge detection, pose extraction)
  5. Connect to KSampler conditioning

This technique is essential for professional work where specific compositions are required.
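In the same API JSON style, ControlNet adds a loader, a reference image, and an apply node; the file names are placeholders, and the key change is rerouting the positive conditioning through the apply node:

"9":  {"class_type": "ControlNetLoader",
       "inputs": {"control_net_name": "controlnet_canny.safetensors"}},  # placeholder
"10": {"class_type": "LoadImage",
       "inputs": {"image": "pose_reference.png"}},   # preprocessed reference
"11": {"class_type": "ControlNetApply",
       "inputs": {"conditioning": ["2", 0],          # positive prompt encode
                  "control_net": ["9", 0],
                  "image": ["10", 0],
                  "strength": 0.8}},

# The KSampler's "positive" input now takes ["11", 0] instead of ["2", 0].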

Video Generation with Stable Video Diffusion

Stable Diffusion isn’t just for images anymore. Stability AI’s video models enable short-form video generation.

Current Video Models (2026)

Model | Input | Output | Best For
----- | ----- | ------ | --------
Stable Video Diffusion | Single image | 2-4 sec clip | Image animation
Stable Video 4D 2.0 | Image | Multi-view video | 3D object rotation
Stable Virtual Camera | 2D video | Immersive video | Adding camera motion

Image-to-Video Workflow

  1. Generate or select a high-quality image
  2. Use SVD model in ComfyUI (requires separate download)
  3. Set motion parameters (motion bucket, fps)
  4. Generate frames (14-25 typical)
  5. Export as video
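For reference, the core of that graph in the same API style looks roughly like this (node class names as they appear in recent ComfyUI builds; verify against your install's node list):

"20": {"class_type": "ImageOnlyCheckpointLoader",      # outputs MODEL, CLIP_VISION, VAE
       "inputs": {"ckpt_name": "svd_xt.safetensors"}},
"21": {"class_type": "SVD_img2vid_Conditioning",
       "inputs": {"clip_vision": ["20", 1], "init_image": ["22", 0], "vae": ["20", 2],
                  "width": 1024, "height": 576, "video_frames": 14,
                  "motion_bucket_id": 127, "fps": 6, "augmentation_level": 0.0}},

# "22" would be a LoadImage node holding your source image. The conditioning and
# latent outputs of "21" feed a KSampler as usual, followed by VAE Decode and a
# video/animation save node.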

Hardware Note: Video generation is significantly more VRAM-intensive. Expect 12GB+ for basic SVD, 24GB+ for higher quality.

For more accessible video generation, consider dedicated platforms like Runway or HeyGen, which offer more polished workflows at the cost of flexibility.

Tips for Better Results

After thousands of generations, these practices consistently improve output quality.

Prompt Engineering

Structure your prompts:

[Subject], [Style/Medium], [Lighting], [Quality Keywords], [Artist Reference]

Example: "portrait of a cyberpunk hacker, digital painting,
neon rim lighting, intricate details 8k, in the style of Simon Stalenhag"

Quality boosters that work:

  • “highly detailed, 8k, intricate”
  • “professional photography, DSLR”
  • “masterpiece, best quality” (for anime models)
  • Specific lighting: “golden hour, studio lighting, volumetric”

Negative prompts matter:

"blurry, low quality, text, watermark, signature, worst quality,
jpeg artifacts, deformed, distorted, extra limbs"

Workflow Optimization

  1. Start low, scale up: Generate at 512x512 first, upscale winners
  2. Use Hi-Res Fix: Two-pass generation for sharper large images
  3. Batch generate: Create 4-8 variations, pick the best
  4. Save workflows: ComfyUI saves workflows in image metadata (see the snippet below)
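That last point is easy to script: ComfyUI writes the graph into each PNG's text chunks, so any output image doubles as a saved workflow. A minimal sketch with Pillow (pip install pillow); the file path is hypothetical:

from PIL import Image

img = Image.open("ComfyUI/output/ComfyUI_00001_.png")   # hypothetical output file
# ComfyUI stores the API-format graph under "prompt" and the UI graph under "workflow"
workflow_json = img.info.get("workflow")
print(workflow_json[:200] if workflow_json else "no embedded workflow found")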

Common Mistakes to Avoid

Mistake | Solution
------- | --------
CFG scale too high | SD3.5 works best at 4-7, not 7-12 like older models
Wrong resolution | Match model's training resolution (1024x1024 for SDXL/SD3.5)
Ignoring model cards | Read recommended settings on Civitai
Too many LoRAs | Stack 1-3 max, reduce strength when combining
Skipping negative prompts | Always specify what to avoid

Stable Diffusion vs Alternatives Comparison

How does Stable Diffusion stack up against commercial alternatives for different use cases?

  • Stable Diffusion (4.5/5): free, unlimited, full control
  • Midjourney (3.7/5): best aesthetics, $10/mo
  • DALL-E 3 (4.4/5): best text rendering, pay-per-use
  • Leonardo AI (4.6/5): best free cloud option

Use Case | Best Choice | Why
-------- | ----------- | ---
Quick beautiful images | Midjourney | Aesthetic defaults, minimal prompting
Conversational generation | DALL-E 3 | Natural language understanding
Specific style consistency | Stable Diffusion | Custom models, LoRAs
High volume generation | Stable Diffusion | No per-image costs
Video generation | Runway or SD | Depends on control needs
Managed custom training | Leonardo AI | Guided workflow, no setup

Getting Started Checklist

Ready to begin? Here’s your action plan:

Week 1: Setup

  • Assess hardware (GPU VRAM check)
  • Install ComfyUI locally or sign up for RunPod
  • Download SD 3.5 Medium checkpoint
  • Generate first images with basic workflow

Week 2: Exploration

  • Browse Civitai for models matching your needs
  • Try 2-3 different checkpoints
  • Experiment with LoRAs
  • Practice prompt engineering

Week 3: Advanced

  • Install ControlNet models
  • Create pose-controlled generations
  • Try image-to-video with SVD
  • Build and save custom workflows

The learning curve is real, but the payoff is complete creative control. Unlike subscription services that can change policies or pricing overnight, your local Stable Diffusion setup is yours forever.

For more AI image generation techniques, see our guides on custom model training and AI image generation tips.


External Resources

For official documentation and updates from these tools: