
Stable Diffusion on AMD GPUs: Complete Setup Guide 2025

Run Stable Diffusion on AMD GPUs with ROCm or DirectML. Complete guide covering Linux and Windows setup, performance optimization, and troubleshooting for RDNA 2 and RDNA 3.


You have an AMD GPU and want to run Stable Diffusion locally. Every tutorial you find assumes NVIDIA CUDA, leaving you wondering if AI image generation is even possible on AMD hardware. The ecosystem seems built exclusively for Team Green.

Quick Answer: Stable Diffusion works on AMD GPUs through ROCm on Linux or DirectML on Windows. ROCm provides significantly better performance, running 4x faster than DirectML in some tests. For RX 6000 and RX 7000 series cards, ROCm 6.2 or newer delivers reliable results. ComfyUI with ROCm is now the recommended approach for AMD users in 2025.

Key Takeaways
  • ROCm on Linux is 4x+ faster than DirectML on Windows for the same hardware
  • RX 6000 (RDNA 2) and RX 7000 (RDNA 3) series have solid ROCm support
  • ComfyUI is recommended over WebUI for AMD users in 2025
  • HSA_OVERRIDE_GFX_VERSION environment variable is essential for proper GPU detection
  • Native ROCm on Windows is now available for RDNA 3 and RDNA 4 cards

What Are Your Options for AMD GPU AI?

AMD users have four main paths for running Stable Diffusion locally.

Option 1. ROCm on Linux (Native)

ROCm is AMD's equivalent to NVIDIA's CUDA. It provides GPU compute capabilities optimized for AMD hardware.

Advantages:

  • Best performance by significant margin
  • Full feature support
  • Active development and improving compatibility
  • Native FP16 support on most cards

Disadvantages:

  • Requires Linux (Ubuntu recommended)
  • More complex initial setup
  • Some advanced features may lag CUDA equivalents

Option 2. ROCm via WSL on Windows

Windows Subsystem for Linux allows running ROCm within Windows.

Advantages:

  • ROCm performance on a Windows machine
  • Easier transition for Windows users
  • Access to Linux-based tools

Disadvantages:

  • Some performance overhead from virtualization
  • More complex setup than native options
  • Occasional compatibility issues

Option 3. Native ROCm on Windows (New in 2025)

Recent updates brought native ROCm support to Windows for RDNA 3 and RDNA 4 cards.

Advantages:

  • Native Windows experience
  • Better performance than DirectML
  • Simpler than WSL approach

Disadvantages:

  • Limited to newer GPU architectures
  • Requires nightly PyTorch builds
  • Still maturing

Option 4. DirectML on Windows

DirectML works with any DirectX 12-compatible GPU, but with a significant performance penalty.

Advantages:

  • Works on almost any modern GPU
  • Simple installation
  • Native Windows support

Disadvantages:

  • Much slower than ROCm (4x or more)
  • Some features may not work
  • Not recommended for serious use

Which AMD GPUs Work Best?

GPU selection significantly impacts your experience.

Fully Supported GPUs

RDNA 3 (RX 7000 Series):

  • RX 7900 XTX - Best AMD option for AI
  • RX 7900 XT - Excellent performance
  • RX 7800 XT - Good mid-range option
  • RX 7700 XT - Entry point for RDNA 3

RDNA 2 (RX 6000 Series):

  • RX 6900 XT - Strong performance
  • RX 6800 XT - Popular choice
  • RX 6800 - Good value option
  • RX 6700 XT - Entry point for RDNA 2

Partially Supported GPUs

RDNA 1 (RX 5000 Series):

  • RX 5700 XT - May work with effort
  • Compatibility challenges and limited optimization
  • Not recommended for new setups

Performance Expectations

| GPU | VRAM | SD 1.5 (512x768, 50 steps) | SDXL (1024x1024) |
|-----|------|----------------------------|------------------|
| RX 7900 XTX | 24GB | ~8-10 seconds | ~20-25 seconds |
| RX 7900 XT | 20GB | ~10-12 seconds | ~25-30 seconds |
| RX 6800 XT | 16GB | ~15-18 seconds | ~35-45 seconds |
| RX 5700 XT (DirectML) | 8GB | ~120 seconds | Limited |
| RX 5700 XT (ROCm) | 8GB | ~29 seconds | Limited VRAM |

The difference between DirectML and ROCm on the same hardware is dramatic. One user reports their RX 5700 XT taking 2 minutes on DirectML versus 29 seconds on ROCm for identical generation.
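As a quick sanity check on that report, the two times work out to roughly the 4x speedup quoted earlier:

```python
# Reported generation times for the same RX 5700 XT (from the table above)
directml_seconds = 120  # ~2 minutes on DirectML
rocm_seconds = 29       # on ROCm

speedup = directml_seconds / rocm_seconds
print(f"ROCm is ~{speedup:.1f}x faster")  # prints: ROCm is ~4.1x faster
```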

How Do You Set Up ROCm on Linux?

Linux provides the best AMD experience. Ubuntu is strongly recommended.

Step 1. Check System Requirements

Supported Operating Systems:

  • Ubuntu 22.04 LTS (Recommended)
  • Ubuntu 24.04 LTS
  • Other distributions possible but less tested

Verify GPU Compatibility:

lspci | grep VGA

Step 2. Install ROCm

Follow the official AMD ROCm installation guide for your distribution.

For Ubuntu:

# Add the ROCm repository (apt-key is deprecated; use a keyring instead)
sudo mkdir --parents --mode=0755 /etc/apt/keyrings
wget -q -O - https://repo.radeon.com/rocm/rocm.gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/rocm.gpg > /dev/null
# 'jammy' is the Ubuntu 22.04 codename; use 'noble' for 24.04
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/6.2 jammy main" | sudo tee /etc/apt/sources.list.d/rocm.list

# Install ROCm
sudo apt update
sudo apt install rocm-hip-libraries rocm-dev

Step 3. Set Environment Variables

Critical for proper GPU detection. Add to your .bashrc or .profile:


For RDNA 3 (RX 7000 Series):

export HSA_OVERRIDE_GFX_VERSION=11.0.0

For RDNA 2 (RX 6000 Series):

export HSA_OVERRIDE_GFX_VERSION=10.3.0

This variable tells ROCm which GPU architecture instruction set to use.
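The per-architecture values above can be captured in a small helper when scripting your setup. This is a sketch covering only the two families discussed in this guide, not an exhaustive mapping:

```python
# Map AMD GPU families to the HSA_OVERRIDE_GFX_VERSION value ROCm expects.
# Only the two families covered in this guide are included.
GFX_OVERRIDES = {
    "rdna3": "11.0.0",  # RX 7000 series
    "rdna2": "10.3.0",  # RX 6000 series
}

def gfx_override(family: str) -> str:
    """Return the override string for a GPU family, e.g. 'rdna3' -> '11.0.0'."""
    try:
        return GFX_OVERRIDES[family.lower()]
    except KeyError:
        raise ValueError(f"No known override for {family!r}") from None

print(gfx_override("rdna3"))  # prints: 11.0.0
```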

Step 4. Install ComfyUI

ComfyUI is recommended over WebUI for AMD users.

git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
# Install the ROCm build of PyTorch first so the default CUDA wheels are not pulled in
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.2
pip install -r requirements.txt

Step 5. Verify Installation

python -c "import torch; print(torch.cuda.is_available())"

This should return True if ROCm is properly configured. PyTorch's ROCm builds expose the GPU through the torch.cuda API, which is why the check uses cuda even on AMD hardware.
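Beyond the one-liner, a small preflight script can confirm the environment variable from Step 3 is set before you launch ComfyUI. This sketch uses only the standard library, so it runs even before the ROCm wheels are installed:

```python
import os

EXPECTED = {"11.0.0", "10.3.0"}  # RDNA 3 / RDNA 2 values from Step 3

def check_hsa_override(env=os.environ):
    """Return (ok, message) describing the HSA_OVERRIDE_GFX_VERSION setting."""
    value = env.get("HSA_OVERRIDE_GFX_VERSION")
    if value is None:
        return False, "HSA_OVERRIDE_GFX_VERSION is not set"
    if value not in EXPECTED:
        return False, f"Unexpected value {value!r}; check your GPU family"
    return True, f"HSA_OVERRIDE_GFX_VERSION={value}"

ok, msg = check_hsa_override()
print(msg)
```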

How Do You Set Up ROCm on Windows?

Windows users now have native ROCm options for newer cards.

Native ROCm on Windows (RDNA 3/4)

Step 1. Install AMD Software

Ensure you have the latest AMD Adrenalin drivers installed.

Step 2. Install PyTorch with ROCm

For RDNA 3 and RDNA 4 cards, use the nightly PyTorch build:

pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.2

Step 3. Install ComfyUI


According to Atlas SC's guide, the process is similar to Linux:

git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt

ROCm via WSL

For broader GPU support or if native Windows ROCm doesn't work:

Step 1. Enable WSL2

wsl --install

Step 2. Install Ubuntu

wsl --install -d Ubuntu-22.04

Step 3. Follow Linux Setup

Within WSL, follow the Linux ROCm installation steps above.

DirectML Fallback

If ROCm doesn't work for your GPU:

Step 1. Install ComfyUI with DirectML support

pip install torch-directml

Step 2. Run with DirectML backend

Performance will be significantly slower but compatibility is broader.
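A launch script can detect whether the torch-directml package installed above is present and fall back gracefully. This sketch only probes for the package with the standard library, so it runs even without torch installed; with torch available, you would then create the device via the package's device() helper and pass it to your pipeline:

```python
import importlib.util

def pick_backend() -> str:
    """Prefer DirectML when the torch-directml package is installed, else CPU.

    Sketch only: this checks package availability, not whether the GPU
    actually works with DirectML.
    """
    if importlib.util.find_spec("torch_directml") is not None:
        return "directml"
    return "cpu"

print(pick_backend())
```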

What Common Issues Occur and How Do You Fix Them?

AMD setup encounters predictable problems with known solutions.


Issue: "No GPU detected" or "CUDA not available"

Solutions:

  1. Verify HSA_OVERRIDE_GFX_VERSION is set correctly
  2. Check the ROCm installation with rocminfo
  3. Ensure your user is in the video and render groups (sudo usermod -aG video,render $USER, then log out and back in)
  4. Verify the GPU is recognized: ls /dev/dri/

Issue: Out of Memory Errors

Solutions:

  1. Enable attention slicing in ComfyUI
  2. Reduce batch size
  3. Lower resolution during generation
  4. Use FP16 precision
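To see why resolution, batch size, and precision all matter, recall that Stable Diffusion works in a latent space that shrinks each spatial dimension by 8 and uses 4 channels. The back-of-envelope sketch below sizes only the latent tensor; model weights and intermediate activations dominate actual VRAM use, so treat this as intuition, not a budget:

```python
def latent_bytes(width: int, height: int, batch: int = 1, fp16: bool = True) -> int:
    """Rough size of an SD latent tensor: (w/8) x (h/8) x 4 channels per image."""
    bytes_per_elem = 2 if fp16 else 4
    return (width // 8) * (height // 8) * 4 * batch * bytes_per_elem

# Doubling resolution quadruples the latent; FP32 doubles it again,
# which is why lowering resolution and enabling FP16 both relieve OOM errors.
print(latent_bytes(512, 768))            # SD 1.5-sized image
print(latent_bytes(1024, 1024))          # SDXL-sized image
print(latent_bytes(1024, 1024, fp16=False))
```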

Issue: Slow Performance

Solutions:

  1. Verify using ROCm, not DirectML
  2. Check GPU utilization with rocm-smi
  3. Enable FP16 if supported by your card
  4. Close other GPU-using applications

Issue: Generation Produces Artifacts

Solutions:

  1. Try --upcast-sampling flag
  2. Check model compatibility
  3. Update ROCm to latest version
  4. Verify VRAM isn't being exhausted

How Does AMD Performance Compare to NVIDIA?

Understanding the performance gap helps set expectations.

Raw Performance Comparison

| GPU | Price Point | SD 1.5 Performance | Notes |
|-----|-------------|--------------------|-------|
| RTX 4090 | ~$1,600 | ~3-4 seconds | Best consumer option |
| RX 7900 XTX | ~$900 | ~8-10 seconds | Best AMD option |
| RTX 4070 Ti | ~$700 | ~6-8 seconds | Strong value |
| RX 7900 XT | ~$700 | ~10-12 seconds | Good AMD value |
| RTX 3060 12GB | ~$300 | ~15-20 seconds | Budget NVIDIA |
| RX 6800 XT | ~$400 | ~15-18 seconds | Budget AMD |

Value Analysis

AMD GPUs often provide better value per dollar for general computing while NVIDIA maintains clear advantages for AI workloads.

Choose AMD When:

  • You already own the hardware
  • Budget is primary concern
  • You also game or do other GPU tasks
  • You're comfortable with Linux

Choose NVIDIA When:

  • AI is primary use case
  • You value ecosystem maturity
  • Windows-native experience matters
  • Maximum performance required

For users who don't want to manage local hardware, Apatero.com provides cloud-based AI image generation without GPU concerns.

What Workflows Work Best on AMD?

Some workflows perform better than others on AMD hardware.

Well-Supported Workflows

Standard Generation:

  • Text-to-image works well
  • Image-to-image supported
  • Inpainting functional
  • Basic ControlNet works

Recommended Tools:

  • ComfyUI (preferred)
  • AUTOMATIC1111 WebUI (with ROCm)
  • Fooocus (with configuration)

Challenging Workflows

Limited Support:

  • xformers not available (NVIDIA-only)
  • Some custom nodes may not work
  • Certain optimizations CUDA-specific
  • Advanced ControlNet models may have issues

Optimization Tips

For Best Results:

  1. Use FP16 precision when possible
  2. Enable attention slicing for large generations
  3. Consider batch size carefully
  4. Monitor VRAM usage during complex workflows
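These tips map onto ComfyUI launch flags. The flag names below are taken from ComfyUI's command line options, but verify them against your installed version with python main.py --help before relying on them:

```python
# Build a ComfyUI launch command reflecting the tips above (sketch only;
# confirm flag names with `python main.py --help` on your ComfyUI version).
def comfyui_command(low_vram: bool = False, force_fp16: bool = True) -> list[str]:
    cmd = ["python", "main.py"]
    if force_fp16:
        cmd.append("--force-fp16")             # tip 1: FP16 precision
    cmd.append("--use-split-cross-attention")  # tip 2: attention slicing analogue
    if low_vram:
        cmd.append("--lowvram")                # helps when VRAM is tight
    return cmd

print(" ".join(comfyui_command(low_vram=True)))
```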

Frequently Asked Questions

Is AMD GPU worth it for Stable Diffusion?

If you already own one, absolutely. For new purchases specifically for AI, NVIDIA typically offers better performance per dollar and easier setup. But AMD can produce excellent results with proper configuration.

Can I run SDXL on AMD?

Yes, with 16GB+ VRAM. RX 6800 XT, RX 7900 XT, and RX 7900 XTX all handle SDXL. Smaller VRAM cards struggle with SDXL's requirements.

Why is DirectML so much slower than ROCm?

DirectML is a general-purpose API not optimized specifically for AI workloads. ROCm provides GPU compute capabilities specifically designed for machine learning, similar to how CUDA works for NVIDIA.

Do I need Linux for AMD AI?

Not strictly required with native Windows ROCm for newer cards, but Linux provides the best and most reliable experience. WSL offers a middle ground.

Can I use LoRAs on AMD?

Yes. LoRA loading and application work normally once base Stable Diffusion runs. Performance may vary slightly from NVIDIA but functionality is equivalent.

What about training on AMD?

LoRA training is possible with ROCm but more challenging than NVIDIA. See our AMD LoRA training guide for details.

Will future AMD GPUs be better supported?

AMD is actively developing ROCm and AI capabilities. Each generation improves support, and RDNA 4 is expected to continue this trend.

Should I use WebUI or ComfyUI?

ComfyUI is recommended. WebUI is outdated and still requires Python 3.10 with no apparent plans to update. ComfyUI actively maintains AMD compatibility.

Conclusion

AMD GPUs are viable for Stable Diffusion with proper setup. ROCm on Linux provides the best experience, while Windows users now have native ROCm options for newer cards. The performance gap with NVIDIA exists but isn't prohibitive for most use cases.

Key Implementation Points:

  • Use ROCm, not DirectML, whenever possible
  • Set HSA_OVERRIDE_GFX_VERSION correctly for your GPU
  • Choose ComfyUI over WebUI for AMD compatibility
  • Linux provides the most reliable experience
  • 16GB+ VRAM recommended for SDXL

Choosing Your Path

  • Best AMD experience: ROCm on Ubuntu with ComfyUI
  • Windows with newer AMD: Native ROCm with nightly PyTorch
  • Windows with older AMD: ROCm via WSL2
  • Skip local setup entirely: Apatero.com for cloud-based generation without hardware concerns

The AMD AI experience improves with each ROCm release. What was difficult in 2023 is routine in 2025. If you have AMD hardware, there's no reason not to explore local AI image generation.
