Stable Diffusion on AMD GPUs: Complete Setup Guide 2025
Run Stable Diffusion on AMD GPUs with ROCm or DirectML. Complete guide covering Linux and Windows setup, performance optimization, and troubleshooting for RDNA 2 and RDNA 3.
You have an AMD GPU and want to run Stable Diffusion locally. Every tutorial you find assumes NVIDIA CUDA, leaving you wondering if AI image generation is even possible on AMD hardware. The ecosystem seems built exclusively for Team Green.
Quick Answer: Stable Diffusion works on AMD GPUs through ROCm on Linux or DirectML on Windows. ROCm provides significantly better performance, running 4x faster than DirectML in some tests. For RX 6000 and RX 7000 series cards, ROCm 6.2 or newer delivers reliable results. ComfyUI with ROCm is now the recommended approach for AMD users in 2025.
- ROCm on Linux is 4x+ faster than DirectML on Windows for the same hardware
- RX 6000 (RDNA 2) and RX 7000 (RDNA 3) series have solid ROCm support
- ComfyUI is recommended over WebUI for AMD users in 2025
- HSA_OVERRIDE_GFX_VERSION environment variable is essential for proper GPU detection
- Native ROCm on Windows is now available for RDNA 3 and RDNA 4 cards
What Are Your Options for AMD GPU AI?
AMD users have four main paths for running Stable Diffusion locally.
Option 1. ROCm on Linux (Recommended)
ROCm is AMD's equivalent to NVIDIA's CUDA. It provides GPU compute capabilities optimized for AMD hardware.
Advantages:
- Best performance by significant margin
- Full feature support
- Active development and improving compatibility
- Native FP16 support on most cards
Disadvantages:
- Requires Linux (Ubuntu recommended)
- More complex initial setup
- Some advanced features may lag CUDA equivalents
Option 2. ROCm via WSL on Windows
Windows Subsystem for Linux allows running ROCm within Windows.
Advantages:
- ROCm performance on Windows machine
- Easier transition for Windows users
- Access to Linux-based tools
Disadvantages:
- Some performance overhead from virtualization
- More complex setup than native options
- Occasional compatibility issues
Option 3. Native ROCm on Windows (New in 2025)
Recent updates brought native ROCm support to Windows for RDNA 3 and RDNA 4 cards.
Advantages:
- Native Windows experience
- Better performance than DirectML
- Simpler than WSL approach
Disadvantages:
- Limited to newer GPU architectures
- Requires nightly PyTorch builds
- Still maturing
Option 4. DirectML on Windows
DirectML works with any DirectX 12 compatible GPU, but with a significant performance penalty.
Advantages:
- Works on almost any modern GPU
- Simple installation
- Native Windows support
Disadvantages:
- Much slower than ROCm (4x or more)
- Some features may not work
- Not recommended for serious use
Which AMD GPUs Work Best?
GPU selection significantly impacts your experience.
Fully Supported GPUs
RDNA 3 (RX 7000 Series):
- RX 7900 XTX - Best AMD option for AI
- RX 7900 XT - Excellent performance
- RX 7800 XT - Good mid-range option
- RX 7700 XT - Entry point for RDNA 3
RDNA 2 (RX 6000 Series):
- RX 6900 XT - Strong performance
- RX 6800 XT - Popular choice
- RX 6800 - Good value option
- RX 6700 XT - Entry point for RDNA 2
Partially Supported GPUs
RDNA 1 (RX 5000 Series):
- RX 5700 XT - May work with effort
- Compatibility challenges and limited optimization
- Not recommended for new setups
Performance Expectations
| GPU | VRAM | SD 1.5 512x768 50 steps | SDXL 1024x1024 |
|---|---|---|---|
| RX 7900 XTX | 24GB | ~8-10 seconds | ~20-25 seconds |
| RX 7900 XT | 20GB | ~10-12 seconds | ~25-30 seconds |
| RX 6800 XT | 16GB | ~15-18 seconds | ~35-45 seconds |
| RX 5700 XT (DirectML) | 8GB | ~120 seconds | Limited |
| RX 5700 XT (ROCm) | 8GB | ~29 seconds | Limited VRAM |
The difference between DirectML and ROCm on the same hardware is dramatic: one user reports an RX 5700 XT taking two minutes per image on DirectML versus 29 seconds on ROCm with identical generation settings.
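The arithmetic behind that comparison is worth making explicit:

```python
# Speedup implied by the RX 5700 XT report above
# (same card, same prompt and settings).
directml_seconds = 120
rocm_seconds = 29

speedup = directml_seconds / rocm_seconds
print(f"ROCm is about {speedup:.1f}x faster")  # about 4.1x
```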
How Do You Set Up ROCm on Linux?
Linux provides the best AMD experience. Ubuntu is strongly recommended.
Step 1. Check System Requirements
Supported Operating Systems:
- Ubuntu 22.04 LTS (Recommended)
- Ubuntu 24.04 LTS
- Other distributions possible but less tested
Verify GPU Compatibility:
```bash
lspci | grep VGA
```
Step 2. Install ROCm
Follow the official AMD ROCm installation guide for your distribution.
For Ubuntu:
```bash
# Add ROCm repository
wget -q -O - https://repo.radeon.com/rocm/rocm.gpg.key | sudo apt-key add -
echo "deb [arch=amd64] https://repo.radeon.com/rocm/apt/6.2 ubuntu main" | sudo tee /etc/apt/sources.list.d/rocm.list

# Install ROCm
sudo apt update
sudo apt install rocm-hip-libraries rocm-dev
```
Step 3. Set Environment Variables
This step is critical for proper GPU detection. Add the appropriate line to your .bashrc or .profile:
For RDNA 3 (RX 7000 Series):

```bash
export HSA_OVERRIDE_GFX_VERSION=11.0.0
```

For RDNA 2 (RX 6000 Series):

```bash
export HSA_OVERRIDE_GFX_VERSION=10.3.0
```
This variable tells ROCm which GPU architecture instruction set to use.
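As a small illustrative helper (the generation labels are our own shorthand, not ROCm identifiers), the same mapping can be applied to the current process from Python; note that only the .bashrc export makes it persistent:

```python
import os

# HSA_OVERRIDE_GFX_VERSION values from the sections above,
# keyed by our own shorthand for the GPU generation.
GFX_OVERRIDES = {
    "rdna3": "11.0.0",  # RX 7000 series
    "rdna2": "10.3.0",  # RX 6000 series
}

def set_gfx_override(generation: str) -> str:
    """Set the override for the current process and return the value used."""
    value = GFX_OVERRIDES[generation]
    os.environ["HSA_OVERRIDE_GFX_VERSION"] = value
    return value

set_gfx_override("rdna3")  # returns "11.0.0"
```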
Step 4. Install ComfyUI
ComfyUI is recommended over WebUI for AMD users.
```bash
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
# Install the ROCm build of PyTorch first, so pip does not pull the default CPU wheel
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.2
pip install -r requirements.txt
```
Step 5. Verify Installation
```bash
python -c "import torch; print(torch.cuda.is_available())"
```
This should return True if ROCm is properly configured.
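For a slightly fuller check than the one-liner, the sketch below also reports the device name. On ROCm builds of PyTorch, the torch.cuda.* API is routed through HIP, so the CUDA names work on AMD cards:

```python
def rocm_status() -> str:
    """Return a one-line status for the PyTorch GPU backend."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return "No GPU visible (check HSA_OVERRIDE_GFX_VERSION and group membership)"
    name = torch.cuda.get_device_name(0)
    return f"OK: {name} (HIP {torch.version.hip})"

print(rocm_status())
```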
How Do You Set Up ROCm on Windows?
Windows users now have native ROCm options for newer cards.
Native ROCm on Windows (RDNA 3/4)
Step 1. Install AMD Software
Ensure you have the latest AMD Adrenalin drivers installed.
Step 2. Install PyTorch with ROCm
For RDNA 3 and RDNA 4 cards, use the nightly PyTorch build:
```bash
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.2
```
Step 3. Install ComfyUI
According to Atlas SC's guide, the process is similar to Linux:
```bash
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt
```
ROCm via WSL
For broader GPU support or if native Windows ROCm doesn't work:
Step 1. Enable WSL2
```bash
wsl --install
```
Step 2. Install Ubuntu
```bash
wsl --install -d Ubuntu-22.04
```
Step 3. Follow Linux Setup
Within WSL, follow the Linux ROCm installation steps above.
DirectML Fallback
If ROCm doesn't work for your GPU:
Step 1. Install ComfyUI with DirectML support
```bash
pip install torch-directml
```
Step 2. Run ComfyUI with the DirectML backend (the --directml launch flag)
Performance will be significantly slower but compatibility is broader.
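If you script against PyTorch directly rather than launching a UI, a minimal sketch of selecting the DirectML device (assuming the torch-directml package installed above) looks like this:

```python
def pick_device():
    """Prefer DirectML when torch-directml is installed, else fall back to CPU."""
    try:
        import torch_directml
        return torch_directml.device()
    except ImportError:
        return "cpu"

# Tensors and models moved to this device run through DirectML.
device = pick_device()
```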
What Common Issues Occur and How Do You Fix Them?
AMD setup encounters predictable problems with known solutions.
Issue: "No GPU detected" or "CUDA not available"
Solutions:
- Verify HSA_OVERRIDE_GFX_VERSION is set correctly
- Check the ROCm installation with `rocminfo`
- Ensure your user is in the `video` and `render` groups
- Verify the GPU is recognized: `ls /dev/dri/`
Issue: Out of Memory Errors
Solutions:
- Enable attention slicing in ComfyUI
- Reduce batch size
- Lower resolution during generation
- Use FP16 precision
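The resolution tip is simple arithmetic: activation memory grows at least linearly with the pixel count, and faster still in the attention layers, so even a modest resolution cut frees a lot of VRAM. A rough sketch:

```python
def relative_footprint(width: int, height: int, base: int = 512) -> float:
    """Pixel count relative to a 512x512 generation -- a rough lower bound
    on how activation memory scales (attention layers scale even faster)."""
    return (width * height) / (base * base)

print(relative_footprint(1024, 1024))  # 4.0
print(relative_footprint(768, 768))    # 2.25
```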
Issue: Slow Performance
Solutions:
- Verify using ROCm, not DirectML
- Check GPU utilization with `rocm-smi`
- Enable FP16 if your card supports it
- Close other GPU-using applications
Issue: Generation Produces Artifacts
Solutions:
- Try the `--upcast-sampling` flag
- Check model compatibility
- Update ROCm to latest version
- Verify VRAM isn't being exhausted
How Does AMD Performance Compare to NVIDIA?
Understanding the performance gap helps set expectations.
Raw Performance Comparison
| GPU | Price Point | SD 1.5 Performance | Notes |
|---|---|---|---|
| RTX 4090 | ~$1,600 | ~3-4 seconds | Best consumer option |
| RX 7900 XTX | ~$900 | ~8-10 seconds | Best AMD option |
| RTX 4070 Ti | ~$700 | ~6-8 seconds | Strong value |
| RX 7900 XT | ~$700 | ~10-12 seconds | Good AMD value |
| RTX 3060 12GB | ~$300 | ~15-20 seconds | Budget NVIDIA |
| RX 6800 XT | ~$400 | ~15-18 seconds | Budget AMD |
Value Analysis
AMD GPUs often provide better value per dollar for general computing while NVIDIA maintains clear advantages for AI workloads.
Choose AMD When:
- You already own the hardware
- Budget is primary concern
- You also game or do other GPU tasks
- You're comfortable with Linux
Choose NVIDIA When:
- AI is primary use case
- You value ecosystem maturity
- Windows-native experience matters
- Maximum performance required
For users who don't want to manage local hardware, Apatero.com provides cloud-based AI image generation without GPU concerns.
What Workflows Work Best on AMD?
Some workflows perform better than others on AMD hardware.
Well-Supported Workflows
Standard Generation:
- Text-to-image works well
- Image-to-image supported
- Inpainting functional
- Basic ControlNet works
Recommended Tools:
- ComfyUI (preferred)
- AUTOMATIC1111 WebUI (with ROCm)
- Fooocus (with configuration)
Challenging Workflows
Limited Support:
- xformers not available (NVIDIA-only)
- Some custom nodes may not work
- Certain optimizations CUDA-specific
- Advanced ControlNet models may have issues
Optimization Tips
For Best Results:
- Use FP16 precision when possible
- Enable attention slicing for large generations
- Consider batch size carefully
- Monitor VRAM usage during complex workflows
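As a concrete sketch of the first two tips, here is how they look when scripting with the diffusers library (an alternative front end to ComfyUI, which exposes equivalent settings in its UI). The model ID is illustrative:

```python
def load_fp16_pipeline(model_id: str = "runwayml/stable-diffusion-v1-5"):
    """Load a pipeline with FP16 weights and attention slicing enabled."""
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float16,   # FP16 halves the weight footprint
    )
    pipe.enable_attention_slicing()  # compute attention in smaller chunks
    return pipe.to("cuda")           # ROCm builds expose the GPU as "cuda"
```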
Frequently Asked Questions
Is AMD GPU worth it for Stable Diffusion?
If you already own one, absolutely. For new purchases specifically for AI, NVIDIA typically offers better performance per dollar and easier setup. But AMD can produce excellent results with proper configuration.
Can I run SDXL on AMD?
Yes, with 16GB+ VRAM. RX 6800 XT, RX 7900 XT, and RX 7900 XTX all handle SDXL. Smaller VRAM cards struggle with SDXL's requirements.
Why is DirectML so much slower than ROCm?
DirectML is a general-purpose API not optimized specifically for AI workloads. ROCm provides GPU compute capabilities specifically designed for machine learning, similar to how CUDA works for NVIDIA.
Do I need Linux for AMD AI?
Not strictly required with native Windows ROCm for newer cards, but Linux provides the best and most reliable experience. WSL offers a middle ground.
Can I use LoRAs on AMD?
Yes. LoRA loading and application work normally once base Stable Diffusion runs. Performance may vary slightly from NVIDIA but functionality is equivalent.
What about training on AMD?
LoRA training is possible with ROCm but more challenging than NVIDIA. See our AMD LoRA training guide for details.
Will future AMD GPUs be better supported?
AMD is actively developing ROCm and AI capabilities. Each generation improves support, and RDNA 4 is expected to continue this trend.
Should I use WebUI or ComfyUI?
ComfyUI is recommended. WebUI is outdated and still requires Python 3.10 with no apparent plans to update. ComfyUI actively maintains AMD compatibility.
Conclusion
AMD GPUs are viable for Stable Diffusion with proper setup. ROCm on Linux provides the best experience, while Windows users now have native ROCm options for newer cards. The performance gap with NVIDIA exists but isn't prohibitive for most use cases.
Key Implementation Points:
- Use ROCm, not DirectML, whenever possible
- Set HSA_OVERRIDE_GFX_VERSION correctly for your GPU
- Choose ComfyUI over WebUI for AMD compatibility
- Linux provides the most reliable experience
- 16GB+ VRAM recommended for SDXL
- Best AMD experience: ROCm on Ubuntu with ComfyUI
- Windows with newer AMD: Native ROCm with nightly PyTorch
- Windows with older AMD: ROCm via WSL2
- Skip local setup entirely: Apatero.com for cloud-based generation without hardware concerns
The AMD AI experience improves with each ROCm release. What was difficult in 2023 is routine in 2025. If you have AMD hardware, there's no reason not to explore local AI image generation.