How to Train a LoRA Locally for Illustrious Models with AMD GPU 2025

Complete guide to training Illustrious-XL LoRAs on AMD GPUs using ROCm 6.2+ in 2025. Anime-optimized training with Danbooru tags and optimal parameters.

You have an AMD GPU and want to train custom anime character or style LoRAs for Illustrious-XL, but most guides assume NVIDIA hardware, and Illustrious's anime-specific training requirements add complexity beyond standard SDXL workflows. Training Illustrious LoRAs on AMD GPUs is viable in 2025 using the same ROCm setup as SDXL, plus specific optimizations for anime content and Danbooru tag captions.

Quick Answer: Training Illustrious-XL LoRAs on AMD GPUs follows the same SDXL workflow (ROCm 6.2+, Python 3.10, and Kohya's sd-scripts) with anime-optimized parameters. The key differences: captions use Danbooru tags (1girl, blue_eyes, etc.) alongside natural language; character LoRAs use a UNET learning rate around 0.0003 and a Text Encoder learning rate around 0.00003; datasets can be smaller (10-20 images) because anime art is stylistically consistent; and tags follow Booru ordering conventions. Both the RX 7900 XTX (24GB) and the RX 6800 XT (16GB) work with appropriate optimization.

Key Takeaways:
  • Illustrious-XL is SDXL-based, so same hardware/software requirements apply (16GB+ VRAM)
  • Hybrid Danbooru tags + natural language captioning optimizes for anime content
  • Separate learning rates for UNET (0.0003) and Text Encoder (0.00003) recommended
  • Smaller datasets work well (10-20 images) due to anime style consistency
  • Same tokenizer fix required as SDXL (edit sdxl_train_util.py)

What Makes Illustrious-XL Different from Standard SDXL?

Illustrious-XL represents a specialized SDXL fine-tune optimized for high-quality anime and illustration generation. Understanding these differences helps you train LoRAs that leverage Illustrious's strengths.

The base architecture remains SDXL with identical technical requirements. Illustrious uses SDXL's dual text encoder structure, 1024x1024 native resolution, and similar parameter count. This means SDXL training workflows and hardware requirements apply directly to Illustrious.

The specialized training data focuses on anime, manga, and illustration artwork primarily from Danbooru and similar sources. This training bias gives Illustrious superior performance on anime content compared to general SDXL, understanding anime-specific concepts, styles, and character features naturally.

Danbooru tag integration represents a key operational difference. While SDXL uses natural language prompts, Illustrious understands both natural language and structured Danbooru tags. Tags like 1girl, blue_eyes, long_hair, school_uniform follow specific conventions and hierarchies that Illustrious interprets effectively.

The hybrid input capability accepts both Danbooru tags and natural language in the same prompt. This flexibility enables precise control through tags combined with natural language scene descriptions. For LoRA training, captions can mix both systems for optimal results.

Version evolution through 2024-2025 improved stability and quality. Illustrious v0.1 introduced the initial concept, v1.0 refined quality, and v2.0-STABLE released in April 2025 adopted cosine annealing training schedules for better stability. Current LoRA training should target v0.1 or v1.0 base models depending on availability and preference.

Illustrious Advantages for Anime LoRAs:
  • Anime-optimized base: Superior understanding of anime styles and character features
  • Danbooru tag support: Precise control using structured tags anime community understands
  • Smaller dataset requirements: Anime style consistency means 10-20 images often suffice
  • Character consistency: Better at maintaining character features across variations
  • Style flexibility: Supports various anime art styles from different eras and studios

The anime focus affects training parameter choices. Character LoRAs train differently on Illustrious than on standard SDXL, with different optimal learning rates and training durations. Style LoRAs benefit from Illustrious's deep understanding of anime aesthetics.

For users wanting anime image generation without training custom LoRAs, platforms like Apatero.com provide access to professionally trained models including anime-optimized options through streamlined interfaces.

How Do You Set Up AMD GPUs for Illustrious Training?

Setting up your AMD environment for Illustrious training uses the identical process as SDXL since Illustrious shares the same architecture. If you've already configured for SDXL training, no additional setup is needed.

Hardware requirements match SDXL exactly. Minimum 16GB VRAM (RX 6800 XT, RX 6900 XT) with aggressive optimization, comfortable training at 20GB (RX 7900 XT), ideal at 24GB (RX 7900 XTX). The same VRAM constraints apply because the model architectures are identical.

ROCm 6.2+ installation with PyTorch for ROCm 6.3 provides the foundation. Follow AMD's official ROCm installation guide for Ubuntu 22.04 or 24.04. Verify with rocm-smi detecting your GPU. Set HSA_OVERRIDE_GFX_VERSION to 11.0.0 for RDNA 3 cards or 10.3.0 for RDNA 2 cards.
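As a minimal sketch of the override step, the helper below builds an environment with HSA_OVERRIDE_GFX_VERSION set from the two values quoted above. The function name `rocm_env` and the family labels `"rdna3"` and `"rdna2"` are assumptions made for this example, not part of ROCm.

```python
import os

# Override values quoted in the text; the family labels are chosen for this sketch.
GFX_OVERRIDES = {
    "rdna3": "11.0.0",  # e.g. RX 7900 XT / RX 7900 XTX
    "rdna2": "10.3.0",  # e.g. RX 6800 XT / RX 6900 XT
}


def rocm_env(family: str) -> dict:
    """Return a copy of the current environment with the GFX override set."""
    key = family.lower()
    if key not in GFX_OVERRIDES:
        raise ValueError(f"unknown GPU family: {family!r}")
    env = dict(os.environ)
    env["HSA_OVERRIDE_GFX_VERSION"] = GFX_OVERRIDES[key]
    return env


if __name__ == "__main__":
    print(rocm_env("rdna3")["HSA_OVERRIDE_GFX_VERSION"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching training, which keeps the override scoped to that one process instead of the whole shell session.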

Python 3.10 virtual environment setup, Kohya sd-scripts installation, and dependency configuration follow the SDXL guide exactly. Create a venv, install PyTorch for ROCm 6.3, install Kohya's requirements, configure Accelerate, and install additional dependencies.

The critical tokenizer fix for SDXL applies identically to Illustrious. Edit ./sd-scripts/library/sdxl_train_util.py and change both TOKENIZER1_PATH and TOKENIZER2_PATH to "openai/clip-vit-large-patch14". Without this fix, training fails with tokenizer errors.
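The edit can be scripted rather than done by hand. This is a sketch that assumes the two constants are plain string assignments at module level in sdxl_train_util.py; check your copy of the file first, since its layout may differ between sd-scripts versions. The helper name `patch_tokenizer_paths` is hypothetical.

```python
import re

CLIP_L = "openai/clip-vit-large-patch14"  # value given in the text above


def patch_tokenizer_paths(source: str, new_path: str = CLIP_L) -> str:
    """Rewrite TOKENIZER1_PATH / TOKENIZER2_PATH string assignments.

    Assumes lines of the form: TOKENIZER1_PATH = "some/path"
    Anything after the closing quote (such as a comment) is left untouched.
    """
    pattern = re.compile(r'^(TOKENIZER[12]_PATH\s*=\s*)(["\']).*?\2', re.MULTILINE)
    return pattern.sub(lambda m: f'{m.group(1)}"{new_path}"', source)


if __name__ == "__main__":
    demo = 'TOKENIZER1_PATH = "old/one"\nTOKENIZER2_PATH = "old/two"\n'
    print(patch_tokenizer_paths(demo))
```

To apply it, read the file with `pathlib.Path.read_text()`, keep a backup copy, and write the patched text back with `write_text()`.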

Model download for Illustrious base models happens from HuggingFace or Civitai. Popular options include Illustrious-XL v0.1, v1.0, or specialized variants like AnyIllustrious-XL optimized for LoRA training. Download your chosen base model and place it in your models directory.

Verification involves testing PyTorch GPU detection as with any AMD ROCm setup. Ensure torch.cuda.is_available() returns True and your GPU is detected. Run a basic SDXL generation to confirm everything works before attempting training.
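A short check along these lines could be run inside the training venv before anything else. ROCm builds of PyTorch report the GPU through the `torch.cuda` namespace, so the calls are the same as on NVIDIA; the function name `describe_gpu` is made up for this sketch.

```python
def describe_gpu() -> str:
    """Return a one-line summary of what PyTorch can see."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if not torch.cuda.is_available():
        return "PyTorch found no GPU (check ROCm and HSA_OVERRIDE_GFX_VERSION)"
    name = torch.cuda.get_device_name(0)
    total_gib = torch.cuda.get_device_properties(0).total_memory / 1024**3
    # torch.version.hip is a version string on ROCm builds and None on CUDA builds.
    return f"{name}, {total_gib:.1f} GiB VRAM, HIP {torch.version.hip}"


if __name__ == "__main__":
    print(describe_gpu())
```

If the summary shows `HIP None`, the installed wheel is a CUDA or CPU build rather than the ROCm build, which is a common cause of failed GPU detection on AMD systems.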

Illustrious AMD Training Requirements:
  • Identical to SDXL: 16GB VRAM minimum, 24GB recommended
  • Same ROCm 6.2+ and PyTorch requirements as SDXL
  • Must apply tokenizer fix in sdxl_train_util.py
  • Download Illustrious base model (6-7GB) from HuggingFace or Civitai
  • Training takes roughly 2-6 hours for character LoRAs depending on GPU

What Training Parameters Work Best for Illustrious on AMD?

Illustrious training parameters differ from standard SDXL due to anime content characteristics and community-discovered optimal settings. These parameters produce quality anime character and style LoRAs.

Separate learning rates for UNET and Text Encoder represent the key difference from standard SDXL training. For character LoRAs, use UNET learning rate around 0.0003 (3e-4) and Text Encoder around 0.00003 (3e-5). This 10:1 ratio produces strong character features while maintaining image quality.

The higher UNET rate enables faster learning of visual features like character appearance, clothing, and distinctive traits. The lower Text Encoder rate prevents overfitting on trigger words while allowing association with character concepts. This balance works particularly well for anime character training.

Network dimension for Illustrious often runs slightly lower than general SDXL due to anime's style consistency. Dimension 32-48 works well for character LoRAs, with 48 providing good capacity without excessive file size. Style LoRAs can use 48-64 depending on complexity.

Batch size remains 1 for most AMD GPU setups, especially on 16GB cards. The 24GB RX 7900 XTX can experiment with batch size 2, but gains are minimal. Stick with batch size 1 for reliable training.

Recommended Illustrious AMD Parameters:
  • UNET learning rate: 0.0003 (character LoRAs)
  • Text Encoder learning rate: 0.00003 (1/10 of UNET)
  • Network dimension: 32-48 (character), 48-64 (style)
  • Network alpha: Half of dimension (16-24 for dim 32-48)
  • Resolution: 1024x1024 standard
  • Max epochs: 10-15 (fewer needed than SDXL due to simpler content)
  • Batch size: 1 (mandatory for 16GB, safe for all)
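The relationships in this list (alpha at half the dimension, Text Encoder rate at one tenth of the UNET rate) can be captured in a small settings object so that derived values never drift apart. This is an illustrative sketch; the class `LoraParams` and its field names are assumptions for the example and are not read by sd-scripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraParams:
    """Hypothetical container for the recommended values listed above."""

    network_dim: int
    unet_lr: float = 3e-4
    max_epochs: int = 12
    batch_size: int = 1
    resolution: int = 1024

    @property
    def network_alpha(self) -> int:
        # Alpha is half the network dimension.
        return self.network_dim // 2

    @property
    def text_encoder_lr(self) -> float:
        # Text Encoder learns at one tenth of the UNET rate.
        return self.unet_lr / 10


CHARACTER = LoraParams(network_dim=48)
STYLE = LoraParams(network_dim=64)

if __name__ == "__main__":
    print(CHARACTER.network_alpha, STYLE.network_alpha)
```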

Dataset size for anime character LoRAs typically runs smaller than for photorealistic subjects. Where general SDXL might need 20-40 images, Illustrious character LoRAs often work well with 10-20 high-quality images. Anime art is stylistically consistent, with flat colors and cel shading, so the model needs less variation to learn a character.

Resolution stays at 1024x1024 as Illustrious trains at SDXL's native resolution. Lower resolutions like 896x896 can save memory but sacrifice quality. For 16GB cards, stick with 1024x1024 using aggressive caching rather than reducing resolution.

Max epochs typically range from 10-15 for character LoRAs. Anime content learns faster than complex photorealistic subjects, and overfitting sets in sooner. Monitor sample images carefully and stop when quality peaks rather than always running to the maximum.
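To see what an epoch count means in optimizer steps, a rough calculation helps. The sketch below assumes the Kohya folder convention in which the training subfolder name starts with a repeat count (for example `10_charactername`); the repeat value of 10 is an assumption for illustration, not a recommendation from this guide, and bucketing can shift the real number slightly.

```python
import math


def total_steps(num_images: int, repeats: int, epochs: int, batch_size: int = 1) -> int:
    """Approximate optimizer steps for a run, ignoring bucketing effects."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


if __name__ == "__main__":
    # 15 images, 10 repeats (assumed), 12 epochs, batch size 1
    print(total_steps(15, 10, 12))  # 1800
```

Doubling the repeat count has the same effect on total steps as doubling the epoch count, so the two should be tuned together when comparing runs.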

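Caching configuration matches SDXL requirements. Cache latents to disk with --cache_latents --cache_latents_to_disk; this is critical for 16GB cards and beneficial for all AMD GPUs. Text encoder outputs can also be cached with --cache_text_encoder_outputs --cache_text_encoder_outputs_to_disk, but sd-scripts only accepts those two flags when the text encoder is not being trained (--network_train_unet_only), so leave them out when you set a separate Text Encoder learning rate.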

Optimizer choice favors AdamW8bit for memory efficiency. The 8-bit optimizer uses less VRAM than standard AdamW with minimal quality impact, essential for 16GB cards and helpful even at 24GB.

How Do You Caption Anime Training Data for Illustrious?

Captioning strategy significantly impacts Illustrious LoRA quality. The model's hybrid Danbooru tag and natural language understanding requires specific approaches.

Danbooru tag structure follows hierarchical conventions. Tags typically start with character count (1girl, 1boy, 2girls), then character features (hair, eyes, clothing), then pose/action, then background/setting. This ordering helps Illustrious parse captions effectively.

Character feature tags use standardized Danbooru conventions. Hair color tags like blonde_hair, black_hair, blue_hair use underscores. Eye colors follow similar patterns. Hairstyles use tags like long_hair, short_hair, ponytail, twintails. Consistency matters for training effectiveness.

Clothing and outfit tags should be specific. Instead of generic "uniform," use school_uniform, military_uniform, maid_outfit, etc. The more specific tags help the LoRA learn precise visual concepts.

Example Illustrious Caption Formats:
  • Character LoRA: `1girl, charactername, blue_eyes, long_hair, blonde_hair, school_uniform, standing, classroom, detailed background`
  • Hybrid format: `1girl, charactername, wearing a blue school uniform, blue_eyes, long hair, classroom setting with desks and windows`
  • Natural language: `A girl with blue eyes and long blonde hair wearing a school uniform standing in a classroom` (less optimal for Illustrious)

Your trigger word (typically the character name) should appear in every caption. Place it early, directly after the character count tag. For example: `1girl, hatsune_miku, aqua_hair, twintails, ...`. Consistent trigger word placement helps training.
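The ordering described in this section (count tag, trigger word, features, pose, setting, then optional natural language) can be enforced with a small helper so every caption file follows the same layout. This is a sketch; `build_caption` and its parameter names are hypothetical.

```python
def build_caption(count_tag, trigger, features, pose=(), setting=(), description=""):
    """Join caption parts in Booru order, dropping blanks and duplicates."""
    parts = [count_tag, trigger, *features, *pose, *setting]
    if description:
        parts.append(description)  # natural language goes after the tags
    seen = set()
    ordered = []
    for part in parts:
        part = part.strip()
        if part and part not in seen:
            seen.add(part)
            ordered.append(part)
    return ", ".join(ordered)


if __name__ == "__main__":
    print(build_caption(
        "1girl",
        "charactername",
        ["blue_eyes", "long_hair", "blonde_hair", "school_uniform"],
        pose=["standing"],
        setting=["classroom"],
    ))
```

Each result would be written to a `.txt` file that shares its base name with the matching image, which is the layout sd-scripts reads captions from by default.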

Quality tags like masterpiece, best quality, highly detailed are common in anime model prompts and training captions and can help guide generation quality. Include them in some but not all training captions so the LoRA does not tie your character to them.

Natural language can mix with tags for scene descriptions. After core Danbooru tags, add natural language descriptions of setting, mood, lighting, or context. This hybrid approach leverages both systems.

Negative concepts don't belong in training captions. Don't include tags describing what's NOT in the image. Training captions should positively describe what exists, not what's absent.

Caption length can be substantial for Illustrious because sd-scripts can extend the usable caption length to 225 tokens with --max_token_length=225. Don't hesitate to use 30-50 tags plus natural language descriptions. Detailed captions help the model learn precise concepts.

Automated tagging tools can help but require review. WD14 tagger and other anime taggers generate Danbooru tags automatically. Review and correct these automated tags, as errors propagate through training.
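Part of that review can be automated. The sketch below flags three mechanical problems in a comma-separated caption: a missing count tag at the front, a missing or misplaced trigger word, and duplicate tags. The function name `review_caption` and the count-tag pattern are assumptions for this example; the pattern covers tags such as 1girl, 2girls and 1boy but not every Danbooru variant, and it cannot judge whether a tag is visually correct.

```python
import re

# Matches count tags such as 1girl, 2girls, 6+girls, 1boy (not exhaustive).
COUNT_TAG = re.compile(r"^\d\+?(girl|boy|other)s?$")


def review_caption(caption: str, trigger: str) -> list[str]:
    """Return a list of human-readable problems found in one caption."""
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    problems = []
    if not tags or not COUNT_TAG.match(tags[0]):
        problems.append("first tag is not a character count tag")
    if trigger not in tags:
        problems.append("trigger word missing")
    elif tags.index(trigger) != 1:
        problems.append("trigger word is not directly after the count tag")
    dupes = sorted({t for t in tags if tags.count(t) > 1})
    if dupes:
        problems.append("duplicate tags: " + ", ".join(dupes))
    return problems


if __name__ == "__main__":
    print(review_caption("1girl, blue_eyes, charactername", "charactername"))
```

An empty list means the caption passed these checks; wrong tags from the tagger still need a human look at the image.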

What Is a Complete Illustrious Training Command Example?

A typical Illustrious character LoRA training command for AMD GPUs combines SDXL training structure with Illustrious-optimized parameters.

Example command: `accelerate launch --mixed_precision="fp16" sdxl_train_network.py --pretrained_model_name_or_path="/path/to/illustrious-v1.safetensors" --train_data_dir="./train" --output_dir="./output" --output_name="character_LoRA" --network_module="networks.lora" --network_dim=48 --network_alpha=24 --unet_lr=0.0003 --text_encoder_lr=0.00003 --lr_scheduler="cosine_with_restarts" --max_train_epochs=12 --save_every_n_epochs=2 --train_batch_size=1 --max_token_length=225 --cache_latents --cache_latents_to_disk --no_half_vae --optimizer_type="AdamW8bit" --gradient_checkpointing --persistent_data_loader_workers --resolution="1024,1024"`

The command specifies --mixed_precision only once. It leaves out --xformers, which is a plain switch that takes no value (writing --xformers=False is rejected by the argument parser), and it leaves out the two text encoder caching flags, because sd-scripts refuses to cache text encoder outputs while the text encoder is being trained with --text_encoder_lr.
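A command this long is easier to maintain as an argument list than as a single line. The sketch below assembles the same kind of invocation in Python using only flags that appear in this article; text encoder output caching is left out because sd-scripts rejects it when the text encoder is trained. The function `build_command` and its parameter names are hypothetical.

```python
import shlex


def build_command(model_path, train_dir="./train", output_dir="./output",
                  name="character_LoRA", dim=48, unet_lr=3e-4, te_lr=3e-5,
                  epochs=12):
    """Return the training invocation as a list of arguments."""
    return [
        "accelerate", "launch", "--mixed_precision=fp16",
        "sdxl_train_network.py",
        f"--pretrained_model_name_or_path={model_path}",
        f"--train_data_dir={train_dir}",
        f"--output_dir={output_dir}",
        f"--output_name={name}",
        "--network_module=networks.lora",
        f"--network_dim={dim}",
        f"--network_alpha={dim // 2}",  # alpha is half the dimension
        f"--unet_lr={unet_lr}",
        f"--text_encoder_lr={te_lr}",
        "--lr_scheduler=cosine_with_restarts",
        f"--max_train_epochs={epochs}",
        "--save_every_n_epochs=2",
        "--train_batch_size=1",
        "--max_token_length=225",
        "--cache_latents", "--cache_latents_to_disk",
        "--no_half_vae",
        "--optimizer_type=AdamW8bit",
        "--gradient_checkpointing",
        "--persistent_data_loader_workers",
        "--resolution=1024,1024",
    ]


if __name__ == "__main__":
    cmd = build_command("/path/to/illustrious-v1.safetensors")
    print(shlex.join(cmd))  # copy-pasteable shell form
```

The list can be handed directly to `subprocess.run(cmd, check=True)` from inside the sd-scripts directory, which avoids shell quoting problems with paths that contain spaces.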

Key differences from standard SDXL include separate UNET and Text Encoder learning rates specified with --unet_lr=0.0003 --text_encoder_lr=0.00003. This replaces the single --learning_rate parameter.

The cosine_with_restarts scheduler works well for anime training, providing periodic learning rate resets that help escape local minima. The restarts only take effect when --lr_scheduler_num_cycles is set above its default of 1. Alternative schedulers like cosine or constant also work.

Fewer epochs (12 instead of 15-20) reflect anime content's faster learning. Monitor sample images and stop when quality peaks, typically between 10-15 epochs for character LoRAs.

Sample generation uses --sample_every_n_epochs=2 --sample_prompts="./illustrious_samples.txt" to generate test images periodically. Create illustrious_samples.txt with prompts that use your trigger word and a variety of Danbooru tags to test LoRA effectiveness.
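The prompt file holds one prompt per line, and sd-scripts accepts per-line options after the prompt: --n for the negative prompt, --w and --h for size, --d for the seed, --l for CFG scale and --s for steps. A minimal sketch for generating such lines follows; the helper name `sample_prompt_line` and the default seed, steps and CFG values are assumptions chosen for illustration.

```python
def sample_prompt_line(prompt, negative="", width=1024, height=1024,
                       seed=42, steps=28, cfg=7.0):
    """Format one line of a sd-scripts sample prompt file."""
    line = prompt
    if negative:
        line += f" --n {negative}"
    line += f" --w {width} --h {height} --d {seed} --l {cfg} --s {steps}"
    return line


if __name__ == "__main__":
    lines = [
        sample_prompt_line("1girl, charactername, school_uniform, classroom",
                           negative="lowres, bad anatomy"),
        sample_prompt_line("1girl, charactername, swimsuit, beach", seed=43),
    ]
    # Writes the file referenced by --sample_prompts in the text above.
    with open("illustrious_samples.txt", "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
```

Fixing the seed per line keeps samples comparable across epochs, so changes between images reflect the LoRA rather than random noise.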

Training time on RX 7900 XTX with 15 images at 12 epochs takes approximately 2-4 hours. RX 6800 XT with 16GB takes 4-6 hours due to tighter memory constraints requiring conservative settings.

Frequently Asked Questions

Is Illustrious training harder than SDXL on AMD GPUs?

No, Illustrious training is actually slightly easier than general SDXL for most users. The anime focus means smaller datasets work well (10-20 images vs 20-40), training completes in fewer epochs (10-15 vs 15-20), and style consistency makes results more forgiving. The setup is identical to SDXL, just with different parameter values. If your AMD GPU handles SDXL, it handles Illustrious identically.

Do I need to learn Danbooru tagging to train Illustrious LoRAs?

While not strictly mandatory, understanding basic Danbooru tags significantly improves results. Learn core tags for character features (hair, eyes, clothing), common quality tags, and basic ordering conventions. You can mix Danbooru tags with natural language for hybrid captions. Many anime taggers automate tag generation, which you can then review and correct. The learning curve is moderate and pays off in better LoRA quality.

Can I use my SDXL training setup for Illustrious without changes?

Yes, if you have working SDXL training on AMD, it works for Illustrious immediately. The only changes needed are the Illustrious base model path and parameter adjustments (learning rates, epochs). The tokenizer fix, ROCm setup, and Kohya installation remain identical. This makes Illustrious training accessible to anyone already doing SDXL LoRA training.

What makes Illustrious better than SDXL for anime characters?

Illustrious understands anime-specific concepts, styles, and features through specialized training on anime artwork. It handles anime hair physics, character proportions, cel-shading, and art styles more naturally than general SDXL. Danbooru tag support provides precise control anime community members already know. Character consistency across variations is superior. For anime content, Illustrious produces better results than SDXL with the same or less training effort.

How many training images do I need for an anime character LoRA?

Typically 10-20 high-quality images suffice for anime character LoRAs on Illustrious. Anime art is more stylistically consistent than photorealistic imagery, so fewer images are enough to teach the model a character. Ensure images show the character from various angles, in different poses, with consistent key features. Quality matters more than quantity. Some users succeed with as few as 8 images for distinctive characters.

Should I use v0.1, v1.0, or v2.0 Illustrious base model?

For LoRA training in 2025, v1.0 provides the best balance of stability and compatibility. V0.1 works but has some quality limitations. V2.0-STABLE released in April 2025 offers improvements but has less community LoRA training experience documented. Start with v1.0 unless you have specific reasons to use v2.0. Both work with the same training process on AMD GPUs.

Can I train style LoRAs instead of character LoRAs on Illustrious?

Yes, Illustrious excels at style LoRAs thanks to deep anime aesthetic understanding. Use similar training parameters but potentially higher network dimensions (64 instead of 48) and more training images (20-30) to capture style nuances. Style LoRAs benefit from diverse subject matter showing consistent artistic treatment. The Danbooru tagging system includes style-related tags that help define and control artistic styles.

What if my character LoRA makes all generated images too similar?

This indicates overfitting where the LoRA memorizes training images rather than generalizing. Solutions include reducing max epochs (try 8-10 instead of 12-15), lowering UNET learning rate slightly (0.00025 instead of 0.0003), increasing dataset diversity with more varied poses and settings, or adding regularization images. Stopping training earlier before overfitting occurs produces more flexible LoRAs.

Does Illustrious work with AMD GPU inference or just training?

Both training and inference work on AMD GPUs with ROCm. Once you train an Illustrious LoRA, you can use it for generation on the same AMD setup. ComfyUI, Automatic1111 WebUI, and other interfaces support AMD GPUs for Illustrious generation. The LoRA files themselves are platform-independent, so LoRAs trained on AMD work everywhere including NVIDIA systems.

Can I combine multiple Illustrious LoRAs during inference?

Yes, you can use multiple Illustrious LoRAs together during generation, combining character LoRAs with style LoRAs or multiple character LoRAs in one image. Weight the LoRAs appropriately (typically 0.6-1.0 strength) and adjust based on results. This flexibility enables complex creative combinations. Train individual LoRAs for characters or styles, then mix during generation for unique compositions.
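In Automatic1111 WebUI, LoRAs are activated by adding `<lora:filename:weight>` tags to the prompt (ComfyUI uses loader nodes instead). As a small sketch, the helper below appends such tags for several LoRAs at once; the function name `with_loras` and the file names in the example are hypothetical.

```python
def with_loras(prompt: str, loras: dict) -> str:
    """Append Automatic1111-style LoRA tags to a prompt.

    `loras` maps a LoRA file name (without extension) to its strength.
    """
    tags = [f"<lora:{name}:{weight:g}>" for name, weight in loras.items()]
    return " ".join([prompt, *tags])


if __name__ == "__main__":
    print(with_loras(
        "1girl, charactername, classroom",
        {"character_LoRA": 0.8, "style_LoRA": 0.6},
    ))
```

When two LoRAs fight each other, lowering the style LoRA's strength first is a reasonable starting point, since the character LoRA usually carries the features you most want to keep.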

Succeeding with Anime LoRA Training on AMD Hardware

Illustrious-XL training on AMD GPUs leverages the same robust SDXL infrastructure with optimizations for anime content. The identical hardware requirements (16GB+ VRAM), ROCm setup, and Kohya workflows make Illustrious accessible to anyone already doing SDXL training.

The anime focus actually simplifies some aspects of training. Smaller datasets, fewer epochs, and style consistency make Illustrious character LoRAs more forgiving than photorealistic SDXL training. The learning curve involves understanding Danbooru tagging conventions rather than hardware or software complexity.

Separate learning rates for UNET and Text Encoder represent the key parameter insight from the anime training community. This 10:1 ratio produces strong character features while preventing trigger word overfitting, a balance that works consistently across diverse character types.

For users wanting anime image generation without training custom LoRAs, platforms like Apatero.com provide access to professionally trained anime-optimized models through streamlined interfaces, eliminating setup and training complexity.

As anime AI generation continues advancing, specialized models like Illustrious demonstrate the value of domain-specific training and optimization. AMD GPU users benefit from this specialization equally with NVIDIA users, as the ROCm foundation enables training across the full ecosystem of Stable Diffusion variants and fine-tunes.
