
LoRA Training Complete Guide 2025 - How Many Headshots and Body Shots Do You Really Need?

Master LoRA training with this definitive 2025 guide. Learn the optimal dataset split between headshots and body shots, tested training strategies, and real-world results from 100+ image datasets.


You're ready to train your first character LoRA, but the internet gives you wildly conflicting advice. Some tutorials say 5-10 images is enough, others demand 200+. Nobody agrees on how many should be headshots versus full body shots. And what if you want to train a LoRA that handles both SFW and NSFW content?

Testing dozens of training runs with datasets ranging from 20 to 200+ images revealed clear patterns about what actually works. The truth? Dataset size and composition matter enormously, but the optimal configuration depends entirely on what you want your LoRA to do.

This guide cuts through the confusion with tested, real-world strategies for building LoRA training datasets that produce consistent, high-quality results. For using your trained LoRAs in ComfyUI workflows, see our ComfyUI basics guide and essential custom nodes.

What You'll Learn:

  • Optimal dataset sizes for different LoRA types and use cases
  • The perfect headshot-to-body shot ratio based on your goals
  • Tested training strategies for face-only, full-body, and multi-purpose LoRAs
  • How to structure 100+ image datasets for SFW/NSFW character consistency
  • Dataset preparation techniques that prevent overfitting and improve quality
  • Real-world training results with specific image counts and compositions

Understanding LoRA Training Fundamentals - What Actually Matters

Before diving into dataset specifics, understanding what LoRAs are learning helps you make informed decisions about training data composition.

What LoRAs Actually Learn: LoRAs (Low-Rank Adaptations) learn to modify base model outputs by capturing patterns specific to your training data. They're learning facial features, body proportions, clothing styles, lighting preferences, and artistic characteristics present in your dataset.

The more consistently these patterns appear, the better the LoRA captures and reproduces them.

Why Dataset Composition Matters:

| Dataset Characteristic | Impact on LoRA Training | Consideration |
|---|---|---|
| Image count | Consistency strength | More images = better consistency (to a point) |
| Variety of angles | Pose flexibility | More angles = more versatile output |
| Consistent subject | Identity preservation | Same subject = better character retention |
| Diverse backgrounds | Scene flexibility | Varied backgrounds = better adaptation |
| Clothing variation | Style range | More variety = less clothing overfitting |

The Overfitting Problem: Too many similar images cause overfitting - the LoRA memorizes specific photos rather than learning general character features. This creates problems when you try to generate scenes different from your training data.

Diversity in angles, lighting, and context prevents overfitting while maintaining character consistency.

Quality vs Quantity: Ten high-quality, well-composed, varied images outperform fifty nearly identical selfies. Quality, variety, and consistency matter more than raw image count.

This doesn't mean more images can't help - it means throwing random images at training won't produce better results.

Training Time and Resources:

| Dataset Size | Training Time (RTX 3090) | VRAM Required | Storage | Cloud Cost |
|---|---|---|---|---|
| 20 images | 30-60 minutes | 10-12GB | 100-200MB | $2-5 |
| 50 images | 1-2 hours | 12-16GB | 250-500MB | $5-10 |
| 100 images | 2-4 hours | 16-20GB | 500MB-1GB | $10-20 |
| 200+ images | 4-8 hours | 20-24GB | 1-2GB | $20-40 |

Understanding these resource requirements helps you plan training runs effectively. If you're working with limited VRAM, see our complete low-VRAM survival guide for optimization strategies.

For users who want excellent LoRAs without managing training infrastructure, platforms like Apatero.com provide streamlined training interfaces with automatic optimization.

The Tested Formula - Dataset Sizes That Actually Work

Based on extensive testing across dozens of training runs, here are the dataset configurations that consistently produce high-quality results for different LoRA types.

Face-Only LoRA (Headshots/Portraits Only): If your goal is generating headshots and half-length portraits, you don't need full-body images. Focus entirely on facial consistency.

Optimal Configuration: 100+ face-focused images

  • 70-80 close-up headshots (shoulders and above)
  • 20-30 half-length portraits (waist and above)
  • Variety of expressions, angles, and lighting
  • Consistent subject across all images

Real-World Results: In testing, 100+ face images produced excellent facial consistency with strong identity preservation across different prompts, styles, and contexts. The LoRA reliably generates recognizable character faces in varied scenes. For visual novel character creation requiring extreme consistency, also see our VNCCS guide.

Smaller datasets (20-30 face images) worked but showed weaker consistency and occasional facial feature drift.

Full-Body LoRA (Complete Character): For generating full-body images with consistent character appearance from head to toe, you need body proportion training data.

Optimal Configuration: 100+ total images split 50/50

  • 50+ headshots and close-up portraits
  • 50+ full-body shots (head-to-toe visible)
  • Mix of poses, clothing, and contexts
  • Consistent character across all images

Why The 50/50 Split Works: This balanced approach ensures the LoRA learns facial details from close-ups while understanding body proportions from full-body shots. Skewing too heavily toward either type creates weaknesses.

Too many headshots and the LoRA struggles with body generation. Too many full-body shots and facial consistency suffers.

Multi-Purpose LoRA (SFW + NSFW): For LoRAs handling both safe-for-work and adult content with consistent character representation, dataset separation and volume matter significantly.

Optimal Configuration: 200+ total images split by content type

  • 100+ SFW images (50+ headshots, 50+ body shots)
  • 100+ NSFW images (50+ headshots, 50+ body shots)
  • Maintain angle and variety balance within each category
  • Same character across all images

Why NSFW Training Needs More Images: The model has less pre-existing knowledge about NSFW compositions, requiring more training data to learn these patterns while maintaining character consistency.

The 100/100 split ensures adequate representation of both content types without the LoRA overfitting to either category.

Testing Results:

| Dataset Type | Image Count | Face Consistency | Body Consistency | Versatility | Overall Quality |
|---|---|---|---|---|---|
| Face-only | 100+ faces | Excellent | N/A | Moderate | Excellent for headshots |
| Full-body | 50/50 split (100 total) | Excellent | Excellent | High | Excellent overall |
| SFW+NSFW | 100/100 split (200 total) | Excellent | Excellent | Very High | Excellent in both categories |
| Small dataset | 20-30 images | Good | Weak | Low | Usable but limited |

The Minimum Viable Dataset: While 100+ images is optimal, you can train usable LoRAs with 20-30 high-quality, diverse images. Expect weaker consistency and less versatility, but the LoRA will capture basic character features.

This minimal approach works for personal projects and experimentation but isn't recommended for professional or commercial work.

Dataset Preparation - Building Your Training Set

Quality dataset preparation matters as much as quantity. Here's how to build training sets that produce excellent LoRAs.

Image Selection Criteria:

| Criterion | Why It Matters | How to Implement |
|---|---|---|
| Consistent subject | Identity preservation | Same person/character in all images |
| Varied angles | Pose flexibility | Front, 3/4, side, back views |
| Different expressions | Emotional range | Happy, neutral, serious, etc. |
| Diverse lighting | Lighting adaptation | Natural, studio, dramatic, soft |
| Multiple outfits | Avoid clothing overfitting | At least 5-10 different outfits |
| Clean backgrounds | Focus on subject | Minimal background complexity |

Aspect Ratio Distribution: Modern LoRA training handles multiple aspect ratios. Vary your training data to match how you'll use the LoRA.

Recommended Distribution:

  • 40% square (1:1) - headshots, close-ups
  • 30% portrait (3:4 or 2:3) - full-body standing
  • 20% landscape (4:3 or 3:2) - full-body action
  • 10% ultra-wide or ultra-tall - creative compositions
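
Before training, it's worth auditing whether your dataset actually matches this distribution. Here is a minimal sketch, assuming Pillow is installed and a hypothetical training_dataset folder; the bucket boundaries and minimum resolution are illustrative values taken from the recommendations in this guide, not hard rules.

```python
# Hypothetical dataset audit: summarize aspect-ratio buckets and flag
# low-resolution images before training. Requires Pillow (pip install Pillow).
from pathlib import Path
from collections import Counter
from PIL import Image

DATASET_DIR = Path("training_dataset")  # adjust to your dataset location
MIN_SIDE = 1024  # recommended minimum from the quality table below

def bucket(w: int, h: int) -> str:
    """Classify an image into a rough aspect-ratio bucket."""
    ratio = w / h
    if 0.9 <= ratio <= 1.1:
        return "square (1:1)"
    if ratio < 0.9:
        return "portrait (3:4-ish)"
    return "landscape (4:3-ish)"

counts = Counter()
low_res = []
for path in DATASET_DIR.rglob("*"):
    if path.suffix.lower() not in {".png", ".jpg", ".jpeg", ".webp"}:
        continue
    with Image.open(path) as img:
        w, h = img.size
    counts[bucket(w, h)] += 1
    if min(w, h) < MIN_SIDE:
        low_res.append(path.name)

total = sum(counts.values())
for name, n in counts.items():
    print(f"{name}: {n} ({100 * n / total:.0f}%)")
print(f"{len(low_res)} images below {MIN_SIDE}px on the short side")
```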

Image Quality Requirements:

| Quality Factor | Minimum | Recommended | Notes |
|---|---|---|---|
| Resolution | 512x512 | 1024x1024+ | Higher is better |
| Focus | Sharp subject | Tack-sharp subject | Blur degrades training |
| Lighting | Visible features | Well-lit, clear details | Avoid heavy shadows |
| Compression | Light JPEG | PNG or high-quality JPEG | Avoid compression artifacts |

What to Avoid in Training Data:

  • Heavily filtered or edited images - Instagram-style filters confuse training
  • Images with multiple people, unless you can crop to a single subject
  • Images where the subject is partially obscured or cut off
  • Low-resolution or heavily compressed images

Captioning Your Dataset:

| Captioning Approach | Pros | Cons | Best For |
|---|---|---|---|
| Auto-captioning (BLIP) | Fast, consistent | Generic descriptions | Large datasets |
| Manual captioning | Precise, detailed | Time-consuming | Quality-focused work |
| Hybrid approach | Balanced | Moderate effort | Most projects |
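
For the auto-captioning route, here is a minimal sketch using BLIP through the Hugging Face transformers library. The checkpoint name is the public BLIP base model, and the trigger word is a hypothetical placeholder; adapt both to your project.

```python
# Minimal BLIP auto-captioning sketch. Assumes transformers, torch, and
# Pillow are installed. Writes a .txt caption next to each image.
from pathlib import Path
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

TRIGGER = "charname"  # hypothetical trigger word for your character

for path in Path("training_dataset").rglob("*.png"):
    image = Image.open(path).convert("RGB")
    inputs = processor(image, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=50)
    caption = processor.decode(out[0], skip_special_tokens=True)
    # Prepend the trigger word so captions tie the description to your subject
    path.with_suffix(".txt").write_text(f"{TRIGGER}, {caption}")
```

A hybrid workflow then means skimming the generated .txt files and hand-correcting the captions that matter most, rather than writing all of them from scratch.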

Directory Structure: Organize your dataset logically for training tools. Create a training_dataset folder with subfolders for headshots, body_shots, sfw, and nsfw categories. Each image file should have a corresponding .txt caption file with the same name.

Most training tools expect images and corresponding .txt caption files in the same directory.
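
As a concrete illustration, one possible layout following the structure described above (file names are illustrative; some trainers, such as kohya's sd-scripts, additionally expect repeat-count prefixes like 10_charname on image folders):

```
training_dataset/
├── headshots/
│   ├── img_001.png
│   ├── img_001.txt      # caption for img_001.png
│   └── ...
├── body_shots/
│   ├── img_050.png
│   ├── img_050.txt
│   └── ...
├── sfw/
└── nsfw/
```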

Training Parameters That Actually Matter

Beyond dataset composition, training parameters significantly affect LoRA quality. Here are tested configurations that consistently produce good results.

Core Training Parameters:

| Parameter | Small Dataset (20-30) | Medium Dataset (50-100) | Large Dataset (100+) |
|---|---|---|---|
| Training steps | 1000-1500 | 2000-3000 | 3000-5000 |
| Learning rate | 1e-4 to 5e-4 | 5e-5 to 1e-4 | 1e-5 to 5e-5 |
| Batch size | 1-2 | 2-4 | 4-8 |
| Network rank | 8-16 | 16-32 | 32-64 |
| Network alpha | 8 | 16 | 32 |
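
As a concrete reference point, here is what a medium-dataset run might look like with kohya-ss sd-scripts' train_network.py. Flag names follow that project, but the paths and values are placeholders to adapt; other trainers expose the same settings under different names.

```bash
# Hedged example: medium dataset (50-100 images), SD 1.5 base.
# Verify flag names against your sd-scripts version.
accelerate launch train_network.py \
  --pretrained_model_name_or_path="/models/sd15-base.safetensors" \
  --train_data_dir="/datasets/training_dataset" \
  --output_dir="/output/my_character_lora" \
  --network_module=networks.lora \
  --network_dim=32 \
  --network_alpha=16 \
  --learning_rate=1e-4 \
  --train_batch_size=2 \
  --max_train_steps=2500 \
  --resolution=512,512 \
  --mixed_precision=fp16
```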

Learning Rate Impact: Learning rate controls how aggressively the LoRA learns from training data. Too high causes overfitting and instability. Too low means insufficient learning even with many steps.

Start with conservative learning rates (1e-4) and decrease for larger datasets to prevent overfitting.

Step Count Determination: Calculate total steps as: (number_of_images × epochs) / batch_size

For 100 images with 30 epochs and batch size 2: (100 × 30) / 2 = 1500 steps

Most training tools calculate this automatically based on your epoch setting.
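
The same calculation as a small Python helper, matching the formula above:

```python
def total_steps(num_images: int, epochs: int, batch_size: int) -> int:
    """Total optimizer steps: each epoch sees every image once."""
    return (num_images * epochs) // batch_size

# The worked example from above: 100 images, 30 epochs, batch size 2
print(total_steps(100, 30, 2))  # 1500
```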

Network Rank Explained:

| Rank | Parameters Trained | Training Time | Quality | File Size |
|---|---|---|---|---|
| 8 | Minimal | Fast | Good | Small (~10MB) |
| 16 | Low | Moderate | Better | Medium (~20MB) |
| 32 | Medium | Slower | Excellent | Standard (~40MB) |
| 64 | High | Slow | Diminishing returns | Large (~80MB) |

Higher rank allows the LoRA to learn more complex features but requires more training data to avoid overfitting.

Training Platform Comparison:

| Platform | Ease of Use | Control | Cost | Best For |
|---|---|---|---|---|
| Kohya GUI (local) | Moderate | Complete | Free (GPU cost) | Technical users |
| CivitAI training | Easy | Limited | Credits-based | Beginners |
| Apatero.com | Very easy | Optimized | Subscription | Professional work |
| Google Colab | Moderate | High | Free/paid | Experimentation |

Monitoring Training Progress: Watch for overfitting signs - training loss approaching zero while validation loss increases indicates overfitting. Generate samples every few hundred steps to visualize learning progress.

Stop training when sample quality plateaus - additional steps won't improve results.


Common Training Mistakes and How to Avoid Them

Even experienced creators make training mistakes that degrade LoRA quality. Here are the most common issues and their solutions.

Mistake 1 - Insufficient Dataset Variety:

| Problem | Symptoms | Solution |
|---|---|---|
| All same angle | Only works from one viewpoint | Include front, 3/4, side, back angles |
| Same outfit | LoRA always generates that outfit | Use 5-10+ different outfits |
| Similar backgrounds | Overfits to specific scenes | Vary backgrounds significantly |
| Identical expressions | Limited emotional range | Include varied expressions |

Mistake 2 - Overfitting from Too Many Similar Images: Training on 100 nearly identical selfies produces a LoRA that only works for that specific pose and lighting. The model memorizes photos rather than learning character features.

Solution: Curate datasets for maximum diversity within consistent character representation.

Mistake 3 - Inconsistent Subject: Using multiple different people or characters in a single dataset confuses training. The LoRA tries to learn all subjects simultaneously, producing inconsistent results.

Solution: One LoRA = one subject. Train separate LoRAs for different characters.

Mistake 4 - Wrong Learning Rate:

| Learning Rate | Result | Fix |
|---|---|---|
| Too high (1e-3+) | Unstable training, overfitting | Reduce to 1e-4 or lower |
| Too low (1e-6) | Insufficient learning | Increase to 5e-5 to 1e-4 |

Mistake 5 - Ignoring Training Metrics: Blindly running training without monitoring loss curves leads to suboptimal results. Training might overfit long before completion or might need more steps than initially planned.

Solution: Check sample outputs every 200-500 steps and watch loss curves.

Mistake 6 - Low-Quality Source Images:

| Quality Issue | Impact | Solution |
|---|---|---|
| Low resolution | Blurry LoRA outputs | Use 1024px+ source images |
| Heavy compression | Artifacts in generation | Use PNG or high-quality JPEG |
| Poor lighting | Inconsistent features | Well-lit source images only |

Mistake 7 - Dataset Too Small for Complexity: Trying to train a multi-style, multi-outfit, multi-context LoRA with 20 images doesn't provide enough data for the model to learn all those variations.

Solution: Match dataset size to complexity goals. Simple character LoRA = 20-30 images. Complex versatile LoRA = 100+ images. For more common pitfalls to avoid, see our guide on 10 common ComfyUI beginner mistakes.

Advanced Training Strategies and Optimization

Beyond basic training, advanced techniques optimize LoRA quality and versatility.

Multi-Concept Training: Training a single LoRA on multiple related concepts (same character in different styles) requires careful dataset separation and increased image counts.

Approach: 50+ images per concept/style you want to capture. Use distinct caption keywords for each concept to help the LoRA differentiate.


Progressive Training: Start with a low learning rate and a small network rank, then increase the rank in stages. This builds a stable foundation before the LoRA learns complex details.

Implementation:

  • Phase 1: Rank 8, LR 5e-5, 500 steps
  • Phase 2: Rank 16, LR 1e-4, 1000 steps
  • Phase 3: Rank 32, LR 5e-5, 1500 steps

Dataset Augmentation:

| Technique | Purpose | Implementation |
|---|---|---|
| Horizontal flip | Double dataset size | Auto-enable in training tools |
| Brightness variation | Lighting robustness | Training tool parameter |
| Crop variation | Composition flexibility | Random crop during training |
| Color jitter | Color robustness | Advanced training tools |
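
Most trainers expose these as simple toggles (kohya's sd-scripts has --flip_aug and --color_aug, for example), but if you preprocess data yourself, a torchvision sketch shows the idea. The parameter values below are illustrative, not tested recommendations.

```python
# Illustrative augmentation pipeline with torchvision; most LoRA trainers
# apply equivalent transforms internally when the matching flags are enabled.
import torchvision.transforms as T

augment = T.Compose([
    T.RandomHorizontalFlip(p=0.5),                  # doubles effective pose variety
    T.ColorJitter(brightness=0.1, saturation=0.1),  # mild lighting/color robustness
    T.RandomResizedCrop(1024, scale=(0.85, 1.0)),   # composition flexibility
])

# augmented = augment(pil_image)  # applied per image during training
```

Note that horizontal flipping is unsuitable for characters with asymmetric features (scars, side-swept hair, asymmetric outfits), since the LoRA will learn mirrored versions.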

Regularization Images: Include generic images of similar subjects (not your specific character) to prevent overfitting and maintain model capabilities.

Ratio: 1 regularization image per 2-3 training images. Example: 100 character images + 40 regularization images.
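
If your trainer follows the kohya-style folder convention (an assumption; check your tool's documentation), repeats and regularization images are encoded in folder names passed via --train_data_dir and --reg_data_dir. Folder and class names here are illustrative:

```
img/                           # passed as --train_data_dir=img
└── 10_charname woman/         # 100 character images, repeated 10x per epoch
reg/                           # passed as --reg_data_dir=reg
└── 1_woman/                   # ~40 generic "woman" regularization images
```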

Tag Weighting: Use weighted caption tags to emphasize important features.

Example caption: (masterpiece:1.3), (character_name:1.5), blue eyes, blonde hair, red dress

The weights tell training to emphasize those tagged features more strongly.

Checkpoint Selection:

| Base Model | Best For | Training Considerations |
|---|---|---|
| SD 1.5 | General purpose | Mature, extensive training resources |
| SDXL | High quality | Requires more VRAM, longer training |
| FLUX | Cutting edge | Best quality, highest resource requirements |
| Anime models | Anime/manga | Style-specific optimization |

Multi-Resolution Training: Train on varied resolutions to improve LoRA flexibility. Include images at 512x512, 768x768, 1024x1024, and non-square ratios.

This produces LoRAs that work well across different generation resolutions.
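
In kohya-ss sd-scripts, aspect-ratio bucketing handles mixed resolutions automatically. These flags (names per that project; worth verifying for your version) can be added to a training command like the earlier example:

```bash
  --enable_bucket \
  --min_bucket_reso=512 \
  --max_bucket_reso=1024 \
  --bucket_reso_steps=64
```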

Testing and Iterating Your LoRA

Training doesn't end when the process completes. Systematic testing reveals strengths, weaknesses, and iteration opportunities.

Initial Testing Protocol:

| Test Type | Purpose | Example Prompts |
|---|---|---|
| Identity test | Verify character recognition | "photo of [character], neutral expression" |
| Angle test | Check multi-angle capability | "3/4 view of [character]", "side profile" |
| Style test | Versatility across styles | "oil painting of [character]", "anime [character]" |
| Context test | Scene adaptation | "[character] in forest", "[character] in city" |
| Expression test | Emotional range | "smiling [character]", "angry [character]" |
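
A test pass like this is easy to script. Below is a hedged sketch using the diffusers library's LoRA loading API, with placeholder paths and a hypothetical trigger word; swap in your own base model and output directory.

```python
# Systematic LoRA test pass with diffusers (pip install diffusers torch).
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("/output", weight_name="my_character_lora.safetensors")

character = "charname"  # hypothetical trigger word
tests = {
    "identity": f"photo of {character}, neutral expression",
    "angle": f"side profile of {character}",
    "style": f"oil painting of {character}",
    "context": f"{character} in a forest",
    "expression": f"smiling {character}",
}

for name, prompt in tests.items():
    image = pipe(prompt, num_inference_steps=30).images[0]
    image.save(f"test_{name}.png")  # compare outputs against the criteria below
```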

Quality Assessment Criteria:

| Criterion | Poor | Acceptable | Excellent |
|---|---|---|---|
| Facial consistency | Features vary significantly | Generally recognizable | Highly consistent |
| Body proportions | Distorted or incorrect | Mostly correct | Accurate and consistent |
| Clothing flexibility | Stuck on training outfits | Some flexibility | Fully adaptable |
| Style adaptability | Only works in one style | Works in 2-3 styles | Works across many styles |

Identifying Overfitting: Test with prompts significantly different from training data. If the LoRA struggles to generate anything outside training contexts, overfitting occurred.

Example: If all training images showed indoor scenes and the LoRA fails to generate outdoor scenes, the model has overfit to indoor contexts.

Iteration Strategy:

| Issue Identified | Root Cause | Next Training Adjustment |
|---|---|---|
| Weak facial consistency | Insufficient face training data | Add 20-30 more headshots |
| Poor body proportions | Too few full-body images | Increase body shot percentage |
| Clothing overfitting | Insufficient outfit variety | Add images with more outfits |
| Limited angles | Training data from limited angles | Add varied angle images |

Version Management: Save training checkpoints at different step counts. This provides multiple LoRA versions to test and choose from.

Many creators find their best LoRA is from 70-80% through training rather than the final checkpoint.

Community Feedback: Share test generations in LoRA training communities for feedback. Experienced trainers quickly identify issues and suggest improvements.

Real-World Training Examples and Results

Here are specific training runs with exact configurations and results to demonstrate these principles in practice.

Example 1 - Portrait LoRA:

  • Dataset: 120 face-focused images (90 headshots, 30 half-length)
  • Parameters: Rank 32, LR 1e-4, 3000 steps, SDXL base
  • Results: Excellent facial consistency across varied prompts and styles. LoRA weight 0.7-0.9 produced best results. Struggled with full-body generation as expected.
  • Best Use: Headshot generation, avatar creation, portrait art. For face swapping workflows, see our ComfyUI face swap guide

Example 2 - Full Character LoRA:

  • Dataset: 100 images (50 headshots, 50 full-body)
  • Parameters: Rank 32, LR 5e-5, 2500 steps, SD 1.5 base
  • Results: Good balance of facial and body consistency. Versatile across scenes and contexts. Slight facial drift at very high resolutions.
  • Best Use: General character generation, varied scenes

Example 3 - Multi-Purpose LoRA (SFW/NSFW):

  • Dataset: 220 images (110 SFW split 55/55, 110 NSFW split 55/55)
  • Parameters: Rank 64, LR 1e-5, 5000 steps, SDXL base
  • Results: Excellent consistency across both content types. Character recognizable in all contexts. Slightly longer training time justified by versatility.
  • Best Use: Commercial character work, comprehensive character representation

Example 4 - Minimal Dataset:

  • Dataset: 25 images (15 headshots, 10 body shots)
  • Parameters: Rank 16, LR 1e-4, 1500 steps, SD 1.5 base
  • Results: Recognizable character but inconsistent details. Worked well at specific LoRA weights (0.8-0.9) but weak outside that range. Prone to generating training outfit.
  • Best Use: Personal projects, quick character concepts

Training Cost Comparison:

| Example | Training Time | Cloud Cost | Quality Rating | Versatility |
|---|---|---|---|---|
| Portrait | 3 hours | $15 | 9/10 | Moderate |
| Full Character | 2.5 hours | $12 | 8.5/10 | High |
| Multi-Purpose | 5 hours | $25 | 9.5/10 | Very High |
| Minimal | 1.5 hours | $8 | 6.5/10 | Low |

Lessons from Testing: The jump from 25 to 100 images dramatically improves consistency and versatility. Beyond 100 images, improvements become incremental rather than transformative.

The 50/50 split for full-body LoRAs consistently outperforms other ratios. Training on SDXL produces higher quality but requires more VRAM and time compared to SD 1.5.

Conclusion - Building Training Datasets That Work

LoRA training dataset composition makes the difference between mediocre results and excellent character consistency. The tested formulas in this guide provide starting points for your specific needs.

Key Takeaways:

  • Face-only LoRAs work excellently with 100+ face-focused images
  • Full-body LoRAs need 100+ images split 50/50 between headshots and body shots
  • Multi-purpose LoRAs handling SFW and NSFW content benefit from 200+ images split 100/100
  • Quality and variety matter more than raw image count

Your Training Strategy: Start with clear goals - what will this LoRA generate? Match dataset size and composition to those goals. Curate for quality and diversity over quantity. Test systematically and iterate based on actual results.

Platform Considerations: Local training provides complete control but requires technical setup and GPU resources. Cloud platforms like Apatero.com streamline the process with optimized training pipelines. CivitAI training offers beginner-friendly interfaces with guided workflows. For deploying your trained LoRAs in production workflows, see our workflow to production API guide.

What's Next: Build your first training dataset following these guidelines. Start with a modest 50-image dataset to learn the process, then scale up based on results. Join LoRA training communities to share results and learn from experienced trainers.

The Bottom Line: Great LoRAs come from thoughtful dataset preparation, appropriate training parameters, and systematic iteration. Follow these tested strategies, and you'll create consistent, versatile LoRAs that bring your characters to life across any context.

Your training data defines your LoRA's capabilities. Invest time in dataset preparation, and the results will reflect that quality.
