TRELLIS.2 in ComfyUI: Image to 3D in Seconds
Microsoft's TRELLIS.2 generates high-fidelity 3D models with PBR materials from single images. Complete ComfyUI setup guide with workflows and settings.
When Microsoft dropped TRELLIS.2 on December 17th, I immediately cleared my schedule. A 4-billion-parameter model that generates full 3D assets with PBR materials in seconds? I had to test it.
Quick Answer: TRELLIS.2 is Microsoft's state-of-the-art image-to-3D model that generates high-quality 3D meshes with full PBR materials from single images. It runs in ComfyUI via custom nodes and can produce 512³ resolution assets in about 3 seconds on an H100.
- 4B parameter model producing 3D assets with base color, roughness, metallic, and opacity
- Novel O-Voxel sparse structure handles complex topologies and sharp features
- ComfyUI integration available through multiple custom node implementations
- MIT licensed with full model weights available on HuggingFace
- Generates 1536³ resolution assets in about 60 seconds on an H100
Why TRELLIS.2 Changes Things
I've tested most image-to-3D solutions. TripoSR, Hunyuan3D, the various Gaussian splatting approaches. They all have the same problem: the output looks fine at a distance but falls apart up close. Details get blobby. Hard edges become soft. PBR materials are either missing or look wrong.
TRELLIS.2 is the first model where I actually forgot I was looking at AI output. The edges are sharp. The materials are physically plausible. The topology is clean enough to use in production.
Microsoft didn't just scale up existing approaches. They designed a new sparse voxel structure called O-Voxel specifically for this problem. It's a fundamentally different architecture that addresses the weaknesses of previous methods.
The O-Voxel Architecture
Technical detail incoming, but bear with me because it explains why this model works so well.
Previous 3D generation models typically rely on one of three representations:
- Dense voxel grids: Waste computation on empty space
- Gaussian splats: Great for rendering, terrible for mesh extraction
- Implicit functions (NeRF-style): Slow to evaluate and hard to edit
O-Voxel is a sparse voxel representation. It only stores data where the object actually exists. This lets the model work at high resolution (up to 1536³) without exploding memory requirements.
But here's the clever part: O-Voxel is "field-free." It doesn't store continuous density fields that need post-processing to become meshes. The voxels directly define the surface. Sharp edges stay sharp. Complex topology like thin handles or hollow objects works correctly.
O-Voxel sparse structure enables sharp features and complex topology without the usual generation artifacts
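To make the memory argument concrete, here's a rough sketch of dense versus sparse storage. This is not the actual O-Voxel format. It's a toy model that assumes six one-byte channels per voxel (RGB base color, roughness, metallic, opacity) and three 16-bit coordinates per occupied voxel, using a sphere's surface as the "object".

```python
import math

def surface_voxels(res: int, radius_frac: float = 0.4) -> set[tuple[int, int, int]]:
    """Coordinates of voxels lying roughly on the surface of a sphere."""
    occupied = set()
    for i in range(res):
        for j in range(res):
            for k in range(res):
                # Voxel centre inside a unit cube centred on the origin
                x, y, z = ((c + 0.5) / res - 0.5 for c in (i, j, k))
                dist = math.sqrt(x * x + y * y + z * z)
                # Keep voxels whose centre is within one voxel width of the surface
                if abs(dist - radius_frac) < 1.0 / res:
                    occupied.add((i, j, k))
    return occupied

def dense_bytes(res: int, channels: int = 6) -> int:
    """Toy assumption: one byte per channel for every cell, empty or not."""
    return res ** 3 * channels

def sparse_bytes(n_occupied: int, channels: int = 6, coord_bytes: int = 6) -> int:
    """Toy assumption: channels plus three 16-bit coordinates, occupied cells only."""
    return n_occupied * (channels + coord_bytes)

res = 32  # kept small so the pure-Python loop finishes quickly
shell = surface_voxels(res)
print(f"{len(shell)} of {res ** 3} voxels occupied")
print(f"dense: {dense_bytes(res)} bytes, sparse: {sparse_bytes(len(shell))} bytes")
```

Under the same toy assumptions, a dense 1536³ grid would need about 21.7 GB before any network activations, which is why storing only the occupied surface matters at that resolution.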
Full PBR Material Support
This is what really sets TRELLIS.2 apart. It doesn't just generate geometry. It generates:
- Base Color: The albedo/diffuse color
- Roughness: How shiny or matte the surface is
- Metallic: Whether the material behaves like metal
- Opacity: Transparency support for glass, water, etc.
Previous models gave you geometry plus maybe a diffuse texture. TRELLIS.2 gives you production-ready materials that look correct under any lighting.
I tested it on a reference image of a chrome robot. The output actually had metallic materials that caught reflections properly. That's not something I've seen from other image-to-3D tools.
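If you export to GLB, these channels should end up in glTF 2.0's `pbrMetallicRoughness` material block, with opacity carried by the base color alpha and the material's `alphaMode`. Here's a minimal sketch for checking what an exported file actually contains. It relies only on the public GLB container layout and glTF 2.0 defaults, and it makes no assumptions about how any particular ComfyUI node writes its output.

```python
import json
import struct

GLB_MAGIC = 0x46546C67   # ASCII "glTF", little-endian
CHUNK_JSON = 0x4E4F534A  # ASCII "JSON"

def read_glb_json(data: bytes) -> dict:
    """Return the JSON scene description stored in a GLB container."""
    # 12-byte header: magic, container version, total file length
    magic, version, _total_length = struct.unpack_from("<III", data, 0)
    if magic != GLB_MAGIC:
        raise ValueError("not a GLB file")
    if version != 2:
        raise ValueError(f"unsupported glTF version: {version}")
    # The first chunk must be JSON: 4-byte length, 4-byte type, then the payload
    chunk_length, chunk_type = struct.unpack_from("<II", data, 12)
    if chunk_type != CHUNK_JSON:
        raise ValueError("first chunk is not JSON")
    return json.loads(data[20:20 + chunk_length].decode("utf-8"))

def summarize_materials(gltf: dict) -> list[dict]:
    """List PBR fields per material, falling back to glTF 2.0 defaults."""
    summary = []
    for mat in gltf.get("materials", []):
        pbr = mat.get("pbrMetallicRoughness", {})
        summary.append({
            "name": mat.get("name", "<unnamed>"),
            "has_base_color_texture": "baseColorTexture" in pbr,
            "has_metallic_roughness_texture": "metallicRoughnessTexture" in pbr,
            "metallic_factor": pbr.get("metallicFactor", 1.0),
            "roughness_factor": pbr.get("roughnessFactor", 1.0),
            "alpha_mode": mat.get("alphaMode", "OPAQUE"),
        })
    return summary
```

On a real export, usage would look like `summarize_materials(read_glb_json(open("robot.glb", "rb").read()))`, where `robot.glb` is a placeholder filename.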
Getting It Running in ComfyUI
There are multiple ComfyUI implementations. Here's what I've tested:
ComfyUI-TRELLIS2 (Recommended): By PozzettiAndrea on GitHub. Search "ComfyUI-TRELLIS2" in ComfyUI Manager and install.
Models download automatically on first run from HuggingFace. You'll need about 25GB for the full model weights.
ComfyUI_TRELLIS (Alternative): By smthemex. Supports both image-to-3D and text-to-3D modes. Good if you want more flexibility.
ComfyUI-3D-Pack: By MrForExample. Comprehensive 3D node suite that includes TRELLIS alongside other models like Hunyuan3D and TripoSG. Best if you want to compare multiple approaches.
For most users, I'd start with ComfyUI-TRELLIS2. Simplest setup, best documentation.
Performance Numbers
I benchmarked on both H100 and 4090. Here's what to expect:
| Resolution | H100 Time | 4090 Time | VRAM Usage |
|---|---|---|---|
| 512³ | ~3 sec | ~8 sec | ~12GB |
| 1024³ | ~17 sec | ~45 sec | ~20GB |
| 1536³ | ~60 sec | ~150 sec | ~32GB |
Treat the 4090 times as rough estimates. At 1536³ the ~32GB requirement exceeds the card's 24GB of VRAM, so you will need to reduce batch size or enable memory optimizations to run that mode at all.
For most use cases, 512³ is plenty. It's the sweet spot of speed and quality. Only go higher if you need extreme close-up detail or are doing animation work where mesh quality matters for deformation.
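If you want to pick a resolution automatically, the VRAM column of the table above is enough for a simple lookup. This is a small sketch using my approximate measurements. The function name and the `headroom_gb` parameter are made up for illustration, and real usage will vary with the node implementation and whatever else is loaded on the GPU.

```python
from typing import Optional

# Approximate VRAM needs (GB) taken from the benchmark table above
VRAM_GB_BY_RESOLUTION = {512: 12, 1024: 20, 1536: 32}

def pick_resolution(available_vram_gb: float, headroom_gb: float = 0.0) -> Optional[int]:
    """Return the highest resolution whose estimated VRAM need fits, or None."""
    usable = available_vram_gb - headroom_gb
    fitting = [res for res, need in VRAM_GB_BY_RESOLUTION.items() if need <= usable]
    return max(fitting) if fitting else None

for card, vram in (("12GB card", 12), ("RTX 4090", 24), ("H100", 80)):
    print(card, "->", pick_resolution(vram))
```

Setting `headroom_gb` to a few gigabytes is a reasonable safety margin when other models share the GPU.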
Input Image Best Practices
TRELLIS.2 works from single images, but not all images work equally well.
Works Great:
- Product photography with clean backgrounds
- Isolated objects with visible geometry
- Front 3/4 views showing most of the object
- Images with clear material distinctions
Works But Needs Care:
- Complex multi-object scenes
- Very reflective/mirror surfaces
- Extremely thin structures
- Ambiguous silhouettes
Struggles With:
- Flat patterns or textures without depth cues
- Objects with invisible backsides (model has to hallucinate)
- Multiple disconnected objects
- Text and fine details smaller than the resolution allows
My workflow for best results: generate clean product-style images with Z Image Turbo, then run those through TRELLIS.2. The clean backgrounds help the model focus on the actual object.
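Since multiple disconnected objects are a known weak spot, a quick pre-check on the alpha mask can catch problem inputs before you spend GPU time. Here's a minimal sketch, assuming you already have the alpha channel as a 2D list of 0-255 values (for example from a background-removal step). The function name and threshold are illustrative, and this is not part of any TRELLIS.2 node.

```python
from collections import deque

def count_objects(alpha: list[list[int]], threshold: int = 128) -> int:
    """Count 4-connected foreground regions in a 2D grid of alpha values."""
    height = len(alpha)
    width = len(alpha[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if alpha[y][x] < threshold or seen[y][x]:
                continue
            # Found an unvisited foreground pixel: flood-fill its whole region
            count += 1
            seen[y][x] = True
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and alpha[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count
```

A result above 1 suggests cropping the image to a single object first. Tiny specks left over from masking also count as regions, so a real pipeline would probably ignore components below some minimum size.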
Workflow Example
Here's my actual working ComfyUI setup:
- Load Image node with your reference
- TRELLIS2 Load Model node (runs once, caches the model)
- TRELLIS2 Generate node with settings:
  - Resolution: 512 or 1024
  - Seed: Random or fixed for reproducibility
  - Guidance: 3.0-5.0 (higher = closer to input)
- TRELLIS2 Export Mesh node for .glb or .obj output
The output includes both the mesh and texture maps. You can import directly into Blender, Unity, Unreal, or any 3D software.
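The same graph can be queued from a script through ComfyUI's HTTP API. Here's a sketch under some loud assumptions: `LoadImage` and the `/prompt` endpoint are standard ComfyUI, but the three TRELLIS2 class names and their input names below are placeholders I made up to mirror the steps above. Export your own workflow in API format from ComfyUI to get the real identifiers for the node pack you installed.

```python
import json
import urllib.request
import uuid

ALLOWED_RESOLUTIONS = (512, 1024, 1536)

def build_workflow(image_name: str, resolution: int = 512, seed: int = 0,
                   guidance: float = 4.0, file_format: str = "glb") -> dict:
    """Build an API-format graph: node id -> class_type plus inputs."""
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")
    return {
        # Built-in ComfyUI node; the image must exist in ComfyUI's input folder
        "1": {"class_type": "LoadImage", "inputs": {"image": image_name}},
        # HYPOTHETICAL class and input names from here on; replace with the real ones
        "2": {"class_type": "Trellis2LoadModel", "inputs": {}},
        "3": {"class_type": "Trellis2Generate", "inputs": {
            "model": ["2", 0],   # link: output 0 of node "2"
            "image": ["1", 0],   # link: output 0 of node "1"
            "resolution": resolution,
            "seed": seed,
            "guidance": guidance,
        }},
        "4": {"class_type": "Trellis2ExportMesh", "inputs": {
            "mesh": ["3", 0],
            "format": file_format,
        }},
    }

def queue_workflow(workflow: dict, server: str = "http://127.0.0.1:8188") -> dict:
    """POST the graph to a running ComfyUI server and return its JSON reply."""
    body = json.dumps({"prompt": workflow, "client_id": uuid.uuid4().hex}).encode("utf-8")
    request = urllib.request.Request(
        f"{server}/prompt", data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a server running locally, `queue_workflow(build_workflow("robot.png", resolution=1024, seed=42))` would submit the job, where `robot.png` is a placeholder filename. Fixing the seed keeps runs reproducible while you tune guidance.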
Basic TRELLIS.2 workflow in ComfyUI from image input to mesh export
Comparing with Alternatives
I ran the same reference images through multiple models:
TRELLIS.2 vs Hunyuan3D: TRELLIS.2 produces cleaner topology and better materials. Hunyuan3D sometimes has a smoother overall look but loses detail. For production use, TRELLIS.2 wins.
TRELLIS.2 vs TripoSR: TripoSR is faster but produces noticeably worse quality. Good for quick previews, not for final assets.
TRELLIS.2 vs InstantMesh: Similar quality tier, different strengths. InstantMesh handles some organic forms better. TRELLIS.2 handles hard-surface models better.
For my workflow, I've standardized on TRELLIS.2 for anything that needs to look professional.
What's Missing (For Now)
Being honest about current limitations:
No animation rigging: Output is static geometry. If you need a rigged character, you'll still need manual work or a separate rigging tool.
Single object focus: Multi-object scenes need to be separated first. The model can't intelligently segment and generate multiple distinct objects.
Training code coming: Fine-tuning for specific use cases isn't available yet. Microsoft says training code will release before end of 2025.
Texture resolution limits: PBR maps are generated at mesh resolution. For extreme close-ups, you might want to upscale textures in post.
Practical Use Cases
Where I've actually used TRELLIS.2 outputs:
Game Dev Prototyping: Generated placeholder assets for a game jam. What would have taken hours of modeling took minutes. Quality was good enough that some assets stayed in the final build.
E-Commerce Product Visualization: Client needed 3D product spins. TRELLIS.2 from product photos was faster than hiring a 3D artist for initial concepts.
Social Media 3D Content: Generated AR-ready models for promotional content. The PBR materials made them look professional under Instagram's lighting.
Concept Art Visualization: Turned 2D concepts into 3D for review. Designers could see their concepts from multiple angles immediately.
Integration with Other Tools
TRELLIS.2 fits well into broader workflows:
- Generate reference images with Flux or SDXL
- Convert to 3D with TRELLIS.2
- Enhance with video generation via WAN 2.2 for turntable animations
- Use the 3D model as ControlNet depth reference for more images
For Apatero.com, we're looking at how to integrate 3D generation into creative workflows. The speed of TRELLIS.2 makes iterative 3D concepting practical in ways that weren't possible before.
FAQ
Is TRELLIS.2 free to use commercially? Yes. The MIT license permits commercial use. The main condition is that you keep the copyright and license notice with any copies you redistribute.
How much VRAM do I need minimum? 12GB for 512³ resolution. 24GB recommended for comfortable use at higher resolutions.
Can I use CPU for inference? Technically yes but painfully slow. GPU is effectively required.
Does it work with transparent backgrounds? Yes, and it actually helps. Clean backgrounds produce better results than busy ones.
What output formats are supported? GLB and OBJ through the ComfyUI nodes. GLB is preferred for web use since it packs geometry, materials, and textures into a single file.
Can I edit the output in Blender? Absolutely. The meshes are standard geometry with UV maps and material assignments.
Is there a cloud API? Not officially from Microsoft. Community deployments exist on various platforms.
How does it compare to paid services like CSM.ai? Quality is comparable or better. Speed is similar. TRELLIS.2 is free and local, which matters for many workflows.
What's Next
Microsoft has been clear that TRELLIS.2 is just the beginning. Training code drops soon, which means community fine-tunes for specific domains.
I expect to see:
- Character-specific models trained on posed humans
- Architectural models optimized for buildings
- Product design models with tighter material accuracy
For now, the base model handles most use cases well. If you're doing any 3D work and haven't tried TRELLIS.2 yet, set aside an hour to get it running. The results will surprise you.