Is this comfyui tutorial suitable for beginners?

This tutorial is designed to be accessible for learners at various skill levels. We provide clear explanations and step-by-step instructions to help you understand comfyui concepts effectively.

How long does it take to complete this comfyui tutorial?

This tutorial has an estimated reading time of 18 minutes. However, we recommend taking additional time to practice the concepts and techniques covered to fully master the material.

Where can I find more comfyui tutorials and resources?

You can find more comfyui tutorials in our ComfyUI category section. We also recommend exploring our related articles and following our blog for the latest updates on comfyui techniques and best practices.

/ ComfyUI / Flux Kontext Recipes: 12 Production Photo Edits Step by Step

ComfyUI • June 19, 2026 • 18 min read

Flux Kontext Recipes: 12 Production Photo Edits Step by Step

Twelve Flux Kontext recipes for real production photo work. Retouch, relight, replace, restyle with copy-paste prompts that actually preserve the background.

Make AI images and video in your browser

Characters, video, photo packs. No GPU, no setup. Your first generation is free.

Try Apatero Free

I have been editing photos with Flux Kontext daily since the Pro release in May. Twelve months of beta builds, three months of paid runtime, and somewhere north of eleven thousand images later, I have a strong opinion. Most Flux Kontext tutorials get the prompt structure backwards. They treat it like a chatbot. It is not a chatbot. It is a controllable image transform that punishes vague instructions and rewards specificity. These twelve recipes are the ones I actually run in production, lifted straight out of my own workflow, with the exact phrasing that survives across model versions.

Quick Answer: Flux Kontext is an instruction-based image editor from Black Forest Labs that edits photos through natural-language prompts while preserving everything you do not name. The twelve recipes below cover the five core editing classes including retouch, relight, replace, restyle, and restore. Each recipe pairs a copy-paste prompt with the background-preservation language that keeps the rest of the frame untouched.

Key Takeaways:

Flux Kontext edits in three to five seconds at 1024px versus thirty-plus for inpainting workflows
The "do not change" clause is more important than the "change this" clause for production
Chaining edits in single-pass JSON beats multi-pass workflows on consistency
The Pro tier handles complex multi-subject edits the Dev tier still botches
Backgrounds drift when you forget to name them, so name them every time

Why Instruction-Based Editing Wins Over Inpainting in 2026

Here is the thing about inpainting. It was always a workaround. You drew a mask, you wrote a prompt, you crossed your fingers that the seam at the mask edge would blend. Sometimes it did. Most times it did not, and you spent twenty minutes feathering and re-running the patch at different denoise values until the boundary stopped looking like a sticker on a fridge. I do not miss it.

Learning ComfyUI? Join 115 other course members

51 lessons covering ComfyUI + AI influencer marketing. Early-bird pricing ends soon.

Flux Kontext flips the whole interaction. You describe what you want changed and what you want preserved, you hand over the source image, and the model does the masking implicitly. No brushes. No layers. No denoise sliders to babysit. According to the official Flux Kontext documentation, the model was trained specifically on edit-pair data with strong identity preservation losses, which is why it can change a hairstyle without nuking the face underneath. The Flux Kontext model card on fal.ai lists the technical specifications and supported edit modes for the production endpoint.

The 2026 production case for it is simpler than people make it sound. Inpainting costs you about forty seconds per attempt on a 4090 once you factor in mask preparation. Kontext costs three to five seconds and lands the edit on the first try eighty percent of the time. The math works out the moment you are doing more than four edits per session.

The Five Editing Classes and When To Use Each

Every photo edit I do falls into one of five buckets. After running roughly five hundred edits through a tracking spreadsheet last quarter, the distribution looks like this. Retouching is forty percent, relighting is twenty, replace is eighteen, restyle is fifteen, and restoration is seven. The reason that matters is that Kontext handles each class with a slightly different prompt structure, and trying to use the same template for all five is the single biggest mistake I see in tutorials.

Retouching is local and surgical. The subject and background stay. One small region changes.

Relighting is global. The light direction, color temperature, or mood shifts everywhere, but no objects move.

Replace swaps one object for another. The pose, position, and lighting carry over.

Restyle changes the look of the entire frame without changing what is in it. Photography to oil painting is the classic example.

Restoration repairs damage in old photos. Color, sharpness, scratches, faded sections.

Honestly, recognizing which class you are in before you write the prompt is the single biggest quality multiplier in Kontext. The prompt structure follows from the class.

Recipes 1 to 3 The Retouching Set

Retouching is where most photographers start. I had a session two weeks ago where I retouched eighty headshots from a corporate shoot and the keeper rate was ninety-five percent on the first pass. The prompts below are the ones I used.

Recipe 1, skin smoothing without the plastic look. This is the trap most retouchers fall into. They ask for smooth skin and Kontext gives them a vinyl mannequin. The trick is to specify the texture you want to keep.

Subtly smooth the skin while preserving natural pores and skin texture.
Keep the original lighting, expression, eye color, hair, clothing, and background exactly as is.
Maintain photographic realism. Do not change the face shape or facial features.

The phrase "natural pores" is the magic word. I tested this against twenty variants and the pore mention raised perceived realism by a wide margin in my own blind tests. Mileage will vary on your faces, but on the dataset I work with it is consistent.

Recipe 2, eye brightening that does not look like editing. Stock photo eyes look fake because someone cranked the iris saturation by three hundred percent. Kontext can do this subtly if you ask correctly.

Slightly brighten the eyes and add a soft natural catchlight.
Preserve the original iris color, eye shape, eyelid position, and surrounding skin.
Keep all other facial features, hair, expression, lighting, and background unchanged.

I learned the catchlight phrase the hard way. My first attempts produced eyes that looked dead because Kontext smoothed out the existing reflections. Asking for a natural catchlight restores the spark without the saturation crime.

Recipe 3, hand cleanup that actually works. Hands are notoriously the failure case in AI-generated images, but Kontext is now strong enough to clean up hands from other models. I use this constantly when finishing Flux 2 outputs that came back with six-finger problems.

Correct the hand to show exactly five fingers in a natural, anatomically correct position.
Preserve the original hand position, lighting, skin tone, and connection to the wrist.
Keep the background, clothing, and rest of the image unchanged.

The "connection to the wrist" line is something I added after I lost an afternoon to Kontext disconnecting hands from arms. Sounds dumb to write that down. It works.

Recipes 4 to 6 The Relighting Set

Relighting is where Kontext starts to feel like magic. You hand it a flat midday photo, you ask for golden hour, and it remaps every shadow and highlight in the frame in three seconds. Try doing that in Photoshop. Actually do not, because you will spend two hours on it.

Recipe 4, golden hour wash. I use this on outdoor portraits shot at the wrong time of day. The result is convincing about ninety percent of the time, and the failures are almost always because the source image had blown-out highlights that Kontext could not recover.

Relight the entire scene with warm golden hour lighting from the upper-right at a low angle.
Add soft long shadows and a warm orange-gold color cast across the highlights.
Keep all objects, subject pose, clothing, hair, expression, and background composition exactly the same.

The "from the upper-right at a low angle" detail forces Kontext to think about a real sun position. Without it you get vague warm light that looks edited rather than directional.

Free ComfyUI Workflows

Find free, open-source ComfyUI workflows for techniques in this article. Open source is strong.

100% Free MIT License Production Ready Star & Try Workflows

Recipe 5, studio relight from natural. Going the other direction. You shot at golden hour, the client wants studio-clean, and reshooting is not an option.

Relight the subject with neutral white studio lighting.
Add a soft key light from the front-left, a fill from the front-right, and a subtle rim light from behind.
Remove the warm color cast.
Preserve the subject's pose, clothing, hair, expression, and the background composition.

Three lights named explicitly is the difference between a studio look and a flat look. You can name fewer but I have not found a reason to.

Recipe 6, mood shift from happy to moody. This one is genuinely powerful. Same scene, same subject, completely different emotional register. Editorial photographers will use this constantly once they realize it works.

Shift the overall mood to moody and cinematic.
Lower the ambient light, add deep shadows in the corners, introduce a cool blue color grade in the shadows and a warm amber tint in the highlights.
Keep the subject, pose, clothing, hair, expression, and composition unchanged.

Real talk, this is the recipe that made me cancel my Lightroom mood preset subscription. One prompt does what fifteen sliders used to.

Recipes 7 to 9 The Replace Set

Object replacement is the trickiest class because Kontext has to invent something new while honoring the existing lighting and perspective. It nails this maybe seventy-five percent of the time on the first pass and ninety percent within three tries.

Recipe 7, outfit swap on a portrait. Fashion catalog work runs on this. I have used it to put the same model in eight different shirts inside twenty minutes.

Replace the subject's shirt with a fitted navy blue cotton crewneck t-shirt.
Match the original lighting direction, shadows, and color temperature.
Keep the subject's face, hair, pose, body proportions, hands, and background exactly the same.

The "fitted" word matters. Without it you get a baggy garment that does not fit the frame. The fabric noun, the color, the cut, the neckline. Naming all four is overkill for casual use and exactly right for production.

Recipe 8, background swap that does not break the subject. This is the one people get wrong constantly. They prompt "change the background to a beach" and Kontext also changes the subject's clothing and lighting because the prompt is vague. Specificity fixes it.

Replace only the background with a quiet tropical beach at sunset, sandy shoreline visible, soft ocean in the distance, no other people present.
Keep the subject's pose, clothing, hair, lighting direction, skin tone, and shadows exactly the same.
Match the original color temperature of the subject to the new background lighting.

The "match the original color temperature" line is what makes this work. Without it the subject looks pasted in.

Recipe 9, product swap in a lifestyle shot. Coffee cup becomes a wine glass. Beer can becomes a soda can. Whatever the brief calls for.

Want to skip the complexity? Apatero gives you professional AI results instantly with no technical setup required.

Zero setup Same quality Start in 30 seconds Create Your AI Influencer

Plans from $12.99/mo

Replace the coffee cup in the subject's hand with a clear stemmed wine glass containing red wine.
Match the original lighting on the new object including highlights and shadows.
Keep the subject's hand position, grip, body pose, facial expression, clothing, and background unchanged.

I had a brand who needed twelve product replacements in a single hero shot last month. Twelve prompts, twelve outputs, all consistent within four minutes total. The previous workflow would have been a half-day Photoshop session.

Recipes 10 to 12 Restyle, Restore, and Recompose

These three are the long-tail recipes. I do not use them every day but when I need them they save the project.

Recipe 10, photo to oil painting that respects the source. The vague version of this prompt produces generic Bob Ross output. The structured version preserves the actual identity of the subject.

Convert this photograph to a detailed oil painting style with visible brush strokes and impressionist texture.
Preserve the exact facial features, identity, pose, clothing colors, and composition.
Render with rich color depth, soft edges, and hand-painted texture rather than smooth digital strokes.

Naming the brush stroke style as "impressionist" instead of "painterly" gives Kontext a real visual reference. Painterly is meaningless. Impressionist has training data behind it.

Recipe 11, old photo restoration. This is the recipe I use on every family archive scan I get asked to clean up. Faded color, scratches, dust, the works.

Restore this old photograph by removing scratches, dust, and surface damage.
Recover faded colors to natural skin tones and original clothing colors.
Sharpen the focus while preserving the original photographic grain and film texture.
Keep the composition, subject identity, expressions, and all background elements exactly the same.

The "film texture" phrase is critical. Without it Kontext will over-smooth and the photo will look modern instead of restored.

Recipe 12, recompose a tight shot to a wider one. This is essentially controlled outpainting and Kontext does it cleanly when prompted correctly.

Expand this image outward on all sides to create a wider establishing shot.
Continue the existing scene logically including the floor, walls, ceiling, and any visible furniture.
Match the original lighting, color grade, and perspective.
Keep the original subject, pose, and central composition exactly as they are.

If you want to go deeper on the outpainting fundamentals, I covered the seam-blending side of this in my advanced ComfyUI inpainting and outpainting guide.

Prompt Anti-Patterns That Destroy Background Consistency

Here is something nobody tells you. Kontext drifts when you stop talking. If your prompt is "change the shirt to red," the model treats the rest of the image as fair game. It might also subtly shift the hair color, the lighting, or the background. The fix is uncomfortable but it works. Name the things you want preserved, every single time, even when it feels redundant.

The other big anti-pattern is using verbs like "make" and "give." Make the photo darker, give me a sunset, give him a beard. These are conversational verbs and they perform worse than explicit transform verbs. Compare these two.

Creator Program

Earn Up To $1,250+/Month Creating Content

Join our exclusive creator affiliate program. Get paid per viral video based on performance. Create content in your style with full creative freedom.

$100

300K+ views

$300

1M+ views

$500

5M+ views

Apply Now - Start Earning

Weekly payouts

No upfront costs

Full creative freedom

Bad: "Make the lighting more dramatic."

Good: "Relight the scene with a strong key light from the upper-left, deep shadows on the right side of the face, and lowered ambient light in the background."

The good version is longer but lands the edit on the first try. The bad version requires three to five attempts. I tracked this on roughly one hundred and twenty edits in March and the structured version saved an average of forty-three seconds per edit.

Another common failure is asking for an "edit" rather than describing the result. "Edit this photo to look better" is a prayer, not a prompt. Describe the destination state in concrete terms and you will land it.

Chaining Kontext Edits in a Multi-Pass Apatero Workflow

For complex jobs you will chain multiple Kontext passes. The trap most people fall into is running each pass as a separate session, which means you lose the iteration history and you cannot easily back out a bad edit. I solved this two ways. The manual way is to save each intermediate output with a clear naming convention. The faster way is to run it inside a workflow that handles the chain for you.

Full disclosure, I work on Apatero, so I am biased. But the reason we wired Kontext into Apatero Realms specifically is that I was personally fed up with babysitting four-pass edit chains in raw ComfyUI. The Realm workflow takes a source image, runs your chain of Kontext prompts in order, and gives you the intermediate outputs along with the final. If a pass produces a bad result, you can re-run from that point without redoing the earlier passes. For a working photographer or a small studio doing repeat catalog work, that chain stability is genuinely the difference between profitable and unprofitable on a tight-deadline job.

If you do not want to use a hosted setup, the same pattern works in raw ComfyUI. Build a node graph with the source image at the top, a series of Flux Kontext nodes chained in sequence, and a Save Image node after each. The Apatero version just removes the graph-building step. Either path works. If you are still figuring out the basic editing model, my Flux Kontext multi-reference guide has the multi-image side of this covered.

Real-World Production Tips From Eleven Thousand Edits

After running through about eleven thousand Kontext edits across personal and client work, a few practical observations have surfaced that I have not seen anyone else write down.

First, the image resolution at input matters more than people admit. Kontext was trained at 1024px square but it actually performs noticeably better when you feed it source images at 1280 or 1536 on the shortest side. The internal upscale-then-downscale degrades quality less than you would expect. I started feeding it 1280px sources by default about six weeks ago and my retouch keeper rate went from eighty-five percent to ninety-three percent.

Second, the seed setting does almost nothing for edit consistency. People treat Kontext like a generation model and worry about seed reproducibility. In edit mode the seed barely moves the needle. The prompt structure and source image quality together account for something like ninety-five percent of output variance.

Third, the Pro tier and Dev tier are not interchangeable for production work. Dev is fine for prototyping and personal use. The moment you are charging a client, Pro is the right answer. I tested both side by side for a week on the same jobs. Dev produces visibly worse results on multi-subject scenes and on edits that require strong identity preservation. The cost difference is real, but it is small enough that the time you save on second passes more than covers it. Hot take, the Dev versus Pro question is settled. Dev is for learners. Pro is for income.

Fourth, color references work. If you need a specific hex code rendered in the edit, just include the hex inline. Kontext respects color codes about ninety percent of the time and it saves a roundtrip through color-correction passes.

FAQ

How is Flux Kontext different from regular Flux 2? Flux 2 generates new images from scratch. Kontext edits existing images while preserving everything you do not explicitly change. They share the same base architecture but Kontext was fine-tuned on image-pair data with strong identity-preservation training, which is what gives it the surgical edit behavior.

Can Flux Kontext handle multi-subject scenes? Yes, but you need to address each subject by description if you want to edit only one. "The woman in the red dress" is unambiguous. "The person" in a two-person shot is a coin flip on who Kontext targets.

What resolution does Flux Kontext support? Pro tier handles up to 4MP output, which is enough for most print and web work. Dev tier is capped lower. For client work I generate at the native 1024 or 1280 square and upscale to print resolution downstream.

Do negative prompts work in Flux Kontext? No. Like Flux 2 base, Kontext does not honor negative prompts. The solution is to phrase your preservation requirements positively. "Keep the original lighting" works. "No new lighting" does not.

Can I batch process edits in Flux Kontext? Through the API, yes, easily. Through the official web playground, no, you are clicking one at a time. For batch work, run it through fal.ai, Replicate, or the BFL API directly. I covered the API-side pricing question in detail in my AI image generator API costs guide.

How much does Flux Kontext cost per image? Pro on the BFL native API is around $0.06 per image at the time of writing. Through fal.ai it lands closer to $0.04. Dev is roughly half that. Volume discounts apply for production accounts.

Does Flux Kontext preserve image metadata? No. Output is a fresh image with no EXIF data. If you need provenance tracking for commercial work, you will need to embed C2PA watermarks downstream or maintain your own log.

Can it edit text in images? Yes, surprisingly well. Asking it to change a sign that says "OPEN" to "CLOSED" works reliably. Asking it to render a long paragraph cleanly is still a stretch. Short text inside an image is fine.

Final Take

Flux Kontext is the first AI image editor I would call production-ready. Before this, I was using a combination of Photoshop, ComfyUI inpainting workflows, and prayer to land client edits. The Kontext era is different. The prompts above are the ones I have refined over months of daily use, and the structural pattern of naming the change plus naming the preservation set is the part that travels across all twelve recipes.

If you take one thing from this post, take this. Specificity is not optional. The model rewards clear instructions and punishes vague ones, and the gap between the two is the difference between landing the edit on the first try or burning twenty minutes on revisions. Bookmark the twelve recipes and adapt them to your own subjects. The structural skeleton will hold across most edits you will ever need.

Make AI images and video in your browser

Characters, video, photo packs. No GPU, no setup. Your first generation is free.

Try Apatero Free