Tattooed Gymbro going on a date
Anime artwork of muscular man, 40 years old, mature, caucasian man, tank top, shorts, baseball cap, going to a date, nike air max plus sneakers, on the side of the road… — this was the actual text prompt used in Stable Diffusion, combined with a depth control net based on a photo of Ben Wedgie holding flowers on the sidewalk.
Workflow
For Ben’s Organ, I used img2img to get a final image closely matching the original photo. But img2img has limitations, specifically when it comes to color. Stable Diffusion img2img closely follows the color information of the source image, so when my goal is to take only the general composition from a source image, I use control nets instead.
Depth CN (control net) is usually my go-to module for reimagining an image. Many AI artists tend to use Pose CN in conjunction with Soft Edge CN. I use depth because it usually gets the general pose and muscle mass without being overly restrictive about what the forms are.
This lets me use the text prompt to create wide variations on what the source depicts. For example, I could turn the flowers in the original image into a gym bag and replace the buildings with a gym. I could also change the colors of the original drastically, all while keeping its compositional detail.
I prefer rendering images in a square ratio. So for the control net image, I first expanded the photo using Photoshop AI expand, a technique known in Stable Diffusion as outpainting.
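To give a feel for how much canvas the expansion has to fill, here is a minimal sketch that computes the padding needed to turn a rectangular photo into a centered square. The function name and the example dimensions are my own assumptions; the post does not state the size of the source photo.

```python
def square_canvas(width, height):
    """Return the square side length and the padding to add on each edge.

    The photo is kept centered; the padded area is what outpainting would fill.
    """
    side = max(width, height)
    pad_x = side - width
    pad_y = side - height
    left = pad_x // 2
    top = pad_y // 2
    return {
        "side": side,
        "left": left,
        "right": pad_x - left,
        "top": top,
        "bottom": pad_y - top,
    }


# Hypothetical portrait photo; the real dimensions are not given in the post.
print(square_canvas(768, 1024))
```

For a portrait photo, all of the padding lands on the left and right, which is why the expanded control image gains extra street scene on both sides of the subject.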
When using txt2img with control nets, the text prompt is always king, so you don’t need to depend on denoising strength, as you would in img2img, to fully realize what you want.
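As a rough illustration, the same setup could be written as a request body for the Automatic1111 web API. This is only a sketch under assumptions: I did not use the API for this piece, the field names follow my understanding of the `/sdapi/v1/txt2img` endpoint and the ControlNet extension (older extension versions may expect `input_image` instead of `image`), and `build_txt2img_payload`, `control_image_b64` and `controlnet_model` are hypothetical names. The Guidance End value is left unset because it is not recorded in the settings.

```python
def build_txt2img_payload(prompt, control_image_b64, controlnet_model):
    """Build a txt2img request body with one depth ControlNet unit.

    Assumed field names; check them against your installed web UI and
    ControlNet extension versions before relying on this.
    """
    return {
        "prompt": prompt,
        "steps": 50,
        "sampler_name": "DPM++ 2M Karras",
        "cfg_scale": 7,
        "width": 512,
        "height": 512,
        # No img2img denoising strength on the base pass: in txt2img the
        # prompt and the control net decide the result.
        "alwayson_scripts": {
            "controlnet": {
                "args": [
                    {
                        "enabled": True,
                        "image": control_image_b64,  # assumption: base64 control image
                        "module": "depth_midas",
                        "model": controlnet_model,  # assumption: not named in the post
                        "weight": 1.0,
                        "guidance_start": 0.0,
                    }
                ]
            }
        },
    }


payload = build_txt2img_payload(
    "Anime artwork of muscular man, tank top, shorts, baseball cap",
    "<base64 of the outpainted photo>",
    "<your depth control net model>",
)
print(sorted(payload.keys()))
```

The point of the sketch is what is missing: the base pass carries no source-image color at all, only the depth map, so the prompt is free to repaint everything.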
Images
Tech
- Stable Diffusion txt2img with control nets
- Steps: 50, Sampler: DPM++ 2M Karras, CFG scale: 7, Size: 512x512
- Model hash: 59bcb95ef6, Model: virileAnimation_v10
- VAE hash: 63aeecb90f, VAE: vae-ft-mse-840000-ema-pruned.safetensors
- Hires upscale: 3, steps: 50, upscaler: 8x_NMKD-Superscale_150000_G, Denoising: 0.5, 1536x1536
- ADetailer model: face_yolov8n.pt, steps: 50, ControlNet module: tile_resample
- ControlNet 0: depth_midas, Weight: 1, Guidance Start: 0, Guidance End
- Automatic1111: v1.6.0-2-g4afaaf8a
- Gigapixel HQ 4x, 6144x6144
- Adobe Photoshop AI for outpainting
- Adobe Lightroom
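The sizes in the list above follow from simple multiplication: the 512 px base render, the 3x hires pass, then the 4x Gigapixel pass. A small sketch (the helper name is my own) makes the chain explicit:

```python
def upscale_chain(base, factors):
    """Return the edge length in pixels after each successive upscale."""
    sizes = [base]
    for factor in factors:
        sizes.append(sizes[-1] * factor)
    return sizes


# 512 base render -> hires upscale 3x -> Gigapixel 4x
print(upscale_chain(512, [3, 4]))  # [512, 1536, 6144]
```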