Kiss, SDXL, and Oil
Words alone cannot describe how excited I am about SDXL, Stable Diffusion’s latest and greatest model. Many people have been calling SDXL the “Midjourney killer,” and for good reason. The results are simply stunning.
I hadn’t gotten around to fully exploring SDXL until today, and that was made possible in part by SeargeSDXL, which I tried out; it provides an excellent workflow and settings out of the box, and it makes using ComfyUI a lot less daunting.
As I wrote previously, moving forward, when I prompt for paintings, I will try to use historic LGBT artists as suggested by a good friend and design mentor.
For these paintings, I’ve used a mix of J. C. Leyendecker (Wikipedia) and John Singer Sargent (Wikipedia). Unfortunately, I don’t know which artist I used for which painting unless I look closely at the metadata. I will try to embed that metadata in the outputs and parse the EXIF on this site so you can see those associations visually. For now, I can only say that each of these paintings uses one of the two, and I will try to keep them separate going forward. I’m simply too excited about the results!
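As a side note on that plan, here is a tiny sketch of how generation metadata might be read back out of an output image. ComfyUI typically stores its workflow as JSON in PNG text chunks rather than EXIF, so the file name and the "prompt" key below are assumptions for illustration, not something this site actually does yet.

```python
# Sketch of reading generation metadata back out of an output image.
# ComfyUI usually writes its workflow as JSON into PNG tEXt/iTXt chunks
# (not EXIF); the file name and the "prompt" key are assumptions.
import json
from PIL import Image

img = Image.open("output.png")
metadata = img.text if hasattr(img, "text") else {}  # PNG text chunks, if present
workflow = json.loads(metadata.get("prompt", "{}"))  # hypothetical key
print(workflow)
```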
Prompts
Unlike the earlier Stable Diffusion models (1.4, 1.5, 2.0), which respond best to keyword-token prompts, SDXL handles natural-language prompts, much like Midjourney interprets them.
By far one of the harder things to teach new users of Stable Diffusion is how to break concepts into keyword tokens. With SDXL, you no longer have to, though with workflows like SeargeSDXL you can still provide keywords as a secondary prompt so the model can better understand what you’re trying to achieve.
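To make that concrete outside of ComfyUI, here is a minimal sketch using Hugging Face’s diffusers library (not the SeargeSDXL workflow I used for these images). SDXL has two text encoders, so you can pair a natural-language prompt with a keyword-style one; the prompts here are purely illustrative.

```python
# Minimal sketch with diffusers: pairing a natural-language prompt with keyword tokens.
# Not the ComfyUI/SeargeSDXL path used for the images in this post.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")  # or "mps" on Apple Silicon

image = pipe(
    prompt="an oil painting of two young men sharing a tender kiss in a sunlit studio",  # natural language
    prompt_2="oil on canvas, warm light, 1920s fashion, rich brushwork",                 # keyword tokens
    num_inference_steps=30,
    guidance_scale=7.0,
).images[0]
image.save("kiss.png")
```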
UIs for SDXL
When SDXL 0.9 was leaked, I downloaded it and tried it for a little while. At that time, ComfyUI (Github / GD Docs) was the only UI that could support the SDXL model. I was very impressed, but having to create a custom workflow from scratch for ComfyUI was frankly quite daunting. Although I have used many node-based editors in 3D software before, the UX for Comfy leaves a lot of room for improvement. Then there’s the question of how to truly set up a workflow that will make use of all that SDXL can offer. So I merely played with it for a bit and didn’t get very far.
Fast-forward to today: all the popular Stable Diffusion UIs have added support for SDXL. Automatic1111 (Github / GD Docs) and SD.Next (Github) now both support SDXL out of the box.
That said, I think that the ComfyUI implementation is still a lot better, given that the workflow for SDXL is not very linear.
SeargeSDXL
To get the best out of the SDXL experience, I recommend getting the SeargeSDXL custom nodes for ComfyUI (Github). It comes with a very advanced workflow that can create beautiful images out of the box. The images I have posted today were all rendered with the default settings of the version 3.4 workflow.
Whereas Automatic1111 sticks with the familiar positive and negative prompts, SeargeSDXL splits the prompt into five components, which may seem overwhelming at first but are in fact extremely usable and sensible:
- Main Prompt. Describe in natural language.
- Secondary Prompt. Help the main prompt with keyword tokens, separated by commas.
- Style. Medium, in the style of Artist, qualifiers — yup, Midjourney style!
- Negative Prompt. Typical neg prompt. Comma-separated list, e.g. jpeg artifacts, women, signature, etc.
- Negative Style. e.g. Anime, photography (if you want paintings)
All the other settings you need are clustered together. I really don’t think I could have gotten these results had it not been for SeargeSDXL. Highly recommended!
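To show how the five fields might be filled in for one of these paintings, here is a hypothetical example. The values are mine, written for illustration; they are not the workflow’s defaults and not the exact prompts behind the images in this post.

```python
# Hypothetical values for the five SeargeSDXL prompt fields.
# Illustrative only; not the workflow defaults or my actual prompts.
prompt_fields = {
    "main_prompt": "two young men sharing a quiet kiss in a sunlit painter's studio",
    "secondary_prompt": "tender embrace, linen shirts, window light, 1920s",
    "style": "oil painting, in the style of J. C. Leyendecker, dramatic lighting",
    "negative_prompt": "jpeg artifacts, signature, watermark, text, extra limbs",
    "negative_style": "anime, photography, 3d render",
}
```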
SDXL Rendering Speed
Another consideration is that SDXL is trained on 1024x1024 images, so you’ll get the best results rendering at 1024x1024, but that also means you’ll need a fairly beefy GPU. I run an M2 Max with 96 GB of RAM, so memory is not an issue given Apple Silicon’s unified memory architecture, but it will never be a match for an RTX 4090.
Locally on my M2 Max, each complete render through the SeargeSDXL version 3.4 workflow takes approximately 70-90 seconds, so it’s not fast. The results are good enough that I’m happy to queue renders and wait while I work on other things.
I also tried running ComfyUI with SDXL in the cloud using an available ComfyUI Docker image, but the last time I did, it wouldn’t load the SDXL model. I haven’t attempted it since, but I will give it another go soon.
SDXL Dreamability
One thing worth mentioning about SDXL is that it produces images with high variance from the same prompt. This is something we previously only saw with Midjourney, but with SDXL, Stable Diffusion has changed the playing field completely.
Images
Endnote
I am in the process of running a few more renders with SDXL today; these are just my first tries after playing with it for a little bit. I expect that once I am familiar with everything SeargeSDXL has to offer, especially once I figure out how to use ControlNets with this workflow, I will be able to generate very high-quality images in the near future.
Technical Parameters
Tech: 30 steps, CFG 7, DPM++ 2M Karras, 1024x1024. Base Ratio: 0.8. Refiner Strength: 0.75. Refiner Intensity: Hard. Upscale: 2.0. Denoise: 0.5. Hires Fix: disabled. Base: sd_xl_base_1.0. Refiner: sd_xl_refiner_1.0. VAE: sdxl-vae. Main upscale: 4x_NMKD-Siax_200k. Support upscale: 4x-UltraSharp. LoRA: sd_xl_offset_example-lora_1.0.
Post: Topaz Gigapixel HQ 2x (2048x2048). Adobe Lightroom (color correction and minor edits).
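For readers who would rather approximate a comparable setup in code than in ComfyUI, here is a rough diffusers sketch of the base-to-refiner handoff implied by the parameters above (30 steps, CFG 7, DPM++ 2M Karras, 1024x1024, base ratio 0.8). This is my approximation, not the SeargeSDXL workflow: the upscalers, the offset LoRA, and the Searge-specific refiner strength/intensity and denoise settings don’t map one-to-one and are omitted. The prompt is illustrative.

```python
# Rough approximation of the base/refiner split above using diffusers.
# Not the SeargeSDXL workflow; upscalers, the offset LoRA, and
# Searge-specific refiner settings are omitted.
import torch
from diffusers import (
    StableDiffusionXLPipeline,
    StableDiffusionXLImg2ImgPipeline,
    DPMSolverMultistepScheduler,
)

base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
base.scheduler = DPMSolverMultistepScheduler.from_config(
    base.scheduler.config, use_karras_sigmas=True  # DPM++ 2M Karras
)

refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    text_encoder_2=base.text_encoder_2,
    vae=base.vae,
    torch_dtype=torch.float16,
).to("cuda")
refiner.scheduler = DPMSolverMultistepScheduler.from_config(
    refiner.scheduler.config, use_karras_sigmas=True
)

prompt = "an oil painting of two young men kissing, warm window light"  # illustrative

# The base model handles the first 80% of the 30 steps (Base Ratio: 0.8),
# then hands its latents to the refiner for the remaining 20%.
latents = base(
    prompt=prompt,
    width=1024,
    height=1024,
    num_inference_steps=30,
    guidance_scale=7.0,
    denoising_end=0.8,
    output_type="latent",
).images

image = refiner(
    prompt=prompt,
    image=latents,
    num_inference_steps=30,
    guidance_scale=7.0,
    denoising_start=0.8,
).images[0]
image.save("kiss_sdxl.png")
```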