Photomatix: Unleashing Photorealism in AI Art through the Stable Diffusion Base Model
Photomatix is a Stable Diffusion checkpoint merge based on SD 1.5. With the new SDXL model available (after the 2.1 fiasco), you may ask why it is even relevant now. The answer lies in the usability and community support of the 1.5 models. What makes Photomatix different from the other models? It performs surprisingly well across many resolutions and setups, and the results are comparable to the best merged and trained base models. It can easily create photorealistic output, but also various creative styles for concept design. We will take a closer look at the features and workflows in this article.
The Stable Diffusion Model
Stable Diffusion is a latent diffusion model (deep generative artificial neural network) developed by researchers from the CompVis Group at Ludwig Maximilian University of Munich and Runway with a compute donation by Stability AI. Due to its open access and open-source nature, it has become a go-to solution for many researchers, artists, and designers. Models based on SD 1.5, 2.1, and SDXL—which are mutually incompatible—are the most widespread now.
With so many SD merges and trained models available, it is sometimes hard to decide which is "better". It depends on the use. Since even the slightest changes in the datasets affect the output, think of a model as a resource to be explored rather than a guaranteed result.
Photomatix: Realistic Base Model Checkpoint for Stable Diffusion
Photomatix and Photomatix Inpainting were developed as tools, as AI art research experiments, and for testing trained LoRA and LyCORIS models. Both are merges of selected community-published open models.
You may learn more about the development and download the models here:
- Photomatix and Photomatix Inpaint: https://civitai.com/models/106055/photomatix
The point of the merge was higher resolution performance, flexibility, and photorealism.
Photomatix delivers superior handling of lighting, shadows, reflections, and other details that bring images to life. Outputs have a sense of depth and crispness at high resolutions.
Recommended Resolutions for Portraits
Photomatix can create quite consistent outputs at, for instance, 768x384, 768x512 (perhaps the most flexible one), 640x832, 640x960, 768x896, 1024x768, and 1112x768.
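To compare these sizes at a glance, here is a minimal Python sketch that prints the aspect ratio and pixel count of each one. It also checks that both sides are multiples of 8, since SD 1.5 works on a latent grid downsampled eight times. The helper name describe is only illustrative and not part of any tool.

```python
from math import gcd

# Resolutions listed above, as (width, height) pairs.
RESOLUTIONS = [
    (768, 384), (768, 512), (640, 832), (640, 960),
    (768, 896), (1024, 768), (1112, 768),
]


def describe(width: int, height: int) -> dict:
    """Summarize one resolution: aspect ratio, megapixels, latent-grid alignment."""
    d = gcd(width, height)
    return {
        "size": f"{width}x{height}",
        "aspect": f"{width // d}:{height // d}",
        "megapixels": round(width * height / 1_000_000, 2),
        # The SD 1.5 VAE downsamples by 8, so both sides should divide by 8.
        "multiple_of_8": width % 8 == 0 and height % 8 == 0,
    }


for w, h in RESOLUTIONS:
    print(describe(w, h))
```

All listed sizes pass the multiple-of-8 check. The same helper can be used to screen other resolutions before you test them.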
Using Style Mixing for Finetuning and Creative Experiments
You can create or edit your own styles with the Style Editor extension (or by editing the styles.csv file in your A1111 folder); saving styles is also possible directly from the A1111 GUI. You may test the styles I have prepared to work with Photomatix. With style mixing you can discover and invent unique and interesting new styles (just by brute-force prompt engineering and/or experiments).
- You can paste the styles into your own styles.csv (back up your file first!)
- You can overwrite the styles.csv (back up your file first!)
- You can also run A1111 with a different styles file, for instance:
--styles-file photomatix.csv
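If you prefer to script the "paste into your own styles.csv" option, the following Python sketch appends styles from one file to another and writes a backup first. It assumes the usual A1111 column layout (name, prompt, negative_prompt). The function name merge_styles and the .bak suffix are my own choices for illustration, not something A1111 provides.

```python
import csv
import shutil
from pathlib import Path

# Assumed column layout of an A1111 styles file.
FIELDS = ["name", "prompt", "negative_prompt"]


def merge_styles(target: Path, extra: Path) -> int:
    """Append styles from `extra` to `target`, skipping names already present.

    A backup of `target` is saved next to it as <name>.bak before writing.
    Returns the number of styles that were added.
    """
    existing = []
    if target.exists():
        shutil.copy2(target, target.with_name(target.name + ".bak"))
        with target.open(newline="", encoding="utf-8-sig") as f:
            existing = list(csv.DictReader(f))

    known = {row["name"] for row in existing}
    new_rows = []
    with extra.open(newline="", encoding="utf-8-sig") as f:
        for row in csv.DictReader(f):
            if row["name"] not in known:
                known.add(row["name"])
                new_rows.append(row)

    with target.open("w", newline="", encoding="utf-8-sig") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(existing + new_rows)
    return len(new_rows)
```

A call such as merge_styles(Path("styles.csv"), Path("photomatix.csv")) keeps your existing styles untouched and only adds names that are not there yet.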
Style sets are still under development, with a focus on minimal negative embeddings. Styles with "lora" in the title use LoRAs from the list of recommended LoRAs.
Photomatix Inpainting
You can inpaint with the normal version too (and sometimes the result is quite good); however, the Photomatix Inpaint version is simply better for big changes in the image. You may test both Photomatix and Photomatix Inpaint, depending on the use case.
Quick Tutorial on Inpainting in A1111
- Load your image (PNG Info tab in A1111) and Send to inpaint, or drag and drop it directly into img2img/Inpaint
- Create or modify the prompt as needed (if you have imported an image with generation data, do not forget to uncheck the former base model, if it is connected!)
- Load the inpainting version of the Photomatix model
- Mask the area you want to change and Generate (experiment first with Denoising strength 0.3-0.7, depending on the changes needed)
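The same steps can also be scripted. The sketch below is only an illustration under several assumptions: the web UI was started with the --api flag, it listens on the default local address, and it exposes the /sdapi/v1/img2img endpoint with the field names used here. The checkpoint name is a hypothetical placeholder that you need to replace with the name shown in your own model list.

```python
import base64
import json
import urllib.request
from pathlib import Path


def encode_image(path: Path) -> str:
    """Read an image file and return it as a base64 string for the API."""
    return base64.b64encode(path.read_bytes()).decode("ascii")


def build_inpaint_payload(image_b64, mask_b64, prompt, negative_prompt="",
                          denoising_strength=0.5,
                          checkpoint="photomatix_inpaint"):  # hypothetical name
    """Build the JSON body for an inpainting request (white mask = repaint)."""
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be between 0 and 1")
    return {
        "init_images": [image_b64],
        "mask": mask_b64,
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "denoising_strength": denoising_strength,
        # Assumed way to select the inpainting checkpoint for this call only.
        "override_settings": {"sd_model_checkpoint": checkpoint},
    }


def post_img2img(payload, base_url="http://127.0.0.1:7860"):
    """Send the payload to a locally running web UI; returns base64 images."""
    request = urllib.request.Request(
        f"{base_url}/sdapi/v1/img2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["images"]
```

Starting in the 0.3-0.7 range for denoising_strength mirrors the advice above: lower values stay close to the original pixels, while higher values allow bigger changes inside the mask.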
Upscaling and Hires fix
For image upscaling and Hires. fix, get additional upscaler models and put them in the proper model directories:
- 1x_NMKDDetoon_97500_G (https://huggingface.co/utnah/esrgan/tree/main) goes into the models/ESRGAN folder
- 4x_RealisticRescaler_100000_G.pth goes into models/realESRGAN (https://upscale.wiki/wiki/Model_Database; you will find the following models there too)
- 4x-UltraSharp.pth and 4x-UltraScale9_V0.5 BETA.pth go into models/ESRGAN
- UltraMix files go into models/ESRGAN
Leveraging LoRAs and Embeddings for Enhanced Performance and Photorealism
As usual, you can insert and mix Textual inversion, Hypernetwork, LoRA, and LyCORIS models in quite extensive stacks.
Recommended LoRAs
To install, download the files into your A1111 stable-diffusion-webui/models/Lora folder, then refresh the Lora list and insert them from there.
- Adding diffusion noise and texture to image: Entropy
- Dealing with dark dim images: LowRA
- "Overexposing" the image: Lit
- Managing details: More Details
- Simplifying the composition: Advanced Enhancer
- Experimenting with exaggerated looks in styles: Portrait Helper XL
- More custom models on my profile on Civitai
Recommended Embeddings
To install, download the files into your A1111 stable-diffusion-webui/embeddings folder, then refresh the Textual inversion list and insert them from there. A negative embedding goes into the negative prompt; an effect or style goes into the normal prompt.
- Negative prompt: FastNegativeV2
- Negative prompt: EasyNegative
- Negative prompt: Deep Negative
- The order of embeddings matters and affects the result
- There are many negative embeddings; these are perhaps the most common
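As a small illustration of how LoRAs and embeddings end up in the prompt fields, here is a Python sketch that assembles both prompts. It relies on the A1111 conventions that a LoRA is called with a <lora:name:weight> tag and an embedding by its file name. The LoRA and embedding names in the usage example are placeholders and must match your downloaded files.

```python
def lora_tag(name: str, weight: float = 1.0) -> str:
    """Format one LoRA call in the <lora:name:weight> prompt syntax."""
    return f"<lora:{name}:{weight:g}>"


def build_prompts(subject, loras=(), negative_embeddings=(), negative_extra=""):
    """Return (positive, negative) prompt strings.

    loras: iterable of (file_name, weight) pairs, appended to the positive prompt.
    negative_embeddings: embedding names, kept in the given order because
    the order influences the result.
    """
    positive_parts = [subject] + [lora_tag(n, w) for n, w in loras]
    negative_parts = list(negative_embeddings)
    if negative_extra:
        negative_parts.append(negative_extra)
    return ", ".join(positive_parts), ", ".join(negative_parts)


# Usage with placeholder file names:
positive, negative = build_prompts(
    "portrait photo of a sailor, natural light",
    loras=[("more_details", 0.6)],
    negative_embeddings=["FastNegativeV2"],
)
print(positive)
print(negative)
```

Keeping the negative list short fits the minimal negative prompt approach: start with one embedding and add more only if the output needs it.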
Read more in the upcoming article on Advanced Synthetic Photography.
Recommended Extensions for Composition and Photorealism
ControlNet, ADetailer, Regional Prompter, Latent Couple, Roop, Dynamic Thresholding, Latent Mirroring. These also work well in most combinations with Hires. fix (in txt2img). Note that some LoRAs or effects can interfere with some extensions; for possible issues, check the terminal output window. Read more in the articles on ControlNet, Regional Prompter, and Synthetic photography.
Comparing Photomatix to Other AI Art Models
Compared to other AI art models, Photomatix stands out for its focus on photorealism and its ability to generate high-quality images. Its performance is comparable to some of the best trained and merged models available, making it an interesting weapon of choice in the jungle of SD models. Custom LoRAs, embeddings, LyCORIS models, and hypernetworks work robustly and stably with it, which further helps its universal usability. One of the main pluses is a certain stylistic neutrality, but you can also get a bold style if you ask for it.
Pros and Cons
- Strong points: Consistent, Universal, Realistic, Balanced, Flexible, Works well with extensions, Works great with LoRAs and styles, Simple negative prompt
- Weak points: Hands and feet (sometimes), Characters handling objects
Photomatix Development
Photomatix and Photomatix Inpaint were developed using these models and merges:
- https://civitai.com/models/39044/c3
- https://civitai.com/models/49934/crystal-clear2
- https://huggingface.co/runwayml/stable-diffusion-inpainting
- https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/main
The model will be further developed and refined (read more about v2 here).
Conclusion
Photomatix, with its Stable Diffusion 1.5 foundation and community-merge pedigree, is an interesting option in the world of generative visual AI art. Its ability to generate photorealistic images that can be hard to tell apart from real photographs makes it a valuable visualization tool for artists, designers, and enthusiasts alike. As we continue to push the boundaries of what AI can achieve, Photomatix invites a creative journey limited only by imagination. See the overview of some strategies for creating realistic images with SD 1.5.
Finally, thanks to the Stable Diffusion, Hugging Face, Civitai, and A1111 communities and contributors for their hard work. Without their collective effort and resources, such projects would not be possible.