Composing Complex Scenes with Multiple Subjects in Midjourney

PLUS: major updates from OpenAI, Amazon, and Meta

In this newsletter, read about:

  • 🕵️‍♀️ Building Complex Compositions with Multiple Subjects

  • 🗞 News and Top Reads

  • 📌 AI Art Tutorial: DALL-E 3

  • 🎨 Featured Artist: Joann

  • 🖼 AI-Assisted Artwork of the Week

  • 🤓 A Comprehensive Midjourney Guide

🕵️‍♀️ Building Complex Compositions with Multiple Subjects

Midjourney’s current technical constraints usually make it very challenging to depict multiple distinct subjects in one image. The subjects tend to blend together, and Midjourney often fails to capture all the details you mention in the prompt when you ask for a complex composition with several subjects.

In this article, I want to show you how you can approach the problem step by step using the Vary (Region) feature in combination with Remix and Vary (Subtle).

As a running experiment for this post, we’ll try to create an image of a military man saying goodbye to his family on a train platform. Let’s say we want to see the man kneeling down to his son, with his wife standing behind the boy.

Here’s the prompt I started with (note: I needed to include “the 2020s” to avoid getting World War II images).

a photo of a man in the military uniform saying goodbye to his family at the train platform, the man is kneeling down to his son, the wife is standing behind the son, the 2020s --ar 16:9

Surprisingly, all the images I got showed daughters rather than sons, and the image below was the closest to my initial request.

Let’s start with this image and see how we can edit it step by step to add all the missing components.

First of all, I used Vary (Region) to replace the girl with a boy.

a photo of a man in the military uniform kneeling down to his son, the 2020s --ar 16:9

Next, I applied the Vary (Region) feature again to get an actual train behind the boy.

a train behind a man in the military uniform kneeling down to his son, the 2020s --ar 16:9

Then, I tried to get a woman behind the boy, which turned out to be quite tricky and required multiple iterations. First, I selected the area behind the boy in Vary (Region) and got the following image, which was the closest to what I wanted and also the least weird one 🙂.

a nice woman in the black jacket standing behind her son on the train platform, with her arm around his shoulder, the 2020s --ar 16:9

Note that I didn’t actually care about the jacket or its color, but specifying it helped me get realistic women from the same era instead of images like this 🙂

Then, I used Vary (Subtle) on the image with the woman in the black jacket to get even closer to what I was looking for.

a photo of a man in the military uniform kneeling down to his son on the train platform, the wife is standing behind the son with her arm around his shoulder, the 2020s --ar 16:9

The image finally includes all the required components, and I’ve got a side view of the woman as I imagined, but there are still too many issues with this image, especially when it comes to the woman. So, I used Vary (Region) again to fix the woman’s dress.

the wife is standing behind the son with her arm around his shoulder, the 2020s --ar 16:9

Then, as a final step, I looked for a way to improve the quality and realism of the entire image. Specifically, I applied Vary (Subtle) to the image above and experimented a bit with different Midjourney versions and photography-related vocabulary. Here’s the prompt that produced the result I liked.

a military man kneeling down to his son on the train platform, the wife is standing behind the son, the 2020s, editorial photography --ar 16:9 --v 5.1 --style raw

I added “editorial photography” to get a higher-quality photo, while --v 5.1 and --style raw seem to add to the realism of the image.
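If you iterate on many prompts like the ones above, it can help to assemble them from parts so that the parameters stay consistent between steps. Below is a minimal Python sketch of that idea. Note that `build_prompt` is a hypothetical helper written for this illustration, not part of any Midjourney tooling: it only formats text that you would then paste into Midjourney yourself. The parameters (--ar, --v, --style) are the ones used in the prompts in this post.

```python
def build_prompt(description, keywords=(), **params):
    """Join a description, optional keywords, and --parameters into one prompt.

    Hypothetical helper for illustration only: it formats a string and does
    not talk to Midjourney in any way.
    """
    # Description first, then extra keywords, separated by commas.
    parts = [description.strip(), *[k.strip() for k in keywords]]
    text = ", ".join(p for p in parts if p)
    # Each keyword argument becomes a "--name value" parameter, in call order.
    flags = " ".join(f"--{name} {value}" for name, value in params.items())
    return f"{text} {flags}".strip()


final_prompt = build_prompt(
    "a military man kneeling down to his son on the train platform, "
    "the wife is standing behind the son",
    keywords=["the 2020s", "editorial photography"],
    ar="16:9",
    v="5.1",
    style="raw",
)
print(final_prompt)
```

Keeping “the 2020s” and the aspect ratio in one place makes it harder to forget them when you shorten the description for a Vary (Region) step.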

The image is probably not perfect yet and you can still catch quite a few signs of its AI origin, but it’s great to see how much more control we can have with the latest Midjourney features. Looking forward to what’s coming next!
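To recap, the sequence of edits in this experiment can be written down as a simple log. The Python sketch below is only a way to summarize the steps described in this post; the step descriptions are my own shorthand, and nothing here interacts with Midjourney.

```python
from collections import Counter

# (feature used, goal of the step), in the order described in this post.
steps = [
    ("Initial prompt", "generate a starting image of the farewell scene"),
    ("Vary (Region)", "replace the girl with a boy"),
    ("Vary (Region)", "add a train behind the boy"),
    ("Vary (Region)", "add a woman behind the boy"),
    ("Vary (Subtle)", "move closer to the intended composition"),
    ("Vary (Region)", "fix the woman's dress"),
    ("Vary (Subtle)", "improve overall quality and realism"),
]

# Print the workflow as a numbered list.
for number, (feature, goal) in enumerate(steps, start=1):
    print(f"{number}. {feature}: {goal}")

# Count how often each feature was used.
usage = Counter(feature for feature, _ in steps)
print(usage)
```

The log makes the overall pattern easy to see: local fixes with Vary (Region) do most of the work, while Vary (Subtle) is used to refine the image as a whole.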

🗞 News and Top Reads

  • OpenAI releases DALL-E 3, a new text-to-image AI model that can generate more accurate and detailed images than previous systems.

    • The feature will soon be integrated into ChatGPT (for Plus and Enterprise users), making it easier for users to create images from their ideas.

    • ChatGPT also added an audio response feature, allowing users to choose between five different voices.

  • Getty Images introduced an AI generator that is only trained on its licensed images.

    • According to Getty, this means that anyone using the tool and publishing the image it created commercially will be legally protected.

    • The tool will be priced separately from a standard Getty Images subscription, and the pricing will be based on prompt volume.

  • Amazon has announced a new Alexa voice assistant powered by its own large language model (LLM), specifically designed for smart home control and more natural conversations.

    • The new Alexa will be able to handle interruptions, change requests, and vague queries in a more human-like way.

    • Amazon is cautiously rolling it out in the United States first to test its real-world performance before expanding access.

  • Meta is said to be making dozens of AI chatbots with different personalities to make its apps more appealing to younger users.

    • Some of the planned bots include Bob, a sassy robot modeled after Bender from Futurama, and Alvin, a human-curious alien.

    • This could be a smart move by Meta, given the success of Character.ai, a chatbot app that allows users to interact with AI characters with different personalities.

📌 AI Art Tutorial: DALL-E 3

In this video, Olivio Sarikas provides an in-depth review of DALL-E 3. He demonstrates the tool's capabilities, discusses its limitations, explains its integration into ChatGPT, and outlines the usage restrictions imposed by OpenAI for the new DALL-E version.

🎨 Featured Artist: Joann

Joann is an AI artist, illustrator, and graphic designer. She creates fascinating original artwork using AI tools. Joann entered the AI art world with her incredible Inflatable Wonders series, which envisions iconic monuments as large inflatables. Then came the Luminescent Town series, which stands as a testament to the harmonious union of artistry and machine intelligence. Check out her latest artwork @joooo.ann.

🖼 AI-Assisted Artwork of the Week

🤓 A Comprehensive Midjourney Guide

To get a link to a comprehensive Midjourney guide, please subscribe to this newsletter. The guide is a dynamic document, which I intend to keep up-to-date with the latest Midjourney updates.

Share Kiki and Mozart

If you enjoy this newsletter and know someone who might also appreciate it, please feel free to share it with them. Let's spread the word about AI art and introduce more people to this fascinating field!
