Consistent A.I. Characters in MidJourney

About a month ago (as I write this) I was able to find a method to generate a relatively consistent character in MidJourney for a project that we are working on.

Two methods, in fact, one of which is much easier than the other.

The first method wasn't a difficult process but it also was not as straight-forward as just asking the bot to generate an image of a specific character. It involved generating a specific image, editing out the background, using that to generate a character sheet, editing out the ones that I liked, and using all of those to create a composite/definitive version of the character, get the MidJourney seed number for that version, and then use minor tweaks to the original prompt along with the seed number to get relatively consistent characters:

A character sheet of an AI-generated character created manually in MidJourney.

Now, obviously, these characters wouldn't be usable without individual edits for consistency, but it's the closest I've ever gotten and it would work really well as a method of storyboarding a story before hiring an illustrator.

Feeling pretty proud of myself, I shared the method I was using with a few people and showed off some results, and then fast forward less than a week (it was only five days!) and MidJourney released a new feature that makes it incredibly easy to do the above without all of the work. Here are the results I got in a fraction of the time by using the new character reference feature on my original composite image I'd created:

A character sheet of an AI-generated character created the new character reference feature in MidJourney.

It's really good. Maybe not quite as good as the original ones that I created, but for storyboarding it is definitely much faster and certainly good enough.

MidJourney Parameters for Consistent Characters

You only need one parameter to use this new feature, but there are 4 parameters that can help you control and fine tune your prompts.

--cref: Character Reference

So, how do you use this new feature for yourself? It is relatively easy and is a 3-step process:

  1. Generate a reference image of the character you want to create in different poses or wearing different clothes, etc. (You can bring your own image, but it will not work as consistently if it wasn't an image generated by MidJourney.)
  2. Create a new prompt of what you'd like that character to be doing, or how you'd like them to be posed, or what they are wearing, using the MidJourney or Niji version 6 models. Include the --crea parameter with a URL to your reference image from step 1.
  3. Repeat step 2 as often as necessary to get all of the poses that you want.

The --cref parameter stands for the character reference and accepts a single URL linking to a MidJourney generated character. Ideally, there will only be one person in each reference image. Precise details may not carry forward, and you can use multiple image references in a single prompt although I have not had much luck with that, personally. Your regular prompt criteria must go before the --cref parameter.

--cw: Character Weight

There is also a second parameter that you can add to the end of your prompt: --cw

--cw is the Character Weight parameter, and it sets the strength of characterization. You can assign it a value anywhere between 0 and 100, and it defaults to 100. This means that it will try to match the character's facial features, hairstyle, and clothing. The closer you get to 100, the less it will consider hairstyle or clothing. (Even at a weight of zero, it will still try to mimic the facial features.)

--sref and --sw: Style Reference & Weight

The --sref parameter will provide a consistent style to your image by providing a style reference image that MidJourney will try to duplicate. Unlike with the character reference, if you have multiple images in the same style already, it can make a big difference in having consistent images that don't change styles on you.

The --sw parameter lets you control the style weight and like the character weight it will default to 100. Unlike the character weight, however, you can specify anything between 0 and 1000, so you can actually give your style weight a much larger influence.

Case Study with Example Prompts:

Here are some specific prompts where you can follow along for yourself and see the results.

We'll start by generating options for the initial character and then upscaling our favorite:

A cartoon of a gritty man with a gray beard and fedora, standing with his hands in pockets, on a white background.
a gritty detective with a gray beard and a fedora, full body, white background, flat cartoon style --ar 2:3 --v 6.0

I like this guy, but he looks a bit grumpy, so let's see if we can get a version of him in a better mood:

4 portraits in different styles of a man with a gray beard and a fedora.
smiling, full body, white background --cref https://s.mj.run/KYyQakTHNJ4 --ar 2:3 --v 6.0

You'll notice that we didn't need to provide the full descriptive prompt that we used previously, and got 4 pictures that look relatively similar as the same character. However, none of them are in the same flat cartoon style that we started with. We could explicitly call that out in each prompt, or we can use the style reference parameter:

4 portraits of cartoon gentlemen with a gray beard and fedora.
smiling, full body, white background --cref https://s.mj.run/KYyQakTHNJ4 --sref https://s.mj.run/KYyQakTHNJ4 --ar 2:3 --v 6.0

These are closer to the right style, but not looking quite right. Let's give the style reference a little extra help, by using an additional sample image in the right cartoon style, as well as increasing the style weight to 900 instead of the default which is 100:

4 men drawn in a cartoon style with gray beards and a fedora. A couple of them look much closer to our original image.
smiling, full body, white background --cref https://s.mj.run/KYyQakTHNJ4 --sref https://s.mj.run/KYyQakTHNJ4 https://s.mj.run/MnGKirxzYsM --sw 900 --ar 2:3 --v 6.0

Not a fan of him wearing a fedora and trench coat? Let's see what our guy does during his leisure hours:

An old man wearing a brown hat and a polo shirt.
wearing a baseball hat and t-shirt, looking happy --cref https://s.mj.run/KYyQakTHNJ4 --cw 0 --sref https://s.mj.run/KYyQakTHNJ4 https://s.mj.run/MnGKirxzYsM --sw 90 --v 6.0
An old man wearing a brown hat and a short sleeve button down with a smile
wearing a baseball hat and t-shirt, big smile, looking happy --cref https://s.mj.run/KYyQakTHNJ4 --cw 0 --sref https://s.mj.run/KYyQakTHNJ4 https://s.mj.run/MnGKirxzYsM --sw 90 --v 6.0
An old man wearing a brown hat and a polo shirt. He has a large smile.
big smile --v 6.0

You do sometimes need to be explicit with how you want your expressions to come out. For example, in the first example above, I asked for him to look happy, but should have asked for a big smile like I got in the second example. Alternatively, if something isn't quite right, you can just vary the specific region to change an expression, as in the third example.

A more cartoony version of our old guy out in the rainswept streets
dancing in the rain with an umbrella, pink shirt --no hat jacket coat --cref https://s.mj.run/KYyQakTHNJ4 --cw 10 --sref https://s.mj.run/KYyQakTHNJ4 https://s.mj.run/MnGKirxzYsM --sw 500 --v 6.0
A more cartoony version of our old guy out in the rainswept streets
dancing in the rain with an umbrella, pink shirt --no hat --cref https://s.mj.run/KYyQakTHNJ4 --cw 10 --sref https://s.mj.run/KYyQakTHNJ4 https://s.mj.run/MnGKirxzYsM --sw 500 --v 6.0

Not everything that you ask for will be usable. Dancing with an umbrella doesn't really match our character well enough and I'd want to re-generate these images, but along with a low character weight, you can use negative keywords using the --no parameter to get rid of certain elements of your reference image. In this case, his hat.

Case Study #2 – With Locations

In the first case study, I went with a white background so I could isolate the character, because that's what I did with my more difficult “manual” method of consistent character generation before the new feature was introduced. However, you can use any images that you generate for a character.

Let's start with the same basic character but with a different look and put him into an actual space:

A cartoon man with a beard in a coat and fedora standing in an office next to a bookshelf
gritty detective with a gray beard and a fedora, big smile, full body, standing in his office, flat cartoon style --v 6.0

He came out looking a little Henry Jones Sr.-ish, but could be a fun character. Rather than going through all of the same steps we did previously, let's try putting him into some new situations:

A man sitting at his desk and in a good mood.
sitting at a desk, flat cartoon style --cref https://s.mj.run/uU4i_tDy3ao --sref https://s.mj.run/QlFbzHgzCOI --v 6.0
An old man facing away from the viewer and only wearing shorts and a hat at a water park
playing at a water park wearing a sun hat --cref https://s.mj.run/uU4i_tDy3ao --cw 0 --sref https://s.mj.run/QlFbzHgzCOI --v 6.0
An older man with a beard wearing a visor and sunglasses watching a car race
watching a stock car rally wearing his favorite teams colors and sunglasses, cartoon style --no hat fedora --cref https://s.mj.run/uU4i_tDy3ao --cw 0 --sref https://s.mj.run/QlFbzHgzCOI --v 6.0
a man with a beard and a baseball hat looking at a refrigerator full of milk jugs
looking in the milk cooler --cref https://s.mj.run/uU4i_tDy3ao --cw 0 --sref https://s.mj.run/QlFbzHgzCOI --v 6.0

Case Study #3 – Mixed Nijijourney

“Niji” is (I think) the Western pronunciation of the Japenese word for “rainbow” and is a model used to create anime or illustrative images on MidJourney. (Nijijourney is a fan reference for Midjourney using Niji version 6.) There are multiple styles you can use with some of the older Niji models, but at least as of this week, version 6 only supports the “raw” style which is the default so there's no need to try using the others.

The character reference command is not compatible with the older Niji models.

To start, here's a character we created live in a demo on last week's Ask us Anything live training based on participant feedback:

Portrait of an older woman with her hair in a bun reading a book, animated cartoon style
holding a book, pixar style --cref https://s.mj.run/0OHy2Qp6VoQ --ar 2:3 --v 6.0

And here she is switching models from MidJourney version 6 (alpha) to Niji version 6, and in a few different situations:

4 portraits of an old woman holding a book
holding a book --cref https://s.mj.run/0OHy2Qp6VoQ --ar 2:3 --niji 6
4 portraits of an old woman holding a book
holding a book, pixar style --cref https://s.mj.run/0OHy2Qp6VoQ --ar 2:3 --niji 6
4 portraits of an old woman in shorts and a t-shirt out for a run
running on a sidewalk wearing shorts and a t-shirt, full body, pixar style --no book books 📚 --cw 20 --cref https://s.mj.run/0OHy2Qp6VoQ --ar 2:3 --niji 6
4 portraits of an old woman sitting next to (or in) a fountain with a cup of coffee
sitting next to a fountain drinking a coffee, pixar style --no book books 📚 --cw 20 --cref https://s.mj.run/0OHy2Qp6VoQ --niji 6

For the most part, the portraits came out reasonably close, although I didn't share the horror-show that was trying to get her running on a sidewalk without explicitly specifying full body or pixar style (her face was on the back of her head in 3 out of the 4 images and that wasn't even the worst part!)

Case Study #4 – 100% Niji v6

Let's try creating a similar character from scratch with the Niji version 6 model right from the beginning, starting with a similar prompt that we used in our demo last week:

a caucasion woman with brown hair in a bun wearing glasses, leggings, and a trenchcoat, holding a book
a woman in her 40s, librarian, hair in a bun, south african, glasses, white background, full body --ar 2:3 --niji 6

Most of my tests came out with a consistent and usable character with only minor edits required.

Now, not all of the situations she was put in were particularly realistic, but I could easily see many of these portraits used as the basis for a cozy mystery cover or leading off a series of certain sub-genres of romance.

You can follow along with my prompts as I experimented with this particular model to see the different results:

a woman falling out of the sky holding a book but without a parachute sporting a giant smile
skydiving --cref https://s.mj.run/EZd2xQXJh3M --niji 6
4 portraits of a woman simultaneously reading a book and roller skating (with one picture of her skating with a twin sister as well)
roller skating --cref https://s.mj.run/EZd2xQXJh3M --niji 6
4 portraits of a woman in a trenchcoat near a fountain, drinking coffee
drinking a cup of coffee by a water fountain --cref https://s.mj.run/EZd2xQXJh3M --niji 6
4 portraits of a woman in a dress drinking coffee next to a fountain
drinking a cup of coffee by a water fountain wearing a light blue dress and a sunhat --cref https://s.mj.run/EZd2xQXJh3M --cw 20 --niji 6
4 portraits of a running cartoon woman
out for a run in shorts and a t-shirt --cref https://s.mj.run/EZd2xQXJh3M --cw 20 --niji 6
a woman running on a bridge with shorts and a sweatshirt, listening to something on her phone
running on a sidewalk wearing sunglasses shorts and a t-shirt, holding a phone and wearing earbuds, full body --cref https://s.mj.run/EZd2xQXJh3M --cw 20 --niji 6
a woman jogging down a sidewalk in shorts and a t-shirt, adjusting something on her phone
running on a sidewalk wearing sunglasses shorts and a t-shirt, holding a phone and wearing earbuds, full body --no sweatshirt jacket --cref https://s.mj.run/EZd2xQXJh3M --cw 20 --niji 6
a cartoon woman in sunglasses, shorts, and a short sleeve shirt over a long sleeve shirt, running along the side of a street
running on a sidewalk wearing sunglasses shorts and a t-shirt, holding a phone and wearing earbuds, full body --no sweatshirt jacket --cref https://s.mj.run/EZd2xQXJh3M --cw 20 --niji 6

Lessons Learned

This feature is still in beta, so you definitely will not get something completely usable right out of the box. Expect that you'll need to make some edits if you plan to use these images in a book.

If the art style works for your particular project, then experiment with the Niji version 6 model as it was the most consistent results that I was able to create using this feature. If you are using Midjourney version 6 Alpha (the current default model) then try to be as specific for the art style or level of realism that you want.

I'd also recommend trying to keep things as simple as possible with just the character in different poses, as it seems to be a little easier to get consistency, and then to edit those images into backgrounds generated in an appropriate style.

If you do want to try to generate images directly, then it might be better to start with your character in his environment, because it can be difficult to get detailed backgrounds if you start with a completely white background.

This feature is ready to use out of the box for storyboards that you can hand over to an illustrator, but I would not necessarily use it yet for illustrating a full book.

If you are trying to create a consistent character for a series of book covers, and don't mind using AI-generated art as a starting point rather than stock photography, then this can give you an affordable option to hand over to your cover designer.

You also can't use this feature to create character sheets; every time I try, I usually wind up with 3 sample images with a single pose, and 1 image that includes a single full character pose but might have a few smaller and less fleshed out characters in the background. Using the Niji model, I was able to get multiple images for each option, but it usually consisted of a single full body and then headshots for the additional ones:

A character sheet of a cartoon woman holding a book
a woman in her 40s, librarian, hair in a bun, south african, glasses, full body, character sheet, white background --ar 2:3 --niji 6

The most important lesson I want to share from this process, though, is that you should not be afraid to experiment and see what you might be able to create.

It can spark some fun creativity and at least give you a laugh here and there.

240416