In-depth look Midjourney V5 | Guide | Andrei Kovalev’s Midguide

Last week, Midjourney launched public alpha testing of its V5 algorithm. In this study, we take an in-depth look at its innovative features and challenges and compare it with its predecessor to determine whether V4 is still relevant.

MRI scan of robot samurai --v 4

Japanese fox god of winter death and rebirth --v 4

intricate raven god --v 4

Instead of Intro:
The Magic Is Back

When V4 came out, many of my peers were disturbed by what they defined as "the disappearance of magic" that was present in V3. I can assure all of them now: the magic is back. There is poetry, there is the spark, and these new images are not just artful but alive!

Daniel Merriam's painting depicting perfect simbyosis of human being and allpowerful AI --v 4

Laurie Greasley's illustration depicting perfect simbiosis of human being and allpowerful AI --v 4

William Wray's painting depicting perfect simbyosis of human being and allpowerful AI --v 4

complex collage depicting perfect simbyosis of human being and allpowerful AI by James C. Christensen --v 4

"V5 isn't the final step, but we hope you all feel the progression of something deep and unfathomable in the power of our collective human imagination"
— from Midjourney's team official announcement

Quick facts

V5 is "much more 'unopinionated' than v3 and v4."

It has a much wider stylistic range.

The model can generate significantly more realistic imagery.

It is more responsive to prompting. And the prompting strategies have changed, too!

V5 renders more detailed images, and details are more likely to be correct (yes, the hands improved greatly, and the team promises far less unwanted text).

Image prompting performance improved, and the new model supports --iw for weighting image prompts against text prompts.

Remixes are much better.

In this guide, we will put each of these statements to the test.
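
The --iw parameter mentioned in the quick facts can be illustrated with a small string-building sketch. It only assembles the text you would paste into Midjourney and does not call any API. The function name, the example URL, and the weight value are assumptions made for illustration; this guide does not state the accepted --iw range, so check the official documentation before picking a value.

```python
def image_prompt(image_url: str, text: str, image_weight: float, version: int = 5) -> str:
    """Compose a prompt string: image URL first, then text, then parameters."""
    # ":g" drops a trailing ".0", so 2.0 is written as "2".
    return f"{image_url} {text} --iw {image_weight:g} --v {version}"


# Hypothetical URL: replace it with a link to your own uploaded image.
print(image_prompt("https://example.com/portrait.jpg", "by George Barbier", 1.5))
```

A higher weight should pull the result toward the source image and a lower one toward the text, which is a convenient way to run the same image prompt at several weights and compare.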

1. V5 Is Much More "Unopinionated"

Based on multiple tests, here is my interpretation of this statement: the current model is VERY straightforward. To best illustrate this, I devised a somewhat complex prompt that makes little sense and leaves room for interpretation. And a surprising thing happened...

hidden incremental identity of uniland invertspace --v 4

The "unopinionatedness" becomes especially evident when you compare Artistic Techniques in V5 vs. V4. Where V4 rendered the result of applying a technique, V5 tends to return images of the technique being applied: Haute couture fashion is no longer a dress but a fashion show, Pinhole photography must include a pinhole camera, and Encaustic paint is not a beautiful abstract artwork, but buckets of… paint.

Byzantine architecture --v 4

Haute couture fashion --v 4

Pinhole photography --v 4

Encaustic paint --v 4

Finally, Midjourney V5 generates WAY more photorealistic images. Whereas before, many styles were depicted as paintings or illustrations, now they are photographs. And that, together with a straightforward, literal approach to prompts, often makes an average V5 generation uninteresting, characterless, and less varied unless you make additional moves.

Calvin Klein --v 4

by Erik Spiekermann --v 4

Minimalism --v 4

Fashion photography --v 4

Good news: we have those moves! The first and most powerful one is specifying a style: style references have become much more impactful in V5, and many styles themselves are now much more precise, detailed, and nuanced.

Midlibrary --v 4

Midlibrary by Viktor Vasnetsov and Tyrus Wong --v 4

2. "V5 Has Much Wider Stylistic Range"

"Much wider stylistic range" might mean two things: MJ now knows more styles, and existing styles have become more nuanced and varied. We will get to the nuances later; for now, let's look at how the existing styles have changed.

by Moebius --v 4

by Victor Vasarely --v 4

by Noma Bar --v 4

We are re-checking our backlog of artistic styles that V4 rejected, and will regularly add the styles known to V5 to the Midlibrary catalog. Once the list of new styles is long enough, expect an update to this study!

3. "V5 Can Be Insanely Photorealistic"

For the first time since Midjourney arrived, I genuinely mistook an AI-generated image posted on Instagram for an actual photograph. For a few moments, I couldn't believe my eyes when I saw the #MidjourneyV5 hashtag below the picture.

portrait by Steven Klein --ar 16:9 --v 4

It. Truly. Is. Insane.

Nick Cave by Nan Goldin --v 4

Nick Cave in Alejandro Jodorowsky's movie --v 4

Miles Aldridge's high-concept portrait photograph depicting Nick Cave --v 4

Nick Cave as main character in photograph by Nobuyoshi Araki --v 4

Needless to say, photographers' styles improved dramatically, and some non-photographic techniques acquired a much more photorealistic look.

by Cindy Sherman --v 4

Chiaroscuro --v 4

by Erik Madigan Heck --v 4

However, there is a "downside," too. By default, Midjourney V5ɑ tends to render more photorealistic images. Thus, frequently, you must "nudge" it to get less literal and more artistic results.

magical night lake --v 4

dark lord's mountains of doom --v 4

exoskeleton fighter school girl from East End --v 4

Italian mother --v 4

Well, maybe the Italian mother is a bad example: just look at that baby! <3

And let's remember that photorealism is not just about clean, modern photographs. More "vintage" photographic techniques and classical photographers' styles became more realistic, too!

Hand-colored photograph --v 4

photograph shot on Kodachrome --v 4

by Nadar --v 4

"It is a 'pro' mode of the model tuned to provide a wide diversity of outputs and to be very responsive to your inputs. The tradeoff here is that it may be harder to use. Short prompts may not work as well. You should try to write longer, more explicit text about what you want."
‍— from Midjourney's team official announcement

4. "Complex Prompts Are Better"

For this experiment, I chose a simple prompt and then made it gradually more complex, using the same --seed value to keep the results more consistent.
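
The progression used in this experiment can be sketched as a small script that appends the same --seed and --v parameters to each step. This is only a convenience for producing the prompt strings, not part of Midjourney itself; the function name build_progression is a hypothetical helper introduced here.

```python
SEED = 42  # the same seed is reused for every step of the experiment


def build_progression(steps, seed=SEED, version=4):
    """Append identical --seed and --v parameters to each prompt text."""
    return [f"{text} --seed {seed} --v {version}" for text in steps]


# Each step adds one more element to the previous prompt.
steps = [
    "young woman",
    "dedicated young woman",
    "dedicated young woman in forest",
    "dedicated young woman in neon forest",
    "dedicated Renaissance young woman in neon forest by Hans Bellmer",
]

for prompt in build_progression(steps):
    print(prompt)
```

Calling build_progression(steps, version=5) yields the matching V5 set, so the two versions can be compared step by step with everything else held constant.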

young woman --seed 42 --v 4

dedicated young woman --seed 42 --v 4

dedicated young woman in forest --seed 42 --v 4

dedicated young woman in neon forest --seed 42 --v 4

dedicated Renaissance young woman in neon forest by Hans Bellmer --seed 42 --v 4

Apart from improved photorealism, I wouldn't say that the difference is that striking. Okay, let's try one of the "megaprompts" from our Image-to-Text-to-Image study.

A group of anthropomorphic rabbits, depicted as cartoon characters, are captured in a high quality image inspired by the works of Jean Tabaud. The image shows the characters sitting around a table, participating in a tea ceremony scene from an anime movie screenshot set in a dark sci-fi environment. The image is a stunning example of epic fantastic realism, emphasized by its high quality 4K technology. The style is reminiscent of the works of Junji Ito and showcases elements from anime and claymation. The overall aesthetic is characterized by captivating deco elements, making it a top-notch anime piece --ar 16:9 --v 4

a couple of people laying on top of a lush green field, by Elsa Bleda, magic realism, anya taylor - joy and emma stone, wide high angle view, barefeet in grass, marat zakirov, concert, two girls, hammershøi, yelena belova, midair, benjamin vnuk, alexi zaitsev, innocence --ar 16:9 --v 4

Once again, in these examples the difference doesn't seem to be that big. If anything, V5 is kind of losing this round to V4.

Finally, let's see how two or more styles combine in complex prompts in V5 vs. V4.
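
Multi-style prompts of this kind follow a simple "subject by A and B" pattern, so a side-by-side V4/V5 pair can be generated with a few lines. The sketch below is an illustration only; combine_styles is a hypothetical helper, and it just builds the strings to paste into Midjourney.

```python
def combine_styles(subject, artists, versions=(4, 5)):
    """Build one prompt per model version using the 'subject by A and B' pattern."""
    styled = f"{subject} by {' and '.join(artists)}"
    return [f"{styled} --v {v}" for v in versions]


for prompt in combine_styles("art-house movie scene", ["Max Ernst", "Kazumasa Nagai"]):
    print(prompt)
```

Keeping the text identical and varying only the --v value makes it easier to attribute any visual difference to the model version rather than to the wording.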

art-house movie scene by Max Ernst and Kazumasa Nagai --v 4

Empire State Building advertising collage by Dziga Vertov and Paula Scher --v 4

Alan Lee's illustration depicting Aron Demetz's sculpture of minimalist Gothic robot --v 4

megastructure designed by Lee Bontecou in Mecha style covered with street-art by Artur Bordalo --v 4

Can you spot the Big Difference?

5. "V5 Renders More (Correct) Details"

TL;DR: it's true. V5 renders far more subtle detail, making images more intricate, more refined, and indeed more correct!

extremely detailed knolling layout of a futuristic survavlist kit with abundance of intricate tech, bizarre devices, and weird packed food --ar 16:9 --v 4

The styles that were already detailed in V4 perfectly show the new model's progress.

by Adonna Khare --v 4

Letterism --v 4

by Mattias Adolfsson --v 4

During the office hours announcement, the Midjourney team mentioned that V5 got much better at rendering groups of people. And there is plenty of evidence of that.

Avatar the Last Airbender anime --v 4

by Margaret Keane --v 4

by Ludwig Bemelmans --v 4

Another statement the team made: V5 generates much less unwanted text and text in general—up to a point where even "infographics may suffer." Is that so?

intricate infographics depicting underworld layout --v 4

1700s blueprint depicting underworld layout --v 4

cyberpunk hovercar hud interface --v 4

technical drawing with notes and measurements depicting underworld layout --v 4

Seems like infographics lovers can sleep tight. :) And the following samples got me confused—it seems like, for now, V5 returns even more text than V4…

Gilded --v 4

Light field photography --v 4

Grimdark --v 4

Finally—yes—hands got MUCH better!

Anton Corbijn's closeup photograph of hands of elderly hard-working person --v 4

Arnold Bocklin's painting depicting hand of gigantic statue drowned in fantasy forest --v 4

space pilot's hands by Hajime Isayama --v 4

Infinity Gauntlet designed by Walter Van Beirendonck --v 4

And V5 generates more hands by default.

by Andrey Remnev --v 4

by Claude Cahun --v 4

by Pierro della Francesca --v 4

6. "Image Prompting Performance Improved"

Image Prompts are one of my favorite parts of testing Midjourney's new models. Why? Because I love how MJ sees and reinterprets the portrait of Francis D.! Here is what V5 is capable of, given such powerful source material.

Original photo

Francis D. by Andrei Kovalev (Paris, 2011)

What you immediately notice is the variability of output. Before, in most cases, Midjourney would inherit the original's close-up framing. In V5, the angles, the framing, and the situation constantly change, offering more options for a single-image input.

by George Barbier --v 4

by Neri Oxman --v 4

by Rufino Tamayo --v 4

by Samuel Melton Fisher --v 4

But Francis' portrait is very characteristic, with dramatic lighting emphasizing facial features. What about a more bland appearance and flatter light? Here's my self-portrait from many years ago.

Original photo

Andrei Kovalev, self-portrait (2016)

by Martin Schoeller --v 4

in Synthwave style --v 4

by Marianna Rothen --v 4

Obviously, V5 is more intricate, detailed, and varied than anything we've seen before. However, I wouldn't say that the face recognition algorithm improved much. The Image Prompting rules for portraits remain the same: characteristic faces with stand-out features and lighting that emphasizes them work better. ;)

As we've seen previously, Midjourney V5 is much better at group portraits. How about groups and Image Prompts?

Original photo

From my course on complex portrait photography (2016)

family portrait by Mary Jane Ansell --ar 3:2 --v 4

Original photo

From my course on complex portrait photography (2019)

post-apocalyptic samurai monster-hunters squad --ar 3:2 --v 4

I'd say that group portraits are still challenging for both V4 and V5.

And what about non-portrait images? Let's see how V5's Image Prompts work with still life, landscape, and… weird stuff. :)

Original photo

Still-life. My school assignment (2011)

Gary Panter's illustration depicting voodoo witch workbench --v 4

Original photo

Baku. Magazine assignment (2017)

by Albert Bierstadt --v 4

Original photo

Berikaoba mask. Personal project (2019)

Geof Darrow's comic book illustration depicting ancient creature wearing ritualistic horned mask on tall rooftop over city on sunset --v 4

As you can see, even with specific prompts (e.g., comic strip by Geoff Darrow), Midjourney sometimes struggles to turn a photorealistic image into an illustration or painting. In Darrow's case, V4 seems to have done a slightly better job with the backdrop.

7. "Remixes are much better"

V5's Remix mode seems to work in roughly the same manner as in V4. You can change details, shift the reference style, and even affect the context, but only to a certain extent: the more you remix, the more glitchy and distorted the outcome becomes.

character by Alice Bailly --v 4

steampunk character by Alice Bailly --v 4

steampunk character by Erwin Blumenfeld --v 4

Highly detailed pictures get messed up after the first remix, and it is almost impossible to change significant details (e.g., switch day to night or invert colors) without "breaking" the image.

psychedelic painting depicting Independence day celebration --v 4

black-and-white painting depicting Independence day celebration --v 4

happy family in broad daylight --v 4

happy family on moonlit night --v 4

Conclusion

Undoubtedly, V5 is as revolutionary as V4 was when it first appeared—if not more so. Midjourney's new model is groundbreakingly capable and has magic that was somehow lost after V3!

visual breaktrhough by Larry Carlson and Gabriel Dawe --ar 16:9 --v 4

Of course, Midjourney V5 is not perfect. But we are looking at the early Alpha release! It will surely get better, more understanding, and even more powerful. Until then… let's say there are some challenges.

The main one is "unopinionatedness," which is responsible for more literal and less "artistic" outcomes, more photorealistic images, and dull defaults.

world of nine inverted planets --ar 16:9 --v 4

But V5 truly shines when you apply style modifiers to your prompts! And the styles themselves became much more precise and varied. I was genuinely pleased to see how my favorite styles evolved in V5.

world of nine inverted planets by Alex Andreev and Wayne Barlowe --ar 16:9 --v 4

So do we still need V4? V5 is amazing, but it still has a long way to go. For now, V4 might be more "artistic" without additional effort and can still deliver outstanding results (in some cases, more interesting and varied than V5). So don't discard it just yet!

infinite mirror glitch by Larry Carlson in Glitch-art style --ar 16:9 --v 4

Happy midjourneys,
— Andrei Kovalev

All samples are produced by Midlibrary team using Midjourney AI (if not stated otherwise). Naturally, they are not representative of real artists' works/real-world prototypes.
