Last week, Midjourney launched the public alpha-testing of its V5 algorithm. In this study, we look in-depth into its innovative features and challenges and compare it with its predecessor to determine whether V4 is still relevant.

MRI scan of robot samurai --v 5

MRI scan of robot samurai --v 4

Japanese fox god of winter death and rebirth --v 5

Japanese fox god of winter death and rebirth --v 4

intricate raven god --v 5

intricate raven god --v 4
Instead of Intro:
The Magic Is Back
When V4 came out, many of my peers were disturbed by what they defined as "the disappearance of magic" that was present in V3. I can assure all of them now: the magic is back. There is poetry, there is the spark, and these new images are not just artful but alive!

Daniel Merriam's painting depicting perfect simbyosis of human being and allpowerful AI --v 5

Daniel Merriam's painting depicting perfect simbyosis of human being and allpowerful AI --v 4

Laurie Greasley's illustration depicting perfect simbiosis of human being and allpowerful AI --v 5

Laurie Greasley's illustration depicting perfect simbiosis of human being and allpowerful AI --v 4

William Wray's painting depicting perfect simbyosis of human being and allpowerful AI --v 5

William Wray's painting depicting perfect simbyosis of human being and allpowerful AI --v 4

complex collage depicting perfect simbyosis of human being and allpowerful AI by James C. Christensen --v 5

complex collage depicting perfect simbyosis of human being and allpowerful AI by James C. Christensen --v 4
"V5 isn't the final step, but we hope you all feel the progression of something deep and unfathomable in the power of our collective human imagination"
— from Midjourney's team official announcement
Quick facts
V5 is "much more 'unopinionated' than v3 and v4."
It has a much wider stylistic range.
The model can generate significantly more realistic imagery.
It is more responsive to prompting. And the prompting strategies have changed, too!
V5 renders more detailed images, and details are more likely to be correct (yes, the hands improved greatly, and the team promises far less unwanted text).
Image prompting performance improved, and the new model supports --iw for weighting image prompts versus text prompts.
Remixes are much better.
In this guide, we will put each of these statements to the test.
1. V5 Is Much More "Unopinionated"
Based on multiple tests, here is my interpretation of this statement: the current model is VERY straightforward. To best illustrate this, I devised a somewhat complex prompt that makes little sense and leaves room for interpretation. And a surprising thing happened...

hidden incremental identity of uniland invertspace --v 5

hidden incremental identity of uniland invertspace --v 4
The "unopinionatedness" becomes especially evident when you compare Artistic Techniques in V5 vs. V4. Where V4 rendered the actual result of applying a technique, V5 tends to return images of the process of applying it: Haute couture fashion is no longer a dress but a fashion show, Pinhole photography must include a pinhole camera, and Encaustic paint is not a beautiful abstract artwork, but buckets of… paint.

Byzantine architecture --v 5

Byzantine architecture --v 4

Haute couture fashion --v 5

Haute couture fashion --v 4

Pinhole photography --v 5

Pinhole photography --v 4
Finally, Midjourney V5 generates WAY more photorealistic images. Whereas before, many styles were depicted as paintings or illustrations, now they are photographs. And that, together with a straightforward, literal approach to prompts, often makes an average V5 generation uninteresting, characterless, and less varied—without additional moves.

by Erik Spiekermann --v 5

by Erik Spiekermann --v 4

Fashion photography --v 5

Fashion photography --v 4
Good news: we have those moves! The first and most powerful one is specifying a style: style references became much more impactful in V5, and many styles themselves became much more precise, detailed, and nuanced.

Midlibrary by Viktor Vasnetsov and Tyrus Wong --v 5

Midlibrary by Viktor Vasnetsov and Tyrus Wong --v 4
2. "V5 Has Much Wider Stylistic Range"
"Much wider stylistic range" might mean two things: MJ now knows more styles, and existing styles have become more nuanced and varied. We will get to nuances down the road, and now, let's look at how the existing styles have changed.
We are re-checking our backlog of artistic styles that V4 rejected, and will regularly add the styles known to V5 to the Midlibrary catalog. When the list of new styles is significant enough, expect an update of this study!
3. "V5 Can Be Insanely Photorealistic"
For the first time since Midjourney arrived, I genuinely confused an AI-generated image posted on Instagram with an actual photograph. For a few moments, I couldn't believe my eyes seeing the #MidjourneyV5 hashtag below the picture.

portrait by Steven Klein --ar 16:9 --v 5

portrait by Steven Klein --ar 16:9 --v 4
It. Truly. Is. Insane.
Nick Cave by Nan Goldin --v 5
Nick Cave by Nan Goldin --v 4
Nick Cave in Alejandro Jodorowsky's movie --v 5
Nick Cave in Alejandro Jodorowsky's movie --v 4
Miles Aldridge's high-concept portrait photograph depicting Nick Cave --v 5
Miles Aldridge's high-concept portrait photograph depicting Nick Cave --v 4
Nick Cave as main character in photograph by Nobuyoshi Araki --v 5
Nick Cave as main character in photograph by Nobuyoshi Araki --v 4
Needless to say, photographers' styles improved dramatically, and some non-photographic techniques acquired a much more photorealistic look.

by Erik Madigan Heck --v 5

by Erik Madigan Heck --v 4
However, there is a "downside," too. By default, Midjourney V5ɑ tends to render more photorealistic images. Thus, frequently, you must "nudge" it to get less literal and more artistic results.
dark lord's mountains of doom --v 5
dark lord's mountains of doom --v 4

exoskeleton fighter school girl from East End --v 5

exoskeleton fighter school girl from East End --v 4
Well, maybe the Italian mother is a bad example: just look at that baby! <3
And let's remember that photorealism is not just about clean, modern photographs. More "vintage" photographic techniques and classical photographers' styles became more realistic, too!

Hand-colored photograph --v 5

Hand-colored photograph --v 4

photograph shot on Kodachrome --v 5

photograph shot on Kodachrome --v 4
"It is a 'pro' mode of the model tuned to provide a wide diversity of outputs and to be very responsive to your inputs. The tradeoff here is that it may be harder to use. Short prompts may not work as well. You should try to write longer, more explicit text about what you want."
— from Midjourney's team official announcement
4. "Complex Prompts Are Better"
For this experiment, I chose a simple prompt and then gradually complicated it using the same --seed value—to keep the results more consistent.
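The prompt ladder for this kind of test can be generated rather than typed by hand. Below is a minimal sketch, assuming the only goal is to produce the paired prompt strings. The helper names `with_params` and `comparison_ladder` are hypothetical, and only the `--seed` and `--v` parameters come from the article itself.

```python
# Hypothetical helpers for building paired V5/V4 prompt strings.
# Only the --seed and --v parameters are taken from the article;
# the function names are invented for this illustration.

def with_params(prompt: str, seed: int, version: int) -> str:
    """Append the --seed and --v parameters to a text prompt."""
    return f"{prompt} --seed {seed} --v {version}"


def comparison_ladder(prompts, seed=42, versions=(5, 4)):
    """Return one tuple of prompt strings (one per version) for each step."""
    return [tuple(with_params(p, seed, v) for v in versions) for p in prompts]


# Each step adds one more element to the previous prompt.
steps = [
    "young woman",
    "dedicated young woman",
    "dedicated young woman in forest",
    "dedicated young woman in neon forest",
    "dedicated Renaissance young woman in neon forest by Hans Bellmer",
]

for v5_prompt, v4_prompt in comparison_ladder(steps):
    print(v5_prompt)
    print(v4_prompt)
```

Keeping the seed fixed across every step and both versions means the wording and the model version are the only things that change between runs.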

young woman --seed 42 --v 5

young woman --seed 42 --v 4

dedicated young woman --seed 42 --v 5

dedicated young woman --seed 42 --v 4

dedicated young woman in forest --seed 42 --v 5

dedicated young woman in forest --seed 42 --v 4

dedicated young woman in neon forest --seed 42 --v 5

dedicated young woman in neon forest --seed 42 --v 4

dedicated Renaissance young woman in neon forest by Hans Bellmer --seed 42 --v 5

dedicated Renaissance young woman in neon forest by Hans Bellmer --seed 42 --v 4

Apart from improved photorealism... I wouldn't say that the difference is that striking. Okay, let's try with one of the "megaprompts" from our Image-to-Text-to-Image study.

A group of anthropomorphic rabbits, depicted as cartoon characters, are captured in a high quality image inspired by the works of Jean Tabaud. The image shows the characters sitting around a table, participating in a tea ceremony scene from an anime movie screenshot set in a dark sci-fi environment. The image is a stunning example of epic fantastic realism, emphasized by its high quality 4K technology. The style is reminiscent of the works of Junji Ito and showcases elements from anime and claymation. The overall aesthetic is characterized by captivating deco elements, making it a top-notch anime piece --ar 16:9 --v 5

A group of anthropomorphic rabbits, depicted as cartoon characters, are captured in a high quality image inspired by the works of Jean Tabaud. The image shows the characters sitting around a table, participating in a tea ceremony scene from an anime movie screenshot set in a dark sci-fi environment. The image is a stunning example of epic fantastic realism, emphasized by its high quality 4K technology. The style is reminiscent of the works of Junji Ito and showcases elements from anime and claymation. The overall aesthetic is characterized by captivating deco elements, making it a top-notch anime piece --ar 16:9 --v 4

a couple of people laying on top of a lush green field, by Elsa Bleda, magic realism, anya taylor - joy and emma stone, wide high angle view, barefeet in grass, marat zakirov, concert, two girls, hammershøi, yelena belova, midair, benjamin vnuk, alexi zaitsev, innocence --ar 16:9 --v 5

a couple of people laying on top of a lush green field, by Elsa Bleda, magic realism, anya taylor - joy and emma stone, wide high angle view, barefeet in grass, marat zakirov, concert, two girls, hammershøi, yelena belova, midair, benjamin vnuk, alexi zaitsev, innocence --ar 16:9 --v 4
Once again, in these examples the difference doesn't seem to be that big. If anything, V5 is kind of losing this round to V4.
Finally, let's see how two or more styles combine in complex prompts in V5 vs. V4.

art-house movie scene by Max Ernst and Kazumasa Nagai --v 5

art-house movie scene by Max Ernst and Kazumasa Nagai --v 4

Empire State Building advertising collage by Dziga Vertov and Paula Scher --v 5

Empire State Building advertising collage by Dziga Vertov and Paula Scher --v 4

Alan Lee's illustration depicting Aron Demetz's sculpture of minimalist Gothic robot --v 5

Alan Lee's illustration depicting Aron Demetz's sculpture of minimalist Gothic robot --v 4

megastructure designed by Lee Bontecou in Mecha style covered with street-art by Artur Bordalo --v 5

megastructure designed by Lee Bontecou in Mecha style covered with street-art by Artur Bordalo --v 4
Can you spot the Big Difference?
5. "V5 Renders More (Correct) Details"
TL;DR: it's true. V5 renders far more subtlety in details, making them more intricate, refined, and indeed more correct!

extremely detailed knolling layout of a futuristic survavlist kit with abundance of intricate tech, bizarre devices, and weird packed food --ar 16:9 --v 5

extremely detailed knolling layout of a futuristic survavlist kit with abundance of intricate tech, bizarre devices, and weird packed food --ar 16:9 --v 4
The styles that were already detailed in V4 perfectly show the new model's progress.

by Mattias Adolfsson --v 5

by Mattias Adolfsson --v 4
During the office hours announcement, the Midjourney team mentioned that V5 got much better at rendering groups of people. And there is plenty of evidence of that.

Avatar the Last Airbender anime --v 5

Avatar the Last Airbender anime --v 4

by Ludwig Bemelmans --v 5

by Ludwig Bemelmans --v 4
Another statement the team made: V5 generates much less unwanted text and text in general—up to a point where even "infographics may suffer." Is that so?

intricate infographics depicting underworld layout --v 5

intricate infographics depicting underworld layout --v 4

1700s blueprint depicting underworld layout --v 5

1700s blueprint depicting underworld layout --v 4

cyberpunk hovercar hud interface --v 5

cyberpunk hovercar hud interface --v 4

technical drawing with notes and measurements depicting underworld layout --v 5

technical drawing with notes and measurements depicting underworld layout --v 4
Seems like infographics lovers can sleep tight. :) And the following samples got me confused—it seems like, for now, V5 returns even more text than V4…

Light field photography --v 5

Light field photography --v 4
Finally—yes—hands got MUCH better!

Anton Corbijn's closeup photograph of hands of elderly hard-working person --v 5

Anton Corbijn's closeup photograph of hands of elderly hard-working person --v 4

Arnold Bocklin's painting depicting hand of gigantic statue drowned in fantasy forest --v 5

Arnold Bocklin's painting depicting hand of gigantic statue drowned in fantasy forest --v 4

space pilot's hands by Hajime Isayama --v 5

space pilot's hands by Hajime Isayama --v 4

Infinity Gauntlet designed by Walter Van Beirendonck --v 5

Infinity Gauntlet designed by Walter Van Beirendonck --v 4
And V5 generates more hands by default.

by Pierro della Francesca --v 5

by Pierro della Francesca --v 4
6. "Image Prompting Performance Improved"
Image Prompts are one of my favorite parts of testing Midjourney's new models. Why? Because I love how MJ sees and reinterprets the portrait of Francis D.! Here is what V5 is capable of, given such powerful source material.
Francis D. by Andrei Kovalev (Paris, 2011)
What you immediately notice is the variability of output. Before, in most cases, Midjourney would inherit the original's close-up framing. In V5, the angles, the framing, and the situation constantly change, offering more options for a single-image input.
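The captions in this section appear to show only the text part of each prompt. Here is a minimal sketch of how a full image prompt might be assembled, assuming Midjourney's usual order of image URL first, then text, then parameters. The URL is a placeholder, the `image_prompt` helper is hypothetical, and the `--iw` value of 1.5 is an arbitrary illustration rather than a setting used in this study.

```python
# Hypothetical helper: assembles an image prompt string.
# Assumed order: image URL, then text prompt, then parameters.
from typing import Optional


def image_prompt(image_url: str, text: str, version: int = 5,
                 image_weight: Optional[float] = None) -> str:
    """Join the image URL, the text prompt, and the parameters."""
    parts = [image_url, text]
    if image_weight is not None:
        # --iw weights the image prompt relative to the text prompt.
        parts.append(f"--iw {image_weight}")
    parts.append(f"--v {version}")
    return " ".join(parts)


# Placeholder URL, not the actual source image from this study.
SOURCE = "https://example.com/francis-d.jpg"

print(image_prompt(SOURCE, "by Samuel Melton Fisher"))
print(image_prompt(SOURCE, "by Samuel Melton Fisher", image_weight=1.5))
```

Raising the image weight should pull the result closer to the source picture and lowering it should favor the text. The accepted range depends on the model version, so check the current documentation before relying on a specific value.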

by Samuel Melton Fisher --v 5

by Samuel Melton Fisher --v 4
But Francis' portrait is very characteristic, with dramatic lighting emphasizing facial features. What about a more bland appearance and flatter light? Here's my self-portrait from many years ago.
Andrei Kovalev, self-portrait (2016)

by Martin Schoeller --v 5

by Martin Schoeller --v 4
Obviously, V5 is more intricate, detailed, and varied than everything we've seen before. However, I wouldn't say that the face recognition algorithm improved much. The Image Prompting rules for portraits remained the same: characteristic faces with stand-out features and emphasizing lighting work better. ;)
As we've seen previously, Midjourney V5 is much better at group portraits. How about groups and Image Prompts?

From my course on complex portrait photography (2016)

family portrait by Mary Jane Ansell --ar 3:2 --v 5

family portrait by Mary Jane Ansell --ar 3:2 --v 4

From my course on complex portrait photography (2019)

post-apocalyptic samurai monster-hunters squad --ar 3:2 --v 5

post-apocalyptic samurai monster-hunters squad --ar 3:2 --v 4
I'd say that group portraits are still challenging for both V4 and V5.
And what about non-portrait images? Let's see how V5's Image Prompts work with still life, landscape, and… weird stuff. :)

Still-life. My school assignment (2011)

Gary Panter's illustration depicting voodoo witch workbench --v 5

Gary Panter's illustration depicting voodoo witch workbench --v 4

Baku. Magazine assignment (2017)

by Albert Bierstadt --v 5

by Albert Bierstadt --v 4

Berikaoba mask. Personal project (2019)

Geof Darrow's comic book illustration depicting ancient creature wearing ritualistic horned mask on tall rooftop over city on sunset --v 5

Geof Darrow's comic book illustration depicting ancient creature wearing ritualistic horned mask on tall rooftop over city on sunset --v 4
As you can see, even with specific prompts (e.g., comic book illustration by Geof Darrow), Midjourney sometimes struggles to turn a photorealistic image into an illustration or painting. In Darrow's case, V4 seems to have done a slightly better job with the backdrop.
7. "Remixes Are Much Better"
V5's Remix mode seems to work in roughly the same manner as in V4. You can change details, transition reference style, and even affect the context, but only to a certain extent: the more you remix, the more glitchy and distorted the outcome becomes.

character by Alice Bailly --v 5

character by Alice Bailly --v 4

steampunk character by Alice Bailly --v 5

steampunk character by Alice Bailly --v 4

steampunk character by Erwin Blumenfeld --v 5

steampunk character by Erwin Blumenfeld --v 4
Highly detailed pictures also get messed up after the first remix, and it is almost impossible to change significant details (e.g., switch day to night or invert colors) without "breaking" the image.

psychedelic painting depicting Independence day celebration --v 5

psychedelic painting depicting Independence day celebration --v 4

black-and-white painting depicting Independence day celebration --v 5

black-and-white painting depicting Independence day celebration --v 4

happy family in broad daylight --v 5

happy family in broad daylight --v 4

happy family on moonlit night --v 5

happy family on moonlit night --v 4
Conclusion
Undoubtedly, V5 is as revolutionary as V4 was when it first appeared—if not more so. Midjourney's new model is groundbreakingly capable and has magic that was somehow lost after V3!

visual breaktrhough by Larry Carlson and Gabriel Dawe --ar 16:9 --v 5

visual breaktrhough by Larry Carlson and Gabriel Dawe --ar 16:9 --v 4
Of course, Midjourney V5 is not perfect. But we are looking at the early Alpha release! It will surely get better, more understanding, and even more powerful. Until then… let's say there are some challenges.
The main one is "unopinionatedness," which is responsible for more literal and less "artistic" outcomes, more photorealistic images, and dull defaults.

world of nine inverted planets --ar 16:9 --v 5

world of nine inverted planets --ar 16:9 --v 4
But V5 truly shines when you apply style modifiers to your prompts! And the styles themselves became much more precise and varied. I was genuinely pleased to see how my favorite styles evolved in V5.

world of nine inverted planets by Alex Andreev and Wayne Barlowe --ar 16:9 --v 5

world of nine inverted planets by Alex Andreev and Wayne Barlowe --ar 16:9 --v 4
So do we still need V4? V5 is amazing, but it still has a long way to go. For now, V4 might be more "artistic" without additional effort and can still deliver outstanding results (in some cases, more interesting and varied than V5). So don't discard it just yet!

infinite mirror glitch by Larry Carlson in Glitch-art style --ar 16:9 --v 5

infinite mirror glitch by Larry Carlson in Glitch-art style --ar 16:9 --v 4
Happy midjourneys,
— Andrei Kovalev