This month, Midjourney rolled out its Version 5.1.

In this study, we take an in-depth look at what the new version is capable of, what it has in common with its predecessor, and what its superpowers and shortcomings are.

"The only way to make sense out of change is to plunge into it, move with it, and join the dance."
— Alan Watts

Quick facts


V5.1 is (finally!) more opinionated—like V4.


It's the next step in visual aesthetics. The new model has higher coherence, better details, and improved sharpness.


Many artistic styles and techniques have improved since V5.


Text prompting and image prompting performance seems to remain the same as in V5.


The new model ups the photorealism game (just under two months after we saw V5!).


V5.1 generates less unwanted text. And when you want text, it's better, too.


There is an 'unopinionated' "RAW Mode" (similar to V5.0).


Finally, there is one (surprising!) issue with V5.1:
hands ¯\_(ツ)_/¯

To confirm or disprove these statements, we challenged the new model on each of these aspects.

1. Unopinionated No More

To me, this is the most exciting improvement in 5.1.

Remember how one of the main characteristics of V5 was “unopinionatedness”? That model prioritized photorealism, interpreted prompts literally, and was generally more “straightforward” as opposed to “creative.”

The new model got its opinionatedness back, and, in that sense, combines the (improved!) intricacy of V5 with the creativeness of V4.


What V5.1 delivers is nothing short of stunning. The overall visual aesthetics of the new model are out of this world. The images have become more coherent, complex, and clear at the same time.

And when it comes to details, V5.1 is breathtaking. It renders highly detailed scenes more correctly, and the details themselves are more pronounced, higher in contrast, and sharper than before.


It’s impossible to speak of the new model’s aesthetics and not mention how it works with style modifiers (artists’ names, techniques, genres, etc.). TL;DR: It’s mind-blowing!

There were negligibly few examples of styles that didn’t improve in one way or another compared to their V5 versions (and where they didn't, we will always have V5 and --style raw ↓).

Here are just a few examples of how artistic styles and techniques in 5.1 are next-level.

In general, styles became more detailed, more pronounced, and, simply put, more visually mind-blowing.

Finally, many styles became closer to their real-life prototypes (especially noticeable with movie directors!).


V5.1 ... is MUCH easier to use with short prompts.
— from Midjourney's team official announcement

To test how the new model reacts to simple and complex prompts and compare it to V5 in the same situation, I ran three tests, each featuring three prompts of gradually increasing complexity.

This way we can compare how prompts ranging from elementary to complex behave within both models.

How about image prompts? To compare the two models, I fed the same set of images to both of them. And I started with my own face. 8)

Although 5.1 returns cooler results overall, the "face recognition" part didn't seem to change much. I then moved on to some other types of images.

This test is the ultimate illustration of how much more artistic and creative V5.1 is; how much more interesting, diverse, and detailed its images are; and how many fewer unwanted artefacts it generates. However, based on this test, I wouldn’t really give first prize to either of the contestants. ¯\_(ツ)_/¯

Also, did you notice how V5.1 became even more photorealistic than V5?


Just recently, Midjourney released V5, showcasing a revolutionary level of photorealism. It's hard to believe they could improve so much in just under two months, but they have!

The images in V5.1 are mind-bending, with realistic light and shadows, reflections, and skin texture.

Naturally, many styles that were photorealistic to begin with became that much better!


Truly, V5.1 does dial down the number of unwanted elements in your generations, including (mostly ;)) text, objects that "pollute" the style, and elements that depict an action instead of its result.

And when you do want text and symbols in your images, V5.1 usually delivers more artistic, intricate, and harmonious results, with better, more refined text lines, fonts, and graphic elements (like callouts, boxes, dividers, etc.), and a stronger overall sense of design.

That said, there are quite a few cases in which V5 delivers very worthy results! Whether they are better or worse than those from 5.1 is purely a question of your goals with these prompts.


There is an 'unopinionated' mode for V5.1 (similar to V5 default) called "RAW Mode"
— from Midjourney's team official announcement

Let’s see how that works, whether RAW mode offers any advantages compared to V5, and how both stand up against the default, “opinionated” V5.1.

This section is the only one where, for the sake of comparing three models, the Neat Magic Toggle won't work :/ But it will come back right in the next (and final) chapter!

I extensively tested all three models with a set of prompts ranging from very simple ones to more vague and abstract ones, both kinds leaving room for Midjourney’s imagination.
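
A three-way comparison like this can be organised as a small prompt matrix. The sketch below is not the tooling used for this study; it only shows one possible way to generate the three variants of each prompt. The `--v` and `--style raw` parameters are Midjourney's own, while the sample prompts and the names `CONFIGS` and `build_matrix` are hypothetical.

```python
# Hypothetical helper: builds the three variants of every test prompt.
# Only the --v and --style raw parameters come from Midjourney;
# the names and sample prompts are illustrative assumptions.

CONFIGS = {
    "V5": "--v 5",
    "V5.1 RAW": "--v 5.1 --style raw",
    "V5.1 default": "--v 5.1",
}


def build_matrix(prompts):
    """Return {prompt: {config_name: full_prompt_string}}."""
    return {
        prompt: {name: f"{prompt} {params}" for name, params in CONFIGS.items()}
        for prompt in prompts
    }


if __name__ == "__main__":
    sample_prompts = ["a cat", "the feeling of nostalgia"]  # made-up examples
    for prompt, variants in build_matrix(sample_prompts).items():
        for name, full in variants.items():
            print(f"{name:>12}: /imagine prompt: {full}")
```

Keeping the prompt text identical across the three columns means any visible difference can be attributed to the model or mode rather than to the wording.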

As expected, neither V5 nor V5.1's RAW mode used up that room. ¯\_(ツ)_/¯ However, the default 5.1 worked wonders.

With marginal differences, RAW truly does inherit its unopinionated features from V5. The two models are close in visual qualities, often have the same subjects, and even share some details.

For some prompts, the results didn’t really differ that much from V5 to RAW mode.

And there are reverse situations, where Midjourney renders more interesting results in V5 than in 5.1's RAW mode.

However, they both pale when you set them against the default 5.1. ◔__◔

8. Issues

V5.1 is absolutely stunning. But nothing is perfect, and there is at least one issue I need to highlight here: it’s the hands. Again.

But not all is lost! With a few re-rolls and some --stylize tweaking you can get very decent results, sometimes even surpassing those from V5.
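
One way to approach the --stylize tweaking is to queue the same prompt at several values and compare the hands across re-rolls. This is a minimal sketch, assuming a valid --stylize range of 0 to 1000 for V5.x; the chosen values, the example prompt, and the function name `stylize_variants` are hypothetical.

```python
# Hypothetical helper for sweeping --stylize on a single prompt.
# Assumption: --stylize accepts integers from 0 to 1000 in V5.x.

STYLIZE_MIN, STYLIZE_MAX = 0, 1000


def stylize_variants(prompt, values=(50, 100, 250, 500)):
    """Return one V5.1 prompt string per --stylize value."""
    variants = []
    for value in values:
        if not STYLIZE_MIN <= value <= STYLIZE_MAX:
            raise ValueError(
                f"--stylize {value} is outside {STYLIZE_MIN}-{STYLIZE_MAX}"
            )
        variants.append(f"{prompt} --v 5.1 --stylize {value}")
    return variants


if __name__ == "__main__":
    # Made-up example prompt; each line would be submitted and re-rolled separately.
    for line in stylize_variants("portrait of a pianist, hands on the keys"):
        print(line)
```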


The remarkable advancements showcased in Midjourney 5.1 take the platform to new heights of aesthetic excellence, powerful default style, next-level intricacy, and broad variability.

Despite minor setbacks (bring back regular hands!), V5.1 is a huge advancement on almost every front, fixing some crucial shortcomings of its predecessor.

This extraordinary progress, at such a pace, fills me with anticipation for the future of Midjourney. And it's around the corner; take it from the Midjourney team themselves:

There may be further tunings of V5.1 styles
and possibly a V5.2 after that
— from Midjourney's team official announcement

Happy midjourneys,
— Andrei Kovalev

All samples are produced by the Midlibrary team using Midjourney AI. Naturally, they are not representative of real artists' works/real-world prototypes.

