Is Midjourney coming to an end? Google Bard now generates images.


Google recently announced a major update to Bard, its neural network model. Until now Bard was text-only, that is, it worked only with text, like the well-known ChatGPT. With the new update, Bard can generate images from text descriptions, using Google's Imagen 2 text-to-image model. Essentially, the Bard developers decided to combine the capabilities of several neural network models in one web tool. This significantly expands what the tool can be used for, and the developers hope it will win them a significant share of the growing neural network market. That hope is not groundless.

This is a huge step for Google Bard: it can now easily compete with the market leader, ChatGPT Plus from OpenAI, which has long had a similar feature. With Imagen 2, Bard is not just catching up with its competitor but gradually overtaking it, because Google has made the feature free, whereas ChatGPT Plus requires a paid subscription to use its image generation tool.

Google emphasizes that Bard's graphics capabilities were "designed with responsibility." In practice this means a complete ban on generating sexually explicit images, images of famous public figures, and scenes of violence, including violence against animals. For example, when I asked it to generate the picture “Viktor Pelevin is flying astride a huge bat,” Bard refused, saying that it could not generate such a picture :) Still, I think one could craft a prompt that lulls the neural network’s vigilance and produces a "forbidden" result.

Now let's move on to the technical part.

This feature is currently available only in English. Bard generates 4 square images per request, each 1536 pixels on a side. Despite that resolution, generation is fairly fast: 5 to 10 seconds. Once the images are generated, a “Generate more” button appears below them; each click displays 2 more images. Any image can be downloaded at full size.
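To put those numbers in perspective, here is a minimal Python sketch of the arithmetic. The constants are the figures quoted above; the function names are my own, purely for illustration, and have nothing to do with any Bard API.

```python
# Figures taken from the article; names are hypothetical, for illustration only.
BASE_IMAGES = 4       # images returned for the initial request
EXTRA_PER_CLICK = 2   # images added by each "Generate more" click
SIDE_PX = 1536        # side length of each square image, in pixels


def total_images(clicks: int) -> int:
    """Number of images on screen after `clicks` presses of "Generate more"."""
    if clicks < 0:
        raise ValueError("clicks must be non-negative")
    return BASE_IMAGES + EXTRA_PER_CLICK * clicks


def megapixels() -> float:
    """Size of one square image in megapixels."""
    return SIDE_PX * SIDE_PX / 1_000_000


print(total_images(3))         # 10
print(round(megapixels(), 2))  # 2.36
```

So three extra clicks give ten images in total, each a little over 2 megapixels.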

Google Bard generate image1

As you can see, it shows the most relevant results first, then starts to get lazy and produces outright nonsense. But if you refine the request, it develops the topic in the given direction. Let's run an experiment. To begin with, I enter a deliberately erroneous query: "generate image: red frog iten yellow apple".

Google Bard generate image2

We get exactly what Bard could make of the parts of the query it understood: a red frog and a yellow apple. That's all.
Now let’s enter a correct query, fleshed out with all sorts of details: “generate a picture: a red frog eats a yellow apple. The frog is wearing a pink swimsuit and has a hat on her head. The liar lies on a deck chair”

Google Bard generate image3

And now we have a completely relevant result!
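The experiment boils down to a simple recipe: a main scene plus a few detail sentences. Here is a minimal Python sketch of that recipe. The `build_prompt` helper is my own invention for illustration; it only assembles a string that you would then paste into Bard by hand, and it does not call any Google API.

```python
def build_prompt(scene: str, details: list[str]) -> str:
    """Join a main scene and extra detail sentences into one prompt string.

    Hypothetical helper: it only formats text, it does not talk to Bard.
    """
    # Normalize each part so it ends with exactly one period.
    parts = [scene.strip().rstrip(".") + "."]
    parts += [d.strip().rstrip(".") + "." for d in details if d.strip()]
    return "generate a picture: " + " ".join(parts)


prompt = build_prompt(
    "a red frog eats a yellow apple",
    ["The frog is wearing a pink swimsuit and has a hat on her head"],
)
print(prompt)
```

Keeping the details in a list makes it easy to add or drop one sentence at a time and see how the picture changes.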

Of course, like its competitors, Google Imagen 2 has its drawbacks. To get a high-quality image, you need to spend some time writing a good prompt and refining it along the way until the generated result matches what you need. But I am sure that, in time, generated images will be indistinguishable from those drawn by a person. After all, just a couple of years ago most people could not even imagine that you could say a few words and have them turn into a picture. Long live neural networks! :)

2024-02-10 08:59