It is enough to type a text description, and the AI, like a skilled painter, produces an image from it with surprisingly good results.

While most images are relatively easy to describe in words, actually producing them requires special skills and hours of work. If, on the other hand, an AI could automatically generate realistic images from text written in natural language, it would not only let people create rich and varied visual content with unprecedented ease, it would also make iterative refinement much simpler. One imagines a landscape or a scene, describes it in words, and the artificial intelligence paints the picture.

There have been experiments of this kind before, but their results have been mixed and much debated. Recently, however, technology developed by researchers at the OpenAI artificial intelligence lab, founded in 2015 by Elon Musk and other well-known tech figures, has produced impressive results.

To achieve this, the researchers used so-called guided diffusion models. With the 3.5-billion-parameter model called GLIDE (Guided Language to Image Diffusion for Generation and Editing), available on GitHub, the AI generates an image from a text description and can then edit and refine it as requested.
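The "guided" part of guided diffusion refers to how the text steers image generation. In the classifier-free variant GLIDE uses, the model predicts noise twice, once with the text prompt and once without, and then extrapolates toward the text-conditioned prediction. A minimal numpy sketch of that combination step (the toy arrays and the guidance scale here are illustrative placeholders, not the real model's outputs):

```python
import numpy as np

def classifier_free_guidance(eps_uncond, eps_cond, scale):
    """Combine unconditional and text-conditional noise predictions.

    eps_uncond: model output given an empty prompt
    eps_cond:   model output given the actual text prompt
    scale:      guidance scale; 1.0 means no extra guidance,
                larger values push the sample harder toward the prompt
    """
    return eps_uncond + scale * (eps_cond - eps_uncond)

# Toy 2x2 "noise predictions" standing in for real network outputs
eps_uncond = np.zeros((2, 2))
eps_cond = np.ones((2, 2))

# With scale > 1 the result is extrapolated past the conditional prediction
guided = classifier_free_guidance(eps_uncond, eps_cond, scale=3.0)
```

This combined prediction replaces the model's raw output at every denoising step, which is why larger guidance scales yield images that match the prompt more closely at the cost of diversity.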

For example, given the prompt "a girl hugging a corgi," the model can take an image of a girl hugging a dog of any breed and replace the dog with a corgi, much as one would do in Photoshop. In tests, GLIDE produced high-quality images with realistic shadows, reflections, and textures.
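The editing workflow described above is text-conditioned inpainting: the user marks a region (the dog), the model repaints only that region to match the prompt, and everything outside the mask is kept from the original photo. The final compositing step can be sketched as a simple mask blend (the arrays here are toy stand-ins, not the actual GLIDE API):

```python
import numpy as np

def composite(original, generated, mask):
    """Keep original pixels outside the mask and take the model's
    newly generated pixels inside it (mask: 1 = region to repaint)."""
    return mask * generated + (1 - mask) * original

# Toy 4x4 grayscale "images" standing in for real pixel data
original = np.full((4, 4), 0.2)   # the photo of the girl and the dog
generated = np.full((4, 4), 0.9)  # the model's corgi fill
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0              # region covering the dog

edited = composite(original, generated, mask)
```

In the real model this blending happens inside the diffusion loop at every step, which is what lets the repainted region pick up shadows and reflections consistent with the untouched parts of the photo.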

The model can also create illustrations in different styles, such as the style of Van Gogh or of a particular painting. GLIDE can likewise interpret concepts such as a necktie or a Christmas hat on a corgi, and attach attributes such as color or size to these objects. Users can also make various adjustments to existing images with a simple text command.

Of course, GLIDE isn’t perfect either. The examples above are success stories, but the study also reports failures. Some prompts describing very unusual objects or scenarios, such as a car with triangular wheels, do not yield satisfactory results. Diffusion models are only as good as the data they were trained on, so imagination remains a human domain – at least for now.
