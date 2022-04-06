In January 2021, OpenAI, one of the leading research organizations in the field of artificial intelligence, presented an intriguing AI called DALL-E, capable of generating images from scratch based on a description provided by the user.

To be clear, DALL-E is based on GPT-3, OpenAI’s revolutionary AI dedicated to text analysis and generation. Best of all, it proved capable not only of recreating images of things that exist in real life, but also of ‘getting the idea’ of what a far more surreal concept might look like. For example, ‘DALL-E, I want an avocado-shaped cup of tea’ (see image above).





Yes, the name ‘DALL-E’ honors both our Salvador Dalí and the robot WALL-E

Now, OpenAI has just revealed that it has developed a new version of this AI, known as DALL-E 2. After more than a year spent closely reviewing potential problems, such as bias in image generation or the production of misinformation, and improving the model’s overall performance and image quality, the company has decided to add some new functions to this artificial intelligence.

In addition, it has applied to this new model the technology of CLIP, another AI presented at the same time as the original DALL-E and dedicated to analyzing and categorizing images. It has done so by reversing CLIP’s function: whereas CLIP was capable of ‘seeing’ an image and describing its content, the new technique, unCLIP, improves the model’s ability to reconstruct an image from a description.
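To make the ‘seeing and describing’ part a little more concrete: CLIP places images and texts in a shared numerical space and scores how well they match, typically with cosine similarity. The sketch below only illustrates that matching step. The three-number vectors and the names `best_caption`, `IMAGE_EMBEDDING` and `CAPTION_EMBEDDINGS` are invented for this example; real embeddings come from trained neural networks and have hundreds of dimensions, and nothing here is OpenAI’s code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_caption(image_embedding, caption_embeddings):
    """Return the caption whose embedding points most nearly the same way as the image's."""
    return max(
        caption_embeddings,
        key=lambda caption: cosine_similarity(
            image_embedding, caption_embeddings[caption]
        ),
    )


# Hypothetical 3-dimensional embeddings, made up for illustration only.
IMAGE_EMBEDDING = [0.9, 0.1, 0.2]
CAPTION_EMBEDDINGS = {
    "an avocado-shaped cup of tea": [0.8, 0.2, 0.1],
    "a pink flamingo": [0.1, 0.9, 0.3],
    "a sea otter": [0.2, 0.1, 0.9],
}

print(best_caption(IMAGE_EMBEDDING, CAPTION_EMBEDDINGS))
```

With these toy numbers the image vector lies closest to the avocado caption, so that is the one selected. unCLIP, as described above, runs in the opposite direction: it starts from the description and works toward an image.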

DALL-E 2 generates 1024 x 1024 pixel images, a jump from the 256 x 256 pixels of DALL-E 1 (four times as many pixels per side, sixteen times as many overall)

New features

One of those features is the ‘inpainting’ function, which allows us, starting from any image we provide to the model, to select just one area of it and tell DALL-E 2 what changes we want made there. The possibilities are endless: change the texture of the back wall in a photo of your living room, add a Chinese vase next to the television, etc.



Where do you want me to put the pink flamingo, human?
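The idea of editing only a selected area can be pictured with a simple mask: wherever the mask is switched on, the pixel comes from newly generated content, and everywhere else the original is kept untouched. The sketch below is a toy illustration of that principle, not a description of how DALL-E 2 works internally. The function name `composite`, the 3 x 3 grids and the use of word labels instead of real pixel colors are all assumptions made for readability.

```python
def composite(original, generated, mask):
    """Take the pixel from `generated` where the mask is 1, otherwise keep `original`."""
    return [
        [gen if m else orig for orig, gen, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


# Toy 3x3 "photo" of a living room: each pixel is a label rather than a color.
original = [
    ["wall", "wall", "wall"],
    ["wall", "tv", "wall"],
    ["floor", "floor", "floor"],
]

# Stand-in for what a generative model might propose for the whole frame.
generated = [
    ["vase", "vase", "vase"],
    ["vase", "vase", "vase"],
    ["vase", "vase", "vase"],
]

# The user selects a single spot, right next to the television.
mask = [
    [0, 0, 0],
    [0, 0, 1],
    [0, 0, 0],
]

edited = composite(original, generated, mask)
for row in edited:
    print(row)
```

Only the masked spot changes; the television, the floor and the rest of the wall come through exactly as they were in the original.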

Another feature is creating variations on an original image: DALL-E 2 identifies the elements that appear in it, along with its pictorial or photographic style, and offers a range of images that are reminiscent of the original but entirely new. The best example is right here, with variations on Johannes Vermeer’s ‘Girl with a Pearl Earring’:





Oh, and did we mention that DALL-E 2 can apply the painterly style of one image when generating another, totally different one?



“‘Pearl Girl’-style sea otter, please, DALL-E.”

New limitations (understandable)

Not only have features been added; so have safeguards to prevent abuse. Thus, care has been taken to ensure that the dataset used to train DALL-E 2 contained no ‘objectionable content’, so the probability that the model generates such content is limited from the outset.

Similarly, DALL-E 2 adds a watermark to the resulting image, making it possible to detect that an artificial intelligence was involved. And finally, the model blocks the use of recognizable faces, even if we provide it with the name of the person in question (and even if that person is the ‘Mona Lisa’).

For now, it is only possible to test DALL-E 2 by registering on the platform created by OpenAI (and registration is not immediate: there is a ‘waiting list’). In any case, once you have your own account, a series of restrictions apply when uploading original reference images (no adult content, violence, medical or political themes, etc.). And the resulting images will go through a human review process.

The idea, of course, is that DALL-E 2 can go much further and even become an API that can be integrated into third-party applications. But OpenAI wants to make sure it doesn’t screw up before it gets to that point.

