DALL·E mini has a mysterious obsession with women wearing saris

Like most heavy internet users, Brazilian screenwriter Fernando Marés was fascinated by the images generated by the artificial intelligence (AI) model DALL·E mini. Over the past few weeks, the AI system has become an internet sensation by creating images based on seemingly random and bizarre queries from users – like “Lady Gaga as the Joker,” “Elon Musk being sued by a capybara,” and more.

Marés, a veteran hacker, began using DALL·E mini in early June. But instead of typing in a specific request, he tried something different: he left the field blank. Fascinated by the seemingly random results, Marés ran the blank search over and over again. That’s when he noticed something strange: almost every time he ran a blank request, DALL·E mini generated images of brown-skinned women wearing saris, a type of garment common in South Asia.

Marés queried DALL·E mini thousands of times with the blank command to see whether it was just a coincidence. Then he invited his friends to take turns on his computer, generating images on five browser tabs simultaneously. He said the experiment ran for about 10 hours without interruption. He built a sprawling repository of more than 5,000 unique images, and shared 1.4 GB of DALL·E mini data with Rest of World.

Most of these images contain pictures of dark-skinned women wearing saris. Why does DALL·E mini seem so fixated on this very specific type of image? According to AI researchers, the answer may have something to do with sloppy tagging and incomplete datasets.

DALL·E mini was developed by AI artist Boris Dayma and inspired by DALL·E 2, an OpenAI program that generates hyper-realistic art and images from text input. From meditating cats to robot dinosaurs battling monster trucks in the Colosseum, the images blew everyone’s minds, with some calling them a threat to human illustrators. Acknowledging the potential for abuse, OpenAI restricted access to its model to a hand-picked group of 400 researchers.

Dayma was fascinated by the art produced by DALL·E 2 and “wanted to have an open-source version that could be accessed and improved by everyone,” he told Rest of World. So he went ahead and created a stripped-down, open-source version of the model and named it DALL·E mini. He launched it in July 2021, and the model has been training and refining its output ever since.



DALL·E mini is now a viral internet phenomenon. The images it produces aren’t nearly as sharp as DALL·E 2’s, and they show noticeable deformation and blurring, but the system’s unhinged renderings – everything from the Demogorgon from Stranger Things holding a basketball to a public execution at Disney World – have spawned an entire subculture, with subreddits and Twitter handles dedicated to curating its images. It has inspired a cartoon in The New Yorker, and the Twitter handle Weird Dall-E Creations has over 730,000 followers. Dayma told Rest of World that the model generates about 5 million prompts a day, and that he is currently working to keep pace with the enormous growth in user interest. (DALL·E mini is not affiliated with OpenAI and, at OpenAI’s insistence, the open-source model was renamed Craiyon as of June 20.)

Dayma admits he is baffled as to why the system generates images of brown-skinned women in saris for blank requests, but suspects it has something to do with the program’s dataset. “It’s quite interesting, and I’m not sure why it happens,” Dayma told Rest of World after reviewing the images. “It’s also possible that this type of image was highly represented in the dataset, maybe also with short captions,” he said. Rest of World also reached out to OpenAI, the creator of DALL·E 2, to see if it had any insight, but has yet to hear back.

AI models like DALL·E mini learn to draw an image by parsing millions of images from the internet along with their associated captions. The DALL·E mini model was trained on three major datasets: Conceptual Captions, which contains 3 million image-and-caption pairs; Conceptual 12M, which contains 12 million; and OpenAI’s corpus of about 15 million images. Dayma and DALL·E mini co-creator Pedro Cuenca noted that their model was also trained on unfiltered data from the internet, which opens it up to unknown, unexplained biases in the datasets that can trickle down into image-generation models.
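To make that training setup concrete, here is a minimal, hypothetical sketch of how image-and-caption pairs are typically assembled from a dataset like Conceptual Captions, which is distributed as tab-separated caption/URL lines. The sample data, URLs, and field names below are invented for illustration, not taken from the actual DALL·E mini pipeline:

```python
# Hypothetical sketch: assembling (caption, image) training pairs from
# TSV lines of the form "caption<TAB>url". Illustrative only.

def load_pairs(tsv_lines):
    """Parse tab-separated 'caption<TAB>url' lines into training pairs."""
    pairs = []
    for line in tsv_lines:
        caption, _, url = line.rstrip("\n").partition("\t")
        pairs.append({"caption": caption, "image_url": url})
    return pairs

sample = [
    "a cat meditating on a rug\thttp://example.com/cat.jpg",
    "woman in a sari at a festival\thttp://example.com/sari.jpg",
    "\thttp://example.com/unlabeled.jpg",  # empty caption survives as ""
]

pairs = load_pairs(sample)
# A text-to-image model learns a mapping from caption to image; images
# whose captions are empty or very short end up associated with the
# "blank prompt" region of the model's input space.
empty = [p for p in pairs if not p["caption"]]
print(len(pairs), len(empty))  # 3 1
```

If many images end up with empty or near-empty captions, a blank prompt at inference time will gravitate toward whatever those images have in common.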

Marés isn’t alone in scrutinizing the model’s underlying training data. Seeking answers, he turned to Hugging Face, the popular machine learning community where DALL·E mini is hosted. There, members of the computer science community weighed in, with some offering plausible explanations: the AI could have been trained on millions of uncaptioned images of people from South and Southeast Asia. Dayma disputes this theory, however, saying that no image in the dataset lacks a caption.

“Machine learning systems usually have the opposite problem – they don’t actually include enough images of non-white people.”

Michael Cook, who is currently researching the intersection of artificial intelligence, creativity and game design at Queen Mary University of London, challenged the theory that the data set included too many photos of people from South Asia. “Machine learning systems usually have the opposite problem — they actually don’t include enough images of non-white people,” Cook said.

Cook has his own theory about DALL·E mini’s puzzling results. “One of the things that struck me while reading around is that a lot of these datasets exclude non-English text, and they also exclude information about specific people, like proper names,” Cook said.

“What we might be seeing is a weird side effect of some of this filtering or preprocessing, where images of Indian women, for example, are less likely to be filtered by the ban list, or the text describing the images is removed and they’re added to the dataset with no labels attached.” For instance, if the captions were in Hindi or another language, the text might get garbled during data processing, leaving the images without captions. “I couldn’t say that for sure – it’s just a theory that occurred to me while exploring the data.”
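Cook’s hypothesis can be illustrated with a tiny, hypothetical sketch: a crude “English-only” preprocessing step that keeps only ASCII characters would silently erase a Hindi caption while the image itself stays in the dataset. The filter and sample captions below are invented for illustration and are not from any real pipeline:

```python
# Hypothetical sketch of Cook's theory: a naive preprocessing filter
# that drops non-English text can leave some images with empty captions.
# The filter below is deliberately crude, for illustration only.

def strip_non_ascii(caption):
    """Keep only ASCII characters -- a blunt 'English-only' filter."""
    return "".join(ch for ch in caption if ch.isascii()).strip()

captions = [
    "a dog playing fetch",   # English: survives intact
    "साड़ी पहने महिला",          # Hindi ("woman wearing a sari"): wiped out
]

cleaned = [strip_non_ascii(c) for c in captions]
print(cleaned)  # ['a dog playing fetch', '']
```

Images whose captions are emptied this way would still enter the training set, and a blank prompt at inference time would match them most closely, which is consistent with Marés’s observation, though, as Cook says, it remains only a theory.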

Biases in AI systems are universal, and even well-funded Big Tech initiatives such as Microsoft’s chatbot Tay and Amazon’s AI recruiting tool have succumbed to the problem. In fact, Google’s text-to-image generation model, Imagen, and OpenAI’s DALL·E 2 explicitly disclose that their models have the potential to recreate harmful biases and stereotypes, as does DALL·E mini.

Cook has been a vocal critic of what he sees as the increasing callousness and rote disclosures that brush off biases as an inevitable part of nascent AI models. He told Rest of World that while it’s commendable that a new piece of technology lets people have a lot of fun, “I think there are serious cultural and social issues with this technology that we don’t really appreciate.”

Dayma, the creator of DALL·E mini, acknowledges that the model is still a work in progress, and that the extent of its biases has yet to be fully documented. “The model has sparked a lot more interest than I expected,” Dayma told Rest of World. He wants the model to remain open source so that his team can study its limitations and biases faster. “I think it’s interesting for the public to be aware of what is possible, so they can develop a critical mind toward the media they receive as images, just as they do toward media received as news articles.”

Meanwhile, the mystery remains unsolved. “I learn a lot just by seeing how people use the model,” Dayma told Rest of World. “When it’s empty, it’s a gray area, so [I] still need to research it in more detail.”

Marés said it’s important for people to learn about the potential harms of seemingly fun AI systems like DALL·E mini. The fact that even Dayma can’t discern why the system spits out these images only reinforces his concerns. “This is what the press and critics have [been] saying for years: that these things are unpredictable and they can’t control it.”

