DALL-E
DALL·E, DALL·E 2, and DALL·E 3 are text-to-image models developed by OpenAI using deep learning methodologies to generate digital images from natural language descriptions known as "prompts".
The first version of DALL·E was announced in January 2021. Its successor, DALL·E 2, was released the following year. DALL·E 3 was released natively into ChatGPT for ChatGPT Plus and ChatGPT Enterprise customers in October 2023,[1] with availability via OpenAI's API[2] and "Labs" platform provided in early November.[3] Microsoft implemented the model in Bing's Image Creator tool and plans to implement it in its Designer app.[4]
History and background
DALL·E was revealed by OpenAI in a blog post on 5 January 2021, and uses a version of GPT-3[5] modified to generate images.
On 6 April 2022, OpenAI announced DALL·E 2, a successor designed to generate more realistic images at higher resolutions that "can combine concepts, attributes, and styles".[6] Access was initially restricted to pre-selected users for a research preview due to concerns about ethics and safety.[9][10] On 20 July 2022, DALL·E 2 entered a beta phase, with invitations sent to 1 million waitlisted individuals;[7] users could generate a certain number of images for free every month and could purchase more.[8] On 28 September 2022, DALL·E 2 was opened to everyone and the waitlist requirement was removed.[11]
In early November 2022, OpenAI released DALL·E 2 as an API, allowing developers to integrate the model into their own applications. The API operates on a cost-per-image basis, with prices varying depending on image resolution. Volume discounts are available to companies working with OpenAI's enterprise team.[14] Microsoft unveiled its implementation of DALL·E 2 in its Designer app and in the Image Creator tool included in Bing and Microsoft Edge.[13] In September 2023, OpenAI announced its latest image model, DALL·E 3, capable of understanding "significantly more nuance and detail" than previous iterations.[12]
The software's name is a portmanteau of the names of the animated Pixar robot character WALL-E and the Catalan surrealist artist Salvador Dalí.[15][5]
In February 2024, OpenAI began adding watermarks to DALL·E-generated images, containing metadata in the C2PA (Coalition for Content Provenance and Authenticity) standard promoted by the Content Authenticity Initiative.[16]
Ethical concerns
DALL·E 2's reliance on public datasets influences its results and leads to algorithmic bias in some cases, such as generating more men than women for requests that do not mention gender.[40] DALL·E 2's training data was filtered to remove violent and sexual imagery, but this was found to increase bias in some cases, such as reducing the frequency with which women were generated.[41] OpenAI hypothesized that this may be because women were more likely to be sexualized in the training data, which caused the filter to influence results.[41] In September 2022, OpenAI confirmed to The Verge that DALL·E invisibly inserts phrases into user prompts to address bias in results; for instance, "black man" and "Asian woman" are inserted into prompts that do not specify gender or race.[42]
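The prompt insertion described above can be illustrated with a minimal sketch. OpenAI has not published how the insertion works, so everything here is an assumption made for illustration: the function name `augment_prompt`, the word list used to decide whether a prompt already specifies gender or race, and the choice to append the phrase at the end of the prompt. Only the two example phrases come from the text.

```python
import random
import re

# Hypothetical word list for deciding whether a prompt already specifies
# gender or race; the real criteria are not public.
SPECIFYING_WORDS = {"man", "woman", "male", "female", "black", "white", "asian"}

# The two example phrases reported in the text; any wider list is unknown.
INSERTED_PHRASES = ["black man", "Asian woman"]


def augment_prompt(prompt: str, rng: random.Random) -> str:
    """Append a demographic phrase when the prompt specifies none (illustrative only)."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    if words & SPECIFYING_WORDS:
        # The prompt already mentions gender or race, so leave it unchanged.
        return prompt
    return f"{prompt}, {rng.choice(INSERTED_PHRASES)}"


rng = random.Random(0)
unspecified = augment_prompt("a portrait of a doctor", rng)
specified = augment_prompt("a portrait of a woman doctor", rng)
```

In this sketch the first prompt gains one of the two phrases, while the second is returned unchanged because it already contains "woman". A word-level check like this is crude (it would also match "white" in "white house"), which is one reason it should be read as an illustration of the idea rather than a description of the deployed system.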
A concern about DALL·E 2 and similar image generation models is that they could be used to propagate deepfakes and other forms of misinformation.[43][44] As an attempt to mitigate this, the software rejects prompts involving public figures and uploads containing human faces.[45] Prompts containing potentially objectionable content are blocked, and uploaded images are analyzed to detect offensive material.[46] A disadvantage of prompt-based filtering is that it is easy to bypass using alternative phrases that result in a similar output. For example, the word "blood" is filtered, but "ketchup" and "red liquid" are not.[47][46]
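The weakness of prompt-based filtering can be shown with a small sketch of a keyword blocklist. This is not OpenAI's filter, whose implementation is not public; the blocklist contents beyond "blood", the function name `is_blocked`, and the sample prompts are all hypothetical.

```python
import re

# Hypothetical blocklist containing only the term mentioned in the text.
BLOCKED_TERMS = {"blood"}


def is_blocked(prompt: str) -> bool:
    """Return True if any word in the prompt appears on the blocklist."""
    words = re.findall(r"[a-z]+", prompt.lower())
    return any(word in BLOCKED_TERMS for word in words)


prompts = [
    "a horse lying in a pool of blood",
    "a horse lying in a pool of ketchup",
    "a horse lying in a pool of red liquid",
]
results = {prompt: is_blocked(prompt) for prompt in prompts}
```

Only the first prompt is rejected. The other two pass because the filter matches surface words rather than the visual content they are likely to produce, which is the bypass the paragraph describes.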
Another concern about DALL·E 2 and similar models is that they could cause technological unemployment for artists, photographers, and graphic designers due to their accuracy and popularity.[48][49] DALL·E 3 is designed to block users from generating art in the style of living artists.[12]
In 2023, Microsoft pitched the United States Department of Defense on using DALL·E models to train battlefield management systems.[50] In January 2024, OpenAI removed its blanket ban on military and warfare use from its usage policies.[51]
Open-source implementations
Since OpenAI has not released source code for any of the three models, there have been several attempts to create open-source models offering similar capabilities.[66][67] Released in 2022 on Hugging Face's Spaces platform, Craiyon (formerly DALL·E Mini until a name change was requested by OpenAI in June 2022) is an AI model based on the original DALL·E that was trained on unfiltered data from the Internet. It attracted substantial media attention in mid-2022, after its release, due to its capacity for producing humorous imagery.[68][69][70]