DALL-E, Generative AI… and Craiyon

What is DALL-E? What can it do? How does it work? And what about its FOSS alternative, which one can access with a few restrictions? Read this book to find out…

I recently came across a book titled ‘Generating Creative Images With DALL-E 3’, from Packt Publishing. It helped me find answers to questions I didn’t even know existed, and it reinforced my belief that reading a good book can be worth more than sitting in a classroom for a month. You get my drift? My task here is to encourage techies to read, and read a lot!

By way of background (courtesy Wikipedia), DALL·E, DALL·E 2, and DALL·E 3 are text-to-image models developed by OpenAI. All three use deep learning methodologies to generate digital images from natural language descriptions known as ‘prompts’. In simple words, DALL-E can create “realistic images and art” based on a basic description.

Its first version came out in January 2021, and was initially accessible ‘by invitation only’. But that has changed since.

The name DALL-E is a ‘portmanteau’ (blended word) of the names of WALL-E, Pixar’s animated robot character, and the Spanish surrealist artist Salvador Dalí (1904-1989). One would have never guessed this! In the movie, WALL-E is the solitary robot left to clean up garbage on a deserted, uninhabitable Earth in 2805. By way of trivia, WALL-E’s name stands for Waste Allocation Load Lifter: Earth-Class.

But not everything is hunky-dory and fun. Today, there are some concerns about DALL-E. These include its biases when generating images involving race and gender, its potential to be used to create deep fakes and promote misinformation, and the possible ‘technological unemployment’ of artists, photographers and graphic designers due to its accuracy and popularity.

Importantly, OpenAI has not released the source code for any of the three models. So FOSS (free/libre and open source software) attempts have been made to build models with similar capabilities. For instance, Craiyon used to be called DALL-E Mini, until OpenAI requested a name change. It is a model based on the original DALL-E, and was trained on unfiltered data.

The India connect

My interest in art (as a hobby, and sometimes for work) drew me to this book. More notable is the number of Indian names associated with the production of this Packt text: Niranjan Naikwadi, Nitin Nainani, Hemangi Lotlikar, Shrishti Pandey, Seemanjay Ameriya, Pratik Shirodkar, Vijay Kambli, and Vinishka Kalra.

Its author, Holly Picano, is an expert in both digital marketing and generative AI. One of the reviewers, Jasdeep Sidhu, is the founder of a generative AI startup and an independent AI consultant.

If you go through the book, it’s hard not to stop at the chapter on how to design a film poster. That is one of the nine chapters in this rather slender 230-page book.

In Part 2 of the book, chapters 4, 5 and 6 look at (not in that order) designing art for covers of books or magazines and other publications; crafting fine art prints with DALL-E 3; and NFTs or non-fungible tokens. (An NFT is a unique digital identifier. It is recorded on a blockchain and used to certify ownership and authenticity. NFTs can be sold and traded, or that’s the theory.)

So, what is Craiyon?

Previously called DALL-E Mini, Craiyon is the creation of CEO, lead researcher, and AI guru, Boris Dayma. Dayma’s Twitter/X profile identifies him as the founder of Craiyon and author of DALL-E Mini. “It felt like magic for me for soooo long,” Dayma has said of his work.

Craiyon (https://www.craiyon.com/) describes itself as “a free AI image generator that’s painting a new generation for the AI art revolution through our own model.” Born in 2022, it has been growing since.

“If you can dream it, Craiyon can draw it,” says the firm’s website, claiming it can do so in seconds. It lets you generate nine free images at a time within a minute, with the option to go “for unlimited art, fewer ads, and faster generation.” It offers to help you create bizarre concepts including “Gandhi as a Dragon Ball Z battle card.”

Generating Creative Images With DALL-E 3
Holly Picano
2024 Packt Publishing
ISBN 978-1-83508-771-8

Starting easy

As I’ve come to expect of Packt books, this one too starts at the shallow end of the pool. It says the book is meant for “individuals with interests in art, technology, and creativity.” That’s quite a broad category; I guess it helps if you’re interested in more than one of these areas. The book is also suggested to be “particularly engaging” for artists, designers, educators and students.

It opens with the basic concepts of generative AI, and goes on to creating “unique artwork.”

Many questions get answered here, besides the background information. For instance, how to craft text prompts and generate your first artwork; how to create variations and master image attributes and quality through parameters and sizing; how to create fine art prints; and how to “create and mint” AI-generated artworks as NFTs.
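
For readers who would rather see “parameters and sizing” in code than in a chat window, here is a minimal Python sketch. It is not taken from the book. It assumes you use OpenAI’s official `openai` Python package and have an API key in the `OPENAI_API_KEY` environment variable. The helper names `build_image_request` and `generate_image_url` are my own, hypothetical ones, and the lists of accepted sizes and quality levels reflect the Images API for DALL-E 3 as I understand it, so do check the current API reference before relying on them.

```python
"""Sketch: turning a text prompt plus size/quality settings into a DALL-E 3 request."""

# Sizes and quality levels assumed to be accepted for DALL-E 3 by OpenAI's
# Images API; verify against the current API reference.
DALLE3_SIZES = {"1024x1024", "1792x1024", "1024x1792"}
DALLE3_QUALITIES = {"standard", "hd"}


def build_image_request(prompt, size="1024x1024", quality="standard"):
    """Validate the settings and return keyword arguments for images.generate().

    Hypothetical helper: it makes no network call, it only checks the inputs.
    """
    prompt = prompt.strip()
    if not prompt:
        raise ValueError("prompt must not be empty")
    if size not in DALLE3_SIZES:
        raise ValueError(f"unsupported size: {size}")
    if quality not in DALLE3_QUALITIES:
        raise ValueError(f"unsupported quality: {quality}")
    # One image per request is assumed here.
    return {
        "model": "dall-e-3",
        "prompt": prompt,
        "size": size,
        "quality": quality,
        "n": 1,
    }


def generate_image_url(prompt, **settings):
    """Send the request and return the URL of the generated image.

    Needs the third-party `openai` package and a valid API key.
    """
    from openai import OpenAI  # imported here so the validation helper works without it

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.images.generate(**build_image_request(prompt, **settings))
    return response.data[0].url
```

Calling `generate_image_url("A film poster for a noir thriller, painted in watercolour", size="1024x1792", quality="hd")` would request one tall, higher-detail image. Keeping the validation separate from the network call means you can catch a bad size or an empty prompt before spending any API credit.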

The aim of this book, we’re promised, is to get you started with DALL-E 3, help you understand how to use prompts, and examine the ethical considerations involved. It will also help you get a grasp of what AI and generative AI mean, and explore how DALL-E 3 uses AI. (Generative AI seeks to create new data or content “that wasn’t in the original training set.” So it can create deep-fake videos, art or music that never existed before.)

It’s interesting that to apply some concepts of DALL-E 3, one also needs to understand a few things about photography or sketching. The strengths and weaknesses of the software are discussed in the book.

Chapter 8 offers an “effective prompt cheat sheet.” It promises a “treasure trove” to turn you into a prompt expert, one “filled to the brim with insightful cheat sheets that make navigating this advanced AI tool a breeze.” These tips come across as quite practical. Case studies and a useful index at the end round off this book.

You’ll run into other strange realities in the pages of this text. For instance, how ‘open’ is OpenAI? Not very…

Tech journalist Rachel Metz commented in a post on Bloomberg (http://bloomberg.com/) in March this year: “It’s true that OpenAI has open sourced some of its technology, but it has not open sourced most of its technology.” It’s obvious that AI could be at the centre of new battles over open source — how open it should be, and what is a genuine product or a fake.

This is a useful and easy-to-recommend book. Check it out online and get at least a digital copy of your own.
