Synthetic data is on the rise in artificial intelligence. It's going to make AI cheaper, better and less biased.
It's also easy to obtain and use. In a short while, it has gone from an experimental technology to something I would not hesitate to use in production AI solutions.
To illustrate that, I will build an AI that can tell apples from bananas. I will train it only on images of the two classes generated by another AI, in this case DALL-E Mini.
An Apple or Banana recognizer
I will build an image classifier using only easy-to-access, free AutoAI tools.
Generating data
We need around 30 images of each label, bananas and apples.
We will be using DALL-E Mini, an open-source text-to-image model inspired by OpenAI's DALL-E.
To generate the images, you can go to https://huggingface.co/spaces/dalle-mini/dalle-mini. Here you can prompt the text-to-image model with queries such as:
"Banana on table"
"Banana on random background"
"Apple on table"
"Apple on random background"
Try to match the background you will be testing on.
The response should be a grid of generated images for your prompt.
Generate around 30 images of each label and save them.
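If you want to keep track of the prompts and of how many images you have saved so far, a small helper script can do the bookkeeping. The following is a minimal sketch, not part of the original workflow: it assumes you save the images by hand from the DALL-E Mini page into a folder named `synthetic_data` with one sub-folder per label (`banana`, `apple`). The folder layout and the function names `build_prompts` and `count_images` are made up for this example, and the script does not call DALL-E Mini itself.

```python
from itertools import product
from pathlib import Path

LABELS = ["Banana", "Apple"]
# Assumption: add the backgrounds you expect to test on.
BACKGROUNDS = ["table", "random background"]
TARGET_PER_LABEL = 30  # "around 30 images of each label"


def build_prompts(labels=LABELS, backgrounds=BACKGROUNDS):
    """Return one '<label> on <background>' prompt per combination."""
    return [f"{label} on {background}"
            for label, background in product(labels, backgrounds)]


def count_images(root, labels=LABELS, extensions=(".png", ".jpg", ".jpeg")):
    """Count saved images per label, assuming one sub-folder per label."""
    counts = {}
    for label in labels:
        folder = Path(root) / label.lower()
        if folder.is_dir():
            counts[label] = sum(
                1 for path in folder.iterdir()
                if path.suffix.lower() in extensions
            )
        else:
            # A missing folder simply means nothing has been saved yet.
            counts[label] = 0
    return counts


if __name__ == "__main__":
    print("Prompts to paste into DALL-E Mini:")
    for prompt in build_prompts():
        print(" ", prompt)

    # "synthetic_data" is a hypothetical folder name.
    for label, found in count_images("synthetic_data").items():
        print(f"{label}: {found}/{TARGET_PER_LABEL} images saved")
```

Running the script prints the four prompts listed above, followed by the number of images found for each label, so you can see when you have reached roughly 30 per class.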
Creating the model
To create the model, we will be using the tool Teachable Machine.
Open Teachable Machine and choose Image project and then Standard image model.
Name your classes Banana and Apple.
Upload your images from DALL-E Mini to the respective classes.
Press Train. Training will only take a few seconds.
Testing the model
Now you can try the model by showing a banana or an apple to the webcam in the preview.
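If you want more than a visual impression from the webcam preview, you could note down what you showed and what the model predicted for a handful of trials, then tally the results. The snippet below is a minimal sketch under that assumption: the function name `per_class_accuracy` is made up for this example, and the trial values in it are placeholders for illustration, not measured results.

```python
from collections import Counter


def per_class_accuracy(trials):
    """Return {label: share of correct predictions} for (shown, predicted) pairs."""
    shown_counts = Counter()
    correct_counts = Counter()
    for shown, predicted in trials:
        shown_counts[shown] += 1
        if shown == predicted:
            correct_counts[shown] += 1
    return {label: correct_counts[label] / total
            for label, total in shown_counts.items()}


if __name__ == "__main__":
    # Placeholder trials for illustration only; replace with your own notes.
    example_trials = [
        ("Banana", "Banana"),
        ("Banana", "Apple"),
        ("Apple", "Apple"),
        ("Apple", "Apple"),
    ]
    for label, accuracy in per_class_accuracy(example_trials).items():
        print(f"{label}: {accuracy:.0%} correct")
```

A per-class tally like this makes it easier to spot whether the model struggles with one fruit in particular, for example against a background that the generated images did not cover.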
It works, even though the model has been trained using only synthetic data. I think it's remarkable that a model can be trained on synthetic data alone.
It suggests a future with significantly lower costs of building AI solutions. That means more people can build AI and utilize the possibilities of the technology.
A fascinating future!
For more tips, sign up for the book here: https://www.danrose.ai/book.