- Google Whisk uses images as input rather than text-based prompts
- It is built on Google’s Imagen 3 generative AI model
- This experimental tool is free to try for users in the United States
Google’s new artificial intelligence tool makes it easier to create and remix visual concepts. Instead of asking you to describe what’s in your head, Whisk lets you enter three image prompts: one for subject, one for scene, and one for style. Whisk takes care of the rest, making it a more intuitive way to try out different ideas.
Although most of the best AI image generators ask you to write detailed prompts, Whisk handles this behind the scenes. When you drop images as inspiration into the web-based Whisk interface, Google’s Gemini model automatically analyzes them and writes a detailed caption for each one. These captions are then fed into the Imagen 3 model to generate matching images.
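To make that flow concrete, here is a minimal Python sketch of the three steps: caption each input image, combine the captions into one text prompt, then generate a pair of results. It does not call any Google service. The names `WhiskInputs`, `caption_image`, `build_prompt` and `generate_pair` are hypothetical, and the captioning and generation functions are simple stand-ins for Gemini and Imagen 3, whose internal prompts are not public.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class WhiskInputs:
    """The three image slots described in the article (file names are illustrative)."""
    subject: str
    scene: str
    style: str


def caption_image(image_name: str) -> str:
    """Hypothetical stand-in for the Gemini captioning step."""
    return f"a detailed description of {image_name}"


def build_prompt(inputs: WhiskInputs) -> str:
    """Combine one caption per slot into a single text prompt."""
    return (
        f"Subject: {caption_image(inputs.subject)}. "
        f"Scene: {caption_image(inputs.scene)}. "
        f"Style: {caption_image(inputs.style)}."
    )


def generate_pair(prompt: str) -> list[str]:
    """Hypothetical stand-in for Imagen 3: returns two placeholder results."""
    return [f"image {i} for: {prompt}" for i in (1, 2)]


inputs = WhiskInputs("car.jpg", "countryside.jpg", "watercolor.jpg")
prompt = build_prompt(inputs)
results = generate_pair(prompt)
```

Because the text prompt sits between the two models, editing it is all it takes to steer the result, which is why Whisk can expose it to the user.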
For example, you could place an image of a car as the subject and a photo of a rural landscape as the scene. You can add watercolor as a style and see what Whisk creates. Click the button and you will get a pair of images based on your input.
From here, the images can be easily remixed. The interface allows you to specify additional text-based details to adjust the results. You can also add different source images, or roll the dice if you need inspiration. New results appear in pairs in the feed, making it an intuitive way to ideate. You can also choose to refine an image by revealing its text prompt and adding more detail.
Stir it up
While Whisk is designed to eliminate the need for text-based prompts, Google still lets you edit the written prompts, because the results don’t always match the source material.
In a blog post about the experimental tool, Google explains that Whisk “captures the essence of your subject, not an exact replica.” Its effectiveness depends on Gemini’s analysis of the image you submit. While this is often impressive, it can’t read your mind: you might expect Whisk to extract one detail from an image, while it focuses on another.
The post further explains: “Because Whisk only extracts a few key features from your image, it may produce a different image than you expect. For example, the resulting objects may have different heights, weights, hairstyles, or skin colors. We know these features may be critical to your project and Whisk may not hit the mark, so we let you view and edit the underlying prompts at any time.”
Despite these shortcomings, Whisk is still an interesting application of Google’s existing artificial intelligence tools. The underlying generative model is the same as when you chat with Gemini through its text interface. However, by relying on image input, Whisk provides visual creators with an easier and more intuitive way to develop their ideas.
Based on early feedback from several creatives, Google is calling Whisk “a new creative tool” designed for “rapid visual exploration, not pixel-perfect editing.”
How to try Google Whisk
Google Whisk is currently only available to users in the United States. If you’re there, you can try it in your web browser at labs.google/whisk.
This experimental tool is completely free. Data about how you use Whisk will be fed back to Google to help improve and develop future AI products.