Google unveils Gemini 2.0 Flash Thinking to rival OpenAI o1
December 19, 2024



In its latest move to redefine the field of artificial intelligence, Google has announced Gemini 2.0 Flash Thinking, a multimodal reasoning model capable of solving complex problems quickly and transparently.

In a post on social media, Google CEO Sundar Pichai wrote that it is “our most thoughtful model yet :)”

In its developer documentation, Google explains that “Thinking mode can have stronger reasoning capabilities in its responses than the base Gemini 2.0 Flash model,” which was Google’s latest and greatest model when it was released just eight days ago.

The new model supports only 32,000 input tokens (roughly 50 to 60 pages of text) and can generate up to 8,000 tokens per output response. In a side panel in Google AI Studio, the company claims the model is best suited for “multimodal understanding, reasoning” and “coding.”
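To make the page estimate concrete, here is a rough back-of-the-envelope sketch. The words-per-token ratio and the words-per-page figures are common rules of thumb assumed for illustration, not numbers published by Google, and the helper names are hypothetical. Actual token counts depend on the model's tokenizer.

```python
# Rough token-budget arithmetic. The limits come from the article; the
# conversion ratios below are assumptions, not figures published by Google.
INPUT_TOKEN_LIMIT = 32_000   # input tokens, as reported
OUTPUT_TOKEN_LIMIT = 8_000   # output tokens per response, as reported
WORDS_PER_TOKEN = 0.75       # assumed average for English text


def tokens_to_pages(tokens: int, words_per_page: int) -> float:
    """Estimate how many pages a token budget covers."""
    return tokens * WORDS_PER_TOKEN / words_per_page


def probably_fits_input(text: str) -> bool:
    """Crude check: estimate tokens from a whitespace word count."""
    estimated_tokens = len(text.split()) / WORDS_PER_TOKEN
    return estimated_tokens <= INPUT_TOKEN_LIMIT


if __name__ == "__main__":
    # Dense pages (about 480 words) versus lighter pages (about 400 words).
    print(tokens_to_pages(INPUT_TOKEN_LIMIT, 480))
    print(tokens_to_pages(INPUT_TOKEN_LIMIT, 400))
```

Under these assumptions the 32,000-token input window works out to about 24,000 words, which is consistent with the 50 to 60 page range quoted above.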

Full details of the model’s training process, architecture, licensing and cost have not yet been released. Currently, the cost per token in Google AI Studio is zero.

Easy-to-understand and more transparent reasoning

Unlike the competing reasoning models from OpenAI, o1 and o1 mini, Gemini 2.0 lets users access its step-by-step reasoning through a drop-down menu, providing clearer and more transparent insight into how the model reaches its conclusions.

By letting users see how decisions are made, Gemini 2.0 addresses long-standing concerns about artificial intelligence operating as a “black box” and brings this model (whose licensing terms are still unclear) in line with other open-source models launched by competitors.

My early, simple tests of the model showed that it could answer correctly and quickly (within one to three seconds) some questions that have been notoriously tricky for other AI models, such as counting the number of R’s in the word “strawberry.” (See screenshot above.)

In another test, when comparing two decimals (9.9 and 9.11), the model systematically broke the problem down into smaller steps, from analyzing whole numbers to comparing decimal places.
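For readers who want to check the expected answers to these two test prompts, the following is a minimal sketch. It is plain Python, not a reproduction of the model's reasoning, and the function names are hypothetical. The decimal comparison mirrors the step-by-step breakdown described above (whole-number part first, then decimal places) and assumes non-negative numbers written with a decimal point.

```python
from decimal import Decimal


def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a letter in a word."""
    return word.lower().count(letter.lower())


def larger_decimal(a: str, b: str) -> str:
    """Return the larger of two non-negative decimal strings, step by step."""
    whole_a, _, frac_a = a.partition(".")
    whole_b, _, frac_b = b.partition(".")

    # Step 1: compare the whole-number parts.
    if int(whole_a) != int(whole_b):
        return a if int(whole_a) > int(whole_b) else b

    # Step 2: pad the fractional parts to equal length ("9" becomes "90"),
    # so that comparing digit strings matches comparing place values.
    width = max(len(frac_a), len(frac_b))
    frac_a = frac_a.ljust(width, "0")
    frac_b = frac_b.ljust(width, "0")
    return a if frac_a >= frac_b else b


if __name__ == "__main__":
    print(count_letter("strawberry", "r"))   # 3
    print(larger_decimal("9.9", "9.11"))     # 9.9
    # Cross-check against Python's exact decimal arithmetic.
    print(Decimal("9.9") > Decimal("9.11"))  # True
```

The padding step is where the common mistake is avoided: 9.9 is 9.90, and 90 hundredths is more than 11 hundredths, even though 11 looks larger than 9.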

These results are supported by independent third-party analysis from LM Arena, which rated Gemini 2.0 Flash Thinking as the top-performing model across all LLM categories.

Native support for image upload and analysis

In a further improvement over the rival OpenAI o1 series, Gemini 2.0 Flash Thinking is designed to handle images from the start.

o1 was launched as a text-only model but has since been expanded to include image and file upload analysis. Currently, both models can only return text.

Gemini 2.0 Flash Thinking does not currently support integration with Google Search, other Google applications or external third-party tools, according to the developer documentation.

The multimodal capabilities of Gemini 2.0 Flash Thinking expand its potential use cases, allowing it to handle scenarios that combine different types of material.

For example, in one test, the model solved a difficult problem that required the analysis of textual and visual elements, demonstrating its versatility in integrating and reasoning across formats.

Developers can take advantage of these capabilities through Google AI Studio and Vertex AI, where the model is available for experimentation.

As competition in artificial intelligence becomes increasingly fierce, Gemini 2.0 Flash Thinking may mark the beginning of a new era of problem-solving models. Its ability to handle different data types, show its reasoning and perform at scale makes it a strong contender in the reasoning AI market, rivaling OpenAI’s o1 series and other products.


