
Google launches Gemini 3 with state-of-the-art reasoning, ‘generative UI’ for responses, more

Google today announced Gemini 3 with the goal of bringing “any idea to life.” The first model available in this family is Gemini 3 Pro with the rollout starting today for the Gemini app and AI Mode.

With Gemini 1.0, Google focused on native multimodality and the long context window. A year later Gemini 2.0 brought advanced reasoning and the beginning of agentic capabilities, while Gemini 2.5 introduced deep reasoning and coding capabilities.

Gemini 3, which drops the “.0”, is what Google calls its “most intelligent model,” positioned as helping you “bring any idea to life.”

It starts by getting better at figuring out the context and intent of your request, so that “you get what you need with less prompting.” Gemini 3 is state-of-the-art in reasoning with the ability to “grasp depth and nuance,” like “perceiving the subtle clues in a creative idea, or peeling apart the overlapping layers of a difficult problem.”


Gemini 3 Pro responses aim to be “smart, concise, and direct, trading cliche and flattery for genuine insight.” 

It acts as a true thought partner that gives you new ways to understand information and express yourself, from translating dense scientific concepts by generating code for high-fidelity visualizations to creative brainstorming.

Benchmarks

Gemini 3 Pro scores 1501 on LMArena, surpassing 2.5 Pro (1451), which had held the top position until now. It outperforms the model it’s replacing in all major benchmarks by a significant margin:


  • …demonstrates PhD-level reasoning with top scores on Humanity’s Last Exam (37.5% without the usage of any tools) and GPQA Diamond (91.9%). 
  • …sets a new standard for frontier models in mathematics, achieving a new state-of-the-art of 23.4% on MathArena Apex.
  • Beyond text, Gemini 3 Pro redefines multimodal reasoning with breakthrough scores of 81% on MMMU-Pro and 87.6% on Video-MMMU.
  • …scores a state-of-the-art 72.1% on SimpleQA Verified, showing great progress on factual accuracy. 
  • …tops the WebDev Arena leaderboard by scoring an impressive 1487 Elo.
  • …scores 54.2% on Terminal-Bench 2.0, which tests a model’s tool use ability to operate a computer via terminal.
  • …greatly outperforms 2.5 Pro on SWE-bench Verified (76.2%), a benchmark that measures coding agents.

This means Gemini 3 Pro is highly capable at solving complex problems across a vast array of topics like science and mathematics with a high degree of reliability.
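To put the 50-point LMArena gap in perspective, here is a minimal sketch that assumes the leaderboard score can be read with the standard Elo expected-score formula on a 400-point scale. That reading is an assumption made for illustration, and the function name is hypothetical.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected share of head-to-head wins for A over B under the standard Elo formula.

    Assumption: the ratings behave like classic Elo ratings on a 400-point scale.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


# Scores quoted in the article: Gemini 3 Pro (1501) vs. 2.5 Pro (1451).
share = elo_expected_score(1501, 1451)
print(f"Expected preference rate: {share:.1%}")
```

Under that assumption, a 50-point lead corresponds to being preferred in roughly 57% of head-to-head comparisons, a clear edge rather than a sweep.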

Google today also announced the Gemini 3 Deep Think mode with even better reasoning and multimodal understanding. It outperforms Gemini 3 Pro on Humanity’s Last Exam (41.0% without the use of tools) and GPQA Diamond (93.8%). This will be available in the coming weeks for AI Ultra subscribers.

It also achieves an unprecedented 45.1% on ARC-AGI (with code execution), demonstrating its ability to solve novel challenges.

Generative UI

Gemini 3 makes possible generative UI (or generative interfaces), wherein LLMs generate both content and entire user experiences. This includes web pages, games, tools, and applications that are “automatically designed and fully customized in response to any question, instruction, or prompt.”

This work represents a first step toward fully AI-generated user experiences, where users automatically get dynamic interfaces tailored to their needs, rather than having to select from an existing catalog of applications.

Behind the scenes, Gemini 3 Pro leverages tool access like web search and image generation, as well as “carefully crafted system instructions.”

The system is guided by detailed instructions that include the goal, planning, examples and technical specifications, including formatting, tool manuals, and tips for avoiding common errors.

Finally, the output is sent through post-processors that address “potential common issues.”
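The flow described above (structured system instructions, a model call, then post-processing) can be sketched roughly as follows. This is an illustration only: Google has not published its implementation, every name here is hypothetical, the model is passed in as a plain callable, and tool access is left out for brevity.

```python
from typing import Callable, Iterable

# Section names taken from the article's description of the instructions.
SECTIONS = ("goal", "planning", "examples", "technical specifications")


def build_system_instructions(parts: dict[str, str]) -> str:
    """Join the provided sections in a fixed order (hypothetical layout)."""
    return "\n\n".join(
        f"## {name}\n{parts[name]}" for name in SECTIONS if name in parts
    )


def ensure_closing_html(markup: str) -> str:
    """Toy post-processor: append a missing </html> tag."""
    trimmed = markup.rstrip()
    return trimmed if trimmed.endswith("</html>") else trimmed + "\n</html>"


def generate_ui(
    prompt: str,
    model: Callable[[str, str], str],
    instructions: str,
    post_processors: Iterable[Callable[[str], str]],
) -> str:
    """Call the model once, then run each post-processor over its output in order."""
    output = model(instructions, prompt)
    for fix in post_processors:
        output = fix(output)
    return output
```

In practice `model` would wrap a real LLM call; a stub such as `lambda instructions, prompt: "<html><body>...</body>"` is enough to exercise the pipeline.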

This is launching today in the Gemini app as experiments. Dynamic view sees Gemini 3 design and code a “fully customized interactive response for each prompt.” 

It customizes the experience with an understanding that explaining the microbiome to a 5-year-old requires different content and a different set of features than explaining it to an adult, just as creating a gallery of social media posts for a business requires a completely different interface from generating a plan for an upcoming trip.

Visual layout is the second experiment and creates an “immersive, magazine-style view complete with photos and modules.” The main difference from dynamic view is that Gemini will generate sliders, checkboxes, and other filters that let you customize the results further.

You might initially only see one of these experiments at a time to allow Google to gather feedback. 

For more on what Gemini 3 brings to the Gemini app (including Gemini Agent), read our story here

Meanwhile, this is the first time that a new model is coming to Google Search and AI Mode alongside the Gemini app. Starting this week, AI Pro and AI Ultra subscribers can go to the dropdown menu in the top-left corner and select “Thinking: 3 Pro reasoning and generative layouts.”

With Gemini 3, Google’s query fan-out technique can perform more searches than before, asking more nuanced questions to improve the final response you get.

AI Mode will also create generative UIs to produce interactive tools and simulations. For example, Google might build a mortgage calculator that lets you change the interest rate and down payment. Another example is a physics simulation that appears when you’re learning about a related topic.
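The arithmetic behind a mortgage calculator like the one described can be sketched with the standard fixed-rate amortization formula. This is a minimal illustration that assumes a fixed rate with monthly payments and covers principal and interest only. The function and parameter names are hypothetical and are not taken from anything Google has shown.

```python
def monthly_payment(home_price: float, down_payment: float,
                    annual_rate_pct: float, years: int = 30) -> float:
    """Monthly principal-and-interest payment for a fixed-rate loan."""
    principal = home_price - down_payment
    n_payments = years * 12
    monthly_rate = annual_rate_pct / 100 / 12
    if monthly_rate == 0:
        # With no interest the loan is simply split evenly across the payments.
        return principal / n_payments
    return principal * monthly_rate / (1 - (1 + monthly_rate) ** -n_payments)


# The two inputs the article mentions are the ones a user would adjust.
for rate in (5.5, 6.0, 6.5):
    print(rate, round(monthly_payment(400_000, 80_000, rate), 2))
```

A generated interface would wire sliders for the rate and down payment to a function like this and redraw the result on each change.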

Gemini 3 will next come to all (free) AI Mode users in the US, with subscribers getting higher limits.

Looking ahead, Google in the coming weeks will update Search’s automatic model selection for subscribers to send challenging questions to Gemini 3 “while continuing to use faster models for simple tasks.” 
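Google has not said how Search decides which questions count as challenging. The sketch below shows only the general shape of such routing, using an invented heuristic (query length plus a few cue phrases). The labels `GEMINI_3` and `FASTER_MODEL` are placeholders, not real model identifiers.

```python
# Hypothetical cue phrases that suggest a query needs multi-step reasoning.
REASONING_CUES = ("compare", "explain why", "step by step", "prove", "plan")


def pick_model(query: str, word_threshold: int = 25) -> str:
    """Return a placeholder model label for a query (illustrative heuristic only)."""
    text = query.lower()
    is_long = len(text.split()) > word_threshold
    has_cue = any(cue in text for cue in REASONING_CUES)
    return "GEMINI_3" if is_long or has_cue else "FASTER_MODEL"


print(pick_model("weather today"))
print(pick_model("Compare fixed and variable mortgages step by step"))
```

A production router would more plausibly rely on a learned classifier than on keyword matching; the point here is only the split between a capable model and a faster one.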

Google Antigravity

Alongside Gemini 3, Google announced Google Antigravity, a new agentic development platform that allows developers to “operate at a higher, task-oriented level.” In this IDE, agents work across the editor, terminal, and browser. Available now on Mac, Windows, and Linux, it uses Gemini 3, Gemini 2.5 Computer Use, and Nano Banana.

Now, agents can autonomously plan and execute complex, end-to-end software tasks simultaneously on your behalf while validating their own code.


Author

Abner Li

Editor-in-chief. Interested in the minutiae of Google and Alphabet. Tips/talk: abner@9to5g.com