Google’s ‘Gemini’ makes mobile breakthrough for generative AI

By News Room On Dec 6, 2023

Unlock the Editor’s Digest for free

Google has launched a new set of generative artificial intelligence models that will run directly on mobile phones for the first time, a breakthrough in the tech company’s efforts to take on rivals such as ChatGPT-maker OpenAI.

The company described “Gemini” as its “largest, most capable and most general” AI system, which can analyse information from images and audio and has sophisticated reasoning and “planning” capabilities. It will power Google’s Bard chatbot from Wednesday and will be launched more broadly into its search engine from next year.

A version of Gemini, known as “nano”, was designed specifically for running on mobile devices and would be integrated into Google’s latest Pixel phones. Google told the Financial Times this would “run natively” on the device, and that the “nano” model was “optimised for mobile — so Android developers can easily build AI apps and features that work offline, or use personal [information] better kept private on-device”.

This advance could help answer an economic problem with the technology. Running generative AI with the computing power available on mobile handsets, rather than through the cloud on servers operated by big tech groups, would vastly reduce the costs of operating such systems. This also provides a layer of assurance for those wanting to keep private data restricted to a device.

“I believe the transition we are seeing right now with AI will be the most profound in our lifetimes, far bigger than the shift to mobile or to the web before it,” said Google and Alphabet chief executive Sundar Pichai in a blog post. “This new era of models represents one of the biggest science and engineering efforts we’ve undertaken as a company.”

Generative AI has opened up a new front in the battle for big tech dominance across Silicon Valley. Google’s latest generative AI system follows a number of models released by companies, including Microsoft-backed OpenAI, Meta, and start-ups like Anthropic and Mistral, which are all capable of producing plausible answers to questions in natural language: in text, code, image, and audio.

Last month, enterprise giant Microsoft rolled out a generative AI assistant, dubbed Copilot, in its widely used Microsoft 365 suite of productivity apps, which includes Word, PowerPoint and Excel.

Google said Gemini scored over 90 per cent on an “industry standard” benchmark that assesses so-called large language models, the technology underlying generative AI products.

The company added that Gemini was the first AI model to outperform human experts on certain tasks, surpassing OpenAI’s GPT3.5 model in multiple tests. In particular, it can solve mathematical reasoning problems, analyse scientific data and do advanced coding. Google did not give a comparison with OpenAI’s latest GPT4 model.

Gemini will also be integrated into Bard, Google’s AI-powered chatbot, from Wednesday in the English language, available in more than 170 territories, including in the US, Asia and Africa, with plans to update it with more powerful software next year.

However, it will not yet be available in Europe or the UK, which Google suggested was down to regulatory hurdles.

“We are definitely working on that and clearly working with local regulators . . . to make sure that we are engaged with those folks before we launch in any particular area,” said Sissie Hsiao, vice-president at Google and general manager of Bard.

Hsiao said the integration would improve Bard’s abilities in “understanding and summarising content, reasoning, brainstorming, writing, planning”.

Examples of uses of Gemini demonstrated by Google included scanning a handwritten worksheet of mathematical formulas, marking errors, and explaining them.

Another demonstration by YouTuber Mark Rober used Bard integrated with Gemini to direct a video where he tested how to craft the most aerodynamic paper plane. The AI suggested experiments and improvements to the designs and ways to test its accuracy, including shooting it through a ring of fire.

Versions of Gemini will be made available to some developers and enterprise customers throughout December, with access rolled out more broadly next year.

Read the full article here