Elon Musk’s artificial intelligence start-up xAI has released a new chatbot that it claims matches the performance of rivals OpenAI, Google and Anthropic, vaulting the 18-month-old company into the top five AI developers.
xAI on Wednesday previewed the model, which independent AI benchmark sites rank among the top five chatbots globally, close behind Google’s Gemini and OpenAI’s ChatGPT.
Grok-2, its latest large language model, will be available to paying subscribers of Musk’s social media platform X. xAI also plans to release the model to developers this month so they can build enterprise applications.
Ethan Mollick, a professor at Wharton business school and AI author, posted on X: “There are now five GPT-4 class models: GPT-4o, Claude 3.5, Gemini 1.5, Llama 3.1 and now Grok-2.”
He added: “All of the labs are saying there is room left for continued giant improvements, but we haven’t seen any models truly leap above GPT-4 . . . yet.”
Musk is racing to catch up with OpenAI, the AI research lab he co-founded in 2015 but left three years later following a disagreement over the direction of its research. The Tesla and SpaceX chief this month launched a new lawsuit against OpenAI and its chief executive Sam Altman, claiming he was “manipulated” into investing in a “fake humanitarian mission”. Microsoft-backed OpenAI has previously rejected Musk’s claims as “incoherent and frivolous”.
Founded in March last year, xAI has been quick to increase the capabilities of its technology, backed by significant investment.
This year, xAI closed a $6bn funding round at a valuation of $18bn, while Musk recently said he was seeking board approval from Tesla, where he is chief executive, to invest $5bn in the company. That would bring the start-up’s total investment close to OpenAI’s $13bn and take it past Anthropic’s almost $9bn.
However, xAI’s use of data pulled from Musk’s X platform has proven controversial. It agreed to partially suspend data processing in Europe earlier this month after Ireland’s data protection watchdog challenged a move to use X posts to train its AI systems without first obtaining users’ explicit consent, a potential breach of the EU’s privacy rules.
xAI said Grok-2 was a “significant step forward” and “more intuitive, steerable, and versatile across a wide range of tasks, whether you’re seeking answers, collaborating on writing, or solving coding tasks”.
Grok-2’s performance is considered better than Meta’s and Anthropic’s best models, according to a ranking on LMSYS, a leading site for comparing or benchmarking AI model capabilities. However, a recent update to OpenAI’s latest model, GPT-4o, placed it back at the top of the leaderboard, above Google’s Gemini Pro.
When evaluating the model’s performance internally, the company said it focused on ensuring the system followed instructions and provided “accurate, factual information”.
Its predecessor was criticised by experts for “hallucinations”, where the AI stated false information as fact. Hallucinations have been seen as a barrier to enterprise adoption of AI systems.