AI should not be a black box

By News Room. Last updated May 30, 2024

Proponents and detractors of AI tend to agree that the technology will change the world. The likes of OpenAI’s Sam Altman see a future where humanity will flourish; critics prophesy societal disruption and excessive corporate power. Which prediction comes true depends in part on foundations laid today. Yet the recent disputes at OpenAI — including the departure of its co-founder and chief scientist — suggest key AI players have become too opaque for society to set the right course.

An index developed at Stanford University finds transparency at AI leaders Google, Amazon, Meta and OpenAI falls short of what is needed. Though AI emerged through collaboration by researchers and experts across platforms, the companies have clammed up since OpenAI’s ChatGPT ushered in a commercial AI boom. Given the potential dangers of AI, these companies need to revert to their more open past.

Transparency in AI falls into two main areas: the inputs and the models. Large language models, the foundation for generative AI such as OpenAI’s ChatGPT or Google’s Gemini, are trained by trawling the internet to analyse and learn from “data sets” that range from Reddit forums to Picasso paintings. In AI’s early days, researchers often disclosed their training data in scientific journals, allowing others to diagnose flaws by weighing the quality of inputs.

Today, key players tend to withhold the details of their data to protect against copyright infringement suits and eke out a competitive advantage. This makes it difficult to assess the veracity of responses generated by AI. It also leaves writers, actors and other creatives without insight into whether their privacy or intellectual property has been knowingly violated.

The models themselves lack transparency too. How a model interprets its inputs and generates language depends upon its design. AI firms tend to see the architecture of their model as their “secret sauce”: the ingenuity of OpenAI’s GPT-4 or Meta’s Llama pivots on the quality of its computation. AI researchers once released papers on their designs, but the rush for market share has ended such disclosures. Yet without an understanding of how a model functions, it is difficult to rate an AI’s outputs, limits and biases.

All this opacity makes it hard for the public and regulators to assess AI safety and guard against potential harms. That is all the more concerning as Jan Leike, who helped lead OpenAI’s efforts to steer super-powerful AI tools, claimed after leaving the company this month that its leaders had prioritised “shiny products” over safety. The company has insisted it can regulate its own product, but its new security committee will report to the very same leaders.

Governments have started to lay the foundation for AI regulation through a conference last year at Bletchley Park, President Joe Biden’s executive order on AI and the EU’s AI Act. Though welcome, these measures focus on guardrails and “safety tests”, rather than full transparency. The reality is that most AI experts are working for the companies themselves, and the technologies are developing too quickly for periodic safety tests to be sufficient. Regulators should call for model and input transparency, and experts at these companies need to collaborate with regulators.

AI has the potential to transform the world for the better — perhaps with even more potency and speed than the internet revolution. Companies may argue that transparency requirements will slow innovation and dull their competitive edge, but the recent history of AI suggests otherwise. These technologies have advanced on the back of collaboration and shared research. Reverting to those norms would only serve to increase public trust, and allow for more rapid, but safer, innovation.
