Good morning. A 29-page scientific article rarely merits the attention of top executives, but every business leader should be familiar with one recent study from OpenAI. It’s the best description yet of how AI can handle real-world tasks, showing which AI models excel and hinting at what it all means for humans in the years to come. The document can be heavy, but you can get a masterful summary from our AI editor, Jeremy Kahn.
For leaders, three points stand out:
The study is very realistic. It looked at 44 occupations and 1,320 specialized tasks required by those occupations. For example: the final testing stage of manufacturing a cable reel truck for underground mining operations. Appropriate professionals (average experience: 14 years) reviewed the tasks, all of which are elements of the actual deliverables of the job. Previous research has almost always focused on less realistic tests. The AI results were evaluated by expert humans who did not know whether they were reviewing the work of the AI or that of an expert human professional.
The best models are already almost as good as human industry experts. The study looked at seven AI models from OpenAI, Google (Gemini), xAI (Grok) and Anthropic (Claude). The big winner was Claude Opus 4.1, which came within a few percentage points of parity with human industry experts. The best models also completed tasks about 100 times faster and 100 times cheaper than industry experts, although the comparisons ignore “the human monitoring, iteration, and integration steps required in real-world working environments,” OpenAI says.
Models are improving at a galloping pace. For example, as OpenAI’s models improved, the percentage of their task results that were as good as or better than humans’ results more than tripled. If this pace continues – a big if – OpenAI’s models would be better at these real-world tasks than humans in general within a few months. At least some AI competitors may well follow similar trajectories.
The pace of change described in this new research poses perhaps the most difficult challenge facing business leaders. Consider the two-year cycle of Moore’s Law, which changed the world and inspired new corporate giants while dooming others. Looking back, those were the days. John Chambers, who led Cisco through the Internet frenzy and its crash, said recently that 50% of executives “won’t have the skills to adapt to this new AI-driven innovation economy because they’ve been trained to scale at the speed of a five-year cycle instead of a 12-month cycle.” It’s worth remembering his warning to executives: “With the speed at which the market is changing now, you need to be able to reinvent yourself, which is something most CEOs and business leaders don’t know how to do, especially with AI.” - Geoff Colvin
Contact CEO Daily via Diane Brady at [email protected]
Top news
Israeli government approves Gaza deal as troops withdraw
The IDF now has 24 hours to withdraw to an agreed line and Hamas has 72 hours to free all Israeli hostages. So far, events are proceeding as planned and the mood is optimistic on both sides. Live BBC coverage here.
China imposes export controls on rare earth minerals
The new rules curb the supply chain of semiconductors used in phones, computers, AI data centers, cars, solar panels and other computing equipment. China has a virtual monopoly on rare earths.
New York attorney general indicted
Letitia James is charged with bank fraud and making false statements. The prosecution is part of President Trump’s retaliation plan: It was James who obtained a $367 million fine against Trump in a civil suit (the fine was later thrown out).
Making Argentina Great Again
Yes, the United States is bailing out Argentina. Treasury Secretary Scott Bessent confirmed that the Treasury bought pesos to support the government of President Javier Milei, a Trump ally. The United States is also granting Argentina a $20 billion swap line. (A swap line allows central banks to exchange fixed amounts of currencies, with the understanding that the swap will be reversed later and interest will be paid on the currency repaid.)
Moody’s Chief Economist: About Half of U.S. States Are in Economic Contraction
Mark Zandi, chief economist at Moody’s Analytics, exclusively told Fortune that almost half of U.S. states are seeing their economies contract – and only 16 are growing. Zandi also noted that low-income households are “hanging on financially…and their world is going into recession pretty quickly.”
KPMG survey identifies quarter where sentiment on AI changed
A new KPMG survey of 130 executives at companies with more than $1 billion in annual revenue finds that adoption of agentic AI technology has quadrupled in the last six months. A manager of the company’s principles and AQI program told Fortune that the most recent quarter was one where the “fear factor” surrounding the technology faded, giving way to what she describes as “cognitive fatigue.”
Google limits teleworking to only 4 days per year
Google’s previous policy allowed staff to work from anywhere for up to four weeks per year. The new rule says a single WFH day will now count as a full week.
Federal employees will get their pay back
U.S. House Speaker Mike Johnson says furloughed federal workers will receive the pay they are owed once the shutdown ends.
The markets
S&P 500 futures were up 0.14% this morning. The index closed down 0.28% in its last session. The STOXX Europe 600 was flat in early trading. The UK’s FTSE 100 was down 0.14% in early trading. Japan’s Nikkei 225 was down 1.01%. China’s CSI 300 was down 1.97%. South Korea’s KOSPI was up 1.73%. India’s Nifty 50 was up 0.51% before the end of the session. Bitcoin held at $121.4K.
Around the watercooler
$1.8 trillion deficit exposed in ‘unnecessary and unnecessary government shutdown’, budget watchdog says by Nick Lichtenberg
Battle over Elon Musk’s billionaire pay intensifies as pension funds take on Tesla by Amanda Gerut
You’re 10 times more likely to have a delayed flight during government shutdown, transportation secretary says: ‘These controllers are stressed’ by Sydney Lake
California’s “impossible” dream of ending fossil fuels isn’t working, and now California is facing price spikes and shortages by Jordan Blum
From WhatsApp Friends to a Valuation of Over $500 Million: These Founders Say Their Tiny AI Models Are Better for Customers and the Planet by Vivienne Walt
CEO Daily is compiled and edited by Joey Abrams and Jim Edwards.