Apple’s AI study can’t say whether AI will take your job


In 2023, a popular perspective on AI went like this: sure, it can generate a lot of impressive text, but it can’t truly reason – it’s all shallow mimicry, just “stochastic parrots.”

At the time, it was easy to see where this perspective came from. Artificial intelligence had moments of being impressive and interesting, but it also consistently failed at basic tasks. Tech CEOs said they could just keep making the models bigger and better, but tech CEOs say things like that all the time, including when, behind the scenes, everything is held together with glue, duct tape, and low-paid workers.

It’s now 2025. I still hear this dismissive perspective a lot, particularly when I’m talking to academics in linguistics and philosophy. Many of the highest-profile efforts to pop the AI bubble – like the recent Apple paper purporting to find that AIs can’t truly reason – rest on the claim that the models are just bullshit generators that aren’t getting much better and won’t get much better.

But I increasingly think that repeating those claims is doing a disservice to our readers, and that academia is failing to grapple with the most important implications of AI.

I know that’s a bold claim. So let me back it up.

The “Illusion of Thinking” paper’s illusion of relevance

When the Apple paper was posted online (it has not yet been peer-reviewed), it took off. Videos explaining it racked up millions of views. People who don’t usually read much about AI heard about the Apple paper. And while the paper itself acknowledged that AI performance on “moderate difficulty” tasks was improving, many summaries of its takeaways focused on the headline claim of a “fundamental scaling limitation in the thinking capabilities of current reasoning models.”

For a big chunk of the public, the paper confirmed something they wanted to believe: that generative AI doesn’t really work – and that this is something that won’t change anytime soon.

The paper looks at the performance of modern, top-tier language models on “reasoning tasks” – basically, complicated puzzles. Past a certain point, that performance becomes terrible, which the authors say demonstrates that the models haven’t developed true planning and problem-solving skills. “These models fail to develop generalizable problem-solving capabilities for planning tasks, with performance collapsing to zero beyond a certain complexity threshold,” as the authors write.

That was the high-level conclusion many people took away from the paper and the wider discussion around it. But if you dig into the details, you’ll see that this finding isn’t surprising, and it doesn’t actually tell us much about AI.

For the most part, the models fail the problems given in the paper not because they can’t solve them, but because they can’t express their answers in the specific format the authors chose to require.

If you ask them to write a program that outputs the correct answer, they do so effortlessly. But if you ask them to spell the answer out in text, step by step, they eventually hit their limits.
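To make that distinction concrete, here is a minimal sketch using Tower of Hanoi, one of the puzzles discussed later in this piece – the function and variable names are illustrative, not taken from the Apple paper. The point is that the program a model can write easily is a dozen lines, while spelling out every move in text grows exponentially with the number of disks.

```python
# Minimal sketch (illustrative names, not from the Apple paper): the kind of
# short program a model can produce effortlessly for Tower of Hanoi.
def hanoi_moves(n, source, target, spare):
    """Return the list of moves that transfers n disks from source to target."""
    if n == 0:
        return []
    return (hanoi_moves(n - 1, source, spare, target)
            + [(source, target)]
            + hanoi_moves(n - 1, spare, target, source))

moves = hanoi_moves(10, "A", "C", "B")
print(len(moves))  # 1023 moves for 10 disks -- correct, but tedious to write out one by one
```

Writing out each of those 1,023 moves in prose, one line at a time and without slipping up anywhere, is much closer to the version of the task the models were graded on.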

That seems like an interesting limitation of current AI models, but it doesn’t have much to do with “generalizable problem-solving capabilities” or “planning tasks.”

Imagine someone arguing that humans can’t “really” do “generalizable” multiplication because while we can calculate two-digit multiplication problems without trouble, most of us will mess up somewhere along the way if we try to do 10-digit multiplication problems in our heads. The issue isn’t that we’re “not general reasoners.” It’s that we didn’t evolve to juggle large numbers in our heads – largely because we never needed to.

If the reason we care about “whether AIs can reason” is fundamentally philosophical, then exploring at what point problems become too long for them to solve is relevant, as a philosophical argument. But I think most people care about what AI can and cannot do for far more practical reasons.

AI is coming for your job, whether it can “truly reason” or not

I fully expect my job to be automated in the next few years. I don’t want that to happen, obviously. But I can see the writing on the wall. I regularly ask the AIs to write this newsletter – just to see where the competition is at. It’s not there yet, but it’s getting better all the time.

Employers are doing the same thing. Entry-level hiring in professions like law, where entry-level tasks are automatable, appears to be already contracting. The job market for recent college graduates looks ugly.

The optimistic case for what’s coming goes something like this: “Sure, AI will eliminate a lot of jobs, but it will create even more new jobs.” That more positive transition may well happen – though I wouldn’t want to count on it – but it would still mean a lot of people suddenly finding all their skills and training abruptly useless, and therefore needing to rapidly develop a completely new skill set.

It’s that possibility, I think, that looms large for many people in industries like mine, which are already seeing AI replacements. And it’s precisely because that prospect is so frightening that declarations that AIs are just “stochastic parrots” that can’t really think are so appealing. We want to hear that our jobs are safe and that AI is a nothingburger.

But in fact, you can’t answer the question of whether AI will take your job by reference to a thought experiment, or by reference to how it performs when asked to write out all the steps of Tower of Hanoi puzzles. The way to answer the question of whether AI will take your job is to invite it to try. And, uh, here’s what I got when I asked ChatGPT to write this section of this newsletter:

Is it “really reasoning”? Maybe not. But it doesn’t need to be in order to render me potentially inessential.

“Whether or not they are simulating thinking has no bearing on whether or not the machines are capable of rearranging the world for better or worse,” Cambridge AI philosophy and governance professor Harry Law argued in a recent piece, and I think he’s unambiguously right. If Vox hands me a pink slip, I don’t think I’ll get anywhere by arguing that I shouldn’t be replaced because o3, above, can’t solve a sufficiently complicated Tower of Hanoi puzzle – which, guess what, I can’t do either.

The critics are falling short when we need them most

In his piece, Law surveys the state of AI criticism and finds it pretty bleak. “A lot of recent critical writing about AI…read as extremely wishful thinking about what exactly systems can and cannot do.”

That’s my experience, too. Critics are often stuck in 2023, giving accounts of what AI can and cannot do that haven’t been correct for two years. “Many [academics] dislike AI, so they don’t follow it closely,” Law argues. “They don’t follow it closely, so they still think that the criticisms of 2023 hold water. They don’t. And that’s regrettable, because academics have important contributions to make.”

But of course, for the near-term effects of AI use – and, in the longer term, for the global catastrophic risks it may pose – what matters isn’t whether AIs can be induced to make silly mistakes, but what they can do when set up for success.

I have my own list of “easy” problems AIs still can’t solve – they’re pretty bad at chess puzzles – but I don’t think this kind of work should be sold to the public as a glimpse of the “real truth” about AI. And it certainly doesn’t dispel the genuinely scary future that experts increasingly warn us we’re headed toward.

A version of this story originally appeared in the Future Perfect newsletter. Sign up here!
