AI vs Human Jobs: Shocking Study Reveals How Close AI Really Is to Taking Your Work
The Washington Post · 1 month ago


INDUSTRY INSIGHTS · Tags: ai, jobsecurity, futureofwork, technology, automation

Summary:

  • AI systems successfully completed only 2.5% of real work assignments in a comprehensive study comparing human and AI performance

  • The best-performing AI failed on visual tasks like creating accurate floor plans and 3D product models, often producing completely wrong results

  • Major limitations include no long-term memory and poor visual understanding, preventing AI from learning from mistakes or handling spatial reasoning

  • Despite predictions of widespread job replacement, current AI models are not close to automating real jobs in the economy

  • Newer AI models show improvement but still struggle with complex tasks, with Google's Gemini 3 Pro completing just 1.3% of assignments

The Reality of AI in the Workplace

Imagine you're redesigning your living space. You could hire an interior designer for thousands of dollars, or you could ask an AI tool like ChatGPT to do it instead. But can AI actually do the work?

A groundbreaking study compared how top AI systems and human workers performed on hundreds of real work assignments, including producing a digital version of a hand-drawn floor plan.

The results were eye-opening:

  • The human produced a professional-looking floor plan
  • The best-performing AI system made a plausible-looking floor plan, but with far less detail; on closer inspection, its layout was completely wrong

This failed floor plan illustrates a crucial disconnect three years after ChatGPT's release that has implications for the entire economy.

The Study That Changed Everything

Researchers collected hundreds of real projects from freelancing platforms where humans had been paid to complete tasks like:

  • Making 3D product animations
  • Transcribing music
  • Coding web video games
  • Formatting research papers for publication

They then gave each task to AI systems including OpenAI's ChatGPT, Google's Gemini, and Anthropic's Claude.

The shocking finding: The best-performing AI system successfully completed only 2.5% of the projects.

"Current models are not close to being able to automate real jobs in the economy," said Jason Hausenloy, one of the researchers on the Remote Labor Index study.

Where AI Falls Short

Another assignment involved creating an interactive dashboard visualizing data from the World Happiness Report. At first glance, the AI results looked adequate, but closer examination revealed:

  • Countries inexplicably missing data
  • Overlapping text
  • Legends using wrong colors or no colors at all

The AI systems failed on nearly half of the projects by producing poor-quality work, and they left more than a third incomplete. Nearly 1 in 5 had basic technical problems like producing corrupt files.

"A lot of the failures were kind of prosaic," Hausenloy said, pointing to two major limitations:

  1. No long-term memory - AI can't learn from previous mistakes or remember feedback over time
  2. Struggles with visual understanding - Problems with graphic design or spatial reasoning

The Visual Challenge

This failure became apparent in a project asking for promotional material for tech earbuds. The task involved taking images and creating a 3D model with short video clips demonstrating the design.

No AI system produced acceptable work:

  • OpenAI's GPT-5 and Anthropic's Sonnet created poor 3D models
  • Manus didn't create a 3D model at all
  • In some results, the earbuds changed appearance across clips

Graham Neubig, a Carnegie Mellon University professor who has researched AI systems, explained one reason for these failures: "They don't use the same tools a human expert would use."

A human creating a product rendering would use 3D modeling software with a visual interface, while a chatbot asked to make a 3D model will usually try to generate images by writing code.
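Neubig's point can be made concrete. A hypothetical sketch (not from the study) of the code-first approach a chatbot might take, generating a crude cylinder mesh as a Wavefront OBJ file using only Python's standard library:

```python
import math

def cylinder_obj(radius=1.0, height=2.0, segments=16):
    """Return a crude open cylinder mesh as Wavefront OBJ text."""
    verts, faces = [], []
    for i in range(segments):
        a = 2 * math.pi * i / segments
        x, z = radius * math.cos(a), radius * math.sin(a)
        verts.append((x, 0.0, z))    # bottom ring vertex
        verts.append((x, height, z)) # top ring vertex
    for i in range(segments):
        j = (i + 1) % segments
        # OBJ vertex indices are 1-based; one quad per side segment
        faces.append((2 * i + 1, 2 * j + 1, 2 * j + 2, 2 * i + 2))
    lines = [f"v {x:.4f} {y:.4f} {z:.4f}" for x, y, z in verts]
    lines += [f"f {a} {b} {c} {d}" for a, b, c, d in faces]
    return "\n".join(lines)

obj_text = cylinder_obj()
print(obj_text.splitlines()[0])  # first vertex line of the mesh
```

The gap is obvious: this produces valid geometry, but nothing like the polished, textured product render a designer would build interactively in dedicated 3D software.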

Where AI Shows Promise

The AI systems performed better on a task involving producing a web-based video game. The best version made without human work was actually playable - an impressive feat. However, the AI system ignored the instruction that the game have a brewing theme.

The Economic Implications

If AI systems could perform remote work assignments autonomously, businesses using human contractors could instead send that work to a chatbot. This would mean huge cost savings for companies and no work for those contractors.

The study suggests this scenario is far from reality, at least for now.

The Future Trajectory

Though all AI systems failed most projects, newer models showed improvement. The team recently tested Google's Gemini 3 Pro, released in November. It completed 1.3% of tasks, compared with 0.8% for the company's previous model.

"The trend lines are there," Hausenloy noted.

AI can still disrupt the labor market without fully replacing individual workers. Companies may need fewer employees if each one can do more with a chatbot's help. But if the trend toward greater autonomy continues, the economics of work could become challenging for many people.

Consider this: A human made the video game for $1,485. The researchers had Sonnet make it for less than $30.
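Even as a back-of-envelope calculation (using the article's figures, where $30 is an upper bound on the AI's cost), the gap is stark:

```python
human_cost = 1485  # what the freelancer was paid (USD)
ai_cost = 30       # upper bound on the Sonnet run's cost (USD)

ratio = human_cost / ai_cost  # AI did it for under 1/49th of the price
print(f"Cost ratio: at least {ratio:.1f}x cheaper")
```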

Whether AI systems need minor tweaks or fundamental breakthroughs to successfully do real work remains "the key question in the AI field at the moment," according to Hausenloy.

