AI task duration is doubling every 3.8 months
The duration of complex expert-human-level tasks that an AI can complete is increasing exponentially. Source: METR
What is Task Duration?
Task duration measures how long a task would take a skilled human to complete. For example, a 4-hour task duration means AI can complete, at the stated success rate, tasks that would take an expert human 4 hours.
P50 vs P80 Success Rates
The chart displays two success rate metrics:
- P50 (50% success) shows tasks AI completes half the time. This represents the median capability and is useful for understanding typical performance.
- P80 (80% success) shows tasks AI completes most of the time. This is more conservative but represents more reliable performance.
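To make the relationship between the two metrics concrete, here is a minimal sketch. It assumes success probability falls off as a logistic curve in the logarithm of human task time, which is one common way to model this kind of data. The function names and the `intercept` and `slope` values are hypothetical and chosen only for illustration; they are not METR's fitted parameters.

```python
import math


def success_probability(intercept: float, slope: float, minutes: float) -> float:
    """Modelled chance of success on a task that takes a human `minutes`.

    Assumed model: P = sigmoid(intercept - slope * log2(minutes)).
    """
    z = intercept - slope * math.log2(minutes)
    return 1.0 / (1.0 + math.exp(-z))


def horizon_minutes(intercept: float, slope: float, p: float) -> float:
    """Task length (in human minutes) at which the modelled success rate equals p."""
    logit_p = math.log(p / (1.0 - p))
    return 2.0 ** ((intercept - logit_p) / slope)


# Hypothetical curve parameters, for illustration only.
intercept, slope = 5.0, 0.6

p50 = horizon_minutes(intercept, slope, 0.5)
p80 = horizon_minutes(intercept, slope, 0.8)
print(f"P50 horizon: {p50:.0f} min, P80 horizon: {p80:.0f} min")
```

Under any curve of this shape with a positive slope, the P80 horizon is shorter than the P50 horizon, because demanding higher reliability restricts the model to shorter tasks.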
Accelerating Rate of Change?
AI progress appears to have been faster since GPT-4o than over the full period since GPT-2. Because the two rates differ significantly, the chart offers a toggle: fit the trend to only the more recent models, released in a more competitive AI development landscape, or include the earlier period, when development was arguably slower.
How reliable any of these projections are is disputed, but they give some idea of the likely range of possible futures.
Current Leader
The current state-of-the-art model is GPT-5.2, released in December 2025. It can complete tasks that would take a skilled human 6h 34m with 50% reliability, or 55m with 80% reliability.
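As a rough illustration of what a fixed doubling time implies, the sketch below extrapolates from the figures quoted on this page (a 3.8-month doubling time, and horizons of 6h 34m at P50 and 55m at P80). It assumes the exponential trend simply continues and applies the same doubling time to both metrics, neither of which is guaranteed. The function names are made up for this example.

```python
import math

DOUBLING_MONTHS = 3.8        # doubling time quoted on this page
P50_NOW_MIN = 6 * 60 + 34    # 6h 34m, in minutes
P80_NOW_MIN = 55             # 55m


def projected_minutes(current_minutes: float, months_ahead: float,
                      doubling_months: float = DOUBLING_MONTHS) -> float:
    """Extrapolate a time horizon, assuming it doubles every `doubling_months`."""
    return current_minutes * 2.0 ** (months_ahead / doubling_months)


def months_until(current_minutes: float, target_minutes: float,
                 doubling_months: float = DOUBLING_MONTHS) -> float:
    """Months until the horizon reaches `target_minutes` under the same assumption."""
    return doubling_months * math.log2(target_minutes / current_minutes)


def fmt(minutes: float) -> str:
    """Format a duration in minutes as 'Hh MMm'."""
    hours, mins = divmod(round(minutes), 60)
    return f"{hours}h {mins:02d}m"


for months in (3.8, 12, 24):
    print(f"+{months} months: "
          f"P50 {fmt(projected_minutes(P50_NOW_MIN, months))}, "
          f"P80 {fmt(projected_minutes(P80_NOW_MIN, months))}")

# Months until the P80 horizon would reach a full 8-hour working day.
print(f"P80 reaches 8h in {months_until(P80_NOW_MIN, 8 * 60):.1f} months")
```

After one doubling period, for instance, the P50 horizon of 6h 34m would become about 13h 08m. Changing `doubling_months` shows how sensitive the projection is to which trend line is used.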
Source
Benchmark data is sourced from METR Time Horizon 1.1. Data is refreshed periodically from their public benchmark results.