AI task duration is doubling every 4.3 months
The duration of complex expert-human-level tasks that an AI can complete is increasing exponentially. Source: METR
What is Task Duration?
Task duration measures how long a task would take a skilled human to complete. For example, a 4-hour task duration means the AI can complete, at the stated success rate, tasks that would take an expert human 4 hours.
P50 vs P80 Success Rates
The chart displays two success rate metrics:
- P50 (50% success) shows tasks AI completes half the time. This represents the median capability and is useful for understanding typical performance.
- P80 (80% success) shows tasks AI completes most of the time. This is more conservative but represents more reliable performance.
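One way to see how the two metrics relate is to model success probability as a logistic curve in the logarithm of task duration and read off where it crosses 50% and 80%. The sketch below assumes that curve shape. The function names, the 4-hour P50 horizon (borrowed from the example above), and the slope value are illustrative assumptions, not figures from METR.

```python
import math


def success_probability(task_minutes, h50_minutes, slope):
    """Assumed logistic curve in log2(task duration).

    Returns 0.5 when task_minutes equals h50_minutes and falls
    toward 0 as tasks get longer.
    """
    x = slope * (math.log2(h50_minutes) - math.log2(task_minutes))
    return 1.0 / (1.0 + math.exp(-x))


def horizon_at(p, h50_minutes, slope):
    """Task duration (minutes) at which the assumed curve crosses success rate p."""
    logit = math.log(p / (1.0 - p))
    return h50_minutes * 2 ** (-logit / slope)


# Hypothetical inputs: a 4-hour P50 horizon and an arbitrary slope.
H50_MINUTES = 240
SLOPE = 0.5

p80_minutes = horizon_at(0.8, H50_MINUTES, SLOPE)
print(f"P50 horizon: {H50_MINUTES} min")
print(f"P80 horizon: {p80_minutes:.0f} min")
```

Under this assumption the P80 horizon is always shorter than the P50 horizon, and a shallower slope widens the gap between them.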
Accelerating Rate of Change?
AI progress appears to have been faster since GPT-4o than over the full period since GPT-2. Because the two rates differ significantly, there is an option to toggle between considering only the more recent models, released in a more competitive AI development landscape, and including the earlier period, when development was arguably slower.
How reliable any of these projections are is disputed, but they give some idea of a likely range of possible futures.
Current Leader
The current state-of-the-art model is Claude Opus 4.5, released November 2025. It can complete tasks that would take a skilled human 4h 49m with 50% reliability, or 27m with 80% reliability.
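As a rough illustration of what a fixed doubling time implies from this starting point, the sketch below extrapolates the 50% figure forward. It assumes the headline 4.3-month doubling time holds unchanged, which is exactly the assumption the projections leave open. The function names and the 40-hour target are hypothetical.

```python
import math


def project_horizon(current_minutes, months_ahead, doubling_months=4.3):
    """Horizon after months_ahead months, assuming constant exponential growth."""
    return current_minutes * 2 ** (months_ahead / doubling_months)


def months_until(target_minutes, current_minutes, doubling_months=4.3):
    """Months needed to reach target_minutes under the same assumption."""
    return doubling_months * math.log2(target_minutes / current_minutes)


def format_duration(minutes):
    """Render a duration in minutes as 'Xh YYm'."""
    hours, mins = divmod(round(minutes), 60)
    return f"{hours}h {mins:02d}m"


P50_MINUTES = 4 * 60 + 49  # 4h 49m at 50% reliability, from the text above

print(format_duration(project_horizon(P50_MINUTES, 4.3)))  # one doubling: 9h 38m
print(f"{months_until(40 * 60, P50_MINUTES):.1f} months to a 40-hour task")
```

Passing a different `doubling_months` value shows how sensitive the projection is to which period the trend is fitted on.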
Source
Benchmark data is sourced from METR (Model Evaluation and Threat Research). Data is refreshed periodically from their public benchmark results.