Can you predict what AI will do?

How good are you at predicting how well Claude, ChatGPT and Gemini will perform on simple tasks?

When AI tools do well – or badly – at a task, it's tempting to explain it away with the benefit of hindsight. But can you anticipate how well they'll do before you see the outcome?

We set AI tools some simple tasks and gave them 10 attempts at each one. We used the default version of each model. For each question below, guess how well the model did – then see how you compare with the 2,252 others who've done the quiz.

You don't have to work out the correct answer yourself – just say how many times you think AI got the task right.
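
To make "how many times AI got the task right" concrete, here is a minimal Python sketch of marking 10 attempts against a known answer. The task, the responses and the marking rule are all made up for illustration; the quiz doesn't say how each attempt was actually marked.

```python
def count_correct(responses, is_correct):
    """Count how many responses pass the marking rule."""
    return sum(1 for response in responses if is_correct(response))


# Hypothetical task and responses, invented for illustration only.
expected = "4"
responses = ["4", "4", "4 ", "5", "4", "four", "4", "4", "4", "4"]


def is_correct(response):
    # Assumed marking rule: exact match after trimming whitespace.
    return response.strip() == expected


score = count_correct(responses, is_correct)
print(f"{score} of {len(responses)} attempts correct")  # 8 of 10 attempts correct
```

A stricter or looser rule (accepting "four", say) would change the score, which is one reason the marking rule matters as much as the model.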

Question 1

This prompt was given to ChatGPT:

What is the longest word in this list: python, turrets

Of the 10 attempts, how many did AI get right?
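
For reference, the task itself comes down to comparing letter counts. A short Python sketch, assuming "longest" means the most characters:

```python
words = ["python", "turrets"]

# Pick the word with the most characters.
longest = max(words, key=len)

for word in words:
    print(word, len(word))  # python 6, then turrets 7

print("Longest:", longest)  # Longest: turrets
```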

Question 2

This prompt was given to Claude Opus 4.6:

What is 9,321 × 1,789?

Of the 10 attempts, how many did AI get right?
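
For reference, the product can be checked exactly with integer arithmetic. The sketch below also confirms it by splitting 1,789 into 1,800 − 11; that split is just one convenient way to do it by hand, not anything the quiz prescribes.

```python
a, b = 9321, 1789

# Python integers are exact, so there is no rounding to worry about.
product = a * b

# Cross-check by hand-friendly decomposition: 1,789 = 1,800 - 11.
by_parts = a * 1800 - a * 11  # 16,777,800 - 102,531

assert product == by_parts
print(f"{a:,} × {b:,} = {product:,}")  # 9,321 × 1,789 = 16,675,269
```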

Question 3

This prompt was given to Gemini 3.1 Pro:

Extract the text from this image

Of the 10 attempts, how many did AI get right?

Question 4

This prompt was given to Claude Opus 4.6:

How many happy responses are there in this dataset?

Data provided

id	statement
1	I feel like every hope has been snuffed out
2	I am wearing the biggest smile
3	I'm so appreciative and full of cheer
4	I feel a bubbly sense of joy

Full dataset: 800 entries total, 400 happy

Of the 10 attempts, how many did AI get right?

Question 5

This prompt was given to Gemini 3.1 Pro (with search and URL tools):

What was Adam Kucharski's first published scientific paper? This is his Google scholar page: https://scholar.google.co.uk/citations?user=eIqfmHYAAAAJ&hl=en

Of the 10 attempts, how many did AI get right?

Question 6

This prompt was given to ChatGPT:

Create an image of an accurate hopscotch game

Of the 10 attempts, how many did AI get right?

Question 7

This prompt was given to Gemini 3.1 Pro:

Sum up each row and column in this dataset and output as a CSV

Data provided

12		1
1	3	3
13	0	1
1	1	2
2	3	1

Full dataset: 12 entries total

Of the 10 attempts, how many did AI get right?
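
For reference, here is a minimal Python sketch of the task, run on the five rows shown above only. It assumes the blank cell in the first row counts as 0, and the output layout (`kind`, `index`, `sum`) is a made-up choice; the quiz doesn't say how blanks were meant to be handled or what CSV shape counted as correct.

```python
import csv
import io

# The five rows shown above, tab-separated. The first row has a blank cell.
preview = "12\t\t1\n1\t3\t3\n13\t0\t1\n1\t1\t2\n2\t3\t1\n"


def sum_rows_and_columns(tsv_text):
    """Return (row_sums, col_sums), treating blank cells as 0 (an assumption)."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    rows = [[int(cell) if cell.strip() else 0 for cell in line] for line in reader]
    row_sums = [sum(row) for row in rows]
    col_sums = [sum(col) for col in zip(*rows)]
    return row_sums, col_sums


def sums_as_csv(tsv_text):
    """Write the sums as CSV text with one line per row or column."""
    row_sums, col_sums = sum_rows_and_columns(tsv_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["kind", "index", "sum"])
    for i, total in enumerate(row_sums, start=1):
        writer.writerow(["row", i, total])
    for i, total in enumerate(col_sums, start=1):
        writer.writerow(["column", i, total])
    return out.getvalue()


row_sums, col_sums = sum_rows_and_columns(preview)
print(sums_as_csv(preview))
```

With the blank read as 0, the row sums are 13, 7, 14, 4 and 6, and the column sums are 29, 7 and 8.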

Question 8

This prompt was given to ChatGPT:

How many items of furniture are mentioned in this list?

Data provided

id	response
1	gazelle, gecko, heron, newt, wolverine
2	owl, gazelle, bison, crane, frog
3	wolverine, wombat, chimpanzee, penguin, lion
4	horse, wolf, wombat, tiger, gorilla

Full dataset: 500 entries total, including 10 furniture items

Of the 10 attempts, how many did AI get right?