Programmers get better at competitive programming mainly by solving competitive programming problems. It’s true that there are other activities that go along with problem-solving practice. Reading about algorithms, data structures, and problem-solving techniques is useful to avoid re-inventing the wheel. And it’s even more important to read solutions after you solve a problem, so you can learn from other practitioners and fix your mistakes. But the core of the learning process is solving problems. Without that core, reading textbooks and solutions won’t get you very far.
Problem-Solving Practice
Why is problem-solving practice so important for competitive programming success? Because solving competitive programming problems, like solving math and science problems (the ones given to students, that is), requires developing conceptual chunks. For competitive programming, conceptual chunks can be as small as a line of code (i.e., remembering correct syntax), or as large as an entire program (i.e., remembering how to solve a problem type that you have seen before). Rather than trying to memorize syntax or algorithms in isolation, the best way to develop conceptual chunks is practice.
If problem-solving practice is the goal, how many problems are required to reach a reasonable skill level? Some people don’t like that way of thinking, and respond with Do as many problems as you can or The question doesn’t make sense, because people vary in skill level, and problems vary in difficulty level. This is similar to the reaction that some people have to the infamous 10,000-hour rule.
I think we should approach both the “how many problems” question and the “how many hours” question the same way:
Don’t take the numbers too literally. People do vary in skill level, and problems do vary in difficulty level. So there’s no guarantee that a particular amount of work will get you to a particular goal. You may even reach your goal earlier than you expect.
But don’t throw away the numbers completely. Anders Ericsson, who conducted the study that the 10,000-hour figure came from, recommends the following interpretation: Use the numbers to prepare yourself for a realistic amount of work. If you want to reach the top of your field, you’ll have to put in roughly the number of hours that your competitors are willing to put in. In a field where competition is light (e.g., memorizing lists of numbers, before memory competitions became popular), that number may only be 100 hours. Fields with heavy competition (e.g., violin) may require several multiples of 10,000 hours to reach an elite level.
Numbers from Quora
The “how many problems” question has been asked a few different ways on Quora over the years, and competitive programmers have provided statistics from their own training experience.
First, here are some numbers from the following question: Roughly how many problems has the average red coder solved?
- Przemysław Dębiak says he achieved a TopCoder target rating (3000, the high end of the red range) after solving fewer than 1000 problems. However, he says he was already a fairly good coder when he started.
- Bohdan Pryshchenko estimates that he has solved over 8000 problems, using a philosophy of quantity over quality. However, he doesn’t think that approach is necessary for everyone. At one extreme, he believes that someone with a background in math contests could reach TopCoder red status after only a few hundred problems. A more serious competitor who participated in IOI in high school and ACM-ICPC in college would likely solve a few thousand problems by the end of college.
- Thanh Trung Nguyen says he has solved about 3000 problems, but thinks the average red coder has probably solved under 1000.
- Dima Korolev estimates that he solved fewer than 500 to get to TopCoder red, but he excludes problems from his count that he considers too easy. (He calls these veni, vidi, vici problems).
- Nick Wu says he has solved over 2000 problems.
Another useful Quora question is How much work did it take to get on the US IOI team? (Though the people who answered were from other national teams). Since IOI is a high school contest, these answers focus on early competitive programming training, providing a glimpse at how rigorously some competitive programmers train early in life. The question asks “how much work,” so some people answered in terms of hours spent. Here are the answers that used “number of problems” as their unit of measurement:
- Richard Peng was part of the Canadian team. He breaks down his problem count into ten different contests, online judges, and training programs, and estimates that he solved a total of about 750 problems, 300 of which were “extremely easy” (similar to A and B problems from Codeforces Division 2).
- Hai Dang was on the Vietnam team, and says he knew people who solved between 1800 and 2500 problems.
- An anonymous user says he solved 1500 problems to break the top 100 in a USACO gold contest. He estimates he would need at least 2500 to get to IOI.
- Nguyên Nguyễn says he made his team after just 600-700 problems, but after three more years of practice he was up to 2000.
Finally, a few people collected Gennady Korotkevich‘s statistics on various online judges. As of mid-2014, they calculated that he had solved 2378 problems on those sites. Of course, that doesn’t include offline preparation, training camps, and the many in-person contests that he has dominated over the years.
How to Use These Numbers
It’s important to note that the numbers listed above are not precise. They are often reconstructed from people’s memories, and they don’t all measure progress towards the same goal. They all use the same unit of measurement (number of problems), but some estimates are for young students preparing for IOI, some are for getting a TopCoder red rating, and others are just for serious competitive programmers working on their training.
Fortunately, we don’t need precise numbers. Like the 10,000-hour figure, the purpose of these quantities is just to illustrate what serious competitive programmers are willing to do to get better. That gives you an idea about whether you’re in the right ballpark based on your goals.
The graph at the top of this post shows the numbers I got from Quora, identified by the initials of the person who suggested or reported each number. I decided to sort the data in ascending order, since there’s no natural order for the $x$-axis.
The data seems to me to be clustered into four groups:
- Fewer than 1000 problems: The lower end of this range is iffy. Bohdan Pryshchenko said he thought a math contest background would allow someone to reach TopCoder red in only a few hundred problems, but he didn’t actually know anyone who took that approach. And Dima Korolev excluded easy problems from his count. So the more reliable data points in the $<1000$ group are from 650 to 900. (Also, no one actually reported 99 problems. The JZ data point is just for fun).
- 1500 problems: This was an anonymous report of what it took one programmer to break the top 100 in a USACO gold contest.
- 2000-3000 problems: This range includes both estimates and actual reported numbers.
- 8500 problems: Bohdan Pryshchenko’s “quantity over quality” approach, far away from the rest of the data points.
So based on this small dataset, my conclusion would be that it’s possible to be competitive (e.g., make the IOI team in some countries) after 600-700 problems. Getting to TopCoder red probably requires closer to 1000 (including easy problems). But many serious competitors have solved in the 2000-3000 problem range. Especially now that competitive programming is more popular than when these Quora writers were learning, that seems like a safer target range for people who want to be sure they’re keeping up with the competition.