When you’re solving competitive programming problems, it’s tempting to code as fast as possible. Once you have some idea about what the problem is asking for, just start coding and hack away until your solution passes. After all, these problems are written for timed contests, so why not solve them as if you’re racing the clock?
At some point in your competitive programming career, the “full speed ahead” approach might be the right one. If you want to do well in contests, you have to practice under contest-like conditions. But it may be a while before that’s the optimal practice strategy.
The problem with coding quickly is that you’ll end up with messy code that’s hard to debug. There’s nothing morally wrong with writing messy code that you’re going to throw away in an hour. If you’re getting correct solutions quickly, go for it. But if you’re spending a lot of time debugging code that’s incomprehensible a minute after you write it, maybe it would be more efficient to slow down.
Now, I’m not saying you should go to the other extreme and polish your code as if you were going to turn it into a product. Even if you have a lot of practice time, there’s a limit to how much learning benefit you can get out of one contest problem. Save your best coding style for code that you’re going to keep around.
But some coding practices are worth bringing over to contest code when you’re not on the clock. I have written before about how unit tests can help you solve and learn from contest problems. Using those on every problem is overkill, but on problems with challenging coding steps, they may allow you to solve a problem that you wouldn’t otherwise be able to.
If writing unit tests seems too extreme, here are some coding standards that can save more time than they cost to implement:
- Use descriptive identifiers: it takes longer to type
querySegmentStart
thanqss
. But not having to remember whatqss
means will free up brain cycles to think about the algorithm you’re working on. And if you use an IDE, the extra typing won’t amount to much.
There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors.
— Leon Bambrick (@secretGeek)
- Split your solution into multiple functions: For contest problems, it’s tempting to write one long function, so you don’t have to worry about parameters and return values. But this makes it more difficult to test parts of your solution in isolation. Even if you’re not writing automated unit tests, it’s a good idea to test incrementally. For example, you can verify that you’re correctly reading the input according to the problem description. If that part of your solution doesn’t work correctly, nothing else will.
- Design before coding: Even small programs can benefit from some up-front design, using diagrams and pseudocode. I explain a series of design steps in How to Attack a Programming Puzzle.
- Don’t be afraid to throw code away: Even on a tiny software project like the solution to a programming puzzle, some code deserves to be discarded. As you’re working, if you find that you need to change your design, consider whether it makes sense to start at least some of your solution from scratch.
Example: UVa 10567
UVa 10576 is the first of three starred problems in the Binary Search section of uHunt Chapter 3. As is sometimes the case with uHunt problems, I didn’t solve it using the recommended technique.
This problem has a typically convoluted description, but here’s what it boils down to:
You are given an input string of uppercase and lowercase letters, and a list of queries, also consisting of uppercase and lowercase letters. A query is considered a match if it has the same characters as the input string in the same order, not necessarily consecutively. For matching queries, print Matched followed by the first and last matching character positions in the input string. For non-matching queries, print Not matched.
I approached the problem by processing the input string once, and then using a linear search process for each query. Here are the details:
- Declare a list of 52 lists, one for each possible character in the input string and query strings (A-Z and a-z).
- Process the input string into segments of consecutive matching characters. Store the starting and ending position of each segment in the list for the corresponding character. For example, if the string
aaaaabb
appeared at the start of the input string, it would be stored as{0, 4}
in thea
list and{5, 6}
in theb
list. - To process each query string, iterate through its characters in order, looking for segments of consecutive characters. When you reach the end of a segment, retrieve the segment list that you created in the previous step for that character.
- Iterate through the segment list and sum the lengths of the input segments or parts of input segments that intersect the current query segment. You can also use binary search in this step to find the first intersecting segment.
- If the length sum is greater than or equal to the length of the current query segment, move on to the next query segment. Otherwise, the query is not a match, so print Not matched.
- If you make it through the whole query string without a mismatch, print the first and last matching input character positions.
Design vs. details
Competitive programming problems often have a large section of input followed by multiple smaller queries about that input. If you can process the large input into a more efficient form, it can save you time while you’re processing the queries. That approach works for this problem. Rather than processing the full input string for each query, it’s more efficient to make one pass to split it into segments, and then repeatedly re-use the segment list. UVa OJ ran my Java implementation of this design in less than one second. Using binary search in Step 4 would cut that down even more.
Once you work out this design, the remaining challenge is to get the details right. That’s where coding style comes in. If you just code this solution straight through, you can get lost in implementation details. Here are some details that you’ll have to consider:
- Splitting an input string into segments: You have to do this for the input string and each of the query strings. It’s best to write a function to do this, and test it to make sure it works.
- Calculating overlap: This step has a risk of off-by-one errors and other calculation bugs. Clear naming can help you distinguish between input segments and query segments in your code. This is also another opportunity for manual unit testing to ensure that you have functions that can find overlapping segments and correctly count the number of overlapping characters.
- Sub-segment calculation for the end position: Mainly you’re dealing with complete segments of consecutive matching characters. But when you get to the last segment, you may not want to use all of its characters. Ideally, your segment-handling code should be robust enough to deal with this, so you don’t have to write a special case for the last segment.
Code and Fix
The code and fix anti-pattern that threatens some real-world development projects can also be harmful to contest coding. There’s often a point in the puzzle-solving process where you thought you were done, but your solution isn’t producing the right output. Now you need to dive back into the code to figure out what’s wrong. If you haven’t followed any coding practices, this can turn into a painful debugging process. More significantly, you can end up getting an accepted answer by accident, without really understanding your solution. Then the next time you encounter a similar problem, you won’t be able to apply your experience to solving it.
To express your own thoughts on this topic, see the question about contest coding style that I posted on Quora this week.
(Image credit: Aaron Wolf)