Last week I wrote about solving UVa 11085 using the recursive backtracking search technique. This week, I’m going to outline a more general approach to recursive backtracking problems. Then I’ll show how it can be adapted to solve UVa 574.
UVa 574 asks us to solve this problem:
Given a multiset $S$ of integers and a target sum $t$, find all subsets of $S$ that sum to $t$.
Recursive Backtracking Characteristics
If a problem has these characteristics, it may be amenable to a recursive backtracking solution:
- It requires searching through a set of possible solutions. There’s no way to compute its solutions directly, or determine in advance what part of the complete solution space will contain the desired solutions.
- It involves building up complete solutions from partial solutions.
- As each element is added to a partial solution, it is possible to decide whether the partial solution is valid.
Here’s how those characteristics apply to UVa 574 and UVa 11085:
- UVa 574: The solution space consists of subsets of $S$. Our goal is to find which of those subsets sum to $t$. To do that, we have to create subsets and calculate sums. We can use some shortcuts that allow us to avoid looking at every possible subset, but we still need to look through many of them.
- UVa 11085: The solution space consists of placements of eight queens on a standard chessboard. We have to check some fraction of all possible placements to determine which ones contain no pairs of queens that attack each other.
Complete solutions from partial solutions
- UVa 574: If we have a subset of $S$ whose sum is less than $t$, we can add another integer from $S$ to the subset, and get the new sum using just one operation. By keeping track of partial sums, we avoid having to go back and recalculate the sum of the whole subset.
- UVa 11085: We move towards a complete board by adding a new queen to a partial board.
Partial solution validity
- UVa 574: As we build up each subset, we can check if its sum is $t$, which means we’re done. We can also make sure we’re always using a unique element (not necessarily a unique integer) from $S$. (This is another rule from the problem statement).
- UVa 11085: If we have a valid board containing $x < 8$ queens, we can add another queen and check it for validity by comparing it with the $x$ previous queen positions. If we check each new queen as we add it to the board, we don’t have to check every pair of queens again at each step, since we know that the partial board is valid.
Recursive Backtracking Program Structure
Here’s a high-level structure for a recursive backtracking solution. First we need a method to parse the test case and kick off the recursion:
parseTestCase() read and parse test case input initialize case-level variables call recursive function with initial arguments
parseTestCase calls a recursive function that takes care of searching the solution space. For a two-dimensional solution space, this
backtrack function accepts an
x value as a parameter, and loops through a set of
y values inside the function. For example,
y could be the row and column numbers on a chessboard:
backtrack(parameters, including value x for one dimension) for each value y for another dimension if (x, y) is valid add (x, y) to partial solution if solution is complete and valid calculate/save/print result else if more x values are available recursively call backtrack with x+1 else do nothing (continue with next y value) else do nothing (continue with next y value)
Based on the third characteristic of recursive backtracking problems, we know there’s a way to determine if
(x, y) is a valid element to add to the current solution. If determining this takes more than one step, it’s best to encapsulate the decision in a separate function, which we’ll call
place(parameters) using parameters and case-level data, determine whether this is a valid element if so, return true otherwise, return false
Levels of Data
To solve a recursive backtracking problem, we’ll operate on data at two levels:
- Case-level: This data is valid at any point during test case processing. Examples of this type of data include the test case input and output. The input never changes once it is read from stdin, and the output never changes once it is printed to stdout. Therefore, we can rely on these not to be in an intermediate state. There may be other canonical data, like a list of prime numbers to compare against results (see UVa 524).
- Function-level: This data is specific to a particular recursive function call. Therefore, it must be local to the function, either as a function argument or as a local variable. An example of this type of data is a partial solution, like a chessboard with some open columns.
It’s important to store your data at the right level. The benefit of a recursive function is that the function’s local state is saved for you on each call. But if you put your partial solution data outside the function, the compiler won’t know that you need a separate copy for each call. This might cause one partial solution to clobber another one. I’ll cover this in more detail in the example that follows.
Keeping case-level data at the function level is not necessarily a problem, but it can consume more memory than necessary. Since case-level data is common to all recursive calls, you’ll just be storing a lot of duplicate data. If your case-level data is large, you may also run into time limit problems due to the overhead of passing it as function parameters.
UVa 574: Sum It Up
Now that we have a general framework for a recursive backtracking solution, let’s see how to apply it to UVa 574.
Parse test case
In this step, we have to parse the test case input, and set up case-level variables. I used four case-level variables for this problem:
int input: The integers in the input multiset, in the order they are provided in the test case input.
HashSet<String> output: Each output row gets stored here. The algorithm can generate duplicate results, so this is how I avoid printing duplicates, which the output instructions prohibit.
int set: A solution set that eventually sums to the target total. Rather than storing the actual integers, I store indexes into the
numbersarray. This is more useful for tracking which numbers have been used in the current set: positions are unique, while input numbers can repeat. (That’s what multiset means).
It should be clear that
output are case-level variables. You might interpret
set as a function-level variable, and that design would work as well. However, I decided to declare it at the case level using the maximum possible set size (the number of input integers). This allowed me to re-use it across function calls by keeping track of the current set position, which saves memory and execution time.
For UVa 574, each recursive call to
backtrack needs these parameters:
int sum: The sum of the elements in the current subset. Passing this as a parameter means you don’t have to recalculate the sum on each call.
int setPosition: The position for the next number in the current subset. Passing this as a parameter means you don’t have to pass an entire subset array: all
backtrackcalls can share the same array.
TreeSet<Integer> availablePositions: Input number positions that have not yet been used. The test case input is sorted in nonincreasing order, and the output must also be shown in that order. A
TreeSetallows us to efficiently select only the input integers that are less than or equal to the most recently used element.
With these parameters, we can fill out the specific
backtrack function for this problem, following the general pattern shown above. At a high level, this function needs to:
- Iterate through the set of input numbers.
- Skip any that are too large.
- Add the others to the current set.
- Print a result, recursively call itself to try the next set position, or backtrack to the previous set position.
Here’s the pseudocode:
backtrack(sum, setPos, availPos) previousPos = previous position in set remainingPos = (elements >= previousPos in availPos) for each numberPos in remainingPos if place(sum, numberPos) add numberPos to set newSum = sum + input number at numberPos if newSum == total format output row if it is not already in the output set print it add it to the output set else if newSum < total and we still have input numbers remove numberPos from availPos backtrack(newSum, setPos+1, availPos)
A few more details about
placefunction for UVa 574 just checks if the current sum plus the candidate number is less than or equal the target sum. That’s simple enough that we don’t really need a separate function, but it doesn’t hurt to learn the standard pattern for future use.
(x, y)pair in this case is
(setPos, numberPos). Those are analogous to the column and row board positions in UVa 11085.
- Building the
remainingPosset rather than iterating through every element in
availPoslowers the runtime significantly. Since
TreeSet, its keys are sorted. Using the
tailSetmethod in Java, we can slice off only the numbers that meet the nonincreasing rule specified in the problem statement.
- There are a few other Java implementation details, but they’re all topics that I covered in Java Lessons from uHunt Chapter 2.
Why Use Recursion?
For appropriate problems, recursion is a convenient way to simplify your code by getting the compiler to keep track of intermediate results. Here’s how it works for UVa 574:
- You call
availPosarguments. The values that you pass in are specific to the current function call.
backtrackcreates other local variables,
remainingPos. These are also specific to the current call. You don’t have to remember which sum goes with which set position, list of available positions, etc.
numberPosloop begins, checking each number in
- If the current sum is below the target sum,
backtrackmakes a recursive call to itself. The current values are saved on the call stack, and a new copy of
backtrackbegins with the new values.
- If the current sum is equal to the target sum, the result is printed. Since there is no recursive call, the loop continues with the next value of
numberPos, and all of the local variables maintain their current values. This is what we want for recursive backtracking problems, because new solutions build on previous solutions. Even though we found a valid solution, there may be other valid solutions that we can build from the same set. For example, we might find a solution $5=3+2$, and then later find $5=3+1+1$.
falsefor the current sum and number position, then we have exceeded the target sum. In this case, there is also no recursive call, so we continue with the next loop iteration. Since available numbers are sorted in nonincreasing order, there may be a smaller number that will gives us the target sum.
- Once the
numberPosloop completes, the current call to
backtrackreturns. This is the actual backtracking that the technique is named for, since it causes the code to return to a previous
backtrackcall, and therefore a previous set position. The
numberPosloop that was in progress at the time of the previous call then resumes, checking the remaining numbers at the previous position.
- This process continues until the very first
The process of looping through available input values and recursively checking set positions finds all of the required subsets. It also saves time compared to a complete search, by eliminating many subsets that we know won’t provide a valid result.
(Image credit: James Brooks)