UVa 108 is rated as a Level 1 (easy) problem by uHunt, but its solution nevertheless contains some interesting techniques. Here’s a summary of the problem statement:
Given an $N \times N$ array $A$ of positive and negative integers, print the sum of the nonempty subarray of $A$ that has the maximum sum. The sum of a subarray is defined as the sum of its elements.
uHunt lists this problem in the section called Max 2D Range Sum, a subcategory of Dynamic Programming. But before we get into the dynamic programming solution, let’s examine the Complete Search approach.
The complete search approach to this problem requires that we enumerate all of the subarrays of the input array, and calculate the sum of each one separately. To enumerate the subarrays of an $N \times N$ array, consider that you can identify each subarray by a set of four integers between $1$ and $N$: starting row, starting column, ending row, and ending column. (We’ll see later why 1-based numbering is convenient). You can think of those four integers as the coordinates of the top left and bottom right corners of the subarray. Having identified a subarray using this complete search approach, the naïve way to find its sum is to add all of its $N^2$ numbers together.
To generate all of the subarray coordinates, we can use four nested loops, one for each coordinate. Another two nested loops is sufficient to iterate through the rows and columns of the subarray and calculate its sum. These six nested loops give us an $O(N^6)$ algorithm. Although the UVa OJ time limit is fairly loose for this problem, it isn’t quite that forgiving. So we need a more efficient approach.
Dynamic Programming Strategy #1: Re-using Subarray Sums
For basic dynamic programming problems like UVa 108 and the other two I have written about so far, the key is to figure out what information you can store in prior steps that will save you time in subsequent steps.
As with the two DP problems that I linked to, the key idea for UVa 108 is that the results of a prior calculation (a sum, in this case) can be used in subsequent calculations. Specifically, once we have calculated the sum of part of a subarray, we don’t have to repeatedly calculate it when we’re processing other subarrays that overlap the same region. If we can find an efficient way to store and re-use those sums, we can save a lot of calculation time. The data structure used in DP for this purpose is traditionally called a memo table.
In DP strategy #1 for this problem, the memo table is the same size, shape, and type as the input array: a $N \times N$ array of integers. The integer value at position $(i, j)$ in the memo table stores the sum of the subarray whose top left corner is the top left corner of the input array, and whose bottom right corner is $(i, j)$ in the input array. For example, memo table position $(2, 2)$ would store the sum of the square subarray from $(1, 1)$ to $(2, 2)$ in our 1-based numbering scheme.
Step 1: Generate the memo table
The key to generating the memo table efficiently is to iterate through the input array row by row, and use previous results to calculate subsequent ones. You’re basically using the memo table to generate itself.
To see how this works, consider the $4 \times 4$ memo table shown below. The numbers from
16 represent the order in which we will iterate through the sixteen input values and calculate the sixteen memo table values. The cells marked
-- have a value of
0. Padding the table in this way will simplify the code. We can read the input and populate the memo table simultaneously. Therefore, there’s no need to store a separate array of input values. All of the information we need in the next step will be in the memo table.
-- -- -- -- -- -- 01 02 03 04 -- 05 06 07 08 -- 09 10 11 12 -- 13 14 15 16
Here’s how the process works: Pick a cell in the array. As an example, I’ll use cell
10. This represents the bottom right corner of a $3 \times 2$ (three rows and two columns) subarray that begins at cell
01. According to our memo table definition, cell
10 must contain the sum of input values
10. But remember that we have been filling out the memo table in order. So by the time we reach cell
10, we already have correct values for cells
09. How can we use these values?
The answer is that we can use three pieces of information to calculate each subarray sum:
- The single cell that forms its bottom right corner. In this case, cell
- The subarray above it. In this case, that’s the $2 \times 2$ subarray beginning at
01and ending at
06. The sum of this subarray is stored in cell
- The subarray to the left of it. In this case, that’s the $3 \times 1$ subarray beginning at
01and ending at
09. The sum of this subarray is stored in cell
So rather than adding ten cells to get the sum we’re looking for, we can just add three cells:
10. And for larger subarrays, the time savings is greater.
If you’re following along, you’ll notice that this process isn’t quite right. The above subarray and the left subarray overlap at cells
05. So we have to subtract the sum of those cells to avoid double-counting. Fortunately, the amount we need to subtract is stored in a single cell,
05. This double-counted amount will always be one cell diagonally up and to the left of our target cell.
You may now be able to see the reason for the
-- cells. In a 0-based array, if your target cell is in the first row and/or the first column, you would need additional checks to avoid going out of range, because there is no previous row and/or no previous column. By padding the memo table with cells containing zeros, we can add and subtract without affecting the answer and without cluttering our code with conditionals like
if (row > 0).
We’re now at the end of Step 1, so our memo table contains the sums of all subarrays that begin at the top left corner of the input array. But the maximum sum subarray may begin at a different position. To handle the general case, we need Step 2.
Step 2: Evaluate all starting and ending positions
This step shares some ideas with the Complete Search approach described above. The difference is that the subarray sums have already been calculated efficiently in Step 1. So rather than an $O(N^6)$ algorithm (Complete Search), we have an $O(N^2)$ step (Step 1) followed by an $O(N^4)$ step (Step 2). As you may know from the rules of asymptotic algorithm analysis, this gives us an overall $O(N^4)$ runtime, which is enough to get this solution accepted at UVa OJ.
As with Complete Search, Step 2 uses four nested loops, one each for starting row, starting column, ending row, and ending column. These four coordinates define the subarray in question. We have to check each subarray to see if it has the maximum sum seen so far.
Here’s what happens inside the innermost loop:
First, we retrieve the memoized sum that we calculated in Step 1 for the subarray ending at the current row and column. Remember that this represents the sum of the subarray that begins of the top left corner of the input. Since this may not be the same as the top left corner of the subarray that we’re currently checking, we have to adjust the sum using a process similar to the one we used in Step 1. To illustrate, assume that we have the following completed memo table. As before, the numbers identify the order in which the cells were calculated.
-- -- -- -- -- -- 01 02 03 04 -- 05 06 07 08 -- 09 10 11 12 -- 13 14 15 16
As an example, assume that our nested loops currently identify a start position of cell 11 (row 3, column 3) and an end position of cell 16 (row 4, column 4). We need to find the sum of this $2 \times 2$ subarray. The memoized value in cell 16 gives us the sum of the entire array, so we have to get rid of the parts we don’t need. As in Step 1, we can do this by using the values that are above and to the left of the target.
Here’s another way to look at the memo table:
-- -- -- -- -- -- ZZ ZZ XX XX -- ZZ ZZ XX XX -- YY YY 11 12 -- YY YY 15 16
XX cells identify the subarray that is above the target. The
YY cells identify the subarray that is to the left of the target. And the
ZZ cells identify the subarray that is both above and to the left (i.e., diagonally up and to the left) of the target.
This visualization should be familiar from Step 1. But in this step, we need to subtract rather than add, since we’re removing sums that we don’t need. So beginning with the sum in memo cell 16, we need to:
- Subtract the value in memo cell
08. This removes the sum of the subarray that is above the target — the
- Subtract the value in memo cell
14. This removes the sum of the subarray that is to the left of the target — the
Notice that we have subtracted the
ZZs twice. We have double-counted them, just as we double-counted the diagonal value in Step 1. Therefore, we need to:
- Add back the value in memo cell
Once we add back one of the double-counted copies of the
ZZs, we have an accurate sum for our target subarray. Then we just compare the result with our current maximum sum, and update the maximum if necessary.
When we have made it through all possible starting and ending coordinates, our maximum value will be the answer, and we simply print that integer.
Pseudocode for Strategy #1
define integer array memo[N+1][N+1] for each input row and col increment memo[row][col] by memo[row-1][col] // above increment memo[row][col] by memo[row][col-1] // left decrement memo[row][col] by memo[row-1][col-1] // above/left for each startRow, startCol, endRow, and endCol sum = memo[endRow][endCol] decrement sum by memo[startRow-1][endCol] // above decrement sum by memo[endRow][startCol-1] // left increment sum by memo[startRow-1][startCol-1] // above/left if sum > max, update max print max
Dynamic Programming Strategy #2: Column Squashing
Although Dynamic Programming Strategy #1 is sufficient to implement a solution that UVa Online Judge will accept, there’s a way to solve UVa 108 even faster by using two memo tables rather than one.
Memo table 1
Memo table 1 is very simple: it’s just the input array. As we read the input values, we just insert them into a 2D array. Unlike in Strategy #1, we don’t need to do any processing on the input in this step.
Memo table 2
The purpose of memo table 2 is to hold a squashed version of the 2D input array (memo table 1) in a 1D array (memo table 2) for further processing. Each cell
i in memo table 2 contains the sum of the input cells in column
i between the current start row and the current end row.
Since there are $N$ possible start rows and up to $N$ possible end rows for each start row, memo table 2 gets written and overwritten $O(N^2)$ times. But we only need one of these copies at a time. This means we only need $O(N)$ space, and we can save one nested loop compared to Strategy #1, for an overall $O(N^3)$ runtime. I’ll explain how that’s possible.
The two outer loops in this step iterate through the $N$ start rows, and up to $N$ end rows for each start row. (The end row cannot precede the start row, of course).
For each new start row, we can initialize a fresh copy of memo table 2. We don’t need to save any previous instances of the table, because we have already extracted everything out of it that we need.
For each (start row, end row) pair, there are three steps:
The first step is to update memo table 2. To do that, we have to iterate once through all $N$ columns. For each column, we increment the memo table value using the input value from the current row and column. For example, consider the $3 \times 3$ input array below. The numbers represent the row and column positions — e.g.,
11 is row 1, column 1. (We can use 0-based numbering for Strategy #2, since we never need to check the previous row or column value).
00 01 02 10 11 12 20 21 22
In our first iteration, the start row and end row are both 0. So memo table 2 will just contain the input values in cells
02. In the next iteration, start row is still
0 but end row is now
1. So the memo table will contain the sums of each column from the first two rows:
00 + 10,
01 + 11, and
02 + 12. And so on through each (start row, end row) combination.
After each memo table update, we have a list of $N$ sums. Each of these values is the sum of some portion of a column in the input — i.e., an $i \times 1$ subarray for some $i \leq N$ and some start row. If we add contiguous memo table values together, we can calculate the sums of larger subarrays. For example, adding two contiguous sums gives us the sum of an $i \times 2$ subarray. If we find the maximum such sum for every (start row, end row) combination, then we have found the maximum sum for all subarrays, and we have solved the problem. Conveniently, there’s an $O(N)$ algorithm called Kadane’s Algorithm that solves the maximum subarray problem.
We can apply Kadane’s Algorithm as follows to find the maximum subarray sum for the current memo table:
Define two integer variables: the maximum sum found so far in the current instance of memo table 2, and a candidate sum that we’ll check. For each column, increment the candidate sum by the value in the memo table for that column. If the candidate sum exceeds the current best sum, update the current best sum. Otherwise, keep the current best sum unchanged. If input values can be negative (as they can in UVa 108), we also have to check if the candidate sum is negative, and if it is, reset it to zero. Otherwise the algorithm will get stuck on a local maximum for certain types of all-negative inputs. To see why that is, try walking through the algorithm with this $2 \times 2$ input array:
-2 -1 -2 -1
Once we have tried all start and end row combinations, the maximum of all the best sums that we find is the solution.
Pseudocode for Strategy #2
define integer array memo[N][N] read input values into memo for each startRow define integer array squashed[N] for each endRow for each col increment squashed[col] by memo[endRow][col] for each col increment candidateSum by squashed[col] if candidateSum > sum, update sum if candidateSum < 0, candidateSum = 0 if sum > max, update max print max
Thoughts on UVa 108
UVa 108 is a deceptively simple problem. It’s easy to understand what is being asked, but not necessarily as easy to come up with an efficient implementation. Having finished and written about UVa 787, I knew about the idea of saving prior results (products, in that case) in a memo table for use in subsequent calculations. My first accepted solution used “above” and “left” concepts in a similar way to Strategy #1, but it was not as well-optimized.
In the interest of deliberate practice, I decided to see what better solutions were out there. That led me to explanations of what I’m calling Strategy #1 and Strategy #2.
I found these UVa 108 solutions and explanations useful for understanding this problem:
- Sai Cheemalapati provides code that uses Strategy #1.
- PwzXxm has descriptions and code for both Strategy #1 and Strategy #2.
(Image credit: Ramesh NG)