In programming contests, some algorithms and techniques get more emphasis than they do in school or in professional programming work. One such technique is dynamic programming. CP3 has this to say about dynamic programming:
This technique was not known before 1940s, nor frequently used in ICPCs or IOIs before mid 1990s, but it is considered a definite prerequisite today. As an illustration: There were ≥3 DP problems (out of 11) in the recent ICPC World Finals 2010.
As you might expect for such a popular competitive programming topic, uHunt has twenty-one starred problems, divided into seven sub-categories, for dynamic programming practice. As I make my way through that list, I’ll be writing about a selection of these problems. Today I’ll cover the first one, UVa 787: Maximum Sub-sequence Product.
Dynamic Programming
Rather than cover everything about dynamic programming in this post, I’m just going to explain the part of it that’s required to solve UVa 787. That part is a technique called memoization. The idea of memoization is simple: cache the result of a computation and then later retrieve it rather than re-computing it. For example, here’s Jonathan Paulson’s Quora answer on how to explain dynamic programming to a four-year-old
writes down “1+1+1+1+1+1+1+1 =” on a sheet of paper
“What’s that equal to?”
counting “Eight!”
writes down another “1+” on the left
“What about that?”
quickly “Nine!”
“How’d you know it was nine so fast?”
“You just added one more”
“So you didn’t need to recount because you remembered there were eight! Dynamic Programming is just a fancy way to say ‘remembering stuff to save time later'”
There’s more to DP than that, but for the purpose of UVa 787, the ELI5 explanation will suffice. As I get to the other problems, I’ll cover more aspects of DP.
UVa 787: Maximum Sub-sequence Product
The core of the UVa 787 problem statement can be expressed concisely:
Given a sequence of integers, find the maximum product that can be formed by multiplying consecutive elements of a nonempty subsequence of that sequence.
The input is the full sequence. The subsequence is any part of the sequence, of any length, starting at any position in the sequence. The output is the product. You don’t have to print all of the factors that make up the product.
Product of Consecutive Elements
To design our solution, let’s first consider how we would compute the type of product that we’re discussing. For now, we’re not going to worry about performance.
Taking a sequence and extracting consecutive elements to form a subsequence is similar to taking a string and extracting consecutive characters to form a substring. In particular, we can find the number of nonempty subsequences of a sequence the same way we would find the number of nonempty substrings of a string:
Not including the empty substring, the number of substrings of a string of length $n$ where symbols only occur once, is the number of ways to choose two distinct places between symbols to start/end the substring. Including the very beginning and very end of the string, there are $n+1$ such places. So there are $\tbinom{n+1}{2} = \tfrac{n(n+1)}{2}$ non-empty substrings.
Replace “string” with “sequence” and “substring” with “subsequence” in the Wikipedia quote above, and the argument still holds.
Since UVa 787 only requires us to return the maximum product, we don’t have to worry about excluding duplicate subsequences, since they’ll all produce the same product. So we can ignore the fact that symbols (integers, in our case) may occur more than once in a sequence.
The first UVa 787 sample sequence from the problem statement is the three-element sequence $(1, 2, 3)$. Therefore, there are $\tfrac{3(3+1)}{2} = \tfrac{12}{2} = 6$ nonempty subsequences. Here they are:
- (1): Product = 1
- (2): Product = 2
- (3): Product = 3
- (1, 2): Product = 2
- (2, 3): Product = 6
- (1, 2, 3): Product = 6
If we use the brute force approach and calculate all of the products, we can see that the maximum product is 6.
Memoization
The key to solving UVa 787 efficiently is the associative property of multiplication. This property says we can group factors of a product however we want without changing the result.
In the example above, we calculated the six products independently, which is inefficient. But due to associativity, we know that $(1 \times 2) \times 3 = (1 \times 2 \times 3) = 6$. If we calculate the six products in order and store each result as we get it, we can use earlier results to speed up our calculation of later results. This greatly reduces the cost of calculating products with many factors, which makes it possible to solve UVa 787 within the time limit.
In our example, we can remember that $1 \times 2 = 2$. Then when we need to calculate $1 \times 2 \times 3$, we can use just one multiplication operation instead of two: look up the result of $1 \times 2$, and then multiply it by $3$. Of course, this is much more dramatic when we’re looking for the product of 100 integers, and we have already multiplied the first 99.
Storing the results of previous products to use in calculating subsequent products is an example of memoization.
The Memo Table
The data structure used to store results used in memoization is traditionally called a memo table. For UVa 787, I think the most intuitive way to visualize the memo table is as a square 2D grid of integers, where the size of each side of the grid is the number of integers in the sequence, and the row and column numbers represent the following:
- Row number: A product’s starting position in the original sequence.
- Column number: The number of factors in the product.
Going back to our $(1, 2, 3)$ example, we would use a $3 \times 3$ array since there are three elements in the full sequence. The $1 \times 2$ product would go in row 0, column 2 since it begins at position 0 in the sequence, and has 2 factors.
(Note: For simplicity, I’m using 1-based numbering for the columns, so I define my array with an extra column, and don’t use column 0. This allows me to store subsequences with 2 factors in column 2).
Here’s why it’s useful to define the memo table as specified: If the table contains the result of a product that starts at position $p$ in the sequence and contains $m$ factors, then we only need one multiplication operation to find the product that starts at position $p$ and contains $m+1$ factors. This allows us to efficiently calculate all possible products by proceeding from lower to higher starting position and product length.
Pseudocode
Putting these ideas together, we can solve each UVa 787 test case as shown in the pseudocode below.
One implementation detail: because we’ll be multiplying up to 100 integers, each of which can be up to 99999, we’ll need to use a type that can handle larger integers than the standard C/C++/Java integer datatypes can. The BigInteger
type in Java works great for this purpose.
Here’s the pseudocode:
read the n integers in the input sequence
declare a 2D array (memotable) of big integers, dimensions (n+1) x n
declare a big integer to store the maximum product
// initialize the base case (all subsequences of length 1)
for each input integer v at each input position i
set memotable[length 1][start position i] to value v
if v exceeds the max product, update the max product
// calculate products for the remaining subsequences (length >1)
for each subsequence start position pos from 0 to n-1
for each subsequence length len from 2 to n
if len+pos > n, break // too long for this start position
retrieve the subsequence product of length len-1 starting at pos
multiply by the integer at position pos+len-1
update the product at memotable[len][pos]
update the max product if necessary
print the max product
Learning Dynamic Programming
CP3 has a good introduction to dynamic programming in Section 3.5. Here’s how I have used it so far:
- I read it once during my initial read-through of CP3 Chapter 3, before I started the Chapter 3 starred problems.
- To refresh my memory, I re-read subsection 3.5.1 before starting the Dynamic Programming section on uHunt.
- After I submitted an accepted solution to UVa 787, I read section 3.5.2.1 to get the authors’ take on Max 1D Range Sum problems, of which UVa 787 is an example.
The more technical sections of CP3 go into a lot of detail, so I find that the best strategy is to read through them once and then rely on the uHunt problems to solidify my understanding of each concept. If I get stuck on anything, I go back to the book or another resource to clarify any points that are still confusing.
Stay tuned for more dynamic programming explanations in the coming weeks and months.
(Image credit: Jeremy Tenenbaum)