In the Divide and Conquer section of uHunt Chapter 3, one of the subsections is called Binary Search the Answer. Here’s how that technique works.
For a binary search problem like UVa 12192, the data to search is the input data. For that problem, you are given a rectangular grid of integers, and your goal is to find a square region that satisfies a particular property. One step in the process of finding that region efficiently is to run binary search on the data in a row.
Binary search problems like UVa 11413 and UVa 12032 require a different approach. Instead of binary searching the input data, you must search potential output values. The result of the search is the answer.
Competitive programming problem statements often specify the range of inputs that you are required to handle in your solution. For example, the description for UVa 12192 specifies that the grid cannot be larger than 500×500.
The input range is there for a few reasons. It lets you estimate the runtime of your solution before you submit, to avoid a Time Limit error. It also lets you choose an appropriate algorithm. If you determine that an $O(n^2)$ algorithm will run fast enough given the input size, then there’s no need to use an $O(n \log n)$ algorithm, which may take more time to implement. Finally, for search problems, the range tells you which values you have to search.
For binary search the input problems, you can directly plug the input range from the problem statement into the low and high values in your binary search function. For binary search the output problems, you may have to process the input first. Let’s look at a couple of examples.
UVa 11413: Fill the Containers
Here’s what UVa 11413 asks:
You are in a factory with a conveyor belt. The belt carries full bottles of milk, each of which can have a different volume. You are given an ordered list of of all of the volumes in advance. As each bottle comes off the belt, you have to pour it into a container. You have a specific number of containers and you must use all of them, but each of the containers can be any size you want. Furthermore, all of the milk from a given bottle must go into the same container, you have to empty each bottle as soon as it comes off the conveyor, and once you start using a new container, you can’t go back and pour milk into one of the previous containers. What is the minimum volume that your largest container can be?
The problem statement is a bit convoluted. That’s my attempt to make it as clear as possible. The original has an example of what that last sentence means.
In brainstorming the solution, I converted the problem statement into this mathematical form:
You are given n integers in order. Split them into m segments such that the sum of the integers in the segment with the largest sum is as small as possible.
This is more abstract than the original, but it’s simpler, and I think it captures all of the requirements.
My first solution approach was to generate all of the ways to get the sum $n$ using $m$ integers. For a given sum, each term represents the number of bottles to take from the conveyor and pour into a container. It’s possible to find the largest term in each sum, and then pick the sum with the smallest such term. But this algorithm runs too slowly given the number of possible sums.
For the binary search the answer approach, I got some help once again from Sai Cheemalapati, who wrote about this problem. Here’s how that approach works.
Imagine you have a tryCapacity(int maxCapacity)
function that returns true
if a particular capacity meets the problem constraints. Since the problem specifies a maximum number of milk bottles, and a maximum volume for each bottle, there’s a finite space of possible container capacities that you have to search. To efficiently process this finite but large space, you can use binary search. Start in the middle. If that capacity passes the tryCapacity
test, try the middle of the smaller range (since we want to make the maximum capacity as small as possible). Otherwise, try the middle of the larger range. Exit the binary search loop in the usual way (see below).
UVa 11413 can be solved in well under the time limit using a slow loop (linear search) inside a fast loop (binary search). The tryCapacity
function uses the slow loop. It works by looping through all of the bottles and “emptying” each one into a container. If a container would exceed the current maximum capacity, a new container is started. There are two ways that a given maxCapacity
can be invalid (not a possible solution):
- A single bottle volume exceeds
maxCapacity
(since we aren’t allowed to split a bottle across multiple containers). - More than
m
containers are required (wherem
is one of the test case parameters).
The plan: For each bottle, exit if we’re already over capacity with just that bottle. Otherwise, check if adding the current bottle to the current container would put us over capacity. If it does, start a new container. If we’re starting a new container, increment the container count. If that puts us over the container limit, exit. Otherwise, add the current bottle to the current container.
Here’s the pseudocode:
bool tryCapacity(int maxCapacity)
for each bottle
if bottle volume > maxCapacity
return false
if current container size + bottle volume > maxCapacity
current container size = 0
if current container size is 0
increment number of containers used
if number of containers used > m
return false
current container size += bottle volume
This function tells us if a particular capacity could possibly be the answer. It doesn’t tell us if it’s the best answer. For that, we need to systematically search possible maximum capacities, being sure not to skip any that might be the answer, but also without searching every possible capacity. That’s a job for binary search.
I explain the basic binary search algorithm (from Sedgewick) in great detail in Visualizing Binary Search. We can use it to search capacity values as described above.
Here’s the pseudocode:
lo = 1
hi = sum of all bottle volumes
set best to hi+1
while lo <= hi
mid = lo + (hi-lo)/2
if tryCapacity(mid)
best = mid
hi = mid-1
else
lo = mid+1
print best
That’s all there is to it. The while
loop and the mid
formula are from the standard binary search algorithm. If mid
is a valid capacity, then we save it, since it may be the answer. But we also look for a smaller answer. If mid
is not a valid capacity, then we have to look for a larger capacity. When the loop exits, best
contains the answer. If no answer is found, then best
is out of range. That can’t happen if hi
is set properly, so it would indicate a bug.
Since binary search narrows down the answer so quickly, it’s not too important to set an exact value for hi
, as long as it’s at least as high as the largest possible answer. But if you want, you can sum the bottle volumes, since you know that’s the largest container you’ll possibly need.
UVa 12032: The Monkey and the Oiled Bamboo
If you solve UVa 11413 first, UVa 12032 will sound very familiar, once you make it past the part about the monkey.
The real problem is as follows, and it doesn’t involve monkeys or bamboo:
You’re at the bottom of a ladder. You are given the height of each rung in feet, relative to the ground (which is at height 0). You also have something called a “strength factor” (k), which works as follows: You can jump k feet, but no more. If you jump exactly k feet between rungs, then k is decremented by one. Otherwise, k is unchanged. What is the minimum k required to get to the top rung of the ladder?
As with the milk bottle problem, we can solve this with a tryk
function that indicates whether a particular k
value is a possible solution, and a binary search loop to test all possible k
values.
Like tryCapacity
, tryk
uses a linear search loop, over rung heights instead of milk bottle volumes. The logic is simpler for tryk
: At each rung, if k
is too small, return false
, and if k
is just right, decrement it. If we make it through the loop, return true.
bool tryk(int k)
for each rung
if distance to next rung > k, return false
if distance to next rung == k, k--
return true
The binary search part of the solution is so similar to the milk bottle binary search that I won’t even repeat it here. I will mention one thing: the UVa OJ test data covers the full range of possible rung heights. So don’t skip any.
Thoughts on the Two Problems
I’m a bit surprised that uHunt includes both of these as starred problems, since they’re so similar. But solving UVa 12032 after UVa 11413 does give you a sense of what it’s like to have a solution at the ready. Now you just have to recognize a binary search the answer problem when you next see one.
For me, UVa 11413 was an example of how a problem can be difficult even when you know which technique to use. You can know that the answer is a sum and that you need to search through possible sums, and still not come up with the specific tests in the tryCapacity
function. The solution? More practice.
(Image credit: Anthony)