Java Lessons from uHunt Chapter 2

Reference2

This post is part of a series on Java syntax and libraries for solving the problems in each chapter of uHunt, a tool for competitive programming practice. You can find the rest of the posts in the series by visiting my uHunt Java post category.

In the previous post in this series, a review of Java for uHunt Chapter 1 problems, I covered Java basics. This time around, I’ll be discussing the additional Java features required to solve the Chapter 2 problems. The problems in this chaper focus on data structures, so we’ll be looking at data structures supported by the Java Class Library.

As before, I have put full examples of all of these topics in my Reference.java reference class. You can look there to find sample code to try out.

2D Grid

Some problems take place on a two-dimensional grid, like a game board. Java supports two-dimensional arrays that you can use for this purpose. For example, here is a 2D array of char. Each array element could represent a game piece or obstacle:

char[][] grid = new char[5][5];

Arrays have easy syntax for addressing individual cells, but they can be inconvenient if you need your game board to get bigger during program execution.

Breaking Out of a Loop

Sometimes the cleanest way to terminate a loop is to break out of it based on some condition inside the loop itself. Java supports that using the break statement. Note that break only breaks out of its own loop. For example, the break in this code breaks out of the j loop, but not out of the i loop:

for (int i=0; i<5; i++) {
    System.out.print(" " + i + " ");
    for (int j=0; j<5; j++) {
        System.out.print(j);
        if (i == j) break;
    }
}

Output: 0 0 1 01 2 012 3 0123 4 01234

If you want to completely break out of a set of nested loops, there is also a way to do that using a labeled break.

Integers and Longs

Sometimes you’ll run into a problem like UVa 11340 where you have to deal with integers that exceed the range of the Integer type. Dealing with integers and long integers in Java is not always intuitive. I wrote a post, Using Integers and Longs in Java, with some advice for these cases.

Priority Queue

You can add values to a priority queue the same way you do with any collection class. But when you ask a priority queue to remove a value, it always removes either the largest or the smallest value (depending on how you initialized it). Here’s a simple example:

PriorityQueue<Integer> pq = new PriorityQueue<>();
pq.add(3);
pq.add(1);
pq.add(2);
while (pq.size() > 0) System.out.print(pq.poll() + " ");

Output: 1 2 3

The integers were added to the PriorityQueue in the order 3 1 2, but they came out in the order 1 2 3.

For a priority queue created using the default constructor, poll returns the minimum item. You can change this behavior by passing a Comparator to the constructor when you create the priority queue. This will cause poll to return the maximum item:

PriorityQueue<Integer> pq =
    new PriorityQueue<>(Collections.reverseOrder());

If you want to return values based on more complex criteria, you can pass in your own Comparator implementation. We’ll see how to do that next.

Sorting Custom Objects

Programming puzzles often involve lists of integers, which makes comparing and sorting easier. But sometimes you need to compare or sort your own object. In that case, you have to tell Java how you want to order your objects. You can do that by having your class implement the Comparable interface. That means writing a method called compareTo that returns an integer indicating how to rank an object compared to its peers.

Consider a Planet class that stores a planet’s Name and average Density. Here’s how you could make a comparable implementation of such a class:

private class Planet implements Comparable<Planet> {
    public Planet(String name, double density) {
        this.Name = name;
        this.Density = density;
    }
    public String Name;
    public double Density;

    public int compareTo(Planet p) {
        double d = Density - p.Density;
        if (d < 0) return -1;
        else if (d > 0.001) return 1;
        else return 0;
    }
}

compareTo accepts another instance of Planet that it compares with the current instance. It returns -1 if the other planet’s density is higher, 1 if it is lower, and 0 if the two densities are equal. Note that you can’t rely on == working consistently when you’re comparing two values of type double. This is a general rule about floating point comparison, not anything related to compareTo.

Since the class implements Comparable, we can sort a list of Planet just as we might sort a list of Integer or Double, by passing it to Collections.sort:

List<Planet> planets = new ArrayList<>();
planets.add(new Planet("Mercury", 5.43));
planets.add(new Planet("Venus", 5.24));
planets.add(new Planet("Earth", 5.515));
planets.add(new Planet("Mars", 3.940));
Collections.sort(planets);
for (Planet planet : planets)
    System.out.println(planet.Name +
    " has an average density of " +
    planet.Density + " g/cm^3");

Here are the terrestrial planets, sorted by average density:

Mars has an average density of 3.94 g/cm^3
Venus has an average density of 5.24 g/cm^3
Mercury has an average density of 5.43 g/cm^3
Earth has an average density of 5.515 g/cm^3

Bitwise Operators

The subject of Chapter 2 is data structures and libraries. One of the topics that the authors cover is using integer variables as data structures for efficiently manipulating strings of bits. Bit manipulation operations are fundamental operations for a CPU, which means they can be a very fast part of your solution implementation.

Of the bitwise operations available in Java, the ones that I found most relevant in Chapter 2 are:

  • Signed left shift: <<
  • OR: |
  • XOR: ^
  • AND: &
  • NOT: ~

If you want to use an integer as a string of bits, then you need a way to read and write a bit at a particular position. Here’s a way to do that.

Bitmasks

Manipulating bits often relies on the concept of a bitmask, a string of bits used to isolate the bit(s) that you want to manipulate. The simplest mask has one bit on and the rest off. To construct such a mask, you can use bit shifting.

Bit shifting

Consider the integer value 1. In binary, this integer can be represented as a block of 0‘s followed by a single 1 in the rightmost position. This mask is useful if you want to manipulate the rightmost bit in a bit string. What if you want to work on a different bit? One way to get the mask you need is to use bit shifting. The << operator shifts a set of bits left by a specified number of positions. For example, 1 << 2 would return a bit string (i.e., an integer) with a single 1 bit in position 2, if we count positions from right to left starting with position 0. This gives us the ability to create a bitmask containing a 1 bit at any position. Note that shifting left is equivalent to multiplying by two if we evaluate the bit string as an integer.

Bit shifting is very fast. But if you’re doing a lot of masking for single bits, it would be more efficient to set up all of your masks in advance. For example, you could create an array of 32 ints, where array element i contained the mask with the 1 bit in position i.

Set

Once you have a mask with the bit at position i set, you can use bitwise OR to set the bit at position i in a target bit string. Here’s why that works. Consider an integer num that you want to interpret as a bit string, and an integer mask that you want to use as a bitmask. num has some sequence of bits. mask has one bit set at position i. The statement num |= mask takes the bit at each position in num and the bit at the same position in mask, applies a bitwise OR, and stores the result in num.

Bitwise OR works as follows: it returns 1 for a given position if the bit at that position in num is set OR the bit at that position in mask is set. Here are the four possibilities at each position:

  1. num: 0, mask: 0, result: 0
  2. num: 0, mask: 1, result: 1
  3. num: 1, mask: 0, result: 1
  4. num: 1, mask: 1, result: 1

Remember that the goal for this operation is to set one specific bit in num, and leave the remaining bits unchanged. Only one bit in mask is set. For the two cases where the mask bit is 0 (cases 1 and 3), no change is made. The result is always the same as the starting value in num. For the two cases where the mask bit is 1 (cases 2 and 4), the result is always 1. This means the num bit will always be set.

Clear

We can clear a bit using the same process as above, except that we use a bitwise AND, and the inverse version of the mask.

You can invert a mask using a bitwise NOT: mask = ~mask. That operation causes all of the 1 bits to flip to 0, and all of the 0 bits to flip to 1. We’ll see why that’s useful in a moment.

As above, num contains your bit string. Bitwise AND returns 1 for a given position if the bit at that position in num is set AND the bit at that position in mask is set. Here are the four possibilities:

  1. num: 0, mask: 0, result: 0
  2. num: 0, mask: 1, result: 0
  3. num: 1, mask: 0, result: 0
  4. num: 1, mask: 1, result: 1

Our goal with this operation is to clear a specific bit in num, and leave the remaining bits unchanged. Using the inverse mask, all of the bits in the mask are set except for one. When a mask bit is 1 (cases 2 and 4), the num bit is unchanged, which is what we want. For the one mask bit that is 0 (cases 1 and 3), the num bit is always set to 0, which is also what we want.

Flip

The last operation I’m going to cover is the bit flip/toggle. This can be accomplished using the bitwise XOR.

Bitwise XOR returns 1 for a given position if the bit at that position in num is set OR the bit at that position in mask is set, but not if they are both set. Here are the four possibilities:

  1. num: 0, mask: 0, result: 0
  2. num: 0, mask: 1, result: 1
  3. num: 1, mask: 0, result: 1
  4. num: 1, mask: 1, result: 0

Though it may not be obvious, this operation can be used to flip a bit in num: change it to 1 if it is 0, and 0 if it is 1.

The mask for this operation is the original mask, with all bits set to 0 except for one bit, the one we want to flip. When a mask bit is 0 (cases 1 and 3), the num bit doesn’t change. For the one mask bit that is 1 (cases 2 and 4), the num bit gets the opposite of the value it started with. So we have flipped that bit.

Other techniques

This section only covers a few techniques. As mentioned in CP3, Hacker’s Delight is a definitive source for bit manipulation techniques and other low-level trickery.

BitSet

While bitwise operators are very fast, they are also very simple. If you want more built-in functionality and don’t need the absolute fastest solution, Java offers the BitSet class. This class encapsulates a vector of bits, and provides some extra operations that you don’t have to implement yourself. Here are some examples from the Chapter 2 problems.

Set a range and test for intersection

BitSet has a set method that lets you set a range of bits. In some cases, it is useful to interpret a BitSet range as a line segment. For example, consider the line segments [0,5], [4,10], and [9,15] as represented in BitSets:

BitSet b1 = new BitSet();
BitSet b2 = new BitSet();
BitSet b3 = new BitSet();
b1.set(0, 5);
b2.set(4, 10);
b3.set(9, 15);

Once you have these line segments loaded, you can test which of them intersect with each other:

System.out.println(b1.intersects(b2));
System.out.println(b2.intersects(b3));
System.out.println(b1.intersects(b3));

Since segment b1 intersects with b2 and b2 intersects with b3, but b1 doesn’t intersect with b3, the output is:

true
true
false

Convert between decimal and binary representation

BitSet can convert between the decimal representation of an integer and its binary representation. (The applicable methods are designed to work with arrays, so if you’re working with just a single integer, they can be a bit verbose).

To initialize a BitSet with the binary representation of an integer value, you can use the valueOf static method, which accepts an array of long. Here’s how to do that with the integer value 182:

BitSet bs = BitSet.valueOf(new long[] {182});

One way to retrieve values from a BitSet is the same way you would retrieve them from an ArrayList<Boolean>. Here’s a loop to print 182 as an 8-bit binary value:

for (int i=7; i>=0; i--) {
    if (bs.get(i)) System.out.print("1");
    else System.out.print("0");
}

The output is 10110110. When you’re printing a binary value, you have to be careful to order the digits correctly. When printing bit strings for humans, it’s common to put the least significant bit on the right. This is the same way we write decimal numbers. For the number one hundred eighty-two, the two, which is in the ones position, is the right-most digit, and represents $2 \times 10^0$. Similarly, the rightmost 0 in 10110110 represents $0 \times 2^0$.

BitSet uses this bit numbering convention, which is called Least Significant Bit 0 bit numbering, or little-endian representation. If you have a BitSet called bs, then the bit returned by bs.get(0) is the least significant bit. Therefore, if your console prints characters from left to right, you need to loop in reverse when printing a bit string, as shown above.

To get from 10110110 back to 182 using a BitSet, you can use the toLongArray method:

String str = "10110110";
BitSet bs2 = new BitSet();
for (int i=0; i<str.length(); i++)
    if (str.charAt(i) == '1') bs2.set(7-i);
System.out.println(bs2.toLongArray()[0]);

Notice the same bit ordering trickiness in this code. Since str[0] refers to the most significant bit, but bs2.set(0) refers to the least significant bit, you have to convert the loop index when going from the String to the BitSet. For an 8-bit value, 7-i is what you want.

Exponentiation

As I mentioned in my Chapter 1 review, putting an import static java.lang.Math.*; statement at the top of your solution template makes accessing math functions more convenient. For example, with those static members imported, you can use pow(b, n) to evaluate $b^n$. One thing to remember is that pow returns a double. You can cast the result to an integer if you’re raising an integer base to an integer power and need the result as an integer.

Standard Data Structures

Java provides implementations of many of the standard data structures that you learn about in college, such as Linked List, Stack, Queue, and Binary Search Tree.

Linked List

Common LinkedList operations include adding a node to the end of the list, and efficiently traversing the list. Like this:

LinkedList<Integer> lst = new LinkedList<>();
for (int i=0; i<5; i++) lst.add(i);
ListIterator<Integer> iter = lst.listIterator();
while (iter.hasNext()) System.out.print(iter.next());

Output: 01234.

Stack

Objects that you push onto a stack pop out in the opposite order. You an do that with the characters in a string:

String str = "gohegdeh";
Stack<Character> st = new Stack<>();
for (int i=0; i<str.length(); i++)
    st.push(str.charAt(i));
while (!st.isEmpty())
    System.out.print(st.pop());

Output: hedgehog.

Queue

When programmers refer to a queue, they usually mean a data structure that returns objects in the same order they were added. As described above, there is a special kind of queue called a priority queue that provides different behavior.

One unexpected feature of the Java Queue is that you can’t instantiate one using Queue q = new Queue(). The compiler will tell you:

error: Queue is abstract; cannot be instantiated

Queue is an interface, which means it defines a set of methods but doesn’t actually implement them. So while you can define a variable of type Queue, in order to use it you have to initialize it to something that implements the Queue interface. LinkedList is a common choice. Here’s the linked list example using a variable of type Queue:

Queue<Integer> q = new LinkedList<>();
for (int i=0; i<5; i++) q.add(i);
while (!q.isEmpty()) System.out.print(q.poll());

The output is the same. Which one you choose depends mainly on what best describes the intent of your program. If you’re iterating through a list, as in the first example, then it makes sense to use ListIterator. If you’re adding and removing objects from a queue, then the second example fits your scenario. But both examples use a linked list as the underlying implementation.

Binary search trees

Binary search trees are an efficient way to store and retrieve objects, especially when your problem needs to retrieve a range of values in sorted order. CP3 makes the point that, for the problem sizes encountered in programming competitions, BSTs and hash tables offer essentially identical performance.

Java provides more than one BST implementation. While solving the Chapter 2 problems, I found both TreeMap and TreeSet to be useful, depending on the problem. According to the Java documentation, TreeMap is an implementation of the red-black tree algorithm from CLRS, and TreeSet is based on TreeMap.

In Solving UVa 978 With Unit Tests, I listed a few similarities and differences between TreeMap and TreeSet. In summary: TreeMap maps keys to values, and doesn’t allow duplicate keys (though you can associate the same object with multiple keys). TreeSet just stores values, and doesn’t allow duplicates. You can use either tree type to solve a given problem, but one is usually more appropriate. For UVa 978, I decided to use TreeMap. For UVa 11849 and UVa 599, I used TreeSet. In most cases, I find the key-value mapping approach of TreeMap to be more intuitive, unless the problem just requires testing for the existence of an object.

Here’s an example of using TreeMap and TreeSet to count the frequencies of characters in a string. First, TreeMap:

String input = "programs must be written " + 
    "for people to read, and only " +
    "incidentally for machines to execute";
TreeMap<Character, Integer> tm = new TreeMap<>();
for (int i=0; i<input.length(); i++) {
    Character c = input.charAt(i);
    Integer count = tm.get(c);
    if (count == null) tm.put(c, 1);
    else tm.put(c, count+1);
}
Iterator iter = tm.keySet().iterator();
while (iter.hasNext()) {
    Character c = (Character)iter.next();
    Integer count = tm.get(c);
    System.out.println(c + " " + count);
}

The key parts of this code:

  • For each character, retrieve a frequency count from the TreeMap using the character as a key.
  • If the character isn’t found in the tree, add it with a count of 1.
  • Otherwise, increment the count.
  • Use an iterator to print the counts at the end.

Here’s the first part of the output:

  14
, 1
a 5
b 1
c 3
d 3
e 10
f 2
(...)

Notice that the output comes out sorted by key even though the code doesn’t specifically request this. That’s the natural key order in a BST.

Implementing this same code using TreeSet is more work. Since TreeSet doesn’t separate keys and values, one approach is to put both into a custom object. When you do this, you have to implement Comparable so the BST knows how to order the objects. This is essentially the same as the Planet object:

private class CharCount implements Comparable<CharCount> {
    char c;
    int count;

    public CharCount(char c, int count) {
        this.c = c;
        this.count = count;
    }

    public int compareTo(CharCount cc) {
        int compare = c - cc.c;
        if (compare < 0) return -1;
        else if (compare > 0) return 1;
        else return 0;
    }
}

Here’s the character frequency implementation using TreeSet and CharCount:

String input = "programs must be written " + 
    "for people to read, and only " +
    "incidentally for machines to execute";
TreeSet<CharCount> ts = new TreeSet<>();
for (int i=0; i<input.length(); i++) {
    Character c = input.charAt(i);
    CharCount newcc = new CharCount(c, 1);
    CharCount cc = ts.ceiling(newcc);

    if (cc == null || cc.c != c) ts.add(newcc); else {
        ts.remove(cc);
        ts.add(new CharCount(c, cc.count+1));
    }
}
Iterator iter = ts.iterator();
while (iter.hasNext()) {
    CharCount cc = (CharCount)iter.next();
    System.out.println(cc.c + " " + cc.count);
}

Besides using the custom CharCount object, the main difference with this implementation is how the tree is updated with a new count value. Since TreeSet doesn’t use a key and value approach, it doesn’t allow retrieving an object by key and updating it. In fact, there’s no way to retrieve a specific object: there’s no key, just a value, so if you know which object you’re looking for, you already have it! You can, however, retrieve an object greater than or equal to an object you have. This is the purpose of the ceiling method.

In my implementation, I’m keeping track of my own key in CharCount. If the object I get back from ceiling has the appropriate key, then I have to remove it and replace it with a new object containing the updated frequency count. There’s no way to update an object in-place.

All of this is a lot of extra complexity for a simple operation, so TreeSet is really not the appropriate choice here.

HashMap

As I mentioned last week, CP3 makes the point that the $\log(n)$ performance that a BST guarantees for its operations is generally sufficient for contest problems. The $O(1)$ best-case performance of a hash table is theoretically faster, but there’s also the chance that a bad hash function will result in $O(n)$ performance. Nevertheless, hash tables are frequently used in real-world programming, so it’s good to know about them.

Java provides two hash table implementations, Hashtable and HashMap. There are some differences between them, but for competitive programming and most other scenarios, it’s best to use HashMap.

Like TreeMap, HashMap is designed for associating keys with values. Here’s the character frequency example using HashMap:

String input = "programs must be written " + 
    "for people to read, and only " +
    "incidentally for machines to execute";
HashMap<Character, Integer> hm = new HashMap<>();
for (int i=0; i<input.length(); i++) {
    Character c = input.charAt(i);
    Integer count = hm.get(c);
    if (count == null) hm.put(c, 1);
    else hm.put(c, count+1);
}
Character[] vowels =
    new Character[] { 'a', 'e', 'i', 'o', 'u' };
for (int i=0; i<vowels.length; i++) {
    Character c = vowels[i];
    Integer count = hm.get(c);
    System.out.println(c + " " + count);
}

Although HashMap provides an iterator, it does not return keys in any particular order. Therefore, HashMap is typically used when you need to look up specific keys, not iterate through all of them. In this example, I printed just the vowel frequencies:

a 5
e 10
i 4
o 7
u 2

Cloning a Stack

What’s the output of this code?

Stack<Integer> s = new Stack<>();
s.push(3);
Stack<Integer> s2 = s;
s2.push(4);
System.out.println(s.pop());

It’s 4, of course. The variables s and s2 both point to the same Stack object. So calling push on s2 has the same effect as calling it on s.

Sometimes you need to make a copy of an object in a way that lets you operate on the copy without affecting the original. I needed to do that for my solution to UVa 732. Here’s a change to the declaration and assignment of s2:

Stack<Integer> s2 = (Stack<Integer>)s.clone();

With that change, the program outputs 3, since s2 now points to a copy of s rather than the original s.

There’s one small annoyance though: the change causes the compiler to emit this warning message:

Note: Reference.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

Since it’s a warning and not an error, it doesn’t prevent the code from successfully compiling. But why is the compiler complaining? Compiling with -Xlint:unchecked as directed provides these additional details:

warning: [unchecked] unchecked cast
    Stack<Integer> s2 = (Stack<Integer>)s.clone();
                                               ^
required: Stack<Integer>
found:    Object

So the problem is that clone returns an Object, and the compiler can’t verify that it is a type that can be safely cast to a Stack<Integer>. Compiling this code produces the same warning:

Stack<Integer> s = new Stack<>();
Object o = s;
Stack<Integer> s2 = (Stack<Integer>)o;

One simple solution to this problem is just to tell the compiler that you know what you’re doing, and not to worry about the cast:

@SuppressWarnings("unchecked")
Stack<Integer> s2 = (Stack<Integer>)s.clone();

That gets rid of the warning for this line only.

There’s a lively debate about whether you should use clone() at all. The establishment view seems to be that you should avoid it, but that’s certainly not a unanimous opinion. So what’s the alternative? Some classes (like ArrayList) provide a copy constructor. This answer provides examples of using addAll, and using a Dequeue instead of a Stack. Regardless of whether you use clone or one of these alternatives, I don’t see this as a major issue either for performance or coding style. File it under Java Annoyances.

Thoughts on Chapter Two

The uHunt Chapter Two problems require a lot of new syntax and Java library knowledge, in addition to standard problem-solving skills. This is a good opportunity to build a reference class to refer to as you solve future puzzles. The chapter provides an in-depth review of common data structures, as well as less common ones like BitSet and Segment Tree. It’s certainly not an exhaustive examination of the data structures required for competitive programming, but it’s a good start.

(Image credit: my reference class in Eclipse)