Unit Testing the Java Class Library

Latte Art Robot

Unit testing is a controversial topic in developer circles. Opinions range from “Most Unit Testing is Waste” to “You are not allowed to write any production code unless it is to make a failing unit test pass.”

For those on the pro-unit-testing side, there are numerous benefits to the practice, especially the test-first variety. In my experience, writing unit tests provides a net gain in productivity compared to programming practices that don’t use them. These are the benefits that I find especially useful:

  • Unit tests drive good design: If you’re strict about testing all of your code, you find yourself designing it in a particular way, the way that makes it easy to test. Design principles that promote testability include writing components that are loosely coupled, and giving each component a single responsibility. These design principles have other code quality benefits.

  • Unit tests allow low-risk refactoring: If you’re working on a system without unit tests, every change to the code creates a regression risk. Therefore, developers tend to make changes only when they have to. Generally this is when they’re adding features or fixing bugs. In a codebase with good unit tests, you can keep the code clean by making design improvements with less risk of unintended side effects. Then even small improvements are worth making, which means you can improve your code over time.

I’ll get to a third benefit, unit tests as sample code, shortly.

The Skeptic

As with any software development practice, unit testing can be abused. Unit tests are not a silver bullet that magically causes code to be written well. This Stack Overflow post lists 31 unit testing anti-patterns with clever names like:

  • The Local Hero: A test case that is dependent on something specific to the development environment it was written on, and
  • Mount Vesuvius: A test that is destined to FAIL at some specific time and date in the future.

Here’s one that isn’t on the list:

  • The Skeptic: A test that asserts conditions that other tests have already verified.

While skepticism is usually a good quality for programmers, in this case it violates the “don’t repeat yourself” principle. If a change in requirements forces you to update tests, you don’t want to fix more tests than necessary. So when you’re testing a unit of functionality, make sure you’re only testing that unit. Or as Martin Fowler puts it, “[Unit] tests are tests of the behavior of a single unit. We write the tests assuming everything other than that unit is working correctly.”

The ultimate manifestation of The Skeptic is testing your compiler, framework, or library. One of the rules of programming that beginners learn early is: The compiler is always right. Similarly, the framework or library that ships with the compiler is also free of bugs, because it has been thoroughly tested by the manufacturer. Of course, that’s not completely true, but it’s close enough. If your program isn’t working the way it should, it’s unlikely that the fault lies in your compiler or supporting software. You should probably inspect your own code first.

Most people don’t intentionally unit test their compilers, so what this anti-pattern really means is: make sure your unit tests are testing what you think they are testing, namely your own code. The test-first approach helps with this, because it ensures that you can make tests fail and then subsequently pass based on code that you are writing.

Learning Tests

Now that I have explained why it’s a bad idea to unit test your compiler and class libraries, here’s a scenario where it’s a good idea. Several unit testing practitioners have written about the concept of a Learning Test: a unit test that targets someone else’s code for the purpose of understanding how it works. The standard example of “someone else’s code” is a third-party library, usually one that doesn’t come with unit tests of its own. Learning tests are therefore an exception to the rule that your unit tests should only target your own code.

Learning tests have a different purpose than regular unit tests. They don’t drive design choices, since the components under test have already been designed, written, and delivered to you. And they don’t generally catch functional regressions, since the authors of the components presumably already did that. The main purpose of learning tests is to give the consumer of a library or framework a chance to use it without the constraints of writing production code against it. A second benefit of learning tests is to provide an early warning about changes to a library. If you write tests against a particular version of a library and run them whenever a new version is released, they help verify that your assumptions about how the library works are still correct.
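
To make the idea concrete, here is a minimal sketch of a learning test aimed at the class library rather than at my own code. The class name SplitLearningTest and its helper methods are made up for illustration. It records one assumption about String.split: that trailing empty fields are dropped by default and kept when the limit is negative. If a future library version behaved differently, the check would fail and flag the change.

```java
import java.util.Arrays;

// Hypothetical learning test: the names here are illustrative only.
class SplitLearningTest {
    // Default split: trailing empty strings are removed from the result.
    static String[] splitDefault(String s) {
        return s.split(",");
    }

    // A negative limit keeps trailing empty strings.
    static String[] splitKeepTrailing(String s) {
        return s.split(",", -1);
    }

    // Tiny assertion helper so the sketch does not depend on a test framework.
    static void check(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }

    public static void main(String[] args) {
        check(Arrays.equals(splitDefault("a,b,,"), new String[] {"a", "b"}),
              "default split should drop trailing empty fields");
        check(Arrays.equals(splitKeepTrailing("a,b,,"), new String[] {"a", "b", "", ""}),
              "negative limit should keep trailing empty fields");
        System.out.println("assumptions hold");
    }
}
```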

Here are some sources to read more about learning tests:

  • James Grenning writes about them in Chapter 8 of Robert Martin’s Clean Code, a book that is worth reading in its entirety. On learning tests: “It’s not our job to test the third-party code, but it may be in our best interest to write tests for the third-party code we use.”

  • In Chapter 25 of Test-driven Development: By Example, author Kent Beck describes learning tests and attributes the concept to Jim Newkirk and Laurent Bossavit.

  • In Chapter 4 of Test-Driven JavaScript Development, Christian Johansen writes: “Learning tests … help us document our knowledge and our learning experience.”

Learning a Language Using Unit Tests

Learning tests are normally applied to a third-party library. But any library that you’re not writing yourself could be considered third-party. So why not apply the learning test concept to the core library of the language you’re using? For Java, this core library is known as the Java Class Library (JCL).

One of the side effects of solving programming puzzles is learning a programming language well. When you’re working on a puzzle and you need a language or library feature that you haven’t used much, you can look it up and use it. You can also repeatedly practice using features that you previously learned.

As I have mentioned before, I’m maintaining a reference class of Java code snippets related to the puzzles that I’m solving. Recently, I refactored this class to work more like a set of unit tests.

Unit Test Frameworks

When writing unit tests for a software project, it’s advisable to use a unit testing framework like JUnit. The framework provides functionality like the ability to assert that a method returned a particular result. For Reference.java, I decided to write a very simple test framework that duplicates the functionality of an online judge.

Standard unit testing frameworks work best for testing class libraries. You call methods, and inspect the results or check for thrown exceptions. Most online judges work differently. They provide test cases in text form on standard input, and the contestant writes results to standard output. Therefore, my testing framework is based on this stdin/stdout model.
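
As a rough illustration of the stdin/stdout model, the sketch below runs a "solution" against input text and compares what it writes to the expected text, which is roughly what a judge does. It is not the framework described in this post. The names JudgeStyleSketch, solve, and run are assumptions, and the solution reads from a Reader and writes to a Writer so that strings can stand in for the real streams.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical judge-style harness: names are illustrative only.
class JudgeStyleSketch {
    // The "solution": echo each input line in uppercase.
    static void solve(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        PrintWriter writer = new PrintWriter(out);
        String line;
        while ((line = reader.readLine()) != null) {
            writer.print(line.toUpperCase());
            writer.print('\n');
        }
        writer.flush();
    }

    // Feeds the input text to solve and returns everything it wrote.
    static String run(String input) {
        StringWriter out = new StringWriter();
        try {
            solve(new StringReader(input), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String expected = "HELLO\nWORLD\n";
        String actual = run("hello\nworld\n");
        System.out.println(expected.equals(actual) ? "PASS" : "FAIL");
    }
}
```

In a real puzzle solution, the same solve method could be handed an InputStreamReader over System.in and an OutputStreamWriter over System.out.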

Reference.java, ReferenceUtil.java, and input.txt

I split my learning test framework into two classes, Reference.java and ReferenceUtil.java, and one text file containing test data.

Reference.java is the class that I modify regularly with new tests. It contains:

  • The tests themselves, one test per method.
  • A method that calls one test, for use while a new test is being written.
  • A method that calls all tests, to verify that they produce the expected output.
  • The public static void main entry point.
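
One way such a class could be laid out is sketched below. This is an assumption about the structure, not the actual Reference.java: the class name RunnerSketch, the registry map, and the reverseString test are invented for the example, and the tests write to a list instead of standard output to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical layout: one method per test, plus run-one and run-all helpers.
class RunnerSketch {
    // Collected output; a real version would write to standard output instead.
    final List<String> log = new ArrayList<>();

    // Registry of tests by name; LinkedHashMap keeps registration order.
    private final Map<String, Runnable> tests = new LinkedHashMap<>();

    RunnerSketch() {
        tests.put("changeCharacterCase", this::changeCharacterCase);
        tests.put("reverseString", this::reverseString);
    }

    void changeCharacterCase() {
        log.add("hello".toUpperCase());
    }

    void reverseString() {
        log.add(new StringBuilder("abc").reverse().toString());
    }

    // Calls a single test, for use while that test is being written.
    void runOne(String name) {
        Runnable test = tests.get(name);
        if (test == null) throw new IllegalArgumentException("No such test: " + name);
        test.run();
    }

    // Calls every registered test in order.
    void runAll() {
        for (Runnable test : tests.values()) test.run();
    }

    public static void main(String[] args) {
        RunnerSketch runner = new RunnerSketch();
        runner.runAll();
        System.out.println(runner.log); // prints [HELLO, cba]
    }
}
```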

ReferenceUtil.java contains supporting infrastructure:

  • BufferedReader, BufferedWriter, and StringBuilder instances that support efficient I/O as described in “Why is Java I/O Slow?” I/O efficiency isn’t critical for these small unit tests, but it’s best to have I/O work the same way in the tests as it does in my solution template.
  • A HashMap to store test data for each test name, so that tests can be called individually by name.
  • A method to automatically detect the name of the test method that is currently running. This keeps the Reference.java test code clean.
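
The last two items can be sketched together. The code below is a guess at one possible approach, not the real ReferenceUtil.java: it looks up the calling method’s name from a stack trace and uses that name as the key into the test data map. The class names TestDataSketch and ReferenceSketch, and the addLine and otherTest methods, are hypothetical. Only getTestInput and the ru field name mirror names used in this post.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the utility class: names are illustrative only.
class TestDataSketch {
    private final Map<String, List<String>> inputByTest = new HashMap<>();

    // Stores one line of input under the given test name.
    void addLine(String testName, String line) {
        inputByTest.computeIfAbsent(testName, k -> new ArrayList<>()).add(line);
    }

    // Returns the input registered for whichever method called this one.
    List<String> getTestInput() {
        // Element 0 is this method; element 1 is its direct caller.
        String caller = new Throwable().getStackTrace()[1].getMethodName();
        return inputByTest.getOrDefault(caller, new ArrayList<>());
    }
}

// Hypothetical test class: each test asks for its own input without naming itself.
class ReferenceSketch {
    static final TestDataSketch ru = new TestDataSketch();

    static List<String> changeCharacterCase() {
        return ru.getTestInput();
    }

    static List<String> otherTest() {
        return ru.getTestInput();
    }

    public static void main(String[] args) {
        ru.addLine("changeCharacterCase", "Hello");
        System.out.println(changeCharacterCase()); // prints [Hello]
        System.out.println(otherTest());           // prints []
    }
}
```

The design trade-off is that the test body stays free of string literals naming the test, at the cost of relying on stack inspection.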

Programming puzzles generally describe a specific format for the content passed to stdin. Here are the rules for the input text used by this learning test framework:

  • Lines that start with # are ignored. In other words, they are treated as comments.
  • Zero-length lines are ignored, to allow for nice spacing.
  • When a line starts with $, the remainder of the line (trimmed of any whitespace) is treated as a test case name. Subsequent lines (unless they start with #) through the next $ line or the end of the file are passed to the named test when it is called.
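
These rules are simple enough to sketch as a small parser. The version below is an illustration written for this post, not the actual framework code: the class name InputParserSketch and the parse method are hypothetical, and it assumes that data lines appearing before the first test name are dropped, a case the rules above don’t cover.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical parser for the input format: names are illustrative only.
class InputParserSketch {
    // Maps each test name to the data lines that follow its name line.
    static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> tests = new LinkedHashMap<>();
        List<String> current = null;
        for (String line : lines) {
            // Comments and zero-length lines are ignored.
            if (line.isEmpty() || line.startsWith("#")) continue;
            if (line.startsWith("$")) {
                // The rest of the line, trimmed, names a new test case.
                current = new ArrayList<>();
                tests.put(line.substring(1).trim(), current);
            } else if (current != null) {
                current.add(line);
            }
            // Assumption: data lines before the first test name are dropped.
        }
        return tests;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "# sample data",
            "",
            "$ changeCharacterCase",
            "Hello",
            "World");
        System.out.println(parse(input)); // prints {changeCharacterCase=[Hello, World]}
    }
}
```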

Here’s an example test:

/**
 * Convert characters in a string to a different case.<p>
 * Input: two strings<p>
 *
 * Output: the first string in all uppercase, and the second
 * string in all lowercase<p>
 *
 * Language features: Character.toUpperCase, Character.toLowerCase<p>
 * UVa problem: 10945
 */
public void changeCharacterCase() {
  ArrayList<String> lines = ru.getTestInput();
  for (int i=0; i<lines.size(); i+=2) {
    // Can't do this -- error: for-each not applicable to expression type
    //for (char c : line) System.out.print(Character.toUpperCase(c));
    for (char c : lines.get(i).toCharArray())
      ru.write(Character.toUpperCase(c));
    ru.writeln();
    for (char c : lines.get(i+1).toCharArray())
      ru.write(Character.toLowerCase(c));
    ru.writeln();
  }
}

Some points to note about this test:

  • It starts with a Javadoc comment containing a summary of what the test should do, what it accepts as input, what output is expected, and which language features are first demonstrated by this test.
  • When I’m editing Reference.java, I add tests to the end of the file as I come across features that I need for programming puzzles. Once a feature is introduced, I use it in subsequent tests without mentioning it in the comment. So each feature has a canonical test where experiments can be done to explore it.
  • When something unexpected happens, I leave the corresponding code in the test with a comment, rather than replacing it with code that works correctly. When I was writing this test, I found out that Java’s for-each loop can’t directly iterate through each character in a string. The string has to be converted to a character array first, or you have to use charAt.
  • I often make a note of the UVa Online Judge problem that prompted me to write the learning test, so I can go back and look at my puzzle solution as a further example of how to use the feature.
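
The for-each limitation noted above is easy to isolate. This standalone sketch (the class and method names are invented for the example) shows the two workarounds side by side: converting the string to a character array, and indexing with charAt.

```java
// Hypothetical demo of the two ways to visit each character in a String.
class StringIterationSketch {
    // for-each works on the char[] copy returned by toCharArray.
    static String upperViaCharArray(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append(Character.toUpperCase(c));
        }
        return sb.toString();
    }

    // An indexed loop with charAt avoids making the array copy.
    static String upperViaCharAt(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            sb.append(Character.toUpperCase(s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(upperViaCharArray("Hello")); // prints HELLO
        System.out.println(upperViaCharAt("Hello"));    // prints HELLO
    }
}
```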

Learning Tests for Programming Puzzles

One of the keys to getting the most out of programming puzzles is not to get so caught up in solving them that you forget the goal, which is to get better at programming. A good way to do that is to use learning tests. Whenever you run into difficulties getting a language or library feature to work, create a test for that feature, and explore it. Once you are comfortable using it, you can go back to your puzzle code and only worry about the puzzle, rather than the language syntax. The next time you need to use the same feature, you can review your tests and refresh your memory, or add more tests if you need to explore it further.

(Image credit: Jennifer Morrow)