Programming puzzles frequently involve manipulation of integers. And even problems that are mainly about string manipulation, character sets, or floating point arithmetic need integers for array indexes and counters. So it pays to know how to use them. In this article, I’m going to cover some of the quirks of integers and longs as implemented in Java.
Limits
In the world of math, you can define an integer as:
a number that can be written without a fractional component.
The set of all integers, denoted by $\mathbb{Z}$, is (countably) infinite. Since physical computers like the one you’re reading this on don’t have an infinite amount of memory, they cannot store arbitrary members of an infinite set. Therefore, programming languages need to be specific about what subset of $\mathbb{Z}$ is supported by a particular data type. The magnitude of the integer that a variable can hold is limited by the number of bits available to that variable’s data type. If a data type uses one bit, then the variable can hold two different values; two bits allow for four different values; three bits allow for eight different values; and so on. If we want to store signed (both positive and negative) integers, then half of the available values will be positive and half will be negative.
Java defines a set of Primitive Data Types and the number of bits allocated to each one. In this article, I’m going to focus on the signed versions of int
, which uses 32 bits, and long
, which uses 64 bits. Java also offers short
(16 bits) and byte
(8 bits) integer types, which may be useful for saving space in problems that involve large arrays of small numbers. Unsigned integers might be useful in edge cases, but puzzles usually define data to be well within the 32-bit range, or clearly outside of it.
With 32 bits available for int
, we can store about $2^{31}$ negative numbers and $2^{31}$ positive numbers. In fact, int
supports exactly $2^{31}$ negative numbers, so the smallest int
is $-2147483648$. To make room for $0$, there is one less positive number available, so the largest positive int
is $2^{31} – 1 = 2147483647$. The long
type works analogously, but since there are 64 bits available, the numbers are much larger. Here is the largest positive long
, $2^{64}-1$:
$$ 9223372036854775807 $$
For convenience, Java defines constants to store these limits. For int
and long
, they are Integer.MIN_VALUE
, Integer.MAX_VALUE
, Long.MIN_VALUE
, and Long.MAX_VALUE
.
What happens if you try to exceed the limit of a data type? Nothing dramatic, but usually not what you want either:
int a = Integer.MAX_VALUE;
a++; // a is now -2147483648
a = Integer.MIN_VALUE;
a--; // a is now 2147483647
As you can see, the values “wrap around” from the highest positive value to the lowest negative value, and vice versa. This behavior is a result of how signed integers are represented in binary in Java and other languages: a one in the most significant bit position indicates a negative number. If the remaining bits are all zero, then the number represented is the lowest negative number. The highest positive number contains a zero in the most significant bit position, with the remaining bits set to one. Standard binary arithmetic rules require numbers represented this way to wrap around as shown above.
int vs. long
When you’re planning your solution to a problem, one of the decisions you have to make is whether to use int
or long
for storing integers. It’s fairly common for problem descriptions to specify ranges for integer values, but you usually won’t find numbers beyond the range of int
explicitly in these ranges. Instead, these large numbers will appear as the result of arithmetic operations.
For example, UVa 11340 requires you to calculate payment for a news article based on character values. The problem description says each test case will specify a number of paid characters (up to 100), an article size in lines (up to 150,000), and the number of characters in a line (up to 10,000). The character values are specified in cents, with no defined range. For an article with the maximum number of lines and the maximum number of characters per line, it only takes an average of a couple of cents per character to overflow an int
. So for this problem, a long
is required if you want to store the payment value in cents.
Array Indexing
Rather than spending time on a decision about int
vs. long
for each problem, why not just always use long
? That would be one less thing to worry about when planning a solution.
One reason might be memory usage. An array of long
takes up twice as much memory as an array of int
. But normally this isn’t an issue. The real problem is that Java (unlike C#) doesn’t allow a long
to be used as an array index. The Java ArrayList
has the same limitation. This is unfortunate. Although you’re unlikely to need more than $2^{32}$ array elements, this limitation means that for problem that require long
s, you’ll inevitably have a mixture of int
and long
in your code.
What about using long
by default, and just casting to int
when necessary for array indexes? That’s certainly an option. You just have to decide which is more convenient: not having to worry about integer overflow, or not having to convert from long
to int
. After trying both options, I lean towards using int
by default, and accepting that I’ll need a bit more thinking time when planning a solution. The reason for this decision is that converting between integer data types in Java can be messy, as we’ll see next.
Integer and Long
So far we have been discussing int
and long
, two primitive Java data types. There are two related data types called Integer
and Long
, which are often required when dealing with integers. To see how these are used, here’s a method similar to one in my problem template:
public Long[] getLongs() {
String line = getNextInputLine();
String[] tokens = line.split("\\s+");
ArrayList<Long> lst = new ArrayList<>();
for (String s : tokens)
if (!s.trim().equals(""))
lst.add(Long.parseLong(s));
return lst.toArray(new Long[lst.size()]);
}
I’ll explain each section:
public Long[] getLongs() {
Arrays are syntactically more convenient to deal with in Java than lists are, so I’m returning an array.
String line = getNextInputLine();
This is the helper method in my template that efficiently retrieves the next line of input.
String[] tokens = line.split("\\s+");
Space-delimited input is very common in programming puzzles. This line splits the input, and handles arbitrary amounts of whitespace between tokens.
List<Long> lst = new ArrayList<>();
I’m using an ArrayList rather than an array here so that if I need to discard any of the input tokens, I don’t end up with extra array elements.
for (String s : tokens)
if (!s.trim().equals(""))
lst.add(Long.parseLong(s));
This section takes care of skipping tokens that consist only of whitespace, converting everything else to a Long
, and adding the results to the list. It won’t work if the input line has a mixture of character and numeric data.
return lst.toArray(new Long[lst.size()]);
Notice that I’m using Long
rather than long
in this function. Java doesn’t allow primitive types to be used as type arguments, so List<int>
produces a compiler error, unexpected type
. Similarly, lst.toArray
requires an argument of type Long[]
. Passing an argument of type long[]
produces the error no suitable method found for toArray(long[])
.
Let’s assume that I want to take the approach of using Long
and long
by default, and converting to Integer
and int
when necessary. I could start with a call to the function that I just discussed:
Long[] longs = getLongs();
What if I need to use longs[0]
as an array index? Let’s create an array to try that out:
Long[] a = new long[10];
Can we access the array using longs[0]
?
Long b = a[longs[0]];
Nope: incompatible types: Long cannot be converted to int
.
Notice the capital L
in that error message. What if we try using long
instead, like this:
Long b = a[(long)longs[0]];
Now we get a different message: incompatible types: possible lossy conversion from long to int
.
Since a long
can contain a number that is beyond the range of an int
, Java is trying to protect us from potentially ending up with a meaningless array index (perhaps the lower 32 bits of longs[0]
). If we know for sure that longs[0]
is within the acceptable range, can we tell Java to ignore the potential problem? It turns out that we can. This statement compiles without error:
long b = a[(int)((long)longs[0])];
So that’s one option. You can convert from Long
to long
, and then to int
, and Java allows it. But I think it’s more trouble than it’s worth to do all of that casting.
Boxing and Unboxing
Assigning a long
value to a variable of type Long
involves a process known as boxing. The opposite process is called unboxing. Sometimes these processes happen as a result of explicit code. For example, the statement (long)longs[0]
unboxes a Long
value and casts it to a long
. At other times, the compiler (in recent versions of Java) takes care of the conversion automatically. For example, notice that I used long b
in the last example above to store a value from an array of Long
, without specifically telling Java that I needed to unbox it. In the other direction, a statement like Long c = 5
represents implicit boxing, since 5
needs to be wrapped in an object before it can be stored in c
.
The process of boxing and unboxing has a performance impact, since objects need to be allocated and de-allocated. This is another reason why a[(int)((long)longs[0])]
is undesirable. Another performance consideration is that the reference types (Integer
and Long
) are slower than primitive types (int
and long
), so it’s best to use the primitive types when possible.
Long[] cannot be converted to long[]
I tried skipping one step in the getLongs()
example by changing the function return type, and using this as the last line:
return (long[])lst.toArray(new Long[lst.size()]);
The result: incompatible types: Long[] cannot be converted to long[]
. As discussed in this Stack Overflow question, there is no standard Java method to make that conversion.
There are countless similar conversion errors that you can find if you experiment. So here are two lessons from this short exploration of Java integers:
- Lesson #1: Decide whether you need a long integer or a regular integer for a particular result, and stick with it.
- Lesson #2: If you’re ever designing a programming language, learn from the mistakes of those who have gone before you.
(Image credit: woodleywonderworks)