GENERATING INTEGER RANDOM NUMBERS
There are many ways to generate random numbers with the base libraries found in the Java 2 SDK, Standard Edition. If you haven't kept up to date with changes to the libraries, you might be using an inefficient mechanism or, possibly worse, getting results that are not uniformly distributed. Here's a look at a good way to generate integer random numbers, and then a look at what's not so good about some other ways.
The java.util.Random class has been available for generating random numbers since the initial release of the system libraries. Starting with the 1.2 release of the Java 2 SDK, Standard Edition, the Random class has a nextInt() method that accepts an integer argument:
public int nextInt(int n)
Given some value n, nextInt(n) returns a value greater than or equal to zero but less then that value: 0 <= nextInt(n) < n.
All you have to do is create a Random object first, then call nextInt(n) to return the next random int value.
Here's a demonstration. The following code snippet generates a large set of random numbers and prints out the average:
int count = 1000000;
int range = Integer.MAX_VALUE / 3 * 2;
double sum = 0;
Random rand = new Random();
for (int i=0; i<count; i++) {
sum += rand.nextInt(range);
}
System.out.println(sum/count);
After looping a million times, the average value should be approximately at the midpoint of the selected range.
So far, that isn't too complicated, but what's special about using nextInt(n)? Why is using nextInt(n) better than (1) using the older nextInt() method, with no range, then (2) using Math.abs() to get the absolute value, and then (3) using the mod (%) operator to get the value into the right range? The latter approach is demonstrated in the following code snippet. While there are a few extra operations in this approach, isn't it just as good as the first approach?
sum = 0;
for (int i=0; i<count; i++) {
sum += Math.abs(rand.nextInt()) % range;
}
System.out.println(sum/count);
There are actually three problems with the latter approach.
First, nextInt() is equally likely to return a value between Integer.MIN_VALUE and Integer.MAX_VALUE. If you take the absolute value of Integer.MIN_VALUE, the result is not a positive number. In fact, the result of Math.abs(Integer.MIN_VALUE) is Integer.MIN_VALUE. So, on rare occasions, you'll get back a negative number. Given the rarity of the event, 1 out of 2^31 times, the likelihood of the event happening during testing, and being repeatable, is highly unlikely.
Second, when you mod (%) the results of nextInt(), you effectively reduce the randomness of the results. The low order bits of random numbers can repeat more regularly than the entire number. This is a known issue with pseudo-random number generators, and so it's another reason not to use mod (%).
Finally, and possibly worst of all, the results are not evenly distributed. If you execute the loops in the two approaches, the first loop will produce results above 715 million, with a midpoint of the range at 715,827,882. That is within an acceptable tolerance for randomness. Surprisingly, the results of the second loop are consistently less than 600,000,000.
How can the second loop be so far off base? Essentially, the problem is that the mapping is unfair. When you use the mod (%) operator, you are taking values that are too large and squeezing them into the low end. This favors the lesser values. Compare this to rolling a single die with three other friends. Because there are only four of you and six possible values, the mapping for the first four values is easy. Die value 1 is mapped to person 1, value 2 to person 2, and so on. Now what about the larger values, 5 and 6? If you take the same approach as the mod (%) operator, you map the large values in such a way that any time a 5 is rolled, person 1 wins, and any time a 6 is rolled, person 2 wins. Is this fair? Well, that's what happens when you mod (%) the results of nextInt().
Using the nextInt(range) method solves all three of these problems.
That leaves the random() method of the Math class as another possible way to generate random numbers. Using that method can't be that bad, can it? You just have to multiply the result by the range:
sum = 0;
for (int i=0; i<count; i++) {
sum += (int)(Math.random() * range);
}
System.out.println(sum/count);
Well, using this method at least doesn't have any of the problems of nextInt(). You can't get a negative number back, you are not using the mod (%) operator so you don't run into the low-order byte range problem. And the range is uniform.
What is the problem, though, is that Math.random() uses floating point arithmetic, and both versions of nextInt() work with only integer operations. Math.random() can be up to four times slower because of its use of floating point operations. Throw in the cast, and the operation is even slower.
Because it avoids the problems inherent in using the other approaches, using the nextInt(range) method of Random is a better way to generate integer random numbers.
Here's a complete program you can use to test the different approaches discussed in this tip.
import java.util.*;
import java.text.*;
public class RandomTest {
public static void main(String args[]) {
NumberFormat nf = NumberFormat.getInstance();
int count = 1000000;
int range = Integer.MAX_VALUE / 3 * 2;
System.out.println("Midpoint: " +
nf.format(range/2));
double sum = 0;
Random rand = new Random();
for (int i=0; i<count; i++) {
sum += rand.nextInt(range);
}
System.out.println("Good : " +
nf.format(sum/count));
sum = 0;
for (int i=0; i<count; i++) {
sum += Math.abs(rand.nextInt()) % range;
}
System.out.println("Bad : " +
nf.format(sum/count));
sum = 0;
for (int i=0; i<count; i++) {
sum += (int)(Math.random() * range);
}
System.out.println("Longer : " +
nf.format(sum/count));
}
}