Using JDK’s Source Code to Clarify Implementation Specifics


Introduction

This blog post covers an interesting aspect of Java: the source code of its class library is freely available. Today’s reference implementation of the Java Development Kit (JDK), OpenJDK, has been free and open source since 2007. Even though it had been possible to obtain the source code of the Java classes before 2007 (by creating a Sun Microsystems account and accepting some license terms), it is now “officially open”.

Having the Java implementation of the Java API at hand can prove very useful. In this article, I want to show you some exemplary questions about implementation specifics and how they can be answered by looking at the OpenJDK source code.

I’ll stick to Java 8 for the moment. I’m not the kind of person who jumps on every brand-new version as soon as it comes out. Especially not today, since many Java-related topics are currently changing, not yet fully settled, or in motion. This includes the new module system of Java 9, the very limited support lifecycles, and the fact that there will be a new “major” version every 6 months. The changes from Java 8 to Java 9 have been small (except for the module system, but as long as it’s unclear where the journey is heading, it can be ignored). In other words: At the time of this writing, the pros of using a Java version newer than Java 8 do not outweigh the cons.

Getting the Source Code

OpenJDK can be found at openjdk.java.net. The source code repositories for Java 8, Java 9, and Java 10 can be browsed and downloaded at hg.openjdk.java.net/jdk8/jdk8/jdk/, hg.openjdk.java.net/jdk9/client/jdk/, and hg.openjdk.java.net/jdk10/jdk10/jdk/, respectively. On the far left, click browse to walk through the source code in your web browser, or one of bz2, zip, or gz to obtain an offline copy. The source code of the “main classes” is found under the path src/share/classes/ for Java 8 and src/java.base/share/classes/ for Java 9 and Java 10 (since Java 9, the java.awt classes live in the java.desktop module instead, i.e., under src/java.desktop/share/classes/). From there, follow the paths specified in this article. They also contain a hyperlink to the original source, not least for license reasons. Please note that the license of OpenJDK can be found at hg.openjdk.java.net/jdk8/jdk8/file/1773f1fd0fac/LICENSE.

Below, you’ll find some sample questions that demonstrate the usefulness of the JDK’s source code.

How are the AWT Colors ORANGE and PINK defined?
Is there a difference between Color.orange and Color.ORANGE?

The java.awt.Color class has been part of the JDK since its very first days, JDK 1.0.2, released in 1996. It wasn’t replaced until Java FX 2.0, released in 2011, if that can be seen as a replacement at all. Java AWT/Swing and Java FX are two different worlds, even though the former increasingly looks outdated.

The AWT Color class comes with predefined Color constants for some main colors, e.g., Color.orange. In Java 1.4, fields with uppercase names were added for the same colors, e.g., Color.ORANGE.

First, let’s have a look at the definition of orange. While (in the RGB world) the colors red, yellow, blue, etc. might be well-defined, I have never come across a unique definition of orange. The API documentation of Color doesn’t specify its RGB value, so we need the source code, located at java/awt/Color.java, to look it up:

    /**
     * The color orange.  In the default sRGB space.
     */
    public final static Color orange    = new Color(255, 200, 0);

Just a passing remark: java.awt’s orange doesn’t match javafx.scene.paint’s orange. The latter is specified in the API documentation of javafx.scene.paint.Color: “The color orange with an RGB value of #FFA500”. Java FX’s Color class also specifies many more color constants (140) than AWT’s Color class (13).

How about pink? The source code reveals:

    /**
     * The color pink.  In the default sRGB space.
     */
    public final static Color pink      = new Color(255, 175, 175);
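
If you only want to double-check these two values, they can also be read back at runtime. The following is a minimal sketch, not part of the JDK: the class name ColorValues and the helper toHex are made up for illustration.

```java
import java.awt.Color;

public class ColorValues {

    // Hypothetical helper: formats the red, green, and blue components as #RRGGBB.
    static String toHex(Color c) {
        return String.format("#%02X%02X%02X", c.getRed(), c.getGreen(), c.getBlue());
    }

    public static void main(String[] args) {
        System.out.println("orange = " + toHex(Color.orange)); // #FFC800
        System.out.println("pink   = " + toHex(Color.pink));   // #FFAFAF
    }
}
```

This confirms the numbers, but it doesn’t show the comments or the context of the declaration, which only the source file can provide.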

Another question that arises is whether there is a difference between the aforementioned “new” uppercase fields, introduced in Java 1.4, and the original lowercase fields. This is a reasonable question, since accepted coding conventions imply a difference between lowercase and uppercase field names. So might it be that Color.orange is mutable, while Color.ORANGE is really constant, i.e., immutable?

The source code in the same file reveals that there is no difference:

    /**
     * The color orange.  In the default sRGB space.
     * @since 1.4
     */
    public final static Color ORANGE = orange;

The uppercase field names were simply added in Java 1.4 for consistency with the naming conventions for constants. The uppercase names should have been used from the very beginning, but the lowercase names could not be removed later because of Java’s valued backward compatibility.
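
Since ORANGE is initialized with orange, both fields must refer to the very same object. Here is a small sketch that checks this at runtime; the class name ColorAliases and the helper sameInstance are my own, hypothetical names.

```java
import java.awt.Color;

public class ColorAliases {

    // True if both references point to the very same Color object.
    static boolean sameInstance(Color a, Color b) {
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(sameInstance(Color.orange, Color.ORANGE)); // true
        System.out.println(sameInstance(Color.pink, Color.PINK));     // true
    }
}
```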

Are Strings examined for equality and natural order using all characters?

Comparing two Strings using the equals method (for equality) and the compareTo method (for natural order) might be very time-consuming for longer strings. The question is whether the JDK implements some “hacks” to speed this process up. In the worst case, do equals and compareTo actually check all the characters of a string?

The API documentation of the corresponding String methods does not state anything like that. However, looking at the source code in the file java/lang/String.java shows a nice trick:

    public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        if (anObject instanceof String) {
            String anotherString = (String)anObject;
            int n = value.length;
            if (n == anotherString.value.length) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = 0;
                while (n-- != 0) {
                    if (v1[i] != v2[i])
                        return false;
                    i++;
                }
                return true;
            }
        }
        return false;
    }

We can see that two Strings are first checked for having the same length. This is actually a simple but very effective performance trick. Only when the two strings do have the same length will the characters be compared afterwards. As soon as one pair of characters proves to be unequal, the method stops immediately, returning false. The whole method does not show any signs of only checking a limited set of characters. The whole strings are being checked for equality.
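
To make the effect of the length check and the early exit visible, the same logic can be rebuilt on plain char arrays with a counter added. This is only a sketch for illustration: the class EqualsSketch, the method sameContent, and the field comparisons are assumptions of mine and not part of the JDK.

```java
public class EqualsSketch {

    // Number of character pairs inspected by the most recent call (illustration only).
    static int comparisons;

    // Mirrors the comparison logic of the equals method shown above.
    static boolean sameContent(char[] v1, char[] v2) {
        comparisons = 0;
        if (v1.length != v2.length) {
            return false;           // different lengths: no character is inspected
        }
        for (int i = 0; i < v1.length; i++) {
            comparisons++;
            if (v1[i] != v2[i]) {
                return false;       // the first mismatch ends the loop
            }
        }
        return true;                // every pair has been compared
    }

    public static void main(String[] args) {
        char[] source = "source".toCharArray();

        System.out.println(sameContent(source, "sources".toCharArray()) + " after " + comparisons); // false after 0
        System.out.println(sameContent(source, "saurce".toCharArray()) + " after " + comparisons);  // false after 2
        System.out.println(sameContent(source, "source".toCharArray()) + " after " + comparisons);  // true after 6
    }
}
```

Only the last case, two strings with equal content, inspects all characters.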

Similar things happen inside the compareTo method (see below). Again, the code reveals that all characters are compared in pairs. As soon as one pair of characters turns out to be unequal, the method returns an integer denoting the difference between the two characters. (The exact value should not be of any interest to the caller. It is only relevant whether the returned integer is positive, zero, or negative.) If two strings have different lengths, but all the characters of the short string are equal to the corresponding characters of the long string (in other words: the long string starts with the short string), then the sign of the returned integer states that the short string comes before the long string in order.

    public int compareTo(String anotherString) {
        int len1 = value.length;
        int len2 = anotherString.value.length;
        int lim = Math.min(len1, len2);
        char v1[] = value;
        char v2[] = anotherString.value;

        int k = 0;
        while (k < lim) {
            char c1 = v1[k];
            char c2 = v2[k];
            if (c1 != c2) {
                return c1 - c2;
            }
            k++;
        }
        return len1 - len2;
    }
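
A few calls illustrate both return paths. The following sketch uses made-up names (CompareToDemo, sign); the values in the comments follow from the code above.

```java
public class CompareToDemo {

    // Reduces a compareTo result to -1, 0, or 1, since only the sign matters to callers.
    static int sign(String a, String b) {
        return Integer.signum(a.compareTo(b));
    }

    public static void main(String[] args) {
        // First differing pair is 'a' and 'd' at index 1: 'a' - 'd' = -3
        System.out.println("java".compareTo("jdk"));     // -3

        // "javadoc" starts with "java", so the lengths decide: 4 - 7 = -3
        System.out.println("java".compareTo("javadoc")); // -3

        System.out.println(sign("jdk", "java"));         // 1
        System.out.println(sign("java", "java"));        // 0
    }
}
```

Note that the first two calls return the same number for entirely different reasons, which is one more argument for looking only at the sign.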

What does a correct and straightforward implementation of equals look like?

The equals method described above covers a special data type, a String. A String is probably not the type of class Java programmers would program on their own. It can be regarded as a very low level class. Thus, its equals (and compareTo) implementation cannot be regarded as a typical example.

What does a “typical” equals method look like, then? In order to find out, we choose a “typical” value class, e.g., java.time.LocalDateTime as well as LocalDate and LocalTime in the same package. All these classes are part of Java’s new Date and Time API, introduced in Java 8 and released in 2014. It incorporates today’s software design principles and can be seen as a very modern and well-designed API.

Implementing equals isn’t simple. An inexperienced programmer probably wouldn’t stand a chance of “inventing” a correct equals method without consulting the literature. The code sample shown below, taken from the file java/time/LocalDateTime.java, shows the main steps that make up a correct equals method:

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj instanceof LocalDateTime) {
            LocalDateTime other = (LocalDateTime) obj;
            return date.equals(other.date) && time.equals(other.time);
        }
        return false;
    }

One can clearly see the identity check (immediately returning true if identical), the type check (resulting in the termination of the method and returning false if the given object obj is not of type LocalDateTime), the type cast (to LocalDateTime), and the actual field comparisons (since a LocalDateTime consists of a LocalDate field and a LocalTime field, it simply compares these two by calling their equals methods).

Simply “forwarding” the equality checks to the field objects is a little bit lame, so let’s have a look at the source code of the LocalTime class at java/time/LocalTime.java. The class has these four private main fields:

    /**
     * The hour.
     */
    private final byte hour;
    /**
     * The minute.
     */
    private final byte minute;
    /**
     * The second.
     */
    private final byte second;
    /**
     * The nanosecond.
     */
    private final int nano;

The equals implementation in the same file then looks like this, demonstrating the comparison of the primitive fields:

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj instanceof LocalTime) {
            LocalTime other = (LocalTime) obj;
            return hour == other.hour && minute == other.minute &&
                    second == other.second && nano == other.nano;
        }
        return false;
    }
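
Applied to a class of your own, the same pattern might look like the following sketch. The class Track and its fields are purely hypothetical. It combines one object field and one primitive field, and it also overrides hashCode, because the general contract requires equal objects to have equal hash codes.

```java
import java.util.Objects;

public final class Track {

    private final String title;  // object field, never null
    private final int seconds;   // primitive field

    public Track(String title, int seconds) {
        this.title = Objects.requireNonNull(title, "title");
        this.seconds = seconds;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {                        // identity check
            return true;
        }
        if (obj instanceof Track) {               // type check, also rejects null
            Track other = (Track) obj;            // type cast
            return seconds == other.seconds       // primitive field: ==
                    && title.equals(other.title); // object field: equals
        }
        return false;
    }

    @Override
    public int hashCode() {
        return Objects.hash(title, seconds);
    }

    public static void main(String[] args) {
        Track a = new Track("Intro", 90);
        Track b = new Track("Intro", 90);

        System.out.println(a.equals(b));                    // true
        System.out.println(a.equals(new Track("Intro", 91))); // false
        System.out.println(a.equals("Intro"));              // false
        System.out.println(a.hashCode() == b.hashCode());   // true
    }
}
```

Declaring the class final, as LocalTime does, avoids the question of how equals should behave for subclasses.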

How does one create specific Javadoc content or formatting?

Writing correct Javadoc tags or comments isn’t always trivial. Looking at the JDK’s source code can again be of great help.

This time, let’s consider the isSupported(TemporalField) method of the class LocalDate in the file java/time/LocalDate.java:

    /**
     * Checks if the specified field is supported.
     * <p>
     * This checks if this date can be queried for the specified field.
     * If false, then calling the {@link #range(TemporalField) range},
     * {@link #get(TemporalField) get} and {@link #with(TemporalField, long)}
     * methods will throw an exception.
     * <p>
     * If the field is a {@link ChronoField} then the query is implemented here.
     * The supported fields are:
     * <ul>
     * <li>{@code DAY_OF_WEEK}
     * <li>{@code ALIGNED_DAY_OF_WEEK_IN_MONTH}
     * <li>{@code ALIGNED_DAY_OF_WEEK_IN_YEAR}
     * <li>{@code DAY_OF_MONTH}
     * <li>{@code DAY_OF_YEAR}
     * <li>{@code EPOCH_DAY}
     * <li>{@code ALIGNED_WEEK_OF_MONTH}
     * <li>{@code ALIGNED_WEEK_OF_YEAR}
     * <li>{@code MONTH_OF_YEAR}
     * <li>{@code PROLEPTIC_MONTH}
     * <li>{@code YEAR_OF_ERA}
     * <li>{@code YEAR}
     * <li>{@code ERA}
     * </ul>
     * All other {@code ChronoField} instances will return false.
     * <p>
     * If the field is not a {@code ChronoField}, then the result of this method
     * is obtained by invoking {@code TemporalField.isSupportedBy(TemporalAccessor)}
     * passing {@code this} as the argument.
     * Whether the field is supported is determined by the field.
     *
     * @param field  the field to check, null returns false
     * @return true if the field is supported on this date, false if not
     */
    @Override  // override for Javadoc
    public boolean isSupported(TemporalField field) {
        return ChronoLocalDate.super.isSupported(field);
    }

If you aren’t experienced in writing Javadoc comments, you can immediately see some nice tricks:

  • Linking or “jumping” to other methods of the same class can be achieved by the @link tag. The symbol # represents an anchor, e.g., {@link #range(TemporalField) range}.
  • Linking to a different class looks like this: {@link ChronoField}
  • Formatting text as code can be done using the {@code} tag, e.g., {@code this}. Do not use HTML <code> tags.
  • Paragraphs can be separated with <p>, even though I personally prefer using closing tags </p>, too. The first sentence is used for the summary and usually isn’t put inside a <p> paragraph.
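
Transferred to your own code, these tricks could be used as in the following sketch. The class Percentage is a hypothetical example of mine; only the Javadoc constructs themselves are borrowed from the JDK comment above.

```java
public final class Percentage {

    private final int value;

    private Percentage(int value) {
        this.value = value;
    }

    /**
     * Checks if the specified value can be turned into a percentage.
     * <p>
     * If false, then calling {@link #of(int) of} with the same value
     * will throw an {@link IllegalArgumentException}.
     * <p>
     * The supported values are:
     * <ul>
     * <li>{@code 0}
     * <li>{@code 100}
     * <li>every {@code int} in between
     * </ul>
     *
     * @param value  the value to check
     * @return true if the value is supported, false if not
     */
    public static boolean isSupported(int value) {
        return value >= 0 && value <= 100;
    }

    /**
     * Obtains a percentage from the specified value.
     *
     * @param value  the value, from 0 to 100
     * @return the percentage, not null
     * @throws IllegalArgumentException if {@link #isSupported(int)} returns false
     */
    public static Percentage of(int value) {
        if (!isSupported(value)) {
            throw new IllegalArgumentException("Unsupported value: " + value);
        }
        return new Percentage(value);
    }

    /**
     * Gets the value of this percentage.
     *
     * @return the value, from 0 to 100
     */
    public int value() {
        return value;
    }
}
```

Running the javadoc tool on such a class is the quickest way to see whether the result matches what you liked in the JDK documentation.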

The idea is this: if you see something interesting in the JDK API documentation and want to know how to do the same in your own documentation, look it up in the source code and learn how it’s done.

Problems

Since you probably got the message of this blog post by now, I won’t spoon-feed you with more examples. Instead, I present some of the remaining questions I thought of while preparing for this blog post, and let you find the answer. Of course, the answers cannot be found in the API documentation, but rather require looking at the JDK’s source code.

  1. java.util.ArrayList enlarges its internal array when it is full and has thus become too small to hold any new elements. By what factor is the new internal array larger than the previous one?
  2. Do ArrayLists also decrease the size of their internal array when elements are removed or the whole list is cleared?
  3. Do you remember when I wrote about whether or not the relational operators < and > should be used for comparing integral data types? In that blog post about implementing order for comparable Java objects, I looked it up in the wrapper classes’ static compare methods. You can do so, too. How are the static compare methods in the wrapper classes implemented?

Summary

In this blog post, you’ve seen how having the JDK’s source code available can be advantageous. It can help answer implementation-specific questions about classes or methods that are not covered by the API documentation. The JDK’s source code is also an invaluable resource for learning how things can be implemented in Java or documented using Javadoc. If the JDK’s source code isn’t already available as part of your IDE (like Eclipse), download it and keep it readily available on your computer. The same applies to an offline copy of the API documentation, which should always be at hand.

Shortlink to this blog post: link.simplexacode.ch/jjrk2019.01
