A Typesafe Enum Facility for the Java™ Programming Language



 

I. Introduction

We propose a typesafe enum facility for the Java™ programming language. The proposed facility combines ease of use, power, and performance. In its simplest form it looks a lot like a C/C++ enum declaration:
    public enum Suit { clubs, diamonds, hearts, spades }
but it has many advantages over the enum facilities in C, C++, C# and Pascal. Internally, the facility is based on the Typesafe Enum pattern described in Item 21 of "Effective Java Programming Language Guide." The proposed facility offers all the advantages of this pattern with one deliberate omission (hierarchies of typesafe enums):
  1. Compile-time type safety.
  2. Performance comparable to int constants.
  3. Type system provides a namespace for each enum type, so you don't have to prefix each constant name.
  4. Typesafe constants aren't compiled into clients, so you can add, reorder or even remove constants without the need to recompile clients. (If you remove a constant that a client is using, you'll fail fast with an informative error message.)
  5. Printed values are informative. (Which would you rather see in a stack trace: "Indigo" or "6"?)
  6. Enum constants can be used in collections (e.g., as HashMap keys).
  7. You can add arbitrary fields and methods to an enum class.
  8. An enum type can be made to implement arbitrary interfaces.
Additionally, the facility rectifies the two major shortcomings of the Typesafe Enum pattern:
  1. The proposed construct is simple and readable. (The Typesafe Enum pattern described in "Effective Java" involves a large amount of boilerplate code, which obscures programmer intent, deters all but the most determined programmers from using the pattern, and provides opportunities to introduce errors.)
  2. The proposed construct can be used with switch statements.
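To make the boilerplate concern concrete, here is roughly what a hand-written counterpart of the one-line Suit declaration might look like under the Typesafe Enum pattern. This is a minimal sketch written for illustration, not the code from "Effective Java"; the names SuitPattern and SuitPatternDemo are assumptions, and serialization support is left out entirely.

```java
// Hand-written sketch of: public enum Suit { clubs, diamonds, hearts, spades }
// (Illustrative only; class and member names are assumptions.)
final class SuitPattern implements Comparable<SuitPattern> {
    // Must be declared before the constants so it is initialized first.
    private static int nextOrdinal = 0;

    public static final SuitPattern clubs    = new SuitPattern("clubs");
    public static final SuitPattern diamonds = new SuitPattern("diamonds");
    public static final SuitPattern hearts   = new SuitPattern("hearts");
    public static final SuitPattern spades   = new SuitPattern("spades");

    private final String name;
    private final int ordinal;   // declaration order, starting at zero

    // Private constructor: no instances beyond the constants above.
    private SuitPattern(String name) {
        this.name = name;
        this.ordinal = nextOrdinal++;
    }

    @Override public String toString() { return name; }

    // Natural order is declaration order.
    @Override public int compareTo(SuitPattern other) {
        return ordinal - other.ordinal;
    }

    // equals and hashCode are inherited from Object (identity-based),
    // which is correct because each constant is a singleton.
}

class SuitPatternDemo {
    public static void main(String[] args) {
        System.out.println(SuitPattern.hearts);   // prints "hearts"
        System.out.println(SuitPattern.clubs.compareTo(SuitPattern.spades) < 0);   // prints "true"
    }
}
```

Even this reduced version needs about twenty lines per enum type, and it still cannot be used in a switch statement.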

The enum declaration is a special kind of class declaration. An enum type has public, self-typed members for each of the named enum constants. All enum classes have high-quality toString, hashCode, and equals methods. All are Serializable, Comparable and effectively final. None are Cloneable. All of the "Object methods" except toString are final: we take care of comparison and serialization, and ensure that it's done right.

Arbitrary fields may be added to enum classes, and to individual enum constants. The ability to add such fields will only be used by relatively sophisticated programmers, but greatly enhances the power of the facility. It adds nothing to the complexity of the facility for programmers who don't need the extra power.

The proposed enum facility requires no JVM changes. It is implemented in the compiler with support from the libraries. Attributes are used to identify enum classes and enum constants in class files (JVMS 4.7). The enum facility interacts well with other language work proposed for Tiger, notably the static import statement, the enhanced for statement, and generics.

The facility requires a new "primordial enum class" to be added to java.lang. This class, called Enum, is subclassed by all enum classes. Additionally, we propose to add special-purpose Set and Map implementations to java.util. These implementations would be called EnumSet and EnumMap. Instances of these collections require all elements (or keys) to be instances of a single enum class. They combine the power of a general purpose collection with the speed of a bit-vector (or array). Note that enum classes are perfectly compatible with the standard collection implementations; the special-purpose implementations merely offer better performance.
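The bit-vector idea behind EnumSet can be sketched without the proposed facility. The following is an illustrative sketch only, not the proposed EnumSet API: Day is a hand-written stand-in for an enum class with a public ordinal field, and DayBitSet is a hypothetical name for a set that stores membership in the bits of a single long.

```java
// Hand-written stand-in for: enum Day { monday, tuesday, ..., sunday }
final class Day {
    private static int nextOrdinal = 0;   // initialized before the constants

    static final Day monday    = new Day("monday");
    static final Day tuesday   = new Day("tuesday");
    static final Day wednesday = new Day("wednesday");
    static final Day thursday  = new Day("thursday");
    static final Day friday    = new Day("friday");
    static final Day saturday  = new Day("saturday");
    static final Day sunday    = new Day("sunday");

    final String name;
    final int ordinal;   // position in declaration order, starting at zero

    private Day(String name) {
        this.name = name;
        this.ordinal = nextOrdinal++;
    }

    @Override public String toString() { return name; }
}

// Sketch of a set of Day constants stored as the bits of one long.
// This simple form only works for types with at most 64 constants.
final class DayBitSet {
    private long bits;

    // Returns true if the set changed as a result of the call.
    boolean add(Day d) {
        long before = bits;
        bits |= 1L << d.ordinal;
        return bits != before;
    }

    boolean remove(Day d) {
        long before = bits;
        bits &= ~(1L << d.ordinal);
        return bits != before;
    }

    boolean contains(Day d) { return (bits & (1L << d.ordinal)) != 0; }

    int size() { return Long.bitCount(bits); }
}

class DayBitSetDemo {
    public static void main(String[] args) {
        DayBitSet weekend = new DayBitSet();
        weekend.add(Day.saturday);
        weekend.add(Day.sunday);
        System.out.println(weekend.contains(Day.sunday) + " " + weekend.size());   // prints "true 2"
    }
}
```

Each operation is a handful of bitwise instructions, which is where the speed advantage over a general-purpose hash-based set would come from.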

II. Syntax

The following description is not up to JLS standards, but should be good enough for present purposes. A new type of class declaration (8.1.1) called an enum declaration is permitted wherever a class declaration is now permitted:
 
   EnumDeclaration:
        ClassModifiers_opt enum Identifier Interfaces_opt EnumBody

   EnumBody:
        { EnumConstants_opt EnumBodyDeclarations_opt }

   EnumConstants:
        EnumConstant
        EnumConstants , EnumConstant

   EnumConstant:
        Identifier Arguments_opt ClassBody_opt

   Arguments:
        ( ArgumentList_opt )

   EnumBodyDeclarations:
        ; ClassBodyDeclarations_opt
The use of class modifiers in enum declarations is as for class declarations, with a few additional restrictions. All enum declarations are implicitly final unless they contain constant-specific class bodies (which result in implicit subclasses). It is permissible to use the final modifier on an enum declaration without constant-specific class bodies, but it has no effect (and is discouraged). Enum declarations may not use the class modifier abstract unless they contain constant-specific class bodies for every enum constant, and any abstract methods declared in the (optional) class body declarations are overridden in all the constant-specific class bodies. Enum member classes are implicitly static. Regardless of what class modifiers are used, it is illegal to explicitly instantiate an enum (using new).

Any class body declarations in an enum declaration apply to the enum type exactly as if they had been present in the class body of an ordinary class declaration. There are few restrictions on the members that may be declared in the class body declarations of an enum type. Constructors must not chain to a superclass constructor--chaining is handled automatically by the compiler. Note, however, that one constructor for an enum class may chain to another. (The optional class body of an enum constant is effectively an anonymous class declaration, and may not contain any constructors.)

An enum constant may be followed by arguments, which are passed to the constructor when the constant is created. A constructor is chosen using the normal overloading rules (15.12.2). If the arguments are omitted, an empty argument list is assumed. If the enum class has no constructor declarations, a parameterless default constructor is provided (which matches the implicit empty argument list).

Enum declarations may not contain members that conflict with automatically generated members: VALUES, family(), readObject(ObjectInputStream), and writeObject(ObjectOutputStream). (Similarly, enum declarations may not contain members that conflict with final methods in java.lang.Enum: equals(Object), hashCode(), clone(), compareTo(Object), and readResolve().) Finally, enum declarations may not contain fields named ordinal and name (which would hide the like-named fields in java.lang.Enum).

The serialized form of an enum constant consists of its name. If deserialization is attempted and no constant of the correct type exists with the serialized name, deserialization fails with an InvalidObjectException. Deserialization will not be compromised by reordering of enum constants, addition of enum constants, or removal of unused enum constants.
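The name-based serialized form can be approximated by hand with a readResolve method. The sketch below is illustrative and makes no claim about the code the compiler would actually generate: Flavor is a hand-written stand-in for an enum class, and FlavorRoundTrip is a hypothetical helper that serializes an object to a byte array and reads it back.

```java
import java.io.*;

// Hand-written stand-in for: enum Flavor { vanilla, chocolate }
final class Flavor implements Serializable {
    private static final long serialVersionUID = 1L;

    static final Flavor vanilla   = new Flavor("vanilla");
    static final Flavor chocolate = new Flavor("chocolate");
    private static final Flavor[] ALL = { vanilla, chocolate };

    private final String name;   // the only state written to the stream

    private Flavor(String name) { this.name = name; }

    // Replaces the freshly deserialized copy with the canonical constant
    // of the same name, or fails if no such constant exists any more.
    private Object readResolve() throws ObjectStreamException {
        for (Flavor f : ALL)
            if (f.name.equals(name))
                return f;
        throw new InvalidObjectException("No Flavor constant named " + name);
    }

    @Override public String toString() { return name; }
}

class FlavorRoundTrip {
    // Serializes o to a byte array and deserializes it again.
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Identity is preserved across the round trip.
        System.out.println(roundTrip(Flavor.chocolate) == Flavor.chocolate);   // prints "true"
    }
}
```

Because the lookup is by name rather than by position, reordering or adding constants between writing and reading does not change which constant is returned.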

The syntax of the switch statement (14.10) is extended ever-so-slightly. The type of the Expression is now permitted to be an enum class. (Note that java.lang.Enum is not an enum class.) A new production is added for SwitchLabel:

   SwitchLabel:
        case EnumConst :

   EnumConst:
        ClassName . Identifier
        Identifier
The ClassName in an EnumConst must refer to an enum class. The Identifier must correspond to one of its enumeration constants.

III. Semantics

An enum declaration declares an enum class with the same visibility as a class declaration at the same point with the same access modifiers. Any members declared in the (optional) class body have the same visibility as they would have in a class declaration at the same point with the same access modifiers. Constant-specific class bodies define anonymous classes inside enum classes that extend the enclosing enum class. Thus, instance methods declared in these class bodies are accessible outside the bodies only if they override accessible methods in the enclosing enum class. Static methods and fields declared in constant-specific class bodies are never accessible outside the class body in which they're declared.

In addition to the members it inherits from Enum, the enum class has a public static final "self-typed" field for each declared enum constant. Enum classes may not be instantiated using new, may not be cloned, and take full control of the serialization and deserialization process. This ensures that no instances exist beyond those made available via the aforementioned fields. Because there is exactly one instance for each value, it is permissible to use the == operator in place of the equals method when comparing two object references, provided that at least one of them is known to refer to an enum constant.

The enum class has the following members generated automatically:

    /**
     * An immutable list containing the values comprising this enum class
     * in the order they're declared.  This field may be used to iterate
     * over the constants as follows:
     *
     *    for(className c : className.VALUES)
     *        System.out.println(c);
     */
    public static List<this enum class> VALUES;

    /**
     * Returns an immutable list containing the values comprising this enum
     * class in the order they're declared.  This instance method simply
     * returns VALUES.  Few programmers should have any need to use this
     * method.  It is provided for use by sophisticated enum-based data
     * structures to prevent the need for reflective access to
     * VALUES.
     * 
     * @return an immutable list containing the values comprising this enum
     *         class, in the order they're declared.
     */
    public final List<this enum class> family();

    /**
     * Static factory to return the enum constant pertaining to the
     * given string name.  The string must match exactly an identifier
     * used to declare an enum constant in this type.
     * 
     * @throws IllegalArgumentException if this enum class has no constant
     *         with the specified name.
     */
    public static <this enum class> valueOf(String name);
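As a rough illustration of these generated members, the following hand-expanded class shows one way VALUES, family() and valueOf might behave for the Suit type from the introduction. It is a sketch under assumptions, not the compiler's actual output: SuitExpanded is a hypothetical name, and the linear search in valueOf is chosen for brevity.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hand-expanded sketch of: enum Suit { clubs, diamonds, hearts, spades }
final class SuitExpanded {
    public static final SuitExpanded clubs    = new SuitExpanded("clubs", 0);
    public static final SuitExpanded diamonds = new SuitExpanded("diamonds", 1);
    public static final SuitExpanded hearts   = new SuitExpanded("hearts", 2);
    public static final SuitExpanded spades   = new SuitExpanded("spades", 3);

    // Immutable list of the constants in declaration order.
    public static final List<SuitExpanded> VALUES = Collections.unmodifiableList(
            Arrays.asList(clubs, diamonds, hearts, spades));

    public final String name;
    public final int ordinal;

    private SuitExpanded(String name, int ordinal) {
        this.name = name;
        this.ordinal = ordinal;
    }

    // Instance-level access to VALUES, for enum-based data structures.
    public List<SuitExpanded> family() { return VALUES; }

    // Exact-match lookup by declared identifier.
    public static SuitExpanded valueOf(String name) {
        for (SuitExpanded s : VALUES)
            if (s.name.equals(name))
                return s;
        throw new IllegalArgumentException("No constant named " + name);
    }

    @Override public String toString() { return name; }
}

class SuitExpandedDemo {
    public static void main(String[] args) {
        for (SuitExpanded s : SuitExpanded.VALUES)
            System.out.println(s.ordinal + " " + s);
        System.out.println(SuitExpanded.valueOf("hearts") == SuitExpanded.hearts);
    }
}
```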

IV. Libraries

Here's a first cut at the primordial enum class:
package java.lang;
import java.util.*;
import java.io.*;

/**
 * The primordial enum class.  Every enum class extends this class, 
 * but it may not be subclassed manually.
 */
public abstract class Enum <T extends Enum<T>>
        implements Comparable<T>, Serializable {
    /**
     * The ordinal of this enumeration constant (its position
     * in the enum declaration, where the initial constant is assigned
     * an ordinal of zero).
     * 
     * Most programmers will have no use for this field.  It is designed
     * for use by sophisticated enum-based data structures, such as
     * {@link java.util.EnumSet} and {@link java.util.EnumMap}.
     */
    public final transient int ordinal;

    /**
     * The name of this enum constant, as declared in the enum declaration.
     * Most programmers should use the {@link #toString} method rather than
     * accessing this field.
     */
    public final String name;

    /**
     * Sole constructor.  Programmers should never invoke this constructor.
     * It is for use by code emitted by the compiler in response to
     * enum class declarations.
     *
     * @param name - The name of this enum constant, which is the identifier
     *               used to declare it.
     * @param ordinal - The ordinal of this enumeration constant (its position
     *         in the enum declaration, where the initial constant is assigned
     *         an ordinal of zero).
     */
    protected Enum(String name, int ordinal);

    /**
     * Returns an immutable list of all the enum constants in this
     * enum constant's enum class.
     *
     * @return an immutable list of all of the enum constants in this
     *         enum constant's enum class.
     */
    public abstract List<T> family();

    /**
     * Returns true if the specified object is equal to this
     * enum constant.
     *
     * @param o the object to be compared for equality with this object.
     * @return  true if the specified object is equal to this
     *          enum constant.
     */
    public final boolean equals(Object o);

    /**
     * Returns a hash code for this enum constant.
     *
     * @return a hash code for this enum constant.
     */
    public final int hashCode();

    /**
     * Returns the name of this enum constant, as contained in the
     * declaration.  This method may be overridden, though it typically
     * isn't necessary or desirable.  An enum class should override this
     * method when a more "programmer-friendly" string form exists.
     *
     * @return the name of this enum constant
     */
    public String toString();

    /**
     * Compares this enum with the specified object for order.  Returns a
     * negative integer, zero, or a positive integer as this object is less
     * than, equal to, or greater than the specified object.
     *
     * Enum constants are only comparable to other enum constants of the
     * same enum class.  The natural order implemented by this
     * method is the order in which the constants are declared.
     */
    public final int compareTo(T o);

    /**
     * Throws CloneNotSupportedException.  This guarantees that enums
     * are never cloned, which is necessary to preserve their "singleton"
     * status.
     *
     * @return (never returns)
     */
    protected final Object clone() throws CloneNotSupportedException;

    /**
     * This method ensures proper deserialization of enum constants.
     * 
     * @return the canonical instance of this deserialized enum constant.
     */
    protected final Object readResolve() throws ObjectStreamException;
}
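The listing above gives declarations only. One plausible set of method bodies is sketched below; this is an assumption about how the class could be implemented, not the proposed implementation. The names SketchEnum and SketchSeason are hypothetical, and family() and readResolve() are omitted to keep the sketch short.

```java
// Sketch of possible bodies for the primordial enum class.
// (Serialization support, family() and readResolve() are left out.)
abstract class SketchEnum<T extends SketchEnum<T>> implements Comparable<T> {
    public final int ordinal;
    public final String name;

    protected SketchEnum(String name, int ordinal) {
        this.name = name;
        this.ordinal = ordinal;
    }

    // Each constant is a singleton, so identity comparison suffices.
    @Override public final boolean equals(Object o) { return this == o; }

    @Override public final int hashCode() { return System.identityHashCode(this); }

    // Not final: subclasses may supply a friendlier string form.
    @Override public String toString() { return name; }

    // Natural order is declaration order.
    @Override public final int compareTo(T o) { return ordinal - o.ordinal; }

    // Never returns normally, preserving singleton status.
    @Override protected final Object clone() throws CloneNotSupportedException {
        throw new CloneNotSupportedException();
    }
}

// Hand-written subclass standing in for:
//     enum Season { winter, spring, summer, fall }
final class SketchSeason extends SketchEnum<SketchSeason> {
    static final SketchSeason winter = new SketchSeason("winter", 0);
    static final SketchSeason spring = new SketchSeason("spring", 1);
    static final SketchSeason summer = new SketchSeason("summer", 2);
    static final SketchSeason fall   = new SketchSeason("fall", 3);

    private SketchSeason(String name, int ordinal) { super(name, ordinal); }
}

class SketchSeasonDemo {
    public static void main(String[] args) {
        System.out.println(SketchSeason.winter.compareTo(SketchSeason.fall) < 0);   // prints "true"
        System.out.println(SketchSeason.summer);                                    // prints "summer"
    }
}
```

The self-referential type parameter is what lets compareTo accept only constants of the same enum class, so cross-type comparisons are rejected at compile time.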

V. Usage Examples

Here is a typical enum declaration:
    public enum Season { winter, spring, summer, fall }
Here is a slightly more complex enum declaration for an enum type with an explicit instance field and an accessor for this field. Each member has a different value in the field, and the values are passed in via a constructor. In this example, the field represents the value, in cents, of an American coin.
public enum Coin {
    penny(1), nickel(5), dime(10), quarter(25);

    Coin(int value) { this.value = value; }

    private final int value;

    public int value() { return value; }
}
Switch statements are useful for simulating the addition of a method to an enum type from outside the type. This example "adds" a color method to the Coin class, and prints a table of coins, their values, and their colors.
import java.util.*;

public class CoinTest {
    public static void main(String[] args) {
        for (Iterator<Coin> i = Coin.VALUES.iterator(); i.hasNext(); ) {
            Coin c = i.next();
            System.out.println(c + ":   \t"+ c.value() +"c \t" + color(c));
        }
    }

    private enum CoinColor { copper, nickel, silver }

    private static CoinColor color(Coin c) {
        if (c == null)
            throw new NullPointerException();

        switch(c) {
          case Coin.penny:
            return CoinColor.copper;
          case Coin.nickel:
            return CoinColor.nickel;
          case Coin.dime:
            return CoinColor.silver;
          case Coin.quarter:
            return CoinColor.silver;
        }

        throw new AssertionError("Unknown coin: " + c);
    }
}
Running the program prints:
penny:   	1c 	copper
nickel:   	5c 	nickel
dime:   	10c 	silver
quarter:   	25c 	silver
In the following example, a rich playing card class is built atop two simple enum types. Note that each enum type would be as long as the entire example in the absence of the enum facility:
import java.util.*;

public class Card implements Comparable, java.io.Serializable {
    public enum Rank { deuce, three, four, five, six, seven, eight, nine, ten,
                       jack, queen, king, ace }
    public enum Suit { clubs, diamonds, hearts, spades }

    private final Rank rank;
    private final Suit suit;

    private Card(Rank rank, Suit suit) {
        if (rank == null || suit == null)
            throw new NullPointerException(rank + ", " + suit);
        this.rank = rank;
        this.suit = suit;
    }

    public Rank rank() { return rank; }
    public Suit suit() { return suit; }

    public String toString() { return rank + " of " + suit; }

    public int compareTo(Object o) {
        Card c = (Card)o;
        int rankCompare = rank.compareTo(c.rank);
        return rankCompare != 0 ? rankCompare : suit.compareTo(c.suit);
    }

    private static List<Card> sortedDeck = new ArrayList<Card>(52);
    static {
        for (Iterator<Rank> i = Rank.VALUES.iterator(); i.hasNext(); ) {
            Rank rank = i.next();
            for (Iterator<Suit> j = Suit.VALUES.iterator(); j.hasNext(); )
                sortedDeck.add(new Card(rank, j.next()));
        }
    }

    // Returns a shuffled deck
    public static List<Card> newDeck() {
        List<Card> result = new ArrayList<Card>(sortedDeck);
        Collections.shuffle(result);
        return result;
    }
}
If the enhanced for-statement were available, the loop to initialize the sorted deck would be much prettier:
    static {
        for (Rank rank : Rank.VALUES)
            for (Suit suit : Suit.VALUES)
                sortedDeck.add(new Card(rank, suit));
    }
Here's a little program that exercises the Card class. It takes two integer parameters on the command line, representing the number of hands to deal and the number of cards in each hand:
import java.util.*;

class Deal {
    public static void main(String args[]) {
        int numHands     = Integer.parseInt(args[0]);
        int cardsPerHand = Integer.parseInt(args[1]);
        List<Card> deck  = Card.newDeck();

        for (int i=0; i < numHands; i++)
            System.out.println(dealHand(deck, cardsPerHand));
    }

    /**
     * Returns a new general-purpose list consisting of the last n
     * elements of deck.  The returned list is sorted using the
     * elements' natural ordering.
     */
    public static <E> List<E> dealHand(List<E> deck, int n) {
        int deckSize = deck.size();
        List<E> handView = deck.subList(deckSize-n, deckSize);
        List<E> hand = new ArrayList<E>(handView);
        handView.clear();
        Collections.sort(hand);
        return hand;
    }
}
Running the program produces results like this:
java Deal 4 5
[four of spades, nine of clubs, nine of spades, queen of spades, king of spades]
[three of diamonds, five of hearts, six of spades, seven of diamonds, king of diamonds]
[four of diamonds, five of spades, jack of clubs, ace of diamonds, ace of hearts]
[three of hearts, five of diamonds, ten of hearts, jack of hearts, queen of hearts]
It is also possible to declare methods on individual enum constants to attach behaviors to the constants (see "Effective Java", p. 108):
import java.util.*;

public abstract enum Operation {
    plus {
        double eval(double x, double y) { return x + y; }
    },
    minus {
        double eval(double x, double y) { return x - y; }
    },
    times {
        double eval(double x, double y) { return x * y; }
    },
    divided_by {
        double eval(double x, double y) { return x / y; }
    };

    // Perform arithmetic operation represented by this constant
    abstract double eval(double x, double y);

    public static void main(String args[]) {
        double x = Double.parseDouble(args[0]);
        double y = Double.parseDouble(args[1]);

        for (Iterator<Operation> i = VALUES.iterator(); i.hasNext(); ) {
            Operation op = i.next();
            System.out.println(x + " " + op + " " + y + " = " + op.eval(x, y));
        }
    }
}
Running this program produces the following output:
java Operation 2.0 4.0
2.0 plus 4.0 = 6.0
2.0 minus 4.0 = -2.0
2.0 times 4.0 = 8.0
2.0 divided_by 4.0 = 0.5
The above pattern is suitable for moderately sophisticated programmers. It is admittedly a bit tricky, but it is much safer than using a switch statement in the base class (Operation), as the pattern precludes the possibility of forgetting to add a behavior for a new constant (you'd get a compile-time error).

VI. Implementation Notes

Within the compilation unit that contains an enum class declaration, switch statements on enums can easily be compiled down to ordinary switch statements, as it is guaranteed that no constants will be added to, removed from, or reordered within the enum class after the switch statement is compiled. Thus, the ordinals implied by the declarations are guaranteed to be correct at run time.

Outside of the compilation unit, there are at least two choices. (1) compile the switch statement down to a "multiway if-statement":

    if (<expr>.ordinal == Color.red.ordinal) {
        ...
    } else if (<expr>.ordinal == Color.green.ordinal) {
        ...
    } else if (<expr>.ordinal == Color.blue.ordinal) {
        ...
    } else {
        ... // default case
    }
The .ordinal after each enum expression is superfluous, but enables a JVM compiler optimization, wherein the JVM compiler recognizes at class initialization time that a switch statement may be substituted for the if-statement. HotSpot does not yet contain this optimization, but could be modified to do so.

An intriguing alternative that might offer similar performance with no support from the JVM is suggested by the old saw that every problem in Computer Science may be solved with an extra layer of indirection. Each switch statement outside of the compilation unit containing the enum class declaration will compile down to a switch statement on an array reference. The array will be initialized lazily:

    private static class $whatever {
        static int[] permutation = new int[Color.VALUES.size()];
        static {
            permutation[Color.red.ordinal]   = 1;
            permutation[Color.blue.ordinal]  = 2;
            permutation[Color.green.ordinal] = 3;
        }
    }

    switch($whatever.permutation[<expr>.ordinal]) {
      case 1: // red
        ...
      case 2: // blue
        ...
      case 3: // green
        ...
      default:
        ...
    }
This approach has a higher "fixed cost" per switch statement and a higher footprint, but a low cost per execution of the switch statement.
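The indirection scheme can be tried out today with hand-written classes. In the sketch below, Color is a stand-in for an enum class whose ordinals are assigned at run time, and ColorClient plays the role of a separately compiled client; both names, and the strings returned by describe, are illustrative assumptions. The nested holder class is initialized on first use, which gives the lazy initialization described above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hand-written stand-in for: enum Color { red, green, blue }
final class Color {
    static final Color red   = new Color("red", 0);
    static final Color green = new Color("green", 1);
    static final Color blue  = new Color("blue", 2);

    static final List<Color> VALUES =
            Collections.unmodifiableList(Arrays.asList(red, green, blue));

    final String name;
    final int ordinal;   // set in the constructor, so never inlined into clients

    private Color(String name, int ordinal) {
        this.name = name;
        this.ordinal = ordinal;
    }
}

class ColorClient {
    // Holder class: the table is built the first time describe() runs,
    // using whatever ordinals Color has at that moment.
    private static class Permutation {
        static final int[] table = new int[Color.VALUES.size()];
        static {
            table[Color.red.ordinal]   = 1;
            table[Color.blue.ordinal]  = 2;
            table[Color.green.ordinal] = 3;
        }
    }

    static String describe(Color c) {
        // The case labels are fixed when the client is compiled; only the
        // table lookup depends on the ordinals seen at run time.
        switch (Permutation.table[c.ordinal]) {
            case 1:  return "stop";
            case 2:  return "info";
            case 3:  return "go";
            default: return "unknown";   // constant added after compilation
        }
    }

    public static void main(String[] args) {
        for (Color c : Color.VALUES)
            System.out.println(c.name + " -> " + describe(c));
    }
}
```

A constant added to Color after the client was compiled keeps the table entry 0 and therefore falls through to the default case.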

VII. Open Issues

  1. Should name and ordinal in Enum be methods rather than fields? It might make sense to replace these fields with accessor methods on general principle.
  2. Should we provide any support for looping over subranges? There are several possibilities, ranging from adding successor and predecessor methods to Enum, to adding a special form of the enhanced for statement:
        for (Day d : Day.monday .. Day.friday) // Iterate through workdays
            ... ;
    
    I kind of like the latter solution. Unfortunately, the "half-open range" loop specification that is standard in the Java programming language presents an "impedance mismatch" to enums, which have no sentinels above and below the valid values.
  3. Should the convention be that enum constant names are all caps (like other constants), or mixed-case? The latter convention (currently reflected in this document) would make for nicer string forms, generally obviating the need for a custom toString method. On the other hand, it flies in the face of tradition.
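Regarding issue 2, subranges can already be approximated with List.subList on the list of constants. The sketch below is one illustrative option, not a proposed API: WeekDay is a hand-written stand-in for an enum class, and range is a hypothetical helper. The "+ 1" in range is the impedance mismatch in miniature: subList takes a half-open interval, while the caller wants both endpoints included.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hand-written stand-in for: enum Day { monday, ..., sunday }
final class WeekDay {
    // Declared first so it exists before the constants register themselves.
    private static final List<WeekDay> all = new ArrayList<WeekDay>();

    static final WeekDay monday    = new WeekDay("monday");
    static final WeekDay tuesday   = new WeekDay("tuesday");
    static final WeekDay wednesday = new WeekDay("wednesday");
    static final WeekDay thursday  = new WeekDay("thursday");
    static final WeekDay friday    = new WeekDay("friday");
    static final WeekDay saturday  = new WeekDay("saturday");
    static final WeekDay sunday    = new WeekDay("sunday");

    static final List<WeekDay> VALUES = Collections.unmodifiableList(all);

    final String name;
    final int ordinal;

    private WeekDay(String name) {
        this.name = name;
        this.ordinal = all.size();   // next free position
        all.add(this);
    }

    // Closed range: both endpoints are included.
    static List<WeekDay> range(WeekDay from, WeekDay to) {
        return VALUES.subList(from.ordinal, to.ordinal + 1);
    }

    @Override public String toString() { return name; }
}

class WeekDayDemo {
    public static void main(String[] args) {
        // Iterate through workdays
        for (WeekDay d : WeekDay.range(WeekDay.monday, WeekDay.friday))
            System.out.println(d);
    }
}
```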

Copyright 2002 Sun Microsystems, Inc., 901 San Antonio Road, Palo Alto, California 94303 U.S.A. All rights reserved.