Confusing documentation and sub-optimal .hashCode implementation in Either

Issue #25 resolved
Michael Diamond created an issue

Either's documentation indicates that it's .equals() and .hashCode() methods are unusable with mutable values:

if the contained objects are mutable then equals and hashcode methods should not be relied on.

This is inaccurate; both .equals() and .hashCode() for Left and Right correctly call the respective methods on the contained value, with no caching or other mutability issues.

Better language might be:

Either is immutable as long as its contained object is immutable. Mutating the contained object may change the behavior of the Either instance's equals or hashcode methods.


In a similar vein, Left.hashCode() and Right.hashCode() always return the same value (for the same contained object), which can lead to needless hash collisions. While probably a minor concern, it's easy to fix.

Comments (10)

  1. Anund McKague

    Either is immutable, but does not force immutability on contained objects. If an object contained inside an Either is mutable then the equals and hashcode methods should not be relied on to return stable values.

    @dimo414 Does this help clarify the text. Do you have a suggested fix for left/right collision?

  2. Jed Wesley-Smith

    I'm not sure what's wrong with the original text, any object whose hashcode is not stable cannot be relied on – I think the original is entirely accurate. If you use an Either as a key in a HashMap and the hashcode chages, you can't find it again. Any rewording that does not contain the something like the original warning would be inadequate.

    I rather like the idea of bitwise negation of the left hashcode.

  3. Michael Diamond reporter

    The problem with the original text is it suggests something special about Either's equals() and hashCode() implementations. It's well known (and documented in the relevant classes) that objects stored in certain collections cannot be mutated while in the collection. However Either has no such problems; its behavior remains consistent after the element it contains is mutated, and users can rely on the behavior of these methods.

    For example, this leads to undefined behavior:

    AtomicInteger ai = new AtomicInteger();
    Set<AtomicInteger> set = ImmutableSet.of(ai);
    ai.incrementAndGet();
    set.contains(new AtomicInteger(1)); // returns *something*
    

    Whereas this is perfectly consistent, and is what a user would expect:

    AtomicInteger ai = new AtomicInteger();
    Either<AtomicInteger> e = Either.left(ai);
    ai.incrementAndGet();
    e.equals(Either.left(new AtomicInteger(1))); // returns true
    

    It's incorrect and confusing to document that Either.equals() and Either.hashCode() cannot be relied upon; they are exactly as reliable as any other mutable Java class.

  4. Jed Wesley-Smith

    they are exactly as reliable as any other mutable Java class.

    The point is exactly that, they are not reliable if they hold a mutable reference that implements hashCode and/or equals in terms of its mutable state.

    This can (should) be further stated as a rule that hashCode/equals should never be implemented in terms of mutable state – which is why your example doesn't quite work, because AtomicInteger (as a mutable class) doesn't define equals or hashCode. Any object's hashCode must remain stable for its lifetime, and mutating equals breaks most of the equals laws (symmetry, transitivity, consistency and even possibly reflexivity).

    Further you state that:

    Either has no such problems; its behavior remains consistent after the element it contains is mutated, and users can rely on the behavior of these methods.

    You've actually shown that this is not true at all, and that the equals and hashCode are broken if we contain a reference to a mutable object (or at least one whose equals and hashCode are implemented in terms of mutable state).

    It's incorrect and confusing to document that Either.equals() and Either.hashCode() cannot be relied upon; they are exactly as reliable as any other mutable Java class.

    Hopefully this has convinced you that the reliability of mutable Java classes regarding 'equals' and 'hashCode' is zero.

  5. Michael Diamond reporter

    AtomicInteger was a poor example, but there are many mutable classes that do implement equals and hashCode (whether they should or not is a fair question of course).

    The language of Either implies - to me at least - that Either is somehow more unreliable than other mutable classes. For example, if Either cached its hashCode value at construction time, it would be unreliable in the face of mutation. In fact, Either is exactly as reliable as any other class holding a reference to another object. As another example, notice that Optional makes no mention of an unreliable equals/hashCode contract. It simply says:

    Returns true if object is an Optional instance, and either the contained references are equal to each other or both are absent.

    The only point I'm trying to make is that the current language in Either implies its hashCode and equals methods are too broad. They say the methods are unusable if the contained object is mutated. This isn't true, the values these methods return are logical and what most callers would expect.

    If you think it's worth discouraging mutating objects held by Either, I'm all for it, but the current language incorrectly claims the behavior is undefined.

  6. Jed Wesley-Smith

    the current language incorrectly claims the behavior is undefined.

    Here is the issue, we do not agree on this. The equals and 'hashCode' contracts cannot be properly fulfilled for any mutable object.

    Fugue is a functional programming library. It is of the opinion that things can and should be immutable, and that if you have an element that purports to be immutable, like an Option, it is broken by making it mutable by having it reference mutable objects internally. Broken here meaning "you can no longer rely on the guarantees that it gives you". In other words, it is tantamount to forcing it to tell you lies.

    Consider a situation where you have a collection of Either<A, B> and you are asked to implement something like count(A a). If you have mutable implementations of A then you cannot possibly have a stable implementation of count, you cannot rely on any two invocations returning the same result. In other words, you have broken referential transparency, meaning you do not have useful reasoning tools available to you to refactor your program.

    I do not mind adding additional, clarifying, remarks – but I do not accept that the current wording is incorrect – you cannot rely on things that change in ways that are neither visible nor under your control. There are important and dangerous consequences and complications that arise from mutation and this library is designed to help protect against them.

  7. Michael Diamond reporter

    Clearly we'll just have to agree to disagree, it's your library, if you think this language is what users need to know, that's great. As a Java user, not a Fugue user, this language is confusing and misleading.

  8. Jed Wesley-Smith

    I am afraid I am confused why you think it is misleading, and why the original contract of equals/hashCode is basically unfulfillable in the face of mutation. The documentation does give a generic out for mutation, but it basically says that in the face of mutation nothing really counts.

  9. Log in to comment