The Java super class
java.lang.Object has two very important methods defined in it. They are:
- public boolean equals(Object obj)
- public int hashCode()
public boolean equals(Object obj)
This method checks if some other object passed to it as an argument is equal to the object on which this method is invoked. The default implementation of this method in the Object class simply checks if two object references x and y refer to the same object, i.e. it checks if x == y. This particular comparison is also known as "shallow comparison". However, classes providing their own implementations of the equals method are supposed to perform a "deep comparison" by actually comparing the relevant data members. Since the Object class has no data members that define its state, it simply performs shallow comparison.

This is what the JDK 1.4 API documentation says about the equals method of the Object class:
Indicates whether some other object is "equal to" this one.
- The equals method implements an equivalence relation:
- It is reflexive: for any reference value x, x.equals(x) should return true.
- It is symmetric: for any reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
- It is transitive: for any reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
- It is consistent: for any reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the object is modified.
- For any non-null reference value x, x.equals(null) should return false.
Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.
The contract of the equals method precisely states what it requires. Once you understand it completely, a correct implementation becomes relatively easy. Let's understand what each of these requirements really means.
- Reflexive - It simply means that the object must be equal to itself, which it would be at any given instance, unless you intentionally override the equals method to behave otherwise.
- Symmetric - It means that if an object of one class is equal to an object of another class, the other object must be equal to this object as well. In other words, one object cannot unilaterally decide whether it is equal to another object; the two objects, and consequently the classes to which they belong, must bilaterally decide if they are equal or not. They BOTH must agree.
  Hence, it is improper and incorrect for your own class to have an equals method that compares itself with an object of the java.lang.String class, or of any other built-in Java class for that matter. It is very important to understand this requirement properly, because a naive implementation of the equals method may easily violate it, with undesired consequences.
- Transitive - It means that if the first object is equal to the second object and the second object is equal to the third object, then the first object is equal to the third object. In other words, if two objects agree that they are equal, and follow the symmetry principle, one of them cannot decide to have a similar contract with an object of a different class. All three must agree and follow the symmetry principle for the various permutations of these three classes.
  Consider this example - A, B and C are three classes. A and B both implement the equals method in such a way that it provides comparison for objects of class A and class B. Now, if the author of class B decides to modify its equals method so that it also provides equality comparison with class C, he would be violating the transitivity principle, because no proper equals comparison mechanism would exist for objects of class A and class C.
- Consistent - It means that if two objects are equal, they must remain equal as long as they are not modified. Likewise, if they are not equal, they must remain non-equal as long as they are not modified. The modification may take place in either one of them or in both of them.
- null comparison - It means that no object of an instantiable class is equal to null. You have to ensure that your implementation of the equals method returns false if null is passed to it as an argument.
- Equals & Hash Code relationship - The last note from the API documentation is very important; it states the relationship requirement between these two methods. It simply means that if two objects are equal, then they must have the same hash code; however, the opposite is NOT true. This is discussed in detail later in this article.
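The symmetry problem with built-in classes can be illustrated with a minimal sketch. The class name Tag and its field are hypothetical and exist only for this illustration, and the @Override annotation assumes Java 5 or later. The equals method below is deliberately broken: it accepts a java.lang.String, but String has no knowledge of Tag and cannot return the favor.

```java
public class SymmetryDemo {

    // Hypothetical class, for illustration only: it tries to be "equal" to a String.
    static final class Tag {
        private final String value; // assumed non-null in this sketch

        Tag(String value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object obj) {
            if (obj instanceof Tag) {
                return value.equals(((Tag) obj).value);
            }
            // Broken on purpose: a one-sided comparison with a built-in class.
            if (obj instanceof String) {
                return value.equals(obj);
            }
            return false;
        }

        @Override
        public int hashCode() {
            return value.hashCode();
        }
    }

    public static void main(String[] args) {
        Tag tag = new Tag("java");
        String text = "java";

        System.out.println(tag.equals(text)); // true
        System.out.println(text.equals(tag)); // false: String does not know about Tag
    }
}
```

Since the two calls disagree, the pair violates the symmetric requirement, and a collection holding a mix of Tag and String objects could behave differently depending on which side equals is invoked on.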
public int hashCode()
This method returns the hash code value for the object on which it is invoked. The hash code is returned as an integer and is supported for the benefit of hashing-based collection classes such as Hashtable, HashMap, HashSet etc. This method must be overridden in every class that overrides the equals method.

This is what the JDK 1.4 API documentation says about the hashCode method of the Object class:
Returns a hash code value for the object. This method is supported for the benefit of hashtables such as those provided by java.util.Hashtable.
- The general contract of hashCode is:
- Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
- If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
- It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
Compared to the general contract specified by the equals method, the contract specified by the hashCode method is relatively simple and easy to understand. It states two important requirements that must be met while implementing the hashCode method; the third point of the contract is in fact an elaboration of the second. Let's understand what this contract really means.
- Consistency during same execution - Firstly, it states that the hash code returned by the hashCode method must be consistently the same for multiple invocations during the same execution of the application, as long as the object is not modified in a way that affects the equals method.
- Hash Code & Equals relationship - The second requirement of the contract is the hashCode counterpart of the requirement specified by the equals method. It simply emphasizes the same relationship: equal objects must produce the same hash code. However, the third point elaborates that unequal objects need not produce distinct hash codes.
Equal objects must produce the same hash code as long as they are equal; however, unequal objects need not produce distinct hash codes.
The rest of the requirements specified in the contracts of these two methods are specific to those methods and are not directly related to the relationship between them. Those specific requirements were discussed earlier. This relationship also means that whenever you override the equals method, you must override the hashCode method as well. Failing to comply with this requirement usually results in unpredictable, undesired behavior of the class when it is used with the Java collection classes or with any other Java classes that rely on hashing.
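As a rough illustration of what goes wrong, the following sketch compares two hypothetical classes, BrokenKey and FixedKey (names invented for this example, written for Java 7 or later). Both override equals in the same way, but only FixedKey overrides hashCode.

```java
import java.util.HashSet;
import java.util.Set;

public class MissingHashCodeDemo {

    // Hypothetical class: overrides equals but forgets hashCode.
    static class BrokenKey {
        private final int id;

        BrokenKey(int id) { this.id = id; }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (obj == null || obj.getClass() != getClass()) return false;
            return id == ((BrokenKey) obj).id;
        }
        // hashCode() is inherited from Object, so equal keys usually get different hash codes.
    }

    // Same equals, plus a hashCode based on the same field.
    static class FixedKey {
        private final int id;

        FixedKey(int id) { this.id = id; }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (obj == null || obj.getClass() != getClass()) return false;
            return id == ((FixedKey) obj).id;
        }

        @Override
        public int hashCode() {
            return 31 * 7 + id;
        }
    }

    public static void main(String[] args) {
        Set<BrokenKey> broken = new HashSet<>();
        broken.add(new BrokenKey(1));
        // Almost always prints false: the lookup uses the hash code inherited from Object.
        System.out.println(broken.contains(new BrokenKey(1)));

        Set<FixedKey> fixed = new HashSet<>();
        fixed.add(new FixedKey(1));
        System.out.println(fixed.contains(new FixedKey(1))); // true
    }
}
```

The BrokenKey lookup is not guaranteed to fail, because two distinct objects may happen to receive the same hash code from Object, but in practice the set behaves as if the equal key were absent.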
Correct Implementation Example
The following code exemplifies how all the requirements of the equals and hashCode methods should be fulfilled so that the class behaves correctly and consistently with other Java classes. This class implements the equals method in such a way that it only provides equality comparison for objects of the same class, similar to built-in Java classes like String and the wrapper classes.

     1. public class Test
     2. {
     3.     private int num;
     4.     private String data;
     5.
     6.     public boolean equals(Object obj)
     7.     {
     8.         if(this == obj)
     9.             return true;
    10.         if((obj == null) || (obj.getClass() != this.getClass()))
    11.             return false;
    12.         // object must be Test at this point
    13.         Test test = (Test)obj;
    14.         return num == test.num &&
    15.         (data == test.data || (data != null && data.equals(test.data)));
    16.     }
    17.
    18.     public int hashCode()
    19.     {
    20.         int hash = 7;
    21.         hash = 31 * hash + num;
    22.         hash = 31 * hash + (null == data ? 0 : data.hashCode());
    23.         return hash;
    24.     }
    25.
    26.     // other methods
    27. }
Now, let's examine why this implementation is correct. The class Test has two member variables, num and data. These two variables define the state of the object, and they also participate in the equals comparison for objects of this class. Hence, they should also be involved in calculating the hash codes of objects of this class.

Consider the equals method first. At line 8, the passed object reference is compared with this object itself. This check saves time if both references refer to the same object on the heap and the equals comparison is expensive. Next, the if condition at line 10 first checks if the argument is null; if not, then (due to the short-circuit nature of the OR || operator) it checks if the argument is of type Test by comparing the classes of the argument and this object. This is done by invoking the getClass() method on both references. If the argument is null or is of a different class, false is returned. This is done by the following code -
if((obj == null) || (obj.getClass() != this.getClass())) return false; // prefer
This conditional check should be preferred instead of the conditional check given by -
if(!(obj instanceof Test)) return false; // avoid
This is because the first condition (marked "prefer") ensures that false is returned if the argument is an instance of a subclass of the class Test. The second condition (marked "avoid") does not: the instanceof operator evaluates to true for a subclass instance, so the check fails to return false. Thus, it might violate the symmetry requirement of the contract. The instanceof check is correct only if the class is final, so that no subclass can exist. The first condition works for both final and non-final classes. Note that both conditions return false if the argument is null: the instanceof operator evaluates to false if the left-hand side (LHS) operand is null, irrespective of the operand on the right-hand side (RHS), as specified by JLS 15.20.2. However, the first condition should be preferred for better type checking.

This class implements the equals method in such a way that it provides equals comparison only for objects of the same class. Note that this is not mandatory. But if a class decides to provide equals comparison for objects of other classes, then the other class (or classes) must also agree to provide the same for this class so as to fulfill the symmetry and reflexivity requirements of the contract. This particular equals method implementation violates neither of these requirements. Lines 14 and 15 actually perform the equality comparison for the data members, and return true if they are equal. Line 15 also ensures that invoking the equals method on the String variable data will not result in a NullPointerException.

While implementing the equals method, primitives can be compared directly with the equality operator (==) after performing any necessary conversions (such as float to Float.floatToIntBits or double to Double.doubleToLongBits). Object references, on the other hand, can be compared by invoking their equals method recursively. You also need to ensure that invoking the equals method on these object references does not result in a NullPointerException.

Here are some useful guidelines for implementing the equals method correctly.
- Use the equality operator == to check if the argument is a reference to this object; if yes, return true. This saves time when the actual comparison is costly.
- Use the following condition to check that the argument is not null and that it is of the correct type; if not, return false.
  if((obj == null) || (obj.getClass() != this.getClass())) return false;
  Note that the correct type does not necessarily mean the same type or class as shown in the example above. It could be any class or interface that one or more classes agree to implement for providing the comparison.
- Cast the method argument to the correct type. Again, the correct type may not be the same class. Also, since this step is done after the above type-check condition, it will not result in a ClassCastException.
- Compare the significant variables of both the argument object and this object and check if they are equal. If *all* of them are equal then return true, otherwise return false. Again, as mentioned earlier, while comparing these member variables, primitives can be compared directly with the equality operator (==) after performing any necessary conversions (such as float to Float.floatToIntBits or double to Double.doubleToLongBits). Object references, on the other hand, can be compared by invoking their equals method recursively. You also need to ensure that invoking the equals method on these object references does not result in a NullPointerException, as shown in the example above (Line 15).
  It is neither necessary nor advisable to include in this comparison those class members that can be calculated from other variables, hence the words "significant variables". Leaving them out improves the performance of the equals method. Only you can decide which class members are significant and which are not.
- Do not change the type of the argument of the equals method. It takes a java.lang.Object as an argument; do not use your own class instead. If you do that, you will not be overriding the equals method, but overloading it instead, which would cause problems. It is a very common mistake, and since it does not result in a compile time error, it becomes quite difficult to figure out why the code is not working properly.
- Review your equals method to verify that it fulfills all the requirements stated by the general contract of the equals method.
- Lastly, do not forget to override the hashCode method whenever you override the equals method, that's unpardonable. ;)
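The difference between the two type checks can be sketched with a pair of hypothetical class hierarchies. The names Point, ColorPoint, StrictPoint and StrictColorPoint are invented for this illustration, and java.util.Objects assumes Java 7 or later. The first pair uses the instanceof check, the second pair uses the getClass() comparison recommended above.

```java
import java.util.Objects;

public class TypeCheckDemo {

    // Hypothetical classes using the instanceof check (the form marked "avoid").
    static class Point {
        final int x;

        Point(int x) { this.x = x; }

        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof Point)) return false;
            return x == ((Point) obj).x;
        }

        @Override
        public int hashCode() { return 31 * 7 + x; }
    }

    static class ColorPoint extends Point {
        final String color;

        ColorPoint(int x, String color) { super(x); this.color = color; }

        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof ColorPoint)) return false;
            return super.equals(obj) && Objects.equals(color, ((ColorPoint) obj).color);
        }
        // hashCode is inherited: equal ColorPoints share x, so equal objects still hash alike.
    }

    // The same pair using the getClass() check (the form marked "prefer").
    static class StrictPoint {
        final int x;

        StrictPoint(int x) { this.x = x; }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (obj == null || obj.getClass() != this.getClass()) return false;
            return x == ((StrictPoint) obj).x;
        }

        @Override
        public int hashCode() { return 31 * 7 + x; }
    }

    static class StrictColorPoint extends StrictPoint {
        final String color;

        StrictColorPoint(int x, String color) { super(x); this.color = color; }

        @Override
        public boolean equals(Object obj) {
            // If super.equals returns true, obj has the same runtime class as this object.
            return super.equals(obj) && Objects.equals(color, ((StrictColorPoint) obj).color);
        }
    }

    public static void main(String[] args) {
        Point p = new Point(1);
        ColorPoint cp = new ColorPoint(1, "red");
        System.out.println(p.equals(cp));   // true
        System.out.println(cp.equals(p));   // false, so symmetry is broken

        StrictPoint sp = new StrictPoint(1);
        StrictColorPoint scp = new StrictColorPoint(1, "red");
        System.out.println(sp.equals(scp)); // false
        System.out.println(scp.equals(sp)); // false, the two sides agree
    }
}
```

With the getClass() comparison both directions return false, so the two classes agree with each other, at the cost of never treating a subclass instance as equal to a superclass instance.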
Now consider the hashCode method of this example. At line 20, an arbitrary non-zero constant value 7 is assigned to an int variable hash. Since the member variables num and data participate in the equals method comparison, they should also be involved in the calculation of the hash code. This is not mandatory, though. You can use a subset of the variables that participate in the equals comparison to improve the performance of the hashCode method. Performance of the hashCode method is indeed very important, but you have to be very careful while selecting the subset. It should include those variables that are most likely to have the greatest diversity of values. Sometimes, using all the variables that participate in the equals comparison for calculating the hash code makes more sense.

This class uses both variables for computing the hash code. Lines 21 and 22 calculate the hash code value based on these two variables. Line 22 also ensures that invoking the hashCode method on the variable data does not result in a NullPointerException if data is null. This implementation ensures that the general contract of the hashCode method is not violated. It returns consistent hash code values for different invocations and also ensures that equal objects have equal hash codes.

While implementing the hashCode method, primitives can be used directly in the calculation of the hash code value after performing any necessary conversions, such as float to Float.floatToIntBits or double to Double.doubleToLongBits. Since the return type of the hashCode method is int, long values must be converted to int values. As for object references, their hash codes should be calculated by invoking their hashCode method recursively. You also need to ensure that invoking the hashCode method on these object references does not result in a NullPointerException.

Writing a very good implementation of the hashCode method, one that calculates hash code values with a uniform distribution, is not a trivial task and may require input from mathematicians and theoretical computer scientists. Nevertheless, it is possible to write a decent and correct implementation by following a few simple rules.

Here are some useful guidelines for implementing the hashCode method correctly.
- Store an arbitrary non-zero constant integer value (say 7) in an int variable called hash.
- Involve the significant variables of your object in the calculation of the hash code; all the variables that are part of the equals comparison should be considered for this. Compute an individual hash code int var_code for each variable var as follows -
  - If the variable (var) is byte, char, short or int, then var_code = (int)var;
  - If the variable (var) is long, then var_code = (int)(var ^ (var >>> 32));
  - If the variable (var) is float, then var_code = Float.floatToIntBits(var);
  - If the variable (var) is double, then -
    long bits = Double.doubleToLongBits(var);
    var_code = (int)(bits ^ (bits >>> 32));
  - If the variable (var) is boolean, then var_code = var ? 1 : 0;
  - If the variable (var) is an object reference, then check if it is null; if yes, then var_code = 0; otherwise invoke the hashCode method recursively on this object reference to get the hash code. This can be simplified and given as -
    var_code = (null == var ? 0 : var.hashCode());
- Combine this individual variable hash code var_code into the original hash code hash as follows -
  hash = 31 * hash + var_code;
- Follow these steps for all the significant variables and in the end return the resulting integer hash.
- Lastly, review your hashCode method and check that it returns equal hash codes for equal objects. Also, verify that the hash codes returned for an object are consistently the same for multiple invocations during the same execution.
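The steps above can be sketched for a class that has one field of each kind covered by the guidelines. The class Sample and its field names are hypothetical and chosen only for this illustration; the equals method follows the earlier guidelines so that the two methods stay consistent with each other.

```java
public class HashRecipeDemo {

    // Hypothetical class with one field of each kind covered by the guidelines.
    static final class Sample {
        private final int count;
        private final long id;
        private final float ratio;
        private final double weight;
        private final boolean active;
        private final String label; // may be null

        Sample(int count, long id, float ratio, double weight, boolean active, String label) {
            this.count = count;
            this.id = id;
            this.ratio = ratio;
            this.weight = weight;
            this.active = active;
            this.label = label;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (obj == null || obj.getClass() != getClass()) return false;
            Sample other = (Sample) obj;
            return count == other.count
                    && id == other.id
                    && Float.floatToIntBits(ratio) == Float.floatToIntBits(other.ratio)
                    && Double.doubleToLongBits(weight) == Double.doubleToLongBits(other.weight)
                    && active == other.active
                    && (label == other.label || (label != null && label.equals(other.label)));
        }

        @Override
        public int hashCode() {
            int hash = 7;                                        // arbitrary non-zero constant
            hash = 31 * hash + count;                            // int: used directly
            hash = 31 * hash + (int) (id ^ (id >>> 32));         // long: fold the two halves
            hash = 31 * hash + Float.floatToIntBits(ratio);      // float: bit pattern
            long bits = Double.doubleToLongBits(weight);         // double: bit pattern, then fold
            hash = 31 * hash + (int) (bits ^ (bits >>> 32));
            hash = 31 * hash + (active ? 1 : 0);                 // boolean
            hash = 31 * hash + (label == null ? 0 : label.hashCode()); // reference, null-safe
            return hash;
        }
    }

    public static void main(String[] args) {
        Sample a = new Sample(1, 5_000_000_000L, 0.5f, 2.25, true, null);
        Sample b = new Sample(1, 5_000_000_000L, 0.5f, 2.25, true, null);

        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
        System.out.println(a.hashCode() == a.hashCode()); // true
    }
}
```

The main method performs the review suggested in the last guideline: equal objects produce equal hash codes, and repeated invocations on the same object return the same value.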
The guidelines given here for implementing the equals and hashCode methods are merely guidelines; they are not absolute laws or rules. Nevertheless, following them while implementing these two methods should give you correct and consistent results.
Summary & Miscellaneous Tips
- Equal objects must produce the same hash code as long as they are equal; however, unequal objects need not produce distinct hash codes.
- The equals method provides "deep comparison" by checking if two objects are logically equal, as opposed to the "shallow comparison" provided by the equality operator ==.
- However, the equals method in the java.lang.Object class only provides "shallow comparison", the same as that provided by the equality operator ==.
- The equals method only takes Java objects as an argument, and not primitives; in JDK 1.4, passing a primitive results in a compile time error. (Since Java 5, autoboxing converts a primitive argument to its wrapper object, so such a call compiles.)
- Passing objects of different types to the equals method will never result in a compile time error or runtime error.
- For standard Java wrapper classes and for java.lang.String, if the type (class) of the equals argument is different from the type of the object on which the equals method is invoked, it will return false.
- The class java.lang.StringBuffer does not override the equals method, and hence it inherits the implementation from the java.lang.Object class.
- The equals method must not provide equality comparison with any built-in Java class, as it would result in the violation of the symmetry requirement stated in the general contract of the equals method.
- If null is passed as an argument to the equals method, it will return false.
- Equal hash codes do not imply that the objects are equal.
- return 1; is a legal implementation of the hashCode method; however, it is a very bad implementation. It is legal because it ensures that equal objects will have equal hash codes, and it also ensures that the hash code returned will be consistent for multiple invocations during the same execution. Thus, it does not violate the general contract of the hashCode method. It is a bad implementation because it returns the same hash code for all objects. This explanation applies to all implementations of the hashCode method which return the same constant integer value for all objects.
- In standard JDK 1.4, the wrapper classes java.lang.Short, java.lang.Byte, java.lang.Character and java.lang.Integer simply return the value they represent as the hash code by typecasting it to an int.
- Since JDK version 1.3, the class java.lang.String caches its hash code, i.e. it calculates the hash code only once, stores it in an instance variable and returns this value whenever the hashCode method is called. It is legal because java.lang.String represents an immutable string.
- It is incorrect to involve a random number directly while computing the hash code of an object, as it would not consistently return the same hash code for multiple invocations during the same execution.