Wednesday, 22 October 2014

hashCode() and equals() methods in java

Usage of hashCode() and equals()
hashCode() method is used to get a unique integer for given object. This integer is used for determining the bucket location, when this object needs to be stored in some HashTable like data structure. By default, Object’s hashCode() method returns and integer representation of memory address where object is stored.

equals() method, as name suggest, is used to simply verify the equality of two objects. Default implementation simply check the object references of two objects to verify their equality.


Overriding the default behavior
Everything works fine until you do not override any of these methods in your classes. But, sometimes application needs to change the default behavior of some objects.

Lets take an example where your application has Employee object. Lets create a minimal possible structure of Employee class

public class Employee
{
    private Integer id;
    private String firstname;
    private String lastName;
    private String department;

    public Integer getId() {
        return id;
    }
    public void setId(Integer id) {
        this.id = id;
    }
    public String getFirstname() {
        return firstname;
    }
    public void setFirstname(String firstname) {
        this.firstname = firstname;
    }
    public String getLastName() {
        return lastName;
    }
    public void setLastName(String lastName) {
        this.lastName = lastName;
    }
    public String getDepartment() {
        return department;
    }
    public void setDepartment(String department) {
        this.department = department;
    }
}
Above Employee class has some very basic attributes and there accessor methods. Now consider a simple situation where you need to compare two employee objects.
public class EqualsTest {
    public static void main(String[] args) {
        Employee e1 = new Employee();
        Employee e2 = new Employee();

        e1.setId(100);
        e2.setId(100);
        //Prints false in console
        System.out.println(e1.equals(e2));
    }
}
Above method will print “false“. But, is it really correct after knowing that both objects represent same employee. In a real time application, this must return true.

To achieve correct behavior, we need to override equals method as below:
public boolean equals(Object o) {
        if(o == null)
        {
            return false;
        }
        if (o == this)
        {
           return true;
        }
        if (getClass() != o.getClass())
        {
            return false;
        }
        Employee e = (Employee) o;
        return (this.getId() == e.getId());
}
Add this method to your Employee class, and EqualsTest will start returning “true“.

So are we done? Not yet. Lets test again above modified Employee class in different way.
import java.util.HashSet;
import java.util.Set;

public class EqualsTest
{
    public static void main(String[] args)
    {
        Employee e1 = new Employee();
        Employee e2 = new Employee();

        e1.setId(100);
        e2.setId(100);

        //Prints 'true'
        System.out.println(e1.equals(e2));

        Set employees = new HashSet();
        employees.add(e1);
        employees.add(e2);
        //Prints two objects
        System.out.println(employees);
    }
}
Above class prints two objects in second print statement. If both employee objects have been equal, in a Set which stores only unique objects, there must be only one instance inside HashSet, after all both objects refer to same employee. What is it we are missing??

We are missing the second important method hashCode(). As java docs say, if you override equals() method then you must override hashCode() method. So lets add another method in our Employee class.
@Override
 public int hashCode()
 {
    final int PRIME = 31;
    int result = 1;
    result = PRIME * result + getId();
    return result;
 }
Once above method is added in Employee class, the second statement start printing only single object in second statement, and thus validating the true equality of e1 and e2.

Key Points
  • Always use same attributes of an object to generate hashCode() and equals() both. As in our case, we have used employee id.
  • equals() must be consistent (if the objects are not modified, then it must keep returning the same value).
  • Whenever a.equals(b), then a.hashCode() must be same as b.hashCode().
  • If you override one, then you should override the other.
If you feel, I am missing something or wrong somewhere, please leave a comment. I will update this post again to help others.

Hashmap in Java

If anybody asks me to describe “How HashMap works?“, I simply answer: “On principle of Hashing“. HashMap is a Collection which works on principle of Hashing and it uses key and value pairs to store the information and we can get the value if we pass key to the HashMap. You can also say HashMap is fast as it is not synchronized.

HashMap class comes from java.util package and HashMap class extends AbstractMap and implements Map, Cloneable and Serializable interfaces. HashMap is nothing but Hash table based implementation of Map Interface.



What is Hashing

Hashing in its simplest form, is a way to assigning a unique code for any variable/object after applying any formula/algorithm on its properties. A true Hashing function must follow this rule:

Hash function should return the same hash code each and every time, when function is applied on same or equal objects. In other words, two equal objects must produce same hash code consistently.

Note: All objects in java inherit a default implementation of hashCode() function defined in Object class. This function produce hash code by typically converting the internal address of the object into an integer, thus producing different hash codes for all different objects.

A little about Entry class

A map by definition is : “An object that maps keys to values”. Very easy.. right?
So, there must be some mechanism in HashMap to store this key value pair. Answer is YES. HashMap has an inner class Entry, which looks like this:

static class Entry implements Map.Entry{
          final K key;
          V value;
          Entry next;
          final int hash;
          ...
}

Surely Entry class has key and value mapping stored as attributes. Key has been marked as final and two more fields are there: next and hash.


put() method in hashmap

We need to pass Key and Value object to put() method in Java HashMap. Once we pass the key and value to in put() method of HashMap, HashMap calls hashCode method on key object and use hashcode into own hashing function to find the bucket location. Bucket is the place where Entry object is stored and hashcode is like an address of that bucket. So what exactly we store in bucket? This is very important to understand. Java stores both Key and Value object as a Map.Entry in bucket. These are stored like LinkedList. But why these are stored as LinkedList. As we know we are using the hashcode to find the bucket location but two unequal objects can have same hashcode. So there are chances of collision. So along with value key is stored.


get() methods in hashmap

We need to pass key to the get method which takes the key as an argument and try to find bucket position using the hashcode. Once it finds the bucket using equals function it gets the value and returns it.

If no match is found, get() method returns null.


Key Notes


  • Data structure to store Entry objects is an array named table of type Entry.
  • A particular index location in array is referred as bucket, because it can hold the first element of a LinkedList of Entry objects.
  • Key object’s hashCode() is required to calculate the index location of Entry object.
  • Key object’s equals() method is used to maintain uniqueness of Keys in map.
  • Value object’s hashCode() and equals() method are not used in HashMap’s get() and put() methods.
  • Hash code for null keys is always zero, and such Entry object is always stored in zero index in Entry[].
  • Order in not maintained in hashmap.


What is the load factor in java HashMap and what happens when size of the HashMap exceeds a given threshold defined by load factor?

Load factor of HashMap is is .75 it will act to re-size the map once it filled 75%, so once size exceeds 75%, then Java HashMap re-size itself by creating a new bucket array of size twice of size defined previously for the HashMap ,then it start putting every old element into that new bucket array. This process is also called rehashing because hash function is used to find the new bucket location.

What are the differences between HashMap and Hashtable?

Lets see what are the differences we can see in HashMap and Hashtable.

1. Synchronization :
Hashtables are synchronized , it means at one point of time only one thread can modify a Hashtable. i.e any thread before performing an update on a hashtable will be acquiring a lock on the object while others will wait for lock to be released.
Hashmap is not synchronized , but we can make it synchronized.

2. Fail-safe:
Iterator of HashMap is fail-safe but the enumerator for the Hashtable is not fail-safe. Lets see what exactly fail-safe is. If an iterator has been created on a collection object and some other thread tries to modify this object , a concurrent modification exception will be thrown. So fail safe is like ability of system to fail but it wont produce some catastrophic results.

3. null values
Hashtable doesn’t allow null but Hashmap allows null values.