Binary Search vs contains Performance in Java List
Java provides two ways to find an element in a list: Collections.binarySearch() and contains(). Under the hood, contains() in ArrayList and LinkedList calls indexOf(), which loops through the list linearly and compares each element with the key. indexOf() returns the index of the first match, or -1 if the key is not found, and contains() returns true or false accordingly. The time complexity of contains() is therefore O(n). Collections.binarySearch() runs in O(log n) time on a random-access list, but it requires the list to be sorted. If the list is not sorted, it must be sorted first, which takes O(n log n) time.
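To make the two calls concrete, here is a minimal sketch of both searches on the same sorted list. The class name SearchBasics and the helper names are illustrative assumptions; only List.contains() and Collections.binarySearch() come from the standard library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SearchBasics {
    // Linear search: works on any list, sorted or not.
    static boolean linearContains(List<Integer> list, int key) {
        return list.contains(key);
    }

    // Binary search: the list must already be sorted in ascending order.
    // Returns the index of the key, or a negative value if it is absent.
    static int sortedIndexOf(List<Integer> sortedList, int key) {
        return Collections.binarySearch(sortedList, key);
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            list.add(i * 2); // 0, 2, 4, ..., 18
        }

        System.out.println(linearContains(list, 8)); // true
        System.out.println(sortedIndexOf(list, 8));  // 4
        System.out.println(sortedIndexOf(list, 7));  // -5, i.e. -(insertion point) - 1
    }
}
```

Note that binarySearch() reports a missing key with a negative number rather than false, so a membership test is usually written as `Collections.binarySearch(list, key) >= 0`.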
How to choose:
- If the element to be found is near the start of the list, contains() performs better, because it scans linearly from the first element.
- If the elements are sorted and the list is relatively large, Collections.binarySearch() is faster, as it takes only O(log n) time.
- If the list is unsorted, contains() is better for a single search, as it takes only O(n) time. If the number of search queries is high, however, Collections.binarySearch() performs better overall: the list is sorted only once, before the first search, which takes O(n log n) time, and each search after that takes O(log n) time.
- For a list with a relatively small number of elements, contains() is faster.
- A LinkedList does not implement the RandomAccess interface and so cannot access an element in O(1) time. For such a list, prefer contains() over Collections.binarySearch(), because Collections.binarySearch() still needs O(n) link traversals even though it performs only O(log n) comparisons.
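The trade-off for repeated queries can be sketched as follows. This is only an illustration under stated assumptions: the class RepeatedQueries and its two methods are hypothetical names, and the sort-once variant copies the data into an ArrayList so that binarySearch() works on a RandomAccess list even when the input is a LinkedList.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

public class RepeatedQueries {
    // Linear strategy: every query scans the list, O(n) per query.
    static int countHitsLinear(List<Integer> data, List<Integer> queries) {
        int hits = 0;
        for (Integer q : queries) {
            if (data.contains(q)) {
                hits++;
            }
        }
        return hits;
    }

    // Sort-once strategy: copy into an ArrayList (which implements
    // RandomAccess), sort once in O(n log n), then answer each query
    // in O(log n). The caller's list is left unchanged.
    static int countHitsSorted(List<Integer> data, List<Integer> queries) {
        List<Integer> sorted = new ArrayList<>(data);
        Collections.sort(sorted);
        int hits = 0;
        for (Integer q : queries) {
            if (Collections.binarySearch(sorted, q) >= 0) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Integer> data = new LinkedList<>(Arrays.asList(5, 3, 9, 1, 7));
        List<Integer> queries = Arrays.asList(1, 2, 3, 9);

        System.out.println(countHitsLinear(data, queries)); // 3
        System.out.println(countHitsSorted(data, queries)); // 3
    }
}
```

Both strategies give the same answers; they differ only in cost. With n elements and q queries, the linear version does roughly n * q comparisons, while the sort-once version pays for one sort plus q short searches, which tends to pay off only when q is large.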
We will now look at three cases:
- Sorted small List
- Sorted large List
- Unsorted List
Case 1: For a small sorted list
In this case we take a sorted list that contains only 100 elements, 0 to 99, and search for 40. As noted above, contains() has an edge over Collections.binarySearch() on small lists when it comes to speed.
Time taken to find 40 inside arr using contains() = 16286 nano seconds
Time taken to find 40 inside arr using binarySearch() = 87957 nano seconds
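The timings above are the ones reported for this case. A minimal sketch of how such a single-shot measurement could be set up is given here; the class name SmallSortedListTiming and the helper buildSortedList() are assumptions for illustration, not the original listing, and the numbers will differ from machine to machine and run to run.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SmallSortedListTiming {
    // Builds the sorted list 0, 1, ..., n - 1.
    static List<Integer> buildSortedList(int n) {
        List<Integer> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> arr = buildSortedList(100);
        int key = 40;

        // Time one call to contains().
        long start = System.nanoTime();
        boolean found = arr.contains(key);
        long containsNanos = System.nanoTime() - start;

        // Time one call to Collections.binarySearch().
        start = System.nanoTime();
        int index = Collections.binarySearch(arr, key);
        long binarySearchNanos = System.nanoTime() - start;

        System.out.println("contains() found " + key + ": " + found
                + " in " + containsNanos + " ns");
        System.out.println("binarySearch() index of " + key + ": " + index
                + " in " + binarySearchNanos + " ns");
    }
}
```

A single timed call also includes one-time costs such as class loading, so small differences at this scale should be read with caution.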
Case 2: For a large sorted list
In this case we create a sorted ArrayList that contains 100000 elements, 0 to 99999, and find the element 40000 in it using contains() and Collections.binarySearch(). As the list is sorted and has a relatively large number of elements, Collections.binarySearch() performs better than contains().
Time taken to find 40000 inside arr using contains() = 6651276 nano seconds
Time taken to find 40000 inside arr using binarySearch() = 85231 nano seconds
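For a large list, a slightly steadier way to compare the two methods is to repeat each search many times and average the result. The sketch below is one possible setup, not the original listing: the class LargeSortedListTiming, the helper averageNanos(), and the repetition count of 1000 are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.BooleanSupplier;

public class LargeSortedListTiming {
    // Number of successful searches in the most recent measurement. Keeping
    // the result in use discourages the JIT from discarding the search calls.
    static int lastHits;

    // Builds the sorted list 0, 1, ..., n - 1.
    static List<Integer> buildSortedList(int n) {
        List<Integer> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    // Runs the search `runs` times and returns the average time per call in ns.
    static long averageNanos(BooleanSupplier search, int runs) {
        int hits = 0;
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            if (search.getAsBoolean()) {
                hits++;
            }
        }
        long elapsed = System.nanoTime() - start;
        lastHits = hits;
        return elapsed / runs;
    }

    public static void main(String[] args) {
        List<Integer> arr = buildSortedList(100_000);
        int key = 40_000;
        int runs = 1_000; // assumed repetition count

        long containsAvg = averageNanos(() -> arr.contains(key), runs);
        long binaryAvg = averageNanos(
                () -> Collections.binarySearch(arr, key) >= 0, runs);

        System.out.println("contains() average: " + containsAvg + " ns");
        System.out.println("binarySearch() average: " + binaryAvg + " ns");
    }
}
```

This is still a rough measurement. For results that need to be trusted, a dedicated benchmarking harness would be a better choice than hand-rolled timing loops.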
Case 3: For an unsorted List
In this case we create an unsorted ArrayList by storing random numbers between 0 and 100000 in it. As the list is unsorted, contains() performs better, since it takes only O(n) time. To use Collections.binarySearch() we first have to sort the list, which takes an extra O(n log n) time, and then spend O(log n) time searching for the element.
Time takes to find 66181 inside arr using contains() = 8331486 nano seconds
Time takes to find 66181 inside arr using binarySearch() = 140322701 nano seconds
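The cost of sorting before a single search can be sketched as below. Several details are assumptions, since the original listing is not shown: the list size of 100000, the fixed seed 42, the class name UnsortedListTiming, and the choice to search for a value taken from the middle of the list so that it is guaranteed to be present.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class UnsortedListTiming {
    // Fills a list with n pseudo-random values in the range [0, bound).
    static List<Integer> buildRandomList(int n, int bound, long seed) {
        Random random = new Random(seed);
        List<Integer> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            list.add(random.nextInt(bound));
        }
        return list;
    }

    // Sorts a copy of the list in O(n log n), then binary searches the copy
    // in O(log n). The original list keeps its order.
    static boolean sortThenSearch(List<Integer> list, int key) {
        List<Integer> sorted = new ArrayList<>(list);
        Collections.sort(sorted);
        return Collections.binarySearch(sorted, key) >= 0;
    }

    public static void main(String[] args) {
        List<Integer> arr = buildRandomList(100_000, 100_000, 42L);
        int key = arr.get(arr.size() / 2); // a value known to be in the list

        long start = System.nanoTime();
        boolean foundLinear = arr.contains(key);
        long containsNanos = System.nanoTime() - start;

        start = System.nanoTime();
        boolean foundSorted = sortThenSearch(arr, key);
        long sortSearchNanos = System.nanoTime() - start;

        System.out.println("contains(): " + foundLinear
                + " in " + containsNanos + " ns");
        System.out.println("sort + binarySearch(): " + foundSorted
                + " in " + sortSearchNanos + " ns");
    }
}
```

The second timing includes the copy and the sort, which is why a one-off search on unsorted data favors contains().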