This article is the second in a series about algorithms.
Binary search is a search algorithm that continually bisects a sorted array in order to find a given value. It is also known as “binary chop”, “half-interval search” and “logarithmic search”.
You intuitively use this algorithm when searching through a dictionary or phonebook. If the word or person you’re looking for starts with “O”, you would open the book roughly in the middle instead of paging through it from the beginning.
We could explain this algorithm practically using the “I’m thinking of a number between x and y, what is it” game 😅.
I’m Thinking of a Number
Player one thinks of a number (7) between two other numbers - say, 1 and 10 - and player two attempts to guess (or, in our case, compute with an algorithm) the answer.
Step 1
We suggest the number at the midpoint of the range - that is, 5 (rounding down, since an even-sized range has no single middle number). By doing this we bisect the set of potential numbers.
Player one informs us that our guess is too low - we continue.
Step 2
Because our guess is too low, we discard the lower half of the set of numbers - that is, 1 through 5 - as we can be sure that the number we’re looking for isn’t present in this group. We suggest the number at the midpoint of the new group - 8.
Player one informs us that our guess is too high - we continue.
Step 3
Because our guess is too high, we discard the upper half of the remaining group of numbers - that is, 8 through 10 - and, bisecting what is left (6 and 7), suggest 7.
Player one cowers at our algorithmic powers and admits defeat!
Had we used linear search, that is, starting from 1 and guessing each subsequent number in the set until we hit the correct one, it would have taken us 7 guesses. Using binary search, it took us 3 guesses.
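That guessing strategy is all binary search is. Below is a minimal iterative sketch in Python; the function name binary_search, the -1 return value for “not found” and the example list are illustrative choices, and the input must already be sorted:

```python
def binary_search(values, target):
    """Return the index of target in the sorted list values, or -1 if absent."""
    low, high = 0, len(values) - 1
    while low <= high:
        mid = (low + high) // 2       # bisect the current search space
        if values[mid] == target:
            return mid                # found it
        elif values[mid] < target:
            low = mid + 1             # guess too low: discard the lower half
        else:
            high = mid - 1            # guess too high: discard the upper half
    return -1                         # not present

# The guessing game: find 7 among the numbers 1..10
print(binary_search(list(range(1, 11)), 7))  # -> 6 (the index of 7)
```

Integer division (`//`) always picks the lower of two middle candidates, so the exact sequence of guesses can differ slightly from the game above, but the logic - guess the midpoint, discard half - is the same.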
Complexity
Binary search chops the set of numbers in half with every step, dividing the search space into 2 parts - which is why it is base-2 logarithmic. The base-2 logarithm is the number of times you can divide the search space by 2 before it is reduced to 1 element or becomes empty. So, if you have $n$ elements, it takes $\log_2(n)$ steps to reduce the search space to a single element, at which point you have found what you’re looking for.
When expressing algorithms’ complexity with Big-O notation, we can drop the base, because logarithms with different bases differ only by a constant factor - a detail that Big-O simplifies away. So the $_2$ in $\log_2(n)$ disappears and we write $O(\log n)$.
In our example we had 10 items in the set, and $\log_2(10) \approx 3.32$, which lines up with the 3 guesses it took us. If we had a set of 1,000 values, it would take us at most 10 guesses, since $\log_2(1000) \approx 9.97$.
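If you want to verify these numbers yourself, Python’s math.log2 computes the base-2 logarithm directly; rounding up gives the worst-case number of guesses:

```python
import math

print(math.log2(10))    # ~3.32 -> at most 4 guesses for 10 items
print(math.log2(1000))  # ~9.97 -> at most 10 guesses for 1,000 items
```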
Time
- worst-case time complexity - $O(\log n)$ (logarithmic; see the counting sketch after this list)
- best-case time complexity - $O(1)$ (constant time)
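A quick way to see the best and worst case is to count the guesses. The sketch below is the same illustrative search as above, extended with a counter (the function name and example calls are just for demonstration):

```python
def binary_search_counting(values, target):
    """Binary search that also reports how many guesses (comparisons) it made."""
    low, high, guesses = 0, len(values) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        guesses += 1
        if values[mid] == target:
            return mid, guesses
        if values[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1, guesses

numbers = list(range(1, 11))
print(binary_search_counting(numbers, 5))   # best case: the target is the first midpoint, 1 guess
print(binary_search_counting(numbers, 10))  # worst case: 4 guesses for 10 items
```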
Space
Space complexity refers to the amount of memory the algorithm needs in addition to the input data.
- using iteration - $O(1)$ (constant space)
- using recursion - $O(\log n)$ (logarithmic, because of the call stack - see the sketch below)
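Here is a recursive sketch (the name binary_search_recursive is just illustrative) to show where that extra memory comes from: every halving adds a new stack frame that stays alive until the result bubbles back up, so the stack grows to roughly $\log_2(n)$ frames, whereas the iterative version above reuses the same two variables:

```python
def binary_search_recursive(values, target, low=0, high=None):
    """Recursive variant: each halving adds a call-stack frame, hence O(log n) space."""
    if high is None:
        high = len(values) - 1
    if low > high:
        return -1                      # search space is empty: not found
    mid = (low + high) // 2
    if values[mid] == target:
        return mid
    if values[mid] < target:
        # guess too low: discard the lower half
        return binary_search_recursive(values, target, mid + 1, high)
    # guess too high: discard the upper half
    return binary_search_recursive(values, target, low, mid - 1)
```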
Advantages
- efficiency because of its logarithmic time complexity
- predictable performance
- space efficient if using an iterative solution
- scales well
- simple implementation
Disadvantages
- the array of values needs to be sorted
- if values’ comparison operations are heavy, overhead can add up
- less effective on small data sets, where a simple linear search is often just as fast
- increased memory overhead if implemented recursively