Wouldn’t it be nice to have a search algorithm to locate those things you need in your day-to-day life? You want to be able to quickly find car keys, books, pens, chargers, specific contacts from our phone records. That is what search algorithms do in a computer. In this article we look at Python search algorithms that retrieve information stored within some data structure or a specific search space.
Searching is the technique of selecting specific data from a collection of data based on some condition. It is the process of finding a particular item in a collection. It can also be described as:
“Given a list of values, a function that compares two values and a desired value, find the position of the desired value in the list.”
In this post, we’ll focus on two main search algorithms:
- Linear search – checks the items in sequence until the desired item is found
- Binary search – requires a sorted sequence in the list, and checks for the value in the middle of the list, repeatedly discarding the half of the list that contains values that are definitely either all larger or all smaller than the desired value.
Linear Search in Python
Linear or sequential search is the simplest solution. It simply iterates over the list, one item at a time, until the specific item is found or all items have been examined.
In Python, a target item can be found in a sequence using the in operator:
if key in theArray : print( "The key is in the array." ) else : print( "The key is not in the array.")
To determine if an item is in the array, the search begins with the value in the first element. If the first element does not contain the target value, the next element in sequential order is compared to the item. This process is repeated until the item is found. If the item is not in the array, we still have to iterate over the list comparing every item in the array to the target value. This is because it cannot be determined that the item is not in the sequence until the entire array has been traversed.
def Linear_Search(arr,target): #iterate to the end of the list for i in range(len(arr)): #match found, return true if arr[i]==target: return True return False
Searching a Sorted Sequence using Linear Search
If we know the list is ordered, we only have to check until we have found the element or an element greater than it
def ordered_seq_search(arr,target): # Start at position 0 pos= 0 # Target becomes true if item is in the list found = False # Stop marker stopped = False # go until end of list while pos< len(arr) and not found and not stopped: # If match if arr[pos] == target: found = True else: # Check if element is greater if arr[pos] > target: stopped = True # Otherwise move on else: pos = pos+1 return found
Finding the Smallest Item in a List
Instead of searching for a specific value in an unsorted sequence, suppose we wanted to search for the smallest value, which is equivalent to applying Python’s min() function to the sequence.
In the Python search algorithm, a linear search is performed as before. But this time we must keep track of the smallest value found for each iteration through the loop, as shown in the sample code:
def smallestItem(arr): # Assume the first item is the smallest value. smallest=arr[0] #loop through the list and check for any smaller value for i in range(len(arr)): if arr[i]==smallest: smallest=arr[i] return smallest #return the smallest value
Binary Search in Python
Binary Search follows a divide-and-conquer methodology. It is faster than linear search but requires that the array be sorted before the algorithm is executed.
How it works:
We first check the MIDDLE element in the list. If . . .
- . . . it is the value we want, we can stop.
- . . . we find a HIGHER than the value we want, we repeat the search process with the portion of the list BEFORE the middle element.
- . . . a LOWER than the value we want is in middle, we repeat the search process with the portion of the list AFTER the middle element.
The binary Python search algorithm can be written either recursively or iteratively. Recursion is generally slower in Python because it requires the allocation of new stack frames.
Iterative Implementation of Binary Search
def Binary_Search(arr,target): ''' Binary Search Algorithm implemented Iteratively ''' # First and last index values left=0 right=len(arr)-1 while left<=right: mid=(left+right)//2 # Match found if arr[mid]==target: return True # Set new midpoints up or down depending on comparison else: if target>arr[mid]: #set up left = mid+1 else: #set down right = mid-1 return False
Recursive Implementation of Binary Search
def rec_bsearch(arr,target): ''' Binary Search Algorithm implemented Recursively ''' # Base Case! if len(arr) == 0: return False # Recursive Case else: mid = len(arr)//2 # If match found if arr[mid]==target: return True else: # Call again on second half if target<arr[mid]: return rec_bsearch(arr[:mid],target) # Or call on first half else: return rec_bsearch(arr[mid+1:],target)
Run-Time Analysis
- Sequential (Linear) Search
The time complexity of linear search is O(n). This means that the time taken to execute increases with the number of items in our input list.
Linear search is not often used in practice. That’s because the same efficiency can be achieved by using inbuilt methods or existing operators. And it is not as fast or efficient as other search algorithms.
When we need to find the first occurrence of an item in an unsorted collection, linear search is a good approach. Because unlike most other search algorithms, it does not require that a collection be sorted before searching begins.
- Binary Search Algorithm
We can only pick one possibility per iteration. The input size gets divided by two, in each iteration. This makes the time complexity of binary search O(log n), which is more efficient than the linear search.
One drawback of binary search is that if there are multiple occurrences of an element in the array, it does not return the index of the first element, but rather the index of the element closest to the middle.
Algorithm | Best case | Expected | Worst case |
Linear search | O(1) | O(N) | O(N) |
Binary search | O(1) | O(log N) | O( log N) |
Conclusion
Various search problems us different search algorithms in computer science. It depends on the nature of the search space of a problem domain, either with discrete or continuous values. Usually, the appropriate search algorithm depends on the data structure being searched, and may also require one to have prior knowledge about the data. Nevertheless, the applications of search algorithms are quite broad: from simple information retrieval to SEO optimization. Thank you for reading and keep exploring more use cases.
The full code can be found here.