In this post some Python basics search algorithms are presented. It can be found: binary search, recurrent binary search, and interpolation. The base code can be found In this repo Feel free to pull request other algorithms as well as investigations over this algorithms Some ideas that can be interesting to investigate:
import timeit
import random
import numpy as np
An implementation of the binary search algorithm. For details will follow. A good summary can be found on Wikipedia: https://en.wikipedia.org/wiki/Binary_search_algorithm.
conditions: should be used only for sorted arrays
def binary_search(array, value):
ary = array
min_idx = 0
max_idx = len(array)
while min_idx < max_idx:
middle_idx = (min_idx + max_idx) // 2
if array[middle_idx] == value:
return middle_idx
elif array[middle_idx] < value:
min_idx = middle_idx + 1
else:
max_idx = middle_idx
return None
binary_search(array=[1, 2, 4, 7, 8, 10, 11],
value=11)
6
binary_search(array=[1, 2, 4, 7, 8, 10, 11],
value=99)
# run same code 5 times to get measurable data
n = 1
# calculate total execution time
results = []
for x in [10, 100, 1000, 10000, 100000, 1000000]:
array = np.random.randint(0, x*2, size=x)
result = timeit.timeit(stmt='addition()', globals=globals(), number=n)
results.append(result)
# calculate the execution time
# get the average execution time
Note that this implementation of recursive binary search deliberately avoid slicing the array
(e.g., array[:middle_idx]
), because slicing Python lists is expensive due to the random memory access. E.g., slicing a Python list with as a_list[:k]
is an O(k) operation.
def recursive_binary_search(array, value, start_idx=None, end_idx=None):
len_ary = len(array)
if start_idx is None:
start_idx = 0
if end_idx is None:
end_idx = len(array) - 1
if not len_ary or start_idx >= end_idx:
return None
middle_idx = (start_idx + end_idx) // 2
if array[middle_idx] == value:
return middle_idx
elif array[middle_idx] > value:
return recursive_binary_search(array,
value,
start_idx=start_idx,
end_idx=middle_idx)
else:
return recursive_binary_search(array,
value,
start_idx=middle_idx + 1,
end_idx=len_ary)
return None
recursive_binary_search(array=[],
value=1)
recursive_binary_search(array=[1, 2, 4, 7, 8, 10, 11],
value=1)
0
recursive_binary_search(array=[1, 2, 4, 7, 8, 10, 11],
value=4)
2
recursive_binary_search(array=[1, 2, 4, 7, 8, 10, 11],
value=11)
6
recursive_binary_search(array=[1, 2, 4, 7, 8, 10, 11],
value=99)
A binary Search Tree is a node-based binary tree data structure which has the following properties:
import timeit
import random
import numpy as np
Function for searching in a binary tree class
# A utility function to search a given key in BST
def search(root,key):
# Base Cases: root is null or key is present at root
if root is None or root.val == key:
return root
# Key is greater than root's key
if root.val < key:
return search(root.right,key)
# Key is smaller than root's key
return search(root.left,key)
# This code is contributed by Bhavya Jain
Failed to start the Kernel.
Kernel Python 3.9.6 64-bit is not usable. Check the Jupyter output tab for more information.
View Jupyter <a href='command:jupyter.viewOutput'>log</a> for further details.
# Python program to demonstrate
# insert operation in binary search tree
# A utility class that represents
# an individual node in a BST
class Node:
def __init__(self, key):
#this will be another node class type Node. That's how we build the recursive tree.
self.left = None
#this will be or None or type Node
self.right = None
self.val = key
# A utility function to insert
# a new node with the given key
def insert(root, key):
if root is None:
return Node(key)
else:
if root.val == key:
return root
elif root.val < key:
root.right = insert(root.right, key)
else:
root.left = insert(root.left, key)
return root
# A utility function to do inorder tree traversal
def inorder(root):
if root:
inorder(root.left)
print(root.val)
inorder(root.right)
# Driver program to test the above functions
# Let us create the following BST
# 50
# / \
# 30 70
# / \ / \
# 20 40 60 80
r = Node(50)
r = insert(r, 30)
r = insert(r, 20)
r = insert(r, 40)
r = insert(r, 70)
r = insert(r, 60)
r = insert(r, 80)
# Print inoder traversal of the BST
inorder(r)
insertion using the loop
class GFG:
@staticmethod
def main(args):
tree = BST()
tree.insert(30)
tree.insert(50)
tree.insert(15)
tree.insert(20)
tree.insert(10)
tree.insert(40)
tree.insert(60)
tree.inorder()
class Node:
left = None
val = 0
right = None
def __init__(self, val):
self.val = val
class BST:
root = None
def insert(self, key):
node = Node(key)
if (self.root == None):
self.root = node
return
prev = None
temp = self.root
while (temp != None):
if (temp.val > key):
prev = temp
temp = temp.left
elif(temp.val < key):
prev = temp
temp = temp.right
if (prev.val > key):
prev.left = node
else:
prev.right = node
def inorder(self):
temp = self.root
stack = []
while (temp != None or not (len(stack) == 0)):
if (temp != None):
stack.append(temp)
temp = temp.left
else:
temp = stack.pop()
print(str(temp.val) + " ", end="")
temp = temp.right
if __name__ == "__main__":
GFG.main([])
Se recomienda aplicar este algoritmo:
cuando acceder al elemento buscado es muy costoso: el acceso a un elemento que se encuentra en la memoria es algo muy rápido. Pero, cuando se debe acceder a un elemento que está almacenado en un disco, el acceso se vuelve una tarea costosa, que demanda tiempo y recursos; y
cuando los elementos están ordenados según una distribución uniforme.
multiple interpolation search algorithm can be used: let’s see how linear interpolation would give us the position of a particular element x
The formula for pos can be derived as follows.
Let’s assume that the elements of the array are linearly distributed.
General equation of line : y = m*x + c. y is the value in the array and x is its index.
Now putting value of lo,hi and x in the equation
x = m*pos + c —-(3)
subtracting eqxn (2) from (3)
Algorithm The rest of the Interpolation algorithm is the same except for the above partition logic.
# Python3 program to implement
# interpolation search
# with recursion
# If x is present in arr[0..n-1], then
# returns index of it, else returns -1.
def interpolationSearch(arr, lo, hi, x):
# Since array is sorted, an element present
# in array must be in range defined by corner
if (lo <= hi and x >= arr[lo] and x <= arr[hi]):
# Probing the position with keeping
# uniform distribution in mind.
pos = lo + ((hi - lo) // (arr[hi] - arr[lo]) *
(x - arr[lo]))
# Condition of target found
if arr[pos] == x:
return pos
# If x is larger, x is in right subarray
if arr[pos] < x:
return interpolationSearch(arr, pos + 1,
hi, x)
# If x is smaller, x is in left subarray
if arr[pos] > x:
return interpolationSearch(arr, lo,
pos - 1, x)
return -1
# Driver code
# Array of items in which
# search will be conducted
arr = [10, 12, 13, 16, 18, 19, 20,
21, 22, 23, 24, 33, 35, 42, 47]
n = len(arr)
# Element to be searched
x = 18
index = interpolationSearch(arr, 0, n - 1, x)
if index != -1:
print("Element found at index", index)
else:
print("Element not found")
# This code is contributed by Hardik Jain
Time Complexity: O(log2(log2 n)) for the average case, and O(n) for the worst case
Another approach
This is the iteration approach for the interpolation search.
Step1: In a loop, calculate the value of “pos” using the probe position formula. Step2: If it is a match, return the index of the item, and exit. Step3: If the item is less than arr[pos], calculate the probe position of the left sub-array. Otherwise, calculate the same in the right sub-array. Step4: Repeat until a match is found or the sub-array reduces to zero. Below is the implementation of the algorithm.
# Python equivalent of above C++ code
# Python program to implement interpolation search by using iteration approach
def interpolationSearch(arr, n, x):
# Find indexes of two corners
low = 0
high = (n - 1)
# Since array is sorted, an element present
# in array must be in range defined by corner
while low <= high and x >= arr[low] and x <= arr[high]:
if low == high:
if arr[low] == x:
return low;
return -1;
# Probing the position with keeping
# uniform distribution in mind.
pos = int(low + (((float(high - low)/( arr[high] - arr[low])) * (x - arr[low]))))
# Condition of target found
if arr[pos] == x:
return pos
# If x is larger, x is in upper part
if arr[pos] < x:
low = pos + 1;
# If x is smaller, x is in lower part
else:
high = pos - 1;
return -1
# Main function
if __name__ == "__main__":
# Array of items on whighch search will
# be conducted.
arr = [10, 12, 13, 16, 18, 19, 20, 21,
22, 23, 24, 33, 35, 42, 47]
n = len(arr)
x = 18 # Element to be searched
index = interpolationSearch(arr, n, x)
# If element was found
if index != -1:
print ("Element found at index",index)
else:
print ("Element not found")
Time Complexity: O(log2(log2 n)) for the average case, and O(n) for the worst case