Searching, Sorting, and Timing
Agenda
- Timing (Empirical runtime analysis)
- Prelude: Timing list indexing
- Linear search
- Binary search
- Insertion sort
- Bubble sort
1. Timing
import time
print(time.time())
1612976291.3405042
t1 = time.time()
time.sleep(1)
t2 = time.time()
print(t2 - t1)
1.0015649795532227
2. Prelude: Timing list indexing
lst = [0] * 10**5
import timeit
print(timeit.timeit(stmt='lst[0]', globals=globals()))
0.03113916900474578
print(timeit.timeit(stmt='lst[10**5-1]', globals=globals()))
0.03360662600607611
print('lst[{}]'.format(1))
lst[1]
times = [timeit.timeit(stmt='lst[{}]'.format(i),
                       globals=globals(),
                       number=1000)
         for i in range(10**5)]
times[:10]
[3.1036994187161326e-05,
2.981899888254702e-05,
3.060800372622907e-05,
2.9064001864753664e-05,
3.014199319295585e-05,
3.0212002457119524e-05,
2.9727991204708815e-05,
2.9686998459510505e-05,
2.9867005650885403e-05,
3.0117997084744275e-05]
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot(times, 'bo')
[<matplotlib.lines.Line2D at 0x1091ebd60>]
Observation: accessing an element in a list by index takes a constant amount of time, regardless of position.
How? A Python list uses an array as its underlying data storage mechanism. To access an element in an array, the interpreter:
- Computes an offset into the array by multiplying the element’s index by the size of one array entry (all entries are the same size, since they are merely references to the actual elements)
- Adds the offset to the base address of the array
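The two steps above can be sketched in a few lines. This only illustrates the arithmetic; it is not how the interpreter is actually written. `entry_address` and `base_address` are hypothetical names, and the default entry size assumes each entry is one pointer on the current build.

```python
import struct

# Size in bytes of one pointer on this build (8 on a typical 64-bit system).
POINTER_SIZE = struct.calcsize('P')

def entry_address(base_address, index, entry_size=POINTER_SIZE):
    """Address of the entry at `index`, given the array's base address."""
    offset = index * entry_size      # step 1: compute the offset
    return base_address + offset     # step 2: add it to the base address

# One multiplication and one addition, whatever the index.
print(entry_address(1000, 0, entry_size=8))   # 1000
print(entry_address(1000, 5, entry_size=8))   # 1040
```

The amount of work does not depend on `index`, which is consistent with the flat timing plot.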
3. Linear Search
Task: to locate an element with a given value in a list (array).
def lindex(lst, x):
    # Check each position in turn, front to back.
    for i in range(len(lst)):
        if x == lst[i]:
            return i
    # x is not in the list.
    return -1
lst = list(range(100))
lindex(lst, 10)
10
lindex(lst, 99)
99
lindex(lst, -2)
-1
import timeit
lst = list(range(1000))
ltimes = [timeit.timeit(stmt='lindex(lst, {})'.format(x),
                        globals=globals(),
                        number=100)
          for x in range(1000)]
import matplotlib.pyplot as plt
plt.plot(ltimes, 'ro')
[<matplotlib.lines.Line2D at 0x110299ca0>]
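One way to read the plot: a linear search examines entries 0 through i before it finds the value stored at index i, so the work grows in step with the position. The sketch below counts comparisons rather than seconds. `count_comparisons` is a hypothetical helper written for this illustration, not part of the notebook.

```python
def count_comparisons(lst, x):
    """Return how many == tests a linear scan performs while looking for x."""
    comparisons = 0
    for value in lst:
        comparisons += 1
        if value == x:
            break
    return comparisons

data = list(range(1000))
print(count_comparisons(data, 0))     # 1
print(count_comparisons(data, 999))   # 1000
print(count_comparisons(data, -2))    # 1000 (value absent, every entry checked)
```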
4. Binary search
Task: to locate an element with a given value in a list (array) whose contents are sorted in ascending order.
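def index(lst, x):
    def binsearch_rec(lst, x, l, h):
        # Search the half-open range [l, h): l is included, h is not.
        if l >= h:
            return -1                       # empty range: x is absent
        mid = ((h - l) // 2) + l
        if lst[mid] == x:
            return mid
        if lst[mid] < x:
            return binsearch_rec(lst, x, mid + 1, h)   # keep the right half
        # h is exclusive, so the left half is [l, mid), not [l, mid - 1)
        return binsearch_rec(lst, x, l, mid)
    return binsearch_rec(lst, x, 0, len(lst))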
print(index(lst, 999))
print(index(lst, -1))
import timeit
lst = list(range(1000))
times = [timeit.timeit(stmt='index(lst, {})'.format(x),
                       globals=globals(),
                       number=1000)
         for x in range(1000)]
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot(times, 'ro')
plt.show()
def iter_index(lst, x):
    l = 0
    h = len(lst)          # search the half-open range [l, h)
    while h > l:
        mid = ((h - l) // 2) + l
        if lst[mid] == x:
            return mid
        if lst[mid] < x:
            l = mid + 1   # discard the left half and mid
        else:
            h = mid       # h is exclusive, so mid (not mid - 1) is the new bound
    return -1
import timeit
iter_no_times = []
for size in range(1000, 100000, 100):
    lst = list(range(size))
    iter_no_times.append(timeit.timeit(stmt='iter_index(lst, -1)',
                                       globals=globals(),
                                       number=1000))
import matplotlib.pyplot as plt
plt.plot(iter_no_times, 'ro')
[<matplotlib.lines.Line2D at 0x111dcb0a0>]
import timeit
etimes = []
for e in range(5, 20):
    lst = list(range(2**e))
    etimes.append(timeit.timeit(stmt='iter_index(lst, -1)',
                                globals=globals(),
                                number=100000))
import matplotlib.pyplot as plt
plt.plot(etimes, 'ro')
plt.show()
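Plotting against the exponent e, rather than the size 2**e, is what makes the trend easy to see: each doubling of the list adds one more halving step. The sketch below counts loop iterations instead of measuring time. It assumes a search over a half-open range [l, h) for a value smaller than every entry, as in the searches for -1 above; `count_steps` is a hypothetical helper.

```python
def count_steps(size):
    """Count the loop iterations of a binary search over `size` entries
    when the target is smaller than every entry (so it is never found)."""
    l, h, steps = 0, size, 0
    while h > l:
        steps += 1
        mid = ((h - l) // 2) + l
        h = mid              # the target always lies to the left of mid
    return steps

for e in range(5, 10):
    print(2**e, count_steps(2**e))
# 32 6
# 64 7
# 128 8
# 256 9
# 512 10
```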
5. Insertion sort
Task: to sort the values in a given list (array) in ascending order.
import random
lst = list(range(1000))
random.shuffle(lst)
plt.plot(lst, 'ro')
plt.show()
def insertion_sort(lst):
    for i in range(1,len(lst)): # number of times? n-1
        for j in range(i,0,-1): # number 1, 2, 3, 4, ..., n-1
            if lst[j] <= lst[j-1]:
                lst[j-1], lst[j] = lst[j], lst[j-1]
            else:
                break
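The comments on the two loops point at the worst case: the outer loop runs n-1 times, and the inner loop can run 1, 2, ..., n-1 times, for a total of n(n-1)/2 swaps. A small sketch to check this on a reversed list follows. `count_swaps` is a hypothetical instrumented copy of the sort, written for this illustration only.

```python
def count_swaps(values):
    """Insertion-sort a copy of `values` and return the number of swaps made."""
    lst = list(values)
    swaps = 0
    for i in range(1, len(lst)):
        for j in range(i, 0, -1):
            if lst[j] < lst[j-1]:
                lst[j-1], lst[j] = lst[j], lst[j-1]
                swaps += 1
            else:
                break
    return swaps

n = 100
print(count_swaps(range(n, 0, -1)))   # reversed input (worst case): 4950
print(n * (n - 1) // 2)               # 4950
print(count_swaps(range(n)))          # already sorted: 0
```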
insertion_sort(lst)
plt.plot(lst, 'ro')
plt.show()
import timeit
import random
times = [timeit.timeit(stmt='insertion_sort(lst)',
                       setup='lst=list(range({})); random.shuffle(lst)'.format(size),
                       globals=globals(),
                       number=1)
         for size in range(100, 5000, 250)]
plt.plot(times, 'ro')
plt.show()
6. Bubble sort
Another simple sort algorithm is Bubble sort. This algorithm
def bubble_sort(lst): pass
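The stub above is left unimplemented. As a rough sketch of one common formulation (an assumption, not necessarily the version intended here): sweep the list repeatedly, swapping adjacent out-of-order pairs, and stop once a sweep makes no swaps. The name `bubble_sort_sketch` is hypothetical.

```python
import random

def bubble_sort_sketch(lst):
    """Sort lst in place in ascending order by swapping adjacent pairs."""
    n = len(lst)
    for end in range(n - 1, 0, -1):
        swapped = False
        # After each sweep the largest remaining value sits at index `end`.
        for j in range(end):
            if lst[j] > lst[j+1]:
                lst[j], lst[j+1] = lst[j+1], lst[j]
                swapped = True
        if not swapped:
            break            # no swaps in a full sweep: already sorted

data = list(range(20))
random.shuffle(data)
bubble_sort_sketch(data)
print(data == list(range(20)))   # True
```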