← View series: problem solving with dsa
~/blog
Binary Search
Given a sorted array, find whether a target value exists and return its index. If it does not exist, return -1.
Because the array is sorted, you do not need to check every element. Compare with the middle element: if it matches, done. If the target is smaller, the entire right half is irrelevant. If larger, the entire left half is irrelevant. Every comparison eliminates half the remaining array.
Example
Tracing [2, 5, 8, 12, 16, 23, 38], target = 23.
We maintain two pointers: left (start of the current range) and right (end of the current range). At each step we compute mid = (left + right) // 2 and compare arr[mid] with the target.
Step 1 — initial range: left=0, right=6
mid = (0 + 6) // 2 = 3. arr[3] = 12. Is 12 == 23? No. Is 12 < 23? Yes. So the target must be in the right half. Move left to mid + 1 = 4.
| Step | left | right | mid | arr[mid] | arr[mid] vs target | Action | New range |
|---|---|---|---|---|---|---|---|
| 1 | 0 | 6 | 3 | 12 | 12 < 23 | left = 4 | [4, 6] |
Step 2 — range: left=4, right=6
mid = (4 + 6) // 2 = 5. arr[5] = 23. Is 23 == 23? Yes. Found at index 5.
| Step | left | right | mid | arr[mid] | arr[mid] vs target | Action |
|---|---|---|---|---|---|---|
| 2 | 4 | 6 | 5 | 23 | 23 == 23 | Return 5 |
Two comparisons for 7 elements. The remaining 5 elements (2, 5, 8, 12, 16 from the left half and 38 from the right) were never examined — we proved they cannot contain the target.
Why this works: Because the array is sorted. If arr[mid] < target, then every element from left through mid is also < target (since they are ≤ arr[mid]). None of them can be the target. Eliminating them in one comparison instead of checking each individually is the source of the speedup.
Code
def binary_search(arr, target):
left, right = 0, len(arr) - 1
while left <= right:
mid = (left + right) // 2
if arr[mid] == target:
return mid
elif arr[mid] < target:
left = mid + 1
else:
right = mid - 1
return -1Line by line:
left, right = 0, len(arr) - 1— the current search range. Both bounds are inclusive.leftstarts at the first element,rightat the last. If the array is empty,right = -1.while left <= right:— continue while the range is non-empty. The<=(not<) is critical: whenleft == right, there is exactly one element left to check. Using<would skip that last element and miss the target.mid = (left + right) // 2— integer division floors, giving the left-of-center midpoint for even-length ranges.if arr[mid] == target: return mid— found. Exit.elif arr[mid] < target: left = mid + 1— target is in the right half.mid + 1becausearr[mid]was already checked and is too small.else: right = mid - 1— target is in the left half.mid - 1becausearr[mid]was checked.return -1— the while conditionleft <= rightbecame false. The target is not in the array.
Edge case — single element. left=0, right=0. The loop runs once. mid=0. Checks arr[0]. Correct.
Edge case — target not found. The range shrinks until left > right. For example, searching for 1 in [2, 5, 8, 12, 16, 23, 38]: step 1 (mid=3, 12 > 1) → right=2; step 2 (mid=1, 5 > 1) → right=0; step 3 (mid=0, 2 > 1) → right=-1. Loop exits, returns -1.
Time: O(log n) — the range halves with each iteration. Space: O(1).
Code Trace
Tracing binary_search([2, 5, 8, 12, 16, 23, 38], 8):
| Iteration | left | right | mid | arr[mid] | Comparison | Action | New range |
|---|---|---|---|---|---|---|---|
| 1 | 0 | 6 | 3 | 12 | 12 > 8 | right = 2 | [0, 2] |
| 2 | 0 | 2 | 1 | 5 | 5 < 8 | left = 2 | [2, 2] |
| 3 | 2 | 2 | 2 | 8 | 8 == 8 | Return 2 | — |
For n = 7, at most 3 comparisons (log₂(8) = 3). Linear search would need up to 7.
Not-found trace — same array, target = 1:
| Iteration | left | right | mid | arr[mid] | Comparison | Action | New range |
|---|---|---|---|---|---|---|---|
| 1 | 0 | 6 | 3 | 12 | 12 > 1 | right = 2 | [0, 2] |
| 2 | 0 | 2 | 1 | 5 | 5 > 1 | right = 0 | [0, 0] |
| 3 | 0 | 0 | 0 | 2 | 2 > 1 | right = -1 | empty |
| — | 0 | -1 | — | — | left > right | return -1 | — |
Three comparisons to exhaust 7 elements and confirm the target is absent.
Finding the First and Last Occurrence
When the array contains duplicates, standard binary search returns one match — but not necessarily the first. To find the first occurrence, modify the search to continue left after finding a match.
def binary_search_first(arr, target):
left, right = 0, len(arr) - 1
result = -1
while left <= right:
mid = (left + right) // 2
if arr[mid] == target:
result = mid
right = mid - 1
elif arr[mid] < target:
left = mid + 1
else:
right = mid - 1
return resultInstead of returning immediately on match, this variant records the match (result = mid) and continues searching left (right = mid - 1). The last recorded match before the range empties is the leftmost occurrence.
A symmetric change (continue right after match) finds the last occurrence.
Time: Still O(log n). Space: O(1).
The sorted-array requirement feels like a limitation until you realize how often it holds. Database indexes, sorted files, configuration tables — sorted data is common precisely because binary search makes it worthwhile. The ten-line implementation above is doing the same fundamental operation as every B-tree lookup you've triggered.
→ LeetCode 704: Binary Search — the standard problem. LeetCode 34: Find First and Last Position of Element in Sorted Array — applies the first/last variant.