Back to blog View series: problem solving with dsa

~/blog

Interpolation Search

Jun 2, 2026•11 min read•By Mohammed Vasim

dsapythondata-structuresalgorithmsproblem-solving

Why Binary Search's Fixed Midpoint Is Not Always Optimal

Binary search always probes the middle of the current range. That works well when you know nothing about the data except that it is sorted. But if the values are uniformly distributed, the middle is rarely the best guess — the target's value tells you roughly where it should be.

A phonebook is sorted alphabetically. Searching for "Smith" by always flipping to the middle is wasteful. You flip near the end because S is near the end of the alphabet. Searching for "Baker" means flipping near the front. You are doing interpolation search intuitively — estimating position from value.

Example

Setup

Array: [1, 3, 7, 15, 31, 63, 127, 255, 511, 1023], target = 500.

The values grow exponentially — each is roughly double the previous. This is a deliberately non-uniform distribution. The probe formula assumes a linear relationship between value and position, so it will systematically underestimate (or overestimate) on this data, requiring more iterations.

The probe formula:

probe = left + ((target - arr[left]) * (right - left)) // (arr[right] - arr[left])

Step 1 — left=0, right=9

Variable	Value
target - arr[left]	500 - 1 = 499
arr[right] - arr[left]	1023 - 1 = 1022
right - left	9
probe	0 + (499 × 9) // 1022 = 4491 // 1022 = 4

arr[4] = 31. 31 < 500 → target is in the right half. Move left to probe + 1 = 5.

Step	left	right	probe	arr[probe]	arr[probe] vs target	Action	New range
1	0	9	4	31	31 < 500	left = 5	[5, 9]

The probe severely underestimated because the values grow exponentially — index 4 is only 31, but the target is 500. The linear formula assumed the values increase at a constant rate, which they do not.

Step 2 — left=5, right=9

Variable	Value
target - arr[left]	500 - 63 = 437
arr[right] - arr[left]	1023 - 63 = 960
right - left	4
probe	5 + (437 × 4) // 960 = 5 + 1748 // 960 = 5 + 1 = 6

arr[6] = 127. 127 < 500 → target is still to the right. left = 7.

Step	left	right	probe	arr[probe]	arr[probe] vs target	Action	New range
2	5	9	6	127	127 < 500	left = 7	[7, 9]

Step 3 — left=7, right=9

Variable	Value
target - arr[left]	500 - 255 = 245
arr[right] - arr[left]	1023 - 255 = 768
right - left	2
probe	7 + (245 × 2) // 768 = 7 + 490 // 768 = 7

arr[7] = 255. 255 < 500 → left = 8.

Step	left	right	probe	arr[probe]	arr[probe] vs target	Action	New range
3	7	9	7	255	255 < 500	left = 8	[8, 9]

Step 4 — left=8, right=9

Variable	Value
target - arr[left]	500 - 511 = -11
arr[right] - arr[left]	1023 - 511 = 512
right - left	1
probe	8 + (-11 × 1) // 512 = 8 + (-11) // 512 = 8

arr[8] = 511. 511 > 500 → target is to the left. right = probe - 1 = 7.

Step	left	right	probe	arr[probe]	arr[probe] vs target	Action	New range
4	8	9	8	511	511 > 500	right = 7	exited: left(8) > right(7)

The range is empty. Return -1.

Code

python

def interpolation_search(arr, target):
    left, right = 0, len(arr) - 1

    while left <= right and arr[left] <= target <= arr[right]:
        if left == right:
            return left if arr[left] == target else -1

        probe = left + ((target - arr[left]) * (right - left)) // (arr[right] - arr[left])

        if arr[probe] == target:
            return probe
        elif arr[probe] < target:
            left = probe + 1
        else:
            right = probe - 1

    return -1

Line by line:

while left <= right and arr[left] <= target <= arr[right]: — two guards. The range must be non-empty, and the target must be within the current value range. If the target is outside the min or max value in the current slice, it cannot exist. The second condition also prevents the probe from producing a position outside the bounds.
if left == right: — single element remaining. Check directly without computing a probe (which would be a self-assignment anyway).
probe = left + ((target - arr[left]) * (right - left)) // (arr[right] - arr[left]) — linear interpolation. Compute the fraction (target - arr[left]) / (arr[right] - arr[left]), scale by the index width, offset by left.
if arr[probe] == target: return probe — found.
elif arr[probe] < target: left = probe + 1 — target is to the right.
else: right = probe - 1 — target is to the left.

Edge case — arr[right] == arr[left]. This would cause division by zero. The while condition prevents it: when arr[left] == arr[right] and target is within that range, then either all values are the same and we check directly via the left == right guard, or the range is larger but values are equal, which the while loop handles by checking the target range first.

Edge case — duplicate values. Like binary search, returns one match. Use left-biased/right-biased variants to find boundaries.

Time: O(log log n) average for uniform distribution. O(n) worst case (exponential or pathological distributions). Space: O(1).

Code Trace

Tracing interpolation_search([1, 3, 7, 15, 31, 63, 127, 255, 511, 1023], 500):

Iter	left	right	Range valid?	arr[left] ≤ target ≤ arr[right]?	probe	arr[probe]	Comparison	Action
1	0	9	Yes	Yes (1 ≤ 500 ≤ 1023)	4	31	31 < 500	left = 5
2	5	9	Yes	Yes (63 ≤ 500 ≤ 1023)	6	127	127 < 500	left = 7
3	7	9	Yes	Yes (255 ≤ 500 ≤ 1023)	7	255	255 < 500	left = 8
4	8	9	Yes	Yes (511 ≤ 500 ≤ 1023)? No	—	—	—	Exit loop, return -1

Row 4 shows the exit condition: arr[left] (511) is greater than the target (500), so the second guard fails and the loop exits immediately. No probe is computed.

When Interpolation Search Matters

Binary search is safe — always O(log n), no distribution assumptions. Interpolation search is fast on uniform data but can degrade to O(n). The real-world lesson: knowing your data distribution lets you choose the right tool. For uniformly distributed, sorted, static datasets (sensor readings, statistical data, auto-generated IDs), interpolation search is measurably faster. For general-purpose searching, stick with binary search.

→ LeetCode 704: Binary Search — interpolation search works well on LeetCode's test cases since inputs are typically uniform.