← View series: problem solving with dsa
~/blog
Arrays
The Problem
You store a thousand temperature readings in a list and need the 500th one. You write readings[500] and get the value instantly. The operation looks trivial, but the speed depends on a detail most tutorials skip: how the array sits in memory.
If the data were scattered across arbitrary memory addresses (like a linked list), finding the 500th element would require walking through 499 pointers. The array avoids that by placing every element in a single contiguous block. The address of element i is computed directly — no traversal needed.
What an Array Is
An array is a contiguous block of memory divided into equal-sized slots. Each slot holds one element. The address of element i follows a fixed formula:
address of arr[i] = base_address + i × element_size
The CPU computes this in a single instruction. No traversal, no conditionals.
Example 1 — access by index. Suppose an array starts at memory address 0x1000 and stores 4-byte integers:
| Access | Formula | Address | Value |
|---|---|---|---|
arr[0] | 0x1000 + 0 × 4 | 0x1000 | 10 |
arr[1] | 0x1000 + 1 × 4 | 0x1004 | 20 |
arr[2] | 0x1000 + 2 × 4 | 0x1008 | 30 |
arr[3] | 0x1000 + 3 × 4 | 0x100C | 40 |
arr[5] | 0x1000 + 5 × 4 | 0x1014 | 60 |
Each access jumps directly to the address. There is no "search for the third slot" — the math gives you the exact location. Time: O(1). Space: O(1).
Example 2 — access with negative index. Python allows negative indexing: arr[-1] means "last element." The formula adjusts:
address of arr[-1] = base_address + (len(arr) - 1) × element_size
Address = 0x1000 + (6 - 1) × 4 = 0x1014. Same address as arr[5].
Example 3 — out of bounds. If you try arr[10] on a 6-element array, the formula gives 0x1000 + 10 × 4 = 0x1028. That address exists in memory — it just does not belong to your array. In C this is undefined behavior (reads whatever happens to be there). In Python it raises IndexError because the runtime checks the bound before computing.
arr = [10, 20, 30, 40, 50, 60]
print(arr[10]) # IndexError: list index out of rangeInsertion and the Cost of Shifting
Inserting an element at an arbitrary position requires shifting every element after that position one slot to the right. The array is contiguous — there is no "splice" operation in hardware.
Example 4 — insert at middle. Insert 99 at index 2 in [10, 20, 30, 40, 50]:
Before: [10, 20, 30, 40, 50, _ ] capacity = 6, size = 5
↑ index 2
Step 1: [10, 20, 30, 40, _, 50] shift arr[4] → arr[5]
Step 2: [10, 20, 30, _, 40, 50] shift arr[3] → arr[4]
Step 3: [10, 20, _, 30, 40, 50] shift arr[2] → arr[3]
Step 4: [10, 20, 99, 30, 40, 50] place 99 at index 2
Three elements shifted right. Each shift is a memory copy. For a 10,000-element list, inserting at the front copies all 10,000 elements.
Example 5 — insert at head. Insert 99 at index 0:
Before: [10, 20, 30, 40, 50, _ ]
↑ index 0
Step 1: [10, 20, 30, 40, _, 50] shift arr[4] → arr[5]
Step 2: [10, 20, 30, _, 40, 50] shift arr[3] → arr[4]
Step 3: [10, 20, _, 30, 40, 50] shift arr[2] → arr[3]
Step 4: [10, _, 20, 30, 40, 50] shift arr[1] → arr[2]
Step 5: [10, 10, 20, 30, 40, 50] shift arr[0] → arr[1]
Step 6: [99, 10, 20, 30, 40, 50] place 99 at index 0
Every element shifts one position right. This is the worst case.
Example 6 — insert at tail (append). Insert 99 at index 5 (the end):
Before: [10, 20, 30, 40, 50, _ ]
↑ index 5
Step 1: [10, 20, 30, 40, 50, 99] place 99 at index 5 directly
Zero shifts. This is the best case and the reason append() is O(1) amortized.
Code: insert_at
def insert_at(arr, index, value):
for i in range(len(arr) - 1, index, -1):
arr[i] = arr[i - 1]
arr[index] = valueAssume arr has capacity for the extra element (the resize step is covered in the next two posts).
Line-by-line walkthrough:
for i in range(len(arr) - 1, index, -1)— iterate from the last occupied index down to just after the target index. The step-1means moving backward through the list. The loop excludesindexitself because that slot will receive the new value directly.arr[i] = arr[i - 1]— copy each element one position to the right. The element atioverwrites the one ati + 1(which was already shifted in the previous iteration).arr[index] = value— the target position is now free because its old occupant was shifted right.
Code trace — insert_at([10, 20, 30, 40, 50, None], 2, 99):
Assume arr has length 6 (capacity 6, size 5). The loop runs backwards from index 5 to 3:
| i | Before assignment | arr[i] = arr[i-1] | After |
|---|---|---|---|
| 5 | [10, 20, 30, 40, 50, None] | arr[5] = arr[4] = 50 | [10, 20, 30, 40, 50, 50] |
| 4 | [10, 20, 30, 40, 50, 50] | arr[4] = arr[3] = 40 | [10, 20, 30, 40, 40, 50] |
| 3 | [10, 20, 30, 40, 40, 50] | arr[3] = arr[2] = 30 | [10, 20, 30, 30, 40, 50] |
Loop ends at i = 3. Then arr[2] = 99:
Final: [10, 20, 99, 30, 40, 50]
Each element from index 2 onward moved right by one. The old value at index 2 (30) is now at index 3.
Time: O(n) — every element after the insertion point shifts. In the worst case (insert at head), all n elements move. Space: O(1) — no extra allocation.
Time: O(n) — every element after the insertion point shifts. In the worst case (insert at head), all n elements move. Space: O(1) — no extra allocation.
Deletion and Shifting Left
Deleting an element shifts everything after it one position left. The same O(n) cost, mirrored.
Example 7 — delete at middle. Delete at index 1 in [10, 20, 30]:
Before: [10, 20, 30]
↑ delete
Step 1: [10, 30, 30] shift arr[2] → arr[1], size = 3
Step 2: [10, 30, _ ] size becomes 2, last slot orphaned
Example 8 — delete at tail. Delete at index 2 (last):
Before: [10, 20, 30]
↑ delete
Step 1: [10, 20, _ ] shift range is empty, size becomes 2
Zero shifts. This is the best case.
Code: delete_at
def delete_at(arr, index):
removed = arr[index]
for i in range(index, len(arr) - 1):
arr[i] = arr[i + 1]
arr[len(arr) - 1] = None
return removedLine-by-line walkthrough:
removed = arr[index]— save the value before overwriting.for i in range(index, len(arr) - 1)— iterate from target index up to second-to-last.arr[i] = arr[i + 1]— copy each element one position left.arr[len(arr) - 1] = None— the last slot still holds the last element's old value. Clearing it helps garbage collection (and prevents logical errors if the slot is read).
Code trace — delete_at([10, 20, 30, 40, 50, None], 1):
Assume arr has length 6 with size 5. The fifth slot holds 50, the sixth is empty.
| i | Before assignment | arr[i] = arr[i+1] | After |
|---|---|---|---|
| 1 | [10, 20, 30, 40, 50, None] | arr[1] = arr[2] = 30 | [10, 30, 30, 40, 50, None] |
| 2 | [10, 30, 30, 40, 50, None] | arr[2] = arr[3] = 40 | [10, 30, 40, 40, 50, None] |
| 3 | [10, 30, 40, 40, 50, None] | arr[3] = arr[4] = 50 | [10, 30, 40, 50, 50, None] |
| 4 | [10, 30, 40, 50, 50, None] | arr[4] = arr[5] = None | [10, 30, 40, 50, None, None] |
Loop ends. Then arr[5] = None (already None). Returns 20.
Final: [10, 30, 40, 50, None, None] size = 4
The three elements after index 1 each shifted one position left. The duplicate 50 at the end was cleaned up by setting the last slot to None.
Time: O(n) — every element after the deletion point shifts left. Space: O(1).
Searching by Value
Searching an unsorted array requires checking every element from left to right until a match is found.
Example 9 — target found. Search for 40 in [10, 80, 30, 40, 50]:
| Index | arr[i] | arr[i] == 40? |
|---|---|---|
| 0 | 10 | No |
| 1 | 80 | No |
| 2 | 30 | No |
| 3 | 40 | Yes → return 3 |
Example 10 — target not found. Search for 99 in the same array:
| Index | arr[i] | arr[i] == 99? |
|---|---|---|
| 0 | 10 | No |
| 1 | 80 | No |
| 2 | 30 | No |
| 3 | 40 | No |
| 4 | 50 | No |
| — | — | Loop ends → return -1 |
Code: linear_search
def linear_search(arr, target):
for i in range(len(arr)):
if arr[i] == target:
return i
return -1Line-by-line walkthrough:
for i in range(len(arr))— iterate over every valid index from 0 tolen(arr) - 1.if arr[i] == target— compare the element at the current index with the target.return i— the value was found at this index; return immediately.return -1— the loop completed without finding the target.
Code trace — linear_search([10, 80, 30, 40, 50], 40):
| i | arr[i] | Comparison | Action |
|---|---|---|---|
| 0 | 10 | 10 == 40? No | Continue |
| 1 | 80 | 80 == 40? No | Continue |
| 2 | 30 | 30 == 40? No | Continue |
| 3 | 40 | 40 == 40? Yes | Return 3 |
Two more iterations than binary search would need (binary search finds it in at most 3 comparisons for 5 elements, but requires sorted input).
Time: O(n) — every element compared in the worst case. Space: O(1).
Why Arrays Are Cache-Friendly
When you access arr[0], the CPU loads a cache line — typically 64 bytes — that contains arr[0] through arr[15] (for 4-byte integers). Accessing arr[1] immediately after hits the cache instead of main memory. For linked lists, each node could be anywhere in memory; every access is a potential cache miss.
This is why arrays often outperform linked lists for sequential traversal, even though both are O(n). The constants differ by an order of magnitude.
Closing
The array is the data structure every other structure is compared against. Its strength — O(1) random access via a cheap address computation — comes from contiguous memory. Its weakness — O(n) insert and delete — comes from the same contiguity. Every element that moves costs a copy.
Understanding this tradeoff at the memory level is what separates knowing arr[i] works from knowing why it works and when it does not.