Why Arrays Start at Index 0: A Memory-Level Explanation
Source: Dev.to
📌 Table of Contents
- Arrays as Contiguous Memory Blocks
- How arr[i] Works: Pointer Arithmetic Explained
- Why This Forces Indexing to Start at 0
- What If Arrays Started at Index 1?
- Why arr[i] and i[arr] Mean the Same Thing
- Conclusion
Arrays as Contiguous Memory Blocks
At its core, an array in C/C++ is a fixed‑size collection of elements of the same type, stored in contiguous memory locations. When you declare
int arr[100];
the compiler allocates space for 100 consecutive integers.
On most modern systems:
- An int typically occupies 4 bytes (on 32‑ or 64‑bit architectures).
- The whole array therefore consumes 400 bytes, laid out back‑to‑back in memory.
How arr[i] Works: Pointer Arithmetic Explained
The real reason arrays start at index 0 has nothing to do with counting or convention. It comes from how the language defines array subscripting.
When you write arr[i], the language defines it as *(arr + i).
This is not an implementation detail; it is part of the C/C++ language definition.
Breaking down *(arr + i)
| Part | Meaning |
|---|---|
| arr | The base address of the array (i.e., &arr[0]). |
| + i | Pointer arithmetic: adds i × sizeof(element_type) bytes, not i bytes. |
| * | Dereferences the computed address to read or write the value. |
So arr[i] literally means: go i elements away from the start of the array, then access the value stored there.
Example
#include <stdio.h>
int main(void) {
    int arr[] = {10, 20, 30, 40};

    // Direct array access
    printf("arr[1]: %d\n", arr[1]);          // Output: 20

    // Equivalent pointer version
    printf("*(arr + 1): %d\n", *(arr + 1));  // Same output: 20

    return 0;
}
Both statements print 20 because they are exactly the same operation.
Why This Forces Indexing to Start at 0
The first element isn’t “one step away” – it lives at the base address itself.
- Distance from base address = 0
- Offset = 0
- Index = 0
Hence the first element is accessed as:
arr[0] == *(arr + 0) // no adjustment needed
Each subsequent element is reached by moving forward in memory:
arr[1] == *(arr + 1) // skip 1 element (4 bytes for int)
arr[2] == *(arr + 2) // skip 2 elements (8 bytes)
An index is simply an offset measured in elements. Offsets start at 0 because nothing can be closer than zero distance from the origin. This follows directly from how memory addressing and pointer arithmetic work.
What If Arrays Started at Index 1?
Suppose C arrays were 1‑based, as arrays are in MATLAB, so that the first element was accessed as arr[1].
Pointer arithmetic itself would not change; arr[i] would still translate to *(arr + i). Applying this rule directly would give:
arr[1] → *(arr + 1) // points to the *second* element, not the first
To make 1‑based indexing work, the compiler would need to rewrite every access as:
arr[i] → *(arr + (i - 1))
The extra subtraction introduces a semantic mismatch with the hardware’s natural “base + offset” addressing model, complicates bounds reasoning, and obscures the simple “offset from base” mental model. While modern compilers could optimise the subtraction away, the added conceptual step is unnecessary.
Why arr[i] and i[arr] Mean the Same Thing
The C standard defines the subscript operator as:
a[b] == *(a + b)
Because addition is commutative (a + b == b + a), we also have:
*(a + b) == *(b + a)
Therefore:
a[b] == b[a]
This is not a trick or undefined behaviour; it is a direct consequence of the language definition.
Demonstration
#include <stdio.h>
int main(void) {
    int arr[5] = {1, 2, 3, 4, 5};

    // Normal array indexing
    printf("arr[3] = %d\n", arr[3]);  // Output: 4

    // Equivalent but unusual indexing
    printf("3[arr] = %d\n", 3[arr]);  // Output: 4

    return 0;
}
Both statements print 4 because arr[3] and 3[arr] compute the same address.
Conclusion
- Zero‑based indexing aligns perfectly with the way C/C++ define array subscripting (*(base + offset)).
- The first element resides at the base address, so its offset is 0.
- One‑based indexing would require an extra -1 adjustment for every access, adding unnecessary complexity.
- The definition a[b] == *(a + b) also explains why arr[i] and i[arr] are interchangeable.
Understanding these low‑level details clarifies why the seemingly odd choice of starting array indices at 0 is actually the most natural and efficient one for the language and the underlying hardware.
#include <stdio.h>
int main(void) {
    int arr[] = {1, 2, 3, 4};
    int i = 3;

    printf("%d\n", i[arr]);  // Output: 4

    return 0;
}
NOTE: While i[arr] is valid C, it is rarely used in real code because it hurts readability. It exists only because array indexing is defined in terms of pointer arithmetic.
Conclusion
In C/C++, array indexing is not about counting positions. It is about measuring offsets from a base address.