Python and Memory Management, Part 1: Objects

Published: (February 7, 2026 at 04:06 PM EST)
6 min read
Source: Dev.to

Value Equality – ==

== checks whether two objects contain the same data. When you write a == b, Python compares the values stored in the objects. This may involve calling special methods like __eq__.

# == compares values
a = [1, 2, 3]
b = [1, 2, 3]
print(a == b)          # True – same content

Under the hood, == can trigger more complicated operations:

  • For lists, it compares each element.
  • For custom objects, it calls the object’s __eq__ method.

class SensorValue:
    def __init__(self, value, tolerance):
        self.value = int(value)
        self.tolerance = int(tolerance)

    def __eq__(self, other):
        # Equal when the two readings differ by no more than the tolerance
        return abs(self.value - other.value) <= self.tolerance
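
A tolerance-based __eq__ can be exercised as in the sketch below. The class name TolerantReading is hypothetical and not from the original article. It additionally returns NotImplemented for foreign types, which is a common defensive pattern rather than something the article prescribes.

```python
class TolerantReading:
    """Hypothetical variant of SensorValue with a defensive __eq__."""

    def __init__(self, value, tolerance):
        self.value = int(value)
        self.tolerance = int(tolerance)

    def __eq__(self, other):
        # Decline to compare with unrelated types instead of raising AttributeError
        if not isinstance(other, TolerantReading):
            return NotImplemented
        return abs(self.value - other.value) <= self.tolerance


a = TolerantReading(100, 2)
b = TolerantReading(101, 2)
c = TolerantReading(110, 2)

print(a == b)      # True: readings differ by 1, within the tolerance of 2
print(a == c)      # False: readings differ by 10
print(a == "100")  # False: both sides decline, so Python falls back to identity
print(a is b)      # False: equal by value, but two distinct objects
```

Note that equality within a tolerance is not transitive, and a class that defines __eq__ without __hash__ becomes unhashable, so objects like these are best kept out of sets and dict keys.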

Note (Python 3.14+) – You can query whether an object is immortal via sys._is_immortal(obj) (a private API).
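
A minimal sketch of how that check might be wrapped so that it also runs on older interpreters. The helper name is_immortal is an assumption made for illustration. Because sys._is_immortal is private, it may change or disappear without notice.

```python
import sys


def is_immortal(obj):
    """Return True/False where the check exists, or None when it is unavailable."""
    check = getattr(sys, "_is_immortal", None)  # private API, CPython 3.14+
    if check is None:
        return None
    return check(obj)


for candidate in (None, 1, [1, 2, 3]):
    print(f"{candidate!r}: {is_immortal(candidate)}")
```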

7. Recap

Operator / Function | What it checks | Typical use case
== | Value equality (__eq__) | Comparing data
is | Identity (same memory address) | Singleton checks (is None)
id() | Object’s “address” (integer) | Debugging / introspection

Understanding these distinctions reveals why:

  • Small integers are cached and appear to be the same object.
  • Large integers are only shared when you explicitly bind them to multiple names.
  • Reference counting drives memory reclamation, but immortal objects bypass it.
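
The first two points can be observed with a short experiment. This is a sketch that relies on CPython's small-integer cache (commonly -5 through 256); other implementations may behave differently. The helper same_object is a name invented for this example, and the integers are parsed at run time so that compile-time constant sharing does not hide the effect.

```python
def same_object(text):
    """Parse the same digits twice and report whether both results are one object."""
    first = int(text)
    second = int(text)
    return first is second


print(same_object("100"))      # True on CPython: 100 lives in the small-integer cache
print(same_object("100000"))   # False on CPython: two separate int objects

big = int("100000")
alias = big                    # binding a second name shares the object
print(alias is big)            # True
```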

Armed with this knowledge, you can write more predictable, memory‑aware Python code—and avoid the subtle bugs that arise from confusing equality with identity.

String Interning

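Python performs a special optimisation for some strings: it keeps a single cached copy of each, known as an interned string, so that equal strings can share one object. Strings the interpreter interns on its own can behave like “immortal” objects that are never de‑allocated while the interpreter runs. Strings interned explicitly with sys.intern() are, according to its documentation, not immortal, so you must keep a reference to the returned string to benefit. See the official description in the sys.intern documentation.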

Note – Python implicitly interns a number of strings (e.g., identifiers, short literals). Because of this, the is operator can give surprising results when comparing strings that look equal but are not the same object.
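
A small sketch of explicit interning with sys.intern. The helper build is hypothetical; it assembles the strings at run time so that the compiler has no chance to intern or share them first. The identity results in the comments assume CPython.

```python
import sys


def build(parts):
    """Assemble a string at run time so it is not shared as a compile-time constant."""
    return "_".join(parts)


a = build(["sensor", "reading", "42"])
b = build(["sensor", "reading", "42"])

print(a == b)    # True: same characters
print(a is b)    # False on CPython: two separate string objects

a_interned = sys.intern(a)
b_interned = sys.intern(b)
print(a_interned is b_interned)   # True: both names refer to the single interned copy
```

This is why == is the right operator for comparing string contents: whether is succeeds depends on whether both strings happen to be interned.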

Small‑integer memory layout

In CPython, id() returns the memory address of an object. When we look at the addresses of consecutive small integers, we see a regular pattern:

for i in range(5):
    print(f"id({i+1}) - id({i}) = {id(i+1) - id(i)}")

The difference is 32 bytes. Is that the size of a Python integer object?

import sys
print(f"{sys.getsizeof(1)=}")      # sys.getsizeof(1)=28

  • sys.getsizeof(1) reports 28 bytes – the actual size of the object data.
  • The 32‑byte gap you see when printing id differences comes from the way CPython allocates small integers in a contiguous array; each slot occupies 32 bytes, even though only 28 bytes are used for the object itself.
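
To see how the object grows as more digits are needed, one can print sizes for values that cross digit boundaries. This is a sketch: the exact byte counts depend on the build, so no output is claimed here. It assumes only that sys.int_info reports the digit width, which CPython documents.

```python
import sys

bits = sys.int_info.bits_per_digit   # 30 on typical 64-bit CPython builds
print(f"bits per digit: {bits}")

# Each step of `bits` in the exponent pushes the value into one more internal digit
sizes = []
for k in range(4):
    n = 2 ** (k * bits)
    size = sys.getsizeof(n)
    sizes.append(size)
    print(f"2**{k * bits}: {size} bytes")
```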

What occupies those 32 bytes?

Offset | Size | Meaning
0 – 7 | 8 B | Reference count (ob_refcnt). For “immortal” small integers this contains the magic value 0xC0000000.
8 – 15 | 8 B | Pointer to the type object (ob_type).
16 – 23 | 8 B | Size field – the number of digits that make up the integer (Python integers have arbitrary precision).
24 – 27 | 4 B | The first digit (ob_digit[0]). Larger numbers allocate additional digits.
28 – 31 | 4 B | Padding for 8‑byte alignment.

CPython object layout (C level)

Every Python object starts with a PyObject header:

typedef struct _object {
    Py_ssize_t ob_refcnt;   // reference count (or immortal flag)
    PyTypeObject *ob_type; // pointer to type object
} PyObject;
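
The header layout can be probed from Python with ctypes. The sketch below is an illustration under stated assumptions, not a supported technique. It assumes a default (non-free-threaded, non-debug) CPython build where ob_type directly follows a pointer-sized ob_refcnt field, and it returns None when those assumptions are known not to hold. The helper name type_pointer_matches is invented for this example.

```python
import ctypes
import sys
import sysconfig


def type_pointer_matches(obj):
    """Compare the header's second field with id(type(obj)).

    Returns None where the simple two-field layout is not assumed to hold.
    """
    if sys.implementation.name != "cpython":
        return None
    if sysconfig.get_config_var("Py_GIL_DISABLED"):   # free-threaded build
        return None
    offset = ctypes.sizeof(ctypes.c_ssize_t)          # skip over ob_refcnt
    stored = ctypes.c_void_p.from_address(id(obj) + offset).value
    return stored == id(type(obj))


sample = [1, 2, 3]
print(type_pointer_matches(sample))
```

Reading raw memory like this is only suitable for exploration; production code should use type() and sys.getrefcount() instead.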

Integer (long) object definition

In CPython source (cpython/Include/cpython/longintrepr.h):

typedef struct _PyLongValue {
    uintptr_t lv_tag;          // number of digits, sign, flags
    digit ob_digit[1];         // array of 30‑bit “digits” (uint32)
} _PyLongValue;

struct _longobject {           // the public integer object
    PyObject_HEAD               // expands to the PyObject header above
    _PyLongValue long_value;   // the actual numeric data
};

Every object carries this header: the reference count drives memory reclamation, and the type pointer tells the interpreter what kind of object it is dealing with.

Reference counting

At the heart of CPython’s memory management is reference counting. Each object stores a counter that tracks how many references point to it. When the counter reaches zero, the memory is released.

You can observe the reference count with sys.getrefcount() (the function adds one extra reference for the argument you pass).

import sys

x = [1, 2, 3]
print(f"Initial refcount: {sys.getrefcount(x)=}")   # 2 (x + argument)

y = x
print(f"Assigning y <- x: {sys.getrefcount(x)=}")   # 3

z = x
print(f"Assigning z <- x: {sys.getrefcount(x)=}")   # 4

del y
print(f"De‑assigning y: {sys.getrefcount(x)=}")     # 3

z = None
print(f"De‑assigning z: {sys.getrefcount(x)=}")     # 2

del x
print("Deleted x")

When the reference count drops to 0, CPython immediately frees the object.
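
One way to watch that moment is to attach a finalizer with weakref.finalize, which calls a function when the object is destroyed. This is a sketch: the Resource class and the events list are hypothetical names used only for the demonstration, and the ordering shown relies on CPython's reference counting.

```python
import weakref


class Resource:
    """Hypothetical placeholder object used only for this demonstration."""


events = []
resource = Resource()
weakref.finalize(resource, events.append, "freed")   # runs when the object dies

alias = resource
del resource
events.append("deleted first name")    # object is still alive through alias

del alias
events.append("deleted last name")     # the finalizer has already run by now

print(events)   # ['deleted first name', 'freed', 'deleted last name']
```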

Weak references and circular references

A weak reference does not increase the reference count, allowing objects to be reclaimed even if a weak reference to them still exists.

import sys
import weakref

class Node:
    def __init__(self, child):
        self.child = child

a = Node(None)          # refcnt = 1
b = Node(None)

a_ref = weakref.ref(a)   # weak reference, no refcnt increase
a.child = b
b.child = a              # creates a cycle, refcnt = 2 for each

print(f"All refs a, b, and a_ref assigned: {sys.getrefcount(a_ref())=}")  # 3 (2 + argument)

del a
print(f"After variable A deleted: {sys.getrefcount(a_ref())=}")            # 2 (b.child + argument; the object is still alive)

del b
print(f"After variable B deleted: {sys.getrefcount(a_ref())=}")            # 2 (unchanged: the cycle keeps the object alive, so a_ref() still returns it)

Why reference counting alone can’t break cycles

When objects reference each other (a cycle), their reference counts never reach zero, so they would leak memory. CPython therefore adds a generational garbage collector that can detect and collect such cycles.
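
The effect can be reproduced in isolation. In this sketch, automatic collection is switched off temporarily so that the only collection is the explicit gc.collect() call. The function name cycle_demo and this simplified Node class are assumptions made for the example.

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.child = None


def cycle_demo():
    was_enabled = gc.isenabled()
    gc.disable()                      # keep automatic collections out of the demo
    try:
        a, b = Node(), Node()
        a.child, b.child = b, a       # build a reference cycle
        probe = weakref.ref(a)        # lets us look without keeping the object alive
        del a, b                      # no names left, but each count is still 1
        alive_before = probe() is not None
        unreachable = gc.collect()    # run a full collection by hand
        alive_after = probe() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_before, unreachable, alive_after


before, found, after = cycle_demo()
print(f"alive before collect: {before}")   # True: reference counting cannot free the cycle
print(f"unreachable objects found: {found}")
print(f"alive after collect: {after}")     # False: the collector reclaimed the cycle
```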

Generational garbage collector

The collector is based on the generational hypothesis:

  • Most objects die young (≈ 80‑90 %).
  • Old objects rarely reference young objects.

CPython maintains three generations:

Generation | When it runs | Description
0 | Every gc.get_threshold()[0] allocations (default 700) | New objects.
1 | Every gc.get_threshold()[1] collections of generation 0 (default 10) | Objects that survived one Gen 0 collection.
2 | Every gc.get_threshold()[2] collections of generation 1 (default 10) | Long‑living objects.

import gc
print(f"GC generations thresholds: {gc.get_threshold()}")
# (700, 10, 10) by default

Because a generation 0 collection runs only after a certain number of allocations, objects caught in a reference cycle can stay alive for a while after the last name bound to them is deleted. Objects that are not part of a cycle are unaffected: reference counting frees them immediately. This explains why, in the example above, a_ref() still returned the object after a and b were deleted: the cyclic garbage had not yet been collected.

TL;DR

  • String interning caches certain strings so that equal strings can share one object; some of them are effectively immortal.
  • Small integers are stored in a pre‑allocated contiguous array; each slot occupies 32 bytes, though the object itself uses 28 bytes.
  • Every CPython object starts with a PyObject header (ob_refcnt + ob_type).
  • Reference counting automatically frees objects when their count drops to zero.
  • Weak references let you hold a non‑owning pointer to an object.
  • Circular references are handled by the generational garbage collector, which runs periodically based on allocation thresholds.