Python 3.14’s GC Revert: The Memory Spike Fix
Opening
You upgraded to Python 3.14 to get the promised improvements — lower pause times from the incremental garbage collector, cleaner async workflows, template strings. Your monitoring dashboards told a different story: memory usage climbing steadily, OOM kills in production, and a root cause that wasn’t obvious because the default garbage collector is supposed to be better, not worse.
On April 16, the Python core team announced a reversal: the incremental garbage collector introduced in Python 3.14 would be rolled back in 3.14.5 and removed from Python 3.15 before beta freeze. This is unusual for a patch release.
Here’s what happened, why the benchmarks lied to production, and how to make sure your deployments land on the right version.
How the Incremental GC Worked — and Why It Looked Great
Python’s garbage collector handles cyclic references between objects that the reference counter can’t clean up alone. Before 3.14, the full garbage collection run would pause all threads while it scanned and freed objects. This pause could be long — tens of milliseconds or more — depending on the number of tracked objects.
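To make the role of the cycle collector concrete, here is a minimal sketch of a reference cycle that reference counting alone cannot free. The `Node` class and the `cycle_survives_refcounting` helper are illustrative names, not part of any library, and the behavior described assumes CPython.

```python
import gc
import weakref


class Node:
    """Minimal object that can hold a reference to another Node."""

    def __init__(self):
        self.other = None


def cycle_survives_refcounting():
    gc.disable()  # keep the automatic collector out of the demo
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a   # a <-> b reference cycle
        probe = weakref.ref(a)    # lets us observe whether `a` is still alive
        del a, b                  # refcounts never reach zero because of the cycle
        alive_before = probe() is not None
        gc.collect()              # the cycle collector finds and frees the pair
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        gc.enable()


print(cycle_survives_refcounting())
```

On CPython this prints `(True, False)`: the pair stays alive after `del` and is only released by the explicit collection.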
The incremental garbage collector changed the strategy. Instead of running one long pause, it broke the collection into small slices, each taking roughly 1.3 ms. The trade-off was clear on paper:
| Metric | Generational GC (pre-3.14) | Incremental GC (3.14) |
|---|---|---|
| Max pause time | ~26 ms | ~1.3 ms |
| CPU overhead | Lower during GC | Higher bookkeeping |
| Memory overhead | Baseline | Significantly higher |
The benchmark suite used to evaluate the change showed consistent improvements in pause time, and for latency-sensitive workloads — real-time audio processing, game loops, interactive REPLs — this was a genuine win.
But Neil Schemenauer’s testing on production workloads revealed a different picture. The incremental collector’s bookkeeping — tracking object generation transitions, maintaining additional data structures — caused peak memory usage to climb to as much as 5× the generational baseline in the worst case. And total runtime went up, not down, because the overhead of slicing and re-assembling collection work outweighed the benefits for most programs.
For web apps, data pipelines, and batch jobs — the overwhelming majority of Python deployments — reducing long pauses isn’t the win that matters. Memory pressure is.
How to Tell If You’re Affected
The memory spike from the incremental GC isn’t always obvious in short-lived processes. Container restarts mask it. But in long-running services — web servers, message consumers, background workers — the signal is clear:
import gc
import sys
import tracemalloc
# Enable tracemalloc before anything else
tracemalloc.start()
# Run your workload
def process_records():
    records = []
    for i in range(100_000):
        record = {"id": i, "data": "x" * 100}
        records.append(record)
        # Simulate cyclic reference
        record["self"] = record
    return records
print(f"Python version: {sys.version}")
# Actually run the workload; without this call the numbers below measure nothing
records = process_records()
gc.collect()
current, peak = tracemalloc.get_traced_memory()
print(f"Current memory: {current / 1024 / 1024:.1f} MB")
print(f"Peak memory: {peak / 1024 / 1024:.1f} MB")
tracemalloc.stop()
If your peak memory is significantly higher than what you saw on Python 3.13 under the same workload, the incremental GC is likely the culprit. The effect varies by workload — applications with large object graphs and many cyclic references see the biggest deltas.
You can also check from the command line:
# Compare memory snapshots at equivalent points
# Assumes the script above is saved as workload.py; -i matches "Peak memory"
python3.13 workload.py | grep -i peak
python3.14 workload.py | grep -i peak
A 2–3× increase in peak memory between versions is a strong indicator. 5× means you’re in the worst-case cluster.
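One caveat: `tracemalloc` only counts memory allocated through Python's allocators, while OOM kills are driven by the process's resident set size. A rough way to capture that figure from inside the process is the standard `resource` module, sketched below. It is Unix-only, and the `peak_rss_mb` helper name is illustrative.

```python
import resource
import sys


def peak_rss_mb():
    """Peak resident set size of this process in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


version = ".".join(str(part) for part in sys.version_info[:3])
print(f"Python {version}")
print(f"Peak RSS: {peak_rss_mb():.1f} MB")
```

Calling this at the end of the same workload on each interpreter gives a second number to compare alongside the `tracemalloc` peak.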
The Revert — What Changed in 3.14.5
The core team’s decision to revert the incremental GC in a patch release is noteworthy. During 3.14’s release cycle, the working assumption was that the incremental collector had passed its acceptance criteria — it did, by the benchmark suite’s measures. But the disconnect between benchmark performance and production behavior is a reminder that “passed the benchmark suite” and “works in production” aren’t the same thing.
The fix landed in Python 3.14.5, and the incremental collector was also removed from Python 3.15 before feature freeze. In both, the generational GC returns to its pre-3.14 behavior:
- Each triggered collection runs to completion instead of being split into slices
- Pause times return to ~26 ms (or whatever your workload generates)
- Peak memory drops back to the baseline
- Total runtime improves because bookkeeping overhead is eliminated
If you’re on Python 3.14 and seeing memory pressure, upgrading to 3.14.5 gives you the old behavior back without any code changes.
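A startup check can make sure a deployment is not silently running one of the affected releases. The sketch below encodes the version range stated in this article (3.14.0 through 3.14.4); the function name is hypothetical, and 3.15 pre-releases are deliberately not covered.

```python
import sys


def affected_by_incremental_gc(version=None):
    """True for 3.14.0 through 3.14.4, the releases this article describes as affected."""
    major, minor, micro = tuple(version or sys.version_info)[:3]
    return (major, minor) == (3, 14) and micro < 5


if affected_by_incremental_gc():
    print("Warning: this interpreter predates 3.14.5; watch memory usage closely.")
```

Logging the result at service start is usually enough; failing hard is an option if memory headroom is tight.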
What to Do If You Can’t Upgrade Right Away
Sometimes patching isn’t immediate. A container image rebuild, a CI/CD pipeline, a client’s release cycle — the world doesn’t always move at the speed of your monitoring alerts. Here are practical mitigations:
1. Tune collection thresholds
The collector runs when allocation thresholds are hit. Raising them makes collections less frequent, which cuts collector work at the cost of garbage lingering longer between collections:
import gc
# View current thresholds; returns a tuple such as (700, 10, 10).
# Defaults differ between Python versions, so check before changing them.
print(gc.get_threshold())
# Raise thresholds to collect less frequently; pick values above what
# get_threshold() printed for your interpreter
# First value is for gen-0, second for gen-1, third for gen-2
gc.set_threshold(1000, 15, 15)
This trades less frequent collection for garbage that lingers longer, so measure peak memory before and after rather than assuming an improvement. It’s a band-aid, not a fix, but it can stabilize deployments until you can patch.
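Whether a threshold change actually alters how often the collector runs can be checked with `gc.get_stats()`, which reports per-generation counters. The following is a small sketch with a stand-in workload; `collection_counts` is an illustrative helper, not a library function.

```python
import gc


def collection_counts():
    """Return how many collections each generation has run so far."""
    return [gen["collections"] for gen in gc.get_stats()]


before = collection_counts()
# Stand-in workload: container allocations that count toward the gen-0 threshold
data = [[i] for i in range(100_000)]
after = collection_counts()

for gen, (b, a) in enumerate(zip(before, after)):
    print(f"gen {gen}: {a - b} collections during workload")
```

Running it once with the default thresholds and once after `gc.set_threshold(...)` shows whether the change had the intended effect on your workload.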
2. Break cyclic references explicitly
The incremental GC pays a higher memory cost because it has to track more objects in its generation tables. Reducing cycles reduces the overhead:
# Before: cyclic reference creates memory overhead
class Node:
    def __init__(self, name):
        self.name = name
        self.children = []
        self.parent = None
    def add_child(self, child):
        self.children.append(child)
        child.parent = self  # Creates a cycle with parent
# After: use weakref to break the cycle
import weakref
class Node:
    def __init__(self, name):
        self.name = name
        self.children = []
        self._parent_ref = None  # Use weakref instead
    @property
    def parent(self):
        if self._parent_ref is not None:
            return self._parent_ref()
        return None
    def add_child(self, child):
        self.children.append(child)
        child._parent_ref = weakref.ref(self)  # Breaks the cycle
This pattern doesn’t eliminate the incremental GC’s overhead entirely, but it reduces the number of objects the collector needs to track, which dampens the memory impact.
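A quick way to confirm that a weakref-based design really is cycle-free is to disable the collector and see whether the structure is released by reference counting alone. This sketch repeats a trimmed-down `Node` so it runs on its own; `tree_freed_without_gc` is an illustrative name.

```python
import gc
import weakref


class Node:
    def __init__(self, name):
        self.name = name
        self.children = []
        self._parent_ref = None

    def add_child(self, child):
        self.children.append(child)
        child._parent_ref = weakref.ref(self)  # weak back-reference, no cycle


def tree_freed_without_gc():
    gc.disable()  # rule out the cycle collector
    try:
        root = Node("root")
        root.add_child(Node("leaf"))
        probe = weakref.ref(root)
        del root
        return probe() is None  # True if refcounting alone freed the tree
    finally:
        gc.enable()


print(tree_freed_without_gc())
```

On CPython this prints `True`. With a strong `parent` attribute instead of the weak reference, the same check would report `False` until a collection ran.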
3. Force periodic full collections
In long-running services, you can schedule full garbage collections during low-traffic windows:
import gc
import threading
import time
def periodic_gc(interval_seconds=300):
    """Force a full GC cycle every interval_seconds."""
    while True:
        time.sleep(interval_seconds)
        collected = gc.collect()
        if collected:
            print(f"[GC] Collected {collected} objects")
# Start background GC thread
gc_thread = threading.Thread(target=periodic_gc, daemon=True)
gc_thread.start()
This is most useful in batch-processing services where you know natural low-activity windows. The gc.collect() call triggers a full generational sweep across all generations, which is exactly what you’d get after the revert — so you’re essentially pre-empting the collection yourself on your own schedule.
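If the service needs a clean shutdown, or you want to see how long each forced collection takes, a variant built on `threading.Event` may be easier to manage. This is a sketch under those assumptions, with a deliberately short interval so the demo finishes quickly.

```python
import gc
import threading
import time


def periodic_gc(stop_event, interval_seconds=300.0):
    """Run a full collection every interval_seconds until stop_event is set."""
    # Event.wait returns False on timeout and True once the event is set
    while not stop_event.wait(interval_seconds):
        started = time.perf_counter()
        collected = gc.collect()
        elapsed_ms = (time.perf_counter() - started) * 1000
        print(f"[GC] collected {collected} objects in {elapsed_ms:.1f} ms")


stop = threading.Event()
worker = threading.Thread(target=periodic_gc, args=(stop, 0.05), daemon=True)
worker.start()
time.sleep(0.2)           # stand-in for the service doing its work
stop.set()                # request shutdown
worker.join(timeout=1.0)  # the loop exits as soon as the event is set
```

The logged duration doubles as a record of the pause each forced collection introduces, which helps when choosing the interval.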
Common Mistakes
When dealing with the incremental GC’s memory spike, developers tend to reach for the wrong fixes. Here are two mistakes I’ve seen repeatedly, with the correct approach.
Mistake 1: Calling gc.disable() Instead of Tuning
The instinct when you see high memory is to disable garbage collection entirely. This is dangerous for any application that creates cyclic references:
import gc
# WRONG: Disables all GC — cyclic objects will never be freed
# until the process exits, leaking memory indefinitely
gc.disable()
# RIGHT: Keep GC enabled but tune thresholds to reduce overhead
gc.set_threshold(1000, 20, 20) # Collect less frequently
Disabling GC means any cycle you create is permanent memory. If your app uses dictionaries referencing themselves, tree structures with parent pointers, or event systems with callbacks that reference their owners, you’re leaking memory on every cycle. Tuning thresholds reduces the incremental GC’s bookkeeping without abandoning it.
Mistake 2: Relying on gc.collect() Alone Without Understanding Generations
Many developers add gc.collect() calls throughout their code to “clean up” after heavy workloads. This doesn’t help much because the incremental GC’s overhead comes from the collection strategy, not the collection frequency:
import gc
# Hypothetical stand-in for your per-record work
def transform(record: dict) -> dict:
    return {**record, "processed": True}
# WRONG: Frequent full collections don't fix incremental GC overhead
# and add scheduling cost without reducing peak memory
def process_batch(records: list[dict]) -> list[dict]:
    results = []
    for record in records:
        results.append(transform(record))
        gc.collect()  # Full collection after every item: expensive
    return results
# RIGHT: Let the threshold-driven GC handle it; reduce cycles in your
# own data structures (for example with weakrefs) rather than collecting more
def process_batch(records: list[dict]) -> list[dict]:
    results: list[dict] = []
    for record in records:
        results.append(transform(record))
    return results
Explicit gc.collect() calls in tight loops are worse than the default behavior. The incremental GC’s overhead comes from per-object tracking, not from how many times you invoke it. The right fix is reducing the object graph (via weakrefs) and tuning thresholds, not adding collection calls.
The Bigger Picture: Benchmarks vs. Production
This revert is a case study in a pattern that repeats across software engineering: benchmarks measure a specific, controlled scenario; production environments are messier.
The incremental GC’s design was sound for its intended use case — latency-sensitive workloads where long pauses are unacceptable. The problem wasn’t the idea; it was the assumption that the use case was universal.
Python’s release process handled this correctly. The core team identified the production impact, documented the trade-offs openly on discuss.python.org, and chose stability over feature retention. The fact that they did it in a patch release — rather than waiting for the next minor version — shows they understood the severity.
For Python developers, the lesson is practical: always validate new runtime behavior against production-like workloads, not just benchmark suites. The incremental GC passed every formal acceptance criterion. It still needed to go.
Wrap-Up
If you’re running Python 3.14 and haven’t upgraded to 3.14.5 yet, do it. The revert restores the generational GC’s original behavior with zero code changes. If you’re on 3.15, make sure you’re on the beta release — the incremental GC has already been removed from the feature-complete branch.
The incremental GC wasn’t a failure. It was a well-intentioned optimization that didn’t generalize beyond its target workload. That’s how innovation works: you try something bold, measure it against the right workloads, and course-correct when the signal matters.
For your next step: audit your Python 3.14 deployments for memory usage against 3.13 baselines, and flag any instances showing a 2×+ increase for priority patching.
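The audit itself can be as simple as comparing two sets of peak-memory measurements. The sketch below assumes you have already collected per-service peaks in MB for both versions; the service names and figures are hypothetical, for illustration only.

```python
def flag_regressions(baseline_mb, current_mb, ratio_threshold=2.0):
    """Return {service: ratio} for services whose peak memory grew by at least ratio_threshold."""
    flagged = {}
    for service, base in baseline_mb.items():
        now = current_mb.get(service)
        if now is None or base <= 0:
            continue  # no comparable measurement
        ratio = now / base
        if ratio >= ratio_threshold:
            flagged[service] = round(ratio, 2)
    return flagged


# Hypothetical peak-memory figures in MB, for illustration only
baseline_313 = {"api": 410.0, "worker": 620.0, "cron": 95.0}
current_314 = {"api": 455.0, "worker": 1510.0, "cron": 480.0}

print(flag_regressions(baseline_313, current_314))
# → {'worker': 2.44, 'cron': 5.05}
```

Anything in the returned mapping is a candidate for priority patching; the 2.0 default mirrors the 2× rule of thumb above.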