Solved: Stop Storing Results in Variables. Pipe Them Instead.
Source: Dev.to
Executive Summary
TL;DR: Storing large command outputs in variables can exhaust memory and crash servers by loading all data into RAM simultaneously. Instead, leverage the PowerShell pipeline to stream data object‑by‑object, ensuring efficient, low‑memory processing for large datasets.
- Storing command results in a “bucket” variable loads all objects into memory → memory exhaustion for large datasets.
- The PowerShell pipeline acts as a conveyor belt, processing data object‑by‑object and keeping a small, constant, predictable memory footprint.
- The “Filter‑Left” principle: filter as early as possible (e.g., using native `-Filter` parameters) to minimise data transfer and memory usage.
- For massive datasets or when re‑processing is needed, spool to disk (e.g., `Export-Csv` → `Import-Csv`) to achieve near‑zero memory usage.
- Using `ForEach-Object` (or its alias `%`) directly with piped input guarantees one‑at‑a‑time processing, avoiding the overhead of `foreach` loops on pre‑loaded variables.
Bottom line: Stop storing massive command outputs in variables. Learn to love the pipeline; it streams data object‑by‑object, preventing memory exhaustion and catastrophic script failures on production servers.
A Real‑World Story
I remember it like it was yesterday: 3:00 PM on a Friday. A junior engineer was tasked with a “simple” cleanup script—find and log all temp files older than 30 days across our web farm. He wrote a one‑liner:
```powershell
$files = Get-ChildItem -Path \\web-cluster-*\c$\temp -Recurse
```
Ten minutes later, my pager went off. One by one, our entire production web fleet (prod-web-01 … prod-web-20) started throwing memory‑pressure alerts and falling over.
Why? The script tried to load millions of file objects from 20 servers into a single variable on his management box, causing a resource nightmare.
We’ve all been there—a classic mistake born from procedural thinking instead of streaming. As a Reddit thread summed up:
“We don’t recommend storing the results in a variable.”
Variables vs. Pipeline
Variable‑Based Approach
```powershell
$myBigList = Get-ADUser -Filter *
```
PowerShell fetches every user, creates an object for each, and holds them all in $myBigList.
- 50 000 users → 50 000 objects in RAM.
- Works for a few dozen or hundred items, but catastrophic for thousands or millions.
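If all you actually need is a count, you can still avoid the bucket entirely. `Measure-Object` tallies objects as they stream past and only ever holds the running total:

```powershell
# Counts users as they stream through the pipeline;
# no user objects are accumulated in a variable.
$result = Get-ADUser -Filter * | Measure-Object
"Total users: $($result.Count)"
```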
Pipeline‑Based Approach
```powershell
Get-ADUser -Filter * | Where-Object {$_.Enabled -eq $false}
```
The pipeline is a conveyor belt:
1. `Get-ADUser` emits the first user object.
2. `Where-Object` evaluates it and decides whether to keep or discard it.
3. The next object is emitted, and the cycle repeats.
Result: Memory footprint stays tiny, constant, and predictable, regardless of processing 100 or 10 million objects.
Pro Tip: A variable collects everything before you can act. A pipeline lets you act as items arrive. For large‑scale automation, the pipeline is the only scalable approach.
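You can see the difference for yourself with a rough measurement. This sketch generates 100,000 small objects both ways and compares managed-heap growth (the numbers are approximate, since garbage-collection timing varies between runs):

```powershell
# Variable approach: all 100,000 objects are retained at once.
$before = [GC]::GetTotalMemory($true)
$all = 1..100000 | ForEach-Object { [pscustomobject]@{ Id = $_ } }
"Variable approach grew roughly {0:N0} bytes" -f ([GC]::GetTotalMemory($false) - $before)
$all = $null   # release the bucket

# Pipeline approach: each object is processed and discarded immediately.
$before = [GC]::GetTotalMemory($true)
$count = 0
1..100000 | ForEach-Object { [pscustomobject]@{ Id = $_ } } |
    ForEach-Object { $count++ }
"Pipeline approach grew roughly {0:N0} bytes ($count objects processed)" -f ([GC]::GetTotalMemory($false) - $before)
```

The variable run should report a figure many times larger than the pipeline run, because the pipeline never holds more than one object at a time.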
Three Practical Approaches (Quick Fix → “Break‑Glass”)
1️⃣ Direct Pipeline to ForEach-Object (The Most Direct Solution)
Instead of saving to a variable and then iterating with foreach, pipe the output straight to ForEach-Object (alias %). This guarantees one‑at‑a‑time processing.
The Bad Way (Memory Hog)
```powershell
# WARNING: Loads ALL VMs into memory first!
$allVMs = Get-VM -ComputerName prod-hyperv-cluster

foreach ($vm in $allVMs) {
    if ($vm.State -eq 'Off') {
        Write-Host "$($vm.Name) is currently off. Removing snapshot."
        Get-VMSnapshot -VMName $vm.Name | Remove-VMSnapshot
    }
}
```
The Good Way (Streaming)
```powershell
# Processes one VM at a time. Beautiful.
Get-VM -ComputerName prod-hyperv-cluster | ForEach-Object {
    if ($_.State -eq 'Off') {
        Write-Host "$($_.Name) is currently off. Removing snapshot."
        # $_ represents the current object in the pipeline
        Get-VMSnapshot -VMName $_.Name | Remove-VMSnapshot
    }
}
```
2️⃣ Filter Early – “Filter‑Left” Principle
Do your filtering as far left (as early) in the command chain as possible. Avoid pulling massive data only to discard most of it locally.
Inefficient Way (Filter Late)
```powershell
# Pulls ALL users, then filters. Bad for network & memory.
Get-ADUser -Filter * -Properties LastLogonDate |
    Where-Object {
        $_.Enabled -eq $false -and $_.LastLogonDate -lt (Get-Date).AddDays(-90)
    } |
    Select-Object Name
```
Efficient Way (Filter Left)
```powershell
# Let the Domain Controller do the heavy lifting.
$ninetyDays = (Get-Date).AddDays(-90).ToFileTime()

Get-ADUser -Filter {
    Enabled -eq $false -and LastLogonTimestamp -lt $ninetyDays
} -Properties LastLogonTimestamp |
    Select-Object Name
```
By using the cmdlet’s native -Filter parameter, you ask the AD server to return only the matching users, drastically reducing the number of objects that ever enter the pipeline.
3️⃣ Spool to Disk for Massive, Re‑processable Datasets
When a dataset is truly huge and you need to process it multiple times (or the source API is slow), don’t keep it in memory. Dump it to a file and stream it back when needed.
```powershell
# Export once (near-zero memory)
Get-ADUser -Filter * -Properties * | Export-Csv -Path 'AllUsers.csv' -NoTypeInformation

# Later, stream it back for each processing pass
Import-Csv -Path 'AllUsers.csv' | Where-Object { $_.Enabled -eq $false } | ForEach-Object {
    # Process each filtered record...
}
```
The export/import cycle uses disk I/O, not RAM, allowing you to re‑process the data without ever loading the full set into memory.
Takeaway
- Never store massive command outputs in a variable when you can stream them.
- Prefer the pipeline (`|`) for all data‑processing tasks.
- Filter early using native `-Filter` parameters.
- Spool to disk when you must re‑process huge datasets.
Adopt these patterns, and your scripts will scale gracefully without bringing down production servers. 🚀
Streaming Large Datasets with PowerShell
Spool the data to a temporary file on disk.
This is a “hacky” but incredibly effective method. You take the one‑time performance hit of writing everything to a file (like a CSV or JSONL), then read that file back line‑by‑line, which has a near‑zero memory footprint.
The Process
Step 1 – Export the massive dataset to a file
```powershell
# Export-Csv is great because it handles objects cleanly.
Get-VeryLargeDataset -Server prod-db-01 |
    Export-Csv -Path C:\temp\dataset.csv -NoTypeInformation
```
Step 2 – Process the data by streaming it from the file
```powershell
# Import-Csv streams the records one by one if piped.
Import-Csv -Path C:\temp\dataset.csv |
    ForEach-Object {
        # Work with each row ($_), one at a time.
        # The entire file is NOT loaded into memory.
        if ($_.Status -eq 'Failed') {
            Invoke-MyRetryLogic -ID $_.TransactionID
        }
    }
```
Step 3 – Clean up after yourself!
```powershell
Remove-Item -Path C:\temp\dataset.csv
```
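The same pattern works with JSONL (one JSON object per line), which copes better than CSV with nested properties. A sketch, reusing the hypothetical `Get-VeryLargeDataset` and `Invoke-MyRetryLogic` from above:

```powershell
# Write one compact JSON object per line (JSONL), streamed to disk.
Get-VeryLargeDataset -Server prod-db-01 | ForEach-Object {
    $_ | ConvertTo-Json -Compress -Depth 5
} | Set-Content -Path C:\temp\dataset.jsonl

# Read it back one line (one object) at a time.
Get-Content -Path C:\temp\dataset.jsonl | ForEach-Object {
    $record = $_ | ConvertFrom-Json
    if ($record.Status -eq 'Failed') {
        Invoke-MyRetryLogic -ID $record.TransactionID
    }
}

Remove-Item -Path C:\temp\dataset.jsonl
```

Because `Get-Content` emits lines as it reads them, the memory profile is the same as the CSV version: one record in flight at a time.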
Methods, Pros & Cons
| Method | Pros | Cons |
|---|---|---|
| 1. Pipelining | • Extremely low memory usage • Idiomatic PowerShell • Fast for most tasks | • Data is transient; can’t easily re‑process without re‑running the initial command |
| 2. Filtering Left | • Most efficient method • Reduces memory, CPU, and network load | • Relies on the source command having robust server‑side filtering capabilities |
| 3. Spooling to Disk | • Handles virtually infinite data sizes • Data is persistent for re‑processing | • Slowest method due to disk I/O • Requires temporary disk space • More complex code |
Takeaway
The next time you start to type $results = …, pause for a second. Ask yourself:
“How many items could this command possibly return?”
If the answer is “I don’t know” or “a lot,” do yourself (and your servers) a favor: ditch the variable and embrace the pipeline. Your future self—who isn’t getting paged at 3 PM on a Friday—will thank you.
👉 Read the original article on TechResolve.blog