Regression testing workflow: the risk‑first checks that keep releases stable
- Workflow shown: risk‑first regression scoping → golden‑path baseline → targeted probes → evidence‑backed results.
- Example context: Sworn on PC Game Pass (Windows), used only as a real‑world backing example.
- Build context: tested on the PC Game Pass build 1.01.0.1039.
- Scope driver: public SteamDB patch notes used as an external change signal (no platform parity assumed).
- Outputs: a regression matrix with line‑by‑line outcomes, session timestamps, and bug tickets with evidence.
The regression testing flow used to verify stability after change during a time‑boxed Sworn (PC) pass.
Regression testing scope: what I verified and why
This article is grounded in a self‑directed portfolio regression pass on Sworn using the PC Game Pass (Windows) build 1.01.0.1039, run in a one‑week solo timebox.
Scope was change‑driven and risk‑based:
- golden‑path stability (launch → play → quit → relaunch)
- save‑and‑continue integrity
- core menus
- audio sanity
- input handover
- side‑effect probes suggested by upstream patch notes
No parity between the Steam and Game Pass builds is claimed.
What regression testing is (in practice)
For me, regression testing is simple: after a change, does existing behaviour still hold?
- Not “re‑test everything”.
- Not “run a checklist because that’s what we do”.
A regression pass is selective by design. Coverage is driven by risk:
- What is most likely to have been impacted?
- What is most expensive if broken?
- What must remain stable for the build to be trusted?
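To make those three questions operational, here is a minimal sketch of risk‑based scoping as a scoring heuristic. The `Check` structure, the 1–3 scales, and the threshold are my illustration for this article, not a tool used in the actual pass:

```python
from dataclasses import dataclass

@dataclass
class Check:
    check_id: str           # matrix line ID, e.g. "BL-SAVE-01"
    impact_likelihood: int  # 1-3: how likely the change touched this area
    cost_if_broken: int     # 1-3: how expensive a break here would be
    must_stay_stable: bool  # trust-critical systems always make the cut

def in_scope(c: Check, threshold: int = 4) -> bool:
    """Keep a check if it is trust-critical or its risk score clears the bar."""
    return c.must_stay_stable or c.impact_likelihood * c.cost_if_broken >= threshold

candidates = [
    Check("BL-SMOKE-01", 3, 3, True),      # golden path: always in scope
    Check("STEA-103-MUSIC", 3, 2, False),  # patch notes flag an audio fix
    Check("BL-ECON-01", 1, 2, False),      # no change signal points here
]
print([c.check_id for c in candidates if in_scope(c)])
# ['BL-SMOKE-01', 'STEA-103-MUSIC']
```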
Regression testing outputs: pass/fail results with evidence
- Clear outcomes: pass or fail.
- Backed by evidence and repeatable verification.
- No opinions, no “vibes”.
Golden‑path smoke baseline for regression testing
I start every regression cycle with a repeatable golden‑path smoke because it prevents wasted time. If the baseline is unstable, deeper testing is noise.
In this Sworn pass, the baseline was matrix line BL‑SMOKE‑01:
cold launch → main menu → gameplay → quit to desktop → relaunch → main menu
I also include a quick sanity listen for audio cut‑outs during this flow.
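As a sketch, the baseline can be written down as an ordered checklist where the first failed step halts the cycle. The step names mirror the flow above; the recording helper is purely illustrative:

```python
BL_SMOKE_01 = [
    "cold launch",
    "main menu reached",
    "gameplay reached",
    "quit to desktop",
    "relaunch",
    "main menu reached again",
    "audio sanity: no cut-outs heard during the flow",
]

def run_baseline(observed: dict) -> str:
    """Baseline passes only if every step holds; the first failure stops the cycle."""
    for step in BL_SMOKE_01:
        if not observed.get(step, False):
            return f"FAIL at '{step}' - stabilise the baseline before deeper testing"
    return "PASS - baseline stable, proceed to targeted probes"

print(run_baseline({step: True for step in BL_SMOKE_01}))
```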
“Some systems absolutely cannot break. Those are the ones you want to verify on every build before spending time on deeper testing.”
— Conrad Bettmann, QA Manager (Rovio Entertainment)
Why baseline stability matters in regression testing
The golden path includes the most common player actions (launch, play, quit, resume). If those are unstable, you get cascading failures that masquerade as unrelated defects.
Regression testing scope: change signals and risk
For this project I used SteamDB patch notes as an external oracle:
SWORN 1.0 Patch #3 (v1.0.3.1111), 13 Nov 2025
That does not mean I assumed those changes were present on PC Game Pass. Instead, I used the patch notes as a change signal to decide where to probe for side effects on the Game Pass build. This is useful when you have no internal access, no studio data, and no changelog for your platform.
Knowing what changed and where helps you focus regression on affected areas, rather than running broad checks that are unlikely to find anything valuable. It’s usually best to mix multiple oracles instead of relying on a single source.
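A sketch of that mapping: each patch‑note item becomes a targeted probe with an explicit check on the build under test. The probe IDs are the real ones from this pass; the dictionary layout is illustrative:

```python
# Mapping external change signals (SteamDB patch notes) to targeted probes.
patch_note_probes = {
    "fix for music cutting out": {
        "probe_id": "STEA-103-MUSIC",
        "check": "music continuity across combat, pause/unpause, and a level load",
    },
    "Dialogue Volume slider added": {
        "probe_id": "STEA-103-AVOL",
        "check": "slider present in the Audio menu on the build under test",
        # Presence is verified, never assumed: absence becomes an evidenced
        # "not applicable" outcome rather than a silent skip.
    },
}
```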
“External oracles are a pragmatic way to drive risk‑based regression when internal documentation is unavailable.”
— Conrad Bettmann, QA Manager (Rovio Entertainment)
Regression outcomes: pass vs. not applicable (with evidence)
- SteamDB notes mention a music‑cutting‑out fix, so I ran an audio runtime probe (STEA‑103‑MUSIC) and verified music continuity across combat, pause/unpause, and a level load – pass.
- SteamDB also mentions a Dialogue Volume slider. On the Game Pass build that control was not present, so the check was recorded as not applicable with evidence of absence (STEA‑103‑AVOL).
How my regression matrix is structured
My Regression Matrix lines are written to be auditable. Each line includes:
- A direct check
- A side‑effect check (if applicable)
- A clear outcome
- An evidence link
That keeps results reviewable and prevents “I think it’s fine” reporting.
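A minimal sketch of one such line as data; the field names and example path are hypothetical, while the outcome values match the ones used in this pass:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MatrixLine:
    """One auditable regression matrix line."""
    line_id: str                      # e.g. "BL-SAVE-01"
    direct_check: str                 # what was verified
    side_effect_check: Optional[str]  # upstream-change probe, if applicable
    outcome: str                      # "pass" | "fail" | "not applicable"
    evidence: str                     # link to clip, screenshot, or session timestamp

line = MatrixLine(
    line_id="STEA-103-AVOL",
    direct_check="Dialogue Volume slider present in the Audio menu",
    side_effect_check=None,
    outcome="not applicable",
    evidence="evidence/S4-audio-menu.png",  # hypothetical path
)
```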
Example matrix lines
| ID | Description |
|---|---|
| BL‑SMOKE‑01 | Baseline smoke |
| BL‑SET‑01 | Settings persistence |
| BL‑SAVE‑01 | Save‑and‑Continue integrity |
| BL‑DEATH‑01 | Post‑death flow sanity |
| STEA‑103‑MUSIC | Audio runtime continuity probe |
| STEA‑103‑AVOL | Audio settings presence check |
| STEA‑103‑CODEX | Codex and UI navigation sanity |
| BL‑IO‑01 | Input handover + hot‑plug |
| BL‑ALT‑01 | Alt‑Tab sanity |
| BL‑ECON‑01 | Enhancement spend + ownership persistence |
Save‑and‑Continue regression testing: anchors, not vibes
Save‑and‑Continue flows are a classic regression risk area because failures can look intermittent. To reduce ambiguity, I verify using anchors.
In this pass (BL‑SAVE‑01) I anchored:
- Room splash name – Wirral Forest
- Health bucket – 60/60
- Weapon type – sword
- Start of objective text
I then verified those anchors after menu Continue and after a full relaunch.
Outcome: pass – anchors matched throughout (session S2).
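As a sketch, anchor verification reduces to comparing an expected snapshot against what you observe after each resume path. The anchor values are the real ones from BL‑SAVE‑01 (the objective text stays elided, as in my notes); the helper is illustrative:

```python
EXPECTED_ANCHORS = {
    "room_splash": "Wirral Forest",
    "health": "60/60",
    "weapon_type": "sword",
    "objective_prefix": "...",  # start of the objective text, elided here
}

def verify_anchors(observed: dict) -> list:
    """Return the anchors that failed to match; an empty list means pass."""
    return [k for k, v in EXPECTED_ANCHORS.items() if observed.get(k) != v]

# Run once after menu Continue and once after a full relaunch:
observed = dict(EXPECTED_ANCHORS)  # stand-in for values read off the screen
mismatches = verify_anchors(observed)
print("pass" if not mismatches else f"fail: {mismatches}")
```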
Why anchors make regression results repeatable
“Continue worked” is not useful if someone else cannot verify what you resumed into. Anchors turn “seems fine” into a repeatable verification result.
QA evidence for regression testing: what I capture and why
For regression, evidence matters. I capture the following for each check:
| Evidence type | Purpose |
|---|---|
| Screen recordings | Visual proof of UI state, transitions, and any glitches |
| Log excerpts | Show internal state, error messages, or confirmation of actions |
| Audio clips | Verify continuity, absence of cut‑outs, and correct volume levels |
| Screenshots with timestamps | Tie visual state to a specific moment in the test session |
| Automated test output (if any) | Provide reproducible steps and results from scripts |
All evidence is stored in a shared folder and linked from the regression matrix, ensuring anyone can audit the outcome without relying on memory or subjective description.
Evidence Guidelines
- Video clips – Show input, timing, and outcome together (ideal for flow and audio checks).
- Screenshots – Support UI state, menu presence/absence, and bug clarity.
- Session timestamps – Keep verification reviewable without scrubbing long recordings.
- Environment notes – Platform, build, input devices, cloud‑saves enabled.
If the evidence cannot answer what was done, what happened, and what should have happened, it is not evidence.
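One way to keep that linkage auditable is a small evidence manifest in the shared folder, with each entry pointing back at a matrix line and a session. The layout and paths here are hypothetical:

```python
# Each evidence item answers: what was done, what happened, what was expected.
evidence_manifest = [
    {
        "check_id": "STEA-103-MUSIC",   # matrix line this evidence supports
        "session": "S3",                # session ID, so reviewers can find it
        "kind": "video",                # video | screenshot | log | audio
        "path": "evidence/S3-music-continuity.mp4",  # hypothetical path
        "note": "pause/unpause and level load; music stayed continuous",
    },
]
```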
Regression‑Testing Examples (from the Sworn pass)
Example regression bug: Defeat overlay blocks the Stats screen (SWOR‑6)
Bug: [PC][UI][Flow] Defeat overlay blocks Stats; Continue starts a new run (SWOR‑6)
- Expectation: After Defeat, pressing Continue reveals the full Stats screen in the foreground and waits for player confirmation.
- Actual: Defeat stays in the foreground, Stats renders underneath with a loading icon, then a new run starts automatically. Outcome – you cannot review Stats.
- Repro rate: 3/3 (observed during progression verification S2 and reconfirmed in a dedicated retest S6).
Patch‑note probe example: Music continuity check (STEA‑103‑MUSIC)
SteamDB notes mention a fix for music cutting out, so I ran STEA‑103‑MUSIC:
- Test: 10 min runtime with combat transitions, plus pause/unpause and a level load.
- Outcome: Pass – music stayed continuous across those transitions (S3).
Evidence‑backed “not applicable” example: Missing Dialogue Volume slider (STEA‑103‑AVOL)
SteamDB notes mention a Dialogue Volume slider, but on the Game Pass build the Audio menu only showed Master, Music, and SFX.
- Outcome: Not applicable with evidence of absence (STEA‑103‑AVOL, S4).
- This avoids inventing parity and keeps the matrix honest.
Accessibility issues logged as a known cluster (no new build to retest)
On Day 0 (S0) I captured onboarding accessibility issues as a known cluster (B‑A11Y‑01: SWOR‑1, SWOR‑2, SWOR‑3, SWOR‑4).
- Because no newer build shipped during the week, a regression retest is not applicable until one exists.
- This is logged explicitly rather than implied.
Results Snapshot (for transparency)
In this backing pass the matrix recorded:
- 8 Pass
- 1 Fail
- 1 Not applicable
- 1 Known accessibility cluster captured on Day 0 with no newer build available for retest
Counts are included for context, not as the focus of the article.
Regression‑Testing Takeaways (risk, evidence, and verification)
- Regression testing is change‑driven verification, not “re‑test everything”.
- A repeatable golden‑path baseline stops you wasting time on an unstable build.
- External patch notes can be used as a risk signal without assuming platform parity.
- Anchors make progression and resume verification credible and repeatable.
- Not applicable is a valid outcome if it is evidenced, not hand‑waved.
- Pass results deserve evidence too, because they are still claims.
Regression‑Testing FAQ (manual QA)
Is regression testing just re‑testing old bugs?
No. It verifies that existing behaviour still works after a change, covering previously working systems whether or not bugs were ever logged against them.
Do you need to re‑test everything in regression?
No. Effective regression testing is selective. Scope is driven by change and risk, not by feature count.
How do you scope regression without internal patch notes?
By using external change signals such as public patch notes, previous builds, and observed behaviour as oracles, without assuming platform parity.
What’s the difference between regression and exploratory testing?
- Regression: verifies known behaviour after change.
- Exploratory: searches for unknown risk and emergent failure modes.
They complement each other but answer different questions.
Is a pass result meaningful in regression testing?
Yes. A pass is still a claim, so regression passes should be supported with evidence, not just a checkbox.
When is “not applicable” a valid regression outcome?
When a feature is not present on the build under test and that absence is confirmed with evidence. Logging this explicitly is more honest than assuming parity or silently skipping the check.
Evidence & Case‑Study Links
- (Add links to the workbook tabs: Regression Matrix, Sessions Log, Bug Log)
- (Add links to evidence clips)
This dev.to post stays focused on the regression workflow; the case study links out to the workbook tabs and evidence clips.
