How do you know the software is working?
Source: Dev.to
What We’ll Cover
- Mindset – the many roles you’ll take on: product designer, project manager, tech lead, and quality‑assurance engineer.
- Brainstorming – turning a feature idea into a concrete specification.
- Managing coding agents – how to enforce rules, why they matter, and how to keep the agents on track.
- Shipping – confidently deploying AI‑generated code to production (yes, we’ll do some “vibe coding” in prod).
The order may vary. Buckle up!
Recap
In the previous post we showed how to make Claude stick to conventions (tl;dr – skills + hooks fix it). Claude now follows the rules, but…
Marcin, all tasks are complete.
I open a browser and see:
NoMethodError: undefined method 'hallucinated_method' for an instance of User (NoMethodError)
“Great job, Claude! High‑five… let’s ship it to production… NOT.”
That raises a fundamental question:
How do you know the software is actually working?
A Real‑World Story (2017)
- Company: Paladin Software
- Task: Process massive YouTube‑earnings CSV files (gigabyte‑sized).
- Context: I was the on‑call engineer ensuring timely, reliable processing.
One beautiful Thursday afternoon I was at my favourite spot in Kraków – Dolnych Młynów. Friday was a day off. A client uploaded their spreadsheet on Friday; when I returned on Monday, the earnings still hadn’t been processed.
Looking at Sidekiq’s failed jobs queue:
NoMethodError: undefined method 'hallucinated_method' for an instance of User (NoMethodError)
Was I an LLM before it was a thing? I was certainly shipping code like one.
Why LLMs Need Strict Guardrails
- Anterograde amnesia: LLMs don’t carry new memories from one session to the next. They remember their training data, but whatever they learn while working with you is gone once the session ends.
- Non‑determinism: An AI agent may produce brilliant code one run and broken code the next.
Therefore we must give the model strict rules each time it writes code (see the previous post on enforcing rules) and add checks and reviews.
Deterministic Checks in Your Workflow
Below is my opinion on what should be included in local CI:
| Tool | Purpose | Command |
|---|---|---|
| RuboCop | Static code analyser & linter (autocorrect on) | bundle exec rubocop -A |
| Prettier | Formatter for ERB, CSS, JS | yarn prettier --config .prettierrc.json app/packs app/components --write |
| Brakeman | Security‑vulnerability static analysis | bundle exec brakeman --quiet --no-pager --except=EOLRails |
| RSpec | Testing framework (use your favourite) | bundle exec parallel_rspec --serialize-stdout --combine-stderr |
| SimpleCov | Coverage reporting (used by RSpec) | – |
| Undercover | Warns about changed code without tests (uses git diffs, code structure, SimpleCov) | bundle exec undercover --lcov coverage/lcov/app.lcov --compare origin/master |
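The idea behind the Undercover row can be shown in a few lines: intersect the lines a diff touched with the lines the test suite never executed. This is a toy sketch of that idea only, not how the gem is implemented; the method name `untested_changes` and both input shapes are assumptions made up for illustration.

```ruby
# Toy illustration only: the real Undercover gem reads the LCOV report and the
# git diff itself and also looks at code structure. Both input shapes below
# are made up for this sketch.
#
# changed:  { "path" => [line numbers touched by the diff] }
# coverage: { "path" => { line number => times executed by the test suite } }
def untested_changes(changed, coverage)
  changed.each_with_object({}) do |(path, lines), warnings|
    hits = coverage.fetch(path, {})
    # Flag a changed line when coverage tracks it (it is executable code)
    # but no test ever ran it. Files missing from the report are skipped here.
    missed = lines.select { |line| hits.key?(line) && hits[line].zero? }
    warnings[path] = missed unless missed.empty?
  end
end

# Hypothetical paths and numbers, purely for demonstration.
changed = {
  "app/models/user.rb" => [10, 11, 12],
  "app/jobs/import_job.rb" => [5]
}
coverage = {
  "app/models/user.rb" => { 10 => 3, 11 => 0, 12 => 0 },
  "app/jobs/import_job.rb" => { 5 => 1 }
}

untested_changes(changed, coverage).each do |path, lines|
  puts "#{path}: changed lines #{lines.join(', ')} are never run by a test"
end
```

A check like this is what stops an agent from adding a method and declaring the task complete without a single test calling it.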
What This Gives Us
- Single code style – consistent formatting.
- Security – known vulnerabilities flagged early.
- Test coverage – new and changed code is exercised.
- Fewer runtime surprises – failures like the NoMethodError above surface in CI, not in prod.
“How many times has an AI agent told you a test failure is unrelated to its changes?”
Boilerplate Concerns?
No. Unit tests are fast, and with coding agents they become more maintainable than ever. The reliability payoff far outweighs the modest amount of boilerplate.
Sample CI Output
Continuous Integration
Running checks...
Rubocop
bundle exec rubocop -A
✅ Rubocop passed in 1.98s
Prettier
yarn prettier --config .prettierrc.json app/packs app/components --write
✅ Prettier passed in 1.57s
Brakeman
bundle exec brakeman --quiet --no-pager --except=EOLRails
✅ Brakeman passed in 8.76s
RSpec
bundle exec parallel_rspec --serialize-stdout --combine-stderr
✅ RSpec passed in 1m32.45s
Undercover
bundle exec undercover --lcov coverage/lcov/app.lcov --compare origin/master
✅ Undercover passed in 0.94s
✅ Continuous Integration passed in 1m45.70s
If you’re on Rails 8.1+, a local CI runner for checks like these is already baked into the framework (bin/ci, configured in config/ci.rb). For Rails 8.0 and earlier you can use my ported implementation or roll your own.
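If you do roll your own, the core is small: run each command in order, time it, and stop at the first failure. The sketch below is one possible shape, not the ported implementation mentioned above; `CHECKS`, `run_checks`, and the injectable `runner:` and `out:` parameters are hypothetical names, and the commands are copied from the table.

```ruby
# Hypothetical local CI script: runs each check in order, times it, and stops
# at the first failure. The commands are the ones listed in the table.
CHECKS = {
  "Rubocop"    => "bundle exec rubocop -A",
  "Prettier"   => "yarn prettier --config .prettierrc.json app/packs app/components --write",
  "Brakeman"   => "bundle exec brakeman --quiet --no-pager --except=EOLRails",
  "RSpec"      => "bundle exec parallel_rspec --serialize-stdout --combine-stderr",
  "Undercover" => "bundle exec undercover --lcov coverage/lcov/app.lcov --compare origin/master"
}.freeze

def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# `runner` is injectable so the loop can be exercised without the real tools.
# By default it shells out with Kernel#system, which returns true only when
# the command exits with status 0.
def run_checks(checks, runner: ->(command) { system(command) }, out: $stdout)
  started = monotonic_now
  out.puts "Continuous Integration", "Running checks..."

  checks.each do |name, command|
    out.puts name, command
    step_started = monotonic_now
    ok = runner.call(command)
    elapsed = monotonic_now - step_started

    unless ok
      out.puts format("❌ %s failed in %.2fs", name, elapsed)
      return false
    end
    out.puts format("✅ %s passed in %.2fs", name, elapsed)
  end

  out.puts format("✅ Continuous Integration passed in %.2fs", monotonic_now - started)
  true
end

# In a real script the last line would be something like:
#   exit(run_checks(CHECKS) ? 0 : 1)
```

Returning a non-zero exit status matters more than the pretty output: it is what lets a hook or an agent treat "CI failed" as a hard stop rather than a suggestion.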
Lessons from My Early Career (2014)
- Job: Junior Rails developer at Netguru.
- Onboarding: “The Netguru way” – specific libraries and patterns.
- Feedback: My models were “fat”. I was pointed to Code Climate’s article “7 Ways to Decompose Fat ActiveRecord Models.”
I heard the rules but was too focused on business logic to apply them. Coding agents admit to the same failure:
“The CLAUDE.md says ‘ALWAYS STOP and ask for clarification rather than making assumptions,’ and I violated that repeatedly. I got caught up in the momentum of the Rails 8 upgrade and stopped being careful.”
This isn’t a new problem for the software industry. The remedy? Code review. Every pull request must be checked by another developer, giving less‑experienced developers a safety net and ensuring higher quality overall.
…
(The post continues in the next installment.)
Three‑Stage Review Process
Goals:
- Teach good practices.
- Enable experienced developers to mentor others and pass on their knowledge.
- Enforce the rules so everybody wins.
Remember: Never let the developer review their own code. The same applies to AI agents.
Why a Three‑Stage Review?
1. Specification compliance – Verify that the implementation matches the functional specifications exactly (neither more nor less).
2. Rails & project‑specific conventions – Load all conventions (see the previous post) and check them:
   - Are the interfaces clean?
   - Are view components used instead of partials?
   - Are jobs idempotent and thin?
   - Do the tests verify behaviour?
3. General code‑quality review – Assess architecture, design, documentation, standards, and maintainability.
Each stage is performed by a different agent with a fresh perspective and no attachment to the feature, giving a comprehensive overview of the implementation and any possible deviations.
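To make one of the convention checks concrete, here is roughly what "idempotent and thin" can look like for a job. It is a plain-Ruby sketch under stated assumptions: `EarningsImporter` and `ProcessEarningsJob` are hypothetical names, the hash stands in for a database table, and in a Rails app the job would inherit from ApplicationJob.

```ruby
# Plain-Ruby stand-ins; every name here is hypothetical.
class EarningsImporter
  def initialize(totals)
    @totals = totals # upload_id => total in cents; a database table in a real app
  end

  # The business logic lives here, not in the job.
  def call(upload_id, rows)
    # Idempotency guard: a retry of an already processed upload changes nothing.
    return :already_processed if @totals.key?(upload_id)

    @totals[upload_id] = rows.sum { |row| row.fetch(:amount_cents) }
    :processed
  end
end

# Thin job: it only delegates, so a reviewer can read it in seconds.
class ProcessEarningsJob
  def initialize(importer)
    @importer = importer
  end

  def perform(upload_id, rows)
    @importer.call(upload_id, rows)
  end
end

totals = {}
job = ProcessEarningsJob.new(EarningsImporter.new(totals))
rows = [{ amount_cents: 1_000 }, { amount_cents: 500 }]

first = job.perform(42, rows)
second = job.perform(42, rows) # simulated retry after a failure
puts "first run: #{first}, retry: #{second}, total: #{totals[42]}"
# prints "first run: processed, retry: already_processed, total: 1500"
```

A reviewing agent can check both properties mechanically: the job body is a single delegation, and running it twice leaves the stored total unchanged.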
Example Full Report
1. Spec compliance – line‑by‑line verification
| Requirement | Implementation | Status |
|---|---|---|
| Column: delay_peer_reviews | :delay_peer_reviews | ✅ Match |
2. Rails conventions – checklist
| Convention | Status |
|---|---|
| Reversible migration | PASS |
| Handles existing data | PASS |
3. Code quality – structured report
- Strengths
- Critical / Important / Minor issues (with references)
- Merge assessment
Final summary table
| Check | Status |
|---|---|
| ✅ Spec compliance | Passed |
| ✅ Rails conventions | Passed |
| ✅ Code quality | Approved with minor suggestions |
| ✅ Local CI | Passed |
Ready for merge.
When Issues Are Found
The review consolidates them into a single, actionable list, making it easy for the author to address each point before the final merge.
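The consolidation step can be sketched as merging the findings of all three stages, dropping duplicates, and sorting by severity. Everything here is an assumption for illustration: the `Finding` struct, the file locations, and the rule that only critical and important issues block a merge (chosen to match the "approved with minor suggestions" status in the summary table).

```ruby
# Hypothetical data shape for a single review finding.
Finding = Struct.new(:stage, :severity, :location, :message, keyword_init: true)

# Severity levels taken from the code-quality report: critical, important, minor.
SEVERITY_ORDER = { critical: 0, important: 1, minor: 2 }.freeze

# Merge findings from all stages into one list: drop duplicates reported by
# more than one reviewer, then put the most severe issues first.
def consolidate(findings)
  findings
    .uniq { |f| [f.location, f.message] }
    .sort_by { |f| [SEVERITY_ORDER.fetch(f.severity), f.location] }
end

# Assumption: minor issues are suggestions and do not block the merge.
def ready_for_merge?(findings)
  findings.none? { |f| %i[critical important].include?(f.severity) }
end

# Made-up findings, purely for demonstration.
findings = [
  Finding.new(stage: "code quality", severity: :minor,
              location: "app/models/course.rb:14", message: "Explain the default value"),
  Finding.new(stage: "spec compliance", severity: :critical,
              location: "db/migrate/add_delay.rb:5", message: "Column type differs from the spec"),
  Finding.new(stage: "rails conventions", severity: :critical,
              location: "db/migrate/add_delay.rb:5", message: "Column type differs from the spec")
]

consolidate(findings).each do |f|
  puts "[#{f.severity.to_s.upcase}] #{f.location} (#{f.stage}): #{f.message}"
end
puts ready_for_merge?(findings) ? "Ready for merge." : "Not ready for merge."
```

Keeping the stage on each finding preserves where it came from, so the author can still tell a spec mismatch from a style suggestion after the lists are merged.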