From Human-First to Agent-First Testing: What a Year of Building Taught Us
Source: Dev.to
Overview
Shiplight Cloud is a fully‑managed, cloud‑based natural language testing platform designed to multiply human productivity. Teams author tests visually, the platform handles execution, and results are managed in the cloud. It continues to serve teams that need managed test authoring and execution.
By late 2025 the landscape shifted:
- AI coding agents generate test scripts quickly, but the output is hard to review and expensive to maintain. Test volume grows while confidence does not.
- Roles are collapsing – the traditional PM → engineer → QA handoff is dissolving. A single person increasingly defines, builds, and verifies with AI, making quality an integrated activity rather than a separate phase.
- Specs become the source of truth – with AI generating code from intent, the canonical representation of product behavior moves upstream from code to structured natural language.
To address this, we introduced Shiplight Plugins, a new product for developers and automation engineers who work with AI agents. The core principle is simple: AI handles test creation, execution, and maintenance, while the system produces clear evidence at every step for humans to understand and trust.
Core Principles
- Tight feedback loop for AI agents – AI coding agents produce better results when they receive clear, immediate feedback. Verification happens during development, not after.
- Spec‑driven – Tests read like product specs, not implementation code, so anyone on the team can review what is being tested without technical expertise.
- Auto‑healing – Cosmetic and structural UI changes do not break tests as long as product behavior is unchanged.
- Human‑readable evidence – Pass/fail results are understandable by anyone on the team without reading code or stack traces.
- Performant – Tests are fast and repeatable by default, with deterministic replay where possible and AI resolution only when needed.
- No new platform to learn – Extend the tools and workflows developers already use rather than introducing a brand‑new system.
How It Works
MCP Integration
Any MCP‑compatible coding agent connects to the Shiplight browser MCP server, gaining the ability to:
- Open a browser, navigate the app, interact with elements, take screenshots, and observe network activity.
- Attach to an existing Chrome DevTools URL to test against a running dev environment with real data and authenticated state.
- Use a relay server for remote and headless setups.
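For orientation, MCP clients invoke a server's tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request in TypeScript. The tool name `browser_navigate` and its `url` argument are hypothetical placeholders, not names taken from the Shiplight server; a real client would discover the available tools through `tools/list`.

```typescript
// Shape of an MCP `tools/call` request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Build a request for the named tool. Valid tool names and arguments
// are whatever the connected server advertises.
function buildToolCall(toolName: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Hypothetical tool name and argument, for illustration only.
const navigateRequest = buildToolCall("browser_navigate", {
  url: "https://your-app.com/login",
});
console.log(JSON.stringify(navigateRequest, null, 2));
```

In practice the coding agent's MCP client assembles these messages itself; the sketch only shows what travels over the wire.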
Test Authoring in YAML
Shiplight tests are written in natural‑language‑oriented YAML, a format intended to address the readability and maintenance problems of AI‑generated Playwright scripts.

    # shiplight-test.yaml
    goal: Verify that a user can log in and create a new project
    base_url: https://your-app.com
    statements:
      - URL: /login
      - intent: Enter email address
        action: input_text
        locator: "getByPlaceholder('Email')"
        text: "{{TEST_EMAIL}}"
      - intent: Enter the password
        action: input_text
        locator: "getByPlaceholder('Password')"
        text: "{{TEST_PASSWORD}}"
      - intent: Click Sign In
        action: click
        locator: "getByRole('button', { name: 'Sign In' })"
      - VERIFY: The dashboard is visible with a welcome message
      - intent: Click "New Project" in the sidebar
        action: click
        locator: "getByRole('link', { name: 'New Project' })"
      - VERIFY: The project creation form is displayed

Each test describes the flow in human terms. The same person who specified the feature can review the test without understanding test code. Files live in the repo, are reviewed in PRs, and produce clean diffs. Intent‑based steps resolve via AI at runtime or use cached locators for deterministic replay.
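The "cached locator first, AI only when needed" idea can be illustrated with a small conceptual sketch. Everything here is an assumption made for illustration: the `LocatorResolver` class, its method names, and the stubbed AI call are hypothetical and do not reflect Shiplight's internals. The sketch is also synchronous for brevity, whereas a real resolver would be asynchronous.

```typescript
// Conceptual sketch: cache-first locator resolution with an AI fallback.
// All names are hypothetical; this is not Shiplight's implementation.
interface Resolution {
  locator: string;
  source: "cache" | "ai";
}

class LocatorResolver {
  private cache = new Map<string, string>();
  private aiResolve: (intent: string) => string;
  private matchesPage: (locator: string) => boolean;

  constructor(
    aiResolve: (intent: string) => string,
    matchesPage: (locator: string) => boolean,
  ) {
    this.aiResolve = aiResolve;
    this.matchesPage = matchesPage;
  }

  // Seed the cache with a known locator for an intent
  // (assumed here to come from a statement's `locator` field).
  remember(intent: string, locator: string): void {
    this.cache.set(intent, locator);
  }

  resolve(intent: string): Resolution {
    const cached = this.cache.get(intent);
    // Fast path: the cached locator still matches something on the page.
    if (cached !== undefined && this.matchesPage(cached)) {
      return { locator: cached, source: "cache" };
    }
    // Slow path: ask the (stubbed) AI resolver and cache its answer.
    const locator = this.aiResolve(intent);
    this.cache.set(intent, locator);
    return { locator, source: "ai" };
  }
}

// Usage with stubs: the button was renamed from "Sign In" to "Log In".
const currentPage = new Set(["getByRole('button', { name: 'Log In' })"]);
const locatorResolver = new LocatorResolver(
  () => "getByRole('button', { name: 'Log In' })", // stand-in for an AI call
  (locator) => currentPage.has(locator),
);
locatorResolver.remember("Click Sign In", "getByRole('button', { name: 'Sign In' })");

const firstRun = locatorResolver.resolve("Click Sign In");
const secondRun = locatorResolver.resolve("Click Sign In");
console.log(firstRun.source);  // stale cache entry, so this run falls back to "ai"
console.log(secondRun.source); // the refreshed entry is then served from "cache"
```

The design choice worth noting is that the fallback writes back to the cache, so only the first run after a UI change pays for AI resolution.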
Running and Debugging Tests
- `shiplight test` – runs tests locally.
- `shiplight debug` – opens an interactive debugger to step through tests one statement at a time, inspect browser state, and edit steps in place.
Reporting
After a run, Shiplight generates an HTML report where:
- Natural‑language steps are paired with screenshots.
- On failure, the report shows a screenshot of the actual page, the expected behavior, and an AI‑generated explanation (e.g., “Expected a welcome message, but the page displays ‘Session Expired’”).
- The output is readable by anyone on the team without code context.
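To make the report contents concrete, here is a minimal sketch of the kind of per-step record such a report could be rendered from. The `StepResult` fields and the `describeStep` function are hypothetical, chosen to mirror the items listed above; they are not Shiplight's report schema.

```typescript
// Hypothetical per-step record; field names are illustrative only.
interface StepResult {
  step: string;           // natural-language step text
  passed: boolean;
  screenshotPath: string; // screenshot captured for this step
  expected?: string;      // expected behavior, recorded on failure
  explanation?: string;   // AI-generated explanation, recorded on failure
}

// Render one step as a plain-language line that needs no code context.
function describeStep(result: StepResult): string {
  if (result.passed) {
    return `PASS: ${result.step} (screenshot: ${result.screenshotPath})`;
  }
  const expected = result.expected ?? "not recorded";
  const observed = result.explanation ?? "no explanation available";
  return (
    `FAIL: ${result.step}. Expected ${expected}, but ${observed} ` +
    `(screenshot: ${result.screenshotPath})`
  );
}

// Example values are made up; the path is a placeholder.
const failedStep: StepResult = {
  step: "The dashboard is visible with a welcome message",
  passed: false,
  screenshotPath: "screenshots/dashboard.png",
  expected: "a welcome message",
  explanation: "the page displays 'Session Expired'",
};
console.log(describeStep(failedStep));
```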
CI/CD Integration
Tests are plain YAML files in the repository. The CLI runs anywhere Node.js runs, so adding Shiplight to CI is straightforward:

    # Example GitHub Actions step
    - name: Run Shiplight tests
      run: npx shiplight test ./tests

The same approach works for GitLab CI, CircleCI, etc.: just add a step and point it at the test directory.
Cloud Features (Optional)
Shiplight Cloud provides scheduled runs, team dashboards, historical trends, and hosted reports. These are available when needed, but the core loop works entirely with the CLI and existing CI, avoiding vendor lock‑in.
Benefits
- Real‑browser verification during development – AI agents validate UI changes before code review.
- Stable regression tests generated automatically – Verification becomes a by‑product of development, building regression coverage without extra effort.
- AI‑driven self‑healing – Intent‑based steps adapt to UI changes; cached locators keep execution fast, with AI resolution only when necessary.
- Enterprise‑ready security – SOC 2 Type II certified, encrypted data, role‑based access, immutable audit logs, and a 99.99% uptime SLA.
Quick Start Guide
- Install the CLI: `npm install -g shiplight`
- Create a test file (e.g., `login-create-project.yaml`) using the YAML spec above.
- Run locally: `shiplight test ./login-create-project.yaml`
- Add to CI – include the `shiplight test` command in your pipeline.
- Optional – enable Shiplight Cloud features by linking your repo in the Shiplight dashboard.
YAML Test Language Spec
The full specification for the Shiplight YAML language is available in the repository under docs/yaml-spec.md. It defines:
- Top‑level keys (`goal`, `base_url`, `statements`).
- Statement types (`URL`, `intent`, `VERIFY`).
- Supported actions (`click`, `input_text`, `select`, etc.).
- Locator syntax (Playwright‑style selectors).
- Variable interpolation (`{{VARIABLE}}`).
Refer to the spec for advanced features such as conditional steps, loops, and custom AI resolvers.
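As an illustration of `{{VARIABLE}}` interpolation, the sketch below substitutes placeholders from a plain key-value map. It is a minimal sketch under stated assumptions, not Shiplight's implementation: the `interpolate` function, the accepted variable-name pattern, and the choice to throw on a missing value are all assumptions.

```typescript
// Minimal sketch of {{VARIABLE}} substitution (assumed behavior, not Shiplight's code).
function interpolate(
  template: string,
  vars: Record<string, string | undefined>,
): string {
  // Match {{NAME}}, allowing optional whitespace inside the braces.
  const pattern = /\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}/g;
  return template.replace(pattern, (_match: string, key: string) => {
    const value = vars[key];
    if (value === undefined) {
      // Fail loudly instead of silently typing an empty string into a form.
      throw new Error(`Missing value for variable ${key}`);
    }
    return value;
  });
}

// Example with a made-up value for the TEST_EMAIL variable used earlier.
const emailText = interpolate("{{TEST_EMAIL}}", { TEST_EMAIL: "qa@example.com" });
console.log(emailText); // prints "qa@example.com"
```

In a Node.js setting the map could plausibly be `process.env`, which would keep credentials such as `TEST_PASSWORD` out of the repository; whether Shiplight reads variables this way is not stated in the article.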
Shiplight Plugins Overview
Shiplight Plugins extend the core platform for developers and automation engineers:
| Plugin | Purpose | Key Features |
|---|---|---|
| MCP Server | Enables AI agents to control a real browser session | Attach to existing DevTools URL, remote/headless support, network capture |
| CLI Enhancements | Improves local test authoring and debugging | Interactive debugger, step editing, live reload |
| CI Integrations | Simplifies adding Shiplight to pipelines | Pre‑built actions for GitHub, GitLab, CircleCI, Azure DevOps |
| Report Viewer | Generates shareable HTML reports | Screenshot embedding, AI explanations, diff view |
| Security Add‑on | Provides enterprise‑grade controls | SSO, RBAC, audit logging, encrypted storage |
These plugins are distributed as npm packages and can be added to any Node.js project with a single npm install command.