Introducing Nova, our internal platform for coding agents

Published: May 21, 2026 at 12:00 PM EDT
10 min read

Source: Dropbox Tech Blog

# Nova: A Platform for Coding Agents at Dropbox

Coding agents are becoming an important part of software development. Their most obvious use is helping developers write code faster. But code is only one part of building and operating software.  

At Dropbox scale, agents also need to:

- Work within a [large monorepo](https://dropbox.tech/infrastructure/reducing-our-monorepo-size-to-improve-developer-velocity)  
- Validate code changes in Dropbox’s full engineering environment  
- Incorporate context from across the engineering lifecycle (see our post on [how Dash uses context engineering for smarter AI](https://dropbox.tech/machine-learning/how-dash-uses-context-engineering-for-smarter-ai))

Developers don’t just write code; they also manage migrations, unblock CI, investigate failures, and handle repetitive operational work. This work matters, but it is often repetitive and disruptive, pulling engineers’ focus away from deeper product and infrastructure work.

## Why We Built Nova

To prepare for a future where agents can assist engineers with a larger share of their work, we built **Nova**, an internal service for running coding agents in our cloud. With Nova:

- Engineers can run multiple coding sessions in parallel
- Internal systems can use AI agents as part of automated workflows

This platform approach lets us apply agents across internal workflows instead of building one‑off implementations for each use case, making it easier to rapidly experiment with how AI can support engineering work.

## What We’ll Cover

In this post we’ll discuss:

1. **Why we built Nova** – the motivations behind a unified platform.  
2. **Why a platform approach** – the benefits over multiple single‑purpose solutions.  
3. **Lessons learned** – insights from using Nova across the software development lifecycle.

---


## Tackling the Fragmented Workflow Problem

The software development lifecycle contains many stages where engineering judgment is essential, yet the work itself can be repetitive and time‑consuming. Tasks such as debugging failures, updating dependencies, improving test coverage, and fixing flaky tests are critical, but they often distract developers from more meaningful work.

These workflows are well‑suited for AI assistance via coding agents, but they don’t all require the same interaction model:

| Interaction model | Ideal use case |
| --- | --- |
| Standard interactive chat | Quick, ad‑hoc queries, code reviews, or small refactorings. |
| Asynchronous autonomous runs | Long‑running jobs (e.g., bulk dependency upgrades, test‑suite analysis) that only surface results when the agent discovers something useful. |

Supporting both modes consistently demands more than a single‑purpose tool.

### Why Off‑the‑Shelf Tools Fall Short at Dropbox

  • Monorepo scale – Our monorepo is massive and tightly coupled.
  • Bazel‑centric workflow – We rely on Bazel for builds and tests, leveraging caching and remote execution.
  • On‑premise infrastructure – Fast builds and tests depend on internal hardware and custom validation pipelines.

Third‑party coding‑agent tools work well for local iteration, but they don’t naturally integrate with:

  1. Our repository layout.
  2. The Bazel‑driven build and test system.
  3. Dropbox‑specific validation paths and on‑premise resources.

Consequently, we needed agents that could operate within our existing systems rather than forcing a separate AI‑specific workflow.

### Building a Unified Platform

These constraints led us to create a shared platform instead of building isolated solutions for each workflow. The platform had to:

  • Support interactive development (chat‑style assistance).
  • Run background jobs (asynchronous, long‑running tasks).
  • Serve internal services (e.g., automated code reviews, dependency bumpers).
  • Keep execution, validation, and context handling consistent across all use cases.

The result of this effort is Nova, a platform designed to unify AI‑assisted development across Dropbox’s unique infrastructure.

## Improving Development with Nova

Nova started with a single, focused problem: helping engineers respond to continuous‑integration failures by suggesting fixes. That clear starting point shaped the whole platform.

### How a Nova Session Works

  1. Isolated environment – each session runs against a snapshot of the Dropbox codebase at a specific commit.
  2. Task definition – the caller supplies a task and (optionally) validation commands.
  3. Validation loop – after the agent proposes a change, the validation commands are executed.
    • If validation fails (e.g., a test breaks), Nova feeds the results back to the agent and asks it to address the failure.
    • The session moves forward only once the validation results hold up.

Pattern: propose → validate → iterate.
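The propose → validate → iterate pattern can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Nova's actual implementation: `run_session`, `ValidationResult`, and the `propose` and `validate` callables are hypothetical names, and both the agent call and the validation runner are injected so that the loop itself stays deterministic.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ValidationResult:
    command: str   # e.g. a "bazel test ..." invocation
    passed: bool
    output: str    # captured logs, fed back to the agent on failure


def run_session(
    task: str,
    propose: Callable[[str, Optional[List[ValidationResult]]], None],
    validate: Callable[[], List[ValidationResult]],
    max_iterations: int = 5,
) -> bool:
    """Propose, validate, iterate. Returns True once every validation passes."""
    feedback: Optional[List[ValidationResult]] = None
    for _ in range(max_iterations):
        # Hypothetical agent call: edits the isolated workspace in place.
        propose(task, feedback)
        failures = [r for r in validate() if not r.passed]
        if not failures:
            return True
        # Only the failing results go back to the agent for the next round.
        feedback = failures
    return False
```

Bounding the loop with `max_iterations` keeps a session from retrying forever when the agent cannot converge on a passing change.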

### Expanding the Platform

  • Multiple coding agents behind a single interface.
  • Integrations with the tools engineers already use:
    • Web UI for interactive sessions (similar to other cloud‑based coding agents).
    • Command‑line interface and API for launching parallel jobs from local agents, scripts, or internal services.
  • Helpers that let teams add AI‑powered steps without rebuilding surrounding infrastructure.
  • Observability & feedback – built‑in prompt evaluation, telemetry, and feedback‑collection mechanisms so engineers can gauge agent performance.

### Beyond Simple Edits

As Nova was adopted for more workflows, we discovered that many tasks require more than file edits:

  • Gathering evidence, reading logs, inspecting failures.
  • Carrying context across multiple steps.

To support this, Nova provides skills, plugins, and MCP integrations, including access to observability systems.

### Code Publication Strategy

We deliberately keep publication outside the agent and limit each session to a single branch. This design gives us:

  • A predictable view of active branches and pending changes.
  • Simpler automation (e.g., running tests, rebasing onto main).
  • Avoidance of the complexity that would arise if an agent could create and manage multiple branches within a session.

### Example Nova Request (pseudo‑JSON)

{
  "repo_commit": "",
  "task": "Investigate this CI failure and propose a fix",
  "validation_commands": [
    "bazel test //path/to:test_target",
    "bazel test //path/to/related:all"
  ],
  "continue_on_validation_failure": true,
  "max_iterations": 5,
  "push_branch": "ai/nova/ci-fix"
}

Illustrative Nova request showing the key fields used to drive a session.
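A caller might build such a request from typed fields instead of assembling JSON by hand. The sketch below reuses the field names from the illustrative request above; the `NovaRequest` class, its defaults, and its sanity checks are assumptions made for this example and do not describe Nova's real schema or client library.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class NovaRequest:
    # Field names mirror the illustrative request; defaults are assumptions.
    repo_commit: str
    task: str
    validation_commands: List[str] = field(default_factory=list)
    continue_on_validation_failure: bool = True
    max_iterations: int = 5
    push_branch: str = ""

    def check(self) -> None:
        """Hypothetical client-side checks before a request is submitted."""
        if not self.task.strip():
            raise ValueError("task must not be empty")
        if self.max_iterations < 1:
            raise ValueError("max_iterations must be at least 1")

    def to_json(self) -> str:
        self.check()
        return json.dumps(asdict(self), indent=2)
```

Catching an empty task or a non-positive iteration budget on the caller's side avoids spending an isolated environment on a session that cannot do useful work.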

## How We’re Using the Platform

Since launching Nova, we’ve applied it across a range of engineering workflows—from quick, developer‑driven coding sessions to long‑running remediation and migration efforts. The use cases below illustrate how AI coding agents can fit into both interactive day‑to‑day development and more durable operational workflows.


### 1. Developer‑Driven Sessions

Nova supports the kinds of developer‑driven workflows engineers expect from modern coding agents:

  • Web UI – Engineers make quick fixes or build prototypes without leaving their local development loop.
  • Bazel selectivity – We use Bazel’s selectivity tools together with Nova’s validation commands so changes are validated against the correct compile and test targets.
  • Slack integration – A session can be started from a Slack thread, carrying the thread’s context into Nova. This reduces setup time and preserves discussion that would otherwise need to be rewritten by hand.
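The post does not say which selectivity tooling is used, so the following is only one plausible way to turn a set of changed files into validation targets. It builds a `bazel query` expression from the real query functions `rdeps`, `kind`, and `set`; the helper names and the assumption that changed files are already expressed as Bazel labels are ours.

```python
from typing import Iterable, List


def affected_tests_query(changed_file_labels: Iterable[str], universe: str = "//...") -> str:
    """Build a query for test rules that transitively depend on the changed files."""
    labels = sorted(set(changed_file_labels))
    if not labels:
        raise ValueError("no changed files given")
    file_set = " ".join(labels)
    # rdeps(universe, x): targets in `universe` that depend on x.
    # kind(pattern, expr): keep only targets whose rule kind matches the pattern.
    return f'kind(".*_test rule", rdeps({universe}, set({file_set})))'


def bazel_query_command(changed_file_labels: Iterable[str]) -> List[str]:
    """Argument list suitable for subprocess.run; nothing is executed here."""
    return ["bazel", "query", "--output=label", affected_tests_query(changed_file_labels)]
```

The resulting labels could then be passed to the session as validation commands such as `bazel test <label>`, so that the agent's change is checked against the targets it can actually affect.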

### 2. Flaky‑Test Remediation

One of Nova’s most successful operational workflows is flaky‑test remediation. We built an internal tool called Deflaker, a durable workflow that integrates with Athena, our flaky‑test detection system.

#### Workflow overview

  1. Detect – Athena finds a flaky test and extracts both passing and failing logs.
  2. Send to Nova – The logs are sent as context; Nova is asked to identify a likely root cause and propose a fix.
  3. Validate – The proposed change is validated by running the test ≥ 100 times in CI (the exact number depends on the failure rate).
  4. Iterate – If the test still flakes, the new logs and notes from the previous attempt are fed back to Nova for another fix attempt.
  5. Terminate – The loop continues until a working fix lands or a maximum of five attempts is reached.

Figure: Flaky‑test remediation flow. Athena detects a flaky test → logs go to Nova → Nova proposes a fix → CI runs at least 100 validation attempts → success lands the fix; failure restarts the loop with new logs and notes.
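The remediation loop, and the idea that the number of validation runs depends on the failure rate, can be sketched as follows. All names here (`validation_runs`, `Attempt`, `deflake`, `propose_fix`, `run_in_ci`) are hypothetical. The run-count formula is our assumption, not Deflaker's documented rule: if a test fails independently with probability p, then n clean runs happen by chance with probability (1 − p)^n, so n is chosen to push that below 1 − confidence, with a floor of 100 runs.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional


def validation_runs(failure_rate: float, confidence: float = 0.99, minimum: int = 100) -> int:
    """Runs needed so an unfixed flake would show up with the given confidence."""
    if not 0.0 < failure_rate < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("failure_rate and confidence must be strictly between 0 and 1")
    needed = math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))
    return max(minimum, needed)


@dataclass
class Attempt:
    patch: str
    notes: str
    failing_logs: List[str] = field(default_factory=list)


def deflake(
    initial_logs: List[str],
    failure_rate: float,
    propose_fix: Callable[[List[str], List[Attempt]], Attempt],  # hypothetical agent call
    run_in_ci: Callable[[str, int], List[str]],  # returns logs of the runs that failed
    max_attempts: int = 5,
) -> Optional[Attempt]:
    """Return the first attempt that survives validation, or None after max_attempts."""
    runs = validation_runs(failure_rate)
    logs = list(initial_logs)
    history: List[Attempt] = []
    for _ in range(max_attempts):
        attempt = propose_fix(logs, history)
        failing = run_in_ci(attempt.patch, runs)
        if not failing:
            return attempt
        # Carry the new logs and the previous attempt's notes into the next round.
        attempt.failing_logs = failing
        history.append(attempt)
        logs = failing
    return None
```

Under this assumption a test that flakes 1% of the time needs 459 clean runs for 99% confidence, while one that flakes 10% of the time is covered by the 100-run floor.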


### 3. Migrations & Dependency Upgrades

Migrations and dependency upgrades became a natural fit for Nova.

  • Legacy approach – We previously used a bespoke Goose‑based AI migrator that generated parallel AI coding jobs via prompt templates and verification commands, publishing results to GitHub branches. It handled thousands of migration entries (e.g., Enzyme → React Testing Library, mypy type‑config updates).
  • Limitations – No interactivity for reviewing or continuing agent output, making failure recovery difficult. Highly repeatable migrations were better managed directly by migration owners who could launch many agents with the same runbook.

#### Nova‑enabled workflow

  • Interactive coding sessions with shared guardrails.
  • Reusable workflow tooling and a consistent operating model.
  • Future goal: migration owners write a prompt once, run it in parallel across many code‑base parts, and review changes as part of a coordinated rollout.
  • Integration with RenovateBot enables agents to take a first pass at repairing breakages introduced by dependency upgrades.
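The "write a prompt once, run it across many parts of the codebase" goal maps naturally onto a fan-out over a prompt template. The sketch below assumes a hypothetical `submit` callable that starts one session per prompt and returns some result handle; Nova's real CLI and API are not described in enough detail here to show them, so only the fan-out shape is illustrated.

```python
from concurrent.futures import ThreadPoolExecutor
from string import Template
from typing import Callable, Dict, List


def fan_out(
    prompt_template: str,
    targets: List[str],
    submit: Callable[[str], str],  # hypothetical: launches a session, returns its result
    max_parallel: int = 8,
) -> Dict[str, str]:
    """Render one prompt per target and run the sessions in parallel."""
    template = Template(prompt_template)
    prompts = {target: template.substitute(target=target) for target in targets}
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = {target: pool.submit(submit, prompt) for target, prompt in prompts.items()}
        # Results are keyed by target so reviewers can map each change to its directory.
        return {target: future.result() for target, future in futures.items()}
```

A call such as `fan_out("Migrate $target from Enzyme to React Testing Library", targets, submit)` would then produce one reviewable result per target, with `max_parallel` capping how many sessions run at once.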

### 4. Emerging Workflows & Experiments

| Experiment | Goal | Current status |
| --- | --- | --- |
| Production‑crash response | Recreate crash states with tests, generate candidate fixes, route results to service teams. | In production for high‑severity alerts. |
| Automated review gating | Evaluate PRs against secondary‑team review policies and advise whether extra review is needed. | Prototype stage. |
| Scheduled on‑call toil reduction | Automate recurring tasks (e.g., alert flapping, Slack follow‑ups). | Early testing. |
| Multi‑agent code review | Run several agents on the same change, aggregate results, deduplicate low‑value comments. | Proof‑of‑concept. |

These experiments extend Nova beyond pure code authoring, exploring how agents can help with triage, policy enforcement, and on‑call automation.
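As one illustration of the multi-agent review idea, the aggregation step could group comments that point at the same location with the same normalized text, and use the number of agreeing agents as a crude value signal. Everything in this sketch is an assumption for illustration: the `ReviewComment` shape, the exact-match normalization, and the `min_agents` threshold are not taken from the proof-of-concept.

```python
import re
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class ReviewComment:
    agent: str
    path: str
    line: int
    text: str


def _normalize(text: str) -> str:
    """Lowercase and collapse punctuation so near-identical wording matches."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def aggregate(comments: Iterable[ReviewComment], min_agents: int = 1) -> List[Tuple[ReviewComment, int]]:
    """Merge duplicate comments; return (representative, agreeing agent count) pairs."""
    groups: Dict[Tuple[str, int, str], List[ReviewComment]] = {}
    for comment in comments:
        key = (comment.path, comment.line, _normalize(comment.text))
        groups.setdefault(key, []).append(comment)
    merged = []
    for group in groups.values():
        agents = {c.agent for c in group}
        if len(agents) >= min_agents:
            merged.append((group[0], len(agents)))
    # Comments that more agents agree on come first.
    merged.sort(key=lambda item: (-item[1], item[0].path, item[0].line))
    return merged
```

Raising `min_agents` above 1 drops comments that only a single agent raised, which is one simple way to filter likely low-value remarks.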


### Takeaway

Nova’s flexible, interactive platform lets us:

  • Accelerate day‑to‑day development with UI‑driven sessions and Slack context.
  • Automate noisy, repetitive operational work (flaky tests, migrations, dependency upgrades).
  • Prototype new AI‑driven workflows that reduce toil and improve code‑quality governance.

The platform continues to evolve as we discover more ways AI agents can augment both developer productivity and operational reliability.

## What We Learned

One lesson we learned is that the value of coding agents comes as much from the surrounding platform as from code generation itself. Running agents as a service gives us a reusable way to support a wide range of engineering workflows.

We also found that context, validation, and guardrails reinforce one another:

  • Localized AGENTS.md files provide service‑specific context.
  • Validation commands, isolated execution, hermetic tests, Bazel caching, and retry loops let agents operate against the same systems engineers rely on every day.

Each layer improves reliability on its own, but together they make background workflows more trustworthy.
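To make the idea of localized context concrete, here is a minimal sketch of how a session could gather the `AGENTS.md` files that apply to a changed file, from the repository root down to the file's own directory. The function name and the ordering (general first, most specific last) are assumptions; how a given agent actually discovers and weights these files depends on the tool.

```python
from pathlib import Path
from typing import List


def collect_agents_files(repo_root: Path, changed_file: Path) -> List[Path]:
    """Return AGENTS.md paths that apply to changed_file, most general first."""
    root = repo_root.resolve()
    current = (root / changed_file).resolve().parent
    if current != root and root not in current.parents:
        raise ValueError(f"{changed_file} is outside the repository root")
    found: List[Path] = []
    while True:
        candidate = current / "AGENTS.md"
        if candidate.is_file():
            found.append(candidate)
        if current == root:
            break
        current = current.parent
    # Walked from the file's directory upward; reverse so the root file comes first.
    found.reverse()
    return found
```

Putting the most specific file last lets service-level instructions refine or override repository-wide ones when the files are concatenated into the agent's context.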


### When Not to Put Everything Inside the Agent Loop

As we expanded Nova across the software development lifecycle, we had to decide where agentic behavior was useful and where deterministic systems should remain in control.

  • Letting an agent manage its own test execution and iteration can leave sessions waiting on CI for hours or cause changes to be validated against the wrong tests.
  • We found it works better for surrounding workflows to trigger CI deterministically and bring the agent back only when a failure occurs that needs inspection or fixing.

## Looking Ahead

As coding agents continue to improve, we expect them to take on a larger share of repetitive work across the software development lifecycle. The path forward is not just better models, but better integration with the systems that shape engineering work.

Nova gives us a shared execution layer for AI‑assisted workflows through:

  • Isolated environments
  • Repository‑aware context
  • Validation loops
  • Workflow integration
  • Reviewable outputs

As we continue expanding context sources—including through Dash and MCP‑based integrations—we expect agents to become more useful, more reliable, and better aligned with how engineering gets done at Dropbox.


Acknowledgments: Samm Desmond, Daniel Avramson, Adam Ziel, and Chris Hodges


If building innovative products, experiences, and infrastructure excites you, come build the future with us! Visit jobs.dropbox.com to see our open roles.
