Cline in VS Code: I used it for two weeks on a TypeScript project, and this is what survived

Published: (June 7, 2026 at 08:02 AM EDT)
11 min read
Source: Dev.to

Back in 2005, when the internet café closed at 11pm and the place was packed, there was no time to read docs. You had to diagnose, run a command, see what happened, correct. That shaped something in me: deep respect for tools that let you see exactly what they’re about to do before they do it, and deep suspicion toward anything that acts without warning you. When I started evaluating autonomous coding agents in 2024, that same instinct pushed me to look at the permission model before any speed benchmark. Cline was the first one where I stopped for more than an hour configuring limits before writing a single real instruction.

My thesis, before going into detail: autonomous coding agents are not all the same, and Cline has a permission model that makes it more controllable than other tools. But the devil is in how you configure those limits, not in the tool itself. If you install Cline with defaults and ask it to refactor a complex module, you’re going to have a radically different experience than if you invest 30 minutes defining what it can and can’t touch.

What follows is an analysis based on two weeks of active use on a real TypeScript project, documenting delegated tasks, mistakes made, and configuration decisions. It’s not a benchmark. There are no invented numbers. It’s judgment earned through craft.

Cline is available on the VS Code Marketplace as an open source extension. The official description presents it as an autonomous agent that can read files, execute terminal commands, navigate the browser, and create or edit code, all from inside VS Code.

What the page says clearly: Cline operates with an approval loop by default. Each potentially destructive action (writing a file, executing a terminal command) asks for confirmation before proceeding. You can configure “auto-approve” mode for specific categories, but you start in safe mode.

What the page doesn’t say: that the quality of the output depends almost entirely on the model you connect.
Cline is provider-agnostic: you can use Claude via Anthropic directly or via OpenRouter, GPT-4o, local models via Ollama, or whatever you want. That flexibility is genuine and it’s a real advantage over tools that lock you into one provider, but it also means “using Cline” can mean completely different experiences depending on the model you choose.

In the experiment I’m about to describe, I used claude-3-5-sonnet via the Anthropic API directly. I didn’t use OpenRouter in this iteration because I wanted to isolate the model variable.

The project: a TypeScript codebase with Express, Prisma, and PostgreSQL. Nothing experimental in the stack. In fact, I deliberately chose a project with a known stack so I could evaluate Cline’s errors without confusing them with my own uncertainty about the technology.

I split tasks into three categories:

Category A (full delegation with review at the end):
- Generating Zod types from existing Prisma schemas
- Writing unit tests for already-implemented pure functions
- Creating seed files with consistent test data

Category B (delegation with intermediate checkpoints):
- Refactoring a validation module with high coupling
- Migrating Express endpoints to a cleaner router structure
- Resolving TypeScript strict errors in specific files

Category C (not delegated, monitored):
- Any changes to the database schema
- Modifications to authentication logic
- Changes to infrastructure configuration files

I didn’t pull this classification from any guide. I built it after the first 48 hours, when Cline did something I didn’t expect: in a Category B task, it decided to resolve a type error by changing an import in a file I hadn’t mentioned, which was technically correct but pulled me completely out of context. Not a serious error, but a signal that “review at the end” didn’t work for tasks with lateral dependencies.

// Example of an instruction that worked well for Category A
// (generating a Zod schema from an existing Prisma model)

// Instruction to Cline:
// "Generate a Zod schema for the User model from the schema.prisma file.
// Only the file src/schemas/user.schema.ts.
// Do not modify any other file.
// Use z.string().uuid() for the id field."

// Expected and received result:
import { z } from 'zod'

export const UserSchema = z.object({
  id: z.string().uuid(),
  email: z.string().email(),
  nombre: z.string().min(1),
  creadoEn: z.coerce.date(),
  actualizadoEn: z.coerce.date(),
})

// Static type derived from the schema
export type User = z.infer<typeof UserSchema>
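
The instruction above has a shape worth reusing: one task, one target file, an explicit ban on touching anything else, and any field-level rules. What follows is a minimal sketch of that shape as a small helper, assuming you keep instructions as text that you paste into Cline. `ScopedInstruction` and `buildInstruction` are hypothetical names for illustration, not part of Cline or its API.

```typescript
// Hypothetical helper: composes a scoped instruction as plain text.
// Nothing here calls Cline; the output is meant to be pasted into the chat.
interface ScopedInstruction {
  task: string;          // what to do, in one sentence
  targetFile: string;    // the only file the agent may write
  constraints: string[]; // extra rules, e.g. field-level requirements
}

function buildInstruction(spec: ScopedInstruction): string {
  return [
    spec.task,
    `Only the file ${spec.targetFile}.`,
    "Do not modify any other file.",
    ...spec.constraints,
  ].join("\n");
}

// Rebuilds the instruction used for the User schema above.
const userSchemaInstruction = buildInstruction({
  task: "Generate a Zod schema for the User model from the schema.prisma file.",
  targetFile: "src/schemas/user.schema.ts",
  constraints: ["Use z.string().uuid() for the id field."],
});

console.log(userSchemaInstruction);
```

The scope line is always emitted, so forgetting it takes a deliberate edit rather than a moment of haste.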

The precision of the instruction matters more than the complexity of the task. That’s the first thing I learned.

Error 1: Over-generalization of a local fix

I asked it to resolve a TypeScript error in a specific file. Cline fixed the error correctly, but also modified a shared type in a definitions file because “it was cleaner.” Technically impeccable. Context completely lost on my end.

What it revealed: Cline reasons about the entire codebase, not the scope you give it. If you don’t explicitly say “don’t modify anything outside file X,” it’s going to explore laterally. This can be an advantage when you want it to find the real root cause; it’s a problem when you want a surgical change.

Error 2: Tests that passed but didn’t test anything useful

In test generation tasks, Cline delivered files with 100% coverage whose assertions checked that the code ran, not how it behaved: expect(fn()).toBeDefined() instead of expect(fn(input)).toEqual(expectedOutput). They passed. They contributed nothing.

What it revealed: the instruction “write tests for this function” is too open. You need to specify what edge cases you want covered, what behaviors are critical, and what level of assertion you expect. If you don’t, Cline optimizes for coverage, not for utility.

// Vague instruction → tests that pass but are useless
// "Write tests for the calcularDescuento function"

// What it delivered (summarized):
describe('calcularDescuento', () => {
  it('should return a value', () => { // ← this tests nothing useful
    expect(calcularDescuento(100, 10)).toBeDefined()
  })
})

// Precise instruction → tests that actually matter
// "Write tests for calcularDescuento.
// Required cases:
// - 0% discount returns the original price unchanged
// - 100% discount returns 0
// - negative discount throws an Error with message 'Descuento inválido'
// - price 0 with any discount returns 0"

describe('calcularDescuento', () => {
  it('0% discount returns original price', () => {
    expect(calcularDescuento(100, 0)).toBe(100)
  })
  it('100% discount returns 0', () => {
    expect(calcularDescuento(100, 100)).toBe(0)
  })
  it('negative discount throws error', () => {
    expect(() => calcularDescuento(100, -5)).toThrow('Descuento inválido')
  })
  it('price 0 returns 0 regardless of discount', () => {
    expect(calcularDescuento(0, 50)).toBe(0)
  })
})
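
The source of calcularDescuento isn’t shown in this post, so what follows is only a minimal sketch of an implementation that would satisfy the four required cases. The parameter names and the percentage-based formula are assumptions; only the function name and the 'Descuento inválido' message come from the tests above.

```typescript
// Hypothetical implementation; the real function is not shown in the post.
// precio: original price, porcentaje: discount as a percentage (0 to 100).
function calcularDescuento(precio: number, porcentaje: number): number {
  if (porcentaje < 0) {
    // Matches the message the precise tests expect
    throw new Error("Descuento inválido");
  }
  // Subtract the discounted amount from the original price
  return precio - (precio * porcentaje) / 100;
}
```

The vague toBeDefined() test would keep passing even if this function returned the wrong number, which is why it contributes nothing. The four precise cases would each fail on a specific, nameable regression.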

Error 3: Autonomy without checkpoints in long refactors

The most time-consuming error. In a Category B refactor, Cline completed 12 editing steps before I reviewed the intermediate state. The final result was correct, but there was a design decision in step 4 that I disagreed with, and rolling it back at that point took more time than just discussing it upfront.

What it revealed: for tasks with more than 5 steps, the review loop needs to be explicit. You can tell Cline to pause and wait for confirmation before moving to each phase, and it’s worth doing.

Claude Code (Anthropic’s terminal tool) and Cline share the same base model when you configure Cline with Claude. The difference isn’t in the model’s intelligence. It’s in the execution environment and the cost model.

Cline:
- You live inside VS Code. The visual context of the codebase is available.
- You pay per token via the Anthropic API (or whichever provider you use). The cost is proportional to how much context you send and how many actions the agent executes.
- The permission model is granular and configurable. You can tell it exactly which directories it can touch.
- Each conversation is a new session: no persistent memory between sessions without extra configuration.

Claude Code:
- You operate from a terminal with Anthropic’s own CLI.
- It has a Pro subscription model that can be more predictable in cost if you use a lot of context.
- Git integration is smoother by design.
- It builds codebase context by actively reading the filesystem.

My honest take: for point-editing workflows inside VS Code, Cline is more ergonomic. For tasks that cross many files with complex dependencies, Claude Code has an advantage in how it handles the context of the full conversation. They’re not equivalent; they’re tools with different strengths.

If you’ve read the posts on rate limiting in web applications or middleware patterns in Next.js, you know that tool choice always depends on the most expensive constraint in the system.
Here the constraint is: how much context do you need to maintain between steps? That determines which tool makes more sense.

This is the most important section of the post, because the temptation to delegate everything is real, and so is the cost of learning it the hard way.

  1. Database schema changes. If you want to see how I think about Prisma migrations with actual judgment, the post on Prisma 5 → 6 breaking changes has the framework I use.
  2. Authentication and authorization logic
  3. Refactors without prior tests
  4. Architecture decisions (see the post on digital identity architecture)

Before delegating a task to Cline, I run through this list mentally:

Green (delegate with a precise instruction):
[ ] The output is a new file with no lateral dependencies
[ ] There are existing tests covering the behavior you’re about to change
[ ] The scope of the change is one isolated file or module
[ ] You can define the success criterion in one sentence

Yellow (delegate with explicit checkpoints):
[ ] The task has more than 5 sequential steps
[ ] The change touches more than 3 files
[ ] The result depends on a project-specific pattern that isn’t documented
[ ] It’s the first time Cline is working on that module

Red (don’t delegate, use Cline only for an initial draft):
[ ] Any change to database schema or migrations
[ ] Authentication, authorization, or secrets handling logic
[ ] Changes to infrastructure configuration files (Docker, CI, environment variables)
[ ] Architecture decisions that affect multiple teams

I want to be straight about what this analysis doesn’t prove:
- It doesn’t prove Cline is better or worse than other tools in absolute terms. The connected model changes everything.
- There are no verifiable speed metrics here. “Faster than without an agent” is a perception, not a number.
- The errors described are observable patterns, not bugs reproducible in every context. The same instruction in a different codebase can produce different results.
- The real cost depends on how much context you send per session. There’s no generally valid number without knowing the codebase size and usage frequency.

What you can conclude: permission configuration and instruction precision have more impact on output quality than whether you use Cline vs another comparable tool. That learning is transferable.

Does Cline work with models other than Claude?
The VS Code Marketplace page documents supported providers.

How do you control which files Cline can touch?
With the .clinerules file (a file in the project root where you define agent behavior rules) and the default approval loop that shows you each action before executing it. In default mode, nothing executes without your explicit approval.

Does it make sense to use it if you’re already using GitHub Copilot?

What happens to cost if you let the agent run on long tasks?

Is it viable in TypeScript with strict mode on?
The post on TypeScript strict mode and tsconfig is where to start.

How does Cline’s autonomy model compare to Claude Code?

Cline survived the experiment. It stays in my workflow for Category A tasks: precise boilerplate generation, types, seeds, tests with explicit criteria. For everything else, I have the checkpoints.

What I don’t buy: the narrative that configuring an autonomous agent correctly is a five-minute job. It’s not. The .clinerules, the task classification, the scope definition per instruction: all of that takes time and gets refined through error. If someone tells you they installed Cline and delegated everything without issues from day one, they either have a very simple codebase or they didn’t review the output carefully enough.

What I do accept: for a software architect who already has formed technical judgment, Cline is a tool that multiplies speed in the right parts of the work, the parts that are repeatable, definable, and verifiable. The decisions that matter are still yours.

The concrete next step if you want to reproduce this: install Cline from the VS Code Marketplace, create a .clinerules file in the root of your TypeScript project with the directories the agent cannot touch, and start with a Category A task. Measure the cost of that session. Then scale.

Original source: Cline, VS Code Marketplace: https://marketplace.visualstudio.com/items?itemName=saoudrizwan.claude-dev
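
For the “measure the cost of that session” step, here is a minimal sketch of the arithmetic behind per-token billing, under a simplified model where each agent step sends some context (input tokens) and receives a response (output tokens). `TokenRates`, `AgentStep`, and `estimateSessionCost` are hypothetical names, and the rates and token counts in the usage lines are placeholders, not real prices or measurements. Use your provider’s published rates and the token counts your session actually reports.

```typescript
// Hypothetical types for illustration; not part of Cline or any provider SDK.
interface TokenRates {
  inputPerMillion: number;  // price per 1M input tokens
  outputPerMillion: number; // price per 1M output tokens
}

interface AgentStep {
  inputTokens: number;  // context sent on this step
  outputTokens: number; // tokens generated on this step
}

// Sums the cost of every step: tokens times rate, scaled from per-million pricing.
function estimateSessionCost(steps: AgentStep[], rates: TokenRates): number {
  let total = 0;
  for (const step of steps) {
    total +=
      (step.inputTokens * rates.inputPerMillion +
        step.outputTokens * rates.outputPerMillion) /
      1_000_000;
  }
  return total;
}

// Placeholder rates and token counts, chosen only to keep the arithmetic easy to follow.
const placeholderRates: TokenRates = { inputPerMillion: 2, outputPerMillion: 10 };
const twelveSteps: AgentStep[] = Array.from({ length: 12 }, () => ({
  inputTokens: 50_000,
  outputTokens: 1_000,
}));

console.log(estimateSessionCost(twelveSteps, placeholderRates));
```

In this simplified model the total grows linearly with both context size and step count, which matches the observation that cost depends on how much context you send and how many actions the agent executes.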

This article was originally published on juanchi.dev
