It's March 2026. Every developer has an opinion on AI. Some think it's already writing better code than they do. Some think it's an overrated autocomplete. Both camps are wrong in interesting ways.
At LSD, we've stopped debating it and started mapping it. What follows is a concrete breakdown of where AI sits in our workflow right now: which phase, which tool, what we actually do with it, and where we've consciously kept it out. No hype, no dismissal. Just the practice.
Phase 1 — Scoping and Architecture
When a new client brief lands, we don't open a whiteboard immediately. We open Claude.
The first thing we do is paste the brief and ask for constraint mapping: what's technically ambiguous, what assumptions are baked in, what's underspecified. A 500-word brief almost always has three or four hidden decisions the client hasn't made yet — about data ownership, about scale, about third-party dependencies — and catching those in week one is worth more than anything we do in week four.
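To make that concrete, here's the shape of the constraint-mapping pass as a minimal sketch using the Anthropic TypeScript SDK. The model name and the prompt wording are illustrative, not our exact production setup:

```typescript
import Anthropic from "@anthropic-ai/sdk";

// Reads ANTHROPIC_API_KEY from the environment.
const anthropic = new Anthropic();

async function mapConstraints(brief: string): Promise<string> {
  const response = await anthropic.messages.create({
    model: "claude-sonnet-4-5", // placeholder; pin whichever model you actually run
    max_tokens: 2048,
    messages: [
      {
        role: "user",
        content:
          "Here is a client brief. List (1) what's technically ambiguous, " +
          "(2) what assumptions are baked in, and (3) what decisions the " +
          "client hasn't made yet, especially around data ownership, " +
          "scale, and third-party dependencies.\n\n" + brief,
      },
    ],
  });

  // Keep only the text blocks from the response.
  return response.content
    .map((block) => (block.type === "text" ? block.text : ""))
    .join("");
}
```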
We also use what we call "rubber duck architecture." Before we commit to a system design, we describe it in plain language to Claude and ask it to poke holes. Not "is this a good idea" — that's too vague. More like: "here's the data flow, here are the constraints, what breaks under load, what breaks when the third-party API goes down." Nine times out of ten we already know the answers. But the act of externalizing the design catches the tenth case.
What we don't use AI for here: the actual architectural decision. Claude can surface tradeoffs. It can't weigh them against our client's specific business risk profile, their team's operational maturity, or the political reality of their engineering org. That judgment is ours. We're accountable for it.
Tool of choice at this stage: Claude (via the API and Claude Code). We've tried others. The reasoning quality at the architecture stage matters more than speed, and that's where the gap shows.
Phase 2 — Active Development
This is where Claude Code in the terminal earns its keep most visibly.
Day-to-day it looks like this: we're in our editor, we're in the terminal, Claude Code is open. The most common use cases are boilerplate generation, TypeScript type inference for complex generics, and test scaffolding. Not because we can't write these things — we can — but because the cognitive cost of writing a well-typed recursive utility type or a full test harness for a new service is disproportionate to its creative value. We'd rather spend that attention on the parts that actually require judgment.
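To make "well-typed recursive utility type" concrete, this is the kind of thing we'd rather review than write from a blank buffer. A representative sketch, not code from a client project:

```typescript
// Recursively mark every property of T readonly, descending through nested
// objects and arrays while leaving functions untouched.
type DeepReadonly<T> = T extends (...args: any[]) => any
  ? T
  : T extends Array<infer U>
    ? ReadonlyArray<DeepReadonly<U>>
    : T extends object
      ? { readonly [K in keyof T]: DeepReadonly<T[K]> }
      : T;

interface Config {
  name: string;
  retries: { max: number; backoffMs: number[] };
}

const config: DeepReadonly<Config> = {
  name: "checkout-service",
  retries: { max: 3, backoffMs: [100, 250, 500] },
};

// config.retries.max = 5;              // compile error: readonly property
// config.retries.backoffMs.push(1000); // compile error: ReadonlyArray
```

Claude drafts something like this in seconds; we verify it against the compiler and the edge cases we actually care about.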
Another underrated use case: explaining unfamiliar APIs without leaving the terminal. If we're integrating a payments provider or a shipping API we haven't touched before, asking Claude to walk through the authentication flow and common gotchas while we're actively writing is faster and more contextually useful than tabbing to documentation. It's not replacing the docs — it's reducing the round-trip cost of reading them.
What we don't do: commit code that AI wrote without reading it. Every line gets reviewed. This isn't a rule we made reluctantly — it's one we made after watching what happens when you don't. AI-generated code is often structurally correct and subtly wrong. The bug isn't in the logic you can see; it's in the assumption the model made about your data shape or your error-handling contract. You only find it by reading.
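Here's an invented illustration of "structurally correct and subtly wrong"; the types and field names are hypothetical, but the failure pattern is the kind we catch in review:

```typescript
// Hypothetical response type. In reality, email is optional for guest
// checkouts; the spec mentions it once, and the model missed it.
interface Customer {
  id: string;
  email?: string;
}

// What the model produced: reads fine at a glance and type-checks.
function receiptRecipients(customers: Customer[]): string[] {
  // Subtle bug: assumes every customer has an email. Guest checkouts
  // become the literal string "undefined" in the mail queue.
  return customers.map((c) => `${c.email}`);
}

// What a careful read produces: the optionality is handled explicitly,
// and the error-handling contract is visible at the call site.
function receiptRecipientsReviewed(customers: Customer[]): string[] {
  return customers
    .filter((c): c is Customer & { email: string } => c.email !== undefined)
    .map((c) => c.email);
}
```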
The dynamic we've landed on is closer to a pair programmer than an autocomplete: a faster junior who never gets tired, has broad knowledge, and is occasionally confidently wrong about things a senior would catch immediately. You work with that, not around it.
Phase 3 — Code Review
We run an AI-assisted first pass before human code review on every non-trivial PR.
The prompt is deliberately constrained: review for logic errors, edge cases, and security surface. Not style, not naming conventions — we have a linter for that. We want Claude focused on the things that are hard to catch when you've been staring at the same code for four hours.
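The constraint lives in the system prompt. A sketch of the shape; the wording here is illustrative rather than our production prompt:

```typescript
// Passed as the `system` parameter of the same messages.create call shown
// earlier, with the PR diff in the user message. Scope is deliberately
// narrow: logic, edge cases, security surface. Style is the linter's job.
const REVIEW_SYSTEM_PROMPT = `
You are reviewing a pull request diff.
Flag ONLY: logic errors, unhandled edge cases, and security-relevant
surface (injection, authorization gaps, unsafe deserialization, secrets
committed to code).
Do NOT comment on style, naming, or formatting.
For each flag, give the file, the line, the concern, and why it matters.
If you find nothing in scope, say so explicitly.
`;
```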
This doesn't replace human review. It reduces the noise so that human review catches what actually matters. A reviewer who's already seen the obvious issues spotted by AI can spend their attention on architecture alignment, readability at scale, and intent — none of which AI does reliably.
The discipline we enforce: AI flags, human decides. The AI review surfaces candidates for concern. A human engineer evaluates every flag and makes the call. We've seen teams where AI review outputs get rubber-stamped and merged. That's not review — that's a liability-laundering operation.
Phase 4 — QA and Testing
Two concrete workflows here.
First: generating test cases from spec documents. When a feature spec is written clearly enough to test against — and part of our job is making sure it is — we can feed it to Claude and get a first draft of test cases covering happy paths, edge cases, and error states. The output isn't production-ready, but it's a complete starting point that would otherwise take a few hours to build from scratch.
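A first draft comes back looking roughly like this. The discount-code spec, the applyDiscount module, and the Vitest harness below are all invented for illustration:

```typescript
import { describe, it, expect } from "vitest";
import { applyDiscount } from "./discounts"; // hypothetical module under test

describe("applyDiscount", () => {
  // Happy path
  it("applies a valid percentage code to the cart total", () => {
    expect(applyDiscount(100, "SAVE10")).toBe(90);
  });

  // Edge cases
  it("treats codes as case-insensitive", () => {
    expect(applyDiscount(100, "save10")).toBe(90);
  });

  it("never discounts the total below zero", () => {
    expect(applyDiscount(5, "FLAT20")).toBe(0);
  });

  // Error states
  it("rejects expired codes", () => {
    expect(() => applyDiscount(100, "SUMMER24")).toThrow(/expired/i);
  });

  it("rejects unknown codes", () => {
    expect(() => applyDiscount(100, "NOPE")).toThrow(/unknown/i);
  });
});
```

Not production-ready, but it's the checklist in executable form, and editing a checklist is faster than writing one.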
Second: edge case generation. After we write a feature, we prompt Claude with the code and ask "what could go wrong here that the tests don't cover." This is surprisingly effective because it's a task humans are bad at — we wrote the code, so our mental model of it has the same blind spots. An AI that doesn't share our priors is better at imagining the failure modes we didn't think to imagine.
What we don't do: use AI to write the QA strategy. The strategy — what to test, how much coverage is enough, where to prioritize given time constraints — requires understanding the product and the risk profile in a way that can't be extracted from code alone. That stays with us.
Specific workflow note: we're using Playwright for E2E testing, and Claude Code handles generating the test IDs and selector logic from component code. It's one of the highest-leverage automations we've built into the workflow — it eliminates a category of tedious work entirely.
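A trimmed example of the output; the component, route, and test IDs are invented:

```typescript
import { test, expect } from "@playwright/test";

// Generated from a checkout form component whose markup carries
// data-testid attributes: "promo-input", "promo-apply", "order-total".
test("applying a promo code updates the order total", async ({ page }) => {
  await page.goto("/checkout");

  await page.getByTestId("promo-input").fill("SAVE10");
  await page.getByTestId("promo-apply").click();

  // getByTestId resolves data-testid attributes by default in Playwright.
  await expect(page.getByTestId("order-total")).toHaveText("$90.00");
});
```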
Phase 5 — Client Communication
This one is the most sensitive, so let's be direct about what we actually do.
We use AI to summarize weekly progress for non-technical stakeholders. A week of work has multiple threads — a new integration, a design iteration, a performance fix, a dependency upgrade — and collapsing that into two coherent paragraphs that a founder can read in ninety seconds takes longer than you'd think. We do a first pass with Claude, then edit it into our voice and their context.
We do not use AI to write client emails verbatim. Clients notice. Not always consciously — but the generic cadence of AI prose erodes the sense that you're in a real relationship with a real team. Trust in an agency relationship is built in the details of how you communicate, and that's not something we're willing to offload.
What we do use AI for in communications: pressure-testing explanations. If we're about to explain a technical tradeoff to a non-technical founder, we'll feed the draft explanation to Claude and ask whether a non-technical person would understand it, what questions they'd likely have, and what we've assumed they know that they probably don't. It's a fast way to catch communication gaps before they become client friction.
Where We Draw the Line
These are the places we've decided not to use AI, and why.
Architectural ownership. We use AI to stress-test our thinking. We don't let it make the call. System design decisions carry consequences that outlast the conversation — they shape maintenance burden, scaling costs, and team capability for years. Someone has to own that. It's us.
Client-facing voice. Our updates sound like us. Not because we're precious about it, but because the relationship we're building with clients depends on them trusting that there's a real team behind the work — one that communicates distinctly enough to be recognizable. AI can draft, but it doesn't ship the draft.
Final code commit. Every AI-generated line is read before it ships. No exceptions. The cost of this discipline is maybe ten percent more time in development. The cost of skipping it is much higher and much less predictable.
Debugging root causes. AI guesses at bugs well, and it's particularly good at spotting symptoms. What it does poorly is identifying why a system misbehaves at a deeper level: the architectural assumption that turned out to be false three layers down, the race condition that only appears under a specific sequence of events. That still takes a human who holds the full system model in their head.
Security decisions. AI surfaces patterns it's seen in training data. It doesn't understand your threat model, your deployment environment, or the specific trust boundaries in your system. We use it to flag potential surface area. We don't use it to validate that the surface is secure.
We're not AI skeptics. We're AI pragmatists.
The studios that will pull ahead in 2026 aren't the ones avoiding AI — and they're not the ones running it unchecked either. They're the ones who've built a conscious, repeatable practice around it. Who know which phases benefit from it, which don't, and why. Who've internalized the failure modes well enough to catch them before they ship.
That's what we're building at LSD.
If you're evaluating studios and want to know specifically how we'd integrate AI into your project — let's talk.
