Last week one of the agents that helps me build theKrew opened a pull request that, on the surface, looked clean. Tests passed. The diff was tight. The function it added did what the prompt asked.
I was about to merge when TypeScript stopped me.
The agent had reached into a module that wasn't supposed to be touched from the outside. Couldn't import what it wanted because the module didn't export it. Tried a workaround — typing the import differently — and the compiler told it to stop. The PR sat there with a red X next to it until the agent figured out the right boundary to cross instead.
That little red X, I've come to think, is one of the most underrated parts of theKrew's architecture.
I Was Going to Build It in Python
When I started theKrew last January, the obvious language choice was Python. Every AI startup writes Python. Every model. Every framework. Every example. The path of least resistance was clear.
I went with TypeScript instead.
Part of that was familiarity. I'd spent 18 years on Wall Street writing C# and .NET — strongly typed languages where the compiler had your back. When I moved into consumer software in 2024, TypeScript felt like home in a different wardrobe.
But there was a deeper reason, one I didn't have words for at the time. I have them now. I was already planning to have agents write code alongside me, and I knew Python wouldn't survive that.
A Quote I've Been Sitting With
I read something this week that put words to the thing I'd been feeling. The argument went roughly like this:
> Python has modules, but it doesn't have encapsulation. It allows code on the outside to muck around with what's going on inside a module. Encapsulation is a crucial part of making modularity work. When you're building big programs with many programmers, your team is only as strong as your weakest programmer. It's nice if the compiler can enforce things and make certain kinds of bad behavior not possible.
Let me fact-check this in public, because it's directionally right but slightly overstated, and the nuance matters.
Python does have *some* encapsulation tools. There's the convention of underscore-prefixed names (`_internal_thing`) — but it's a convention, not a rule. Anyone can still import and use those names. There's double-underscore name mangling (`__private`), which the language transforms to `_ClassName__private` — but that's still accessible if you know the mangled name. Modules can declare `__all__` to control what `from module import *` pulls in — but `import module; module._private_thing` works regardless.
So Python *has* encapsulation primitives. They're just not enforced. They rely on the discipline of every person — and now every agent — touching the codebase.
That distinction matters more in 2026 than it did in 2016.
When Your Team Includes Agents
The "weakest programmer" line is older than AI, and it usually meant something specific: a junior dev or a contractor who hadn't read the style guide. The team's quality floor is whoever is least careful, because they're the one who'll cross a boundary that should have been respected.
What changes when an agent is on the team is that the floor drops in a different way. An agent doesn't read the style guide. An agent doesn't have a memory of why a function was made underscore-private three months ago. An agent sees `_helper`, sees that `_helper` does what it needs, and uses it. The agent is doing exactly what the language permits.
Convention-based privacy assumes good-faith readers who've been onboarded. Agents are good-faith but they have not been onboarded — not the way a human gets onboarded by absorbing the codebase over weeks.
This is why the compiler matters more, not less, in a world where agents commit code. The compiler is the onboarding that scales. The compiler is the policy that doesn't get bypassed when someone is moving fast at midnight.
What This Looks Like in theKrew
theKrew is a TypeScript monorepo. Strict mode is on. Modules export what they want consumed and nothing else. The database — Postgres in Supabase — enforces its own boundaries with row-level security policies that say what each role can see and write.
When an agent generates a campaign for a client, it cannot accidentally read another client's data. Not because there's a code review. Not because there's a style guide. Not because the agent is virtuous. Because the database pushes back at the row level. Same idea as the TypeScript compiler pushing back at build time. The system makes the bad behavior impossible.
This is part of the harness I wrote about a couple of days ago. Context, constraints, and feedback loops — those were the three pillars. The compiler is one of the constraints. Row-level security is another. They're the part of the harness that doesn't sleep, doesn't get tired at 11pm, and doesn't get talked into a clever workaround.
Where Python Still Belongs
I'm not going to pretend I'm anti-Python. We use it for ML tooling, one-off scripts, data exploration, prototyping. Python is fast to write, has the AI library ecosystem, and reads almost like English on a good day.
What Python is not, in my opinion, is the right foundation for a system where you have many contributors of varying discipline writing into a long-lived codebase. The language was designed in an era when "we're all consenting adults" was a fine assumption. The team was small. The codebase was small. The contributors were people you'd onboarded.
That assumption breaks when your team is twelve people and one agent. It breaks harder when it's three people and four agents. It will keep breaking as the ratio shifts.
Static typing, enforced module boundaries, schema-validated databases — these aren't legacy enterprise paranoia. They're the load-bearing walls of a system that needs to survive contributors who haven't, can't, or won't read every comment.
What the Compiler Won't Save You From
Worth being honest here. The compiler stops a class of bugs. Not all of them.
It will not stop you from writing the wrong logic. It will not catch a bad product decision. It will not flag a campaign that's poorly targeted, or a landing page that doesn't convert, or a pricing change that breaks your unit economics. Those failures are still on you.
What the compiler — and the schema, and the type system, and the test suite — buys you is the freedom to spend your attention on the hard problems instead of the boring ones. You're not arguing about whether someone should have used the underscore-prefixed function. The compiler already settled it.
That's worth a lot when you're trying to ship.
The Takeaway
When I look at the theKrew codebase six months in, the parts that have stayed clean are the parts the compiler protects. The parts that have drifted are the parts where I trusted convention. The lesson generalizes.
If you're building anything you expect agents to maintain or extend in the next two years, choose the tools that push back. The language. The schema. The deployment pipeline. The test suite that actually fails on regressions instead of warning about them. Each of those is a piece of the harness.
The agents will run the work. They'll do it well. But the harness has to hold.
Here's the question I keep coming back to: what part of *your* codebase is currently held together by the assumption that everyone touching it has read every comment? That's the part the next agent commit is going to break.