YOLO: navigator.berlin · schmidbauer.dev

I had never tried --dangerously-skip-permissions, Claude Code’s YOLO mode, and I finally had a project for it:

A map app for Berlin: per address, noise, climate, green space, mobility, housing, social structure, and twelve elections since 2011.

Precisely because I barely code myself anymore and mostly conceive and specify, this was the perfect chance to give the machine a completely free hand and see how far it gets.

Why bother

Berlin’s data is open. The Senate publishes noise maps, climate time series, land values, election results. The catch: the usability comes straight out of UX hell. The data sits in portals for experts, in formats for experts, with projections for experts.

A gap yawns in between. Anyone looking for a flat in Berlin wants to know: how loud is the street, how green the neighbourhood, how the area voted. No data portal answers that. navigator.berlin does. One address in, everything out.

Concept first, code last

I’m not a programmer in the classic sense of the word. I’m good at thinking a system through before it exists. That’s exactly where the method starts.

Before the first line of code came the spec: a product document, a UX spec, the architecture, then epics and individual stories. Every larger decision was written down first, with a rationale and the rejected alternative. Maps without our own server. No cookies. No US provider. Tests from the second story on, and so on.

This isn’t shooting in the dark. The AI builds the code, but it builds against a spec I own. Fifteen such decision documents end up in the project. They’re not for me. They’re the context the AI reads while building and must follow without exception.

The honest limit belongs here: I don’t replace a specialist. For the last depth in UX or security, a human is still the gold standard. I get to a working result on my own, fast, from concept to deployment. But I know where my limits and Claude’s lie. Anyone who claims AI replaces the specialist gets found out in the first expert conversation.

Tests, precisely because I don’t code

Sounds contradictory: I write no code but insist on tests before the implementation. Precisely because of that.

AI writes code that looks plausible. Plausible isn’t correct. The test is the one place where the two split. So in the project, from the second story on: first the test that fails, then the code that turns it green. Not for every little thing, but for everything with logic.

The safety net paid off. It doesn’t only catch my thinking errors. It catches the machine’s.

The hardest part: the election data

Most of the sweat didn’t go into displaying the data. It went into getting it and cleaning it.

Best example: the elections. Twelve I wanted: four federal, four Berlin House of Representatives, four district. Two agencies, two worlds.

The first endpoint was dead and wouldn’t admit it. The statistics office offers an open-data section, but behind it sits a modern web app. A normal download command got no data package, just the HTML shell of the page. The fix was to remote-control a real browser that fetches the file like a human with a mouse.
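The failure mode above is easy to miss because the download "succeeds". A minimal sketch of a guard that notices it, assuming the data package is anything other than an HTML page; the function name is hypothetical and not taken from the project. The browser automation itself depends on tooling this post doesn’t name, so it isn’t sketched here.

```python
import codecs


def looks_like_html_shell(payload: bytes) -> bool:
    """Return True if a downloaded payload looks like an HTML page, not a data file."""
    # Drop a UTF-8 byte order mark if the server sent one.
    if payload.startswith(codecs.BOM_UTF8):
        payload = payload[len(codecs.BOM_UTF8):]
    # Only the first bytes matter: an HTML shell starts with a doctype or <html>.
    head = payload.lstrip()[:256].lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")
```

A download step can call this right after fetching and stop with an error instead of handing an HTML page to the parser.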

Then the files themselves. Election results since 2011, and the format changes with almost every election year. Sometimes clean UTF-8, sometimes an old Windows encoding that takes every umlaut apart. One single year broke ranks and threw half the pipeline into chaos until the detection was ready for it. Lesson: never trust the format of a government file. Check it, don’t assume it.
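"Check it, don’t assume it" can be as small as a strict decode with a fallback. A minimal sketch, assuming the "old Windows encoding" is Windows-1252 (cp1252); that is my guess for illustration, and the function name is hypothetical.

```python
def decode_election_file(raw: bytes) -> tuple[str, str]:
    """Decode raw file bytes and report which encoding was used."""
    try:
        # Strict UTF-8 first: bytes that are not valid UTF-8 raise instead of
        # silently turning umlauts into garbage. "utf-8-sig" also drops a BOM.
        return raw.decode("utf-8-sig"), "utf-8"
    except UnicodeDecodeError:
        # Assumed fallback: Windows-1252. This also raises on the few bytes
        # that cp1252 leaves undefined, so a third format still fails loudly.
        return raw.decode("cp1252"), "cp1252"
```

The order matters: a cp1252 file containing umlauts is almost never valid UTF-8, whereas a UTF-8 file usually decodes as cp1252 without complaint, just wrongly.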

And the parties. A party isn’t called the same over the years. Spellings drift, parties split, new ones appear. Which row belongs to which party, no machine can decide. That’s handwork, a curated table, row by row.
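The curated table can stay dumb on purpose: a lookup that refuses to guess. A minimal sketch; the entries and canonical keys below are illustrative placeholders, not the project’s actual table.

```python
# Hypothetical excerpt of a hand-curated table:
# label as it appears in a results file -> canonical party key.
PARTY_ALIASES = {
    "GRÜNE": "GRUENE",
    "B90/GRÜNE": "GRUENE",
    "DIE LINKE": "LINKE",
    "DIE LINKE.": "LINKE",
}


def canonical_party(raw_label: str) -> str:
    """Map a raw party label to its canonical key, failing on unknown labels."""
    # Normalise only whitespace and case; everything else is a human decision.
    key = " ".join(raw_label.split()).upper()
    try:
        return PARTY_ALIASES[key]
    except KeyError:
        raise KeyError(
            f"Unmapped party label {raw_label!r}: add it to the curated table"
        ) from None
```

Failing on an unknown label is the point: a new spelling stops the pipeline and lands on a human’s desk instead of being matched by guesswork.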

Eighty percent of the work sat in understanding the data, and there the AI helped but couldn’t manage on its own.

Where plan and reality collided

With AI you plan far too much. It’s just so easy - one more feature here, one more there, maybe a few extras over there, and hey I saw something cool the other day, want that too… You only notice while implementing.

Live data was one such case. BVG departures, weather, air quality, rain radar, all planned and thought through. While building came the sobering up: too much complexity, too many foreign dependencies, too much privacy load, and in the end it didn’t fit the concept at all. Cut. navigator now shows only static, checked data.

The languages went the same way. Eight were prepared before a single one was fully translated. Everything but German was shelved. EIGHT! WANT WANT WANT!

It goes like this constantly. The spec looks right on paper; during implementation the opposite shows. So it’s back to respecifying.

The hard part comes after. The spec has to travel with the implementation. Every rejection, every change of course belongs back in the document. Otherwise the AI builds against a stale spec next time. And that is precisely why drift is more dangerous with AI builds than with handwork: the machine takes the spec literally. An outdated document isn’t a cosmetic flaw, it’s a wrong instruction. Keeping spec and code in sync was in the end more work than the building itself. I still don’t have a clean way to do it. That, to me, is the unsolved core problem of building with AI, and I’ve yet to see anyone who has really solved it.

What’s left at the end

navigator.berlin is live. One address, dozens of data layers, a score across all twelve districts.

So what’s the verdict on Claude Code with --dangerously-skip-permissions? From idea to product it goes very fast. With the project properly specced and backed by ADRs, I sometimes didn’t even have to look at the code anymore and dared to check only the results of a story. But tradeoffs are traps the human set earlier, and Claude strides into them with great confidence and glee and then wallows there.

The AI built it. I did the thinking.