rss-bridge 2026-03-01T17:02:15+00:00

6 Practices that turned AI from prototyper to workhorse (106 PRs in 14 days)

13 points by waleedk 3 hours ago | 9 comments

1. Specs and plans are source code. Specs and plans live in git alongside source code, not in chat history. A new agent reads arch.md for the big picture, then its specific spec. You always know why something was built.

2. Three models review every phase. Claude, Gemini, and Codex catch almost entirely different bugs. No single model found more than 55% of issues. If you only review with the model that wrote the code, you're missing half the bugs. 20 bugs were caught before shipping: Claude Code found 5, and Gemini and Codex caught another 15, including a severe security issue Claude missed.

3. Enforce the process, don't suggest it. A state machine forces Spec → Plan → Implement → Review → PR. The AI can't skip steps. Tests must pass before advancing. AIs don't stick to the plan by themselves; you need rails.

4. Annotate, don't edit. Most of the work is writing specs and reviews that guide the code, not hacking at files in an open-ended chat.

5. Agents coordinate agents. An architect agent spawns builder agents into isolated git worktrees. You direct the architect; it directs the builders. They message each other async.

6. Manage the whole lifecycle. Most AI tools help you write code faster, which is maybe 30% of the job. The other 70% is planning, reviewing, integrating, writing deployment scripts, and managing staging vs. prod. Have AI run the whole pipeline from spec to PR and beyond.

Overall result: one engineer can produce what a team of 3-4 would usually do. Measured 1.2 points better code on a 10-point scale vs. Claude Code. Downsides: it takes a lot longer and uses much more tokens, but it is still reasonable at $1.60 per PR.

We open sourced it: https://github.com/cluesmith/codev
More details and raw results: https://cluesmith.com/blog/a-tour-of-codevos/
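
Editor's sketch for practice 2: one way to check how much each reviewing model contributes is to take the union of all reported issues and compute each reviewer's share of it. This is only an illustration; the function name `review_coverage` and the issue IDs are hypothetical and are not taken from codev or from the post's raw results.

```python
from typing import Dict, Set


def review_coverage(findings: Dict[str, Set[str]]) -> Dict[str, float]:
    """Return each reviewer's share of the union of all reported issues."""
    all_issues: Set[str] = set().union(*findings.values())
    if not all_issues:
        # No issues reported by anyone: every reviewer's share is zero.
        return {name: 0.0 for name in findings}
    return {name: len(found) / len(all_issues) for name, found in findings.items()}


# Hypothetical issue IDs, for illustration only (not the post's data).
findings = {
    "claude": {"bug-1", "bug-2"},
    "gemini": {"bug-2", "bug-3", "bug-4"},
    "codex": {"bug-5"},
}
coverage = review_coverage(findings)
for name, share in sorted(coverage.items()):
    print(f"{name}: {share:.0%} of all issues found")
```

If no single share is close to 100%, reviewing with only one model would leave part of the union unfound, which is the point the post makes.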

yodon 56 minutes ago

I'm a huge fan of spec-kit, and am actively looking for a replacement for it because spec-kit is no longer maintained by the team at GitHub.

Codev looks like it has a lot of good similarities to spec-kit, and like it's something I need to pay close attention to. That said, I'll encourage you to do another pass on your command names, intros, and cheat-sheet.

I suspect most developers using codev will mostly use a very small fraction of the codev commands most of the time, similar to the way spec-kit is mostly /specify, /plan, /tasks, and /implement, with a bit of /clarify and /analyze once you really get comfortable with it. If I'm right, having some docs where you emphasize the simplicity of your core flow would be very helpful.

For calibration, five minutes into reading your home page and Medium post and some of your repo docs, I'm ready to believe this is true, but I have no idea what that core flow is or looks like. Five minutes is actually a pretty long time, and I suspect most visitors will end up bouncing if they don't get clarity on what the experience is ultimately going to be like for them in five minutes (or, more likely, much less than five minutes).

waleedk 53 minutes ago

Yes, this is spec-kit on steroids. In particular specs + protocol enforcement works _really_ well. The protocol enforcement is the game changer: I would find the AI just wouldn't stick to specs or plans.

Great suggestions. I will do that. Did you notice any specific issues in those?

Got it about the core flow. Appreciate it. I plan to record a video showing how to kick off a new project and another one showing how to use it in maintenance mode. Would that be helpful?

@yodon if you would like to reach out to me at hello@cluesmith.com I'd love to get your feedback once those assets are ready.
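
Editor's sketch of the protocol enforcement idea: a small state machine that only allows the Spec → Plan → Implement → Review → PR order from the post and refuses to advance when checks have failed. The class and method names here are hypothetical, and treating "tests must pass" as a gate on every transition is an assumption; this is not codev's actual implementation.

```python
from enum import Enum


class Phase(Enum):
    SPEC = 1
    PLAN = 2
    IMPLEMENT = 3
    REVIEW = 4
    PR = 5


class ProtocolError(Exception):
    """Raised when an agent tries to skip a phase or advance with failing checks."""


class Workflow:
    def __init__(self) -> None:
        self.phase = Phase.SPEC

    def advance(self, target: Phase, checks_passed: bool) -> Phase:
        """Move to `target` only if it is the next phase and checks passed."""
        if self.phase is Phase.PR:
            raise ProtocolError("workflow is already at the final phase")
        expected = Phase(self.phase.value + 1)
        if target is not expected:
            raise ProtocolError(
                f"cannot go from {self.phase.name} to {target.name}; "
                f"next phase must be {expected.name}"
            )
        if not checks_passed:
            raise ProtocolError(f"checks failed; staying in {self.phase.name}")
        self.phase = target
        return self.phase


wf = Workflow()
wf.advance(Phase.PLAN, checks_passed=True)
try:
    wf.advance(Phase.REVIEW, checks_passed=True)  # tries to skip IMPLEMENT
except ProtocolError as err:
    print(f"rejected: {err}")
```

The design choice is that the agent never decides what comes next; it can only request the one transition the state machine allows.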

ddoottddoott 1 hour ago

Would you rather fight 100 AI workhorses or 1 workhorse AI?

waleedk 50 minutes ago

Ha! I would rather fight 100 workhorse AIs with an Architect + Builder AIs on my side :-).

Seriously, the agents managing agents thing works so well. When I'm working, I'll sometimes have 6 builder agents fixing different bugs. I lose track of state, so I rely on the architect agent, which doesn't have stupid limitations like 7 +/- 2 things in working memory.
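
Editor's sketch of the isolated-worktree idea: each builder gets its own branch and working directory through `git worktree add`, so parallel fixes do not touch each other's files. The `.builders/` directory, the `builder/` branch prefix, and the function name are hypothetical; codev's real layout may differ. The function only builds the command and does not run it.

```python
from pathlib import PurePosixPath
from typing import List


def worktree_command(task_id: str, base: str = "main", root: str = ".builders") -> List[str]:
    """Build a `git worktree add` command giving one builder its own checkout.

    Creates branch builder/<task_id> from `base`, checked out at <root>/<task_id>.
    """
    path = PurePosixPath(root) / task_id
    branch = f"builder/{task_id}"
    return ["git", "worktree", "add", "-b", branch, str(path), base]


# One command per bug being fixed in parallel (task IDs are made up).
commands = [worktree_command(task) for task in ("bug-101", "bug-102")]
for cmd in commands:
    print(" ".join(cmd))
```

To actually create the worktrees, each command could be passed to `subprocess.run(cmd, check=True)` from inside the repository.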

waleedk 2 hours ago

Happy to answer any questions. Here are those links as clickables:

GitHub: https://github.com/cluesmith/codev
Tour + raw results: https://cluesmith.com/blog/a-tour-of-codevos/

trollbridge 1 hour ago

This original post looks AI-generated.

Could you share the prompts you used to generate it?

waleedk 49 minutes ago

In a sense? This human built a system for AI to build stuff, then asked the AI to summarize what the AI that the human built had built?

It was more of a conversation, but it was like: Hey, I wrote these 6 points about what we're doing differently, please tailor them to be most useful to an HN audience.

skydhash 1 hour ago

> Codev isn’t an AI model. It’s not a coding assistant. It’s not a VS Code extension. It’s a set of CLI tools, protocols, and infrastructure that orchestrates existing AI coding tools (Claude Code, Gemini CLI, OpenAI’s Codex CLI) into a structured workflow.

Thanks for the clarification, I couldn't have guessed otherwise.

waleedk 48 minutes ago

Useful criticism -- what could I have done to help you get that message sooner?
