Effective Tools for Long Running Projects

The tools that make AI agents effective on long-running tasks are the same tools humans need to work well on long, messy projects. Housecat is building the tools and software that enforce those behaviors automatically.

Anthropic recently published “Effective Harnesses for Long Running Agents”, which explains how Claude Code works on complex tasks that span hours or days (e.g. “build a clone of the entire claude.ai web application”).

Anthropic’s post outlines some deeply technical problems and solutions for how the Claude Code harness works, but most of the solutions come from modeling how highly effective teams of people work. To quote:

Inspiration... came from knowing what effective software engineers do every day.

We believe the same lessons apply beyond software: general teamwork and productivity tools can also make teams more effective on long-running projects.

Challenge: Agents are Fallible

Imagine a software project staffed by engineers working in shifts. Each new person arrives with no experience or training on the project and takes on big tasks to work on however they see fit. Then some of them quit mid-shift without telling anyone what they did.

This is the reality for coding agents. Every agent starts with an empty context, eagerly works on whatever it’s told, then may fail to complete its task properly, leaving the next agent a mess to clean up.

This is made worse because agents tend to try to do too much at once. They try to “one shot” any instruction they are given, so as they progress the system is often left in a state of half-implemented and undocumented work. Finally, some are guaranteed to fail completely when the context window is exhausted, the program is killed, or external agents, tools, or sub-systems cause fatal errors.

Solution: Tools for Agent Teamwork

Now imagine our software project has a Project Manager (PM) who builds a detailed plan breaking the project down into small tasks. When each new person arrives, the PM gives them a detailed report about what we’re doing, what’s been done to date, and one small task to work on. The engineer takes their ticket, goes to their desk and works on it, then hands it back to the PM when they are done, complete with docs and verification tests. When tickets come back, the PM updates the plan for the next engineer. If a ticket doesn’t come back, the PM reassigns the work to a new engineer.

This is effectively how Claude Code works, with one type of agent working as the PM, directing another type of agent working as a coder.

The Anthropic article goes into technical details of how to make agents do this reliably, but it effectively boils down to the first agent saving a file with the detailed plan, and other agents atomically updating it as they go.

Existing solutions in engineering

It so happens that the best (human) engineering teams have already adopted lots of fantastic tools and practices for effective collaboration and project work: project management tools like Linear and Jira; a culture of breaking down work into small tasks (T-shirt sizing, story points, etc.); communication and collaboration tools like Slack and GitHub; and testing at every level of the stack, including Continuous Integration (CI), Continuous Delivery (CD), and service-level monitoring against SLAs.

AI tools like Claude Code build on all of this work, making them extremely powerful.

Challenge: People are Fallible

Many human teams struggle to work effectively because humans are also fallible: they make mistakes and work inefficiently. It’s still all too common for project plans to be missing or out of date, for people to work on the wrong thing, or for them to drop the ball on good practices like sharing progress, updating docs, and keeping things tidy. The tools we use should help us overcome these challenges and make us more efficient. The reality is that many teams, whether in tech companies or not, are using tools that have not changed for 10, 20, or even 30+ years: email, spreadsheets, outdated CRMs, and instant messaging apps. These tools are not doing enough to help!

The result is that less effective teams do not have:

Well codified operations that are consistently followed
An accurate source of truth for data
A centralized project management tool that is used for every task
Communication that happens in a single, durable medium
The ability to test or verify work while it is in progress and after it is complete

Adding AI tools into this environment is likely to have limited results without first addressing the structure and performance of the existing human team.

Solution: Tools for Human Teamwork

Most companies already know the general practices that make a team resemble the second scenario, the one run by a PM:

Write clear project management plans with small tasks to be done
Work on small pieces at a time
Test your work to verify it’s actually done
Update documentation as you go along
Notify your teammates and update the plan continually along the way

Just as Anthropic analyzed how effective engineering teams work and built that into Claude Code, the most effective companies will analyze effective teamwork across other domains and build tools that make it happen automatically.

Building these productivity tools for ourselves and other teams is our mission at Housecat. We are building tools that help teams with:

Clear project plans broken into manageable tasks
Processes designed in the same tool where the work happens
Clean and tidy data
Deterministic verification of work
Clear communication: human to human, computer to human, and computer to computer
Opinionated data schemas

If you’re interested in learning more about our approach, or think Housecat could help you automate common workflows in your team, please reach out!