Pillar

A framework for human and AI agent collaboration

Five autonomy levels, four guardrails, and the observability loop teams use to ship with humans and agents on the same accountability graph.


Framework

The accountability graph

Human and AI agent collaboration begins with a shared object model. Humans and agents both read from and write to one graph of goals, KPIs, initiatives, and issues. Goals are the outcomes that matter. KPIs are the numbers that prove those outcomes are moving. Initiatives are the bets meant to move the KPIs. Issues are the discrete units of work that fulfill the initiatives. Every node links upward, so any actor can compute the strategic weight of the work in front of them.
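The upward links can be sketched in a few lines. This is a minimal in-memory illustration, not Atoll's data model: the `Node` class, the `lineage` helper, and the goal and KPI names are all hypothetical. Only the four node kinds, the ATOLL-42 issue key, and the integration-pages initiative come from this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """One node in a hypothetical accountability graph."""
    kind: str                        # "goal", "kpi", "initiative", or "issue"
    key: str
    parent: Optional["Node"] = None  # every node links upward; goals have no parent


def lineage(node: Node) -> list[str]:
    """Walk upward from any node to the goal it ultimately serves."""
    chain = []
    current: Optional[Node] = node
    while current is not None:
        chain.append(f"{current.kind}:{current.key}")
        current = current.parent
    return chain


# Hypothetical example data.
goal = Node("goal", "grow-self-serve")
kpi = Node("kpi", "weekly-signups", parent=goal)
initiative = Node("initiative", "integration-pages", parent=kpi)
issue = Node("issue", "ATOLL-42", parent=initiative)

print(" -> ".join(lineage(issue)))
```

Any actor holding an issue can call `lineage` and see which initiative, KPI, and goal the work rolls up to. A real system would likely attach weights or priorities to those nodes; the sketch shows only the traversal.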

Agents in this graph are members, not chat surfaces. A member has an identity row, an owner, an API key, and an activity stream. A member can be assigned issues, comment on them, ship code that closes them, and author KPI snapshots. Every action carries the member identity into the audit log. Any reviewer can trace a metric change back to the initiative, the issue, the commit, and the agent or human that produced it.
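A rough sketch of what "every action carries the member identity" could look like. The class names, field names, and action strings below are assumptions made for illustration, not Atoll's schema. The API key is left out on purpose so the example stores no secrets.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Member:
    """Hypothetical identity row shared by humans and agents."""
    member_id: str
    kind: str    # "human" or "agent"
    owner: str   # the person accountable for this member


@dataclass(frozen=True)
class AuditEntry:
    member_id: str
    action: str
    target: str
    at: datetime


@dataclass
class AuditLog:
    """Append-only log that humans and agents both write to."""
    entries: list[AuditEntry] = field(default_factory=list)

    def record(self, member: Member, action: str, target: str) -> AuditEntry:
        entry = AuditEntry(member.member_id, action, target,
                           datetime.now(timezone.utc))
        self.entries.append(entry)
        return entry

    def history(self, target: str) -> list[AuditEntry]:
        """Everything that happened to one issue, initiative, or KPI, in order."""
        return [e for e in self.entries if e.target == target]


log = AuditLog()
docs_agent = Member("agent-docs-1", "agent", owner="dana")
reviewer = Member("dana", "human", owner="dana")

log.record(docs_agent, "opened_pull_request", "ATOLL-42")
log.record(reviewer, "merged_pull_request", "ATOLL-42")

for entry in log.history("ATOLL-42"):
    print(entry.at.isoformat(), entry.member_id, entry.action)
```

The point of the sketch is that the agent and the human go through the same `record` call. There is no separate path for agent activity.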

The graph is the contract. The framework that follows (five autonomy levels, four guardrails, one observability loop) only works because the contract is in place.

Autonomy levels

Five levels of agent autonomy

A trust ladder, from observe to self-direct. Teams should walk it one rung at a time.

L0
Read-only
Definition. The agent observes the system and reports what it sees; humans drive every action.
Example. An agent reads the Heartbeat endpoint each morning and posts a status summary to a channel. It cannot assign, comment, or ship.
L1
Suggest
Definition. The agent proposes work and waits for a human to approve before any action is taken.
Example. An agent drafts a new issue from a customer comment and requests assignment. A human reviews the draft and either accepts it or rewrites it.
L2
Execute with approval
Definition. The agent picks up assigned work and ships output, but the output is held behind a human approval gate before it lands.
Example. A coding agent closes ATOLL-42 by opening a pull request. The PR sits in review until a human merges it. The agent cannot push to main.
L3
Execute autonomously
Definition. The agent picks, executes, and closes work inside a defined scope; humans review aggregate motion through KPIs rather than per action.
Example. An agent ships documentation updates daily inside an 'edit /docs only' boundary. Humans read the docs KPI weekly and intervene if quality slips.
L4
Self-direct
Definition. The agent owns an initiative end-to-end, including planning, sequencing, and reporting; humans review the KPI and the initiative outcome.
Example. An agent owns the integration-pages initiative. It plans the work, opens its own issues, ships pages, and updates the initiative's KPI when each one lands.

The levels are cumulative. An agent at L3 still operates under the L0 observability contract and the L1 suggestion etiquette. Autonomy relocates review from per action to per outcome. It does not remove it.
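One way to encode the ladder is an ordered enum, so "cumulative" becomes a plain comparison. The names below are assumptions chosen for this sketch; the review points paraphrase the level definitions above.

```python
from enum import IntEnum


class Autonomy(IntEnum):
    """Hypothetical encoding of the five levels; higher values include lower ones."""
    READ_ONLY = 0
    SUGGEST = 1
    EXECUTE_WITH_APPROVAL = 2
    EXECUTE_AUTONOMOUSLY = 3
    SELF_DIRECT = 4


# Where the human review happens at each level.
REVIEW_POINT = {
    Autonomy.READ_ONLY: "humans drive every action themselves",
    Autonomy.SUGGEST: "a human approves each proposal",
    Autonomy.EXECUTE_WITH_APPROVAL: "a human approves each output before it lands",
    Autonomy.EXECUTE_AUTONOMOUSLY: "humans review aggregate motion through the KPI",
    Autonomy.SELF_DIRECT: "humans review the KPI and the initiative outcome",
}


def reviewed_per_action(level: Autonomy) -> bool:
    """True while a human still gates every individual action (L1 and L2)."""
    return Autonomy.SUGGEST <= level <= Autonomy.EXECUTE_WITH_APPROVAL


for level in Autonomy:
    mode = "per action" if reviewed_per_action(level) else "per outcome"
    print(f"L{level.value} {level.name}: {REVIEW_POINT[level]} ({mode})")
```

Every level has an entry in `REVIEW_POINT`. Review moves; it never disappears.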

Guardrails

Four guardrails for every autonomy level

The guardrails do not change between L0 and L4. The scope of what the agent can do inside them does.

Scope contracts

Every agent action is bounded by what it has been assigned: specific issues, specific files, specific services. The scope is a contract, not a vibe. If the work falls outside the contract, the agent stops and asks.
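A scope contract can be small enough to check before every write. The sketch below assumes a contract made of path prefixes and assigned issue keys; the `ScopeContract` class and its method names are hypothetical.

```python
import posixpath
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopeContract:
    """Hypothetical contract: which paths and issues an agent may touch."""
    allowed_prefixes: tuple[str, ...]
    assigned_issues: frozenset[str]

    def allows_path(self, path: str) -> bool:
        # Normalize first so "/docs/../src" cannot sneak past the prefix check.
        normalized = posixpath.normpath(path)
        return any(
            normalized == prefix.rstrip("/")
            or normalized.startswith(prefix.rstrip("/") + "/")
            for prefix in self.allowed_prefixes
        )

    def allows_issue(self, issue_key: str) -> bool:
        return issue_key in self.assigned_issues


docs_only = ScopeContract(
    allowed_prefixes=("/docs",),
    assigned_issues=frozenset({"ATOLL-42"}),
)

for path in ("/docs/guide.md", "/docs/../src/main.py", "/docs-internal/notes.md"):
    verdict = "in scope" if docs_only.allows_path(path) else "stop and ask"
    print(path, "->", verdict)
```

When a check returns false, the agent does not improvise. It stops and asks, as the contract requires.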

Reversibility

Agent work lands as a pull request, a draft, or an isolated branch. It is never an irreversible production action without explicit human approval. The default state of agent output is undoable.
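As a sketch, the rule reduces to one check: reversible forms land freely, anything else needs a named human approver. The form names and the `may_land` function are assumptions for illustration.

```python
from typing import Optional

# The three undoable forms named above.
REVERSIBLE_FORMS = frozenset({"pull_request", "draft", "isolated_branch"})


def may_land(form: str, approved_by: Optional[str] = None) -> bool:
    """Reversible output may land; anything else needs explicit human approval."""
    if form in REVERSIBLE_FORMS:
        return True
    return bool(approved_by)


print(may_land("pull_request"))                          # reversible form
print(may_land("production_deploy"))                     # irreversible, no approver
print(may_land("production_deploy", approved_by="dana")) # irreversible, approved
```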

Observability

Every agent action carries a member identity, a timestamp, and an activity feed entry the team can read. Nothing the agent does is invisible. Trust is built from logs.

Quitting authority

The agent must be able to comment 'blocked' or 'out of scope' and stop without retrying forever. Quitting surfaces the blocker and prevents runaway loops.
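Quitting authority is mostly a bounded retry plus a visible comment. The sketch below assumes a `task` callable and a `comment` callable supplied by the caller; both names, the `OutOfScope` exception, and the default of three attempts are hypothetical.

```python
from typing import Callable, Optional


class OutOfScope(Exception):
    """Raised by a task when the work falls outside the agent's contract."""


def run_with_quitting_authority(
    task: Callable[[], None],
    comment: Callable[[str], None],
    max_attempts: int = 3,
) -> str:
    """Try a task a bounded number of times, then surface the blocker and stop."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            task()
            return "done"
        except OutOfScope as exc:
            # Retrying cannot fix a scope problem, so quit immediately.
            comment(f"out of scope: {exc}")
            return "out_of_scope"
        except Exception as exc:
            last_error = exc
    comment(f"blocked after {max_attempts} attempts: {last_error}")
    return "blocked"


comments: list[str] = []
attempts = 0


def flaky_task() -> None:
    global attempts
    attempts += 1
    raise RuntimeError("upstream 503")


result = run_with_quitting_authority(flaky_task, comments.append)
print(result, attempts, comments)
```

The agent stops after the third failure and leaves one comment a human can act on, instead of looping.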

Install the four before adding the fifth agent. They are the difference between a team that can absorb agents and a team that accumulates them.

Loop

How the loop closes

Collaboration is not a moment. It is a loop. The loop closes when the KPI moves and the team can see why.

The loop starts when an agent acts. It picks up an assigned issue, opens a pull request, or files a KPI snapshot. Atoll writes every action to the activity feed under the agent's member identity, so the rest of the team can see what happened, when, and at what autonomy level. The action is a record in the same audit log humans write to, not a side effect of a chat thread.

From the action, Atoll reads the linked initiative and the linked KPI. A KPI snapshot lands on the scheduled cadence (daily, weekly, however the team configured it). The snapshot reports whether the metric moved, how much, and which initiative the movement is attributed to. Heartbeat consumes the snapshots and emits pace signals: on track, off pace, ahead. The signals are structured payloads. Humans and agents read them through the same endpoint.
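To make the three signals concrete, here is one plausible way to derive them. It assumes a linear pace toward the target and a 5% tolerance band. Both are assumptions made for this sketch; this page does not specify how Heartbeat computes pace.

```python
def pace_signal(
    start: float,
    target: float,
    current: float,
    elapsed_fraction: float,
    tolerance: float = 0.05,
) -> str:
    """Compare actual progress with linear expected progress (an assumed model)."""
    if not 0.0 <= elapsed_fraction <= 1.0:
        raise ValueError("elapsed_fraction must be between 0 and 1")
    total = target - start
    if total == 0:
        raise ValueError("target must differ from start")

    # Works for metrics that should go up or down: both signs cancel in the ratio.
    actual_progress = (current - start) / total
    gap = actual_progress - elapsed_fraction

    if gap > tolerance:
        return "ahead"
    if gap < -tolerance:
        return "off pace"
    return "on track"


# Halfway through the period, on a KPI that should move from 100 to 200.
print(pace_signal(100, 200, 150, 0.5))  # on track
print(pace_signal(100, 200, 120, 0.5))  # off pace
print(pace_signal(100, 200, 180, 0.5))  # ahead
```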

When a signal lands, a member picks it up. Sometimes the member is another agent at L3 owning a related KPI. Sometimes the member is a human reviewing the dashboard on a Monday. Either way, Atoll records the pickup: a new assignment, a new comment, a new initiative decision. The loop turns again. Nothing in the loop assumes the actor is human or agent. The accountability graph carries the context. The members do the work. The signals expose pace. The loop closes when the goal is met or the bet is killed.

Act

Atoll logs each agent action with member identity, timestamp, and the linked initiative or KPI.

Observe

KPI snapshots and Heartbeat signals translate raw activity into structured pace data the team can read.

Pick up

Another member (human or agent) reads the signal and takes the next action. The loop turns.

Onboarding

A four-week rollout

One agent at a time, one rung at a time. Each week is a graduation, not a leap.

Week 1
L1 Suggest
Pick one agent, one scope.
Give the agent a member identity, an API key, and a narrow scope: one repo, one doc set, or one type of issue. Set it to L1. Let it draft and propose. Humans approve everything.
Week 2
L2 Execute with approval
Graduate to pull requests.
Move the agent to L2 inside the same scope. Its work lands as PRs or drafts that humans review and merge. Read the activity feed every day.
Week 3
L2 Two agents
Add a second agent.
Bring on a second agent at L2 with a non-overlapping scope. Watch the activity feed for collisions: duplicate work, contested assignments, conflicting comments. Refine the scope contracts.
Week 4
L3 Initiative pilot
Run one initiative at L3.
Pick one initiative with a clear KPI. Let one agent operate at L3 inside it. Review the KPI weekly, not the commits daily. Intervene through goals, not pings.

Most teams that fail with agents fail at week one. They skip the suggest phase and put a brand-new agent at L2 on production code. Spend the week building trust. The ladder is short.

Anti-patterns

Where teams get this wrong

Three failure modes show up over and over. Catch them early.

The pseudonymous agent

The agent runs without a member identity, so its actions land as 'system' or under a human's account. No audit trail, no one to review, no way to grade its work. The fix is the cheapest one on this page, and the observability guardrail depends on it: give the agent a row in the members table and an API key tied to it.

The autonomy cliff

A team jumps from L0 to L4 because the demo was impressive. There is no L1 phase where humans evaluate the agent's suggestions, no L2 phase where they review its pull requests, and no trust ladder to fall back on when something breaks. Fix: graduate one level at a time and let each level run long enough to expose its failure modes.
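The fix can be enforced rather than remembered. A minimal sketch of a promotion guard, assuming levels are stored as integers 0 to 4 and that "long enough" means a minimum number of days at the current level. The seven-day default mirrors the weekly rollout above and is an assumption, not a rule.

```python
def can_promote(
    current: int,
    requested: int,
    days_at_current: int,
    min_days: int = 7,
) -> tuple[bool, str]:
    """Allow a promotion only one rung at a time and only after enough time."""
    if not (0 <= current <= 4 and 0 <= requested <= 4):
        raise ValueError("levels must be between 0 and 4")
    if requested <= current:
        return True, "holding or lowering autonomy is always allowed"
    if requested - current > 1:
        return False, f"cannot skip from L{current} to L{requested}"
    if days_at_current < min_days:
        return False, f"L{current} has run {days_at_current} of {min_days} days"
    return True, f"promote to L{requested}"


print(can_promote(0, 4, days_at_current=30))
print(can_promote(1, 2, days_at_current=3))
print(can_promote(1, 2, days_at_current=7))
```

Demotion is never blocked. When something breaks, the ladder is what the team falls back on.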

The orchestration trap

A team builds a meta-agent that 'manages' other agents: an agent-only layer above the shared system. The layer moves assignment and status onto a private bus the team cannot see. What the team needs instead is a set of shared accountability primitives that humans and agents both read from. Agents pick work from the same graph humans use.

Identity first

No member row, no agent. Give every agent the same identity primitives a human teammate gets.

Ladder, not leap

Walk L0 to L4. Each level exposes a different failure mode you cannot see from the level above.

Shared graph

Humans and agents read from the same accountability graph. No private agent-only bus.

In practice

What it looks like in steady state

One human, three agents, one initiative. The framework is what makes the team legible.

The human

Owns the goal, approves initiatives, reviews KPI motion, and steps into PRs when scope changes. Most days they read more than they type.

The agents

Three members with bounded scopes: one at L3 on docs, two at L2 on product code. Each holds a member row, an API key, and an activity stream.

The graph

One accountability graph carries the context. Anyone (human or agent) asks 'what matters now' and gets a structured answer.

FAQ

Frequently asked questions

What is human and AI agent collaboration?

Human and AI agent collaboration runs humans and AI agents on one accountability graph (goals, KPIs, initiatives, issues) where both can be assigned work, post updates, ship changes, and get reviewed the same way. Collaboration is a shared operating model, not a chat window. Humans set direction. Agents execute within bounded scopes. Every action is observable to the rest of the team. The framework on this page describes five autonomy levels, four guardrails, and the observability loop that connects them.

What are the levels of AI agent autonomy?

Five levels. L0 read-only: the agent observes and reports. L1 suggest: the agent proposes work that a human approves before execution. L2 execute with approval: the agent picks up assigned work and ships output a human reviews and merges. L3 execute autonomously: the agent picks, executes, and closes work within a defined scope, and humans review aggregate motion through KPIs. L4 self-direct: the agent owns an initiative end-to-end, including planning and KPI updates. Teams start at L1 and graduate upward as trust accrues.

What does human-in-the-loop mean for AI agents?

Human-in-the-loop means a human gate exists somewhere in the agent's action chain, usually before an action becomes irreversible. At L1 the gate is the human's approval of the proposed work. At L2 it is the pull request. At L3 it is the KPI review. At L4 it is the review of the initiative and its KPI. Human-in-the-loop is not the same as manual review of every step. It means there is a defined point where a human can accept, redirect, or stop the work.

How do I onboard an AI agent to my team?

Start with one agent, one scope, and L1 autonomy. Give the agent a member identity, an API key, and a bounded set of issues to draft or suggest. In week two, graduate it to L2 on a single repo or doc set where humans review its pull requests. In week three, add a second agent and look for where the two collide. In week four, pick one initiative and let one agent run at L3 with KPI-level review. The cadence matters. Each step is a trust-building exercise. Skipping levels is the most common way teams get burned.

What guardrails should I put on AI agents?

Four. Scope contracts bound every action by what the agent has been assigned. Reversibility forces agent work into pull requests, drafts, or isolated branches so a human can undo it. Observability ensures every action carries a member identity, a timestamp, and an entry in the activity feed. Quitting authority lets the agent comment 'blocked' or 'out of scope' without retrying forever. These four apply at every autonomy level. The difference between L1 and L4 is how much the agent does inside the guardrails, not whether the guardrails exist.

How do humans review AI agent work?

The same way they review human work, plus a few signals unique to agents. Code lands as pull requests with the agent listed as the author. Docs land as drafts pending approval. KPI updates carry attribution to the initiative and the agent that authored them. Reviewers trace any change back through the activity feed to the original assignment. At higher autonomy levels, review shifts from per action to per KPI. Humans no longer approve every commit. They read the metric and intervene when it moves the wrong way.

What is the difference between AI assistants and AI agents?

An assistant responds to a human prompt and returns a result for the human to act on. An agent has its own work queue, picks the next task without being prompted, executes it, and reports back. Assistants belong inside a human's workflow. Agents belong on the team. The collaboration model is different. Assistants need prompts. Agents need scope, guardrails, and KPIs. Most products marketed as 'AI for teams' are still assistants. A real agent has a member identity in your system, not a chat icon in the sidebar.

Can humans and AI agents really share accountability?

Yes, if the system models them the same way. If agents have member identities, assigned work, audit trails, and KPI authorship, a reviewer can hold them to the same standard as a human teammate. If agents only exist as a chat surface, accountability is a fiction. There is no row to point at, no metric they own, no commit attributed to them. Shared accountability is a property of the data model, not a brand promise. The accountability graph this framework describes is what makes it possible.

Run your team on one graph.

Humans and agents, five autonomy levels, four guardrails, one observability loop. Start with one agent at L1 and walk up the ladder.
