Infrastructure

Platform · 1 member · 37 finalized PRs in the past 30 days

Output Quality

How many commits were pushed after the PR was opened — fewer means the agent got it right the first time.
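
As a rough illustration, the count could be derived from commit timestamps. This is a minimal sketch, not the dashboard's actual computation: the function name `commits_after_open` and the strict "later than the open time" rule are assumptions.

```python
from datetime import datetime

def commits_after_open(pr_opened_at: datetime, commit_times: list[datetime]) -> int:
    """Count commits whose timestamp is strictly later than the PR open time.

    Assumption: a commit pushed at exactly the open time belongs to the
    initial PR and is not counted as a follow-up.
    """
    return sum(1 for t in commit_times if t > pr_opened_at)

opened = datetime(2024, 1, 1, 12, 0)
commits = [
    datetime(2024, 1, 1, 11, 0),   # part of the initial push
    datetime(2024, 1, 1, 12, 30),  # follow-up fix
    datetime(2024, 1, 1, 13, 0),   # follow-up fix
]
follow_ups = commits_after_open(opened, commits)
```

Here `follow_ups` is 2: only the two commits made after the PR was opened are counted.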

How often agent-generated code passes CI on the first try — low rates mean the agent is producing code that doesn't build or pass tests.
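
One plausible way to compute this rate is the share of PRs whose first CI run succeeded. The sketch below assumes one boolean per PR; the function name and the choice to return 0.0 for an empty window are illustrative, not taken from the dashboard.

```python
def first_try_ci_pass_rate(first_run_passed: list[bool]) -> float:
    """Fraction of PRs whose first CI run passed.

    Assumption: each entry is the outcome of the first CI run on one PR.
    Returns 0.0 when there are no PRs in the window.
    """
    if not first_run_passed:
        return 0.0
    return sum(first_run_passed) / len(first_run_passed)

rate = first_try_ci_pass_rate([True, False, True, True])
```

With three of four first runs passing, `rate` is 0.75.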

How often the same files get changed across multiple recent PRs — high churn suggests the agent isn't making durable changes.
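
Churn can be defined in several ways. The sketch below uses one simple assumption: the fraction of distinct files that were touched by more than one PR in the window. The name `file_churn` and the input shape (one set of paths per PR) are hypothetical.

```python
from collections import Counter

def file_churn(prs: list[set[str]]) -> float:
    """Fraction of distinct files changed by more than one PR.

    Assumption: each element of `prs` is the set of file paths one PR changed.
    """
    counts = Counter(path for files in prs for path in files)
    if not counts:
        return 0.0
    repeated = sum(1 for n in counts.values() if n > 1)
    return repeated / len(counts)

churn = file_churn([{"a.py", "b.py"}, {"a.py"}, {"c.py"}])
```

In this example three distinct files were touched and only `a.py` appears in more than one PR, so `churn` is one third.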

Prompt Efficiency

How many back-and-forth turns it takes to finish a task — fewer means your prompts are clear and the agent stays on track.

How much you spent on AI tokens to produce this PR — tracks whether the agent is cost-efficient or burning through tokens.
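
Token spend is typically token counts multiplied by per-token prices. The sketch below assumes separate input and output prices quoted per million tokens; the prices shown are placeholders for illustration, not real rates for any provider.

```python
def token_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Cost in USD, assuming prices are quoted per one million tokens."""
    total = input_tokens * input_price_per_mtok + output_tokens * output_price_per_mtok
    return total / 1_000_000

# Placeholder prices, chosen only to make the arithmetic easy to follow.
cost = token_cost_usd(
    input_tokens=2_000_000,
    output_tokens=500_000,
    input_price_per_mtok=3.0,
    output_price_per_mtok=10.0,
)
```

With these placeholder numbers the input side costs 6 USD and the output side 5 USD, so `cost` is 11.0.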

How much of the agent's input was served from cache — higher means you're spending less on repeated context.
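
A cache share like this is usually a simple ratio of cached input tokens to all input tokens. The sketch below assumes both counts are available per PR; the function and parameter names are illustrative.

```python
def cache_hit_ratio(cached_input_tokens: int, total_input_tokens: int) -> float:
    """Share of input tokens served from cache.

    Assumption: `total_input_tokens` includes the cached tokens.
    Returns 0.0 when there was no input at all.
    """
    if total_input_tokens == 0:
        return 0.0
    return cached_input_tokens / total_input_tokens

ratio = cache_hit_ratio(cached_input_tokens=750, total_input_tokens=1000)
```

Here `ratio` is 0.75, meaning three quarters of the input was served from cache.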

Agent Behavior

How often the agent backtracks down dead-end reasoning paths — lower means less wasted work and faster completions.

How often the agent re-reads files it already opened — excessive re-reads waste tokens and slow things down.
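
One way to estimate re-reads, assuming the agent's file reads are logged in order, is to count every read of a path that was already read earlier in the session. The function name and log shape below are hypothetical.

```python
def reread_rate(read_paths: list[str]) -> float:
    """Fraction of file reads that target a file already read this session.

    Assumption: `read_paths` is the ordered list of paths the agent opened.
    """
    if not read_paths:
        return 0.0
    seen: set[str] = set()
    rereads = 0
    for path in read_paths:
        if path in seen:
            rereads += 1
        else:
            seen.add(path)
    return rereads / len(read_paths)

rate_rereads = reread_rate(["a.py", "b.py", "a.py", "a.py"])
```

Two of the four reads repeat `a.py`, so `rate_rereads` is 0.5.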

How much work the agent does between human interventions — higher means it operates independently with less hand-holding.
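
Autonomy could be approximated as the average number of consecutive agent actions between human messages. The sketch below assumes a session log reduced to an ordered list of "human" and "agent" labels; both that representation and the function name are assumptions.

```python
from itertools import groupby

def mean_autonomous_run(events: list[str]) -> float:
    """Average length of uninterrupted runs of agent actions.

    Assumption: `events` is an ordered list of "human" or "agent" labels,
    one per message or tool action in the session.
    """
    runs = [
        sum(1 for _ in group)
        for actor, group in groupby(events)
        if actor == "agent"
    ]
    if not runs:
        return 0.0
    return sum(runs) / len(runs)

autonomy = mean_autonomous_run(
    ["human", "agent", "agent", "agent", "human", "agent"]
)
```

The agent runs here have lengths 3 and 1, so `autonomy` is 2.0.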