Agent-Native Architectures: Building Apps After Code Ends

Research Date: 2026-01-20
Publication Date: 2026-01-09
Source URL: https://every.to/guides/agent-native
Authors: Dan Shipper (Every) and Claude (Anthropic)

Summary

This technical guide presents a framework for building “agent-native” software—applications where AI agents operate as first-class citizens rather than afterthought integrations. Co-authored by Dan Shipper and Claude, the document synthesizes principles from production applications (Reader, Anecdote) built at Every, combined with architectural patterns that emerged through collaborative development.

The core thesis posits that Claude Code demonstrated a fundamental insight: a capable coding agent is effectively a general-purpose agent. The same architecture enabling codebase refactoring can organize files, manage reading lists, or automate workflows. The Claude Code SDK makes this pattern accessible, allowing developers to build applications where features are outcomes described in prompts rather than logic written in code.

The guide distinguishes between tested patterns (marked as proven) and speculative contributions from Claude (marked as “needs validation”), providing transparency about the maturity level of different recommendations.

The Five Pillars of Agent-Native Design

Parity

The foundational principle: agents must achieve any outcome users can accomplish through the UI. Without parity, agents encounter dead ends when users request legitimate actions.

Test: Pick any UI action. Can the agent accomplish it?

Implementation discipline: When adding any UI capability, verify the agent can achieve the same outcome through available tools or tool combinations.

Granularity

Tools should be atomic primitives. Features are outcomes achieved by agents operating in loops with judgment—not choreographed sequences executed by code.

Test: To change behavior, do you edit prompts or refactor code?

| Approach | Characteristics |
| --- | --- |
| Less granular | classify_and_organize_files(files) bundles judgment into the tool; limits flexibility |
| More granular | read_file, write_file, move_file, bash primitives; agent decides; prompt describes outcome |
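The contrast can be sketched in Python. This is an illustrative toy, not code from the guide: the in-memory FILES store and all function names are invented for demonstration.

```python
from pathlib import PurePosixPath

# Toy in-memory "filesystem" for illustration only.
FILES = {"/inbox/report.txt": "Q3 numbers", "/inbox/notes.md": "meeting notes"}

# Less granular: the classification policy is frozen inside the tool.
# Changing behavior means refactoring this function.
def classify_and_organize_files(files):
    return {path: "/docs" if path.endswith(".md") else "/data" for path in files}

# More granular: atomic primitives. The agent supplies the judgment,
# and the prompt describes the outcome.
def list_files(prefix="/"):
    return [p for p in FILES if p.startswith(prefix)]

def read_file(path):
    return FILES[path]

def move_file(src, dst_dir):
    FILES[str(PurePosixPath(dst_dir) / PurePosixPath(src).name)] = FILES.pop(src)
```

With the granular set, "organize my inbox by topic" is a prompt change; with the bundled tool, it is a code change.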

Composability

With atomic tools and parity established, new features emerge from new prompts without code changes. This applies to both developers shipping features and users customizing behavior.

Example prompt-based feature:

“Review files modified this week. Summarize key changes. Based on incomplete items and approaching deadlines, suggest three priorities for next week.”

The agent composes list_files, read_file, and judgment to achieve the outcome.
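A minimal sketch of that idea: shipping a feature means registering a prompt against tools that already exist. The make_feature helper is hypothetical, not part of any SDK.

```python
# Hypothetical sketch: a "feature" is a prompt paired with existing tools.
def make_feature(name, prompt, tools):
    """Ship a feature by pairing an outcome prompt with already-available tools."""
    return {"name": name, "prompt": prompt.strip(), "tools": sorted(tools)}

weekly_review = make_feature(
    "weekly-review",
    """Review files modified this week. Summarize key changes. Based on
    incomplete items and approaching deadlines, suggest three priorities
    for next week.""",
    ["read_file", "list_files"],
)
```

Note that no new tool appears in the feature definition; only the prompt is new.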

Emergent Capability

Agents accomplish tasks not explicitly designed for. This creates a flywheel:

  1. Build with atomic tools and parity
  2. Users request unanticipated capabilities
  3. Agent composes tools to accomplish them (or fails, revealing gaps)
  4. Observe patterns in requests
  5. Add domain tools or prompts for common patterns
  6. Repeat

Test: Can the agent handle open-ended requests within your domain?

This reveals latent demand—instead of guessing features, developers observe what users actually request and formalize patterns that emerge.

Improvement Over Time

Agent-native applications improve without shipping code through:

  • Accumulated context: State persists across sessions via context files
  • Developer-level refinement: Ship updated prompts for all users
  • User-level customization: Users modify prompts for their workflows
  • Self-modification (advanced): Agents edit own prompts or code with safety rails

Files as the Universal Interface

Agents demonstrate native fluency with filesystem operations. Claude Code succeeds because bash + filesystem represents the most battle-tested agent interface.

Design principle: If a human can look at the file structure and understand what’s happening, an agent probably can too.

The context.md Pattern

A file providing portable working memory without code changes:

# Context

## Who I Am
Reading assistant for the Every app.

## What I Know About This User
- Interested in military history and Russian literature
- Prefers concise analysis
- Currently reading *War and Peace*

## What Exists
- 12 notes in /notes
- Three active projects
- User preferences at /preferences.md

## Recent Activity
- User created "Project kickoff" (two hours ago)
- Analyzed passage about Austerlitz (yesterday)

## My Guidelines
- Don't spoil books they're reading
- Use their interests to personalize insights

## Current State
- No pending tasks
- Last sync: 10 minutes ago

The agent reads this file at session start and updates it as state changes.
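The update half of this pattern can be sketched in Python. load_context and log_activity are hypothetical helpers; the section name follows the example file above.

```python
from pathlib import Path

def load_context(path):
    """Read context.md at session start; fall back to an empty skeleton."""
    p = Path(path)
    return p.read_text() if p.exists() else "# Context\n"

def log_activity(context, line):
    """Prepend a bullet under '## Recent Activity', creating it if absent,
    so the newest activity is read first."""
    if "## Recent Activity" not in context:
        context += "\n## Recent Activity\n"
    head, _, tail = context.partition("## Recent Activity\n")
    return head + "## Recent Activity\n" + f"- {line}\n" + tail
```

Because the state lives in a plain file, both the user and the agent can inspect and correct it.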

Files vs. Database

| Use Files For | Use Database For |
| --- | --- |
| Content users should read/edit | High-volume structured data |
| Configuration benefiting from version control | Data requiring complex queries |
| Agent-generated content | Ephemeral state (sessions, caches) |
| Anything benefiting from transparency | Data with relationships |
| Large text content | Data requiring indexing |

Principle: Files for legibility, databases for structure. When uncertain, prefer files—they provide transparency and user inspection.

From Primitives to Domain Tools

Start with pure primitives (bash, file operations, basic storage) to prove architecture and reveal actual agent needs. Add domain-specific tools deliberately as patterns emerge.

Reasons to add domain tools:

  • Vocabulary: A create_note tool teaches the agent what “note” means in your system
  • Guardrails: Some operations need validation beyond agent judgment
  • Efficiency: Common operations can be bundled for speed and cost

Rule for domain tools: They represent one conceptual action from the user’s perspective. Include mechanical validation, but judgment about what/whether to act belongs in prompts.

Critical: Keep primitives available. Domain tools are shortcuts, not gates. Unless specific security or data integrity concerns exist, agents should access underlying primitives for edge cases.
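A sketch of a domain tool built on top of a primitive. The create_note tool, its slug rules, and the notes directory are invented for illustration; the point is that validation here is mechanical, while judgment about whether to create the note stays in the prompt.

```python
import re
import tempfile
from pathlib import Path

NOTES_DIR = Path(tempfile.mkdtemp()) / "notes"  # illustrative location

def write_file(path, text):
    """Primitive: stays available to the agent alongside the domain tool."""
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    Path(path).write_text(text)

def create_note(title, body):
    """Domain shortcut: one conceptual action with mechanical validation only."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    if not slug:
        raise ValueError("title must contain at least one alphanumeric character")
    path = NOTES_DIR / f"{slug}.md"
    write_file(path, f"# {title}\n\n{body}\n")  # built on the primitive
    return path
```

Because create_note is implemented on write_file rather than replacing it, the agent can still reach the primitive for edge cases the shortcut does not cover.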

Agent Execution Patterns

Completion Signals

Agents need explicit completion mechanisms—not heuristic detection:

.success("Result")   // continue loop
.error("Message")    // continue (retry possible)
.complete("Done")    // stop loop

Completion is separate from success/failure. A tool can succeed and stop, or fail and signal continue for recovery.
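The separation can be modeled as data rather than heuristics. This Python sketch (all names invented) keeps the continue/stop signal orthogonal to success/failure:

```python
from dataclasses import dataclass
from enum import Enum

class Signal(Enum):
    CONTINUE = "continue"
    STOP = "stop"

@dataclass
class ToolResult:
    signal: Signal
    ok: bool
    message: str

def success(msg):  return ToolResult(Signal.CONTINUE, True, msg)
def error(msg):    return ToolResult(Signal.CONTINUE, False, msg)
def complete(msg): return ToolResult(Signal.STOP, True, msg)

def run_loop(steps):
    """Run tool steps until one signals completion -- not until one merely
    succeeds, and not based on iteration counts or output sniffing."""
    for step in steps:
        result = step()
        if result.signal is Signal.STOP:
            return result
    return error("ran out of steps without a completion signal")
```

An error result keeps the loop alive so the agent can attempt recovery; only an explicit complete() ends it.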

Model Tier Selection

Match model capability to task complexity:

| Task Type | Tier | Reasoning |
| --- | --- | --- |
| Research agent | Balanced | Tool loops, good reasoning |
| Chat | Balanced | Fast enough for conversation |
| Complex synthesis | Powerful | Multi-source analysis |
| Quick classification | Fast | High volume, simple task |
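A routing table like this reduces to a small lookup. The tier names follow the table above; the task-type keys and the default are assumptions for illustration.

```python
# Hypothetical tier routing; tiers follow the table, keys are illustrative.
TIER_BY_TASK = {
    "research": "balanced",
    "chat": "balanced",
    "synthesis": "powerful",
    "classification": "fast",
}

def pick_tier(task_type, default="balanced"):
    """Fall back to a mid-tier model for unrecognized task types."""
    return TIER_BY_TASK.get(task_type, default)
```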

Partial Completion

For multi-step tasks, track progress at task level with states: pending, in_progress, completed, failed, skipped.
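A minimal sketch of task-level tracking with those five states; Task, advance, and progress are illustrative names, not from the guide.

```python
from dataclasses import dataclass

STATES = {"pending", "in_progress", "completed", "failed", "skipped"}

@dataclass
class Task:
    name: str
    state: str = "pending"

def advance(task, state):
    """Move a task to a new state, rejecting anything outside the five states."""
    if state not in STATES:
        raise ValueError(f"unknown state: {state}")
    task.state = state
    return task

def progress(tasks):
    """Fraction of tasks that are finished (completed or skipped)."""
    done = sum(t.state in {"completed", "skipped"} for t in tasks)
    return done / len(tasks) if tasks else 1.0
```

Tracking at the task level means a resumed agent can report "2 of 5 done" instead of restarting from scratch.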

Context Limits

Design for bounded context from the start:

  • Tools support iterative refinement (summary → detail → full)
  • Provide mid-session consolidation (“summarize learnings and continue”)
  • Assume context will eventually fill
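The summary → detail → full idea can be sketched as a single tool parameter; the cutoff lengths here are arbitrary placeholders, not recommendations.

```python
def read_at_level(text, level="summary"):
    """Return progressively more of a document so the agent spends context
    only where needed. Cutoffs are illustrative placeholders."""
    if level == "summary":
        return text[:200]
    if level == "detail":
        return text[:2000]
    if level == "full":
        return text
    raise ValueError(f"unknown level: {level}")
```

The agent can skim many files at the summary level and pay for full content only on the few that matter.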

Mobile-Specific Patterns

Mobile presents unique constraints: agents are long-running while iOS apps are not. Apps may be backgrounded after a few seconds and terminated so the system can reclaim memory.

Checkpoint and Resume

What to checkpoint: Agent type, messages, iteration count, task list, custom state, timestamp

When to checkpoint: On app backgrounding, after each tool result, periodically during long operations

Resume flow: Load interrupted sessions → Filter by validity (one-hour default) → Show resume prompt → Restore messages and continue
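The checkpoint payload and the one-hour validity filter can be sketched as follows, in Python rather than Swift for brevity; the field names mirror the list above but are otherwise assumptions.

```python
import json
import time

RESUME_WINDOW_SECONDS = 3600  # the one-hour validity default from the text

def checkpoint(session):
    """Serialize the fields listed above: agent type, messages, iteration
    count, task list, custom state -- stamping the checkpoint time."""
    session = dict(session, timestamp=time.time())
    return json.dumps(session)

def resumable(blob, now=None):
    """Deserialize a checkpoint and report whether it is still fresh enough
    to offer a resume prompt."""
    state = json.loads(blob)
    now = time.time() if now is None else now
    return (now - state["timestamp"]) <= RESUME_WINDOW_SECONDS, state
```

On launch, the app would load all checkpoints, drop the stale ones, and offer to resume the rest.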

iOS Storage Architecture

iCloud-first with local fallback:

1. iCloud Container (preferred)
   iCloud.com.{bundleId}/Documents/
   ├── Library/
   ├── Research/books/
   ├── Chats/
   └── Profile/

2. Local Documents (fallback)
   ~/Documents/

3. Migration layer
   Auto-migrate local → iCloud

Background Execution

iOS provides approximately 30 seconds of background time. Use it to:

  • Complete current tool call if possible
  • Checkpoint session state
  • Transition gracefully to backgrounded state

For truly long-running agents, consider server-side orchestration with mobile as viewer and input mechanism.

Anti-Patterns

Architectural Anti-Patterns

| Pattern | Problem |
| --- | --- |
| Agent as router | Agent routes to functions rather than acting with judgment |
| Build app, then add agent | Agent limited to existing features; no emergent capability |
| Request/response thinking | Misses the loop; agents pursue outcomes through iterations |
| Defensive tool design | Over-constrained inputs prevent unanticipated capabilities |
| Happy path in code | Code handles edge cases; agent becomes mere caller |

Specific Anti-Patterns

  • Workflow-shaped tools: analyze_and_organize bundles judgment; break into primitives
  • Orphan UI actions: User can do something agent cannot achieve
  • Context starvation: Agent lacks awareness of available resources
  • Gates without reason: Domain tools restrict access unintentionally
  • Heuristic completion detection: Detecting completion through iteration counts or output checks

Success Criteria

Architecture Checklist

  • Agent achieves anything users achieve through UI (parity)
  • Tools are atomic primitives; domain tools are shortcuts (granularity)
  • New features via new prompts (composability)
  • Agent accomplishes unplanned tasks (emergent capability)
  • Behavior changes through prompt edits, not code refactoring

Implementation Checklist

  • System prompt includes available resources and capabilities
  • Agent and user share the same data space
  • Agent actions reflect immediately in UI
  • Every entity has full CRUD capability
  • External APIs use dynamic capability discovery where appropriate
  • Agents explicitly signal completion

The Ultimate Test

Describe an outcome within the application’s domain that no specific feature was built for. Can the agent figure out how to accomplish it, operating in a loop until success?

  • If yes: The application is agent-native
  • If no: The architecture is too constrained

Key Findings

  • Agent-native architecture treats features as prompt-described outcomes rather than coded logic
  • The five pillars (parity, granularity, composability, emergent capability, improvement) form an interdependent system
  • Files provide the most robust agent interface due to existing LLM fluency with filesystem operations
  • Mobile requires explicit checkpoint/resume patterns due to iOS backgrounding constraints
  • Latent demand discovery—observing what users ask agents to do—replaces speculative feature development
  • Domain tools should be shortcuts enabling efficiency, not gates restricting capability

References

  1. Agent-native Architectures Guide - Accessed 2026-01-20
  2. Dan Shipper Twitter Announcement - 2026-01-09
  3. Compound Engineering Plugin - GitHub Repository