Agent-Native Architectures: Building Apps After Code Ends
Research Date: 2026-01-20
Publication Date: 2026-01-09
Source URL: https://every.to/guides/agent-native
Authors: Dan Shipper (Every) and Claude (Anthropic)
Summary
This technical guide presents a framework for building “agent-native” software—applications where AI agents operate as first-class citizens rather than afterthought integrations. Co-authored by Dan Shipper and Claude, the document synthesizes principles from production applications (Reader, Anecdote) built at Every, combined with architectural patterns that emerged through collaborative development.
The core thesis posits that Claude Code demonstrated a fundamental insight: a capable coding agent is effectively a general-purpose agent. The same architecture enabling codebase refactoring can organize files, manage reading lists, or automate workflows. The Claude Code SDK makes this pattern accessible, allowing developers to build applications where features are outcomes described in prompts rather than logic written in code.
The guide distinguishes between tested patterns (marked as proven) and speculative contributions from Claude (marked as “needs validation”), providing transparency about the maturity level of different recommendations.
The Five Pillars of Agent-Native Design
Parity
The foundational principle: agents must achieve any outcome users can accomplish through the UI. Without parity, agents encounter dead ends when users request legitimate actions.
Test: Pick any UI action. Can the agent accomplish it?
Implementation discipline: When adding any UI capability, verify the agent can achieve the same outcome through available tools or tool combinations.
Granularity
Tools should be atomic primitives. Features are outcomes achieved by agents operating in loops with judgment—not choreographed sequences executed by code.
Test: To change behavior, do you edit prompts or refactor code?
| Approach | Characteristics |
|---|---|
| Less granular | classify_and_organize_files(files) bundles judgment into tool; limits flexibility |
| More granular | read_file, write_file, move_file, bash primitives; agent decides; prompt describes outcome |
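To make the contrast concrete, here is a minimal sketch of the atomic primitives named above, in illustrative Python signatures (the guide does not prescribe an implementation language):

```python
from pathlib import Path

# Atomic primitives: each does one mechanical thing.
# The agent supplies the judgment about when and why to call them.
def read_file(path: str) -> str:
    return Path(path).read_text()

def write_file(path: str, content: str) -> None:
    Path(path).write_text(content)

def move_file(src: str, dst: str) -> None:
    Path(src).rename(dst)

def list_files(directory: str) -> list[str]:
    return sorted(p.name for p in Path(directory).iterdir() if p.is_file())

# The less granular design would instead bundle the judgment into the tool:
# def classify_and_organize_files(files): ...  # agent reduced to a caller
```

With primitives like these, "organize my notes by topic" becomes a prompt, not a function.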
Composability
With atomic tools and parity established, new features emerge from new prompts without code changes. This applies to both developers shipping features and users customizing behavior.
Example prompt-based feature:
“Review files modified this week. Summarize key changes. Based on incomplete items and approaching deadlines, suggest three priorities for next week.”
The agent composes list_files, read_file, and judgment to achieve the outcome.
Emergent Capability
Agents accomplish tasks not explicitly designed for. This creates a flywheel:
- Build with atomic tools and parity
- Users request unanticipated capabilities
- Agent composes tools to accomplish them (or fails, revealing gaps)
- Observe patterns in requests
- Add domain tools or prompts for common patterns
- Repeat
Test: Can the agent handle open-ended requests within your domain?
This reveals latent demand—instead of guessing features, developers observe what users actually request and formalize patterns that emerge.
Improvement Over Time
Agent-native applications improve without shipping code through:
- Accumulated context: State persists across sessions via context files
- Developer-level refinement: Ship updated prompts for all users
- User-level customization: Users modify prompts for their workflows
- Self-modification (advanced): Agents edit own prompts or code with safety rails
Files as the Universal Interface
Agents demonstrate native fluency with filesystem operations. Claude Code succeeds because bash + filesystem represents the most battle-tested agent interface.
Design principle: If a human can look at the file structure and understand what’s happening, an agent probably can too.
The context.md Pattern
A file providing portable working memory without code changes:

    # Context

    ## Who I Am
    Reading assistant for the Every app.

    ## What I Know About This User
    - Interested in military history and Russian literature
    - Prefers concise analysis
    - Currently reading *War and Peace*

    ## What Exists
    - 12 notes in /notes
    - Three active projects
    - User preferences at /preferences.md

    ## Recent Activity
    - User created "Project kickoff" (two hours ago)
    - Analyzed passage about Austerlitz (yesterday)

    ## My Guidelines
    - Don't spoil books they're reading
    - Use their interests to personalize insights

    ## Current State
    - No pending tasks
    - Last sync: 10 minutes ago
The agent reads this file at session start and updates it as state changes.
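A minimal sketch of that read-at-start, update-as-you-go cycle. The file location and the section-replacement helper are assumptions for illustration, not the guide's implementation:

```python
from pathlib import Path

CONTEXT_PATH = Path("context.md")  # hypothetical location

def load_context() -> str:
    """Read working memory at session start; empty string if absent."""
    return CONTEXT_PATH.read_text() if CONTEXT_PATH.exists() else ""

def update_section(context: str, heading: str, body: str) -> str:
    """Replace the body under a '## heading' section, or append the section."""
    lines = context.splitlines()
    out: list[str] = []
    i, replaced = 0, False
    while i < len(lines):
        out.append(lines[i])
        if lines[i].strip() == f"## {heading}":
            replaced = True
            out.append(body)
            i += 1
            # Skip the old body until the next section heading.
            while i < len(lines) and not lines[i].startswith("## "):
                i += 1
            continue
        i += 1
    if not replaced:
        out += [f"## {heading}", body]
    return "\n".join(out)
```

Because the memory is a plain file, the user can open it, audit it, and edit it directly — the transparency the guide argues for.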
Files vs. Database
| Use Files For | Use Database For |
|---|---|
| Content users should read/edit | High-volume structured data |
| Configuration benefiting from version control | Data requiring complex queries |
| Agent-generated content | Ephemeral state (sessions, caches) |
| Anything benefiting from transparency | Data with relationships |
| Large text content | Data requiring indexing |
Principle: Files for legibility, databases for structure. When uncertain, prefer files—they provide transparency and user inspection.
From Primitives to Domain Tools
Start with pure primitives (bash, file operations, basic storage) to prove architecture and reveal actual agent needs. Add domain-specific tools deliberately as patterns emerge.
Reasons to add domain tools:
- Vocabulary: A `create_note` tool teaches the agent what "note" means in your system
- Guardrails: Some operations need validation beyond agent judgment
- Efficiency: Common operations can be bundled for speed and cost
Rule for domain tools: They represent one conceptual action from the user’s perspective. Include mechanical validation, but judgment about what/whether to act belongs in prompts.
Critical: Keep primitives available. Domain tools are shortcuts, not gates. Unless specific security or data integrity concerns exist, agents should access underlying primitives for edge cases.
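A sketch of the shortcut-not-gate idea, assuming a hypothetical `create_note` domain tool layered over a `write_file` primitive that remains directly callable for edge cases:

```python
from datetime import date
from pathlib import Path

NOTES_DIR = Path("notes")  # hypothetical layout

def write_file(path: Path, content: str) -> None:
    """Primitive: stays available to the agent alongside the domain tool."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content)

def create_note(title: str, body: str) -> Path:
    """Domain tool: one conceptual action with mechanical validation only.
    Judgment about *whether* to create a note stays in the prompt."""
    if not title.strip():
        raise ValueError("note title must be non-empty")
    slug = "-".join(title.lower().split())
    path = NOTES_DIR / f"{slug}.md"
    write_file(path, f"# {title}\n\n_{date.today().isoformat()}_\n\n{body}\n")
    return path
```

Note the division of labor: the tool enforces invariants a prompt cannot guarantee (non-empty title, consistent naming), while everything discretionary is left to the agent.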
Agent Execution Patterns
Completion Signals
Agents need explicit completion mechanisms—not heuristic detection:

    .success("Result")   // continue loop
    .error("Message")    // continue (retry possible)
    .complete("Done")    // stop loop
Completion is separate from success/failure. A tool can succeed and stop, or fail and signal continue for recovery.
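A language-neutral sketch of a loop that stops only on an explicit complete signal — never on heuristics like iteration counts. The type and function names here are assumptions, not the SDK's API:

```python
from dataclasses import dataclass
from enum import Enum

class Signal(Enum):
    SUCCESS = "success"    # tool result delivered; keep looping
    ERROR = "error"        # recoverable failure; keep looping (retry possible)
    COMPLETE = "complete"  # explicit stop, independent of success/failure

@dataclass
class ToolResult:
    signal: Signal
    message: str

def run_agent(step, max_iterations: int = 20) -> list[ToolResult]:
    """Loop until a step explicitly signals COMPLETE.
    `step` is a callable standing in for one agent iteration."""
    history: list[ToolResult] = []
    for _ in range(max_iterations):
        result = step(history)
        history.append(result)
        if result.signal is Signal.COMPLETE:
            break
    return history
```

The `max_iterations` cap is a safety rail against runaway loops, not a completion mechanism.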
Model Tier Selection
Match model capability to task complexity:
| Task Type | Tier | Reasoning |
|---|---|---|
| Research agent | Balanced | Tool loops, good reasoning |
| Chat | Balanced | Fast enough for conversation |
| Complex synthesis | Powerful | Multi-source analysis |
| Quick classification | Fast | High volume, simple task |
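One way to encode this routing is a small lookup with a safe default. The task-type keys and tier names below are illustrative assumptions; the actual model IDs behind each tier would be configured per deployment:

```python
from enum import Enum

class Tier(Enum):
    FAST = "fast"
    BALANCED = "balanced"
    POWERFUL = "powerful"

# Routing table mirroring the tiers in the table above.
TASK_TIERS = {
    "quick_classification": Tier.FAST,
    "chat": Tier.BALANCED,
    "research": Tier.BALANCED,
    "complex_synthesis": Tier.POWERFUL,
}

def pick_tier(task_type: str) -> Tier:
    """Default unknown task types to the balanced tier."""
    return TASK_TIERS.get(task_type, Tier.BALANCED)
```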
Partial Completion
For multi-step tasks, track progress at task level with states: pending, in_progress, completed, failed, skipped.
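A minimal task-level tracker using those five states (the class and method names are assumptions for illustration):

```python
from dataclasses import dataclass, field
from enum import Enum

class TaskState(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"
    SKIPPED = "skipped"

@dataclass
class TaskList:
    """Track multi-step progress at the task level so a partially
    finished run can be reported (or resumed) honestly."""
    states: dict[str, TaskState] = field(default_factory=dict)

    def add(self, name: str) -> None:
        self.states[name] = TaskState.PENDING

    def set(self, name: str, state: TaskState) -> None:
        self.states[name] = state

    def summary(self) -> dict[str, int]:
        counts: dict[str, int] = {}
        for s in self.states.values():
            counts[s.value] = counts.get(s.value, 0) + 1
        return counts
```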
Context Limits
Design for bounded context from the start:
- Tools support iterative refinement (summary → detail → full)
- Provide mid-session consolidation (“summarize learnings and continue”)
- Assume context will eventually fill
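The mid-session consolidation idea can be sketched as follows, under an assumed character budget and with `summarize` standing in for a model call:

```python
def consolidate(messages: list[str], budget: int, summarize) -> list[str]:
    """Mid-session consolidation: when total size exceeds the budget,
    collapse older messages into one summary and keep recent turns intact."""
    if sum(len(m) for m in messages) <= budget:
        return messages
    keep = messages[-2:]              # the most recent turns survive verbatim
    summary = summarize(messages[:-2])
    return [f"[summary of earlier session] {summary}"] + keep
```

A real implementation would count tokens rather than characters and would let the agent trigger consolidation itself ("summarize learnings and continue").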
Mobile-Specific Patterns
Mobile presents unique constraints: agents are long-running while iOS apps are not. Apps may background after seconds and terminate for memory reclamation.
Checkpoint and Resume
What to checkpoint: Agent type, messages, iteration count, task list, custom state, timestamp
When to checkpoint: On app backgrounding, after each tool result, periodically during long operations
Resume flow: Load interrupted sessions → Filter by validity (one-hour default) → Show resume prompt → Restore messages and continue
iOS Storage Architecture
iCloud-first with local fallback:
1. iCloud Container (preferred):

        iCloud.com.{bundleId}/Documents/
        ├── Library/
        ├── Research/books/
        ├── Chats/
        └── Profile/

2. Local Documents (fallback): `~/Documents/`
3. Migration layer: auto-migrates local → iCloud
Background Execution
iOS provides approximately 30 seconds of background time. Use it to:
- Complete current tool call if possible
- Checkpoint session state
- Transition gracefully to backgrounded state
For truly long-running agents, consider server-side orchestration with mobile as viewer and input mechanism.
Anti-Patterns
Architectural Anti-Patterns
| Pattern | Problem |
|---|---|
| Agent as router | Agent routes to functions rather than acting with judgment |
| Build app, then add agent | Agent limited to existing features; no emergent capability |
| Request/response thinking | Misses the loop; agents pursue outcomes through iterations |
| Defensive tool design | Over-constrained inputs prevent unanticipated capabilities |
| Happy path in code | Code handles edge cases; agent becomes mere caller |
Specific Anti-Patterns
- Workflow-shaped tools: `analyze_and_organize` bundles judgment; break into primitives
- Orphan UI actions: User can do something the agent cannot achieve
- Context starvation: Agent lacks awareness of available resources
- Gates without reason: Domain tools restrict access unintentionally
- Heuristic completion detection: Detecting completion through iteration counts or output checks
Success Criteria
Architecture Checklist
- Agent achieves anything users achieve through UI (parity)
- Tools are atomic primitives; domain tools are shortcuts (granularity)
- New features via new prompts (composability)
- Agent accomplishes unplanned tasks (emergent capability)
- Behavior changes through prompt edits, not code refactoring
Implementation Checklist
- System prompt includes available resources and capabilities
- Agent and user share the same data space
- Agent actions reflect immediately in UI
- Every entity has full CRUD capability
- External APIs use dynamic capability discovery where appropriate
- Agents explicitly signal completion
The Ultimate Test
Describe an outcome within the application’s domain that no specific feature was built for. Can the agent figure out how to accomplish it, operating in a loop until success?
- If yes: The application is agent-native
- If no: The architecture is too constrained
Key Findings
- Agent-native architecture treats features as prompt-described outcomes rather than coded logic
- The five pillars (parity, granularity, composability, emergent capability, improvement) form an interdependent system
- Files provide the most robust agent interface due to existing LLM fluency with filesystem operations
- Mobile requires explicit checkpoint/resume patterns due to iOS backgrounding constraints
- Latent demand discovery—observing what users ask agents to do—replaces speculative feature development
- Domain tools should be shortcuts enabling efficiency, not gates restricting capability
References
- Agent-native Architectures Guide - Accessed 2026-01-20
- Dan Shipper Twitter Announcement - 2026-01-09
- Compound Engineering Plugin - GitHub Repository