The Core Rule
Every capability call MUST produce a complete, usable result without any human interaction. Results are delivered either through the ABPResponse data or captured by the transport layer (e.g., window.print() intercepted by a Puppeteer-based client).
Understanding “Fully Programmatic”
The consumer of your capabilities is a program (an AI agent or automated client), not a person sitting in front of a browser. When an agent calls export.pdf, it expects a PDF. It cannot:
- Click buttons in a print dialog
- Dismiss alert boxes
- Interact with permission prompts
- Find files in a download bar
What it can do is rely on the transport layer: a Puppeteer-based client can intercept window.print() and produce the output itself.
The Transport Layer as Collaborator
Key insight: The “fully programmatic” rule does not mean every capability must accomplish everything within the page’s JavaScript context alone. ABP apps run inside a browser controlled by a transport layer, typically Puppeteer or Playwright via an ABP client. That transport layer has capabilities of its own:

| Transport Capability | What It Does | Quality |
|---|---|---|
| page.pdf() | Generates a vector PDF using Chrome’s native print engine | Selectable text, proper fonts, accurate CSS, small files |
| page.screenshot() | Captures the page or element as an image | Pixel-perfect rendering |
| CDP Page.printToPDF | Same as page.pdf(), available in headful mode | Same quality |

This matters because PDF quality depends on which engine produces the file:
- Chrome’s print engine: Vector PDFs with selectable text, proper fonts (~45KB for a typical page)
- Canvas-based JS libraries: Bitmap PDFs where text isn’t selectable, inflated file sizes (~500KB+)
The Correct Mental Model
The app produces content; the agent handles delivery. This is the same pattern across all delivery mechanisms:

| Delivery mechanism | Phase 1: Content Production | Phase 2: Delivery |
|---|---|---|
| Clipboard | App returns content (e.g., convert.markdownToHtml -> HTML) | Agent uses host tool (pbcopy, xclip) |
| PDF | App returns content (e.g., HTML) | Agent uses a client-provided PDF tool or saves the HTML |
| Download | App returns file as BinaryData | Agent writes to disk |

The agent-side tool plays the same role in every row: a client-provided PDF tool is for PDF what pbcopy is for clipboard.
The Headless Test (Amended)
Before shipping any capability, ask:
“Would this produce a complete, usable result for a program controlling the browser, with no human present?”
This accounts for the transport layer:
- window.print() + status message: Fails (expects human to click dialog)
- Return HTML content: Passes (agent or client generates PDF from returned HTML)
- window.print() as transport signal: Passes (bridge intercepts, generates PDF; advanced pattern)
- alert("Done!"): Fails (hangs waiting for human to click OK)
- ABP elicitation: Passes (agent can respond programmatically)
The browser runs in headless mode by default. Some capabilities may need headful mode (set ABP_HEADLESS=false) for GPU, OAuth, or permissions. But the implementation logic inside each capability MUST NOT depend on a human being there.
Anti-Pattern: Native Browser UI
The following browser APIs trigger native UI that agents cannot interact with. Using them inside capability handlers violates the core rule.
window.print(): Wrong Usage
Anti-pattern: Calling window.print() and returning a status message that expects a human to interact with the print dialog.
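A minimal sketch of this anti-pattern, for illustration only. The handler name and the response fields (success, data, message) are assumptions made for the example and are not taken from the ABP specification:

```javascript
// Hypothetical handler for an "export.pdf" capability, shown as what NOT to do.
function exportPdfWrong() {
  // In a real browser this opens the native print dialog and waits for a human.
  // The optional call only keeps the sketch runnable outside a browser.
  globalThis.window?.print?.();

  // The response looks successful but carries no PDF bytes at all.
  return {
    success: true,
    data: { message: "Print dialog opened. Choose 'Save as PDF' to export." },
  };
}
```

An agent calling this receives a success flag and a sentence, nothing it can write to disk as a PDF.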
Why this fails:
- The response is success: true with valid JSON, so it passes all error checks
- But it contains zero bytes of PDF data
- The agent receives a 109-byte text description of something a human was supposed to do
- The capability is named export.pdf but delivers no PDF
Preferred: Return HTML Content
Recommended pattern: The app exposes a content-producing capability (e.g., convert.markdownToHtml) that returns HTML. The agent or client can then generate a PDF from that HTML using whatever tool is available (e.g., a client-provided PDF rendering tool or a server-side library). No PDF-specific logic is needed in the app.
Agent workflow: call convert.markdownToHtml -> get HTML -> use a client-side tool to produce a PDF. The app never needs to know about PDF.
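A content-producing capability of this kind could look roughly like the sketch below. The Markdown converter is a deliberately tiny toy (headings and paragraphs only), and the handler signature and the data.html field are assumptions for illustration; a real app would call its existing renderer:

```javascript
// Escape characters that have special meaning in HTML.
function escapeHtml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Toy converter: blank-line-separated blocks become headings or paragraphs.
function markdownToHtml(markdown) {
  return markdown
    .split(/\n{2,}/)
    .map((block) => block.trim())
    .filter((block) => block.length > 0)
    .map((block) => {
      const heading = /^(#{1,6})\s+(.*)$/.exec(block);
      if (heading) {
        const level = heading[1].length;
        return `<h${level}>${escapeHtml(heading[2])}</h${level}>`;
      }
      return `<p>${escapeHtml(block)}</p>`;
    })
    .join("\n");
}

// Hypothetical handler for "convert.markdownToHtml": returns the HTML as data.
function convertMarkdownToHtml(params) {
  const body = markdownToHtml(params.markdown);
  const html = `<!DOCTYPE html><html><head><meta charset="utf-8"></head><body>\n${body}\n</body></html>`;
  return { success: true, data: { html } };
}
```

Nothing in the handler mentions PDF; what happens to the HTML afterwards is the agent's decision.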
Advanced: window.print() as Transport Signal
When the app needs its own CSS context for rendering (custom @media print styles, font preloading, complex page layout), it can use window.print() as a transport signal. Puppeteer-based clients can intercept this at connect time and generate a real PDF.
Use this pattern when:
- The app has elaborate @media print styles for specific PDF layout
- The PDF needs fonts, images, or styles from the app’s page context
- You need fine-grained control over page breaks, headers/footers
Do not use it when:
- Returning HTML would do, which is most cases: prefer returning HTML and letting the agent or client handle PDF generation
- You need transport-agnostic support (WebSocket, postMessage): use server-side PDF generation instead (see Correct Patterns below)
alert() / confirm() / prompt()
Anti-pattern: Using native browser dialogs.
- confirm() opens a modal dialog
- The page is blocked until a human clicks OK/Cancel
- The agent has no way to interact with the dialog
- The call hangs indefinitely
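The usual way around this is to ask through the protocol instead of through the browser. The sketch below assumes a hypothetical ctx.elicit function standing in for whatever elicitation call the ABP runtime provides; its name, its arguments, and the CANCELLED error code are all illustrative assumptions:

```javascript
// Hypothetical handler for a destructive capability.
// `ctx.elicit` is an assumed stand-in for ABP elicitation, not a documented API.
async function deleteDocument(params, ctx) {
  // Ask the agent (not a browser dialog) to confirm the action.
  const answer = await ctx.elicit({
    message: `Delete document ${params.id}? This cannot be undone.`,
    options: ["confirm", "cancel"],
  });

  if (answer !== "confirm") {
    return {
      success: false,
      error: { code: "CANCELLED", message: "Deletion was not confirmed", retryable: false },
    };
  }

  // The actual deletion would happen here.
  return { success: true, data: { deletedId: params.id } };
}
```

Because the question travels as data, the agent can answer it programmatically and the page never blocks.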
Programmatic Downloads
Anti-pattern: Triggering browser downloads.
- Browser download bar appears
- Agent has no access to the downloaded file
- Response contains no actual file data
Delivery Mechanisms: Clipboard and Share
navigator.clipboard.* and navigator.share() are delivery mechanisms — browser features that route content to humans. Even with auto-granted permissions, writing to a browser’s clipboard is useless to an agent in a separate process, and navigator.share() opens a native dialog the agent can’t interact with.
The fix is the same: expose the content-producing step as your ABP capability; skip the delivery step. The agent handles delivery on the host side.
See the ABP Implementation Guide — Delivery Mechanisms: Clipboard and Share for full wrong/right code examples.
Permission Prompts Without Requirements
Anti-pattern: Triggering permission prompts without declaring requirements.
Anti-Pattern: Status Messages Instead of Data
A capability whose name implies it produces output (export.*, convert.*, generate.*, render.*) MUST produce that output — as data in the ABPResponse. For PDF, the preferred approach is returning HTML content and letting the agent or client handle PDF generation.
Real-World Failure Example
An app’s export.pdf capability returned a status message instead of a PDF. Checked against the usual criteria:
- Response is success: true (passes)
- Response is valid JSON (passes)
- Response contains zero bytes of PDF (fails)
- Agent receives 109-byte text description instead of the PDF it requested
The Rule
If your capability is named export.pdf, the agent MUST end up with a PDF. This can happen several ways:
- Preferred: The app returns HTML via a convert.* capability, and the agent or client generates a PDF
- Inline: The ABPResponse contains the PDF as BinaryData
- Transport-captured: The app called window.print() and the client generated the PDF via page.pdf() (advanced pattern, Puppeteer-based clients)
The following do not satisfy the rule:
- Returning HTML and calling it export.pdf
- Returning a message telling someone to use a print dialog
- Returning a URL to a page where they can manually generate a PDF
Expected Outputs by Capability Pattern
| Capability Name Pattern | Agent MUST Receive |
|---|---|
| export.* | The exported file, as BinaryData in the response. (For PDF, prefer returning HTML via a convert.* capability and letting the agent or client handle PDF generation) |
| convert.* | The converted content (text or binary depending on target format) |
| generate.* | The generated content |
| render.* | The rendered output (image, HTML, etc.) |
Examples
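One way to make the table actionable is a small pre-ship check that flags output-implying capabilities whose responses carry no content. This is a sketch only: the places it looks for content (data.document.content, data.html, data.content) are assumed field names, not part of the ABP specification:

```javascript
// Name prefixes that, per the table above, promise actual output.
const OUTPUT_NAMESPACES = ["export.", "convert.", "generate.", "render."];

function impliesOutput(capabilityName) {
  return OUTPUT_NAMESPACES.some((prefix) => capabilityName.startsWith(prefix));
}

// Returns true when the response is acceptable for the given capability name.
// Capabilities outside the output namespaces are not checked here.
function deliversOutput(capabilityName, response) {
  if (!impliesOutput(capabilityName)) return true;
  if (!response || response.success !== true || !response.data) return false;

  // Assumed locations for output content; adjust to the app's real schema.
  const content =
    response.data.document?.content ?? response.data.html ?? response.data.content;
  return typeof content === "string" && content.length > 0;
}
```

A response such as { success: true, data: { message: "Print dialog opened" } } fails this check for export.pdf, while one carrying data.document.content passes.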
Anti-Pattern: Inconsistent Capabilities
Agents build expectations from the first capability they call in a namespace. If export.html returns the full document inline and export.pdf opens a print dialog, the inconsistency is confusing and breaks the agent’s workflow.
The Rule
All capabilities in the same namespace SHOULD produce consistent results from the agent’s perspective. If export.html returns content as BinaryData, then export.pdf must also deliver a file, whether as BinaryData in the response or captured via window.print(). The internal implementation differs, but the end result the agent receives should be consistent: a file of the declared type.
Example: Consistent Export Namespace
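A sketch of what a consistent namespace might look like, assuming a shared response builder. The field names content, filename, and the renderPdfBase64 callback are hypothetical; mimeType and encoding follow the BinaryData fields named in the checklist:

```javascript
// Encode a string as base64 via its UTF-8 bytes.
function toBase64(text) {
  const bytes = new TextEncoder().encode(text);
  let binary = "";
  for (const byte of bytes) binary += String.fromCharCode(byte);
  return btoa(binary);
}

// Shared builder so every export.* handler returns the same shape.
function documentResponse(content, mimeType, filename) {
  return {
    success: true,
    data: { document: { content, mimeType, encoding: "base64", filename } },
  };
}

// Hypothetical "export.html" handler.
function exportHtml(doc) {
  return documentResponse(toBase64(doc.html), "text/html", `${doc.name}.html`);
}

// Hypothetical "export.pdf" handler. `renderPdfBase64` is an assumed function
// (server call or library) that returns the PDF bytes as a base64 string.
function exportPdf(doc, renderPdfBase64) {
  return documentResponse(renderPdfBase64(doc.html), "application/pdf", `${doc.name}.pdf`);
}
```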
Both capabilities return data.document with the same structure. The agent can write generic code to handle any export.* capability.
Implementation Checklist
Before shipping any ABP capability, verify every item:
Core Requirements
- The capability produces a complete, usable result for a program controlling the browser, with no human present
- The response contains actual output data — not a status message about a side effect like “print dialog opened”
- If the capability name implies output (export.*, convert.*, generate.*, render.*), the response contains that output as data in the ABPResponse. For PDF, prefer returning HTML content and letting the agent or client handle PDF generation.
Browser APIs — What’s Allowed and What’s Not
- window.print(): Allowed as an advanced transport signal for PDF generation (clients using Puppeteer can intercept it). Prefer returning HTML and letting the agent or client handle PDF generation. Not allowed with a status message expecting human interaction.
- No calls to alert(), confirm(), or prompt(): use ABP elicitation instead
- No calls to window.open()
- No programmatic downloads (click-triggered <a download>, location.href to blob URLs): return data in the response instead
- No clipboard writes (navigator.clipboard.*): return content instead; agent handles clipboard on the host side
- No native share invocations (navigator.share()): return shareable data instead; agent routes as needed
- No OS file dialogs (showSaveFilePicker(), showOpenFilePicker(), <input type="file"> click). For saving: return data in the response; for loading: accept file content as an input parameter
- No OS notifications (new Notification(), Notification.requestPermission()): return notification-worthy data in the response; agent decides how to surface it
- No reliance on native permission prompts without declared requirements
Data Handling
- Binary output (PDFs, images, audio) is returned as BinaryData with mimeType and encoding
- Large files (>10MB) use BinaryDataReference with a downloadUrl
- The outputSchema accurately describes what the capability returns
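The inline-versus-reference decision could be sketched as below. The mimeType, encoding, and downloadUrl fields come from the checklist; the content and size fields, and the uploadForDownload callback that stores the bytes and returns a URL, are assumptions of this example:

```javascript
// The 10MB threshold named in the checklist above.
const INLINE_LIMIT_BYTES = 10 * 1024 * 1024;

// Inline small payloads as base64; hand back a reference for large ones.
// `bytes` is a Uint8Array. `uploadForDownload` is a hypothetical app function.
function toBinaryOutput(bytes, mimeType, uploadForDownload) {
  if (bytes.length > INLINE_LIMIT_BYTES) {
    // BinaryDataReference-style result: no inline content, just a URL.
    return { mimeType, size: bytes.length, downloadUrl: uploadForDownload(bytes, mimeType) };
  }

  // BinaryData-style result: base64-encoded content inline.
  let binary = "";
  for (const byte of bytes) binary += String.fromCharCode(byte);
  return { content: btoa(binary), mimeType, encoding: "base64", size: bytes.length };
}
```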
Consistency
- All capabilities in the same namespace use the same response shape
- If you have export.html and export.pdf, both produce their declared output format
Error Handling
- Permission-gated capabilities declare their requirements
- Permission denial returns PERMISSION_DENIED with retryable: true
- If user input is needed, ABP elicitation is used instead of browser dialogs
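The permission-denial item might be implemented with a small wrapper like the one below. The PERMISSION_DENIED code and retryable: true come from the checklist; the wrapper name and the surrounding error shape are assumptions. Browsers reject permission-gated calls such as getUserMedia with an error named NotAllowedError, which is what the sketch keys on:

```javascript
// Runs a permission-gated action and maps a denial to a structured error.
// `action` is any async function, e.g. one that calls
// navigator.mediaDevices.getUserMedia(...) in a browser.
async function withPermissionHandling(action) {
  try {
    return { success: true, data: await action() };
  } catch (err) {
    if (err && err.name === "NotAllowedError") {
      return {
        success: false,
        error: {
          code: "PERMISSION_DENIED",
          message: "The required browser permission was not granted",
          retryable: true,
        },
      };
    }
    // Anything else is a genuine failure and is left to the caller.
    throw err;
  }
}
```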
Correct Patterns for Common Capabilities
Pattern: PDF Generation
PDF generation is the capability most likely to be implemented incorrectly. There are three valid approaches:
Approach 1: Return HTML Content (Recommended)
The simplest approach: the app returns HTML content via a convert.* or render.* capability, and the agent or client generates a PDF from that HTML using whatever tool is available. No PDF-specific logic is needed in the app.
Agent workflow: call convert.markdownToHtml -> get HTML -> generate PDF using a client-side tool -> get PDF file.
What the agent sees:
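From the agent's side, the flow might look like the sketch below. The callCapability function is a hypothetical stand-in for however the ABP client invokes a capability, and data.html is an assumed response field. The page argument is expected to behave like a Puppeteer Page; setContent and pdf are existing Puppeteer methods:

```javascript
// Calls the content capability, then renders the returned HTML to PDF.
// Use a scratch page here, not the page that hosts the app itself,
// because setContent replaces the page's document.
async function markdownToPdf(callCapability, page, markdown) {
  const response = await callCapability("convert.markdownToHtml", { markdown });
  if (!response.success) {
    throw new Error("convert.markdownToHtml failed");
  }

  // Load the returned HTML and let the browser's print engine produce the PDF.
  await page.setContent(response.data.html, { waitUntil: "load" });
  return page.pdf({ format: "A4", printBackground: true });
}
```

The app's only job was to return HTML; paper size and background printing are decided on the agent side.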
Approach 2: window.print() as Transport Signal (Advanced)
When the app has specific rendering needs (custom @media print styles, font preloading, complex page layout), it can use window.print() as a transport signal. Puppeteer-based clients can intercept it and generate a PDF.
Implementation:
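On the client side, interception could be sketched as follows. This is one possible mechanism, not necessarily how any particular ABP client does it. The page argument is expected to behave like a Puppeteer Page (exposeFunction, evaluateOnNewDocument, and pdf are existing Puppeteer methods); the binding name __abpCapturePrint and the onPdf callback are assumptions of this sketch:

```javascript
// Replaces window.print() in the page with a call back into the client,
// which then generates a real PDF through the browser's print engine.
// Call this before navigating to the app so the override is in place.
async function interceptPrint(page, onPdf) {
  // Client-side function made callable from page JavaScript.
  await page.exposeFunction("__abpCapturePrint", async () => {
    const pdf = await page.pdf({ printBackground: true, preferCSSPageSize: true });
    onPdf(pdf);
  });

  // Runs in every new document before the app's own scripts.
  await page.evaluateOnNewDocument(() => {
    window.print = () => {
      window.__abpCapturePrint();
    };
  });
}
```

With this in place, an app that calls window.print() never opens a dialog; the page's @media print styles are applied by page.pdf() and the bytes arrive in onPdf.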
Approach 3: Server-Side or In-Browser PDF Generation (Fallback)
If your app must work across all transports, including WebSocket and postMessage where there is no Puppeteer to intercept window.print(), generate the PDF via a server endpoint or an in-browser JavaScript library and return it as BinaryData.
Server-side generation (recommended fallback):
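A sketch of the server-backed variant, under stated assumptions: /api/render-pdf is a hypothetical endpoint of the app's own backend that accepts HTML and responds with PDF bytes, and the OPERATION_FAILED code and the data.document shape are illustrative. The fetchImpl parameter exists only so the function can be exercised without a network:

```javascript
// Hypothetical "export.pdf" handler that delegates rendering to the server
// and returns the result inline as base64.
async function exportPdfViaServer(html, fetchImpl = fetch) {
  const res = await fetchImpl("/api/render-pdf", {
    method: "POST",
    headers: { "Content-Type": "text/html" },
    body: html,
  });

  if (!res.ok) {
    return {
      success: false,
      error: {
        code: "OPERATION_FAILED",
        message: `PDF service returned ${res.status}`,
        retryable: true,
      },
    };
  }

  // Convert the binary response body to base64 for the JSON response.
  const bytes = new Uint8Array(await res.arrayBuffer());
  let binary = "";
  for (const byte of bytes) binary += String.fromCharCode(byte);

  return {
    success: true,
    data: {
      document: {
        content: btoa(binary),
        mimeType: "application/pdf",
        encoding: "base64",
        size: bytes.length,
      },
    },
  };
}
```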
| Aspect | Approach 1: Return HTML content (Recommended) | Approach 2: window.print() (Advanced) | Approach 3: Server-side / JS library (Fallback) |
|---|---|---|---|
| PDF quality | Vector — selectable text, proper fonts (~45KB) | Vector — same engine, plus app’s CSS context | Vector (server-side) or bitmap (JS libraries, ~500KB+) |
| App complexity | None — just return content | Medium — print container, @media print CSS | Medium — server endpoint, encode output |
| Dependencies | None (client handles PDF) | None (browser’s built-in engine) | Server endpoint or JS library |
| Transport support | Puppeteer/Playwright only | Puppeteer/Playwright only | All transports |
| When to use | Most apps. App returns HTML, agent decides when/how to make a PDF | When the app has specific @media print styles or needs its own CSS context | When transport-agnostic support is a hard requirement |
Pattern: File Export (Any Type)
For any file type, generate the content in memory and return it as BinaryData:
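As an illustration, a CSV export might be sketched like this. The handler name and the content, filename, and size fields are assumptions; mimeType and encoding follow the BinaryData fields named in the checklist:

```javascript
// Serialize rows to CSV, quoting cells that contain quotes, commas, or newlines.
function toCsv(rows) {
  const escapeCell = (value) => {
    const text = String(value);
    return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  return rows.map((row) => row.map(escapeCell).join(",")).join("\n");
}

// Hypothetical "export.csv" handler: builds the file in memory and returns it
// inline instead of triggering a browser download.
function exportCsv(rows) {
  const bytes = new TextEncoder().encode(toCsv(rows));
  let binary = "";
  for (const byte of bytes) binary += String.fromCharCode(byte);

  return {
    success: true,
    data: {
      document: {
        content: btoa(binary),
        mimeType: "text/csv",
        encoding: "base64",
        filename: "export.csv",
        size: bytes.length,
      },
    },
  };
}
```

The agent decodes content and writes it wherever it needs the file; no download bar is involved.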
Pattern: User Confirmation via Elicitation
When a capability involves a destructive or irreversible action, use ABP elicitation to get confirmation:
Next Steps
- Building ABP Web Apps: Practical guide for making web applications ABP-compatible
- Examples & Tutorials: Working code examples for ABP implementations
- Protocol Overview: Full ABP specification
- MCP Bridge Quick Start: Test with the MCP Bridge