All 20 Criteria
P1-A Structured Data — 1/5
No evidence of schema.org markup, JSON-LD, or Organization/Product/Offer schema found on the homepage, pricing, or docs pages. Sitemap exists but no structured data layer is implemented.
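For illustration, the missing structured data layer could be as small as one JSON-LD block per plan. The sketch below builds a schema.org Product/Offer document in Python. The helper name `build_offer_jsonld` and the URL are placeholders, the $150 figure is the Pro price recorded later in this audit, and monthly billing is noted only in free text rather than modeled formally.

```python
import json


def build_offer_jsonld(name: str, price_usd: str, url: str) -> str:
    """Return a schema.org Product/Offer JSON-LD string for a single plan."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "url": url,  # placeholder URL, not a verified page
        "offers": {
            "@type": "Offer",
            "price": price_usd,
            "priceCurrency": "USD",
            "description": "Billed monthly",  # free-text note, an assumption
        },
    }
    return json.dumps(doc, indent=2)


# Pro plan price as recorded in this audit; everything else is illustrative.
pro_jsonld = build_offer_jsonld("E2B Pro", "150.00", "https://example.com/pricing")
```

The resulting string would be embedded in a `<script type="application/ld+json">` tag on the pricing page.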
P1-B Machine-Readable Pricing — 3/5
Pricing is unusually well-specified: per-second CPU pricing for 1/2/4/6/8 vCPUs, per-GiB/s memory pricing, and storage allotments are stated numerically. An HTML table is used for plan comparison. The prices are not tagged with schema.org/Offer, but they are stated precisely enough for a moderately capable scraper to parse.
P1-C llms.txt / Agent Layer — 3/5
An llms.txt file exists at `e2b.mintlify.app/llms.txt`, hosted on E2B's documentation platform rather than the root domain. It is a structured documentation index with content links. Docs are comprehensive and indexed for traversal, but there is no llms.txt at the e2b.dev root and the file itself contains no agent-facing usage guidance.
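To show what "indexed for traversal" means in practice, here is a minimal sketch of how an agent might extract the link entries from an llms.txt-style index. It assumes the common markdown list form `- [Title](url): description`. The sample text and the function name `parse_llms_txt` are made up for illustration and are not copied from E2B's file.

```python
import re

# Matches "- [Title](url)" with an optional ": description" suffix.
LINK_RE = re.compile(r"^\s*[-*]\s*\[([^\]]+)\]\(([^)\s]+)\)(?::\s*(.*))?$")


def parse_llms_txt(text: str) -> list[dict]:
    """Collect title/url/description for every link line in the index."""
    entries = []
    for line in text.splitlines():
        match = LINK_RE.match(line)
        if match:
            title, url, description = match.groups()
            entries.append(
                {"title": title, "url": url, "description": description or ""}
            )
    return entries


# Hypothetical sample, not the real file contents.
SAMPLE = """# Example Docs

## Docs
- [Quickstart](https://example.com/docs/quickstart.md): Start a first sandbox
- [Pricing](https://example.com/docs/pricing.md)
"""

entries = parse_llms_txt(SAMPLE)
```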
P1-D API / MCP Availability — 4/5
Full SDK for Python (`pip install e2b`) and JavaScript/TypeScript (`npm i e2b`). REST API with API key authentication. MCP support is explicitly listed as a homepage feature ("Secure Model Context Protocol implementations"). Works with OpenAI, Anthropic, Mistral, Llama, and all major agent frameworks. No publicly linked OpenAPI/Swagger spec found.
P1-E Discoverability (GEO) — 4/5
Exceptional customer signal: Manus, Perplexity, Hugging Face, Groq, Lindy, Gumloop. Featured in numerous "best AI agent sandbox" listicles and framework comparison posts. 5/5 rating on AI Agents Directory from 495 reviews. Highly visible in the AI developer ecosystem and likely indexed in LLM training data.
P2-A Offer Completeness — 4/5
Pricing page states Hobby (free), Pro ($150/mo), and Enterprise (custom) with per-second usage costs, concurrent sandbox caps, session duration limits, storage allotments, and CPU/RAM options. What, who, and how much are all accessible from the pricing page. Enterprise pricing still requires contact, which limits the score to 4.
P2-B Scope & Limits — 4/5
Explicit and detailed: Hobby (20 concurrent, 1hr sessions), Pro (100 concurrent, 24hr sessions, up to 1,100 extra concurrency), per-second CPU pricing for 5 vCPU sizes, per-GiB/s memory pricing, and storage tiers. An agent evaluating fit can calculate cost from stated parameters without human interpretation.
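The cost calculation described above can be sketched as follows. The rates in this example are PLACEHOLDERS chosen for easy arithmetic, not E2B's published prices, and the function name `estimate_session_cost` is hypothetical. Only the five vCPU sizes and the per-second billing model come from the audit.

```python
from decimal import Decimal

# PLACEHOLDER rates in USD per second, keyed by vCPU size. Not real prices.
CPU_RATE_PER_SECOND = {
    1: Decimal("0.00001"),
    2: Decimal("0.00002"),
    4: Decimal("0.00004"),
    6: Decimal("0.00006"),
    8: Decimal("0.00008"),
}
# PLACEHOLDER memory rate in USD per GiB per second.
MEM_RATE_PER_GIB_SECOND = Decimal("0.000005")


def estimate_session_cost(vcpus: int, mem_gib: int, seconds: int) -> Decimal:
    """Estimate the cost of one sandbox session under per-second billing."""
    if vcpus not in CPU_RATE_PER_SECOND:
        raise ValueError(f"unsupported vCPU size: {vcpus}")
    if mem_gib < 0 or seconds < 0:
        raise ValueError("mem_gib and seconds must be non-negative")
    per_second = CPU_RATE_PER_SECOND[vcpus] + mem_gib * MEM_RATE_PER_GIB_SECOND
    return seconds * per_second


# One hour on 2 vCPUs with 1 GiB of memory, using the placeholder rates.
one_hour = estimate_session_cost(vcpus=2, mem_gib=1, seconds=3600)
```

`Decimal` is used instead of floats so that very small per-second rates accumulate without rounding drift.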
P2-C Substitution Rules — 1/5
No guidance found on unavailability, failover, or substitution behavior. The status page records incidents after the fact but no pre-stated fallback policy exists.
P2-D Conditional Logic Transparency — 2/5
Enterprise tier is "contact us" with no disclosed pricing or conditions. Hobby and Pro tiers are well-defined. BYOC and self-hosted options exist for enterprise but conditions are not publicly stated.
P2-E Semantic Precision — 4/5
Highly precise: "<200ms sandbox startup," "150ms Firecracker cold starts," session durations in hours, vCPU counts, specific pricing per second. Marketing language is present but core capabilities are stated with measurable specificity. Notably avoids vague superlatives in technical descriptions.
P3-A Verifiable Performance Data — 4/5
Public status page (status.e2b.dev) publishes 30-day uptime percentages by component: Sandbox Network Traffic (100%), Sandbox Pause/Resume/Create (99.971%), Template Builds (99.983%), Dashboard (100%), Documentation (100%). Incident history is granular (timestamps, duration, resolution). This is the strongest trust signal in E2B's stack.
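To make the percentages concrete, an agent can convert them into implied downtime. Over a 30-day window (43,200 minutes), 99.971% uptime corresponds to about 12.5 minutes of downtime and 99.983% to about 7.3 minutes. A minimal sketch, with `downtime_minutes` as an illustrative helper name:

```python
def downtime_minutes(uptime_percent: float, window_days: int = 30) -> float:
    """Convert an uptime percentage into implied downtime for the window."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


# Component figures as listed on the status page at audit time.
pause_resume_create = downtime_minutes(99.971)
template_builds = downtime_minutes(99.983)
```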
P3-B Scoped Permissions — 3/5
API key authentication with `E2B_API_KEY`. Role-based access control is available on the Enterprise plan, so some permission tiers exist. No evidence of agent-specific time-bounded, amount-bounded, or action-bounded permission scopes, which would be necessary for fully autonomous agent deployments.
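As an illustration of what time-bounded, amount-bounded, and action-bounded scopes could look like, here is a minimal sketch. `AgentScope` and its fields are entirely hypothetical and do not describe any existing E2B feature. The check would have to be enforced by whatever layer issues credentials to the agent.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from decimal import Decimal


@dataclass(frozen=True)
class AgentScope:
    """Hypothetical permission scope for one delegated agent."""
    expires_at: datetime          # time bound
    max_spend_usd: Decimal        # amount bound
    allowed_actions: frozenset    # action bound

    def permits(self, action: str, spent: Decimal, cost: Decimal,
                now: datetime) -> bool:
        """Allow the action only if all three bounds still hold."""
        return (
            now < self.expires_at
            and action in self.allowed_actions
            and spent + cost <= self.max_spend_usd
        )


# Example: a one-hour scope capped at $5 for two illustrative actions.
issued = datetime(2026, 4, 1, 12, 0, tzinfo=timezone.utc)
scope = AgentScope(
    expires_at=issued + timedelta(hours=1),
    max_spend_usd=Decimal("5"),
    allowed_actions=frozenset({"create_sandbox", "run_code"}),
)
```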
P3-C Audit Trail — 1/5
No evidence of a machine-accessible audit log API for agent systems. Documentation covers sandbox commands and file operations, but no structured transaction log endpoint that an orchestrating agent could query to verify what a child agent did inside a sandbox.
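For reference, the kind of structured log an orchestrating agent would want to query can be sketched in a few lines. Everything here is hypothetical: the `AuditEvent` fields, the `AuditLog` class, and the in-memory storage stand in for an endpoint that does not exist in the audited product.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditEvent:
    """One hypothetical record of something a child agent did."""
    sandbox_id: str
    agent_id: str
    action: str      # e.g. "command" or "file_write"
    detail: str
    timestamp: str   # ISO 8601, UTC


class AuditLog:
    """In-memory stand-in for a queryable, append-only audit log."""

    def __init__(self) -> None:
        self._events: list[AuditEvent] = []

    def record(self, event: AuditEvent) -> None:
        self._events.append(event)

    def query(self, sandbox_id: str) -> list[dict]:
        """Return every event for one sandbox, in recorded order."""
        return [asdict(e) for e in self._events if e.sandbox_id == sandbox_id]


log = AuditLog()
log.record(AuditEvent("sbx-1", "child-a", "command", "pip install numpy",
                      "2026-04-01T12:00:00Z"))
log.record(AuditEvent("sbx-2", "child-b", "file_write", "/tmp/out.csv",
                      "2026-04-01T12:00:05Z"))
```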
P3-D Behavioral Consistency — 2/5
Open-source SDKs on GitHub provide version history and transparency. EU AI Act alignment mentioned. Status page tracks uptime. No version-controlled terms of service with change notice periods found. Behavioral consistency signals exist implicitly (open source) but are not explicitly surfaced for agent consumption.
P4-A Friction-Free Activation — 5/5
No credit card required for Hobby tier. Installation is a single command (`pip install e2b` or `npm i e2b`). API key issued immediately after signup. $100 in free credits included. An agent can self-provision and start executing sandboxes in under 2 minutes with no human gate. This is the maximum standard for self-serve activation.
P4-B Agent Decision Signals — 3/5
Free tier with $100 credits is a clear agent-legible trial signal. Usage-based pricing with per-second granularity allows an agent to calculate cost against budget before committing. No explicit programmatic signal for "when to upgrade" (e.g., no capability-check endpoint or machine-readable threshold notification), preventing a score of 5.
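Without a programmatic upgrade signal, an agent has to derive the threshold itself from the published limits. A minimal sketch of that client-side check follows, using only the base Hobby and Pro caps recorded in this audit. The function name `pick_plan` is hypothetical, and the Pro concurrency add-on is deliberately not modeled.

```python
from typing import Optional

# (plan name, max concurrent sandboxes, max session hours), base limits
# as recorded in this audit. The Pro concurrency add-on is not modeled.
PLANS = [
    ("Hobby", 20, 1),
    ("Pro", 100, 24),
]


def pick_plan(concurrent: int, session_hours: float) -> Optional[str]:
    """Return the smallest plan whose base limits cover the workload.

    Returns None when neither base plan fits, which in practice means
    checking add-ons or contacting sales for Enterprise terms.
    """
    for name, max_concurrent, max_hours in PLANS:
        if concurrent <= max_concurrent and session_hours <= max_hours:
            return name
    return None
```

For example, a workload of 10 concurrent sandboxes with 5-hour sessions exceeds the Hobby session limit, so `pick_plan(10, 5)` selects Pro.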
P5-A Integration Depth — 3/5
Deep integrations with LangChain, AutoGen, CrewAI, OpenAI, Anthropic, Groq, Hugging Face, and others. Custom templates (sandbox environments) create configuration lock-in as teams invest in environment setup. Production deployments at Manus scale suggest meaningful switching cost. However, the underlying interface (Firecracker VMs) is not unique enough to create network-effect gravity.
P5-B Agent Memory Layer — 2/5
Sessions run up to 24 hours with persistent state within a session. File operations, package installations, and code execution results accumulate within a session. But there is no cross-session memory layer, no agent preference store, and no structured API for an agent to retrieve its historical context from prior sessions.
P5-C Programmatic Renewal — 1/5
Stripe is mentioned as the payment processor. No evidence of a renewal API, usage-triggered upgrade hooks, or any mechanism for an agent to manage its own subscription lifecycle programmatically. Standard human-facing SaaS billing only.
P5-D Compounding Value Signal — 2/5
Custom templates improve as teams refine them. Open-source SDK improvements benefit all users over time. But no agent-readable signal surfaces this compounding value — no API endpoint returning "your sandbox success rate over time" or "templates you've built and their performance history."
Rubric v1 (April 2026). Scores reflect the company's state on the audit date and may have changed since.