Build with Waterfall

Waterfall is an OpenAI-compatible API gateway. Point your app at our base URL, pick a routing strategy, and let Waterfall choose the smartest capable model for each request.

Base URL

https://api.getwaterfall.org/v1

Auth

Use a Waterfall API key backed by prepaid credits. x402 has the lowest top-up minimum ($5, versus $25 by card).

Format

OpenAI-compatible chat completions. Most OpenAI SDK code works with a base URL change.

Quick Start

1. Choose how to pay

Use an API key for normal app traffic. Add credits with x402 ($5 minimum) or card ($25 minimum).

2. Pick a strategy

Start with free_smart for free agent work, or auto for the default cascade.

3. Send the request

We handle model choice, fallbacks, and cost-aware routing.

Examples

Python

from openai import OpenAI

client = OpenAI(
    base_url="https://api.getwaterfall.org/v1",
    api_key="wf-sk-your-key"
)

response = client.chat.completions.create(
    model="auto",
    extra_body={"routing_strategy": "free_smart"},
    messages=[{"role": "user", "content": "Write a short launch checklist"}],
)

print(response.choices[0].message.content)

TypeScript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.getwaterfall.org/v1",
  apiKey: "wf-sk-your-key",
});

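// routing_strategy is a Waterfall field, so it goes at the top level of the
// request body (as in the curl example), not inside an extra_body key.
// Building the params as a variable avoids TypeScript's excess-property check.
const params = {
  model: "auto",
  messages: [{ role: "user" as const, content: "Write a short launch checklist" }],
  routing_strategy: "free_smart",
};

const response = await client.chat.completions.create(params);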

console.log(response.choices[0].message.content);

curl

curl https://api.getwaterfall.org/v1/chat/completions \
  -H "Authorization: Bearer wf-sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "routing_strategy": "free_smart",
    "messages": [
      {"role": "user", "content": "Write a short launch checklist"}
    ]
  }'
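
Python (standard library only)

If you would rather not install the OpenAI SDK, the curl request above can be sketched with Python's standard library. This is a minimal sketch, not an official client: build_chat_request is a hypothetical helper name, and the body layout simply mirrors the curl example, with routing_strategy at the top level of the JSON.

```python
import json
import urllib.request

BASE_URL = "https://api.getwaterfall.org/v1"


def build_chat_request(api_key, prompt, strategy="free_smart"):
    """Build (but do not send) a chat completions request.

    Hypothetical helper: the body mirrors the curl example above.
    """
    body = {
        "model": "auto",
        "routing_strategy": strategy,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("wf-sk-your-key", "Write a short launch checklist")
# To send it: urllib.request.urlopen(req), then json.load() the response.
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any credits are spent.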

Routing Strategies

A strategy tells Waterfall what matters for this request: cost, tools, long context, privacy, voice, retrieval, or quality.

free_smart

Free tool-capable models for agents, retries, and high-volume worker tasks

orchestrator

Planner-grade routing for hard agent tasks and synthesis

tool_calling

Models that are good at tools, JSON, and structured output

context_max

Long-context models for docs, repos, and RAG

smart_image

Image generation and image editing models

smart_transcribe

Audio to text

smart_voice

Voice input and output

smart_embedding

Embedding models for search and RAG

smart_rerank

Rerank models for better retrieval results

smart_safety

Moderation, PII, and safety checks

privacy_smart

Zero Data Retention and direct-provider routes where available
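
One way to use the list above in application code is a small lookup from your own task labels to a strategy name. This is a sketch under assumptions: the task labels on the left are hypothetical, only the strategy names come from this page, and falling back to "auto" assumes it is accepted as a routing_strategy value. Strategies such as smart_embedding or smart_image may be served by endpoints other than chat completions, which this page does not specify.

```python
# Task labels (keys) are hypothetical; strategy names (values) are from the list above.
STRATEGY_FOR_TASK = {
    "agent_worker": "free_smart",
    "planning": "orchestrator",
    "structured_output": "tool_calling",
    "long_document": "context_max",
    "image": "smart_image",
    "transcription": "smart_transcribe",
    "voice": "smart_voice",
    "embedding": "smart_embedding",
    "rerank": "smart_rerank",
    "moderation": "smart_safety",
    "sensitive": "privacy_smart",
}


def pick_strategy(task: str) -> str:
    """Return the strategy for a task label, or "auto" for unknown labels.

    Assumption: "auto" (the default cascade) is valid as a routing_strategy.
    """
    return STRATEGY_FOR_TASK.get(task, "auto")


strategy = pick_strategy("long_document")
```

With the Python SDK example from earlier, the result would be passed as extra_body={"routing_strategy": strategy}.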

x402 Payments

x402 lets software pay with USDC on Base. In Waterfall, it is the lower-minimum way to buy prepaid credits: $5 instead of the $25 card minimum.

Non-crypto users can still pay by card. x402 is there for agents and developers who want cheaper, simpler credit top-ups without card rails.

Free strategies stay free. Paid strategies draw down prepaid credits by the actual model and token usage reported for each request.
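
To budget for usage-based drawdown, you can estimate a request's cost from its token counts. The sketch below is illustrative only: the per-million-token prices are made-up inputs, estimate_cost_usd is a hypothetical helper, and Waterfall's actual billing is whatever it reports for each request. In the OpenAI chat completions format, token counts appear in response.usage.prompt_tokens and response.usage.completion_tokens; this assumes Waterfall populates them.

```python
def estimate_cost_usd(prompt_tokens, completion_tokens,
                      input_price_per_mtok, output_price_per_mtok):
    """Estimate one request's cost from its token usage.

    Prices are USD per million tokens and are hypothetical inputs;
    they are not Waterfall's published rates.
    """
    input_cost = prompt_tokens / 1_000_000 * input_price_per_mtok
    output_cost = completion_tokens / 1_000_000 * output_price_per_mtok
    return input_cost + output_cost


# Example with made-up prices: 1,000 prompt tokens and 500 completion tokens.
cost = estimate_cost_usd(
    1_000, 500, input_price_per_mtok=0.50, output_price_per_mtok=1.50
)
```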

# x402 credit top-ups use USDC on Base.
# Minimum: $5 with x402, or $25 by card.
# Wallet: 0xBF5205788d5f5817822ccDD724b6b3e967DA8d75

curl https://api.getwaterfall.org/api/v1/billing/credits \
  -H "Authorization: Bearer $WATERFALL_API_KEY"

Privacy and Regulated Data

privacy_smart sends requests over Zero Data Retention routes and direct-provider routes where they are available, which suits privacy-sensitive apps.

Zero Data Retention means prompt and response content should not be stored after the request is processed. It does not mean the request was never processed by a provider, and it does not automatically make the route HIPAA compliant or safe for privileged legal work.

If you handle PHI, client secrets, or regulated data, you may need a BAA, DPA, approved subprocessors, audit logs, region controls, and retention controls. Use a covered direct-provider route for that work. Do not rely on a generic privacy route as your compliance plan.

Models

Waterfall keeps a live catalog of chat, reasoning, coding, vision, image, audio, embedding, rerank, safety, and moderation models. Use the Models page to inspect the current list and see which strategies each model belongs to.