For Developers & AI Users

Drop-in API, distributed backend

OpenAI-compatible inference running on community GPUs. Switch your base URL and your code runs on a decentralized network instead of a single cloud.

A Cargo is a portable, signed unit of computation that runs on Islands across the Archipelag.io network.

Think of it as a shipping container for code — each one packages a model, runtime, and resource requirements into a single deployable artifact. Consumers submit jobs, the coordinator finds the best Island, and the Cargo executes in a secure sandbox.
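As a sketch, the "shipping container" packaging might be described by a manifest like the one below. The field names here are illustrative assumptions, not the actual Cargo schema:

```python
# Hypothetical Cargo manifest — field names are illustrative, not the real schema.
cargo_manifest = {
    "name": "mistral-7b-chat",
    "runtime": "gguf",               # one of the supported runtimes: onnx, docker, wasm, gguf
    "model": "mistral-7b-instruct",  # the model artifact packaged inside the Cargo
    "resources": {                   # requirements the coordinator matches against Islands
        "gpu_memory_gb": 8,
        "disk_gb": 5,
    },
    "signed": True,                  # Cargos are signed units of computation
}

# The coordinator would match these requirements against available Islands
# before dispatching the job to a secure sandbox.
assert cargo_manifest["runtime"] in {"onnx", "docker", "wasm", "gguf"}
```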

Consumer → Cargo → Island (supported runtimes: ONNX, Docker, WASM, GGUF)
Read the full architecture docs
01 — Cargos

What you can run today

The network currently supports these Cargo types. More are being added during beta.

LLM Chat

Conversational AI via community GPUs

  • Mistral 7B, Llama, and more open models
  • Token-by-token streaming responses
  • Multi-turn conversation history
  • Market-based pricing — varies by model and demand

Image Generation

Text-to-image on community hardware

  • Stable Diffusion XL, FLUX
  • Up to 1024px resolution
  • Batch generation support
  • Per-image pricing set by the compute exchange
02 — Scale Beyond One Island

Multi-Island compute

Run models too large for any single device, or process batches across dozens of Islands at once.

Pipeline Parallelism

Shard a large model across multiple Islands

  • Run 70B+ models across 2–8 Islands
  • Automatic — coordinator detects when pipeline is needed
  • Same API, same streaming — invisible to your code
  • Each Island holds a layer shard — activations flow through the chain
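The layer-sharding idea above can be pictured in a few lines: split a model's layers into contiguous shards, one per Island, and pass activations down the chain. This is a toy illustration, not the coordinator's actual placement algorithm:

```python
# Toy sketch of pipeline parallelism: shard layers across Islands,
# then let activations flow through each shard in order.

def shard_layers(num_layers: int, num_islands: int) -> list[range]:
    """Split layer indices into contiguous shards, one per Island."""
    base, extra = divmod(num_layers, num_islands)
    shards, start = [], 0
    for i in range(num_islands):
        size = base + (1 if i < extra else 0)
        shards.append(range(start, start + size))
        start += size
    return shards

# An 80-layer model across 4 Islands: each Island holds 20 layers.
shards = shard_layers(80, 4)
print([len(s) for s in shards])  # [20, 20, 20, 20]

# The output activations of one shard become the input of the next,
# which is why the split is invisible to the API caller.
```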

Batch Fan-Out

Split parallel work across many Islands at once

  • Submit up to 100 inputs in a single API call
  • Each input dispatched to a different Island in parallel
  • Results merged automatically by index
  • Real-time progress — track completion via PubSub or SSE
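A batch submission could look like the sketch below. The endpoint path and field names are assumptions for illustration, since the exact batch API isn't shown on this page:

```python
# Hypothetical batch fan-out request — the endpoint and field names are
# assumptions, not the documented batch API.
inputs = [f"Summarize document {i}" for i in range(100)]
assert len(inputs) <= 100  # the network caps a single call at 100 inputs

batch_request = {
    "model": "mistral-7b",
    "inputs": inputs,  # each input is dispatched to a different Island in parallel
}

# Results come back merged by index, so output i corresponds to inputs[i].
# The submission itself might be something like:
# requests.post("https://app.archipelag.io/api/v1/batch", json=batch_request)
```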
Read the full docs
03 — Workflows

Chain models into pipelines

Define multi-step workflows as DAGs. The coordinator resolves dependencies, dispatches steps in parallel across Islands, maps data between steps, and merges results.

Translate text, classify the output, then summarize — all in one API call. Each step runs on the best-fit Island. If a step fails, its per-step failure policy decides whether to retry, skip, or abort.

  • DAG-based dependency resolution — steps run in parallel when possible
  • Data mapping between steps via templates
  • Built-in failure handling — retry, skip, or abort per step
  • Same API — submit a workflow definition, get merged results
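The dependency resolution described above can be sketched as follows: each step names the steps it depends on, and the coordinator groups independent steps into waves that run in parallel. The step names and `depends_on` field are illustrative, not the real workflow schema:

```python
# A workflow as a DAG: each step lists the steps it depends on.
workflow = {
    "translate": {"depends_on": []},
    "classify":  {"depends_on": ["translate"]},
    "summarize": {"depends_on": ["translate"]},
}

def execution_waves(steps: dict) -> list[list[str]]:
    """Resolve the DAG into waves; steps in the same wave can run in parallel."""
    done, waves = set(), []
    while len(done) < len(steps):
        wave = sorted(
            name for name, step in steps.items()
            if name not in done and set(step["depends_on"]) <= done
        )
        if not wave:
            raise ValueError("cycle in workflow DAG")
        waves.append(wave)
        done.update(wave)
    return waves

print(execution_waves(workflow))  # [['translate'], ['classify', 'summarize']]
```

Here `classify` and `summarize` both depend only on `translate`, so they land in the same wave and can be dispatched to two Islands at once.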
Step 1: Translate — LLM Chat → Island A
Step 2a: Classify | Step 2b: Summarize — parallel → Island B + Island C
Results merged — translation, classification, and summary returned together
04 — Advanced Capabilities

Beyond inference

Cache, test, train, and protect — all on the same distributed network.

Inference Caching

Similar prompts served from cache instantly

  • Cache hits are instant and free — no Island compute used
  • Semantic matching catches near-duplicates (not just exact text)
  • 30–60% cost reduction for chatbots, FAQ, and support
  • Automatic — no configuration needed, per-Cargo opt-out available
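Semantic matching can be pictured as an embedding-similarity lookup: a new prompt hits the cache if its embedding is close enough to a cached prompt's. The toy sketch below uses hand-made vectors in place of real embedding-model output, and the 0.95 threshold is an arbitrary illustration:

```python
import math

# Toy semantic cache: match prompts by cosine similarity of embeddings.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Cached prompt embeddings mapped to their answers (hand-made vectors).
cache = {
    (1.0, 0.1, 0.0): "Cached answer about refunds",
}

def lookup(query_vec, threshold=0.95):
    """Return a cached answer if a near-duplicate prompt exists."""
    for vec, answer in cache.items():
        if cosine(query_vec, vec) >= threshold:
            return answer  # cache hit: instant and free, no Island compute
    return None           # cache miss: dispatch the job to an Island

print(lookup((0.99, 0.12, 0.01)))  # near-duplicate of the cached prompt → hit
print(lookup((0.0, 1.0, 0.0)))     # unrelated prompt → None
```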

A/B Testing

Compare model versions with real traffic

  • Split traffic between model variants (80/20, 50/50, etc.)
  • Track latency, success rate, and throughput per variant
  • Statistical significance with p-values and confidence intervals
  • Auto-promote the winner when results are conclusive
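The significance judgment above can be grounded with a standard two-proportion z-test on per-variant success rates. This is textbook statistics, not Archipelag.io's internal implementation:

```python
import math

# Two-sided two-proportion z-test: is variant A's success rate
# significantly different from variant B's?
def two_proportion_p_value(succ_a, n_a, succ_b, n_b):
    p_a, p_b = succ_a / n_a, succ_b / n_b
    pooled = (succ_a + succ_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # two-sided p-value from the standard normal distribution
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Variant A: 920/1000 successes; Variant B: 870/1000 successes.
p = two_proportion_p_value(920, 1000, 870, 1000)
print(p < 0.05)  # True — significant at the 5% level, so A could be promoted
```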

Federated Fine-Tuning

Train models without centralizing data

  • Each Island trains on its local data — data never leaves
  • Only gradient updates exchanged (not raw data)
  • Differential privacy for mathematical data protection
  • HIPAA, finance, legal — fine-tune without compliance risk

Confidential Inference

Encrypted end-to-end — Islands can't see your data

  • TEE hardware attestation (Intel SGX, AMD SEV, ARM TrustZone)
  • AES-256-GCM encryption — input and output encrypted at all times
  • Homomorphic option for software-only privacy (no TEE required)
  • Same API, same billing — just add confidential: true
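Based on the "just add confidential: true" note above, a request might be built like this. That the flag travels in the request body is an assumption; the real parameter location may differ:

```python
# Hypothetical confidential-inference request. Passing the flag via the
# request body as "confidential": true is an assumption based on the
# description above, not confirmed API documentation.
request_kwargs = {
    "model": "mistral-7b",
    "messages": [{"role": "user", "content": "Sensitive question"}],
    "extra_body": {"confidential": True},  # route to a TEE-attested Island
}

# With the OpenAI client this would be:
# client.chat.completions.create(**request_kwargs)
assert request_kwargs["extra_body"]["confidential"] is True
```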
Read the docs
05 — Integration

OpenAI-compatible API

If your code works with OpenAI, it works with Archipelag.io. Change the base URL and you're done.

# Python — streaming chat
from openai import OpenAI

client = OpenAI(
    base_url="https://app.archipelag.io/api/v1",
    api_key="your-key"
)

stream = client.chat.completions.create(
    model="mistral-7b",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
// JavaScript — streaming chat
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://app.archipelag.io/api/v1',
  apiKey: 'your-key'
});

const stream = await client.chat.completions.create({
  model: 'mistral-7b',
  messages: [{ role: 'user', content: 'Hello!' }],
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
Endpoint — /v1/chat/completions

Standard OpenAI chat completions format. Supports streaming, temperature, max_tokens, and all common parameters.

Authentication — API Key or Session

Bearer token via API key (read/write scopes) or session cookie for the web UI. Keys are managed in your dashboard.

Data residency — Jurisdiction routing

Optional policy parameter restricts job placement to Islands in specific regions (EU, Switzerland, custom). For teams with compliance requirements.
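A jurisdiction-restricted request might be shaped like the sketch below. The `policy` field name and region codes are assumptions based on the description above:

```python
# Hypothetical jurisdiction-routing request. The "policy" field and the
# region codes are illustrative assumptions, not documented values.
request_kwargs = {
    "model": "mistral-7b",
    "messages": [{"role": "user", "content": "Hello!"}],
    "extra_body": {"policy": {"regions": ["EU", "CH"]}},  # restrict placement
}

# Jobs carrying this policy would only be placed on Islands located
# in the listed regions (e.g. EU member states or Switzerland).
assert "EU" in request_kwargs["extra_body"]["policy"]["regions"]
```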

06 — Pricing

Market-based, not fixed tiers

Prices are set by supply and demand on the compute exchange, not by us.

Live rates

The compute exchange shows current clearing prices for each Cargo type. Check the exchange for live rates.

Free during beta

The beta uses virtual credits — no real money changes hands. You get 10,000 credits on signup and they auto-refill. After beta: buy credits, no subscriptions.

Transparent billing

Every job shows its clearing price before execution. You see exactly what you'll pay and which Island will run it.

07 — Platform

More than an API

Use the full web platform or build your own integration.

Chat interface

Full-featured web chat with conversation history, model selection, and host preferences. Real-time streaming built in.

Image generation

Web UI for text-to-image with configurable dimensions, inference steps, and a gallery of past generations.

Playground

Try the API without signing up. Interactive testing with live code examples in Python and JavaScript.

Batch jobs

Submit batches of inputs via the API or web UI. Monitor progress in real time with a visual dashboard.

Cargo Marketplace

Browse 133+ Cargos in the registry. Reviews, ratings, and publisher tiers help you find trusted models.

Developer portal

Build and publish your own Cargos. Track usage, manage submissions, and earn revenue per execution.

08 — Getting Started

Three steps to your first inference

Create an account

Email or GitHub. You get free beta credits immediately — no credit card needed.

Call the API

Use the web chat UI, point your OpenAI client at our base URL, or use the Python/JS SDK.

Get results

The coordinator finds the nearest Island. Responses stream back token-by-token in real time.

The beta is live

Create an account and run your first inference job. Free credits, no credit card.