
Colony: A Native AI Agent Runtime for Apple Platforms

Colony is a Swift framework that orchestrates AI agent loops on iOS and macOS using Apple's Foundation Models. Here's why it exists and how it works.

What is Colony?

Colony is a Swift framework for running AI agent loops on iOS 26+ and macOS 26+. It wraps Apple’s Foundation Models, provides a virtual filesystem and shell execution, manages memory, and enforces tool approval policies — all through a clean, composable API.

Think of it as LangChain, but purpose-built for Apple’s stack. No Python. No CoreML wrappers. Just native Swift from the ground up.

import Colony

// Create a configured runtime
let runtime = try Colony.start(modelName: "llama3.2", profile: .device)

// Send a message and get the result
let handle = await runtime.sendUserMessage("List the files in my project")
let outcome = try await handle.outcome.value

The Architecture

Colony has a deliberate two-module split:

ColonyCore — Pure value types, protocols, and policies. Zero runtime logic. This is where you find:

  • ColonyCapabilities — feature flags like .planning, .filesystem, .shell
  • ColonyConfiguration — all agent settings
  • ColonyToolApprovalPolicy — controls when tools need human approval
  • Backend protocols (ColonyFileSystem.Service, ColonyShellService, ColonyMemoryService)

Colony — Runtime orchestration built on HiveCore. This is where:

  • The agent graph is compiled (ColonyAgent)
  • The runtime loop executes (ColonyRuntime)
  • Foundation Models are called (ColonyFoundationModelsClient)
  • Model routing happens (ColonyModelRouter)

Colony @_exported imports ColonyCore, so downstream consumers just import Colony and get everything.

┌─────────────────────────────────────────────────────┐
│                    Your App                          │
│                                                      │
│  import Colony                                       │
│  let runtime = try Colony.start(modelName: "...")   │
└─────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────┐
│                    Colony                            │
│  ├── ColonyRuntime — wraps HiveRuntime               │
│  ├── ColonyAgent — compiled agent graph               │
│  ├── ColonyFoundationModelsClient — FMiLM bridge      │
│  └── ColonyModelRouter — routing strategies          │
└─────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────┐
│                   HiveCore                           │
│  ├── Runtime loop management                         │
│  ├── Channel-based state (messages, tool calls)      │
│  └── Interrupt/resume handling                       │
└─────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────┐
│                  ColonyCore                          │
│  ├── Capabilities & policies                        │
│  ├── Backend protocols (FS, Shell, Memory)           │
│  └── Tool definitions                                │
└─────────────────────────────────────────────────────┘

The Runtime Loop

The agent graph implements a cyclic state machine that runs until the model produces a final answer or an interrupt fires:

preModel → model → routeAfterModel → tools → toolExecute → preModel

1. preModel — Preprocesses input, runs compaction if token budget is exceeded

2. model — Calls the LLM with the current conversation state

3. routeAfterModel — Decides what to do with the model’s output:

  • If it has tool calls → go to tools
  • If it’s a final answer → finish

4. tools — Validates tool schema and availability

5. toolExecute — Runs the tool, streams results back to the model

The loop continues until the model returns a final answer or an interrupt, such as a tool approval request, pauses the run.
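The cycle can be pictured as a small state machine. The sketch below is only an illustration of that idea: Node, ModelOutput, and runLoop are hypothetical names rather than Colony or HiveCore types, the model is a stub closure, and compaction and interrupts are left out.

```swift
// Illustrative sketch only: these types are hypothetical, not Colony's API.

/// The five nodes of the cycle, plus a terminal state.
enum Node { case preModel, model, routeAfterModel, tools, toolExecute, finished }

/// What a (stubbed) model call can return.
enum ModelOutput {
    case toolCall(name: String)
    case finalAnswer(String)
}

/// Drives the cycle until the stub model returns a final answer.
/// `model` receives the transcript so far; `runTool` executes a named tool.
func runLoop(
    model: ([String]) -> ModelOutput,
    runTool: (String) -> String,
    knownTools: Set<String>
) -> (answer: String?, visited: [Node]) {
    var transcript: [String] = []
    var visited: [Node] = []
    var node = Node.preModel
    var pending = ModelOutput.finalAnswer("")   // overwritten by the first model call
    var answer: String? = nil

    while node != .finished {
        visited.append(node)
        switch node {
        case .preModel:
            // A real runtime would compact the transcript here if over budget.
            node = .model
        case .model:
            pending = model(transcript)
            node = .routeAfterModel
        case .routeAfterModel:
            switch pending {
            case .toolCall:
                node = .tools
            case .finalAnswer(let text):
                answer = text
                node = .finished
            }
        case .tools:
            // Validate that the requested tool exists before executing it.
            if case .toolCall(let name) = pending, knownTools.contains(name) {
                node = .toolExecute
            } else {
                transcript.append("error: unknown tool")
                node = .preModel
            }
        case .toolExecute:
            if case .toolCall(let name) = pending {
                transcript.append(runTool(name))
            }
            node = .preModel
        case .finished:
            break
        }
    }
    return (answer, visited)
}

// Usage: the stub asks for `ls` once, then answers.
let result = runLoop(
    model: { transcript in
        transcript.isEmpty ? .toolCall(name: "ls") : .finalAnswer("done")
    },
    runTool: { name in "output of \(name)" },
    knownTools: ["ls"]
)
print(result.answer ?? "no answer")   // prints "done"
```

With this stub the loop visits all five nodes once, returns to preModel, and finishes on the second pass through routeAfterModel.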

Capability-Gated Tools

Colony only injects tools into the prompt when their corresponding capability is enabled AND a backend is wired. This is the capability matrix:

Capability     Tools                                              Backend Protocol
.planning      write_todos, read_todos                            (built-in)
.filesystem    ls, read_file, write_file, edit_file, glob, grep   ColonyFileSystem.Service
.shell         execute                                            ColonyShellService
.scratchbook   scratch_read/add/update/complete                   (filesystem-backed)
.subagents     task                                               ColonySubagentRegistry

This means you can ship an agent with just filesystem access, or give it full shell and subagent capabilities — the same API, different configuration.
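To make the gating rule concrete, here is a minimal sketch of "capability enabled AND backend wired". It assumes the capabilities behave like an OptionSet; Capabilities, Backends, and injectedTools are hypothetical stand-ins, not Colony's real declarations, and only three rows of the matrix are modeled.

```swift
// Illustrative sketch only: hypothetical types, not Colony's real declarations.

/// Feature flags modeled as an OptionSet so they can be combined.
struct Capabilities: OptionSet {
    let rawValue: Int
    static let planning   = Capabilities(rawValue: 1 << 0)
    static let filesystem = Capabilities(rawValue: 1 << 1)
    static let shell      = Capabilities(rawValue: 1 << 2)
}

/// Which backends the host app has actually wired up.
struct Backends {
    var hasFileSystem = false
    var hasShell = false
}

/// Returns the tool names to inject: a tool is exposed only when its
/// capability is enabled AND its backend (if it needs one) is present.
func injectedTools(capabilities: Capabilities, backends: Backends) -> [String] {
    var tools: [String] = []
    if capabilities.contains(.planning) {
        tools += ["write_todos", "read_todos"]   // built-in, no backend needed
    }
    if capabilities.contains(.filesystem), backends.hasFileSystem {
        tools += ["ls", "read_file", "write_file", "edit_file", "glob", "grep"]
    }
    if capabilities.contains(.shell), backends.hasShell {
        tools += ["execute"]
    }
    return tools
}

// Shell is enabled as a capability, but no shell backend is wired,
// so `execute` never reaches the prompt.
let tools = injectedTools(
    capabilities: [.planning, .filesystem, .shell],
    backends: Backends(hasFileSystem: true, hasShell: false)
)
print(tools)
```

The point of checking both conditions is that the model is never offered a tool the host app cannot actually execute.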

Profiles: Device vs Cloud

ColonyBuilder gives you two presets out of the box:

// .device — strict ~4k token budget, on-device inference
// Compaction at 2,600 tokens
// Tool result eviction at 700 tokens
// Scratchbook enabled
ColonyBuilder().profile(.device)

// .cloud — generous limits, no scratchbook
// Compaction at 12k, eviction at 20k
// No tool approval required
ColonyBuilder().profile(.cloud)

The .device profile is tuned for on-device Foundation Models, where the context window is precious. The .cloud profile assumes you have room to think.
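The thresholds quoted in the comments above can be read as simple budget checks. The sketch below encodes them in a hypothetical Budget type; the type, its member names, the strict "greater than" comparison, and reading 12k and 20k as 12,000 and 20,000 tokens are all assumptions made for illustration, not Colony's implementation.

```swift
// Illustrative sketch only: `Budget` and its members are hypothetical.
struct Budget {
    let compactionThreshold: Int       // compact the conversation above this many tokens
    let toolResultEvictionLimit: Int   // evict a tool result larger than this many tokens
    let scratchbookEnabled: Bool

    // Numbers taken from the profile descriptions above.
    static let device = Budget(
        compactionThreshold: 2_600,
        toolResultEvictionLimit: 700,
        scratchbookEnabled: true
    )
    static let cloud = Budget(
        compactionThreshold: 12_000,
        toolResultEvictionLimit: 20_000,
        scratchbookEnabled: false
    )

    /// Assumption: compaction triggers strictly above the threshold.
    func shouldCompact(conversationTokens: Int) -> Bool {
        conversationTokens > compactionThreshold
    }

    /// Assumption: a tool result is evicted strictly above the limit.
    func shouldEvict(toolResultTokens: Int) -> Bool {
        toolResultTokens > toolResultEvictionLimit
    }
}

// A 3,000-token conversation is over budget on device but not in the cloud.
let conversationTokens = 3_000
print(Budget.device.shouldCompact(conversationTokens: conversationTokens))  // prints "true"
print(Budget.cloud.shouldCompact(conversationTokens: conversationTokens))   // prints "false"
```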

Tool Approval: Human-in-the-Loop

Before any tool executes, Colony can pause for human approval:

// All tools require approval
ColonyBuilder()
    .configuration(.init(toolApprovalPolicy: .always))

// Only specific tools auto-approve
ColonyBuilder()
    .configuration(.init(toolApprovalPolicy: .allowList(["ls", "read_file"])))

// No approval needed
ColonyBuilder()
    .configuration(.init(toolApprovalPolicy: .never))

When .always is set and a tool is called, the runtime emits an .interrupted outcome with .toolApprovalRequired. You resume with runtime.resumeToolApproval(interruptID: ..., decision: ...).
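The three policies reduce to one question per tool call: does a human need to decide first? The following sketch models that decision with hypothetical ApprovalPolicy, Decision, and StepOutcome types. It mirrors the policy names used above but is not Colony's ColonyToolApprovalPolicy or its interrupt API.

```swift
// Illustrative sketch only: hypothetical names modeled on the policies above.
enum ApprovalPolicy {
    case always
    case never
    case allowList(Set<String>)

    /// True when the runtime would have to pause and ask a human.
    func requiresApproval(tool: String) -> Bool {
        switch self {
        case .always:
            return true
        case .never:
            return false
        case .allowList(let autoApproved):
            return !autoApproved.contains(tool)
        }
    }
}

enum Decision { case approved, rejected }

enum StepOutcome: Equatable {
    case executed(String)
    case interrupted(tool: String)
    case skipped(String)
}

/// One tool step: with no decision yet, a gated tool yields an interrupt;
/// calling again with a decision plays the role of "resume".
func step(tool: String, policy: ApprovalPolicy, decision: Decision? = nil) -> StepOutcome {
    guard policy.requiresApproval(tool: tool) else { return .executed(tool) }
    switch decision {
    case .none:
        return .interrupted(tool: tool)
    case .some(.approved):
        return .executed(tool)
    case .some(.rejected):
        return .skipped(tool)
    }
}

let policy = ApprovalPolicy.allowList(["ls", "read_file"])
print(step(tool: "ls", policy: policy))                                  // runs without approval
print(step(tool: "write_file", policy: policy))                          // pauses for approval
print(step(tool: "write_file", policy: policy, decision: .approved))     // runs after approval
```

Modeling the pause as a returned value rather than a blocking call is one way to let the host app show its own approval UI and resume later.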

On-Device Inference

Colony uses Apple’s Foundation Models through ColonyFoundationModelsClient. This is the bridge between HiveCore’s model protocol and Apple’s Foundation Models framework:

// Public interface only; method bodies are omitted here.
public struct ColonyFoundationModelsClient {
    public init(modelName: String, configuration: Configuration)
    public func complete(prompt: String, stream: Bool) async throws -> ModelResult
}

The client handles streaming, quantization, and IOSurface sharing for you. For routing between models (local vs cloud), there’s ColonyModelRouter with strategies like .onDevice, .prioritized, and .costOptimized.

Typed IDs

Colony uses phantom types to prevent mixing ID contexts:

public struct ColonyID<Domain> {
    public let rawValue: String
}

public typealias ColonyThreadID = ColonyID<ColonyIDDomain.Thread>
public typealias ColonyRunID = ColonyID<ColonyIDDomain.Run>
public typealias ColonyInterruptID = ColonyID<ColonyIDDomain.Interrupt>

You can’t accidentally pass a ColonyThreadID where a ColonyRunID is expected — the compiler enforces this.
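Here is a self-contained sketch of the same phantom-type pattern. The names are shortened and hypothetical (TypedID, IDDomain, cancel) so the block compiles on its own; the empty enums used as domains are an assumption about how such markers are typically declared, since the declaration of ColonyIDDomain is not shown above.

```swift
// Illustrative sketch: mirrors the pattern above with hypothetical names.

/// Empty enums serve as compile-time tags; they have no cases and no values.
enum IDDomain {
    enum Thread {}
    enum Run {}
}

/// `Domain` appears only in the type, never in stored data (a phantom type).
struct TypedID<Domain>: Hashable {
    let rawValue: String
    init(_ rawValue: String) { self.rawValue = rawValue }
}

typealias ThreadID = TypedID<IDDomain.Thread>
typealias RunID = TypedID<IDDomain.Run>

/// Accepts only run IDs, even though both ID kinds wrap a String.
func cancel(run: RunID) -> String {
    "cancelled \(run.rawValue)"
}

let thread = ThreadID("t-1")
let run = RunID("r-1")
print(cancel(run: run))   // prints "cancelled r-1"

// Uncommenting the next line fails to compile, because ThreadID and RunID
// are distinct types despite sharing the same storage:
// print(cancel(run: thread))
```

The tag costs nothing at runtime: both aliases store a single String, and the distinction exists only in the type checker.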

Why Native Swift Matters

Most agent frameworks are Python-first with optional API wrappers. Colony is Swift-first, which means:

  1. No FFI overhead — Direct calls to Foundation Models, no Python bridge
  2. Memory safety — Swift’s ownership system, no reference counting surprises
  3. Concurrency — Swift’s structured concurrency, actors for state
  4. Apple platform integration — Native IOSurface sharing, on-device GPU access

If you’re building AI features into an iOS app or macOS tool, Colony lets you do that without leaving Swift.

Getting Started

git clone https://github.com/christopherkarani/Colony
cd Colony
swift build
swift test

The framework targets iOS 26+ and macOS 26+. You’ll need Xcode 26+ to build it.

There’s also an example CLI at Sources/ColonyResearchAssistantExample/ that demonstrates interactive REPL mode with human-in-the-loop tool approval.

What’s Next

Colony is production-ready for early adopters. The API surface is documented (95 public types, zero undocumented), 250 tests pass, and the module split makes it easy to reason about.

We’re working on:

  • Better streaming support across the model interface
  • More built-in tool definitions
  • macOS app harness with real-time agent observation

If you’re doing on-device AI agent work on Apple platforms, reach out — we want to hear what you’re building.