If you’ve shipped AI features in an iOS app recently, you know the drill. OpenAI SDK here. Anthropic SDK there. Maybe you tried Ollama for local inference. Each one has its own API, its own message format, its own streaming implementation. Switching from GPT-4 to Claude felt like rewriting the networking layer.
I got tired of this. So I built Conduit.
The core idea
One protocol, nine backends. When you build on TextGenerator, swapping providers means changing one initializer:
// OpenAI
let provider = OpenAIProvider.openAIKey(apiKey: "sk-...")
// Anthropic — just change the initializer
let provider = AnthropicProvider.anthropicKey(apiKey: "sk-ant-...")
// Local MLX (Apple Silicon)
let provider = try await MLXProvider.configuration(.stable)
// Same API everywhere
let result = try await provider.generate(prompt: "Hello", model: .gpt4o, config: .default)
Your prompt pipelines, message handling, streaming logic — all of it stays the same. The provider is an implementation detail.
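If the pattern is unfamiliar, here is a minimal sketch of the idea in plain Swift. The names (TextBackend, CloudBackend, LocalBackend, summarize) are made up for illustration and are not Conduit's actual types. The backends return canned strings instead of calling a network or a model.

```swift
// Hypothetical protocol standing in for a provider abstraction.
protocol TextBackend {
    func generate(prompt: String) async throws -> String
}

// Stand-in for a cloud provider; returns a canned string instead of calling an API.
struct CloudBackend: TextBackend {
    let apiKey: String
    func generate(prompt: String) async throws -> String {
        "cloud: \(prompt)"
    }
}

// Stand-in for an on-device provider.
struct LocalBackend: TextBackend {
    func generate(prompt: String) async throws -> String {
        "local: \(prompt)"
    }
}

// Application code depends only on the protocol, so the backend is just a parameter.
func summarize(_ text: String, using backend: some TextBackend) async throws -> String {
    try await backend.generate(prompt: "Summarize: \(text)")
}
```

Calling summarize with CloudBackend or LocalBackend runs the same pipeline. Only the value passed in changes.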
Tool calling that doesn’t make you want to quit
Tool calling is where things get ugly. OpenAI uses function calling. Anthropic uses tool_use. Each has different JSON formats, different error handling. You end up writing adapters just to get a weather lookup working.
Conduit’s ToolExecutor handles the entire loop:
let session = ChatSession(provider: provider, model: .gpt4o, config: .default)
session.toolExecutor = ToolExecutor(tools: [WeatherTool(), SearchTool()])
session.toolCallRetryPolicy = .retryableAIErrors(maxAttempts: 3)
// Just send. Tools execute automatically.
let response = try await session.send("What's the weather in Tokyo?")
No manual result parsing. No format conversion. Just async/await.
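To make "the entire loop" concrete, here is a rough sketch of what a tool-calling loop does, written against made-up types. ModelTurn, Tool, FakeWeatherTool, runToolLoop, and scriptedModel are all hypothetical, and the model is a scripted stand-in. This is not how ToolExecutor is implemented, only the general shape of the problem it solves.

```swift
// What a model can hand back on each turn in this sketch.
enum ModelTurn {
    case toolCall(name: String, argument: String)
    case answer(String)
}

protocol Tool {
    var name: String { get }
    func call(_ argument: String) async throws -> String
}

// Canned tool: returns a fixed forecast for whatever city it is given.
struct FakeWeatherTool: Tool {
    let name = "weather"
    func call(_ argument: String) async throws -> String {
        "18°C and cloudy in \(argument)"
    }
}

struct UnknownTool: Error { let name: String }
struct TooManyRounds: Error {}

// Runs the model, executes any requested tool, and feeds the result back
// until the model answers or the round limit is reached.
func runToolLoop(
    prompt: String,
    tools: [any Tool],
    maxRounds: Int = 4,
    model: ([String]) async throws -> ModelTurn
) async throws -> String {
    var transcript = ["user: \(prompt)"]
    for _ in 0..<maxRounds {
        switch try await model(transcript) {
        case .answer(let text):
            return text
        case .toolCall(let name, let argument):
            guard let tool = tools.first(where: { $0.name == name }) else {
                throw UnknownTool(name: name)
            }
            let output = try await tool.call(argument)
            transcript.append("tool(\(name)): \(output)")
        }
    }
    throw TooManyRounds()
}

// Scripted stand-in for a model: requests the tool once, then answers from its output.
func scriptedModel(_ transcript: [String]) async throws -> ModelTurn {
    let prefix = "tool(weather): "
    if let toolLine = transcript.last(where: { $0.hasPrefix(prefix) }) {
        return .answer("It is \(toolLine.dropFirst(prefix.count))")
    }
    return .toolCall(name: "weather", argument: "Tokyo")
}
```

The round limit matters: without it, a model that keeps requesting tools would loop forever.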
Structured output without the Codable headache
JSON schema validation with LLMs usually means lots of manual Codable. Custom decoding logic. Error handling everywhere. Conduit’s @Generable macro handles the boilerplate:
@Generable
struct WeatherResult {
    @Guide(description: "City name")
    var city: String
    @Guide(description: "Temperature in Celsius")
    var temperature: Double
    @Guide(description: "Weather condition")
    var condition: String
}
// Auto-synthesized: init, schema, partial JSON handling
let result = try await provider.generate(
    prompt: "Return weather for Tokyo as JSON",
    config: .default.responseFormat(.jsonObject)
)
let weather = try WeatherResult(result.generatedContent)
One attribute, ~150 lines of boilerplate synthesized. Initializers, schema generation, partial JSON handling for streaming.
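For comparison, here is roughly what the hand-written route looks like using only Foundation. ManualWeatherResult, manualWeatherSchema, and decodeWeather are hypothetical names. This is not the code the macro generates, just a sketch of the kind of thing you would otherwise maintain yourself.

```swift
import Foundation

// Hand-written model type; every field must be mirrored in the schema below.
struct ManualWeatherResult: Decodable, Equatable {
    let city: String
    let temperature: Double
    let condition: String
}

// Hand-written JSON Schema that has to be kept in sync with the struct by hand.
func manualWeatherSchema() -> [String: Any] {
    [
        "type": "object",
        "properties": [
            "city": ["type": "string", "description": "City name"],
            "temperature": ["type": "number", "description": "Temperature in Celsius"],
            "condition": ["type": "string", "description": "Weather condition"]
        ],
        "required": ["city", "temperature", "condition"]
    ]
}

// Decodes complete model output; throws if a field is missing or has the wrong type.
func decodeWeather(from modelOutput: String) throws -> ManualWeatherResult {
    try JSONDecoder().decode(ManualWeatherResult.self, from: Data(modelOutput.utf8))
}
```

Note that this version only handles complete JSON. Anything truncated mid-stream throws, which is the gap the partial JSON handling is meant to cover.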
Streaming JSON that doesn’t break
LLM streaming is messy. Chunks arrive at arbitrary boundaries. JSON objects get split across them. You spend more time debugging partial JSON than building features.
Conduit’s GeneratedContent handles incomplete JSON gracefully:
// If model outputs {"title": "A story of" (truncated)
// Conduit tries closing with }, then "", falls back to string
let content = try GeneratedContent(json: partialJson)
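One common way to recover truncated JSON is to track what is still open and close it. The sketch below does that with Foundation only. Recovered and recoverPartialJSON are hypothetical names, and the strategy is my own illustration, not necessarily the one GeneratedContent uses.

```swift
import Foundation

// Result of a recovery attempt: parsed JSON, or the raw text as a fallback.
enum Recovered {
    case json(Any)
    case text(String)
}

func recoverPartialJSON(_ partial: String) -> Recovered {
    var closers: [Character] = []   // closing brackets still owed, innermost last
    var inString = false
    var escaped = false

    for scalar in partial.unicodeScalars {
        if inString {
            if escaped { escaped = false }
            else if scalar == "\\" { escaped = true }
            else if scalar == "\"" { inString = false }
        } else {
            switch scalar {
            case "\"": inString = true
            case "{": closers.append("}")
            case "[": closers.append("]")
            case "}", "]": _ = closers.popLast()
            default: break
            }
        }
    }

    var candidate = partial
    if inString {
        if escaped { candidate.removeLast() }   // drop a dangling backslash
        candidate.append("\"")                  // close the open string
    }
    candidate.append(contentsOf: closers.reversed())

    // If the repaired text still does not parse, fall back to the raw string.
    if let object = try? JSONSerialization.jsonObject(with: Data(candidate.utf8)) {
        return .json(object)
    }
    return .text(partial)
}
```

With this sketch, a buffer cut off inside a string value parses after repair, while one cut off right after a key and colon does not and comes back as plain text.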
Combined with StreamingResult<T>, you get typed snapshots as data arrives:
for try await snapshot in stream {
    // snapshot.content is WeatherResult.PartiallyGenerated
    // All fields Optional — nil until field arrives
    print(snapshot.content.city)
}
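Here is a small sketch of how typed snapshots can be produced from raw chunks, using AsyncStream and JSONDecoder. PartialWeather, snapshot(of:), and snapshots(from:) are hypothetical names, and the naive "try a few closing suffixes" repair is an assumption for illustration, not Conduit's StreamingResult implementation.

```swift
import Foundation

// Every field is optional so a half-finished object can still decode.
struct PartialWeather: Decodable, Equatable {
    var city: String?
    var temperature: Double?
    var condition: String?
}

// Tries the buffer as-is, then with a closing brace, then with a closing quote and brace.
func snapshot(of buffer: String) -> PartialWeather? {
    let decoder = JSONDecoder()
    for suffix in ["", "}", "\"}"] {
        let data = Data((buffer + suffix).utf8)
        if let value = try? decoder.decode(PartialWeather.self, from: data) {
            return value
        }
    }
    return nil
}

// Accumulates chunks and yields a new snapshot whenever the decoded value changes.
func snapshots(from chunks: AsyncStream<String>) -> AsyncStream<PartialWeather> {
    AsyncStream { continuation in
        let task = Task {
            var buffer = ""
            var last: PartialWeather?
            for await chunk in chunks {
                buffer += chunk
                if let next = snapshot(of: buffer), next != last {
                    last = next
                    continuation.yield(next)
                }
            }
            continuation.finish()
        }
        continuation.onTermination = { _ in task.cancel() }
    }
}
```

One caveat with this naive approach: a snapshot can contain a value that is still being written, such as a city cut off after three letters. A real implementation has to decide whether to surface or suppress those.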
Local inference on iOS
MLX on Apple Silicon makes this practical now. Conduit makes it first-class:
// Let the model tell you what it supports
let capabilities = try await provider.getModelCapabilities()
// KV quantization, attention sinks, speculative scheduling
config.runtimeFeatures.kvQuantization.enabled = true
// Warmup for JIT compilation
try await provider.warmUp(model: .llama3_2_1b, prefillText: "Hi", maxTokens: 5)
No API keys for simple tasks. No network round trips for basic prompts. No sending user data anywhere.
SwiftUI integration that actually works
ChatSession is @Observable. Direct SwiftUI binding without wrappers:
import SwiftUI

@Observable
class MyViewModel {
    var session: ChatSession<OpenAIProvider>

    // The session has no default value, so the caller injects it.
    init(session: ChatSession<OpenAIProvider>) {
        self.session = session
    }
}

struct ChatView: View {
    @State var viewModel: MyViewModel

    var body: some View {
        VStack {
            ForEach(viewModel.session.messages) { message in
                MessageRow(message)
            }
            if viewModel.session.isGenerating {
                ProgressView()
            }
        }
    }
}
Thread safety via NSLock — the lock is never held across await points.
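The pattern is easy to get wrong, so here is a minimal sketch of it. LockedSession and its members are hypothetical names. It shows the general technique of short critical sections around state, with every await happening while the lock is released. It is not ChatSession's actual source.

```swift
import Foundation

// Hypothetical class; @unchecked Sendable because the lock guards all mutable state.
final class LockedSession: @unchecked Sendable {
    private let lock = NSLock()
    private var storedMessages: [String] = []
    private var generating = false

    var messages: [String] { lock.withLock { storedMessages } }
    var isGenerating: Bool { lock.withLock { generating } }

    func send(_ text: String, reply: @Sendable (String) async -> String) async -> String {
        // Short critical section: mutate state, then release before suspending.
        lock.withLock {
            storedMessages.append(text)
            generating = true
        }
        // The await runs with the lock released, so readers are never blocked
        // for the duration of a network call or a generation.
        let answer = await reply(text)
        lock.withLock {
            storedMessages.append(answer)
            generating = false
        }
        return answer
    }
}
```

Because withLock takes a synchronous closure, the compiler will not let you put an await inside it, which is what keeps the rule enforceable.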
Some numbers
| Task | Without Conduit | With Conduit |
|---|---|---|
| Switch providers | 2-3 days | 5 minutes |
| Tool calling code | 500+ lines | 50 lines |
| Structured output | Custom Codable | @Generable |
| Streaming JSON | Roll your own | Built-in |
| Local inference | Proprietary | First-class |
What makes this different
Most “unified” SDKs are thin wrappers. Conduit goes deeper:
- Partial JSON recovery isn’t a feature, it’s a design decision
- Runtime feature gating in MLXProvider lets each model expose its capabilities
- Layered VLM detection cascades through metadata, config, and name heuristics
- Actors everywhere — providers, tool executor, model registry
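As an illustration of what a layered detection cascade can look like, here is a small sketch. ModelInfo, VisionEvidence, and detectVisionSupport are hypothetical, and the specific signals (an "image-text-to-text" tag, a "vision_config" key, name fragments like "-vl") are assumptions chosen for the example, not Conduit's actual rules.

```swift
import Foundation

// Hypothetical description of a model, with three places to look for hints.
struct ModelInfo {
    var name: String
    var metadataTags: [String] = []
    var config: [String: String] = [:]
}

// Which layer of the cascade produced the answer.
enum VisionEvidence {
    case metadata, config, nameHeuristic
}

// Checks the most reliable signal first and falls through to weaker ones.
func detectVisionSupport(in model: ModelInfo) -> VisionEvidence? {
    // 1. Explicit metadata is the strongest signal (assumed tag name).
    if model.metadataTags.contains("image-text-to-text") { return .metadata }
    // 2. A vision section in the config is the next best (assumed key name).
    if model.config["vision_config"] != nil { return .config }
    // 3. Last resort: guess from the model name.
    let lowered = model.name.lowercased()
    let hints = ["vision", "llava", "-vl"]
    if hints.contains(where: { lowered.contains($0) }) { return .nameHeuristic }
    return nil
}
```

Returning which layer matched, rather than a bare Bool, makes it possible to log or distrust the weaker name-based guesses.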
The point
You’re not locked into one provider. You won’t have to rewrite when Claude 4 drops. You don’t need separate code paths for local vs cloud.
Treat AI as infrastructure. Conduit handles the plumbing.
Swift 6.2. iOS 17+. Apple Silicon optimized. Open source.
Get started:
# Cloud-only
swift build
# With OpenAI + Anthropic
swift build --traits OpenAI,Anthropic
# MLX for Apple Silicon
swift build --traits MLX