Hey, I'm Chris
Exploring AI agents, on-device inference, and the systems that make intelligent software reliable.
Projects
ls -la ./projects
Writing
ls -la ./writing
A 3B on-device model with a 4K context window produced a 2,336-word grounded research report using live web search. Here's the architecture that made it work.
Built EdgeRunner in pure Swift and Metal from scratch over a weekend. On Qwen3-0.6B Q8_0, it hits 212 tok/s on M3 Max, beating llama.cpp by 16%. Here's what worked.
A throughput research postmortem from Espresso: what we tested, what failed, what stayed slower than baseline, and where the real architectural blockers are.
We built ContextCore to manage conversation context on Apple Silicon without relying on the cloud or draining the battery. Four-tier memory, Metal shaders, 63M chunks/sec, sub-5ms p99.
If you've shipped AI features in an iOS app recently, you know the drill. Each provider has its own SDK. Switching means rewriting everything. Conduit fixes this.
We fully documented EdgeRunner, a Metal LLM inference engine for Apple Silicon. Here's what we found that was actually interesting.
How local telemetry transforms AI reliability, privacy, and performance for applications running on edge devices.
Colony is a Swift framework that orchestrates AI agent loops on iOS and macOS using Apple's Foundation Models. Here's why it exists and how it works.
After building a multi-agent framework in Swift, I have thoughts on why it might actually be a better fit than Python or TypeScript for this kind of work.
How chain-of-thought prompting and simple techniques transformed large language models from pattern-matchers into genuine problem-solvers.
Exploring the wild probability estimates from AI researchers, the behavioral and architectural evidence of machine consciousness, and why this might be the biggest moral challenge of our era.
Why the serious agent stacks are converging on files as the primitive for machine behavior, and what that means for the future of AI systems.