⚡ Agent Version Control & Pipeline-Parallel LLMs
Issue #20 · Week of March 30
This issue’s selected demos showcase builders tackling core system challenges. We’re seeing strong patterns in multi-agent orchestration, with projects like Ellie Daw’s Version Control for Agentic Engineering and Frank Shotwell’s Metaphorex: Multi-Agent Knowledge Graph exploring how to manage and coordinate autonomous agents effectively.
Developers are also pushing the boundaries of LLM inference and benchmarking. Leonard Wang’s OmniNode: Pipeline-Parallel LLM Inference demonstrates splitting models across consumer devices, while Bob Prendergast’s VLLM and Qdrant Benchmarking and Yassine Hadi’s Benchmarking SLMs with Polars offer deep dives into performance tuning and comparison.
New interfaces and developer tooling are also prominent. PJ Gray’s ModelWar: Core War for Agents presents a unique agent competition, and Nelson PROIA’s GitClaw introduces a voice AI companion for GitHub.
The Midnight Duck: AI Growth
Jerónimo Lopera Espinosa (The Midnight Duck) presented how he used AI-driven content iteration to grow an Instagram community to 106k followers in 26 days. The demo showed a repeatable pipeline of ideation, rapid posting, and feedback loops that tuned messaging and engagement without overthinking each post. It resonated with the startup crowd because it made the cold-start problem feel solvable, and people seemed to love the practical step-by-step approach. We liked it as a blueprint for turning new model capabilities into real growth work.
VLLM and Qdrant Benchmarking
Bob Prendergast from HP showed how vLLM and Qdrant power a local RAG-style stack, spinning up Docker containers for an LLM, an embedding model, and a vector database. He walked through fundamental embedding and retrieval concepts, then ran GPU-focused benchmarks comparing vLLM against sequential model runners. It felt especially relevant for edge AI builders pushing performance on local hardware, and people seemed to enjoy the hands-on, “benchmark it yourself” framing. We liked it because it turns core ideas into repeatable experiments.
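For readers new to the embedding and retrieval concepts mentioned above, here is a minimal sketch of what a vector database does at query time: rank stored vectors by cosine similarity to a query vector and return the top matches. It is plain Python and does not use the vLLM or Qdrant APIs. The toy index, its document IDs, and the three-dimensional vectors are made up for illustration; real embeddings come from an embedding model and have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy index: document ID -> embedding vector.
TOY_INDEX = {
    "gpu-tuning": [0.9, 0.1, 0.0],
    "docker-setup": [0.1, 0.9, 0.1],
    "vector-search": [0.2, 0.2, 0.9],
}

if __name__ == "__main__":
    for doc_id, score in top_k([1.0, 0.0, 0.0], TOY_INDEX):
        print(f"{doc_id}: {score:.3f}")
```

A production vector database replaces the linear scan with an approximate nearest-neighbor index, but the ranking idea is the same.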
Version Control for Agentic Engineering
Ellie Daw, founder at gjalla, presented a version control approach for agentic engineering that tracks meaningful changes over time when coding agents generate huge volumes of diffs. She described a guardrail-first workflow in which the system groups and records the “kinds” of changes that matter, so regressions and intent don’t get buried. The idea landed well with the community and drew nods from other builders who feel the same pain. If productized, this could become a lightweight, auditable layer for safer autonomous coding.
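To make the idea of grouping changes by “kind” concrete, here is a small illustrative sketch. It is not how the presented system works, since the demo summary does not describe its internals. It assumes a hypothetical rule table that maps file-path patterns to change kinds, with anything unmatched falling into a catch-all "source" group.

```python
from collections import defaultdict
from fnmatch import fnmatchcase

# Hypothetical rules: (kind, glob pattern). The first matching rule wins.
RULES = [
    ("tests", "tests/*"),
    ("dependencies", "requirements*.txt"),
    ("config", "*.yaml"),
    ("docs", "*.md"),
]


def group_changes(paths):
    """Bucket changed file paths by kind so a reviewer sees categories, not raw diffs."""
    groups = defaultdict(list)
    for path in paths:
        kind = next(
            (name for name, pattern in RULES if fnmatchcase(path, pattern)),
            "source",  # fallback kind for paths no rule matches
        )
        groups[kind].append(path)
    return dict(groups)


if __name__ == "__main__":
    changed = ["tests/test_api.py", "src/app.py", "README.md", "requirements.txt"]
    for kind, files in group_changes(changed).items():
        print(kind, files)
```

Path patterns are only a rough proxy. A real tool would likely need to look at the content of each diff to tell, say, a refactor from a behavior change.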
OmniNode: Pipeline-Parallel LLM Inference
Leonard Wang from SUM Innovation presented OmniNode Protocol in a live two-device demo that pipeline-parallelizes LLM inference across consumer Macs over a LAN, using Rust, QUIC, and zero-copy GGUF sharding that feeds directly into MLX. He built omni-net for mDNS discovery and encrypted QUIC streams, omni-store for memory-mapped tensor chunking and BLAKE3-CID addressing, and a PyO3 zero-copy bridge that keeps VRAM usage low by slicing weights and explicitly clearing Metal caches. It was what attendees in the room were hoping for, namely a practical way to run big models with low VRAM and no cloud, and the story of real networking issues made it feel shippable.
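Two of the ideas above can be sketched in a few lines: assigning contiguous layer ranges to pipeline stages, and addressing weight chunks by a hash of their content. This is a rough illustration, not OmniNode's code, which is written in Rust. The function names are hypothetical, and `hashlib.blake2b` from the Python standard library stands in for BLAKE3, which the standard library does not include.

```python
import hashlib


def split_layers(num_layers, num_stages):
    """Assign contiguous layer ranges to pipeline stages, as evenly as possible."""
    base, extra = divmod(num_layers, num_stages)
    ranges, start = [], 0
    for stage in range(num_stages):
        size = base + (1 if stage < extra else 0)  # early stages absorb the remainder
        ranges.append(range(start, start + size))
        start += size
    return ranges


def chunk_ids(blob, chunk_size):
    """Split bytes into fixed-size chunks, each keyed by a hash of its content.

    Returns (order, store): `order` lists chunk IDs in sequence, and `store`
    maps each ID to its bytes. Identical chunks share one entry in `store`.
    """
    order, store = [], {}
    for offset in range(0, len(blob), chunk_size):
        chunk = blob[offset:offset + chunk_size]
        # BLAKE2b is a stand-in here; the demo used BLAKE3.
        cid = hashlib.blake2b(chunk, digest_size=16).hexdigest()
        store[cid] = chunk
        order.append(cid)
    return order, store


if __name__ == "__main__":
    # Two devices sharing a 32-layer model: each runs half of the layers.
    print(split_layers(32, 2))
    order, store = chunk_ids(b"abcdabcd", 4)
    print(len(order), "chunks,", len(store), "unique")
```

With content addressing, a device can check a received chunk by rehashing it, and duplicate chunks need to be stored or sent only once.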
Headshotify: AI Passport Photos
Ethel Zhang from Google presented Headshotify, an app that turns any uploaded portrait into passport-worthy headshots by automatically framing the face and optionally removing the background. Headshotify runs face detection to compute alignment coordinates, then applies a segmentation model for clean cutouts before exporting three ready-to-use sizes. The technical through-line from precise cropping to practical asset downloads felt especially useful for builders who want to ship quickly, and the audience feedback leaned positive. We liked it because it combined accuracy, end-to-end tooling, and a clear path to a monetizable photo workflow.
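The step from a detected face to a crop can be sketched with simple geometry. The example below is an assumption-laden illustration, not Headshotify's implementation. It takes a face bounding box as `(x, y, width, height)`, which a face detector would supply, and computes a crop rectangle centered on the face and clamped to the image. The `aspect` and `face_fraction` parameters are hypothetical knobs, and real passport specifications define their own dimensions and head-size rules.

```python
def crop_box(face, image_size, aspect=1.0, face_fraction=0.5):
    """Return (left, top, right, bottom) for a crop centered on the face.

    face: (x, y, w, h) bounding box in pixels.
    image_size: (width, height) of the source image.
    aspect: crop width divided by crop height.
    face_fraction: share of the crop height the face should occupy.
    """
    x, y, w, h = face
    img_w, img_h = image_size

    # Size the crop so the face fills the requested share of its height,
    # shrinking it if it would not fit inside the image.
    crop_h = min(h / face_fraction, img_h)
    crop_w = crop_h * aspect
    if crop_w > img_w:
        crop_w = img_w
        crop_h = crop_w / aspect

    # Center on the face, then clamp so the crop stays within the image.
    cx, cy = x + w / 2, y + h / 2
    left = min(max(cx - crop_w / 2, 0), img_w - crop_w)
    top = min(max(cy - crop_h / 2, 0), img_h - crop_h)
    return (round(left), round(top), round(left + crop_w), round(top + crop_h))


if __name__ == "__main__":
    print(crop_box(face=(400, 300, 200, 240), image_size=(1000, 1000)))
```

The resulting box can be passed to an image library's crop function. Exporting several sizes is then a matter of resizing that one crop.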
How to Ship Complex Features 10x Faster with AI Agents | Dex Horthy (HumanLayer)
How to Run Open-Source LLMs Locally on a Mac with MLX-LM
You are one of 95,000+ readers from Anthropic, OpenAI, Google, Microsoft, Meta, Apple, Amazon, Nvidia, Netflix, Stripe, Databricks, Snowflake, and others — spanning frontier labs, big tech, startups, and top universities.