⚡ Agent Version Control & Pipeline-Parallel LLMs [AI Tinkerers - Post-Training]

AI Tinkerers

Issue #20 · Week of March 30

Joe Heitzeberg • Founder at AI Tinkerers • ⏱️ 1 min read
Creating space for leading builders to share ideas, grow, and make an impact.

This issue’s selected demos showcase builders tackling core system challenges. We’re seeing strong patterns in multi-agent orchestration, with projects like Ellie Daw’s Version Control for Agentic Engineering and Frank Shotwell’s Metaphorex: Multi-Agent Knowledge Graph exploring how to manage and coordinate autonomous agents effectively.

Developers are also pushing the boundaries of LLM inference and benchmarking. Leonard Wang’s OmniNode: Pipeline-Parallel LLM Inference demonstrates splitting models across consumer devices, while Bob Prendergast’s VLLM and Qdrant Benchmarking and Yassine Hadi’s Benchmarking SLMs with Polars offer deep dives into performance tuning and comparison.

New interfaces and developer tooling are also prominent. PJ Gray’s ModelWar: Core War for Agents presents a unique agent competition, and Nelson PROIA’s GitClaw introduces a voice AI companion for GitHub.

Top 5 Picks (March 30)
1 TOP PICK

The Midnight Duck: AI Growth

Jerónimo Lopera Espinosa

CEO at Meritum Corps

Jerónimo Lopera Espinosa (The Midnight Duck) presented how he used AI-driven content iteration to grow an Instagram community to 106k followers in 26 days. The demo showed a repeatable pipeline for ideation, rapid posting, and feedback loops that tuned messaging and engagement without overthinking. It resonated with the startup crowd because it made the cold-start problem feel solvable, and people seemed to love the practical step-by-step approach. We liked it as a blueprint for turning new model capabilities into real growth work.
PROJECT LINKS
instagram.com
2 RUNNER UP

VLLM and Qdrant Benchmarking

Bob Prendergast

Solution Architect at HP

Bob Prendergast from HP showed how vLLM and Qdrant power a local RAG-style stack, spinning up Docker containers for an LLM, an embedding model, and a vector database. He walked through fundamental embedding and retrieval concepts, then ran GPU-focused benchmarks to compare vLLM against sequential model runners. It felt especially relevant for edge AI builders pushing performance on local hardware, and people seemed to enjoy the hands-on, “benchmark it yourself” framing. We liked it because it turns core ideas into repeatable experimentation.
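For readers new to the embedding and retrieval concepts mentioned here, a minimal sketch may help: it ranks a few documents by cosine similarity to a query vector. The vectors are hand-made toy values and `top_k` is a hypothetical helper. In a stack like the one demoed, an embedding model would produce the vectors and Qdrant would run the search.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query, docs, k=2):
    """Return the k document ids most similar to the query vector."""
    scored = sorted(docs.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

# Toy 3-dimensional "embeddings" (assumed values, for illustration only).
docs = {
    "gpu-tuning": [0.9, 0.1, 0.0],
    "vector-db": [0.1, 0.9, 0.1],
    "cooking": [0.0, 0.1, 0.9],
}
query = [0.8, 0.3, 0.0]
print(top_k(query, docs))
```

A brute-force scan like this is fine for a handful of vectors. A vector database earns its keep once the collection is too large to scan on every query.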
3 COMMUNITY FAVORITE

Version Control for Agentic Engineering

Ellie Daw

Founder at gjalla

Ellie Daw, founder at gjalla, presented a new version control approach for agentic engineering, focused on tracking meaningful changes over time when coding agents generate huge volumes of diffs. She described how the system groups and records the “kinds” of changes that matter, so regressions and intent don’t get buried, using a guardrail-first workflow. The idea landed well with the community and sparked subtle nods from other builders who feel the same pain. If productized, this could become a lightweight, auditable layer for safer autonomous coding.
4 STANDOUT

OmniNode: Pipeline-Parallel LLM Inference

Leonard Wang

Co-Founder at SUM INNOVATION INC

Leonard Wang from SUM Innovation presented OmniNode Protocol, a live two-device demo that pipeline-parallelizes LLM inference across consumer Macs over a LAN using Rust, QUIC, and zero-copy GGUF sharding feeding directly into MLX. He built omni-net for mDNS discovery and encrypted QUIC streams, omni-store for memmapped tensor chunking and BLAKE3-CID addressing, and a PyO3 zero-copy bridge that keeps VRAM low by slicing weights and explicitly clearing Metal caches. It matched what attendees were hoping for in the room, namely a practical way to run big models with low VRAM and no cloud, and the story of real networking issues made it feel shippable.
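The core idea of pipeline parallelism can be sketched in a few lines: split a model's layers into contiguous stages, give each stage to one device, and pass activations from stage to stage. The sketch below is not OmniNode's code (which uses Rust, QUIC, and MLX). `partition_layers` and `run_pipeline` are hypothetical names, and the "layers" are toy functions standing in for transformer blocks.

```python
def partition_layers(num_layers, num_devices):
    """Split layer indices into contiguous stages, one per device."""
    base, extra = divmod(num_layers, num_devices)
    stages, start = [], 0
    for d in range(num_devices):
        size = base + (1 if d < extra else 0)
        stages.append(range(start, start + size))
        start += size
    return stages

def run_pipeline(x, layers, stages):
    """Run each stage in order; the hand-off between stages stands in for a network hop."""
    for stage in stages:
        for i in stage:
            x = layers[i](x)
    return x

# Toy "layers": each adds its index (stand-ins for transformer blocks).
layers = [lambda x, i=i: x + i for i in range(5)]
stages = partition_layers(len(layers), 2)
print([list(s) for s in stages])
print(run_pipeline(0, layers, stages))
```

Because each device only needs the weights for its own stage, per-device memory drops roughly in proportion to the number of stages, at the cost of a network transfer at every stage boundary.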
5 NOTABLE

Headshotify: AI Passport Photos

Ethel Zhang

Founder & AI Engineer at FolioRankAI

Ethel Zhang of FolioRankAI presented Headshotify, an app that turns any uploaded portrait into passport-worthy headshots by automatically framing the face and optionally removing the background. Headshotify runs face detection to compute alignment coordinates, then applies a segmentation model for clean cutouts before exporting three ready-to-use sizes. The technical through-line from precise cropping to practical asset downloads felt especially useful for builders who want to ship quickly, and the audience feedback leaned positive. We liked it because it combined accuracy, end-to-end tooling, and a clear path to a monetizable photo workflow.
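The step from a detected face to alignment coordinates might look something like the sketch below, which centers a crop on a face bounding box and clamps it to the image. This is an illustration, not Headshotify's implementation: `crop_box` is a hypothetical helper, and the `face_fraction` default is an assumed value rather than any official passport specification.

```python
def crop_box(face, image_size, aspect=1.0, face_fraction=0.6):
    """Return (left, top, width, height) of a crop centred on the face.

    face: (x, y, w, h) bounding box from any face detector.
    aspect: crop width divided by crop height.
    face_fraction: share of the crop height the face should fill (assumed value).
    """
    x, y, w, h = face
    img_w, img_h = image_size
    # Size the crop from the face height, but never larger than the image allows.
    crop_h = min(h / face_fraction, img_h, img_w / aspect)
    crop_w = crop_h * aspect
    cx, cy = x + w / 2, y + h / 2
    # Centre on the face, then clamp so the crop stays inside the image.
    left = min(max(cx - crop_w / 2, 0), img_w - crop_w)
    top = min(max(cy - crop_h / 2, 0), img_h - crop_h)
    return tuple(round(v) for v in (left, top, crop_w, crop_h))

print(crop_box(face=(400, 200, 200, 240), image_size=(1000, 800)))
```

Exporting several sizes then reduces to resizing this one crop, so the framing stays consistent across outputs.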

More Great Builds
Quick hits from the community — demos worth bookmarking:
Filipp Trigub • AI Tinkerers - Paris • Mar 17
Filipp Trigub presented Your Brand Translator, an open source WhatsApp-first agent that creates brand-consistent social posts and schedules them with minimal effort. The agent runs locally using a Qwen 3.5 35B core plus image and video tooling, and it integrates an OpenClaw-based orchestration layer with a modular “nano skills” setup for plugging in new tools later. People liked the hands-on extendability and the practical workflow from generation to buffer-based scheduling. We liked it because it matches the community’s push toward fast agentic iteration and more sovereign, developer-friendly automation.
PJ Gray shared ModelWar: Core War for Agents, an arena where his agent-written Redcode battles at modelwar.ai, chasing the top score on a Glicko-2 leaderboard. The demo is powered by a Next.js platform in front of an AI-driven Redcode workflow, so participants can iterate fast with Claude Code, Codex, or their own agents. People seemed to love how competitive feedback turned agent strategy into something you can directly tune. We liked it because it made agentic reasoning playful while still showing a real path to self-improving loops.
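For anyone curious how a Glicko-2 leaderboard moves after a batch of matches, here is a compact sketch of one rating-period update following Mark Glickman's published description of the system. It is not ModelWar's code. The function name `glicko2_update` and the default system constant `tau=0.5` are assumptions, and the function expects at least one game in the period.

```python
import math

SCALE = 173.7178  # Glicko-2 conversion constant between rating scales

def glicko2_update(r, rd, vol, results, tau=0.5, eps=1e-6):
    """One rating-period update. results: list of (opp_rating, opp_rd, score)."""
    mu, phi = (r - 1500) / SCALE, rd / SCALE
    terms = []
    for opp_r, opp_rd, score in results:
        mu_j, phi_j = (opp_r - 1500) / SCALE, opp_rd / SCALE
        g = 1 / math.sqrt(1 + 3 * phi_j ** 2 / math.pi ** 2)
        e = 1 / (1 + math.exp(-g * (mu - mu_j)))  # expected score
        terms.append((g, e, score))
    v = 1 / sum(g * g * e * (1 - e) for g, e, _ in terms)
    surprise = sum(g * (s - e) for g, e, s in terms)
    delta = v * surprise

    # Solve for the new volatility (Illinois variant of regula falsi).
    a = math.log(vol ** 2)

    def f(x):
        ex = math.exp(x)
        num = ex * (delta ** 2 - phi ** 2 - v - ex)
        return num / (2 * (phi ** 2 + v + ex) ** 2) - (x - a) / tau ** 2

    A = a
    if delta ** 2 > phi ** 2 + v:
        B = math.log(delta ** 2 - phi ** 2 - v)
    else:
        k = 1
        while f(a - k * tau) < 0:
            k += 1
        B = a - k * tau
    fA, fB = f(A), f(B)
    while abs(B - A) > eps:
        C = A + (A - B) * fA / (fB - fA)
        fC = f(C)
        if fC * fB <= 0:
            A, fA = B, fB
        else:
            fA /= 2
        B, fB = C, fC
    new_vol = math.exp(A / 2)

    # Inflate the deviation by the new volatility, then fold in the results.
    phi_star = math.sqrt(phi ** 2 + new_vol ** 2)
    new_phi = 1 / math.sqrt(1 / phi_star ** 2 + 1 / v)
    new_mu = mu + new_phi ** 2 * surprise
    return SCALE * new_mu + 1500, SCALE * new_phi, new_vol

# One win against a weaker opponent, two losses against stronger ones.
games = [(1400, 30, 1), (1550, 100, 0), (1700, 300, 0)]
print(glicko2_update(1500, 200, 0.06, games))
```

Unlike plain Elo, each player carries a rating deviation and a volatility, so a single upset against a well-established opponent moves a new entrant's rating more than it would move a veteran's.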
David Parkhurst from Aitherium walked through AitherOS, an agent-first OS that treats agents as first-class citizens with identity, memory, and capability tokens, not just containers. He demoed a live production system at demo.aitherium.com with 29 specialized agents across 11 architectural layers, including intent classification and effort-based LLM routing using tools like vLLM or Ollama plus an ADK with MCP support. The standout parts were default-deny security via HMAC-signed tokens and a pain-driven learning loop that generates training data from failures, which people seemed to really enjoy. It made the point that scaling agents is mostly OS design, and that guidance feels very actionable for today.
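To make the default-deny, HMAC-signed token idea concrete, here is a minimal sketch using Python's standard `hmac` module. It is not AitherOS code: the token layout, the `issue_token` and `is_allowed` helpers, and the hard-coded secret are all assumptions for illustration.

```python
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical key; a real system would load this securely

def issue_token(agent_id, capabilities):
    """Serialize the claims and attach an HMAC-SHA256 signature."""
    payload = json.dumps({"agent": agent_id, "caps": sorted(capabilities)}, sort_keys=True)
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"

def is_allowed(token, capability):
    """Default-deny: True only for a valid signature and an explicitly listed capability."""
    payload, sep, sig = token.rpartition(".")
    if not sep:
        return False
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    try:
        claims = json.loads(payload)
    except json.JSONDecodeError:
        return False
    return capability in claims.get("caps", [])

token = issue_token("scout", ["read"])
print(is_allowed(token, "read"), is_allowed(token, "write"))
```

Every failure path returns `False`, so an agent gains a capability only when a correctly signed token names it explicitly.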
Yassine Hadi showed a local benchmarking platform called Benchmarking Small Language Models Where It Actually Matters, focused on measuring whether generated Python and Polars code really runs, not just how it scores. The setup lets teams run experiments in a containerized Docker backend with configurable quantization and decoding, then tracks performance, cost, and failure modes through per-attempt history and leaderboards. We liked how Polars made the evaluation harsher and more production-relevant, and the community feedback leaned that this framing was exactly what builders needed to stop shipping “looks correct” code.
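The "does it actually run" check can be approximated with a small harness that executes each generated snippet in a fresh interpreter and records pass or fail. This is a sketch only: `runs_ok` is a hypothetical helper, the snippets are made up, and a bare subprocess is not a sandbox, which is presumably why the demoed platform uses a Docker backend.

```python
import subprocess
import sys

def runs_ok(code, timeout=5):
    """Execute a code string in a fresh interpreter; return (passed, stderr)."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timeout"
    return proc.returncode == 0, proc.stderr.strip()

# Two stand-in "model outputs": one valid, one that only looks plausible.
attempts = {
    "good": "print(sum([1, 2, 3]))",
    "bad": "print(undefined_name)",
}
history = {name: runs_ok(code)[0] for name, code in attempts.items()}
print(history)
```

Keeping the captured stderr alongside the pass/fail flag is what makes failure modes comparable across models and attempts.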
Logan Jorgensen • AI Tinkerers - Chicago • Mar 17
Logan Jorgensen, a Capital One software engineer, presented Prompt to Polygon, a browser-based FPS where AI generates the game assets on the fly. The pipeline uses ChatGPT image generation feeding into a Hunyuan 3D workflow, then pushes the results into a playable loop with a rough but working end-to-end path. It surprised people with how quickly the “where it breaks” lessons became usable guardrails, and it felt relevant to the current shift toward faster, more autonomous asset creation. We liked it because it turns experimentation into a repeatable blueprint builders can adapt for rapid game development.
Nelson PROIA • AI Tinkerers - Paris • Mar 17
Nelson PROIA from Mistral AI presented GitClaw, a real-time voice AI companion that lets GitHub maintainers review PRs, triage issues, and track repo activity hands free by talking to it. It integrates directly into the GitHub workflow via the GitHub APIs, using an agentic orchestration layer (built by Nelson) to trigger real time updates instead of constant dashboard hopping. We liked how it turned agent work into an event driven, voice-first developer productivity layer, and the audience reaction felt especially strong. If productized, it could become a practical maintenance copilot that reduces maintainer burnout for open source teams.
Wade Fletcher from Tractorbeam presented Cache-Optimized Agentic Fanout, a production workflow that evaluates 50+ natural-language compliance rules against 100+ page financial documents. The system runs a cache-friendly, map-reduce style fanout using pg-boss orchestration, parallel parsing with Reducto, and append-only prompt threads tuned to Anthropic cache economics. After the first warm cache write, later rule runs pay only about 500 new tokens instead of ~32k, while four tools persist directly to Postgres with OTel GenAI traces for token and cache hit or miss visibility. We really liked how it makes scaling agentic work affordable and observable, and people loved it because the prompt and instrumentation patterns are immediately reusable for real-world document AI pipelines.
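A quick back-of-the-envelope calculation shows why the cache-friendly layout matters. The sketch below counts only new (uncached) input tokens using the figures quoted above, about 32k for a full prompt and about 500 per later rule, with 50 rules taken as an assumed example. It ignores output tokens and the reduced rate that cached reads are typically still billed at, so treat it as an upper bound on the savings.

```python
def new_tokens(num_rules, full_prompt=32_000, per_rule_suffix=500, cached=True):
    """Count uncached input tokens across all rule evaluations for one document."""
    if not cached:
        return num_rules * full_prompt
    # First run writes the shared prefix; later runs add only the rule-specific suffix.
    return full_prompt + (num_rules - 1) * per_rule_suffix

rules = 50
without = new_tokens(rules, cached=False)
with_cache = new_tokens(rules)
print(without, with_cache, round(without / with_cache, 1))
```

The append-only thread is what makes this work: the shared document prefix has to be byte-identical across rule runs for the cache to hit.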
Frank Shotwell from Crux Capacity presented Metaphorex, a “bamboo cathedral” of mixed metaphors where a knowledge graph maps what each analogy illuminates and where it breaks down. Metaphorex runs in present tense on a swarm of Claude Code agents with distinct roles like Prospector, Miner, and Assayer, coordinating through GitHub issues and PRs as the state machine. It also ships a lightweight Astro-driven site plus an open-source GitHub repo using automated Markdown generation. We liked it because it showed multi-agent workflows without queues or databases, and community feedback felt genuinely enthusiastic, especially for builders who want agentic plumbing they can copy into real projects.
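Using issues and PRs as the state machine can be pictured as each role owning one transition between labels. The sketch below models that in memory. The role names come from the demo, but the label names, the `advance` helper, and the strict ordering are assumptions, and a real setup would read and write labels through the GitHub API instead of a dict.

```python
# Hypothetical label names; each role moves an issue from one label to the next.
TRANSITIONS = {
    "Prospector": ("proposed", "prospected"),
    "Miner": ("prospected", "mined"),
    "Assayer": ("mined", "assayed"),
}

def advance(issue, role):
    """Apply a role's transition if the issue is in the state that role expects."""
    expected, nxt = TRANSITIONS[role]
    if issue["label"] != expected:
        raise ValueError(f"{role} cannot act on an issue labelled {issue['label']!r}")
    return {**issue, "label": nxt}

issue = {"title": "example metaphor", "label": "proposed"}
for role in ("Prospector", "Miner", "Assayer"):
    issue = advance(issue, role)
print(issue["label"])
```

Since the state lives on the issue itself, any agent can pick up work by querying for the label it handles, with no separate queue or database.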
Dele Atanda of metaProof presented metaMe, a registry-driven agent runtime and studio that generates AI experience surfaces from structured context rather than fixed screens. The demo showed how persona state, active agent or model state, and cartridge or codex layers hydrate the runtime to resolve primitives, policies, menus, trust indicators, and secure tool routing. You could feel the builder lesson land, and people seemed genuinely into the idea that UI can be treated like a composition problem. It also fits the current shift toward agent-native, protocol-style orchestration, with solid product potential as a reusable “experience engine.”
Aram Adamyan presented Brain controls the computer, showing a non-invasive BCI that turns high-frequency EEG activity into real-time digital commands you can use to play games. The system translates neural signals into low-latency inputs for an interactive loop, which means the “modeling” work is mostly about robust signal decoding rather than a chat interface. People seemed genuinely excited by how immediately it felt like control, not just visualization. We liked it because it connects neuroscience to practical agent-style input streams and hints at a future product that could make hands-free interaction widely usable.

🎬 Latest Content

How to Ship Complex Features 10x Faster with AI Agents | Dex Horthy (HumanLayer)

One-Shot • Mar 04
Dex Horthy (HumanLayer) breaks down the “12 Factor Agents” approach to shipping multi-step agentic workflows faster: structured outputs, ...
Watch Now →

How to Run Open-Source LLMs Locally on a Mac with MLX-LM

Deep Dive Series • Jun 12
Run open-source LLMs locally on Apple Silicon with Apple’s MLX-LM: `pip install mlx-lm`, then `load()` a Hugging Face model and call `gen...
Read More →

💼 Top Job Matches
Matched based on your meetup activity and profile
Paxos Health • New York & Toronto • $110k - $175k (varies w/ location/level); generous equity
Stanford-founded Seed-stage healthcare AI startup with >$5M in VC funding and AI agents deployed in production with cu...
Apply Now →
Dex • London (5 days on-site) • £250,000
Frontier AI engineering role building the AI tooling layer for complex financial modelling.
Apply Now →
Jakib AI • Columbus, OH
Jakib is a profitable, growing applied AI firm embedded with operator-led companies in logistics, manufacturing, and c...
Apply Now →

You are one of 95,000+ readers from Anthropic, OpenAI, Google, Microsoft, Meta, Apple, Amazon, Nvidia, Netflix, Stripe, Databricks, Snowflake, and others — spanning frontier labs, big tech, startups, and top universities.
