✍️ Field notes & long reads

The blog.

In-depth articles, capability tours and field notes — written from actually running the models, tools and pipelines they're about.

More articles

🕸️

LinearRAG — relation-free GraphRAG, hands-on a 4090

GraphRAG without relation extraction. LinearRAG (ICLR 2026) builds its Tri-Graph with zero LLM tokens — just NER and embeddings — then does single-pass multi-hop retrieval via semantic bridging and Personalized PageRank, run end-to-end on a 24 GB 4090 with gpt-5-mini.

🎨

Boogu-Image — the Edit and Turbo models, hands-on a 4090

The Apache-2.0 10B unified image model on a 24 GB 4090: instruction-based photo editing and 4-step Turbo text-to-image, with a screenshot-illustrated install, prompt/config galleries, and honest notes on blur, OOM and group offload, and on steps, CFG and identity.

🔭

Lens — quantizing the DiT to run 1440 without offload on a 4090

Microsoft's 3.8B text-to-image model with a 4-bit GPT-OSS-20B encoder: why no-offload bf16 OOMs above 1024, two eviction dead-ends, and the FP8 DiT quantization that unlocks 1440 fully on the GPU — with a multi-category gallery and honest per-image quality notes.

🎨

Ideogram 4 — structured JSON prompting on a single RTX 4090

The 9.3B open-weight design model: JSON captions with bounding-box layout, LLM magic prompts, a 14-image quality-scored style gallery (Arabic included) and a 21-run speed benchmark — measured locally in nf4 on a 24 GB 4090.

🌳

Bonsai-Image-4B — a 1.58-bit text-to-image model on a single RTX 4090

PrismML's 4B image generator quantized to ternary (1.58-bit) and distilled to 4 steps — ~5 GB VRAM, sub-second warm renders, ten --style presets, and the measured throughput numbers.

🎬

Lance 3B — a unified image & video model on a single RTX 4090

ByteDance's Lance does image/video understanding, generation and editing in one 3B transformer — all seven modes, the official benchmarks, and the two code fixes needed to fit 24 GB.

Doubling Qwopus 3.6 on a single RTX 4090 — Multi-Token Prediction with llama.cpp

~1.98× average speedup (peak 2.21× on math) via MTP on Jackrong/Qwopus3.6-27B-v2-MTP-GGUF — with the failed Ollama detour and the production llama-server recipe.

🎨

HiDream-O1-Image — Five generation modes & nine diagram styles on a single GPU

A capability tour of the open 8B Pixel-DiT model: text-to-image, instruction editing, IP reference, bbox layout, openpose — end-to-end on an RTX 4090.