✍️ Field notes & long reads

The blog.

In-depth articles, capability tours and field notes — written from actually running the models, tools and pipelines they're about.

🕸️

⭐ Featured · Latest

AI Models · Field Notes · June 2026 · 9 min read

LightRAG — fast graph-RAG with GPT-5 + Qdrant, hands-on

Graph-RAG without GraphRAG's bill. LightRAG (HKUDS) turns your documents into a knowledge graph and answers with a dual-level retrieval paradigm: comprehensive coverage, but with a single API call instead of community traversal and incremental updates instead of rebuilds. The article covers the seven core ideas, then a real end-to-end install on OpenAI gpt-5-mini / gpt-5.5 with a Qdrant vector store, including the graph it builds and the setup fixes.

Read the article →

🕸️

LinearRAG — relation-free GraphRAG, hands-on a 4090

GraphRAG without relation extraction. LinearRAG (ICLR 2026) builds its Tri-Graph with zero LLM tokens — just NER and embeddings — then does single-pass multi-hop retrieval via semantic bridging and Personalized PageRank, run end-to-end on a 24 GB 4090 with gpt-5-mini.

AI Models · Field Notes · June 2026

🎨

Boogu-Image — the Edit and Turbo models, hands-on a 4090

The Apache-2.0 10B unified image model on a 24 GB 4090: instruction-based photo editing and 4-step Turbo text-to-image, with an install walkthrough with screenshots, prompt/config galleries, and an honest account of blur, OOM, group offload, and steps/CFG/identity.

AI Models · Field Notes · June 2026

🔭

Lens — quantizing the DiT to run 1440 without offload on a 4090

Microsoft's 3.8B text-to-image model with a 4-bit GPT-OSS-20B encoder: why no-offload bf16 OOMs above 1024, two eviction dead-ends, and the FP8 DiT quantization that unlocks 1440 fully on the GPU — with a multi-category gallery and honest per-image quality notes.

AI Models · Field Notes · June 2026

🎨

Ideogram 4 — structured JSON prompting on a single RTX 4090

The 9.3B open-weight design model: JSON captions with bounding-box layout, LLM magic prompts, a 14-image quality-scored style gallery (Arabic included) and a 21-run speed benchmark — measured locally in nf4 on a 24 GB 4090.

AI Models · Field Notes · June 2026

🌳

Bonsai-Image-4B — a 1.58-bit text-to-image model on a single RTX 4090

PrismML's 4B image generator quantized to ternary (1.58-bit) and distilled to 4 steps — ~5 GB VRAM, sub-second warm renders, ten --style presets, and the measured throughput numbers.

AI Models · Field Notes · May 2026

🎬

Lance 3B — a unified image & video model on a single RTX 4090

ByteDance's Lance does image/video understanding, generation and editing in one 3B transformer — all seven modes, the official benchmarks, and the two code fixes needed to fit 24 GB.

AI Models · Field Notes · May 2026

⚡

Doubling Qwopus 3.6 on a single RTX 4090 — Multi-Token Prediction with llama.cpp

~1.98× average speedup (peak 2.21× on math) via MTP on Jackrong/Qwopus3.6-27B-v2-MTP-GGUF — with the failed Ollama detour and the production llama-server recipe.

AI Models · Field Notes · May 2026

🎨

HiDream-O1-Image — Five generation modes & nine diagram styles on a single GPU

A capability tour of the open 8B Pixel-DiT model: text-to-image, instruction editing, IP reference, bbox layout, openpose — end-to-end on an RTX 4090.

AI Models · Field Notes · May 2026