You must be an AI Tinkerers active member to view these talks and demos.
MoE parameters, MoE problems: visualizing Mixture of Experts Routing Layers
See a live visualization of Mixture of Experts routing decisions. This talk demonstrates real-time neural telemetry, showing how sparse activation achieves high performance without cloud-scale waste.
I built a real-time neural telemetry engine that intercepts and visualizes the internal gating decisions of a Mixture of Experts (MoE) model as it runs.
The project uses a PyTorch forward hook to capture the raw 40-dimensional routing weights from Layer 20 of an IBM Granite 3.0 model. Because everything runs on a laptop, the demo provides a live “neural heartbeat” that shows how sparse activation can deliver high-performance reasoning without the latency or computational waste of a cloud-scale cluster.
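To make the idea of sparse activation concrete, here is a minimal sketch of what a visualizer might do with one captured 40-dimensional routing vector: apply a softmax, keep the top-k experts, and draw which ones fire. This is not the talk's code. The function names, the random stand-in logits, and the choice of `TOP_K = 8` are assumptions for illustration; in the real demo the vector would come from the forward hook, and the model's own routing details may differ.

```python
import math
import random

NUM_EXPERTS = 40  # from the talk: 40-dimensional routing weights
TOP_K = 8         # assumption: experts activated per token (illustrative)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(logits, k=TOP_K):
    """Return (expert_index, weight) pairs for the k largest routing scores.

    Weights are renormalized so the selected experts sum to 1.
    """
    probs = softmax(logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in chosen)
    return [(i, probs[i] / mass) for i in chosen]


def render_bar(routes, num_experts=NUM_EXPERTS):
    """One text 'frame': '#' marks an active expert, '.' an idle one."""
    active = {i for i, _ in routes}
    return "".join("#" if i in active else "." for i in range(num_experts))


# Stand-in for a captured routing vector (hypothetical data, not model output).
random.seed(0)
fake_logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

routes = top_k_route(fake_logits)
print(render_bar(routes))
print(f"active fraction: {len(routes) / NUM_EXPERTS:.2f}")
```

Under these assumptions only a fraction of the experts do work for any given token, which is the property the live visualization is meant to expose. Printing one bar per token would give a crude text version of the “heartbeat”.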