Show HN: LocalCoder – Tell it your hardware, get the exact local AI model to run
Hey HN — I built this after seeing the Qwen3-Coder threads here. Every thread had the same questions: which quant for my GPU? How much VRAM do I need? Ollama or llama.cpp? What context window can I actually use?
LocalCoder answers all of that in one page. Pick your platform (Apple Silicon, NVIDIA, CPU), select your chip and memory, and it gives you:
- The best model + quantization for your setup
- Expected speed (tokens/sec) and context window
- Copy-paste Ollama commands to get running in 60 seconds
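For a concrete picture, here's roughly the kind of result you get for one setup. This is an illustrative sketch only — the field names, model tag, and numbers are placeholders I'm inventing here, not the site's actual output or real benchmark figures:

```typescript
// Hypothetical example of a LocalCoder result for one setup.
// Model tag, speed, and context numbers are placeholders, not real benchmarks.
const example = {
  hardware: "Apple Silicon, 36 GB unified memory",
  model: "qwen3-coder:30b",       // top pick for this setup (illustrative)
  quant: "Q4_K_M",                // GGUF quantization level
  expectedTokensPerSec: 25,       // rough generation-speed estimate
  contextWindow: 32768,           // usable context length in tokens
  ollamaCommands: [
    "ollama pull qwen3-coder:30b",
    "ollama run qwen3-coder:30b",
  ],
} as const;
```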
The recommendation engine is a curated config matrix built from HN benchmarks, Unsloth docs, and llama.cpp test data. No AI inference on the backend — it's all client-side.
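To make "curated matrix, no inference" concrete, here's a minimal sketch of the idea, assuming a flat keyed lookup table shipped with the page. The key format, entry, and numbers are invented for illustration and aren't LocalCoder's actual data or code:

```typescript
// Sketch of a curated, client-side config matrix: a plain lookup table,
// no model inference at runtime. Entries below are invented for illustration.
type ModelPick = {
  model: string;
  quant: string;
  tokensPerSec: number;
  contextWindow: number;
  ollamaCommands: string[];
};

const matrix: Record<string, ModelPick> = {
  "nvidia/rtx-4070/12": {
    model: "qwen2.5-coder:7b",    // placeholder pick for a 12 GB card
    quant: "Q4_K_M",
    tokensPerSec: 60,             // placeholder, not a measured number
    contextWindow: 16384,
    ollamaCommands: ["ollama run qwen2.5-coder:7b"],
  },
  // ...more curated entries built from benchmarks and docs
};

// Runs entirely in the browser: build a key from the user's selections
// and return the matching curated entry, if any.
function recommend(platform: string, chip: string, memoryGB: number): ModelPick | undefined {
  return matrix[`${platform}/${chip}/${memoryGB}`];
}
```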
The free tier gives you the top pick plus Ollama commands. A $9 one-time purchase unlocks the alternatives table, llama.cpp commands, and an IDE integration guide.
Would love feedback on the recommendations. If your hardware isn't covered or a rec seems off, let me know — I'll update the matrix.