Show HN: LocalCoder – Tell it your hardware, get the exact local AI model to run
Hey HN — I built this after seeing the Qwen3-Coder threads here. Every thread had the same questions: which quant for my GPU? How much VRAM do I need? Ollama or llama.cpp? What context window can I actually use?
LocalCoder answers all of that in one page. Pick your platform (Apple Silicon, NVIDIA, CPU), select your chip and memory, and it gives you:
- The best model + quantization for your setup
- Expected speed (tokens/sec) and context window
- Copy-paste Ollama commands to get running in 60 seconds
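For a concrete picture, here's roughly the kind of result you get for one setup. This is an illustrative sketch only — the field names, model tag, and numbers are placeholders I'm inventing here, not the site's actual output or real benchmark figures:

```typescript
// Hypothetical example of a LocalCoder result for one setup.
// Model tag, speed, and context numbers are placeholders, not real benchmarks.
const example = {
  hardware: "Apple Silicon, 36 GB unified memory",
  model: "qwen3-coder:30b",       // top pick for this setup (illustrative)
  quant: "Q4_K_M",                // GGUF quantization level
  expectedTokensPerSec: 25,       // rough generation-speed estimate
  contextWindow: 32768,           // usable context length in tokens
  ollamaCommands: [
    "ollama pull qwen3-coder:30b",
    "ollama run qwen3-coder:30b",
  ],
} as const;
```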
The recommendation engine is a curated config matrix built from HN benchmarks, Unsloth docs, and llama.cpp test data. No AI inference on the backend — it's all client-side.
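To make "curated matrix, no inference" concrete, here's a minimal sketch of the idea, assuming a flat keyed lookup table shipped with the page. The key format, entry, and numbers are invented for illustration and aren't LocalCoder's actual data or code:

```typescript
// Sketch of a curated, client-side config matrix: a plain lookup table,
// no model inference at runtime. Entries below are invented for illustration.
type ModelPick = {
  model: string;
  quant: string;
  tokensPerSec: number;
  contextWindow: number;
  ollamaCommands: string[];
};

const matrix: Record<string, ModelPick> = {
  "nvidia/rtx-4070/12": {
    model: "qwen2.5-coder:7b",    // placeholder pick for a 12 GB card
    quant: "Q4_K_M",
    tokensPerSec: 60,             // placeholder, not a measured number
    contextWindow: 16384,
    ollamaCommands: ["ollama run qwen2.5-coder:7b"],
  },
  // ...more curated entries built from benchmarks and docs
};

// Runs entirely in the browser: build a key from the user's selections
// and return the matching curated entry, if any.
function recommend(platform: string, chip: string, memoryGB: number): ModelPick | undefined {
  return matrix[`${platform}/${chip}/${memoryGB}`];
}
```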
The free tier gives you the top pick plus Ollama commands. A $9 one-time purchase unlocks the alternatives table, llama.cpp commands, and an IDE integration guide.
Would love feedback on the recommendations. If your hardware isn't covered or a rec seems off, let me know — I'll update the matrix.