
Show HN: Offline RAG System Using Docker and Llama 3 (No Cloud APIs)

PhilYeh Wednesday, November 26, 2025

I'm sharing a fully offline RAG (Retrieval-Augmented Generation) stack I built to solve two persistent problems in industrial environments: data privacy and recurring API costs.

We handle sensitive proprietary datasheets and schematics daily, which makes cloud-based LLMs like ChatGPT non-compliant with our data-handling requirements.

The Solution: A containerized architecture that ensures data never leaves the local network.

The Stack:

- LLM: Llama 3 (via Ollama)
- Vector DB: ChromaDB
- Deployment: Docker Compose (one-command setup)
- Benefits: zero API costs, no data exposed to third parties, fast local performance

The code and architecture are available here: https://github.com/PhilYeh1212/Local-AI-Knowledge-Base-Docke...
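To give a concrete picture of the deployment, here is a minimal compose sketch of the shape this stack takes. The service names, image tags, volume paths, and GPU block are illustrative assumptions rather than a copy of the repo's file; see the link above for the real one.

```yaml
# Sketch of a docker-compose.yml for an Ollama + ChromaDB stack.
# Names, ports, and paths are assumed for illustration.
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"          # Ollama's default API port
    volumes:
      - ollama_models:/root/.ollama   # persist pulled models across restarts
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia   # NVIDIA GPU passthrough (requires the
              count: all       # nvidia-container-toolkit on the host)
              capabilities: [gpu]

  chromadb:
    image: chromadb/chroma
    ports:
      - "8000:8000"            # ChromaDB's default HTTP port
    volumes:
      - chroma_data:/chroma/chroma    # persist the vector store

volumes:
  ollama_models:
  chroma_data:
```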

Happy to answer questions about the GPU passthrough setup or document ingestion pipeline.
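For anyone curious about the ingestion/query flow, here is a rough Python sketch of the RAG loop against Ollama's REST API and the ChromaDB client. The model names (nomic-embed-text, llama3), the collection name, hosts/ports, and the sample chunks are assumptions for illustration, not the repo's actual pipeline.

```python
# Minimal RAG loop sketch: embed with Ollama, store/retrieve with ChromaDB,
# answer with Llama 3. Hosts, ports, and model names are assumptions.
import requests
import chromadb

OLLAMA = "http://localhost:11434"

def embed(text: str) -> list[float]:
    # Ollama's embeddings endpoint; the embedding model must be pulled first.
    r = requests.post(f"{OLLAMA}/api/embeddings",
                      json={"model": "nomic-embed-text", "prompt": text})
    r.raise_for_status()
    return r.json()["embedding"]

client = chromadb.HttpClient(host="localhost", port=8000)
docs = client.get_or_create_collection("datasheets")

# Ingestion: chunked datasheet text goes in with its embedding.
chunks = ["The MCU operates from 1.8 V to 3.6 V.",
          "Flash endurance is rated at 10k erase cycles."]
docs.add(ids=[f"chunk-{i}" for i in range(len(chunks))],
         documents=chunks,
         embeddings=[embed(c) for c in chunks])

# Query: retrieve the closest chunks, then ground the LLM's answer on them.
question = "What is the supply voltage range?"
hits = docs.query(query_embeddings=[embed(question)], n_results=2)
context = "\n".join(hits["documents"][0])

r = requests.post(f"{OLLAMA}/api/generate",
                  json={"model": "llama3", "stream": False,
                        "prompt": f"Answer using only this context:\n"
                                  f"{context}\n\nQ: {question}"})
r.raise_for_status()
print(r.json()["response"])
```

Two local HTTP calls (embed, generate) are all the LLM side needs, which is what keeps the whole loop on the local network.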
