Show HN: I built a desktop app combining Claude, GPT, Gemini with local Ollama
tsunamayo · Sunday, March 01, 2026

I built a desktop app (PyQt6, Windows) that orchestrates multiple AI models in a 3-phase pipeline:
Phase 1 – A cloud LLM (Claude/GPT/Gemini) decomposes the prompt into structured sub-tasks
Phase 2 – Local Ollama models process each sub-task (free, private, runs on your GPU)
Phase 3 – The cloud LLM integrates the results into a coherent final answer
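The three phases can be sketched roughly like this. This is an illustrative skeleton, not the app's actual code: the function names are mine, and the model calls are stubbed out (in the real pipeline, `decompose` and `integrate` would hit a cloud API and `run_local` would hit Ollama):

```python
# Phase 1: a cloud LLM splits the prompt into sub-tasks.
# Stubbed here; really an Anthropic/OpenAI/Google API call.
def decompose(prompt: str) -> list[str]:
    return [f"Summarize: {prompt}", f"List key terms in: {prompt}"]

# Phase 2: each sub-task goes to a local Ollama model.
# Stubbed here; really a request to the local Ollama server.
def run_local(subtask: str) -> str:
    return f"[local answer to: {subtask}]"

# Phase 3: the cloud LLM merges the partial answers into one response.
def integrate(prompt: str, partials: list[str]) -> str:
    return f"Answer to '{prompt}' based on {len(partials)} sub-results."

def pipeline(prompt: str) -> str:
    subtasks = decompose(prompt)                 # Phase 1 (cloud)
    partials = [run_local(t) for t in subtasks]  # Phase 2 (local Ollama)
    return integrate(prompt, partials)           # Phase 3 (cloud)
```

The point of the structure: the expensive cloud calls happen only twice per prompt (fan-out and merge), while the per-sub-task work fans out to the free local models.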
The motivation: cloud APIs are great at reasoning and structure but cost money. Local Ollama models are free but sometimes inconsistent. The pipeline lets you use each where it's strongest.
Also includes:
- FastAPI + React web UI (accessible from LAN/mobile)
- SQLite chat history
- ChromaDB-based RAG
- Discord webhook notifications
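For anyone curious what the local leg looks like: Phase 2 is essentially one HTTP POST per sub-task against Ollama's REST API. A minimal sketch (the helper names are mine; the endpoint, payload shape, and `"response"` field are Ollama's standard non-streaming `/api/generate` interface, and `llama3` is just an example model):

```python
import json
import urllib.request

def build_payload(prompt: str, model: str = "llama3") -> dict:
    # stream=False makes Ollama return one JSON object instead of chunks
    return {"model": model, "prompt": prompt, "stream": False}

def ollama_generate(prompt: str, model: str = "llama3",
                    host: str = "http://localhost:11434") -> str:
    data = json.dumps(build_payload(prompt, model)).encode()
    req = urllib.request.Request(
        f"{host}/api/generate", data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Non-streaming responses carry the full completion in "response"
        return json.loads(resp.read())["response"]
```

No API key, no per-token cost; the only requirement is a local `ollama serve` with the model pulled.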
Stack: Python, PyQt6, FastAPI, React, Ollama, Anthropic/OpenAI/Google APIs. MIT license.