Show HN: Stop Claude Code from forgetting everything
I got tired of Claude Code forgetting all my context every time I open a new session: setup decisions, how I like my margins, decision history, etc.
We built a shared memory layer you can drop in as a Claude Code Skill. It’s basically a tiny memory DB with recall that remembers your sessions. Not magic. Not AGI. Just state.
Install in Claude Code:
/plugin marketplace add https://github.com/mutable-state-inc/ensue-skill
/plugin install ensue-memory
# restart Claude Code
What it does: (1) persists context between sessions (2) semantic & temporal search (not just string grep). Basically git for your Claude brain.
What it doesn’t do:
- it won’t read your mind
- it’s alpha; it might break if you throw a couch at it
Repo: https://github.com/mutable-state-inc/ensue-skill
If you try it and it sucks, tell me why so I can fix it. Don't be kind, tia
Show HN: Aroma: Every TCP Proxy Is Detectable with RTT Fingerprinting
TL;DR explanation (go to https://github.com/Sakura-sx/Aroma?tab=readme-ov-file#tldr-e... if you want the formatted version)
This is done by measuring the minimum TCP RTT (client.socket.tcpi_min_rtt) seen and the smoothed TCP RTT (client.socket.tcpi_rtt). I am getting this data by using Fastly Custom VCL; they get it from the Linux kernel (struct tcp_info -> tcpi_min_rtt and tcpi_rtt). I am using Fastly for the demo since they have PoPs all around the world and they expose TCP socket data to me.
The score is calculated as tcpi_min_rtt/tcpi_rtt. It's simple, but it's what worked best with the data Fastly gives me. Based on my testing, 1-0.7 is normal, 0.7-0.3 is normal if the connection is somewhat unstable (WiFi, mobile data, satellite...), 0.3-0.1 is low and may be a proxy, and anything lower than 0.1 is flagged as a TCP proxy by the current code.
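For illustration, the scoring rule in a few lines of Python (the thresholds are the ones from my testing above; the production check runs in Fastly VCL, not Python):

  def proxy_score(tcpi_min_rtt: float, tcpi_rtt: float) -> float:
      """Ratio of minimum RTT to smoothed RTT (both in microseconds)."""
      return tcpi_min_rtt / tcpi_rtt

  def classify(score: float) -> str:
      if score >= 0.7:
          return "normal"
      if score >= 0.3:
          return "normal if the link is unstable (WiFi, mobile data, satellite)"
      if score >= 0.1:
          return "low - may be a proxy"
      return "flagged as TCP proxy"

  # A smoothed RTT far above the minimum gives a low score.
  print(classify(proxy_score(800, 12000)))  # ~0.07 -> flagged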
Show HN: Superset – Terminal to run 10 parallel coding agents
Hey HN, we’re Avi, Kiet, and Satya. We’re building Superset, an open-source terminal made for managing a bunch of coding agents (Claude Code, Codex, etc) in parallel.
- Superset makes it easy to spin up git worktrees and automatically set up your environment
- Agents and terminal tabs are isolated to worktrees, preventing conflicts
- Built-in hooks [0] to notify you when your coding agents are done or need attention
- A diff viewer to review the changes and make PRs quickly
We’re three engineers who’ve built and maintained large codebases, and kept wanting to work on as many features in parallel as possible. Git worktrees [1] have been a useful solution for this, but they’re annoying to spin up and manage. We started Superset as a tool that uses the best practices we’ve discovered running parallel agents.
Here is a demo video:
https://www.youtube.com/watch?v=pHJhKFX2S-4
We all use Superset to build Superset, and it more than doubles our productivity (you’ll be able to tell from the autoupdates). We have many friends using it over their IDE of choice or replacing their terminals with Superset, and it seems to stick because they can keep using whatever CLI agent or tool they want while Superset just augments their existing set of tools.
Superset is written predominantly in TypeScript and based on Electron, xterm.js, and node-pty. We chose xterm.js + node-pty because it's a proven way to run real PTYs in a desktop app (used by VS Code and Hyper), and Electron lets us ship fast. Next, we’re exploring features like running worktrees in cloud VMs to offload local resources, context sharing between agents, and a top-level orchestration agent for managing many worktrees or projects at once.
We’ve learned a lot building this: making a good terminal is more complex than you’d think, and terminal and git defaults aren’t universal (svn vs git, weird shell setups, complex monorepos, etc.).
Building a product for yourself is way faster and quite fun. It's early days, but we’d love for you to try Superset across all your CLI tools and environments; we welcome your feedback! :)
[0] https://code.claude.com/docs/en/hooks
[1] https://git-scm.com/docs/git-worktree
Show HN: Evidex – AI Clinical Search (RAG over PubMed/OpenAlex and SOAP Notes)
Hi HN,
I’m a solo dev building a clinical search engine to help my wife (a resident physician) and her colleagues.
The Problem: Current tools (UpToDate/OpenEvidence) are expensive, slow, or increasingly heavy with pharma ads.
The Solution: I built Evidex to be a clean, privacy-first alternative. Search Demo (GIF): https://imgur.com/a/zoUvINt
Technical Architecture (Search-Based RAG): Instead of using a traditional pre-indexed vector database (like Pinecone), which can serve stale data, I implemented a real-time RAG pattern (a sketch of the routing and retrieval follows the component list below):
Orchestrator: A Node.js backend performs "Smart Routing" (regex/keyword analysis) on the query to decide which external APIs to hit (PubMed, Europe PMC, OpenAlex, or ClinicalTrials.gov).
Retrieval: It executes parallel fetches to these APIs at runtime to grab the top ~15 abstracts.
Local Data: Clinical guidelines are stored locally in SQLite and retrieved via full-text search (FTS) ensuring exact matches on medical terminology.
Inference: I’m using Gemini 2.5 Flash to process the concatenated abstracts. The massive context window allows me to feed it distinct search results and force strict citation mapping without latency bottlenecks.
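A minimal Python sketch of the routing and parallel-retrieval idea (the keyword rules are illustrative placeholders, only the PubMed fetcher is shown, and the production code is Node.js, not Python):

  import re
  from concurrent.futures import ThreadPoolExecutor

  import requests

  # Illustrative keyword routing -- the real "smart router" is more involved.
  ROUTES = {
      "clinicaltrials": re.compile(r"\b(trial|phase\s+[123]|randomi[sz]ed)\b", re.I),
      "pubmed":         re.compile(r".", re.S),  # default: always search PubMed
  }

  def pick_sources(query: str) -> list[str]:
      return [name for name, pattern in ROUTES.items() if pattern.search(query)]

  def fetch_pubmed(query: str) -> dict:
      # NCBI E-utilities search endpoint (returns IDs; abstracts need a second call).
      url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
      return requests.get(url, params={"db": "pubmed", "term": query,
                                       "retmax": 15, "retmode": "json"},
                          timeout=10).json()

  FETCHERS = {"pubmed": fetch_pubmed}  # other sources omitted in this sketch

  def retrieve(query: str) -> dict:
      sources = [s for s in pick_sources(query) if s in FETCHERS]
      with ThreadPoolExecutor() as pool:
          futures = {s: pool.submit(FETCHERS[s], query) for s in sources}
          return {s: f.result() for s, f in futures.items()}

  print(retrieve("randomized trial of SGLT2 inhibitors in heart failure").keys())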
Workflow Tools (The "Integration"): I also built a "reasoning layer" to handle complex patient histories (Case Mode) and draft documentation (SOAP Notes). Case Mode Demo (GIF): https://imgur.com/a/h01Zgkx Note Gen Demo (GIF): https://imgur.com/a/DI1S2Y0
Why no Vector DB? In medicine, "freshness" is critical. If a new trial drops today, a pre-indexed vector store might miss it. My real-time approach ensures the answer includes papers published today.
Business Model: The clinical search is free. I plan to monetize by selling billing automation tools to hospital admins later.
Feedback Request: I’d love feedback on the retrieval latency (fetching live APIs is slower than vector lookups) and the accuracy of the synthesized answers.
Show HN: See what readers who loved your favorite book/author also loved to read
Hi HN,
Every year, we ask thousands of readers (and authors) to share their 3 favorite reads of the year.
Now you can enter a book or author you love and see what other books were loved by readers who also loved that book or author.
Try it here: https://shepherd.com/bboy/2025
This goes wide and doesn't try to limit itself to the genre, so you get some interesting results.
What do you think?
Background:
I want better recommendations based on my reading history. I'm incredibly frustrated with what is out there.
This system is based on 5,000 readers voting on their 3 favorite reads from 2023 to 2025. So, this covers ~15,000 books and is a high-quality vote. We wanted to keep the dataset small for now while we play with approaches.
We are building a full Book DNA app that pulls in your Goodreads history and delivers deeply personalized book recommendations based on people who like similar books (a significant challenge).
You can sign up to beta test it here if you want to help me with that:
https://docs.google.com/forms/d/1VOm8XOMU0ygMSTSKi9F0nExnGwo...
The first beta is coming out in late January, but it's pretty basic to start.
Past Show HNs as we've built Shepherd:
https://news.ycombinator.com/item?id=40084193
https://news.ycombinator.com/item?id=38600246
https://news.ycombinator.com/item?id=26871660
Thanks, looking forward to your comments :)
Ben
Show HN: Vibe coding a bookshelf with Claude Code
The article explores the process of creating a virtual bookshelf using Claude Code, an AI-powered coding tool. It discusses the challenges and techniques involved in building an interactive and visually appealing bookshelf application.
Show HN: Shardium – open-source "Dead Man's Switch" for crypto inheritance
Hi HN, I'm Max.
I built this because I was terrified that if I die tomorrow, my family gets nothing. The existing solutions either meant trusting a centralized custodian or dealing with complex hardware setups.
Shardium is a client-side tool that splits your seed phrase into 3 shards using Shamir's Secret Sharing.
Shard A: You keep.
Shard B: You give to a beneficiary (PDF).
Shard C: We hold (or you self-host).
It works as a dead man's switch: If you are inactive for 90 days (email ping), Shard C is released to your beneficiary. They combine B + C to recover the funds.
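To make the split-and-recover step concrete, here is a toy 2-of-3 Shamir split over a prime field in Python. It is not the production code path (the real tool uses secrets.js-grempe over the actual seed words); it just shows why any two shards recover the secret while one shard alone reveals nothing:

  import random

  P = 2**127 - 1  # a Mersenne prime, large enough for this toy example

  def split_2_of_3(secret: int) -> dict[int, int]:
      """Degree-1 polynomial f(x) = secret + a1*x; shares are (x, f(x) mod P)."""
      a1 = random.randrange(1, P)
      return {x: (secret + a1 * x) % P for x in (1, 2, 3)}  # shards A, B, C

  def recover(x1: int, y1: int, x2: int, y2: int) -> int:
      """Lagrange interpolation at x=0 using any two shares."""
      inv = pow(x2 - x1, -1, P)
      return (y1 * x2 * inv - y2 * x1 * inv) % P

  secret = 0xDEADBEEF  # stand-in for an encoded seed phrase
  shares = split_2_of_3(secret)
  # Beneficiary holds shard 2 (B); the service releases shard 3 (C) after inactivity.
  assert recover(2, shares[2], 3, shares[3]) == secret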
The Stack:
secrets.js-grempe for the math.
FastAPI + PostgreSQL backend.
Client-side encryption (seed never hits the network).
It is 100% Open Source and MIT Licensed. You can self-host it for free ($0), or use the managed version.
I'd love your feedback on the security model. Roast my code here: https://github.com/pyoneerC/shardium
Show HN: Per-instance TSP Solver with No Pre-training (1.66% gap on d1291)
OP here.
Most Deep Learning approaches for TSP rely on pre-training with large-scale datasets. I wanted to see if a solver could learn "on the fly" for a specific instance without any priors from other problems.
I built a solver using PPO that learns from scratch per instance. It achieved a 1.66% gap on TSPLIB d1291 in about 5.6 hours on a single A100.
The Core Idea: My hypothesis was that while optimal solutions are mostly composed of 'minimum edges' (nearest neighbors), the actual difficulty comes from a small number of 'exception edges' outside of that local scope.
Instead of pre-training, I designed an inductive bias based on the topological/geometric structure of these exception edges. The agent receives guides on which edges are likely promising based on micro/macro structures, and PPO fills in the gaps through trial and error.
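For anyone who wants to poke at the hypothesis, here is a small numpy sketch of how one could count "exception edges" on an instance: tour edges whose endpoints are not in each other's k-nearest-neighbor lists. The tour and k below are placeholders; the guides in the repo are richer than a plain k-NN test:

  import numpy as np

  def exception_edges(coords: np.ndarray, tour: list[int], k: int = 10) -> list[tuple[int, int]]:
      """Tour edges (a, b) where b is not among a's k nearest neighbors and vice versa."""
      dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
      np.fill_diagonal(dists, np.inf)
      knn = np.argsort(dists, axis=1)[:, :k]          # k nearest neighbors per node
      exceptions = []
      for a, b in zip(tour, tour[1:] + tour[:1]):     # closed tour
          if b not in knn[a] and a not in knn[b]:
              exceptions.append((a, b))
      return exceptions

  # Placeholder instance and tour; a TSPLIB instance and its best tour would go here.
  coords = np.random.rand(50, 2)
  tour = list(range(50))
  edges = exception_edges(coords, tour)
  print(f"{len(edges)} of 50 edges fall outside the k-NN neighborhood")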
It is interesting to see RL reach this level without a dataset. I have open-sourced the code and a Colab notebook for anyone who wants to verify the results or tinker with the 'exception edge' hypothesis.
Code & Colab: https://github.com/jivaprime/TSP_exception-edge
Happy to answer any questions about the geometric priors or the PPO implementation!
Show HN: My not-for-profit search engine with no ads, no AI, & all DDG bangs
I've been working on a little open source [1] search engine, nilch. I noticed that nearly all well-known search engines, including the alternative ones, are run by companies whose goal is to make money, so they either fill your results with ads or charge you for access. I dislike this because search is the backbone of the internet and shouldn't be commercial, so nilch runs in a not-for-profit style and aims to survive on donations. I'm also personally sick of AI in my search results, so I got rid of that, and I wanted DuckDuckGo bangs, so it supports all of them. Like many alternative search engines, it is fully private.
Sadly, it currently does not have its own index but rather uses the Brave search API. Once I'm in a financial position that it's possible, I would absolutely love to build a completely new index from the ground up which is open source, as well as an open source ranking and search algorithm, to back it.
I posted on Reddit and got an amazing amount of feedback, and I've already implemented a number of the feature requests, so I would really like your ideas, critiques, and bug reports as well. Thank you, and sorry for the long post!
[1] https://github.com/UnmappedStack/nilch
Show HN: Z80-μLM, a 'Conversational AI' That Fits in 40KB
How small can a language model be while still doing something useful? I wanted to find out, and had some spare time over the holidays.
Z80-μLM is a character-level language model with 2-bit quantized weights ({-2,-1,0,+1}) that runs on a Z80 with 64KB RAM. The entire thing: inference, weights, chat UI, it all fits in a 40KB .COM file that you can run in a CP/M emulator and hopefully even real hardware!
It won't write your emails, but it can be trained to play a stripped down version of 20 Questions, and is sometimes able to maintain the illusion of having simple but terse conversations with a distinct personality.
--
The extreme constraints nerd-sniped me and forced interesting trade-offs: trigram hashing (typo-tolerant, but loses word order), 16-bit integer math, and some careful massaging of the training data so I could keep the examples 'interesting'.
The key was quantization-aware training that accurately models the inference code limitations. The training loop runs both float and integer-quantized forward passes in parallel, scoring the model on how well its knowledge survives quantization. The weights are progressively pushed toward the 2-bit grid using straight-through estimators, with overflow penalties matching the Z80's 16-bit accumulator limits. By the end of training, the model has already adapted to its constraints, so no post-hoc quantization collapse.
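A rough PyTorch sketch of the straight-through-estimator piece, for readers unfamiliar with QAT (illustrative only; the real training loop also models the 16-bit accumulator limits and the trigram-hash input):

  import torch

  GRID = torch.tensor([-2.0, -1.0, 0.0, 1.0])  # the 2-bit weight grid

  def quantize_ste(w: torch.Tensor) -> torch.Tensor:
      """Snap weights to the nearest grid value, but pass gradients straight through."""
      idx = torch.argmin((w.unsqueeze(-1) - GRID).abs(), dim=-1)
      w_q = GRID[idx]
      return w + (w_q - w).detach()    # forward: quantized, backward: identity

  w = torch.randn(8, 16, requires_grad=True)   # float "shadow" weights
  x = torch.randn(4, 16)
  y = x @ quantize_ste(w).t()                  # forward pass sees the 2-bit weights
  loss = y.pow(2).mean()
  loss.backward()                              # gradients still flow to the float weights
  print(w.grad.shape)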
Eventually I ended up spending a few dollars on Claude API to generate 20 questions data (see examples/guess/GUESS.COM), I hope Anthropic won't send me a C&D for distilling their model against the ToS ;P
But anyway, happy code-golf season everybody :)
Show HN: Spacelist, a TUI for Aerospace window manager
Spacelist is an open-source terminal UI (TUI) for the AeroSpace tiling window manager on macOS, offering a lightweight way to work with its workspaces and windows from the terminal.
Show HN: swab – A configurable project cleaning tool
swab is a configurable command-line tool for cleaning up project directories, letting you define which generated files and build artifacts to remove.
Show HN: Agtrace – top and tail -f for AI coding agent sessions
Hey HN,
I built agtrace because I kept losing track of what was happening in my Claude Code sessions – context pressure, tool calls, costs.
It's basically `top` for AI coding agents:
- Live dashboard showing context window usage and activity
- Session history you can query and diff
- Works with Claude Code, Codex, and Gemini CLI
- 100% local, reads existing logs, no cloud
Install: `npm i -g @lanegrid/agtrace`
The core idea is pointer-based indexing (no log duplication) and schema-on-read (resilient to provider schema changes).
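For illustration, pointer-based indexing plus schema-on-read reduces to something like the Python sketch below (the JSONL layout and field names are assumptions, since each provider logs differently):

  import json
  from dataclasses import dataclass

  @dataclass
  class Pointer:
      path: str      # the provider's own log file, never copied
      offset: int    # byte offset of one JSONL record

  def index_jsonl(path: str) -> list[Pointer]:
      pointers, offset = [], 0
      with open(path, "rb") as f:
          for line in f:
              pointers.append(Pointer(path, offset))
              offset += len(line)
      return pointers

  def read_record(ptr: Pointer) -> dict:
      """Schema-on-read: parse lazily and tolerate unknown or missing fields."""
      with open(ptr.path, "rb") as f:
          f.seek(ptr.offset)
          raw = json.loads(f.readline())
      # Field names here are hypothetical; real providers use their own schemas.
      return {"tool": raw.get("tool_name"),
              "tokens": raw.get("usage", {}).get("total_tokens")}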
Would love feedback, especially from heavy Claude Code / Codex users.
Show HN: A solar system simulation in the browser
I didn't realize Universe Sandbox ran on macOS, and I was in the mood to play around a bit.
Some features it's got:
- Random system generation
- Sonification (super fun too)
- Habitability simulation (just for fun, don't cite this please)
- Replacing, spawning, and deleting objects
I've had tons of fun building this, so I hope someone else can share the joy. It's free and runs in the browser.
I'd love to hear any feedback. I think this is at a state where I might leave it as it is, but if people are interested in other features, maybe I'll keep working on it. I've kept saying I'll stop working on this for a while now though.
Show HN: Zs3 – S3 server in ~1K lines of Zig, 250KB binary, zero dependencies
Most S3 usage is PUT, GET, DELETE, LIST with basic auth. This does exactly that.
SigV4 auth, multipart uploads, range requests. Storage is just files on disk.
No versioning, no ACLs, no encryption. Use MinIO or AWS if you need those.
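For reference, verifying SigV4 means re-deriving the client's signature on the server and comparing it against the Authorization header. A Python sketch of the key-derivation chain (the canonical-request and string-to-sign construction is omitted, and the server itself is Zig):

  import hashlib
  import hmac

  def _hmac(key: bytes, msg: str) -> bytes:
      return hmac.new(key, msg.encode(), hashlib.sha256).digest()

  def sigv4_signature(secret_key: str, date: str, region: str,
                      string_to_sign: str, service: str = "s3") -> str:
      """Derive the SigV4 signing key and sign; the server recomputes this
      from the request and compares it with the signature the client sent."""
      k_date = _hmac(("AWS4" + secret_key).encode(), date)   # e.g. "20240101"
      k_region = _hmac(k_date, region)                       # e.g. "us-east-1"
      k_service = _hmac(k_region, service)
      k_signing = _hmac(k_service, "aws4_request")
      return hmac.new(k_signing, string_to_sign.encode(), hashlib.sha256).hexdigest()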
Show HN: I built an "ilovepdf" for CSV files (and I called it ILoveCSV)
We often find ourselves waiting for Excel just to do simple tasks like merging two CSVs, checking for duplicates, or converting a PDF table to CSV.
Thus, I built ilovecsv.net.
It's a suite of 30+ free tools that run in your browser.
Show HN: Neko.js, a recreation of the first virtual pet
Hi HN,
Here is a late Christmas present: I rebuilt Neko [1], the classic desktop cat that chases your mouse, as a tiny, dependency-free JavaScript library that runs directly on web pages.
Live demo: https://louisabraham.github.io/nekojs/
GitHub: https://github.com/louisabraham/nekojs
Drop-in usage is a single script tag:
<script src="https://louisabraham.github.io/nekojs/neko.js" data-autostart></script>
This is a fairly faithful recreation of Neko98: same state machine, same behaviors, same original 32×32 pixel sprites. It follows your cursor, falls asleep when idle, claws walls, and you can click it to cycle behavior modes.
What made this project interesting to me is how I built it. I started by feeding the original C++ source (from the Wayback Machine) to Claude and let it "vibe code" a first JS implementation. That worked surprisingly well as a starting point, but getting it truly accurate required a lot of manual fixes: rewriting movement logic, fixing animation timing, handling edge cases the AI missed, etc.
My takeaway: coding agents are very useful at resurrecting old codebases, and this is probably the best non-soulless use of AI for coding. It gets you 60–70% of the way there very fast, especially for legacy code that would otherwise rot unread. The last 30% still needs a human who cares about details.
The final result is ~38KB uncompressed (~14KB brotli), zero dependencies, and can be dropped into a page with a single <script> tag.
Happy to hear thoughts from desktop pets nostalgics!
[1]: https://en.wikipedia.org/wiki/Neko_(software)
Show HN: UpDown – Simple website uptime monitoring
Is the service down? Check in seconds if it's just you or everyone else.
Show HN: MiddleViewer – A native macOS app for technical interview feedback
Hi HN, I built a native macOS app to help interviewers write feedback. It listens to the conversation in real time, captures the code, lets you add your custom rules, and BOOM, it writes the feedback the way you want.
Show HN: Mysti – Claude, Codex, and Gemini debate your code, then synthesize
Hey HN! I'm Baha, creator of Mysti.
The problem: I pay for Claude Pro, ChatGPT Plus, and Gemini, but only one can help at a time. On tricky architecture decisions, I wanted a second opinion.
The solution: Mysti lets you pick any two AI agents (Claude Code, Codex, Gemini) to collaborate. They each analyze your request, debate approaches, then synthesize the best solution.
Your prompt → Agent 1 analyzes → Agent 2 analyzes → Discussion → Synthesized solution
Why this matters: each model has different training and blind spots. Two perspectives catch edge cases one would miss. It's like pair programming with two senior devs who actually discuss before answering.
What you get:
* Use your existing subscriptions (no new accounts, just your CLI tools)
* 16 personas (Architect, Debugger, Security Expert, etc)
* Full permission control from read-only to autonomous
* Unified context when switching agents
Tech: TypeScript, VS Code Extension API, shells out to claude-code/codex-cli/gemini-cli
License: BSL 1.1, free for personal and educational use, converts to MIT in 2030 (would love input on this, does it make sense to just go MIT?)
GitHub: https://github.com/DeepMyst/Mysti
Would love feedback on the brainstorm mode. Is multi-agent collaboration actually useful or am I just solving my own niche problem?
Show HN: Meter – Scrape sites and keep content in sync automatically (no LLM)
I built Meter to keep scraped website content in sync over time.
Meter uses an LLM once to generate a scraping plan, then runs entirely on raw HTTP requests (no Selenium, no LLMs) to periodically detect changes and re-extract content.
I built it after spending years writing custom scrapers: parsing sites, wiring the output into databases, and keeping everything working as pages evolved. Meter follows the same approach I use in practice — do heavy analysis up front, then run fast, cheap scrapes continuously.
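A stripped-down sketch of the recurring run phase (the plan format, URL, and extraction pattern below are placeholders; real plans come out of the one-time LLM analysis):

  import hashlib
  import re
  import time

  import requests

  plan = {                                   # produced once by the LLM, then reused
      "url": "https://example.com/pricing",
      "pattern": r"<h1[^>]*>(.*?)</h1>",     # placeholder extraction rule
  }
  last_hash = None

  while True:
      html = requests.get(plan["url"], timeout=15).text
      digest = hashlib.sha256(html.encode()).hexdigest()
      if digest != last_hash:                # cheap change detection first
          last_hash = digest
          extracted = re.findall(plan["pattern"], html, flags=re.S)
          print("content changed, re-extracted:", extracted[:1])
      time.sleep(3600)                       # periodic, no browser, no LLM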
I’m really interested to hear from people maintaining scraping jobs or RAG pipelines in this context. I’d love any feedback on the product - thanks!
Show HN: LoongArch Userspace Emulator
https://fwsgonzo.medium.com/notes-on-libloong-loongarch-64-b...
Show HN: I built a real-time IoT monitor bridging ESP8266, Go, and Next.js
I built Synx, a real-time temperature and humidity monitoring system that bridges hardware, systems programming, and modern web dev.
Architecture:
- ESP8266 + DHT11 sensor sending data via MQTT
- Go backend for data ingestion, writing to InfluxDB (time-series DB)
- Next.js frontend with low-latency WebSocket updates and historical charts

Key engineering decisions:
- MQTT over HTTP for true real-time push
- Server-side timestamping, since the ESP8266 has no RTC (see the sketch after this list)
- InfluxDB for efficient time-series storage
- Dual-channel: WebSocket for live data, REST API for history
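To illustrate the server-side timestamping decision, an ingestion stub could look like the following (Python and paho-mqtt here just for brevity; the real backend is Go, and the topic and payload shapes are assumptions):

  import json
  import time

  import paho.mqtt.client as mqtt  # paho-mqtt >= 2.0 for this constructor

  def on_connect(client, userdata, flags, reason_code, properties):
      client.subscribe("synx/dht11")           # assumed topic name

  def on_message(client, userdata, msg):
      reading = json.loads(msg.payload)        # e.g. {"temp": 22.4, "hum": 41}
      reading["ts"] = time.time()              # server-side timestamp: the sensor has no clock
      print("would write to InfluxDB:", reading)

  client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
  client.on_connect = on_connect
  client.on_message = on_message
  client.connect("localhost", 1883)
  client.loop_forever()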
I built this as a junior Go engineer to move beyond CRUD apps and work with IoT protocols, systems programming, and real-time data streaming.
Would love feedback on the architecture choices!
Show HN: Kuack – Run Kubernetes jobs in visitor browsers
WebAssembly makes it possible to run serious computation in browsers. I wanted to see if we could treat browsers as Kubernetes workers.
Kuack is a Virtual Kubelet provider that schedules Kubernetes workloads to browser tabs. Visitors' browsers connect, report capacity, and become ephemeral workers. It looks like a regular Kubernetes node - same kubectl commands, same OCI images, same workflows. The difference is that pods execute in browsers instead of servers. With multi-platform OCI images, Kubernetes can fall back to regular nodes if no agents are available.
It's designed for short-lived, stateless, CPU-heavy jobs: load testing from real networks, local data preprocessing, edge computing scenarios, machine learning tasks, etc.
Not a replacement for your cluster - just an extra option for workloads that benefit from browser execution.
Show HN: Ez FFmpeg – Video editing in plain English
I built a CLI tool that lets you do common video/audio operations without remembering ffmpeg syntax.
Instead of: ffmpeg -i video.mp4 -vf "fps=15,scale=480:-1:flags=lanczos" -loop 0 output.gif
You write: ff convert video.mp4 to gif
More examples:
ff compress video.mp4 to 10mb
ff trim video.mp4 from 0:30 to 1:00
ff extract audio from video.mp4
ff resize video.mp4 to 720p
ff speed up video.mp4 by 2x
ff reverse video.mp4
There are similar tools that use LLMs (wtffmpeg, llmpeg, ai-ffmpeg-cli), but they require API keys, cost money, and have latency.
Ez FFmpeg is different:
- No AI – just regex pattern matching (see the sketch below)
- Instant – no API calls
- Free – no tokens
- Offline – works without internet
It handles ~20 common operations that cover 90% of what developers actually do with ffmpeg. For edge cases, you still need ffmpeg directly.
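Under the hood the idea is just a table of regexes mapped to ffmpeg argument templates. A toy Python version of the gif rule above (not the ezff source, which is an npm package with far more forgiving patterns):

  import re
  import subprocess

  RULES = [
      # "convert video.mp4 to gif" -> the ffmpeg invocation shown earlier
      (re.compile(r"^convert\s+(\S+)\s+to\s+gif$", re.I),
       lambda m: ["ffmpeg", "-i", m.group(1), "-vf",
                  "fps=15,scale=480:-1:flags=lanczos", "-loop", "0",
                  m.group(1).rsplit(".", 1)[0] + ".gif"]),
  ]

  def run(command: str) -> None:
      for pattern, build in RULES:
          match = pattern.match(command.strip())
          if match:
              subprocess.run(build(match), check=True)
              return
      raise SystemExit("no rule matched; fall back to raw ffmpeg")

  run("convert video.mp4 to gif")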
Interactive mode (just type ff) shows media files in your current folder with typeahead search.
npm install -g ezff
Show HN: Matchstick Puzzle Game in the Browser
An older family member showed me these puzzle games that he was playing via YouTube videos. I wanted to make them more playable in a frictionless way, so I generated all possible combinations for these types of puzzles and put together an interface for them.
Show HN: Xcc700: Self-hosting mini C compiler for ESP32 (Xtensa) in 700 lines
Repo: https://github.com/valdanylchuk/xcc700
Hi Everyone! I just wrote my first compiler!
- single pass, recursive descent, direct emission
- generates REL ELF binaries, runnable using ESP-IDF elf_loader
- very basic features only, just enough for self-hosting
- treats the Xtensa CPU as a stack machine for simplicity, no register allocation / window usage
- compilable on Mac, probably also Linux, can cross-compile for esp32 there
- wrote for fun / cyberdeck project
Sample output from esp32:
xcc700.elf xcc700.c -o /d/cc.elf
[ xcc700 ] BUILD COMPLETED > OK
> IN : 700 Lines / 7977 Tokens
> SYM : 69 Funcs / 91 Globals
> REL : 152 Literals / 1027 Patches
> MEM : 1041 B .rodata / 17120 B .bss
> OUT : 27735 B .text / 33300 B ELF
[ 40 ms ] >> 17500 Lines/sec <<
My best hope is that some fork might grow into a unique nice language tailored to the esp32 platform. I think it is underrated in userland hobby projects.
Show HN: Phantas – A browser-based binaural strobe engine (Web Audio API)
Hi HN, I’m a new developer with Aphantasia (no mental imagery).
A side effect of this is that regaining focus after a distraction takes me a long time (the "23-minute lag"). I tried standard binaural beats, but I discovered a technical flaw: streaming compression (AAC/MP3 on Spotify/YouTube) often muddies the specific phase differences required for effective entrainment.
I realized that to get effective entrainment, I needed lossless audio. Since I couldn't stream lossless easily, I decided to generate it locally. I built Phantas – a browser engine that uses the Web Audio API to generate raw sine waves in real-time on the client side. This ensures mathematical precision with zero compression artifacts.
For audio it uses Native AudioContext for dual-oscillator generation (Left/Right channel split).
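The dual-oscillator setup is simply two sine waves a few hertz apart, one per ear. As an offline illustration of the same signal in Python (the app generates it live with Web Audio oscillator nodes; the 200 Hz carrier below is an assumption, and 14 Hz matches the default flicker rate):

  import wave

  import numpy as np

  RATE, SECONDS = 44100, 10
  BASE_HZ, BEAT_HZ = 200.0, 14.0                        # assumed carrier, 14 Hz beat
  t = np.arange(RATE * SECONDS) / RATE
  left = np.sin(2 * np.pi * BASE_HZ * t)                # left ear: 200 Hz
  right = np.sin(2 * np.pi * (BASE_HZ + BEAT_HZ) * t)   # right ear: 214 Hz
  stereo = (np.stack([left, right], axis=1) * 0.3 * 32767).astype(np.int16)

  with wave.open("binaural_14hz.wav", "wb") as f:
      f.setnchannels(2)
      f.setsampwidth(2)          # 16-bit PCM, no lossy compression
      f.setframerate(RATE)
      f.writeframes(stereo.tobytes())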
For visuals I pair the audio with a 490nm Cyan strobe. The hardest part was syncing the visual flash (using requestAnimationFrame) to the audio pulse without "drift" caused by JavaScript's event loop latency.
I built this primarily for myself. Subjectively, it has reduced my "ramp-up" time from ~20 minutes to about 5 minutes.
I’m releasing the generator for free (no login) to see if this works for others or if it's just my specific brain chemistry. I’d love feedback on:
- Audio/Visual Sync: Does the strobe feel tight on your specific browser/refresh rate?
- Intensity: Are the default 14Hz flickers too aggressive?
Show HN: Witr – Explain why a process is running on your Linux system
Hi HN,
I built a small Linux CLI tool called witr (Why Is This Running?).
The idea came from a situation most of us have hit: you log into a machine, see a process or port running, and immediately wonder why it exists, who started it, and what is keeping it alive right now.
witr traces a process, service, or port back to its origin and responsibility chain and explains it in a way that’s quick to read, especially when you’re debugging under pressure.
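Conceptually, the core of the responsibility chain is a walk up the /proc parent links plus service metadata. A minimal Python version of just the parent walk (witr itself layers on systemd units, containers, and port ownership):

  import os

  def parent_chain(pid: int) -> list[str]:
      """Walk PPid links in /proc until PID 1, collecting process names."""
      chain = []
      while pid > 0:
          fields = {}
          with open(f"/proc/{pid}/status") as f:
              for line in f:
                  key, _, value = line.partition(":")
                  fields[key] = value.strip()
          chain.append(f"{pid} {fields.get('Name', '?')}")
          if pid == 1:
              break
          pid = int(fields.get("PPid", 0))
      return chain

  print(" <- ".join(parent_chain(os.getpid())))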
This is v0.1.0. It’s intentionally small and focused. Feedback, criticism, and edge cases are very welcome.
Repo: https://github.com/pranshuparmar/witr
Show HN: Golazo – Live soccer updates in your terminal
Hey all!
I built Golazo because I wanted a minimal but effective way to get soccer live updates and catch up on finished matches right in my terminal. No browser tabs, no ads, no distractions: just clean match data where I already spend most of my day.
I couldn’t find any actively maintained tool like this, so I thought it could be cool to build something just for what I need. It was a great learning experience and if it’s useful to other people, then even better!
Current features:
- Live match tracking with real-time score updates (90-second polling intervals)
- Minute-by-minute match events (goals, cards, substitutions)
- Finished match statistics and full event history
- Goal notifications via beeep (macOS, Linux, Windows)
- 40+ leagues supported (and growing) with customizable preferences to limit what you fetch
- Smart caching: data cached for 5 minutes, polling only when viewing live matches
Technical details:
- Built with Go using Cobra for CLI, Charm’s Bubble Tea/Bubbles/Lip Gloss for the TUI
- Data from a trimmed-down version of the Fotmob API
- Cross-platform terminal rendering has been the biggest challenge – still working through some rough edges
Easy to install via install script or build from source. Pre-built binaries available for macOS, Windows, and Linux.
Would love to hear feedback from fellow terminal enthusiasts and soccer fans!