Show HN: I built a sub-500ms latency voice agent from scratch
I built a voice agent from scratch that averages ~400ms end-to-end latency (phone stop → first syllable). That’s with full STT → LLM → TTS in the loop, clean barge-ins, and no precomputed responses.
What moved the needle:
Voice is a turn-taking problem, not a transcription problem. VAD alone fails; you need semantic end-of-turn detection.
The system reduces to one loop: speaking vs listening. The two transitions - cancel instantly on barge-in, respond instantly on end-of-turn - define the experience.
STT → LLM → TTS must stream. Sequential pipelines are dead on arrival for natural conversation.
TTFT dominates everything. In voice, the first token is the critical path. Groq’s ~80ms TTFT was the single biggest win.
Geography matters more than prompts. Colocate everything or you lose before you start.
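The streaming point above can be sketched in miniature (stub coroutines stand in for the real STT/LLM/TTS services; names here are illustrative, not the project's actual API):

```python
import asyncio

async def fake_llm_tokens(prompt):
    # Stand-in for a streaming LLM: yields tokens one at a time.
    for tok in ["Hello", " there", "!"]:
        await asyncio.sleep(0)  # simulate network latency
        yield tok

async def fake_tts(text_chunk, out):
    # Stand-in for a streaming TTS engine: "speaks" each chunk immediately.
    out.append(text_chunk)

async def respond(prompt, out):
    # Stream tokens straight into TTS instead of waiting for the full reply,
    # so the first audible syllable only waits on the LLM's time-to-first-token.
    async for tok in fake_llm_tokens(prompt):
        await fake_tts(tok, out)

async def converse():
    out = []
    task = asyncio.create_task(respond("hi", out))
    # A barge-in would cancel mid-stream: task.cancel()
    await task
    return "".join(out)

print(asyncio.run(converse()))  # -> Hello there!
```

The key design point is that cancellation (barge-in) and streaming fall out of the same task structure: one in-flight response task you can cancel at any token boundary.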
GitHub Repo: https://github.com/NickTikhonov/shuo
Follow whatever I next tinker with: https://x.com/nick_tikhonov
Show HN: Govbase – Follow a bill from source text to news bias to social posts
Govbase tracks every bill, executive order, and federal regulation from official sources (Congress.gov, Federal Register, White House). An AI pipeline breaks each one down into plain-language summaries and shows who it impacts by demographic group.
It also ties each policy directly to bias-rated news coverage and politician social posts on X, Bluesky, and Truth Social. You can follow a single bill from the official text to how media frames it to what your representatives are saying about it.
Free on web, iOS, and Android.
https://govbase.com
I'd love feedback from the community, especially on the data pipeline or what policy areas/features you feel are missing.
Show HN: Visual Lambda Calculus – a thesis project (2008) revived for the web
Originally built as my master's thesis in 2008, Visual Lambda is a graphical environment where lambda terms are manipulated as draggable 2D structures ("Bubble Notation"), and beta-reduction is smoothly animated.
I recently revived and cleaned up the project and published it as an interactive web version: https://bntre.github.io/visual-lambda/
GitHub repo: https://github.com/bntre/visual-lambda
It also includes a small "Lambda Puzzles" challenge, where you try to extract a hidden free variable (a golden coin) by constructing the right term: https://github.com/bntre/visual-lambda#puzzles
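For readers unfamiliar with the operation being animated, beta-reduction is just substitution. A toy one-step reducer (capture-naive, purely illustrative, unrelated to the project's own implementation):

```python
# Terms: ('var', name) | ('lam', name, body) | ('app', func, arg)
def subst(term, name, value):
    kind = term[0]
    if kind == 'var':
        return value if term[1] == name else term
    if kind == 'lam':
        # Naive: assumes no variable capture (fine for this toy example).
        return term if term[1] == name else ('lam', term[1], subst(term[2], name, value))
    return ('app', subst(term[1], name, value), subst(term[2], name, value))

def beta(term):
    # One beta step at the root: (λx.body) arg  →  body[x := arg]
    if term[0] == 'app' and term[1][0] == 'lam':
        _, x, body = term[1]
        return subst(body, x, term[2])
    return term

ident = ('lam', 'x', ('var', 'x'))
print(beta(('app', ident, ('var', 'y'))))  # (λx.x) y  →  ('var', 'y')
```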
Show HN: Giggles – A batteries-included React framework for TUIs
i built a framework that handles focus and input routing automatically for you -- something born out of the things that ink leaves to you, and inspired by charmbracelet's bubbletea
- hierarchical focus and input routing: the hard part of terminal UIs, solved. define focus regions with useFocusScope, compose them freely -- a text input inside a list inside a panel just works. each component owns its keys; unhandled keypresses bubble up to the right parent automatically. no global handler like useInput, no coordination code
- 15 UI components: Select, TextInput, Autocomplete, Markdown, Modal, Viewport, CodeBlock (with diff support), VirtualList, CommandPalette, and more. sensible defaults, render props for full customization
- terminal process control: spawn processes and stream output into your TUI with hooks like useSpawn and useShellOut; hand off to vim, less, or any external program and reclaim control cleanly when they exit
- screen navigation, a keybinding registry (expose a ? help menu for free), and theming included
- react 19 compatible!
docs and live interactive demos in your browser: https://giggles.zzzzion.com
quick start: npx create-giggles-app
Show HN: Pianoterm – Run shell commands from your Piano. A Linux CLI tool
A little weekend project, made so I can pause/play/rewind directly on the piano, when learning a song by ear.
Show HN: uBlock filter list to blur all Instagram Reels
A filter list for uBO that blurs all video and non-follower content from Instagram. Works on mobile with uBO Lite.
related: https://news.ycombinator.com/item?id=47016443
Show HN: GitHub Commits Leaderboard
I made a public leaderboard for all-time GitHub commit contributions.
https://ghcommits.com
You can connect your GitHub account and see where you rank by total commit contributions.
It uses GitHub’s contribution data through GraphQL, so it is based on GitHub’s counting rules rather than raw git history. Private contributions can be included. Organization contributions only count if you grant org access during auth.
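The contribution data described above can be fetched with a query along these lines (the field names come from GitHub's public GraphQL schema; how ghcommits.com actually queries it is an assumption):

```python
import json

# Sketch of a GitHub GraphQL contributions query. The payload would be
# POSTed to https://api.github.com/graphql with a bearer token.
QUERY = """
query($login: String!) {
  user(login: $login) {
    contributionsCollection {
      totalCommitContributions
      restrictedContributionsCount
    }
  }
}
"""

def build_request(login: str, token: str) -> dict:
    # restrictedContributionsCount covers private contributions, which is
    # why the counts follow GitHub's rules rather than raw git history.
    return {
        "headers": {"Authorization": f"bearer {token}"},
        "body": json.dumps({"query": QUERY, "variables": {"login": login}}),
    }

req = build_request("octocat", "TOKEN")
```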
There is also a public read only API.
https://ghcommits.com/api
The main thing I learned building it is that commit counting sounds straightforward until you try to match how GitHub actually attributes contributions.
I’d be interested in feedback on whether commit contributions are the right ranking metric, and whether I should also support other contribution types.
Show HN: PHP 8 disable_functions bypass PoC
Show HN: Omni – Open-source workplace search and chat, built on Postgres
Hey HN!
Over the past few months, I've been working on building Omni - a workplace search and chat platform that connects to apps like Google Drive/Gmail, Slack, Confluence, etc. Essentially an open-source alternative to Glean, fully self-hosted.
I noticed that some orgs find Glean expensive and not very extensible. I wanted to build something that small to mid-size teams could run themselves, so I decided to build it all on Postgres (ParadeDB, to be precise) and pgvector. No Elasticsearch, no dedicated vector databases. I figured Postgres is more than capable of handling the level of scale required.
To bring up Omni on your own infra, all it takes is a single `docker compose up`, and some basic configuration to connect your apps and LLMs.
What it does:
- Syncs data from all connected apps and builds a BM25 index (ParadeDB) and HNSW vector index (pgvector)
- Hybrid search combines results from both
- Chat UI where the LLM has tools to search the index - not just basic RAG
- Traditional search UI
- Users bring their own LLM provider (OpenAI/Anthropic/Gemini)
- Connectors for Google Workspace, Slack, Confluence, Jira, HubSpot, and more
- Connector SDK to build your own custom connectors
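One common way to merge the two ranked lists from the hybrid search above is reciprocal rank fusion (whether Omni uses RRF or weighted score blending is an assumption on my part):

```python
def rrf(bm25_ranked, vector_ranked, k=60):
    # Reciprocal rank fusion: each list contributes 1/(k + rank) per document,
    # so documents ranked well by *both* BM25 and the vector index float up.
    scores = {}
    for ranked in (bm25_ranked, vector_ranked):
        for rank, doc_id in enumerate(ranked):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

print(rrf(["a", "b", "c"], ["b", "c", "d"]))  # -> ['b', 'c', 'a', 'd']
```

"b" wins because it appears near the top of both lists; documents found by only one index still rank, just lower.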
Omni is in beta right now, and I'd love your feedback, especially on the following:
- Has anyone tried self-hosting workplace search and/or AI tools, and what was your experience like?
- Any concerns with the Postgres-only approach at larger scales?
Happy to answer any questions!
The code: https://github.com/getomnico/omni (Apache 2.0 licensed)
Show HN: An Auditable Decision Engine for AI Systems
Show HN: Timber – Ollama for classical ML models, 336x faster than Python
Show HN: Web Audio Studio – A Visual Debugger for Web Audio API Graphs
Hi HN,
I’ve been working on a browser-based tool for exploring and debugging Web Audio API graphs.
Web Audio Studio lets you write real Web Audio API code, run it, and see the runtime graph it produces as an interactive visual representation. Instead of mentally tracking connect() calls, you can inspect the actual structure of the graph, follow signal flow, and tweak parameters while the audio is playing.
It includes built-in visualizations for common node types — waveforms, filter responses, analyser time and frequency views, compressor transfer curves, waveshaper distortion, spatial positioning, delay timing, and more — so you can better understand what each part of the graph is doing. You can also insert an AnalyserNode between any two nodes to inspect the signal at that exact point in the chain.
There are around 20 templates (basic oscillator setups, FM/AM synthesis, convolution reverb, IIR filters, spatial audio, etc.), so you can start from working examples and modify them instead of building everything from scratch.
Everything runs fully locally in the browser — no signup, no backend.
The motivation came from working with non-trivial Web Audio graphs and finding it increasingly difficult to reason about structure and signal flow once things grow beyond simple examples. Most tutorials show small snippets, but real projects quickly become harder to inspect. I wanted something that stays close to the native Web Audio API while making the runtime graph visible and inspectable.
This is an early alpha and desktop-only for now.
I’d really appreciate feedback — especially from people who have used Web Audio API in production or built audio tools. You can leave comments here, or use the feedback button inside the app.
https://webaudio.studio
Show HN: ApplyPilot – AI Agent that applies to jobs for you
Hey all, I recently open-sourced my project in the hope of helping others with their job hunting. I did not expect to get over 500 stars in a week and 500k views on Reddit.
What do you think?
P.S. Recruiters & Startup founders hit me up!
Show HN: Gapless.js – gapless web audio playback
Hey HN,
I just released v4 of my gapless playback library that I first built in 2017 for https://relisten.net. We stream concert recordings, where gapless playback is paramount.
It's built from scratch, backed by a rigid state machine (the sole dependency is xstate) and is already running in production over at Relisten.
The way it works is by preloading future tracks as raw buffers and scheduling them via the web audio API. It seamlessly transitions between HTML5 and web audio. We've used this technique for the last 9 years and it works fairly well. Occasionally it will blip slightly from HTML5->web audio, but there's not much to be done to avoid that (just when to do it - lotta nuance here). Once you get on web audio, everything should be clean.
Unfortunately, web audio support is still lacking on mobile, in which case you can just disable web audio and it'll fall back to full HTML5 playback (sans gapless). But if you serve a largely desktop audience, this is fine. On mobile, most people use our native app.
You can view a demo of the project at https://gapless.saewitz.com - just click on "Scarlet Begonias", seek halfway into the track (as it won't preload until >15s) and wait for "decoding" on "Fire on the Mountain" to switch to "ready". Then tap "skip to -2s" and hear the buttery smooth segue.
Show HN: Open-Source Postman for MCP
Show HN: Try Archetype 360 – AI‑powered personality test, 3× deeper than MBTI
Hi there, are you familiar with MBTI, DiSC, Big Five? Well I'm experimenting with a new kind of personality test, Archetype 360, and I'd love for you to try it for free and tell me what you think.
- 24 traits across 12 opposing pairs -- that's three times more dimensions than MBTI or DiSC, so you get a much more nuanced profile.
- A unique narrative report generated with AI (Claude), written in natural language instead of generic type blurbs.
- Your role, goals, and current challenges are blended into the analysis, so the report feels relevant to your real‑life context, not just abstract traits.
It's an "ephemeral app" so your report only lives in your browser, there's no login, and we don't store your data. Make sure you save the report as a PDF before you close the page.
What I'm looking for is honest feedback on your archetype and report:
- Did it feel accurate and "wow" or just meh?
- Did you learn anything unexpected about yourself?
- What did it miss or not go deep enough on?
I'll use your feedback to refine the prompts and the underlying model. Just comment here or use the feedback form in the app.
If there's enough interest, the next step is to combine Archetype 360 with a variation of Holland Codes / RIASEC (vocational interest areas) to create a full‑power professional orientation report.
What else would you love to see? Ideas welcome!
Best wishes, Daniel
Show HN: We filed 99 patents for deterministic AI governance (Prior Art vs. RLHF)
For the last few months, we've been working on a fundamental architectural shift in how autonomous agents are governed. The current industry standard relies almost entirely on probabilistic alignment (RLHF, system prompts, constitutional training). It works until it's jailbroken or the context window overflows. A statistical disposition is not a security boundary.
We've built an alternative: Deterministic Policy Gates. In our architecture, the LLM is completely stripped of execution power. It can only generate an "intent payload." That payload is passed to a process-isolated, deterministic execution environment where it is evaluated against a cryptographically hashed constraint matrix (the constitution). If it violates the matrix, it is blocked. Every decision is then logged to a Merkle-tree substrate (GitTruth) for an immutable audit trail.
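The audit-trail property described above depends only on standard Merkle-tree mechanics, which can be sketched briefly (a minimal sketch; GitTruth's actual on-disk layout is not described in this post):

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    # Hash each logged decision, then pairwise-hash up to a single root.
    # Changing any leaf changes the root, which is what makes the log
    # tamper-evident.
    level = [h(leaf.encode()) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:                 # duplicate last node on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()

root = merkle_root(["allow:read", "block:exec", "allow:write"])
```

Verifying the trail later means recomputing the root from the stored decisions and comparing it to the published one.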
We filed 99 provisional patents on this architecture starting January 10, 2026. Crucially, we embedded strict humanitarian use restrictions directly into the patent claims themselves (The Peace Machine Mandate) so the IP cannot legally be used for autonomous weapons, mass surveillance, or exploitation.
I wrote a full breakdown of the architecture, why probabilistic safety is a dead end, and the timeline of how we filed this before the industry published their frameworks: Read the full manifesto here: https://salvatoresystems.medium.com/the-death-of-probabilist...
The full patent registry is public here: https://aos-patents.com
I'm the founder and solo inventor. Happy to answer any questions about the deterministic architecture, the Merkle-tree state persistence, or the IP strategy of embedding ethics directly into patent claims.
Show HN: Punch card simulator and Fortran IV interpreter
Code: https://github.com/ehrlich-b/punchcards
Just for fun, I've only spent a few hours on it so far. What are everyone's punch card emulation needs?
Show HN: Writing App for Novelists
Show HN: Vanilla JavaScript refinery simulator built to explain job to my kids
Hi HN, I’m a chemical engineer and I manage logistics at a refinery down in Texas. Whenever I try to explain downstream operations to people outside the industry (including my kids), I usually get blank stares. I wanted to build something that visualizes the concepts and chemistry of a plant without completely dumbing down the science, so I put together this 5-minute browser game.
I am not a software developer by trade, so I relied heavily on LLMs (Claude, Copilot, Gemini) to help write the code. What started as a simple concept turned into a 9,000-line single-page app built with vanilla HTML, CSS, and JavaScript. I used Matter.js for the 2D physics minigames.
A few technical takeaways from building this as a non-dev:
* Managing the LLM workflow: Once the script.js file got large, letting the models output full file rewrites was a disaster (truncations, hallucinations, invisible curly-quote replacements that broke the JS). I started forcing them to act like patch files, strictly outputting "Find this exact block" and "Replace with this exact block." This was the only way to maintain improvements without breaking existing logic.
* Mapping physics to CSS: I wanted the minigames to visually sit inside circular CSS containers (border-radius: 50%). Matter.js doesn't natively care about your CSS. Getting the rigid body physics to respect a dynamic, responsive DOM boundary across different screen sizes required running an elliptical boundary equation (dx * dx) / (rx * rx) + (dy * dy) / (ry * ry) > 1 on every single frame. Maybe this was overkill to try to handle the resizing between phones and PCs.
* Mobile browser events: Forcing iOS Safari to ignore its default behaviors (double-tap zoom, swipe-to-scroll) while still allowing the user to tap and drag Matter.js objects required a ridiculous amount of custom event listener management and CSS (touch-action: manipulation; user-select: none;). I also learned that these actions very easily kill the mouse scroll making it very frustrating for PC users. I am hoping I hit a good middle ground.
* State management: Since I didn't use React or any frameworks, I had to rely on a global state object. Because the game jumps between different phases/minigames, I ran into massive memory leaks from old setInterval loops and Matter.js bodies stacking up. I had to build strict teardown functions to wipe the slate clean on every map transition.
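The per-frame ellipse containment test mentioned above is compact enough to show in full (a direct transcription of the formula from the post, with hypothetical variable names):

```python
def outside_ellipse(x, y, cx, cy, rx, ry):
    # A point is outside the elliptical container when its normalized
    # distance from the center (cx, cy) exceeds 1; rx/ry are the radii,
    # recomputed whenever the responsive CSS container resizes.
    dx, dy = x - cx, y - cy
    return (dx * dx) / (rx * rx) + (dy * dy) / (ry * ry) > 1

print(outside_ellipse(150, 100, 100, 100, 60, 40))  # -> False (inside)
print(outside_ellipse(170, 100, 100, 100, 60, 40))  # -> True  (outside)
```

When the test fires, the physics body would be nudged back toward the boundary; running it every frame is what keeps Matter.js bodies respecting the CSS `border-radius: 50%` shape.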
The game walks through electrostatic desalting, fractional distillation, hydrotreating, catalytic cracking, and gasoline blending (hitting specific Octane and RVP specs).
It’s completely free, runs client-side, and has zero ads or sign-ups. I'd appreciate any feedback on the mechanics, or let me know if you manage to break the physics engine. Happy to answer any questions about the chemical engineering side of things as well.
For some reason the URL box is not getting recognized, maybe someone can help me feel less dumb there too. www.fuelingcuriosity.com/game
Show HN: Agd – a content-addressed DAG for tracking what AI agents do
Every agent framework gives you logs (each its own flavour of logs). Unstructured text. Maybe some spans if you're lucky. When your agent breaks something, you get to grep through a wall of output in some proprietary system.
Why can't I easily see what the agent did to produce the PR? Why can't I restart a different agent from a saved state?
I got tired of this. agd is a content-addressed object store and causal DAG for recording agent behavior. It works like git (blobs, trees, refs, immutable objects identified by hash), but the object types are specific to LLM interactions: messages with typed content parts, tool definitions, and workspace snapshots that form a causal graph.
The core idea is stolen from git: the data model is the API. You interact with objects directly. Everything is immutable, everything has a hash.
An "action" in the DAG records: what messages the agent saw (observed state), what it produced (produced state), which tools were available, which model was used, and what caused it (parent actions).
Two states per action, both full snapshots, not deltas. You diff them to see what changed. What you get:
- agd log --session s1 shows one conversation's full causal chain
- agd show <action> --produced --expand shows the exact prompt and tool calls
- agd diff between two actions compares messages and workspace files
- agd rewind moves a session back to a previous point (old actions stay in the store)
- agd replay reconstructs the exact input state and reruns an action
It integrates as middleware/plugin. Wraps your existing LLM calls, records before/after state, doesn't require rewriting your agent code. The session ref (like a git branch pointer) auto-advances on each action, so parent tracking is a single ref read.
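The content-addressing scheme described above is the same trick git uses, and a toy version fits in a few lines (a sketch only; agd itself is in Zig and its object encoding will differ):

```python
import hashlib
import json

class Store:
    # Toy content-addressed object store: objects are immutable and
    # identified by the hash of their canonical serialized form.
    def __init__(self):
        self.objects = {}

    def put(self, obj: dict) -> str:
        data = json.dumps(obj, sort_keys=True).encode()
        oid = hashlib.sha256(data).hexdigest()
        self.objects[oid] = data          # identical content -> identical id
        return oid

    def get(self, oid: str) -> dict:
        return json.loads(self.objects[oid])

s = Store()
parent = s.put({"type": "action", "produced": "draft PR"})
child = s.put({"type": "action", "parents": [parent], "produced": "fix tests"})
# child's id commits to parent's id, which is what forms the causal DAG:
# you cannot alter a parent without changing every descendant's hash.
```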
Written in Zig. Most of the code was written with heavy AI assistance. The store is append-only loose files, like git's object directory. The write path targets low single-digit ms per action with batched fsync. Sessions can be bundled and published to a remote for sharing and viewing (working on a PoC of this; I have some open questions).
This is pre-alpha. The object model and write path work. Workspace capture, session sharing, and a Phoenix LiveView web viewer are functional.
Plenty is still missing: resume-from-any-point, proper diffing, the replay command. The on-disk format will probably change. I wouldn't depend on it for anything you care about yet.
What it does not do: orchestrate agents, make agents smarter, stream in real time, or replace your framework.
Looking for feedback, thoughts, contributors
Repo: https://github.com/frontman-ai/agd
Show HN: BoardMint – upload a PCB, get a standards-backed issue report in ~30s
Hi HN, I’m Pranav (founder). I design hardware and kept seeing a weird split: engineers don’t trust AI to design full PCBs (hidden assumptions, stackups, manufacturing constraints, EMI/return paths, and the cost of being even slightly wrong - why tools like Flux still aren’t widely trusted for full designs). But customers keep asking ChatGPT to “review” boards. They paste screenshots/Gerbers and expect a real sign-off. It often sounds right, but it can hallucinate or miss what actually causes respins.
The lesson from building this: the hard part isn’t more AI, it’s deterministic, reproducible detection with explicit assumptions, with AI used only to explain findings and suggest fixes.
Would love critique: what’s worth catching pre-fab, what’s too noisy, and what would make you trust this as a release gate?
Show HN: CrowPay – add x402 in a few lines, let AI agents pay per request
Hey HN – I've been building in the agent payments space for a while and the biggest bottleneck I see isn't the protocol (x402 is great) — it's that most API providers have no idea how to actually integrate it. The docs exist, the middleware exists, but going from "I have a REST API" to "agents can discover and pay for my endpoints" still takes way more work than it should.
CrowPay fixes that. We integrate x402 payment headers into your existing API and configure USDC settlement on Base. You go from zero to agent-accessible in days, not months.
How it works:
You have an existing API (Express, Next.js, Cloudflare Workers, any HTTP server). We add x402 payment capability: your endpoints return 402 with payment instructions, agents pay in USDC and get access, and the USDC settles to your wallet on Base. You get a dashboard with real-time analytics on agent payment volume.
That's it. You don't have to learn how x402 works under the hood, run blockchain infra, or change your API architecture.
Why this matters now: There are over 72,000 AI agents paying for services via x402, with $600M+ in annualized volume across 960+ live endpoints. Stripe just added x402 support. CoinGecko is charging agents $0.01/request. This is going from experiment to real money fast — and most API providers are leaving it on the table because the integration is still too annoying.
The agent-side story: We also handle wallet creation and spending budgets for agent builders. If you're building agents that need to pay for things, Crow lets you create a wallet, fund it, set spending limits, and let your agent loose. The agent gets a budget, and you don't wake up to a surprise $10k bill.
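The request flow described above has a simple shape (a toy sketch: the header name and payment-instruction fields here are simplified stand-ins, not the exact x402 wire format):

```python
# Hypothetical client-side shape of a pay-per-request exchange: the first
# request gets HTTP 402 with payment instructions, the client settles the
# payment, then retries with proof attached.
def call_with_payment(request, pay):
    resp = request(headers={})
    if resp["status"] == 402:
        receipt = pay(resp["payment_instructions"])   # settle USDC, get proof
        resp = request(headers={"X-PAYMENT": receipt})
    return resp

def fake_api(headers):
    # Stand-in server: demands payment unless proof is attached.
    if "X-PAYMENT" not in headers:
        return {"status": 402, "payment_instructions": {"amount": "0.01"}}
    return {"status": 200, "body": "data"}

result = call_with_payment(fake_api, lambda instructions: "receipt-123")
print(result["status"])  # -> 200
```

A spending budget for agents would sit inside `pay`: refuse to settle once the cumulative amount crosses the configured limit.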
What I'd love to hear:
What's keeping you from adding agent payments today? Is it technical complexity, uncertainty about demand, or something else?
Agent builders: how do you handle spending controls? Is "agent gets a wallet with a budget" the right abstraction?
Show HN: Aft, a Python toolkit to study agent behavior
aft was my stab at understanding what Claude is doing, and at having the language to reason about differences in model behavior when we put models through long agentic runs, change prompts, alter tools, etc. The intention of the toolkit is to provide an empirical measure of how agent behavior differs as things like environments, tools, and prompts change.
It gives you the tools to measure changes in behaviors that you define. That makes it more of a hypothesis-testing framework for what the agent is doing than a predictor of what the agent might do.
The reasoning and derivations behind these tools is given over here https://technoyoda.github.io/agent-science.html
Would be very happy to hear feedback and questions. (Please ignore the names given to theorization, it was for shits and giggles)
Show HN: Watchtower – see every API call Claude Code and Codex CLI make
Show HN: I spent a billion tokens bridging Elixir and WebAssembly
If you'd like to learn about how I corralled the agents to do this, check out this blog post https://vers.sh/blog/elixir-webassembly-billion-tokens
And, of course, here's the GitHub https://github.com/hdresearch/firebird
I also posted a thread to Twitter, where I ship nifty stuff, if fun uses of technology interest you: https://x.com/itisyev/status/2028543436388016510
Show HN: Smart-commit-rs – A zero-dependency Git commit tool in Rust
Hey yall,
I wanted to share *smart-commit-rs*, a fast, lightweight cross-platform TUI I built to facilitate managing git commits, including generating messages via LLMs.
Here are some of the main features:
* *Convention following:* The tool by default generates commit messages according to the Conventional Commits standard, and optionally according to Gitmoji as well.
* *Extensive LLM Provider Support:* Built-in integration for Groq (default), OpenAI, Anthropic, Gemini, Grok, DeepSeek, OpenRouter, Mistral, Together, Fireworks, and Perplexity.
* *Custom LLM Support:* You can easily point it to a custom provider, like a local Ollama instance, using OpenAI-compatible endpoints.
* *LLM Presets:* You can save various provider presets and freely switch between them. If your primary API throws an HTTP error, you can also configure a fallback rank so the tool automatically retries using the alternate LLM presets you've configured.
* *Diff Exclusion Globs:* You can exclude minified assets, `.lock` files, PDFs, etc., from the LLM analysis to save tokens, while still committing them.
* *Advanced Git Tooling:* Message generation doesn't work just with new commits. You can use `cgen alter <hash>` to rewrite a specific commit's message, `cgen undo` for a safe soft reset with Conventional Commit-compliant revert messages, or `cgen --tag` to automatically compute and create the next semantic version tag.
* *Commit Tracking:* It maintains a per-repository cache of managed commits, browsable via `cgen history` with native `git show` integration.
*A quick note on development:* While the project is rigorously human-reviewed and heavily backed by strict unit testing (matching CI coverage gates), a large portion of the boilerplate and core logic was written using agentic AI.
You can grab it via Cargo (`cargo install auto-commit-rs`) or via the curl/PowerShell install scripts in the repo: https://github.com/gtkacz/smart-commit-rs
Any feedback or contribution is more than welcome, and GitHub stars are greatly appreciated.
Thank you for your time!
Show HN: Valkey-powered semantic memory for Claude Code sessions
I wanted to explore Valkey's vector search capabilities for AI workloads and had been looking for an excuse to build something with Bun. This weekend I combined both into a memory layer for Claude Code.
https://github.com/BetterDB-inc/memory
The problem: Claude Code has CLAUDE.md and auto memory, but it's flat text with no semantic retrieval. You end up repeating context, especially around things not to do.
BetterDB Memory hooks into Claude Code's lifecycle (SessionStart, PostToolUse, PreToolUse, Stop), summarizes each session, generates embeddings, and stores everything in Valkey using FT.SEARCH with HNSW. Next session, relevant memories surface automatically via vector similarity search.
The interesting technical bit is that Valkey handles all of it - vector search, hash storage for structured memory data, sorted sets for knowledge indexing, lists for compression queues. No separate vector database.
There's also an aging pipeline that applies exponential decay to old memories based on recency, clusters similar ones via cosine similarity, and merges them to keep the memory store from growing unbounded.
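The recency decay in the aging pipeline can be sketched as a half-life applied on top of vector similarity (the half-life value and the multiplicative combination are assumptions for illustration):

```python
def decayed_score(similarity, age_seconds, half_life=7 * 24 * 3600):
    # Exponential decay: a memory's retrieval score halves every half_life
    # seconds, so stale memories gradually lose to fresh, equally-similar ones.
    return similarity * 0.5 ** (age_seconds / half_life)

fresh = decayed_score(0.9, 0)
week_old = decayed_score(0.9, 7 * 24 * 3600)   # exactly one half-life
print(fresh, week_old)  # -> 0.9 0.45
```

Memories whose decayed score falls below a threshold become candidates for the cosine-similarity clustering and merge step described above.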
Self-hostable with Ollama for embeddings and summarization, or plug in any LLM provider. Runs on Bun, ships as compiled binaries. MIT licensed.
Show HN: Audio Toolkit for Agents
Show HN: Logira – eBPF runtime auditing for AI agent runs
I started using Claude Code (claude --dangerously-skip-permissions) and Codex (codex --yolo) and realized I had no reliable way to know what they actually did. The agent's own output tells you a story, but it's the agent's story.
logira records exec, file, and network events at the OS level via eBPF, scoped per run. Events are saved locally in JSONL and SQLite. It ships with default detection rules for credential access, persistence changes, suspicious exec patterns, and more. Observe-only – it never blocks.
https://github.com/melonattacker/logira