Show HN: OneCLI – Vault for AI Agents in Rust
We built OneCLI because AI agents are being given raw API keys. And it's going about as well as you'd expect. We figured the answer isn't "don't give agents access," it's "give them access without giving them secrets."
OneCLI is an open-source gateway that sits between your AI agents and the services they call. You store your real credentials once in OneCLI's encrypted vault, and give your agents placeholder keys. When an agent makes an HTTP call through the proxy, OneCLI matches the request by host/path, verifies the agent should have access, swaps the placeholder for the real credential, and forwards the request. The agent never touches the actual secret. It just uses CLI or MCP tools as normal.
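The swap step can be sketched in a few lines (illustrative Python only: OneCLI's proxy is Rust, and the vault layout, header format, and function names here are invented):

```python
# Hypothetical sketch of the credential-swap step. The vault maps a
# (host, path-prefix) pair to the placeholder the agent holds and the
# real secret it stands in for -- all values below are made up.
VAULT = {
    ("api.stripe.com", "/v1"): ("pk_placeholder_123", "sk_live_real_456"),
}

def swap_credentials(host: str, path: str, headers: dict) -> dict:
    """Replace a placeholder bearer token with the real credential,
    but only when the request matches a vault entry for this host/path."""
    for (vault_host, prefix), (placeholder, real) in VAULT.items():
        if host == vault_host and path.startswith(prefix):
            auth = headers.get("Authorization", "")
            if auth == f"Bearer {placeholder}":
                return {**headers, "Authorization": f"Bearer {real}"}
    return headers  # no match: forward unchanged (or reject, by policy)
```

The agent only ever sees the placeholder; the real key exists solely inside the proxy hop.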
Try it in one line: docker run --pull always -p 10254:10254 -p 10255:10255 -v onecli-data:/app/data ghcr.io/onecli/onecli
The proxy is written in Rust, the dashboard is Next.js, and secrets are AES-256-GCM encrypted at rest. Everything runs in a single Docker container with an embedded Postgres (PGlite), no external dependencies. Works with any agent framework (OpenClaw, NanoClaw, IronClaw, or anything that can set an HTTPS_PROXY).
We started with what felt most urgent: agents shouldn't be holding raw credentials. The next layer is access policies and audit, defining what each agent can call, logging everything, and requiring human approval before sensitive actions go through.
It's Apache-2.0 licensed. We'd love feedback on the approach, and we're especially curious how people are handling agent auth today.
GitHub: https://github.com/onecli/onecli Site: https://onecli.sh
Show HN: LogClaw – Open-source AI SRE that auto-creates tickets from logs
Hi HN, I'm Robel. I built LogClaw because I was tired of paying for Datadog and still waking up to pages that said "something is wrong" with no context.
LogClaw is an open-source log intelligence platform that runs on Kubernetes. It ingests logs via OpenTelemetry and detects anomalies using signal-based composite scoring — not simple threshold alerting. The system extracts 8 failure-type signals (OOM, crashes, resource exhaustion, dependency failures, DB deadlocks, timeouts, connection errors, auth failures), then combines them with statistical z-score analysis, blast radius, error velocity, and recurrence signals into a composite score. Critical failures (OOM, panics) trigger the immediate detection path in <100ms — before a time window even completes. Detection catches 99.8% of critical failures while filtering noise (validation errors and 404s don't fire incidents).
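As a rough sketch of the scoring idea (the weights, signal names, and threshold below are invented for illustration, not LogClaw's actual values):

```python
import statistics

CRITICAL = {"oom", "panic"}  # fast-path signals (illustrative)

def composite_score(latest: int, history: list, signals: set) -> float:
    """Toy composite scoring: a z-score on error volume plus fixed
    weights for extracted failure-type signals. Weights are invented."""
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history) or 1.0  # avoid divide-by-zero
    z = max((latest - mean) / stdev, 0.0)
    weight = {"oom": 5.0, "panic": 5.0, "timeout": 1.5, "db_deadlock": 2.0}
    return z + sum(weight.get(s, 1.0) for s in signals)

def should_fire(latest, history, signals, threshold=4.0) -> bool:
    # Critical failures bypass the time window entirely (the <100ms path).
    if signals & CRITICAL:
        return True
    return composite_score(latest, history, signals) >= threshold
```

The point of the composite: a lone validation error never crosses the threshold, while an OOM fires immediately without waiting for statistics to accumulate.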
Once an anomaly is confirmed, a 5-layer trace correlation engine groups logs by traceId, maps service dependencies, tracks error propagation cascades, and computes blast radius across affected services. Then the Ticketing Agent pulls the correlated timeline, sends it to an LLM for root cause analysis, and creates a deduplicated ticket on Jira, ServiceNow, PagerDuty, OpsGenie, Slack, or Zammad. The loop from log noise to a filed ticket is about 90 seconds.
Architecture: OTel Collector → Kafka (Strimzi, KRaft mode) → Bridge (Python, 4 concurrent threads: ETL, anomaly detection, OpenSearch indexing, trace correlation) → OpenSearch + Ticketing Agent. The AI layer supports OpenAI, Claude, or Ollama for fully air-gapped deployments. Everything deploys with a single Helm chart per tenant, namespace-isolated, no shared data plane.
To try it locally: https://docs.logclaw.ai/local-development
What it does NOT do yet:
- Metrics and traces — this is logs-only right now. Metrics support is on the roadmap.
- The anomaly detection is signal-based + statistical (composite scoring with z-score), not deep learning. It catches 99.8% of critical failures but won't detect subtle performance drift patterns yet.
- The dashboard is functional but basic. We use OpenSearch Dashboards for the heavy lifting.
Licensed Apache 2.0. The managed cloud version is $0.30/GB ingested if you don't want to self-host.
Repo: https://github.com/logclaw/logclaw
Show HN: Understudy – Teach a desktop agent by demonstrating a task once
I built Understudy because a lot of real work still spans native desktop apps, browser tabs, terminals, and chat tools. Most current agents live in only one of those surfaces.
Understudy is a local-first desktop agent runtime that can operate GUI apps, browsers, shell tools, files, and messaging in one session. The part I'm most interested in feedback on is teach-by-demonstration: you do a task once, the agent records screen video + semantic events, extracts the intent rather than coordinates, and turns it into a reusable skill.
Demo video: https://www.youtube.com/watch?v=3d5cRGnlb_0
In the demo I teach it: Google Image search -> download a photo -> remove background in Pixelmator Pro -> export -> send via Telegram. Then I ask it to do the same for Elon Musk. The replay isn't a brittle macro: the published skill stores intent steps, route options, and GUI hints only as a fallback. In this example it can also prefer faster routes when they are available instead of repeating every GUI step.
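A minimal sketch of what such a published skill might look like (the data model and names below are my invustration of the idea, not Understudy's actual format):

```python
# Hypothetical published skill: intent steps with ordered route options,
# where GUI replay is only the last-resort fallback.
SKILL = {
    "name": "search-download-send",
    "steps": [
        {"intent": "fetch_image", "routes": ["api", "gui"]},
        {"intent": "remove_background", "routes": ["cli", "gui"]},
        {"intent": "send_via_telegram", "routes": ["api", "gui"]},
    ],
}

def pick_route(step: dict, available: set) -> str:
    """Prefer the fastest available route; fall back to GUI hints last."""
    for route in step["routes"]:
        if route in available:
            return route
    return "gui"
```

This is why the replay isn't a brittle macro: if an API or CLI route exists at replay time, the GUI steps are never touched.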
Current state: macOS only. Layers 1-2 are working today; Layers 3-4 are partial and still early.
npm install -g @understudy-ai/understudy
understudy wizard
GitHub: https://github.com/understudy-ai/understudy
Happy to answer questions about the architecture, teach-by-demonstration, or the limits of the current implementation.
Show HN: An application stack Claude coded directly in LLVM IR
This repo is the result of a debate about what kind of programming language might be appropriate if humans are no longer the primary authors. Initially the thought was "LLMs can just generate binaries directly" (this was before a more famous person had the same idea). But that on reflection seems like a bad approach because languages exist to capture program semantics that are elided by translation to machine code. The next step was to wonder if an existing "machine readable" program representation can be the target for LLM code generation. It turns out yes. This project is the result of asking Claude to create an application stack entirely coded in LLVM's intermediate representation language.
Show HN: Hyper – Voice Notes for Whiteboarding Sessions
Hyper AI for Real Talk is a new app that uses AI technology to facilitate natural conversations on a variety of topics, enabling users to engage in thoughtful discussions and gain new perspectives.
Show HN: Cloud to Desktop in the Fastest Way
Native Desktop is a toolkit for building native desktop applications using modern web technologies without dealing with the usual complexity of desktop tooling. It focuses on providing a simple developer experience where you can scaffold, build, and distribute desktop apps using familiar workflows and a modular package ecosystem. Instead of forcing developers to manage complicated native environments, Native Desktop provides a CLI and a set of packages that handle the heavy lifting while keeping projects flexible and maintainable. The goal is to let developers move from an idea to a working desktop application quickly while still having full control over architecture and distribution. The project is designed for developers who already build with modern web stacks and want a straightforward way to turn those applications into desktop software without reinventing the entire toolchain.
Show HN: PipeStep – Step-through debugger for GitHub Actions workflows
Hey HN — I kept seeing developers describe the same frustration: the commit-push-wait-read-logs cycle when debugging CI pipelines. So I built PipeStep.
PipeStep parses your GitHub Actions YAML, spins up the right Docker container, and gives you a step-through debugger for your workflow's shell commands. You can:
- Pause before each step and inspect the container state
- Shell into the running container mid-pipeline (press I)
- Set breakpoints on specific steps (press B)
- Retry failed steps or skip past others
It deliberately does not try to replicate the full GitHub Actions runtime — no secrets, no matrix builds, no uses: action execution. For full local workflow runs, use act. PipeStep is for when things break and you need to figure out why without pushing 10 more commits. Think of it as gdb for your CI pipeline rather than a local GitHub runner.
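The pause/breakpoint/skip semantics can be modeled in a few lines (a toy sketch only; PipeStep's real step objects, container handling, and retry flow are much richer):

```python
# Toy step-through runner. `decide` stands in for the interactive prompt
# and returns "run" or "skip" for each paused step; names are invented.
def run_steps(steps, breakpoints=frozenset(), decide=lambda name: "run"):
    log = []
    for i, (name, fn) in enumerate(steps):
        if i in breakpoints:
            action = decide(name)        # user paused here (press B to set)
            if action == "skip":
                log.append((name, "skipped"))
                continue
        try:
            fn()
            log.append((name, "ok"))
        except Exception:
            log.append((name, "failed")) # real tool offers retry/shell here
            break
    return log
```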
pip install pipestep (v0.1.2) · Python 3.11+ · MIT · Requires Docker
Would love feedback, especially from people who've hit the same pain point. Known limitations are documented in the README + have some issues in there that I'd love eyeballs on!
Show HN: Verge Browser – a self-hosted isolated browser sandbox for AI agents
Built this because I wanted a better browser runtime for Openclaw, one that can run on any server, not only on a Mac mini. When it needs me to log in or perform some operation, I can simply take over via noVNC, then leave everything else to it.
Show HN: We analyzed 1,573 Claude Code sessions to see how AI agents work
We built rudel.ai after realizing we had no visibility into our own Claude Code sessions. We were using it daily but had no idea which sessions were efficient, why some got abandoned, or whether we were actually improving over time.
So we built an analytics layer for it. After connecting our own sessions, we ended up with a dataset of 1,573 real Claude Code sessions, 15M+ tokens, 270K+ interactions.
Some things we found that surprised us:
- Skills were only being used in 4% of our sessions
- 26% of sessions are abandoned, most within the first 60 seconds
- Session success rate varies significantly by task type (documentation scores highest, refactoring lowest)
- Error cascade patterns appear in the first 2 minutes and predict abandonment with reasonable accuracy
- There is no meaningful benchmark for 'good' agentic session performance, so we are building one.
The tool is free to use and fully open source, happy to answer questions about the data or how we built it.
Show HN: s@: decentralized social networking over static sites
Show HN: Autoschematic is a new infra-as-code tool built on reversible computing
Unlike Terraform and Pulumi, Autoschematic is built around a bidirectional (push-pull) state model. This means it can resolve state drift by "pulling" or by "pushing" (applying). That makes it a much better fit for use cases where configuration drift is common, like Snowflake. It also means you can import your existing infra automatically.
Show HN: A2Apex – Test, certify, and discover trusted A2A agents
Hey HN,
I built A2Apex (https://a2apex.io) — a testing and reputation platform for AI agents built on Google's A2A protocol.
The problem: AI agents are everywhere, but there's no way to verify they actually work. No standard testing. No directory of trusted agents. No reputation system.
What A2Apex does:
- Test — Point it at any A2A agent URL. We run 50+ automated compliance checks: agent card validation, live endpoint testing, state machine verification, streaming, auth, error handling.
- Certify — Get a 0-100 trust score with Gold/Silver/Bronze badges you can embed in your README or docs.
- Get Listed — Every tested agent gets a public profile page in the Agent Directory with trust scores, skills, test history, and embeddable badges.
Think of it as SSL Labs (testing) + npm (directory) + LinkedIn (profiles) — for AI agents.
Stack: Python/FastAPI, vanilla JS, SQLite. No frameworks, no build tools. Runs on a Mac mini in Wyoming.
Free: 5 tests/month. Pro: $29/mo. Startup: $99/mo. Try it at https://app.a2apex.io
I'm a dragline operator at a coal mine — built this on nights and weekends using Claude. Would love feedback from anyone building A2A agents or thinking about agent interoperability.
Show HN: Axe – A 12MB binary that replaces your AI framework
Axe is an open-source command-line tool that helps developers find and fix accessibility issues in their web applications. It provides a simple and efficient way to automate accessibility testing and ensure websites are inclusive and accessible to all users.
Show HN: A usage circuit breaker for Cloudflare Workers
I run 3mins.news (https://3mins.news), an AI news aggregator built entirely on Cloudflare Workers. The backend has 10+ cron triggers running every few minutes: RSS fetching, article clustering, LLM calls, email delivery.
The problem: the Workers Paid Plan has hard monthly limits (10M requests, 1M KV writes, 1M queue ops, etc.). There's no built-in "pause when you hit the limit"; CF just starts billing overages. KV writes cost $5/M over the cap, so a retry loop bug can get expensive fast.
AWS has Budget Alerts, but those are passive notifications; by the time you read the email, the damage is done. I wanted active, application-level self-protection.
So I built a circuit breaker that faces inward: instead of protecting against downstream failures (the Hystrix pattern), it monitors my own resource consumption and gracefully degrades before hitting the ceiling.
Key design decisions:
- Per-resource thresholds: Workers Requests ($0.30/M overage) only warns at 80%. KV Writes ($5/M overage) can trip the breaker at 90%. Not all resources are equally dangerous, so some are configured as warn-only (trip=null).
- Hysteresis: Trips at 90%, recovers at 85%. The 5% gap prevents oscillation; without it, the system flaps between tripped and recovered every check cycle.
- Fail-safe on monitoring failure: If the CF usage API is down, maintain last known state rather than assuming "everything is fine." A monitoring outage shouldn't mask a usage spike.
- Alert dedup: Per-resource, per-month. Without it you'd get ~8,600 identical emails for the rest of the month once a resource hits 80%.
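The hysteresis logic above can be sketched as follows (thresholds match the post; the class and method names are illustrative, not the actual implementation):

```python
# Sketch of per-resource hysteresis: warn at 80%, trip at 90%, recover
# at 85%. trip=None would make a resource warn-only, as in the post.
class UsageBreaker:
    def __init__(self, warn=0.80, trip=0.90, recover=0.85):
        self.warn, self.trip_at, self.recover_at = warn, trip, recover
        self.tripped = False

    def update(self, usage_ratio: float) -> str:
        if self.tripped:
            # Inside the 85-90% gap the breaker stays tripped: no flapping.
            if usage_ratio <= self.recover_at:
                self.tripped = False
                return "recovered"
            return "tripped"
        if self.trip_at is not None and usage_ratio >= self.trip_at:
            self.tripped = True
            return "tripped"
        return "warn" if usage_ratio >= self.warn else "ok"
```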
Implementation: every 5 minutes, queries CF's GraphQL API (requests, CPU, KV, queues) + Observability Telemetry API (logs/traces) in parallel, evaluates 8 resource dimensions, caches state to KV. Between checks it's a single KV read — essentially free.
When tripped, all scheduled tasks are skipped. The cron trigger still fires (you can't stop that), but the first thing it does is check the breaker and bail out if tripped.
It's been running in production for two weeks. Caught a KV reads spike at 82% early in the month, got one warning email, investigated, fixed the root cause, never hit the trip threshold.
The pattern should apply to any metered serverless platform (Lambda, Vercel, Supabase) or any API with budget ceilings (OpenAI, Twilio). The core idea: treat your own resource budget as a health signal, just like you'd treat a downstream service's error rate.
Happy to share code details if there's interest.
Full writeup with implementation code and tests: https://yingjiezhao.com/en/articles/Usage-Circuit-Breaker-for-Cloudflare-Workers
Show HN: Riventa.Dev – AI-native DevOps that acts, not just alerts
Hi HN,
Most DevOps tools are good at observing — they collect data, surface metrics, and send alerts. But the actual decision and action still falls on the engineer.
So I built Riventa.Dev — a DevOps platform where the AI (Riv) doesn't just surface data, it acts.
What Riv does today:
- Automatic PR review on every push — no manual trigger, no GitHub Actions boilerplate
- Predictive failure detection — catches patterns that historically cause prod failures
- DORA metrics dashboard with real pipeline data (MTTR, Deployment Frequency, Change Failure Rate)
- Security scanning: SAST, SBOM, dependency analysis — built in, not bolted on
- Works with GitHub, GitLab, and Bitbucket
Built solo, from scratch, with a focus on keeping things simple for the end user.
What I'd love feedback on: Is the AI-first positioning clear? Where does the UX feel rough?
Free to try — no credit card required.
Show HN: VaultLeap – USD accounts for founders outside the US
I'm Greg, co-founder of VaultLeap.
Built this for founders who can't get a US bank account. USD/EUR/MXN accounts with real ACH routing numbers and we have Visa cards coming soon.
If you've been cut off from Mercury or similar recently, DM me — happy to help some founders out.
Show HN: We open sourced Vapi – UI included
We kept hitting the same wall building voice AI systems. Pipecat and LiveKit are great projects, genuinely. But getting to production took us weeks of plumbing: wiring things together, handling barge-ins, setting up telephony, knowledge bases, and tool calls. And every time we needed to tweak agent behavior, we were back in the code and redeploying. We just wanted to change a prompt and test it in 30 seconds. That's why Vapi, Retell, etc. exist.
So we wrote the entire stack and open sourced it as a visual drag-and-drop builder for voice agents (think Vapi, or n8n for voice). Built on a Pipecat fork, BSD-2 licensed, no strings attached. Tool calls, knowledge base, variable extraction, voicemail detection, call transfer to humans, multilingual support, post-call QA, background noise suppression, and a website widget are all included. You're not paying per-minute fees to a middleman wrapping the same APIs you'd call directly.
You can set it up with a simple Docker command. It comes pre-wired with Deepgram, Cartesia, OpenAI, Speechmatics, and Sarvam for STT, the same for TTS, and OpenAI, Gemini, Groq, OpenRouter, and Azure on the LLM side. Telephony works out of the box with Twilio, Vonage, Cloudonix, and Asterisk for both inbound and outbound.
There's a hosted version at app.dograh.com if self-hosting isn't your thing.
Repo: github.com/dograh-hq/dograh Video walkthrough: https://youtu.be/sxiSp4JXqws
We built this out of frustration, not a thesis. The tool is free to use and fully open source (and will always remain so). Happy to answer questions about how we built it.
Show HN: A desktop app for managing Claude Code sessions
Switchboard is an open-source platform that enables real-time communication between diverse systems, facilitating seamless integration and data exchange across applications, devices, and services.
Show HN: Calyx – Ghostty-Based macOS Terminal with Liquid Glass UI
Calyx is an open-source software framework that provides a modular and extensible architecture for building complex software systems. It aims to simplify the development of large-scale applications by promoting modularity, flexibility, and reusability.
Show HN: Open-source browser for AI agents
Hi HN, I forked Chromium and built agent-browser-protocol (ABP) after noticing that most browser-agent failures aren’t really about the model misunderstanding the page. Instead, the problem is that the model is reasoning from a stale state.
ABP is designed to keep the acting agent synchronized with the browser at every step. After each action (click, type, etc), it freezes JavaScript execution and rendering, then captures the resulting state. It also compiles the notable events that occurred during that action loop, such as navigation, file pickers, permission prompts, alerts, and downloads, and sends that along with a screenshot of the frozen page state back to the agent.
The result is that browser interaction starts to feel more like a multimodal chat loop. The agent takes an action, gets back a fresh visual state and a structured summary of what happened, then decides what to do next from there. That fits much better with how LLMs already work.
A few common browser-use failures ABP helps eliminate: * A modal appears after the last Playwright screenshot and blocks the input the agent was about to use * Dynamic filters cause the page to reflow between steps * An autocomplete dropdown opens and covers the element the agent intended to click * alert() / confirm() interrupts the flow * Downloads are triggered, but the agent has no reliable way to know when they’ve completed
As proof, ABP with Opus 4.6 as the driver scores 90.5% on the Online Mind2Web benchmark. I think modern LLMs already understand websites; they just need a better tool to interact with them. Happy to answer questions about the architecture, forking Chromium, or anything else in the comments below.
Try it out: `claude mcp add browser -- npx -y agent-browser-protocol --mcp` (Codex/OpenCode instructions in the docs)
Demo video: https://www.loom.com/share/387f6349196f417d8b4b16a5452c3369
Show HN: Autoresearch@home
autoresearch@home is a collaborative research collective where AI agents share GPU resources to collectively improve a language model. Think SETI@home, but for model training.
How it works: Agents read the current best result, propose a hypothesis, modify train.py, run the experiment on your GPU, and publish results back. When an agent beats the current best validation loss, that becomes the new baseline for every other agent. Agents learn from great runs and failures, since we're using Ensue as the collective memory layer.
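The coordination rule reduces to a compare-and-update on validation loss. A toy in-memory version (the real collective coordinates through a shared memory layer, not a local dict):

```python
# Toy baseline update: a run becomes the shared baseline only if it
# beats the best validation loss seen so far. Names are illustrative.
best = {"val_loss": float("inf"), "run_id": None}

def publish(run_id: str, val_loss: float) -> bool:
    """Publish a result; return True if it becomes the new baseline."""
    if val_loss < best["val_loss"]:
        best.update(val_loss=val_loss, run_id=run_id)
        return True
    return False
```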
This project extends Karpathy's autoresearch by adding the missing coordination layer so agents can actually build on each other's work.
To participate, you need an agent and a GPU. The agent handles everything: cloning the repo, connecting to the collective, picking experiments, running them, publishing results, and asking you to verify you're a real person via email.
Send this prompt to your agent to get started: "Read https://github.com/mutable-state-inc/autoresearch-at-home, follow the instructions, join autoresearch, and start contributing."
This whole experiment is to prove that agents work better when they can build off other agents. The timeline is live, so you can watch experiments land in real time.
Show HN: Python DSL for system programming with manual memory and linear types
The article introduces PythoC, a Python-based code generation tool that allows users to generate C code from Python code. It highlights PythoC's ability to handle a wide range of Python constructs and its potential to streamline the development process for projects that require both Python and C components.
Show HN: I built a tool that watches webpages and exposes changes as RSS
I built Site Spy after missing a visa appointment slot because a government page changed and I didn’t notice for two weeks.
It watches webpages for changes and shows the result like a diff. The part I think HN might find interesting is that it can monitor a specific element on a page, not just the whole page, and it can expose changes as RSS feeds.
So instead of tracking an entire noisy page, you can watch just a price, a stock status, a headline, or a specific content block. When it changes, you can inspect the diff, browse the snapshot history, or follow the updates in an RSS reader.
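Element-level watching boils down to: extract the element, compare snapshots, surface the diff. A stdlib-only sketch (the extension uses a real DOM element picker; the regex below is only a stand-in for illustration):

```python
import difflib
import re

def extract_element(html: str, el_id: str) -> str:
    """Crude element extraction by id -- a stand-in for a DOM picker."""
    m = re.search(rf'<[^>]*id="{el_id}"[^>]*>(.*?)</', html, re.S)
    return m.group(1).strip() if m else ""

def element_diff(old_html: str, new_html: str, el_id: str) -> list:
    """Return only the changed lines for the watched element."""
    old = extract_element(old_html, el_id)
    new = extract_element(new_html, el_id)
    return [l for l in difflib.unified_diff([old], [new], lineterm="")
            if l.startswith(("+", "-")) and not l.startswith(("+++", "---"))]
```

The nice property: everything outside the watched element can churn freely without triggering a notification.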
It’s a Chrome/Firefox extension plus a web dashboard.
Main features:
- Element picker for tracking a specific part of a page
- Diff view plus full snapshot timeline
- RSS feeds per watch, per tag, or across all watches
- MCP server for Claude, Cursor, and other AI agents
- Browser push, Email, and Telegram notifications
Chrome: https://chromewebstore.google.com/detail/site-spy/jeapcpanag...
Firefox: https://addons.mozilla.org/en-GB/firefox/addon/site-spy/
Docs: https://docs.sitespy.app
I’d especially love feedback on two things:
- Is RSS actually a useful interface for this, or do most people just want direct alerts?
- Does element-level tracking feel meaningfully better than full-page monitoring?
Show HN: I built a proxy that keeps RAG working while hiding PII
Hey HN,
When you send real documents or customer data to LLMs, you face a painful tradeoff:
- Send raw text → privacy disaster
- Redact with [REDACTED] → embeddings break, RAG retrieval fails, multi-turn chats become useless, and the model often refuses to answer questions about the redacted entities.
The practical solution is consistent pseudonymization: the same real entity always maps to the same token (e.g. “Tata Motors” → ORG_7 everywhere). This preserves semantic meaning for vector search and reasoning, then you rehydrate the response so the provider never sees actual names, numbers or addresses.
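Consistent pseudonymization is essentially a stable forward map plus a reverse map for rehydration. A minimal sketch (Cloakpipe itself is Rust; the token format and function names here are illustrative):

```python
# Toy consistent pseudonymization: the same entity always maps to the
# same token, and the reverse map rehydrates the model's response.
forward, reverse = {}, {}

def pseudonymize(text: str, entities: dict) -> str:
    """entities maps surface form -> type, e.g. {"Tata Motors": "ORG"}."""
    for name, kind in entities.items():
        if name not in forward:
            # Number tokens per type: ORG_1, ORG_2, ...
            n = sum(v.startswith(kind + "_") for v in forward.values()) + 1
            token = f"{kind}_{n}"
            forward[name], reverse[token] = token, name
        text = text.replace(name, forward[name])
    return text

def rehydrate(text: str) -> str:
    """Swap tokens back so the caller sees real names again."""
    for token, name in reverse.items():
        text = text.replace(token, name)
    return text
```

Because ORG_1 means the same thing in every chunk and every turn, embeddings and multi-turn reasoning keep working while the provider never sees the real name.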
I got fed up fighting this with Presidio + custom glue (truncated RAG chunks, declension in Indian languages, fuzzy merging for typos/siblings, LLM confusion, percentages breaking math). So I built Cloakpipe as a tiny single-binary Rust proxy.
It does:
• Multi-layer detection (regex + financial rules + optional GLiNER2 ONNX NER + custom TOML)
• Consistent reversible mapping in an AES-256-GCM encrypted vault (memory zeroized)
• Smart rehydration that survives truncated chunks like [[ADDRESS:A00
• Built-in fuzzy resolution for typos and similar names
• Numeric reasoning mode so percentages still work for calculations
Fully open source (MIT), zero Python dependencies, <5 ms overhead.
Repo: https://github.com/rohansx/cloakpipe Demo & quick start: https://app.cloakpipe.co/demo
Would love feedback from anyone who has audited their RAG data flow or is struggling with the redaction-vs-semantics problem — especially in legal, fintech, or non-English workflows.
What approaches have you landed on?
Show HN: We wrote a custom microkernel for XR because Android felt too bloated
The article discusses the launch of Xeneva OS, a new open-source operating system aiming to provide a secure and privacy-focused alternative to mainstream options. It highlights the key features and goals of the project, including its focus on user privacy, decentralized architecture, and commitment to transparency.
Show HN: A context-aware permission guard for Claude Code
We needed something like --dangerously-skip-permissions that doesn’t nuke your untracked files, exfiltrate your keys, or install malware.
Claude Code's permission system is allow-or-deny per tool, but that doesn’t really scale. Deleting some files is fine sometimes. And git checkout is sometimes not fine. Even when you curate permissions, 200 IQ Opus can find a way around it. Maintaining a deny list is a fool's errand.
nah is a PreToolUse hook that classifies every tool call by what it actually does, using a deterministic classifier that runs in milliseconds. It maps commands to action types like filesystem_read, package_run, db_write, git_history_rewrite, and applies policies: allow, context (depends on the target), ask, or block.
Not everything can be classified, so you can optionally escalate ambiguous stuff to an LLM, but that’s not required. Anything unresolved you can approve, and configure the taxonomy so you don’t get asked again.
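The classify-then-apply-policy flow might look like this in miniature (the rules and taxonomy below are invented for illustration, not nah's actual tables):

```python
# Toy deterministic classifier in the spirit of a PreToolUse hook:
# map a command to an action type by prefix rules, then apply a policy.
RULES = [
    (("cat", "ls", "head", "grep"), "filesystem_read"),
    (("rm", "mv"), "filesystem_write"),
    (("git push --force", "git rebase"), "git_history_rewrite"),
    (("npx", "pip install"), "package_run"),
]
POLICY = {
    "filesystem_read": "allow",
    "filesystem_write": "context",   # depends on the target
    "git_history_rewrite": "ask",
    "package_run": "ask",
}

def classify(command: str) -> str:
    for prefixes, action in RULES:
        if command.startswith(prefixes):
            return action
    return "unknown"                 # candidate for optional LLM escalation

def decide(command: str) -> str:
    return POLICY.get(classify(command), "ask")
```

The key property is that the common path is a pure lookup, so the hook adds milliseconds, not an LLM round-trip.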
It works out of the box with sane defaults, no config needed. But you can customize it fully if you want to.
No dependencies, stdlib Python, MIT.
pip install nah && nah install
https://github.com/manuelschipper/nah
Show HN: Vanilla JavaScript refinery simulator built to explain job to my kids
Hi HN, I’m a chemical engineer and I manage logistics at a refinery down in Texas. Whenever I try to explain downstream operations to people outside the industry (including my kids), I usually get blank stares. I wanted to build something that visualizes the concepts and chemistry of a plant without completely dumbing down the science, so I put together this 5-minute browser game.
Here's a simple runthrough: https://www.youtube.com/watch?v=is-moBz6upU. I pushed to get through a full product pathway to show the V-804 replay.
I am not a software developer by trade, so I relied heavily on LLMs (Claude, Copilot, Gemini) to help write the code. What started as a simple concept turned into a 9,000-line single-page app built with vanilla HTML, CSS, and JavaScript. I used Matter.js for the 2D physics minigames.
A few technical takeaways from building this as a non-dev: * Managing the LLM workflow: Once the script.js file got large, letting the models output full file rewrites was a disaster (truncations, hallucinations, invisible curly-quote replacements that broke the JS). I started forcing them to act like patch files, strictly outputting "Find this exact block" and "Replace with this exact block." This was the only way to maintain improvements without breaking existing logic.
* Mapping physics to CSS: I wanted the minigames to visually sit inside circular CSS containers (border-radius: 50%). Matter.js doesn't natively care about your CSS. Getting the rigid body physics to respect a dynamic, responsive DOM boundary across different screen sizes required running an elliptical boundary equation (dx * dx) / (rx * rx) + (dy * dy) / (ry * ry) > 1 on every single frame. Maybe this was overkill to try to handle the resizing between phones and PCs.
* Mobile browser events: Forcing iOS Safari to ignore its default behaviors (double-tap zoom, swipe-to-scroll) while still allowing the user to tap and drag Matter.js objects required a ridiculous amount of custom event listener management and CSS (touch-action: manipulation; user-select: none;). I also learned that these actions very easily kill the mouse scroll making it very frustrating for PC users. I am hoping I hit a good middle ground.
* State management: Since I didn't use React or any frameworks, I had to rely on a global state object. Because the game jumps between different phases/minigames, I ran into massive memory leaks from old setInterval loops and Matter.js bodies stacking up. I had to build strict teardown functions to wipe the slate clean on every map transition.
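For reference, the per-frame boundary check mentioned above is just the ellipse equation (shown in Python for illustration; the game runs the same check in JavaScript every frame):

```python
def outside_ellipse(x, y, cx, cy, rx, ry) -> bool:
    """Containment test for a body at (x, y) against an elliptical
    container centered at (cx, cy) with radii rx, ry -- the same
    (dx*dx)/(rx*rx) + (dy*dy)/(ry*ry) > 1 check from the post."""
    dx, dy = x - cx, y - cy
    return (dx * dx) / (rx * rx) + (dy * dy) / (ry * ry) > 1
```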
The game walks through electrostatic desalting, fractional distillation, hydrotreating, catalytic cracking, and gasoline blending (hitting specific Octane and RVP specs).
It’s completely free, runs client-side, and has zero ads or sign-ups. I'd appreciate any feedback on the mechanics, or let me know if you manage to break the physics engine. Happy to answer any questions about the chemical engineering side of things as well.
For some reason the URL box is not getting recognized, maybe someone can help me feel less dumb there too. https://fuelingcuriosity.com/game
Show HN: Run an Agent Council of LLMs that debate and synthesize answers
I built a local-first UI that adds two reasoning architectures on top of small models like Qwen, Llama and Mistral: a sequential Thinking Pipeline (Plan → Execute → Critique) and a parallel Agent Council where multiple expert models debate in parallel and a Judge synthesizes the best answer. No API keys, zero .env setup — just pip install multimind. Benchmark on GSM8K shows measurable accuracy gains vs. single-model inference.
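The Agent Council pattern is conceptually simple: fan out to experts in parallel, then let a judge synthesize. A stub sketch (the `experts` and `judge` callables stand in for real model calls; multimind's actual API may differ):

```python
from concurrent.futures import ThreadPoolExecutor

def agent_council(question, experts, judge):
    """Run expert models in parallel, then synthesize via a judge.
    `experts` is a list of callables q -> answer; `judge` takes the
    question plus all answers and returns the final response."""
    with ThreadPoolExecutor(max_workers=len(experts)) as pool:
        answers = list(pool.map(lambda expert: expert(question), experts))
    return judge(question, answers)
```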
Show HN: SmartClip – fix multi-line shell commands before they hit your terminal
I kept copying multi-line commands from ChatGPT/Claude/READMEs and getting `command not found` errors when pasting into my terminal. Bracketed paste mode doesn't help — it prevents line-by-line execution, but the content itself still arrives broken (stray `$` prompts, split continuations, operators across lines).
SmartClip hooks into your shell's paste widget (zsh, bash, fish) and silently fixes multi-line commands before the shell sees them. You paste with Cmd+V as usual — no new keybindings, no daemon, no background process.
It uses score-based heuristics to detect shell commands (so it won't mangle your JSON or prose), joins lines intelligently (backslash continuations, pipes, `&&`), strips prompt characters, and validates everything with `bash -n` before inserting. If it's not confident or the fix has invalid syntax, it passes through unchanged.
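The core cleanup (strip prompts, join continuations and split operators) can be sketched like this — a simplified Python illustration; the real tool is score-based bash and also validates the result with `bash -n`:

```python
import re

def fix_paste(text: str) -> str:
    """Toy version of the cleanup pass: strip leading '$ ' prompts,
    join backslash continuations, and join lines split after a pipe
    or '&&'. Heuristics are simplified from the real implementation."""
    lines = [re.sub(r"^\s*\$\s?", "", l) for l in text.splitlines()]
    out = []
    for line in lines:
        prev = out[-1] if out else ""
        if prev.endswith("\\"):                          # continuation
            out[-1] = prev[:-1].rstrip() + " " + line.strip()
        elif prev.rstrip().endswith(("&&", "||", "|")):  # split operator
            out[-1] = prev.rstrip() + " " + line.strip()
        else:
            out.append(line)
    return "\n".join(out)
```

Non-command content (JSON, prose) has no prompts or trailing operators, so it falls through the `else` branch untouched.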
~150 lines of bash. Zero dependencies.
`brew install akshaydeshraj/smartclip` or `npm install -g smartclip-cli`
Show HN: XLA-based array computing framework for R
Anvil is an open-source, web-based framework that allows users to create and deploy full-stack Python applications without the need for HTML, CSS, or JavaScript. It provides a visual development environment and tools for building, testing, and deploying applications.