Show stories

Show HN: OneCLI – Vault for AI Agents in Rust
guyb3 about 10 hours ago

We built OneCLI because AI agents are being given raw API keys. And it's going about as well as you'd expect. We figured the answer isn't "don't give agents access," it's "give them access without giving them secrets."

OneCLI is an open-source gateway that sits between your AI agents and the services they call. You store your real credentials once in OneCLI's encrypted vault, and give your agents placeholder keys. When an agent makes an HTTP call through the proxy, OneCLI matches the request by host/path, verifies the agent should have access, swaps the placeholder for the real credential, and forwards the request. The agent never touches the actual secret. It just uses CLI or MCP tools as normal.
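As a rough illustration of the match, verify, and swap flow described above, here is a minimal Python sketch. It is not OneCLI's code (the real proxy is written in Rust); the `Rule` fields, the glob-based path matching, and the `rewrite_headers` helper are all assumptions made for the example.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Rule:
    """Hypothetical vault entry: which agent may use which secret, and where."""
    agent: str
    host: str         # exact host the rule applies to
    path_glob: str    # glob matched against the request path
    placeholder: str  # fake key handed to the agent
    secret: str       # real credential, known only to the gateway


def rewrite_headers(agent: str, url: str, headers: dict, rules: list) -> dict:
    """Return a copy of headers with the placeholder swapped for the real secret.

    Raises PermissionError when no rule authorizes this agent for the target.
    """
    parts = urlsplit(url)
    for rule in rules:
        if (rule.agent == agent
                and rule.host == parts.hostname
                and fnmatchcase(parts.path, rule.path_glob)):
            return {name: value.replace(rule.placeholder, rule.secret)
                    for name, value in headers.items()}
    raise PermissionError(f"{agent} may not call {parts.hostname}{parts.path}")
```

The agent only ever sees the placeholder; the real secret appears solely in the headers that the gateway forwards upstream.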

Try it in one line: docker run --pull always -p 10254:10254 -p 10255:10255 -v onecli-data:/app/data ghcr.io/onecli/onecli

The proxy is written in Rust, the dashboard is Next.js, and secrets are AES-256-GCM encrypted at rest. Everything runs in a single Docker container with an embedded Postgres (PGlite), no external dependencies. Works with any agent framework (OpenClaw, NanoClaw, IronClaw, or anything that can set an HTTPS_PROXY).

We started with what felt most urgent: agents shouldn't be holding raw credentials. The next layer is access policies and audit, defining what each agent can call, logging everything, and requiring human approval before sensitive actions go through.

It's Apache-2.0 licensed. We'd love feedback on the approach, and we're especially curious how people are handling agent auth today.

GitHub: https://github.com/onecli/onecli

Site: https://onecli.sh

github.com
125 40
Show HN: Axe – A 12MB binary that replaces your AI framework
jrswab about 13 hours ago

I built Axe because I got tired of every AI tool trying to be a chatbot.

Most frameworks want a long-lived session with a massive context window doing everything at once. That's expensive, slow, and fragile. Good software is small, focused, and composable... AI agents should be too.

Axe treats LLM agents like Unix programs. Each agent is a TOML config with a focused job, such as code reviewer, log analyzer, or commit message writer. You can run them from the CLI, pipe data in, and get results out. You can chain them together with pipes, or trigger them from cron, git hooks, or CI.

What Axe is:

- 12MB binary, two dependencies. No framework, no Python, no Docker (unless you want it)

- Stdin piping. Something like `git diff | axe run reviewer` just works

- Sub-agent delegation. Agents call other agents via tool use, depth-limited

- Persistent memory. If you want, agents can remember across runs without you managing state

- MCP support. Axe can connect any MCP server to your agents

- Built-in tools, such as web_search and url_fetch, out of the box

- Multi-provider. Bring what you love to use: Anthropic, OpenAI, Ollama, or anything in models.dev format

- Path-sandboxed file ops. Keeps agents locked to a working directory

Written in Go. No daemon, no GUI.

What would you automate first?

github.com
152 98
Show HN: Detect any object in satellite imagery using a text prompt
eyasu6464 5 days ago

I built a browser-based tool that uses Vision-Language Models (VLMs) to detect objects in satellite imagery via natural language prompts. Draw a polygon on the map, type what you want to find (e.g., "swimming pools," "oil tanks," "solar panels"), and the system scans tile-by-tile, projecting bounding boxes back onto the globe as GeoJSON.

The pipeline: pick zoom level + prompt → slice map into mercantile tiles → feed each tile + prompt to VLM → create bounding boxes → project to WGS84 coordinates → render on map.
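The "project to WGS84" step in the pipeline above can be sketched with the standard Web Mercator (XYZ) tile formulas. This is an illustration under assumptions, not the tool's code: the 256-pixel tile size, the `(x0, y0, x1, y1)` pixel box layout, and both function names are hypothetical.

```python
import math

TILE_SIZE = 256  # assumed tile edge length in pixels


def pixel_to_lonlat(tx, ty, zoom, px, py):
    """Convert a pixel inside XYZ tile (tx, ty) at `zoom` to (lon, lat) in degrees."""
    n = 2 ** zoom
    x = tx + px / TILE_SIZE  # fractional tile column
    y = ty + py / TILE_SIZE  # fractional tile row, origin at the top
    lon = x / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * y / n))))
    return lon, lat


def bbox_to_geojson(tx, ty, zoom, box, label):
    """Turn a pixel-space box (x0, y0, x1, y1) into a GeoJSON polygon feature."""
    x0, y0, x1, y1 = box
    west, north = pixel_to_lonlat(tx, ty, zoom, x0, y0)
    east, south = pixel_to_lonlat(tx, ty, zoom, x1, y1)
    # Exterior ring in counter-clockwise order, closed on its first point.
    ring = [[west, south], [east, south], [east, north], [west, north], [west, south]]
    return {
        "type": "Feature",
        "properties": {"label": label},
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }
```

Each detection from a tile becomes one feature, so the per-tile results can be concatenated into a single FeatureCollection for rendering.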

No login required for the demo. Works well for distinct structures zero-shot; struggles with dense/occluded objects where narrow YOLO models still win.

useful-ai-tools.com
16 6
Show HN: Understudy – Teach a desktop agent by demonstrating a task once
bayes-song about 10 hours ago

I built Understudy because a lot of real work still spans native desktop apps, browser tabs, terminals, and chat tools. Most current agents live in only one of those surfaces.

Understudy is a local-first desktop agent runtime that can operate GUI apps, browsers, shell tools, files, and messaging in one session. The part I'm most interested in feedback on is teach-by-demonstration: you do a task once, the agent records screen video + semantic events, extracts the intent rather than coordinates, and turns it into a reusable skill.

Demo video: https://www.youtube.com/watch?v=3d5cRGnlb_0

In the demo I teach it: Google Image search -> download a photo -> remove background in Pixelmator Pro -> export -> send via Telegram. Then I ask it to do the same for Elon Musk. The replay isn't a brittle macro: the published skill stores intent steps, route options, and GUI hints only as a fallback. In this example it can also prefer faster routes when they are available instead of repeating every GUI step.

Current state: macOS only. Layers 1-2 are working today; Layers 3-4 are partial and still early.

    npm install -g @understudy-ai/understudy
    understudy wizard
GitHub: https://github.com/understudy-ai/understudy

Happy to answer questions about the architecture, teach-by-demonstration, or the limits of the current implementation.

github.com
88 38
Show HN: OpenClaw-class agents on ESP32 (and the IDE that makes it possible)
pycoclaw about 5 hours ago

PyCoCoLaw is an open-source Python library that provides a simple and intuitive interface for working with the Common Core of Learning (CCL) standards. The library offers functions for accessing and querying the CCL standards, making it a useful tool for educators and developers working with educational content.

pycoclaw.com
19 2
Show HN: Rudel – Claude Code Session Analytics
keks0r about 13 hours ago

We built rudel.ai after realizing we had no visibility into our own Claude Code sessions. We were using it daily but had no idea which sessions were efficient, why some got abandoned, or whether we were actually improving over time.

So we built an analytics layer for it. After connecting our own sessions, we ended up with a dataset of 1,573 real Claude Code sessions, 15M+ tokens, 270K+ interactions.

Some things we found that surprised us:

- Skills were only being used in 4% of our sessions

- 26% of sessions are abandoned, most within the first 60 seconds

- Session success rate varies significantly by task type (documentation scores highest, refactoring lowest)

- Error cascade patterns appear in the first 2 minutes and predict abandonment with reasonable accuracy

- There is no meaningful benchmark for 'good' agentic session performance, we are building one.

The tool is free to use and fully open source, happy to answer questions about the data or how we built it.

github.com
127 73
lubujackson 3 days ago

Show HN: Web-based ANSI art viewer

My love letter to ANSI art. Full width rendering, scrolling by baud rate, text is selectable, and more.

There are some example links at the top if you're feeling lucky.

sure.is
24 7
remywang 1 day ago

Show HN: s@: decentralized social networking over static sites

satproto.org
399 208
Show HN: An application stack Claude coded directly in LLVM IR
dboreham about 9 hours ago

This repo is the result of a debate about what kind of programming language might be appropriate if humans are no longer the primary authors. Initially the thought was "LLMs can just generate binaries directly" (this was before a more famous person had the same idea). But that on reflection seems like a bad approach because languages exist to capture program semantics that are elided by translation to machine code. The next step was to wonder if an existing "machine readable" program representation can be the target for LLM code generation. It turns out yes. This project is the result of asking Claude to create an application stack entirely coded in LLVM's intermediate representation language.

github.com
8 0
Show HN: Slop or not – can you tell AI writing from human in everyday contexts?
eigen-vector about 5 hours ago

I’ve been building a crowd-sourced AI detection benchmark. You see two responses to the same prompt: one from a real human (pre-2022, provably predating the prevalence of AI slop on the internet) and one generated by AI. You pick the slop. Three wrong and you’re out.

The dataset: 16K human posts from Reddit, Hacker News, and Yelp, each paired with AI generations from 6 models across two providers (Anthropic and OpenAI) at three capability tiers. Same prompt, length-matched, no adversarial coaching — just the model’s natural voice with platform context. Every vote is logged with model, tier, source, response time, and position.

Early findings from testing: Reddit posts are easy to spot (humans are too casual for AI to mimic), HN is significantly harder.

I'll be releasing the full dataset on HuggingFace and I'll publish a paper if I can get enough data via this crowdsourced study.

If you play the HN-only mode, you’re helping calibrate how detectable AI is on here specifically.

Would love feedback on the pairs — are any trivially obvious? Are some genuinely hard?

slop-or-not.space
11 14
Show HN: PipeStep – Step-through debugger for GitHub Actions workflows
photobombastic about 10 hours ago

Hey HN — I kept seeing developers describe the same frustration: the commit-push-wait-read-logs cycle when debugging CI pipelines. So I built PipeStep.

PipeStep parses your GitHub Actions YAML, spins up the right Docker container, and gives you a step-through debugger for your run: shell commands.

You can:

1. Pause before each step and inspect the container state.

2. Shell into the running container mid-pipeline (press I).

3. Set breakpoints on specific steps (press B).

4. Retry failed steps or skip past others.

It deliberately does not try to replicate the full GitHub Actions runtime — no secrets, no matrix builds, no uses: action execution. For full local workflow runs, use act. PipeStep is for when things break and you need to figure out why without pushing 10 more commits. Think of it as gdb for your CI pipeline rather than a local GitHub runner.
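To make the scope concrete, here is a small Python sketch of selecting only the `run:` steps of a job and skipping `uses:` steps. It is not PipeStep's implementation: it assumes the workflow YAML has already been parsed into a plain dict by some YAML library, and `runnable_steps` is a hypothetical helper name.

```python
def runnable_steps(workflow: dict, job_id: str):
    """Yield (index, name, command) for each `run:` step of one job.

    Steps that call an action through `uses:` carry no `run` key and are
    skipped, mirroring the "no uses: action execution" limitation.
    """
    steps = workflow["jobs"][job_id].get("steps", [])
    for index, step in enumerate(steps):
        if "run" not in step:
            continue
        name = step.get("name", f"step {index}")
        yield index, name, step["run"]
```

A debugger loop would then pause before each yielded command, which is where inspection, breakpoints, and retries can hook in.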

pip install pipestep (v0.1.2) · Python 3.11+ · MIT · Requires Docker

Would love feedback, especially from people who've hit the same pain point. Known limitations are documented in the README, and there are some open issues in the repo that I'd love eyeballs on!

github.com
7 4
whilo about 6 hours ago

Show HN: Stratum – SQL that branches and beats DuckDB on 35/46 1T benchmarks

Stratum is an open-source, high-performance analytics engine designed for real-time data processing and large-scale data analysis. The article discusses Stratum's architecture, its use of the Datalog query language, and its ability to handle complex queries and perform efficient incremental updates.

datahike.io
8 3
brucehsu about 6 hours ago

Show HN: Codelegate, keyboard-driven coding agent orchestrator GUI for Mac/Linux

Do we really need another agent orchestrator? Probably not. But I couldn't find one that matched how I actually work with coding agent CLIs, so I built my own.

Codelegate is a desktop app (Tauri 2 + React + xterm.js) that organizes agent sessions into a keyboard-first workspace. I built it to solve a few specific frustrations:

1. I want to navigate everything with both hands on the keyboard. Sessions switch with `Alt+1..9`, panes with `Alt+A/G/T`. No mouse required.

2. I work on the same repo in parallel using Git worktrees. Codelegate has a built-in worktree flow: create an isolated branch per agent, auto-cleanup on session end.

3. I want to keep using my CLI tools (zellij, etc.) alongside agents, not replace them.

4. I need it on both macOS and Linux.

Each session gives you three panes: Agent, Terminal, and Git. The Git pane handles diff review with syntax highlighting, bulk stage/unstage, commit, and amend. Sessions are grouped by repository in the sidebar.

Currently supports Claude Code and Codex CLI, but anything that runs in a shell can work.

This is v1.0.0 and it only covers the agent CLIs and features I use the most. It's licensed under GPLv3, so it's meant to be forked and shaped into your own workflow.

Hope you enjoy using it or making it your own!

codelegate.dev
3 0
Show HN: Open-source browser for AI agents
theredsix 1 day ago

Hi HN, I forked Chromium and built agent-browser-protocol (ABP) after noticing that most browser-agent failures aren’t really about the model misunderstanding the page. Instead, the problem is that the model is reasoning from a stale state.

ABP is designed to keep the acting agent synchronized with the browser at every step. After each action (click, type, etc), it freezes JavaScript execution and rendering, then captures the resulting state. It also compiles the notable events that occurred during that action loop, such as navigation, file pickers, permission prompts, alerts, and downloads, and sends that along with a screenshot of the frozen page state back to the agent.

The result is that browser interaction starts to feel more like a multimodal chat loop. The agent takes an action, gets back a fresh visual state and a structured summary of what happened, then decides what to do next from there. That fits much better with how LLMs already work.

A few common browser-use failures ABP helps eliminate:

* A modal appears after the last Playwright screenshot and blocks the input the agent was about to use

* Dynamic filters cause the page to reflow between steps

* An autocomplete dropdown opens and covers the element the agent intended to click

* alert() / confirm() interrupts the flow

* Downloads are triggered, but the agent has no reliable way to know when they’ve completed

As proof, ABP with Opus 4.6 as the driver scores 90.5% on the Online Mind2Web benchmark. I think modern LLMs already understand websites; they just need a better tool to interact with them. Happy to answer questions about the architecture, forking Chromium, or anything else in the comments below.

Try it out: `claude mcp add browser -- npx -y agent-browser-protocol --mcp` (Codex/OpenCode instructions in the docs)

Demo video: https://www.loom.com/share/387f6349196f417d8b4b16a5452c3369

github.com
143 52
Show HN: Every Developer in the World, Ranked
ejc about 6 hours ago

We've indexed 5M+ GitHub users and built a ranking system that goes beyond follower counts. The idea started from frustration: GitHub is terrible for discovery. You can't answer "who are the best Python developers in Berlin?" or "who identified transformer-based models before they blew up?" without scraping everything yourself. So we did.

What we built:

- CodeRank score: a composite reputation signal across contributions, repository impact, and community influence

- Tastemaker score: did you star repos at 50 stars that now have 50,000? We track that

- Comparison Builder: allows users to build comparison graphics to compare devs, repos, orgs, etc.

- Sharable Profile Graphics: share your scores and flex on your coworkers or the community at large

Some things we found interesting: Most-followed ≠ most influential. The correlation between follower count and tastemaker score is surprisingly weak. There's a whole tier of developers who consistently find projects weeks and months before they trend, with almost no public following.

Location data on GitHub is a disaster. We spent an embarrassing amount of time on normalization and it's still not anywhere near perfect.

Try it: https://coderank.me/

If your profile doesn't have a score, signing in will trigger scoring for your account.

Curious what the HN crowd thinks about the ranking methodology, happy to get into the weeds on any of it.

coderank.me
8 4
Show HN: I built a tool that watches webpages and exposes changes as RSS
vkuprin 1 day ago

I built Site Spy after missing a visa appointment slot because a government page changed and I didn’t notice for two weeks.

It watches webpages for changes and shows the result like a diff. The part I think HN might find interesting is that it can monitor a specific element on a page, not just the whole page, and it can expose changes as RSS feeds.

So instead of tracking an entire noisy page, you can watch just a price, a stock status, a headline, or a specific content block. When it changes, you can inspect the diff, browse the snapshot history, or follow the updates in an RSS reader.
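The diff step for a single watched element can be sketched with Python's standard `difflib`. This is only an illustration, not Site Spy's code: it assumes the element's text has already been extracted from two snapshots, and `element_diff` is a hypothetical helper name.

```python
import difflib


def element_diff(old_text: str, new_text: str) -> list:
    """Return unified-diff lines between two snapshots of one element's text.

    An empty list means the watched element did not change.
    """
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile="previous",
        tofile="current",
        lineterm="",
    ))
```

Diffing only the selected element's text is what keeps unrelated page churn (ads, timestamps, rotating banners) out of the change feed.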

It’s a Chrome/Firefox extension plus a web dashboard.

Main features:

- Element picker for tracking a specific part of a page

- Diff view plus full snapshot timeline

- RSS feeds per watch, per tag, or across all watches

- MCP server for Claude, Cursor, and other AI agents

- Browser push, Email, and Telegram notifications

Chrome: https://chromewebstore.google.com/detail/site-spy/jeapcpanag...

Firefox: https://addons.mozilla.org/en-GB/firefox/addon/site-spy/

Docs: https://docs.sitespy.app

I’d especially love feedback on two things:

- Is RSS actually a useful interface for this, or do most people just want direct alerts?

- Does element-level tracking feel meaningfully better than full-page monitoring?

sitespy.app
306 79
Show HN: Cloud to Desktop in the Fastest Way
lasgawe about 10 hours ago

Native Desktop is a toolkit for building native desktop applications using modern web technologies without dealing with the usual complexity of desktop tooling. It focuses on providing a simple developer experience where you can scaffold, build, and distribute desktop apps using familiar workflows and a modular package ecosystem. Instead of forcing developers to manage complicated native environments, Native Desktop provides a CLI and a set of packages that handle the heavy lifting while keeping projects flexible and maintainable. The goal is to let developers move from an idea to a working desktop application quickly while still having full control over architecture and distribution. The project is designed for developers who already build with modern web stacks and want a straightforward way to turn those applications into desktop software without reinventing the entire toolchain.

nativedesktop.com
3 3
austinbaggio 1 day ago

Show HN: Autoresearch@home

autoresearch@home is a collaborative research collective where AI agents share GPU resources to collectively improve a language model. Think SETI@home, but for model training.

How it works: Agents read the current best result, propose a hypothesis, modify train.py, run the experiment on your GPU, and publish results back. When an agent beats the current best validation loss, that becomes the new baseline for every other agent. Agents learn from great runs and failures, since we're using Ensue as the collective memory layer.

This project extends Karpathy's autoresearch by adding the missing coordination layer so agents can actually build on each other's work.

To participate, you need an agent and a GPU. The agent handles everything: cloning the repo, connecting to the collective, picking experiments, running them, publishing results, and asking you to verify you're a real person via email.

Send this prompt to your agent to get started: Read https://github.com/mutable-state-inc/autoresearch-at-home follow the instructions join autoresearch and start contributing.

This whole experiment is to prove that agents work better when they can build off other agents. The timeline is live, so you can watch experiments land in real time.

ensue-network.ai
74 19
fuelingcurious 1 day ago

Show HN: Vanilla JavaScript refinery simulator built to explain job to my kids

Hi HN, I’m a chemical engineer and I manage logistics at a refinery down in Texas. Whenever I try to explain downstream operations to people outside the industry (including my kids), I usually get blank stares. I wanted to build something that visualizes the concepts and chemistry of a plant without completely dumbing down the science, so I put together this 5-minute browser game.

Here's a simple runthrough: https://www.youtube.com/watch?v=is-moBz6upU. I pushed to get through a full product pathway to show the V-804 replay.

I am not a software developer by trade, so I relied heavily on LLMs (Claude, Copilot, Gemini) to help write the code. What started as a simple concept turned into a 9,000-line single-page app built with vanilla HTML, CSS, and JavaScript. I used Matter.js for the 2D physics minigames.

A few technical takeaways from building this as a non-dev:

* Managing the LLM workflow: Once the script.js file got large, letting the models output full file rewrites was a disaster (truncations, hallucinations, invisible curly-quote replacements that broke the JS). I started forcing them to act like patch files, strictly outputting "Find this exact block" and "Replace with this exact block." This was the only way to maintain improvements without breaking existing logic.

* Mapping physics to CSS: I wanted the minigames to visually sit inside circular CSS containers (border-radius: 50%). Matter.js doesn't natively care about your CSS. Getting the rigid body physics to respect a dynamic, responsive DOM boundary across different screen sizes required running an elliptical boundary equation (dx * dx) / (rx * rx) + (dy * dy) / (ry * ry) > 1 on every single frame. Maybe this was overkill to try to handle the resizing between phones and PCs.

* Mobile browser events: Forcing iOS Safari to ignore its default behaviors (double-tap zoom, swipe-to-scroll) while still allowing the user to tap and drag Matter.js objects required a ridiculous amount of custom event listener management and CSS (touch-action: manipulation; user-select: none;). I also learned that these actions very easily kill the mouse scroll making it very frustrating for PC users. I am hoping I hit a good middle ground.

* State management: Since I didn't use React or any frameworks, I had to rely on a global state object. Because the game jumps between different phases/minigames, I ran into massive memory leaks from old setInterval loops and Matter.js bodies stacking up. I had to build strict teardown functions to wipe the slate clean on every map transition.
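The "Find this exact block" / "Replace with this exact block" workflow from the first takeaway can be enforced mechanically. The sketch below is an assumption about how one might do that in Python, not the author's tooling; `apply_exact_patch` is a hypothetical helper that refuses to apply a patch unless the find block matches exactly once.

```python
def apply_exact_patch(source: str, find: str, replace: str) -> str:
    """Replace one exact occurrence of `find` in `source` with `replace`.

    Raises ValueError when the block is missing or ambiguous, so a
    hallucinated or stale "find" block fails loudly instead of silently
    corrupting the file.
    """
    count = source.count(find)
    if count != 1:
        raise ValueError(f"expected exactly one match for the find block, found {count}")
    return source.replace(find, replace, 1)
```

Because matching is byte-for-byte, a model that swaps straight quotes for curly quotes in the find block produces zero matches and the patch is rejected.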

The game walks through electrostatic desalting, fractional distillation, hydrotreating, catalytic cracking, and gasoline blending (hitting specific Octane and RVP specs).

It’s completely free, runs client-side, and has zero ads or sign-ups. I'd appreciate any feedback on the mechanics, or let me know if you manage to break the physics engine. Happy to answer any questions about the chemical engineering side of things as well.

For some reason the URL box is not getting recognized, maybe someone can help me feel less dumb there too. https://fuelingcuriosity.com/game

fuelingcuriosity.com
119 46
Show HN: A context-aware permission guard for Claude Code
schipperai 1 day ago

We needed something like --dangerously-skip-permissions that doesn’t nuke your untracked files, exfiltrate your keys, or install malware.

Claude Code's permission system is allow-or-deny per tool, but that doesn’t really scale. Deleting some files is fine sometimes. And git checkout is sometimes not fine. Even when you curate permissions, 200 IQ Opus can find a way around it. Maintaining a deny list is a fool's errand.

nah is a PreToolUse hook that classifies every tool call by what it actually does, using a deterministic classifier that runs in milliseconds. It maps commands to action types like filesystem_read, package_run, db_write, git_history_rewrite, and applies policies: allow, context (depends on the target), ask, or block.
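To illustrate the classify-then-apply-policy idea, here is a toy deterministic classifier in Python. It is not nah's implementation or its real taxonomy: the matching rules, the `unknown` fallback, and the specific policy assigned to each action type are assumptions chosen for the example. Only the action type names and policy names come from the description above.

```python
import shlex

# Hypothetical policy table: action type -> policy.
POLICIES = {
    "filesystem_read": "allow",
    "package_run": "ask",
    "git_history_rewrite": "block",
    "unknown": "ask",
}


def classify(command: str) -> str:
    """Map a shell command to an action type using fixed, deterministic rules."""
    argv = shlex.split(command)
    if not argv:
        return "unknown"
    if argv[0] in {"cat", "ls", "head"}:
        return "filesystem_read"
    if argv[0] in {"npx", "uvx"}:
        return "package_run"
    if argv[0] == "git" and ("rebase" in argv or ("push" in argv and "--force" in argv)):
        return "git_history_rewrite"
    return "unknown"


def decide(command: str) -> str:
    """Return the policy for a command: allow, ask, or block."""
    return POLICIES[classify(command)]
```

Classifying by what a command does, rather than by its name alone, is why `git push` and `git push --force` can land under different policies in this sketch.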

Not everything can be classified, so you can optionally escalate ambiguous stuff to an LLM, but that’s not required. Anything unresolved you can approve, and configure the taxonomy so you don’t get asked again.

It works out of the box with sane defaults, no config needed. But you can customize it fully if you want to.

No dependencies, stdlib Python, MIT.

pip install nah && nah install

https://github.com/manuelschipper/nah

github.com
122 83
Show HN: XLA-based array computing framework for R
sebffischer 4 days ago

Anvil is an open-source, web-based framework that allows users to create and deploy full-stack Python applications without the need for HTML, CSS, or JavaScript. It provides a visual development environment and tools for building, testing, and deploying applications.

github.com
14 1
saphalpdyl 1 day ago

Show HN: I built an ISP infrastructure emulator from scratch with a custom vBNG

Demo: https://aether.saphal.me

GitHub: https://github.com/saphalpdyl/Aether

Aether is a multi-BNG (Broadband Network Gateway) ISP infrastructure lab built almost from scratch that emulates IPoE IPv4 subscriber management end-to-end. It supports IPoE/IPv4 networks and runs a Python-based vBNG with RADIUS AAA, per-subscriber traffic shaping, and traffic simulation, all emulated on Containerlab. It is also my first personal networking project, built over roughly a month.

Motivations behind the project

I'm a CS sophomore. About three years ago, as an intern, I was assigned to build an OSS/BSS platform for a regional ISP by myself, without mentoring. Referencing demo.splynx.com, I developed most of the BSS side (bookkeeping, accounting, inventory management), but in terms of networking, I managed to install and set up RADIUS and that was about it. I didn't have anyone to mentor me or to ask questions, so I gave up at that point.

Three years later, I decided to try cracking it again. This project is meant to serve as a learning reference for anyone who's been in that same position, i.e., staring at closed-source vendor stacks without proper guidance. This is absolutely not production-grade, but I hope it gives someone a place to start.

Architecture overview

The core component, the BNG, runs on an event-driven architecture where state changes are passed around as messages to avoid handling mutexes and locks. The session manager is the sole owner of the session state. To keep it clean and predictable, the BNG never accepts external input directly. The one exception is the Go RADIUS CoA daemon, which passes CoA messages in via IPC sockets. Everything the BNG produces (events, session snapshots) gets pushed to Redis Streams, where the bng-ingestor picks them up, processes them, and persists them.
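The single-owner, message-passing pattern described above can be sketched in a few lines of Python. This is a simplified illustration, not Aether's code: the `Event` kinds, the `SessionManager` class, and the snapshot format are hypothetical.

```python
import queue
from dataclasses import dataclass


@dataclass
class Event:
    kind: str        # hypothetical kinds: "start" or "stop"
    subscriber: str  # subscriber identifier


class SessionManager:
    """Sole owner of session state; other components only enqueue events."""

    def __init__(self):
        self.inbox = queue.Queue()  # thread-safe mailbox for incoming events
        self._sessions = {}         # mutated only inside run_once

    def run_once(self):
        """Drain the inbox one event at a time and return a snapshot per event."""
        snapshots = []
        while True:
            try:
                event = self.inbox.get_nowait()
            except queue.Empty:
                return snapshots
            if event.kind == "start":
                self._sessions[event.subscriber] = "active"
            elif event.kind == "stop":
                self._sessions.pop(event.subscriber, None)
            snapshots.append(dict(self._sessions))
```

Since only `run_once` touches `_sessions`, producers never need a lock around session state; the queue is the only shared object.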

Simulation and meta-configs

I am generating traffic through a simulator node that mounts the host's Docker socket and runs docker exec commands on selected hosts. The topology.yaml used by Containerlab to define the network topology grows bigger as more BNGs and access nodes are added. So aether.config.yaml, a simpler configuration, is consumed by the configuration pipeline to generate the topology.yaml and other files (nginx.conf, kea-dhcp.conf, RADIUS clients.conf, etc.).

Known Limitations

- Multiple veth hops through the emulated topology add significant overhead. Profiling with iperf3 (-P 10 -t 10, 9500 MTU, 24 vCPUs) shows BNG→upstream at ~24 Gbit/s, but host→BNG→upstream drops to ~3.5 Gbit/s. The 9500 MTU also isn't representative of real ISP deployments. This gets worse when the actual network is reintroduced, capping my throughput at 1.6 Gbit/s locally.

- The circuit ID format (1/0/X) is non-standard. I simplified it for clarity.

- No iBGP or VLAN support.

- No IPv6 support. I wanted to target IPv4 networks from the start to avoid getting too much breadth without a lot of depth.

Nearly everything I know about networking (except some sections from AWS) I learned building this. A lot was figured out on the fly, so engineers will likely spot questionable decisions in the codebase. I'd genuinely appreciate that feedback.

Questions

- Currently, the circuit where the user connects is arbitrarily decided by the demo user. In a real system with thousands of circuits, it'd be very difficult to properly assess which circuit the customer might connect to. When adding a new customer to a service, how does the operator decide, based on the customer's location, which circuit to provide the service on?

aether.saphal.me
64 19
Show HN: Klaus – OpenClaw on a VM, batteries included
robthompson2018 1 day ago

We are Bailey and Robbie and we are working on Klaus (https://klausai.com/): hosted OpenClaw that is secure and powerful out of the box.

Running OpenClaw requires setting up a cloud VM or local container (a pain) or giving OpenClaw root access to your machine (insecure). Many basic integrations (eg Slack, Google Workspace) require you to create your own OAuth app.

We make running OpenClaw simple by giving each user their own EC2 instance, preconfigured with keys for OpenRouter, AgentMail, and Orthogonal. And we have OAuth apps to make it easy to integrate with Slack and Google Workspace.

We are both HN readers (Bailey has been on here for ~10 years) and we know OpenClaw has serious security concerns. We do a lot to make our users’ instances more secure: we run on a private subnet, automatically update the OpenClaw version our users run, and because you’re on our VM by default the only keys you leak if you get hacked belong to us. Connecting your email is still a risk. The best defense I know of is Opus 4.6 for resilience to prompt injection. If you have a better solution, we’d love to hear it!

We learned a lot about infrastructure management in the past month. Kimi K2.5 and MiniMax M2.5 are extremely good at hallucinating new ways to break openclaw.json and otherwise wreaking havoc on an EC2 instance. The week after our launch we spent 20+ hours fixing broken machines by hand.

We wrote a ton of best practices on using OpenClaw on AWS Linux into our users’ AGENTS.md, got really good at un-bricking EC2 machines over SSM, added a command-and-control server to every instance to facilitate hotfixes and migrations, and set up a Klaus instance to answer FAQs on Discord.

In addition to all of this, we built ClawBert, our AI SRE for hotfixing OpenClaw instances automatically: https://www.youtube.com/watch?v=v65F6VBXqKY. ClawBert is a Claude Code instance that runs whenever a health check fails or the user triggers it in the UI. It can read that user’s entries in our database and execute commands on the user’s instance. We expose a log of ClawBert’s runs to the user.

We know that setting up OpenClaw is easy for most HN readers, but I promise it is not for most people. Klaus has a long way to go, but it’s still very rewarding to see people who’ve never used Claude Code get their first taste of AI agents.

We charge $19/m for a t4g.small, $49/m for a t4g.medium, and $200/m for a t4g.xlarge and priority support. You get $15 in tokens and $20 in Orthogonal credits one-time.

We want to know what you are building on OpenClaw so we can make sure we support it. We are already working with companies like Orthogonal and OpenRouter that are building things to make agents more useful, and we’re sure there are more tools out there we don’t know about. If you’ve built something agents want, please let us know. Comments welcome!

klausai.com
155 90
GregReve about 12 hours ago

Show HN: VaultLeap – USD accounts for founders outside the US

I'm Greg, co-founder of VaultLeap.

We built this for founders who can't get a US bank account. It offers USD/EUR/MXN accounts with real ACH routing numbers, and Visa cards are coming soon.

If you've been cut off from Mercury or similar recently, DM me — happy to help some founders out.

vaultleap.com
4 2
Show HN: Satellite imagery object detection using text prompts
eyasu6464 4 days ago

I built a browser-based tool for detecting objects in satellite imagery using vision-language models (VLMs). You draw a polygon on the map and enter a text prompt such as "swimming pools", "oil tanks", or "buses". The system scans the selected area tile-by-tile and returns detections projected back onto the map as GeoJSON.

Pipeline: select area and zoom level, split the region into mercantile tiles, run each tile with the prompt through a VLM, convert predicted bounding boxes to geographic coordinates (WGS84), and render the results back on the map.
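As a rough illustration of the coordinate-conversion step in that pipeline, here is a minimal sketch, not the tool's actual code. It assumes standard 256-pixel Web Mercator (XYZ) tiles and pixel-space bounding boxes of the form (x0, y0, x1, y1) with the origin at the tile's top-left corner; the function names and the `TILE_SIZE` constant are hypothetical.

```python
import math

TILE_SIZE = 256  # assumption: tiles are 256x256 pixels


def pixel_to_lonlat(tile_x, tile_y, zoom, px, py, tile_size=TILE_SIZE):
    """Convert a pixel inside an XYZ tile to WGS84 (lon, lat) in degrees."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    fx = tile_x + px / tile_size  # fractional tile column
    fy = tile_y + py / tile_size  # fractional tile row (grows southward)
    lon = fx / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * fy / n))))
    return lon, lat


def bbox_to_feature(tile_x, tile_y, zoom, box, label):
    """Turn a pixel-space box (x0, y0, x1, y1) into a GeoJSON polygon Feature."""
    x0, y0, x1, y1 = box
    west, north = pixel_to_lonlat(tile_x, tile_y, zoom, x0, y0)
    east, south = pixel_to_lonlat(tile_x, tile_y, zoom, x1, y1)
    # Closed ring in counter-clockwise order, as GeoJSON recommends.
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],
    ]
    return {
        "type": "Feature",
        "properties": {"label": label},
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }
```

Calling `bbox_to_feature(tile_x, tile_y, zoom, box, "swimming pool")` for each detection and collecting the results into a `FeatureCollection` would give something a web map can render directly. Models that return normalized boxes (0 to 1) would need the coordinates scaled by the tile size first.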

It works reasonably well for distinct structures in a zero-shot setting. Occluded objects are still better handled by specialized detectors like YOLO models.

There is a public demo and no login required. I am mainly interested in feedback on detection quality, performance tradeoffs between VLMs and specialized detectors, and potential real-world use cases.

useful-ai-tools.com
51 22
Show HN: A desktop app for managing Claude Code sessions
kapitalx about 12 hours ago

Switchboard is an open-source platform that enables real-time communication between diverse systems, facilitating seamless integration and data exchange across applications, devices, and services.

github.com
4 1
Show HN: Raccoon AI – Collaborative AI Agent for Anything
scorchy38 about 8 hours ago

Hey HN, I'm Shubh, Co-Founder of Raccoon AI.

Raccoon AI is like having something between Claude Code and Cursor on the web.

The agent has its own computer with a terminal, browser, and internet, and it is built with the right balance of collaboration and autonomy.

You can talk to it mid-task, send it more files while it's still running, or just let it go and come back to a finished result.

It's the kind of product where you open it to try one thing and end up spending two hours because you keep thinking of more things to throw at it.

The thing that most people get excited about is that sessions chain across completely unrelated task types. You can go from market research (real citations, generated charts) to raw data analysis (dump your db, ask questions) to a full interactive app, all in one conversation sharing the same context.

It has unlimited context through auto summarization, which is really good with Ace Max.

It connects to Gmail, GitHub, Google Drive, Notion, Outlook, and 40+ other tools. You can add your own via custom MCP servers.

Raccoon AI is built on top of our own agents SDK, ACE, which hit SOTA on the GAIA benchmark with a score of 92.67.

A bit of background: we're a team of 3, and we started about 1.5 years ago trying to build the best possible browser agent. After a couple of pivots we arrived at this, and we have been shipping and growing constantly since October.

Happy to go deep on the architecture or talk about the limitations, and excited to hear your feedback.

Site: https://raccoonai.tech

raccoonai.tech
3 1
Show HN: Baltic security monitor from public data sources
makefunstuff about 9 hours ago

People around me started repeating stuff from various psyop campaigns on TikTok or other social media they consume.

Especially when living in the Baltics, it's basically 24/7 fearmongering from everywhere: constant targeted Russian disinfo campaigns run through chains of locals, social media campaigns, or bloggers chasing hype with clickbait posts. It was driving me mad. It's distracting and annoying when someone close to you gets hooked on one of these posts, and I was wasting time explaining why it was BS.

So I took my slopmachine, did some manual tweaking here and there, and made this dashboard. The main metric is a daily 0-100 threat score, which is just weighted sums and thresholds - no ML yet.
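To make "weighted sums and thresholds" concrete, here is a minimal sketch of how such a score could be computed. It is not the dashboard's actual formula: the signal names, weights, and threshold values are all hypothetical, and each signal is assumed to be pre-normalized to the range 0 to 1.

```python
def threat_score(signals, weights):
    """Weighted sum of normalized signals, scaled to an integer 0-100."""
    total_weight = sum(weights.values())
    raw = 0.0
    for name, weight in weights.items():
        value = signals.get(name, 0.0)     # missing signals count as 0
        value = min(max(value, 0.0), 1.0)  # clamp to [0, 1]
        raw += weight * value
    return round(100 * raw / total_weight)


def threat_level(score, thresholds=((75, "high"), (40, "elevated"), (0, "low"))):
    """Map a 0-100 score to a label using descending threshold floors."""
    for floor, label in thresholds:
        if score >= floor:
            return label
    return "low"


# Hypothetical signals and weights, for illustration only.
weights = {"airspace_incidents": 2.0, "disinfo_volume": 1.0}
signals = {"airspace_incidents": 1.0, "disinfo_volume": 0.0}
score = threat_score(signals, weights)
print(score, threat_level(score))
```

Dividing by the total weight keeps the score within 0-100 no matter how many signals are added, so weights can be tuned without rescaling the thresholds.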

estwarden.eu
4 0
kthaker1224 about 9 hours ago

Show HN: Hyper – Voice Notes for Whiteboarding Sessions

Hyper AI for Real Talk is a new app that uses AI technology to facilitate natural conversations on a variety of topics, enabling users to engage in thoughtful discussions and gain new perspectives.

apps.apple.com
3 0
Show HN: Calyx – Ghostty-Based macOS Terminal with Liquid Glass UI
yuu1ch13 about 14 hours ago

Calyx is an open-source software framework that provides a modular and extensible architecture for building complex software systems. It aims to simplify the development of large-scale applications by promoting modularity, flexibility, and reusability.

github.com
24 30