Show HN: I built a synthesizer based on 3D physics
I've been working on the Anukari 3D Physics Synthesizer for a little over two years now. It's one of the earliest virtual instruments to rely on the GPU for audio processing, which has been incredibly challenging and fun. In the end, predictably, the GUI for manipulating the 3D system actually ended up being a lot more work than the physics simulation.
So far I am only selling it direct on my website, which seems to be working well. I hope to turn it into a sustainable business, and ideally I'd have enough revenue to hire folks to help with it. So far it's been 99% a solo project, with (awesome) contractors brought in for some of the stuff that I'm bad at, like the 3D models and making instrument presets/videos.
The official launch announcement video is here: https://www.youtube.com/watch?v=NYX_eeNVIEU
But if you REALLY want to see what it can do, check out what Mick Gordon did with it on the first day: https://x.com/Mick_Gordon/status/1918146487948919222
I've kept a fairly detailed developer log about my progress on the project since October 2023, which might be of interest to the hardcore technical folks here: https://anukari.com/blog/devlog
I also gave a talk at Audio Developer Conference 2023 (ADC23) that goes deep into a couple of the problems I solved for Anukari: https://www.youtube.com/watch?v=lb8b1SYy73Q
Show HN: GPT-2 implemented using graphics shaders
Back in the old days, people did general-purpose GPU programming by writing shaders in languages like GLSL. This is what inspired NVIDIA (and other companies) to eventually create CUDA (and friends). This is an implementation of GPT-2 using WebGL and shaders. Enjoy!
Show HN: Blast – Fast, multi-threaded serving engine for web browsing AI agents
Hi HN!
BLAST is a high-performance serving engine for browser-augmented LLMs, designed to make deploying web-browsing AI easy, fast, and cost-manageable.
The goal with BLAST is ultimately to achieve Google-search-level latencies for tasks that currently require a lot of typing and clicking around inside a browser. We're starting off with automatic parallelism, prefix caching, budgeting (memory and LLM cost), and an OpenAI-compatible API, but we have a ton of ideas in the pipeline!
Website & Docs: https://blastproject.org/ https://docs.blastproject.org/
MIT-Licensed Open-Source: https://github.com/stanford-mast/blast
Hope some folks here find this useful! Please let me know what you think in the comments or ping me on Discord.
— Caleb (PhD student @ Stanford CS)
Show HN: Kinematic Hand Skeleton Optimization in Jax
I've been trying to wrap my head around forward/backward kinematics for imitation learning, so I built a fully differentiable kinematic hand skeleton in JAX and visualized it with Rerun's new callback system in a Jupyter notebook. The demo shows each joint angle and how it impacts the kinematic skeleton.
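For readers unfamiliar with forward kinematics, the core idea is just accumulating each joint's rotation down the chain. Here is a minimal plain-Python sketch of that idea (not the actual JAX code; the joint angles and link lengths are made up):

```python
import math

def forward_kinematics(angles, link_lengths):
    """Place each joint tip of a 2D kinematic chain.

    Each child frame inherits its parent's rotation, so we accumulate
    angles as we walk down the chain. In JAX you'd write the same loop
    with jnp and get gradients w.r.t. the angles via jax.grad.
    """
    x = y = 0.0
    total = 0.0
    tips = []
    for theta, length in zip(angles, link_lengths):
        total += theta  # accumulate parent rotations
        x += length * math.cos(total)
        y += length * math.sin(total)
        tips.append((x, y))
    return tips

# Two-link "finger", both joints straight: the tip lands at (2, 0)
print(forward_kinematics([0.0, 0.0], [1.0, 1.0])[-1])
```

Making this differentiable is what lets an optimizer recover joint angles from observed fingertip positions, which is the inverse-kinematics half of the problem.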
Show HN: Exhibit and Site on Mechanisms for Students
Just finished a super-nerdy amateur hobby project: An exhibit and website to show kids how cool mechanisms are!
Sadly, kids don't get much tangible experience with machines anymore. Ideally, this exhibit will inspire some to explore engineering, even if they are not "book learners". The website provides content to back up the exhibit, with videos and 3D printing files.
The project is inspired by engineering exhibits from the past. Check out the research page for more. The project will be open-sourced to enable people to make their own and extend it. If you want to collaborate, LMK.
--Steve
Show HN: OSle – A 510-byte OS in x86 assembly
(sorry about double posting, I forgot to put Show HN in front in the original https://news.ycombinator.com/item?id=43863689 thread)
Hey all! As a follow-up to my relatively successful x86 assembly series from last year[1], I started making an OS that fits in a boot sector. I'm purposefully not doing chain loading or multi-stage booting, to see how much I can squeeze out of 510 bytes.
It comes with a file system, a shell, and simple process management: enough to write non-trivial guest applications, like a text editor and even some games. It's a lot of fun!
It comes with an SDK and you can play around with it in the browser to see what it looks like.
The aim is, as always, to make Assembly less scary and this time around also OS development.
[1]: https://news.ycombinator.com/item?id=41571971
Show HN: Kubetail – Real-time log search for Kubernetes
Hi Everyone!
Kubetail is a general-purpose logging dashboard for Kubernetes, optimized for tailing logs across multi-container workloads in real-time. With Kubetail, you can view logs from all the containers in a workload (e.g. Deployment or DaemonSet) merged into a single chronological timeline, delivered to your browser or terminal.
I launched Kubetail on HN last year and at that time the top request was to add search. Now I'm happy to say we finally have search available in our latest official release (cli/v0.4.3, helm/v0.10.1). You can check it out in action here:
https://www.kubetail.com/demo
Kubetail normally fetches logs using the Kubernetes API, which does not have search built-in. To enable search, click the “Install” button in the GUI or run `kubetail cluster install` in the CLI to deploy a DaemonSet that places a Kubetail agent on every node. Each agent runs a custom Rust binary powered by ripgrep; it scans the node’s log files and streams only matching lines to your browser or terminal. You can think of a Kubetail search as "remote grep" for your Kubernetes logs. Now you don’t need to download an entire log file just to grep it locally.
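The node-side filtering idea (the real agent is a Rust binary built on ripgrep) boils down to streaming only the matching lines instead of shipping whole files. A rough Python sketch of that behavior, with made-up log lines:

```python
import re

def stream_matches(lines, pattern):
    """Yield only lines matching the query, so full log files never leave the node."""
    rx = re.compile(pattern)
    for lineno, line in enumerate(lines, start=1):
        if rx.search(line):
            yield lineno, line

log = ["GET /healthz 200", "ERROR db timeout", "GET / 200", "ERROR retry"]
print(list(stream_matches(log, r"ERROR")))  # [(2, 'ERROR db timeout'), (4, 'ERROR retry')]
```

The win is bandwidth: only the two matching lines cross the network, which is what makes "remote grep" cheaper than downloading and grepping locally.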
Since last year we've also added some other neat features that users find helpful. In particular, we built a simple CLI tool that starts the web dashboard on your desktop:
# Install
brew install kubetail
# Run
kubetail serve
We also added a powerful logs sub-command to the CLI that you can use to follow container logs, or fetch all the records in a given time window to analyze them in more detail locally (quick-start):

# Follow example
$ kubetail logs deployments/web \
    --with-ts \
    --with-pod \
    --follow

# Fetch example
$ kubetail logs deployments/web \
    --since 2025-04-20T00:00:00Z \
    --until 2025-04-21T00:00:00Z \
    --all > logs.txt
We’ve added a lot more features since last year, but these are the ones I wanted to highlight. I hope you like what we're doing with Kubetail! Your feedback is very valuable, so please let us know what you think in the comments here or in our Discord chat.
Andres
Show HN: Roons – Mechanical Computer Kit
I built a mechanical computer kit: https://whomtech.com/show-hn
tl;dr: it's a cellular automaton on a "loom" of alternating bars, using contoured tiles to guide marbles through logic gates.
It's not just "Turing complete, job done"; I've tried to make it actually practical. Devices are compact, e.g. you can fit a binary adder into a 3cm square. It took me nearly two years and dozens of different approaches.
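For anyone curious what a marble-run binary adder actually computes: the logic (though not the tile layout) is a ripple-carry chain of full adders, each built from the gates the loom provides. A sketch of that boolean behavior in Python, purely as an illustration:

```python
def full_adder(a, b, carry_in):
    """One column of binary addition: sum and carry-out from three input bits."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return s, carry_out

def add_bits(x_bits, y_bits):
    """Ripple-carry adder over little-endian bit lists of equal length."""
    carry = 0
    out = []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    out.append(carry)  # final carry becomes the top bit
    return out

# 3 (bits [1, 1]) + 1 (bits [1, 0]) = 4 (bits [0, 0, 1]), little-endian
print(add_bits([1, 1], [1, 0]))  # [0, 0, 1]
```

In the physical kit, each gate is a contoured tile and the "carry" is a marble handed to the next column.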
There's a sequence of interactive tutorials to try out, demo videos, and a janky simulator. I've also sent out a few prototype kits and have some more ready to go.
Please ask me anything, I will talk about this for hours.
-- Jesse
Show HN: Hyperparam: OSS tools for exploring datasets locally in the browser
For the last year I’ve been developing Hyperparam — a collection of small, fast, dependency-free open-source libraries designed for data scientists and ML engineers to actually look at their data.
- Hyparquet: Read any Parquet file in browser/node.js
- Icebird: Explore Iceberg tables without needing Spark/Presto
- HighTable: Virtual scrolling of millions of rows
- Hyparquet-Writer: Export Parquet easily from JS
- Hyllama: Read llama.cpp .gguf LLM metadata efficiently
CLI for viewing local files: npx hyperparam dataset.parquet
Example dataset on Hugging Face Space: https://huggingface.co/spaces/hyperparam/hyperparam?url=http...
No cloud uploads. No backend servers. A better way to build frontend data applications.
GitHub: https://github.com/hyparam Feedback and PRs welcome!
Show HN: Create your own finetuned AI model using Google Sheets
Hello HN,
We built Promptrepo to make finetuning accessible to product teams — not just ML engineers. Last week, OpenAI’s CPO shared how they use fine-tuning for everything from customer support to deep research, and called it the future for serious AI teams. Yet most teams I know still rely on prompting, because fine-tuning is too technical, while the people who have the training data (product managers and domain experts) are often non-technical. With Promptrepo, they can now:
- Add training examples in Google Sheets
- Click a button to train
- Deploy and test instantly
- Use OpenAI, Claude, Gemini or Llama models
We’ve used this internally for years to power AI workflows in our products (Formfacade, Formesign, Neartail), and we're now opening it up to others. Would love your feedback and happy to answer any questions!
---
Try it free - https://promptrepo.com/finetune
Demo video - https://www.youtube.com/watch?v=e1CTin1bD0w
Why we built it - https://guesswork.co/support/post/fine-tuning-is-the-future-...
Show HN: A motherfucking app (does one thing, under 300 LOC)
Inspired by motherfuckingwebsite.com, this is a 100% static HTML/JS app.
You get one task box. A “do nothing” button. And zero bullshit. No login. No tracking. No frameworks. Just focus. Under 300 lines of code.
Built for fun. Possibly for sanity.
Show HN: Robot Unlock – an open-ended programming game/zachlike
Hello,
In 2010 I made an open-ended programming game based on Befunge and Brainf*k. I was young and didn't know what I was doing - coding it in AutoIT of all things and using borderless windows for sprites. Nevertheless, it was a full game and some people actually played it, sharing solutions with each other. I took it as a sign that the game had some potential - I appreciated this very much at the time.
It was zachlike at its core, except that it came out earlier than SpaceChem and the term hadn't been coined yet.
Years passed, I worked in the game industry, had some fun, learned a few things and eventually burned out. Meanwhile, Zachtronics kept making games and managed to define a genre, proving that there indeed was a market for such games. I'm very happy about that.
Now I want to have a shot at going indie and almost 15 years later I'm launching the sequel to my 2010 game. One of my playtesters has been at it for 26 hours so I know it can be a real nerd sniper. It's a game for the type of person who loves quirky languages and optimizing their programs under extreme constraints.
I have been hanging out on HN for a long time and thought some in this community might like the game. I want to keep doing this and I will as long as I can afford it.
Looking forward to your questions and feedback.
https://store.steampowered.com/app/3318050/Robot_Unlock
Show HN (YC S25): Well – MCP AI-Based Collection of Invoices
Hi HN, we’re the cofounders of Well: an AI-agent-powered Chrome extension that becomes every founder’s best friend when accounting season hits. Well automates supplier invoice collection and pipes the data directly into your accounting tools, ERP, or dashboards, with zero effort.
Why We’re Building Well: Automating the Missing Half of Payments
Our website is at https://wellapp.ai/ and our Github is here: https://github.com/WellApp-ai/well.
---
Our Background
We’re a team of infrastructure builders with deep roots in European fintech.
Over the last decade, we built and scaled two core-banking platforms — from scratch — across IbanFirst, Fintecture, and Qonto (serving 600K+ SMEs). Payments, compliance, scalability: we lived it, pushed it, broke it, and rebuilt it.
Across all those years, one thing became obvious: while moving money became fast, standardized, and predictable, handling supplier invoices remained chaos.
---
The Insight
Payments today are a solved problem.
Protocols like EMV, SEPA Instant, and SWIFT create reliable, instant settlement flows. Whether you tap a card or wire funds across borders, the infrastructure just works.
But invoices?
Every supplier, every service, every vendor still does it differently. Some email you a PDF. Some force you to log into portals. Some send nothing at all unless you chase them manually. There’s no protocol, no standards, no rails — just friction.
In a world where payment is protocolized, *invoice management is broken*.
---
The Problem
This chaos isn’t just annoying — it’s operationally expensive:
- Solopreneurs and small teams waste hours every month just retrieving invoices.
- Accounting tools often rely on manual drag-and-drop uploads.
- Poor invoice tracking means poor treasury visibility, contributing to the 57% of bankruptcies in France caused by cash flow mismanagement.
We’ve seen it first-hand working closely with thousands of startups and SMEs at Qonto. Adoption of accounts payable solutions is painfully low — not because people don’t need them, but because getting started still requires too much manual work.
No one wants to forward emails, upload PDFs, or manage inbox rules just to track basic finances.
---
Why Now
Two forces are converging:
- Regulation: EU-wide electronic invoicing mandates are coming, shifting the landscape.
- Technology: AI now allows us to automate where legacy RPA (robotic process automation) approaches failed.
We believe the next true disruption in finance won’t come from payment rails. It will come from *bridging the invoice gap* — automatically, invisibly, and at scale.
---
What We’re Building
Well is an AI-powered infrastructure designed to automate supplier invoice collection — no matter the format, no matter the source.
Here’s what Well does:
• Captures invoices from portals, emails, and attachments — fully automated.
• Extracts structured data (amounts, vendors, dates) with high accuracy.
• Securely syncs invoice data into your accounting SaaS, ERP, or dashboards.
No manual uploads. No password sharing. No chasing.
Just a protocolized pipeline — like payments, but for supplier invoices.
---
What’s Next
- Extending Well’s reach across 1,000+ SaaS vendors.
- Deepening integrations into accounting, ERP, and spend management ecosystems.
- Building invisible finance workflows where invoices just... appear where you need them.
We’re starting with solopreneurs, indie hackers, and lean startup teams — but our ambition is to make invoice chaos obsolete for businesses of any size.
If you’re tired of chasing supplier invoices instead of growing your business, we’d love to hear from you.
Learn more at https://wellapp.ai/
Show HN: ART – a new open-source RL framework for training agents
Hey HN, I wanted to share a new project we've been working on for the last couple of months called ART (https://github.com/OpenPipe/ART).
ART is a new open-source framework for training agents using reinforcement learning (RL). RL allows you to train an agent to perform better at any task whose outcome can be measured and quantified.
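To make "measured and quantified" concrete, an outcome-based reward for an agent task might look like the toy function below. Note this is a hypothetical illustration, not ART's actual API: the `transcript` shape and the per-tool-call penalty are made up.

```python
def reward(transcript, expected_answer):
    """Toy outcome-based reward: 1.0 for a correct final answer,
    minus a small penalty per tool call to encourage efficient trajectories.
    (Illustrative only; not ART's real interface.)"""
    correct = expected_answer.lower() in transcript["final_answer"].lower()
    penalty = 0.01 * len(transcript["tool_calls"])
    return (1.0 if correct else 0.0) - penalty

episode = {"final_answer": "The meeting is at 3pm.",
           "tool_calls": ["search_inbox", "read_email"]}
print(reward(episode, "3pm"))  # 0.98
```

Anything you can score this way, whether exact-match answers, test suites, or human ratings, can in principle drive an RL training loop.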
There are many excellent projects focused on training LLMs with RL, such as GRPOTrainer (https://huggingface.co/docs/trl/main/en/grpo_trainer) and verl (https://github.com/volcengine/verl). We've used these frameworks extensively for customer-facing projects at OpenPipe, but grew frustrated with some key limitations:
- Multi-turn workflows, where the agent calls a tool, gets a response, and calls another, are not well supported. This makes them a non-starter for any task that requires an agent to perform a sequence of actions.
- Other frameworks typically have low GPU efficiency. They may require multiple H100 GPUs just to train a small 7B parameter model, and aren't able to keep the GPUs busy consistently during both the "rollout" and "training" phases of the training loop.
- Existing frameworks typically aren't a convenient shape for integrating with existing agentic codebases. Their trainers expect you to call raw text-completion endpoints, and they don't provide industry-standard chat-completion APIs.
ART is designed to address these limitations and make it easy to train high-quality agents. We've also shared many details and practical lessons learned in this post, which walks through a demo of training an email research agent that outperforms o3 (https://openpipe.ai/blog/art-e-mail-agent). You can also find out more about ART's architecture in our announcement post (https://openpipe.ai/blog/art-trainer-a-new-rl-trainer-for-ag...).
Happy to answer any questions you have!
Show HN: Maybe – The personal finance app for everyone
MaybeFinance is a financial education platform that provides resources and tools to help individuals improve their financial literacy and make informed decisions about their money. The site covers a range of personal finance topics, including budgeting, investing, debt management, and retirement planning.
Show HN: Multi-model chat. Get answers from multiple AI models at once
Hey HN. I often like to see how different AI models will answer queries - comparatively. This is an open source webapp that locally lets you see various LLM output using openRouter. Enjoy!
Show HN: Cryptle – Wordle for Cryptography
Hi HN! I'm Haris, a high schooler from Bulgaria. I had a couple of hours lying around, and I really love solving puzzles and ciphers!

So I built a game like Wordle, but for cryptography: every day there's a new ciphered word that you have to decode and guess!
Would love if you check it out and share your results: cryptle.site
Show HN: I taught AI to commentate Pong in real time
xPong is an open-source, multiplayer Pong game built using Rust, WebAssembly, and WebSockets. It features real-time gameplay, a responsive design, and the ability to play against other users online.
Show HN: I built a hardware processor that runs Python
Hi everyone, I built PyXL — a hardware processor that executes a custom assembly generated from Python programs, without using a traditional interpreter or virtual machine. It compiles Python -> CPython Bytecode -> Instruction set designed for direct hardware execution.
I’m sharing an early benchmark: a GPIO test where PyXL achieves a 480 ns round-trip toggle, compared to 14-25 microseconds on a MicroPython Pyboard, even though PyXL runs at a lower clock (100 MHz vs. 168 MHz).
The design is stack-based, fully pipelined, and preserves Python's dynamic typing without static type restrictions. I independently developed the full stack — toolchain (compiler, linker, codegen), and hardware — to validate the core idea. Full technical details will be presented at PyCon 2025.
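The middle stage of that pipeline is ordinary CPython bytecode, which anyone can inspect with the standard-library dis module. This shows only the input to PyXL's code generator, not its hardware instruction set (the function below is just a stand-in for something like the GPIO toggle):

```python
import dis

def toggle(value, mask):
    # Tiny function standing in for a GPIO-toggle kernel
    return value ^ mask

# The opcode stream a bytecode-to-ISA compiler would start from
ops = [ins.opname for ins in dis.get_instructions(toggle)]
print(ops)
```

The exact opcodes vary by CPython version (e.g. BINARY_XOR vs. BINARY_OP), which is one reason building a stable hardware target for bytecode is a real engineering problem.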
Demo and explanation here: https://runpyxl.com/gpio Happy to answer any questions
Show HN: AgenticSeek – Self-hosted alternative to cloud-based AI tools
I’ve spent the last two months building AgenticSeek, a privacy-focused alternative to cloud-based AI tools like ManusAI. It runs entirely on your machine—no API calls, no data leaks.
Why AgenticSeek?
Optimized for local LLMs (developed mostly on an RTX 3060 running DeepSeek R1 14B).
Truly private: All components (TTS, STT, planner) run locally.
More responsive than alternatives (we respond fast to issues + active Discord).
Designed to be fun—think JARVIS-like voice control, multi-agent workflows, and a slick web UI.
Current Features:
Web browsing (research + form filling), code writing/fixing, and file management/search. Planning capabilities to use multiple agents for complex tasks.
Is it stable? Prototype-stage—great for tinkerers.
Hoping to get feedback!
Show HN: Polyseed – first(?) pq PAKE implementation
PolySeed is a free and open-source software tool that generates secure and unique seeds for blockchain wallets. It aims to provide a user-friendly and trustworthy solution for generating and managing wallet seeds.
Show HN: Traycer.ai – Turn GitHub Issues into a Step-by-Step Plan
Hey everyone!
We've built Traycer, a tool that transforms your GitHub issues—everything from descriptions and attached images to ongoing conversations—into clear, actionable implementation plans.
You can easily import these plans into your IDE with our extension or use them with any other coding assistant you prefer.
We'd love to hear your thoughts and feedback. Traycer is totally free for open-source projects, and we've got a 2-week free trial if you're working with private repos.
Give it a try and let us know what you think!
Show HN: Sim Studio – Open-Source Agent Workflow GUI
Hi HN! We're Emir and Waleed, and we're building Sim Studio (https://simstudio.ai), an open-source drag and drop UI for building and managing multi-agent workflows as a directed graph. You can define how agents interact with each other, use tools, and handle complex logic like branching, loops, transformations, and conditional execution.
Our repo is https://github.com/simstudioai/sim, docs are at https://docs.simstudio.ai/introduction, and we have a demo here: https://youtu.be/JlCktXTY8sE?si=uBAf0x-EKxZmT9w4
Building reliable, multi-step agent systems with current frameworks often gets complicated fast. In OpenAI's 'practical guide to building agents', they claim that the non-declarative approach and single multi-step agents are the best path forward, but from experience and experimentation, we disagree. Debugging these implicit flows across multiple agent calls and tool uses is painful, and iterating on the logic or prompts becomes slow.
We built Sim Studio because we believe defining the workflow explicitly and visually is the key to building more reliable and maintainable agentic applications. In Sim Studio, you design the entire architecture, composed of agent blocks that have system prompts, a variety of models (hosted, or local via Ollama), tools with granular tool-use control, and structured output.
We have plenty of pre-built integrations that you can use as standalone blocks or as tools for your agents. The nodes are all connected with if/else conditional blocks, llm-based routing, loops, and branching logic for specialized agents.
Also, the visual graph isn't just for prototyping; it's actually executable. You can run simulations of a workflow 1, 10, or 100 times to see how a small change to a system prompt, the underlying model, or a tool call impacts the overall performance of the workflow.
You can trigger the workflows manually, deploy as an API and interact via HTTP, or schedule the workflows to run periodically. They can also be set up to trigger on incoming webhooks and deployed as standalone chat instances that can be password or domain-protected.
We have granular trace spans, logs, and observability built-in so you can easily compare and contrast performance across different model providers and tools. All of these things enable a tighter feedback loop and significantly faster iteration.
So far, users have built deep research agents to detect application fraud, chatbots to interface with their internal HR documentation, and agents to automate communication between manufacturing facilities.
Sim Studio is Apache 2.0 licensed, and fully open source.
We're excited about bringing a visual, workflow-centric approach to agent development. We think it makes building robust, complex agentic workflows far more accessible and reliable. We'd love to hear the HN community's thoughts!
Show HN: Kexa.io – Open-Source IT Security and Compliance Verification
Hi HN,
We're building Kexa.io (https://github.com/kexa-io/Kexa), an open-source tool developed in France (incubated at Euratech Cyber Campus) to help teams automate the often tedious process of verifying IT security and compliance. Keeping track of configurations across diverse assets (servers, K8s, cloud resources) and ensuring they meet security baselines (like CIS benchmarks, etc.) manually is challenging and error-prone.
Our goal with the open-source core is to provide a straightforward way to define checks, scan your assets, and get clear reports on your security posture. You can define your own rules or use common standards.
We are now actively developing our SaaS offering, planned for a beta release around June 2025. The key feature will be an AI-powered security administration agent specifically designed for cloud environments (initially targeting AWS, GCP, Azure). Instead of just reporting issues, this agent will aim to provide proactive, actionable recommendations and potentially automate certain remediation tasks to simplify cloud security management and hardening.
We'd love for the HN community to check out the open-source project on GitHub. Feedback on the concept or the current tool is highly welcome, and a star if you find it interesting helps others discover the project! If the upcoming AI-powered cloud security agent sounds interesting, we'd be particularly keen to hear your thoughts or if you might be interested in joining the beta (~June 2025).
thank you !!
Show HN: I built an AI tool to practice technical interviews with
Hey HN,
Check out our technical paper here: https://arxiv.org/abs/2501.15627 and the video demo: https://www.youtube.com/watch?v=Op8hyLW7Z84
I’ve been obsessed with the art of the interview since I was in college. In my career I’ve interviewed over 100 people and been interviewed at companies ranging from startups to big tech and hedge funds.
I built Neuraprep because I noticed something missing — while software engineers have leetcode.com and finance folks have quantquestions.com, other engineering domains (like ML, data science, MLOps) don’t have a go-to platform to prep for interviews. Sure, there’s Kaggle and Coursera, but nothing unified.
So I spent a summer collecting 400+ real interview questions and built detailed answers for each. I drew from academic sources, online communities, and LLMs to refine the content. Then I used this dataset to build an AI that mimics how a human interviewer evaluates responses.
Here’s how it works:
• The reasoning engine extracts core ideas from the user’s answer.
• It compares them to the expected ideas from the database.
• If something is missing, the conversation continues, just like a real technical interviewer would.
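The compare step above can be sketched as simple set overlap. This toy version is only a stand-in for the real reasoning engine, which evaluates free-form answers with an LLM; the topic names are invented:

```python
def missing_ideas(extracted_ideas, expected_ideas):
    """Return which expected ideas the candidate didn't cover,
    so the interviewer knows what to probe next."""
    return sorted(set(expected_ideas) - set(extracted_ideas))

expected = {"overfitting", "regularization", "cross-validation"}
extracted = {"overfitting", "cross-validation"}
print(missing_ideas(extracted, expected))  # ['regularization']
```

The interesting part in practice is the extraction step, since candidates phrase the same core idea in wildly different ways.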
With recent voice and reasoning model advancements (thanks Sesame, O3), it now runs on-demand phone interviews that feel surprisingly real.
Show HN: Beatsync – perfect audio sync across multiple devices
Hi HN! I made Beatsync, an open-source browser-based audio player that syncs audio with millisecond-level accuracy across many devices.
Try it live right now: https://www.beatsync.gg/
The idea is that with no additional hardware, you can turn any group of devices into a full surround sound system. MacBook speakers are particularly good.
Inspired by Network Time Protocol (NTP), I do clock synchronization over websockets and use the Web Audio API to keep audio latency under a few ms.
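The clock-offset math behind this is the classic four-timestamp NTP estimate. A small Python sketch (the timestamps are made-up milliseconds, not from the actual implementation):

```python
def clock_offset(t0, t1, t2, t3):
    """NTP-style offset estimate from one request/response exchange.

    t0: client send time      t1: server receive time
    t2: server send time      t3: client receive time
    Averaging the two one-way differences cancels the network delay,
    assuming it's roughly symmetric in both directions.
    """
    return ((t1 - t0) + (t2 - t3)) / 2

def round_trip_delay(t0, t1, t2, t3):
    """Total wire time, excluding the server's processing time."""
    return (t3 - t0) - (t2 - t1)

# Client clock 50 ms behind the server, 10 ms of latency each way:
print(clock_offset(100, 160, 161, 121))  # 50.0
```

Repeating the exchange and keeping the samples with the lowest round-trip delay is a common way to tighten the estimate toward millisecond accuracy.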
You can also drag devices around a virtual grid to simulate spatial audio — it changes the volume of each device depending on its distance to a virtual listening source!
I've been working on this project for the past couple of weeks. Would love to hear your thoughts and ideas!
Show HN: Convert Large CSV/XLSX to JSON or XML in Browser
Hello HN, I'm excited to share a project I've been working on: a simple, fast way to process huge CSV and XLSX files directly in your browser and export them as clean JSON or XML.
Here are a few things that make this converter different:
- Runs in the browser: all parsing and conversion is client-side, and it can handle data of any size
- Automatically detects delimiters, encodings, and data types as it parses
- Live preview with column renaming, search/replace, and data cleanup
- Export to JSON or XML: clean, structured output ready for APIs or databases
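The converter itself runs as JavaScript in the browser, but the delimiter-detection idea can be illustrated with Python's standard-library csv.Sniffer (the sample data here is invented):

```python
import csv
import io

sample = "name;age;city\nAda;36;London\nAlan;41;Manchester\n"

# Sniffer inspects a sample of the text and guesses the dialect
dialect = csv.Sniffer().sniff(sample)
rows = list(csv.reader(io.StringIO(sample), dialect))
print(dialect.delimiter, rows[1])  # ; ['Ada', '36', 'London']
```

A browser implementation does the same kind of frequency analysis on the first chunk of the file, which is why it can start streaming rows before the whole file is read.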
Backstory: I built this tool for myself. I work with massive CSV and TXT files, some over 10GB, and opening them in Excel would freeze my laptop. Most online converters cap the file size, so I started learning Python and pandas, but I ended up wasting a lot of time trying different delimiters and fixing badly structured data just to make it usable. I thought this would be a really fun project to build.
I'd love some feedback. Thank you
URL: https://csvforge.com
Show HN: A Chrome extension that will auto-reject non-essential cookies
A FOSS chrome extension that attempts to remove the annoyance of cookie pop ups and banners.
There are some extensions out there that auto-accept cookies, but I didn't find one that auto rejected cookies without either chaining some extensions together or setting up custom rules in tools like uBlock origin. So with this extension, you just need to add it for non-essential cookies to be rejected.
Github: https://github.com/mitch292/reject-cookies Extension Link: https://chromewebstore.google.com/detail/bnbodofigkfjljnopfg...
It's still very early days for the extension. I want it to keep improving and working on more and more sites. Feedback welcome. Thanks!
Show HN: Web-eval-agent – Let the coding agent debug itself
Hey HN! We’ve been building an MCP server to help AI-assisted web app developers by using browser agents to test whether changes made by an AI inside an editor actually work. We've been testing it on scenarios like verifying new flows in a UI, or checking that sending a chat request triggers a response. The idea is to let your coding agent both code and evaluate if what it did was correct. Here’s a short demo with Cursor: https://www.youtube.com/watch?v=_AoQK-bwR0w
When building apps, we found the hardest part of AI-assisted coding isn’t the coding—it’s tedious point-and-click testing to see if things work. We got tired of this loop: open the app, click through flows, stare at the network tab, copy console errors to the editor, repeat. It felt obvious this should be AI-assisted too. If you can vibe-code, you should be able to vibe-test!
Some agents like Cline and Windsurf have browser integrations, but Cline’s (via Anthropic Computer Use) felt slow and only reported console logs, and Windsurf’s didn’t work reliably yet. We got so tired of manually testing that we decided to fix it.
Our MCP server sits between your IDE agent (Cursor/Windsurf/Cline/Continue) and a Playwright-powered browser-use agent. It spins up the browser, navigates your app per instructions from the IDE agent, and sends back steps, console events, and network events so the IDE agent can assess the app’s state.
We proxy Browser-use’s original Claude calls and swap in Gemini Flash 2.0, cutting latency from ~8s → ~3s per step. We also cap console/network logs at 10,000 characters to stay within context limits, and filter out irrelevant logs (e.g., noisy XHR requests).
At the end, the browser agent outputs a summary like:
Web Evaluation Report for http://localhost:5173
Task: delete an API key and evaluate UX
Steps: Home → Login → API Keys → Create Key → Delete Key
Flow tested successfully; UX had problems X, Y, Z...
Console (8)... Network (13)... Timeline of events (57) …
This gives the coding agent the ability to recognize console and network errors, or any issues with clicking around, and fix them before returning to the user. (There’s a longer example in the README at https://github.com/Operative-Sh/web-eval-agent.)

Try it in Cursor / Cline / Windsurf / Claude Desktop (macOS/Linux):
curl -LSf https://operative.sh/install.sh -o install.sh
less -N install.sh # inspect if you’d like
bash install.sh # installs uv + jq + Playwright + server
# then in Cursor/Cline/Windsurf/Continue: craft a prompt using the web_eval_agent tool
(For Windows, there’s a 4-line manual install in the README.)

What we want to do next: pause/go for OAuth screens; save/load browser auth states; Playwright step recording for automated test creation and regression-test creation; supporting Loveable / v0 / Bolt.new sites by offering a web version.
We’d love to hear your feedback, especially if you’ve experienced the pain of having to manually test changes happening in your web apps after making changes from inside your IDE, or if you’ve tried any alternative MCP tools for this that have worked well.
Try it out if you feel it’d be helpful for your workflow: https://github.com/Operative-Sh/web-eval-agent. (note: the server hits our operative.sh proxy to cover Gemini tokens. The MCP server itself is OSS; Anthropic base-URL support is coming soon. Free tier included; heavy users can grab the $10 plan to offset our model bill.)
Let us know what you think! Thanks for reading!
Show HN: Heart Rate Zones Plus – The first iOS app I developed
I built this iOS app because I wanted an overview of my weekly time in heart rate zones without manually checking the zones after every workout. Now I'm looking for feedback.
Description: Track time in heart rate zones. See how much time you spend in each zone per day, week, or month, or over the last 7 or 30 days. Set goals and visualize progress. Get details about the heart rate zones of your workouts.
Features: custom time periods; workout-to-zone attribution, to get a feel for which sport contributed most to each zone; multiple zone calculation methods; personal time goals for any zone; workout breakdown.
Pricing: Free
Privacy: Nothing is tracked or sent anywhere. Your data stays on your device.
Any feedback and feature requests are appreciated.
Download: https://apps.apple.com/us/app/heart-rate-zones-plus/id674474...
Video of the app in action: https://www.youtube.com/shorts/-qtHxEdMEv0