Temporal: A nine-year journey to fix time in JavaScript
The article discusses the Temporal API, a new JavaScript standard that simplifies date and time manipulation. It provides an overview of the Temporal API's features, including its ability to handle time zones, calendars, and various date/time formats.
Many SWE-bench-Passing PRs would not be merged
The article argues that many pull requests that pass SWE-bench's test suites would not actually be merged by real maintainers, highlighting the gap between passing automated tests and meeting the human review standards of production codebases.
Don't post generated/AI-edited comments. HN is for conversation between humans.
The article outlines Hacker News' guidelines for submissions, including recommendations for creating high-quality posts, avoiding common pitfalls, and maintaining a constructive community. It emphasizes the importance of sharing interesting and thought-provoking content while adhering to the site's rules and principles.
Making WebAssembly a first-class language on the Web
The article discusses Mozilla's efforts to make WebAssembly a first-class language on the web, by improving its performance, interoperability, and developer experience, with the goal of enabling a more diverse set of applications on the web.
Personal Computer by Perplexity
The article discusses the increasing demand for personal computers due to remote work and learning during the COVID-19 pandemic, leading to supply chain issues and long waitlists for popular models. Manufacturers are working to address the shortage, but consumer demand continues to outpace supply.
Show HN: I built a tool that watches webpages and exposes changes as RSS
I built Site Spy after missing a visa appointment slot because a government page changed and I didn’t notice for two weeks.
It watches webpages for changes and shows the result like a diff. The part I think HN might find interesting is that it can monitor a specific element on a page, not just the whole page, and it can expose changes as RSS feeds.
So instead of tracking an entire noisy page, you can watch just a price, a stock status, a headline, or a specific content block. When it changes, you can inspect the diff, browse the snapshot history, or follow the updates in an RSS reader.
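Element-level change detection can be approximated as: snapshot the watched element's text, compare against the previous snapshot, and emit a diff only when that element changed. A minimal sketch using Python's difflib (illustrative only, not Site Spy's actual implementation):

```python
import difflib

def element_diff(old_snapshot: str, new_snapshot: str) -> str:
    """Return a unified diff of a single watched element, or '' if unchanged."""
    if old_snapshot == new_snapshot:
        return ""
    diff = difflib.unified_diff(
        old_snapshot.splitlines(),
        new_snapshot.splitlines(),
        fromfile="previous",
        tofile="current",
        lineterm="",
    )
    return "\n".join(diff)

# Watching just the price element ignores noise elsewhere on the page.
print(element_diff("Price: $49", "Price: $39"))
```

Because the comparison is scoped to one element, rotating ads or timestamps elsewhere on the page never trigger an alert.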
It’s a Chrome/Firefox extension plus a web dashboard.
Main features:
- Element picker for tracking a specific part of a page
- Diff view plus full snapshot timeline
- RSS feeds per watch, per tag, or across all watches
- MCP server for Claude, Cursor, and other AI agents
- Browser push, Email, and Telegram notifications
Chrome: https://chromewebstore.google.com/detail/site-spy/jeapcpanag...
Firefox: https://addons.mozilla.org/en-GB/firefox/addon/site-spy/
Docs: https://docs.sitespy.app
I’d especially love feedback on two things:
- Is RSS actually a useful interface for this, or do most people just want direct alerts?
- Does element-level tracking feel meaningfully better than full-page monitoring?
Show HN: Autoresearch@home
autoresearch@home is a collaborative research collective where AI agents pool GPU resources to improve a shared language model. Think SETI@home, but for model training.
How it works: Agents read the current best result, propose a hypothesis, modify train.py, run the experiment on your GPU, and publish results back. When an agent beats the current best validation loss, that becomes the new baseline for every other agent. Agents learn from great runs and failures, since we're using Ensue as the collective memory layer.
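The coordination loop described above can be sketched as follows (hypothetical names throughout; the real agent edits train.py and publishes to the shared memory layer, which this stub fakes with a local dict):

```python
import random

def run_experiment(hypothesis: str) -> float:
    """Stand-in for editing train.py and training; returns a validation loss."""
    random.seed(hash(hypothesis) % 2**32)
    return round(random.uniform(2.0, 4.0), 3)

def agent_step(collective: dict) -> dict:
    """One iteration: read best, propose, run, publish if it beats the baseline."""
    hypothesis = f"try lr variant #{len(collective['history'])}"
    loss = run_experiment(hypothesis)
    collective["history"].append((hypothesis, loss))
    if loss < collective["best_loss"]:
        # This result becomes the new baseline every other agent builds on.
        collective["best_loss"] = loss
        collective["best_hypothesis"] = hypothesis
    return collective

collective = {"best_loss": float("inf"), "best_hypothesis": None, "history": []}
for _ in range(5):
    collective = agent_step(collective)
```

The key design point is that publishing is conditional on beating the shared baseline, so every agent always reads the current best result rather than duplicating work.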
This project extends Karpathy's autoresearch by adding the missing coordination layer so agents can actually build on each other's work.
To participate, you need an agent and a GPU. The agent handles everything: cloning the repo, connecting to the collective, picking experiments, running them, publishing results, and asking you to verify you're a real person via email.
Send this prompt to your agent to get started: "Read https://github.com/mutable-state-inc/autoresearch-at-home, follow the instructions, join autoresearch, and start contributing."
This whole experiment is to prove that agents work better when they can build off other agents. The timeline is live, so you can watch experiments land in real time.
Show HN: A context-aware permission guard for Claude Code
We needed something like --dangerously-skip-permissions that doesn’t nuke your untracked files, exfiltrate your keys, or install malware.
Claude Code's permission system is allow-or-deny per tool, but that doesn’t really scale. Deleting some files is fine sometimes. And git checkout is sometimes not fine. Even when you curate permissions, 200 IQ Opus can find a way around it. Maintaining a deny list is a fool's errand.
nah is a PreToolUse hook that classifies every tool call by what it actually does, using a deterministic classifier that runs in milliseconds. It maps commands to action types like filesystem_read, package_run, db_write, git_history_rewrite, and applies policies: allow, context (depends on the target), ask, or block.
Not everything can be classified, so you can optionally escalate ambiguous stuff to an LLM, but that’s not required. Anything unresolved you can approve, and configure the taxonomy so you don’t get asked again.
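The classify-then-apply-policy flow might look roughly like this (illustrative action types and rules only, not nah's actual taxonomy or API):

```python
# Map a command prefix to an action type, then an action type to a policy.
RULES = [
    ("rm -rf", "filesystem_delete"),
    ("git filter-branch", "git_history_rewrite"),
    ("cat ", "filesystem_read"),
    ("npx ", "package_run"),
]
POLICIES = {
    "filesystem_read": "allow",
    "package_run": "ask",
    "filesystem_delete": "context",  # decision depends on the target path
    "git_history_rewrite": "block",
}

def classify(command: str) -> str:
    """Deterministic classification: no LLM call, runs in microseconds."""
    for prefix, action in RULES:
        if command.startswith(prefix):
            return action
    return "unknown"  # unresolved: optionally escalate to an LLM, or ask

def decide(command: str) -> str:
    return POLICIES.get(classify(command), "ask")
```

The point of classifying by action type rather than by tool is that `git checkout` and `rm` can each be fine or dangerous depending on what they touch, which a flat per-tool allow/deny list cannot express.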
It works out of the box with sane defaults, no config needed. But you can customize it fully if you want to.
No dependencies, stdlib Python, MIT.
pip install nah && nah install
https://github.com/manuelschipper/nah
Google closes deal to acquire Wiz
Previously: Google to buy Wiz for $32B - https://news.ycombinator.com/item?id=43398518 - March 2025 (845 comments)
I was interviewed by an AI bot for a job
https://archive.ph/DEwy7
The MacBook Neo
https://www.pcmag.com/news/asus-co-ceo-macbook-neo-is-a-shoc...
CNN Explainer – Learn Convolutional Neural Network in Your Browser (2020)
The article provides an interactive visualization tool that explains the inner workings of Convolutional Neural Networks (CNNs), a widely used deep learning architecture for image recognition tasks. The tool allows users to interactively explore and understand the key components and concepts of CNNs, such as convolution, pooling, and activation functions.
Meticulous (YC S21) is hiring to redefine software dev
Meticulous, a software company, is seeking a Lead Product Designer to join their team. The role involves collaborating with cross-functional teams, driving the design vision, and championing user-centric solutions to solve complex problems.
BitNet: 100B Param 1-Bit model for local CPUs
BitNet is an open-source Microsoft project for running 1-bit (ternary-weight) large language models on local CPUs. With optimized low-bit kernels, it makes inference of models on the order of 100B parameters feasible without a GPU, addressing the memory and energy costs of full-precision models.
Entities enabling scientific fraud at scale (2025)
Preliminary data from a longitudinal AI impact study
The article discusses a study by OpenAI and McKinsey that estimates AI productivity gains to be around 10%, lower than commonly reported. It highlights the need for realistic expectations and a cautious approach when it comes to the impact of AI on productivity and employment.
Show HN: Klaus – OpenClaw on a VM, batteries included
We are Bailey and Robbie and we are working on Klaus (https://klausai.com/): hosted OpenClaw that is secure and powerful out of the box.
Running OpenClaw requires setting up a cloud VM or local container (a pain) or giving OpenClaw root access to your machine (insecure). Many basic integrations (eg Slack, Google Workspace) require you to create your own OAuth app.
We make running OpenClaw simple by giving each user their own EC2 instance, preconfigured with keys for OpenRouter, AgentMail, and Orthogonal. And we have OAuth apps to make it easy to integrate with Slack and Google Workspace.
We are both HN readers (Bailey has been on here for ~10 years) and we know OpenClaw has serious security concerns. We do a lot to make our users’ instances more secure: we run on a private subnet, automatically update the OpenClaw version our users run, and because you’re on our VM by default the only keys you leak if you get hacked belong to us. Connecting your email is still a risk. The best defense I know of is Opus 4.6 for resilience to prompt injection. If you have a better solution, we’d love to hear it!
We learned a lot about infrastructure management in the past month. Kimi K2.5 and MiniMax M2.5 are extremely good at hallucinating new ways to break openclaw.json and otherwise wreaking havoc on an EC2 instance. The week after our launch we spent 20+ hours fixing broken machines by hand.
We wrote a ton of best practices on using OpenClaw on AWS Linux into our users’ AGENTS.md, got really good at un-bricking EC2 machines over SSM, added a command-and-control server to every instance to facilitate hotfixes and migrations, and set up a Klaus instance to answer FAQs on discord.
In addition to all of this, we built ClawBert, our AI SRE for hotfixing OpenClaw instances automatically: https://www.youtube.com/watch?v=v65F6VBXqKY. Clawbert is a Claude Code instance that runs whenever a health check fails or the user triggers it in the UI. It can read that user’s entries in our database and execute commands on the user’s instance. We expose a log of Clawbert’s runs to the user.
We know that setting up OpenClaw is easy for most HN readers, but I promise it is not for most people. Klaus has a long way to go, but it’s still very rewarding to see people who’ve never used Claude Code get their first taste of AI agents.
We charge $19/m for a t4g.small, $49/m for a t4g.medium, and $200/m for a t4g.xlarge and priority support. You get $15 in tokens and $20 in Orthogonal credits one-time.
We want to know what you are building on OpenClaw so we can make sure we support it. We are already working with companies like Orthogonal and Openrouter that are building things to make agents more useful, and we’re sure there are more tools out there we don’t know about. If you’ve built something agents want, please let us know. Comments welcome!
5,200 holes carved into a Peruvian mountain left by an ancient economy
Researchers have documented more than 5,200 holes carved into a mountainside in Peru and argue, based on new soil and plant-residue evidence, that they were the infrastructure of an ancient economy, likely a barter marketplace later repurposed as an accounting system under the Inca, rather than a natural geological formation.
Against vibes: When is a generative model useful
The article argues that generative models can be useful in specific applications, but cautions against relying on 'vibes' or intuition when evaluating their efficacy. It emphasizes the importance of rigorous testing and quantitative analysis in determining the appropriate use cases for generative models.
Britain is ejecting hereditary nobles from Parliament after 700 years
The UK House of Lords has voted to expel all remaining hereditary peers, marking the end of a centuries-old tradition. This move aims to make the upper chamber more representative and accountable, as the government continues its efforts to reform the House of Lords.
How we hacked McKinsey's AI platform
The article describes how a team of researchers successfully exploited vulnerabilities in McKinsey's AI platform, highlighting the importance of robust security measures for AI systems and the need for thorough testing and validation to prevent such breaches.
Physicist Astrid Eichhorn is a leader in the field of asymptotic safety
The article profiles theoretical physicist Astrid Eichhorn, a leader in asymptotic safety, an approach to quantum gravity in which space-time may become effectively fractal at the smallest scales rather than being built from strings. Her program offers an alternative to string theory for understanding the fundamental nature of reality.
Swiss e-voting pilot can't count 2,048 ballots after decryption failure
The article discusses a decryption failure in a Swiss e-voting pilot that left 2,048 ballots uncountable, raising concerns about the reliability and security of the country's e-voting infrastructure. It highlights the challenges governments face in implementing secure and transparent digital voting systems.
Building Better Country Selects
The article discusses the importance of creating better country select components in web applications, focusing on improving the user experience, accessibility, and performance aspects of this commonly used UI element.
Show HN: Open-source browser for AI agents
Hi HN, I forked Chromium and built agent-browser-protocol (ABP) after noticing that most browser-agent failures aren’t really about the model misunderstanding the page. Instead, the problem is that the model is reasoning from a stale state.
ABP is designed to keep the acting agent synchronized with the browser at every step. After each action (click, type, etc), it freezes JavaScript execution and rendering, then captures the resulting state. It also compiles the notable events that occurred during that action loop, such as navigation, file pickers, permission prompts, alerts, and downloads, and sends that along with a screenshot of the frozen page state back to the agent.
The result is that browser interaction starts to feel more like a multimodal chat loop. The agent takes an action, gets back a fresh visual state and a structured summary of what happened, then decides what to do next from there. That fits much better with how LLMs already work.
A few common browser-use failures ABP helps eliminate:
- A modal appears after the last Playwright screenshot and blocks the input the agent was about to use
- Dynamic filters cause the page to reflow between steps
- An autocomplete dropdown opens and covers the element the agent intended to click
- alert() / confirm() interrupts the flow
- Downloads are triggered, but the agent has no reliable way to know when they’ve completed
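The act → freeze → observe loop that prevents these failures can be sketched as (hypothetical interface; ABP's real wire protocol is not shown here):

```python
from dataclasses import dataclass, field

@dataclass
class Observation:
    screenshot: bytes
    events: list = field(default_factory=list)  # navigations, dialogs, downloads...

class FrozenBrowser:
    """Toy stand-in: after each action, execution is paused and state captured."""
    def __init__(self):
        self.log = []

    def act(self, action: str) -> Observation:
        self.log.append(action)
        # 1. perform the action, 2. freeze JS execution and rendering,
        # 3. collect notable events that occurred during that action loop
        events = ["navigated"] if action.startswith("click") else []
        return Observation(screenshot=b"<png>", events=events)

browser = FrozenBrowser()
obs = browser.act("click #submit")
# The agent now reasons from a state guaranteed to match the screenshot:
# no modal can appear between the capture and the next decision.
```

Freezing before capture is what makes this a chat-like loop: the screenshot and event summary the agent receives cannot go stale while it is thinking.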
As proof, ABP with Opus 4.6 as the driver scores 90.5% on the Online Mind2Web benchmark. I think modern LLMs already understand websites; they just need a better tool to interact with them. Happy to answer questions about the architecture, forking Chromium, or anything else in the comments below.
Try it out: `claude mcp add browser -- npx -y agent-browser-protocol --mcp` (Codex/OpenCode instructions in the docs)
Demo video: https://www.loom.com/share/387f6349196f417d8b4b16a5452c3369
Can the Dictionary Keep Up?
The article explores the history and evolution of dictionaries, tracing how they have transformed from mere collections of words to complex cultural artifacts that reflect the changing times and values of society.
Launch HN: Prism (YC X25) – Workspace and API to generate and edit videos
Hey HN — we’re Rajit, Land, and Alex. We’re building Prism (https://www.prismvideos.com), an AI video creation platform and API.
Here’s a quick demo of how you can remix any video with Prism: https://youtu.be/0eez_2DnayI
Here’s a quick demo of how you can automate UGC-style ads with Openclaw + Prism: https://www.youtube.com/watch?v=5dWaD23qnro
Accompanying skill.md file: https://docs.google.com/document/d/1lIskVljW1OqbkXFyXeLHRsfM...
Making an AI video today usually means stitching together a dozen tools (image generation, image-to-video, upscalers, lip-sync, voiceover, and an editor). Every step turns into export/import and file juggling, so assets end up scattered across tabs and local storage, and iterating on a multi-scene video is slow.
Prism keeps the workflow in one place: you generate assets (images/video clips) and assemble them directly in a timeline editor without downloading files between tools. Practically, that means you can try different models (Kling, Veo, Sora, Hailuo, etc) and settings for a single clip, swap it on the timeline, and keep iterating without re-exporting and rebuilding the edit elsewhere.
We also support templates and one-click asset recreation, so you can reuse workflows from us or the community instead of rebuilding each asset from scratch. Those templates are exposed through our API, letting your AI agents discover templates in our catalog, supply the required inputs, and generate videos in a repeatable way without manually stitching the workflow together.
We built Prism because we were making AI videos ourselves and were unsatisfied with the available tools. We kept losing time to repetitive “glue work” such as constantly downloading files, keeping track of prompts/versions, and stitching clips in a separate video editing software. We’re trying to make the boring parts of multi-step AI video creation less manual so users can generate → review → edit → assemble → export, all inside one platform.
Pricing is based on usage credits, with a free tier (100 credits/month) and free models, so you can try it without providing a credit card: https://prismvideos.com.
We’d love to hear from people who’ve tried making AI videos: where does your workflow break, what parts are the most tedious, and what do you wish video creation tools on the market could do?
Show HN: Satellite imagery object detection using text prompts
I built a browser-based tool for detecting objects in satellite imagery using vision-language models (VLMs). You draw a polygon on the map and enter a text prompt such as "swimming pools", "oil tanks", or "buses". The system scans the selected area tile-by-tile and returns detections projected back onto the map as GeoJSON.
Pipeline: select area and zoom level, split the region into mercantile tiles, run each tile with the prompt through a VLM, convert predicted bounding boxes to geographic coordinates (WGS84), and render the results back on the map.
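The geo-projection step, converting a pixel bounding box predicted on a 256×256 Web Mercator tile back into WGS84 longitude/latitude, can be sketched with standard slippy-map math (a sketch assuming 256px tiles, not the tool's exact code):

```python
import math

TILE_SIZE = 256  # standard slippy-map tile size in pixels

def pixel_to_lonlat(tx: int, ty: int, zoom: int, px: float, py: float):
    """Convert a pixel inside tile (tx, ty) at a zoom level to WGS84 lon/lat."""
    n = 2 ** zoom
    x = (tx + px / TILE_SIZE) / n  # fractional tile coordinate in [0, 1]
    y = (ty + py / TILE_SIZE) / n
    lon = x * 360.0 - 180.0
    # Inverse Web Mercator for latitude
    lat = math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * y))))
    return lon, lat

def bbox_to_geo(tx, ty, zoom, box):
    """box = (xmin, ymin, xmax, ymax) in tile pixels -> [west, south, east, north]."""
    west, north = pixel_to_lonlat(tx, ty, zoom, box[0], box[1])
    east, south = pixel_to_lonlat(tx, ty, zoom, box[2], box[3])
    return [west, south, east, north]
```

Note the y-axis flip: pixel y grows downward while latitude grows upward, so the box's top edge maps to north and its bottom edge to south.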
It works reasonably well for distinct structures in a zero-shot setting; occluded objects are still better handled by specialized detectors like YOLO models.
There is a public demo and no login required. I am mainly interested in feedback on detection quality, performance tradeoffs between VLMs and specialized detectors, and potential real-world use cases.
Launch HN: Sentrial (YC W26) – Catch AI agent failures before your users do
Hey HN! We're Neel and Anay, and we’re building Sentrial (https://sentrial.com). It’s production monitoring for AI products. We automatically detect failure patterns: loops, hallucinations, tool misuse, and user frustrations the moment they happen. When issues surface, Sentrial diagnoses the root cause by analyzing conversation patterns, model outputs, and tool interactions, then recommends specific fixes.
Here's a demo if you're interested: https://www.youtube.com/watch?v=cc4DWrJF7hk. When agents fail, choose wrong tools, or blow cost budgets, there's no way to know why - usually just logs and guesswork. As agents move from demos to production with real SLAs and real users, this is not sustainable.
Neel and I lived this, building agents at SenseHQ and Accenture where we found that debugging agents was often harder than actually building them. Agents are untrustworthy in prod because there’s no good infrastructure to verify what they’re actually doing.
In practice this looks like:
- A support agent that began misclassifying refund requests as product questions, which meant customers never reached the refund flow.
- A document drafting agent that would occasionally hallucinate missing sections when parsing long specs, producing confident but incorrect outputs.
There’s no stack trace or 500 error, and you only figure this out when a customer is angry.
We both realized teams were flying blind in production, and that agent native monitoring was going to be foundational infrastructure for every serious AI product. We started Sentrial as a verification layer designed to take care of this.
How it works: You wrap your client with our SDK in only a couple of lines. From there, we detect drift for you:
- Wrong tool invocations
- Misunderstood intents
- Hallucinations
- Quality regressions over time
You see it on our platform before a customer files a ticket.
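One of the simplest drift signals, an agent stuck in a loop, can be caught by watching for repeated identical tool calls (an illustrative heuristic only; Sentrial's actual detectors are not public):

```python
from collections import Counter

def detect_loops(tool_calls, threshold=3):
    """Flag (tool, args) pairs that repeat `threshold` or more times in a window."""
    counts = Counter((call["tool"], call["args"]) for call in tool_calls)
    return [pair for pair, n in counts.items() if n >= threshold]

calls = [
    {"tool": "search", "args": "refund policy"},
    {"tool": "search", "args": "refund policy"},
    {"tool": "search", "args": "refund policy"},
    {"tool": "reply", "args": "Here is our refund policy..."},
]
# Three identical searches in one window is a classic stuck-agent signature.
```

Real monitoring would combine this with semantic checks (near-identical rather than exactly identical calls), but even this exact-match version catches the most common failure mode.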
There’s a quick MCP setup; just give Claude Code: claude mcp add --transport http Sentrial https://www.sentrial.com/docs/mcp
We have a free tier (14 days, no credit card required). We’d love feedback from anyone running agents, whether for personal use or in a professional setting.
We’ll be around in the comments!
Building a TB-303 from Scratch
The article provides a comprehensive guide on how to recreate the iconic Roland TB-303 bassline synthesizer from scratch, covering the technical details and principles behind its unique sound.