AI error jails innocent grandmother for months in North Dakota fraud case
An innocent grandmother in North Dakota was jailed for months due to an error in an artificial intelligence system used to detect fraud, highlighting the potential risks of over-reliance on AI in the criminal justice system.
Malus – Clean Room as a Service
https://fosdem.org/2026/schedule/event/SUVS7G-lets_end_open_...
https://malus.sh/blog.html
Bubble Sorted Amen Break
Reversing memory loss via gut-brain communication
The article explores the link between gut microbiome and cognitive decline, highlighting research that suggests gut bacteria may play a role in age-related brain changes and memory loss. It suggests that modifying the gut microbiome could potentially help maintain cognitive function as people age.
ATMs didn't kill bank teller jobs, but the iPhone did
The article revisits the prediction that ATMs would eliminate bank tellers: for decades the two complemented each other, with ATMs handling routine transactions while tellers focused on more complex customer needs. It was mobile banking on smartphones, not the ATM, that ultimately cut teller jobs.
The Met releases high-def 3D scans of 140 famous art objects
The Metropolitan Museum of Art has released high-definition 3D scans of 140 famous art objects from its collection, allowing people to virtually explore and study these works in unprecedented detail.
Illinois introduces OS-level age verification law
Illinois has introduced a bill that would move age verification to the operating-system level, making device platforms rather than individual websites and apps responsible for verifying users' ages.
Runners who churn butter on their runs
The article profiles runners who carry containers of cream on their runs and let the jostling of each stride churn it into butter, looking at the mechanics of churning and why the practice has caught on.
Launch HN: IonRouter (YC W26) – High-throughput, low-cost inference
Hey HN — I’m Veer and my cofounder is Suryaa. We're building Cumulus Labs (YC W26), and we're releasing our latest product, IonRouter (https://ionrouter.io/), an inference API for open-source and fine-tuned models. You swap in our base URL, keep your existing OpenAI client code, and get access to any model (open source or fine-tuned by you) running on our own inference engine.
The problem we kept running into: every inference provider is either fast-but-expensive (Together, Fireworks — you pay for always-on GPUs) or cheap-but-DIY (Modal, RunPod — you configure vLLM yourself and deal with slow cold starts). Neither felt right for teams that just want to ship.
Suryaa spent years building GPU orchestration infrastructure at TensorDock and production systems at Palantir. I led ML infrastructure and Linux kernel development for Space Force and NASA contracts where the stack had to actually work under pressure. When we started building AI products ourselves, we kept hitting the same wall: GPU infrastructure was either too expensive or too much work.
So we built IonAttention — a C++ inference runtime designed specifically around the GH200's memory architecture. Most inference stacks treat GH200 as a compatibility target (make sure vLLM runs, use CPU memory as overflow). We took a different approach and built around what makes the hardware actually interesting: a 900 GB/s coherent CPU-GPU link, 452GB of LPDDR5X sitting right next to the accelerator, and 72 ARM cores you can actually use.
Three things came out of that work that we think are novel: (1) using hardware cache coherence to make CUDA graphs behave as if they have dynamic parameters at zero per-step cost — something that only works on GH200-class hardware; (2) eager KV block writeback driven by immutability rather than memory pressure, which drops eviction stalls from 10ms+ to under 0.25ms; (3) phantom-tile attention scheduling at small batch sizes that cuts attention time by over 60% in the worst-affected regimes. We wrote up the details at cumulus.blog/ionattention.
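The eager-writeback idea in (2) can be sketched in a few lines. This is a toy Python illustration of the concept under my own assumptions, not IonAttention's C++: a KV block that has filled up can never change again, so it is copied to host memory the moment it becomes immutable, and a later eviction under memory pressure is just a free rather than a blocking copy.

```python
# Toy sketch of immutability-driven eager KV writeback (illustrative only).
class KVBlock:
    def __init__(self, capacity):
        self.capacity = capacity
        self.tokens = []
        self.written_back = False  # a host-memory copy exists

    def append(self, tok):
        assert len(self.tokens) < self.capacity
        self.tokens.append(tok)

    @property
    def immutable(self):
        return len(self.tokens) == self.capacity


class Cache:
    def __init__(self, block_capacity):
        self.block_capacity = block_capacity
        self.gpu_blocks = []
        self.host_copies = []

    def append_token(self, tok):
        if not self.gpu_blocks or self.gpu_blocks[-1].immutable:
            self.gpu_blocks.append(KVBlock(self.block_capacity))
        blk = self.gpu_blocks[-1]
        blk.append(tok)
        if blk.immutable:
            # Eager writeback at the moment of immutability, not at eviction.
            self.host_copies.append(list(blk.tokens))
            blk.written_back = True

    def evict(self, blk):
        # Under memory pressure: no copy needed if writeback already happened.
        cost = 0 if blk.written_back else len(blk.tokens)  # proxy for copy cost
        self.gpu_blocks.remove(blk)
        return cost
```

The point of the sketch is the invariant: by eviction time, every full block is already on the host, so the stall on the critical path disappears.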
On multimodal pipelines we get better performance than big players (588 tok/s vs. Together AI's 298 on the same VLM workload). We're honest that p50 latency is currently worse (~1.46s vs. 0.74s) — that's the tradeoff we're actively working on.
Pricing is per token, no idle costs: GPT-OSS-120B is $0.02 in / $0.095 out, Qwen3.5-122B is $0.20 in / $1.60 out. Full model list and pricing at https://ionrouter.io.
You can try the playground at https://ionrouter.io/playground right now, no signup required, or drop your API key in and swap the base URL — it's one line. We built this so teams can see the power of our engine and eventually come to us for their fine-tuned model needs using the same solution.
We're curious what you think, especially if you're running finetuned or custom models — that's the use case we've invested the most in. What's broken, what would make this actually useful for you?
Show HN: OneCLI – Vault for AI Agents in Rust
We built OneCLI because AI agents are being given raw API keys. And it's going about as well as you'd expect. We figured the answer isn't "don't give agents access," it's "give them access without giving them secrets."
OneCLI is an open-source gateway that sits between your AI agents and the services they call. You store your real credentials once in OneCLI's encrypted vault, and give your agents placeholder keys. When an agent makes an HTTP call through the proxy, OneCLI matches the request by host/path, verifies the agent should have access, swaps the placeholder for the real credential, and forwards the request. The agent never touches the actual secret. It just uses CLI or MCP tools as normal.
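The placeholder-swap step can be illustrated with a minimal sketch (my own Python illustration of the idea, not OneCLI's Rust code; the vault contents, agent names, and allowlist shape here are all hypothetical): the vault maps a (host, placeholder) pair to the real credential, and the proxy rewrites the Authorization header only after checking the agent is allowed to call that host.

```python
# Hypothetical sketch of placeholder-for-credential substitution in a proxy.
VAULT = {
    # (host, placeholder key) -> real secret; example data only
    ("api.example.com", "onecli_pk_demo"): "sk-real-secret",
}
ALLOWED = {"agent-1": {"api.example.com"}}  # per-agent host allowlist


def rewrite_request(agent_id, host, headers):
    """Verify access, then swap the placeholder for the real credential."""
    if host not in ALLOWED.get(agent_id, set()):
        raise PermissionError(f"{agent_id} may not call {host}")
    placeholder = headers.get("Authorization", "").removeprefix("Bearer ")
    real = VAULT.get((host, placeholder))
    if real is None:
        return headers  # unknown key: forward the request untouched
    return {**headers, "Authorization": f"Bearer {real}"}
```

The agent only ever sees `onecli_pk_demo`; the real secret exists solely inside the proxy boundary.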
Try it in one line: docker run --pull always -p 10254:10254 -p 10255:10255 -v onecli-data:/app/data ghcr.io/onecli/onecli
The proxy is written in Rust, the dashboard is Next.js, and secrets are AES-256-GCM encrypted at rest. Everything runs in a single Docker container with an embedded Postgres (PGlite), no external dependencies. Works with any agent framework (OpenClaw, NanoClaw, IronClaw, or anything that can set an HTTPS_PROXY).
We started with what felt most urgent: agents shouldn't be holding raw credentials. The next layer is access policies and audit, defining what each agent can call, logging everything, and requiring human approval before sensitive actions go through.
It's Apache-2.0 licensed. We'd love feedback on the approach, and we're especially curious how people are handling agent auth today.
GitHub: https://github.com/onecli/onecli Site: https://onecli.sh
Bringing Chrome to ARM64 Linux Devices
The article announces that the Chromium team is working to bring the Chrome browser to ARM64 Linux devices, enabling more users to access the browser on a wider range of hardware platforms.
An old photo of a large BBS (2022)
The article discusses an old photograph of a large bulletin board system (BBS) installation, using it as a window into the scale of dial-up BBS operations and the racks of modems and hardware they required.
Forcing Flash Attention onto a TPU and Learning the Hard Way
The article recounts the author's attempt to port Flash Attention to a TPU, covering the mismatch between the algorithm's GPU-oriented memory access patterns and the TPU's architecture, and the lessons learned the hard way along the way.
Document poisoning in RAG systems: How attackers corrupt AI's sources
I'm the author. Repo is here: https://github.com/aminrj-labs/mcp-attack-labs/tree/main/lab...
The lab runs entirely on LM Studio + Qwen2.5-7B-Instruct (Q4_K_M) + ChromaDB — no cloud APIs, no GPU required, no API keys.
From zero to seeing the poisoning succeed: git clone, make setup, make attack1. About 10 minutes.
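Why the poisoning succeeds so easily in a small corpus can be shown with a toy retrieval example (my own illustration, not the lab's code; the vectors are made up): a document whose embedding has been pulled toward the target query outranks every legitimate document in top-1 retrieval.

```python
# Toy cosine-similarity retrieval showing a poisoned document winning top-1.
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)


def top1(corpus, query):
    """Return the document id with the highest similarity to the query."""
    return max(corpus, key=lambda d: cosine(corpus[d], query))


clean = {
    "doc1": [0.9, 0.1, 0.0],
    "doc2": [0.7, 0.3, 0.1],
}
query = [1.0, 0.0, 0.2]

# Attacker crafts text whose embedding lands almost on top of the query:
poisoned = dict(clean, poisoned=[0.99, 0.01, 0.19])
```

With a 5-document corpus, one well-placed document like this is enough; in a larger corpus the attacker needs proportionally more of them, but the mechanism is identical.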
Two things worth flagging upfront:
- The 95% success rate is against a 5-document corpus (best case for the attacker). In a mature collection you need proportionally more poisoned docs to dominate retrieval — but the mechanism is the same.
- Embedding anomaly detection at ingestion was the biggest surprise: 95% → 20% as a standalone control, outperforming all three generation-phase defenses combined. It runs on embeddings your pipeline already produces — no additional model.
All five layers combined: 10% residual.
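The ingestion-time embedding control can be sketched simply (again my own illustration, not the repo's implementation; the two-times-mean threshold is an arbitrary choice for the example): flag any incoming document whose embedding sits unusually far from the corpus centroid, using only the embeddings the pipeline already produces.

```python
# Sketch of centroid-distance anomaly detection at ingestion time.
import math


def centroid(vecs):
    n = len(vecs)
    return [sum(v[i] for v in vecs) / n for i in range(len(vecs[0]))]


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def flag_outliers(embeddings, factor=2.0):
    """Return ids whose distance from the centroid exceeds factor * mean distance."""
    c = centroid(list(embeddings.values()))
    dists = {k: dist(v, c) for k, v in embeddings.items()}
    mean = sum(dists.values()) / len(dists)
    return {k for k, d in dists.items() if d > factor * mean}
```

A real control would use a robust statistic rather than the mean, but the shape is the same: no extra model, just a distance check on vectors you already have.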
Happy to discuss methodology, the PoisonedRAG comparison, or anything that looks off.
Converge (YC S23) Is Hiring a Founding Platform Engineer (NYC, Onsite)
The article describes the Founding Platform Engineer role at Converge (YC S23), a startup building a unified data platform. The position involves designing and implementing the core infrastructure, frameworks, and tooling that support the company's data-driven products and services.
WolfIP: Lightweight TCP/IP stack with no dynamic memory allocations
Dolphin Progress Release 2603
The article highlights the latest updates and improvements made to the Dolphin emulator, including support for Nintendo GameCube and Wii games on a wider range of devices, enhanced performance, and new features like expanded audio and graphics capabilities.
Big data on the cheapest MacBook
The article explores how DuckDB, a lightweight and embeddable SQL database, can be used to process large datasets on even the cheapest MacBook, providing a cost-effective approach to big data analysis on low-powered hardware.
Shall I implement it? No
The article argues that the default answer to "shall I implement it?" should usually be no, making the case that declining to build a requested feature is often the better engineering decision than taking on the cost of building and maintaining it.
Show HN: Understudy – Teach a desktop agent by demonstrating a task once
I built Understudy because a lot of real work still spans native desktop apps, browser tabs, terminals, and chat tools. Most current agents live in only one of those surfaces.
Understudy is a local-first desktop agent runtime that can operate GUI apps, browsers, shell tools, files, and messaging in one session. The part I'm most interested in feedback on is teach-by-demonstration: you do a task once, the agent records screen video + semantic events, extracts the intent rather than coordinates, and turns it into a reusable skill.
Demo video: https://www.youtube.com/watch?v=3d5cRGnlb_0
In the demo I teach it: Google Image search -> download a photo -> remove background in Pixelmator Pro -> export -> send via Telegram. Then I ask it to do the same for Elon Musk. The replay isn't a brittle macro: the published skill stores intent steps, route options, and GUI hints only as a fallback. In this example it can also prefer faster routes when they are available instead of repeating every GUI step.
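The route-preference idea can be sketched as follows (a hypothetical Python illustration of my reading of the description, not Understudy's code; the step shape and route names are assumptions): each published step stores an intent plus an ordered list of route options, and replay takes the first route that is currently available, with GUI hints last as the fallback.

```python
# Sketch of intent-step replay with fastest-available route selection.
def replay_step(step, available):
    """step: {'intent': str, 'routes': [str, ...]} ordered fastest-first;
    available: set of route names usable right now."""
    for route in step["routes"]:
        if route in available:
            return route
    raise RuntimeError(f"no route available for intent {step['intent']!r}")


skill = [
    {"intent": "download image of <subject>", "routes": ["api", "browser", "gui"]},
    {"intent": "remove background",           "routes": ["cli", "gui"]},
]
```

This is what distinguishes the replay from a brittle macro: the recording pins down intent, not pixel coordinates, so a faster route can be substituted whenever one exists.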
Current state: macOS only. Layers 1-2 are working today; Layers 3-4 are partial and still early.
npm install -g @understudy-ai/understudy
understudy wizard
GitHub: https://github.com/understudy-ai/understudy

Happy to answer questions about the architecture, teach-by-demonstration, or the limits of the current implementation.
Should hack-back be legal?
The article discusses the debate around whether 'hack back' should be made legal, examining the potential benefits and risks of allowing targeted cyber retaliation against attackers. It explores the complex legal and ethical considerations surrounding this controversial issue.
Show HN: Axe – A 12MB binary that replaces your AI framework
I built Axe because I got tired of every AI tool trying to be a chatbot.
Most frameworks want a long-lived session with a massive context window doing everything at once. That's expensive, slow, and fragile. Good software is small, focused, and composable... AI agents should be too.
Axe treats LLM agents like Unix programs. Each agent is a TOML config with a focused job: code reviewer, log analyzer, commit message writer. You run them from the CLI, pipe data in, get results out, chain them together with pipes, or trigger them from cron, git hooks, or CI.
What Axe is:
- 12MB binary, two dependencies. No framework, no Python, no Docker (unless you want it)
- Stdin piping: `git diff | axe run reviewer` just works
- Sub-agent delegation: agents can call other agents via tool use, depth-limited
- Persistent memory: if you want, agents can remember across runs without you managing state
- MCP support: Axe can connect any MCP server to your agents
- Built-in tools: web_search and url_fetch out of the box
- Multi-provider: bring what you love to use, whether Anthropic, OpenAI, Ollama, or anything in models.dev format
- Path-sandboxed file ops: keeps agents locked to a working directory
Written in Go. No daemon, no GUI.
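Depth-limited delegation can be sketched in a few lines (a Python toy of the concept; Axe itself is a Go binary configured via TOML, and the agent/task names here are made up): each delegation hop increments a depth counter, and the run aborts once the limit is hit, so mutually-delegating agents can never recurse forever.

```python
# Toy sketch of depth-limited sub-agent delegation.
MAX_DEPTH = 3


def run_agent(name, task, agents, depth=0):
    """Run an agent, recursively running any agents it delegates to."""
    if depth >= MAX_DEPTH:
        raise RecursionError("delegation depth limit reached")
    agent = agents[name]
    out = f"{name}({task})"
    for sub, subtask in agent.get("delegate", []):
        out += " <- " + run_agent(sub, subtask, agents, depth + 1)
    return out
```

The depth limit is the whole safety story here: without it, a reviewer that delegates to a summarizer that delegates back would loop until the token budget burned out.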
What would you automate first?
US private credit defaults hit record 9.2% in 2025, Fitch says
See also: https://alternativecreditinvestor.com/2025/10/22/us-banks-ex...
The Road Not Taken: A World Where IPv4 Evolved
The article imagines an alternate history in which IPv4 evolved to meet modern demands instead of being replaced, exploring how address exhaustion and the protocol's other limitations might have been handled without the decades-long transition to IPv6.
Full Spectrum and Infrared Photography
The article explores full-spectrum and infrared photography, in which a modified camera sensor captures light beyond the visible range. It discusses the benefits of the technique, such as revealing details invisible to the naked eye and capturing a more complete representation of a scene.
Are LLM merge rates not getting better?
Related: Many SWE-bench-Passing PRs would not be merged - https://news.ycombinator.com/item?id=47341645 - March 2026 (149 comments)
DDR4 Sdram – Initialization, Training and Calibration
The article provides a detailed overview of the DDR4 memory initialization and calibration process, explaining the key steps involved, including power-up sequence, memory initialization, and calibration procedures to ensure optimal memory performance and reliability.
NASA's DART spacecraft changed an asteroid's orbit around the sun
NASA's DART spacecraft successfully impacted the asteroid Dimorphos, demonstrating the ability to alter the orbit of a celestial body, a key step in developing planetary defense capabilities against potentially hazardous asteroids.
Show HN: Rudel – Claude Code Session Analytics
We built rudel.ai after realizing we had no visibility into our own Claude Code sessions. We were using it daily but had no idea which sessions were efficient, why some got abandoned, or whether we were actually improving over time.
So we built an analytics layer for it. After connecting our own sessions, we ended up with a dataset of 1,573 real Claude Code sessions, 15M+ tokens, 270K+ interactions.
Some things we found that surprised us:
- Skills were only being used in 4% of our sessions
- 26% of sessions are abandoned, most within the first 60 seconds
- Session success rate varies significantly by task type (documentation scores highest, refactoring lowest)
- Error cascade patterns appear in the first 2 minutes and predict abandonment with reasonable accuracy
- There is no meaningful benchmark for 'good' agentic session performance, so we are building one
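The abandonment metric implied above can be sketched as a simple classifier over session event logs (my own toy, not rudel.ai's pipeline; the event names and tuple shape are assumptions): a session with no terminal success event counts as abandoned, and we also note whether it died inside the first 60 seconds.

```python
# Toy sketch of an abandonment heuristic over session event logs.
def classify(session):
    """session: list of (timestamp_seconds, event) tuples, sorted by time."""
    succeeded = any(ev == "success" for _, ev in session)
    duration = session[-1][0] - session[0][0] if session else 0
    return {
        "abandoned": not succeeded,
        "early": (not succeeded) and duration < 60,
    }
```

Splitting "abandoned" from "abandoned early" matters because the two suggest different fixes: early deaths point at setup or prompt friction, late ones at error cascades mid-task.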
The tool is free to use and fully open source, happy to answer questions about the data or how we built it.
The Cost of Indirection in Rust
The article discusses the cost of indirection in Rust, exploring how different programming patterns can impact performance. It examines the trade-offs between direct and indirect access, highlighting the importance of understanding the overhead associated with indirection in Rust.