Anthropic Drops Flagship Safety Pledge
cwwc about 11 hours ago

Anthropic, a leading AI company, has dropped its flagship safety pledge, raising concerns about its commitment to responsible AI development. The article examines what the decision means for the company and for the broader debate over AI ethics.

time.com
359 164
Show HN: Context Mode – 315 KB of MCP output becomes 5.4 KB in Claude Code
mksglu about 6 hours ago

Every MCP tool call dumps raw data into Claude Code's 200K context window. A Playwright snapshot costs 56 KB, 20 GitHub issues cost 59 KB. After 30 minutes, 40% of your context is gone.

I built an MCP server that sits between Claude Code and these outputs. It processes them in sandboxes and only returns summaries. 315 KB becomes 5.4 KB.

It supports 10 language runtimes, SQLite FTS5 with BM25 ranking for search, and batch execution. Session time before slowdown goes from ~30 min to ~3 hours.
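To make the search piece concrete, here is a minimal sketch of SQLite FTS5 with BM25 ranking using Python's standard `sqlite3` module. It is not taken from the context-mode source: the table name `chunks`, the helper names, and the sample rows are assumptions for illustration, and it needs an SQLite build with FTS5 enabled.

```python
import sqlite3


def build_index(chunks):
    """Index (source, body) pairs in an in-memory FTS5 table.

    The table and column names are hypothetical, chosen for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(source, body)")
    conn.executemany("INSERT INTO chunks (source, body) VALUES (?, ?)", chunks)
    return conn


def search(conn, query, limit=3):
    """Return (source, snippet) rows, best match first.

    bm25() returns lower values for better matches, so ascending order
    puts the most relevant row at the top.
    """
    rows = conn.execute(
        "SELECT source, snippet(chunks, 1, '[', ']', '...', 8) "
        "FROM chunks WHERE chunks MATCH ? "
        "ORDER BY bm25(chunks) LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


if __name__ == "__main__":
    # Made-up sample data standing in for large tool outputs.
    index = build_index([
        ("issue-1", "Login button throws a timeout error on Safari"),
        ("issue-2", "Docs typo in the README"),
        ("snapshot", "Page snapshot with navigation header and footer links"),
    ])
    for source, excerpt in search(index, "timeout"):
        print(source, excerpt)
```

The idea is that only the short ranked snippets would go back into the context window, while the full output stays in the index.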

MIT licensed, single command install:

/plugin marketplace add mksglu/claude-context-mode

/plugin install context-mode@claude-context-mode

Benchmarks and source: https://github.com/mksglu/claude-context-mode

Would love feedback from anyone hitting context limits in Claude Code.

github.com
60 18
Danish government agency to ditch Microsoft software (2025)
robtherobber about 2 hours ago

The article reports on plans in Denmark to reduce reliance on Microsoft products in pursuit of digital independence. It describes the government's efforts to adopt alternative software and avoid dependence on a single technology provider.

therecord.media
59 13
Show HN: A real-time strategy game that AI agents can play
__cayenne__ about 2 hours ago

I've liked all the projects that put LLMs into game environments. It's been a weird juxtaposition, though: frontier LLMs can one-shot full coding projects, and those same models struggle to get out of Pokémon Red's Mt. Moon.

Because of this, I wanted to create a game environment that put this generation of frontier LLMs' top skill, coding, on full display.

Ten years ago, a team released a game called Screeps. It was described as an "MMO RTS sandbox for programmers." The Screeps paradigm of writing code and having it executed in a real-time game environment is well suited to LLMs. Drawing on a version of the Screeps open source API, LLM Skirmish pits LLMs head-to-head in a series of 1v1 real-time strategy games.

In my testing I found that Claude Opus 4.5 was the most dominant model, but it showed weakness in round 1 as it was overly focused on its in-game economy. Meanwhile, I probably spent a third of all code on sandbox hardening because GPT 5.2 kept trying to cheat by pre-reading its opponent's strategies.

If there's interest, I'm planning on doing a round of testing with the latest generation of LLMs (Claude 4.6 Opus, GPT 5.3 Codex, etc.).

You can run local matches via CLI. I'm running a hosted match runner with Google Cloud Run that uses isolated-vm. The match playback visualizer is statically served from Cloudflare.

I've created a community ladder that you can submit strategies to via CLI, no auth required. I've found that the CLI plus the skill.md that's available has been enough for AI agents to immediately get started.

Website: https://llmskirmish.com

API docs: https://llmskirmish.com/docs

GitHub: https://github.com/llmskirmish/skirmish

A video of a match: https://www.youtube.com/watch?v=lnBPaZ1qamM

llmskirmish.com
53 16
Ed Zitron loses his mind annotating an AI doomer macro memo
ossa-ma about 5 hours ago

The link is a Dropbox-hosted copy of an AI doomer macro memo with Ed Zitron's annotations.

dropbox.com
32 34
Claude Code Remote Control
empressplay about 5 hours ago

The documentation page describes Remote Control, a Claude Code feature for continuing a session that runs on your own machine from another device, such as a phone or a browser.

code.claude.com
25 8
What Happened to Fry's Electronics
jnord about 8 hours ago

The article discusses the decline and eventual closure of the Fry's Electronics retail chain, once a major player in the electronics and computer hardware industry. It explores the factors that contributed to the company's downfall, including changing consumer trends, competition from online retailers, and management issues.

dfarq.homeip.net
11 2
Vinext – The Next.js API surface, reimplemented on Vite
billwashere about 2 hours ago

The repository introduces Vinext, a reimplementation of the Next.js API surface on top of Vite.

github.com
10 2
Democracy in 2025: on rising authoritarianism in the United States
KnuthIsGod about 12 hours ago

The article discusses Harvard experts' views on the state of democracy and governance, including concerns about the rise of authoritarianism and the challenges of maintaining democratic institutions in the face of polarization and misinformation.

hks.harvard.edu
9 0
Show HN: Limits – Control layer for AI agents that take real actions
thesvp about 10 hours ago

Prompt instructions like 'never do X' don't hold up in production. LLMs ignore them when context gets long or users push hard.

Limits sits between your agent and the real world. Every action — database writes, API calls, refunds — gets intercepted and checked against your rules before it executes. Deterministically. No LLM involved in enforcement.

Three modes:

- Conditions: hard rules on structured data
- Guidance: validate LLM output before it reaches the user and give the agent a chance to reason and retry
- Guardrails: scan for PII, toxicity, prompt injection, etc.
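The interception step described above can be sketched in a few lines. This is an illustration of deterministic enforcement in general, not the Limits API: `Action`, `Rule`, `PolicyViolation`, and `guarded_execute` are hypothetical names, and the refund cap is a made-up rule.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Action:
    """A structured action an agent wants to take (hypothetical shape)."""
    kind: str
    params: dict


@dataclass(frozen=True)
class Rule:
    """A hard rule: a plain predicate over the action's parameters."""
    name: str
    applies_to: str
    allows: Callable[[dict], bool]


class PolicyViolation(Exception):
    """Raised when a rule rejects an action before it runs."""


def guarded_execute(action: Action, rules: list, executor: Callable[[Action], Any]) -> Any:
    """Check every applicable rule, then run the action only if all pass.

    No model is consulted here: the same action and rules always give
    the same decision.
    """
    for rule in rules:
        if rule.applies_to == action.kind and not rule.allows(action.params):
            raise PolicyViolation(f"{rule.name} blocked {action.kind}")
    return executor(action)


if __name__ == "__main__":
    # Made-up rule: refunds above 100 are never executed automatically.
    rules = [Rule("refund-cap", "refund", lambda p: p["amount"] <= 100)]
    issue_refund = lambda a: f"refunded {a.params['amount']}"

    print(guarded_execute(Action("refund", {"amount": 50}), rules, issue_refund))
    try:
        guarded_execute(Action("refund", {"amount": 500}), rules, issue_refund)
    except PolicyViolation as err:
        print(err)
```

Because the check sits in ordinary code between the agent and the executor, a long context or a persistent user cannot talk it out of a rule.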

One line to integrate: npm install @limits/js

Our website: https://limits.dev

Our docs: https://docs.limits.dev

We've processed 30,000+ policy checks across 16 teams. Would love feedback from anyone who's built something like this internally.

limits.dev
7 0
Last Year of Terraform
eandre about 1 hour ago

The article discusses the future of Terraform and the potential end of its dominance as the leading Infrastructure as Code (IaC) tool. It explores the emergence of new cloud-native technologies and the challenges Terraform may face in adapting to the changing landscape.

encore.dev
6 2
Show HN: StreamHouse – S3-native Kafka alternative written in Rust
gbram about 9 hours ago

Hey HN,

I built StreamHouse, an open-source streaming platform that replaces Kafka's broker-managed storage with direct S3 writes. The goal: same semantics, fraction of the cost.

How it works: Producers batch and compress records, a stateless server manages partition routing and metadata (SQLite for dev, PostgreSQL for prod), and segments land directly in S3. Consumers read from S3 with a local segment cache. No broker disks to manage, no replication factor to tune — S3 gives you 11 nines of durability out of the box.

What's there today:

- Producer API with batching, LZ4 compression, and offset tracking (62K records/sec)
- Consumer API with consumer groups, auto-commit, and multi-partition fanout (30K+ records/sec)
- Kafka-compatible protocol (works with existing Kafka clients)
- REST API, gRPC API, CLI, and a web UI
- Docker Compose setup for trying it locally in 5 minutes

The cost model is what motivated this. Kafka's storage costs scale with replication factor × retention × volume. With S3 at $0.023/GB/month, storing a TB of events costs ~$23/month instead of hundreds on broker EBS volumes.
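As a rough check on that arithmetic, here is a small sketch of the storage comparison. The S3 price is the one quoted above; the broker-side figures (a replication factor of 3 and an EBS price of $0.08/GB/month) are assumptions for illustration, not numbers from the post, and the function name is hypothetical. Retention is folded into the stored volume rather than modeled separately.

```python
def monthly_storage_cost(data_gb, price_per_gb_month, replication_factor=1):
    """Monthly cost of keeping data_gb of retained events.

    Every replica stores a full copy, so stored bytes scale with the
    replication factor.
    """
    return data_gb * replication_factor * price_per_gb_month


S3_PRICE = 0.023   # $/GB/month, as quoted in the post
EBS_PRICE = 0.08   # $/GB/month, ASSUMED block-storage price for this sketch
REPLICATION = 3    # ASSUMED Kafka replication factor

if __name__ == "__main__":
    retained_gb = 1000  # one (decimal) terabyte of retained events
    s3_cost = monthly_storage_cost(retained_gb, S3_PRICE)
    broker_cost = monthly_storage_cost(retained_gb, EBS_PRICE, REPLICATION)
    print(f"S3:     ${s3_cost:,.2f}/month")
    print(f"Broker: ${broker_cost:,.2f}/month")
```

Under these assumptions the S3 side comes to about $23/month and the broker side to about $240/month, before request, transfer, and compute costs, which this sketch ignores.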

Written in Rust, ~50K lines across 15 crates. Apache 2.0 licensed.

GitHub: https://github.com/gbram1/streamhouse

Happy to answer questions about the architecture, tradeoffs, or what I learned building this.

github.com
5 2
Zohran Mamdani Wants to Reclaim Efficiency from the Right
rbanffy about 3 hours ago

The article discusses New York City Mayor Zohran Mamdani's push to make city government work more efficiently, framing efficiency as a goal the left should reclaim from the right.

jacobin.com
5 1
Michael Pollan punctures the AI bubble
FinnLobsien about 3 hours ago

The article discusses Michael Pollan's new book and what it says about the potential and limitations of artificial intelligence. It examines the hype surrounding AI and urges readers to approach the technology critically.

theatlantic.com
5 1
Capybara: A Unified Visual Creation Model
modinfo about 10 hours ago

The repository presents Capybara, a unified model for visual creation tasks.

github.com
5 1
Data center developers asked Trump for an exemption from pollution rules
billybuckwheat about 7 hours ago

The article discusses data center developers' request for an exemption from pollution rules under the Trump administration. It examines the potential environmental impacts and the industry's efforts to influence regulations.

grist.org
5 0
Hegseth demands Anthropic to allow unrestricted military use of Claude
anxoo about 9 hours ago

The article reports that Pete Hegseth, the U.S. defense secretary and a former Fox News host, warned the AI company Anthropic to allow the U.S. military to use its technology as it sees fit, citing national security concerns and the military's need for access to advanced AI capabilities.

pbs.org
5 1
Discord Delays Global Age Assurance
jameslars about 9 hours ago

Discord's post announces a delay to its global age assurance plans. It acknowledges past missteps and outlines changes intended to improve user safety, including expanded parental controls, new age-verification methods, and protections for minors that respect user privacy.

discord.com
4 3
Destroy My Startup
alexlock about 6 hours ago

The article satirizes the culture of startup companies, highlighting their tendency to prioritize growth and funding over genuine innovation or worker wellbeing. It pokes fun at common startup tropes, such as overuse of buzzwords, excessive perks, and unrealistic expectations.

shipordie.club
4 1
Software engineers could go extinct this year, says Claude Code creator
bfmalky about 4 hours ago

The article discusses the potential impact of Claude Code on software engineering jobs. The tool's creator predicts that AI could disrupt coding work much as the printing press displaced scribes.

fortune.com
4 2