Ask stories

thesvp about 3 hours ago

Ask HN: How are you controlling AI agents that take real actions?

We're building AI agents that take real actions — refunds, database writes, API calls.

Prompt instructions like "never do X" don't hold up. LLMs ignore them when context is long or users push hard.

Curious how others are handling this:

- Hard-coded checks before every action?
- Some middleware layer?
- Just hoping for the best?

We built a control layer for this — different methods for structured data, unstructured outputs, and guardrails (https://limits.dev). Genuinely want to learn how others approach it.
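For the "hard-coded checks before every action" option, here is a minimal sketch of what that could look like for refunds. Everything in it is hypothetical (the `RefundRequest` type, the `MAX_REFUND_CENTS` limit, the injected `issue_refund` callable); it only illustrates the idea of putting the rule in code rather than in the prompt, and is not how any particular product does it.

```python
from dataclasses import dataclass

# Hypothetical limit for illustration; a real system would load this from config.
MAX_REFUND_CENTS = 5_000


@dataclass(frozen=True)
class RefundRequest:
    order_id: str
    amount_cents: int


class ActionBlocked(Exception):
    """Raised when a proposed action fails a hard-coded check."""


def check_refund(req: RefundRequest) -> None:
    # Plain code runs on every call, regardless of prompt or context length.
    if req.amount_cents <= 0:
        raise ActionBlocked("refund amount must be positive")
    if req.amount_cents > MAX_REFUND_CENTS:
        raise ActionBlocked("refund exceeds hard limit; needs human review")


def execute_refund(req: RefundRequest, issue_refund):
    """Validate first, then call the real side effect (injected as issue_refund)."""
    check_refund(req)
    return issue_refund(req)
```

The agent only ever gets `execute_refund` as a tool, so the model can propose any amount it likes but the side effect never runs unless the check passes.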

2 2
JB_5000 about 7 hours ago

Would you choose the Microsoft stack today if starting greenfield?

Serious question.

Outside government or heavily regulated enterprise, what is Microsoft’s core value prop in 2026?

It feels like a lot of adoption is inherited — contracts, compliance, enterprise trust, existing org gravity. Not necessarily technical preference.

If you were starting from scratch today with no legacy, no E5 contracts, no sunk cost — how many teams would actually choose the full MS stack over best-of-breed tools?

Curious what people here have actually chosen in greenfield builds.

9 7
dmpyatyi about 16 hours ago

Ask HN: How do you know if AI agents will choose your tool?

YC recently put out a video about the agent economy - the idea that agents are becoming autonomous economic actors, choosing tools and services without human input.

It got me thinking: how do you actually optimize for agent discovery? With humans you can do SEO, copywriting, word of mouth. But an agent just looks at available tools in context and picks one based on the description, schema, examples.

Has anyone experimented with this? Does better documentation measurably increase how often agents call your tool? Does the wording of your tool description matter across different models (ZLM vs Claude vs Gemini)?

24 17
codexon about 11 hours ago

ChatGPT finds an error in Terence Tao's math research

https://www.erdosproblems.com/forum/thread/783

> Ah, GPT is right, there is a fatal sign error in the way I tried to handle small primes. There were no obvious fixes, so I ended up going back to Hildebrand's paper to see how he handled small primes, and it turned out that he could do it using a neat inequality ρ(u1)ρ(u2)≥ρ(u1u2) for the Dickman function (a consequence of the log-concavity of this function). Using this, and implementing the previous simplifications, I now have a repaired argument. TerenceTao

38 5
techteach00 1 day ago

Ask HN: Chromebook leads for K-8 school in need?

Hi, I'm a K-8 technology teacher in NYC. My students are in desperate need of new hardware. The Chromebooks they use now are so slow that they make the children agitated when using them.

I'm aware of the various grant opportunities that exist; I just thought it was worth asking here for a potentially faster way to get them new hardware.

Thank you for listening.

44 43
helloplanets about 2 hours ago

Explanation of JEPA – Yann LeCun's proposed solution to self-supervised learning

2 1
Cyberis about 11 hours ago

Ask HN: What Linux Would Be a Good Transition from Windows 11

I have users who glaze over the minute I mention "notepad." I think they can barely use Windows. But our work requires a level of privacy (regulatory and otherwise) and Windows 11 is just one big data transmitter. I know this is flamebait, but I'd love suggestions for a Linux desktop that looks like Windows, is stable and easy to administer and harden, and works with Dell business grade laptops that we bought new in 2025.

8 13
rakan1 about 9 hours ago

Does anyone use CrewAI or LangChain anymore?

Curious.

7 2
a_protsyuk about 15 hours ago

Ask HN: Where do you save links, notes and random useful stuff?

I have 2,600+ notes in Apple Notes and can barely find anything.

My kid just dumps everything into Telegram saved messages. I'm running a small research project and am curious what systems people actually use (not aspire to use).

Do you have a setup that works or is everything scattered across 5 apps like mine?

10 25
parvardegr about 19 hours ago

Ask HN: Is it better to have no Agent.md than a bad one?

Please share your real-world experiences. What is a bad one and why?

5 6
marginalia_nu about 12 hours ago

Ask HN: What is up with all the glitchy and off-topic comments?

I've noticed a fairly sharp increase in junk comments lately. Often new accounts, making posts that are very low quality or sometimes completely incoherent.

I see glitch comments like this on a fairly regular basis:

> 13 60 well and t6ctctfuvuh7hguhuig8h88gd to f6gug7h8j8h6fzbuvubt GB I be cugttc fav uhz cb ibub8vgxgvzdrc to bubuvtxfh tf d xxx h z j gj uxomoxtububonjbk P.l.kvh cb hug tf 6 go k7gtcv8j9j7gimpiiuh7i 8ubg

https://news.ycombinator.com/item?id=47068948#47117224

or this:

> 1662476506

https://news.ycombinator.com/item?id=47121737

or this:

> Аё

https://news.ycombinator.com/item?id=47126475

Sometimes it's coherent but completely off topic, like this:

> when is fivetran coming?

https://news.ycombinator.com/item?id=47130567

Is clawd running amok, or is someone running botnet C&C via https://news.ycombinator.com/noobcomments or what gives?

7 1
dakiol 3 days ago

Ask HN: Programmable Watches with WiFi?

Hi. I'm looking for a programmable watch with wifi. Ideally I should be able to write custom programs/apps for the watch to display whatever I want to on them (e.g., make the watch make an https call to a server, receive json and render accordingly; allow the watch to receive "notifications" from the server)

Also, ideally, no requirement of a smartphone to send-receive data (it's ok to need a smartphone for the initial setup of the watch, though). I know about Pebble, but it doesn't have wifi. I know about some Garmins with wifi but for the kind of apps I want to write, the communication between the watch and the server has to be mediated by a phone. Also, correct me if I'm wrong, but I don't want to pay $100/year just to be able to use my custom app on Apple Watches. I usually don't trust Google either (e.g., they discontinue everything in the blink of an eye).

So, what are my options?

11 5
7777777phil about 18 hours ago

GLP-1 Second-Order Effects

The first-order effects of GLP-1 drugs are obvious: people lose weight, Novo Nordisk and Eli Lilly print money. But what happens when 10-15% of the adult population is on weight-loss medication within a decade? The downstream consequences are less discussed and almost certainly not priced into anything.

In 2018, United Airlines switched to lighter paper for its inflight magazine. One ounce per copy. Across 4,500 daily flights, that saved 170,000 gallons of fuel a year [1]. Airlines think about weight at this level of granularity because fuel is their single largest variable cost.

Average weight loss on semaglutide is around 35 pounds per person. If 12% of passengers on a typical 737 have been on the drug, that's roughly 750 fewer pounds per flight, the equivalent of shaving the weight off 12,000 magazines. United spent months optimizing paper stock to save $290,000 a year in fuel. GLP-1 adoption across the flying population could quietly save them an order of magnitude more, and ticket prices don't adjust down when passengers get lighter.
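A quick check of the arithmetic above, as a rough sketch. The 35 pounds, the 12% share, and the one ounce saved per magazine come from the post; the 178-passenger load is an assumption for a typical 737, not a figure from the source.

```python
# Figures taken from the post
AVG_LOSS_LB = 35          # average weight loss per person on semaglutide
SHARE_ON_DRUG = 0.12      # share of passengers who have been on the drug
MAGAZINE_SAVING_OZ = 1    # weight saved per magazine copy

# Assumption, not from the post: passengers on a typical 737 flight
PASSENGERS = 178

OZ_PER_LB = 16


def weight_saved_lb(passengers: int, share: float, avg_loss_lb: float) -> float:
    """Total passenger weight reduction per flight, in pounds."""
    return passengers * share * avg_loss_lb


saved_lb = weight_saved_lb(PASSENGERS, SHARE_ON_DRUG, AVG_LOSS_LB)

# How many one-ounce magazine savings add up to the same weight
magazine_equivalents = saved_lb * OZ_PER_LB / MAGAZINE_SAVING_OZ

print(f"{saved_lb:.0f} lb per flight, about {magazine_equivalents:.0f} magazine savings")
```

With those inputs the result lands close to the post's "roughly 750 pounds" and "12,000 magazines"; a different seat count shifts both numbers proportionally.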

The food supply chain is more obvious but larger in scale. If a big share of the population eats 20-30% less, demand for calories drops. Not a shift in preferences toward salads. A pharmacological reduction in how much people eat, period. The food industry has dealt with changing tastes before. It has never faced a demand shock from the medical system.

Health insurance has a subtler problem. The pitch for GLP-1 coverage is that the drugs prevent expensive conditions downstream: diabetes, heart disease, joint replacements. Probably true. But in America's fragmented insurance market, the company paying for the drug today probably isn't the one insuring that patient in five or ten years. The savings land on someone else's balance sheet. That mismatch could slow adoption by years on its own.

Obesity correlates with lower workforce participation and higher absenteeism. If GLP-1s meaningfully reduce obesity rates, aggregate labor supply goes up. More people working, fewer health-related absences. That's a macroeconomic stimulus, except nobody frames it that way because it comes from a pharmaceutical company rather than from Congress.

Early data suggests GLP-1s reduce cravings for alcohol, nicotine, and gambling too. Phase 2 trials for opioid use disorder are underway. A weight-loss drug that accidentally dents Diageo's revenue and casino foot traffic was not in anybody's original investment thesis for Ozempic.

The effect I find hardest to think about is the psychological one. Weight has been tangled up with shame, identity, and social hierarchy for centuries. What happens to body positivity, the social dynamics of attractiveness, the entire cultural machinery around diet and discipline when weight becomes something you manage with a prescription? I don't have a good framework for it. Nothing comparable has happened before.

The market is treating this as a pharma story. The drug companies will capture a fraction of the total value created and destroyed. The rest redistributes across food, airlines, insurance, labor markets, and social behavior. Probably nobody's model covers all of that at once.

[1] https://www.cbsnews.com/news/united-hemispheres-magazine-print-edition/

EDIT: Formatting

20 9
sujayk_33 2 days ago

Ask HN: Why doesn't HN have a rec algorithm?

I was just wondering about why there's a constant timeline and no recommendation.

9 20
marvin_nora 2 days ago

Ask HN: What breaks when you run AI agents unsupervised?

I spent two weeks running AI agents autonomously (trading, writing, managing projects) and documented the 5 failure modes that actually bit me:

1. Auto-rotation: Unsupervised cron job destroyed $24.88 in 2 days. No P&L guards, no human review.

2. Documentation trap: Agent produced 500KB of docs instead of executing. Writing about doing > doing.

3. Market efficiency: Scanned 1,000 markets looking for edge. Found zero. The market already knew everything I knew.

4. Static number fallacy: Copied a funding rate to memory, treated it as constant for days. Reality moved; my number didn't.

5. Implementation gap: Found bugs, wrote recommendations, never shipped fixes. Each session re-discovered the same bugs.
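Failure modes 1 and 4 above both come down to a missing guard, which can be sketched in a few lines. This is only an illustration under assumptions: the class names, the loss limit, and the maximum age are all hypothetical and are not taken from the linked scanner or writeup.

```python
import time


class LossLimitExceeded(Exception):
    """Raised when cumulative losses reach the configured limit."""


class PnLGuard:
    """Stops further trading once cumulative realized loss hits a fixed limit."""

    def __init__(self, max_loss: float):
        self.max_loss = max_loss
        self.realized_pnl = 0.0

    def record(self, pnl: float) -> None:
        # Call after every closed trade; raising here should halt the cron job.
        self.realized_pnl += pnl
        if self.realized_pnl <= -self.max_loss:
            raise LossLimitExceeded(
                f"loss {-self.realized_pnl:.2f} reached limit {self.max_loss:.2f}"
            )


class TimestampedValue:
    """A remembered number that refuses to be read once it is too old."""

    def __init__(self, value: float, max_age_s: float, now=time.monotonic):
        self._value = value
        self._max_age_s = max_age_s
        self._now = now            # injectable clock, handy for testing
        self._stored_at = now()

    def get(self) -> float:
        if self._now() - self._stored_at > self._max_age_s:
            raise ValueError("stale value; re-fetch before using")
        return self._value
```

The point of raising instead of logging is that an unsupervised job stops on its own; a warning in a log nobody reads would not have saved the $24.88.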

Built an open-source funding rate scanner as fallout: https://github.com/marvin-playground/hl-funding-scanner

Full writeup: https://nora.institute/blog/ai-agents-unsupervised-failures.html

Curious what failure modes others have hit running agents without supervision.

11 7
daringrain32781 1 day ago

Ask HN: Cognitive Offloading to AI

I ask coworkers questions about a system, why they do something, or their opinion. Some of them return a very clearly AI-generated response, sometimes completely missing the point. What’s the point? If I wanted an AI response I’d have asked it myself.

This bothers me a bit because if I can expect this kind of response, what does that say about the thought they put into their work, even if they’re using AI for everything coding related?

12 6
shaheeniquebal 1 day ago

Ask HN: How are early-stage AI startups thinking about IP protection?

Hi HN,

I’m researching how early-stage AI and health-tech startups think about protecting their innovations.

Traditional patents are expensive, slow and often misaligned with how fast AI products evolve. I’m curious:

Are founders filing patents early? Are you relying on trade secrets? Publishing defensively? Not worrying about IP at all? Waiting until revenue?

We’re collecting responses through a short 60-second survey to better understand real-world behavior:

https://forms.gle/8UAytkGNfge4GKrH8

If you’d rather just comment here, that’s equally helpful.

I’m happy to share aggregated insights back with the community.

Thanks, Shaheen

4 3
YuukiJyoudai 1 day ago

Ask HN: What Comes After Markdown?

Markdown started as a shorthand for HTML. Now it's the default format for documentation, note-taking, knowledge bases, and AI context.

What's interesting is how it keeps absorbing new capabilities without changing the format itself:

- Mermaid: diagrams from fenced code blocks
- KaTeX/MathJax: math rendering from `$...$` syntax
- Frontmatter: structured metadata via YAML blocks
- MDX: React components embedded in markdown
- Obsidian/Logseq: backlinks, canvas views, graph visualization — all from plain .md files

The pattern seems to be: the .md file stays human-readable plain text, but renderers get increasingly powerful. Same file, richer output.

This makes me wonder where this goes:

1. Does markdown keep evolving through renderer conventions until it becomes a de facto interactive document format? (The "HTML path" — HTML barely changed, but CSS/JS/browsers made it capable of anything.)

2. Does a new format emerge that can natively express interactivity, collapsible sections, embedded computations? Something between markdown and Jupyter notebooks?

3. Or does the answer involve a protocol/middleware layer — where .md files are the source, but some intermediate system (like a language server for documents) adds structure, validation, and interactivity on top?

I'm especially curious because of the AI angle. Plain .md files are the most AI-friendly knowledge format — any LLM can read, write, and search them with zero setup. A more complex format might gain expressiveness but lose this property.

What's your take? Is .md "good enough forever" with better renderers, or are we heading toward something new?

7 13
danver0 about 14 hours ago

Ask HN: Are developers who build libs and dev tools safer from AI replacement?

I’ve been thinking about AI and developer jobs.

It feels like developers who build libraries, frameworks, compilers, and dev tools might be safer from AI replacement compared to people building typical CRUD apps.

My intuition is that tooling work requires deeper systems knowledge and taste, while a lot of app-level code is becoming easier for AI to generate.

Am I wrong? Curious what others here think

2 3
arm32 2 days ago

So Claude's stealing our business secrets, right?

Seems like everybody is just carelessly saying whatever to Claude. Client lists, trade secrets. We all know that our agents haven’t signed NDAs, right? Right?

25 17
amin2011 3 days ago

I'm 15 and built a platform for developers to showcase WIP projects

Hi HN,

I'm a 15-year-old full-stack developer, and I recently built Codeown (https://codeown.space).

The problem I wanted to solve: GitHub is great for code, but not for showing the "journey" or the UI. LinkedIn is too corporate and noisy for raw, work-in-progress (WIP) dev projects. I wanted a dedicated, clean space where developers can just share what they are building, get feedback, and log their progress.

Tech Stack: I built the frontend with React and handle auth via Clerk. I recently had to migrate my backend/DB off Railway's free tier (classic indie hacker struggle!), but it taught me a lot about deployment and optimization.

We just hit our first 5 real users today, and the community is slowly starting to form.

I’m still learning, and I know the performance and UI can be improved. I would absolutely love your brutal, honest feedback on:

The perceived performance (currently working on optimizing the React re-renders).

The core idea – is this something you would use to track your side projects?

Thanks for taking a look! Happy to answer any technical questions.

12 6
moomoo11 2 days ago

Ask HN: If the "AI bubble" pops, will it really be that dramatic?

I'm building software for a sector that is massive, but one where you don't really need AI. At least, not AI == LLM.

And before I go further, let me state up front that I do like AI coding agents. They are great as assistive tools.

People say that if the AI bubble pops, the economy tumbles. And okay, I mean the M7 will certainly get rekt but everyone else? Things will recover within a few years. We didn't make it to 2026 AD taking the easy road.

You still need to visit the doctor. Goods still need to be delivered. Homes need to be built. We need to drill for oil. People still need to eat. And yes, unfortunately or not, we still need millions of administrators because humans are not 0/1 systems.

Am I crazy to think that maybe it won't be that bad? There is still an infinite number of things to do, and maybe (call me stupid, whatever) it would be a good turning point for our species if we realize that speculative bubbles are absolutely destructive and not worth it.

I don't need a personal assistant to make calls for me to get a restaurant reservation, and I certainly don't care for AI slop videos. I would much rather we have better products and services that actually work, and even if they have rough edges I would prefer people are employed and busy doing something with their lives.

Maybe a world where we don't chase endless growth (to escape inflation, pay off debts, whatever the case) would be good. And also we put nerds (not people like us, the engineers, I mean the evil dorks who cosplay as movie super villains) in the toy box again and pick up different toys this time.

14 12
emilss 1 day ago

Back end where you just define schema, access policy, and functions

Would you use a backend where you just define schema, access policy, and functions?

Basically something like making smart contracts on EVM, but instead they run on a hyperscaler, and have regular backend fundamentals.

Here's a mock frenchie made me, was thinking something like this:

  schema User {
    email: string @private(owner)
    name: string @public
    balance: number @private(owner, admin)
  }

  policy {
    User.read: owner OR role("admin")
    User.update.balance: role("admin")
  }

  function transfer(from: User, to: User, amount: number) {
    assert(caller == from.owner OR caller.role == "admin")
    assert(from.balance >= amount)
    from.balance -= amount
    to.balance += amount
  }

Was playing with OpenFGA, and AWS Lambda stuff, and got me thinking about this.

So you would "deploy" this contract on a hyperscaler, which then lets users access it from your lean js front-end, via something like this:

  const res = await fetch("https://api.hyperscaler-example.com/c/your-contract-id/transfer", {
    method: "POST",
    headers: {
      "Authorization": "Bearer <user-jwt>",
      "Content-Type": "application/json"
    },
    body: JSON.stringify({ from: "user_abc", to: "user_xyz", amount: 50 })
  });

The runtime resolves the caller identity from the JWT, checks the policy rules, runs the function, and handles the encryption/decryption of fields, so your frontend never touches any of that.
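To make the runtime side concrete, here is a rough Python sketch of what executing the mock `transfer` function might look like once the caller has already been resolved from the JWT. It is only an illustration under assumptions: the `Caller`, `User`, and `PolicyError` names are hypothetical, JWT verification and field encryption are left out, and the positive-amount check is an addition that the mock does not have.

```python
from dataclasses import dataclass


@dataclass
class Caller:
    user_id: str          # assumed to be already extracted from a verified JWT
    role: str = "user"


@dataclass
class User:
    owner: str            # id of the caller who owns this record
    balance: int


class PolicyError(Exception):
    """Raised when a call violates the access policy or a precondition."""


def transfer(caller: Caller, from_user: User, to_user: User, amount: int) -> None:
    # Mirrors the mock's first assert: owner of the source account, or an admin.
    if caller.user_id != from_user.owner and caller.role != "admin":
        raise PolicyError("caller may not move funds from this account")
    # Extra check, not in the mock: a negative amount would reverse the transfer.
    if amount <= 0:
        raise PolicyError("amount must be positive")
    # Mirrors the mock's second assert.
    if from_user.balance < amount:
        raise PolicyError("insufficient balance")
    from_user.balance -= amount
    to_user.balance += amount
```

A real runtime would also need the two balance updates to happen atomically, which this in-memory sketch does not address.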

That's it, would you use it? Is there something that does this exactly already? Feeling like building this.

3 5
leandrobon 1 day ago

Ask HN: Is there a reliable way to tell if an image is AI generated?

Is there any reliable way to determine whether an image is AI-generated (or AI-edited) versus a real photo that’s been compressed, resized, or edited? Detectors seem brittle and disagree. Is there anything that’s dependable enough to automate, or is the answer that you can’t tell from pixels alone?

8 9
exabrial 1 day ago

Tell HN: Claude mangles XML files with <name> as an XML Tag to <n>

Claude mangles files with <name> as an XML Tag to <n>

If you use Claude Desktop and have it try to edit an XML file containing a <name> tag, the filesystem connector will mangle that to <n> every time.

This is causing simple chat threads to extend much longer than needed and the tool simply isn't working correctly.

It's impossible to get actual support these days, other than report problems on HN. So here we are, in hopes you press that upvote button and maybe Boris might see this.

9 3
piratesAndSons 3 days ago

Ask HN: Why don't software developers make medical devices?

As software development becomes a commodity thanks to LLMs, I wonder why more software developers don't switch to building medical devices to make their careers more secure. Here's why I picked medical devices in particular.

1. Natural Moat

Since human body hardware is more or less immutable in its most essential parts, you don't have to worry about some LLM hype cycle replacing you. Once you build the product and clear FDA or local certifications, you're set. Unlike Uber destroying the taxi medallion business, healthcare is a beast — no tech startup dares to bypass all the regulations and gatekeeping.

2. Regulatory Moat

The medical devices I'm talking about require around $50K–$200K for FDA clearance — low enough that any small business can manage it, but high enough to discourage bottom-feeders and Chinese product dumpers. It also lets you avoid the big established healthcare corporations, because this market segment is too small for them to care about, yet large enough for you to pull in $10M–$15M a year in revenue.

Medical device manufacturing sidesteps the two fatal flaws of software development: the lack of a moat and static, almost never-changing hardware margins. LLM companies don't care about copyright, IP, or the health of the broader economy — but they can't go head-to-head with the healthcare industry, so you don't have to worry about them at all.

7 19
sebringj 1 day ago

I made my favorite AI tool

i do not submit things to hacker news unless it's related to my favorite tool ever, literally, that i happened to have made. i made this out of being super lazy and wanting my copilot (works in all ai editors) to run my UI while it's coding and validate it at the same time by using the apps. i don't know how to contain how good this is for me to use other than putting it here for people to look at. so using it with opus 4.5-4.6 it's extremely good, however using it with gpt-5.3 it's still good but you have to remind it to use the "autonomo help" when it forgets how to use it correctly sometimes.

anyways, please check it out if you are curious and want very fast efficient UI driven (multi app/web/desktop at the same time, agnostic) validation while you vibe. I just keep using it everyday but still waiting for something to just make this obsolete.

web page:

https://sebringj.github.io/autonomo/

github:

https://github.com/sebringj/autonomo

4 4
aehsan4004 2 days ago

Should I add this acknowledgement/shoutout by xAI/Grok to my resume?

I spotted a usability gap on X (formerly Twitter)—no way to categorize bookmarks by topic.

Suggested it publicly, and months later, they rolled it out with a shoutout from Grok.

Resume impact? Worth adding under 'Product Contributions' (e.g., 'Suggested bookmark categorization feature, adopted by X')? Overkill, useless, or a solid signal for PM/UX opportunities?

2 8
sdgnbs 2 days ago

Open-Source Bionic Reading Chrome Extension (MIT)

A free Bionic Reading extension that helps with ADHD and reading speed. It processes the text entirely locally.

License: MIT

Chrome Web Store: https://chromewebstore.google.com/detail/cllpokdpfkelkceomncfgebkegnjepdc?utm_source=item-share-cb

Source Code: https://github.com/the0cp/citius-vide

2 1
yc_surajkr 2 days ago

Orvia – Spin up a real-time room, share files, leave – everything disappears

I built Orvia — a real-time, temporary collaboration room for instant conversations and fast media sharing.

~200 users have tried it so far. The main feedback wasn’t about missing features, but UX:

UI felt too “hacker tool”

Empty rooms felt awkward

Too many visible actions

So I redesigned it to feel calmer and frictionless.

The idea is simple: Create a room → Share the link → Talk & share files → Leave → Room disappears.

No accounts. No setup. No stored history.

It’s built for quick, private, zero-overhead collaboration — not persistent communities.

Would really appreciate honest feedback on the UX, the real-time experience, or any missing features.

url - https://orvia.live

2 2