Ask stories

xinbenlv about 8 hours ago

Ask HN: Best practice securing secrets on local machines working with agents?

When building with autonomous / semi-autonomous agents, they often need broad local access: env vars, files, CLIs, browsers, API keys, etc. This makes the usual assumption — “the local machine is safe and untampered” — feel shaky.

We already use password managers, OAuth, scoped keys, and sandboxing, but agents introduce new risks: prompt injection, tool misuse, unexpected action chains, and secrets leaking via logs or model context. Giving agents enough permission to be useful seems at odds with least-privilege.
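One pattern that narrows the blast radius, sketched purely as an assumption (nothing here is a real product): a local broker that hands the agent short-lived, scoped tokens and only swaps in the raw key at call time:

```python
# Hypothetical sketch: the agent never sees RAW_KEYS; it gets a scoped,
# short-lived token, and the broker exchanges it for the real key per call.
import secrets
import time

RAW_KEYS = {"github": "ghp_real_key_here"}  # placeholder value, broker-side only

class SecretsBroker:
    def __init__(self):
        self._grants = {}  # token -> (service, scope, expiry)

    def grant(self, service, scope, ttl=300):
        """Issue a token limited to one service, one scope, and a TTL."""
        token = secrets.token_urlsafe(16)
        self._grants[token] = (service, scope, time.time() + ttl)
        return token  # this is all the agent ever holds

    def resolve(self, token, service, scope):
        """Exchange a token for the real key, enforcing the grant."""
        grant = self._grants.get(token)
        if grant is None:
            raise PermissionError("unknown token")
        svc, sc, exp = grant
        if svc != service or sc != scope or time.time() > exp:
            raise PermissionError("token not valid for this call")
        return RAW_KEYS[svc]

broker = SecretsBroker()
token = broker.grant("github", "repo:read", ttl=60)
```

A leaked token then exposes one scope for a minute, not the whole keychain.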

I haven’t seen much discussion on this. How are people thinking about secret management and trust boundaries on dev machines in the agent era? What patterns actually work in practice?

4 0
rajkumar14 about 2 hours ago

Ask HN: Modern test automation software (Python/Go/TS)?

Hi HN,

I’m looking for recommendations for modern test automation software/frameworks that work well with Python/Go/TS. I wasn’t able to find any from my search and I don’t want to spin up my own test automation infrastructure.

My use case is hardware + firmware testing in a lab environment, where I want to avoid being forced into a specific vendor’s hardware ecosystem.

What I’m looking for:

- Python/Go/TS compatibility (SDK, API, or first-class support)

- Ability to see and query historical test runs (dashboards / trend views of logs and metrics)

- Ability to define custom test sequences/workflows with the ability to run steps concurrently (not just a flat list of tests). Examples: conditional steps, retries, setup/teardown phases, multi-device orchestration (PSUs, DMMs, DAQs, and DUTs)

- Hardware-agnostic / no vendor lock-in: I should be able to swap instruments/devices without rewriting everything or being tied to a proprietary vendor (looking at you NI)

- Ideally: also a Slack integration for initiating runs & notifications of run completion

Questions:

1. Is spinning up my own software architecture the only option? And if you’ve built something like this, what stack worked best (e.g., Robot Framework, pytest + plugins, custom orchestrator, Airflow/Prefect/Temporal, etc.)?

2. Are there purpose-built platforms you’d actually recommend that don’t vendor lock?

3. What do you use for run history + reporting?

4. Any “gotchas” with reliability, scaling to many devices, or maintaining driver layers?

I’m happy to assemble the sequence logic myself if needed, but I’d love to avoid reinventing orchestration and run history/reporting from scratch.
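For context, the sequencing core I keep rebuilding by hand looks roughly like this (all step names and instrument calls below are made up):

```python
# Sketch of a tiny step runner: named steps, per-step retries, and a phase
# where steps run concurrently (e.g., polling PSU and DMM at the same time).
from concurrent.futures import ThreadPoolExecutor

def run_step(name, fn, retries=0):
    """Run one step, retrying up to `retries` times on exception."""
    last = None
    for _ in range(retries + 1):
        try:
            return {"step": name, "ok": True, "result": fn()}
        except Exception as exc:
            last = exc
    return {"step": name, "ok": False, "error": str(last)}

def run_concurrent(steps):
    """steps: list of (name, callable, retries); runs them in parallel."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(run_step, n, f, r) for n, f, r in steps]
        return [f.result() for f in futures]

results = run_concurrent([
    ("psu_on",   lambda: "24.0V", 0),  # hypothetical instrument calls
    ("dmm_read", lambda: 3.3,     2),
])
```

Add setup/teardown phases, run history, and dashboards on top and you've reinvented half a platform, which is exactly what I'd like to avoid.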

5 2
sendes about 5 hours ago

Ask HN: What is your opinion on non-mainstream mobile OS options (e.g. /e/OS)?

Pretty much the title. Context: I have to change my phone, and I thought it was also an opportunity to think about what I would value in a phone OS (privacy and control) vs what I would need for convenience (app availability, seamless connectivity, etc.).

I am gathering opinions, and where else to ask this but here?

What is your experience/thinking on mobile OS options? Would you recommend any brands that use non-mainstream OS versions (eg Fairphone)?

5 2
BlackPearl02 about 11 hours ago

Ask HN: How do you verify cron jobs did what they were supposed to?

I've been running into this issue where my cron jobs "succeed" but don't actually do their job correctly.

For example:

Backup cron runs, exit code 0, but creates empty files

Data sync completes successfully but only processes a fraction of records

Report generator finishes but outputs incomplete data

The logs say everything's fine, but the results are wrong. Actually, the errors are probably in the logs somewhere, but who checks logs proactively? I'm not going through log files every day to see if something silently failed.

I've tried:

Adding validation in scripts - works, but you still need to check the logs

Webhook alerts - but you have to write connectors for every script

Error monitoring tools - but they only catch exceptions, not wrong results

I ended up building a simple monitoring tool that watches job results instead of just execution - you send it the actual results (file size, count, etc.) and it alerts if something's off. No need to dig through logs.
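Concretely, the core of that result check is nothing more than this (the path and threshold are placeholders):

```python
# Verify the artifact, not the exit code: a backup that "succeeded" but
# produced a missing or tiny file should still raise an alert.
import os

def check_backup(path, min_bytes=1_000_000):
    """Return a problem description, or None if the backup looks sane."""
    if not os.path.exists(path):
        return f"{path} missing"
    size = os.path.getsize(path)
    if size < min_bytes:
        return f"{path} suspiciously small ({size} bytes)"
    return None

problem = check_backup("/backups/db-latest.dump")  # hypothetical path
if problem:
    print(f"ALERT: {problem}")  # swap in your webhook/email/pager here
```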

But I'm curious: how do you all handle this? Are you actually checking logs regularly, or do you have something that proactively alerts you when results don't match expectations?

4 1
aureliusm about 9 hours ago

Ask HN: Industrial smart glasses with online / offline capabilities?

I am working for a company that is providing technical support to farmers in rural areas. We have local teams for support, but they often have to collaborate with remote teams/specialists. The idea is to have the field technician equipped with smart glasses so that the remote specialist can work with something more than just static images and verbal descriptions. Does anyone have positive (or negative) experience with industrial-grade solutions in difficult environments (think humidity, dust, temperature)? Ideally the solution should be durable, easy to operate, have good picture and sound quality, and be well supported by the vendor. As farms are often outside 5G network reach, offline capabilities for just recording are also valuable.

3 0
geooff_ about 5 hours ago

Ask HN: Anyone doing production image editing with image models? How?

Hey HN — I’m building an app where users upload “real life” clothing photos (ex. a wrinkly shirt folded on the floor). The goal is to transform that single photo into a clean, ecommerce-style image of the garment.

One key UX requirement: the output needs to be a PNG with transparency (alpha) so we can consistently crop/composite the garment into an on-rails UI (cards, outfit layouts, etc.). Think “subject cutout that drops cleanly into templates.”

My current pipeline looks like:

1. User-uploaded photo (messy background, weird angles)
2. User upload is matched to a “query” image (style target) + prompt
3. Generation/edit step (currently using Nano Banana)
4. Background removal model to get transparency and save as RGBA PNG

This works, but it feels hacky + occasionally introduces edge artifacts. Also, the generation model sometimes invents shadows/background cues that confuse the background removal step. It feels like the two steps are fighting one another.
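For what it's worth, the halo problem falls straight out of the source-over compositing math: any shadow or background color the generator leaves in the RGB of semi-transparent edge pixels gets blended into the template. A toy illustration (channel values in 0..1):

```python
def over(src_rgb, alpha, dst_rgb):
    """Source-over with straight alpha: out = src*a + dst*(1-a)."""
    return tuple(round(s * alpha + d * (1 - alpha), 3)
                 for s, d in zip(src_rgb, dst_rgb))

# A 40%-alpha edge pixel still carrying an invented gray shadow (0.5, 0.5, 0.5)
# darkens a white card background into a visible halo:
halo = over((0.5, 0.5, 0.5), 0.4, (1.0, 1.0, 1.0))  # -> (0.8, 0.8, 0.8)
```

Which is why cleaning up edge RGB (color decontamination/matting) matters as much as the alpha mask itself.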

I’m trying to understand what “good” looks like in production for this kind of workflow:

Are people still doing gen/edit → separate background removal as the standard?

Are any of you using alpha-native generation (RGBA outputs) in production? If so, what’s the stack/workflow?

If you’ve done “messy UGC photo → catalog asset” specifically: what broke most often and what fixed it?

I’m not looking for vendor pitches—mostly practical patterns people are using (open source workflows, model classes, ComfyUI/SD pipelines, API-based stacks, etc.). Happy to share more details if helpful.

3 0
baalimago about 16 hours ago

Ask HN: Is there any good open source model with reliable agentic capabilities?

I don't want to send my data to third-party vendors all the time. But from my experience, LLMs need to be quite beefy in order to understand tool-calling, especially at longer contexts (200k+).

Before I dive headlong into investigating this and spend money on a project doomed to fail, does anyone have experience with a local model that can handle this sort of workload? I intend to run it on a decent gaming CPU with 64-128GB of RAM.

3 0
akhil08agrawal about 12 hours ago

Tell HN: Drowning in information but still missing everything

I need to vent. I'm a Senior PM and genuinely don't know how anyone stays on top of everything anymore. My morning routine has become a 2-hour anxiety spiral:

- Check Slack for overnight fires
- Skim 200+ unread messages
- Open Twitter to see what's happening in my space
- Check if competitors launched anything
- Glance at 3 newsletters I subscribed to and never actually read
- Scroll LinkedIn because apparently that's where industry news lives now
- Check Product Hunt because what if something relevant launched
- Peek at HackerNews for tech trends

And after all that? I still missed that our competitor launched a major feature. Found out from a SALES CALL. Two weeks late. The worst part is the anxiety. I subscribe to 12 newsletters. I skim maybe 2. I read 0 thoroughly. But I can't unsubscribe because what if I miss something important? I've tried everything:

- RSS readers (dead)
- Saved folders (never check them)
- ChatGPT for research (doesn't know my context, gives generic answers)
- Zapier automations (broke after 2 weeks)
- Just "accepting I'll miss things" (the anxiety won)

My evening doomscroll is half "staying current" and half anxiety management. My partner thinks I'm addicted to my phone. Maybe I am. But it's not entertainment—it's fear of being the PM who missed the signal everyone else saw. I spend more time GATHERING information than actually THINKING about what to build. Anyone else feel this way? Or have I just lost the plot?

4 4
gman21 about 9 hours ago

Ask HN: Unusual Network Filter

I encountered a strange situation: in the public sector, there is an isolated network (WiFi-5G AP-router-modem made by Oppo and branded by a wireless carrier; one Linux desktop connected to the Oppo router over Ethernet, one Android TV connected over WiFi, and employee-owned phones connected over WiFi). The unusual problem is that regardless of network settings (DNS, 2.4/5GHz, and so on), YouTube does not work on the TV but works everywhere else. The TV shows that the network has limited access and does not load the YouTube app (but loads the Netflix app). YouTube is available on both the desktop over Ethernet and the phones over WiFi. Curl on the Linux desktop shows a proper HTTP response from the YouTube site. The router control panel is accessible and does not contain any unusual entries that might block YouTube access on the TV. Do you have any idea how YouTube might be blocked on the TV?

3 0
naolbeyene about 7 hours ago

Ask HN: How do you authorize AI agent actions in production?

I'm deploying AI agents that can call external APIs – process refunds, send emails, modify databases. The agent decides what to do based on user input and LLM reasoning.

My concern: the agent sometimes attempts actions it shouldn't, and there's no clear audit trail of what it did or why.

Current options I see:

1. Trust the agent fully (scary)
2. Manual review of every action (defeats automation)
3. Some kind of permission/approval layer (does this exist?)
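For what it's worth, the shape of option 3 as I imagine it is tiny; the tool names, risk flags, and approval hook below are all hypothetical:

```python
# A policy layer between the LLM and the tools: an allowlist, an approval
# gate for high-risk actions, and an append-only audit log of every attempt.
import time

POLICY = {
    "send_email":     {"allowed": True,  "needs_approval": False},
    "process_refund": {"allowed": True,  "needs_approval": True},   # high-risk
    "drop_table":     {"allowed": False, "needs_approval": False},
}

AUDIT_LOG = []

def execute(tool, args, reason, approver=None):
    """Gate a tool call through policy; record every attempt with its reason."""
    rule = POLICY.get(tool, {"allowed": False, "needs_approval": False})
    entry = {"ts": time.time(), "tool": tool, "args": args, "reason": reason}
    if not rule["allowed"]:
        entry["outcome"] = "denied"
    elif rule["needs_approval"] and approver is None:
        entry["outcome"] = "pending_approval"  # park it for a human
    else:
        entry["outcome"] = "executed"          # the real tool call goes here
    AUDIT_LOG.append(entry)
    return entry["outcome"]

execute("process_refund", {"order": 123, "amount": 40}, "customer request")
```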

For those running AI agents in production:

- How do you limit what the agent CAN do?
- Do you require approval for high-risk operations?
- How do you audit what happened after the fact?

Curious what patterns have worked.

3 3
terabytest 2 days ago

Ask HN: Do you have any evidence that agentic coding works?

I've been trying to get agentic coding to work, but the dissonance between what I'm seeing online and what I'm able to achieve is doing my head in.

Is there real evidence, beyond hype, that agentic coding produces net-positive results? If any of you have actually got it to work, could you share (in detail) how you did it?

By "getting it to work" I mean: * creating more value than technical debt, and * producing code that’s structurally sound enough for someone responsible for the architecture to sign off on.

Lately I’ve seen a push toward minimal or nonexistent code review, with the claim that we should move from “validating architecture” to “validating behavior.” In practice, this seems to mean: don’t look at the code; if tests and CI pass, ship it. I can’t see how this holds up long-term. My expectation is that you end up with "spaghetti" code that works on the happy path but accumulates subtle, hard-to-debug failures over time.

When I tried using Codex on my existing codebases, with or without guardrails, half of my time went into fixing the subtle mistakes it made or the duplication it introduced.

Last weekend I tried building an iOS app for pet feeding reminders from scratch. I instructed Codex to research and propose an architectural blueprint for SwiftUI first. Then, I worked with it to write a spec describing what should be implemented and how.

The first implementation pass was surprisingly good, although it had a number of bugs. Things went downhill fast, however. I spent the rest of my weekend getting Codex to make things work, fix bugs without introducing new ones, and research best practices instead of making stuff up. Although I made it record new guidelines and guardrails as I found them, things didn't improve. In the end I just gave up.

I personally can't accept shipping unreviewed code. It feels wrong. The product has to work, but the code must also be high-quality.

431 437
oliverjanssen 1 day ago

Tell HN: 2 years building a kids audio app as a solo dev – lessons learned

Hi,

I started Muky in April 2024. Classic side project that got out of hand. We have two kids - the younger one is happy with the Toniebox, but our older one outgrew it. She started asking for specific songs, audiobooks that aren't available as figurines, and "the music from that movie."

We had an old iPad Mini lying around and already pay for Apple Music. Felt dumb to keep buying €17/$20 figurines for 30-45 minutes of content when we have 100 million songs.

Now at version 4.0 after ~20 updates. Some lessons:

On the hardware vs app tradeoff: Toniebox and Yoto are brilliant for little ones – tactile, simple, no screen needed. But they hit a wall once kids want more. And handing a 5-year-old Apple Music means infinite scrolling and "Dad, what's this song about?" Muky sits in between – full library access, but parents control what's visible.

On sharing: Remember lending CDs or cassettes to friends? Or kids swapping Tonie figurines at a playdate? I wanted that for a digital app. So I built QR code sharing. Scan, import, done. And unlike a physical thing – both keep a copy.

On onboarding: First versions: empty app, figure it out yourself. Retention was awful. Now: 4-step onboarding that actually guides you. Should've done this from the start.

On content discovery: 100 million songs sounds great until you have to find something. Parents don't want to search – they want suggestions. Spent a lot of time building a Browse tab with curated albums and audiobooks for kids. Finally feels like the app helps you instead of just waiting for input.

On going native: Went with Swift/SwiftUI instead of Flutter or React Native. No regrets - SwiftUI is a joy to work with and performance is great. Android users ask for a port regularly. No capacity for that now, but Swift for Android is progressing (https://www.swift.org/documentation/articles/swift-sdk-for-a...). Maybe one day. CarPlay is another one parents keep asking for – going native should make that easier to add, if Apple grants me the entitlement.

On subscriptions vs one-time: Started with one-time purchase. Revenue spikes at launch, then nothing. Switched to subscription – existing one-time buyers kept full access. Harder to sell, but sustainable.

Ask me anything about indie iOS dev or building for kids. App is at https://muky.app if you're curious.

132 74
koconder about 1 hour ago

Ask HN: I'm sure more than just Microsoft is down rn

Am I going crazy? All day I have seen nginx errors from cloud services, GitHub Actions failures, and Hugging Face unable to download data.

Downdetector is showing outages everywhere from Microsoft to Cloudflare, but only Microsoft has said they have an outage?

X was down this morning for a few hours too. Anyone else have intel? X is coming up blank.

8 3
eeezl0dey about 3 hours ago

Ask HN: Thoughts on monitoring multi-chain staking and alerts with KoinyxBot

I’ve been working on a Telegram bot to help me (and others) track staking activity across multiple proof-of-stake networks.

Over time staking rewards can vary for reasons that aren’t obvious at a glance, like:

validator performance drift,

commission changes,

reward timing differences between networks, or

subtle slashing-related events.

Most dashboards show current state (balance/APY) but not significant changes over time or the events that caused them. That makes it easy to miss drift until you notice lower than expected rewards.

So I built a monitoring bot that watches many wallets across chains and focuses on generating alerts when things meaningfully change, rather than polling frequently for small or irrelevant changes.
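The alerting core is deliberately dumb; roughly this shape, with a placeholder threshold:

```python
# Alert on meaningful change, not on every poll: compare the latest epoch's
# reward to the trailing average and fire only past a configured drop.
def reward_alert(history, latest, drop_pct=20):
    """history: recent per-epoch rewards; returns an alert string or None."""
    if not history:
        return None  # no baseline yet
    baseline = sum(history) / len(history)
    if latest < baseline * (1 - drop_pct / 100):
        return f"reward {latest} is >{drop_pct}% below trailing avg {baseline:.2f}"
    return None

reward_alert([10.1, 9.9, 10.0], 7.4)  # fires
reward_alert([10.1, 9.9, 10.0], 9.6)  # stays quiet
```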

The bot is already usable via Telegram (@KoinyxBot) and I post updates to a channel (@koinyx). There’s no public landing page yet, so I’m skipping a URL here — but I wanted to ask this community directly:

What are the most useful signals or events you think a multi-chain staking monitor should alert about? For example:

meaningful drops in reward rate vs historical pace,

validator performance anomalies,

unexpected epochs with no rewards,

changes in commission or validator status,

cross-chain differences in timing of reward distributions.

I’m not looking for marketing feedback — just what patterns or notifications would actually be useful to people who stake actively across networks.

Thanks for any perspectives!

2 0
ATechGuy 2 days ago

Ask HN: Why are so many rolling out their own AI/LLM agent sandboxing solution?

Seeing a lot of people running coding agents (Claude Code, etc.) in custom sandboxes: Docker/VMs, firejail/bubblewrap, scripts that gate file or network access.

Curious to know what's missing that makes people DIY this? And what would a "good enough" standard look like?

27 10
nonethewiser about 5 hours ago

Ask HN: GitHub "files changed" tab change?

Is anyone else experiencing this?

Apparently the "Files changed" tab works differently now. I navigated to the tab and now instead of `/files` it goes to `/changes`. It looks very similar except:

It shows me 1 file at a time. I can't scroll to see everything. I can't even search the diffs. It's maddening. I can still see all files on smaller PRs.

Seems like maybe it's set by a viewcontent.github.com cookie, but it reappears every time I delete it while logged in. If I try to navigate to `/files` it redirects to `/changes`. Definitely tied to user account. I can still see the old `/files` page when not logged in. Can't find anything about this elsewhere.

2 0
zkid18 3 days ago

Ask HN: COBOL devs, how is AI coding affecting your work?

Curious to hear from anyone actively working with COBOL/mainframes. Do you see LLMs as a threat to your job security, or the opposite?

I feel that the mass of code that actually runs the economy is remarkably untouched by AI coding agents.

168 183
PL_Venard 1 day ago

Ask HN: Does "Zapier for payment automation" exist?

It's 2026 and I shouldn't spend 3 hours every month manually splitting $15K revenue:

• 50% to co-founder
• 10% across 3 contractors
• 5-10% to ~15 affiliates
• 30% to tax account

This should be automated. Maybe through a kind of workflow builder that can trigger money flows.

What I've tried:

- Stripe Connect: Only splits to one account
- Zapier: Can't actually move money (ToS restriction)
- Manual scripts: Works, but now I'm maintaining financial infrastructure
- Escrow.com: $100 min fee, designed for one-off transactions

What I want: Set rules once → money routes automatically each month.
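The rules engine I have in mind is almost embarrassingly small; the recipients and percentages below just mirror the example above (affiliates pinned at the 5% lower bound):

```python
# Declarative split table, evaluated once a month. Integer cent math so the
# splits always sum exactly to the total.
RULES = [
    ("cofounder",   50),  # percent
    ("contractors", 10),  # divided 3 ways downstream
    ("affiliates",   5),
    ("tax_account", 30),
]

def split(total_cents):
    out = {name: total_cents * pct // 100 for name, pct in RULES}
    out["operating"] = total_cents - sum(out.values())  # unallocated remainder
    return out

split(1_500_000)  # $15K in cents
```

The hard part is obviously not this arithmetic but the regulated money-movement rails underneath it, which is presumably why nobody ships it casually.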

Questions:

1. Does this exist? (Feels like I'm bad at searching)

2. If not, why? Regulatory? Nobody trusts automation with money? Technical blocker? Stablecoins could maybe help ship this.

3. What's your current solution? Custom code? Just manual transfers?

I've talked to ~20 founders. Most are either:

- Writing custom scripts (requires dev skills + maintenance)
- Paying accountants (expensive, still manual)
- Suffering through manual transfers (time sink)

Seems like a gap between "fully manual" and "build your own payment infrastructure."

Am I missing something obvious?

8 12
pragmaticalien8 1 day ago

Tell HN: Claude session limits getting small

I am a Max subscriber for claude.ai, using their browser UI and the Claude desktop app, not an API or Claude Code user. In the last week, every session of mine has lasted no more than an hour or so. Have any of you come across this? Is this something new? Reaching out to their help is basically dealing with Claude and canned responses; I escalated to a human, but they've yet to explain or respond.

23 14
movedx 2 days ago

Ask HN: Revive a mostly dead Discord server

Hello :-)

I have a Discord server I set up a long time ago. Around 2016 I think. Back then, it was lively and active and loads of fun. Over time it's developed close to 5,000 members (it actually had over 5,000 members at one point) and currently has 501 members online as I type this. It's more likely there's about 10-15 that are paying attention to anything happening.

It's a Discord that originally focused on DevOps. It complemented my YouTube channel on the same topic, but since then, as it's slowly died out and my channel's focus has shifted and changed, it's become a bit of a wasteland.

It's a shame really, because a really fun Discord server can be a great place to be, but I'm not sure where to take it now.

How would you handle this situation? What would be your approach to reviving the Discord and perhaps trying to get a community of like-minded hackers going again in 2026?

I won't link the Discord here as I'm not trying to beg for users or spam. I just genuinely want to work on a solution to improve the life of the server. I will put it in my HN profile, though, so if you do want to check it out that extra step is required.

Are people even interested in Discord servers any more? I don't know.

Thanks in advance.

19 28
hbarka about 16 hours ago

Ask HN: Why does Google Maps still use mercator projection?

Mercator projection exaggerates land size near the poles. Example: Greenland. I understand why this was necessary for flat paper maps and paper navigation but on the internet, web maps should be able to dynamically adjust based on viewing tangent. The true relative size would be as if you’re looking at a globe map and your sightline is tangent to the curve of the globe.

5 2
AznHisoka about 8 hours ago

Ask HN: Is GitHub Down?

Status page is up, but can't seem to git pull or push

11 5
andrewstetsenko about 5 hours ago

Ask HN: Do you have side income as a software engineer?

Yesterday, I had a long call with a friend, a tech recruiter with over a decade in the field.

She was tired not from resumes or interviews, but from engineers still negotiating like it's 2021.

The "golden bubble", where every frontend engineer was a unicorn and every offer had quadruple-digit equity, still shapes expectations. But the market today is more like… reality.

It’s not that talent isn’t valued. It’s that value has changed shape.

The engineers who bristle at lower stock grants or extra days in office aren’t wrong to want fairness… they’re just anchoring to a world that no longer exists or might quickly disappear.

I don’t believe in surrender. I believe in recalibration.

What happens when we stop pretending the market is unchanged, and start asking: How do I create a career that isn’t a bubble-dependent bet? What do I build that isn’t tied to a single job title or company’s stock price?

The most interesting people I talk to today are already thinking in parallel streams: mentoring, writing, consulting, advising, building side projects, diversifying, not out of fear, but because the old narrative of one job = stable identity doesn’t hold.

I’m curious: do you have side income as a software engineer? If so, what’s worked (or not) for you?

10 3
remusomega 1 day ago

Tell HN: Avoid Cerebras if you are a founder

I was an Enterprise customer on their platform. Cerebras began terminating production models and replacing them every few months. This time they notified us they are terminating Llama 3.3 70B, which is the model my plan was subscribed to.

Instead of offering us an alternative plan they told us all others are sold out and have kicked us off their platform.

Their Discord support group has multiple Enterprise customers all getting the same treatment. Their own support staff is telling them to migrate to Groq.

You can’t build a stable business on a company that terminates models this frequently, does not respect its customers (especially Enterprise customers, not just small utilizers), and offers zero contingency plans if it decides to axe your model.

Because of their architecture, they can only host a finite number of models. Turnover is fast, so if your model gets marked for deprecation, you will get forced off the platform.

Cerebras is cool for personal hobby projects, but it is in absolutely no position to be selling "Enterprise" accounts to businesses.

34 14
donatj 1 day ago

Ask HN: How locked down are your work machines?

I've been working as a Software Engineer for 20+ years.

Places I worked in the early years barely had an IT department at all. As a developer you were expected to be able to maintain your machine. We'd install whatever we wanted, experiment with different operating systems, etc. Total free rein; the box was our tool to get work done with, and they didn't care how you did it.

That went away a long time ago. Basic corporate spyware and rules came pretty early, but we still had free rein over our tools.

I've worked with the same company for close to a decade now, and they have been tightening and tightening the noose slowly but surely. We're purportedly a software company, but we lost admin rights, installable software went from a blocklist to an allowlist. Everything we install needs to get approved by IT, and that approval takes weeks.

Today they took our Chrome extensions away. They've got an allowlist of about 15 extensions we can install. Everything I submitted for approval got rejected.

I'm frustrated with this arrangement and am wondering how standard this is in the industry these days.

So I'm genuinely curious, Hacker News: How big of a company do you work for, what industry, and how locked down is your machine?

18 22
Olshansky 3 days ago

Ask HN: How do you run parallel agent sessions?

Anthropic has these docs that use git worktree: https://code.claude.com/docs/en/common-workflows#run-paralle...

There are some apps like this that leverage git-worktrees under the hood: https://conductor.build

I've tried lazygit as well to make it more convenient.

I still end up preferring having multiple clones of my repo when I need to ensure the agents don't accidentally overlap.

Curious what other people do.

7 2
jimnotgym 2 days ago

Ask HN: Which common map projections make Greenland look smaller?

I see an urgent need for a map projection that makes Greenland look as small as possible. What are the options?

18 17
blhack 2 days ago

Ask HN: What have you built/shipped with Claude Code?

I'm trying not to get too caught up in the hype here. I'm curious what you've actually built or shipped so far with claude-code.

I made a little phonics flashcard game for my kids https://apps.apple.com/us/app/3letterstories/id6753956099

Also related to that: a tool to fine tune the images on the flash cards (which are all AI generated on gemini).

Some internal tooling - a self contained page that does JSON/python dict visualization - this obviously exists already.

Nothing big yet, but it's only been a couple of days. This tool does seem really good at building frontends/dashboards.

9 4
petetnt about 7 hours ago

Tell HN: GitHub has experienced issues 60% of days this year

Looking at https://www.githubstatus.com/history, there's been a significant issue at GitHub on 13 of the 22 days this year. Those are just the incidents that warrant their own status page entry; from personal experience, there's some sort of degraded performance in the service almost daily, most often in the evening in my local time.

As a long-time paying GitHub user, the stability has been bad for years now, and the trend has only accelerated recently. What's happening inside GitHub, and can we expect it to ever get better?

5 1
RobertSerber 1 day ago

How do you keep AI-generated applications consistent as they evolve over time?

Hi HN,

I’ve been experimenting with letting LLMs generate and then continuously modify small business applications (CRUD, dashboards, workflows). The first generation usually works — the problems start on the second or third iteration.

Some recurring failure modes I keep seeing:

• schema drift that silently breaks dashboards
• metrics changing meaning across iterations
• UI components querying data in incompatible ways
• AI fixing something locally while violating global invariants

What’s striking is that most AI app builders treat generation as a one-shot problem, while real applications are long-lived systems that need to evolve safely.

The direction I’m exploring is treating the application as a runtime model rather than generated code:

• the app is defined by a structured, versioned JSON/DSL (entities, relationships, metrics, workflows)
• every AI-proposed change is validated by the backend before execution
• UI components bind to semantic concepts (metrics, datasets), not raw queries
• AI proposes structure; the runtime enforces consistency
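A minimal sketch of the "AI proposes, runtime validates" loop, with made-up entity and metric names:

```python
# The app is a versioned model; a proposed change is applied only if global
# invariants (here: metrics bind to real entities/fields) still hold.
import copy

APP = {
    "version": 3,
    "entities": {"order": ["id", "total"], "customer": ["id", "name"]},
    "metrics":  {"revenue": {"entity": "order", "field": "total"}},
}

def validate(model):
    """Check every metric binds to an existing entity and field."""
    errors = []
    for name, m in model["metrics"].items():
        if m["entity"] not in model["entities"]:
            errors.append(f"metric {name}: unknown entity {m['entity']}")
        elif m["field"] not in model["entities"][m["entity"]]:
            errors.append(f"metric {name}: unknown field {m['field']}")
    return errors

def apply_change(model, mutate):
    """Apply an AI-proposed mutation to a copy; reject it if invariants break."""
    candidate = copy.deepcopy(model)
    mutate(candidate)
    errors = validate(candidate)
    if errors:
        return model, errors       # reject: dashboards keep working
    candidate["version"] += 1
    return candidate, []

# The AI drops the field the revenue metric binds to -> rejected, no drift
def drop_total(m):
    m["entities"]["order"] = ["id"]

new_app, errs = apply_change(APP, drop_total)
```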

Conceptually this feels closer to how Kubernetes treats infrastructure, or how semantic layers work in analytics — but applied to full applications rather than reporting.

I’m curious:

• Has anyone here explored similar patterns?
• Are there established approaches to controlling AI-driven schema evolution?
• Do you think semantic layers belong inside the application runtime, or should they remain analytics-only?

Not pitching anything — genuinely trying to understand how others are approaching AI + long-lived application state.

Thanks.

10 0