Ask stories

JPLeRouzic 5 days ago

Ask HN: How are Markov chains so different from tiny LLMs?

I polished a Markov chain generator and trained it on an article by Uri Alon et al. (https://pmc.ncbi.nlm.nih.gov/articles/PMC7963340/).

It generates text that seems to me at least on par with that of tiny LLMs, such as those demonstrated with NanoGPT. Here is an example:

  jplr@mypass:~/Documenti/2025/SimpleModels/v3_very_good$
  ./SLM10b_train UriAlon.txt 3
  
  Training model with order 3...
  
  Skip-gram detection: DISABLED (order < 5)
  
  Pruning is disabled
  
  Calculating model size for JSON export...
  
  Will export 29832 model entries
  
  Exporting vocabulary (1727 entries)...
  
  Vocabulary export complete.
  
  Exporting model entries...
  
    Processed 12000 contexts, written 28765 entries (96.4%)...
  
  JSON export complete: 29832 entries written to model.json
  
  Model trained and saved to model.json
  
  Vocabulary size: 1727
  
  jplr@mypass:~/Documenti/2025/SimpleModels/v3_very_good$ ./SLM9_gen model.json
Aging cell model requires comprehensive incidence data. To obtain such a large medical database of the joints are risk factors. Therefore, the theory might be extended to describe the evolution of atherosclerosis and metabolic syndrome. For example, late‐stage type 2 diabetes is associated with collapse of beta‐cell function. This collapse has two parameters: the fraction of the senescent cells are predicted to affect disease threshold . For each individual, one simulates senescent‐cell abundance using the SR model has an approximately exponential incidence curve with a decline at old ages In this section, we simulated a wide range of age‐related incidence curves. The next sections provide examples of classes of diseases, which show improvement upon senolytic treatment tends to qualitatively support such a prediction. model different disease thresholds as values of the disease occurs when a physiological parameter ϕ increases due to the disease. Increasing susceptibility parameter s, which varies about 3‐fold between BMI below 25 (male) and 54 (female) are at least mildly age‐related and 25 (male) and 28 (female) are strongly age‐related, as defined above. Of these, we find that 66 are well described by the model as a wide range of feedback mechanisms that can provide homeostasis to a half‐life of days in young mice, but their removal rate slows down in old mice to a given type of cancer have strong risk factors should increase the removal rates of the joint that bears the most common biological process of aging that governs the onset of pathology in the records of at least 104 people, totaling 877 disease category codes (See SI section 9), increasing the range of 6–8% per year. The two‐parameter model describes well the strongly age‐related ICD9 codes: 90% of the codes show R 2 > 0.9) (Figure 4c). 
This agreement is similar to that of the previously proposed IMII model for cancer, major fibrotic diseases, and hundreds of other age‐related disease states obtained from 10−4 to lower cancer incidence. A better fit is achieved when allowing to exceed its threshold mechanism for classes of disease, providing putative etiologies for diseases with unknown origin, such as bone marrow and skin. Thus, the sudden collapse of the alveoli at the outer parts of the immune removal capacity of cancer. For example, NK cells remove senescent cells also to other forms of age‐related damage and decline contribute (De Bourcy et al., 2017). There may be described as a first‐passage‐time problem, asking when mutated, impair particle removal by the bronchi and increase damage to alveolar cells (Yang et al., 2019; Xu et al., 2018), and immune therapy that causes T cells to target senescent cells (Amor et al., 2020). Since these treatments are predicted to have an exponential incidence curve that slows at very old ages. Interestingly, the main effects are opposite to the case of cancer growth rate to removal rate We next consider the case of frontline tissues discussed above.
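
For comparison, here is a minimal sketch of an order-n, word-level Markov chain in Python. It is not the SLM10b_train / SLM9_gen code used above (that source is not shown here); the function names and the toy corpus are illustrative assumptions. The model simply maps each n-word context to the words that followed it in the training text, then samples from those followers.

```python
import random
from collections import defaultdict

def train(words, order=3):
    """Map each `order`-word context to the list of words that followed it."""
    model = defaultdict(list)
    for i in range(len(words) - order):
        context = tuple(words[i:i + order])
        model[context].append(words[i + order])
    return model

def generate(model, order=3, length=50, seed=None):
    """Sample up to `length` words by repeatedly picking a seen follower."""
    rng = random.Random(seed)
    context = rng.choice(list(model.keys()))
    out = list(context)
    for _ in range(length - order):
        followers = model.get(tuple(out[-order:]))
        if not followers:  # dead end: context only seen at the end of the corpus
            break
        out.append(rng.choice(followers))
    return " ".join(out)

# Toy corpus (hypothetical); a real run would read a text file and split it.
corpus = "the cat sat on the mat and the cat sat on the rug".split()
model = train(corpus, order=2)
print(generate(model, order=2, length=12, seed=0))
```

Because followers are stored with repetition, sampling uniformly from the list reproduces the observed transition frequencies. Every (n+1)-word window in the output is copied from the training text, which is one reason output from a small corpus can read so fluently.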

198 172
nemsj about 4 hours ago

Amex Architecture

I was reading about AMEX architecture (some blogs, LLM discussions...) and was wondering whether AMEX still uses z/TPF. I can't find any clear information about this online; the most I found is a decade-old discussion. If anyone can help me with this, it would be great!

2 0
GaryBluto about 9 hours ago

Ask HN: Where can you find old NetBSD packages?

I've been meaning to set up an air-gapped NetBSD 1.6 computer for playing music and writing, but I am unable to find any packages or source code for programs from that era. archive.netbsd.org only has packages from release 7 onwards.

7 2
namesarehard about 14 hours ago

Ask HN: Current state of Android USB tethering?

Does anyone know which Android phones besides Pixel 6 and newer support CDC NCM USB tethering?

I tried a few Samsung phones (S21 - S25) and a Xiaomi Redmi 13, and they only support RNDIS.

Also, I compiled a list of my findings, and if anyone is interested, it’s open for contributions: https://github.com/namesarehard0/android-usb-tethering

7 0
urnicus about 20 hours ago

Ask HN: Are you still working with a website that requires Internet Explorer?

I ran across one of these warnings the other day while working with a government website, "For best results, we suggest using Internet Explorer".

That warning was so common during the 2000s. Nowadays, it is like catnip to me when I run across one. Does anybody still need to interact with one of these types of websites? Are you able to interact with it directly, or do you need to use a virtual machine? I'm so curious.

10 7
Ftrea 1 day ago

Ask HN: How would you architect a RAG system for 10M+ documents today?

I'm tasked with building a private AI assistant for a corpus of 10 million text documents (living in PostgreSQL). The goal is semantic search and chat, with a requirement for regular incremental updates.

I'm trying to decide between:

Bleeding edge: Implementing something like LightRAG or GraphRAG.

Proven stack: Standard Hybrid Search (Weaviate/Elastic + Reranking) orchestrated by tools like Dify.

For those who have built RAG at this scale:

What is your preferred stack for 2025?

Is the complexity of Graph/LightRAG worth it over standard chunking/retrieval for this volume?

How do you handle maintenance and updates efficiently?

Looking for architectural advice and war stories.
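
To make the "standard hybrid search" option concrete: one common way to merge a keyword result list with a vector result list is reciprocal rank fusion (RRF). The sketch below is plain Python and does not call Weaviate, Elastic, or Dify; the document IDs and the constant k=60 (a conventional default) are assumptions for illustration.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by document ID for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical top-3 results from a keyword (BM25) query and a vector query.
keyword_hits = ["doc3", "doc1", "doc7"]
vector_hits = ["doc1", "doc5", "doc3"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that appear in both lists rise to the top, and the fused list is what would typically be passed to a reranker. RRF needs only ranks, not comparable scores, which is why it is a popular glue step between otherwise unrelated retrievers.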

18 4
ramharts 4 days ago

Facebook has made it impossible to delete Pages – dark patterns everywhere

I'm honestly shocked at how bad the current Facebook interface has become. I’m trying to delete a Page I own, and the platform basically makes it impossible. The options have moved or disappeared, the Page Settings menu leads to the wrong profile, Business Suite doesn’t show the Page, and the “Access and Control” section doesn’t list it at all.

Facebook keeps bouncing me between:

– personal profile settings
– business portfolio settings
– Meta Business Suite
– classic Page UI

None of them give the actual option to delete the Page. It’s like the platform actively hides the feature.

And here’s the worst part: I AM the admin. I can publish on the Page. I can edit it. I can manage everything… except delete it.

I get that Meta wants to keep pages alive for engagement and ad data, but blocking users from removing something they own is straight-up abusive UX. No user should have to waste hours navigating four different interfaces to do something basic like “delete a page.”

If anyone has figured out the REAL way to delete a Page in 2025 with the new Facebook UI (which keeps changing), please share. Meta’s documentation is outdated, and their support is nonexistent.

This shouldn’t be this hard.

45 15
jacobwilliamroy 3 days ago

Ask HN: What is the current state of the art in BIG (>5TB) cloud backups?

I'm talking about backups greater than 5 TB in size. Rclone looks really good because I can just give it a bandwidth limit, point it at Google Drive, and fire and forget. But I'm curious whether that is the best way to do this. What does HN think?
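
As a rough sanity check on the "bandwidth limit and forget" approach, here is a back-of-the-envelope estimate of how long an initial 5 TB upload takes at a given cap. It assumes decimal units (1 TB = 10^12 bytes), a steady link, and no provider-side quotas or retries; the limits listed are arbitrary examples.

```python
def transfer_days(size_tb, limit_mbit_per_s):
    """Days needed to move `size_tb` terabytes at a steady `limit_mbit_per_s`."""
    size_bits = size_tb * 1e12 * 8           # decimal terabytes to bits
    seconds = size_bits / (limit_mbit_per_s * 1e6)
    return seconds / 86400                   # seconds per day

for limit in (50, 100, 500):
    print(f"{limit} Mbit/s: {transfer_days(5, limit):.1f} days")
```

Note that rclone's `--bwlimit` is expressed in bytes per second rather than bits, so a figure in Mbit/s needs to be divided by 8 before it is passed in.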

20 18
emmahexa 1 day ago

Restaurant Shift Scheduling via Linear Optimization and Staff Constraints

I’m working on a scheduling tool for restaurant / hospitality business owners and want to get some real feedback before I go too deep.

The idea is: you pick what matters most that week (keeping labor cost down, making schedules fair, matching staffing to projected sales, etc.), and the software automatically builds a full schedule based on who’s available and how productive each person is. No more scheduling by hand.

It would:

1. Generate a weekly schedule for you
2. Respect everyone's availability/time-off
3. Keep an eye on labor cost %
4. Adjust staffing with projected sales
5. Let you compare different “optimized” versions (cheapest, most fair, best sales coverage)

If you’re the one making schedules now, what’s the most annoying part? What would make a tool like this worth using?

Appreciate any honest thoughts!
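
To illustrate the kind of problem being solved, here is a toy sketch of the "cheapest" variant: one person per shift, availability respected, and a cap on shifts per person as a crude fairness rule. It uses exhaustive search as a stand-in for a linear or integer programming solver, so it only works for tiny inputs; the staff names, rates, and shift length are made up.

```python
from itertools import product

# Hypothetical staff data: hourly rate and the days each person can work.
staff = {
    "ana": {"rate": 15.0, "available": {"mon", "tue"}},
    "ben": {"rate": 12.0, "available": {"mon", "wed"}},
    "cy":  {"rate": 14.0, "available": {"tue", "wed"}},
}
shifts = ["mon", "tue", "wed"]
SHIFT_HOURS = 8

def cheapest_schedule(staff, shifts, max_shifts):
    """Try every assignment of one person per shift and keep the cheapest valid one."""
    best, best_cost = None, float("inf")
    for combo in product(sorted(staff), repeat=len(shifts)):
        # Constraint: everyone must be available for the shift they are given.
        if any(s not in staff[p]["available"] for p, s in zip(combo, shifts)):
            continue
        # Constraint: nobody works more than `max_shifts` shifts.
        if any(combo.count(p) > max_shifts for p in set(combo)):
            continue
        cost = sum(staff[p]["rate"] * SHIFT_HOURS for p in combo)
        if cost < best_cost:
            best, best_cost = dict(zip(shifts, combo)), cost
    return best, best_cost

schedule, cost = cheapest_schedule(staff, shifts, max_shifts=2)
print(schedule, cost)
```

Tightening `max_shifts` to 1 forces a fairer but more expensive schedule, which is the cost-versus-fairness trade-off the comparison feature would expose. A real tool would hand the same objective and constraints to a proper solver, since the search space here grows exponentially with the number of shifts.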

2 8
throwawaybbbbbb 3 days ago

Tell HN: Cursor exposes side projects to your employer

I went to look at my Cursor (the AI IDE) analytics and clicked a banner advertising their new company-level analytics dashboard. It now has a section “AI Edits by repository” that includes all the repositories used with Cursor, including your personal side projects. [0] I suspect they scrape the name of the repository from the list of Git remotes, without explicit consent or notice.

If you're using Cursor with a company (teams, enterprise) subscription, information about all your code commits is sent to their API. This telemetry cannot be disabled and is available in a highly granular format in their API. [1]

The dashboard also includes information on when you were writing code. [2] The data is available in a highly granular format in their API. [3]

[0]: https://cursor.com/docs/account/teams/analytics#repository-insights

[1]: https://cursor.com/docs/account/teams/ai-code-tracking-api#get-ai-commit-metrics-json-paginated

[2]: https://cursor.com/docs/account/teams/analytics#daily-usage

[3]: https://cursor.com/docs/account/teams/ai-code-tracking-api#get-ai-code-change-metrics-json-paginated

32 22
clostao 6 days ago

Ask HN: Cloud providers are losing in favor of bare-metal?

Lately, I’ve noticed a new trend on X: Devs (and indie hackers in particular) are ditching cloud providers and jumping straight to bare-metal servers like Hetzner.

Honestly, I think the big cloud companies just haven’t kept up. Their services feel clunky compared to the standalone alternatives. Just try comparing Vercel’s dev experience to Amplify’s, and you’ll see what I mean. On top of that, AWS has gotten way stingier with startup credits.

Put those two together, and it’s no surprise fewer people are hosting their MVPs on AWS. It’s tough to stay under $150/month with a database and a server, while on bare metal you can grab 16 GB RAM for around $20/month.

- Do you think the cloud is actually losing ground?
- And for those using bare-metal: how do you handle DB backups, CI/CD, and pulling logs?
- Would you scale something using bare-metal servers?

[Carlos](https://github.com/clostao)

35 26
vieews 2 days ago

Ask HN: Struggling founders, pls share your startup struggle

Founders,

what's been the hardest part of running or building your startup lately?

Whether it's fundraising, finding PMF, hiring, burnout, technical problems, customer acquisition, co-founder issues, runway stress, or anything else, we'd love to hear real stories.

16 15
jacobwilliamroy 2 days ago

Ask HN: What is the best way to see what files are being read in Windows?

I am looking at migrating a Windows server (Windows Server 2012 R2 Standard) and I am wondering if there is some way to learn which files are being read. I know the operating system keeps this metadata, but I have also learned that this metadata is unreliable. Is there a third-party tool or some kind of PowerShell script I can use to track this data?
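
For reference, the metadata in question is the last-access timestamp, and a quick way to see what it currently reports is to walk a directory tree and list files whose access time falls within a recent window. This is a Python sketch rather than PowerShell, the function name is illustrative, and it inherits the unreliability mentioned above: on NTFS, last-access updates may be disabled (checkable with `fsutil behavior query disablelastaccess`), in which case the timestamps are stale.

```python
import time
from pathlib import Path

def recently_accessed(root, days):
    """Return files under `root` whose last-access time is within `days` days."""
    cutoff = time.time() - days * 86400
    hits = []
    for path in Path(root).rglob("*"):
        try:
            if path.is_file() and path.stat().st_atime >= cutoff:
                hits.append(path)
        except OSError:
            # Skip files that vanish or cannot be read while scanning.
            continue
    return sorted(hits)

# Example (hypothetical path): recently_accessed(r"D:\Shares", days=30)
```

This is a point-in-time snapshot, not live tracing, so it is only a first pass at spotting candidates for "never read" data before the migration.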

5 4