Data Activation Thoughts
galsapir Sunday, January 18, 2026
i've been working with healthcare/biobank data and keep thinking about what "data moats" mean now that llms can ingest anything. an a16z piece from 2019 said moats were eroding; now the question seems to be whether you can actually make your data useful to these systems, not just have it. there's some recent work (tables2traces, ehr-r1) showing you can convert structured medical data into reasoning traces that improve llm performance, but the approaches are still rough and synthetic traces don't fully hold up to scrutiny. (writing this to think through it, not because i have answers.)
Summary
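The post asks what "data moats" mean now that LLMs can ingest almost any data, and suggests the real question is whether you can make your data useful to these systems rather than merely own it. It points to recent work (tables2traces, ehr-r1) on converting structured medical data into reasoning traces that improve LLM performance, while noting that the approaches are still rough and that synthetic traces do not fully hold up to scrutiny.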
galsapir.github.io