Show HN: Built a tool to solve the nightmare of chunking tables in PDF vs. Markdown
2dogsanerd Sunday, November 23, 2025
Hey HN, solo dev here. After years of frustration with how LLMs handle complex documents, especially PDFs with tables, I decided to build a solution myself. My approach uses a Markdown conversion step to preserve the table structure, which seems to work surprisingly well for chunking. This little parser is the first public piece of a much larger, privacy-focused AI platform I'm building. I'm pretty much running on fumes financially, so any feedback, critique, or support is massively appreciated. Happy to answer any questions about the approach!
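To make the idea concrete, here is a minimal sketch of table-aware chunking over Markdown. It is not the project's actual code: `split_blocks`, `chunk_markdown`, and `max_chars` are hypothetical names, and it assumes the PDF has already been converted to Markdown with pipe-style tables. The only point it illustrates is that once a table is a run of `|` rows, it can be treated as one unit that the chunker never cuts in half.

```python
def split_blocks(markdown: str) -> list[str]:
    """Split Markdown into blocks; a run of pipe-table rows stays one block."""
    blocks: list[str] = []
    current: list[str] = []
    in_table = False
    for line in markdown.splitlines():
        is_row = line.lstrip().startswith("|")
        is_blank = not line.strip()
        # Close the current block on a blank line or when entering/leaving a table.
        if current and (is_blank or is_row != in_table):
            blocks.append("\n".join(current))
            current = []
        if not is_blank:
            current.append(line)
            in_table = is_row
    if current:
        blocks.append("\n".join(current))
    return blocks


def chunk_markdown(markdown: str, max_chars: int = 800) -> list[str]:
    """Greedily pack blocks into chunks of about max_chars, never cutting a block."""
    chunks: list[str] = []
    current = ""
    for block in split_blocks(markdown):
        candidate = f"{current}\n\n{block}" if current else block
        if current and len(candidate) > max_chars:
            # Adding this block would overflow, so start a new chunk with it.
            chunks.append(current)
            current = block
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    doc = "Intro paragraph.\n\n| a | b |\n|---|---|\n| 1 | 2 |\n\nClosing text."
    for chunk in chunk_markdown(doc, max_chars=20):
        print(repr(chunk))
```

One design consequence of this sketch: a table longer than `max_chars` becomes its own oversized chunk rather than being split, so a real pipeline would need a separate strategy for very large tables (for example, repeating the header row across row groups).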
Summary
The article describes a Smart Ingest Kit, an open-source project that provides a flexible and scalable solution for ingesting, processing, and storing data. The kit includes components for data ingestion, transformation, and storage, allowing users to build customized data pipelines to meet their specific requirements.
github.com