Show HN: Batchling – save 50% on any GenAI request in two lines of code
vienneraphael | Thursday, February 26, 2026

batchling is a Python gateway to provider-native GenAI Batch APIs, so your existing calls can run at batch-priced rates instead of standard realtime pricing.
As an AI developer myself, I discovered Batch APIs while tinkering with AI benchmarking: I wanted the 50% discount because I was fine with a 24-hour SLA.
What I discovered was a hard engineering reality:
- No standards: every provider's batch API has a different flow, and batch lifecycles are never the same.
- Framework shift: as a developer, switching from sync/async execution to a deferred flow (submit, poll, download) feels off and requires building custom code and storing files.
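To make the pain concrete, here is roughly what a single request looks like once repurposed into OpenAI's batch input format (one JSONL line per request, per OpenAI's Batch API documentation). The helper function is my own illustration, not batchling code, and other providers expect entirely different envelopes, which is exactly the "no standards" problem:

```python
import json

def to_openai_batch_line(custom_id: str, model: str, messages: list) -> str:
    """Illustrative helper: wrap one chat request in the JSONL envelope
    that OpenAI's Batch API expects. Anthropic, Gemini, etc. each use a
    different envelope and a different lifecycle."""
    return json.dumps({
        "custom_id": custom_id,          # your key for matching results later
        "method": "POST",
        "url": "/v1/chat/completions",   # the realtime endpoint, replayed in batch
        "body": {"model": model, "messages": messages},
    })

line = to_openai_batch_line(
    "req-1", "gpt-4o-mini",
    [{"role": "user", "content": "Hello"}],
)
```

You would then write thousands of these lines to a file, upload it with `purpose="batch"`, submit the batch, and poll until it finishes: boilerplate that has nothing to do with your application logic.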
That's when I noticed no open-source project solved this problem, so I built one myself.
Batch APIs are nothing new, but they lack awareness and adoption. The problem has never been the Batch API itself but its integration and developer experience.
batchling bridges that gap, giving everyone a developer-first experience of Batch APIs and unlocking scale and cost savings for compatible requests.
batchling was designed to be as seamless as possible: wrap your existing async code in an async context manager (the library's only entrypoint) to batch requests automatically.
Users can push this even further and use the CLI to wrap a whole function, without adding a single line of code.
Under the hood, batchling:
- intercepts requests in the scope of the context manager
- repurposes them to batch format
- manages the whole batch lifecycle (submit, poll, download)
- hands back responses once they are processed, so the script continues executing seamlessly.
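The polling step above is the part that usually ends up as hand-rolled boilerplate. A minimal sketch of what a lifecycle manager has to do, assuming a generic `get_status` callable standing in for a provider call such as `client.batches.retrieve(batch_id).status` (this is not batchling's API, just the shape of the problem it automates):

```python
import time
from typing import Callable

# Terminal states as named by OpenAI's Batch API; other providers differ.
TERMINAL = {"completed", "failed", "expired", "cancelled"}

def poll_until_done(get_status: Callable[[], str],
                    interval_s: float = 0.0,
                    max_polls: int = 1000) -> str:
    """Poll a batch's status until it reaches a terminal state."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL:
            return status
        time.sleep(interval_s)  # back off between polls
    raise TimeoutError("batch did not finish within max_polls")

# Simulate a batch that completes after two intermediate states:
states = iter(["validating", "in_progress", "completed"])
final = poll_until_done(lambda: next(states))
```

Every provider names its states and terminal conditions differently, so doing this per provider (plus downloads, retries, and file bookkeeping) is exactly the custom code the library aims to absorb.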
batchling v0.1.0a1 comes batteries-included with:
- Support for the major batch providers (Anthropic, Doubleword, Gemini, Groq, Mistral, OpenAI, Together, xAI)
- Extensive AI framework integrations (Instructor, LangChain, LiteLLM, Pydantic AI, Pydantic Evals, ...)
- Request caching: skip recomputing requests whose responses already exist in a batch you own.
- Python SDK (2 lines of code to change) and Typer CLI (no code change required)
- Rich documentation stuffed with examples: get started and run your first batch in minutes.
I believe this is a game changer for adoption and accessibility for any AI org, research lab, or individual that burns tokens through an API.
I'd love to get feedback and new ideas from AI developers in the technical community. The library is open to contributions, whether issues, docs fixes, or PRs.
Repo: https://github.com/vienneraphael/batchling
Docs: https://batchling.pages.dev