Show stories

CzaxTanmay 3 days ago

Show HN: ÆTHRA – Writing Music as Code

Hi HN

I’m building ÆTHRA — a programming language designed specifically for composing music and emotional soundscapes.

Instead of focusing on general-purpose programming, ÆTHRA is a pure DSL where code directly represents musical intent: tempo, mood, chords, progression, dynamics, and instruments.

The goal is to make music composition feel closer to writing a story or emotion, rather than manipulating low-level audio APIs.

Key ideas:
- Text-based music composition
- Chords and progressions as first-class concepts
- Time, tempo, and structure handled by the language
- Designed for ambient, cinematic, emotional, and minimal music
- Interpreter written in C# (.NET)

Example ÆTHRA code (simplified):

  tempo 60
  instrument guitar

  chord Am for 4
  chord F for 4
  chord C for 4
  chord G for 4

This generates a slow, melancholic progression suitable for ambient or cinematic scenes.

ÆTHRA currently:
- Generates WAV audio
- Supports notes, chords, tempo, duration, velocity
- Uses a simple interpreter (no external DAWs or MIDI tools)
- Is intentionally minimal and readable
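To make the "chords in, WAV out" idea concrete, here is a minimal Python sketch that renders the progression from the example as plain sine-wave triads. It is not ÆTHRA's interpreter (which is written in C#); the chord voicings, the sample rate, the fade envelope, and every function name are assumptions made for illustration, and "for 4" is assumed to mean four beats.

```python
import math
import struct
import wave

SAMPLE_RATE = 22050  # assumed; the post does not state ÆTHRA's output format

# Triads as MIDI note numbers (assumed voicings, not taken from ÆTHRA).
CHORDS = {
    "Am": (57, 60, 64),  # A3 C4 E4
    "F": (53, 57, 60),   # F3 A3 C4
    "C": (48, 52, 55),   # C3 E3 G3
    "G": (55, 59, 62),   # G3 B3 D4
}


def midi_to_hz(note):
    """Equal-temperament frequency with A4 (MIDI 69) at 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)


def render_chord(name, beats, tempo):
    """Return float samples in [-1, 1] for one chord held for `beats` beats."""
    seconds = beats * 60.0 / tempo
    count = int(seconds * SAMPLE_RATE)
    freqs = [midi_to_hz(m) for m in CHORDS[name]]
    samples = []
    for i in range(count):
        t = i / SAMPLE_RATE
        # 50 ms linear fade in and out to avoid clicks between chords.
        envelope = min(1.0, t / 0.05) * min(1.0, (seconds - t) / 0.05)
        value = sum(math.sin(2 * math.pi * f * t) for f in freqs) / len(freqs)
        samples.append(value * envelope)
    return samples


def render_progression(progression, tempo):
    """Concatenate the chords of [(name, beats), ...] into one sample list."""
    out = []
    for name, beats in progression:
        out.extend(render_chord(name, beats, tempo))
    return out


def write_wav(path, samples):
    """Write mono 16-bit PCM using the standard-library wave module."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        frames = b"".join(
            struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
        )
        wav.writeframes(frames)
```

Calling `write_wav("progression.wav", render_progression([("Am", 4), ("F", 4), ("C", 4), ("G", 4)], tempo=60))` would produce a 16-second file under these assumptions, since each beat lasts one second at tempo 60.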

What it is NOT:
- Not a DAW replacement
- Not MIDI-focused

Why I made it: I wanted a language where music is the primary output — not an afterthought. Something between code, emotion, and sound design.

The project is open-source and early-stage (v0.8). I’m mainly looking for:
- Feedback on the language design
- Ideas for musical features worth adding
- Thoughts from people into PL design, audio, or generative art

Repo: <https://github.com/TanmayCzax/AETHRA>

Thanks for reading — happy to answer questions or discuss ideas.

18 5
Show HN: Voiden – an offline, Git-native API tool built around Markdown
dhruv3006 about 5 hours ago

Hi HN,

We have open-sourced Voiden.

Most API tools are built like platforms. They are heavy because they optimize for accounts, sync, and abstraction - not for simple, local API work.

Voiden treats API tooling as files.

It’s an offline-first, Git-native API tool built on Markdown, where specs, tests, and docs live together as executable Markdown in your repo. Git is the source of truth.

No cloud. No syncing. No accounts. No telemetry. Just Markdown, Git, hotkeys, and your damn specs.

Voiden is extensible via plugins (including gRPC and WSS).

Repo: https://github.com/VoidenHQ/voiden

Download Voiden here: https://voiden.md/download

We'd love feedback from folks tired of overcomplicated and bloated API tooling!

github.com
12 4
Show HN: Zuckerman – minimalist personal AI agent that self-edits its own code
ddaniel10 about 6 hours ago

github.com
50 35
aa-on-ai about 3 hours ago

Show HN: The Pixel Funeral – A cemetery for dead design concepts

The article discusses the development of a pixel art funeral service, where a deceased person's life is commemorated through a customized digital memorial created by artists. It explores the emotional and technological aspects of this novel approach to honoring the departed.

pixel-funeral.vercel.app
2 0
Show HN: Stumpy – Secure AI Agents You Can Text
bluesnowmonkey about 3 hours ago

Hi HN, I'm Preston. I built this because I needed an AI assistant that could follow me around - not just live on my laptop. Stumpy agents run in the cloud, connect to Slack/SMS/Telegram/email, and can only contact people who've opted in.

Happy to answer questions. Feedback welcome at preston@stumpy.ai

stumpy.ai
2 0
Show HN: Taracode – Open-source DevOps AI assistant that runs 100% locally
taravision about 4 hours ago

The Tara Vision project is an open-source computer vision library written in Python, focused on providing high-performance, easy-to-use computer vision tools for a wide range of applications, including object detection, image segmentation, and more.

github.com
2 1
Show HN: Minimal – Open-Source Community driven Hardened Container Images
ritvikarya98 about 24 hours ago

I would like to share Minimal, an open-source collection of hardened container images built using Apko, Melange, and Wolfi packages. The images are built daily and checked for updates, and vulnerabilities are resolved as soon as a fix is available in the upstream source and the Wolfi package. It builds on existing open-source tooling and offers, for free, the kind of images that are otherwise sold commercially. Minimal demonstrates that it is possible to build and maintain hardened container images ourselves. We plan to support more images; the goal is to be community driven, adding images as required and keeping them fully customizable.

github.com
106 29
Mikulas_Tomanka about 5 hours ago

Show HN: A private FIRE calculator suite that runs in the browser

Hi HN,

I built Firenum because most FIRE calculators I found were either too simplistic or required uploading my entire financial life to a third-party server.

I wanted a comprehensive suite that could model more than just the 4% rule. This tool handles Coast, Lean, Fat, and Barista FIRE, but more importantly, it lets you model 'what-if' scenarios like market crashes or major life events to see how they impact your timeline.
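As a rough illustration of the arithmetic behind the 4% rule and a 'what-if' crash scenario, here is a small Python sketch. It is not Firenum's projection logic, which the post does not describe; the function names, the 5% real return, and the end-of-year contribution timing are all assumptions.

```python
def fire_number(annual_expenses, withdrawal_rate=0.04):
    """Portfolio size at which `withdrawal_rate` covers annual expenses."""
    return annual_expenses / withdrawal_rate


def years_to_fire(balance, annual_savings, annual_expenses,
                  real_return=0.05, shocks=None, max_years=100):
    """Years until the balance reaches the FIRE number, or None if never.

    `shocks` maps a year number to the return used in that year instead of
    `real_return`, e.g. {3: -0.30} for a 30% drop in year 3 (hypothetical).
    """
    shocks = shocks or {}
    target = fire_number(annual_expenses)
    for year in range(1, max_years + 1):
        growth = shocks.get(year, real_return)
        # Grow last year's balance, then add this year's savings.
        balance = balance * (1 + growth) + annual_savings
        if balance >= target:
            return year
    return None
```

With a 4% withdrawal rate the target is 25 times annual expenses, so 40,000 a year in spending implies a target of 1,000,000. Passing a `shocks` entry can only delay the result or leave it unchanged, which is the kind of comparison a scenario view makes visible.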

Key features:

-> Privacy-First: No signup required. All calculations and data persistence happen in your local storage. Nothing is sent to a backend.

-> Scenario Modeling: You can simulate market downturns to see the resilience of your plan.

-> Multi-Currency: Supports 8 major currencies.

-> Progress Tracking: A dashboard to visualize the 'boring middle' of the journey.

The goal was to make something as powerful as a complex spreadsheet but with a much better UX. I’d love to hear your thoughts on the projection logic and if there are any specific variables (like tax drag or inflation adjustments) you think are missing.

URL: https://firenum.com/

I’m happy to answer any questions about the math or the local-first implementation!

firenum.com
3 0
Show HN: Moltbook – A social network for moltbots (clawdbots) to hang out
schlichtm 4 days ago

Hey everyone!

Just made this over the past few days.

Moltbots can sign up and interact via CLI, no direct human interactions.

Just for fun to see what they all talk about :)

moltbook.com
253 854
DavidCanHelp about 3 hours ago

Show HN: Database Internals, a book by Claude Opus 4.5

The article provides an in-depth exploration of database internals, covering topics such as storage, indexing, and query processing. It offers a comprehensive understanding of the fundamental mechanisms and architectures that power modern databases, making it a valuable resource for developers and system architects.

cloudstreet-dev.github.io
2 0
simedw 2 days ago

Show HN: I trained a 9M speech model to fix my Mandarin tones

Built this because tones are killing my spoken Mandarin and I can't reliably hear my own mistakes.

It's a 9M Conformer-CTC model trained on ~300h (AISHELL + Primewords), quantized to INT8 (11 MB), runs 100% in-browser via ONNX Runtime Web.

Grades per-syllable pronunciation + tones with Viterbi forced alignment.
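For readers unfamiliar with the technique, here is a textbook Viterbi forced alignment over CTC outputs in plain Python. It is a sketch of the general method, not the author's code: the function name, the blank index of 0, and the list-of-lists input layout are assumptions. Given per-frame log-probabilities and the expected token sequence, it returns the frame span each token occupies, which is the information a per-syllable grader would then score.

```python
NEG_INF = float("-inf")


def ctc_forced_align(log_probs, targets, blank=0):
    """Best CTC path for `targets`; returns [(first_frame, last_frame), ...].

    log_probs[t][v] is the log-probability of vocabulary entry v at frame t.
    """
    # Extended sequence: blank, tok1, blank, tok2, ..., blank.
    ext = [blank]
    for tok in targets:
        ext += [tok, blank]
    T, S = len(log_probs), len(ext)

    score = [[NEG_INF] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]
    score[0][0] = log_probs[0][blank]
    if S > 1:
        score[0][1] = log_probs[0][ext[1]]

    for t in range(1, T):
        for s in range(S):
            candidates = [(score[t - 1][s], s)]          # stay in same state
            if s >= 1:
                candidates.append((score[t - 1][s - 1], s - 1))  # advance one
            # Skipping a blank is allowed only between two different tokens.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                candidates.append((score[t - 1][s - 2], s - 2))
            best, prev = max(candidates)
            if best > NEG_INF:
                score[t][s] = best + log_probs[t][ext[s]]
                back[t][s] = prev

    # A valid path ends on the last token or on the trailing blank.
    s = S - 1
    if S > 1 and score[T - 1][S - 2] > score[T - 1][S - 1]:
        s = S - 2
    if score[T - 1][s] == NEG_INF:
        raise ValueError("not enough frames to align the target sequence")

    path = [s]
    for t in range(T - 1, 0, -1):
        s = back[t][s]
        path.append(s)
    path.reverse()

    # Odd extended positions are tokens; collect their first and last frames.
    spans = {}
    for t, state in enumerate(path):
        if state % 2 == 1:
            k = state // 2
            first, _ = spans.get(k, (t, t))
            spans[k] = (first, t)
    return [spans[k] for k in range(len(targets))]
```

A grader could then, for instance, average the frame log-probabilities inside each span to score one syllable; that scoring step is a guess at one possible design and is not described in the post.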

Try it here: https://simedw.com/projects/ear/

simedw.com
453 138
Show HN: We Ran a Live Red-Team Attack on OpenClaw Agents
udit_50 about 7 hours ago

This report documents a live adversarial test between two autonomous AI agents running on OpenClaw.

One agent acted as a red team attacker. One acted as a defensive agent. The agents communicated directly over webhooks with real tooling access. No humans were involved once the session started.

The attacker attempted both direct social engineering and indirect injection via documents. Direct attacks were blocked. Indirect attacks via JSON metadata are still under analysis.

The goal of this work is observability, not claims of safety. We expect agent-to-agent adversarial interaction to become common as autonomous systems are deployed more widely.

Happy to answer technical questions.

gobrane.com
2 0
Zachzhao about 17 hours ago

Show HN: OpenJuris – AI legal research with citations from primary sources

We built tooling that connects LLMs directly to case law databases with citation verification to address hallucination in legal AI. Think of it as giving the model access to actual legal sources instead of relying on training data.

openjuris.org
15 8
Show HN: Phage Explorer
eigenvalue 1 day ago

I got really interested in biology and genetics a few months ago, just for fun.

This was largely inspired by the work of Sydney Brenner, which became the basis of my brennerbot.org project.

In particular, I became very fascinated by phages, which are viruses that attack bacteria. They're the closest thing to the "fundamental particles" of biology: the minimal units of genetic code that do something useful that allows them to reproduce and spread.

They also have some incredible properties, like having a structure that somehow encodes an icosahedron.

I always wondered how the DNA of these things translated into geometry in the physical world. That mapping between the "digital" realm of ACGT, which in turn maps onto the 20 amino acids in groups of 3, and the world of 3D, analog shapes, still seems magical and mysterious to me.

I wanted to dig deeper into the subject, but not by reading a boring textbook. I wanted to get a sense for these phages in a tangible way. What are the different major types of phages? How do they compare to each other in terms of the length and structure of their genetic code? The physical structure they assume?

I decided to make a program to explore all this stuff in an interactive way.

And so I'm very pleased to present you with my open-source Phage Explorer:

phage-explorer.org

I probably went a bit overboard, because what I ended up with has taken a sickening number of tokens to generate, and resulted in ~150k lines of Typescript and Rust/Wasm.

It implements 23 analysis algorithms, over 40 visualizations, and has the complete genetic data and 3D structure of 24 different classes of phage.

It actually took a lot of engineering to make this work well in a browser; it's a surprising amount of data (this becomes obvious when you look at some of the 3D structure models).

It works fairly well on mobile, but if you want to get the full experience, I highly recommend opening it on a desktop browser in high resolution.

As far as I know, it's the most complete informational / educational software about phages available anywhere. Now, I am the first to admit that I'm NOT an expert, or even that knowledgeable, about, well, ANY of this stuff.

So if you’re a biology expert, please take a look and let me know what you think of what I've made! And if I've gotten anything wrong, please let me know in the GitHub Issues and I'll fix it:

https://github.com/Dicklesworthstone/phage_explorer

phage-explorer.org
118 31
Show HN: An extensible pub/sub messaging server for edge applications
ortuman 4 days ago

hi there! i’ve been working on a project called Narwhal, and I wanted to share it with the community to get some valuable feedback.

what is it? Narwhal is a lightweight Pub/Sub server and protocol designed specifically for edge applications. while there are great tools out there like NATS or MQTT, i wanted to build something that prioritizes customization and extensibility. my goal was to create a system where developers can easily adapt the routing logic or message handling pipeline to fit specific edge use cases, without fighting the server's defaults.

why Rust? i chose Rust because i needed a low memory footprint to run efficiently on edge devices (like Raspberry Pis or small gateways), and also because I have a personal vendetta against Garbage Collection pauses. :)

current status: it is currently in Alpha. it works for basic pub/sub patterns, but I’d like to start working on persistence support soon (so messages survive restarts or network partitions).

i’d love for you to take a look at the code! i’m particularly interested in all kinds of feedback, especially on improvements i may have overlooked.

github.com
41 0
kafked about 8 hours ago

Show HN: A site where anyone can rename any location on Earth

Click any city, mountain, country, sea or whatever on a globe, propose a new name, community votes (or your proposal gets auto-accepted in a few minutes if nobody cares).

It's live now and I'm genuinely curious what happens when strangers on the internet get collective control over world geography. Either it becomes something interesting or it turns into a mess of edgy jokes, stereotypes, and stuff that will make me regret this whole idea.

rename.world
4 0
Show HN: AgentGram – Open-source social network for AI agents
iisweetheartii about 8 hours ago

AgentGram is an open-source, decentralized social media platform built on the Ethereum blockchain, enabling users to own their data and earn crypto rewards for engaging with content.

github.com
2 1
Show HN: Amla Sandbox – WASM bash shell sandbox for AI agents
souvik1997 2 days ago

WASM sandbox for running LLM-generated code safely.

Agents get a bash-like shell and can only call tools you provide, with constraints you define. No Docker, no subprocess, no SaaS — just pip install amla-sandbox

github.com
143 73
Show HN: Kolibri, a DIY music club in Sweden
EastLondonCoder 3 days ago

We’re Maria and Jonatan, a married couple running a small music night in Norrköping, Sweden, called Kolibri.

It’s not a software project. We run it through our own small Swedish company, pay artists, and do the operations ourselves. We do one night a month (usually the last Friday) in a restaurant venue called Mitropa. A typical night is about 50–70 paying guests. The first years it was DJs only, but last year we started doing live bands as well.

We made a simple site with schedule plus photos/video so you can see what it looks like: https://kolibrinkpg.com/

On the site:

  * photos and short videos (size/atmosphere)

  * the kind of acts we book (post-punk, darkwave, synth, adjacent electronic)

  * enough context to copy parts of the format if you’re building something similar locally

  * for the tech-curious: we built our own ticketing system (first used in February) and a media ingestion pipeline for Instagram and external photographers

How it started was accidental. I was doing remote music sessions with a friend in London (Ableton projects back and forth on FaceTime), ran out of beer, and walked into the nearest place. I got talking to Nahir, who runs Mitropa, and floated the idea of running a DIY music night there. He was up for it.

What made it take off was doing things in person. People will show up alone if they trust the room. Maria ended up doing a lot of that work: greeting newcomers, noticing who looks uncertain, and setting a tone where people treat each other decently.

Maria didn’t come from a DJ background. Klubbvärdinnan started as a joke name at Kolibri and then became her DJ moniker. She got good quickly, and after a first gig outside our own night she started getting booked elsewhere too.

Marketing-wise, what worked best was very analogue: walking around town, visiting local businesses we genuinely like, buying something, introducing ourselves, and asking if we could leave a flyer.

In the beginning we weren’t sure how to present it on social media. So we filmed headphone walks: one person walking through town listening to a track we picked. It looked good, people wanted to be in them, and afterwards we’d buy them a couple of drinks and actually talk. That turned a social media interaction into a real connection. It was a bit of luck, but it worked.

Questions welcome about what worked, what failed, costs/logistics, and what we’d do differently if we started over.

kolibrinkpg.com
140 30
Show HN: Booktest – review-driven regression testing for LLM / ML behavior
arauhala about 13 hours ago

The article discusses the open-source project Booktest, which is a collection of tools and resources for testing and evaluating book recommendation algorithms. It provides a standardized dataset, evaluation metrics, and a framework for comparing the performance of different book recommendation systems.

github.com
2 3
Show HN: Pinecone Explorer – Desktop GUI for the Pinecone vector database
arsentjev 5 days ago

https://github.com/stepandel/pinecone-explorer

pinecone-explorer.com
30 5
Show HN: Securing the Ralph Wiggum Loop – DevSecOps for Autonomous Coding Agents
agairola about 13 hours ago

Hi HN,

Since AutoGPT in 2023, I’ve been uneasy about fully unsupervised AI agents. I see the productivity upside, but “kick it off and walk away” felt risky.

Recently, the “Ralph Wiggum loop” pattern has gone viral. The idea is simple: An autonomous coding agent runs repeatedly until all PRD items are complete, with fresh context each loop and state stored outside the model in git, JSON, etc.

What bothered me was this part: what protects the system while I’m AFK?

Traditional AI-assisted dev today looks like: AI writes code → human reviews → CI scans → human fixes

What I wanted instead: AI writes code → security scans immediately → AI fixes issues → repeats until secure → escalates if stuck

So I built a prototype that embeds security scanning directly inside the agent loop. The agent runs tools like Semgrep, Grype, Checkov, etc. inside its own session, sees the findings, and iteratively fixes them before anything is committed.

The loop looks like this:

  PRD → Agent → Scan → Pass? → Commit
  Fail → Fix → Retry (3x) → Escalate to human

A few design principles that mattered:

* Baseline delta: pre-existing issues are tracked separately. Only new findings block commits.
* Sandbox constraints: no network access, no sudo, no destructive commands.
* Human override: nothing is fully autonomous. You can step back in at any point.
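The loop and the baseline-delta principle above can be sketched in a few lines of Python. This is an illustration rather than the code in the repo: `generate`, `scan`, and `fix` are hypothetical callables standing in for the agent and for scanners such as Semgrep or Grype, and findings are modeled as plain string ids.

```python
from dataclasses import dataclass, field


@dataclass
class LoopResult:
    status: str                      # "committed" or "escalated"
    retries: int                     # number of fix attempts made
    blocking: set = field(default_factory=set)  # findings still open


def secure_loop(generate, scan, fix, baseline, max_retries=3):
    """Scan a generated change, let the agent fix new findings, then decide.

    generate() -> change
    scan(change) -> set of finding ids
    fix(change, findings) -> revised change
    baseline: finding ids that existed before the agent touched the code
    """
    change = generate()
    # Baseline delta: only findings that are not in the baseline block.
    blocking = scan(change) - baseline
    retries = 0
    while blocking and retries < max_retries:
        change = fix(change, blocking)
        retries += 1
        blocking = scan(change) - baseline
    status = "escalated" if blocking else "committed"
    return LoopResult(status, retries, blocking)
```

Returning the still-blocking findings on escalation gives the human who steps back in a concrete list to start from, rather than just a failure flag.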

Is this bulletproof? Definitely not. Is it production-ready? No. But it’s a starting point for applying DevSecOps thinking to autonomous agents instead of trusting “AI magic.”

Repo link: https://github.com/agairola/securing-ralph-loop

Would love feedback from folks experimenting with agent loops, secure automation, or AI-assisted development gone wrong.

Happy to iterate.

github.com
2 0
Show HN: Pinchwork – A task marketplace where AI agents hire each other
aschuth about 23 hours ago

Got a Molty with time on their claws? Put them to work. Got one drowning in tasks? Let them delegate.

Pinchwork is a marketplace where agents post tasks, pick up work, and earn credits. Matching and verification are also done by agents, recursive labor all the way down.

Why? Every agent has internet, but not every agent has everything. You lack Twilio keys but a notification agent doesn't. You need an image generated but only run text. You can't audit your own code. You're single-threaded but need 10 things done in parallel.

  POST /v1/register            → 100 free credits
  POST /v1/tasks               → post work with a bounty
  POST /v1/tasks/pickup        → grab a task
  POST /v1/tasks/{id}/deliver  → get paid

Credits are escrowed, deliveries get verified by independent agents, and the whole thing speaks JSON or markdown. Self-hostable: docker run.

Live at https://pinchwork.dev — docs at https://pinchwork.dev/skill.md

github.com
10 8
Show HN: Cicada – A scripting language that integrates with C
briancr 2 days ago

I wrote a lightweight scripting language that runs together with C. Specifically, it's a C library: you run it through a C function call, and it can call back into your own C functions. Compiles to ~250 kB. No dependencies beyond the C standard library.

Key language features:
* Uses aliases not pointers, so it's memory-safe
* Arrays are N-dimensional and resizable
* Runs scripts or its own 'shell'
* Error trapping
* Methods, inheritance, etc.
* Customizable syntax

github.com
57 38
Show HN: ToolKuai – Privacy-first, 100% client-side media tools
indie_max 1 day ago

Hi HN,

I’m Linn, the creator of ToolKuai (https://toolkuai.com).

Like many of you, I’ve always been wary of "free" online file converters. Most of them are black boxes: you upload your private documents or images to a remote server, and you have no idea where that data ends up or how it’s being used to train models.

I wanted to build a suite of tools (Video/Image compressor, OCR, AI Background Remover) that runs entirely in the browser. No files ever leave your machine.

The Tech Stack

To make this performant enough to rival server-side processing, I leaned heavily into modern web APIs:

- AI Background Removal: I'm using ONNX models (Xenova/modnet and ISNet) running locally via Transformers.js. The processing is 100% client-side, falling back to WASM when WebGPU isn't available.

- Frontend: Built with SvelteKit (Svelte 5) for its lean footprint and fast reactivity.

- Storage & Delivery: AI models are self-hosted on Cloudflare R2 to avoid massive bandwidth costs and ensure fast delivery.

Current Stats (13 days in):

The site is only 2 weeks old. Surprisingly, I’ve seen strong organic interest from Taiwan and Hong Kong. Average time on site is currently around 3.5 minutes, which suggests people are actually staying to process multiple files, confirming that the client-side speed is hitting the mark.

Future & Monetization

The tool is free. I’ve decided to avoid the "Pro/Premium" subscription model, as I believe these utility tools should be accessible. I'm exploring non-intrusive ads to cover the infrastructure costs (mostly R2 and Vercel).

I’d love to get some feedback from the HN community on:

- Performance on different hardware (especially the WebGPU-based video compressor).

- Privacy concerns or suggestions on how to further harden the "No-Server" promise.

- Any specific media tools you feel are currently lacking in the "client-side only" ecosystem.

Link: https://toolkuai.com

Thanks!

toolkuai.com
7 0
Show HN: Mystral Native – Run JavaScript games natively with WebGPU (no browser)
Flux159 5 days ago

Hi HN, I've been building Mystral Native — a lightweight native runtime that lets you write games in JavaScript/TypeScript using standard Web APIs (WebGPU, Canvas 2D, Web Audio, fetch) and run them as standalone desktop apps. Think "Electron for games" but without Chromium. Or a JS runtime like Node, Deno, or Bun but optimized for WebGPU (and bundling a window / event system using SDL3).

Why: I originally set out to build a new game engine in WebGPU, and I loved the iteration loop of writing TypeScript and instantly seeing the changes in the browser with hot reloading. After getting something working and shipping a demo, I realized that shipping a whole browser doesn't really work if I also want the same codebase to work on mobile. Sure, I could use a webview, but that's not always a good or consistent experience for users: Safari on iOS supports WebGPU, but not with the same features that Chrome has on desktop. What I really wanted was a WebGPU runtime that is consistent and works on any platform. I was inspired by Deno's --unsafe-webgpu flag, but I realized that Deno probably wouldn't be a good fit long term because it doesn't support iOS or Android and doesn't bundle a window / event system (they have "bring your own window", but that means writing a lot of custom code for events and windowing, not to mention more specific things like implementing a WebAudio shim). That led me down the path of building a native runtime specifically for games, and that's Mystral Native.

So now with Mystral Native, I can have the same developer experience (write JS, use shaders in WGSL, call requestAnimationFrame) but get a real native binary I can ship to players on any platform without requiring a webview or a browser. No 200MB Chromium runtime, no CEF overhead, just the game code and a ~25MB runtime.

What it does:
- Full WebGPU via Dawn (Chrome's implementation) or wgpu-native (Rust)
- Native window & events via SDL3
- Canvas 2D support (Skia), Web Audio (SDL3), fetch (file/http/https)
- V8 for JS (same engine as Chrome/Node), also supports QuickJS and JSC
- ES modules, TypeScript via SWC
- Compile to single binary (think "pkg"): `mystral compile game.js --include assets -o my-game`
- macOS .app bundles with code signing, Linux/Windows standalone executables
- Embedding API for iOS and Android (JSC/QuickJS + wgpu-native)

It's early alpha — the core rendering path works well & I've tested on Mac, Linux (Ubuntu 24.04), and Windows 11, and some custom builds for iOS & Android to validate that they can work, but there's plenty to improve. Would love to get some feedback and see where it can go!

MIT licensed.

Repo: https://github.com/mystralengine/mystralnative

Docs: https://mystralengine.github.io/mystralnative/

github.com
49 18
Show HN: Peptide calculators ask the wrong question. I built a better one
silviogutierrez about 18 hours ago

Most peptide calculators ask the wrong question.

They ask: How much water are you adding?

But in practice, what you actually know is your vial size and your target dose.

The water amount should be the output, not the input.

It should also make your dose land on a real syringe tick mark. Not something like 17.3 units.

I built a peptide calculator that works this way: https://www.joyapp.com/peptides/

What’s different:

- You pick vial size and target dose → reconstitution is calculated for you

- Doses align to actual syringe markings

- Common dose presets per peptide

- Works well on mobile (where this is usually done)

- Supports blends and compounds (e.g. GLOW or CJC-1295 + Ipamorelin)

- You can save your vials. No account required.
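The inversion described above (water as the output, dose pinned to a tick mark) comes down to one line of arithmetic. Here is a Python sketch of it; this is not the calculator's actual code, and the U-100 syringe (100 units per mL), the candidate tick values, and the 3 mL cap on water are all assumptions chosen for illustration.

```python
def water_for_dose(vial_mg, dose_mcg, units_per_dose, units_per_ml=100):
    """mL of water so that `dose_mcg` is drawn at exactly `units_per_dose`.

    Assumes a syringe graduated at `units_per_ml` units per mL (U-100).
    """
    doses_per_vial = vial_mg * 1000 / dose_mcg
    return doses_per_vial * units_per_dose / units_per_ml


def reconstitution_options(vial_mg, dose_mcg, max_water_ml=3.0,
                           tick_units=(5, 10, 15, 20, 25, 30, 40, 50)):
    """List (units_per_dose, water_ml) pairs that fit within `max_water_ml`."""
    options = []
    for units in tick_units:
        water = water_for_dose(vial_mg, dose_mcg, units)
        if water <= max_water_ml:
            options.append((units, round(water, 2)))
    return options
```

For a 5 mg vial and a 250 mcg dose there are 20 doses per vial, so drawing 10 units per dose needs 2.0 mL of water: the concentration is 2,500 mcg/mL, and 0.1 mL (10 units) contains 250 mcg.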

Happy to hear feedback or edge cases worth supporting.

joyapp.com
4 0
tullie 5 days ago

Show HN: ShapedQL – A SQL engine for multi-stage ranking and RAG

Hi HN,

I’m Tullie, founder of Shaped. Previously, I was a researcher at Meta AI, worked on ranking for Instagram Reels, and was a contributor to PyTorch Lightning.

We built ShapedQL because we noticed that while retrieval (finding 1,000 items) has been commoditized by vector DBs, ranking (finding the best 10 items) is still an infrastructure problem.

To build a decent "For You" feed or a RAG system with long-term memory, you usually have to put together a vector DB (Pinecone/Milvus), a feature store (Redis), an inference service, and thousands of lines of Python to handle business logic and reranking.

We built an engine that consolidates this into a single SQL dialect. It compiles declarative queries into high-performance, multi-stage ranking pipelines.

HOW IT WORKS:

Instead of just SELECT, ShapedQL operates in four stages native to recommendation systems:

RETRIEVE: Fetch candidates via Hybrid Search (Keywords + Vectors) or Collaborative Filtering.

FILTER: Apply hard constraints (e.g., "inventory > 0").

SCORE: Rank results using real-time models (e.g., p(click) or p(relevance)).

REORDER: Apply diversity logic so your Agent/User doesn’t see 10 nearly identical results.
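To show what the four stages amount to when written by hand, here is a toy in-memory version in Python. It is not ShapedQL or its SDK: keyword overlap stands in for hybrid search, a per-category preference table stands in for a real-time model, and all function names, field names, and the per-category cap are assumptions.

```python
from collections import Counter


def retrieve(catalog, query, limit=1000):
    """RETRIEVE: keep items sharing at least one word with the query."""
    terms = set(query.lower().split())
    hits = [item for item in catalog
            if terms & set(item["description"].lower().split())]
    return hits[:limit]


def hard_filter(candidates, budget):
    """FILTER: hard constraints that no score can override."""
    return [c for c in candidates
            if c["price"] <= budget and c["inventory"] > 0]


def score(candidates, preferences, query):
    """SCORE: weighted sum of a preference term and a relevance term."""
    terms = set(query.lower().split())
    scored = []
    for c in candidates:
        words = set(c["description"].lower().split())
        relevance = len(terms & words) / len(terms)
        preference = preferences.get(c["category"], 0.0)
        scored.append((0.5 * preference + 0.3 * relevance, c))
    # sorted() is stable, so ties keep their retrieval order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def reorder(scored, limit, max_per_category=2):
    """REORDER: simple diversity rule capping results per category."""
    seen = Counter()
    out = []
    for _, item in scored:
        if seen[item["category"]] < max_per_category:
            seen[item["category"]] += 1
            out.append(item)
        if len(out) == limit:
            break
    return out


def rank(catalog, query, budget, preferences, limit=20):
    """Run the four stages in order and return the final item list."""
    candidates = hard_filter(retrieve(catalog, query), budget)
    return reorder(score(candidates, preferences, query), limit)
```

Even in this toy form, each stage is a separate function with its own data contract, which hints at why the same logic spread across a vector DB, a feature store, and an inference service grows quickly.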

THE SYNTAX: Here is what a RAG query looks like. This replaces about 500 lines of standard Python/LangChain code:

SELECT item_id, description, price
FROM
  -- Retrieval: Hybrid search across multiple indexes
  search_flights("$param.user_prompt", "$param.context"),
  search_hotels("$param.user_prompt", "$param.context")
WHERE
  -- Filtering: Hard business constraints
  price <= "$param.budget" AND is_available("$param.dates")
ORDER BY
  -- Scoring: Real-time reranking (Personalization + Relevance)
  0.5 * preference_score(user, item) +
  0.3 * relevance_score(item, "$param.user_prompt")
LIMIT 20

If you don’t like SQL, you can also use our Python and Typescript SDKs. I’d love to know what you think of the syntax and the abstraction layer!

playground.shaped.ai
80 23
Show HN: Hebo Gateway, an embeddable AI gateway with OpenAI-compatible endpoints
dselvaggio about 18 hours ago

Hey HN, we just shipped v0.1 of Hebo Gateway.

There are plenty of gateways already, but we kept running into the same issue: once you need real customization (auth, routing, rate limits, observability, request/response transforms), most “off the shelf” gateways get hard to extend.

Hebo Gateway is for cases where you want the gateway to be part of your app. You can run it standalone, or embed it into an existing backend. It exposes OpenAI-compatible endpoints (/chat/completions, /embeddings, /models), works with any Vercel AI SDK provider, and adds a hook system so you can plug logic into the request lifecycle without forking the core.

Quickstart, examples, and “what’s next” are in the post: https://hebo.ai/blog/260127-hebo-gateway

I would love feedback on OpenAI-compat edge cases you have been bitten by (especially streaming and reasoning-related stuff), and what hooks you wish gateways provided out of the box.

github.com
2 0
lcolucci 5 days ago

Show HN: LemonSlice – Upgrade your voice agents to real-time video

Hey HN, we're the co-founders of LemonSlice (try our HN playground here: https://lemonslice.com/hn). We train interactive avatar video models. Our API lets you upload a photo and immediately jump into a FaceTime-style call with that character. Here's a demo: https://www.loom.com/share/941577113141418e80d2834c83a5a0a9

Chatbots are everywhere and voice AI has taken off, but we believe video avatars will be the most common form factor for conversational AI. Most people would rather watch something than read it. The problem is that generating video in real-time is hard, and overcoming the uncanny valley is even harder.

We haven’t broken the uncanny valley yet. Nobody has. But we’re getting close and our photorealistic avatars are currently best-in-class (judge for yourself: https://lemonslice.com/try/taylor). Plus, we're the only avatar model that can do animals and heavily stylized cartoons. Try it: https://lemonslice.com/try/alien. Warning! Talking to this little guy may improve your mood.

Today we're releasing our new model* - Lemon Slice 2, a 20B-parameter diffusion transformer that generates infinite-length video at 20fps on a single GPU - and opening up our API.

How did we get a video diffusion model to run in real-time? There was no single trick, just a lot of them stacked together. The first big change was making our model causal. Standard video diffusion models are bidirectional (they look at frames both before and after the current one), which means you can't stream.

From there it was about fitting everything on one GPU. We switched from full to sliding window attention, which killed our memory bottleneck. We distilled from 40 denoising steps down to just a few - quality degraded less than we feared, especially after using GAN-based distillation (though tuning that adversarial loss to avoid mode collapse was its own adventure).

And the rest was inference work: modifying RoPE from complex to real (this one was cool!), precision tuning, fusing kernels, a special rolling KV cache, lots of other caching, and more. We kept shaving off milliseconds wherever we could and eventually got to real-time.
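The post doesn't spell out how the RoPE change was done, so the following Python sketch only shows the standard identity such a rewrite would rest on: rotating a pair of features by complex multiplication gives the same result as an explicit cosine/sine rotation in real arithmetic. The interleaved pairing of features, the base of 10000, and the function names are assumptions.

```python
import cmath
import math


def rope_angles(position, dim, base=10000.0):
    """One rotation angle per feature pair for a given token position."""
    return [position * base ** (-2 * k / dim) for k in range(dim // 2)]


def rope_complex(x, position):
    """Rotate pairs (x[2k], x[2k+1]) by treating each as a complex number."""
    out = []
    for k, angle in enumerate(rope_angles(position, len(x))):
        z = complex(x[2 * k], x[2 * k + 1]) * cmath.exp(1j * angle)
        out += [z.real, z.imag]
    return out


def rope_real(x, position):
    """The same rotation using only real multiplies and adds."""
    out = []
    for k, angle in enumerate(rope_angles(position, len(x))):
        c, s = math.cos(angle), math.sin(angle)
        a, b = x[2 * k], x[2 * k + 1]
        out += [a * c - b * s, a * s + b * c]
    return out
```

Both functions agree to floating-point precision. The real-valued form avoids complex dtypes entirely, which are often less well supported in fused kernels and reduced-precision paths; whether that was the motivation here is a guess.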

We set up a guest playground for HN so you can create and talk to characters without logging in: https://lemonslice.com/hn. For those who want to build with our API (we have a new LiveKit integration that we’re pumped about!), grab a coupon code in the HN playground for your first Pro month free ($100 value). See the docs: https://lemonslice.com/docs. Pricing is usage-based at $0.12-0.20/min for video generation.

Looking forward to your feedback!

EDIT: Tell us what characters you want to see in the comments and we can make them for you to talk to (e.g. Max Headroom)

*We did a Show HN last year for our V1 model: https://news.ycombinator.com/item?id=43785044. It was technically impressive but so bad compared to what we have today.

130 130