Daily top stories
Bicycle
Infinite Mac: Infinitemac.org
UK sets up fake booter sites to muddy DDoS market
My4TH – A minimalistic FORTH computer with discrete CPU
Open Flamingo – open framework to train multimodal LLMs
Cerebras-GPT: A Family of Open, Compute-Efficient, Large Language Models
EU Commission doesn't understand what's written in its own chat control bill
Launch HN: Metal (YC W23) – Embeddings as a Service
If you’re unfamiliar with embeddings, they are representations of real world data expressed as a vector, where the position of the vector can be compared to other vectors – thereby deriving meaning from the data. They can be used to create things like semantic search, recommender systems, clustering analysis, classification, and more.
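As a rough illustration of "comparing the position of vectors", here is a minimal sketch using cosine similarity, one common comparison measure. The three-dimensional vectors and their names are made up for the example; real embedding models produce vectors with hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings (assumed values, not output of any real model).
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

# Semantically close items should score higher than unrelated ones.
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))
```

Semantic search, recommendations, and clustering all build on this kind of pairwise comparison.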
Working at companies like Datadog, Meta, and Spotify, we found it frustrating to build ML apps. Lack of tooling, infrastructure, and proper abstraction made working with ML tedious and slow. To get features out the door we’ve had to build data ingestion pipelines from scratch, manually maintain live customer datasets, build observability to measure drift, manage no-downtime deployments, and the list goes on. It took months to get simple features in front of users and the developer experience was terrible.
OpenAI, Hugging Face and others have brought models to the masses, but the developer experience still needs to be improved. To actually use embeddings, hitting APIs like OpenAI is just one piece of the puzzle. You also need to figure out storage, create indexes, maintain data quality through fine-tuning, manage versions, code operations on top of your data, and create APIs to consume it. All of this friction makes it a pain to ship live applications.
Metal solves these problems by providing an end-to-end platform for embeddings. Here’s how it works:
Data In: You send data to our system via our SDK or API. Data can be text, images, PDFs, or raw embeddings. When data hits our pipeline, we preprocess it by extracting the text from documents and chunking when necessary. We then generate embeddings using the selected model. If the index has a fine-tuning transformation, we transform the embedding into the new vector space so it matches the target data. We then store the embeddings in cold storage for any needed async jobs.
From there we index the embeddings for querying. We use HNSW right now, but are planning to support FLAT indexes as well. We currently index in Redis, but plan to make this configurable and provide more options for datastores.
Data Out: We provide querying endpoints that hit the indexes and return the approximate nearest neighbors (ANN). For fine-tuned indexes, we generate embeddings from the base model used and then transform the embedding into the new vector space during the pre-query phase.
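The Data In / Data Out flow described above can be sketched roughly as follows. This is not Metal's implementation: `chunk_text`, `apply_transform`, and `FlatIndex` are hypothetical names, the fine-tuning transformation is assumed to be a plain linear map (a matrix), and the search is an exact brute-force scan (in the spirit of a FLAT index) rather than HNSW. Embeddings are passed in directly instead of being generated by a model.

```python
import math

def chunk_text(text, max_words=50):
    """Split text into chunks of at most max_words words (assumed chunking rule)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def apply_transform(matrix, vector):
    """Multiply a matrix (list of rows) by a vector: the assumed fine-tuning map."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class FlatIndex:
    """Exact nearest-neighbor index; compares the query against every stored vector."""

    def __init__(self, transform=None):
        self.transform = transform  # optional matrix applied at write and query time
        self.items = []             # list of (item_id, vector) pairs

    def _prepare(self, vector):
        return apply_transform(self.transform, vector) if self.transform else list(vector)

    def add(self, item_id, embedding):
        self.items.append((item_id, self._prepare(embedding)))

    def query(self, embedding, k=1):
        # Transform the query the same way stored vectors were transformed.
        q = self._prepare(embedding)
        scored = sorted(((cosine_similarity(q, v), i) for i, v in self.items), reverse=True)
        return [item_id for _, item_id in scored[:k]]

index = FlatIndex()
index.add("a", [1.0, 0.0])
index.add("b", [0.0, 1.0])
index.add("c", [0.7, 0.7])
print(index.query([0.9, 0.1], k=2))
```

The point of the sketch is the symmetry: whatever transformation is applied to stored embeddings must also be applied to the query embedding before searching. An HNSW index trades the exactness of this scan for much faster lookups on large datasets.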
Additionally, we provide methods to run clustering jobs on the stored embeddings and visualizations in the UI. We are experimenting with zero-shot classification: we embed the class labels and match each stored embedding to its closest class, which lets us provide a “classify” method in our SDK. We would love feedback on what other async job types would be useful!
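A minimal sketch of that zero-shot idea, assuming the class labels and the item have already been embedded by the same model. The `classify` function here is a hypothetical stand-in, not the SDK method, and the two-dimensional vectors are invented for illustration.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def classify(embedding, class_embeddings):
    """Return the label whose embedding is most similar to the given embedding."""
    return max(
        class_embeddings,
        key=lambda label: cosine_similarity(embedding, class_embeddings[label]),
    )

# Hypothetical class-label embeddings (assumed values).
classes = {
    "sports": [1.0, 0.0],
    "finance": [0.0, 1.0],
}

print(classify([0.8, 0.3], classes))
```

No labeled training data is needed: adding a class only requires embedding its label.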
Examples of what users have built so far include embedding product catalogs for improved similarity search, personalized in-app messaging with user behavior clusters, and similarity search on images for content creators.
Metal has a free tier that anyone can use, a developer tier for $20/month, and an enterprise tier with custom pricing. We’re currently building an open source product that will be released soon.
Most importantly, we’re sharing Metal with the HN community because we want to build the best developer experience possible, and the only metric we care about is live apps on prod. We’d love to hear your feedback, experiences with embeddings, and your ideas for how we can improve the product. Looking forward to your comments, thank you!
Interaction Nets, Combinators, and Calculus – HVM
Intel 80386 CPU Information
TaxyAI: Open-source browser automation with GPT-4
Medieval Arabic surgeon Ibn al Quff's account on surgical pain relief
Apple Music Classical
Snippyly (YC W22) Is Hiring Front end SWEs to make the web multiplayer
Tectonic – A modern, complete, self-contained Tex engine with Unicode support
Iceland long term visa for remote workers
Show HN: A fully open-source (Apache 2.0) implementation of LLaMA
The original LLaMA code is GPL licensed which means any project using it must also be released under GPL.
This "taints" any other code and prevents meaningful academic and commercial use.
Lit-LLaMA solves that for good.
Nuenv: An experimental Nushell builder for Nix packages
A non-federated decentralized social protocol based on Git
The Shaman’s Secrets
Ivy League Prices Are Pushing $90k a Year
Fugue: A unified interface for distributed computing
Android app from China executed 0-day exploit on millions of devices
How to Polis, 101, Part IIb: Archons
Quicker serverless Postgres connections
Windows 11 KB5023778 update adds promotions to the Start menu
Open Sourcing Cody – Sourcegraph's AI-enabled editor assistant
Amazon starts flagging frequently returned products that you maybe shouldn’t buy
Random Fuzzy Thoughts
Apple introduces Apple Pay Later