Introducing LiteFS (fly.io)
642 points by danielskogly 8d | 152 comments

Comments

no_wizard 8d
This is distributed SQLite 3, running (I assume at least partially managed?) Litestream[0] for you. Which is pretty cool!

What I'd like to have seen is how this compares to things like rqlite[1] or Cloudflare's D1[2], addressed directly in the article.

That said, I think this is pretty good for things like read replicas. I know the sales pitch here is as a full database, and I don't disagree with it. What I find, however, is that most workloads are already attached to a database of some sort. This is a great way to make very fast (insanely fast, really) read replicas of your data across regions. You can use an independent Raft[3][4] implementation to do this.

One thing SQLite is actually really good at, in my experience, is storing JSON blobs. I have used it to replicate JSON representations of read-only data in the past, to great success.

[0]: https://litestream.io/

[1]: https://github.com/rqlite/rqlite

[2]: https://blog.cloudflare.com/introducing-d1/

[3]: https://raft.github.io/

[4]: https://raft.github.io/#implementations
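
To illustrate the JSON point: a minimal sketch with Python's bundled sqlite3 module (json_extract needs the JSON1 extension, which recent SQLite builds include by default; the table and values here are made up):

  import json
  import sqlite3

  # In-memory DB for illustration; any SQLite build with JSON1 works.
  con = sqlite3.connect(":memory:")
  con.execute("CREATE TABLE docs (id TEXT PRIMARY KEY, body TEXT)")

  doc = {"sku": "A-100", "price": 19.5, "tags": ["sale", "new"]}
  con.execute("INSERT INTO docs VALUES (?, ?)", ("A-100", json.dumps(doc)))

  # Query inside the blob without unpacking it in application code.
  row = con.execute(
      "SELECT json_extract(body, '$.price') FROM docs WHERE id = ?",
      ("A-100",),
  ).fetchone()
  print(row[0])  # 19.5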

pphysch 8d
> Developing against a relational database requires devs to watch out for "N+1" query patterns, where a query leads to a loop that leads to more queries. N+1 queries against Postgres and MySQL can be lethal to performance. Not so much for SQLite.

This is misleading AFAICT. The article is actually comparing a remote RDBMS to a local RDBMS, not Postgres to SQLite.

Postgres can also be served over a UNIX socket, removing the per-query overhead of TCP round trips.

SQLite is a great technology, but keep in mind that you can deploy Postgres right next to your app as well. If your app is something like a company backend that could evolve a lot and benefit from Postgres's advanced features, that may be the right choice.
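
For example (a rough sketch assuming psycopg2 and the Debian/Ubuntu default socket path; the tables are made up): the classic N+1 loop below still isn't a great query plan, but over a local socket each iteration at least skips the TCP round trip.

  import psycopg2

  # Pointing "host" at a directory selects the local UNIX domain socket.
  conn = psycopg2.connect("host=/var/run/postgresql dbname=app user=app")

  # N+1 pattern: one query, then one more query per row returned.
  with conn.cursor() as cur:
      cur.execute("SELECT id FROM authors")
      for (author_id,) in cur.fetchall():
          with conn.cursor() as c2:
              c2.execute(
                  "SELECT title FROM books WHERE author_id = %s",
                  (author_id,),
              )
              print(author_id, c2.fetchall())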

hobo_mark 8d
Well that was fast... [1]

Are the readable replicas supposed to be long-lived (as in, I don't know, hours)? Or does Consul happily converge even with ephemeral instances coming and going every few minutes (thinking of something like Cloud Run; not sure if Fly works the same way)? And do they need to make a copy of the entire DB when they "boot", or do they stream pages in "on demand"?

[1] https://news.ycombinator.com/item?id=32240230

lijogdfljk 8d
This is really cool! Unfortunately I'm primarily interested in offline databases, so perhaps I'm just not the target audience. However, I have to ask on that note: does this have any application in the offline space?

I.e., I wonder if there's a way to write your applications such that they have less/minimal contention, and then allow the databases to merge when back online. Of course, what happens when there inevitably _is_ contention? Etc.

Not sure that idea would have a benefit over many SQLite DBs with userland schemas mirroring CRDT principles though. But a boy can dream.
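
For what it's worth, a last-writer-wins table is roughly what I mean by a userland schema mirroring CRDT principles. A toy sketch (assumes SQLite 3.24+ for upsert; the schema and values are made up):

  import sqlite3

  # Every row carries a logical timestamp; merging two offline copies
  # keeps whichever version of a key is newer (last writer wins).
  def lww_merge(dst: sqlite3.Connection, src: sqlite3.Connection) -> None:
      for key, value, ts in src.execute("SELECT key, value, ts FROM kv"):
          dst.execute(
              """INSERT INTO kv (key, value, ts) VALUES (?, ?, ?)
                 ON CONFLICT(key) DO UPDATE SET
                   value = excluded.value, ts = excluded.ts
                 WHERE excluded.ts > kv.ts""",
              (key, value, ts),
          )
      dst.commit()

  schema = "CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT, ts INTEGER)"
  a, b = sqlite3.connect(":memory:"), sqlite3.connect(":memory:")
  a.execute(schema); b.execute(schema)
  a.execute("INSERT INTO kv VALUES ('theme', 'dark', 5)")
  b.execute("INSERT INTO kv VALUES ('theme', 'light', 9)")  # written later, offline
  lww_merge(a, b)
  print(a.execute("SELECT value FROM kv WHERE key = 'theme'").fetchone())  # ('light',)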

Regardless, very cool work being done here.

asim 8d
Fly.io is the company I wanted to build 10 years ago. Something with massive technical depth that becomes a developer product. They're doing an incredible job, and part of that comes down to how they evangelise the product outside of all the technical hackery. This requires so much continued effort. AND THEN to actually run a business on top of all that. Kudos to you guys. I struggled so much with this. Wish you nothing but continued success.

theomega 8d
Does anyone else run into the issue that the fly.io website does not load when requested via IPv6 on a Mac? I tried Safari, Chrome, and curl, and none of them work:

  $ curl -v https://fly.io/blog/introducing-litefs/
  *   Trying 2a09:8280:1::a:791:443...
  * Connected to fly.io (2a09:8280:1::a:791) port 443 (#0)
  * ALPN, offering h2
  * ALPN, offering http/1.1
  * successfully set certificate verify locations:
  *  CAfile: /etc/ssl/cert.pem
  *  CApath: none
  * (304) (OUT), TLS handshake, Client hello (1):

Requesting via IPv4 works:

  $ curl -4v https://fly.io/blog/introducing-litefs/
  *   Trying 37.16.18.81:443...
  * Connected to fly.io (37.16.18.81) port 443 (#0)
  * ALPN, offering h2
  * ALPN, offering http/1.1
  * successfully set certificate verify locations:
  *  CAfile: /etc/ssl/cert.pem
  *  CApath: none
  * (304) (OUT), TLS handshake, Client hello (1):
  * (304) (IN), TLS handshake, Server hello (2):
  * (304) (IN), TLS handshake, Unknown (8):
  * (304) (IN), TLS handshake, Certificate (11):
  * (304) (IN), TLS handshake, CERT verify (15):
  * (304) (IN), TLS handshake, Finished (20):
  * (304) (OUT), TLS handshake, Finished (20):
  * SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256
  * ALPN, server accepted to use h2
  * Server certificate:
  *  subject: CN=fly.io
  *  start date: Jul 25 11:20:01 2022 GMT
  *  expire date: Oct 23 11:20:00 2022 GMT
  *  subjectAltName: host "fly.io" matched cert's "fly.io"
  *  issuer: C=US; O=Let's Encrypt; CN=R3
  *  SSL certificate verify ok.
  * Using HTTP2, server supports multiplexing
  * Connection state changed (HTTP/2 confirmed)
  * Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
  * Using Stream ID: 1 (easy handle 0x135011c00)
  > GET /blog/introducing-litefs/ HTTP/2
  > Host: fly.io
  > user-agent: curl/7.79.1
  > accept: */*
  >
  * Connection state changed (MAX_CONCURRENT_STREAMS == 32)!
  < HTTP/2 200
  < accept-ranges: bytes
  < cache-control: max-age=0, private, must-revalidate
  < content-type: text/html
  < date: Wed, 21 Sep 2022 16:50:16 GMT
  < etag: "632b20f0-1bdc1"
  < fly-request-id: 01GDGFA3RPZPRDV9M3AQ3159ZK-fra
  < last-modified: Wed, 21 Sep 2022 14:34:24 GMT
  < server: Fly/51ee4ef9 (2022-09-20)
  < via: 1.1 fly.io, 2 fly.io
  <
  <!doctype html> ...

ranger_danger 8d
ELI5?

nicoburns 8d
Where is the data actually being stored in this setup? A copy on each machine running the application? If so, is there another copy somewhere else (e.g. S3) in case all nodes go down?

Also, what happens if the Consul instance goes down?

If my application nodes can't be ephemeral then this seems like it would be harder to operate than Postgres or MySQL in practice. If it completely abstracts that away somehow then I suppose that'd be pretty cool.

Currently finding it hard to get on board with the idea that adding a distributed system here actually makes things simpler.

whoisjuan 8d
Unrelated, but why does the map on their homepage show a region in Cuba? That must be wrong.

vcryan 8d
This approach is very appealing to me :) Curious how people handle schema migrations when using it.

I segment SQLite files (databases) that share the same schema into the same folder. I haven't really had a case where migrations were a concern, but I could see it happening soon.

Seems like in my deployment I'm going to need a way to loop over the databases and apply each change... I currently have an app-deployment step that attempts to apply migrations, but it is more simplistic because the primary RDBMS (PostgreSQL) appears to the application as a single entity, which is the normative use case for db-migrate runners.
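
Roughly what I have in mind, as a sketch (the folder layout, file names, and migration script are made up):

  import sqlite3
  from pathlib import Path

  # Every *.db file under this folder is assumed to share one schema.
  DB_DIR = Path("data/tenants")
  MIGRATION = Path("migrations/0002_add_index.sql").read_text()

  for db_file in sorted(DB_DIR.glob("*.db")):
      con = sqlite3.connect(db_file)
      try:
          con.executescript(MIGRATION)  # executescript commits for us
          print(f"migrated {db_file}")
      finally:
          con.close()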

infogulch 8d
> To improve latency, we're aiming at a scale-out model that works similarly to Fly Postgres. That's to say: writes get forwarded to the primary and all read requests get served from their local copies.

How can you ensure that a client that just performed a forwarded write will be able to read that back on their local replica on subsequent reads?
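
One way I could imagine it working (purely a sketch of the usual read-your-writes pattern, not anything LiteFS documents; all names here are made up): the primary reports the transaction position it assigned to the forwarded write, and the replica either waits to catch up to that position or the read falls back to the primary.

  import time

  class Replica:
      """Toy stand-in for a local read replica."""
      def __init__(self):
          self.position = 0  # last replicated transaction ID

      def wait_for(self, min_position, timeout=0.5):
          """Block until this replica has applied min_position, or time out."""
          deadline = time.monotonic() + timeout
          while time.monotonic() < deadline:
              if self.position >= min_position:
                  return True
              time.sleep(0.01)
          return False

  replica = Replica()
  write_txid = 42  # hypothetically returned by the primary after the forwarded write
  if replica.wait_for(write_txid):
      print("replica caught up; safe to read locally")
  else:
      print("replica is behind; read from the primary instead")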

vcryan 8d
Currently, I am running multiple application servers (and Lambda functions) on AWS Fargate that access SQLite files (databases) on an EFS share. So far so good, although my use cases are fairly simple.

clord 8d
I can imagine a database which tries to solve both of these domains.

A centralized database handles consistency, and vends data closures to distributed applications for in-process querying (and those closures reconcile via something like CRDT back to the core db).

Does this exist?

mwcampbell 8d
I wonder if using FUSE has had any appreciable impact on performance, particularly read performance. I ask because FUSE has historically had a reputation for being slow, e.g. with the old FUSE port of ZFS.

endisneigh 8d
Seems neat, until you try to do schema migrations. Unless they can guarantee that all containers’ SQLite instances have the same schema without locking, I’m not sure how this doesn't run into the same issues as many NoSQL databases.

CouchDB had this same issue with its database-per-user model and eventually consistent writes.

jensneuse 8d
Sounds like a drop-in solution to add high availability to WunderBase (https://github.com/wundergraph/wunderbase). Can we combine LiteFS with Litestream for backups, or how would you do HA + backups together?

fny 8d
This reads like a professor who's so steeped in research that he's forgotten how to communicate with his students!

What exactly are we talking about here? A WebSQL that's actually synced to a proper RDBMS? Synced across devices? I'm not clear on an end-to-end use case.

Edit: Honestly, this line from the LiteFS docs[0] needs to be added to the top of the article:

> LiteFS is a distributed file system that transparently replicates SQLite databases. This lets you run your application like it's running against a local on-disk SQLite database but behind the scenes the database is replicated to all the nodes in your cluster. This lets you run your database right next to your application on the edge.

I had no idea what was being talked about otherwise.

[0]: https://fly.io/docs/litefs/

hinkley 8d
> Second, your application can only serve requests from that one server. If you fired up your server in Dallas then that'll be snappy for Texans. But your users in Chennai will be cursing your sluggish response times since there's a 250ms ping time between Texas & India.

> To improve availability, it uses leases to determine the primary node in your cluster. By default, it uses Hashicorp's Consul.

Having a satellite office become leader of a cluster is one of the classic blunders in distributed computing.

There are variants of Raft where you can have quorum members that won't nominate themselves for election, but out of the box this is a bad plan.

If you have a Dallas, Chennai, Chicago, and Cleveland office and Dallas goes dark (ie, the tunnel gets fucked up for the fifth time this year), you want Chicago to become the leader, Cleveland if you're desperate. But if Chennai gets elected then everyone has a bad time, including Dallas when it comes back online.
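
As a toy sketch of what I mean by members that never nominate themselves (this mirrors the idea, not any particular Raft library's API):

  class Member:
      def __init__(self, name, eligible=True):
          self.name = name
          self.eligible = eligible  # satellite offices set this to False
          self.term = 0
          self.role = "follower"

      def on_election_timeout(self):
          # An ineligible member never starts an election; it keeps
          # replicating and waits for an eligible peer to win.
          if not self.eligible:
              return
          self.term += 1
          self.role = "candidate"
          # ...request votes from peers here...

  cluster = [
      Member("dallas"), Member("chicago"), Member("cleveland"),
      Member("chennai", eligible=False),  # replicates, but can never lead
  ]
  for m in cluster:
      m.on_election_timeout()
  print([(m.name, m.role) for m in cluster])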

Existenceblinks 8d
The pain is that this approach only suits VPS/IaaS setups where disk volumes are supported. As a solo dev, I only use PaaS-style infra, and there are just a few PaaS providers I'm aware of that support attachable disks. Fly, Render... nothing else?