What I'd like to have seen addressed directly in the article is how this compares to things like rqlite or Cloudflare's D1.
That said, I think this is pretty good for things like read replicas. I know the sales pitch here is as a full database, and I don't disagree with it. What I find, however, is that most workloads are already attached to a database of some sort. This is a great way to make very - insanely, really - fast read replicas of your data across regions. You can use an independent Raft implementation to do this.
One thing SQLite is actually really good at, in my experience, is storing JSON blobs. I've used it successfully in the past to replicate JSON representations of read-only data.
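Querying into those blobs is pleasant too. A minimal sketch, assuming a SQLite build that includes the JSON1 functions (the default in most modern builds); the table and key names are made up for the example:

    import sqlite3

    # Store JSON blobs and query into them with SQLite's built-in JSON
    # functions. Assumes a build with the JSON1 functions compiled in.
    conn = sqlite3.connect("replica.db")
    conn.execute("CREATE TABLE IF NOT EXISTS docs (id TEXT PRIMARY KEY, body TEXT)")
    conn.execute(
        "INSERT OR REPLACE INTO docs VALUES (?, json(?))",
        ("user:42", '{"name": "Ada", "plan": "pro"}'),
    )
    conn.commit()

    # json_extract pulls fields out of the stored blob without
    # deserializing the whole document in application code.
    plan = conn.execute(
        "SELECT json_extract(body, '$.plan') FROM docs WHERE id = ?",
        ("user:42",),
    ).fetchone()[0]
    print(plan)  # -> pro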
This is misleading AFAICT. The article is actually comparing a remote RDBMS to a local RDBMS, not Postgres to SQLite.
Postgres can also be served over a UNIX socket, removing the per-query overhead of TCP round trips.
SQLite is a great technology, but keep in mind that you can also deploy Postgres right next to your app as well. If your app is something like a company backend that could evolve a lot and benefit from Postgres's advanced features, this may be the right choice.
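For example, libpq-based drivers treat a directory path passed as the host as a socket directory. A minimal sketch with psycopg2; the socket path is the usual Debian/Ubuntu default and an assumption here:

    import psycopg2

    # Connect to a local Postgres over its UNIX socket rather than TCP.
    # With libpq-based drivers, a directory path as "host" selects the
    # socket in that directory; /var/run/postgresql is the Debian/Ubuntu
    # default - adjust for your install.
    conn = psycopg2.connect(
        host="/var/run/postgresql",  # socket directory, not a hostname
        dbname="app",
        user="app",
    )
    with conn, conn.cursor() as cur:
        cur.execute("SELECT 1")
        print(cur.fetchone())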
Are the readable replicas supposed to be long-lived (as in, I don't know, hours)? Or does consul happily converge even with ephemeral instances coming and going every few minutes (thinking of something like Cloud Run and the like, not sure if Fly works the same way)? And do they need to make a copy of the entire DB when they "boot" or do they stream pages in "on demand"?
I.e., I wonder if there's a way to write your applications such that they have less/minimal contention, and then allow the databases to merge when back online? Of course, what happens when there inevitably _is_ contention? Etc.
Not sure that idea would have a benefit over many SQLite DBs with userland schemas mirroring CRDT principles though. But a boy can dream.
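To sketch what I mean by a userland schema mirroring CRDT principles: a last-writer-wins register is just a table carrying a (timestamp, node) pair, merged by keeping the greater pair. Purely illustrative names, not something any tool provides:

    import sqlite3
    import time
    import uuid

    # Toy "userland CRDT" in SQLite: a last-writer-wins register per key.
    # Each node writes with a (timestamp, node_id) pair; merging two
    # replicas keeps whichever row has the higher pair.
    NODE_ID = uuid.uuid4().hex

    def open_db(path):
        db = sqlite3.connect(path)
        db.execute("""CREATE TABLE IF NOT EXISTS lww (
            key TEXT PRIMARY KEY, value TEXT, ts REAL, node TEXT)""")
        return db

    def put(db, key, value):
        db.execute(
            "INSERT OR REPLACE INTO lww VALUES (?, ?, ?, ?)",
            (key, value, time.time(), NODE_ID),
        )
        db.commit()

    def merge(db, other):
        # Pull every row from the other replica; keep it only if its
        # (ts, node) pair beats what we have locally.
        for key, value, ts, node in other.execute("SELECT * FROM lww"):
            row = db.execute(
                "SELECT ts, node FROM lww WHERE key = ?", (key,)
            ).fetchone()
            if row is None or (ts, node) > row:
                db.execute("INSERT OR REPLACE INTO lww VALUES (?, ?, ?, ?)",
                           (key, value, ts, node))
        db.commit()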
Regardless, very cool work being done here.
Requesting via ipv4 works
    $ curl -v https://fly.io/blog/introducing-litefs/
    *   Trying 2a09:8280:1::a:791:443...
    * Connected to fly.io (2a09:8280:1::a:791) port 443 (#0)
    * ALPN, offering h2
    * ALPN, offering http/1.1
    * successfully set certificate verify locations:
    *  CAfile: /etc/ssl/cert.pem
    *  CApath: none
    * (304) (OUT), TLS handshake, Client hello (1):

    $ curl -4v https://fly.io/blog/introducing-litefs/
    *   Trying 18.104.22.168:443...
    * Connected to fly.io (22.214.171.124) port 443 (#0)
    * ALPN, offering h2
    * ALPN, offering http/1.1
    * successfully set certificate verify locations:
    *  CAfile: /etc/ssl/cert.pem
    *  CApath: none
    * (304) (OUT), TLS handshake, Client hello (1):
    * (304) (IN), TLS handshake, Server hello (2):
    * (304) (IN), TLS handshake, Unknown (8):
    * (304) (IN), TLS handshake, Certificate (11):
    * (304) (IN), TLS handshake, CERT verify (15):
    * (304) (IN), TLS handshake, Finished (20):
    * (304) (OUT), TLS handshake, Finished (20):
    * SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256
    * ALPN, server accepted to use h2
    * Server certificate:
    *  subject: CN=fly.io
    *  start date: Jul 25 11:20:01 2022 GMT
    *  expire date: Oct 23 11:20:00 2022 GMT
    *  subjectAltName: host "fly.io" matched cert's "fly.io"
    *  issuer: C=US; O=Let's Encrypt; CN=R3
    *  SSL certificate verify ok.
    * Using HTTP2, server supports multiplexing
    * Connection state changed (HTTP/2 confirmed)
    * Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
    * Using Stream ID: 1 (easy handle 0x135011c00)
    > GET /blog/introducing-litefs/ HTTP/2
    > Host: fly.io
    > user-agent: curl/7.79.1
    > accept: */*
    >
    * Connection state changed (MAX_CONCURRENT_STREAMS == 32)!
    < HTTP/2 200
    < accept-ranges: bytes
    < cache-control: max-age=0, private, must-revalidate
    < content-type: text/html
    < date: Wed, 21 Sep 2022 16:50:16 GMT
    < etag: "632b20f0-1bdc1"
    < fly-request-id: 01GDGFA3RPZPRDV9M3AQ3159ZK-fra
    < last-modified: Wed, 21 Sep 2022 14:34:24 GMT
    < server: Fly/51ee4ef9 (2022-09-20)
    < via: 1.1 fly.io, 2 fly.io
    <
    <!doctype html>
    ...
Also, what happens if the Consul instance goes down?
If my application nodes can't be ephemeral then this seems like it would be harder to operate than Postgres or MySQL in practice. If it completely abstracts that away somehow then I suppose that'd be pretty cool.
Currently finding it hard to get on board with the idea that adding a distributed system here actually makes things simpler.
I segment SQLite files (databases) that have the same schema into the same folder. I haven't really had a case where migrations were a real concern, but I could see it happening soon.
Seems like in my deployment I'm going to need an approach that loops over the DBs to apply schema changes... I currently have a step of app deployment that attempts to apply migrations, but it's simpler because the primary RDBMS (PostgreSQL) appears to the application as a single entity, which is the normative use case for db-migrate runners.
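Something like this is what I have in mind - a sketch that applies the same migration list to every database file in a folder, tracking progress with SQLite's user_version pragma (the paths and migration contents are placeholders):

    import glob
    import sqlite3

    # One migration list shared by every DB file with the same schema.
    MIGRATIONS = [
        "ALTER TABLE docs ADD COLUMN updated_at REAL",  # version 1
    ]

    for path in sorted(glob.glob("data/tenants/*.db")):
        db = sqlite3.connect(path)
        # user_version records how many migrations this file has seen.
        version = db.execute("PRAGMA user_version").fetchone()[0]
        for i, stmt in enumerate(MIGRATIONS[version:], start=version + 1):
            db.execute(stmt)
            db.execute(f"PRAGMA user_version = {i}")
        db.commit()
        db.close()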
How can you ensure that a client that just performed a forwarded write will be able to read that back on their local replica on subsequent reads?
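One common pattern for this (I don't know whether it's what LiteFS does) is to hand back a replication position from the forwarded write and have the replica block reads until it has applied at least that position. A generic sketch with hypothetical handler names:

    import time

    # Read-your-writes via a replication position token: the primary
    # returns its log position (txid) after the forwarded write, the
    # client echoes it back, and the replica blocks the read until it
    # has caught up. Generic pattern, not LiteFS's actual API.
    def wait_for_position(get_local_position, target, timeout=5.0):
        deadline = time.monotonic() + timeout
        while get_local_position() < target:
            if time.monotonic() > deadline:
                raise TimeoutError("replica has not caught up to write position")
            time.sleep(0.01)

    # Usage (hypothetical handlers):
    #   txid = forward_write_to_primary(sql)      # primary returns its txid
    #   wait_for_position(read_local_txid, txid)  # block until applied
    #   rows = query_local_replica(sql)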
A centralized database handles consistency, and vends data closures to distributed applications for in-process querying (with those closures reconciling via something like CRDTs back to the core DB).
Does this exist?
CouchDB had this same issue with its database per user model and eventually consistent writes.
What exactly are we talking about here? A WebSQL that's actually synced to a proper RDBMS? Synced across devices? I'm not clear on an end-to-end use case.
Edit: Honestly, this line from the LiteFS docs needs to be added to the top of the article:
> LiteFS is a distributed file system that transparently replicates SQLite databases. This lets you run your application like it's running against a local on-disk SQLite database but behind the scenes the database is replicated to all the nodes in your cluster. This lets you run your database right next to your application on the edge.
I had no idea what was being talked about otherwise.
> To improve availability, it uses leases to determine the primary node in your cluster. By default, it uses Hashicorp's Consul.
Having a satellite office become leader of a cluster is one of the classic blunders in distributed computing.
There are variants of Raft where you can have quorum members that won't nominate themselves for election, but out of the box this is a bad plan.
If you have a Dallas, Chennai, Chicago, and Cleveland office and Dallas goes dark (i.e., the tunnel gets fucked up for the fifth time this year), you want Chicago to become the leader, or Cleveland if you're desperate. But if Chennai gets elected then everyone has a bad time, including Dallas when it comes back online.
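In toy form the tweak is just an eligibility check before a follower stands for election. A sketch of the idea only - the field and method names are invented, and this is not any particular Raft library's API:

    import random

    # "Quorum member that won't nominate itself": the node still votes
    # and counts toward quorum, but never starts an election itself.
    class RaftNode:
        def __init__(self, name, candidate_eligible=True):
            self.name = name
            self.candidate_eligible = candidate_eligible  # False for Chennai
            self.term = 0

        def on_election_timeout(self):
            if not self.candidate_eligible:
                # Keep following; an eligible node (Chicago, Cleveland)
                # will stand instead.
                self.reset_election_timer()
                return
            self.term += 1
            self.request_votes()

        def reset_election_timer(self):
            self.timeout = random.uniform(0.15, 0.3)

        def request_votes(self):
            pass  # send RequestVote RPCs to peers (elided)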