The Road to 128 bit Linux

rwmj 6d | LWN.NET | 278 points, 172 comments

Comments

jmillikin 6d
The section about 128-bit pointers being necessary for expanded memory sizes is unconvincing -- 64 bits provides 16 EiB (16 x 1024 x 1024 x 1024 x 1 GiB), which is the sort of address space you might need for byte-level addressing of a warehouse full of high-density HDDs. Memory sizes don't grow like they used to, and it's difficult to imagine what kind of new physics would let someone fit that many bytes into a machine that's practical to control with a single Linux kernel instance.

CHERI is a much more interesting case, because it expands the definition of what a "pointer" is. Most low-level programmers think of pointers as just an address, but CHERI turns it into a sort of tuple of (address, bounds, permissions) -- every pointer is bounds-checked. The CHERI folks did some cleverness to pack that all into 128 bits, and I believe their demo platform uses 128-bit registers.
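As a loose illustration of what such a fat pointer carries -- this is just the conceptual shape, not CHERI's actual compressed capability encoding, and all the names here are made up:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative only: real CHERI compresses bounds and permissions into a
     * 128-bit capability (plus an out-of-band validity tag); this uncompressed
     * struct just shows what the "pointer" conceptually carries. */
    struct fat_pointer {
        uint64_t address;     /* where it currently points          */
        uint64_t base;        /* lowest address it may touch        */
        uint64_t length;      /* size of the region it may touch    */
        uint32_t permissions; /* load/store/execute bits, and so on */
    };

    static bool in_bounds(const struct fat_pointer *p, uint64_t access_size)
    {
        return p->address >= p->base &&
               access_size <= p->length &&
               p->address - p->base <= p->length - access_size;
    }

    int main(void)
    {
        struct fat_pointer p = { .address = 0x1008, .base = 0x1000,
                                 .length = 16, .permissions = 0x3 };
        printf("8-byte access allowed:  %d\n", in_bounds(&p, 8));  /* 1 */
        printf("16-byte access allowed: %d\n", in_bounds(&p, 16)); /* 0: would overrun */
        return 0;
    }

Every dereference gets checked against the bounds, and that metadata is the part that doesn't fit in 64 bits.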

The article also touches on the UNIX-y assumption that `long` is pointer-sized. This is well known (and well hated) by anyone who has to port software from UNIX to Windows, where `long` and `int` are the same size and `long long` is pointer-sized. I'm firmly in the camp of using fixed-size integers, but the Linux kernel uses `long` all over the place, and unless they plan to do a mass migration to `intptr_t`, it's difficult to imagine a solution that would let the same C code support 32-, 64-, and 128-bit platforms.

(comedy option: 32-bit int, 128-bit long, and 64-bit `unsigned middle`)
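To make the `long`-vs-pointer point concrete, a minimal userspace sketch (nothing kernel-specific; the sizes printed are just whatever your toolchain's data model picks):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        int x = 0;
        void *p = &x;

        /* LP64 (Linux/BSD): long and pointers are both 64 bits, so "(long)p"
         * happens to work.  LLP64 (Windows): long stays 32 bits and that cast
         * truncates.  A future 128-bit ABI has to pick all over again. */
        printf("sizeof(long)=%zu  sizeof(long long)=%zu  sizeof(void *)=%zu\n",
               sizeof(long), sizeof(long long), sizeof p);

        /* intptr_t/uintptr_t are the types actually defined to round-trip a
         * pointer, whatever width the platform gives them. */
        intptr_t ip = (intptr_t)p;
        printf("pointer round-trips through intptr_t: %d\n", (void *)ip == p);
        return 0;
    }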

The article also mentions Rust types as helpful, but Rust has its own problems with big pointers because they inadvisably merged `size_t`, `ptrdiff_t`, and `intptr_t` into the same type. They're working on adding equivalent symbols to the FFI module[0], but untangling `usize` might not be possible at this point.

[0] https://github.com/rust-lang/rust/issues/88345

PaulHoule 6d
On one hand, the IBM System/38 used 128-bit pointers in the 1970s, despite having a 48-bit physical address bus. These were used to manage persistent objects on disk or on the network, with unique IDs a lot like UUIDs.

On the other hand, filling out a 64-bit address space looks tough. I struggled to find something of the same magnitude as 2^64, and the best I got was "the number of iron atoms in an iron filing". From a nanotechnological point of view, a memory bank that size is feasible (it fits in a rack at 10,000 atoms per bit), but progress in semiconductors is slowing down. Features are still getting smaller, but they aren't getting cheaper anymore.
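Quick back-of-the-envelope check on that comparison (rough constants, nothing precise):

    #include <stdio.h>

    int main(void)
    {
        /* mass of 2^64 iron atoms = atoms * molar mass / Avogadro's number */
        double atoms      = 18446744073709551616.0;  /* 2^64 */
        double molar_mass = 55.845;                  /* g/mol, iron */
        double avogadro   = 6.02214076e23;           /* atoms/mol */

        double grams = atoms * molar_mass / avogadro;
        printf("2^64 iron atoms ~= %.2f mg\n", grams * 1000.0);  /* about 1.7 mg */
        return 0;
    }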

gwbas1c 6d
> Matthew Wilcox took the stage to make the point that 64 bits may turn out to be too few — and sooner than we think

Let's think critically for a moment. I grew up in the 1980s and 1990s, when we all craved more and more powerful computers. I even remember the years when each generation of video games was marketed as 8-bit, 16-bit, 32-bit, etc.

BUT: We're hitting a point where, for what we use computers for, they're powerful enough. I don't think I'll ever need to carry a 128-bit phone in my pocket, nor do I think I'll need a 128-bit web browser, nor do I think I'll need a 128-bit web server. (See other posts about how 64-bits can address massive amounts of memory.)

Will we need 128-bit computing? I'm sure someone will find a need. But let's not assume they'll need an operating system designed in the 1990s for use cases that we can't imagine today.

LinkLink 6d
For reference, 2^64 = ~10^19.27 (about 1.8 x 10^19). I don't think this is unreasonable at all; it's unlikely that computers will stay largely the same in the coming years. I believe we'll see many changes to how things like mass addressing of data and computing resources are done. Right now our limitations in these areas are addressed by distributed computing and databases, but in a hyper-connected world there may come a time when such a huge address space could actually be used.

It's an unlikely hypothetical, but imagine if fiber ran everywhere and all computers seamlessly worked together, sharing compute power as needed. Even 256 bits wouldn't be out of the question then. And before you say something like that will never happen, consider trying to convince somebody from 2009 that, 13 years later, people would be buying internet money backed by nothing.

wongarsu 6d
What's the average lifespan of a line of kernel code? I imagine that by starting this project 12 years before its anticipated use case, they can get very far just by requiring that any new code be 128-bit compatible (in addition to doing the broader infrastructure work needed, like fixing the syscall ABI).
teddyh 6d
Maybe we can finally fix intmax_t to be the largest integer type again.
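For anyone who hasn't hit this: as I understand it, on a typical x86-64 toolchain today the widest type the compiler offers is wider than intmax_t, which is stuck at 64 bits for ABI reasons. For example:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        printf("sizeof(intmax_t) = %zu\n", sizeof(intmax_t));  /* 8 on x86-64 Linux */
    #if defined(__SIZEOF_INT128__)
        printf("sizeof(__int128) = %zu\n", sizeof(__int128));  /* 16, yet not "max" */
    #endif
        return 0;
    }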
tayistay 6d
Is 128 bits the limit of what we would need? We already use 128-bit UUIDs. 2^256 (roughly 10^77) is far more than the number of atoms on Earth (roughly 10^50).
rmorey 6d
this seems just a bit too early - so that probably means it’s exactly the right time!
Blikkentrekker 6d
> How would this look in the kernel? Wilcox had originally thought that, on a 128-bit system, an int should be 32 bits, long would be 64 bits, and both long long and pointer types would be 128 bits. But that runs afoul of deeply rooted assumptions in the kernel that long has the same size as the CPU's registers, and that long can also hold a pointer value. The conclusion is that long must be a 128-bit type.

Can anyone explain the rationale for not simply naming types after their size? In many programming languages, rather than this arcane terminology, “i16”, “i32”, “i64”, and “i128” simply exist.

t-3 6d
Are there operations that vector processors are inherently worse at or much harder to program for? Nowadays they seem to be mainly used for specialized tasks like graphics and machine learning accelerators, but given the expansion of SIMD instruction sets, are general purpose vector CPUs in the pipeline anywhere?
munro 6d
There was a post a while back from NASA saying how many digits of pi they actually need [1].

    import math

    # pi to a few hundred decimal places, with the decimal point dropped so
    # the digits can be treated as one big integer
    pi = 3141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825342117067982148086513282306647093844609550582231725359408128481117450284102701938521105559644622948954930381964428810975665933446128475648233786783165271201909145648566923460348610454326648213393607260249141273724587006606315588174881520920962829254091715364367892590360

    # crude model of a binary float wide enough to hold all of those digits:
    # one sign bit, enough significand bits to represent the integer above,
    # and enough exponent bits to scale the significand
    sign_bits = 1
    sig_bits = math.ceil(math.log2(pi))
    exp_bits = math.floor(math.log2(sig_bits))

    assert sign_bits + sig_bits + exp_bits == 1209
I'm sure I got something wrong here, def off-by-one, but roughly it looks like it would need 1209-bit floats (2048-bit if you round up to the next power of two!). IDK, mildly interesting. :>

[1] https://www.jpl.nasa.gov/edu/news/2016/3/16/how-many-decimal...

bitwize 6d
Pointers will get fat (CHERI and other tagged pointer schemes) well before even server users will need to byte-address into more than 2^64 bytes' worth of stuff. So we should probably be realistically aiming for 256-bit architectures...
PaulDavisThe1st 6d
I remember a quote from the papers about Opal, an experimental OS that was intended to use h/w protection rather than virtual memory, so that all processes share the same address space and can just exchange pointers to share data.

"A 64 bit memory space is large enough that if a process allocated 1MB every second, it could continue doing this until significantly past the expected lifetime of the sun before it ran into problems"

Beltiras 6d
I would think that by now a bunch of clever people would be trying to solve the generalized problem of supporting n-bit memory addressing, instead of repeatedly solving the specific problem of "how do we go from 2^n to 2^(n+1) bits". I guess it's more practical to just let the next generation of kernel maintainers go through all of this hullabaloo again in 2090.
MikeHalcrow 6d
I recall sitting in a packed room with over a hundred devs at the 2004 Ottawa Linux Symposium while the topic of the number of filesystem bits was being discussed (link: https://www.linux.com/news/ottawa-linux-symposium-day-2/). I recall people throwing out questions as to why we weren't just jumping to 128 or 256 bits, and at one point someone blurted out something about 1024 bits. Someone then made a comment about the number of atoms in the universe, everyone chuckled, and the discussion moved on. I sensed the feeling in the room was that any talk of 128 bits or more was simply ridiculous. Mind you this was for storage.

Fast-forward 18 years, and it's fascinating to me to see people now seriously floating the proposal to support 256-bit pointers.

Aqueous 6d
If we're going to go for 128-bit, why not just go for 256-bit? That way we won't have to do this again for a while.

Or better yet, design a new abstraction that doesn't hard-code the pointer size at all, but instead allows it to grow as more addressable space becomes a reality, instead of transitioning over and over. Is this even possible? If it is, shouldn't we head in that direction?

jaimehrubiks 6d
Somebody asked before that people please not share LWN's SubscriberLinks. The articles become free about a week after publication.
ghoward 6d
It would be sad if we, as an industry, do not take this opportunity to create a better OS.

First, we should decide whether to have a microkernel or a monolithic kernel.

I think the answer is obvious: microkernel. This is much safer, and seL4 has shown that performance need not suffer too much.

Next, we should start by acknowledging the chicken-and-egg problem, especially with drivers. We will need drivers.

So let's reuse Linux drivers by implementing a library for them to run in userspace. This would be difficult, but not impossible, and the rewards would be massive, basically eliminating the chicken-and-egg problem for drivers.

To solve the userspace chicken-and-egg problem (having applications that run on the OS), implement a POSIX API on top of the OS. Yes, this will mean that some bad legacy like `fork()` will exist, but it will solve that chicken-and-egg problem.

From there, it's a simple matter of deciding what the best design is.

I believe it would come down to a few things:

1. Acknowledging hardware as in [1].

2. A copy-on-write filesystem with a transactional API (maybe a modified ZFS or BtrFS).

3. A uniform event API like Windows' handles and Wait() functions or Plan 9's file descriptors.

For number 3, note that not everything has to be a file, but receiving events like signals and events from child processes should be waitable, like in Windows or Linux's signalfd and pidfd.

For number 2, this would make programming so much easier for everybody, including kernel and filesystem devs. And I may be wrong, but it seems like it would not be hard to implement: when doing copy-on-write, just copy as usual and update the root B-tree node; the transaction commits once the new root B-tree node has been flushed to disk and the flush has succeeded.

(Of course, this would also require disks that don't lie, but that's another problem.)
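A toy, in-memory sketch of that commit rule (all names here are hypothetical; a real filesystem would be issuing cache flushes/FUA writes where the fake flush() is):

    #include <stdbool.h>
    #include <stdio.h>

    #define MAX_BLOCKS 16

    struct toy_disk {
        char blocks[MAX_BLOCKS][64]; /* data blocks (the CoW tree nodes)      */
        int  next_free;              /* bump allocator for newly copied nodes */
        int  sb_root;                /* "superblock": block # of current root */
        bool flushed;                /* pretend write-back cache state        */
    };

    static int write_new_node(struct toy_disk *d, const char *payload)
    {
        int blk = d->next_free++;           /* toy code: no bounds checking */
        snprintf(d->blocks[blk], sizeof d->blocks[blk], "%s", payload);
        d->flushed = false;                 /* data sits in the (pretend) cache */
        return blk;
    }

    static bool flush(struct toy_disk *d)
    {
        d->flushed = true;                  /* real code: cache flush / FUA here */
        return true;
    }

    /* Write the new root out of place, flush, then repoint the superblock and
     * flush again.  The transaction is visible only after the second flush; a
     * crash before that leaves the old root (and the old tree) intact. */
    static bool commit_transaction(struct toy_disk *d, const char *new_root)
    {
        int blk = write_new_node(d, new_root);
        if (!flush(d))
            return false;
        d->sb_root = blk;                   /* single-word update = commit record */
        return flush(d);
    }

    int main(void)
    {
        struct toy_disk d = { .next_free = 0, .sb_root = -1, .flushed = true };

        commit_transaction(&d, "root v1");
        commit_transaction(&d, "root v2");
        printf("current root: block %d (\"%s\")\n", d.sb_root, d.blocks[d.sb_root]);
        return 0;
    }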

[1]: https://www.usenix.org/conference/osdi21/presentation/fri-ke...

tomcam 6d
Just wanted to say I love this discussion. Have been pondering the need for a 128-bit OS for decades but several of the issues raised were completely novel to me. Fantastic to have so many people so much smarter than I am hash it out informally here. Feels like a master class.
znpy 6d
I wonder if with 128-bit wide pointers it would make sense to start using early-lisp-style tagged pointers.
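For anyone who hasn't seen the trick, a minimal sketch of low-bit tagging as it's done today (illustrative names only); a 128-bit pointer would leave vastly more spare bits to stash a tag in:

    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Lisp-style tagging: steal the low 3 bits of an 8-byte-aligned pointer
     * for a type tag. */
    enum tag { TAG_FIXNUM = 0, TAG_CONS = 1, TAG_STRING = 2 };

    typedef uintptr_t value;

    static value box_ptr(void *p, enum tag t)
    {
        uintptr_t u = (uintptr_t)p;
        assert((u & 7) == 0);                /* requires 8-byte alignment */
        return u | t;
    }

    static enum tag tag_of(value v) { return (enum tag)(v & 7); }
    static void    *ptr_of(value v) { return (void *)(v & ~(uintptr_t)7); }

    int main(void)
    {
        char *s = malloc(16);
        value v = box_ptr(s, TAG_STRING);
        printf("tag=%d ptr=%p\n", tag_of(v), ptr_of(v));
        free(ptr_of(v));
        return 0;
    }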
torginus 6d
On a bit of a tangential note: RAM price for a given capacity used to drop exponentially until the 2010s or so.

Since then, it has only roughly halved. What happened?

https://jcmit.net/memoryprice.htm

I know it's not process geometry, since we went from 45nm to 5nm in that time, roughly a 9x linear shrink (81x by area).

Is it realistic to assume scaling will resume?

amelius 6d
Perhaps it would be an idea to make Linux parameterized over the pointer/word size, and let the compiler figure it out in the future.
jupp0r 6d
"The problem now is that there is no 64-bit type in the mix. One solution might be to "ask the compiler folks" to provide a __int64_t type. But a better solution might just be to switch to Rust types, where i32 is a 32-bit, signed integer, while u128 would be unsigned and 128 bits. This convention is close to what the kernel uses already internally, though a switch from "s" to "i" for signed types would be necessary. Rust has all the types we need, he said, it would be best to just switch to them."

Does anybody know why they don't use the existing fixed-size integer types [1] from C99, i.e. uint64_t etc., and define a 128-bit-wide type on top of that (which will also be there in C23, IIRC)?

My own kernel dev experience is pretty rusty at this point (pun intended), but in the last decade of writing cross platform (desktop, mobile) userland C++ code I advocated exclusively for using fixed width types (std::uint32_t etc) as well as constants (UINT32_MAX etc).
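For what it's worth, a sketch of what I mean, using only C99 fixed-width types plus a compiler extension for the 128-bit case (kernel-style names; `unsigned __int128` is a GCC/Clang extension rather than ISO C99, and C23's _BitInt(128) would be another spelling):

    #include <stdint.h>
    #include <stdio.h>

    typedef uint32_t u32;
    typedef uint64_t u64;

    #if defined(__SIZEOF_INT128__)
    typedef unsigned __int128 u128;       /* compiler extension on 64-bit targets */
    #else
    typedef struct { u64 lo, hi; } u128;  /* portable fallback: two 64-bit halves */
    #endif

    int main(void)
    {
        printf("sizeof(u32)=%zu sizeof(u64)=%zu sizeof(u128)=%zu\n",
               sizeof(u32), sizeof(u64), sizeof(u128));
        return 0;
    }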