SectorC: A C Compiler in 512 bytes

XORVOID.COM
452
79
xorvoid
5d

Comments

@xonix 5d
This reminded me of the idea of compiler bootstrapping (https://news.ycombinator.com/item?id=35714194). That is, you can now write in SectorC a slightly more advanced version of C capable of compiling TCC (https://bellard.org/tcc/), and then with TCC you can move on to GCC and so on.
@keyle 5d
The conclusion table was a resume-building skill in and of itself.
@vitiral 5d
Pretty nifty, nice work!

I'll point out to any passerby that this C doesn't support structs, so it's unlikely you'd actually want to build anything in it.

@DesiLurker 5d
something like this could be interesting for deep-space applications where you only have a bare metal environment with hardened processor and limited memory & of course ping time of days (to earth).

or alternatively for embedding a C compiler inside an LLM to use the LLM as a form of virtual machine.

@gigel82 5d
I'm wondering if you can build an actual "Linux from scratch" with this as the lowest level, without the need to use a host system at all.
@pgt 5d
Reminds me of the META II Metacompiler http://hcs64.com/files/pd1-3-schorre.pdf
@molticrystal 5d
Now they just need to port something like oneKpaq [0] to 16-bit, or maybe something from the extremely tiny decompressor thread [1]. Just to test the compression level and get an idea: oneKpaq on its quickest setting (taking minutes instead of what could be days on its highest) reduced SectorC to 82.81% of its size, though of course adding the 128 bit stub knocked it to 677 bytes. It would be interesting to try its slowest setting, which takes days to brute-force, but I'm not going to attempt that.

Some of the compressors in that forum thread, since they are 32 bytes or so, might find it easier to get net gains.

[0] https://github.com/temisu/oneKpaq

[1] https://encode.su/threads/3387-(Extremely)-tiny-decompressor...

@userbinator 5d
I saw the repeating 'A' at the end of the base64 text and thought "it's not even 512 bytes; it's smaller!"

That said, the title is just a little clickbaity --- it's a C-subset compiler, and more accurately a JIT interpreter. There also appears to be no attempt at operator precedence. Nonetheless, it's still an impressive technical achievement and shows the value of questioning common assumptions.

Finally, I feel tempted to offer a small size optimisation:

    sub ax,2
is 3 bytes whereas

    dec ax
    dec ax
is 2 bytes.

You may be able to use single-byte xchg's with ax instead of movs, and the other thing which helps code density a lot in 16-bit code is to take advantage of the addressing modes and LEA to do 3-operand add immediates where possible.
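
To make the byte counts above concrete, here is a small Python sketch that lists the standard 16-bit x86 encodings of these instructions and compares their lengths. The operands (bx, the immediate 5) and the helper name `size` are illustrative assumptions, not taken from SectorC's source.

```python
# Standard 16-bit (real-mode) x86 encodings. The operands chosen here
# (bx, the immediate 5) are illustrative only, not taken from SectorC.
ENCODINGS = {
    "sub ax,2":      bytes([0x83, 0xE8, 0x02]),  # 83 /5 ib
    "dec ax":        bytes([0x48]),              # 48+r
    "mov ax,bx":     bytes([0x89, 0xD8]),        # 89 /r
    "xchg ax,bx":    bytes([0x93]),              # 90+r
    "add ax,5":      bytes([0x83, 0xC0, 0x05]),  # 83 /0 ib
    "lea ax,[bx+5]": bytes([0x8D, 0x47, 0x05]),  # 8D /r with an 8-bit displacement
}

def size(*instructions):
    """Total encoded length, in bytes, of a sequence of instructions."""
    return sum(len(ENCODINGS[name]) for name in instructions)

if __name__ == "__main__":
    for sequence in (
        ("sub ax,2",),
        ("dec ax", "dec ax"),
        ("mov ax,bx",),
        ("xchg ax,bx",),
        ("mov ax,bx", "add ax,5"),
        ("lea ax,[bx+5]",),
    ):
        print(f"{'; '.join(sequence):24} {size(*sequence)} bytes")
```

The substitutions are not always drop-in: xchg also overwrites the other register, and LEA does not update the flags the way ADD does, so each one only applies where those side effects do not matter.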

@kiwidrew 5d
This is fascinating, I really did not think it was possible to implement even a tiny subset of C in just 512 bytes of x86 code. Using atoi() as a generic hash function is a brilliantly awful hack!
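
A minimal sketch of the atoi-as-hash idea, in Python for readability: run the usual atoi accumulation over every character of a token, digit or not, and let the result wrap at 16 bits. The name `atoi_hash` and the explicit 16-bit mask are assumptions for illustration; the real implementation is x86-16 assembly and may differ in detail.

```python
def atoi_hash(token: str) -> int:
    """Apply the atoi recurrence to every character, wrapping at 16 bits.

    For a decimal literal this is the ordinary integer value (mod 65536).
    For any other token it is simply a deterministic 16-bit number that
    can stand in for the identifier or keyword.
    """
    acc = 0
    for ch in token:
        # Same step atoi uses for digits, applied blindly to all characters.
        acc = (acc * 10 + (ord(ch) - ord("0"))) & 0xFFFF
    return acc

if __name__ == "__main__":
    for tok in ("42", "int", "while", "x", "65537"):
        print(f"{tok!r:10} -> {atoi_hash(tok)}")
```

The appeal is that one routine serves both jobs: number literals come out as their value, and names come out as a table index, at the cost of possible collisions between distinct names.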
@zoom6628 5d
Great read and awesome achievement. Could see this being useful for smaller microcontrollers.
@Snelius 5d
It was funny to read. Thx! :))
@kvakil 5d
wow, this is impressive.

I wrote a similar x86-16 assembler in < 512 B of x86-16 assembly, and this seems much more difficult <https://github.com/kvakil/0asm/>. I did find a lot of similar tricks were helpful: using gadgets and hashes. One trick I don't see in sectorc, which shaved quite a bit off of 0asm, was self-modifying code, which 0asm uses to "change" to the second pass of the assembler. (I wrote up some other techniques here: <https://kvakil.me/posts/asmkoan.html>.)
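
Python cannot patch its own machine code, so the following is only a loose high-level analogue of the pass-switching trick, not 0asm's actual mechanism: one driver loop runs twice, and the handler it calls is swapped between runs. The toy instruction set (`label`, `jmp`, `nop`) and all names here are hypothetical.

```python
def assemble(lines):
    """Two-pass label resolution for a toy language of label/jmp/nop.

    `lines` is a list of (op, arg) pairs. Each non-label op occupies one
    address slot. Returns the emitted instructions with labels resolved.
    """
    labels = {}
    output = []

    def pass_one(op, arg, addr):
        # First pass: only record where each label lands.
        if op == "label":
            labels[arg] = addr

    def pass_two(op, arg, addr):
        # Second pass: emit, now that every label address is known.
        if op == "jmp":
            output.append(("jmp", labels[arg]))
        elif op == "nop":
            output.append(("nop",))

    handler = pass_one
    for _ in range(2):
        addr = 0
        for op, arg in lines:
            handler(op, arg, addr)
            if op != "label":
                addr += 1
        # The "patch": the same loop runs again with different behaviour.
        handler = pass_two
    return output

if __name__ == "__main__":
    program = [("jmp", "end"), ("nop", None), ("label", "end"), ("nop", None)]
    print(assemble(program))
```

In assembly the swap can be a store that overwrites part of an instruction, which avoids carrying a separate copy of the driver loop for each pass.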

bootOS (<https://github.com/nanochess/bootOS>) and other tools by the author are also amazing works of assembly golf.

@ezekiel68 5d
If you don't use this... are you even suckless?
@khazhoux 5d
Bravo! This was a wonderful read, xorvoid.
@deafpolygon 5d
"A C Compiler in 512 bytes! Whoa cool!" click

> It supports a subset of C that is large enough to write real and interesting programs.

Oh.

@wkz 5d
Great writeup!

Especially liked this nugget:

> (NOTE: This grammar is 704 bytes in ascii, 38% larger than it's implementation!)
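
The quoted figure checks out as a quick calculation, assuming the implementation fills exactly one 512-byte sector as the title says (the constant names are mine):

```python
GRAMMAR_BYTES = 704  # size of the grammar text, as quoted in the write-up
SECTOR_BYTES = 512   # assumed implementation size: one full boot sector

# Relative amount by which the grammar exceeds the implementation.
overhead = (GRAMMAR_BYTES - SECTOR_BYTES) / SECTOR_BYTES

if __name__ == "__main__":
    print(f"{overhead:.1%}")  # prints 37.5%, which the write-up rounds to 38%
```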

@pk-protect-ai 5d
Amazing.
@mihaic 5d
That is insane, congrats.

I would have liked some explanation of where function calls like vga_init and vga_set_pixel come from; I'm not a graybeard yet.

@sylware 5d
See the bootstrap project: https://bootstrappable.org/projects.html
@scrawl 5d
really interesting write-up. thanks for sharing!

do you think there are any lessons that can be applied to a "normal" interpreter/compiler written in standard C? i'm always interested in learning how to reduce the size of my interpreter binaries

@HarHarVeryFunny 5d
Amazing!

I think this, from the conclusion, is the real takeaway:

> Things that seem impossible often aren’t and we should Just Do It anyway

I certainly would never have tried to get a C compiler (even a subset) so small, since my instinct would have been that it was not possible.