Allergic to maintenance

Making part-time development practical again.

Jul 04, 2024

It’s been a mere ten months and for some inexplicable reason (or even perhaps many inexplicable reasons, I don’t know) my soft distro Alabaster has broken. Whatever work I was going to do with a fresh install has been thwarted once again.

I’m tired of this. I don’t want to spend whatever slivers of free time I have conducting wild goose chases to fix problems in my software that are invariably caused by others. I have a day job I have to go to, and no, it’s not the flowery fake white collar variety where I get paid to come look pretty and pretend to work for two hours and scroll Twitter for the remaining six.

As I gear up to write my paper about software modular memory, I look back on my weblog and shudder to realise that it has been well over a year since I first broke the news about this idea. This is taking me far longer than I would like to get done, and it’s not just because I have a life. Every single time I try to start a software project of any kind I end up getting arrested by bureaucratic software bullshit. At this point I’m terminally pissed off about it.

Why did Alabaster break? I don’t know. Half the time I watch the terminal and see commands execute out of order. No idea what’s going on there, but why the fuck should I care? What kind of unholy optimisation did they make that depended on assumptions that don’t actually hold? Don’t answer that, because I shouldn’t even need to care about this. That insanity is probably only one of several demons conspiring to ruin my codebase because some autistic jackass thought it would be a cool idea and none of the people around him getting paid six figures to suck off executives cared to disagree.

If this is your idea of cool fun, I have a torture rack to put you on.

I had already hatched Byblos as a campaign to rid myself of the abomination that is modern developer workflows writ large. Now I’m finding out I can’t even keep my regular operating systems sane long enough to use them to develop Byblos. How can I expect anyone else to take this work seriously if they have no means to demonstrate it for themselves? How many jackasses are getting paid by wispy foundations and corporate initiatives to sit in chairs and continue inventing bullshit architectures to keep themselves employed?

I’m also tired of bitching about this. I might be three layers of indirection down at this point (Byblos 2 → Byblos 1 → this Alabaster redux), but I have no choice but to keep on digging until I can get down far enough that all of this ‘modern’ bullshit code stops appearing to fuck everything in my computer up. I have 20,000 working days in my life and I’m not about to throw them away on busywork made for corporate parasites.

The objective in this case is simple: create a distribution of an old, well-known version of the Linux kernel that runs on a set of architectures I have the ability to care about. I’m also going to take a holistic, good-enough-to-work-everywhere approach, so that code built for these architectures runs even on systems that have evolved beyond them. In short, this means x86-64 v1 and ARMv7-A.

With the architectures chosen, the next step is to choose a version of the kernel. To get a coarse idea of the complexity inherent to these kernels, I counted the lines of code in each of them. Kernels 2.4 and 6.9 are not being considered; they are shown only to give a sense of perspective to the candidate versions, 2.6 and 3.19.

Linux 2.4.37.9. 2.7M lines of C, 696K lines of headers, 138K lines of assembly, 3.6M lines total.

Linux 2.6.39. 7.7M lines of C, 1.6M lines of headers, 204K lines of assembly, 9.8M lines total.

Linux 3.19.8. 10.1M lines of C, 2.1M lines of headers, 240K lines of assembly, 13M lines total.

Linux 6.9.7 (latest stable as of this writing). 17.8M lines of C, 7.5M lines of headers, 478K lines of reStructuredText, 419K lines of JSON, 356K lines of YAML, 230K lines of assembly, 27.3M lines total.
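For anyone who wants to reproduce counts of this kind, here is a minimal sketch of a line counter. It is not the tool used for the figures above, which the post does not name. The extension-to-category mapping and the function name `count_lines` are assumptions made for illustration, and unlike the figures above it counts comment lines, so its numbers will come out higher.

```python
import os
from collections import Counter

# Hypothetical mapping from file extension to category. The post does not
# say which tool or rules produced its figures, so this is an assumption.
CATEGORIES = {".c": "C", ".h": "headers", ".S": "assembly", ".s": "assembly"}


def count_lines(root):
    """Count non-blank lines per category in the source tree under root.

    Comments are NOT stripped here, so the totals will be larger than
    counts that exclude comments.
    """
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            category = CATEGORIES.get(os.path.splitext(name)[1])
            if category is None:
                continue  # not a C, header, or assembly file
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip broken symlinks and other non-files
            # Read as bytes so odd encodings in old sources cannot fail.
            with open(path, "rb") as f:
                totals[category] += sum(1 for line in f if line.strip())
    return totals
```

Pointing it at an unpacked kernel tree, for example `count_lines("linux-2.6.39")` (a hypothetical path), returns a `Counter` keyed by category.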

While I would love to have been able to consider 2.4 given its small size, it does not support x86-64 or ARMv6, and is missing features that actually matter like the O(1) scheduler, ext4 support, EFI support, AHCI support, and probably more.

Given these results, I have put together a few comparative statistics I found pertinent:

In comparing total line counts (excluding comments and blank lines):
- Kernel 2.6 is 2.72× larger than kernel 2.4
- Kernel 3.19 is:
  - 3.59× larger than kernel 2.4
  - 1.32× larger than kernel 2.6
- Kernel 6.9 is:
  - 7.52× larger than kernel 2.4
  - 2.77× larger than kernel 2.6
  - 2.09× larger than kernel 3.19

In comparing only C source (excluding headers) and assembly (again, excluding comments and blank lines):
- Kernel 2.6 is 2.78× larger than kernel 2.4
- Kernel 3.19 is:
  - 3.65× larger than kernel 2.4
  - 1.31× larger than kernel 2.6
- Kernel 6.9 is:
  - 6.32× larger than kernel 2.4
  - 2.27× larger than kernel 2.6
  - 1.73× larger than kernel 3.19
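The arithmetic behind these ratios can be sketched as follows, using the rounded figures quoted earlier. Because the inputs are rounded to the nearest 0.1M or 1K lines, some results differ from the ratios listed above by a few hundredths. The names `TOTAL`, `C_SOURCE`, `ASSEMBLY` and `ratios` are illustrative only.

```python
# Rounded line counts, in millions, as quoted in the post.
TOTAL = {"2.4": 3.6, "2.6": 9.8, "3.19": 13.0, "6.9": 27.3}
C_SOURCE = {"2.4": 2.7, "2.6": 7.7, "3.19": 10.1, "6.9": 17.8}
ASSEMBLY = {"2.4": 0.138, "2.6": 0.204, "3.19": 0.240, "6.9": 0.230}


def ratios(sizes):
    """Return {(newer, older): newer_size / older_size} for every pair."""
    versions = list(sizes)  # dict order here is oldest to newest
    return {
        (new, old): sizes[new] / sizes[old]
        for i, new in enumerate(versions)
        for old in versions[:i]
    }


# C source plus assembly, with headers left out.
c_and_asm = {v: C_SOURCE[v] + ASSEMBLY[v] for v in TOTAL}

for label, sizes in (("total", TOTAL), ("C + assembly", c_and_asm)):
    for (new, old), r in ratios(sizes).items():
        print(f"{label}: {new} is {r:.2f}x larger than {old}")
```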

From this I drew several conclusions:

- 2.4 is the leanest kernel in the lineup by a wide margin
- The gap between 2.6 and 3.19 is the smallest of any two adjacent versions compared
- 6.9’s growth suggests an acceleration in the growth of the kernel over time, since the modern release cadence of 4.x to today is much shorter than that of the 2.x and 3.x eras
- 6.9 added much more bespoke source (i.e. source in languages other than C and assembly)
- 6.9 lost 10K lines of assembly in total compared to 3.19

On to the final decision: all else equal, less is better, so 2.6 wins. However, all else is not equal, so the 3.x line must be examined for any killer features that might be desired. Going by the highlights column on Wikipedia’s Linux kernel version history, there doesn’t seem to be anything released in 3.x’s lifetime that is as earth-shattering as the highlights mentioned in 2.6’s history or in later kernels.

There is one 3.x highlight I want to discuss, and that is the introduction of ARM64 support. In general, I would prefer to support only one ISA. That ideal ISA would probably be x86-64, virtualised or emulated as necessary to run elsewhere, but unfortunately the performance hit in doing this on machines like an M1 Mac (let alone a toy ARM device like a Pi) is too harsh. Rosetta 2 gets impressive performance in dynamically translating macOS programs because of M1-specific design decisions that mimic x86 in important ways, while also sacrificing many large frontier domains of the ISA that are needed for any kind of serious emulation or virtualisation.

You might be thinking that I would need ARM64 support to run performantly on ARM64 machines, but I do not, because 32-bit ARMv7-A is, in this context, good enough for everything. I expect QEMU’s virtual board to dynamically translate code famously on the M1, far outclassing its performance in doing so with x86 code, because ARMv7-A is so much more closely related to ARMv8, and furthermore the virtual board can take many liberties to achieve this that IBM-PC compatibility might otherwise prevent. This rationale should hold, to varying but useful extents, on any kind of ARM or possibly even RISC system you can find.

As I said in my explanation for why 2.4 is right out, 2.6 has many of the key features I would be quite sore without, so using 3.19 is not worth it as things stand right now. I am choosing 2.6, but with one caveat: if I find something in 3.x that I really, truly need, I will make the switch to it, but no further versions will be considered.

This article forms the basis of the rationale for an Alabaster Linux redux. What was once a quasi-distribution broken by others will, in time, be a new, fully-fledged distro of its own, with no heritage or ecosystem overlap with other distros to speak of. I can only hope this is the last layer of bedrock I have to dig through to start retaining the fruits of the work I do in software engineering.
