This document contains only my personal opinions and calls of judgement, and where any comment is made as to the quality of anybody's work, the comment is an opinion, in my judgement.
A friend was discussing the privacy and economic implications of online ordering and home delivery for books; the discussion then moved to home delivery in general, and someone observed that home delivery means certain actions can be targeted quite precisely at a particular person, because home delivery is usually to a named individual.
Indeed for computers this has meant that the Snowden revelations have confirmed that the NSA's department of targeted operations (and no doubt those of every other major player) routinely intercepts and compromises deliveries of computer equipment.
But this need not apply just to computers: it also applies to food, medicines, even books, and all sorts of household items. At one extreme, poisoned or explosive products and devices delivered to named individuals have often been used as attacks.
Overall there is much greater safety in numbers and anonymity: buying products with cash in random shops makes it much more likely that they have not been tampered with.
I have been looking at typefaces and fonts and rendering for a while and one of the questions has always been which text (that is non-decorative) typefaces to use. Once upon a time the ideal was to use the five typefaces in the Adobe 14 font set but they have always been proprietary (except for the donation of Courier to X Windows) and most suited to high DPI rendering for printers, mostly because of their design. The URW clones of the Adobe 14 donated to X Windows are good, and have been recently resurrected as the Gyre typefaces, but they are not popular.
The best alternative then became the Microsoft web core fonts as they were designed for low DPI devices and well hinted for them, but while they are free-of-charge to use, they are not otherwise free to modify and improve.
With the passing years of use I have become ever fonder of the Ubuntu typefaces, both because of their pretty nice design and because they are well hinted, complete, and free to modify and extend.
There are other freeware typeface families, and the best alternative to the Ubuntu one is the DejaVu typefaces, but I like them less because I find them a bit too tall and print oriented. They are as complete (and perhaps even more so) and as well hinted as the Ubuntu typefaces though.
The other freeware typeface families are either less complete, not well hinted, or have some other issues; for example the Google Noto font collection has too many different and unnecessary types.
Typefaces that are almost as good as the Ubuntu and DejaVu ones include the Google Roboto ones and the KDE Oxygen fonts, but they are not as well hinted or are not as complete.
But overall I still think that the Ubuntu typefaces are very good, perhaps still the best autonomous Ubuntu project (the second best being probably Bazaar, at long last).
I have used in the past both RSS and Atom for the feeds of this web site, and I had written a CSS style sheet to display the RSS feeds. I have updated that a bit, and written an equivalent one for Atom. The sort-of special feature of both is that they turn the link elements into activable links, by using a JavaScript postprocessing script that turns the main link elements into either:
Note: in Atom the text content of some elements is supposed to be full HTML, but the simple style sheets I have written treat the content as simple text.
Note: On this site the feeds are served both with a .atom or .rss suffix and with an .xml suffix, because the first two are served with the application/atom+xml and application/rss+xml MIME types, and the styling code in browsers seems to be enabled only for the base text/xml MIME type (as well as that for HTML).
The setup seems to work with Microsoft Edge and with Brave, and the links to the relevant files are:
Note: Here are links to both current Atom feeds (and the older RSS feed) for this site, to be rendered using the stylesheets above: changed.xml, changed2.xml (changes.xml).
In natural language there are category names like chair and proper names like Japan, where the proper names denote individual entities. I reckon that it is usually quite important to distinguish between the two in computing too, not just in natural language.
In computing there is a further distinction, as there are generic proper names like site29host317 (commonly used for cattle type resources) and specific ones like sunflower (commonly used for pet type resources); a further distinction can be made among generic proper names between arbitrary ones and path based ones, that is names that embody some information on how to find the named resource.
Names apply not just to computers, but also to other resources, for example accounts, devices, filesystems, media, processes, files.
Most names used in computing are generic proper names, and most are at least in part path based. However I believe that specific proper names are useful in many more cases, because they are useful in labeling independent resources.
So I usually give filesystems a proper name, because as a rule which computer I am using is not very important to me; what matters is which filesystems I am accessing. For example if a computer dies I usually just move the storage units containing its filesystems to some other computer.
A difficulty with that is that while it is possible to use filesystem specific proper names in various places, most notably /etc/fstab (and I have written a nice script that allows it to be used as a map for the automounter), that might require scanning all available devices and media to determine which one holds which filesystem, as ultimately it is block device names that have to be used to mount filesystems.
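Such a scan can be done with standard tools; here is a minimal sketch (the label home1 and the mount point are hypothetical) of mounting by filesystem label, letting util-linux scan the available block devices:

# findfs scans the block devices for the filesystem labeled "home1"
# and prints the block device name, which mount then uses.
dev="$(findfs LABEL=home1)"
mount "$dev" /fs/home1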
Also, block device names usually depend on physical device names, which are usually based partially on paths, for example /dev/nvme0n1p7 or /dev/sdak3, and the naming of devices in a contemporary GNU/Linux system is typically opaque and possibly semi-random.
Because of that I have decided to give to several devices and media on my pet systems specific individual names, treating them as pets too, to avoid having to remember the details of their paths.
This has required me to deal with the demented udevd framework (which, thanks to the execrable GKH, replaced the much simpler and better devfsd (1, 2) system, designed and implemented well by Richard Gooch), with a scheme of symbolic-link based aliases like this:
# Alxum USB3 UASP dock
KERNEL=="sd*[!0-9]", SUBSYSTEMS=="scsi", \
  ENV{SCSI_MODEL}=="ASMT1153e", \
  ENV{DEVTYPE}=="disk", SYMLINK+="disk/alxum"
KERNEL=="sd*", SUBSYSTEMS=="scsi", \
  ENV{SCSI_MODEL}=="ASMT1153e", \
  ENV{DEVTYPE}=="partition", SYMLINK+="disk/alxum%n"

or the SDXC slot on the side of a laptop:
# ThinkPad E495 MMC slot on the side.
KERNEL=="mmcblk0", SUBSYSTEMS=="block", \
  ENV{DEVTYPE}=="disk", SYMLINK+="disk/side"
KERNEL=="mmcblk0p*", SUBSYSTEMS=="block", \
  ENV{DEVTYPE}=="partition", SYMLINK+="disk/side%n"

This scheme breaks down a bit if there are multiple docks with the same model numbers, but that is uncommon. Alternatively the partition table UUID can be used, as for the media of laptop daisy:
# Media of laptop "daisy"
KERNEL=="nvme?n?", SUBSYSTEMS=="block", \
  ENV{ID_PART_TABLE_UUID}=="ef4b523b-bb01-48b1-9396-7c60f5df2c2f", \
  ENV{DEVTYPE}=="disk", SYMLINK+="media/daisy0"
KERNEL=="nvme*", SUBSYSTEMS=="block", \
  ENV{ID_PART_TABLE_UUID}=="ef4b523b-bb01-48b1-9396-7c60f5df2c2f", \
  ENV{DEVTYPE}=="partition", SYMLINK+="media/daisy0p%n"
KERNEL=="sd*[!0-9]", SUBSYSTEMS=="scsi", \
  ENV{ID_PART_TABLE_UUID}=="42643b7a-2d2a-45a8-9412-06d23ba63f2f", \
  ENV{DEVTYPE}=="disk", SYMLINK+="media/daisy1"
KERNEL=="sd*", SUBSYSTEMS=="scsi", \
  ENV{ID_PART_TABLE_UUID}=="42643b7a-2d2a-45a8-9412-06d23ba63f2f", \
  ENV{DEVTYPE}=="partition", SYMLINK+="media/daisy1p%n"

or for a backup disk:
# Seagate 5T backup disk
KERNEL=="sd*[!0-9]", SUBSYSTEMS=="scsi", \
  ENV{ID_PART_TABLE_UUID}=="03871ee3-2a23-4cbc-b40e-dc17a7ff5ed6", \
  ENV{DEVTYPE}=="disk", SYMLINK+="media/both"
KERNEL=="sd*", SUBSYSTEMS=="scsi", \
  ENV{ID_PART_TABLE_UUID}=="03871ee3-2a23-4cbc-b40e-dc17a7ff5ed6", \
  ENV{DEVTYPE}=="partition", SYMLINK+="media/both%n"

In the case of media I have chosen to put their names under a new /dev/media/ directory.
There are of course already aliases under some subdirectories of /dev/disk/, but they are either very verbose, or are path based, and the names I give are much better mnemonics. Some relevant notes:
The main values of the scheme above are that it makes it easier to document some aspects of a system configuration, and that it prevents many of the mistakes that can happen when generic (especially path based) names are used wrongly.
Update 2022-04-30: The entries in /etc/fstab can then be written more portably and easily, for example as:
/dev/media/daisy0p1  /              jfs     defaults,auto,atime,nodiratime  7 1
/dev/media/daisy0p2  none           swap    sw,pri=10                       0 0
/dev/media/daisy0p3  /fs/home1      jfs     defaults,noauto,noatime         14 2
/dev/media/daisy1p1  /fs/home2      jfs     defaults,noauto,noatime         14 2
/dev/media/both1     /fs/both_root  f2fs    defaults,noauto,nofail,noatime  0 0
/dev/media/both2     /fs/both_home1 f2fs    defaults,noauto,nofail,noatime  0 0
/dev/media/both3     /fs/both_home2 f2fs    defaults,noauto,nofail,noatime  0 0
/dev/disk/alxum1     /fs/alxum1     auto    defaults,noauto,nofail,noatime  0 0
/dev/disk/alxum2     /fs/alxum2     auto    defaults,noauto,nofail,noatime  0 0
/dev/disk/alxum3     /fs/alxum3     auto    defaults,noauto,nofail,noatime  0 0
/dev/disk/alxum1     /fs/alxum1-n   nilfs2  defaults,noauto,nofail,noatime,nogc  0 0
/dev/disk/alxum2     /fs/alxum2-n   nilfs2  defaults,noauto,nofail,noatime,nogc  0 0
/dev/disk/alxum3     /fs/alxum3-n   nilfs2  defaults,noauto,nofail,noatime,nogc  0 0
/dev/disk/alxum1     /fs/alxum1-x   xfs     defaults,noauto,nofail,noatime,inode64  0 0
/dev/disk/alxum2     /fs/alxum2-x   xfs     defaults,noauto,nofail,noatime,inode64  0 0
/dev/disk/alxum3     /fs/alxum3-x   xfs     defaults,noauto,nofail,noatime,inode64  0 0
/dev/disk/alxum1     /mnt/alxum1-t  exfat   ro,rw,noauto,user,noatime,async,umask=0077  0 0
/dev/disk/alxum2     /mnt/alxum2-t  exfat   ro,rw,noauto,user,noatime,async,umask=0077  0 0
/dev/disk/alxum3     /mnt/alxum3-t  exfat   ro,rw,noauto,user,noatime,async,umask=0077  0 0
/dev/disk/alxum1     /mnt/alxum1-v  vfat    ro,rw,noauto,user,noatime,async,umask=0077,showexec,shortname=mixed  0 0
/dev/disk/alxum2     /mnt/alxum2-v  vfat    ro,rw,noauto,user,noatime,async,umask=0077,showexec,shortname=mixed  0 0
/dev/disk/alxum3     /mnt/alxum3-v  vfat    ro,rw,noauto,user,noatime,async,umask=0077,showexec,shortname=mixed  0 0
Note: the -n suffix on the mount directories is there to select the type of filesystem when mounting with the automounter or with mount --target as a user, because in neither case is it possible to pass the filesystem type as a parameter.
That works particularly well if the /fs/ directory is managed by the automounter using my script to turn /etc/fstab into a dynamic automount map.
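As a usage sketch with the entries above: the target directory chosen is what selects the filesystem type, since it cannot be passed as an option in these cases:

# With the fstab above the target directory implies the type; the
# same works unprivileged for the entries with the "user" option.
mount --target /fs/alxum1-n   # mounts /dev/disk/alxum1 as nilfs2
mount --target /fs/alxum1-x   # mounts the same partition as xfs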
There are, among many others, two celebrated reasons why GNU/Linux and also the BSDs (the base for Mac OS X and iOS) have become popular:
But there is a third reason that has been forgotten: it is not just, or even mainly, that they are non-proprietary, no-cost alternatives to MS-Windows and other products; they are based on the UNIX architecture, which is arguably better, as in simpler, more consistent, and more flexible, than that of most other operating systems.
The original UNIX was widely adopted because it was arguably better, even though it was a proprietary product with a significant price, for example (also 1, 2):
This seems often to be forgotten by those (I think many associated with FreeDesktop.org and GNOME) who seem to be interested only in it being a non-proprietary, no-cost alternative to MS-Windows, and effectively take the MS-Windows design style as a model to follow. UNIX systems are supposed to be arguably better, not just cheaper or more free.
That is particularly important as to designing extensions to the base UNIX system, which had significant limitations: those extensions should be as simple, consistent, flexible as the rest of the UNIX design (for example 1, 2, 3) rather than being MS-Windows like messy extensions.
Having recently written a simplified history of early UNIX init designs, this will be mostly an evaluation of the design of systemd, using runit as a counterpart, as many people seem to misunderstand both, in particular because they do not seem to be aware of the central issue: that is not being an init system, which is trivial, but the separate issue of supervision of daemons, taking into account that they need to communicate with each other and with arbitrary other processes, which therefore involves discussing the single greatest strength of the UNIX architecture, that is how well it allows IPC among related processes.
Note: in UNIX processes are related if they are descendants of the same process, that is they share in part the environment and file descriptors of their ancestor process, which can to an extent control them.
The clearest example of how well UNIX deals with related processes is shell pipelines, which are composed of related processes: they allow IPC among many processes with automatic data flow control and implicit synchronization, without even having to write any explicit code, just by having the shell set up pipes as communication channels among the related processes it creates for the pipeline.
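As a minimal illustration (a classic word-frequency pipeline, not from the original text):

# Five related processes, all children of the same shell; the pipes
# give automatic flow control (a fast stage blocks when the next
# pipe fills) and implicit synchronization (EOF propagates down).
tr -cs 'A-Za-z' '\n' < /etc/services | sort | uniq -c | sort -k1rn | head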
Conversely, in the classic UNIX architecture IPC among unrelated processes is extremely awkward: pipes are not available, IPC happens only through files, and files offer only pretty awkward synchronization mechanisms, which must be coded explicitly, with no automatic data flow control.
Note: In the classic UNIX architecture doing IPC via files was flow controlled and synchronized manually: run a command on a file by hand, and once the file is done, run another command on that file by hand.
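A sketch of that manual scheme, with generic file and command names:

# The only synchronization is the operator waiting for the first
# command to finish before running the second; there is no flow
# control on the intermediate file.
sort < requests.txt > /tmp/sorted.txt
uniq -c < /tmp/sorted.txt > report.txt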
However there are many services that should be provided by daemons to unrelated processes, typically involving some form of spooling, and doing that manually is quite bad.
Therefore since shortly after the classic UNIX editions there have been many attempts (for example Edition 7 multiplexed files (by Greg Chesson), 4BSD sockets (by Bill Joy), early System V named pipes, later System V STREAMS) to provide better IPC among unrelated processes. Most have attempted to replicate pipes as named entities in the filesystem, and all have substantially failed, in part because it is a difficult design problem, and in large part because it cannot be solved by random hacking: it requires a level of knowledge and insight even greater than that needed to design IPC among related processes using pipes, which was the major contribution of the authors of UNIX.
The big issues for supervising unrelated processes then involve IPC:
As to these issues runit and systemd take completely different approaches: runit is a well designed, simple, robust and efficient supervisor of processes and essentially ignores the issue of dependency ordering, while systemd aims to manage services and their dependencies, and does so in a poorly designed and widely misunderstood way.
What is common to them is the fundamental approach taken: they both turn unrelated daemon processes into related ones by having all daemon processes managed by the supervisor component, which in the case of runit is a program separate from init (one quite similar to daemontools), and in the case of systemd is part of the init program. In both cases the supervisor keeps a table of the child process numbers, and in both cases process management is done not via per-daemon scripts, as with the System V init and /etc/init.d/, but through requests to the supervisor, which then signals the child processes as appropriate.
Note: Some people make a big deal of the latter aspect, but having the supervisor keep the table of process numbers and send the signals, rather than keeping the numbers in a set of .pid files and signaling the processes directly, is nicer but not such a big deal.
There is an important point about both aspects (that runit manages processes, with a supervisor separate from init, while systemd attempts to manage services, with a supervisor integrated into init): systemd's main goal is to minimize boot times by maximizing the parallelism with which service daemons can be launched, assuming that there are many services with a complex web of dependencies among them, while runit seems designed to supervise a relatively small number of service daemons that can be ordered in a much less parallel way.
The reason why systemd was designed with that main goal seems to me that modern GNU/Linux desktop GUI environments can have hundreds of services with complex relations among them, just like under MS-Windows, and minimizing the time to the appearance of a graphical login prompt by parallelizing services as much as possible gives the impression of a more responsive system, just like under MS-Windows. It is not by mere chance that systemd and related software are endorsed by FreeDesktop.org, which seems an organization devoted to making UNIX as similar to MS-Windows as possible.
Note: Therefore systemd process supervision must be integrated with init to ensure that system initialization is managed by systemd too, including the initial activation of devices and the mounting of filestores, so they can also be parallelized to reduce boot times.
Therefore, in order to manage the dependencies among unrelated services optimally (in terms of parallelism), by making them all related as children of the same systemd, every resource must be turned into a systemd service or pseudo-service, and this includes traditionally separate concepts like initial boot activity, activating storage devices, mounting filesystems, and much else.
Note: Consider the example of two services each of which has a spool area on a separate filestore on two different block devices on two distinct physical devices: they can be started in parallel as long as activating the physical devices, configuring the block devices, and mounting the filestores can be done in parallel before each service is started, and all the relevant steps are done lazily as-needed.
Part of that is due to a rather fundamental problem: while process start does not imply service start, there is no established convention by which a daemon can show that its service has been initialized and is ready. To work around this, the idea has been to overlay on top of UNIX a universal IPC system for unrelated services (rather than processes) called D-Bus (which is similar to DCOM in MS-Windows), so that all requests to a daemon are turned into unrelated-process IPC, and therefore requesting processes automatically wait for a reply from the services they depend on:
$ qdbus --system | grep -v '^:' | sort
com.redhat.NewPrinterNotification
com.redhat.PrinterDriversInstaller
org.bluez
org.freedesktop.Accounts
org.freedesktop.ColorManager
org.freedesktop.PolicyKit1
org.freedesktop.RealtimeKit1
org.freedesktop.UDisks2
org.freedesktop.UPower
org.freedesktop.login1
org.freedesktop.systemd1
org.freedesktop.DBus
Then several existing UNIX services have to be rewritten into D-Bus accessible services, and integrated within systemd or packaged with it, in order to maximize parallelism while respecting dependencies, and indeed the systemd documentation advises:
Note that while systemd offers a flexible dependency system between units it is recommended to use this functionality only sparingly and instead rely on techniques such as bus-based or socket-based activation which make dependencies implicit, resulting in a both simpler and more flexible system.
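As a concrete illustration of the socket-based activation recommended above, here is a hedged sketch of a pair of units (exampled and its paths are hypothetical, and the daemon is assumed to support socket activation):

# exampled.socket: systemd itself listens on the socket, so clients
# can connect before the daemon has even been started.
[Socket]
ListenStream=/run/exampled.sock

[Install]
WantedBy=sockets.target

# exampled.service: started lazily on the first connection; clients
# implicitly wait on the socket, so no explicit ordering is needed.
[Service]
ExecStart=/usr/sbin/exampled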
My overall evaluation is that runit achieves its limited aims successfully, elegantly and robustly, while systemd achieves its much more sophisticated aims, even if worthy ones, mostly but not wholly successfully, and with much complexity and fragility. I think these are the main reasons for the latter:
- The aims, even if worthy, were not necessarily approached insightfully.
- The developers seem to me idiot savants, skilled in being clever but prone to mindless hacking, and I have the impression that there was a considerable lack of insight into the UNIX architectural choices and excessive admiration for Microsoft style designs (down to details like using .ini MS-DOS style configuration files).
- The resulting system is large and elaborate, but not necessarily thoughtfully designed.
- The documentation is copious, but not necessarily at the level of completeness and quality of the original UNIX documentation either, which is a pity as the D-Bus and systemd infrastructure is overriding a lot of the simpler and better documented underlying UNIX architecture.
Note: systemd is thus a large and complex program dependent on a lot of complex libraries:
$ sudo lsof -p 1 |& grep -w REG | sort -k 9
systemd 1 root mem REG 259,5 2454496 1616052 /lib/systemd/libsystemd-shared-245.so
systemd 1 root txt REG 259,5 1620224 1616398 /lib/systemd/systemd
systemd 1 root mem REG 259,5  191472  465694 /lib/x86_64-linux-gnu/ld-2.31.so
systemd 1 root mem REG 259,5  133200  465682 /lib/x86_64-linux-gnu/libaudit.so.1.0.0
systemd 1 root mem REG 259,5  351352  466666 /lib/x86_64-linux-gnu/libblkid.so.1.1.0
systemd 1 root mem REG 259,5 2029224  466299 /lib/x86_64-linux-gnu/libc-2.31.so
systemd 1 root mem REG 259,5   27064  465688 /lib/x86_64-linux-gnu/libcap-ng.so.0.0.0
systemd 1 root mem REG 259,5   31120  465689 /lib/x86_64-linux-gnu/libcap.so.2.32
systemd 1 root mem REG 259,5  202760  465692 /lib/x86_64-linux-gnu/libcrypt.so.1.1.0
systemd 1 root mem REG 259,5  454192  369757 /lib/x86_64-linux-gnu/libcryptsetup.so.12.5.0
systemd 1 root mem REG 259,5  431472  465699 /lib/x86_64-linux-gnu/libdevmapper.so.1.02.1
systemd 1 root mem REG 259,5   18816  466300 /lib/x86_64-linux-gnu/libdl-2.31.so
systemd 1 root mem REG 259,5  137584  465710 /lib/x86_64-linux-gnu/libgpg-error.so.0.28.0
systemd 1 root mem REG 259,5  162264  465747 /lib/x86_64-linux-gnu/liblzma.so.5.2.4
systemd 1 root mem REG 259,5 1369352  466301 /lib/x86_64-linux-gnu/libm-2.31.so
systemd 1 root mem REG 259,5  387768  465707 /lib/x86_64-linux-gnu/libmount.so.1.1.0
systemd 1 root mem REG 259,5   68320  465717 /lib/x86_64-linux-gnu/libpam.so.0.84.2
systemd 1 root mem REG 259,5  157224  466313 /lib/x86_64-linux-gnu/libpthread-2.31.so
systemd 1 root mem REG 259,5   40040  466315 /lib/x86_64-linux-gnu/librt-2.31.so
systemd 1 root mem REG 259,5  163200  465772 /lib/x86_64-linux-gnu/libselinux.so.1
systemd 1 root mem REG 259,5  178528  114923 /lib/x86_64-linux-gnu/libudev.so.1.6.17
systemd 1 root mem REG 259,5   30936  389164 /lib/x86_64-linux-gnu/libuuid.so.1.3.0
systemd 1 root 10r REG 0,23 0   13427082 /proc/1/mountinfo
systemd 1 root 14r REG 0,23 0 4026532084 /proc/swaps
systemd 1 root mem REG 259,5   39088  789550 /usr/lib/x86_64-linux-gnu/libacl.so.1.1.2253
systemd 1 root mem REG 259,5   80736  789565 /usr/lib/x86_64-linux-gnu/libapparmor.so.1.6.1
systemd 1 root mem REG 259,5   34872  789575 /usr/lib/x86_64-linux-gnu/libargon2.so.1
systemd 1 root mem REG 259,5 2954080  789770 /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
systemd 1 root mem REG 259,5 1168056  790022 /usr/lib/x86_64-linux-gnu/libgcrypt.so.20.2.5
systemd 1 root mem REG 259,5  129096  790260 /usr/lib/x86_64-linux-gnu/libidn2.so.0.3.6
systemd 1 root mem REG 259,5   35440  790279 /usr/lib/x86_64-linux-gnu/libip4tc.so.2.0.0
systemd 1 root mem REG 259,5   67912  790309 /usr/lib/x86_64-linux-gnu/libjson-c.so.4.0.0
systemd 1 root mem REG 259,5  104656  790328 /usr/lib/x86_64-linux-gnu/libkmod.so.2.3.5
systemd 1 root mem REG 259,5  129248  790402 /usr/lib/x86_64-linux-gnu/liblz4.so.1.9.2
systemd 1 root mem REG 259,5  584392  938151 /usr/lib/x86_64-linux-gnu/libpcre2-8.so.0.9.0
systemd 1 root mem REG 259,5  133568  789590 /usr/lib/x86_64-linux-gnu/libseccomp.so.2.5.1
systemd 1 root mem REG 259,5 1575112  938483 /usr/lib/x86_64-linux-gnu/libunistring.so.2.1.0
Therefore I am not very satisfied with either runit or systemd, because while both work-ish enough to be viable:
While waiting for the attempt in the latter point, I feel that a system more sophisticated than runit yet less awkward, monolithic and complex than systemd would have been possible, based on something like the automounter and inetd, and more filesystem based, like runit and Plan 9.
For the sake of looking at a minimalist GNU/Linux distribution I have been running in a VM a live DVD image of Void Linux, which notably uses musl libc and runit.
For various reasons this has finally prompted me to write about init implementation alternatives, starting with a bit of history here:
In traditional UNIX systems the init had very limited roles: essentially to start a few initial processes and to reap them when they terminated.
Things started to become more complex when in Edition 7 the initial shell script began to start daemons in the order in which they were mentioned, and startup happened in sequential phases called run levels, where it was guaranteed that all daemons in a run level were started before any daemon in the next run levels.
Because the daemons started by the Edition 7 initial shell script were unsupervised, when they started to multiply someone designed a new init that did some process supervision, as configured in the /etc/inittab file. The supervision was very limited: the list of supervised processes to start was static, processes in practice could just be started and optionally restarted when they terminated, and they were started in the order in which they were listed in that /etc/inittab.
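A minimal sketch of such /etc/inittab entries (illustrative, in the usual identifier:runlevels:action:command format):

# "respawn" asks init to restart the process whenever it terminates;
# "once" just starts it and reaps it when it exits.
1:2345:respawn:/sbin/getty 38400 tty1
2:2345:respawn:/sbin/getty 38400 tty2
sd:2345:once:/usr/sbin/somedaemon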
The System V init documentation mentions run levels, but they are not sequential phases: they are a far more general notion of run level states, which was in practice almost never used, where when transitioning from one run level state to another:
Because of the limitations of the System V init, someone modified it by adding an extra layer of self-supervision, similar in part to that of the Edition 7 service starting shell scripts: apart from a few directly started daemons, most others were started by a per-daemon shell script under /etc/init.d/ with the following logic:
start the daemon, storing its process number in a file; stop or otherwise control the daemon by sending signals with that process number.
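A hedged sketch of that per-daemon logic (not any actual distribution's script; exampled and its paths are hypothetical, and the daemon is assumed not to detach itself):

#!/bin/sh
# Minimal /etc/init.d/ style control script: supervision amounts to
# storing the process number and sending signals with it later.
PIDFILE=/var/run/exampled.pid
case "$1" in
start)   /usr/sbin/exampled & echo "$!" > "$PIDFILE" ;;
stop)    kill "$(cat "$PIDFILE")" && rm -f "$PIDFILE" ;;
restart) "$0" stop; "$0" start ;;
esac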
The critical aspect of this scheme is that daemon supervision, as in Edition 7 and previous schemes, is no longer done (except for reaping) by init but by a per-daemon supervision script, according to a scheme unique to that daemon (even if usually based on signals sent to a stored process number).
Overall the critical differences are in daemon supervision, and in particular in two aspects:
Then in BSD UNIX and later a second daemon starting mechanism was added for daemons contactable via IPC: inetd (and its successor xinetd), which has the remarkable property of having the option to start a daemon lazily on access, rather than pre-starting it at some initial point, and the even more remarkable implicit property that, since access to its daemons is via IPC, dependencies among daemons are both automatic and defined by daemons being ready (replying to IPC) rather than merely started.
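A minimal sketch of a classic inetd.conf entry (illustrative, in the standard service socket-type protocol wait-flag user server arguments format):

# No pre-starting: inetd listens on the ftp port and starts the
# daemon lazily when a connection actually arrives.
ftp stream tcp nowait root /usr/sbin/in.ftpd in.ftpd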
The latter point is quite important: in all previous schemes the ordering and dependencies of daemons were defined by a daemon being started, rather than by it being ready, and most daemons have an initialization phase before becoming ready. It is entirely possible that if daemon A is a dependency of daemon B and daemon A is started before daemon B, daemon B becomes ready while daemon A is still not ready.
Fortunately the case where a daemon started before another becomes ready after it is rare. The only major cases of daemons that take a while to become ready are network service daemons, for which that is not a great problem, and storage activation; in most UNIX like systems storage activation commands complete only when activation is done, and they are usually run before all other daemons.
To avoid the former, something like inetd was created for network services; to activate storage lazily, the various types of automounter services.
In UNIX like systems user authorization is based on numeric user (and group) identifiers, but there are also user (and group) names that define accounts, that is bundles of resources and configurations. While a user name cannot be associated with multiple user numbers, more than one user name can have the same user number, that is the same authorizations, even if this situation is rare.
Despite that, some programs try to canonicalize user names by resolving them to user numbers and then looking up the user number to get back a user name, which can then be different from the original user name used for logging in.
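A minimal sketch of that round trip (the commands are standard, the situation hypothetical):

# Resolve the login name to its user number, then map the number back
# to a name; when several names share the number, getent returns only
# the first matching entry, which may differ from $LOGNAME.
uid="$(id -u "$LOGNAME")"
getent passwd "$uid" | cut -d: -f1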
In particular this is done by the PulseAudio daemon, which then creates its service socket in the wrong home directory, at least the wrong one by the logic of many other programs that simply use the user name they find in the environment.
The workaround in that particular case is to hardcode the service socket path in the files:
$ cat .config/pulse/default.pa
load-module module-native-protocol-unix auth-anonymous=0 socket=/tmp/pulse-username
.include /etc/pulse/default.pa
$ cat .config/pulse/client.conf
default-server=unix:/tmp/pulse-username
Note: that configures a socket file under /tmp/ rather than under the user's home directory, because often the home directory is accessed via NFS or another network shared filesystem type, and sockets on those sometimes have less preferable behaviour.
It seems quite funny to find a post about Ubuntu Snap containers with the following argument:
Better yet, you can use extensions, a framework designed to make snap usage more consistent, faster, and easier. We talked about extensions last year, with the KDE extension as an example. Similarly, there are several other supported extensions, including GNOME, ROS and Flutter.
In addition to having snaps behave in a more predictable way, extensions help you gain – or rather lose – size! [...]
For instance, the KDE’s KCalc snap, which typically weighs around 100 MB as a standalone application, comes in at a very small, neat 972 KB – a 99% reduction from the original target and a number worth the 1980s gaming scene. Of course, the necessary libraries still need to exist somewhere – and they are contained in the KDE frameworks snap, which is used for all KDE applications.
That prompts the question of how that is different from putting the KDE runtime environment in a single ordinary package and then having an ordinary kcalc package depend on it.
Previous posts show that I am quite interested in filesystem types (1, 2, 3, etc.) and storage (1, 2, 3, etc.), and that is because I think that they are really important; in part because in several situations I have seen terrible storage systems, which caused significant delays to users.
Note: however I have seen several other situations in which the storage systems were also terrible but this did not matter much because the requirements were so low.
So recently I wanted to get a first idea of how good allocators are at avoiding fragmentation of file layout, at least in the simplest case, filling up a filesystem. In this case I copied the root filesystem of a laptop, which is on a 2GB/s NVME stick (so reading it has negligible impact on the speed of the test), to an old slow 5400RPM 2.5in 250GB HDD over USB3, as I like to see how things go with high latency, slow transfer rate devices (it tops out at around 80MB/s; fast SSDs are too easy).
The filesystem tree takes around 81GiB and the partition in which it resides has a capacity of around 88GiB; being a root filesystem tree it has lots of small files, but I have also added some subtrees with somewhat large files (just several GiB though). I have done the copies with rsync -axHOJ after freshly formatting the filesystem each time. Then I have used a little script with find and filefrag to count the extents in files larger than 16KiB and ordered the results by number of fragments:
# tail -n5 frag*
==> frag-sdc6-bcachefs <==
5 /mnt/sdc6/loc/data/fonts/lm1.106bas.zip
5 /mnt/sdc6/usr/lib/llvm-11/lib/libclang-cpp.so.11
5 /mnt/sdc6/usr/share/keyrings/debian-keyring.gpg
5 /mnt/sdc6/usr_src/pkg7/Cyberbit/CyberCJK.ZIP
6 /mnt/sdc6/usr/share/skypeforlinux/resources/app.asar.unpacked/modules/slimcore.node

==> frag-sdc6-btrfs <==
3 /mnt/sdc6/loc/data/distrib/gentoo-amd64-20190615-zfs-0.8.1.iso
3 /mnt/sdc6/opt/brave.com/brave/brave
5 /mnt/sdc6/var_data/recoll/xapiandb/termlist.glass
10 /mnt/sdc6/var_data/recoll/xapiandb/postlist.glass
14 /mnt/sdc6/var_data/recoll/xapiandb/position.glass

==> frag-sdc6-ext4 <==
10 /mnt/sdc6/loc/data/distrib/systemrescuecd-amd64-6.1.3.iso.part
12 /mnt/sdc6/loc/data/dbase/freedb-complete-20140101.tar.bz2
12 /mnt/sdc6/loc/data/distrib/gentoo-amd64-20190615-zfs-0.8.1.iso
15 /mnt/sdc6/var_data/recoll/xapiandb/docdata.glass
32 /mnt/sdc6/var_data/recoll/xapiandb/position.glass

==> frag-sdc6-f2fs <==
17 /mnt/sdc6/loc/bsp/xwi6/worldofpadman.run
27 /mnt/sdc6/loc/data/dbase/freedb-complete-20140101.tar.bz2
29 /mnt/sdc6/var_data/recoll/xapiandb/termlist.glass
32 /mnt/sdc6/var_data/recoll/xapiandb/position.glass
38 /mnt/sdc6/var_data/recoll/xapiandb/postlist.glass

==> frag-sdc6-jfs <==
12 /mnt/sdc6/usr/src/linux-oem-5.10-headers-5.10.0-1023/arch/mips/include/asm/octeon/cvmx-npei-defs.h
12 /mnt/sdc6/var_data/recoll/xapiandb/termlist.glass
14 /mnt/sdc6/usr/share/efitools/efi/ReadVars.efi
19 /mnt/sdc6/var_data/recoll/xapiandb/postlist.glass
43 /mnt/sdc6/var_data/recoll/xapiandb/position.glass

==> frag-sdc6-ocfs2 <==
467 /mnt/sdc6/usr/lib/firefox/libxul.so
491 /mnt/sdc6/usr/share/AAVMF/AAVMF_CODE.fd
513 /mnt/sdc6/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcore-3adda86a16d1040e.rlib
597 /mnt/sdc6/var/lib/bogofilter/wordlist.db
662 /mnt/sdc6/usr/share/code/code

==> frag-sdc6-reiserfs <==
464 /mnt/sdc6/var/lib/apt/lists/ftp.ch.debian.org_debian_dists_buster_main_source_Sources
661 /mnt/sdc6/var/lib/cdebconf/templates.dat
698 /mnt/sdc6/var_data/recoll/xapiandb/termlist.glass
2472 /mnt/sdc6/var_data/recoll/xapiandb/postlist.glass
4187 /mnt/sdc6/var_data/recoll/xapiandb/position.glass

==> frag-sdc6-xfs <==
2 /mnt/sdc6/loc/data/distrib/gentoo-amd64-20190615-zfs-0.8.1.iso
2 /mnt/sdc6/loc/data/distrib/systemrescuecd-amd64-6.1.3.iso.part
2 /mnt/sdc6/var_data/recoll/xapiandb/termlist.glass
3 /mnt/sdc6/var_data/recoll/xapiandb/position.glass
3 /mnt/sdc6/var_data/recoll/xapiandb/postlist.glass
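The little script was along these lines (a hedged sketch of a find plus filefrag pipeline, not the exact script used):

# Count the extents of regular files larger than 16KiB and order the
# result by fragment count; filefrag prints "path: N extents found".
find /mnt/sdc6 -xdev -type f -size +16k -print0 |
  xargs -0 filefrag 2>/dev/null |
  awk -F': ' '{ n = $2; sub(/ extent.*/, "", n); print n "\t" $1 }' |
  sort -k1n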
Some comments in a meaningful order:
The COW filesystem types have very nicely unfragmented allocation, and they also had very good write rates. Looking at write rates, which do not drop much even when writing lots of small files, and comparing with ext4, JFS and XFS, where that happens to varying degrees, I think that is because COW filesystems do not update metadata like inodes and internal trees in-place and thus are able to keep streaming. Alternatively, it may be that they need less stringent persistence (as per POSIX and metadata updates) thanks to COW updates being both journaled and effectively being a journal.
As to ext4, I suspect that since the O_PONIES controversy (1, 2, 3, etc.) its default persistency conditions are weaker than those of JFS and XFS (in particular I wonder about the auto_da_alloc option, but that is just a guess).
There have been recent indications that (sequential) COW systems (filesystems or databases) end up with a lot of write amplification, but at least in this case the results are not bad, because the write amplification is sequential, and consumes bandwidth instead of much scarcer random IOPS, and seems to save even more IOPS by reducing in-place updates of metadata.
Overall as usual I like JFS, F2FS, and also Btrfs (even if I would not use its storage management aspects), and bcachefs looks very promising too (I like most of its storage management aspects, but they are not fully mature yet).
So I have been looking again at filesystems and rough measures of their complexity in the Linux kernel, version 5.10 (with the bcachefs patches), and here are the sizes of the sources and those of the compiled code:
# D="udf jfs nilfs2 reiserfs gfs2 f2fs ocfs2 bcachefs ../drivers/md/bcache xfs btrfs" # for N in $D ext4; do L=`cat $N/*.[chsyl] | wc -l`; echo -e "$L\t$N"; done | sort -k1n 11297 udf 17656 ../drivers/md/bcache 22793 nilfs2 31976 gfs2 32384 jfs 32397 reiserfs 41828 f2fs 59809 ext4 67184 bcachefs 67474 xfs 70832 ocfs2 133746 btrfs # for N in $D; do size $N/*.ko; done | sort -u -k1n text data bss dec hex filename 2472 1228 0 3700 e74 ocfs2/ocfs2_stack_o2cb.ko 4982 2224 38 7244 1c4c ocfs2/ocfs2_stackglue.ko 6023 1420 8 7451 1d1b ocfs2/ocfs2_stack_user.ko 87502 6592 8 94102 16f96 udf/udf.ko 157573 3364 968 161905 27871 jfs/jfs.ko 157754 10908 48 168710 29306 nilfs2/nilfs2.ko 169614 25283 152 195049 2f9e9 ../drivers/md/bcache/bcache.ko 202212 3588 4368 210168 334f8 reiserfs/reiserfs.ko 261705 121432 33048 416185 659b9 gfs2/gfs2.ko 474565 76180 560 551305 86989 f2fs/f2fs.ko 679404 155492 240 835136 cbe40 ocfs2/ocfs2.ko 689345 44535 72 733952 b3300 bcachefs/bcachefs.ko 823525 252029 384 1075938 106ae2 xfs/xfs.ko 984066 113236 15040 1112342 10f916 btrfs/btrfs.ko
Note: the code size for ext4 is missing as it is compiled to be built-in.
Note: the total size of bcachefs includes that of bcache which it uses for low level storage management.
Also, separately, some not exactly but quite reliably comparable sizes for OpenZFS:
# cat `find git-zfs/include git-zfs/lib git-zfs/module -name '*.[chyl]'` | wc -l
482914
# size /lib/modules/5.8.0-40-generic/kernel/zfs/zfs.ko
   text    data     bss     dec     hex filename
1917113   74096 1538072 3529281  35da41 /lib/modules/5.8.0-40-generic/kernel/zfs/zfs.ko
Note: these are quite rough measures of complexity and functionality, as source code line counts depend also on coding style, and compiled code size depends also on inlining.
Pretty obviously UDF, JFS, NILFS2, Reiser3, and even F2FS are in a class of their own: they are all sophisticated designs with full functionality, and yet they are much, much simpler than OCFS2, bcachefs, and especially XFS, Btrfs, ZFS.
In particular XFS seems very complex given that it is largely a plain vanilla filesystem, without the complex parallel access logic of OCFS2 or the complex RAID and volume management logic of bcachefs, Btrfs and ZFS.
Given the enormous sizes of the latter I feel that it is a miracle that they work reliably, and that is probably mostly due to being around for a long time (except bcachefs which is relatively new).
My usual preference is for the simpler filesystems, and in particular for JFS, but UDF and NILFS2 have interesting special features and work well too, and F2FS has some considerable special features; in particular it is heavily used in high end cellphones and tablets, so it can be expected to be well tested and maintained.
Of the more fantastically complex designs I have become more skeptical of XFS, as while it is reliable its complexity is not matched by equivalent features (it was rumoured to have included five different B-tree implementations). I have been using mostly JFS and NILFS2, and also Btrfs quite a bit, at home, without using its questionable storage management design, and I have used ZFS often at various work places, because other people were familiar with it.
I have been using Btrfs (which is going to be the default for Fedora) in part because of its snapshotting, but I rarely use that, as I keep a small series of backups, which give pretty much the same ability to go back in time; I use it primarily for the checksums.
I am thinking of trying to stop using Btrfs at home and to use F2FS more, and, where I reckon I want data checksums, to use bcachefs, as it seems quite good and has the big advantage of having a single main developer with good taste; and perhaps to continue using ZFS at work.