Planet Debian

Subscribe to Planet Debian feed
Planet Debian - https://planet.debian.org/
Updated: 1 day 3 min ago

Russ Allbery: Review: Unseen Academicals

Wed, 2024-04-17 22:37

Review: Unseen Academicals, by Terry Pratchett

Series: Discworld #37 Publisher: Harper Copyright: October 2009 Printing: November 2014 ISBN: 0-06-233500-6 Format: Mass market Pages: 517

Unseen Academicals is the 37th Discworld novel and includes many of the long-standing Ankh-Morpork cast, but mostly as supporting characters. The main characters are a new (and delightful) bunch with their own concerns. You arguably could start reading here if you really wanted to, although you would risk spoiling several previous books (most notably Thud!) and will miss some references that depend on familiarity with the cast.

The Unseen University is, like most institutions of its sort, funded by an endowment that allows the wizards to focus on the pure life of the mind (or the stomach). Much to their dismay, they have just discovered that an endowment that amounts to most of their food budget requires that they field a football team.

Glenda runs the night kitchen at the Unseen University. Given the deep and abiding love that wizards have for food, there is both a main kitchen and a night kitchen. The main kitchen is more prestigious, but the night kitchen is responsible for making pies, something that Glenda is quietly but exceptionally good at.

Juliet is Glenda's new employee. She is exceptionally beautiful, not very bright, and a working class girl of the Ankh-Morpork streets down to her bones. Trevor Likely is a candle dribbler, responsible for assisting the Candle Knave in refreshing the endless university candles and ensuring that their wax is properly dribbled, although he pushes most of that work off onto the infallibly polite and oddly intelligent Mr. Nutt.

Glenda, Trev, and Juliet are the sort of people who populate the great city of Ankh-Morpork. While the people everyone has heard of have political crises, adventures, and book plots, they keep institutions like the Unseen University running. They read romance novels, go to the football games, and nurse long-standing rivalries. They do not expect the high mucky-mucks to enter their world, let alone mess with their game.

I approached Unseen Academicals with trepidation because I normally don't get along as well with the Discworld wizard books. I need not have worried; Pratchett realized that the wizards would work better as supporting characters and instead turns the main plot (or at least most of it; more on that later) over to the servants. This was a brilliant decision. The setup of this book is some of the best of Discworld up to this point.

Trev is a streetwise rogue with an uncanny knack for kicking around a can that he developed after being forbidden to play football by his dear old mum. He falls for Juliet even though their families support different football teams, so you may think that a Romeo and Juliet spoof is coming. There are a few gestures of one, but Pratchett deftly avoids the pitfalls and predictability and instead makes Juliet one of the best characters in the book by playing directly against type. She is one of the characters that Pratchett is so astonishingly good at, the ones that are so thoroughly themselves that they transcend the stories they're put into.

The heart of this book, though, is Glenda.

Glenda enjoyed her job. She didn't have a career; they were for people who could not hold down jobs.

She is the kind of person who knows where she fits in the world and likes what she does and is happy to stay there until she decides something isn't right, and then she changes the world through the power of common sense morality, righteous indignation, and sheer stubborn persistence. Discworld is full of complex and subtle characters fencing with each other, but there are few things I have enjoyed more than Glenda being a determinedly good person. Vetinari of course recognizes and respects (and uses) that inner core immediately.

Unfortunately, as great as the setup and characters are, Unseen Academicals falls apart a bit at the end. I was eagerly reading the story, wondering what Pratchett was going to weave out of the stories of these individuals, and then it partly turned into yet another wizard book. Pratchett pulled another of his deus ex machina tricks for the climax in a way that I found unsatisfying and contrary to the tone of the rest of the story, and while the characters do get reasonable endings, it lacked the oomph I was hoping for. Rincewind is as determinedly one-note as ever, the wizards do all the standard wizard things, and the plot just isn't that interesting.

I liked Mr. Nutt a great deal in the first part of the book, and I wish he could have kept that edge of enigmatic competence and unflappableness. Pratchett wanted to tell a different story that involved more angst and self-doubt, and while I appreciate that story, I found it less engaging and a bit more melodramatic than I was hoping for. Mr. Nutt's reactions in the last half of the book were believable and fit his background, but that was part of the problem: he slotted back into an archetype that I thought Pratchett was going to twist and upend.

Mr. Nutt does, at least, get a fantastic closing line, and as usual there are a lot of great asides and quotes along the way, including possibly the sharpest and most biting Vetinari speech of the entire series.

The Patrician took a sip of his beer. "I have told this to few people, gentlemen, and I suspect never will again, but one day when I was a young boy on holiday in Uberwald I was walking along the bank of a stream when I saw a mother otter with her cubs. A very endearing sight, I'm sure you will agree, and even as I watched, the mother otter dived into the water and came up with a plump salmon, which she subdued and dragged on to a half-submerged log. As she ate it, while of course it was still alive, the body split and I remember to this day the sweet pinkness of its roes as they spilled out, much to the delight of the baby otters who scrambled over themselves to feed on the delicacy. One of nature's wonders, gentlemen: mother and children dining on mother and children. And that's when I first learned about evil. It is built into the very nature of the universe. Every world spins in pain. If there is any kind of supreme being, I told myself, it is up to all of us to become his moral superior."

My dissatisfaction with the ending prevents Unseen Academicals from rising to the level of Night Watch, and it's a bit more uneven than the best books of the series. Still, though, this is great stuff; recommended to anyone who is reading the series.

Followed in publication order by I Shall Wear Midnight.

Rating: 8 out of 10

Categories: FLOSS Project Planets

Samuel Henrique: Hello World

Wed, 2024-04-17 20:00

This is my very first post, just to make sure everything is working as expected.

Made with Zola and the Abridge theme.

Categories: FLOSS Project Planets

Petter Reinholdtsen: RAID status from LSI Megaraid controllers in Debian

Wed, 2024-04-17 11:00

I am happy to report that the megactl package, useful to fetch RAID status when using the LSI Megaraid controller, now is available in Debian. It passed NEW a few days ago, and is now available in unstable, and probably showing up in testing in a weeks time. The new version should provide Appstream hardware mapping and should integrate nicely with isenkram.

As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

Categories: FLOSS Project Planets

Dirk Eddelbuettel: RcppArmadillo 0.12.8.2.1 on CRAN: Micro Fix

Tue, 2024-04-16 22:00

Armadillo is a powerful and expressive C++ template library for linear algebra and scientific computing. It aims towards a good balance between speed and ease of use, has a syntax deliberately close to Matlab, and is useful for algorithm development directly in C++, or quick conversion of research code into production environments. RcppArmadillo integrates this library with the R environment and language–and is widely used by (currently) 1135 other packages on CRAN, downloaded 33.7 million times (per the partial logs from the cloud mirrors of CRAN), and the CSDA paper (preprint / vignette) by Conrad and myself has been cited 579 times according to Google Scholar.

Yesterday’s release accommodates reticulate by suspending a single test that now ‘croaks’ creating a reverse-dependency issue for that package. No other changes were made.

The set of changes since the last CRAN release follows.

Changes in RcppArmadillo version 0.12.8.2.1 (2024-04-15)
  • One-char bug fix release commenting out one test that upsets reticulate when accessing a scipy sparse matrix

Courtesy of my CRANberries, there is a diffstat report relative to previous release. More detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the Rcpp R-Forge page.

If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Categories: FLOSS Project Planets

Andreas Rönnquist: Status update for Allegro packaging in Debian

Mon, 2024-04-15 12:10

I have mailed to a Debian bug on allegro4.4 describing my reasoning
regarding the allegro libraries – in short, allegro4.4 is pretty much
dead upstream, and my interest was basically to keep alex4 (which is
cool) in Debian, but since it migrated to non-free, my interest in
allegro4.4 has waned. So, if anybody would like to still see allegro4.4
in Debian, please step up now and help out. Since it is dead upstream,
my reasoning is that it is better to remove it from Debian if no
maintainer who wants to help steps up.

Previously Tobias Hansen has helped out, but now it is 8 (!) years
since his last upload of either package.(Please don’t interpret this as
judgement, I am very happy for the help he has provided and all the
work he has done on the packages).

Allegro5 is another deal – still active upstream, and I have kept it up
to date in Debian, and while I have held the latest upload a short while
because of the time_t transition, it will come sooner or later – There
I am also waiting on a final decision on this bug from upstream. Other than
that allegro 5 is in a very good state, and I will keep maintaining it
as long as I can. But help would of course be appreciated on allegro5
too.

Categories: FLOSS Project Planets

Petter Reinholdtsen: Time to move orphaned Debian packages to git

Sun, 2024-04-14 03:30

There are several packages in Debian without a associated git repository with the packaging history. This is unfortunate and it would be nice if more of these would do so. Quote a lot of these are without a maintainer, ie listed as maintained by the 'Debian QA Group' place holder. In fact, 438 packages have this property according to UDD (SELECT source FROM sources WHERE release = 'sid' AND (vcs_url ilike '%anonscm.debian.org%' OR vcs_browser ilike '%anonscm.debian.org%' or vcs_url IS NULL OR vcs_browser IS NULL) AND maintainer ilike '%packages@qa.debian.org%';). Such packages can be updated without much coordination by any Debian developer, as they are considered orphaned.

To try to improve the situation and reduce the number of packages without associated git repository, I started a few days ago to search out candiates and provide them with a git repository under the 'debian' collaborative Salsa project. I started with the packages pointing to obsolete Alioth git repositories, and am now working my way across the ones completely without git references. In addition to updating the Vcs-* debian/control fields, I try to update Standards-Version, debhelper compat level, simplify d/rules, switch to Rules-Requires-Root: no and fix lintian issues reported. I only implement those that are trivial to fix, to avoid spending too much time on each orphaned package. So far my experience is that it take aproximately 20 minutes to convert a package without any git references, and a lot more for packages with existing git repositories incompatible with git-buildpackages.

So far I have converted 10 packages, and I will keep going until I run out of steam. As should be clear from the numbers, there is enough packages remaining for more people to do the same without stepping on each others toes. I find it useful to start by searching for a git repo already on salsa, as I find that some times a git repo has already been created, but no new version is uploaded to Debian yet. In those cases I start with the existing git repository. I convert to the git-buildpackage+pristine-tar workflow, and ensure a debian/gbp.conf file with "pristine-tar=True" is added early, to avoid uploading a orig.tar.gz with the wrong checksum by mistake. Did that three times in the begin before I remembered my mistake.

So, if you are a Debian Developer and got some spare time, perhaps considering migrating some orphaned packages to git?

As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

Categories: FLOSS Project Planets

Simon Josefsson: Reproducible and minimal source-only tarballs

Sat, 2024-04-13 12:44

With the release of Libntlm version 1.8 the release tarball can be reproduced on several distributions. We also publish a signed minimal source-only tarball, produced by git-archive which is the same format used by Savannah, Codeberg, GitLab, GitHub and others. Reproducibility of both tarballs are tested continuously for regressions on GitLab through a CI/CD pipeline. If that wasn’t enough to excite you, the Debian packages of Libntlm are now built from the reproducible minimal source-only tarball. The resulting binaries are hopefully reproducible on several architectures.

What does that even mean? Why should you care? How you can do the same for your project? What are the open issues? Read on, dear reader…

This article describes my practical experiments with reproducible release artifacts, following up on my earlier thoughts that lead to discussion on Fosstodon and a patch by Janneke Nieuwenhuizen to make Guix tarballs reproducible that inspired me to some practical work.

Let’s look at how a maintainer release some software, and how a user can reproduce the released artifacts from the source code. Libntlm provides a shared library written in C and uses GNU Make, GNU Autoconf, GNU Automake, GNU Libtool and gnulib for build management, but these ideas should apply to most project and build system. The following illustrate the steps a maintainer would take to prepare a release:

git clone https://gitlab.com/gsasl/libntlm.git cd libntlm git checkout v1.8 ./bootstrap ./configure make distcheck gpg -b libntlm-1.8.tar.gz

The generated files libntlm-1.8.tar.gz and libntlm-1.8.tar.gz.sig are published, and users download and use them. This is how the GNU project have been doing releases since the late 1980’s. That is a testament to how successful this pattern has been! These tarballs contain source code and some generated files, typically shell scripts generated by autoconf, makefile templates generated by automake, documentation in formats like Info, HTML, or PDF. Rarely do they contain binary object code, but historically that happened.

The XZUtils incident illustrate that tarballs with files that are not included in the git archive offer an opportunity to disguise malicious backdoors. I blogged earlier how to mitigate this risk by using signed minimal source-only tarballs.

The risk of hiding malware is not the only motivation to publish signed minimal source-only tarballs. With pre-generated content in tarballs, there is a risk that GNU/Linux distributions such as Trisquel, Guix, Debian/Ubuntu or Fedora ship generated files coming from the tarball into the binary *.deb or *.rpm package file. Typically the person packaging the upstream project never realized that some installed artifacts was not re-built through a typical autoconf -fi && ./configure && make install sequence, and never wrote the code to rebuild everything. This can also happen if the build rules are written but are buggy, shipping the old artifact. When a security problem is found, this can lead to time-consuming situations, as it may be that patching the relevant source code and rebuilding the package is not sufficient: the vulnerable generated object from the tarball would be shipped into the binary package instead of a rebuilt artifact. For architecture-specific binaries this rarely happens, since object code is usually not included in tarballs — although for 10+ years I shipped the binary Java JAR file in the GNU Libidn release tarball, until I stopped shipping it. For interpreted languages and especially for generated content such as HTML, PDF, shell scripts this happens more than you would like.

Publishing minimal source-only tarballs enable easier auditing of a project’s code, to avoid the need to read through all generated files looking for malicious content. I have taken care to generate the source-only minimal tarball using git-archive. This is the same format that GitLab, GitHub etc offer for the automated download links on git tags. The minimal source-only tarballs can thus serve as a way to audit GitLab and GitHub download material! Consider if/when hosting sites like GitLab or GitHub has a security incident that cause generated tarballs to include a backdoor that is not present in the git repository. If people rely on the tag download artifact without verifying the maintainer PGP signature using GnuPG, this can lead to similar backdoor scenarios that we had for XZUtils but originated with the hosting provider instead of the release manager. This is even more concerning, since this attack can be mounted for some selected IP address that you want to target and not on everyone, thereby making it harder to discover.

With all that discussion and rationale out of the way, let’s return to the release process. I have added another step here:

make srcdist gpg -b libntlm-1.8-src.tar.gz

Now the release is ready. I publish these four files in the Libntlm’s Savannah Download area, but they can be uploaded to a GitLab/GitHub release area as well. These are the SHA256 checksums I got after building the tarballs on my Trisquel 11 aramo laptop:

91de864224913b9493c7a6cec2890e6eded3610d34c3d983132823de348ec2ca libntlm-1.8-src.tar.gz ce6569a47a21173ba69c990965f73eb82d9a093eb871f935ab64ee13df47fda1 libntlm-1.8.tar.gz

So how can you reproduce my artifacts? Here is how to reproduce them in a Ubuntu 22.04 container:

podman run -it --rm ubuntu:22.04 apt-get update apt-get install -y --no-install-recommends autoconf automake libtool make git ca-certificates git clone https://gitlab.com/gsasl/libntlm.git cd libntlm git checkout v1.8 ./bootstrap ./configure make dist srcdist sha256sum libntlm-*.tar.gz

You should see the exact same SHA256 checksum values. Hooray!

This works because Trisquel 11 and Ubuntu 22.04 uses the same version of git, autoconf, automake, and libtool. These tools do not guarantee the same output content for all versions, similar to how GNU GCC does not generate the same binary output for all versions. So there is still some delicate version pairing needed.

Ideally, the artifacts should be possible to reproduce from the release artifacts themselves, and not only directly from git. It is possible to reproduce the full tarball in a AlmaLinux 8 container – replace almalinux:8 with rockylinux:8 if you prefer RockyLinux:

podman run -it --rm almalinux:8 dnf update -y dnf install -y make wget gcc wget https://download.savannah.nongnu.org/releases/libntlm/libntlm-1.8.tar.gz tar xfa libntlm-1.8.tar.gz cd libntlm-1.8 ./configure make dist sha256sum libntlm-1.8.tar.gz

The source-only minimal tarball can be regenerated on Debian 11:

podman run -it --rm debian:11 apt-get update apt-get install -y --no-install-recommends make git ca-certificates git clone https://gitlab.com/gsasl/libntlm.git cd libntlm git checkout v1.8 make -f cfg.mk srcdist sha256sum libntlm-1.8-src.tar.gz

As the Magnus Opus or chef-d’œuvre, let’s recreate the full tarball directly from the minimal source-only tarball on Trisquel 11 – replace docker.io/kpengboy/trisquel:11.0 with ubuntu:22.04 if you prefer.

podman run -it --rm docker.io/kpengboy/trisquel:11.0 apt-get update apt-get install -y --no-install-recommends autoconf automake libtool make wget git ca-certificates wget https://download.savannah.nongnu.org/releases/libntlm/libntlm-1.8-src.tar.gz tar xfa libntlm-1.8-src.tar.gz cd libntlm-v1.8 ./bootstrap ./configure make dist sha256sum libntlm-1.8.tar.gz

Yay! You should now have great confidence in that the release artifacts correspond to what’s in version control and also to what the maintainer intended to release. Your remaining job is to audit the source code for vulnerabilities, including the source code of the dependencies used in the build. You no longer have to worry about auditing the release artifacts.

I find it somewhat amusing that the build infrastructure for Libntlm is now in a significantly better place than the code itself. Libntlm is written in old C style with plenty of string manipulation and uses broken cryptographic algorithms such as MD4 and single-DES. Remember folks: solving supply chain security issues has no bearing on what kind of code you eventually run. A clean gun can still shoot you in the foot.

Side note on naming: GitLab exports tarballs with pathnames libntlm-v1.8/ (i.e.., PROJECT-TAG/) and I’ve adopted the same pathnames, which means my libntlm-1.8-src.tar.gz tarballs are bit-by-bit identical to GitLab’s exports and you can verify this with tools like diffoscope. GitLab name the tarball libntlm-v1.8.tar.gz (i.e., PROJECT-TAG.ARCHIVE) which I find too similar to the libntlm-1.8.tar.gz that we also publish. GitHub uses the same git archive style, but unfortunately they have logic that removes the ‘v’ in the pathname so you will get a tarball with pathname libntlm-1.8/ instead of libntlm-v1.8/ that GitLab and I use. The content of the tarball is bit-by-bit identical, but the pathname and archive differs. Codeberg (running Forgejo) uses another approach: the tarball is called libntlm-v1.8.tar.gz (after the tag) just like GitLab, but the pathname inside the archive is libntlm/, otherwise the produced archive is bit-by-bit identical including timestamps. Savannah’s CGIT interface uses archive name libntlm-1.8.tar.gz with pathname libntlm-1.8/, but otherwise file content is identical. Savannah’s GitWeb interface provides snapshot links that are named after the git commit (e.g., libntlm-a812c2ca.tar.gz with libntlm-a812c2ca/) and I cannot find any tag-based download links at all. Overall, we are so close to get SHA256 checksum to match, but fail on pathname within the archive. I’ve chosen to be compatible with GitLab regarding the content of tarballs but not on archive naming. From a simplicity point of view, it would be nice if everyone used PROJECT-TAG.ARCHIVE for the archive filename and PROJECT-TAG/ for the pathname within the archive. This aspect will probably need more discussion.

Side note on git archive output: It seems different versions of git archive produce different results for the same repository. The version of git in Debian 11, Trisquel 11 and Ubuntu 22.04 behave the same. The version of git in Debian 12, AlmaLinux/RockyLinux 8/9, Alpine, ArchLinux, macOS homebrew, and upcoming Ubuntu 24.04 behave in another way. Hopefully this will not change that often, but this would invalidate reproducibility of these tarballs in the future, forcing you to use an old git release to reproduce the source-only tarball. Alas, GitLab and most other sites appears to be using modern git so the download tarballs from them would not match my tarballs – even though the content would.

Side note on ChangeLog: ChangeLog files were traditionally manually curated files with version history for a package. In recent years, several projects moved to dynamically generate them from git history (using tools like git2cl or gitlog-to-changelog). This has consequences for reproducibility of tarballs: you need to have the entire git history available! The gitlog-to-changelog tool also output different outputs depending on the time zone of the person using it, which arguable is a simple bug that can be fixed. However this entire approach is incompatible with rebuilding the full tarball from the minimal source-only tarball. It seems Libntlm’s ChangeLog file died on the surgery table here.

So how would a distribution build these minimal source-only tarballs? I happen to help on the libntlm package in Debian. It has historically used the generated tarballs as the source code to build from. This means that code coming from gnulib is vendored in the tarball. When a security problem is discovered in gnulib code, the security team needs to patch all packages that include that vendored code and rebuild them, instead of merely patching the gnulib package and rebuild all packages that rely on that particular code. To change this, the Debian libntlm package needs to Build-Depends on Debian’s gnulib package. But there was one problem: similar to most projects that use gnulib, Libntlm depend on a particular git commit of gnulib, and Debian only ship one commit. There is no coordination about which commit to use. I have adopted gnulib in Debian, and add a git bundle to the *_all.deb binary package so that projects that rely on gnulib can pick whatever commit they need. This allow an no-network GNULIB_URL and GNULIB_REVISION approach when running Libntlm’s ./bootstrap with the Debian gnulib package installed. Otherwise libntlm would pick up whatever latest version of gnulib that Debian happened to have in the gnulib package, which is not what the Libntlm maintainer intended to be used, and can lead to all sorts of version mismatches (and consequently security problems) over time. Libntlm in Debian is developed and tested on Salsa and there is continuous integration testing of it as well, thanks to the Salsa CI team.

Side note on git bundles: unfortunately there appears to be no reproducible way to export a git repository into one or more files. So one unfortunate consequence of all this work is that the gnulib *.orig.tar.gz tarball in Debian is not reproducible any more. I have tried to get Git bundles to be reproducible but I never got it to work — see my notes in gnulib’s debian/README.source on this aspect. Of course, source tarball reproducibility has nothing to do with binary reproducibility of gnulib in Debian itself, fortunately.

One open question is how to deal with the increased build dependencies that is triggered by this approach. Some people are surprised by this but I don’t see how to get around it: if you depend on source code for tools in another package to build your package, it is a bad idea to hide that dependency. We’ve done it for a long time through vendored code in non-minimal tarballs. Libntlm isn’t the most critical project from a bootstrapping perspective, so adding git and gnulib as Build-Depends to it will probably be fine. However, consider if this pattern was used for other packages that uses gnulib such as coreutils, gzip, tar, bison etc (all are using gnulib) then they would all Build-Depends on git and gnulib. Cross-building those packages for a new architecture will therefor require git on that architecture first, which gets circular quick. The dependency on gnulib is real so I don’t see that going away, and gnulib is a Architecture:all package. However, the dependency on git is merely a consequence of how the Debian gnulib package chose to make all gnulib git commits available to projects: through a git bundle. There are other ways to do this that doesn’t require the git tool to extract the necessary files, but none that I found practical — ideas welcome!

Finally some brief notes on how this was implementated. Enabling bootstrappable source-only minimal tarballs via gnulib’s ./bootstrap is achieved by using the GNULIB_REVISION mechanism, locking down the gnulib commit used. I have always disliked git submodules because they add extra steps and has complicated interaction with CI/CD. The reason why I gave up git submodules now is because the particular commit to use is not recorded in the git archive output when git submodules is used. So the particular gnulib commit has to be mentioned explicitly in some source code that goes into the git archive tarball. Colin Watson added the GNULIB_REVISION approach to ./bootstrap back in 2018, and now it no longer made sense to continue to use a gnulib git submodule. One alternative is to use ./bootstrap with --gnulib-srcdir or --gnulib-refdir if there is some practical problem with the GNULIB_URL towards a git bundle the GNULIB_REVISION in bootstrap.conf.

The srcdist make rule is simple:

git archive --prefix=libntlm-v1.8/ -o libntlm-v1.8.tar.gz HEAD

Making the make dist generated tarball reproducible can be more complicated, however for Libntlm it was sufficient to make sure the modification times of all files were set deterministically to a timestamp found in the git repository. Interestingly there seems to be a couple of different ways to accomplish this, Guix doesn’t support minimal source-only tarballs but rely on a .tarball-timestamp file inside the tarball. Paul Eggert explained what TZDB is using some time ago. The approach I’m using now is fairly similar to the one I suggested over a year ago.

Doing continous testing of all this is critical to make sure things don’t regress. Libntlm’s pipeline definition now produce the generated libntlm-*.tar.gz tarballs and a checksum as a build artifact. Then I added the 000-reproducability job which compares the checksums and fails on mismatches. You can read its delicate output in the job for the v1.8 release. Right now we insists that builds on Trisquel 11 match Ubuntu 22.04, that PureOS 10 builds match Debian 11 builds, that AlmaLinux 8 builds match RockyLinux 8 builds, and AlmaLinux 9 builds match RockyLinux 9 builds. As you can see in pipeline job output, not all platforms lead to the same tarballs, but hopefully this state can be improved over time. There is also partial reproducibility, where the full tarball is reproducible across two distributions but not the minimal tarball, or vice versa.

If this way of working plays out well, I hope to implement it in other projects too.

What do you think? Happy Hacking!

Categories: FLOSS Project Planets

Paul Tagliamonte: Domo Arigato, Mr. debugfs

Sat, 2024-04-13 09:27

Years ago, at what I think I remember was DebConf 15, I hacked for a while on debhelper to write build-ids to debian binary control files, so that the build-id (more specifically, the ELF note .note.gnu.build-id) wound up in the Debian apt archive metadata. I’ve always thought this was super cool, and seeing as how Michael Stapelberg blogged some great pointers around the ecosystem, including the fancy new debuginfod service, and the find-dbgsym-packages helper, which uses these same headers, I don’t think I’m the only one.

At work I’ve been using a lot of rust, specifically, async rust using tokio. To try and work on my style, and to dig deeper into the how and why of the decisions made in these frameworks, I’ve decided to hack up a project that I’ve wanted to do ever since 2015 – write a debug filesystem. Let’s get to it.

Back to the Future It shouldn't shock anyone to learn I'm a huge fan of Go, right?

Time to admit something. I really love Plan 9. It’s just so good. So many ideas from Plan 9 are just so prescient, and everything just feels right. Not just right like, feels good – like, correct. The bit that I’ve always liked the most is 9p, the network protocol for serving a filesystem over a network. This leads to all sorts of fun programs, like the Plan 9 ftp client being a 9p server – you mount the ftp server and access files like any other files. It’s kinda like if fuse were more fully a part of how the operating system worked, but fuse is all running client-side. With 9p there’s a single client, and different servers that you can connect to, which may be backed by a hard drive, remote resources over something like SFTP, FTP, HTTP or even purely synthetic.

I even triggered a weird bug in vim when writing a 9p filesystem that wound up impacting WSL -- although it seems like maybe not due to 9p (rather, SMB)

The interesting (maybe sad?) part here is that 9p wound up outliving Plan 9 in terms of adoption – 9p is in all sorts of places folks don’t usually expect. For instance, the Windows Subsystem for Linux uses the 9p protocol to share files between Windows and Linux. ChromeOS uses it to share files with Crostini, and qemu uses 9p (virtio-p9) to share files between guest and host. If you’re noticing a pattern here, you’d be right; for some reason 9p is the go-to protocol to exchange files between hypervisor and guest. Why? I have no idea, except maybe due to being designed well, simple to implement, and it’s a lot easier to validate the data being shared and validate security boundaries. Simplicity has its value.

As a result, there’s a lot of lingering 9p support kicking around. Turns out Linux can even handle mounting 9p filesystems out of the box. This means that I can deploy a filesystem to my LAN or my localhost by running a process on top of a computer that needs nothing special, and mount it over the network on an unmodified machine – unlike fuse, where you’d need client-specific software to run in order to mount the directory. For instance, let’s mount a 9p filesystem running on my localhost machine, serving requests on 127.0.0.1:564 (tcp) that goes by the name “mountpointname” to /mnt.

Unfortunately, this requires root to mount and feels very un-plan9, but it does work and the protocol is good. $ mount -t 9p \ -o trans=tcp,port=564,version=9p2000.u,aname=mountpointname \ 127.0.0.1 \ /mnt

Linux will mount away, and attach to the filesystem as the root user, and by default, attach to that mountpoint again for each local user that attempts to use it. Nifty, right? I think so. The server is able to keep track of per-user access and authorization along with the host OS.

WHEREIN I STYX WITH IT "Simple" here is intended as my highest form of praise. Writing complex things is easy. Taking your work, and simplifying it down the core is the most difficult part of our work.

Since I wanted to push myself a bit more with rust and tokio specifically, I opted to implement the whole stack myself, without third party libraries on the critical path where I could avoid it. The 9p protocol (sometimes called Styx, the original name for it) is incredibly simple. It’s a series of client to server requests, which receive a server to client response. These are, respectively, “T” messages, which transmit a request to the server, which trigger an “R” message in response (Reply messages). These messages are TLV payload with a very straight forward structure – so straight forward, in fact, that I was able to implement a working server off nothing more than a handful of man pages.

There's also a 9P2000.L 9p variant which has more Linux specific extensions. There's a good chance I port this forward when I get the chance.

Later on after the basics worked, I found a more complete spec page that contains more information about the unix specific variant that I opted to use (9P2000.u rather than 9P2000) due to the level of Linux specific support for the 9P2000.u variant over the 9P2000 protocol.

MR ROBOTO It really bothers me rust libraries that deal with I/O need to support std::io, but to add support for async runtimes, you need to implement support for tokio::io and every other runtime; but them's the breaks I guess. I really miss Go's built-in async support and io module.

The backend stack over at zoo is rust and tokio running i/o for an HTTP and WebRTC server. I figured I’d pick something fairly similar to write my filesystem with, since 9P can be implemented on basically anything with I/O. That means tokio tcp server bits, which construct and use a 9p server, which has an idiomatic Rusty API that partially abstracts the raw R and T messages, but not so much as to cause issues with hiding implementation possibilities. At each abstraction level, there’s an escape hatch – allowing someone to implement any of the layers if required. I called this framework arigato which can be found over on docs.rs and crates.io.

/// Simplified version of the arigato File trait; this isn't actually /// the same trait; there's some small cosmetic differences. The /// actual trait can be found at: /// /// https://docs.rs/arigato/latest/arigato/server/trait.File.html trait File { /// OpenFile is the type returned by this File via an Open call. type OpenFile: OpenFile; /// Return the 9p Qid for this file. A file is the same if the Qid is /// the same. A Qid contains information about the mode of the file, /// version of the file, and a unique 64 bit identifier. fn qid(&self) -> Qid; /// Construct the 9p Stat struct with metadata about a file. async fn stat(&self) -> FileResult<Stat>; /// Attempt to update the file metadata. async fn wstat(&mut self, s: &Stat) -> FileResult<()>; /// Traverse the filesystem tree. async fn walk(&self, path: &[&str]) -> FileResult<(Option<Self>, Vec<Self>)>; /// Request that a file's reference be removed from the file tree. async fn unlink(&mut self) -> FileResult<()>; /// Create a file at a specific location in the file tree. async fn create( &mut self, name: &str, perm: u16, ty: FileType, mode: OpenMode, extension: &str, ) -> FileResult<Self>; /// Open the File, returning a handle to the open file, which handles /// file i/o. This is split into a second type since it is genuinely /// unrelated -- and the fact that a file is Open or Closed can be /// handled by the `arigato` server for us. async fn open(&mut self, mode: OpenMode) -> FileResult<Self::OpenFile>; } /// Simplified version of the arigato OpenFile trait; this isn't actually /// the same trait; there's some small cosmetic differences. The /// actual trait can be found at: /// /// https://docs.rs/arigato/latest/arigato/server/trait.OpenFile.html trait OpenFile { /// iounit to report for this file. The iounit reported is used for Read /// or Write operations to signal, if non-zero, the maximum size that is /// guaranteed to be transferred atomically. fn iounit(&self) -> u32; /// Read some number of bytes up to `buf.len()` from the provided /// `offset` of the underlying file. The number of bytes read is /// returned. async fn read_at( &mut self, buf: &mut [u8], offset: u64, ) -> FileResult<u32>; /// Write some number of bytes up to `buf.len()` from the provided /// `offset` of the underlying file. The number of bytes written /// is returned. fn write_at( &mut self, buf: &mut [u8], offset: u64, ) -> FileResult<u32>; } Thanks, decade ago paultag! If this isn't my record for longest idea-to-wip-project time, it's close.

Let’s do it! Let’s use arigato to implement a 9p filesystem we’ll call debugfs that will serve all the debug files shipped according to the Packages metadata from the apt archive. We’ll fetch the Packages file and construct a filesystem based on the reported Build-Id entries. For those who don’t know much about how an apt repo works, here’s the 2-second crash course on what we’re doing. The first is to fetch the Packages file, which is specific to a binary architecture (such as amd64, arm64 or riscv64). That architecture is specific to a component (such as main, contrib or non-free). That component is specific to a suite, such as stable, unstable or any of its aliases (bullseye, bookworm, etc). Let’s take a look at the Packages.xz file for the unstable-debug suite, main component, for all amd64 binaries.

$ curl \ https://deb.debian.org/debian-debug/dists/unstable-debug/main/binary-amd64/Packages.xz \ | unxz

This will return the Debian-style rfc2822-like headers, which is an export of the metadata contained inside each .deb file which apt (or other tools that can use the apt repo format) use to fetch information about debs. Let’s take a look at the debug headers for the netlabel-tools package in unstable – which is a package named netlabel-tools-dbgsym in unstable-debug.

Package: netlabel-tools-dbgsym Source: netlabel-tools (0.30.0-1) Version: 0.30.0-1+b1 Installed-Size: 79 Maintainer: Paul Tagliamonte <paultag@debian.org> Architecture: amd64 Depends: netlabel-tools (= 0.30.0-1+b1) Description: debug symbols for netlabel-tools Auto-Built-Package: debug-symbols Build-Ids: e59f81f6573dadd5d95a6e4474d9388ab2777e2a Description-md5: a0e587a0cf730c88a4010f78562e6db7 Section: debug Priority: optional Filename: pool/main/n/netlabel-tools/netlabel-tools-dbgsym_0.30.0-1+b1_amd64.deb Size: 62776 SHA256: 0e9bdb087617f0350995a84fb9aa84541bc4df45c6cd717f2157aa83711d0c60

So here, we can parse the package headers in the Packages.xz file, and store, for each Build-Id, the Filename where we can fetch the .deb at. Each .deb contains a number of files – but we’re only really interested in the files inside the .deb located at or under /usr/lib/debug/.build-id/, which you can find in debugfs under rfc822.rs. It’s crude, and very single-purpose, but I’m feeling a bit lazy.

Who needs dpkg?! Hilariously, the fourth? fifth? non-serious time (second serious time) I've had to do this for a new language.

For folks who haven’t seen it yet, a .deb file is a special type of .ar file, that contains (usually) three files inside – debian-binary, control.tar.xz and data.tar.xz. The core of an .ar file is a fixed size (60 byte) entry header, followed by the specified size number of bytes.

[8 byte .ar file magic] [60 byte entry header] [N bytes of data] [60 byte entry header] [N bytes of data] [60 byte entry header] [N bytes of data] ... I can't believe it's already been over a decade since my NM process, and nearly 16 years since I became an Ubuntu member.

First up was to implement a basic ar parser in ar.rs. Before we get into using it to parse a deb, as a quick diversion, let’s break apart a .deb file by hand – something that is a bit of a rite of passage (or at least it used to be? I’m getting old) during the Debian nm (new member) process, to take a look at where exactly the .debug file lives inside the .deb file.

$ ar x netlabel-tools-dbgsym_0.30.0-1+b1_amd64.deb $ ls control.tar.xz debian-binary data.tar.xz netlabel-tools-dbgsym_0.30.0-1+b1_amd64.deb $ tar --list -f data.tar.xz | grep '.debug$' ./usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug

Since we know quite a bit about the structure of a .deb file, and I had to implement support from scratch anyway, I opted to implement a (very!) basic debfile parser using HTTP Range requests. HTTP Range requests, if supported by the server (denoted by a accept-ranges: bytes HTTP header in response to an HTTP HEAD request to that file) means that we can add a header such as range: bytes=8-68 to specifically request that the returned GET body be the byte range provided (in the above case, the bytes starting from byte offset 8 until byte offset 68). This means we can fetch just the ar file entry from the .deb file until we get to the file inside the .deb we are interested in (in our case, the data.tar.xz file) – at which point we can request the body of that file with a final range request. I wound up writing a struct to handle a read_at-style API surface in hrange.rs, which we can pair with ar.rs above and start to find our data in the .deb remotely without downloading and unpacking the .deb at all.

I really like HTTP Range requests a lot. I did some stats to figure out what compression dbgsym packages use these days; my LAN debug mirror contains 113459 xz compressed tarfiles, and 9 gzip compressed tarfiles at the time of writing.

After we have the body of the data.tar.xz coming back through the HTTP response, we get to pipe it through an xz decompressor (this kinda sucked in Rust, since a tokio AsyncRead is not the same as an http Body response is not the same as std::io::Read, is not the same as an async (or sync) Iterator is not the same as what the xz2 crate expects; leading me to read blocks of data to a buffer and stuff them through the decoder by looping over the buffer for each lzma2 packet in a loop), and tarfile parser (similarly troublesome). From there we get to iterate over all entries in the tarfile, stopping when we reach our file of interest. Since we can’t seek, but gdb needs to, we’ll pull it out of the stream into a Cursor<Vec<u8>> in-memory and pass a handle to it back to the user.

From here on out its a matter of gluing together a File traited struct in debugfs, and serving the filesystem over TCP using arigato. Done deal!

A quick diversion about compression

I was originally hoping to avoid transferring the whole tar file over the network (and therefore also reading the whole debug file into ram, which objectively sucks), but quickly hit issues with figuring out a way around seeking around an xz file. What’s interesting is xz has a great primitive to solve this specific problem (specifically, use a block size that allows you to seek to the block as close to your desired seek position just before it, only discarding at most block size - 1 bytes), but data.tar.xz files generated by dpkg appear to have a single mega-huge block for the whole file. I don’t know why I would have expected any different, in retrospect. That means that this now devolves into the base case of “How do I seek around an lzma2 compressed data stream”; which is a lot more complex of a question.

After going through a lot of this, I realized just how complex the xz format is -- it's a lot more than just lzma2!

Thankfully, notoriously brilliant tianon was nice enough to introduce me to Jon Johnson who did something super similar – adapted a technique to seek inside a compressed gzip file, which lets his service oci.dag.dev seek through Docker container images super fast based on some prior work such as soci-snapshotter, gztool, and zran.c. He also pulled this party trick off for apk based distros over at apk.dag.dev, which seems apropos. Jon was nice enough to publish a lot of his work on this specifically in a central place under the name “targz” on his GitHub, which has been a ton of fun to read through.

The gist is that, by dumping the decompressor’s state (window of previous bytes, in-memory data derived from the last N-1 bytes) at specific “checkpoints” along with the compressed data stream offset in bytes and decompressed offset in bytes, one can seek to that checkpoint in the compressed stream and pick up where you left off – creating a similar “block” mechanism against the wishes of gzip. It means you’d need to do an O(n) run over the file, but every request after that will be sped up according to the number of checkpoints you’ve taken.

Given the complexity of xz and lzma2, I don’t think this is possible for me at the moment – especially given most of the files I’ll be requesting will not be loaded from again – especially when I can “just” cache the debug header by Build-Id. I want to implement this (because I’m generally curious and Jon has a way of getting someone excited about compression schemes, which is not a sentence I thought I’d ever say out loud), but for now I’m going to move on without this optimization. Such a shame, since it kills a lot of the work that went into seeking around the .deb file in the first place, given the debian-binary and control.tar.gz members are so small.

The Good

First, the good news right? It works! That’s pretty cool. I’m positive my younger self would be amused and happy to see this working; as is current day paultag. Let’s take debugfs out for a spin! First, we need to mount the filesystem. It even works on an entirely unmodified, stock Debian box on my LAN, which is huge. Let’s take it for a spin:

$ mount \ -t 9p \ -o trans=tcp,version=9p2000.u,aname=unstable-debug \ 192.168.0.2 \ /usr/lib/debug/.build-id/

And, let’s prove to ourselves that this actually mounted before we go trying to use it:

$ mount | grep build-id 192.168.0.2 on /usr/lib/debug/.build-id type 9p (rw,relatime,aname=unstable-debug,access=user,trans=tcp,version=9p2000.u,port=564)

Slick. We’ve got an open connection to the server, where our host will keep a connection alive as root, attached to the filesystem provided in aname. Let’s take a look at it.

$ ls /usr/lib/debug/.build-id/ 00 0d 1a 27 34 41 4e 5b 68 75 82 8E 9b a8 b5 c2 CE db e7 f3 01 0e 1b 28 35 42 4f 5c 69 76 83 8f 9c a9 b6 c3 cf dc E7 f4 02 0f 1c 29 36 43 50 5d 6a 77 84 90 9d aa b7 c4 d0 dd e8 f5 03 10 1d 2a 37 44 51 5e 6b 78 85 91 9e ab b8 c5 d1 de e9 f6 04 11 1e 2b 38 45 52 5f 6c 79 86 92 9f ac b9 c6 d2 df ea f7 05 12 1f 2c 39 46 53 60 6d 7a 87 93 a0 ad ba c7 d3 e0 eb f8 06 13 20 2d 3a 47 54 61 6e 7b 88 94 a1 ae bb c8 d4 e1 ec f9 07 14 21 2e 3b 48 55 62 6f 7c 89 95 a2 af bc c9 d5 e2 ed fa 08 15 22 2f 3c 49 56 63 70 7d 8a 96 a3 b0 bd ca d6 e3 ee fb 09 16 23 30 3d 4a 57 64 71 7e 8b 97 a4 b1 be cb d7 e4 ef fc 0a 17 24 31 3e 4b 58 65 72 7f 8c 98 a5 b2 bf cc d8 E4 f0 fd 0b 18 25 32 3f 4c 59 66 73 80 8d 99 a6 b3 c0 cd d9 e5 f1 fe 0c 19 26 33 40 4d 5a 67 74 81 8e 9a a7 b4 c1 ce da e6 f2 ff

Outstanding. Let’s try using gdb to debug a binary that was provided by the Debian archive, and see if it’ll load the ELF by build-id from the right .deb in the unstable-debug suite:

$ gdb -q /usr/sbin/netlabelctl Reading symbols from /usr/sbin/netlabelctl... Reading symbols from /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug... (gdb)

Yes! Yes it will!

$ file /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter *empty*, BuildID[sha1]=e59f81f6573dadd5d95a6e4474d9388ab2777e2a, for GNU/Linux 3.2.0, with debug_info, not stripped The Bad

Linux’s support for 9p is mainline, which is great, but it’s not robust. Network issues or server restarts will wedge the mountpoint (Linux can’t reconnect when the tcp connection breaks), and things that work fine on local filesystems get translated in a way that causes a lot of network chatter – for instance, just due to the way the syscalls are translated, doing an ls, will result in a stat call for each file in the directory, even though linux had just got a stat entry for every file while it was resolving directory names. On top of that, Linux will serialize all I/O with the server, so there’s no concurrent requests for file information, writes, or reads pending at the same time to the server; and read and write throughput will degrade as latency increases due to increasing round-trip time, even though there are offsets included in the read and write calls. It works well enough, but is frustrating to run up against, since there’s not a lot you can do server-side to help with this beyond implementing the 9P2000.L variant (which, maybe is worth it).

The Ugly

Unfortunately, we don’t know the file size(s) until we’ve actually opened the underlying tar file and found the correct member, so for most files, we don’t know the real size to report when getting a stat. We can’t parse the tarfiles for every stat call, since that’d make ls even slower (bummer). Only hiccup is that when I report a filesize of zero, gdb throws a bit of a fit; let’s try with a size of 0 to start:

$ ls -lah /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug -r--r--r-- 1 root root 0 Dec 31 1969 /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug $ gdb -q /usr/sbin/netlabelctl Reading symbols from /usr/sbin/netlabelctl... Reading symbols from /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug... warning: Discarding section .note.gnu.build-id which has a section size (24) larger than the file size [in module /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug] [...]

This obviously won’t work since gdb will throw away all our hard work because of stat’s output, and neither will loading the real size of the underlying file. That only leaves us with hardcoding a file size and hope nothing else breaks significantly as a result. Let’s try it again:

$ ls -lah /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug -r--r--r-- 1 root root 954M Dec 31 1969 /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug $ gdb -q /usr/sbin/netlabelctl Reading symbols from /usr/sbin/netlabelctl... Reading symbols from /usr/lib/debug/.build-id/e5/9f81f6573dadd5d95a6e4474d9388ab2777e2a.debug... (gdb)

Much better. I mean, terrible but better. Better for now, anyway.

Kilroy was here

Do I think this is a particularly good idea? I mean; kinda. I’m probably going to make some fun 9p arigato-based filesystems for use around my LAN, but I don’t think I’ll be moving to use debugfs until I can figure out how to ensure the connection is more resilient to changing networks, server restarts and fixes on i/o performance. I think it was a useful exercise and is a pretty great hack, but I don’t think this’ll be shipping anywhere anytime soon.

Along with me publishing this post, I’ve pushed up all my repos; so you should be able to play along at home! There’s a lot more work to be done on arigato; but it does handshake and successfully export a working 9P2000.u filesystem. Check it out on on my github at arigato, debugfs and also on crates.io and docs.rs.

At least I can say I was here and I got it working after all these years.

Categories: FLOSS Project Planets

Russell Coker: Software Needed for Work

Sat, 2024-04-13 03:08

When I first started studying computer science setting up a programming project was easy, write source code files and a Makefile and that was it. IRC was the only IM system and email was the only other communications system that was used much. Writing Makefiles is difficult but products like the Borland Turbo series of IDEs did all that for you so you could just start typing code and press a function key to compile and run (F5 from memory).

Over the years the requirements and expectations of computer use have grown significantly. The typical office worker is now doing many more things with computers than serious programmers used to do. Running an IM system, an online document editing system, and a series of web apps is standard for companies nowadays. Developers have to do all that in addition to tools for version control, continuous integration, bug reporting, and feature tracking. The development process is also more complex with extra steps for reproducible builds, automated tests, and code coverage metrics for the tests. I wonder how many programmers who started in the 90s would have done something else if faced with Github as their introduction.

How much of this is good? Having the ability to send instant messages all around the world is great. Having dozens of different ways of doing so is awful. When a company uses multiple IM systems such as MS-Teams and Slack and forces some of it’s employees to use them both it’s getting ridiculous. Having different friend groups on different IM systems is anti-social networking. In the EU the Digital Markets Act [1] forces some degree of interoperability between different IM systems and as it’s impossible to know who’s actually in the EU that will end up being world-wide.

In corporations document management often involves multiple ways of storing things, you have Google Docs, MS Office online, hosted Wikis like Confluence, and more. Large companies tend to use several such systems which means that people need to learn multiple systems to be able to work and they also need to know which systems are used by the various groups that they communicate with. Microsoft deserves some sort of award for the range of ways they have for managing documents, Sharepoint, OneDrive, Office Online, attachments to Teams rooms, and probably lots more.

During WW2 the predecessor to the CIA produced an excellent manual for simple sabotage [2]. If something like that was written today the section General Interference with Organisations and Production would surely have something about using as many incompatible programs and web sites as possible in the work flow. The proliferation of software required for work is a form of denial of service attack against corporations.

The efficiency of companies doesn’t really bother me. It sucks that companies are creating a demoralising workplace that is unpleasant for workers. But the upside is that the biggest companies are the ones doing the worst things and are also the most afflicted by these problems. It’s almost like the Bureau of Sabotage in some of Frank Herbert’s fiction [3].

The thing that concerns me is the effect of multiple standards on free software development. We have IRC the most traditional IM support system which is getting replaced by Matrix but we also have some projects using Telegram, and Jabber hasn’t gone away. I’m sure there are others too. There are also multiple options for version control (although github seems to dominate the market), forums, bug trackers, etc. Reporting bugs or getting support in free software often requires interacting with several of them. Developing free software usually involves dealing with the bug tracking and documentation systems of the distribution you use as well as the upstream developers of the software. If the problem you have is related to compatibility between two different pieces of free software then you can end up dealing with even more bug tracking systems.

There are real benefits to some of the newer programs to track bugs, write documentation, etc. There is also going to be a cost in changing which gives an incentive for the older projects to keep using what has worked well enough for them in the past,

How can we improve things? Use only the latest tools? Prioritise ease of use? Aim more for the entry level contributors?

Related posts:

  1. Source Escrow for Proprietary Software British taxpayers are paying for extra support for Windows XP...
  2. Car Drivers vs Mechanics and Free Software In a comment on my post about Designing Unsafe Cars...
  3. Advertising Free Software Projects Today I just noticed the following advert on one of...
Categories: FLOSS Project Planets

Scarlett Gately Moore: Kubuntu: Noble Numbat Beta available! Qt6 snaps coming soon.

Fri, 2024-04-12 15:29

It has been a very busy couple of weeks as we worked against some major transitions and a security fix that required a rebuild of the $world. I am happy to report that against all odds we have a beta release! You can read all about it here: https://kubuntu.org/news/kubuntu-24-04-beta-released/ Post beta freeze I have already begun pushing our fixes for known issues today. A big one being our new branding! Very exciting times in the Kubuntu world.

In the snap world I will be using my free time to start knocking out KDE applications ( not covered by the project ). I have also recruited some help, so you should start seeing these pop up in the edge channel very soon!

Now that we are nearing the release of Noble Numbat, my contract is coming to an end with Kubuntu. If you would like to see Plasma 6 in the next release and in a PPA for Noble, please consider donating to extend my contract at https://kubuntu.org/donate !

On a personal level, I am still looking to help with my grandson and you can find that here: https://www.gofundme.com/f/in-loving-memory-of-william-billy-dean-scalf

Thanks for stopping by,

Scarlett

Categories: FLOSS Project Planets

NOKUBI Takatsugu: mailman3-web error when upgrading to bookworm

Fri, 2024-04-12 09:34

I tried to upgrade bullseye machien to bookworm, so I got the following error:

File “/usr/lib/python3/dist-packages/django/contrib/auth/mixins.py”, line 5, in
from django.contrib.auth.views import redirect_to_login
File “/usr/lib/python3/dist-packages/django/contrib/auth/views.py”, line 20, in
from django.utils.http import (
ImportError: cannot import name ‘url_has_allowed_host_and_scheme’ from ‘django.utils.http’ (/usr/lib/python3/dist-packages/django/utils/http.py)

During handling of the above exception, another exception occurred:

It is similar to #1000810, but it is already closed.

My solution is:

  • apt remove mailman3-web
    • keep db and config files (do not purge)
  • apt autoremove
    • remove django related packages
  • apt install mailman3-web mailman3-full

I tried to send to the report, but it rerutns `550 Unknown or archived bug’ …

Categories: FLOSS Project Planets

Freexian Collaborators: Monthly report about Debian Long Term Support, March 2024 (by Roberto C. Sánchez)

Thu, 2024-04-11 20:00

Like each month, have a look at the work funded by Freexian’s Debian LTS offering.

Debian LTS contributors

In March, 19 contributors have been paid to work on Debian LTS, their reports are available:

  • Abhijith PA did 0.0h (out of 10.0h assigned and 4.0h from previous period), thus carrying over 14.0h to the next month.
  • Adrian Bunk did 59.5h (out of 47.5h assigned and 52.5h from previous period), thus carrying over 40.5h to the next month.
  • Bastien Roucariès did 22.0h (out of 20.0h assigned and 2.0h from previous period).
  • Ben Hutchings did 9.0h (out of 2.0h assigned and 22.0h from previous period), thus carrying over 15.0h to the next month.
  • Chris Lamb did 18.0h (out of 18.0h assigned).
  • Daniel Leidert did 12.0h (out of 12.0h assigned).
  • Emilio Pozuelo Monfort did 0.0h (out of 3.0h assigned and 57.0h from previous period), thus carrying over 60.0h to the next month.
  • Guilhem Moulin did 22.5h (out of 7.25h assigned and 15.25h from previous period).
  • Holger Levsen did 0.0h (out of 0.5h assigned and 11.5h from previous period), thus carrying over 12.0h to the next month.
  • Lee Garrett did 0.0h (out of 0.0h assigned and 60.0h from previous period), thus carrying over 60.0h to the next month.
  • Markus Koschany did 40.0h (out of 40.0h assigned).
  • Ola Lundqvist did 19.5h (out of 24.0h assigned), thus carrying over 4.5h to the next month.
  • Roberto C. Sánchez did 9.25h (out of 3.5h assigned and 8.5h from previous period), thus carrying over 2.75h to the next month.
  • Santiago Ruano Rincón did 19.0h (out of 16.5h assigned and 2.5h from previous period).
  • Sean Whitton did 4.5h (out of 4.5h assigned and 1.5h from previous period), thus carrying over 1.5h to the next month.
  • Sylvain Beucler did 25.0h (out of 24.5h assigned and 35.5h from previous period), thus carrying over 35.0h to the next month.
  • Thorsten Alteholz did 14.0h (out of 14.0h assigned).
  • Tobias Frost did 12.0h (out of 12.0h assigned).
  • Utkarsh Gupta did 19.5h (out of 0.0h assigned and 48.75h from previous period), thus carrying over 29.25h to the next month.
Evolution of the situation

In March, we have released 31 DLAs.

Adrian Bunk was responsible for updating gtkwave not only in LTS, but also in unstable, stable, and old-stable as well. This update involved an upload of a new upstream release of gtkwave to each target suite to address 82 separate CVEs. Guilhem Moulin prepared an update of libvirt which was particularly notable, as it fixed multiple vulnerabilities which would lead to denial of service or information disclosure.

In addition to the normal security updates, multiple LTS contributors worked at getting various packages updated in more recent Debian releases, including gross for bullseye/bookworm (by Adrian Bunk), imlib2 for bullseye, jetty9 and tomcat9/10 for bullseye/bookworm (by Markus Koschany), samba for bullseye, py7zr for bullseye (by Santiago Ruano Rincón), cacti for bullseye/bookwork (by Sylvain Beucler), and libmicrohttpd for bullseye (by Thorsten Alteholz). Additionally, Sylvain actively coordinated with cacti upstream concerning an incomplete fix for CVE-2024-29894.

Thanks to our sponsors

Sponsors that joined recently are in bold.

Categories: FLOSS Project Planets

Freexian Collaborators: Debian Contributions: SSO Authentication for jitsi.debian.social, /usr-move updates, and more! (by Utkarsh Gupta)

Thu, 2024-04-11 20:00

Contributing to Debian is part of Freexian’s mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

P.S. We’ve completed over a year of writing these blogs. If you have any suggestions on how to make them better or what you’d like us to cover, or any other opinions/reviews you might have, et al, please let us know by dropping an email to us. We’d be happy to hear your thoughts. :)

SSO Authentication for jitsi.debian.social, by Stefano Rivera

Debian.social’s jitsi instance has been getting some abuse by (non-Debian) people sharing sexually explicit content on the service. After playing whack-a-mole with this for a month, and shutting the instance off for another month, we opened it up again and the abuse immediately re-started.

Stefano sat down and wrote an SSO Implementation that hooks into Jitsi’s existing JWT SSO support. This requires everyone using jitsi.debian.social to have a Salsa account.

With only a little bit of effort, we could change this in future, to only require an account to open a room, and allow guests to join the call.

/usr-move, by Helmut Grohne

The biggest task this month was sending mitigation patches for all of the /usr-move issues arising from package renames due to the 2038 transition. As a result, we can now say that every affected package in unstable can either be converted with dh-sequence-movetousr or has an open bug report. The package set relevant to debootstrap except for the set that has to be uploaded concurrently has been moved to /usr and is awaiting migration. The move of coreutils happened to affect piuparts which hard codes the location of /bin/sync and received multiple updates as a result.

Miscellaneous contributions
  • Stefano Rivera uploaded a stable release update to python3.11 for bookworm, fixing a use-after-free crash.
  • Stefano uploaded a new version of python-html2text, and updated python3-defaults to build with it.
  • In support of Python 3.12, Stefano dropped distutils as a Build-Dependency from a few packages, and uploaded a complex set of patches to python-mitogen.
  • Stefano landed some merge requests to clean up dead code in dh-python, removed the flit plugin, and uploaded it.
  • Stefano uploaded new upstream versions of twisted, hatchling, python-flexmock, python-authlib, python–mitogen, python-pipx, and xonsh.
  • Stefano requested removal of a few packages supporting the Opsis HDMI2USB hardware that DebConf Video team used to use for HDMI capture, as they are not being maintained upstream. They started to FTBFS, with recent sdcc changes.
  • DebConf 24 is getting ready to open registration, Stefano spent some time fixing bugs in the website, caused by infrastructure updates.
  • Stefano reviewed all the DebConf 23 travel reimbursements, filing requests for more information from SPI where our records mismatched.
  • Stefano spun up a Wafer website for the Berlin 2024 mini DebConf.
  • Roberto C. Sánchez worked on facilitating the transfer of upstream maintenance responsibility for the dormant Shorewall project to a new team led by the current maintainer of the Shorewall packages in Debian.
  • Colin Watson fixed build failures in celery-haystack-ng, db1-compat, jsonpickle, libsdl-perl, kali, knews, openssh-ssh1, python-json-log-formatter, python-typing-extensions, trn4, vigor, and wcwidth. Some of these were related to the 64-bit time_t transition, since that involved enabling -Werror=implicit-function-declaration.
  • Colin fixed an off-by-one error in neovim, which was already causing a build failure in Ubuntu and would eventually have caused a build failure in Debian with stricter toolchain settings.
  • Colin added an sshd@.service template to openssh to help newer systemd versions make containers and VMs SSH-accessible over AF_VSOCK sockets.
  • Following the xz-utils backdoor, Colin spent some time testing and discussing OpenSSH upstream’s proposed inline systemd notification patch, since the current implementation via libsystemd was part of the attack vector used by that backdoor.
  • Utkarsh reviewed and sponsored some Go packages for Lena Voytek and Rajudev.
  • Utkarsh also helped Mitchell Dzurick with the adoption of pyparted package.
  • Helmut sent 10 patches for cross build failures.
  • Helmut partially fixed architecture cross bootstrap tooling to deal with changes in linux-libc-dev and the recent gcc-for-host changes and also fixed a 64bit-time_t FTBFS in libtextwrap.
  • Thorsten Alteholz uploaded several packages from debian-printing: cjet, lprng, rlpr and epson-inkjet-printer-escpr were affected by the newly enabled compiler switch -Werror=implicit-function-declaration. Besides fixing these serious bugs, Thorsten also worked on other bugs and could fix one or the other.
  • Carles updated simplemonitor and python-ring-doorbell packages with new upstream versions.
  • Santiago is still working on the Salsa CI MRs to adapt the build jobs so they can rely on sbuild. Current work includes adapting the images used by the build job, implementing the basic sbuild support the related jobs, and adjusting the support for experimental and *-backports releases..
    Additionally, Santiago reviewed some MR such as Make timeout action explicit in the logs and the subsequent Implement conditional timeout verbosity, and the batch of MRs included in https://salsa.debian.org/salsa-ci-team/pipeline/-/merge_requests/482.
  • Santiago also reviewed applications for the improving Salsa CI in Debian GSoC 2024 project. We received applications from four very talented candidates. The selection process is currently ongoing. A huge thanks to all of them!
  • As part of the DebConf 24 organization, Santiago has taken part in the Content team discussions.
Categories: FLOSS Project Planets

Reproducible Builds (diffoscope): diffoscope 264 released

Thu, 2024-04-11 20:00

The diffoscope maintainers are pleased to announce the release of diffoscope version 264. This version includes the following changes:

[ Chris Lamb ] * Don't crash on invalid zipfiles, even if we encounter 'badness' through through the file. (Re: #1068705) [ FC (Fay) Stegerman ] * Add note when there are duplicate entries in ZIP files. (Closes: reproducible-builds/diffoscope!140) [ Vagrant Cascadian ] * Add an external tool reference for GNU Guix for zipdetails.

You find out more by visiting the project homepage.

Categories: FLOSS Project Planets

Jonathan McDowell: Sorting out backup internet #1: recursive DNS

Thu, 2024-04-11 13:41

I work from home these days, and my nearest office is over 100 miles away, 3 hours door to door if I travel by train (and, to be honest, probably not a lot faster given rush hour traffic if I drive). So I’m reliant on a functional internet connection in order to be able to work. I’m lucky to have access to Openreach FTTP, provided by Aquiss, but I worry about what happens if there’s a cable cut somewhere or some other long lasting problem. Worst case I could tether to my work phone, or try to find some local coworking space to use while things get sorted, but I felt like arranging a backup option was a wise move.

Step 1 turned out to be sorting out recursive DNS. It’s been many moons since I had to deal with running DNS in a production setting, and I’ve mostly done my best to avoid doing it at home too. dnsmasq has done a decent job at providing for my needs over the years, covering DHCP, DNS (+ tftp for my test device network). However I just let it slave off my ISP’s nameservers, which means if that link goes down it’ll no longer be able to resolve anything outside the house.

One option would have been to either point to a different recursive DNS server (Cloudfare’s 1.1.1.1 or Google’s Public DNS being the common choices), but I’ve no desire to share my lookup information with them. As another approach I could have done some sort of failover of resolv.conf when the primary network went down, but then I would have to get into moving files around based on networking status and that felt a bit clunky.

So I decided to finally setup a proper local recursive DNS server, which is something I’ve kinda meant to do for a while but never had sufficient reason to look into. Last time I did this I did it with BIND 9 but there are more options these days, and I decided to go with unbound, which is primarily focused on recursive DNS.

One extra wrinkle, pointed out by Lars, is that having dynamic name information from DHCP hosts is exceptionally convenient. I’ve kept dnsmasq as the local DHCP server, so I wanted to be able to forward local queries there.

I’m doing all of this on my RB5009, running Debian. Installing unbound was a simple matter of apt install unbound. I needed 2 pieces of configuration over the default, one to enable recursive serving for the house networks, and one to enable forwarding of queries for the local domain to dnsmasq. I originally had specified the wildcard address for listening, but this caused problems with the fact my router has many interfaces and would sometimes respond from a different address than the request had come in on.

/etc/unbound/unbound.conf.d/network-resolver.conf server: interface: 192.0.2.1 interface: 2001::db8:f00d::1 access-control: 192.0.2.0/24 allow access-control: 2001::db8:f00d::/56 allow


/etc/unbound/unbound.conf.d/local-to-dnsmasq.conf server: domain-insecure: "example.org" do-not-query-localhost: no forward-zone: name: "example.org" forward-addr: 127.0.0.1@5353


I then had to configure dnsmasq to not listen on port 53 (so unbound could), respond to requests on the loopback interface (I have dnsmasq restricted to only explicitly listed interfaces), and to hand out unbound as the appropriate nameserver in DHCP requests - once dnsmasq is not listening on port 53 it no longer does this by default.

/etc/dnsmasq.d/behind-unbound interface=lo port=5353 dhcp-option=option6:dns-server,[2001::db8:f00d::1] dhcp-option=option:dns-server,192.0.2.1


With these minor changes in place I now have local recursive DNS being handled by unbound, without losing dynamic local DNS for DHCP hosts. As an added bonus I now get 10/10 on Test IPv6 - previously I was getting dinged on the ability for my DNS server to resolve purely IPv6 reachable addresses.

Next step, actually sorting out a backup link.

Categories: FLOSS Project Planets

Reproducible Builds: Reproducible Builds in March 2024

Thu, 2024-04-11 12:49

Welcome to the March 2024 report from the Reproducible Builds project! In our reports, we attempt to outline what we have been up to over the past month, as well as mentioning some of the important things happening more generally in software supply-chain security. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website.

Arch Linux minimal container userland now 100% reproducible

In remarkable news, Reproducible builds developer kpcyrd reported that that the Arch Linux “minimal container userland” is now 100% reproducible after work by developers dvzv and Foxboron on the one remaining package. This represents a “real world”, widely-used Linux distribution being reproducible.

Their post, which kpcyrd suffixed with the question “now what?”, continues on to outline some potential next steps, including validating whether the container image itself could be reproduced bit-for-bit. The post, which was itself a followup for an Arch Linux update earlier in the month, generated a significant number of replies.


Validating Debian’s build infrastructure after the XZ backdoor

From our mailing list this month, Vagrant Cascadian wrote about being asked about trying to perform concrete reproducibility checks for recent Debian security updates, in an attempt to gain some confidence about Debian’s build infrastructure given that they performed builds in environments running the high-profile XZ vulnerability.

Vagrant reports (with some caveats):

So far, I have not found any reproducibility issues; everything I tested I was able to get to build bit-for-bit identical with what is in the Debian archive.

That is to say, reproducibility testing permitted Vagrant and Debian to claim with some confidence that builds performed when this vulnerable version of XZ was installed were not interfered with.


Making Fedora Linux (more) reproducible

In March, Davide Cavalca gave a talk at the 2024 Southern California Linux Expo (aka SCALE 21x) about the ongoing effort to make the Fedora Linux distribution reproducible.

Documented in more detail on Fedora’s website, the talk touched on topics such as the specifics of implementing reproducible builds in Fedora, the challenges encountered, the current status and what’s coming next. (YouTube video)


Increasing Trust in the Open Source Supply Chain with Reproducible Builds and Functional Package Management

Julien Malka published a brief but interesting paper in the HAL open archive on Increasing Trust in the Open Source Supply Chain with Reproducible Builds and Functional Package Management:

Functional package managers (FPMs) and reproducible builds (R-B) are technologies and methodologies that are conceptually very different from the traditional software deployment model, and that have promising properties for software supply chain security. This thesis aims to evaluate the impact of FPMs and R-B on the security of the software supply chain and propose improvements to the FPM model to further improve trust in the open source supply chain. PDF

Julien’s paper poses a number of research questions on how the model of distributions such as GNU Guix and NixOS can “be leveraged to further improve the safety of the software supply chain”, etc.


Software and source code identification with GNU Guix and reproducible builds

In a long line of commendably detailed blog posts, Ludovic Courtès, Maxim Cournoyer, Jan Nieuwenhuizen and Simon Tournier have together published two interesting posts on the GNU Guix blog this month. In early March, Ludovic Courtès, Maxim Cournoyer, Jan Nieuwenhuizen and Simon Tournier wrote about software and source code identification and how that might be performed using Guix, rhetorically posing the questions: “What does it take to ‘identify software’? How can we tell what software is running on a machine to determine, for example, what security vulnerabilities might affect it?”

Later in the month, Ludovic Courtès wrote a solo post describing adventures on the quest for long-term reproducible deployment. Ludovic’s post touches on GNU Guix’s aim to support “time travel”, the ability to reliably (and reproducibly) revert to an earlier point in time, employing the iconic image of Harold Lloyd hanging off the clock in Safety Last! (1925) to poetically illustrate both the slapstick nature of current modern technology and the gymnastics required to navigate hazards of our own making.


Two new Rust-based tools for post-processing determinism

Zbigniew Jędrzejewski-Szmek announced add-determinism, a work-in-progress reimplementation of the Reproducible Builds project’s own strip-nondeterminism tool in the Rust programming language, intended to be used as a post-processor in RPM-based distributions such as Fedora

In addition, Yossi Kreinin published a blog post titled “refix: fast, debuggable, reproducible builds that describes a tool that post-processes binaries in such a way that they are still debuggable with gdb, etc.. Yossi post details the motivation and techniques behind the (fast) performance of the tool.


Distribution work

In Debian this month, since the testing framework no longer varies the build path, James Addison performed a bulk downgrade of the bug severity for issues filed with a level of normal to a new level of wishlist. In addition, 28 reviews of Debian packages were added, 38 were updated and 23 were removed this month adding to ever-growing knowledge about identified issues. As part of this effort, a number of issue types were updated, including Chris Lamb adding a new ocaml_include_directories toolchain issue [] and James Addison adding a new filesystem_order_in_java_jar_manifest_mf_include_resource issue [] and updating the random_uuid_in_notebooks_generated_by_nbsphinx to reference a relevant discussion thread [].

In addition, Roland Clobus posted his 24th status update of reproducible Debian ISO images. Roland highlights that the images for Debian unstable often cannot be generated due to changes in that distribution related to the 64-bit time_t transition.

Lastly, Bernhard M. Wiedemann posted another monthly update for his reproducibility work in openSUSE.


Mailing list highlights

Elsewhere on our mailing list this month:


Website updates

There were made a number of improvements to our website this month, including:

  • Pol Dellaiera noticed the frequent need to correctly cite the website itself in academic work. To facilitate easier citation across multiple formats, Pol contributed a Citation File Format (CIF) file. As a result, an export in BibTeX format is now available in the Academic Publications section. Pol encourages community contributions to further refine the CITATION.cff file. Pol also added an substantial new section to the “buy in” page documenting the role of Software Bill of Materials (SBOMs) and ephemeral development environments. [][]

  • Bernhard M. Wiedemann added a new “commandments” page to the documentation [][] and fixed some incorrect YAML elsewhere on the site [].

  • Chris Lamb add three recent academic papers to the publications page of the website. []

  • Mattia Rizzolo and Holger Levsen collaborated to add Infomaniak as a sponsor of amd64 virtual machines. [][][]

  • Roland Clobus updated the “stable outputs” page, dropping version numbers from Python documentation pages [] and noting that Python’s set data structure is also affected by the PYTHONHASHSEED functionality. []


Delta chat clients now reproducible

Delta Chat, an open source messaging application that can work over email, announced this month that the Rust-based core library underlying Delta chat application is now reproducible.


diffoscope

diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions 259, 260 and 261 to Debian and made the following additional changes:

  • New features:

    • Add support for the zipdetails tool from the Perl distribution. Thanks to Fay Stegerman and Larry Doolittle et al. for the pointer and thread about this tool. []
  • Bug fixes:

    • Don’t identify Redis database dumps as GNU R database files based simply on their filename. []
    • Add a missing call to File.recognizes so we actually perform the filename check for GNU R data files. []
    • Don’t crash if we encounter an .rdb file without an equivalent .rdx file. (#1066991)
    • Correctly check for 7z being available—and not lz4—when testing 7z. []
    • Prevent a traceback when comparing a contentful .pyc file with an empty one. []
  • Testsuite improvements:

    • Fix .epub tests after supporting the new zipdetails tool. []
    • Don’t use parenthesis within test “skipping…” messages, as PyTest adds its own parenthesis. []
    • Factor out Python version checking in test_zip.py. []
    • Skip some Zip-related tests under Python 3.10.14, as a potential regression may have been backported to the 3.10.x series. []
    • Actually test 7z support in the test_7z set of tests, not the lz4 functionality. (Closes: reproducible-builds/diffoscope#359). []

In addition, Fay Stegerman updated diffoscope’s monkey patch for supporting the unusual Mozilla ZIP file format after Python’s zipfile module changed to detect potentially insecure overlapping entries within .zip files. (#362)

Chris Lamb also updated the trydiffoscope command line client, dropping a build-dependency on the deprecated python3-distutils package to fix Debian bug #1065988 [], taking a moment to also refresh the packaging to the latest Debian standards []. Finally, Vagrant Cascadian submitted an update for diffoscope version 260 in GNU Guix. []


Upstream patches

This month, we wrote a large number of patches, including:

Bernhard M. Wiedemann used reproducibility-tooling to detect and fix packages that added changes in their %check section, thus failing when built with the --no-checks option. Only half of all openSUSE packages were tested so far, but a large number of bugs were filed, including ones against caddy, exiv2, gnome-disk-utility, grisbi, gsl, itinerary, kosmindoormap, libQuotient, med-tools, plasma6-disks, pspp, python-pypuppetdb, python-urlextract, rsync, vagrant-libvirt and xsimd.

Similarly, Jean-Pierre De Jesus DIAZ employed reproducible builds techniques in order to test a proposed refactor of the ath9k-htc-firmware package. As the change produced bit-for-bit identical binaries to the previously shipped pre-built binaries:

I don’t have the hardware to test this firmware, but the build produces the same hashes for the firmware so it’s safe to say that the firmware should keep working.


Reproducibility testing framework

The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility.

In March, an enormous number of changes were made by Holger Levsen:

  • Debian-related changes:

    • Sleep less after a so-called “404” package state has occurred. []
    • Schedule package builds more often. [][]
    • Regenerate all our HTML indexes every hour, but only every 12h for the released suites. []
    • Create and update unstable and experimental base systems on armhf again. [][]
    • Don’t reschedule so many “depwait” packages due to the current size of the i386 architecture queue. []
    • Redefine our scheduling thresholds and amounts. []
    • Schedule untested packages with a higher priority, otherwise slow architectures cannot keep up with the experimental distribution growing. []
    • Only create the stats_buildinfo.png graph once per day. [][]
    • Reproducible Debian dashboard: refactoring, update several more static stats only every 12h. []
    • Document how to use systemctl with new systemd-based services. []
    • Temporarily disable armhf and i386 continuous integration tests in order to get some stability back. []
    • Use the deb.debian.org CDN everywhere. []
    • Remove the rsyslog logging facility on bookworm systems. []
    • Add zst to the list of packages which are false-positive diskspace issues. []
    • Detect failures to bootstrap Debian base systems. []
  • Arch Linux-related changes:

    • Temporarily disable builds because the pacman package manager is broken. [][]
    • Split reproducible_html_live_status and split the scheduling timing . [][][]
    • Improve handling when database is locked. [][]
  • Misc changes:

    • Show failed services that require manual cleanup. [][]
    • Integrate two new Infomaniak nodes. [][][][]
    • Improve IRC notifications for artifacts. []
    • Run diffoscope in different systemd slices. []
    • Run the node health check more often, as it can now repair some issues. [][]
    • Also include the string Bot in the userAgent for Git. (Re: #929013). []
    • Document increased tmpfs size on our OUSL nodes. []
    • Disable memory account for the reproducible_build service. [][]
    • Allow 10 times as many open files for the Jenkins service. []
    • Set OOMPolicy=continue and OOMScoreAdjust=-1000 for both the Jenkins and the reproducible_build service. []

Mattia Rizzolo also made the following changes:

  • Debian-related changes:

    • Define a systemd slice to group all relevant services. [][]
    • Add a bunch of quotes in scripts to assuage the shellcheck tool. []
    • Add stats on how many packages have been built today so far. []
    • Instruct systemd-run to handle diffoscope’s exit codes specially. []
    • Prefer the pgrep tool over grepping the output of ps. []
    • Re-enable a couple of i386 and armhf architecture builders. [][]
    • Fix some stylistic issues flagged by the Python flake8 tool. []
    • Cease scheduling Debian unstable and experimental on the armhf architecture due to the time_t transition. []
    • Start a few more i386 & armhf workers. [][][]
    • Temporarly skip pbuilder updates in the unstable distribution, but only on the armhf architecture. []
  • Other changes:

    • Perform some large-scale refactoring on how the systemd service operates. [][]
    • Move the list of workers into a separate file so it’s accessible to a number of scripts. []
    • Refactor the powercycle_x86_nodes.py script to use the new IONOS API and its new Python bindings. []
    • Also fix nph-logwatch after the worker changes. []
    • Do not install the stunnel tool anymore, it shouldn’t be needed by anything anymore. []
    • Move temporary directories related to Arch Linux into a single directory for clarity. []
    • Update the arm64 architecture host keys. []
    • Use a common Postfix configuration. []

The following changes were also made: by

  • Jan-Benedict Glaw:

    • Initial work to clean up a messy NetBSD-related script. [][]
  • Roland Clobus:

    • Show the installer log if the installer fails to build. []
    • Avoid the minus character (i.e. -) in a variable in order to allow for tags in openQA. []
    • Update the schedule of Debian live image builds. []
  • Vagrant Cascadian:

    • Maintenance on the virt* nodes is completed so bring them back online. []
    • Use the fully qualified domain name in configuration. []

Node maintenance was also performed by Holger Levsen, Mattia Rizzolo [][] and Vagrant Cascadian [][][][]


If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

Categories: FLOSS Project Planets

Russell Coker: ML Training License

Thu, 2024-04-11 07:30

Last year a Debian Developer blogged about writing Haskell code to give a bad result for LLMs that were trained on it. I forgot who wrote the post and I’d appreciate the URL if anyone has it.

I respect such technical work to enforce one’s legal rights when they aren’t respected by corporations, but I have a different approach.

As an aside the Fosdem lecture “Fortify AI against regulation, litigation and lobotomies” is interesting on this topic [1], it’s what inspired me to write about this.

For what I write I am at this time happy to allow it to be used as part of a large training data set (consider this blog post a licence grant that applies until such time as I edit this post to change it). But only if aggregated with so much other data that my content is only a tiny portion of the data set by any metric. So I don’t want someone to make a programming LLM that has my code as the only C code or a political data set that has my blog posts as the only left-wing content. If someone wants to train an LLM on only my content to make a Russell-simulator then I don’t license my work for that purpose but also as it’s small enough that anyone with a bit of skill could do it on a weekend I can’t stop it. I would be really interested in seeing the results if someone from the FOSS community wanted to make a Russell-simulator and would probably issue them a license for such work if asked.

If my work comprises more than 0.1% of the content in a particular measure (theme, programming language, political position, etc) in a training data set then I don’t permit that without prior discussion.

Finally if someone wants to make a FOSS training data set to be used for FOSS LLM systems (maybe under the AGPL or some similar license) then I’ll allow my writing to be used as part of that.

Related posts:

  1. lemonup and blog license I have just updated my previous post about licenses and...
  2. License Fees for Music in Clubs The Cyber Law Center has blogged about the Phonographic Performance...
  3. BTRFS Training Some years ago Barwon South Water gave LUV 3 old...
Categories: FLOSS Project Planets

Wouter Verhelst: OpenSC and the Belgian eID

Thu, 2024-04-11 05:33

Getting the Belgian eID to work on Linux systems should be fairly easy, although some people do struggle with it.

For that reason, there is a lot of third-party documentation out there in the form of blog posts, wiki pages, and other kinds of things. Unfortunately, some of this documentation is simply wrong. Written by people who played around with things until it kind of worked, sometimes you get a situation where something that used to work in the past (but wasn't really necessary) now stopped working, but it's still added to a number of locations as though it were the gospel.

And then people follow these instructions and now things don't work anymore.

One of these revolves around OpenSC.

OpenSC is an open source smartcard library that has support for a pretty large number of smartcards, amongst which the Belgian eID. It provides a PKCS#11 module as well as a number of supporting tools.

For those not in the know, PKCS#11 is a standardized C API for offloading cryptographic operations. It is an API that can be used when talking to a hardware cryptographic module, in order to make that module perform some actions, and it is especially popular in the open source world, with support in NSS, amongst others. This library is written and maintained by mozilla, and is a low-level cryptographic library that is used by Firefox (on all platforms it supports) as well as by Google Chrome and other browsers based on that (but only on Linux, and as I understand it, only for linking with smartcards; their BoringSSL library is used for other things).

The official eID software that we ship through eid.belgium.be, also known as "BeID", provides a PKCS#11 module for the Belgian eID, as well as a number of support tools to make interacting with the card easier, such as the "eID viewer", which provides the ability to read data from the card, and validate their signatures. While the very first public version of this eID PKCS#11 module was originally based on OpenSC, it has since been reimplemented as a PKCS#11 module in its own right, with no lineage to OpenSC whatsoever anymore.

About five years ago, the Belgian eID card was renewed. At the time, a new physical appearance was the most obvious difference with the old card, but there were also some technical, on-chip, differences that are not so apparent. The most important one here, although it is not the only one, is the fact that newer eID cards now use a NIST P-384 elliptic curve-based private keys, rather than the RSA-based ones that were used in the past. This change required some changes to any PKCS#11 module that supports the eID; both the BeID one, as well as the OpenSC card-belpic driver that is written in support of the Belgian eID.

Obviously, the required changes were implemented for the BeID module; however, the OpenSC card-belpic driver was not updated. While I did do some preliminary work on the required changes, I was unable to get it to work, and eventually other things took up my time so I never finished the implementation. If someone would like to finish the work that I started, the preliminal patch that I wrote could be a good start -- but like I said, it doesn't yet work. Also, you'll probably be interested in the official documentation of the eID card.

Unfortunately, in the mean time someone added the Applet 1.8 ATR to the card-belpic.c file, without also implementing the required changes to the driver so that the PKCS#11 driver actually supports the eID card. The result of this is that if you have OpenSC installed in NSS for either Firefox or any Chromium-based browser, and it gets picked up before the BeID PKCS#11 module, then NSS will stop looking and pass all crypto operations to the OpenSC PKCS#11 module rather than to the official eID PKCS#11 module, and things will not work at all, causing a lot of confusion.

I have therefore taken the following two steps:

  1. The official eID packages now conflict with the OpenSC PKCS#11 module. Specifically only the PKCS#11 module, not the rest of OpenSC, so you can theoretically still use its tools. This means that once we release this new version of the eID software, when you do an upgrade and you have OpenSC installed, it will remove the PKCS#11 module and anything that depends on it. This is normal and expected.
  2. I have filed a pull request against OpenSC that removes the Applet 1.8 ATR from the driver, so that OpenSC will stop claiming that it supports the 1.8 applet.

When the pull request is accepted, we will update the official eID software to make the conflict versioned, so that as soon as it works again you will again be able to install the OpenSC and BeID packages at the same time.

In the mean time, if you have the OpenSC PKCS#11 module installed on your system, and your eID authentication does not work, try removing it.

Categories: FLOSS Project Planets

Ian Jackson: Why we’ve voted No to CfD for Derril Water solar farm

Tue, 2024-04-09 17:38

ceb and I are members of the Derril Water Solar Park cooperative.

We were recently invited to vote on whether the coop should bid for a Contract for Difference, in a government green electricity auction.

We’ve voted No.

“Green electricity” from your mainstream supplier is a lie

For a while ceb and I have wanted to contribute directly to green energy provision. This isn’t really possible in the mainstream consumer electricy market.

Mainstream electricity suppliers’ “100% green energy” tariffs are pure greenwashing. In a capitalist boondoogle, they basically “divvy up” the electricity so that customers on the (typically more expensive) “green” tariff “get” the green electricity, and the other customers “get” whatever is left. (Of course the electricity is actually all mixed up by the National Grid.) There are fewer people signed up for these tariffs than there is green power generated, so this basically means signing up for a “green” tariff has no effect whatsoever, other than giving evil people more money.

Ripple

About a year ago we heard about Ripple. The structure is a little complicated, but the basic upshot is:

Ripple promote and manage renewable energy schemes. The schemes themselves are each an individual company; the company is largely owned by a co-operative. The co-op is owned by consumers of electricity in the UK., To stop the co-operative being an purely financial investment scheme, shares ownership is limited according to your electricity usage. The electricity is be sold on the open market, and the profits are used to offset members’ electricity bills. (One gotcha from all of this is that for this to work your electricity billing provider has to be signed up with Ripple, but ours, Octopus, is.)

It seemed to us that this was a way for us to directly cause (and pay for!) the actual generation of green electricity.

So, we bought shares in one these co-operatives: we are co-owners of the Derril Water Solar Farm. We signed up for the maximum: funding generating capacity corresponding to 120% of our current electricity usage. We paid a little over £5000 for our shares.

Contracts for Difference

The UK has a renewable energy subsidy scheme, which goes by the name of Contracts for Difference. The idea is that a renewable energy generation company bids in advance, saying that they’ll sell their electricity at Y price, for the duration of the contract (15 years in the current round). The lowest bids win. All the electricity from the participating infrastructure is sold on the open market, but if the market price is low the government makes up the difference, and if the price is high, the government takes the winnings.

This is supposedly good for giving a stable investment environment, since the price the developer is going to get now doesn’t depends on the electricity market over the next 15 years. The CfD system is supposed to encourage development, so you can only apply before you’ve commissioned your generation infrastructure.

Ripple and CfD

Ripple recently invited us to agree that the Derril Water co-operative should bid in the current round of CfDs.

If this goes ahead, and we are one of the auction’s winners, the result would be that, instead of selling our electricity at the market price, we’ll sell it at the fixed CfD price.

This would mean that our return on our investment (which show up as savings on our electricity bills) would be decoupled from market electricity prices, and be much more predictable.

They can’t tell us the price they’d want to bid at, and future electricity prices are rather hard to predict, but it’s clear from the accompanying projections that they think we’d be better off on average with a CfD.

The documentation is very full of financial projections and graphs; other factors aren’t really discussed in any detail.

The rules of the co-op didn’t require them to hold a vote, but very sensibly, for such a fundamental change in the model, they decided to treat it roughly the same way as for a rules change: they’re hoping to get 75% Yes votes.

Voting No

The reason we’re in this co-op at all is because we want to directly fund renewable electricity.

Participating in the CfD auction would involve us competing with capitalist energy companies for government subsidies. Subsidies which are supposed to encourage the provision of green electricity.

It seems to us that participating in this auction would remove most of the difference between what we hoped to do by investing in Derril Water, and just participating in the normal consumer electricity market.

In particular, if we do win in the auction, that’s probably directly removing the funding and investment support model for other, market-investor-funded, projects.

In other words, our buying into Derril Water ceases to be an additional green energy project, changing (in its minor way) the UK’s electricity mix. It becomes a financial transaction much more tenously connected (if connected at all) to helping mitigate the climate emergency.

So our conclusion was that we must vote against.



comments
Categories: FLOSS Project Planets

Gunnar Wolf: Think outside the box • Welcome Eclipse!

Tue, 2024-04-09 12:38

Now that we are back from our six month period in Argentina, we decided to adopt a kitten, to bring more diversity into our lives. Perhaps this little girl will teach us to think outside the box!

Yesterday we witnessed a solar eclipse — Mexico City was not in the totality range (we reached ~80%), but it was a great experience to go with the kids. A couple dozen thousand people gathered for a massive picnic in las islas, the main area inside our university campus.

Afterwards, we went briefly back home, then crossed the city to fetch the little kitten. Of course, the kids were unanimous: Her name is Eclipse.

Categories: FLOSS Project Planets

Pages