GNU Planet!

Planet GNU - https://planet.gnu.org/

Applied Pokology: Applied Pokology - Interesting poke idiom: sparse tables

Fri, 2023-01-27 19:00

During tonight's poke online office hours our friend hdzki came with an interesting use case. He is poking at some binary structures that are like sparse tables whose entries are distributed in the file in an arbitrary way. Each sparse table is characterized by an array of consecutive non-NULL pointers. Each pointer points to an entry in the table. The table entries can be anywhere in the IO space, and are not necessarily consecutive or in order.

Categories: FLOSS Project Planets

Andy Wingo: three approaches to heap sizing

Fri, 2023-01-27 16:45

How much memory should a program get? Tonight, a quick note on sizing for garbage-collected heaps. There are a few possible answers, depending on what your goals are for the system.

you: doctor science

Sometimes you build a system and you want to study it: to identify its principal components and see how they work together, or to isolate the effect of altering a single component. In that case, what you want is a fixed heap size. You run your program a few times and determine a heap size that is sufficient for your problem, and then in future run the program with that new fixed heap size. This allows you to concentrate on the other components of the system.

A good approach to choosing the fixed heap size for a program is to determine the minimum heap size a program can have by bisection, then multiply that size by a constant factor. Garbage collection is a space/time tradeoff: the factor you choose represents a point on the space/time tradeoff curve. I would choose 1.5 in general, but this is arbitrary; I'd go more with 3 or even 5 if memory isn't scarce and I'm really optimizing for throughput.
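To make the bisection step concrete, here is a minimal sketch in C. It assumes that "the program completes with this heap size" is monotonic (if a size works, every larger size works too). The names heap_probe_fn, probe_threshold, min_heap_by_bisection and fixed_heap_size are hypothetical, and the probe here is a toy stand-in for actually launching the program under test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical probe: returns true if the program completes when run
   with a fixed heap of heap_bytes.  A real probe would launch the
   program under test. */
typedef bool (*heap_probe_fn)(size_t heap_bytes, void *data);

/* Toy probe for demonstration: succeeds once the heap reaches the
   threshold stored in *data. */
static bool probe_threshold(size_t heap_bytes, void *data)
{
    return heap_bytes >= *(size_t *)data;
}

/* Find the smallest heap size in [lo, hi] for which the probe succeeds.
   Assumes success is monotonic and that hi is already known to work. */
static size_t min_heap_by_bisection(heap_probe_fn probe, void *data,
                                    size_t lo, size_t hi)
{
    assert(lo <= hi);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (probe(mid, data))
            hi = mid;       /* mid works: the minimum is at or below mid */
        else
            lo = mid + 1;   /* mid fails: the minimum is above mid */
    }
    return lo;
}

/* Scale the minimum by a factor chosen on the space/time tradeoff
   curve, e.g. 1.5 in general, or 3 to 5 when optimizing for throughput. */
static size_t fixed_heap_size(size_t min_heap, double factor)
{
    assert(factor >= 1.0);
    return (size_t)((double)min_heap * factor);
}
```

Each probe is a full program run, so the cost is logarithmic in the size of the search interval.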

Note that a fixed-size heap is not generally what you want. It's not good user experience for running ./foo at the command line, for example. The reason for this is that program memory use is usually a function of the program's input, and only in some cases do you know what the input might look like, and until you run the program you don't know what the exact effect of input on memory is. Still, if you have a team of operations people that knows what input patterns look like and has experience with a GC-using server-side process, fixed heap sizes could be a good solution there too.

you: average josé/fina

On the other end of the spectrum is the average user. You just want to run your program. The program should have the memory it needs! Not too much of course; that would be wasteful. Not too little either; I can tell you, my house is less than 100m², and I spend way too much time shuffling things from one surface to another. If I had more space I could avoid this wasted effort, and in a similar way, you don't want to be too stingy with a program's heap. Do the right thing!

Of course, you probably have multiple programs running on a system that are making similar heap sizing choices at the same time, and the relative needs and importance of these programs could change over time, for example as you switch tabs in a web browser. So the right thing really refers to overall system performance, whereas what you are controlling is just one process's heap size. What is the Right Thing, anyway?

My corner of the GC discourse agrees that something like the right solution was outlined by Kirisame, Shenoy, and Panchekha in a 2022 OOPSLA paper, in which the optimum heap size depends on the allocation rate and the gc cost for a process, which you measure on an ongoing basis. Interestingly, their formulation of heap size calculation can be made by each process without coordination, but results in a whole-system optimum.

There are some details but you can imagine some instinctive results: for example, when a program stops allocating because it's waiting for some external event like user input, it doesn't need so much memory, so it can start shrinking its heap. After all, it might be quite a while before the program has new input. If the program starts allocating again, perhaps because there is new input, it can grow its heap rapidly, and might then shrink again later. The mechanism by which this happens is pleasantly simple, and I salute (again!) the authors for identifying the practical benefits that an abstract model brings to the problem domain.

you: a damaged, suspicious individual

Hoo, friends-- I don't know. I've seen some things. Not to exaggerate, I like to think I'm a well-balanced sort of fellow, but there's some suspicion too, right? So when I imagine a background thread determining that my web server hasn't gotten so much action in the last 100ms and that really what it needs to be doing is shrinking its heap, kicking off additional work to mark-compact it or whatever, when the whole point of the virtual machine is to run that web server and not much else, only to have to probably give it more heap 50ms later, I-- well, again, I exaggerate. The MemBalancer paper has a heartbeat period of 1 Hz and a smoothing function for the heap size, but it just smells like danger. Do I need danger? I mean, maybe? Probably in most cases? But maybe it would be better to avoid danger if I can. Heap growth is usually both necessary and cheap when it happens, but shrinkage is never necessary and is sometimes expensive because you have to shuffle around data.

So, I think there is probably a case for a third mode: not fixed, not adaptive like the MemBalancer approach, but just growable: grow the heap whenever its size is less than a configurable multiplier (e.g. 1.5) of live data. Never shrink the heap. If you ever notice that a process is taking too much memory, manually kill it and start over, or whatever. Default to adaptive, of course, but when you start to troubleshoot a high GC overhead in a long-lived process, perhaps switch to growable to see its effect.
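The growable policy is small enough to sketch in a few lines of C. This is only an illustration under stated assumptions: struct growable_heap and growable_heap_resize are hypothetical names, and the function is assumed to be called after each collection with the measured amount of live data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical heap descriptor; the field names are illustrative. */
struct growable_heap {
    size_t size;        /* current heap size, in bytes */
    double multiplier;  /* configurable headroom over live data, e.g. 1.5 */
};

/* Call after each collection with the number of live bytes found.
   Grows the heap if it is smaller than multiplier * live_bytes and
   never shrinks it.  Returns the (possibly unchanged) heap size. */
static size_t growable_heap_resize(struct growable_heap *heap,
                                   size_t live_bytes)
{
    assert(heap->multiplier >= 1.0);
    size_t target = (size_t)((double)live_bytes * heap->multiplier);
    if (heap->size < target)
        heap->size = target;
    return heap->size;
}
```

Note that there is no code path that reduces heap->size: a drop in live data simply leaves the heap where it was.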

unavoidable badness

There is some heuristic badness that one cannot avoid: even with the adaptive MemBalancer approach, you have to choose a point on the space/time tradeoff curve. Regardless of what you do, your system will grow a hairy nest of knobs and dials, and if your system is successful there will be a lively aftermarket industry of tuning articles: "Are you experiencing poor object transit? One knob you must know"; "Four knobs to heaven"; "It's raining knobs"; "GC engineers DO NOT want you to grab this knob!!"; etc. (I hope that my British readers are enjoying this.)

These ad-hoc heuristics are just part of the domain. What I want to say though is that having a general framework for how you approach heap sizing can limit knob profusion, and can help you organize what you have into a structure of sorts.

At least, this is what I tell myself; inshallah. Now I have told you too. Until next time, happy hacking!


GNU Taler news: GNU Taler v0.9.1 released

Thu, 2023-01-26 18:00
We are happy to announce the release of GNU Taler v0.9.1.

a2ps @ Savannah: a2ps 4.14.93 released [alpha]

Thu, 2023-01-26 17:29

I am happy to announce another pre-release of what will eventually be the
first release of GNU a2ps since 2007.

I have had very little feedback about previous pre-releases, so I intend to
make a stable release soon. If you’re interested in GNU a2ps, please try
this pre-release! I hope that once I make a full release it will quickly be
packaged for distributions.

Here are the compressed sources and a GPG detached signature:
  https://alpha.gnu.org/gnu/a2ps/a2ps-4.14.93.tar.gz
  https://alpha.gnu.org/gnu/a2ps/a2ps-4.14.93.tar.gz.sig

Use a mirror for higher download bandwidth:
  https://www.gnu.org/order/ftp.html

Here are the SHA1 and SHA256 checksums:

8eb28d7a8ca933a08918d706f231978a91e42d3f  a2ps-4.14.93.tar.gz
VoCuvBKrC1y5P/wZbx92C6O28jvtCfs9ZnskCjx/xmM  a2ps-4.14.93.tar.gz

The SHA256 checksum is base64 encoded, instead of the
hexadecimal encoding that most checksum tools default to.
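
If you want to compare the base64 value above with the hexadecimal output of a typical checksum tool, the base64 text has to be decoded and re-encoded as hex. The following is a minimal sketch in C of that conversion; b64_value and base64_to_hex are hypothetical helper names, not part of a2ps or of any checksum tool.

```c
#include <stddef.h>

/* Map one base64 character to its 6-bit value, or -1 if invalid. */
static int b64_value(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/* Decode base64 text (trailing '=' padding optional) and write the
   bytes as lower-case hex into hex, NUL-terminated.  Returns the
   number of decoded bytes, or -1 on invalid input or if hex_size is
   too small. */
static int base64_to_hex(const char *b64, char *hex, size_t hex_size)
{
    static const char digits[] = "0123456789abcdef";
    unsigned acc = 0;   /* bit accumulator; only the low bits are used */
    int bits = 0;       /* number of pending bits in acc */
    int nbytes = 0;

    if (hex_size == 0)
        return -1;
    for (; *b64 && *b64 != '='; b64++) {
        int v = b64_value(*b64);
        if (v < 0)
            return -1;
        acc = (acc << 6) | (unsigned)v;
        bits += 6;
        if (bits >= 8) {
            unsigned byte;
            bits -= 8;
            byte = (acc >> bits) & 0xff;
            /* Need room for two more hex digits plus the final NUL. */
            if ((size_t)(2 * nbytes + 3) > hex_size)
                return -1;
            hex[2 * nbytes] = digits[byte >> 4];
            hex[2 * nbytes + 1] = digits[byte & 0xf];
            nbytes++;
        }
    }
    hex[2 * nbytes] = '\0';
    return nbytes;
}
```

For a SHA256 checksum the result should be 32 bytes, that is 64 hex digits, so a 65-byte output buffer is enough.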

Use a .sig file to verify that the corresponding file (without the
.sig suffix) is intact.  First, be sure to download both the .sig file
and the corresponding tarball.  Then, run a command like this:

  gpg --verify a2ps-4.14.93.tar.gz.sig

The signature should match the fingerprint of the following key:

  pub   rsa2048 2013-12-11 [SC]
        2409 3F01 6FFE 8602 EF44  9BB8 4C8E F3DA 3FD3 7230
  uid   Reuben Thomas <rrt@sc3d.org>
  uid   keybase.io/rrt <rrt@keybase.io>

If that command fails because you don't have the required public key,
or that public key has expired, try the following commands to retrieve
or refresh it, and then rerun the 'gpg --verify' command.

  gpg --locate-external-key rrt@sc3d.org

  gpg --recv-keys 4C8EF3DA3FD37230

  wget -q -O- 'https://savannah.gnu.org/project/release-gpgkeys.php?group=a2ps&download=1' | gpg --import -

As a last resort to find the key, you can try the official GNU
keyring:

  wget -q https://ftp.gnu.org/gnu/gnu-keyring.gpg
  gpg --keyring gnu-keyring.gpg --verify a2ps-4.14.93.tar.gz.sig


This release was bootstrapped with the following tools:
  Autoconf 2.71
  Automake 1.16.5
  Gnulib v0.1-5639-g80b225fe1e

NEWS

* Noteworthy changes in release 4.14.93 (2023-01-26) [alpha]
 * Features:
   - Use libpaper's paper sizes. This includes user-defined paper sizes
     when using libpaper 2. It is still possible to define custom margins
     using "Medium:" specifications in the configuration file, and the
     one size defined by a2ps that libpaper does not know about, Quarto, is
     retained for backwards compatibility, and as an example.
 * Bug fixes:
   - Avoid a crash when a medium is not specified; instead, use the default
     libpaper size (configured by the user or sysadmin, or the locale
     default).
   - Fix some other potential crashes and compiler warnings.
 * Documentation:
   - Reformat --help output consistently to 80 columns.
 * Build:
   - Require autoconf 2.71.
   - Require libpaper.


FSF Blogs: Thank you and a very warm welcome to our new members

Wed, 2023-01-25 16:17
January 20, 2023 marked the end of our most recent fundraising campaign and associate member drive. We are proud to add 330 new associate members to our organization, and we have immense appreciation for the community that helped us get there. Please help us share our appreciation.

poke @ Savannah: GNU poke 3.0 released

Wed, 2023-01-25 14:08

I am happy to announce a new major release of GNU poke, version 3.0.

This release is the result of a year of development.  A lot of things have changed and improved with respect to the 2.x series; we have fixed many bugs and added quite a lot of new exciting and useful features.  See below for a description of many of them.

From now on, we intend to do not one but two major releases of poke every year.  What is moving us to change this is the realization that users have to wait for too long to enjoy new features, which are continuously being added in a project this young and active.

The tarball poke-3.0.tar.gz is now available at
https://ftp.gnu.org/gnu/poke/poke-3.0.tar.gz.

> GNU poke (http://www.jemarch.net/poke) is an interactive, extensible editor for binary data.  Not limited to editing basic entities such as bits and bytes, it provides a full-fledged procedural, interactive programming language designed to describe data structures and to operate on them.


Thanks to the people who contributed with code and/or documentation to this release.  In certain but no significant order they are:

   Mohammad-Reza Nabipoor
   Arsen Arsenović
   Luca Saiu
   Bruno Haible
   apache2
   Indu Bhagat
   Agathe Porte
   Alfred M. Szmidt
   Daiki Ueno
   Darshit Shah
   Jan Seeger
   Sergio Durigan Junior

   ... and yours truly

As always, thank you all!

But wait, this time we also have special thanks:

To Bruno Haible for his invaluable advice and his help in thoroughly testing this new release on many different platforms and configurations.

To the Sourceware overseers, Mark Wielaard, Arsen Arsenović, and Sam James for their help in setting up the buildbots we are using for CI at sourceware.

What is new in this release:

User interface updates
  • A screen pager has been added to the poke application.  If enabled with the `.set pager yes' option, output will be paged one screenful at a time.
  • A tracer has been added to libpoke and the poke application. If enabled with the `.set tracer yes' option, subsequently loaded Poke types will be instrumented so that calls to user-defined handlers are executed when certain events happen:
    • Every time a field gets mapped.
    • Every time a struct/union gets mapped.
    • Every time a field gets constructed.
    • Every time a struct/union gets constructed.
    • Every time an optional field is omitted when mapping or constructing.
  • A new command sdiff (for "structured diff") has been added to the poke application, that provides a way to generate patchable diffs of mapped structured Poke values.  This command is an interface to the structured diffs provided by the new diff.pk pickle.
  • When no name is passed to the .mem command, a unique name for the memory IOS with the form N will be used automatically, where N is a positive integer.
  • Auto-completion of 'attributes is now available in the poke application.
  • Constraint errors now contain details on the location (which field) where the constraint error happens, along with the particular expression that failed.
  • Inline assembler expressions and statements are now supported:

    ,----
    | asm (TEMPLATE [: OUTPUTS [: INPUTS]])
    | asm TYPE: (TEMPLATE [: INPUTS])
    `----

  • Both `printf' and `format' now support printing values of type `any'.
  • Both `printf' and `format' now support printing integral values interpreted as floating-point values encoded in IEEE 754.  Format tags %f, %g and %e are supported.  This feature, along with the new ieee754.pk pickle, eases dealing with floating-point data in binary data.
  • Pre-conditional optional fields are added to complement the currently supported post-conditional optional fields. A pre-conditional optional field like the following makes FNAME optional based on the evaluation of CONDITION.  But the field itself is not mapped if the condition evaluates to false:

    ,----
    | if (CONDITION)
    |   TYPE FNAME;
    `----

  • A new option `.set autoremap no' can be used in order to tell poke to not remap mapped values automatically.  This greatly speeds up things, but assumes that the contents of the IO space are not updated out of the control of the user.  See the manual for details.
  • The :to argument to the `extract' command is now optional, and defaults to the empty string.
  • ${XDG_CONFIG_HOME:-$HOME/.config} is now preferred to XDG_CONFIG_DIRS.
Poke Language updates
  • Array and struct constructors are now primaries in the Poke syntax. This means that it is no longer necessary to enclose them in parentheses in constructions like:

    ,----
    | (Packet {}).field
    `----

    and this is now accepted:
    ,----
    | Packet {}.field
    `----

  • Bit-concatenation is now supported in l-values.  After executing the following code the value of `a' is 0x1N and the value of `b' is (uint<28>)0x2345678:

    ,----
    | var a = 0 as int<4>;
    | var b = 0 as uint<28>;
    |
    | a:::b = 0x12345678;
    `----

  • Arrays can now be indexed by size, by specifying an offset as an index.  This is particularly useful for accessing structures such as string tables without having to explicitly iterate on the array's elements.
  • Union types can now be declared as "integral".  The same features of integral structs are now available for unions: integration, deintegration, the ability of being used in contexts where an integer is expected, etc.
  • Support for "computed fields" has been added to struct and union types.  Computed fields are accessed just like regular fields, but the semantics of referring to them and of assigning to them are specified by the user by the way of defining getter and setter methods.
  • This version introduces four new Poke attributes that work on values of type `any':

    ,----
    | VAL'elem (N)
    |    evaluates to the Nth element in VAL, as a value of type `any'.
    |
    | VAL'eoffset (N)
    |    evaluates to the offset of the Nth element in VAL.
    |
    | VAL'esize (N)
    |    evaluates to the size of the Nth element in VAL.
    |
    | VAL'ename (N)
    |    evaluates to the name of the Nth element in VAL.
    `----

  • Two new operators have been introduced to facilitate operating on Poke arrays as stacks in an efficient way: apush and apop.  Since these operators change the size of the involved arrays, they are only allowed in unbounded arrays.
  • Poke programs can now hook in the IO subsystem by installing functions that will be invoked when certain operations on IO spaces are being performed:

    ,----
    | ios_open_hook
    |   Functions in this hook are invoked once a new IO space has been
    |   opened.
    |
    | ios_set_hook
    |   Functions in this hook are invoked once the current IO space
    |   changes.
    |
    | ios_close_pre_hook
    | ios_close_hook
    |   Functions in these hooks are invoked before and after an IO space is
    |   closed, respectively.
    `----

  • The 'length attribute is now valid in values of type `any'.
  • Poke declarations can now be annotated as `immutable'.  It is not allowed to re-define immutable definitions.
  • A new compiler built-in `iolist' has been introduced, that returns an array with the IO space identifiers of currently open IOS.
  • We have changed the logic of the EXCOND operator ?!.  It now evaluates to 1 (true) if the execution of the first operand raises the specified exception, and to 0 (false) otherwise.  We profusely apologize for the backwards incompatibility, but this is way better than the previous (reversed) logic.
  • The containing struct or union value can now be referred to as SELF in the body of methods.  SELF is of type `any'.
  • Integer literal suffixes (B, H, U, etc) are case-insensitive. But until now lower-case `b' wasn't being recognized as such.  Now `1B' is the same as `1b'.
  • Casting to union types now raises a compile-time error.
  • If no explicit message is specified in calls to `assert', a default one showing the source code of the failing condition is constructed and used instead.
  • An operator `remap' has been added in order to force a re-map of some mapped Poke value.
  • Signed integral types of one bit are not allowed.  How could they be, in two's complement?
  • The built-in function get_time has been renamed to gettime, to follow the usual naming of the corresponding standard C function.
Standard Poke Library updates
  • New standard functions:

    ,----
    | eoffset (V, N)
    |   Given a value of type `any' and a name, returns the offset of
    |   the element having that name.
    |
    | openset (HANDLER, [FLAGS])
    |   Open an IO space and make it the current IO space.
    |
    | with_temp_ios ([HANDLER], [FLAGS], [DO], [ENDIAN])
    |   Execute some code with a temporary IO space.
    |
    | with_cur_ios (IOS, [DO], [ENDIAN])
    |   Execute some code on some given IO space.
    `----

libpoke updates
  • New API function pk_struct_ref_set_field_value.
  • New API function pk_type_name.
Pickles updates
  • New pickles provided in the poke distribution:

    ,----
    | diff.pk
    |   Useful binary diffing utilities.  In particular, it implements
    |   the "structured diff" format as described in
    |   https://binary-tools.net/bindiff.pdf.
    |
    | io.pk
    |   Facilities to dump data to the terminal.
    |
    | pk-table.pk
    |   Convenient facilities to Poke programs to print tabulated data.
    |
    | openpgp.pk
    |   Pickle to poke at OpenPGP RFC 4880 data.
    |
    | sframe.pk
    | sframe-dump.pk
    |   Pickles for the SFrame unwinding format, and related dump
    |   utilities.
    |
    | search.pk
    |   Utility for searching data in IO spaces that conform to some
    |   given Poke type.
    |
    | riscv.pk
    |   Pickle to poke at instructions encoded in the RISC-V instruction
    |   set (RV32I).  It also provides methods to generate assembly
    |   language.
    |
    | coff.pk
    | coff-aarch64.pk
    | coff-i386.pk
    |   COFF object files.
    |
    | pe.pk
    | pe-amd64.pk
    | pe-arm.pk
    | pe-arm64.pk
    | pe-debug.pk
    | pe-i386.pk
    | pe-ia64.pk
    | pe-m32r.pk
    | pe-mips.pk
    | pe-ppc.pk
    | pe-riscv.pk
    | pe-sh3.pk
    |   PE/COFF object files.
    |
    | pcap.pk
    |   Capture file format.
    |
    | uuid.pk
    |   Universally Unique Identifier (UUID) as defined by RFC4122.
    |
    | redoxfs.pk
    |   RedoxFS file system of Redox OS.
    |
    | ieee754.pk
    |   IEEE Standard for Floating-Point Arithmetic.
    `----

  • The ELF pickle now provides functions implementing ELF hashing.
Build system updates
  • Configuring the poke sources with --disable-hserver is now supported.
Documentation updates
  • Documentation for the `format' language construction has been added to the poke manual.
Other updates
  • A new program poked, for "poke daemon", has been contributed to the poke distribution by Mohammad-Reza Nabipoor.  poked links with libpoke and uses Unix sockets to act as a broker to communicate with an instance of a Poke incremental compiler.  This is already used by several user interfaces to poke.
  • The machine-interface subsystem has been removed from poke, in favor of the poked approach.
  • The example GUI that was intended to be a test tool for the machine interface has been removed from the poke distribution.
  • Many bugs have been fixed.

--
Jose E. Marchesi
Frankfurt am Main
26 January 2023


GNU Guile: GNU Guile 3.0.9 released

Wed, 2023-01-25 09:25

We are pleased to announce the release of GNU Guile 3.0.9! This release fixes a number of bugs and adds several new features, among which:

  • New bindings for POSIX functionality, including bindings for the at family of functions (openat, statat, etc.), a new spawn procedure that wraps posix_spawn and that system* now uses, and the ability to pass flags such as O_CLOEXEC to the pipe procedure.
  • A new bytevector-slice procedure.
  • Reduced memory consumption for the linker and assembler.

For full details, see the NEWS entry, and check out the download page.

Happy Guile hacking!


FSF News: FSF board adopts updated by-laws to protect copyleft

Tue, 2023-01-24 15:41
BOSTON, Massachusetts, USA -- Tuesday, January 24, 2023 -- The board of the Free Software Foundation (FSF) today announced it has adopted updated bylaws for the nonprofit effective Feb. 1, 2023.

Andy Wingo: parallel ephemeron tracing

Tue, 2023-01-24 05:48

Hello all, and happy new year. Today's note continues the series on implementing ephemerons in a garbage collector.

In our last dispatch we looked at a serial algorithm to trace ephemerons. However, production garbage collectors are parallel: during collection, they trace the object graph using multiple worker threads. Our problem is to extend the ephemeron-tracing algorithm with support for multiple tracing threads, without introducing stalls or serial bottlenecks.

Recall that we ended up having to define a table of pending ephemerons:

  struct gc_pending_ephemeron_table {
    struct gc_ephemeron *resolved;
    size_t nbuckets;
    struct gc_ephemeron *buckets[0];
  };

This table holds pending ephemerons that have been visited by the graph tracer but whose keys haven't been found yet, as well as a singly-linked list of resolved ephemerons that are waiting to have their values traced. As a global data structure, the pending ephemeron table is a point of contention between tracing threads that we need to design around.

a confession

Allow me to confess my sins: things would be a bit simpler if I didn't allow tracing workers to race.

As background, if your GC supports marking in place instead of always evacuating, then there is a mark bit associated with each object. To reduce the overhead of contention, a common strategy is to actually use a whole byte for the mark bit, and to write to it using relaxed atomics (or even raw stores). This avoids the cost of a compare-and-swap, but at the cost that multiple marking threads might see that an object's mark was unset, go to mark the object, and think that they were the thread that marked the object. As far as the mark byte goes, that's OK because everybody is writing the same value. The object gets pushed on the to-be-traced grey object queues multiple times, but that's OK too because tracing should be idempotent.
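
As an illustration of the racy mark-byte strategy just described, here is a minimal sketch using C11 atomics. The function name mark_object_racy is hypothetical; the point is only that the load and the store are separate relaxed operations rather than one compare-and-swap.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the caller should consider itself the marker and push
   the object on its grey queue.  Two racing threads may both observe 0
   and both return true; that is tolerated because both store the same
   value and tracing is idempotent. */
static bool mark_object_racy(_Atomic uint8_t *mark_byte)
{
    if (atomic_load_explicit(mark_byte, memory_order_relaxed))
        return false;   /* already marked by someone */
    atomic_store_explicit(mark_byte, 1, memory_order_relaxed);
    return true;
}
```

The window between the load and the store is exactly where two tracers can both believe they marked the object, which is why ephemerons need the extra state machine described below.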

This is a common optimization for parallel marking, and it doesn't have any significant impact on other parts of the GC--except ephemeron marking. For ephemerons, because the state transition isn't simply from unmarked to marked, we need more coordination.

high level

The parallel ephemeron marking algorithm modifies the serial algorithm in just a few ways:

  1. We have an atomically-updated state field in the ephemeron, used to know if e.g. an ephemeron is pending or resolved;

  2. We use separate fields for the pending and resolved links, to allow for concurrent readers across a state change;

  3. We introduce "traced" and "claimed" states to resolve races between parallel tracers on the same ephemeron, and track the "epoch" at which an ephemeron was last traced;

  4. We remove resolved ephemerons from the pending ephemeron hash table lazily, and use atomic swaps to pop from the resolved ephemerons list;

  5. We have to re-check key liveness after publishing an ephemeron to the pending ephemeron table.

Regarding the first point, there are four possible values for the ephemeron's state field:

enum { TRACED, CLAIMED, PENDING, RESOLVED };

The state transition diagram looks like this:

    ,----->TRACED<-----.
   ,         | ^        .
  ,          v |         .
  |        CLAIMED        |
  |  ,-----/     \---.    |
  |  v               v    |
  PENDING--------->RESOLVED

With this information, we can start to flesh out the ephemeron object itself:

  struct gc_ephemeron {
    uint8_t state;
    uint8_t is_dead;
    unsigned epoch;
    struct gc_ephemeron *pending;
    struct gc_ephemeron *resolved;
    void *key;
    void *value;
  };

The state field holds one of the four state values; is_dead indicates if a live ephemeron was ever proven to have a dead key, or if the user explicitly killed the ephemeron; and epoch is the GC count at which the ephemeron was last traced. Ephemerons are born TRACED in the current GC epoch, and the collector is responsible for incrementing the current epoch before each collection.

algorithm: tracing ephemerons

When the collector first finds an ephemeron, it does a compare-and-swap (CAS) on the state from TRACED to CLAIMED. If that succeeds, we check the epoch; if it's current, we revert to the TRACED state: there's nothing to do.

(Without marking races, you wouldn't need either TRACED or CLAIMED states, or the epoch; it would be implicit in the fact that the ephemeron was being traced at all that you had a TRACED ephemeron with an old epoch.)

So now we have a CLAIMED ephemeron with an out-of-date epoch. We update the epoch and clear the pending and resolved fields, setting them to NULL. If, then, the ephemeron is_dead, we are done, and we go back to TRACED.

Otherwise we check if the key has already been traced. If so we forward it (if evacuating) and then trace the value edge as well, and transition to TRACED.

Otherwise we have a live E but we don't know about K; this ephemeron is pending. We transition E's state to PENDING and add it to the front of K's hash bucket in the pending ephemerons table, using CAS to avoid locks.

We then have to re-check if K is live, after publishing E, to account for other threads racing to mark K while we mark E; if indeed K is live, then we transition to RESOLVED and push E on the global resolved ephemeron list, using CAS, via the resolved link.

So far, so good: either the ephemeron is fully traced, or it's pending and published, or (rarely) published-then-resolved and waiting to be traced.
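
The steps above can be sketched with C11 atomics as follows. This is a toy model, not the actual implementation: struct toy_obj stands in for heap objects (one mark byte, no evacuation, so there is no key forwarding), the pending table is reduced to a single bucket, "tracing the value edge" just sets the value's mark, and all helper names are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

enum { TRACED, CLAIMED, PENDING, RESOLVED };

/* Toy heap object: one mark byte stands in for the liveness test. */
struct toy_obj { _Atomic uint8_t marked; };

struct gc_ephemeron {
    _Atomic uint8_t state;
    uint8_t is_dead;
    unsigned epoch;
    struct gc_ephemeron *_Atomic pending;   /* link in the pending bucket */
    struct gc_ephemeron *_Atomic resolved;  /* link in the resolved list */
    struct toy_obj *key;
    struct toy_obj *value;
};

/* Toy stand-ins for the pending table (one bucket) and resolved list. */
static struct gc_ephemeron *_Atomic pending_bucket;
static struct gc_ephemeron *_Atomic resolved_list;

static void push_pending(struct gc_ephemeron *e)
{
    struct gc_ephemeron *head = atomic_load(&pending_bucket);
    do {
        atomic_store(&e->pending, head);
        /* On failure, head is reloaded with the current bucket head. */
    } while (!atomic_compare_exchange_weak(&pending_bucket, &head, e));
}

static void push_resolved(struct gc_ephemeron *e)
{
    struct gc_ephemeron *head = atomic_load(&resolved_list);
    do {
        atomic_store(&e->resolved, head);
    } while (!atomic_compare_exchange_weak(&resolved_list, &head, e));
}

static void trace_ephemeron(struct gc_ephemeron *e, unsigned current_epoch)
{
    uint8_t expected = TRACED;
    if (!atomic_compare_exchange_strong(&e->state, &expected, CLAIMED))
        return;                             /* another tracer has it */
    if (e->epoch == current_epoch) {        /* already traced this cycle */
        atomic_store(&e->state, TRACED);
        return;
    }
    e->epoch = current_epoch;
    atomic_store(&e->pending, NULL);
    atomic_store(&e->resolved, NULL);
    if (e->is_dead) {
        atomic_store(&e->state, TRACED);
        return;
    }
    if (atomic_load(&e->key->marked)) {
        atomic_store(&e->value->marked, 1); /* trace the value edge */
        atomic_store(&e->state, TRACED);
        return;
    }
    /* Key not seen yet: publish E as pending, then re-check the key. */
    atomic_store(&e->state, PENDING);
    push_pending(e);
    if (atomic_load(&e->key->marked)) {
        expected = PENDING;
        if (atomic_compare_exchange_strong(&e->state, &expected, RESOLVED))
            push_resolved(e);
    }
}
```

The CAS on the PENDING to RESOLVED transition matters because a thread tracing K may be resolving the same ephemeron concurrently; only one of the two should push it on the resolved list.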

algorithm: tracing objects

The annoying thing about tracing ephemerons is that it potentially impacts tracing of all objects: any object could be the key that resolves a pending ephemeron.

When we trace an object, we look it up in the pending ephemeron hash table. But, as we traverse the chains in a bucket, we also load each node's state. If we find a node that's not in the PENDING state, we atomically forward its predecessor to point to its successor. This is correct for concurrent readers because the end of the chain is always reachable: we only skip nodes that are not PENDING, nodes never become PENDING after they transition away from being PENDING, and we only add PENDING nodes to the front of the chain. We even leave the pending field in place, so that any concurrent reader of the chain can still find the tail, even when the ephemeron has gone on to be RESOLVED or even TRACED.

(I had thought I would need Tim Harris' atomic list implementation, but it turns out that since I only ever insert items at the head, having annotated links is not necessary.)

If we find a PENDING ephemeron that has K as its key, then we CAS its state from PENDING to RESOLVED. If this works, we CAS it onto the front of the resolved list. (Note that we also have to forward the key at this point, for a moving GC; this was a bug in my original implementation.)
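
Here is one way the bucket walk could look, again as a hedged sketch with C11 atomics. The ephemeron struct is trimmed to the fields the walk touches, resolve_pending_for and push_resolved are hypothetical names, key forwarding for a moving GC is omitted, and the unlinking is done with a best-effort CAS, which is an assumption on my part about how to "atomically forward" the predecessor.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

enum { TRACED, CLAIMED, PENDING, RESOLVED };

/* Trimmed-down ephemeron: only the fields this sketch touches. */
struct gc_ephemeron {
    _Atomic uint8_t state;
    struct gc_ephemeron *_Atomic pending;   /* link in the bucket chain */
    struct gc_ephemeron *_Atomic resolved;  /* link in the resolved list */
    void *key;
};

static struct gc_ephemeron *_Atomic resolved_list;

static void push_resolved(struct gc_ephemeron *e)
{
    struct gc_ephemeron *head = atomic_load(&resolved_list);
    do {
        atomic_store(&e->resolved, head);
    } while (!atomic_compare_exchange_weak(&resolved_list, &head, e));
}

/* Called when obj has just been traced.  bucket is the head of the
   chain that obj hashes to in the pending ephemeron table. */
static void resolve_pending_for(struct gc_ephemeron *_Atomic *bucket,
                                void *obj)
{
    struct gc_ephemeron *_Atomic *link = bucket;  /* predecessor's link */
    struct gc_ephemeron *e = atomic_load(link);
    while (e) {
        struct gc_ephemeron *next = atomic_load(&e->pending);
        if (atomic_load(&e->state) != PENDING) {
            /* Lazily unlink e by pointing the predecessor at the
               successor.  If the CAS fails, someone else changed the
               link; e stays in the chain until a later walk. */
            struct gc_ephemeron *expected = e;
            atomic_compare_exchange_strong(link, &expected, next);
        } else {
            if (e->key == obj) {
                uint8_t pending = PENDING;
                if (atomic_compare_exchange_strong(&e->state, &pending,
                                                   RESOLVED))
                    push_resolved(e);
            }
            link = &e->pending;
        }
        e = next;
    }
}
```

Because e's own pending field is never cleared during the walk, a concurrent reader that is currently standing on an unlinked node can still follow it to the tail of the chain.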

algorithm: resolved ephemerons

Periodically a thread tracing the graph will run out of objects to trace (its mark stack is empty). That's a good time to check if there are resolved ephemerons to trace. We atomically exchange the global resolved list with NULL; if there were resolved ephemerons, we trace their values and transition them to TRACED.
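
Draining the resolved list can be sketched like this. As before, this is a toy under stated assumptions: the struct is trimmed down, drain_resolved and toy_trace_value are hypothetical names, and a real collector would push the value on its mark stack instead of calling a callback.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

enum { TRACED, CLAIMED, PENDING, RESOLVED };

/* Trimmed-down ephemeron: only the fields this sketch touches. */
struct gc_ephemeron {
    _Atomic uint8_t state;
    struct gc_ephemeron *_Atomic resolved;  /* link in the resolved list */
    void *value;
};

static struct gc_ephemeron *_Atomic resolved_list;

/* Toy tracer for demonstration: value points to an int flag. */
static void toy_trace_value(void *value)
{
    *(int *)value = 1;
}

/* Take the whole resolved list in one atomic exchange, then trace each
   value and move the ephemeron back to TRACED.  Returns how many
   ephemerons were processed. */
static size_t drain_resolved(void (*trace_value)(void *value))
{
    struct gc_ephemeron *e = atomic_exchange(&resolved_list, NULL);
    size_t n = 0;
    while (e) {
        struct gc_ephemeron *next = atomic_load(&e->resolved);
        trace_value(e->value);
        atomic_store(&e->state, TRACED);
        e = next;
        n++;
    }
    return n;
}
```

Since the exchange detaches the entire list at once, the draining thread owns every node it got and does not need any further synchronization on the list itself; concurrent pushes simply start a new list.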

At the very end of the GC cycle, we sweep the pending ephemeron table, marking any ephemeron that's still there as is_dead, transitioning them back to TRACED, clearing the buckets of the pending ephemeron table as we go.

nits

So that's it. There are some drawbacks, for example that this solution takes at least three words per ephemeron. Oh well.

There is also an annoying point of serialization, which is related to the lazy ephemeron resolution optimization. Consider that checking the pending ephemeron table on every object visit is overhead; it would be nice to avoid this. So instead, we start in "lazy" mode, in which pending ephemerons are never resolved by marking; and then once the mark stack / grey object worklist fully empties, we sweep through the pending ephemeron table, checking each ephemeron's key to see if it was visited in the end, and resolving those ephemerons; we then switch to "eager" mode in which each object visit could potentially resolve ephemerons. In this way the cost of ephemeron tracing is avoided for that part of the graph that is strongly reachable. However, with parallel markers, would you switch to eager mode when any thread runs out of objects to mark, or when all threads run out of objects? You would get greatest parallelism with the former, but you run the risk of some workers prematurely running out of data while there is still a significant part of the strongly-reachable graph to traverse. If you wait for all threads to be done, you introduce a serialization point. There is a related question of when to pump the resolved ephemerons list. But these are engineering details.

Speaking of details, there are some gnarly pitfalls. In particular, you have to be very careful about pre-visit versus post-visit object addresses. For a semi-space collector, visiting an object will move it; since the pending ephemeron table is by definition keyed by pre-visit (fromspace) object addresses, you need to be sure to trace the ephemeron key for any transition to RESOLVED. There are a few places where this happens: the re-check after publish, sweeping the table after transitioning from lazy to eager, and when resolving eagerly.

implementation

If you've read this far, you may be interested in the implementation; it's only a few hundred lines long. It took me quite a while to whittle it down!

Ephemerons are challenging from a software engineering perspective, because they are logically a separate module, but they interact both with users of the GC and with the collector implementations. It's tricky to find the abstractions that work for all GC algorithms, whether they mark in place or move their objects, and whether they mark the heap precisely or if there are some conservative edges. But if this is the sort of thing that interests you, voilà the API for users and the API to and from collector implementations.

And, that's it! I am looking forward to climbing out of this GC hole, one blog at a time. There are just a few more features before I can seriously attack integrating this into Guile. Until the next time, happy hacking :)

texinfo @ Savannah: Texinfo 7.0.2 released

Mon, 2023-01-23 11:52

We have released version 7.0.2 of Texinfo, the GNU documentation format. This is a minor bug-fix release.

It's available via a mirror (xz is much smaller than gz, but gz is available too just in case):

http://ftpmirror.gnu.org/texinfo/texinfo-7.0.2.tar.xz
http://ftpmirror.gnu.org/texinfo/texinfo-7.0.2.tar.gz

Please send any comments to bug-texinfo@gnu.org.

Full announcement:

https://lists.gnu.org/archive/html/info-gnu/2023-01/msg00008.html

GNU Guix: Meet Guix at FOSDEM

Mon, 2023-01-23 08:56

GNU Guix will be present at FOSDEM next week, February 4th and 5th. This is the first time since the pandemic that FOSDEM takes place again “in the flesh” in Brussels, which is exciting to those of us lucky enough to get there! Everything will be live-streamed and recorded thanks to the amazing FOSDEM crew, so everyone can enjoy wherever they are; some of the talks this year will be “remote” too: pre-recorded videos followed by live Q&A sessions with the speaker.

Believe it or not, it’s the 9th year Guix is represented at FOSDEM, with more than 30 talks given in past editions! This year brings several talks that will let you learn more about different areas of the joyful Hydra Guix has become.

This all starts on Saturday, in particular with the amazing declarative and minimalistic computing track:

There are many other exciting talks in this track, some of which are closely related to Guix and Guile; check it out!

You can also discover Guix in other tracks:

As was the case pre-pandemic, we are also organizing the Guix Days as a FOSDEM fringe event, a two-day Guix workshop where contributors and enthusiasts will meet. The workshop takes place on Thursday Feb. 2nd and Friday Feb. 3rd at the Institute of Cultural Affairs (ICAB) in Brussels.

Again this year there will be few talks; instead, the event will consist primarily of “unconference-style” sessions focused on specific hot topics about Guix, the Shepherd, continuous integration, and related tools and workflows.

Attendance to the workshop is free and open to everyone, though you are invited to register (there are few seats left!). Check out the workshop’s wiki page for registration and practical info. Hope to see you in Brussels!

About GNU Guix

GNU Guix is a transactional package manager and an advanced distribution of the GNU system that respects user freedom. Guix can be used on top of any system running the Hurd or the Linux kernel, or it can be used as a standalone operating system distribution for i686, x86_64, ARMv7, AArch64, and POWER9 machines.

In addition to standard package management features, Guix supports transactional upgrades and roll-backs, unprivileged package management, per-user profiles, and garbage collection. When used as a standalone GNU/Linux distribution, Guix offers a declarative, stateless approach to operating system configuration management. Guix is highly customizable and hackable through Guile programming interfaces and extensions to the Scheme language.

parallel @ Savannah: GNU Parallel 20230122 ('Bolsonaristas') released [stable]

Sun, 2023-01-22 13:17

GNU Parallel 20230122 ('Bolsonaristas') has been released. It is available for download at: lbry://@GnuParallel:4

Quote of the month:

  Colorful output
  parallel, with --color flag
  tasks more vibrant now
    -- ChatGPT

New in this release:

  • Bug fixes and man page updates.

News about GNU Parallel:

GNU Parallel - For people who live life in the parallel lane.

If you like GNU Parallel, record a video testimonial: Say who you are, what you use GNU Parallel for, how it helps you, and what you like most about it. Include a command that uses GNU Parallel if you feel like it.

About GNU Parallel

GNU Parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU Parallel can then split the input and pipe it into commands in parallel.

If you use xargs and tee today you will find GNU Parallel very easy to use as GNU Parallel is written to have the same options as xargs. If you write loops in shell, you will find GNU Parallel may be able to replace most of the loops and make them run faster by running several jobs in parallel. GNU Parallel can even replace nested loops.

GNU Parallel makes sure output from the commands is the same output as you would get had you run the commands sequentially. This makes it possible to use output from GNU Parallel as input for other programs.

For example you can run this to convert all jpeg files into png and gif files and have a progress bar:

  parallel --bar convert {1} {1.}.{2} ::: *.jpg ::: png gif

Or you can generate big, medium, and small thumbnails of all jpeg files in sub dirs:

  find . -name '*.jpg' |
    parallel convert -geometry {2} {1} {1//}/thumb{2}_{1/} :::: - ::: 50 100 200

You can find more about GNU Parallel at: http://www.gnu.org/s/parallel/

You can install GNU Parallel in just 10 seconds with:

    $ (wget -O - pi.dk/3 || lynx -source pi.dk/3 || curl pi.dk/3/ || \
       fetch -o - http://pi.dk/3 ) > install.sh
    $ sha1sum install.sh | grep 883c667e01eed62f975ad28b6d50e22a
    12345678 883c667e 01eed62f 975ad28b 6d50e22a
    $ md5sum install.sh | grep cc21b4c943fd03e93ae1ae49e28573c0
    cc21b4c9 43fd03e9 3ae1ae49 e28573c0
    $ sha512sum install.sh | grep ec113b49a54e705f86d51e784ebced224fdff3f52
    79945d9d 250b42a4 2067bb00 99da012e c113b49a 54e705f8 6d51e784 ebced224
    fdff3f52 ca588d64 e75f6033 61bd543f d631f592 2f87ceb2 ab034149 6df84a35
    $ bash install.sh

Watch the intro video on http://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

Walk through the tutorial (man parallel_tutorial). Your command line will love you for it.

When using programs that use GNU Parallel to process data for publication please cite:

O. Tange (2018): GNU Parallel 2018, March 2018, https://doi.org/10.5281/zenodo.1146014.

If you like GNU Parallel:

  • Give a demo at your local user group/team/colleagues
  • Post the intro videos on Reddit/Diaspora*/forums/blogs/Identi.ca/Google+/Twitter/Facebook/Linkedin/mailing lists
  • Get the merchandise https://gnuparallel.threadless.com/designs/gnu-parallel
  • Request or write a review for your favourite blog or magazine
  • Request or build a package for your favourite distribution (if it is not already there)
  • Invite me for your next conference

If you use programs that use GNU Parallel for research:

  • Please cite GNU Parallel in your publications (use --citation)

If GNU Parallel saves you money:

About GNU SQL

GNU sql aims to give a simple, unified interface for accessing databases through all the different databases' command line clients. So far the focus has been on giving a common way to specify login information (protocol, username, password, hostname, and port number), size (database and table size), and running queries.

The database is addressed using a DBURL. If commands are left out you will get that database's interactive shell.

When using GNU SQL for a publication please cite:

O. Tange (2011): GNU SQL - A Command Line Tool for Accessing Different Databases Using DBURLs, ;login: The USENIX Magazine, April 2011:29-32.

About GNU Niceload

GNU niceload slows down a program when the computer load average (or other system activity) is above a certain limit. When the limit is reached the program will be suspended for some time. If the limit is a soft limit the program will be allowed to run for short amounts of time before being suspended again. If the limit is a hard limit the program will only be allowed to run when the system is below the limit.

Simon Josefsson: Understanding Trisquel

Sun, 2023-01-22 06:10

Ever wondered how Trisquel and Ubuntu differ and what’s behind the curtain from a developer perspective? I have. Sharing what I’ve learnt will allow you to increase your knowledge of, and trust in, Trisquel too.

The scripts to convert an Ubuntu archive into a Trisquel archive are available in the ubuntu-purge repository. The easy to read purge-focal script lists the packages to remove from Ubuntu 20.04 Focal when it is imported into Trisquel 10.0 Nabia. The purge-jammy script provides the same for Ubuntu 22.04 Jammy and (the not yet released) Trisquel 11.0 Aramo. The list of packages is interesting, and by researching the reasons for each exclusion you can learn a lot about different attitudes towards free software and understand the desire to improve matters. I wish there were a wiki-page that for each removed package summarized relevant links to earlier discussions. At the end of the script there is a bunch of packages that are removed for branding purposes that are less interesting to review.

Trisquel adds a couple of Trisquel-specific packages. The source code for these packages is in the trisquel-packages repository, with sub-directories for each release: see 10.0/ for Nabia and 11.0/ for Aramo. These packages appear to be mostly for branding purposes.

Trisquel modifies a set of packages, and here it starts to get interesting. Probably the most important modification is the use of GNU Linux-libre instead of Linux as the kernel. The scripts to modify packages are in the package-helpers repository. The relevant scripts are in the helpers/ sub-directory. There is a branch for each Trisquel release, see helpers/ for Nabia and helpers/ for Aramo. To see how Linux is replaced with Linux-libre you can read the make-linux script.

This covers the basics of approaching Trisquel from a developer's perspective. As a user, I have identified some areas that need more work to improve trust in Trisquel:

  • Audit the Trisquel archive to confirm that the intended changes covered above are the only changes that are published.
  • Rebuild all packages that were added or modified by Trisquel and publish diffoscope output comparing them to what’s in the Trisquel archive. The goal would be to have reproducible builds of all Trisquel-related packages.
  • Publish an audit log of the Trisquel archive to allow auditing of what packages are published. This boils down to trust of the OpenPGP key used to sign the Trisquel archive.
  • Audit the Trisquel archive mirrors to confirm that they publish only what comes from the official archive, and that they do so in a timely manner.

I hope to publish more about my work in these areas. Hopefully this will inspire similar efforts in related distributions like PureOS and the upstream distributions Ubuntu and Debian.

Happy hacking!

FSF News: FSF now accepting board nominations from associate members

Thu, 2023-01-19 17:55
BOSTON, Massachusetts, USA -- Thursday, January 19, 2023 -- Associate members of the Free Software Foundation (FSF) now have the chance to nominate and evaluate candidates to serve on the board of directors for the first time since the nonprofit was founded thirty-seven years ago.

FSF Events: Free Software Directory meeting on IRC: Friday, January 27, starting at 12:00 EST (17:00 UTC)

Tue, 2023-01-17 14:28
Join the FSF and friends on Friday, January 27, from 12:00 to 15:00 EST (17:00 to 20:00 UTC) to help improve the Free Software Directory.

FSF Events: Free Software Directory meeting on IRC: Friday, January 20, starting at 12:00 EST (17:00 UTC)

Tue, 2023-01-17 14:25
Join the FSF and friends on Friday, January 20, from 12:00 to 15:00 EST (17:00 to 20:00 UTC) to help improve the Free Software Directory.

diffutils @ Savannah: diffutils-3.9 released [stable]

Sun, 2023-01-15 19:16

This is to announce diffutils-3.9, a stable release.

There have been 51 commits by 3 people in the 76 weeks since 3.8.

See the NEWS below for a brief summary.

Thanks to everyone who has contributed!
The following people contributed changes to this release:

  Bruno Haible (1)
  Jim Meyering (14)
  Paul Eggert (36)

Jim [on behalf of the diffutils maintainers]
==================================================================

Here is the GNU diffutils home page:
    http://gnu.org/s/diffutils/

For a summary of changes and contributors, see:
  http://git.sv.gnu.org/gitweb/?p=diffutils.git;a=shortlog;h=v3.9
or run this command from a git-cloned diffutils directory:
  git shortlog v3.8..v3.9

To summarize the 931 gnulib-related changes, run these commands
from a git-cloned diffutils directory:
  git checkout v3.9
  git submodule summary v3.8

Here are the compressed sources and a GPG detached signature:
  https://ftp.gnu.org/gnu/diffutils/diffutils-3.9.tar.xz
  https://ftp.gnu.org/gnu/diffutils/diffutils-3.9.tar.xz.sig

Use a mirror for higher download bandwidth:
  https://ftpmirror.gnu.org/diffutils/diffutils-3.9.tar.xz
  https://ftpmirror.gnu.org/diffutils/diffutils-3.9.tar.xz.sig

Here are the SHA1 and SHA256 checksums:

35905d7c3d1ce116e6794be7fe894cd25b2ded74  diffutils-3.9.tar.xz
2A076QogGGjeg9eNrTQTrYgWDMU7zDbrnq98INvwI/E  diffutils-3.9.tar.xz

The SHA256 checksum is base64 encoded, instead of the
hexadecimal encoding that most checksum tools default to.
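
To compare this base64 value with the hexadecimal output of common checksum tools, one option is to convert it. The following Python sketch uses only the standard library; the function names are illustrative, and the padding step assumes the checksum was printed without its trailing "=" as above.

```python
import base64
import hashlib


def sha256_base64_to_hex(b64_digest: str) -> str:
    """Convert a base64-encoded SHA-256 digest to lowercase hexadecimal."""
    # Restore any '=' padding that was dropped when the digest was printed.
    padded = b64_digest + "=" * (-len(b64_digest) % 4)
    raw = base64.b64decode(padded)
    if len(raw) != 32:
        raise ValueError("not a SHA-256 digest: expected 32 bytes")
    return raw.hex()


def file_matches(path: str, b64_digest: str) -> bool:
    """Hash the file at `path` and compare it with the base64 digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large tarballs do not need to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == sha256_base64_to_hex(b64_digest)
```

For example, file_matches("diffutils-3.9.tar.xz", "2A076QogGGjeg9eNrTQTrYgWDMU7zDbrnq98INvwI/E") should return True for an intact download. This only checks integrity; the GPG signature check below is still what ties the tarball to the maintainer's key.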

Use a .sig file to verify that the corresponding file (without the
.sig suffix) is intact.  First, be sure to download both the .sig file
and the corresponding tarball.  Then, run a command like this:

  gpg --verify diffutils-3.9.tar.xz.sig

The signature should match the fingerprint of the following key:

  pub   rsa4096/0x7FD9FCCB000BEEEE 2010-06-14 [SCEA]
        Key fingerprint = 155D 3FC5 00C8 3448 6D1E  EA67 7FD9 FCCB 000B EEEE
  uid                   [ unknown] Jim Meyering <jim@meyering.net>
  uid                   [ unknown] Jim Meyering <meyering@fb.com>
  uid                   [ unknown] Jim Meyering <meyering@gnu.org>

If that command fails because you don't have the required public key,
or that public key has expired, try the following commands to retrieve
or refresh it, and then rerun the 'gpg --verify' command.

  gpg --locate-external-key jim@meyering.net

  gpg --recv-keys 7FD9FCCB000BEEEE

  wget -q -O- 'https://savannah.gnu.org/project/release-gpgkeys.php?group=diffutils&download=1' | gpg --import -

As a last resort to find the key, you can try the official GNU
keyring:

  wget -q https://ftp.gnu.org/gnu/gnu-keyring.gpg
  gpg --keyring gnu-keyring.gpg --verify diffutils-3.9.tar.xz.sig


This release was bootstrapped with the following tools:
  Autoconf 2.72a.65-d081
  Automake 1.16i
  Gnulib v0.1-5689-g83adc2f722

==================================================================
NEWS

* Noteworthy changes in release 3.9 (2023-01-15) [stable]

** Bug fixes

  diff -c and -u no longer output incorrect timezones in headers
  on platforms like Solaris where struct tm lacks tm_gmtoff.
  [bug#51228 introduced in 3.4]

GNU Health: Jérôme Lejeune Foundation adopts GNU Health

Sat, 2023-01-14 11:57

We start 2023 with exciting news for the medical and scientific community!

GNU Health has been adopted by the Jérôme Lejeune Foundation, a leading organization in the research and management of trisomy 21 (Down Syndrome) and other intellectual disabilities of genetic origin.

Lejeune foundation has its headquarters in France, with offices in Argentina, the United States and Spain.

In December 2022, the faculty of engineering from the University of Entre Rios, represented by the dean Diego Campana and the head of the school of Public Health, Fernando Sassetti, formalized the agreement with the president of the Lejeune foundation in Argentina, Luz Morano.

The same month, I met in Madrid with the medical director and IT team of the Lejeune foundation Spain.

Luz Morano declared: “[GNU Health] goes beyond the Foundation, providing the health professionals the specific features to manage a patient with trisomy 21. We are putting a project in the hands of humanity.”

Morano also stated: “GNU Health will pave the road for the medical management, and let us focus on our two other missions: Research and the defense of patient rights.”

The agreement is in the context of the GNU Health Alliance of Academic and Research Institutions that UNER has with GNU Solidario. In this sense, Fernando Sassetti explained “It provides tools for an integrative approach of those people with certain pathologies that due to the reduced number are not managed in the best way. This will benefit the organizations and health professionals, that today lack the means to do so in the best way and timely manner. It benefits the patients, in their right to have an integral health record.”

Research and Open Science

The adoption of GNU Health by the Jérôme Lejeune Foundation opens new exciting avenues for the scientific community. In addition to the clinical management and medical history, GNU Health will enable scientists to dive into the fields of genomics, epigenetics and exposomics, gathering and processing information from multiple contexts and subjects, thanks to the distributed nature of the GNU Health Federation.

The GNU Health HMIS includes many packages and features, some of them of special interest for this project. In addition to the specific customizations for the foundation, the packages already present in GNU Health, such as obstetrics, pediatrics, genomics, socioeconomics or lifestyle, will provide a holistic approach to the person with trisomy 21 and other related conditions.

All of this will be done using exclusively Free/Libre software and open science.

People before Patients

Trisomy 21 poses challenges for the individual, their family, health professionals and the society. The scientific community needs to push the research to shed light on the etiology, physiopathology and associated clinical manifestations, such as heart defects, blood disorders or Alzheimer’s.

Most importantly, as part of the scientific community, we must put a stop to the discrimination and stigmatization. We must tear down the barriers and walls built in our societies that prevent the inclusion of individuals with trisomy 21.

As part of this effort, GNU Health provides the WHO International Classification of Functioning, Disability and Health (ICF). In other words, it is not just the health condition or disorder we may have, but how the environmental factors and barriers influence the normal functioning and integration as individuals in the society. Many times, those physical, artificial barriers present in our daily lives are way more pernicious than the condition itself.

The strong focus of GNU Health on Social Medicine, and the way we perceive medicine as a social science, will help improve the life of the person living with trisomy 21, and contribute to the much needed healing process in our societies. We need to work on the molecular basis of the health conditions, but little can be done without empathetic, inclusive and supportive societies so people can live and enjoy life with dignity, no matter their health or socioeconomic status.

Projects like this represent the spirit of GNU Health and make me immensely proud to be part of this community.

Happy and healthy hacking!
Luis Falcon, MD
President, GNU Solidario
