drunomics: A Journey Towards Sustainability and Team Building

drunomics Team Event 2024, Burgas, Bulgaria
jurgen.thano, Fri, 07/05/2024 - 15:15

The drunomics team gathered for an event full of excitement and inspiration in Burgas, Bulgaria. From bonding activities to insightful discussions and workshops, it was a day full of energy and enthusiasm. Want to know more? Our blog post covers all the key moments and takeaways from this fantastic day!

At drunomics GmbH, our dedication to sustainable digital practices is at the heart of everything we do. We see technology as a powerful force for positive change, and our recent team event in Burgas, Bulgaria, made sustainability its focal point. Here is a glimpse into the vibrant discussions, collaborative learning, and memorable moments that made this event truly special.

Nurturing Sustainability

1. Green UX Design Insights

During our event, we immersed ourselves in the principles of Green UX design. Our discussions centered on creating digital experiences that are both user-friendly and environmentally conscious. Here are some of the key insights we shared:

  • Optimizing Code for Efficiency: Streamlining scripts and using efficient algorithms not only boosts performance but also cuts down on energy consumption, contributing to a greener web.
  • Data Transfer Reduction: Minimizing data transfer is crucial for sustainability. We explored methods like lazy loading, content compression, and efficient caching to create smoother user experiences while reducing our digital carbon footprint.
  • Balancing Aesthetics and Sustainability: Green UX is about finding harmony between visual appeal and resource efficiency. Thoughtful use of images, animations, and fonts can achieve this balance effectively.
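
As a rough illustration of the content compression point above, the following sketch compares the size of a payload before and after gzip compression. The HTML fragment is a made-up, highly repetitive stand-in for a real page, so the savings it shows are not representative of any particular site.

```python
import gzip

# Hypothetical payload: a repetitive HTML fragment standing in for a real page.
html = ("<li class='item'>Sustainable product</li>\n" * 200).encode("utf-8")

compressed = gzip.compress(html)

# Fraction of bytes saved on the wire by compressing before transfer.
saved = 1 - len(compressed) / len(html)
print(f"original: {len(html)} bytes, gzip: {len(compressed)} bytes, saved: {saved:.0%}")

# The content survives the round trip unchanged.
restored = gzip.decompress(compressed)
```

Real pages compress less dramatically than this artificial example, but text formats such as HTML, CSS, and JavaScript generally shrink noticeably.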
2. User-Centric Sustainability

Our discussions also focused on empowering users to make eco-friendly choices through smart, intuitive design:

  • Empowering Users: We brainstormed ways to integrate sustainability into user interactions, such as through informative tooltips, personalized recommendations, and eco-friendly badges.

  • Behavioral Nudges: Subtle prompts within the user experience can encourage eco-friendly behavior, like promoting energy-saving modes, suggesting public transportation options, or highlighting sustainable product choices.
3. Collaborative Learning

The event was an engaging forum for sharing ideas and fostering dynamic discussions. We challenged assumptions and explored new perspectives on integrating sustainability into our projects. This collaborative approach helped us discover innovative solutions and prepared us to infuse eco-consciousness into every stage of our work.

Real Python: The Real Python Podcast – Episode #211: Python Doesn't Round Numbers the Way You Might Think

Does Python round numbers the same way you learned back in math class? You might be surprised by the default method Python uses and the variety of ways to round numbers in Python. Christopher Trudeau is back on the show this week, bringing another batch of PyCoder's Weekly articles and projects.
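
For readers who want to see the behaviour in question, here is a minimal sketch. The built-in round() resolves exact ties by rounding to the nearest even integer, while the decimal module can be asked for the "round half up" rule usually taught in school. The helper name round_half_up is made up for this example.

```python
from decimal import Decimal, ROUND_HALF_UP

# Built-in round() sends exact ties to the nearest even integer.
ties = [0.5, 1.5, 2.5, 3.5]
bankers = [round(x) for x in ties]
print(bankers)  # [0, 2, 2, 4]


def round_half_up(value: str) -> int:
    """Round a decimal string to an integer, with ties going away from zero."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


school = [round_half_up(s) for s in ["0.5", "1.5", "2.5", "3.5"]]
print(school)  # [1, 2, 3, 4]
```

The values are passed to Decimal as strings so that the tie is exact rather than subject to binary floating point representation.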


mark.ie: A bash script to install different Drupal profiles the easy way

Over the past few weeks I've been sharing handy ways to set up Drupal for easier Drupal core development. Here's a bash script for installing Drupal and allowing you to choose what profile you want.

The Python Show: Dashboards in Python with Streamlit

This week, I chatted with Channin Nantasenamat about Python and the Streamlit web framework.

Specifically, we chatted about the following topics:

  • Python packages

  • Streamlit

  • Teaching bioinformatics

  • Differences in data science disciplines

  • Being a YouTuber

  • and much more!

Reproducible Builds (diffoscope): diffoscope 272 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 272. This version includes the following changes:

[ Chris Lamb ]
* Move away from using DSA OpenSSH keys in tests; support has been removed in OpenSSH 9.8p1. (Closes: reproducible-builds/diffoscope#382)
* Move to assert_diff helper in test_openssh_pub_key.py
* Update copyright years.

You can find out more by visiting the project homepage.

Carl Trachte: DAG Hamilton Workflow for Toy Text Processing Script

Hello. It's been a minute.

I was fortunate to attend PYCON US in Pittsburgh earlier this year. DAGWorks had a booth on the expo floor where I discovered Hamilton. The project grabbed my attention as something that could help organize and present my code workflow better. My reaction could be compared to browsing Walmart while picking up a hardware item and seeing the perfect storage medium for your clothes or crafts at a bargain price, but even better, having someone there to explain the whole thing to you. The folks at the booth were really helpful.

Below I take on a contrived web scraping (it's crude) script in my domain (metals mining) and create a Hamilton workflow from it.

Pictured below is the Hamilton flow in the graphviz output format the project uses for flowcharts (graphviz has been around for decades - an oldie but goodie as it were).

I start with a csv file that has some really basic data on three big American metal mines (I did have to research the Wikipedia addresses - for instance, I originally looked for the Goldstrike Mine under the name "Post-Betze." It goes by several different names and encompasses several mines - more on that anon):

mine,state,commodity,wikipedia page,colloquial association
Red Dog,Alaska,zinc,https://en.wikipedia.org/wiki/Red_Dog_mine,Teck
Goldstrike,Nevada,gold,https://en.wikipedia.org/wiki/Goldstrike_mine,Nevada Gold Mines
Bingham Canyon,Utah,copper,https://en.wikipedia.org/wiki/Bingham_Canyon_Mine,Kennecott

Basically, I am going to attempt to scrape Wikipedia for information on who owns the three mines. Then I will try to use heuristics to gather information on what I think I know about them and gauge how up to date the Wikipedia information is.

Hamilton uses a system whereby you name your functions in a noun-like fashion ("def stuff()" instead of "def getstuff()") and feed those names as variables to the other functions in the workflow as parameters. This is what allows the tool to check your workflow for inconsistencies (types, for instance) and build the graphviz chart shown above.
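
To make the naming idea concrete, here is a small pure-Python sketch of wiring functions together by parameter name. It is not Hamilton's implementation and does not use the Hamilton API; the node functions and the resolve helper are invented for illustration only.

```python
import inspect


# Hypothetical node functions, named as nouns. Each parameter name refers
# either to another node or to a user-supplied input.
def raw_numbers(datafile: str) -> list:
    return [len(word) for word in datafile.split("_")]


def total(raw_numbers: list) -> int:
    return sum(raw_numbers)


def report(total: int, raw_numbers: list) -> str:
    return f"{len(raw_numbers)} values, total {total}"


NODES = {f.__name__: f for f in (raw_numbers, total, report)}


def resolve(name: str, inputs: dict, cache: dict):
    """Compute a node by first resolving each of its parameters by name."""
    if name in inputs:
        return inputs[name]
    if name not in cache:
        func = NODES[name]
        params = inspect.signature(func).parameters
        args = {p: resolve(p, inputs, cache) for p in params}
        cache[name] = func(**args)
    return cache[name]


result = resolve("report", {"datafile": "red_dog_mine"}, {})
print(result)  # 3 values, total 10
```

Because the dependencies are spelled out in the signatures, a tool can inspect them before running anything, which is what makes checks and flowcharts possible.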

You can use separate modules with functions and import them. I've done some of this on the bigger workflows I work with. Your Hamilton functions then end up being little one liners that call the bigger functions in the modules. This is necessary if you have functions you use repeatedly in your workflow that take different values at different stages. For this toy project, I've kept the whole thing self contained in one module toyscriptiii.py (yes, the iii in the filename represents my multiple failed attempts at web scraping and text processing - it's harder than it looks).

Below is the Hamilton main file run.py (I believe the "run.py" name is convention.) I have done my best to preserve the dictionary return values as "faux immutable" through use of the copy module in each function. This helps me in debugging and examining output, much of which can be done from the run.py file (all the return values are stored in a dictionary). I've worked with a dataset with about 600,000 rows that had about 10 nodes. My computer has 32GB of RAM (Windows 11); it handled memory fine (less than half). For really big data, keeping all these dictionaries in memory might be a problem.

# python 3.12
"""Hamilton demo."""
import sys
import pprint
from hamilton import driver
import toyscriptiii as ts
dr = driver.Builder().with_modules(ts).build()
dr.display_all_functions("ts.png", deduplicate_inputs=True, keep_dot=True, orient='BR')
results = dr.execute(['parsed_data',
                      'data_with_wikipedia',
                      'data_with_company',
                      'info_output',
                      'commodity_word_counts',
                      'colloquial_company_word_counts',
                      'info_dict_merged',
                      'wikipedia_report'],
                     inputs={'datafile':'data.csv'})
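
The "faux immutable" approach mentioned above can be sketched in isolation. In this minimal example (the function names and data are made up), the downstream function deep-copies its input before modifying it, so the upstream result can still be inspected unchanged afterwards.

```python
import copy


def parsed(rows: list) -> dict:
    """Build a nested dictionary keyed on mine name."""
    return {name: {"state": state} for name, state in rows}


def with_flag(parsed_rows: dict) -> dict:
    # Work on a deep copy so the upstream node's result stays untouched.
    retval = copy.deepcopy(parsed_rows)
    for name in retval:
        retval[name]["checked"] = True
    return retval


upstream = parsed([("Red Dog", "Alaska"), ("Goldstrike", "Nevada")])
downstream = with_flag(upstream)

print(upstream["Red Dog"])    # {'state': 'Alaska'}
print(downstream["Red Dog"])  # {'state': 'Alaska', 'checked': True}
```

A shallow copy would not be enough here, because the inner dictionaries would still be shared between the two results.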

The main toy module with functions configured for the Hamilton graph:

# python 3.12
"""Toy script.

Takes some input from a csv file on big American
mines and looks at Wikipedia text for some extra
context.
"""
import copy
import pprint
import sys
from urllib import request
import re

from bs4 import BeautifulSoup


def parsed_data(datafile:str) -> dict:
    """
    Get csv data into a dictionary keyed on mine name.
    """
    retval = {}
    with open(datafile, 'r') as f:
        headers = [x.strip() for x in next(f).split(',')]
        for linex in f:
            vals = [x.strip() for x in linex.split(',')]
            retval[vals[0]] = {key:val for key, val in zip(headers, vals)}
    pprint.pprint(retval)
    return retval


def data_with_wikipedia(parsed_data:dict) -> dict:
    """
    Connect to wikipedia sites and fill in
    raw html data.

    Return dictionary.
    """
    retval = copy.deepcopy(parsed_data)
    for minex in retval:
        obj = request.urlopen(retval[minex]['wikipedia page'])
        html = obj.read()
        soup = BeautifulSoup(html, 'html.parser')
        print(soup.title)
        # Text from html and strip out newlines.
        newstring = soup.get_text().replace('\n', '')
        retval[minex]['wikipediatext'] = newstring
    return retval


def data_with_company(data_with_wikipedia:dict) -> dict:
    """
    Fetches company ownership for mine out of
    Wikipedia text dump.

    Returns a new dictionary with the company name
    without the big wikipedia text dump.
    """
    # Wikipedia setup for mine company name.
    COMPANYPAT = r'[a-z]Company'
    # Lower case followed by upper case heuristic.
    ENDCOMPANYPAT = '[a-z][A-Z]'
    retval = copy.deepcopy(data_with_wikipedia)
    companypat = re.compile(COMPANYPAT)
    endcompanypat = re.compile(ENDCOMPANYPAT)
    for minex in retval:
        print(minex)
        match = re.search(companypat, retval[minex]['wikipediatext'])
        if match:
            print('Company match span = ', match.span())
            companyidx = match.span()[1]
            match2 = re.search(endcompanypat, retval[minex]['wikipediatext'][companyidx:])
            print('End Company match span = ', match2.span())
            retval[minex]['company'] = retval[minex]['wikipediatext'][companyidx:companyidx + match2.span()[0] + 1]
        # Get rid of big text dump in return value.
        retval[minex].pop('wikipediatext')
    return retval


def info_output(data_with_company:dict) -> str:
    """
    Prints some output text to a file for each
    mine in the data_with_company dictionary.

    Returns string filename of output.
    """
    INFOLINEFMT = 'The {mine:s} mine is a big {commodity:s} mine in the State of {state:s} in the US.'
    COMPANYLINEFMT = '\n    {company:s} owns the mine.\n\n'
    retval = 'mine_info.txt'
    with open(retval, 'w') as f:
        for minex in data_with_company:
            print(INFOLINEFMT.format(**data_with_company[minex]), file=f)
            print(COMPANYLINEFMT.format(**data_with_company[minex]), file=f)
    return retval


def commodity_word_counts(data_with_wikipedia:dict, data_with_company:dict) -> dict:
    """
    Return dictionary keyed on mine with counts of
    commodity (e.g., zinc etc.) mentions on Wikipedia
    page (excluding ones in the company name).
    """
    retval = {}
    # This will probably miss some occurrences at mashed together
    # word boundaries. It is a rough estimate.
    # '\b[Gg]old\b'
    commoditypatfmt = r'\b[{0:s}{1:s}]{2:s}\b'
    for minex in data_with_wikipedia:
        print(minex)
        commodityuc = data_with_wikipedia[minex]['commodity'][0].upper()
        commoditypat = commoditypatfmt.format(commodityuc,
                                              data_with_wikipedia[minex]['commodity'][0],
                                              data_with_wikipedia[minex]['commodity'][1:])
        print(commoditypat)
        commoditymatches = re.findall(commoditypat, data_with_wikipedia[minex]['wikipediatext'])
        # pprint.pprint(commoditymatches)
        nummatchesraw = len(commoditymatches)
        print('Initial length of commoditymatches is {0:d}.'.format(nummatchesraw))
        companymatches = re.findall(data_with_company[minex]['company'],
                                    data_with_wikipedia[minex]['wikipediatext'])
        numcompanymatches = len(companymatches)
        print('Length of companymatches is {0:d}.'.format(numcompanymatches))
        # Is the commodity name part of the company name?
        print('commoditypat = ', commoditypat)
        print(data_with_company[minex]['company'])
        commoditymatchcompany = re.search(commoditypat, data_with_company[minex]['company'])
        if commoditymatchcompany:
            print('commoditymatchcompany.span() = ', commoditymatchcompany.span())
            nummatchesfinal = nummatchesraw - numcompanymatches
            retval[minex] = nummatchesfinal
        else:
            retval[minex] = nummatchesraw
    return retval


def colloquial_company_word_counts(data_with_wikipedia:dict) -> dict:
    """
    Find the number of times the company you associate with
    the property/mine (very subjective) is within the
    text of the mine's wikipedia article.
    """
    retval = {}
    for minex in data_with_wikipedia:
        colloquial_pat = data_with_wikipedia[minex]['colloquial association']
        print(minex)
        nummatches = len(re.findall(colloquial_pat, data_with_wikipedia[minex]['wikipediatext']))
        print('{0:d} matches for colloquial association {1:s}.'.format(nummatches, colloquial_pat))
        retval[minex] = nummatches
    return retval


def info_dict_merged(data_with_company:dict,
                     commodity_word_counts:dict,
                     colloquial_company_word_counts:dict) -> dict:
    """
    Get a dictionary with all the collected information
    in it minus the big Wikipedia text dump.
    """
    retval = copy.deepcopy(data_with_company)
    for minex in retval:
        retval[minex]['colloquial association count'] = colloquial_company_word_counts[minex]
        retval[minex]['commodity word count'] = commodity_word_counts[minex]
    return retval


def wikipedia_report(info_dict_merged:dict) -> str:
    """
    Writes out Wikipedia information (word counts)
    to file in prose; returns string filename.
    """
    retval = 'wikipedia_info.txt'
    colloqfmt = 'The {0:s} mine has {1:d} occurrences of colloquial association {2:s} in its Wikipedia article text.\n'
    commodfmt = 'The {0:s} mine has {1:d} occurrences of commodity name {2:s} in its Wikipedia article text.\n\n'
    with open(retval, 'w') as f:
        for minex in info_dict_merged:
            print(colloqfmt.format(info_dict_merged[minex]['mine'],
                                   info_dict_merged[minex]['colloquial association count'],
                                   info_dict_merged[minex]['colloquial association']), file=f)
            print(commodfmt.format(info_dict_merged[minex]['mine'],
                                   info_dict_merged[minex]['commodity word count'],
                                   info_dict_merged[minex]['commodity']), file=f)
    return retval

My REGEX abilities are somewhere between "I've heard the term REGEX and know regular expressions exist" and bracketed characters in each slot brute force. It worked for this toy example. Each Wikipedia page features the word "Company" followed by the name of the owning corporate entity.
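
Here is the company-name heuristic in isolation, as a minimal sketch. The sample string is invented to imitate text with the newlines stripped out (so infobox labels and values run together); it is not copied from a real Wikipedia page. Unlike the toy module, this version also guards against the end pattern not matching.

```python
import re

COMPANYPAT = re.compile(r'[a-z]Company')
# Lower case followed by upper case marks where the next label begins.
ENDCOMPANYPAT = re.compile(r'[a-z][A-Z]')

# Hypothetical mashed-together text, as produced by removing newlines.
text = 'Typeopen-pitCompanyBarrick GoldWebsitebarrick.com'

company = None
match = COMPANYPAT.search(text)
if match:
    start = match.end()
    end_match = ENDCOMPANYPAT.search(text[start:])
    if end_match:
        # Keep everything up to and including the lower case letter.
        company = text[start:start + end_match.start() + 1]

print(company)  # Barrick Gold
```

The heuristic breaks on company names that contain a lower case letter directly followed by an upper case one, which is one reason it only suits a toy example.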

Here are the two text outputs the script produces from the information provided (Wikipedia articles from July, 2024):

The Red Dog mine is a big zinc mine in the State of Alaska in the US.
    NANA Regional Corporation owns the mine.

The Goldstrike mine is a big gold mine in the State of Nevada in the US.
    Barrick Gold owns the mine.

The Bingham Canyon mine is a big copper mine in the State of Utah in the US.
    Rio Tinto Group owns the mine.

The Red Dog mine has 21 occurrences of colloquial association Teck in its Wikipedia article text.
The Red Dog mine has 29 occurrences of commodity name zinc in its Wikipedia article text.

The Goldstrike mine has 0 occurrences of colloquial association Nevada Gold Mines in its Wikipedia article text.
The Goldstrike mine has 16 occurrences of commodity name gold in its Wikipedia article text.

The Bingham Canyon mine has 49 occurrences of colloquial association Kennecott in its Wikipedia article text.
The Bingham Canyon mine has 84 occurrences of commodity name copper in its Wikipedia article text.

Company names are relatively straightforward, although mining company and properties acquisitions and mergers being what they are, it can get complicated. I unwittingly chose three properties that Wikipedia reports as having one owner. Other big mines like Morenci, Arizona (copper) and Cortez, Nevada (gold) show more than one owner; that case is for another programming day. The Goldstrike information might be out of date - no mention of Nevada Gold Mines or Newmont (one mention, but in a different context). The Cortez Wikipedia page is more current, although it still doesn't mention Nevada Gold Mines.

The inclusion of colloquial association in the input csv file was an afterthought based on a lot of the Wikipedia information not being completely in line with what I thought I knew. Teck is the operator of the Red Dog Mine in Alaska. That name does get mentioned frequently in the Wikipedia article.

Enough mining stuff - it is a programming blog after all. Next time (not written yet) I hope to cover dressing up and highlighting the graphviz output a bit.

Thank you for stopping by.

Manual action needed to resolve boot failure for Fedora Atomic Desktops and Fedora IoT

Since the 39.20240617.0 and 40.20240617.0 updates for Atomic Desktops and the 40.20240617.0 update for IoT, systems with Secure Boot enabled may fail to boot if they have been installed before Fedora Linux 40. You might see the following error:

error: ../../grub-core/kern/efi/sb.c:182:bad shim signature.
error: ../../grub-core/loader/i386/efi/linux.c:258:you need to load the kernel first.
Press any key to continue...

Note: You can also read this post on the Fedora Magazine.


In order to resolve this issue, you must first boot into the previous version of your system. It should still be functional. In order to do this, reboot your system and select the previous boot entry in the selection menu displayed on boot. Its name should be something like:

Fedora Linux 39.20240610.0 (Silverblue) (ostree:1)

Once you have logged in, search for the terminal application for your desktop and open a new terminal window. On Fedora IoT, log in via SSH or on the console. Make sure that you are not running in a toolbox for all the commands listed on this page.

If you are running a Fedora Atomic Desktop based on Fedora 39 and have not yet updated to Fedora 40, you first need to update to the latest working Fedora 39 version with these commands:

$ sudo rpm-ostree cleanup --pending
$ sudo rpm-ostree deploy 39.20240616.0

If you are running Fedora IoT, then first update to the latest working version with these commands:

$ sudo rpm-ostree cleanup --pending
$ sudo rpm-ostree deploy 40.20240614.0

Then reboot your system.

Once you are logged in again on the latest working version, proceed with the following commands:

$ sudo -i
$ cp -rp /usr/lib/ostree-boot/efi/EFI /boot/efi
$ sync

Once completed, reboot your system. You should now be able to update again, as normal, using the graphical interface or the command line:

$ sudo rpm-ostree update

Why did this happen?

On Fedora Atomic Desktops and Fedora IoT systems, the components that are part of the boot chain (Shim, GRUB) are not (yet) automatically updated alongside the rest of the system. Thus, if you have installed a Fedora Atomic Desktop or a Fedora IoT system before Fedora 40, it uses old versions of the Shim and bootloader binaries to boot your system.

When Secure Boot is enabled, the EFI firmware loads Shim first. Shim is signed by the Microsoft Third Party Certificate Authority so that it can be verified on most hardware out of the box. The Shim binary includes the Fedora certificates used to verify binaries signed by Fedora. Then Shim loads GRUB, which in turn loads the Linux kernel. Both are signed by Fedora.

Until recently, the kernel binaries were signed two times, with an older key and a newer one. With the 6.9 kernel update, the kernel is no longer signed with the old key. If GRUB or Shim is old enough and does not know about the new key, the signature verification fails.
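
The failure mode can be pictured with a deliberately simplified toy model. This is not how Shim or GRUB verify signatures internally; the key labels and the verifies helper are invented, and the only idea shown is that a binary is accepted when at least one of its signatures comes from a key the bootloader knows.

```python
def verifies(trusted_keys: set, signatures: set) -> bool:
    """Accept a binary if at least one signature comes from a trusted key."""
    return bool(trusted_keys & signatures)


OLD_KEY, NEW_KEY = "old-key", "new-key"  # hypothetical labels, not real key names

# Bootloader that was never updated: it only knows the old key.
old_bootloader = {OLD_KEY}
# Updated bootloader: it knows the new key (the old one does not matter here).
new_bootloader = {NEW_KEY}

dual_signed_kernel = {OLD_KEY, NEW_KEY}  # kernels before the 6.9 update
new_only_kernel = {NEW_KEY}              # kernels from the 6.9 update on

print(verifies(old_bootloader, dual_signed_kernel))  # True
print(verifies(old_bootloader, new_only_kernel))     # False
print(verifies(new_bootloader, new_only_kernel))     # True
```

In this picture, dual signing is what kept old bootloaders working, and dropping the old signature is what exposed the missing bootloader update.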

See the initial report in the Fedora Silverblue issue tracker.

What are we doing to prevent it from happening again?

We have known for a while that not updating the bootloader was not a satisfying situation. We have been working on enabling bootupd for Fedora Atomic Desktops and Fedora IoT. bootupd is a small application that is responsible only for bootloader updates. While initially planned for Fedora Linux 38 (!), we had to delay enabling it due to various issues and missing functionality in bootupd itself and changes needed in Anaconda.

We are hoping to enable bootupd in Fedora Linux 41, hopefully by default, which should finally resolve this situation. See the Enable bootupd for Fedora Atomic Desktops and Fedora IoT Fedora Change page.

Note that the root issue also impacts Fedora CoreOS but steps have been put in place to force a bootloader update before the 6.9 kernel update. See the tracking issue for Fedora CoreOS.

Eli Bendersky: You don't need virtualenv in Go

Programmers that come to Go from Python often wonder "do I need something like virtualenv here?"

The short answer is NO; this post will provide some additional details.

While virtualenv in Python is useful in many situations, I think it'd be fair to divide them into two broad scenarios: for execution and for development. Let's see what Go offers for each of these scenarios.


There are multiple, mutually-incompatible versions of Python out in the wild. There are even multiple versions of the packaging tools (like pip). On top of this, different programs need different packages, often themselves with mutually-incompatible versions.

Python code typically expects to be installed, and expects to find packages it depends on installed in a central location. This can be an issue for systems where we don't have the permission to install packages/code to a central location.

All of this makes distributing Python applications quite tricky. It's common to use bundling tools like PyInstaller, but virtualenv is also a popular option [1].

Go is a statically compiled language, so this is a non-problem! Binaries are easy to build and distribute; the binary is a native executable for a given platform (just like a native executable built from C or C++ source), and has no dependencies on compiler or package versions. While you can install Go programs into a central location, you by no means have to do this. In fact, you typically don't have to install Go programs at all. Just invoke the binary.

It's also worth mentioning that Go has great cross-compilation support, making it easy to create binaries for multiple OSes from a single development machine.


Consider the following situation: you're developing a package, which depends on N other packages at specific versions; e.g. you need package foo at version 1.2 or above. Your system may have an older version of foo installed - 0.9; you try to upgrade it to 1.2 and some other program breaks. Now, this all sounds very manageable for package foo - how hard can it be to upgrade the uses of this simple package?

Reality is more difficult. foo could be Django; your code depends on a new version, while some other critical systems depend on an old version. Good luck fixing this conundrum. In Python, virtualenv is a critical tool to make such situations manageable; newer tools like pipenv wrap virtualenv with more usability patterns.

How about Go?

If you're using Go modules, this situation is very easy to handle. In a way, a Go module serves as its own virtualenv. Your go.mod file specifies the exact versions of dependency packages needed for your development, and these versions don't mix up with packages you need to develop some other project (which has its own go.mod).

Moreover, Go module directives like replace make it easy to short-circuit dependencies to try local patches. While debugging your project you find that package foo has a bug that may be affecting you? Want to try a quick fix and see if you're right? No problem, just clone foo locally, apply a fix, and use a replace to use this locally patched foo. See this post for a few ways to automate this process.

What about different Go versions? Suppose you have to investigate a user report complaining that your code doesn't work with an older Go version. Or maybe you're curious to see how the upcoming beta release of a Go version will affect you. Go makes it easy to install different versions locally. These different versions have their own standard libraries that won't interfere with each other.

[1] Fun fact: this blog uses the Pelican static site generator. To regenerate the site I run Pelican in a virtualenv because I need a specific version of Pelican with some personal patches.
Glyph Lefkowitz: Against Innovation Tokens

Planet Python - Thu, 2024-07-04 15:54

Updated 2024-07-04: After some discussion, added an epilogue going into more detail about the value of the distinction between the two types of tokens.

In 2015, Dan McKinley laid out a model for software teams selecting technologies. He proposed that each team have a limited supply of “innovation tokens”, and, when selecting a technology, they can choose boring ones for free but “innovative” ones cost a token. This implies that we all know which technologies are innovative, and we assume that they are inherently costly, so we want to restrict their supply.

That model has become popular to the point that it is now part of the vernacular. In many discussions, it is accepted as received wisdom, or even common sense.

In this post I aim to show you that despite being superficially helpful, this model is wrong, and in fact, may be counterproductive. I believe it is an attractive nuisance in computer programming discourse.

In fairness to Mr. McKinley, the model he described in this post is:

  1. nearly a decade old at this point, and
  2. much more nuanced in its description of the problem with “innovation” than the subsequent memetic mutation of the concept.

While I will be referencing McKinley’s post, and I do take some issue with it, I am reacting more strongly to the life of its own that this idea has taken on once it escaped its original context. There are a zillion worse posts rehashing this concept, on blogs and LinkedIn, but I won’t be linking to them because the goal is not to call anybody out.

To some extent I am re-raising McKinley’s own caveats and reinforcing them. So I may be arguing with a strawman, but it’s a strawman I have seen deployed with some regularity over the years.

To reduce it to its core, this strawman is “don’t use new or interesting technology, and if you have to, only use a little bit”.

Within the broader culture of programmers, an “innovation token” has become a shorthand to smear any technology perceived — almost always based on vibes, not data — as risky, and the adoption of novel approaches as pretentious and unserious. Speaking of programmer culture though, I do have to acknowledge there is also a pervasive tendency for us to get distracted by novelty and waste time on puzzles rather than problem-solving, so I understand where the reactionary attitude represented by the concept of an innovation token comes from.

But it is reactionary.

At its worst, it borders on anti-intellectualism. I have heard it used on more than one occasion as a thought-terminating cliche to discard a potentially promising new tool. But before I get into that, let me try to give a sympathetic summary of the idea, because the model is not entirely bad.

It has been popular for a long time because it does work okay as a heuristic.

The real problem that McKinley is describing is operational overhead. When programmers make a technology selection, we are often considering how difficult it will make the programming. Innovative technology selections are, by definition, less mature.

That lack of maturity — particularly in the open source world — often means that the project is in a part of its lifecycle where it is concerned with development affordances more than operational ones. Therefore, the stereotypical innovative project, even one which might legitimately be a big improvement to development velocity, will create more operational overhead. That operational overhead creates a hidden cost for the operations team later on.

This is a point I emphatically agree with. When selecting a technology, you should consider its ease of operation more than its ease of development. If your team is successful, they will be operating and maintaining it far longer than they are initially integrating and deploying it.

Furthermore, some operational overhead is inevitable. You will need to hire people to mitigate it. More popular, more mature projects will have a bigger talent pool to hire from, so your training costs will be lower, and those training costs are part of your operational cost too.

Rationing innovation tokens therefore can work as a reasonable heuristic, or proxy metric, for avoiding a mess of complex operational problems associated with dependencies that are expensive to operate and hard to hire for.

There are some minor issues I want to point out before getting to the overarching one.

  1. “has a lot of operational overhead” is a stereotype of a new technology, not an inherent property. If you want to reject a technology on the basis of being too high-overhead, at least look into its actual overhead a little bit. Sometimes, especially in 2024 as opposed to 2015, the point of a new, shiny piece of tech is to address operational issues that the more boring, older one had.
  2. “hard to learn” is also a stereotype; if “newer” meant “harder” then we would all be using troff rather than Google Docs. Actually ask if the innovativeness is making things harder or easier; don’t assume.
  3. You are going to have to train people on your stack no matter what. If a technology is adding a lot of value, it’s absolutely worth hiring for general ability and making a plan to teach people about it. You are going to have to do this with the core technology of your product anyway.

As I said, though, these are minor issues. The big problem with modeling operational overhead as an “innovation token” is that an even bigger concern than selecting an innovative tool is selecting too many tools.

The impulse to select more tools and make your operational environment more complex can be made worse by trying to avoid innovative tools. The important thing is not “less innovation”, but more consistency. To illustrate this, let’s do a simple thought experiment.

Let’s say you’re going to make a web app. There’s a tool in Haskell that you really like for a critical part of your app’s problem domain. You don’t want to spend more than one innovation token though, and everything in Haskell is inherently innovative, so you write a little service that just does that one part and you write the rest of your app in Ruby, calling into that service whenever you need to use that thing. This will appropriately restrict your “innovation token” expenditure.

Does doing this actually reduce your operational overhead, though?

First, you will have to find a team that likes both Ruby and Haskell and sees no problem using both. If you are not familiar with the cultural proclivities of these languages, suffice it to say that this is unlikely. Hiring for Haskell programmers is hard because there are fewer of them than Ruby programmers, but hiring for polyglot Haskell/Ruby programmers who are happy to do either is going to be really hard.

Since you will need to find different people to write in the different languages, even in the best case scenario, you will have two teams: the Haskell team and the Ruby team. Even if you are incredibly disciplined about inter-service responsibilities, there will be some areas where duplication of code is necessary across those services. Disagreements will arise and every one of these disagreements will be a source of social friction and software defects.

Then, you need to set up separate CI pipelines for each language, separate deployment systems, and of course, separate databases. Right away you are effectively doubling your workload.

In the worse, and unfortunately more likely scenario, there will be enormous infighting between these two teams. Operational incidents will be more difficult to manage because rather than learning the Haskell tools for operational visibility and disseminating that institutional knowledge amongst your team, you will be half-learning the lessons from two separate ecosystems and attempting to integrate them. Every on-call engineer will be frantically trying to learn a language ecosystem they don’t use regularly, or you will double the size of your on-call rotation. The Ruby team may start to resent the Haskell team for getting to exclusively work on the fun parts of the problem while they are doing things that look more like rote grunt work.

A better way to think about the problem of managing operational overhead is, rather than “innovation tokens”, consider “boundary tokens”.

That is to say, rather than evaluating the general sense of weird vibes from your architecture, consider the consistency of that architecture. If you’re using Haskell, use Haskell. You should be all-in on Haskell web frameworks, Haskell ORMs, Haskell OAuth integrations, and so on.1 To cross the boundary out of Haskell, you need to spend a boundary token, and you shouldn’t have many of those.

I submit that the increased operational overhead that you might experience with an all-Haskell tool selection will be dwarfed by the savings that you get by having a team that is aligned with each other, that can communicate easily, and that can share programs with each other without needing to first strategize about a channel for the two pieces of work to establish bidirectional communication. The ability to simply call a function when you need to call it is very powerful, and extremely underrated.

Consistency ought to apply at each layer of the stack; it is perhaps most obvious with programming languages, but it is true of web frameworks, test frameworks, cryptographic libraries, you name it. Make a choice and stick with it, because every deviation from that choice carries a significant cost. Moreover this cost is a hidden cost, in the same way that the operational downsides of an “innovative” tool that hasn’t seen much production use might be hidden.

Discarding a more standard tool in favor of a tool more consistent with your architecture extends even to fairly uncontroversial, ubiquitous tools. For example, one of my favorite architectural patterns is to forgo the use of the venerable, and very boring, Cron, the UNIX task-scheduler. Instead of Cron, it can make a lot of sense to have hand-written bespoke code for scheduling tasks within the application. Within the “innovation tokens” model, this is a very silly waste of a token!

Just use Cron! Everybody knows how to use Cron!

Except… does everybody know how to use Cron? Here are some questions to consider, if you’re about to roll out a big dependency on Cron:

  1. How do you write a unit test for a scheduling rule with Cron?
  2. Can you even remember how to write a cron rule that runs at the times you want?
  3. How do you inject secrets and configuration variables into the distinct and somewhat idiosyncratic runtime execution environment of Cron?
  4. How do you know that you did that variable-injection properly until the job actually runs, possibly in the middle of the night?
  5. How do you deploy your monitoring and error-logging frameworks to observe your scripts run under Cron?
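
To make the first question concrete, here is a minimal sketch of what a hand-written scheduling rule can look like when it lives inside the application. The function name `next_run` and the daily 03:30 schedule are hypothetical, chosen only for illustration; the point is that the rule is a plain function, so testing it is an ordinary function call.

```python
from datetime import datetime, time, timedelta


def next_run(after: datetime, at: time = time(3, 30)) -> datetime:
    """Return the first daily occurrence of `at` strictly later than `after`."""
    candidate = datetime.combine(after.date(), at)
    if candidate <= after:
        # Today's slot has already passed (or is right now), so use tomorrow's.
        candidate += timedelta(days=1)
    return candidate


# The rule can be checked without waiting for the middle of the night.
assert next_run(datetime(2024, 7, 5, 1, 0)) == datetime(2024, 7, 5, 3, 30)
assert next_run(datetime(2024, 7, 5, 3, 30)) == datetime(2024, 7, 6, 3, 30)
```

Because the rule runs in the same process as the rest of the application, it also sees the same configuration, secrets, and error-logging setup, which is what questions 3 to 5 are getting at.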

Granted, this architectural choice is less controversial than it once was. Cron used to be ambiently available on whatever servers you happened to be running. As container-based deployments have increased in popularity, this sense that Cron is just kinda around has gone away, and if you need to run a container that just runs Cron, much of the jankiness of its deployment is a lot more immediately visible.

There is friction at the boundary between things. That friction is a cost, but sometimes it’s a cost worth paying.

If there’s a really good library in Haskell and a really good library in Ruby and you really do want to use them both, maybe it makes sense to actually have multiple services. As your team gets larger and more mature, the need to bring in more tools, and the ability to handle the associated overhead, will only increase over time. But the place that the cost comes in the most is at the boundary between tools, not in the operational deficiencies of any one particular tool.

Even in a bog-standard web application with the most boring, least innovative tech stack imaginable (PHP, MySQL, HTML, CSS, JavaScript), many of the annoying points of friction are where different, inconsistent technologies make contact. If you are a programmer working on the web yourself, consider your own impression of the level of controversy of these technologies:

Consider that there are far more complex technical tools in terms of required skills to implement them, like computer vision or physics simulation, tools which are also pretty widely used, which consistently generate lower levels of controversy. People do have strong feelings about these things as well, of course, and it’s hard to find things to link to that show “this isn’t controversial”, but, like, search your feelings, you know it to be true.

You can see the benefits of the boundary token approach in programming language design. Many of the most influential and best-loved programming languages had an impact not by bundling together lots of tools, but by making everything into one thing:

  • LISP: everything is a list
  • Smalltalk: everything is an object
  • ML: everything is an algebraic data type
  • Forth: everything is a stack

There is a tremendous power in thinking about everything as a single kind of thing, because then you don’t have to juggle lots of different ideas about different kinds of things; you can just think about your problem.

When people complain about programming languages, they’re often complaining about how many different kinds of thing they have to remember in order to use it.

If you keep your boundary-token budget small, and allow your developers to accomplish as much as possible while staying within a solution space delineated by a single, clean cognitive boundary, I promise you can innovate as much as you want and your operational costs will remain manageable.


In subsequent Mastodon discussion of this post with Matt Campbell and Meejah, I realized that I may not have made it entirely clear why I feel the distinction between “boundary” and “innovation” tokens is important. I do say above that the “innovation token” model can be a useful heuristic, so why bother with a new, but slightly different heuristic? Especially since most experienced engineers - indeed, McKinley himself - would budget “innovation” quite similarly to “boundaries”, and might even consider the use of more “innovative” Haskell tools in my hypothetical scenario to not even be an expenditure of innovation tokens at all.

To answer that, I need to highlight the purpose of having heuristics like this in the first place. These are vague, nebulous guidelines, not hard and fast rules. I cannot give you a token calculator to plug your technical decisions into. The purpose of either token heuristic is to facilitate discussions among a team.

With a team of skilled and experienced engineers, the distinction is meaningless. Senior and staff engineers (at least, the ones who deserve their level) will intuit the goals behind “innovation tokens” and inherently consider things like operational overhead anyway. In practice, a high-performing, well-aligned team discussing innovation tokens and one discussing boundary tokens will look functionally indistinguishable.

The distinction starts to be important when you have management pressures, nervous executives, inexperienced engineers, a fresh team without existing consensus about core technology choices, and so on. That is to say, most teams that exist in the messy, perpetually in medias res world of the software industry.

If you are just getting started on a project and you have a bunch of competent but disagreeable engineers, the words “innovation” and “boundaries” function very differently.

If you ask, “is this an innovation” about a particular technical tool, you are asking your interlocutor to pull in a bunch of their skills and experience to subjectively evaluate the relative industry-wide, or maybe company-wide, or maybe team-wide2 newness of the thing being discussed. The discussion of whether it counts as boring or innovative is immediately fraught with a ton of subjective, difficult-to-quantify information about costs of hiring, difficulty of learning, and your impression of the feelings of hundreds or thousands of people outside of your team. And, yes, ultimately you do need to have an estimate of all that stuff, but starting your “is it OK to use this” conversation by simultaneously arguing about all those subjective judgments is setting yourself up for failure.

Instead, if you ask “does this introduce a boundary between two different technologies with different conceptual models”, while that is not a perfectly objective question, it is much easier for your team to answer, with much crisper intermediary factual questions. What are the two technologies? What are the models? How much do they differ? You can just hash out the answers to each one within the team directly, rather than needing to sift through the last few years of Stack Overflow developer surveys to determine relative adoption or popularity of technologies in the world at large.

Restricting your supply of either boundary or innovation tokens is a good idea, but achieving unanimity within your team about what your boundaries are is always going to be easier than deciding what your innovations are.


Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor! I am also available for consulting work if you think your organization could benefit from expertise on topics like “how can we make our architecture more consistent”.

  1. I gave a talk about this once, a very long time ago, where Haskell was Python. 

  2. It’s not clear, that’s a big part of the problem. 

Categories: FLOSS Project Planets

mark.ie: My LocalGov Drupal contributions for week-ending July 5th, 2024

Here's what I've been working on for my LocalGov Drupal contributions this week. Thanks to Big Blue Door for sponsoring the time to work on these.


Python Morsels: Strings in Python

Strings are used to store text-based data.

Table of contents

  1. Strings store text
  2. How are strings used?
  3. String methods in Python
  4. String concatenation
  5. Double quotes vs single quotes
  6. Escape characters
  7. Strings are everywhere in Python

Strings store text

This is a string:

>>> message = "This is text"

Python strings store text:

>>> message
'This is text'

How are strings used?

Strings are often used for …

Read the full article: https://www.pythonmorsels.com/strings-in-python/

PyCharm: Polars vs. pandas: What’s the Difference?

If you’ve been keeping up with the advances in Python dataframes in the past year, you couldn’t help hearing about Polars, the powerful dataframe library designed for working with large datasets.

Unlike other libraries for working with large datasets, such as Spark, Dask, and Ray, Polars is designed to be used on a single machine, prompting a lot of comparisons to pandas. However, Polars differs from pandas in a number of important ways, including how it works with data and what its optimal applications are. In the following article, we’ll explore the technical details that differentiate these two dataframe libraries and have a look at the strengths and limitations of each.

If you’d like to hear more about this from the creator of Polars, Ritchie Vink, you can also see our interview with him below!

Why use Polars over pandas?

In a word: performance. Polars was built from the ground up to be blazingly fast and can do common operations around 5–10 times faster than pandas. In addition, the memory requirement for Polars operations is significantly smaller than for pandas: pandas requires around 5 to 10 times as much RAM as the size of the dataset to carry out operations, compared to the 2 to 4 times needed for Polars.

You can get an idea of how Polars performs compared to other dataframe libraries here. As you can see, Polars is between 10 and 100 times as fast as pandas for common operations and is actually one of the fastest DataFrame libraries overall. Moreover, it can handle larger datasets than pandas can before running into out-of-memory errors.

Why is Polars so fast?

These results are extremely impressive, so you might be wondering: How can Polars get this sort of performance while still running on a single machine? The library was designed with performance in mind from the beginning, and this is achieved through a few different means.

Written in Rust

One of the most well-known facts about Polars is that it is written in Rust, a low-level language that is almost as fast as C and C++. In contrast, pandas is built on top of Python libraries, one of these being NumPy. While NumPy’s core is written in C, it is still hamstrung by inherent problems with the way Python handles certain types in memory, such as strings for categorical data, leading to poor performance when handling these types (see this fantastic blog post from Wes McKinney for more details).

One of the other advantages of using Rust is that it allows for safe concurrency; that is, it is designed to make parallelism as predictable as possible. This means that Polars can safely use all of your machine’s cores for even complex queries involving multiple columns, which led Ritchie Vink to describe Polars’ performance as “embarrassingly parallel”. This gives Polars a massive performance boost over pandas, which only uses one core to carry out operations. Check out this excellent talk by Nico Kreiling from PyCon DE this year, which goes into more detail about how Polars achieves this.

Based on Arrow

Another factor that contributes to Polars’ impressive performance is Apache Arrow, a language-independent memory format. Arrow was actually co-created by Wes McKinney in response to many of the issues he saw with pandas as the size of data exploded. It is also the backend for pandas 2.0, a more performant version of pandas released in March of this year. The Arrow backends of the libraries do differ slightly, however: while pandas 2.0 is built on PyArrow, the Polars team built their own Arrow implementation.

One of the main advantages of building a data library on Arrow is interoperability. Arrow has been designed to standardize the in-memory data format used across libraries, and it is already used by a number of important libraries and databases, as you can see below.

This interoperability speeds up performance as it bypasses the need to convert data into a different format to pass it between different steps of the data pipeline (in other words, it avoids the need to serialize and deserialize the data). It is also more memory-efficient, as two processes can share the same data without needing to make a copy. As serialization/deserialization is estimated to represent 80–90% of the computing costs in data workflows, Arrow’s common data format lends Polars significant performance gains.

Arrow also has built-in support for a wider range of data types than pandas. As pandas is based on NumPy, it is excellent at handling integer and float columns, but struggles with other data types. In contrast, Arrow has sophisticated support for datetime, boolean, binary, and even complex column types, such as those containing lists. In addition, Arrow is able to natively handle missing data, which requires a workaround in NumPy.
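
As a rough illustration of what native missing-data support means, the sketch below stores a column as a values buffer plus a separate validity bitmap, which is the general idea behind Arrow's layout. This is plain Python and not the Arrow API: `NullableIntColumn` is a hypothetical name, and it only mimics the shape of the real packed buffers.

```python
from array import array


class NullableIntColumn:
    """An integer column that tracks missing values in a separate bitmap."""

    def __init__(self, items):
        # Null slots still occupy a position in the values buffer (stored as 0).
        self.values = array("q", (0 if x is None else x for x in items))
        # One bit per slot; a set bit means the value is present.
        self.validity = bytearray((len(self.values) + 7) // 8)
        for i, x in enumerate(items):
            if x is not None:
                self.validity[i // 8] |= 1 << (i % 8)

    def is_valid(self, i):
        return bool(self.validity[i // 8] & (1 << (i % 8)))

    def sum(self):
        # Skip slots whose validity bit is not set.
        return sum(v for i, v in enumerate(self.values) if self.is_valid(i))


col = NullableIntColumn([1, None, 3, 0])
assert [col.is_valid(i) for i in range(4)] == [True, False, True, True]
assert col.sum() == 4
```

The column keeps its integer type, and a real 0 remains distinguishable from a missing value, without needing a sentinel such as NaN.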

Finally, Arrow uses columnar data storage, which means that, regardless of the data type, all columns are stored in a contiguous block of memory. This not only makes parallelism easier, but also makes data retrieval faster.

Query optimization

One of the other cores of Polars’ performance is how it evaluates code. Pandas, by default, uses eager execution, carrying out operations in the order you’ve written them. In contrast, Polars has the ability to do both eager and lazy execution, where a query optimizer will evaluate all of the required operations and map out the most efficient way of executing the code. This can include, among other things, rewriting the execution order of operations or dropping redundant calculations. Take, for example, the following expression to get the mean of column Number1 for each of the categories “A” and “B” in Category.

(
    df
    .groupby(by = "Category").agg(pl.col("Number1").mean())
    .filter(pl.col("Category").is_in(["A", "B"]))
)

If this expression is eagerly executed, the groupby operation will be unnecessarily performed for the whole DataFrame, and then filtered by Category. With lazy execution, the DataFrame can be filtered and groupby performed on only the required data.
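
The effect of that reordering can be illustrated without Polars. The sketch below is plain Python, not the Polars optimizer: it computes the same per-category mean in both orders and counts how many rows the grouping step has to touch. The data and the helper `group_mean` are made up for the example.

```python
from collections import defaultdict

rows = [("A", 1.0), ("B", 2.0), ("C", 3.0), ("A", 3.0), ("C", 5.0), ("D", 7.0)]
wanted = {"A", "B"}


def group_mean(pairs):
    """Group (category, number) pairs; return ({category: mean}, rows_seen)."""
    groups = defaultdict(list)
    seen = 0
    for category, number in pairs:
        groups[category].append(number)
        seen += 1
    return {c: sum(v) / len(v) for c, v in groups.items()}, seen


# Order as written: group everything, then discard the unwanted categories.
all_means, eager_seen = group_mean(rows)
eager = {c: m for c, m in all_means.items() if c in wanted}

# Reordered: filter first, so the grouping step only sees the rows it needs.
lazy, lazy_seen = group_mean(r for r in rows if r[0] in wanted)

assert eager == lazy == {"A": 2.0, "B": 2.0}
assert (eager_seen, lazy_seen) == (6, 3)
```

Both orders give the same answer, but the second one groups half as many rows here; on a large DataFrame with a selective filter the difference is correspondingly larger.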

Expressive API

Finally, Polars has an extremely expressive API, meaning that basically any operation you want to perform can be expressed as a Polars method. In contrast, more complex operations in pandas often need to be passed to the apply method as a lambda expression. The problem with the apply method is that it loops over the rows of the DataFrame, sequentially executing the operation on each one. Being able to use built-in methods allows you to work on a columnar level and take advantage of another form of parallelism called SIMD.

When should you stick with pandas?

All of this sounds so amazing that you’re probably wondering why you would even bother with pandas anymore. Not so fast! While Polars is superb for doing extremely efficient data transformations, it is currently not the optimal choice for data exploration or for use as part of machine learning pipelines. These are areas where pandas continues to shine.

One of the reasons for this is that while Polars has great interoperability with other packages using Arrow, it is not yet compatible with most of the Python data visualization packages nor machine learning libraries such as scikit-learn and PyTorch. The only exception is Plotly, which allows you to create charts directly from Polars DataFrames.

A solution that is being discussed is using the Python dataframe interchange protocol in these packages to allow them to support a range of dataframe libraries, which would mean that data science and machine learning workflows would no longer be bottlenecked by pandas. However, this is a relatively new idea, and it will take time for these projects to implement.

Tooling for Polars and pandas

After all of this, I am sure you are eager to try Polars yourself! PyCharm Professional for Data Science offers excellent tooling for working with both pandas and Polars in Jupyter notebooks. In particular, pandas and Polars DataFrames are displayed with interactive functionality, which makes exploring your data much quicker and more comfortable.

Some of my favorite features include the ability to scroll through all rows and columns of the DataFrame without truncation, get aggregations of DataFrame values in one click, and export the DataFrame in a huge range of formats (including Markdown!).

If you’re not yet using PyCharm, you can try it with a 30-day trial by following the link below.

Start your PyCharm Pro free trial


Explore how AI is revolutionizing Drupal development in Jay Callicott's latest article, "The AI-Driven Developer: From Assistance to Autonomy in Drupal Development." This insightful piece delves into the evolution of AI tools from mere coding assistants to autonomous agents, transforming the roles of developers. Discover the potential of AI-driven modules like DrupalAI and the importance of crafting effective AI prompts to enhance productivity and innovation in web development. Don’t miss out on this glimpse into the future of AI and Drupal!

1xINTERNET blog: 1xINTERNET at Drupal Developer Days Burgas 2024

Planet Drupal - Thu, 2024-07-04 08:00

At 1xINTERNET, we proudly sponsored and spread our knowledge at Drupal Developer Days in Burgas 2024. Discover the insights we shared this year!

May and June in KDE PIM

Planet KDE - Thu, 2024-07-04 05:00

Here's our bi-monthly update from KDE's personal information management applications team. This report covers progress made in the months of May and June 2024.

Since the last report 38 people have contributed over 1500 changes to KDE PIM code base.

PIM Sprint

Let's start with the biggest event of the last two months: the PIM sprint!

The team met in Toulouse for a weekend of discussions, hacking and French pastries. You can read reports from Kevin, Carl, Dan and Volker on their blogs to get all the nitty gritty.

In this report, we will cover the biggest topics that were discussed and worked on during the sprint.


We have decided to plan and track our work in milestones. Milestones should represent a concrete goal with clear definitions of what we understand as done, and be achievable within a reasonable time frame. Each milestone is then split into smaller bite-sized tasks that can be worked on independently.

This will help us prioritize important work, make our progress more visible and, most importantly, make it easier for people to get excited about what we are working on. New contributors will also be able to pick up a well-defined task and start contributing to PIM.

You can see the milestones on our Gitlab board - if anything there catches your eye and you would like to help, reach out to us on the #kontact:kde.org Matrix channel!

This report, as well as future ones will try to focus on the current milestones and their progress, hopefully making them more exciting to read :)

Retiring KJots and KNotes

We have decided to retire the KJots and KNotes applications. These applications have not seen any support or development in many years and are not in a state that we feel comfortable shipping to our users. With the introduction of Marknote, KDE can now offer a modern, well-maintained note-taking application that we can recommend users to migrate to. The latest release of Marknote has gained support for importing notes from KJots and KNotes, so no notes will be lost.

Polished Tag Support

Tags were introduced into KDE PIM many, many years ago, but they have never reached their full potential. We have decided to change that and make tags a first-class citizen in our applications. The first step is making sure that tags are actually usable, so we started by implementing automatic extraction of tags from events and todos and syncing them into local iCal calendars and remote DAV calendars. Thanks to this, you can now sync tags between KOrganizer and NextCloud, for example.

Moving Protocol Implementations to KDE Frameworks

We have libraries in KDE PIM that implement various standards and protocols. By moving them to KDE Frameworks we make them independent from KDE PIM and thus available to anyone who wants to use them. In the past we have moved KCalendarCore (iCal support library) and KContacts (vCard support library) to Frameworks. We are now working on moving KMime (email/RFC822 support library) and KIMAP (IMAP protocol implementations) to Frameworks as well.

This is helping to clean up KMime. The KMime APIs are now const-correct in many places, to avoid the risk of modifying a message when reading it, and proper CamelCase headers are now generated, as for all the KDE Frameworks. Finally, parsing a MIME file is now up to 10 times faster on typical emails.

Other Improvements and Fixes

Itinerary

Our travel assistant app Itinerary gained support for the public transport routing service Transitous, got a new import staging area and can now create new entries directly from OSM elements. For more details see its own summary blog post.


"Snow flurry" fixed the start of the week math for locales that use Sunday as the first day of the week. They also fixed the navigation of the basic mode for the month view and week view (which are used on mobile).
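
For readers unfamiliar with this kind of bug, here is a minimal sketch of locale-aware start-of-week math. It is not the actual fix, and `start_of_week` is a hypothetical helper; it only shows why the first day of the week has to be a parameter rather than an assumption.

```python
from datetime import date, timedelta


def start_of_week(day: date, first_weekday: int = 0) -> date:
    """Return the first day of the week containing `day`.

    `first_weekday` uses Python's numbering: 0 = Monday ... 6 = Sunday.
    """
    offset = (day.weekday() - first_weekday) % 7
    return day - timedelta(days=offset)


thursday = date(2024, 7, 4)
assert start_of_week(thursday, first_weekday=0) == date(2024, 7, 1)   # Monday
assert start_of_week(thursday, first_weekday=6) == date(2024, 6, 30)  # Sunday
```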

Claudio continued working on the Merkuro Mail application and added a progress bar in the sidebar which appears when a background job is running.

The settings dialogs have been ported to the new KirigamiAddons.ConfigurationView which fixes some issues on mobile.

Get Involved

If you would like to get involved in KDE PIM, check our milestones board and pick a task! And don't forget to join us in the #kontact:kde.org Matrix channel or the kde-pim mailing list!

Arturo Borrero González: Wikimedia Toolforge: migrating Kubernetes from PodSecurityPolicy to Kyverno

Planet Debian - Thu, 2024-07-04 05:00

Christian David, CC BY-SA 4.0, via Wikimedia Commons

This post was originally published in the Wikimedia Tech blog, authored by Arturo Borrero Gonzalez.

Summary: this article shares the experience and learnings of migrating away from Kubernetes PodSecurityPolicy into Kyverno in the Wikimedia Toolforge platform.

Wikimedia Toolforge is a Platform-as-a-Service, built with Kubernetes, and maintained by the Wikimedia Cloud Services team (WMCS). It is completely free and open, and we welcome anyone to use it to build and host tools (bots, webservices, scheduled jobs, etc) in support of Wikimedia projects.

We provide a set of platform-specific services, command line interfaces, and shortcuts to help in the task of setting up webservices, jobs, and stuff like building container images, or using databases. Using these interfaces makes the underlying Kubernetes system pretty much invisible to users. We also allow direct access to the Kubernetes API, and some advanced users do directly interact with it.

Each account has a Kubernetes namespace where they can freely deploy their workloads. We have a number of controls in place to ensure performance, stability, and fairness of the system, including quotas, RBAC permissions, and up until recently PodSecurityPolicies (PSP). At the time of this writing, we had around 3,500 Toolforge tool accounts in the system. We adopted PSP early, in 2019, as a way to make sure Pods had the correct runtime configuration. We needed Pods to stay within the safe boundaries of a set of pre-defined parameters. Back when we adopted PSP there was already the option to use 3rd party agents, like OpenPolicyAgent Gatekeeper, but we decided not to invest in them, and went with a native, built-in mechanism instead.

In 2021 it was announced that the PSP mechanism would be deprecated, and removed in Kubernetes 1.25. Even though we had been warned years in advance, we did not prioritize the migration of PSP until we were on Kubernetes 1.24, and blocked, unable to upgrade without taking action.

The WMCS team explored different alternatives for this migration, but eventually we decided to go with Kyverno as a replacement for PSP. And so, with that decision, began the journey described in this blog post.

First, we needed a source code refactor for one of the key components of our Toolforge Kubernetes: maintain-kubeusers. This custom piece of software, which we built in-house, contains the logic to fetch accounts from LDAP and do the necessary instrumentation on Kubernetes to accommodate each one: create namespace, RBAC, quota, a kubeconfig file, etc. With the refactor, we introduced a proper reconciliation loop, in a way that the software would have a notion of what needs to be done for each account, what would be missing, what to delete, upgrade, and so on. This would allow us to easily deploy new resources for each account, or iterate on their definitions.
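
The core of a reconciliation loop can be sketched in a few lines. This is not the maintain-kubeusers code; the function `plan`, the resource names, and the version strings are all hypothetical, and a real loop would go on to apply the resulting plan against the Kubernetes API.

```python
def plan(desired: dict, actual: dict) -> dict:
    """Compare desired and actual resources, keyed by name, and list the work to do."""
    return {
        "create": sorted(desired.keys() - actual.keys()),
        "delete": sorted(actual.keys() - desired.keys()),
        "update": sorted(
            name
            for name in desired.keys() & actual.keys()
            if desired[name] != actual[name]
        ),
    }


# Hypothetical per-account state: resource name -> definition version.
desired = {"namespace": "v1", "quota": "v2", "rbac": "v1", "kyverno-policy": "v1"}
actual = {"namespace": "v1", "quota": "v1", "rbac": "v1", "psp": "v1"}

assert plan(desired, actual) == {
    "create": ["kyverno-policy"],
    "delete": ["psp"],
    "update": ["quota"],
}
```

Running the same comparison for every account on every pass is what makes it cheap to roll out a new resource type: adding it to the desired state is enough.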

The initial version of the refactor had a number of problems, though. For one, the new version of maintain-kubeusers was doing more filesystem interaction than the previous version, resulting in a slow reconciliation loop over all the accounts. We used NFS as the underlying storage system for Toolforge, and it could be very slow for reasons beyond the scope of this blog post. This was corrected in the next few days after the initial refactor rollout. A side note with an implementation detail: we stored a configmap on each account namespace with the state of each resource. Storing more state on this configmap was our solution to avoid additional NFS latency.

I initially estimated this refactor would take me a week to complete, but unfortunately it took me around three weeks instead. Before the refactor, several manual steps and cleanups were required when updating the definition of a resource. The process is now automated, more robust, performant, efficient and clean. So in my opinion it was worth it, even if it took more time than expected.

Then, we worked on the Kyverno policies themselves. Because we had a very particular PSP setting, in order to ease the transition, we tried to replicate their semantics on a 1:1 basis as much as possible. This involved things like transparent mutation of Pod resources, then validation. Additionally, we had one different PSP definition for each account, so we decided to create one different Kyverno namespaced policy resource for each account namespace — remember, we had 3.5k accounts.

We created a Kyverno policy template that we would then render and inject for each account.
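
A minimal sketch of that render-per-account step follows. The actual Toolforge template is not shown in this post, so everything here is illustrative: the rule, the policy name, the account names, and the UIDs are invented, and only the overall shape of a namespaced Kyverno `Policy` resource is assumed.

```python
import json
from string import Template

# Simplified, illustrative policy, written as JSON so it can be parsed with the stdlib.
POLICY_TEMPLATE = Template("""
{
  "apiVersion": "kyverno.io/v1",
  "kind": "Policy",
  "metadata": {"name": "pod-policy", "namespace": "$namespace"},
  "spec": {
    "validationFailureAction": "$mode",
    "rules": [{
      "name": "require-run-as-user",
      "match": {"any": [{"resources": {"kinds": ["Pod"]}}]},
      "validate": {
        "message": "Pods must run as the tool account user.",
        "pattern": {"spec": {"securityContext": {"runAsUser": $uid}}}
      }
    }]
  }
}
""")


def render_policy(namespace: str, uid: int, mode: str = "Audit") -> dict:
    """Render the template for one account and parse it to catch malformed output."""
    rendered = POLICY_TEMPLATE.substitute(namespace=namespace, uid=uid, mode=mode)
    return json.loads(rendered)


accounts = {"tool-example": 50001, "tool-other": 50002}  # hypothetical accounts
policies = [render_policy(ns, uid) for ns, uid in accounts.items()]

assert policies[0]["metadata"]["namespace"] == "tool-example"
pattern = policies[1]["spec"]["rules"][0]["validate"]["pattern"]
assert pattern["spec"]["securityContext"]["runAsUser"] == 50002
```

Keeping the mode as a template parameter means the same template can serve a staged rollout that starts in Audit mode and later switches to Enforce.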

For developing and testing all this, maintain-kubeusers and the Kyverno bits, we had a project called lima-kilo, which was a local Kubernetes setup replicating production Toolforge. This was used by each engineer on their laptop as a common development environment.

We had planned the migration from PSP to Kyverno policies in stages, like this:

  1. update our internal template generators to make Pod security settings explicit
  2. introduce Kyverno policies in Audit mode
  3. see how the cluster would behave with them, and if we had any offending resources reported by the new policies, and correct them
  4. modify Kyverno policies and set them in Enforce mode
  5. drop PSP

In stage 1, we updated things like the toolforge-jobs-framework and tools-webservice.

In stage 2, when we deployed the 3.5k Kyverno policy resources, our production cluster died almost immediately. Surprise. All the monitoring went red, the Kubernetes apiserver became unresponsive, and we were unable to perform any administrative actions in the Kubernetes control plane, or even the underlying virtual machines. All Toolforge users were impacted. This was a full scale outage that required the energy of the whole WMCS team to recover from. We temporarily disabled Kyverno until we could learn what had occurred.

This incident happened despite having tested beforehand in lima-kilo and in another pre-production cluster we had, called Toolsbeta. But we had not tested with that many policy resources. Clearly, this was something scale-related. After the incident, I went on and created 3.5k Kyverno policy resources on lima-kilo, and indeed I was able to reproduce the outage. We took a number of measures, corrected a few errors in our infrastructure, and reached out to the Kyverno upstream developers for advice. In the end, we did the following to adapt the setup to our needs:

  • corrected the external HAProxy health checks for the Kubernetes apiserver, from checking just for open TCP ports to actually checking the /healthz HTTP endpoint, which more accurately reflects the health of each k8s apiserver.
  • made the development environment more realistic. In lima-kilo, we created a couple of helper scripts to create/delete 4000 policy resources, each in a different namespace.
  • greatly over-provisioned memory in the Kubernetes control plane servers, that is, more memory in the base virtual machines hosting the control plane. Increasing the memory headroom of the apiserver would prevent it from running out of memory and therefore crashing the whole system. We went from 8GB RAM per virtual machine to 32GB. In our cluster, a single apiserver pod could eat 7GB of memory on a normal day, so having 8GB on the base virtual machine was clearly not enough. I also sent a patch proposal to the Kyverno upstream documentation suggesting they clarify the additional memory pressure on the apiserver.
  • corrected the resource requests and limits of Kyverno, to more accurately describe our actual usage.
  • increased the number of replicas of the Kyverno admission controller to 7, so Kyverno could handle admission requests in a more timely manner.
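
To make the first item concrete, the following Python sketch contrasts the two kinds of health check. It is only an illustration of the idea and not the HAProxy configuration we used: the function names are invented for this example, and it speaks plain HTTP without authentication, whereas a real apiserver endpoint is served over TLS.

```python
import http.client
import socket
import urllib.error
import urllib.request


def tcp_check(host: str, port: int, timeout: float = 2.0) -> bool:
    """Shallow check: healthy as soon as the TCP port accepts a connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_check(url: str, timeout: float = 2.0) -> bool:
    """Deeper check: healthy only if the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers refused connections, timeouts, and non-2xx responses,
        # which urllib raises as HTTPError (a subclass of URLError).
        return False
```

A server that is overloaded but still listening passes `tcp_check` while failing `http_check` on its health endpoint, so a load balancer relying only on the TCP check keeps sending traffic to a backend that cannot serve it.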

I have to admit, I was briefly tempted to drop Kyverno, and even stop pursuing using an external policy agent entirely, and write our own custom admission controller out of concerns over performance of this architecture. However, after applying all the measures listed above, the system became very stable, so we decided to move forward. The second attempt at deploying it all went through just fine. No outage this time 🙂

When we were in stage 4 we detected another bug. We had been following the Kubernetes upstream documentation for setting securityContext to the right values. In particular, we were enforcing that procMount be set to the default value, which according to the docs was ‘DefaultProcMount’. However, that string is the name of the internal variable in the source code, whereas the actual default value is the string ‘Default’. This caused pods to be rightfully rejected by Kyverno while we figured out the problem. I sent a patch upstream to fix this problem.
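
The effect of that mismatch can be sketched in a few lines of Python. This is a simplified stand-in for the Kyverno rule, not the rule itself: the function name is made up, and treating an unset procMount as acceptable is an assumption of this example.

```python
def proc_mount_violations(pod: dict, required: str = "Default") -> list[str]:
    """Return one message per container whose procMount differs from `required`.

    Containers that do not set procMount are accepted here, on the assumption
    that the API server applies the default for them.
    """
    violations = []
    spec = pod.get("spec", {})
    for field in ("initContainers", "containers", "ephemeralContainers"):
        for container in spec.get(field, []):
            value = container.get("securityContext", {}).get("procMount")
            if value is not None and value != required:
                violations.append(
                    f"{container.get('name', '?')}: procMount is {value!r}, "
                    f"expected {required!r}"
                )
    return violations
```

With `required="DefaultProcMount"`, the value taken from the documentation, a pod that explicitly sets `procMount: Default` is reported as a violation. With `required="Default"` the same pod passes.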

We finally had everything in place, reached stage 5, and we were able to disable PSP. We unloaded the PSP controller from the kubernetes apiserver, and deleted every individual PSP definition. Everything was very smooth in this last step of the migration.

This whole PSP project, including the maintain-kubeusers refactor, the outage, and all the different migration stages took roughly three months to complete.

For me there are a number of valuable lessons to learn from this project. For one, scale is something to consider, and test, when evaluating a new architecture or software component. Not doing so can lead to service outages or unexpectedly poor performance. This is in the first chapter of the SRE handbook, but we got a reminder the hard way 🙂

This post was originally published in the Wikimedia Tech blog, authored by Arturo Borrero Gonzalez.

KDE Gear 24.05.2

Over 180 individual programs plus dozens of programmer libraries and feature plugins are released simultaneously as part of KDE Gear.

Today they all get new bugfix source releases with updated translations, including:

  • kdepim-runtime: Fix a memory leak in the EWS resource (Commit, fixes bug #486861)
  • kio-gdrive: Fix "This file does not exist" after clicking on a folder (Commit, fixes bug #487021)
  • partitionmanager: Fix a crash caused by clicking the remove mount point button (Commit, fixes bug #432103)

Distro and app store packagers should update their application packages.

Categories: FLOSS Project Planets

Keychain Development Update: Yubikey Support

Planet KDE - Wed, 2024-07-03 20:00

Following my latest post about Keychain, here is a new development update. YubiKey and Key Files are now supported, which allows you to require a YubiKey not only to open a password database but also to save it.

Saving and editing groups also now works.

Group editing dialog

I have also started working on the database creation process. The UI is ready but I still need to bind it to the backend.

Thanks to everyone who sent me encouraging messages, and also to Laurent, who did a lot of cleanups in the codebase.

See you in the next development update.


Planet Drupal - Wed, 2024-07-03 19:07

With Drupal 7’s (D7) end-of-life (EOL) in 6 months on January 5, 2025, organizations relying on D7 face critical decisions regarding the future of their websites. This article will help guide you through the paths you can take: migrating to modern Drupal, leveraging extended long-term support options, or staying on unsupported Drupal 7. 

Update to Modern Drupal

Transitioning from Drupal 7 to a newer version is crucial in future-proofing your digital presence. These versions embrace modern PHP standards, object-oriented programming, and Symfony components, providing a powerful foundation for your website. This upgrade allows you to access advanced features, enhanced performance, and ensures ongoing support and security updates.

Why Migrate to the Latest Versions?

Modernization: Drupal 10 offers cutting-edge features and performance improvements, and an easy upgrade path to Drupal 11, releasing very soon.
Security: Continuous security updates protect your site from vulnerabilities.
Flexibility: Adopt contemporary coding standards and best practices.
Future-Proofing: Ensure compatibility with future updates and maintain a seamless digital experience.

Additionally, the upcoming release of Starshot, slated before the end of 2024, promises even more enhancements and features that will elevate your website's capabilities. By migrating now, your organization can seamlessly integrate these future advancements.

Migrating to newer versions can involve navigating significant architectural changes, and may require extensive modifications to custom modules and themes. However, tools like Drupal Rector and Retrofit on our DIY migration resources page can help make this process easier. The benefits of modernization, enhanced security, and future-proofing outweigh the initial investment in time, resources, and budget.

But you don’t have to do it yourself.  There are a number of Drupal Certified Partners who can assist organizations in planning and implementing their migration.

Find the qualified company that is best for you: Certified Migration Partners.

Extended Security Support for Drupal 7

To address the challenges of using unsupported software, the Drupal Association has established a program for supporting site owners who won't be able to migrate before the end of life date.  The D7 Extended Security Support Program identifies existing Drupal Certified Partners who meet stringent standards and who the Drupal Association feels confident recommending.

With the end of support, the Drupal Security Team will no longer be involved in supporting Drupal 7.  The Drupal Association recognizes that some site owners will not be in a position to migrate their site or need more time to do so.  For many of these site owners, paying for extended support would be a good option.

Recognizing that the Drupal Security Team would not be officially involved in any such service, the Drupal Association created rigorous standards before certifying companies under this program.  Some of these requirements include:

  • Being a Drupal Certified Partner at the Gold tier or higher
  • Employing a core security team member
  • Experience in providing security and compatibility fixes
  • History of reporting 2 or more CVEs and creating fixes for the same
  • Willingness to enter into a service level agreement to ensure standards are being met

Find the company that will work best for you: D7 Extended Security Support Partners

Stay on Unsupported Drupal 7

When Drupal 7 reaches its EOL, it will no longer receive new security updates, fixes, or official support from the Drupal community. While this option might seem cost-effective and leverages your team's stability and familiarity with Drupal 7, it comes with significant risks.

Without updates, your site will be vulnerable to new security exploits and non-compliance with standards such as FedRAMP, PCI-DSS, and HIPAA. Over time, tools and utilities supporting your Drupal 7 site may become incompatible with new versions of dependencies like PHP, and finding developers skilled in outdated technology could become increasingly difficult.

The Drupal Association does not recommend this option.


Organizations must carefully weigh their options as Drupal 7 approaches its EOL to ensure continued security, compliance, and compatibility. Embracing Drupal 10 and the upcoming Drupal 11 and Starshot releases will position your organization for long-term success with access to the latest features, security updates, and a vibrant support community.


GNU Planet! - Wed, 2024-07-03 19:03

 When NeXT still existed and the black hardware was a thing, Steve Jobs made the announcement that OPENSTEP would be created and that the object model, not the operating system and not the hardware, was the important thing.

This is a concept that Apple has forgotten.  With its push towards Apple Silicon and a walled garden, Apple has committed itself to the same pitfall that NeXT fell into.  NeXT lacked the infrastructure to handle OPENSTEP running on multiple kinds of hardware, but the object model on different OSes was successful... this is evident in OPENSTEP 1.1 for Solaris and OPENSTEP for NT.

GNUstep attempts to reach the same goal, but provides the APIs that are available with Cocoa.   The object model IS the important thing and this is why GNUstep is so important.  It breaks the walled garden and makes it possible for users to run their apps and tools on other operating systems.  GNUstep HASN'T forgotten and we believe this is a core concept that Apple has left behind.

