Feeds

Horizontal Digital Blog: Drupal's bundle classes offer granular control over node URLs

Planet Drupal - Thu, 2024-09-05 07:40
If you're reading this, you probably know that in Drupal a node can be accessed at its so-called canonical link at /node/{node id}. You also likely know that by enabling the core Path module, you can spice things up by setting a url alias. Further, with the contributed Pathauto and Redirect modules, you can make additional url magic happen automatically.
Categories: FLOSS Project Planets

Tag1 Consulting: Migrating Your Data from D7 to D10: Migrating field widget settings

Planet Drupal - Thu, 2024-09-05 07:30

Today, we continue with the next field-related migration: widgets. While doing so, we will find out that new migrations might uncover issues or misconfigurations with already executed migrations.

Read more mauricio Thu, 09/05/2024 - 04:30

Python Software Foundation: Pallets projects added to scope of PSF CVE Numbering Authority

Planet Python - Thu, 2024-09-05 04:55
Last year the Python Software Foundation was announced as a CVE Numbering Authority (CNA) to manage and assign CVE IDs for CPython and pip. Becoming a CVE Numbering Authority allows the PSF to provide expertise about Python in the CVE ecosystem, ensuring that users have accurate and up-to-date information about vulnerabilities affecting key projects.
Today, the PSF is expanding our CNA scope to also include Pallets projects, such as Flask, Jinja, Click, and Quart. For a complete list, see the Pallets organization on GitHub. Please report any security vulnerabilities for these projects following the Pallets security policy.
 This work is being done to learn how the PSF can better serve Python's large ecosystem of projects in the context of the CVE ecosystem. The PSF previously published a guide on how open source projects can become their own CVE Numbering Authorities. You can learn more about the CVE CNA program on the CVE website.

Pallets is a fiscal sponsoree of the Python Software Foundation. Fiscal sponsorship is a key plank of the PSF’s mission in supporting the Python community. The PSF supports 20 fiscal sponsorees including regional PyCons, Python Meetup and User Groups, and Python projects. Learn more about our Fiscal Sponsorees on our website and consider supporting the groups with a US-tax deductible donation.


Sandro Tosi: TL;DR belongs at the top of an article

Planet Debian - Thu, 2024-09-05 02:21

 TL;DR

  • if you are writing an article and plan to add a TL;DR section, then put it at the very top, right after the title.
  • that's it, no excuses, end of discussion.
It has probably happened to everyone: you read an article, reach the end of it only to see a TL;DR section right at the bottom, and think: "eeh i wish this would have been at the top so i didnt have to read (DR) this long article (TL) to gather its core ideas".
If the reason for "Too Long; Didn't Read" to exist is to spare the reader from going through the whole article to get its main points, then the natural place to present it is at the very top of said article.
So if you're planning on writing something and adding a TL;DR section (you don't have to, of course, but if you do that work too) then please position it at the very beginning of your work.


Python GUIs: Build a Translation Application Using Tkinter and OpenAI — Use ChatGPT to Translate Your Text from Python

Planet Python - Thu, 2024-09-05 02:00

Translation tools have existed for many years and are incredibly useful if you're learning a new language or want to read foreign websites. One of the most popular tools is Google Translate, but there is now another alternative: using OpenAI's ChatGPT tool to translate text.

In this tutorial, we'll build a desktop translator application to translate natural language using ChatGPT APIs. We'll be building the UI using the Tkinter library from the Python standard library:

Example translation of text via OpenAI

Installing the Required Packages

Our Translator uses the openai library to perform the actual translation via OpenAI's ChatGPT tool. Tkinter is already available in the standard library.

The first task will be to set up a Python virtual environment. Open the terminal, and run the following commands:

```bat
> mkdir translator
> cd translator
> python -m venv venv
> .\venv\Scripts\activate
> python -m pip install openai
```

```sh
$ mkdir translator
$ cd translator/
$ python -m venv venv
$ source venv/bin/activate
(venv) $ python -m pip install openai
```

Working through these instructions, first we create a root directory for the Translator app. Next we create and activate a Python virtual environment for the project. Finally, we install the openai package.

Next, create a file named translator.py in the root of your project. Also add a folder called images/ where you'll store the icons for the application. The folder structure should look like this:

```
translator/
│
├── images/
│   ├── arrow.png
│   └── logo.png
│
└── translator.py
```

The images for this project can be downloaded here.

The images/ folder contains the two icons that you'll use for the application. The translator.py is the app's source file.

Building the Window

Open the translator.py file with your favorite Python code editor. We'll start by creating our main window:

```python
import tkinter as tk


class TranslatorApp(tk.Tk):
    def __init__(self):
        super().__init__()
        self.title("Language Translator")
        self.resizable(width=False, height=False)


if __name__ == "__main__":
    app = TranslatorApp()
    app.mainloop()
```

This code imports Tkinter and then defines the application's main class, which we have called TranslatorApp. This class will hold the application's main window and allow us to run the main loop.

Importing tkinter under the alias tk is a common convention in Tkinter code.

Inside the class we define the __init__() method, which handles initialization of the class. In this method, we first call the initializer __init__() of the parent class, tk.Tk, to initialize the app's window. Then, we set the window's title using the title() method. To make the window unresizable, we use the resizable() method with width and height set to False.

At the bottom of the code, we have the if __name__ == "__main__" idiom to check whether the file is being run directly as an executable program. Inside the condition block we first create an instance of TranslatorApp and then run the application's main loop or event loop.

If you run this code, you'll get an empty Tkinter window on your desktop:

```sh
$ python translator.py
```

The empty Tkinter window

Creating the GUI for the Translator App

Now that the main window is set up, let's start adding widgets to build the GUI. To do this, we'll create a method called setup_ui(), as shown below:

```python
import tkinter as tk


class TranslatorApp(tk.Tk):
    def __init__(self):
        super().__init__()
        self.title("Language Translator")
        self.resizable(width=False, height=False)
        self.setup_ui()

    def setup_ui(self):
        frame = tk.Frame(self)
        frame.pack(padx=10, pady=10)


if __name__ == "__main__":
    app = TranslatorApp()
    app.mainloop()
```

The setup_ui() method will define the application's GUI. In this method, we first create a frame widget using the tk.Frame class whose master argument is set to self (the application's main window). Next, we position the frame inside the main window using the pack() geometry manager, using padx and pady arguments to set some padding around the frame.

Finally, we add the call to self.setup_ui() to the __init__() method.

We'll continue to develop the UI by adding code to the setup_ui() method.

Next, we'll add the app's logo. In the setup_ui() method add the following code below the frame definition:

```python
import tkinter as tk


class TranslatorApp(tk.Tk):
    def __init__(self):
        super().__init__()
        self.title("Language Translator")
        self.resizable(width=False, height=False)
        self.setup_ui()

    def setup_ui(self):
        frame = tk.Frame(self)
        frame.pack(padx=10, pady=10)

        self.logo = tk.PhotoImage(file="images/logo.png")
        tk.Label(frame, image=self.logo).grid(row=0, column=0, sticky="w")


if __name__ == "__main__":
    app = TranslatorApp()
    app.mainloop()
```

This code loads the logo using the tk.PhotoImage class. (If an image is too large, it can be scaled down with the subsample() method, which we use for the arrow icon later.) Then, we add the logo to the frame using a tk.Label widget. The label takes the frame and the logo as arguments. Finally, to position the logo, we use the grid() geometry manager with appropriate values for the row, column, and sticky arguments.

The sticky argument determines which side of a cell the widget should align to -- North (top), South (bottom), East (right) or West (left). Here we're aligning it to the West or left of the cell with "w":

Tkinter window with the OpenAI logo in it

Getting a List of Languages

We need a list of languages to show in the dropdown. There are various lists available online. But since we're using OpenAI for the translations, why not use it to give us the list of languages too? Since this is just for testing purposes, let's grab the top 20 human languages (by first and second language speakers).

We can prompt ChatGPT with something like:

Give me a list of the top 20 human languages with the most first and second language speakers in Python list format

...and it will return a list like the following (only ten entries are shown here):

```python
languages = [
    "English",
    "Mandarin Chinese",
    "Hindi",
    "Spanish",
    "French",
    "Standard Arabic",
    "Bengali",
    "Russian",
    "Portuguese",
    "Urdu"
]
```

I'm going to add Dutch to the list, because it's my second language. Feel free to add your own languages to the list.
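If you would rather not copy the list by hand, the reply text can be parsed. The following is a minimal sketch, assuming the reply contains a single flat Python list of strings; parse_language_list is a hypothetical helper and not part of the tutorial's application.

```python
import ast
import re


def parse_language_list(reply: str) -> list[str]:
    """Extract the first Python list literal of strings from a model reply."""
    # Grab the first [...] span; the reply may wrap it in prose or an assignment.
    match = re.search(r"\[.*?\]", reply, flags=re.DOTALL)
    if match is None:
        raise ValueError("no list literal found in reply")
    # literal_eval only evaluates literals, so it never runs arbitrary code.
    value = ast.literal_eval(match.group(0))
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise ValueError("reply did not contain a list of strings")
    return value


# Example reply text (made up for illustration).
reply = 'languages = ["English", "Hindi", "Spanish"]'
languages = parse_language_list(reply)
languages.append("Dutch")  # add your own languages afterwards
print(languages)
```

Using ast.literal_eval instead of eval keeps the parsing safe even if the reply contains something unexpected.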

Adding the Interface

Let's start adding some inputs to the UI. First we'll create the language selection drop-down boxes:

```python
import tkinter as tk
import tkinter.ttk as ttk

LANGUAGES = [
    "English",
    "Mandarin Chinese",
    "Hindi",
    "Spanish",
    "French",
    "Standard Arabic",
    "Bengali",
    "Russian",
    "Portuguese",
    "Urdu",
    "Dutch",  # Gekoloniseerd.
]
DEFAULT_SOURCE = "English"
DEFAULT_DEST = "Dutch"


class TranslatorApp(tk.Tk):
    def __init__(self):
        super().__init__()
        self.title("Language Translator")
        self.resizable(width=False, height=False)
        self.setup_ui()

    def setup_ui(self):
        frame = tk.Frame(self)
        frame.pack(padx=10, pady=10)

        self.logo = tk.PhotoImage(file="images/logo.png")
        tk.Label(frame, image=self.logo).grid(row=0, column=0, sticky="w")

        # Source language combobox
        self.from_language = ttk.Combobox(frame, values=LANGUAGES)
        self.from_language.current(LANGUAGES.index(DEFAULT_SOURCE))
        self.from_language.grid(row=1, column=0, sticky="we")

        # Arrow icon
        self.arrows = tk.PhotoImage(file="images/arrow.png").subsample(15, 15)
        tk.Label(frame, image=self.arrows).grid(row=1, column=1)

        # Destination language combobox
        self.to_language = ttk.Combobox(frame, values=LANGUAGES)
        self.to_language.current(LANGUAGES.index(DEFAULT_DEST))
        self.to_language.grid(row=1, column=2, sticky="we")


if __name__ == "__main__":
    app = TranslatorApp()
    app.mainloop()
```

We have added our language list as the constant LANGUAGES. We also define the default languages for when the application starts up, using constants DEFAULT_SOURCE and DEFAULT_DEST.

Next, we create two combo boxes to hold the list of source and destination languages. The combo boxes are created using the ttk.Combobox class, one on the left and another on the right. Between the combo boxes, we've also added an arrow icon loaded using the tk.PhotoImage class. Again, we've added the icon to the app's window using tk.Label.

Both combo boxes take frame and values as arguments. The values argument populates the combo boxes with languages. To specify the default language, we use the current() method, looking up the position of our default languages in the languages list with .index().
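One thing to keep in mind is that list.index() raises ValueError when the item is missing, so a typo in a default language would stop the app at startup. A minimal sketch of a more forgiving lookup follows; default_index is a hypothetical helper, not something the tutorial defines.

```python
def default_index(languages, default, fallback=0):
    """Return the position of `default` in `languages`, or `fallback` if absent."""
    try:
        return languages.index(default)
    except ValueError:
        # list.index() raises ValueError when the item is not in the list.
        return fallback


LANGUAGES = ["English", "Spanish", "Dutch"]
print(default_index(LANGUAGES, "Dutch"))    # position of an existing entry
print(default_index(LANGUAGES, "Klingon"))  # falls back to the first entry
```

The result could then be passed to the combo box's current() method in place of LANGUAGES.index(...).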

To position the combo boxes inside the frame, we use the grid() geometry manager with the appropriate arguments. Run the application, and you will see the following window:

Source and destination languages

With the source and destination combo boxes in place, let's add three more widgets: two scrollable text widgets and a button. The scrollable text on the left will hold the source text, while the scrollable text on the right will hold the translated text. The button will allow us to run the actual translation.

Building the Input UI

Get back to the code editor and update the setup_ui() method as follows. Note that we also need to import the ScrolledText class:

```python
import tkinter as tk
import tkinter.ttk as ttk
from tkinter.scrolledtext import ScrolledText

LANGUAGES = [
    "English",
    "Mandarin Chinese",
    "Hindi",
    "Spanish",
    "French",
    "Standard Arabic",
    "Bengali",
    "Russian",
    "Portuguese",
    "Urdu",
    "Dutch",
]
DEFAULT_SOURCE = "English"
DEFAULT_DEST = "Dutch"


class TranslatorApp(tk.Tk):
    def __init__(self):
        super().__init__()
        self.title("Language Translator")
        self.resizable(width=False, height=False)
        self.setup_ui()

    def setup_ui(self):
        frame = tk.Frame(self)
        frame.pack(padx=10, pady=10)

        self.logo = tk.PhotoImage(file="images/logo.png")
        tk.Label(frame, image=self.logo).grid(row=0, column=0, sticky="w")

        # Source language combobox (LANGUAGES is a plain list, used directly)
        self.from_language = ttk.Combobox(frame, values=LANGUAGES)
        self.from_language.current(LANGUAGES.index(DEFAULT_SOURCE))
        self.from_language.grid(row=1, column=0, sticky="we")

        # Arrow icon
        self.arrows = tk.PhotoImage(file="images/arrow.png").subsample(15, 15)
        tk.Label(frame, image=self.arrows).grid(row=1, column=1)

        # Destination language combobox
        self.to_language = ttk.Combobox(frame, values=LANGUAGES)
        self.to_language.current(LANGUAGES.index(DEFAULT_DEST))
        self.to_language.grid(row=1, column=2, sticky="we")

        # Source text
        self.from_text = ScrolledText(
            frame,
            font=("Dotum", 16),
            width=50,
            height=20,
        )
        self.from_text.grid(row=2, column=0)

        # Translated text
        self.to_text = ScrolledText(
            frame,
            font=("Dotum", 16),
            width=50,
            height=20,
            state="disabled",
        )
        self.to_text.grid(row=2, column=2)

        # Translate button
        self.translate_button = ttk.Button(
            frame,
            text="Translate",
            command=self.translate,
        )
        self.translate_button.grid(row=3, column=0, columnspan=3, pady=10)

    def translate(self):
        pass


if __name__ == "__main__":
    app = TranslatorApp()
    app.mainloop()
```

In the code snippet, we use the ScrolledText class to create the two scrolled text areas. Both text areas take frame, font, width, and height as arguments. The second text area also takes state as an additional argument. Setting state to "disabled" allows us to create a read-only text area.

Then, we use the ttk.Button class to create a button with frame, text, and command as arguments. The command argument allows us to bind the button's click event to the self.translate() method, which we will define in a moment. For now, we've added a placeholder.

To position all these widgets on the app's window, we use the grid() geometry manager. Now, the app will look something like the following:

Translator app's GUI

Our translation app's GUI is ready! Finally, we can start adding functionality to the application.

Getting an OpenAI API Key

You can use OpenAI's APIs for free, with some limitations. To get an OpenAI API key you will need to create an account. Once you have created an account go ahead and get an API key.

Click "Create new secret key" in the top right hand corner to create a key. Give the key a name (it doesn't matter what you use) and then click "Create secret key". Copy the resulting key and keep it safe. You'll need it in the next step.

Implementing the Translation Functionality

We'll implement the language translation functionality in the translate() method. This gets the current text from the UI and then uses openai to perform the translation. We need a few more imports, and to create the OpenAI client instance at the top of the application:

```python
import tkinter as tk
import tkinter.ttk as ttk
from tkinter.messagebox import showerror
from tkinter.scrolledtext import ScrolledText

import httpcore
from openai import OpenAI

client = OpenAI(
    api_key="<YOUR API KEY HERE>"
)
```

Here we've imported the showerror helper for displaying error boxes in our application. We've imported httpcore which we'll use to handle HTTP errors when accessing the API. Finally, we've added an import for the OpenAI class from openai. This is what handles the actual translation.

To use it, we create an instance of the OpenAI class and assign it to client. Replace <YOUR API KEY HERE> with the API key you generated on OpenAI just now.
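Hard-coding a secret key makes it easy to publish by accident. A minimal sketch of reading it from the environment instead follows, assuming you export a variable named OPENAI_API_KEY before starting the app; load_api_key is a hypothetical helper.

```python
import os


def load_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key stored in the environment, or raise a clear error."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first")
    return key


# Usage inside translator.py (commented out so the sketch runs without a key):
# client = OpenAI(api_key=load_api_key())
```

The openai client also looks for OPENAI_API_KEY on its own when api_key is omitted, so the helper is mainly useful for producing a friendlier error message.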

We'll continue by implementing the translate() method. Below we're just showing the function itself:

```python
class TranslatorApp(tk.Tk):
    # ...

    def translate(self):
        source_language = self.from_language.get()
        destination_language = self.to_language.get()
        text = self.from_text.get(1.0, tk.END).strip()

        try:
            completion = client.chat.completions.create(
                messages=[
                    {"role": "system", "content": "You are a language interpreter."},
                    {
                        "role": "user",
                        "content": (
                            f"Translate the following text from {source_language} "
                            f"to {destination_language}, only reply with the text: "
                            f"{text}"
                        ),
                    },
                ],
                model="gpt-3.5-turbo",
            )
            reply = completion.choices[0].message.content
        except httpcore.ConnectError:
            showerror(
                title="Error",
                message="Make sure you have an internet connection",
            )
            return
        except Exception as e:
            showerror(
                title="Error",
                message=f"An unexpected error occurred: {e}",
            )
            return

        self.to_text.config(state="normal")
        self.to_text.delete(1.0, tk.END)
        self.to_text.insert(tk.END, reply)
        self.to_text.config(state="disabled")
```

The translate() method handles the entire translation process. It starts by retrieving the source and destination languages from the corresponding combo boxes, and the input text from the box on the left.

The code shown here does not check these values before calling the API. If any of them could be empty, a showerror dialog can be used to inform the user of the problem.
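A check for missing input could look roughly like this. It is a minimal sketch with made-up messages; find_input_problem is a hypothetical helper and is not part of the tutorial's code.

```python
def find_input_problem(source_language, destination_language, text):
    """Return a message describing the first problem found, or None if all is well."""
    if not source_language.strip():
        return "Select a source language"
    if not destination_language.strip():
        return "Select a destination language"
    if not text.strip():
        return "Enter some text to translate"
    return None


print(find_input_problem("English", "Dutch", ""))
print(find_input_problem("English", "Dutch", "Hello"))
```

Inside translate(), the result could be checked before the API request: if it is not None, pass it to showerror(title="Error", message=...) and return early.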

Once we have the source and destination language and some text to translate, we can perform the actual translation through ChatGPT. First, we give the language model a hint about what we want it to do -- interpret language:

```python
{"role": "system", "content": "You are a language interpreter."},
```

Next we build the message we want it to respond to. We ask it to translate the provided text from the source to destination language, and to respond with only the translated text. If we don't specify this, we'll get some additional description or context.

You might want to experiment with asking for the text and context separately, as that is often helpful when learning languages.

```python
{
    "role": "user",
    "content": (
        f"Translate the following text from {source_language} "
        f"to {destination_language}, only reply with the text: "
        f"{text}"
    ),
},
```
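To make experimenting with the prompt easier, the two messages can be built by a small function. This is a sketch only: build_messages and its with_notes flag are hypothetical, and the wording of the notes variant is just one possible way to ask for extra context.

```python
def build_messages(source_language, destination_language, text, with_notes=False):
    """Build the chat messages for one translation request."""
    if with_notes:
        # Hypothetical variant: ask for the translation plus a short explanation.
        instruction = (
            "reply with the translated text, then a blank line, "
            "then brief usage notes: "
        )
    else:
        instruction = "only reply with the text: "
    return [
        {"role": "system", "content": "You are a language interpreter."},
        {
            "role": "user",
            "content": (
                f"Translate the following text from {source_language} "
                f"to {destination_language}, {instruction}{text}"
            ),
        },
    ]


messages = build_messages("English", "Dutch", "Hello")
print(messages[1]["content"])
```

The returned list can be passed as the messages argument of client.chat.completions.create().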

The created completion is submitted to the API and we can retrieve the resulting text from the object:

```python
reply = completion.choices[0].message.content
```

If the call to translate() finds a connection error, then we tell the user to check their internet connection. To handle any other exceptions, we catch the generic Exception class and display an error message with the exception details.

If the translation is successful, then we enable the destination scrolled area, display the translated text, and disable the area again so it remains read-only.
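The enable, update, disable sequence can be wrapped in one function. Below is a minimal sketch; set_readonly_text is a hypothetical helper, and FakeText is a stand-in used only so the example runs without opening a window.

```python
def set_readonly_text(widget, text):
    """Replace the contents of a disabled text widget and lock it again."""
    widget.config(state="normal")    # temporarily allow edits
    widget.delete("1.0", "end")      # "end" is the value of tk.END
    widget.insert("end", text)
    widget.config(state="disabled")  # back to read-only


class FakeText:
    """Stand-in that records calls, so the sketch runs without a display."""

    def __init__(self):
        self.calls = []

    def config(self, **options):
        self.calls.append(("config", options))

    def delete(self, start, end):
        self.calls.append(("delete", start, end))

    def insert(self, index, text):
        self.calls.append(("insert", index, text))


fake = FakeText()
set_readonly_text(fake, "Hallo")
print(fake.calls)
```

In the app, the same function would be called with self.to_text and the reply from the API.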

The complete final code is shown below:

```python
import tkinter as tk
import tkinter.ttk as ttk
from tkinter.messagebox import showerror
from tkinter.scrolledtext import ScrolledText

import httpcore
from openai import OpenAI

client = OpenAI(
    api_key="<YOUR API KEY HERE>"
)

LANGUAGES = [
    "English",
    "Mandarin Chinese",
    "Hindi",
    "Spanish",
    "French",
    "Standard Arabic",
    "Bengali",
    "Russian",
    "Portuguese",
    "Urdu",
    "Dutch",
]
DEFAULT_SOURCE = "English"
DEFAULT_DEST = "Dutch"


class TranslatorApp(tk.Tk):
    def __init__(self):
        super().__init__()
        self.title("Language Translator")
        self.resizable(width=False, height=False)
        self.setup_ui()

    def setup_ui(self):
        frame = tk.Frame(self)
        frame.pack(padx=10, pady=10)

        self.logo = tk.PhotoImage(file="images/logo.png")
        tk.Label(frame, image=self.logo).grid(row=0, column=0, sticky="w")

        # Source language combobox
        self.from_language = ttk.Combobox(frame, values=LANGUAGES)
        self.from_language.current(LANGUAGES.index(DEFAULT_SOURCE))
        self.from_language.grid(row=1, column=0, sticky="we")

        # Arrow icon
        self.arrows = tk.PhotoImage(file="images/arrow.png").subsample(15, 15)
        tk.Label(frame, image=self.arrows).grid(row=1, column=1)

        # Destination language combobox
        self.to_language = ttk.Combobox(frame, values=LANGUAGES)
        self.to_language.current(LANGUAGES.index(DEFAULT_DEST))
        self.to_language.grid(row=1, column=2, sticky="we")

        # Source text
        self.from_text = ScrolledText(
            frame,
            font=("Dotum", 16),
            width=50,
            height=20,
        )
        self.from_text.grid(row=2, column=0)

        # Translated text
        self.to_text = ScrolledText(
            frame,
            font=("Dotum", 16),
            width=50,
            height=20,
            state="disabled",
        )
        self.to_text.grid(row=2, column=2)

        # Translate button
        self.translate_button = ttk.Button(
            frame,
            text="Translate",
            command=self.translate,
        )
        self.translate_button.grid(row=3, column=0, columnspan=3, pady=10)

    def translate(self):
        source_language = self.from_language.get()
        destination_language = self.to_language.get()
        text = self.from_text.get(1.0, tk.END).strip()

        try:
            completion = client.chat.completions.create(
                messages=[
                    {"role": "system", "content": "You are a language interpreter."},
                    {
                        "role": "user",
                        "content": (
                            f"Translate the following text from {source_language} "
                            f"to {destination_language}, only reply with the text: "
                            f"{text}"
                        ),
                    },
                ],
                model="gpt-3.5-turbo",
            )
            reply = completion.choices[0].message.content
        except httpcore.ConnectError:
            showerror(
                title="Error",
                message="Make sure you have an internet connection",
            )
            return
        except Exception as e:
            showerror(
                title="Error",
                message=f"An unexpected error occurred: {e}",
            )
            return

        self.to_text.config(state="normal")
        self.to_text.delete(1.0, tk.END)
        self.to_text.insert(tk.END, reply)
        self.to_text.config(state="disabled")


if __name__ == "__main__":
    app = TranslatorApp()
    app.mainloop()
```

The finished app is shown below:

The completed Translator app

Conclusion

In this tutorial we built a Translator application using the Tkinter GUI library from the Python standard library. We worked step by step through building the UI using a grid layout, and then implemented the language translation functionality with openai & ChatGPT.

Try taking what you've learnt in this tutorial and applying it to your own projects!


Quansight Labs Blog: Announcing Scientific Python Accessibility Events

Planet Python - Wed, 2024-09-04 20:00
I am happy to announce two upcoming public events focused on helping the scientific Python ecosystem develop their accessibility skills before the new year.

Goals Sprint Recap

Planet KDE - Wed, 2024-09-04 20:00

In April we had the combined goals sprint, where a fine group of KDE people working on things around Automation & Systematization, Sustainable Software, and Accessibility got together. It was a nice cross-over of the KDE goals, taking advantage of having people in one room for a weekend to directly discuss topics of the goals and interactions between them. David, Albert, Nate, Nico, and Volker wrote about their impressions from the sprint.

So what happened regarding the Sustainable Software goal at the sprint and where are we today with these topics? There are some more detailed notes of the sprint. Here is a summary of some key topics with an update on current progress.

Kick-Off for the Opt-Green project

The Opt-Green project is the second funded project of the KDE Eco team. The first one was the Blue Angel for Free Software project, where we worked on creating material helping Free Software projects to assess and meet the criteria for the Blue Angel certification for resource and energy-efficient software products.

The Opt Green project is about promotion of extending the operating life of hardware with Free Software to reduce electronic waste. It's funded for two years by the German Federal Environment Agency and the Federal Ministry for the Environment, Nature Conservation, Nuclear Safety and Consumer Protection and is running from April 2024 to March 2026.

Figure: Opt-Green presentation

Joseph introduced the project, why it's important, how the environment is suffering from software-induced hardware obsolescence, and how Free Software in general and KDE specifically can help with fighting it. The approach of the project is to go beyond our typical audience and introduce people who are environmentally aware but not necessarily very technical to the idea of running sustainable, up-to-date Free Software on their computers, even devices they may think are no longer usable due to lack of vendor support. In many cases this is a perfectly fine solution, and it's surprisingly attractive to a number of people who care about sustainability but haven't really been introduced to Free Software yet.

Where we are today

The project is in full swing. The project has already been present at quite a number of events to motivate people to install Free Software on their (old) devices and support them in how to do it. See for example the report about the Academy of Games for upcoming 9th graders in Hannover, Germany.

Revamping the KDE Eco website

We had a great session putting together ideas and concepts about how we could improve the KDE Eco website. From brainstorming ideas to sketching a wireframe as a group, we discussed and agreed on a direction of how to present what we are doing in the KDE Eco team.

Figure: KDE Eco website sketches

The key idea is to focus on three main audiences (end users, advocates, and developers) and present specific material targeted at these groups. This nicely matches what we already have, e.g., the KDE Eco handbook for how to fulfill the Blue Angel criteria for developers, or the material being produced for events reaching out to end users, while giving it a much more focused presentation.

Where we are today

The first iteration of the new design is now live on eco.kde.org. There is more to come, but it already gives an impression where this is going. Anita created a wonderful set of design elements which will help to shape the visual identity of KDE Eco going forward.

Surveying end users about their attitude to hardware reuse

Making use of old hardware by installing sustainable free software on it is a wide field. There are many different variations of devices and what users do with them also varies a lot. What are the factors that might encourage users to reuse old hardware, what is holding them back?

To get a bit more reliable answers to these questions we came up with a concept for a user survey which can be used at events where we present the Opt Green project. This includes questions about what hardware people have and what is holding them back from installing new software on it.

Where we are today

The concept has been implemented with an online survey on KDE's survey service. It's available in English and German and is being used at the events where the Opt Green project is present.

Figure: Opt-Green Survey

Sustainable AI

One of the big hype topics of the last two years has been Generative AI and the Large Language Models which are behind this technology. They promise to bring revolutionary new features, much closer to how humans interact in natural language, but they also come with new challenges and concerns.

One of the big questions is how this new technology affects our digital freedoms. How does it relate to Free Software? How does licensing and openness work? How does it fit KDE's values? Where does it make sense to use its technology? What are the ethical implications? What are the implications in terms of sustainability?

We had a discussion around the possible idea of adopting something like Nextcloud's Ethical AI rating in KDE as well. This would make it more transparent to users how use of AI features affects their freedoms and gives them a choice to use what they consider to be satisfactory.

Where we are today

This is still pretty much an open question. The field is moving fast, there are legal questions around copyright and other aspects still to be answered. Local models are becoming more and more an option. But what openness means in AI has become very blurry. KDE still has to find a position here.


KDE’s Annual report for the year 2023 is out

Planet KDE - Wed, 2024-09-04 20:00

Everything you wanted to know about the things we did last year is in this report: the funds we raised, how we spent them, the sprints and events we attended, the projects we took on, the milestones we hit, and much, much more.

Read it here.


Junichi Uekawa: Google docs has some tab feature.

Planet Debian - Wed, 2024-09-04 19:08
Google docs has some tab feature. I received a note that my script addTodayDate may need to be modified to handle the feature (tabs API). But then I don't think I need it yet, so I will just use the default tab.


Plasma Crash Course - DrKonqi

Planet KDE - Wed, 2024-09-04 16:18

A while ago a colleague of mine asked about our crash infrastructure in Plasma and whether I could give some overview on it. This seems very useful to others as well, I thought. Here I am, telling you all about it!

Our crash infrastructure is comprised of a number of different components.

  • KCrash: a KDE Framework performing crash interception and preparation for handover to…
  • coredumpd: a systemd component performing process core collection and handover to…
  • DrKonqi: a GUI for crashes sending data to…
  • Sentry: a web service and UI for tracing and presenting crashes for developers

We’ve already looked at KCrash and coredumpd. Now it is time to look at DrKonqi.

DrKonqi

DrKonqi is the UI that comes up when a crash happens. We’ll explore how it integrates with coredumpd and Sentry.

Crash Pickup

When I outlined the functionality of coredumpd, I mentioned that it starts an instance of systemd-coredump@.service. This not only allows the core dumping itself to be controlled by systemd’s resource control and configuration systems, but it also means other systemd units can tie into the crash handling as well.

That is precisely what we do in DrKonqi. It installs drkonqi-coredump-processor@.service which, among other things, contains the rule:

WantedBy=systemd-coredump@.service

…meaning systemd will not only start systemd-coredump@unique_identifier but also a corresponding drkonqi-coredump-processor@unique_identifier. This is similar to how services start as part of the system boot sequence: they all are “wanted by” or “want” some other service, and that is how systemd knows what to start and when (I am simplifying here 😉). Note that unique_identifier is actually a systemd feature called “instances” — one systemd unit can be instantiated multiple times this way.

drkonqi-coredump-processor

When drkonqi-coredump-processor@unique_identifier runs, it first has some synchronization to do.

As a brief recap from the coredumpd post: coredumpd’s crash collection ends with writing a journald entry that contains all collected data. DrKonqi needs this data, so we wait for it to appear in the journal.

Once the journal entry has arrived, we are good to go and will systemd-socket-activate a helper in the relevant user.

The way this works is a bit tricky: drkonqi-coredump-processor runs as root, but DrKonqi needs to be started as the user the crash happened to. To bridge this gap a new service drkonqi-coredump-launcher comes into play.

drkonqi-coredump-launcher

Every user session has a drkonqi-coredump-launcher.socket systemd unit running that provides a socket. This socket gets connected to by the processor (remember: it is root so it can talk to the user socket). When that happens, an instance of drkonqi-coredump-launcher@.service is started (as the user) and the processor starts streaming the data from journald to the launcher.

The crash has now traveled from the user, through the kernel, to system-level systemd services, and has finally arrived back in the actual user session.

Having been started by systemd and initially received the crash data from the processor, drkonqi-coredump-launcher will now augment that data with the KCrash metadata originally saved to disk by KCrash. Once the crash data is complete, the launcher only needs to find a way to “pick up” the crash. This will usually be DrKonqi, but technically other types of crash pickup are also supported. Most notably, developers can set the environment variable KDE_COREDUMP_NOTIFY=1 to receive system notifications about crashes with an easy way to open gdb for debugging. I’ve already written about this a while ago.

When ready, the launcher will start DrKonqi itself and pass over the complete metadata.

the crashed application
└── kernel
    └── systemd-coredumpd
        ├── systemd-coredumpd@unique_identifier.service
        └── drkonqi-coredump-processor@unique_identifier.service
            ├── drkonqi-coredump-launcher.socket
            └── drkonqi-coredump-launcher@unique_identifier.service
                └── drkonqi

What a journey!

Crash Processing

DrKonqi kicks off crash processing. This is hugely complicated and probably worth its own post. But let’s at least superficially explore what is going on.

The launcher has provided DrKonqi with a mountain of information so it can now utilize the CLI for systemd-coredump, called coredumpctl, to access the core dump and attach an instance of the debugger GDB to it.

GDB runs as a two-step automated process:

Preamble Step

As part of this automation, we run a service called the preamble: a Python program that interfaces with the Python API of GDB. Its most important functionality is to create a well-structured backtrace that can be converted to a Sentry payload. Sentry, for the most part, doesn’t ingest platform-specific core dumps or crash reports, but instead relies on an abstract payload format that is generated by so-called Sentry SDKs. DrKonqi essentially acts as such an SDK for us. Once the preamble is done, the payload is transferred into DrKonqi and the next step can continue.

Trace Step

After the preamble, DrKonqi executes an actual GDB trace (i.e. the literal backtrace command in gdb) to generate the developer output. This is also the trace that gets sent to KDE’s Bugzilla instance at bugs.kde.org if the user chooses to file a bug report. This trace is kept separate from the backtrace created in the preamble mostly for historical reasons. The trace is then routed through a text parser to figure out if it is of sufficient quality; only when that is the case will DrKonqi allow filing a report in Bugzilla.

Transmission

With all the trace data assembled, we just need to send it off to Bugzilla or Sentry, depending on what the user chose to do.

Bugzilla

The Bugzilla case is simply sending a very long string of the backtrace to the Bugzilla API (albeit surrounded by some JSON).

Sentry

The Sentry case, on the other hand, requires more finesse. For starters, the Sentry code also works when offline. The trace and optional user message get converted into a Sentry envelope tagged with a receiver address (a Sentry-specific ingestion URL that tells Sentry under which project to file the crash). The envelope is then written to ~/.cache/drkonqi/sentry-envelopes/. At this point, DrKonqi’s job is done; the actual transmission happens in an auxiliary service.
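To make the envelope idea more concrete, here is a small sketch of serializing an event into a newline-delimited envelope and spooling it to disk. It loosely follows the layout Sentry documents for envelopes (an envelope header line, then an item header line, then the payload), but it is not DrKonqi's implementation: the function names, the file naming scheme, and the choice of header fields are assumptions for illustration.

```python
import json
import uuid
from pathlib import Path


def build_envelope(dsn: str, event: dict) -> bytes:
    """Serialize one event as envelope header, item header, and payload lines."""
    event = dict(event)  # avoid mutating the caller's dict
    event_id = event.setdefault("event_id", uuid.uuid4().hex)
    payload = json.dumps(event).encode("utf-8")
    # The receiver address travels inside the envelope, so a later sender
    # needs nothing but the file itself (assumed field selection).
    envelope_header = {"event_id": event_id, "dsn": dsn}
    item_header = {"type": "event", "length": len(payload)}
    lines = [
        json.dumps(envelope_header).encode("utf-8"),
        json.dumps(item_header).encode("utf-8"),
        payload,
    ]
    return b"\n".join(lines) + b"\n"


def spool_envelope(spool_dir: Path, envelope: bytes) -> Path:
    """Write the envelope into the spool directory under a unique name."""
    spool_dir.mkdir(parents=True, exist_ok=True)
    path = spool_dir / f"{uuid.uuid4().hex}.envelope"  # hypothetical naming scheme
    path.write_bytes(envelope)
    return path
```

Because the envelope carries its own destination, writing it to disk is enough to decouple crash collection from network availability.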

Writing an envelope to disk triggers drkonqi-sentry-postman.service which will attempt to send all pending envelopes to Sentry using the URL inside the payload. It will try to do so every once in a while in case there are pending envelopes as well, thereby making sure crashes that were collected while offline still make it to Sentry eventually. Once sent successfully, the envelopes are archived in ~/.cache/drkonqi/sentry-sent-envelopes/.
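The retry-and-archive behaviour can be sketched in a few lines. This is a simplified illustration rather than the postman's actual code: the directory arguments, the deliver_pending name, and the send callback (which stands in for the HTTP upload) are all assumptions.

```python
import shutil
from pathlib import Path
from typing import Callable


def deliver_pending(pending_dir: Path, sent_dir: Path,
                    send: Callable[[bytes], bool]) -> list[str]:
    """Try to send every pending envelope and archive the ones that succeed."""
    sent_dir.mkdir(parents=True, exist_ok=True)
    delivered = []
    for envelope in sorted(pending_dir.glob("*")):
        if not envelope.is_file():
            continue
        try:
            ok = send(envelope.read_bytes())
        except OSError:
            ok = False  # e.g. offline: leave the file in place for the next run
        if ok:
            shutil.move(str(envelope), str(sent_dir / envelope.name))
            delivered.append(envelope.name)
    return delivered
```

Running the same function again later, for example from a timer, picks up whatever could not be delivered earlier, which is how crashes collected while offline can still arrive eventually.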

This concludes DrKonqi’s activity. There’s much more detail going on behind the scenes but it’s largely inconsequential to the overall flow. Next time we will look at the final piece in the puzzle — Sentry itself.

Categories: FLOSS Project Planets

Kanopi Studios: Default Content in Drupal

Planet Drupal - Wed, 2024-09-04 16:14

In Drupal 10.3, the DefaultContent API was added to Drupal core as part of the experimental Recipes APIs. These APIs allow Drupal to create content from files that are part of a recipe. This programmatically created content isn’t intended for deploying or migrating content; we have Workspaces and other modules for that. […]

The post Default Content in Drupal appeared first on Kanopi Studios.

Categories: FLOSS Project Planets

Members Newsletter – September 2024

Open Source Initiative - Wed, 2024-09-04 13:35
September 2024 Members Newsletter

It’s been a busy couple of months, and things are going to stay that way as we approach All Things Open in October. Version 0.0.9 of the Open Source AI Definition has been released after collecting months of community feedback.

We’re continuing our march towards a stable release by the end of October 2024, at All Things Open. Get involved by joining the discussion on the forum, finding OSI staff around the world, and online at the weekly town halls. The community continues iterating through drafts after meeting diverse stakeholders at the worldwide roadshow, collecting feedback and carefully looking for new arguments in dissenting opinions. All thanks to a grant by the Alfred P. Sloan Foundation. We also need to decide how to best address the reviews of new licenses for datasets, documentation and the agreements governing model parameters. 

The lively conversations will continue at conferences, town halls, and online. The first two stops were at AI_dev and Open Source Congress. Other events are planned to take place in Africa, South America, Europe and North America.

On a separate delightful note, the Open Source community got some welcome news on August 29, as Elastic returned to the community by adding the AGPL licensing option for Elasticsearch and Kibana. This decision is confirmation that shipping software with licenses that comply with the Open Source Definition is valuable—to the maker, to the customer, and to the user. Elastic’s choice of a strong copyleft license signals the continuing importance of that license and its dual effect: one, it’s designed to preserve the user’s freedoms downstream, and two, it also grants strong control over the project by the single-vendor developers.

We’re encouraged to see Elastic return to the Open Source community. And who knows… maybe others will follow suit!

Stefano Maffulli

Executive Director, OSI 

I hold weekly office hours on Fridays with OSI members: book time if you want to chat about OSI’s activities, if you want to volunteer or have suggestions.

News from the OSI

Community input drives the new draft of the Open Source AI Definition

From the Research and Advocacy program

The Open Source AI Definition v0.0.9 has been released and collaboration continues at in-person events and in the online forums. Read what changes have been made, what to do next and how to get involved. Read more.

Three things I learned at KubeCon + AI_Dev China 2024

From the Research and Advocacy program

KubeCon China 2024 was a whirlwind of innovation, community and technical deep dives. As it often happens at these community events, I was blown away by the energy, enthusiasm and sheer amount of knowledge being shared. Read more.

Highlights from our participation at Open Source Congress

From the Research and Advocacy program

The Open Source Initiative (OSI) proudly participated in the Open Source Congress 2024, held from August 25-27 in Beijing, China. This event was a pivotal gathering for key individuals in the Open Source nonprofit community, aiming to foster collaboration, innovation, and strategic development within the ecosystem. Read more.

OSI in the news

Elasticsearch is open source, again

OSI at elastic.co

“Being able to call Elasticsearch and Kibana Open Source again is pure joy.” — Shay Banon, Elastic Founder and CTO. Read more.

Meta is accused of bullying the open source community

OSI at The Economist

Purists are pushing back against Meta’s efforts to set its own standard on the definition of open-source AI. Stefano Maffulli, head of the OSI, says Mr Zuckerberg “is really bullying the industry to follow his lead”. Read more.

Debate over “open source AI” term brings new push to formalize definition

OSI at Ars Technica

The Open Source Initiative (OSI) recently unveiled its latest draft definition for “open source AI,” aiming to clarify the ambiguous use of the term in the fast-moving field. The move comes as some companies like Meta release trained AI language model weights and code with usage restrictions while using the “open source” label. This has sparked intense debates among free-software advocates about what truly constitutes “open source” in the context of AI. Read more.

Other Highlights

Other news

News from OSI affiliates

News from OpenSource.net

Voices of the Open Source AI Definition

The Open Source Initiative (OSI) is running a series of stories about a few of the people involved in the Open Source AI Definition (OSAID) co-design process.

2024 Generative AI Survey

This survey aims to understand the deployment, use, and challenges of generative AI technologies in organizations and the role of open source in this domain. Take survey here.

Events

Upcoming events

CFPs

Thanks to our sponsors

New members and renewals
  • Mercado Libre

Interested in sponsoring, or partnering with, the OSI? Please see our Sponsorship Prospectus and our Annual Report. We also have a dedicated prospectus for the Deep Dive: Defining Open Source AI. Please contact the OSI to find out more about how your company can promote open source development, communities and software.

Support OSI by becoming a member!

Let’s build a world where knowledge is freely shared, ideas are nurtured, and innovation knows no bounds! 

Join the Open Source Initiative!

Categories: FLOSS Research

Highlights from our participation at Open Source Congress 2024

Open Source Initiative - Wed, 2024-09-04 13:30

The Open Source Initiative (OSI) proudly participated in the Open Source Congress 2024, held from August 25-27 in Beijing, China. This event was a gathering for key individuals in the Open Source nonprofit community, aiming to foster collaboration, innovation, and strategic development within the ecosystem. Here are some highlights from OSI’s participation at the event.

Panel: Collaboration between Open Source Organizations

Stefano Maffulli, OSI’s Executive Director, played an important role in the panel on “Collaboration between Open Source Organizations.” This session, moderated by Daniel Goldscheider (Executive Director, OpenWallet Foundation) and Chris Xie (Board Advisor, Linux Foundation Research), brought together influential leaders, including Keith Bergelt (CEO, Open Invention Network), Bryan Che (Advisory Board Member, Software Heritage Foundation), Mike Milinkovich (Executive Director, Eclipse Foundation), Rebecca Rumbul (Executive Director, Rust Foundation), Xiaohua Xin (Deputy Secretary-General, OpenAtom Foundation), and Jim Zemlin (Executive Director, Linux Foundation). The panel discussed the importance of collaboration in addressing the challenges faced by the Open Source ecosystem and explored ways to strengthen inter-organizational ties.

Fireside Chat: Datasets, Privacy, and Copyright

Stefano Maffulli also led a fireside chat on “Datasets, Privacy, and Copyright” in the context of Open Source AI along with Donnie Dong (Steering Committee Member, Digital Asia Hub; Senior Partner, Hylands Law Firm). This session was particularly relevant given the growing concerns around AI and the legal implications of creating and distributing large datasets. The discussion provided valuable insights into how these issues intersect with Open Source principles and what steps the community can take to address them responsibly. Some questions addressed included the use of copyrighted material in training datasets; fair use in the context of AI training and content generation; and China’s AI regulatory framework.

Talk: The Open Source AI Definition

OSI’s involvement was further highlighted by Stefano Maffulli’s talk on “The Open Source AI Definition,” where he announced version 0.0.9 of the Open Source AI Definition (OSAID), a significant milestone resulting from a multi-year, global, and multi-stakeholder process. This version reflects the collective input of a diverse range of experts and community members who participated in extensive co-design workshops and public consultations, ensuring that the definition is robust, inclusive, and aligned with the principles of openness. Maffulli emphasized the importance of the “4 Freedoms of Open Source AI”—Use, Study, Modify, and Share—as foundational principles guiding the development of AI technologies. The session was particularly crucial for gathering feedback from the community in China, providing a platform for discussing the practical implications of the OSAID in different cultural and regulatory contexts.

Panel: The Future of Open Source Congress

Deborah Bryant, OSI’s US Policy Director, moderated a pivotal panel discussion on “The Future of Open Source Congress: Converting Ideas to Shared Action.” This session focused on how the community can transform discussions into actionable strategies, ensuring the continued growth and impact of Open Source globally.

Other highlights from the event

The “Unlocking Innovation: Open Strategies in Generative AI” panel led by Anni Lai (Chair of Generative AI Commons; Board member of LF AI & Data; Head of Open Source Operations, Futurewei) explored how openness is essential for advancing Generative AI innovation, democratizing access, and ensuring ethical AI practices. Panelists Richard Sikang Bian (Outreach Chair, LF AI & Data; Head of OSPO, Ant Group), Richard Lin (Member, OpenDigger Community; Head of Open Source, 01.ai), Ted Liu (Co-founder, KAIYUANSHE), and Zhenhua Sun (China Workgroup Chair, OpenChain; Open Source Legal Counsel, ByteDance) delved into the challenges of the Open Source generative AI landscape, such as “open washing,” inconsistent definitions, and the complexities of licensing. They highlighted the need for clear, standardized frameworks to define what truly constitutes Open Source AI, emphasizing that openness fosters transparency, accelerates learning, and mitigates biases. The panelists called for increased collaboration among stakeholders to address these challenges and further develop Open Source AI standards, ensuring that AI technologies are transparent, ethical, and widely adoptable.

In her closing keynote at the Open Source AI track, Amreen Taneja, Standards Lead at the Digital Public Goods Alliance (DPGA), emphasized the critical role of Open Source AI in advancing public good and supporting the Sustainable Development Goals (SDGs). She explained that Digital Public Goods (DPGs) are digital technologies made freely available to benefit society and highlighted the importance of OSAI in democratizing access to powerful AI technologies. Taneja outlined the DPGA’s efforts to align AI with public interests, including updating the DPG Standard to better accommodate AI, ensuring transparency in AI development, and promoting responsible AI practices that prioritize privacy and avoid harm. She stressed the need for rigorous evaluation, clear ownership, open licensing, and platform independence to drive the adoption of AI DPGs, ultimately aiming to create AI systems that are ethical, transparent, and beneficial for all.

Quotes from OSI Board and affiliates

Attending the Open Source Congress was really inspiring. Over two days, we participated in intensive discussions and exchanges with dozens of Open Source foundations and organizations worldwide, which was incredibly beneficial. I believe this will foster broader cross-community collaboration globally. I hope the conclusion of the second Open Source Congress marks the beginning of ongoing cooperation, allowing our “community of communities” to maintain regular communication and exchange. 

Nadia Jiang, Board Chair of KAIYUANSHE

Open Source development experience is all about two words: consensus and antifragile decision-making process. The most valuable part of this event is seeing and listening to all the executive directors, open-source leaders in the room, and being very comfortable with the information density and the constructiveness of the discussions. Towards the end of the day, what people care about are not fundamentally different and there are indeed really difficult questions to resolve. I feel the world becomes slightly better after this OSC, and that means a lot to have an event like this.

Richard Bian, Head of Ant Group OSPO; Outreach Chair, Linux Foundation AI & Data

Open Source is the cornerstone of innovation, transparency, and collaboration, driving solutions that benefit everyone. The Open Source Congress 2024 represented a significant step forward in fostering alignment and building consensus within the open source community. By bringing together diverse voices and ideas, it amplified our collective efforts to create a more open, inclusive, and impactful digital ecosystem for the future.

Amreen Taneja, Standards Lead, Digital Public Goods Alliance

Stefano Maffulli with Board Directors of KAIYUANSHE: Emily Chen, Nadia Jiang (photo credits), and Ted Liu.

Conclusion

OSI’s active participation in the Open Source Congress 2024 reinforced its leadership role in the global Open Source community. By engaging in critical discussions, leading panels, and contributing to the future direction of Open Source initiatives, OSI continues to shape the landscape of Open Source development, ensuring that it remains inclusive, innovative, and aligned with the values of the global community.

This event marked another successful chapter in OSI’s ongoing efforts to drive collaboration and innovation in the Open Source world. We extend our sincere thanks to the organizers of OSC and the Open Source community in China for creating a platform that brought together a diverse and dynamic group of stakeholders, enabling meaningful discussions and progress. We look forward to continuing these conversations and turning ideas into action in the years to come.

Categories: FLOSS Research

Electric Citizen: Get Ready for Twin Cities Drupal Camp

Planet Drupal - Wed, 2024-09-04 13:09

This September, Minneapolis will once again host this annual gathering of the Drupal community.

If you are a Drupal developer, web designer, content strategist, site editor or anything connected to open-source and Drupal, you should consider attending!

 

Categories: FLOSS Project Planets

Dominique De Cooman: Dreaming about Drupal its long term potential

Planet Drupal - Wed, 2024-09-04 12:02
A post about dreaming about what Drupal could become and the long term potential of Drupal.
Categories: FLOSS Project Planets

Stefanie Molin: How to Create a Pre-Commit Hook

Planet Python - Wed, 2024-09-04 10:55
Pre-commit hooks are a great way to help maintain code quality. However, some of your code quality standards may be specific to your project, and therefore, not covered by existing code linting and formatting tools. In this article, I will show you how to incorporate custom checks into your `pre-commit` setup.
Categories: FLOSS Project Planets

Real Python: Lists vs Tuples in Python

Planet Python - Wed, 2024-09-04 10:00

In Python, lists and tuples are versatile and useful data types that allow you to store data in a sequence. You’ll find them in virtually every nontrivial Python program. Learning about them is a core skill for you as a Python developer.

In this tutorial, you’ll:

  • Get to know lists and tuples
  • Explore the core characteristics of lists and tuples
  • Learn how to define and manipulate lists and tuples
  • Decide when to use lists or tuples in your code

To get the most out of this tutorial, you should know the basics of Python programming, including how to define variables.

Get Your Code: Click here to download the free sample code that shows you how to work with lists and tuples in Python.

Take the Quiz: Test your knowledge with our interactive “Lists vs Tuples in Python” quiz. You’ll receive a score upon completion to help you track your learning progress:

Interactive Quiz

Lists vs Tuples in Python

Challenge yourself with this quiz to evaluate and deepen your understanding of Python lists and tuples. You'll explore key concepts, such as how to create, access, and manipulate these data types, while also learning best practices for using them efficiently in your code.

Getting Started With Python Lists and Tuples

In Python, a list is a collection of arbitrary objects, somewhat akin to an array in many other programming languages but more flexible. To define a list, you typically enclose a comma-separated sequence of objects in square brackets ([]), as shown below:

>>> colors = ["red", "green", "blue", "yellow"]
>>> colors
['red', 'green', 'blue', 'yellow']

In this code snippet, you define a list of colors using string objects separated by commas and enclose them in square brackets.

Similarly, tuples are also collections of arbitrary objects. To define a tuple, you’ll enclose a comma-separated sequence of objects in parentheses (()), as shown below:

>>> person = ("Jane Doe", 25, "Python Developer", "Canada")
>>> person
('Jane Doe', 25, 'Python Developer', 'Canada')

In this example, you define a tuple with data for a given person, including their name, age, job, and base country.

Up to this point, it may seem that lists and tuples are mostly the same. However, there’s an important difference:

Feature | List | Tuple
Is an ordered sequence | ✅ | ✅
Can contain arbitrary objects | ✅ | ✅
Can be indexed and sliced | ✅ | ✅
Can be nested | ✅ | ✅
Is mutable | ✅ | ❌

Both lists and tuples are sequence data types, which means they can contain objects arranged in order. You can access those objects using an integer index that represents their position in the sequence.

Even though both data types can contain arbitrary and heterogeneous objects, you’ll commonly use lists to store homogeneous objects and tuples to store heterogeneous objects.

Note: In this tutorial, you’ll see the terms homogeneous and heterogeneous used to express the following ideas:

  • Homogeneous: Objects of the same data type or the same semantic meaning, like a series of animals, fruits, colors, and so on.
  • Heterogeneous: Objects of different data types or different semantic meanings, like the attributes of a car: model, color, make, year, fuel type, and so on.

You can perform indexing and slicing operations on both lists and tuples. You can also have nested lists and nested tuples or a combination of them, like a list of tuples.
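As a quick illustration of these operations, the snippet below indexes and slices both a list and a tuple and then reaches into a list of tuples. The colors and person values come from the examples above; the points list is made up for the demonstration.

```python
colors = ["red", "green", "blue", "yellow"]
person = ("Jane Doe", 25, "Python Developer", "Canada")

# Indexing works the same way on both types.
first_color = colors[0]
last_field = person[-1]

# Slicing a list returns a list; slicing a tuple returns a tuple.
middle_colors = colors[1:3]
name_and_age = person[:2]

# Nesting: a list of tuples, indexed one level at a time.
points = [(0, 0), (1, 2), (3, 5)]
x_of_second_point = points[1][0]
```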

The most notable difference between lists and tuples is that lists are mutable, while tuples are immutable. This feature distinguishes them and drives their specific use cases.

Essentially, a list doesn’t have a fixed length since it’s mutable. Therefore, it’s natural to use homogeneous elements to have some structure in the list. A tuple, on the other hand, has a fixed length so the position of elements can have meaning, supporting heterogeneous data.
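The difference in mutability is easy to see in practice. In this short sketch, the list is changed in place, while the attempt to change the tuple raises a TypeError; the variable names are chosen for illustration.

```python
colors = ["red", "green", "blue"]
colors.append("yellow")  # lists can grow
colors[0] = "crimson"    # and their items can be replaced in place

person = ("Jane Doe", 25, "Python Developer")
tuple_error = None
try:
    person[1] = 26  # tuples reject item assignment
except TypeError as error:
    tuple_error = str(error)

# Because a tuple's length is fixed, positions carry meaning
# and can be unpacked into named variables.
name, age, job = person
```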

Creating Lists in Python

In many situations, you’ll define a list object using a literal. A list literal is a comma-separated sequence of objects enclosed in square brackets:

>>> countries = ["United States", "Canada", "Poland", "Germany", "Austria"]
>>> countries
['United States', 'Canada', 'Poland', 'Germany', 'Austria']

In this example, you create a list of countries represented by string objects. Because lists are ordered sequences, the values retain the insertion order.

Read the full article at https://realpython.com/python-lists-tuples/ »

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

Tellico 4.0 Released

Planet KDE - Wed, 2024-09-04 09:48

I’m excited to make Tellico 4.0 available as the first version to leverage the new Qt6 and KDE Frameworks 6 libraries. Tellico 4.0 also continues to build with Qt5/KF5 for those who haven’t yet transitioned to the newer versions.

Especially since this release has many updates and changes in the underlying library code, please back up your data before switching to the new version. Creating a full backup file can be done by using the Export to Zip option, which will create a file with all your images together with the main collection.

Please let me know of any compilation issues or bugs, particularly since I haven’t tested this on a wide range of Qt6/KF6 releases. The KDE builds are all working, which certainly helps my confidence, but one never knows.

Improvements and Fixes
  • Building with Qt6 is enabled by default, falling back to Qt5 for older versions of ECM or when the BUILD_WITH_QT6=off flag is used.
  • Book and video collections can be imported from file metadata (Bug 214606).
  • All entry templates were updated to include any loan information (Bug 411903).
  • Creating and viewing the internal log file is supported through the --log and --logfile command-line options (Bug 426624).
  • The DBUS interface can output to stdout using -- as the file name.
  • Choice fields are now allowed to have multiple values (Bug 483831).
  • The iTunes, Discogs, and MusicBrainz sources now separate multi-disc albums (Bug 479503).
  • A configurable locale was added to the IMDb data source.
  • The Allocine and AnimeNFO data sources were removed.
Categories: FLOSS Project Planets

The Drop Times: Drupal GovCon 2024: LaunchDarkly and Drupal: A Solid Combo For A/B Testing

Planet Drupal - Wed, 2024-09-04 09:28
Michael Kinnunen, Backend Engineer at CivicActions, recently shared his insights on A/B testing at Drupal GovCon, focusing on its application within Drupal using LaunchDarkly. Writing for The DropTimes, he reflects on the event's rich learning experience, highlighting how sessions on content translation, large-scale content management, and Drupal’s new Recipes feature deepened his appreciation for Drupal’s growing impact on government websites. The conference left him inspired and eager for the future of Drupal and its community.
Categories: FLOSS Project Planets

Reproducible Builds: Reproducible Builds in August 2024

Planet Debian - Wed, 2024-09-04 09:27

Welcome to the August 2024 report from the Reproducible Builds project!

Our reports attempt to outline what we’ve been up to over the past month, highlighting news items from elsewhere in tech where they are related. As ever, if you are interested in contributing to the project, please visit our Contribute page on our website.

Table of contents:

  1. LWN: The history, status, and plans for reproducible builds
  2. Intermediate Autotools build artifacts removed from PostgreSQL distribution tarballs
  3. Distribution news
  4. Mailing list news
  5. diffoscope
  6. Website updates
  7. Upstream patches
  8. Reproducibility testing framework
LWN: The history, status, and plans for reproducible builds

The free software newspaper of record, Linux Weekly News, published an in-depth article based on Holger Levsen’s talk, Reproducible Builds: The First Eleven Years which was presented at the recent DebConf24 conference in Busan, South Korea.

Titled The history, status, and plans for reproducible builds and written by Jake Edge, LWN’s article not only summarises Holger’s talk and clarifies its message, but also links to external information. Holger’s original talk can be watched on the DebConf24 webpage (a direct .webm link and his HTML slides are also available). There are a significant number of comments on LWN’s page as well.

Holger Levsen also headed a scheduled discussion session at DebConf24 on Preserving *other* build artifacts, addressing the case where a number of Debian packages produce (or would like to produce) results that are neither the .deb files, the build logs nor the logs of CI tests. This is an issue for reproducible builds because this “4th type” of build artifact is typically shipped within the binary .deb packages and is invariably non-deterministic, thus making the .deb files unreproducible. (A direct .webm link and HTML slides are available.)


Intermediate Autotools build artifacts removed from PostgreSQL distribution tarballs

Peter Eisentraut wrote a detailed blog post on the subject of “The new PostgreSQL 17 make dist”. Like many projects, the PostgreSQL database has previously pre-built parts of its GNU Autotools build system: “the reason for this is a mix of convenience and traditional practice”. Peter astutely notes that this arrangement in the build system is “quite tricky” as:

You need to carefully maintain the different states of “clean source code”, “partially built source code”, and “fully built source code”, and the commands to transition between them.

However, Peter goes on to mention that:

… a lot more attention is nowadays paid to the software supply chain. There are security and legal reasons for this. When users install software, they want to know where it came from, and they want to be sure that they got the right thing, not some fake version or some version of dubious legal provenance.

Peter cites the XZ Utils backdoor as a reason to care about transparent and reproducible ways of distributing and communicating a source tarball and its provenance. Because of this, intermediate build artifacts are now essentially disallowed from PostgreSQL distribution tarballs.

Distribution news

In Debian this month, 30 reviews of Debian packages were added, 17 were updated and 10 were removed, adding to our knowledge about identified issues. One issue type was added by Chris Lamb, too.

In addition, an issue was filed to update the Salsa CI pipeline (used by 1,000s of Debian packages) to no longer test for reproducibility with reprotest’s build_path variation. Holger Levsen provided a rationale for this change in the issue, which has already been made to the tests being performed by tests.reproducible-builds.org.


In Arch Linux this month, Jelle van der Waa published a short blog post on the topic of Investigating creating reproducible images with mkosi, motivated by the desire to make it possible for anyone to “re-recreate the official Arch cloud image bit-by-bit identical on their own machine as per [the] reproducible builds definition.” In addition, Jelle filed a patch for pacman, the Arch Linux package manager, to respect the SOURCE_DATE_EPOCH environment variable when installing a package.
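For readers unfamiliar with SOURCE_DATE_EPOCH, the convention is simple: when the variable is set, tools use its value (seconds since the Unix epoch, in UTC) instead of the current time for any timestamp they embed. The sketch below shows that pattern in Python; it is not the pacman patch, and the function names are made up for illustration.

```python
import os
import time


def build_timestamp(environ=None) -> int:
    """Return SOURCE_DATE_EPOCH if it is set, otherwise the current time."""
    env = os.environ if environ is None else environ
    value = env.get("SOURCE_DATE_EPOCH")
    if value is not None:
        return int(value)  # fixed value, so repeated builds agree
    return int(time.time())


def format_build_date(timestamp: int) -> str:
    """Format in UTC so the result does not depend on the local time zone."""
    return time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(timestamp))
```

Passing a dictionary as environ makes the behaviour easy to check without touching the real environment, for example build_timestamp({"SOURCE_DATE_EPOCH": "0"}).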


In openSUSE news, Bernhard M. Wiedemann published another report for that distribution.


In Android news, the IzzyOnDroid project added 49 new rebuilder recipes and now features 256 total reproducible applications representing 21% of the total offerings in the repository. IzzyOnDroid is “an F-Droid style repository for Android apps[:] applications in this repository are official binaries built by the original application developers, taken from their resp. repositories (mostly GitHub).”


Mailing list news

From our mailing list this month:

  • Bernhard M. Wiedemann posted a brief message to the list with some helpful information regarding nondeterminism within Rust binaries, positing the codegen-units = 16 default as a cause, which resulted in a bug being filed in the Rust issue tracker.

  • Bernhard also wrote to the list, following up to a thread in November 2023, on attempts to make the LibreOffice suite of office applications build reproducibly. In the thread from this month, Bernhard could announce that the four patches previously mentioned have landed in LibreOffice upstream.

  • Fay Stegerman linked the mailing list to a thread she made on the Signal issue tracker regarding whether “device-specific binaries [can] ever be considered meaningfully reproducible”. In particular: “the whole part about ‘allow[ing] multiple third parties to come to a consensus on a “correct” result’ breaks down completely when ‘correct’ is device-specific and not something everyone can agree on.”

  • Developer kpcyrd posted an update for the source code indexing project whatsrc.org, announcing that it is now importing packages from live-bootstrap (“a usable Linux system [that is] created with only human-auditable, and wherever possible, human-written, source code”) into its database of provenance data.

  • Lastly, Mechtilde Stehmann posted an update to an earlier thread about how Java builds are not reproducible on the armhf architecture, enquiring how they might gain temporary access to such a machine in order to perform some deeper testing.


diffoscope

diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb released versions 274, 275, 276 and 277, uploaded these to Debian, and made the following changes as well:

  • New features:

    • Strip ANSI escapes (usually colour codes) from the output of the Procyon Java decompiler.
    • Factor out a method for stripping ANSI escapes.
    • Append output from dumppdf(1) in more cases, avoiding situations where we fall back to a binary diff.
    • Add support for version 2.212 of Perl’s IO::Compress::Zip.
  • Bug fixes:

    • Also catch RuntimeError exceptions when importing the PyPDF library so that it, or, crucially, its transitive dependencies, cannot cause diffoscope to traceback at runtime and build time.
    • Do not call marshal.load(…) of precompiled Python bytecode as it is, alas, inherently unsafe. Replace for now with a brief summary of the code section of .pyc.
    • Don’t include excessive debug output when calling dumppdf(1).
  • Testsuite-related changes:

    • Don’t bother to check version number in test_python.py: the fixture for this test is fixed.
    • Update test_zip text fixtures and definitions to support new changes to the Perl IO::Compress library.
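
To illustrate the ANSI-escape stripping mentioned in the feature list above, here is a minimal sketch of a standalone helper. It is not diffoscope's actual code: the names ANSI_CSI_RE and strip_ansi are hypothetical, and the pattern only covers CSI sequences (the family that colour codes belong to), not every possible terminal escape.

```python
import re

# CSI sequences: ESC "[", then parameter bytes (0x30-0x3F), then
# intermediate bytes (0x20-0x2F), then one final byte (0x40-0x7E).
# Colour codes such as "\x1b[31m" and the reset "\x1b[0m" have this form.
ANSI_CSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text):
    """Return text with ANSI CSI escape sequences removed."""
    return ANSI_CSI_RE.sub("", text)
```

Keeping this in one helper means any tool whose output is coloured can be passed through the same function before comparison, so that colour codes do not show up as spurious differences.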

In addition, Mattia Rizzolo updated the available architectures for a number of test dependencies, Sergei Trofimovich fixed an issue to avoid diffoscope crashing when hashing directory symlinks, and Vagrant Cascadian proposed GNU Guix updates for diffoscope versions 275, 276 and 277.
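
On the marshal.load(…) bug fix listed above: the marshal format is not designed to be safe against malicious input, so one alternative is to describe a .pyc file without deserialising it. The sketch below is an assumption about what such a summary could look like, not a copy of diffoscope's output. It assumes the 16-byte header layout introduced by PEP 552 (Python 3.7 and later), and the function name summarise_pyc and the returned keys are hypothetical.

```python
import hashlib
import struct

HEADER_SIZE = 16  # PEP 552 layout: magic (4 bytes), flags (4), 8 more bytes


def summarise_pyc(data):
    """Summarise raw .pyc bytes without unmarshalling the code object."""
    if len(data) < HEADER_SIZE:
        raise ValueError("too short to be a PEP 552 .pyc file")
    # The magic is a little-endian 16-bit number followed by b"\r\n".
    magic, terminator, flags = struct.unpack("<H2sI", data[:8])
    if terminator != b"\r\n":
        raise ValueError("missing magic-number terminator")
    body = data[HEADER_SIZE:]  # the marshalled code section, left opaque
    return {
        "magic": magic,
        "hash_based": bool(flags & 0x1),  # bit 0 set means a hash-based pyc
        "code_size": len(body),
        "code_sha256": hashlib.sha256(body).hexdigest(),
    }
```

Two files whose summaries differ only in code_sha256 are known to differ in their bytecode, even though the sketch cannot say where, and that is the trade-off for never executing the unmarshalling step on untrusted input.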


Website updates

There were a rather substantial number of improvements made to our website this month, including:


Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:


Reproducibility testing framework

The Reproducible Builds project operates a comprehensive testing framework running primarily at tests.reproducible-builds.org in order to check packages and other artifacts for reproducibility. In August, a number of changes were made by Holger Levsen, including:

  • Temporarily install the openssl-provider-legacy package for the Debian unstable environments for running diffoscope due to Debian bug #1078944.
  • Mark Debian armhf architecture nodes as being down because the proxy is down.
  • Detect proxy failures.
  • Run the index-buildinfo for the builtin-pho script with the -q switch.
  • Disable all Arch Linux reproducible jobs.

In addition, Mattia Rizzolo updated the website configuration to install the ruby-jekyll-sitemap package as it is now used in the website, Roland Clobus updated the script to build Debian ‘live’ images to treat openQA issues as warnings, and Vagrant Cascadian marked the cbxi4b node as down.


If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

Categories: FLOSS Project Planets
