Programiz: Python Program to Count the Number of Occurrence of a Character in String

Planet Python - Thu, 2024-03-14 07:21
In this example, you will learn to count the number of occurrences of a character in a string.
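The article's code isn't reproduced in this digest, but the idea can be sketched in a few lines (the built-in str.count is the idiomatic route; the explicit loop just shows what it does):

```python
def count_char(s, ch):
    """Count how many times ch occurs in s (same result as s.count(ch))."""
    return sum(1 for c in s if c == ch)

print(count_char("programiz", "r"))  # 2
```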
Categories: FLOSS Project Planets

Programiz: Python Program to Check If Two Strings are Anagram

Planet Python - Thu, 2024-03-14 07:19
In this example, you will learn to check if two strings are anagram.
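Again, a minimal sketch of the idea rather than the article's exact code: two strings are anagrams when sorting their characters yields the same sequence.

```python
def is_anagram(a, b):
    """Check whether a and b are anagrams, ignoring case."""
    return sorted(a.lower()) == sorted(b.lower())

print(is_anagram("Listen", "Silent"))  # True
```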
Categories: FLOSS Project Planets

Programiz: Python Program to Count the Number of Digits Present In a Number

Planet Python - Thu, 2024-03-14 07:19
In this example, you will learn to count the number of digits present in a number.
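A quick sketch of the usual approach (not necessarily the article's exact code): repeatedly divide by 10, or simply take len(str(abs(n))).

```python
def count_digits(n):
    """Count the digits of an integer by repeated integer division."""
    n = abs(n)
    digits = 1
    while n >= 10:
        n //= 10
        digits += 1
    return digits

print(count_digits(3452))  # 4
```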
Categories: FLOSS Project Planets

Matthew Garrett: Digital forgeries are hard

Planet Debian - Thu, 2024-03-14 05:11
Closing arguments in the trial between various people and Craig Wright over whether he's Satoshi Nakamoto are wrapping up today, amongst a bewildering array of presented evidence. But one utterly astonishing aspect of this lawsuit is that expert witnesses for both sides agreed that much of the digital evidence provided by Craig Wright was unreliable in one way or another, generally including indications that it wasn't produced at the point in time it claimed to be. And it's fascinating reading through the subtle (and, in some cases, not so subtle) ways that that's revealed.

One of the pieces of evidence entered is screenshots of data from Mind Your Own Business, a business management product that's been around for some time. Craig Wright relied on screenshots of various entries from this product to support his claims around having controlled a meaningful number of bitcoin before he was publicly linked to being Satoshi. If these were authentic then they'd be strong evidence linking him to the mining of coins before Bitcoin's public availability. Unfortunately the screenshots themselves weren't contemporary - the metadata shows them being created in 2020. This wouldn't fundamentally be a problem (it's entirely reasonable to create new screenshots of old material), as long as it's possible to establish that the material shown in the screenshots was created at that point. Sadly, well.

One part of the disclosed information was an email that contained a zip file that contained a raw database in the format used by MYOB. Importing that into the tool allowed an audit record to be extracted - this record showed that the relevant entries had been added to the database in 2020, shortly before the screenshots were created. This was, obviously, not strong evidence that Craig had held Bitcoin in 2009. This evidence was reported, and was responded to with a couple of additional databases that had an audit trail that was consistent with the dates in the records in question. Well, partially. The audit record included session data, showing an administrator logging into the database in 2011 and then, uh, logging out in 2023, which is rather more consistent with someone changing their system clock to 2011 to create an entry, and switching it back to present day before logging out. In addition, the audit log included fields that didn't exist in versions of the product released before 2016, strongly suggesting that the entries dated 2009-2011 were created in software released after 2016. And even worse, the order of insertions into the database didn't line up with calendar time - an entry dated before another entry may appear in the database afterwards, indicating that it was created later. But even more obvious? The database schema used for these old entries corresponded to a version of the software released in 2023.

This is all consistent with the idea that these records were created after the fact and backdated to 2009-2011, and that after this evidence was made available further evidence was created and backdated to obfuscate that. In an unusual turn of events, during the trial Craig Wright introduced further evidence in the form of a chain of emails to his former lawyers that indicated he had provided them with login details to his MYOB instance in 2019 - before the metadata associated with the screenshots. The implication isn't entirely clear, but it suggests that either they had an opportunity to examine this data before the metadata suggests it was created, or that they faked the data? So, well, the obvious thing happened, and his former lawyers were asked whether they received these emails. The chain consisted of three emails, two of which they confirmed they'd received. And they received a third email in the chain, but it was different to the one entered in evidence. And, uh, weirdly, they'd received a copy of the email that was submitted - but they'd received it a few days earlier. In 2024.

And again, the forensic evidence is helpful here! It turns out that the email client used associates a timestamp with any attachments, which in this case included an image in the email footer - and the mysterious time travelling email had a timestamp in 2024, not 2019. This was created by the client, so was consistent with the email having been sent in 2024, not being sent in 2019 and somehow getting stuck somewhere before delivery. The date header indicates 2019, as do encoded timestamps in the MIME headers - consistent with the mail being sent by a computer with the clock set to 2019.
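As a rough illustration of one such check (not the examiner's actual tooling, and using a made-up minimal message), the claimed Date header is trivially readable with Python's standard library, and can then be compared against timestamps the sender's machine could not easily control, such as provider-inserted headers or client-generated attachment metadata:

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Invented minimal message for illustration; a real analysis would work on
# the full raw email with all transport headers intact.
raw = """\
Date: Mon, 15 Jul 2019 10:00:00 +0000
From: sender@example.com
To: recipient@example.com
Subject: test

Hello.
"""

msg = message_from_string(raw)
claimed = parsedate_to_datetime(msg["Date"])
# This is merely what the sending machine's clock asserted at send time.
print(claimed.year)  # 2019
```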

But there's a very weird difference between the copy of the email that was submitted in evidence and the copy that was located afterwards! The copy submitted in evidence included a header inserted by gmail containing a 2019 timestamp, while the later-located copy had a 2024 timestamp. Is there a way to determine which of these could be the truth? It turns out there is! The format of that header changed in 2022, and the version in the email is the new version. The version with the 2019 timestamp is anachronistic - the format simply doesn't match the header that gmail would have introduced in 2019, suggesting that an email sent in 2022 or later was modified to include a timestamp of 2019.

This is by no means the only indication that Craig Wright's evidence may be misleading (there's the whole argument that the Bitcoin white paper was written in LaTeX when general consensus is that it's written in OpenOffice, given that's what the metadata claims), but it's a lovely example of a more general issue.

Our technology chains are complicated. So many moving parts end up influencing the content of the data we generate, and those parts develop over time. It's fantastically difficult to generate an artifact now that precisely corresponds to how it would look in the past, even if we go to the effort of installing an old OS on an old PC and setting the clock appropriately (are you sure you're going to be able to mimic an entirely period appropriate patch level?). Even the version of the font you use in a document may indicate it's anachronistic. I'm pretty good at computers and I no longer have any belief I could fake an old document.

(References: this Dropbox, under "Expert reports", "Patrick Madden". Initial MYOB data is in "Appendix PM7", further analysis is in "Appendix PM42", email analysis is "Sixth Expert Report of Mr Patrick Madden")

Categories: FLOSS Project Planets

Streamlining Multi-platform Development and Testing

Planet KDE - Thu, 2024-03-14 04:00

In today’s pervasively digital landscape, building software for a single platform is a 1990s approach. Modern applications, even those designed for specific embedded targets, must be adaptable enough to run seamlessly across various platforms without sacrificing efficiency or reliability.

This is often easier said than done. Here are some key points to consider when developing and testing multi-platform embedded software.

Emulation and virtual machines

When developing software, especially in the initial stages, testing and debugging often don't happen on the final hardware but on development machines. That, combined with a frequent lack of target hardware, means it's a good idea to produce a build that can run within a virtual machine or container. Dedicating time and effort to developing custom hardware emulation layers and specialized build images pays off by enabling anyone on the test or development team to run a virtualized version of the final product.

Multi-board variants

Many product lines offer multiple hardware variants, with differing screen sizes and capabilities. Depending on the severity of the differences, these variants might require dedicated builds, potentially extending the time and resources devoted to your project. To avoid proliferating build configurations, consider enabling the software to auto-adapt to its hardware environment, provided it can be done reliably and without too much effort.
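As a sketch of what such auto-adaptation can look like on embedded Linux (the board names, profile values, and fallback behavior here are invented for illustration), an application might key its configuration off the device-tree model string, falling back to a default when running on a development machine or in an emulator:

```python
from pathlib import Path

# Hypothetical board profiles -- names and values are illustrative only.
PROFILES = {
    "vendor,panel-7": {"resolution": (800, 480), "animations": False},
    "vendor,panel-10": {"resolution": (1280, 800), "animations": True},
}

def detect_profile(model_path="/proc/device-tree/model",
                   default="vendor,panel-7"):
    """Pick a profile from the device-tree model string, if present."""
    try:
        # Device-tree model strings are NUL-terminated on real hardware.
        model = Path(model_path).read_text().rstrip("\x00\n")
    except OSError:
        model = default  # desktop build or emulator: no device tree available
    return PROFILES.get(model, PROFILES[default])
```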

Mobile companion apps

Does your embedded product need to interface with a companion mobile app? These apps often handle remote configuration, reporting, and user profiles, enhancing product functionality and user experience. If so, consider using a cross-platform toolkit or framework to build your software. These allow you to share your business logic and UI components between iOS, Android, and your embedded platform. You can choose to reuse UI components in a stripped-down version of your application written specifically for mobile, or write the application once and have it adjust its behavior depending on the screen size and other hardware differences.

Strategies for multi-platform development

The key to successful multi-platform development is striking a balance between efficiency and coverage. Here are some strategies to consider.

Cross-compilation decisions

When dealing with multiple platforms, decide if it’s necessary to cross-compile for every platform with each commit. While this ensures up-to-date software for all variants, it can significantly extend the length of the build cycle. Consider reserving certain platforms for daily or less frequent builds to maintain a balance between speed and thoroughness.

Build system setup

Establish a robust build system with dedicated build computers, well-defined build scripts, and effective notification systems. Assign one person to oversee the build system and infrastructure to ensure its reliability and maintenance.

Embrace continuous integration (CI)

Transitioning from a traditional build approach to a continuous integration (CI) system is beneficial in the long run, especially when you're managing multiple platforms. However, CI demands automated builds, comprehensive unit testing, and automated test scripts. Despite this up-front investment, CI pays off by reducing bugs, enhancing release reliability, and speeding up maintenance changes.

Comprehensive testing

As much as possible, incorporate the “hard” testing bits into your automated testing/CI process – in other words, integration and user interface testing. These tests, while more complex to set up, significantly contribute to the robustness of your software. What works flawlessly in an emulated desktop environment may behave differently on the actual hardware, so ensure your testing procedures also include hardware target testing.

Building multi-platform with quality

Developing and testing software for multiple platforms requires a commitment to maintaining quality. For additional insights into ensuring your software’s versatility, reliability, and efficiency across all target platforms, read our best practices guide on Designing Your First Embedded Device: The Development Environment.

About KDAB

If you like this article and want to read similar material, consider subscribing via our RSS feed.

Subscribe to KDAB TV for similar informative short video content.

KDAB provides market leading software consulting and development services and training in Qt, C++ and 3D/OpenGL. Contact us.

The post Streamlining Multi-platform Development and Testing appeared first on KDAB.

Categories: FLOSS Project Planets

Mike Driscoll: NEW COURSE: Python 101 Video Course on Udemy and TutorialsPoint

Planet Python - Wed, 2024-03-13 23:45

I recently put my Python 101 Video Course up on Udemy and TutorialsPoint.

There are one thousand free copies of the course available on Udemy by using the following link:

If you prefer TutorialsPoint, you can get a free copy here:

The Python 101 video course is also available on TeachMePython and my Gumroad store.

I am slowly adding quizzes to the Udemy version of the course and plan to add the same or similar quizzes on TeachMePython. Unfortunately, TutorialsPoint doesn't support quizzes in the same way, as they use a video quiz format, so that site won't have any quizzes.

The post NEW COURSE: Python 101 Video Course on Udemy and TutorialsPoint appeared first on Mouse Vs Python.

Categories: FLOSS Project Planets

Spyder IDE: Reusable research Birds of a Feather session at Scipy 2023: Solutions and tools

Planet Python - Wed, 2024-03-13 22:30

The Spyder team and collaborators hosted a Birds of a Feather (BoF) session at SciPy 2023, focused on moving beyond just scripts and notebooks toward truly reproducible, reusable research. In Part 1 of this two-part series, we went over our motivation and goals for the session and the challenges that attendees brought up. Now, we’ll review the tips, strategies, tools and platforms (including Spyder!) that participants shared as ways to address these obstacles. We'd again like to thank Juanita Gomez for helping organize the BoF, Hari for his hard work compiling a summary of the outcomes, and everyone for attending and sharing such great ideas and insights!

Making notebooks more reusable

As far as reproducibility is concerned, it was brought up that it can be difficult to easily compare outputs between notebooks created by different researchers. In response, one participant mentioned that VSCode recently made an improvement to the notebook diff viewers to more easily show just the code changes. However, users stressed that it was critical to be able to diff the actual notebook output, not just its contents, and expressed a desire for a tool to cover that aspect.

In response to these concerns, others responded that notebooks should not be considered a unit of reproducible research, which should instead be a complete software project, including notebooks or scripts, an environment/requirements file and a record of commands to run there. They recommended the 8-levels of Reproducibility and Conda Project to help guide and implement this.

Additionally, attendees recommended Papermill, describing it as a very useful tool for parameterizing and executing notebooks programmatically. Others suggested Devcontainers, to allow collaborating with a lab group or team in a shared environment and seeing everything on their screen, as well as Live Share in VSCode.

Participants also expressed frustration that despite notebooks being intended to make programming more literate, this often does not happen in practice. Beginners like the interactivity in notebooks because they don't know how to use more advanced programming tools, but they don't always take advantage of their readability features. To address this, attendees stressed the importance of getting users accustomed to best practices that can also be helpful for reproducibility. A participant mentioned an nbflake8 tool to lint notebooks, though it could not be easily found online, and others wished for a Ruff implementation (which at the time of this writing is now complete).

Migrating notebooks to modules

As one participant put it, "I love notebooks, and also love modules, and love the flow of code from notebooks into modules once it approaches that point." They went on to describe modules as a key unit of documented, tested code, but which doesn't mean a lot on its own, whereas combined with a notebook, it gives them context and meaning. For communities that may be afraid of modules, the participant recommended trying to make creating and transitioning to them easier, so users have fully importable, reusable Python code. For students, notebooks often turn into a fancy scratch pad or script file, and once they get stuff that works, they can move that stuff out into modules, and then the notebooks start to morph into examples and the history of what the work was about that can be interpreted by other researchers.

Other attendees chimed in with similar stories, with a NIST researcher mentioning this is an area they'd been working on for 10 years, with their approach being putting the stuff they want to be modular in a regular Python module, and then have a Jupyter notebook that shows an example using the code, such as in their IPRPy project. To aid this process, participants suggested tools like the Autodocstring extension in VSCode and the docstring generator built into Spyder's editor as great ways to reduce the friction for students when writing documentation, as they just add the triple quotes and the IDE generates a pre-filled docstring for them.

An important reproducibility and reusability tool many cited for this was nbdev, which can allow users to develop their code and let it grow, and then eventually export the parts as modules at the end. According to attendees, its documentation mostly talks about everything as packages, but it can also be used for individual notebooks and modules. Some participants were initially hesitant to show it to their students since they're early Python programmers, but it was actually quite easy for them, only requiring as little as one line of code at the end. (Unfortunately as of this writing, it seems nbdev development has stalled due to its expected commercial opportunities not materializing.) Others asked for more documentation resources for this, since they were still learning Python themselves and would like to learn more about this and teach it to their students. In addition to this very blog post and guide, one attendee brought up that they did a tutorial on that topic at SciPy, adding that the documentation is pretty intimidating but it would be great to have something more focused on smaller-scale usage.

As additional approaches, attendees mentioned they have their students use Jupytext, which helps the student to convert notebooks to Python files that can be committed to a Git repository. This allows the code to be committed as a Python file, while allowing Jupyter to open it as a notebook and continue working on it. Others brought up nbconvert, a command line tool that can convert notebooks to many different formats including a Python script, which is integrated into IDEs like Spyder, and that there is also a similar VSCode feature.

Enabling reusable Python packages

When it comes to overall workflow, all agreed that going from a script or notebook to a reusable, installable Python package could be a major challenge, especially for students and non-programmers. Attendees from NASA mentioned that for their projects everything has to be documented, and one of the things they've struggled with was converting a notebook to the type of report NASA is typically looking for. Others described their workflow being as simple and "old school" as writing a aaa_readme.txt file where they record a diary of what they were doing on that project so if they have a break working on it, they can go back to those notes and remind themselves.

To help address this, participants recommended a "really cool" tool called "Show Your Work" that comes out of the astrophysics community, which is primarily aimed at producing a paper at the end but also a Python package, and includes all the steps that show users' work along the way. It is built around a tool called Snakemake, which then sets up a template for both the Python package and the paper. Additionally, attendees described it as having a "really helpful" guide for getting started and ensuring all of a user's projects have the same structure. It was brought up that Axel Donath, maintainer of Gammapy and speaker at SciPy 2023, published their Gammapy paper by using this tool.

As a followup, participants asked how this differed from Quarto, to which the response was that Quarto is much more general, whereas Show Your Work was specifically built to allow users to produce a PDF in LaTeX at the end. Others mentioned Duecredit, a related tool for citing open source authors which looks at code and finds the authors (via Git commits) that wrote it.

Additionally, users expressed particular appreciation for the Cookiecutter template that Henry Schreiner III has for packaging. They mentioned that a lot of their workflows are just messing around with their data, and having something like a package structure from the get go helps make it easier to not miss things. As a followup, a nuclear engineer mentioned they often have two week projects leveraging Jupyter at their center, with a cookiecutter template that has Sphinx, and a directory structure, and metadata that looks familiar and has everything set up by default. They described how this particularly helps ensure that different colleagues and team members are on the same page with doing things. Finally, others suggested the data-driven Cookiecutter template, which provides an ordered structure for where things go, what they are named and how they are run.

Next steps

Now that we’ve gathered a wealth of community feedback, ideas and resources, we’re currently working to further translate these insights into an actionable guide (or series of such) on a community platform, to make it easier for everyone to apply them. Keep an eye out for that, and until then, happy Spydering!

Categories: FLOSS Project Planets

MidCamp - Midwest Drupal Camp: Beware the ides of March!

Planet Drupal - Wed, 2024-03-13 21:01
Beware the ides of March!

With just one week until we meet in person for MidCamp 2024, book your ticket before the standard discount pricing ends on March 14 at 11:59pm CT to save $100!

Session Schedule

We’ve got a great line-up this year! All sessions on Wednesday and Thursday (March 20/21) are included in the price of your MidCamp registration. We encourage you to start planning your days -- and get ready for some great learning opportunities!

Expanded Learning Tickets

Know any students or individuals seeking to expand their Drupal knowledge?

We have heavily discounted tickets ($25!) available for students and those wanting to learn more about Drupal to join us at MidCamp and learn more about the community!

There are sessions for everyone—topics range from Site Building and DevOps to Project Management and Design.

Get your tickets now!

Volunteer for some exclusive swag!

If you know you'll be joining us in Chicago, why not volunteer some of your spare time and get some exclusive swag in return! Check out our non-code opportunities to get involved.

Stay In The Loop

Please feel free to ask on the MidCamp Slack and come hang out with the community online. We will be making announcements there from time to time. We’re also on Twitter and Mastodon.

We can’t wait to see you next week! 

The MidCamp Team

Categories: FLOSS Project Planets

Wingware: Wing Python IDE Version 10.0.3 - March 15, 2024

Planet Python - Wed, 2024-03-13 21:00

Wing 10.0.3 adds more control over AI request context, improves keyboard navigability on Windows, fixes folding failures seen in Python files, avoids failure to show debug process output in Debug I/O, improves Diff/Merge for directories and files, fixes hanging while examining some types of debug data, solves several problems with setting up Poetry projects on Windows, and makes about 20 other improvements.

See the change log for details.

Download Wing 10 Now: Wing Pro | Wing Personal | Wing 101 | Compare Products

What's New in Wing 10

AI Assisted Development

Wing Pro 10 takes advantage of recent advances in the capabilities of generative AI to provide powerful AI assisted development, including AI code suggestion, AI driven code refactoring, description-driven development, and AI chat. You can ask Wing to use AI to (1) implement missing code at the current input position, (2) refactor, enhance, or extend existing code by describing the changes that you want to make, (3) write new code from a description of its functionality and design, or (4) chat in order to work through understanding and making changes to code.

Examples of requests you can make include:

"Add a docstring to this method"
"Create unit tests for class SearchEngine"
"Add a phone number field to the Person class"
"Clean up this code"
"Convert this into a Python generator"
"Create an RPC server that exposes all the public methods in class BuildingManager"
"Change this method to wait asynchronously for data and return the result with a callback"
"Rewrite this threaded code to instead run asynchronously"

Yes, really!

Your role changes to one of directing an intelligent assistant capable of completing a wide range of programming tasks in relatively short periods of time. Instead of typing out code by hand every step of the way, you are essentially directing someone else to work through the details of manageable steps in the software development process.

Read More

Support for Python 3.12

Wing 10 adds support for Python 3.12, including (1) faster debugging with PEP 669 low impact monitoring API, (2) PEP 695 parameterized classes, functions and methods, (3) PEP 695 type statements, and (4) PEP 701 style f-strings.

Poetry Package Management

Wing Pro 10 adds support for Poetry package management in the New Project dialog and the Packages tool in the Tools menu. Poetry is an easy-to-use cross-platform dependency and package manager for Python, similar to pipenv.

Ruff Code Warnings & Reformatting

Wing Pro 10 adds support for Ruff as an external code checker in the Code Warnings tool, accessed from the Tools menu. Ruff can also be used as a code reformatter in the Source > Reformatting menu group. Ruff is an incredibly fast Python code checker that can replace or supplement flake8, pylint, pep8, and mypy.

Try Wing 10 Now!

Wing 10 is a ground-breaking new release in Wingware's Python IDE product line. Find out how Wing 10 can turbocharge your Python development by trying it today.

Downloads: Wing Pro | Wing Personal | Wing 101 | Compare Products

See Upgrading for details on upgrading from Wing 9 and earlier, and Migrating from Older Versions for a list of compatibility notes.

Categories: FLOSS Project Planets

Dirk Eddelbuettel: ciw 0.0.1 on CRAN: New Package!

Planet Debian - Wed, 2024-03-13 20:03

Happy to share that ciw is now on CRAN! I had tooted a little bit about it, e.g., here. What it provides is a single (efficient) function incoming() which summarises the state of the incoming directories at CRAN. I happen to like having these things at my (shell) fingertips, so it goes along with (still draft) wrapper ciw.r that will be part of the next littler release.

For example, when I do this right now as I type this, I see

edd@rob:~$ ciw.r
    Folder                   Name                Time   Size          Age
    <char>                 <char>              <POSc> <char>   <difftime>
1: waiting   maximin_1.0-5.tar.gz 2024-03-13 22:22:00    20K   2.48 hours
2: inspect    GofCens_0.97.tar.gz 2024-03-13 21:12:00    29K   3.65 hours
3: inspect verbalisr_0.5.2.tar.gz 2024-03-13 20:09:00    79K   4.70 hours
4: waiting    rnames_1.0.1.tar.gz 2024-03-12 15:04:00   2.7K  33.78 hours
5: waiting  PCMBase_1.2.14.tar.gz 2024-03-10 12:32:00   406K  84.32 hours
6: pending        MPCR_1.1.tar.gz 2024-02-22 11:07:00   903K 493.73 hours
edd@rob:~$

which is rather compact as CRAN kept busy! This call runs in about (or just over) one second, which includes launching r. Good enough for me. From a well-connected EC2 instance it is about 800ms on the command-line. When I do it from here inside an R session it is maybe 700ms. And doing it over in Europe is faster still. (I am using ping=FALSE for these to omit the default sanity check of 'can I haz networking?' to speed things up. The check adds another 200ms or so.)

The function (and the wrapper) offer a ton of options too; this is ridiculously easy to do thanks to the docopt package:

edd@rob:~$ ciw.r -x
Usage: ciw.r [-h] [-x] [-a] [-m] [-i] [-t] [-p] [-w] [-r] [-s] [-n] [-u] [-l rows] [-z] [ARG...]

-m --mega        use 'mega' mode of all folders (see --usage)
-i --inspect     visit 'inspect' folder
-t --pretest     visit 'pretest' folder
-p --pending     visit 'pending' folder
-w --waiting     visit 'waiting' folder
-r --recheck     visit 'recheck' folder
-a --archive     visit 'archive' folder
-n --newbies     visit 'newbies' folder
-u --publish     visit 'publish' folder
-s --skipsort    skip sorting of aggregate results by age
-l --lines rows  print top 'rows' of the result object [default: 50]
-z --ping        run the connectivity check first
-h --help        show this help text
-x --usage       show help and short example usage

where ARG... can be one or more file names, or directories or package names.

Examples:
  ciw.r -ip   # run in 'inspect' and 'pending' mode
  ciw.r -a    # run with mode 'auto' resolved in incoming()
  ciw.r       # run with defaults, same as '-itpwr'

When no argument is given, 'auto' is selected, which corresponds to 'inspect', 'waiting', 'pending', 'pretest', and 'recheck'. Folder-selecting arguments are cumulative, but 'mega' is a single selection of all folders (i.e. 'inspect', 'waiting', 'pending', 'pretest', 'recheck', 'archive', 'newbies', 'publish').

ciw.r is part of littler which brings 'r' to the command-line. See https://dirk.eddelbuettel.com/code/littler.html for more information.
edd@rob:~$

The README at the git repo and the CRAN page offer a ‘screenshot movie’ showing some of the options in action.

I have been using the little tool quite a bit over the last two or three weeks since I first put it together and find it quite handy. With that, again a big Thank You! of appreciation for all that CRAN does—which this week included letting this past the newbies desk in under 24 hours.

If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Categories: FLOSS Project Planets

Freexian Collaborators: Monthly report about Debian Long Term Support, February 2024 (by Roberto C. Sánchez)

Planet Debian - Wed, 2024-03-13 20:00

Like each month, have a look at the work funded by Freexian’s Debian LTS offering.

Debian LTS contributors

In February, 18 contributors have been paid to work on Debian LTS, their reports are available:

  • Abhijith PA did 10.0h (out of 14.0h assigned), thus carrying over 4.0h to the next month.
  • Adrian Bunk did 13.5h (out of 24.25h assigned and 41.75h from previous period), thus carrying over 52.5h to the next month.
  • Bastien Roucariès did 20.0h (out of 20.0h assigned).
  • Ben Hutchings did 2.0h (out of 14.5h assigned and 9.5h from previous period), thus carrying over 22.0h to the next month.
  • Chris Lamb did 18.0h (out of 18.0h assigned).
  • Daniel Leidert did 10.0h (out of 10.0h assigned).
  • Emilio Pozuelo Monfort did 3.0h (out of 28.25h assigned and 31.75h from previous period), thus carrying over 57.0h to the next month.
  • Guilhem Moulin did 7.25h (out of 4.75h assigned and 15.25h from previous period), thus carrying over 12.75h to the next month.
  • Holger Levsen did 0.5h (out of 3.5h assigned and 8.5h from previous period), thus carrying over 11.5h to the next month.
  • Lee Garrett did 0.0h (out of 18.25h assigned and 41.75h from previous period), thus carrying over 60.0h to the next month.
  • Markus Koschany did 40.0h (out of 40.0h assigned).
  • Roberto C. Sánchez did 3.5h (out of 8.75h assigned and 3.25h from previous period), thus carrying over 8.5h to the next month.
  • Santiago Ruano Rincón did 13.5h (out of 13.5h assigned and 2.5h from previous period), thus carrying over 2.5h to the next month.
  • Sean Whitton did 4.5h (out of 0.5h assigned and 5.5h from previous period), thus carrying over 1.5h to the next month.
  • Sylvain Beucler did 24.5h (out of 27.75h assigned and 32.25h from previous period), thus carrying over 35.5h to the next month.
  • Thorsten Alteholz did 14.0h (out of 14.0h assigned).
  • Tobias Frost did 12.0h (out of 12.0h assigned).
  • Utkarsh Gupta did 11.25h (out of 26.75h assigned and 33.25h from previous period), thus carrying over 48.75h to the next month.
Evolution of the situation

In February, we have released 17 DLAs.

The number of DLAs published during February was a bit lower than usual, as much work went into triaging CVEs (a number of which turned out not to affect Debian buster, while others ended up being duplicates or were otherwise determined to be invalid). Of the packages which did receive updates, notable were sudo (to fix a privilege management issue), and iwd and wpa (both of which suffered from authentication bypass vulnerabilities).

While this has already been announced in the Freexian blog, we would like to mention here the start of the Long Term Support project for Samba 4.17. You can find all the important details in that post, but we would like to highlight that it is thanks to our LTS sponsors that we are able to fund the work from our partner, Catalyst, towards improving the security support of Samba in Debian 12 (Bookworm).

Thanks to our sponsors

Sponsors that joined recently are in bold.

Categories: FLOSS Project Planets

What We're Up To In 2024

Planet KDE - Wed, 2024-03-13 20:00

It's 2024 already, and even already March. Like last year, we had a video call with all sponsored developers, artists and volunteers to discuss what we achieved last year, figure out the biggest issues we're facing and set the priorities for this year.


A very serious issue is that the maintainer of the Android and ChromeOS port of Krita has become too busy to work on Krita full-time. The Android and ChromeOS versions of Krita both use the Android platform, and that platform changes often and arbitrarily. This means that Sharaf has spent almost all of his time keeping Krita running on Android (and ChromeOS) instead of, as we had planned, working on a dedicated tablet user interface for Krita on Android. And since that maintenance work is now not being done, we have a really big problem there. Additionally, since KDE has retired the binary factory and moved binary builds to invent.kde.org's continuous integration system, we don't have automatic builds for Android anymore.

We've also lost another sponsored developer. They were sick for quite some time, and recently they blogged that they had started a different job. Since they were especially working on maintaining the libraries Krita depends on and were very good at upstreaming fixes, they will also really be missed.

Finally, we got Krita into the Apple MacOS store last year. However, two years ago, Krita's maintainer, that's me, changed her legal name. Now the certificates needed to sign the package for the store have expired, and we needed to create new certificates. Those have to carry the signer's current legal name, and for some reason, it's proving really hard to get the store to accept that the same developer, with the same ID and code but a different legal name, can upload packages. We're working on that.

What We Did Last Year

Of course, we released Krita 5.2 and two bugfix releases for Krita 5.2. We'll do at least one other bugfix release before we release Krita 5.3.

The audio system for Krita's animation feature got completely overhauled, ported away from Qt's QtMultimedia system to MLT. The storyboard feature got improved a lot, and we gained JPEG-XL support just in time for Google's Chrome team to decide to drop it, because there was nobody supporting it... We also refactored the system we use to build all dependent libraries on all platforms. Well, work on MacOS is still going on, with PyQt being a problem point. Of course, there were a lot of other things going on as well.

Wolthera started rewriting the text object, and mostly finished that and now is working on the tool to actually write, modify and typeset text. This is a huge change with very impressive results!

What We Hope To Do This Year

Part of this list is from last year; part of it is new.

One big caveat: now that the KDE project has released the first version of KDE Frameworks for Qt6, porting Krita to Qt6 is going to have to happen. This is a big project, not just because of disappearing functions, but very much because of the changes to the support for GPU rendering. On Windows, OpenGL drivers are pretty buggy, and because of that, Qt5 offered the possibility to use the Angle compatibility layer between applications that use OpenGL and the native Direct3D library for GPU rendering. That's gone, and unless we rewrite our GPU rendering system, we need to put Angle back into the stack.

All in all, it's pretty likely that porting to Qt6 will take a lot of time away from us implementing fun new features. But when that is done we can start working on a tablet-friendly user interface, provided we can still release Krita for Android.

That's not to say we don't want to implement fun new features!

Here's the shortlist:

  • Implement a system to create flexible text balloons and integrate that with the text object so the text flows into the balloons
  • Implement a new layer type for comic book frames
  • Provide integration with Blender. (This is less urgent, though, since there is a very useful third-party plugin for that already: [Blender Layer](https://github.com/Yuntokon/BlenderLayer/).)
  • Replace the current docker system with something more flexible, and maintained.
  • Implement a system to provide tool presets
  • Create a new user interface for handling palettes
  • Add an animation audio waveform display
  • Add support for animation reference frame workflow.

We also discussed using the GPU for improving performance. One original idea was to use the GPU for brushes, but the artists argued that the brush performance is fine, and what's way too slow are the liquefy transform tool, transform masks and some filters. In the end, Dmitry decided to investigate

  • optimizing transform masks on the GPU

And there's the most controversial thing of all: should we add AI features to Krita? We have had several heated discussions amongst developers and artists on the mailing list and on invent.kde.org.

The artists in the meeting argued that generative AI is worthless and would at best lead to bland, repetitive templates, but that assistive AI could be useful. In order to figure out whether that's true, we started investigating one particular project: AI-assisted inking of sketches. This is useful, could replace a tedious step when doing art while still retaining the artistic individuality. Whether it will actually make it to Krita is uncertain of course, but the investigation will hopefully help us understand better the issue, the possibilities and the problems.

Note: we won't be implementing anything that uses models trained on scraped images and we will make sure that the carbon footprint of the feature doesn't exceed its usefulness.

Categories: FLOSS Project Planets

The Drop Times: Zoocha: The Drupal Development Specialists in the UK

Planet Drupal - Wed, 2024-03-13 17:34
Discover the journey of Zoocha, a leading Drupal Development Agency renowned for its commitment to innovation, sustainability, and client success. Since its inception in 2009, Zoocha has demonstrated a profound dedication to open-source technology and agile methodologies, earning notable certifications and accolades. With a portfolio boasting collaborations with high-profile clients across various sectors, Zoocha excels in delivering customized Drupal solutions that meet unique client needs.

This article delves into Zoocha's strategies for engaging talent, promoting Drupal among younger audiences, and its ambitious goal to achieve carbon neutrality by 2025. Explore how Zoocha's commitment to quality, security, and environmental sustainability positions it as a trusted partner in the dynamic world of web development.
Categories: FLOSS Project Planets

Four Kitchens: Custom Drush commands with Drush Generate

Planet Drupal - Wed, 2024-03-13 14:59

Marc Berger

Senior Backend Engineer

Always looking for a challenge, Marc tries to add something new to his toolbox for every project and build — be it a new CSS technology, creating custom APIs, or testing out new processes for development.


Recently, one of our clients had to retrieve some information from their Drupal site during a CI build. They needed to know the internal Drupal path from a known path alias. Common Drush commands don’t provide this information directly, so we decided to write our own custom Drush command. It was a lot easier than we thought it would be! Let’s get started.

Note: This post is based on commands and structure for Drush 12.

While we can write our own Drush command from scratch, let’s discuss a tool that Drush already provides us: the drush generate command. Drush 9 added support to generate scaffolding and boilerplate code for many common Drupal coding tasks such as custom modules, themes, services, plugins, and many more. The nice thing about using the drush generate command is that the code it generates conforms to best practices and Drupal coding standards — and some generators even come with examples as well. You can see all available generators by simply running drush generate without any arguments.

Step 1: Create a custom module

To get started, a requirement to create a new custom Drush command in this way is to have an existing custom module already in the codebase. If one exists, great. You can skip to Step 2 below. If you need a custom module, let’s use Drush to generate one:

drush generate module

Drush will ask a series of questions such as the module name, the package, any dependencies, and if you want to generate a .module file, README.md, etc. Once the module has been created, enable the module. This will help with the autocomplete when generating the custom Drush command.

drush en <machine_name_of_custom_module>

Step 2: Create custom Drush command boilerplate

First, make sure you have a custom module where your new custom Drush command will live and make sure that module is enabled. Next, run the following command to generate some boilerplate code:

drush generate drush:command-file

This command will also ask some questions, the first of which is the machine name of the custom module. If that module is enabled, it will autocomplete the name in the terminal. You can also tell the generator to use dependency injection if you know what services you need to use. In our case, we need to inject the path_alias.manager service. Once generated, the new command class will live here under your custom module:


Let’s take a look at this newly generated code. We will see the standard class structure and our dependency injection at the top of the file:

<?php

namespace Drupal\custom_drush\Drush\Commands;

use Consolidation\OutputFormatters\StructuredData\RowsOfFields;
use Drupal\Core\StringTranslation\StringTranslationTrait;
use Drupal\Core\Utility\Token;
use Drupal\path_alias\AliasManagerInterface;
use Drush\Attributes as CLI;
use Drush\Commands\DrushCommands;
use Symfony\Component\DependencyInjection\ContainerInterface;

/**
 * A Drush commandfile.
 *
 * In addition to this file, you need a drush.services.yml
 * in root of your module, and a composer.json file that provides the name
 * of the services file to use.
 */
final class CustomDrushCommands extends DrushCommands {

  use StringTranslationTrait;

  /**
   * Constructs a CustomDrushCommands object.
   */
  public function __construct(
    private readonly Token $token,
    private readonly AliasManagerInterface $pathAliasManager,
  ) {
    parent::__construct();
  }

  /**
   * {@inheritdoc}
   */
  public static function create(ContainerInterface $container) {
    return new static(
      $container->get('token'),
      $container->get('path_alias.manager'),
    );
  }

Note: The generator adds a comment about needing a drush.services.yml file. This requirement is deprecated and will be removed in Drush 13, so you can ignore it if you are using Drush 12. In our testing, this file does not need to be present.

Further down in the new class, we will see some boilerplate example code. This is where the magic happens:

/**
 * Command description here.
 */
#[CLI\Command(name: 'custom_drush:command-name', aliases: ['foo'])]
#[CLI\Argument(name: 'arg1', description: 'Argument description.')]
#[CLI\Option(name: 'option-name', description: 'Option description')]
#[CLI\Usage(name: 'custom_drush:command-name foo', description: 'Usage description')]
public function commandName($arg1, $options = ['option-name' => 'default']) {
  $this->logger()->success(dt('Achievement unlocked.'));
}

This new Drush command doesn’t do very much at the moment, but provides a great jumping-off point. The first thing to note at the top of the function are the new PHP 8 attributes that begin with the #. These replace the previous PHP annotations that are commonly seen when writing custom plugins in Drupal. You can read more about the new PHP attributes.

The different attributes tell Drush what our custom command name is, description, what arguments it will take (if any), and any aliases it may have.

Step 3: Create our custom command

For our custom command, let’s modify the code so we can get the internal path from a path alias:

/**
 * Retrieves the internal Drupal path for a given path alias.
 */
#[CLI\Command(name: 'custom_drush:internal-path', aliases: ['intpath'])]
#[CLI\Argument(name: 'pathAlias', description: 'The path alias, must begin with /')]
#[CLI\Usage(name: 'custom_drush:internal-path /path-alias', description: 'Supply the path alias and the internal path will be retrieved.')]
public function getInternalPath($pathAlias) {
  if (!str_starts_with($pathAlias, "/")) {
    $this->logger()->error(dt('The alias must start with a /'));
  }
  else {
    $path = $this->pathAliasManager->getPathByAlias($pathAlias);
    if ($path == $pathAlias) {
      $this->logger()->error(dt('There was no internal path found that uses that alias.'));
    }
    else {
      $this->output()->writeln($path);
    }
  }
}

What we’re doing here is changing the name of the command so it can be called like so:

drush custom_drush:internal-path <path>

or via the alias:

drush intpath <path>

The <path> is a required argument (such as /my-amazing-page) because of how it is declared in the getInternalPath method. Given a path, this method first checks that it starts with /. If it does, it performs an additional check to see whether an internal path exists for that alias. If so, it writes the internal path, i.e., /node/1234, to standard output via the inherited output() method, while error messages go through the logger() method that comes from the DrushCommands class. It's a simple command, but one that helped us automatically set config during a CI job.

Table output

Note the boilerplate code also generated another example below the first — one that will provide output in a table format:

/**
 * An example of the table output format.
 */
#[CLI\Command(name: 'custom_drush:token', aliases: ['token'])]
#[CLI\FieldLabels(labels: [
  'group' => 'Group',
  'token' => 'Token',
  'name' => 'Name',
])]
#[CLI\DefaultTableFields(fields: ['group', 'token', 'name'])]
#[CLI\FilterDefaultField(field: 'name')]
public function token($options = ['format' => 'table']): RowsOfFields {
  $rows = [];
  $all = $this->token->getInfo();
  foreach ($all['tokens'] as $group => $tokens) {
    foreach ($tokens as $key => $token) {
      $rows[] = [
        'group' => $group,
        'token' => $key,
        'name' => $token['name'],
      ];
    }
  }
  return new RowsOfFields($rows);
}

In this example, no argument is required, and it will simply print out the list of tokens in a nice table:

 ------------ ------------------ -----------------------
  Group        Token              Name
 ------------ ------------------ -----------------------
  file         fid                File ID
  node         nid                Content ID
  site         name               Name
  ...          ...                ...

Final thoughts

Drush is a powerful tool, and like many parts of Drupal, it’s expandable to meet different needs. While I shared a relatively simple example to solve a small challenge, the possibilities are open to retrieve all kinds of information from your Drupal site to use in scripting, CI/CD jobs, reporting, and more. And by using the drush generate command, creating these custom solutions is easy, follows best practices, and helps keep code consistent.

Further reading

The post Custom Drush commands with Drush Generate appeared first on Four Kitchens.

Categories: FLOSS Project Planets

a2ps @ Savannah: a2ps 4.15.6 released [stable]

GNU Planet! - Wed, 2024-03-13 14:24

I am delighted to announce version 4.15.6 of GNU a2ps, the Anything to
PostScript converter.

This release fixes a couple of bugs, in particular with printing (the -P
flag). See below for details.

Here are the compressed sources and a GPG detached signature:

Use a mirror for higher download bandwidth:

Here are the SHA1 and SHA256 checksums:

e20e8009d8812c8d960884b79aab95f235c725c0  a2ps-4.15.6.tar.gz
h/+dgByxGWkYHVuM+LZeZeWyS7DHahuCXoCY8pBvvfQ  a2ps-4.15.6.tar.gz

The SHA256 checksum is base64 encoded, instead of the
hexadecimal encoding that most checksum tools default to.
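To compare this against the hex digest printed by sha256sum, the digest can be converted to base64. Here is a minimal Python sketch; note that the checksum listing above appears to strip the trailing '=' padding, so the sketch does the same:

```python
import base64
import binascii

def hex_sha256_to_base64(hex_digest: str) -> str:
    """Convert a hex SHA-256 digest (as printed by sha256sum) to the
    unpadded base64 form used in the checksum listing above."""
    raw = binascii.unhexlify(hex_digest)  # 32 raw digest bytes
    return base64.b64encode(raw).decode("ascii").rstrip("=")
```

For example, running sha256sum on the tarball prints a hex digest; converting it with this helper should reproduce the base64 value shown above.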

Use a .sig file to verify that the corresponding file (without the
.sig suffix) is intact.  First, be sure to download both the .sig file
and the corresponding tarball.  Then, run a command like this:

  gpg --verify a2ps-4.15.6.tar.gz.sig

The signature should match the fingerprint of the following key:

  pub   rsa2048 2013-12-11 [SC]
        2409 3F01 6FFE 8602 EF44  9BB8 4C8E F3DA 3FD3 7230
  uid   Reuben Thomas <rrt@sc3d.org>
  uid   keybase.io/rrt <rrt@keybase.io>

If that command fails because you don't have the required public key,
or that public key has expired, try the following commands to retrieve
or refresh it, and then rerun the 'gpg --verify' command.

  gpg --locate-external-key rrt@sc3d.org

  gpg --recv-keys 4C8EF3DA3FD37230

  wget -q -O- 'https://savannah.gnu.org/project/release-gpgkeys.php?group=a2ps&download=1' | gpg --import -

As a last resort to find the key, you can try the official GNU keyring:

  wget -q https://ftp.gnu.org/gnu/gnu-keyring.gpg
  gpg --keyring gnu-keyring.gpg --verify a2ps-4.15.6.tar.gz.sig

This release was bootstrapped with the following tools:
  Autoconf 2.71
  Automake 1.16.5
  Gnulib v0.1-7186-g5aa8eafc0e


* Noteworthy changes in release 4.15.6 (2024-03-13) [stable]
 * Bug fixes:
   - Fix a2ps-lpr-wrapper to work with no arguments, as a2ps requires.
   - Minor fixes & improvements to sheets.map for image types and PDF.
 * Build system:
   - Minor fixes and improvements.

Categories: FLOSS Project Planets

ClearlyDefined at the ORT Community Days

Open Source Initiative - Wed, 2024-03-13 10:45

Once again Bosch’s campus in Berlin received ORT Community Days, the annual event organized by the OSS Review Toolkit (ORT) community. ORT is an Open Source suite of tools to automate software compliance checks.

During this two-day event, members from startups like Double Open and NexB, as well as large corporations like Mercedes-Benz, Volkswagen, CARIAD, Porsche, Here Technologies, EPAM, Deloitte, Sony, Zeiss, Fraunhofer, and Roche, came together to discuss best practices around software supply chain compliance.

The ClearlyDefined community had an important presence at the event, represented by E. Lynette Rayle and Lukas Spieß from GitHub and Qing Tomlinson from SAP. I had the pleasure to represent the Open Source Initiative as the community manager for ClearlyDefined. The mission of ClearlyDefined is to crowdsource a global database of licensing metadata for every software component ever published. We see the ORT community as an important partner towards achieving this mission.

Relevant talks

There were several interesting talks at ORT Community Days. These are the ones I found most relevant to ClearlyDefined:

Philippe Ombredanne presented ScanCode, a project of great importance to ClearlyDefined, as we use this tool to detect licenses, copyrights, and dependencies. Philippe gave an overview of the project and its challenges. For ClearlyDefined, we would like to see better accuracy and performance improvements. 

Sebastian Schuberth presented the Double Open Server (DOS) companion for ORT. DOS is a server application that scans the source code of open source components, stores the scan results for use in license compliance pipelines, and provides a graphical interface for manually curating the license findings. I believe there’s an opportunity to integrate DOS with ClearlyDefined by providing access to our APIs to fetch licensing metadata and allowing the sharing of curations.

Marcel Kurzmann and Martin Nonnenmacher presented Eclipse Apoapsis, another ORT server that makes use of its integration APIs for dependency analysis, license scanning, vulnerability databases, rule engine, and report generation. Again, I feel we could also integrate Eclipse Apoapsis with ClearlyDefined the same way as with DOS.

Till Jaeger gave an excellent talk about curation of ORT output from the perspective of FOSS license compliance. He highlighted the Cyber Resilient Act (CRA), which brings legal provisions for SBOMs, and which will likely increase the need for tools like ORT. Till shared the many challenges in the curation process, particularly the compatibility issues from dual licensing, and went on to showcase the OSADL compatibility matrix.

Presenting ClearlyDefined

I had the privilege of presenting ClearlyDefined together with E. Lynette Rayle from GitHub and we got some really good feedback and questions from the audience.

With the move towards SBOMs everywhere for compliance and security reasons, organizations will face great challenges to generate these at scale for each stage on the supply chain, for every build or release. Additionally, multiple organizations will have to curate the same missing or wrongly identified licensing metadata over and over again.

ClearlyDefined is well suited to solve these problems by serving a cached copy of licensing metadata for each component through a simple API. Organizations will also be able to contribute back with any missing or wrongly identified licensing metadata, helping to create a database that is accurate for the benefit of all.

GitHub is well aware of these challenges and is interested in helping its users in this regard. They recently added 17.5 million package licenses sourced from ClearlyDefined to their database, expanding the license coverage for packages that appear in dependency graph, dependency insights, dependency review, and a repository’s software bill of materials (SBOM).

To make use of ClearlyDefined’s data, a user can simply make a call to its API service. For example, to fetch licensing metadata from the lodash library on NPM at version 4.17.21, one would call:

curl -X GET "https://api.clearlydefined.io/definitions/npm/npmjs/-/lodash/4.17.21" -H "accept: */*"

This API call would be processed by the service for ClearlyDefined, as illustrated in the diagram below. If there’s a match in the definition store, then that definition would be sent back to the user. Otherwise, this request would trigger the crawler for ClearlyDefined (part of the harvesting process), which would download the lodash library from NPM, scan the library, and write the results to the raw results store. The service for ClearlyDefined would then read the raw results, summarize it, and create a definition to be written in the definition store. Finally, the definition would be served to the user.
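The request/response flow above can be scripted. Below is a minimal Python sketch against the public API: the coordinate layout follows the lodash example (a missing namespace is written as '-'), and the `licensed.declared` field is where a definition's declared license lives; treat the field path as an assumption to verify against the ClearlyDefined schema.

```python
import json
import urllib.request

API = "https://api.clearlydefined.io"

def definition_url(ctype, provider, namespace, name, revision):
    """Build the definitions URL for a component's coordinates;
    a missing namespace is encoded as '-'."""
    return f"{API}/definitions/{ctype}/{provider}/{namespace or '-'}/{name}/{revision}"

def declared_license(ctype, provider, namespace, name, revision):
    """Fetch a definition and return its declared license."""
    url = definition_url(ctype, provider, namespace, name, revision)
    with urllib.request.urlopen(url) as resp:
        definition = json.load(resp)
    return definition["licensed"]["declared"]

# Example (requires network access):
# declared_license("npm", "npmjs", None, "lodash", "4.17.21")
```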

The curation process is done through another API call via PATCHes. For example, the below PATCH updates a declared license to Apache-2.0:

"contributionInfo": {
  "summary": "[Test] Update declared license",
  "details": "The declared license should be Apache as per the LICENSE file.",
  "resolution": "Updated declared license to Apache-2.0."
}

This curation is handled by the service for ClearlyDefined, as illustrated in the diagram below. The curation would trigger the creation of a PR in ClearlyDefined’s curated-data repository, which would be reviewed by and signed off by two curators. The PR would then be merged and written in the curated-data store.
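As a sketch of how such a curation body might be assembled programmatically, the helper below wraps the contributionInfo excerpt shown above. The "patches" key (the actual metadata change) is a hypothetical stand-in; consult the ClearlyDefined curation documentation for the authoritative schema and endpoint.

```python
def curation_payload(summary, details, resolution, patches):
    """Assemble a curation request body around a contributionInfo block.
    'patches' is a hypothetical stand-in for the metadata changes
    (keyed by coordinates/revisions in the real schema)."""
    return {
        "contributionInfo": {
            "summary": summary,
            "details": details,
            "resolution": resolution,
        },
        "patches": patches,
    }
```

Submitting such a body as a PATCH would then trigger the curation PR flow described above.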

GitHub has deployed its own local Harvester for ClearlyDefined, as illustrated in the diagram below. GitHub’s OSPO Policy Service posts requests to GitHub’s Harvester for ClearlyDefined, which downloads any components and dependencies from various package managers, scans these components, and writes the results directly to ClearlyDefined’s raw results store. GitHub’s OSPO Policy Service fetches definitions from the service for ClearlyDefined as well as licenses and attributions from GitHub’s Package License Gateway. GitHub maintains a local cache store which is synced with any updates from ClearlyDefined’s changes-notifications blob storage.

ClearlyDefined’s development has seen an increased participation from various organizations this past year, including GitHub, SAP, Microsoft, Bloomberg, and CodeThink.

Currently, maintainers of ClearlyDefined are focused on ongoing maintenance. Key goals for ClearlyDefined in 2024 include:

  • Publishing periodic releases and switching to semantic versioning
  • Bringing dependencies up to date (in particular using the latest scancode)
  • Improving the NOASSERTION/OTHER issue
  • Advancing usability and the curation process through the UI 
  • Enhancing the documentation and process for creating a local harvest

Our slides are available here.

Relevant breakout sessions

ORT Community Days provided several breakout sessions to allow participants to discuss pain points and solutions.

A special discussion around curations was led by Sebastian Schuberth and E. Lynette Rayle. The ORT Package Curation Data can be broken down into two categories: metadata interpretations and legal curations. The group discussed their thoughts about the curation process and its challenges, including handling false positives and the sharing of curations.

Nowadays, no conference would be complete without at least one talk or discussion about Artificial Intelligence. A group gathered to discuss the potential use of AI to improve user experience as well as for OSS compliance. The majority of attendees believed ORT's documentation could be improved through the use of AI, and that even an assistant would be helpful to answer the most common questions. As for the use of AI for OSS compliance, there's a lot of potential here, and one idea would be to use ClearlyDefined's curation dataset to fine-tune an LLM.


The second edition of ORT Community Days represented a unique opportunity for the ClearlyDefined community to better engage with the ORT community. We were able to meet the maintainers and members of ORT and learn from them about the current and future challenges. We were also able to explore how our communities can further collaborate. 

On behalf of the ClearlyDefined community, I would like to thank the organizers of this wonderful event: Marcel Kurzmann, Nikola Babadzhanov, Surya Santhi, and Thomas Steenbergen. I would also like to thank E. Lynette Rayle, Lukas Spieß and Qing Tomlinson from the ClearlyDefined community who have accepted my invitation to participate in this conference.

If you are interested in Open Source supply chain compliance and security, I invite you to learn a bit more about the ClearlyDefined and the ORT communities. You might also be interested in my report from FOSS Backstage.

Categories: FLOSS Research

Three perspectives from FOSS Backstage

Open Source Initiative - Wed, 2024-03-13 10:45

As a community manager, I find FOSS Backstage to be one of my favorite conferences content-wise and community-wise. This is a conference that happens every year in Berlin, usually in early March. It’s a great opportunity to meet community leaders from Europe and across the world with the goal of fostering discussions around three complementary perspectives: a) community health and growth, b) project governance and sustainability, and c) supply chain compliance and security.

Community health and growth

While there were several interesting talks, one of the highlights of the “Community health and growth” track was Tom “spot” Callaway’s talk embracing your weird: community building through fun & play. Tom shared some really interesting ideas to help members bond together: a badge program, a candy swap activity, a coin giveaway, a scavenger hunt, and a karaoke session.

FOSS Backstage this year was special because I got to finally meet 3 members from the ClearlyDefined community who have given a new life to this project: E. Lynette Rayle and Lukas Spieß from GitHub and Qing Tomlinson from SAP. While we did not go into a scavenger hunt or a karaoke session (that would have been fun), we spent most of our time during the week having lunch and dinner together, watching talk sessions together, networking with old and new acquaintances, and even going for some sightseeing in Berlin. This has allowed us to not only share ideas about the future of ClearlyDefined, but most importantly to have fun together and create a strong bond between us.

Please find below a list of interesting talks from this track:

Project governance and sustainability

In last year’s FOSS Backstage, I had the opportunity to meet Thomas Steenbergen for the first time. He’s the co-founder of the ClearlyDefined and OSS Review Toolkit (ORT) communities. Project governance and sustainability is something Thomas deeply cares about, and I was honored to be invited to give a talk together with him at this year’s conference.

Our talk was about aligning wishes of multiple organizations into an Open Source project. This is a challenge that many projects face: oftentimes they struggle to align wishes and get commitment from multiple organizations towards a shared roadmap. There’s also the challenge of the “free rider” problem, where the overuse of a common resource without giving back often leads to the tragedy of the commons. Thomas shared the idea of a collaboration marketplace and a contributor commitment agreement where organizations come together to identify, commit, and implement a common enhancement proposal. This is a strategy that we are applying to ORT and ClearlyDefined.

Our slides are available here.

Please find below a list of interesting talks from this track:

Supply chain compliance and security

Under the “supply chain compliance and security” track, I was happy to watch a wonderful talk from my friend Ana Jimenez Santamaria entitled looking at Open Source security from a community angle. She has been leading the TODO Group at the Linux Foundation for quite a few years now, and it was interesting to learn how they are helping OSPOs (Open Source Program Offices) to create a trusted software supply chain. Ana highlighted three takeaways:

  • OSPOs integrate Open Source in an organization’s IT infrastructure.
  • Collaboration between employees, Open Source staff, and security teams with the Open Source ecosystem offers a complete security coverage across the whole supply chain.
  • OSPOs have the important mission of achieving digitalization, innovation and security in a healthy and continuous way.

Please find below a list of interesting talks from this track:

Bonus: Open Source AI

Nowadays, no conference would be complete without at least one talk about Artificial Intelligence, so Frank Karlitschek’s keynote what the AI revolution means for Open Source and our society was very welcome! Frank demonstrated that Open Source AI can indeed compete with proprietary solutions from the big players. He presented Nextcloud Assistant that runs locally, and that can be studied and modified. This assistant offers several exciting features: face recognition in photos, text translation, text summarization, text generation, image generation, speech transcript, and document classification –  all this while preserving privacy.

It’s worth pointing out that the Open Source Initiative is driving a multi-stakeholder process to define an “Open Source AI” and everyone is welcome to be part of the conversation.


I had a wonderful time at FOSS Backstage and I invite everyone interested in community, governance, and supply chain to join this amazing event next year. I would like to thank the organizers who work “backstage” to put together this conference. Thank you Paul Berschick, Sven Spiller, Alexander Brateanu, Isabel Drost-Fromm, Anne Sophie Riege, and Stefan Rudnitzki. A special thanks also to the volunteers, speakers, sponsors, and last but not least to all attendees who made this event special.

If you are interested in Open Source supply chain compliance and security, I invite you to learn a bit more about the ClearlyDefined and the ORT communities. Be sure to check out my report from the ORT Community Days.

Categories: FLOSS Research

GNU Guix: Adventures on the quest for long-term reproducible deployment

GNU Planet! - Wed, 2024-03-13 10:05

Rebuilding software five years later, how hard can it be? It can’t be that hard, especially when you pride yourself on having a tool that can travel in time and that does a good job at ensuring reproducible builds, right?

In hindsight, we can tell you: it’s more challenging than it seems. Users attempting to travel 5 years back with guix time-machine are (or were) unavoidably going to hit bumps on the road—a real problem because that’s one of the use cases Guix aims to support well, in particular in a reproducible research context.

In this post, we look at some of the challenges we face while traveling back, how we are overcoming them, and open issues.

The vision

First of all, one clarification: Guix aims to support time travel, but we’re talking of a time scale measured in years, not in decades. We know all too well that this is already very ambitious—it’s something that probably nobody except Nix and Guix is even trying. More importantly, software deployment at the scale of decades calls for very different, more radical techniques; it’s the work of archivists.

Concretely, Guix 1.0.0 was released in 2019 and our goal is to allow users to travel as far back as 1.0.0 and redeploy software from there, as in this example:

$ guix time-machine -q --commit=v1.0.0 -- \
    environment --ad-hoc python2 -- python
guile: warning: failed to install locale
Python 2.7.15 (default, Jan 1 1970, 00:00:01)
[GCC 5.5.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

(The command above uses guix environment, the predecessor of guix shell, which didn’t exist back then.) It’s only 5 years ago but it’s pretty much remote history on the scale of software evolution—in this case, that history comprises major changes in Guix itself and in Guile. How well does such a command work? Well, it depends.

The project has two build farms; bordeaux.guix.gnu.org has been keeping substitutes (pre-built binaries) of everything it built since roughly 2021, while ci.guix.gnu.org keeps substitutes for roughly two years, but there is currently no guarantee on the duration substitutes may be retained. Time traveling to a period where substitutes are available is fine: you end up downloading lots of binaries, but that’s OK, you rather quickly have your software environment at hand.

Bumps on the build road

Things get more complicated when targeting a period in time for which substitutes are no longer available, as was the case for v1.0.0 above. (And really, we should assume that substitutes won’t remain available forever: fellow NixOS hackers recently had to seriously consider trimming their 20-year-long history of substitutes because the costs are not sustainable.)

Apart from the long build times, the first problem that arises in the absence of substitutes is source code unavailability. I’ll spare you the details for this post—that problem alone would deserve a book. Suffice to say that we’re lucky that we started working on integrating Guix with Software Heritage years ago, and that there has been great progress over the last couple of years to get closer to full package source code archival (more precisely: 94% of the source code of packages available in Guix in January 2024 is archived, versus 72% of the packages available in May 2019).

So what happens when you run the time-machine command above? It brings you to May 2019, a time for which none of the official build farms had substitutes until a few days ago. Ideally, thanks to isolated build environments, you’d build things for hours or days, and in the end all those binaries will be here just as they were 5 years ago. In practice though, there are several problems that isolation as currently implemented does not address.

Among those, the most frequent problem is time traps: software build processes that fail after a certain date (these are also referred to as “time bombs” but we’ve had enough of these and would rather call for a ceasefire). This plagues a handful of packages out of almost 30,000 but unfortunately we’re talking about packages deep in the dependency graph. Here are some examples:

  • OpenSSL unit tests fail after a certain date because some of the X.509 certificates they use have expired.
  • GnuTLS had similar issues; newer versions rely on datefudge to fake the date while running the tests and thus avoid that problem altogether.
  • Python 2.7, found in Guix 1.0.0, also had that problem with its TLS-related tests.
  • OpenJDK would fail to build at some point with this interesting message: Error: time is more than 10 years from present: 1388527200000 (the build system would consider that its data about currencies is likely outdated after 10 years).
  • Libgit2, a dependency of Guix, had (has?) time-dependent tests.
  • MariaDB tests started failing in 2019.
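To make the failure mode concrete, here is a time trap in miniature. This is illustrative Python, not code from any of the packages above, and the expiry date is made up, standing in for an expired X.509 test certificate:

```python
from datetime import date

# Hypothetical expiry date, in the spirit of the expired test
# certificates that broke the OpenSSL and Python 2.7 test suites.
CERT_NOT_AFTER = date(2021, 6, 1)

def check_certificate(today):
    """Fail the 'build' if the bundled test certificate has expired."""
    if today > CERT_NOT_AFTER:
        raise RuntimeError("certificate expired: test suite fails")
    return "tests passed"

# Building before the expiry date succeeds...
print(check_certificate(date(2019, 5, 1)))   # tests passed

# ...while the exact same "source" fails years later.
try:
    check_certificate(date(2024, 3, 1))
except RuntimeError as err:
    print(err)
```

Nothing in the source changed between the two runs; only the clock did. That is why controlling the build environment’s date matters for time travel.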

Someone traveling to v1.0.0 will hit several of these, preventing guix time-machine from completing. A serious bummer, especially to those who’ve come to Guix from the perspective of making their research workflow reproducible.

Time traps are the main road block, but there’s more! In rare cases, there’s software influenced by kernel details not controlled by the build daemon:

In a handful of cases, but important ones, builds might fail when performed on certain CPUs. We’re aware of at least two cases:

Neither time traps nor those obscure hardware-related issues can be avoided with the isolation mechanism currently used by the build daemon. This harms time traveling when substitutes are unavailable. Giving up is not in the ethos of this project though.

Where to go from here?

There are really two open questions here:

  1. How can we tell which packages need to be “fixed”, and how: building at a specific date, on a specific CPU?
  2. How can we keep those aspects of the build environment (time, CPU variant) under control?

Let’s start with #2. Before looking for a solution, it’s worth remembering where we come from. The build daemon runs build processes with a separate root file system, under dedicated user IDs, and in separate Linux namespaces, thereby minimizing interference with the rest of the system and ensuring a well-defined build environment. This technique was implemented by Eelco Dolstra for Nix in 2007 (with namespace support added in 2012), at a time where the word container had to do with boats and before “Docker” became the name of a software tool. In short, the approach consists in controlling the build environment in every detail (it’s at odds with the strategy that consists in achieving reproducible builds in spite of high build environment variability). That these are mere processes with a bunch of bind mounts makes this approach inexpensive and appealing.

Realizing we’d also want to control the build environment’s date, we naturally turn to Linux namespaces to address that—Dolstra, Löh, and Pierron already suggested something along these lines in the conclusion of their 2010 Journal of Functional Programming paper. Turns out there is now a time namespace. Unfortunately it’s limited to CLOCK_MONOTONIC and CLOCK_BOOTTIME clocks; the manual page states:

Note that time namespaces do not virtualize the CLOCK_REALTIME clock. Virtualization of this clock was avoided for reasons of complexity and overhead within the kernel.

I hear you say: What about datefudge and libfaketime? These rely on the LD_PRELOAD environment variable to trick the dynamic linker into pre-loading a library that provides symbols such as gettimeofday and clock_gettime. This is a fine approach in some cases, but it’s too fragile and too intrusive when targeting arbitrary build processes.
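The fragility has a rough Python analogy (an illustration of the weakness, not of how LD_PRELOAD works internally): patching one clock-reading function fools only the callers that reach the clock through that exact function, just as a pre-loaded gettimeofday does nothing for a statically linked binary or a direct system call.

```python
import time
from unittest.mock import patch

FAKE_EPOCH = 1_556_668_800  # 2019-05-01T00:00:00Z, a made-up "build date"

def build_timestamp():
    # Reads the clock through the patched entry point.
    return time.time()

with patch("time.time", return_value=FAKE_EPOCH):
    print(build_timestamp())  # sees the fake clock: 1556668800
    # Anything reading the clock through another path (a C extension,
    # a subprocess, a raw syscall) is unaffected, which is the same
    # hole that LD_PRELOAD-based tricks leave open.
```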

That leaves us with essentially one viable option: virtual machines (VMs). The full-system QEMU lets you specify the initial real-time clock of the VM with the -rtc flag, which is exactly what we need (“user-land” QEMU such as qemu-x86_64 does not support it). And of course, it lets you specify the CPU model to emulate.

News from the past

Now, the question is: where does the VM fit? The author considered writing a package transformation that would change a package such that it’s built in a well-defined VM. However, that wouldn’t really help: this option didn’t exist in past revisions, and it would lead to a different build anyway from the perspective of the daemon—a different derivation.

The best strategy appeared to be offloading: the build daemon can offload builds to different machines over SSH, we just need to let it send builds to a suitably-configured VM. To do that, we can reuse some of the machinery initially developed for childhurds that takes care of setting up offloading to the VM: creating substitute signing keys and SSH keys, exchanging secret key material between the host and the guest, and so on.

The end result is a service for Guix System users that can be configured in a few lines:

(use-modules (gnu services virtualization))

(operating-system
  ;; …
  (services (append (list (service virtual-build-machine-service-type))
                    %base-services)))

The default setting above provides a 4-core VM whose initial date is January 2020, emulating a Skylake CPU from that time—the right setup for someone willing to reproduce old binaries. You can check the configuration like this:

$ sudo herd configuration build-vm
CPU: Skylake-Client
number of CPU cores: 4
memory size: 2048 MiB
initial date: Wed Jan 01 00:00:00Z 2020

To enable offloading to that VM, one has to explicitly start it, like so:

$ sudo herd start build-vm

From there on, every native build is offloaded to the VM. The key part is that with almost no configuration, you get everything set up to build packages “in the past”. It’s a Guix System-only solution; if you run Guix on another distro, you can set up a similar build VM, but you’ll have to go through the cumbersome setup steps that are all taken care of automatically here.

Of course it’s possible to choose different configuration parameters:

(service virtual-build-machine-service-type
         (virtual-build-machine
          (date (make-date 0 0 00 00 01 10 2017 0)) ;further back in time
          (cpu "Westmere")
          (cpu-count 16)
          (memory-size (* 8 1024))
          (auto-start? #t)))

With a build VM with its date set to January 2020, we have been able to rebuild Guix and its dependencies along with a bunch of packages such as emacs-minimal from v1.0.0, overcoming all the time traps and other challenges described earlier. As a side effect, substitutes are now available from ci.guix.gnu.org so you can even try this at home without having to rebuild the world:

$ guix time-machine -q --commit=v1.0.0 -- build emacs-minimal --dry-run
guile: warning: failed to install locale
substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
38.5 MB would be downloaded:
   /gnu/store/53dnj0gmy5qxa4cbqpzq0fl2gcg55jpk-emacs-minimal-26.2

For the fun of it, we went as far as v0.16.0, released in December 2018:

guix time-machine -q --commit=v0.16.0 -- \
    environment --ad-hoc vim -- vim --version

This is the furthest we can go since channels and the underlying mechanisms that make time travel possible did not exist before that date.

There’s one “interesting” case we stumbled upon in that process: in OpenSSL 1.1.1g (released April 2020 and packaged in December 2020), some of the test certificates are not valid before April 2020, so the build VM needs to have its clock set to May 2020 or thereabouts. Booting the build VM with a different date can be done without reconfiguring the system:

$ sudo herd stop build-vm
$ sudo herd start build-vm -- -rtc base=2020-05-01T00:00:00

The -rtc … flags are passed straight to QEMU, which is handy when exploring workarounds…

The time-travel continuous integration jobset has been set up to check that we can, at any time, travel back to one of the past releases. This at least ensures that Guix itself and its dependencies have substitutes available at ci.guix.gnu.org.

Reproducible research workflows reproduced

Incidentally, this effort rebuilding 5-year-old packages has allowed us to fix embarrassing problems. Software that accompanies research papers that followed our reproducibility guidelines could no longer be deployed, at least not without the clock-twiddling effort described here.

It’s good news that we can now re-deploy these 5-year-old software environments with minimum hassle; it’s bad news that holding this promise took extra effort.

The ability to reproduce the environment of software that accompanies research work should not be considered a mundanity or an exercise that’s “overkill”. The ability to rerun, inspect, and modify software are the natural extension of the scientific method. Without a companion reproducible software environment, research papers are merely the advertisement of scholarship, to paraphrase Jon Claerbout.

The future

The astute reader surely noticed that we didn’t answer question #1 above:

How can we tell which packages need to be “fixed”, and how: building at a specific date, on a specific CPU?

It’s a fact that Guix so far lacks information about the date, kernel, or CPU model that should be used to build a given package. Derivations purposefully lack that information on the grounds that it cannot be enforced in user land and is rarely necessary—which is true, but “rarely” is not the same as “never”, as we saw. Should we create a catalog of date, CPU, and/or kernel annotations for packages found in past revisions? Should we define, for the long-term, an all-encompassing derivation format? If we did and effectively required virtual build machines, what would that mean from a bootstrapping standpoint?

Here’s another option: build packages in VMs running in the year 2100, say, and on a baseline CPU. We don’t need to require all users to set up a virtual build machine—that would be impractical. It may be enough to set up the project build farms so they build everything that way. This would allow us to catch time traps and year 2038 bugs before they bite.
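The year 2038 part of the problem is easy to demonstrate with a few lines of stock Python (a toy sketch, unrelated to Guix internals): any code that still stores POSIX time as a signed 32-bit integer overflows on 2038-01-19, and building “in the future” is precisely how such bugs would surface early.

```python
import struct
from datetime import datetime, timezone

def to_epoch32(dt):
    """Pack a timestamp the way legacy 32-bit code does: signed 32-bit seconds."""
    seconds = int(dt.timestamp())
    return struct.pack("<i", seconds)  # raises struct.error on overflow

ok = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(len(to_epoch32(ok)))  # fits in 4 bytes

# Past 2**31 - 1 seconds (2038-01-19 03:14:07 UTC), the pack fails.
doom = datetime(2038, 1, 20, tzinfo=timezone.utc)
try:
    to_epoch32(doom)
except struct.error:
    print("year 2038 overflow")
```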

Before we can do that, the virtual-build-machine service needs to be optimized. Right now, offloading to build VMs is as heavyweight as offloading to a separate physical build machine: data is transferred back and forth over SSH over TCP/IP. The first step will be to run SSH over a paravirtualized transport such as AF_VSOCK sockets instead. Another avenue would be to make /gnu/store in the guest VM an overlay over the host store so that inputs do not need to be transferred and copied.

Until then, happy software (re)deployment!


Thanks to Simon Tournier for insightful comments on a previous version of this post.

Categories: FLOSS Project Planets

Real Python: Visualizing Data in Python With Seaborn

Planet Python - Wed, 2024-03-13 10:00

If you have some experience using Python for data analysis, chances are you’ve produced some data plots to explain your analysis to other people. Most likely you’ll have used a library such as Matplotlib to produce these. If you want to take your statistical visualizations to the next level, you should master the Python seaborn library to produce impressive statistical analysis plots that will display your data.

In this tutorial, you’ll learn how to:

  • Make an informed judgment as to whether or not seaborn meets your data visualization needs
  • Understand the principles of seaborn’s classic Python functional interface
  • Understand the principles of seaborn’s more contemporary Python objects interface
  • Create Python plots using seaborn’s functions
  • Create Python plots using seaborn’s objects

Before you start, you should familiarize yourself with the Jupyter Notebook data analysis tool available in JupyterLab. Although you can follow along with this seaborn tutorial using your favorite Python environment, Jupyter Notebook is preferred. You might also like to learn how a pandas DataFrame stores its data. Knowing the difference between a pandas DataFrame and Series will also prove useful.

So now it’s time for you to dive right in and learn how to use seaborn to produce your Python plots.

Free Bonus: Click here to download the free code that you can experiment with in Python seaborn.

Getting Started With Python seaborn

Before you use seaborn, you must install it. Open a Jupyter Notebook and type !python -m pip install seaborn into a new code cell. When you run the cell, seaborn will install. If you’re working at the command line, use the same command, only without the exclamation point (!). Once seaborn is installed, Matplotlib, pandas, and NumPy will also be available. This is handy because sometimes you need them to enhance your Python seaborn plots.

Before you can create a plot, you do, of course, need data. Later, you’ll create several plots using different publicly available datasets containing real-world data. To begin with, you’ll work with some sample data provided for you by the creators of seaborn. More specifically, you’ll work with their tips dataset. This dataset contains data about each tip that a particular restaurant waiter received over a few months.

Creating a Bar Plot With seaborn

Suppose you wanted to see a bar plot showing the average amount of tips received by the waiter each day. You could write some Python seaborn code to do this:

Python

In [1]: import matplotlib.pyplot as plt
   ...: import seaborn as sns
   ...:
   ...: tips = sns.load_dataset("tips")
   ...:
   ...: (
   ...:     sns.barplot(
   ...:         data=tips, x="day", y="tip",
   ...:         estimator="mean", errorbar=None,
   ...:     )
   ...:     .set(title="Daily Tips ($)")
   ...: )
   ...:
   ...: plt.show()

First, you import seaborn into your Python code. By convention, you import it as sns. Although you can use any alias you like, sns is a nod to the fictional character the library was named after.

To work with data in seaborn, you usually load it into a pandas DataFrame, although other data structures can also be used. The usual way of loading data is to use the pandas read_csv() function to read data from a file on disk. You’ll see how to do this later.

To begin with, because you’re working with one of the seaborn sample datasets, seaborn allows you online access to these using its load_dataset() function. You can see a list of the freely available files on their GitHub repository. To obtain the one you want, all you need to do is pass load_dataset() a string telling it the name of the file containing the dataset you’re interested in, and it’ll be loaded into a pandas DataFrame for you to use.

The actual bar plot is created using seaborn’s barplot() function. You’ll learn more about the different plotting functions later, but for now, you’ve specified data=tips as the DataFrame you wish to use and also told the function to plot the day and tip columns from it. These contain the day the tip was received and the tip amount, respectively.

The important point you should notice here is that the seaborn barplot() function, like all seaborn plotting functions, can understand pandas DataFrames instinctively. To specify a column of data for them to use, you pass its column name as a string. There’s no need to write pandas code to identify each Series to be plotted.

The estimator="mean" parameter tells seaborn to plot the mean y values for each category of x. This means your plot will show the average tip for each day. You can quickly customize this to instead use common statistical functions such as sum, max, min, and median, but estimator="mean" is the default. The plot will also show error bars by default. By setting errorbar=None, you can suppress them.
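Since estimator="mean" is just a per-category aggregation, you can sanity-check what the bars will show with plain Python. The numbers below are invented, not the real tips data:

```python
from statistics import mean

# Toy stand-in for the tips dataset: (day, tip) pairs.
records = [
    ("Thur", 2.00), ("Thur", 3.00),
    ("Fri",  1.50), ("Fri",  2.50),
    ("Sat",  3.00), ("Sat",  5.00),
    ("Sun",  4.00), ("Sun",  4.50),
]

# Group tips by day, then apply the estimator to each group,
# which is what barplot(..., estimator="mean") does before drawing.
by_day = {}
for day, tip in records:
    by_day.setdefault(day, []).append(tip)

bar_heights = {day: mean(tips) for day, tips in by_day.items()}
print(bar_heights)  # {'Thur': 2.5, 'Fri': 2.0, 'Sat': 4.0, 'Sun': 4.25}
```

Swapping mean for median here mirrors passing estimator="median" to barplot().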

The barplot() function will produce a plot using the parameters you pass to it, and it’ll label each axis using the column name of the data that you want to see. Once barplot() is finished, it returns a matplotlib Axes object containing the plot. To give the plot a title, you need to call the Axes object’s .set() method and pass it the title you want. Notice that this was all done from within seaborn directly, and not Matplotlib.

Note: You may be wondering why the barplot() function is encapsulated within a pair of parentheses (...). This is a coding style often used in seaborn code because it frequently uses method chaining. These extra brackets allow you to horizontally align method calls, starting each with its dot notation. Alternatively, you could use the backslash (\) for line continuation, although that is discouraged.

If you take another look at the code, the alignment of .set() is only possible because of these extra encasing brackets. You’ll see this coding style used throughout this tutorial, as well as when you read the seaborn documentation.
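This parenthesized chaining style is ordinary Python rather than anything seaborn-specific. A minimal illustration with plain strings:

```python
# Wrapping a chain in parentheses lets each method call start on its
# own line, aligned on the dot, with no backslash continuations.
title = (
    "  daily tips ($)  "
    .strip()
    .title()
)
print(title)  # Daily Tips ($)
```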

In some environments like IPython and PyCharm, you may need to use Matplotlib’s show() function to display your plot, meaning you must import Matplotlib into Python as well. If you’re using a Jupyter notebook, then using plt.show() isn’t necessary, but using it removes some unwanted text above your plot. Placing a semicolon (;) at the end of barplot() will also do this for you.

When you run the code, the resulting plot will look like this:

As you can see, the waiter’s daily average tips rise slightly on the weekends. It looks as though people tip more when they’re relaxed.

Note: One thing you should be aware of is that load_dataset(), unlike read_csv(), will automatically convert string columns into the pandas Categorical data type for you. You use this where your data contains a limited, fixed number of possible values. In this case, the day column of data will be treated as a Categorical data type containing the days of the week. You can see this by using tips["day"] to view the column:

Python

In [2]: tips["day"]
Out[2]:
0       Sun
1       Sun
2       Sun
3       Sun
4       Sun
       ...
239     Sat
240     Sat
241     Sat
242     Sat
243    Thur
Name: day, Length: 244, dtype: category
Categories (4, object): ['Thur', 'Fri', 'Sat', 'Sun']

As you can see, your day column has a data type of category. Note, also, that while your original data starts with Sun, the first entry in the category is Thur. In creating the category, the days have been interpreted for you in the correct order. The read_csv() function doesn’t do this.
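If you load your own data with read_csv(), you can opt into the same behavior explicitly. The sketch below assumes pandas is installed, and the sample values are invented:

```python
import pandas as pd

# read_csv() would leave this column as plain strings.
days = pd.Series(["Sun", "Sat", "Thur", "Sun", "Fri"])

# Convert it to an ordered Categorical so the weekdays sort in
# calendar order rather than alphabetically.
day_type = pd.CategoricalDtype(
    categories=["Thur", "Fri", "Sat", "Sun"], ordered=True
)
days = days.astype(day_type)

print(days.dtype)                  # category
print(list(days.cat.categories))   # ['Thur', 'Fri', 'Sat', 'Sun']
```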

Read the full article at https://realpython.com/python-seaborn/ »


Categories: FLOSS Project Planets

Qt for MCUs 2.7 released

Planet KDE - Wed, 2024-03-13 09:20

A new version of Qt for MCUs is available, bringing new features to the Qt Quick Ultralite engine, additional microcontrollers, and various improvements to our GUI framework for resource-constrained embedded systems.

Categories: FLOSS Project Planets