Russ Allbery: Review: Tess of the Road

Planet Debian - Mon, 2022-12-19 23:19

Review: Tess of the Road, by Rachel Hartman

Series: Tess of the Road #1
Publisher: Random House
Copyright: 2018
Printing: 2022
ISBN: 1-101-93130-2
Format: Kindle
Pages: 536

Tess of the Road is the first book of a YA fantasy duology set in the same universe as Seraphina and Shadow Scale.

It's hard to decide what to say about reading order (and I now appreciate the ambiguous answers I got). Tess of the Road is a sequel to Seraphina and Shadow Scale in the sense that there are numerous references to the previous duology, but it has a different protagonist and different concerns. You don't need to read the other duology first, but Tess of the Road will significantly spoil the resolution of the romance plot in Seraphina, and it will be obvious that you've skipped over background material. That said, Shadow Scale is not a very good book, and this is a much better book.

I guess the summary is this: if you're going to read the first duology, read it first, but don't feel obligated to do so.

Tess was always a curious, adventurous, and some would say unmanageable girl, nothing like her twin. Jeanne is modest, obedient, virtuous, and practically perfect in every way. Tess is not; after a teenage love affair resulting in an out-of-wedlock child and a boy who disappeared rather than marry her, their mother sees no alternative but to lie about which of the twins is older. If Jeanne can get a good match among the nobility, the family finances may be salvaged. Tess's only remaining use is to help her sister find a match, and then she can be shuffled off to a convent.

Tess throws herself into court politics and does exactly what she's supposed to. She engineers not only a match, but someone Jeanne sincerely likes. Tess has never lacked competence. But this changes nothing about her mother's view of her, and Tess is depressed, worn, and desperately miserable in Jeanne's moment of triumph. Jeanne wants Tess to stay and become the governess of her eventual children, retaining their twin bond, the two of them against the world. Their older sister Seraphina, more perceptively, tries to help her join an explorer's expedition. Tess, in a drunken spiral of misery, insults everyone and runs away, with only a new pair of boots and a pack of food.

This is going to be one of those reviews where the things I didn't like are exactly the things other readers liked. I see why people loved this book, and I wish I had loved it too. Instead, I liked parts of it a great deal and found other parts frustrating or a bit too off-putting. Mostly this is a preference problem rather than a book problem.

My most objective complaint is the pacing, which was also my primary complaint about Shadow Scale. It was not hard to see where Hartman was going with the story, I like that story, I was on board with going there, but getting there took for-EV-er. This is a 536 page book that I would have edited to less than 300 pages. It takes nearly a hundred pages to get Tess on the road, and while some of that setup is necessary, I did not want to wallow in Tess's misery and appalling home life for quite that long.

A closely related problem is that Hartman continues to love flashbacks. Even after Tess has made her escape, we get the entire history of her awful teenage years slowly dribbled out over most of the book. Sometimes this is revelatory; mostly it's depressing. I had guessed the outlines of what had happened early in the book (it's not hard), and that was more than enough motivation for me, but Hartman was determined to make the reader watch every crisis and awful moment in detail. This is exactly what some readers want, and sometimes it's even what I want, but not here. I found the middle of the book, where the story is mostly flashbacks and flailing, to be an emotional slog.

Part of the problem is that Tess has an abusive mother and goes through the standard abuse victim process of being sure that she's the one who's wrong and that her mother is justified in her criticism. This is certainly realistic, and it eventually leads to some satisfying catharsis as Tess lets go of her negative self-image. But Tess's mother is a narcissistic religious fanatic with a persecution complex and not a single redeeming quality whatsoever, and I loathed reading about her, let alone reading Tess tiptoeing around making excuses for her. The point of this in the story is for Tess to rebuild her self-image, and I get it, and I'm sure this will work for some readers, but I wanted Tess's mother (and the rest of her family except her sisters) to be eaten by dragons. I do not like the emotional experience of hating a character in a book this much.

Where Tess of the Road is on firmer ground is when Tess has an opportunity to show her best qualities, such as befriending a quigutl in childhood and, in the sort of impulsive decision that shows her at her best, learning their language. (For those who haven't read the previous books, quigutls are a dog-sized subspecies of dragon that everyone usually treats like intelligent animals, although they're clearly more than that.) Her childhood quigutl friend Pathka becomes her companion on the road, which both gives her wanderings some direction and adds some useful character interaction.

Pathka comes with a plot that is another one of those elements that I think will work for some readers but didn't work for me. He's in search of a Great Serpent, a part of quigutl mythology that neither humans nor dragons pay attention to. That becomes the major plot of the novel apart from Tess's emotional growth. Pathka also has a fraught relationship with his own family, which I think was supposed to parallel Tess's relationships but never clicked for me. I liked Tess's side of this relationship, but Pathka is weirdly incomprehensible and apparently fickle in ways that I found unsatisfying. I think Hartman was going for an alien tone that didn't quite work for me.

This is a book that gets considerably better as it goes along, and the last third of the book was great. I didn't like being dragged through the setup, but I loved the character Tess became. Once she reaches the road crew, this was a book full of things that I love reading about. The contrast between her at the start of the book and the end is satisfying and rewarding. Tess's relationship with her twin Jeanne deserves special mention; their interaction late in the book is note-perfect and much better than I had expected.

Unfortunately, Tess of the Road doesn't have a real resolution. It's only the first half of Tess's story, which comes back to that pacing problem. Ah well.

I enjoyed this but I didn't love it. The destination was mostly worth the journey, but I thought the journey was much too long and I had to spend too much time in the company of people I hated far more intensely than was comfortable. I also thought the middle of the book sagged, a problem I have now had with two out of three of Hartman's books. But I can see why other readers with slightly different preferences loved it. I'm probably invested enough to read the sequel, although I'm a bit grumbly that the sequel is necessary.

Followed by In the Serpent's Wake.

Rating: 7 out of 10

Categories: FLOSS Project Planets

Ian Jackson: Rust for the Polyglot Programmer, December 2022 edition

Planet Debian - Mon, 2022-12-19 20:47

I have reviewed, updated and revised my short book about the Rust programming language, Rust for the Polyglot Programmer.

It now covers some language improvements from the past year (noting which versions of Rust they’re available in), and has been updated for changes in the Rust library ecosystem.

With (further) assistance from Mark Wooding, there is also a new table of recommendations for numerical conversion.

Recap about Rust for the Polyglot Programmer

There are many introductory materials about Rust. This one is rather different. Compared to much other information about Rust, Rust for the Polyglot Programmer is:

  • Dense: I assume a lot of starting knowledge. Or to look at it another way: I expect my reader to be able to look up and digest non-Rust-specific words or concepts.

  • Broad: I cover not just the language and tools, but also the library ecosystem, development approach, community ideology, and so on.

  • Frank: much material about Rust has a tendency to gloss over or minimise the bad parts. I don’t do that. That also frees me to talk about strategies for dealing with the bad parts.

  • Non-neutral: I’m not afraid to recommend particular libraries, for example. I’m not afraid to extol Rust’s virtues in the areas where it does well.

  • Terse, and sometimes shallow: I often gloss over what I see as unimportant or fiddly details; instead I provide links to appropriate reference materials.

After reading Rust for the Polyglot Programmer, you won’t know everything you need to know to use Rust for any project, but should know where to find it.

Comments are welcome of course, via the Dreamwidth comments or Salsa issue or MR. (If you’re making a contribution, please indicate your agreement with the Developer Certificate of Origin.)

edited 2022-12-20 01:48 to fix a typo

Categories: FLOSS Project Planets

Wingware: Wing Python IDE Version 9.0.2 - December 20, 2022

Planet Python - Mon, 2022-12-19 20:00

Wing 9.0.2 speeds up the debugger during module imports, fixes several issues with match/case, corrects initial directory used with 'python -m', fixes auto-refresh of version control status, adds commands for traversing current selection in the multi-selection dialog, and fixes some stability issues.

See the change log for details.

Download Wing 9.0.2 Now: Wing Pro | Wing Personal | Wing 101 | Compare Products

What's New in Wing 9

Support for Python 3.11

Wing 9 adds support for Python 3.11, the next major release of Python, so you can take advantage of Python 3.11's substantially improved performance and new features.

Debugger Optimizations

Wing 9 reduces debugger overhead by about 20-50% in Python 3.7+. The exact amount of performance improvement you will see depends on the nature of the code that is being debugged and the Python version that you are using.

Streamlined Light and Dark Theming

Wing 9 allows configuring a light and dark theme independently (on the first Preferences page) in order to make it easier to switch between light and dark modes. Two new light themes New Light and Faerie Storm have been added, and switching display modes should be faster and smoother visually.

Other Improvements

Wing 9 also shows auto-invocation arguments for methods of super(), fixes a number of issues affecting code analysis and multi-threaded debugging, and makes several other improvements.

For a complete list of new features in Wing 9, see What's New in Wing 9.

Try Wing 9 Now!

Wing 9 is an exciting new step for Wingware's Python IDE product line. Find out how Wing 9 can turbocharge your Python development by trying it today.

Downloads: Wing Pro | Wing Personal | Wing 101 | Compare Products

See Upgrading for details on upgrading from Wing 8 and earlier, and Migrating from Older Versions for a list of compatibility notes.

Categories: FLOSS Project Planets

Freexian Collaborators: Recent improvements to Tryton's Debian Packaging (by Mathias Behrle and Raphaël Hertzog)

Planet Debian - Mon, 2022-12-19 19:00

Freexian has been using Tryton for a few years to handle its invoicing and accounting. We have thus also been using the Debian packages maintained by Mathias Behrle, and we have been funding some of his work because maintaining an ERP with more than 50 source packages was too much for him to handle alone in his free time.

When Mathias discovered our Project Funding initiative, it was quite natural for him to consider applying to be able to bring some much needed improvements to Tryton’s Debian packaging. He’s running his own consulting company (MBSolutions) so it’s easy for him to invoice Freexian to get the money for the funded projects.

What follows is Mathias Behrle’s description of the projects that he submitted and of the work that he achieved.

If you want to contact him, you can reach out to mathiasb@m9s.biz or mbehrle@debian.org. You can also follow him on Mastodon.


In January 2022 I applied for two projects in the Freexian Project Funding Initiative.

  • Tryton Project 1

    • The starting point of this project was Debian Bug #998319: tryton-server should provide a ready-to-use production-grade server config.

      To address this problem instead of only providing configuration snippets the idea was to provide a full featured guided setup of a Tryton production environment, thus eliminating the risks of trial and error for the system administrator.

  • Tryton Project 2

    • The goal of this project was to complete the available Tryton modules in Debian main with the latest set available from tryton.org and to automate the task of creating new Debian packages from Tryton modules as much as possible.

As the result of Task 1, several new packages emerged:

  • tryton-server-postgresql provides the guided setup of a PostgreSQL database backend.
  • tryton-server-uwsgi provides the installation and configuration of a robust WSGI server on top of tryton-server.
  • tryton-server-nginx provides the configuration of a scalable web frontend to the uwsgi server, including the optional setup of secure access by Letsencrypt certificates.
  • tryton-server-all-in-one puts it all together to provide a fully functional Tryton production environment, including a database filled with basic static data. With the installation of this package a robust and secure production grade setup is possible from scratch, all configuration leg work is done in the background.

The work was thoroughly reviewed by Neil Williams. Thanks go to him for his detailed feedback providing very valuable information from the view of a fresh Tryton user getting in first contact with the software. A cordial thank you as well goes to the translation teams providing initial reviews and translations for the configuration questions.

The efforts of Task 1 were completed with Task 2:

  • A Tryton specific version of PyPi2deb was created to help in the preparation of new Debian packages for new Tryton modules.
  • All missing Tryton modules for the current series were packaged for Debian.

On top of those two planned projects, I completed an additional task: the packaging of the Tryton Web Client.

The Web Client is a quite important feature to access a Tryton server with the browser and even a crucial requirement for some companies. Unfortunately the packaging of the Web Client for Debian was problematic from the beginning. tryton-sao requires exact versions of JavaScript libraries that are almost never guaranteed to be available in the different targeted Debian releases. Therefore a package with vendored libraries has been created and will hopefully soon hit the Debian main archive. The package is already available from the Tryton Backport Mirror for the usually supported Debian releases.


I am very pleased that the Tryton suite in Debian has gained full coverage of Tryton modules and a user-friendly installation. The completion of the project represents a huge step forward in the state-of-the-art deployment of a production grade Tryton environment. Without the monetary support of Freexian’s project funding the realization of this project wouldn’t have been possible in this way and to this extent.

Categories: FLOSS Project Planets

Reproducible Builds (diffoscope): diffoscope 229 released

Planet Debian - Mon, 2022-12-19 19:00

The diffoscope maintainers are pleased to announce the release of diffoscope version 229. This version includes the following changes:

[ Chris Lamb ]
* Skip test_html.py::test_diff if html2text is not installed. (Closes: #1026034)

[ Holger Levsen ]
* Bump standards version to 4.6.2, no changes needed.

You can find out more by visiting the project homepage.

Categories: FLOSS Project Planets

Simon Josefsson: Second impressions of Guix 1.4

Planet Debian - Mon, 2022-12-19 16:38

While my first impression of Guix 1.4rc2 on NV41PZ was only days ago, the final Guix 1.4 release has happened. I thought I should give it a second try, although being at my summer house with no wired ethernet I realized this may be overly optimistic. However I am happy to say that a guided graphical installation on my new laptop went smoothly without any problem. Practicing OS installations has a tendency to make problems disappear.

My WiFi issues last time were probably due to a user interface mistake on my part: you have to press a button to search for wireless networks before seeing them. I’m not sure why I missed this the first time, but maybe the reason was that I didn’t really expect WiFi to work on this laptop, with one Intel-based WiFi card without firmware and a USB-based WiFi dongle. I haven’t gone back to the rc2 image, but I strongly believe it wasn’t a problem with that image but my user mistake. Perhaps some more visual clues could be given that Guix found a usable WiFi interface, as this isn’t completely obvious now.

My main pet problem with the installation is the language menu. It contains a bazillion languages, and I want to find Swedish in it. However the list is only half-sorted: it looks alphabetized, but paging through it I didn’t find ‘svenska’, and noticed that the sorting restarts after a while. Eventually I found my language of choice, but a proper search interface would be better. Typing ‘s’ to find it jumps around in the list. This may be a user interface design problem for me as well: I just may be missing whatever great logic I’m sure there is to find my language in that menu.

I did a simple installation, enabling GNOME, Cups and OpenSSH. Given the experience with sharing /home with my Trisquel installation last time, I chose to not mount it this time, fixing this later on if I want to share files between OSes. Watching the installation proceed with downloading packages over this slow WiFi was meditative, and I couldn’t help but wonder what logic there was to the many steps where it says it is going to download X MB of software, downloads a set of packages, and then starts another iteration saying it is going to download Y MB and then downloads a list of packages. Maybe there is a package dependency tree being worked out while I watch.

After logging into GNOME I had to provide the WiFi password another time; it seems it wasn’t saved during installation, or I was too impatient to wait for WiFi to come up automatically. Using the GNOME WiFi selection menu worked fine. The webcam issue is still present: the image is distorted, and it doesn’t happen in Trisquel. Other than that, everything appears to work, but it has to be put through more testing.

Upgrading Guix after installation is still suffering from the same issue I noticed with the rc2 images, this time I managed to save the error message in case someone wants to provide an official fix or workaround. The initial guix pull command also takes forever, even on this speedy laptop, but after the initial run it is faster. Here are the error messages (pardon the Swedish):

root@kaka ~# guix pull
...
root@kaka ~# guix system reconfigure /etc/config.scm
guix system: fel: aborting reconfiguration because commit 8e2f32cee982d42a79e53fc1e9aa7b8ff0514714 of channel 'guix' is not a descendant of 989a3916dc8967bcb7275f10452f89bc6c3389cc
tips: Use `--allow-downgrades' to force this downgrade.
root@kaka ~#

I’ll avoid using --allow-downgrades this time to see if there is a better solution available.

Update: Problem resolved: my muscle memory typed sudo -i before writing the commands above. If I stick to the suggested ‘guix pull‘ (as user) followed by ‘sudo guix system reconfigure /etc/config.scm‘ everything works. I’ll leave this in case someone else runs into this problem.

Categories: FLOSS Project Planets

Talking Drupal: Talking Drupal #378 - Acquia’s Drupal Acceleration Team

Planet Drupal - Mon, 2022-12-19 14:00

Today we are talking about Acquia’s Drupal Acceleration Team with Tim Plunkett.

For show notes visit: www.talkingDrupal.com/378

  • What is the Drupal Acceleration Team (DAT)
  • Responsibilities
  • Previous releases
  • Office of the CTO - OCTO
  • How big is the team
  • Direction
  • Priorities for new releases
  • Dries’ involvement
  • Contribution %
  • What are you working on now
  • Something you wish you were working on
  • R&D
  • Planning 2-5 years
  • Getting involved
Resources

Guests

Tim Plunkett - @timplunkett


Nic Laflin - www.nLighteneddevelopment.com @nicxvan
John Picozzi - www.epam.com @johnpicozzi
Leslie Glynn - redfinsolutions.com @leslieglynn

MOTW Correspondent

Martin Anderson-Clutz - @mandclu

Keysave

Adds Javascript to allow editors and admins to save an entity or config using command-s or control-s instead of clicking on the submit button.

Categories: FLOSS Project Planets

James Bennett: Boring Python: code quality

Planet Python - Mon, 2022-12-19 13:25

This is the second in a series of posts I intend to write about how to build, deploy, and manage Python applications in as boring a way as possible. In the first post in the series I gave a definition of what I mean by “boring”, and it’s worth revisiting:

I don’t mean “reliable” or “bug-free” or “no incidents”. While there is some overlap, and some of the things I’ll be recommending can help to reduce …

Read full entry

Categories: FLOSS Project Planets

Matthew Wright: Options to run pandas DataFrame.apply in parallel

Planet Python - Mon, 2022-12-19 13:24

A common use case in pandas is to want to apply a function to rows in a DataFrame. For a novice, the temptation can be to iterate through the rows in the DataFrame and pass the data to a function, but that is not a good idea. (You can read this article for a detailed explanation of why). Pandas has a method on both DataFrames and Series that applies a function to the data. Ideally, we want that to be a vectorized implementation. But in many cases a non-vectorized implementation already exists, or the solution cannot be vectorized. If the DataFrame is large enough, or the function slow enough, applying the function can be very time consuming. In those situations, a way to speed things up is to run the code in parallel on multiple CPUs. In this article, I’ll survey a number of popular options for applying functions to pandas DataFrames in parallel.

An example problem

To make things more concrete, let’s consider an example where each row in a DataFrame represents a sample of data. We want to calculate a value from each row. The calculation might be slow. For demonstration purposes, we’ll just invent a CPU intensive task. It turns out calculating arctangent is one such task, so we’ll just make a function that does a lot of that. Our data will be a simple DataFrame with one data point per row, but it will be randomized so that each row is likely to be unique. We want unique data so that optimization via caching or memoization doesn’t impact our comparisons.

import pandas as pd
import numpy as np
import math

# our slow function
def slow_function(start: float) -> float:
    res = 0
    for i in range(int(math.pow(start, 7))):
        res += math.atan(i) * math.atan(i)
    return res

%timeit slow_function(5)
18.5 ms ± 465 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

We can see that this function is fairly slow, so calculating it over hundreds of values will take multiple seconds.

# make sample data
sample = pd.DataFrame({'value': np.random.random(500) + 5})
sample.tail()

        value
495  5.577242
496  5.484517
497  5.136881
498  5.174797
499  5.644561

Running apply

Now if we want to run our slow_function on each row, we can use apply. One quick note on DataFrame.apply: it applies per column by default (axis=0). This means the function will be invoked once per column, and the applied function receives the column (a Series) each time it is called, not each row. If we use axis=1, then apply will pass each row to the function instead. This is the behavior we want here.
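To make the axis distinction concrete, here is a tiny sketch (the column names are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [10, 20]})

# axis=0 (the default): the function receives each *column* as a Series
col_sums = df.apply(lambda col: col.sum())          # a -> 3, b -> 30

# axis=1: the function receives each *row* as a Series
row_sums = df.apply(lambda row: row.sum(), axis=1)  # 11, 22
```

With axis=1, each call sees one row, so indexing by column name (as in the lambda below) picks one value out of that row.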

I’m using a lambda to pick out the value column to pass into the slow_function. At the end, I turn the resulting Series back into a DataFrame.

sample[-5:].apply(lambda r: slow_function(r['value']), axis=1).to_frame(name="value")

             value
495  414125.614960
496  368264.399398
497  232842.530062
498  245144.830221
499  450413.419081

That is a little ugly though. Wouldn’t it be great if we could just use a vectorized solution on the entire column instead? Well, it turns out there’s a very easy way to create a vectorized solution using NumPy: just wrap the function in np.vectorize.

sample[-5:].apply(np.vectorize(slow_function))

             value
495  414125.614960
496  368264.399398
497  232842.530062
498  245144.830221
499  450413.419081

But is this an optimized vectorized solution? Unfortunately it’s not. The docs for np.vectorize point this out:

The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop.

Let’s verify the speeds here with some timings. We’ll also just try running apply on the value column, which is a pandas Series. In this case, there’s only one axis, so it applies the function to each element.

%timeit sample.apply(lambda r: slow_function(r['value']), axis=1)
17.5 s ± 426 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit sample.apply(np.vectorize(slow_function))
17.6 s ± 176 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit sample['value'].apply(slow_function)
17.7 s ± 130 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

So all three of these methods are essentially doing the same thing. While the code for np.vectorize looks nice and clean, it’s not faster. Each solution is running a for loop over each row in the DataFrame (or Series), running our slow_function on each value. So let’s move on to the goal of this article: let’s run this function on multiple cores at once.

Parallel processing

Before we step into running our code on multiple cores, let’s cover a few basics. Everything we’ve done thus far has all been done in one process. This means that the Python code is all running on one CPU core, even if my computer has multiple CPU cores available.

If we want to take advantage of multiple processes or cores at once, we have that option in Python. The basic idea is to run multiple Python processes, and have each one perform a fraction of the calculations. Then all the results are returned to the primary process. For example, if we have 4 cores available, then we should be able to have each core perform 25% of the calculations at the same time. In theory, the job will be done 4 times faster. In reality, it will be less efficient than that.
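The partitioning idea can be sketched with NumPy (this is just an illustration of splitting the work evenly; it is not the mechanism multiprocessing itself uses):

```python
import numpy as np

values = np.arange(500)             # stand-in for our 500 rows
chunks = np.array_split(values, 4)  # one chunk per core

# Each chunk holds 25% of the work; in a parallel run, each worker
# process would handle one chunk at the same time.
sizes = [len(c) for c in chunks]    # [125, 125, 125, 125]
```

In practice the speedup is less than 4x because of the overhead of starting workers and copying data to and from them.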

Comparing implementations

Before we move on to parallel implementations, let’s setup the code we’ll use to compare them.

Note that all of the code samples below are in one Python file (slow_function.py) for your convenience. You can use it to run the timings you’ll see below, or run any implementation from the command line. You can access it here in my github repo and follow along in your own environment.

To run this code, I created a clean virtualenv for this article using pyenv and installed Python 3.9.12. All the projects were installed in the same virtualenv.

For all of these code samples, we’ll assume the following code is available:

import math
import sys
import argparse
import multiprocessing

import numpy as np

def slow_function(start: float) -> float:
    res = 0
    for i in range(int(math.pow(start, 7))):
        res += math.atan(i) * math.atan(i)
    return res

def get_sample():
    data = {'value': np.random.random(500) + 5}
    return data

Here is the default (single CPU) implementation, the same as what we ran above:

def run_default():
    import pandas as pd

    sample = pd.DataFrame(get_sample())
    sample['results'] = sample['value'].apply(slow_function)
    print("Default results:\n", sample.tail(5))

My method for timing this is to run the timeit module on the code above, like this:

python -m timeit "import slow_function; slow_function.run_default()"

Which yields

1 loop, best of 5: 17.4 sec per loop

As seen above, our base problem is about 17 seconds to run. How much can we improve on that?

Core multiprocessing

As a base parallel case, we will implement a solution with the core Python multiprocessing module. Then we will look at a number of popular libraries that make this task easier to implement. You can decide which one is easiest to understand and use for your purposes. We’ll also look at a few interesting tidbits about the projects that can help you make a decision on whether to use them.

The multiprocessing module is fairly straightforward to use. It comes with core python, so there is no extra installation step. We only need to invoke it correctly. There are several ways to use the module, but I’ll show you an example using multiprocessing.Pool.

For more details on multiprocessing, you can read my article that shows basic usage.

Note that multiprocessing doesn’t know about pandas and DataFrames, so to send each row into the pool, we have to provide either the guts of our data, or an iterable.

Some multiprocessing gotchas

FYI: when using multiprocessing, you also might have to put your slow_function in a separate python file since the processes that are launched by the multiprocessing module have to have access to the same functions. This tends to show up on some platforms, like Windows or when running from Jupyter notebooks. In the case where I was running this code in a Jupyter notebook, I saw this error if using functions defined in the notebook: AttributeError: Can't get attribute 'slow_function' on <module '__main__' (built-in)>.
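A minimal sketch of the module-top-level requirement (square here is a made-up stand-in, not part of the article's code):

```python
from multiprocessing import Pool

# Defined at module top level, so worker processes can locate and
# unpickle a reference to it. A lambda, or a function defined only in
# a Jupyter notebook's __main__, would trigger the AttributeError
# mentioned above on some platforms.
def square(x: int) -> int:
    return x * x

if __name__ == "__main__":
    with Pool(processes=2) as pool:
        print(pool.map(square, range(5)))  # prints [0, 1, 4, 9, 16]
```

The `if __name__ == "__main__":` guard matters on platforms that spawn rather than fork: each worker re-imports the module, and without the guard the pool-creation code would run again in every worker.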

This is what a multiprocessing implementation looks like.

def run_multiprocessing():
    import pandas as pd

    sample = pd.DataFrame(get_sample())
    with multiprocessing.Pool(processes=multiprocessing.cpu_count()) as pool:
        results = pool.map(slow_function, sample['value'])
    sample['results'] = results
    print("Multiprocessing results:\n", sample.tail(5))

Again, running this using timeit as follows:

python -m timeit "import slow_function; slow_function.run_multiprocessing()"


1 loop, best of 5: 5.86 sec per loop

Now we can see that the multiprocessing version runs a little more than 3x faster. My machine has 4 real cores (and 8 virtual cores), so this is somewhat in line with expectations. Instead of being 4x faster, it has to deal with a bit of overhead for copying the data and competing with other tasks on my machine. If we had even more cores available, we could further improve the performance.
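Plugging the two timings above into the usual speedup formula shows where the "a little more than 3x" figure comes from:

```python
serial_s = 17.4    # single-process timing from above
parallel_s = 5.86  # 4-process timing from above
cores = 4

speedup = serial_s / parallel_s  # roughly 2.97x, short of the ideal 4x
efficiency = speedup / cores     # roughly 0.74, i.e. ~74% of perfect scaling
print(f"{speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```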

Other options

Even with a simple example it’s clear that using multiprocessing is not seamless. We have to extract the data from our DataFrame to pass into the pool.map function, and the results are returned in a list. There’s also a __main__ guard boilerplate, and we had to move our function out to a separate file for Jupyter to work with it.

There are a number of projects that build on top of multiprocessing, pandas, and other projects. Some of them even work directly with the concept of a DataFrame, but support distributed computing. For the rest of the article, we’ll implement this simple problem using each project. This demonstrates how each one works and the basic steps to get it running.


Joblib

Joblib is a generic set of tools for pipelining code in Python. It’s not specifically integrated with pandas, but it’s easy enough to use and has some other nice features such as disk caching of functions and memoization.

You can install joblib with pip:

pip install joblib

The example code is fairly simple:

def run_joblib():
    import pandas as pd
    from joblib import Parallel, delayed

    sample = pd.DataFrame(get_sample())
    results = Parallel(n_jobs=multiprocessing.cpu_count())(
        delayed(slow_function)(i) for i in sample['value']
    )
    sample['results'] = results
    print("joblib results:\n", sample.tail(5))

Checking the performance:

python -m timeit "import slow_function; slow_function.run_joblib()"

gives us

1 loop, best of 5: 5.77 sec per loop

For general parallel processing, joblib makes for cleaner code than multiprocessing. The speed is about the same, and the project offers some extra tools that can be helpful.

Now we’ll look at a few projects that are more closely integrated with pandas. If you’re used to working with pandas and look back at the code we’ve written so far, it might look a little clunky and different from other pandas DataFrame methods that you’re used to. The rest of the projects will look quite a bit more like standard pandas code.


Dask

Dask is a library that scales the standard PyData tools, like pandas, NumPy, and scikit-learn. From a code perspective, it usually looks pretty similar to the code you are used to, but it’s possible to scale out to multiple cores on one machine, or even clusters of multiple machines. Even though we are only looking at processing a DataFrame that will fit into memory on one machine, it’s possible to run code with Dask that uses more memory than is available on the main node. But Dask works great on your local machine and provides benefits even without a full cluster.

As you see in the code below, a Dask DataFrame wraps a regular pandas DataFrame, and supplies a similar interface. The difference with Dask is that sometimes you need to supply some hints to the calculation (the meta argument to apply), and the execution is always deferred. To get the result, you call compute. But writing this code feels much the same as writing normal pandas code.

You can install it using pip:

pip install "dask[complete]"

or if you’re using conda:

conda install dask

This is a very basic intro, read the introductory docs for more complete examples.

In order for Dask to run in parallel on a local host, you’ll have to start a local cluster. We do this only once.

# global variable
DASK_RUNNING = False

def run_dask():
    import pandas as pd
    import dask.dataframe as dd

    # hack for allowing our timeit code to work with one cluster
    global DASK_RUNNING
    if not DASK_RUNNING:
        # normally, you'll do this once in your code
        from dask.distributed import Client, LocalCluster
        cluster = LocalCluster()  # Launches a scheduler and workers locally
        client = Client(cluster)  # Connect to distributed cluster and override default
        print(f"Started cluster at {cluster.dashboard_link}")
        DASK_RUNNING = True

    sample = pd.DataFrame(get_sample())
    dask_sample = dd.from_pandas(sample, npartitions=multiprocessing.cpu_count())
    dask_sample['results'] = dask_sample['value'].apply(
        slow_function, meta=('value', 'float64')
    ).compute()
    print("Dask results:\n", dask_sample.tail(5))

Again, we time this as follows:

python -m timeit "import slow_function; slow_function.run_dask()"

and get

1 loop, best of 5: 5.21 sec per loop

Note that when you are running a local cluster, you can access a handy dashboard for monitoring the cluster; it’s available via the cluster.dashboard_link field.

On my machine, Dask performs as well as the other parallel options. It has the added benefit of monitoring and further scalability.


Modin

Modin is a library that is built on top of Dask (and other libraries) but serves as a drop-in replacement for pandas, making it even easier to work with existing code. When using Modin, its developers suggest replacing the import pandas as pd line with import modin.pandas as pd. That may be the only change needed to take advantage of it. Modin will provide speed improvements out of the box, and with some configuration and use of other libraries, it can continue to scale up.

You install Modin with pip:

pip install modin

But you’ll need to install a backend as well. See the docs for more details. Since we just installed Dask above, I’ll use that. I’ll also run the Dask cluster for Modin to use.

pip install "modin[dask]"

Note that besides the imports and Dask setup, our code looks exactly like bare pandas code.

def run_modin():
    global DASK_RUNNING
    import os
    os.environ["MODIN_ENGINE"] = "dask"  # Modin will use Dask

    if not DASK_RUNNING:
        from dask.distributed import Client, LocalCluster
        cluster = LocalCluster()  # Launches a scheduler and workers locally
        client = Client(cluster)  # Connect to distributed cluster and override default
        print(f"Started cluster at {cluster.dashboard_link}")
        DASK_RUNNING = True

    import modin.pandas as pd

    sample = pd.DataFrame(get_sample())
    sample['results'] = sample['value'].apply(slow_function)
    print("Modin results:\n", sample.tail(5))

Timing from

python -m timeit "import slow_function; slow_function.run_modin()"

gives us

1 loop, best of 5: 5.57 sec per loop

Modin with Dask provides the benefits of Dask, without the code differences.


Swifter

Swifter is a package that figures out the best way to apply a function to a pandas DataFrame. It can do several things, including multiprocessing and vectorization. It integrates with other libraries like Dask and Modin, and will attempt to use them in the most efficient way possible. To use it, you just use the Swifter version of apply instead of the one from DataFrame, as shown below.

You can install swifter with pip:

pip install swifter

To use it with Modin, just import modin before swifter (or register it with swifter.register_modin()). It’s almost the same as the base pandas version.

def run_swifter():
    global DASK_RUNNING
    import os
    os.environ["MODIN_ENGINE"] = "dask"  # Modin will use Dask

    if not DASK_RUNNING:
        from dask.distributed import Client, LocalCluster
        cluster = LocalCluster()  # Launches a scheduler and workers locally
        client = Client(cluster)  # Connect to distributed cluster and override default
        print(f"Started cluster at {cluster.dashboard_link}")
        DASK_RUNNING = True

    import pandas as pd
    import swifter

    swifter.register_modin()  # or could import modin.pandas as pd

    sample = pd.DataFrame(get_sample())
    sample['results'] = sample['value'].swifter.apply(slow_function)
    print("Swifter results:\n", sample.tail(5))

Double-checking the performance:

python -m timeit "import slow_function; slow_function.run_swifter()"

gives us slightly slower results:

1 loop, best of 5: 12.3 sec per loop

While there is a speed difference (Swifter is slower here), it can be explained by the fact that Swifter samples the data in order to determine whether it is worthwhile to use a parallel option. For larger calculations, this extra work will be negligible. Changing the defaults is very easy through configuration; see the docs for more details.

Swifter also includes some handy progress bars in both the shell and Jupyter notebooks. For longer running jobs, that is very convenient.


Pandarallel

Pandarallel is another project that integrates with pandas, similar to Swifter. You need to do a small initialization, then use the extra DataFrame methods to apply a method to a DataFrame in parallel. It has nice support for Jupyter progress bars as well, which can be a nice touch for users running it in a notebook. It doesn’t have the same level of support for distributed libraries like Dask. But it’s very simple code to write.

You install Pandarallel with pip:

pip install pandarallel

def run_pandarallel():
    from pandarallel import pandarallel
    pandarallel.initialize()

    import pandas as pd
    sample = pd.DataFrame(get_sample())
    sample['results'] = sample['value'].parallel_apply(slow_function)
    print("Pandarallel results:\n", sample.tail(5))

Checking results with

python -m timeit "import slow_function; slow_function.run_pandarallel()"


1 loop, best of 5: 5.12 sec per loop

If you are only looking for a simple way to run apply in parallel, and don’t need the extra features of the other projects, it can be a good option.


PySpark

PySpark is a Python interface to Apache Spark. The Spark project is a multi-language engine for executing data engineering, data science, and machine learning tasks in a clustered environment. Similar to Dask, it can scale up from single machines to entire clusters. It also supports multiple languages.

PySpark contains a pandas API, so it is possible to write pandas code that works on Spark with little effort. Note that the pandas API is not 100% complete and also has some minor differences from standard pandas. But as you’ll see, there are performance benefits that might make porting code to PySpark worth it.

You can install pyspark with pip (I also needed to install PyArrow):

pip install pyspark pyarrow

The sample code is similar to basic pandas.

def run_pyspark():
    import pyspark.pandas as ps

    sample = ps.DataFrame(get_sample())
    sample['results'] = sample['value'].apply(slow_function)
    print("PySpark results:\n", sample.tail(5))

Testing the speed with

python -m timeit "import slow_function; slow_function.run_pyspark()"


1 loop, best of 5: 2.73 sec per loop

This is quite a bit faster than the other options. But it’s worth noting here that the underlying implementation is not running the same pandas code on more CPUs, but rather running the Spark code on multiple CPUs. This is just a simple example, and there is quite a bit of configuration possible with Spark, but you can see that pandas integration makes trying it out quite easy.


Summary

We looked at a simple CPU-bound function that we applied to a DataFrame of data. This was our base case. We then used the following libraries to implement a parallel version:

  • multiprocessing
  • joblib
  • Dask
  • Modin
  • Swifter
  • Pandarallel
  • PySpark

Each of these projects offers features and improvements over the base multiprocessing version, with speedups of 3 to 7 times over our base case. Depending on your needs, one of these projects can offer improved readability and scalability.

The post Options to run pandas DataFrame.apply in parallel appeared first on wrighters.io.

Categories: FLOSS Project Planets

GNU Guix: GNU Guix 1.4.0 released

GNU Planet! - Mon, 2022-12-19 10:30

We are pleased to announce the release of GNU Guix version 1.4.0!

The release comes with ISO-9660 installation images, a virtual machine image, and with tarballs to install the package manager on top of your GNU/Linux distro, either from source or from binaries—check out the download page. Guix users can update by running guix pull.

It’s been 18 months since the previous release. That’s a lot of time, reflecting both the fact that, as a rolling release, users continuously get new features and update by running guix pull; but let’s face it, it also shows an area where we could and should collectively improve our processes. During that time, Guix received about 29,000 commits by 453 people, which includes important new features as we’ll see; the project also changed maintainers, structured cooperation as teams, and celebrated its ten-year anniversary!

Illustration by Luis Felipe, published under CC-BY-SA 4.0.

This post provides highlights for all the hard work that went into this release—and yes, these are probably the longest release notes in Guix’s history, so make yourself comfortable, relax, and enjoy.

Bonus! Here’s a chiptune (by Trevor Lentz, under CC-BY-SA 3.0) our illustrator Luis Felipe recommends that you listen to before going further.

Improved software environment management

One area where Guix shines is the management of software environments. The guix environment command was designed for that but it suffered from a clumsy interface. It is now superseded by guix shell, though we are committed to keeping guix environment until at least May 1st, 2023. guix shell is a tool that’s interesting to developers, but it’s also a useful tool when you’re willing to try out software without committing it to your profile with guix install. Let’s say you want to play SuperTuxKart but would rather not have it in your profile because, hey, it’s a professional laptop; here’s how you would launch it, fetching it first if necessary:

guix shell supertuxkart -- supertuxkart

In addition to providing a simpler interface, guix shell significantly improves performance through caching. It also simplifies developer workflows by automatically recognizing guix.scm and manifest.scm files present in a directory: drop one of these in your project and other developers can get started hacking just by running guix shell, without arguments. Speaking of which: --export-manifest will get you started by “converting” command-line arguments into a manifest. Read more about guix shell in the manual.

Another guix shell innovation is optional emulation of the filesystem hierarchy standard (FHS). The FHS specifies locations for different file categories—/bin for essential command binaries, /lib for libraries, and so on. Guix with its store does not adhere to the FHS, which prevents users from running programs that assume FHS adherence. The new --emulate-fhs (or -F) flag of guix shell, combined with --container (-C), instructs it to create a container environment that follows the FHS. This is best illustrated with this example, where the ls command of the coreutils package appears right under /bin, as if we were on an FHS system like Debian:

$ guix shell -CF coreutils -- /bin/ls -1p /
bin/
dev/
etc/
gnu/
home/
lib/
lib64
proc/
sbin/
sys/
tmp/
usr/

Another big new feature is Guix Home. In a nutshell, Home brings the declarative nature of Guix System to your home environment: it lets you declare all the aspects of your home environments—“dot files”, services, and packages—and can instantiate that environment, in your actual $HOME or in a container.

If you’re already maintaining your dot files under version control, or if you would like to keep things under control so you don’t have to spend days or weeks configuring again next time you switch laptops, this is the tool you need. Check out this excellent introduction that David Wilson gave at the Ten Years celebration, and read more about Guix Home in the manual.

Package transformation options give users fine control over the way packages are built. The new --tune option enables tuning of packages for a specific CPU micro-architecture; this enables the use of the newest single-instruction multiple-data (SIMD) instructions, such as AVX-512 on recent AMD/Intel CPUs, which can make a significant difference for some workloads such as linear algebra computations.

Since the 1.3.0 release, the project started maintaining an alternative build farm at https://bordeaux.guix.gnu.org. It’s independent from the build farm at ci.guix.gnu.org (donated and hosted by the Max Delbrück Center for Molecular Medicine in Berlin, Germany), which has two benefits: it lets us challenge substitutes produced by each system, and it provides redundancy should one of these two build farms go down. Guix is now configured by default to fetch substitutes from any of these two build farms. In addition, a bug was fixed, ensuring that Guix gracefully switches to another substitute provider when one goes down.

Those who’ve come to enjoy declarative deployment of entire fleets of machines will probably like the new --execute option of guix deploy.

Stronger distribution

The distribution itself has seen lots of changes. First, the Guix System installer received a number of bug fixes and it now includes a new mechanism that allows users to automatically report useful debugging information in case of a crash. This will help developers address bugs that occur with unusual configurations.

Application startup times have been reduced thanks to a new per-application dynamic linker cache that drastically reduces the number of stat and open calls due to shared library lookup (we’re glad it inspired others).

Guix System is now using version 0.9 of the GNU Shepherd, which addresses shortcomings, improves logging, and adds features such as systemd-style service activation and inetd-style service startup. Speaking of services, the new guix system edit sub-command provides an additional way for users to inspect services, completing guix system search and guix system extension-graph.

There are 15 new system services to choose from, including Jami, Samba, fail2ban, and Gitile, to name a few.

A new interface is available to declare swap space in operating system configurations. This interface is more expressive and more flexible than what was available before.

Similarly, the interface to declare static networking configuration has been overhauled. On GNU/Linux, it lets you do roughly the same as the ip command, only in a declarative fashion and with static checks to prevent you from deploying obviously broken configurations.

More than 5,300 packages were added for a total of almost 22,000 packages, making Guix one of the top-ten biggest distros according to Repology. Among the many noteworthy package upgrades and addition, GNOME 42 is now available. KDE is not there yet but tens of KDE packages have been added so we’re getting closer; Qt 6 is also available. The distribution also comes with GCC 12.2.0, GNU libc 2.33, Xfce 4.16, Linux-libre 6.0.10, LibreOffice, and Emacs 28.2 (with just-in-time compilation support!).

In other news, motivated by the fact that Python 2 officially reached “end of life” in 2020, more than 500 Python 2 packages were removed—those whose name starts with python2-. This includes “big ones” like python2-numpy and python2-scipy. Those who still need these have two options: using guix time-machine to jump to an older commit that contains the packages they need, or using the Guix-Past channel to build some of those old packages in today’s environments—scientific computing is one area where this may come in handy.

On top of that, the Web site features a new package browser—at last! Among other things, the package browser provides stable package URLs like https://packages.guix.gnu.org/packages/PACKAGE.

The NEWS file lists additional noteworthy changes and bug fixes you may be interested in.

More documentation

As with past releases, we have worked on documentation to make Guix more approachable. “How-to� kind of sections have been written or improved, such as:

The Cookbook likewise keeps receiving how-to entries, check it out!

The Guix reference manual is fully translated into French and German; 70% is available in Spanish, and there are preliminary translations in Russian, Chinese, and other languages. Guix itself is fully translated into French, with almost complete translations in Brazilian Portuguese, German, Slovak, and Spanish, and partial translations in almost twenty other languages. Check out the manual on how to help or this guided tour by translator in chief Julien Lepiller!

Supporting long-term reproducibility

A salient feature of Guix is its support for reproducible software deployment. There are several aspects to that, one of which is being able to retrieve source code from the Software Heritage archive. While Guix was already able to fetch the source code of packages from Software Heritage as a fallback, with version 1.4.0 the source code of Guix channels is automatically fetched from Software Heritage if its original URL has become unreachable.

In addition, Guix is now able to retrieve and restore source code tarballs such as tar.gz files. Software Heritage archives the contents of tarballs, but not the tarballs themselves. This created an impedance mismatch for Guix, where the majority of package definitions refer to tarballs and expect to be able to verify the content hash of the tarball itself. To bridge this gap, Timothy Sample developed Disarchive, a tool that can (1) extract tarball metadata, and (2) assemble previously-extracted metadata and actual files to reconstruct a tarball, as shown in the diagram below.

The Guix project has set up a continuous integration job to build a Disarchive database, which is available at disarchive.gnu.org. The database includes metadata for all the tarballs packages refer to. When a source code tarball disappears, Guix now transparently retrieves tarball metadata from the Disarchive database, fetches file contents from Software Heritage, and reconstructs the original tarball. As of the “Preservation of Guix Report” published in January 2022, almost 75% of the .tar.gz files packages refer to are now fully archived with Disarchive and Software Heritage. Running guix lint -c archival PKG will tell you about the archival status of PKG. You can read more in the annual report of Guix-HPC.

This is a significant step forward to provide, for the first time, a tool that can redeploy past software environments while maintaining the connection between source code and binaries.

Application bundles and system images

The guix pack command to create “application bundles”—standalone application images—has been extended: guix pack -f deb creates a standalone .deb package that can be installed on Debian and derivative distros; the new --symlink flag makes it create symlinks within the image.

At the system level, the new guix system image command supersedes previously existing guix system sub-commands, providing a single entry point to build images of all types: raw disk images, QCOW2 virtual machine images, ISO-9660 CD/DVD images, Docker images, and even images for Microsoft’s Windows Subsystem for Linux (WSL2). This comes with a high-level interface that lets you declare the type of image you want: the storage format, partitions, and of course the operating system for that image. To facilitate its use, predefined image types are provided:

$ guix system image --list-image-types
The available image types are:
   - rock64-raw
   - pinebook-pro-raw
   - pine64-raw
   - novena-raw
   - hurd-qcow2
   - hurd-raw
   - raw-with-offset
   - iso9660
   - efi32-raw
   - wsl2
   - uncompressed-iso9660
   - efi-raw
   - docker
   - qcow2
   - tarball

That includes for example an image type for the Pine64 machines and for the GNU/Hurd operating system. For example, this is how you’d create an QCOW2 virtual machine image suitable for QEMU:

guix system image -t qcow2 my-operating-system.scm

… where my-operating-system.scm contains an operating system declaration.

Likewise, here’s how you’d create, on your x86_64 machine, an image for your Pine64 board, ready to be transferred to an SD card or similar storage device that the board will boot from:

guix system image -t pine64-raw my-operating-system.scm

The pine64-raw image type specifies that software in the image is actually cross-compiled to aarch64-linux-gnu—that is, GNU/Linux on an AArch64 CPU, with the appropriate U-Boot variant as its bootloader. Sky’s the limit!

Nicer packaging experience

A significant change that packagers will immediately notice is package simplification, introduced shortly after 1.3.0. The most visible effect is that package definitions now look clearer:

(package
  ;; …
  (inputs
   (list pkg-config guile-3.0)))

… instead of the old baroque style with “input labels�:

(package
  ;; …
  (inputs
   `(("pkg-config" ,pkg-config)
     ("guile" ,guile-3.0))))

The new guix style command can automatically convert from the “old” style to the “new” style of package inputs. It can also reformat whole Scheme files following the stylistic canons du jour, which is particularly handy when getting started with the language.

That’s just the tip of the iceberg: the new modify-inputs macro makes package input manipulation easier and clearer, and one can use G-expressions for instance in package phases. Read our earlier announcement for more info. On top of that, the new field sanitizer mechanism is used to validate some fields; for instance, the license field is now type-checked and the Texinfo syntax of description and synopsis is validated, all without any run-time overhead in common cases. We hope these changes will make it easier to get started with packaging.

The guix build command has new flags, --list-systems and --list-targets, to list supported system types (which may be passed to --system) and cross-compilation target triplets (for use with --target). Under the hood, the new (guix platform) module lets developers define “platforms”—a combination of CPU architecture and operating system—in an abstract way, unifying various bits of information previously scattered around.

In addition, packagers can now mark as “tunable” packages that would benefit from CPU micro-architecture optimizations, enabled with --tune.

Python packaging has seen important changes. First, the python package now honors the GUIX_PYTHONPATH environment variable rather than PYTHONPATH. That ensures that Python won’t unwillingly pick up packages not provided by Guix. Second, the new pyproject-build-system implements PEP 517. It complements the existing python-build-system, and both may eventually be merged together.

What’s great with packaging is when it comes for free. The guix import command gained support for several upstream package repositories: minetest (extensions of the Minetest game), elm (the Elm programming language), egg (for CHICKEN Scheme), and hexpm (for Erlang and Elixir packages). Existing importers have seen various improvements. The guix refresh command to automatically update package definitions has a new generic-git updater.

Try it!

There are several ways to get started using Guix:

  1. The installation script lets you quickly install Guix on top of another GNU/Linux distribution.

  2. The Guix System virtual machine image can be used with QEMU and is a simple way to discover Guix System without touching your system.

  3. You can install Guix System as a standalone distribution. The installer will guide you through the initial configuration steps.

To review all the installation options at your disposal, consult the download page and don't hesitate to get in touch with us.


About GNU Guix

GNU Guix is a transactional package manager and an advanced distribution of the GNU system that respects user freedom. Guix can be used on top of any system running the Hurd or the Linux kernel, or it can be used as a standalone operating system distribution for i686, x86_64, ARMv7, AArch64, and POWER9 machines.

In addition to standard package management features, Guix supports transactional upgrades and roll-backs, unprivileged package management, per-user profiles, and garbage collection. When used as a standalone GNU/Linux distribution, Guix offers a declarative, stateless approach to operating system configuration management. Guix is highly customizable and hackable through Guile programming interfaces and extensions to the Scheme language.


Real Python: Python's "in" and "not in" Operators: Check for Membership

Planet Python - Mon, 2022-12-19 09:00

Python’s in and not in operators allow you to quickly determine if a given value is or isn’t part of a collection of values. This type of check is common in programming, and it’s generally known as a membership test in Python. Therefore, these operators are known as membership operators.

In this tutorial, you’ll learn how to:

  • Perform membership tests using the in and not in operators
  • Use in and not in with different data types
  • Work with operator.contains(), the equivalent function to the in operator
  • Provide support for in and not in in your own classes
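As a quick preview of one of those bullets, operator.contains() behaves like the in operator but takes the container as its first argument. This is a minimal illustration, not code from the tutorial itself:

```python
from operator import contains

# contains(container, value) mirrors `value in container`,
# with the arguments in the opposite order of the in expression.
print(contains([1, 2, 3], 2))     # → True
print(contains("banana", "xyz"))  # → False
```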

To get the most out of this tutorial, you’ll need basic knowledge of Python, including built-in data types, such as lists, tuples, ranges, strings, sets, and dictionaries. You’ll also need to know about Python generators, comprehensions, and classes.

Source Code: Click here to download the free source code that you’ll use to perform membership tests in Python with in and not in.

Getting Started With Membership Tests in Python

Sometimes you need to find out whether a value is present in a collection of values or not. In other words, you need to check if a given value is or is not a member of a collection of values. This kind of check is commonly known as a membership test.

Arguably, the natural way to perform this kind of check is to iterate over the values and compare them with the target value. You can do this with the help of a for loop and a conditional statement.

Consider the following is_member() function:

>>> def is_member(value, iterable):
...     for item in iterable:
...         if value is item or value == item:
...             return True
...     return False
...

This function takes two arguments, the target value and a collection of values, which is generically called iterable. The loop iterates over iterable while the conditional statement checks if the target value is equal to the current value. Note that the condition checks for object identity with is or for value equality with the equality operator (==). These are slightly different but complementary tests.
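NaN is the classic example of why the identity check matters alongside equality: it compares unequal to everything, including itself, yet the is test still finds the same object:

```python
nan = float("nan")

# Equality fails for NaN, even against itself...
print(nan == nan)  # → False
# ...but identity succeeds,
print(nan is nan)  # → True
# which is why a membership test can still find the same NaN object.
print(nan in [1.0, nan])  # → True
```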

If the condition is true, then the function returns True, breaking out of the loop. This early return short-circuits the loop operation. If the loop finishes without any match, then the function returns False:

>>> is_member(5, [2, 3, 5, 9, 7])
True
>>> is_member(8, [2, 3, 5, 9, 7])
False

The first call to is_member() returns True because the target value, 5, is a member of the list at hand, [2, 3, 5, 9, 7]. The second call to the function returns False because 8 isn’t present in the input list of values.

Membership tests like the ones above are so common and useful in programming that Python has dedicated operators to perform these types of checks. You can get to know the membership operators in the following table:

  • in: Returns True if the target value is present in a collection of values. Otherwise, it returns False. Syntax: value in collection
  • not in: Returns True if the target value is not present in a given collection of values. Otherwise, it returns False. Syntax: value not in collection
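A few membership tests on built-in types illustrate both operators:

```python
fruits = ["apple", "banana", "cherry"]

print("banana" in fruits)     # → True
print("mango" in fruits)      # → False
print("mango" not in fruits)  # → True

# With strings, `in` performs a substring check
print("ana" in "banana")      # → True
```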

As with Boolean operators, Python favors readability by using common English words instead of potentially confusing symbols as operators.

Note: Don’t confuse the in keyword when it works as the membership operator with the in keyword in the for loop syntax. They have entirely different meanings. The in operator checks if a value is in a collection of values, while the in keyword in a for loop indicates the iterable that you want to draw from.

Like many other operators, in and not in are binary operators. That means you can create expressions by connecting two operands. In this case, those are:

  1. Left operand: The target value that you want to look for in a collection of values
  2. Right operand: The collection of values where the target value may be found

The syntax of a membership test looks something like this:

value in collection
value not in collection

In these expressions, value can be any Python object. Meanwhile, collection can be any data type that can hold collections of values, including lists, tuples, strings, sets, and dictionaries. It can also be a class that implements the .__contains__() method or a user-defined class that explicitly supports membership tests or iteration.
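For example, here is a hypothetical class that supports membership tests purely through .__contains__(), without storing any values at all:

```python
class EvenNumbers:
    """Membership is computed, not stored: any even integer "belongs"."""

    def __contains__(self, value):
        return isinstance(value, int) and value % 2 == 0

evens = EvenNumbers()
print(4 in evens)      # → True
print(3 in evens)      # → False
print(3 not in evens)  # → True
```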

If you use the in and not in operators correctly, then the expressions that you build with them will always evaluate to a Boolean value. In other words, those expressions will always return either True or False. On the other hand, if you try to find a value in something that doesn’t support membership tests, then you’ll get a TypeError. Later, you’ll learn more about the Python data types that support membership tests.
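For instance, using a plain integer as the right operand raises that error, because int has no .__contains__() method and isn't iterable:

```python
# An int is not a collection, so `in` has nothing to search.
try:
    1 in 10
except TypeError as error:
    print(type(error).__name__)  # → TypeError
```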

Because membership operators always evaluate to a Boolean value, Python considers them Boolean operators just like the and, or, and not operators.
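Because both operators return booleans, they combine naturally with and, or, and not. The names below are purely illustrative:

```python
permissions = {"read", "write"}

# Each membership test yields True or False, so the whole
# expression is an ordinary Boolean combination.
can_edit = "write" in permissions and "locked" not in permissions
print(can_edit)  # → True
```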

Read the full article at https://realpython.com/python-in-operator/ »



Python for Beginners: Check for NaN Values in Pandas Python

Planet Python - Mon, 2022-12-19 09:00

While working with data in python, we often encounter null values or NaN values. In this article, we will discuss different ways to check for nan values or null values in a pandas dataframe or series.

Table of Contents
  1. The isna() Function
  2. Check for NaN Values in a Pandas Dataframe Using The isna() Method
  3. Check for NaN Values in a Column in a Pandas Dataframe
  4. Check for NaN Values in a Pandas Series Using the isna() Method
  5. Check for NaN Values in Pandas Using the isnull() Method
  6. Check for NaN Values in a Dataframe Using the isnull() Method
  7. Check for NaN in a Column in a Dataframe Using the isnull() Method
  8. Conclusion
The isna() Function

The isna() function in pandas is used to check for NaN values. It has the following syntax:

pandas.isna(object)

Here, the object can be a single Python object or a list/array of Python objects.

If we pass a single python object to the isna() method as an input argument, it returns True if the python object is None, pd.NA or np.NaN object. You can observe this in the following example.

import pandas as pd
import numpy as np

x = pd.NA
print("The value is:", x)
output = pd.isna(x)
print("Is the value Null:", output)

The value is: <NA>
Is the value Null: True

In the above example, we have passed the pandas.NA object to the isna() function. After execution, the function returns True.

When we pass a list or numpy array of elements to the isna() function, the isna() function is executed with each element of the array.

After execution, it returns a list or array containing True and False values. The False values of the output array correspond to all the values that are not NA, NaN, or None at the same position in the input list or array. The True values in the output array correspond to all the NA, NaN, or None values at the same position in the input list or array. You can observe this in the following example.

import pandas as pd
import numpy as np

x = [1, 2, pd.NA, 4, 5, None, 6, 7, np.nan]
print("The values are:", x)
output = pd.isna(x)
print("Are the values Null:", output)

The values are: [1, 2, <NA>, 4, 5, None, 6, 7, nan]
Are the values Null: [False False True False False True False False True]

In this example, we have passed a list containing 9 elements to the isna() function. After execution, the isna() method returns a list of 9 boolean values. Each element in the output list is associated with the element at the same index in the input list given to the isna() function. At the indices where the input list contains Null values, the output list contains True. Similarly, at indices where the input list contains integers, the output list contains False.

Check for NaN Values in a Pandas Dataframe Using The isna() Method

Along with the isna() function, the pandas module also has the isna() method at the dataframe level. You can directly invoke the isna() method on the pandas dataframe to check for nan values.

The isna() method, when invoked on a pandas dataframe, returns another dataframe containing True and False values. You can observe this in the following example.

import pandas as pd
import numpy as np

df = pd.read_csv("grade.csv")
print("The dataframe is:")
print(df)
output = df.isna()
print("Are the values Null:")
print(output)

The dataframe is:
    Class  Roll        Name  Marks Grade
0       1    11      Aditya   85.0     A
1       1    12       Chris    NaN     A
2       1    14         Sam   75.0     B
3       1    15       Harry    NaN   NaN
4       2    22         Tom   73.0     B
5       2    15        Golu   79.0     B
6       2    27       Harsh   55.0     C
7       2    23       Clara    NaN     B
8       3    34         Amy   88.0     A
9       3    15    Prashant    NaN     B
10      3    27      Aditya   55.0     C
11      3    23  Radheshyam    NaN   NaN
Are the values Null:
    Class   Roll   Name  Marks  Grade
0   False  False  False  False  False
1   False  False  False   True  False
2   False  False  False  False  False
3   False  False  False   True   True
4   False  False  False  False  False
5   False  False  False  False  False
6   False  False  False  False  False
7   False  False  False   True  False
8   False  False  False  False  False
9   False  False  False   True  False
10  False  False  False  False  False
11  False  False  False   True   True

In the above example, we have passed a dataframe containing NaN values along with other values. The isna() method returns a dataframe containing boolean values. Here, False values of the output dataframe correspond to all the values that are not NA, NaN, or None at the same position in the input dataframe. The True values in the output dataframe correspond to all the NA, NaN, or None values at the same position in the input dataframe.
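A common next step, not covered in the original article, is to chain .sum() onto the Boolean dataframe to count the missing values per column, since True counts as 1. Here is a small self-contained sketch using an inline dataframe rather than the grade.csv file:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    "Name": ["Aditya", "Chris", "Sam"],
    "Marks": [85.0, np.nan, 75.0],
    "Grade": ["A", "A", np.nan],
})

# isna() yields True/False per cell; sum() adds them column-wise
print(df.isna().sum())
# Name     0
# Marks    1
# Grade    1
# dtype: int64
```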

Check for NaN Values in a Column in a Pandas Dataframe

Instead of the entire dataframe, you can also check for nan values in a column of a pandas dataframe. For this, you just need to invoke the isna() method on the particular column as shown below.

import pandas as pd
import numpy as np

df = pd.read_csv("grade.csv")
print("The dataframe column is:")
print(df["Marks"])
output = df["Marks"].isna()
print("Are the values Null:")
print(output)

The dataframe column is:
0     85.0
1      NaN
2     75.0
3      NaN
4     73.0
5     79.0
6     55.0
7      NaN
8     88.0
9      NaN
10    55.0
11     NaN
Name: Marks, dtype: float64
Are the values Null:
0     False
1      True
2     False
3      True
4     False
5     False
6     False
7      True
8     False
9      True
10    False
11     True
Name: Marks, dtype: bool

Check for NaN Values in a Pandas Series Using the isna() Method

Like a dataframe, we can also invoke the isna() method on a Series object in pandas. In this case, the isna() method returns a Series containing True and False values. You can observe this in the following example.

import pandas as pd
import numpy as np

x = pd.Series([1, 2, pd.NA, 4, 5, None, 6, 7, np.nan])
print("The series is:")
print(x)
output = pd.isna(x)
print("Are the values Null:")
print(output)

The series is:
0       1
1       2
2    <NA>
3       4
4       5
5    None
6       6
7       7
8     NaN
dtype: object
Are the values Null:
0    False
1    False
2     True
3    False
4    False
5     True
6    False
7    False
8     True
dtype: bool

In this example, we have invoked the isna() method on a pandas series. The isna() method returns a Series of boolean values after execution. Here, False values of the output series correspond to all the values that are not NA, NaN, or None at the same position in the input series. The True values in the output series correspond to all the NA, NaN, or None values at the same position in the input series.

Check for NaN Values in Pandas Using the isnull() Method

The isnull() function is an alias of the isna() function. Hence, it works exactly the same as the isna() function.

When we pass a NaN value, pandas.NA value, pandas.NaT value, or None object to the isnull() function, it returns True. 

import pandas as pd
import numpy as np

x = pd.NA
print("The value is:", x)
output = pd.isnull(x)
print("Is the value Null:", output)

The value is: <NA>
Is the value Null: True

In the above example, we have passed pandas.NA value to the isnull() function. Hence, it returns True.
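The same holds for the other null-like values mentioned above; a short sketch checking all four in one loop:

```python
import pandas as pd
import numpy as np

# Each of the null-like values that pandas recognizes
for value in [np.nan, pd.NA, pd.NaT, None]:
    print(repr(value), "->", pd.isnull(value))  # all print True
```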

When we pass any other python object to the isnull() function, it returns False as shown below.

import pandas as pd
import numpy as np

x = 1117
print("The value is:", x)
output = pd.isnull(x)
print("Is the value Null:", output)

The value is: 1117
Is the value Null: False

In this example, we passed the value 1117 to the isnull() function. Hence, it returns False showing that the value is not a null value.

When we pass a list or numpy array to the isnull() function, it returns a numpy array containing True and False values. You can observe this in the following example.

import pandas as pd
import numpy as np

x = [1, 2, pd.NA, 4, 5, None, 6, 7, np.nan]
print("The values are:", x)
output = pd.isnull(x)
print("Are the values Null:", output)

The values are: [1, 2, <NA>, 4, 5, None, 6, 7, nan]
Are the values Null: [False False True False False True False False True]

In this example, we have passed a list to the isnull() function. After execution, the isnull() function returns a list of boolean values. Each element in the output list is associated with the element at the same index in the input list given to the isnull() function. At the indices where the input list contains Null values, the output list contains True. Similarly, at indices where the input list contains integers, the output list contains False.

Check for NaN Values in a Dataframe Using the isnull() Method

You can also invoke the isnull() method on a pandas dataframe to check for nan values as shown below.

import pandas as pd
import numpy as np

df = pd.read_csv("grade.csv")
print("The dataframe is:")
print(df)
output = df.isnull()
print("Are the values Null:")
print(output)

The dataframe is:
    Class  Roll        Name  Marks Grade
0       1    11      Aditya   85.0     A
1       1    12       Chris    NaN     A
2       1    14         Sam   75.0     B
3       1    15       Harry    NaN   NaN
4       2    22         Tom   73.0     B
5       2    15        Golu   79.0     B
6       2    27       Harsh   55.0     C
7       2    23       Clara    NaN     B
8       3    34         Amy   88.0     A
9       3    15    Prashant    NaN     B
10      3    27      Aditya   55.0     C
11      3    23  Radheshyam    NaN   NaN
Are the values Null:
    Class   Roll   Name  Marks  Grade
0   False  False  False  False  False
1   False  False  False   True  False
2   False  False  False  False  False
3   False  False  False   True   True
4   False  False  False  False  False
5   False  False  False  False  False
6   False  False  False  False  False
7   False  False  False   True  False
8   False  False  False  False  False
9   False  False  False   True  False
10  False  False  False  False  False
11  False  False  False   True   True

In the output, you can observe that the isnull() method behaves in exactly the same manner as the isna() method.

Check for NaN in a Column in a Dataframe Using the isnull() Method

Instead of the entire dataframe, you can also use the isnull() method to check for nan values in a column as shown in the following example.

import pandas as pd
import numpy as np

df = pd.read_csv("grade.csv")
print("The dataframe column is:")
print(df["Marks"])
output = df["Marks"].isnull()
print("Are the values Null:")
print(output)

The dataframe column is:
0     85.0
1      NaN
2     75.0
3      NaN
4     73.0
5     79.0
6     55.0
7      NaN
8     88.0
9      NaN
10    55.0
11     NaN
Name: Marks, dtype: float64
Are the values Null:
0     False
1      True
2     False
3      True
4     False
5     False
6     False
7      True
8     False
9      True
10    False
11     True
Name: Marks, dtype: bool

In a similar manner, you can invoke the isnull() method on a pandas series as shown below.

import pandas as pd
import numpy as np

x = pd.Series([1, 2, pd.NA, 4, 5, None, 6, 7, np.nan])
print("The series is:")
print(x)
output = pd.isnull(x)
print("Are the values Null:")
print(output)

The series is:
0       1
1       2
2    <NA>
3       4
4       5
5    None
6       6
7       7
8     NaN
dtype: object
Are the values Null:
0    False
1    False
2     True
3    False
4    False
5     True
6    False
7    False
8     True
dtype: bool

In the above example, we have invoked the isnull() method on a series. The isnull() method returns a Series of boolean values after execution. Here, False values of the output series correspond to all the values that are not NA, NaN, or None at the same position in the input series. The True values in the output series correspond to all the NA, NaN, or None values at the same position in the input series.


Conclusion

In this article, we have discussed different ways to check for NaN values in pandas. To learn more about Python programming, you can read this article on how to sort a pandas dataframe. You might also like this article on how to drop columns from a pandas dataframe.

I hope you enjoyed reading this article. Stay tuned for more informative articles.

Happy Learning!

The post Check for NaN Values in Pandas Python appeared first on PythonForBeginners.com.

Categories: FLOSS Project Planets

Mike Driscoll: PyDev of the Week: Robert Smallshire

Planet Python - Mon, 2022-12-19 08:32

This week we welcome Robert Smallshire (@robsmallshire) as our PyDev of the Week! Robert is the founder of the consulting company Sixty North. He also produces amazing Python videos for Pluralsight and has authored several Python books.

Let's take a few minutes to get to know Robert better!

Can you tell us a little about yourself (hobbies, education, etc):

I got into computing at around the age of nine in the early 1980s home computing revolution and started programming around that time. I’m 48 now and it’s peculiar to step back for a moment and reflect that I’ve been tinkering for almost 40 years. Growing up, and throughout my formal education, I never felt that programming would turn out to be my career. I have only one small qualification in computing (Computer Studies GCSE – a British qualification taken at age 16), and after school my studies took me into the natural sciences, culminating with a Ph.D. in geology. I’d been doing hobby programming on and off throughout my school and college years, including writing bits of software to help my Dad’s engineering surveying business, but more novel software was needed during my Ph.D., so I had to create it. I divided my time between spending long summers camping and doing fieldwork in the French and Swiss Alps, and spending wet British winters holed up in the lab or at home writing analytical and modelling software in C++ to deal with the results.

While still writing-up my Ph.D. I started work for a university spin-off company doing commercial R&D in energy and mining. This was also a fairly equal split between fieldwork and desk-work, with multi-month trips out in the desert, near Moab, in Utah. During the day, we’d work in the field or from a light-aircraft from which we had persuaded the pilot to remove the doors to facilitate better low-level aerial photography. Back at our base in the evenings we’d hack on Perl scripts to process our data. On occasional days off we would hike in Arches and Canyonlands national parks. It was intense mental and physical work, and I was hugely fortunate at the beginning of my working life to work alongside folks so immensely smart and motivated; people from whom I learned an enormous amount.

Notwithstanding the privilege of being paid to do fieldwork, I quickly realised that being able to program is a sort of super-power, and doubly-so when you’re naturally curious about, and motivated to solve, the problem at hand. People with a good understanding of geology, who also have an aptitude for programming seem to be relatively rare, and realising this, I decided to apply for a job with a company which supplied some of the analytical software I had encountered at the R&D company.

This took me to Glasgow, Scotland for my first real employment as a “developer”. We worked on a large and complex, graphically intensive C++ system, which at the time I joined ran on expensive Silicon Graphics and Sun workstations. The software allowed us to build 3D models of the inside of the Earth and run them backwards and forwards through the fourth dimension of geological time, algorithmically stripping off layers of sediment, folding and unfolding bent rocks, and breaking and unbreaking faulted rocks. It seemed almost magical! Again, I had the good fortune to work with and learn from some very intelligent, motivated, and energetic folks, including my future wife. Within a few years I was leading the development team at this company and I’m glad to report that a new system for which I initiated the design around this time (some time in 2002) is still being developed and sold today some twenty years on. After five years, and some difficult times caused by ups and downs in the energy market, my partner and I decided to look around at other options. She was anyway commuting regularly from the UK to Norway, and after a false start with ideas of moving to Canada, I ended up taking a job in Oslo, Norway and relocating within a matter of weeks. My partner followed a few months later.

In Norway, I was again working on simulation software in the oil and gas sector, but everything at my new employer was an order of magnitude larger than I had hitherto experienced, including a monolithic code base of 2.5 million lines of C++, which took overnight to compile from scratch. After only a few weeks I felt I had made a colossal career blunder, and had I not just migrated across the North Sea, I would probably have left as quickly as I had arrived. There was another option though. As Martin Fowler said, “You can either change your company or you can change your company.”

I decided to stick around long enough to see if I could make a difference, and within a few months found that things were going my way. Some other new blood was brought into what had been a stultifying software development culture, and together we began to turn things around. One such character was a fellow called Austin Bingham, with whom I had much in common, both being immigrants to Norway, and both having had positive prior experience with Python, and both taking software engineering and design, rather than just ‘coding’, seriously. Over seven years I rose to the heady heights of Chief Software Architect, which I assure you sounds rather more grand than it actually was. But still, I was the most senior person working on the internal design and programming of our products, in a business with a turnover of hundreds of millions of dollars. One of my key decisions was to introduce a scripting API in terms of Python by embedding the CPython interpreter into our core product.

Towards the end of my seven-year tenure, the business was sold and passed around a series of private-equity concerns, it becoming clear that financial engineering was more valued – and quite possibly more valuable – than software engineering. Shortly afterwards the company was acquired again in a complex transaction by a large U.S.-based conglomerate who seemed somewhat surprised to discover that they had bought a software company along with the division of the business they actually thought they were buying. Now, as one of 135 000 employees – a small cog in a very large machine – I decided it was time to move on again.

Another factor in my desire to move on was a desire to get out of the oil and gas sector. The reasons were two-fold: First, as both my wife and I worked in the energy sector, our by then growing family was particularly exposed to the notorious boom-and-bust cyclicity of that industry. Second, I was concerned about the negative impact of my work on the climate. Having some training in Earth systems science, and recognising that software can be a huge force-multiplier of human capability, it became clear that somebody in my position could have a disproportionally large and negative impact on the climate. A significant fraction of the world’s oil fields were modelled and simulated in software for which I was at least nominally technically responsible, and of which my designs and code were a part.

After a few sessions in the pubs of Stavanger and Oslo with my colleague Austin Bingham, we decided to set up on our own in 2013, to offer software training and consulting services.

Our new company, Sixty North, would naturally be focussed on the software technologies we knew well – particularly Python – and on serving problem domains with a significant science or engineering content, but also those where we could bring our software architecture skills to bear on managing the complexity of large systems. Our experience was that many scientists and engineers are competent at coding in the small, but lack the knowledge, skills and experience to design systems so they can grow relatively gracefully – something probably more true today than ever.

We’ve been running Sixty North for a decade now, largely as a lifestyle company rather than chasing perpetual growth. It turns out we’re pretty good at what we do, and more than able to keep a roof over our heads, and sustain a business with a handful of folks.

I’ve talked a lot about my professional career, so what do I do outside? I’ve tried to cultivate hobbies that get me away from screens and which keep me active. Much of my twenties was spent climbing mountains, descending caves and cycling, but my levels of physical activity declined significantly during my thirties when my wife and I were busy – not to say overwhelmed – by juggling busy careers, business travel and child-rearing. In my forties, I’ve got back into cycling, and now try to cycle most days in the summer, and ski as often as I can in the winter. I’m fortunate to live in a very beautiful part of the world.

Why did you start using Python?

I first used Python seriously in 2001 when I encountered it in the SCons build tool. At the time I was working on graphically intensive commercial earth science simulation software written in C++ for a company in Scotland. The code had a horrible build-system implemented recursively with make (see Recursive Make Considered Harmful) which was difficult to understand, and unreliable. In those days, our large C++ code base had to be built overnight, on Irix, Solaris and Windows, so errors and blunders in the build system were costly. After evaluating some alternatives to make, we stumbled upon SCons and fell into the world of Python. We were using Python 1.5.2 as that was the latest version we could build on all the systems we needed to support, with the compilers we had. At the time Perl was my go-to scripting language, but over the next year or two my use of Perl for ad hoc utilities and programs was almost entirely usurped by Python, although all of my "serious" programming was still done in C++.

I felt that higher-level languages would allow our team to be much more productive than we were in C++, to the extent that I had put a lot of effort into embedding a Perl interpreter in our large C++ application. In retrospect, we had chosen a reasonable software architecture – C++ with an embedded script interpreter, similar to a modern web-browser – but had erred in plumping for Perl, one of the few languages that was even less readable and maintainable than C++!

Around this time I was experimenting with Java and encountered Jython – Python for the JVM. I was very excited about this combination as it promised to marry a fast-ish compiled language (Java) with a high-level language (Python) both of which would avoid the many notorious pitfalls in C++ related to memory management. In particular, Java provided the Swing GUI toolkit and the Java 2D and Java 3D graphics APIs, which could be exercised beautifully from Python code executing on the Jython interpreter. I recall enthusing to a colleague about Jython Essentials (2002) by Samuele Pedroni and Noel Rappin – a better Python introduction than most other straight Python books available at the time – and building interesting prototype applications in Jython running on the JVM, which were portable across all the operating systems we used, and which avoided tedious compile-link-run cycles.

Sadly, Jython never really achieved escape-velocity, though having both the Python and Java standard libraries it provided a lot of what regular CPython still lacks out of the box today, particularly in terms of GUI and graphics toolkits. Since then, I've introduced Python into other C++-based companies, also via the vector of SCons, and latterly, with the help of Austin Bingham, by embedding the Python interpreter into C++ applications.

What other programming languages do you know and which is your favorite?

I've mentioned Perl, C++ and Java already, but I learned to program in the mid-1980s in BBC BASIC, subsequently taking a long journey through COMAL (obscure!), 6502 and ARM assembly language, Pascal, C, C++, Delphi (Object Pascal). I've also developed fairly significant code bases in C# and F#, and even done bits of Haskell in a professional context. Much – or perhaps most – of this is now forgotten, but the languages I use regularly today are Python (every day), JavaScript (many days), and Java (occasional substantial stints), a combination which reflects the languages I use at work. I still enjoy exploring languages new and old (but new to me). Recently I've dabbled with the Julia programming language, and I'm writing an assembler (in Python) for the vintage 6809 microprocessor for a home-brew 8-bit computer I'm designing and building. If I needed to work on greenfield projects with the performance profile of C++ again, I would put a serious effort into learning Rust. If I need to do much more JavaScript (likely), I can see myself wanting to get into TypeScript.

I see many programmers go through their careers waiting for the next wonder programming language to solve all their problems. I've experienced emotions like that too – notably when experiencing Lisp for the first time, or the excitement of interop on the .NET Common Language Runtime – but I feel like I'm at least a decade past that phase now, and look back on myself as being quite naïve. We have some excellent programming languages and ecosystems currently, and rather than shiny new languages, there are easy gains to be had by using the languages we already have. The key is to use them smartly and diligently, taking system and software architecture seriously. If you know one language for high-performance/low-level such as C++, know JavaScript for the web, and know a general-purpose low-friction language like Python, you can achieve almost anything.

Of all these, Python is the language that keeps drawing me back and is the language I first reach for unless a design constraint forces me in another direction. Python facilitates a particularly short time from initial idea to useful solution. That said, it’s really important to understand when Python is not appropriate, and I've been responsible for some costly mistakes in that regard.

What projects are you working on now?

About a decade ago, when Austin Bingham and I founded our consulting and training business Sixty North, we – through a chance encounter at a software conference in Gothenburg, Sweden – fell into making online training course material for Pluralsight. As anybody who has made prerecorded training knows, it's an enormous effort to design training material, craft good examples, manually capture high-quality demo videos, record crisp audio, and edit it all together to produce a high-quality product. For the first iteration of our courses we did everything much the same way as most folks still do, with video captures of us developing code "live", with countless retakes and much editing, pasting snippets of code into Keynote slides and manually annotating it, and so on.

When the time came to update our courses to keep up with the evolution of Python, the latest versions of tools such as PyCharm, higher resolution output, and more stringent and stylish graphic design requirements, I think it’s fair to say that the prospect of hundreds of hours of manual rework didn't immediately fill us with joy.

Instead, we figured we could, in principle at least, produce all of our material (demo videos, slides, diagrams) from a machine-readable description of the course. We could automatically synchronise the visuals with the audio voiceover, and then, when a new version of Python or PyCharm was released, or when the need arose to deliver courses at a different video resolution, we could literally make a few updates to configuration files or demo example code and re-render the whole course.

Naturally, the difference between an ‘in principle’ and ‘in practice’ solution to this need is a huge amount of work on building the tools to do this, and describing all of our video courses to the system. Needless to say, we have around 25 hours of Python video training material published on Pluralsight which is renderable and – crucially – cheaply modifiable, completely automatically.

At the time of writing, we're exploring our options for bringing this technology to a wider audience, and removing some of the rough edges from the now substantial video course production system my colleagues and I at Sixty North have built.

Which Python libraries are your favorite (core or 3rd party)?

Many of the Python packages we make are designed to have a clear Python API and on top of that a CLI. I've found click to be an excellent library for specifying command-line interfaces. For testing, I regularly turn to David MacIver’s excellent property-based testing library, Hypothesis.

What are the top 3 things you learned while writing a book or video course?

1. Having to teach a topic is an excellent way to learn it well.
2. Finding good teaching examples which exhibit Goldilocks “just right” complexity requires long walks or bike rides, but also a grounding in experience to understand and demonstrate their relevance.
3. Most books have a hopeless financial return compared to invested effort, but are good for gaining credibility. I wouldn’t advise anybody to write a technical book for the money from sales, but to instead write one to support other aspects of a broader consulting or training business. For example, our Python Craftsman series, The Python Apprentice, The Python Journeyman, and The Python Master, are derived directly from our work on our classroom and Pluralsight training materials and they mutually support each other.

Is there anything else you’d like to say?

This open ended question made me contemplate the things which have transformed my ability to build quality software. Along with using a language you love – or at least learning to love the language you use – I would add the following advice:

First of all, taking testing seriously has had a big impact on my work. I sometimes, though by no means always, practice Test-Driven Development (TDD). Even when I’m not using TDD the tests are rarely far behind and are usually written contemporaneously with the production code. The effort in arranging for code to be testable will be repaid many times over, not just in terms of correctness, but for other desirable qualities of the system.

Secondly, taking architecture and design seriously has been transformational for me. Largely this is about following a handful of maxims: “Do one thing and do it well”, “Separate concerns”, “An architecturally significant decision is one which is costly to change later”, “Instead of asking objects for their state, tell objects what to do, and give them what they need to do it”, “Prefer pure functions”, and so on.

Many of these boil down to keeping control of coupling and cohesion, and it’s hard to overstate the importance of these for sustained success.

The third, least technical, and most enjoyable practice that has made a huge impression on me in recent years is daily pair or ensemble programming. This really took off for us during the Covid-19 pandemic to the extent that the majority of code we write at Sixty North has at least two people involved at the point of creation, and I feel our code, happiness and team spirit is much better for it. I wouldn’t like to go back to long stretches of working alone, to asynchronous code review, or to modes of working based around pull-requests.

Finally, I’d like to thank you for giving me opportunity to tell a bit of my story.

Thanks for doing the interview, Robert!

The post PyDev of the Week: Robert Smallshire appeared first on Mouse Vs Python.

Categories: FLOSS Project Planets

Jacob Rockowitz: The Value of Having a Drupal.org Profile and Biography.​

Planet Drupal - Mon, 2022-12-19 08:16

In the coming year, to build a rock-solid foundation for enterprise Schema.org-first content architecture, I'd like to convince an organization to use the Schema.org Blueprints module. To help "sell" my ideas and services, I am reworking my Drupal.org profile and website. Drupal is how I make my living. Therefore, my Drupal.org profile is a critical part of my online resume. For anyone like me whose work is directly tied to the Drupal community, there is a lot of value in having a well-thought-out Drupal.org profile and biography.

The value of having a Drupal.org profile begins with the simple fact that you are now an official member of the Drupal community. If your resume reflects ten years of Drupal experience, your Drupal.org profile confirms it by displaying how long you have been a member of Drupal.org. Your Drupal.org account also tracks every interaction you have had on Drupal.org. When I say every interaction, I mean every interaction, positive and negative, which reminds us to be professional and considerate on Drupal.org and adhere to Drupal's code of conduct. For example, one of my earliest posts describes my interaction with quicksketch, the previous maintainer of the Webform module. That post may have foreshadowed that I would become the maintainer of the Webform module many years later.

The Drupal Association, with community input, has done a fantastic job of creating and evolving Drupal.org's user profiles to include everything from general biographical…

Categories: FLOSS Project Planets

PyBites: Pybites turns 6 today – 10 highlights + lessons learned

Planet Python - Mon, 2022-12-19 08:03

Today Pybites turns 6 years old!

We could never have envisioned that our end-of-2016 “Python blog side project” would grow into a fully fledged business serving thousands of people worldwide!

Here are 10 highlights / lessons learned from our journey so far:

1. Don’t procrastinate, implement

We had been chatting for many years about ideas and things we could do to add value, but on that 19th of Dec 2016 we actually took action.

Nothing counts till you implement!

Days 1–10 of Pybites’ existence were literally spent writing Python articles (scroll down), nothing more, nothing less.

We had to find our unique space, and we did! Three weeks in, we saw an opportunity to launch code challenges, a niche that fit our practical learn-by-doing approach and that we sensed would be a fun and effective way to help Pythonistas in the space.

Here is the pilot post.

2. Be consistent

We kept doing the challenges consistently for a year or so. Oftentimes it was fun, but some weeks the looming (self-imposed) deadlines made it feel like a grind.

However, we also saw that we were quickly building up a collection of valuable content that formed the basis of what was about to come.

When things start to feel a bit monotonous think about the compound effect: all your effort will accumulate over time.

3. Community

At the 6 month mark we started a Slack community because we loved the idea of bringing together passionate Pythonistas that shared our practical take on things.

This was one of the best decisions we’ve ever made. You cannot do this alone. Bring together like-minded people and amazing synergies will happen.

4. Look at trends

Browsing around, we saw a lot of hype about the 100 days of code, so we decided to jump on the challenge and not just code for an hour a day (as per the rules) but actually build a complete script / tool each day.

We took turns to make it somewhat manageable. This again got us more material and traction (next point).

5. Get your work out there

As we were tweeting every day about our 100 days of code challenge implementation, the repo and our Twitter account gained more of a following, including Michael Kennedy from Talk Python, which led to an interview on his podcast about our way of doing the challenge. This was around the 11-month mark.

6. Network / partner up

After this interview we got the opportunity to build a 100 days of code Python course. This course became quite successful, and we later followed up with a second 100 days of web course.

7. Automate things

Around that same time, we also finished the prototype of our coding platform. Its initial goal was to automate the pull request submission for our blog code challenges (see 1.).

We kept taking this approach of building and scratching our own itch over the years, which led to more courses / products, exercises, our own open source org, and our own CMS and content reviewing systems, all in Python of course.

8. Iterate fast and often

While we’re on the topic of our platform: we quickly realized that in-browser code evaluation would be really cool, so we figured out how to evaluate submitted code against pytest serverlessly (using AWS Lambda – more on the stack here).
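The core idea, running submitted code against a set of checks on the server, can be sketched in a few lines of plain Python. This is only a toy illustration, not Pybites' actual implementation: their real platform runs full pytest suites inside AWS Lambda.

```python
# Toy sketch of server-side exercise checking: execute a user's submitted
# solution in its own namespace, then run assertion-based checks against it.
# (The real Pybites platform runs pytest in AWS Lambda; this just shows the idea.)

def check_submission(code: str, tests: str) -> bool:
    """Exec the submitted code, then the test code, in one namespace.

    Returns True if all assertions in `tests` pass, False on any error
    (wrong answer, syntax error, runtime exception, ...).
    """
    namespace: dict = {}
    try:
        exec(code, namespace)   # load the user's solution
        exec(tests, namespace)  # run the checks; a failed assert raises
    except Exception:
        return False
    return True

submission = "def add(a, b):\n    return a + b\n"
checks = "assert add(2, 3) == 5\nassert add(-1, 1) == 0\n"
print(check_submission(submission, checks))  # prints True
```

A production version would of course need sandboxing, timeouts, and resource limits before running untrusted code.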

Again, we would not have even gotten the idea if we did not solve the previous problem (blog code challenge automation).

Quite often you don’t know what problem you’re solving till you start solving it! Start soon, keep iterating.

The platform and its exercises have received hundreds of revisions based on real users solving them and it has grown into something unique and highly valuable, including being used in several school curriculums across the US.

9. Invest in coaching

At the end of 2019 we got a bit tired of the content space; people were getting results through our materials, but something was missing.

At the 3-year mark we were wondering how we could serve people better and become more valuable to the market (a powerful Jim Rohn question that is good to ask yourself from time to time).

We invested in a marketing coaching program ourselves that taught us about offer building and how to effectively launch your offer.

This turned out to be one of the best investments we have ever made. Success leaves clues.

You often don’t know what you don’t know, and working with people who have done something you want to achieve opens your eyes to what is possible.

10. Nothing like 1:1 guidance

We launched our PDM coaching program in early 2020 and have been growing it since.

We’re changing lives by working with people 1:1. We feel it’s the culmination of Pybites, what we worked so hard for over the last 6 years, and what we stand for value-wise.

Surprisingly this involves a lot of mindset (soft skills). We talk a lot about this on our podcast, which, by the way, we almost procrastinated on launching but has been very well received (iterate fast!).

If we had to sum up in one line what we’ve learned in these 6 years of Pybites, it’s this: implement, get your work out there, constantly iterate, and build a community around your mission.

We hope this blog post inspires you to do the same.

What’s next?

Great question. We always like to ask: “What’s next?!”

It’s clear from this journey that the coaching / working 1:1 with people is what we value the most and what gets people the best results. So doing more of that is a no brainer to us.

However, we also see a big need in the space for more mindset-related topics, so you can definitely expect more of that from us in the coming year as well.

This time of the year is great for reflection and refocus, coming up with new, audacious goals to tackle head on in the new year.

We wish you Happy Holidays and a New Year full of action and implementation.

Good luck and reach out to us if you need any help. The best way is to send us an email to info@pybit.es – we read and respond to every email. 

And last but not least, thank you all for supporting us on this amazing journey!

Categories: FLOSS Project Planets

The Top 100 QML Resources by KDAB

Planet KDE - Mon, 2022-12-19 05:03

If you’re a reader of this blog, you probably know that we have a huge amount of quality material on QML and Qt Quick, among other topics. In fact, there is so much material that it can be hard to find what you need.

If that sounds familiar, you’ll want to bookmark this page! This blog captures a snapshot of the top 100 resources we offer on QML and Qt Quick. This mix of blogs, instructional videos, and other resources has been organized into simple, easy-to-understand categories with simple descriptions added when necessary.

If you’re just getting started with Qt, you’ll want to begin with our training class. And if there’s a topic here you can’t find, you may also want to try using our Content Library Search or visit our YouTube channel for even more content.


Introduction to Qt/QML – Full KDAB Training Class

How-To Tutorials


Maximizing your IDE efficiency

Customizing the IDE


Development Patterns

Development Workflow



Graphics Sizing and Scaling

QML and 3D

QML Components


Special Problems


Specific Environments

QML Internals

About KDAB

If you like this article and want to read similar material, consider subscribing via our RSS feed.

Subscribe to KDAB TV for similar informative short video content.

KDAB provides market leading software consulting and development services and training in Qt, C++ and 3D/OpenGL. Contact us.

The post The Top 100 QML Resources by KDAB appeared first on KDAB.

Categories: FLOSS Project Planets

Russ Allbery: Review: Artifact Space

Planet Debian - Sun, 2022-12-18 23:14

Review: Artifact Space, by Miles Cameron

Series: Arcana Imperii #1 Publisher: Gollancz Copyright: June 2021 ISBN: 1-4732-3262-7 Format: Kindle Pages: 483

Artifact Space is a military (mostly) science fiction novel, the first of an expected trilogy. Christian Cameron is a prolific author of historical fiction under that name, thrillers under the name Gordon Kent, and historical fantasy under the name Miles Cameron. This is his first science fiction novel.

Marca Nbaro is descended from one of the great spacefaring mercantile families, but it's not doing her much good. She is a ward of the Orphanage, the boarding school for orphaned children of the DHC, generous in theory and a hellhole in practice. Her dream to serve on one of the Greatships, the enormous interstellar vessels that form the backbone of the human trading network, has been blocked by the school authorities, a consequence of the low-grade war she's been fighting with them throughout her teenage years. But Marca is not a person to take no for an answer. Pawning her family crest gets her just enough money to hire a hacker to doctor her school records, adding the graduation she was denied and getting her aboard the Greatship Athens as a new Midshipper.

I don't read a lot of military science fiction, but there is one type of story that I love that military SF is uniquely well-suited to tell. It's not the combat or the tactics or the often-trite politics. It's the experience of the military as a system, a collective human endeavor.

One ideal of the military is that people come to it from all sorts of backgrounds, races, and social classes, and the military incorporates them all into a system built for a purpose. It doesn't matter who you are or what you did before: if you follow the rules, do your job, and become part of a collaboration larger than yourself, you have a place and people to watch your back whether or not they know you or like you. Obviously, like any ideal, many militaries don't live up to this, and there are many stories about those failures. But the story of that ideal, told well, is a genre I like a great deal and is hard to find elsewhere.

This sort of military story shares some features with found family, and it's not a coincidence that I also like found family stories. But found family still assumes that these people love you, or at least like you. For some protagonists, that's a tricky barrier both to cross and to believe one has crossed. The (admittedly idealized) military doesn't assume anyone likes you. It doesn't expect that you or anyone around you have the right feelings. It just expects you to do your job and work with other people who are doing their job. The requirements are more concrete, and thus in a way easier to believe in.

Artifact Space is one of those military science fiction stories. I was entirely unsurprised to see that the author is a former US Navy career officer.

The Greatships here are, technically, more of a merchant marine than a full-blown military. (The author noted in an interview that he based them on the merchant ships of Venice.) The weapons are used primarily for defense; the purpose of the Greatships is trade, and every crew member has a storage allotment in the immense cargo area that they're encouraged to use. The setting is in the far future, after a partial collapse and reconstruction of human society, in which humans have spread through interstellar space, settled habitable planets, and built immense orbital cities. The Athens is trading between multiple human settlements, but its true destination is far into the deep black: Tradepoint, where it can trade with the mysterious alien Starfish for xenoglas, a material that humans have tried and failed to reproduce and on which much of human construction now depends.

This is, to warn, one of those stories where the scrappy underdog of noble birth makes friends with everyone and is far more competent than anyone expects. The story shape is not going to surprise you, and you have to have considerable tolerance for it to enjoy this book. Marca is ridiculously, absurdly central to the plot for a new Middie. Sometimes this makes sense given her history; other times, she is in the middle of improbable accidents that felt forced by the author. Cameron doesn't entirely break normal career progression, but Marca is very special in a way that you only get to be as the protagonist of a novel.

That said, Cameron does some things with that story shape that I liked. Marca's hard-won survival skills are not weirdly well-suited for her new life aboard ship. To the contrary, she has to unlearn a lot of bad habits and let go of a lot of anxiety. I particularly liked her relationship with her more-privileged cabin mate, which at first seemed to only be a contrast between Thea's privilege and Marca's background, but turned into both of them learning from each other. There's a great mix of supporting characters, with a wide variety of interactions with Marca and a solid sense that all of the characters have their own lives and their own concerns that don't revolve around her.

There is, of course, a plot to go with this. I haven't talked about it much because I think the summaries of this book are a bit of a spoiler, but there are several layers of political intrigue, threats to the ship, an interesting AI, and a good hook in the alien xenoglas trade. Cameron does a deft job balancing the plot with Marca's training and her slow-developing sense of place in the ship (and fear about discovery of her background and hacking). The pacing is excellent, showing all the skill I'd expect from someone with a thriller background and over forty prior novels under his belt. Cameron portrays the tedious work of learning a role on a ship without boring the reader, which is a tricky balancing act.

I also like the setting: a richly multicultural future that felt like it included people from all of Earth, not just the white western parts. That includes a normalized androgyne third gender, which is the sort of thing you rarely see in military SF. Faster-than-light travel involves typical physics hand-waving, but the shape of the hand-waving is one I've not seen before and is a great excuse for copying the well-known property of oceangoing navies that longer ships can go faster.

(One tech grumble, though: while Cameron does eventually say that this is a known tactic and Marca didn't come up with anything novel, deploying spread sensors for greater resolution is sufficiently obvious it should be standard procedure, and shouldn't have warranted the character reactions it got.)

I thoroughly enjoyed this. Artifact Space is the best military SF that I've read in quite a while, at least back to John G. Hemry's JAG in space novels and probably better than those. It's going to strike some readers, with justification, as cliched, but the cliches are handled so well that I had only minor grumbling at a few absurd coincidences. Marca is a great character who is easy to care about. The plot was tense and satisfying, and the feeling of military structure, tradition, jargon, and ship pride was handled well. I had a very hard time putting this down and was sad when it ended.

If you're in the mood for that class of "learning how to be part of a collaborative structure" style of military SF, recommended.

Artifact Space reaches a somewhat satisfying conclusion, but leaves major plot elements unresolved. Followed by Deep Black, which doesn't have a release date at the time of this writing.

Rating: 9 out of 10

Categories: FLOSS Project Planets

Tokodon 22.11.2 release

Planet KDE - Sun, 2022-12-18 14:00

I’m happy to announce the release of Tokodon 22.11.2 (and 22.11.1, which I released earlier this month and forgot to properly announce). These releases contain mostly bug fixes but also some welcome interface improvements.

First, this release adds an account switcher (similar to the one Tobias Fella implemented in NeoChat). Very useful when you need to manage multiple accounts and want to quickly switch between them.


This release also changes the image preview from appearing in a separate full-screen window to being contained inside the main window. This follows a similar change from James Graham in NeoChat.

Preview full window mode

Joshua Goins improved the loading of media attachments and made it possible to hide sensitive images by default using the blurhash effect. This also reuses the existing blurhash implementation from Tobias in NeoChat, and you might start to see a pattern in this release. ;)

Blurhash post

Finally, I added support for custom emojis in many places in the UI. Perfect if you want to show your true verified checkmark in your profile :)

Aside from the nice new improvements, I improved the spacing in the app, and while it is not perfect yet, I hope this makes Tokodon more enjoyable to use. Joshua Goins has also made various improvements to our internal networking code, which should offer better reliability and code that is less prone to crashing. And I fixed an important crash on start-up that was affecting a lot of users.

Finally, I started adding unit tests to Tokodon and added the infrastructure to mock a Mastodon server. We have now reached 12% unit test coverage, and I hope this number will grow with each release.

And for those who prefer a full changelog, here it is:
  • Remember selected account
  • Update metadata
  • Fix rebasing issue
  • Attempt to fix Qt complaining about incomplete Post type
  • Add parents to replies made by Account::get and similar requests
  • More fixes
  • Move away from shared_ptr for Post
  • Fix double-free bug when viewing certain timeline pages
  • Add qtkeychain to .kde-ci.yml
  • Fix hide image icon missing on Android
  • View your own profile via account switcher
  • Add emoji support to page headings and profile bios
  • Fix translation extraction
  • Fix replying from the notification timeline
  • Fix notification list
  • Fix fetching the timeline twice
  • Release 22.11.2
  • Fix showing view-sensitive button too often
  • Don't have text autocomplete on login form
  • Add missing release info
  • Release 21.11.1
  • Remove usage of anchors in layout
  • Use blur hash for loading images and sensitive media
  • Improve hover effect on the card
  • Fix qt6 build
  • Fix dependency in the ci
  • Put accessible description at the bottom
  • Improve the look of cards
  • Use Kirigami.ActionToolBar
  • Allow download images
  • Full screen image like neochat
  • Add m_original_post_id for use in timeline fetch
  • Propertly reset pageStack when switching account
  • Polish NotificationPage
  • Improve layout of follow notification
  • Fix crash when switching account in the notification view
  • Fix translation catalog loading
  • Post: Fix memory leak
  • Fix off by one error in notification model
  • Posibly fix crash (second try)
  • Remove debug leftover
  • Posibly fix crash at startup
  • Improve account switcher
  • Make tap and text selection work together in PostDelegate
  • Fix wrong header url
  • Fix handling of empty displayName
  • Improve layout of PostDelegate
  • Add InteractionButton component for likes, boosts and replies
  • Add Qt 6 CI for FreeBSD and Android
  • Fix custom emoji in account lists
  • Port LoginPage to mobile form
  • Add runtime dependency for org.kde.sonnet
  • More cleanup and add autotests
  • Properly use getter and use displayNameHtml in more places
  • Implement custom emojis
  • Fix coverage badge
  • Add a refresh button for desktop devices
  • Reset message handler properly to prevent threads overwriting each other
  • Fix setInstanceUri early exit, preventing client id from being fetched
  • Add coverage badge
  • Fix reuse
  • Add a qDebug filter to remove confidential data
  • Add model tests
  • Add basic test
  • Split account in an abstract class without networking
  • Remot stray qDebug leaking token
Packager section

You can find the package on download.kde.org and it has been signed with my GPG key.

Categories: FLOSS Project Planets

Bastian Venthur: The State of Python Packaging in 2022

Planet Debian - Sun, 2022-12-18 13:15

Every year or so, I revisit the current best practices for Python packaging. This was my summary for 2021 – here’s the update for 2022.


PyPA is still the place to go for information, best practices and tutorials for packaging Python projects. My only criticism from last year, namely that PyPA was heavily biased towards their own tooling (e.g. pipenv), has been addressed: the tool recommendations section lists now several tools for the same purpose with their own ones not necessarily being the first anymore.

setup.py, setup.cfg, requirements.txt, Pipfile, pyproject.toml – oh my!

This is the reason why I’m revisiting the documentation every year, to see what’s the current way to go. Good progress has been made since last year:

Bye setup.py and setup.cfg – hello pyproject.toml

pyproject.toml finally got mature enough to replace setup.py and setup.cfg in most cases. Recent versions of setuptools and pip now fully support pyproject.toml, and even PyPA’s packaging tutorial completely switched its example project away from setup.py towards pyproject.toml, making it an official recommendation.

So, now you can replace your setup.py with pyproject.toml. If you already had some declarative configuration in setup.cfg, you can move that into pyproject.toml as well. Most tools, like mypy or pytest, also support configuration in pyproject.toml (flake8 being a notable exception…), so there’s no reason to keep setup.cfg around anymore. Actually, if you migrate to pyproject.toml, it is best to do it properly and remove setup.py and setup.cfg, as setuptools behaves a bit buggily when building a package that has either of them alongside a pyproject.toml.
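As a concrete illustration, a minimal pyproject.toml for a setuptools-based project might look something like the sketch below; the project name, version, and dependencies are made-up placeholders:

```toml
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[project]
name = "example-project"        # hypothetical name
version = "1.0.0"
description = "A small example package"
requires-python = ">=3.8"
dependencies = [
    "requests>=2.28",           # lower bound only; pinning happens elsewhere
]

# Tool configuration that used to live in setup.cfg can move here too:
[tool.pytest.ini_options]
testpaths = ["tests"]
```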


requirements.txt is still needed if you develop a “deployable” application (vs. a library) and want to provide pinned dependencies, i.e. with the specific versions that you’ve tested your application with. Usually, the list of requirements in requirements.txt is the same as the one defined in pyproject.toml, but with pinned versions.
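For example, where a pyproject.toml might declare a loose dependency such as `requests>=2.28`, a deployable application’s requirements.txt pins the exact versions that were tested (the version numbers below are illustrative):

```
# requirements.txt – exact, tested versions for reproducible deployments
requests==2.28.1
charset-normalizer==2.1.1
```

Such a file is typically generated from a working environment with `pip freeze`, or with a tool like pip-tools’ `pip-compile`.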

Pipfile + Pipfile.lock

I still completely ignore Pipfile and Pipfile.lock, as they are only supported by pipenv and not backed by any standard.


The major change this year was the proper support of pyproject.toml. I am slowly replacing all setup.py and setup.cfg in my projects with pyproject.toml and haven’t discovered any issues yet. Even packaging those packages as Debian packages is well-supported by Debian’s tooling.

I’m still running a quite boring stack based on pyproject.toml and requirements.txt, ignoring more advanced tools like poetry or such for dependency management. My build-system defined in pyproject.toml requires setuptools and I’m using build for building and twine for uploading.
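That boring stack boils down to a few commands, sketched below; this assumes the package metadata lives in pyproject.toml and that PyPI credentials are already configured for twine:

```shell
# Build an sdist and a wheel into dist/ using the declared build backend
python -m pip install --upgrade build twine
python -m build

# Sanity-check the artifacts, then upload them to PyPI
python -m twine check dist/*
python -m twine upload dist/*
```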

Since PyPA changed the packaging tutorial towards pyproject.toml and away from setup.py, I think we will slowly see setup.py and setup.cfg go away over the years. Speaking of PyPA, I’m happy that they changed their attitude towards a more unbiased recommendation of tooling.

Categories: FLOSS Project Planets

Where should I promote the Eclipse Foundation's activity?

Planet KDE - Sun, 2022-12-18 13:14
Summary in English

This article was originally written in Spanish. In it, I ask readers for events or channels where I can apply or participate to promote the Eclipse Foundation in Spain and in Spanish during 2023. It ends with some interesting links about the Eclipse Foundation.


I joined the Eclipse Foundation in August 2021 to coordinate first the incorporation, and later the consolidation, of the Oniro project into the Foundation.

In Europe, a remote-only policy was adopted from the beginning (a few years ago), which means the Foundation has staff in several countries, including Spain. Besides myself, Carmen Delgado recently joined from Mallorca as community manager of the Adoptium project.

Our activity is not well known in Spain. I try to devote some time to promoting the Eclipse Foundation and Oniro in Spanish. For example, in 2021 I gave a presentation at Akademy-es. Throughout 2022 I gave a short talk at OpenExpoES and made some contacts among Spanish organizations to publicize our extensive activity in areas such as R&D associated with European projects, IoT, and automotive, to name just a few.

I am planning my participation in some event in Spain, as well as promotional actions such as appearing on podcasts. I would like your help, dear reader, to identify events, channels, or actions that we could undertake at the EF during 2023 to publicize some of the more than 400 free software projects developed under the Foundation's umbrella, as well as the activities and services with which we support the developers (communities) and companies that belong to our Working Groups (ecosystems). I am also looking to promote the Oniro project.

If you have any suggestions, do not hesitate to contact me. I am putting together a small plan these days in order to present it to my colleagues at the Foundation.

Learn about the Eclipse Foundation

Here are some links that I find interesting for learning about the Eclipse Foundation:

Categories: FLOSS Project Planets