Feeds
François Marier: Blocking comment spammers on an Ikiwiki blog
Despite comments on my ikiwiki blog being fully moderated, spammers have been increasingly posting link spam comments on my blog. While I used to use the blogspam plugin, the underlying service was likely retired circa 2017 and its public repositories are all archived.
It turns out that there is a relatively simple way to drastically reduce the amount of spam submitted to the moderation queue: ban the datacentre IP addresses that spammers are using.
Looking up AS numbersIt all starts by looking at the IP address of a submitted comment:
From there, we can look it up using whois:
$ whois -r 2a0b:7140:1:1:5054:ff:fe66:85c5 % This is the RIPE Database query service. % The objects are in RPSL format. % % The RIPE Database is subject to Terms and Conditions. % See https://docs.db.ripe.net/terms-conditions.html % Note: this output has been filtered. % To receive output for a database update, use the "-B" flag. % Information related to '2a0b:7140:1::/48' % Abuse contact for '2a0b:7140:1::/48' is 'abuse@servinga.com' inet6num: 2a0b:7140:1::/48 netname: EE-SERVINGA-2022083002 descr: servinga.com - Estonia geoloc: 59.4424455 24.7442221 country: EE org: ORG-SG262-RIPE mnt-domains: HANNASKE-MNT admin-c: CL8090-RIPE tech-c: CL8090-RIPE status: ASSIGNED mnt-by: MNT-SERVINGA created: 2020-02-18T11:12:49Z last-modified: 2024-12-04T12:07:26Z source: RIPE % Information related to '2a0b:7140:1::/48AS207408' route6: 2a0b:7140:1::/48 descr: servinga.com - Estonia origin: AS207408 mnt-by: MNT-SERVINGA created: 2020-02-18T11:18:11Z last-modified: 2024-12-11T23:09:19Z source: RIPE % This query was served by the RIPE Database Query Service version 1.114 (SHETLAND)The important bit here is this line:
origin: AS207408which referts to Autonomous System 207408, owned by a hosting company in Germany called Servinga.
Looking up IP blocksAutonomous Systems are essentially organizations to which IPv4 and IPv6 blocks have been allocated.
These allocations can be looked up easily on the command line either using a third-party service:
$ curl -sL https://ip.guide/as207408 | jq .routes.v4 >> servinga $ curl -sL https://ip.guide/as207408 | jq .routes.v6 >> servingaor a local database downloaded from IPtoASN.
This is what I ended up with in the case of Servinga:
[ "45.11.183.0/24", "80.77.25.0/24", "194.76.227.0/24" ] [ "2a0b:7140:1::/48" ] Preventing comment submissionWhile I do want to eliminate this source of spam, I don't want to block these datacentre IP addresses outright since legitimate users could be using these servers as VPN endpoints or crawlers.
I therefore added the following to my Apache config to restrict the CGI endpoint (used only for write operations such as commenting):
<Location /blog.cgi> Include /etc/apache2/spammers.include Options +ExecCGI AddHandler cgi-script .cgi </Location>and then put the following in /etc/apache2/spammers.include:
<RequireAll> Require all granted # https://ipinfo.io/AS207408 Require not ip 46.11.183.0/24 Require not ip 80.77.25.0/24 Require not ip 194.76.227.0/24 Require not ip 2a0b:7140:1::/48 </RequireAll>Finally, I can restart the website and commit my changes:
$ apache2ctl configtest && systemctl restart apache2.service $ git commit -a -m "Ban all IP blocks from Servinga" Future improvementsI will likely automate this process in the future, but at the moment my blog can go for a week without a single spam message (down from dozens every day). It's possible that I've already cut off the worst offenders.
I have published the list I am currently using.
#! code: Drupal 11: Creating Custom Queues
Creating queues using the core queue classes in Drupal is fairly straightforward. You just need a mechanism of adding data to the queue and a worker to process that data.
As the data you add to the queue is serialised you can add pretty much any data you want to the queue, so the only limitation is rebuilding the data once you pull it out of the queue.
There are some situations where the core Drupal queue system needs to be altered in some way. You might want to separate the data into different tables, or have a different logic for creating or storing the queue items, or even integrate with a third party queue system for manage the queues.
Whilst all of these examples are possible, they require a certain amount of understanding of the queue API and need additional services and settings to get working.
In this article we will look at how to create a custom queue, along with the queue factory needed to integrate that queue with Drupal. We will also look at some settings needed to swap out certain queues for you custom queue implementations. All of the code seen in this article is available in our Drupal Queue Examples repository on GitHub, specifically the queue_custom_example module.
First, let's look at what is requires for a queue to work in Drupal.
Create A Custom Queue With The QueueInterface InterfaceThe interface \Drupal\Core\Queue\QueueInterface is used to build the framework of the queue, which is used to manage the queue items. Your queue object must have the following methods.
philipnorton42 Sun, 01/19/2025 - 19:54Wasting time with inconsistent data
One of my leisure time activities is to develop KMyMoney, a personal finance management application. Most of my time is spent on development, testing, bug reproduction and fixing, user support and sometimes I even write some documentation for this application. And of course, I use it myself on a more or less daily basis.
One of the nice KMyMoney features that helps me a lot is the online transaction download. It’s cool, if you simply fire up your computer in the morning, start KMyMoney, select the “Account/Update all” function, fill in the passwords to your bank and Paypal accounts when asked (though also that is mostly automated using a local GPG protected password store) and see the data coming in. After about a minute I have an overview what happened in the last 24 hours on my accounts. No paper statement needed, so one could say, heavily digitalized. At this point, many thanks go out to the author of AqBanking which does all the heavy work dealing with bank’s protocols under the hood. But a picture is worth a thousand words. See for yourself how this looks like:
A recording of my daily download procedureThe process is working for a long time and I have not touched any of the software parts lately. Today, I noticed a strange thing happening because one of my accounts showed me a difference between the account balance on file and the amount provided by the bank after a download. This may happen, if you enter transactions manually but since I only download them from the bank, there should not be any difference at all. Plus, today is Sunday while on the day before everything was just fine. First thought: which corner case did I hit that KMyMoney is behaving this way and where is the bug?
First thing I usually do in this case is to just close the application and start afresh. No way: same result. Then I remembered, that I added a feature the day before to the QIF importer which also included a small change in the general statement reader code. Of course, I tested things with the QIF importer but not with AqBanking. Maybe, some error creeped into the code and causes this problem. I double checked the code and since it dealt with tags – which are certainly not provided by my bank – it could not be the cause of it.
So I looked at the screen again:
The ledger on the left shows the state before and the one on the right after the download.New data must have been received because the date in the left column changed and also the amount of the colored row changed but not the one in the row above which still shows the previous state. The color is determined by comparing the balance information with the one in the row above. So where is/where are the missing transaction(s)?
Long story short: looking at the logs I noticed, that the online balance was transmitted but there was no transaction at all submitted by the bank. And if I simply take the difference between the two balances it comes down to a reimbursement payment which I expect to receive.
Conclusion: no bug in KMyMoney, but the bank simply provided inconsistent data. Arrrrgh.
Daniel Roy Greenfeld: Fastcore L
Real Python: Python Constants: Improve Your Code's Maintainability
In Python, constants are identifiers for values that don’t change during a program’s execution. Unlike some other languages, Python lacks built-in syntax for constants, treating them as variables that should remain unchanged. You define constants by following a naming convention: use all uppercase letters with underscores separating words. This signals to other developers that these variables should not be reassigned.
By the end of this tutorial, you’ll understand that:
- Constants in Python are variables that should remain unchanged throughout execution.
- Python lacks built-in syntax for constants, relying on conventions to signal immutability.
- Defining a constant involves using uppercase letters with underscores for clarity.
- Best practices include defining constants at the top of a file and using descriptive names.
- Built-in constants like math.pi offer predefined, reliable values, unlike user-defined ones.
To learn the most from this tutorial, you’ll need basic knowledge of Python variables, functions, modules, packages, and namespaces. You’ll also need to know the basics of object-oriented programming in Python.
Sample Code: Click here to download sample code that shows you how to use constants in Python.
Understanding Constants and VariablesVariables and constants are two historical and fundamental concepts in computer programming. Most programming languages use these concepts to manipulate data and work in an effective and logical fashion.
Variables and constants will probably be present in each project, app, library, or other piece of code that you’ll ever write. The question is: what are variables and constants in practice?
What Variables AreIn math, a variable is defined as a symbol that refers to a value or quantity that can change over time. In programming, a variable is also a symbol or name typically associated with a memory address containing a value, object, or piece of data. Like in math, the content of a programming variable can change during the execution of the code that defines it.
Variables typically have a descriptive name that’s somehow associated with a target value or object. This target value can be of any data type. So, you can use variables to represent numbers, strings, sequences, custom objects, and more.
You can perform two main operations on a variable:
- Access its value
- Assign it a new value
In most programming languages, you can access the value associated with a variable by citing the variable’s name in your code. To assign a new value to a given variable, you’ll use an assignment statement, which often consists of the variable’s name, an assignment operator, and the desired value.
In practice, you’ll find many examples of magnitudes, data, and objects that you can represent as variables. A few examples include temperature, speed, time, and length. Other examples of data that you can treat as variables include the number of registered users in a web app, the number of active characters in a video game, and the number of miles covered by a runner.
What Constants AreMath also has the concept of constants. The term refers to a value or quantity that never changes. In programming, constants refer to names associated with values that never change during a program’s execution.
Just like variables, programming constants consist of two things: a name and an associated value. The name will clearly describe what the constant is all about. The value is the concrete expression of the constant itself.
Like with variables, the value associated with a given constant can be of any of data type. So, you can define integer constants, floating-point constants, character constants, string constants, and more.
After you’ve defined a constant, it’ll only allow you to perform a single operation on it. You can only access the constant’s value but not change it over time. This is different from a variable, which allows you to access its value, but also reassign it.
You’ll use constants to represent values that won’t change. You’ll find lots of these values in your day-to-day programming. A few examples include the speed of light, the number of minutes in an hour, and the name of a project’s root folder.
Why Use ConstantsIn most programming languages, constants protect you from accidentally changing their values somewhere in the code when you’re coding at two in the morning, causing unexpected and hard-to-debug errors. Constants also help you make your code more readable and maintainable.
Some advantages of using constants instead of using their values directly in your code include:
Advantage Description Improved readability A descriptive name representing a given value throughout a program is always more readable and explicit than the bare-bones value itself. For example, it’s easier to read and understand a constant named MAX_SPEED than the concrete speed value itself. Clear communication of intent Most people will assume that 3.14 may refer to the Pi constant. However, using the Pi, pi, or PI name will communicate your intent more clearly than using the value directly. This practice will allow other developers to understand your code quickly and accurately. Better maintainability Constants enable you to use the same name to identify the same value throughout your code. If you need to update the constant’s value, then you don’t have to change every instance of the value. You just have to change the value in a single place: the constant definition. This improves your code’s maintainability. Lower risk of errors A constant representing a given value throughout a program is less error-prone than several explicit instances of the value. Say that you use different precision levels for Pi depending on your target calculations. You’ve explicitly used the values with the required precision for every calculation. If you need to change the precision in a set of calculations, then replacing the values can be error-prone because you can end up changing the wrong values. It’s safer to create different constants for different precision levels and change the code in a single place. Reduced debugging needs Constants will remain unchanged during the program’s lifetime. Because they’ll always have the same value, they shouldn’t cause errors and bugs. This feature may not be necessary in small projects, but it may be crucial in large projects with multiple developers. Developers won’t have to invest time debugging the current value of any constant. Thread-safe data storage Constants can only be accessed, not written. 
This feature makes them thread-safe objects, which means that several threads can simultaneously use a constant without the risk of corrupting or losing the underlying data. Read the full article at https://realpython.com/python-constants/ »[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Real Python: Python Class Constructors: Control Your Object Instantiation
Creating a class constructor in Python involves understanding the instantiation process, which consists of two steps: instance creation and instance initialization. You start this process by calling the class like a function, which triggers the .__new__() method to create an instance and the .__init__() method to initialize it. Mastering these methods allows you to customize how Python constructs and initializes objects of your classes.
By the end of this tutorial, you’ll understand that:
- A class constructor in Python triggers the instantiation process, creating and initializing objects.
- Python handles instantiation internally with .__new__() for creation and .__init__() for initialization.
- You can customize object initialization by overriding the .__init__() method in your class.
- The difference between .__new__() and .__init__() is that .__new__() creates the instance, while .__init__() initializes it.
- Common use cases for overriding .__new__() include subclassing immutable types or implementing singletons.
To better understand the examples and concepts in this tutorial, you should be familiar with object-oriented programming and special methods in Python.
Free Bonus: Click here to get access to a free Python OOP Cheat Sheet that points you to the best tutorials, videos, and books to learn more about Object-Oriented Programming with Python.
Take the Quiz: Test your knowledge with our interactive “Python Class Constructors: Control Your Object Instantiation” quiz. You’ll receive a score upon completion to help you track your learning progress:
Interactive Quiz
Python Class Constructors: Control Your Object InstantiationIn this quiz, you'll test your understanding of class constructors in Python. By working through this quiz, you'll revisit the internal instantiation process, object initialization, and fine-tuning object creation.
Python’s Class Constructors and the Instantiation ProcessLike many other programming languages, Python supports object-oriented programming. At the heart of Python’s object-oriented capabilities, you’ll find the class keyword, which allows you to define custom classes that can have attributes for storing data and methods for providing behaviors.
Once you have a class to work with, then you can start creating new instances or objects of that class, which is an efficient way to reuse functionality in your code.
Creating and initializing objects of a given class is a fundamental step in object-oriented programming. This step is often referred to as object construction or instantiation. The tool responsible for running this instantiation process is commonly known as a class constructor.
Getting to Know Python’s Class ConstructorsIn Python, to construct an object of a given class, you just need to call the class with appropriate arguments, as you would call any function:
Python >>> class SomeClass: ... pass ... >>> # Call the class to construct an object >>> SomeClass() <__main__.SomeClass object at 0x7fecf442a140> Copied!In this example, you define SomeClass using the class keyword. This class is currently empty because it doesn’t have attributes or methods. Instead, the class’s body only contains a pass statement as a placeholder statement that does nothing.
Then you create a new instance of SomeClass by calling the class with a pair of parentheses. In this example, you don’t need to pass any argument in the call because your class doesn’t take arguments yet.
In Python, when you call a class as you did in the above example, you’re calling the class constructor, which creates, initializes, and returns a new object by triggering Python’s internal instantiation process.
A final point to note is that calling a class isn’t the same as calling an instance of a class. These are two different and unrelated topics. To make a class’s instance callable, you need to implement a .__call__() special method, which has nothing to do with Python’s instantiation process.
Understanding Python’s Instantiation ProcessYou trigger Python’s instantiation process whenever you call a Python class to create a new instance. This process runs through two separate steps, which you can describe as follows:
- Create a new instance of the target class
- Initialize the new instance with an appropriate initial state
To run the first step, Python classes have a special method called .__new__(), which is responsible for creating and returning a new empty object. Then another special method, .__init__(), takes the resulting object, along with the class constructor’s arguments.
The .__init__() method takes the new object as its first argument, self. Then it sets any required instance attribute to a valid state using the arguments that the class constructor passed to it.
In short, Python’s instantiation process starts with a call to the class constructor, which triggers the instance creator, .__new__(), to create a new empty object. The process continues with the instance initializer, .__init__(), which takes the constructor’s arguments to initialize the newly created object.
To explore how Python’s instantiation process works internally, consider the following example of a Point class that implements a custom version of both methods, .__new__() and .__init__(), for demonstration purposes:
Python point.py 1class Point: 2 def __new__(cls, *args, **kwargs): 3 print("1. Create a new instance of Point.") 4 return super().__new__(cls) 5 6 def __init__(self, x, y): 7 print("2. Initialize the new instance of Point.") 8 self.x = x 9 self.y = y 10 11 def __repr__(self) -> str: 12 return f"{type(self).__name__}(x={self.x}, y={self.y})" Copied! Read the full article at https://realpython.com/python-class-constructor/ »[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Real Python: Create and Modify PDF Files in Python
Creating and modifying PDF files in Python is straightforward with libraries like pypdf and ReportLab. You can read, manipulate, and create PDF files using these tools. pypdf lets you extract text, split, merge, rotate, crop, encrypt, and decrypt PDFs. ReportLab enables you to create new PDFs from scratch, allowing customization with fonts and page sizes.
By the end of this tutorial, you’ll understand that:
- You can read and modify existing PDF files using pypdf in Python.
- You can create new PDF files from scratch with the ReportLab library.
- Methods to encrypt and decrypt a PDF file with a password are available in pypdf.
- Concatenating and merging multiple PDF files can be done using pypdf.
- You can add custom fonts to a PDF using ReportLab.
- Python can create interactive PDFs with forms using ReportLab.
To follow along with this tutorial, you should download and extract to your home folder the materials used in the examples. To do this, click the link below:
Download the sample materials: Click here to get the materials you’ll use to learn about creating and modifying PDF files in this tutorial.
Extracting Text From PDF Files With pypdfIn this section, you’ll learn how to read PDF files and extract their text using the pypdf library. Before you can do that, though, you need to install it with pip:
Shell $ python -m pip install pypdf Copied!With this command, you download and install the latest version of pypdf from the Python package index (PyPI). To verify the installation, go ahead and run the following command in your terminal:
Shell $ python -m pip show pypdf Name: pypdf Version: 3.8.1 Summary: A pure-python PDF library capable of splitting, merging, cropping, and transforming PDF files Home-page: Author: Author-email: Mathieu Fenniak <biziqe@mathieu.fenniak.net> License: Location: .../lib/python3.10/site-packages Requires: Required-by: Copied!Pay particular attention to the version information. At the time of publication for this tutorial, the latest version of pypdf was 3.8.1. This library has gotten plenty of updates lately, and cool new features are added quite frequently. Most importantly, you’ll find many breaking changes in the library’s API if you compare it with its predecessor library PyPDF2.
Before diving into working with PDF files, you must know that this tutorial is adapted from the chapter “Creating and Modifying PDF Files” in Python Basics: A Practical Introduction to Python 3.
The book uses Python’s built-in IDLE editor to create and edit Python files and interact with the Python shell, so you’ll find occasional references to IDLE throughout this tutorial. However, you should have no problems running the example code from the editor and environment of your choice.
Reading PDF Files With PdfReaderTo kick things off, you’ll open a PDF file and read some information about it. You’ll use the Pride_and_Prejudice.pdf file provided in the downloadable resources for this tutorial.
Open IDLE’s interactive window and import the PdfReader class from pypdf:
Python >>> from pypdf import PdfReader Copied!To create a new instance of the PdfReader class, you’ll need to provide the path to the PDF file that you want to open. You can do that using the pathlib module:
Python >>> from pathlib import Path >>> pdf_path = ( ... Path.home() ... / "creating-and-modifying-pdfs" ... / "practice_files" ... / "Pride_and_Prejudice.pdf" ... ) Copied!The pdf_path variable now contains the path to a PDF version of Jane Austen’s Pride and Prejudice.
Note: You may need to change pdf_path so that it corresponds to the location of the creating-and-modifying-pdfs/ folder on your computer.
Now create the PdfReader instance by calling the class’s constructor with the path to your PDF file as an argument:
Python >>> pdf_reader = PdfReader(pdf_path) Copied!If you’ve been following along in Python Basics, then you’ll remember from Chapter 12, “File Input and Output,” that all open files should be closed before a program terminates. The PdfReader object does all of this for you, so you don’t need to worry about opening or closing the PDF file!
Now that you’ve created a PdfReader instance, you can use it to gather information about the PDF file. For example, to get the number of pages contained in the PDF file, you can use the built-in len() function like in the code below:
Read the full article at https://realpython.com/creating-modifying-pdf/ »[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Real Python: How to Install Python on Your System: A Guide
Installing Python on your system involves a few straightforward steps. First, check if Python is already installed by opening a command-line interface and typing python --version or python3 --version. You can install Python on Windows using the official installer from Python.org or through the Microsoft Store. On macOS, you can use the official installer or Homebrew. For Linux, use your package manager or build Python from source.
By the end of this tutorial, you’ll understand how to:
- Check if Python is installed by running python --version or python3 --version in a command-line interface.
- Upgrade Python by downloading and installing the latest version from Python.org.
- Install and manage multiple Python versions with pyenv to keep them separate.
This tutorial covers installing the latest Python on the most important platforms or operating systems, such as Windows, macOS, Linux, iOS, and Android. However, it doesn’t cover all the existing Linux distributions, as that would be a massive task. Nevertheless, you’ll find instructions for the most popular distributions available today.
To get the most out of this tutorial, you should be comfortable using your operating system’s terminal or command line.
Free Bonus: Click here to get a Python Cheat Sheet and learn the basics of Python 3, like working with data types, dictionaries, lists, and Python functions.
Take the Quiz: Test your knowledge with our interactive “Python Installation and Setup” quiz. You’ll receive a score upon completion to help you track your learning progress:
Interactive Quiz
Python Installation and SetupIn this quiz, you'll test your understanding of how to install or update Python on your computer. With this knowledge, you'll be able to set up Python on various operating systems, including Windows, macOS, and Linux.
Windows: How to Check or Get PythonIn this section, you’ll learn to check whether Python is installed on your Windows operating system (OS) and which version you have. You’ll also explore three installation options that you can use on Windows.
Note: In this tutorial, you’ll focus on installing the latest version of Python in your current operating system (OS) rather than on installing multiple versions of Python. If you want to install several versions of Python in your OS, then check out the Managing Multiple Python Versions With pyenv tutorial. Note that on Windows machines, you’d have to use pyenv-win instead of pyenv.
For a more comprehensive guide on setting up a Windows machine for Python programming, check out Your Python Coding Environment on Windows: Setup Guide.
Checking the Python Version on WindowsTo check whether you already have Python on your Windows machine, open a command-line application like PowerShell or the Windows Terminal.
Follow the steps below to open PowerShell on Windows:
- Press the Win key.
- Type PowerShell.
- Press Enter.
Alternatively, you can right-click the Start button and select Windows PowerShell or Windows PowerShell (Admin). In some versions of Windows, you’ll find Terminal or Terminal (admin).
Note: To learn more about your options for the Windows terminal, check out Using the Terminal on Windows.
With the command line open, type in the following command and press the Enter key:
Windows PowerShell PS> python --version Python 3.x.z Copied!Using the --version switch will show you the installed version. Note that the 3.x.z part is a placeholder here. In your machine, x and z will be numbers corresponding to the specific version you have installed.
Alternatively, you can use the -V switch:
Windows PowerShell PS> python -V Python 3.x.z Copied!Using the python -V or python—-version command, you can check whether Python is installed on your system and learn what version you have. If Python isn’t installed on your OS, you’ll get an error message.
Knowing the Python Installation Options on WindowsYou’ll have different options to install Python if you’re on a Windows machine. Here are three popular ones:
Read the full article at https://realpython.com/installing-python/ »[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Real Python: pandas GroupBy: Your Guide to Grouping Data in Python
The pandas .groupby() method allows you to efficiently analyze and transform datasets when working with data in Python. With df.groupby(), you can split a DataFrame into groups based on column values, apply functions to each group, and combine the results into a new DataFrame. This technique is essential for tasks like aggregation, filtering, and transformation on grouped data.
By the end of this tutorial, you’ll understand that:
- Calling .groupby("column_name") splits a DataFrame into groups, applies a function to each group, and combines the results.
- To group by multiple columns, you can pass a list of column names to .groupby().
- Common aggregation methods in pandas include .sum(), .mean(), and .count().
- You can use custom functions with pandas .groupby() to perform specific operations on groups.
This tutorial assumes that you have some experience with pandas itself, including how to read CSV files into memory as pandas objects with read_csv(). If you need a refresher, then check out Reading CSVs With pandas and pandas: How to Read and Write Files.
You can download the source code for all the examples in this tutorial by clicking on the link below:
Download Datasets: Click here to download the datasets that you’ll use to learn about pandas’ GroupBy in this tutorial.
PrerequisitesBefore you proceed, make sure that you have the latest version of pandas available within a new virtual environment:
Windows PowerShell PS> python -m venv venv PS> venv\Scripts\activate (venv) PS> python -m pip install pandas Copied! Shell $ python3 -m venv venv $ source venv/bin/activate (venv) $ python -m pip install pandas Copied!In this tutorial, you’ll focus on three datasets:
- The U.S. Congress dataset contains public information on historical members of Congress and illustrates several fundamental capabilities of .groupby().
- The air quality dataset contains periodic gas sensor readings. This will allow you to work with floats and time series data.
- The news aggregator dataset holds metadata on several hundred thousand news articles. You’ll be working with strings and doing text munging with .groupby().
Once you’ve downloaded the .zip file, unzip the file to a folder called groupby-data/ in your current directory. Before you read on, ensure that your directory tree looks like this:
./
│
└── groupby-data/
    ├── legislators-historical.csv
    ├── airqual.csv
    └── news.csv

With pandas installed, your virtual environment activated, and the datasets downloaded, you’re ready to jump in!
Example 1: U.S. Congress Dataset

You’ll jump right into things by dissecting a dataset of historical members of Congress. You can read the CSV file into a pandas DataFrame with read_csv():
# pandas_legislators.py
import pandas as pd

dtypes = {
    "first_name": "category",
    "gender": "category",
    "type": "category",
    "state": "category",
    "party": "category",
}
df = pd.read_csv(
    "groupby-data/legislators-historical.csv",
    dtype=dtypes,
    usecols=list(dtypes) + ["birthday", "last_name"],
    parse_dates=["birthday"],
)

The dataset contains members’ first and last names, birthday, gender, type ("rep" for House of Representatives or "sen" for Senate), U.S. state, and political party. You can use df.tail() to view the last few rows of the dataset:
>>> from pandas_legislators import df
>>> df.tail()
      last_name first_name   birthday gender type state       party
11970   Garrett     Thomas 1972-03-27      M  rep    VA  Republican
11971    Handel      Karen 1962-04-18      F  rep    GA  Republican
11972     Jones     Brenda 1959-10-24      F  rep    MI    Democrat
11973    Marino        Tom 1952-08-15      M  rep    PA  Republican
11974     Jones     Walter 1943-02-10      M  rep    NC  Republican

The DataFrame uses categorical dtypes for space efficiency:
>>> df.dtypes
last_name             object
first_name          category
birthday      datetime64[ns]
gender              category
type                category
state               category
party               category
dtype: object

You can see that most columns of the dataset have the type category, which reduces the memory load on your machine.
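To see why this matters, here is a quick sketch (using made-up repeated strings rather than the Congress data) comparing the memory usage of object and category dtypes:

```python
import pandas as pd

# A column with many repeated values, like "party", benefits most:
parties = pd.Series(["Democrat", "Republican", "Whig"] * 4000)

as_object = parties.memory_usage(deep=True)
as_category = parties.astype("category").memory_usage(deep=True)

# The categorical version stores each unique string once,
# plus a compact array of integer codes:
assert as_category < as_object
```

The same conversion is what the `dtype=dtypes` argument to read_csv() accomplishes at load time.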
Read the full article at https://realpython.com/pandas-groupby/ »
This Week in KDE Apps: Usability, accessibility, and supercharging the Fediverse
Welcome to a new issue of "This Week in KDE Apps"! Every week we cover as much as possible of what's happening in the world of KDE apps.
This week we also published a new web page in our "KDE For You" series, this time about "KDE For Digital Sovereignty". These pages give you tons of recommendations about KDE and other FOSS apps you can use in different situations, be it for education, creativity, travel and more.
Arianna EBook reader
It's now possible to change the app's color scheme independently of the system's color scheme (Onuralp SEZER, 25.04.0. Link).
Dolphin Manage your files
When manually adding items to the Places panel, the current location's custom icon is pre-populated in the icon field, and the item will now be created globally by default, so it appears in other apps' Places panels as well (Nate Graham, Frameworks 6.11. Link and link 2).
Elisa Play local music and listen to online radio
We added an entry at the top of the grid/list to open a track view for the current artist or genre. Tracks from artists opened from genre view will be filtered by genre (Pedro Nishiyama, 25.04.0. Link).
We have solved the problem of creating infinitely nested views when browsing artist > album > artist (Pedro Nishiyama, 25.04.0. Link).
Haruna Media player
Haruna 1.3 is out with lots of code refactoring. Additionally, the default actions for left and right mouse buttons have changed: left click is now Play/Pause and right click opens the context menu. These actions can be changed in Settings on the mouse page.
KDE Itinerary Digital travel assistant
Volker restored public transport data access to Digitransit in Finland and to Rolph in Germany (Volker Krause, 24.12.2, also affects KTrip), and Joshua and Grzegorz wrote and improved travel document extractors for American Airlines, Brightline and Southwest (Joshua Goins, 24.12.2, Link 1, link 2, and link 3) and Koleo (Grzegorz Mu, 24.12.2, Link).
KMail A feature-rich email application
Joshua fixed various issues with the markdown rendering in KMail, enabling markdown footnotes, highlighting and removing some dead code (Joshua Goins, 25.04.0. Link 1 and link 2); and, to facilitate the use of KMail's security features, KMail will now query a key server when clicking on an unknown OpenPGP certificate (Tobias Fella, 25.04.0. Link).
Kdenlive Video editor
The audio waveform of Kdenlive was completely rewritten. It is now around twice as fast to generate and is more accurate (Étienne André, funded by the Kdenlive Fundraiser, 25.04.0. Link).
Before:
After:
KDevelop Featureful, plugin-extensible IDE for C/C++ and other programming languages
We added and improved debugger pretty printers for QJson*, QCbor*, QDateTime, and QTimeZone (David Faure, 25.04.0. Link 1 and link 2).
Krita Digital Painting, Creative Freedom
The latest Krita Monthly Update is out. If you want to learn what's going on in Krita as well as see some amazing artwork made with Krita, check it out.
Kurzschwardzenbuglen Nature Sanctuary by @Yaroslavus_Artem
Barcode Scanner Scan and create QR-Codes
Qrca now forces the rendering of QR code content to be plain text (Kai Uwe Broulik. Link) and only shows the flashlight button on devices with a flashlight (e.g. not on your laptop) (Kai Uwe Broulik. Link).
Tokodon Browse the Fediverse
Tokodon will now remind you to add an alt text to your images (Joshua Goins, 25.04.0. Link).
We also added an option for a confirmation dialog before boosting a post. This is particularly relevant for people managing multiple accounts to prevent them from boosting posts from the wrong account (Joshua Goins, 25.04.0. Link).
In the department of trust and safety improvements, you can now filter some posts from your timeline (Joshua Goins, 25.04.0. Link).
And show a banner when an account has moved to another server (Joshua Goins, 25.04.0. Link).
You can now browse posts that are about a news link (Joshua Goins, 25.04.0. Link) and see the post associated with an image in the media grid of a profile (Joshua Goins, 25.04.0. Link).
We also fixed a bug where, when failing to authenticate one of your accounts, Tokodon would be stuck indefinitely on the loading screen (Carl Schwan, 24.12.2. Link).
Kwave Sound editor
We significantly improved the performance of playback using QtMultimedia (Thomas Eschenbacher, 25.04.0. Link).
…And Everything Else
This blog only covers the tip of the iceberg! If you’re hungry for more, check out Nate's blog about Plasma and be sure not to miss his This Week in Plasma series, where every Saturday he covers all the work being put into KDE's Plasma desktop environment.
For a complete overview of what's going on, visit KDE's Planet, where you can find all KDE news unfiltered directly from our contributors.
Get Involved
The KDE organization has become important in the world, and your time and contributions have helped us get there. As we grow, we're going to need your support for KDE to become sustainable.
You can help KDE by becoming an active community member and getting involved. Each contributor makes a huge difference in KDE — you are not a number or a cog in a machine! You don’t have to be a programmer either. There are many things you can do: you can help hunt and confirm bugs, even maybe solve them; contribute designs for wallpapers, web pages, icons and app interfaces; translate messages and menu items into your own language; promote KDE in your local community; and a ton more things.
You can also help us by donating. Any monetary contribution, however small, will help us cover operational costs, salaries, travel expenses for contributors and in general just keep KDE bringing Free Software to the world.
To get your application mentioned here, please ping us in invent or in Matrix.
Petter Reinholdtsen: 121 packages in Debian mapped to hardware for automatic recommendation
For some years now, I have been working on an automatic hardware-based package recommendation system for Debian and other Linux distributions. The isenkram system I started on back in 2013 now consists of two subsystems: one locating firmware files using the information provided by apt-file, and one matching hardware to packages using information provided by AppStream. The former is very similar to the mechanism implemented in debian-installer to pick the right firmware packages to install. This post is about the latter system. Thanks to steady progress and good help from other Debian and upstream developers, I am happy to report that the isenkram system is now able to recommend 121 packages using information provided via AppStream.
The mapping is done using modalias information provided by the kernel, the same information used by udev when creating device files, and the kernel when deciding which kernel modules to load. To get all the modalias identifiers relevant for your machine, you can run the following command on the command line:
find /sys/devices -name modalias -print0 | xargs -0 sort -u

The modalias identifiers can look something like this:
acpi:PNP0000
cpu:type:x86,ven0000fam0006mod003F:feature:,0000,0001,0002,0003,0004,0005,0006,0007,0008,0009,000B,000C,000D,000E,000F,0010,0011,0013,0015,0016,0017,0018,0019,001A,001B,001C,001D,001F,002B,0034,003A,003B,003D,0068,006B,006C,006D,006F,0070,0072,0074,0075,0076,0078,0079,007C,0080,0081,0082,0083,0084,0085,0086,0087,0088,0089,008B,008C,008D,008E,008F,0091,0092,0093,0094,0095,0096,0097,0098,0099,009A,009B,009C,009D,009E,00C0,00C5,00E1,00E3,00EB,00ED,00F0,00F1,00F3,00F5,00F6,00F9,00FA,00FB,00FD,00FF,0100,0101,0102,0103,0111,0120,0121,0123,0125,0127,0128,0129,012A,012C,012D,0140,0160,0161,0165,016C,017B,01C0,01C1,01C2,01C4,01C5,01C6,01F9,024A,025A,025B,025C,025F,0282
dmi:bvnDellInc.:bvr2.18.1:bd08/14/2023:br2.18:svnDellInc.:pnPowerEdgeR730:pvr:rvnDellInc.:rn0H21J3:rvrA09:cvnDellInc.:ct23:cvr:skuSKU=NotProvided
pci:v00008086d00008D3Bsv00001028sd00000600bc07sc80i00
platform:serial8250
scsi:t-0x05
usb:v413CpA001d0000dc09dsc00dp00ic09isc00ip00in00

The entries above are a selection of the complete set available on a Dell PowerEdge R730 machine I have access to, to give an idea of the various styles of hardware identifiers presented in the modalias format. When looking up relevant packages in a Debian Testing installation on the same R730, I get this list of packages proposed:
% sudo isenkram-lookup
firmware-bnx2x
firmware-nvidia-graphics
firmware-qlogic
megactl
wsl
%

The list consists of firmware packages requested by kernel modules, as well as packages with programs to get the status of the RAID controller and to maintain the LAN console. When the edac-utils package, providing tools to check the ECC RAM status, enters testing in a few days, it will also show up as a proposal from isenkram. In addition, once the mfiutil package we uploaded in October gets past NEW processing, it will also propose a tool to configure the RAID controller.
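Conceptually, the matching step boils down to glob-style pattern matching of the machine's modalias strings against patterns that packages declare via AppStream. A minimal Python sketch, with a made-up mapping (the package assignments here are hypothetical, not isenkram's real data):

```python
import fnmatch

# Hypothetical AppStream-style modalias patterns -> package names:
provides = {
    "pci:v00008086d00008D3B*": "some-firmware-package",  # hypothetical
    "usb:v413C*": "some-dell-tool",                      # hypothetical
}

def lookup(modaliases):
    """Return packages whose pattern matches any of the given modalias strings."""
    return sorted({
        pkg
        for pattern, pkg in provides.items()
        for alias in modaliases
        if fnmatch.fnmatchcase(alias, pattern)
    })
```

Feeding it the R730 identifiers from above would propose both hypothetical packages, while an unmatched identifier like scsi:t-0x05 proposes nothing.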
Another example is the trusty old Lenovo Thinkpad X230, which has hardware handled by several packages in the archive. This one is running Debian Stable:
% isenkram-lookup
beignet-opencl-icd
bluez
cheese
ethtool
firmware-iwlwifi
firmware-misc-nonfree
fprintd
fprintd-demo
gkrellm-thinkbat
hdapsd
libpam-fprintd
pidgin-blinklight
thinkfan
tlp
tp-smapi-dkms
tpb
%

Here the proposal consists of software to handle the camera, bluetooth, network card, wifi card, GPU, fan, fingerprint reader and acceleration sensor on the machine.
Here is the complete set of packages currently providing hardware mapping via AppStream in Debian Unstable: air-quality-sensor, alsa-firmware-loaders, antpm, array-info, avarice, avrdude, bmusb-v4l2proxy, brltty, calibre, colorhug-client, concordance-common, consolekit, dahdi-firmware-nonfree, dahdi-linux, edac-utils, eegdev-plugins-free, ekeyd, elogind, firmware-amd-graphics, firmware-ath9k-htc, firmware-atheros, firmware-b43-installer, firmware-b43legacy-installer, firmware-bnx2, firmware-bnx2x, firmware-brcm80211, firmware-carl9170, firmware-cavium, firmware-intel-graphics, firmware-intel-misc, firmware-ipw2x00, firmware-ivtv, firmware-iwlwifi, firmware-libertas, firmware-linux-free, firmware-mediatek, firmware-misc-nonfree, firmware-myricom, firmware-netronome, firmware-netxen, firmware-nvidia-graphics, firmware-qcom-soc, firmware-qlogic, firmware-realtek, firmware-ti-connectivity, fpga-icestorm, g810-led, galileo, garmin-forerunner-tools, gkrellm-thinkbat, goldencheetah, gpsman, gpstrans, gqrx-sdr, i8kutils, imsprog, ledger-wallets-udev, libairspy0, libam7xxx0.1, libbladerf2, libgphoto2-6t64, libhamlib-utils, libm2k0.9.0, libmirisdr4, libnxt, libopenxr1-monado, libosmosdr0, librem5-flash-image, librtlsdr0, libticables2-8, libx52pro0, libykpers-1-1, libyubikey-udev, limesuite, linuxcnc-uspace, lomoco, madwimax, media-player-info, megactl, mixxx, mkgmap, msi-keyboard, mu-editor, mustang-plug, nbc, nitrokey-app, nqc, ola, openfpgaloader, openocd, openrazer-driver-dkms, pcmciautils, pcscd, pidgin-blinklight, ponyprog, printer-driver-splix, python-yubico-tools, python3-btchip, qlcplus, rosegarden, scdaemon, sispmctl, solaar, spectools, sunxi-tools, t2n, thinkfan, tlp, tp-smapi-dkms, trezor, tucnak, ubertooth, usbrelay, uuu, viking, w1retap, wsl, xawtv, xinput-calibrator, xserver-xorg-input-wacom and xtrx-dkms.
In addition to these, there are several with patches pending in the Debian bug tracking system, and even more where no-one has written patches yet. Good candidates for the latter are packages with udev rules but no AppStream hardware information.
The isenkram system consists of two packages: isenkram-cli with the command line tools, and isenkram with a GUI background process. The latter will listen for D-Bus events emitted by udev when new hardware becomes available (like when inserting a USB dongle or discovering a new bluetooth device), look up the modalias entry for this piece of hardware in AppStream (and a hard-coded list of mappings from isenkram - I am currently working hard to move this list to AppStream), and pop up a dialog proposing to install any not already installed packages supporting this hardware. It works very well today when inserting the LEGO Mindstorms RCX, NXT and EV3 controllers. :) If you want to make sure more hardware related packages get recommended, please help out fixing the remaining packages in Debian to provide AppStream metadata with hardware mappings.
As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.
Seth Michael Larson: How to disable Copilot in GitHub
Armin Ronacher: Automatic Server Reloading in Rust on change: What is listenfd/systemfd?
When I developed Werkzeug (and later Flask), the most important part of the developer experience for me was enabling fast, automatic reloading. In Werkzeug (and with it Flask), this is achieved by using two processes at all times. The parent process holds on to the file descriptor of the socket on which the server listens, and a subprocess picks up that file descriptor. That subprocess restarts when it detects changes. This ensures that no matter what happens, there is no window where the browser reports a connection error. At worst, the browser will hang until the process finishes reloading, after which the page loads successfully. In case the inner process fails to come up during restarts, you get an error message.
A few years ago, I wanted to accomplish the same experience for working with Rust code, which is why I wrote systemfd and listenfd. However, I realized that I never really wrote here about how they work, and disappointingly I think those crates, and a good auto-reloading experience in Rust, are largely unknown.
Watching for Changes
Firstly, one needs to monitor the file system for changes. While in theory I could have done this myself, there was already a tool that could do that.
At the time there was cargo watch. Today one might instead use the more generic watchexec. Either one monitors your workspace for changes and then executes a command. So you can, for instance, tell it to restart your program. One of these will work:
watchexec -r -- cargo run
cargo watch -x run

You will need a tool like that to do the watching part. At this point I recommend the more generic watchexec, which you can find on homebrew and elsewhere.
Passing Sockets
But what about the socket? The solution to this problem I picked comes from systemd. Systemd has a “protocol” that standardizes passing file descriptors from one process to another through environment variables. In systemd parlance this is called “socket activation,” as it allows systemd to only launch a program if someone started making a request to the socket. This concept was originally introduced by Apple as part of launchd.
To make this work with Rust, I created two crates:
- systemfd is the command line tool that opens sockets and passes them on to other programs.
- listenfd is a Rust crate that accepts file descriptors from systemd or systemfd.
It's worth noting that systemfd is not exclusively useful to Rust. The systemd protocol can be implemented in other languages as well, meaning that if you have a socket server written in Go or Python, you can also use systemfd.
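Since the protocol is just environment variables plus inherited file descriptors, the receiving side can be sketched in a few lines of Python. This sketch skips the LISTEN_PID check, which is also what systemfd's --no-pid flag sidesteps; the function name is my own, not from any library:

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first passed fd, per the systemd protocol


def take_listen_fd(index=0, start_fd=SD_LISTEN_FDS_START):
    """Return the index-th socket passed via LISTEN_FDS, or None."""
    count = int(os.environ.get("LISTEN_FDS", "0"))
    if index >= count:
        return None  # not socket-activated; the caller should bind itself
    return socket.socket(fileno=start_fd + index)
```

A server would call take_listen_fd() first and fall back to binding its own port when it returns None, mirroring what the listenfd crate does in Rust.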
So here is how you use it.
First you need to add listenfd to your project:
cargo add listenfd

Then, modify your server code to accept sockets via listenfd before falling back to listening itself on ports provided through command-line arguments or configuration files. Here is an example using listenfd in axum:
use axum::{routing::get, Router};
use tokio::net::TcpListener;

async fn index() -> &'static str {
    "Hello, World!"
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let app = Router::new().route("/", get(index));

    let mut listenfd = listenfd::ListenFd::from_env();
    let listener = match listenfd.take_tcp_listener(0)? {
        Some(listener) => TcpListener::from_std(listener),
        None => TcpListener::bind("0.0.0.0:3000").await,
    }?;

    axum::serve(listener, app).await?;
    Ok(())
}

The key point here is to accept socket 0 from the environment as a TCP listener and use it if available. If the socket is not provided (e.g. when launched without systemd/systemfd), the code falls back to opening a fixed port.
Putting it Together
Finally, you can use cargo watch / watchexec together with systemfd:
systemfd --no-pid -s http::8888 -- watchexec -r -- cargo run
systemfd --no-pid -s http::8888 -- cargo watch -x run

This is what the parameters mean:
- systemfd needs to come first, as it's the program that opens the sockets.
- --no-pid is a flag that prevents the PID from being passed. This is necessary for listenfd to accept the socket. This is a departure from systemd's socket passing protocol, which otherwise does not allow sockets to be passed through another program (like watchexec). In short: when the PID information is not passed, listenfd will accept the socket regardless. Otherwise it would only accept it from the direct parent process.
- -s http::8888 tells systemfd to open one TCP socket on port 8888. Using http instead of tcp is a small improvement that will cause systemfd to print out a URL on startup.
- -- watchexec -r makes watchexec restart the process when something changes in the current working directory.
- -- cargo run is the program that watchexec will start and restart on changes. In Rust this will first compile the changes and then run the application. Because we put listenfd in, it will try to first accept the socket from systemfd.
The end result is that you can edit your code, and it will recompile automatically and restart the server without dropping any requests. When you run it, and perform changes, it will look a bit like this:
$ systemfd --no-pid -s http::5555 -- watchexec -r -- cargo run
~> socket http://127.0.0.1:5555/ -> fd #3
[Running: cargo run]
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.02s
     Running `target/debug/axum-test`
[Running: cargo run]
   Compiling axum-test v0.1.0 (/private/tmp/axum-test)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.52s
     Running `target/debug/axum-test`

For easier access, I recommend putting this into a Makefile or similar so you can just run make devserver and it runs the server in watch mode.
To install systemfd you can use curl to bash:
curl -sSfL https://github.com/mitsuhiko/systemfd/releases/latest/download/systemfd-installer.sh | sh

What About Windows?
Now how does this work on Windows? The answer is that systemfd and listenfd have a custom, proprietary protocol that also makes socket passing work on Windows. That's a more complex system which involves a local RPC server. However, the system does also support Windows, and the details of how it works are largely irrelevant for you as a user — unless you want to implement that protocol for another programming language.
Potential Improvements
I really enjoy using this combination, but it can be quite frustrating to require so many commands, and the command line workflow isn't optimal. Ideally, this functionality would be better integrated into specific Rust frameworks like axum and provided through a dedicated cargo plugin. In a perfect world, one could simply run cargo devserver, and everything would work seamlessly.
However, maintaining such an integrated experience is a much more involved effort than what I have. Hopefully, someone will be inspired to further enhance the developer experience and achieve deeper integration with Rust frameworks, making it more accessible and convenient for everyone.
Real Python: Understanding the Python Mock Object Library
Mocking in Python with unittest.mock allows you to simulate complex logic or unpredictable dependencies, such as responses from external services. You create mock objects to replace real ones in your tests, ensuring that your tests are isolated. The Mock class allows you to imitate real objects, and the patch() function lets you temporarily substitute mocks for real objects in your tests.
By the end of this tutorial, you’ll understand that:
- A mock in Python is a substitute object that simulates a real object in a testing environment.
- Mock differs from MagicMock in that MagicMock includes implementations of most magic methods.
- The patch() function replaces real objects with mock instances, controlling the scope of mocking.
- You can assert if a Mock object was called with methods like .assert_called().
- You set a mock’s return value by assigning a value to the mock’s .return_value attribute.
In this tutorial, you’ll explore how mocking enhances your testing strategy by enabling controlled and predictable test environments for your Python code. When you can control the behavior of your code during testing, you can reliably test that your application logic is correct.
Get Your Code: Click here to download the free sample code that you’ll use to learn about Python’s mock object library.
Take the Quiz: Test your knowledge with our interactive “Understanding the Python Mock Object Library” quiz. You’ll receive a score upon completion to help you track your learning progress:
Understanding the Python Mock Object Library
In this quiz, you'll test your understanding of Python's unittest.mock library. With this knowledge, you'll be able to write robust tests, create mock objects, and ensure your code is reliable and efficient.
What Is Mocking?
A mock object substitutes and imitates a real object within a testing environment. Using mock objects is a versatile and powerful way to improve the quality of your tests. This is because by using Python mock objects, you can control your code’s behavior during testing.
For example, if your code makes HTTP requests to external services, then your tests execute predictably only so far as the services are behaving as you expected. Sometimes, a temporary change in the behavior of these external services can cause intermittent failures within your test suite.
Because of this, it would be better for you to test your code in a controlled environment. Replacing the actual request with a mock object would allow you to simulate external service outages and successful responses in a predictable way.
Sometimes, it’s difficult to test certain areas of your codebase. Such areas include except blocks and if statements that are hard to satisfy. Using Python mock objects can help you control the execution path of your code to reach these areas and improve your code coverage.
Another reason to use mock objects is to better understand how you’re using their real counterparts in your code. A Python mock object contains data about its usage that you can inspect, such as:
- If you called a method
- How you called the method
- How often you called the method
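These three pieces of usage data map directly onto attributes of Mock. As a quick sketch (api and fetch are arbitrary names chosen for illustration):

```python
from unittest.mock import Mock

api = Mock()
api.fetch("https://example.com", timeout=5)

# If you called the method:
assert api.fetch.called

# How you called the method:
api.fetch.assert_called_once_with("https://example.com", timeout=5)

# How often you called the method:
assert api.fetch.call_count == 1
```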
Understanding what a mock object does is the first step to learning how to use one. Next, you’ll explore the Python mock object library to see how to use Python mock objects.
The Python Mock Library
Python’s built-in mock object library is unittest.mock. It provides an easy way to introduce mocks into your tests.
Note: The standard library includes unittest.mock starting from Python 3.3 and in all newer versions. If you’re using an older version of Python, then you’ll need to install the official backport of the library.
To do so, install mock from the Python Package Index (PyPI) using pip:
$ python -m pip install mock

You may want to create and activate a virtual environment before installing the package.
unittest.mock provides a class called Mock, which you’ll use to imitate real objects in your codebase. Mock, along with its subclasses, offers incredible flexibility and insightful data that will meet most of your Python mocking needs.
The library also provides a function called patch(), which replaces the real objects in your code with Mock instances. You can use patch() as either a decorator or a context manager, giving you control over the scope in which the object will be mocked. Once the designated scope exits, patch() will clean up your code by replacing the mocked objects with their original counterparts.
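As a quick sketch of the context manager form, here's what temporarily patching json.dumps looks like:

```python
import json
from unittest.mock import patch

with patch("json.dumps") as mock_dumps:
    mock_dumps.return_value = '{"mocked": true}'
    # Inside the block, the real function is replaced by the mock:
    assert json.dumps({"a": 1}) == '{"mocked": true}'

# Once the designated scope exits, the original object is restored:
assert json.dumps({"a": 1}) == '{"a": 1}'
```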
Finally, unittest.mock provides solutions for some of the issues inherent in mocking objects, which you’ll explore later in this tutorial.
Now that you have a better understanding of what mocking is and the library you’ll be using, it’s time to dive in and explore the features and functionalities unittest.mock has to offer.
The Mock Object
unittest.mock offers a base class for mocking objects called Mock. The use cases for Mock are practically limitless because Mock is so flexible.
Begin by instantiating a new Mock instance:
>>> from unittest.mock import Mock
>>> mock = Mock()
>>> mock
<Mock id='4561344720'>

Read the full article at https://realpython.com/python-mock-library/ »
Real Python: NumPy's max() and maximum(): Find Extreme Values in Arrays
NumPy’s max() function efficiently finds maximum values within an array, making it a key tool for data analysis in Python. This tutorial guides you through using max() and maximum(), handling missing values, and explores advanced features like broadcasting for comparing arrays of different shapes.
By the end of this tutorial, you’ll understand that:
- NumPy’s max() function finds the maximum value within a single array, working with both one-dimensional and multi-dimensional arrays.
- Conversely, np.maximum() compares two arrays element-wise to find the maximum values.
- np.amax() and max() are equivalent in NumPy.
- You can use np.nanmax() to find the maximum value in an array while ignoring nan values, preventing them from affecting the result.
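A short sketch of the differences those bullets describe:

```python
import numpy as np

a = np.array([3, 7, 2])
b = np.array([4, 1, 9])

# max()/amax() reduce a single array to its largest element:
assert np.max(a) == 7
assert np.amax(a) == np.max(a)

# maximum() compares two arrays element-wise:
assert np.array_equal(np.maximum(a, b), np.array([4, 7, 9]))

# nanmax() ignores nan values that would otherwise poison the result:
c = np.array([1.0, np.nan, 5.0])
assert np.isnan(np.max(c))
assert np.nanmax(c) == 5.0
```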
This tutorial includes a very short introduction to NumPy, so even if you’ve never used NumPy before, you should be able to jump right in.
With the background provided here, you’ll be ready to continue exploring the wealth of functionality to be found in the NumPy library.
Free Bonus: Click here to get access to a free NumPy Resources Guide that points you to the best tutorials, videos, and books for improving your NumPy skills.
NumPy: Numerical Python
NumPy is short for Numerical Python. It’s an open source Python library that enables a wide range of applications in the fields of science, statistics, and data analytics through its support of fast, parallelized computations on multidimensional arrays of numbers. Many of the most popular numerical packages use NumPy as their base library.
Introducing NumPy
The NumPy library is built around a class named np.ndarray and a set of methods and functions that leverage Python syntax for defining and manipulating arrays of any shape or size.
NumPy’s core code for array manipulation is written in C. You can use functions and methods directly on an ndarray as NumPy’s C-based code efficiently loops over all the array elements in the background. NumPy’s high-level syntax means that you can simply and elegantly express complex programs and execute them at high speeds.
You can use a regular Python list to represent an array. However, NumPy arrays are far more efficient than lists, and they’re supported by a huge library of methods and functions. These include mathematical and logical operations, sorting, Fourier transforms, linear algebra, array reshaping, and much more.
Today, NumPy is in widespread use in fields as diverse as astronomy, quantum computing, bioinformatics, and all kinds of engineering.
NumPy is used under the hood as the numerical engine for many other libraries, such as pandas and SciPy. It also integrates easily with visualization libraries like Matplotlib and seaborn.
NumPy is easy to install with your package manager, for example pip or conda. For detailed instructions plus a more extensive introduction to NumPy and its capabilities, take a look at NumPy Tutorial: Your First Steps Into Data Science in Python or the NumPy Absolute Beginner’s Guide.
In this tutorial, you’ll learn how to take your very first steps in using NumPy. You’ll then explore NumPy’s max() and maximum() commands.
Creating and Using NumPy Arrays
You’ll start your investigation with a quick overview of NumPy arrays, the flexible data structure that gives NumPy its versatility and power.
The fundamental building block for any NumPy program is the ndarray. An ndarray is a Python object wrapping an array of numbers. It may, in principle, have any number of dimensions of any size. You can declare an array in several ways. The most straightforward method starts from a regular Python list or tuple:
>>> import numpy as np
>>> A = np.array([3, 7, 2, 4, 5])
>>> A
array([3, 7, 2, 4, 5])
>>> B = np.array(((1, 4), (1, 5), (9, 2)))
>>> B
array([[1, 4],
       [1, 5],
       [9, 2]])

You’ve imported numpy under the alias np. This is a standard, widespread convention, so you’ll see it in most tutorials and programs. In this example, A is a one-dimensional array of numbers, while B is two-dimensional.
Notice that the np.array() factory function expects a Python list or tuple as its first parameter, so the list or tuple must therefore be wrapped in its own set of brackets or parentheses, respectively. Just throwing in an unwrapped bunch of numbers won’t work:
>>> np.array(3, 7, 2, 4, 5)
Traceback (most recent call last):
...
TypeError: array() takes from 1 to 2 positional arguments but 5 were given

With this syntax, the interpreter sees five separate positional arguments, so it’s confused.
In your constructor for array B, the nested tuple argument needs an extra pair of parentheses to identify it, in its entirety, as the first parameter of np.array().
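To make the wrapping rule concrete, here’s a small sketch that builds both arrays correctly and inspects them with a few standard ndarray attributes:

```python
import numpy as np

# Wrapping the numbers in a single list (or tuple) gives array()
# exactly one positional argument, which is what it expects.
A = np.array([3, 7, 2, 4, 5])
B = np.array(((1, 4), (1, 5), (9, 2)))

# Useful attributes for checking what you built:
print(A.shape)  # (5,)   - one dimension of size 5
print(B.shape)  # (3, 2) - three rows, two columns
print(B.ndim)   # 2      - number of dimensions
```

The shape and ndim attributes are a handy sanity check whenever the nesting of your input lists or tuples gets hard to read.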
Read the full article at https://realpython.com/numpy-max-maximum/ »[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
Real Python: Get Started With Django: Build a Portfolio App
Django is a powerful Python web framework for creating complex applications. It follows the Model-View-Template (MVT) architecture and includes built-in features like authentication, an admin interface, and database management.
In this tutorial, you’ll create a portfolio app step by step, gaining hands-on experience with Django’s core features. Along the way, you’ll work with models, views, templates, and the admin interface to build a fully functional web application. This hands-on approach will demystify Django’s structure and functionality.
By the end of this tutorial, you’ll understand that:
- Django projects begin with setting up a development environment and creating a project structure.
- Learning HTML and CSS before Django can help with styling templates, but it’s not mandatory.
- Django is used for building web applications with Python, offering built-in features and scalability.
- A real-life example of Django is a portfolio website that showcases projects and manages content.
At the end of this tutorial, you’ll have a working portfolio website to showcase your projects. If you’re curious about how the final source code looks, then you can click the link below:
Get Your Code: Click here to download the Python source code for your Django portfolio project.
Take the Quiz: Test your knowledge with our interactive “Get Started With Django: Build a Portfolio App” quiz. You’ll receive a score upon completion to help you track your learning progress:
Get Started With Django: Build a Portfolio App

In this quiz, you'll test your understanding of Django, a fully featured Python web framework. By working through this quiz, you'll revisit the steps to create a fully functioning web application and learn about some of Django's most important features.
Learn Django

There are endless web development frameworks out there, so why should you learn Django over any of the others? First of all, it’s written in Python, one of the most readable and beginner-friendly programming languages out there.
Note: This tutorial assumes an intermediate knowledge of the Python language. If you’re new to programming with Python, then check out the Python Basics learning path or the introductory course.
The second reason you should learn Django is the scope of its features. When building a website, you don’t need to rely on any external libraries or packages if you choose Django. This means that you don’t need to learn how to use anything else, and the syntax is seamless because you’re using only one framework.
There’s also the added benefit that Django is straightforward to update, since the core functionality is in one package. If you do find yourself needing to add extra features, there are several external libraries that you can use to enhance your site.
One of the great things about the Django framework is its in-depth documentation. It has detailed documentation on every aspect of Django and also has great examples and even a tutorial to get you started.
There’s also a fantastic community of Django developers, so if you get stuck, there’s almost always a way forward by either checking the docs or asking the community.
Django is a high-level web application framework with loads of features. It’s great for anyone new to web development due to its fantastic documentation, and it’s especially great if you’re also familiar with Python.
Understand the Structure of a Django Website

A Django website consists of a single project that’s split into separate apps. The idea is that each app handles a self-contained task that the site needs to perform. As an example, imagine an application like Instagram. There are several different tasks that it needs to perform:
- User management: Logging in and out, registering, and so on
- The image feed: Uploading, editing, and displaying images
- Private messaging: Sending messages between users and providing notifications
These are each separate pieces of functionality, so if this example were a Django site, then each piece of functionality would be a different Django app inside a single Django project.
Note: A Django project contains at least one app. But even when there are more apps in the Django project, you commonly refer to a Django project as a web app.
The Django project holds some configurations that apply to the project as a whole, such as project settings, URLs, shared templates and static files. Each application can have its own database, and it’ll have its own functions to control how it displays data to the user in HTML templates.
Each application also has its own URLs as well as its own HTML templates and static files, such as JavaScript and CSS.
Django apps are structured so that there’s a separation of logic. They follow the model-view-controller (MVC) pattern, the architecture behind most web frameworks, which Django implements in its own model-view-template (MVT) flavor. The basic principle is that each application includes three separate files that handle the three main pieces of logic separately:
- Model defines the data structure. This is usually the database description and often the base layer to an application.
- View displays some or all of the data to the user with HTML and CSS.
- Controller handles how the database and the view interact.
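The three-way split above can be illustrated outside of Django with a few lines of plain Python. This is only an analogy sketch, not Django code; in a real Django app, models.Model subclasses, templates, and view functions play these roles:

```python
# Model: defines the data structure (a stand-in for a database table).
projects = [
    {"title": "Portfolio Site", "tech": "Django"},
    {"title": "Data Pipeline", "tech": "Python"},
]

# View: renders one piece of data for display with HTML.
def render(project):
    return f"<li>{project['title']} ({project['tech']})</li>"

# Controller: decides which data reaches which view.
def project_list():
    return "<ul>" + "".join(render(p) for p in projects) + "</ul>"

print(project_list())
```

The payoff of the separation is that you can change how data is stored, how it’s displayed, or how requests are routed independently of one another.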
If you want to learn more about the MVC pattern, then check out Model-View-Controller (MVC) Explained – With Legos.
Read the full article at https://realpython.com/get-started-with-django-1/ »
Real Python: The subprocess Module: Wrapping Programs With Python
Python’s subprocess module allows you to run shell commands and manage external processes directly from your Python code. By using subprocess, you can execute shell commands like ls or dir, launch applications, and handle both input and output streams. This module provides tools for error handling and process communication, making it a flexible choice for integrating command-line operations into your Python projects.
By the end of this tutorial, you’ll understand that:
- The Python subprocess module is used to run shell commands and manage external processes.
- You run a shell command using subprocess by calling subprocess.run() with the command as a list of arguments.
- subprocess.call(), subprocess.run(), and subprocess.Popen() differ in how they execute commands and handle process output and return codes.
- multiprocessing is for parallel execution within Python, while subprocess manages external processes.
- To execute multiple commands in sequence using subprocess, you can chain them by using pipes or running them consecutively.
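The second bullet above can be sketched in a few lines. Here, sys.executable is used as the command so the example doesn’t depend on any external program being installed:

```python
import subprocess
import sys

# Run a child Python process and capture what it writes to stdout.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from the child')"],
    capture_output=True,
    text=True,
)

print(result.returncode)      # 0 means success
print(result.stdout.strip())  # hello from the child
```

Passing the command as a list of arguments avoids shell quoting issues; text=True decodes the output streams to str instead of bytes.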
Read on to learn how to use Python’s subprocess module to automate shell tasks, manage processes, and integrate command-line operations into your applications.
Note: subprocess isn’t a GUI automation module or a way to achieve concurrency. For GUI automation, you might want to look at PyAutoGUI. For concurrency, take a look at this tutorial’s section on modules related to subprocess.
Once you have the basics down, you’ll be exploring some practical ideas for how to leverage Python’s subprocess. You’ll also dip your toes into advanced usage of Python’s subprocess by experimenting with the underlying Popen() constructor.
Source Code: Click here to download the free source code that you’ll use to get acquainted with the Python subprocess module.
Processes and Subprocesses

First off, you might be wondering why there’s a sub in the Python subprocess module name. And what exactly is a process, anyway? In this section, you’ll answer these questions. You’ll come away with a high-level mental model for thinking about processes. If you’re already familiar with processes, then you might want to skip directly to basic usage of the Python subprocess module.
Processes and the Operating System

Whenever you use a computer, you’ll always be interacting with programs. A process is the operating system’s abstraction of a running program. So, using a computer always involves processes. Start menus, app bars, command-line interpreters, text editors, browsers, and more—every application comprises one or more processes.
A typical operating system will report hundreds or even thousands of running processes, which you’ll get to explore shortly. However, central processing units (CPUs) typically only have a handful of cores, which means that they can only run a handful of instructions simultaneously. So, you may wonder how thousands of processes can appear to run at the same time.
In short, the operating system is a marvelous multitasker—as it has to be. The CPU is the brain of a computer, but it operates at the nanosecond timescale. Most other components of a computer are far slower than the CPU. For instance, a magnetic hard disk read takes thousands of times longer than a typical CPU operation.
If a process needs to write something to the hard drive, or wait for a response from a remote server, then the CPU would sit idle most of the time. Multitasking keeps the CPU busy.
Part of what makes the operating system so great at multitasking is that it’s fantastically organized too. The operating system keeps track of processes in a process table or process control block. In this table, you’ll find the process’s file handles, security context, references to its address spaces, and more.
The process table allows the operating system to abandon a particular process at will, because it has all the information it needs to come back and continue with the process at a later time. A process may be interrupted many thousands of times during execution, but the operating system always finds the exact point where it left off upon returning.
An operating system doesn’t boot up with thousands of processes, though. Many of the processes you’re familiar with are started by you. In the next section, you’ll look into the lifetime of a process.
Process Lifetime

Think of how you might start a Python application from the command line. This is an instance of your command-line process starting a Python process.
The process that starts another process is referred to as the parent, and the new process is referred to as the child. The parent and child processes run mostly independently. Sometimes the child inherits specific resources or contexts from the parent.
As you learned in Processes and the Operating System, information about processes is kept in a table. Each process keeps track of its parent, which allows the process hierarchy to be represented as a tree. You’ll be exploring your system’s process tree in the next section.
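You can observe the parent-child relationship directly: a child process started with subprocess can report its parent’s process ID, which matches the ID of the Python process that launched it. A minimal sketch:

```python
import os
import subprocess
import sys

# The child prints the PID of its parent process...
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.getppid())"],
    capture_output=True,
    text=True,
)

# ...which is this process's own PID.
print(int(child.stdout) == os.getpid())  # True
```

This is exactly the link the operating system records in its process table to build the tree.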
Note: The precise mechanism for creating processes differs depending on the operating system. For a brief overview, the Wikipedia article on process management has a short section on process creation.
For more details about the Windows mechanism, check out the win32 API documentation page on creating processes.
On UNIX-based systems, processes are typically created by using fork() to copy the current process and then replacing the child process with one of the exec() family of functions.
The parent-child relationship between a process and its subprocess isn’t always the same. Sometimes the two processes will share specific resources, like inputs and outputs, but sometimes they won’t. Sometimes child processes live longer than the parent. A child outliving the parent can lead to orphaned or zombie processes, though more discussion about those is outside the scope of this tutorial.
When a process has finished running, it’ll usually end. Every process, on exit, should return an integer. This integer is referred to as the return code or exit status. Zero is synonymous with success, while any other value is considered a failure. Different integers can be used to indicate the reason why a process has failed.
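A quick sketch of exit statuses, using sys.exit() in child processes to produce both a success and a failure:

```python
import subprocess
import sys

# A child that exits with status 0 signals success...
ok = subprocess.run([sys.executable, "-c", "import sys; sys.exit(0)"])
print(ok.returncode)    # 0

# ...while any nonzero status signals failure, with the specific
# value free to encode the reason.
fail = subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"])
print(fail.returncode)  # 3
```

The parent reads this status from the returncode attribute; shells expose the same value as $? on UNIX or %ERRORLEVEL% on Windows.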
Read the full article at https://realpython.com/python-subprocess/ »
Real Python: Python's min() and max(): Find Smallest and Largest Values
Python’s built-in functions max() and min() allow you to find the largest and smallest values in a dataset. You can use them with iterables, such as lists or tuples, or a series of regular arguments. They can handle numbers, strings, and even dictionaries. Plus, with the optional arguments key and default, you can customize their behavior to suit your needs.
By the end of this tutorial, you’ll understand that:
- Python’s max() and min() can find the largest and smallest values in a dataset.
- min() and max() can handle string inputs by comparing their alphabetical order.
- The key argument modifies comparison criteria by applying a function to each element before comparison.
- You can use min() and max() with generator expressions for memory-efficient value comparison.
This tutorial explores the practical use cases for min() and max(), such as removing outliers from lists and processing strings. By the end, you’ll also know how to implement your own versions of min() and max() to deepen your understanding of these functions.
Free Bonus: 5 Thoughts On Python Mastery, a free course for Python developers that shows you the roadmap and the mindset you’ll need to take your Python skills to the next level.
To get the most out of this tutorial, you should have some previous knowledge of Python programming, including topics like for loops, functions, list comprehensions, and generator expressions.
Getting Started With Python’s min() and max() Functions

Python includes several built-in functions that make your life more pleasant and productive because they mean you don’t need to reinvent the wheel. Two examples of these functions are min() and max(). They mostly apply to iterables, but you can use them with multiple regular arguments as well. What’s their job? They take care of finding the smallest and largest values in their input data.
Whether you’re using Python’s min() or max(), you can use the function to achieve two slightly different behaviors. The standard behavior for each is to return the minimum or maximum value through straightforward comparison of the input data as it stands. The alternative behavior is to use a single-argument function to modify the comparison criteria before finding the smallest and largest values.
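A minimal sketch of the two behaviors, using len() as the comparison key (the word list is purely illustrative):

```python
words = ["apple", "fig", "banana", "kiwi"]

# Standard behavior: plain comparison of the values themselves,
# which for strings means alphabetical order.
print(min(words))           # 'apple'
print(max(words))           # 'kiwi'

# Alternative behavior: compare len(word) instead of the words
# themselves by passing a single-argument function as key.
print(min(words, key=len))  # 'fig'
print(max(words, key=len))  # 'banana'
```

Note that the return value is always the original element, not the result of the key function: you get 'fig' back, not 3.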
To explore the standard behavior of min() and max(), you can start by calling each function with either a single iterable as an argument or with two or more regular arguments. That’s what you’ll do right away.
Calling min() and max() With a Single Iterable Argument

The built-in min() and max() have two different signatures that allow you to call them either with an iterable as their first argument or with two or more regular arguments. The signature that accepts a single iterable argument looks something like this:
min(iterable, *[, default, key]) -> minimum_value
max(iterable, *[, default, key]) -> maximum_value

Both functions take a required argument called iterable and return the minimum and maximum values, respectively. They also take two optional keyword-only arguments: default and key.
Note: In the above signatures, the asterisk (*) means that the following arguments are keyword-only arguments, while the square brackets ([]) denote that the enclosed content is optional.
Here’s a summary of what the arguments to min() and max() do:
Argument   Description                                                               Required
--------   -----------                                                               --------
iterable   Takes an iterable object, like a list, tuple, dictionary, or string       Yes
default    Holds a value to return if the input iterable is empty                    No
key        Accepts a single-argument function to customize the comparison criteria   No

Later in this tutorial, you’ll learn more about the optional default and key arguments. For now, just focus on the iterable argument, which is a required argument that leverages the standard behavior of min() and max() in Python:
>>> min([3, 5, 9, 1, -5])
-5
>>> min([])
Traceback (most recent call last):
...
ValueError: min() arg is an empty sequence
>>> max([3, 5, 9, 1, -5])
9
>>> max([])
Traceback (most recent call last):
...
ValueError: max() arg is an empty sequence

In these examples, you call min() and max() with a list of integer numbers and then with an empty list. The first call to min() returns the smallest number in the input list, -5. In contrast, the first call to max() returns the largest number in the list, or 9. If you pass an empty iterable to min() or max(), then you get a ValueError because there’s nothing to do on an empty iterable.
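The ValueError on an empty iterable can be avoided with the optional default argument; here’s a minimal sketch:

```python
# Without default, an empty iterable raises ValueError...
try:
    min([])
except ValueError:
    print("min() of an empty iterable raised ValueError")

# ...but the default keyword provides a fallback value instead.
print(min([], default=0))          # 0
print(max([], default="nothing"))  # nothing

# default is ignored when the iterable isn't empty.
print(min([3, 5], default=0))      # 3
```

This fallback is handy when the iterable comes from filtering or user input and may legitimately end up empty.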
An important detail to note about min() and max() is that all the values in the input iterable must be comparable. Otherwise, you get an error. For example, numeric values work okay:
>>> min([3, 5.0, 9, 1.0, -5])
-5
>>> max([3, 5.0, 9, 1.0, -5])
9

These examples combine int and float numbers in the calls to min() and max(). You get the expected result in both cases because these data types are comparable.
However, what would happen if you mixed strings and numbers? Check out the following examples:
>>> min([3, "5.0", 9, 1.0, "-5"])
Traceback (most recent call last):
...
TypeError: '<' not supported between instances of 'str' and 'int'
>>> max([3, "5.0", 9, 1.0, "-5"])
Traceback (most recent call last):
...
TypeError: '>' not supported between instances of 'str' and 'int'

Read the full article at https://realpython.com/python-min-and-max/ »
Going to FOSDEM 2025
Barely back from 38C3, preparations for another huge event have started: FOSDEM 2025, taking place in two weeks in Brussels, Belgium.
KDE

KDE will be there with a big team again, with the KDE stand being in building AW this year, on the ground floor.
There will also be talks by KDE people:
- Albert will speak about Poppler, the PDF library powering not only Okular but also the travel document extractor of Itinerary, on Saturday at 11:50 in room H.2215.
- Joseph will talk about how FOSS can help with reducing the carbon footprint of the IT sector, on Sunday at 11:55 in room H.2214.
And more below.
Transitous

FOSDEM also marks the first birthday of Transitous, which started a year ago following a question in the Railways and Open Transport dev room.
Transitous has come a long way since then and now contains more than 1500 GTFS feeds from 54 countries, which is beyond what we even deemed technically feasible a year ago. More than 50 people have contributed to this, and five FOSS apps (that we know of) are already using Transitous as (one of) their backends.
We are just about to complete the migration to a new major version of MOTIS, which brings a new, much easier-to-use API, more routing options, and more powerful door-to-door (instead of station-to-station) routing, to name just a few of the improvements.
At the same time Transitous is being moved to new and more powerful hardware, again kindly provided by Spline.
A bunch of people working on Transitous will be at FOSDEM, and Felix and Marcus will cover it in their talk on Sunday at 16:30 in room K.4.601.
Emergency and Weather Alerts

As part of the FOSS on Mobile Devices track, Nucleus and I will be presenting the work on free infrastructure for receiving emergency alerts, on Saturday at 15:45 in room H.2214.
Since I last wrote about this here, things have moved forward quite a bit. We have a first test deployment of the FOSS Public Alert Server at alerts.kde.org, and FOSSWarn has a first pre-release that can make use of it.
I have yet to update the KDE public alert app prototype to make use of this, though.
See you in Brussels!

With people from so many different communities around, FOSDEM has resulted in great cross-project collaborations in the past. I'm excited to see what this one will bring and am looking forward to many interesting discussions!
Petter Reinholdtsen: What is the most supported MIME type in Debian in 2025?
Seven and twelve years ago, I measured what the most supported MIME type in Debian was, first by analysing the desktop files in all packages in the archive, then by analysing the DEP-11 AppStream data set. I guess it is time to repeat the measurement, only for unstable as last time:
Debian Unstable:
count MIME type
----- -----------------------
   63 image/png
   63 image/jpeg
   57 image/tiff
   54 image/gif
   51 image/bmp
   50 audio/mpeg
   48 text/plain
   42 audio/x-mp3
   40 application/ogg
   39 audio/x-wav
   39 audio/x-flac
   36 audio/x-vorbis+ogg
   35 audio/x-mpeg
   34 audio/x-mpegurl
   34 audio/ogg
   33 application/x-ogg
   32 audio/mp4
   31 audio/x-scpls
   31 application/pdf
   29 audio/x-ms-wma

The list was created like this using a sid chroot:
cat /var/lib/apt/lists/*sid*_dep11_Components-amd64.yml.gz | \
  zcat | awk '/^ - \S+\/\S+$/ {print $2 }' | sort | \
  uniq -c | sort -nr | head -20

It is nice to see that the same number of packages now support PNG and JPEG. Last time, JPEG had more support than PNG. Most of the MIME types are known to me, but I have no idea what the 'audio/x-scpls' one represents, except that it is an audio format. To find the packages claiming support for this format, the appstreamcli command from the appstream package can be used:
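The same counting can be sketched in Python with collections.Counter. In this sketch, a small inline sample stands in for the real DEP-11 data, which lives gzipped under /var/lib/apt/lists/, and the regular expression mirrors the spirit of the awk pattern:

```python
import re
from collections import Counter

# A tiny stand-in for the DEP-11 Components YAML; real data comes
# from /var/lib/apt/lists/*_dep11_Components-amd64.yml.gz.
sample = """\
Mediatypes:
  - image/png
  - image/jpeg
Mediatypes:
  - image/png
"""

# Match indented "- type/subtype" entries, like the awk pattern does.
pattern = re.compile(r"^\s+-\s(\S+/\S+)$")

counts = Counter(
    m.group(1)
    for line in sample.splitlines()
    if (m := pattern.match(line))
)

# most_common() plays the role of "sort -nr | head".
for mime, n in counts.most_common():
    print(n, mime)
```

For the real data set, you would feed the decompressed file contents into the same loop instead of the sample string.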
% appstreamcli what-provides mediatype audio/x-scpls | grep Package: | sort -u
Package: alsaplayer-common
Package: amarok
Package: audacious
Package: brasero
Package: celluloid
Package: clapper
Package: clementine
Package: cynthiune.app
Package: elisa
Package: gtranscribe
Package: kaffeine
Package: kmplayer
Package: kylin-burner
Package: lollypop
Package: mediaconch-gui
Package: mediainfo-gui
Package: mplayer-gui
Package: mpv
Package: mystiq
Package: parlatype
Package: parole
Package: pragha
Package: qmmp
Package: rhythmbox
Package: sayonara
Package: shotcut
Package: smplayer
Package: soundconverter
Package: strawberry
Package: syncplay
Package: vlc
%

Looks like several video and audio tools understand the format. Similarly, one can check out the number of packages supporting the STL format commonly used for 3D printing:
% appstreamcli what-provides mediatype model/stl | grep Package: | sort -u
Package: cura
Package: freecad
Package: open3d-viewer
%

How strange that the slic3r and prusa-slicer packages do not support STL. Perhaps they are just missing package metadata? Luckily, the amount of package metadata in Debian is getting better, and hopefully this way of locating relevant packages for any file format will soon be the preferred one.
As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.