Planet Python

Subscribe to Planet Python feed
Planet Python -
Updated: 8 hours 10 min ago

Test and Code: 202: Using Towncrier to Keep a Changelog

Wed, 2023-05-31 17:15

Hynek joins the show to discuss towncrier.

At the top of the towncrier documentation, it says "towncrier is a utility to produce useful, summarized news files (also known as changelogs) for your project."

Towncrier is used by "Twisted, pytest, pip, BuildBot, and attrs, among others."

This is the last of 3 episodes focused on keeping a CHANGELOG.

Episode 200 kicked off the series with Olivier Lacan.
In 201 we had Ned Batchelder discussing scriv.

Special Guest: Hynek Schlawack.


Links:

  • Towncrier docs
  • How to Keep a Changelog in Markdown - Towncrier docs
  • Keep a Changelog
  • structlog/ — Example of manually edited changelog.
  • hatch-fancy-pypi-readme
  • MyST Markdown
  • hatchling
Categories: FLOSS Project Planets

Python Software Foundation: Thinking about running for the Python Software Foundation Board of Directors? Let’s talk!

Wed, 2023-05-31 11:22

This year’s Board Election Nomination period is opening tomorrow. Current board members want to share what being on the board is like and are making themselves available to answer all your questions about responsibilities, activities and time commitments via online chat. Please come join us on Slack anytime in June to talk with us about being on the PSF board.

Board Election Timeline:

  • Nominations open: Thursday, June 1st, 2:00 pm UTC
  • Board Director Nomination cut-off: Thursday, June 15, 11:59 pm UTC
  • Voter application cut-off date: Thursday, June 15, 11:59 pm UTC
  • Announce candidates: Friday, June 16th
  • Voting start date: Tuesday, June 20, 12:01 am UTC
  • Voting end date: Friday, June 30, 11:59 pm UTC

Not sure what UTC is for you locally? Check here!

Nominations will be accepted here. (Note: you will need to sign into or create your user account first.) Learn more about membership here, or if you have questions about membership or nominations, please email psf-elections@python.org. In addition to Slack, you are welcome to join the discussion about the PSF Board election on our forum.

Also, you can see your membership record and status on If you are a voting-eligible member and do not already have a login there, please sign up and then email so we can link your membership to your account.
Categories: FLOSS Project Planets

PyCharm: Five Things To Love About the New UI

Wed, 2023-05-31 10:56

Are you using the New UI yet? Not yet? Let me tell you why it’s the best thing since sliced bread!

Let’s get it enabled and take a look around. The easiest way to do that is from the Settings cog at the top-right of the UI:

PyCharm will need to restart, but that’s it; you’re done!

In this interface update, we have introduced more blank space around the various elements. This is to help separate content without adding elements such as dividers on the screen. If you prefer to have the UI elements a little bit smaller, you can select Compact in the Meet the New UI tool window (or later in your Settings by searching for “new ui”). This setting removes some white space and padding around interface elements.

There are a couple of themes that you might notice straight away, including the use of colour and element size to denote information hierarchy. 

For example, in the Dark theme, we have a dark gray bar at the top for common entry points to functionality such as VCS actions and Run actions and a black background for the editor. 

Let me give you a tour of the lovable little landmarks in this UI. 

#1 – The Main Toolbar

The Main toolbar is cleaner and has a more succinct layout. It has all the functionality you know and love. It is your one-stop shop for project-related information:

Let’s go from left to right. The first area I want to talk about is the Project widget. This is where you can see the name of your current project, switch between recent projects, create new projects and open existing ones.

To the right of the Project widget is the VCS widget. This lets you quickly see the status of your project in version control, including the branch and if there are any outgoing (shown in green) or incoming (shown in blue) changes:

The VCS widget is right next to the name of the project, and it is now a clear entry point for functionality related to your project and version control when you click the drop-down arrow. You’ll see lots of these so-called “entry points” for functionality groups as we continue our tour.

Over to the right is your Run widget, which has been redesigned, so I’ll go into more detail in the next section.

Finally, on the very right-hand side of the Main toolbar, you have your more general icons, including Code With Me, Search Everywhere, and Settings. It’s worth noting if you like using your mouse, the Settings cog icon takes you to some top-level options that you’ll probably use most frequently, including Plugins, Theme, Keymap, and View Mode. I hope you don’t want to, but you can also switch back to the Classic UI here!

#2 – The Run Widget

Let’s look at the Run toolbar widget in more detail. The first thing you’ll notice is the bigger, bolder icons. These are great because they give you immediate visual feedback as to the state of your application. For example, before you run or debug your application, your Run widget looks like this:

If you click the Run icon, it’ll change to this:

Now you can see your application is running, and you can stop and re-run it, or stop it entirely. You could also click the debug icon, which will prompt you to stop the currently running application so you can run it with PyCharm’s debugger instead. 

If you run the application with PyCharm’s debugger, the widget will change to this:

You can still access all your run configurations from the drop-down menu, and for each one, choose how you want to run it as well:

The functionality for the currently selected run configuration is available from the three vertical ellipses button too. 

#3 – The Tool Windows

The New UI brings you new icons, an improved layout, and the option to see more tool windows at any time. The new larger icons help reduce the cognitive load of trying to find what you are looking for because there’s less on the screen. 

I’ll show you how to customize where they are in the UI so you can quickly find them when you need them:

You have the option to split your tool windows vertically. For example, in the screenshot below, the Commit tool window is below the horizontal separator bar. Tool windows can be dragged below the separator to open them in a vertical split:

You can split tool windows on the right-hand side in this way, too:

Finally, if you want to split your tool windows on the bottom so one is on the left, and the other is on the right, drag the tool window you want to appear on the right-hand side to the right-hand bar. In the screenshot below, I’ve dragged my Problem tool window icon over to the right and then opened it alongside the Terminal tool window:

All the tool windows now use outline monochrome icons, which are more modern and don’t clutter the interface with any additional unnecessary information. If you do want to see the name and keyboard shortcut of any tool window, hover over the icon. Remember, the handy shortcut to hide all your toolbars is ⌘⇧F12 (macOS), or  Ctrl+Shift+F12 (Windows/Linux).

#4 – Run and Debug Tool Windows

Both the Run and Debug tool windows are now available from the window tab on the left. This is great from a standardisation perspective as it allows you to quickly access both the tool windows to manage the state of your application.

In addition, the Debug tool window has been updated to have one toolbar that contains the most common actions based on usage statistics. This might mean that some actions you’re used to clicking aren’t where you expect them to be! We haven’t removed any functionality; everything is still accessible, but you may notice, for example, that the Evaluate Expression icon is gone.

There are still plenty of ways to evaluate an expression. You can:

  • Use ⌥F8 (macOS), or Ctrl+F8 (Windows/Linux)
  • Use Shift Shift (macOS/Windows/Linux) for Search Everywhere and then type in “evaluate expression”
  • Select Evaluate Expression from the right-click context menu in the Debug tool window

There is also a dedicated field for evaluation in the Debug tool window:

In addition to those changes, there are new tabs for switching between the Threads & Variables and Console views if there’s a single running configuration.

#5 – The Editor

Last but not least, the editor has had a number of updates in line with the design choices we’ve made. It would be easy to overlook these changes in terms of what to love in the New UI, but these themes that run throughout have given the whole interface a fresh, clean, professional look as you move through your codebase.

The Light and Dark colour themes have improved contrast and a consistent colour palette to brighten up (or darken) your day:

The iconography has been overhauled in the editor, too, with more distinguishable shapes and colours. You will see these changes in the editor and the wider IDE. This has, in my opinion, given PyCharm a fantastic face-lift:

Breakpoints are now placed over the line numbers, saving horizontal space:

If, on the other hand, you prefer to have your breakpoints next to the line numbers, you can still do that by right-clicking in the gutter and then deselecting Appearance > Breakpoints Over Line Numbers.

The colour palette for annotations for Git Blame has been updated. The lighter the shade, the older the change is. Conversely, the darker the shade, the newer the change is:


Finally, this blog post wouldn’t be complete without talking about why we have updated the interface for JetBrains IDEs. We have gathered a lot of feedback from our users over the past few years, and we learned that our current (Classic) UI is associated with words such as cluttered, outdated, and less visually appealing.

Given that, we knew we needed to update our user interface, but how did we decide what it would look like? Fundamentally, we started out by implementing the user experience patterns that I’ve spoken about in this blog post. We subsequently undertook several rounds of rigorous internal and external review cycles and updated the New UI based on your feedback. 

That’s all for today. I hope this has convinced you to try the New UI and that you fall in love with it as much as I have! Remember, you can enable the New UI from the Settings cog at the top-right of the IDE window. There’s a link in the Settings to share your feedback with us!

Categories: FLOSS Project Planets

Real Python: Create and Modify PDF Files in Python

Wed, 2023-05-31 10:00

It’s really useful to know how to create and modify PDF (portable document format) files in Python. This is one of the most common formats for sharing documents over the Internet. PDF files can contain text, images, tables, forms, and rich media like videos and animations, all in a single file.

This abundance of content types can make working with PDFs difficult. There are several different kinds of data to decode when opening a PDF file! Fortunately, the Python ecosystem has some great packages for reading, manipulating, and creating PDF files.

In this tutorial, you’ll learn how to:

  • Read text from a PDF with pypdf
  • Split a PDF file into multiple files
  • Concatenate and merge PDF files together
  • Rotate and crop pages in PDF files
  • Encrypt and decrypt PDF files
  • Create and customize PDF files from scratch with ReportLab

To complete this learning, you’ll use two different tools. You’ll use the pypdf library to manipulate existing PDF files and the ReportLab library to create new PDF files from scratch. Along the way, you’ll have several opportunities to deepen your understanding with exercises and examples.

To follow along with this tutorial, you should download the materials used in the examples and extract them to your home folder. To do this, click the link below:

Download the sample materials: Click here to get the materials you’ll use to learn about creating and modifying PDF files in this tutorial.

Extracting Text From PDF Files With pypdf

In this section, you’ll learn how to read PDF files and extract their text using the pypdf library. Before you can do that, though, you need to install it with pip:

$ python -m pip install pypdf

With this command, you download and install the latest version of pypdf from the Python package index (PyPI). To verify the installation, go ahead and run the following command in your terminal:

$ python -m pip show pypdf
Name: pypdf
Version: 3.8.1
Summary: A pure-python PDF library capable of splitting, merging, cropping, and transforming PDF files
Home-page:
Author:
Author-email: Mathieu Fenniak <>
License:
Location: .../lib/python3.10/site-packages
Requires:
Required-by:

Pay particular attention to the version information. At the time of publication for this tutorial, the latest version of pypdf was 3.8.1. This library has gotten plenty of updates lately, and cool new features are added quite frequently. Most importantly, you’ll find many breaking changes in the library’s API if you compare it with its predecessor library PyPDF2.

Before diving into working with PDF files, you must know that this tutorial is adapted from the chapter “Creating and Modifying PDF Files” in Python Basics: A Practical Introduction to Python 3.

The book uses Python’s built-in IDLE editor to create and edit Python files and interact with the Python shell, so you’ll find occasional references to IDLE throughout this tutorial. However, you should have no problems running the example code from the editor and environment of your choice.

Reading PDF Files With PdfReader

To kick things off, you’ll open a PDF file and read some information about it. You’ll use the Pride_and_Prejudice.pdf file provided in the downloadable resources for this tutorial.

Open IDLE’s interactive window and import the PdfReader class from pypdf:

>>> from pypdf import PdfReader

To create a new instance of the PdfReader class, you’ll need to provide the path to the PDF file that you want to open. You can do that using the pathlib module:

>>> from pathlib import Path
>>> pdf_path = (
...     Path.home()
...     / "creating-and-modifying-pdfs"
...     / "practice_files"
...     / "Pride_and_Prejudice.pdf"
... )

The pdf_path variable now contains the path to a PDF version of Jane Austen’s Pride and Prejudice.

Note: You may need to change pdf_path so that it corresponds to the location of the creating-and-modifying-pdfs/ folder on your computer.

Now create the PdfReader instance by calling the class’s constructor with the path to your PDF file as an argument:

>>> pdf_reader = PdfReader(pdf_path)

Read the full article at »


Categories: FLOSS Project Planets

Python for Beginners: Unpacking in Python

Wed, 2023-05-31 09:00

Python provides us with the packing and unpacking operator to convert one iterable object to another easily. In this article, we will discuss the unpacking operator in Python with different examples.

Table of Contents
  1. What is the Unpacking Operator in Python?
  2. Unpacking in Python Using Parallel Assignment
  3. Unpacking Using The * Operator in Python
  4. Conclusion
What is the Unpacking Operator in Python?

The unpacking operator in Python is used to unpack an iterable object into individual elements. It is represented by an asterisk sign * and has the following syntax.

*iterable_object

  • The iterable_object variable represents an iterable object such as a list, tuple, set, or a Python dictionary
  • After execution of the above statement, the elements of iterable_object are unpacked. We can then use the unpacked elements to create other iterable objects.

To understand this, consider the following example.

myList=[1,2,3,3,4,4,5,6]
mySet={*myList}
print("The list is:")
print(myList)
print("The set is:")
print(mySet)


The list is:
[1, 2, 3, 3, 4, 4, 5, 6]
The set is:
{1, 2, 3, 4, 5, 6}

In the above example, the * operator unpacks myList. Then, we created a set from the unpacked elements.

Remember that you cannot use the unpacking operator to assign the elements of the iterable object to individual variables. If you do so, the program will run into an error. You can observe this in the following example.

myList=[1,2,3,4,5,6]
print("The list is:")
print(myList)
a,b,c,d,e,f=*myList


The list is:
[1, 2, 3, 4, 5, 6]
  File "/tmp/ipykernel_16212/", line 4
    a,b,c,d,e,f=*myList
                ^
SyntaxError: can't use starred expression here

In this example, we have tried to assign elements from myList to six variables using the * operator. Hence, the program runs into a SyntaxError exception.

Unpacking in Python Using Parallel Assignment

Instead of using the * operator, you can unpack an iterable object into multiple variables using parallel assignment. For this, you can use the following syntax.

var1, var2, var3, ..., varN = iterable_object

In the above statement, var1 through varN are individual variables. After execution of the statement, all the variables are initialized with elements from iterable_object, as shown below.

myList=[1,2,3,4,5,6]
print("The list is:")
print(myList)
a,b,c,d,e,f=myList
print("The variables are:")
print(a,b,c,d,e,f)


The list is:
[1, 2, 3, 4, 5, 6]
The variables are:
1 2 3 4 5 6

In the above example, we have six elements in myList. Hence, we have unpacked the list into six variables a, b, c, d, e, and f.
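Parallel assignment is also the reason the classic Python idiom for swapping two variables works without a temporary variable. A minimal sketch of my own to illustrate:

```python
# The right-hand side is packed into a tuple first,
# then unpacked into the left-hand variables
a, b = 1, 2
a, b = b, a
print(a, b)  # 2 1
```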

In the above syntax, the number of variables must be equal to the number of elements in iterable_object. Otherwise, the program will run into an error.

For instance, if the number of variables on the left-hand side is less than the number of elements in the iterable object, the program will run into a ValueError exception saying that there are too many values to unpack. You can observe this in the following example.

myList=[1,2,3,4,5,6]
print("The list is:")
print(myList)
a,b,c,d,e=myList
print("The variables are:")
print(a,b,c,d,e)


The list is:
[1, 2, 3, 4, 5, 6]
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_16212/ in <module>
      2 print("The list is:")
      3 print(myList)
----> 4 a,b,c,d,e=myList
      5 print("The variables are:")
      6 print(a,b,c,d,e)
ValueError: too many values to unpack (expected 5)

In the above code, you can observe that there are six elements in the list, but we have only five variables. Due to this, the program runs into a Python ValueError exception saying that there are too many values to unpack.

In a similar manner, if the number of variables on the left side of the assignment operator is greater than the number of elements in the iterable object, the program will run into a ValueError exception saying that there are not enough values to unpack. You can observe this in the following example.

myList=[1,2,3,4,5,6]
print("The list is:")
print(myList)
a,b,c,d,e,f,g=myList
print("The variables are:")
print(a,b,c,d,e,f,g)


The list is:
[1, 2, 3, 4, 5, 6]
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_16212/ in <module>
      2 print("The list is:")
      3 print(myList)
----> 4 a,b,c,d,e,f,g=myList
      5 print("The variables are:")
      6 print(a,b,c,d,e,f,g)
ValueError: not enough values to unpack (expected 7, got 6)

In the above example, there are seven variables on the left-hand side and only six elements in the list. Due to this, the program runs into a ValueError exception saying that there aren’t enough values to unpack.

Unpacking Using The * Operator in Python

When we have fewer variables than elements in the iterable object, we can use the * operator to unpack the iterable object using the following syntax.

var1, var2, var3, ..., varN, *var = iterable_object

In the above syntax, if there are more than N elements in iterable_object, the first N elements are assigned to the variables var1 to varN. The rest of the elements are packed into a list and assigned to the variable var. You can observe this in the following example.

myList=[1,2,3,4,5,6]
print("The list is:")
print(myList)
a,b,c,*d=myList
print("The variables are:")
print(a,b,c,d)


The list is:
[1, 2, 3, 4, 5, 6]
The variables are:
1 2 3 [4, 5, 6]

In the above example, we have six elements in the list. On the left-hand side, we have four variables, with the last variable preceded by the * sign. You can observe that the first three variables are assigned individual elements, whereas the variable with the * operator gets all the remaining elements as a list.

Now, let us move the variable containing the * operator to the start of the expression as shown below.

*var, var1, var2, var3, ..., varN = iterable_object

In this case, if there are more than N elements in iterable_object, the last N elements are assigned to the variables var1 to varN. The remaining elements from the start are packed into a list and assigned to the variable var. You can observe this in the following example.

myList=[1,2,3,4,5,6]
print("The list is:")
print(myList)
*a,b,c,d=myList
print("The variables are:")
print(a,b,c,d)


The list is:
[1, 2, 3, 4, 5, 6]
The variables are:
[1, 2, 3] 4 5 6

We can also put the variable containing the * operator in between the variables on the left-hand side of the assignment operator. For example, consider the following syntax.

var1, var2, ..., varM, *var, varM+1, varM+2, ..., varN = iterable_object

In the above syntax, there are M variables on the left-hand side of var and N-M variables on the right-hand side of var. Now, if iterable_object has more than N elements,

  • First M elements of iterable_object are assigned to the variables var1 to varM.
  • The last N-M elements of iterable_object are assigned to the variables varM+1 to varN.
  • The rest of the elements in the middle are assigned to the variable var as a list. 

You can observe this in the following example.

myList=[1,2,3,4,5,6]
print("The list is:")
print(myList)
a,b,*c,d=myList
print("The variables are:")
print(a,b,c,d)


The list is:
[1, 2, 3, 4, 5, 6]
The variables are:
1 2 [3, 4, 5] 6

Conclusion

In this article, we discussed how to use the unpacking operator in Python. The unpacking operation works the same in lists, sets, and tuples. In dictionaries, only the keys of the dictionary are unpacked when using the unpacking operator. 
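To illustrate that last point about dictionaries, here is a small sketch of my own (not from the examples above) showing what the * and ** operators yield for a dict:

```python
person = {"name": "Ada", "age": 36}

# The * operator unpacks only the keys of a dictionary
keys = [*person]
print(keys)  # ['name', 'age']

# The ** operator unpacks key-value pairs, which is handy for merging dicts
merged = {**person, "city": "London"}
print(merged)  # {'name': 'Ada', 'age': 36, 'city': 'London'}
```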

To learn more about Python programming, you can read this article on tuple comprehension in Python. You might also like this article on Python continue vs break statements.

I hope you enjoyed reading this article. Stay tuned for more informative articles.

Happy Learning!


Categories: FLOSS Project Planets

Stack Abuse: Reading and Writing SQL Files in Pandas

Wed, 2023-05-31 08:12

When I started learning Data Analysis a few years ago, the first thing I learned was SQL and Pandas. As a data analyst, it is crucial to have a strong foundation in working with SQL and Pandas. Both are powerful tools that help data analysts efficiently analyze and manipulate stored data in databases.

Overview of SQL and Pandas

SQL (Structured Query Language) is a programming language used to manage and manipulate relational databases. On the other hand, Pandas is a Python library used for data manipulation and analysis.

Data analysis involves working with large amounts of data, and databases are often used to store this data. SQL and Pandas provide powerful tools for working with databases, allowing data analysts to efficiently extract, manipulate, and analyze data. By leveraging these tools, data analysts can gain valuable insights from data that would otherwise be difficult to obtain.

In this article, we will explore how to use SQL and Pandas to read and write to a database.

Connecting to the DB Installing the Libraries

We must first install the necessary libraries before we can connect to the SQL database with Pandas. The two main libraries required are Pandas and SQLAlchemy. Pandas is a popular data manipulation library that allows for the storage of large data structures, as mentioned in the introduction. In contrast, SQLAlchemy provides an API for connecting to and interacting with the SQL database.

We can install both libraries using the Python package manager, pip, by running the following commands at the command prompt.

$ pip install pandas
$ pip install sqlalchemy

Making the Connection

With the libraries installed, we can now use Pandas to connect to the SQL database.

To begin, we will create a SQLAlchemy engine object with create_engine(). The create_engine() function connects the Python code to the database. It takes as an argument a connection string that specifies the database type and connection details. In this example, we'll use the SQLite database type and the database file's path.

Create an engine object for a SQLite database using the example below:

import pandas as pd
from sqlalchemy import create_engine

# Create an engine object
engine = create_engine('sqlite:///C/SQLite/student.db')

If the SQLite database file, student.db in our case, is in the same directory as the Python script, we can use the file name directly, as shown below.

engine = create_engine('sqlite:///student.db')

Reading SQL Files with Pandas

Let's read data now that we've established a connection. In this section, we will look at the read_sql, read_sql_table, and read_sql_query functions and how to use them to work with a database.

Executing SQL Queries using Panda's read_sql() Function

read_sql() is a Pandas library function that allows us to execute an SQL query and retrieve the results into a Pandas dataframe. The read_sql() function connects SQL and Python, allowing us to take advantage of the power of both languages. It wraps read_sql_table() and read_sql_query(): the call is routed internally based on the input provided, which means that if the input is an SQL query, it will be routed to read_sql_query(), and if it is a database table name, it will be routed to read_sql_table().

The read_sql() syntax is as follows:

pandas.read_sql(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, columns=None, chunksize=None)

The sql and con parameters are required; the rest are optional. However, we can manipulate the result using these optional parameters. Let's take a closer look at each parameter.

  • sql: SQL query or database table name
  • con: Connection object or connection URL
  • index_col: This parameter allows us to use one or more columns from the SQL query result as a data frame index. It can take either a single column or a list of columns.
  • coerce_float: This parameter specifies whether non-numerical values should be converted to floating numbers or left as strings. It is set to true by default. If possible, it converts non-numeric values to float types.
  • params: The params provide a secure method for passing dynamic values to the SQL query. We can use the params parameter to pass a dictionary, tuple, or list. Depending on the database, the syntax of params varies.
  • parse_dates: This allows us to specify which column in the resulting dataframe will be interpreted as a date. It accepts a single column, a list of columns, or a dictionary with the key as the column name and the value as the column format.
  • columns: This allows us to fetch only selected columns from the list.
  • chunksize: When working with a large data set, chunksize is important. It retrieves the query result in smaller chunks, enhancing performance.

Here's an example of how to use read_sql():


import pandas as pd
from sqlalchemy import create_engine

# Create an engine object
engine = create_engine('sqlite:///C/SQLite/student.db')

# Fetch all records from the Student table and manipulate the result
df = pd.read_sql("SELECT * FROM Student", engine, index_col='rollNumber', parse_dates='dateOfBirth')
print(df)
print("The Data type of dateOfBirth: ", df.dateOfBirth.dtype)

# Close the database connection
engine.dispose()


           firstName lastName email dateOfBirth
rollNumber
1               Mark   Simson        2000-02-23
2              Peter  Griffen        2001-04-15
3                Meg  Aniston        2001-09-20
The Data type of dateOfBirth:  datetime64[ns]

After connecting to the database, we execute a query that returns all records from the Student table and stores them in the DataFrame df. The rollNumber column is converted into an index using the index_col parameter, and the dateOfBirth datatype is datetime64[ns] due to parse_dates. We can use read_sql() not only to retrieve data but also to perform other operations such as insert, delete, and update. read_sql() is a generic function.
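The params argument described above deserves a quick illustration. The following self-contained sketch uses Python's built-in sqlite3 module with an in-memory database; the Student table and its contents are made up for demonstration only. SQLite uses "?" placeholders, and params supplies the values safely instead of string formatting:

```python
import sqlite3
import pandas as pd

# In-memory SQLite database with a made-up Student table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (rollNumber INTEGER, firstName TEXT)")
conn.executemany("INSERT INTO Student VALUES (?, ?)", [(1, "Mark"), (2, "Peter")])

# params fills the "?" placeholder with the value 1
df = pd.read_sql("SELECT * FROM Student WHERE rollNumber = ?", conn, params=(1,))
print(df)
conn.close()
```

Because the value is passed separately from the query text, this pattern also protects against SQL injection.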

Loading Specific Tables or Views from the DB

Loading a specific table or view with Pandas read_sql_table() is another technique to read data from the database into a Pandas dataframe.

What is read_sql_table?

The Pandas library provides the read_sql_table function, which is specifically designed to read an entire SQL table without executing any queries and return the result as a Pandas dataframe.

The syntax of read_sql_table() is as below:

pandas.read_sql_table(table_name, con, schema=None, index_col=None, coerce_float=True, parse_dates=None, columns=None, chunksize=None)

Except for table_name and schema, the parameters are explained in the same way as read_sql().

  • table_name: The parameter table_name is the name of the SQL table in the database.
  • schema: This optional parameter is the name of the schema containing the table name.

After creating a connection to the database, we will use the read_sql_table function to load the Student table into a Pandas DataFrame.

import pandas as pd
from sqlalchemy import create_engine

# Create an engine object
engine = create_engine('sqlite:///C/SQLite/student.db')

# Load Student table from database
df = pd.read_sql_table('Student', engine)
print(df.head())

# Close the database connection
engine.dispose()


   rollNumber firstName lastName email dateOfBirth
0           1      Mark   Simson        2000-02-23
1           2     Peter  Griffen        2001-04-15
2           3       Meg  Aniston        2001-09-20

We'll assume it is a large table that can be memory-intensive. Let's explore how we can use the chunksize parameter to address this issue.


import pandas as pd
from sqlalchemy import create_engine

# Create an engine object
engine = create_engine('sqlite:///C/SQLite/student.db')

# Load Student table from database in chunks
df_iterator = pd.read_sql_table('Student', engine, chunksize=1)

# Iterate over the chunks
for df in df_iterator:
    print(df.head())

# Close the database connection
engine.dispose()


   rollNumber firstName lastName email dateOfBirth
0           1      Mark   Simson        2000-02-23
   rollNumber firstName lastName email dateOfBirth
0           2     Peter  Griffen        2001-04-15
   rollNumber firstName lastName email dateOfBirth
0           3       Meg  Aniston        2001-09-20

Please keep in mind that the chunksize I'm using here is 1 because I only have 3 records in my table.
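When a table really is too big to fit in memory, you can aggregate over the chunks as they arrive instead of holding everything at once. Here is a self-contained sketch: read_sql_table() needs a SQLAlchemy engine, so this example uses read_sql_query(), which also accepts chunksize and works with a plain sqlite3 connection, and a tiny in-memory table stands in for a large one.

```python
import sqlite3

import pandas as pd

# Build a small throwaway table so the example is runnable; in practice
# the connection would point at your real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (rollNumber INTEGER, firstName TEXT)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?)",
    [(1, "Mark"), (2, "Peter"), (3, "Meg")],
)

# Process the table two rows at a time; only one chunk is in memory at once.
total_rows = 0
for chunk in pd.read_sql_query("SELECT * FROM Student", conn, chunksize=2):
    total_rows += len(chunk)  # each chunk is a small DataFrame

print(total_rows)  # → 3
```

The same per-chunk pattern works for sums, group counts, or writing each chunk elsewhere.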

Querying the DB Directly with Pandas' SQL Syntax

Extracting insights from the database is an important part of the job for data analysts and scientists. To do so, we will leverage the read_sql_query() function.

What is read_sql_query()?

Using Pandas' read_sql_query() function, we can run SQL queries and get the results directly into a DataFrame. The read_sql_query() function is created specifically for SELECT statements. It cannot be used for any other operations, such as DELETE or UPDATE.
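One detail worth knowing: read_sql_query() accepts a params argument, which lets the database driver substitute values safely rather than formatting user-supplied values into the SQL string. Here's a minimal sketch, using a throwaway SQLite table only to make the example runnable:

```python
import sqlite3

import pandas as pd

# Throwaway table so the query has something to read.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (rollNumber INTEGER, firstName TEXT)")
conn.execute("INSERT INTO Student VALUES (1, 'Mark')")

# The ? placeholder is SQLite's qmark style; the driver fills it in from params.
df = pd.read_sql_query(
    "SELECT firstName FROM Student WHERE rollNumber = ?",
    conn,
    params=(1,),
)
print(df.iloc[0, 0])  # → Mark
```

Placeholder syntax varies by driver (for example, %(name)s for psycopg2), so check your database driver's paramstyle.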


```python
pandas.read_sql_query(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, chunksize=None, dtype=None, dtype_backend=_NoDefault.no_default)
```

All parameter descriptions are the same as the read_sql() function. Here's an example of read_sql_query():


```python
import pandas as pd
from sqlalchemy import create_engine

# Create an engine object
engine = create_engine('sqlite:///C/SQLite/student.db')

# Query the Student table
df = pd.read_sql_query(
    'Select firstName, lastName From Student Where rollNumber = 1', engine
)
print(df)

# Close the Database connection
engine.dispose()
```


```
  firstName lastName
0      Mark   Simson
```

Writing SQL Files with Pandas

While analyzing data, suppose we discovered that a few entries need to be modified or that a new table or view with the data is required. To update or insert a new record, one option is to use read_sql() and write a query. However, that approach can be lengthy. Pandas provides a great method called to_sql() for situations like this.

In this section, we will first build a new table in the database and then edit an existing one.

Creating a New Table in the SQL Database

Before we create a new table, let's first discuss to_sql() in detail.

What is to_sql()?

The to_sql() function of the Pandas library allows us to write or update the database. The to_sql() function can save DataFrame data to a SQL database.

Syntax for to_sql():

DataFrame.to_sql(name, con, schema=None, if_exists='fail', index=True, index_label=None, chunksize=None, dtype=None, method=None)

Only name and con parameters are mandatory to run to_sql(); however, other parameters provide additional flexibility and customization options. Let's discuss each parameter in detail:

  • name: The name of the SQL table to be created or altered.
  • con: The connection object of the database.
  • schema: The schema of the table (optional).
  • if_exists: The default value of this parameter is "fail". This parameter allows us to decide the action to be taken if the table already exists. Options include "fail", "replace", and "append".
  • index: The index parameter accepts a boolean value. By default, it is set to True, meaning the index of the DataFrame will be written to the SQL table.
  • index_label: This optional parameter allows us to specify a column label for the index columns. By default, the index is written to the table, but a specific name can be given using this parameter.
  • chunksize: The number of rows to be written at a time in the SQL database.
  • dtype: This parameter accepts a dictionary with keys as column names and values as their datatypes.
  • method: The method parameter specifies how data is inserted into the SQL database. By default, it is set to None, which means pandas will pick the most efficient approach for the database. There are two main options for the method parameter:
    • multi: It allows inserting multiple rows in a single SQL query. However, not all databases support multi-row insert.
    • Callable function: Here, we can write a custom function for insert and call it using method parameters.
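To make the callable option concrete, here is a hedged sketch of a custom insert function. The (table, conn, keys, data_iter) signature is the one pandas documents for the method parameter; with the plain SQLite connection used here, the conn argument behaves like a DBAPI cursor. The name logged_insert is our own, not a pandas API.

```python
import sqlite3

import pandas as pd

# Custom insert callable: pandas calls it once per chunk with the table
# wrapper, a connection/cursor, the column names, and an iterator of rows.
def logged_insert(table, conn, keys, data_iter):
    rows = list(data_iter)
    print(f"inserting {len(rows)} rows into {table.name}")
    columns = ", ".join(keys)
    placeholders = ", ".join("?" for _ in keys)
    conn.executemany(
        f"INSERT INTO {table.name} ({columns}) VALUES ({placeholders})", rows
    )

conn = sqlite3.connect(":memory:")
df = pd.DataFrame({"Name": ["Paul", "Tom", "Jerry"], "Age": [9, 8, 7]})
df.to_sql("Customer", conn, index=False, method=logged_insert)

count = pd.read_sql_query("SELECT COUNT(*) AS n FROM Customer", conn)["n"].iloc[0]
print(count)  # → 3
```

A callable like this is useful for logging, upserts, or database-specific bulk-load commands that pandas doesn't generate itself.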

Here's an example using to_sql():

```python
import pandas as pd
from sqlalchemy import create_engine

# Create an engine object
engine = create_engine('sqlite:///C/SQLite/student.db')

# Create a new Dataframe that will be our new table
data = {'Name': ['Paul', 'Tom', 'Jerry'], 'Age': [9, 8, 7]}
df = pd.DataFrame(data)

# Create a new table called Customer
df.to_sql('Customer', con=engine, if_exists='fail')

# Close the Database connection
engine.dispose()
```

A new table called Customer is created in the database, with two fields called "Name" and "Age."

Database snapshot:

Updating Existing Tables with Pandas Dataframes

Updating data in a database is a complex task, particularly when dealing with large data. However, using the to_sql() function in Pandas can make this task much easier. To update the existing table in the database, the to_sql() function can be used with the if_exists parameter set to "replace". This will overwrite the existing table with the new data.

Here is an example of to_sql() that updates the previously created Customer table. Suppose, in the Customer table we want to update the age of a customer named Paul from 9 to 10. To do so, first, we can modify the corresponding row in the DataFrame, and then use the to_sql() function to update the database.


```python
import pandas as pd
from sqlalchemy import create_engine

# Create a connection to the SQLite database
engine = create_engine('sqlite:///C/SQLite/student.db')

# Load Customer Table into a Dataframe
df = pd.read_sql_table('Customer', engine)

# Modify the age of the customer named Paul
df.loc[df['Name'] == 'Paul', 'Age'] = 10

# Update the Customer table with the modified DataFrame
df.to_sql('Customer', con=engine, if_exists='replace')

# Close the Database connection
engine.dispose()
```

In the database, Paul's age is updated:


In conclusion, Pandas and SQL are both powerful tools for data analysis tasks such as reading and writing data to the SQL database. Pandas provides an easy way to connect to the SQL database, read data from the database into a Pandas dataframe, and write dataframe data back to the database.

The Pandas library makes it easy to manipulate data in a dataframe, whereas SQL provides a powerful language for querying data in a database. Using both Pandas and SQL to read and write the data can save time and effort in data analysis tasks, especially when the data is very large. Overall, leveraging SQL and Pandas together can help data analysts and scientists streamline their workflow.

Categories: FLOSS Project Planets

The Python Coding Blog: From Classes to Turtles via Functools and more | May in Review

Wed, 2023-05-31 04:25

I kept myself quite busy in May. Not only did I finish running the first The Python Coding Programme cohort, but I also started running the second. And I published seven new articles on The Python Coding Stack. Here's a roundup:

Here’s what the animation looks like:

Coming in June: further instalments in both the object-oriented programming series and the data structure categories, and more…

I’m also working on the video courses and I hope to publish the first set of courses very soon.

Subscribe to The Python Coding Stack

Regular articles for the intermediate Python programmer or a beginner who wants to “read ahead”


The post From Classes to Turtles via Functools and more | May in Review appeared first on The Python Coding Book.

Categories: FLOSS Project Planets

Python GUIs: Your First Steps With the Kivy Library for GUI Development

Wed, 2023-05-31 04:00

Kivy is an open-source Python software library for developing graphical user interfaces. It supports cross-platform development for the desktop as well as the creation of multi-touch apps for mobile devices.

Kivy apps can run across several platforms, including Windows, Linux, macOS, Android, and iOS. One place where Kivy particularly shines is in game development. By combining Kivy's 2D graphics capabilities with a simple physics engine, you can create impressive 2D simulations and games.

In this article, you'll learn about using Kivy to develop Python apps. We will go through an introduction to Kivy and what it can do. You'll learn how to create a simple Kivy app in Python and learn the basics of Kivy's styling language, known as Kv. Finally, you'll use Kivy's graphics module to draw 2D shapes on the Kivy canvas.

To get the most out of this tutorial, you should have basic knowledge of Python. Previous knowledge of general concepts of GUI programming, such as event loops, widgets, layouts, and forms, is also a plus.


There are many different Python GUI libraries available, and choosing one for your project can be a really tough and confusing decision to make. For advice see our guide to Python GUI libraries.

Let's get started. We'll first take a few moments to install and set up Kivy on your computer.

Installing Kivy

Before using a third-party library like Kivy, we must install it in our working environment. Installing Kivy is as quick as running the python -m pip install kivy command on your terminal or command line. This command will install the library from the Python package index (PyPI).

Note that as of the time of writing this tutorial, Kivy only officially supports Python versions up to 3.10. For detailed information about installing Kivy, visit the official installation guide.

However, when working with third-party libraries, it's good practice to create a Python virtual environment, which is a self-contained Python installation for a particular version of Python that you can use to isolate the dependencies of a given project.

To create a virtual environment, you'll typically use Python's venv module from the standard library. Fire up a command-line window and type in the following command in your working directory.

```sh
$ python -m venv kivy_env
```

This command will create a folder called kivy_env containing a Python virtual environment. The Python version in this environment is the same as you get when you run python --version on your command line.

Next, we need to activate the virtual environment. Use the appropriate command, depending on whether you're on Windows, macOS, or Linux:

```sh
# Windows
C:\> .\kivy_env\Scripts\activate

# macOS
$ source kivy_env/bin/activate

# Linux
$ source kivy_env/bin/activate
```

Once that's confirmed to be working, you can then install Kivy within the virtual environment you just created by running the following:

```sh
(kivy_env) $ python -m pip install kivy
```

With this command, you'll install Kivy in your active Python virtual environment, so you're now ready to go.

You can also install Kivy by downloading its source code directly from GitHub and doing a manual installation on the command line. For more information on following this installation path, check out the section about installing Kivy from source in the documentation.

Writing Your First Kivy GUI App in Python

Without further ado, let's get right into creating our first app with Kivy and Python. For this app, we will use a Label object to display the traditional "Hello, World!" message on our screen. To write a minimal Kivy GUI app, we need to run a few steps:

  1. Subclassing the App class
  2. Implementing its build() method, which returns a Widget instance
  3. Instantiating this class and calling its run() method

Let's start by importing the required classes. For our example, we only need the App and Label classes. Create a Python file and add the following imports:

```python
from kivy.app import App
from kivy.uix.label import Label
```

The App class provides the base functionality required to create GUI apps with Kivy, such as managing the event loop. Meanwhile, the Label class will work as the root visual element or widget for our GUI.

Next, we can create our subclass of App. We have called it MainApp here. However, you can call it whatever you like:

```python
from kivy.app import App
from kivy.uix.label import Label

class MainApp(App):
    def build(self):
        return Label(text="Hello, World!")
```

This subclass uses the concept of inheritance in object-oriented programming (OOP) in Python. All the attributes and methods defined in the superclass, App, are automatically inherited by the subclass, MainApp.

In order for our app to create a UI, we need to define a build() method. In build(), we create and return either a widget or layout, which will be the root object in our UI structure.

The build() method is the entry point to whatever will be drawn on the screen. In our example, it creates and returns a label with the "Hello, World!" text on it.

Finally, we need to create an instance of MainApp and call its run() method:

```python
from kivy.app import App
from kivy.uix.label import Label

class MainApp(App):
    def build(self):
        return Label(text="Hello, World!")

MainApp().run()
```

In the final line, we create an instance of MainApp and call its run() method. This method launches the application and runs its main loop. That's it! We're ready to run our first Kivy app. Open your command line and run the following command:

```sh
$ python
```

You'll see the following window on your screen:

First Kivy GUI Application

Great! You've just written your first Kivy GUI app using Python. It shows a black window with the message "Hello, World!" in its center. Note that the window's title bar shows the title Main, which comes from the name of your App subclass.

The next step is to explore some other essential features of Kivy that will allow you to write fully-functional GUI apps with this library.

Exploring Widgets and Layouts

In the previous section, we mentioned widgets and layouts a few times -- you may be wondering what they are! A widget is an element of a GUI that displays information or provides a specific function. They allow your users to interact with your app's GUI.

A layout, on the other hand, provides a way of arranging widgets into a particular structure in your application's windows. A layout can also give certain behaviors to widgets that belong to it, like the ScatterLayout, which enables multi-touch resizing of a child widget.

In Kivy, you'll find widget and layout classes in their corresponding module under the kivy.uix module. For example, to import the Button class, we can use:

```python
from kivy.uix.button import Button
```

In Kivy, widgets and layout classes are usually located in modules named after the class itself. However, the class uses CamelCase, and the containing module uses lower casing.

For example, take the following imports:

```python
# Widgets
from kivy.uix.label import Label
from kivy.uix.image import Image

# Layouts
from kivy.uix.boxlayout import BoxLayout
from kivy.uix.floatlayout import FloatLayout
```

You'll find some exceptions to this naming convention. For example:

```python
from kivy.uix.image import AsyncImage
from kivy.uix.screenmanager import FadeTransition
```

This commonly happens with modules that define multiple and closely related classes, such as Image and AsyncImage.


Widgets are the building blocks of Kivy-based GUIs. Some of the most commonly used GUI widgets in Kivy apps include the following:

  • Widget is the base class required for creating widgets.
  • Label is used for rendering text on windows and dialogs.
  • TextInput provides a box for editable plain text.
  • Button triggers actions when the user presses it.
  • CheckBox provides a two-state button that can be either checked or unchecked.
  • Image is used to display an image on your GUIs.
  • ProgressBar visualizes the progress of some tasks.
  • DropDown provides a versatile drop-down list that can list different widgets.

With these widgets and some others that Kivy provides, you can build complex and user-friendly interfaces for your applications.


Kivy also has a rich set of layout classes that allows you to arrange your widgets coherently and functionally to build up GUIs. Some examples of common layouts include:

  • BoxLayout arranges widgets sequentially in either a vertical or horizontal fashion.
  • FloatLayout arranges widgets in a specific position on the containing window.
  • RelativeLayout arranges child widgets according to relative positions.
  • GridLayout arranges widgets in a grid defined by the rows and columns.
  • PageLayout creates multi-page layouts in a way that allows flipping from one page to another.
  • ScatterLayout positions its child widgets similarly to a RelativeLayout.
  • StackLayout stacks in a left-to-right and then top-to-bottom order, or top-to-bottom then left-to-right order.

You can combine and nest layouts together to build complex user interfaces.

Using Widgets and Layouts: A Practical Example

As an example of how to use widgets and layouts in Kivy, let's look at a commonly used layout class: the GridLayout. With this class, we can create a grid of rows and columns. Each cell of the grid has a unique pair of zero-based coordinates. Consider the following example:

```python
from kivy.app import App
from kivy.uix.button import Button
from kivy.uix.gridlayout import GridLayout

ROWS = COLS = 3

class GridApp(App):
    def build(self):
        root = GridLayout(rows=ROWS, cols=COLS)
        for i in range(ROWS):
            for j in range(COLS):
                root.add_widget(Button(text=f"({i}, {j})"))
        return root

GridApp().run()
```

In the build() method, we instantiate the GridLayout with three rows and three columns. Then use a for loop to add button widgets to the layout using the add_widget() method.

When we run this app, we get the window that is shown below:

Grid Layout in Kivy

Each button on the grid shows its corresponding pair of coordinates. The first coordinate represents the row, while the second represents the column. Like the rest of the layout classes, GridLayout can take several arguments that you can use to fine-tune its behavior.

Drawing Shapes in Kivy: The canvas Property

To deeply customize a GUI or design a 2D video game, we may need to draw 2D shapes, such as a rectangle, circle, ellipse, or triangle. Doing this is straightforward in Kivy. The library provides a rich set of shape classes in the kivy.graphics package, including Rectangle, Ellipse, Line, Triangle, and Bezier.

To draw a shape on the screen with Kivy, we need to use the canvas property of a Widget object. This property holds an instance of the Canvas class, which also lives in the kivy.graphics package.

Let's see how this works with an example of a white square drawn on the screen:

```python
from kivy.app import App
from kivy.core.window import Window
from kivy.graphics import Rectangle
from kivy.uix.widget import Widget

class CanvasApp(App):
    def build(self):
        root = Widget()
        size = 200
        width, height = Window.size
        pos_x = 1/2 * (width - size)
        pos_y = 1/2 * (height - size)
        with root.canvas:
            Rectangle(size=[size, size], pos=[pos_x, pos_y])
        return root

CanvasApp().run()
```

Inside build(), we create the root widget and define the size of our shape. It'll be a square shape, so each side is equal.

Next, we compute the coordinates to center our shape on the window. The pos coordinates we pass when creating the shape set the position of its bottom-left corner, measured from the bottom-left corner of the window, which is Kivy's origin.

To calculate the correct values, we take the width and height of our main window, halving these values to get the center. We then subtract half of the width or height of our shape to position the center of our shape in the middle of the window. This simplifies to 1/2 * (width - size) and 1/2 * (height - size). We store the resulting corner coordinates in pos_x and pos_y.
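You can sanity-check the centering arithmetic with concrete numbers. The window dimensions below are assumptions for illustration only:

```python
# Centering math with concrete numbers: an 800x600 window (assumed) and a
# 200-pixel square.
size = 200
width, height = 800, 600

# Half the leftover space on each axis puts the square in the middle.
pos_x = 1/2 * (width - size)
pos_y = 1/2 * (height - size)

print(pos_x, pos_y)  # → 300.0 200.0
```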

Next, we use the canvas property of our root window to draw the shape. This property supports the with statement, which provides the appropriate context for creating our shapes. Inside the with block, we define our Rectangle instance with the size and pos arguments.

Finally, we return the root widget as expected. The final line of code creates the app instance and calls its run() method. If you run this app from your command line, then you'll get the following window on the screen:

Drawing Shapes in Kivy With Canvas

Cool! You've drawn a square on your Kivy app. The computed coordinates place the square in the center of the window. The default color is white. However, we can change it:

```python
# ...
from kivy.graphics import Color, Rectangle
from kivy.uix.widget import Widget
# ...

class CanvasApp(App):
    def build(self):
        # ...
        with root.canvas:
            Color(1, 1, 0, 1)
            Rectangle(size=[size, size], pos=[pos_x, pos_y])
        # ...
```

In this code snippet, we have added an import for the Color class from the graphics package. The Color class accepts four numeric arguments between 0 and 1 representing the red, green, blue, and transparency components of our target color.

For example, the values (1, 0, 0, 1) represent an entirely red and fully opaque color. The value (0, 1, 0, 0.5) is fully green and half transparent. Consequently, the value (1, 1, 0, 1) gives a fully opaque yellow color. So, if you run the app, then you'll get the following output:
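If you are more used to 0-255 RGB values, converting them to Kivy's 0-1 floats is straightforward. The helper below is our own invention, not part of the Kivy API:

```python
# Kivy color components are floats in the 0-1 range; this hypothetical
# helper converts familiar 0-255 RGB(A) values into that form.
def rgba255(r, g, b, a=255):
    return (r / 255, g / 255, b / 255, a / 255)

print(rgba255(255, 255, 0))  # → (1.0, 1.0, 0.0, 1.0), opaque yellow
```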

Drawing Shapes in Color With Kivy

We can experiment with different color values and also with different shape classes, which is cool.

Finally, note that to see the effect of the Color() on the drawn rectangle, the Color class must be instantiated before the Rectangle class. You can think of this as dipping your paintbrush on a palette before using it to paint on your canvas! Interestingly, any drawing that comes after the Color instance is painted accordingly so long as a different color has not been applied.

Using the with statement is pretty convenient and facilitates working with shapes. Alternatively, we can use the canvas.add() method:

```python
root.canvas.add(Color(1, 1, 0, 1))
root.canvas.add(
    Rectangle(size=[size, size], pos=[pos_x, pos_y])
)
```

These statements are equivalent to the statements we have in the with block. Go ahead and give it a try yourself.

Styling Your GUIs With the Kivy Language

Kivy also provides a declarative language known as the Kv language, which aims at separating your application's GUI design and business logic. In this tutorial, we will not go deep into using the Kv language. However, we will highlight some of its main features and strengths.

With Kv language, you can declare and style the widgets and graphical components of your GUI apps. You will put your Kv code in files with the .kv extension. Then you can load the content of these files into your app to build the GUI. You'll have at least two ways to load the content of a .kv file:

  • Relying on the automatic loading mechanism
  • Using the Builder class for manual loading

In the following sections, you'll learn the basics of these two ways of using the Kv language to build the GUI of your Kivy apps.

Relying on the Automatic Widget Loading

As stated earlier, the Kv language helps you separate business logic from GUI design. Let's illustrate this possibility with an updated version of our "Hello, World!" app:

```python
from kivy.app import App
from kivy.uix.label import Label

class CustomLabel(Label):
    pass

class MainApp(App):
    def build(self):
        root = CustomLabel()
        return root

MainApp().run()
```

As you can see, we have subclassed the Label class to create a new CustomLabel class. We haven't made any modifications in the subclass, so it functions exactly like the Label class but with a different name. We add a pass statement, which is a Python placeholder statement that makes the code syntactically valid.

Next, create a file called main.kv alongside your app's file. Define a label using the following code:

```kv
<CustomLabel>:
    text: "Hello, World!"
```

Note that your label must have the same name as your custom Python class in the app's file. Additionally, the .kv file must have the same name as your subclass of App, but without the App suffix and in lowercase. In this example, your subclass is named MainApp, so your .kv file must be main.kv.

Now you can run the app from your command line. You'll get the following window on your screen:

Kivy Application Using the Kv Language

The Kv language, also known as kivy language or just kvlang, allows us to create widget trees in a declarative way. It also lets you bind widget properties to each other or to callbacks.

Loading Widgets Through the Builder Class

When your Kivy project grows, your .kv file will grow as well. So, it is recommended that you split up the file into different files for readability. In such cases, you will end up with multiple .kv files, and the automatic loading mechanism will not be sufficient. You'll have to use the Builder class from kivy.lang.

To explore how to use Builder, let's build a sample GUI consisting of a label and button in a BoxLayout. The label will be provided in the labels.kv file, while the buttons will live in the buttons.kv file.

Here's the Python code for this app:

```python
from kivy.app import App
from kivy.lang import Builder
from kivy.uix.boxlayout import BoxLayout
from kivy.uix.button import Button
from kivy.uix.label import Label

Builder.load_file("labels.kv")
Builder.load_file("buttons.kv")

class CustomLabel(Label):
    pass

class CustomButton(Button):
    pass

class MainApp(App):
    def build(self):
        root = BoxLayout(orientation="vertical")
        root.add_widget(CustomLabel())
        root.add_widget(CustomButton())
        return root

MainApp().run()
```

After importing the required classes, we call the load_file() method. This method takes the filename of a .kv file as an argument and loads it into your app.

Next, you create the custom label and button following the pattern used in the previous section. Inside build(), you create a BoxLayout and add the two widgets to it. Now you need to provide the required .kv files.

Go ahead and create a labels.kv file with the following content:

```kv
<CustomLabel>:
    text: "This is a custom label!"
    font_size: 50
    bold: True
```

This file provides a label with the text "This is a custom label!". Its font will have a size of 50 pixels and will be bold.

The buttons.kv will have the following code:

```kv
<CustomButton>:
    text: "Click me!"
```

Your custom button will be quite minimal. It'll only have the text "Click me!" on it. Go ahead and run the app from your command line. You'll get the following window on your screen:

Kivy Application Using the Kv Language With Multiple kv Files

In addition to using the load_file() to build Kv language files, you can also parse and load Kv language directly in a multi-line string in your Python file:

```python
Builder.load_string("""
<CustomLabel>:
    text: "This is a custom label!"
    font_size: 50
    bold: True
""")

Builder.load_string("""
<CustomButton>:
    text: "Click me!"
""")
```

These calls to load_string() are completely equivalent to the corresponding calls to load_file() in our original code example.

Let's take a look at a final example of using the Kv language. This time we'll use the language to draw shapes. Create a file with the following content:

```python
from kivy.app import App
from kivy.uix.widget import Widget

class CustomRectangle(Widget):
    pass

class MainApp(App):
    def build(self):
        return CustomRectangle()

MainApp().run()
```

Now go ahead and create another file in the same directory and save it as main.kv. Then add the following content:

```kv
<CustomRectangle>:
    canvas:
        Color:
            rgba: 1, 1, 0, 1
        Rectangle:
            size: 200, 200
            pos: 0, 0
```

If you run the file, then you will see a 200×200 pixel rectangle (a square in this case) at the lower left corner of your window! For more guidelines on using the Kv language, check out its official documentation.

More Resources

For some more examples of what you can do with Kivy, take a look at the Kivy examples section in the documentation. Depending on your interest, you can also check out the other resources. For example:

  • If you're interested in 3D, then the Kivy 3D demo gives a good demonstration of the framework's rendering abilities.
  • If you're interested in using Kivy to develop for mobile, you can write functional Android apps (APKs) with Python and pack them using tools like Buildozer and Python-For-Android without learning Java.
  • If you want a complete vision of where you can use Kivy, then check out the gallery of examples provided by the Kivy community.

What you've learned in this tutorial is just the tip of the Kivy iceberg. There's so much more to Kivy than what meets the eye. It's a powerful GUI library that provides a well-structured hierarchy of classes and objects that you can use to create modern and cross-platform GUIs for your applications.

Categories: FLOSS Project Planets

Brett Cannon: In response to the Changelog #526

Tue, 2023-05-30 20:07

In episode 526 of the Changelog podcast, entitled "Git with your friends", they discussed various tools involving git (disclaimer: I have been on the podcast multiple times and had dinner with the hosts of the podcast the last time they were in Vancouver). Two of the projects they discussed happened to be written in Python. That led Jerod to say:

The Python one gives me pause as well, just because I don’t know if it’s gonna go right.


Jerod and Adam know I tend to run behind in listening to my podcasts, so they actually told me to listen to the podcast and let them know what I thought, hence this blog post (they also told me to write a blog post when I asked where they wanted "my rant/reply", so they literally asked for this 😉).

To start, Jerod said:

If it’s pip install for me, I just have anxiety… Even though it works most of the time. It’s the same way – and hey, old school Rubyist, but if I see your tool and I see it’s written in Ruby, I’m kind of like “Uhm, do I want to mess with this?” And that’s how I am with Python as well. Their stories are just fraught.

To me, that's a red flag that Jerod is installing stuff globally and not realizing that he should be using a virtual environment to isolate his tools and their dependencies. Now, I consider asking non-Python developers to create virtual environments to be a lot, and instead I would recommend using pipx. That allows one to install a Python-based tool in a very straightforward manner using pipx install into their .local directory. I also expect pipx to be available from all the major OS package managers, so installing it (and implicitly Python) shouldn't be too difficult.

If you don't want the tool you are running to be installed "permanently", pipx run will create a virtual environment in a temp directory so your system can reclaim the space when it wants to. This also has a side-effect of grabbing a newer version of the tool on occasion, as the files will occasionally be deleted.

Another option is if projects provide a .pyz file. When projects do that, they are giving users a zip file that is self-contained in terms of Python code, such that you can just pass that to your Python interpreter to run something (e.g. python tool.pyz). That avoids any installation overhead or concerns over what Python interpreter you have installed at any point since you point any Python interpreter at the .pyz file (compatibility permitting).

For the pipx scenario we probably need projects to take the lead in writing their documentation about this, as non-Python developers quite possibly don't know about either option. The .pyz solution requires the project itself to build something as part of its release process, which is also a bigger ask.

Jerod did provide a little bit of clarification later about what his concerns were:

Yeah. I have no problem with Ruby-based things. But if you say gem-install this tool, I’m like “You know what? I don’t really trust my Ruby environment over the course of years on my Mac”, and I’m the same way with Python. Whereas with Go, and with Rust, it seems - and JavaScript had the same bad story for me, but Deno with TypeScript is showing some new opportunities to have universal binaries, which is cool… I’m just way more likely to say “If you can just grab a binary, drop it in your path and execute it, I will do that 100 times a day.” But if your tool says PIP install, or it says gem install, or says npm install, I’m kind of like “Do I want to mess with this?” That’s just my sense.

So that does tie into the above guess that Jerod isn't using virtual environments. But you could stretch this out and say Jerod is even concerned that his Python interpreter will change or disappear, breaking any code he installed for that. In that instance, pipx run is rather handy as it will implicitly install Python if you got it from your OS package manager. You can also install Python explicitly instead of relying on it coming installed in the OS (which is now a Unix thing since macOS stopped shipping Python by default).

There is also the option of turning your Python tool into a standalone application. You can use things like Briefcase for GUI apps and PyApp for CLI apps. But this does make releasing harder, as the project is now being asked to maintain builds for various operating systems instead of simply delegating that job to Python itself.

Now, Adam wanted even less work to do in order to get a tool:

if it’s on Linux, it should be in Apt, or whatever your [unintelligible 00:45:04.05] Yum, or pick your – it should be a package. Or you should have to update your registry with whatever package directory you want to use, and apt update, and get that, and install. That’s my feelings. I don’t like to PIP install anything if I don’t have to.

The trick with this is that you, the tool developer, do not have direct control over whether an OS package manager picks up your tool for packaging. Since you don't control that distribution mechanism, there is no guarantee that you can get your tool into the package manager you want (e.g., Homebrew can choose to reject your project for inclusion).

The other problematic part is that there are multiple OS package managers to get into, and that's even if you restrict yourself to the "major" Linux distributions:

  1. Fedora
  2. Debian
  3. Arch
  4. SUSE
  5. Gentoo
  6. Slackware
  7. CentOS

And that's not covering the BSDs:

  1. FreeBSD
  2. OpenBSD
  3. NetBSD

or Windows:

  1. Scoop
  2. Chocolatey
  3. winget

or macOS:

  1. Homebrew (which is also available on Linux)
  2. MacPorts

And so supporting that installation flow of just installing from an OS package manager takes work, as you're now coordinating your tool with multiple OSs which all have their own metadata format, requirements for inclusion, ways to be told about updates, etc. It's not a small lift to make happen if you want a large swath of coverage for your project.

Hopefully this has alleviated some of Jerod's concerns and shown Adam that his ask isn't small. 😉 But what's the best approach for a Python tool project to take?

Unfortunately I don't know if there's a simple answer here. For instance, if people were to use PyApp to build a self-contained binary, would people download it and actually use it, or would they be like Adam and just look for what's available from their OS package manager? Where is the best cost:benefit ratio for each of the options suggested above that warrants complicating your release process? I think documenting pipx and, where possible, making a .pyz available both make sense. But as to whether standalone binaries make sense, or whether it's a better use of time to try to get into the OS package managers, I honestly don't know.

Categories: FLOSS Project Planets

PyCoder’s Weekly: Issue #579 (May 30, 2023)

Tue, 2023-05-30 15:30

#579 – MAY 30, 2023
View in Browser »

Python’s .__call__() Method: Creating Callable Instances

In this tutorial, you’ll learn what a callable is in Python and how to create callable instances using the .__call__() special method in your custom classes. You’ll also code several examples of practical use cases for callable instances in Python.
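As a minimal illustration of the idea (this Multiplier class is our own example, not taken from the tutorial):

```python
class Multiplier:
    """Instances behave like functions thanks to .__call__()."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, value):
        # Invoked when an instance is called like a function.
        return value * self.factor

double = Multiplier(2)
print(double(21))        # → 42
print(callable(double))  # → True
```

Because the instance carries state (factor), callable instances are a natural fit for parameterized functions, memoizing caches, and similar use cases.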

Sorting a Django Queryset Using Custom Attributes

“Typically, Django allows sorting a queryset by any attribute on the model or related to it in either ascending or descending order. However, what if you need to sort the queryset following a custom sequence of attribute values?”

Analyze Your Python Code for Security Issues for Free

Semgrep is trusted by hundreds of thousands of developers at top companies, such as GitLab, Snowflake, Slack, and many more, to ensure the security of their code (SAST) and dependencies (SCA). Add your project in 1 minute and see for yourself →
SEMGREP sponsor

Python Decorators and How to Use Them Effectively

This article covers the importance and use of decorators in your code. It introduces you to both function and class decorators and helps you write your own.
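As a taste of what the article covers, here is a minimal function decorator (our own illustrative example, not from the article itself):

```python
import functools

def logged(func):
    """A simple function decorator that counts calls to the wrapped function."""
    @functools.wraps(func)  # preserve the original name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

@logged
def add(a, b):
    return a + b

print(add(2, 3))   # → 5
print(add.calls)   # → 1
```

The @logged line is equivalent to writing add = logged(add) after the function definition.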

PyPI Was Subpoenaed

In March and April 2023, PyPI received three subpoenas for user data from the US Department of Justice. This blog post covers what was requested and how PyPI is working to clarify what it retains and can make available in the future. See the associated Hacker News discussion.

Python 3.12.0 Beta 1 Released


Removing PGP From PyPI


Discussions How to Move From Dev to Management/Team Lead Role?


Articles & Tutorials Writing Python Like It’s Rust

This blog post from Jakub talks about how writing code in Rust has informed a more rigorous approach to his Python. He now uses types more frequently, absorbing the strictness of Rust in his Python coding style. Associated Hacker News conversation

Publishing Python Packages to PyPI

In this video course, you’ll learn how to create a Python package for your project and how to publish it to PyPI, the Python Package Repository. Quickly get up to speed on everything from naming your package to configuring it using setup.cfg.

A New Approach to Find High Paying Remote Jobs

Trusted by 2000+ developers from 120+ countries. Proxify provides software developers with an effortless, fast, and reliable way to find high-paying remote job opportunities. Join the most developer-friendly community today and start working on engagements with Top clients in the USA & EU →
PROXIFY sponsor

Using k-Nearest Neighbors (kNN) in Python

In this video course, you’ll learn all about the k-nearest neighbors (kNN) algorithm in Python, including how to implement kNN from scratch. Once you understand how kNN works, you’ll use scikit-learn to facilitate your coding process.

Programming Types and Mindsets

David expounds on why we should appreciate the features of other languages and how they enable the creativity of their developers, even if we don’t like those features ourselves.

Using a Golang Package in Python Using Gopy

Including a Golang package in Python using Gopy: A simple way to leverage the power of Golang packages in Python applications.
ARJUN MAHISHI • Shared by Prathamesh

Choosing a Good File Format for Pandas

CSV, JSON, Parquet — which data format should you use for your Pandas data? Itamar compares them and makes recommendations.

The Power of Bit Manipulation

In this article, you learn about bit manipulation and how to solve problems efficiently using it in Python.

Projects & Code unimport: Remove Unused Import Statements in Your Code


ChatSQL: Convert Plain Text to SQL Through ChatGPT


pyserde: Dataclass Based Serialization Library


pyscan: Rust Based Python Dependency Vulnerability Scanner


guidance: Language for Controlling Large Language Models


Events DjangoCon Europe 2023

May 29 to June 3, 2023

Weekly Real Python Office Hours Q&A (Virtual)

May 31, 2023

Canberra Python Meetup

June 1, 2023

PyData London 2023

June 2 to June 5, 2023

PyDay La Paz 2023

June 3 to June 4, 2023

Happy Pythoning!
This was PyCoder’s Weekly Issue #579.
View in Browser »

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

Categories: FLOSS Project Planets

Stack Abuse: How to Split String on Multiple Delimiters in Python

Tue, 2023-05-30 14:42

Among the many string operations, splitting a string is a significant one: it lets you divide a large, composite text into smaller, manageable components. Typically, we use a single delimiter like a comma, space, or a special character for this purpose. But what if you need to split a string based on multiple delimiters?

Imagine a situation where you're dealing with text data punctuated with various separators, or you're parsing a complex file with inconsistent delimiters. This is where Python's ability to split strings on multiple delimiters truly shines.

In this article, we'll give you a comprehensive overview of the different techniques of multi-delimiter string splitting in Python. We'll explore core Python methods, regular expressions, and even external libraries like Pandas to achieve this.

The str.split() Method can Split Strings on Only One Delimiter

The str.split() method is Python's built-in approach to dividing a string into a list of substrings. By default, str.split() uses whitespace (spaces, tabs, and newlines) as the delimiter. However, you can specify any character or sequence of characters as the delimiter:

text = "Python is a powerful language"
words = text.split()
print(words)

Running this code will result in:

['Python', 'is', 'a', 'powerful', 'language']

In this case, we've split the string into words using the default delimiter - whitespace. But what if we want to use a different delimiter? We can pass it as an argument to split():

text = "Python,is,a,powerful,language"
words = text.split(',')
print(words)

Which will give us:

['Python', 'is', 'a', 'powerful', 'language']

While str.split() is highly useful for splitting strings with a single delimiter, it falls short when we need to split a string on multiple delimiters. For example, if we have a string with words separated by commas, semicolons, and/or spaces, str.split() cannot handle all these delimiters simultaneously.
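A quick illustration of that limitation:

```python
text = "apples,oranges;pears bananas"

# Splitting on commas alone leaves the semicolon and space untouched.
print(text.split(','))  # → ['apples', 'oranges;pears bananas']
```

The other delimiters simply end up embedded in the resulting substrings.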

Advice: Reading our guide "Python: Split String into List with split()" will help you gain a deeper understanding of the split() method in Python.

In the upcoming sections, we will explore more sophisticated techniques for splitting strings based on multiple delimiters in Python.

Using Regular Expressions - the re.split() Method

To tackle the issue of splitting a string on multiple delimiters, Python provides us with the re (Regular Expressions) module. Specifically, the re.split() function is an effective tool that allows us to split a string using multiple delimiters.

Regular expressions (or regex) are sequences of characters that define a search pattern. These are highly versatile, making them excellent for complex text processing tasks.

Consider the following string:

text = "Python;is,a powerful:language"

If you want to extract words from it, you must consider multiple delimiters. Let's take a look at how we can use re.split() to split a string based on multiple delimiters:

import re

text = "Python;is,a powerful:language"
words = re.split(';|,| ', text)
print(words)

This will give us:

['Python', 'is', 'a', 'powerful', 'language']

We used the re.split() method to split the string at every occurrence of a semicolon (;), comma (,), or space ( ). The | symbol means "or" in regular expressions, so the pattern ';|,| ' can be read as "semicolon or comma or space".

This function demonstrates far greater versatility and power than str.split(), allowing us to easily split a string on multiple delimiters.

Advice: You can find more about Python regular expressions in our "Introduction to Regular Expressions in Python".

In the next section, we'll take a look at another Pythonic way to split strings using multiple delimiters, leveraging the translate() and maketrans() methods.

Using translate() and maketrans() Methods

Python's str class provides two powerful methods for character mapping and replacement: maketrans() and translate(). When used in combination, they offer an efficient way to replace multiple delimiters with a single common one, allowing us to use str.split() effectively.

The maketrans() method returns a translation table that can be used with the translate() method to replace specific characters. So, let's take a look at how to utilize those two methods to fit our needs.

First of all, we need to create a translation table that maps semicolons (;) and colons (:) to commas (,):

text = "Python;is,a powerful:language"

# Create a translation table mapping ';' and ':' to ','
table = text.maketrans(";:", ",,")

Then we use the translate() method to apply this table to our text. This replaces all semicolons and colons with commas:

# Apply the translation table
text = text.translate(table)

Finally, we can use str.split(',') to split the text into words and print extracted words:

# Now we can split on the comma
words = text.split(',')
print(words)

This will result in:

['Python', 'is', 'a powerful', 'language']

Note: This approach is particularly useful when you want to standardize the delimiters in a string before splitting it.

In the next section, we'll explore how to utilize an external library, Pandas, for splitting strings on multiple delimiters.

Leveraging the Pandas Library

Pandas, a powerful data manipulation library in Python, can also be used for splitting strings on multiple delimiters. Its str.split() function is capable of handling regex, making it another effective tool for this task.

While the built-in string methods are efficient for smaller data, when you're working with large datasets (like a DataFrame), using Pandas for string splitting can be a better choice. The syntax is also quite intuitive.

Here's how you can use Pandas to split a string on multiple delimiters:

import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'Text': ['Python;is,a powerful:language']})

# Use the str.split() function with a regex pattern
df = df['Text'].str.split(';|,|:', expand=True)
print(df)

This will give us:

        0   1  2         3         4
0  Python  is  a  powerful  language

We first created a DataFrame with our text. We then used the str.split() function, passing in a regex pattern similar to what we used with re.split(). The expand=True argument makes the function return a DataFrame where each split string is a separate column.

Note: Although this method returns a DataFrame instead of a list, it can be highly useful when you're already working within the Pandas ecosystem.

Performance Comparison

When choosing a method to split strings on multiple delimiters, performance can be an important factor, especially when working with large datasets. Let's examine the performance of the methods we've discussed.

The built-in str.split() method is quite efficient for smaller datasets and a single delimiter, but it needs extra pre-processing to handle multiple delimiters, which hurts its performance on large datasets.

The re.split() method is versatile and relatively efficient, as it can handle multiple delimiters well. However, its performance might also degrade when dealing with huge amounts of data, because regular expressions can be computationally intensive.

Using translate() and maketrans() can be an efficient way to handle multiple delimiters, especially when you want to standardize the delimiters before splitting. However, it involves an extra step, which can affect performance with large datasets.

Finally, while the Pandas library offers a very efficient and flexible method to split strings on multiple delimiters, it might be overkill for simple, small tasks. The overhead of creating a DataFrame can affect performance when working with smaller data, but it excels in handling large datasets.

In conclusion, the best method to use depends on your specific use case. For small datasets and tasks, Python's built-in methods might be more suitable, while for larger, more complex data manipulation tasks, Pandas could be the way to go.
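If you want to sanity-check these claims on your own data, the stdlib timeit module makes a quick (if rough) comparison easy. The snippet below times two of the approaches; the sample text and iteration count are arbitrary, and absolute numbers will vary by machine:

```python
import re
import timeit

text = "Python;is,a powerful:language " * 100

def with_re():
    # Split on runs of semicolons, commas, colons, or spaces.
    return re.split('[;,: ]+', text)

def with_translate():
    # Standardize ';' and ':' to ',' first, then split once.
    return text.translate(str.maketrans(";:", ",,")).split(',')

for fn in (with_re, with_translate):
    elapsed = timeit.timeit(fn, number=1000)
    print(f"{fn.__name__}: {elapsed:.4f}s")
```

Note the two functions are not exactly equivalent (the translate version does not split on spaces), so compare like with like when benchmarking your real delimiter set.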


String splitting, especially on multiple delimiters, is a common yet crucial operation in Python. It serves as the backbone in many text processing, data cleaning, and parsing tasks. As we've seen, Python provides a range of techniques for this task, each with its own strengths and weaknesses. From the built-in str.split(), to the versatile Regular Expressions, the character mapping translate() and maketrans() methods, and even the external Pandas library, Python offers solutions suitable for any complexity and size of data.

It's important to understand the different methods available and choose the one that best suits your specific requirements. Whether it's simplicity, versatility, or performance, Python's tools for string splitting can cater to various needs.

We hope this article helps you become more proficient in handling and manipulating strings in Python.

Categories: FLOSS Project Planets

The Three of Wands: Intro to cattrs 23.1.0

Tue, 2023-05-30 11:49

Instead of my usual Twitter and Fediverse threads, for this release of cattrs I figured I'd try something different. A blog post lets me describe the additions in more detail, provide context and usage examples, and produces a permanent record that can be linked to from the relevant places, like a GitHub release page and the cattrs changelog.

cattrs is a library for transforming Python data structures, the most obvious use case being de/serialization (to JSON, msgpack, YAML, and other formats).

Tagged Unions

cattrs has supported unions of attrs classes for a long time through our default automatic disambiguation strategy. This is a very simple way of supporting unions using no extra configuration. The way it works is: we examine every class in the union, find unique, mandatory attribute names for each class, and generate a function using that information to do the actual structuring. (Other unions are supported via manually-written hooks.)
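That default strategy can be sketched in plain Python; the code below is an illustration of the idea only, not cattrs's actual implementation:

```python
def make_disambiguator(classes):
    """Build a structuring helper: map a unique field name to each class."""
    fields = {cls: set(cls.__annotations__) for cls in classes}
    table = {}
    for cls, names in fields.items():
        # Names that appear in every *other* class's field set.
        other_names = set().union(*(f for c, f in fields.items() if c is not cls))
        unique = sorted(names - other_names)
        if not unique:
            raise ValueError(f"no unique attribute for {cls.__name__}")
        table[unique[0]] = cls
    # Pick the class whose unique field is present in the payload.
    return lambda payload: next(cls for name, cls in table.items() if name in payload)

class A:
    a: int

class B:
    b: str

pick = make_disambiguator([A, B])
print(pick({"a": 1}).__name__)  # → A
```

cattrs does considerably more (mandatory vs. optional fields, code generation), but the core trick is the same: find a field that only one member of the union has.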

But what if one of your classes has no unique attributes, or you just want to be able to tell the union member from a glance at the payload? Now you can use the tagged unions strategy.

This strategy adds a field to the unstructured payload (defaulting to _type, but configurable) containing a piece of data (by default the name of the class, again configurable) to help with structuring.

This strategy isn't the default, so you'll have to import it and configure it on a union-by-union basis.

from attrs import define
from cattrs import Converter
from cattrs.strategies import configure_tagged_union

@define
class A:
    a: int

@define
class B:
    a: str

c = Converter()
configure_tagged_union(A | B, c)

c.unstructure(A(1), unstructure_as=A | B)
# {"a": 1, "_type": "A"}

c.structure({"a": 1, "_type": "A"}, A | B)
# A(1)

A useful feature of configure_tagged_union is that you can give it a default member class. This is a good way of evolving an API from a single class to a union in a backwards-compatible way.

from attrs import define

@define
class Request:
    @define
    class A:
        field: int

    payload: A

c = Converter()
c.structure({"payload": {"field": 1}}, Request)  # Request(A(1))

# Next iteration:

@define
class Request:
    @define
    class A:
        field: int

    @define
    class B:
        field: int

    payload: A | B

c = Converter()
configure_tagged_union(A | B, c, default=A)

# No type info means `A`
c.structure({"payload": {"field": 1}}, Request)  # Still Request(A(1))

Improved Validation Errors

cattrs has had a detailed validation mode for a few versions now, and it's enabled by default. In this mode, structuring errors are gathered and propagated out as an ExceptionGroup subclass, essentially creating a tree of errors mirroring the desired data structure. This ExceptionGroup can then be printed out using normal Python tooling for printing exceptions.

Still, sometimes you need a more succinct representation of your errors; for example if you need to display it to a user or return it to a web frontend. So now we have a simple transformer function available:

from attrs import define
from cattrs import structure, transform_error

@define
class Class:
    a_list: list[int]
    a_dict: dict[str, int]

try:
    structure({"a_list": ["a"], "a_dict": {"str": "a"}}, Class)
except Exception as exc:
    print(transform_error(exc))

[
    'invalid value for type, expected int @ $.a_list[0]',
    "invalid value for type, expected int @ $.a_dict['str']"
]

As you see, we generate a list of readable(-ish) error messages, including a path to every field. This can be customized, or you can copy/paste the transform_error function and just alter it directly if you require absolute control. Learn more here.

Typed Dicts

cattrs now supports TypedDicts on all supported Python versions. Due to spotty TypedDict functionality in earlier Pythons, I recommend you use TypedDict from typing_extensions when running on 3.9 or earlier. This is the reason cattrs now depends on typing_extensions on those versions.

from typing import TypedDict
from datetime import datetime
from cattrs.preconf.json import make_converter

converter = make_converter()

class MyDict(TypedDict):
    my_datetime: datetime

converter.structure({"my_datetime": "2023-05-01T00:00:00Z"}, MyDict)
# {'my_datetime': datetime.datetime(2023, 5, 1, 0, 0, tzinfo=datetime.timezone.utc)}

Generic TypedDicts are supported on 3.11+ (a language limitation), while totality, Required, and NotRequired are supported regardless.

The TypedDict implementation leverages the existing attrs/dataclasses base, so it inherits most of the features. For example, structuring and unstructuring hooks can be customized to rename or omit keys. Here's an example with the forbid_extra_keys functionality:

from typing import TypedDict
from cattrs import Converter
from cattrs.gen.typeddicts import make_dict_structure_fn

class MyTypedDict(TypedDict):
    a: int

c = Converter()
c.register_structure_hook(
    MyTypedDict,
    make_dict_structure_fn(MyTypedDict, c, _cattrs_forbid_extra_keys=True)
)

c.structure({"a": 1, "b": 2}, MyTypedDict)  # Raises an exception

New Markdown Docs

The docs have been rewritten using Markdown and MyST! We can finally link like civilized people and not animals, so I'll be going through the docs and making them more interconnected. The theme has also been tweaked to be more airy and (in my opinion) better looking. The new docs are live now.


There are many more smaller changes in this release; I suggest inspecting the actual changelog. A quick shout out to the include_subclasses strategy by Matthieu Melot!

What's Next

So I don't actually know exactly what'll end up in the next version of cattrs, since I don't work from a strict roadmap and I can't predict what folks will contribute.

What I think will probably happen is the creation of some sort of OpenAPI/jsonschema and cattrs wrapper library. It's something folks have expressed interest in, and I already have the bones of it in the uapi project.

I'll also continue work on fleshing out the cattrs.v validation subsystem. This will probably go hand-in-hand with efforts in attrs and Mypy to make operations on class attributes type-safe.

I'll also almost certainly expand both of our union strategies to handle enums and literals automatically, enabling true sum type support by default.

Categories: FLOSS Project Planets

Real Python: Getting the First Match From a Python List or Iterable

Tue, 2023-05-30 10:00

At some point in your Python journey, you may need to find the first item that matches a certain criterion in a Python iterable, such as a list or dictionary.

The simplest case is that you need to confirm that a particular item exists in the iterable. For example, you want to find a name in a list of names or a substring inside a string. In these cases, you’re best off using the in operator. However, there are many use cases when you may want to look for items with specific properties. For instance, you may need to:

  • Find a non-zero value in a list of numbers
  • Find a name of a particular length in a list of strings
  • Find and modify a dictionary in a list of dictionaries based on a certain attribute

In this video course, you’ll explore how best to approach all three scenarios.
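A common building block for all three scenarios, shown here as our own illustrative sketch, is next() with a generator expression and a default value:

```python
numbers = [0, 0, 3, 7]
names = ["al", "beth", "carolyn"]

# First non-zero value (None if there is none)
first_nonzero = next((n for n in numbers if n != 0), None)
print(first_nonzero)  # → 3

# First name of a particular length
print(next((n for n in names if len(n) == 4), None))  # → beth
```

The generator expression stops at the first match, so the rest of the iterable is never scanned; the second argument to next() avoids a StopIteration when nothing matches.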

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

Stack Abuse: How to Remove Whitespaces from a String in Python

Tue, 2023-05-30 08:23

In programming, data often doesn't come in a neat, ready-to-use format. This is particularly true when we deal with strings, which often need to be cleaned, formatted, or manipulated in some way before they can be used effectively. One common issue we encounter is the presence of unwanted whitespaces - extra spaces, tabs, or newlines that can interfere with the way the string is processed or displayed.

Whitespace is like the "empty space" in our data. It might not seem like much, but in programming and data analysis, it can often lead to errors, incorrect results, or simply make data harder to read and analyze. That's why it's important to understand how to manage and control whitespaces in our strings.

In this article, we will explore different techniques to remove whitespaces from strings in Python, including using built-in string methods like strip(), replace(), and join(), and also delve into more advanced concepts like regular expressions and list comprehensions.

Whitespaces in Python

In Python, and most other programming languages, whitespace refers to characters that are used for spacing and do not contain any printable glyphs. They include spaces ( ), tabs (\t), newlines (\n), and others. In Python strings, these characters can exist at the beginning, end, or anywhere within the string.

Consider the following example:

str_with_whitespace = ' Hello, World! '

The string ' Hello, World! ' has leading and trailing whitespaces.

Although they might seem innocuous, these whitespaces can cause a variety of issues. For instance, they might interfere with string comparison operations or cause problems when trying to format your output neatly:

print(' Hello, World! ' == 'Hello, World!') # Output: False

Therefore, understanding how to remove or manipulate these whitespaces is a crucial aspect of handling strings in Python. In the following sections, we'll explore various methods and techniques to effectively remove these whitespaces from our Python strings.

Ways to Remove Whitespaces from Strings in Python

Python provides various ways to remove whitespaces from strings. We'll explore several methods, each with its own advantages and use cases.

Using strip(), rstrip(), and lstrip() Methods

Python string method strip() removes leading and trailing whitespaces. If you only want to remove spaces from the left or right side, you can use lstrip() and rstrip(), respectively:

str_with_whitespace = ' Hello, World! '

# Using strip() method
print(str_with_whitespace.strip())   # Output: 'Hello, World!'

# Using lstrip() method
print(str_with_whitespace.lstrip())  # Output: 'Hello, World! '

# Using rstrip() method
print(str_with_whitespace.rstrip())  # Output: ' Hello, World!'

Note: These methods do not remove whitespace that is in the middle of the string.

Advice: For a more comprehensive overview of strip(), rstrip(), and lstrip() in Python, read our "Guide to Python's strip() Method".

Using replace() Method

The replace() method can be used to replace all occurrences of a substring. This can be used to remove all whitespaces in a string by replacing them with nothing.

str_with_whitespace = ' Hello, World! '

# Using replace() method
print(str_with_whitespace.replace(' ', ''))  # Output: 'Hello,World!'

This method removes all spaces, even those within the string.

Advice: If you need a deeper understanding of the replace() method in Python, read our article "Replace Occurrences of a Substring in String with Python".

Using Regular Expressions

For more complex whitespace removal, we can use regular expressions via the re module in Python. This can be used to replace multiple consecutive whitespaces with a single space:

import re

str_with_whitespace = ' Hello, World! '

# Using re.sub() method
print(re.sub(r'\s+', ' ', str_with_whitespace).strip())  # Output: 'Hello, World!'

Here, re.sub(r'\s+', ' ', str_with_whitespace).strip() replaces each run of consecutive whitespace with a single space and then removes leading and trailing whitespace.

Advice: Read more about regular expressions in Python in "Introduction to Regular Expressions in Python".

Using List Comprehension

List comprehensions provide a concise way to create lists based on existing lists. They can be used to remove all whitespaces in a string:

str_with_whitespace = ' Hello, World! '

# Using list comprehension
print(''.join(char for char in str_with_whitespace if not char.isspace()))  # Output: 'Hello,World!'

In this example, we created a new string by joining all non-whitespace characters.

Advice: You can dive deeper into the concept of list comprehension in Python by reading our guide "List Comprehension in Python".

Using join() and split() Methods

The join() and split() methods can be used together to normalize whitespace in a string: split() breaks the string into words, and join() puts them back together with a single space between each:

str_with_whitespace = ' Hello, World! '

# Using join() and split() methods
print(' '.join(str_with_whitespace.split()))  # Output: 'Hello, World!'

In this example, ' '.join(str_with_whitespace.split()) splits str_with_whitespace into words and joins them together with a single space.


Throughout this article, we've explored a variety of methods to remove whitespaces from strings in Python, including using built-in string methods like strip(), replace(), and join(), as well as more advanced techniques involving regular expressions and list comprehensions.

While each method has its own advantages, the best one to use depends on your specific needs and the nature of the data you're working with. The key is to understand these different methods and know how to use them when needed.

To enhance your understanding and practice these methods, we encourage you to experiment with them. Try creating your own strings with various types of whitespaces and see how effectively you can remove them using the techniques discussed.

Categories: FLOSS Project Planets

Talk Python to Me: #417: Test-Driven Prompt Engineering for LLMs with Promptimize

Tue, 2023-05-30 04:00
Large language models and chat-based AIs are kind of mind blowing at the moment. Many of us are playing with them for working on code or just as a fun alternative to search. But others of us are building applications with AI at the core. And when doing that, the slightly unpredictable nature and probabilistic nature of LLMs make writing and testing Python code very tricky. Enter promptimize from Maxime Beauchemin and Preset. It's a framework for non-deterministic testing of LLMs inside our applications. Let's dive inside the AIs with Max.<br/> <br/> <strong>Links from the show</strong><br/> <br/> <div><b>Max on Twitter</b>: <a href="" target="_blank" rel="noopener">@mistercrunch</a><br/> <b>Promptimize</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>Introducing Promptimize ("the blog post")</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>Preset</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>Apache Superset: Modern Data Exploration Platform episode</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>ChatGPT</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>LeMUR</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>Microsoft Security Copilot</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>AutoGPT</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>Midjourney</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>Midjourney generated pytest tips thumbnail</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>Midjourney generated radio astronomy thumbnail</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>Prompt engineering</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>Michael's ChatGPT result for scraping Talk Python episodes</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>Apache Airflow</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>Apache Superset</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>Tay AI Goes Bad</b>: 
<a href="" target="_blank" rel="noopener"></a><br/> <b>LangChain</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>LangChain Cookbook</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>Promptimize Python Examples</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>TLDR AI</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>AI Tool List</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>Watch this episode on YouTube</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>Episode transcripts</b>: <a href="" target="_blank" rel="noopener"></a><br/> <br/> <b>--- Stay in touch with us ---</b><br/> <b>Subscribe to us on YouTube</b>: <a href="" target="_blank" rel="noopener"></a><br/> <b>Follow Talk Python on Mastodon</b>: <a href="" target="_blank" rel="noopener"><i class="fa-brands fa-mastodon"></i>talkpython</a><br/> <b>Follow Michael on Mastodon</b>: <a href="" target="_blank" rel="noopener"><i class="fa-brands fa-mastodon"></i>mkennedy</a><br/></div><br/> <strong>Sponsors</strong><br/> <a href=''>PyCharm</a><br> <a href=''>RedHat</a><br> <a href=''>Talk Python Training</a>
Categories: FLOSS Project Planets

Python Bytes: #338 Scripting iOS with Python

Tue, 2023-05-30 04:00
<a href='' style='font-weight: bold;'>Watch on YouTube</a><br> <br> <p><strong>About the show</strong></p> <p>Sponsored by us! Support our work through:</p> <ul> <li>Our <a href=""><strong>courses at Talk Python Training</strong></a></li> <li><a href=""><strong>Test &amp; Code</strong></a> Podcast</li> <li><a href=""><strong>Patreon Supporters</strong></a></li> </ul> <p><strong>Connect with the hosts</strong></p> <ul> <li>Michael: <a href=""><strong></strong></a></li> <li>Brian: <a href=""><strong></strong></a></li> <li>Show: <a href=""><strong></strong></a></li> <li>Special guest: GUEST_PROFILE</li> </ul> <p>Join us on YouTube at <a href=""><strong></strong></a> to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too.</p> <p><strong>Brian #1:</strong> <a href="">The Basics of Python Packaging in Early 2023</a> </p> <ul> <li>Jay Qi</li> <li>Good description of a minimal-ish <code>pyproject.toml</code> file, which includes a build backend and project metadata.</li> <li>That’s all you need for a Python-only project.</li> <li>Discussion of how to choose a build backend. Mostly it’s based on extra features you might want, like hatchling’s include/exclude features for source distributions.</li> <li>Some discussion of frontend choices.</li> <li>Nice discussion of non-Python-only builds. Specifically, if you need to compile C or C++ extensions, you can use <code>scikit-build-core</code>, or <code>meson-python</code>, or <code>setuptools</code>. 
</li> <li>Related: <a href="">"Sharing is Caring - Sharing pytest Fixtures" by Brian Okken (PyCascades 2023)</a> <ul> <li>My PyCascades 2023 on packaging pytest plugins is up on YouTube</li> </ul></li> </ul> <p><strong>Michael #2:</strong> <a href=""><strong>vecs</strong></a></p> <ul> <li>via Oli</li> <li>Python collection-like interface to storing and searching vectors in postgres.</li> <li>Vector search is a key component in building AI chatbots, and semantic document search.</li> <li>If you're familiar with the space, it's effectively Pinecone built on free OSS</li> <li>It's under the Supabase github org but it's fully open source, and compatible with any pgvector vendor, e.g. RDS, or locally in docker</li> <li>If you’re on macOS and need Postgres, <a href="">Postgres App</a> is a good option.</li> </ul> <p><strong>Brian #3:</strong> <a href=""><strong>Introducing Grasshopper - An Open Source Python Library for Load Testing</strong></a></p> <ul> <li>Jacob Fiola</li> <li>“Grasshopper is a library for automated load testing, written in Python.”</li> <li>Open source project from Alteryx, </li> <li>On GitHub and PyPI under the name <a href="">locust-grasshopper</a></li> <li>Built on Locust.</li> <li>Adds <ul> <li>Tag-based suites for trend analysis and evaluating changes.</li> <li>Custom trends. Useful for actions that span multiple http calls, and you want to see timing trends for the whole action.</li> <li>Checks. Checks validate boolean conditions in the test.</li> <li>Custom tagging for all metrics</li> <li>Send data to time series db &amp; dashboards.</li> <li>Thresholds. 
</li> <li>Reporting results to other locations.</li> <li>Some reusable base classes that take care of the majority of the boilerplate that tests often contain</li> </ul></li> <li>Readme has a very thorough introduction including configuration and samples.</li> </ul> <p><strong>Michael #4:</strong> <a href=""><strong>memocast</strong></a></p> <ul> <li>by Daniel Engvall</li> <li>A small <a href="">iOS</a> app for e.g. iPhone that allows you to add links heard in podcasts into <a href="">reminders</a>.</li> <li>Good example of how to use Pythonista to build Python scripts for iOS</li> <li>Pythonista just made an update (2 weeks ago) so it uses Python 3.10 on iOS, which makes it even more interesting.</li> </ul> <p><strong>Extras</strong> </p> <p>Brian:</p> <ul> <li><a href="">Help test Python 3.12 beta!</a></li> <li><a href="">Python Language Summit write-ups available</a></li> <li><p><a href="">PyPI was subpoenaed</a> <a href=""></a> Michael:</p></li> <li><p><a href=""><strong>You Can Ignore This Post</strong></a></p></li> </ul> <p><strong>Joke:</strong> <a href=""><strong>Careful or you might end up summoning a demon</strong></a><a href="">.</a></p>
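For reference, the minimal-ish pyproject.toml described in Brian's first topic boils down to a build backend plus project metadata. A sketch (package name and metadata are placeholders; hatchling is just one of the interchangeable backends the article discusses):

```toml
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "example-package"  # placeholder name
version = "0.1.0"
description = "A Python-only project needs little more than this."
requires-python = ">=3.8"
```

For Python-only projects that is essentially the whole file; compiled C/C++ extensions are where scikit-build-core, meson-python or setuptools come in.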
Categories: FLOSS Project Planets

Tryton News: News from the Tryton Unconference 2023 in Berlin

Tue, 2023-05-30 02:00

The Tryton Unconference 2023 in Berlin has come to an end. We had a great time and shared lots of information, work and fun. We would like to thank all the wonderful people at the conference, the organisers, the sponsors and the Tryton Foundation for making this event happen after a break of several years.

First Day

In case you weren’t able to participate in the Unconference, or you want to look up a talk, you can watch the videos from the first day. Many thanks to @tbruyere who made this possible, and has been doing so for many years. Also, the slides from the talks will soon be available at Tryton - Presentations & Papers.

We also have some photos taken during a few of the talks:

Following Days

On the second and third day we held a hackathon to fix bugs and discuss features. In an open space, the participants discussed community and project related topics in parallel in smaller groups.

The results were collected in pads and have been copied and spellchecked below in this newsletter.

Please note all the collected results are only individual opinions or the consensus of the discussion group which collected them. All collected topics and results are draft documents for working on later.

Slot: Marketing Tryton

Moderation: Stefan, Writing: Udo
Follow-up discussion:

What is Marketing?

  • Tryton is a brand
  • Tryton is a company

What is part of the Tryton Brand?

  • Offering a product
  • Has a community
  • Has a design for being recognized


  • Who is the target user/customer group for the Tryton brand?
  • Re-User: Implementer/Integrator/Customizer
  • End-user
  • Product owner
  • Decision maker
  • Consultants
  • Developer

Which media addresses which target user group?

  • Tryton news
  • Social media
  • Homepage
  • Public advertising
  • Discuss forum

How to start with Tryton?

  • Get Tryton page: hidden two clicks away from the start page
  • Check if Tryton is good with ISO25010

Start with the Homepage

  • Structuring, design and content
  • Function description for each icon:
    – E.g. Sales Button, click, video showing sale process.
    – Screenshots of more complex customisations.
  • Videos, screenshots and keywords
  • press-collection-zip with pictures and general information

Find a common denominator which attracts all stakeholders:

  • At the first visit, everyone likes to quickly get an idea of the look & feel and functionality of Tryton.
  • not addressing documentation, but giving a feeling about the following questions:
    – Is Tryton a real application used in companies?
    – Could Tryton be a solution for me and my company?
    – How does it look and feel to use Tryton?
    – Will Tryton fit my expectations?

Also a feature list with some screenshots showing features.
A CSV/XLS file including features as a list to make them comparable to other systems.
The feature list could answer the questions:

  • How much does it fit my needs/requirements/expectations?
  • What is already there and what is missing?
  • How it compares to other software?

Then some screenshots each with a paragraph explaining a real specific customization to show Tryton’s customizability.
The screenshots and explanations should give a feeling that it is possible to customize Tryton, even if the basis is very simple.

Slot: Association Module

Merge Request: association module (!217) · Merge requests · Tryton / Tryton · GitLab

Member Workflow:

  • Draft
  • Rejected
  • Running to → rename to „Member“? - ask @dave
  • Quitted
  • Expelled

other factors?

  • paid / didn’t pay

  • Donations (Report)

  • book on the matching account and create an appropriate report

  • Dominique’s Key Points:
    – Have a members module providing membership-workflow and fees, everything else can be crowd-funded by requesters.
    – Record voluntary work hours and report to some institution to get money
    – Record members’ work hours which they are obligated to provide

Slot: Contrib Modules

Ongoing discussion:


  • Make it attractive to Collaborate
  • Avoid duplicate work
  • Trustworthy source
  • Make modules more accessible and visible
  • Promoting Tryton as a whole
  • Place / Space where a Community can grow


  • Central point with services
  • Issue tracking and Merge requests
  • Some Translation will happen
  • CI / testing → Labels on projects
  • packaged (PyPI, release, whatever)
  • not part of the Docker image
  • provide an easy way to build a Docker image containing custom selected contrib-modules
  • Repo/project template
  • No Mono-Repo
  • No requirement for having one module per repo

Ideas for Rules:

  • to be implemented within reasonable time
  • must support TLS version
  • must have a test suite
  • must have our CI steps
  • Handed on to the community
  • sibling project of Tryton core
  • Foundation pays the costs

Core Modules:

  • Generic, on general use
  • Minimum requirements
  • GPL-compatible license
  • Test Suite
  • Integration with each other
  • Pass Cedric’s review

Slot: User Interface (UI) Design

Follow-up discussion:

The main focus today is the usage of Tryton on small mobile devices like smartphones.


As there are a desktop client (GTK) and a web client (Sao), the web client is the ideal technology for us, because it is responsive to different screen sizes and runs in a standard browser usable on any device. Producing native apps is complex and expensive.


For some use cases, a developer needs custom layouts (e.g. calendar) or special extensions (point of sale, stock picking, scan supplier invoices) to integrate in the web client. This can already be done by

  • injecting custom code in the Web-UI
  • this code can extend the existing framework like an additional module (as we know with Tryton modules)
  • using vue.js (JavaScript)
  • using custom CSS (so one can adapt the layout even closer to the needs of the project)


Currently, resizing of columns is missing. As it is a pain for the user, we should add this function.

  • it can be done by adding a corresponding extension (using JavaScript and CSS)
  • Kalenis also does that, but they use “react”.

Follow up discussion: Resize List View Column with Handle


On small devices, it is only possible to show one column. Perhaps with more information underneath the main content (e.g. the city underneath the name of the party in the party list). See Mobility for tablets and smartphones - #5 by s.vogel :

  • we’ll integrate fields to define the “weight” of a data field (in other words semantic info, e.g. a numbering string)
  • so we can define which is the most important column to show in the smartphone list, and which columns/data come underneath.
  • the CSS can then take this weight into account and apply a good text format.

Follow up discussion: Mobile List View


Here the view is already responsive:

  • we decided to go on step by step, optimizing ugly situations, until a generic solution can be found.
  • some of us would like an easy way to modify a layout, perhaps directly on the layout by turning on a layout-edit-mode and then adding, deleting and repositioning things. We understand that this is complex to achieve, because functionality could be broken. There are already various ways of doing it in a proper way (a custom module or the views-administration part of Tryton)

Follow up discussion: Optimize Form View

Slot: Project Community Organization


  • Have a place where the community can grow
  • Give room to people so they can work
  • Governments to contribute modules to core or outside the code
  • Responsibility for decisions. What decisions?
  • Not only focus on code quality. There are other things to take into account.


  • There were fights in the past.
  • The foundation works for its own
  • The community doesn’t see what the foundation does. The foundation doesn’t
  • Where is the control of the foundation (> quality management?)
  • Foundation was created to protect the community and the eco-system. The foundation was not created to run the community
  • There are statutes and there it is written that the foundation is responsible for the well being of the community
  • The idea was to give the community some kind of visible head
  • Until a short time ago, a lot of the directors didn’t do anything. There was nothing to tell. So it looks like the foundation is not transparent.
  • One has to write down what has to be done.
  • Wolf shall really start with introducing meetings.
  • the newsletter team is interested in news about events, meetings, projects and so on.
  • Wolf has a lot of interesting questions that others don’t ask. It’s important that somebody does it.
  • Some find it attractive that the community is not too organized.
  • The Unconference should have had a bit of organization. Perhaps one or two checklists would have been great.
  • The project was originally set up by Cedric, who then invited other people. Now the foundation has been introduced to take over.
  • Is there an ethical card (>see discuss rules)
  • Role of Cedric. Gatekeeper for technical part. Focus on technical quality. Help on technical documentation. Only express his own opinion. He is also accepting other ideas, although he doesn’t agree. He is not stubborn.
  • Better looking for an understanding of why Cedric proposes his opinion. He pushes one to work harder and find better contributions.
  • Contributors are unfortunately not all of the same quality. If the requirements are too high, it is difficult for new contributors to participate.
  • Also accept lower quality commits? At the moment, such a repository does not exist yet.
  • At the moment too many things have to be discussed, which slows down working speed and is sometimes annoying.
  • Cedric is very fast in answering, which makes it impossible for others to participate. On the other hand, work goes on fast.
  • On Discuss, the answer should take into account the kind of question and the knowledge of the one who asked it.
  • Give more room to the discussions. Not close the discussions too early.
  • One is frustrated by commits which do not solve anything.
  • It would be good to have specialists for certain topics, some kind of coaches or experts. It would be good to have more opinions than just a technical view.
  • If one asks process questions in discussions, one might get a technical answer. On Discuss, one doesn’t know what a question is for (process, new feature, evolution, bug). The problem is that questions there can’t be assigned to existing projects.


  • a director is elected for 5 years and then looks for a successor
  • supporters can kick out a director
  • one can become a supporter just by writing an email to the foundation
  • there are no regularities
  • the foundation is not the leader
  • the foundation is only helping


We have discussed the creation of several community groups to work as a team on the different tasks of the foundation. Here are some ideas and people who are already volunteering.

  • Meetings, Events (Wolf)
  • Welcome newcomer
  • Technical News
  • Unconference organization (Check Lists)
  • Newsletter (Udo, Sergi, Dave)
  • Commits organization ()
  • Thanking for the contribution
  • Checking the contribution and question for approving.
  • Helping in improving
  • free up the workload of the core team
  • Commits authorization / Gatekeeper (Cedric)
  • User Interface ()

Feel free to raise your hand if you are interested in joining some groups.


  • Even better addressing decision makers.


  • There is room for others as can be seen when looking at the people working actively.
  • Shall get more diverse.
  • Shall better search the points to get together instead of looking for the negative points.
  • less fighting


  • Is important. It is correct to focus on it.
  • Modularity is important and it is also correct to focus on it


  • Cedric will step back a bit, except when another answer is wrong or important things are missing.

Slot: Project Maintenance

The how? what? who? when? and where? about maintaining Tryton.
Problem: The codebase of the Tryton project has grown larger and larger over the years. The user base is also growing, and with it the questions, discussions and demand for help. With the maintenance of Tryton we are reaching limits. The idea is to include more people in maintenance tasks.

Brainstorming ideas how to organize maintenance:

  • In code reviews, use suggestions instead of comments when you have an idea for a better solution.
  • Triage, qualify and preview MR and issues.
  • Collect isolated maintenance tasks, responsibilities and jobs which can be done by others.
  • Describe and document maintenance tasks and jobs.
  • Propose, debate and collect rules
  • Vote for rule sets and changes
  • lower the entry barrier for a person to take on responsibility in project maintenance
  • Documentation about tasks, jobs and responsible persons is important for everyone interested in Tryton
  • Use the wiki functionality of Discourse, try to allow ordinary members to make their post a wiki.
  • Tryton application documentation: Use for each module/version a discuss page to collect user feedback.
    Follow-up discussion: Link modules documentation to discource and vice versa - #2 by 2cadz
Final

At the end of the Tryton Unconference 2023 in Berlin, last but not least, the remaining participants did an uncommented final feedback round, sharing their impressions.

1 post - 1 participant

Read full topic

Categories: FLOSS Project Planets

Python Software Foundation: The Python Language Summit 2023: Burnout is Real

Mon, 2023-05-29 14:40
Guido van Rossum, creator of the Python programming language, spoke at the 2023 Python Language Summit on the subject of open-source burnout, strategies for tackling it, and how to avoid it.

The first known case of burnout in the field of open-source software, van Rossum speculated, may have been Charles Babbage, who gave up the post of Lucasian Professor of Mathematics (the “Chair of Newton”) at Cambridge University in 1839.

“In 1839 the demands of the Analytical Engine upon my attention had become so incessant and so exhausting, that even the few duties of the Lucasian Chair had a sensible effect in impairing my bodily strength. I therefore sent in my resignation.” 

-- Charles Babbage, “Passages from the Life of a Philosopher” (Chapter 4)

Van Rossum described how the Python community had been hit multiple times by core developers, suffering from burnout, suddenly disappearing for extended periods of time. Van Rossum told the story of one core developer, previously one of the most prolific contributors to CPython, who had abruptly ceased contributing around a decade ago. He had hardly been heard from since.

Van Rossum himself, he recounted, had felt so burned out by the acrimonious debate around PEP 572 (proposing the “walrus operator”, :=), that it led to him stepping down from his post as Benevolent Dictator For Life (“BDFL”) of Python in 2018. Decisions about the language were ceded to a democratically elected Steering Council, which has governed Python ever since.

Burnout, van Rossum noted, was often connected to conflict – and it often didn’t matter whether or not you ended the conflict victorious. Merely having the conflict at all could be exhausting.

Van Rossum’s talk itself was fairly short, but was followed by a lengthy discussion among the assembled core developers on the experiences they’d had with burnout, and strategies that could be employed to tackle it.

Several attendees in the room commented that learning to recognise burnout in yourself was an important skill. Some participants in the discussion described times when they had suddenly realised that things that had previously been enjoyable had morphed into a source of stress. One core developer told the story of a conference they had organised, at which they had felt such extreme stress that they were unable to think of any of the things that had gone well. Instead, they found themselves fixated on all of the minor things that had gone wrong.

Learning to recognise burnout in others was perhaps an even harder problem to solve. Van Rossum noted that the developers most susceptible to burnout were generally those who were most active and engaged with the community. But how can you distinguish between somebody devoting their time to CPython because of the intense enjoyment they found in contributing to the project, somebody who might have formed an unhealthy addiction to open source, and somebody who was only continuing to contribute out of a misplaced sense of obligation?

Some developers spoke of strategies they used to decompress. Brett Cannon described how he periodically takes “open source breaks”, in which he forces himself to spend a period of time without looking at his GitHub notifications or thinking about his open-source commitments. Mariatta Wijaya spoke about how she found mentoring other Python programmers to be deeply healing. All agreed that it was crucial to talk to friends and relatives when you were feeling close to burnout.

It was agreed that the Python community needed to do better in many ways. We needed to become better, as a community, at understanding when other people said that they were unable to complete something they had previously committed to. And perhaps we needed to normalise questions such as, “Hey, you’ve been super productive and responsive for too long. When do you think you’ll burn out?”

Russell Keith-Magee remarked that systems with single points of failure were bound to lead to situations of intense stress. These systems would inevitably, at some point, fail. The transition from a single BDFL (with an indefinite term) to a five-member Steering Council with one-year terms had been a very positive change in that regard, Keith-Magee said. But there were still places in CPython development where there were single points of failure. Examples included various esoteric platforms where support from CPython depended on a single core developer being able to give up their time to review PRs relating to the platform.

Carol Willing agreed with Keith-Magee, pointing out that no matter who you were, you were rarely the only person who could do a certain task. You might be the person who could do it fastest or best – but sometimes, it was important to “see the people, and share the joy”.

Łukasz Langa spoke about his role as part of the current Code of Conduct Working Group, to which any violations of the Python Code of Conduct can be reported. Langa remarked that being part of the Working Group had brought to the fore parts of the community which he had previously been mostly unaware of. He encouraged everybody in the room to report toxic members of the community who were discouraging or aggressive online.

Speaking personally, for a moment: I tried to take an open-source break earlier this year, when I felt myself close to burning out on some of my open-source commitments. I can’t say it was fully successful – my addiction to GitHub was too great for me to resist glancing at my notifications occasionally. But it was helpful, and re-energising, to spend some time doing a creative activity that bore with it a 0% risk of people shouting at me on the internet about it.

Categories: FLOSS Project Planets

Python Software Foundation: The Python Language Summit 2023

Mon, 2023-05-29 13:28
Every year, just before the start of PyCon US, core developers, triagers, and special guests gather for the Python Language Summit: an all-day event of talks where the future direction of Python is discussed. The Language Summit 2023 included three back-to-back talks on the C API, an update on work towards making the Global Interpreter Lock optional, and a discussion on how to tackle burnout in the community.
This year's summit received around 45 attendees, and the summit was covered by Alex Waygood.

Attendees of the Python Language Summit

Categories: FLOSS Project Planets

Python Software Foundation: The Python Language Summit 2023: Pattern Matching, __match__, and View Patterns

Mon, 2023-05-29 13:22
One of the most exciting new features in Python 3.10 was the introduction of pattern matching (introduced in PEPs 634, 635 and 636). Pattern matching has a wide variety of uses, but really shines in situations where you need to undergo complex destructurings of tree-like datastructures.

That’s a lot of words which may or may not mean very much to you – but consider, for example, using the ast module to parse Python source code. If you’re unfamiliar with the ast module: the module provides tools that enable you to compile Python source code into an “abstract syntax tree” (AST) representing the code’s structure. The Python interpreter itself converts Python source code into an AST in order to understand how to run that code – but parsing Python source code using ASTs is also a common task for linters, such as plugins for flake8 or pylint. In the following example, ast.parse() is used to parse the source code x = 42 into an ast.Module node, and ast.dump() is then used to reveal the tree-like structure of that node in a human-readable form:

>>> import ast
>>> source = "x = 42"
>>> node = ast.parse(source)
>>> node
<ast.Module object at 0x000002A70F928D80>
>>> print(ast.dump(node, indent=2))
Module(
  body=[
    Assign(
      targets=[
        Name(id='x', ctx=Store())],
      value=Constant(value=42))],
  type_ignores=[])

How does working with ASTs relate to pattern-matching? Well, a function to determine whether (to a reasonable approximation) an arbitrary AST node represents the symbol collections.deque might have looked something like this, before pattern matching…

import ast

# This obviously won't work if the symbol is imported with an alias
# in the source code we're inspecting
# (e.g. "from collections import deque as d").
# But let's not worry about that here :-)
def node_represents_collections_dot_deque(node: ast.AST) -> bool:
    """Determine if *node* represents 'deque' or 'collections.deque'"""
    return (
        isinstance(node, ast.Name)
        and node.id == "deque"
    ) or (
        isinstance(node, ast.Attribute)
        and isinstance(node.value, ast.Name)
        and node.value.id == "collections"
        and node.attr == "deque"
    )

But in Python 3.10, pattern matching allows an elegant destructuring syntax:

import ast

def node_represents_collections_dot_deque(node: ast.AST) -> bool:
    """Determine if *node* represents 'deque' or 'collections.deque'"""
    match node:
        case ast.Name("deque"):
            return True
        case ast.Attribute(ast.Name("collections"), "deque"):
            return True
        case _:
            return False

I know which one I prefer.

For some, though, this still isn’t enough – and Michael “Sully” Sullivan is one of them. At the Python Language Summit 2023, Sullivan shared ideas for where pattern matching could go next.

Playing with matches (without getting burned)

Sullivan’s contention is that, while pattern matching provides elegant syntactic sugar in simple cases such as the one above, our ability to chain destructurings using pattern matching is currently fairly limited. For example, say we want to write a function inspecting Python AST that takes an ast.FunctionDef node and identifies whether the node represents a synchronous function with exactly two parameters, both of them annotated as accepting integers. The function would behave so that the following holds true:

>>> import ast
>>> source = "def add_2(number1: int, number2: int): pass"
>>> node = ast.parse(source).body[0]
>>> type(node)
<class 'ast.FunctionDef'>
>>> is_function_taking_two_ints(node)
True

With pre-pattern-matching syntax, we might have written such a function like this:

def is_int(node: ast.AST | None) -> bool:
    """Determine if *node* represents 'int' or 'builtins.int'"""
    return (
        isinstance(node, ast.Name)
        and node.id == "int"
    ) or (
        isinstance(node, ast.Attribute)
        and isinstance(node.value, ast.Name)
        and node.value.id == "builtins"
        and node.attr == "int"
    )

def is_function_taking_two_ints(node: ast.FunctionDef) -> bool:
    """Determine if *node* represents a function that accepts two ints"""
    args = node.args.posonlyargs + node.args.args
    return len(args) == 2 and all(is_int(node.annotation) for node in args)

If we wanted to rewrite this using pattern matching, we could possibly do something like this:

def is_int(node: ast.AST | None) -> bool:
    """Determine if *node* represents 'int' or 'builtins.int'"""
    match node:
        case ast.Name("int"):
            return True
        case ast.Attribute(ast.Name("builtins"), "int"):
            return True
        case _:
            return False

def is_function_taking_two_ints(node: ast.FunctionDef) -> bool:
    """Determine if *node* represents a function that accepts two ints"""
    match node.args.posonlyargs + node.args.args:
        case [ast.arg(), ast.arg()] as arglist:
            return all(is_int(arg.annotation) for arg in arglist)
        case _:
            return False

That leaves a lot to be desired, however! The is_int() helper function can be rewritten in a much cleaner way. But integrating it into the is_function_taking_two_ints() function is… somewhat icky! The code feels harder to understand than before, whereas the goal of pattern matching is to improve readability.

Something like this, (ab)using metaclasses, gets us a lot closer to what it feels like pattern matching should be. By using one of Python’s hooks for customising isinstance() logic, it’s possible to rewrite our is_int() helper function as a class, meaning we can seamlessly integrate it into our is_function_taking_two_ints() function in a very expressive way:

import abc
import ast

class PatternMeta(abc.ABCMeta):
    def __instancecheck__(cls, inst: object) -> bool:
        return cls.match(inst)

class Pattern(metaclass=PatternMeta):
    """Abstract base class for types representing 'abstract patterns'"""
    @staticmethod
    @abc.abstractmethod
    def match(node) -> bool:
        """Subclasses must override this method"""
        raise NotImplementedError

class int_node(Pattern):
    """Class representing AST patterns signifying `int` or `builtins.int`"""
    @staticmethod
    def match(node) -> bool:
        match node:
            case ast.Name("int"):
                return True
            case ast.Attribute(ast.Name("builtins"), "int"):
                return True
            case _:
                return False

def is_function_taking_two_ints(node: ast.FunctionDef) -> bool:
    """Determine if *node* represents a function that accepts two ints"""
    match node.args.posonlyargs + node.args.args:
        case [
            ast.arg(annotation=int_node()),
            ast.arg(annotation=int_node()),
        ]:
            return True
        case _:
            return False

This is still hardly ideal, however – that’s a lot of boilerplate we’ve had to introduce to our helper function for identifying int annotations! And who wants to muck about with metaclasses?

A slide from Sullivan's talk

A __match__ made in heaven?

Sullivan proposes that we make it easier to write helper functions for pattern matching, such as the example above, without having to resort to custom metaclasses. Two competing approaches were brought forward for discussion.

The first idea – a __match__ special method – is perhaps the easier of the two to immediately grasp, and appeared in early drafts of the pattern matching PEPs. (It was eventually removed from the PEPs in order to reduce the scope of the proposed changes to Python.) The proposal is that any class could define a __match__ method that could be used to customise how match statements apply to the class. Our is_function_taking_two_ints() case could be rewritten like so:

class int_node:
    """Class representing AST patterns signifying `int` or `builtins.int`"""

    # The __match__ method is understood by Python to be a static method,
    # even without the @staticmethod decorator,
    # similar to __new__ and __init_subclass__
    def __match__(node) -> ast.Name | ast.Attribute:
        match node:
            case ast.Name("int"):
                # Successful matches can return custom objects,
                # that can be bound to new variables by the caller
                return node
            case ast.Attribute(ast.Name("builtins"), "int"):
                return node
            case _:
                # Return `None` to indicate that there was no match
                return None

def is_function_taking_two_ints(node: ast.FunctionDef) -> bool:
    """Determine if *node* represents a function that accepts two ints"""
    match node.args.posonlyargs + node.args.args:
        case [
            ast.arg(annotation=int_node()),
            ast.arg(annotation=int_node()),
        ]:
            return True
        case _:
            return False

The second idea is more radical: the introduction of some kind of new syntax (perhaps reusing Python’s -> operator) that would allow Python coders to “apply” functions during pattern matching. With this proposal, we could rewrite is_function_taking_two_ints() like so:

def is_int(node: ast.AST | None) -> bool:
    """Determine if *node* represents 'int' or 'builtins.int'"""
    match node:
        case ast.Name("int"):
            return True
        case ast.Attribute(ast.Name("builtins"), "int"):
            return True
        case _:
            return False


def is_function_taking_two_ints(node: ast.FunctionDef) -> bool:
    """Determine if *node* represents a function that accepts two ints"""
    match node.args.posonlyargs + node.args.args:
        case [
            ast.arg(annotation=is_int -> True),
            ast.arg(annotation=is_int -> True),
        ]:
            return True
        case _:
            return False
Match-maker, match-maker, make me a __match__
A slide from Sullivan's talk

The reception in the room to Sullivan’s ideas was positive; the consensus seemed to be that there was clearly room for improvement in this area. Brandt Bucher, author of the original pattern matching implementation in Python 3.10, concurred that this kind of enhancement was needed. Łukasz Langa, meanwhile, said he’d received many queries from users of other programming languages such as C#, asking how to tackle this kind of problem.

The proposal for a __match__ special method follows a pattern common in Python’s data model, where double-underscore “dunder” methods are overridden to provide a class with special behaviour. As such, it will likely be less jarring, at first glance, to those new to the idea. Attendees of Sullivan’s talk seemed, broadly, to slightly prefer the __match__ proposal, and Sullivan himself said he thought it “looked prettier”.

Jelle Zijlstra argued that the __match__ dunder would provide an elegant symmetry between the construction and destruction of objects. Brandt Bucher, meanwhile, said he thought the usability improvements weren’t significant enough to merit new syntax.

Nonetheless, the alternative proposal for new syntax also has much to recommend it. Sullivan argued that having dedicated syntax to express the idea of “applying” a function during pattern matching was more explicit. Mark Shannon agreed, noting the similarity between this idea and features in the Haskell programming language. “This is functional programming,” Shannon argued. “It feels weird to apply OOP models to this.”

Addendum: pattern-matching resources and recipes

In the meantime, while we wait for a PEP, there are plenty of innovative uses of pattern matching springing up in the ecosystem. For further reading/watching/listening, I recommend:
