Planet Python

Planet Python - http://planetpython.org/

PyBites: Creating a Fitness Tracker App with Python Reflex

Thu, 2025-01-16 06:33

In this post, I will build a simple fitness tracker app using Python Reflex.

Reflex is a Python library that allows you to create reactive applications using a functional and declarative approach.

We will use Reflex to create a simple fitness tracker app that allows you to log the amount of workouts completed per week.

This is how it will look:

Install Reflex

After making a new directory and cd’ing into it, I’ll use uv (anybody not yet using this amazing tool?) to initialize a project and install Reflex:

√ fitness_tracker $ uv init --no-workspace --no-package
Initialized project `fitness-tracker`
√ fitness_tracker (main) $ rm hello.py
√ fitness_tracker (main) $ uv add reflex
Using CPython 3.13.0
Creating virtual environment at: .venv
Resolved 81 packages in 602ms
Prepared 4 packages in 1.47s
Installed 72 packages in 305ms

The awesome thing about uv is that you don’t have to worry about clearing and activating a virtual environment. It’s also super fast (listen to the creator talk about it on our podcast).

Initialize and configure Reflex

Next up I will run reflex init which presents me with the following menu. I am selecting the default blank option to start from scratch:

$ uv run reflex init
──────────────────────── Initializing fitness_tracker ────────────────────────
[14:30:37] Initializing the web directory.                      console.py:161
Get started with a template:
(0) blank (https://blank-template.reflex.run) - A blank Reflex app.
(1) ai - Generate a template using AI [Experimental]
(2) choose templates - Choose an existing template.
Which template would you like to use? (0):
[14:30:43] Initializing the app directory.                      console.py:161
Success: Initialized fitness_tracker using the blank template
√ fitness_tracker (main) $ ls -C1
README.md
__pycache__
assets
fitness_tracker
hello.py
pyproject.toml
requirements.txt
rxconfig.py
uv.lock

Next we’ll configure the app by adding a database. To keep it simple I’ll just use a sqlite DB. Updating rxconfig.py:

config = rx.Config(
    app_name="fitness_tracker",
    db_url="sqlite:///reflex.db",
)

At this point we can run the dev server to verify the installation (note that the first time it will take a bit to do the initial compilation):

$ uv run reflex run
Info: The frontend will run on port 3001.
Info: The backend will run on port 8001.
Info: Overriding config value frontend_port with env var FRONTEND_PORT=3001
Info: Overriding config value backend_port with env var BACKEND_PORT=8001
──────────────────────────── Starting Reflex App ─────────────────────────────
[14:32:12] Compiling: ━━━━━━━━━━━━━━━━━━━━━━ 100% 13/13 0:00:00
──────────────────────────────── App Running ─────────────────────────────────
Info: Overriding config value frontend_port with env var FRONTEND_PORT=3001
Info: Overriding config value backend_port with env var BACKEND_PORT=8001
App running at: http://localhost:3001
Backend running at: http://0.0.0.0:8001

Skeleton app code

Reflex already created some boilerplate for us in fitness_tracker/fitness_tracker.py:

"""Welcome to Reflex! This file outlines the steps to create a basic app.""" import reflex as rx from rxconfig import config class State(rx.State): """The app state.""" ... def index() -> rx.Component: # Welcome Page (Index) return rx.container( rx.color_mode.button(position="top-right"), rx.vstack( rx.heading("Welcome to Reflex!", size="9"), rx.text( "Get started by editing ", rx.code(f"{config.app_name}/{config.app_name}.py"), size="5", ), rx.link( rx.button("Check out our docs!"), href="https://reflex.dev/docs/getting-started/introduction/", is_external=True, ), spacing="5", justify="center", min_height="85vh", ), rx.logo(), ) app = rx.App() app.add_page(index)
  • A Reflex app consists of a State class and an index function that returns a Component.
  • You can see the State class as the back-end of the app; it holds the data and the methods to manipulate it.
  • The index function is the front-end of the app; it returns a Component that defines the UI.
  • Similar to Flask/FastAPI, there is an App class we should instantiate. We bind the index page to the app using its add_page method.
Use a database to save workouts

Let’s start by defining a model to track our workouts. For this app I am only interested in whether the workout got done and when, not what I did per se.

Reflex uses sqlmodel as its ORM (object relational mapper), so we can add a single column table as easily as this:

from datetime import datetime

import reflex as rx
from sqlalchemy import Column, DateTime, func, and_
from sqlmodel import Field


class Workout(rx.Model, table=True):
    """Database model for a workout."""

    completed: datetime = Field(
        sa_column=Column(
            DateTime(timezone=True),
            server_default=func.now(),
        )
    )

In order to set the default value for the completed field, I needed the server_default argument of the Column constructor.

This will set the field to the current time when a new record is created (using completed: datetime = datetime.now() won’t work here; it would set a single fixed datetime once, when the class is defined).
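To see the difference outside of any framework, here is a tiny standalone sketch: a module-level datetime.now() is evaluated exactly once, while a callable (which is effectively what server_default=func.now() gives you, on the database side) yields a fresh timestamp every time:

```python
import time
from datetime import datetime, timezone

# Evaluated exactly once, when this line runs (e.g. at import time):
FIXED_DEFAULT = datetime.now(timezone.utc)


def fresh_default() -> datetime:
    # Evaluated on every call, like a per-row server default.
    return datetime.now(timezone.utc)


time.sleep(0.01)
assert fresh_default() > FIXED_DEFAULT  # a new, later timestamp each call
```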

Sometimes you need to mix in SQLAlchemy features like this, as the Reflex docs say:

SQLModel automatically maps basic python types to SQLAlchemy column types, but for more advanced use cases, it is possible to define the column type using sqlalchemy directly.

Reflex ships with Alembic, so we can use makemigrations + migrate to sync this model to the sqlite db we configured earlier, after doing a db init:

$ uv run reflex db init
$ uv run reflex db makemigrations --message "Add Workout model"
$ uv run reflex db migrate

New to Alembic / db migrations? I made this beginner video about it.

Retrieving workouts

In the next couple of sections I will build out the back-end logic first by updating the State class. First we make a method to load workouts from the database:

from datetime import datetime, timedelta, timezone

from sqlalchemy import and_

WEEKLY_GOAL = 5


class State(rx.State):
    workouts: list[str] = []
    target: int = WEEKLY_GOAL
    current_week_offset: int = 0

    def load_workouts(self, week_offset: int = 0):
        today = datetime.now(timezone.utc)
        start_of_week = datetime(today.year, today.month, today.day) - timedelta(
            days=today.weekday()
        )
        start_of_week += timedelta(weeks=week_offset)
        end_of_week = start_of_week + timedelta(days=7)

        with rx.session() as session:
            db_workouts = (
                session.query(Workout)
                .filter(
                    and_(
                        Workout.completed >= start_of_week,
                        Workout.completed < end_of_week,
                    )
                )
                .all()
            )
            self.workouts = [
                workout.completed.strftime("%Y-%m-%d %H:%M")
                for workout in db_workouts
            ]
  • We set a weekly goal of 5 workouts (my current and pretty static goal).
  • We load workouts for the current week; if we want to see past or future weeks, we can pass an offset into the method.
  • We format the dates as strings for display, because State can only hold simple types (no ORM objects).
  • We use the rx.session() context manager to get a session to query the db, very similar to sqlmodel and SQLAlchemy.
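The week-window arithmetic is the trickiest part of load_workouts, so here it is isolated as a small framework-free sketch (the helper name week_bounds is mine, not part of the app):

```python
from datetime import datetime, timedelta, timezone


def week_bounds(week_offset: int = 0) -> tuple[datetime, datetime]:
    """Return the [start, end) datetimes of the Monday-based week,
    shifted by week_offset weeks."""
    today = datetime.now(timezone.utc)
    start = datetime(
        today.year, today.month, today.day, tzinfo=timezone.utc
    ) - timedelta(days=today.weekday())
    start += timedelta(weeks=week_offset)
    return start, start + timedelta(days=7)


start, end = week_bounds()
assert end - start == timedelta(days=7)
assert start.weekday() == 0  # weeks start on Monday
```

Passing a negative offset shifts the window into the past, which is exactly what the previous/next-week navigation will rely on.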
Properties (computed fields)

You can define properties in Reflex using the @rx.var decorator. I am defining a couple we’ll need in the front-end in a bit:

    @rx.var
    def progress(self) -> int:
        return len(self.workouts)

    @rx.var
    def progress_percentage(self) -> int:
        return int(self.progress / self.target * 100)

    @rx.var
    def goal_reached(self) -> bool:
        return self.progress >= self.target

    @rx.var
    def current_week(self) -> bool:
        return self.current_week_offset == 0

    @rx.var
    def yymm(self) -> str:
        dt = datetime.now(timezone.utc) + timedelta(weeks=self.current_week_offset)
        cal = dt.isocalendar()
        return f"{cal.year} - week {cal.week:02}"

Navigation through the weeks

Let’s add methods to go to the next and previous weeks:

    def load_current_week(self):
        self.load_workouts(self.current_week_offset)

    def show_previous_week(self):
        self.current_week_offset -= 1
        self.load_workouts(self.current_week_offset)

    def show_next_week(self):
        self.current_week_offset += 1
        self.load_workouts(self.current_week_offset)

Logging workouts

We retrieved workouts, but we have no way to add them yet. Let’s add a method to save a workout to the database:

    def log_workout(self):
        with rx.session() as session:
            workout = Workout()
            session.add(workout)
            session.commit()
        self.load_workouts(self.current_week_offset)

Note that I don’t have to specify any fields to Workout(), because the completed field (column) is set to the current datetime by default.
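You can see the same default-timestamp behavior with plain sqlite3 from the standard library; this is just an illustration of the mechanism, not the schema Alembic generates:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE workout (
        id INTEGER PRIMARY KEY,
        completed TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
    """
)
# No column values supplied; the database fills in `completed` itself.
conn.execute("INSERT INTO workout DEFAULT VALUES")
(completed,) = conn.execute("SELECT completed FROM workout").fetchone()
print(completed)  # e.g. "2025-01-16 12:00:00"
conn.close()
```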

Building the UI

The State class has all we need from a back-end perspective. Now let’s build the front-end defining some UI elements. I define a couple of functions that return a rx.Component each.

You can also do this inline in the index() function (up next), but separate functions increase the code’s readability and make those components reusable.

def progress_display() -> rx.Component:
    return rx.vstack(
        rx.text(f"Workouts Completed {State.yymm}:", size="4"),
        rx.progress(value=State.progress_percentage),
    )


def week_navigation_buttons() -> rx.Component:
    return rx.hstack(
        rx.button("Previous Week", on_click=State.show_previous_week, size="2"),
        rx.button("Next Week", on_click=State.show_next_week, size="2"),
        spacing="4",
    )


def conditional_workout_logging_button() -> rx.Component:
    return rx.cond(
        State.goal_reached,
        rx.text("Congrats, you hit your weekly goal 💪 🎉", size="4", color="green"),
        rx.cond(
            State.current_week,
            rx.button(
                "Log Workout",
                on_click=State.log_workout,
                size="4",
                background_color="green",
                color="white",
            ),
            rx.text("", size="4"),
        ),
    )


def workout_list() -> rx.Component:
    return rx.vstack(
        rx.foreach(
            State.workouts,
            lambda workout_date: rx.text(f"Workout done: {workout_date}"),
        ),
    )
  • progress_display() shows the progress towards the weekly goal. Note that I can access State properties directly.
  • week_navigation_buttons() shows buttons to navigate to the previous or next week, these call methods on the State class.
  • conditional_workout_logging_button() accounts for several states: if the goal is reached, if we are in the current week, or if we can log a workout. Note that you need to use rx.cond to conditionally render components (not if statements).
  • workout_list() shows a list of workouts for the current week. Again we cannot use a plain for loop; the framework quickly taught me (through its error messages) that I needed to use rx.foreach (ChatGPT didn’t know this).
Putting all the UI components together

Now we can update the index() function to call our components in sequence:

def index() -> rx.Component:
    return rx.vstack(
        rx.heading("Fitness Tracker", size="9"),
        progress_display(),
        rx.heading("Workout History", size="7"),
        week_navigation_buttons(),
        workout_list(),
        conditional_workout_logging_button(),
        align="center",
        spacing="4",
    )

Pre-loading data upon app start

The page was already registered in the boilerplate code, but we did not give it a title so I did this here, as well as pre-loading the current week’s data using the on_load keyword argument:

app = rx.App()
app.add_page(
    index,
    title="Fitness Tracker",
    on_load=State.load_current_week,
)

And that should be it: a full-stack web app with persistence, without writing any JavaScript!

Result

Run uv run reflex run again and here is our little fitness app:

Logging a workout, we see our nice progress bar at 20%:

Logging more workouts, we’re at 80% of the week’s goal now (I actually hit that today):

Hitting the weekly goal, the Log Workout button disappears and we see a nice message:

Going to the previous and next weeks, there is nothing in the database yet, and I should not be able to log anything (simplistic first MVP):

Here is the code for this project so you can try it out yourself.

Improvements

This is quite a simplistic app of course. Here are some ways to make it more functional:

  1. Being able to change the goal from 5 to something else. This is actually easy to accomplish by adding an input field: rx.input(placeholder="Set goal", on_blur=State.set_target, size="3"). The auto-generated set_target setter changes the attribute on the fly, and the State class just rolls with the change. I tried this and did not have to make any other changes (similar to how Excel formulas just re-calculate).
  2. Ideally you should be able to set the date manually so you can still log workouts for previous weeks.
  3. I am excited to add a nice bar chart to see the performance over the weeks.

If you want to contribute to this repo, you’re more than welcome. Go here.

Note about learning approach

I used the following approach to come up with this app:

  1. I spent ~20 minutes looking at the get started docs + another 10-20 minutes on YouTube to see how databases worked with the framework (a great complete app walk-through; I went straight to the sqlmodel part to learn how to use a DB).
  2. Then I used ChatGPT to come up with a quick prototype. It made more mistakes than usual, probably because this is a relatively new framework so there was less data in its training set. But I debugged my way through it learning the framework from the inside out.
  3. Then I went back to the docs and a lot of things made much more sense because I had used it.

This JIT learning style (dropping tutorial paralysis) really works for us, and it makes the people we coach very effective too.

See what people have built with us

What’s next?

I feel I have only scratched the surface. I am excited about this tool because one advantage over, say, Streamlit is that it’s easier to customize and it seems to handle state changes better.

Of course I will stick with Django + htmx + Tailwind CSS (what we used for our v2 platform) for more involved web apps, but this framework looks really promising to quickly make web apps while having a certain degree of control.

Also notice I built something from scratch in this article. I cannot wait to play with the other templates presented in the init step, for example the dashboard one …

Thanks for reading. I hope this practical guide helps you get started with Reflex.

Let me know what you will build, and what your experience with Reflex has been …

Share it in our Community
Categories: FLOSS Project Planets

PyCharm: Anomaly Detection in Machine Learning Using Python

Thu, 2025-01-16 05:08

In recent years, many of our applications have been driven by the high volume of data that we are able to collect and process; some may say we are in the age of data. One essential aspect of handling such large amounts of data is anomaly detection – processes that enable us to identify outliers, data points that fall outside the bounds of expectation and demonstrate behavior that is out of the norm. In scientific research, anomalous data points could result from technical issues and may need to be discarded when drawing conclusions, or they could lead to new discoveries.

In this blog post, we’ll see why using machine learning for anomaly detection is helpful and explore key techniques for detecting anomalies using Python. You’ll learn how to implement popular methods like OneClassSVM and Isolation Forest, see examples of how to visualize these results and understand how to apply them to real-world problems.

Where is anomaly detection used?

Anomaly detection has become a crucial part of modern-day business intelligence, as it provides insights into what could possibly go wrong and may identify potential problems. Here are some examples of using anomaly detection in modern-day business.

Security alerts

There are some cyber security attacks that can be detected via anomaly detection; for example, a spike in request volume may indicate a DDoS attack, while suspicious login behavior, like multiple failed attempts, may indicate unauthorized access. Detecting suspicious user behavior may point to potential cyber security threats, and companies can act on them accordingly to prevent or minimize the damage.

Fraud detection

In financial organizations, for example, banks can use anomaly detection to highlight suspicious account activities, which may be an indication of illegal activities like money laundering or identity theft. Suspicious transactions can also be a sign of credit card fraud.

Observability

One of the common practices for web services is to collect real-time performance metrics of the service to spot abnormal behavior in the system. For example, a spike in memory usage may show that something in the system isn’t functioning properly, and engineers may need to address it immediately to avoid a break in service.

Why use machine learning for anomaly detection?

Although traditional statistical methods can also help find outliers, the use of machine learning for anomaly detection has been a game changer. With machine learning algorithms, more complex data (e.g. with multiple parameters) can be analyzed all at once. Machine learning techniques also provide a means to analyze categorical data that isn’t easy to analyze using traditional statistical methods, which are more suited to numerical data.
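For contrast, a traditional statistical baseline can be as simple as a z-score threshold on a single metric (the readings below are made-up numbers for illustration):

```python
from statistics import mean, stdev

# Hypothetical memory readings (MB); one obvious spike.
readings = [512, 518, 507, 522, 515, 509, 2048, 513]

mu, sigma = mean(readings), stdev(readings)
outliers = [x for x in readings if abs(x - mu) > 2 * sigma]
print(outliers)  # [2048]
```

This works for one numeric variable, but it quickly breaks down for multi-dimensional or categorical data, which is where the machine learning methods below come in.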

Much of the time, these anomaly detection algorithms are programmed and can be deployed as an application (see our FastAPI for Machine Learning tutorial) and run on demand or at scheduled intervals to detect any anomalies. This means they can prompt immediate actions within the company and can also be used as reporting tools for business intelligence teams to review and adjust strategies.

Types of anomaly detection techniques and algorithms

There are generally two main types of anomaly detection: outlier detection and novelty detection.

Outlier detection

Outlier detection is sometimes referred to as unsupervised anomaly detection, as it is assumed that in the training data, there are some undetected anomalies (thus unlabeled), and the approach is to use unsupervised machine learning algorithms to pick them out. Some of these algorithms include one-class support vector machines (SVMs), Isolation Forest, Local Outlier Factor, and Elliptic Envelope.

Novelty detection

On the other hand, novelty detection is sometimes referred to as semi-supervised anomaly detection. Since we assume that the training data contains no anomalies, it is all labeled as normal. The goal is to detect whether or not new data is an anomaly, which is sometimes referred to as a novelty. The algorithms used in outlier detection can also be used for novelty detection, provided that there are no anomalies in the training data.

Other than the outlier detection and novelty detection mentioned, it is also very common to require anomaly detection in time series data. However, since the approach and technique used for time series data are often different from the algorithms mentioned above, we’ll discuss these in detail at a later date.

Code example: finding anomalies in the Beehives dataset

In this blog post, we’ll be using this Beehives dataset as an example to detect any anomalies in the hives. This data set provides various measurements of the hive (including the temperature and relative humidity of the hive) at various times.

Here, we’ll be showing two very different methods for discovering anomalies: OneClassSVM, which is based on support vector machine technology and which we’ll use to draw decision boundaries, and Isolation Forest, an ensemble method similar to Random Forest.

Example: OneClassSVM

In this first example, we’ll use the data of hive 17. Assuming bees keep their hive in a constant, pleasant environment for the colony, we can check whether this is true and whether there are times when the hive experiences anomalous temperature and relative humidity levels. We’ll use OneClassSVM to fit our data and look at the decision boundaries on a scatter plot.

The SVM in OneClassSVM stands for support vector machine, a popular machine learning algorithm for classification and regression. While support vector machines can be used to classify data points in high dimensions, by choosing a kernel and a scalar parameter to define a frontier, we can create a decision boundary that includes most of the data points (normal data), while leaving a small fraction of anomalies outside the boundary, controlled by the parameter nu. The method of using support vector machines for anomaly detection is covered in a paper by Schölkopf et al. entitled Estimating the Support of a High-Dimensional Distribution.
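Before applying this to the beehive data, here's a self-contained sketch of OneClassSVM on synthetic two-dimensional data (made-up "temperature/humidity" readings); note how the fraction of points flagged as outliers roughly tracks nu:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(42)
# Synthetic readings clustered around a pleasant operating point.
X = rng.normal(loc=[20.0, 55.0], scale=[1.0, 3.0], size=(500, 2))

clf = OneClassSVM(nu=0.1, gamma=0.05).fit(X)
pred = clf.predict(X)            # +1 = inlier, -1 = outlier
outlier_rate = (pred == -1).mean()
print(round(outlier_rate, 2))    # roughly nu, i.e. around 0.1
```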

1. Start a Jupyter project

When starting a new project in PyCharm (Professional 2024.2.2), select Jupyter under Python.

Start with PyCharm Pro for free

The benefit of using a Jupyter project (previously also known as a Scientific project) in PyCharm is that a file structure is generated for you, including a folder for storing your data and a folder to store all the Jupyter notebooks so you can keep all your experiments in one place. 

Another huge benefit is that we can render graphs very easily with Matplotlib. You will see that in the steps below.

2. Install dependencies

Download this requirements.txt from the relevant GitHub repo. Once you place it in the project directory and open it in PyCharm, you will see a prompt asking you to install the missing libraries.

Click on Install requirements, and all of the requirements will be installed for you. In this project, we’re using Python 3.11.1.

3. Import and inspect the data

You can either download the “Beehives” dataset from Kaggle or from this GitHub repo. Put all three CSVs in the Data folder. Then, in main.py, enter the following code:

import pandas as pd

df = pd.read_csv('data/Hive17.csv', sep=";")
df = df.dropna()
print(df.head())


Finally, press the Run button in the top right-hand corner of the screen, and our code will be run in the Python console, giving us an idea of what our data looks like.

4. Fit the data points and inspect them in a graph

Since we’ll be using the OneClassSVM from scikit-learn, we’ll import it together with DecisionBoundaryDisplay and Matplotlib using the code below:

from sklearn.svm import OneClassSVM
from sklearn.inspection import DecisionBoundaryDisplay
import matplotlib.pyplot as plt

According to the data’s description, we know that column T17 represents the temperature of the hive, and RH17 represents the relative humidity of the hive. We’ll extract the value of these two columns as our input:

X = df[["T17", "RH17"]].values

Then, we’ll create and fit the model. Note that we’ll try the default setting first:

estimator = OneClassSVM().fit(X)

Next, we’ll show the decision boundary together with the data points:

disp = DecisionBoundaryDisplay.from_estimator(
    estimator,
    X,
    response_method="decision_function",
    plot_method="contour",
    xlabel="Temperature",
    ylabel="Humidity",
    levels=[0],
)
disp.ax_.scatter(X[:, 0], X[:, 1])
plt.show()

Now, save and press Run again, and you’ll see that the plot is shown in a separate window for inspection.

5. Fine-tune hyperparameters

As the plot above shows, the decision boundary does not fit the data points very well. The data points form a couple of irregular shapes instead of an oval. To fine-tune our model, we have to provide specific values for “nu” and “gamma” to the OneClassSVM model. You can try it out yourself, but after a couple of tests, it seems “nu=0.1, gamma=0.05” gives the best result.

Example: Isolation Forest

Isolation Forest is an ensemble-based method, similar to the more popular Random Forest classification method. By randomly selecting splitting features and split values, it creates many decision trees; the path length from the root of the tree to the node isolating a sample is then averaged over all the trees (hence “forest”). A short average path length indicates anomalies.

A short decision path usually indicates data that is very different from the others.
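As a quick standalone illustration (on synthetic data, not the hive measurements), a point placed far from the main cluster receives the lowest anomaly score:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 2))
X = np.vstack([X, [[8.0, 8.0]]])   # append one obvious anomaly

clf = IsolationForest(n_estimators=100, random_state=0).fit(X)
scores = clf.decision_function(X)  # lower score = more anomalous
assert int(np.argmin(scores)) == len(X) - 1  # the [8, 8] point stands out
```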

Now, let’s compare the result of OneClassSVM with IsolationForest. To do that, we’ll make two plots of the decision boundaries made by the two algorithms. In the following steps, we’ll build on the script above using the same hive 17 data.

1. Import IsolationForest

IsolationForest can be imported from the ensemble categories in Scikit-learn:

from sklearn.ensemble import IsolationForest

2. Refactor and add a new estimator

Since now we’ll have two different estimators, let’s put them in a list:

estimators = [
    OneClassSVM(nu=0.1, gamma=0.05).fit(X),
    IsolationForest(n_estimators=100).fit(X),
]

After that, we’ll use a for loop to loop through all the estimators.

for estimator in estimators:
    disp = DecisionBoundaryDisplay.from_estimator(
        estimator,
        X,
        response_method="decision_function",
        plot_method="contour",
        xlabel="Temperature",
        ylabel="Humidity",
        levels=[0],
    )
    disp.ax_.scatter(X[:, 0], X[:, 1])
    plt.show()

As a final touch, we’ll also add a title to each of the graphs for easier inspection. To do that, we’ll add the following after disp.ax_.scatter:

disp.ax_.set_title(
    f"Decision boundary using {estimator.__class__.__name__}"
)

You may find that refactoring using PyCharm is very easy with the auto-complete suggestions it provides.

3. Run the code

Like before, running the code is as easy as pressing the Run button in the top-right corner. After running the code this time, we should get two graphs.

You can easily flip through the two graphs with the preview on the right. As you can see, the decision boundaries are quite different across the two algorithms. When doing anomaly detection, it’s worth experimenting with various algorithms and parameters to find the one that suits the use case the most.

Summary

Anomaly detection has proven to be an important aspect of business intelligence, and being able to identify anomalies and prompt immediate action is essential in some sectors of business. Using the proper machine learning model to automatically detect anomalies can help analyze complicated, high-volume data in a short period of time. In this blog post, we have demonstrated how to identify anomalies using machine learning models like OneClassSVM and Isolation Forest.

To learn more about using PyCharm for machine learning, please check out “Start Studying Machine Learning With PyCharm” and “How to Use Jupyter Notebooks in PyCharm”.

Detect anomalies using PyCharm

With the Jupyter project in PyCharm Professional, you can easily organize an anomaly detection project with many data files and notebooks. Graph output can be generated to inspect anomalies, and plots are very accessible in PyCharm. Other features, such as auto-complete suggestions, make navigating all the Scikit-learn models and Matplotlib plot settings a blast.

Power up your data science project by using PyCharm; check out the data science features offered to streamline your data science workflow.


Django Weblog: Django 5.2 alpha 1 released

Thu, 2025-01-16 01:15

Django 5.2 alpha 1 is now available. It represents the first stage in the 5.2 release cycle and is an opportunity for you to try out the changes coming in Django 5.2.

Django 5.2 brings a composite of new features which you can read about in the in-development 5.2 release notes.

This alpha milestone marks the feature freeze. The current release schedule calls for a beta release in about a month and a release candidate about a month from then. We'll only be able to keep this schedule if the community tests early and often. Updates on the release schedule are available on the Django forum.

As with all alpha and beta packages, this is not for production use. But if you'd like to take some of the new features for a spin, or to help find and fix bugs (which should be reported to the issue tracker), you can grab a copy of the alpha package from our downloads page or on PyPI.

The PGP key ID used for this release is Sarah Boyce: 3955B19851EA96EF


TestDriven.io: Database Indexing in Django

Wed, 2025-01-15 17:28
This article explores the basics of database indexing, its advantages and disadvantages, and how to apply it in a Django application.

The Python Show: 52 - PyTorch and LLMs with Daniel Voigt Godoy

Wed, 2025-01-15 12:23

In today’s podcast, we welcome Daniel Voigt Godoy to the show. Daniel is the author of Deep Learning with PyTorch Step-by-Step among other books.

We chatted about the following topics:

  • Favorite Python packages

  • PyTorch

  • Book writing

  • and so much more!

Be sure to check out the links section to learn more about Daniel!

Links

Steve Holden: I want to bang some heads together!

Wed, 2025-01-15 10:26

 It's frustrating when useful tools refuse to work together nicely. In the past I've experienced conflicts between black and flake8 that made it impossible to commit via my default commit hooks. Now I'm seeing the same behaviour with black and reorder-python-imports.

In short, almost a year ago now GitHub user maxwell-k reported that black release 24.1.0 had introduced an incompatibility with reorder-python-imports by starting to require a blank line after a module docstring. In the discussion on the bug report the black crew make the reasonable-seeming point that it's black's job to determine the disposition of whitespace, and that reorder-python-imports should do what its name implies and nothing more. This would respect the long-standing Unix tradition that each tool should as far as possible perform a single function.

Unfortunately, when elagil raised the same issue with the reorder-python-imports developers, with a request to make their project usable with black (ably supported by maxwell-k), they received a response which I can only (avoiding the use of expletives) describe as disappointing:

anything is possible. will it happen here: no

In my opinion this uncompromising attitude displays the worst kind of arrogance from a developer, and I frankly fail to see who benefits from this refusal to bend (except perhaps a developer unwilling to work further on a project or set it free). The net consequence from my own point of view is that I'll no longer be using reorder-python-imports, nor recommending it.

The situation remains unchanged. Life's too short to persuade donkeys to move. On the plus side, research into solving this irritation led me to start working with ruff, which provides the functionality of both utilities in a single rather faster tool. It's an ill wind that blows nobody any good. Goodbye, donkeys!


Real Python: How to Replace a String in Python

Wed, 2025-01-15 09:00

Replacing strings in Python is a fundamental skill. You can use the .replace() method for straightforward replacements, while re.sub() allows for more advanced pattern matching and replacement. Both of these tools help you clean and sanitize text data.

In this tutorial, you’ll work with a chat transcript to remove or replace sensitive information and unwanted words with emojis. To achieve this, you’ll use both direct replacement methods and more advanced regular expressions.

By the end of this tutorial, you’ll understand that:

  • You can replace strings in Python using the .replace() method and re.sub().
  • You replace parts of a string by chaining .replace() calls or using regex patterns with re.sub().
  • You replace a letter in a string by specifying it as the first argument in .replace().
  • You remove part of a string by replacing it with an empty string using .replace() or re.sub().
  • You replace all occurrences of substrings in a string by using .replace().
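The bullet points above can be sketched in a short example (the strings here are illustrative, not taken from the tutorial's chat transcript):

```python
import re

text = "The fake snake ate a fake cake"

# Replace all occurrences of a substring with .replace().
print(text.replace("fake", "real"))  # The real snake ate a real cake

# Remove part of a string by replacing it with an empty string.
print(text.replace(" fake", ""))  # The snake ate a cake

# Use re.sub() for pattern-based replacement.
print(re.sub(r"\bfake\b", "real", text))  # The real snake ate a real cake
```

Note that `.replace()` returns a new string; the original is untouched, which the tutorial demonstrates in detail below.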

You’ll be playing the role of a developer for a company that provides technical support through a one-to-one text chat. You’re tasked with creating a script that’ll sanitize the chat, removing any personal data and replacing any swear words with emojis.

You’re only given one very short chat transcript:

[support_tom] 2025-01-24T10:02:23+00:00 : What can I help you with?
[johndoe] 2025-01-24T10:03:15+00:00 : I CAN'T CONNECT TO MY BLASTED ACCOUNT
[support_tom] 2025-01-24T10:03:30+00:00 : Are you sure it's not your caps lock?
[johndoe] 2025-01-24T10:04:03+00:00 : Blast! You're right!

Even though this transcript is short, it’s typical of the type of chats that agents have all the time. It has user identifiers, ISO time stamps, and messages.

In this case, the client johndoe filed a complaint, and company policy is to sanitize and simplify the transcript, then pass it on for independent evaluation. Sanitizing the message is your job!

Sample Code: Click here to download the free sample code that you’ll use to replace strings in Python.

The first thing you’ll want to do is to take care of any swear words.

How to Remove or Replace a Python String or Substring

The most basic way to replace a string in Python is to use the .replace() string method:

Python >>> "Fake Python".replace("Fake", "Real") 'Real Python' Copied!

As you can see, you can chain .replace() onto any string and provide the method with two arguments. The first is the string that you want to replace, and the second is the replacement.

Note: Although the Python shell displays the result of .replace(), the string itself stays unchanged. You can see this more clearly by assigning your string to a variable:

Python >>> name = "Fake Python" >>> name.replace("Fake", "Real") 'Real Python' >>> name 'Fake Python' >>> name = name.replace("Fake", "Real") 'Real Python' >>> name 'Real Python' Copied!

Notice that when you simply call .replace(), the value of name doesn’t change. But when you assign the result of name.replace() to the name variable, 'Fake Python' becomes 'Real Python'.

Now it’s time to apply this knowledge to the transcript:

Python >>> transcript = """\ ... [support_tom] 2025-01-24T10:02:23+00:00 : What can I help you with? ... [johndoe] 2025-01-24T10:03:15+00:00 : I CAN'T CONNECT TO MY BLASTED ACCOUNT ... [support_tom] 2025-01-24T10:03:30+00:00 : Are you sure it's not your caps lock? ... [johndoe] 2025-01-24T10:04:03+00:00 : Blast! You're right!""" >>> transcript.replace("BLASTED", "😤") [support_tom] 2025-01-24T10:02:23+00:00 : What can I help you with? [johndoe] 2025-01-24T10:03:15+00:00 : I CAN'T CONNECT TO MY 😤 ACCOUNT [support_tom] 2025-01-24T10:03:30+00:00 : Are you sure it's not your caps lock? [johndoe] 2025-01-24T10:04:03+00:00 : Blast! You're right! Copied!

Loading the transcript as a triple-quoted string and then using the .replace() method on one of the swear words works fine. But there’s another swear word that’s not getting replaced because in Python, the string needs to match exactly:

Python >>> "Fake Python".replace("fake", "Real") 'Fake Python' Copied!

As you can see, even if the casing of one letter doesn’t match, it’ll prevent any replacements. This means that if you’re using the .replace() method, you’ll need to call it various times with the variations. In this case, you can just chain on another call to .replace():

Python >>> transcript.replace("BLASTED", "😤").replace("Blast", "😤") [support_tom] 2025-01-24T10:02:23+00:00 : What can I help you with? [johndoe] 2025-01-24T10:03:15+00:00 : I CAN'T CONNECT TO MY 😤 ACCOUNT [support_tom] 2025-01-24T10:03:30+00:00 : Are you sure it's not your caps lock? [johndoe] 2025-01-24T10:04:03+00:00 : 😤! You're right! Copied!

Success! But you’re probably thinking that this isn’t the best way to do this for something like a general-purpose transcription sanitizer. You’ll want to move toward some way of having a list of replacements, instead of having to type out .replace() each time.
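One way to move in that direction — a sketch of the idea, not the article's final solution — is to keep the replacements in a list of (old, new) pairs and loop over them:

```python
transcript = """\
[support_tom] 2025-01-24T10:02:23+00:00 : What can I help you with?
[johndoe] 2025-01-24T10:03:15+00:00 : I CAN'T CONNECT TO MY BLASTED ACCOUNT
[johndoe] 2025-01-24T10:04:03+00:00 : Blast! You're right!"""

# Each pair is (substring to find, replacement).
replacements = [("BLASTED", "😤"), ("Blast", "😤")]

# Apply each replacement in turn, rebinding the result each time.
for old, new in replacements:
    transcript = transcript.replace(old, new)

print(transcript)
```

Adding a new swear word is now a one-line change to the list instead of another chained `.replace()` call.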

Read the full article at https://realpython.com/replace-string-python/ »

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

Glyph Lefkowitz: Small PINPal Update

Tue, 2025-01-14 19:54

Today on stream, I updated PINPal to fix the memorization algorithm.

If you haven’t heard of PINPal before, it is a vault password memorization tool. For more detail on what that means, you can check out the README, and why not give it a ⭐ while you’re at it.

As I started writing up an update post I realized that I wanted to contextualize it a bit more, because it’s a tool I really wish were more popular. It solves one of those small security problems that you can mostly ignore, right up until the point where it’s a huge problem and it’s too late to do anything about it.

In brief, PINPal helps you memorize new secure passcodes for things you actually have to remember and can’t simply put into your password manager, like the password to your password manager, your PC user account login, your email account1, or the PIN code to your phone or debit card.

Too often, even if you’re properly using a good password manager for your passwords, you’ll be protecting it with a password optimized for memorability, which is to say, one that isn’t random and thus isn’t secure. But I have also seen folks veer too far in the other direction, trying to make a really secure password that they then forget right after switching to a password manager. Forgetting your vault password can also be a really big deal, making you do password resets across every app you’ve loaded into it so far, so having an opportunity to practice it periodically is important.

PINPal uses spaced repetition to ensure that you remember the codes it generates.

While periodic forced password resets are a bad idea, if (and only if!) you can actually remember the new password, it is a good idea to get rid of old passwords eventually — like, let’s say, when you get a new computer or phone. Doing so reduces the risk that a password stored somewhere on a very old hard drive or darkweb data dump is still floating around out there, forever haunting your current security posture. If you do a reset every 2 years or so, you know you’ve never got more than 2 years of history to worry about.

PINPal is also particularly secure in the way it incrementally generates your password; the computer you install it on only ever stores the entire password in memory when you type it in. It stores even the partial fragments that you are in the process of memorizing using the secure keyring module, avoiding plain-text whenever possible.

I’ve been using PINPal to generate and memorize new codes for a while, just in case2, and the change I made today was because I encountered a recurring problem. The problem was that I’d forget a token after it had been hidden, and there was never any going back. The moment a token was hidden from the user, it was removed from storage, so you could never get a reminder. While I’ve successfully memorized about 10 different passwords with it so far, I’ve had to delete 3 or 4.

So, in the updated algorithm, the visual presentation now hides tokens in the prompt several memorizations before they’re removed. Previously, if the password you were generating was ‘hello world’, you’d see hello world 5 times or so, then •••• world; if you ever got it wrong past that point, too bad, start over. Now, you’ll see hello world, then °°°° world, and then, after you have gotten the prompt right without seeing the token a few times, •••• world once the backend has locked it in and it’s properly erased from your computer.
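The staged masking can be sketched like this (a hypothetical illustration — PINPal's actual implementation differs, and the function and parameter names here are invented):

```python
def mask_prompt(tokens, locked, pending):
    """Build a prompt where locked tokens are erased (•) and
    pending ones are hidden but still recoverable (°)."""
    out = []
    for i, tok in enumerate(tokens):
        if i in locked:
            out.append("•" * len(tok))  # erased from storage entirely
        elif i in pending:
            out.append("°" * len(tok))  # hidden, but still stored as a fallback
        else:
            out.append(tok)             # still shown while memorizing
    return " ".join(out)

print(mask_prompt(["hello", "world"], locked=set(), pending={0}))  # °°°°° world
print(mask_prompt(["hello", "world"], locked={0}, pending=set()))  # ••••• world
```

The key point is the middle state: a `°`-masked token is hidden from view but can still be revealed if you break your streak, whereas a `•`-masked token is gone for good.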

If you get the prompt wrong, breaking your streak reveals the recently-hidden token until you get it right again. I also did a new release on that same livestream, so if this update sounds like it might make the memorization process more appealing, check it out via pip install pinpal today.

Right now this tool is still only extremely for a specific type of nerd — it’s command-line only, and you probably need to hand-customize your shell prompt to invoke it periodically. But I’m working on making it more accessible to a broader audience. It’s open source, of course, so you can feel free to contribute your own code!

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more things like it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor!

  1. Your email account password can be stored in your password manager, of course, but given that email is the root-of-trust reset factor for so many things, being able to remember that password is very helpful in certain situations. 

  2. Funny story: at one point, Apple had an outage which made it briefly appear as if a lot of people needed to reset their iCloud passwords, myself included. Because I’d been testing PINPal a bunch, I actually had several highly secure random passwords already memorized. It was a strange feeling to just respond to the scary password reset prompt with a new, highly secure password and just continue on with my day secure in the knowledge I wouldn't forget it. 

Categories: FLOSS Project Planets

Seth Michael Larson: Quickly visualizing an SBOM document

Tue, 2025-01-14 19:00
Have you ever had a Software Bill-of-Materials (SBOM) document and just want to look at the dang thing? Preferably using CLI tools? SBOMs tend to be quite large for non-trivial software projects, so looking at them in a text editor becomes difficult fast. Many "solutions" for visualizing an SBOM document require running a service, which is something I don't want to do. Here's what you can do to quickly visualize an SBOM document using Anthony Harrison's sbom2dot project, the DOT language, and GraphViz:

# Ensure GraphViz and sbom2dot are installed.
$ dot --version
$ python -m pip install sbom2dot

# Create an SVG from an SBOM!
$ sbom2dot -i bom.cdx.json | dot -Tsvg -o bom.cdx.svg

# Open the SVG, either in an image viewer or browser.
$ firefox bom.cdx.svg

Here's what the visualization looks like for a work-in-progress SBOM generation for the Python Cryptography package:

[Dependency graph visualization: nodes for cryptography and its components (cryptography-cffi, cryptography-openssl, cryptography-x509, and others) connected by edges to Rust crate dependencies such as pyo3, openssl-sys, asn1, syn, and quote; the pem and base64 crates sit off in the top-right corner, disconnected from the main graph.]

Notice the dependencies off on the top-right that aren't a part of the graph? This visualization makes it easy to spot those types of issues. You can fiddle with the options of dot to match how you'd like the visualization to be rendered. Do you have a better method for quick SBOM visualization in the terminal? Send me an email please!
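If you just want a quick textual look rather than an SVG, a few lines of Python can dump an SBOM's dependency edges directly (a sketch, not part of the post's workflow; it assumes a CycloneDX JSON document with the standard top-level "dependencies" array):

```python
import json

def print_edges(path):
    """Print 'ref -> dependency' lines from a CycloneDX SBOM."""
    with open(path) as f:
        bom = json.load(f)
    for dep in bom.get("dependencies", []):
        for child in dep.get("dependsOn", []):
            print(f"{dep['ref']} -> {child}")

# Example usage, assuming the bom.cdx.json from above:
# print_edges("bom.cdx.json")
```

The output is essentially the same edge list that sbom2dot feeds to GraphViz, just without the layout.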
Categories: FLOSS Project Planets

Mike Driscoll: Textual – Switching Screens in Your Terminal

Tue, 2025-01-14 16:09

The Screen is a container for your widgets. These screens occupy the dimensions of your terminal by default. While you can have many different screens in a single application, only one screen may be active at a time.

When you create your App class, Textual will create a screen object implicitly. Yes, Textual requires you to have at least one screen or your application won’t work. If you do not create a new screen or switch to a different one, the default screen is where your widgets will get mounted or composed to.

Screens are a great way to organize your application. Many applications have settings pages, help pages, and more. These are just a few examples of how you can use screens.

Now that you know what a screen is, you’re ready to learn how to create new ones!

Creating Screens

When you create an application, you create a Screen implicitly. But how do you create your own Screen? Fortunately, Textual has made that easy. All you need to do is import the Screen class from textual.screen and extend it as needed.

You can style screens the same way you do other widgets, except for the dimensions as screens are always the same size as your terminal window.

To see how this all works, you will create an application with two screens:

  • Your main screen
  • Your second screen, which will be green

You will be able to switch between the screens using a button. Each screen has its own button and its own event or message handler.

Open up your favorite Python IDE and create a new file called two_screens.py with the following contents:

# two_screens.py
from textual import on
from textual.app import App, ComposeResult
from textual.screen import Screen
from textual.widgets import Button


class GreenScreen(Screen):
    def compose(self) -> ComposeResult:
        self.styles.background = "green"
        yield Button("Main Screen", id="main")

    @on(Button.Pressed, "#main")
    def on_main(self) -> None:
        self.dismiss()


class MainApp(App):
    def compose(self) -> ComposeResult:
        yield Button("Switch", id="switch")

    @on(Button.Pressed, "#switch")
    def on_switch(self) -> None:
        self.push_screen(GreenScreen())


if __name__ == "__main__":
    app = MainApp()
    app.run()

You use Textual’s handy on decorator to match against the button’s id. That keeps the message from bubbling around to other event handlers, which is what could happen if you had used on_button_pressed(), for example.

When you run your application, you will see something like this:

Try clicking the buttons and switching between the screens.

Of course, you don’t need to use buttons at all if you don’t want to. You could use keyboard shortcuts instead. Why not give that a try?

Go back to your Python IDE and create a new file called two_screens_keys_only.py with this code in it:

# two_screens_keys_only.py
from textual.app import App, ComposeResult
from textual.screen import Screen
from textual.widgets import Label


class GreenScreen(Screen):
    BINDINGS = [("escape", "app.pop_screen", "Dismiss the screen")]

    def compose(self) -> ComposeResult:
        self.styles.background = "green"
        yield Label("Second Screen")


class MainApp(App):
    SCREENS = {"green": GreenScreen}
    BINDINGS = [("n", "push_screen('green')", "Green Screen")]

    def compose(self) -> ComposeResult:
        yield Label("Main screen")


if __name__ == "__main__":
    app = MainApp()
    app.run()

Using keyboard shortcuts makes your code a little less verbose. However, since you aren’t using a Footer widget, the shortcuts are not shown on-screen to the user. When you are on the main screen, you must press the letter “n” on your keyboard to switch to the GreenScreen. Then when you want to switch back, you press “Esc” or escape.

Here’s what the screen looks like on the GreenScreen:

Now try using the keys mentioned to swap between the two screens. Feel free to change the keyboard bindings to keys of your own choosing.

Wrapping Up

Textual can do much more with Screens than what is covered in this brief tutorial. However, you can use this information as a great starting point for learning how to add one or more additional screens to your GUI in your terminal.

Play around with these examples and then run over to the Textual documentation to learn about some of the other widgets you can add to bring your application to life.

Want to Learn More?

If you’d like to learn more about Textual, check out my book: Creating TUI Applications with Textual and Python, which you can find on the following websites:

The post Textual – Switching Screens in Your Terminal appeared first on Mouse Vs Python.

Categories: FLOSS Project Planets

Peter Bengtsson: How I run standalone Python in 2025

Tue, 2025-01-14 15:06
`uv run --python 3.12 --with requests python $@` to quickly start a Python interpreter with the `requests` package installed without creating a whole project.
Categories: FLOSS Project Planets

PyCoder’s Weekly: Issue #664: Django vs FastAPI, Interacting With Python, Data Cleaning, and More (Jan. 14, 2025)

Tue, 2025-01-14 14:30

#664 – JANUARY 14, 2025
View in Browser »

Django vs. FastAPI, an Honest Comparison

David has worked with Django for a long time, but recently has done some deeper coding with FastAPI. As a result, he’s able to provide a good contrast between the libraries and why/when you might choose one over the other.
DAVID DAHAN

Ways to Start Interacting With Python

In this video course, you’ll explore the various ways of interacting with Python. You’ll learn about the REPL for quick testing and running scripts, as well as how to work with different IDEs, and Python’s IDLE.
REAL PYTHON course

Optimize Postgres Performance and Reduce Costs with Crunchy Bridge

Discover why YNAB (You Need A Budget) switched to fully managed Postgres on Crunchy Bridge. With a 30% increase in performance and a 10% reduction in costs, YNAB leverages Crunchy Bridge’s seamless scaling, high availability, and expert support to optimize their database management →
CRUNCHY DATA sponsor

Data Cleaning in Data Science

“Real-world data needs cleaning before it can give us useful insights. Learn how you can perform data cleaning in data science on your dataset.”
HELEN SCOTT

SciPy 1.15.0 Released

GITHUB.COM/SCIPY

Pygments 2.19 Released

PYGMENTS.ORG

PyConf Hyderabad Feb 22-23

PYCONFHYD.ORG • Shared by Poruri Sai Rahul

Discussions PEP 8: More Nuanced Alignment Guidance

PYTHON.ORG

Python Jobs Backend Software Engineer (Anywhere)

Brilliant.org

More Python Jobs >>>

Articles & Tutorials Building New Structures for Learning Python

What are the new ways we can teach and share our knowledge about Python? How can we improve the structure of our current offerings and build new educational resources for our audience of Python learners? This week on the show, Real Python core team members Stephen Gruppetta and Martin Breuss join us to discuss enhancements to the site and new ways to learn Python.
REAL PYTHON podcast

Automated Accessibility Audits for Python Web Apps

This article covers how to automatically audit your web apps for accessibility standards. The associated second part covers how to do snapshot testing for the same.
PAMELAFOX.ORG

Integrate Auth0 With Just a Few Lines of Code

Whether your end users are consumers, businesses, or both, Auth0 provides the foundational requirements out of the box allowing you to customize your solution with APIs, 30+ SDKs, and Quickstarts. Try Auth0 free today with up to 25K active users - no credit card needed to sign up →
AUTH0 sponsor

Software Bill of Materials Packaging Proposal

A new Python packaging proposal, PEP 770, introduces SBOM support to tackle the “phantom dependency” problem, making it easier to track non-Python components that security tools often miss.
SOCKET.DEV • Shared by Sarah Gooding

Unpacking kwargs With Custom Objects

You may have unpacked a dictionary using **kwargs mechanism in Python, but did you know you can write this capability into your own classes? This quick TIL article covers how to write a __getitem__() method.
RODRIGO GIRÃO SERRÃO
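As a quick illustration of the idea (a sketch, not the linked article's code): to support `**` unpacking, a class needs both a keys() method and __getitem__():

```python
class Config:
    """A custom object that can be unpacked with ** like a dict."""

    def __init__(self, **settings):
        self._settings = settings

    def keys(self):
        # ** unpacking asks the object for its keys first...
        return self._settings.keys()

    def __getitem__(self, key):
        # ...then looks up each key via __getitem__().
        return self._settings[key]

def connect(host="localhost", port=5432):
    return f"{host}:{port}"

cfg = Config(host="db.example.com", port=5433)
print(connect(**cfg))  # db.example.com:5433
```

Without keys(), Python raises a TypeError when you try `connect(**cfg)`, even if __getitem__() is defined.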

Musings on Tracing in PyPy

What started as an answer to a question on Twitter has turned into a very deep dive on tracing JITs, how they compare to method-based JITs, and how all that works in the alternative Python interpreter PyPy.
CF BOLZ-TEREICK

From Default Line Charts to Journal-Quality Infographics

“Everyone who has used Matplotlib knows how ugly the default charts look.” In this series of posts, Vladimir shares some tricks to make your visualizations stand out and reflect your individual style.
VLADIMIR ZHYVOV

Stupid pipx Tricks

This post talks about pipx, a wrapper around pip that allows you to use Python packages like applications. It covers the strengths and weaknesses of pipx and just what you can do with it.
KARL KNECHTEL

Towards PyPy3.11: An Update

The alternative Python interpreter PyPy is working towards a Python 3.11 compatible release. This post talks about how that is going and the challenges along the way.
PYPY.ORG

Learn SQL With Python

This tutorial teaches the fundamentals of SQL by using Python to build applications that interact with a relational PostgreSQL database.
PATRICK KENNEDY • Shared by Patrick Kennedy

PEP 769: Add a Default Keyword Argument to attrgetter and itemgetter

This proposal aims to enhance the operator module by adding a default keyword argument to the attrgetter and itemgetter functions.
PYTHON.ORG
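Until the PEP lands, a default has to be emulated by hand (a sketch; the proposed default= keyword itself is not yet available in the operator module, and itemgetter_with_default is an invented name):

```python
from operator import itemgetter

data = {"name": "widget"}

# itemgetter today raises KeyError for missing keys:
get_price = itemgetter("price")
try:
    price = get_price(data)
except KeyError:
    price = 0.0  # fall back manually

# A hand-rolled equivalent of the behavior PEP 769 proposes:
def itemgetter_with_default(key, default=None):
    def getter(obj):
        try:
            return obj[key]
        except (KeyError, IndexError):
            return default
    return getter

print(itemgetter_with_default("price", 0.0)(data))  # 0.0
```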

Posit Connect Cloud: Share the Work you Make With Streamlit, FastAPI, Shiny, & Other FOSS Frameworks

Posit Connect Cloud lets you publish, host, and manage Streamlit, Dash & other apps, dashboards, APIs, and more. A centralized platform for sharing Python-based data science, it streamlines deployment and boosts collaboration—amplifying your impact.
POSIT sponsor

Unit Testing vs. Integration Testing

Discover the key differences between unit testing vs integration testing and learn how to automate both with Python.
FEDERICO TROTTA

Why Is hash(-1) == hash(-2) in Python?

Somewhat surprisingly, hash(-1) == hash(-2) in CPython. This post examines how and discovers why this is the case.
OMAIR MAJID
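You can verify the observation in any CPython session (the why — CPython reserves -1 as an error sentinel in its C-level hash protocol, so hash(-1) is adjusted to -2 — is what the article digs into):

```python
# In CPython, a C-level hash function returning -1 signals an error,
# so no object's hash is ever -1; hash(-1) comes out as -2 instead.
print(hash(-1))              # -2
print(hash(-2))              # -2
print(hash(-1) == hash(-2))  # True
```

Note this is a CPython implementation detail, not a language guarantee.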

Projects & Code pydantic-settings: Settings Management Using Pydantic

GITHUB.COM/PYDANTIC

IPychat: An AI Extension for IPython

GITHUB.COM/VINAYAK-MEHTA • Shared by Vinayak Mehta

Migrate a Project From Poetry/Pipenv to uv

GITHUB.COM/MKNIEWALLNER • Shared by Mathieu Kniewallner

TSignal: Thread-Safe Signal/Slot System

GITHUB.COM/TSIGNALDEV • Shared by San Kim

PhotoshopAPI: Photoshop Files Parser

GITHUB.COM/EMILDOHNE

Events Weekly Real Python Office Hours Q&A (Virtual)

January 15, 2025
REALPYTHON.COM

PyData Bristol Meetup

January 16, 2025
MEETUP.COM

PyLadies Amsterdam

January 16, 2025
MEETUP.COM

PyLadies Dublin

January 16, 2025
PYLADIES.COM

Chattanooga Python User Group

January 17 to January 18, 2025
MEETUP.COM

PyCon+Web 2025

January 24 to January 26, 2025
PYCONWEB.COM

Happy Pythoning!
This was PyCoder’s Weekly Issue #664.
View in Browser »

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

Categories: FLOSS Project Planets

Python Morsels: Python's range() function

Tue, 2025-01-14 11:30

The range function can be used for counting upward, counting downward, or performing an operation a number of times.

Table of contents

  1. Counting upwards in Python
  2. Using range with a step value
  3. Counting backwards in Python
  4. The arguments range accepts are similar to slicing
  5. Using range with for loops

Counting upwards in Python

How can you count from 1 to 10 in Python?

You could make a list of all those numbers and then loop over it:

>>> numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> for n in numbers:
...     print(n)
...
1
2
3
4
5
6
7
8
9
10

But that could get pretty tedious. Imagine if we were working with 100 numbers... or 1,000 numbers!

Instead, we could use one of Python's built-in functions: the range function.

The range function accepts a start integer and a stop integer:

>>> for n in range(1, 11):
...     print(n)
...
1
2
3
4
5
6
7
8
9
10

The range function counts upward starting from that start number, and it stops just before that stop number. So we're stopping at 10 here instead of going all the way to 11.

You can also call range with just one argument:

>>> for n in range(5):
...     print(n)
...
0
1
2
3
4

When range is given one argument, it starts at 0, and it stops just before that argument.

So range can accept one argument (the stop value) where it starts at 0, and it stops just before that number. And range can also accept two arguments: a start value and a stop value. But range also accepts a third argument!

Using range with a step value

The range function can accept …
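The article continues with the step value; as a general illustration of the three-argument form (my examples, not the article's):

```python
# range(start, stop, step) counts from start toward stop in steps of step.
print(list(range(0, 10, 2)))   # [0, 2, 4, 6, 8]

# A negative step counts downward, stopping just before stop.
print(list(range(10, 0, -2)))  # [10, 8, 6, 4, 2]
```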

Read the full article: https://www.pythonmorsels.com/range/
Categories: FLOSS Project Planets

Daniel Roy Greenfeld: TIL: Using inspect and timeit together

Tue, 2025-01-14 10:30
Two libraries in Python's standard library that are useful for keeping load testing code all in one module.
Categories: FLOSS Project Planets

Real Python: Building Dictionary Comprehensions in Python

Tue, 2025-01-14 09:00

Dictionary comprehensions are a concise and quick way to create, transform, and filter dictionaries in Python. They can significantly enhance your code’s conciseness and readability compared to using regular for loops to process your dictionaries.

Understanding dictionary comprehensions is crucial for you as a Python developer because they’re a Pythonic tool for dictionary manipulation and can be a valuable addition to your programming toolkit.

In this video course, you’ll learn how to:

  • Create dictionaries using dictionary comprehensions
  • Transform existing dictionaries with comprehensions
  • Filter key-value pairs from dictionaries using conditionals
  • Decide when to use dictionary comprehensions
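The first three bullets above can be sketched with a short example (illustrative data, not from the course):

```python
numbers = [1, 2, 3, 4, 5]

# Create a dictionary with a comprehension.
squares = {n: n ** 2 for n in numbers}

# Transform an existing dictionary.
doubled = {key: value * 2 for key, value in squares.items()}

# Filter key-value pairs with a conditional.
even_squares = {key: value for key, value in squares.items() if value % 2 == 0}

print(squares)       # {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
print(even_squares)  # {2: 4, 4: 16}
```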

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

Django Weblog: Django security releases issued: 5.1.5, 5.0.11, and 4.2.18

Tue, 2025-01-14 09:00

In accordance with our security release policy, the Django team is issuing releases for Django 5.1.5, Django 5.0.11, and Django 4.2.18. These releases address the security issues detailed below. We encourage all users of Django to upgrade as soon as possible.

CVE-2024-56374: Potential denial-of-service vulnerability in IPv6 validation

Lack of upper bound limit enforcement in strings passed when performing IPv6 validation could lead to a potential denial-of-service attack. The undocumented and private functions clean_ipv6_address and is_valid_ipv6_address were vulnerable, as was the django.forms.GenericIPAddressField form field, which has now been updated to define a max_length of 39 characters.

The django.db.models.GenericIPAddressField model field was not affected.
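The 39-character bound follows from the textual form of a full IPv6 address — eight groups of four hex digits plus seven colon separators:

```python
# Longest conventional IPv6 text form: 8 groups × 4 hex digits + 7 colons = 39.
longest = "ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff"
print(len(longest))  # 39
```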

Thanks to Saravana Kumar for the report.

This issue has severity "moderate" according to the Django security policy.

Affected supported versions
  • Django main
  • Django 5.1
  • Django 5.0
  • Django 4.2
Resolution

Patches to resolve the issue have been applied to Django's main, 5.1, 5.0, and 4.2 branches. The patches may be obtained from the following changesets.

CVE-2024-56374: Potential denial-of-service vulnerability in IPv6 validation

The following releases have been issued

The PGP key ID used for this release is Natalia Bidart: 2EE82A8D9470983E

General notes regarding security reporting

As always, we ask that potential security issues be reported via private email to security@djangoproject.com, and not via Django's Trac instance, nor via the Django Forum, nor via the django-developers list. Please see our security policies for further information.

Categories: FLOSS Project Planets

Python Insider: Python 3.14.0 alpha 4 is out

Tue, 2025-01-14 06:46

Hello, three dot fourteen dot zero alpha four!

https://www.python.org/downloads/release/python-3140a4/

This is an early developer preview of Python 3.14

Major new features of the 3.14 series, compared to 3.13

Python 3.14 is still in development. This release, 3.14.0a4, is the fourth of seven planned alpha releases.

Alpha releases are intended to make it easier to test the current state of new features and bug fixes and to test the release process.

During the alpha phase, features may be added up until the start of the beta phase (2025-05-06) and, if necessary, may be modified or deleted up until the release candidate phase (2025-07-22). Please keep in mind that this is a preview release and its use is not recommended for production environments.

Many new features for Python 3.14 are still being planned and written. Among the new major new features and changes so far:

The next pre-release of Python 3.14 will be 3.14.0a5, currently scheduled for 2025-02-11.

More resources And now for something completely different

In Python, you can use Greek letters as constants. For example:

from math import pi as π

def circumference(radius: float) -> float:
    return 2 * π * radius

print(circumference(6378.137))  # 40075.016685578485

Enjoy the new release

Thanks to all of the many volunteers who help make Python Development and these releases possible! Please consider supporting our efforts by volunteering yourself or through organisation contributions to the Python Software Foundation.

Regards from a slushy, slippery Helsinki,

Your release team,
Hugo van Kemenade @hugovk
Ned Deily @nad
Steve Dower @steve.dower
Łukasz Langa @ambv

Categories: FLOSS Project Planets

Eli Bendersky: Reverse mode Automatic Differentiation

Tue, 2025-01-14 06:07

Automatic Differentiation (AD) is an important algorithm for calculating the derivatives of arbitrary functions that can be expressed by a computer program. One of my favorite CS papers is "Automatic differentiation in machine learning: a survey" by Baydin, Pearlmutter, Radul and Siskind (ADIMLAS from here on). While this post attempts to be useful on its own, it serves best as a followup to the ADIMLAS paper - so I strongly encourage you to read that first.

The main idea of AD is to treat a computation as a nested sequence of function compositions, and then calculate the derivative of the outputs w.r.t. the inputs using repeated applications of the chain rule. There are two methods of AD:

  • Forward mode: where derivatives are computed starting at the inputs
  • Reverse mode: where derivatives are computed starting at the outputs

Reverse mode AD is a generalization of the backpropagation technique used in training neural networks. While backpropagation starts from a single scalar output, reverse mode AD works for any number of function outputs. In this post I'm going to be describing how reverse mode AD works in detail.

While reading the ADIMLAS paper is strongly recommended but not required, there is one mandatory pre-requisite for this post: a good understanding of the chain rule of calculus, including its multivariate formulation. Please read my earlier post on the subject first if you're not familiar with it.

Linear chain graphs

Let's start with a simple example where the computation is a linear chain of primitive operations: the Sigmoid function.

This is a basic Python implementation:

import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

To apply the chain rule, we'll break down the calculation of S(x) to a sequence of function compositions, as follows:

\[\begin{align*} f(x)&=-x\\ g(f)&=e^f\\ w(g)&=1+g\\ v(w)&=\frac{1}{w} \end{align*}\]

Take a moment to convince yourself that S(x) is equivalent to the composition v\circ(w\circ(g\circ f))(x).

The same decomposition of sigmoid into primitives in Python would look as follows:

def sigmoid(x):
    f = -x
    g = math.exp(f)
    w = 1 + g
    v = 1 / w
    return v

Yet another representation is this computational graph:

Each box (graph node) represents a primitive operation, along with the name assigned to it (the green rectangle on the right of each box). Arrows (graph edges) represent the flow of values between operations.

Our goal is to find the derivative of S w.r.t. x at some point x_0, denoted as S'(x_0). The process starts by running the computational graph forward with our value of x_0. As an example, we'll use x_0=0.5:

Since all the functions in this graph have a single input and a single output, it's sufficient to use the single-variable formulation of the chain rule.

\[(g \circ f)'(x_0)={g}'(f(x_0)){f}'(x_0)\]

To avoid confusion, let's switch notation so we can explicitly see which derivatives are involved. For f(x) and g(f) as before, we can write the derivatives like this:

\[f'(x)=\frac{df}{dx}\quad g'(f)=\frac{dg}{df}\]

Each of these is a function we can evaluate at some point; for example, we denote the evaluation of f'(x) at x_0 as \frac{df}{dx}(x_0). So we can rewrite the chain rule like this:

\[\frac{d(g \circ f)}{dx}(x_0)=\frac{dg}{df}(f(x_0))\frac{df}{dx}(x_0)\]

Reverse mode AD means applying the chain rule to our computation graph, starting with the last operation and ending at the first. Remember that our final goal is to calculate:

\[\frac{dS}{dx}(x_0)\]

Where S is a composition of multiple functions. The first composition we unravel is the last node in the graph, where v is calculated from w. This is the chain rule for it:

\[\frac{dS}{dw}(x_0)=\frac{d(S \circ v)}{dw}(x_0)=\frac{dS}{dv}(v(x_0))\frac{dv}{dw}(x_0)\]

The formula for S is S(v)=v, so its derivative is 1. The formula for v is v(w)=\frac{1}{w}, so its derivative is -\frac{1}{w^2}. Substituting the value of w computed in the forward pass, we get:

\[\frac{dS}{dw}(x_0)=1\cdot\frac{-1}{w^2}\bigg\rvert_{w=1.61}=-0.39\]

Continuing backwards from v to w:

\[\frac{dS}{dg}(x_0)=\frac{dS}{dw}(x_0)\frac{dw}{dg}(x_0)\]

We've already calculated \frac{dS}{dw}(x_0) in the previous step. Since w=1+g, we know that w'(g)=1, so:

\[\frac{dS}{dg}(x_0)=-0.39\cdot1=-0.39\]

Continuing similarly down the chain, until we get to the input x:

\[\begin{align*} \frac{dS}{df}(x_0)&=\frac{dS}{dg}(x_0)\frac{dg}{df}(x_0)=-0.39\cdot e^f\bigg\rvert_{f=-0.5}=-0.24\\ \frac{dS}{dx}(x_0)&=\frac{dS}{df}(x_0)\frac{df}{dx}(x_0)=-0.24\cdot -1=0.24 \end{align*}\]

We're done; the value of the derivative of the sigmoid function at x=0.5 is 0.24; this can be easily verified with a calculator using the analytical derivative of this function.
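To make the verification concrete, here is a small sketch (not part of the original post's code) that replays the forward and backward passes numerically and checks the result against the analytical derivative S'(x)=S(x)(1-S(x)):

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

x0 = 0.5

# Forward pass, recording every intermediate value.
f = -x0
g = math.exp(f)
w = 1 + g
v = 1 / w

# Backward pass: apply the chain rule from the output back to the input.
dS_dv = 1.0                    # S(v) = v
dS_dw = dS_dv * (-1 / w**2)    # v(w) = 1/w
dS_dg = dS_dw * 1.0            # w(g) = 1 + g
dS_df = dS_dg * math.exp(f)    # g(f) = e^f
dS_dx = dS_df * -1.0           # f(x) = -x

# Cross-check against the analytical derivative S'(x) = S(x)(1 - S(x)).
analytical = sigmoid(x0) * (1 - sigmoid(x0))
assert abs(dS_dx - analytical) < 1e-12
print(f"{dS_dx:.2f}")  # 0.24
```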

As you can see, this procedure is rather mechanical and it's not surprising that it can be automated. Before we get to automation, however, let's review the more common scenario where the computational graph is a DAG rather than a linear chain.

General DAGs

The sigmoid sample we worked through above has a very simple, linear computational graph. Each node has a single predecessor and a single successor; moreover, the function itself has a single input and single output. Therefore, the single-variable chain rule is sufficient here.

In the more general case, we'll encounter functions that have multiple inputs and may also have multiple outputs [1], with internal nodes connected in non-linear patterns. To compute their derivatives, we have to use the multivariate chain rule.

As a reminder, in the most general case we're dealing with a function that has n inputs, denoted a=a_1,a_2\cdots a_n, and m outputs, denoted f_1,f_2\cdots f_m. In other words, the function is mapping f:\mathbb{R}^n\to\mathbb{R}^m.

The partial derivative of output i w.r.t. input j at some point a is:

\[\frac{\partial f_i}{\partial a_j}(a)\]

Assuming f is differentiable at a, then the complete derivative of f w.r.t. its inputs can be represented by the Jacobian matrix:

\[Df(a)=
\begin{bmatrix}
\frac{\partial f_1}{\partial a_1} & \cdots & \frac{\partial f_1}{\partial a_n} \\
\vdots & \ddots & \vdots \\
\frac{\partial f_m}{\partial a_1} & \cdots & \frac{\partial f_m}{\partial a_n}
\end{bmatrix}\]

The multivariate chain rule then states that if we compose f\circ g (and assuming all the dimensions are correct), the derivative is:

\[D(f\circ g)(a)=Df(g(a))\cdot Dg(a)\]

This is the matrix multiplication of Df(g(a)) and Dg(a).

Linear nodes

As a warmup, let's start with a linear node that has a single input and a single output:

In all these examples, we assume the full graph output is S, and its derivative by the node's outputs is \frac{\partial S}{\partial f}. We're then interested in finding \frac{\partial S}{\partial x}. Since f:\mathbb{R}\to\mathbb{R}, the Jacobian is just a scalar:

\[Df=\frac{\partial f}{\partial x}\]

And the chain rule is:

\[D(S\circ f)=DS(f)\cdot Df=\frac{\partial S}{\partial f}\frac{\partial f}{\partial x}\]

No surprises so far - this is just the single variable chain rule!

Fan-in

Let's move on to the next scenario, where f has two inputs:

Once again, we already have the derivative \frac{\partial S}{\partial f} available, and we're interested in finding the derivative of S w.r.t. the inputs.

In this case, f:\mathbb{R}^2\to\mathbb{R}, so the Jacobian is a 1x2 matrix:

\[Df=\left [ \frac{\partial f}{\partial x_1} \quad \frac{\partial f}{\partial x_2} \right ]\]

And the chain rule here means multiplying a 1x1 matrix by a 1x2 matrix:

\[D(S\circ f)=DS(f)\cdot Df= \left [ \frac{\partial S}{\partial f} \right ] \left [ \frac{\partial f}{\partial x_1} \quad \frac{\partial f}{\partial x_2} \right ] = \left [ \frac{\partial S}{\partial f} \frac{\partial f}{\partial x_1} \quad \frac{\partial S}{\partial f} \frac{\partial f}{\partial x_2} \right ]\]

Therefore, we see that the output derivative propagates to each input separately:

\[\begin{align*} \frac{\partial S}{\partial x_1}&=\frac{\partial S}{\partial f} \frac{\partial f}{\partial x_1}\\ \frac{\partial S}{\partial x_2}&=\frac{\partial S}{\partial f} \frac{\partial f}{\partial x_2} \end{align*}\] Fan-out

In the most general case, f may have multiple inputs but its output may also be used by more than one other node. As a concrete example, here's a node with three inputs and an output that's used in two places:

While we denote each output edge from f with a different name, f has a single output! This point is a bit subtle and important to dwell on: yes, f has a single output, so in the forward calculation both f_1 and f_2 will have the same value. However, we have to treat them differently for the derivative calculation, because it's very possible that \frac{\partial S}{\partial f_1} and \frac{\partial S}{\partial f_2} are different!

In other words, we're reusing the machinery of multi-output functions here. If f had multiple outputs (e.g. a vector function), everything would work exactly the same.

In this case, since we treat f as f:\mathbb{R}^3\to\mathbb{R}^2, its Jacobian is a 2x3 matrix:

\[Df= \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \frac{\partial f_1}{\partial x_3} \\ \\ \frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} & \frac{\partial f_2}{\partial x_3} \\ \end{bmatrix}\]

The Jacobian DS(f) is a 1x2 matrix:

\[DS(f)=\left [ \frac{\partial S}{\partial f_1} \quad \frac{\partial S}{\partial f_2} \right ]\]

Applying the chain rule:

\[\begin{align*} D(S\circ f)=DS(f)\cdot Df&= \left [ \frac{\partial S}{\partial f_1} \quad \frac{\partial S}{\partial f_2} \right ] \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \frac{\partial f_1}{\partial x_3} \\ \\ \frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} & \frac{\partial f_2}{\partial x_3} \\ \end{bmatrix}\\ &= \left [ \frac{\partial S}{\partial f_1}\frac{\partial f_1}{\partial x_1}+\frac{\partial S}{\partial f_2}\frac{\partial f_2}{\partial x_1}\qquad \frac{\partial S}{\partial f_1}\frac{\partial f_1}{\partial x_2}+\frac{\partial S}{\partial f_2}\frac{\partial f_2}{\partial x_2}\qquad \frac{\partial S}{\partial f_1}\frac{\partial f_1}{\partial x_3}+\frac{\partial S}{\partial f_2}\frac{\partial f_2}{\partial x_3} \right ] \end{align*}\]

Therefore, we have:

\[\begin{align*} \frac{\partial S}{\partial x_1}&=\frac{\partial S}{\partial f_1}\frac{\partial f_1}{\partial x_1}+\frac{\partial S}{\partial f_2}\frac{\partial f_2}{\partial x_1}\\ \frac{\partial S}{\partial x_2}&=\frac{\partial S}{\partial f_1}\frac{\partial f_1}{\partial x_2}+\frac{\partial S}{\partial f_2}\frac{\partial f_2}{\partial x_2}\\ \frac{\partial S}{\partial x_3}&=\frac{\partial S}{\partial f_1}\frac{\partial f_1}{\partial x_3}+\frac{\partial S}{\partial f_2}\frac{\partial f_2}{\partial x_3} \end{align*}\]

The key point here - which we haven't encountered before - is that the derivatives through f add up for each of its outputs (or for each copy of its output). Qualitatively, it means that the sensitivity of f's input to the output is the sum of its sensitivities across each output separately. This makes logical sense, and mathematically it's just the consequence of the dot product inherent in matrix multiplication.
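To see the summation at work numerically, here is a small sketch with made-up values: a hypothetical 1x2 output gradient DS(f) and a hypothetical 2x3 Jacobian Df, multiplied by hand:

```python
# Hypothetical output gradient DS(f), a 1x2 row vector.
dS_df = [0.5, -2.0]

# Hypothetical 2x3 Jacobian Df: row i holds the partials of f_i.
Df = [[1.0, 2.0, 3.0],
      [4.0, 5.0, 6.0]]

# D(S∘f) = DS(f) · Df: each input's derivative is a sum over f's outputs.
dS_dx = [sum(dS_df[i] * Df[i][j] for i in range(2)) for j in range(3)]
print(dS_dx)  # [-7.5, -9.0, -10.5]
```

Note how each entry of the result mixes contributions from both rows of the Jacobian: that's the fan-out summation expressed as a dot product.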

Now that we understand how reverse mode AD works for the more general case of DAG nodes, let's work through a complete example.

General DAGs - full example

Consider this function (a sample used in the ADIMLAS paper):

\[f(x_1, x_2)=\ln(x_1)+x_1 x_2-\sin(x_2)\]

It has two inputs and a single output; once we decompose it to primitive operations, we can represent it with the following computational graph [2]:

As before, we begin by running the computation forward for the values of x_1,x_2 at which we're interested in finding the derivative. Let's take x_1=2 and x_2=5:

Recall that our goal is to calculate \frac{\partial f}{\partial x_1} and \frac{\partial f}{\partial x_2}. Initially we know that \frac{\partial f}{\partial v_5}=1 [3].

Starting with the v_5 node, let's use the fan-in formulas developed earlier:

\[\begin{align*} \frac{\partial f}{\partial v_4}&=\frac{\partial f}{\partial v_5} \frac{\partial v_5}{\partial v_4}=1\cdot 1=1\\ \frac{\partial f}{\partial v_3}&=\frac{\partial f}{\partial v_5} \frac{\partial v_5}{\partial v_3}=1\cdot -1=-1 \end{align*}\]

Next, let's tackle v_4. It also has a fan-in configuration, so we'll use similar formulas, plugging in the value of \frac{\partial f}{\partial v_4} we've just calculated:

\[\begin{align*} \frac{\partial f}{\partial v_1}&=\frac{\partial f}{\partial v_4} \frac{\partial v_4}{\partial v_1}=1\cdot 1=1\\ \frac{\partial f}{\partial v_2}&=\frac{\partial f}{\partial v_4} \frac{\partial v_4}{\partial v_2}=1\cdot 1=1 \end{align*}\]

On to v_1. It's a simple linear node, so:

\[\frac{\partial f}{\partial x_1}^{(1)}=\frac{\partial f}{\partial v_1} \frac{\partial v_1}{\partial x_1}=1\cdot \frac{1}{x_1}=0.5\]

Note the (1) superscript though! Since x_1 is a fan-out node, it will have more than one contribution to its derivative; we've just computed the one from v_1. Next, let's compute the one from v_2. That's another fan-in node:

\[\begin{align*} \frac{\partial f}{\partial x_1}^{(2)}&=\frac{\partial f}{\partial v_2} \frac{\partial v_2}{\partial x_1}=1\cdot x_2=5\\ \frac{\partial f}{\partial x_2}^{(1)}&=\frac{\partial f}{\partial v_2} \frac{\partial v_2}{\partial x_2}=1\cdot x_1=2 \end{align*}\]

We've calculated the other contribution to the x_1 derivative, and the first out of two contributions for the x_2 derivative. Next, let's handle v_3:

\[\frac{\partial f}{\partial x_2}^{(2)}=\frac{\partial f}{\partial v_3} \frac{\partial v_3}{\partial x_2}=-1\cdot \cos(x_2)=-0.28\]

Finally, we're ready to add up the derivative contributions for the input arguments. x_1 is a "fan-out" node, with two outputs. Recall from the section above that we just sum their contributions:

\[\frac{\partial f}{\partial x_1}=\frac{\partial f}{\partial x_1}^{(1)}+\frac{\partial f}{\partial x_1}^{(2)}=0.5+5=5.5\]

And:

\[\frac{\partial f}{\partial x_2}=\frac{\partial f}{\partial x_2}^{(1)}+\frac{\partial f}{\partial x_2}^{(2)}=2-0.28=1.72\]

And we're done! Once again, it's easy to verify - using a calculator and the analytical derivatives of f(x_1,x_2) - that these are the right derivatives at the given points.
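As a quick sanity check (a sketch, not the post's code): the analytical partials are \frac{\partial f}{\partial x_1}=\frac{1}{x_1}+x_2 and \frac{\partial f}{\partial x_2}=x_1-\cos(x_2), and a couple of lines of Python confirm the numbers above:

```python
import math

x1, x2 = 2.0, 5.0

# Analytical partial derivatives of f(x1, x2) = ln(x1) + x1*x2 - sin(x2).
df_dx1 = 1 / x1 + x2
df_dx2 = x1 - math.cos(x2)

print(round(df_dx1, 2), round(df_dx2, 2))  # 5.5 1.72
```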

Backpropagation in ML, reverse mode AD and VJPs

A quick note on reverse mode AD vs. forward mode (please read the ADIMLAS paper for more details):

Reverse mode AD is the approach commonly used for machine learning and neural networks, because these tend to have a scalar loss (or error) output that we want to minimize. In reverse mode, we have to run AD once per output, while in forward mode we'd have to run it once per input. Therefore, when the input size is much larger than the output size (as is the case in NNs), reverse mode is preferable.

There's another advantage, and it relates to the term vector-jacobian product (VJP) that you will definitely run into once you start digging deeper in this domain.

The VJP is basically a fancy way of saying "using the chain rule in reverse mode AD". Recall that in the most general case, the multivariate chain rule is:

\[D(f\circ g)(a)=Df(g(a))\cdot Dg(a)\]

However, in the case of reverse mode AD, we typically have a single output from the full graph, so Df(g(a)) is a row vector. The chain rule then means multiplying this row vector by a matrix representing the node's jacobian. This is the vector-jacobian product, and its output is another row vector. Scroll back to the Fan-out sample to see an example of this.

This may not seem very profound so far, but it carries an important meaning in terms of computational efficiency. For each node in the graph, we don't have to store its complete jacobian; all we need is a function that takes a row vector and produces the VJP. This is important because jacobians can be very large and very sparse [4]. In practice, this means that when AD libraries define the derivative of a computation node, they don't ask you to register a complete jacobian for each operation, but rather a VJP.

This also provides an additional way to think about the relative efficiency of reverse mode AD for ML applications; since a graph typically has many inputs (all the weights), and a single output (scalar loss), accumulating from the end going backwards means the intermediate products are VJPs that are row vectors; accumulating from the front would mean multiplying full jacobians together, and the intermediate results would be matrices [5].
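As an illustration of why registering a VJP beats storing a full jacobian, consider elementwise sin applied to a vector: its jacobian is an n x n diagonal matrix with \cos(x_i) on the diagonal, but the VJP never needs to materialize it. This is a sketch; the function name sin_vjp is made up for illustration:

```python
import math

# VJP for elementwise sin: scales the incoming row vector by cos(x_i),
# without ever building the n x n diagonal Jacobian.
def sin_vjp(x, v):
    return [vi * math.cos(xi) for xi, vi in zip(x, v)]

x = [0.0, math.pi / 2, math.pi]
v = [1.0, 1.0, 1.0]  # incoming gradient (row vector)
print(sin_vjp(x, v))  # approximately [1.0, 0.0, -1.0]
```

For n inputs this costs O(n) time and space, versus O(n^2) for storing and multiplying the explicit jacobian.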

A simple Python implementation of reverse mode AD

Enough equations, let's see some code! The whole point of AD is that it's automatic, meaning that it's simple to implement in a program. What follows is the simplest implementation I could think of; it requires one to build expressions out of a special type, which can then calculate gradients automatically.

Let's start with some usage samples; here's the Sigmoid calculation presented earlier:

xx = Var(0.5)
sigmoid = 1 / (1 + exp(-xx))
print(f"xx = {xx.v:.2}, sigmoid = {sigmoid.v:.2}")

sigmoid.grad(1.0)
print(f"dsigmoid/dxx = {xx.gv:.2}")

We begin by building the Sigmoid expression using Var values (more on this later). We can then run the grad method on a Var, with an output gradient of 1.0 and see that the gradient for xx is 0.24, as calculated before.

Here's the expression we used for the DAG section:

x1 = Var(2.0)
x2 = Var(5.0)
f = log(x1) + x1 * x2 - sin(x2)
print(f"x1 = {x1.v:.2}, x2 = {x2.v:.2}, f = {f.v:.2}")

f.grad(1.0)
print(f"df/dx1 = {x1.gv:.2}, df/dx2 = {x2.gv:.2}")

Once again, we build up the expression, then call grad on the final value. It will populate the gv attributes of input Vars with the derivatives calculated w.r.t. these inputs.

Let's see how Var works. The high-level overview is:

  • A Var represents a node in the computational graph we've been discussing in this post.
  • Using operator overloading and custom math functions (like the exp, sin and log seen in the samples above), when an expression is constructed out of Var values, we also build the computational graph in the background. Each Var has links to its predecessors in the graph (the other Vars that feed into it).
  • When the grad method is called, it runs reverse mode AD through the computational graph, using the chain rule.

Here's the Var class:

class Var:
    def __init__(self, v):
        self.v = v
        self.predecessors = []
        self.gv = 0.0

v is the value (forward calculation) of this Var. predecessors is the list of predecessors, each of this type:

from dataclasses import dataclass

@dataclass
class Predecessor:
    multiplier: float
    var: "Var"

Consider the v5 node in the DAG sample, for example. It represents the calculation v4-v3. The Var representing v5 will have a list of two predecessors, one for v4 and one for v3. Each of these will have a "multiplier" associated with it:

  • For v3, Predecessor.var points to the Var representing v3 and Predecessor.multiplier is -1, since this is the derivative of v5 w.r.t. v3
  • Similarly, for v4, Predecessor.var points to the Var representing v4 and Predecessor.multiplier is 1.

Let's see some overloaded operators of Var [6]:

def __add__(self, other):
    other = ensure_var(other)
    out = Var(self.v + other.v)
    out.predecessors.append(Predecessor(1.0, self))
    out.predecessors.append(Predecessor(1.0, other))
    return out

# ...

def __mul__(self, other):
    other = ensure_var(other)
    out = Var(self.v * other.v)
    out.predecessors.append(Predecessor(other.v, self))
    out.predecessors.append(Predecessor(self.v, other))
    return out

And some of the custom math functions:

def log(x):
    """log(x) - natural logarithm of x"""
    x = ensure_var(x)
    out = Var(math.log(x.v))
    out.predecessors.append(Predecessor(1.0 / x.v, x))
    return out

def sin(x):
    """sin(x)"""
    x = ensure_var(x)
    out = Var(math.sin(x.v))
    out.predecessors.append(Predecessor(math.cos(x.v), x))
    return out

Note how the multipliers for each node are exactly the derivatives of its output w.r.t. corresponding input. Notice also that in some cases we use the forward calculated value of a Var's inputs to calculate this derivative (e.g. in the case of sin(x), the derivative is cos(x), so we need the actual value of x).

Finally, this is the grad method:

def grad(self, gv):
    self.gv += gv
    for p in self.predecessors:
        p.var.grad(p.multiplier * gv)

Some notes about this method:

  • It has to be invoked on a Var node that represents the entire computation.
  • Since this function walks the graph backwards (from the outputs to the inputs), this is the direction our graph edges are pointing (we keep track of the predecessors of each node, not the successors).
  • Since we typically want the derivative of some output "loss" w.r.t. each Var, the computation will usually start with grad(1.0), because the output of the entire computation is the loss.
  • For each node, grad adds the incoming gradient to its own, and propagates the incoming gradient to each of its predecessors, using the relevant multiplier.
  • The addition self.gv += gv is key to managing nodes with fan-out. Recall our discussion from the DAG section - according to the multivariate chain rule, fan-out nodes' derivatives add up for each of their outputs.
  • This implementation of grad is very simplistic and inefficient because it will process the same Var multiple times in complex graphs. A more efficient implementation would sort the graph topologically first and then would only have to visit each Var once.
  • Since the gradient of each Var adds up, one shouldn't reuse Vars between different computations. Once grad has run, the Var should not be used for other grad calculations.
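For illustration, here is one way the topological-sort variant mentioned above might look. This is a sketch, not the post's actual code; it keeps only __add__ and __mul__ and omits ensure_var for brevity:

```python
from dataclasses import dataclass

@dataclass
class Predecessor:
    multiplier: float
    var: "Var"

class Var:
    def __init__(self, v):
        self.v = v
        self.predecessors = []
        self.gv = 0.0

    def __add__(self, other):
        out = Var(self.v + other.v)
        out.predecessors.append(Predecessor(1.0, self))
        out.predecessors.append(Predecessor(1.0, other))
        return out

    def __mul__(self, other):
        out = Var(self.v * other.v)
        out.predecessors.append(Predecessor(other.v, self))
        out.predecessors.append(Predecessor(self.v, other))
        return out

    def grad(self, gv):
        # Topologically sort the graph so each Var is visited exactly
        # once, after all the Vars that depend on it.
        order, seen = [], set()
        def visit(var):
            if id(var) not in seen:
                seen.add(id(var))
                for p in var.predecessors:
                    visit(p.var)
                order.append(var)
        visit(self)
        self.gv = gv
        for var in reversed(order):
            for p in var.predecessors:
                p.var.gv += p.multiplier * var.gv

x = Var(3.0)
y = x * x + x  # dy/dx = 2x + 1 = 7 at x = 3
y.grad(1.0)
print(x.gv)  # 7.0
```

Because each Var is processed exactly once, each edge's gradient contribution is added exactly once, so the cost stays linear in the graph size even for heavily shared subexpressions.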

The full code for this sample is available here.

Conclusion

The goal of this post is to serve as a supplement for the ADIMLAS paper; once again, if the topic of AD is interesting to you, I strongly encourage you to read the paper! I hope this post added something on top - please let me know if you have any questions.

Industrial strength implementations of AD, like autograd and JAX, have much better ergonomics and performance than the toy implementation shown above. That said, the underlying principles are similar - reverse mode AD on computational graphs.

I'll discuss an implementation of a more sophisticated AD system in a followup post.

[1] In this post we're only looking at single-output graphs, since these are typically sufficient in machine learning (the output is some scalar "loss" or "error" that we're trying to minimize). That said, for functions with multiple outputs the process is very similar - we just have to run the reverse mode AD process for each output variable separately.

[2] Note that the notation here is a bit different from the one used for the sigmoid function. This notation is adopted from the ADIMLAS paper, which uses v_i for all temporary values within the graph. I'm keeping the notations different to emphasize they have absolutely no bearing on the math and the AD algorithm. They're just a naming convention.

[3] For consistency, I'll be using the partial derivative notation throughout this example, even for nodes that have a single input and output.

[4] For an example of gigantic, sparse jacobians see my older post on backpropagation through a fully connected layer.

[5] There are a lot of additional nuances here to explain; I strongly recommend this excellent lecture by Matthew Johnson (of JAX and autograd fame) for a deeper overview.

[6] These use the utility function ensure_var; all it does is wrap its argument in a Var if it's not already a Var. This is needed to wrap constants in the expression, to ensure that the computational graph includes everything.
Categories: FLOSS Project Planets

Python Software Foundation: Powering Python together in 2025, thanks to our community!

Tue, 2025-01-14 03:35

We are so very grateful for each of you who donated or became new members during our end-of-year fundraiser and membership drive. We raised $30,000 through the PyCharm promotion offered by JetBrains: WOW! Including individual donations, Supporting Memberships, donations to our Fiscal Sponsorees, and JetBrains' generous partnership, we raised around $99,000 for the PSF's mission supporting Python and its community.

Your generous support means we can dive into 2025 ready to invest in our key goals for the year. Some of our goals include:

  • Embrace the opportunities and tackle the challenges that come with scale
  • Foster long term sustainable growth- for Python, the PSF, and the community
  • Improve workflows through iterative improvement in collaboration with the community

Each bit of investment from the Python community—money, time, energy, ideas, and enthusiasm—helps us to reach these goals!

We want to specifically call out to our new members: welcome aboard, thank you for joining us, and we are so appreciative of you! We’re looking forward to having your voice take part in the PSF’s future. If you aren’t a member of the PSF yet, check out our Membership page, which includes details about our sliding scale memberships. We are happy to welcome new members any time of year!

As always, we want to thank those in the community who took the time to share our posts on social media and their local or project based networks. We’re excited about what 2025 has in store for Python and the PSF, and as always, we’d love to hear your ideas and feedback. Looking for how to keep in touch with us? You can find all the ways in our "Where to find the PSF?" blog post.

We wish you a perfectly Pythonic year ahead!
- The PSF Team

P.s. Want to continue to help us make an impact? Check out our “Do you know the PSF's next sponsor?” blog post and share with your employer!

Categories: FLOSS Project Planets

Python⇒Speed: Catching memory leaks with your test suite

Mon, 2025-01-13 19:00

Resource leaks are an unpleasant type of bug. Little by little your program uses more memory, or more file descriptors, or some other limited resource. Everything seems fine, until you run out, and now your program is dead.

In many cases you can catch these sorts of bugs in advance by tweaking your test suite. Or, after you've discovered such a bug, you can use your test suite to identify what is causing it. In this article we'll cover:

  • An example of a memory leak.
  • When your test suite may be a good way to identify the causes of leaks.
  • How to catch leaks using pytest.
  • Other types of leaks.
Read more...
Categories: FLOSS Project Planets
