Planet Python

Planet Python - http://planetpython.org/

Updated: 14 hours 1 min ago

Programiz: Python List

Wed, 2024-11-27 23:55

In this tutorial, we will learn about Python lists (creating lists, changing list items, removing items, and other list operations) with the help of examples.

Categories: FLOSS Project Planets

Bruno Ponne / Coding The Past: How to calculate Z-Scores in Python

Wed, 2024-11-27 19:00

If you’ve worked with statistical data, you’ve likely encountered z-scores. A z-score measures how far a data point is from the mean, expressed in terms of standard deviations. It helps identify outliers and compare data distributions, making it a vital tool in data science.

In this guide, we’ll show you how to calculate z-scores in Python using a custom function and built-in libraries like SciPy. You’ll also learn to visualize z-scores for better insights.

1. What is a z-score?

A z-score measures how many standard deviations a data point is from the mean. The formula for calculating the z-score of a data point X is:

\[Z_{X} = \frac{X - \overline{X}}{S}\]

Where:

$Z_{X}$ is the z score of the point $X$;
$X$ is the value for which we want to calculate the Z score;
$\overline{X}$ is the mean of the sample;
$S$ is the standard deviation of the sample.

2. Python z score using a custom function

A custom function allows you to implement the z-score formula directly. Here’s how to define and use it in Python:

content_copy Copy

def calculate_z(X, X_mean, X_sd): return (X - X_mean) / X_sd

The function takes three arguments:

a vector X of values for which you want to calculate the z-scores, like a pandas dataframe column, for example;
the mean of the values in X;
the standard deviation of the values in X.

Finally, in the return clause, we apply the z-score formula explained above.

To test our function, we will use data from Playfair (1821). He collected data regarding the price of wheat and the typical weekly wage for a “good mechanic” in England from 1565 to 1821. His objective was to show how well-off working men were in the 19th century. This dataset is available in the HistData R package and also on the webpage of Professor Vincent Arel-Bundock, a great source of datasets. It consists of 3 variables: year, price of wheat (in Shillings) and weekly wages (in Shillings).

We will be calculating the z-scores for the weekly wages. First we load the dataset directly from the website, as indicated in the code below.

content_copy Copy

import pandas as pd data = pd.read_csv("https://vincentarelbundock.github.io/Rdatasets/csv/HistData/Wheat.csv") print(data['Wages'].mean()) print(data['Wages'].std()) data["z-score_wages"] = calculate_z(data["Wages"], data["Wages"].mean(), data["Wages"].std())

The average weekly wage during the period was 11.58 Shillings, with a standard deviation of 7.34. With this information, we can calculate the Z score for each observation in the dataset. This is done and stored in a new column called “z-score_wages”.

If you check the first row of the data frame, you will find out that in 1565 the z score was around -0.9, that is, the wages were 0.9 standard deviations below the mean of the values for the whole period.

3. Python z score using SciPy

A second option to calculate z-scores in Python is to use the zscore method of the SciPy library as shown below. Ensure you set a policy for handling missing values if your dataset is incomplete.

In the code below, we calculate the z-scores for Wheat prices. If you look at the z-score summary statistics, you will see that the price of wheat varied between -1.13 and 3.65 standard deviations away from the mean in the observed period.

content_copy Copy

from scipy import stats data["z-score_wheat"] = stats.zscore(data["Wheat"], nan_policy="omit") data["z-score_wheat"].describe()

3. Visualising z scores

Below you can better visualize the basic idea of z scores: to measure how far away a data point is from the mean in terms of standard deviations. This visualization was created in D3, a JavaScript library for interactive data visualization. Click “See average wage” to see the averave wage for the whole period. Then check out how far from the mean each data point is and finally note that the z-score consists of this distance in terms of standard deviation.

1. See Average Wage 2. See Distance to the Mean 3. See Z-Scores Reset

4. Visualizing z scores with Matplotlib

The code below plots the wage z scores over time and shows them as the distance from the point to the mean, as demonstrated in the D3 visualization above. Please consult the lesson ‘Storytelling with Matplotlib - Visualizing historical data’ to learn more about Matplotlib visualizations.

content_copy Copy

# Calculate mean wage mean_wage = data["z-score_wages"].mean() # Create the plot fig, ax = plt.subplots(figsize=(10, 6)) # Scatter plot of wages over years ax.plot(data["Year"], data["z-score_wages"], 'o', color='#FF6885', label="Wage Z-scores", markeredgewidth=0.5) # Add a horizontal line for the mean wage ax.axhline(y=mean_wage, color='gray', linestyle='dashed', label=f"Mean Z-score = {mean_wage:.2f}") # Add gray lines connecting points to the mean for year, wage in zip(data["Year"], data["z-score_wages"]): ax.plot([year, year], [mean_wage, wage], color='gray', linestyle='dotted', linewidth=1) # Customize the plot ax.set_xlabel("Year") ax.set_ylabel("Z-scores") ax.set_title("Z-scores Over Time") ax.legend() # Show the plot plt.show()

Have questions or insights? Leave a comment below, and I’ll be happy to help.

Happy coding!

Conclusions

A z score is a measure of how many standard deviations a data point is away from the mean. It can be easily calculated in Python;
You can visualize z-scores using traditional python libraries like Matplotlib or Seaborn.

Categories: FLOSS Project Planets

Real Python: Continuous Integration and Deployment for Python With GitHub Actions

Wed, 2024-11-27 09:00

Creating software is an achievement worth celebrating. But software is never static. Bugs need to be fixed, features need to be added, and security demands regular updates. In today’s landscape, with agile methodologies dominating, robust DevOps systems are crucial for managing an evolving codebase. That’s where GitHub Actions shine, empowering Python developers to automate workflows and ensure their projects adapt seamlessly to change.

GitHub Actions for Python empowers developers to automate workflows efficiently. This enables teams to maintain software quality while adapting to constant change.

Continuous Integration and Continuous Deployment (CI/CD) systems help produce well-tested, high-quality software and streamline deployment. GitHub Actions makes CI/CD accessible to all, allowing automation and customization of workflows directly in your repository. This free service enables developers to execute their software development processes efficiently, improving productivity and code reliability.

In this tutorial, you’ll learn how to:

Use GitHub Actions and workflows
Automate linting, testing, and deployment of a Python project
Secure credentials used for automation
Automate security and dependency updates

This tutorial will use an existing codebase, Real Python Reader, as a starting point for which you’ll create a CI/CD pipeline. You can fork the Real Python Reader code on GitHub to follow along. Be sure to deselect the Copy the master branch only option when forking. Alternatively, if you prefer, you can build your own Real Python Reader using a previous tutorial.

In order to get the most out of this tutorial, you should be comfortable with pip, building Python packages, Git, and have some familiarity with YAML syntax.

Before you dig into GitHub Actions, it may be helpful to take a step back and learn about the benefits of CI/CD. This will help you understand the kinds of problems that GitHub Actions can solve.

Get Your Code: Click here to download the free sample code you’ll use to learn about CI/CD for Python With GitHub Actions.

Take the Quiz: Test your knowledge with our interactive “GitHub Actions for Python” quiz. You’ll receive a score upon completion to help you track your learning progress:

Interactive Quiz

GitHub Actions for Python

In this quiz, you'll test your understanding of GitHub Actions for Python. By working through this quiz, you'll revisit how to use GitHub Actions and workflows to automate linting, testing, and deployment of a Python project.

Unlocking the Benefits of CI/CD

Continuous Integration (CI) and Continuous Deployment (CD), commonly known as CI/CD, are essential practices in modern software development. These practices automate the integration of code changes, the execution of tests, and the deployment of applications. This helps teams and open-source contributors to deliver code changes more frequently in a reliable and structured way.

Moreover, when publishing open-source Python packages, CI/CD will ensure that all pull requests (PRs) and contributions to your package will meet the needs of the project while standardizing the code quality.

Note: To learn more about what a pull request is and how to create one, you can read GitHub’s official documentation.

More frequent deployments with smaller code changes reduce the risk of unintended breaking changes that can occur with larger, more complex releases. For example, even though developers can format all code using the same linting tools and rules, policy can automatically block PRs from being merged if the code’s tests don’t pass.

In the next section, you’ll learn how GitHub Workflows can help you implement CI/CD on a repository hosted on GitHub.

Exploring GitHub Workflows

GitHub Workflows are a powerful feature of GitHub Actions. They allow you to define custom automation workflows for your repositories. Whether you want to build, test, or deploy your code, GitHub Workflows provide a flexible and customizable solution that any project on GitHub can use for free, whether the repository is public or private.

Even though there are many CI/CD providers, GitHub Actions has become the default among open-source projects on GitHub because of its expansive ecosystem, flexibility, and low or no cost.

Anatomy of a Workflow File

Workflow files are declaratively written YAML files with a predefined structure that must be adhered to for a workflow to run successfully. Your YAML workflow files are stored and defined in a .github/workflows/ folder in your project’s root directory.

Your workflow folder can have multiple workflow files, each of which will perform a certain task. You can name these workflow files anything you’d like. However, for the sake of simplicity and readability, it’s common practice to name them after the tasks they achieve, such as test.yml.

Each file has a few elements that are required, but many, many more that are optional. The GitHub Actions documentation is thorough and well-written, so be sure to check it out after you’ve finished reading this tutorial.

There are three main parts that make up the bulk of a workflow file: triggers, jobs, and steps. You’ll cover these in the next sections.

Workflow Triggers Read the full article at https://realpython.com/github-actions-python/ »

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

PyPodcats: Episode 7: With Anna Makarudze

Wed, 2024-11-27 07:00

Learn about Anna's journey. Since discovering Python and Django in 2015, Anna has been actively contributing to the Django community and taking up leadership roles within DSF and Django Girls.Learn about Anna's journey. Since discovering Python and Django in 2015, Anna has been actively contributing to the Django community and taking up leadership roles within DSF and Django Girls.

We interviewed Anna Makarudze from Zimbabwe to hear about her inspiring journey.

Since discovering Python and Django in 2015, Anna has been actively involved in the Django community. She helped organize PyCon Zimbabwe, and she has coached at Django Girls in Harare and Windhoek.

She served on the Board of Directors at Django Software Foundation for five years, and she is currently a Django Girls Foundation Trustee & Fundraising Coordinator.

During her journey in tech, Anna became aware of the lack of representation of women in tech industry, something that became more evident as she attended Django Under the Hood in 2016 where most of the attendees were white men, and only a few are women. That’s when she realized the importance of communities like Django Girls in supporting more women in the Django Community.

In this chat, Anna shared ways on how you can contribute and help support Django Girls+ Foundation.

She is currently studying her master’s degree in Blekinge Institute of Technology in Karlskrona, Sweden.

Be sure to listen to the episode to learn all about Anna’s inspiring story!

Topic discussed

Introductions
Her role in Django Girls+ Foundation
Moving from Zimbabwe to Sweden, and adjusting to her new life
Django Girls tutorials and website
How to contribute to Django Girls website
Discovering Python, Flask, and Django in 2015
Attending Django Under the Hood in 2016
Representation of women in tech
The role of Django Girls+ in tech
How to help and support Django Girls+
What to expect from Django Girls+ in 2025 onwards

Links from the show

Django Girls+: https://djangogirls.org/en/
Blekinge Institute of Technology: https://bth.se/
Muzinda Hub: https://x.com/muzindahub
Treehouse: https://teamtreehouse.com/
Django Software Foundation: https://www.djangoproject.com/foundation/
Django Under the Hood: https://djangounderthehood.com/

Links from Django Girls+

Coach at an event: https://coach.djangogirls.org/
Organize an event: https://organize.djangogirls.org/
Work on the website: https://github.com/DjangoGirls/djangogirls
Contribute to our resources:
Translate our resources:
- https://github.com/DjangoGirls/translation-guide
Donate to us:
For corporate companies: https://djangogirls.org/en/donate/

Categories: FLOSS Project Planets

Django Weblog: Django 6.x Steering Council Candidate Registration

Wed, 2024-11-27 02:00

Following our announcement of the 6.x Steering Council elections, today we open candidate registrations. Registrations will be open until December 4 2024 at 23:59 Anywhere on Earth.

Eligibility

Candidate eligibility requirements are defined in DEP 12: The Steering Council. To be qualified for elections, we require both of the following:

A history of substantive contributions to Django or the Django ecosystem. This history must begin at least 18 months prior to the individual's candidacy for the Steering Council, and include substantive contributions in at least two of these bullet points:
- Code contributions on Django projects or major third-party packages in the Django ecosystem
- Reviewing pull requests and/or triaging Django project tickets
- Documentation, tutorials or blog posts
- Discussions about Django on the django-developers mailing list or the Django Forum
- Running Django-related events or user groups
A history of engagement with the direction and future of Django. This does not need to be recent, but candidates who have not engaged in the past three years must still demonstrate an understanding of Django's changes and direction within those three years.

If you have questions about the election please contact foundation@djangoproject.com or ask on the Django forum.

Categories: FLOSS Project Planets

Brett Cannon: What the PSF Conduct WG does

Tue, 2024-11-26 18:28

In the past week I had two people separately tell me what they thought the PSF Conduct WG did and both were wrong (and incidentally in the same way). As such, I wanted to clarify what exactly the WG does for people in case others also misunderstand what the group does.

⚠️I am a member of the PSF Conduct WG (whose membership you can see by checking the charter), and have been for a few years now. That means I both speak from experience but I also may be biased in some way that I&aposm not aware of. But since this post is meant to be objective I&aposm hoping there aren&apost any concerns about bias.🔔There are a myriad of conduct groups in the Python community beyond the PSF Conduct WG, and they all work differently. For example, conferences like PyCon US have their own, the Django and NumFOCUS communities have their own, etc. This post is about a specific group and does not represent other ones.

I would say there are 4 things the Conduct WG actually does (in order from least to most frequent):

Maintain the PSF Code of Conduct (CoC)
Let PSF members know when they have gone against the CoC in a public space
Record disciplinary actions taken by groups associated with the PSF
Provide conduct advice to Python groups

Let&aposs talk about what each of these mean.

Maintain the CoC

In September 2019 the CoC was rewritten from a two paragraph "don&apost be mean" CoC to a more professional one. That rewrite actually is what led to the establishment of the Conduct WG in the first place. Since then, the Conduct WG is in charge of making any changes as necessary to the document. But ever since the rewrite was completed, it is rarely touched.

Let PSF members know when they have gone against the CoC publicly

Becoming a member of the PSF requires that you "agree to the community Code of Conduct". As such, if you are found to be running afoul of the CoC publicly where you also declare your PSF membership, then the Conduct WG will reach out to you and kindly let you know what you did wrong and to please not do that (technically you could get referred to the PSF board to have your membership revoked if you did something really bad, but I&aposm not aware of that ever happening).

But there are two key details about this work of the WG that I think people don&apost realize that are important. One is the Conduct WG does not go out on the internet looking for members who have done something that&aposs in violation of the CoC. What happens instead is people report to the WG when they have seen a PSF member behave poorly in public while promoting their PSF membership (and this tends to be Fellows more than the general members).

Two, this is (so far) only an issue if you promote the fact that you&aposre a PSF member. What you do in your life outside of Python is none of the WG&aposs concern, but if you, e,g., call out your PSF affiliation on your profile on X and then post something that goes against the CoC, then that&aposs a problem as that then reflects poorly on the PSF and the rest of the membership. Now, if someone were to very publicly come out as a member of some heinous organization even without talking about Python then that might be enough to warrant the Conduct WG saying something to the PSF board (and this probably applies more to Fellows than general members), but I haven&apost seen that happen.

Record CoC violations

If someone violates the CoC, some groups report them to the Conduct WG and we record who violated the CoC, how they violated, and what action was taken. The reason for this is to see if someone is jumping from group to group, causing conduct issues, but in a way that the larger pattern isn&apost being noticed by individual groups. But to be honest, not many groups report things (it is one more thing to do after dealing with a conduct issue which is exhausting on its own), and typically people who run afoul of the CoC where a pattern would be big enough to cause concern usually do it enough in one place as well, so the misconduct is noticed regardless.

Provide advice

The most common thing the Conduct WG does, by far, is provide advice to other groups who ask us for said advice based on the WG&aposs training and expertise. This can range from, "can you double-check our logic and reaction as a neutral 3rd-party?" to, "can you provide a recommendation on how to handle this situation?"

While this might be the thing the Conduct WG does the most, it also seems to be the most misunderstood. For instance, much like with emailing PSF members when they have violated the CoC publicly while promoting their PSF membership, the Conduct WG does not go out looking for people causing trouble. This is entirely driven by people coming to the WG with a problem. The closest thing I can think of the Conduct WG doing in terms of proactively reaching out is some group that got a grant from the PSF Grants WG did something wrong around the CoC that was reported to the Conduct WG that warrants us notifying the Grants WG of the problem. But the Conduct WG isn&apost snooping around the internet looking for places to give advice.

I have also heard folks say the Conduct WG "demanded" something, or "made" something happen. That is simply not true. The Conduct WG has no power to compel some group to do something (i.e. things like moderation and enforcement is handled by the folks who come to the Conduct WG asking for advice). As an example, let&aposs say the Python steering council came to the Conduct WG asking for advice (and that could be as open-ended as "what do you recommend?" to "we are thinking of doing this; does that seem reasonable to you?"). The Conduct WG would provide the advice requested, and that&aposs the end of it. The Conduct WG advised in this hypothetical, it didn&apost require anything. The SC can choose to enact the advice, modify it in some way, or flat-out ignore it; the Conduct WG cannot make the SC do anything (heck, the SC isn&apost even under the PSF&aposs jurisdiction, but that&aposs not an important detail here, just something else I have heard people get wrong). And this inability to compel a group to do something even extends to groups that come to the Conduct WG for advice even if they are affiliated with the PSF. Going back to the Grants WG example, we can&apost make the Grants WG pull someone&aposs grant or deny future grants, we can just let them know what we think. We can refer an issue to the PSF board, but we can&apost compel the board to do anything (e.g., if we warn a PSF member about their public conduct, we can&apost make them stop being a PSF member for it, the most we can do is inform the PSF board about what someone has done and potentially offer advice).

Having said all of that, anecdotally it seems that most groups that request a recommendation from the Conduct WG enact those recommendations. So you could say the Conduct WG was involved in some action that was taken based on the WG&aposs recommendation, but you certainly cannot assign full blame on the WG for the actions taken by other groups either.

Categories: FLOSS Project Planets

Kay Hayen: Nuitka Release 2.5

Tue, 2024-11-26 18:00

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler, “download now”.

This release focused on Python 3.13 support, but also on improved compatibility, made many performance optimizations, enhanced error reporting, and better debugging support.

Bug Fixes

Windows: Fixed a regression in onefile mode that incorrectly handled program and command line paths containing spaces. Fixed in 2.4.4 already.
Windows: Corrected an issue where console output handles were being opened with closed file handles. Fixed in 2.4.2 already.
Standalone: Restored the ability to use trailing slashes on the command line to specify the target directory for data files on Windows. Fixed in 2.4.2 already.
Compatibility: Fixed a parsing error that occurred with relative imports in .pyi files, which could affect some extension modules with available source code. Fixed in 2.4.3 already.
Modules: Ensured that extension modules load correctly into packages when using Python 3.12. Fixed in 2.4.4 already.
Windows: Improved command line handling for onefile mode to ensure full compatibility with quoting. Fixed in 2.4.4 already.
Data Directories: Allowed the use of non-normalized paths on the command line when specifying data directories. Fixed in 2.4.5 already.
Python 3.11+: Fixed an issue where inspect module functions could raise StopIteration when examining compiled functions on the stack. Fixed in 2.4.5 already.
importlib_metadata: Improved compatibility with importlib_metadata by handling cases where it might be broken, preventing potential compilation crashes. Fixed in 2.4.5 already.
Plugins: Fixed a crash that occurred when using the no_asserts YAML configuration option. Fixed in 2.4.6 already.
Scons: Improved error tolerance when reading ccache log files to prevent crashes on Windows caused by non-ASCII module names or paths. Fixed in 2.4.11 already.
Scons: Prevented the C standard option from being applied to C++ compilers, resolving an issue with the splash screen on Windows when using Clang. Fixed in 2.4.8 already.
macOS: Enhanced handling of DLL self-dependencies to accommodate cases where DLLs use both .so and .dylib extensions for self-references. Fixed in 2.4.8 already.
Compatibility: Fixed a memory leak that occurred when using deepcopy on compiled methods. Fixed in 2.4.9 already.
MSYS2: Excluded the bin directory from being considered a system DLL folder when determining DLL inclusion. Fixed in 2.4.9 already.
Python 3.10+: Fixed a crash that could occur when a match statement failed to match a class with arguments. Fixed in 2.4.9 already.
MSYS2: Implemented a workaround for non-normalized paths returned by os.path.normpath in MSYS2 Python environments. Fixed in 2.4.11 already.
Python 3.12: Resolved an issue where Nuitka’s constant code was triggering assertions in Python 3.12.7. Fixed in 2.4.10 already.
UI: Ensured that the --include-package option includes both Python modules and extension modules that are sub-modules of the specified package. Fixed in 2.4.11 already.
Windows: Prevented encoding issues with CMD files used for accelerated mode on Windows.
Standalone: Improved the standard library scan to avoid assuming the presence of specific files, which might have been deleted by the user or a Python distribution.
Compatibility: Added suffix, suffixes, and stem attributes to Nuitka resource readers to improve compatibility with file objects.
Compatibility: Backported the error message change for yield from used at the module level, using dynamic detection instead of hardcoded text per version.
Compatibility: Fixed an issue where calling built-in functions with keyword-only arguments could result in errors due to incorrect argument passing.
Compatibility: Fixed reference leaks that occurred when using list.insert and list.index with 2 or 3 arguments.
Windows: Prioritized relative paths over absolute paths for the result executable when absolute paths are not file system encodable. This helps address issues related to non-ASCII short paths on some Chinese systems.
Compatibility: Improved compatibility with C extensions by handling cases where the attribute slot is not properly implemented, preventing potential segfaults.
Compatibility: Prevent the leakage of sys.frozen when using the multiprocessing module and its plugin, resolving a long-standing TODO and potentially breaking compatibility with packages that relied on this behavior.
Compatibility: Fixed an issue where matching calls with keyword-only arguments could lead to incorrect optimization and argument passing errors.
Compatibility: Corrected the handling of iterators in for loops to avoid assuming the presence of slots, preventing potential issues.
macOS: Added support for cyclic DLL dependencies, where DLLs have circular references.
Compatibility: Ensured the use of updated expressions during optimization phase for side effects to prevent crashes caused by referencing obsolete information.
Python 3.10+: Fixed a crash that could occur in complex cases when re-formulating match statements.
Python 3.4-3.5: Corrected an issue in Nuitka’s custom PyDict_Next implementation that could lead to incorrect results in older Python 3 versions.
Python 3.10+: Ensured that AttributeError is raised with the correct keyword arguments, avoiding a TypeError that occurred previously.
Plugins: Added a data file function that avoids loading packages, preventing potential crashes caused by incompatible dependencies (e.g., numpy versions).
Compatibility: Ensured that Nuitka’s package reader closes data files after reading them to prevent resource warnings in certain Python configurations.
Standalone: Exposed setuptools contained vendor packages in standalone distributions to match the behavior of the setuptools package.
Accelerated Mode: Enabled the django module parameter in accelerated mode to correctly detect used extensions.
Compatibility: Prevented resource warnings for unclosed files when trace outputs are sent to files via command line options.
Compatibility: Enabled the use of xmlrpc.server without requiring the pydoc module.
Plugins: Fixed an issue in the anti-bloat configuration where change_function and change_classes ignored “when” clauses, leading to unintended changes.
Python 3.12 (Linux): Enhanced static libpython handling for Linux. Static libpython is now used only when the inline copy is available (not in official Debian packages). The inline copy of hacl is used for all Linux static libpython uses with Python 3.12 or higher.
Standalone: Further improved the standard library scan to avoid assuming the presence of files that might have been manually deleted.
UI: Fixed the --include-raw-dir option, which was not functioning correctly. Only the Nuitka Package configuration was being used previously.

Package Support

arcade: Improved standalone configuration for the arcade package. Added in 2.4.3 already.
license-expression: Added a missing data file for the license-expression package in standalone distributions. Added in 2.4.6 already.
pydantic: Included a missing implicit dependency required for deprecated decorators in the pydantic package to function correctly in standalone mode. Fixed in 2.4.5 already.
spacy: Added a missing implicit dependency for the spacy package in standalone distributions. Added in 2.4.7 already.
trio: Updated standalone support for newer versions of the trio package. Added in 2.4.8 already.
tensorflow: Updated standalone support for newer versions of the tensorflow package. Added in 2.4.8 already.
pygame-ce: Added standalone support for the pygame-ce package. Added in 2.4.8 already.
toga: Added standalone support for newer versions of the toga package on Windows. Added in 2.4.9 already.
django: Implemented a workaround for a django debug feature that attempted to extract column numbers from compiled frames. Added in 2.4.9 already.
PySide6: Improved standalone support for PySide6 on macOS by allowing the recognition of potentially unusable plugins. Added in 2.4.9 already.
polars: Added a missing dependency for the polars package in standalone distributions. Added in 2.4.9 already.
django: Enhanced handling of cases where the django settings module parameter is absent in standalone distributions. Added in 2.4.9 already.
win32ctypes: Included missing implicit dependencies for win32ctypes modules on Windows in standalone distributions. Added in 2.4.9 already.
arcade: Added a missing data file for the arcade package in standalone distributions. Added in 2.4.9 already.
PySide6: Allowed PySide6 extras to be optional on macOS in standalone distributions, preventing complaints about missing DLLs when they are not installed. Added in 2.4.11 already.
driverless-selenium: Added standalone support for the driverless-selenium package. Added in 2.4.11 already.
tkinterdnd2: Updated standalone support for newer versions of the tkinterdnd2 package. Added in 2.4.11 already.
kivymd: Updated standalone support for newer versions of the kivymd package. Added in 2.4.11 already.
gssapi: Added standalone support for the gssapi package. Added in 2.4.11 already.
azure.cognitiveservices.speech: Added standalone support for the azure.cognitiveservices.speech package on macOS.
mne: Added standalone support for the mne package.
fastapi: Added a missing dependency for the fastapi package in standalone distributions.
pyav: Updated standalone support for newer versions of the pyav package.
py_mini_racer: Added standalone support for the py_mini_racer package.
keras: Improved standalone support for keras by extending its sub-modules path to include the keras.api sub-package.
transformers: Updated standalone support for newer versions of the transformers package.
win32com.server.register: Updated standalone support for newer versions of the win32com.server.register package.
Python 3.12+: Added support for distutils in setuptools for Python 3.12 and later.
cv2: Enabled automatic scanning of missing implicit imports for the cv2 package in standalone distributions.
lttbc: Added standalone support for the lttbc package.
win32file: Added a missing dependency for the win32file package in standalone distributions.
kivy: Fixed an issue where the kivy clipboard was not working on Linux due to missing dependencies in standalone distributions.
paddleocr: Added missing data files for the paddleocr package in standalone distributions.
playwright: Added standalone support for the playwright package with a new plugin.
PySide6: Allowed PySide6 extras to be optional on macOS in standalone distributions, preventing complaints about missing DLLs when they are not installed.

New Features

Python 3.13: Added experimental support for Python 3.13.

Warning

Python 3.13 support is not yet recommended for production use due to limited testing. On Windows, only MSVC and ClangCL are currently supported due to workarounds needed for incompatible structure layouts.
UI: Introduced a new --mode selector to replace the options --standalone, --onefile, --module, and --macos-create-app-bundle.

Note

The app mode creates an app bundle on macOS and a onefile binary on other operating systems to provide the best deployment option for each platform.
Windows: Added a new hide choice for the --windows-console-mode option. This generates a console program that hides the console window as soon as possible, although it may still briefly flash.
UI: Added the --python-flag=-B option to disable the use of bytecode cache (.pyc) files during imports. This is mainly relevant for accelerated mode and dynamic imports in non-isolated standalone mode.
Modules: Enabled the generation of type stubs (.pyi files) for compiled modules using an inline copy of stubgen. This provides more accurate and informative type hints for compiled code.

Note

Nuitka also adds implicit imports to compiled extension modules, ensuring that dependencies are not hidden.
Plugins: Changed the data files configuration to a list of items, allowing the use of when conditions for more flexible control. Done in 2.4.6 already.
Onefile: Removed the MSVC requirement for the splash screen in onefile mode. It now works with MinGW64, Clang, and ClangCL. Done for 2.4.8 already.
Reports: Added information about the file system encoding used during compilation to help debug encoding issues.
Windows: Improved the attach mode for --windows-console-mode when forced redirects are used.
Distutils: Added the ability to disable Nuitka in pyproject.toml builds using the build_with_nuitka setting. This allows falling back to the standard build backend without modifying code or configuration. This setting can also be passed on the command line using --config-setting.
Distutils: Added support for commercial file embedding in distutils packages.
Linux: Added support for using uninstalled self-compiled Python installations on Linux.
Plugins: Enabled the matplotlib plugin to react to active Qt and tkinter plugins for backend selection.
Runtime: Added a new original_argv0 attribute to the __compiled__ value to provide access to the original start value of sys.argv[0], which might be needed by applications when Nuitka modifies it to an absolute path.
Reports: Added a list of DLLs that are actively excluded because they are located outside of the PyPI package.
Plugins: Allowed plugins to override the compilation mode for standard library modules when necessary.

Optimization

Performance: Implemented experimental support for “dual types”, which can significantly speed up integer operations in specific cases (achieving speedups of 12x or more in some very specific loops). This feature is still under development but shows promising potential for future performance gains, esp. when combined with future PGO (Profile Guided Optimization) work revealing likely runtime types more often and more types being covered.
Performance: Improved the speed of module variable access.
- For Python 3.6 to 3.10, this optimization utilizes dictionary version tags but may be less effective when module variables are frequently written to.
- For Python 3.11+, it relies on dictionary key versions, making it less susceptible to dictionary changes but potentially slightly slower for cache hits compared to Python 3.10.
Performance: Accelerated string dictionary lookups for Python 3.11+ by leveraging knowledge about the key and the module dictionary’s likely structure. This also resolves a previous TODO item, where initial 3.11 support was not as fast as our support for 3.10 was in this domain.
Performance: Optimized module dictionary updates to occur only when values actually change, improving caching efficiency.
Performance: Enhanced exception handling by removing bloat in the abstracted differences between Python 3.12 and earlier versions. This simplifies the generated C code, reduces conversions, and improves efficiency for all Python versions. This affects both C compile time and runtime performance favorably and solves a huge TODO for Python 3.12 performance.
Performance: Removed the use of CPython APIs calls for accessing exception context and cause values, which can be slow.
Performance: Utilized Nuitka’s own faster methods for creating int and long values, avoiding slower CPython API calls.
Performance: Implemented a custom variant of _PyGen_FetchStopIterationValue to avoid CPython API calls in generator handling, further improving performance on generators, coroutines and asyncgen.
Windows: Aligned with CPython’s change in reference counting implementation on Windows for Python 3.12+, which improves performance with LTO (Link Time Optimization) enabled.
Optimization: Expanded static optimization to include unary operations, improving the handling of number operations and preparing for full support of dual types.
Optimization: Added static optimization for os.stat and os.lstat calls.
Performance: Passed the exception state directly into unpacking functions, eliminating redundant exception fetching and improving code efficiency.
Performance: Introduced a dedicated helper for unpacking length checks, resulting in faster and more compact code helping scalability as well.
Performance: Generated more efficient code for raising built-in exceptions by directly creating them through the base exception’s new method instead of calling them as functions. This can speed up some things by a lot.
Performance: Optimized exception creation by avoiding unnecessary tuple allocations for empty exceptions. This hack avoids hitting the memory allocator as much.
Performance: Replaced remaining uses of PyTuple_Pack with Nuitka’s own helpers to avoid CPython API calls.
Code Generation: Replaced implicit exception raise nodes with direct exception creation nodes for improved C code generation.
Windows: Aligned with CPython’s change in managing object reference counters on Windows for Python 3.12+, improving performance with LTO enabled.
Performance: Removed remaining CPython API calls when creating int values in various parts of the code, including specialization code, helpers, and constants loading.
Windows: Avoided scanning for DLLs in the PATH environment variable when they are not intended to be used from the system. This prevents potential crashes related to non-encodable DLL paths and makes those scans faster too.
Windows: Updated to a newer MinGW64 version from 13.2 to 14.2 for potentially improved binary code generation with that compiler.
Code Size: Reduced the size of constant blobs by avoiding module-level constants for the global values -1, 0, and 1.
Code Generation: Improved code generation for variables by directly placing NameError exceptions into the thread state when raised, making for more compact C code.
Optimization: Statically optimized the sys.ps1 and sys.ps2 values to not exist (unless in module mode), potentially enabling more static optimization in packages that detect interactive usage checking them.
Performance: Limited the use of tqdm locking to no-GIL and Scons builds where threading is actively used.
Optimization: Implemented a faster check for non-frame statement sequences by decoupling frames and normal statement sequences and using dedicated accessors. This improves performance during the optimization phase.

Anti-Bloat

Prevented the inclusion of importlib_metadata for the numpy package. Added in 2.4.2 already.
Avoided the use of dask in the pandera package. Added in 2.4.5 already.
Removed numba for newer versions of the shap package. Added in 2.4.6 already.
Prevented attempts to include both Python 2 and Python 3 code for the aenum package, avoiding SyntaxError warnings. Added in 2.4.7 already.
Enhanced handling for the sympy package. Added in 2.4.7 already.
Allowed pydoc for the pyqtgraph package. Added in 2.4.7 already.
Avoided pytest in the time_machine package. Added in 2.4.9 already.
Avoided pytest in the anyio package.
Avoided numba in the pandas package.
Updated anti-bloat measures for newer versions of the torch package with increased coverage.
Avoided pygame.tests and cv2 for the pygame package.
Allowed unittest in the absl.testing package.
Allowed setuptools in the tufup package.
Avoided test modules when using the bsdiff4 package.
Treated the use of the wheel module the same as using the setuptools package.

Organizational

Development Environment: Added experimental support for a devcontainer to the repository, providing an easier way to set up a Linux-based development environment. This feature is still under development and may require further refinement.
Issue Reporting: Clarified the issue reporting process on GitHub, emphasizing the importance of testing reproducers against Python first to ensure the issue is related to Nuitka.
Issue Reporting: Discouraged the use of --deployment in issue reports, as it hinders the automatic identification of issues, that should be the first thing to remove.
UI: Improved the clarity of help message of the option for marking data files as external, emphasizing that files must be included before being used.
UI: Added checks to the Qt plugins to ensure that specified plugin families exist, preventing unnoticed errors.
UI: Implemented heuristic detection of terminal link support, paving the way for adding links to options and groups in the command line interface.
UI: Removed obsolete caching-related options from the help output, as they have been replaced by more general options.
Plugins: Improved error messages when retrieving information from packages during compilation.
Quality: Implemented a workaround for an isort bug that prevented it from handling UTF-8 comments.
Quality: Updated GitHub actions to use clang-format-20.
Quality: Updated to the latest version of black for code formatting.
Release Process: Updated the release script tests for Debian and PyPI to use the correct runner names. (Changed in 2.4.1 already.
UI: Disabled progress bar locking, as Nuitka currently doesn’t utilize threads.
UI: Added heuristic detection of terminal link support and introduced an experimental terminal link as a first step towards a more interactive command line interface.
Debugging: Fixed a crash in the “explain reference counts” feature that could occur with unusual dict values mistaken for modules.
Debugging: Included reference counts of tracebacks when dumping reference counts at program end.
Debugging: Added assertions and traces to improve debugging of input/output handling.
Quality: Added checks for configuration module names in Nuitka package configuration to catch errors caused by using filenames instead of module names.
UI: Removed obsolete options controlling cache behavior, directing users to the more general cache options.
Scons: Ensured that the CC environment variable is used consistently for --version and onefile bootstrap builds, as well as the Python build, preventing inconsistencies in compiler usage and outputs.
Distutils: Added the compiled-package-hidden-by-package mnemonic for use in distutils to handle the expected warning when a Python package is replaced with a compiled package and the Python code is yet to be deleted.
Dependency Management: Started experimental support for downloading Nuitka dependencies like ordered-set. This feature is not yet ready for general use.

Tests

Added Python 3.13 to the GitHub Actions test matrix.
Significantly enhanced construct-based tests for clearer results. The new approach executes code with a boolean flag instead of generating different code, potentially leading to the removal of custom templating.
Removed the 2to3 conversion code from the test suite, as it is being removed from newer Python versions. Tests are now split with version requirements as needed.
Fixed an issue where the test runner did not discover and use Python 3.12+, resulting in insufficient test coverage for those versions on GitHub Actions.
Ensured that the compare_with_cpython test function defaults to executing the system’s Python interpreter instead of relying on the PYTHON environment variable.
Set up continuous integration with Azure Pipelines to run Nuitka tests against the factory branch on each commit.
Enforced the use of static libpython for construct-based tests to eliminate DLL call overhead and provide more accurate performance measurements.
Improved the robustness of many construct tests, making them less sensitive to unrelated optimization changes.
Removed a test that was only applicable to Nuitka Commercial, as it was not useful to always skip it in the standard version. Commercial tests are now also recognized by their names.
Added handling for segmentation faults in distutils test cases, providing debug output for easier diagnosis of these failures.
Prevented resource warnings for unclosed files in a reflected test.

Cleanups

WASI: Corrected the signatures of C function getters and setters for compiled types in WASI to ensure they match the calling conventions. Casts are now performed locally to the compiled types instead of in the function signature. Call entries also have the correct signature used by Python C code.
WASI: Improved code cleanliness by adhering to PyCFunction signatures in WASI.
Code Generation: Fixed a regression in code generation that caused misaligned indentation in some cases.
Code Formatting: Changed some code for identical formatting with clang-format-20 to eliminate differences between the new and old versions.
Caching: Enforced proper indentation in Nuitka cache files stored in JSON format.
Code Cleanliness: Replaced checks for Python 3.4 or higher with checks for Python 3, simplifying the code and reflecting the fact that Python 3.3 is no longer supported.
Code Cleanliness: Removed remaining Python 3.3 specific code from frame templates.
Code Cleanliness: Performed numerous spelling corrections and renamed internal helper functions for consistency and clarity.
Plugins: Renamed the get_module_directory helper function in the Nuitka Package configuration to remove the leading underscore, improving readability.
Plugins: Moved the numexpr.cpuinfo workaround to the appropriate location in the Nuitka Package configuration, resolving an old TODO item.

Summary

This a major release that brings support for Python 3.13, relatively soon after its release.

Our plugin system and Nuitka plugin configuration was used a lot for support of many more third-party packages, and numerous other enhancements in the domain of avoiding bloat.

This release focuses on improved compatibility, new break through performance optimizations, to build on in the future, enhanced error reporting, and better debugging support.

Categories: FLOSS Project Planets

EuroPython Society: List of EPS Board Candidates for 2024/2025

Tue, 2024-11-26 17:27

At this year’s EuroPython Society General Assembly (GA), planned for Sunday, December 1st, 2024, 20:00 CET, we will vote in a new board of the EuroPython Society for the term 2024/2025

List of Board Candidates

The EPS bylaws require one chair, one vice chair and 2 - 7 board members. The following candidates have stated their willingness to work on the EPS board. We are presenting them here (in alphabetical order by first name).

We will be updating this list in the days before the GA. Please send in any nominations or self-nominations to board@europython.eu. For more information please check our previous post here: https://europython-society.org/2024-general-assembly-announcement/

Please note that our bylaws do not restrict nominations to people on this list. It is even possible to self-nominate or nominate other candidates at the GA itself. However, in the interest of giving members a better chance to review the candidate list, we’d like to encourage all nominations to be made before the GA.

The following fine folks have expressed their desire to run for the next EPS board elections: Anders Hammarquist, Aris Nivorils, Artur Czepiel, Cyril Bitterich, Mia Bajić, Shekhar Koirala.

Anders Hammarquist

Pythonista / Consultant / Software architect

Anders is running his own Python consultancy business, AB Struse, since 2019 and is currently mostly involved with using Python in industrial automation. He has been using Python since 1995, and fosters its use in at least four companies.

He helped organize EuroPython 2004 and 2005, and has attended and given talks at several EuroPythons since then. He has handled the Swedish financials of the EuroPython Society since 2016 and has served as board member since 2017.

Aris Nivorlis

Pythonista / Geoscientist / Data Steward

Aris is a researcher at Deltares, a non-profit research institute in the Netherlands, where he combines his expertise in geoscience with his passion for Python and data stewardship to address real-world challenges. His journey with Python started during his doctoral studies, where he became a passionate advocate for the language and supported several colleagues in adopting Python for their research.

Aris has been involved in the Python community for the past four years. He is the Chair of PyCon Sweden and has been a core organizer for the past three conferences. He was a EuroPython organizer in 2024, leading the Ops Team, an experience he describes as both challenging and incredibly rewarding.

Aris is running for the EuroPython Society (EPS) Board to work in shaping its future direction. He is particularly interested on how EPS can further support local Python communities, events, and projects, while ensuring the success of the EuroPython conference. Aris aims to build on the foundation of previous efforts, working toward a more independent and sustainable organisation team for EuroPython. One of his key goals is to lower the barriers for others to get involved as volunteers, organizers, and board members, fostering a more inclusive and accessible society.

Artur Czepiel (nomination for Chair)

Software developer

I started using Python in 2008 and attended my first EuroPython in 2016, and it’s been an incredible journey ever since. Over the years, I’ve had the opportunity to contribute to five EuroPython conferences, including serving four terms on the board and chairing the conference last year.

Despite that, I still have new ideas for improvements and a couple of unfinished projects I’d like to see moving forward. :)

Cyril Bitterich

Operations dude / Organiser / Systems Engineer

Cyril’s first contact with the EuroPython community was back in 2019 in Basel during a break between workshops. Starting with helping prepare goodie bags, he went on to assist with setting up the conference location, and took on the role of a general runner during the event. After becoming a late addition to the Ops team in Dublin, he was part of the Ops and Programme teams for both Prague editions of the conference. His firefighting skills proved invaluable as he supported other teams and the general organisation, making him a de facto member of the 2024 core organisers team.

Enjoying the warmth of the community and gaining experience in a lot of different roles, it&aposs now time for him to pass on the lessons he’s learned in a more structured way. Having taken on smaller and ad-hoc leadership roles during the conference, he now aims to play an even more active role in the year-to-year operations of the EuroPython community.

With a background in successfully playing firefighter at organising capoeira events for 100 to 2000 attendees at locations all over the world (Asia and Antarctica are still on the to-do list 😉), and working with leaders, teams and attendees from diverse cultures, his knowledge and experience extend far beyond the last 3 years of EuroPython conference.

His goal is to bring more structure to the EPS as a whole and the conference teams where needed. A key focus is ensuring effective communication, be it through improving documentation or maintaining open and regular dialogue with the teams and everyone else involved Ultimately, he aims to help strike a healthy balance between the community-driven, volunteer spirit with the professionalism expected from an established conference like EuroPython. However, the aim is not to disrupt or rebuild everything from scratch. Instead, he seeks to build on the strong foundation established by the board members active in the last years - while shamelessly taking full advantage of their counsel along the way.

This should establish a stable foundation within the EuroPython Society that can be adopted or directly used by other local communities in Europe. On the EPS side, this will hopefully open up resources to support other communities based on their specific needs.

Mia Bajić

Software Engineer & Community Events Organizer

I’m a software engineer and community events organizer. Since joining the Python community in 2021, I’ve led Pyvo meetups, brought Python Pizza to the Czech Republic, and contributed to PyCon CZ 23 as well as EuroPython 2023 & 2024. I&aposve spoken on technical topics at major conferences, including PyCon US, DjangoCon, EuroPython and many other PyCons across Europe.

I’m running for the board to learn, grow, and give back to the community that has given me so much. The main topic I would like to focus on is sustainability. I value setting clear goals, fostering open and transparent culture, and delivering measurable results.

Inspired by the folks from the Django Software Foundation, I decided to share more about my background, motivations, and small improvements I’d like to see implemented this year on my blog: https://clytaemnestra.github.io/tech-blog/eps-elections.

Shekhar Koirala

Machine Learning Engineer

I joined EuroPython as a remote volunteer in 2022 and later became part of the onsite volunteer team in Dublin the same year. Over the past two years, I’ve worked with the Ops team, supporting other onsite volunteers. My volunteering experience also includes contributions to Python Ireland and PyData Kathmandu. This year, I attended PyData Amsterdam, PyCon Sweden, and a non-Python volunteer-led conference in Berlin, Germany, where I sought to understand what truly makes a conference great. I realized that, at the core, it’s the people and their love for the community that make a conference exceptional.

Whether or not I join the board, I will continue volunteering for EuroPython. But if I am selected as a board member, I aim to support volunteers and foster a wholesome environment, just as I have always experienced. I also want to strengthen the connection between EuroPython and the local community.

Professionally, I work as a Machine Learning Engineer at Identv, using Python for both work and hobbies. Recently, I delivered a talk and conducted a workshop at PyCon Ireland, and I’m excited to kickstart my open-source journey. Beside sitting in front of the computer, I love to hike, take photos and try not to get lost in the wild.

What does the EPS Board do ?

The EPS board is made up of up to 9 directors (including 1 chair and 1 vice chair); the board runs the day-to-day business of the EuroPython Society, including running the EuroPython conference series, and supports the community through various initiatives such as our grants programme. The board collectively takes up the fiscal and legal responsibility of the Society.

For more details you can check our previous post here: https://europython-society.org/2024-general-assembly-announcement/#what-does-the-board-do

Categories: FLOSS Project Planets

PyCoder’s Weekly: Issue #657 (Nov. 26, 2024)

Tue, 2024-11-26 14:30

#657 – NOVEMBER 26, 2024
View in Browser »

NumPy Practical Examples: Useful Techniques

In this tutorial, you’ll learn how to use NumPy by exploring several interesting examples. You’ll read data from a file into an array and analyze structured arrays to perform a reconciliation. You’ll also learn how to quickly chart an analysis and turn a custom function into a vectorized function.
REAL PYTHON

Loop Targets

Loop assignment allows you to assign to a dict item in a for loop. This post covers what that means and that it is no more costly than regular assignment.
NED BATCHELDER

Introducing the Windsurf Editor. Tomorrow’s IDE, Today

Introducing the Windsurf Editor, the first agentic IDE. All the features you know and love from Codeium’s extensions plus new capabilities such as Cascade that act as collaborative AI agents, combining the best of copilot and agent systems →
CODEIUM sponsor

Vector Animations With Python

This post shows you how to use gizeh and moviepy to create animations using vector graphics.
DEEPNOTE.COM

Take the 2024 Django Developers Survey

DJANGO SOFTWARE FOUNDATION

Python 3.14.0 Alpha 2 Released

CPYTHON DEV BLOG

Quiz: Python Dictionary Comprehensions

REAL PYTHON

PyTexas (Austin) Call for Proposals

PYTEXAS.ORG

Quiz: Namespaces and Scope in Python

REAL PYTHON

Articles & Tutorials Avoid Counting in Django Pagination

Django’s Paginator class separates data into chunks to display pages of results. By default, underneath it uses a SQL COUNT(*) call which for large amounts of data can be expensive. This article shows you how to get around that.
NIK TOMAZIC

Working With TOML and Python

TOML is a configuration file format that’s becoming increasingly popular in the Python community. In this video course, you’ll learn the syntax of TOML and explore how you can work with TOML files in your own projects.
REAL PYTHON course

Categories of Leadership on Technical Teams

Understanding the different types of technical leadership can help you better structure teams and define roles and responsibilities. This post talks about the various types of technical leaders you might encounter.
BEN KUHN

Quick Prototyping With Sqlite3

Python’s sqlite3 bindings makes it a great tool for quick prototyping while you’re working out what it is exactly that you’re building and what kind of database schema makes sense.
JUHA-MATTI SANTALA

Django Performance and Optimization

“This document provides an overview of techniques and tools that can help get your Django code running more efficiently - faster, and using fewer system resources.”
DJANGO SOFTWARE FOUNDATION

Python Dependency Management Is a Dumpster Fire

Managing dependencies in Python can be a bit of a challenge. This deep dive article shows you all the problems and how the problems are mitigated if not solved.
NIELS CAUTAERTS

Under the Microscope: Ecco the Dolphin

This article shows the use of Ghidra and Python to reverse engineer the encoding scheme of the Sega Dreamcast game Eccho the Dolphin.
32BITS

Is Python Really That Slow?

The speed of Python is an age old conversation. This post talks about just what it does and doesn’t mean.
MIGUEL GRINBERG

Threads Beat Async/Await

Further musings about async & await and why Armin thinks virtual threads are a better model.
ARMIN RONACHER

Python for R Users

Resources for getting better at Python for experienced R developers
STEPHEN TURNER

Projects & Code nbformat: Base Implementation of Jupyter Notebook Format

PYPI.ORG

pex: Generate Python EXecutable Files, Lock Files and Venvs

GITHUB.COM/PEX-TOOL

Async, Pure-Python Server-Side Rendering Engine

VOLFPETER.GITHUB.IO • Shared by Peter Volf

django-tasks: Background Workers Reference Implementation

GITHUB.COM/REALORANGEONE

dsRAG: Retrieval Engine for Unstructured Data

GITHUB.COM/D-STAR-AI

Events Weekly Real Python Office Hours Q&A (Virtual)

November 27, 2024
REALPYTHON.COM

Python New Zealand: Mapping Hurricane Wind Speeds From Space

November 28, 2024
MEETUP.COM

SPb Python Drinkup

November 28, 2024
MEETUP.COM

PyCon Wroclaw 2024

November 30 to December 1, 2024
PYCONWROCLAW.COM

PyDay Togo 2024

November 30 to December 1, 2024
LU.MA

PyDelhi User Group Meetup

November 30, 2024
MEETUP.COM

Happy Pythoning!
This was PyCoder’s Weekly Issue #657.
View in Browser »

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

Categories: FLOSS Project Planets

Real Python: Managing Dependencies With Python Poetry

Tue, 2024-11-26 09:00

When your Python project relies on external packages, you need to make sure you’re using the right version of each package. After an update, a package might not work as it did before. A dependency manager like Python Poetry helps you specify, install, and resolve external packages in your projects. This way, you can be sure that you always work with the correct dependency version on every machine.

In this video course, you’ll learn how to:

Create a new project using Poetry
Add Poetry to an existing project
Configure your project through pyproject.toml
Pin your project’s dependency versions
Install dependencies from a poetry.lock file
Run basic Poetry commands using the Poetry CLI

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

Armin Ronacher: Constraints are Good: Python's Metadata Dilemma

Mon, 2024-11-25 19:00

There is currently an effort underway to build a new universal lockfile standard for Python, most of which is taking place on the Python discussion forum. This initiative has highlighted the difficulty of creating a standard that satisfies everyone. It has become clear that different Python packaging tools are having slightly different ideas in mind of what a lockfile is supposed to look like or even be used for.

In those discussions however also a small other aspect re-emerged: Python has a metadata problem. Python's metadata system is too complex and suffers from what I would call “lack of constraints”.

JavaScript: Example of Useful Constraints

JavaScript provides an excellent example of how constraints can simplify and improve a system. In JavaScript, metadata is straightforward. Whether you develop against a package locally or if you are using a package from npm, metadata represents itself the same way. There is a single package.json file that contains the most important metadata of a package such as name, version or dependencies. This simplicity imposes significant but beneficial constraints:

There is a 1:1 relationship between an npm package and its metadata. Every npm package has a single package.json file that is the source of truth of metadata. Metadata is trivially accessible, even programmatically, via require('packageName/package.json').
Dependencies (and all other metadata) are consistent across platforms and architectures. Platform-specific binaries are handled via a filter mechanism (os and cpu) paired with optionalDependencies. [1]
All metadata is static, and updates require explicit changes to package.json prior to distribution or installation. Tools are provided to manipulate that metadata such as npm version patch which will edit the file in-place.

These constraints offer several benefits:

Uniform behavior regardless of whether a dependency is installed locally or from a remote source. There is not even a filesystem layout difference between what comes from git or npm. This enables things like replacing an installed dependency with a local development copy, without change in functionality.
There is one singular source of truth for all metadata. You can edit package.json and any consumer of that metadata can just monitor that file for changes. No complex external information needs to be consulted.
Resolvers can rely on a single API call to fetch dependency metadata for a version, improving efficiency. Practically this also means that the resolver only needs to hit a single URL to retrieve all possible dependencies of a dependency. [2]
It makes auditing much easier because there are fewer moving parts and just one canonical location for metadata.

Python: The Cost of Too Few Constraints

In contrast, Python has historically placed very few constraints on metadata. For example, the old setup.py based build system essentially allowed arbitrary code execution during the build process. At one point it was at least strongly suggested that the version produced by that build step better match what is uploaded to PyPI. However, in practice, if you lie about the version that is okay too. You could upload a source distribution to PyPI that claims it's 2.0 but will in fact install 2.0+somethinghere or a completely different version entirely.

What happens is that both before a package is published to PyPI and when a package is installed locally after downloading, the metadata is generated from scratch. Not only does that mean the metadata does not have to match, it also means that it's allowed to be completely different. It's absolutely okay for a package to claim it depends on cool-dependency on your machine, but on uncool-dependency on my machine. Or to dependent on different packages depending on the time of the day of the phase of the moon.

Editable installs and caching are particularly problematic since metadata could become invalid almost immediately after being written. [3]

Some of this has been somewhat improved because the new pyproject.toml standard encourages static metadata. However build systems are entirely allowed to override that by falling back to what is called “dynamic metadata” and this is something that is commonly done.

In practice this system incurs a tremendous tax to everybody that can be easily missed.

Disjointed and complex metadata access: there is no clear relationship of PyPI package name and the installed Python modules. If you know what the PyPI package name is, you can access metadata via importlib.metadata. Metadata is not read from pyproject.toml, even if it's static, instead it takes the package name and it accesses the metadata from the .dist-info folder (most specifically the METADATA file therein) installed into site-packages.
Mandatory metadata re-generation: As a consequence if you edit pyproject.toml to edit a piece of metadata, you need to re-install the package for that metadata to be updated in the .dist-info. People commonly forget doing that, so desynchronized metadata is very common. This is true even for static metadata today!
Unclear cache invalidation: Because metadata can be dynamic, it's not clear when you should automatically re-install a package. It's not enough to just track pyproject.toml for changes when dynamic metadata is used. uv for instance has a really complex, explicit cache management system so one can help uv detect outdated metadata. This obviously is non-standardized, requires uv to understand version control systems and is also not shared with other tools. For instance if you know that the version information incorporates the git hash, you can tell uv to pay attention to git commits.
Fragmented metadata storage: even where generated metadata is stored is complex. Different systems have slightly different behavior for storing that metadata.
- When working locally (eg: editable installs) what happens depends on the build system:
 - If setuptools is used, metadata written into two locations. The legacy <PACKAGE_NAME>.egg-info/PKG-INFO file. Additionally it's placed in the new location for metadata inside site-packages in a <PACKAGE_NAME>.dist-info/METADATA file.
 - If hatch and most other modern build systems are used, metadata is only written into site-packages. (into <PACKAGE_NAME>.dist-info/METADATA)
 - If no build system is configured it depends a bit on the installer. pip will even for an editable install build a wheel with setuptools, uv will only build a wheel and make the metadata available if one runs uv build. Otherwise the metadata is not available (in theory it could be found in pyproject.toml for as long as it's not dynamic).
- For source distributions (sdist) first the build step happens as in the section before. Afterwards the metadata is thrown into a PKG-INFO file. It's currently placed in two locations in the sdist: PKG-INFO in the root and <PACKAGE_NAME>.egg-info/PKG-INFO. That metadata however I believe is only used for PyPI, when installing the sdist locally the metadata is regenerated from pyproject.toml (or if setuptools is used setup.py). That's also why metadata can change from what's in the sdist to what's there after installation.
- For wheels the metadata is placed in <PACKAGE_NAME>.dist-info/METADATA exclusively. Wheels have static metadata, so no build step is taking place. What is in the wheel is always used.
Dynamic metadata makes resolvers slow: Dynamic metadata makes the job of resolvers and installers very hard and slows them down. Today for instance advanced resolvers like poetry or uv sometimes are not able to install the right packages, because they assume that dependency metadata is consistent across sdists and wheels. However there are a lot of sdists available on PyPI that publish incomplete dependency metadata (just whatever the build step for the sdist created on the developer's machine is what is cached on PyPI).

Not getting this right can be the difference of hitting one static URL with all the metadata, and downloading a zip file, creating a virtualenv, installing build dependencies, generating an entire sdist and then reading the final generated metadata. Many orders of magnitude difference in time it takes to execute.

This also extends to caching. If the metadata can constantly change, how would a resolver cache it? Is it required to build all possible source distributions to determine the metadata as part of resolving?
Cognitive complexity: The system introduces an enormous cognitive overhead which makes it very hard to understand for users, particularly when things to wrong. Incorrectly cached metadata can be almost impossible to debug for a user because they do not understand what is going on. Their pyproject.toml shows the right information, yet for some reason it behaves incorrectly. Most people don't know what "egg info" or "dist info" is. Or why an sdist has metadata in a different location than a wheel or a local checkout.

Having support for dynamic metadata also means that developers continue to maintain elaborate and confusing systems. For instance there is a plugin for hatch that dynamically creates a readme [4], requiring even arbitrary Python code to run to display documentation. There are plugins to automatically change versions to incorporate git version hashes. As a result to figure out what version you actually have installed it's not just enough to look into a single file, you might have to rely on a tool to tell you what's going on.

Moving The Cheese

The challenge with dynamic metadata in Python is vast, but unless you are writing a resolver or packaging tool, you're not going to experience the pain as much. You might in fact quite enjoy the power of dynamic metadata. Unsurprisingly bringing up the idea to remove it is very badly received. There are so many workflows seemingly relying on it.

At this point fixing this problem might be really hard because it's a social problem more than a technical one. If the constraint would have been placed there in the first place, these weird use cases would never have emerged. But because the constraints were not there, people were free to go to town with leveraging it with all the consequences it causes.

I think at this point it's worth moving the cheese, but it's unclear if this can be done through a standard. Maybe the solution will be for tools like uv or poetry to warn if dynamic metadata is used and strongly discourage it. Then over time the users of packages that use dynamic metadata will start to urge the package authors to stop using it.

The cost of dynamic metadata is real, but it's felt only in small ways all the time. You notice it a bit when your resolver is slower than it has to, you notice it if your packaging tool installs the wrong dependency, you notice it if you need to read the manual for the first time when you need to reconfigure your cache-key or force a package to constantly reinstall, you notice it if you need to re-install your local dependencies over and over for them not to break. There are many ways you notice it. You don't notice it as a roadblock, just as a tiny, tiny tax. Except that is a tax we all pay and it makes the user experience significantly worse compared to what it could be.

The deeper lesson here is that if you give developers too much flexibility, they will inevitably push the boundaries and that can have significant downsides as we can see. Because Python's packaging ecosystem lacked constraints from the start, imposing them now has become a daunting challenge. Meanwhile, other ecosystems, like JavaScript's, took a more structured approach early on, avoiding many of these pitfalls entirely.

[1]You can see how this works in action for sentry-cli for instance. The @sentry/cli package declares all its platform specific dependencies as optionalDependencies (relevant package.json). Each platform build has a filter in its package.json for os and cpu. For instance this is what the arm64 linux binary dependency looks like: package.json. npm will attempt to install all optional dependencies, but it will skip over the ones that are not compatible with the current platform. [2]For @sentry/cli at version 2.39.0 for instance this means that this singular URL will return all the information that a resolver needs: registry.npmjs.org/@sentry/cli/2.39.0 [3]A common error in the past was to receive a pkg_resources.DistributionNotFound exception when trying to run a script in local development [4]I got some flak on Bluesky for throwing readme generators under the bus. While they do not present the same problem when it comes to metadata like dependencies and versions do, they do still increase the complexity. In an ideal world what you find in site-packages represents what you have in your version control and there is a README.md file right there. That's what you have in JavaScript, Rust and plenty of other ecosystems. What we have however is a build step (either dynamic or copying) taking that readme file, and placing it in a RFC 5322 header encoded file in a dist info. So instead of "command clicking" on a dependency and finding the readme, we need special tools or arcane knowledge if we want to read the readme files locally.

Categories: FLOSS Project Planets

Mike Driscoll: Black Friday Python Deals 2024

Mon, 2024-11-25 13:57

Black Friday and Cyber Monday are nearly here, so it’s time to do a Python sale! All my books and courses are 35% off until December 4th if you use this code: BF24 at checkout.

You can learn about any of the following topics in my books and courses:

Basic Python (Python 101)
PDF Processing (ReportLab book)
Excel and Python
Image processing (Pillow book)
Python Logging (NEW this year!)
Jupyter Notebook
JupyterLab 101 (NEW this month!)
and more

Check out my books and courses today!

Other Python Sales

Sundeep Agarwal: ~50% off Sundeep’s all book and Python bundles with code FestiveOffer
Python Jumpstart with Python Morsels: 50% off Trey Hunner’s brand new Python course, an introduction to Python that’s very hands-on ($99 instead of $199)
Rodrigo 50% off Rodrigo’s all books bundle with code BF24
The Python Coding Place: 40% off The Python Coding Book and 40% off a lifetime membership to The Python Coding Place with code black2024
Adam Johnson’s Django-related Deals for Black Friday 2024

The post Black Friday Python Deals 2024 appeared first on Mouse Vs Python.

Categories: FLOSS Project Planets

Trey Hunner: New Python Jumpstart course

Mon, 2024-11-25 11:08

I’ve just recently launched a self-paced introduction to Python that is extremely hands-on. It’s called Python Jumpstart and it’s based on introductory Python curriculum that I have been iterating on for years.

Learn Python by writing Python code ✍

We do not learn by putting information into the brain. Our brains simply don’t retain knowledge that way.

Learning happens from repeatedly attempting to retrieve information from the brain. When it comes to Python, that means writing Python code.

So unlike most Python courses, Python Jumpstart is not focused on watching videos and rewriting code seen within videos. Instead, this new Python course is based around learning Python by writing Python code.

How Python Jumpstart is structured 🔬

Python Jumpstart includes learning Python by solving 46 short Python exercises.

Before each exercise, you’ll watch a 5 minute video explaining a new Python topic. You’ll then attempt the exercise to put one or more Python topics into practice.

You won’t solve many of the exercises on your first try and that’s okay. These exercises are designed to be revisited a few times over the course of weeks, until you’re satisfied with your solution.

This is not a sprint 📆

This structured path to Python proficiency, includes:

Carefully crafted exercises that build real understanding
Short, detailed explanations that won’t waste your time
A proven teaching approach that focuses on active learning
Spaced repetition to help concepts stick

This is not a course for impatient or passive learners. Learning takes repetitive effort spaced over many days and this course is structured to embrace that fact.

If you spend 30 minutes on Python Jumpstart each day, I estimate that you’ll complete this course in about 7 weeks. You’ll spend the large majority of that time writing Python code.

This course will take time, but it will be time well-spent.

Launch week special: 50% off until December 2 ⏰

Through Monday December 2, 2024, you can get lifetime access to Python Jumpstart for $99. After this launch week, Python Jumpstart will be $199.

Ready to jumpstart your Python learning journey?

Get Python Jumpstart for $99

Categories: FLOSS Project Planets

Real Python: Speed Up Your Python Program With Concurrency

Mon, 2024-11-25 09:00

Concurrency refers to the ability of a program to manage multiple tasks at once, improving performance and responsiveness. It encompasses different models like threading, asynchronous tasks, and multiprocessing, each offering unique benefits and trade-offs. In Python, threads and asynchronous tasks facilitate concurrency on a single processor, while multiprocessing allows for true parallelism by utilizing multiple CPU cores.

Understanding concurrency is crucial for optimizing programs, especially those that are I/O-bound or CPU-bound. Efficient concurrency management can significantly enhance a program’s performance by reducing wait times and better utilizing system resources.

In this tutorial, you’ll learn how to:

Understand the different forms of concurrency in Python
Implement multi-threaded and asynchronous solutions for I/O-bound tasks
Leverage multiprocessing for CPU-bound tasks to achieve true parallelism
Choose the appropriate concurrency model based on your program’s needs

To get the most out of this tutorial, you should be familiar with Python basics, including functions and loops. A rudimentary understanding of system processes and CPU operations will also be helpful. You can download the sample code for this tutorial by clicking the link below:

Get Your Code: Click here to download the free sample code that you’ll use to learn about speeding up your Python program with concurrency.

Take the Quiz: Test your knowledge with our interactive “Python Concurrency” quiz. You’ll receive a score upon completion to help you track your learning progress:

Interactive Quiz

Python Concurrency

In this quiz, you'll test your understanding of Python concurrency. You'll revisit the different forms of concurrency in Python, how to implement multi-threaded and asynchronous solutions for I/O-bound tasks, and how to achieve true parallelism for CPU-bound tasks.

Exploring Concurrency in Python

In this section, you’ll get familiar with the terminology surrounding concurrency. You’ll also learn that concurrency can take different forms depending on the problem it aims to solve. Finally, you’ll discover how the different concurrency models translate to Python.

What Is Concurrency?

The dictionary definition of concurrency is simultaneous occurrence. In Python, the things that are occurring simultaneously are called by different names, including these:

Thread
Task
Process

At a high level, they all refer to a sequence of instructions that run in order. You can think of them as different trains of thought. Each one can be stopped at certain points, and the CPU or brain that’s processing them can switch to a different one. The state of each train of thought is saved so it can be restored right where it was interrupted.

You might wonder why Python uses different words for the same concept. It turns out that threads, tasks, and processes are only the same if you view them from a high-level perspective. Once you start digging into the details, you’ll find that they all represent slightly different things. You’ll see more of how they’re different as you progress through the examples.

Now, you’ll consider the simultaneous part of that definition. You have to be a little careful because, when you get down to the details, you’ll discover that only multiple system processes can enable Python to run these trains of thought at literally the same time.

In contrast, threads and asynchronous tasks always run on a single processor, which means they can only run one at a time. They just cleverly find ways to take turns to speed up the overall process. Even though they don’t run different trains of thought simultaneously, they still fall under the concept of concurrency.

Note: Threads in most other programming languages often run in parallel. To learn why Python threads can’t, check out What Is the Python Global Interpreter Lock (GIL)?

If you’re curious about even more details, then you can also read about Bypassing the GIL for Parallel Processing in Python or check out the experimental free threading introduced in Python 3.13.

The way the threads, tasks, or processes take turns differs. In a multi-threaded approach, the operating system actually knows about each thread and can interrupt it at any time to start running a different thread. This mechanism is also true for processes. It’s called preemptive multitasking since the operating system can preempt your thread or process to make the switch.

Preemptive multitasking is handy in that the code in the thread doesn’t need to do anything special to make the switch. It can also be difficult because of that at any time phrase. The context switch can happen in the middle of a single Python statement, even a trivial one like x = x + 1. This is because Python statements typically consist of several low-level bytecode instructions.

On the other hand, asynchronous tasks use cooperative multitasking. The tasks must cooperate with each other by announcing when they’re ready to be switched out without the operating system’s involvement. This means that the code in the task has to change slightly to make it happen.

The benefit of doing this extra work upfront is that you always know where your task will be swapped out, making it easier to reason about the flow of execution. A task won’t be swapped out in the middle of a Python statement unless that statement is appropriately marked. You’ll see later how this can simplify parts of your design.

What Is Parallelism?

So far, you’ve looked at concurrency that happens on a single processor. What about all of those CPU cores your cool, new laptop has? How can you make use of them in Python? The answer is to execute separate processes!

A process can be thought of as almost a completely different program, though technically, it’s usually defined as a collection of resources including memory, file handles, and things like that. One way to think about it is that each process runs in its own Python interpreter.

Because they’re different processes, each of your trains of thought in a program leveraging multiprocessing can run on a different CPU core. Running on a different core means that they can actually run at the same time, which is fabulous. There are some complications that arise from doing this, but Python does a pretty good job of smoothing them over most of the time.

Read the full article at https://realpython.com/python-concurrency/ »

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

Python Bytes: #411 TLS Client: Hello <<guitar solo>>

Mon, 2024-11-25 03:00

Topics covered in this episode: <ul> <li><a href="https://talkpython.fm/blog/posts/talk-python-rewritten-in-quart-async-flask/?featured_on=pythonbytes">Talk Python rewritten in Quart</a></li> <li><a href="https://blog.pypi.org/posts/2024-11-14-pypi-now-supports-digital-attestations/?featured_on=pythonbytes">PyPI now supports digital attestations</a></li> <li><a href="https://github.com/LilyFoote/django-rusty-templates?featured_on=pythonbytes">Django Rusty Templates</a></li> <li><a href="https://discuss.python.org/t/pep-639-round-3-improving-license-clarity-with-better-package-metadata/53020/128?featured_on=pythonbytes">PEP 639 is now supported by PYPI</a></li> <li>Extras</li> <li>Joke</li> </ul><a href='https://www.youtube.com/watch?v=Qk1b0AiZCYo' style='font-weight: bold;'data-umami-event="Livestream-Past" data-umami-event-episode="411">Watch on YouTube</a> About the show Sponsored by us! Support our work through: <ul> <li>Our <a href="https://training.talkpython.fm/?featured_on=pythonbytes">courses at Talk Python Training</a></li> <li><a href="https://courses.pythontest.com/p/the-complete-pytest-course?featured_on=pythonbytes">The Complete pytest Course</a></li> <li><a href="https://www.patreon.com/pythonbytes">Patreon Supporters</a></li> </ul> Connect with the hosts <ul> <li>Michael: <a href="https://fosstodon.org/@mkennedy">@mkennedy@fosstodon.org</a> / <a href="https://bsky.app/profile/mkennedy.codes?featured_on=pythonbytes">@mkennedy.codes</a></li> <li>Brian: <a href="https://fosstodon.org/@brianokken">@brianokken@fosstodon.org</a> / <a href="https://bsky.app/profile/brianokken.bsky.social?featured_on=pythonbytes">@brianokken.bsky.social</a></li> <li>Show: <a href="https://fosstodon.org/@pythonbytes">@pythonbytes@fosstodon.org</a> / <a href="https://bsky.app/profile/pythonbytes.bsky.social">@pythonbytes.bsky.social</a></li> </ul> Join us on YouTube at <a href="https://pythonbytes.fm/stream/live">pythonbytes.fm/live</a> to be part of the audience. Usually Monday at 10am PT. Older video versions available there too. Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form? Add your name and email to <a href="https://pythonbytes.fm/friends-of-the-show">our friends of the show list</a>, we'll never share it. Michael #1: <a href="https://talkpython.fm/blog/posts/talk-python-rewritten-in-quart-async-flask/?featured_on=pythonbytes">Talk Python rewritten in Quart</a> <ul> <li>Rewrote all of <a href="https://talkpython.fm/?featured_on=pythonbytes">talkpython.fm</a> in <a href="https://quart.palletsprojects.com/en/latest/?featured_on=pythonbytes">Quart</a> (10k lines of code total, 4k changed)</li> <li>Considered <ul> <li>FastAPI</li> <li>Litestar</li> <li>Django</li> <li>Hugo Static Site + Python</li> <li>Flask</li> </ul></li> <li>Discussed the multistage upgrade / conversion process</li> <li>Automating tests for all 1,000 pages</li> </ul> Brian #2: <a href="https://blog.pypi.org/posts/2024-11-14-pypi-now-supports-digital-attestations/?featured_on=pythonbytes">PyPI now supports digital attestations</a> <ul> <li>Dustin Ingram</li> <li>“Attestations provide a verifiable link to an upstream source repository: By signing with the identity of the upstream source repository, such as in the case of an upload of a project built with GitHub Actions, PyPI's support for digital attestations defines a strong and verifiable association between a file on PyPI and the source repository, workflow, and even the commit hash that produced and uploaded the file. Additionally, publishing attestations to a transparency log helps mitigate against both compromise of PyPI and compromise of the projects themselves.”</li> <li>For maintainers <ul> <li>If using GH Actions and Trusted Publishing <ul> <li>make sure you use pypa/gh-action-pypi-publish, version v1.11.0 or newer</li> <li>that’s it</li> </ul></li> <li>If not <ul> <li>“Support for automatic attestation generation and publication from other Trusted Publisher environments <a href="https://github.com/pypi/warehouse/issues/17001?featured_on=pythonbytes">is planned</a>.”</li> <li>“While not recommended, maintainers can also <a href="https://docs.pypi.org/attestations/producing-attestations/#the-manual-way">manually</a> <a href="https://docs.pypi.org/attestations/producing-attestations/#the-manual-way">generate and publish attestations</a>.”</li> </ul></li> </ul></li> <li>See also <ul> <li><a href="https://socket.dev/blog/pypi-introduces-digital-attestations?featured_on=pythonbytes">PyPI Introduces Digital Attestations to Strengthen Python Package Security</a> <ul> <li>by Sarah Gooding</li> </ul></li> <li><a href="https://trailofbits.github.io/are-we-pep740-yet/?featured_on=pythonbytes">Are we PEP 740 yet?</a></li> </ul></li> </ul> Michael #3: <a href="https://github.com/LilyFoote/django-rusty-templates?featured_on=pythonbytes">Django Rusty Templates</a> <ul> <li>by Lily Foote</li> <li>An experimental reimplementation of Django's templating language in Rust.</li> <li>Goals <ul> <li>100% compatibility of rendered output.</li> <li>Error reporting that is at least as useful as Django's errors.</li> <li>Improved performance over Django's pure Python implementation.</li> </ul></li> </ul> Brian #4: <a href="https://discuss.python.org/t/pep-639-round-3-improving-license-clarity-with-better-package-metadata/53020/128?featured_on=pythonbytes">PEP 639 is now supported by PYPI</a> <ul> <li><a href="https://bsky.app/profile/snarky.ca/post/3laudpvnabc27?featured_on=pythonbytes">from Brett Cannon</a></li> <li><a href="https://peps.python.org/pep-0639/?featured_on=pythonbytes">PEP 639 – Improving License Clarity with Better Package Metadata</a></li> <li>For project metadata, use these fields: license and license-files:</li> <li>Examples license field <pre><code>[project] license = "MIT" [project] license = "MIT AND (Apache-2.0 OR BSD-2-clause)" [project] license = "MIT OR GPL-2.0-or-later OR (FSFUL AND BSD-2-Clause)" [project] license = "LicenseRef-Proprietary" </code></pre></li> <li>Examples of <code>license-files</code>: <pre><code>[project] license-files = ["LICEN[CS]E*", "AUTHORS*"] [project] license-files = ["licenses/LICENSE.MIT", "licenses/LICENSE.CC0"] [project] license-files = ["LICENSE.txt", "licenses/*"] [project] license-files = [] </code></pre></li> </ul> Extras Brian: <ul> <li>Playground Wisdom: Threads Beat Async/Await - interesting read from Armin Ronacher about different language abstractions around concurrency.</li> <li>PythonTest.com Discord community is now live <ul> <li>Launched last week, as of this morning we’ve got 89 members</li> <li>Anyone already a pythontest community member has received an invite</li> <li>Anyone can join through courses.pythontest.com</li> </ul></li> <li>Everything at pythontest.com is 20% off through Dec with code turkeysale2024</li> <li>“Python Testing with pytest” eBook 40% off through Dec 2, use code turkeysale2024</li> </ul> Michael: <ul> <li>Python 3.14.0a2 released</li> <li>Starter packs: <ul> <li>Michael’s Python people: https://bsky.app/starter-pack/mkennedy.codes/3lbdnupl26e2x </li> <li>Directory: https://blueskydirectory.com/starter-packs/all</li> </ul></li> </ul> Joke: <a href="https://www.youtube.com/watch?v=atcqMWqB3hw">curl - heavy metal style</a>!

Categories: FLOSS Project Planets

Zato Blog: SSH commands as API microservices

Mon, 2024-11-25 03:00

SSH commands as API microservices 2024-11-25, by Dariusz Suchojad

This is a quick guide on how to turn SSH commands into a REST API service. The use-case may be remote administration of devices or equipment that does not offer a REST interface or making sure that access to SSH commands is restricted to selected external REST-based API clients only.

Python

The first thing needed is code of the service that will connect to SSH servers. Below is a service doing just that - it receives name of the command to execute and host to run in on, translating stdout and stderr of SSH commands into response documents which Zato in turn serializes to JSON.

# -*- coding: utf-8 -*- # stdlib from traceback import format_exc # Zato from zato.server.service import Service class SSHInvoker(Service): """ Accepts an SSH command to run on a remote host and returns its output to caller. """ # A list of elements that we expect on input input = 'host', 'command' # A list of elements that our responses will contain output = 'is_ok', 'cid', '-stdout', '-stderr' def handle(self): # Local aliases host = self.request.input.host command = self.request.input.command # Correlation ID is always returned self.response.payload.cid = self.cid try: # Build the full command full_command = f'ssh {host} {command}' # Run the command and collect output output = self.commands.invoke(full_command) # Assign both stdout and stderr to response self.response.payload.stdout = output.stdout self.response.payload.stderr = output.stderr except Exception: # Catch any exception and log it self.logger.warn('Exception caught (%s), e:`%s', self.cid, format_exc()) # Indicate an error self.response.payload.is_ok = False else: # Everything went fine self.response.payload.is_ok = True Dashboard

In the Zato Dashboard, let's go ahead and create an HTTP Basic Auth definition that a remote API client will authenticate against:

Now, the SSH service can be mounted on a newly created REST channel - note the security definition used and that data format is set to JSON. We can skip all the other details such as caching or rate limiting, for illustration purposes, this is not needed.

Usage

At this point, everything is ready to use. We could make it accessible to external API clients but, for testing purposes, let's simply invoke our SSH API gateway service from the command line:

$ curl "api:password@localhost:11223/api/ssh" -d \ '{"host":"localhost", "command":"uptime"}' { "is_ok": true, "cid": "27406f29c66c2ab6296bc0c0", "stdout": " 09:45:42 up 37 min, 1 user, load average: 0.14, 0.27, 0.18\n"} $ Note that, at this stage, the service should be used in trusted environments only, e.g. it will run any command that it is given on input which means that in the next iteration it could be changed to only allow commands from an allow-list, rejecting anything that is not recognized.

And this completes it - the service is deployed and made accessible via a REST channel that can be invoked using JSON. Any command can be sent to any host and their output will be returned to API callers in JSON responses.

More resources

➤ Python API integration tutorial
➤ What is an integration platform?
➤ Python Integration platform as a Service (iPaaS)
➤ What is an Enterprise Service Bus (ESB)? What is SOA?

Wingware: Wing Python IDE Version 10.0.7 - November 25, 2024

Sun, 2024-11-24 20:00

This minor release reduces Python 3.12+ debugger overhead and improves Python code analysis.

See the change log for details.

Download Wing 10 Now: Wing Pro | Wing Personal | Wing 101 | Compare Products

What's New in Wing 10

AI Assisted Development

Wing Pro 10 takes advantage of recent advances in the capabilities of generative AI to provide powerful AI assisted development, including AI code suggestion, AI driven code refactoring, description-driven development, and AI chat. You can ask Wing to use AI to (1) implement missing code at the current input position, (2) refactor, enhance, or extend existing code by describing the changes that you want to make, (3) write new code from a description of its functionality and design, or (4) chat in order to work through understanding and making changes to code.

Examples of requests you can make include:

"Add a docstring to this method" "Create unit tests for class SearchEngine" "Add a phone number field to the Person class" "Clean up this code" "Convert this into a Python generator" "Create an RPC server that exposes all the public methods in class BuildingManager" "Change this method to wait asynchronously for data and return the result with a callback" "Rewrite this threaded code to instead run asynchronously"

Yes, really!

Your role changes to one of directing an intelligent assistant capable of completing a wide range of programming tasks in relatively short periods of time. Instead of typing out code by hand every step of the way, you are essentially directing someone else to work through the details of manageable steps in the software development process.

Support for Python 3.12, 3.13, and ARM64 Linux

Wing 10 adds support for Python 3.12 and 3.13, including (1) faster debugging with PEP 669 low impact monitoring API, (2) PEP 695 parameterized classes, functions and methods, (3) PEP 695 type statements, and (4) PEP 701 style f-strings.

Wing 10 also adds support for running Wing on ARM64 Linux systems.

Poetry Package Management

Wing Pro 10 adds support for Poetry package management in the New Project dialog and the Packages tool in the Tools menu. Poetry is an easy-to-use cross-platform dependency and package manager for Python, similar to pipenv.

Ruff Code Warnings & Reformatting

Wing Pro 10 adds support for Ruff as an external code checker in the Code Warnings tool, accessed from the Tools menu. Ruff can also be used as a code reformatter in the Source > Reformatting menu group. Ruff is an incredibly fast Python code checker that can replace or supplement flake8, pylint, pep8, and mypy.

Try Wing 10 Now!

Wing 10 is a ground-breaking new release in Wingware's Python IDE product line. Find out how Wing 10 can turbocharge your Python development by trying it today.

Downloads: Wing Pro | Wing Personal | Wing 101 | Compare Products

See Upgrading for details on upgrading from Wing 9 and earlier, and Migrating from Older Versions for a list of compatibility notes.

Categories: FLOSS Project Planets

Hugo van Kemenade: A surprising thing about PyPI's BigQuery data

Sun, 2024-11-24 15:45

You can get download numbers for PyPI packages (or projects) from a Google BigQuery dataset. You need a Google account and credentials, and Google gives 1 TiB of free quota per month.

Each month, I have automation to fetch the download numbers for the 8,000 most popular packages over the past 30 days, and make it available as more accessible JSON and CSV files at Top PyPI Packages. This data is widely used for research in academia and industry.

However, as more packages and releases are uploaded to PyPI, and there are more and more downloads logged, the amount of billed data increases too.

This chart shows the amount of data billed per month.

At first, I was only collecting downloads data for 4,000 packages, and it was fetched for two queries: downloads over 365 days and over 30 days. But as time passed, it started using up too much quota to download data for 365 days.

So I ditched the 365-day data, and increased the 30-day data from 4,000 to 5,000 packages. Later, I checked how much quota was being used and increased from 5,000 packages to 8,000 packages.

But then I exceeded the BigQuery monthly quota of 1 TiB fetching data for July 2024.

To fetch the missing data and investigate what's going in, I started Google Cloud's 90-day, $300 (€277.46) free-trial 💸

Here's what I found!

Finding: it costs more to get data for downloads from only pip than from all installers

I use the pypinfo client to help query BigQuery. By default, it only fetches downloads for pip.

Only pip

This command gets one day's download data for the top 10 packages, for pip only:

$ pypinfo --limit 10 --days 1 "" project Served from cache: False Data processed: 58.21 GiB Data billed: 58.21 GiB Estimated cost: $0.29

Results:

project download count boto3 37,251,744 aiobotocore 16,252,824 urllib3 16,243,278 botocore 15,687,125 requests 13,271,314 s3fs 12,865,055 s3transfer 12,014,278 fsspec 11,982,305 charset-normalizer 11,684,740 certifi 11,639,584 Total 158,892,247 All installers

Adding the --all flag gets one day's download data for the top 10 packages, for all installers:

$ pypinfo --all --limit 10 --days 1 "" project Served from cache: False Data processed: 46.63 GiB Data billed: 46.63 GiB Estimated cost: $0.23 project download count boto3 39,495,624 botocore 17,281,187 urllib3 17,225,121 aiobotocore 16,430,826 requests 14,287,965 s3fs 12,958,516 charset-normalizer 12,781,405 certifi 12,647,098 setuptools 12,608,120 idna 12,510,335 Total 168,226,197

So we can see the default pip-only costs an extra 25% data processed and data billed, and costs an extra 25% in dollars.

Unsurprisingly, the actual download counts are higher for all installers. The ranking has changed a bit, but I expect we're still getting more-or-less the same packages in the top thousands of results.

Queries

It sends a query like this to BigQuery for only pip:

SELECT file.project as project, COUNT(*) as download_count, FROM `bigquery-public-data.pypi.file_downloads` WHERE timestamp BETWEEN TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL -2 DAY) AND TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL -1 DAY) AND details.installer.name = "pip" GROUP BY project ORDER BY download_count DESC LIMIT 10

And for all installers:

SELECT file.project as project, COUNT(*) as download_count, FROM `bigquery-public-data.pypi.file_downloads` WHERE timestamp BETWEEN TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL -2 DAY) AND TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL -1 DAY) GROUP BY project ORDER BY download_count DESC LIMIT 10

These queries are the same, except the default has an extra AND details.installer.name = "pip" condition. It seems reasonable it would cost more to do extra filtering work.

Installers

Let's look at the installers:

$ pypinfo --all --limit 100 --days 1 "" installer Served from cache: False Data processed: 29.49 GiB Data billed: 29.49 GiB Estimated cost: $0.15 installer name download count pip 1,121,198,711 uv 117,194,833 requests 29,828,272 poetry 23,009,454 None 8,916,745 bandersnatch 6,171,555 setuptools 1,362,797 Bazel 1,280,271 Browser 1,096,328 Nexus 593,230 Homebrew 510,247 Artifactory 69,063 pdm 62,904 OS 13,108 devpi 9,530 conda 2,272 pex 194 Total 1,311,319,514

pip still by far the most popular, and unsurprising uv is up there too, with about 10% of pip's downloads.

The others are about 25% or less of uv. A lot of them are mirroring services that we wanted to exclude before.

I think given uv's importance, and my expectation that it will continue to take a bigger share of the pie, plus especially the extra cost for filtering by just pip, means that we should switch to fetching data for all downloaders. Plus the others don't account for that much of the pie.

Finding: the number of packages doesn't affect the cost

This was the biggest surprise. Earlier I'd been increasing or decreasing the number to try and remain under quota. But it turns out it makes no difference how many packages you query!

I fetched data for just one day and all installers for different package limits: 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000. Sample query:

SELECT file.project as project, COUNT(*) as download_count, FROM `bigquery-public-data.pypi.file_downloads` WHERE timestamp BETWEEN TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL -2 DAY) AND TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL -1 DAY) GROUP BY project ORDER BY download_count DESC LIMIT 8000

Result: Interestingly, the cost is the same for all limits (1000-8000): $0.31.

Repeating with one day but filtering for pip only:

Result: Cost increased to $0.39 but again the same for all limits.

Let's repeat with all installers, but for 30 days, and this time query in decreasing limits, in case we were only paying for incremental changes: 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1000:

Result: Again, the cost is the same regardless of package limit: $4.89 per query.

Well then, let's repeat with the limit increasing by powers of ten, up to 1,000,000! This last one fetches data for all 531,022 packages on PyPI:

limit projects count estimated cost bytes billed bytes processed 1 1 0.20 43,447,746,560 43,447,720,943 10 10 0.20 43,447,746,560 43,447,720,943 100 100 0.20 43,447,746,560 43,447,720,943 1000 1,000 0.20 43,447,746,560 43,447,720,943 8000 8,000 0.20 43,447,746,560 43,447,720,943 10000 10,000 0.20 43,447,746,560 43,447,720,943 100000 100,000 0.20 43,447,746,560 43,447,720,943 1000000 531,022 0.20 43,447,746,560 43,447,720,943

Result: Again, same cost, whether for 1 package or 531,022 packages!

Finding: the number of days affects the cost

No surprise. I'd earlier noticed 365 days too took much quota, and I could continue with 30 days.

Here's the estimated cost and bytes billed (for one package, all installers) between one and 30 days (f"pypinfo --all --json --indent 0 --days {days} --limit 1 '' project"), showing a roughly linear increase:

Conclusion

It doesn't matter how many packages I fetch data for, I might as well fetch all and make it available to everyone, depending on the size of the data file. It will make sense to still offer a smaller file with 8,000 or so packages: often you just need a large-ish yet manageable number.
It costs more to filter for only downloads from pip, so I've switched to fetching data for all installers.
The number of days affects the cost, so I will need to decrease this in the future to stay within quota. For example, at some point I may need to switch from 30 to 25 days, and later from 25 to 20 days.

More details from the investigation, the scripts and data files can be found at
hugovk/top-pypi-packages#36.

And let me know if you know any tricks to reduce costs!

Header photo: "The Balancing Rock, Stonehenge, Near Glen Innes, NSW" by the Royal Australian Historical Society, with no known copyright restrictions.

Categories: FLOSS Project Planets

Django Weblog: 2024 Malcolm Tredinnick Memorial Prize awarded to Rachell Calhoun

Sun, 2024-11-24 13:39

This year it was hard to decide, and we wanted to also show who else got nominated, because they also deserve recognition, so it took a bit longer than we expected.

The Django Software Foundation Board is pleased to announce that the 2024 Malcolm Tredinnick Memorial Prize has been awarded to Rachell Calhoun.

Rachell Calhoun is an influential figure within the Django community, well known for being cheerful and always willing to help others. She consistently empowers folks behind the scenes.

Rachell got her start in the Django community through a Django Girls Seoul event. Being an educator, she started organizing Django Girls Seoul events. Her contributions to Django Girls Seoul and Django Girls Grand Rapids exemplify her commitment to sharing knowledge, spreading Django and lifting others up. Rachell is now a trustee for Django Girls +, contributing to its mission of helping women and other underrepresented groups around the world learn programming with Django.

In 2022, Rachell co-founded Djangonaut Space, an initiative designed to support new contributors to the Django ecosystem, encouraging leadership and growth. Rachell’s willingness to help people achieve their goals and celebrate their achievements has been imprinted in Djangonaut Space’s culture. Rachell and Djangonaut Space have done a stellar job on helping people become contributors and Django community members.

Her commitment to fostering diversity and inclusion extends beyond her organizational work; she has volunteered at multiple DjangoCon US events, bringing her welcoming and inclusive spirit to the community. A long-time volunteer and speaker at DjangoCon US and DjangoCon Europe from 2016 to 2024, she has shared her expertise and insights on various topics related to Django and web development.

Rachell has contributed to Django for many years, she has been instrumental in creating spaces where people of all backgrounds can thrive, making her a beloved and respected member of the global Django ecosystem.

Some quotes from the thirteen people who nominated Rachell had this to say about her:

Rachell advocates for others constantly through sponsorship, inclusivity, and connection. She is extremely empathic and seeks to not only welcome others in, but to actively bring them into the group.

She has been one of the core members of Djangonaut Space which has done a lot for bringing new contributors into the Django community. This program has done a lot to excite and energize the Django community and Rachell is one of the major reasons why. --
Throughout her career she's been involved in Django Girls starting about a decade ago in South Korea. She was a major organizer of the Grand Rapids, MI branch, before moving into the trustee role she occupies now.

Rachell is one of my favorite people and she's been doing an excellent job at growing Django and helping others feel more welcome here. Rachell is an excellent choice for the Malcolm Tredinnick 2024 award!

— Tim Shilling

Rachell is an extremely skillful leader who is always nurturing newcomers into leaders. She has been pivotal to my experience with the Djangonaut Space Program.
I started out as a nervous Djangonaut who didn’t schedule my 1:1s until Rachell checked in with me and made sure I knew the program was a safe space to discuss anything.

When I joined the program organizers as a Navigator Coordinator, I was initially much more of a follower. Rachell knew to step back while continuing to provide her support, so I could step into the leadership role and get my job done.

Rachell shows people that she believes in them. She does this in a friendly, gentle, and encouraging manner. She never forces anyone to make decisions that they don’t feel comfortable with. The community is really lucky to have Rachell.

— Lilian

Rachell Calhoun, one of the organizers and founders of Djangonaut Space, has been an open, supportive, and educational help on my Django journey. Her contributions to the Djangonaut Space program are invaluable—a program I hold quite dearly as a cornerstone of my technical interactions and growth.

Rachell's ideals of nurturing and guiding have shone through the program, for which I am grateful. Encouraging wonderful conversations, organizing and fostering mentorship, and being a great person!

I believe Rachell is an embodiment of the Malcolm Tredinnick spirit and am confident that should she win the prize, she would go on to create more impact for the Django community and the world at large.

— Emmanuel Katchy

Other nominations for this year included:

Anna Makarudze, Fundraising Coordinator at Django Girls+ Foundation, chair of the first DjangoCon Africa, previously served the DSF board as president.

Benjamin Balder Bach, chair of the DSF social media working group, organizer of Django Day Copenhagen for many years.

Black Python Devs, community founded by Jay Miller, to increase diversity and inclusion of typically underrepresented people.

Bhuvnesh Sharma, co-chair of the DSF social media working group, and co-founded and organized Django India.

Carlton Gibson, previously a Django fellow, co-host of Django Chat, volunteers in DjangoCon Europe and provides useful advice in forum and discord.

Christoph Bulter, active helper of the official and unofficial Django Discord.

Django Girls+, a non-profit organization and a community that empowers and helps women to organize free, one-day programming workshops by providing tools, resources and support.

Django Discord moderators and helpers, which are moderating the discord and provide help to keep the place welcoming and inclusive to everyone.

Daniel Moran, active contributor in various open-source projects, including django-tasks-scheduler. He is an administrator of the Django Commons organization.

Ester Beltrami, PyCon Italia and Django London organizer, is also a volunteer and a speaker in events such as EuroPython or DjangoCon Europe.

Felipe de Morais, co-founder of AfroPython, participant of Djangonaut Space program, organized and advised multiple Django Girls workshops across Brazil and Chile.

Jake Howard, speaker and contributor to Django, known for his work around background tasks.

Matt Westcott, frequent speaker and lead the development of Wagtail.

Russel Keith-Magee, python core contributor and previously Django core contributor and also served in the DSF board as President.

Ryan Cheley, django contributor and mentor (navigator) in Djangonaut Space program.

Simon Charette, long-time django contributor, previously member of the Django 5.x steering council

Sheena O’Connell, frequent speaker and DjangoCon Africa organizer.

Tom Carrick, Django Accessibility team creator and member, django contributor for many years and mentor (navigator) in Djangonaut Space program.

Tim Schilling, DEFNA secretary, DjangoCon Us organizer and co-founder of Djangonaut Space.

Will Vincent, former board member of the DSF, co-host of Django Chat and co-writer of Django News.

Each year we receive many nominations, and it is always hard to pick the winner. This year, as always, we received many nominations for the Malcolm Tredinnick Memorial Prize with some being nominated multiple times. Some have been nominated in multiple years. If your nominee didn’t make it this year, you can always nominate them again next year.

Malcolm would be very proud of the legacy he has fostered in our community!

Congratulations Rachell on the well-deserved honor!

Categories: FLOSS Project Planets

Django Weblog: DjangoCon Europe 2026 call for organizers completed

Sun, 2024-11-24 11:57

The DjangoCon Europe 2026 call for organizers is now over. We’re elated to report we received three viable proposals, a clear improvement over recent years.

We’ll let the successful team decide when and how to make their announcement, but in the meantime – thank you to everyone who took part in this process ❤️ We’re elated to have such a strong community in Europe. And for now, look forward to DjangoCon Europe 2025 in Dublin, Ireland! 🍀

What about 2027?

We’re not ready to plan that yet, but if you’re interested in organizing – take a moment to add your name and email to our DjangoCon Europe 2027 expression of interest form. We’ll make sure to reach out once the time is right.

Categories: FLOSS Project Planets

Search form

Tag cloud

Planet Python

Programiz: Python List

Bruno Ponne / Coding The Past: How to calculate Z-Scores in Python

Real Python: Continuous Integration and Deployment for Python With GitHub Actions

PyPodcats: Episode 7: With Anna Makarudze

Django Weblog: Django 6.x Steering Council Candidate Registration

Brett Cannon: What the PSF Conduct WG does

Kay Hayen: Nuitka Release 2.5

EuroPython Society: List of EPS Board Candidates for 2024/2025

PyCoder’s Weekly: Issue #657 (Nov. 26, 2024)

Real Python: Managing Dependencies With Python Poetry

Armin Ronacher: Constraints are Good: Python's Metadata Dilemma

Mike Driscoll: Black Friday Python Deals 2024

Trey Hunner: New Python Jumpstart course

Real Python: Speed Up Your Python Program With Concurrency

Python Bytes: #411 TLS Client: Hello <<guitar solo>>

Zato Blog: SSH commands as API microservices

Wingware: Wing Python IDE Version 10.0.7 - November 25, 2024

Hugo van Kemenade: A surprising thing about PyPI's BigQuery data

Django Weblog: 2024 Malcolm Tredinnick Memorial Prize awarded to Rachell Calhoun

Django Weblog: DjangoCon Europe 2026 call for organizers completed

Pages

Recent Publications

FLOSS Project Planets

FLOSS Research

Search form

Tag cloud

You are here

Planet Python

Pages

Recent Publications

FLOSS Project Planets

FLOSS Research