Planet Python

Planet Python - http://planetpython.org/

TechBeamers Python: Pandas GroupBy() and Count() Explained With Examples

Thu, 2024-01-25 10:04

Pandas groupby() and count() work in combination and are valuable in various data analysis scenarios. The groupby() function groups a DataFrame by one or more columns, and count() counts the occurrences in each group. Combined, they provide a convenient way to perform group-wise counting […]
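A minimal sketch of that combination (the column names and values here are invented for illustration):

import pandas as pd

df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Bergen"],
    "product": ["tea", "coffee", "tea"],
})

# count() reports the number of non-null values per remaining column in each group
print(df.groupby("city").count())

# size() is a common companion that reports the number of rows per group
print(df.groupby("city").size())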

The post Pandas GroupBy() and Count() Explained With Examples appeared first on TechBeamers.

Categories: FLOSS Project Planets

TechBeamers Python: Top Important Terms in Python Programming With Examples

Thu, 2024-01-25 03:40

In this tutorial, we have captured the important terms used in Python programming. If you are learning Python, it is good to be aware of the different programming concepts and jargon related to Python. Please note that these terms form the foundation of Python programming, and a solid understanding of them is essential for effective development […]

The post Top Important Terms in Python Programming With Examples appeared first on TechBeamers.

Categories: FLOSS Project Planets

Glyph Lefkowitz: The Macintosh

Thu, 2024-01-25 01:31

Today is the 40th anniversary of the announcement of the Macintosh. Others have articulated compelling emotional narratives that easily eclipse my own similar childhood memories of the Macintosh family of computers. So instead, I will ask a question:

What is the Macintosh?

As this is the anniversary of the beginning, that is where I will begin. The original Macintosh, the classic MacOS, the original “System Software” are a shining example of “fake it till you make it”. The original Mac operating system was fake.

Don’t get me wrong, it was an impressive technical achievement to fake something like this, but what Steve Jobs did was to see a demo of a Smalltalk-76 system, an object-oriented programming environment with 1-to-1 correspondences between graphical objects on screen and runtime-introspectable data structures, a self-hosting high level programming language, memory safety, message passing, garbage collection, and many other advanced facilities that would not be popularized for decades, and make a fake version of it which ran on hardware that consumers could actually afford, by throwing out most of what made the programming environment interesting and replacing it with a much more memory-efficient illusion implemented in 68000 assembler and Pascal.

The machine’s RAM didn’t have room for a kernel. Whatever application was running was in control of the whole system. No protected memory, no preemptive multitasking. It was a house of cards that was destined to collapse. And collapse it did, both in the short term and the long. In the short term, the system was buggy and unstable, and application crashes resulted in system halts and reboots.

In the longer term, the company based on the Macintosh effectively went out of business and was reverse-acquired by NeXT, but they kept the better-known branding of the older company. The old operating system was gradually disposed of, quickly replaced at its core with a significantly more mature generation of operating system technology based on BSD UNIX and Mach. With the removal of Carbon compatibility 4 years ago, the last vestigial traces of it were removed. But even as early as 2004 the Mac was no longer really the Macintosh.

What NeXT had built was much closer to the Smalltalk system that Jobs was originally attempting to emulate. Its programming language, “Objective C”, explicitly called back to Smalltalk’s message-passing, right down to the syntax. Objects on the screen now did correspond to “objects” you could send messages to. The development environment understood this too; that was a major selling point.

The NeXTSTEP operating system and Objective C runtime did not have garbage collection, but they offered a similar developer experience by building reference counting into the object model throughout. The original vision was finally achieved, for real, and that’s what we have on our desks and in our backpacks today (and in our pockets, in the form of the iPhone, which is in some sense a tiny next-generation NeXT computer itself).

The one detail I will relate from my own childhood is this: my first computer was not a Mac. My first computer, as a child, was an Amiga. When I was 5, I had a computer with 4096 colors, real multitasking, 3D graphics, and a paint program that could draw hard real-time animations with palette tricks. Then the writing was on the wall for Commodore and I got a computer which had 256 colors, a bunch of old software that was still black and white, an operating system that would freeze if you held down the mouse button on the menu bar and couldn’t even play animations smoothly. Many will relay their first encounter with the Mac as a kind of magic, but mine was a feeling of loss and disappointment. Unlike almost everyone at the time, I knew what a computer really could be, and despite many pleasant and formative experiences with the Macintosh in the meanwhile, it would be a decade before I saw a real one again.

But this is not to deride the faking. The faking was necessary. Xerox was not going to put an Alto running Smalltalk on anyone’s desk. People have always grumbled that Apple products are expensive, but in 2024 dollars, one of these Xerox computers cost roughly $55,000.

The Amiga was, in its own way, a similar sort of fake. It managed its own miracles by putting performance-critical functions into dedicated hardware, which quickly became obsolete as software technology evolved at a much faster pace.

Jobs is celebrated as a genius of product design, and he certainly wasn’t bad at it, but I had the rare privilege of seeing the homework he was cribbing from in that subject, and in my estimation he was a B student at best. Where he got an A was bringing a vision to life by creating an organization, both inside and outside of his companies.

If you want a culture-defining technological artifact, everybody in the culture has to be able to get their hands on one. This doesn’t just mean that the builder has to be able to build it. The buyer also has to be able to afford it, obviously. Developers have to be able to develop for it. The buyer has to actually want it; the much-derided “marketing” is a necessary part of the process of making a product what it is. Everyone needs to be able to move together in the direction of the same technological future.

This is why it was so fitting that Tim Cook was made Jobs's successor. The supply chain was the hard part.

The crowning, final achievement of Jobs’s career was the fact that not only did he fake it — the fakes were flying fast and thick at that time in history, even if they mostly weren’t as good — it was that he faked it and then he built the real version and then he bridged the transitions to get to the real thing.

I began here by saying that the Mac isn’t really the Mac, and speaking in terms of a point-in-time analysis that is true. Its technology today has practically nothing in common with its technology in 1984. This is not merely an artifact of the length of time involved: the technology at the core of various UNIXes in 1984 bears a strong resemblance to UNIX-like operating systems today [1]. But looking across its whole history from 1984 to 2024, there is undeniably a continuity to the conceptual “Macintosh”.

Not just as a user, but as a developer moving through time rather than looking at just a few points: the “Macintosh”, such as it is, has transitioned from the Motorola 68000 to the PowerPC to Intel 32-bit to Intel 64-bit to ARM. From obscurely proprietary to enthusiastically embracing open source and then, sadly, much of the way back again. It moved from black and white to color, from desktop to laptop, from Carbon to Cocoa, from Display PostScript to Display PDF, all the while preserving instantly recognizable iconic features like the apple menu and the cursor pointer, while providing developers documentation and SDKs and training sessions that helped them transition their apps through multiple near-complete rewrites as a result of all of these changes.

To paraphrase Abigail Thorne’s first video about Identity, identity is what survives. The Macintosh is an interesting case study in the survival of the idea of a platform, as distinct from the platform itself. It is the Computer of Theseus, a thought experiment successfully brought to life and sustained over time.

If there is a personal lesson to be learned here, I’d say it’s that one’s own efforts need not be perfect. In fact, a significantly flawed vision that you can achieve right now is often much, much better than a perfect version that might take just a little bit longer, if you don’t have the resources to actually sustain going that much longer [2]. You have to be bad at things before you can be good at them. Real artists, as Jobs famously put it, ship.

So my contribution to the 40th anniversary reflections is to say: the Macintosh is dead. Long live the Mac.

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support me on Patreon as well!

  1. including, ironically, the modern macOS. 

  2. And that is why I am posting this right now, rather than proofreading it further. 

Categories: FLOSS Project Planets

Glyph Lefkowitz: Unsigned Commits

Wed, 2024-01-24 19:29

I am going to tell you why I don’t think you should sign your Git commits, even though doing so with SSH keys is now easier than ever. But first, to contextualize my objection, I have a brief hypothetical for you, and then a bit of history from the evolution of security on the web.

It seems like these days, everybody’s signing all different kinds of papers.

Bank forms, permission slips, power of attorney; it seems like if you want to securely validate a document, you’ve gotta sign it.

So I have invented a machine that automatically signs every document on your desk, just in case it needs your signature. Signing is good for security, so you should probably get one, and turn it on, just in case something needs your signature on it.

We also want to make sure that verifying your signature is easy, so we will have them all notarized and duplicates stored permanently and publicly for future reference.

No? Not interested?

Hopefully, that sounded like a silly idea to you.

Most adults in modern civilization have learned that signing your name to a document has an effect. It is not merely decorative; the words in the document being signed have some specific meaning and can be enforced against you.

In some ways the metaphor of “signing” in cryptography is bad. One does not “sign” things with “keys” in real life. But here, it is spot on: a cryptographic signature can have an effect.

It should be an input to some software, one that is acted upon. Software does a thing differently depending on the presence or absence of a signature. If it doesn’t, the signature probably shouldn’t be there.

Consider the most venerable example of encryption and signing that we all deal with every day: HTTPS. Many years ago, browsers would happily display unencrypted web pages. The browser would also encrypt the connection, if the server operator had paid for an expensive certificate and correctly configured their server. If that operator messed up the encryption, it would pop up a helpful dialog box that would tell the user “This website did something wrong that you cannot possibly understand. Would you like to ignore this and keep working?” with buttons that said “Yes” and “No”.

Of course, these are not the precise words that were written. The words, as written, said things about “information you exchange” and “security certificate” and “certifying authorities” but “Yes” and “No” were the words that most users read. Predictably, most users just clicked “Yes”.

In the usual case, where users ignored these warnings, it meant that no user ever got meaningful security from HTTPS. It was a component of the web stack that did nothing but funnel money into the pockets of certificate authorities and occasionally present annoying interruptions to users.

In the case where the user carefully read and honored these warnings in the spirit they were intended, adding any sort of transport security to your website was a potential liability. If you got everything perfectly correct, nothing happened except the browser would display a picture of a small green purse. If you made any small mistake, it would scare users off and thereby directly harm your business. You would only want to do it if you were doing something that put a big enough target on your site that you became unusually interesting to attackers, or were required to do so by some contractual obligation like credit card companies.

Keep in mind that the second case here is the best case.

In 2016, the browser makers noticed this problem and started taking some pretty aggressive steps towards actually enforcing the security that HTTPS was supposed to provide, by fixing the user interface to do the right thing. If your site didn’t have security, it would be shown as “Not Secure”, a subtle warning that would gradually escalate in intensity as time went on, correctly incentivizing site operators to adopt transport security certificates. On the user interface side, certificate errors would be significantly harder to disregard, making it so that users who didn’t understand what they were seeing would actually be stopped from doing the dangerous thing.

Nothing fundamental [1] changed about the technical aspects of the cryptographic primitives or constructions being used by HTTPS in this time period, but socially, the meaning of an HTTP server signing and encrypting its requests changed a lot.

Now, let’s consider signing Git commits.

You may have heard that in some abstract sense you “should” be signing your commits. GitHub puts a little green “verified” badge next to commits that are signed, which is neat, I guess. They provide “security”. 1Password provides a nice UI for setting it up. If you’re not a 1Password user, GitHub itself recommends you put in just a few lines of configuration to do it with either a GPG, SSH, or even an S/MIME key.

But while GitHub’s documentation quite lucidly tells you how to sign your commits, its explanation of why is somewhat less clear. Their purse is the word “Verified”; it’s still green. If you enable “vigilant mode”, you can make the blank “no verification status” option say “Unverified”, but not much else changes.

This is like the old-style HTTPS verification “Yes”/“No” dialog, except that there is not even an interruption to your workflow. They might put the “Unverified” status on there, but they’ve gone ahead and clicked “Yes” for you.

It is tempting to think that the “HTTPS” metaphor will map neatly onto Git commit signatures. It was bad when the web wasn’t using HTTPS, and the next step in that process was for Let’s Encrypt to come along and for the browsers to fix their implementations. Getting your certificates properly set up in the meanwhile and becoming familiar with the tools for properly doing HTTPS was unambiguously a good thing for an engineer to do. I did, and I’m quite glad I did so!

However, there is a significant difference: signing and encrypting an HTTPS request is ephemeral; signing a Git commit is functionally permanent.

This ephemeral nature meant that errors in the early HTTPS landscape were easily fixable. Earlier I mentioned that there was a time where you might not want to set up HTTPS on your production web servers, because any small screw-up would break your site and thereby your business. But if you were really skilled and you could see the future coming, you could set up monitoring, avoid these mistakes, and rapidly recover. These mistakes didn’t need to badly break your site.

We can extend the analogy to HTTPS, but we have to take a detour into one of the more unpleasant mistakes in HTTPS’s history: HTTP Public Key Pinning, or “HPKP”. The idea with HPKP was that you could publish a record in an HTTP header where your site commits [2] to using certain certificate authorities for a period of time, where that period of time could be “forever”. Attackers gonna attack, and attack they did. Even without getting attacked, a site could easily commit “HPKP Suicide” where they would pin the wrong certificate authority with a long timeline, and their site was effectively gone for every browser that had ever seen those pins. As a result, after a few years, HPKP was completely removed from all browsers.

Git commit signing is even worse. With HPKP, you could easily make terrible mistakes with permanent consequences even though you knew the exact meaning of the data you were putting into the system at the time you were doing it. With signed commits, you are saying something permanently, but you don’t really know what it is that you’re saying.

Today, what is the benefit of signing a Git commit? GitHub might present it as “Verified”. It’s worth noting that only GitHub will do this, since they are the root of trust for this signing scheme. So, by signing commits and registering your keys with GitHub, you are, at best, helping to lock in GitHub as a permanent piece of infrastructure that is even harder to dislodge because they are not only where your code is stored, but also the arbiters of whether or not it is trustworthy.

In the future, what is the possible security benefit? If we all collectively decide we want Git to be more secure, then we will need to meaningfully treat signed commits differently from unsigned ones.

There’s a long tail of unsigned commits several billion entries long. And those are in the permanent record as much as the signed ones are, so future tooling will have to be able to deal with them. If, as stewards of Git, we wish to move towards a more secure Git, as the stewards of the web moved towards a more secure web, we do not have the option that the web did. In the browser, the meaning of a plain-text HTTP or incorrectly-signed HTTPS site changed, in order to encourage the site’s operator to change the site to be HTTPS.

In contrast, the meaning of an unsigned commit cannot change, because there are zillions of unsigned commits lying around in critical infrastructure and we need them to remain there. Commits cannot meaningfully be changed to become signed retroactively. Unlike an online website, they are part of a historical record, not an operating program. So we cannot establish the difference in treatment by changing how unsigned commits are treated.

That means that tooling maintainers will need to provide some difference in behavior that provides some incentive. With HTTPS, the binary choice was clear: don’t present sites with incorrect, potentially compromised configurations to users. The question was just how to achieve that. With Git commits, the difference in treatment of a “trusted” commit is far less clear.

If you will forgive me a slight straw-man here, one possible naive interpretation of a “trusted” signed commit is that it’s OK to run in CI. Conveniently, it’s not simply “trusted” in a general sense: if you signed it, it’s trusted to be from you, specifically. Surely it’s fine if we bill the CI costs for validating the PR that includes that signed commit to your GitHub account?

Now, someone can piggy-back off a 1-line typo fix that you made on top of an unsigned commit to some large repo, making you implicitly responsible for transitively signing all unsigned parent commits, even though you haven’t looked at any of the code.

Remember, also, that the only central authority that is practically trustable at this point is your GitHub account. That means that if you are using a third-party CI system, even if you’re using a third-party Git host, you can only run “trusted” code if GitHub is online and responding to requests for its “get me the trusted signing keys for this user” API. This also adds a lot of value to a GitHub credential breach, strongly motivating attackers to sneakily attach their own keys to your account so that their commits in unrelated repos can be “Verified” by you.

Let’s review the pros and cons of turning on commit signing now, before you know what it is going to be used for:

Pro:

  • Green “Verified” badge

Con:

  • Unknown, possibly unlimited future liability for the consequences of running code in a commit you signed
  • Further implicitly cementing GitHub as a centralized trust authority in the open source world
  • Introducing unknown reliability problems into infrastructure that relies on commit signatures
  • Temporary breach of your GitHub credentials now leads to potentially permanent consequences if someone can smuggle a new trusted key in there
  • New kinds of ongoing process overhead as commit-signing keys become new permanent load-bearing infrastructure, like “what do I do with expired keys”, “how often should I rotate these”, and so on

I feel like the “Con” column is coming out ahead.

That probably seemed like increasingly unhinged hyperbole, and it was.

In reality, the consequences are unlikely to be nearly so dramatic. The status quo has a very high amount of inertia, and probably the “Verified” badge will remain the only visible difference, except for a few repo-specific esoteric workflows, like pushing trust verification into offline or sandboxed build systems. I do still think that there is some potential for nefariousness around the “unknown and unlimited” dimension of any future plans that might rely on verifying signed commits, but any flaws are likely to be subtle attack chains and not anything flashy and obvious.

But I think that one of the biggest problems in information security is a lack of threat modeling. We encrypt things, we sign things, we institute rotation policies and elaborate useless rules for passwords, because we are looking for a “best practice” that is going to save us from having to think about what our actual security problems are.

I think the actual harm of signing git commits is to perpetuate an engineering culture of unquestioningly cargo-culting sophisticated and complex tools like cryptographic signatures into new contexts where they have no use.

Just from a baseline utilitarian philosophical perspective, for a given action A, all else being equal, it’s always better not to do A, because taking an action always has some non-zero opportunity cost even if it is just the time taken to do it. Epsilon cost and zero benefit is still a net harm. This is even more true in the context of a complex system. Any action taken in response to a rule in a system is going to interact with all the other rules in that system. You have to pay complexity-rent on every new rule. So an apparently-useless embellishment like signing commits can have potentially far-reaching consequences in the future.

Git commit signing itself is not particularly consequential. I have probably spent more time writing this blog post than the sum total of all the time wasted by all programmers configuring their git clients to add useless signatures; even the relatively modest readership of this blog will likely transfer more data reading this post than all those signatures will take to transmit to the various git clients that will read them. If I just convince you not to sign your commits, I don’t think I’m coming out ahead in the felicific calculus here.

What I am actually trying to point out here is that it is useful to carefully consider how to avoid adding junk complexity to your systems. One area where junk tends to leak in to designs and to cultures particularly easily is in intimidating subjects like trust and safety, where it is easy to get anxious and convince ourselves that piling on more stuff is safer than leaving things simple.

If I can help you avoid adding even a little bit of unnecessary complexity, I think it will have been well worth the cost of the writing, and the reading.

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support me on Patreon as well! I am also available for consulting work if you think your organization could benefit from expertise on topics such as “What else should I not apply a cryptographic signature to?”.

  1. Yes yes I know about heartbleed and Bleichenbacher attacks and adoption of forward-secret ciphers and CRIME and BREACH and none of that is relevant here, okay? Jeez. 

  2. Do you see what I did there. 

Categories: FLOSS Project Planets

Matt Layman: Payments Gateway - Building SaaS with Python and Django#181

Wed, 2024-01-24 19:00
In this episode, we continued with the Stripe integration. I worked on a new payments gateway interface to access the Stripe APIs needed for creating a checkout session. We hit some bumps along the way because of djstripe’s new preference for putting the Stripe keys into the database exclusively.
Categories: FLOSS Project Planets

Bruno Ponne / Coding The Past: Explore art with SQL and pd.read_sql_query

Wed, 2024-01-24 19:00


Greetings, humanists, social and data scientists!


Have you ever tried to load a large file in Python or R? When file sizes are on the order of gigabytes, you may run into performance problems, with your program taking an unusually long time to load the data. SQL, or Structured Query Language, is used to deal with larger data files stored in relational databases and is widely used in industry and even in research. Apart from being more efficient for preparing data, you might also encounter data sources whose main form of access is through SQL.


In this lesson you will learn how to use SQL in Python to retrieve data from a relational database of the National Gallery of Art (US). You will also learn how to use a relational database management system (RDBMS) and pd.read_sql_query to extract data from it in Python.



1. Data source

The database used in this lesson is made available by the National Gallery of Art (US) under a Creative Commons Zero license. The dataset contains data about more than 130,000 artworks and their artists, from the Middle Ages to the present day.


It is a wonderful resource to study history and art. Variables available include the title of the artwork, dimensions, author, description, location, country where it was produced, the year the artist started the work and the year he or she finished it. These variables are only some examples, but there is much more to explore.



2. Download and install PostgreSQL and pgAdmin

PostgreSQL is a free and very popular relational database management system. It stores and manages the tables contained in a database. Please consult this guide to install it on your computer.


After you install PostgreSQL, you will need to connect to the PostgreSQL database server. In this tutorial, we will be using the pgAdmin application to establish this connection. It is a visual and intuitive interface that makes many operations easier to execute. The guide above also walks you through the process of connecting to your local database. In the next steps, once connected to your local database server, we will learn how to create a database to store the National Gallery dataset.


3. Creating the database and its tables

After you are connected to the server, right-click “Databases” and choose “Create” and “Database…” as shown in the image below.



Next, give a title to your database as shown in the figure below. In our case, it will be called “art_db”. Click “Save” and it is all set!



With the database ‘art_db’ selected, click the ‘Query Tool’ as shown below.


This will open a field where you can type SQL code. Our objective is to create the first table of our database, which will contain the content of ‘objects.csv’ available in the GitHub account of the National Gallery of Art, provided in the Data section above.


To create a table, we must specify the name and the type of each variable in the table. The SQL command to create a table is quite intuitive: CREATE TABLE name_of_your_table. Copy the code below and paste it into the window opened by the ‘Query Tool’. The code specifies each variable of the objects table, which contains information on each artwork available in the collection.



CREATE TABLE objects (
    objectID integer NOT NULL,
    accessioned CHARACTER VARYING(32),
    accessionnum CHARACTER VARYING(32),
    locationid CHARACTER VARYING(32),
    title CHARACTER VARYING(2048),
    displaydate CHARACTER VARYING(256),
    beginyear integer,
    endyear integer,
    visualbrowsertimespan CHARACTER VARYING(32),
    medium CHARACTER VARYING(2048),
    dimensions CHARACTER VARYING(2048),
    inscription CHARACTER VARYING,
    markings CHARACTER VARYING,
    attributioninverted CHARACTER VARYING(1024),
    attribution CHARACTER VARYING(1024),
    provenancetext CHARACTER VARYING,
    creditline CHARACTER VARYING(2048),
    classification CHARACTER VARYING(64),
    subclassification CHARACTER VARYING(64),
    visualbrowserclassification CHARACTER VARYING(32),
    parentid CHARACTER VARYING(32),
    isvirtual CHARACTER VARYING(32),
    departmentabbr CHARACTER VARYING(32),
    portfolio CHARACTER VARYING(2048),
    series CHARACTER VARYING(850),
    volume CHARACTER VARYING(850),
    watermarks CHARACTER VARYING(512),
    lastdetectedmodification CHARACTER VARYING(64),
    wikidataid CHARACTER VARYING(64),
    customprinturl CHARACTER VARYING(512)
);


The last step is to load the data from the csv file into this table. This can be done through the ‘COPY’ command as shown below.



COPY objects (objectid, accessioned, accessionnum, locationid, title, displaydate,
              beginyear, endyear, visualbrowsertimespan, medium, dimensions,
              inscription, markings, attributioninverted, attribution,
              provenancetext, creditline, classification, subclassification,
              visualbrowserclassification, parentid, isvirtual, departmentabbr,
              portfolio, series, volume, watermarks, lastdetectedmodification,
              wikidataid, customprinturl)
FROM 'C:/temp/objects.csv'
DELIMITER ','
CSV HEADER;


Tip: Download the "objects.csv" file and save it in a folder of your choice. Note, however, that sometimes your system might block pgAdmin's access to this file, which is why I saved it in the "temp" folder. In any case, change the path in the code above to match where you saved "objects.csv".


Great! Now you should have your first table loaded to your database. The complete database includes more than 15 tables. However, we will only use two of them for this example, as shown in the scheme below. Note that the two tables relate to each other through the key variable objectid.



To load the “objects_terms” table, please repeat the same procedure with the code below.



CREATE TABLE objects_terms (
    termid INTEGER,
    objectid INTEGER,
    termtype VARCHAR(64),
    term VARCHAR(256),
    visualbrowsertheme VARCHAR(32),
    visualbrowserstyle VARCHAR(64)
);

COPY objects_terms (termid, objectid, termtype, term, visualbrowsertheme, visualbrowserstyle)
FROM 'C:/temp/objects_terms.csv'
DELIMITER ','
CSV HEADER;



4. Exploring the data with SQL commands

Click the ‘Query Tool’ to start exploring the data. First, select which variables you would like to include in your analysis. Second, tell SQL which table these variables are in. The code below selects the variables title and attribution from the objects table. It also limits the result to 5 observations.



SELECT title, attribution FROM objects LIMIT 5


Now, we would like to know what the different kinds of classification in this dataset are. To achieve that, we select the classification variable, keeping only distinct values.



SELECT DISTINCT(classification) FROM objects


The result tells us that there are 11 classifications: “Decorative Art”, “Drawing”, “Index of American Design”, “Painting”, “Photograph”, “Portfolio”, “Print”, “Sculpture”, “Technical Material”, “Time-Based Media Art” and “Volume”.


Finally, let us group the artworks by classification and count the number of objects in each category. COUNT(*) counts the total number of items in each group defined by GROUP BY. When you select a variable, you can give it a new name with AS. Finally, ORDER BY sorts the classifications by number of items in descending order (DESC).



SELECT classification, COUNT(*) AS n_items
FROM objects
GROUP BY classification
ORDER BY n_items DESC


Note that prints form the largest classification, followed by photographs.



5. Using pd.read_sql_query to access data

Now that you have your SQL database working, it is time to access it with Python. Before using Pandas, we have to connect Python to our SQL database. We will do that with psycopg2, a very popular PostgreSQL adapter for Python. Please, install it with pip install psycopg2.


We use the connect method of psycopg2 to establish the connection. It takes 4 main arguments:

  • host: in our case, the database is hosted locally, so we will pass localhost to this parameter. Note, however, that we could specify an IP if the server was external;
  • database: the name given to your SQL database, art_db;
  • user: user name required to authenticate;
  • password: your database password.



import psycopg2
import pandas as pd

conn = psycopg2.connect(
    host="localhost",
    database="art_db",
    user="postgres",
    password="*******"
)


The next step is to store our SQL query in a Python string variable. The query below performs a LEFT JOIN on the two tables in our database, using the variable objectid to join them. In practice, we select the titles, authors (attribution), classification (keeping only “Painting” with a WHERE clause), and term (filtering only terms that specify the “Style” of the painting).



command = '''
SELECT o.title, o.attribution, o.classification, ot.term
FROM objects AS o
LEFT JOIN objects_terms AS ot
    ON o.objectid = ot.objectid
WHERE classification = 'Painting'
    AND termtype = 'Style'
'''


Finally, we can extract the data. Open a cursor with the cursor() method of conn, then pass the command variable and the connection object to pd.read_sql_query, which returns a Pandas dataframe with the data we selected. Afterwards, commit and close the cursor and the connection.



# open cursor to insert our query
cur = conn.cursor()

# use pd.read_sql_query to query our database and get the result in a pandas dataframe
paintings = pd.read_sql_query(command, conn)

# save any changes to the database
conn.commit()

# close cursor and connection
cur.close()
conn.close()


6. Visualizing the most popular styles

From the data we gathered from our database, we would like to check which are the 10 most popular art styles in our data, by number of paintings. We can use the value_counts() method of the column term to count how many paintings are classified in each style.


The result is a Pandas Series where the index contains the styles and the values contain the number of paintings in the respective style. The remaining code produces a horizontal bar plot showing the top 10 styles by number of paintings. If you would like to learn more about data visualization with matplotlib, please consult the lesson Storytelling with Matplotlib - Visualizing historical data.



import matplotlib.pyplot as plt

top_10_styles = paintings['term'].value_counts().head(10)

fig, ax = plt.subplots()
ax.barh(top_10_styles.index, top_10_styles.values, color="#f0027f", edgecolor="#f0027f")
ax.set_title("The Most Popular Styles")

# inverts y axis
ax.invert_yaxis()

# eliminates grids
ax.grid(False)

# set ticks' colors to white
ax.tick_params(axis='x', colors='white')
ax.tick_params(axis='y', colors='white')

# set font colors
ax.set_facecolor('#2E3031')
ax.title.set_color('white')

# eliminates top, left and right borders and sets the bottom border color to white
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
ax.spines["left"].set_visible(False)
ax.spines["bottom"].set_color("white")

# fig background color:
fig.patch.set_facecolor('#2E3031')


Note that Realist, Baroque and Renaissance are the most popular art styles in our dataset.



Please feel free to share your thoughts and questions below!



7. Conclusions


  • It is possible to create a SQL database from csv files and access it with Python;
  • psycopg2 enables connection between Python and your SQL database;
  • pd.read_sql_query can be used to extract data into a Pandas dataframe.


Categories: FLOSS Project Planets

TechBeamers Python: How Do I Install Pip in Python?

Wed, 2024-01-24 12:40

In this tutorial, we’ll provide all the necessary steps to install pip in Python on both Windows and Linux platforms. If you’re using a recent version of Python (Python 3.4 and above), pip is likely already installed. To check whether pip is installed, open a command prompt or terminal and run a version check. If it’s […]

The post How Do I Install Pip in Python? appeared first on TechBeamers.

Categories: FLOSS Project Planets

TechBeamers Python: How Do You Filter a List in Python?

Wed, 2024-01-24 09:33

In this tutorial, we’ll explain different methods to filter a list in Python with the help of multiple examples. You’ll learn to use the Python filter() function, list comprehension, and also use Python for loop to select elements from the list. Filter a List in Python With the Help of Examples As we know there […]
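As a quick sketch of those three approaches (the sample data here is invented; the post’s own examples may differ):

numbers = [1, 2, 3, 4, 5, 6]

# filter() keeps the elements for which the function returns True
evens = list(filter(lambda n: n % 2 == 0, numbers))

# a list comprehension expresses the same selection inline
evens_lc = [n for n in numbers if n % 2 == 0]

# a plain for loop builds the filtered list step by step
evens_loop = []
for n in numbers:
    if n % 2 == 0:
        evens_loop.append(n)

print(evens, evens_lc, evens_loop)  # [2, 4, 6] three times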

The post How Do You Filter a List in Python? appeared first on TechBeamers.

Categories: FLOSS Project Planets

Real Python: What Are Python Raw Strings?

Wed, 2024-01-24 09:00

If you’ve ever come across a standard string literal prefixed with either the lowercase letter r or the uppercase letter R, then you’ve encountered a Python raw string:

Python >>> r"This is a raw string" 'This is a raw string' Copied!

Although a raw string looks and behaves mostly the same as a normal string literal, there’s an important difference in how Python interprets some of its characters, which you’ll explore in this tutorial.

Notice that there’s nothing special about the resulting string object. Whether you declare your literal value using a prefix or not, you’ll always end up with a regular Python str object.

Other prefixes available at your fingertips, which you can use and sometimes even mix together in your Python string literals, include:

  • b: Bytes literal
  • f: Formatted string literal
  • u: Legacy Unicode string literal (PEP 414)

Out of those, you might be most familiar with f-strings, which let you evaluate expressions inside string literals. Raw strings aren’t as popular as f-strings, but they do have their own uses that can improve your code’s readability.
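For instance, an f-string evaluates the expression inside the braces (the variable here is just an illustration):

>>> name = "Ada"
>>> f"Hello, {name}!"
'Hello, Ada!'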

Creating a string of characters is often one of the first skills that you learn when studying a new programming language. The Python Basics book and learning path cover this topic right at the beginning. With Python, you can define string literals in your source code by delimiting the text with either single quotes (') or double quotes ("):

>>> david = 'She said "I love you" to me.'
>>> alice = "Oh, that's wonderful to hear!"

Having such a choice can help you avoid a syntax error when your text includes one of those delimiting characters (' or "). For example, if you need to represent an apostrophe in a string, then you can enclose your text in double quotes. Alternatively, you can use multiline strings to mix both types of delimiters in the text.

You may use triple quotes (''' or """) to declare a multiline string literal that can accommodate a longer piece of text, such as an excerpt from the Zen of Python:

>>> poem = """
... Beautiful is better than ugly.
... Explicit is better than implicit.
... Simple is better than complex.
... Complex is better than complicated.
... """

Multiline string literals can optionally act as docstrings, a useful form of code documentation in Python. Docstrings can include bare-bones test cases known as doctests, as well.

Regardless of the delimiter type of your choice, you can always prepend a prefix to your string literal. Just make sure there’s no space between the prefix letters and the opening quote.

When you use the letter r as the prefix, you’ll turn the corresponding string literal into a raw string counterpart. So, what are Python raw strings exactly?



In Short: Python Raw Strings Ignore Escape Character Sequences

In some cases, defining a string through the raw string literal will produce precisely the same result as using the standard string literal in Python:

Python >>> r"I love you" == "I love you" True Copied!

Here, both literals represent string objects that share a common value: the text I love you. Even though the first literal comes with a prefix, it has no effect on the outcome, so both strings compare as equal.

To observe the real difference between raw and standard string literals in Python, consider a different example depicting a date formatted as a string:

Python >>> r"10\25\1991" == "10\25\1991" False Copied!

This time, the comparison turns out to be false even though the two string literals look visually similar. Unlike before, the resulting string objects no longer contain the same sequence of characters. The raw string’s prefix (r) changes the meaning of special character sequences that begin with a backslash (\) inside the literal.
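One quick way to see the difference yourself, assuming a standard CPython session, is to compare the string lengths. In the standard literal, \25 and \1 are read as octal escape sequences, so each collapses into a single character:

>>> len(r"10\25\1991")
10
>>> len("10\25\1991")
7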

Note: To understand how Python interprets the above string, head over to the final section of this tutorial, where you’ll cover the most common types of escape sequences in Python.

Read the full article at https://realpython.com/python-raw-strings/ »


Categories: FLOSS Project Planets

Ned Batchelder: You (probably) don’t need to learn C

Wed, 2024-01-24 06:38

On Mastodon I wrote that I was tired of people saying, “you should learn C so you can understand how a computer really works.” I got a lot of replies which did not change my mind, but helped me understand more how abstractions are inescapable in computers.

People made a number of claims. C was important because syscalls are defined in terms of C semantics (they are not). They said it was good for exploring limited-resource computers like Arduinos, but most people don’t program for those. They said it was important because C is more performant, but Python programs often offload the compute-intensive work to libraries other people have written, and these days that work is often on a GPU. Someone said you need it to debug with strace, then someone said they use strace all the time and don’t know C. Someone even said C was good because it explains why NUL isn’t allowed in filenames, but who tries to do that, and why learn a language just for that trivia?

I’m all for learning C if it will be useful for the job at hand, but you can write lots of great software without knowing C.

A few people repeated the idea that C teaches you how code “really” executes. But C is an abstract model of a computer, and modern CPUs do all kinds of things that C doesn’t show you or explain. Pipelining, cache misses, branch prediction, speculative execution, multiple cores, even virtual memory are all completely invisible to C programs.

C is an abstraction of how a computer works, and chip makers work hard to implement that abstraction, but they do it on top of much more complicated machinery.

C is far removed from modern computer architectures: there have been 50 years of innovation since it was created in the 1970s. The gap between C’s model and modern hardware is the root cause of famous vulnerabilities like Meltdown and Spectre, as explained in C is Not a Low-level Language.

C can teach you useful things, like how memory is a huge array of bytes, but you can also learn that without writing C programs. People say, C teaches you about memory allocation. Yes it does, but you can learn what that means as a concept without learning a programming language. And besides, what will Python or Ruby developers do with that knowledge other than appreciate that their languages do that work for them and they no longer have to think about it?

Pointers came up a lot in the Mastodon replies. Pointers underpin concepts in higher-level languages, but you can explain those concepts as references instead, and skip pointer arithmetic, aliasing, and null pointers completely.

A question I asked a number of people: what mistakes are JavaScript/Ruby/Python developers making if they don’t know these things (C, syscalls, pointers)? I didn’t get strong answers.

We work in an enormous tower of abstractions. I write programs in Python, which provides me abstractions that C (its underlying implementation language) does not. C provides an abstract model of memory and CPU execution which the computer implements on top of other mechanisms (microcode and virtual memory). When I made a wire-wrapped computer, I could pretend the signal travelled through wires instantaneously. For other hardware designers, that abstraction breaks down and they need to consider the speed electricity travels. Sometimes you need to go one level deeper in the abstraction stack to understand what’s going on. Everyone has to find the right layer to work at.

Andy Gocke said it well:

When you no longer have problems at that layer, that’s when you can stop caring about that layer. I don’t think there’s a universal level of knowledge that people need or is sufficient.

“like jam or bootlaces” made another excellent point:

There’s a big difference between “everyone should know this” and “someone should know this” that seems to get glossed over in these kinds of discussions.

C can teach you many useful and interesting things. It will make you a better programmer, just as learning any new-to-you language will because it broadens your perspective. Some kinds of programming need C, though other languages like Rust are ably filling that role now too. C doesn’t teach you how a computer really works. It teaches you a common abstraction of how computers work.

Find a level of abstraction that works for what you need to do. When you have trouble there, look beneath that abstraction. You won’t be seeing how things really work, you’ll be seeing a lower-level abstraction that could be helpful. Sometimes what you need will be an abstraction one level up. Is your Python loop too slow? Perhaps you need a C loop. Or perhaps you need numpy array operations.
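To make that concrete, here is a small sketch of the same computation at two levels of abstraction (assuming numpy is installed):

import numpy as np

values = list(range(1_000_000))

# a plain Python loop, adding one element at a time
total = 0
for v in values:
    total += v

# the same reduction expressed as a vectorized numpy operation
total_np = int(np.array(values).sum())

assert total == total_np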

You (probably) don’t need to learn C.

Categories: FLOSS Project Planets

IslandT: How to search multiple lines with Python?

Wed, 2024-01-24 04:34

Often you will want to search for words or phrases across an entire paragraph. Here is the Python regular expression code that will do that.

pattern = re.compile(r'^\w+ (\w+) (\w+)', re.M)

We use the re.M (multiline) flag, which makes the ^ anchor match at the start of every line rather than only at the start of the whole string.

Now let us try out the program above…

gad = pattern.findall("hello mr Islandt\nhello mr gadgets")
print(gad)

…which will then display the following outcome

[('mr', 'Islandt'), ('mr', 'gadgets')]

Explanation :

The pattern above matches the first word of a line and captures the next two words in a tuple. When the program reaches the newline character, it continues the search on the second line and returns another tuple; both tuples are collected in a list. With the re.M flag, the search continues across multiple lines for as long as there are matches.
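To see exactly what re.M changes, here is a small comparison using the same pattern (a sketch; only the flag differs between the two calls):

import re

text = "hello mr Islandt\nhello mr gadgets"

# without re.M, ^ anchors only at the very start of the string
print(re.findall(r'^\w+ (\w+) (\w+)', text))        # [('mr', 'Islandt')]

# with re.M, ^ anchors at the start of every line
print(re.findall(r'^\w+ (\w+) (\w+)', text, re.M))  # [('mr', 'Islandt'), ('mr', 'gadgets')]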

Categories: FLOSS Project Planets

PyBites: Exploring the Role of Static Methods in Python: A Functional Perspective

Wed, 2024-01-24 04:21
Introduction

Python’s versatility in supporting different programming paradigms, including procedural, object-oriented, and functional programming, opens up a rich landscape for software design and development.

Among these paradigms, the use of static methods in Python, particularly in an object-oriented context, has been a topic of debate.

This article delves into the role and implications of static methods in Python, weighing them against a more functional approach that leverages modules and functional programming principles.

The Nature of Static Methods in Python

Definition and Usage:

Static methods in Python are defined within a class using the @staticmethod decorator.

Unlike regular methods, they do not require an instance (self) or class (cls) reference.

They are typically used for utility functions that logically belong to a class but are independent of class instances.

Example in Practice:

Consider this code example from Django:

# django/db/backends/oracle/operations.py

class DatabaseOperations(BaseDatabaseOperations):

    # ... other methods and attributes ...

    @staticmethod
    def convert_empty_string(value, expression, connection):
        return "" if value is None else value

    @staticmethod
    def convert_empty_bytes(value, expression, connection):
        return b"" if value is None else value

Here, convert_empty_string and convert_empty_bytes are static due to their utility nature and specific association with the DatabaseOperations class.

The Case for Modules and Functional Programming

Embracing Python’s Module System:

Python’s module system allows for effective namespace management and code organization.

Namespaces are one honking great idea — let’s do more of those!

The Zen of Python, by Tim Peters

Functions, including those that could be static methods, can be organized in modules, making them reusable and easily accessible.
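For instance, a utility like Django’s converters above could live in a plain module instead. A minimal sketch (the module name and the simplified signature are invented for illustration):

# conversions.py -- a hypothetical utility module
def convert_empty_string(value):
    """Return "" when value is None, otherwise return value unchanged."""
    return "" if value is None else value

# callers import the function directly, no class needed:
# from conversions import convert_empty_string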

Functional Programming Advantages:
  1. Quick Development: Functional programming emphasizes simplicity and stateless operations, leading to concise and readable code.
  2. Code Resilience: Pure functions (functions that do not alter external state) enhance predictability and testability. Related: 10 Tips to Write Better Functions in Python
  3. Separation of Concerns: Using functions and modules promotes a clean separation of data representation (classes) and behavior (functions).
Combining Object-Oriented and Functional Approaches

Hybrid Strategy:
  1. Abstraction with Classes: Use classes for data representation, encapsulating state and behavior that are closely related. See also our When to Use Classes article.
  2. Functional Constructs: Utilize functional concepts like higher-order functions, immutability, and pure functions for business logic and data manipulation (see the sketch after this list).
  3. Factories and Observers: Implement design patterns like factory and observer for creating objects and managing state changes, respectively (shout-out to Brandon Rhodes’ awesome design patterns guide!)
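A small sketch of the functional constructs from point 2 (the function and values are invented for illustration):

from functools import reduce

# a pure function: the result depends only on its inputs, no state is touched
def add_tax(price, rate=0.2):
    return round(price * (1 + rate), 2)

prices = (10.00, 25.50, 7.99)  # an immutable tuple of inputs

# map() is a higher-order function: it takes add_tax itself as an argument
gross = list(map(add_tax, prices))

total = reduce(lambda a, b: a + b, gross)
print(gross, round(total, 2))  # [12.0, 30.6, 9.59] 52.19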
Conclusion: Striking the Right Balance

The decision to use static methods, standalone functions, or a functional programming approach in Python depends on several factors:

  • Relevance: Is the function logically part of a class’s responsibilities?
  • Reusability: Would the function be more versatile as a standalone module function?
  • Simplicity: Can the use of regular functions simplify the class structure and align with the Single Responsibility Principle? Related article: Tips for clean code in Python.

Ultimately, the choice lies in finding the right balance that aligns with the application’s architecture, maintainability, and the development team’s expertise.

Python, with its multi-paradigm capabilities, offers the flexibility to adopt a style that best suits the project’s needs.

Fun Fact: Static Methods Were an Accident

Guido added static methods by accident! He originally meant to add class methods instead.

I think the reason is that a module at best acts as a class where every method is a *static* method, but implicitly so. And we all know how limited static methods are. (They’re basically an accident — back in the Python 2.2 days when I was inventing new-style classes and descriptors, I meant to implement class methods but at first I didn’t understand them and accidentally implemented static methods first. Then it was too late to remove them and only provide class methods.)

Guido van Rossum, see the discussion thread here, and thanks Will for pointing me to this.

Call to Action

What’s your approach to using static methods in Python?

Do you favor a more functional style, or do you find static methods indispensable in certain scenarios?

Share your thoughts and experiences in our community

Categories: FLOSS Project Planets

eGenix.com: eGenix Antispam Bot for Telegram 0.6.0 GA

Wed, 2024-01-24 03:00
Introduction

eGenix has long been running a local user group meeting in Düsseldorf called Python Meeting Düsseldorf and we are using a Telegram group for most of our communication.

In the early days, the group worked well and we only had few spammers joining it, which we could well handle manually.

More recently, this has changed dramatically. We are seeing between 2 and 5 spam signups per day, often at night. Furthermore, the signup accounts are not always easy to spot as spammers, since they often come with profile images, descriptions, etc.

With the bot, we now have a more flexible way of dealing with the problem.

Please see our project page for details and download links.

Features
  • Low impact mode of operation: the bot tries to keep noise in the group to a minimum
  • Several challenge mechanisms to choose from, more can be added as needed
  • Flexible and easy to use configuration
  • Only needs a few MB of RAM, so can easily be put into a container or run on a Raspberry Pi
  • Can handle quite a bit of load due to the async implementation
  • Works with Python 3.9+
  • MIT open source licensed
News

The 0.6.0 release fixes a few bugs and adds more features:

  • Upgraded to pyrogram 2.0.106, which fixes a weird error we have been getting recently with the old version 1.4.16 (see pyrogram/pyrogram#1347)
  • Catch weird error from Telegram when deleting conversations; this seems to sometimes fail, probably due to a glitch on their side
  • Made the math and char entry challenges a little harder
  • Added new DictItemChallenge

It has been battle-tested in production for several years already and is proving to be a really useful tool to help with Telegram group administration.

More Information

For more information on the eGenix.com Python products, licensing and download instructions, please write to sales@egenix.com.

Enjoy !

Marc-Andre Lemburg, eGenix.com

Categories: FLOSS Project Planets

    Wing Tips: AI Assisted Development in Wing Pro

    Tue, 2024-01-23 20:00

    This Wing Tip introduces Wing Pro's AI assisted software development capabilities. Starting with Wing Pro version 10, you can use generative AI to write new code at the current editor insertion point, or you can use the AI tool to refactor, redesign, or extend existing code.

    Generative AI is astonishingly capable as a programmer's assistant. As long as you provide it with sufficient context and clear instructions, it can cleanly and correctly execute a wide variety of programming tasks.

    AI Code Suggestion

    Here is an example where Wing Pro's AI code suggestion capability is used to write a missing method for an existing class. The AI knows what to add because it can see what precedes and follows the insertion point in the editor. It infers from that context what code you would like it to produce:

    Shown above: Typing 'def get_full_name' followed by Ctrl-? to initiate AI suggestion mode. The suggested code is accepted by pressing Enter.

    AI Refactoring

    AI refactoring is even more powerful. You can request changes to existing code according to written instructions. For example, you might ask it to "convert this threaded implementation to run asynchronously instead":

    Shown above: Running the highlighted request in the AI tool to convert multithreaded code to run asynchronously instead.

    Description-Driven Development

    Wing Pro's AI refactoring tool can also be used to write new code at the current insertion point, according to written instructions. For example, you might ask it to "add client and server classes that expose all the public methods of FileManager to a client process using sockets and JSON":

    Shown above: Using the AI tool to request implementation of client/server classes for remote access to an existing class.
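
The pattern being requested boils down to something like this bare-bones sketch (FileManager, the host, and the port are hypothetical, and a real implementation would need error handling and message framing):

import json
import socket

class FileManagerClient:
    """Forward method calls to a remote FileManager as JSON over a socket."""

    def __init__(self, host="127.0.0.1", port=9000):
        self.address = (host, port)

    def _call(self, method, *args):
        # Send one {"method": ..., "args": [...]} request per connection
        # and read back a single JSON line with the result.
        with socket.create_connection(self.address) as sock:
            request = json.dumps({"method": method, "args": args}) + "\n"
            sock.sendall(request.encode("utf-8"))
            response = sock.makefile("r", encoding="utf-8").readline()
        return json.loads(response)["result"]

    def list_files(self, path):
        return self._call("list_files", path)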

    Simpler and perhaps more common requests like "write documentation strings for these methods" and "create unit tests for class Person" of course also work. In general, Wing Pro's AI assistant can do any reasonably sized chunk of work for which you can clearly state instructions.

    Used correctly, this capability will have a significant impact on your productivity as a programmer. Instead of typing out code manually, your role changes to one of directing an intelligent assistant capable of completing a wide range of programming tasks very quickly. You will still need to review and accept or reject the AI's work. Generative AI can't replace you, but it allows you to concentrate much more on higher-level design and much less on implementation details.

    Getting Started

Wing Pro uses OpenAI as its AI provider, and you will need to create and pay for your own OpenAI account before you can use this feature. You may need to pay up to US$50 up front to be given computational rate limits that are high enough to use AI for your software development. However, individual requests often cost less than US$0.01, and more complex tasks may cost up to US$0.30 if you provide a lot of context with them. This is still far less than the paid programmer time the AI is replacing.

To use AI assisted development effectively, you will need to learn how to create well-designed requests that provide the AI with both the necessary relevant context and clear, specific instructions. Please read all of the AI Assisted Development documentation for details on setup, framing requests, and monitoring costs. It takes a bit of time to get started, but it is well worth the effort to incorporate generative AI into your tool chain.



    That's it for now! We'll be back soon with more Wing Tips for Wing Python IDE.

    As always, please don't hesitate to email support@wingware.com if you run into problems or have any questions.

    Categories: FLOSS Project Planets

    Seth Michael Larson: Releases on the Python Package Index are never “done”

    Tue, 2024-01-23 19:00

    Published 2024-01-24 by Seth Larson

This critical role would not be possible without funding from the OpenSSF Alpha-Omega project. Massive thank-you to Alpha-Omega for investing in the security of the Python ecosystem!

PEP 740 and open-ended PyPI releases

    PEP 740 is a proposal to add support for digital attestations to PyPI artifacts, for example publish provenance attestations, which can be verified and used by tooling.

William Woodruff has been working on PEP 740, which is in draft on GitHub, and he addressed my feedback this week. During this work, the open-endedness of PyPI releases came up in our discussion, specifically how it is a common gotcha for folks designing tools and policy across multiple software ecosystems.

What does it mean for PyPI releases to be open-ended? It means that you can always upload new files to an existing release on PyPI, even if the release was created years ago. This is because a PyPI “release” is only a thin layer aggregating a bunch of files on PyPI that happen to share the same version.
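
You can observe this property yourself with nothing but the standard library: PyPI's JSON API reports an upload time for every file in a release, and on long-lived projects those times can be far apart (the project and version below are arbitrary examples):

import json
import urllib.request

def file_upload_times(project, version):
    """Return {filename: upload_time} for every file in one PyPI release."""
    url = f"https://pypi.org/pypi/{project}/{version}/json"
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    return {f["filename"]: f["upload_time"] for f in data["urls"]}

for name, when in sorted(file_upload_times("numpy", "1.26.3").items()):
    print(when, name)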

This was opened up as a wider discussion on discuss.python.org about this property. Summarizing that discussion:

    • New Python releases mean new wheels need to be built for non-ABI3 compatible projects. IMO this is the most compelling reason to keep this property.
    • Draft releases seem semi-related, being able to put artifacts into a "queue" before making them public.
• Ordering of which wheel gets evaluated as an installation candidate isn't well defined. It is up to installers, and tends to go from more specific to less specific.
• PyPI doesn't allow single files to be yanked, even though PEP 592 allows for yanking at the file level instead of only the release level.
• The "attack" vector is fairly small; this property would mostly provide additional secrecy for attackers, by letting them blend into existing releases.

    CPython Software Bill-of-Materials update

CPython 3.13.0a3 was released; this is the very first CPython release that contains any SBOM metadata at all, and thus the first for which we can create an initial draft SBOM document.
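
Inspecting such a draft document only takes a few lines, assuming the SPDX JSON serialization used by CPython's SBOM work (the local filename here is hypothetical):

import json

# Load a draft SBOM document downloaded to the current directory.
with open("cpython-3.13.0a3.spdx.json") as f:
    sbom = json.load(f)

# SPDX documents list their components under "packages".
for package in sbom.get("packages", []):
    print(package["name"], package.get("versionInfo", "<unknown>"))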

Much of the work on CPython's SBOMs was done to fix issues related to pip's vendored dependencies and issues found by downstream distributors of CPython builds like Red Hat.

All of these issues are closely related and touch the same place in the codebase, so fixing them together resulted in a medium-sized pull request.

On the release side, I've addressed feedback from the first round of reviews for generating SBOMs for source code artifacts and uploading them during the release process. Once those SBOMs start being generated, they'll automatically be added to python.org/downloads.

    Other items

    That's all for this week! 👋 If you're interested in more you can read last week's report.

    Thanks for reading! ♡ Did you find this article helpful and want more content like it? Get notified of new posts by subscribing to the RSS feed or the email newsletter.

    This work is licensed under CC BY-SA 4.0

    Categories: FLOSS Project Planets

    Kay Hayen: Nuitka Package Configuration Part 3

    Tue, 2024-01-23 18:00

This is the third part of a post series under the tag package_config that explains the Nuitka package configuration in more detail. To recap, Nuitka package configuration is the way Nuitka learns about hidden dependencies, needed DLLs, data files, and just generally avoids bloat in the compilation. The details are on a dedicated page on the web site, Nuitka Package Configuration, but reading on will be just fine.

    Problem Package

    Each post will feature one package that caused a particular problem. In this case, we are talking about the package toga.

Problems like the one with this package are typically encountered in standalone mode only, but they also affect accelerated mode, since Nuitka doesn't compile all the things desired in that case. Some packages, and toga is one of them, look at what OS they are running on, environment variables, etc., and then, in a relatively static fashion, but one that Nuitka cannot see through, load what they call a “backend” module.

We are going to look at that in some detail, and will see a workaround applied with the anti-bloat engine, doing code modification on the fly that makes the choice at compile time, and visible to Nuitka in this way.

    Initial Symptom

The initial symptom reported was that toga suffered from broken version lookups and therefore did not work. We encountered two separate things that prevented it; the first was about the version number. The code was trying to apply int() after resolving the version of toga by itself to None.

Traceback (most recent call last):
  File "C:\py\dist\toga1.py", line 1, in <module>
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "C:\py\dist\toga\__init__.py", line 1, in <module toga>
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "C:\py\dist\toga\app.py", line 20, in <module toga.app>
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "C:\py\dist\toga\widgets\base.py", line 7, in <module toga.widgets.base>
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "C:\py\dist\travertino\__init__.py", line 4, in <module travertino>
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "C:\py\dist\setuptools_scm\__init__.py", line 7, in <module setuptools_scm>
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "C:\py\dist\setuptools_scm\_config.py", line 15, in <module setuptools_scm._config>
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "C:\py\dist\setuptools_scm\_integration\pyproject_reading.py", line 8, in <module setuptools_scm._integration.pyproject_reading>
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "C:\py\dist\setuptools_scm\_integration\setuptools.py", line 62, in <module setuptools_scm._integration.setuptools>
  File "C:\py\dist\setuptools_scm\_integration\setuptools.py", line 29, in _warn_on_old_setuptools
ValueError: invalid literal for int() with base 10: 'unknown'

So, this is clearly something that we consider bloat in the first place: looking up your own version number at run time. The use of setuptools_scm implies the use of setuptools, whose version cannot be determined, and that's what crashes.

    Step 1 - Analysis of initial crashing

So the first thing we did was to repair setuptools so that it knows its own version. It does this a bit differently, because it cannot use itself. Our compile-time optimization failed there, but it would also have been overkill. We had never come across this before, since we normally avoid setuptools very hard, but it's not good to be incompatible.

- module-name: 'setuptools.version'
  anti-bloat:
    - description: 'workaround for metadata version of setuptools'
      replacements:
        "pkg_resources.get_distribution('setuptools').version": "repr(__import__('setuptools.version').version.__version__)"

We do not have to include all the metadata of setuptools here just to get that one item, so we chose to make a simple string replacement that looks the value up at compile time and puts it into the source code automatically. That removes the pkg_resources.get_distribution() call entirely.
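
The effect on the compiled source is roughly the following (a sketch; the concrete version number is made up):

# Before the replacement, as executed by setuptools at run time:
__version__ = pkg_resources.get_distribution('setuptools').version

# After the replacement, the expression has been evaluated at compile time
# and its repr() inserted directly into the source:
__version__ = '69.0.3'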

With that, setuptools_scm was not crashing anymore. That's good. But we don't really want it to be included: it is good for dynamically detecting the version from git and whatnot, but including a framework for building C extensions is not a good idea in the general case. Nuitka therefore said this:

Nuitka-Plugins:WARNING: anti-bloat: Undesirable import of 'setuptools_scm' (intending to
Nuitka-Plugins:WARNING: avoid 'setuptools') in 'toga' (at
Nuitka-Plugins:WARNING: 'c:\3\Lib\site-packages\toga\__init__.py:99') encountered. It may
Nuitka-Plugins:WARNING: slow down compilation.
Nuitka-Plugins:WARNING: Complex topic! More information can be found at
Nuitka-Plugins:WARNING: https://nuitka.net/info/unwanted-module.html

So that's informing the user to take action. And in the case of optional imports, i.e. ones where the using code will handle the ImportError just fine and work without it, we can do this.

- module-name: 'toga'
  anti-bloat:
    - description: 'remove setuptools usage'
      no-auto-follow:
        'setuptools_scm': ''
      when: 'not use_setuptools'

Here we say: do not automatically follow setuptools_scm imports, unless there is other code that still does it. In that way, the import still happens if some other part of the code imports the module, but only then. We no longer enforce the non-usage of a module here; we just make that decision based on other uses being present.

With this, both the bloat warning and the inclusion of setuptools_scm in the compilation are gone. You always want to make the compilation as small as possible and remove those packages that do not contribute anything but overhead, aka bloat.

The next thing discovered was that toga needs the toga-core distribution metadata for its version check. For that, we use the common solution and declare that we want to include that distribution's metadata whenever toga is part of a compilation.

- module-name: 'toga'
  data-files:
    include-metadata:
      - 'toga-core'

So that moved the entire issue of version lookups to resolved.

    Step 2 - Dynamic Backend dependency

Now on to the backend issue. What remained was the need to include the platform-specific backend, one that can even be overridden by an environment variable. For full compatibility, we invented something new. Typically, what we would have done is to create a toga plugin; instead, the following snippet does the job.

- module-name: 'toga.platform'
  variables:
    setup_code: 'import toga.platform'
    declarations:
      'toga_backend_module_name': 'toga.platform.get_platform_factory().__name__'
  anti-bloat:
    - change_function:
        'get_platform_factory': "'importlib.import_module(%r)' % get_variable('toga_backend_module_name')"

    There is a whole new thing here, a new feature that was added specifically for this to be easy to do. And with the backend selection being complex and partially dynamic code, we didn’t want to hard code that. So we added support for variables and their use in Nuitka Package Configuration.

The first block, variables, defines under declarations a mapping of expressions that will be evaluated at compile time, given the setup code under setup_code.

This then allows us to have a variable with the name of the backend that toga decides to use. We then change the very complex function get_platform_factory, for the compilation, into a replacement that Nuitka is able to statically optimize, so that it sees the backend as a dependency and uses it directly at run time, which is what we want.
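
After the change, the compiled module effectively contains a trivial function like this (the backend module name is whatever the compile-time variable resolved to on the build machine; 'toga_winforms.factory' is only an example):

import importlib

def get_platform_factory():
    # The dynamic platform detection has been replaced by a fixed import
    # that Nuitka can follow statically.
    return importlib.import_module('toga_winforms.factory')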

    Final remarks

I am hoping you will find this information very helpful and will join the effort to make packaging for Python work out of the box. Adding support for toga was a bit more complex, but with the new feature, once a problem is identified as this kind of backend issue, solving it should have become a lot easier.

Lessons learned: we should cover packages that we routinely remove from compilation, like setuptools, but e.g. also IPython. This will have to be added, so that things like setuptools_scm cannot cloud the view of actual issues.

    Categories: FLOSS Project Planets

    Quansight Labs Blog: Captioning: A Newcomer’s Guide

    Tue, 2024-01-23 16:41
    What are those words on the bottom of your video screen and where do they come from? Captioning’s normalization in the past several decades may seem like it would render those questions moot, but understanding more about captions means making more informed decisions about when, how, and why we make sure information is accessible.
    Categories: FLOSS Project Planets

    PyCoder’s Weekly: Issue #613 (Jan. 23, 2024)

    Tue, 2024-01-23 14:30

    #613 – JANUARY 23, 2024
    View in Browser »

    Python Packaging, One Year Later: A Look Back at 2023

    This is a follow-on post to Chris’s article from last year called Fourteen tools at least twelve too many. “Are there still fourteen tools, or are there even more? Has Python packaging improved in a year?”
    CHRIS WARRICK

    Running Python on Air-Gapped Systems

    This post describes running Python code on a “soft” air-gapped system, one without direct internet access. Installing packages in a clean environment and moving them to the air-gapped machine has challenges. Read Ibrahim’s take on how he solved the problem.
    IBRAHIM AHMED

    Elevate Your Web Development with MongoDB’s Full Stack FastAPI App Generator

    Get ready to elevate your web development process with the newly released Full Stack FastAPI App Generator by MongoDB, offering a simplified setup process for building modern full-stack web applications with FastAPI and MongoDB →
    MONGODB sponsor

    Add Logging and Notification Messages to Flask Web Projects

    After you implement the main functionality of a web project, it’s good to understand how your users interact with your app and where they may run into errors. In this tutorial, you’ll enhance your Flask project by creating error pages and logging messages.
    REAL PYTHON

    Python 3.13.0 Alpha 3 Is Now Available

    CPYTHON DEV BLOG

    PSF Announces More Developer in Residence Roles

    PYTHON SOFTWARE FOUNDATION

    PSF Announces Foundation Fellow Members for Q3 2023

    PYTHON SOFTWARE FOUNDATION

Discussions

PEP 736: Shorthand Syntax for Keyword Arguments

    PYTHON.ORG

Python Jobs

Python Tutorial Editor (Anywhere)

    Real Python

    More Python Jobs >>>

Articles & Tutorials

Bias, Toxicity, and Truthfulness in LLMs With Python

    How can you measure the quality of a large language model? What tools can measure bias, toxicity, and truthfulness levels in a model using Python? This week on the show, Jodie Burchell, developer advocate for data science at JetBrains, returns to discuss techniques and tools for evaluating LLMs With Python.
    REAL PYTHON podcast

    Postgres vs. DynamoDB: Which Database to Choose

    This article presents various aspects you need to consider when choosing a database for your project - querying, performance, ORMs, migrations, etc. It shows how things are approached differently for Postgres vs. DynamoDB and includes examples in Python.
    JAN GIACOMELLI • Shared by Jan Giacomelli

    Building with Temporal Cloud Webinar Series

    Hear from our technical team on how we’ve built Temporal Cloud to deliver world-class latency, performance, and availability for the smallest and largest workloads. Whether you’re using Temporal Cloud or self-host, this series will be full of insights into how to optimize your Temporal Service →
    TEMPORAL sponsor

    Python App Development: In-Depth Guide for Product Owners

    “As with every technology stack, Python has its advantages and limitations. The key to success is to use Python at the right time and in the right place.” This guide talks about what a product owner needs to know to take on a Python project.
    PAVLO PYLYPENKO • Shared by Alina

    HTTP Requests With Python’s urllib.request

    In this video course, you’ll explore how to make HTTP requests using Python’s handy built-in module, urllib.request. You’ll try out examples and go over common errors, all while learning more about HTTP requests and Python in general.
    REAL PYTHON course

    Beware of Misleading GPU vs CPU Benchmarks

    Nvidia has created GPU-based replacements for NumPy and other tools and promises significant speed-ups, but the comparison may not be accurate. Read on to learn if GPU replacements for CPU-based libraries are really that much faster.
    ITAMAR TURNER-TRAURING

    Django Migration Files: Automatic Clean-Up

    Your Django migrations are piling up in your repo? You want to clean them up without a hassle? Check out this new package django-migration-zero that helps make migration management a piece of cake!
    RONNY VEDRILLA • Shared by Sarah Boyce

    Understanding NumPy’s ndarray

    To understand NumPy, you need to understand the ndarray type. This article starts with Python’s native lists and shows you when you need to move to NumPy’s ndarray data type.
    STEPHEN GRUPPETTA • Shared by Stephen Gruppetta

    Type Information for Faster Python C Extensions

    PyPy is an alternative implementation of Python, and its C API compatibility layer has some performance issues. This article describes on-going work to improve its performance.
    MAX BERNSTEIN

    Fastest Way to Read Excel in Python

    It’s not uncommon to find yourself reading Excel in Python. This article compares several ways to read Excel from Python and how they perform.
    HAKI BENITA

    How Are Requests Processed in Flask?

    This article provides an in-depth walkthrough of how requests are processed in a Flask application.
    TESTDRIVEN.IO • Shared by Michael Herman

Projects & Code

harlequin: The SQL IDE for Your Terminal

    GITHUB.COM/TCONBEER

    AnyText: Multilingual Visual Text Generation and Editing

    GITHUB.COM/TYXSSPA

    Websocket CLI Testing Interface

    GITHUB.COM/LEWOUDAR • Shared by Kevin Tewouda

    Autometrics-py: Metrics to Debug in Production

    GITHUB.COM/AUTOMETRICS-DEV • Shared by Adelaide Telezhnikova

    django-cte: Common Table Expressions (CTE) for Django

    GITHUB.COM/DIMAGI

Events

Weekly Real Python Office Hours Q&A (Virtual)

    January 24, 2024
    REALPYTHON.COM

    SPb Python Drinkup

    January 25, 2024
    MEETUP.COM

    PyLadies Amsterdam: An Introduction to Conformal Prediction

    January 25, 2024
    MEETUP.COM

    PyDelhi User Group Meetup

    January 27, 2024
    MEETUP.COM

    PythOnRio Meetup

    January 27, 2024
    PYTHON.ORG.BR

    Happy Pythoning!
    This was PyCoder’s Weekly Issue #613.
    View in Browser »

    [ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

    Categories: FLOSS Project Planets

    TechBeamers Python: Python Map vs List Comprehension – The Difference Between the Two

    Tue, 2024-01-23 13:04

In this tutorial, we'll explain the difference between Python's map and list comprehensions. Both map and list comprehensions are powerful tools in Python for applying functions to each element of a sequence. However, they have different strengths and weaknesses, making them suitable for different situations. Here's a breakdown: What is the Difference Between the Python […]
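
As a quick taste of the comparison (a minimal sketch, not code from the post itself):

numbers = [1, 2, 3, 4]

# map() returns a lazy iterator; wrap it in list() to materialize the values.
squares_map = list(map(lambda n: n * n, numbers))

# A list comprehension builds the list directly and often reads better.
squares_comp = [n * n for n in numbers]

assert squares_map == squares_comp == [1, 4, 9, 16]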

    The post Python Map vs List Comprehension – The Difference Between the Two appeared first on TechBeamers.

    Categories: FLOSS Project Planets
