Planet Python


Python Software Foundation: The Python Language Summit 2024: PyREPL -- New default REPL written in Python

Fri, 2024-06-14 05:26

Lysandros showing the mistake we've all made, no longer a problem in the new REPL
(Photo credit: Hugo van Kemenade)

One of the headline features of Python 3.13 is the new interactive interpreter, sometimes known as a "REPL" (Read-Evaluate-Print Loop), which was contributed by Pablo Galindo Salgado, Łukasz Langa, and Lysandros Nikolaou and is based on the PyPy project's own interactive interpreter, PyREPL. Pablo, Łukasz, and Lysandros were all at the Language Summit 2024 to present this new feature coming to Python.

Why does Python need a new interpreter?

Python already has an interactive interpreter, so why do we need a new one? Lysandros explained that the existing interpreter is "deeply tangled" with Python's tokenizer, which means adding new features or making changes is extremely difficult.

To lend further color to this point, Lysandros dug into how the tokenizer had changed since Python was first developed. Lysandros noted that "for the first 12 years [of Python], Guido was the only one who touched the tokenizer" and only later after the parser was replaced did anyone else meaningfully contribute to the tokenizer.

Terse example code for Python's tokenizer

Meanwhile, there are other REPLs for Python that "have many new features that [Python's] interpreter doesn't have that users have grown to expect", Lysandros explained. Basic features listed as examples included the lack of color support (and therefore syntax highlighting), the ergonomics issue of exit versus exit(), no support for multi-line editing and buffer history, and poor ergonomics around pasting code into the interpreter.

Why PyREPL?

"We've settled on starting our solution around PyREPL", Pablo explained, "our reasoning being that maintaining terminal applications is hard. Starting from scratch would have a much higher risk for users". Pablo also noted that "most people who would interact with the REPL wouldn't test in betas", because Python pre-releases are generally used for running automated tests in continuous integration and not interactively tested manually.

Pablo explained that there are many different terminals and platforms which are all sources of behaviors and bugs that are hard to get right the first time. "[PyREPL] provided us with a solid base that we know works and we can start modifying".

Tasteful modern art or bug in the REPL?

Another major contributing factor was that PyREPL is written in Python. Pablo emphasized that "now people that want to start contributing to the REPL can actually contribute because it's written in Python".

Finally, Pablo pointed out that because the implementation is now partially shared between CPython and PyPy, both implementations can benefit from bug fixes to the shared parts of the codebase. For example, support for Chinese characters in the REPL was fixed in CPython and is being contributed back to PyPy.

Łukasz noted that adopting PyREPL wasn't a straightforward copy-paste job: there were multiple ideas in PyPy's PyREPL that don't make sense for CPython. Notably, PyPy's PyREPL is written to also support Python 2, so the code was simplified to only handle Python 3. PyREPL for PyPy also came with support for PyGame, which wasn't necessary for CPython.

Type hints and strict type checking using mypy were also added to PyREPL, making the PyREPL module the first in the Python standard library to be type-checked on pull requests. Adding type hints to the code immediately found bugs which were fixed and reported back to PyPy.

What are the new features in 3.13?

Pablo gave a demonstration of the new features of PyREPL, including:

  • Colored prompts
  • F1 for help, F3 for bracketed paste
  • Multi-line editing and history
  • Better support for pasting blocks of code
     

Below are some recreated highlights from the demo. Pasting code samples that contain blank lines into the old REPL would often result in SyntaxErrors, because two newlines in a row caused the statement to be evaluated prematurely. Multi-line editing also helps with modifying code all in one place, rather than having to piece a snippet together line-by-line, modifying what you want as you go:

Demo of multi-line paste in Python 3.13 
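
For instance, pasting a snippet like the following into the old REPL fails (a recreated example, not taken from the demo): the blank line terminates the def statement early, so the dangling else: raises a SyntaxError. The new REPL accepts the whole block:

def describe(x):
    if x > 0:
        print("positive")

    else:
        print("non-positive")

describe(1)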

And the "exit versus exit()" paper-cut has been bothering Python users for long enough. This error was especially taunting because the REPL clearly knows what your intent is with it's helpful message to "Use exit() to exit":

"exit" without parenthesis just works, finally!
Windows and terminals

Support is already available for Unix consoles (Linux and macOS) in Python 3.13.0-beta1, and the standout feature request so far for PyREPL has been Windows support. Windows was left out because "historically the console on Windows was way different than Unix consoles". Łukasz continued that this is something "they don't intend to support right now", offering a "yes, but..." to users asking for Windows support.

Windows has two consoles today, cmd.exe of yore and the new "Windows Terminal" which supports many of the same features as Unix consoles including VT100 escape codes. The team's plan is to support the new Windows Terminal, and "to use our sprints here in Pittsburgh to finish". Windows support will also require removing CPython dependencies on the curses and readline libraries.

What's next for PyREPL?

The team already has plans cooking up for what to add to the REPL in Python 3.14. Łukasz commented that "syntax highlighting is an obvious idea to tackle". Łukasz also referenced an idea from Tania Allard for accessibility improvements similar to those in IPython.

Łukasz reiterated that the goal isn't to make an "uber REPL" or "replace IPython", but instead to make a REPL that core developers can use while testing development branches (where dependencies aren't working yet).

Łukasz continued that core developers aren't the only ones that these improvements benefit: "many teachers are using straight-up Python, IDLE, or the terminal because the computers they're using don't allow them to install anything else."

Given the applause from the room during the demos, it's safe to say that this work has been well received. The only concerns raised were about platform support and the rollout of the new REPL.

Gregory Smith informed the team that functionality requiring a "Function" key (i.e., F1, F2, etc.) must also be usable without Function keys, because some computers, like Chromebooks, lack them.

Carol Willing was concerned about releasing PyREPL without support for Windows Terminal, especially from a teaching perspective, describing that potential outcome as "painful". Carol wanted clear documentation on how to get the new REPL on Windows. "Positioning [the new REPL] for teaching without clear Windows instructions is a recipe for disaster".

Pablo assured that the team wants to add support for Windows Terminal in time for the first 3.13 release candidate. Pablo could not make guarantees due to a lack of Windows expertise among the three, saying "the reason I'm not saying 100% is because none of us are Windows experts. We understand what needs to be done... but we need some help."

Łukasz named Steve Dower, the Windows release expert for Python, who is "very motivated to help us get Windows Terminal support during sprints". Łukasz reiterated they're "not 100%, but we are very motivated to get it done".

Gregory Smith shared Carol's concern and framed the problem as one of communication strategy, proposing to "not promise too much until it works completely on Windows". By Python 3.14 the flashy features like syntax highlighting would have landed and the team would have a better understanding of what's needed for Windows. The team can revise the 3.13 "What's New in Python" depending on what gets implemented in the 3.13 timeline.

Ned Deily sought to clarify what the default experience would be for users of 3.13. Pablo said that "on Windows right now you will get the [same REPL] that you got before" and "on Linux and macOS, if your terminal supports the features which most of them do, you get the enhanced experience". "What we want in the sprints is to make Windows support the new one, if we get feature parity, then [Windows] will also get the new [REPL]".

Carol also asked to document how to opt out of the new REPL in case support wasn't added in time for 3.13, to avoid differences between educational material and what students see in their terminals. Kushal Das confirmed that differences across platforms are a source of problems for students, saying that "if all [students] have the same experience it's much better than just improving only macOS and Linux", to avoid students feeling bad just because of their operating system.

Pablo said that an opt-out mechanism is already in place via an environment variable, and that the team will discuss other opt-out mechanisms if educators need them.

Emily Morehouse, speaking as a Steering Council member, added that the Steering Council has requested an informational PEP on the new REPL. "Hearing concerns about how [the new REPL] might be rolled out... it sounds like we might need something that's more compatible and an easier rollout", leaving the final discussion to the 3.13 release manager, Thomas Wouters. Carol replied that she believes "we could do it in documentation".


Python Software Foundation: The Python Language Summit 2024: Lightning Talks

Fri, 2024-06-14 05:13

The Python Language Summit 2024 closed off with six lightning talks which were all submitted during the Language Summit. The talks were delivered by Petr Viktorin, David Hewitt, Emily Morehouse, Łukasz Langa, Pablo Galindo Salgado, and Yury Selivanov.

Petr Viktorin: Unsupported build warning

Do you know what happens when you build Python on an unsupported platform?

"... It works!" -- Thomas Wouters

Petr gave a short presentation on a warning that many folks using Python (and even developing Python!) may have never seen before: the unsupported build warning. This warning appears when building on a platform that's not officially supported by CPython, for example "riscv64-unknown-linux-gnu".

"The platform is not supported, use at your own risk"
(Photo credit: Hugo van Kemenade)
 

Just because a platform isn't officially supported by CPython doesn't mean Python won't work on it: CPython may well work fine there, or a subset of features may be subtly (or not-so-subtly) broken or unavailable.

Petr wanted to get a temperature check from the group on whether this warning could be further improved or changed, such as by hiding the warning after the user had executed the test suite or showing the number of tests that had failed.

The room seemed mostly uninterested in exploring this topic further and was in favor of keeping the warning as-is.

David Hewitt: Rust in Python: panic!

David Hewitt maintains the project PyO3, which offers Rust bindings for the Python C API. David explained that these bindings require mapping concepts in the Rust programming language to Python, and the topic of this talk was the panic! macro.

In Rust, the panic! macro will generate a panic, unwind the stack, and then terminate the program while providing feedback to the caller of the program. David showed that there were two methods of handling errors in Rust programs, panic! and Result.

Python functions implemented in Rust use the PyResult type, which is built on Rust's Result type, to contain either the return value or a raised exception. But what if a Rust function panics? What should PyO3 do?

Today PyO3 raises a separate exception for panics, pyo3_runtime.PanicException to be exact. This exception inherits from BaseException, typically reserved for exceptions that users won't want to catch like KeyboardInterrupt and SystemExit.

David has been receiving feedback from some users that PanicException inheriting from BaseException is annoying to work with, because every place that exceptions are caught now also needs to catch PyO3's PanicException; David gave logging of exceptions as an example case.
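
The pain point can be sketched in a few lines; since pyo3_runtime requires a compiled Rust extension, the panic is simulated here with KeyboardInterrupt, another BaseException subclass:

import logging

def might_panic():
    # Stands in for a PyO3-backed function whose Rust code panics;
    # the real exception would be pyo3_runtime.PanicException.
    raise KeyboardInterrupt

try:
    might_panic()
except Exception:
    logging.exception("handled")  # never runs: the exception isn't an Exception
# The exception propagates past the handler unless BaseException
# (or the PanicException class itself) is caught as well.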

David wanted feedback on whether the original choice to inherit from BaseException was appropriate or if there was a better answer.

Pablo Galindo Salgado asked whether an AssertionError or RuntimeError would be more appropriate. David replied that he felt not inheriting from BaseException would "cheapen" the Rust aspect of a panic.

Guido van Rossum offered that he thinks "BaseException is the correct choice", to which there was much agreement from the room.

Emily Morehouse: Formalizing the PEP prototype process

Python Steering Council Member and Language Summit chair, Emily Morehouse, spoke to the group about the PEP prototype process and how formalizing can better support PEP authors.

(Photo credit: Hugo van Kemenade)

Emily started off the talk stating "We all agree that we should be doing more testing and prototyping outside of CPython". She also referenced prior talks like pdb improvements and subinterpreters where this approach was recommended.

Emily noted that the Steering Council has pronounced this as a requirement for PEP authors. She acknowledged that this "can feel a bit bad as a PEP author to be put out into the dark world of figuring out how to gather feedback from the community" and how to manage and distribute a project.

Emily's proposal for improving the PEP process borrows from the TC39 process, which is the process for making changes and improvements to JavaScript. The proposal would make the prototype process an "official optional step of the PEP process", which would then allow creating a separate GitHub repository within the "python" GitHub organization.

This would allow the project to house its own code, issue tracker, and packages would be distributed by the Python organization instead of on someone's personal account. Emily also suggested providing a template for the repository to handle distributions to PyPI.

Emily's theory is that PEPs would see more adoption and get more feedback if they came from an official channel. This approach provides additional support to PEP authors and lets them start gathering community feedback sooner, rather than waiting for PEP pronouncement.

The room was in agreement for moving forward with the process improvement.

Carol Willing thought the improvement would be great but called back to pattern matching for CPython where the work was done in a feature branch of the CPython repository rather than a separate repository. Carol thought using a feature branch worked well for pattern matching and wanted to know how this process might work for future language changes.

Emily replied that the process would be decided case-by-case depending on the feature, whether that's a branch, a fork, or something else. Thomas Wouters agreed, saying that this proposal appears to be specifically for projects which could be distributed on PyPI rather than language features.

Łukasz Langa: Python for iOS, finally

Harking back to the previous talk on mobile support for Python, Łukasz wanted to know if the Python team should have a more official presence on phone application stores like the Apple App Store (and maybe the Google Play Store, but Łukasz declined to speak on it since he is an iOS user).

Łukasz noted that multiple applications that "are Python" already exist on his phone today. These applications are useful for trying out Python code, learning Python, and writing small programs.

However, Łukasz noted that not having an official Python application on mobile means the user experience today is sub-optimal. Some applications publish old versions of Python and are unresponsive when asked to upgrade to newer versions so users can take advantage of new features. Others have suddenly changed from being free to being paid applications. Should the Python development team do something about this?

The response from the room appeared positive, but acknowledged the amount of effort that creating and maintaining such an application would be.

Russell Keith-Magee, the author of BeeWare which is leading the charge to bring Python to mobile platforms, said "Sure, but I'm not building it". After much laughter from the room, Russell noted that the project is "an entirely achievable goal but not a small one".

Ned Deily, macOS release expert, agreed and offered that "implementing a terminal would get us most of the way there".

Pablo Galindo Salgado: Making asserts cooler in 3.14

You'll have to imagine the iconic Pablo "✨ woooooow ✨"
(Photo credit: Hugo van Kemenade)

Pablo took the term "lightning talk" to heart and gave a 90-second presentation (demo included!) on his plans to improve asserts in Python 3.14. The problem statement was summarized as "asserts are kinda sad", after which Pablo showed how, when an assert statement fails, the error gives little indication of why the condition failed.
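
The exact demo source wasn't shown, but a plausible reconstruction (with values inferred from the proposed output below) looks like this:

def bar(x, y):
    z = 11
    assert (x + 1) + z == y

x, y = 1, 2
bar(x, y)  # fails: (1 + 1) + 11 != 2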

Consider how an assertion error might look today:

Traceback (most recent call last):
  File "main.py", line 7, in <module>
    bar(x, y)
    ~~~^^^^^^
  File "main.py", line 3, in bar
    assert (x + 1) + z == y
           ^^^^^^^^^^^^^^^^^
AssertionError

Pretty opaque! In the above example you'll notice that we can't see the values of x, y, or z, which makes evaluating what went wrong difficult. Instead, with Pablo's proposed changes, the traceback would look like so:

Traceback (most recent call last):
  File "main.py", line 7, in <module>
    bar(x, y)
    ~~~^^^^^^
  File "main.py", line 3, in bar
    assert (x + 1) + z == y
           ^^^^^^^^^^^^^^^^^
AssertionError: assert ((1 + 1) + 11) == 2

With this change the values are visible for the asserted statement. Similarly, containers could be inspected to show where their contents differ, à la pytest:

Traceback (most recent call last):
  File "main.py", line 2, in bar
    assert x == y
AssertionError: assert Lists differ: [1, 2, 3, [1, 2]] != [1, 2, 3, [1, 3]]

First differing element 3:
[1, 2]
[1, 3]

- [1, 2, 3, [1, 2]]
?               ^

+ [1, 2, 3, [1, 3]]

Pablo intends to put together a PEP for this feature, including asking questions like whether the process should be hookable and whether user code should be able to provide custom formatters. Stay tuned for that!

Yury Selivanov: Efficient data sharing between subinterpreters

The final talk of the Language Summit was from Yury Selivanov on Memhive, a new "highly experimental" project which adds support for structured data sharing between Python subinterpreters.

Per-interpreter GIL is a newer feature of Python that allows running multiple "interpreters" in a single instance of Python, each with its own Global Interpreter Lock (GIL). Per-interpreter GIL allows for true multi-core parallelism; previously, using threads in a Python process would only allow a single Python instruction to execute at a time because the GIL was shared across all threads.

Being in the same process means that each subinterpreter shares the same memory space, and if instructions are executing truly concurrently we run into problems with that shared memory space. PEP 684, which specifies per-interpreter GIL, calls out this issue and for now keeps memory allocators using global locking mechanisms.

Yury started the talk by discussing immutable data structures and their properties, the most interesting being how quickly they can be copied in memory. Deep copies are fast for immutable data structures because they are implemented as a single copy-by-reference.

An immutable mapping collection, specifically a hash-array mapped trie (HAMT), has already been implemented in Python for the contextvars module. Context variables need to be copied for every new asynchronous task, so being efficient is important to not impact performance of all async Python workloads.
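
As a rough illustration of that efficiency, copying a context is cheap because the underlying HAMT shares structure instead of duplicating it (a minimal sketch):

import contextvars

var = contextvars.ContextVar("var", default=0)

ctx = contextvars.copy_context()  # a cheap snapshot, not a deep copy
ctx.run(var.set, 42)              # the mutation lands only in the copy

print(var.get())  # 0, the original context is untouched
print(ctx[var])   # 42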

Yury explaining how to replant a trie 🌲
(Photo credit: Hugo van Kemenade)

HAMTs work by transparently updating the trie structure in the background of the mapping, allowing for structural sharing while minimizing the overhead of creating new copies.

The invariant for this to work across subinterpreters is that the immutable collection in the main interpreter must not be garbage collected. Maintaining this invariant will require reliable reference counting across subinterpreters ("remote IncRef"). The proposed implementation would have every subinterpreter maintain multiple queues for tracking local and remote reference counts.

Yury explained that similar to how HAMTs provide an immutable mapping collection there is another data structure for immutable list collections which is "just another 5,000 lines of C" (which received some chuckles) and "luckily we won't be the first ones to implement this collection".

A comparison of pickling mappings versus sharing mappings as immutable data structures showed that the immutable data structures were much more performant. Performance was better for immutable mappings for both small and large numbers of keys, between 6x and a "ridiculous" 150,000x faster.

"I believe these are the missing components for subinterpreters", Yury noted with many thanks to Eric Snow who has been working on subinterpreters and per-interpreter GIL for years. Yury concluded that this work is being done with a practical use-case in mind so will be completed and usable for others including CPython.

For folks looking for more on this topic, Yury also gave a talk at PyCon US 2024 about his work on Memhive.


Python Software Foundation: The Python Language Summit 2024: Annotations as Transformers

Fri, 2024-06-14 05:13

The final talk of the main schedule of the Python Language Summit was delivered by Jason R. Coombs on using annotations for transforms. The presentation was accompanied by a GitHub repository and Jupyter notebook illustrating the problem and proposed solution.

Jason is interested in a method for users to "transform their parameters in a reusable way". The motivation was to avoid imperative methods of transforming parameters to "increase reusability, composition, and separation of concerns". Jason imagined transformers which could be "packaged up in a library or used across multiple functions" and would "be applied at the scope of individual parameters".

Python already has a language feature that's similar to this concept with decorators, which allow wrapping a function or class with another function in a syntactically concise way.

Jason noted that "return values can be handled by decorators fairly easily, so [the proposal] is more concerned with input parameters". For a decorator to affect parameters, the decorator "would have to inspect the parameters" and "entangle itself with the function signature".

Diagram from Jason's presentation showing transforms being applied to individual parameters of a function.

Jason's proposal would use type annotations, since annotations already specify the desired type; the proposal adds the behavior "this is the type I want to make this" by performing transforms. Below is some example code of the proposal:

def transformer(val: float | None) -> float:
    return val if val is not None else 0

def make_str(val: float) -> str:
    return str(val)

def my_fn(
    p1: transformer,
    p2: transformer
) -> make_str:

    return (p1 ** 2) + p2

Jason went on to show that Pydantic was offering something similar to his proposal by having functions called on parameters and return values using the pydantic.BeforeValidator class in conjunction with typing.Annotated, though this use-case "wasn't being advertised by Pydantic":

from typing import Annotated
import pydantic

def transformer(val: float | None) -> float:
    return val if val is not None else 0

@pydantic.validate_call(validate_return=True)
def my_fn(
    p1: Annotated[float, pydantic.BeforeValidator(transformer)],
    p2: Annotated[float, pydantic.BeforeValidator(transformer)]
) -> Annotated[str, pydantic.BeforeValidator(str)]:

    return (p1 ** 2) + p2

Jason didn't like this approach, though, due to the verbosity of requiring both a decorator and annotations, and due to needing an extra dependency.

Eric V. Smith asked if Jason had seen PEP 712 (which Eric sponsors), which describes a "converter" mechanism for dataclass fields. This mechanism is similar in that "the type you annotated something with became different to the type you passed". Eric remarked it is a "pretty common thing that people want to pass in different types when they're constructing something than the internal types of the class".

Jason replied that he had seen the PEP but "hadn't incorporated it into a larger strategy yet". Steering council member Barry Warsaw noted that he "didn't know what the solution is, but it is interesting... that the problems are adjacent".

There was skepticism from the room, including from typing council member Guido van Rossum, on using type annotations as the mechanism for transformers. Type annotations today don't affect the runtime behavior of the code and this proposal would be a departure from that, Guido noting "process-wise, that's going to be a difficult hurdle".

If type annotations weren't the way forwards, Jason had also considered proposing new syntax or a new language feature and wanted feedback on whether "there's viability" in that approach and if so, "[he] could explore those options".

There were questions about why decorators weren't sufficient, citing the PEP 318 motivation section, which contains examples similar to the ones Jason had presented. Transformers could be assigned to parameters by name, passing the transformers as keyword arguments to the decorator like so:

def transformer(val: float | None) -> float:
    return val if val is not None else 0

@apply(p1=transformer, p2=transformer)
def my_fn(
    p1: float,
    p2: float
) -> float:

    return (p1 ** 2) + p2
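
For concreteness, here is one minimal sketch of how such an apply decorator could be implemented (illustrative only, not code from the discussion):

import functools
import inspect

def apply(**transformers):
    def decorator(fn):
        sig = inspect.signature(fn)

        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            # Run each named transformer on its matching argument.
            for name, transform in transformers.items():
                if name in bound.arguments:
                    bound.arguments[name] = transform(bound.arguments[name])
            return fn(*bound.args, **bound.kwargs)

        return wrapper
    return decorator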

Jason found this pattern "discouraging" and "less elegant" because the variable name needs to be mentioned in multiple places, and he was "hoping for something that was more integrated into the language, to not feel like a second-class feature".

Łukasz Langa commented that one of the use-cases, removing the "None" type from a union, can already be done with a type guard, and drew attention to work being done to allow more complicated type guards. Łukasz was "sympathetic to conciseness, but type checkers already handle this".

Steering Council member Gregory Smith was hesitant to make any change in this area. He agreed that "as a language, we're missing something", but "wasn't sure if we've got a way forward that doesn't make the language more complicated".


Python Software Foundation: The Python Language Summit 2024: Limiting Yield in Async Generators

Fri, 2024-06-14 05:13

Zac Hatfield-Dodds came to the Language Summit to present on a fundamental incompatibility between the popular async programming paradigm "structured concurrency" and asynchronous generators, specifically around exception handling when the two are mixed together.

Structured Concurrency

Structured concurrency is becoming more popular for Python async programming like with Trio "nurseries" and in the Python standard library with the addition of asyncio.TaskGroup in Python 3.11.

When using structured concurrency, active tasks can be thought of as a tree-like structure where sub-tasks of a parent task have to exit before the parent task itself can proceed past a pre-defined scope. This exit can come through all the tasks completing successfully or from an exception being raised either internally or externally (for example, in the case of a timeout on time-bounded work).
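
asyncio.TaskGroup shows this shape well; in the sketch below the parent is guaranteed not to proceed past its async with block until every child task has finished (or an exception has propagated):

import asyncio

async def child(n):
    await asyncio.sleep(n / 10)
    return n

async def main():
    async with asyncio.TaskGroup() as tg:
        t1 = tg.create_task(child(1))
        t2 = tg.create_task(child(2))
    # Both tasks are guaranteed to be done here; an exception in a
    # child would have propagated out of the async with block instead.
    print(t1.result(), t2.result())

asyncio.run(main())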

The mechanism which allows a parent task and its sub-tasks to cooperate in this way is called a "cancel scope" which Trio makes a top-level concept but is implicitly used in asyncio.TaskGroup and asyncio.timeout.

Async programs that are structured with this paradigm can rely on exceptions behaving in a much more recognizable way. There's no more danger of a spawned sub-task silently swallowing an exception because all sub-tasks are guaranteed to be checked for their status before the parent task can exit.

The problem with yields

The fundamental issue is that yields suspend the current call frame, in effect "returning" a value, and then the generator needs to be "called" again for execution to be resumed. This suspension doesn't play well with structured concurrency because execution can't be suspended in the same call frame as a cancel scope, otherwise that scope can't process exceptions from its child tasks.


Zac leading a "fun game of 'why is this code broken?'"
(Photo credit: Hugo van Kemenade)

Zac presented some innocuous looking code samples that suffered from the described issue:

import asyncio

async def iter_with_timeout(ait, max_time):
    try:
        while True:
            async with asyncio.timeout(max_time):
                yield await anext(ait)
    except StopAsyncIteration:
        return

async def fn():
    async for elem in iter_with_timeout(ait, max_time=1.0):
        await do_something_with(elem)

In this example, asyncio.timeout() could expire while the yield had suspended the generator and before the generator was resumed. This scenario would result in the cancellation exception being raised in the outer task outside of the asyncio.timeout() cancel scope. If things had gone to plan and the generator wasn't suspended the cancellation would be caught by asyncio.timeout() instead and execution would proceed.

Zac presented the following fix to the iter_with_timeout() function:

async def iter_with_timeout(ait, max_time):
    try:
        while True:
            async with asyncio.timeout(max_time):
                tmp = await anext(ait)
            yield tmp  # Move yield outside the cancel scope!
    except StopAsyncIteration:
        return

By moving the yield outside the cancellation scope it means that the suspension of the frame isn't happening when execution is inside a cancellation scope. This means that propagation of cancellation errors can't be subverted by a suspended call frame for this program.

If you're still having trouble understanding the problem: you are not alone. There was a refrain of "still with me?" coming from Zac throughout this talk. I recommend looking at the problem statement and motivating examples in the PEP for more information.

Where to go from here

Zac and Nathaniel Smith have coauthored PEP 789 with their proposed solution of disallowing yield statements within context managers that behave like cancel scopes. Attempting to yield within these scopes would instead raise a RuntimeError.

The mechanism would be a new function, "sys.prevents_yields()", which would be used by authors of async frameworks to annotate context managers that can't be suspended safely. Users of async frameworks wouldn't need to change their code unless it contained the unwanted behavior.

The language would need to support this feature by adding metadata to call frames to track whether the current frame should allow yields to occur.

Mark Shannon was concerned that the solution was "lots of machinery to handle the exception being raised in the wrong place" and sought clarification that there would be overhead added to every call and return. Zac confirmed this would be the case, but that it could be done with "one integer [member on call frames] that you increment and decrement, but it would do some operation on every frame call and return".

Irit Katriel asked why a "runtime error" was being used "instead of something static". Zac explained that users might define their own context managers which have a "cancel scope property" and the runtime "wouldn't know statically whether a given context manager should raise an error or not".

Łukasz Langa asked whether adding a type annotation to context managers would be sufficient to avoid adding runtime overhead. Zac responded that "there are still many users that don't use static type checking", and that "there's no intention to make it required by default". Łukasz was concerned that the proposal "would be contentious for runtime performance" due to the impact being "non-trivial".

Pablo Galindo Salgado wanted to explore other big ideas to avoid the performance penalty like adding new syntax or language feature, such as "with noyield" to provide a static method of avoiding the issue. Zac agreed that changing the context manager protocol could also be a solution.

Guido van Rossum lamented that this was "yet another demonstration that async generators were a bridge too far. Could we have a simpler PEP that proposes to deprecate and eventually remove from the language asynchronous generators, just because they're a pain and tend to spawn more complexity".

Zac had no objections to a PEP deprecating async generators¹. Zac continued, "while static analysis is helpful in some cases, there are inevitably cases that it misses which kept biting us... until we banned all async generators in our codebase".

¹ Editor's note: after the summit, an update to PEP 789 described how the problem doesn't exist solely in async generators, and thus removing the feature wouldn't solve the problem, either.


Python Software Foundation: The Python Language Summit 2024: Should we make pdb better?

Fri, 2024-06-14 05:13

Tian Gao came to the Language Summit 2024 to talk about improving pdb, short for "Python debugger", a module and command line tool for debugging Python.

Tian Gao presenting on how to improve pdb

There are not many command-line debugger alternatives to pdb for Python. Tian mentioned a few, including PuDB, pdb++, and ipdb, but those alternatives are all themselves based on either pdb or another standard library module, bdb.

pdb is the only "standalone" command-line-based Python debugger

Tian presented a laundry list of desirable new features that could be added to pdb, including:

  • Showing more lines of code around the current breakpoint.
  • Colors in the terminal, syntax highlighting.
  • Customization, with defaults being safe.
  • Handling of more scenarios (threads, asyncio, bytecode, remote debugging)
Performance and backwards compatibility

The biggest issue according to Tian, which he noted had been discussed in the past, was the performance of pdb. "pdb is slow because [sys.settrace] is slow, which is something we cannot change", and the only way forward on making pdb faster is to switch to sys.monitoring to avoid triggering unnecessary events.

Switching to sys.monitoring would give a big boost to performance. According to Tian, "setting a breakpoint in your code in the worst case you get a 100x slowdown compared to almost zero overhead with sys.monitoring". Unfortunately, switching isn't so easy; Tian noted there are serious backwards compatibility concerns for the standard library module bdb if pdb were to start using sys.monitoring.
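
To give a feel for the difference, here is a minimal sketch of driving sys.monitoring directly (available since Python 3.12); LINE events are enabled for a single code object, so all other code keeps running at full speed:

import sys

TOOL = sys.monitoring.DEBUGGER_ID
sys.monitoring.use_tool_id(TOOL, "demo-debugger")

def on_line(code, line_number):
    print(f"{code.co_name}:{line_number}")

sys.monitoring.register_callback(TOOL, sys.monitoring.events.LINE, on_line)

def target():
    x = 1
    return x + 1

# Only `target` pays the monitoring cost; sys.settrace would instead
# fire for every line of every frame in the program.
sys.monitoring.set_local_events(TOOL, target.__code__, sys.monitoring.events.LINE)
target()
sys.monitoring.free_tool_id(TOOL)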

"If we're not ready to [switch to sys.monitoring] yet, would we ever do this in the future?", Tian asked the group, noting that an alternative is to create a third-party library and encourage folks to use that library instead.

Thomas Wouters started off saying that "bdb is a standard library module and it cannot break user code" and cautioned that core developers don't know who is depending on modules. bdb's interface can't have backwards incompatible changes without long deprecation periods. In Thomas' mind, "the answer is obvious, leave pdb as it is and build something else".

Thomas also noted "in the long-term, a debugger in the standard library is important" but that development doesn't need to happen in the standard library. Thomas listed the benefits for developing a new debugger outside the standard library like being able to publish outside the Python release schedule and to use the debugger with older Python versions. Once a debugger reaches a certain level of stability it can be added to the standard library and potentially replace pdb.

Tian agreed with Thomas' proposal in theory, but was concerned that a third-party debugger on PyPI wouldn't see the same level of adoption as one in the standard library, and would thus struggle to reach a threshold of "stability" without a critical mass of users. Or, worse yet, maintainers wouldn't be motivated to continue due to lack of use, resulting in a "dead project". (Some foreshadowing: Steering Council member Emily Morehouse gave a lightning talk on this topic later in the Language Summit.)

Łukasz Langa noted that Python now has support for breakpoint() and that "what breakpoint() actually does, we can change. We can run another debugger if we decide to", referencing that if a better debugger were added to CPython in the future, it could be made the new default for breakpoint().
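
That mechanism comes from PEP 553: breakpoint() dispatches through sys.breakpointhook, which defaults to pdb.set_trace but honors the PYTHONBREAKPOINT environment variable (a small sketch):

def buggy(n):
    total = 0
    for i in range(n):
        # Drops into pdb by default; PYTHONBREAKPOINT=ipdb.set_trace
        # swaps in another debugger, and PYTHONBREAKPOINT=0 disables it.
        breakpoint()
        total += i
    return total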

Russell Keith-Magee from BeeWare, was interested in what Tian had said about remote debugging, noting that "remote debugging is the only way you can debug [on mobile platforms]". Russell would be interested in pdb or a new debugger supporting this use-case. Tian noted that unfortunately remote debugging would be one of the more difficult features to implement.

Pablo Galindo Salgado, commenting on existing Python "attach-to-process" debuggers, said that the hacks in use today are "extremely unsafe". Pablo said that "we'd need something inside CPython [to be safe], but then you have another problem, you have to implement that feature on [all platforms]". Pablo also mentioned that attach-to-process debugging is usually a bad model because it can't be enabled by default for security reasons but "you won't know when you'll need to debug".

Anthony Shaw asked about the scope of the project and was interested in whether there could be a framework for debugging in CPython that pdb and others could build on. Anthony pointed out that many other debuggers "needed to do a bunch of hooks and tricks" to do debugging because it's "not provided out of the box by CPython".

Tian responded that "bdb is supposed to do that, but it was written 30 years ago so is too old to support new things that a debugger wants". Others mentioned that sys.monitoring (new in Python 3.12) was meant to be a framework for debuggers to build on.

Gregory Smith, Steering Council member, said he "wants all of these things" and agreed with Thomas to "develop this as much as you can... outside of the standard library", telling Tian that "you're going to end up in a better state that way". Greg's primary concern was whether CPython needed to do anything to enable Tian's proposal. He continued, "it sounds like we (CPython) have most of what we need, but if we don't let's get that planned so we can enable a successful separate project before we ship it with Python in the future".


Python Software Foundation: The Python Language Summit 2024: Python on Mobile

Fri, 2024-06-14 05:13

Malcolm Smith from BeeWare presented on the status and direction of Python on mobile platforms like iOS and Android. BeeWare has been working on bringing Python to mobile for a few years now. Previously Russell Keith-Magee gave a talk at the Language Summit in 2023 on BeeWare to announce plans for Tier 3 support for Python on Android and iOS in Python 3.13 along with Anaconda's funded support for the project.

Now we've arrived at Python 3.13 pre-releases, and things are going well! Malcolm reported that "the implementations are nearly complete" along with thank-yous to the core developers who helped with the project.

Overview of current Python mobile platform support

There are no plans to implement support for the other platforms listed in the table, iOS x86_64 and Android ARM32/x86. There aren't any actual physical devices for iOS on x86_64, as that architecture is only used for development simulators.

For Android, the ARM32 and x86 platforms are being phased out due to being 32-bit architectures and today represent less than 10% of devices. For these reasons, Malcolm and team have decided not to implement support for these architectures.

Malcolm also reported that there is a buildbot for iOS and in the coming weeks there will be buildbots added for Android ARM64 and x86_64 platforms.

Let's talk packages!

Python is well-known for its rich package ecosystem, and the BeeWare team is working on bringing Python packages to mobile Python, too. "It's not enough just to have support for CPython", Malcolm said on this topic, "we also need to support the packaging ecosystem". As with many new platforms for Python, pure Python packages work without much issue and "the difficulty comes in with anything which contains native compiled components".

 

The current and future approach for mobile-friendly Python packages
 

The BeeWare team's approach so far has been to bootstrap packages with native components on their own by creating tools and "building wheels for popular packages like numpy, cryptography, and Pillow". Malcolm reported that the current approach of rebuilding individual packages isn't scalable and the team would need to help upstream maintainers build their own mobile wheels. Malcolm said the team plans to focus this year on "making it as easy as possible to produce and release [mobile] wheels within existing workflows" and contributing to tools like cibuildwheel, setuptools, and PyO3.

Malcolm also hopes that "by the end of this year some of the major packages will be in position to start releasing mobile wheels to the Python Package Index". The team has already specified a format for the wheel tags for iOS (PEP 730) and Android (PEP 738). "The binary compatibility situation is pretty good", Malcolm noted that iOS and Android both come from a single source in Apple and Google respectively meaning "there's a fairly well-defined set of libraries available on each version".

Python today provides an embeddable package for the Windows platform. Malcolm requested from the group that more official Python embeddable packages be created for each of the mobile platforms with headers and libraries to ease building Python packages for those platforms. Having these artifacts available would provide a reference for binary compatibility on those platforms.

Ned Deily, the macOS release expert for CPython, agreed that having more binary releases for macOS and iOS is something we "should definitely do in the 3.14 timeframe".

Challenges with keeping mobile buildbots green

Malcolm provided the core developer team some tips on writing Python code with these new and constrained platforms in mind. He warned that there is little to no support for spawning subprocesses, but "multi-threading on the other hand is perfectly fine on both of these platforms".

Mobile platforms also tend to be constrained in terms of security. iOS only allows loading libraries from specific folders and Android has restrictions like not being able to read the root directory or create hard links.

Given these differences, "it's reasonable to expect that mobile platforms will have more frequent failures as development proceeds, so how do we go about testing them?" The full CPython test suite is run on both mobile platforms by buildbots, but today there's no testing done before a pull request is merged. This situation leads to mobile buildbots starting to fail without the contributing developer necessarily noticing.

This problem is exacerbated by limited continuous integration (CI) resources in GitHub Actions, especially for macOS which limits virtualization on ARM64 processors. Malcolm suggested evaluating GitHub's Merge Queue feature as a potential way to solve this issue by requiring a small amount of testing on mobile platforms without blocking development of features.

Malcolm's proposal for better visibility of test failures for mobile

Łukasz Langa agreed that CI was an issue, one that he's actively looking improving, but wasn't convinced that using a merge queue would decrease the number of jobs required to run. Malcolm clarified that he is proposing only running a smaller subset of jobs per-commit in pull requests and the complete set, including some buildbots, as a part of pre-merge testing.

Many folks expressed concern about adding buildbots as part of pre-merge or per-commit checks: buildbots have no high-availability SLA and suffer occasional outages, some buildbots aren't reliable and would therefore block the merging of commits, and there were concerns about the security of running unreviewed changes on buildbots.

Thomas Wouters, Python 3.13 release manager, was "unconvinced" on adding pre-merge testing for Tier 3 platforms, something that is usually reserved for Tier 1 platforms.

Ned Deily recommended doing iOS builds as a part of existing macOS builds in GitHub Actions. This would catch build errors for the platform and would likely find some issues early without much additional investment.


Python Software Foundation: The Python Language Summit 2024: Free-threading ecosystems

Fri, 2024-06-14 05:13

Following years of excitement around the removal of the Global Interpreter Lock (GIL), Python without the GIL is coming soon. Python 3.13 pre-releases already have support for being built without the GIL using a new --disable-gil compile-time option:

# Download
wget https://www.python.org/ftp/python/3.13.0/Python-3.13.0b2.tgz
echo "c87c42aa8137230a15a02ed90a6600610ba680cb5b54c0fbc57581a0d032e0c4  ./Python-3.13.0b2.tgz" | sha256sum --check
tar -xzvf ./Python-3.13.0b2.tgz

# Build
cd Python-3.13.0b2/
./configure --disable-gil
make

# Run with no GIL!
./python -X gil=0 -c "import sys; print(sys._is_gil_enabled())"
False

But simply having GIL-less Python is not enough, code needs to be written that is safe and performant without the GIL using both the C and Python APIs.

This year at the Language Summit, Daniele Parmeggiani gave a talk about ways Python can enable safe and performant concurrent code without locking CPython into a specific implementation or memory model.

Don't leak the details

Daniele started his talk, like many Python users, with cautious enthusiasm about the prospect of free-threading in Python:

"Given the acceptance notes to PEP 703, one should argue for caution in discussing the prospects of new multi-threading ecosystems after the release of Python 3.13 — with a hopeful spirit I will disregard this caution here."

-- Daniele Parmeggiani

Daniele detailed a feature request he had opened to create a public function for the private C API function "_Py_TRY_INCREF()". Daniele wanted to use this function to increment an object's reference count safely in a truly multi-threaded Python where a reference count might be decremented concurrently to an increment.

Daniele continued, "[Sam Gross] responded as thoughtfully and thoroughly as he usually does that the function shouldn't be public, and I agree with him".

The semantics of _Py_TRY_INCREF() today are tied to the specific implementation of free-threading and without a guarantee that the underlying implementation won't change Daniele does not think the function "should ever be made public".

But without this functionality Daniele's problem still stands, where do we go from here?

Higher-level APIs to the rescue

"At a higher-level it's possible to write further guarantees without constraining what's under the hood". Daniele started a single step up in abstraction, detailing an atomic reference API:

PyObject *AtomicRef_Get(AtomicRef *self)
{
    PyObject *reference;
    reference = self->reference;
    while (!_Py_TRY_INCREF(reference)) {
        reference = self->reference;
    }
    return reference;
}

This would be "trivial to implement" with the new garbage collection scheme in Python 3.13 ("quiescent state-based reclamation" or QSBR), "but what if [Python 3.14] were to change this scheme radically? Or what if 3.15 decides to do away with it entirely?"

Daniele eschewed making guarantees about low-level APIs at this stage of development, but concluded that "an API for atomically updating a reference to a PyObject seems like a high-level use-case worth guaranteeing, regardless of any implementation of reference counting".

Atomic data structures

Daniele continued exploring higher-level concepts that Python could provide at this stage of free-threading by looking to what other languages are doing.

Java provides a java.util.concurrent package containing some familiar faces for Python concurrency users like Semaphores, Locks, and Barriers, but also some other atomic primitives that map to Python classes like dicts, lists, booleans, and integers. Daniele asked whether Python should provide atomic variations for primitives like numbers and dictionaries.

Daniele explained that many atomic data structures use the "compare-and-set" model to synchronize read and write access to the same space in memory. Compare-and-set requires the caller to specify an expected value: if the value in memory matches the expected value, it is updated to the new value, and the call returns whether the operation was successful.
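
Those semantics can be sketched in plain Python; a real atomic primitive would do this without a lock, so the lock below only models the atomicity:

import threading

class AtomicInt:
    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def compare_and_set(self, expected, new):
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False

counter = AtomicInt()

def increment(atomic):
    while True:  # the classic compare-and-set retry loop
        current = atomic._value
        if atomic.compare_and_set(current, current + 1):
            return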

Daniele explained that compare-and-set establishes a "happens-before" ordering between concurrent writes to the same memory location, joking that the phrase "happens-before" may spark thoughts of memory models which he wished to avoid.

Today Python doesn't have any method of reordering memory accesses which would require thinking about the memory model. Daniele noted that may come one day from the new just-in-time compiler (JIT).

Daniele was already developing an atomic dictionary class and had seen performance gains over the existing standard library dictionary with the GIL disabled (with lower single-threaded performance):

Performance comparison of dict with and without the GIL and Daniele's AtomicDict

Daniele observed that the free-threading changes actually decreased the performance for write-heavy workloads on builtin types like dictionaries because "Python programs will now actually be subject to memory contention". When multiple threads attempt to mutate a list or dictionary, "it will be as if the GIL is still there, [the threads] will all be contending for one lock", offering that "new concurrent data structures would alleviate this performance issue".
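
The contention pattern Daniele described can be sketched like so; in a free-threaded build every write below still serializes on the dict's internal lock, much as it once serialized on the GIL:

import threading

shared = {}

def writer(tid, n=100_000):
    for i in range(n):
        shared[(tid, i)] = i  # every write contends for the same dict

threads = [threading.Thread(target=writer, args=(t,)) for t in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()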

Daniele wanted to know what primitives Python should offer for C extension developers targeting free-threaded builds, or asked if it's still too early to make guarantees:

"As the writer of a C extension looking to implement concurrent lock-free data structures for Python", Daniele asked of the room, "does CPython eventually wish to incorporate... either high-level atomics or low-level routines?"

Daniele continued, "if not the atomics, then new low-level APIs like _Py_TRY_INCREF() will be necessary in order not to force the abuse of locks in external efforts towards new free-threading ecosystems".

Discussion

Thomas Wouters, channeling the Steering Council's intent when accepting PEP 703 last October, said "we don't know yet what users will actually need"; the Steering Council didn't want to "prematurely optimize" and mandate features be implemented without that knowledge.

Thomas recommended building solutions to "production use-cases" as PyPI packages or separate projects before deciding to pull those solutions into Python, summarizing the sentiment with, "we need to take our time and make sure we're doing the right thing".

Steering Council member Barry Warsaw agreed with Thomas on strategy, also adding that "[atomic references] might be something [Python] needs to make sure the interpreter doesn't crash with some of our own C code". Barry was interested in how to "ensure that the interpreter stays safe in the face of free-threading without necessarily thinking about the right APIs for the higher-level data structures".

Sam Gross, author and main implementer of PEP 703 to make the GIL optional in CPython, commented on making additional guarantees to standard library collections, saying "we're going to find situations that are ambiguous where no one's promised thread-safety or [the lack of thread-safety]".

Sam would also like to see "scalable collections" on PyPI (and "would love to see in Python eventually too") that are "designed not just to be thread-safe, but to scale well with certain workloads". Sam noted that builtin data classes like dict and list "can only make so many trade-offs" and tend to "focus on single-threaded performance" or "multi-threaded read-only access".

Eric Snow wanted to see immutable data structures be considered, too, noting the benefits to performance and shareability that Yury Selivanov was seeing when using them with sub-interpreters.

Gregory Smith sympathized with Daniele on wanting to avoid thinking about memory models, but "had a sneaking suspicion we kinda have to anyway". Greg was concerned about other stacks like data science and machine learning "re-interpreting Python code and transforming it into other things that run on other hardware". Without a clear definition, people "make their own assumptions" and get confused when code runs differently in different places.

Replying to Greg, Daniele offered that there's already a mechanism for determining whether an object is shared between threads "which might be a first-step", but that this "was a detail of the implementation, and not a part of the language".

Guido van Rossum began by being "wary of looking to Java for examples", stating that many APIs that Python borrowed from Java were eventually deprecated and removed.

Guido commented that "there will be other people with much higher-level ideas on concurrency" and recommended "to wait as long as we can before we build anything into the language explicitly or implicitly". Guido also felt it was "important that we have sub-interpreters as well as free-threading, so people can play with different models before we commit to anything".

Overall, the group seemed interested in Daniele's work on atomics but didn't seem willing to commit to exact answers for Python yet. It's clear that more experimentation will be needed in this area.

 

 


Python Software Foundation: The Python Language Summit 2024: Native Interface and Limited C API

Fri, 2024-06-14 05:13

Back in October 2023, PEP 731 proposed a new C API working group charged with overseeing and coordinating the development and maintenance of the Python C API. This working group spawned from a series of discussions on the C API from the Language Summit in 2023 and creation of an inventory of problems with the C API at the 2023 core developer sprint.

Two inaugural C API working group members, Petr Viktorin and Victor Stinner, presented back-to-back talks on the C API and gave context on what's been happening in the past year.

What does the C API working group do?

The first of the two C API talks was given by Petr Viktorin on the "Native Interface" and some of the first steps towards an idealized C API.

Petr started off by explaining that the C API working group makes two types of decisions: what functionality to expose via the C API and how to expose it. Petr also explained that the C API working group keeps two separate issue trackers, one for incremental "evolution" of the C API and another for "revolution", a place where more "radical" ideas are discussed.

The existing C API wasn't designed with the knowledge, context, and needs of today (like free-threading), but there are many good parts of the C API. Petr explained that one of the more impactful things the working group has done is to formalize "guidelines to get consistency with the good parts of the existing API".

Petr gave an example of what can go wrong with the PyLong_GetSign() function. This API has a baked-in type check that can't be avoided due to its function signature and thus incurs a performance penalty even when the caller has already checked whether the object is the correct type.

This extra performance penalty means that CPython itself uses its own private API which avoids the type check, but this extra private API only for CPython isn't a great experience. Other languages and projects want access to the more performant API, too.

Petr went on to reference Mark Shannon's proposal for a new C API, which Petr called "close to perfect", with caveats around not dropping existing APIs and around the name, for which Petr instead suggested "Native Interface".

"Unfortunately we need to keep the old API around. We can't just remove a chunk of the existing API just because it's old", Petr lamented. Petr also noted that not being able to remove parts of the existing API might mean that the Faster CPython project loses some incentive to work on the new C API.

C API decisions are made on three axes: performance, safety, and convenience. Petr argued that of the three, "performance should be prioritized", because a convenient and safe layer can be built on top of a performant API with the right amount of context.

Annotating the existing C API

Petr noted that we have experience within Python for adding a safety layer on top of APIs in Python: type hints! Type hints in Python provide context into an API's inputs and outputs that can be checked using external tooling without incurring a runtime performance penalty.

Petr proposed adding annotations to C function signatures for function behaviors like "returns a null pointer on error" or "never returns a null pointer" which can then be used in other contexts like documentation or borrow checking. Among the proposed annotations were some about whether references were borrowed, stolen, or a new reference, which can be used to check consistency of references.

List of possible annotations for C API functions

Petr also noted that many of these annotations apply not only to new APIs but to existing APIs as well. Implementing these annotations as empty C macros means that behavior and performance isn't impacted but can be parsed from header files.

Petr's slides showing the annotations in use as C macros

To go along with these new annotations, Petr proposed writing a tool similar to Argument Clinic. Argument Clinic is a tool maintained by the CPython team which automatically generates boilerplate code like function signatures and argument unpacking based on input instructions.

Mark Shannon asked to clarify whether the priority was to improve the C API or document existing behavior. Petr's plan was to add annotation information to the existing API and to wait on implementing the new Native Interface until later. This plan wouldn't change the behavior of any existing API, but APIs which aren't conforming would receive a new variant that conforms to the new C API standards.

Victor Stinner asked whether the annotation information would be stored in a separate file. Petr noted that a separate file is the plan to make it easier to wrap the API and to avoid needing to parse header files directly.

PyO3 maintainer David Hewitt asked whether the plan was to include variations that avoid type checks for all C API functions to dodge the performance penalty for C API wrappers. David noted that PyO3 implemented many C API function calls as methods on wrapped objects. This means that the type check was implicit and thus could avoid having types checked again by the C API function. David also clarified that these extra checks "aren't a major performance drag", but it would be great to remove the inefficiencies if possible.

Petr answered that wrappers will need to wait for the Native Interface to be implemented to expose the underlying C API functions which don't include type checks.

There was enthusiastic agreement from the room about using annotation information for documentation and automatically generating boilerplate code and checks along with being able to do borrow checking using annotation information.

Limited C API

The second C API talk was given by Victor Stinner on the status of the Limited C API. The Limited C API is a subset of the Python C API that's consistent across different versions of Python. Projects opt in to the Limited C API using #define Py_LIMITED_API; once defined, only public functions of the Limited C API can be used.
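
As a rough sketch of how a project might opt in with setuptools (the module name here is hypothetical, and exact build configuration varies by project and setuptools version):

    from setuptools import Extension, setup

    setup(
        ext_modules=[
            Extension(
                "example",  # hypothetical extension module name
                sources=["example.c"],
                # Build against the Limited C API for Python 3.7+, so the
                # resulting wheel can carry the version-independent abi3 tag.
                define_macros=[("Py_LIMITED_API", "0x03070000")],
                py_limited_api=True,
            )
        ],
    )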

Victor started off by listing his long-term goals for the Python C API, which mostly focused on reducing friction both for maintainers of the Python C API and for third parties using the API or updating to support new Python versions. One possibility to achieve this would be to "move to using the Limited C API by default and use the Stable ABI for everybody" but Victor noted this is a "very long term goal".

Getting to this goal is challenging because it's difficult to know how a given change will affect the ecosystem of Python projects, both for finding affected projects and how widespread breakage would be for users. Victor explained that each change typically only requires "1-10 lines of code changed per impacted project" to fix issues.

Trying to move all functions from private to either public or internal

Victor's biggest project currently is to remove private functions from the C API, specifically functions which begin with an underscore "_" by convention. Victor explained that he removed all 300 private functions starting with "_Py" for 3.13.0-alpha1 to discover how and where private APIs are used by downstream projects. Victor and team anticipated that this mass-removal would cause breakages, so after the initial round of discovery the removed functions causing the most issues have been re-added in 3.13.0-alpha2.

As of 3.13.0-beta1, 264 of the over 300 functions are still removed. The functions which have been added back are not simply left as-is, either: once a removed private function is discovered to be in use, the C API working group gets a chance to design a new public C API function for projects to use instead.

"The goal isn't to annoy people, the goal is to provide better functions for everybody" -- Victor Stinner

These new public C API functions would have documentation, tests, backwards compatibility guarantees, and can benefit from the new C API working group guidelines around API design. Victor gave an example of the PyDict_Pop() API which previously required checking for an error condition using PyErr_Occurred() to disambiguate between a key not being in the dictionary or if any other error occurred.

The new PyDict_Pop() function returns -1, 0, and 1 for the "error", "not found", and "found" cases respectively, in accordance with the new C API guidelines, meaning a call to PyErr_Occurred() is avoided.

New PyDict_Pop() public function with improvements
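
For comparison, pure-Python code sidesteps the same ambiguity with a sentinel default to dict.pop() rather than inspecting exception state, loosely analogous to the new tri-state return (a sketch for illustration, not part of the C API work itself):

    inventory = {"apples": 3}
    _MISSING = object()  # unique sentinel that can't collide with stored values

    value = inventory.pop("pears", _MISSING)
    if value is _MISSING:
        print("not found")  # disambiguated without consulting exception state
    else:
        print("found:", value)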

The pythoncapi-compat project, which Victor is a maintainer of, provides backfills for these new 3.13 APIs for Python 3.12 and older. This means that projects can immediately start taking advantage of new APIs which are better designed and return strong references. Victor highlighted in particular PyDict_GetItemRef() and others which are new in 3.13 and are important for free-threading due to PyDict_GetItem() returning a borrowed reference instead of a new strong reference.

Slide from Victor's presentation on current Limited C API adoption

The biggest users of the Python C API like Cython, PyO3, pybind, and more are at various stages of supporting the Limited C API, most of which require an opt-in for builds.

Victor's top project in the coming months and years will be to move the C API away from using structures ("C structs") like PyFrameObject, PyThreadState, and PyTypeObject. Victor noted that projects like Cython, greenlet, gevent, and more have to reach directly into structure members, which can cause breakages when upgrading to new Python versions. Victor explained that there is no way to handle this with the Limited C API today. "We already provide many helper functions like getters and setters, but we need to provide even more", said Victor as a way forward on this issue.

Petr questioned the approach of "breaking current projects so that future Python versions don't break them", saying that it'd be better to warn projects about using private API functions that aren't supported and wait to introduce breaking changes when it's necessary to progress the C API.

Victor replied that he'd already started work on a PEP to opt in to build errors when a project is using deprecated functions, "like a strict mode for the C API". Victor agreed that the current plan isn't great in this way, "we ask people to update their code and the timeline is very short, we expect people to update in one year's time", noting the circumstances where this can be difficult, such as unmaintained projects or solo maintainers.

Petr also added here that the opt-in would need to be versioned per Python version, so users can have control over when they want to do the work to move to new C API functions.

Eric Snow and Mark Shannon remarked on a more incremental strategy. This strategy would see deprecated functions moved structurally into a separate file ("legacy.c" and "legacy.h") but with the behavior preserved to have a clearer idea of what functions Python developers want to remove. After being moved the functions would be implemented using newly designed APIs where possible. Others noted that this would only be a convenience for core developers and projects that are interested in internals like PyO3 and Cython.

David Hewitt commented on the long feedback cycles, as downstream projects of the Stable ABI are still using Python 3.7 as a target, so any changes to the Stable ABI may not receive feedback until many years later. Victor responded that he's working on a new project that implements new functions of Python for old Python versions.

Overall, the work and proposals presented by both Petr and Victor were well-received by the room. It's clear that the Python C API is in good hands with the C API working group and is moving in the right direction to solve tomorrow's problems.

Categories: FLOSS Project Planets

Python Software Foundation: The Python Language Summit 2024: Should Python adopt Calendar Versioning?

Fri, 2024-06-14 05:12

 

Hugo van Kemenade, the newly announced Release Manager for Python 3.14 and 3.15, started the Language Summit with a proposal to change Python's versioning scheme.

Hugo's view of kicking off the language summit!
(Photo credit: Hugo van Kemenade)

The goal of Hugo's proposal was to make expectations around versioning, backwards compatibility, and support timelines clearer for Python users.

On the surface, Python's versioning might appear to be Semantic Versioning (SemVer) due to its three-part version and infamous set of backwards incompatible changes known as Python 3. Hugo noted that the publication of Python 1.0.0 (1994) and what would become the Python versioning scheme predate the publication of SemVer (2009) by around 15 years.

The perception of Python using semantic versioning is a source of confusion for users who don't expect backwards incompatible changes when upgrading to new versions of Python. In reality, almost all new feature releases of Python include backwards incompatible changes, such as the removal of "dead batteries", where PEP 594 marked 19 modules for removal in Python 3.13.

Calendar Versioning (CalVer) encompasses a wide array of different versioning schemes that have one property in common: using the release date as part of a release's version. Calendar-based versions vary quite widely, but typically include a two or four digit year (YY or YYYY) and sometimes a month or day (MM and DD).

Using years in versions is quite common amongst other programming languages, operating systems like Ubuntu, and tools like Black, pip, and PyCharm.

Slide from Hugo's presentation showing programming languages using calendar-based versioning like Ada, Algol, C, C++, Fortran, and JavaScript

Since 2019, Python has made releases according to the new yearly cadence from PEP 602. Moving to annual releases made it possible for downstream distributors to rely on when a new Python version appears, which brings newer Python versions to users faster.

Each minor release receives 5 years of security fixes. Using the release year of 2026 as an example, users could add 5 years and know they'll receive security fixes on that minor release until 2031. Figuring out this information from "3.15" in the existing versioning scheme would require another lookup, typically to the release schedule PEP.

If the year were baked into the version, one wouldn't need to see the release schedule to know when support was ending, instead one could add 5 years to the year encoded in the version (e.g. for "3.26", 26 + 5 = 31, therefore security support ends in 2031).
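
As a hypothetical illustration, assuming the proposed 3.YY scheme and the current five-year security support window, the lookup becomes simple arithmetic:

    SECURITY_SUPPORT_YEARS = 5

    def security_support_ends(version: str) -> int:
        """Return the year security fixes end for a hypothetical '3.YY.micro' version."""
        yy = int(version.split(".")[1])  # e.g. "3.26.0" -> 26
        return 2000 + yy + SECURITY_SUPPORT_YEARS

    print(security_support_ends("3.26.0"))  # 2031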

Hugo offered multiple proposed versioning schemes, including:

  • Using the release year as minor version (3.YY.micro, "3.26.0")
  • Using the release year as major version (YY.0.micro, "26.0.0")
  • Using the release year and month as major and minor version (YY.MM.micro, "26.10.0")

There were discussions about other options beyond these amongst attendees.

Thomas Wouters, release manager for 3.12 and 3.13, questioned the value-add for adopting a new versioning system. Thomas noted that while the current system is confusing, changing the system in any way also adds confusion for users. Hugo responded that clarity, especially support for security fix and end-of-life dates, was the biggest motivation.

Barry Warsaw wondered if there was a way to test a potential new versioning scheme ahead of time to find potential problems. Hugo referenced the deadsnakes project, which builds distributions of CPython for Ubuntu. The deadsnakes project previously created a build of Python 3.9 that modified the version to be "3.10" to help discover breakages in projects assuming a single-digit minor version. Hugo also had experience using static code analysis to find other version assumptions in Python projects.
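
A classic example of the kind of assumption such testing can flush out is slicing the version string, which broke silently when minor versions reached two digits:

    import sys

    # Fragile: assumes a single-digit minor version.
    # On Python 3.10 this evaluates to "3.1", not "3.10".
    version = sys.version[:3]

    # Robust: use the structured version information instead.
    version = f"{sys.version_info.major}.{sys.version_info.minor}"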

"Python 3 is a brand at this point, and we should stick to it" said Guido van Rossum after sharing concerns that changes to the major version would break the ecosystem more than changes to the minor version. Others voiced concerns about changing the major version "3" including in the "python3" binary and for packaging such as "abi3" tag.

Carol Willing noted that many projects are relying on Python's versioning system and already have those versions "baked in" to warnings in existing releases. Hugo confirmed this is a problem, including for Python itself, which has a few deprecation warnings and messages that reference future Python versions like 3.15. Hugo's plan would be to update these versions within Python and give plenty of time before the new versioning scheme takes effect.

Donghee Na offered up Rust's use of "yearly editions" in the branding of their releases, where the version number is completely separate from the branding of the release. Hugo was concerned that this would add another layer of confusion and would mostly repeat information already found in the release schedule.

Overall, the proposal to use the current year as the minor version was well-received; Hugo mentioned that he'd be drafting a PEP for this change.

Carl Meyer cautioned against making any changes to the version scheme before 2026 in order to preserve the 3.14 "π"-thon release which received approval and laughter from the room. Sounds like whatever happens we'll get to have our pie and eat it too. 🥧

Categories: FLOSS Project Planets

Talk Python to Me: #466: Pydantic Performance Tips

Fri, 2024-06-14 04:00
You're using Pydantic and it seems pretty straightforward, right? But could you adopt some simple changes to your code that would make it a lot faster and more efficient? Chances are, you'll find a couple of the tips from Sydney Runkle that will do just that. Join us to talk about Pydantic performance tips here on Talk Python.

Episode sponsors

  • Sentry Error Monitoring, Code TALKPYTHON: https://talkpython.fm/sentry
  • Code Comments: https://talkpython.fm/code-comments
  • Talk Python Courses: https://talkpython.fm/training

Links from the show

  • Sydney Runkle: https://www.linkedin.com/in/sydney-runkle-105a35190/
  • Pydantic: https://pydantic.dev/opensource
  • Performance docs: https://docs.pydantic.dev/latest/concepts/performance/
  • Union tips: https://docs.pydantic.dev/latest/concepts/unions/
  • Sydney's presentation slides: https://docs.google.com/presentation/d/183bn9ecIzOOqfxanrESu7rBaKCI70CX0/edit?usp=sharing&ouid=117072411264002710561&rtpof=true&sd=true
  • JSON to Pydantic: https://jsontopydantic.com
  • Samuel talking FastUI: https://talkpython.fm/episodes/show/449/building-uis-in-python-with-fastui
  • CodeFlash: https://www.codeflash.ai
  • Codspeed: https://codspeed.io
  • Watch this episode on YouTube: https://www.youtube.com/watch?v=R8PL1snHgzY
  • Episode transcripts: https://talkpython.fm/episodes/transcript/466/pydantic-performance-tips

Stay in touch with us:

  • Subscribe to us on YouTube: https://talkpython.fm/youtube
  • Follow Talk Python on Mastodon: https://fosstodon.org/web/@talkpython
  • Follow Michael on Mastodon: https://fosstodon.org/web/@mkennedy
Categories: FLOSS Project Planets

Python Morsels: Data structures contain pointers

Thu, 2024-06-13 17:20

Data structures, like variables, contain references to objects, rather than the objects themselves.

Table of contents

  1. Referencing the same object in multiple places
  2. Data structures store references, not objects
  3. Avoid referencing the same mutable object
  4. An ouroboros: A list that contains itself
  5. Python's data structures contain pointers to objects

Referencing the same object in multiple places

Let's point a variable row to a list of three zeroes:

>>> row = [0, 0, 0]

Now let's make a new variable that points to a list-of-lists:

>>> boat = [row, row, row]

We now have a list of three lists, each with three zeroes in it:

>>> boat
[[0, 0, 0], [0, 0, 0], [0, 0, 0]]

What would happen if we look up index 1, and then index 1 again, and change that to the number 1?

>>> boat[1][1] = 1

What do you think might happen? What will change in our lists?

We're looking up the second list, and then the second value in the second list, and assigning the number 1 to that position. So we've asked to change the middle item in the middle list to the number 1.

That's not quite what happens though:

>>> boat
[[0, 1, 0], [0, 1, 0], [0, 1, 0]]

Instead, we changed the middle number in all three of our inner lists.

Why did this happen?

Well...

Data structures store references, not objects

Lists in Python don't actually …
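
A minimal sketch of the underlying behavior, showing that all three rows are the same list object, along with one common fix (a list comprehension that builds three independent rows):

>>> row = [0, 0, 0]
>>> boat = [row, row, row]
>>> boat[0] is boat[1] is boat[2]  # one list object, three references
True
>>> fixed_boat = [[0, 0, 0] for _ in range(3)]  # three separate lists
>>> fixed_boat[1][1] = 1
>>> fixed_boat
[[0, 0, 0], [0, 1, 0], [0, 0, 0]]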

Read the full article: https://www.pythonmorsels.com/data-structures-contain-pointers/
Categories: FLOSS Project Planets

Python Software Foundation: For your consideration: Proposed bylaws changes to improve our membership experience

Thu, 2024-06-13 08:22

This year, as part of our annual election process, the Python Software Foundation Board is offering three bylaws changes for our Members to vote on. These changes are all centered on our membership experience: making it simpler to qualify as a Member for Python-related volunteer work, making it easier to vote, and allowing the Board more options to keep our membership safe and enforce the Code of Conduct.

Voting Members will be asked to vote on these items during the July Board of Directors election. If the majority of voting members vote in favor of any of the changes, those changes will be incorporated into the bylaws and go into immediate effect.

We're sharing these changes with you today as an opportunity to understand why these changes are being proposed, and to give you an opportunity to ask questions of the Board before you vote, either by emailing psf-elections@pyfound.org or membership-wg@pyfound.org, or by responding to the thread on the PSF discussions site.

The text of the changes is available from the following links, all of which show visual representations of additions and deletions to our canonical bylaws repository:

The Board has carefully considered these changes and strongly encourages all Members to vote in favor of them. The rest of this post explains the changes, and why we're putting them to our Voting Members.

Change 1: Merging Contributing and Managing member classes

Since 2017, when we adopted our current membership model, we've had four classes of membership: Supporting (recognition for being a monetary donor), Managing (recognition for volunteer work in the PSF or in community events), Contributing (recognition for volunteer work on open source software), and Fellow (recognition of long-term service to the mission of the PSF).

For almost as long as our membership options have existed, there's been confusion about the distinction between Managing and Contributing members. Both require 5 hours a month of volunteer service, but the distinction between community work and work on software is increasingly out of step with how our community thinks about contributing.

In the future, we want community members to qualify for PSF membership by participating and giving back to the community, either through donating, through volunteer work, or in recognition of long service, without distinguishing between code contributions and non-code contributions.

With this proposed bylaws change, we would merge the Managing and Contributing membership classes. All Managing Members would become Contributing Members, and we would no longer have a Managing Member class. Further, this change would explicitly allow for works of authorship (including documentation, books, or blogs) beyond software to count as volunteer work, provided those works are openly licensed.

We think this would significantly simplify the membership categories, reduce confusion, and make it easier for volunteers who both run events and write code to decide which membership class best applies to them.

Change 2: Simplifying the voter affirmation process by treating past voting activity as intent to continue voting

Our bylaws (section 4.2) require every Member to affirm their intention to vote in an election before they can be issued a ballot. This is intended to ensure that our election reaches a quorum (i.e. one third of ballots issued are actually used to vote), and is therefore valid by our bylaws. Due to technical limitations, we only started requiring this affirmation in practice last year. Since then, we've received feedback that the affirmation process has unintentionally excluded some people who had intended to vote.

It is the Board's intention that we make it as easy as possible to vote in our elections; however, we must balance legal obligations that require us to maintain a quorum in our elections.

This bylaws change would allow us to treat any member who voted in the immediately previous year's election as having affirmed their intention to vote. The Board believes that voting in a PSF election is a good indicator that a member is likely to vote again, and including them in the quorum calculation is unlikely to put our quorum at risk.

While there may be technical limitations to us implementing this change, the change is necessary to allow the Foundation to alter the affirmation procedure at all. It is the Board's intention that this change would be implemented for the 2025 election should the Bylaws change be accepted by the membership during the 2024 election.

Change 3: Allow for removal of Fellows by a Board vote in response to Code of Conduct violations, removing the need for a vote of the membership

Currently PSF Fellows are awarded membership for life, as a reward for exceptional service to the mission of the Foundation. There are deliberately very few ways to remove a Fellow from the membership.

If a Fellow were found to have violated the Foundation's Code of Conduct in a way that warranted termination of their membership, currently the only way to remove them would be to put their removal to a vote of all Voting Members (per section 4.15 of our Bylaws). We believe that requiring a vote of the membership to remove a Code of Conduct violator from our community would subject members of the community --- including people directly impacted by that violator's behavior --- to undue distress.

On the other hand, we believe there is significant legal risk that could arise from Code of Conduct violators known to the PSF using their status as a PSF Fellow to enhance their reputation. In cases where the Foundation needs to act in order to continue being able to serve the Python community effectively, we currently have no choice but to name a known Code of Conduct violator as part of a vote put to the membership.

In practice, this requirement limits our ability to effectively enforce our Code of Conduct. This is a disservice to our community.

This proposed change gives the Board, by a majority vote, the ability to terminate the membership of a Fellow as a consequence of breaching any written policy of the Foundation, specifically including our Code of Conduct. This change would allow the Board to act in cases where there is a clear need for a problematic community member to no longer be affiliated with the Foundation, without further perpetuating the trauma caused by that community member's actions.

Categories: FLOSS Project Planets

The Python Show: 44 - Django with Will Vincent

Wed, 2024-06-12 21:59

In this episode, we welcome Will Vincent to the Python Show Podcast. Will has written several books on Django.

We chatted about all things Django and Python. Specifically, we covered the following topics:

  • Favorite Python packages

  • Podcasts

  • Content creation

  • Book writing

  • Why Django versus another web framework

  • AI and writing

  • and more!

Links
Categories: FLOSS Project Planets

Matt Layman: Boosting AI with Python: Using Click, Jinja2, and GPT Libraries

Wed, 2024-06-12 20:00
In this session, we will explore how to use Python to enhance your AI projects with:
Categories: FLOSS Project Planets

Seth Michael Larson: PyCon US 2024 as Security Developer-in-Residence

Wed, 2024-06-12 20:00

Published 2024-06-13 by Seth Larson

This critical role would not be possible without funding from the Alpha-Omega project. Massive thank-you to Alpha-Omega for investing in the security of the Python ecosystem!

This was my first PyCon US as the Security Developer-in-Residence. I accepted the offer shortly after PyCon US 2023, where Deb Nicholson provided an exciting cliffhanger that the PSF was close to making a hiring decision for the role during the PSF update.

The timing of my hiring coinciding with PyCon US is quite the unique opportunity: it means I get to share all the things we've been able to accomplish together in the past year, all together and in person. It also means I get to share the road ahead and learn from many people I don't get to see on a day-to-day basis, like maintainers of projects large and small, software users at organizations and companies, and users whose work extends beyond software, like scientists and industrial control systems operators.

Overall, I think this yearly cadence couldn't be better; I'm already looking forward to what I'll be able to share at PyCon US 2025 back in Pittsburgh. As I've said in the past, this role wouldn't be effective without the support of the community, so I want to thank you all for the parts you've played in me being successful.

State of Python Supply Chain Security with Alpha Omega

Alpha Omega is a sponsor of the Python Software Foundation, specifically for my role! Alpha Omega's goal is to "improve global software supply chain security by partnering with open source". I presented alongside Alpha Omega's co-founder Michael Winser. The slides for our presentation are available online.

Alpha Omega uses a two-pronged approach, and it's right there in the name: Alpha and Omega:

  • Alpha: Improve the security posture of the most critical projects through staffing.
  • Omega: Automated security analysis, metrics, and remediation for a wider range of projects.

My role is full-time staffing, so it falls in the "Alpha" bucket of funding.


Speaking for Alpha Omega on the state of Python supply chain security

For readers of my blog, lots of the content of the presentation is review, but I went over the following accomplishments for the first year of the role:

  • Python Software Foundation as a CVE Numbering Authority (and how we helped Linux, curl, and others with a guide).
  • Joining Python Security Response Team and working on process
  • Working with Python release managers to move the release process to GitHub Actions as an isolated build environment.
  • Build reproducibility for the Python release process.
  • Software Bill-of-Materials for Python release artifacts.
  • Coordinated cross-ecosystem response to libwebp and xz-utils vulnerabilities.
  • Tons of community work: PEP reviewer, talks, blogs, and more.

The more interesting part for existing readers is my plans for next year. My plan is to partially shift focus: continue working with the Python core developer team, of course, but also start making improvements to the wider community of projects using Python:

  • Enabling Build Provenance and Software Bill-of-Materials for Python packages.
  • Adoption of security features, hardening, and best practices for Python packages.
  • Special focus on Python packaging tools and workflows.

I finished the talk by describing the unique nature of this role as being flexible and how that's a boon for the rapidly changing space of software supply chain security. This role has a fairly wide scope, which means that when things come up (like xz-utils and novel social-engineering techniques), it's within this role's scope to think about how to respond for the Python ecosystem.

“Vuln Together”, an open space on vulnerability management

“Got vulns? Let's talk!” -- I co-hosted this open space with GitHub Security champion and CVE board member Madison Oliver. This open space focused on the soft-side of vulnerability management for open source projects: people!

Managing vulnerabilities for open source projects is a non-trivial and effort-intensive process, because maintainers need to create and publish a security policy, accept private vulnerability reports, know what reporters need, request a CVE ID, know how to estimate severity and write advisory text, and then publish an advisory alongside fixed versions. Phew!


Madison and I pictured holding the open space card for “Vuln Together”

The open space was a forum to discuss difficulties with managing vulnerabilities, answering questions, and pointing folks at resources. We shared recommendations like how maintainers can easily request CVEs from GitHub, enable Private Vulnerability Reporting on GitHub and Confidential Issues on GitLab to make reporting easier, and showed resources like the Guide for Vulnerability Disclosure Process from the OpenSSF Vulnerability Disclosures working group.


The open space was well attended!

Language Summit discussion on CPython security

My bingo space for "xz" was almost immediately filled at PyCon US when I attended the Python Language Summit as the blogger. Release manager and core developer Pablo Galindo Salgado spoke about the Python contribution security model in "the wake of xz-utils backdoor".

The complete language summit blog posts are coming soon to the Python Software Foundation blog (I should know, I authored them!), so if you're interested in this topic you can stay tuned over there. I won't spoil any of the contents of the blog posts here, but it was a great discussion all around. Lots of interesting ideas and thoughts from core developers that'll help with figuring out process improvements that work for CPython.

Meeting all the people

Mike Fiedler and I were on the stage briefly after Kate Chapman's keynote for a "Meet the Python Software Foundation Security Engineers" segment, where we went over our plans for the upcoming year and recommended folks follow the PSF and PyPI blogs for future updates.

Of course, I got to talk to so many people, too many to name individually. I chatted with the upcoming Release Manager for CPython, Hugo van Kemenade, on some ideas to further improve the CPython release process, specifically around signatures. I chatted with folks from specific sub-ecosystems like Jannis Leidel from Conda and Jazzband, David Lord from Flask, and Jarek Potiuk from Airflow.

Also, I handed out stickers! I went through over a hundred; it brought me a lot of joy seeing how much folks liked the derpy snake knight design.

... and beyond!

PyCon US 2024 is the start of conferences this year for me. Shortly after PyCon US it was announced that I will be keynoting PyCon Taiwan in September and also speaking at All Things Open 2024 in Raleigh, North Carolina. If you're attending either of these events get in contact with me (and I promise to bring stickers).

That's all for this post! 👋 If you're interested in more you can read the previous report.

Thanks for reading! ♡ Did you find this article helpful and want more content like it? Get notified of new posts by subscribing to the RSS feed or the email newsletter.

This work is licensed under CC BY-SA 4.0

Categories: FLOSS Project Planets

Real Python: Python Mappings: A Comprehensive Guide

Wed, 2024-06-12 10:00

One of the main data structures you learn about early in your Python learning journey is the dictionary. Dictionaries are the most common and well-known of Python’s mappings. However, there are other mappings in Python’s standard library and third-party modules. Mappings share common characteristics, and understanding these shared traits will help you use them more effectively.

In this tutorial, you’ll learn about:

  • Basic characteristics of a mapping
  • Operations that are common to most mappings
  • Abstract base classes Mapping and MutableMapping
  • User-defined mutable and immutable mappings and how to create them

This tutorial assumes that you’re familiar with Python’s built-in data types, especially dictionaries, and with the basics of object-oriented programming.

Get Your Code: Click here to download the free sample code that you’ll use to learn about mappings in Python.

Take the Quiz: Test your knowledge with our interactive “Python Mappings” quiz. You’ll receive a score upon completion to help you track your learning progress:

Interactive Quiz

Python Mappings

In this quiz, you'll test your understanding of the basic characteristics and operations of Python mappings. By working through this quiz, you'll revisit the key concepts and techniques of creating a custom mapping.

Understanding the Main Characteristics of Python Mappings

A mapping is a collection that allows you to look up a key and retrieve its value. The keys in mappings can be objects of a broad range of types. However, in most mappings, there are object types that can’t be used as keys, as you’ll learn later in this tutorial.

The previous paragraph described mappings as collections. A collection is an iterable container that has a defined size. However, mappings also have additional features. You’ll explore each of these mapping characteristics with examples from Python’s main mapping types.

The feature that’s most characteristic of mappings is the ability to retrieve a value using a key. You can use a dictionary to demonstrate this operation:

Python

>>> points = {
...     "Denise": 3,
...     "Igor": 2,
...     "Sarah": 3,
...     "Trevor": 1,
... }
>>> points["Sarah"]
3
>>> points["Matt"]
Traceback (most recent call last):
  ...
KeyError: 'Matt'

The dictionary points contains four items, each with a key and a value. You can use the key within the square brackets to fetch the value associated with that key. However, if the key doesn’t exist in the dictionary, the code raises a KeyError.

You can use one of the mappings in the standard-library collections module to assign a default value for keys that aren’t present in the collection. The defaultdict type includes a callable that’s called each time you try to access a key that doesn’t exist. If you want the default value to be zero, you can use a lambda function that returns 0 as the first argument in defaultdict:

Python

>>> from collections import defaultdict
>>> points_default = defaultdict(
...     lambda: 0,
...     points,
... )
>>> points_default
defaultdict(<function <lambda> at 0x104a95da0>, {'Denise': 3, 'Igor': 2, 'Sarah': 3, 'Trevor': 1})
>>> points_default["Sarah"]
3
>>> points_default["Matt"]
0
>>> points_default
defaultdict(<function <lambda> at 0x103e6c700>, {'Denise': 3, 'Igor': 2, 'Sarah': 3, 'Trevor': 1, 'Matt': 0})

The defaultdict constructor has two arguments in this example. The first argument is the callable that’s used when a default value is needed. The second argument is the dictionary you created earlier. You can use any valid argument when you call dict() as the second argument in defaultdict() or omit this argument to create an empty defaultdict.

When you access a key that’s missing from the dictionary, the key is added, and the default value is assigned to it. You can also create the same points_default object using the callable int as the first argument since calling int() with no arguments returns 0.
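
That alternative is a one-liner; since calling int() returns 0, missing keys default to zero just the same:

>>> points_default = defaultdict(int, points)
>>> points_default["Matt"]
0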

All mappings are also collections, which means they’re iterable containers with a defined length. You can explore these characteristics with another mapping in Python’s standard library, collections.Counter:

Python

>>> from collections import Counter
>>> letters = Counter("learning python")
>>> letters
Counter({'n': 3, 'l': 1, 'e': 1, 'a': 1, 'r': 1, 'i': 1, 'g': 1, ' ': 1, 'p': 1, 'y': 1, 't': 1, 'h': 1, 'o': 1})

The letters in the string "learning python" are converted into keys in Counter, and the number of occurrences of each letter is used as the value corresponding to each key.

You can confirm that this mapping is iterable, has a defined length, and is a container:

Python

>>> for letter in letters:
...     print(letter)
...
l
e
a
r
n
i
g

p
y
t
h
o
>>> len(letters)
13
>>> "n" in letters
True
>>> "x" in letters
False

You can use the Counter object letters in a for loop, which confirms it’s iterable. All mappings are iterable. However, the iteration loops through the keys and not the values. You’ll see how to iterate through the values or through both keys and values later in this tutorial.
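
As a quick preview of that later section, mappings expose .values() and .items() views for those other iteration patterns:

>>> list(letters.values())[:5]
[1, 1, 1, 1, 3]
>>> list(letters.items())[:2]
[('l', 1), ('e', 1)]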

The built-in len() function returns the number of items in the mapping. This is equal to the number of unique characters in the original string, including the space character. The object is sized since len() returns a value.

You can use the in keyword to confirm which elements are in the mapping. This check alone isn’t sufficient to confirm that the mapping is a container. However, you can also access the object’s .__contains__() special method directly:

Python

>>> letters.__contains__("n")
True

Read the full article at https://realpython.com/python-mappings/ »

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

Real Python: Quiz: Python Mappings

Wed, 2024-06-12 08:00

In this quiz, you’ll test your understanding of the basic characteristics and operations of Python mappings. By working through this quiz, you’ll revisit the key concepts and techniques of creating a custom mapping.

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

Kay Hayen: Nuitka Release 2.3

Tue, 2024-06-11 18:00

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler, “download now”.

This release bumps the long-awaited 3.12 support to a complete level. Now, Nuitka behaves identically to CPython 3.12 for the most part.

In terms of bug fixes, it's also huge. Especially for Unicode paths, for software with Unicode extension module names and Unicode program names, and even for non-UTF8 code names, there have been massive improvements.

Table of Contents

Bug Fixes
  • Standalone: Added support for python-magic-bin package. Fixed in 2.2.1 already.

  • Fix: The cache directory creation could fail when multiple compilations started simultaneously. Fixed in 2.2.1 already.

  • macOS: For arm64 builds, DLLs can also have an architecture dependent suffix; check that as well. Makes the soundfile dependency scan work. Fixed in 2.2.1 already.

  • Fix: Modules where lazy loaders handling adds hard imports when a module is first processed did not affect the current module, potentially causing it not to resolve hidden imports. Fixed in 2.2.1 already.

  • macOS: The use of libomp in numba needs to cause the extension module not to be included and not to look elsewhere. Fixed in 2.2.1 already.

  • Python3.6+: Fix, added support for keyword arguments of ModuleNotFoundError. Fixed in 2.2.1 already.

  • macOS: Detect more versioned DLLs and arm64 specific filenames. Fixed in 2.2.1 already.

  • Fix, was not annotating exception exit when converting an import to a hard submodule import. Fixed in 2.2.2 already.

  • Fix, branches that became empty can still have traces that need to be merged.

    Otherwise, usages outside the branch will not see propagated assignment statements. As a result, these falsely became unassigned instead. Fixed in 2.2.2 already.

  • Windows: Fix, uninstalled self-compiled Python didn’t have proper installation prefix added for DLL scan, resulting in runtime DLLs not picked up from there. Fixed in 2.2.2 already.

  • Standalone: Added support for newer PySide6 version 6.7. It needed correction on macOS and has a new data file type. Fixed in 2.2.3 already.

  • Standalone: Complete support for pyocd package. Fixed in 2.2.3 already.

  • Module: Fix, the created .pyi files were incomplete.

    The list of imported modules created in the finalization step was incomplete; we now go over the modules that were actually done and mark all non-included modules as dependencies.

  • Scons: Fix, need to avoid using Unicode paths towards the linker on Windows. Instead, use a temporary output filename and rename it to the actual filename after Scons has completed.

  • Windows: Avoid passing Unicode paths to the dependency walker on Windows, as it cannot handle those. Also, the temporary filenames in the build folder must use short paths, as it cannot handle those either in case the build folder is a Unicode path.

  • Scons: For ccache on Windows, the log filename must be a short path too, if the build folder is a Unicode path.

  • Windows: Make sure the Scons build executes inside a short path as well, so that a potential Unicode path is visible to the C compiler when resolving the current directory.

  • Windows: The encoding of Unicode paths for accelerated mode values of __file__ was not making sure that hex sequences were correctly terminated, so in some cases, it produced ambiguous C literals.

  • Windows: Execute binaries created with --windows-uac-admin and the --run option with a proper UAC prompt.

  • Fix, need to allow for non-UTF8 Unicode in variable names, function names, class names, and method names.

  • Python3.10+: Fix, match statements that captured the rest of mapping checks were not working yet.

    match value:
        case {"key1": 5, **rest}:
            ...  # rest was not assigned here
  • Windows: When deleting build folders, make sure the retries always lead to a complete deletion.

  • Python2: Fix, could crash with non-unicode program paths on Windows.

  • Avoid giving SyntaxWarning from reading source code

    For example, the standard site module of Python 3.12 gives warnings about illegal escape sequences that nobody cares about apparently.

  • Fix, the matplotlib warnings by options-nanny were still given even if the no-qt plugin was used, since the variable name referenced there was not actually set yet by that plugin.

  • Windows: Fix, when using the uninstalled self-compiled Python, we need python.exe to find DLL dependencies. Otherwise it doesn’t locate the MSVC runtime and Python DLL properly.

  • Standalone: Added support for freetype package.

New Features
  • Support for Python 3.12 is finally there. We focused on scalability first, and we did things the correct way immediately, rather than rushing to get it working and improving only later.

    As a result, the correctness and performance of Nuitka with previous Python releases are improved as well.

    Some things got delayed, though. We need to do more work to take advantage of other core changes. Concerning exceptions normalized at creation time, the created module code doesn’t yet take advantage. Also, more efficient two-digit long handling is possible with Python 3.12, but not implemented. It will take more time before we have these changes completed.

  • Experimental support for Python 3.13 beta 1 is also there, and potentially surprising, but we will try and follow its release cycle closely and aim to support it at the time of release.

    Nuitka has followed all of its core changes so far, and basic tests are passing; the accelerated, module, standalone, and onefile modes all work as expected. The only thing delayed is the uncompiled generator integration, where we need to replicate the exact CPython behavior. We need to have perfect integration only for working with the asyncio loop, so we wait with it until release candidates appear.

  • Plugins: Added support to include directories entirely unchanged by adding raw_dir values for data-files section, see Nuitka Package Configuration.

  • UI: The new command line option --include-raw-dir was added to allow including directories entirely unchanged.

  • Module: Added support for creating modules with Unicode names. Needs a different DLL entry function name and to make use of two-phase initialization for the created extension module.

  • Added support for OpenBSD standalone mode.

Optimization
  • Python3: Avoid API calls for allocators

    This is most effective with Python 3.11 or higher, but many other types, like bytes, dict keys, float, and list objects, are also faster to create with all Python3 versions.

  • Python3.5+: Directly use the Python allocator functions for object creation, avoiding the DLL API calls. The coverage is complete with Python3.11 or higher, but many object types like float, dict, list, bytes benefit even before that version.

  • Python3: Faster creation of StopIteration objects.

    With Python 3.12, the object is created directly and set as the current exception without normalization checks.

    We also added a new specialized function to create the exception object and populate it directly, avoiding the overhead of calling of the StopIteration type.

  • Python3.10+: When accessing freelists, we were not passing tstate but locally getting the interpreter object, which can be slower by a few percent in some configurations. We now use the free lists more efficiently with tuple, list, and dict objects.

  • Python3.8+: Call uncompiled functions via vector calls.

    We avoid an API call that ends up being slower than using the same function via the vector call directly.

  • Python3.4+: Avoid using _PyObject_LengthHint API calls in list.extend and have our variant that is faster to call.

  • Added specialization for os.path.normpath. We might benefit from compile time analysis of it once we want to detect file accesses.

  • Avoid using module constants accessor for global constant values

    For example, with (), we used the module-level accessor for no reason, as it is already available as a global value. As a result, constant blobs shrink, and the compiled code becomes slightly smaller, too.

  • Anti-Bloat: Avoid using dask from the sparse module. Added in 2.2.2 already.

Organizational
  • UI: Major change in console handling.

    Compiled programs on Windows now have a third mode, besides console and non-console. You can now create GUI applications that attach to an available console and output there.

    The new option --console controls this: the force value enforces a console, the disable value disables it, and the attach value activates the new attach behavior.

    Note

    Redirection of outputs to a file in attach mode only works if it is launched correctly, for example, interactively in a shell, but some forms of invocation will not work; prominently, subprocess.call without inheritable outputs will still output to a terminal.

    On macOS, the distinction doesn't exist anymore (technically, it hadn't been valid for a while already). You need to use bundles for non-console applications, though; otherwise, a console is forced by macOS itself by default.

  • Detect patchelf usage in buggy version 0.18.0 and ask the user to upgrade or downgrade it, as this specific version is known to be broken.

  • UI: Make clear that the --nofollow-import-to option accepts patterns.

  • UI: Added a warning for module mode when the options to force outputs are used, as they don't have any effect.

  • UI: Check the success of Scons in creating the expected binary immediately after running it and not only once we reach post-processing.

  • UI: Detect empty user package configuration files

  • UI: Do not output module ast when a plugin reports an error for the module, for example, a forbidden import.

  • Actions: Update from deprecated action versions to the latest versions.

Tests
  • Use Nuitka Project Options for the user plugin test rather than passing by environment variables to the test runner.

  • Added a new search mode, skip, to complement resume; it picks up right
    after the last test that resume stopped on. We can use that while support
    for a Python version is not complete.

Cleanups
  • Solved a TODO about using unified code for setting StopIteration; coroutines, generators, and asyncgen used to be different.

  • Unified how the binary result filename is passed to Scons for modules and executables to use the same result_exe key.

Summary

This release marks a huge step in catching up with compatibility of Python. After being late with 3.12 support, we will now be early with 3.13 support if all goes well.

The many Unicode support related changes also enhanced Nuitka to generate two-phase loading extension modules, which will also be needed for sub-interpreter support later on.

From here on, we need to re-visit compatibility. A few more obscure 3.10 features are missing, the 3.11 compatibility is not yet complete, and we need to take advantage of the new caching possibilities to enhance performance, for example with attribute lookups, to bring it to where it can be with the core changes there.

For the coming releases until 3.13 is released, we hope to focus on scalability a lot more, get a much-needed big improvement there, and complete these other tasks on the side.

Categories: FLOSS Project Planets

Brett Cannon: Saying thanks to open source maintainers

Tue, 2024-06-11 17:29

After signing up for GitHub Sponsors, I had a nagging feeling that somehow asking for money from other people to support my open source work was inappropriate. But after much reflection, I realized that phrasing the use of GitHub Sponsors as a way to express patronage/support and appreciation for my work instead of sponsorship stopped me feeling bad about it. It also led me to reflect on to what degree people can express thanks to open source maintainers.

⚠️This blog post is entirely from my personal perspective and thus will not necessarily apply to every open source developer out there.Be nice

The absolutely easiest way to show thanks is to simply not be mean. It sounds simple, but plenty of people fail at even this basic level of civility. This isn't to say you can't say that a project didn't work for you or you disagree with something, but there's a massive difference between saying "I tried the project and it didn't meet my needs" and "this project is trash".

People failing to support this basic level of civility is what leads to burnout.

Be an advocate

It's rather indirect, but saying nice things about a project is a way of showing thanks. As an example, I have seen various people talk positively about pyproject.toml online, but not directly at me. That still feels nice due to how much effort I put into helping make that file exist and creating the [project] table.

Or put another way, you never know who is reading your public communications.

Produce your own open source

Another indirect way to show thanks is by sharing your own open source code. By maintaining your own code, you'll increase the likelihood I myself will become a user of your project. That then becomes a circuitous cycle of open source support between us.

Say thanks

Directly saying "thank you" actually goes a really long way. It takes a lot of positive interactions to counteract a single negative interaction. You might be surprised how much it might brighten someone's day when someone takes the time and effort to reach out and say "thank you", whether that's by DM, email, in-person at a conference, etc.

Fiscal support

As I said in the opening of this post, I set up GitHub Sponsors for myself as a way for people to show fiscal support for my open source work if that's how they prefer to express their thanks (including businesses). Now I'm purposefully not saying "sponsor" as to me that implies that giving money leads to some benefit (e.g. getting a shout-out somewhere) which is totally reasonable for people to do. But for me, since every commit is a gift, I'm financially secure, and I'm not trying to make a living from my volunteer open source work or put in the effort to make sponsorship worth it, I have chosen to treat fiscal support as a way of showing reciprocity for the gift of sharing my code that you've already received. This means I fully support all open source maintainers setting up fiscal support at a minimum, and if they want to put in the effort to go the sponsorship route then they definitely should.

Producing open source also isn't financially free. For instance, I pay for:

  1. The hosting of this blog via Ghost(Pro)
  2. Obsidian Sync to keep my open source notes available on all my devices so when I have an idea I can write it down
  3. Obsidian Publish to share my open source notes
  4. Computer upgrades (including ergonomic upgrades like keyboards)
  5. My personal time away from my wife and child, family and friends (which my open source journal exists to try and point out for those who don't realize how much time I put into my volunteer work)

So while open source is "free" for you as the consumer, the producer very likely has concrete financial costs in producing that open source on top of the intangible costs like volunteering their personal time.

But as I listed earlier, there are plenty of other ways to show thanks without having to spend money that can be equally valuable to a maintainer.

I also specifically didn't mention contributing. I have said before that contributions are like giving someone a puppy: it seems like a lovely gift at the time, but the recipient is now being "gifted" daily walks involving scooping 💩 and vet bills. As such, contributions from others can be a blessing and a curse all at the same time depending on the contribution itself, the attitude of the person making the contribution, etc. So I wouldn't always assume my contribution is as welcomed and desired as a "thank you" note.

Categories: FLOSS Project Planets

PyCoder’s Weekly: Issue #633 (June 11, 2024)

Tue, 2024-06-11 15:30

#633 – JUNE 11, 2024
View in Browser »

String Interpolation in Python: Exploring Available Tools

In this tutorial, you’ll learn about the different tools that Python provides for performing string interpolation. String interpolation allows you to create new strings by inserting different objects into a string template.
REAL PYTHON

Notebooks for Fundamentals of Music Processing

This is a collection of Python Notebooks for teaching and learning the fundamentals of music processing. Examples include illustrations, sound samples, math, and more.
INTERNATIONAL AUDIO LABS

Upgrade Python Versions Without the Pain

Stop wasting 30% of your team’s sprint on maintaining legacy codebases. Automatically migrate and keep up-to-date on Python versions, so that you can focus on being productive while staying secure, without the risk of breaking changes - Get a code assessment today →
ACTIVESTATE sponsor

Python’s Many Command-Line Utilities

This article describes every command-line tool included with Python, each of which can be run with python -m module_name.
TREY HUNNER
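
A few well-known examples of such tools, runnable from any standard CPython install (the arguments shown are illustrative):

    $ python -m http.server 8000     # serve the current directory over HTTP
    $ python -m json.tool data.json  # pretty-print a JSON file
    $ python -m venv .venv           # create a virtual environment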

String Interpolation in Python (Quiz)

Take this quiz to test your understanding of the available tools for string interpolation in Python, as well as their strengths and weaknesses. These tools include f-strings, the .format() method, and the modulo operator.
REAL PYTHON

Python 3.12.4 Released

See the full list of changes in this release.
CPYTHON DEV BLOG

PEP 712 Rejected

This Python Enhancement Proposal, “Adding a ‘converter’ parameter to dataclasses.field”, was rejected for having an insufficient number of use cases.
PYTHON

Python 3.13.0 Beta 2 Released

CPYTHON DEV BLOG

Articles & Tutorials

What Are CRUD Operations?

CRUD operations are the cornerstone of application functionality. Whether you access a database or interact with a REST API, you usually want to create, retrieve, update, and delete data. In this tutorial, you’ll explore how CRUD operations work in practice.
REAL PYTHON
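
As a minimal sketch of the four operations, here they are against an in-memory SQLite database from the standard library (the table and data are made up for illustration):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE birds (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO birds (name) VALUES (?)", ("swallow",))     # Create
    rows = conn.execute("SELECT name FROM birds").fetchall()              # Retrieve
    conn.execute("UPDATE birds SET name = ? WHERE id = ?", ("swift", 1))  # Update
    conn.execute("DELETE FROM birds WHERE id = ?", (1,))                  # Delete
    conn.close()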

What We Talk About When We Talk About System Design

Mahesh talks about the rules he has encountered when doing research on designing large systems. Guidelines include late-binding on the design, focusing on the problem rather than existing systems, talking about other applications, and more.
MAHESH BALAKRISHNAN

Get Your Own AI Agent to Answer Questions From Your Database

Introducing “Database Mind” - a ready-to-use AI system designed for easy integration into your projects. As part of the “Minds Endpoints” AI platform, it offers a simple plug-and-play API service, enabling developers to effortlessly incorporate advanced AI capabilities into their solutions →
MINDSDB sponsor

Statically Typed Functional Programming With Python 3.12

This detailed article looks at how to use the match statement along with Python’s typing mechanism to write functional programs similar in style to Kotlin.
OSKAR WICKSTROM
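
The core idea can be sketched in a few lines: model a "sum type" as a union of dataclasses, then branch on it exhaustively with match (this sketch is illustrative and not taken from the article):

    import math
    from dataclasses import dataclass

    @dataclass
    class Circle:
        radius: float

    @dataclass
    class Rect:
        width: float
        height: float

    Shape = Circle | Rect  # a union standing in for an algebraic sum type

    def area(shape: Shape) -> float:
        match shape:
            case Circle(radius=r):
                return math.pi * r**2
            case Rect(width=w, height=h):
                return w * h

    print(area(Rect(3.0, 4.0)))  # 12.0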

How to Annotate a Graph With Matplotlib and Python

The Matplotlib package is great for visualizing data. One of its many features is the ability to annotate points on your graph. This article shows you how.
MIKE DRISCOLL
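
The basic pattern is a single annotate() call that pairs a label with a data point and, optionally, an arrow (the data and label here are illustrative):

    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot([1, 2, 3, 4], [1, 4, 9, 16])
    ax.annotate("fastest growth", xy=(4, 16), xytext=(1.5, 13),
                arrowprops=dict(arrowstyle="->"))
    plt.show()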

bytes: The Lesser-Known Python Built-in Sequence

The bytes data type looks a bit like a string, but it isn’t a string. This article explores it and also looks at the main Unicode encoding, UTF-8.
STEPHEN GRUPPETTA
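
The difference shows up as soon as you encode: a str holds text, while bytes holds raw integers, and a non-ASCII character may take more than one byte in UTF-8 (a quick illustration):

    text = "café"
    data = text.encode("utf-8")  # str -> bytes

    print(data)                  # b'caf\xc3\xa9' -- the é is two bytes in UTF-8
    print(data[-1])              # 169 -- indexing bytes yields an int, not a str
    print(data.decode("utf-8"))  # café -- back to a str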

Reflecting on One Year of Being an Engineering Manager

“Being a manager is a focus change from code to people, from output to outcomes and from being productive to making most of everyone’s time.” Read more of Victor’s reflections on his first year as a manager.
VICTOR STOJANOV

Testing With Python: Fake It

This article is on using mock in your Python testing and is part of a larger series on testing in general.
BITECODE
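
The gist of faking a dependency with the standard library's unittest.mock looks something like this (the function and client here are invented for illustration):

    from unittest import mock

    def fetch_greeting(client):
        return client.get("/greeting").upper()

    fake_client = mock.Mock()
    fake_client.get.return_value = "hello"

    assert fetch_greeting(fake_client) == "HELLO"
    fake_client.get.assert_called_once_with("/greeting")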

Projects & Code

Mesop: Build Web Apps in Python

GITHUB.COM/GOOGLE

WeasyPrint: The Awesome Document Factory

GITHUB.COM/KOZEA

django-axes: Keep Track of Failed Login Attempts in Django

GITHUB.COM/JAZZBAND

Zango: Microservices in Django

GITHUB.COM/HEALTHLANE-TECHNOLOGIES

gloe: Library for Flow-Oriented Code

GITHUB.COM/IDEOS

Events

Weekly Real Python Office Hours Q&A (Virtual)

June 12, 2024
REALPYTHON.COM

Wagtail Space NL

June 12 to June 15, 2024
WAGTAIL.SPACE

Django Girls Abraka Workshop 2024

June 13 to June 15, 2024
DJANGOGIRLS.ORG

Python Atlanta

June 13 to June 14, 2024
MEETUP.COM

PyData London 2024

June 14 to June 17, 2024
PYDATA.ORG

PyCamp Leipzig 2024

June 15 to June 17, 2024
BARCAMPS.EU

Wagtail Space US

June 20 to June 23, 2024
WAGTAIL.SPACE

Happy Pythoning!
This was PyCoder’s Weekly Issue #633.
View in Browser »

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

Categories: FLOSS Project Planets
