Planet Python

Planet Python - http://planetpython.org/

Real Python: Python Protocols: Leveraging Structural Subtyping

Wed, 2024-07-17 10:00

In Python, a protocol specifies the methods and attributes that a class must implement to be considered of a given type. Protocols are important in Python’s type hint system, which allows for static type checking through external tools, such as mypy, Pyright, and Pyre.

Before there were protocols, these tools could only check for nominal subtyping based on inheritance. There was no way to check for structural subtyping, which relies on the internal structure of classes. This limitation affected Python’s duck typing system, which allows you to use objects without considering their nominal types. Protocols overcome this limitation, making static duck typing possible.

In this tutorial, you’ll:

  • Gain clarity around the use of the term protocol in Python
  • Learn how type hints facilitate static type checking
  • Learn how protocols allow static duck typing
  • Create custom protocols with the Protocol class
  • Understand the differences between protocols and abstract base classes

To get the most out of this tutorial, you’ll need to know the basics of object-oriented programming in Python, including concepts such as classes and inheritance. You should also know about type checking and duck typing in Python.


The Meaning of “Protocol” in Python

During Python’s evolution, the term protocol became overloaded with two subtly different meanings. The first meaning refers to internal protocols, such as the iterator, context manager, and descriptor protocols.

These protocols are widely understood in the community and consist of special methods that make up a given protocol. For example, the .__iter__() and .__next__() methods define the iterator protocol.
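As a minimal sketch of the iterator protocol just described, the hypothetical Countdown class below (a name made up for illustration, not part of the original tutorial) implements both special methods so that Python's for loops and list() can consume it:

```python
class Countdown:
    """Hypothetical iterator that counts down from `start` to 1."""

    def __init__(self, start: int) -> None:
        self.current = start

    def __iter__(self):
        # An iterator returns itself from .__iter__().
        return self

    def __next__(self) -> int:
        # Raising StopIteration tells the consumer there is nothing left.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


print(list(Countdown(3)))  # [3, 2, 1]
```

Nothing here inherits from a special base class. Having the two methods is what makes the object an iterator.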

Python 3.8 introduced a second, slightly different type of protocol. These protocols specify the methods and attributes that a class must implement to be considered of a given type. So, these protocols also have to do with a class’s internal structure.

With this kind of protocol, you can define interchangeable classes as long as they share a common internal structure. This feature allows you to enforce a relationship between types or classes without the burden of inheritance. This relationship is known as structural subtyping or static duck typing.
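A small sketch of this idea, assuming a made-up Greeter protocol and two unrelated classes (all names here are illustrative, not taken from the tutorial): neither class inherits from Greeter, yet a static type checker should accept both wherever a Greeter is expected, because they share the required structure.

```python
from typing import Protocol


class Greeter(Protocol):
    """Hypothetical protocol: anything with a matching .greet() method."""

    def greet(self) -> str: ...


class English:
    def greet(self) -> str:
        return "Hello!"


class Spanish:
    def greet(self) -> str:
        return "¡Hola!"


def welcome(greeter: Greeter) -> str:
    # The annotation asks for structure, not for a particular base class.
    return greeter.greet()


print(welcome(English()))
print(welcome(Spanish()))
```

At runtime this is ordinary duck typing. The protocol only adds information that tools such as mypy can check before the code runs.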

In this tutorial, you’ll focus on this second meaning of the term protocol. First, you’ll have a look at how Python manages types.

Dynamic and Static Typing in Python

Python is a dynamically typed language, which means that the Python interpreter checks an object’s type when the code runs. It also means that while a variable can only reference one object at a time, the type of that object can change during the variable’s lifetime.

For example, you can have a variable that starts as a string and changes into an integer number:

    >>> value = "One hundred"
    >>> value
    'One hundred'
    >>> value = 100
    >>> value
    100

In this example, you have a variable that starts as a string. Later in your code, you change the variable’s value to an integer.

Because of its dynamic nature, Python has embraced a flexible typing system that’s known as duck typing.

Duck Typing

Duck typing is a type system in which an object is considered compatible with a given type if it has all the methods and attributes that the type requires. This typing system supports the ability to use objects of independent and decoupled classes in a specific context as long as they adhere to some common interface.

Note: To dive deeper into duck typing, check out the Duck Typing in Python: Writing Flexible and Decoupled Code tutorial.

As an example of duck typing, you can consider built-in container data types, such as lists, tuples, strings, dictionaries, and sets. All of these data types support iteration:

    >>> numbers = [1, 2, 3]
    >>> person = ("Jane", 25, "Python Dev")
    >>> letters = "abc"
    >>> ordinals = {"one": "first", "two": "second", "three": "third"}
    >>> even_digits = {2, 4, 6, 8}
    >>> containers = [numbers, person, letters, ordinals, even_digits]
    >>> for container in containers:
    ...     for element in container:
    ...         print(element, end=" ")
    ...     print()
    ...
    1 2 3
    Jane 25 Python Dev
    a b c
    one two three
    8 2 4 6

In this code snippet, you define a few variables using different built-in types. Then, you start a for loop over the collections and iterate over each of them to print their elements to the screen. Even though the built-in types are significantly different from one another, they all support iteration.

The duck typing system allows you to create code that can work with different objects, provided that they share a common interface. This system allows you to set relationships between classes that don’t rely on inheritance, which produces flexible and decoupled code.
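The same idea applies to classes you write yourself. In the sketch below, the ListWriter class and the log() function are hypothetical names introduced only for illustration. The function works with any object that offers a .write() method, whether it comes from the standard library or from your own code:

```python
import io


class ListWriter:
    """Hypothetical class that collects written text in a list."""

    def __init__(self) -> None:
        self.lines = []

    def write(self, text: str) -> int:
        self.lines.append(text)
        return len(text)


def log(message: str, stream) -> None:
    # Only the presence of .write() matters, not the object's class.
    stream.write(message + "\n")


buffer = io.StringIO()
collector = ListWriter()
for target in (buffer, collector):
    log("duck typing at work", target)

print(buffer.getvalue(), end="")
print(collector.lines)
```

The two targets share no base class beyond object, and log() never checks their types.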

Read the full article at https://realpython.com/python-protocol/ »

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

Real Python: Quiz: How Do You Choose Python Function Names?

Wed, 2024-07-17 08:00

In this quiz, you’ll test your understanding of how to choose Python function names.

By working through this quiz, you’ll revisit the rules and conventions for naming Python functions and why they’re important for writing Pythonic code.

Choosing the ideal Python function names makes your code more readable and easier to maintain. Code with well-chosen names can also be less prone to bugs.


Anwesha Das: Looking back to Euro Python 2024

Wed, 2024-07-17 07:42

Over the years, whenever I am low, I go back to the 2014 Euro Python talk "Farewell and Welcome Home: Python in Two Genders" by Naomi. It has become the first step of my coping mechanism and the door to my safe house. Though 2024 marked my Euro Python journey in person, I have long had a connection with and respect for the conference. It is a conference that believes community matters, that human values and feelings matter, and that is not afraid to walk the talk. And the conference lived up to my expectations in every way.

My Talk: Intellectual Property Law 101

I gave my talk on Intellectual Property Law on the first day. It had been a long time since I last gave a talk on a legal topic. This talk was dedicated to developers, so I concentrated only on the issues that concern them. I tried to stitch the relevant topics (patents, trademarks, and copyright) together for a smooth flow, since that makes it easier for developers to understand and remember them for practical future use. I was concerned about whether I would be able to connect with people. Later, people came to me with several related questions, such as:

  • Why should I be concerned about patents?

  • Which license would fit my project?

  • Should I be scared about any Trademarks granted to other organizations under some other jurisdiction?

So on and so forth. Though I could not finish the whole talk due to time constraints, I am happy with the overall review.

Panel: Open Source Sustainability

On Day 1 of the main conference, we had the panel on Open Source Sustainability. This topic lies at the core of the open-source ecosystem: the sustainability of projects and the community, for their future and stability. The panel had Deb Nicholson, Armin Ronacher, Çağıl Uluşahin Sönmez, Samuel Colvin, and me, with Artur Czepiel as the moderator. I was happy to represent my community's side. It was a good discussion, and hopefully, we could give answers to some questions of the community in general.

Birds of a Feather session: Open Source Release Management

This Birds of a Feather (BoF) session was intended to deal with the release management of various open source projects, irrespective of their size. The discussion included all projects, from community-led projects to projects maintained or initiated by big enterprises, and from a project maintained by one contributor to a project with several hundred contributors.

  • What methods do we follow regarding versioning, release cadence, and the process?

  • Do most of us follow manual processes or depend on automated ones?

  • What works and what does not, and how can we improve our lives?

  • What are the significant points that make the difference?

We covered the following topics: different aspects of release management of open source projects, security, automation, CI usage, and documentation. We followed the Chatham House Rule during the discussion to provide the space for open, frank, and collaborative conversation.

PyLadies Lunch

And then comes my favorite part of the conference: PyLadies Lunch. It was my seventh PyLadies lunch, and I was moderating it for the fifth time. But this time, my wonderful friends Laís and Çağıl were by my side, holding me up when I failed. I love every time I am at a PyLadies lunch. This is where I get my strength, energy, and love.

Workshop

I attended two workshops organized by Anezka Muller, Mia Bajić, and all the amazing PyLadies organizers:

  • Self-defense workshop where the moderators helped us navigate challenging situations we face in life, safeguard ourselves from them, and overcome them.

  • I AM Remarkable workshop, where we learned to tell people about our successes.

Representing Ansible Community

I always take the chance to meet the Ansible community members face-to-face. Euro Python gave me another opportunity to do that. I learned about different user stories that we do not get to hear from our work corners, and I learned about these unique problems and their solutions in Ansible.
Fun fact: Maarten gave a review after learning that I am Anwesha from the Ansible project. He said, 'Can you Ansible people slow down in releasing new versions of Ansible? Every time we get used to it, we have a new version.'

Acknowledging mental health issues

The proudest moment for me personally was when I acknowledged my mental health issues and later when people came to me saying how they relate to me and how they felt empowered when I mentioned this.

PyLadies network at Red Hat

A network of PyLadies within Red Hat has been my dream since I joined Red Hat. When I shared this with Karolina at last year's DevConf, she agreed. And finally, we initiated it on day 2 of the conference. We are so excited for the future to come.

Meeting friends

Conference means friends. It was so great to meet so many friends after such a long time: Tylor, Nicholas, Naomi, Honza, Carol, Mike, Artur, Nikita, and Valerio, and many new ones: Jannis Joana, Chirstian, Martina Tereza, Maria, Alyona, Mia, Naa, Bojan, and Jodie. A special note of love to Jodie: thank you for holding my hand and taking me out of the dark.

The best is saved for the last. Euro Python 2024 made 3 of my dreams come true.

  • Gender Neutral Washrooms

  • Sanitary products in restrooms (I remember carrying sanitary napkins in my backpack at PyCon India and telling girls that if they needed them, they were available at the PyLadies booth).

  • Neo-diversity bag (which saved me at the conference; thank you, Karolina, for this)

I cannot wait for the next Euro Python; see you all at Euro Python 2025.

PS: Thanks to Laís, I will always have a small piece of Euro Python 2024 with me. I know I am loved and cared for.


Python Software Foundation: Announcing the 2024 PSF Board Election & Proposed Bylaw Change Results!

Wed, 2024-07-17 07:11

The 2024 election for the PSF Board and proposed Bylaws changes created an opportunity for conversations about the PSF's work to serve the global Python community. We appreciate community members' perspectives, passion, and engagement in the election process this year.

We want to send a big thanks to everyone who ran and was willing to serve on the PSF Board. Even if you were not elected, we appreciate all the time and effort you put into thinking about how to improve the PSF and represent the parts of the community you participate in. We hope that you will continue to think about these issues, share your ideas, and join a PSF Work Group if you feel called to do so.

Board Members Elect

Congratulations to our three new Board members who have been elected!

  • Tania Allard
  • KwonHan Bae
  • Cristián Maureira-Fredes

We’ll be in touch with all the elected candidates shortly to schedule onboarding. Newly elected PSF Board members are provided orientation for their service and will be joining the upcoming board meeting.

PSF Bylaw Changes

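All three of the proposed PSF Bylaw changes are approved:

  • Change 1: Merging Contributing and Managing member classes
  • Change 2: Simplifying the voter affirmation process by treating past voting activity as intent to continue voting
  • Change 3: Allow for removal of Fellows by a Board vote in response to Code of Conduct violations, removing the need for a vote of the membership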

We appreciate the high level of engagement on the proposed Bylaw changes, and the range of perspectives and points that were raised. We hope that our efforts towards increased transparency, such as the Office Hour session, and our responses in the FAQ helped to continue to build trust with the community. Our goal with these changes continues to be:

  • Making it simpler to qualify as a Member for Python-related volunteer work
  • Making it easier to vote
  • Allowing the Board more options to keep our membership safe and enforce the Code of Conduct

This announcement serves as notice that the Bylaws changes have been approved by the membership, and will automatically go into effect 15 days from now, on Thursday, August 1st, 2024.

Thank you!

We’d like to take this opportunity to thank our outgoing board member, Débora Azevedo, for her outstanding service. Débora served on the PSF Board through a particularly eventful time: bringing PyCon US into an age of hybrid events, responding to calls from our community for transparency, and hiring multiple new staff members to continue to improve our organization. Thank you for supporting the PSF and the Python community through so much change. You are appreciated!

Our heartfelt thanks go out to each of you who took the time to review the candidates and submit your votes. Your participation helps the PSF represent our community. We received 611 total votes, easily reaching quorum, which is 1/3 of affirmed voting members (794). We’re especially grateful for your patience with continuing to navigate the changes to the voting process, which allows for a valid election and a more sustainable election system.
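To make the quorum arithmetic concrete, here is a quick sketch. It assumes that 794 is the number of affirmed voting members and that the one-third threshold is rounded up to a whole vote; the rounding rule is an assumption, not something stated in the announcement.

```python
import math

affirmed_voting_members = 794  # figure quoted in the announcement
votes_received = 611           # figure quoted in the announcement

# Assumption: a fractional threshold is rounded up to the next whole vote.
quorum = math.ceil(affirmed_voting_members / 3)
turnout = votes_received / affirmed_voting_members

print(f"Quorum threshold: {quorum} votes")
print(f"Quorum reached: {votes_received >= quorum}")
print(f"Turnout: {turnout:.0%}")
```

Under these assumptions, the threshold comes to 265 votes, so 611 votes clears it by a wide margin.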

We also want to thank everyone who helped promote this year’s board election, especially Board Members Denny Perez and Georgi Ker, who took the initiative to cover this year’s election and produced informational videos for our candidates. This promotional effort was inspired by the work of Python Community News last year. We also want to highlight the PSF staff members and PSF Board members who put in tons of effort each year as we work to continually improve the PSF elections.

What’s next?

If you’re interested in the complete tally, make sure to check the Python Software Foundation Board of Directors Election 2024 Results page. These results will be available until 10 Sep 2024 at 10:00 AM EDT.

The PSF Election team will conduct a retrospective of this year’s election process to ensure we are improving year over year. We received valuable feedback about the process and tooling. We hope to be able to implement changes for next year to ensure a smooth and accessible election process for everyone in our community.

Finally, it might feel a little early to mention this, but we will have at least 4 seats open again next year. If you're interested in running or learning more, we encourage you to contact a current PSF Board member or two this year and ask them about their experience serving on the board.


Python Bytes: #392 The votes have been counted

Wed, 2024-07-17 04:00

Topics covered in this episode:

  • 2024 PSF Board Election & Proposed Bylaw Change Results (https://pyfound.blogspot.com/2024/07/announcing-2024-psf-board-election.html)
  • SATYRN: A modern Jupyter client for Mac (https://satyrn.app)
  • Incident Report: Leaked GitHub Personal Access Token (https://blog.pypi.org/posts/2024-07-08-incident-report-leaked-admin-personal-access-token/)
  • Extra extra extra
  • Extras
  • Joke

Watch on YouTube: https://www.youtube.com/watch?v=GpZI_HqzCTc

About the show

Sponsored by Code Comments, an original podcast from Red Hat: pythonbytes.fm/code-comments (https://pythonbytes.fm/code-comments)

Connect with the hosts

  • Michael: @mkennedy@fosstodon.org
  • Brian: @brianokken@fosstodon.org
  • Show: @pythonbytes@fosstodon.org

Join us on YouTube at pythonbytes.fm/live (https://pythonbytes.fm/stream/live) to be part of the audience. Usually Tuesdays at 10am PT. Older video versions are available there too.

Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form, add your name and email to our friends of the show list (https://pythonbytes.fm/friends-of-the-show). We'll never share it.

Brian #1: 2024 PSF Board Election & Proposed Bylaw Change Results

  • New board members
      • Tania Allard
      • KwonHan Bae
      • Cristián Maureira-Fredes
  • Congrats to new board members
  • If you want to consider becoming a board member, there are 4 seats up for vote next year.
  • All 3 bylaw changes passed, by a wide margin (https://opavote.com/results/5004101476679680/1).
      • Details of changes (https://pyfound.blogspot.com/2024/06/for-your-consideration-proposed-bylaws.html)
      • Change 1: Merging Contributing and Managing member classes
      • Change 2: Simplifying the voter affirmation process by treating past voting activity as intent to continue voting
      • Change 3: Allow for removal of Fellows by a Board vote in response to Code of Conduct violations, removing the need for a vote of the membership

Michael #2: SATYRN: A modern Jupyter client for Mac

  • A Jupyter client app for macOS
  • Comes with a command palette
  • LLM assistance (local or cloud?)
  • Built in Black formatter
  • Currently in alpha
  • Business model unknown

Brian #3: Incident Report: Leaked GitHub Personal Access Token

  • Suggested by Galen Swint
  • See also JFrog blog: Binary secret scanning helped us prevent (what might have been) the worst supply chain attack you can imagine (https://jfrog.com/blog/leaked-pypi-secret-token-revealed-in-binary-preventing-suppy-chain-attack/)
  • A GitHub access token found its way into a .pyc file, then into a Docker image.
  • JFrog found it through some regular scans.
  • JFrog notified PyPI security.
  • Token was destroyed within 17 minutes. (nice turnaround)
  • Follow-up scan revealed that no harm was done.
  • Takeaways (from Ee Durbin):
      • Set aggressive expiration dates for API tokens (if you need them at all)
      • Treat .pyc files as if they were source code
      • Perform builds on automated systems from clean source only.

Michael #4: Extra extra extra

  • Python 3.13.0 beta 3 released (https://blog.python.org/2024/06/python-3130-beta-3-released.html)
  • Ice got a lot better (https://github.com/jordanbaird/Ice/releases)
  • I Will Piledrive You If You Say AI Again | Prime Reacts Video (https://www.youtube.com/watch?v=k0XuoK132z4)
  • Follow up actions for polyfill supply chain attack (https://fosstodon.org/@mkennedy/112797279807472603)
  • Developer Ecosystem Survey 2024 (https://surveys.jetbrains.com/s3/p-developer-ecosystem-survey-2024?utm_source=pythonbytes)
  • Code in a Castle still has seats open (https://talkpython.fm/castle)

Extras

Brian:

  • A new pytest course in the works
      • Quick course focusing on core pytest features + some strategy and Design for Testability concepts
      • Idea
          • Everyone on the team (including managers) can take the new course.
          • 1-2 people on a team take “The Complete pytest Course” to become the team's local pytest experts.
  • Python People is on an indefinite hold
  • Python Test → back to Test & Code (probably)
      • I’m planning a series (maybe a season) on TDD which will be language agnostic.
      • Plus I still have tons of Test & Code stickers and no Python Test stickers.
      • New episodes planned for August

Joke: I need my intellisense (autocomplete) (https://devhumor.com/media/i-need-my-intellisense)

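The incident-report takeaway about treating .pyc files like source code can be illustrated with a rough sketch. String constants from source files are stored in compiled bytecode, so a secret pasted into a .py file also ends up in its .pyc. The helper names below are hypothetical, and the pattern is an assumption about what a classic GitHub personal access token looks like ("ghp_" followed by 36 alphanumeric characters). This is not the scanner JFrog or PyPI used.

```python
import re
from pathlib import Path

# Assumption: classic GitHub personal access tokens look like
# "ghp_" followed by 36 alphanumeric characters.
TOKEN_PATTERN = re.compile(rb"ghp_[A-Za-z0-9]{36}")


def find_token_like_strings(data: bytes) -> list[bytes]:
    """Return every byte sequence in `data` that resembles a token."""
    return TOKEN_PATTERN.findall(data)


def scan_tree(root: str) -> dict[str, int]:
    """Map each .pyc file under `root` to its number of token-like hits."""
    hits = {}
    for path in Path(root).rglob("*.pyc"):
        count = len(find_token_like_strings(path.read_bytes()))
        if count:
            hits[str(path)] = count
    return hits


# A fabricated blob standing in for compiled bytecode with a fake token.
fake_pyc_bytes = b"\x00\x01junk" + b"ghp_" + b"A" * 36 + b"\x00more"
print(len(find_token_like_strings(fake_pyc_bytes)))  # 1
```

Calling the hypothetical scan_tree(".") before building an image would be one cheap extra check. It does not replace short-lived tokens or clean automated builds.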

PyCoder’s Weekly: Issue #638 (July 16, 2024)

Tue, 2024-07-16 15:30


Customize VS Code Settings

In this course, Philipp helps you customize your Visual Studio Code settings to switch from a basic cluttered look to a clean presentable look. This is not just pleasant on the eyes, but also gives you a nice user interface if you want to share on a Zoom call or screen recording.
REAL PYTHON course

Incident Report: Leaked GitHub Personal Access Token

A PyPI admin accidentally leaked credentials into a Docker container. It has since been fixed and the credentials revoked. This is the report by that same admin outlining what happened and how to help prevent similar mistakes in the future.
EE DURBIN

GPU Accelerate Your Data Science Workflows End-to-End

Discover how to create, accelerate, and deploy data pipelines with RAPIDS for GPU-accelerated data science workflows. Take one of our Data Science courses for free when you join the NVIDIA Developer Program →
NVIDIA sponsor

Free-Threaded CPython Is Ready to Experiment With!

An overview of the ongoing efforts to improve and roll out support for free-threaded CPython throughout the Python open source ecosystem. Associated Hacker News discussion.
RALF GOMMERS

PSF Announces Infrastructure Engineer

PYTHON SOFTWARE FOUNDATION

DjangoCon US 2024 Announces Talks

DJANGOCON

Django Security Releases Issued: 5.0.7 and 4.2.14

DJANGO SOFTWARE FOUNDATION

Register for Kiwi PyCon, Aug 23-25

KIWIPYCON.NZ • Shared by Kiwi PyCon

Quiz: Split Datasets With scikit-learn.train_test_split()

REAL PYTHON

Python Jobs

Python Tutorial Writer (Anywhere)

Real Python

Python Video Course Instructor (Anywhere)

Real Python

More Python Jobs >>>

Articles & Tutorials

Free, Unbelievably Stupid Wi-Fi on Long-Haul Flights

Deep in a need to procrastinate on a flight between London and San Francisco, Robert discovered that changing his name on an airline’s frequent flyer account was free over the plane’s WiFi. What’s a developer to do? Work on their tickets? No, create an entire TCP/IP protocol using this loophole. The result is the PySkyWiFi package.
ROBERT HEATON

Digging Into Graph Theory in Python With David Amos

Have you wondered about graph theory and how to start exploring it in Python? What resources and Python libraries can you use to experiment and learn more? This week on the show, former co-host David Amos returns to talk about what he’s been up to and share his knowledge about graph theory in Python.
REAL PYTHON

My Programming Beliefs as of July 2024

This collection of thoughts outlines how Evan approaches coding, with the understanding that this might change in the future. His beliefs include using spikes, the difference between simple and easy, a preference for enums over booleans, and more.
EVAN HAHN

Breaking Out of Nested Loops With Generators

Have you ever had the situation where you’ve got a nested loop and need to break out of the outer one? One way of dealing with this problem is refactoring the loop to use a generator. This post shows you how.
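A minimal sketch of the technique this blurb describes, using a hypothetical grid_cells() generator (the linked post may structure its example differently). The generator flattens the two loops into one, so a single break is enough to stop everything:

```python
def grid_cells(grid):
    """Yield (row_index, col_index, value) for every cell in a 2D list."""
    for row_index, row in enumerate(grid):
        for col_index, value in enumerate(row):
            yield row_index, col_index, value


grid = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

found = None
for row_index, col_index, value in grid_cells(grid):
    if value == 5:
        found = (row_index, col_index)
        break  # One break ends the search; no flag variable is needed.

print(found)  # (1, 1)
```

Because generators are lazy, the cells after the match are never visited.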
RODRIGO GIRÃO SERRÃO

“Extracting Wisdom” From Conference Videos

There are so many conferences and so many videos, you can’t possibly watch them all. This post shows you how to extract information to summarize a talk so you can quickly decide what you want to watch.
GONÇALO VALÉRIO

Creating a Simple Pastebin Service in Python and Flask

Learn how to build a functional pastebin service using Python and Flask. This tutorial covers web development basics, file handling, and syntax highlighting.
MUHAMMAD RAZA

How a Decorator Crashed My Flask App

This blog post shows how failing to use functools.wraps can cause issues with Flask. Learn why you should always use wraps and what went wrong.
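To see what functools.wraps changes, here is a framework-free sketch with two hypothetical decorators. Flask derives a view's default endpoint name from the function's __name__, so several views decorated without wraps can all end up named "wrapper". Whether that is the exact failure described in the linked post is an assumption.

```python
import functools


def without_wraps(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def with_wraps(func):
    @functools.wraps(func)  # Copies __name__, __doc__, and more onto wrapper.
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@without_wraps
def index():
    """Home page."""
    return "index"


@with_wraps
def about():
    """About page."""
    return "about"


print(index.__name__)  # wrapper
print(about.__name__)  # about
```

Both functions still return the right values when called. Only the metadata differs, which is why the problem tends to surface late, when a framework or a debugger inspects it.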
SUYOG DAHAL

Python Has Too Many Package Managers

An overview of Python’s package management ecosystem in 2024, plus the associated Hacker News discussion.
LARRY DU

Creating Images in Your Terminal With Python and Rich Pixels

Rich Pixels, a package from one of the folks at Textual, allows you to create images in your terminal and display them.
MIKE DRISCOLL

How Do You Choose Python Function Names?

This tutorial discusses the rules and conventions for choosing Python function names and why they’re important.
REAL PYTHON

Using HTMX With FastAPI

This tutorial looks at how to use HTMX with FastAPI by creating a simple todo web app and deploying it on Render.
PAUL ESCH-LAURENT • Shared by Michael Herman

Projects & Code

ViperIDE: MicroPython IDE for Web and Mobile

GITHUB.COM/VSHYMANSKYY

ML System Design: 450 Case Studies to Learn From

EVIDENTLYAI.COM • Shared by Daria Maliugina

reladiff: High-Perf Diffing of Large Datasets Across Databases

GITHUB.COM/EREZSH

Yen: The Last Python Environment Manager You’ll Ever Need

GITHUB.COM/TUSHARSADHWANI • Shared by Tushar Sadhwani

Events

Weekly Real Python Office Hours Q&A (Virtual)

July 17, 2024
REALPYTHON.COM

PyData Bristol Meetup

July 18, 2024
MEETUP.COM

PyLadies Dublin

July 18, 2024
PYLADIES.COM

Chattanooga Python User Group

July 19 to July 20, 2024
MEETUP.COM

PyKla Monthly Meetup

July 24, 2024
MEETUP.COM

PyLadies Amsterdam

July 24, 2024
MEETUP.COM

PyOhio 2024

July 27 to July 28, 2024
PYOHIO.ORG

Happy Pythoning!
This was PyCoder’s Weekly Issue #638.

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]


Real Python: Exercises Course: Introduction to Web Scraping With Python

Tue, 2024-07-16 10:00

Web scraping is the process of collecting and parsing raw data from the Web, and the Python community has come up with some pretty powerful web scraping tools.

The Internet hosts the greatest source of information on the planet. Many disciplines, such as data science, business intelligence, and investigative reporting, can benefit enormously from collecting and analyzing data from websites.

In this course, you’ll practice:

  • Parsing website data using string methods and regular expressions
  • Parsing website data using an HTML parser
  • Interacting with forms and other website components
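As a rough illustration of the first two items in the list above, the sketch below parses a small made-up HTML string with the standard library only. The page content and the LinkCollector class are invented for this example, and the course itself may use different tools. Interacting with forms is not covered here.

```python
import re
from html.parser import HTMLParser

# A made-up page, so the example needs no network access.
PAGE = """
<html>
  <head><title>Example Page</title></head>
  <body>
    <a href="/first">First</a>
    <a href="/second">Second</a>
  </body>
</html>
"""

# Approach 1: a regular expression pulls out the title text.
match = re.search(r"<title[^>]*>(.*?)</title>", PAGE, re.IGNORECASE | re.DOTALL)
title = match.group(1).strip() if match else None


# Approach 2: an HTML parser collects the href of every <a> tag.
class LinkCollector(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


collector = LinkCollector()
collector.feed(PAGE)

print(title)
print(collector.links)
```

Regular expressions are fine for small, predictable snippets like this one. A real parser copes better with the messy markup found on actual websites.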


Kushal Das: Friends, the most important part of any conference

Tue, 2024-07-16 01:33

In the beginning, one goes to conferences to listen to the talks and make new contacts. You meet a lot of new faces every time. Over time, a few of them become great friends, and then all conferences become about friends.

We wait for the conferences so that we can meet our friends. I went back to PyCon US this year after 5 years, which means I met many friends after 5 years. It was such a happy feeling to see them again.

Last week I went to my first ever Euro Python in Prague; finally, the visa was valid on the right days of the year. This meant I managed to meet more friends, a few of them just a month later (as they were present at PyCon US) and some after many, many years. I really enjoyed the venues the organizers selected for the social events.

Personally, the social events allowed me to go full-scale nerd on technical and social issues with friends. I was really missing these discussions. I heard more stories and discussed fun ideas. One is below :)

    $ python
    Python 3.12.4 (main, Jun 7 2024, 00:00:00) [GCC 14.1.1 20240607 (Red Hat 14.1.1-5)] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> hello
    🤌🤌🤌
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: name 'hello' is not defined. Did you mean: 'help'?
    >>> [].set("different exception")
    🤌🤌🤌
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'list' object has no attribute 'set'
    >>>
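The post does not say how this effect was produced. One possible way to get something similar, shown purely as a sketch with hypothetical function names, is to replace sys.excepthook, which Python calls for uncaught exceptions:

```python
import sys
import traceback


def format_with_flourish(exc_type, exc_value, exc_traceback) -> str:
    """Return the usual traceback text with an emoji line in front."""
    lines = traceback.format_exception(exc_type, exc_value, exc_traceback)
    return "🤌🤌🤌\n" + "".join(lines)


def emoji_excepthook(exc_type, exc_value, exc_traceback) -> None:
    # Uncaught exceptions end up here once the hook is installed.
    sys.stderr.write(format_with_flourish(exc_type, exc_value, exc_traceback))


sys.excepthook = emoji_excepthook
```

Putting code like this in a startup file would apply it to every interactive session. Restoring sys.__excepthook__ switches it off again.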

Mike Driscoll: Creating Images in Your Terminal with Python and Rich Pixels

Mon, 2024-07-15 13:11

A newer Python package called Rich Pixels allows you to create images in your terminal and display them. Darren Burns, one of the team members from the Textual project, created this package.

Anyway, let’s find out how to use Rich Pixels!

Installation

You can install Rich Pixels using Python’s pip utility. Here’s how:

python -m pip install rich-pixels

Once you have Rich Pixels installed, you can try it out!

Displaying Images in the Terminal

Rich Pixels lets you take a pre-existing image and show it in your terminal. The higher the image’s resolution, the better the output will be. However, if your image has too many pixels, it probably won’t fit in your terminal, and much of it will be drawn off-screen.

For this example, you will use the Python Show Podcast logo and attempt to draw it in your terminal.

Open up your favorite Python editor and add the following code to it:

    from rich_pixels import Pixels
    from rich.console import Console

    console = Console()
    pixels = Pixels.from_image_path("python_show200.jpg")
    console.print(pixels)

For this example, you will use a square image that is 200×200 pixels. You can run the code like this in your terminal:

python pixels.py

When you execute the command above, you will see something like this in your terminal:

 

As you can see, the image is a little pixelated and gets cut off at the bottom. Of course, this all depends on your monitor’s resolution.
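If you want to avoid the cut-off, one option is to shrink the image before drawing it. The following is a minimal sketch that only computes a target size. The fit_to_terminal() helper is hypothetical (it is not part of Rich Pixels), and it assumes one image pixel per character cell, which may not match how your renderer packs pixels:

```python
import shutil


def fit_to_terminal(width, height, columns, rows):
    """Shrink (width, height) to fit in columns x rows, keeping the aspect ratio.

    Assumption: one image pixel per character cell.
    """
    if width <= columns and height <= rows:
        return width, height  # already fits, never upscale
    # Integer comparison of columns/width against rows/height.
    if columns * height <= rows * width:
        # The width is the limiting dimension.
        return columns, max(1, height * columns // width)
    # The height is the limiting dimension.
    return max(1, width * rows // height), rows


terminal = shutil.get_terminal_size()  # falls back to 80x24 if undetectable
print(fit_to_terminal(200, 200, terminal.columns, terminal.lines))
```

You could then resize the image to that size with Pillow's Image.resize() before handing it to Rich Pixels.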

Here’s what happens when you use an image that is 80×80 pixels:

You can also use the Pillow package to create an image object and pass that to Rich Pixels. Here’s how that might look:

from PIL import Image
from rich_pixels import Pixels

with Image.open("path/to/image.png") as image:
    pixels = Pixels.from_image(image)

You can create or draw your images using Pillow and then pass them to Rich Pixels for display. There is some coverage of this topic in my article, Drawing Shapes on Images with Python and Pillow.

Wrapping Up

Rich Pixels is a fun way to add extra pizzazz to your terminal applications. Rich Pixels can also be used in a Textual application. While there probably aren’t a lot of use cases for this package, it’s a lot of fun to play around with.

Give it a try, and let me know what you create!

The post Creating Images in Your Terminal with Python and Rich Pixels appeared first on Mouse Vs Python.

Categories: FLOSS Project Planets

Real Python: Split Your Dataset With scikit-learn's train_test_split()

Mon, 2024-07-15 10:00

One of the key aspects of supervised machine learning is model evaluation and validation. When you evaluate the predictive performance of your model, it’s essential that the process be unbiased. Using train_test_split() from the data science library scikit-learn, you can split your dataset into subsets that minimize the potential for bias in your evaluation and validation process.

In this tutorial, you’ll learn:

  • Why you need to split your dataset in supervised machine learning
  • Which subsets of the dataset you need for an unbiased evaluation of your model
  • How to use train_test_split() to split your data
  • How to combine train_test_split() with prediction methods

In addition, you’ll get information on related tools from sklearn.model_selection.

Get Your Code: Click here to download the free sample code that you’ll use to learn about splitting your dataset with scikit-learn’s train_test_split().

Take the Quiz: Test your knowledge with our interactive “Split Your Dataset With scikit-learn's train_test_split()” quiz. You’ll receive a score upon completion to help you track your learning progress:

In this quiz, you'll test your understanding of how to use the train_test_split() function from the scikit-learn library to split your dataset into subsets for unbiased evaluation in machine learning.

The Importance of Data Splitting

Supervised machine learning is about creating models that precisely map the given inputs to the given outputs. Inputs are also called independent variables or predictors, while outputs may be referred to as dependent variables or responses.

How you measure the precision of your model depends on the type of problem you’re trying to solve. In regression analysis, you typically use the coefficient of determination, root mean square error, mean absolute error, or similar quantities. For classification problems, you often apply accuracy, precision, recall, F1 score, and related indicators.

The acceptable numeric values that measure precision vary from field to field. You can find detailed explanations from Statistics By Jim, Quora, and many other resources.

What’s most important to understand is that you usually need unbiased evaluation to properly use these measures, assess the predictive performance of your model, and validate the model.

This means that you can’t evaluate the predictive performance of a model with the same data you used for training. You need to evaluate the model with fresh data that hasn’t been seen by the model before. You can accomplish that by splitting your dataset before you use it.

Training, Validation, and Test Sets

Splitting your dataset is essential for an unbiased evaluation of prediction performance. In most cases, it’s enough to split your dataset randomly into three subsets:

  1. The training set is applied to train or fit your model. For example, you use the training set to find the optimal weights, or coefficients, for linear regression, logistic regression, or neural networks.

  2. The validation set is used for unbiased model evaluation during hyperparameter tuning. For example, when you want to find the optimal number of neurons in a neural network or the best kernel for a support vector machine, you experiment with different values. For each considered setting of hyperparameters, you fit the model with the training set and assess its performance with the validation set.

  3. The test set is needed for an unbiased evaluation of the final model. You shouldn’t use it for fitting or validation.

In less complex cases, when you don’t have to tune hyperparameters, it’s okay to work with only the training and test sets.
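To make the three subsets concrete, here’s a rough sketch in plain Python. It isn’t scikit-learn’s implementation, and the three_way_split() name and the 60/20/20 proportions are assumptions chosen for illustration:

```python
import random


def three_way_split(samples, train=0.6, validation=0.2, seed=0):
    """Shuffle a copy of samples and cut it into train/validation/test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n_train = round(len(shuffled) * train)
    n_validation = round(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_validation],
        shuffled[n_train + n_validation:],  # whatever is left is the test set
    )


train_set, validation_set, test_set = three_way_split(range(10))
print(len(train_set), len(validation_set), len(test_set))
```

Every sample lands in exactly one subset, so the test set stays untouched while you fit and tune the model.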

Underfitting and Overfitting

Splitting a dataset might also be important for detecting if your model suffers from one of two very common problems, called underfitting and overfitting:

  1. Underfitting is usually the consequence of a model being unable to encapsulate the relations among data. For example, this can happen when trying to represent nonlinear relations with a linear model. Underfitted models will likely have poor performance with both training and test sets.

  2. Overfitting usually takes place when a model has an excessively complex structure and learns both the existing relations among data and noise. Such models often have bad generalization capabilities. Although they work well with training data, they usually yield poor performance with unseen test data.

You can find a more detailed explanation of underfitting and overfitting in Linear Regression in Python.
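You can see both problems in a toy experiment. This sketch uses only the standard library, and everything in it (the quadratic data, the noise level, and the two deliberately bad models) is made up for illustration:

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable


def make_data(n):
    """Return n noisy samples (x, y) of the curve y = x**2."""
    data = []
    for _ in range(n):
        x = rng.uniform(-3, 3)
        data.append((x, x * x + rng.gauss(0, 1)))
    return data


train, test = make_data(30), make_data(30)


def mse(model, data):
    """Mean squared error of model on data."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


mean_y = sum(y for _, y in train) / len(train)


def constant_model(x):
    """Too simple: ignores x, so it can't follow the curve (underfitting)."""
    return mean_y


def memorizer(x):
    """Too flexible: repeats the y of the closest training point, noise included."""
    return min(train, key=lambda point: abs(point[0] - x))[1]


for name, model in [("constant", constant_model), ("memorizer", memorizer)]:
    print(f"{name}: train MSE={mse(model, train):.2f}, test MSE={mse(model, test):.2f}")
```

The memorizer reproduces every training point exactly, so its training error is zero, while its error on the unseen test set is not. The constant model has a nonzero error even on the data it was fitted to.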

Prerequisites for Using train_test_split()

Now that you understand the need to split a dataset in order to perform unbiased model evaluation and identify underfitting or overfitting, you’re ready to learn how to split your own datasets.

You’ll use version 1.5.0 of scikit-learn, or sklearn. It has many packages for data science and machine learning, but for this tutorial, you’ll focus on the model_selection package, specifically on the function train_test_split().

Note: While this tutorial is tested with this specific version of scikit-learn, the features that you’ll use are core to the library and should work equivalently in other versions of scikit-learn as well.

You can install sklearn with pip:

Read the full article at https://realpython.com/train-test-split-python-data/ »

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

Real Python: Quiz: How to Use Generators and yield in Python

Mon, 2024-07-15 08:00

In this quiz, you’ll test your understanding of Python generators.

Generators and the Python yield statement can help you when you’re working with large datasets that might overwhelm your machine’s memory. Another use case is when you have a complex function that needs to maintain an internal state every time it’s called.

When you understand Python generators, then you’ll be able to work with large datasets in a more Pythonic fashion, create generator functions and expressions, and apply your knowledge towards building efficient data pipelines.
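As a quick refresher, here’s a small sketch of both ideas: values produced lazily, and state kept between values. The function names are made up for this example:

```python
def squares(limit):
    """Yield squares one at a time instead of building a full list."""
    for number in range(limit):
        yield number * number


def running_total(values):
    """Keep internal state (the total) between yielded values."""
    total = 0
    for value in values:
        total += value
        yield total


# Chaining generators builds a pipeline; nothing is computed until requested.
pipeline = running_total(squares(1_000_000))
first_five = [next(pipeline) for _ in range(5)]
print(first_five)  # [0, 1, 5, 14, 30]
```

Only the five requested values are ever computed, even though the pipeline describes a million of them.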

Categories: FLOSS Project Planets

Real Python: Quiz: How to Write Beautiful Python Code With PEP 8

Mon, 2024-07-15 08:00

In this quiz, you’ll test your understanding of how to write beautiful Python code with PEP 8.

By working through this quiz, you’ll revisit the key guidelines laid out in PEP 8 and how to set up your development environment to write PEP 8 compliant Python code.

Categories: FLOSS Project Planets

Kushal Das: Disable this Firefox preference to save privacy

Mon, 2024-07-15 04:55

If you are on the latest Firefox 128 (which is available on Fedora 40), you should uncheck the following preference to disable Privacy-Preserving Attribution. Firefox added this experimental feature and turned it on by default for everyone, which should not be the case.

You can find it in the preferences window.

Categories: FLOSS Project Planets

Zato Blog: Network packet brokers and automation in Python

Mon, 2024-07-15 00:43
Network packet brokers and automation in Python 2024-07-15, by Dariusz Suchojad

Packet brokers are crucial for network engineers, providing a clear, detailed view of network traffic, aiding in efficient issue identification and resolution.

But what is a network packet broker (NPB) really? Why are they needed? And how to automate one in Python?

➤ Read this article about network packet brokers and their automation in Python to find out more.

More resources

Click here to read more about using Python and Zato in telecommunications
➤ Python API integration tutorial
What is an integration platform?

Categories: FLOSS Project Planets

Real Python: Quiz: How to Flatten a List of Lists in Python

Sun, 2024-07-14 08:00

In this quiz, you’ll test your understanding of how to flatten a list in Python.

You’ll write code and answer questions to revisit the concept of converting a multidimensional list, such as a matrix, into a one-dimensional list.
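As a reminder of what flattening means, here’s a short sketch of two common approaches for a list nested one level deep. The matrix values are arbitrary sample data:

```python
from itertools import chain

matrix = [[9, 3, 8], [4, 5, 2], [6, 1, 7]]

# A nested comprehension reads the rows left to right.
flat_comprehension = [item for row in matrix for item in row]

# chain.from_iterable() does the same without an explicit inner loop.
flat_chain = list(chain.from_iterable(matrix))

print(flat_comprehension)
print(flat_comprehension == flat_chain)
```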

Categories: FLOSS Project Planets

Real Python: Quiz: Python Type Checking

Sun, 2024-07-14 08:00

In this quiz, you’ll test your understanding of Python Type Checking.

By working through this quiz, you’ll revisit type annotations and type hints, adding static types to code, running a static type checker, and enforcing types at runtime.

Categories: FLOSS Project Planets

PyPy: Finding Simple Rewrite Rules for the JIT with Z3

Fri, 2024-07-12 15:14

In June I was at the PLDI conference in Copenhagen to present a paper I co-authored with Max Bernstein. I also finally met John Regehr, who I'd been talking to on social media for ages but had never met. John has been working on compiler correctness and better techniques for building compilers and optimizers for a very long time. The blog post Finding JIT Optimizer Bugs using SMT Solvers and Fuzzing was heavily inspired by this work. We talked a lot about his and his group's work on using Z3 for superoptimization and for finding missing optimizations. I have applied some of the things John told me about to the traces of PyPy's JIT, and wanted to blog about that. However, my draft felt quite hard to understand. Therefore I have now written this current post, to at least try to provide a somewhat gentler on-ramp to the topic.

In this post we will use the Python-API to Z3 to find local peephole rewrite rules for the operations in the intermediate representation of PyPy's tracing JIT. The code for this is simple enough that we can go through all of it.

The PyPy JIT produces traces of machine level instructions, which are optimized and then turned into machine code. The optimizer uses a number of approaches to make the traces more efficient. For integer operations it applies a number of arithmetic simplification rules, for example int_add(x, 0) -> x. When implementing these rules in the JIT there are two problems: How do we know that the rules are correct? And how do we know that we haven't forgotten any rules? We'll try to answer both of these, but the first one in particular.

We'll be using Z3, a satisfiability modulo theories (SMT) solver which has good bitvector support and most importantly an excellent Python API. We can use the solver to reason about bitvectors, which are how we will model machine integers.

To find rewrite rules, we will consider the binary operations (i.e. those taking two arguments) in PyPy traces that take and produce integers. The completely general form op(x, y) is not simplifiable on its own. But if either x == y or one of the arguments is a constant, we can potentially simplify the operation into a simpler form. We'll ignore constant-folding, where both arguments of the binary operation are constants. The possible results for a simplifiable binary operation are the variable x or a (potentially different) constant. This leaves the following patterns as possibilities:

  • op(x, x) == x
  • op(x, x) == c1
  • op(x, c1) == x
  • op(c1, x) == x
  • op(x, c1) == c2
  • op(c1, x) == c2

Our approach will be to take every single supported binary integer operation, instantiate all of these patterns, and try to ask Z3 whether the resulting simplification is valid for all values of x.
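To get a feeling for the approach before bringing in Z3, here is a brute-force sketch of the same idea on 8-bit integers, where we can simply try all 256 values of x. This is plain Python rather than the Z3 code of this post; the OPS table and holds_for_all_x are made up for this illustration and cover only a handful of operations:

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1  # results wrap around at 8 bits

# A few binary operations, modelled on unsigned 8-bit patterns.
OPS = {
    "int_add": lambda a, b: (a + b) & MASK,
    "int_sub": lambda a, b: (a - b) & MASK,
    "int_and": lambda a, b: a & b,
    "int_or": lambda a, b: a | b,
    "int_xor": lambda a, b: a ^ b,
}


def holds_for_all_x(predicate):
    """Check a property by exhaustively trying every 8-bit value of x."""
    return all(predicate(x) for x in range(1 << WIDTH))


# Instantiate the pattern op(x, x) == x for every operation.
found = [name for name, op in OPS.items()
         if holds_for_all_x(lambda x: op(x, x) == x)]
for name in found:
    print(f"{name}(x, x) -> x")
```

Exhaustive enumeration stops being feasible at 64 bits, which is where a solver comes in.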

Quick intro to the Z3 Python-API

Here's a terminal session showing the use of the Z3 Python API:

>>>> import z3
>>>> # construct a Z3 bitvector variable of width 8, with name x:
>>>> x = z3.BitVec('x', 8)
>>>> # construct a more complicated formula by using operator overloading:
>>>> x + x
x + x
>>>> x + 1
x + 1

Z3 checks the "satisfiability" of a formula. This means that it tries to find an example set of concrete values for the variables that occur in a formula, such that the formula becomes true. Examples:

>>>> solver = z3.Solver()
>>>> solver.check(x * x == 3)
unsat
>>>> # meaning no x fulfils this property
>>>>
>>>> solver.check(x * x == 9)
sat
>>>> model = solver.model()
>>>> model
[x = 253]
>>>> model[x].as_signed_long()
-3
>>>> # 253 is the same as -3 in two's complement arithmetic with 8 bits

In order to use Z3 to prove something, we can ask Z3 to find counterexamples for the statement, meaning concrete values that would make the negation of the statement true:

>>>> solver.check(z3.Not(x ^ -1 == ~x))
unsat

The result unsat means that we just proved that x ^ -1 == ~x is true for all x, because there is no value for x that makes not (x ^ -1 == ~x) true (this works because -1 has all the bits set).

If we try to prove something incorrect in this way, the following happens:

>>>> solver.check(z3.Not(x ^ -1 == x))
sat

sat shows that x ^ -1 == x is (unsurprisingly) not always true, and we can ask for a counterexample:

>>>> solver.model()
[x = 0]

This way of proving things works because the check calls try to solve an (implicit) "exists" quantifier, over all the Z3 variables used in the formula. check will either return z3.unsat, which means that no concrete values make the formula true; or z3.sat, which means that you can get some concrete values that make the formula true by calling solver.model().

In math terms we prove things using check by De Morgan's laws for quantifiers:

$$ \lnot \exists x: \lnot f(x) \implies \forall x: f(x) $$
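For 8-bit values the same reasoning can be replayed by exhaustive search, which may help to see what "sat", "unsat" and "proof by missing counterexample" mean. This is a plain Python sketch, not how Z3 works internally; check and prove here are toy stand-ins written for this illustration:

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1  # 0xff, which is also the bit pattern of -1


def to_signed(value):
    """Read an 8-bit pattern as a two's-complement number."""
    return value - (1 << WIDTH) if value > MASK >> 1 else value


def check(formula):
    """Return ("sat", x) for the first x making formula true, else ("unsat", None)."""
    for x in range(1 << WIDTH):
        if formula(x):
            return "sat", x
    return "unsat", None


def prove(statement):
    """statement holds for all x exactly if its negation is unsatisfiable."""
    result, counterexample = check(lambda x: not statement(x))
    return result == "unsat", counterexample


print(check(lambda x: ((x * x) & MASK) == 3))  # no 8-bit x squares to 3
print(check(lambda x: ((x * x) & MASK) == 9))  # some model exists
print(to_signed(253))                          # 253 read as a signed number
print(prove(lambda x: (x ^ MASK) == (~x & MASK)))
print(prove(lambda x: (x ^ MASK) == x))
```

The enumeration returns the first model it finds, which need not be the one Z3 picks: both 3 and 253 satisfy x * x == 9 at 8 bits.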

Now that we've seen the basics of using the Z3 API on a few small examples, we'll use it in a bigger program.

Encoding the integer operations of RPython's JIT into Z3 formulas

Now we'll use the API to reason about the integer operations of the PyPy JIT intermediate representation (IR). The binary integer operations are:

opnames2 = [
    "int_add",
    "int_sub",
    "int_mul",
    "int_and",
    "int_or",
    "int_xor",
    "int_eq",
    "int_ne",
    "int_lt",
    "int_le",
    "int_gt",
    "int_ge",
    "uint_lt",
    "uint_le",
    "uint_gt",
    "uint_ge",
    "int_lshift",
    "int_rshift",
    "uint_rshift",
    "uint_mul_high",
    "int_pydiv",
    "int_pymod",
]

There's not much special about the integer operations. Like in LLVM, most of them are signedness-independent: int_add, int_sub, int_mul, ... work correctly for unsigned integers but also for two's-complement signed integers. The exceptions are order comparisons like int_lt etc., for which we have unsigned variants uint_lt etc. All operations that produce a boolean result return a full-width integer 0 or 1 (the PyPy JIT supports only word-sized integers in its intermediate representation).
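The signedness-independence can be checked on a small scale. The sketch below models 8-bit integers in plain Python (the helper names mirror the IR operation names, but the implementations are written just for this illustration) and shows that addition produces the same bits under both readings, while the order comparison does not:

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1


def to_signed(value):
    """Read an 8-bit pattern as a two's-complement number."""
    return value - (1 << WIDTH) if value >> (WIDTH - 1) else value


def int_add(a, b):
    return (a + b) & MASK


def int_lt(a, b):
    return int(to_signed(a) < to_signed(b))  # signed comparison


def uint_lt(a, b):
    return int(a < b)  # unsigned comparison


a, b = 0xFF, 0x01  # 255 and 1 when unsigned, -1 and 1 when signed
print(int_add(a, b))                 # same bit pattern for both readings
print(int_lt(a, b), uint_lt(a, b))   # the comparisons disagree

# Adding the signed readings and wrapping gives the same bits as int_add.
add_is_signedness_independent = all(
    to_signed(int_add(x, y)) == to_signed((to_signed(x) + to_signed(y)) & MASK)
    for x in range(1 << WIDTH)
    for y in range(1 << WIDTH)
)
print(add_is_signedness_independent)
```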

In order to reason about the IR operations, some ground work:

import z3

INTEGER_WIDTH = 64
solver = z3.Solver()
solver.set("timeout", 10000) # milliseconds, ie 10s
xvar = z3.BitVec('x', INTEGER_WIDTH)
constvar = z3.BitVec('const', INTEGER_WIDTH)
constvar2 = z3.BitVec('const2', INTEGER_WIDTH)
TRUEBV = z3.BitVecVal(1, INTEGER_WIDTH)
FALSEBV = z3.BitVecVal(0, INTEGER_WIDTH)

And here's a function to turn an integer IR operation of PyPy's JIT into Z3 formulas:

def z3_expression(opname, arg0, arg1=None):
    """ computes a tuple of (result, valid_if) of Z3 formulas. `result` is the
    formula representing the result of the operation, given argument formulas
    arg0 and arg1. `valid_if` is a pre-condition that must be true for the
    result to be meaningful. """
    result = None
    valid_if = True # the precondition is mostly True, with few exceptions
    if opname == "int_add":
        result = arg0 + arg1
    elif opname == "int_sub":
        result = arg0 - arg1
    elif opname == "int_mul":
        result = arg0 * arg1
    elif opname == "int_and":
        result = arg0 & arg1
    elif opname == "int_or":
        result = arg0 | arg1
    elif opname == "int_xor":
        result = arg0 ^ arg1
    elif opname == "int_eq":
        result = cond(arg0 == arg1)
    elif opname == "int_ne":
        result = cond(arg0 != arg1)
    elif opname == "int_lt":
        result = cond(arg0 < arg1)
    elif opname == "int_le":
        result = cond(arg0 <= arg1)
    elif opname == "int_gt":
        result = cond(arg0 > arg1)
    elif opname == "int_ge":
        result = cond(arg0 >= arg1)
    elif opname == "uint_lt":
        result = cond(z3.ULT(arg0, arg1))
    elif opname == "uint_le":
        result = cond(z3.ULE(arg0, arg1))
    elif opname == "uint_gt":
        result = cond(z3.UGT(arg0, arg1))
    elif opname == "uint_ge":
        result = cond(z3.UGE(arg0, arg1))
    elif opname == "int_lshift":
        result = arg0 << arg1
        valid_if = z3.And(arg1 >= 0, arg1 < INTEGER_WIDTH)
    elif opname == "int_rshift":
        # >> on Z3 bitvectors is the arithmetic (sign-preserving) right shift
        result = arg0 >> arg1
        valid_if = z3.And(arg1 >= 0, arg1 < INTEGER_WIDTH)
    elif opname == "uint_rshift":
        result = z3.LShR(arg0, arg1)
        valid_if = z3.And(arg1 >= 0, arg1 < INTEGER_WIDTH)
    elif opname == "uint_mul_high":
        # zero-extend args to 2*INTEGER_WIDTH bit, then multiply and extract
        # highest INTEGER_WIDTH bits
        zarg0 = z3.ZeroExt(INTEGER_WIDTH, arg0)
        zarg1 = z3.ZeroExt(INTEGER_WIDTH, arg1)
        result = z3.Extract(INTEGER_WIDTH * 2 - 1, INTEGER_WIDTH, zarg0 * zarg1)
    elif opname == "int_pydiv":
        valid_if = arg1 != 0
        r = arg0 / arg1
        psubx = r * arg1 - arg0
        result = r + (z3.If(arg1 < 0, psubx, -psubx) >> (INTEGER_WIDTH - 1))
    elif opname == "int_pymod":
        valid_if = arg1 != 0
        r = arg0 % arg1
        result = r + (arg1 & z3.If(arg1 < 0, -r, r) >> (INTEGER_WIDTH - 1))
    elif opname == "int_is_true":
        result = cond(arg0 != FALSEBV)
    elif opname == "int_is_zero":
        result = cond(arg0 == FALSEBV)
    elif opname == "int_neg":
        result = -arg0
    elif opname == "int_invert":
        result = ~arg0
    else:
        assert 0, "unknown operation " + opname
    return result, valid_if

def cond(z3expr):
    """ helper function to turn a Z3 boolean result z3expr into a 1 or 0
    bitvector, using z3.If """
    return z3.If(z3expr, TRUEBV, FALSEBV)

We map the semantics of a PyPy JIT operation to Z3 with the z3_expression function. It takes the name of a JIT operation and its two (or one) arguments and turns them into a pair of Z3 formulas, result and valid_if. The resulting formulas are constructed with the operator overloading of Z3 variables/formulas.

The first element of the returned pair, result, represents the result of performing the operation. The second, valid_if, is a bool that represents a condition that needs to be True in order for the result of the operation to be defined. E.g. int_pydiv(a, b) is only valid if b != 0. Most operations are always valid, so they return True as that condition (we'll ignore valid_if for a bit, but it will become more relevant further down in the post).
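The int_pydiv and int_pymod formulas are the least obvious ones, so here is a small sanity check of them in plain Python. It is a sketch under two assumptions that are mine, not statements from the Z3 encoding above: that the underlying division and remainder truncate towards zero like a machine divide (modelled by the hypothetical helpers c_div and c_mod), and that small Python integers can stand in for 64-bit ones because nothing overflows in the tested range:

```python
INTEGER_WIDTH = 64


def c_div(a, b):
    """Division truncating towards zero (assumed machine behaviour)."""
    quotient = abs(a) // abs(b)
    return -quotient if (a < 0) != (b < 0) else quotient


def c_mod(a, b):
    """Remainder matching c_div; it takes the sign of the dividend."""
    return a - c_div(a, b) * b


def int_pydiv(a, b):
    r = c_div(a, b)
    psubx = r * b - a
    # The shift yields -1 when a correction is needed and 0 otherwise.
    return r + ((psubx if b < 0 else -psubx) >> (INTEGER_WIDTH - 1))


def int_pymod(a, b):
    r = c_mod(a, b)
    return r + (b & ((-r if b < 0 else r) >> (INTEGER_WIDTH - 1)))


mismatches = [
    (a, b)
    for a in range(-20, 21)
    for b in range(-20, 21)
    if b != 0  # plays the role of valid_if
    and (int_pydiv(a, b) != a // b or int_pymod(a, b) != a % b)
]
print(mismatches)
```

An empty list means the formulas agree with Python's floor division and modulo on every tested pair with a nonzero divisor.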

We can define a helper function to prove things by finding counterexamples:

def prove(cond):
    """ Try to prove a condition cond by searching for counterexamples of its
    negation. """
    z3res = solver.check(z3.Not(cond))
    if z3res == z3.unsat:
        return True
    elif z3res == z3.unknown: # eg on timeout
        return False
    elif z3res == z3.sat:
        return False
    assert 0, "should be unreachable"

Finding rewrite rules

Now we can start finding our first rewrite rules, following the first pattern op(x, x) -> x. We do this by iterating over all the supported binary operation names, getting the z3 expression for op(x, x) and then asking Z3 to prove op(x, x) == x.

for opname in opnames2:
    result, valid_if = z3_expression(opname, xvar, xvar)
    if prove(result == xvar):
        print(f"{opname}(x, x) -> x")

This yields the simplifications:

int_and(x, x) -> x
int_or(x, x) -> x

Synthesizing constants

Supporting the next patterns is harder: op(x, x) == c1, op(x, c1) == x, and op(c1, x) == x. We don't know which constants to pick to try to get Z3 to prove the equality. We could iterate over common constants like 0, 1, MAXINT, etc, or even over all the 256 values for a bitvector of length 8. However, we will instead ask Z3 to find the constants for us too.

This can be done by using quantifiers, in this case z3.ForAll. The query we pose to Z3 is "does there exist a constant c1 such that for all x the following is true: op(x, c1) == x?" Note that the constant c1 is not necessarily unique, there could be many of them. We generate several matching constants, and add that they must be different to the condition of the second and further queries.
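To make the shape of this exists/forall query concrete, here is what it looks like when answered by brute force on 8-bit integers: the outer loop plays the role of "exists c1", the inner all() the role of "for all x". This is a plain Python sketch for illustration, with a made-up OPS table covering only a few operations:

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1
ALL_VALUES = range(1 << WIDTH)

OPS = {
    "int_add": lambda a, b: (a + b) & MASK,
    "int_sub": lambda a, b: (a - b) & MASK,
    "int_mul": lambda a, b: (a * b) & MASK,
    "int_and": lambda a, b: a & b,
    "int_or": lambda a, b: a | b,
    "int_xor": lambda a, b: a ^ b,
}


def to_signed(value):
    """Read an 8-bit pattern as a two's-complement number."""
    return value - (1 << WIDTH) if value >> (WIDTH - 1) else value


def find_constants(op):
    """All constants c such that op(x, c) == x holds for every 8-bit x."""
    return [
        to_signed(c)
        for c in ALL_VALUES                           # exists c ...
        if all(op(x, c) == x for x in ALL_VALUES)     # ... for all x
    ]


found = {name: find_constants(op) for name, op in OPS.items()}
for name, constants in found.items():
    for c in constants:
        print(f"{name}(x, {c}) -> x")
```

Z3 answers the same kind of question symbolically for 64-bit integers, where enumerating all values is out of the question.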

We can express this in a helper function:

def find_constant(z3expr, number_of_results=5):
    condition = z3.ForAll(
        [xvar],
        z3expr
    )
    for i in range(number_of_results):
        checkres = solver.check(condition)
        if checkres == z3.sat:
            # if a solver check succeeds, we can ask for a model, which is
            # concrete values for the variables constvar
            model = solver.model()
            const = model[constvar].as_signed_long()
            yield const
            # make sure we don't generate the same constant again on the
            # next call
            condition = z3.And(constvar != const, condition)
        else:
            # no (more) constants found
            break

We can use this new function for the three mentioned patterns:

# try to find constants for op(x, x) == c
for opname in opnames2:
    result, valid_if = z3_expression(opname, xvar, xvar)
    for const in find_constant(result == constvar):
        print(f"{opname}(x, x) -> {const}")

# try to find constants for op(x, c) == x and op(c, x) == x
for opname in opnames2:
    result, valid_if = z3_expression(opname, xvar, constvar)
    for const in find_constant(result == xvar):
        print(f"{opname}(x, {const}) -> x")
    result, valid_if = z3_expression(opname, constvar, xvar)
    for const in find_constant(result == xvar):
        print(f"{opname}({const}, x) -> x")
# this code is not quite correct, we'll correct it later

Together this yields the following new simplifications:

# careful, these are not all correct!
int_sub(x, x) -> 0
int_xor(x, x) -> 0
int_eq(x, x) -> 1
int_ne(x, x) -> 0
int_lt(x, x) -> 0
int_le(x, x) -> 1
int_gt(x, x) -> 0
int_ge(x, x) -> 1
uint_lt(x, x) -> 0
uint_le(x, x) -> 1
uint_gt(x, x) -> 0
uint_ge(x, x) -> 1
uint_rshift(x, x) -> 0
int_pymod(x, x) -> 0
int_add(x, 0) -> x
int_add(0, x) -> x
int_sub(x, 0) -> x
int_mul(x, 1) -> x
int_mul(1, x) -> x
int_and(x, -1) -> x
int_and(-1, x) -> x
int_or(x, 0) -> x
int_or(0, x) -> x
int_xor(x, 0) -> x
int_xor(0, x) -> x
int_lshift(x, 0) -> x
int_rshift(x, 0) -> x
uint_rshift(x, 0) -> x
int_pydiv(x, 1) -> x
int_pymod(x, 0) -> x

Most of these look good at first glance, but the last one reveals a problem: we've been ignoring the valid_if expression up to now. We can stop doing that by changing the code like this, which adds z3.And(valid_if, ...) to the argument of the calls to find_constant:

# try to find constants for op(x, x) == c, op(x, c) == x and op(c, x) == x
for opname in opnames2:
    result, valid_if = z3_expression(opname, xvar, xvar)
    for const in find_constant(z3.And(valid_if, result == constvar)):
        print(f"{opname}(x, x) -> {const}")

# try to find constants for op(x, c) == x and op(c, x) == x
for opname in opnames2:
    result, valid_if = z3_expression(opname, xvar, constvar)
    for const in find_constant(z3.And(result == xvar, valid_if)):
        print(f"{opname}(x, {const}) -> x")
    result, valid_if = z3_expression(opname, constvar, xvar)
    for const in find_constant(z3.And(result == xvar, valid_if)):
        print(f"{opname}({const}, x) -> x")

And we get this list instead:

int_sub(x, x) -> 0
int_xor(x, x) -> 0
int_eq(x, x) -> 1
int_ne(x, x) -> 0
int_lt(x, x) -> 0
int_le(x, x) -> 1
int_gt(x, x) -> 0
int_ge(x, x) -> 1
uint_lt(x, x) -> 0
uint_le(x, x) -> 1
uint_gt(x, x) -> 0
uint_ge(x, x) -> 1
int_add(x, 0) -> x
int_add(0, x) -> x
int_sub(x, 0) -> x
int_mul(x, 1) -> x
int_mul(1, x) -> x
int_and(x, -1) -> x
int_and(-1, x) -> x
int_or(x, 0) -> x
int_or(0, x) -> x
int_xor(x, 0) -> x
int_xor(0, x) -> x
int_lshift(x, 0) -> x
int_rshift(x, 0) -> x
uint_rshift(x, 0) -> x
int_pydiv(x, 1) -> x

Synthesizing two constants

For the patterns op(x, c1) == c2 and op(c1, x) == c2 we need to synthesize two constants. We can again write a helper method for that:

def find_2consts(z3expr, number_of_results=5):
    condition = z3.ForAll(
        [xvar],
        z3expr
    )
    for i in range(number_of_results):
        checkres = solver.check(condition)
        if checkres == z3.sat:
            model = solver.model()
            const = model[constvar].as_signed_long()
            const2 = model[constvar2].as_signed_long()
            yield const, const2
            condition = z3.And(z3.Or(constvar != const, constvar2 != const2), condition)
        else:
            return

And then use it like this:

for opname in opnames2:
    # try to find constants c1, c2 such that op(c1, x) -> c2
    result, valid_if = z3_expression(opname, constvar, xvar)
    consts = find_2consts(z3.And(valid_if, result == constvar2))
    for const, const2 in consts:
        print(f"{opname}({const}, x) -> {const2}")
    # try to find constants c1, c2 such that op(x, c1) -> c2
    result, valid_if = z3_expression(opname, xvar, constvar)
    consts = find_2consts(z3.And(valid_if, result == constvar2))
    for const, const2 in consts:
        print(f"{opname}(x, {const}) -> {const2}")

Which yields some straightforward simplifications:

int_mul(0, x) -> 0
int_mul(x, 0) -> 0
int_and(0, x) -> 0
int_and(x, 0) -> 0
uint_lt(x, 0) -> 0
uint_le(0, x) -> 1
uint_gt(0, x) -> 0
uint_ge(x, 0) -> 1
int_lshift(0, x) -> 0
int_rshift(0, x) -> 0
uint_rshift(0, x) -> 0
uint_mul_high(0, x) -> 0
uint_mul_high(1, x) -> 0
uint_mul_high(x, 0) -> 0
uint_mul_high(x, 1) -> 0
int_pymod(x, 1) -> 0
int_pymod(x, -1) -> 0

A few require a bit more thinking:

int_or(-1, x) -> -1
int_or(x, -1) -> -1

These are true because in two's complement, -1 has all bits set.

The following ones require recognizing that -9223372036854775808 == -2**63 is the most negative signed 64-bit integer, and 9223372036854775807 == 2 ** 63 - 1 is the most positive one:

int_lt(9223372036854775807, x) -> 0
int_lt(x, -9223372036854775808) -> 0
int_le(-9223372036854775808, x) -> 1
int_le(x, 9223372036854775807) -> 1
int_gt(-9223372036854775808, x) -> 0
int_gt(x, 9223372036854775807) -> 0
int_ge(9223372036854775807, x) -> 1
int_ge(x, -9223372036854775808) -> 1
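The role of the extreme values is easy to see at a tiny bit width. The following brute-force sketch (plain Python, not Z3, with names made up for this illustration) searches for pairs (c1, c2) with op(x, c1) == c2 for all signed 4-bit x, where the most negative value is -8 and the most positive one is 7:

```python
WIDTH = 4
# All signed values representable in WIDTH bits: -8 .. 7
SIGNED = range(-(1 << (WIDTH - 1)), 1 << (WIDTH - 1))

COMPARISONS = {
    "int_lt": lambda a, b: int(a < b),
    "int_le": lambda a, b: int(a <= b),
    "int_gt": lambda a, b: int(a > b),
    "int_ge": lambda a, b: int(a >= b),
}


def find_two_constants(op):
    """All (c1, c2) such that op(x, c1) == c2 holds for every signed x."""
    return [
        (c1, c2)
        for c1 in SIGNED
        for c2 in (0, 1)  # comparisons only ever produce 0 or 1
        if all(op(x, c1) == c2 for x in SIGNED)
    ]


results = {name: find_two_constants(op) for name, op in COMPARISONS.items()}
for name, pairs in results.items():
    for c1, c2 in pairs:
        print(f"{name}(x, {c1}) -> {c2}")
```

Only the minimum and maximum values show up as c1, mirroring the 64-bit rules of the form op(x, c1) in the list above.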

The following ones are true because the bitpattern for -1 is the largest unsigned number:

uint_lt(-1, x) -> 0
uint_le(x, -1) -> 1
uint_gt(x, -1) -> 0
uint_ge(-1, x) -> 1

Strength Reductions

All the patterns so far only had a variable or a constant on the target of the rewrite. We can also use the machinery to do strength reductions where we generate a single-argument operation op1(x) for input operations op(x, c1) or op(c1, x). To achieve this, we try all combinations of binary and unary operations. (We won't consider strength reductions where a binary operation gets turned into a "cheaper" other binary operation here.)

opnames1 = [
    "int_is_true",
    "int_is_zero",
    "int_neg",
    "int_invert",
]

for opname in opnames2:
    for opname1 in opnames1:
        result, valid_if = z3_expression(opname, xvar, constvar)
        # try to find a constant op(x, c) == g(x)
        result1, valid_if1 = z3_expression(opname1, xvar)
        consts = find_constant(z3.And(valid_if, valid_if1, result == result1))
        for const in consts:
            print(f"{opname}(x, {const}) -> {opname1}(x)")

        # try to find a constant op(c, x) == g(x)
        result, valid_if = z3_expression(opname, constvar, xvar)
        result1, valid_if1 = z3_expression(opname1, xvar)
        consts = find_constant(z3.And(valid_if, valid_if1, result == result1))
        for const in consts:
            print(f"{opname}({const}, x) -> {opname1}(x)")

Which yields the following new simplifications:

int_sub(0, x) -> int_neg(x)
int_sub(-1, x) -> int_invert(x)
int_mul(x, -1) -> int_neg(x)
int_mul(-1, x) -> int_neg(x)
int_xor(x, -1) -> int_invert(x)
int_xor(-1, x) -> int_invert(x)
int_eq(x, 0) -> int_is_zero(x)
int_eq(0, x) -> int_is_zero(x)
int_ne(x, 0) -> int_is_true(x)
int_ne(0, x) -> int_is_true(x)
uint_lt(0, x) -> int_is_true(x)
uint_lt(x, 1) -> int_is_zero(x)
uint_le(1, x) -> int_is_true(x)
uint_le(x, 0) -> int_is_zero(x)
uint_gt(x, 0) -> int_is_true(x)
uint_gt(1, x) -> int_is_zero(x)
uint_ge(x, 1) -> int_is_true(x)
uint_ge(0, x) -> int_is_zero(x)
int_pydiv(x, -1) -> int_neg(x)

Conclusions

With very little code we managed to generate a whole lot of local simplifications for integer operations in the IR of PyPy's JIT. The rules discovered that way are "simple", in the sense that they only require looking at a single instruction, and not where the arguments of that instruction came from. They also don't require any knowledge about the properties of the arguments of the instructions (e.g. that they are positive).

The rewrites in this post have mostly been in PyPy's JIT already. But now we mechanically confirmed that they are correct. I've also added the remaining useful looking ones, in particular int_eq(x, 0) -> int_is_zero(x) etc.

If we wanted to scale this approach up, we would have to work much harder! There are a bunch of problems that come with generalizing the approach to looking at sequences of instructions:

  • Combinatorial explosion: if we look at sequences of instructions, we very quickly get a combinatorial explosion and it becomes intractable to try all combinations.

  • Finding non-minimal patterns: Some complicated simplifications can be instances of simpler ones. For example, because int_add(x, 0) -> x, it's also true that int_add(int_sub(x, y), 0) -> int_sub(x, y). If we simply generate all possible sequences, we will find the latter simplification rule, which we would usually not care about.

  • Unclear usefulness: if we simply generate all rewrites up to a certain number of instructions, we will get a lot of patterns that are useless in the sense that they typically aren't found in realistic programs. It would be much better to somehow focus on the patterns that real benchmarks are using.

In the next blog post I'll discuss an alternative approach to simply generating all possible sequences of instructions, that tries to address these problems. This works by analyzing the real traces of benchmarks and mining those for inefficiencies, which only shows problems that occur in actual programs.

Sources

I've been re-reading a lot of blog posts from John's blog:

but also papers:

Another of my favorite blogs in the last year or two has been Philip Zucker's blog, which has lots of excellent posts about/using Z3.

Categories: FLOSS Project Planets

Python Morsels: What are lists in Python?

Fri, 2024-07-12 11:09

Lists are used to store and manipulate an ordered collection of things.

Table of contents

  1. Lists are ordered collections
  2. Containment checking
  3. Length
  4. Modifying the contents of a list
  5. Indexing: looking up items by their position
  6. Lists are the first data structure to learn

Lists are ordered collections

This is a list:

>>> colors = ["purple", "green", "blue", "yellow"]

We can prove that to ourselves by passing that object to Python's built-in type function:

>>> type(colors)
<class 'list'>

Lists are ordered collections of things.

We can create a new list by using square brackets ([]), and inside those square brackets, we put each of the items that we'd like our list to contain, separated by commas:

>>> numbers = [2, 1, 3, 4, 7, 11]
>>> numbers
[2, 1, 3, 4, 7, 11]

Lists can contain any type of object. Each item in a list doesn't need to be of the same type, but in practice, they typically are.

So we might refer to this as a list of strings:

>>> colors = ["purple", "green", "blue", "yellow"]

While this is a list of numbers:

>>> numbers = [2, 1, 3, 4, 7, 11]

Containment checking

We can check whether a …

Read the full article: https://www.pythonmorsels.com/what-are-lists/
Categories: FLOSS Project Planets

Peter Bengtsson: Converting Celsius to Fahrenheit with Python

Fri, 2024-07-12 11:08
Starting at 4°C, add +12 to the Celsius and mirror the number to get the Fahrenheit number.
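
A minimal sketch of one reading of this trick, assuming "mirror" means reversing the digits of the zero-padded two-digit Celsius value. The helper names mirror and exact_fahrenheit are hypothetical, and the exact conversion F = C × 9/5 + 32 is included only for comparison:

```python
def mirror(celsius):
    """Reverse the digits of the two-digit Celsius value, e.g. 4 -> "04" -> 40."""
    return int(f"{celsius:02d}"[::-1])


def exact_fahrenheit(celsius):
    """Standard Celsius-to-Fahrenheit conversion."""
    return celsius * 9 / 5 + 32


# Start at 4 and add 12 each time: 4, 16, 28.
for celsius in (4, 16, 28):
    approx = mirror(celsius)
    exact = exact_fahrenheit(celsius)
    print(f"{celsius}°C ~= {approx}°F (exact: {exact:.1f}°F)")
```

Under this reading, the mirrored value stays within one degree of the exact result for 4, 16, and 28°C. It no longer works at 40°C, where mirroring gives 4 but the exact value is 104°F.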
Categories: FLOSS Project Planets

Python Software Foundation: Announcing Our New PyPI Support Specialist!

Fri, 2024-07-12 09:12

We are thrilled to announce that our first-ever search for a dedicated PyPI Support Specialist has concluded with the hire of Maria Ashna, the newest member of the Python Software Foundation (PSF) staff. Reporting to Ee Durbin, Director of Infrastructure, Maria joins us from a background in academic research, technical consulting, and theatre.

Maria will help the PSF to support one of our most critical services, the Python Package Index (PyPI). Over the past 23 years, PyPI has seen essentially exponential growth in traffic and users, relying for the most part on volunteers to support it. With the addition of requirements to keep all Python maintainers and users safe, our support load has outstretched our support resources for some time now. The Python Software Foundation committed to hiring to increase this capacity in April and we’re excited to have Maria on board to begin providing crucially needed support.


From Maria, “I am a firm believer in democratizing tech. The Open Source community is the lifeblood of such democratization, which is why I am excited to be part of PSF and to serve this community.”

As you see Maria around the PyPI support inbox, issue tracker, and discuss.python.org in the future, we hope that you’ll extend a warm welcome! We’re eager to get her up and running to reduce the stress that users have been experiencing around PyPI support and further our work to improve and extend PyPI sustainably.

Categories: FLOSS Project Planets
