Planet Python

Planet Python - http://planetpython.org/

Podcast.__init__: Experimenting With Reinforcement Learning Using MushroomRL

Sun, 2021-09-19 16:50
Summary

Reinforcement learning is a branch of machine learning and AI that has a lot of promise for applications that need to evolve with changes to their inputs. To support the research happening in the field, including applications for robotics, Carlo D’Eramo and Davide Tateo created MushroomRL. In this episode they share how they have designed the project to be easy to work with, so that students can use it in their study, as well as extensible so that it can be used by businesses and industry professionals. They also discuss the strengths of reinforcement learning, how to design problems that can leverage its capabilities, and how to get started with MushroomRL for your own work.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Your host as usual is Tobias Macey and today I’m interviewing Davide Tateo and Carlo D’Eramo about MushroomRL, a library for building reinforcement learning experiments
Interview
  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what reinforcement learning is and how it differs from other approaches for machine learning?
  • What are some example use cases where reinforcement learning might be necessary?
  • Can you describe what MushroomRL is and the story behind it?
    • Who are the target users of the project?
    • What are its main goals?
  • What are your suggestions to other developers for implementing a successful library?
  • What are some of the core concepts that researchers and/or engineers need to understand to be able to effectively use reinforcement learning techniques?
  • Can you describe how MushroomRL is architected?
    • How have the goals and design of the project changed or evolved since you began working on it?
  • What is the workflow for building and executing an experiment with MushroomRL?
    • How do you track the states and outcomes of experiments?
  • What are some of the considerations involved in designing an environment and reward functions for an agent to interact with?
  • What are some of the open questions that are being explored in reinforcement learning?
  • How are you using MushroomRL in your own research?
  • What are the most interesting, innovative, or unexpected ways that you have seen MushroomRL used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on MushroomRL?
  • When is MushroomRL the wrong choice?
  • What do you have planned for the future of MushroomRL?
  • How can the open-source community contribute to MushroomRL?
  • What kind of support are you willing to provide to users?
Keep In Touch
Picks
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
Links

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

Categories: FLOSS Project Planets

Łukasz Langa: Weekly Report 2021, September 13 - 19

Sun, 2021-09-19 16:20

This week in numbers: closed 8 issues, authored 1 PR, closed 49 PRs, and reviewed 6. No highlights this time since I badly hoped to be able to squeeze in some work on Saturday but that turned out not to be possible (it’s birthday season in my family).


Mike Driscoll: Python 101 – An Intro to Jupyter Notebook

Sun, 2021-09-19 08:30

The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain code, equations, visualizations, and formatted text. By default, Jupyter Notebook runs Python out of the box. Additionally, Jupyter Notebook supports many other programming languages via extensions. You can use the Jupyter Notebook for data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more!

In this chapter, you will learn about the following:

  • Installing The Jupyter Notebook
  • Creating a Notebook
  • Adding Content
  • Adding Markdown Content
  • Adding an Extension
  • Exporting Notebooks to Other Formats

This chapter is not meant to be a comprehensive tutorial on the Jupyter Notebook. Instead, it will show you the basics of how to use a Notebook and why it might be useful. If you are intrigued by this technology, you might want to check out my book on the topic, Jupyter Notebook 101.

Let’s get started!

Installing The Jupyter Notebook

Jupyter Notebook does not come with Python. You will need to install it using pip. If you are using Anaconda instead of the official Python, then Jupyter Notebook comes with Anaconda, pre-installed.

Here is how you would install Jupyter Notebook with pip:

python3 -m pip install jupyter

When you install Jupyter Notebook, it will install a lot of other dependencies. You may want to install Jupyter Notebook into a Python virtual environment. See Chapter 21 for more information.

Once the installation is done, you are ready to create a Jupyter Notebook!

Creating a Notebook

Creating a Notebook is a fundamental concept. Jupyter Notebook operates through its own server, which comes included with your installation. To be able to do anything with Jupyter, you must first launch this Jupyter Notebook Server by running the following command:

jupyter notebook

This command will either launch your default browser or open up a new tab, depending on whether your browser is already running or not. In both cases you will soon see a new tab that points to the following URL: http://localhost:8888/tree. Your browser should load up to a page that looks like this:

Jupyter Server

Here you can create a Notebook by clicking the New button on the right:

Creating a Jupyter Notebook

You can create a Notebook with this menu as well as a text file, a folder, and an in-browser terminal session. For now, you should choose the Python 3 option.

Having done that, a new tab will open with your new Notebook loaded:

A New Notebook

Now let’s learn about how to interact with the Notebook!

Naming Your Notebook

The top of the Notebook says that it is Untitled. To fix that, all you need to do is click on the word Untitled and an in-browser dialog will appear:

Renaming a Notebook

When you rename the Notebook, it will also rename the file that the Notebook is saved to so that it matches the name you gave it. You can name this Notebook “Hello World”.

Running Cells

Jupyter Notebook cells are where the magic happens. This is where you can create content and interactive code. By default, the Notebook will create cells in code mode. That means that it will allow you to write code in whichever kernel you chose when you created the Notebook. A kernel refers to the programming language that you chose when creating your Jupyter Notebook. You chose Python 3 when you created this Notebook, so you can write Python 3 code in the cell.

Right now the cell is empty, so it doesn’t do anything at all. Let’s add some code to change that:

print('Hello from Python!')

To execute the contents of a cell, you need to run that cell. After selecting the cell, there are three ways of running it:

  • Clicking the Run button in the row of buttons along the top
  • Navigating to Cell -> Run Cells from the Notebook menu
  • Using the keyboard shortcut: Shift+Enter

When you run this cell, the output should look like this:

Running a Jupyter Notebook Cell

Jupyter Notebook cells remember the order in which they are run. If you run the cells out of order, you may end up with errors because a cell depends on something, such as an import, that has not been run yet. However, when you do run the cells in order, you can write imports in one cell and still use those imports in later cells. Notebooks make it simple to keep logical pieces of the code together. In fact, you can put explanatory cells, graphs and more between the code cells and the code cells will still share state with each other.

When you run a cell, there are some brackets next to the cell that will fill in with a number. This indicates the order in which the cells were run. In this example, when you ran the first cell, the brackets filled in with the number one. Because all code cells in a notebook operate on the same global namespace, it is important to be able to keep track of the order of execution of your code cells.
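
The shared namespace described above can be imitated outside of Jupyter. The following is only a rough sketch, assuming each cell is a string of source code that gets executed against one shared dictionary. The cells mapping and the run_cell helper are made up for illustration and are not part of Jupyter:

```python
# One dictionary plays the role of the Notebook's global namespace.
namespace = {}

# Hypothetical "cells": each one is just a string of Python source.
cells = {
    "imports": "import math",
    "compute": "area = math.pi * 2 ** 2",
}


def run_cell(name):
    """Execute a cell's source against the shared namespace."""
    exec(cells[name], namespace)


# Running the cells out of order fails: math has not been imported yet.
try:
    run_cell("compute")
except NameError as error:
    out_of_order_error = str(error)

# Running them in order works, because the import persists between cells.
run_cell("imports")
run_cell("compute")
print(out_of_order_error)
print(round(namespace["area"], 2))
```

The first attempt raises a NameError, while the in-order run succeeds because the import from the earlier cell is still available to the later one.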

Learning About the Menus

There is a menu in the Jupyter Notebook that you can use to work with your Notebook. The menu runs along the top of the Notebook. Here are your menu options:

  • File
  • Edit
  • View
  • Insert
  • Cell
  • Kernel
  • Widgets
  • Help

Let’s go over each of these menus. You don’t need to know about every single option in these menus to start working with Jupyter, so this will be a high-level overview.

The File menu is used for opening a Notebook or creating a new one. You can also rename the Notebook here. One of the nice features of Notebooks is that you can create Checkpoints. Checkpoints allow you to roll back to a previous state. To create a Checkpoint, go to the File menu and choose the Save and Checkpoint option.

The Edit menu contains your regular cut, copy, and paste commands, which you can use on a cell level. You can also delete, split, or merge cells from here. Finally, you can use this menu to reorder the cells.

You will find that some of the options here are grayed out. An item is grayed out when that option does not apply to the currently selected cell in your Notebook. For example, if you selected a code cell, you won’t be able to insert an image. Try changing the cell type to Markdown to see how the options change.

The View menu is used for toggling the visibility of the header and the toolbar. This is also where you would go to toggle Line Numbers on or off.

The Insert menu is used for inserting cells above or below the currently selected cell.

The Cell menu is useful for running one cell, a group of cells or everything in the Notebook! You can change the cell type here, but you will probably find that the toolbar is more intuitive to use than the menu for that sort of thing.

Another useful feature of the Cell menu is that you can use it to clear the cell’s output. A lot of people share their Notebooks with others. If you want to do that, it can be useful to clear out the outputs of the cells so that your friends or colleagues can run the cells themselves and discover how they work.
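
To see what clearing the outputs amounts to, it can help to look at the Notebook file itself, which is stored as JSON. The sketch below assumes the version 4 Notebook format, where each code cell carries an outputs list and an execution_count. The clear_outputs helper is hypothetical and only mimics the effect of the menu option, which remains the supported way to do this:

```python
import copy

# A tiny hand-written Notebook in the (assumed) nbformat 4 layout.
notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Hello World"]},
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "outputs": [
                {
                    "name": "stdout",
                    "output_type": "stream",
                    "text": ["Hello from Python!\n"],
                }
            ],
            "source": ["print('Hello from Python!')"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}


def clear_outputs(nb):
    """Return a copy of the Notebook with code cell outputs removed."""
    cleared = copy.deepcopy(nb)
    for cell in cleared["cells"]:
        if cell["cell_type"] == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleared


cleared_notebook = clear_outputs(notebook)
print(cleared_notebook["cells"][1]["outputs"])
```

The source of each cell is left alone, so whoever receives the Notebook can still run every cell and produce the outputs themselves.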

The Kernel menu is for working with the Kernel itself. The Kernel refers to the programming language plugin. You will occasionally need to restart, reconnect or shut down your kernel. You can also change which kernel is running in your Notebook.

You won’t use the Kernel menu all that often. However, when you need to do some debugging in Jupyter Notebook, it can be handy to restart the Kernel rather than restarting the entire server.

The Widgets menu is for clearing and saving widget state. A Widget is a way to add dynamic content to your Notebook, like a button or slider. These are written in JavaScript under the covers.

The last menu is the Help menu. This is where you will go to learn about the special keyboard shortcuts for your Notebook. It also provides a user interface tour and plenty of reference material that you can use to learn how to better interact with your Notebook.

Now let’s learn how to create content in your Notebook!

Adding Content

You can choose between two primary types of content for your Notebooks:

  • Code
  • Markdown

There are technically two other cell types you can choose. One is Raw NBConvert, which is only intended for special use cases when using the nbconvert command line tool. This tool is used to convert your Notebook to other formats, such as PDF.

The other type is Heading, which actually isn’t used anymore. If you choose this cell type, you will receive the following dialog:

Heading Types

You have already seen how to use the default cell type, Code. So the next section will focus on Markdown.

Creating Markdown Content

The Markdown cell type allows you to format your text. You can create headings, add images and links, and format your text with italics, bold, etc.

This chapter won’t cover everything you can do with Markdown, but it will teach you the basics. Let’s take a look at how to do a few different things!

Formatting Your Text

If you would like to add italics to your text, you can use single underscores or single asterisks. If you would rather bold your text, then you double the number of asterisks or underscores.

Here are a couple of examples:

You can italicize like *this* or _this_
Or bold like **this** or __this__

Try setting your Notebook cell to Markdown and adding the text above to it. You will then see that the Notebook is automatically formatting the text for you:

Formatting text

When you run the cell, it will format the text nicely:

Formatted Text (after run)

If you need to edit the cell again, you can double-click the cell and it will go back into editing mode.

Now let’s find out how to add heading levels!

Using Headings

Headings are good for creating sections in your Notebook, just like they are when you are creating a web page or a document in Microsoft Word. To create headings in Markdown, you can use one or more # signs.

Here are some examples:

# Heading 1
## Heading 2
### Heading 3
#### Heading 4

If you add the code above to a Markdown cell in your Notebook, it will look like this:

Markdown Headings

You can see that the Notebook is already generating a type of preview for you here by shrinking the text slightly for each heading level.

When you run the cell, you will see something like the following:

Markdown Headings (after running)

As you can see, Jupyter nicely formats your text as different-level headings that can be helpful to structure your text.

Adding a Listing

Creating a listing or bullet points is pretty straightforward in Markdown. To create a listing, you add an asterisk (*) or a dash (-) to the beginning of the line.

Here is an example:

* List item 1
    * sub item 1
    * sub item 2
* List item 2
* List item 3

Let’s add this code to your Notebook:

Adding Listings in Markdown

You don’t really get a preview of listings this time, so let’s run the cell to see what you get:

Listings in Markdown (after run)

That looks pretty good! Now let’s find out how to get syntax highlighting for your code!

Highlighting Code Syntax

Notebooks already allow you to show and run code and they even show syntax highlighting. However, this only works for the Kernels or languages installed in Jupyter Notebook.

If you want to show code for another language that is not installed or if you want to show syntax highlighting without giving the user the ability to run the code, then you can use Markdown for that.

To create a code block in Markdown, you would need to use 3 backticks followed by the language that you want to show. If you want to do inline code highlighting, then surround the code snippet with single backticks. However, keep in mind that inline code doesn’t support syntax highlighting.

Here are two examples in the Notebook:

Syntax Highlighting in Markdown

When you run the cell, the Notebook transforms the Markdown into the following:

Syntax Highlighting (after run)

Here you can see how the code now has syntax highlighting.

Now let’s learn how to generate a hyperlink!

Creating a Hyperlink

Creating hyperlinks in Markdown is quite easy. The syntax is as follows:

[text](URL)

So if you wanted to link to Google, you would do this:

[Google](https://www.google.com)

Here is what the code looks like in the Notebook:

Hyperlink Markdown

When you run the cell, you will see the Markdown turned into a regular hyperlink:

Hyperlink Markdown (after run)

As you can see, the Markdown has been transformed into a traditional hyperlink.

Let’s find out about Jupyter extensions next!

Adding an Extension

Jupyter Notebook has lots of functionality right out of the box. If you need anything beyond that, you can also add new features through extensions from a large extension ecosystem. There are four different types of extensions available:

  • Kernel
  • IPython kernel
  • Notebook
  • Notebook server

Most of the time, you will want to install a Notebook extension.

An extension for Jupyter Notebook is technically a JavaScript module that will be loaded in the Notebook’s front-end to add new functionality or make the Notebook look different. If you know JavaScript, you can write your own extension!

If you need to add something new to Jupyter Notebook, you should use Google to see if someone has written something that will work for you. The most popular extension is actually a large set of extensions called jupyter_contrib_nbextensions.

Most good extensions can be installed using pip. For example, to install the one mentioned above, you can run this command:

$ pip install jupyter_contrib_nbextensions

There are a few that are not compatible with pip. In those cases, you can use Jupyter itself to install the extension:

$ jupyter nbextension install NAME_OF_EXTENSION

While this installs the extension for Jupyter to use, it does not make the extension active yet. You will need to enable an extension if you install it using this method before you can use it.

To enable an extension, you need to run the following command:

$ jupyter nbextension enable NAME_OF_EXTENSION

If you installed the extension while you were running Jupyter Notebook, you may need to restart the Kernel or the entire server to be able to use the new extension.

You may want to get the Jupyter NbExtensions Configurator extension to help you manage your extensions. It is a neat extension designed for enabling and disabling other extensions from within your Notebook’s user interface. It also displays the extensions that you have currently installed.

Exporting Notebooks to Other Formats

After you have created an amazing Notebook, you may want to share it with other people who are not as computer savvy as you are. Jupyter Notebook supports converting the Notebooks to other formats:

  • HTML
  • LaTeX
  • PDF
  • RevealJS
  • Markdown
  • ReStructured Text
  • Executable script

You can convert a Notebook using the nbconvert tool that was installed when you originally installed Jupyter Notebook. To use nbconvert, you can do the following:

$ jupyter nbconvert <notebook file> --to <output format>

Let’s say you want to convert your Notebook to PDF. To do that, you would do this:

$ jupyter nbconvert my_notebook.ipynb --to pdf

You will see some output as it converts the Notebook into a PDF. The nbconvert tool will also display any warnings or errors that it encounters during the conversion. If the process finishes successfully, you will have a my_notebook.pdf file in the same folder as the Notebook file.
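
If you would rather drive the conversion from a Python script, one option is to call the same command through the subprocess module. This is a minimal sketch that only reuses the command line shown above. The nbconvert_command and convert_notebook helpers are hypothetical names, and the sketch assumes the jupyter executable is on your PATH:

```python
import subprocess


def nbconvert_command(notebook_path, output_format):
    """Build the argument list for: jupyter nbconvert <notebook> --to <format>."""
    return ["jupyter", "nbconvert", notebook_path, "--to", output_format]


def convert_notebook(notebook_path, output_format):
    """Run the conversion; raises CalledProcessError if nbconvert fails."""
    return subprocess.run(nbconvert_command(notebook_path, output_format), check=True)


# Only build and show the command here; call convert_notebook() to run it.
command = nbconvert_command("my_notebook.ipynb", "pdf")
print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting problems when a Notebook file name contains spaces, such as "Hello World.ipynb".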

The Jupyter Notebook provides a simpler way to convert your Notebooks too. You can do so from the File menu within the Notebook itself. You can choose the Download as option to do the conversion.

Depending on the platform that you are on, you may need to install LaTeX or other dependencies to get certain export formats to work properly.

Wrapping Up

The Jupyter Notebook is a fun way to learn how to use Python or machine learning. It is a great way to organize your data so that you can share it with others. You can use it to create presentations, show your work, and run your code.

In this article, you learned about the following:

  • Installing The Jupyter Notebook
  • Creating a Notebook
  • Adding Content
  • Adding Markdown Content
  • Adding an Extension
  • Exporting Notebooks to Other Formats

You should give Jupyter Notebook a try. It’s a useful coding environment and well worth your time.

The post Python 101 – An Intro to Jupyter Notebook appeared first on Mouse Vs Python.


Mike Driscoll: Merging Dictionaries with the Union Operator

Sat, 2021-09-18 08:30

As a developer, there are times when you may end up with two or more dictionaries that you need to combine into one master dictionary. There are lots of different ways to merge dictionaries in the Python programming language.

In this tutorial, you will look at a few of the old ways to merge dictionaries and then look at the latest method that was added in Python 3.9.

Here are the methods you will learn about:

  • Using dict.update()
  • Merging with **
  • Merging with the Union Operator

You will start your journey with the update() method!

Using dict.update()

Python’s dictionary has many different methods. One of those methods can be used to merge two dictionaries together. That method is called update().

Here is an example:

>>> first_dictionary = {"name": "Mike", "occupation": "Python Teacher"}
>>> second_dictionary = {"location": "Iowa", "hobby": "photography"}
>>> first_dictionary.update(second_dictionary)
>>> first_dictionary
{'name': 'Mike', 'occupation': 'Python Teacher', 'location': 'Iowa', 'hobby': 'photography'}

This worked perfectly! The only problem with this method is that it modifies one of the dictionaries. If you want to create a third dictionary without modifying one of the input dictionaries, then you’ll want to check out one of the other merging methods in this article.
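
If you like update() but want to leave both inputs untouched, one common workaround is to update a copy instead. The following is a small sketch of that idea using the same dictionaries as above; merged_dictionary is just an illustrative name:

```python
first_dictionary = {"name": "Mike", "occupation": "Python Teacher"}
second_dictionary = {"location": "Iowa", "hobby": "photography"}

# Copy the first dictionary, then update the copy rather than the original.
merged_dictionary = first_dictionary.copy()
merged_dictionary.update(second_dictionary)

print(merged_dictionary)
print(first_dictionary)  # still has only its original two keys
```

Note that copy() makes a shallow copy, so any nested values are still shared between the original and the merged dictionary.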

You are now ready to learn about using **!

Merging with **

When you use the double-asterisk, it is sometimes called “unpacking”, “expanding” or “splatting” a dictionary. The ** is used in Python with kwargs in functions too.

Here is how you can use the ** to merge two dictionaries:

>>> first_dictionary = {"name": "Mike", "occupation": "Python Teacher"}
>>> second_dictionary = {"location": "Iowa", "hobby": "photography"}
>>> merged_dictionary = {**first_dictionary, **second_dictionary}
>>> merged_dictionary
{'name': 'Mike', 'occupation': 'Python Teacher', 'location': 'Iowa', 'hobby': 'photography'}

This syntax looks a little weird, but it works great!
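
The examples so far use dictionaries with no keys in common. When keys do overlap, the value from the dictionary unpacked last wins, and you can unpack more than two dictionaries at once. Here is a short sketch of both points; the defaults, overrides, and extra dictionaries are invented for illustration:

```python
defaults = {"name": "Mike", "location": "Iowa"}
overrides = {"location": "Texas"}
extra = {"hobby": "photography"}

# Later dictionaries take precedence when the same key appears twice.
merged = {**defaults, **overrides, **extra}

print(merged)
```

The inputs are not modified, so defaults still maps "location" to "Iowa" afterwards.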

Now you are ready to learn about the latest way to merge two dictionaries!

Merging with the Union Operator

Starting in Python 3.9, you can use Python’s union operator, |, to merge dictionaries. You can learn all the nitty-gritty details in PEP 584.

Here is how you can use the union operator to merge two dictionaries:

>>> first_dictionary = {"name": "Mike", "occupation": "Python Teacher"}
>>> second_dictionary = {"location": "Iowa", "hobby": "photography"}
>>> merged_dictionary = first_dictionary | second_dictionary
>>> merged_dictionary
{'name': 'Mike', 'occupation': 'Python Teacher', 'location': 'Iowa', 'hobby': 'photography'}

This is the shortest method yet for merging two dictionaries into one.
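
PEP 584 also added an in-place version of the operator, |=, which behaves like update(). As with the other approaches, the right-hand dictionary wins when a key appears in both. The sketch below requires Python 3.9 or newer and uses made-up values to show both behaviors:

```python
first = {"name": "Mike", "location": "Iowa"}
second = {"location": "Texas", "hobby": "photography"}

# | builds a new dictionary; the right-hand operand wins on shared keys.
merged = first | second

# |= updates the left-hand dictionary in place, like update().
in_place = {"name": "Mike"}
in_place |= {"hobby": "photography"}

print(merged)
print(in_place)
```

Use | when you need to keep the inputs unchanged and |= when modifying the left-hand dictionary is acceptable.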

Wrapping Up

You now know three different methods that you can use to merge multiple dictionaries into one. If you have access to Python 3.9 or greater, you should use the union operator as that is arguably the cleanest looking method of combining dictionaries. However, if you are stuck on an older version of Python, you needn’t despair as you now have two other methods that should work!

The post Merging Dictionaries with the Union Operator appeared first on Mouse Vs Python.


Talk Python to Me: #334: Microsoft Planetary Computer

Sat, 2021-09-18 04:00
On this episode, Rob Emanuele and Tom Augspurger join us to talk about building and running Microsoft's Planetary Computer project. This project is dedicated to providing the data around climate records and the compute necessary to process it, with the mission of helping us all understand climate change better. It combines multiple petabytes of data with a powerful hosted JupyterLab notebook environment to process it.

Links from the show
  • Rob Emanuele on Twitter: @lossyrob (https://twitter.com/lossyrob)
  • Tom Augspurger on Twitter: @TomAugspurger (https://twitter.com/TomAugspurger/)
  • Video of an example walkthrough by Tom if you want to follow along: https://youtu.be/Ow8igbLT5KQ?t=2360
  • Planetary computer: https://planetarycomputer.microsoft.com
  • Applications in public: https://planetarycomputer.microsoft.com/applications
  • Microsoft's Environmental Commitments
    • Carbon negative: https://blogs.microsoft.com/blog/2020/01/16/microsoft-will-be-carbon-negative-by-2030/
    • Report: https://www.microsoft.com/en-us/corporate-responsibility/sustainability/report
  • AI for Earth grants: https://www.microsoft.com/en-us/ai/ai-for-earth-grants
  • Python SDK: https://github.com/microsoft/planetary-computer-sdk-for-python
  • Planetary computer containers: https://github.com/microsoft/planetary-computer-containers
  • IPCC Climate Report: https://www.ipcc.ch/report/ar6/wg1/downloads/report/IPCC_AR6_WGI_SPM.pdf
  • Episode transcripts: https://talkpython.fm/episodes/transcript/334/microsoft-planetary-computer

Stay in touch with us
  • Subscribe on YouTube (for live streams): https://talkpython.fm/youtube
  • Follow Talk Python on Twitter: @talkpython (https://twitter.com/talkpython)
  • Follow Michael on Twitter: @mkennedy (https://twitter.com/mkennedy)

Sponsors
  • Shortcut: https://clubhouse.io/talkpython
  • Talk Python Training: https://talkpython.fm/training
  • AssemblyAI: https://talkpython.fm/assemblyai

Brett Cannon: Unravelling the `async with` statement

Fri, 2021-09-17 22:41

I already covered unravelling the with statement, and async with is not much different. Much like with, the language reference for async with gives an example of the statement already destructured. Based on that and the fact that async with is just the with statement with asynchronous versions of __enter__ and __exit__ (__aenter__ and __aexit__, respectively), I'm just going to jump straight to the unravelled version and keep this post short.

async with a as b:
    c

Example use of async with

becomes:

_enter = type(a).__aenter__
_exit = type(a).__aexit__
b = await _enter(a)

try:
    c
except:
    if not await _exit(a, *sys.exc_info()):
        raise
else:
    await _exit(a, None, None, None)

Example async with unravelled

It's the unravelled with statement, but changed by:

  1. __enter__ to __aenter__.
  2. __exit__ to __aexit__.
  3. All calls to those methods being preceded with await.
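
To check that the two forms really behave the same, here is a small runnable sketch. The Resource class, the log list, and both coroutine names are invented for this illustration. The sketch only exercises the path where the body does not raise:

```python
import asyncio
import sys


class Resource:
    """A hypothetical async context manager that records what happens."""

    def __init__(self, log):
        self.log = log

    async def __aenter__(self):
        self.log.append("enter")
        return "value"

    async def __aexit__(self, exc_type, exc, tb):
        self.log.append("exit")
        return False  # do not swallow exceptions


async def with_statement(log):
    async with Resource(log) as b:
        log.append(f"body {b}")


async def unravelled(log):
    a = Resource(log)
    _enter = type(a).__aenter__
    _exit = type(a).__aexit__
    b = await _enter(a)
    try:
        log.append(f"body {b}")
    except:  # bare except mirrors the unravelled form above
        if not await _exit(a, *sys.exc_info()):
            raise
    else:
        await _exit(a, None, None, None)


statement_log, unravelled_log = [], []
asyncio.run(with_statement(statement_log))
asyncio.run(unravelled(unravelled_log))
print(statement_log)
print(unravelled_log)
```

Both logs record the same sequence of events: entering, running the body with the value returned by __aenter__, and exiting.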

PyPy: Better JIT Support for Auto-Generated Python Code

Fri, 2021-09-17 15:55
Performance Cliffs

A common bad property of many different JIT compilers is that of a "performance cliff": A seemingly reasonable code change, leading to massively reduced performance due to hitting some weird property of the JIT compiler that's not easy to understand for the programmer (e.g. here's a blog post about the fix of a performance cliff when running React on V8). Hitting a performance cliff as a programmer can be intensely frustrating and turn people off from using PyPy altogether. Recently we've been working on trying to remove some of PyPy's performance cliffs, and this post describes one such effort.

The problem showed up in an issue where somebody found the performance of their website using Tornado a lot worse than what various benchmarks suggested. It took some careful digging to figure out what caused the problem: The slow performance was caused by the huge functions that the Tornado templating engine creates. These functions lead the JIT to behave in unproductive ways. In this blog post I'll describe why the problem occurs and how we fixed it.

Problem

After quite a bit of debugging we narrowed down the problem to the following reproducer: If you render a big HTML template (example) using the Tornado templating engine, the template rendering is really not any faster than CPython. A small template doesn't show this behavior, and other parts of Tornado seem to perform well. So we looked into how the templating engine works, and it turns out that the templates are compiled into Python functions. This means that a big template can turn into a really enormous Python function (Python version of the example). For some reason really enormous Python functions aren't handled particularly well by the JIT, and in the next section I'll explain some of the background that's necessary to understand why this happens.

Trace Limits and Inlining

To understand why the problem occurs, it's necessary to understand how PyPy's trace limit and inlining work. The tracing JIT has a maximum trace length built in; the reason for that is some limitation in the compact encoding of traces in the JIT. Another reason is that we don't want to generate arbitrarily large chunks of machine code. Usually, when we hit the trace limit, it is due to inlining. While tracing, the JIT will inline many of the functions called from the outermost one. This is usually good and improves performance greatly. However, inlining can also lead to the trace being too long. If that happens, we will mark a called function as uninlinable. The next time we trace the outer function we won't inline it, leading to a shorter trace, which hopefully fits the trace limit.

In the diagram above we trace a function f, which calls a function g, which is inlined into the trace. The trace ends up being too long, so the JIT disables inlining of g. The next time we try to trace f the trace will contain a call to g instead of inlining it. The trace ends up being not too long, so we can turn it into machine code when tracing finishes.

Now we know enough to understand what the problem with automatically generated code is: sometimes, the outermost function itself doesn't fit the trace limit, without any inlining going on at all. This is usually not the case for normal, hand-written Python functions. However, it can happen for automatically generated Python code, such as the code that the Tornado templating engine produces.

So, what happens when the JIT hits such a huge function? The function is traced until the trace is too long. Then the trace limit stops further tracing. Since nothing was inlined, we cannot make the trace shorter the next time by disabling inlining. Therefore, this happens again and again: the next time we trace the function we run into exactly the same problem. The net effect is that the function is even slowed down: we spend time tracing it, then stop tracing and throw the trace away. That effort is never useful, so the resulting execution can be slower than not using the JIT at all!

Solution

To get out of the endless cycle of useless retracing we first had the idea of simply disabling all code generation for such huge functions, which produce too-long traces even if there is no inlining at all. However, that led to disappointing performance in the example Tornado program, because important parts of the code always remain interpreted.

Instead, our solution is now as follows: After we have hit the trace limit and no inlining has happened so far, we mark the outermost function as a source of huge traces. The next time we trace such a function, we do so in a special mode. In that mode, hitting the trace limit behaves differently: Instead of stopping the tracer and throwing away the trace produced so far, we will use the unfinished trace to produce machine code. This trace corresponds to the first part of the function, but stops at a basically arbitrary point in the middle of the function.

The question is what should happen when execution reaches the end of this unfinished trace. We want to be able to cover more of the function with machine code and therefore need to extend the trace from that point on. But we don't want to do that too eagerly to prevent lots and lots of machine code being generated. To achieve this behaviour we add a guard to the end of the unfinished trace, which will always fail. This has the right behaviour: a failing guard will transfer control to the interpreter, but if it fails often enough, we can patch it to jump to more machine code, that starts from this position. In that way, we can slowly explore the full gigantic function and add all those parts of the control flow graph that are actually commonly executed at runtime.

In the diagram we are trying to trace a huge function f, which leads to hitting the trace limit. However, nothing was inlined into the trace, so disabling inlining won't ensure a successful trace attempt the next time. Instead, we mark f as "huge". This has the effect that when we trace it again and are about to hit the trace limit, we end the trace at an arbitrary point by inserting a guard that always fails.

If this guard failure is executed often enough, we might patch the guard and add a jump to a further part of the function f. This can continue potentially several times, until the trace really hits an end point (for example by closing the loop and jumping back to trace 1, or by returning from f).
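The difference between the two strategies can be sketched with a toy model. This is only an illustration under simplifying assumptions: function and trace sizes are counted in abstract "operations", and the names trace_old and trace_new are hypothetical and do not correspond to anything in PyPy's actual implementation.

```python
# Toy model only: sizes are abstract "operations", and none of these
# names correspond to real PyPy internals.

def trace_old(function_size, trace_limit, attempts):
    """Old behaviour: every attempt on a too-big function is thrown away."""
    compiled_ops = 0
    wasted_ops = 0
    for _ in range(attempts):
        if function_size <= trace_limit:
            compiled_ops = function_size  # the whole function fits in one trace
            break
        wasted_ops += trace_limit  # traced up to the limit, then discarded
    return compiled_ops, wasted_ops


def trace_new(function_size, trace_limit):
    """New behaviour: after one failed attempt, compile the function in chunks."""
    if function_size <= trace_limit:
        return [(0, function_size)], 0
    wasted_ops = trace_limit  # the first attempt still aborts and marks f as "huge"
    chunks = []
    start = 0
    while start < function_size:
        end = min(start + trace_limit, function_size)
        # A chunk that stops short of the function's end would finish with an
        # always-failing guard, later patched to jump to the next chunk.
        chunks.append((start, end))
        start = end
    return chunks, wasted_ops


print(trace_old(2500, 1000, attempts=5))
print(trace_new(2500, 1000))
```

In this model the old strategy never compiles anything for the huge function however often it retries, while the new one wastes a single attempt and then covers the function piece by piece.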

Evaluation

Since this is a performance cliff that we didn't observe in any of our benchmarks ourselves, it's pointless to look at the effect that this improvement has on existing benchmarks – there shouldn't and indeed there isn't any.

Instead, we are going to look at a micro-benchmark that came out of the original bug report, one that simply renders a big artificial Tornado template 200 times. The code of the micro-benchmark can be found here.

All benchmarks were run 10 times in new processes. The means and standard deviations of the benchmark runs are:

Implementation       Time taken (lower is better)
CPython 3.9.5        14.19 ± 0.35s
PyPy3 without JIT    59.48 ± 5.41s
PyPy3 JIT old        14.47 ± 0.35s
PyPy3 JIT new        4.89 ± 0.10s

What we can see is that while the old JIT is very helpful for this micro-benchmark, it only brings the performance up to CPython levels, not providing any extra benefit. The new JIT gives an almost 3x speedup.
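As a quick check of the "almost 3x" figure, the ratios can be computed from the mean timings reported above. This is just arithmetic on those published numbers, not a re-run of the benchmark.

```python
# Mean timings in seconds, copied from the benchmark results reported above.
timings = {
    "CPython 3.9.5": 14.19,
    "PyPy3 without JIT": 59.48,
    "PyPy3 JIT old": 14.47,
    "PyPy3 JIT new": 4.89,
}

baseline = timings["CPython 3.9.5"]
for name, seconds in timings.items():
    # A value above 1 means faster than CPython, below 1 means slower.
    print(f"{name}: {baseline / seconds:.2f}x")
```

The new JIT comes out at roughly 2.9 times the speed of CPython, while the old JIT lands just below 1.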

Another interesting number we can look at is how often the JIT started a trace, and for how many traces we produced actual machine code:

Implementation    Traces Started    Traces sent to backend    Time spent in JIT
PyPy3 JIT old     216               24                        0.65s
PyPy3 JIT new     30                25                        0.06s

Here we can clearly see the problem: The old JIT would try tracing the auto-generated templating code again and again, but would never actually produce any machine code for it, wasting lots of time in the process. The new JIT still traces a few times uselessly, but then eventually converges and stops, having emitted machine code for the paths through the auto-generated Python code that are actually executed.

Related Work

Tim Felgentreff pointed me to the fact that Truffle also has a mechanism to slice huge methods into smaller compilation units (and I am sure other JITs have such mechanisms as well).

Conclusion

In this post we've described a performance cliff in PyPy's JIT, that of really big auto-generated functions which hit the trace limit without inlining, that we still want to generate machine code for. We achieve this by chunking up the trace into several smaller traces, which we compile piece by piece. This is not a super common thing to be happening – otherwise we would have run into and fixed it earlier – but it's still good to have a fix now.

The work described in this post is still a tiny bit experimental, but we will release it as part of the upcoming 3.8 beta release, to get some more experience with it. Please grab a 3.8 release candidate, try it out and let us know your observations, good and bad!

Categories: FLOSS Project Planets

Python Circle: Python easter egg - import this and the joke

Fri, 2021-09-17 15:39
Zen of python, import this, the hidden easter egg with the joke, source code of Zen of python disobey itself
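A minimal sketch of the easter egg, assuming the standard library module this as shipped with CPython, where the text is kept ROT13-encoded in the string this.s. That obfuscated source is commonly read as the joke: the module that prints "Readability counts" is itself deliberately unreadable.

```python
import codecs
import contextlib
import io

# Importing `this` prints the Zen of Python as a side effect; the output is
# captured here only to keep the example quiet.
with contextlib.redirect_stdout(io.StringIO()):
    import this

# The module's source keeps the text ROT13-encoded in the string `this.s`.
zen = codecs.decode(this.s, "rot_13")

print(this.s.splitlines()[0])  # encoded first line
print(zen.splitlines()[0])     # decoded first line
```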
Categories: FLOSS Project Planets

Python Software Foundation: Tereza Iofciu Awarded the PSF Community Service Award for Q1 2021

Fri, 2021-09-17 14:20

 


Tereza Iofciu, Data Science coach, PyLadies Hamburg organizer, and PSF Code of Conduct working group member, has been awarded the Python Software Foundation 2021 Q1 community service award.

RESOLVED, that the Python Software Foundation award the Q1 2021 Community Service Award to Tereza Iofciu. Tereza is a PSF Code of Conduct WG member and has done a wonderful job helping, participating, and driving the Code of Conduct WG discussions. Tereza formed and continues to help organize the PyLadies event in 2021. Tereza is also a member of the newly formed PSF Diversity & Inclusion WG.

We interviewed Tereza to learn more about her inspiration and work with the Python community. Georgi Ker, a close associate of Tereza, also speaks about Tereza.

The Origin Story

Can you tell us about your origin story? Like how you got into tech?


I got into tech quite traditionally, I studied Computer Science in Bucharest, Romania, but I chose that not for a particular love for Informatik. I was good at Math and Physics in high school but I couldn't study those as I didn't want to become a teacher, seeing how teachers were treated in school. 

 

In the year 2000, Computer Science seemed like a thing for the future.

 

After that I kind of went with the flow, and the flow got me to Germany and doing a Ph.D. in Information Retrieval as the field of Data Science was emerging.


After that, I worked as a Data Scientist, Data Engineer, Product Management, Leadership, and now I am teaching (ha! the irony) Data science at the Neuefische Bootcamp.


Involvement with the Python Community and Inspiration

What was your earliest involvement with the Python community?

 

I would say in 2018 I saw on Twitter a friend of mine posting she was looking for a new job where diversity was part of the culture. 

 

Through her, I discovered the PyLadies Berlin meetups and I realized that I was missing such a community in Hamburg. We had lots of meetups in the city (things used to still be in-person back then), but most were talks and networking, and not so much about teaching and learning. 

 

It took a while to set it up but then I started the PyLadies Hamburg that year, which I wrote about here.

You have been a volunteer coordinator and organizer of PyLadies Hamburg. You are also a member of the PSF Code of Conduct WG, and the Diversity & Inclusion WG. This is amazing. What drives and inspires you into volunteering your time and resources in the Python Community?

 

I often felt that a normal day job doesn't fulfill all my needs, one gets paid for work and it is hard for companies to be consistent in providing other goals. Business is business and in the end, things come down to profit. 

 

So one rarely gets the opportunity to be surrounded at work by like-minded people all the time. 

 

I have volunteered in other organizations, but I found that the PyLadies does attract people who, while they are active in it, are very passionate and inspiring about making tech accessible to more than the majority. So in the end PyLadies was also a refuge and an energy top-up. 

 

It is like finding your village in the world! 

 

Tech companies in Germany are still very behind with diversity, and changing that needs all the help it can get. Women and people from underrepresented groups need a space where they can learn and grow and get inspired without invisible glass ceilings.



How has your involvement within the Python community helped your career?

 

Being involved helped my career in several ways - I've discovered that I learn better when I teach, that is I cannot be bothered to learn a new thing when it is just for the sake of me learning it. 

 

This ultimately led to me believing I would succeed in my current role, and thus I took the opportunity. 

 

We've organized a lot of events - meetups, full-day workshops (IoT workshop at PyCon DE 2019), and conferences like Python Pizza Hamburg in 2019 and 2020, and International Women's Day PyLadies over 3 timezones. 

 

One learns a lot from organizing and it can also be lots of fun. Also, I have been in a leadership role since 2019, and part of the job is to inspire people to get out of their comfort zone, present their work, organize workshops, do meetups and this is something that I was already practicing within the community. 

 

And the network, being around inspiring people is inspiring, and in the end, one is part of an inspiration loop - people also come back with stories on how their life got better with PyLadies. 


Impact of Covid in the Python Community


How has Covid affected your work with the Python community and what steps are you taking to push the community forward during this trying time?

 

We moved pretty quickly to remote events, nobody really felt like being responsible for spreading covid and now there is the remote everywhere. 

 

Aside from the fatigue of the pandemic, going remote has greatly made the events accessible to more people, people from other cities, countries, or people who have to take care of other people and wouldn't have been able to travel to a meetup. 

 

We had this year’s workshops with speakers from the US and Canada. This would have not been possible previously.

 

On the PyLadies Hamburg side, we try to keep to the rhythm of monthly events. 

 

And the International Women's Day event became a three timezone event quite randomly, I posted about organizing one event in Hamburg and looking for speakers among the PyLadies organizers, then Lorena Mesa from Chicago saw it and asked if she could do a joint one in Chicago and then I asked her if she knows anyone on the other side of the globe for symmetry, and she said Georgi Ker in Bangkok who said: "of course."

 

This year I also attended for the first time PyCon US and I was part of the panel presenting the Diversity & Inclusion Workgroup, and we were geographically spread all over the world.


Georgi Ker Speaks on Tereza Iofciu's Impact

Georgi Ker, who had the opportunity of working together with Tereza and Lorena Mesa in organizing the online International Women’s Day 2021 event, speaks on Tereza’s impact.


Tereza is everywhere! I don't even know where to start. She was the one who initiated organizing the PyLadies IWD - International Women's Day - event in different time zones. Making the event accessible for more people.
Apart from involvement in the Interim Global Council, she is also one of the PyLadies moderators to ensure that PyLadies stays as a safe environment for everyone.
Tereza is like the guardian of PyLadies and PSF protecting the gates of the Python community caring for people.
We at the Python Software Foundation wish to once again congratulate and celebrate Tereza Iofciu for her amazing contributions to PyLadies and the wider Python community.

Categories: FLOSS Project Planets

STX Next: Python for Data Engineering: Why Do Data Engineers Use Python?

Fri, 2021-09-17 10:20

Python is one of the most popular programming languages worldwide. It often ranks high in surveys—for instance, it claimed the first spot in the Popularity of Programming Language index and came second in the TIOBE index.

Categories: FLOSS Project Planets

Python for Beginners: Add an item to a dictionary in Python

Fri, 2021-09-17 09:27

A dictionary in Python is a data structure that stores data in the form of key-value pairs. The key-value pairs are also called items. The key and value in each item are separated by a colon “:” and each item in the dictionary is separated from the next by a comma “,”. In this article, we will look at different ways to add an item to a dictionary in Python.

Add an item to a dictionary using subscript notation

If we have a dictionary named myDict and a key-value pair with key myKey and value myValue, then we can add the key-value pair to the dictionary using the syntax myDict[myKey] = myValue. This can be done as follows.

myDict = {"name": "PythonForBeginners", "acronym": "PFB"}
print("Original Dictionary is:", myDict)
myDict["niche"] = "programming"
print("Modified Dictionary is:", myDict)

Output:

Original Dictionary is: {'name': 'PythonForBeginners', 'acronym': 'PFB'}
Modified Dictionary is: {'name': 'PythonForBeginners', 'acronym': 'PFB', 'niche': 'programming'}

In the example above, we have added a new key “niche” with value “programming” associated with it.

Remember that if the key of the item which is being added to the dictionary already exists in it, the value for the key will be overwritten with a new value. This can be seen in the following example.

myDict = {'name': 'PythonForBeginners', 'acronym': 'PFB', 'niche': 'programming'}
print("Original Dictionary is:", myDict)
myDict["niche"] = "python programming"
print("Modified Dictionary is:", myDict)

Output:

Original Dictionary is: {'name': 'PythonForBeginners', 'acronym': 'PFB', 'niche': 'programming'}
Modified Dictionary is: {'name': 'PythonForBeginners', 'acronym': 'PFB', 'niche': 'python programming'}

In the example above, the key “niche” already exists in the dictionary with value “programming” associated with it. When we try to add the key-value pair with “niche” as key and “python programming” as associated value, the value associated with “niche” is updated to the new value.
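If overwriting is not what you want, one option is to check for the key first. The following is a small sketch of two ways to do that, using the in operator and the built-in dict.setdefault() method. The variable names stored and added are only illustrative.

```python
myDict = {"name": "PythonForBeginners", "acronym": "PFB", "niche": "programming"}

# Add the item only if the key is not already present.
if "niche" not in myDict:
    myDict["niche"] = "python programming"

# setdefault() does the same check-and-add in a single call and
# returns the value that ends up stored under the key.
stored = myDict.setdefault("niche", "python programming")
added = myDict.setdefault("language", "Python")

print(myDict)
```

Here the existing value for “niche” is left untouched, while the new key “language” is added because it was not present.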

Add an item to a dictionary using the update() method in Python

We can add an item to a dictionary using the update() method. The update() method when invoked on a dictionary, takes a dictionary or an iterable object having key-value pairs as input and adds the items to the dictionary.

We can give a  new dictionary as an input to the update() method and add the items to a given dictionary as follows.

myDict = {"name": "PythonForBeginners", "acronym": "PFB"}
print("Original Dictionary is:", myDict)
myDict.update({'niche': 'programming'})
print("Modified Dictionary is:", myDict)

Output:

Original Dictionary is: {'name': 'PythonForBeginners', 'acronym': 'PFB'}
Modified Dictionary is: {'name': 'PythonForBeginners', 'acronym': 'PFB', 'niche': 'programming'}

We can give a list of tuples having key-value pairs as an input to the update() method and add items to a given dictionary as follows.

myDict = {"name": "PythonForBeginners", "acronym": "PFB"}
print("Original Dictionary is:", myDict)
items = [("niche", "programming")]
myDict.update(items)
print("Modified Dictionary is:", myDict)

Output:

Original Dictionary is: {'name': 'PythonForBeginners', 'acronym': 'PFB'}
Modified Dictionary is: {'name': 'PythonForBeginners', 'acronym': 'PFB', 'niche': 'programming'}

We can also pass key-value pairs as keyword parameters to the update() method to add elements to the dictionary. Here the keys will be used as keyword parameters and values will be assigned as an input to the keyword parameters as follows.

myDict = {"name": "PythonForBeginners", "acronym": "PFB"}
print("Original Dictionary is:", myDict)
myDict.update(niche="programming")
print("Modified Dictionary is:", myDict)

Output:

Original Dictionary is: {'name': 'PythonForBeginners', 'acronym': 'PFB'}
Modified Dictionary is: {'name': 'PythonForBeginners', 'acronym': 'PFB', 'niche': 'programming'}

Add an item to a dictionary using the ** operator in Python

A double asterisk (**) operator is used to pass a variable number of keyword arguments to a function. We can also use the ** operator to add a key-value pair to another dictionary. When we apply the ** operator to a dictionary inside a dictionary literal, it unpacks the dictionary into its key-value pairs. These key-value pairs then become items of the new dictionary.
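Both uses of ** can be seen side by side in the short sketch below. The function describe and the variable names are hypothetical and only serve the illustration.

```python
def describe(**kwargs):
    # kwargs arrives as an ordinary dict of the keyword arguments.
    return kwargs

site = {"name": "PythonForBeginners", "acronym": "PFB"}

# Unpacking in a call: equivalent to describe(name=..., acronym=...).
received = describe(**site)

# Unpacking in a dict literal builds a new dictionary; `site` is untouched.
extended = {**site, "niche": "programming"}

print(received == site)
print(extended)
print(site)
```

Note that unpacking into a dictionary literal creates a new dictionary rather than modifying the original one in place.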

To add an item to a dictionary, we will first create a dictionary with only that item. Then we will use the ** operator to merge the new dictionary and the dictionary to which the item has to be added as follows.

myDict = {"name": "PythonForBeginners", "acronym": "PFB"}
print("Original Dictionary is:", myDict)
newDict = {'niche': 'programming'}
myDict = {**myDict, **newDict}
print("Modified Dictionary is:", myDict)

Output:

Original Dictionary is: {'name': 'PythonForBeginners', 'acronym': 'PFB'}
Modified Dictionary is: {'name': 'PythonForBeginners', 'acronym': 'PFB', 'niche': 'programming'}

Add an item to a dictionary using the __setitem__() method

We can also add an item to the dictionary using the __setitem__() method. The __setitem__() method, when invoked on a dictionary, takes the new key and value as its first and second parameters respectively and adds the key-value pair to the dictionary as follows.

myDict = {"name": "PythonForBeginners", "acronym": "PFB"}
print("Original Dictionary is:", myDict)
myDict.__setitem__('niche', 'programming')
print("Modified Dictionary is:", myDict)

Output:

Original Dictionary is: {'name': 'PythonForBeginners', 'acronym': 'PFB'}
Modified Dictionary is: {'name': 'PythonForBeginners', 'acronym': 'PFB', 'niche': 'programming'}

If the key already exists in the dictionary, the value associated with it is overwritten with the new value. This can be seen in the following example.

myDict = {'name': 'PythonForBeginners', 'acronym': 'PFB', 'niche': 'programming'}
print("Original Dictionary is:", myDict)
myDict.__setitem__('niche', 'python programming')
print("Modified Dictionary is:", myDict)

Output:

Original Dictionary is: {'name': 'PythonForBeginners', 'acronym': 'PFB', 'niche': 'programming'}
Modified Dictionary is: {'name': 'PythonForBeginners', 'acronym': 'PFB', 'niche': 'python programming'}

Conclusion

In this article, we have seen different ways to add an item to a dictionary in python. To read more about dictionaries in python you can read this article on dictionary comprehension or this article on how to merge two dictionaries in python.

The post Add an item to a dictionary in Python appeared first on PythonForBeginners.com.

Categories: FLOSS Project Planets

Mike Driscoll: Python 101 – Importing Modules (Video)

Fri, 2021-09-17 08:51

In this video tutorial, you will learn all about how to import modules using the import and from keywords.

Related Tutorials

The post Python 101 – Importing Modules (Video) appeared first on Mouse Vs Python.

Categories: FLOSS Project Planets

Real Python: The Real Python Podcast – Episode #78: Learning Python Through Illustrated Stories

Fri, 2021-09-17 08:00

Are you a visual learner? Does it help to have programming concepts shared with concrete examples and images? Would you like to see if your child might be interested in programming? This week on the show, we talk with author Shari Eskenas about her books, "A Day in Code - Python: Learn to Code in Python Through an Illustrated Story" and "Learn Python Through Nursery Rhymes & Fairy Tales."

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

Mike Driscoll: Case / Switch Comes to Python in 3.10

Thu, 2021-09-16 08:30

Python 3.10 is adding a new feature called Structural Pattern Matching, which is defined in PEP 634 and has a tutorial on the topic in PEP 636. Structural Pattern Matching brings the case / switch statement to Python. The new syntax goes beyond what some languages use for their case statements.

This tutorial’s aim is to get you acquainted with the new syntax that you can use in Python 3.10. But before you dive into this latest incarnation of Python, let’s review what you could use before 3.10 came out.

Python Before 3.10

Python has always had several solutions that you could use instead of a case or switch statement. A popular example is to use Python’s if–elif–else, as mentioned in this StackOverflow answer. In that answer, it shows the following example:

if x == 'a':
    pass  # Do the thing
elif x == 'b':
    pass  # Do the other thing
if x in 'bc':
    pass  # Fall-through by not using elif, but now the default case includes case 'a'!
elif x in 'xyz':
    pass  # Do yet another thing
else:
    pass  # Do the default

This is a pretty reasonable alternative to using a case statement.

Another common solution that you’ll find on StackOverflow and other websites is to use Python’s dictionary to do something like this:

choices = {'a': 1, 'b': 2}
result = choices.get(key, 'default')

There are other solutions that use lambdas inside of dictionaries or functions inside of dictionaries. These are also valid solutions.
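As a rough sketch of that dictionary-of-callables approach, each case maps to a function or lambda, and a default handler covers unknown keys. The handler names and return strings below are made up for illustration.

```python
def handle_a():
    return "Do the thing"

def handle_b():
    return "Do the other thing"

# Map each case to a callable; a lambda works for one-liners.
handlers = {
    "a": handle_a,
    "b": handle_b,
    "c": lambda: "Do yet another thing",
}

def dispatch(key):
    # Fall back to a default handler when the key has no entry.
    handler = handlers.get(key, lambda: "Do the default")
    return handler()

print(dispatch("a"))
print(dispatch("z"))
```

Because the dictionary stores the callables rather than their results, only the handler for the selected key is actually run.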

Using the if–elif–else is quite possibly the most common and is also usually the most readable solution before the release of Python 3.10.

Getting Started with Structural Pattern Matching

Python’s new structural pattern matching uses two new keywords:

  • match (not switch!)
  • case

To see how to use this code, see the following example that is based on Guido’s tutorial:

>>> status_code = 400
>>> match status_code:
...     case 400:
...         print("bad request")
...     case 200:
...         print("good")
...     case _:
...         print("Something else bad happened")
...
bad request

This code takes the status_code and tells Python to match it against one of the cases. The _ (underscore) pattern is a wildcard that matches anything, so it acts as the default case when none of the earlier cases matched. That last case statement is sometimes called the “fall-through” case.

Combining Literals

You can simplify your case statements a bit by combining the literals that you are comparing against. For example, you might want to check if the subject, status_code, matches any one of several literals. To do that, you would combine the literals with | in a single case, like this: case 400|401|403

Here’s a full example:

>>> status_code = 400
>>> match status_code:
...     case 400|401|403:
...         print("bad request")
...     case 200:
...         print("good")
...     case _:
...         print("Something else bad happened")
...
bad request

Isn’t that cool?

Wrapping Up

Structural Pattern Matching is an exciting new feature that is only available in Python 3.10 and newer. It’s a powerful new feature that has lots of interesting uses. Could those use-cases be solved using Python’s existing features? Probably, but this makes it even easier!

Related Articles

The post Case / Switch Comes to Python in 3.10 appeared first on Mouse Vs Python.

Categories: FLOSS Project Planets

STX Next: Polyglot Programming and the Benefits of Mastering Several Languages

Thu, 2021-09-16 07:05

Why learn one programming language when you can master a few? 

Categories: FLOSS Project Planets

Stack Abuse: Calculating Euclidean Distance with NumPy

Thu, 2021-09-16 06:30

In this guide - we'll take a look at how to calculate the Euclidean distance between two points in Python, using NumPy.

What is Euclidean Distance?

Euclidean distance is a fundamental distance metric pertaining to systems in Euclidean space.

Euclidean space is the classical geometrical space you get familiar with in Math class, typically bound to 3 dimensions. However, it can also be extended to any non-negative integer dimension.

Euclidean distance is the length of the shortest line between two points in Euclidean space.

The name comes from Euclid, who is widely recognized as "the father of geometry", as this was the only space people at the time would typically conceive of. Through time, different types of space have been observed in Physics and Mathematics, such as Affine space, and non-Euclidean spaces and geometry are very unintuitive for our cognitive perception.

In Euclidean space of any dimension, the shortest path between two points is always the straight line between them.

Even so, Euclidean distance isn't always the most useful metric to keep track of when dealing with many dimensions, where distances between points tend to become less informative, and we'll focus on 2D and 3D Euclidean space to calculate the Euclidean distance.

Measuring distance for high-dimensional data is typically done with other distance metrics such as Manhattan distance.

Generally speaking, Euclidean distance has major usage in development of 3D worlds, as well as Machine Learning algorithms that include distance metrics, such as K-Nearest Neighbors. Typically, Euclidean distance will represent how similar two data points are - assuming some clustering based on other data has already been performed.

Mathematical Formula

The mathematical formula for calculating the Euclidean distance between 2 points in 2D space:
$$
d(p,q) = \sqrt[2]{(q_1-p_1)^2 + (q_2-p_2)^2 }
$$
The formula is easily adapted to 3D space, as well as any dimension:
$$
d(p,q) = \sqrt[2]{(q_1-p_1)^2 + (q_2-p_2)^2 + (q_3-p_3)^2 }
$$
For n dimensions, the general formula is:
$$
d(p,q) = \sqrt[2]{(q_1-p_1)^2 + ... + (q_n-p_n)^2 }
$$
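The general formula translates almost directly into plain Python. The following is a minimal sketch using only the standard library; the function name euclidean_distance is our own, and it assumes both points are equal-length sequences of numbers.

```python
import math

def euclidean_distance(p, q):
    # Assumption: p and q are equal-length sequences of numbers.
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return math.sqrt(sum((q_i - p_i) ** 2 for p_i, q_i in zip(p, q)))

print(euclidean_distance((0, 0), (3, 4)))        # 2D
print(euclidean_distance((0, 0, 0), (3, 3, 3)))  # 3D
```

The same function handles any number of dimensions, since it simply sums one squared difference per coordinate.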
A sharp eye may notice the similarity between Euclidean distance and Pythagoras' Theorem:
$$
C^2 = A^2 + B^2
$$

$$
d(p,q)^2 = (q_1-p_1)^2 + (q_2-p_2)^2
$$

There in fact is a relationship between these - Euclidean distance is calculated via Pythagoras' Theorem, given the Cartesian coordinates of two points.
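A small worked example may make the relationship concrete. The sketch below uses two arbitrarily chosen 2D points whose coordinate differences form the classic 3-4-5 right triangle.

```python
import math

p = (1, 2)
q = (4, 6)

# Legs of the right triangle: the coordinate differences along each axis.
a = q[0] - p[0]  # 3
b = q[1] - p[1]  # 4

# Pythagoras: C^2 = A^2 + B^2, so C is the Euclidean distance.
c = math.sqrt(a**2 + b**2)

print(c)
print(math.hypot(a, b))  # standard-library shortcut for the same computation
```

Both lines print 5.0, the length of the hypotenuse and therefore the distance between p and q.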

Because of this, Euclidean distance is sometimes known as Pythagorean distance as well, though the former name is much more well-known.

Note: The two points are vectors, but the output should be a scalar (which is the distance).

We'll be using NumPy to calculate this distance for two points, and the same approach is used for 2D and 3D spaces:

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(0, 0, 0)
ax.scatter(3, 3, 3)
plt.show()

Calculating Euclidean Distance in Python with NumPy

First, we'll need to install the NumPy library:

$ pip install numpy

Now, let's import it and set up our two points, with the Cartesian coordinates as (0, 0, 0) and (3, 3, 3):

import numpy as np

# Initializing the points
point_1 = np.array((0, 0, 0))
point_2 = np.array((3, 3, 3))

Now, instead of performing the calculation manually, let's utilize the helper methods of NumPy to make this even easier!

np.sqrt() and np.sum()

The operations and mathematical functions required to calculate Euclidean Distance are pretty simple: addition, subtraction, squaring, as well as the square root function. Multiple additions can be replaced with a sum, as well:
$$
d(p,q) = \sqrt[2]{(q_1-p_1)^2 + (q_2-p_2)^2 + (q_3-p_3)^2 }
$$

NumPy provides us with a np.sqrt() function, representing the square root function, as well as a np.sum() function, which represents a sum. With these, calculating the Euclidean Distance in Python is simple and intuitive:

# Get the square of the difference of the 2 vectors
square = np.square(point_1 - point_2)
# Get the sum of the square
sum_square = np.sum(square)

This gives us a pretty simple result:

(0-3)^2 + (0-3)^2 + (0-3)^2

Which is equal to 27. All that's left is to get the square root of that number:

# The last step is to get the square root and print the Euclidean distance
distance = np.sqrt(sum_square)
print(distance)

This results in:

5.196152422706632

In true Pythonic spirit, this can be shortened to just a single line:

distance = np.sqrt(np.sum(np.square(point_1 - point_2)))

You can even use Python's built-in pow() and sum() functions together with math.sqrt() instead, though they require a bit more work on the input, which NumPy conveniently abstracts away: pow() only works with scalars, so it has to be applied to each pair of coordinates individually, and it takes the power you're raising the number to as its second argument.

This approach, though, intuitively looks more like the formula we've used before:

from math import sqrt

distance = sqrt(sum(pow(a - b, 2) for a, b in zip(point_1, point_2)))
print(distance)

This also results in:

5.196152422706632

np.linalg.norm()

The np.linalg.norm() function represents a mathematical norm. In essence, a norm of a vector is its length. This length doesn't necessarily have to be the Euclidean distance, and can be other distances as well. Euclidean distance is the L2 norm of the difference between two vectors (sometimes known as the Euclidean norm), and by default the norm() function computes the L2 norm for vectors: the ord parameter defaults to None, which for a 1-D array is equivalent to 2.

If you were to set the ord parameter to some other value p, you'd calculate other p-norms. For instance, the L1 norm of a vector is the Manhattan distance!
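As a brief sketch of how ord changes the result, here are three norms of the same difference vector, computed on the two example points from this guide. The Chebyshev distance (ord=np.inf) is included only as one further illustration.

```python
import numpy as np

point_1 = np.array((0, 0, 0))
point_2 = np.array((3, 3, 3))
diff = point_1 - point_2

manhattan = np.linalg.norm(diff, ord=1)       # sum of absolute differences
euclidean = np.linalg.norm(diff)              # default: L2 norm for vectors
chebyshev = np.linalg.norm(diff, ord=np.inf)  # largest absolute difference

print(manhattan, euclidean, chebyshev)
```

For these points the Manhattan distance is 9.0 and the Chebyshev distance is 3.0, with the Euclidean distance in between.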

With that in mind, we can use the np.linalg.norm() function to calculate the Euclidean distance easily, and much more cleanly than using other functions:

distance = np.linalg.norm(point_1-point_2)
print(distance)

This results in the L2/Euclidean distance being printed:

5.196152422706632

L2 normalization and L1 normalization are heavily used in Machine Learning to normalize input data.

If you'd like to learn more about feature scaling - read our Guide to Feature Scaling Data with Scikit-Learn!

np.dot()

We can also use a Dot Product to calculate the Euclidean distance. In Mathematics, the Dot Product is the result of multiplying two equal-length vectors and the result is a single number - a scalar value. Because of the return type, it's sometimes also known as a "scalar product". This operation is often called the inner product for the two vectors.

To calculate the dot product between 2 vectors you can use the following formula:
$$
\vec{p} \cdot \vec{q} = p_1 q_1 + p_2 q_2 + p_3 q_3
$$
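Written out by hand, the dot product is the sum of the element-wise products. Here is a plain Python sketch with a hypothetical helper named dot, applied first to two arbitrary vectors and then to the difference between the two example points.

```python
def dot(p, q):
    # Sum of the element-wise products of two equal-length sequences.
    return sum(p_i * q_i for p_i, q_i in zip(p, q))

p = (0, 0, 0)
q = (3, 3, 3)
diff = [q_i - p_i for p_i, q_i in zip(p, q)]

print(dot((1, 2, 3), (4, 5, 6)))  # 1*4 + 2*5 + 3*6
print(dot(diff, diff))            # squared Euclidean distance between p and q
```

The second result, 27, is the same sum of squares obtained earlier with np.square() and np.sum().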

With NumPy, we can use the np.dot() function, passing in two vectors.

If we calculate the dot product of the difference between both points with that same difference, we get the sum of the squared differences, which is the square of the Euclidean distance between the two points. Taking the square root of that number nets us the distance we're searching for:

# Take the difference between the 2 points
diff = point_1 - point_2
# Perform the dot product of the difference with itself to get the sum of the squares
sum_square = np.dot(diff, diff)
# Get the square root of the result
distance = np.sqrt(sum_square)
print(distance)

Of course, you can shorten this to a one-liner as well:

distance = np.sqrt(np.dot(point_1-point_2, point_1-point_2))
print(distance)

5.196152422706632

Using the Built-In math.dist()

Python has a built-in function, in the math module, that calculates the Euclidean distance between two points of any dimension, not only in 3D space. However, it is only available in Python 3.8 or later.

math.dist() takes in two parameters, which are the two points, and returns the Euclidean distance between those points.

Note: The two points must have the same number of dimensions (i.e. both in 2D or both in 3D space).
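The dimension requirement can be illustrated with a small sketch. The points used here are made-up tuples, and the snippet assumes Python 3.8 or later.

```python
import math

# Matching dimensions: works for 2D, 3D, or any other dimension
print(math.dist((0, 0), (3, 4)))  # 5.0

# Mismatched dimensions raise a ValueError
mismatch_error = None
try:
    math.dist((0, 0), (3, 3, 3))
except ValueError as error:
    mismatch_error = error
    print("ValueError:", error)
```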

Now, to calculate the Euclidean distance between these two points, we simply pass them to the dist() function:

import math

distance = math.dist(point_1, point_2)
print(distance)

5.196152422706632

Conclusion

Euclidean distance is a fundamental distance metric pertaining to systems in Euclidean space.

Euclidean space is the classical geometric space you become familiar with in math class, typically limited to 2 or 3 dimensions. However, it can be extended to any non-negative integer number of dimensions as well.

Euclidean distance is the length of the shortest line between two points in Euclidean space, that is, the straight line segment connecting them.

The metric is used in many contexts within data mining, machine learning, and several other fields, and is one of the fundamental distance metrics.

Categories: FLOSS Project Planets

Python⇒Speed: Using Podman with BuildKit, the better Docker image builder

Wed, 2021-09-15 20:00

BuildKit is a new and improved tool for building Docker images: it’s faster, has critical features missing from traditional Dockerfiles, like build secrets, and adds other useful features, like cache mounting. So if you’re building Docker images, using BuildKit is in general a good idea.

And then there’s Podman: Podman is a reimplemented, compatible version of the Docker CLI and API. It does not however implement all the BuildKit Dockerfile extensions. On its own, then, Podman isn’t as good as Docker at building images.

There is another option, however: BuildKit has its own build tool, which is distinct from the traditional docker build, and this build tool can work with Podman.

Let’s see where Podman currently stands as far as BuildKit features go, and how to use BuildKit with Podman if that is not sufficient.

Categories: FLOSS Project Planets

Python for Beginners: Find the Height of a Binary Tree

Wed, 2021-09-15 10:09

Just like we find the length of a list or the number of items in a Python dictionary, we can find the height of a binary tree. In this article, we will formulate an algorithm to find the height of a binary tree. We will also implement the algorithm in Python and execute it on a given binary tree.

What is the height of a binary tree?

The height of a binary tree is defined here as the number of nodes on the longest path from the root node down to a leaf. The height depends on the number of nodes and their positions in the tree. If a tree has n nodes, its height can be anywhere between floor(log2(n)) + 1 and n. The binary tree will have a height of n if it is entirely skewed to either the left or the right. It will have a height of floor(log2(n)) + 1 if the nodes are evenly distributed and the tree is a complete binary tree.
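The bounds above can be sketched in a few lines of Python. The function name height_bounds is hypothetical and only serves as an illustration. It relies on the fact that, for a positive integer n, n.bit_length() equals floor(log2(n)) + 1, which avoids floating point rounding for large n.

```python
def height_bounds(n):
    """Return the (minimum, maximum) height of a binary tree with n >= 1 nodes."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # For positive integers, n.bit_length() equals floor(log2(n)) + 1
    min_height = n.bit_length()
    # A completely skewed tree has one node per level
    max_height = n
    return min_height, max_height


print(height_bounds(7))  # (3, 7)
```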

For example, the following binary tree has 7 elements. A binary tree with 7 elements can have any height between floor(log2(7)) + 1, that is 3, and 7. In our example, the nodes of the tree are evenly distributed and the tree is completely balanced. Therefore, the height of the tree is 3.

[Figure: binary tree]

How to calculate the height of a binary tree?

To calculate the height of a binary tree, we can first calculate the heights of its left and right subtrees. The height of the tree is then the maximum of the two subtree heights plus one. For an empty root, we say that the height of the tree is zero. Similarly, the height of a single node is considered to be 1.

Algorithm to find the height of a binary tree

Now that we have found a way to find the height of the binary tree, we will formulate the algorithm for finding the height as follows.

  1. If we find an empty root node, we will say that the height of the tree is 0.
  2. Otherwise, we will find the height of the left subtree and right subtree recursively.
  3. After finding the height of the left subtree and right subtree, we will calculate their maximum height.
  4. We will add 1 to the maximum height. That will be the height of the binary tree.
Implementation of the algorithm in Python

Now that we have understood and formulated the algorithm, we will implement it in Python.

class BinaryTreeNode:
    def __init__(self, data):
        self.data = data
        self.leftChild = None
        self.rightChild = None


def height(root):
    # an empty tree has height 0
    if root is None:
        return 0
    leftHeight = height(root.leftChild)
    rightHeight = height(root.rightChild)
    max_height = leftHeight
    if rightHeight > max_height:
        max_height = rightHeight
    return max_height + 1


def insert(root, newValue):
    # if binary search tree is empty, create a new node and declare it as root
    if root is None:
        root = BinaryTreeNode(newValue)
        return root
    # if newValue is less than value of data in root, add it to left subtree and proceed recursively
    if newValue < root.data:
        root.leftChild = insert(root.leftChild, newValue)
    else:
        # if newValue is greater than or equal to value of data in root, add it to right subtree and proceed recursively
        root.rightChild = insert(root.rightChild, newValue)
    return root


root = insert(None, 50)
insert(root, 20)
insert(root, 53)
insert(root, 11)
insert(root, 22)
insert(root, 52)
insert(root, 78)
print("Height of the binary tree is:")
print(height(root))

Output:

Height of the binary tree is:
3

Here, we have created a binary tree node class. Then, we defined a function that computes the height of a binary tree and a function that inserts elements into the binary tree. Finally, we built a tree with 7 elements and printed its height.
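To see how the insertion order affects the height, the following sketch builds two trees from 7 values each: one from the values used above and one from sorted input, which produces a completely skewed tree. The class and functions mirror the implementation above in a compact form so that the snippet is self-contained, and build() is a hypothetical helper added only for this illustration.

```python
class BinaryTreeNode:
    def __init__(self, data):
        self.data = data
        self.leftChild = None
        self.rightChild = None


def height(root):
    # An empty tree has height 0; otherwise add 1 to the taller subtree
    if root is None:
        return 0
    return max(height(root.leftChild), height(root.rightChild)) + 1


def insert(root, newValue):
    # Binary search tree insertion, as in the implementation above
    if root is None:
        return BinaryTreeNode(newValue)
    if newValue < root.data:
        root.leftChild = insert(root.leftChild, newValue)
    else:
        root.rightChild = insert(root.rightChild, newValue)
    return root


def build(values):
    # Hypothetical helper: insert the values one by one into an empty tree
    root = None
    for value in values:
        root = insert(root, value)
    return root


balanced = build([50, 20, 53, 11, 22, 52, 78])
skewed = build([1, 2, 3, 4, 5, 6, 7])  # sorted input: every node goes to the right

print(height(balanced))  # 3
print(height(skewed))    # 7
```

Both trees hold 7 nodes, yet their heights sit at the two extremes of the range discussed earlier.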

Conclusion

In this article, we have implemented an algorithm to find the height of a binary tree. To learn more about other data structures, you can read this article on Linked List in Python. Stay tuned for more articles on the implementation of different algorithms in Python.

The post Find the Height of a Binary Tree appeared first on PythonForBeginners.com.

Categories: FLOSS Project Planets

Real Python: Build a Personal Diary With Django and Python

Wed, 2021-09-15 10:00

A diary is a personal safe space. With the help of Django, you can create a diary on your own computer without storing data in anyone else’s cloud. By following along with the project below, you’ll see how quickly you can build a functioning web app in Django without any external dependencies.

In this tutorial, you’ll learn how to:

  • Set up a Django project
  • Work with the standard SQLite database
  • Make use of the Django admin site
  • Create models and class-based views
  • Nest and style templates
  • Secure your diary with authentication

This tutorial will guide you step-by-step to your final diary. If you’re just starting out with Django and want to finish your first real project, then this tutorial is for you!


Demo Video

On the main page of your diary, you’ll have a list of entries. You can scroll through them and create new ones with a click of a button. The styling is provided in this tutorial, so you can focus on the Django part of the code. Here’s a quick demo video of how it will look in action:

By the end of the tutorial, you’ll be able to flawlessly navigate your diary to create, read, update, and delete entries on demand.

Project Overview

The tutorial is divided into multiple steps. That way, you can take breaks and continue at your own pace. In each step, you’ll tackle a specific area of your diary project:

  1. Setting up your Django diary project
  2. Creating entries on the back end
  3. Displaying entries on the front end
  4. Adding styling
  5. Managing entries on the front end
  6. Improving your user experience
  7. Implementing authentication

By following along, you’ll explore the basics of web apps and how to add common features of a Django project. After finishing the tutorial, you’ll have created your own personal diary app and will have a Django project blueprint to build upon.

Prerequisites

You don’t need any previous knowledge of Django to complete this project. If you want to learn more about the topics you encounter in this tutorial, you’ll find links to resources along the way.

However, you should be comfortable using the command line and have a basic knowledge of Python and classes. Although it helps to know about virtual environments and pip, you’ll learn how to set everything up as you work through the tutorial.

Step 1: Setting Up Your Django Diary

Start the project by creating your project directory and setting up a virtual environment. This setup will keep your code isolated from any other projects on your machine. You can name your project folder and the virtual environment any way you want. In this tutorial, the project folder is named my-diary, and the virtual environment is named .venv:

$ mkdir my-diary
$ cd my-diary
$ python3 -m venv .venv
$ source .venv/bin/activate

Your prompt now starts with the name of your virtual environment in parentheses. This is an indicator that the virtual environment is activated. For the rest of the tutorial, your virtual environment must be activated. All of the following steps will take place inside this directory or its subdirectories.

Note: To activate your virtual environment on Windows, you might need to run this command:

c:\> python -m venv .venv
c:\> .venv\Scripts\activate.bat

For other platforms and shells, you might need to use a different command.
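Besides looking at the prompt, you can also check from within Python whether a virtual environment is active. The following is a minimal sketch, not part of the original tutorial. It assumes an environment created with the standard venv module, and in_virtual_environment is a hypothetical helper name.

```python
import sys


def in_virtual_environment():
    # In an environment created by venv, sys.prefix points at the environment,
    # while sys.base_prefix points at the Python installation it was created from
    return sys.prefix != sys.base_prefix


print(in_virtual_environment())
```

Running this with the interpreter of an activated environment should print True, while the system interpreter should print False.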

The only other requirement for your diary is Django itself. Install the specific version used in this tutorial with pip:

(.venv) $ python -m pip install Django==3.2.1

This command installs Django and some dependencies that Django requires. That’s everything you need.

Read the full article at https://realpython.com/django-diary-project-python/ »


Categories: FLOSS Project Planets
