Planet Python


eGenix.com: Python Meeting Düsseldorf - 2023-09-27

Mon, 2023-09-25 05:00

The following announces a regional user group meeting in Düsseldorf, Germany (originally published in German).

Announcement

The next Python Meeting Düsseldorf will take place on:

27.09.2023, 6:00 pm
Room 1, 2nd floor, Bürgerhaus Stadtteilzentrum Bilk
Düsseldorfer Arcaden, Bachstr. 145, 40217 Düsseldorf


Program

Talks registered so far:
  • Moritz Damm:
    Introduction to 'Kedro - A framework for production-ready data science'
  • Marc-André Lemburg:
    Parsing structured content with Python 3.10's new match-case
  • Arkadius Schuchhardt:
    Repository Pattern in Python: Why and how?
  • Jens Diemer:
    CLI Tools

Further talks are welcome and can still be registered. If interested, please contact info@pyddf.de.

Starting Time and Venue

We meet at 6:00 pm at the Bürgerhaus in the Düsseldorfer Arcaden.

The Bürgerhaus shares its entrance with the swimming pool and is located next to the underground parking garage entrance of the Düsseldorfer Arcaden.

A large "Schwimm’ in Bilk" logo hangs above the entrance. Once through the door, turn directly left to the two elevators and ride up to the 2nd floor. The entrance to Room 1 is immediately on the left as you step out of the elevator.

>>> Entrance in Google Street View

⚠️ Important: Please only register if you are absolutely certain that you will attend. Given the limited number of seats, we have no sympathy for short-notice cancellations or no-shows.

Introduction

The Python Meeting Düsseldorf is a regular event in Düsseldorf aimed at Python enthusiasts from the region.

Our PyDDF YouTube channel provides a good overview of the talks; we publish videos of the talks there after each meeting.

The meeting is organized by eGenix.com GmbH, Langenfeld, in cooperation with Clark Consulting & Research, Düsseldorf:

Format

The Python Meeting Düsseldorf uses a mix of (lightning) talks and open discussion.

Talks can be registered in advance or contributed spontaneously during the meeting. A projector with HDMI and Full HD resolution is available.

(Lightning) talk registration: please send an informal email to info@pyddf.de

Cost Sharing

The Python Meeting Düsseldorf is organized by Python users for Python users.

Since the meeting room, projector, internet, and drinks incur costs, we ask participants for a contribution of EUR 10.00 incl. 19% VAT. Pupils and students pay EUR 5.00 incl. 19% VAT.

We ask all participants to bring this amount in cash.

Registration

Since we can only accommodate 25 people in the rented room, we ask that you register in advance.

Meeting registration: please register via Meetup

Further Information

Further information can be found on the meeting's website:

              https://pyddf.de/

Have fun!

Marc-Andre Lemburg, eGenix.com


Stack Abuse: How to Check for NaN Values in Python

Fri, 2023-09-22 16:12
Introduction

Today we're going to explore how to check for NaN (Not a Number) values in Python. NaN values can be quite a nuisance when processing data, and knowing how to identify them can save you from a lot of potential headaches down the road.

Why Checking for NaN Values is Important

NaN values can be a real pain, especially when you're dealing with numerical computations or data analysis. They can skew your results, cause errors, and generally make your life as a developer more difficult. For instance, if you're calculating the average of a list of numbers and a NaN value sneaks in, your result will also be NaN, regardless of the other numbers. It's almost as if it "poisons" the result - a single NaN can throw everything off.
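
For instance, here's a quick sketch of that poisoning effect using nothing but plain Python:

nums = [1.0, 2.0, float('nan'), 4.0]

# The single NaN propagates through the sum, so the "average" is NaN too
print(sum(nums) / len(nums))  # nan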

Note: NaN stands for 'Not a Number'. It is a special floating-point value; it only exists as a float, and attempting to convert it to an integer raises a ValueError.

NaN Values in Mathematical Operations

When performing mathematical operations, NaN values can cause lots of issues. They can lead to unexpected results or even errors. Python's math and numpy libraries typically propagate NaN values in mathematical operations, which can lead to entire computations being invalidated.

For example, in numpy, any arithmetic operation involving a NaN value will result in NaN:

import numpy as np

a = np.array([1, 2, np.nan])
print(a.sum())

Output:

nan

In such cases, you might want to consider using functions that can handle NaN values appropriately. Numpy provides nansum(), nanmean(), and others, which ignore NaN values:

print(np.nansum(a))

Output:

3.0

Pandas, on the other hand, generally excludes NaN values in its mathematical operations by default.
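
For example, here's a small sketch of that default behavior using a pandas Series; the skipna flag is the standard way to opt back into propagation:

import numpy as np
import pandas as pd

s = pd.Series([1, 2, np.nan])
print(s.sum())              # 3.0 (NaN is excluded by default)
print(s.sum(skipna=False))  # nan (NaN propagates when asked to)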

How to Check for NaN Values in Python

There are many ways to check for NaN values in Python, and we'll cover some of the most common methods used in different libraries. Let's start with the built-in math library.

Using the math.isnan() Function

The math.isnan() function is an easy way to check if a value is NaN. This function returns True if the value is NaN and False otherwise. Here's a simple example:

import math

value = float('nan')
print(math.isnan(value))  # True

value = 5
print(math.isnan(value))  # False

As you can see, when we pass a NaN value to the math.isnan() function, it returns True. When we pass a non-NaN value, it returns False.

The benefit of using this particular function is that the math module is built into Python, so no third-party packages need to be installed.
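
One caveat worth illustrating: math.isnan() only accepts numbers, so passing it a non-numeric value (even the string 'nan') raises a TypeError rather than returning False:

import math

try:
    math.isnan('nan')  # strings are rejected outright
except TypeError as err:
    print(err)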

Using the numpy.isnan() Function

If you're working with arrays or matrices, the numpy.isnan() function can be a nice tool as well. It operates element-wise on an array and returns a Boolean array of the same shape. Here's an example:

import numpy as np

array = np.array([1, np.nan, 3, np.nan])
print(np.isnan(array))  # array([False, True, False, True])

In this example, we have an array with two NaN values. When we use numpy.isnan(), it returns a Boolean array where True corresponds to the positions of NaN values in the original array.

You'd want to use this method when you're already using NumPy in your code and need a function that works well with other NumPy structures, like np.array.

Using the pandas.isnull() Function

Pandas provides an easy-to-use function, isnull(), to check for NaN values in the DataFrame or Series. Let's take a look at an example:

import numpy as np
import pandas as pd

# Create a DataFrame with NaN values
df = pd.DataFrame({'A': [1, 2, np.nan],
                   'B': [5, np.nan, np.nan],
                   'C': [1, 2, 3]})
print(df.isnull())

The output will be a DataFrame that mirrors the original, but with True for NaN values and False for non-NaN values:

       A      B      C
0  False  False  False
1  False   True  False
2   True   True  False

One thing you'll notice if you test this method out is that it also returns True for None values, hence the null in the method name: it returns True for both NaN and None.
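
A quick sketch of that difference:

import math
import pandas as pd

print(pd.isnull(None))          # True
print(pd.isnull(float('nan')))  # True
# math.isnan(None) would raise a TypeError instead of returning True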

Comparing the Different Methods

Each method we've discussed — math.isnan(), numpy.isnan(), and pandas.isnull() — has its own strengths and use-cases. The math.isnan() function is a straightforward way to check if a number is NaN, but it only works on individual numbers.

On the other hand, numpy.isnan() operates element-wise on arrays, making it a good choice for checking NaN values in numpy arrays.

Finally, pandas.isnull() is perfect for checking NaN values in pandas Series or DataFrame objects. It's worth mentioning that pandas.isnull() also considers None as NaN, which can be very useful when dealing with real-world data.

Conclusion

Checking for NaN values is an important step in data preprocessing. We've explored three methods — math.isnan(), numpy.isnan(), and pandas.isnull() — each with its own strengths, depending on the type of data you're working with.

We've also discussed the impact of NaN values on mathematical operations and how to handle them using numpy and pandas functions.


Stack Abuse: How to Position Legend Outside the Plot in Matplotlib

Fri, 2023-09-22 15:14
Introduction

In data visualization, we often create complex graphs that need legends so the reader can interpret them. But what if those legends get in the way of the actual data that they need to see? In this Byte, we'll see how you can move the legend so that it's outside of the plot in Matplotlib.

Legends in Matplotlib

In Matplotlib, legends provide a mapping of labels to the elements of the plot. These can be very important to help the reader understand the visualization they're looking at. Without the legend, you might not know which line represented which data! Here's a basic example of how legends work in Matplotlib:

import matplotlib.pyplot as plt

# Create a simple line plot
plt.plot([1, 2, 3, 4], [1, 4, 9, 16], label='Sample Data')

# Add a legend
plt.legend()

# Display the plot
plt.show()

This will produce a plot with a legend located in the upper-left corner inside the plot. The legend contains the label 'Sample Data' that we specified in the plt.plot() function.

Why Position the Legend Outside the Plot?

While having the legend inside the plot is the default setting in Matplotlib, it's not always the best choice. Legends can obscure important details of the plot, especially when dealing with complex data visualizations. By positioning the legend outside the plot, we can be sure that all data points are clearly visible, making our plots easier to interpret.

How to Position the Legend Outside the Plot in Matplotlib

Positioning the legend outside the plot in Matplotlib is fairly easy to do. We simply need to use the bbox_to_anchor and loc parameters of the legend() function. Here's how to do it:

import matplotlib.pyplot as plt

# Create a simple line plot
plt.plot([1, 2, 3, 4], [1, 4, 9, 16], label='Sample Data')

# Add a legend outside the plot
plt.legend(bbox_to_anchor=(1, 1.10), loc='upper right')

# Display the plot
plt.show()

In this example, bbox_to_anchor is a tuple specifying the coordinates of the legend's anchor point, and loc indicates the location of the anchor point with respect to the legend's bounding box. The coordinates are in axes fraction (i.e., from 0 to 1) relative to the size of the plot. So, (1, 1.10) positions the anchor point just outside the top right corner of the plot.

Positioning this legend is a bit more of an art than a science, so you may need to play around with the values a bit to see what works.
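
For example, another common placement, offered here as a starting sketch, is centered below the plot, which works well for wide figures:

import matplotlib.pyplot as plt

plt.plot([1, 2, 3, 4], [1, 4, 9, 16], label='Sample Data')

# (0.5, -0.1) anchors the legend below the axes, centered horizontally
plt.legend(loc='upper center', bbox_to_anchor=(0.5, -0.1), ncol=2)
plt.show()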

Common Issues and Solutions

One common issue is the legend getting cut off when you save the figure using plt.savefig(). This happens because plt.savefig() doesn't automatically adjust the figure size to accommodate the legend. To fix this, you can use the bbox_inches parameter and set it to 'tight' like so:

plt.savefig('my_plot.png', bbox_inches='tight')

Another common issue is the legend overlapping with the plot when positioned outside. This can be fixed by adjusting the plot size or the legend size to ensure they fit together nicely. Again, this is something you'll likely have to test with many different values to find the right configuration and positioning.

Note: Adjusting the plot size can be done using plt.subplots_adjust(), while the legend size can be adjusted using legend.get_frame().
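
For instance, here's a minimal sketch that combines the two; the right=0.75 value is just an assumed starting point that leaves the rightmost quarter of the figure free for the legend:

import matplotlib.pyplot as plt

plt.plot([1, 2, 3, 4], [1, 4, 9, 16], label='Sample Data')
plt.legend(loc='center left', bbox_to_anchor=(1, 0.5))

# Shrink the axes so the legend has room on the right
plt.subplots_adjust(right=0.75)
plt.show()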

Conclusion

And there you have it! In this Byte, we showed how you can position the legend outside the plot in Matplotlib and explained some common issues. We've also talked a bit about some use-cases where you'll need to position the legend outside the plot.


Stack Abuse: Importing Python Modules

Fri, 2023-09-22 11:40
Introduction

Python allows us to create just about anything, from simple scripts to complex machine learning models. But to work on any complex project, you'll likely need to use or create modules. These are the building blocks of complex projects. In this article, we'll explore Python modules, why we need them, and how we can import them in our Python files.

Understanding Python Modules

In Python, a module is a file containing Python definitions and statements. The file name is the module name with the suffix .py added. Imagine you're working on a Python project, and you've written a function to calculate the Fibonacci series. Now, you need to use this function in multiple files. Instead of rewriting the function in each file, you can write it once in a Python file (module) and import it wherever needed.

Here's a simple example. Let's say we have a file math_operations.py with a function to add two numbers:

# math_operations.py

def add_numbers(num1, num2):
    return num1 + num2

We can import this math_operations module in another Python file and use the add_numbers function:

# main.py
import math_operations

print(math_operations.add_numbers(5, 10))  # Output: 15

In the above example, we've imported the math_operations module using the import statement and used the add_numbers function defined in the module.

Note: Python looks for module files in the directories defined in sys.path. It includes the directory containing the input script (or the current directory), the PYTHONPATH (a list of directory names, with the same syntax as the shell variable PATH), and the installation-dependent default directory. You can check the sys.path using import sys; print(sys.path).

But why do we need to import Python files? Why can't we just write all our code in one file? Let's find out in the next section.

Why Import Python Files?

The concept of importing files in Python is comparable to using a library or a toolbox. Imagine you're working on a project and need a specific tool. Instead of creating that tool from scratch every time you need it, you would look in your toolbox for it, right? The same goes for programming in Python. If you need a specific function or class, instead of writing it from scratch, you can import it from a Python file that already contains it.

This not only saves us from having to continuously rewrite code we've already written, but it also makes our code cleaner, more efficient, and easier to manage. It promotes a modular programming approach where the code is broken down into separate parts or modules, each performing a specific function. This modularity makes debugging and understanding the code much easier.

Here's a simple example of importing a Python standard library module:

import math

# Using the math library to calculate the square root
print(math.sqrt(16))

Output:

4.0

We import the math module and use its sqrt function to calculate the square root of 16.

Different Ways to Import Python Files

Python provides several ways to import modules, each with its own use cases. Let's look at the three most common methods.

Using 'import' Statement

The import statement is the simplest way to import a module. It simply imports the module, and you can use its functions or classes by referencing them with the module name.

import math

print(math.pi)

Output:

3.141592653589793

In this example, we import the math module and print the value of pi.

Using 'from...import' Statement

The from...import statement allows you to import specific functions, classes, or variables from a module. This way, you don't have to reference them with the module name every time you use them.

from math import pi

print(pi)

Output:

3.141592653589793

Here, we import only the pi variable from the math module and print it.

Using 'import...as' Statement

The import...as statement is used when you want to give a module a different name in your script. This is particularly useful when the module name is long and you want to use a shorter alias for convenience.

import math as m

print(m.pi)

Output:

3.141592653589793

Here, we import the math module as m and then use this alias to print the value of pi.

Importing Modules from a Package

Packages in Python are a way of organizing related modules into a directory hierarchy. Think of a package as a folder that contains multiple Python modules, along with a special __init__.py file that tells Python that the directory should be treated as a package.

But how do you import a module that's inside a package? Well, Python provides a straightforward way to do this.

Suppose you have a package named shapes and inside this package, you have two modules, circle.py and square.py. You can import the circle module like this:

from shapes import circle

Now, you can access all the functions and classes defined in the circle module. For instance, if the circle module has a function area(), you can use it as follows:

circle_area = circle.area(5)
print(circle_area)

This will print the area of a circle with a radius of 5.

Note: If you want to import a specific function or class from a module within a package, you can use the from...import statement, as we showed earlier.
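
For example, sticking with our hypothetical shapes package, you could pull in the area() function directly:

from shapes.circle import area

print(area(5))  # No circle. prefix needed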

But what if your package hierarchy is deeper? Suppose the circle module is inside a subpackage of shapes (note that a package name must be a valid Python identifier, so a name like 2d won't work, but two_d will). Python has got you covered. You can import the circle module like this:

from shapes.two_d import circle

Python's import system is quite flexible and powerful. It allows you to organize your code in a way that makes sense to you, while still providing easy access to your functions, classes, and modules.

Common Issues Importing Python Files

As you work with Python, you may come across several errors while importing modules. These errors could stem from a variety of issues, including incorrect file paths, syntax errors, or even circular imports. Let's see some of these common errors.

Fixing 'ModuleNotFoundError'

The ModuleNotFoundError is a subtype of ImportError. It's raised when you try to import a module that Python cannot find. It's one of the most common issues developers face while importing Python files.

import missing_module

This will raise a ModuleNotFoundError: No module named 'missing_module'.

There are several ways you can fix this error:

  1. Check the Module's Name: Ensure that the module's name is spelled correctly. Python is case-sensitive, which means module and Module are treated as two different modules.

  2. Install the Module: If the module is not a built-in module and you have not created it yourself, you may need to install it using pip. For example:

$ pip install missing_module
  3. Check Your File Paths: Python searches for modules in the directories defined in sys.path. If your module is not in one of these directories, Python won't be able to find it. You can add your module's directory to sys.path using the following code:

import sys
sys.path.insert(0, '/path/to/your/module')

  4. Use a Try/Except Block: If the module you're trying to import is not crucial to your program, you can use a try/except block to catch the ModuleNotFoundError and continue running your program. For example:

try:
    import missing_module
except ModuleNotFoundError:
    print("Module not found. Continuing without it.")

Avoiding Circular Imports

In Python, circular imports can be quite a headache. They occur when two or more modules depend on each other, either directly or indirectly. This leads to an infinite loop, causing the program to crash. So, how do we avoid this common pitfall?

The best way to avoid circular imports is by structuring your code in a way that eliminates the need for them. This could mean breaking up large modules into smaller, more manageable ones, or rethinking your design to remove unnecessary dependencies.

For instance, consider two modules A and B. If A imports B and B imports A, a circular import occurs. Here's a simplified example:

# A.py
import B

def function_from_A():
    print("This is a function in module A.")
    B.function_from_B()

# B.py
import A

def function_from_B():
    print("This is a function in module B.")
    A.function_from_A()

Calling either function results in a RecursionError, since each one ends up invoking the other indefinitely. To avoid this, you could refactor your code so that the dependency runs in only one direction; here, only B imports A:

# A.py
def function_from_A():
    print("This is a function in module A.")

# B.py
import A

def function_from_B():
    print("This is a function in module B.")
    A.function_from_A()

Note: It's important to remember that Python imports are case-sensitive. This means that import module and import Module would refer to two different modules and could potentially lead to a ModuleNotFoundError if not handled correctly.

Using __init__.py in Python Packages

In our journey through learning about Python imports, we've reached an interesting stop — the __init__.py file. This special file serves as an initializer for Python packages. But what does it do, exactly?

In the simplest terms, __init__.py allows Python to recognize a directory as a package so that it can be imported just like a module. Previously, an empty __init__.py file was enough to do this. However, from Python 3.3 onwards, thanks to the introduction of PEP 420, __init__.py is no longer strictly necessary for a directory to be considered a package. But it still holds relevance, and here's why.

Note: The __init__.py file is executed when the package is imported, and it can contain any Python code. This makes it a useful place for initialization logic for the package.

Consider a package named animals with two modules, mammals and birds. Here's how you can use __init__.py to import these modules.

# __init__.py
from . import mammals
from . import birds

Now, when you import the animals package, mammals and birds are also imported.

# main.py
import animals

animals.mammals.list_all()  # Use functions from the mammals module
animals.birds.list_all()    # Use functions from the birds module

By using __init__.py, you've made the package's interface cleaner and simpler to use.

Organizing Imports: PEP8 Guidelines

When working with Python, or any programming language really, it's important to keep your code clean and readable. This not only makes your life easier, but also the lives of others who may need to read or maintain your code. One way to do this is by following the PEP8 guidelines for organizing imports.

According to PEP8, your imports should be grouped in the following order:

  1. Standard library imports
  2. Related third party imports
  3. Local application/library specific imports

Each group should be separated by a blank line. Here's an example:

# Standard library imports
import os
import sys

# Related third party imports
import requests

# Local application/library specific imports
from my_library import my_module

In addition, PEP8 also recommends that imports should be on separate lines, and that they should be ordered alphabetically within each group.

Note: While these guidelines are not mandatory, following them can greatly improve the readability of your code and make it more Pythonic.

To make your life even easier, many modern IDEs, like PyCharm, have built-in tools to automatically organize your imports according to PEP8.

With proper organization and understanding of Python imports, you can avoid common errors and improve the readability of your code. So, the next time you're writing a Python program, give these guidelines a try. You might be surprised at how much cleaner and more manageable your code becomes.

Conclusion

And there you have it! We've taken a deep dive into the world of Python imports, exploring why and how we import Python files, the different ways to do so, common errors and their fixes, and the role of __init__.py in Python packages. We've also touched on the importance of organizing imports according to PEP8 guidelines.

Remember, the way you handle imports can greatly impact the readability and maintainability of your code. So, understanding these concepts is not just a matter of knowing Python's syntax—it's about writing better, more efficient code.


Stack Abuse: Fix the "AttributeError: module object has no attribute 'Serial'" Error in Python

Fri, 2023-09-22 08:51
Introduction

Even if you're a seasoned Python developer, you'll occasionally encounter errors that can be pretty confusing. One such error is the AttributeError: module object has no attribute 'Serial'. This Byte will help you understand and resolve this issue.

Understanding the AttributeError

The AttributeError in Python is raised when you try to access or call an attribute that a module, class, or instance does not have. Specifically, the error AttributeError: module object has no attribute 'Serial' suggests that you're trying to access the Serial attribute from a module that doesn't possess it.

For instance, if you have a module named serial and you're trying to use the Serial attribute from it, you might encounter this error. Here's an example:

import serial

ser = serial.Serial('/dev/ttyUSB0')  # This line causes the error

In this case, the serial module you're importing doesn't have the Serial attribute, hence the AttributeError.

Note: It's important to understand that Python is case-sensitive. Serial and serial are not the same. If the attribute exists but you're using the wrong case, you'll also encounter an AttributeError.

Fixes for the Error

The good news is that this error is usually a pretty easy fix, even if it seems very confusing at first. Let's explore some of the solutions.

Install the Correct pyserial Module

One of the most common reasons for this error is the incorrect installation of the pyserial module. The Serial attribute is a part of the pyserial module, which is used for serial communication in Python.

You might have installed a module named serial instead of pyserial (this is more common than you think!). To fix this, you need to uninstall the incorrect module and install the correct one.

$ pip uninstall serial
$ pip install pyserial

After running these commands, your issue may be resolved. Note that even though the package is named pyserial, it is installed under the import name serial. Now you can import Serial and use it in your code:

from serial import Serial

ser = Serial('/dev/ttyUSB0')  # This line no longer causes an error

If this didn't fix the error, keep reading.

Rename your serial.py File

For how much Python I've written in my career, you'd think that I wouldn't make this simple mistake as much as I do...

Another possibility is that the Python interpreter gets confused when there's a file in your project directory with the same name as a module you're trying to import. This is another common source of the AttributeError error.

Let's say, for instance, you have a file named serial.py in your project directory (or maybe your script itself is named serial.py). When you try to import serial, Python might import your serial.py file instead of the pyserial module, leading to the AttributeError.
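
One quick way to confirm this kind of shadowing is to print where the imported module actually lives; this diagnostic sketch assumes you run it from your project directory:

import serial

# A path inside your project (rather than site-packages) means a local
# serial.py is shadowing the real pyserial package
print(serial.__file__)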

The solution here is simple - rename your serial.py file to something else.

$ mv serial.py my_serial.py

Conclusion

In this Byte, we've explored two common causes of the AttributeError: module object has no attribute 'Serial' error in Python: installing the wrong pyserial module, and having a file in your project directory that shares a name with a module you're trying to import. By installing the correct module or renaming conflicting files, you should be able to eliminate this error.


Real Python: The Real Python Podcast – Episode #173: Getting Involved in Open Source & Generating QR Codes With Python

Fri, 2023-09-22 08:00

Have you thought about contributing to an open-source Python project? What are possible entry points for intermediate developers? Christopher Trudeau is back on the show this week, bringing another batch of PyCoder's Weekly articles and projects.



Wesley Chun: Managing Shared (formerly Team) Drives with Python and the Google Drive API

Thu, 2023-09-21 18:31
2023 UPDATE: We are working to put updated versions of all the code into GitHub... stay tuned. The link will be provided in all posts once the code sample(s) is(are) available.
2019 UPDATE: "G Suite" is now called "Google Workspace", "Team Drives" is now known as "Shared Drives", and the corresponding supportsTeamDrives flag has been renamed to supportsAllDrives. Please take note of these changes regarding the post below.
NOTE 1: Team Drives is only available for G Suite Business Standard users or higher. If you're developing an application for Team Drives, you'll need similar access.
NOTE 2: The code featured here is also available as a video + overview post as part of this series.

Introduction

Team Drives is a relatively new feature from the Google Drive team, created to solve some of the issues of a user-centric system in larger organizations. Team Drives are owned by an organization rather than a user, and with their use, the locations of files and folders won't be a mystery any more. While your users do have to be G Suite Business (or higher) customers to use Team Drives, the good news for developers is that you won't have to write new apps from scratch or learn a completely different API.

Instead, Team Drives features are accessible through the same Google Drive API you've come to know so well with Python. In this post, we'll demonstrate a sample Python app that performs core features that all developers should be familiar with. By the time you've finished reading this post and the sample app, you should know how to:
  • Create Team Drives
  • Add members to Team Drives
  • Create a folder in Team Drives
  • Import/upload files to Team Drives folders

Using the Google Drive API

The demo script requires creating files and folders, so you do need full read-write access to Google Drive. The scope you need for that is:
  • 'https://www.googleapis.com/auth/drive' — Full (read-write) access to Google Drive
If you're new to using Google APIs, we recommend reviewing earlier posts & videos covering project setup and the authorization boilerplate so that we can focus on the main app. Once we've authorized our app, assume you have a service endpoint to the API and have assigned it to the DRIVE variable.

Create Team Drives

New Team Drives can be created with DRIVE.teamdrives().create(). Two things are required to create a Team Drive: 1) a name for the Team Drive, and 2) a unique request ID, which makes the create process idempotent, so that any number of identical calls will still only result in a single Team Drive being created. It's recommended that developers use a language-specific UUID library. For Python developers, that's the uuid module. From the API response, we return the new Team Drive's ID. Check it out:
def create_td(td_name):
    request_id = str(uuid.uuid4())
    body = {'name': td_name}
    return DRIVE.teamdrives().create(body=body,
            requestId=request_id, fields='id').execute().get('id')

Add members to Team Drives

To add members/users to Team Drives, you only need to create a new permission, which can be done with DRIVE.permissions().create(), similar to how you would share a file in regular Drive with another user. The pieces of information you need for this request are the ID of the Team Drive, the new member's email address, as well as the desired role... choose from: "organizer", "owner", "writer", "commenter", "reader". Here's the code:
def add_user(td_id, user, role='commenter'):
    body = {'type': 'user', 'role': role, 'emailAddress': user}
    return DRIVE.permissions().create(body=body, fileId=td_id,
            supportsTeamDrives=True, fields='id').execute().get('id')

Some additional notes on permissions: a user can only be bestowed permissions equal to or less than those of the person/admin running the script... IOW, they cannot grant someone else greater permission than what they have. Also, if a user has a certain role in a Team Drive, they can be granted greater access to individual elements in the Team Drive. Users who are not members of a Team Drive can still be granted access to Team Drive contents on a per-file basis.

Create a folder in Team Drives

Nothing to see here! Yep, creating a folder in Team Drives is identical to creating a folder in regular Drive, with DRIVE.files().create(). The only difference is that you pass in a Team Drive ID rather than a regular Drive folder ID. Of course, you also need a folder name. Here's the code:
def create_td_folder(td_id, folder):
    body = {'name': folder, 'mimeType': FOLDER_MIME, 'parents': [td_id]}
    return DRIVE.files().create(body=body,
            supportsTeamDrives=True, fields='id').execute().get('id')

Import/upload files to Team Drives folders

Uploading files to a Team Drives folder is also identical to uploading to a normal Drive folder, and likewise done with DRIVE.files().create(). Importing is slightly different from uploading because you're uploading a file and converting it to a G Suite/Google Apps document format, i.e., uploading CSV as a Google Sheet, or plain text or Microsoft Word® file as Google Docs. In the sample app, we tackle the former:
def import_csv_to_td_folder(folder_id, fn, mimeType):
    body = {'name': fn, 'mimeType': mimeType, 'parents': [folder_id]}
    return DRIVE.files().create(body=body, media_body=fn+'.csv',
            supportsTeamDrives=True, fields='id').execute().get('id')

The secret to importing is the MIMEtype. That tells Drive whether you want conversion to a G Suite/Google Apps format (or not). The same is true for exporting. The import and export MIMEtypes supported by the Google Drive API can be found in my SO answer here.
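
As a quick illustration of the export side, a Google Sheets file can be downloaded in another format by passing the target MIMEtype to DRIVE.files().export(); this is just a sketch, assuming file_id refers to an existing Google Sheets file (such as the one created by the driver app below):

# Exporting a Sheets file as CSV returns the raw bytes of the first sheet
data = DRIVE.files().export(fileId=file_id, mimeType='text/csv').execute()
with open('inventory-export.csv', 'wb') as f:
    f.write(data)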

Driver app

All these functions are great but kind-of useless without being called by a main application, so here we are:
FOLDER_MIME = 'application/vnd.google-apps.folder'
SOURCE_FILE = 'inventory'  # on disk as 'inventory.csv'
SHEETS_MIME = 'application/vnd.google-apps.spreadsheet'

td_id = create_td('Corporate shared TD')
print('** Team Drive created')
perm_id = add_user(td_id, 'email@example.com')
print('** User added to Team Drive')
folder_id = create_td_folder(td_id, 'Manufacturing data')
print('** Folder created in Team Drive')
file_id = import_csv_to_td_folder(folder_id, SOURCE_FILE, SHEETS_MIME)
print('** CSV file imported as Google Sheets in Team Drives folder')

The first set of variables represent the MIMEtypes we need to use as well as the CSV file we're uploading to Drive and requesting it be converted to Google Sheets format. Below those definitions are calls to all four functions described above.

Conclusion

If you run the script, you should get output that looks something like this, with each print() representing one API call:
$ python3 td_demo.py
** Team Drive created
** User added to Team Drive
** Folder created in Team Drive
** CSV file imported as Google Sheets in Team Drives folder

When the script has completed, you should have a new Team Drive called "Corporate shared TD", and within it, a folder named "Manufacturing data" which contains a Google Sheets file called "inventory".

Below is the entire script for your convenience which runs on both Python 2 and Python 3 (unmodified!)—by using, copying, and/or modifying this code or any other piece of source from this blog, you implicitly agree to its Apache2 license:
from __future__ import print_function

import uuid

from apiclient import discovery
from httplib2 import Http
from oauth2client import file, client, tools

SCOPES = 'https://www.googleapis.com/auth/drive'
store = file.Storage('storage.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
    creds = tools.run_flow(flow, store)
DRIVE = discovery.build('drive', 'v3', http=creds.authorize(Http()))

def create_td(td_name):
    request_id = str(uuid.uuid4())  # random unique UUID string
    body = {'name': td_name}
    return DRIVE.teamdrives().create(body=body,
            requestId=request_id, fields='id').execute().get('id')

def add_user(td_id, user, role='commenter'):
    body = {'type': 'user', 'role': role, 'emailAddress': user}
    return DRIVE.permissions().create(body=body, fileId=td_id,
            supportsTeamDrives=True, fields='id').execute().get('id')

def create_td_folder(td_id, folder):
    body = {'name': folder, 'mimeType': FOLDER_MIME, 'parents': [td_id]}
    return DRIVE.files().create(body=body,
            supportsTeamDrives=True, fields='id').execute().get('id')

def import_csv_to_td_folder(folder_id, fn, mimeType):
    body = {'name': fn, 'mimeType': mimeType, 'parents': [folder_id]}
    return DRIVE.files().create(body=body, media_body=fn+'.csv',
            supportsTeamDrives=True, fields='id').execute().get('id')

FOLDER_MIME = 'application/vnd.google-apps.folder'
SOURCE_FILE = 'inventory'  # on disk as 'inventory.csv'... CHANGE!
SHEETS_MIME = 'application/vnd.google-apps.spreadsheet'

td_id = create_td('Corporate shared TD')
print('** Team Drive created')
perm_id = add_user(td_id, 'email@example.com')  # CHANGE!
print('** User added to Team Drive')
folder_id = create_td_folder(td_id, 'Manufacturing data')
print('** Folder created in Team Drive')
file_id = import_csv_to_td_folder(folder_id, SOURCE_FILE, SHEETS_MIME)
print('** CSV file imported as Google Sheets in Team Drives folder')

As with our other code samples, you can now customize it to learn more about the API, integrate it into other apps for your own needs, for a mobile frontend, sysadmin script, or a server-side backend!

Code challenge Write a simple application that moves folders (and its files or folders) in regular Drive to Team Drives. Each folder you move should be a corresponding folder in Team Drives. Remember that files in Team Drives can only have one parent, and the same goes for folders.

Stack Abuse: How to Pass Multiple Arguments to the map() Function in Python

Thu, 2023-09-21 16:22
Introduction

The goal of Python, with its rich set of built-in functions, is to allow developers to accomplish complex tasks with relative ease. One such powerful, yet often overlooked, function is the map() function. The map() function will execute a given function over a set of items, but how do we pass additional arguments to the provided function?

In this Byte, we'll be exploring the map() function and how to effectively pass multiple arguments to it.

The map() Function in Python

The map() function in Python is a built-in function that applies a given function to every item of an iterable (like a list or tuple) and returns an iterator of the results, which you can convert to a list.

def square(number):
    return number ** 2

numbers = [1, 2, 3, 4, 5]
squared = map(square, numbers)
print(list(squared))  # Output: [1, 4, 9, 16, 25]

In this snippet, we've defined a function square() that takes a number and returns its square. We then use the map() function to apply this square() function to each item in the numbers list.

Why Pass Multiple Arguments to map()?

You might be wondering, "Why would I need to pass multiple arguments to map()?" Well, there are scenarios where you might have a function that takes more than one argument, and you want to apply this function to multiple sets of data simultaneously.

Not every function we provide to map() will take only one argument. What if, instead of a square function, we have the more generic math.pow, where one of the arguments is the power to raise each item to? How do we handle a case like this?

Or maybe you have two lists of numbers, and you want to find the product of corresponding numbers from these lists. This is another case where passing multiple arguments to map() can be helpful.

How to Pass Multiple Arguments to map()

There are a few different types of cases in which you'd want to pass multiple arguments to map(), two of which we mentioned above. We'll walk through both of those cases here.

Multiple Iterables

Passing multiple arguments to the map() function is simple once you understand how to do it. You simply pass additional iterables after the function argument, and map() will take items from each iterable and pass them as separate arguments to the function.

Here's an example:

def multiply(x, y):
    return x * y

numbers1 = [1, 2, 3, 4, 5]
numbers2 = [6, 7, 8, 9, 10]

result = map(multiply, numbers1, numbers2)
print(list(result))  # Output: [6, 14, 24, 36, 50]

Note: Make sure the number of arguments the function takes matches the number of iterables passed to map()!

In the example above, we've defined a function multiply() that takes two arguments and returns their product. We then pass this function, along with two lists, to the map() function. The map() function applies multiply() to each pair of corresponding items from the two lists, and returns a new list with the results.
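
Worth knowing: if the iterables have different lengths, map() doesn't raise an error; it simply stops as soon as the shortest iterable is exhausted:

result = map(multiply, [1, 2, 3], [10, 20])
print(list(result))  # Output: [10, 40] (the 3 is never used)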

Multiple Arguments, One Iterable

Continuing with our math.pow example, let's see how we can still use map() to run this function on all items of an array.

The first, and probably simplest, way is to not use map() at all, but to use something like list comprehension instead.

import math

numbers = [1, 2, 3, 4, 5]
res = [math.pow(n, 3) for n in numbers]
print(res)  # Output: [1.0, 8.0, 27.0, 64.0, 125.0]

This is essentially all map() really does, but it's not as compact and neat as using a convenience function like map().

Now, let's see how we can actually use map() with a function that requires multiple arguments:

import math
import itertools

numbers = [1, 2, 3, 4, 5]
res = map(math.pow, numbers, itertools.repeat(3, len(numbers)))
print(list(res))  # Output: [1.0, 8.0, 27.0, 64.0, 125.0]

This may seem a bit more complicated at first, but it's actually very simple. We use a helper function, itertools.repeat, to create an iterable the same length as numbers that yields only the value 3.

So the output of itertools.repeat(3, len(numbers)), when converted to a list, is just [3, 3, 3, 3, 3]. This works because we're now passing two iterables of the same length to map(), which it happily accepts.
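
Two other idioms avoid counting the iterable's length at all; both of these sketches produce the same output as above:

import math
import itertools

numbers = [1, 2, 3, 4, 5]

# itertools.repeat with no count repeats forever, and map() stops
# at the shortest iterable, so this is safe
res1 = map(math.pow, numbers, itertools.repeat(3))

# A lambda freezes the exponent instead of passing a second iterable
res2 = map(lambda n: math.pow(n, 3), numbers)

print(list(res1))  # [1.0, 8.0, 27.0, 64.0, 125.0]
print(list(res2))  # [1.0, 8.0, 27.0, 64.0, 125.0]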

Conclusion

The map() function is particularly useful when working with multiple iterables, as it can apply a function to the elements of these iterables in pairs, triples, or more. In this Byte, we've covered how to pass multiple arguments to the map() function and how to work with multiple iterables.


William Minchin: minchin.jrnl v7 “Phoenix” released

Thu, 2023-09-21 15:22

Today, I do something that I should have done 5 years ago, and something that I’ve been putting off for the last 2 years[1]: I’m releasing a personal fork of jrnl[2]! I’ve given this release the codename Phoenix, after the mythical bird of rebirth, that springs forth renewed from the ashes of the past.

You can install it today:

pip install minchin.jrnl

And then to run it from the command line:

minchin.jrnl

(or jrnl)

Features Today

Today, the codebase is that of jrnl v2.6[3] in a new namespace. In particular, that gives us a working yaml exporter; now you can build your Pelican sites again (or maybe only I was doing that…).

The version number (7) was picked to be larger than the current jrnl-org/jrnl release (currently at 4.0.1). (Plus I thought it would look cool!)

I’ve moved the configuration to a new location on disk[4], so as not to stomp on your existing jrnl (i.e. jrnl-org/jrnl or “legacy”) installation.

Limited documentation, to match the current codebase, has been uploaded to my personal site at minchin.ca/minchin.jrnl. (Although it remains incomplete and very much a work in progress.)

And finally, I extend an invitation to all those current or former users of jrnl to move here. I welcome your contributions and support. If there are features missing, please feel free to let me know.

Short Term Update Plans

I’ve pushed this release out with very few changes from the base codebase in an effort to get it live. But I have some updates that I’m planning to do very shortly. These updates will maintain the Phoenix codename, even if the major version number increments.

The biggest of these is to launch my much anticipated plugin system. The code has already been written (for several years now[5], actually); it just needs to be double-checked that it still works as expected.

The other immediate update is to make sure the code works with Python 3.11 (the current version of Python), which seems to already be the case.

Medium to Long Term Project Goals, or My Vision

These are features I’d like to add, although I realize this will take more than tonight. This section also lays out my vision for the project and some anti-features I want to avoid.

The Plugin System

The plugin system, I think, will be a huge step forward in making minchin.jrnl more useful. In particular, it allows minchin.jrnl to import and export to and from new formats, including allowing you to write one-off export formats (which I intend to use personally right away!). Displaying the journal entries on the command line is also handled by exporters, so you’d be able to tweak that output as well. I also intend to extend the plugin system to the storage backends.

My hope is that this will futureproof minchin.jrnl, allowing new formats to be quickly and easily added, retiring deprecated formats to external plugins, and being able to quickly test and integrate new formats by seamlessly bringing external plugins “in-house”.

In particular, I’m planning to have separate plugins for “true” yaml + markdown exports and Pelican-formatted markdown, to add an export format for Obsidian[6]-formatted markdown, and to add backend format plugins to support jrnl v1 and whatever format they’re dreaming up for jrnl v4[7].

In short, I hope plugins will allow you to make minchin.jrnl more useful, without me being the bottleneck.

Plain Text Forever

One of the things that drew me to the original jrnl implementation was that it was based on plain text, and used plain text to store journal entries. Plain text has a number of advantages[8], but the near universal backwards and forwards compatibility is high on that list. Yes, plain text has its limitations[9], but I think the advantages far outweigh the disadvantages, particularly when it comes to a journal that you might hope will be readable years or generations from now. Also, plain text just makes it so much easier to develop minchin.jrnl.

The included, default option for minchin.jrnl will always be plain text.

If you’re looking to upgrade your plain text, you might consider Markdown[10] or reStructuredText (ReST)[11].

If you’re looking for either a database backend or more multimedia functionality (or both), you’re welcome to write something as a backend plugin for minchin.jrnl; that ability is a featured reason for providing the (to be added shortly!) plugin system in the first place!

MIT Licensed

The original jrnl was initially released under the MIT license, and that only changed with the v2.4 release to GPLv3[12]. My hope and goal is to remove all GPL-licensed code and release future versions of minchin.jrnl under the MIT license[23].

My opposition to the change[13] was because I’ve come to feel that Open Source work is both given and received as a gift, and I feel the GPL license moves away from that ethos.

I suspect the least fun part of this partial re-write will be getting the testing system up and running again, as the testing library jrnl v1 had been using has gone many years without updates.

To this end, I’m requesting that any code contributions to the project be dual-licensed under both MIT and GPLv3.

Documentation in Sphinx

Documentation will eventually be moved over to Sphinx (from the current MkDocs), a process I’ve begun but not finished. Driving this is the expectation that I’ll have more technical documentation (than is included currently) as I lay out how to work with the plugin API, and Sphinx makes it easy to keep code and documentation side by side in the same (code) file.

Furthermore, I want to document how to use minchin.jrnl as a Python API generally; this would allow you to interact with your journal from other Python programs.

Easy to Push Releases

Knowing my own schedule, I want to be able to sit down for an evening, make (several) small improvements, and then push out a new release before I go to bed. To that end, I want to streamline the process of pushing out new releases. Expect lots of small releases. :)

Drop poetry

poetry is a powerful Python project management tool, but is one I’ve never really liked[14]. Particular issues include a lack of first class Windows support[15] and very conservative upper bounds for dependencies and supported Python versions. Plus I have refined a system elsewhere using pip-tools[16] and setup.py to manage these same issues, which I find works very well for me.

This has been accomplished with the current release!

Windows Support

jrnl, to date, has always had decent Windows support. As I personally work on Windows, Windows will continue to have first class support.

Where this may show is that tools beyond Python will need to be readily available on Windows before they’re used[33], and the Windows Terminal is fairly limited in what it can do, at least compared with some Linux terminals.

Replace the Current Code of Conduct

I don’t much care for the current Code of Conduct[17]: it seems to be overly focused on the horrible things people might do to each other, and I’m not sure I want that to be the introduction people get to the project. I hope to find a Code of Conduct that focuses more on the positive things I hope people will do as they interact with each other and the project.

My replaced/current Code of Conduct is here (although this may be updated again in the future).

Open Stewardship Discussion

If the time comes when someone else is assuming stewardship of the project, I intend for those discussions to be held publicly[18].

My History with the Project, and Why the Fork

This section is different: it is much less about the codebase and more focused on myself and my relationship to it. I warn you it is likely to be somewhat long and winding.

My Introduction to jrnl

Many years ago now, I was new in Python. At that time[34], when I came across a problem that I thought programming might solve, I first went looking for a Python program that might solve it.

In looking for a way to manage the regular text notes I was taking at work, I found jrnl, which I eventually augmented with DayOne (Classic) for in-field note entry (on a work-supplied iPad) and Pelican for display.

Jrnl was more though: it was the object of my first Pull Request[35], my first contribution to Open Source. My meagre help was appreciated and welcomed warmly, and so I returned often. I found jrnl to be incredibly useful in learning about Python best practices and code organization; here was a program that was more than a toy but simple enough (or logically separated) that I could attempt to understand it, to grok it, as a whole. I contributed in many places, but particularly around Windows support, DayOne backend support, and exports to Markdown (to be fed to Pelican).

In short jrnl became part of the code I used everyday.

jrnl Goes Into Hibernation

I have heard it said that software rewrites are a great way to kill a project. The reasons for this are multiple, but in short it (typically) saps the energy to update the legacy version even as bugs pile up, but the new thing can’t go into everyday use until it is feature-compatible with the legacy version, and the re-write always takes way more effort than initial estimates.

For reasons now forgotten[36], a “v2” semi-rewrite was undertaken. And then it stalled. And then the project maintainer got a life[19] and the re-write stalled even more.

The (Near) Endless Beta of v2, or the First Time I Should Have Forked

For me, initially, this wasn’t a big deal: I was often running a development build locally as I tinkered away with jrnl, so I just kept doing that. Also, I had just started working on my plugin system (for exporters first, but expecting it could easily be extended to importers and backends).

As the months of inactivity on the part of the maintainer stretched on, and pull requests grew staler, at some point I should have forked the project and “officially” released my development version. But I never did, because it seemed like a scary new thing to do[20].

Invitation to Become a Maintainer

And then[21] one day out of the blue, I got an email from the maintainer asking if I wanted to be a co-maintainer for jrnl! I was delighted, and promptly said yes. I was given commit access to the repo on GitHub (but, as far as I knew, no ability to push releases to PyPI), and then…not much happened. I reached out to the maintainer to suggest some plans, as it still felt like “his” project, but I never heard much back. And I was too timid to move forward without at least something from him. And I was busy with the rest of life too. After a few months, I realized my first plan wasn’t working and started thinking about how to try again to move the project forward, more on my own. In front of me was the push to v2.0, and a question of how to integrate my in-development plug-in system.

The Coup

And then on another day, again out of the blue, I got an unexpected email that I no longer had commit access to the jrnl repo. I searched the project for updates, including the issue tracker, and came up with #591 where a transition to new maintainers was brought up; I don’t know why I wasn’t pinged on it. At the time[22], I said I was happy to see new life in the project and to see it move forward. But it was unsettling that I’d missed the early discussions.

It also seemed odd to me that the two maintainer that stepped forward hadn’t seemed to be involved with the project at all before that point.

For a while, things were ok: a “version 2” was released that was very close to the v2 branch I was using at home, bugs started getting fixed regularly, and new releases continued to come out. But my plugins never made it into a release.

Things Fall Apart (aka I Get Frustrated)

But things under new management didn’t stay rosy.

One of the things they did was completely rewrite the Git history, and thus change all the commit hashes. This was a small but continuing annoyance (until I got a new computer), because every time I would go to push changes to GitHub, my git client would complain about the “new” (old) tags it was trying to push, because it couldn’t find a commit with the matching hash.

But my two big annoyances were a re-write of the YAML exporter and the continual roadblocks to getting my plugin system merged in.

My plugin system has the longest history, having been started before the change in stewardship. Many times (after the change in stewardship), I created a pull request[24] and the response would be to make various changes or to split it into smaller pieces; I would make the changes, and the cycle would continue. But there was never a plan presented that I felt I could successfully complete, nor was I ever told the plugin system was unaligned with the vision they had for the project. I lost considerable enthusiasm for trying to get the plugins merged after rewriting the tests for the third time (as the underlying testing framework was changed).

The YAML exporter changes are what ultimately left me feeling frozen out of the project. Without much fanfare, the YAML exporter was changed, because someone[25] felt that the resulting output wasn’t “true” or “pure” YAML. This is true, in a sense, because when I had originally written the exporter, it was designed to output files for Pelican with an assumed Markdown body and YAML front matter for metadata. At the request of the (then) maintainer, I called it the “YAML exporter”, partly because there was already a “Markdown exporter”. I didn’t realize it had been broken until I went to upgrade jrnl and render my Pelican site (which I use to search and read my notes) and it had just stopped working[26]. The change wasn’t noted (in a meaningful way) in the release notes, and the version numbers didn’t give an indication of when this change had happened[30]. I eventually figured out where the change had happened, explained the history of the exporter (that, again, I had written years earlier) and proposed three solutions, each with a corresponding Pull Request: 1) return the YAML exporter to its previous output[27], 2) duplicate the old exporter under a new name[28], or 3) merge my plugin system, which would allow me to write my own plugin, and solve the problem myself. I was denied on all three, and told that I ‘didn’t understand the purpose or function of the YAML exporter’[31] (yes, of the plugin I’d written[37]). The best I got was that they would reconsider what rose to the level of a “breaking change” when dealing with versioning[32].

I Walk Away

The combined experience left me feeling very frustrated: jrnl was broken (to me) and all my efforts to fix it were forcibly rebuffed.

When I tried to express my frustrations at my inability to get a “working” version of jrnl, I was encouraged to take a mental health break. While I appreciate the awareness of mental health, stepping away wouldn’t be helpful in this particular case because nothing would happen to fix my broken tooling (the cause of my frustrations). It seemed like the “right words”(tm) someone had picked up at a workshop, but the same workshop figured that the “right words”(tm) would solve everything without requiring a deeper look or deeper changes.

So I took my leave. I’ve been running an outdated version (v2.6) ever since, and because of the strictness of the Poetry metadata29, I can’t run it on anything newer than Python 3.9 (even as I’ve upgraded my “daily” Python to 3.11).

I Return (Sort Of); The Future and My Fears

So this marks my return. My “mental health break” is over. As I realize I can only change myself (and not others), I will do the work to fix the deeper issues (e.g. broken Pelican exports, lack of “modern” Python support) by managing my own fork. And so that is the work I’ll do.

Looking forward, if I’m the only one that uses my fork, that would be somewhat disappointing, but also fine. After all, I write software, first and foremost, for my own use case and offer it to others as a gift. On the other hand, if a large part of the community moves here, I worry about being able to shepherd that community any better than the one I am leaving.

I worry too that, either because there was conflict at all, or because all of my writing is publicly displayed, others will think less of my work or of me because of the failings they see there. It is indeed very hard to get through a disagreement like this without failing to some degree.

But it seems better to act, than to suffer in silence.

A Thank You

Thank you to all those who have worked to make jrnl as successful as it has been to date.

If you’ve gotten this far, thank you for reading this all. I hope you will join me, and I hope your experiences with minchin.jrnl are positive!

The header image was generated locally by Stable Diffusion XL.

  1. October 18, 2021 todo item: “fork jrnl” 

  2. main landing page at jrnl.sh, code at jrnl-orl/jrnl on GitHub, and jrnl on PyPI 

  3. https://github.com/jrnl-org/jrnl/tree/v2.6 

  4. this varies by OS, so run jrnl --list to see where yours is stored. 

  5. Pull Request #1115 

  6. I’ve started using Obsidian to take notes on my workstation and on my phone, and find it incredible. The backend format remains Markdown with basically YAML front matter, but the format is slightly different from Pelican, and the exported file layout differs. 

  7. The initial draft of this post was written before the v4 release, when there was talk of changing how the journal files were kept. v4 has since been released, and I’m unclear if that change ever happened, or what “breaking change” occurred that bumped the version number from 3 to 4 generally. In any case, if they change their format, with the plugin system it becomes fairly trivial to add a format-specific importer. 

  8. also: tiny file size, easy to put under version control, no proprietary format or data lock-in, portability across computing platforms, and generally are human readable 

  9. includes limitations on embedded text formatting, storing pictures, videos, or sound recordings, and lacking standardized internal metadata 

  10. Markdown has several variants and many extensions. If you’re starting out, I recommend looking at the CommonMark specification. Note however that Markdown was originally designed as a shortcut for creating HTML documents, and so has no built-in features for managing groups of Markdown documents. It is also deliberately limited in the formatting options available, while officially supporting raw HTML as a fallback for anything missing. 

  11. ReST is older than Markdown and has always had a full specification. It was originally designed for the Python language documentation, and so was designed from the beginning to deal with the interplay between several text documents. Sadly, it doesn’t seem to have been adopted much outside of the Python ecosystem. 

  12. version 2.3.1 license (MIT); version 2.4 license (GPLv3), released April 18, 2020. 

  13. as I detailed at the time. But the issue (#918) announcing the change was merged within 20 minutes of being opened, so I’m not sure anything I could have said would have changed their minds. 

  14. this can and should be fleshed out into a full blog post. But another day. 

  15. and I work on Windows. And I work with Python because Python has had good Windows support. 

  16. https://pip-tools.readthedocs.io/en/latest/ 

  17. jrnl-org/jrnl’s Code of Conduct: the Contributor Code of Conduct

  18. I imagine in the issue tracker for the project. 

  19. I think he got a job with or founded a startup, and I suspect he probably moved continents. 

  20. In the intervening time, I ended up releasing personal forks of several Pelican plugins. The process is no longer new or scary, but still can be a fair bit of work. And that experience has given me the confidence to go forward with this fork. 

  21. February 16, 2018 

  22. July 5, 2019; my comment, at the time 

  23. my (pending) codename for these releases is ⚜ Fleur-de-lis. The reference is to the Lily, a flower that is a symbol of purity and rebirth. 

  24. see Pull Request #1216 and Discussion #1006 

  25. Issue #1065 

  26. in particular, Pelican could no longer find the metadata block and instead rendered the text of each entry as if it was a code block. 

  27. I’m sure I wrote the code to do this, but can’t find the Pull Request at the moment. Maybe I figured the suggestion wouldn’t go anywhere anyway. 

  28. Pull Request #1337 

  29. https://github.com/jrnl-org/jrnl/blob/v2.6/pyproject.toml#L25 

  30. perhaps because I was looking for a breaking change rather than a bug fix

  31. this comment and this one, in particular. I can’t find those exact quoted words, but that was the sentiment I was left with. 

  32. this comment 

  33. So no make. But Invoke, written in Python, works well for many of make‘s use cases. 

  34. and still today 

  35. Pull Request #110, dated November 27, 2013 

  36. but likely recorded in the issue tracker 

  37. Pull Request #258, opened July 30, 2014. 

Categories: FLOSS Project Planets

Stack Abuse: How to Flatten Specific Dimensions of NumPy Array

Thu, 2023-09-21 14:41
Introduction

In data manipulation and scientific computing, NumPy stands as one of the most-used libraries as it provides quite a few functionalities. One such operation, "flattening," helps to transform multi-dimensional arrays into a one-dimensional sequence. While flattening an entire array is pretty straightforward, there are times when you might want to selectively flatten specific dimensions to suit the requirements of your data pipeline or algorithm. In this Byte, we'll see various techniques to achieve this more nuanced form of flattening.

NumPy Arrays

NumPy, short for Numerical Python, is a library in Python that provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays. NumPy arrays are a key ingredient in scientific computing with Python. They are more efficient and faster compared to Python's built-in list data type, especially when it comes to mathematical operations.

This code shows what a NumPy array can look like:

import numpy as np

# Creating a 2D NumPy array
array_2D = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(array_2D)

Output:

[[1 2 3]
 [4 5 6]
 [7 8 9]]

Why Flatten Some Dimensions of a NumPy Array?

Flattening an array means converting a multidimensional array into a 1D array. But why would you want to flatten just some dimensions of a NumPy array?

Well, there are many scenarios where you might need to do this. For example, in machine learning, often we need to flatten our input data before feeding it into a model. This is because many machine learning algorithms expect input data in a specific format, usually as a 1D array.

But sometimes, you might not want to flatten the entire array. Instead, you might want to flatten specific dimensions of the array while keeping the other dimensions intact. This can be useful in scenarios where you want to maintain some level of the original structure of the data.

How to Flatten a NumPy Array

Flattening a NumPy array is fairly easy to do. You can use the flatten() method provided by NumPy to flatten an array:

import numpy as np

# Creating a 2D NumPy array
array_2D = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Flattening the 2D array
flattened_array = array_2D.flatten()
print(flattened_array)

Output:

[1 2 3 4 5 6 7 8 9]

As you can see, the flatten() method has transformed our 2D array into a 1D array.

But what if we want to flatten only a specific dimension of the array and not the entire array? We'll explore this in the next sections.

Flattening Specific Dimensions of a NumPy Array

Flattening a NumPy array is quite straightforward. But, what if you need to flatten only specific dimensions of an array? This is where the reshape function comes into play.

Let's say we have a 3D array and we want to flatten the last two dimensions, keeping the first dimension as it is. The reshape function can be used to achieve this. Here's a simple example:

import numpy as np

# Create a 3D array
array_3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

# Reshape the array
flattened_array = array_3d.reshape(array_3d.shape[0], -1)
print(flattened_array)

Output:

[[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]]

In the above code, the -1 in the reshape function indicates that the size of that dimension is to be calculated automatically. This is based on the size of the array and the size of the other dimensions.

Note: The reshape function does not modify the original array. Instead, it returns a new array that has the specified shape.

Similar Solutions and Use-Cases

Flattening specific dimensions of a NumPy array isn't the only way to manipulate your data. There are other similar solutions you might find useful. For example, the ravel function can also be used to flatten an array. However, unlike reshape, ravel always returns a flattened array.

Additionally, you can use the transpose function to change the order of the array dimensions. This can be useful in cases where you need to rearrange your data for specific operations or visualizations.
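To make that concrete, here is a quick sketch (the array shape and axis order are chosen purely for illustration):

import numpy as np

array_3d = np.arange(12).reshape(2, 2, 3)

# ravel() always returns a fully flattened 1D array
# (a view when possible, unlike flatten(), which always copies)
print(array_3d.ravel())
# [ 0  1  2  3  4  5  6  7  8  9 10 11]

# transpose() reorders the axes; here we swap the last two axes
# before flattening them, which changes the order of the values
swapped = array_3d.transpose(0, 2, 1)
print(swapped.reshape(swapped.shape[0], -1))
# [[ 0  3  1  4  2  5]
#  [ 6  9  7 10  8 11]]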

These techniques can be particularly useful in data preprocessing for machine learning. For instance, you might need to flatten the input data for a neural network. Or, you might need to transpose your data to ensure that it's in the correct format for a particular library or mathematical function.

Conclusion

In this Byte, we've explored how to flatten specific dimensions of a NumPy array using the reshape function. We've also looked at similar solutions such as ravel and transpose and discussed some use-cases where these techniques can be particularly useful.

While these techniques are powerful tools for data manipulation, they are just the tip of the iceberg when it comes to what you can do with NumPy. So I'd suggest taking a deeper look at the NumPy documentation and see what other interesting features you can discover.

Categories: FLOSS Project Planets

Peter Bengtsson: Pip-Outdated.py with interactive upgrade

Thu, 2023-09-21 12:11

Last year I wrote a nifty script called Pip-Outdated.py "Pip-Outdated.py - a script to compare requirements.in with the output of pip list --outdated". It basically runs pip list --outdated but filters based on the packages mentioned in your requirements.in. For people familiar with Node, it's like checking all installed packages in node_modules for available upgrades, but filtering the list down to only those mentioned in your package.json.

I use this script often enough that I added a little interactive input to ask if it should edit requirements.in for you for each possible upgrade. Looks like this:

❯ Pip-Outdated.py
black              INSTALLED: 23.7.0   POSSIBLE: 23.9.1
click              INSTALLED: 8.1.6    POSSIBLE: 8.1.7
elasticsearch-dsl  INSTALLED: 7.4.1    POSSIBLE: 8.9.0
fastapi            INSTALLED: 0.101.0  POSSIBLE: 0.103.1
httpx              INSTALLED: 0.24.1   POSSIBLE: 0.25.0
pytest             INSTALLED: 7.4.0    POSSIBLE: 7.4.2
Update black from 23.7.0 to 23.9.1? [y/N/q] y
Update click from 8.1.6 to 8.1.7? [y/N/q] y
Update elasticsearch-dsl from 7.4.1 to 8.9.0? [y/N/q] n
Update fastapi from 0.101.0 to 0.103.1? [y/N/q] n
Update httpx from 0.24.1 to 0.25.0? [y/N/q] n
Update pytest from 7.4.0 to 7.4.2? [y/N/q] y

and then,

❯ git diff requirements.in | cat
diff --git a/requirements.in b/requirements.in
index b7a246e..0e996e5 100644
--- a/requirements.in
+++ b/requirements.in
@@ -9,7 +9,7 @@ python-decouple==3.8
 fastapi==0.101.0
 uvicorn[standard]==0.23.2
 selectolax==0.3.16
-click==8.1.6
+click==8.1.7
 python-dateutil==2.8.2
 gunicorn==21.2.0
 # I don't think this needs `[secure]` because it's only used by
@@ -18,7 +18,7 @@ requests==2.31.0
 cachetools==5.3.1

 # Dev things
-black==23.7.0
+black==23.9.1
 flake8==6.1.0
-pytest==7.4.0
+pytest==7.4.2
 httpx==0.24.1

That's it. Then if you want to actually make these upgrades you run:

❯ pip-compile --generate-hashes requirements.in && pip install -r requirements.txt

To install it, download the script from: https://gist.github.com/peterbe/a2b158c39f1f835c0977c82befd94cdf
and put it in your ~/bin and make it executable.
Now go into a directory that has a requirements.in and run Pip-Outdated.py

Categories: FLOSS Project Planets

Stack Abuse: Calculate Mean Across Multiple DataFrames in Pandas

Thu, 2023-09-21 11:03
Introduction

The Pandas library offers a plethora of functions that make data manipulation and analysis super simple (or at least simpler). One such function is the mean() function, which allows you to calculate the average of values in a DataFrame. But what if you're working with multiple DataFrames? In this Byte, we'll explore how to calculate the mean across multiple DataFrames.

Why Calculate Mean Across Multiple DataFrames?

There are numerous scenarios where you might have multiple DataFrames and need to calculate the mean across all of them. For example, you might have data spread across multiple DataFrames due to the size of the data, different data sources, or maybe the data is simply segmented for easier manipulation or storage in files. In these cases, calculating the mean across all these DataFrames can provide a holistic view of the data and can be useful for certain statistical analyses.

Calculating Mean in a Single DataFrame

Before we get into calculating mean across multiple DataFrames, let's first understand how to calculate mean in a single DataFrame. Here's how we'd do it:

import pandas as pd

# Create a DataFrame
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [2, 3, 4, 5, 6],
    'C': [3, 4, 5, 6, 7]
})

# Calculate mean
mean = df.mean()
print(mean)

When you run this code, you'll get the following output:

A    3.0
B    4.0
C    5.0
dtype: float64

In this simple example, the mean() function calculates the mean of each column in the DataFrame.

Extending to Multiple DataFrames

Now that we know how to calculate the mean in a single DataFrame, let's extend this to multiple DataFrames. The easiest approach is to concatenate the DataFrames and then calculate the mean. This can be done using the pd.concat() function.

# Create two more DataFrames
df1 = pd.DataFrame({
    'A': [6, 7, 8, 9, 10],
    'B': [7, 8, 9, 10, 11],
    'C': [8, 9, 10, 11, 12]
})

df2 = pd.DataFrame({
    'A': [11, 12, 13, 14, 15],
    'B': [12, 13, 14, 15, 16],
    'C': [13, 14, 15, 16, 17]
})

# Concatenate DataFrames
df_concat = pd.concat([df, df1, df2])

# Calculate mean
mean_concat = df_concat.mean()
print(mean_concat)

The output will be:

A     8.0
B     9.0
C    10.0
dtype: float64

First we concatenate the three DataFrames using pd.concat(). We then calculate the mean of the new concatenated DataFrame using the mean() function.

Note: The pd.concat() function concatenates along the vertical axis by default. If your DataFrames have the same columns, this is typically what you want.

However, if your DataFrames have different columns, you might want to concatenate along the horizontal axis. You can do this by setting the axis parameter to 1: pd.concat([df1, df2], axis=1). This would be useful if they have different columns and you just want them in a common DataFrame to run analysis on, like with the mean() method.
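As a minimal sketch of that horizontal case (the column names here are made up):

import pandas as pd

df_a = pd.DataFrame({'A': [1, 2, 3]})
df_b = pd.DataFrame({'B': [4, 5, 6]})

# axis=1 places the DataFrames side by side instead of stacking them
df_wide = pd.concat([df_a, df_b], axis=1)
print(df_wide.mean())
# A    2.0
# B    5.0
# dtype: float64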

Use Cases

Calculating the mean across multiple DataFrames in Pandas can help in a variety of scenarios. Let's see a few possible use-cases.

One of the most common scenarios is when you're dealing with a large dataset that's been split into multiple DataFrames for easier handling. In such cases, calculating the mean across these DataFrames can give you a more holistic understanding of your data.

Consider the case of a data analyst working with sales data from a multinational company. The data is split by region, each represented by a separate DataFrame. To get a global perspective on average sales, the analyst would need to calculate the mean across all these DataFrames.

import pandas as pd

# Assume we have three DataFrames for sales data in three different regions
df1 = pd.DataFrame({'sales': [100, 200, 300]})
df2 = pd.DataFrame({'sales': [400, 500, 600]})
df3 = pd.DataFrame({'sales': [700, 800, 900]})

# Calculate the mean across all DataFrames
mean_sales = pd.concat([df1, df2, df3]).mean()
print(mean_sales)

Output:

sales    500.0
dtype: float64

Another use-case could be time-series analysis, where you might have data split across multiple DataFrames, each representing a different time period. Calculating the mean across these DataFrames can provide better insights into trends and patterns over time.
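For example, a sketch with two hypothetical monthly DataFrames might look like this:

import pandas as pd

# Made-up temperature readings for two months
jan = pd.DataFrame({'temperature': [5.0, 6.2, 4.8]})
feb = pd.DataFrame({'temperature': [7.1, 6.9, 7.4]})

# Stack the periods and compute the overall mean
print(pd.concat([jan, feb]).mean())
# temperature    6.233333
# dtype: float64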

Conclusion

In this Byte, we calculated the mean across multiple DataFrames in Pandas. We started by understanding the calculation of mean in a single DataFrame, then extended this concept to multiple DataFrames. We also pointed out some use-cases where this technique would be particularly useful, like when dealing with split datasets or conducting time-series analysis.

Categories: FLOSS Project Planets

Stack Abuse: Convert Index of a Pandas DataFrame into a Column in Python

Thu, 2023-09-21 09:31
Introduction

There are times when using Pandas that you may find yourself needing to convert the row index to a column of its own. This may be a useful operation for a couple of reasons, which we'll see later in this Byte.

DataFrames and Indexing in Pandas

Pandas is a very popular data manipulation library in Python. It has two key data structures - Series and DataFrame. A DataFrame is basically just a table of data, similar to an Excel spreadsheet. Each DataFrame has an index, which you can think of as a special column that identifies each row. By default, the index is a range of integers from 0 to n-1, where n is the number of rows in the DataFrame.

Here's a basic DataFrame:

import pandas as pd

data = {
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 24, 35, 32],
    'City': ['New York', 'London', 'Paris', 'Berlin']
}
df = pd.DataFrame(data)
print(df)

Output:

    Name  Age      City
0   John   28  New York
1   Anna   24    London
2  Peter   35     Paris
3  Linda   32    Berlin

The leftmost column (0,1,2,3) is the index of this DataFrame.

Why Convert the Index into a Column?

So why would we want to convert the index into a column? Well, sometimes the index of a DataFrame can contain valuable information that we want to utilize as part of our data analysis. If our DataFrame is time series data, it's possible that the index could be the timestamp (or relative time since the start of the series). By converting the index into a column, we can then perform operations on it just like any other column.

Convert Index into a Column

Now let's see how we can actually convert the index of a DataFrame into a column. We'll use the reset_index() function provided by Pandas, which generates a new DataFrame or Series with the index reset.

Here are the steps:

  1. Create a DataFrame (or use an existing one).
  2. Call the reset_index() function on the DataFrame.
  3. By default the old index is kept as a new column (drop=False is the default); pass drop=True to discard it instead (shown below).

Here's an example:

import pandas as pd

# Step 1: Create a DataFrame
data = {
    'Name': ['John', 'Anna', 'Peter', 'Linda'],
    'Age': [28, 24, 35, 32],
    'City': ['New York', 'London', 'Paris', 'Berlin']
}
df = pd.DataFrame(data)

# Step 2: Reset the index
df_reset = df.reset_index()
print(df_reset)

Output:

   index   Name  Age      City
0      0   John   28  New York
1      1   Anna   24    London
2      2  Peter   35     Paris
3      3  Linda   32    Berlin

As you can see, the old index has been converted into a column named "index". The DataFrame now has a new default integer index.
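If you don't need the old index at all, reset_index() can also discard it instead of turning it into a column:

# drop=True discards the old index rather than adding it as a column
df_dropped = df.reset_index(drop=True)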

Other Ways to Convert Index into a Column

While the reset_index() function is a perfectly good way to convert the index into a column, there are also some other ways to do the same thing.

Another way to do this is to manually create and assign a new column. We can create a new column by using the syntax:

df['new_column'] = data

Assuming data is a sequence of values, we'll now have a new column containing that data. We can leverage this, along with df.index, to create a new column of index values:

df = pd.DataFrame({
    'A': ['foo', 'bar', 'baz'],
    'B': ['one', 'two', 'three']
})
df['idx'] = df.index
print(df)

This will also result in a new column with the index values:

     A      B  idx
0  foo    one    0
1  bar    two    1
2  baz  three    2

Conclusion

In this Byte, we saw how to convert the index of a DataFrame into a column using Python's Pandas library. We looked at the reset_index() function, as well as an alternative approach that assigns df.index to a new column directly. We also discussed some of the situations where converting the index into a column can be useful.

Categories: FLOSS Project Planets

PyBites: 6 Cool Things You Can Do With The Functools Module

Thu, 2023-09-21 07:16

In this article let’s look at the functools Standard Library module and 6 cool things you can do with it (be warned, a lot of decorators are coming your way!) …

1. Cache (“memoize”) things

You can use the @cache decorator (added in Python 3.9; equivalent to @lru_cache(maxsize=None)) as a “simple lightweight unbounded function cache”.

The classic example is calculating a Fibonacci series where the intermediate results are cached, speeding up the calculation significantly:

from functools import cache

@cache
def fibonacci(n: int) -> int:
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

for i in range(40):
    print(fibonacci(i))

On my system this code takes 0.02s to complete.

However if I comment the @cache decorator it takes 28.30s because of all the repeated calculations!

Hence caching is especially useful and crucial for tasks with expensive repeat computations.

New to caching? Check out our YouTube video.

You can do the same for properties using @cached_property.
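As a quick sketch of @cached_property (the class and data here are made up for illustration):

from functools import cached_property

class Dataset:
    def __init__(self, values):
        self.values = values

    @cached_property
    def total(self):
        # Computed once on first access, then stored on the instance
        print("computing...")
        return sum(self.values)

ds = Dataset(range(1_000_000))
print(ds.total)  # prints "computing..." and then the sum
print(ds.total)  # served from the cache, no recomputation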

2. Write less dunder methods

Using the @total_ordering decorator you can write the __eq__() dunder and one of __lt__(), __le__(), __gt__(), or __ge__(), so only two, and it will provide the other ones automatically for you. Less code, nice automation.

As per the docs it does come with the cost of slower execution and more complex stack traces. Also this decorator makes no attempt to override methods already declared in the class or its superclasses.

The term “dunder” is colloquially derived from “double underscore”. In the context of Python, dunder methods, also known as “magic methods” or “special methods,” are a set of predefined methods with double underscores at the beginning and end of their names (e.g., __init__, __str__). Learn more about them in my Dan Bader guest article here and practice with them on our platform.
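Here is a minimal sketch of @total_ordering in action (the Version class is made up for illustration):

from functools import total_ordering

@total_ordering
class Version:
    def __init__(self, major, minor):
        self.major, self.minor = major, minor

    def __eq__(self, other):
        return (self.major, self.minor) == (other.major, other.minor)

    def __lt__(self, other):
        return (self.major, self.minor) < (other.major, other.minor)

# __le__, __gt__ and __ge__ are filled in automatically
print(Version(3, 11) > Version(3, 9))   # True
print(Version(2, 7) >= Version(3, 0))   # False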

3. Freeze functions

partial() lets you put a basic wrapper around an existing function so that you can set a default value where there normally wouldn’t be one.

For example, if I wanted the print() function to always end with a comma instead of a newline, I could use partial() as follows:

from functools import partial

print_no_newline = partial(print, end=', ')

# Normal print() behavior:
for _ in range(3):
    print('test')
# test
# test
# test

# My new frozen print() one:
for _ in range(3):
    print_no_newline('test')
# test, test, test,

Another example is freezing the pow() built-in to always square by fixating the exp argument to 2:

from functools import partial

# Using partial with the built-in pow function
square = partial(pow, exp=2)

# Testing the new function
print(square(4))  # Outputs: 16
print(square(5))  # Outputs: 25

By using partial(), you can simplify repetitive calls, enhance code clarity, and create reusable components with preset configurations.

There is also partialmethod() which behaves like partial() but is designed to be used as a method definition rather than being directly callable.

4. Use generic functions

With the introduction of PEP 443, Python added support for “single-dispatch generic functions”.

These allow you to define a set of functions (variants) for one main function, where each variant handles a different type of argument.

The @singledispatch decorator orchestrates this behavior, enabling the function to change its behavior based on the type of its argument.

Let’s take a look at a simple example:

from functools import singledispatch

@singledispatch
def process(data):
    """Default behavior for unrecognized types."""
    print(f"Received data: {data}")

@process.register(str)
def _(data):
    """Handle string objects."""
    print(f"Processing a string: {data}")

@process.register(int)
def _(data):
    """Handle integer objects."""
    print(f"Processing an integer: {data}")

@process.register(list)
def _(data):
    """Handle list objects."""
    print(f"Processing a list of length: {len(data)}")

# Testing the generic function
process(42)         # Outputs: Processing an integer: 42
process("hello")    # Outputs: Processing a string: hello
process([1, 2, 3])  # Outputs: Processing a list of length: 3
process(2.5)        # Outputs: Received data: 2.5

In the example above, when we call the process function, the appropriate registered function is invoked based on the type of the argument passed.

For data types that do not have a registered function, the default behavior (defined under the main @singledispatch decorated function) is used.

Such a design can make your code more organized and clear, especially when one function needs to handle various data types differently.

Note that the repeated use of _ as the function name is idiomatic for discarding values, see also our YouTube video: 5 use cases for underscores in Python.

See also our Bite exercise #76.

5. Help writing better decorators

When writing a decorator in Python, it’s best practice to use functools.wraps() to not lose the docstring and other metadata of the function you are decorating:

from functools import wraps

def mydecorator(func):
    @wraps(func)
    def wrapped(*args, **kwargs):
        result = func(*args, **kwargs)
        return result
    return wrapped

@mydecorator
def hello(name: str):
    """Print a salute message"""
    print(f"Hello {name}")

# thanks to functools.wraps metadata is preserved:
print(hello.__doc__)          # 'Print a salute message'
print(hello.__annotations__)  # {'name': <class 'str'>}

# without functools.wraps:
print(hello.__doc__)          # None
print(hello.__annotations__)  # {}

By preserving the metadata of the decorated function like this, it becomes easier for developers to understand the purpose and usage of the function.

6. Aggregate data or transform cumulatively

functools.reduce(func, iterable) is a function that accumulates results by successively applying a function to the elements of an iterable, from left to right.

Note reduce() was moved into the functools module in Python 3, in Python 2 reduce() was a built-in function.

This can be useful in various scenarios where you want to aggregate data or transform it in a cumulative way.

Here is an example where I use it to aggregate operator module operations on a list of numbers:

from functools import reduce
import operator

numbers = list(range(1, 11))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

print(operator.add(1, 2))  # 3

print(reduce(operator.add, numbers))      # 55
print(reduce(operator.sub, numbers))      # -53
print(reduce(operator.mul, numbers))      # 3628800
print(reduce(operator.truediv, numbers))  # 2.7557319223985893e-07

Conclusion

In conclusion, the functools module in Python’s Standard Library is a treasure trove of tools, especially for those who frequently work with functions and decorators.

Whether you’re looking to optimize with caching, streamline your class comparisons, wrap functions in flexible ways, or even handle function dispatch based on argument types, functools has you covered.

As we’ve seen, these utilities can simplify your code, boost performance, and overall make your code more Pythonic.

Next time you find yourself reaching for a function-based solution, remember to peek into functools — there might just be a tool waiting to make your life easier.

Keep calm and code in Python!

More Python tips

We distilled 250 of these kind of tips in our Pybites Python Tips Book. Timeless, real world and practical tips that will make your code more Pythonic.

Get the Book

Categories: FLOSS Project Planets

Stack Abuse: Getting Today's Date in YYYY-MM-DD in Python

Wed, 2023-09-20 12:48
Introduction

Whether you're logging events or measuring execution times, you'll often find yourself working with dates. In Python, the built-in datetime module makes it easy to get the current date and time, format it, or even do time math. In this Byte, we'll focus on how to get today's date and format it in the YYYY-MM-DD format.

Dates and Times in Python

Python provides the datetime module in its standard library for dealing with dates and times. This module includes several classes for managing dates, times, timedeltas, and more. The two classes we'll be focusing on in this Byte are datetime and date.

The datetime class is a combination of a date and a time, and provides a wide range of methods and attributes. The date class, on the other hand, is solely concerned with dates (year, month, and day).

Here's a quick example of how you can create a datetime object:

from datetime import datetime

# create a datetime object
dt = datetime(2023, 9, 19, 23, 59, 59)
print(dt)  # Output: 2023-09-19 23:59:59

In this snippet, we're creating a datetime object for the last second of Sept. 19th, 2023. But how do we get the current date?

Getting Today's Date in Python

Python's datetime module provides a method called today() that returns the current date and time as a datetime object. Here's how you can use it:

from datetime import datetime

# get today's date
today = datetime.today()
print(today)  # Output: 2023-09-19 22:17:08.845750

In the above example, the today() method returned the current date and time. However, if you only need the date, you can use the date() method of a datetime object to get a date object:

from datetime import datetime

# get today's date
today = datetime.today().date()
print(today)  # Output: 2023-09-19

Formatting Date as YYYY-MM-DD

The date and datetime objects provide a method called strftime() that allows you to format the date and time in various ways. The strftime() method takes a format string where each %-prefixed character is replaced with data from the date and time.

To format a date in the YYYY-MM-DD format, you can use the %Y, %m, and %d format codes:

from datetime import datetime

# get today's date
today = datetime.today().date()

# format date as YYYY-MM-DD
formatted_date = today.strftime('%Y-%m-%d')
print(formatted_date)  # Output: 2023-09-19

In the format string, %Y is replaced with the four-digit year, %m is replaced with the two-digit month, and %d is replaced with the two-digit day.

Note: The strftime() method returns a string, so you can't use date methods or attributes on the result. If you need to manipulate the date after formatting it, keep a reference to the original date or datetime object.

And that's it! You now know how to get today's date in Python and format it in the YYYY-MM-DD format.
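As a side note, date objects also have an isoformat() method that produces exactly this YYYY-MM-DD format, so you can skip the format string entirely:

from datetime import date

# isoformat() returns the date in ISO 8601 format: YYYY-MM-DD
print(date.today().isoformat())  # e.g. 2023-09-19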

Other Ways to Get Today's Date

While the datetime module is a powerful tool for working with dates and times in Python, it's not the only way to get today's date. Let's explore some other methods.

One alternative is using the time module, another built-in Python module for dealing with time. Here's how you can use it to get today's date:

import time

today = time.strftime("%Y-%m-%d")
print(today)  # Output: 2023-09-19

When you run this code, you'll get the current date output in the 'YYYY-MM-DD' format. The strftime method formats the time according to the given format string.

Note: While the time module can give you the current date, it does not have as many date and time manipulation capabilities as the datetime module. It's generally better to use datetime for more complex date and time tasks.

Another option is to use an external library, like pendulum. Pendulum is a Python library that simplifies and enhances date handling, even beyond what's available in datetime.

Here's how you can get today's date with pendulum:

import pendulum

today = pendulum.now().to_date_string()
print(today)  # Output: 2023-09-19

This will also give you the current date in the 'YYYY-MM-DD' format. The now method gets the current date and time, and the to_date_string method formats it as a string.

Note: Remember that pendulum is not a built-in module, so you'll need to install it using pip (pip install pendulum) before you can use it.

Conclusion

We've covered a variety of ways to get today's date in Python and format it as 'YYYY-MM-DD'. We started off with the basics of dates and times in Python, then moved on to getting today's date using the datetime module. We also looked at how to format the date in the 'YYYY-MM-DD' format. Finally, we explored some other methods for getting the current date, using both the built-in time module and the external pendulum library.

Categories: FLOSS Project Planets

Real Python: How to Catch Multiple Exceptions in Python

Wed, 2023-09-20 10:00

In this tutorial, you’ll learn various techniques to catch multiple exceptions with Python. To begin with, you’ll review Python’s exception handling mechanism before diving deeper and learning how to identify what you’ve caught, sometimes ignore what you’ve caught, and even catch lots of exceptions.

Python raises an exception when your code encounters an occasional but not unexpected error. For example, this will occur if you try to read a missing file. Because you’re aware that such exceptions may occur, you should write code to deal with, or handle, them. In contrast, a bug happens when your code does something illogical, like a miscalculation. Bugs should be fixed, not handled. This is why debugging your code is important.

When your Python program encounters an error and raises an exception, your code will probably crash, but not before providing a message within a traceback indicating what the problem is:

>>> 12 / "five"
Traceback (most recent call last):
  ...
TypeError: unsupported operand type(s) for /: 'int' and 'str'

Here, you’ve tried to divide a number by a string. Python can’t do this, so it raises a TypeError exception. It then shows a traceback reminding you that the division operator doesn’t work with strings.

To allow you to take action when an error occurs, you implement exception handling by writing code to catch and deal with exceptions. Better this than your code crashing and scaring your user. To handle exceptions, you use the try statement. This allows you to monitor code for exceptions and take action should they occur.

Most try statements use try … except blocks as follows:

  • The try block contains the code that you wish to monitor for exceptions. Any exceptions raised within try will be eligible for handling.

  • One or more except blocks then follow try. These are where you define the code that will run when exceptions occur. In your code, any raised exceptions trigger the associated except clause. Note that where you have multiple except clauses, your program will run only the first one that triggers and then ignore the rest.

To learn how this works, you write a try block to monitor three lines of code. You include two except blocks, one each for ValueError and ZeroDivisionError exceptions, to handle them should they occur:

# handler_statement.py
try:
    first = float(input("What is your first number? "))
    second = float(input("What is your second number? "))
    print(f"{first} divided by {second} is {first / second}")
except ValueError:
    print("You must enter a number")
except ZeroDivisionError:
    print("You can't divide by zero")

The code that you’re monitoring asks the user to enter both numbers and then prints the division. You’ll cause a ValueError if you don’t enter a number in the first two lines of code. When the float() function tries to convert your input into a float, a ValueError occurs if this isn’t possible. A ZeroDivisionError occurs if you enter 0 as the second number. When the print() function attempts to divide by zero, you get a ZeroDivisionError.

Having written the above code, you then test each of the control flows. To do this, you first provide perfectly valid data, then provide a string for the second number, and finally provide a 0 for the second number:

$ python handler_statement.py
What is your first number? 10
What is your second number? 5
10.0 divided by 5.0 is 2.0

$ python handler_statement.py
What is your first number? 10
What is your second number? "five"
You must enter a number

$ python handler_statement.py
What is your first number? 10
What is your second number? 0
You can't divide by zero

The good news is that your code never crashes. This is because your code has successfully handled the exceptions.

First, you provided acceptable data. When you look at the output, you can see that the program flows only through try. You haven’t invoked any of the except clauses because Python hasn’t raised any exceptions.

You then cause a ValueError by entering a string. This happens because the float() function can’t convert your "five" into a float. Your program flow now becomes try then except ValueError. Despite raising a ValueError, your code has handled it gracefully. Your users will no longer experience a worrying crash.

In your final test run, you try to divide by 0. This time, you cause a ZeroDivisionError because Python doesn’t like your enthusiasm for Riemann spheres and infinity. This time, the program flow is try then except ZeroDivisionError. Again, your code has handled your exception gracefully. Most of your users will be happy with this, though the mathematicians may be disappointed.

After the error handling is complete, your program’s flow would usually continue with any code beyond the try statement. In this case, there’s none, so the program simply ends.

As an exercise, you might like to try entering a string as the first input and a number as the second. Can you predict what will happen before you try it out?

Note: Your code catches only ZeroDivisionError or ValueError exceptions. Should any others be raised, it’ll crash as before. You could get around this by creating a final except Exception clause to catch all other exceptions. However, this is bad practice because you might catch exceptions that you didn’t anticipate. It’s better to catch exceptions explicitly and customize your handling of them.

Up to this point, you’ve reviewed how to catch exceptions individually using the try statement. In the remainder of this tutorial, you’ll learn about ways that you can catch multiple exceptions. Time to dive a bit deeper.

Get Your Code: Click here to download the sample code that shows you how to catch multiple exceptions in Python.

How to Catch One of Several Possible Python Exceptions

Catching individual exceptions in separate except clauses is the way to go if you need to perform different handling actions on the different exceptions being caught. If you find that you’re performing the same actions in response to different exceptions, then you can produce simpler, more readable code by handling multiple exceptions in a single except clause. To do this, you specify the exceptions as a tuple within the except statement.

Suppose you now like the idea of your earlier code being able to handle both exceptions in a single line. To do this, you decide to rewrite your code as follows:
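The excerpt ends here, but a minimal sketch of that tuple form, reusing the earlier example (not necessarily the article's exact code), looks like this:

try:
    first = float(input("What is your first number? "))
    second = float(input("What is your second number? "))
    print(f"{first} divided by {second} is {first / second}")
except (ValueError, ZeroDivisionError):
    print("Please enter numbers, and don't divide by zero")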

Read the full article at https://realpython.com/python-catch-multiple-exceptions/ »


Categories: FLOSS Project Planets

Stack Abuse: Find the Index of an Item in a List in Python

Wed, 2023-09-20 10:00
Introduction

In this Byte we're going to take a look at one of the most common tasks you can do with a list: finding the index of an item. Luckily, this is usually a pretty simple task - but there are a few potential pitfalls and nuances that you need to be aware of. So let's get started!

Lists in Python

Lists are a very commonly used data type in Python. They are mutable, ordered collections of items, which means you can add, remove, or change items after the list is created. They are used so often because they're incredibly versatile and can hold any type of object: numbers, strings, other lists, and so on. Here's a simple example of a list in Python:

fruits = ['apple', 'banana', 'cherry', 'date', 'elderberry']

In this list, 'apple' is at index 0, 'banana' at index 1, and so on. Remember, Python uses zero-based indexing, which means the first element is at index 0, not 1.

How to Find the Index of an Item

So, how do we find the index of a particular item in a list? Python provides a couple of different ways to do this, and we're going to look at two of them: the index() method and the enumerate() function.

Using the index() Method

The index() method is probably the most straightforward way to find the index of an item in a list. You call this method on a list and pass the item you're looking for as an argument. Here's how you would use it:

fruits = ['apple', 'banana', 'cherry', 'date', 'elderberry']
index = fruits.index('cherry')
print(index)

When you run this code, it will output:

2

The index() method returns the index of the first occurrence of the item. If the item is not in the list, it raises a ValueError.

Note: The index() method only returns the first occurrence of the item. If the item appears more than once in the list and you want to find all of its indexes, you'll need to use a different approach, which we'll cover in another section of this Byte.

Using the enumerate() Function

In Python, the enumerate() function adds a counter to an iterable and returns it as an enumerate object. This can be useful when you want to get the index of an item in a list. Let's see how it works:

fruits = ['apple', 'banana', 'cherry', 'date']
for i, fruit in enumerate(fruits):
    print(f"The index of {fruit} is {i}")

This will output:

The index of apple is 0
The index of banana is 1
The index of cherry is 2
The index of date is 3

The enumerate() function makes our code cleaner and more Pythonic. Instead of manually incrementing a counter, we let Python handle it for us.

To actually find an item, we might do something like this:

fruits = ['apple', 'banana', 'cherry', 'date']
idx = None
for i, fruit in enumerate(fruits):
    if fruit == 'cherry':
        idx = i
        break
print(idx)

Again, this code would print 2 to the console.

This method is useful when it's more difficult to check for a particular item. For example, you can't easily find a dict with the index method. Whereas with enumerate, you can easily implement your own custom code to check for the item you're looking for.

For example:

people = [
    {'name': 'John', 'age': 27},
    {'name': 'Alice', 'age': 23},
    {'name': 'Bob', 'age': 32},
    {'name': 'Lisa', 'age': 28},
]
idx = None
for i, person in enumerate(people):
    if person['name'] == 'Lisa':
        idx = i
        break
print(idx)

3

Handling Errors

When dealing with lists and indices in Python, there are two common errors you might encounter: IndexError: list index out of range and ValueError: item is not in list. Let's take a closer look at each of these.

IndexError: List Index Out of Range

This error happens when you try to access an index that is outside the bounds of the list. It's a common mistake, especially when dealing with loops or complex list manipulations.

fruits = ['apple', 'banana', 'cherry']
print(fruits[3])

This will result in:

IndexError: list index out of range

To prevent this error, always make sure that the index you're trying to access exists in the list.

ValueError: Item is not in List

This error occurs when you try to find the index of an item that doesn't exist in the list using the index() method.

fruits = ['apple', 'banana', 'cherry']
print(fruits.index('date'))

This will result in:

ValueError: 'date' is not in list

To prevent this error, you can use the in keyword to check if the item exists in the list before trying to find its index.

fruits = ['apple', 'banana', 'cherry']
if 'date' in fruits:
    print(fruits.index('date'))
else:
    print("'date' is not in the list.")

This will output:

'date' is not in the list.

Remember, you should always try to handle these errors gracefully in your code. This not only prevents your program from crashing but also improves the user experience.

Finding the Index of All Occurrences of an Item

Finding the index of a single occurrence of an item in a Python list is a relatively simple task, as we've seen. But what if we want to find the indices of all occurrences of an item? In this case, we can use a combination of Python's built-in functions and list comprehension.

Consider the following list:

numbers = [1, 2, 3, 2, 4, 2, 5, 6, 2, 7]

In this list, the number 2 appears four times. Let's find all its occurrences:

indices = [i for i, x in enumerate(numbers) if x == 2]
print(indices)

This script will output:

[1, 3, 5, 8]

Here, we're using a list comprehension to create a new list (indices). The enumerate() function is used to return both the index and value from numbers. If the value (x) is equal to 2, the index (i) is added to the indices list.

Conclusion

Throughout this Byte, we've explored how to find the index of an item in a Python list using various methods. We've learned about the index() method and the enumerate() function, and we've also seen how to handle common errors that can occur when trying to find an index. Lastly, we even showed how to find all occurrences of an item in a list.

Categories: FLOSS Project Planets

Mike Driscoll: Learning About Code Metrics in Python with Radon

Wed, 2023-09-20 08:48

There are many different tools that you can install to help you be a better Python programmer. For example, you might install pytest so that you can do unit tests of your code. Or you might install Ruff, a super fast Python linter. The focus of this article is on another tool called Radon that helps you compute code metrics.

You can use Radon to help you find complex code in your code base. This is known as Cyclomatic Complexity or McCabe’s Complexity. According to Radon’s documentation:

“Cyclomatic Complexity corresponds to the number of decisions a block of code contains plus 1. This number (also called McCabe number) is equal to the number of linearly independent paths through the code. This number can be used as a guide when testing conditional logic in blocks.”

For example, if the number equals three, you will probably need to write at least three unit tests to have complete code coverage. Not only is this useful for figuring out how many tests to write, the cyclomatic complexity can tell you when it is time to refactor your code. You can read the full details of how Radon calculates complexity in their documentation.

Experienced developers are often able to know when to refactor from their own experience, but newer engineers may need the help that a tool like this provides.

You can also use a tool like Radon in your CI/CD system to prevent developers from merging in overly complex code.

Installing Radon

You can install Radon using pip. Here is an example:

python -m pip install radon

Now, let’s learn how to use Radon on your code!

Basic Radon Usage

You can call the radon application on the command line. Radon accepts multiple flags to control what kinds of output you receive.

Here are the four commands that radon currently can use:

  • cc: compute Cyclomatic Complexity
  • raw: compute raw metrics
  • mi: compute Maintainability Index
  • hal: compute Halstead complexity metrics

Let’s try running radon against the popular Black package (Black is a popular code formatted for Python).

Here is the command to run:

PS C:\Users\Mike\AppData\Local\Programs\Python\Python311\Lib\site-packages\black> radon cc . -a -nc
brackets.py
    F 225:0 is_split_before_delimiter - F
    M 70:4 BracketTracker.mark - C
comments.py
    F 140:0 convert_one_fmt_off_pair - D
    F 208:0 generate_ignored_nodes - C
    F 253:0 _generate_ignored_nodes_from_fmt_skip - C
concurrency.py
    F 120:0 schedule_formatting - C
files.py
    F 309:0 gen_python_files - C
    F 46:0 find_project_root - C
linegen.py
    F 1133:0 normalize_invisible_parens - D
    F 747:0 _maybe_split_omitting_optional_parens - D
    F 1453:0 generate_trailers_to_omit - D
    F 879:0 bracket_split_build_line - D
    M 396:4 LineGenerator.visit_STRING - C
    F 509:0 transform_line - C
    F 997:0 delimiter_split - C
    F 1529:0 run_transformer - C
    F 1355:0 maybe_make_parens_invisible_in_atom - C
    F 632:0 left_hand_split - C
    M 289:4 LineGenerator.visit_simple_stmt - C
    F 699:0 _first_right_hand_split - C
    F 1313:0 remove_with_parens - C
lines.py
    M 569:4 EmptyLineTracker._maybe_empty_lines - E
    M 646:4 EmptyLineTracker._maybe_empty_lines_for_class_or_def - E
    F 755:0 is_line_short_enough - D
    C 514:0 EmptyLineTracker - D
    F 882:0 can_omit_invisible_parens - C
    M 300:4 Line.has_magic_trailing_comma - C
    F 846:0 can_be_split - C
    M 62:4 Line.append - C
    M 529:4 EmptyLineTracker.maybe_empty_lines - C
    M 228:4 Line.contains_uncollapsable_type_comments - C
    M 362:4 Line.append_comment - C
nodes.py
    F 174:0 whitespace - F
    F 616:0 is_simple_decorator_trailer - C
    F 573:0 is_one_sequence_between - C
parsing.py
    F 164:0 stringify_ast - C
    F 57:0 lib2to3_parse - C
strings.py
    F 173:0 normalize_string_quotes - C
trans.py
    M 792:4 StringParenStripper.do_match - D
    M 1388:4 StringSplitter.do_transform - D
    M 2070:4 StringParenWrapper.do_transform - D
    M 1064:4 BaseStringSplitter._get_max_string_length - D
    M 1334:4 StringSplitter.do_splitter_match - C
    M 686:4 StringMerger._validate_msg - C
    M 542:4 StringMerger._merge_one_string_group - C
    F 1219:0 iter_fexpr_spans - C
    C 772:0 StringParenStripper - C
    M 378:4 StringMerger.do_match - C
    C 1801:0 StringParenWrapper - C
    M 1859:4 StringParenWrapper.do_splitter_match - C
    F 84:0 hug_power_op - C
    C 358:0 StringMerger - C
    M 1174:4 BaseStringSplitter._prefer_paren_wrap_match - C
    M 1985:4 StringParenWrapper._assign_match - C
    M 2032:4 StringParenWrapper._dict_or_lambda_match - C
__init__.py
    F 1156:0 get_features_used - F
    F 443:0 main - E
    F 616:0 get_sources - D
    F 749:0 reformat_one - C
    F 121:0 read_pyproject_toml - C
    F 1094:0 _format_str_once - C
    F 800:0 format_file_in_place - C

62 blocks (classes, functions, methods) analyzed.
Average complexity: C (19.741935483870968)

In this example, you asked radon to give you the Cyclomatic Complexity (cc) of the Black package. You also tacked on the -a or average flag and the -n flag, which lets you set the minimum complexity rank to display. The default is “a”, but in this example, you set the minimum to “c”, meaning it will show ranks C to F.

At this point, you can go through the code and start looking at the functions, methods, and classes to see what a C-ranked portion of code looks like compared with an F-ranked one. Give it a try and you’ll soon learn how to use radon to discover how complex your code is.

There are more command line options than what is shown here. Check out the full listing of additional flags in radon’s documentation.

You can also have radon measure the maintainability of your code by using the mi command. Here’s an example:

PS C:\Users\Mike\AppData\Local\Programs\Python\Python311\Lib\site-packages\black> radon mi .
brackets.py - A
cache.py - A
comments.py - A
concurrency.py - A
const.py - A
debug.py - A
files.py - A
handle_ipynb_magics.py - A
linegen.py - C
lines.py - C
mode.py - A
nodes.py - C
numerics.py - A
output.py - A
parsing.py - A
report.py - A
rusty.py - A
strings.py - A
trans.py - C
_width_table.py - A
__init__.py - C
__main__.py - A

The results of the mi command are similar to the cc command as radon will once again use letter ranks on your files to help you grade your code’s maintainability.

Radon can also calculate some raw metrics about your code by using the raw command. Here are the metrics that this command will calculate for you:

  • LOC: the total number of lines of code
  • LLOC: the number of logical lines of code
  • SLOC: the number of source lines of code – not necessarily corresponding to the LLOC [Wikipedia]
  • comments: the number of Python comment lines (i.e. only single-line comments #)
  • multi: the number of lines representing multi-line strings
  • blank: the number of blank lines (or whitespace-only ones)

Let’s try running the raw command against Black:

PS C:\Users\wheifrd\AppData\Local\Programs\Python\Python311\Lib\site-packages\black> radon raw .
brackets.py
    LOC: 375
    LLOC: 197
    SLOC: 253
    Comments: 4
    Single comments: 11
    Multi: 47
    Blank: 64
    - Comment Stats
        (C % L): 1%
        (C % S): 2%
        (C + M % L): 14%
cache.py
    LOC: 97
    LLOC: 59
    SLOC: 54
    Comments: 2
    Single comments: 5
    Multi: 14
    Blank: 24
    - Comment Stats
        (C % L): 2%
        (C % S): 4%
        (C + M % L): 16%
comments.py
    LOC: 329
    LLOC: 202
    SLOC: 212
    Comments: 33
    Single comments: 29
    Multi: 42
    Blank: 46
    - Comment Stats
        (C % L): 10%
        (C % S): 16%
        (C + M % L): 23%

The example above has been truncated since there is a LOT of output. But this snippet gives you a good idea of what to expect when you run the raw command against your code.
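If you would rather call Radon from Python than from the command line, it also exposes a programmatic API. Here is a rough sketch based on radon's documented cc_visit() and analyze() helpers (double-check the current docs for exact signatures):

from radon.complexity import cc_visit
from radon.raw import analyze

source = '''
def classify(n):
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    return "positive"
'''

# Cyclomatic complexity per block (functions, methods, classes)
for block in cc_visit(source):
    print(block.name, block.complexity)

# Raw metrics (LOC, LLOC, SLOC, comments, ...) as a namedtuple
print(analyze(source))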

Wrapping Up

Radon is a great tool to add to your toolbox. You can use it to help you figure out which parts of your source code are getting too complex. You can also use Radon in your CI/CD infrastructure to prevent developers from checking in overly complex code. You might use this for your own open-source or commercial projects at your employer.

Give it a try and see if you like Radon!

The post Learning About Code Metrics in Python with Radon appeared first on Mouse Vs Python.

Categories: FLOSS Project Planets

PyBites: Your First Python Open Source Contribution: A Step-By-Step Guide

Wed, 2023-09-20 06:03
Introduction

I recently re-engaged with one of my open source projects and it was a rewarding experience.

It was a Pybites project I had written the core for years ago, but thanks to some amazing Pythonistas in our community it became a way more mature tool so I had to get acquainted again.

I was resolving an issue that had been causing some disturbance in our community, hence fixing it was quite satisfying.

Contributing to open source is amazing, and with the amount of projects out there your contributions are necessary.

It’s also one of the best ways to become a better Pythonista.

Beyond personal development, open source matters to the tech industry as a whole: its ethos, and the fact that it powers much of the modern web and software infrastructure, make contributing even more significant.

Starting out can also be overwhelming though. So in this guide I will walk you through the steps to begin your journey as an open source contributor.

Why Contribute to Open Source?

Let’s first look at why this makes you a better Python programmer …

1. Develop Your Technical Skills
  • Dive into a Code Base: Fixing bugs often involves understanding someone else’s code. This is an excellent practice for reading and understanding code quickly.
  • Think Design: When implementing a new feature, you must consider how your code fits into the existing architecture, honing your design skills. You don’t learn that from tutorials (!), nor from staying in your own silo / only working on your own projects. This is a critical skill, since joining a software team you will, 9 times out of 10, inherit an existing code base!
  • Learn from Code Reviews: Peer reviews expose you to different perspectives and best practices, and they can be a valuable source of learning. There was an before and after in my software career when I underwent code reviews by more senior peers. We spoke about this here.
2. Network and Collaborate with Other Developers
  • Forge Professional Relationships: Collaboration on projects allows you to meet like-minded individuals, fostering friendships and networking opportunities.
  • Discover New Opportunities: Continuous interaction with other developers can open doors to collaborations, partnerships, and job offers. This sounds cliché but it’s incredible what “one”— just one — contact can do for your entire career, so really grab this opportunity.
3. Enhance Your Resume and Portfolio
  • Gain Practical Experience: Employers appreciate candidates who can hit the ground running. Contributing to projects shows that you can work with real-world code.
  • Showcase Your Commitment: Active contributions signal a mentality of continuous learning and a desire to improve software for everyone. Fun fact: most people who reach out for help have a deep desire to give back. Well, there is no better way than doing it with code, no?
4. Mastering Open-Source Communication
  • Engage with Tact and Clarity: In the open-source world, effective communication is as vital as the code you contribute.
    Whether you’re raising an issue, commenting on a PR, or just interacting in discussions, how you convey your thoughts can determine the reception and success of your contribution.
    Clear, positive, and respectful dialogue fosters collaboration and minimizes misunderstandings. Remember, every interaction is an opportunity to build relationships and showcase your professionalism.
    More on the communication part in a bit …
Overcoming the Fear: “Am I Good Enough?”

1. Start Small
  • Simple Contributions: Even correcting typos in documentation or improving the readability of code / comments can be a great way to get your feet wet.

    Many beginners experience what’s commonly known as ‘imposter syndrome’—the feeling that they’re not qualified or deserving of their accomplishments. If this sounds familiar, you’re not alone. In fact, we’ve detailed 9 actionable tips to combat imposter syndrome in this article. Understanding and addressing these feelings can make your open source journey more fulfilling and less daunting.
2. What Do You Care About?
  • Personal Connection: Contributing to a project that excites you will make the process more enjoyable and sustainable. Our portfolio assessment can help you here too …
3. Familiarize Yourself with Git and GitHub
  • Learn the Basics: Plenty of online tutorials can guide you through the process of creating branches, committing changes, and opening pull requests, but the best way to learn the skills is to use them in a real-world setting. Open source gives you this playground.
4. Review the Project’s Guidelines
  • Know the Rules: Most projects have a CONTRIBUTING.md file that outlines how to set up the development environment and submit contributions.
5. Choosing a License for Contributions
  • Understanding the type of license a project uses is crucial before making contributions. Licenses define how the software can be used, modified, and distributed.
  • Some commonly used open-source licenses are MIT License, GNU General Public License (GPL), Apache License 2.0, BSD Licenses and Creative Commons Licenses.
  • Before contributing to or starting an open-source project, ensure you’re comfortable with its license. Some licenses may have implications for how your contributions are used in the future.
6. Effective Communication

This is a really important one so here are 9 tips:

  • Clear and Concise Messaging: When commenting on issues or PRs, be clear about what you’re addressing. Avoid using jargon unless it’s commonly understood in the project’s context.
  • Stay Positive and Respectful: Remember that there’s a human on the other side of the screen. Always approach discussions with a positive attitude and avoid confrontational or aggressive language.
  • Provide Context: If you’re raising an issue or making a contribution, provide as much relevant context as possible. This might include screenshots, error messages, or references to related issues.
  • Ask Questions: If you’re unsure about something, it’s better to ask than assume. However, do ensure you’ve checked available documentation or previous issues before asking.
  • Acknowledge Feedback: If someone provides feedback or asks questions about your contribution, acknowledge it, even if you disagree. This creates a sense of collaboration and mutual respect.
  • Avoid Emotional Responses: It’s easy to get emotionally attached to your code, but remember that feedback is about the code, not you as a person. If a comment or review triggers an emotional response, take a break before replying.
  • Use Emojis Judiciously: Emojis can convey tone and emotion, making text-based communication friendlier. However, ensure your usage aligns with the project’s communication style and doesn’t obscure the message.
  • Stay Updated: If you’re part of a thread or discussion, try to keep up with it. This shows commitment and can prevent redundant or outdated contributions.
  • Close the Loop: If an issue you raised gets resolved or a PR you made gets merged/closed, thank everyone involved and ensure there are no loose ends.
7. Seek a Supportive Community

Being part of a supportive community can make a world of difference in your open-source journey. Not only will you have a platform to ask questions and seek guidance, but you’ll also gain invaluable insights from experienced developers.

Here are a few Python-centric communities known for their welcoming and collaborative atmospheres:

  • Pybites: This is the one! I am biased of course! However, it is a fantastic place for Python enthusiasts of all skill levels. Whether you’re looking for advice on a specific problem or simply want to share your experiences or be among passionate developers, this community is a great place to start.
  • r/learnpython: Reddit’s learnpython community is a treasure trove of resources, discussions, and advice. Whether you’re a beginner seeking guidance or an experienced developer looking to help others, you’ll find a welcoming space here.
  • Python Software Foundation (PSF) Community: The PSF is the organization behind Python, and their community is vast and diverse. They have mailing lists, forums, and working groups dedicated to various aspects of Python and its development.
  • What is your goal with Python? Remember, the right community can greatly influence your growth and experience in the open-source world.
    But it’s essential to consider your individual goals. For example, if you’re a beginner you might benefit more from r/learnpython, but if you’re looking to contribute to web development projects, you might find the Django (or another framework’s) community more relevant.
    In any case, all these communities are welcoming (one of the nicest Python features, actually!) and offer unique perspectives and opportunities.
Finding the Right Project

1. Explore GitHub
  • Search and Filter: Use GitHub’s search functionality to find projects in Python that align with your interests and skill level. We also bundled some of our projects in our Opensource GitHub org here.
2. Check “Good First Issue” Labels
  • Newcomer-Friendly Projects: These labels are a signal from the maintainers that they welcome contributions from those new to the project.
3. Understand the Role of Maintainers
  • Maintainers are the backbone of open-source projects. They review contributions, ensure code quality, manage releases, and steer the project’s direction.
  • Here’s what maintainers typically look for and how you can make their lives easier:
    • Well-Documented Contributions: Ensure that any code you submit is accompanied by relevant documentation. If you’re fixing a bug, provide a clear description of the problem and your solution.
    • Follow Project Guidelines: Adhering to the project’s style guide, testing protocols, and contribution guidelines ensures your submissions align with the project’s standards and reduces back-and-forth.
    • Stay Engaged: After submitting a contribution, stay active in the discussion. Address feedback promptly and collaborate with maintainers and other contributors.
    • Understand the Project’s Vision: Before suggesting a significant change or new feature, familiarize yourself with the project’s goals and roadmap. This ensures your contributions align with the project’s direction.
    • Respect Time and Effort: Maintainers often volunteer their time. Be patient when waiting for feedback and appreciative of their input.
4. Follow Your Interests
  • Personal Use Cases: If there is a tool or library you use regularly, that familiarity makes it a great candidate for your contributions.
5. Attend Meetups
  • Networking and Inspiration: Events like local Python meetups or global events like Hacktoberfest are excellent opportunities to find projects and mentors.
Opening Your First PR Without Having an Anxiety Attack

1. Double Check Your Work
  • Code Quality: Before submitting, ensure your code is clean, well-commented, and adheres to the project’s style guide. The project might offer guidance here, but, for example, running a tool like flake8 already catches common issues and style violations (see the sketch after this list).
2. Start with a Draft PR
  • Invite Early Feedback: Draft PRs are a non-intimidating way to show your work and ask for comments without officially submitting it.
3. Ask for Help When You Need It
  • No Shame in Asking: If you’re stuck, ask for guidance. It’s much better than staying stuck. This is often the hardest thing for us as developers, but on the other end is a human who wants to help and can empathize with where you are, because they were there too at some point.
4. It Takes Time
  • It’s a Learning Process: Remember that every contributor started somewhere, and the primary goal is growth and learning, not immediate perfection (which you won’t hit, nor should you aim for, anyway).
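As promised above, here is a minimal pre-submission sketch that lints the files you changed before you open the PR. It assumes flake8 is installed in your environment, and the file paths are hypothetical placeholders:

import subprocess
import sys

# Run flake8 on the files you touched before opening the PR.
# Replace these placeholder paths with your actual changed files.
changed_files = ["my_module.py", "tests/test_my_module.py"]

result = subprocess.run(["flake8", *changed_files])
if result.returncode != 0:
    sys.exit("flake8 found issues -- fix them before pushing.")
print("flake8 is happy, ready to open the PR!")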
Diversity in Open Source

A diverse set of contributors not only brings varied skills and knowledge but also ensures that the software developed is robust, inclusive, and caters to a global audience.

As you venture into the open-source community, be an advocate for diversity and inclusivity. Recognize that every contribution, regardless of its origin, enriches the community and the broader tech ecosystem.

By prioritizing diversity, we are building software that truly represents and serves everyone.

Conclusion

Contributing to open source projects as a Python developer is a rewarding endeavor that enhances your technical skills, your mindset, and your career prospects.

It’s normal to feel nervous at the outset, but with a supportive community (one of the best features of the Python space remember!), and a willingness to learn and improve, you’ll soon find that it’s a richly rewarding experience.

Keep calm and code in Python!

Made your first open-source contribution after reading this guide? We’re eager to celebrate with you!

Join our Pybites Community

Taking that first step is often the hardest. Dive into our Python community, renowned for its welcoming spirit. There’s always a spot open for a new contributor!

Also, what better place to share and celebrate your open source contributions and other Python/mindset achievements? Your victories will inspire and encourage others!

Recognizing and celebrating milestones can fuel your motivation. Curious about why? Check out our related article on this topic.

And finally remember that every contribution, no matter how small, pushes the world of open-source forward. Take the leap, and become a part of this incredible journey.

Join our Pybites Community

Categories: FLOSS Project Planets

Kushal Das: SBOM and vulnerability scanning

Wed, 2023-09-20 03:26

Software Bill of Materials (SBOM) has become one of the latest buzzwords. A lot of people and companies talk about it like a magical thing: if you use it, then all of your security problems will be solved, just like what happened with Blockchain!!

Only a handful of projects (or the companies building those projects) have focused on the actual tooling part: things we can use and get useful output from, rather than blog posts/presentations with fancy graphics.

In this post we will try to see how we can use these tools today (2023/09/20).

SBOM currently comes in two major flavors, SPDX aka Software Package Data Exchange, and CycloneDX. There is existing tooling to convert between the two.

Syft

We will use syft from Anchore to generate our SBOM(s).

This tool can generate SBOMs from various sources: container images, Python projects, RPM/Debian package databases, Rust or Go projects, and more.

Let us generate the SBOM for a Debian 12 VM.

$ syft /var/lib/dpkg -o spdx-json=server.spdx.json --source-name debian12
 ✔ Indexed file system  /var/lib/dpkg
 ✔ Cataloged packages   [395 packages]

And for a Rust project:

$ syft /home/kdas/code/johnnycanencrypt/Cargo.lock -o spdx-json=jce.spdx.json
 ✔ Indexed file system  /home/kdas/code/johnnycanencrypt
 ✔ Cataloged packages   [203 packages]

We generated the SBOMs. Now this should solve the security issues, shouldn't it?

I found the above in Matthew Martin's timeline.
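Buzzword jokes aside, the generated file is plain JSON following the SPDX schema, so you can already do something useful with it yourself. A minimal sketch, assuming the standard SPDX 2.x JSON layout that syft emits (a top-level "packages" array whose entries carry "name" and "versionInfo"):

import json

# List package names and versions from the syft-generated SPDX SBOM.
with open("server.spdx.json") as f:
    sbom = json.load(f)

for pkg in sbom.get("packages", []):
    print(pkg.get("name"), pkg.get("versionInfo", ""))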

Grype

This is where Grype comes in handy: it is a vulnerability scanner for container images and filesystems, and it works with the SBOM(s) generated by syft.

$ grype jce.spdx.json
 ✔ Vulnerability DB            [updated]
 ✔ Scanned for vulnerabilities [1 vulnerability matches]
   ├── by severity: 0 critical, 0 high, 1 medium, 0 low, 0 negligible
   └── by status:   1 fixed, 0 not-fixed, 0 ignored
NAME  INSTALLED  FIXED-IN  TYPE        VULNERABILITY        SEVERITY
time  0.1.45     0.2.23    rust-crate  GHSA-wcg3-cvx6-7396  Medium

And:

grype server.spdx.json
 ✔ Vulnerability DB            [no update available]
 ✔ Scanned for vulnerabilities [178 vulnerability matches]
   ├── by severity: 6 critical, 136 high, 34 medium, 2 low, 0 negligible
   └── by status:   0 fixed, 178 not-fixed, 0 ignored
NAME     INSTALLED     FIXED-IN  TYPE  VULNERABILITY   SEVERITY
file     1:5.44-3                      CVE-2007-1536   High
git      1:2.39.2-1.1                  CVE-2020-5260   High
gnupg    2.2.40-1.1                    CVE-2022-3515   Critical
gnupg    2.2.40-1.1                    CVE-2022-34903  Medium
gnupg    2.2.40-1.1                    CVE-2022-3219   Low
openssl  3.0.9-1                       CVE-2023-4807   High
openssl  3.0.9-1                       CVE-2023-3817   Medium
openssl  3.0.9-1                       CVE-2023-2975   Medium
openssl  3.0.9-1                       CVE-2023-1255   Medium
perl     5.36.0-7                      CVE-2023-31486  High
perl     5.36.0-7                      CVE-2023-31484  High
vim      2:9.0.1378-2                  CVE-2022-3520   Critical
vim      2:9.0.1378-2                  CVE-2022-0318   Critical
vim      2:9.0.1378-2                  CVE-2017-6350   Critical
vim      2:9.0.1378-2                  CVE-2017-6349   Critical
vim      2:9.0.1378-2                  CVE-2017-5953   Critical
vim      2:9.0.1378-2                  CVE-2023-4781   High
vim      2:9.0.1378-2                  CVE-2023-4752   High
<snipped>

Now it is up to you and your team members to decide how to react to the information gathered from these tools. The tools themselves will not solve the problems at hand. You have to decide on the update steps, and whether they are required at all.
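One practical way to act on this is to wire the scan into CI. Grype itself has a --fail-on option for failing builds at a given severity; if you need a custom policy, a small script over its JSON output works too. A minimal sketch, assuming grype's JSON report keeps its matches[].vulnerability.severity layout (verify against your grype version):

import json
import subprocess
import sys

# Run grype against the SBOM and collect the JSON report.
result = subprocess.run(
    ["grype", "server.spdx.json", "-o", "json"],
    capture_output=True,
    text=True,
)
report = json.loads(result.stdout)

# Fail the build if anything Critical shows up (a custom policy sketch).
criticals = [
    m for m in report.get("matches", [])
    if m.get("vulnerability", {}).get("severity") == "Critical"
]
for match in criticals:
    vuln = match.get("vulnerability", {})
    art = match.get("artifact", {})
    print(f"{vuln.get('id')}: {art.get('name')} {art.get('version')}")

sys.exit(1 if criticals else 0)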

Also please remember: there are and will be a lot of false positives (not in the Grype output above, but in other tools in the SBOM ecosystem). The projects (I am talking in general about most of the tooling in this field) are trying hard to reduce these, but it is not always possible to remove every such edge case.

Categories: FLOSS Project Planets
