Planet Python

Planet Python - http://planetpython.org/

Talk Python to Me: #455: Land Your First Data Job

Thu, 2024-04-04 04:00
Interested in data science but you're not quite working in it yet? In software, getting that very first job can truly be the hardest one to land. On this episode, we have Avery Smith from Data Career Jumpstart here to share his advice for getting your first data job.

Episode sponsors

  • Sentry Error Monitoring, Code TALKPYTHON: https://talkpython.fm/sentry
  • Posit: https://talkpython.fm/posit
  • Talk Python Courses: https://talkpython.fm/training

Links from the show

  • Avery Smith: https://www.linkedin.com/in/averyjsmith/
  • Data Career Jumpstart: https://www.datacareerjumpstart.com/
  • Data Nerd Site: https://datanerd.tech
  • Write C# LINQ queries to query data: https://learn.microsoft.com/en-us/dotnet/csharp/linq/get-started/write-linq-queries
  • A faster way to build and share data apps: https://streamlit.io
  • Plotly Dash: https://dash.plotly.com
  • Michael's Keynote: State of Python in 2024: https://www.youtube.com/watch?v=coz1CGRxjQ0
  • Watch this episode on YouTube: https://www.youtube.com/watch?v=0G89ZY5IWUM
  • Episode transcripts: https://talkpython.fm/episodes/transcript/455/land-your-first-data-job

Stay in touch with us

  • Subscribe to us on YouTube: https://talkpython.fm/youtube
  • Follow Talk Python on Mastodon: https://fosstodon.org/web/@talkpython
  • Follow Michael on Mastodon: https://fosstodon.org/web/@mkennedy
Categories: FLOSS Project Planets

Matt Layman: Flash messages and content encodings - Building SaaS with Python and Django #188

Wed, 2024-04-03 20:00
In this episode, we added flash messages (after a rough start with some networking issues). Then I tracked down a thorny issue: we found that a non-breaking space in the output of Django's timesince filter affects the encoding and which links Gmail adds to emails.
Categories: FLOSS Project Planets

Real Python: Install and Execute Python Applications Using pipx

Wed, 2024-04-03 10:00

A straightforward way to distribute desktop and command-line applications written in Python is to publish them on the Python Package Index (PyPI), which hosts hundreds of thousands of third-party packages. Many of these packages include runnable scripts, but using them requires decent familiarity with the Python ecosystem. With pipx, you can safely install and execute such applications without affecting your global Python interpreter.

In this tutorial, you’ll learn how to:

  • Turn the Python Package Index (PyPI) into an app marketplace
  • Run installed applications without explicitly calling Python
  • Avoid dependency conflicts between different applications
  • Try throw-away applications in temporary locations
  • Manage the installed applications and their environments

To fully benefit from this tutorial, you should feel comfortable around the terminal. In particular, knowing how to manage Python versions, create virtual environments, and install third-party modules in your projects will go a long way.

Note: If you’re a Windows user, then it’s highly recommended you follow our Python coding setup guide before plunging into this tutorial. The gist of it is that you should avoid installing Python from the Microsoft Store, as it could prevent pipx from working correctly.

To help you get to grips with pipx, you can download the supplemental materials, which include a handy command cheat sheet. Additionally, you can test your understanding by taking a short quiz.

Get Your Cheatsheet: Click here to download the free cheatsheet of pipx commands you can use to install and execute Python applications.

Take the Quiz: Test your knowledge with our interactive “Install and Execute Python Applications Using pipx” quiz. Upon completion you will receive a score so you can track your learning progress over time:

Take the Quiz »

Get Started With pipx

On the surface, pipx resembles pip because it also lets you install Python packages from PyPI or another package index. However, unlike pip, it doesn’t install packages into your system-wide Python interpreter or even an activated virtual environment. Instead, it automatically creates and manages virtual environments for you to isolate the dependencies of every package that you install.

Additionally, pipx adds symbolic links to your PATH variable for every command-line script exposed by the installed packages. As a result, you can invoke those scripts directly from the command line without explicitly running them through the Python interpreter.

Think of pipx as Python’s equivalent of npx in the JavaScript ecosystem. Both tools let you install and execute third-party modules in the command line just as if they were standalone applications. However, not all modules are created equal.

Broadly speaking, you can classify the code distributed through PyPI into three categories:

  1. Importable: It’s either pure-Python source code or Python bindings of compiled shared objects that you want to import in your Python projects. Typically, they’re libraries like Requests or Polars, providing reusable pieces of code to help you solve a common problem. Alternatively, they might be frameworks like FastAPI or PyGame that you build your applications around.
  2. Runnable: These are usually command-line utility tools like black, isort, or flake8 that assist you during the development phase. They could also be full-fledged applications like bpython or the JupyterLab environment, which is primarily implemented in TypeScript rather than Python.
  3. Hybrid: They combine both worlds by providing importable code and runnable scripts at the same time. Flask and Django are good examples, as they offer utility scripts while remaining web frameworks for the most part.

Making a distribution package runnable or hybrid involves defining one or more entry points in the corresponding configuration file. Historically, these would be setup.py or setup.cfg, but modern build systems in Python should generally rely on the pyproject.toml file and define their entry points in the [project.scripts] TOML table.

Note: If you use Poetry to manage your project’s dependencies, then you can add the appropriate script declarations in the tool-specific [tool.poetry.scripts] table.
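As an illustration of the [project.scripts] table described above, here's a minimal pyproject.toml fragment (the package and function names are hypothetical):

```toml
[project]
name = "mypkg"
version = "0.1.0"

[project.scripts]
# Typing `mypkg` at the command prompt calls main() in mypkg/cli.py
mypkg = "mypkg.cli:main"
```

After installing such a package, the build backend generates a mypkg launcher script that pipx can expose on your PATH.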

Each entry point represents an independent script that you can run by typing its name at the command prompt. For example, if you’ve ever used the django-admin command, then you’ve called out an entry point to the Django framework.

Note: Don’t confuse entry points, which link to individual functions or callables in your code, with runnable Python packages that rely on the __main__ module to provide a command-line interface.

For example, Rich is a library of building blocks for creating text-based user interfaces in Python. At the same time, you can run this package with python -m rich to display a demo application that illustrates various visual components at your fingertips. Despite this, pipx won’t recognize it as runnable because the library doesn’t define any entry points.

To sum up, the pipx tool will only let you install Python packages with at least one entry point. It'll refuse to install packages that lack one, whether they're runnable only through python -m, like Rich, or bare-bones libraries that ship Python code meant just for importing.
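To make the distinction concrete, here is a minimal sketch of a __main__-style runnable (the package name is hypothetical): a file mypkg/__main__.py lets users type python -m mypkg, yet it defines no entry point, so pipx would have nothing to link.

```python
# mypkg/__main__.py (hypothetical): run with `python -m mypkg`
def demo() -> None:
    # A __main__ module makes the *package* runnable, but because no
    # [project.scripts] entry point exists, pipx has no script to expose.
    print("demo application")


if __name__ == "__main__":
    demo()
```

Adding a matching entry point in pyproject.toml is what would turn this into something pipx accepts.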

Once you identify a Python package with entry points that you’d like to use, you should first create and activate a dedicated virtual environment as a best practice. By keeping the package isolated from the rest of your system, you’ll eliminate the risk of dependency conflicts across various projects that might require the same Python library in different versions. Furthermore, you won’t need superuser permissions to install the package.

Deciding where and how to create a virtual environment and then remembering to activate it every time before running the corresponding script can become a burden. Fortunately, pipx automates these steps and provides even more features that you’ll explore in this tutorial. But first, you need to get pipx running itself.

Test Drive pipx Without Installation

If you’re unsure whether pipx will address your needs and would prefer not to commit to it until you’ve properly tested the tool, then there’s good news! Thanks to a self-contained executable available for download, you can give pipx a spin without having to install it.

To get that executable, visit the project’s release page on the official GitHub repository in your web browser and grab the latest version of a file named pipx.pyz. Files with the .pyz extension represent runnable Python ZIP applications, which are essentially ZIP archives containing Python source code and some metadata, akin to JAR files in Java. They can optionally vendor third-party dependencies that you’d otherwise have to install by hand.

Note: Internally, the pipx project uses shiv to build its Python ZIP application. When you first run a Python ZIP application that was built with shiv, it’ll unpack itself into a hidden folder named .shiv/ located in your user’s home directory. As a result, subsequent runs of the same application will reuse the already extracted files, speeding up the startup time.

Afterward, you can run pipx.pyz by passing the path to your downloaded copy of the file to your Python interpreter—just as you would with a regular Python script:

Read the full article at https://realpython.com/python-pipx/ »

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

Django Weblog: Django bugfix release issued: 5.0.4

Wed, 2024-04-03 09:52

Today we've issued the 5.0.4 bugfix release.

The release package and checksums are available from our downloads page, as well as from the Python Package Index. The PGP key ID used for this release is Natalia Bidart: 2EE82A8D9470983E.

Django 3.2 has reached the end of extended support

Note that with this release, Django 3.2 has reached the end of extended support. All Django 3.2 users are encouraged to upgrade to Django 4.2 or later to continue receiving fixes for security issues.

See the downloads page for a table of supported versions and the future release schedule.

Categories: FLOSS Project Planets

Robin Wilson: Simple self-hosted OpenStreetMap routing using Valhalla and Docker

Wed, 2024-04-03 06:39

I came up with an interesting plan for an artistic map recently (more on that when I’ve finished working on it), and to create it I needed to be able to calculate a large number of driving routes around Southampton, my home city.

Specifically, I needed to be able to get lines showing the driving route from A to B for around 5,000 combinations of A and B. I didn’t want to overload a free hosted routing server, or have to deal with rate limiting – so I decided to look into running some sort of routing service myself, and it was actually significantly easier than I thought it would be.

It so happened that I’d come across a talk recently on a free routing engine called Valhalla. I hadn’t actually watched the talk yet (it is on my ever-expanding list of talks to watch), but it had put the name into my head – so I started investigating Valhalla. It seemed to do what I wanted, so I started working out how to run it locally. Using Docker, it was nice and easy – and to make it even easier for you, here’s a set of instructions.

  1. Download an OpenStreetMap extract for your area of interest. All of your routes will need to be within the area of this extract. I was focusing on Southampton, UK, so I downloaded the Hampshire extract from the England page on GeoFabrik. If you start from the home page you should be able to navigate to any region in the world.

  2. Put the downloaded file (in this case, hampshire-latest.osm.pbf) in a folder called custom_files. You can download multiple files and put them all in this folder and they will all be processed.

  3. Run the following Docker command:

    docker run -p 8002:8002 -v $PWD/custom_files:/custom_files ghcr.io/gis-ops/docker-valhalla/valhalla:latest

    This will run the valhalla Docker image, exposing port 8002 and mapping the custom_files subdirectory to /custom_files inside the container. Full docs for the Docker image are available here.

  4. You’ll see various bits of output as Valhalla processes the OSM extract, and eventually the output will stop appearing and the API server will start.

  5. Visit http://localhost:8002 and you should see an error – this is totally expected; it is just telling you that you haven’t used one of the valid API endpoints. This shows that the server is running properly.

  6. Start using the API. See the documentation for instructions on what to pass the API.

Once you’ve got the server running, it’s quite easy to call the API from Python and get the resulting route geometries as Shapely LineString objects. These can easily be put into a GeoPandas GeoDataFrame. For example:

import json
import urllib.parse

import requests
import shapely
from pypolyline.cutil import decode_polyline

# Set up the API request parameters - in this case, from one point
# to another, via car
data = {
    "locations": [
        {"lat": from_lat, "lon": from_lon},
        {"lat": to_lat, "lon": to_lon},
    ],
    "costing": "auto",
}

# Convert to JSON and build the URL
path = f"http://localhost:8002/route?json={urllib.parse.quote(json.dumps(data))}"

# Call the API
resp = requests.get(path)

# Extract the geometry of the route (ie. the line from start to end point)
# This is in the polyline format that needs decoding
polyline = bytes(resp.json()["trip"]["legs"][0]["shape"], "utf8")

# Decode the polyline (the second argument selects polyline6 precision)
decoded_coords = decode_polyline(polyline, 6)

# Convert to a shapely LineString
geom = shapely.LineString(decoded_coords)

To run this, you’ll need to install pypolyline, requests, and shapely.

Note that you need to pass a second parameter of 6 into decode_polyline or you’ll get nonsense out (this parameter tells it that the shape is in polyline6 format, which doesn’t seem to be documented particularly well in the Valhalla documentation). Also, I’m sure there is a better way of passing JSON in a URL parameter using requests, but I couldn’t find it – whatever I did, I got a JSON parsing error back from the API. The urllib.parse.quote(json.dumps(data)) code was the simplest thing I found that worked.
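If you'd rather avoid the pypolyline dependency, the polyline6 format is simple enough to decode by hand. This is a hedged, minimal sketch of the standard polyline algorithm at precision 6 (deltas of scaled coordinates, zigzag-signed, in 5-bit chunks offset by 63), not code from the article:

```python
def decode_polyline6(encoded: str) -> list[tuple[float, float]]:
    """Decode a polyline6 string into (lat, lon) pairs."""
    coords, index, lat, lon = [], 0, 0, 0
    while index < len(encoded):
        deltas = []
        for _ in range(2):  # one delta each for latitude and longitude
            result, shift = 0, 0
            while True:
                b = ord(encoded[index]) - 63
                index += 1
                result |= (b & 0x1F) << shift
                shift += 5
                if b < 0x20:
                    break
            # Undo the zigzag encoding that stores the sign in the low bit
            deltas.append(~(result >> 1) if result & 1 else result >> 1)
        lat += deltas[0]
        lon += deltas[1]
        coords.append((lat / 1e6, lon / 1e6))
    return coords


def encode_polyline6(coords: list[tuple[float, float]]) -> str:
    """Inverse of decode_polyline6, handy for round-trip testing."""
    out, prev_lat, prev_lon = [], 0, 0
    for lat, lon in coords:
        ilat, ilon = round(lat * 1e6), round(lon * 1e6)
        for delta in (ilat - prev_lat, ilon - prev_lon):
            value = ~(delta << 1) if delta < 0 else delta << 1
            while value >= 0x20:
                out.append(chr((0x20 | (value & 0x1F)) + 63))
                value >>= 5
            out.append(chr(value + 63))
        prev_lat, prev_lon = ilat, ilon
    return "".join(out)
```

The only difference from the classic precision-5 polyline format is the 1e6 scale factor, which is why passing 5 (or omitting the precision) produces the "nonsense" coordinates mentioned above.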

This code could easily be extended to work with multi-leg routes, to extract other data like the length or duration of the route, and more. The Docker image can also do more, like load public transport information, download OSM files itself and even more – see the docs for more on this.

Check back soon to see how I used this for some cool map-based art.

Categories: FLOSS Project Planets

Matt Layman: NATS: Connecting Apps Over a Network Easily

Tue, 2024-04-02 20:00
NATS is an awesome open source technology to help connect code together over a network. Whether you’re building a distributed microservice architecture or connecting IoT devices, NATS provides the tools you need to do that easily. In this talk, you’ll learn about NATS via a presentation with plenty of live coding examples.
Categories: FLOSS Project Planets

PyCoder’s Weekly: Issue #623 (April 2, 2024)

Tue, 2024-04-02 15:30

#623 – APRIL 2, 2024
View in Browser »

Reading and Writing WAV Files in Python

In this tutorial, you’ll learn how to work with WAV audio files in Python using the standard-library wave module. Along the way, you’ll synthesize sounds from scratch, visualize waveforms in the time domain, animate real-time spectrograms, and apply special effects to widen the stereo field.
REAL PYTHON
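The standard-library wave module the tutorial covers can be exercised in a few lines. This is a hedged sketch (the tone parameters are arbitrary choices), not code from the tutorial:

```python
import math
import struct
import tempfile
import wave

RATE = 8000  # samples per second (arbitrary for this sketch)
FREQ = 440   # sine frequency in Hz

# One second of 16-bit mono samples for a 440 Hz sine tone
samples = [
    int(32767 * math.sin(2 * math.pi * FREQ * n / RATE)) for n in range(RATE)
]

path = tempfile.mktemp(suffix=".wav")

# Write the samples out as a WAV file
with wave.open(path, "wb") as wav:
    wav.setnchannels(1)    # mono
    wav.setsampwidth(2)    # 16-bit samples
    wav.setframerate(RATE)
    wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))

# Read the file back and inspect its header
with wave.open(path, "rb") as wav:
    frames = wav.getnframes()
    channels = wav.getnchannels()
```

The full tutorial goes much further, covering visualization and stereo effects on top of this kind of raw frame handling.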

Designing a Pure Python Web Framework

This blog post talks about Reflex, a Python web framework. The post talks about what makes Reflex different from other frameworks and shows you sample starting code. See also the associated HN Discussion.
NIKHIL RAO

Creating an Autopilot in X-Plane Using Python

X-Plane is a flight simulator, and Austin is using Python to create an autopilot using proportional integral derivative controllers. Read on to see how its done.
AUSTIN

Mojo Goes Open Source

MODULAR

PyPI Hiring a Support Specialist (Remote)

PYPI

Discussions

Draft PEP: Sealed Decorator for Static Typing

PYTHON DISCUSS

What Are Some Good Python Codebases to Read?

LOBSTERS

Articles & Tutorials

Using Python in Bioinformatics and the Laboratory

How is Python being used to automate processes in the laboratory? How can it speed up scientific work with DNA sequencing? This week on the show, Chemical Engineering PhD Student Parsa Ghadermazi is here to discuss Python in bioinformatics.
REAL PYTHON podcast

Handling Database Migrations With Alembic

Alembic is a change control tool for database content in SQLAlchemy. This article looks at the high-level architecture of how Alembic works, how to add it to your project, and some common workflows you’ll encounter.
PAUL ESCH-LAURENT • Shared by Michael Herman

Python Tricks: A Buffet of Awesome Python Features

Discover Python’s best practices with simple examples and start writing even more beautiful + Pythonic code. “Python Tricks: The Book” shows you exactly how. You’ll master intermediate and advanced-level features in Python with practical examples and a clear narrative. Get the book + video bundle 33% off →
DAN BADER sponsor

Python in List of Best Languages to Learn

The US Bureau of Labor Statistics has identified the top four languages for programmers to learn, and Python made the list. The median annual wage of programmers in the US is expected to rise 25% in the next 5 years.
FORTUNE

Finding Python Easter Eggs

Python has its fair share of hidden surprises, commonly known as Easter eggs. From clever jokes to secret messages, these little mysteries are often meant to be discovered by curious geeks like you!
REAL PYTHON course

PyPI Temporarily Halted New Users and Projects

To fend off a supply-chain attack, PyPI temporarily halted new users and projects for about 10 hours last week. This article discusses why, and the scourge of supply-chain attacks.
ARS TECHNICA

Broadcasting in NumPy

Broadcasting in NumPy is not the most exciting topic, but this article explores the topic using a narrative perspective. This is not your standard “broadcasting in NumPy” article!
STEPHEN GRUPPETTA • Shared by Stephen Gruppetta

A Better Python Cache for Slow Function Calls

The folks at Sweep AI needed something more persistent than Python’s lru_cache. This post talks about the design behind a file based cached decorator they’ve recently released.
WILLIAM ZENG
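This is not Sweep AI's implementation, but the core idea of a file-backed cache decorator, persisting results across processes where lru_cache cannot, can be sketched with only the standard library (all names here are hypothetical):

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path


def file_cache(func):
    """Persist each result to disk, keyed by a hash of the pickled arguments."""
    # A real implementation would use a stable directory so the cache
    # survives restarts; mkdtemp keeps this sketch self-contained.
    cache_dir = Path(tempfile.mkdtemp(prefix=f"cache_{func.__name__}_"))

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        key = hashlib.sha256(
            pickle.dumps((args, sorted(kwargs.items())))
        ).hexdigest()
        path = cache_dir / key
        if path.exists():  # cache hit: load the stored result
            return pickle.loads(path.read_bytes())
        result = func(*args, **kwargs)
        path.write_bytes(pickle.dumps(result))  # cache miss: store it
        return result

    return wrapper


calls = []


@file_cache
def slow_square(n):
    calls.append(n)  # record real executions to show caching works
    return n * n


slow_square(4)
slow_square(4)  # second call is served from disk; the body runs once
```

The trade-offs the post discusses, hashing cost, pickle compatibility, and invalidation, all start from this basic shape.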

Jupyter & IPython Terminology Explained

Are you trying to understand the differences between Jupyter Notebook, JupyterLab, IPython, Colab, and related terms? This article is for you.
DATA SCHOOL

How I Manage Python in 2024

This post covers the tools one developer uses in their day-to-day process. Read on for info about mise, uv, ruff, and more.
OUTLORE

Fixing a Bug in PyPy’s Incremental GC

A deep dive on hunting a tricky bug in the garbage collection code inside the alternate interpreter PyPy.
CARL FRIEDRICH BOLZ-TEREICK

Projects & Code

django-prose-editor: Rich Text Editing for Django

GITHUB.COM/MATTHIASK

pycountry: ISO Country, Language, Currency and More

GITHUB.COM/PYCOUNTRY

sqlelf: Explore ELF Objects Through the Power of SQL

GITHUB.COM/FZAKARIA

Python Post-Mortem Debugger

GITHUB.COM/COCOLATO • Shared by cocolato

botasaurus: Framework to Build Awesome Scrapers

GITHUB.COM/OMKARCLOUD

Events

Weekly Real Python Office Hours Q&A (Virtual)

April 3, 2024
REALPYTHON.COM

Canberra Python Meetup

April 4, 2024
MEETUP.COM

Sydney Python User Group (SyPy)

April 4, 2024
SYPY.ORG

PyCascades 2024

April 5 to April 9, 2024
PYCASCADES.COM

PyDelhi User Group Meetup

April 6, 2024
MEETUP.COM

Django Girls Ecuador 2024

April 6, 2024
OPENLAB.EC

Happy Pythoning!
This was PyCoder’s Weekly Issue #623.
View in Browser »

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

Categories: FLOSS Project Planets

PyCon: PyCon US 2024: Call for Volunteers and Hatchery Registration now Open!

Tue, 2024-04-02 14:45

Looking to make a meaningful contribution to the Python community? Look no further than PyCon US 2024! Whether you're a seasoned Python pro or a newcomer to the community and looking to get involved, there's a volunteer opportunity that's perfect for you.

Sign-up for volunteer roles is done directly through the PyCon US website. This way, you can view and manage shifts you sign up for through your personal dashboard! You can read up on the different roles to volunteer for and how to sign up on the PyCon US website.

PyCon US is largely organized and run by volunteers. Every year, we ask to fill over 300 onsite volunteer hours to ensure everything runs smoothly at the event. And the best part? You don't need to commit a lot of time to make a difference – some shifts are as short as one hour long! You can sign up for as many or as few shifts as you’d like. Even a couple of hours of your time can go a long way in helping us create an amazing experience for attendees.

Keep in mind that you need to be registered for the event to sign up for a volunteer role.

One important way to get involved is to sign up as a Session Chair or Session Runner. This is an excellent opportunity to meet and interact with speakers while helping to ensure that sessions run smoothly. And who knows, you might just learn something new along the way! You can sign up for these roles directly on the Talks schedule.

Volunteer your time at PyCon US 2024 and you’ll be part of a fantastic community that's passionate about Python programming and help us make this year's conference a huge success. Sign up today for the shifts that call to you and join the fun!

Hatchery Program

First introduced in 2018, the Hatchery program offers pathways for PyCon US attendees to introduce new tracks, activities, summits, demos, etc., at the conference—activities that all share and fulfill the Python Software Foundation’s mission within the PyCon US schedule.

Since its introduction, this program has “hatched” several new tracks that are now staples of our conference, including PyCon US Charlas, Mentored Sprints, and the Maintainer’s Summit. This year, we’ve received eight very compelling proposals. After careful consideration, we have selected four new programs, each of them unique and focused on different aspects of the Python community.

FlaskCon - Friday, May 17, 2024

Join us in a mini conference dedicated to Flask, its community and ecosystem, as well as related web technologies. Meet maintainers and community members, learn about how to get involved, and join us during the sprint days to contribute. Submit your talk proposal today!

Organized by David Lord, Phil Jones, Adam Englander, David Carmichael, Abdur-Rahmaan Janhangeer

Community Organizers Summit - Saturday, May 18, 2024

Do you organize a Conference, Meetup, User Group, Hackathon, or other community event in your area? Are you trying to start a group but don't know where to start? Whether you have 30 years of experience or are looking to create a new event, this summit is for you.

Join us for a summit of Presentations, Panels, and Breakout Sessions about organizing community events.

Organized by Mason Egger, Kevin Horn, and Heather White

Sign-up is required. Register to secure your spot.

Humble Data - Saturday, May 18, 2024

Are you eager to embark on a tech career but unsure where to start? Are you curious about data science? Taking the first steps in this area is hard, but you don’t have to do it alone. Join our workshop for complete beginners and get started in Python data science - even if you’ve never written a single line of code!

We invite those from underrepresented groups to apply to join us for a fun, supportive workshop that will give you the confidence to get started in this exciting area. You can expect plenty of exercises, as well as inspiring talks from those who were once in your shoes. You’ll cover the basics of programming in Python, as well as useful libraries and tools such as Jupyter notebooks, pandas, and Matplotlib.

In this hands-on workshop, you’ll work through a series of beginner-friendly materials at your own pace. You’ll work within small groups, each with an assigned mentor, who will be there to help you with any questions or whenever you get stuck. All you’ll need to bring is a laptop that can connect to the internet and a willingness to learn!

Organized by Cheuk Ting Ho and Jodie Burchell

Sign-up is required. Register to secure your spot.

Documentation Summit - Sunday, May 19, 2024

A full-day summit including talks and panel sessions inviting leaders in documentation to share their experience of how to make good documentation, discussion of documentation tools such as Sphinx, MkDocs, and themes, and common mistakes and how to avoid them. Accessibility of documentation is also an important topic, so we will also cover talks or discussions regarding the accessibility of documentation.

This summit is aimed at anyone who cares about or is involved in any aspect of open source documentation, such as, but not limited to, technical writers, developers, developer advocates, project maintainers and contributors, accessibility experts, documentation tooling developers, and documentation end-users.

Organized by Cheuk Ting Ho

Sign-up is required. Register to secure your spot.

Register Now

Registration for PyCon US is now open, and all of the Hatchery programs are included as part of your PyCon US registration (no additional cost). Some of the programs require advanced sign up, in which case walk-ins will only be accepted if space is available. Please check each Hatchery program carefully to determine whether a registration is required or not.

Head over to your PyCon US Dashboard to add any of the above Hatchery programs to your PyCon US registration. Don't worry, you can always change your mind and cancel later to open up the space for someone else!

Congratulations to all the accepted program organizers! Thank you for bringing forward your fresh ideas to PyCon US. We look forward to seeing you in Pittsburgh.
Categories: FLOSS Project Planets

Real Python: Python Deep Learning: PyTorch vs TensorFlow

Tue, 2024-04-02 10:00

PyTorch vs TensorFlow: What’s the difference? Both are open source Python libraries that use graphs to perform numerical computation on data. Both are used extensively in academic research and commercial code. Both are extended by a variety of APIs, cloud computing platforms, and model repositories.

If they’re so similar, then which one is best for your project?

In this video course, you’ll learn:

  • What the differences are between PyTorch and TensorFlow
  • What tools and resources are available for each
  • How to choose the best option for your specific use case

You’ll start by taking a close look at both platforms, beginning with the slightly older TensorFlow, before exploring some considerations that can help you determine which choice is best for your project. Let’s get started!


Categories: FLOSS Project Planets

Anwesha Das: Opening up Ansible release to the community

Tue, 2024-04-02 08:52

Transparency, collaboration, inclusivity, and openness lay the foundation of the open source community. As the project's maintainers, some of our tasks are to keep the bar for contribution low, make collaboration easy, and keep the governance model fair. The Ansible Community Engineering Team always works toward these goals through our different endeavors.

Ansible has historically been released by Red Hat employees. We planned to open up the release process to the community, and I was asked to work on that. My primary goal was to make releasing Ansible boring and something the community could do. This was my first time dealing with GitHub Actions, and there is still a lot to learn, but we are there now.

The Release Management working group started releasing the Ansible community package using a GitHub Actions workflow with Ansible version 9.3.0. The recent 9.4.0 release also followed the same workflow.

Thank you Felix Fontein, Maxwell G, Sviatoslav Sydorenko, and Toshio for helping to shape the workflow with your valuable feedback, doing the actual releases, and answering my innumerable queries.

Categories: FLOSS Project Planets

EuroPython: EuroPython 2024: Ticket sales now open! 🐍

Tue, 2024-04-02 08:39

Hey hey, everyone,

We are thrilled to announce that EuroPython is back and better than ever! ✨
EuroPython 2024 will be held 8-14 July at the Prague Congress Centre (PCC), Czech Republic. Details of how to participate remotely will be published soon.

The conference will follow the same structure as the previous editions:

  • Two Workshop/Tutorial Days (8-9 July, Mon-Tue)
  • Three Conference Days (10-12 July, Wed-Fri)
  • Sprint Weekend (13-14 July, Sat-Sun)

Secure your spot at EuroPython 2024 by purchasing your tickets today. For more information and to grab your tickets, visit https://ep2024.europython.eu/tickets before they sell out!

Get your tickets fast before the late-bird prices kick in. 🏃

Looking forward to welcoming you to EuroPython 2024 in Prague! 🇨🇿

🎫 Don't forget to get your ticket at https://ep2024.europython.eu

Cheers,

The EuroPython 2024 Organisers

Categories: FLOSS Project Planets

Python Bytes: #377 A Dramatic Episode

Tue, 2024-04-02 04:00
<strong>Topics covered in this episode:</strong><br> <ul> <li><a href="https://github.com/epogrebnyak/justpath"><strong>justpath</strong></a></li> <li><strong>xz back door</strong></li> <li><a href="https://lpython.org">LPython</a></li> <li><a href="https://github.com/treyhunner/dramatic"><strong>dramatic</strong></a></li> <li><strong>Extras</strong></li> <li><strong>Joke</strong></li> </ul><a href='https://www.youtube.com/watch?v=eWnYlxOREu4' style='font-weight: bold;'data-umami-event="Livestream-Past" data-umami-event-episode="377">Watch on YouTube</a><br> <p><strong>About the show</strong></p> <p>Sponsored by ScoutAPM: <a href="https://pythonbytes.fm/scout"><strong>pythonbytes.fm/scout</strong></a></p> <p><strong>Connect with the hosts</strong></p> <ul> <li>Michael: <a href="https://fosstodon.org/@mkennedy"><strong>@mkennedy@fosstodon.org</strong></a></li> <li>Brian: <a href="https://fosstodon.org/@brianokken"><strong>@brianokken@fosstodon.org</strong></a></li> <li>Show: <a href="https://fosstodon.org/@pythonbytes"><strong>@pythonbytes@fosstodon.org</strong></a></li> </ul> <p>Join us on YouTube at <a href="https://pythonbytes.fm/stream/live"><strong>pythonbytes.fm/live</strong></a> to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too.</p> <p>Finally, if you want an artisanal, hand-crafted digest of every week of </p> <p>the show notes in email form? 
Add your name and email to <a href="https://pythonbytes.fm/friends-of-the-show">our friends of the show list</a>, we'll never share it.</p> <p><strong>Michael #1:</strong> <a href="https://github.com/epogrebnyak/justpath"><strong>justpath</strong></a></p> <ul> <li>Inspect and refine PATH environment variable on both Windows and Linux.</li> <li>Raw, count, duplicates, invalids, corrections, excellent stuff.</li> <li>Check out <a href="https://asciinema.org/a/642726">the video</a></li> </ul> <p><strong>Brian #2:</strong> <strong>xz back door</strong></p> <ul> <li>In case you kinda heard about this, but not really.</li> <li>Very short version: <ul> <li>A Microsoft engineer noticed a performance problem with ssh and tracked it to a particular version update of xz.</li> <li>Further investigations found a multi-year installation of a fairly complex back door into xz by a new-ish contributor. But still contributing over several years. First commit in early 2022.</li> <li>The problem was caught. But if it had succeeded, it would have been bad.</li> <li>Part of the issue of how this happened is due to having one primary maintainer on a very widely used tool included in tons-o-Linux distributions.</li> </ul></li> <li>Some useful articles <ul> <li><a href="https://boehs.org/node/everything-i-know-about-the-xz-backdoor"><strong>Everything I Know About the XZ Backdoor</strong></a> - Evan Boehs - recommended read</li> </ul></li> <li>Don’t think you’re affected? 
Think again if you use homebrew, for example: <ul> <li><a href="https://micro.webology.dev/2024/03/29/update-and-upgrade.html"><strong>Update and upgrade Homebrew and</strong></a><a href="https://micro.webology.dev/2024/03/29/update-and-upgrade.html"> </a><a href="https://micro.webology.dev/2024/03/29/update-and-upgrade.html"><strong><code>xz</code></strong></a><a href="https://micro.webology.dev/2024/03/29/update-and-upgrade.html"> <strong>versions</strong></a></li> </ul></li> <li>Notes <ul> <li>Open source maintenance burnout is real</li> <li>Lots of open source projects are maintained by unpaid individuals for long periods of time.</li> <li>Multi-year sneakiness and social bullying is pretty hard to defend against.</li> <li>Handing off projects to another primary maintainer has to be doable. <ul> <li>But now I think we need better tools to vet contributors. </li> <li>Maybe? Or would that just suppress contributions?</li> </ul></li> </ul></li> <li>One option to help with burnout: <ul> <li>JGMM, Just Give Maintainers Money: <a href="https://blog.glyph.im/2024/03/software-needs-to-be-more-expensive.html"><strong>Software Needs To Be More Expensive</strong></a> - Glyph</li> </ul></li> </ul> <p><strong>Michael #3:</strong> <a href="https://lpython.org">LPython</a></p> <ul> <li>LPython aggressively optimizes type-annotated Python code. It has several backends, including LLVM, C, C++, and WASM. </li> <li>LPython’s primary tenet is speed.</li> <li>Play with the wasm version here: <a href="https://dev.lpython.org">dev.lpython.org</a></li> <li>Still in alpha, so keep that in mind.</li> </ul> <p><strong>Brian #4:</strong> <a href="https://github.com/treyhunner/dramatic"><strong>dramatic</strong></a></p> <ul> <li>Trey Hunner</li> <li>More drama in the software world. This time in Python. 
</li> <li>Actually, this is just a fun utility to make your Python output more dramatic.</li> <li>More fun output with <a href="https://github.com/ChrisBuilds/terminaltexteffects">terminaltexteffects</a> <ul> <li>suggested by Allan</li> </ul></li> </ul> <p><strong>Extras</strong> </p> <p>Brian:</p> <ul> <li><a href="https://github.com/Textualize/textual/releases/tag/v0.55.0">Textual now has a new inline feature in the new release.</a></li> </ul> <p>Michael:</p> <ul> <li>My keynote talk is out: <a href="https://www.youtube.com/watch?v=coz1CGRxjQ0">The State of Python in 2024</a></li> <li>Have you browsed your <a href="https://github.com">github feed</a> lately?</li> <li><a href="https://pythoninsider.blogspot.com/2024/03/python-31014-3919-and-3819-is-now.html">3.10, 3.9, 3.8 security updates</a></li> </ul> <p><strong>Joke:</strong> <a href="https://python-bytes-static.nyc3.digitaloceanspaces.com/definition-of-methodolgy-terms.jpg">Definition of terms</a></p>

Python Software Foundation: New Open Initiative for Cybersecurity Standards

Mon, 2024-04-01 23:00

The Python Software Foundation is pleased to announce our participation in co-starting a new Open Initiative for Cybersecurity Standards collaboration with the Apache Software Foundation, the Eclipse Foundation, other code-hosting open source foundations, SMEs, industry players, and researchers. This collaboration is focused on meeting the real challenges of cybersecurity in the open source ecosystem, and demonstrating full cooperation with and supporting the implementation of the European Union’s Cyber Resilience Act (CRA). With our combined efforts, we are optimistic that we will reach our goal of establishing common specifications for secure open source development based on existing open source best practices. 

New regulations, such as those in the CRA, highlight the need for secure by design and strong supply chain security standards. The CRA will lead to standard requests from the Commission to the European Standards Organisations and we foresee requirements from the United States and other regions in the future. As open source foundations, we want to respond to these requests proactively by establishing common specifications for secure software development and meet the expectations of the newly defined term Open Source Steward. 

Open source communities and foundations, including the Python community, have long been practicing and documenting secure software development processes. The starting points for creating common specifications around security are already there, thanks to millions of contributions to hundreds of open source projects. In the true spirit of open source, we plan to learn from, adapt, and build upon what already exists for the collective betterment of our greater software ecosystem. 

The PSF’s Executive Director Deb Nicholson will attend and participate in the initial Open Initiative for Cybersecurity Standards meetings. Later on, various PSF staff members will join in relevant parts of the conversation to help guide the initiative alongside their peers. The PSF looks forward to more investment in cybersecurity best practices by Python and the industry overall. 

This community-driven initiative will have a lasting impact on the future of cybersecurity and our shared open source communities. We welcome you to join this collaborative effort to develop secure open source development specifications. Participate by sharing your knowledge and input, and by raising up existing community contributions. Sign up for the Open Initiative for Process Specifications mailing list to get involved and stay updated on this initiative. Check out the press releases from the Eclipse Foundation and the Apache Software Foundation for more information.


Hynek Schlawack: Python Project-Local Virtualenv Management Redux

Mon, 2024-04-01 20:00

One of my first TIL entries was about how you can imitate Node’s node_modules semantics in Python on UNIX-like operating systems. A lot has happened since then (for the better!) and it’s time for an update. direnv still rocks, though.


Luke Plant: Enforcing conventions in Django projects with introspection

Mon, 2024-04-01 11:05

Naming conventions can make a big difference to the maintenance issues in software projects. This post is about how we can use the great introspection capabilities in Python to help enforce naming conventions in Django projects.

Let’s start with an example problem and the naming convention we’re going to use to solve it. There are many other applications of the techniques here, but it helps to have something concrete.

The problem: DateTime and DateTimeField confusion

Over several projects I’ve found that inconsistent or bad naming of DateField and DateTimeField fields can cause various problems.

First, poor naming means that you can confuse them for each other, and this can easily trip you up. In Python, datetime is a subclass of date, so if you use a field called created_date assuming it holds a date when it actually holds a datetime, it might not be obvious initially that you are mishandling the value, but you’ll often have subtle problems down the line.
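The subclass relationship is easy to demonstrate; the field name and values below are made up for illustration:

```python
from datetime import date, datetime

# A field named created_date might actually hold a datetime
# (hypothetical value, for illustration only).
created_date = datetime(2024, 4, 1, 17, 30)

# Because datetime subclasses date, type checks won't save you:
print(isinstance(created_date, date))    # True, despite holding a timestamp

# ...but the values behave differently, e.g. equality with a plain date:
print(created_date == date(2024, 4, 1))  # False
```

So code that treats `created_date` as "the day the record was created" can silently compare unequal, group incorrectly, or leak time-of-day information.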

Second, sometimes you have a field named like expired which is actually the timestamp of when the record expired, but it could easily be confused for a boolean field.

Third, not having a strong convention, or having multiple conventions, leads to unnecessary time wasted on decisions that could have been made once.

Finally, inconsistency in naming is just confusing and ugly for developers, and often for users further down the line, because names tend to leak.

Even if you do have an established convention, it’s possible for people not to know. It’s also very easy for people to change a field’s type between date and datetime without also changing the name. So merely having the convention is not enough, it needs to be enforced.

Note

If you want to change the name and type of a field (or any other attribute), and want to preserve the data as much as possible, you usually need to do it in two stages or more depending on your needs, and always check the migrations created – otherwise Django’s migration framework will just see one field removed and a completely different one added, and generate migrations that will destroy your data.
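As a sketch of the two-stage approach (the model, field, and migration names here are hypothetical): first rename the field on its own so Django records a RenameField and keeps the column data, then change the type in a second migration, checking each generated file before applying it.

```python
# Stage 1 (e.g. myapp/migrations/0008_rename_expired.py): rename only,
# so Django emits RenameField instead of a destructive remove/add pair.
from django.db import migrations, models

class Migration(migrations.Migration):
    dependencies = [("myapp", "0007_previous")]
    operations = [
        migrations.RenameField(
            model_name="order",
            old_name="expired",
            new_name="expired_at",
        ),
    ]

# Stage 2 (a separate migration): change the field type, handling any
# data conversion explicitly, e.g. with migrations.RunPython.
# operations = [
#     migrations.AlterField(
#         model_name="order",
#         name="expired_at",
#         field=models.DateTimeField(null=True),
#     ),
# ]
```

Always inspect what makemigrations actually generated: if it produced a RemoveField plus an AddField instead of a RenameField, applying it would destroy the column's data.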

For this specific example, the convention I quite like is:

  • field names should end with _at for timestamp fields that use DateTimeField, like expires_at or deleted_at.

  • field names should end with _on or _date for fields that use DateField, like issued_on or birth_date.

This is based on the English grammar rule that we use “on” for dates but “at” for times – “on the 25th March”, but “at 7:00 pm” – and conveniently it also needs very few letters and tends to read well in code. The _date suffix is also helpful in various contexts where _on seems very unnatural. You might want different conventions, of course.

To get our convention to be enforced with automated checks we need a few tools.

The tools

Introspection

Introspection means the ability to use code to inspect code, and typically we’re talking about doing this when our code is already running, from within the same program and using the same programming language.

In Python, this starts from simple things like isinstance() and type() to check the type of an object, to things like hasattr() to check for the presence of attributes, and many other more advanced techniques, including the inspect module and many of the metaprogramming dunder methods.
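A quick tour of those building blocks, using a made-up class for illustration:

```python
import inspect

class Invoice:
    """A toy class standing in for a Django model."""
    def __init__(self):
        self.issued_on = "2024-04-01"

inv = Invoice()

# Basic type introspection
print(type(inv).__name__)         # Invoice
print(isinstance(inv, Invoice))   # True

# Attribute introspection
print(hasattr(inv, "issued_on"))  # True
print(getattr(inv, "issued_on"))  # 2024-04-01

# The inspect module goes further: members, signatures, source, ...
print(inspect.isclass(Invoice))   # True
print([name for name, _ in inspect.getmembers(inv)
       if not name.startswith("_")])  # ['issued_on']
```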

Django app and model introspection

Django is just Python, so you can use all normal Python introspection techniques. In addition, there is a formally documented and supported set of functions and methods for introspecting Django apps and models, such as the apps module and the Model _meta API.
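For example, from manage.py shell in a configured project, a few lines walk every installed app, model, and field using only those documented APIs (this fragment needs a real project to run against):

```python
from django.apps import apps

# Walk all installed apps, their models, and each model's fields.
for app_config in apps.get_app_configs():
    for model in app_config.get_models():
        for field in model._meta.get_fields():
            print(app_config.label, model.__name__, field.name)
```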

Django checks framework

The third main tool we’re going to use in this solution is Django’s system checks framework, which allows us to run certain kinds of checks, at both “warning” and “error” level. This is the least important tool, and we could in fact switch it out for something else like a unit test.

The solution

It’s easiest to present the code, and then discuss it:

from django.apps import AppConfig, apps
from django.conf import settings
from django.core.checks import Tags, Warning, register


@register()
def check_date_fields(app_configs, **kwargs):
    exceptions = [
        # This field is provided by Django's AbstractBaseUser, we don't control it
        # and we’ll break things if we change it:
        "accounts.User.last_login",
    ]
    from django.db.models import DateField, DateTimeField

    errors = []
    for field in get_first_party_fields():
        field_name = field.name
        model = field.model
        if f"{model._meta.app_label}.{model.__name__}.{field_name}" in exceptions:
            continue
        # Order of checks here is important, because DateTimeField inherits from DateField
        if isinstance(field, DateTimeField):
            if not field_name.endswith("_at"):
                errors.append(
                    Warning(
                        f"{model.__name__}.{field_name} field expected to end with `_at`, "
                        + "or be added to the exceptions in this check.",
                        obj=field,
                        id="conventions.E001",
                    )
                )
        elif isinstance(field, DateField):
            if not (field_name.endswith("_date") or field_name.endswith("_on")):
                errors.append(
                    Warning(
                        f"{model.__name__}.{field_name} field expected to end with `_date` or `_on`, "
                        + "or be added to the exceptions in this check.",
                        obj=field,
                        id="conventions.E002",
                    )
                )
    return errors


def get_first_party_fields():
    for app_config in get_first_party_apps():
        for model in app_config.get_models():
            yield from model._meta.get_fields()


def get_first_party_apps() -> list[AppConfig]:
    return [
        app_config
        for app_config in apps.get_app_configs()
        if is_first_party_app(app_config)
    ]


def is_first_party_app(app_config: AppConfig) -> bool:
    if app_config.module.__name__ in settings.FIRST_PARTY_APPS:
        return True
    app_config_class = app_config.__class__
    if f"{app_config_class.__module__}.{app_config_class.__name__}" in settings.FIRST_PARTY_APPS:
        return True
    return False

We start here with some imports and registration, as documented in the “System checks” docs. You’ll need to place this code somewhere that will be loaded when your application is loaded.

Our checking function defines some allowed exceptions, because some things are out of our control, or there might be other reasons. It also mentions the exceptions mechanism in the warning message. You might want a different mechanism for exceptions here, but I think having some mechanism like this, and advertising its existence in the warnings, is often pretty important. Otherwise, you can end up with worse consequences when people just slavishly follow rules. Notice how in the exception list above I’ve given a comment detailing why the exception is there – this helps to establish a precedent that exceptions should be justified, and that the justification should be there in the code.

We then loop through all “first party” model fields, looking for DateTimeField and DateField instances. This is done using our get_first_party_fields() utility, which is defined in terms of get_first_party_apps(), which in turn depends on is_first_party_app() and a FIRST_PARTY_APPS setting listing the apps we consider our own.

The id values passed to Warning here are examples – you should change according to your needs. You might also choose to use Error instead of Warning.

Output

When you run manage.py check, you’ll then get output like:

System check identified some issues:

WARNINGS:

myapp.MyModel.created: (conventions.E001) MyModel.created field expected to end with `_at`, or be added to the exceptions in this check.

System check identified 1 issue (0 silenced).

As mentioned, you might instead want to run this kind of check as a unit test.
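A minimal sketch of that unit-test variant; the module path myproject.checks and the test runner are assumptions for illustration, not from the post:

```python
# tests/test_conventions.py -- run with pytest-django or `manage.py test`.
# Assumes the check above lives in myproject.checks (hypothetical path).
from myproject.checks import check_date_fields

def test_date_field_naming_conventions():
    # The check returns a list of Warning objects; an empty list means
    # every DateField/DateTimeField follows the naming convention.
    assert check_date_fields(app_configs=None) == []
```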

Conclusion

There are many variations on this technique that can be used to great effect in Django or other Python projects. Very often you will be able to play around with a REPL to do the introspection you need.

Where it is possible, I find doing this far more effective than attempting to document things and relying on people reading and remembering those docs. Every time I’m tripped up by bad names, or when good names or a strong convention could have helped me, I try to think about how I could push people towards a good convention automatically – while also giving a thought to unintended bad consequences of doing that prematurely or too forcefully.


Zero to Mastery: Python Monthly Newsletter 💻🐍

Mon, 2024-04-01 06:00
52nd issue of Andrei Neagoie's must-read monthly Python Newsletter: Whitehouse Recommends Python, Memory Footprint, Let's Talk About Devin, and much more. Read the full newsletter to get up-to-date with everything you need to know from last month.

Tryton News: Newsletter April 2024

Mon, 2024-04-01 02:00

During the last month we focused on fixing bugs, improving the behaviour of things, speeding-up performance issues and adding new features for you.

Changes for the User

Sales, Purchases and Projects

When processing an exception on an order, the user can ignore the exception and so no more related lines/documents will be re-created. But in case of a mistake it was not possible to cancel the ignore. Now we allow the Sale and Purchase administrator group to edit the list of ignored lines to be able to remove mistakes. After changes to the list of ignored lines the user needs to manually reprocess the order, using the Process button, to restore it to a coherent state.

Accounting, Invoicing and Payments

Account users are now allowed to delete draft account moves.

Stock, Production and Shipments

When creating a stock forecast the warehouse is now filled in automatically.

Now the scheduled task maintains a global order of assignations for shipments and productions. A global order is important because assignations are competing with each other to get the products first.

User Interface

We now hide the traceback from an error behind an expander widget, as it may scare some users and it is not helpful for most of them.

System Data and Configuration

Employees are now activated based on the start and end date of their employment.

New Modules

The new stock_product_location_place module allows a specific place to be defined where goods are stored in their location. You can refer to its documentation for more details.

New Documentation

We reworked parts of the Tryton documentation.

How to enter an opening balance.

We changed our documentation hub from readthedocs to self-hosting.

New Releases

We released bug fixes for the currently maintained long term support series
7.0 and 6.0, and for the penultimate series 6.8.

Security

Please update your systems to take care of a security-related bug we found last month.

Changes for the System Administrator

We now make cron and workers exit silently on a keyboard interrupt.

We also introduced a switch on trytond-admin to be able to delay the creation of indexes. This is because the index creation can take a long time to complete when updating modules on big databases. Using this switch the database schema can be quickly created, but will be without the performance gain from the new indexes, which are not available yet. Another run at a more appropriate time without the switch can then be used to create the indexes.

For history records we now display the date time on access errors.

Changes for Implementers and Developers

We now use dot notation and binary operators when converting PYSON to a string when it is to be displayed to the user.

Authors: @dave @pokoli @udono

1 post - 1 participant

Read full topic


Go Deh: Finding a sub-list within a list, in Python

Sun, 2024-03-31 06:15

   

 

Existing?

As part of a larger project, I thought I might need to search for a sub-list within a given list, and because I am lazy I did a quick Google search and did not like the answers I found. I started with the thought that the best algorithm for me would be to start searching from the index of the first item of the sub-list, and so on, but none of the googled answers used list.index.

I decided then to create my own.

My version

Well I want to use list.index. If the item is not in the list then it raises an error, so I'll need a try-except block too.

I look for successive occurrences of the first item of the sub-list within the list and, when one is found, check for a full match, accumulate the index in the answer, and move on to search for the next match.

It seemed easy to add flags to:

  1. Stop after finding a first index of the sub-list in the list.
  2. Allow overlapping matches or not: [1,0,1] is found twice in [1,0,1,0,1], at indices 0 and 2, but only once if overlapping is not allowed.
#!/bin/env python3
#%%
from typing import Any

"""Find instance of a sub-list in a list"""
def index_sublist(lst: list[Any],
                  sublst: list[Any],
                  only_first: bool = False,
                  non_overlapping: bool = False,
                  ) -> list[int]:
    "Find instances of a (non-empty) sub-list in a list"
    if not sublst:
        raise ValueError("Empty sub-list")
    if not lst:
        return []

    first, ln = sublst[0], len(sublst)
    ans, i = [], 0
    while True:
        try:
            # Jump to the next occurrence of the sub-list's first item
            i = lst.index(first, i)
        except ValueError:
            break
        if lst[i: i + ln] == sublst:
            ans.append(i)
            if only_first:
                break
            # Skip past the whole match when overlaps are not allowed
            i += ln if non_overlapping else 1
        else:
            i += 1
    return ans
#%%
def test():
    assert index_sublist([], [1], only_first=False) == []
    assert index_sublist([1], [1], only_first=False) == [0]
    assert index_sublist([1,0,1], [1], only_first=False) == [0, 2]
    assert index_sublist([2,1,0,1], [1], only_first=True) == [1]
    assert index_sublist([2,1,0,1], [1, 3], only_first=False) == []

    assert index_sublist([1,0,1,0,1], [1,0,1],
                         only_first=False,
                         non_overlapping=False) == [0, 2]
    assert index_sublist([1,0,1,0,1], [1,0,1],
                         only_first=False,
                         non_overlapping=True) == [0]

#%%
if __name__ == '__main__':
    test()
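For comparison, the brute-force approach most of the googled answers take, slicing at every position rather than letting list.index jump to the next candidate, can be sketched like this (my addition, not the author's; it reports all overlapping matches):

```python
from typing import Any

def index_sublist_naive(lst: list[Any], sublst: list[Any]) -> list[int]:
    """Return every (overlapping) start index of sublst in lst."""
    if not sublst:
        raise ValueError("Empty sub-list")
    ln = len(sublst)
    # Compare a slice of the right length at every possible start position.
    return [i for i in range(len(lst) - ln + 1) if lst[i:i + ln] == sublst]
```

The list.index version can be faster in practice because the scan for the next candidate start happens in C rather than in a Python-level loop.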

End.

 


Armin Ronacher: Skin in the Game

Sat, 2024-03-30 20:00

There was a bit of a kerfuffle about subverting open source projects recently. That incident made me think about something that's generally on my mind. That thought was triggered by the incident but is otherwise not a response to it. I want to talk about some of the stresses of being an Open Source contributor and maintainer, but specifically about something I have been unsure about over the years: anonymity and pseudonymity.

Over the years it has been pretty clear that some folks are contributing in the Open Source space and don't want to have their name attached to their contributions. I'm not going to judge if they have legitimate reasons for doing so or if pseudonymity is a good or bad thing. That it is happening is simply a fact of life. The consequences of that, however, are quite interesting and I think worth discussing.

When I talk about names, I primarily think about the ability to associate an online handle and a contribution with a real human being. That does not imply that it should necessarily be trivial for people to find that information, but it should be something that is at least in principle possible. There is obviously a balance to all of this, but given that there are real consequences to “doing stuff on the internet” there has to be a way to get in contact with the person behind it. So as far as “naming a person” here is concerned, it's not so much about a particular name as about being able to identify the human being behind it.

While we might get away with believing nothing on the internet matters and laws do not apply, that's not really true. In fact particularly with Open Source we're all leveraging copyright laws and the ability to enforce contracts to work together. And no matter how much we write “THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES” not all legal consequences can be waived.

Which leads me to a development in internet anonymity I have observed over the last 20 years which I find worth reflecting on. When I got started with Open Source, pseudonyms felt much less common. The distance to the legal system at least to me felt much closer than today. I give you a handful of examples of this: When I got started doing stuff on the internet and you did something really stupid, someone called your ISP and you had an angry conversation. Because the subscriber of that line was known. A lot of the systems on the earlier internet were based on a lot more trust than would be acceptable today. An angry ISP was not the worst that would happen to you, a lot of people got charged with wire-fraud for things that today are just being ignored because they have become too commonplace (like probably most DDOS attacks these days). When I created my first SourceForge account, the “real name” field was not optional, CLAs talked about names and asked for signatures. When my stuff was packaged up in Debian some of the first things that came my way were folks explaining to me some legal stuff about licenses I had been unaware of before. After I started getting involved with Ubuntu I went to a key signing party where I showed my passport to other human beings to demonstrate that I exist. When I became a Python core contributor I signed a physical paper for the CLA.

A lot of this feels quite untypical today. We no longer do a lot of these things and I believe it mostly just works because people don't go to court much about Open Source projects any more. It probably also works because over time Open Source became more established. If you contribute via GitHub today, even the terms of service probably help resolving copyright issues by being quite explicit about how contributions to public repositories happen (you contribute under the license of the repository).

But sometimes people do go to court. Open Source projects in many ways are an unclear amalgamation of different contributions and we just collectively hope that we all agree that contributions come in under the same licenses as the file in the root of the project. The Linux kernel once did not accept contributions from pseudonymous users. It did so for good reasons. They need to know who the person is that contributes so they know what to do in case of a licensing conflict and there was more than one lawsuit involving Linux. This was true even after the DCO was put in place. Today, pseudonyms are accepted. Not just in Linux, but also in many large projects. An example of that is the CNCF which found a nice middle ground on the name and what you sign off with: “A real name does not require a legal name, nor a birth name, nor any name that appears on an official ID (e.g. a passport). Your real name is the name you convey to people in the community for them to use to identify you as you.”

Most important however is this part: “Your real name should not be an anonymous id or false name that misrepresents who you are.” The need to get in contact with the person exists and did not go away. It always existed and it quite likely will continue to exist. There are good reasons why you want to know who the person is. Maybe the person contributed code they did not own the copyright of, maybe their employer writes you an angry email. Concerns about licensing are a common reason for why people want to know who the people are that contribute. Maybe sanctions or other legal restrictions prevent you from accepting contributions from that person. Another reason you might need to get in contact with the author is to change the license. You might remember that a lot of projects tried to move from GPL v2 to GPL v2 or later, a change that required the agreement of every person that contributed before. Reaching out to people sometimes is not the easiest of tasks.

However in addition to pseudonymous contributions, there is also a sharp increase of anonymous contributions. Particularly thanks to GitHub pull requests it's incredibly common that you get commits now from folks whose only identity is a made up user name, no visible email address and some default avatar that GitHub generated.

This is not necessarily a problem, but to me it feels like a trend that I'm not sure how to work with. It creates a somewhat complex form of interaction where one person might be out in the open while the other person might be entirely anonymous. Many of us old timers who went into Open Source in former times have a pretty well established online identity (either via a “real name” or pseudonym). I also think that many of us who have been in this for a while feel quite a bit of stress and responsibility for the things we created, at least that is very much true for me. Multiple times over the years I have heard or read online that the reason a person chooses to contribute anonymously is that their employer bans Open Source work. On the one hand it's great that people find a way to avoid these restrictions; on the other hand, if that ever gets found out they are probably going to have some unfriendly talks with someone else's legal team. While in practice none of my code is important enough that I think something like this will happen, I can absolutely see this happening to large Open Source projects where a rogue employee contributes on their employer's time or contributes otherwise proprietary code.

I have heard the sentiment a few times now that one should vet the contributions, not the contributors. That's absolutely true. Yet at the same time many of us are quite frankly assuming good actors and are just happy to get contributions. We sometimes merge pull requests not in the best state of mind, sometimes we feel pressured. It can be quite hard to spot back doors and hostile commits, particularly if the other side is sufficiently motivated. But here is the thing: you know who I am, yet I do not know who a lot of the people are that send pull requests against my libraries. An asymmetry I need to work with.

What motivates me to write this is that I feel quite a bit of asymmetry in contributions these days. It's a lot easier to contribute to Open Source these days and that's a good thing. But it also comes at a cost. It's entirely possible to find yourself having become a critical piece of software deployed all over the world by accident. Your users update to the latest version of your code without any vetting of their own. Yet the brunt of the responsibility falls on you, the person associated with the project. A person that might be known. Yet a lot of the contributions are from random people, and you might not have a good chance of identifying them. Sometimes it's not even the contributions, it's already anonymous users on the issue trackers that increase that pressure.

I find that environment at times to be emotionally stressful, much more than it has been. I don't even maintain particularly popular pieces of Open Source libraries these days but I still feel much more stressed about that experience than years ago, and a pretty big element of it is that I feel that a lot of the issues and commits are from people who show up once and then leave. Maybe it's because I'm older, or because I also have other things in my life than Open Source, but the situation is what it is.

Which brings me back to the identity thing. It's probably great for a lot of people that their online identity is not clearly linked to their real world identity. What I find less great is that with this loss of real identity many of the real world legal consequences are then stuck with me, a person that can be identified. I don't assume that knowing who the folks are that contribute will solve any problems, mind you. While I do have some probably unrealistic hope that law enforcement agencies would find it a bit easier to get involved if they can better identify a bad actor, I'm not even sure if they would find much interest in getting involved in the first place. To me, it's mostly a peace of mind thing.

Everybody's contribution to one's projects turns into a permanent liability in a way. I take responsibility for someone else's commit the moment I press the merge button. While many of those contributions are benign no matter what, you do start to trust repeated contributors after a while. A well established identity on the internet creates a form of inner peace; handing over a project more and more to a person you don't know, less so. Yet it can happen absolutely gradually. Maybe verified identities are an illusion, but sometimes that illusion is all that's needed to feel more relaxed.

I don't think we should force people to have a real-world identity on the internet, but we probably also have to take a step back and look at how we got here and whether we like it this way. In a sense this is a generic rant about missing the “good old times” (which probably never were), when people talked to each other eye to eye. Instead, more and more, interactions on the internet feel as if they are happening with faceless figures you will probably never meet, see, talk to, or write to.

So what's left? I don't know. I don't know whether this is a problem only I feel, nor do I know a solution to it if it is one. All I can say is that I find Open Source stressful in more than one way these days.

Categories: FLOSS Project Planets

Glyph Lefkowitz: Software Needs To Be More Expensive

Sat, 2024-03-30 19:00
The Cost of Coffee

One of the ideas that James Hoffmann — probably the most influential… influencer in the coffee industry — works hard to popularize is that “coffee needs to be more expensive”.

The coffee industry is famously exploitative. Despite relatively thin margins for independent café owners1, there is no shortage of horrific stories about labor exploitation and even slavery in the coffee supply chain.

To summarize a point that Mr. Hoffmann has made over a quite long series of videos and interviews2, some of this can be fixed by regulatory efforts. Enforcement of supply chain policies by both manufacturers and governments can help spot and avoid this type of exploitation. Some of it can be fixed by discernment on the part of consumers: you can try to buy fair-trade coffee and avoid brands that you know have problematic supply-chain histories.

Ultimately, though, even if there is perfect, universal, zero-cost enforcement of supply chain integrity… consumers still have to be willing to, you know, pay more for the coffee. It costs more to pay wages than to have slaves.

The Price of Software

The problem with the coffee supply chain deserves your attention in its own right. I don’t mean to claim that the problems of open source maintainers are as severe as those of literal child slaves. But the principle is the same.

Every tech company uses huge amounts of open source software, which they get for free.

I do not want to argue that this is straightforwardly exploitation. There is a complex bargain here for the open source maintainers: if you create open source software, you can get a job more easily. If you create open source infrastructure, you can make choices about the architecture of your projects which are more long-term sustainable from a technology perspective, but would be harder to justify on a shorter-term commercial development schedule. You can collaborate with a wider group across the industry. You can build your personal brand.

But, in light of the recent xz Utils / SSH backdoor scandal, it is clear that while the bargain may not be entirely one-sided, it is not symmetrical, and significant bad consequences may result, both for the maintainers themselves and for society.

To fix this problem, open source software needs to get more expensive.

A big part of the appeal of open source is its implicit permission structure, which derives both from its zero up-front cost and its zero marginal cost.

The zero up-front cost means that you can just get it to try it out. In many companies, individual software developers do not have the authority to write a purchase order, or even access to a corporate credit card for small expenses.

If you are a software engineer and you need a new development tool or library for work, getting the purchase approved can be a maze of bureaucratic confusion. It might be possible, but you are likely to get strange looks, and someone, probably your manager, is quite likely to say “isn’t there a free option for this?” At worst, it might just be impossible.

This makes sense. Dealing with purchase orders and reimbursement requests is annoying, and it only feels worth the overhead if you’re dealing with a large enough block of functionality that it is worth it for an entire team, or better yet an org, to adopt. This means that most of the purchasing is done by management types or “architects”, who are empowered to make decisions for larger groups.

When individual engineers need to solve a problem, they look at open source libraries and tools specifically because it’s quick and easy to incorporate them in a pull request, where a proprietary solution might be tedious and expensive.

That’s assuming a proprietary solution to your problem even exists. In the infrastructure sector of the software economy, between free options from your operating system provider (Apple, Microsoft, maybe Amazon if you’re in the cloud) and open source developers, small commercial options have been marginalized or outright destroyed by zero-cost alternatives, for exactly this reason.

If the zero up-front cost is a paperwork-reduction benefit, then the zero marginal cost is almost a requirement. One of the perennial complaints of open source maintainers is that companies take our stuff, build it into a product, and then make a zillion dollars and give us nothing. It seems fair that they’d give us some kind of royalty, right? Some tiny fraction of that windfall? But once you realize that individual developers don’t have the authority to put $50 on a corporate card to buy a tool, they super don’t have the authority to make a technical decision that encumbers the intellectual property of their entire core product to give some fraction of the company’s revenue away to a third party. Structurally, there’s no way that this will ever happen.

Despite these impediments, keeping those dependencies maintained does cost money.

Some Solutions Already Exist

There are various official channels developing to help support the maintenance of critical infrastructure. If you work at a big company, you should probably have a corporate Tidelift subscription. Maybe ask your employer about that.

But, as they will readily admit, there are a LOT of projects that even Tidelift cannot cover, with no official commercial support and no practical way to offer it in the short term. Individual maintainers, like yours truly, are left trying to figure out how to maintain their projects, either by making a living from them or by incorporating them into our jobs somehow: people with a Ko-Fi or a Patreon, or maybe just an Amazon wish-list to let you say “thanks” for occasional maintenance work.

Most importantly, there’s no path for them to transition to actually making a living from their maintenance work. For most maintainers, Tidelift pays a sub-hobbyist amount of money, and even setting it up (and GitHub Sponsors, etc) is a huge hassle. So even making the transition from “no income” to “a little bit of side-hustle income” may be prohibitively bureaucratic.

Let’s take myself as an example. If you’re a developer who is nominally profiting from my infrastructure work in your own career, there is a very strong chance that you are also a contributor to the open source commons, and perhaps you’ve even contributed more to that commons than I have, or contributed more to my career success than I have to yours. I can ask you to pay me3, but really you shouldn’t be paying me; your employer should.

What To Do Now: Make It Easy To Just Pay Money

So if we just need to give open source maintainers more money, and it’s really the employers who ought to be giving it, then what can we do?

Let’s not make it complicated. Employers should Just Give Maintainers Money. Let’s call it the “JGMM” benefit.

Specifically, every employer of software engineers should immediately institute the following benefits program: each software engineer should have a monthly discretionary budget of $50 to distribute to whatever open source dependency developers they want, in whatever way they see fit. Venmo, Patreon, PayPal, Kickstarter, GitHub Sponsors, whatever, it doesn’t matter. Put it on a corp card, put the project name on the line item, and forget about it. It’s only for open source maintenance, but it’s a small enough amount that you don’t need intense levels of approval-gating process. You can do it on the honor system.

This preserves zero up-front cost: to start using a dependency, you still just use it4. It also preserves zero marginal cost: your developers choose which dependencies to support based on perceived need and popularity. It’s a fixed overhead which doesn’t scale with revenue or profit, just with headcount.

Because the whole point here is to match the easy, implicit, no-process, no-controls way in which dependencies can be added in most companies. It should be easier to pay these small tips than it is to use the software in the first place.

This sub-1% overhead to your staffing costs will massively de-risk the open source projects you use. By leaving the discretion up to your engineers, you will end up supporting those projects which are really struggling and which your executives won’t even hear about until they end up on the news. Some of it will go to projects that you don’t use, things that your engineers find fascinating and want to use one day but don’t yet depend upon, but that’s fine too. Consider it an extremely cheap, efficient R&D expense.
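The “sub-1%” figure is easy to sanity-check with back-of-the-envelope arithmetic. The fully loaded engineer cost below is an illustrative assumption, not a number from this post:

```python
# Back-of-the-envelope check of the "sub-1% of staffing costs" claim.
monthly_budget = 50                      # per-engineer JGMM budget, USD/month
annual_budget = monthly_budget * 12      # $600 per engineer per year

# Fully loaded annual cost of one engineer (salary + benefits + overhead).
# This figure is an assumption for illustration, not from the post.
fully_loaded_cost = 150_000

overhead = annual_budget / fully_loaded_cost
print(f"JGMM overhead: {overhead:.2%} of staffing costs")
# prints: JGMM overhead: 0.40% of staffing costs
```

Even halving the assumed cost per engineer keeps the overhead well under the 1% mark.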

A lot of the options for developers to support open source infrastructure are already tax-deductible, so if they contribute to something like one of the PSF’s fiscal sponsorees, it’s probably even more tax-advantaged than a regular business expense.

I also strongly suspect that if you’re one of the first employers to do this, you can get a round of really positive PR out of the tech press and attract engineers, so the race is on. I don’t really count as the “tech press”, but nevertheless drop me a line to let me know if your company ends up doing this so I can shout you out.

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor! I am also available for consulting work if you think your organization could benefit from expertise on topics such as “How do I figure out which open source projects to give money to?”.

  1. I don’t have time to get into the margins for Starbucks and friends, their relationship with labor, economies of scale, etc. 

  2. While this is a theme that pervades much of his work, the only place I can easily find where he says it in so many words is on a podcast that sometimes also promotes right-wing weirdos and pseudo-scientific quacks spreading misinformation about autism and ADHD. So, I obviously don’t want to link to them; you’ll have to take my word for it. 

  3. and I will, since, as I just recently wrote about, I need to make sure that people are at least aware of the option 

  4. Pending whatever legal approval program you have in place to vet the license. You do have a nice streamlined legal approvals process, right? You’re not just putting WTFPL software into production, are you? 

