FLOSS Project Planets

Real Python: Quiz: Creating Great README Files for Your Python Projects

Planet Python - Wed, 2024-06-19 08:00

Test your understanding of how a great README file can make your Python project stand out and how to create your own README files.

Take this quiz after reading our Creating Great README Files for Your Python Projects tutorial.

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

PyCharm: How to Move From pandas to Polars

Planet Python - Wed, 2024-06-19 07:48

This is a guest post from Cheuk Ting Ho, a data scientist who contributes to multiple open-source libraries, such as pandas and Polars.

You’ve probably heard about Polars – it is now firmly in the spotlight in the data science community. 

Are you still using pandas and would like to try out Polars? Are you worried that it will take a lot of effort to migrate your projects from pandas to Polars? You might be concerned that Polars won’t be compatible with your existing pipeline or the other tools you are currently using.

Fear not! In this article, I will answer these questions so you can decide whether to migrate to using Polars or not. I will also provide some tips for those of you who have already decided to migrate.

How is Polars different from pandas?

Polars is known for its speed and security, as it is written in Rust and based on Apache Arrow. For details about Polars vs. pandas, you can see our other blog post here. In short, while Polars’ backend architecture is different from pandas’, the creator and community around Polars have tried to maintain a Python API that is very similar to pandas’. At first glance, Polars code is very similar to pandas code. Fun fact – some contributors to pandas are also contributors to Polars. Due to this, the barrier for pandas users to start using Polars is relatively low. However, as it is still a different library, it is worth double-checking the differences between the two.

Advantages of using Polars

Have you struggled when using pandas for a relatively large data set? Do you think pandas is using too much RAM and slowing your computer down while working locally? Polars may solve this problem with its lazy API: intermediate steps aren’t executed unless they’re needed, which in some cases saves the memory those steps would otherwise consume.
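
Here is a minimal sketch of what the lazy API looks like in practice (the file name measurements.csv and the column names are made up for illustration):

import polars as pl

# scan_csv builds a lazy query plan instead of reading the file immediately.
lazy_query = (
    pl.scan_csv("measurements.csv")
    .filter(pl.col("value") > 0)
    .group_by("station")
    .agg(pl.mean("value"))
)

# Nothing has been read or computed yet; collect() runs the optimized plan.
df = lazy_query.collect()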

Another advantage Polars has is that, since it is written in Rust, it can make use of concurrency much better than pandas. Python is traditionally single-threaded (because of the global interpreter lock), and although pandas uses the NumPy backend to speed up some operations, it is still mainly written in Python and has certain limitations in its multithreading capabilities.

Tools that make the switch easy

As Polars’ popularity grows, there is more and more support for Polars in popular tools for data scientists, including scikit-learn and HoloViz.

PyCharm, the most popular IDE used by data scientists, provides a similar experience when you work with pandas and Polars. This makes the process of migration smoother. For example, interactive tables allow you to easily see information about your DataFrame, such as the number of rows and columns.

Try PyCharm for free

PyCharm has an excellent pagination feature – if you want to see more results per page, you can easily configure that via a drop-down menu:

You can see the statistical summary for the data when you hover the cursor over the column name:

You can also sort the data for inspection with a few clicks in the header, and there is multi-sorting functionality – after sorting the table once, press and hold ⌘ (macOS) or Alt (Windows) and click on the second column you want the table to be sorted by. For example, here we can sort by island and bill_length_mm in the table.

To get more insights from the DataFrame, you can switch to chart view with the icon on the left:

You can also change how the data is shown in the settings, showing different columns and using different graph types:

PyCharm also auto-completes methods when you’re using Polars – very handy when you’re starting out with Polars and aren’t yet familiar with all of the methods it provides. To understand more about full line code completion in JetBrains IDEs, please check out this article.

You can also access the official documentation quickly by clicking the Polars icon in the top-right corner of the table, which is really handy.

How to migrate from pandas to Polars

If you’re now convinced to migrate to Polars, your final questions might be about the extent of changes needed for your existing code and how easy it is to learn Polars, especially considering your years of experience and muscle memory with pandas.

Similarities between pandas and Polars

Polars provides APIs similar to pandas, most notably read_csv(), head(), tail(), and describe() for a glance at what the data looks like. It also provides similar data manipulation functions like join() and groupby()/group_by() (note the underscore in Polars), and aggregation functions like mean() and sum().

Before going into the migration, let’s look at these code examples in Polars and pandas.

Example 1 – Calculating the mean score for each class

pandas

import pandas as pd

df_student = pd.read_csv("student_info.csv")
print(df_student.dtypes)

df_score = pd.read_csv("student_score.csv")
print(df_score.head())

df_class = df_student.join(df_score.set_index("name"), on="name").drop("name", axis=1)
df_mean_score = df_class.groupby("class").mean()
print(df_mean_score)

Polars

import polars as pl

df_student = pl.read_csv("student_info.csv")
print(df_student.dtypes)

df_score = pl.read_csv("student_score.csv")
print(df_score.head())

df_class = df_student.join(df_score, on="name").drop("name")
df_mean_score = df_class.group_by("class").mean()
print(df_mean_score)

Polars provides similar IO methods like read_csv. You can also inspect the dtypes, do data cleaning with drop, and do group_by with aggregation functions like mean.

Example 2 – Calculating the rolling mean of temperatures

pandas

import pandas as pd

df_temp = pd.read_csv(
    "temp_record.csv", index_col="date", parse_dates=True, dtype={"temp": int}
)
print(df_temp.dtypes)
print(df_temp.head())

df_temp.rolling(2).mean()

Polars

import polars as pl

df_temp = pl.read_csv(
    "temp_record.csv", try_parse_dates=True, dtypes={"temp": int}
).set_sorted("date")
print(df_temp.dtypes)
print(df_temp.head())

df_temp.rolling("date", period="2d").agg(pl.mean("temp"))

Reading a CSV with dates can also be done with read_csv in Polars, with a slight difference in the function arguments (Polars has no index, so the date column is marked as sorted instead). A rolling mean (or other types of aggregation) can also be done in Polars.

As you can see, these code examples are very similar, with only slight differences. If you are an experienced pandas user, I am sure your journey using Polars will be quite smooth.

Tips for migrating from pandas to Polars

As for code that was previously written in pandas, how can you migrate it to Polars? What are the differences in syntax that may trip you up? Here are some tips that may be useful:

Selecting and filtering

In pandas, we use .loc / .iloc and [] to select part of the data in a DataFrame. In Polars, we use .select instead. For example, df["age"] or df.loc[:, "age"] in pandas becomes df.select("age") in Polars.

In pandas, we can also create a boolean mask to filter out data. In Polars, we use .filter instead. For example, df[df["age"] > 18] in pandas becomes df.filter(pl.col("age") > 18) in Polars.

All of the code that involves selecting and filtering data needs to be rewritten accordingly.
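
To make the mapping concrete, here is a small side-by-side sketch with made-up data:

import pandas as pd
import polars as pl

data = {"name": ["Ann", "Bob", "Cleo"], "age": [17, 25, 31]}

# pandas: boolean-mask filtering plus label-based column selection.
pdf = pd.DataFrame(data)
adults_pd = pdf.loc[pdf["age"] > 18, ["name"]]

# Polars: expression-based filter and select.
pldf = pl.DataFrame(data)
adults_pl = pldf.filter(pl.col("age") > 18).select("name")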

Use .with_columns instead of .assign

A slight difference between pandas and Polars is that in pandas, we use .assign to create new columns by applying certain logic and operations to existing columns. In Polars, this is done with .with_columns. For example:

In pandas

df_rec.assign(
    diameter=lambda df: (df.x + df.y) * 2,
    area=lambda df: df.x * df.y,
)

becomes

df_rec.with_columns(
    diameter=(pl.col("x") + pl.col("y")) * 2,
    area=pl.col("x") * pl.col("y"),
)

in Polars.

.with_columns can replace groupby

In addition to assigning a new column with simple logic and operations, .with_columns offers more advanced capabilities. With a little trick, you can perform operations similar to groupby in pandas by using window functions:

In pandas

df = pd.DataFrame({
    "class": ["a", "a", "a", "b", "b", "b", "b"],
    # score must be numeric for mean() to work.
    "score": [80, 39, 67, 28, 77, 90, 44],
})
df["avg_score"] = df.groupby("class")["score"].transform("mean")

becomes

df.with_columns(
    pl.col("score").mean().over("class").alias("avg_score")
)

in Polars.

Use scan_csv instead of read_csv if you can

Although read_csv also works in Polars, using scan_csv instead switches you to lazy evaluation mode, which benefits from the lazy API mentioned above.
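
The swap itself is a one-line change. A quick sketch, assuming a hypothetical sales.csv with an amount column:

import polars as pl

# Eager: reads the whole file into memory right away.
df = pl.read_csv("sales.csv")

# Lazy: only builds a plan; reading happens at collect(), so Polars can
# skip columns and rows that the query never uses.
total = pl.scan_csv("sales.csv").select(pl.sum("amount")).collect()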

Building pipelines properly with lazy API

In pandas, we usually use .pipe to build data pipelines. However, since Polars works a bit differently, especially when using the lazy API, we want the pipeline to be executed only once. So, we need to adjust the code accordingly. For example:

Instead of this pandas code snippet:

def discount(df):
    df["30_percent_off"] = df["price"] * 0.7
    return df

def vat(df):
    df["vat"] = df["price"] * 0.2
    return df

def total_cost(df):
    df["total"] = df["30_percent_off"] + df["vat"]
    return df

(df
 .pipe(discount)
 .pipe(vat)
 .pipe(total_cost)
)

We will have the following one in Polars:

def discount(input_col):
    return pl.col(input_col).mul(0.7).alias("30_percent_off")

def vat(input_col):
    return pl.col(input_col).mul(0.2).alias("vat")

def total_cost(input_col1, input_col2):
    return pl.col(input_col1).add(pl.col(input_col2)).alias("total")

# total_cost uses columns created by discount and vat, so it needs its own
# with_columns call: expressions within a single with_columns run in
# parallel and cannot see each other's output.
df.with_columns(
    discount("price"),
    vat("price"),
).with_columns(
    total_cost("30_percent_off", "vat"),
)

Missing data: No more NaN

Do you find NaN in pandas confusing? There is no NaN in Polars! Since NaN is an object in NumPy and Polars doesn’t use NumPy as the backend, all missing data will now be null instead. For details about null and NaN in Polars, check out the documentation.
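
A small sketch of the distinction (the column name is made up):

import polars as pl

df = pl.DataFrame({"x": [1.0, None, float("nan")]})

# null means "missing"; NaN is a valid float value in Polars.
print(df.select(
    pl.col("x").is_null().alias("is_null"),
    pl.col("x").is_nan().alias("is_nan"),
))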

Exploratory data analysis with Polars

Polars provides a similar API to pandas, and with hvPlot, you can easily create simple plots for exploratory data analysis in Polars. Here I will show two examples: one creating simple statistical information from your data set, and the other plotting simple graphs to understand the data.

Summary statistics from dataset

When using pandas, the most common way to get summary statistics is to use describe. In Polars, we can also use describe in a similar manner. For example, say we have a DataFrame with some numerical data and missing data:

We can use describe to get summary statistics:
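
For instance, a minimal sketch with made-up data – a string column called name plus a numeric column with missing values:

import polars as pl

df = pl.DataFrame({
    "name": ["Ann", "Bob", None, "Dana"],
    "score": [80.5, None, 67.0, 28.0],
})
print(df.describe())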

Notice how object types are treated – in this example, the column name gives a different result compared to pandas. In pandas, a column with object type will result in categorical data like this:

In Polars, the result is similar to numeric data, which makes less sense:

Simple plotting with Polars DataFrame


To better visualize the data, we might want to plot some graphs to help us evaluate it more efficiently. Here is how to do so with the plot method in Polars.

First of all, since Polars uses hvPlot as its plotting backend, make sure it is installed. You can find the hvPlot User Guide here. Next, since hvPlot outputs the graph as an interactive Bokeh graph, we need to use output_notebook from bokeh.plotting to make sure it shows inline in the notebook. Add this code at the top of your notebook:

from bokeh.plotting import output_notebook

output_notebook()

Also, make sure your notebook is trusted. This is done by simply checking the checkbox in the top-right of the display when using PyCharm.

Next, you can use the plot method in Polars. For example, to make a scatter plot, you have to specify the columns to be used as the x- and y-axes, and you can also specify the column to be used as the color of the points:

df.plot.scatter(x="body_mass_g", y="bill_length_mm", color="species")

This will give you a nice plot of the different data points of different penguin species for inspection:

Of course, scatter plots aren’t your only option. In Polars, you can use similar steps to create any type of plot that is supported by hvPlot. For example, a histogram can be created like this:

df.plot.hist("body_mass_g", by=["species","sex"])

For a full list of plot types supported by hvPlot, you can have a look at the hvPlot reference gallery.

Conclusion

I hope the information provided here will help you on your way with using Polars. Polars is an open-source project that is actively maintained and developed. If you have suggestions or questions, I recommend reaching out to the Polars community.

About the author

Cheuk Ting Ho

Cheuk has been a Data Scientist at various companies – a job that demands high numerical and programming skills, especially in Python. Following her passion for the tech community, Cheuk has been a Developer Advocate for three years. She also contributes to multiple open-source libraries like Hypothesis, Pytest, pandas, Polars, PyO3, Jupyter Notebook, and Django. Cheuk is currently a consultant and trainer at CMD Limes.

Categories: FLOSS Project Planets

EuroPython: EuroPython June 2024 Newsletter

Planet Python - Wed, 2024-06-19 06:13
🐍 EuroPython 2024: The Ultimate Python Party Awaits! 🎉


Hello Pythonistas,

Get ready to code, connect, and celebrate at EuroPython 2024! We’re thrilled to bring you an unforgettable conference experience filled with enlightening talks, engaging workshops, and a whole lot of fun. Whether you're a seasoned developer or just starting your Python journey, there's something for everyone. Let's dive into the details!

⏰ THREE DAYS LEFT TO BUY YOUR TICKETS!! 🎟️

Don't miss out on the Python event of the year! Secure your spot today and be part of the magic.

🎟️ Buy your tickets here!!! 🎟️

The Late Bird prices kick in this Saturday (June 22nd).

SCHEDULE 📅

The schedule is OUT! Check out all the awesome stuff we have planned for you in Prague this year.

🎤 Keynote Speakers

We're excited to announce our stellar lineup of keynote speakers who will inspire and challenge you with their insights and experiences:

  • Carol Willing - Developer Advocate at Noteable and core developer of Jupyter and CPython.
  • Tereza Iofciu - Data Science Lead at Free Now, co-founder of PyLadies Hamburg.
  • Anna Přistoupilová - Bioinformatician and researcher.
  • Armin Ronacher - Creator of Flask and Director of Engineering at Sentry.
  • Łukasz Langa - Python core developer and Release Manager for Python 3.8 and 3.9.
  • Mai Giménez - Senior research engineer at Google DeepMind, specialising in large language and multimodal models.
🥳 Social Events
  • Boat Trip: Set sail on Friday with us for a scenic boat trip to enjoy an evening of networking and relaxation. Make sure to reserve your spot early! Sign-up will be available soon.
  • EuroPython Social Event: Join us for a fantastic evening in Prague. This event promises great food, drinks, and the opportunity to connect with fellow attendees in a beautiful setting. You will be invited to bring your favourite games and musical instruments. Stay tuned!
  • Speakers’ Dinner: An exclusive dinner event for our speakers to network, share insights, and enjoy a relaxing evening before the conference kicks off. More information here.
🍽 PyLadies Lunch

PyLadies Lunch at EuroPython 2023 at the Prague Conference Centre

On Thursday, 11th July 2024, from 12:30 to 14:00, join us for a special lunch event aimed at fostering community and empowerment among women in tech.

Thank you to our sponsor 🐙 Kraken Tech 🐙 for supporting the lunch event.

We’re excited to announce a range of events for underrepresented groups in computing this year! 🎉 Whether you’re new to PyLadies or a long-time supporter, we warmly welcome you to join us and be part of our supportive community.

  • Self-Defence Workshop: Learn to defend yourself against inappropriate behaviour in this supportive session. Facilitated by a professional therapist, you'll gain practical skills and mutual support.
  • #IAmRemarkable: Empower yourself in this workshop designed to help women and underrepresented groups celebrate their achievements and enhance their self-promotion skills.
  • Meet & Greet with PyLadies: Network with seasoned members of the PyLadies community. Gain valuable insights, advice, and inspiration for your Python journey.

Sign up for any of the sessions above here

🌍 Community Organiser's Lunch

On Friday (July 12th) at 1 pm. A great opportunity for community leaders to network and discuss strategies for success. This lunch will include an Open Space discussion about Python Organizations and how we deal with challenges.

Sign up for the Community Organiser’s session here

👩‍💻 Learn to Build Websites With Django Girls

Are you interested in learning how to build websites and belong to an underrepresented group in computing? Join us for a one-day workshop!

No prior programming experience is required. The workshop is open to everyone, regardless of participation in EuroPython. For more information, click here.

👶 Childcare

This year, we're once again partnering with Susie's Babysitting Prague to offer childcare at the main conference venue (Prague Conference Centre).

If you're interested, please let us know at the latest two weeks before the event by filling out this form.

You will be asked about the Childcare add-on when you buy your ticket.

💻 Sprint Weekend

EuroPython 2024 Sprints will be during the weekend of the 13th and 14th of July. This year, the event will happen at a different venue from the conference and it will be free for anyone with a conference ticket to join!

As per our tradition, EuroPython will provide the rooms and facilities, but the sprints are organised by YOU. It is a great chance to contribute to open-source projects large and small, learn from each other, geek out, and have fun. 🐍

Lunch and coffee will be provided.

  • When: 13th and 14th July 2024 (09:30 - 17:00)
  • Where: To be determined
🤭 Py.Jokes

~ pyjoke
There are only two hard problems in Computer Science: cache invalidation, naming things and off-by-one-errors.

📱 Stay Connected

Share your EuroPython experience on social media!

Use the hashtag #EuroPython2024 and follow us on:

With so much joy and excitement,

EuroPython 2024 Team 🤗

Categories: FLOSS Project Planets

Qt for MCUs 2.8 LTS released

Planet KDE - Wed, 2024-06-19 06:05

We are thrilled to announce the release of Qt for MCUs 2.8 LTS, which comes with new exciting GUI building blocks, improvements to build tools workflows, extended support for Infineon TRAVEO T2G microcontrollers, and much more. Qt for MCUs 2.8 is a Long-Term Support version, offering increased stability throughout your development. As such, it is the preferred version for all new projects. Standard Support will be available for 18 months, until December 2025.

Categories: FLOSS Project Planets

The Drop Times: What We Learned from DrupalJam: Open Up 2024

Planet Drupal - Wed, 2024-06-19 04:41
Celebrate the 20th DrupalJam with Esmeralda Tijhoff! Join over 330 participants in Utrecht and delve into Dries Buytaert's keynote, insightful presentations, and hands-on workshops. Discover the latest trends in Drupal technology and explore the evolution of open source at DrupalJam 2024. Don't miss out on this comprehensive event recap—read more now!
Categories: FLOSS Project Planets

LakeDrops Drupal Consulting, Development and Hosting: ECA 2.0.0 has been released for Drupal 10.3 and 11

Planet Drupal - Wed, 2024-06-19 04:41
Jürgen Haas, Wed, 19.06.2024 - 10:41

Almost 2 years ago, ECA 1.0.0 was published, and a lot happened in the 23 months in between. Today, ECA gets its first major update which comes not only with a ton of new features but also with code clean-up, performance improvements and support for the latest Drupal core releases 10.3 and soon 11.

Categories: FLOSS Project Planets

ComputerMinds.co.uk: My text filter's placeholder content disappeared!

Planet Drupal - Wed, 2024-06-19 04:40
A story of contributing a fix to Drupal... and a pragmatic workaround

When I upgraded a site from Drupal 10.1 to 10.2, I discovered a particularly serious bug: the login form on our client's site vanished ... which was pretty serious for this site, which hid all content behind a login!

We had a custom text format filter plugin to render the login form in place of a custom token in text that editors set, on one of the few pages that anonymous users could access. Forms can have quite different cacheability to the rest of a page, and building them can be a relatively expensive operation anyway, so we used placeholders which Drupal can replace 'lazily' outside of regular caching:

use Drupal\Core\Security\TrustedCallbackInterface;
use Drupal\filter\FilterProcessResult;
use Drupal\filter\Plugin\FilterBase;
use Drupal\user\Form\UserLoginForm;

class MymoduleLoginFormFilter extends FilterBase implements TrustedCallbackInterface {

  public function process($text, $langcode) {
    $result = new FilterProcessResult($text);
    $needle = '[login_form]';
    // No arguments needed as [login_form] is always to be replaced with the same form.
    $arguments = [];
    $replace = $result->createPlaceholder(self::class . '::renderLoginForm', $arguments);
    return $result->setProcessedText(str_replace($needle, $replace, $text));
  }

  public static function renderLoginForm() {
    // Could be any relatively expensive operation.
    return \Drupal::formBuilder()->getForm(UserLoginForm::class);
  }

  public static function trustedCallbacks() {
    return ['renderLoginForm'];
  }

}

But our text format also had core's "Correct faulty and chopped off HTML" filter enabled - which completely removed the placeholder, and therefore the form went missing from the final output!

Debugging this to investigate was interesting - it took me down the rabbit hole of learning more about PHP 8 Fibers, as Drupal 10.2 uses them to replace placeholders. Initially, I thought the problem could be there, but it turned out that the placeholder itself was the problem. Drupal happily generated the form to go in the right place, but couldn't find the placeholder. Here's what a placeholder, created by FilterProcessResult::createPlaceholder(), should look like:

<drupal-filter-placeholder callback="Drupal\mymodule\Plugin\Filter\MymoduleLoginFormFilter::renderLoginForm" arguments="" token="hqdY2kfgWm35IxkrraS4AZx6zYgR7YRVmOwvWli80V4"></drupal-filter-placeholder>

Looking very carefully, I spotted that the arguments="" attribute in the actual markup was just arguments - i.e. it had been turned into a 'boolean' HTML attribute:

<drupal-filter-placeholder callback="Drupal\mymodule\Plugin\Filter\MymoduleLoginFormFilter::renderLoginForm" arguments token="hqdY2kfgWm35IxkrraS4AZx6zYgR7YRVmOwvWli80V4"></drupal-filter-placeholder>

There is a limited set of these, and yet the masterminds/html5 component that Drupal 10.2 now uses to process HTML 5 requires an explicit list of the attributes that should not get converted to boolean attributes when they are set to an empty string.

At this point, I should point out that this means a simple solution could be to just pass some arguments so that the attribute isn't empty! That is a nice immediate workaround that avoids the need for any patch, so is an obvious maintainable solution:

// Insert your favourite argument; any value will do.
$arguments = [42];

At least that ensures our login form shows again!

But I don't see any documentation saying there must be arguments, and it would be easy for someone to write this kind of code again elsewhere, especially if we're trying to do The Right Thing by using placeholders in filters.

So I decided to contribute a fix back to Drupal core. I've worked on core before. Sometimes it's a joy to find or fix something that affects thousands of people, other times the contribution process can be soul-destroying. At least in this case, I found an existing test in core that could be easily extended to demonstrate the bug. Then I wrote a surgical fix... but I can see that it tightly couples the filter system to Drupal's HtmlSerializerRules class. That class is within the \Drupal\Component namespace, which is described as:

Drupal Components are independent libraries that do not depend on the rest of Drupal in order to function.

Components MAY depend on other Drupal Components or external libraries/packages, but MUST NOT depend on any other Drupal code.

So perhaps it needs configuration in order to be decoupled; and/or a factory service; or maybe modules should subscribe to an event to be able to inject their own rules .... and very quickly perfection feels like the enemy of good, as I can imagine the scope of a solution ballooning in size and complexity. 

I'm all for high standards in core, but fulfilling them to produce solutions can still be a slow and frustrating experience. I'm already involved in enough long-running issues that just bounce around between reviewers, deprecations and changes in standards. I risk just ranting here rather than providing answers - and believe me, I'm incredibly grateful for the work that reviewers and committers have put into producing Drupal - but surely the current process must be putting so many potential contributors off. We worry about attracting talent to the Drupal ecosystem, and turning Takers into Makers, but what are they going to find when they arrive? Contributing improvements of decent size is hard and can require perseverance over years. Where can we adjust the balance to make contribution easier for anyone, even seasoned developers?

As I suggested, perhaps this particular bug needs any of a factory pattern, event subscriber, or injected configuration... but what would my next step be? I'm reluctant to put effort into writing a more complex solution when I know from experience that reviewers might just suggest doing something different anyway. At least I have that simple (if unsatisfying) workaround for the filter placeholder method: always send an argument, even if it might be ignored. I guess that reflects the contribution experience itself sometimes!

Categories: FLOSS Project Planets

Sahil Dhiman: First Iteration of My Free Software Mirror

Planet Debian - Tue, 2024-06-18 23:35

As I’m gearing towards setting up a Free Software download mirror in India, it occurred to me that I haven’t chronicled the work and motivation behind setting up the original mirror in the first place. It also seems like a good idea to document things here to show the progression, as the mirror is going multi-country now. Right now, my existing mirror, mirrors.de.sahilister.net (formerly mirrors.sahilister.in), hosted in Germany, serves traffic for Termux, NomadBSD, Blender, BlendOS and GIMP. For a while in between, I hosted the OSMC project mirror as well.

To explain what is a Free Software download mirror thing is first, I’ll quote myself from work blog -

As most Free Software doesn’t have commercial backing and require heavy downloads, the concept of software download mirrors helps take the traffic load off of the primary server, leading to geographical redundancy, higher availability and faster download in general.

So whenever someone wants to download a particular (mirrored) software and clicks download, upstream redirects the download to one of the mirror servers that is geographically (or by other parameters) near to the user, leading to faster downloads and load sharing amongst all mirrors.

Since the time I got into Linux and servers, I always wanted to help the community somehow, and mirroring seemed to be the most obvious thing. India has traditionally had fewer public download mirrors. IITB, TIFR, and some other public institutions used to host them for popular Linux distributions and Free Software, but they seem to be diminishing these days.

In the last months of 2021, I started using Termux and saw that it had only a few mirrors (back then). Getting a high-capacity, high-bandwidth server on a budget was hard in India in 2021-22. So after much deliberation, I decided to go where it was available and chose a German hosting provider, with the thought of helping where possible and adding an India node when conditions became favorable (thankfully that happened, and the India node is live now too). Termux required only 29 GB of storage, so I went ahead and started mirroring it. I raised this issue in Termux’s GitHub repository in January 2022. This blog post chronicles the start of the mirror.

Termux has high request counts from a mirror point of view. Each Termux client usually checks each mirror in the selected group for availability before randomly selecting one for download (the only other case is when the client has explicitly selected a single mirror using termux-repo-change). The mirror started getting thousands of requests daily, but only a small percentage would actually result in my mirror being selected, so download traffic was lower. A similar thing happened with OSMC too (which I started mirroring later).

With this start, I began exploring various projects that would benefit from additional mirrors. Public information from the Academic Computer Club in Umeå’s mirror and Freedif’s mirror stats helped me figure out storage and bandwidth requirements for potential projects.

Later, I migrated to a different provider for better speeds and added a LibreSpeed test on the mirror server. Those were fun times. Between OSMC, Termux and LibreSpeed, I was getting almost 1.2 million hits/day on the server at its peak, crossing a TB/day of traffic for the first time.

Next came Blender, which took the longest time to set up, around 9–10 months. Blender had a push-trigger requirement for rsync from upstream that took quite some back and forth. It now contributes the largest share of traffic on my mirror. On release days, the mirror does more than 3 TB/day; on normal days, it hovers around 2 TB/day. GIMP is the latest addition to the mirror.

At one point, the mirror traffic touched 4.97 TB/day. That’s when I decided to drop the LibreSpeed server and solely focus on mirroring, keeping the bandwidth allotment for serving downloads for now.

The mirror project selection grew organically. I used to reach out to many projects to discuss the need for additional mirrors. Some projects outright denied the mirroring request, as Germany already has good academic mirrors boasting 20-25 Gbit/s speeds from the FTP era, which seems fair. Finding the niche was essential, to add only software which truly required additional capacity. There were months when nothing much would happen with the mirror; rsync would continue to update it while nginx kept serving the traffic. Nowadays, the mirror pushes around 70 TB/month. I occasionally check logs, vnstat, add new security stuff here and there, and pay the bills. The mirror now sometimes saturates the Gigabit link and goes beyond that, peaking around 1.42 Gbit/s (the hosting provider seems to be upping their game). The plan is to upgrade the link to better speeds.

Yearly traffic stats (through `vnstat -y`)

On the way, learned quite a few things like -

  • IPv6 and exposing an rsync module due to an OSMC requirement.
  • Implementing a user with restricted access that can only trigger rsync, basically making the rsync pull trigger-based, due to Blender.
  • Iterating to find the right client response size for the LibreSpeed test project.
  • Mistakenly identifying torrent traffic for BlendOS as a DDoS and blocking it for quite a few months. BlendOS added loads of hits for torrent traffic, making my mirror also serve as a web seed. Web seeds in conjunction with normal seeds are a good combination for serving downloads, as they combine the best of both worlds: the general availability of a web seed/mirror and the benefit of traditional seeds in maximizing download speeds at the user's end.
  • Handling abusive traffic (a lot of it, to be frank). The approach is more whack-a-mole right now, which I want to improve and automate.
  • Most of the traffic on mirrors that serve software other than Linux/BSD operating systems (like mine) comes from people on Windows and Mac asking for EXEs and DMGs, mostly because package repositories carry the software distribution load for Linux/BSD OSs, and partly because the number of Windows/Mac users is quite high compared to other OSs.
  • Load balancing through DNS and HTTP redirection (which I will now implement in my India mirror) to better maximize available resources.

GeoIP map of clients from yesterday's access logs, generated from IPinfo.io

Fun fact: the Academic Computer Club in Umeå, which runs mirror.accum.se (one of the prominent Debian, Ubuntu, etc. mirrors), now has a 200 Gbit/s uplink to the internet through SUNET.

In hindsight, the stats look amazing: hundreds of TBs of traffic served from the mirror. That does show that there’s still quite an appetite for public mirrors in times of commercially “donated” CDNs and GitHub. The world could have done with one less mirror, but it saved some time and lessened the burden for others, while providing redundancy and traffic localization with one additional mirror. And it’s fun for someone like me who’s into the infrastructure that powers the Internet. Now, I’ll try focusing on and expanding the India mirror, which has itself started pushing almost half a TB/day. Long live Free Software and public download mirrors.

Categories: FLOSS Project Planets

PyCoder’s Weekly: Issue #634 (June 18, 2024)

Planet Python - Tue, 2024-06-18 15:30

#634 – JUNE 18, 2024
View in Browser »

Should Python Adopt Calendar Versioning?

Python’s use of semantic style versioning numbers causes confusion, as breaking changes can be present in the “minor” number position. This proposal given at the Python Language Summit is to switch to using calendar based versioning. A PEP is forthcoming.
PYTHON SOFTWARE FOUNDATION

Python Mappings: A Comprehensive Guide

In this tutorial, you’ll learn the basic characteristics and operations of Python mappings. You’ll explore the abstract base classes Mapping and MutableMapping and create a custom mapping.
REAL PYTHON

How Do You Turn Data Science Insights into Business Results? Posit Connect

Data scientists use Posit Connect to get their work into the hands of decision-makers. Securely deploy python analytic work & distribute that across teams. Publish data apps, documents, notebooks, and dashboards. Deploy models as APIs & configure reports to run & get distributed on a custom schedule →
POSIT sponsor

NumPy 2.0.0 Release Notes

The long-awaited 2.0 release of NumPy landed this week. Not all the docs are up to date yet, but this final draft of the release notes shows you what is included.
NUMPY.ORG

Quiz: Python Mappings

In this quiz, you’ll test your understanding of the basic characteristics and operations of Python mappings. By working through this quiz, you’ll revisit the key concepts and techniques of creating a custom mapping.
REAL PYTHON

Discussions

Personal Red Flags When You’re Interviewing at a Company?

HACKER NEWS

Articles & Tutorials

Proposed Bylaws Changes for the PSF

As part of the upcoming board election, three new bylaws are also being proposed for your consideration. The first makes it easier to qualify for membership for Python-related volunteer work, the second makes it easier to vote, and the third gives the board more options around the code of conduct.
PYTHON SOFTWARE FOUNDATION

Python Interfaces: Object-Oriented Design Principles

In this video course, you’ll explore how to use a Python interface. You’ll come to understand why interfaces are so useful and learn how to implement formal and informal interfaces in Python. You’ll also examine the differences between Python interfaces and those in other programming languages.
REAL PYTHON course

Prod Alerts? You Should be Autoscaling

Rest easy with Judoscale’s web & worker autoscaling for Heroku, Render, and Amazon ECS. Traffic spike? Scaled up. Quiet night? Scaled down. Work queue backlog? No problem →
JUDOSCALE sponsor

Listing All Files in a Directory With Python

In this video course, you’ll be examining a couple of methods to get a list of files and folders in a directory with Python. You’ll also use both methods to recursively list directory contents. Finally, you’ll examine a situation that pits one method against the other.
REAL PYTHON course

Python Logging: The Log Levels

Logging levels allow you to control which messages you record in your logs. Think of log levels as verbosity levels. How granular do you want your logs to be? This article teaches you how to control your logging.
MIKE DRISCOLL

How Do You Program for 8h in a Row?

You may get paid for 8 hours a day, but that doesn’t necessarily mean you’re coding that whole time. This article touches on the variety of the job and what you should expect if you are new to the field.
BITE CODE!

Python Language Summit 2024: Lightning Talks

A summary of the six lightning talks given at the 2024 Python Language Summit. Topics include Python for iOS, improving asserts in 3.14, sharing data between sub-interpreters, and more.
PYTHON SOFTWARE FOUNDATION

Starting and Stopping uvicorn in the Background

Learn how to start and stop uvicorn in the background using a randomly selected free port number. Useful for running test suites that require live webservers.
CHRISTOPH SCHIESSL • Shared by Christoph Schiessl

How I Built a Bot Publishing Italian Paintings on Bluesky

This article describes Nicolò’s project to build a bot that retrieves images from Wikimedia and selects the best ones, and how he deployed it to the cloud.
NICOLÒ GISO • Shared by Nicolò Giso

Testing async MongoDB AWS Applications With Pytest

This article shows real life techniques and fixtures needed to make the test suite of your MongoDB and AWS-based application usable and performant.
HANDMADESOFTWARE • Shared by Thorin Schiffer

DjangoCon Europe 2024 Bird’s-Eye View

Thibaud shares some of the best moments of DjangoConEU 2024. He highlights some of the talks, workshops, and the outcome of the sprints.
THIBAUD COLAS

Storing Django Static and Media Files on DigitalOcean Spaces

This tutorial shows how to configure Django to load and serve up static and media files, public and private, via DigitalOcean Spaces.
MICHAEL HERMAN

CPython Reference Counting and Garbage Collection Internals

A detailed code walkthrough of how CPython implements memory management, including reference counting and garbage collection.
ABHINAV UPADHYAY

The Decline of the User Interface

“Software has never looked cooler, but user interface design and user experience have taken a sharp turn for the worse.”
NICK HODGES

Ruff: Internals of a Rust-Backed Python Linter-Formatter

This article dives into the structure of the popular ruff Python linter written in Rust.
ABDUR-RAHMAAN JANHANGEER • Shared by Abdur-Rahmaan Janhangeer

Projects & Code

FinanceDatabase: Financial Database as a Python Module

GITHUB.COM/JERBOUMA

prettypretty: Build Awesome Terminal User Interfaces

GITHUB.COM/APPAREBIT

smbclient-ng: Interact With SMB Shares

GITHUB.COM/P0DALIRIUS

wakepy: Cross-Platform Keep-Awake With Python

GITHUB.COM/FOHRLOOP

django-mfa2: Django MFA; Supports TOTP, U2F, & More

GITHUB.COM/MKALIOBY

Events

Weekly Real Python Office Hours Q&A (Virtual)

June 19, 2024
REALPYTHON.COM

Wagtail Space US

June 20 to June 23, 2024
WAGTAIL.SPACE

PyData Bristol Meetup

June 20, 2024
MEETUP.COM

PyLadies Dublin

June 20, 2024
PYLADIES.COM

Chattanooga Python User Group

June 21 to June 22, 2024
MEETUP.COM

PyCamp Leipzig 2024

June 22 to June 24, 2024
BARCAMPS.EU

Happy Pythoning!
This was PyCoder’s Weekly Issue #634.
View in Browser »

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

Categories: FLOSS Project Planets

PyBites: Learn Python From Scratch: We Extended Our Newbie Bite Exercises From 25 to 50 🐍 📈

Planet Python - Tue, 2024-06-18 14:42

We are excited to announce that we’ve extended our Newbie Bites from 25 to 50 Python exercises!

The importance of exercising when learning how to code

We’re passionate about this new batch of exercises because they require active engagement, which is crucial for learning how to code. Passive methods like reading books or watching videos don’t help concepts click or stick.

Our exercises involve writing code that is validated with pytest. This immediate feedback helps you understand mistakes and learn more effectively. You’ll encounter errors, re-think your approach, and practice deliberately.

Why double the number of Newbie exercises?

The first 25 exercises taught the fundamentals well, but many found it hard to tackle the intro and beginner Bites afterward. The extra 25 Newbie Bites (#26-50) bridge the gap between complete beginners and intermediate Python programmers.

These new exercises cover essential concepts like error handling, type hints, default arguments, special characters, working with dates, classes (!), list comprehensions, constants, exception handling, and more.

We believe these challenges will provide a deeper understanding and more robust skill set to tackle the regular Bites and become more proficient in Python.

Get full access to the Newbie Bites

Overview of the new exercises

Reading Errors: Learn to read and understand error messages in Python. These messages provide valuable information for debugging and fixing issues efficiently.

Failing Tests: Practice reading and interpreting failing test outputs with the pytest framework. This skill is crucial for resolving Bites and any Python development.

Type Hints: Explore type hints introduced in Python 3.5, which help you write more readable and maintainable code by specifying expected data types.

Default Arguments: Understand how to define functions with default values, making your functions more flexible and easier to use.

Special Chars: Learn about special characters in Python strings, such as \n and \t, for better formatting and readability.

Word Count: Use string methods like .split() and .splitlines() to manipulate and process text data effectively.

Dict Retrieval – Part 2: Explore advanced techniques for retrieving values from dictionaries to enhance your data handling skills.

Dict Retrieval – Part 3: Learn safer methods to retrieve values from dictionaries, providing defaults if keys are not present.

Random Module: Use Python’s random module to write a number guessing game, showcasing the practical use of standard library modules.

Working With Dates – Part 1: Explore the datetime module, focusing on the date class and the .weekday() method to work with dates.

Working With Dates – Part 2: Continue working with the datetime module, focusing on importing specific objects versus entire modules.

Make a Class: Learn about Python classes, which serve as blueprints for creating objects, starting from the basics.

Class With Str: Build upon the previous exercise by learning special methods like __str__ for adding string representations to classes.

Make Dataclass: Simplify class creation with Python dataclasses, introduced in Python 3.7.

Scope: Understand variable scope to write clearer and less error-prone code. Scope determines the visibility and lifespan of variables.

String Manipulations: Practice fundamental string manipulations, essential for processing and transforming text data.

List Comprehension: Learn to write concise and efficient list comprehensions, a powerful feature in Python for creating new lists.

Named Tuple: Explore namedtuples, which allow attribute access to elements and support type hints, enhancing your data handling capabilities.

Constants: Learn to assign and use constants, which are fixed values that improve code readability and maintainability.

Exceptions: Master exception handling to write resilient code and establish clear boundaries for function callers.

For Loop With Break And Continue: Control loop flow using break and continue statements to manage iterations based on conditions.

In Operator: Use the in operator to check for item presence in collections, a common practice in Python programming.

String Module: Combine list comprehensions with the string module to check and manipulate characters, then join them back into strings.

Formatting Intro: Learn string interpolation using the .format() method to insert variables into strings dynamically.

Buy the Newbie exercise bundle

The exercises build upon each other, so they must be done in order. After completing them, you’ll earn your Pybites Newbie Certificate.

We’re also working on a series of screencasts to further explain the second half of the Newbie Bites. Stay tuned by subscribing to our YouTube channel.

About the Pybites Platform

We offer a rich platform for learning Python through practical exercises.

Our Bite exercises are designed to challenge you and help you apply Python in real-world scenarios, ranging from basic syntax to advanced topics.

Whether you’re a beginner or an experienced coder, our exercises aim to improve your coding skills in an effective and fun manner.

Start coding today: https://codechalleng.es/

And join our growing community of passionate Pythonistas: https://pybites.circle.so/

Categories: FLOSS Project Planets

Wim Leers: XB week 5: chaos theory

Planet Drupal - Tue, 2024-06-18 13:47

We started the week off with a first MVP config entity: the component config entity. We have to start somewhere and then iterate — and iterate we can: thanks to Felix “f.mazeikis”, #3444417 landed! Most crucially this means we now have both a low-level unit test and a kernel test that ensures the validation for this config entity is thorough. Of course I’m bringing config validation to Experience Builder (XB) from the very start!

It doesn’t do much: it allows “opting in” single directory components (SDCs) to XB, and now only those are available in the PoC admin UI. Next up is #3452397: expanding the admin UI and the config entity to allow defining default values for its SDC props. This will allow the UI to immediately generate a preview of the component, which otherwise may (depending on the component) not render anything visible (imagine: a <h2> without text, an <img> without an image, etc.).

But literally everything about this config entity is subject to change:

  • its name (issue: #3455036)
  • the whole “opting in” principle (issue comment: #3444424-7)
  • what it does exactly (issue: #3446083)
  • whether it should handle only exposing SDCs 1:1 in XB, or whether it should also handle other component types (issue: #3454519)
How did we get here?

Which brings me to the main subject for this week: I have a theory that explains the chaos that I think now many of you see.

We rushed to open the 0.x branch to invite contribution, because Dries announced XB at DrupalCon. This wrongly gave the impression that there was a very detailed plan, with a precise architecture in mind. That is not the case, and mea culpa for giving that impression. That branch was simply the combination of different PoCs merged together.

Some additional context: back in March, from one day to the next, I, along with Ben “bnjmnm”, Ted “tedbow”, Alex “effulgentsia”, Tim and Lauri, plus Felix and Jesse (both from the Site Studio team, both very experienced), was asked to stop everything we were doing (for me: dropping config validation) and start estimating the product requirements.

~4 weeks of non-stop grueling meetings 1 to assess and estimate the 64 requirements, with Lauri providing visual examples and additional context for the product requirements he crafted (after careful research: #3452440). We looked at multiple radical options and evaluated which of these was viable:

  1. building on top of one of the page building solutions that already exist in the Drupal ecosystem
  2. partnering with one of the existing page building systems, if they were willing to relicense under GPL-2+ and we’d be able to fit them onto Drupal’s data modeling strengths
  3. building something new (but reusing the parts in the Drupal ecosystem that we can build on top of, such as SDC, field types for data modeling, etc.)

The result: 60 pages’ worth of notes 2. For each of the 42 “Required for MVP” product requirements (out of a total of 64), there now was an outline of how it could be implemented, a best+realistic+worst case estimate (week-level accuracy), which skillsets were needed, and how many people. It’s during this process that it became clear that only one choice made sense: the last one. That was the conclusion we reached because the existing solutions in the ecosystem fall short (see #3452440), Site Studio does not meet all of XB’s requirements (and some of its architectural choices conflict with them), and we did not end up finding a magical unicorn that meshed perfectly with Drupal.

It’s during this time that I made my biggest mistake yet: because the request for estimating this was coming from within Acquia, I assumed this was intended to be built by a team of Acquians. Because … why else create estimates?! If it’s built by a mix of paid full-time, paid part-time and volunteer community members, there’s little point in estimating, right?!
Well, I was wrong: turns out the intent was to get a sense of how feasible it was to achieve, in roughly which timeline. But my fellow team members and I were so deep into this enormous estimation exercise based on very high-level requirements that capturing this information in a more shareable form simply did not occur to us…

So, choice 3 it was, with people that have deep Layout Builder knowledge (Ted & Tim), while bringing in expertise from the Site Studio team (Jesse & Felix) and strong front-end expertise (Ben & Harumi “hooroomoo”) … because next we were asked to narrow the estimates for the highest-variance estimates, to bring it down to a higher degree of confidence. That’s where the “FTEs”, “Best Case Size (weeks)”, “Realistic Case Size (weeks)”, “Worst Case Size (weeks)”, “Most Probable Case Size (calculated, FTE weeks)”, “Variance” columns in the requirements spreadsheet come in.

For the next ~4 weeks, we built PoCs for the requirements where our estimates had either the highest variance or the highest number, to both narrow and reduce the estimates. That’s where all the branches on the experience_builder project come from! For example:

Week 1 = witch pot

After the ~8 chaotic weeks above, it was DrupalCon (Dries also wrote about this, just before DrupalCon, based on the above work!), and then … it was week 1.

In that first week, we took a handful of branches that seemed to contain the pieces most sensible to start iterating on an actual implementation, threw that into a witch pot, and called that 0.x!

Now that all of you know the full origin story, and many of you have experienced the subsequent ~4 equally chaotic weeks, you’re all caught up! :D

Missed a prior week? See all posts tagged Experience Builder.

Goal: make it possible to follow high-level progress by reading ~5 minutes/week. I hope this empowers more people to contribute when their unique skills can best be put to use!

For more detail, join the #experience-builder Slack channel. Check out the pinned items at the top!

Going forward: order from chaos

So, how are we going to improve things?

Actual progress?

So besides the backstory and attempts to bring more order & overview, what happened?

The UI gained panning support (#3452623), which really makes it feel like it’s coming alive:

Try it yourself locally if you like, but there’s not much you can do yet.
Install the 0.x branch — the “Experience Builder PoC” toolbar item takes you there!

… you can also try an imperfect/limited version of the current UI on the statically hosted demo UI thanks to #3450311 by Lee “larowlan”.

We also got two eslint CI jobs running now (#3450307): one for the typical Drupal stuff, one for the XB UI (which is written in TypeScript), and we reduced the overhead of PHPStan Level 8 (again thanks to Lee, in #3454501) … and a bunch more low-level things were improved.

Various things are in progress … expect an update for those next week :)

Thanks to Lauri for reviewing this!

  1. Jesse Baker aptly labeled this period as “bonding through trauma” — before the meetings started mid-March we didn’t know each other, and afterwards it felt like I’d worked with Felix & Jesse for years! ↩︎

  2. In a form that is not sufficiently digestible for public consumption. ↩︎

Categories: FLOSS Project Planets

Nonprofit Drupal posts: June Drupal for Nonprofits Chat

Planet Drupal - Tue, 2024-06-18 11:05

Join us THURSDAY, June 20 at 1pm ET / 10am PT, for our regularly scheduled call to chat about all things Drupal and nonprofits. (Convert to your local time zone.)

We don't have anything specific on the agenda this month, so we'll have plenty of time to discuss anything that's on our minds at the intersection of Drupal and nonprofits.  Got something specific you want to talk about? Feel free to share ahead of time in our collaborative Google doc: https://nten.org/drupal/notes!

All nonprofit Drupal devs and users, regardless of experience level, are always welcome on this call.

This free call is sponsored by NTEN.org and open to everyone. 

  • Join the call: https://us02web.zoom.us/j/81817469653

    • Meeting ID: 818 1746 9653
      Passcode: 551681

    • One tap mobile:
      +16699006833,,81817469653# US (San Jose)
      +13462487799,,81817469653# US (Houston)

    • Dial by your location:
      +1 669 900 6833 US (San Jose)
      +1 346 248 7799 US (Houston)
      +1 253 215 8782 US (Tacoma)
      +1 929 205 6099 US (New York)
      +1 301 715 8592 US (Washington DC)
      +1 312 626 6799 US (Chicago)

    • Find your local number: https://us02web.zoom.us/u/kpV1o65N

  • Follow along on Google Docs: https://nten.org/drupal/notes

View notes of previous months' calls.

Categories: FLOSS Project Planets

Real Python: Rounding Numbers in Python

Planet Python - Tue, 2024-06-18 10:00

With many businesses turning to Python’s powerful data science ecosystem to analyze their data, understanding how to avoid introducing bias into datasets is absolutely vital. If you’ve studied some statistics, then you’re probably familiar with terms like reporting bias, selection bias, and sampling bias. There’s another type of bias that plays an important role when you’re dealing with numeric data: rounding bias.

Understanding how rounding works in Python can help you avoid biasing your dataset. This is an important skill. After all, drawing conclusions from biased data can lead to costly mistakes.

In this video course, you’ll learn:

  • Why the way you round numbers is important
  • How to round a number according to various rounding strategies
  • How to implement each strategy in pure Python
  • How rounding affects data and which rounding strategy minimizes this effect
  • How to round numbers in NumPy arrays and pandas DataFrames
  • When to apply different rounding strategies
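
As a quick taste of why the strategy you pick matters, here is a sketch of Python’s default rounding behavior:

import decimal

# Python's built-in round() uses "round half to even" (banker's rounding),
# which avoids the systematic upward bias of always rounding halves up.
print(round(2.5))  # 2
print(round(3.5))  # 4

# Rounding half away from zero is available via the decimal module.
half_up = decimal.Decimal("2.5").quantize(
    decimal.Decimal("1"), rounding=decimal.ROUND_HALF_UP
)
print(half_up)  # 3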

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

Mike Driscoll: How to Publish a Python Package to PyPI

Planet Python - Tue, 2024-06-18 09:01

Do you have a Python package that you’d like to share with the world? You should publish it on the Python Package Index (PyPI). The vast majority of Python packages are published there. The PyPI team has also created extensive documentation to help you on your packaging journey. This article does not aim to replace that documentation. Instead, it is just a shorter version of it using ObjectListView as an example.

The ObjectListView project for Python is modeled on the C# ObjectListView, a wrapper around the .NET ListView control, but written for wxPython. You use ObjectListView as a replacement for wx.ListCtrl because its methods and attributes are simpler. Unfortunately, the original implementation died out in 2015, while a fork, ObjectListView2, died in 2019. In this article, you will learn how I forked it again, created ObjectListView3, and packaged it up for PyPI.

Creating a Package Structure

When you create a Python package, you must follow a certain type of directory structure to build everything correctly. For example, your package files should go inside a folder named src. Within that folder, you will have your package sub-folder that contains something like an __init__.py and your other Python files.

The src folder’s contents will look something like this:

package_tutorial/
|-- src/
|   |-- your_amazing_package/
|       |-- __init__.py
|       |-- example.py
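To make the example concrete, here is a minimal sketch of what those two files might contain (greet is a hypothetical function used purely for illustration):

# src/your_amazing_package/example.py
def greet(name: str) -> str:
    """Return a friendly greeting."""
    return f"Hello, {name}!"

# src/your_amazing_package/__init__.py
# Optional: re-export the public API so users can write
# `from your_amazing_package import greet`. This file may also be left empty.
from .example import greet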

The __init__.py file can be empty. You use that file to tell Python that the folder is a package and is importable. But wait! There’s more to creating a package than just your Python files!

You also should include the following:

  • A license file
  • The pyproject.toml file, which is used for configuring your package when you build it, among other things
  • A README.md file to describe the package, how to install it, example usage, etc
  • A tests folder, if you have any (and you should!)

Go ahead and create these files and the tests folder. It’s okay if they are all blank right now. At this point, your folder structure will look like this:

package_tutorial/
|-- LICENSE
|-- pyproject.toml
|-- README.md
|-- src/
|   |-- your_amazing_package/
|       |-- __init__.py
|       |-- example.py
|-- tests/

Picking a Build Backend

The packaging tutorial mentions that you can choose from various build backends when you create your package. There are examples for the following:

  • Hatchling
  • setuptools
  • Flit
  • PDM

You add this build information in your pyproject.toml file. Here is an example you might use if you picked setuptools:

[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

This section in your config is used by pip or build. These tools don’t actually do the heavy lifting of converting your source code into a wheel or other distribution package. That is handled by the build backend. You don’t have to add this section to your pyproject.toml file though. You will find that pip will default to setuptools if there isn’t anything listed.
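For comparison, if you picked Hatchling instead, the equivalent section would look like this (a sketch of the standard Hatchling build-system table; double-check it against the packaging tutorial for your tool versions):

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"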

Configuring Metadata

All the metadata for your package should go into a pyproject.toml file. In the case of ObjectListView, it didn’t have one of these files at all.

Here’s the generic example that the Packaging documentation gives:

[project]
name = "example_package_YOUR_USERNAME_HERE"
version = "0.0.1"
authors = [
  { name="Example Author", email="author@example.com" },
]
description = "A small example package"
readme = "README.md"
requires-python = ">=3.8"
classifiers = [
    "Programming Language :: Python :: 3",
    "License :: OSI Approved :: MIT License",
    "Operating System :: OS Independent",
]

[project.urls]
Homepage = "https://github.com/pypa/sampleproject"
Issues = "https://github.com/pypa/sampleproject/issues"

Using this as a template, I created the following for the ObjectListView package:

[project]
name = "ObjectListView3"
version = "1.3.4"
authors = [
  { name="Mike Driscoll", email="mike@somewhere.org" },
]
description = "An ObjectListView is a wrapper around the wx.ListCtrl that makes the list control easier to use."
readme = "README.md"
requires-python = ">=3.9"
classifiers = [
    "Programming Language :: Python :: 3",
    "License :: OSI Approved :: MIT License",
    "Operating System :: OS Independent",
]

[project.urls]
Homepage = "https://github.com/driscollis/ObjectListView3"
Issues = "https://github.com/driscollis/ObjectListView3/issues"

Now let’s go over what the various parts are in the above metadata:

  • name – The distribution name of your package. Make sure your name is unique!
  • version – The package version
  • authors – One or more authors including emails for each
  • description – A short, one-sentence summary of your package
  • readme – A path to the file that contains a detailed description of the package. You may use a Markdown file here.
  • requires-python – Tells which Python versions are supported by this package
  • classifiers – Trove classifiers that categorize your package so PyPI can index it and users can filter for it
  • URLs – Extra links that you want to show on PyPI. In general, you would want to link to the source, documentation, issue trackers, etc.

You can specify other information in your TOML file if you’d like. For full details, see the pyproject.toml guide.

READMEs and Licenses

The README file is almost always a Markdown file now. Take a look at other popular Python packages to see what you should include. Here are some recommended items:

  • How to install the package
  • Basic usage examples
  • Link to a guide or tutorial
  • An FAQ if you have one

There are many licenses out there. Don’t take legal advice from just anyone; consult a lawyer unless you know a lot about this topic. However, you can look at the licenses for other popular packages and use their license or one similar.

Generating a Package

Now that you have the files you need, you are ready to generate your package. You will need to make sure you have PyPA’s build tool installed.

Here’s the command you’ll need for that:

python3 -m pip install --upgrade build

Next, you’ll want to run the following command from the same directory that your pyproject.toml file is in:

python3 -m build

You’ll see a bunch of output from that command. When it’s done, you will have a dist folder containing a wheel (*.whl) file and a gzipped tarball. The tarball is a source distribution, while the wheel file is a built distribution. When you use pip, it will try to find the built distribution first, but pip will fall back to the source distribution if necessary.
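For this project, the dist folder might look something like this (the exact file names depend on your package name and version; these are illustrative):

dist/
|-- objectlistview3-1.3.4-py3-none-any.whl
|-- objectlistview3-1.3.4.tar.gz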

Uploading / Publishing to PyPI

You now have the files you need to share your package with the world on the Python Package Index (PyPI). However, you need to register an account on TestPyPI first. TestPyPI is a separate package index intended for testing and experimentation, which is perfect when you have never published a package before. To register an account, go to https://test.pypi.org/account/register/ and complete the steps on that page. It will require you to verify your email address, but other than that, it’s a very straightforward signup.

To securely upload your project, you’ll need a PyPI API token. Create one from your account and make sure to set the “Scope” to the “Entire account”. Don’t close the page until you have copied and saved your token or you’ll need to generate another one!

The last step before publishing is to install twine, a tool for uploading packages to PyPI. Here’s the command you’ll need to run in your terminal to get twine:

python3 -m pip install --upgrade twine

Once that has finished installing, you can run twine to upload your files. Make sure you run this command in your package folder where the new dist folder is:

python3 -m twine upload --repository testpypi dist/*

You will see a prompt for your TestPyPI username and/or a password. The password is the API token you saved earlier. The directions in the documentation state that you should use __token__ as your username, but I don’t think it even asked for a username when I ran this command. I believe it only needed the API token itself.

After the command is complete, you will see some text stating which files were uploaded. You can view your package at https://test.pypi.org/project/example_package_name

To verify you can install your new package, run this command:

python3 -m pip install --index-url https://test.pypi.org/simple/ --no-deps example-package-name

If you specify the correct name, the package should be downloaded and installed. TestPyPI is not meant for permanent storage, though, so it will likely delete your package eventually to conserve space.
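If you want to go a step further than installing, a quick smoke test is to import the package and call something from it. Using the hypothetical your_amazing_package and its greet() function from earlier, that might look like this:

python3 -c "import your_amazing_package; print(your_amazing_package.greet('PyPI'))"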

When you have thoroughly tested your package, you’ll want to upload it to the real PyPI site. Here are the steps you’ll need to follow:

  • Pick a memorable and unique package name
  • Register an account at https://pypi.org. You’ll go through the same steps as you did on TestPyPI
    • Be sure to verify your email
    • Create an API key and save it on your machine or in a password vault
  • Use twine to upload the dist folder like this:  python -m twine upload dist/*
  • Install the package from the real PyPI as you normally would

That’s it! You’ve just published your first package!

Wrapping Up

Creating a Python package takes time and thought. You want your package to be easy to install and easy to understand. Be sure to spend enough time on your README and other documentation to make using your package easy and fun. It’s not good to look up a package and not know which versions of Python it supports or how to install it. Make the process easy by uploading your package to the Python Package Index.

The post How to Publish a Python Package to PyPI appeared first on Mouse Vs Python.

Categories: FLOSS Project Planets

The Drop Times: Christina Lockhart on Elevating Drupal: The Essential Role of Women in Tech

Planet Drupal - Tue, 2024-06-18 08:36
Discover how women are transforming the Drupal community and driving innovation in technology. In this interview, Christina Lockhart, Digital Marketing Manager at the Drupal Association, shares her journey, her key responsibilities, and the vital role women play in advancing Drupal. Learn about her strategies for promoting DrupalCon, her insights on overcoming barriers for women in tech, and the importance of empowering female leaders in the industry.
Categories: FLOSS Project Planets

DrupalEasy: Two very different European Drupal events in one week

Planet Drupal - Tue, 2024-06-18 08:18

I was fortunate enough to attend two very different European Drupal events recently, and wanted to take a few minutes to share my experience as well as thank the organizers for their dedication to the Drupal community.

The events couldn't have been more different - one being a first-time event in a town I've never heard of and the other celebrating their 20th anniversary in a city that can be considered the crossroads of the Netherlands.

DrupalCamp Cemaes

First off - it is pronounced KEM-ice (I didn't learn this until I actually arrived). Cemaes is a small village at the very northern tip of Wales with a population of 1,357. Luckily, one of those 1,357 people is Jennifer Tehan, an independent Drupal developer who decided to organize a Drupal event in a town that (I can only assume) has more sheep than people. 

When this event was first announced a few months back, I immediately knew that this was something I wanted to attend - mostly because I've always wanted to visit Wales to do some hiking ("walking," if you're not from the United States.) I used the camp as an excuse for my wife and me to take a vacation and explore northern Wales. I could write an entire other blog post on how amazeballs the walking was…

DrupalCamp Cemaes was never going to be a large event, and it turns out there wound up being only nine of us (ten if you count Frankie, pictured below). The advantage of such a small event was easy to see - we all got to know each other quite well. There's something really nice - less intimidating - about small Drupal events that I realized I've really missed since the decline of Drupal meetups in many localities. 

The event itself was an unconference, with different people presenting on Drupal contributions (James Shields), GitHub Codespaces (Rachel Lawson), remote work (Jennifer Tehan), the LocalGov distribution (Andy Broomfield), and Starshot (myself.)

The setting of the event couldn't have been more lovely - an idyllic seaside community where everything was within walking distance, including the village hall where the event took place. If Jennifer decides to organize DrupalCamp Cemaes 2025, I'm pretty sure Gwendolyn and I will be there.

Read a brief wrap-up of DrupalCamp Cemaes from Jennifer Tehan. 

This camp was proof that anyone, anywhere can organize a successful Drupal event with a minimum of fuss. 

Drupaljam

Four days later, I was in Utrecht, Netherlands, at the 20th anniversary of Drupaljam, the annual main Dutch Drupal event. 

I had previously attended the 2011 Drupaljam and recall two things about that event: the building it took place in seemed like it was from the future, and many of the sessions were in Dutch.

This was a very different event from the one I had attended a few days earlier: a city of almost 400,000 people, over 300 attendees (a number I was told anecdotally), and one of the coolest event venues I've ever been to. Defabrique is a renovated linseed oil and compound feed factory, a bit over 100 years old, that is (IMHO) absolutely perfect for a Drupal event. Each and every public space has character, yet is also modern enough to support open-source enthusiasts. On top of all that, the food was amazing.

Attending Drupaljam was a bit of a last-minute decision for me, but I'm really glad that I made the time for it. I was able to reconnect with members of the European Drupal community whom I don't often see, and make new connections with more folks than I can count.

I spent the day the way I spend most of my time at Drupal events - alternating between networking and attending sessions that attracted me. There were a number of really interesting AI-related sessions that I took in; it's pretty obvious to me that the Drupal community is approaching AI integrations in an unsurprisingly thoughtful manner.

The last week reinforced to me how fortunate I am to be able to attend so many in-person Drupal events. The two events I was able to participate in couldn't have been more different in scale and scope, but don't ask me to choose my favorite, because I'm honestly not sure I could!
 

Categories: FLOSS Project Planets

Real Python: Quiz: Build a Guitar Synthesizer

Planet Python - Tue, 2024-06-18 08:00

In this quiz, you’ll test your understanding of what it takes to build a guitar synthesizer in Python. By working through this quiz, you’ll revisit a few key concepts from music theory and sound synthesis.

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

CodersLegacy: Adding Data Files in Nuitka (Python Files, Images, etc)

Planet Python - Tue, 2024-06-18 06:09

Nuitka is a Python-to-C compiler that converts Python code into executable binaries. While Nuitka efficiently compiles Python scripts, incorporating data files such as images, audio, video, and additional Python files can be a little tricky. This guide outlines the steps to include various data files in a Nuitka standalone executable.


Basic Setup for Nuitka

Before delving into the inclusion of data files, ensure you have Nuitka installed. You can install Nuitka via pip:

pip install nuitka

For a more detailed starter guide on Nuitka and its various options, you can follow this link to our main Nuitka Tutorial. If you are already familiar with Nuitka, then proceed with this tutorial to learn how to add data files to your Nuitka EXE.


Compiling a Python Script with Nuitka

Assuming you have a Python script named main.py, the basic command to compile it into a standalone executable using Nuitka is:

python -m nuitka --standalone main.py

This command creates a standalone directory containing the executable and all necessary dependencies. However, if your program relies on external data files (images, audio, video, other Python files), you need to explicitly include these files.


Adding Data Files in Nuitka

1. Adding Individual Files

To include individual files such as images, use the --include-data-files option. The syntax is:

--include-data-files=<source-location>=<target-location>

For example, to include an image file favicon.ico, the command is:

--include-data-files=./favicon.ico=favicon.ico

This command tells Nuitka to include the favicon.ico file from the current directory and place it in the same location within the output directory.


2. Adding Directories

To include entire directories, use the --include-data-dir option. The syntax is the same as earlier:

--include-data-dir=<source-location>=<target-location>

For instance, to include an entire directory named Images, the command is:

--include-data-dir=./Images=Images

This will copy the Images directory from the current location to the output directory, preserving its structure.
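One detail worth planning for is how your code finds those files at runtime. In a Nuitka standalone build, included data files end up in the distribution folder next to your compiled program, so resolving paths relative to your script file generally works both in development and after compilation. Here is a minimal sketch (resource_path is an illustrative helper, not part of Nuitka, and logo.png is a hypothetical file):

import os

def resource_path(relative: str) -> str:
    # Resolve a data file relative to this file's directory. In a standalone
    # build, this is the dist folder that holds the included data files.
    base = os.path.dirname(os.path.abspath(__file__))
    return os.path.join(base, relative)

icon_path = resource_path("favicon.ico")
logo_path = resource_path(os.path.join("Images", "logo.png"))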


Example: Adding Data Files of Multiple Types

Suppose you have a project with the following files and directories:

  • main.py: The main Python script.
  • favicon.ico: An icon file.
  • Images/: A directory containing images.
  • audio.mp3: An audio file.
  • videos/: A directory containing video files.
  • utils.py: An additional Python file used in the project.

To compile this project with all necessary data files included, the command would be:

python -m nuitka --standalone --include-data-files=./favicon.ico=favicon.ico --include-data-files=./audio.mp3=audio.mp3 --include-data-dir=./Images=Images --include-data-dir=./videos=videos --include-data-files=./utils.py=utils.py --disable-console main.py

Pretty confusing, isn’t it?

Having to write out this command over and over again each time you want to rebuild your executable is a massive pain. Instead, I recommend you use Nuitka Configuration Files, which allow you to write out all your dependencies/data files/imports in a single file that can then be executed as many times as you like.
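One such approach (a sketch based on Nuitka’s project-options comment syntax; confirm the exact details against the documentation for your Nuitka version) is to embed the options at the top of main.py as specially formatted comments:

# main.py
# nuitka-project: --standalone
# nuitka-project: --include-data-files=./favicon.ico=favicon.ico
# nuitka-project: --include-data-files=./audio.mp3=audio.mp3
# nuitka-project: --include-data-dir=./Images=Images
# nuitka-project: --include-data-dir=./videos=videos

With those comments in place, rebuilding reduces to a single short command:

python -m nuitka main.py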


Final Word

By using the --include-data-files and --include-data-dir options, you can include all necessary data files in your Nuitka-compiled executable. This ensures that your standalone application has all the resources it needs to run successfully, whether they are images, audio, video files, or additional Python scripts. Combining these options with plugins and other configurations allows for the creation of robust and self-sufficient executables.

Happy coding!

This marks the end of the Adding Data Files in Nuitka Tutorial. Any suggestions or contributions for CodersLegacy, as well as questions regarding the tutorial content, are more than welcome.

The post Adding Data Files in Nuitka (Python Files, Images, etc) appeared first on CodersLegacy.

Categories: FLOSS Project Planets

Python Bytes: #388 Don't delete all the repos

Planet Python - Tue, 2024-06-18 04:00
Topics covered in this episode:

  • PSF Elections coming up
  • Cloud engineer gets 2 years for wiping ex-employer’s code repos
  • Python: Import by string with pkgutil.resolve_name()
  • DuckDB goes 1.0
  • Extras
  • Joke

Watch on YouTube: https://www.youtube.com/watch?v=yweZO_BiYfw

About the show

Sponsored by ScoutAPM: https://pythonbytes.fm/scout

Connect with the hosts:

  • Michael: @mkennedy@fosstodon.org
  • Brian: @brianokken@fosstodon.org
  • Show: @pythonbytes@fosstodon.org

Join us on YouTube at https://pythonbytes.fm/stream/live to be part of the audience. Usually Tuesdays at 10am PT. Older video versions available there too.

Finally, want an artisanal, hand-crafted digest of every week of the show notes in email form? Add your name and email to our friends of the show list (https://pythonbytes.fm/friends-of-the-show); we’ll never share it.

Brian #1: PSF Elections coming up

  • This is elections for the PSF Board and for 3 bylaw changes.
  • To vote in the PSF election, you need to be a Supporting, Managing, Contributing, or Fellow member of the PSF, …
  • And affirm your voting status by June 25.
  • See Affirm your PSF Membership Voting Status for more details: https://pyfound.blogspot.com/2024/06/affirm-your-psf-membership-voting-status.html
  • Timeline:
    • Board Nominations open: Tuesday, June 11th, 2:00 pm UTC
    • Board Nominations close: Tuesday, June 25th, 2:00 pm UTC
    • Voter application cut-off date: Tuesday, June 25th, 2:00 pm UTC
      • the same date also applies to voter affirmation: https://psfmember.org/user-information
    • Announce candidates: Thursday, June 27th
    • Voting start date: Tuesday, July 2nd, 2:00 pm UTC
    • Voting end date: Tuesday, July 16th, 2:00 pm UTC
  • See also “Thinking about running for the Python Software Foundation Board of Directors? Let’s talk!” (https://pyfound.blogspot.com/2024/05/blog-post.html)
    • There’s still one upcoming office hours session on June 18th, 12 PM UTC
  • And “For your consideration: Proposed bylaws changes to improve our membership experience” (https://pyfound.blogspot.com/2024/06/for-your-consideration-proposed-bylaws.html)
    • 3 proposed bylaws changes

Michael #2: Cloud engineer gets 2 years for wiping ex-employer’s code repos (https://www.bleepingcomputer.com/news/security/cloud-engineer-gets-2-years-for-wiping-ex-employers-code-repos/)

  • Miklos Daniel Brody, a cloud engineer, was sentenced to two years in prison and a restitution of $529,000 for wiping the code repositories of his former employer in retaliation for being fired.
  • The court documents (https://www.documentcloud.org/documents/24215622-united-states-v-brody) state that Brody’s employment was terminated after he violated company policies by connecting a USB drive.

Brian #3: Python: Import by string with pkgutil.resolve_name() (https://adamj.eu/tech/2024/06/17/python-import-by-string/)

  • Adam Johnson
  • You can use pkgutil.resolve_name("module:object") to import classes, functions, or modules using strings (see the short sketch after these show notes).
  • You can also use importlib.import_module("module")
  • Both of these techniques get you an imported object without the name being imported into the local namespace.

Michael #4: DuckDB goes 1.0 (https://x.com/__AlexMonahan__/status/1801435781380325448)

  • via Alex Monahan
  • The cloud-hosted product @MotherDuck also opened up General Availability
  • Codenamed “Snow Duck”
  • The core theme of the 1.0.0 release is stability.

Extras

Brian:

  • Sending us topics. Please send before Tuesday. But any time is welcome.
  • NumPy 2.0: https://blog.scientific-python.org/numpy/numpy2/
  • htmx 2.0.0: https://htmx.org/posts/2024-06-17-htmx-2-0-0-is-released/

Michael:

  • Get 6 months of PyCharm Pro for free. Just take a course (even a free one) at Talk Python Training (https://training.talkpython.fm). Then visit your account page > details tab and have fun.
  • Coming soon at Talk Python: Shiny for Python (https://x.com/TalkPython/status/1803098742515732679)

Joke: .gitignore thoughts won’t let me sleep (https://devhumor.com/media/gitignore-thoughts-won-t-let-me-sleep)
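A minimal sketch of that import-by-string pattern, using only standard-library names (pkgutil.resolve_name is available from Python 3.9 onward):

import importlib
import pkgutil

# Resolve a "module:object" string to the object itself.
path_cls = pkgutil.resolve_name("pathlib:Path")
print(repr(path_cls("/tmp")))  # PosixPath('/tmp') on POSIX systems

# Or import a whole module by its dotted name.
json_mod = importlib.import_module("json")
print(json_mod.dumps({"answer": 42}))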
Categories: FLOSS Project Planets

Pages