Feeds

Mario Hernandez: Flexible Headings with Twig

Planet Drupal - Wed, 2024-06-19 16:27

Proper use of headings h1-h6 in your project presents many advantages, including semantic markup, better SEO ranking, and better accessibility.

Updated April 3, 2020

Building websites using the component-based approach presents all kinds of advantages over the traditional page-building approach. Today I’m going to show how to create what would normally be an Atom, if we use the atomic design approach for building components. We are going to take this simple component to a whole new level by providing a way to dynamically control how it is rendered.

The heading component

Headings are normally used for page or section titles and are a big part of making your website SEO friendly. As simple as this may sound, headings need to be carefully planned. A typical heading would look like this:

<h1>This is a Heading 1</h1>

The idea of components is that they are reusable, but how can we possibly turn what already looks like a bare-bones component into one that provides options and flexibility? What if we wanted to use an h2 or h3? Or what if the title field is a link to another page? Then the heading component would probably not work, because we have no way of changing the heading level from h1 to any other level or adding a URL. Let's improve the heading component so we make it more dynamic.

Enter Twig and JSON

Twig offers many advantages over plain HTML and today we will use some logic to transform the static heading component into a more dynamic one.

Let’s start by creating a simple JSON object which we will use as data for Twig to consume. We will build some logic around this data to make the heading component more dynamic. This is typically how I build components on projects I work on.

  1. In your project, typically within the components/patterns directory, create a new folder called heading
  2. Inside the heading folder create a new file called heading.json
  3. Inside the new file paste the code snippet below
{ "heading": { "heading_level": "", "modifier": "", "title": "This is the best heading I've seen!", "url": "" } }

So we created a simple JSON object with 4 keys: heading_level, modifier, title, and url.

  • The heading_level is something we can use to change the heading from, say, h1 to h2 or h3 if we need to.
  • The modifier key allows us to pass a modifier CSS class when we make use of this component. The modifier class will make it possible for us to style the heading differently than other headings, if needed.
  • The title key is the string of text that will become the title of a page or a component.
  • ... and finally, the url key, if present, will allow us to wrap the title in an <a> tag to make it a link.
  1. Inside the heading folder create a new file called heading.twig
  2. Inside the new file paste the code snippet below
<h{{ heading.heading_level|default('2') }} class="heading{{ heading.modifier ? ' ' ~ heading.modifier }}">
  {% if heading.url %}
    <a href="{{ heading.url }}" class="heading__link">
      {{ heading.title }}
    </a>
  {% else %}
    {{ heading.title }}
  {% endif %}
</h{{ heading.heading_level|default('2') }}>

Wow! What's all this? 😮

Let's break things down to explain what's happening here, since the Twig code has changed significantly:

  • First we make use of heading.heading_level to complete the number part of the heading. If a value is not provided for heading_level in the JSON file, we set a default of 2. This will ensure that by default we have an <h2> as the title, much better than the <h1> we saw before. This value can be changed every time the heading is used. The same approach is taken to close the heading tag on the last line of code.
  • Also, in addition to adding a class of heading, we check whether there is a value for the modifier key in the JSON. If there is, we pass it to the heading as a CSS class. If no value is provided, nothing will be added.
  • In the next line, we check whether a URL was provided in the JSON file, and if so, we wrap the heading.title variable in an <a> tag to turn the title into a link. The href value for the link is {{ heading.url }}. If no URL is provided in the JSON file, we simply print the value of heading.title as plain text.
Now what?

Well, our heading component is ready but unfortunately the component on its own does not do any good. The best way to take advantage of our super smart component is to start using it within other components.

Putting the heading component to use

As previously indicated, the idea of components is that they are reusable, which eliminates code duplication. Now that we have the heading component ready, we can reuse it in other templates by taking advantage of Twig’s include statements. That will look like this:

<article class="card"> {% include '@components/heading/heading.twig' with { "heading": heading } only %} </article>

The example above shows how we can reuse the heading component in the card component by using Twig’s include statement.

NOTE: For this to work, the same data structure for the heading needs to exist in the card’s JSON file. Or, you could also alter the heading's values in Twig, like this:

<article class="card"> {% include '@components/heading/heading.twig' with { "heading": { "heading_level": 3, "modifier": 'card__title', "title": "This is a super flexible and smart heading", "url": "https://mariohernandez.io" } } only %} </article>

Did you notice the @components part? This is only an example of a namespace. If you are not familiar with the Component Libraries Drupal module, it allows you to create namespaces for your theme, which you can use to nest or include components as we see above.

End result

The heading component we built above would look like this when it is rendered:

<h3 class="heading card__title"> <a href="https://mariohernandez.io" class="heading__link"> This is a super flexible and smart heading </a> </h3> In closing

The main goal of this post is to shed light on how important it is to build components that are not restricted and can be used throughout the site in a way that does not feel like you are repeating yourself.

Additional Resources:

Managing heading levels in design systems.

Categories: FLOSS Project Planets

Mario Hernandez: Getting started with Gatsby

Planet Drupal - Wed, 2024-06-19 16:27

Like many developers, when I hear the words "static website" I immediately think of creating flat HTML pages and editing them by hand. Times have changed. As you will see, Static Site Generators (SSGs) offer some of the most advanced features and make use of the latest technologies available on the web.

Static Site Generators are nothing new. If you search for SSGs you will find many. One of the most popular ones is Jekyll, which I have personally worked with, and it's a really good one. However, this post focuses on Gatsby, probably one of the hottest systems for creating static sites.

What is Gatsby?

Gatsby's primary objective is to build static sites, but as you will learn, that's just the tip of the iceberg.

Gatsby is a blazing-fast static site generator for React.

How does Gatsby work?

While other SSGs use templating languages like Mustache, Handlebars, and others, Gatsby uses React. This not only allows for building modern component-driven websites, it also provides incredibly fast page rendering. Like mind-blowing fast.

Extending Gatsby

One of the most powerful features of Gatsby is its growing number of "Plugins". Plugins are the building blocks of Gatsby. They allow you to implement new features and functionality by running a couple of commands and making some configuration changes. Anything from adding Sass to your React project, creating a blog, configuring Google Analytics, and many, many more.

Plugins are contributed code kindly provided by the generous Open Source community which totally rocks. Anyone is able to write plugins and make them available to the world to consume and use.

Check out their Plugins page for a full list of ways you can take your static site to the next level.

Editorial Process

So we are building static sites, and you may be wondering: how do I create content for my site? There are several ways in which you can create a content editing workflow for your site. Probably the easiest way is to use static Markdown files. Markdown is a lightweight markup language with plain-text formatting syntax. It is designed so that it can be converted to HTML and many other formats. Markdown is often used to format readme files, for writing messages in online discussion forums, and to create rich text using a plain text editor. This blog is using Markdown. Since I am the only one creating content, I don't need a fancy administrative interface to create content.

Markdown is only one of the ways you can create content for your static sites. Others include more advanced methods, such as plugging Gatsby into your Content Management System (CMS) of choice. This includes WordPress, Drupal, Netlify, ContentaCMS, Contentful, and others. This means if you currently use any of those CMSs, you can continue to use them to retain a familiar workflow while moving your front-end workflow to a simpler and easier-to-manage process. This method is usually referred to as decoupled or headless, as your back-end is independent of your front-end.

Querying Data

As previously mentioned, Gatsby with the power of React creates the perfect system for building robust, flexible and super fast static sites. However, there is a third component that takes that power to a whole new level, and that is GraphQL.

GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data. GraphQL provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools.

Deploying Gatsby

Hosting for a Gatsby site can be done anywhere React apps can run. Nowadays that's pretty much anywhere. However, before investing in an expensive and highly complicated hosting environment, take a look at some of the simpler and less expensive options on this page.
You will see that for a basic website, you can use several of the free options such as GitHub Pages, Netlify, and others, which already include advanced continuous integration workflows. For more advanced sites where a CMS may be involved, you can also find options for deployment that will simplify your DevOps process.

My own blog runs out of a GitHub repo that automatically gets deployed when I push new updates. This happens via Netlify, which to me is probably the easiest way I have ever deployed a website.

In closing

Don't worry if you are a little skeptical about static site generators. I was too. However, I gave Gatsby a try and I see myself building more Gatsby sites in the future. Before Gatsby I worked with Jekyll, which is also a great static site generator, but what sets Gatsby apart is its seamless integration with React and GraphQL. The combination of those three provides endless possibilities in your web-building process. Check it out.

Categories: FLOSS Project Planets

Mario Hernandez: Running a training workshop

Planet Drupal - Wed, 2024-06-19 16:27

Update 1-10-19
I wrote an extended version of this post at Mediacurrent's blog, check it out.

As long as I can remember I've enjoyed public speaking. This doesn't mean I am good at it; it simply means I enjoy it. School events, class president, my jobs, etc. all taught me great lessons about public speaking. So when I started as a developer, sharing my knowledge with others at conferences or meetups came pretty naturally.

I'd like to clarify that after years of doing talks and other methods of public speaking, I am still terrified. I get nervous, my hands sweat, my legs shake, and my voice gets weird. Basically, what I am trying to say is that I'm not an expert by any means, but I overcome the phobia of public speaking by doing it frequently.

For many years I have been speaking at conferences, but in the past few years I started conducting longer workshops. I first started doing online workshops, which have their pros and cons. While they don't put you face to face with your audience, they also don't give you a good sense of how effective your training is, because you can't see people's reactions. For this reason I prefer to do face-to-face training.
As part of my job I conduct periodic front-end training workshops for clients, and recently I started conducting all-day training workshops at conferences. I really enjoy it and I'd like to share some of the lessons learned.

Picking a topic

Ideally you want to pick a topic you feel 100% confident about. I have learned that people attending your training or talks welcome any information you can share no matter how simple or elementary it may feel to you. Don't ever think what you know may not be of interest to others because you would be wrong.

Lately I have been challenging myself a little more when picking a topic to train on. While it's good to know the topic well, it is also extremely rewarding to pick a topic you'd like to learn more about. This may feel contrary to what I said earlier, but hear me out. When you decide to train on a topic, you will spend a lot of time preparing, training, testing and rehearsing. This is exactly how you learn a new skill. I can't tell you how many times I've come out of a training knowing more about the topic than before, having also learned from the people who attended. If you want to learn a new skill, teaching others about it could be the best way to learn it.

Preparing for the training

Everyone has their own style for teaching or doing a presentation. Some people like to use slides and screenshots, others show recordings of their project or code. My personal preference is to build a working prototype. This to me presents many advantages, but it also means you will spend more time getting ready.
My training workshops usually include very few slides, because the majority of the training is spent writing actual code and building the prototype during the training.

Here's my typical process for preparing for a training workshop:

  • Identify a prototype that serves the purpose of the training. If I am teaching a workshop about component-based development, I would normally pick something that involves the different aspects of component-based development (atoms, nested components, reusable components, etc.)

  • Build the prototype upfront to ensure you have a working model to demo and go by.

  • Once the prototype is built, create a public repo so you can share the working prototype.

  • Write step-by-step instructions to building the prototype. Normally I would break the prototype down into small components, atoms.

  • Test, test and test. You want to make sure your audience will not run into unexpected issues while following your instructions. For this reason you need to make sure you test your instructions. Ask a friend or colleague to go through each of your exercises to ensure things work as expected.

  • Provide a pre-training evaluation. A quick set of questions that will give you an idea of people's skill levels as well as their environments (Linux, Windows, OSX). This will help you plan ahead of time.

  • Build a simple slide deck for introductions and agenda purposes. I move away from slides as soon as the introduction and agenda are done. The rest of the training is all hands-on.

Communicate with your audience ahead of time

As you will learn, one thing that can really kill a lot of the time during training is assisting people with their local environment setup. I have conducted training workshops where I've spent half the time helping people with their environment. For this reason, nowadays I communicate with the people ahead of time to ensure everyone's local environment is ready to go.

I normally make myself available once or twice in an evening through a Google Hangout to assist anyone who may need help. I also provide detailed instructions on how to get their local environment ready. This could save you a lot of time during training. In addition, for those who did spend the time getting their environment ready, it's not fair that they should be held back because someone else did not make an effort to set up their environment.
I make myself available ahead of training, but if someone is still having issues because of neglect, I don't hold the rest of the class back. I try to help them, but at some point I move on.

During training

If possible, get help from someone who is also well versed in the topic so they can help you assist people who may get stuck. Nothing is more frustrating than having to break the flow of the training to help people who get stuck. Having someone else help you with this allows you to continue with the training and not have everyone lose momentum.

Finally

Enjoy yourself. Make sure you and your audience have fun. If you show excitement in what you are doing people will get excited as well.

Categories: FLOSS Project Planets

Dirk Eddelbuettel: nanotime 0.3.8 on CRAN: More Maintenance

Planet Debian - Wed, 2024-06-19 14:47

Leonardo and I are happy to announce that a new version 0.3.8 of our nanotime package arrived on CRAN today. It is the first release in over one and a half years. nanotime relies on the RcppCCTZ package (as well as the RcppDate package for additional C++ operations) and offers efficient high(er) resolution time parsing and formatting up to nanosecond resolution, using the bit64 package for the actual integer64 arithmetic. Initially implemented using the S3 system, it has benefitted greatly from a rigorous refactoring by Leonardo, who not only rejigged nanotime internals in S4 but also added new S4 types for periods, intervals and durations.

This release responds to a number of enhancements, including a new parameter accurate for POSIXct to nanotime conversions, a vectorized date converter, and a switch to a double return value when duration objects are divided, as well as a small battery of CRAN requests for changes and updates. This started with a move away from the now ‘non-API’ function SET_S4_OBJECT, which has been replaced by use of Rf_asS4. We also no longer need a custom compiler flag on Windows (where, for some reason nobody understands or remembers, bitfields are not packed). Other changes include small enhancements to manual page formatting and, last but not least, avoidance of some new UBSAN warnings. The NEWS snippet has the full details.

Changes in version 0.3.8 (2024-06-19)
  • Time format documentation now has a reference to RcppCCTZ

  • The package no longer sets a default C++ compilation standard of C++11 (Dirk initially in #114, and later switched to C++17)

  • New accurate parameter for conversion from POSIXct to nanotime (Davor Josipovic and Leonardo in #116 closing #115)

  • The as.Date() function is now vectorized and can take a TZ argument (Leonardo and Dirk in #119 closing #118)

  • Use of internal function SET_S4_OBJECT has been replaced by API function Rf_asS4 (Leonardo in #121 closing #120)

  • A nanoduration / nanoduration expression now returns a double (Leonardo in #122 closing #117)

  • Bitfield calculations no longer require a Windows-only compiler switch (Leonardo in #124)

  • A simple manual page format nag has been addressed (Dirk in #126 fixing #125)

  • A set of tests tickling a UBSAN issue via Rcpp code no longer runs unless CI is set (Dirk in #127 fixing #123)

Thanks to my CRANberries, there is a diffstat report for this release. More details and examples are at the nanotime page; code, issue tickets etc at the GitHub repository – and all documentation is provided at the nanotime documentation site.

If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Categories: FLOSS Project Planets

KDE PIM Sprint 2024 Report

Planet KDE - Wed, 2024-06-19 14:00

In 2021 I decided to take a break from contributing to KDE, since I felt that I’ve been losing motivation and energy to contribute for a while… But I’ve been slowly getting back to hacking on KDE stuff for the past year, which ended in me going to Toulouse this year to attend the annual KDE PIM Sprint, my first in 5 years.

I’m very happy to say that we have /a lot/ going on in PIM, and even though not everything is in the best shape and the community is quite small (there were only four of us at the sprint), we have great plans for the future, and I’m happy to be part of it.

Day 0

The sprint was officially supposed to start on Saturday, but everyone arrived already on Friday, so why wait? We wrote down the topics to discuss, put them on a whiteboard and got to it.

We managed to discuss some pretty important topics: how we want to proceed with the deprecation and removal of some components, how to improve our test coverage, how to improve indexing, and much, much more.

I arrived at the sprint with two big topics to discuss: milestones and testing.

Milestones

The idea is to create milestones for all our bigger efforts that we work (or want to work) on. The milestones should be concrete goals that are achievable within a reasonable time frame and have a clear definition of done. Each milestone should then be split into smaller tasks that can be tackled by individuals. We hope that this will help make KDE PIM more attractive to new contributors, who can now clearly see what is being worked on and can find very concrete, bite-sized tasks to work on.

As a result, we took all the ongoing tasks and turned most of them into milestones in Gitlab. It’s still very much a work in progress; we still need to break down many milestones into smaller tasks, but the general ideas are out there.

E2E Testing of Resources

Akonadi Resources provide a “bridge” between the Akonadi Server and individual services, like IMAP servers, DAV servers, Google Calendar etc. But we have no tests to verify that our Resources can talk to the services and vice versa. The plan is to create a testing framework (in Python) so that we can have automated nightly tests to verify that e.g. the IMAP resource interfaces properly with common IMAP server implementations, including major proprietary ones like Gmail or Office365. We want to achieve decent coverage for all our resources. This is a big project, but I think it’s a very exciting one, as it includes not just programming, but also figuring out and building some infrastructure to run e.g. Dovecot, NextCloud and others in Docker to test against.
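To make this more concrete, here is a minimal sketch of what one such end-to-end test could look like, using pytest conventions and Python’s standard imaplib. Everything here is hypothetical: the account, the credentials and the server setup are placeholders, not the actual framework we plan to build.

import imaplib

def test_imap_server_exposes_inbox():
    # Assumes an IMAP server (e.g. Dovecot) was started in Docker beforehand
    # and seeded with a hypothetical test account.
    with imaplib.IMAP4("localhost", 143) as conn:
        conn.login("testuser", "testpassword")
        status, folders = conn.list()
        assert status == "OK"
        assert any(b"INBOX" in folder for folder in folders)

A real test run would additionally drive the Akonadi IMAP resource against the same server and compare what Akonadi sees with what the server reports.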

Day 1

On Saturday we started quite early; all the delicious French pastry is not going to eat itself, is it? After breakfast we continued with discussions; we discussed tags support and how to improve our PR. But we also managed to produce some code. I implemented syncing of iCal categories with Akonadi tags, so the tags are becoming more useful. I also prepared Akonadi to cleanly handle the planned deprecation and retirement of KJots, KNotes and their accompanying resources, as well as the planned removal of the Akonadi Kolab Resource (in favor of using IMAP+DAV).

One of the tasks I want to look into is improving how we do database transactions in the Akonadi Server. To get some data out of it, I shoved a Prometheus exporter into Akonadi, hooked it up to a local Prometheus service, threw together a Grafana dashboard, and here we are:

We decided to order some pizzas for dinner and stayed at the venue hacking until nearly 11 o’clock.

Day 2

On the last day of the sprint we wrapped up the discussions and focused on actually implementing some of the ideas. I spent most of the time extending the Migration agent to extract tags from all existing events and todos already stored in Akonadi, and I helped create some of the milestones on the Gitlab board. We also came up with a plan for the KDE PIM BoF at this year's Akademy, where we want to present our progress on the respective milestones and give contributors a chance to tell us about the biggest hurdles they face when trying to contribute to KDE PIM, and how we can help make it easier for them to get involved.

Conclusion

I think it was a very productive sprint and I am really excited to be involved in PIM again. Can’t wait to meet up with everyone again at Akademy in September.

Go check out Kevin’s and Carl’s reports to see what else they have been up to during the sprint.

Did any of the milestones catch your eye, or do you have any questions? Come talk to us in our Matrix channel.

Finally, many thanks to Kevin for organizing the sprint, Étincelle Coworking for providing us with a nice and spacious venue, and KDE e.V. for supporting our travel.

Finally, if you like such meetings to happen in the future so that we can push forward your favorite software, please consider making a tax-deductible donation to the KDE e.V. foundation.

Categories: FLOSS Project Planets

Django Weblog: DjangoCon US: Call for Venue Proposal 2025

Planet Python - Wed, 2024-06-19 12:55

DEFNA is seeking proposals for a venue for DjangoCon US 2025 and ideally 2026. You can read the details on DEFNA’s site.

For 2025, we are looking at conference dates of October 5-10, 2025 or October 12-17, 2025.

The deadline for submissions is July 28, 2024. If you have any questions or concerns, please reach out to the DEFNA board at hello AT defna.org. We look forward to hearing from you!

Categories: FLOSS Project Planets

Real Python: Build a Guitar Synthesizer: Play Musical Tablature in Python

Planet Python - Wed, 2024-06-19 10:00

Have you ever wanted to compose music without expensive gear or a professional studio? Maybe you’ve tried to play a musical instrument before but found the manual dexterity required too daunting or time-consuming. If so, you might be interested in harnessing the power of Python to create a guitar synthesizer. By following a few relatively simple steps, you’ll be able to turn your computer into a virtual guitar that can play any song.

In this tutorial, you’ll:

  • Implement the Karplus-Strong plucked string synthesis algorithm
  • Mimic different types of string instruments and their tunings
  • Combine multiple vibrating strings into polyphonic chords
  • Simulate realistic guitar picking and strumming finger techniques
  • Use impulse responses of real instruments to replicate their unique timbre
  • Read musical notes from scientific pitch notation and guitar tablature

At any point, you’re welcome to download the complete source code of the guitar synthesizer, as well as the sample tablature and other resources that you’ll use throughout this tutorial. They might prove useful in case you want to explore the code in more detail or get a head start. To download the bonus materials now, visit the following link:

Get Your Code: Click here to download the free sample code that you’ll use to build a guitar synthesizer in Python.

Take the Quiz: Test your knowledge with our interactive “Build a Guitar Synthesizer” quiz. You’ll receive a score upon completion to help you track your learning progress:

Interactive Quiz

Build a Guitar Synthesizer

In this quiz, you'll test your understanding of what it takes to build a guitar synthesizer in Python. By working through this quiz, you'll revisit a few key concepts from music theory and sound synthesis.

Demo: Guitar Synthesizer in Python

In this step-by-step guide, you’ll build a plucked string instrument synthesizer based on the Karplus-Strong algorithm in Python. Along the way, you’ll create an ensemble of virtual instruments, including an acoustic, bass, and electric guitar, as well as a banjo and ukulele. Then, you’ll implement a custom guitar tab reader so that you can play your favorite songs.
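The tutorial builds everything up step by step, but the heart of Karplus-Strong synthesis is compact enough to sketch here. The following is a simplified illustration, not the tutorial's actual code: a delay line seeded with white noise, with neighboring samples repeatedly averaged to damp the vibration.

import numpy as np

def karplus_strong(frequency, duration, sample_rate=44100):
    num_samples = int(duration * sample_rate)
    delay = int(sample_rate / frequency)      # delay-line length sets the pitch
    buffer = np.random.uniform(-1, 1, delay)  # a burst of noise models the pluck
    samples = np.empty(num_samples)
    for i in range(num_samples):
        samples[i] = buffer[i % delay]
        # Averaging two adjacent samples acts as a low-pass filter,
        # so high frequencies decay faster, just like on a real string.
        buffer[i % delay] = 0.5 * (buffer[i % delay] + buffer[(i + 1) % delay])
    return samples

pluck = karplus_strong(frequency=110.0, duration=2.0)  # a two-second A2 pluck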

By the end of this tutorial, you’ll be able to synthesize music from guitar tablature, or guitar tabs for short, which is a simplified form of musical notation that allows you to play music without having to learn how to read standard sheet music. Finally, you’ll store the result in an MP3 file for playback.

Below is a short demonstration of the synthesizer, re-creating the iconic soundtracks of classic video games like Doom and Diablo. Click the play button to listen to the sample output:

E1M1 - At Doom's Gate (Bobby Prince), Tristram (Matt Uelmen)

Once you find a guitar tab that you like, you can plug it into your Python guitar synthesizer and bring the music to life. For example, the Songsterr website is a fantastic resource with a wide range of songs you can choose from.

Project Overview

For your convenience, the project that you’re about to build, along with its third-party dependencies, will be managed by Poetry. The project will contain two Python packages with distinctly different areas of responsibility:

  1. digitar: For the synthesis of the digital guitar sound
  2. tablature: For reading and interpreting guitar tablature from a file

You’ll also design and implement a custom data format to store guitar tabs on disk or in memory. This will allow you to play music based on a fairly standard tablature notation, which you’ll find in various places on the Internet. Your project will also provide a Python script to tie everything together, which will let you interpret the tabs with a single command right from your terminal.
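As a taste of what such a format could look like, here is a purely illustrative sketch using Pydantic and PyYAML, the two libraries listed among the dependencies below. The field names and structure are invented for illustration, not the tutorial's actual format.

from pydantic import BaseModel
import yaml

class Note(BaseModel):
    string: int  # which guitar string to pluck (1 = high E on a standard guitar)
    fret: int    # which fret to press (0 = open string)

class Measure(BaseModel):
    notes: list[Note]

class Tablature(BaseModel):
    title: str
    tempo: int  # beats per minute
    measures: list[Measure]

# Load a hypothetical song.yaml and validate it into typed objects:
with open("song.yaml") as file:
    tab = Tablature(**yaml.safe_load(file))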

Now, you can dive into the details of what you’ll need to set up your development environment and start coding.

Prerequisites

Although you don’t need to be a musician to follow along with this tutorial, a basic understanding of musical concepts such as notes, semitones, octaves, and chords will help you grasp the information more quickly. It’d also be nice if you had a rough idea of how computers represent and process digital audio in terms of sampling rate, bit depth, and file formats like WAV.

But don’t worry if you’re new to these ideas! You’ll be guided through each step in small increments with clear explanations and examples. So, even if you’ve never done any music synthesis before, you’ll have a working digital guitar or digitar by the end of this tutorial.

Note: You can learn music theory in half an hour by watching an excellent and free video by Andrew Huang.

The project that you’ll build was tested against Python 3.12 but should work fine in earlier Python versions, too, down to Python 3.10. In case you need a quick refresher, here’s a list of helpful resources covering the most important language features that you’ll take advantage of in your digital guitar journey:

Other than that, you’ll use the following third-party Python packages in your project:

  • NumPy to simplify and speed up the underlying sound synthesis
  • Pedalboard to apply special effects akin to electric guitar amplifiers
  • Pydantic and PyYAML to parse musical tablature representing finger movements on a guitar neck
Read the full article at https://realpython.com/python-guitar-synthesizer/ »


Categories: FLOSS Project Planets

Real Python: Quiz: Creating Great README Files for Your Python Projects

Planet Python - Wed, 2024-06-19 08:00

Test your understanding of how a great README file can make your Python project stand out and how to create your own README files.

Take this quiz after reading our Creating Great README Files for Your Python Projects tutorial.


Categories: FLOSS Project Planets

PyCharm: How to Move From pandas to Polars

Planet Python - Wed, 2024-06-19 07:48

This is a guest post from Cheuk Ting Ho, a data scientist who contributes to multiple open-source libraries, such as pandas and Polars.

You’ve probably heard about Polars – it is now firmly in the spotlight in the data science community. 

Are you still using pandas and would like to try out Polars? Are you worried that it will take a lot of effort to migrate your projects from pandas to Polars? You might be concerned that Polars won’t be compatible with your existing pipeline or the other tools you are currently using.

Fear not! In this article, I will answer these questions so you can decide whether to migrate to using Polars or not. I will also provide some tips for those of you who have already decided to migrate.

How is Polars different from pandas?

Polars is known for its speed and security, as it is written in Rust and based on Apache Arrow. For details about Polars vs. pandas, you can see our other blog post here. In short, while Polars’ backend architecture is different from pandas’, the creator and community around Polars have tried to maintain a Python API that is very similar to pandas’. At first glance, Polars code is very similar to pandas code. Fun fact – some contributors to pandas are also contributors to Polars. Due to this, the barrier for pandas users to start using Polars is relatively low. However, as it is still a different library, it is worth double-checking the differences between the two.

Advantages of using Polars

Have you struggled when using pandas on a relatively large data set? Do you think pandas is using too much RAM and slowing your computer down while working locally? Polars may solve this problem with its lazy API: intermediate steps won’t be executed unless needed, which in some cases saves the memory those intermediate results would consume.
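As a rough illustration, here is what the lazy API looks like; the file and column names are invented for the example. Nothing is read or computed until collect() is called, which lets Polars optimize the whole query and avoid materializing intermediate results:

import polars as pl

lazy_frame = (
    pl.scan_csv("large_dataset.csv")   # builds a query plan, reads nothing yet
    .filter(pl.col("age") > 18)
    .group_by("city")
    .agg(pl.col("income").mean())
)

result = lazy_frame.collect()          # only now is the query executed
print(result)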

Another advantage Polars has is that, since it is written in Rust, it can make use of concurrency much better than pandas. Python is traditionally single-threaded, and although pandas uses the NumPy backend to speed up some operations, it is still mainly written in Python and has certain limitations in its multithreading capabilities.

Tools that make the switch easy

As Polars’ popularity grows, there is more and more support for Polars in popular tools for data scientists, including scikit-learn and HoloViz.

PyCharm, the most popular IDE used by data scientists, provides a similar experience when you work with pandas and Polars. This makes the process of migration smoother. For example, interactive tables allow you to easily see the information about your DataFrame, such as the number of rows and columns.

Try PyCharm for free

PyCharm has an excellent pagination feature – if you want to see more results per page, you can easily configure that via a drop-down menu:

You can see the statistical summary for the data when you hover the cursor over the column name:

You can also sort the data for inspection with a few clicks in the header. There is multi-sorting functionality, too: after sorting the table once, press and hold ⌥ (macOS) or Alt (Windows) and click on the second column you want the table to be sorted by. For example, here we can sort by island and bill_length_mm in the table.

To get more insights from the DataFrame, you can switch to chart view with the icon on the left:

You can also change how the data is shown in the settings, showing different columns and using different graph types:

It also helps you auto-complete methods when using Polars, which is very handy when you are starting to use Polars and are not yet familiar with all of the methods it provides. To understand more about full line code completion in JetBrains IDEs, please check out this article.

You can also access the official documentation quickly by clicking the Polars icon in the top-right corner of the table, which is really handy.

How to migrate from pandas to Polars

If you’re now convinced to migrate to Polars, your final questions might be about the extent of changes needed for your existing code and how easy it is to learn Polars, especially considering your years of experience and muscle memory with pandas.

Similarities between pandas and Polars

Polars provides APIs similar to pandas’, most notably read_csv(), head(), tail(), and describe() for a glance at what the data looks like. It also provides similar data manipulation functions like join() and groupby()/group_by(), and aggregation functions like mean() and sum().

Before going into the migration, let’s look at these code examples in Polars and pandas.

Example 1 – Calculating the mean score for each class

pandas

import pandas as pd

df_student = pd.read_csv("student_info.csv")
print(df_student.dtypes)

df_score = pd.read_csv("student_score.csv")
print(df_score.head())

df_class = df_student.join(df_score.set_index("name"), on="name").drop("name", axis=1)
df_mean_score = df_class.groupby("class").mean()
print(df_mean_score)

Polars

import polars as pl

df_student = pl.read_csv("student_info.csv")
print(df_student.dtypes)

df_score = pl.read_csv("student_score.csv")
print(df_score.head())

df_class = df_student.join(df_score, on="name").drop("name")
df_mean_score = df_class.group_by("class").mean()
print(df_mean_score)

Polars provides similar I/O methods like read_csv. You can also inspect the dtypes, do data cleaning with drop, and do groupby with aggregation functions like mean.

Example 2 – Calculating the rolling mean of temperatures

pandas

import pandas as pd

df_temp = pd.read_csv("temp_record.csv", index_col="date", parse_dates=True, dtype={"temp": int})
print(df_temp.dtypes)
print(df_temp.head())

df_temp.rolling(2).mean()

Polars

import polars as pl

df_temp = pl.read_csv("temp_record.csv", try_parse_dates=True, dtypes={"temp": int}).set_sorted("date")
print(df_temp.dtypes)
print(df_temp.head())

df_temp.rolling("date", period="2d").agg(pl.mean("temp"))

Reading with date as the index in Polars can also be done with read_csv, with a slight difference in the function arguments. A rolling mean (or other types of aggregation) can also be done in Polars.

As you can see, these code examples are very similar, with only slight differences. If you are an experienced pandas user, I am sure your journey using Polars will be quite smooth.

Tips for migrating from pandas to Polars

As for code that was previously written in pandas, how can you migrate it to Polars? What are the differences in syntax that may trip you up? Here are some tips that may be useful:

Selecting and filtering

In pandas, we use .loc / .iloc and [] to select part of the data in a data frame. However, in Polars, we use .select to do so. For example, df["age"] or df.loc[:, "age"] in pandas becomes df.select("age") in Polars.

In pandas, we can also create a mask to filter out data. However, in Polars, we use .filter instead. For example, df[df["age"] > 18] in pandas becomes df.filter(pl.col("age") > 18) in Polars.

All of the code that involves selecting and filtering data needs to be rewritten accordingly.
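Putting the two together, here is a small before-and-after sketch, assuming the usual import pandas as pd / import polars as pl and an existing DataFrame df with invented column names:

# pandas: boolean mask plus .loc selection
adults = df.loc[df["age"] > 18, ["name", "age"]]

# Polars: .filter plus .select
adults = df.filter(pl.col("age") > 18).select("name", "age")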

Use .with_columns instead of .assign

A slight difference between pandas and Polars is that in pandas we use .assign to create new columns by applying certain logic and operations to existing columns, while in Polars this is done with .with_columns. For example:

In pandas

df_rec.assign(
    diameter = lambda df: (df.x + df.y) * 2,
    area = lambda df: df.x * df.y
)

becomes

df_rec.with_columns(
    diameter = (pl.col("x") + pl.col("y")) * 2,
    area = pl.col("x") * pl.col("y")
)

in Polars.

.with_columns can replace groupby

In addition to assigning a new column with simple logic and operations, .with_columns offers more advanced capabilities. With a little trick, you can perform operations similar to groupby in pandas by using window functions:

In pandas

df = pd.DataFrame({
    "class": ["a", "a", "a", "b", "b", "b", "b"],
    "score": [80, 39, 67, 28, 77, 90, 44],
})
df["avg_score"] = df.groupby("class")["score"].transform("mean")

becomes

df.with_columns(
    pl.col("score").mean().over("class").alias("avg_score")
)

in Polars.

Use scan_csv instead of read_csv if you can

Although read_csv also works in Polars, using scan_csv instead switches to lazy evaluation mode and benefits from the lazy API mentioned above.
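For example, with a hypothetical data.csv and column name, the two calls look almost identical but behave differently:

df = pl.read_csv("data.csv")   # eager: parses the whole file immediately
lf = pl.scan_csv("data.csv")   # lazy: returns a LazyFrame, reads nothing yet

print(lf.select(pl.col("age")).collect())  # work happens only at collect()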

Building pipelines properly with lazy API

In pandas, we usually use .pipe to build data pipelines. However, since Polars works a bit differently, especially when using the lazy API, we want the pipeline to be executed only once, so we need to adjust the code accordingly. For example:

Instead of this pandas code snippet:

def discount(df):
    df["30_percent_off"] = df["price"] * 0.7
    return df

def vat(df):
    df["vat"] = df["price"] * 0.2
    return df

def total_cost(df):
    df["total"] = df["30_percent_off"] + df["vat"]
    return df

(df
 .pipe(discount)
 .pipe(vat)
 .pipe(total_cost)
)

We will have the following one in Polars:

def discount(input_col):
    return pl.col(input_col).mul(0.7).alias("30_percent_off")

def vat(input_col):
    return pl.col(input_col).mul(0.2).alias("vat")

def total_cost(input_col1, input_col2):
    return pl.col(input_col1).add(pl.col(input_col2)).alias("total")

# Expressions in a single with_columns call run in parallel and cannot see
# each other's output, so the dependent "total" column goes in a second call.
df.with_columns(
    discount("price"),
    vat("price"),
).with_columns(
    total_cost("30_percent_off", "vat"),
)

Missing data: No more NaN

Do you find NaN in pandas confusing? There is no NaN in Polars! Since NaN is a NumPy object and Polars doesn’t use NumPy as its backend, all missing data is now null instead. For details about null and NaN in Polars, check out the documentation.
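Here is a small sketch of how missing data behaves in practice, with invented values:

import polars as pl

df = pl.DataFrame({"price": [12.5, None, 20.0]})

print(df.null_count())                                  # one null in "price"
print(df.with_columns(pl.col("price").fill_null(0.0)))  # replace nulls with 0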

Exploratory data analysis with Polars

Polars provides a similar API to pandas, and with hvPlot, you can easily create a simple plotting function for exploratory data analysis in Polars. Here I will show two examples: one creating simple statistical information from your data set, and the other plotting simple graphs to understand the data.

Summary statistics from dataset

When using pandas, the most common way to get summary statistics is to use describe. In Polars, we can also use describe in a similar manner. For example, say we have a DataFrame with some numerical data and missing data:

We can use describe to get summary statistics:
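For instance, a minimal sketch with invented data:

import polars as pl

df = pl.DataFrame({
    "name": ["Alice", "Bob", None],
    "age": [25.0, None, 30.0],
})
print(df.describe())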

Notice how object types are treated – in this example, the name column gives a different result compared to pandas. In pandas, a column with object type will result in categorical data like this:

In Polars, the result is similar to numeric data, which makes less sense:

Simple plotting with Polars DataFrame


To better visualize the data, we might want to plot some graphs to help us evaluate it more efficiently. Here is how to do so with the plot method in Polars.

First of all, since Polars uses hvPlot as its plotting backend, make sure it is installed. You can find the hvPlot User Guide here. Next, since hvPlot outputs the graph as an interactive Bokeh graph, we need to use output_notebook from bokeh.plotting to make sure it shows inline in the notebook. Add this code at the top of your notebook:

from bokeh.plotting import output_notebook

output_notebook()

Also, make sure your notebook is trusted. This is done by simply checking the checkbox in the top-right of the display when using PyCharm.

Next, you can use the plot method in Polars. For example, to make a scatter plot, you have to specify the columns to be used as the x- and y-axis, and you can also specify the column to be used as color of the points:

df.plot.scatter(x="body_mass_g", y="bill_length_mm", color="species")

This will give you a nice plot of the different data points of different penguin species for inspection:

Of course, scatter plots aren’t your only option. In Polars, you can use similar steps to create any type of plot that is supported by hvPlot. For example, hist can be done like this:

df.plot.hist("body_mass_g", by=["species","sex"])

For a full list of plot types supported by hvPlot, you can have a look at the hvPlot reference gallery.

Conclusion

I hope the information provided here will help you on your way with using Polars. Polars is an open-source project that is actively maintained and developed. If you have suggestions or questions, I recommend reaching out to the Polars community.

About the author

Cheuk Ting Ho

Cheuk has been a Data Scientist at various companies – a job that demands high numerical and programming skills, especially in Python. Following her passion for the tech community, Cheuk has been a Developer Advocate for three years. She also contributes to multiple open-source libraries like Hypothesis, Pytest, pandas, Polars, PyO3, Jupyter Notebook, and Django. Cheuk is currently a consultant and trainer at CMD Limes.

Categories: FLOSS Project Planets

EuroPython: EuroPython June 2024 Newsletter

Planet Python - Wed, 2024-06-19 06:13
🐍 EuroPython 2024: The Ultimate Python Party Awaits! 🎉


Hello Pythonistas,

Get ready to code, connect, and celebrate at EuroPython 2024! We're thrilled to bring you an unforgettable conference experience filled with enlightening talks, engaging workshops, and a whole lot of fun. Whether you're a seasoned developer or just starting your Python journey, there's something for everyone. Let's dive into the details!

⏰ THREE DAYS LEFT TO BUY YOUR TICKETS!! 🎟️

Don't miss out on the Python event of the year! Secure your spot today and be part of the magic.

🎟️ Buy your tickets here!!! 🎟️

The Late Bird prices kick in this Saturday (June 22nd).

SCHEDULE 📅

The schedule is OUT! Check out all the awesome stuff we have planned for you in Prague this year.

🎤 Keynote Speakers

We're excited to announce our stellar lineup of keynote speakers who will inspire and challenge you with their insights and experiences:

  • Carol Willing - Developer Advocate at Noteable and core developer of Jupyter and CPython.
  • Tereza Iofciu - Data Science Lead at Free Now, co-founder of PyLadies Hamburg.
  • Anna Přistoupilová - Bioinformatician and researcher.
  • Armin Ronacher - Creator of Flask and Director of Engineering at Sentry.
  • Łukasz Langa - Python core developer and Release Manager for Python 3.8 and 3.9.
  • Mai Giménez - Senior research engineer at Google DeepMind, specialising in large language and multimodal models.
🥳 Social Events
  • Boat Trip: Set sail on Friday with us for a scenic boat trip to enjoy an evening of networking and relaxation. Make sure to reserve your spot early! Sign-up will be available soon.
  • EuroPython Social Event: Join us for a fantastic evening in Prague. This event promises great food, drinks, and the opportunity to connect with fellow attendees in a beautiful setting. You will be invited to bring your favourite games and musical instruments. Stay tuned!
  • Speakers’ Dinner: An exclusive dinner event for our speakers to network, share insights, and enjoy a relaxing evening before the conference kicks off. More information here.
🍽 PyLadies Lunch

PyLadies Lunch at EuroPython 2023 at the Prague Conference Centre

On Thursday, 11th July 2024, 12:30 to 14:00, join us for a special lunch event aimed at fostering community and empowerment among women in tech.

Thank you to our sponsor 🐙 Kraken Tech 🐙 for supporting the lunch event.

We’re excited to announce a range of events for underrepresented groups in computing this year! 🎉 Whether you’re new to PyLadies or a long-time supporter, we warmly welcome you to join us and be part of our supportive community.

  • Self-Defence Workshop: Learn to defend yourself against inappropriate behaviour in this supportive session. Facilitated by a professional therapist, you'll gain practical skills and mutual support.
  • #IAmRemarkable: Empower yourself in this workshop designed to help women and underrepresented groups celebrate their achievements and enhance their self-promotion skills.
  • Meet & Greet with PyLadies: Network with seasoned members of the PyLadies community. Gain valuable insights, advice, and inspiration for your Python journey.

Sign up for any of the sessions above here

🌍 Community Organiser's Lunch

On Friday (July 12th) at 1 pm. A great opportunity for community leaders to network and discuss strategies for success. This lunch will include an Open Space discussion about Python Organizations and how we deal with challenges.

Sign up for the Community Organiser’s session here

👩‍💻 Learn to Build Websites With Django Girls

Are you interested in learning how to build websites and belong to an underrepresented group in computing? Join us for a one-day workshop!

No prior programming experience is required. The workshop is open to everyone, regardless of participation in EuroPython. For more information, click here

👶 Childcare

This year, we're once again partnering with Susie's Babysitting Prague to offer childcare at the main conference venue (Prague Conference Centre).

If you're interested, please let us know at the latest two weeks before the event by filling out this form.

You will be asked about the Childcare add-on when you buy your ticket.

💻 Sprint Weekend

The EuroPython 2024 Sprints will take place during the weekend of the 13th and 14th of July. This year, the event will happen at a different venue from the conference, and it will be free for anyone with a conference ticket to join!

As per our tradition, EuroPython will provide the rooms and facilities, but the sprints are organised by YOU. It is a great chance to contribute to open-source projects large and small, learn from each other, geek out and have fun. 🐍

Lunch and coffee will be provided.

  • When: 13th and 14th July 2024 (09:30 - 17:00)
  • Where: To be determined
🤭 Py.Jokes

~ pyjoke
There are only two hard problems in Computer Science: cache invalidation, naming things and off-by-one-errors.

📱 Stay Connected

Share your EuroPython experience on social media!

Use the hashtag #EuroPython2024 and follow us on:

With so much joy and excitement,

EuroPython 2024 Team 🤗

Categories: FLOSS Project Planets

Qt for MCUs 2.8 LTS released

Planet KDE - Wed, 2024-06-19 06:05

We are thrilled to announce the release of Qt for MCUs 2.8 LTS, which comes with new exciting GUI building blocks, improvements to build tools workflows, extended support for Infineon TRAVEO T2G microcontrollers, and much more. Qt for MCUs 2.8 is a Long-Term Support version, offering increased stability throughout your development. As such, it is the preferred version for all new projects. Standard Support will be available for 18 months, until December 2025.

Categories: FLOSS Project Planets

The Drop Times: What We Learned from DrupalJam: Open Up 2024

Planet Drupal - Wed, 2024-06-19 04:41
Celebrate the 20th DrupalJam with Esmeralda Tijhoff! Join over 330 participants in Utrecht and delve into Dries Buytaert's keynote, insightful presentations, and hands-on workshops. Discover the latest trends in Drupal technology and explore the evolution of open source at DrupalJam 2024. Don't miss out on this comprehensive event recap—read more now!
Categories: FLOSS Project Planets

LakeDrops Drupal Consulting, Development and Hosting: ECA 2.0.0 has been released for Drupal 10.3 and 11

Planet Drupal - Wed, 2024-06-19 04:41
ECA 2.0.0 has been released for Drupal 10.3 and 11

Jürgen Haas - Wed, 19.06.2024 - 10:41

Almost two years ago, ECA 1.0.0 was published, and a lot happened in the 23 months in between. Today, ECA gets its first major update, which comes not only with a ton of new features but also with code clean-up, performance improvements, and support for the latest Drupal core releases 10.3 and (soon) 11.

Categories: FLOSS Project Planets

ComputerMinds.co.uk: My text filter's placeholder content disappeared!

Planet Drupal - Wed, 2024-06-19 04:40
A story of contributing a fix to Drupal... and a pragmatic workaround

When I upgraded a site from Drupal 10.1 to 10.2, I discovered a particularly serious bug: the login form on our client's site vanished... which was pretty serious for this site, which hid all content behind a login!

We had a custom text format filter plugin to render the login form in place of a custom token in text that editors set, on one of the few pages that anonymous users could access. Forms can have quite different cacheability to the rest of a page, and building them can be a relatively expensive operation anyway, so we used placeholders which Drupal can replace 'lazily' outside of regular caching:

class MymoduleLoginFormFilter extends FilterBase implements TrustedCallbackInterface {

  public function process($text, $langcode) {
    $result = new FilterProcessResult($text);
    $needle = '[login_form]';
    // No arguments needed as [login_form] is always to be replaced with the same form.
    $arguments = [];
    $replace = $result->createPlaceholder(self::class . '::renderLoginForm', $arguments);
    return $result->setProcessedText(str_replace($needle, $replace, $text));
  }

  public static function renderLoginForm() {
    // Could be any relatively expensive operation.
    return \Drupal::formBuilder()->getForm(UserLoginForm::class);
  }

  public static function trustedCallbacks() {
    return ['renderLoginForm'];
  }

}

But our text format also had core's "Correct faulty and chopped off HTML" filter enabled - which completely removed the placeholder, and therefore the form went missing from the final output!

Debugging this to investigate was interesting - it took me down the rabbit hole of learning more about PHP 8 Fibers, as Drupal 10.2 uses them to replace placeholders. Initially, I thought the problem could be there, but it turned out that the placeholder itself was the problem. Drupal happily generated the form to go in the right place, but couldn't find the placeholder. Here's what a placeholder created by FilterProcessResult::createPlaceholder() should look like:

<drupal-filter-placeholder callback="Drupal\mymodule\Plugin\Filter\MymoduleLoginFormFilter::renderLoginForm" arguments="" token="hqdY2kfgWm35IxkrraS4AZx6zYgR7YRVmOwvWli80V4"></drupal-filter-placeholder>

Looking very carefully, I spotted that the arguments="" attribute in the actual markup was just arguments - i.e. it had been turned into a 'boolean' HTML attribute:

<drupal-filter-placeholder callback="Drupal\mymodule\Plugin\Filter\MymoduleLoginFormFilter::renderLoginForm" arguments token="hqdY2kfgWm35IxkrraS4AZx6zYgR7YRVmOwvWli80V4"></drupal-filter-placeholder>

There is a limited set of these, and yet the masterminds/html5 component that Drupal 10.2 now uses to process HTML5 requires an explicit list of the attributes that should not get converted to boolean attributes when they are set to an empty string.

At this point, I should point out that this means a simple solution could be to just pass some arguments so that the attribute isn't empty! That is a nice immediate workaround that avoids the need for any patch, so is an obvious maintainable solution:

// Insert your favourite argument; any value will do.
$arguments = [42];

At least that ensures our login form shows again!

But I don't see any documentation saying there must be arguments, and it would be easy for someone to write this kind of code again elsewhere, especially if we're trying to do The Right Thing by using placeholders in filters.

So I decided to contribute a fix back to Drupal core. I've worked on core before. Sometimes it's a joy to find or fix something that affects thousands of people, other times the contribution process can be soul-destroying. At least in this case, I found an existing test in core that could be easily extended to demonstrate the bug. Then I wrote a surgical fix... but I can see that it tightly couples the filter system to Drupal's HtmlSerializerRules class. That class is within the \Drupal\Component namespace, which is described as:

Drupal Components are independent libraries that do not depend on the rest of Drupal in order to function.

Components MAY depend on other Drupal Components or external libraries/packages, but MUST NOT depend on any other Drupal code.

So perhaps it needs configuration in order to be decoupled, and/or a factory service, or maybe modules should subscribe to an event to be able to inject their own rules... and very quickly perfection feels like the enemy of good, as I can imagine the scope of a solution ballooning in size and complexity.

I'm all for high standards in core, but fulfilling them to produce solutions can still be a slow and frustrating experience. I'm already involved in enough long-running issues that just bounce around between reviewers, deprecations and changes in standards. I risk just ranting here rather than providing answers - and believe me, I'm incredibly grateful for the work that reviewers and committers have put into producing Drupal - but surely the current process must be putting so many potential contributors off. We worry about attracting talent to the Drupal ecosystem, and turning Takers into Makers, but what are they going to find when they arrive? Contributing improvements of decent size is hard and can require perseverance over years. Where can we adjust the balance to make contribution easier for anyone, even seasoned developers?

As I suggested, perhaps this particular bug needs any of a factory pattern, event subscriber, or injected configuration... but what would my next step be? I'm reluctant to put effort into writing a more complex solution when I know from experience that reviewers might just suggest doing something different anyway. At least I have that simple (if unsatisfying) workaround for the filter placeholder method: always send an argument, even if it might be ignored. I guess that reflects the contribution experience itself sometimes!

Categories: FLOSS Project Planets

Sahil Dhiman: First Iteration of My Free Software Mirror

Planet Debian - Tue, 2024-06-18 23:35

As I’m gearing up to set up a Free Software download mirror in India, it occurred to me that I haven’t chronicled the work and motivation behind setting up the original mirror in the first place. It also seems like a good idea to document things here to show the progression, as the mirror is going multi-country now. Right now, my existing mirror, mirrors.de.sahilister.net (formerly mirrors.sahilister.in), hosted in Germany, serves traffic for Termux, NomadBSD, Blender, BlendOS and GIMP. For a while in between, I hosted the OSMC project mirror as well.

First, to explain what a Free Software download mirror is, I’ll quote myself from my work blog -

As most Free Software doesn’t have commercial backing and requires heavy downloads, the concept of software download mirrors helps take the traffic load off of the primary server, leading to geographical redundancy, higher availability and faster downloads in general.

So whenever someone wants to download a particular (mirrored) software and clicks download, the upstream redirects the download to one of the mirror servers that is geographically (or by other parameters) close to the user, leading to faster downloads and load sharing amongst all mirrors.

Since the time I got into Linux and servers, I have always wanted to help the community somehow, and mirroring seemed the most obvious thing. India is a country that has traditionally had a small number of public download mirrors. IITB, TIFR, and some other public institutions used to host them for popular Linux distributions and Free Software, but they seem to be diminishing these days.

In the last months of 2021, I started using Termux and saw that it had only a few mirrors (back then). Getting a high-capacity, high-bandwidth server on a budget was hard in India in 2021-22. So after much deliberation, I decided to go where capacity was available and chose a German hosting provider, with the thought of helping where possible and adding an India node once conditions became favourable (thankfully that happened, and the India node is live now too). Termux required only 29 GB of storage, so I went ahead and started mirroring it. I raised this issue in Termux’s GitHub repository in January 2022. This blog post chronicles the start of the mirror.

Termux generates high request counts from a mirror’s point of view. Each Termux client usually checks each mirror in the selected group for availability before randomly selecting one for download (the only other case is when the client has explicitly selected a single mirror using termux-repo-change). The mirror started getting thousands of requests daily, but only a small percentage would actually select my mirror, so download traffic was lower. A similar thing happened with OSMC too (which I started mirroring later).

With this start, I began exploring various projects that would benefit from additional mirrors. Public information from the Academic Computer Club in Umeå’s mirror and Freedif’s mirror stats helped me figure out storage and bandwidth requirements for potential projects.

Later, I migrated to a different provider for better speeds and added a LibreSpeed test on the mirror server. Those were fun times. Between OSMC, Termux and LibreSpeed, I was getting almost 1.2 million hits/day on the server at its peak, crossing the 1 TB/day traffic mark for the first time.

Next came Blender, which took the longest time to set up, around 9–10 months. Blender had a push-trigger requirement for rsync from upstream that took quite some back and forth. It now contributes the largest share of traffic on my mirror. On release days, the mirror does more than 3 TB/day; on normal days, it hovers around 2 TB/day. GIMP is the latest addition to the mirror.

At one point, the mirror traffic touched 4.97 TB/day. That’s when I decided to drop the LibreSpeed server and focus solely on mirroring, keeping the bandwidth allotment for serving downloads.

The mirror’s project selection grew organically. I used to reach out to many projects to discuss the need for additional mirrors. Some projects outright declined the mirroring request, as Germany already has good academic mirrors boasting 20-25 Gbit/s speeds dating back to the FTP era, which seems fair. Finding the niche was essential so as to only add software that truly required additional capacity. There were months when nothing much would happen with the mirror; rsync would continue to update it while nginx kept serving the traffic. Nowadays, the mirror pushes around 70 TB/month. I occasionally check logs and vnstat, add new security measures here and there, and pay the bills. The mirror now sometimes saturates the Gigabit link and goes beyond it, peaking around 1.42 Gbit/s (the hosting provider seems to be upping their game). The plan is to upgrade the link to better speeds.

Yearly traffic stats (through `vnstat -y`)

On the way, learned quite a few things like -

  • IPv6 and exposing an rsync module, due to an OSMC requirement.
  • Implementing a user with restricted access that can only trigger rsync, basically making rsync pulls trigger-based, due to Blender.
  • Iterating to find the right client response size for the LibreSpeed test project.
  • Mistakenly identifying torrent traffic for BlendOS as a DDoS and blocking it for quite a few months. BlendOS added loads of hits from torrent traffic, making my mirror also serve as a web seed. Web seeds in conjunction with normal seeds are a good combination for serving downloads, as they combine the best of both worlds: the general availability of a web seed/mirror and the benefit of traditional seeds to maximize download speeds at the user’s end.
  • Handling abusive traffic (a lot of it, to be frank). The approach is more whack-a-mole right now, which I want to improve and automate.
  • Most of the traffic on mirrors serving non-Linux/BSD operating systems (like mine) comes from people on Windows and Mac asking for EXEs and DMGs. This is mostly because package repositories carry the software distribution load on Linux/BSD systems, and partly because the number of Windows/Mac users is quite high compared to other OSs.
  • Load balancing through DNS and HTTP redirection (which I would now implement in my India mirror) to better maximize available resources; a rough sketch of the HTTP-redirect idea follows this list.
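
To make the HTTP-redirect idea concrete, here is a rough, hypothetical Python sketch. The IP-prefix check stands in for a real GeoIP database lookup, and the mirror table, port and handler are illustrative only; a real deployment would use a proper GeoIP lookup behind a web server or a DNS-based scheme.

from http.server import BaseHTTPRequestHandler, HTTPServer

MIRRORS = {
    "IN": "https://mirrors.sahilister.in",      # India node
    "DE": "https://mirrors.de.sahilister.net",  # Germany node
}

def country_for(ip):
    # Placeholder lookup: a real setup would consult a GeoIP database.
    return "IN" if ip.startswith("103.") else "DE"

class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Send the client a 302 to the mirror closest to them.
        target = MIRRORS.get(country_for(self.client_address[0]), MIRRORS["DE"])
        self.send_response(302)
        self.send_header("Location", target + self.path)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8080), RedirectHandler).serve_forever()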

GeoIP map of clients from yesterday's access logs (generated from IPinfo.io)

Fun fact: the Academic Computer Club in Umeå, which runs mirror.accum.se (one of the prominent Debian, Ubuntu, etc. mirrors), now has a 200 Gbit/s uplink to the internet through SUNET.

In hindsight, the stats look amazing: hundreds of TBs of traffic served from the mirror. That shows there’s still quite an appetite for public mirrors in a time of commercially “donated” CDNs and GitHub. The world could have done with one less mirror, but this one saved some time and lessened the burden for others, while providing redundancy and traffic localization. And it’s fun for someone like me who’s into the infrastructure that powers the Internet. Now I’ll focus on expanding the India mirror, which has itself started pushing almost half a TB/day. Long live Free Software and public download mirrors.

Categories: FLOSS Project Planets

PyCoder’s Weekly: Issue #634 (June 18, 2024)

Planet Python - Tue, 2024-06-18 15:30

#634 – JUNE 18, 2024
View in Browser »

Should Python Adopt Calendar Versioning?

Python’s use of semantic-style version numbers causes confusion, as breaking changes can be present in the “minor” number position. This proposal, given at the Python Language Summit, is to switch to calendar-based versioning. A PEP is forthcoming.
PYTHON SOFTWARE FOUNDATION

Python Mappings: A Comprehensive Guide

In this tutorial, you’ll learn the basic characteristics and operations of Python mappings. You’ll explore the abstract base classes Mapping and MutableMapping and create a custom mapping.
REAL PYTHON
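
As a flavour of what custom mappings involve, here is an illustrative sketch (not the tutorial's own code; the class name and wrapped data are made up). Subclassing the Mapping ABC only requires three methods:

from collections.abc import Mapping

class LowerCaseMapping(Mapping):
    """Wraps a dict so key lookups become case-insensitive."""

    def __init__(self, data):
        self._data = {key.lower(): value for key, value in data.items()}

    def __getitem__(self, key):
        return self._data[key.lower()]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

headers = LowerCaseMapping({"Content-Type": "text/html"})
print(headers["CONTENT-TYPE"])  # text/html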

How Do You Turn Data Science Insights into Business Results? Posit Connect

Data scientists use Posit Connect to get their work into the hands of decision-makers. Securely deploy python analytic work & distribute that across teams. Publish data apps, documents, notebooks, and dashboards. Deploy models as APIs & configure reports to run & get distributed on a custom schedule →
POSIT sponsor

NumPy 2.0.0 Release Notes

The long awaited 2.0 release of NumPy landed this week. Not all the docs are up to date yet, but this final draft of the release notes shows you what is included.
NUMPY.ORG

Quiz: Python Mappings

In this quiz, you’ll test your understanding of the basic characteristics and operations of Python mappings. By working through this quiz, you’ll revisit the key concepts and techniques of creating a custom mapping.
REAL PYTHON

Discussions

Personal Red Flags When You’re Interviewing at a Company?

HACKER NEWS

Articles & Tutorials Proposed Bylaws Changes for the PSF

As part of the upcoming board election, three new bylaws are also being proposed for your consideration. The first makes it easier to qualify for membership for Python-related volunteer work, the second makes it easier to vote, and the third gives the board more options around the code of conduct.
PYTHON SOFTWARE FOUNDATION

Python Interfaces: Object-Oriented Design Principles

In this video course, you’ll explore how to use a Python interface. You’ll come to understand why interfaces are so useful and learn how to implement formal and informal interfaces in Python. You’ll also examine the differences between Python interfaces and those in other programming languages.
REAL PYTHON course

Prod Alerts? You Should be Autoscaling

Rest easy with Judoscale’s web & worker autoscaling for Heroku, Render, and Amazon ECS. Traffic spike? Scaled up. Quiet night? Scaled down. Work queue backlog? No problem →
JUDOSCALE sponsor

Listing All Files in a Directory With Python

In this video course, you’ll be examining a couple of methods to get a list of files and folders in a directory with Python. You’ll also use both methods to recursively list directory contents. Finally, you’ll examine a situation that pits one method against the other.
REAL PYTHON course
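
As a flavour of the topic (an illustrative sketch, not the course's own code), pathlib covers both the flat and the recursive case; the "." directory is just an example:

from pathlib import Path

entries = [p.name for p in Path(".").iterdir()]           # files and folders
python_files = [str(p) for p in Path(".").rglob("*.py")]  # recursive listing
print(entries)
print(python_files)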

Python Logging: The Log Levels

Logging levels allow you to control which messages you record in your logs. Think of log levels as verbosity levels. How granular do you want your logs to be? This article teaches you how to control your logging.
MIKE DRISCOLL
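
The core idea in a tiny illustrative sketch (not the article's code): only messages at or above the configured level are emitted.

import logging

logging.basicConfig(level=logging.WARNING)
logging.debug("hidden: DEBUG is below the WARNING threshold")
logging.warning("emitted: WARNING meets the threshold")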

How Do You Program for 8h in a Row?

You may get paid for 8 hours a day, but that doesn’t necessarily mean you’re coding that whole time. This article touches on the variety of the job and what you should expect if you are new to the field.
BITE CODE!

Python Language Summit 2024: Lightning Talks

A summary of the six lightning talks given at the 2024 Python Language Summit. Topics include Python for iOS, improving asserts in 3.14, sharing data between sub-interpreters, and more.
PYTHON SOFTWARE FOUNDATION

Starting and Stopping uvicorn in the Background

Learn how to start and stop uvicorn in the background using a randomly selected free port number. Useful for running test suites that require live-webservers.
CHRISTOPH SCHIESSL • Shared by Christoph Schiessl
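
A common trick for grabbing a random free port, sketched here for illustration (the article's actual approach may differ): bind to port 0 and let the OS pick one.

import socket

def free_port() -> int:
    # Port 0 asks the OS to allocate any currently free port.
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

print(free_port())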

How I Built a Bot Publishing Italian Paintings on Bluesky

This article describes Nicolò’s project to build a bot that retrieves images from Wikimedia and selects the best ones, and how he deployed it to the cloud.
NICOLÒ GISO • Shared by Nicolò Giso

Testing async MongoDB AWS Applications With Pytest

This article shows real life techniques and fixtures needed to make the test suite of your MongoDB and AWS-based application usable and performant.
HANDMADESOFTWARE • Shared by Thorin Schiffer

DjangoCon Europe 2024 Bird’s-Eye View

Thibaud shares some of the best moments of DjangoConEU 2024. He highlights some of the talks, workshops, and the outcome of the sprints.
THIBAUD COLAS

Storing Django Static and Media Files on DigitalOcean Spaces

This tutorial shows how to configure Django to load and serve up static and media files, public and private, via DigitalOcean Spaces.
MICHAEL HERMAN

CPython Reference Counting and Garbage Collection Internals

A detailed code walkthrough of how CPython implements memory management, including reference counting and garbage collection.
ABHINAV UPADHYAY
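
You can watch reference counts from Python itself; a tiny illustrative sketch (not from the article):

import sys

x = []
# getrefcount reports at least 2: the variable x plus the function argument.
print(sys.getrefcount(x))
y = x
print(sys.getrefcount(x))  # one more reference now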

The Decline of the User Interface

“Software has never looked cooler, but user interface design and user experience have taken a sharp turn for the worse.”
NICK HODGES

Ruff: Internals of a Rust-Backed Python Linter-Formatter

This article dives into the structure of the popular ruff Python linter written in Rust.
ABDUR-RAHMAAN JANHANGEER • Shared by Abdur-Rahmaan Janhangeer

Projects & Code FinanceDatabase: Financial Database as a Python Module

GITHUB.COM/JERBOUMA

prettypretty: Build Awesome Terminal User Interfaces

GITHUB.COM/APPAREBIT

smbclient-ng: Interact With SMB Shares

GITHUB.COM/P0DALIRIUS

wakepy: Cross-Platform Keep-Awake With Python

GITHUB.COM/FOHRLOOP

django-mfa2: Django MFA; Supports TOTP, U2F, & More

GITHUB.COM/MKALIOBY

Events

Weekly Real Python Office Hours Q&A (Virtual)

June 19, 2024
REALPYTHON.COM

Wagtail Space US

June 20 to June 23, 2024
WAGTAIL.SPACE

PyData Bristol Meetup

June 20, 2024
MEETUP.COM

PyLadies Dublin

June 20, 2024
PYLADIES.COM

Chattanooga Python User Group

June 21 to June 22, 2024
MEETUP.COM

PyCamp Leipzig 2024

June 22 to June 24, 2024
BARCAMPS.EU

Happy Pythoning!
This was PyCoder’s Weekly Issue #634.
View in Browser »

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

Categories: FLOSS Project Planets

PyBites: Learn Python From Scratch: We Extended Our Newbie Bite Exercises From 25 to 50 🐍 📈

Planet Python - Tue, 2024-06-18 14:42

We are excited to announce that we’ve extended our Newbie Bites from 25 to 50 Python exercises!

The importance of exercising when learning how to code

We’re passionate about this new batch of exercises because they require active engagement, which is crucial for learning how to code. Passive methods like reading books or watching videos don’t help concepts click or stick.

Our exercises involve writing code that is validated with pytest. This immediate feedback helps you understand mistakes and learn more effectively. You’ll encounter errors, re-think your approach, and practice deliberately.

Why double the number of Newbie exercises?

The first 25 exercises taught the fundamentals well, but many found it hard to tackle the intro and beginner Bites afterward. The extra 25 Newbie Bites (#26-50) bridge the gap between complete beginners and intermediate Python programmers.

These new exercises cover essential concepts like error handling, type hints, default arguments, special characters, working with dates, classes (!), list comprehensions, constants, exception handling, and more.

We believe these challenges will provide a deeper understanding and more robust skill set to tackle the regular Bites and become more proficient in Python.

Get full access to the Newbie Bites

Overview of the new exercises

Reading Errors: Learn to read and understand error messages in Python. These messages provide valuable information for debugging and fixing issues efficiently.

Failing Tests: Practice reading and interpreting failing test outputs with the pytest framework. This skill is crucial for resolving Bites and any Python development.

Type Hints: Explore type hints introduced in Python 3.5, which help you write more readable and maintainable code by specifying expected data types.
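
A minimal sketch of the idea (not the actual Bite):

def add(a: int, b: int) -> int:
    # Hints document the expected types; Python does not enforce them at runtime.
    return a + b

print(add(2, 3))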

Default Arguments: Understand how to define functions with default values, making your functions more flexible and easier to use.
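
For example (a sketch, not the actual Bite):

def power(base, exponent=2):
    # exponent defaults to 2, so callers can omit it.
    return base ** exponent

print(power(3))     # 9
print(power(3, 3))  # 27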

Special Chars: Learn about special characters in Python strings, such as \n and \t, for better formatting and readability.
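
For instance:

print("line one\nline two")  # \n starts a new line
print("name:\tPybites")      # \t inserts a tab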

Word Count: Use string methods like .split() and .splitlines() to manipulate and process text data effectively.
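
A quick sketch of both methods:

text = "first line\nsecond line here"
print(len(text.split()))       # 5 words
print(len(text.splitlines()))  # 2 lines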

Dict Retrieval – Part 2: Explore advanced techniques for retrieving values from dictionaries to enhance your data handling skills.

Dict Retrieval – Part 3: Learn safer methods to retrieve values from dictionaries, providing defaults if keys are not present.
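
Likely along the lines of dict.get(), sketched here:

config = {"debug": True}
print(config.get("debug", False))    # True
print(config.get("verbose", False))  # False: default instead of KeyError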

Random Module: Use Python’s random module to write a number guessing game, showcasing the practical use of standard library modules.

Working With Dates – Part 1: Explore the datetime module, focusing on the date class and the .weekday() method to work with dates.
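
A tiny illustration:

from datetime import date

today = date.today()
print(today.weekday())  # 0 = Monday ... 6 = Sunday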

Working With Dates – Part 2: Continue working with the datetime module, focusing on importing specific objects versus entire modules.

Make a Class: Learn about Python classes, which serve as blueprints for creating objects, starting from the basics.

Class With Str: Build upon the previous exercise by learning special methods like __str__ for adding string representations to classes.
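
A minimal sketch of __str__ in action (not the actual Bite):

class Book:
    def __init__(self, title, author):
        self.title = title
        self.author = author

    def __str__(self):
        # print() and str() use this friendly representation.
        return f"{self.title} by {self.author}"

print(Book("Fluent Python", "Luciano Ramalho"))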

Make Dataclass: Simplify class creation with Python dataclasses, introduced in Python 3.7.
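
For example:

from dataclasses import dataclass

@dataclass
class Point:
    # The decorator generates __init__, __repr__ and __eq__ for us.
    x: int
    y: int

print(Point(1, 2) == Point(1, 2))  # True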

Scope: Understand variable scope to write clearer and less error-prone code. Scope determines the visibility and lifespan of variables.

String Manipulations: Practice fundamental string manipulations, essential for processing and transforming text data.

List Comprehension: Learn to write concise and efficient list comprehensions, a powerful feature in Python for creating new lists.
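
For instance:

numbers = [1, 2, 3, 4, 5]
squares_of_evens = [n ** 2 for n in numbers if n % 2 == 0]
print(squares_of_evens)  # [4, 16]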

Named Tuple: Explore namedtuples, which allow attribute access to elements and support type hints, enhancing your data handling capabilities.
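
A small sketch using the typed variant:

from typing import NamedTuple

class Color(NamedTuple):
    red: int
    green: int
    blue: int

c = Color(255, 0, 0)
print(c.red)  # attribute access instead of c[0]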

Constants: Learn to assign and use constants, which are fixed values that improve code readability and maintainability.

Exceptions: Master exception handling to write resilient code and establish clear boundaries for function callers.
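
For example:

def divide(a, b):
    if b == 0:
        raise ValueError("b must not be zero")
    return a / b

try:
    divide(1, 0)
except ValueError as exc:
    print(f"caught: {exc}")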

For Loop With Break And Continue: Control loop flow using break and continue statements to manage iterations based on conditions.

In Operator: Use the in operator to check for item presence in collections, a common practice in Python programming.
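
For instance:

names = ["ana", "bob", "tim"]
print("bob" in names)         # True
print("ex" in "text")         # True: works on substrings too
print("missing" in {"a": 1})  # False: checks dict keys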

String Module: Combine list comprehensions with the string module to check and manipulate characters, then join them back into strings.

Formatting Intro: Learn string interpolation using the .format() method to insert variables into strings dynamically.
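
For example:

template = "Hello {name}, you have {count} new Bites to crush"
print(template.format(name="Bob", count=25))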

Buy the Newbie exercise bundle

The exercises build upon each other, so they must be done in order. After completing them, you’ll earn your Pybites Newbie Certificate.

We’re also working on a series of screencasts to further explain the second half of the Newbie Bites. Stay tuned by subscribing to our YouTube channel.

About the Pybites Platform

We offer a rich platform for learning Python through practical exercises.

Our Bite exercises are designed to challenge you and help you apply Python in real-world scenarios, ranging from basic syntax to advanced topics.

Whether you’re a beginner or an experienced coder, our exercises aim to improve your coding skills in an effective and fun manner.

Start coding today: https://codechalleng.es/

And join our growing community of passionate Pythonistas: https://pybites.circle.so/

Categories: FLOSS Project Planets

Wim Leers: XB week 5: chaos theory

Planet Drupal - Tue, 2024-06-18 13:47

We started the week off with a first MVP config entity: the component config entity. We have to start somewhere and then iterate — and iterate we can: thanks to Felix “f.mazeikis”, #3444417 landed! Most crucially, this means we now have both a low-level unit test and a kernel test that ensure the validation for this config entity is thorough: of course I’m bringing config validation to Experience Builder (XB) from the very start!

It doesn’t do much: it allows “opting in” single directory components (SDCs) to XB, and now only those are available in the PoC admin UI. Next up is #3452397: expanding the admin UI and the config entity to allow defining default values for its SDC props. This will allow the UI to immediately generate a preview of the component, which otherwise may (depending on the component) not render anything visible (imagine: a <h2> without text, an <img> without an image, etc.).

But literally everything about this config entity is subject to change:

  • its name (issue: #3455036)
  • the whole “opting in” principle (issue comment: #3444424-7)
  • what it does exactly (issue: #3446083)
  • whether it should handle only exposing SDCs 1:1 in XB, or whether it should also handle other component types (issue: #3454519)
How did we get here?

Which brings me to the main subject for this week: I have a theory that explains the chaos that I think many of you now see.

We rushed to open the 0.x branch to invite contribution, because Dries announced XB at DrupalCon. This wrongly gave the impression that there was a very detailed plan, with a precise architecture in mind. That is not the case, and mea culpa for giving that impression. That branch was simply the combination of different PoCs merged together.

Some additional context: back in March, from one day to the next, I, along with Ben “bnjmnm”, Ted “tedbow”, Alex “effulgentsia”, Tim and Lauri, plus Felix and Jesse (both from the Site Studio team, both very experienced), was asked to stop everything we were doing (for me: dropping config validation) and start estimating the product requirements.

~4 weeks of non-stop grueling meetings 1 to assess and estimate the 64 requirements, with Lauri providing visual examples and additional context for the product requirements he crafted (after careful research: #3452440). We looked at multiple radical options and evaluated which of these was viable:

  1. building on top of one of the page building solutions that already exist in the Drupal ecosystem
  2. partnering with one of the existing page building systems, if they were willing to relicense under GPL-2+ and we’d be able to fit them onto Drupal’s data modeling strengths
  3. building something new (but reusing the parts in the Drupal ecosystem that we can build on top of, such as SDC, field types for data modeling, etc.)

The result: 60 pages worth of notes 2. For each of the 42 “Required for MVP” product requirements (out of a total of 64), there now was an outline of how it could be implemented, a best+realistic+worst case estimate (week-level accuracy), the skillsets needed and how many people. It’s during this process that it became clear that only one choice made sense: the last one. We reached that conclusion because the existing solutions in the ecosystem fall short (see #3452440), Site Studio does not meet all of XB’s requirements and some of its architectural choices conflict with them, and we did not end up finding a magical unicorn that meshed perfectly with Drupal.

It’s during this time that I made my biggest mistake yet: because the request for estimating this was coming from within Acquia, I assumed this was intended to be built by a team of Acquians. Because … why else create estimates?! If it’s built by a mix of paid full-time, paid part-time and volunteer community members, there’s little point in estimating, right?!
Well, I was wrong: turns out the intent was to get a sense of how feasible it was to achieve in roughly which timeline. But I and my fellow team members were so deep into this enormous estimation exercise based on very high-level requirements that thinking about capturing this information in a more shareable form was a thought that simply did not occur…

So, choice 3 it was, with people who have deep Layout Builder knowledge (Ted & Tim), expertise from the Site Studio team (Jesse & Felix) and strong front-end expertise (Ben & Harumi “hooroomoo”) … because next we were asked to narrow the highest-variance estimates, to bring them to a higher degree of confidence. That’s where the “FTEs”, “Best Case Size (weeks)”, “Realistic Case Size (weeks)”, “Worst Case Size (weeks)”, “Most Probable Case Size (calculated, FTE weeks)” and “Variance” columns in the requirements spreadsheet come in.

For the next ~4 weeks, we built PoCs for the requirements where our estimates had either the highest variance or the highest number, to both narrow and reduce the estimates. That’s where all the branches on the experience_builder project come from! For example:

Week 1 = witch pot

After the ~8 chaotic weeks above, it was DrupalCon (Dries also wrote about this, just before DrupalCon, based on the above work!), and then … it was week 1.

In that first week, we took a handful of branches that seemed to contain the pieces most sensible to start iterating on an actual implementation, threw that into a witch pot, and called that 0.x!

Now that all of you know the full origin story, and many of you have experienced the subsequent ~4 equally chaotic weeks, you’re all caught up! :D

Missed a prior week? See all posts tagged Experience Builder.

Goal: make it possible to follow high-level progress by reading ~5 minutes/week. I hope this empowers more people to contribute when their unique skills can best be put to use!

For more detail, join the #experience-builder Slack channel. Check out the pinned items at the top!

Going forward: order from chaos

So, how are we going to improve things?

Actual progress?

So besides the backstory and attempts to bring more order & overview, what happened?

The UI gained panning support (#3452623), which really makes it feel like it’s coming alive:

Try it yourself locally if you like, but there’s not much you can do yet.
Install the 0.x branch — the “Experience Builder PoC” toolbar item takes you there!

… you can also try an imperfect/limited version of the current UI on the statically hosted demo UI thanks to #3450311 by Lee “larowlan”.

We also got two eslint CI jobs running (#3450307): one for the typical Drupal stuff, one for the XB UI (which is written in TypeScript). We reduced the overhead of PHPStan level 8 too (again thanks to Lee, in #3454501) … and a bunch more low-level things were improved.

Various things are in progress … expect an update for those next week :)

Thanks to Lauri for reviewing this!

  1. Jesse Baker aptly labeled this period as “bonding through trauma” — before the meetings started mid-March we didn’t know each other, and afterwards it felt like I’d worked with Felix & Jesse for years! ↩︎

  2. In a form that is not sufficiently digestible for public consumption. ↩︎

Categories: FLOSS Project Planets

Nonprofit Drupal posts: June Drupal for Nonprofits Chat

Planet Drupal - Tue, 2024-06-18 11:05

Join us THURSDAY, June 20 at 1pm ET / 10am PT, for our regularly scheduled call to chat about all things Drupal and nonprofits. (Convert to your local time zone.)

We don't have anything specific on the agenda this month, so we'll have plenty of time to discuss anything that's on our minds at the intersection of Drupal and nonprofits.  Got something specific you want to talk about? Feel free to share ahead of time in our collaborative Google doc: https://nten.org/drupal/notes!

All nonprofit Drupal devs and users, regardless of experience level, are always welcome on this call.

This free call is sponsored by NTEN.org and open to everyone. 

  • Join the call: https://us02web.zoom.us/j/81817469653

    • Meeting ID: 818 1746 9653
      Passcode: 551681

    • One tap mobile:
      +16699006833,,81817469653# US (San Jose)
      +13462487799,,81817469653# US (Houston)

    • Dial by your location:
      +1 669 900 6833 US (San Jose)
      +1 346 248 7799 US (Houston)
      +1 253 215 8782 US (Tacoma)
      +1 929 205 6099 US (New York)
      +1 301 715 8592 US (Washington DC)
      +1 312 626 6799 US (Chicago)

    • Find your local number: https://us02web.zoom.us/u/kpV1o65N

  • Follow along on Google Docs: https://nten.org/drupal/notes

View notes of previous months' calls.

Categories: FLOSS Project Planets
