FLOSS Project Planets

PyPy: Guest Post: How PortaOne uses PyPy for high-performance processing, connecting over 1B of phone calls every month

Planet Python - Thu, 2024-08-29 05:00

The PyPy project is always happy to hear about industrial use and deployments of PyPy. For the GC bug finding task earlier this year, we collaborated with PortaOne and we're super happy that Serhii Titov, head of the QA department at PortaOne, was up to writing this guest post to describe their use and experience with the project.

What does PortaOne do?

We at PortaOne Inc. allow telecom operators to launch new services (or provide existing services more efficiently) using our VoIP platform (PortaSIP) and our real-time charging system (PortaBilling), which provides additional features for cloud PBX, such as call transfer, queues, interactive voice response (IVR) and more. At this moment our support team manages several thousand servers with our software installed in 100 countries, through which over 500 telecommunication service providers connect millions of end users every day. The unique thing about PortaOne is that we supply the source code of our product to our customers - something unheard of in the telecom world! Thus we attract "telco innovators", who use our APIs to build around the system and the source code to create unique tweaks of functionality, which produces amazing products.

At the core of PortaSIP is the middle-ware component (the proper name for it is "B2BUA", but that probably does not say much to anyone outside of experts in VoIP), which implements the actual handling of SIP calls, messages, etc. and all added features (for instance, trying to send a call via telco operators through which the cost per minute is lower). It has to be fast (since even a small delay in establishing a call is noticed by a customer), reliable (everyone hates when a call drops or cannot be completed) and yet easily expandable with new functionality. This is why we decided to use Python as opposed to C/C++ or similar programming languages, which are often used in telecom equipment.

The B2BUA component is a batch of similar Python processes that are looped inside a asyncore.dispatcher wrapper. The load balancing between these Python processes is done by our stateless SIP proxy server written in C++. All our sockets are served by this B2BUA. We have our custom client-wrappers around pymysql, redis, cassandra-driver and requests to communicate with external services. Some of the Python processes use cffi wrappers around C-code to improve their performance (examples: an Oracle DB driver, a client to a radius server, a custom C logger).

The I/O operations that block the main thread of the Python processes are processed in sub-threads. We have custom wrappers around threading.Thread and also asyncore.dispatcher. The results of such operations are returned to the main thread.

Improving our performance with PyPy

We started with CPython and then in 2014 switched to PyPy because it was faster. Here's an exact quote from our first testing notes: "PyPy gives significant performance boost, ~50%". Nowadays, after years of changes in all the software involved, PyPy still gives us +50% boost compared to CPython.

Taking care of real time traffic for so many people around the globe is something we're really proud of. I hope the PyPy team can be proud of it as well, as the PyPy product is a part of this solution.

Finding a garbage collector bug: stage 1, the GC hooks

However our path with PyPy wasn't perfectly smooth. There were very rare cases of crashes on PyPy that we weren't able to catch. That's because to make coredump useful we needed to switch to PyPy with debug, but we cannot let it run in that mode on a production system for an extended period of time, and we did not have any STR (steps-to-reproduce) to make PyPy crash again in our lab. That's why we kept (and still keep) both interpreters installed just in case, and we would switch to CPython if we noticed it happening.

At the time of updating PyPy from 3.5 to 3.6 our QA started noticing those crashes more often, but we still had no luck with STR or collecting proper coredumps with debug symbols. Then it became even worse after our development played with the Garbage Collector's options to increase performance of our middleware component. The crashes started to affect our regular performance testing (controlled by QA manager Yevhenii Bovda). At that point it was decided that we can no longer live like that and so we started an intense investigation.

During the first stage of our investigation (following the best practice of troubleshooting) we narrowed down the issue as much as we could. So, it was not our code, it was definitely somewhere in PyPy. Eventually our SIP software engineer Yevhenii Yatchenko found out that this bug is connected with the use of our custom hooks in the GC. Yevhenii created ticket #4899 and within 2-3 days we got a fix from a member of the PyPy team, in true open-source fashion.

Finding a garbage collector bug: stage 2, the real bug

Then came stage 2. In parallel with the previous ticket, Yevhenii created #4900 that we still see failing with coredumps quite often, and they are not connected to GC custom hooks. In a nutshell, it took us dozens of back and forward emails, three Zoom sessions and four versions of a patch to solve the issue. During the last iteration we got a new set of options to try and a new version of the patch. Surprisingly, that helped! What a relief! So, the next logical step was to remove all debug options and run PyPy only with the patch. Unfortunately, it started to fail again and we came to the obvious conclusion that what will help us is not a patch, but one of options we were testing out. At that point we found out that PYPY_GC_MAX_PINNED=0 is a necessary and sufficient condition to solve our issue. This points to another bug in the garbage collector, somehow related to object pinning.

Here's our current state: we have to add PYPY_GC_MAX_PINNED=0, but we do not face the crashes anymore.

Conclusion and next steps

Gratitude is extended to Carl for his invaluable assistance in resolving the nasty bugss, because it seems we're the only ones who suffered from the last one and we really did not want to fall back to CPython due to its performance disadvantage.

Serhii Titov, head of the QA department at PortaOne Inc.

P.S. If you are a perfectionist and at this point you have mixed feelings and you are still bothered by the question "But there might still be a bug in the GC, what about that?" - Carl has some ideas about it and he will sort it out (we will help with the testing/verification part).

Categories: FLOSS Project Planets

Six Tips for Maximizing Desktop Screen Potential

Planet KDE - Thu, 2024-08-29 02:30

Desktop software has many differences from mobile and embedded applications but one of the biggest and most obvious is the screen. How can you take advantage of all that real estate for your application? Here are six considerations for managing the screen in your desktop application.

  1. Choosing the right GUI framework

Use a flexible GUI framework that easily supports building apps with differing resolutions. The less hand-tweaking you need for your dialogs, graphical assets, and interface across a wide spectrum of resolutions, the better.

  1. Testing across diverse displays

Make sure you have a wide range of monitors to test your app against. It’s easy to assume everything works perfectly when your app is tested on a uniform configuration provided by IT. However, visual issues may arise when your app runs on smaller, larger, or different monitors.

  1. Setting clear resolution standards

Set a clear minimum resolution that supports your application’s features effectively. Ensure all dialogs fit on screen, scrollbars function properly, nothing is off-screen. Also, test against ultra-high-resolution monitors (like 4K and 8K) to ensure clarity and usability at high DPI settings. Verify that text is legible, controls are noticeable, and clickable regions are big enough to target accurately.

  1. Designing for adaptability

Ensure that your application’s user interface is not only scalable but also adaptable. It should reconfigure itself based on the resolution, maintaining a balance between functionality and aesthetics. Dialog boxes should be resizable, and layout managers should dynamically adjust component placement based on the available screen real estate.

  1. Embracing multi-monitor flexibility

Multi-monitor setups aren’t just for developers anymore. Many people use laptops along with a larger monitor. In fact, a two-screen configuration may be even more popular than single screens. Make sure your application handles this flexibility intelligently by allowing spawning windows or panels that can be moved to the monitor that best works for the user.

  1. Customizing the user workspace

With features such as dockable toolbars, multiple document interfaces, or floating inspectors, you can allow users to customize their workspace. This is particularly handy for apps like graphic design, audio/video editing, and software development, where distributing a wide variety of tools, dialogs, controls, and views across the entire screen real estate is essential.

Final thoughts

Designing and testing for multiple screen resolutions and configurations is part of making a great application. As screen technology evolves and user expectations rise, your applications’ ability to harness the full potential of ultra resolution and multi-monitor setups might just set it apart from the crowd. If you’re interested in more tips for building desktop applications, you may want to read our related best practice guide.

About KDAB

If you like this article and want to read similar material, consider subscribing via our RSS feed.

Subscribe to KDAB TV for similar informative short video content.

KDAB provides market leading software consulting and development services and training in Qt, C++ and 3D/OpenGL. Contact us.

The post Six Tips for Maximizing Desktop Screen Potential appeared first on KDAB.

Categories: FLOSS Project Planets

GNU MediaGoblin: MediaGoblin 0.14.0

GNU Planet! - Thu, 2024-08-29 01:00

We're pleased to announce the release of GNU MediaGoblin 0.14.0. See the release notes for full details and upgrading instructions.

Highlights of this release are:

  • Preliminary support for Docker installation
  • Preliminary support for OS packaging on GNU Guix
  • Major configure/build overhaul
  • Extended configuration documentation

This version has been tested on Debian Bookworm (12), Ubuntu 20.04, Ubuntu 22.04, Ubuntu 24.04 and Fedora 39.

Thanks go to co-maintainer Olivier Mehani for his major contributions in this release!

To join us and help improve MediaGoblin, please visit our getting involved page.

Categories: FLOSS Project Planets

Asking for donations in Plasma

Planet KDE - Wed, 2024-08-28 23:17

Why do we ask for donations so often? Because it’s important! As KDE becomes more successful and an increasing number of people use our software, our costs grow as well:

  • Web and server hosting
  • Organizing and hosting larger Akademy events
  • Funding more and larger sprints
  • Paying people to work on stuff the volunteer community doesn’t focus on, and retaining them over time

And so on. If we don’t raise more money while our software becomes more popular, we risk “dying of success,” so to speak. Remember, we give all this stuff away for free, no strings attached! Nobody has to pay a cent if they don’t want to or can’t afford to.

Accordingly, if you’re plugged into KDE social media, you probably see a lot of requests for donations. I end every one of my “This Week in KDE” posts with one, and many others do for their own blog posts as well. KDE’s official social media channels blast it out constantly, and we also do yearly fundraisers that are widely promoted online. If you’re reading this, you may get the impression that we’re always begging for cash!

But if you’re not plugged into these communications channels, you might not have ever seen a donation request at all. We know that the fraction of people who subscribe to these channels is small, so there’s a huge number of people who may not even know they can donate to KDE, let alone that donations are critically important to its continued existence.

Starting in Plasma 6.2, that changes, and I’d like to introduce what it will look like! From 6.2 onwards, Plasma itself will show a system notification asking for a donation once per year, in December:

The idea here is to get the message that KDE really does need your financial help in front of more eyeballs — especially eyeballs not currently looking at KDE’s public-facing promotion efforts.

Now, I know that messages like this can be controversial! The change was carefully considered, and we tried our best to minimize the annoying-ness factor: It’s small and unobtrusive, and no matter what you do with it (click any button, close it, etc) it’ll go away until next year. It’s implemented as a KDE Daemon (KDED) module, which allows users and distributors to permanently disable it if they like. You can also disable just the popup on System Settings’ Notifications page, accessible from the configure button in the notification’s header.

Ultimately the decision to do this came down to the following factors:

  1. We looked at FOSS peers like Thunderbird and Wikipedia which have similar things (and in Wikipedia’s case, the message is vastly more intrusive and naggy). In both cases, it didn’t drive everyone away and instead instead resulted in a massive increase in donations that the projects have been able to use to employ lots of people.
  2. KDE really needs something like this to help our finances grow sustainably in line with our userbase and adoption by vendors and distributors.

So now let me address what I anticipate will be some common concerns:

I think you’re wrong; people hate pop-ups and this is going to turn them off!

Like I said, peer organizations didn’t see that happen, and some were even more in-your-face about it. I do suspect a small but vocal crowd of people will spread doom and gloom about it on social media anyway, of course. This also happened when we implemented off-by-default telemetry — which by the way, was implemented so conservatively that it barely collects any information of value at all. It’s a cautionary tale about the danger of being too timid and ending up with the worst of both worlds.

The worst-case scenario is that we don’t get more donations from this after a couple of years, and end up removing it. That’s always an option. But I think it’s worth venturing out there and being a bit bold! With risk comes opportunity.

KDE shouldn’t need to pay people directly; employment should come from vendors and distributors shipping our software!

To a certain extent this already does happen: by far the largest contributor of paid work is Blue Systems — mostly funded by Valve Corporation, which ships KDE software on the Steam Deck. There are also trickles and spurts of sponsored work from distros, KDAB, and enterprising folks who get funded via grants.

Ultimately a healthy economic ecosystem around KDE includes people employed by many parties, including KDE itself, in my opinion. This is how KDE can help control its own destiny. And that costs money! Money that needs to come from somewhere.

Why does it have to be a notification pop-up? Put this in Welcome Center or something!

We had a request for donations in Welcome center for several years, and it didn’t make a difference, because right after you’ve installed the system wasn’t the right time to ask. At that point, you don’t know if you like Plasma yet, so asking for money is premature.

If KDE is as successful as Thunderbird and Wikipedia have been, what are you going to do with all that money?

This is a question the KDE e.V. board of directors as a whole would need to answer, and any decision on it will be made collectively.

But as one of the five members on that board, I can tell you my personal answer and the one that as your representative, I’d advocate for. It’s basically the platform I ran on two years ago: extend an offer of full-time employment to our current people, and hire even more! I want us to end up with paid QA people and distro developers, and even more software engineers. I want us to fund the creation of a next-generation KDE OS we can offer directly to institutions looking to switch to Linux, and a hardware certification program to go along with it. I want us to to extend our promotional activities and outreach to other major distros and vendors and pitch our software to them directly. I want to see Ubuntu, Red Hat Enterprise Linux, and SUSE Linux Enterprise Desktop ship Plasma by default. I want us to use this money to take over the world — with freedom, empowerment, and kindness.

These have been dreams for a long time, and throughout KDE we’ve been slowly moving towards them over the years. With a lot more money, we can turbocharge the pace! If that stuff sounds good, you can start with a donation today.

I know talking about money can be awkward. But failure to plan is planning to fail; money is something we can’t ignore and just hope things work out — and we don’t. Raising more money is a part of that plan, and this new yearly donation notification is a part of raising money. It’s my expectation and hope that asking our users for donations will result in more donations, and that we can use these to accelerate KDE’s reach and the quality of our software!

Categories: FLOSS Project Planets

Matt Layman: No Frills, Just Go: Standard Library Only Web Apps

Planet Python - Wed, 2024-08-28 20:00
How much can you build in Go with zero extra packages? What is possible using nothing more than Go’s standard library? In this talk, you’re going to find out!
Categories: FLOSS Project Planets

Gizra.com: Drupal Core Contribution Guide

Planet Drupal - Wed, 2024-08-28 20:00
Drupal core development offers developers both opportunities and challenges. On the one hand, it keeps them updated with the latest API changes, ensures their skills stay sharp, introduces fresh ideas and solutions by reviewing contributions from other smart developers and helps developers optimize their code for various scenarios they might not have considered before. On the other hand, getting into Drupal development isn’t easy. Setting up Drupal correctly requires attention to detail and understanding its setup.
Categories: FLOSS Project Planets

Michael Ablassmeier: proxmox backup S3 proxy

Planet Debian - Wed, 2024-08-28 20:00

A few weeks ago Tiziano Bacocco started a small project to implement a (golang) proxy that allows to store proxmox backups on S3 compatible storage: pmoxs3backuproxy, a feature which the current backup server does not have.

I wanted to have a look at the Proxmox Backup Server implementation for a while, so i jumped on the wagon and helped with adding most of the API endpoints required to seamlessly use it as drop-in replacement in PVE.

The current version can be configured as storage backend in PVE. You can then schedule your backups to the S3 storage likewise.

It now supports both the Fixed index format required to create virtual machine backups and the Dynamic index format, used by the regular proxmox-backup-client for file and container backups. (full and incremental)

The other endpoints like adding notes, removing or protecting backups, mounting images using the PVE frontend (or proxmox-backup-client) work too. It comes with a garbage collector that does prune the backup storage if snapshots expire and runs integrity checks on the data.

You can also configure it as so called “remote” storage in the Proxmox Backup server itself and pull back complete buckets using “proxmox-backup-manager pull”, if your local datastore crashes.

I think it will become more interesting if future proxmox versions will allow to push backups to other stores, too.

Categories: FLOSS Project Planets

GNU Taler news: GNU Taler 0.13 released

GNU Planet! - Wed, 2024-08-28 18:00
We are happy to announce the release of GNU Taler v0.13.
Categories: FLOSS Project Planets

GNUnet News: GNUnet 0.22.0

GNU Planet! - Wed, 2024-08-28 18:00
GNUnet 0.22.0 released

We are pleased to announce the release of GNUnet 0.22.0.
GNUnet is an alternative network stack for building secure, decentralized and privacy-preserving distributed applications. Our goal is to replace the old insecure Internet protocol stack. Starting from an application for secure publication of files, it has grown to include all kinds of basic protocol components and applications towards the creation of a GNU internet.

This is a new major release. It breaks protocol compatibility with the 0.21.x versions. Please be aware that Git master is thus henceforth (and has been for a while) INCOMPATIBLE with the 0.21.x GNUnet network, and interactions between old and new peers will result in issues. In terms of usability, users should be aware that there are still a number of known open issues in particular with respect to ease of use, but also some critical privacy issues especially for mobile users. Also, the nascent network is tiny and thus unlikely to provide good anonymity or extensive amounts of interesting information. As a result, the 0.22.0 release is still only suitable for early adopters with some reasonable pain tolerance .

Download links

The GPG key used to sign is: 3D11063C10F98D14BD24D1470B0998EF86F59B6A

Note that due to mirror synchronization, not all links might be functional early after the release. For direct access try http://ftp.gnu.org/gnu/gnunet/

Changes

A detailed list of changes can be found in the git log , the NEWS and the bug tracker . Noteworthy highlights are

  • transport :
    • A new experimental HTTP/3 communicator for peer-to-peer transport communicator.
    • New experimental NAT traversal functionality.
  • util :
  • hostlist : The bootstrap URL is changed to https://bootstrap.gnunet.org/v22 and https://bootstrap.gnunet.org/latest for the release and development version (git head), respectively.
  • gnunet-hello : A new CLI to import/export connectivity information (HELLOs) of peers manually.
  • namestore : Significant zone import performance improvements in preparation for DNS TLD mirror deployments (.se, .nu, etc) .
  • messenger :
    • Implementation of discourse subscriptions for live data streaming in chat rooms.
    • New functionality in CLI for the Messenger service to stream data via standard input and output.
  • Build System :
    • Build variant to build a monolithic GNUnet library.
    • Cross compile the monolithic library for use on Android devices. An Android prototype can be found in this repository.
Known Issues
  • There are known major design issues in the CORE subsystems which will need to be addressed in the future to achieve acceptable usability, performance and security.
  • There are known moderate implementation limitations in CADET that negatively impact performance.
  • There are known moderate design issues in FS that also impact usability and performance.
  • There are minor implementation limitations in SET that create unnecessary attack surface for availability.
  • The RPS subsystem remains experimental.

In addition to this list, you may also want to consult our bug tracker at bugs.gnunet.org which lists about 190 more specific issues.

Thanks

This release was the work of many people. The following people contributed code and were thus easily identified: Christian Grothoff, t3sserakt, TheJackiMonster, Pedram Fardzadeh, Shichao, fence, dvn, nullptrderef and Martin Schanzenbach.

Categories: FLOSS Project Planets

screen @ Savannah: GNU Screen v.5.0.0 is released

GNU Planet! - Wed, 2024-08-28 17:41

Screen is a full-screen window manager that multiplexes a physical
terminal between several processes, typically interactive shells.

The 5.0.0 release includes the following changes to the previous
release 4.9.1:

  • Rewritten authentication mechanism
  • Add escape %T to show current tty for window
  • Add escape %O to show number of currently open windows
  • Use wcwdith() instead of UTF-8 hard-coded tables
  • New commands:

  - auth [on|off]
    Provides password protection
  - status [top|up|down|bottom] [left|right]
    The status window by default is in bottom-left corner.
    This command can move status messages to any corner of the screen.
  - truecolor [on|off]
  - multiinput
    Input to multiple windows at the same time

  • Removed commands:

  - time
  - debug
  - password
  - maxwin
  - nethack

  • Fixes:

  - Screen buffers ESC keypresses indefinitely
  - Crashes after passing through a zmodem transfer
  - Fix double -U issue

Release is available for download:
https://ftp.gnu.org/gnu/screen/

Please report any bugs or regressions.
Thanks to everyone who contributed to this release.

Cheers,
Alex

Categories: FLOSS Project Planets

Python Morsels: Arithmetic in Python

Planet Python - Wed, 2024-08-28 17:10

An explanation of Python's two number types (integers and floating point numbers), supported arithmetic operations, and an explanation of operator precedence.

Table of contents

  1. Integers
  2. Floating point numbers
  3. Mixing integers and floating point numbers
  4. Arithmetic operations
  5. Operator precedence in Python
  6. Arithmetic in Python is similar to in math

Integers

Integers are used for representing whole numbers.

>>> 5 5 >>> 0 0 >>> 999999999999 999999999999 >>> -10 -10

Any number that doesn't have a decimal point in it is an integer.

Floating point numbers

Floating point numbers are used …

Read the full article: https://www.pythonmorsels.com/arithmetic-in-python/
Categories: FLOSS Project Planets

Drupalize.Me: We Updated the Drupal User Guide for Drupal 11

Planet Drupal - Wed, 2024-08-28 16:35
We Updated the Drupal User Guide for Drupal 11

Drupal 11 was released recently. Yay. And with it comes a bunch of minor (and sometimes major) changes to the way Drupal works and the need to update the documentation to reflect those changes.

joe Wed, 08/28/2024 - 15:35
Categories: FLOSS Project Planets

Mike Herchel's Blog: Five Ideas for the Drupal Association

Planet Drupal - Wed, 2024-08-28 15:50
Five Ideas for the Drupal Association mherchel Wed, 08/28/2024 - 15:50
Categories: FLOSS Project Planets

Tag1 Consulting: Tag1 Is Heading to Barcelona - Join Us at DrupalCon Europe 2024!

Planet Drupal - Wed, 2024-08-28 13:14

Exciting news! Tag1 Consulting is proud to be a module sponsor at DrupalCon Barcelona 2024. Join us from September 24-27 for four days of Drupal innovation, collaboration, and community spirit. Our team will be presenting on Gander, Drupal Core development, LMS, DDEV, and more.

Read more Hank Wed, 08/28/2024 - 10:14
Categories: FLOSS Project Planets

Tag1 Consulting: Migrating Your Data from D7 to D10:Migrating field storage and instance settings

Planet Drupal - Wed, 2024-08-28 10:41

In this article, we delve into the process of migrating Drupal fields, building on the knowledge from previous discussions about Drupal fields and their database structures. We begin by addressing the two key components of field migrations: storage and instance settings. This is the first step in a multi-stage migration process that will ultimately involve four different migrations.

Read more mauricio Wed, 08/28/2024 - 07:41
Categories: FLOSS Project Planets

Real Python: Web Scraping With Scrapy and MongoDB

Planet Python - Wed, 2024-08-28 10:00

Scrapy is a robust Python web scraping framework that can manage requests asynchronously, follow links, and parse site content. To store scraped data, you can use MongoDB, a scalable NoSQL database, that stores data in a JSON-like format. Combining Scrapy with MongoDB offers a powerful solution for web scraping projects, leveraging Scrapy’s efficiency and MongoDB’s flexible data storage.

In this tutorial, you’ll learn how to:

  • Set up and configure a Scrapy project
  • Build a functional web scraper with Scrapy
  • Extract data from websites using selectors
  • Store scraped data in a MongoDB database
  • Test and debug your Scrapy web scraper

If you’re new to web scraping and you’re looking for flexible and scalable tooling, then this is the right tutorial for you. You’ll also benefit from learning this tool kit if you’ve scraped sites before, but the complexity of your project has outgrown using Beautiful Soup and Requests.

To get the most out of this tutorial, you should have basic Python programming knowledge, understand object-oriented programming, comfortably work with third-party packages, and be familiar with HTML and CSS.

By the end, you’ll know how to get, parse, and store static data from the Internet, and you’ll be familiar with several useful tools that allow you to go much deeper.

Get Your Code: Click here to download the free code that shows you how to gather Web data with Scrapy and MongoDB.

Take the Quiz: Test your knowledge with our interactive “Web Scraping With Scrapy and MongoDB” quiz. You’ll receive a score upon completion to help you track your learning progress:

Interactive Quiz

Web Scraping With Scrapy and MongoDB

In this quiz, you'll test your understanding of web scraping with Scrapy and MongoDB. You'll revisit how to set up a Scrapy project, build a functional web scraper, extract data from websites, store scraped data in MongoDB, and test and debug your Scrapy web scraper.

Prepare the Scraper Scaffolding

You’ll start by setting up the necessary tools and creating a basic project structure that will serve as the backbone for your scraping tasks.

While working through the tutorial, you’ll build a complete web scraping project, approaching it as an ETL (Extract, Transform, Load) process:

  • Extract data from the website using a Scrapy spider as your web crawler.
  • Transform this data, for example by cleaning or validating it, using an item pipeline.
  • Load the transformed data into a storage system like MongoDB with an item pipeline.

Scrapy provides scaffolding for all of these processes, and you’ll tap into that scaffolding to learn web scraping following the robust structure that Scrapy provides and that numerous enterprise-scale web scraping projects rely on.

Note: In a Scrapy web scraping project, a spider is a Python class that defines how to crawl a specific website or a group of websites. It contains the logic for making requests, parsing responses, and extracting the desired data.

First, you’ll install Scrapy and create a new Scrapy project, then explore the auto-generated project structure to ensure that you’re well-equipped to proceed with building a performant web scraper.

Install the Scrapy Package

To get started with Scrapy, you first need to install it using pip. Create and activate a virtual environment to keep the installation separate from your global Python installation. Then, you can install Scrapy:

Shell (venv) $ python -m pip install scrapy Copied!

After the installation is complete, you can verify it by running the scrapy command and viewing the output:

Shell (venv) $ scrapy Scrapy 2.11.2 - no active project Usage: scrapy <command> [options] [args] Available commands: bench Run quick benchmark test fetch Fetch a URL using the Scrapy downloader genspider Generate new spider using pre-defined templates runspider Run a self-contained spider (without creating a project) settings Get settings values shell Interactive scraping console startproject Create new project version Print Scrapy version view Open URL in browser, as seen by Scrapy [ more ] More commands available when run from project directory Use "scrapy <command> -h" to see more info about a command Copied!

The command-line (CLI) program should display the help text of Scrapy. This confirms that you installed the package correctly. You’ll next run the highlighted startproject command to create a project.

Create a Scrapy Project

Scrapy is built around projects. Generally, you’ll create a new project for each web scraping project that you’re working on. In this tutorial, you’ll work on scraping a website called Books to Scrape, so you can call your project books.

As you may have already identified in the help text, the framework provides a command to create a new project:

Shell (venv) $ scrapy startproject books Copied! Read the full article at https://realpython.com/web-scraping-with-scrapy-and-mongodb/ »

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

The Drop Times: Drupal GovCon: Empowering Site Builders and Leading with Integrity

Planet Drupal - Wed, 2024-08-28 09:15
Drupal GovCon wrapped up with actionable insights from Rod Martin on non-coding solutions for site builders and Bree Benesh on navigating leadership challenges. Explore their sessions for practical strategies to apply in your Drupal projects.
Categories: FLOSS Project Planets

Django Weblog: Could you host DjangoCon Europe 2026? Call for organizers

Planet Python - Wed, 2024-08-28 08:59

We are looking for the next group of organizers to own and lead the 2026 DjangoCon Europe conference. Could your town - or your football stadium, circus tent, private island or city hall - host this wonderful community event?

DjangoCon Europe is a major pillar of the Django community, as people from across the world meet and share. This includes many qualities that make it a unique event - unconventional and conventional venues, creative happenings, a feast of talks and a dedication to inclusion and diversity.

Hosting a DjangoCon is an ambitious undertaking. It's hard work, but each year it has been successfully run by a team of community volunteers, not all of whom have had previous experience - more important is enthusiasm, organizational skills, the ability to plan and manage budgets, time and people - and plenty of time to invest in the project.

For 2026, we want to kickstart the organization much earlier than in previous years to allow more flexibility for the organizing team, and open up more opportunities for support from our DjangoCon Europe support working group.

Step 1: Submit your expression of interest

If you’re considering organizing DjangoCon Europe (🙌 great!), fill in our DjangoCon Europe 2026 expression of interest form with your contact details. No need to fill in all the information at this stage if you don’t have it all already, we’ll reach out and help you figure it out.

Express your interest in organizing

Step 2: We’re here to help!

We've set up a DjangoCon Europe support working group of previous organizers that you can reach out to with questions about organizing and running a DjangoCon Europe.

The group will be in touch with everyone submitting the expression of interest form, or you can reach out to them directly: european-organizers-support@djangoproject.com

We'd love to hear from you as soon as possible, so your proposal can be finalized and sent to the DSF board by October 6th 2024. The selected hosts will be publicly announced at DjangoCon Europe 2025 by the current organizers.

Step 3: Submitting the proposal

The more detailed and complete your final proposal is, the better. Basic details include:

  • Organizing committee members: You won’t have a full team yet, probably, naming just some core team members is enough.
  • The legal entity that is intended to run the conference: Even if the entity does not exist yet, please share how you are planning to set it up.
  • Dates: See “What dates are possible in 2026?” below. We must avoid conflicts with major holidays, EuroPython, DjangoCon US, and PyCon US.
  • Venue(s), including size, number of possible attendees, pictures, accessibility concerns, catering, etc.
  • Transport links and accommodation: Can your venue be reached by international travelers?
  • Budgets and ticket prices: Talk to the DjangoCon Europe Support group to get help with this, including information on past event budgets.

We also like to see:

  • Timelines
  • Pictures
  • Plans for online participation, and other ways to make the event more inclusive and reduce its environmental footprint
  • Draft agreements with providers
  • Alternatives you have considered

Have a look at our proposed DjangoCon Europe 2026 Licensing Agreement for the fine print on contractual requirements and involvement of the Django Software Foundation.

Submit your completed proposal by October 6th 2024 via our DjangoCon Europe 2026 expression of interest form, this time filling in as many fields as possible. We look forward to reviewing great proposals that continue the excellence the whole community associates with DjangoCon Europe.

Q&A Can I organize a conference alone?

We strongly recommend that a team of people submit an application.

I/we don’t have a legal entity yet, is that a problem?

Depending on your jurisdiction, this is usually not a problem. But please share your plans about the entity you will use or form in your application.

Do I/we need experience with organizing conferences?

The support group is here to help you succeed. From experience, we know that many core groups of 2-3 people have been able to run a DjangoCon with guidance from previous organizers and help from volunteers.

What is required in order to announce an event?

Ultimately, a contract with the venue confirming the dates is crucial, since announcing a conference makes people book calendars, holidays, buy transportation and accommodation etc. This, however, would only be relevant after the DSF board has concluded the application process. Naturally, the application itself cannot contain any guarantees, but it’s good to check concrete dates with your venues to ensure they are actually open and currently available, before suggesting these dates in the application.

Do we have to do everything ourselves?

No. You will definitely be offered lots of help by the community. Typically, conference organizers will divide responsibilities into different teams, making it possible for more volunteers to join. Local organizers are free to choose which areas they want to invite the community to help out with, and a call will go out through a blog post announcement on djangoproject.com and social media.

What kind of support can we expect from the Django Software Foundation?

The DSF regularly provides grant funding to DjangoCon organizers, to the extent of $6,000 in recent editions. We also offer support via specific working groups:

In addition, a lot of Individual Members of the DSF regularly volunteer at community events. If your team aren’t Individual Members, we can reach out to them on your behalf to find volunteers.

What dates are possible in 2026?

For 2026, DjangoCon Europe should happen between January 5th and April 27th, or June 4th and June 28th. This is to avoid the following community events’ provisional dates:

  • PyCon US 2026: May 2026
  • EuroPython 2026: July 2026
  • DjangoCon US 2026: September - October 2026
  • DjangoCon Africa 2026: August - September 2026

We also want to avoid the following holidays:

  • New Year's Day: Wednesday 1st January 2026
  • Chinese New Year: Tuesday 17th February 2026
  • Eid Al-Fitr: Friday 20th March 2026
  • Passover: Wednesday 1st - Thursday 9th April 2026
  • Easter: Sunday 5th April 2026
  • Eid Al-Adha: Tuesday 26th - Friday 29th May 2026
  • Rosh Hashanah: Friday 11th - Sunday 13th September 2026
  • Yom Kippur: Sunday 20th - Monday 21st September 2026
What cities or countries are possible?

Any city in Europe. This can be a city or country where DjangoCon Europe has happened in the past (Vigo, Edinburgh, Porto, Copenhagen, Heidelberg, Florence, Budapest, Cardiff, Toulon, Warsaw, Zurich, Amsterdam, Berlin), or a new locale.

References Past calls
Categories: FLOSS Project Planets

PyPy: PyPy v7.3.17 release

Planet Python - Wed, 2024-08-28 08:22
PyPy v7.3.17: release of python 2.7 and 3.10

The PyPy team is proud to release version 7.3.17 of PyPy.

This release includes a new RISC-V JIT backend, an improved REPL based on work by the CPython team, and better JIT optimizations of integer operations. Special shout-outs to Logan Chien for the RISC-V backend work, to Nico Rittinghaus for better integer optimization in the JIT, and the CPython team that has worked on the repl.

The release includes two different interpreters:

  • PyPy2.7, which is an interpreter supporting the syntax and the features of Python 2.7 including the stdlib for CPython 2.7.18+ (the + is for backported security updates)

  • PyPy3.10, which is an interpreter supporting the syntax and the features of Python 3.10, including the stdlib for CPython 3.10.14.

The interpreters are based on much the same codebase, thus the dual release. This is a micro release, all APIs are compatible with the other 7.3 releases. It follows after 7.3.16 release on April 23, 2024.

We recommend updating. You can find links to download the releases here:

https://pypy.org/download.html

We would like to thank our donors for the continued support of the PyPy project. If PyPy is not quite good enough for your needs, we are available for direct consulting work. If PyPy is helping you out, we would love to hear about it and encourage submissions to our blog via a pull request to https://github.com/pypy/pypy.org

We would also like to thank our contributors and encourage new people to join the project. PyPy has many layers and we need help with all of them: bug fixes, PyPy and RPython documentation improvements, or general help with making RPython's JIT even better.

If you are a python library maintainer and use C-extensions, please consider making a HPy / CFFI / cppyy version of your library that would be performant on PyPy. In any case, both cibuildwheel and the multibuild system support building wheels for PyPy.

RISC-V backend for the JIT

PyPy's JIT has added support for generating 64-bit RISC-V machine code at runtime (RV64-IMAD, specifically). So far we are not releasing binaries for any RISC-V platforms, but there are instructions on how to cross-compile binaries.

REPL Improvements

The biggest user-visible change of the release is new features in the repl of PyPy3.10. CPython 3.13 has adopted and extended PyPy's pure-Python repl, adding a number of features and fixing a number or bugs in the process. We have backported and added the following features:

  • Prompts and tracebacks use terminal colors, as well as terminal hyperlinks for file names.

  • Bracketed paste enable pasting several lines of input into the terminal without auto-indentation getting in the way.

  • A special interactive help browser (F1), history browser (F2), explicit paste mode (F3).

  • Support for Ctrl-<left/right> to jump over whole words at a time.

See the CPython documentation for further details. Thanks to Łukasz Langa, Pablo Galindo Salgado and the other CPython devs involved in this work.

Better JIT optimizations of integer operations

The optimizers of PyPy's JIT have become much better at reasoning about and optimizing integer operations. This is done with a new "knownbits" abstract domain. In many programs that do bit-manipulation of integers, some of the bits of the integer variables of the program can be statically known. Here's a simple example:

x = a | 1 ... if x & 1: ... else: ...

With the new abstract domain, the JIT can optimize the if-condition to True, because it already knows that the lowest bit of x must be set. This optimization applies to all Python-integers that fit into a machine word (PyPy optimistically picks between two different representations for int, depending on the size of the value). Unfortunately there is very little impact of this change on almost all Python code, because intensive bit-manipulation is rare in Python. However, the change leads to significant performance improvements in Pydrofoil (the RPython-based RISC-V/ARM emulators that are automatically generated from high-level Sail specifications of the respective ISAs, and that use the RPython JIT to improve performance).

PyPy versions and speed.pypy.org

The keen-eyed will have noticed no mention of Python version 3.9 in the releases above. Typically we will maintain only one version of Python3, but due to PyPy3.9 support on conda-forge we maintained multiple versions from the first release of PyPy3.10 in PyPy v7.3.12 (Dec 2022). Conda-forge is sunsetting its PyPy support, which means we can drop PyPy3.9. Since that was the major driver of benchmarks at https://speed.pypy.org, we revamped the site to showcase PyPy3.9, PyPy3.10, and various versions of cpython on the home page. For historical reasons, the "baseline" for comparison is still cpython 3.7.19.

We will keep the buildbots building PyPY3.9 until the end of August, these builds will still be available on the nightly builds tab of the buildbot.

What is PyPy?

PyPy is a Python interpreter, a drop-in replacement for CPython It's fast (PyPy and CPython performance comparison) due to its integrated tracing JIT compiler.

We also welcome developers of other dynamic languages to see what RPython can do for them.

We provide binary builds for:

  • x86 machines on most common operating systems (Linux 32/64 bits, Mac OS 64 bits, Windows 64 bits)

  • 64-bit ARM machines running Linux (aarch64) and macos (macos_arm64).

PyPy supports Windows 32-bit, Linux PPC64 big- and little-endian, Linux ARM 32 bit, RISC-V RV64IMAFD Linux, and s390x Linux but does not release binaries. Please reach out to us if you wish to sponsor binary releases for those platforms. Downstream packagers provide binary builds for debian, Fedora, conda, OpenBSD, FreeBSD, Gentoo, and more.

What else is new?

For more information about the 7.3.17 release, see the full changelog.

Please update, and continue to help us make pypy better.

Cheers, The PyPy Team

Categories: FLOSS Project Planets

Qt for MCUs 2.5.4 LTS Released

Planet KDE - Wed, 2024-08-28 08:13

Qt for MCUs 2.5.4 LTS (Long-Term Support) has been released and is available for download. This patch release provides bug fixes and other improvements while maintaining source compatibility with Qt for MCUs 2.5. It does not add any new functionality.

Categories: FLOSS Project Planets

Pages