Feeds
Python Software Foundation: Python Developers Survey 2023 Results
We are excited to share the results of the seventh official annual Python Developers Survey. This survey is done yearly as a collaborative effort between the Python Software Foundation and JetBrains. Responses were collected from November 2023 through February 2024. This year, we kept the response period open longer to facilitate as much global representation as possible. More than 25,000 Python developers and enthusiasts from almost 200 countries and regions participated in the survey to reveal the current state of the language and the ecosystem around it.
Check out the survey results!
The survey aims to map the Python landscape and covers the following topics:
- General Python usage
- Purpose for using Python
- Python versions
- Frameworks and Libraries
- Cloud Platforms
- Data science
- Development tools
- Python packaging
- Demographics
We encourage you to check out the methodology and the raw data for this year's Python Developers Survey, as well as those from past years (2022, 2021, 2020, 2019, 2018, and 2017). We would love to hear about what you learn by digging into the numbers! Share your results and comments with us on social media by mentioning JetBrains (LinkedIn, X) and the PSF (Mastodon, LinkedIn, X) using the #pythondevsurvey hashtag. Based on the feedback we received last year, we made adjustments to the 2023 survey, so we welcome suggestions and feedback that could help us improve again for next year!
PyPy: Guest Post: How PortaOne uses PyPy for high-performance processing, connecting over 1B of phone calls every month
The PyPy project is always happy to hear about industrial use and deployments of PyPy. For the GC bug finding task earlier this year, we collaborated with PortaOne and we're super happy that Serhii Titov, head of the QA department at PortaOne, was up to writing this guest post to describe their use and experience with the project.
What does PortaOne do?
We at PortaOne Inc. allow telecom operators to launch new services (or provide existing services more efficiently) using our VoIP platform (PortaSIP) and our real-time charging system (PortaBilling), which provides additional features for cloud PBX, such as call transfer, queues, interactive voice response (IVR) and more. At this moment our support team manages several thousand servers with our software installed in 100 countries, through which over 500 telecommunication service providers connect millions of end users every day. The unique thing about PortaOne is that we supply the source code of our product to our customers - something unheard of in the telecom world! Thus we attract "telco innovators", who use our APIs to build around the system and the source code to create unique tweaks of functionality, which produces amazing products.
At the core of PortaSIP is the middle-ware component (the proper name for it is "B2BUA", but that probably does not say much to anyone outside of experts in VoIP), which implements the actual handling of SIP calls, messages, etc. and all added features (for instance, trying to send a call via telco operators through which the cost per minute is lower). It has to be fast (since even a small delay in establishing a call is noticed by a customer), reliable (everyone hates when a call drops or cannot be completed) and yet easily expandable with new functionality. This is why we decided to use Python as opposed to C/C++ or similar programming languages, which are often used in telecom equipment.
The B2BUA component is a batch of similar Python processes that are looped inside an asyncore.dispatcher wrapper. The load balancing between these Python processes is done by our stateless SIP proxy server written in C++. All our sockets are served by this B2BUA. We have our custom client-wrappers around pymysql, redis, cassandra-driver and requests to communicate with external services. Some of the Python processes use cffi wrappers around C-code to improve their performance (examples: an Oracle DB driver, a client to a radius server, a custom C logger).
The I/O operations that block the main thread of the Python processes are processed in sub-threads. We have custom wrappers around threading.Thread and also asyncore.dispatcher. The results of such operations are returned to the main thread.
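The pattern described above (blocking I/O pushed into sub-threads, results handed back to the main thread) can be sketched roughly as follows. This is a minimal illustration, not PortaOne's wrappers: `blocking_lookup`, `run_in_subthread` and `main_loop` are hypothetical names, and the sketch uses only `threading` and `queue` because `asyncore` was removed from the standard library in Python 3.12.

```python
import queue
import threading
import time


def blocking_lookup(key):
    """Hypothetical stand-in for a blocking call (database query, HTTP request)."""
    time.sleep(0.05)
    return f"value-for-{key}"


def run_in_subthread(func, arg, results):
    """Run func(arg) in a worker thread and post the outcome to `results`."""
    def worker():
        try:
            results.put((arg, func(arg), None))
        except Exception as exc:  # hand failures back instead of losing them
            results.put((arg, None, exc))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def main_loop(keys):
    """Main thread: start the blocking work, then collect results as they arrive.

    Assumes the keys are unique, since results are stored per key.
    """
    results = queue.Queue()
    for key in keys:
        run_in_subthread(blocking_lookup, key, results)

    collected = {}
    while len(collected) < len(keys):
        key, value, error = results.get(timeout=5)
        if error is not None:
            raise error
        collected[key] = value
    return collected
```

Calling `main_loop(["a", "b"])` returns a dictionary with one entry per key once both workers have reported back. The queue is the only shared object, so the main thread never touches state that a worker is still modifying.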
Improving our performance with PyPy
We started with CPython and then in 2014 switched to PyPy because it was faster. Here's an exact quote from our first testing notes: "PyPy gives significant performance boost, ~50%". Nowadays, after years of changes in all the software involved, PyPy still gives us a +50% boost compared to CPython.
Taking care of real time traffic for so many people around the globe is something we're really proud of. I hope the PyPy team can be proud of it as well, as the PyPy product is a part of this solution.
Finding a garbage collector bug: stage 1, the GC hooks
However, our path with PyPy wasn't perfectly smooth. There were very rare cases of crashes on PyPy that we weren't able to catch. That's because to make a coredump useful we needed to switch to PyPy with debug, but we cannot let it run in that mode on a production system for an extended period of time, and we did not have any STR (steps-to-reproduce) to make PyPy crash again in our lab. That's why we kept (and still keep) both interpreters installed just in case, and we would switch to CPython if we noticed it happening.
At the time of updating PyPy from 3.5 to 3.6 our QA started noticing those crashes more often, but we still had no luck with STR or collecting proper coredumps with debug symbols. Then it became even worse after our development played with the Garbage Collector's options to increase performance of our middleware component. The crashes started to affect our regular performance testing (controlled by QA manager Yevhenii Bovda). At that point it was decided that we can no longer live like that and so we started an intense investigation.
During the first stage of our investigation (following the best practice of troubleshooting) we narrowed down the issue as much as we could. So, it was not our code, it was definitely somewhere in PyPy. Eventually our SIP software engineer Yevhenii Yatchenko found out that this bug is connected with the use of our custom hooks in the GC. Yevhenii created ticket #4899 and within 2-3 days we got a fix from a member of the PyPy team, in true open-source fashion.
Finding a garbage collector bug: stage 2, the real bug
Then came stage 2. In parallel with the previous ticket, Yevhenii created #4900 for crashes that we still saw producing coredumps quite often and that were not connected to the custom GC hooks. In a nutshell, it took us dozens of back-and-forth emails, three Zoom sessions and four versions of a patch to solve the issue. During the last iteration we got a new set of options to try and a new version of the patch. Surprisingly, that helped! What a relief! So, the next logical step was to remove all debug options and run PyPy only with the patch. Unfortunately, it started to fail again and we came to the obvious conclusion that what helped us was not the patch, but one of the options we were testing out. At that point we found out that PYPY_GC_MAX_PINNED=0 is a necessary and sufficient condition to solve our issue. This points to another bug in the garbage collector, somehow related to object pinning.
Here's our current state: we have to add PYPY_GC_MAX_PINNED=0, but we do not face the crashes anymore.
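One way to apply such a setting consistently is to inject it into the environment of every interpreter process that gets launched. The sketch below is only an illustration under that assumption: the helper names are hypothetical, and the post does not say how PortaOne actually sets the variable.

```python
import os
import subprocess
import sys


def env_with_pinning_disabled(base_env=None):
    """Return a copy of the environment with PYPY_GC_MAX_PINNED set to 0."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYPY_GC_MAX_PINNED"] = "0"
    return env


def launch(command, base_env=None):
    """Run `command` (a list such as ["pypy3", "service.py"]) with the setting applied.

    The command shown in this docstring is a made-up example; pass whatever
    interpreter and script your deployment really uses.
    """
    return subprocess.run(
        command,
        env=env_with_pinning_disabled(base_env),
        capture_output=True,
        text=True,
        check=True,
    )
```

For a quick check without PyPy installed, `launch([sys.executable, "-c", "import os; print(os.environ['PYPY_GC_MAX_PINNED'])"])` runs a child process that echoes the variable it received.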
Conclusion and next steps
Gratitude is extended to Carl for his invaluable assistance in resolving the nasty bugs, because it seems we're the only ones who suffered from the last one and we really did not want to fall back to CPython due to its performance disadvantage.
Serhii Titov, head of the QA department at PortaOne Inc.
P.S. If you are a perfectionist and at this point you have mixed feelings and you are still bothered by the question "But there might still be a bug in the GC, what about that?" - Carl has some ideas about it and he will sort it out (we will help with the testing/verification part).
Six Tips for Maximizing Desktop Screen Potential
Desktop software has many differences from mobile and embedded applications but one of the biggest and most obvious is the screen. How can you take advantage of all that real estate for your application? Here are six considerations for managing the screen in your desktop application.
- Choosing the right GUI framework
Use a flexible GUI framework that easily supports building apps with differing resolutions. The less hand-tweaking you need for your dialogs, graphical assets, and interface across a wide spectrum of resolutions, the better.
- Testing across diverse displays
Make sure you have a wide range of monitors to test your app against. It’s easy to assume everything works perfectly when your app is tested on a uniform configuration provided by IT. However, visual issues may arise when your app runs on smaller, larger, or different monitors.
- Setting clear resolution standards
Set a clear minimum resolution that supports your application’s features effectively. Ensure all dialogs fit on screen, scrollbars function properly, and nothing is off-screen. Also, test against ultra-high-resolution monitors (like 4K and 8K) to ensure clarity and usability at high DPI settings. Verify that text is legible, controls are noticeable, and clickable regions are big enough to target accurately.
- Designing for adaptability
Ensure that your application’s user interface is not only scalable but also adaptable. It should reconfigure itself based on the resolution, maintaining a balance between functionality and aesthetics. Dialog boxes should be resizable, and layout managers should dynamically adjust component placement based on the available screen real estate.
- Embracing multi-monitor flexibility
Multi-monitor setups aren’t just for developers anymore. Many people use laptops along with a larger monitor. In fact, a two-screen configuration may be even more popular than single screens. Make sure your application handles this flexibility intelligently by allowing spawning windows or panels that can be moved to the monitor that best works for the user.
- Customizing the user workspace
With features such as dockable toolbars, multiple document interfaces, or floating inspectors, you can allow users to customize their workspace. This is particularly handy for apps like graphic design, audio/video editing, and software development, where distributing a wide variety of tools, dialogs, controls, and views across the entire screen real estate is essential.
Final thoughts
Designing and testing for multiple screen resolutions and configurations is part of making a great application. As screen technology evolves and user expectations rise, your application’s ability to harness the full potential of ultra-high-resolution and multi-monitor setups might just set it apart from the crowd. If you’re interested in more tips for building desktop applications, you may want to read our related best practice guide.
About KDAB
KDAB provides market leading software consulting and development services and training in Qt, C++ and 3D/OpenGL. Contact us.
The post Six Tips for Maximizing Desktop Screen Potential appeared first on KDAB.
GNU MediaGoblin: MediaGoblin 0.14.0
We're pleased to announce the release of GNU MediaGoblin 0.14.0. See the release notes for full details and upgrading instructions.
Highlights of this release are:
- Preliminary support for Docker installation
- Preliminary support for OS packaging on GNU Guix
- Major configure/build overhaul
- Extended configuration documentation
This version has been tested on Debian Bookworm (12), Ubuntu 20.04, Ubuntu 22.04, Ubuntu 24.04 and Fedora 39.
Thanks go to co-maintainer Olivier Mehani for his major contributions in this release!
To join us and help improve MediaGoblin, please visit our getting involved page.
Asking for donations in Plasma
Why do we ask for donations so often? Because it’s important! As KDE becomes more successful and an increasing number of people use our software, our costs grow as well:
- Web and server hosting
- Organizing and hosting larger Akademy events
- Funding more and larger sprints
- Paying people to work on stuff the volunteer community doesn’t focus on, and retaining them over time
And so on. If we don’t raise more money while our software becomes more popular, we risk “dying of success,” so to speak. Remember, we give all this stuff away for free, no strings attached! Nobody has to pay a cent if they don’t want to or can’t afford to.
Accordingly, if you’re plugged into KDE social media, you probably see a lot of requests for donations. I end every one of my “This Week in KDE” posts with one, and many others do for their own blog posts as well. KDE’s official social media channels blast it out constantly, and we also do yearly fundraisers that are widely promoted online. If you’re reading this, you may get the impression that we’re always begging for cash!
But if you’re not plugged into these communications channels, you might not have ever seen a donation request at all. We know that the fraction of people who subscribe to these channels is small, so there’s a huge number of people who may not even know they can donate to KDE, let alone that donations are critically important to its continued existence.
Starting in Plasma 6.2, that changes, and I’d like to introduce what it will look like! From 6.2 onwards, Plasma itself will show a system notification asking for a donation once per year, in December.
The idea here is to get the message that KDE really does need your financial help in front of more eyeballs — especially eyeballs not currently looking at KDE’s public-facing promotion efforts.
Now, I know that messages like this can be controversial! The change was carefully considered, and we tried our best to minimize the annoying-ness factor: It’s small and unobtrusive, and no matter what you do with it (click any button, close it, etc) it’ll go away until next year. It’s implemented as a KDE Daemon (KDED) module, which allows users and distributors to permanently disable it if they like. You can also disable just the popup on System Settings’ Notifications page, accessible from the configure button in the notification’s header.
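The behaviour described here (ask once a year, in December, and stay quiet until the following year whatever the user does) boils down to a small decision rule. The sketch below only mirrors that description; it is not the KDED module's code, and the function and parameter names are hypothetical.

```python
from datetime import date


def should_show_donation_request(today, last_shown_year, module_enabled=True):
    """Decide whether the yearly request should appear.

    `last_shown_year` is the year the notification was last shown (or None
    if it has never been shown). All names are illustrative assumptions.
    """
    if not module_enabled:   # users and distributors can switch it off entirely
        return False
    if today.month != 12:    # the request is only made in December
        return False
    # Any interaction dismisses it until next year, so one showing per year.
    return last_shown_year is None or last_shown_year < today.year
```

For example, `should_show_donation_request(date(2024, 12, 3), None)` allows the notification, while a second check later the same month with `last_shown_year=2024` does not.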
Ultimately the decision to do this came down to the following factors:
- We looked at FOSS peers like Thunderbird and Wikipedia which have similar things (and in Wikipedia’s case, the message is vastly more intrusive and naggy). In both cases, it didn’t drive everyone away and instead resulted in a massive increase in donations that the projects have been able to use to employ lots of people.
- KDE really needs something like this to help our finances grow sustainably in line with our userbase and adoption by vendors and distributors.
So now let me address what I anticipate will be some common concerns:
I think you’re wrong; people hate pop-ups and this is going to turn them off!
Like I said, peer organizations didn’t see that happen, and some were even more in-your-face about it. I do suspect a small but vocal crowd of people will spread doom and gloom about it on social media anyway, of course. This also happened when we implemented off-by-default telemetry, which, by the way, was implemented so conservatively that it barely collects any information of value at all. It’s a cautionary tale about the danger of being too timid and ending up with the worst of both worlds.
The worst-case scenario is that we don’t get more donations from this after a couple of years, and end up removing it. That’s always an option. But I think it’s worth venturing out there and being a bit bold! With risk comes opportunity.
KDE shouldn’t need to pay people directly; employment should come from vendors and distributors shipping our software!
To a certain extent this already does happen: by far the largest contributor of paid work is Blue Systems, mostly funded by Valve Corporation, which ships KDE software on the Steam Deck. There are also trickles and spurts of sponsored work from distros, KDAB, and enterprising folks who get funded via grants.
Ultimately a healthy economic ecosystem around KDE includes people employed by many parties, including KDE itself, in my opinion. This is how KDE can help control its own destiny. And that costs money! Money that needs to come from somewhere.
Why does it have to be a notification pop-up? Put this in Welcome Center or something!
We had a request for donations in Welcome Center for several years, and it didn’t make a difference, because right after installing the system isn’t the right time to ask. At that point, you don’t know if you like Plasma yet, so asking for money is premature.
If KDE is as successful as Thunderbird and Wikipedia have been, what are you going to do with all that money?
This is a question the KDE e.V. board of directors as a whole would need to answer, and any decision on it will be made collectively.
But as one of the five members on that board, I can tell you my personal answer and the one that, as your representative, I’d advocate for. It’s basically the platform I ran on two years ago: extend an offer of full-time employment to our current people, and hire even more! I want us to end up with paid QA people and distro developers, and even more software engineers. I want us to fund the creation of a next-generation KDE OS we can offer directly to institutions looking to switch to Linux, and a hardware certification program to go along with it. I want us to extend our promotional activities and outreach to other major distros and vendors and pitch our software to them directly. I want to see Ubuntu, Red Hat Enterprise Linux, and SUSE Linux Enterprise Desktop ship Plasma by default. I want us to use this money to take over the world, with freedom, empowerment, and kindness.
These have been dreams for a long time, and throughout KDE we’ve been slowly moving towards them over the years. With a lot more money, we can turbocharge the pace! If that stuff sounds good, you can start with a donation today.
I know talking about money can be awkward. But failure to plan is planning to fail; money is something we can’t ignore and just hope things work out — and we don’t. Raising more money is a part of that plan, and this new yearly donation notification is a part of raising money. It’s my expectation and hope that asking our users for donations will result in more donations, and that we can use these to accelerate KDE’s reach and the quality of our software!
Matt Layman: No Frills, Just Go: Standard Library Only Web Apps
Gizra.com: Drupal Core Contribution Guide
Michael Ablassmeier: proxmox backup S3 proxy
A few weeks ago, Tiziano Bacocco started a small project, pmoxs3backuproxy, to implement a proxy (written in Go) that stores Proxmox backups on S3-compatible storage, a feature the current backup server does not have.
I had wanted to look at the Proxmox Backup Server implementation for a while, so I jumped on the bandwagon and helped add most of the API endpoints required to use it as a seamless drop-in replacement in PVE.
The current version can be configured as a storage backend in PVE. You can then schedule your backups to the S3 storage as usual.
It now supports both the Fixed index format, required to create virtual machine backups, and the Dynamic index format, used by the regular proxmox-backup-client for file and container backups (both full and incremental).
The other endpoints, such as adding notes, removing or protecting backups, and mounting images using the PVE frontend (or proxmox-backup-client), work too. It comes with a garbage collector that prunes the backup storage when snapshots expire and runs integrity checks on the data.
You can also configure it as a so-called “remote” storage in the Proxmox Backup Server itself and pull back complete buckets using “proxmox-backup-manager pull” if your local datastore crashes.
I think it will become more interesting if future Proxmox versions allow pushing backups to other stores, too.
GNU Taler news: GNU Taler 0.13 released
GNUnet News: GNUnet 0.22.0
We are pleased to announce the release of GNUnet 0.22.0.
GNUnet is an alternative network stack for building secure, decentralized and
privacy-preserving distributed applications.
Our goal is to replace the old insecure Internet protocol stack.
Starting from an application for secure publication of files, it has grown to
include all kinds of basic protocol components and applications towards the
creation of a GNU internet.
This is a new major release. It breaks protocol compatibility with the 0.21.x versions. Please be aware that Git master is thus henceforth (and has been for a while) INCOMPATIBLE with the 0.21.x GNUnet network, and interactions between old and new peers will result in issues. In terms of usability, users should be aware that there are still a number of known open issues in particular with respect to ease of use, but also some critical privacy issues especially for mobile users. Also, the nascent network is tiny and thus unlikely to provide good anonymity or extensive amounts of interesting information. As a result, the 0.22.0 release is still only suitable for early adopters with some reasonable pain tolerance.
Download links
- gnunet-0.22.0.tar.gz ( signature )
- gnunet-0.22.0-meson.tar.gz ( signature ) NEW: Test tarball made using the meson build system.
- gnunet-gtk-0.22.0.tar.gz ( signature )
- gnunet-fuse-0.22.0.tar.gz ( signature )
The GPG key used to sign is: 3D11063C10F98D14BD24D1470B0998EF86F59B6A
Note that due to mirror synchronization, not all links might be functional early after the release. For direct access try http://ftp.gnu.org/gnu/gnunet/
Changes
A detailed list of changes can be found in the git log, the NEWS and the bug tracker. Noteworthy highlights are:
- transport:
  - A new experimental HTTP/3 communicator for peer-to-peer transport.
  - New experimental NAT traversal functionality.
- util:
  - An implementation of Hybrid Public Key Encryption (HPKE) and related KEMs, which are now used across the stack.
  - An implementation of Elligator, used as part of our Diffie-Hellman exchanges and KEMs.
- hostlist: The bootstrap URL is changed to https://bootstrap.gnunet.org/v22 and https://bootstrap.gnunet.org/latest for the release and development version (git head), respectively.
- gnunet-hello: A new CLI to import/export connectivity information (HELLOs) of peers manually.
- namestore: Significant zone import performance improvements in preparation for DNS TLD mirror deployments (.se, .nu, etc.).
- messenger:
  - Implementation of discourse subscriptions for live data streaming in chat rooms.
  - New functionality in CLI for the Messenger service to stream data via standard input and output.
- Build System:
  - Build variant to build a monolithic GNUnet library.
  - Cross compile the monolithic library for use on Android devices. An Android prototype can be found in this repository.
Known Issues
- There are known major design issues in the CORE subsystems which will need to be addressed in the future to achieve acceptable usability, performance and security.
- There are known moderate implementation limitations in CADET that negatively impact performance.
- There are known moderate design issues in FS that also impact usability and performance.
- There are minor implementation limitations in SET that create unnecessary attack surface for availability.
- The RPS subsystem remains experimental.
In addition to this list, you may also want to consult our bug tracker at bugs.gnunet.org which lists about 190 more specific issues.
Thanks
This release was the work of many people. The following people contributed code and were thus easily identified: Christian Grothoff, t3sserakt, TheJackiMonster, Pedram Fardzadeh, Shichao, fence, dvn, nullptrderef and Martin Schanzenbach.
screen @ Savannah: GNU Screen v.5.0.0 is released
Screen is a full-screen window manager that multiplexes a physical
terminal between several processes, typically interactive shells.
The 5.0.0 release includes the following changes to the previous
release 4.9.1:
- Rewritten authentication mechanism
- Add escape %T to show current tty for window
- Add escape %O to show number of currently open windows
- Use wcwidth() instead of UTF-8 hard-coded tables
- New commands:
  - auth [on|off]
    Provides password protection
  - status [top|up|down|bottom] [left|right]
    The status window by default is in bottom-left corner.
    This command can move status messages to any corner of the screen.
  - truecolor [on|off]
  - multiinput
    Input to multiple windows at the same time
- Removed commands:
  - time
  - debug
  - password
  - maxwin
  - nethack
- Fixes:
  - Screen buffers ESC keypresses indefinitely
  - Crashes after passing through a zmodem transfer
  - Fix double -U issue
Release is available for download:
https://ftp.gnu.org/gnu/screen/
Please report any bugs or regressions.
Thanks to everyone who contributed to this release.
Cheers,
Alex
Python Morsels: Arithmetic in Python
An explanation of Python's two number types (integers and floating point numbers), supported arithmetic operations, and an explanation of operator precedence.
Table of contents
- Integers
- Floating point numbers
- Mixing integers and floating point numbers
- Arithmetic operations
- Operator precedence in Python
- Arithmetic in Python is similar to in math
Integers
Integers are used for representing whole numbers.
>>> 5
5
>>> 0
0
>>> 999999999999
999999999999
>>> -10
-10
Any number that doesn't have a decimal point in it is an integer.
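As a brief illustration of the topics this article lists (the two number types, mixing them, and operator precedence), here is a small sketch. The variable names are arbitrary and chosen only for this example.

```python
# Whole numbers are int; a decimal point makes the literal a float.
whole = 5
fractional = 5.0
print(type(whole).__name__, type(fractional).__name__)

# Mixing an int and a float in one operation produces a float.
mixed = whole + 2.5
print(mixed, type(mixed).__name__)

# True division always gives a float; floor division of two ints gives an int.
halved = 7 / 2
floored = 7 // 2
print(halved, floored)

# Multiplication binds tighter than addition; parentheses override that.
without_parens = 2 + 3 * 4
with_parens = (2 + 3) * 4
print(without_parens, with_parens)
```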
Floating point numbers
Floating point numbers are used …
Read the full article: https://www.pythonmorsels.com/arithmetic-in-python/
Drupalize.Me: We Updated the Drupal User Guide for Drupal 11
Drupal 11 was released recently. Yay. And with it comes a bunch of minor (and sometimes major) changes to the way Drupal works and the need to update the documentation to reflect those changes.
joe Wed, 08/28/2024 - 15:35
Mike Herchel's Blog: Five Ideas for the Drupal Association
Tag1 Consulting: Tag1 Is Heading to Barcelona - Join Us at DrupalCon Europe 2024!
Exciting news! Tag1 Consulting is proud to be a module sponsor at DrupalCon Barcelona 2024. Join us from September 24-27 for four days of Drupal innovation, collaboration, and community spirit. Our team will be presenting on Gander, Drupal Core development, LMS, DDEV, and more.
Hank Wed, 08/28/2024 - 10:14
Tag1 Consulting: Migrating Your Data from D7 to D10: Migrating field storage and instance settings
In this article, we delve into the process of migrating Drupal fields, building on the knowledge from previous discussions about Drupal fields and their database structures. We begin by addressing the two key components of field migrations: storage and instance settings. This is the first step in a multi-stage migration process that will ultimately involve four different migrations.
mauricio Wed, 08/28/2024 - 07:41
Real Python: Web Scraping With Scrapy and MongoDB
Scrapy is a robust Python web scraping framework that can manage requests asynchronously, follow links, and parse site content. To store scraped data, you can use MongoDB, a scalable NoSQL database, that stores data in a JSON-like format. Combining Scrapy with MongoDB offers a powerful solution for web scraping projects, leveraging Scrapy’s efficiency and MongoDB’s flexible data storage.
In this tutorial, you’ll learn how to:
- Set up and configure a Scrapy project
- Build a functional web scraper with Scrapy
- Extract data from websites using selectors
- Store scraped data in a MongoDB database
- Test and debug your Scrapy web scraper
If you’re new to web scraping and you’re looking for flexible and scalable tooling, then this is the right tutorial for you. You’ll also benefit from learning this tool kit if you’ve scraped sites before, but the complexity of your project has outgrown using Beautiful Soup and Requests.
To get the most out of this tutorial, you should have basic Python programming knowledge, understand object-oriented programming, comfortably work with third-party packages, and be familiar with HTML and CSS.
By the end, you’ll know how to get, parse, and store static data from the Internet, and you’ll be familiar with several useful tools that allow you to go much deeper.
Prepare the Scraper Scaffolding
You’ll start by setting up the necessary tools and creating a basic project structure that will serve as the backbone for your scraping tasks.
While working through the tutorial, you’ll build a complete web scraping project, approaching it as an ETL (Extract, Transform, Load) process:
- Extract data from the website using a Scrapy spider as your web crawler.
- Transform this data, for example by cleaning or validating it, using an item pipeline.
- Load the transformed data into a storage system like MongoDB with an item pipeline.
Scrapy provides scaffolding for all of these processes, and you’ll tap into that scaffolding to learn web scraping following the robust structure that Scrapy provides and that numerous enterprise-scale web scraping projects rely on.
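To make the transform and load steps more concrete, here is a minimal sketch of two item pipelines. Scrapy pipelines are plain classes with a `process_item(item, spider)` method, so the sketch needs no imports. Everything else is an assumption made for illustration: the class names, the sample item, the in-memory list standing in for a MongoDB collection, and the small `run_pipelines` driver that imitates what Scrapy does when a project is running.

```python
class PriceCleaningPipeline:
    """Transform step: turn a price string such as '£51.77' into a float."""

    def process_item(self, item, spider):
        item["price"] = float(item["price"].lstrip("£$€"))
        return item


class InMemoryStoragePipeline:
    """Load step: an in-memory stand-in for a MongoDB-backed pipeline."""

    def open_spider(self, spider):
        # A real pipeline would open the database connection here.
        self.collection = []

    def process_item(self, item, spider):
        self.collection.append(dict(item))
        return item


def run_pipelines(items, pipelines, spider=None):
    """Hypothetical driver: pass each item through every pipeline in order."""
    for pipeline in pipelines:
        if hasattr(pipeline, "open_spider"):
            pipeline.open_spider(spider)
    for item in items:
        for pipeline in pipelines:
            item = pipeline.process_item(item, spider)
    return pipelines
```

In a real project you would not call a driver like this yourself. Pipelines are enabled through the `ITEM_PIPELINES` setting, and Scrapy feeds them the items that your spider yields.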
Note: In a Scrapy web scraping project, a spider is a Python class that defines how to crawl a specific website or a group of websites. It contains the logic for making requests, parsing responses, and extracting the desired data.
First, you’ll install Scrapy and create a new Scrapy project, then explore the auto-generated project structure to ensure that you’re well-equipped to proceed with building a performant web scraper.
Install the Scrapy Package
To get started with Scrapy, you first need to install it using pip. Create and activate a virtual environment to keep the installation separate from your global Python installation. Then, you can install Scrapy:
(venv) $ python -m pip install scrapy
After the installation is complete, you can verify it by running the scrapy command and viewing the output:
(venv) $ scrapy
Scrapy 2.11.2 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  bench         Run quick benchmark test
  fetch         Fetch a URL using the Scrapy downloader
  genspider     Generate new spider using pre-defined templates
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

  [ more ]      More commands available when run from project directory

Use "scrapy <command> -h" to see more info about a command
The command-line (CLI) program should display the help text of Scrapy. This confirms that you installed the package correctly. You’ll next run the startproject command to create a project.
Create a Scrapy Project
Scrapy is built around projects. Generally, you’ll create a new project for each web scraping project that you’re working on. In this tutorial, you’ll work on scraping a website called Books to Scrape, so you can call your project books.
As you may have already identified in the help text, the framework provides a command to create a new project:
    (venv) $ scrapy startproject books

Read the full article at https://realpython.com/web-scraping-with-scrapy-and-mongodb/
Ezequiel Lanza: Voices of the Open Source AI Definition
The Open Source Initiative (OSI) is running a blog series to introduce some of the people who have been actively involved in the Open Source AI Definition (OSAID) co-design process. The co-design methodology allows for the integration of diverging perspectives into one just, cohesive and feasible standard. Support and contribution from a significant and broad group of stakeholders is imperative to the Open Source process and is proven to bring diverse issues to light, deliver swift outputs and garner community buy-in.
This series features the voices of the volunteers who have helped shape and are shaping the Definition.
Meet Ezequiel Lanza
What’s your background related to Open Source and AI?
I’ve been working in AI for more than 10 years (Yes, before ChatGPT!). With a background in engineering, I’ve consistently focused on building and supporting AI applications, particularly in machine learning and data science. Over the years, I’ve contributed to and collaborated on various projects. A few years ago, I decided to pursue a master’s in data science to deepen my theoretical knowledge and further enhance my skills. Open Source has also been a significant part of my work; the frameworks, tools and community have continually drawn me in, making me an active participant in this evolving conversation for years.
What motivated you to join this co-design process to define Open Source AI?
AI owes much of its progress to Open Source, and it’s essential for continued innovation. My experience in both AI and Open Source spans many years, and I believe this co-design process offers a unique chance to contribute meaningfully. It’s not just about sharing my insights but also about learning from other professionals across AI and different disciplines. This collective knowledge and diverse perspectives make this initiative truly powerful and enriching, to shape the future of Open Source AI together.
Can you describe your experience participating in this process? What did you most enjoy about it, and what were some of the challenges you faced?
Participating in this process has been both rewarding and challenging. I’ve particularly enjoyed engaging with diverse groups and hearing different perspectives. The in-person events, such as All Things Open in Raleigh in 2023, have been valuable for fostering direct collaboration and building relationships. However, balancing these meetings with my work duties has been challenging. Coordinating schedules and managing time effectively to attend all the relevant discussions can be demanding. Despite these challenges, the insights and progress have made the effort worthwhile.
Why do you think AI should be Open Source?
We often say AI is everywhere, and while that’s partially true, I believe AI will be everywhere, significantly impacting our lives. However, AI’s full potential can only be realized if it is open and accessible to everyone. Open Source AI should also foster innovation by enabling developers and researchers from all backgrounds to contribute to and improve existing models, frameworks and tools, allowing freedom of expression. Without open access, involvement in AI can be costly, limiting participation to only a few large companies. Open Source AI should aim to democratize access, allowing small businesses, startups and individuals to leverage powerful tools that might otherwise be out of reach due to cost or proprietary barriers.
What do you think is the role of data in Open Source AI?
Data is essential for any AI system. Initially, from my ML-biased perspective, I saw open and accessible datasets as crucial for effective ML development. However, I’ve reevaluated this perspective, considering how to adapt the system while staying true to Open Source principles. As AI models, particularly GenAI like LLMs, become increasingly complex, I’ve come to value the models themselves. For example, Generative AI requires vast amounts of data, and gaining access to this data can be a significant challenge.
This insight has led me to consider what I—whether as a researcher, developer or user—truly need from a model to use/investigate it effectively. While understanding the data used in training is important, having access to specific datasets may not always be necessary. In approaches like federated learning, the model itself can be highly valuable while keeping data private, though understanding the nature of the data remains important. For LLMs, techniques such as fine-tuning, RAG and RAFT emphasize the benefits of accessing the model rather than the original dataset, providing substantial advantages to the community.
Sharing model architecture and weights is crucial, and data security can be maintained through methods like model introspection and fine-tuning, reducing the need for extensive dataset sharing.
Data is undoubtedly a critical component. However, if the essence of Open Source AI lies in ensuring transparency, then the focus should be on how data is used in training models. Documenting which datasets were used and the data handling processes is essential. This transparency helps the community understand the origins of the data, assess potential biases and ensure the responsible use of data in model development. While sharing the exact datasets may not always be necessary, providing clear information about data sources and usage practices is crucial for maintaining trust and integrity in Open Source AI.
Has your personal definition of Open Source AI changed along the way? What new perspectives or ideas did you encounter while participating in the co-design process?
Of course, it changed and evolved – that’s what a thought process is about. I’d be stubborn if I never changed my perspective along the way. I’ve often questioned even the most fundamental concepts I’ve relied on for years, avoiding easy or lazy assumptions. This thorough process has been essential in refining my understanding of Open Source AI. Engaging in meaningful exchanges with others has shown me the importance of practical definitions that can be implemented in real-world scenarios. While striving for an ideal, flawless definition is tempting, I’ve found that embracing a pragmatic approach is ultimately more beneficial.
What do you think the primary benefit will be once there is a clear definition of Open Source AI?
As I see it, the Open Source AI Definition will support the growth, and it will be the first big step. The primary benefit of having a clear definition of Open Source AI will be increased clarity and consistency in the field. This will enhance collaboration by setting clear standards and expectations for researchers, developers and organizations. It will also improve transparency by ensuring that AI models and tools genuinely follow Open Source principles, fostering trust in their development and sharing.
A clear definition will create standardized practices and guidelines, making it easier to evaluate and compare different Open Source AI projects.
What do you think are the next steps for the community involved in Open Source AI?
The next steps for the community should start with setting up a certification process for AI models to ensure they meet certain standards. This could include tools to help automate the process. After that, it would be helpful to offer templates and best practice guides for AI models. This will support model designers in creating high-quality, compliant systems and make the development process smoother and more consistent.
How to get involved
The OSAID co-design process is open to everyone interested in collaborating. There are many ways to get involved:
- Join the forum: share your comment on the drafts.
- Leave a comment on the latest draft: provide precise feedback on the text of the latest draft.
- Follow the weekly recaps: subscribe to our monthly newsletter and blog to be kept up-to-date.
- Join the town hall meetings: we’re increasing the frequency to weekly meetings where you can learn more, ask questions and share your thoughts.
- Join the workshops and scheduled conferences: meet the OSI and other participants at in-person events around the world.
The Drop Times: Drupal GovCon: Empowering Site Builders and Leading with Integrity
Django Weblog: Could you host DjangoCon Europe 2026? Call for organizers
We are looking for the next group of organizers to own and lead the 2026 DjangoCon Europe conference. Could your town - or your football stadium, circus tent, private island or city hall - host this wonderful community event?
DjangoCon Europe is a major pillar of the Django community, as people from across the world meet and share. This includes many qualities that make it a unique event - unconventional and conventional venues, creative happenings, a feast of talks and a dedication to inclusion and diversity.
Hosting a DjangoCon is an ambitious undertaking. It's hard work, but each year it has been successfully run by a team of community volunteers, not all of whom have had previous experience - more important is enthusiasm, organizational skills, the ability to plan and manage budgets, time and people - and plenty of time to invest in the project.
For 2026, we want to kickstart the organization much earlier than in previous years to allow more flexibility for the organizing team, and open up more opportunities for support from our DjangoCon Europe support working group.
Step 1: Submit your expression of interest
If you’re considering organizing DjangoCon Europe (🙌 great!), fill in our DjangoCon Europe 2026 expression of interest form with your contact details. There is no need to fill in all the information at this stage if you don’t have it yet; we’ll reach out and help you figure it out.
Express your interest in organizing
Step 2: We’re here to help!
We've set up a DjangoCon Europe support working group of previous organizers that you can reach out to with questions about organizing and running a DjangoCon Europe.
The group will be in touch with everyone submitting the expression of interest form, or you can reach out to them directly: european-organizers-support@djangoproject.com
We'd love to hear from you as soon as possible, so your proposal can be finalized and sent to the DSF board by October 6th 2024. The selected hosts will be publicly announced at DjangoCon Europe 2025 by the current organizers.
Step 3: Submitting the proposal
The more detailed and complete your final proposal is, the better. Basic details include:
- Organizing committee members: You probably won’t have a full team yet, so naming just some core team members is enough.
- The legal entity that is intended to run the conference: Even if the entity does not exist yet, please share how you are planning to set it up.
- Dates: See “What dates are possible in 2026?” below. We must avoid conflicts with major holidays, EuroPython, DjangoCon US, and PyCon US.
- Venue(s), including size, number of possible attendees, pictures, accessibility concerns, catering, etc.
- Transport links and accommodation: Can your venue be reached by international travelers?
- Budgets and ticket prices: Talk to the DjangoCon Europe Support group to get help with this, including information on past event budgets.
We also like to see:
- Timelines
- Pictures
- Plans for online participation, and other ways to make the event more inclusive and reduce its environmental footprint
- Draft agreements with providers
- Alternatives you have considered
Have a look at our proposed DjangoCon Europe 2026 Licensing Agreement for the fine print on contractual requirements and involvement of the Django Software Foundation.
Submit your completed proposal by October 6th 2024 via our DjangoCon Europe 2026 expression of interest form, this time filling in as many fields as possible. We look forward to reviewing great proposals that continue the excellence the whole community associates with DjangoCon Europe.
Q&A
Can I organize a conference alone?
We strongly recommend that a team of people submit an application.
I/we don’t have a legal entity yet, is that a problem?
Depending on your jurisdiction, this is usually not a problem. But please share your plans about the entity you will use or form in your application.
Do I/we need experience with organizing conferences?
The support group is here to help you succeed. From experience, we know that many core groups of 2-3 people have been able to run a DjangoCon with guidance from previous organizers and help from volunteers.
What is required in order to announce an event?
Ultimately, a contract with the venue confirming the dates is crucial, since announcing a conference makes people book calendars, holidays, buy transportation and accommodation etc. This, however, would only be relevant after the DSF board has concluded the application process. Naturally, the application itself cannot contain any guarantees, but it’s good to check concrete dates with your venues to ensure they are actually open and currently available, before suggesting these dates in the application.
Do we have to do everything ourselves?
No. You will definitely be offered lots of help by the community. Typically, conference organizers will divide responsibilities into different teams, making it possible for more volunteers to join. Local organizers are free to choose which areas they want to invite the community to help out with, and a call will go out through a blog post announcement on djangoproject.com and social media.
What kind of support can we expect from the Django Software Foundation?
The DSF regularly provides grant funding to DjangoCon organizers, to the extent of $6,000 in recent editions. We also offer support via specific working groups:
- The dedicated DjangoCon Europe support working group.
- The social media working group can help you promote the event.
- The Code of Conduct working group works with all event organizers.
In addition, a lot of Individual Members of the DSF regularly volunteer at community events. If your team aren’t Individual Members, we can reach out to them on your behalf to find volunteers.
What dates are possible in 2026?
For 2026, DjangoCon Europe should happen between January 5th and April 27th, or June 4th and June 28th. This is to avoid the following community events’ provisional dates:
- PyCon US 2026: May 2026
- EuroPython 2026: July 2026
- DjangoCon US 2026: September - October 2026
- DjangoCon Africa 2026: August - September 2026
We also want to avoid the following holidays:
- New Year's Day: Thursday 1st January 2026
- Chinese New Year: Tuesday 17th February 2026
- Eid Al-Fitr: Friday 20th March 2026
- Passover: Wednesday 1st - Thursday 9th April 2026
- Easter: Sunday 5th April 2026
- Eid Al-Adha: Tuesday 26th - Friday 29th May 2026
- Rosh Hashanah: Friday 11th - Sunday 13th September 2026
- Yom Kippur: Sunday 20th - Monday 21st September 2026
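The date constraints above can be sketched as a quick check for a proposed conference date range. This is only an illustration, not an official DSF tool: the function and constant names are hypothetical, the windows and holidays are copied from the lists above, and the provisional month-level dates of the other community events are assumed to be covered by the allowed windows.

```python
from datetime import date
from typing import List

# Allowed windows for 2026, as stated in the call (inclusive).
ALLOWED_WINDOWS = [
    (date(2026, 1, 5), date(2026, 4, 27)),
    (date(2026, 6, 4), date(2026, 6, 28)),
]

# Holidays to avoid, as listed in the call (inclusive ranges).
HOLIDAYS = {
    "New Year's Day": (date(2026, 1, 1), date(2026, 1, 1)),
    "Chinese New Year": (date(2026, 2, 17), date(2026, 2, 17)),
    "Eid Al-Fitr": (date(2026, 3, 20), date(2026, 3, 20)),
    "Passover": (date(2026, 4, 1), date(2026, 4, 9)),
    "Easter": (date(2026, 4, 5), date(2026, 4, 5)),
    "Eid Al-Adha": (date(2026, 5, 26), date(2026, 5, 29)),
    "Rosh Hashanah": (date(2026, 9, 11), date(2026, 9, 13)),
    "Yom Kippur": (date(2026, 9, 20), date(2026, 9, 21)),
}


def check_dates(start: date, end: date) -> List[str]:
    """Return a list of problems with a proposed date range (empty if none)."""
    if end < start:
        raise ValueError("end date must not be before start date")

    problems = []

    # The whole event must fit inside one of the allowed windows.
    if not any(w_start <= start and end <= w_end for w_start, w_end in ALLOWED_WINDOWS):
        problems.append("outside the allowed windows")

    # Two inclusive ranges overlap when each starts before the other ends.
    for name, (h_start, h_end) in HOLIDAYS.items():
        if start <= h_end and h_start <= end:
            problems.append(f"overlaps {name}")

    return problems


if __name__ == "__main__":
    # Hypothetical proposal: Wednesday 15th to Sunday 19th April 2026.
    print(check_dates(date(2026, 4, 15), date(2026, 4, 19)))
```

An empty list means the proposed range fits a window and avoids the listed holidays. Venue availability still needs to be confirmed separately, as noted in the Q&A.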
What cities or countries are possible?
Any city in Europe. This can be a city or country where DjangoCon Europe has happened in the past (Vigo, Edinburgh, Porto, Copenhagen, Heidelberg, Florence, Budapest, Cardiff, Toulon, Warsaw, Zurich, Amsterdam, Berlin), or a new locale.
References Past calls