Feeds

Copyright law makes a case for requiring data information rather than open datasets for Open Source AI

Open Source Initiative - Wed, 2024-09-11 16:04

The Open Source Initiative (OSI) is running a blog series to introduce some of the people who have been actively involved in the Open Source AI Definition (OSAID) co-design process. The co-design methodology allows for the integration of diverging perspectives into one just, cohesive and feasible standard. Support and contribution from a significant and broad group of stakeholders is imperative to the Open Source process and is proven to bring diverse issues to light, deliver swift outputs and garner community buy-in.

This series features the voices of the volunteers who have helped shape and are shaping the Definition.

Meet Felix Reda

Photo credit: Volker Conradus (volkerconradus.com), CC BY 4.0 International.

Felix Reda (he/they) has been an active contributor to the Open Source AI Definition (OSAID) co-design process, bringing his personal interest and expertise in copyright reform to the online forums. Working in digital policy for over ten years, including serving as a member of the European Parliament from 2014 to 2019 and working with the strategic litigation NGO Gesellschaft für Freiheitsrechte (GFF), Felix is currently the director of developer policy at GitHub. He is also an affiliate of the Berkman Klein Center for Internet and Society at Harvard and serves on the board of the Open Knowledge Foundation Germany. He holds an M.A. in political science and communications science from the University of Mainz, Germany.

Data information as a viable alternative

Note: The original text was contributed by Felix Reda to the discussions happening on the Open Source AI forum as a response to Stefano Maffulli’s post on how the draft Open Source AI Definition arrived at its current state, the design principles behind the data information concept and the constraints (legal and technical) it operates under.

When we look at applying Open Source principles to the subject of AI, copyright law comes into play, especially for the topic of training data access. Open datasets have been a continuous discussion point in the collaborative process of writing the Open Source AI Definition. I would like to explain why the concept of data information is a viable alternative for the purposes of the OSAID.

The definition of Open Source software has an access element and a legal element – the access element being the availability of the source code and the legal element being a license rooted in the copyright-protection given to software. The underlying assumption is that the entity making software available as Open Source is the rights holder in the software and is therefore entitled to make the source code available without infringing the copyright of a third party, and to license it for re-use. To the extent that third-party copyright-protected material is incorporated into the Open Source software, it must itself be released under a compatible Open Source license that also allows the redistribution.

When it comes to AI, the situation is fundamentally different: The assumption that an Open Source AI model will only be trained on copyright-protected material that the developer is entitled to redistribute does not hold. Different copyright regimes around the world, including the EU, Japan and Singapore, have statutory exceptions that explicitly allow text and data mining for the purposes of AI training. The EU text and data mining exceptions, which I know best, were introduced with the objective of facilitating the development of AI and other automated analytical techniques. However, they only allow the reproduction of copyright-protected works (aka copying), but not the making available of those works (aka posting them on the internet).

That means that an Open Source AI definition that would require the republication of the complete dataset in order for an AI model to qualify as Open Source would categorically exclude Open Source AI models from the ability to rely on the text and data mining exceptions in copyright – that is despite the fact that the legislator explicitly decided that under certain circumstances (for example allowing rights holders to declare a machine-readable opt-out from training outside of the context of scientific research) the use of copyright-protected material for the purposes of training AI models should be legal. This result would be particularly counterproductive because it would even render Open Source AI models illegal in situations where the reproducibility of the dataset would be complete by the standards discussed on the OSAID forum.

Examples

Imagine an AI model that was trained on publicly accessible text on the internet that was version-controlled, for which the rights holder had not declared an opt-out, but which the rights holder had also not put under a permissive license (all rights reserved). Using this text as training data for an AI model would be legal under copyright law, but re-publishing the training dataset would be illegal. Publishing information about the training dataset that included the version of the data that was used, when and how it was retrieved from which website, and how it was tokenized would meet the requirements of the OSAID v 0.0.8 if (and only if) it put a skilled person in the position to build their own dataset to recreate an equivalent system.

Neither the developer of the original Open Source AI model nor the skilled person recreating it would violate copyright law in the process, unlike the scenario that required publication of the dataset. Including a requirement in the OSAID to publish the data, in which the AI developer typically does not hold the copyright, would have little added benefit but would drastically reduce the material that could be used for training, despite the existence of explicit legal permissions to use that content for AI training. I don’t think that would be wise.
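To make the example above a little more concrete, here is a minimal sketch of what such a data information record could look like in Python. The class name, field names and sample values are all hypothetical; they are not taken from the OSAID or from any published schema, and only mirror the items listed in the example (version, when and how the data was retrieved, from which website, and how it was tokenized).

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DataInformation:
    """Hypothetical description of one training data source (not an OSAID schema)."""

    source_url: str        # which website the text was retrieved from
    content_version: str   # version of the version-controlled text that was used
    retrieved_at: str      # when it was retrieved (ISO 8601 timestamp)
    retrieval_method: str  # how it was retrieved
    tokenizer: str         # how the text was tokenized


def to_json(record: DataInformation) -> str:
    """Serialize the record so it can be published alongside a model."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


# Placeholder values for illustration only.
example = DataInformation(
    source_url="https://example.org/docs",
    content_version="revision-placeholder",
    retrieved_at="2024-01-01T00:00:00Z",
    retrieval_method="HTTP GET of the rendered page",
    tokenizer="placeholder-tokenizer-v1",
)
```

A record like this contains no copyright-protected text itself, only a description of where the text came from and how it was processed.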

The international concern of public domain

While I support the creation of public domain datasets that can be republished without restrictions, I would like to caution against pointing to these efforts as a solution to the problem of copyright in training datasets. Public domain status is not harmonized internationally – what is in the public domain in one jurisdiction is routinely protected by copyright in other parts of the world. For example, in US discourse it is often assumed that works generated by US government employees are in the public domain. They are not, they are only in the public domain in the US, while they are copyright-protected in other jurisdictions.

The same goes for works in which copyright has expired: Although the Berne Convention allows signatory countries to limit the copyright term on works until protection in the work’s country of origin has expired, exceptions to this rule are permitted. For example, although the first incarnation of Mickey Mouse has recently entered the public domain in the US, it is still protected by copyright in Germany due to an obscure bilateral copyright treaty between the US and Germany from 1892. Copyright protection is not conditional on registration of a work, and no even remotely comprehensive, reliable rights information on the copyright status of works exists. Good luck to an Open Source AI developer who tried to stay on top of all of these legal pitfalls.

Bottom line

There are solid legal permissions for using copyright-protected works for AI training (reproductions). There are no equivalent legal permissions for incorporating copyright-protected works into publishable datasets (making available). What an Open Source AI developer thinks is in the public domain and therefore publishable in an open dataset regularly turns out to be copyright-protected after all, at least in some jurisdictions.

Unlike reproductions, which only need to follow the copyright law of the country in which the reproduction takes place, making content available online needs to be legal in all jurisdictions from which the content can be accessed. If the OSAID required the publication of the dataset, this would routinely lead to situations where Open Source AI models could not be made accessible across national borders, thus impeding their collaborative improvement, one of the great strengths of Open Source. I doubt that with such a restrictive definition, Open Source AI would gain any practical significance. Tragically, the text and data mining exceptions that were designed to facilitate research collaboration and innovation across borders, would only support proprietary AI models, while excluding Open Source AI. The concept of data information will help us avoid that pitfall while staying true to Open Source principles.

How to get involved

The OSAID co-design process is open to everyone interested in collaborating. There are many ways to get involved:

Join the forum: share your comment on the drafts.
Leave a comment on the latest draft: provide precise feedback on the text of the latest draft.
Follow the weekly recaps: subscribe to our monthly newsletter and blog to be kept up-to-date.
Join the town hall meetings: we’re increasing the frequency to weekly meetings where you can learn more, ask questions and share your thoughts.
Join the workshops and scheduled conferences: meet the OSI and other participants at in-person events around the world.

Categories: FLOSS Research

Glyph Lefkowitz: Python macOS Framework Builds

Planet Python - Wed, 2024-09-11 15:43

When you build Python, you can pass various options to ./configure that change aspects of how it is built. There is documentation for all of these options, and they are things like --prefix to tell the build where to install itself, --without-pymalloc if you have some esoteric need for everything to go through a custom memory allocator, or --with-pydebug.

One of these options only matters on macOS, and its effects are generally poorly understood. The official documentation just says “Create a Python.framework rather than a traditional Unix install.” But… do you need a Python.framework? If you’re used to running Python on Linux, then a “traditional Unix install” might sound pretty good; more consistent with what you are used to.
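If you are not sure which kind of build you are currently running, one rough way to check from Python itself is to look at the PYTHONFRAMEWORK build variable through the standard sysconfig module. This is a minimal sketch, assuming that the variable is non-empty on Framework builds and empty or missing everywhere else; the helper name is made up for this example.

```python
import sys
import sysconfig


def is_framework_build() -> bool:
    """Return True when the running interpreter looks like a macOS Framework build."""
    # PYTHONFRAMEWORK holds the framework name (usually "Python") on Framework
    # builds, and is an empty string or None on other builds and platforms.
    return bool(sysconfig.get_config_var("PYTHONFRAMEWORK"))


if __name__ == "__main__":
    kind = "Framework" if is_framework_build() else "non-Framework"
    print(f"{sys.executable}: {kind} build")
```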

If you use a non-Framework build, most stuff seems to work, so why should anyone care? I have mentioned it as a detail in my previous post about Python on macOS, but even I didn’t really explain why you’d want it, just that it was generally desirable.

The traditional answer to this question is that you need a Framework build “if you want to use a GUI”, but this is demonstrably not true. At first it might not seem so, since the go-to Python GUI test is “run IDLE”; many non-Framework builds also omit Tkinter because they don’t ship a Tk dependency, so IDLE won’t start. But other GUI libraries work fine. For example, uv tool install runsnakerun / runsnake will happily pop open a GUI window, Framework build or not. So it bears some explaining.

Wait, what is a “Framework” anyway?

Let’s back up and review an important detail of the mac platform.

On macOS, GUI applications are not just an executable file, they are organized into a bundle, which is a directory with a particular layout, that includes metadata, that launches an executable. A thing that, on Linux, might live in a combination of /bin/foo for its executable and /share/foo/ for its associated data files, is instead on macOS bundled together into Foo.app, and those components live in specified locations within that directory.

A framework is also a bundle, but one that contains a library. Since they are directories, Applications can contain their own Frameworks and Frameworks can contain helper Applications. If /Applications is roughly equivalent to the Unix /bin, then /Library/Frameworks is roughly equivalent to the Unix /lib.

App bundles are contained in a directory with a .app suffix, and frameworks are a directory with a .framework suffix.
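Since bundles are identified by their directory suffix, a path alone already tells you a fair amount. Here is a small sketch that finds the innermost bundle directory a path sits in. It only inspects the path string and never touches the filesystem, and the function name is hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional

BUNDLE_SUFFIXES = {".app", ".framework"}


def enclosing_bundle(path: str) -> Optional[str]:
    """Return the innermost .app or .framework directory containing path, if any."""
    p = PurePosixPath(path)
    # Check the path itself first, then each parent directory, innermost first.
    for candidate in (p, *p.parents):
        if candidate.suffix in BUNDLE_SUFFIXES:
            return str(candidate)
    return None
```

For an executable nested inside a helper application inside a framework, this returns the .app directory, since that is the innermost bundle.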

So what do you need a Framework for in Python?

The truth about Framework builds is that there is not really one specific thing that you can point to that works or doesn’t work, where you “need” or “don’t need” a Framework build. I was not able to quickly construct an example that trivially fails in a non-framework context for this post, but I didn’t try that many different things, and there are a lot of different things that might fail.

The biggest issue is not actually the Python.framework itself. The metadata on the framework is not used for much outside of a build or linker context. However, Python’s Framework builds also ship with a stub application bundle, which places your Python process into a normal application(-ish) execution context all the time, which allows for various platform APIs like [NSBundle mainBundle] to behave in the normal, predictable ways that all of the numerous, various frameworks included on Apple platforms expect.

Various Apple platform features might want to ask a process questions like “what is your unique bundle identifier?” or “what entitlements are you authorized to access” and even beginning to answer those questions requires information stored in the application’s bundle.

Python does not ship with a wrapper around the core macOS “cocoa” API itself, but we can use pyobjc to interrogate this. After installing pyobjc-framework-cocoa, I can do this:

>>> import AppKit
>>> AppKit.NSBundle.mainBundle()

On a non-Framework build, it might look like this:

NSBundle </Users/glyph/example/.venv/bin> (loaded)

But on a Framework build (even in a venv in a similar location), it might look like this:

NSBundle </Library/Frameworks/Python.framework/Versions/3.12/Resources/Python.app> (loaded)

This is why, at various points in the past, GUI access required a framework build, since connections to the window server would just be rejected for Unix-style executables. But that was an annoying restriction, so it was removed at some point, or at least, the behavior was changed. As far as I can tell, this change was not documented. But other things like user notifications or geolocation might need to identify an application for preferences or permissions purposes, respectively. Even something as basic as “what is your app icon” for what to show in alert dialogs is information contained in the bundle. So if you use a library that wants to make use of any of these features, it might work, or it might behave oddly, or it might silently fail in an undocumented way.

This might seem like undocumented, unnecessary cruft, but it is that way because it’s just basic stuff the platform expects to be there for a lot of different features of the platform.

/etc/ builds

Still, this might seem like a strangely vague description of this feature, so it might be helpful to examine it by a metaphor to something you are more familiar with. If you’re familiar with more Unix style application development, consider a junior developer — let’s call him Jim — asking you if they should use an “/etc build” or not as a basis for their Docker containers.

What is an “/etc build”? Well, base images like ubuntu come with a bunch of files in /etc, and Jim just doesn’t see the point of any of them, so he likes to delete everything in /etc just to make things simpler. It seems to work so far. More experienced Unix engineers that he has asked react negatively and make a face when he tells them this, and seem to think that things will break. But his app seems to work fine, and none of these engineers can demonstrate some simple function breaking, so what’s the problem?

Off the top of your head, can you list all the features that all the files in /etc are needed for? Why not? Jim thinks it’s weird that all this stuff is undocumented, and it must just be unnecessary cruft.

If Jim were to come back to you later with a problem like “it seems like hostname resolution doesn’t work sometimes” or “ls says all my files are owned by 1001 rather than the user name I specified in my Dockerfile” you’d probably say “please, put /etc back, I don’t know exactly what file you need but lots of things just expect it to be there”.
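Jim’s “owned by 1001” symptom is easy to sketch in Python. Tools like ls map a numeric user ID to a name through the user database, which normally lives in /etc/passwd, and fall back to the bare number when the lookup fails. The helper below is a hypothetical illustration of that fallback, not what ls actually runs; the lookup parameter exists only so the failure case can be simulated without deleting anything.

```python
import pwd
from typing import Any, Callable


def owner_label(uid: int, lookup: Callable[[int], Any] = pwd.getpwuid) -> str:
    """Return the user name for uid, or the bare number if it cannot be resolved."""
    try:
        # pwd.getpwuid consults the system user database, typically /etc/passwd.
        return lookup(uid).pw_name
    except KeyError:
        # No entry found: this is the case where Jim sees "1001" instead of a name.
        return str(uid)
```

Nothing crashes when the entry is missing; the output just quietly gets less useful, which is the kind of failure the analogy is about.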

This is what a framework vs. a non-Framework build is like. A Framework build just includes all the pieces of the build that the macOS platform expects to be there. What pieces do what features need? It depends. It changes over time. And the stub that Python’s Framework builds include may not be sufficient for some more esoteric stuff anyway. For example, if you want to use a feature that needs a bundle that has been signed with custom entitlements to access something specific, like the virtualization API, you might need to build your own app bundle. To extend our analogy with Jim, the fact that /etc exists and has the default files in it won’t always be sufficient; sometimes you have to add more files to /etc, with quite specific contents, for some features to work properly. But “don’t get rid of /etc (or your application bundle)” is pretty good advice.

Do you ever want a non-Framework build?

macOS does have a Unix subsystem, and many Unix-y things work, for Unix-y tasks. If you are developing a web application that mostly runs on Linux anyway and never care about using any features that touch the macOS-specific parts of your mac, then you probably don’t have to care all that much about Framework builds. You’re not going to be surprised one day by non-framework builds suddenly being unable to use some basic Unix facility like sockets or files. As long as you are aware of these limitations, it’s fine to install non-Framework builds. I have a dozen or so Pythons on my computer at any given time, and many of them are not Framework builds.

Framework builds do have some small drawbacks. They tend to be larger, they can be a bit more annoying to relocate, they typically want to live in a location like /Library or ~/Library. You can move Python.framework into an application bundle according to certain rules, as any bundling tool for macOS will have to do, but it might not work in random filesystem locations. This may make managing a really large number of Python versions more annoying.

Most of all, the main reason to use a non-Framework build is if you are building a tool that manages a fleet of Python installations to perform some automation that needs to know about Python installs, and you want to write one simple tool that does stuff on Linux and on macOS. If you know you don’t need any platform-specific features, don’t want to spend the (not insignificant!) effort to cover those edge cases, and you get a lot of value from that level of consistency (for example, a teaching environment or interdisciplinary development team with a lot of platform diversity) then a non-framework build might be a better option.

Why do I care?

Personally, I think it’s important for Framework builds to be the default for most users, because I think that as much stuff should work out of the box as possible. Any user who sees a neat library that lets them get control of some chunk of data stored on their mac - map data, health data, game center high scores, whatever it is - should be empowered to call into those APIs and deal with that data for themselves.

Apple already makes it hard enough with their thicket of code-signing and notarization requirements for distributing software, aggressive privacy restrictions which prevent API access to some of this data in the first place, all these weird Unix-but-not-Unix filesystem layout idioms, sandboxing that restricts access to various features, and the use of esoteric abstractions like mach ports for communications behind the scenes. We don't need to make it even harder by making the way that you install your Python be a surprise gotcha variable that determines whether or not you can use an API like “show me a user notification when my data analysis is done” or “don’t do a power-hungry data analysis when I’m on battery power”, especially if it kinda-sorta works most of the time, but only fails on certain patch-releases of certain versions of the operating system, because an implementation detail of a proprietary framework changed in the meanwhile to require an application bundle where it didn’t before, or vice versa.

More generally, I think that we should care about empowering users with local computation and platform access on all platforms, Linux and Windows included. This just happens to be one particular quirk of how native platform integration works on macOS specifically.

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. For this one, thanks especially to long-time patron Hynek who requested it specifically. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor! I am also available for consulting work if you think your organization could benefit from expertise on topics like “how can we set up our Mac developers’ laptops with Python”.

Categories: FLOSS Project Planets

ImageX: DrupalCon Barcelona 2024: Top Session Picks from Our Team

Planet Drupal - Wed, 2024-09-11 15:24

Authored by Nadiia Nykolaichuk.

DrupalCon 2024 is returning to one of the world’s most enchanting cities — Barcelona! As the event draws near, Drupal enthusiasts from around the globe are tapping out the rhythm of Spanish flamenco with their feet in anticipation. Now is the perfect time to explore the conference’s program and select the sessions that will inspire and invigorate you.

Categories: FLOSS Project Planets

FSF Events: Pick up some Sourceware infrastructure tips and tricks with Ian Kelling at GNU Cauldron in Prague on September 16

GNU Planet! - Wed, 2024-09-11 14:20

Categories: FLOSS Project Planets

A Tale of Wine Labels and Open Source Contributions

Planet KDE - Wed, 2024-09-11 11:08

At Akademy 2024, during my talk KDE to Make Wines I promised there would be a companion blog post focusing more on the technical details of what we did. This is the article in question.

This is a piece I also wrote for the enioka blog, so there is a French version available.

Where we set the stage

After years of working in the service industry, one thing which doesn’t cease to amaze me is the variety of needs our customers have and how we can still be surprised by them. They sometimes lead us in unexpected directions. This is obviously something I particularly like.

Today I’ll tell you the tale of a nice relationship enioka Haute Couture built with a customer down under.

It all started with a mail on a KDE mailing list a couple of years ago. People were looking for help. It was spotted by my colleague Benjamin Port, who reached out. It turned out those people were from De Bortoli Wines, an Australian winemaking company. They are known for using Free Software quite a bit and contributing when they can. They even got interviewed on KDE’s Dot ten years ago!

Not much really happened after our first contact but we kept the communication open… Until last year when they reached out to us for some help with Okular, the universal document viewer made by KDE.

They wanted to get rid of Acrobat Reader for Linux in favor of Okular. They had one issue though, the overprint preview support was visibly broken in Okular. This is essential when interacting with a reprographics company which will send back PDFs based on how they layer colors (or overprinting). You need the overprint preview to simulate what the final rendering will be. We were of course motivated to help them get rid of Acrobat Reader for Linux since it is an old and stale piece of proprietary software.

Getting serious about PDF rendering

This looked at first like an easy task. Poppler (used for rendering PDFs) was exposing some API for the overprinting preview but Okular didn’t make use of it. We quickly made a patch to use the overprint preview API.

Alas, doing so uncovered another issue. As soon as we turned on the overprinting preview support the application would crash. We tracked it down one level down. Somehow the crash was hiding in the Poppler-Qt bindings.

After further exploration, it was due to the binding wrongly determining the row size of the raster images to generate. There’s a color space conversion occurring between the initial memory representation and the target raster image. The code was getting this row size before the transform occurred… and then ended up stuffing the wrong amount of data in the target raster. This couldn’t go well. Another patch was thus produced to address this.
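To illustrate why the order matters, here is a minimal sketch of a row size computation. The pixel formats and byte counts are assumptions picked for illustration and are not taken from the Poppler-Qt code; the point is only that the row size of the source representation and of the target raster can differ, so it has to be computed for the format that is actually being written.

```python
def row_size(width_px: int, bytes_per_pixel: int, alignment: int = 4) -> int:
    """Bytes per raster row, padded up to a multiple of `alignment`."""
    unpadded = width_px * bytes_per_pixel
    return (unpadded + alignment - 1) // alignment * alignment


# Hypothetical formats: a wide source representation and a narrower target image.
SOURCE_BYTES_PER_PIXEL = 8
TARGET_BYTES_PER_PIXEL = 4

width = 10
size_before_conversion = row_size(width, SOURCE_BYTES_PER_PIXEL)  # 80 bytes per row
size_after_conversion = row_size(width, TARGET_BYTES_PER_PIXEL)   # 40 bytes per row

# Copying size_before_conversion bytes into each target row would write
# twice as much data as the row can hold.
mismatch = size_before_conversion - size_after_conversion
```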

The good news is that we managed to fix the issue in less time than the budget the customer allocated to us. So we gave them the choice between stopping here or using the remaining budget to address something else.

We used the Ghent Workgroup PDF Output suite to validate our work beyond the samples provided by our contact. While doing so we noticed Poppler was failing at properly rendering some other cases. So we proposed to investigate those as well.

After spending some time on those… we made tentative fixes but unfortunately they led to some regression. So in agreement with the customer, we wrapped up and created a detailed bug report instead so as not to waste their budget. This helped the Poppler community figure out the problem and produce a fix. That’s when we realized we came really close to the right fix at some point. Clearly an expert view on how the various PDF color spaces work was required so it was a good call to create the detailed report.

With still some budget left, our contact proposed that we also bring the overprint preview support to Okular printing. This was initially left out of the scope but necessary when you want to print your preview on a regular laser printer. This required adjusting Okular and adjusting Poppler-Qt once more. It was ultimately done within budget.

CIFS mount woes

Since the customer was satisfied with the work they came back for more. We set up a budget line for them to come up with issues to fix throughout the year.

Around that time, their focus moved more to CIFS mounts which they use extensively for their remote office branches. As active users of that kernel feature, they encounter issues in user-facing software that you would otherwise not suspect.

File copy failures

They were affected by a bug preventing Okular from saving on CIFS mounts. It is one of those which has been lingering on for a bit more than a year without a solution in sight. Some applications could modify and save a file opened on a CIFS mount but somehow not Okular.

It turned out to be due to some code in KIO itself (the KDE Frameworks API used for network transparent file operations) interacting in an unwanted way with CIFS mounts.

Indeed, the behavior of unlink() (file deletion) on CIFS mounts can be a bit “interesting”. If the file one tries to delete is opened by another process then the operation is claimed to have succeeded but the filename is still visible in the file hierarchy until the last handle is closed. This is unlike the usual UNIX behavior: outside of CIFS mounts the file wouldn’t be visible in the hierarchy anymore. We were thus seeing the issue because Okular keeps a file handle open on the file.

Now, KIO rightfully attempts to write under a temporary name, delete the original file and rename to the final name during its file copy operation. This would then fail as the unlink() call would succeed, but the rename would unexpectedly fail due to the lingering file in the hierarchy.

So we proposed a patch for KIO which would do a direct copy for files on CIFS mounts. Files being directly overwritten succeed and so the bug experienced with Okular was solved.
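The two strategies can be sketched in Python as follows. This is not KIO code (KIO is C++) and the function names are made up; it only mirrors the sequence of filesystem operations described above so the difference is easy to see.

```python
import os
import shutil
import tempfile


def copy_via_temp(src: str, dst: str) -> None:
    """Write under a temporary name, delete the original, then rename into place."""
    dst_dir = os.path.dirname(os.path.abspath(dst))
    fd, tmp = tempfile.mkstemp(dir=dst_dir)
    os.close(fd)
    shutil.copyfile(src, tmp)
    if os.path.exists(dst):
        # On a CIFS mount this can report success while the name stays visible
        # because another process still has the file open.
        os.unlink(dst)
    # With the old name still lingering, this is the step that would then fail.
    os.rename(tmp, dst)


def copy_direct(src: str, dst: str) -> None:
    """Overwrite the destination in place, with no unlink or rename involved."""
    shutil.copyfile(src, dst)
```

On a local filesystem both functions end with the same result. The direct variant gives up the write-then-rename safety net in exchange for working on mounts where a deleted name can linger.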

Slow directory listing

This wasn’t the end of the issues with CIFS mounts (far from it). They were also experiencing performance problems when listing folders. Interestingly, they would experience it only in the details view of the KDE file dialog.

At its core the issue was due to requesting too much information. When listing a folder known to be remote by KIO (e.g. going through an smb:// URL) the view would limit the amount of information it’d request about the sub-folders. In particular it wouldn’t try to determine the number of files in the sub-folder. This operation is fast nowadays on modern disks, but incurs extra traffic and latency over the network.

This sounded like a simple fix… but in fact it was a bit more work than expected.

Unsurprisingly we quickly found that the code would decide to go for more or less details solely on the URL. Since CIFS mounts get file:/ URLs, they’d end up treated as local files… so we went to querying a bit more aggressively. There is an isSlow() method on the items in the detail view tree which we extended to check for CIFS mounts.

This wasn’t enough though, we immediately realized that the new bottleneck was the calls to isSlow() itself. It would lead to several statfs calls which would be expensive as well. The way out was thus to cache the information in the items and for children to query the cache in their parent. Indeed, if the parent is considered slow, we decided to consider the children in the folder as slow as well. This heuristic allowed us to remove all the subsequent isSlow() calls after the one on the mountpoint folder itself.
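The caching heuristic can be sketched like this. It is a Python illustration rather than the actual KIO code: the Item class, its attributes and the probe callback (standing in for the expensive statfs-based check) are all hypothetical.

```python
from typing import Callable, Optional


class Item:
    """A node in a directory tree that remembers whether it lives on a slow mount."""

    def __init__(self, path: str, probe: Callable[[str], bool],
                 parent: Optional["Item"] = None) -> None:
        self.path = path
        self.parent = parent
        self._probe = probe                 # stand-in for the expensive check
        self._slow: Optional[bool] = None   # cached answer, None until first asked

    def is_slow(self) -> bool:
        if self._slow is None:
            if self.parent is not None and self.parent.is_slow():
                # Heuristic: anything inside a slow folder is considered slow,
                # so the expensive probe is skipped entirely.
                self._slow = True
            else:
                self._slow = self._probe(self.path)
        return self._slow
```

With this in place, listing a folder on a slow mount costs one probe for the mountpoint folder itself, and every child answers from its parent’s cached result.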

This was a very old piece of code we touched there, so some time was also used to clean things up a bit, refactor and rename things to align them better with other KIO parts.

LibreOffice backup files during save

They are such active users of CIFS mounts that they found yet another issue! This time with an old version of LibreOffice. In their version it would show up only if the KDE integration was used. I can tell you we were a bit surprised by this. The investigation wasn’t that easy but we managed to track it down.

The issue was showing up after opening a file sitting on a CIFS mount with LibreOffice. If you made a change to the file, then clicked “Save As…” and selected the same file to overwrite, you would get a “Could not create a backup copy” error and the file wouldn’t be saved if the KDE integration was active. All would be fine without this integration though.

What would be different with and without the KDE integration? Well, in one case there is an extra process! When the file dialog opens, if it is the KDE file dialog, the listing is delegated to a KIO Worker. In this particular case this would matter. Indeed, we figured that LibreOffice keeps an open file descriptor on the opened file. Not only this, it also holds a read lock on the file. The KIO Worker is being forked from LibreOffice, so it too has the open file descriptor with a lock.

This made us realize that there was a file descriptor leak, which is in itself not a good thing. So we changed KIO to clean up open file descriptors when spawning workers; it’s always a good idea to be tidy. Also, as soon as we removed the file descriptor leak the issue was gone. Nice, this was an easy fix.
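The effect of such a leak is easy to reproduce in a small Python sketch on a POSIX system. This is not how KIO spawns its workers; it only shows, using the standard subprocess module, that a child process can end up holding a parent’s open file unless descriptors are closed at spawn time. The helper name and the child script are made up for this example.

```python
import os
import subprocess
import sys
import tempfile

# The child checks whether the given descriptor number refers to the same file
# (same device and inode) that the parent had open.
CHILD_CODE = """
import os, sys
fd, dev, ino = (int(arg) for arg in sys.argv[1:4])
try:
    st = os.fstat(fd)
except OSError:
    sys.exit(1)
sys.exit(0 if (st.st_dev, st.st_ino) == (dev, ino) else 1)
"""


def child_sees_fd(close_fds: bool) -> bool:
    """Spawn a child and report whether it inherited the parent's open file."""
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        # Python marks descriptors non-inheritable by default; undo that here
        # to imitate a descriptor that leaks into child processes.
        os.set_inheritable(fd, True)
        st = os.fstat(fd)
        result = subprocess.run(
            [sys.executable, "-c", CHILD_CODE, str(fd), str(st.st_dev), str(st.st_ino)],
            close_fds=close_fds,
        )
        return result.returncode == 0
```

With close_fds=False the child still has the file open, which is the situation the KIO Worker was in; with close_fds=True the descriptor is gone before the child’s code runs.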

Just to be thorough, we tried again but this time with the latest LibreOffice. And even with the patched KIO we would get the “Could not create a backup copy” error! This time it would show up also without the KDE integration. And so back to hunting… without getting too much into the internals of LibreOffice, it turned out the more recent version had extra code activated to produce said backup version, and it has two file descriptors open on the same file. So we were back to the problem of two file descriptors and read locks being involved. But this time it was not due to a leak towards another process, and the architecture of LibreOffice didn’t make it easy to reuse the original file descriptor created when opening the file.

The only thing we could do at this point was to simply not lock when the file is on a CIFS mount. It is not a very satisfying solution but it did the job while fitting the allocated time.

Somehow we couldn’t leave things like this though. If you know the much-criticized POSIX file locks well (they have extra challenges over CIFS), something still doesn’t feel quite right. We have two processes with a file descriptor on the file to save, and yet there is a single process holding a read lock. During the save, when the failure happens, LibreOffice is still making a “backup copy”; it is not yet writing to the file, only reading it… surely this should be allowed!
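The expectation in this paragraph (a read lock should not stop another reader) can be checked on a local file system. The sketch below assumes a POSIX system and uses Python's fcntl.lockf, which wraps POSIX byte-range locks; it says nothing about how the CIFS driver behaves. PROBE and other_process_can_lock are hypothetical names.

```python
import fcntl
import subprocess
import sys
import tempfile

# Script run in a second process: try to take a non-blocking lock on the file.
PROBE = """
import fcntl, sys
mode = fcntl.LOCK_SH if sys.argv[2] == "read" else fcntl.LOCK_EX
with open(sys.argv[1], "r+b") as f:
    try:
        fcntl.lockf(f, mode | fcntl.LOCK_NB)
    except OSError:
        sys.exit(1)
"""


def other_process_can_lock(path: str, kind: str) -> bool:
    """Return True if a separate process can take a `kind` lock on `path`."""
    result = subprocess.run([sys.executable, "-c", PROBE, path, kind])
    return result.returncode == 0


with tempfile.NamedTemporaryFile() as tmp:
    fcntl.lockf(tmp, fcntl.LOCK_SH)  # this process holds a read (shared) lock
    read_ok = other_process_can_lock(tmp.name, "read")
    write_ok = other_process_can_lock(tmp.name, "write")

print(f"second reader allowed: {read_ok}, writer allowed: {write_ok}")
```

On a local file system, the second reader gets its shared lock while the exclusive lock is refused, which is the behaviour the backup copy step relies on.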

We thus started to suspect a problem with the kernel itself… and our contact was willing to explore it further. After investigation, it was confirmed to be a kernel bug. For that the customer hooked us up with Andrew Bartlett from Catalyst, as they knew he could help us flesh out ideas in this space. This proved valuable indeed. Thanks to tests we wrote previously and conversations with Andrew, we quickly figured out that, depending on the options passed at mount time, the CIFS driver would or would not handle the locks properly. We discussed a couple of fixes with the maintainer of the CIFS driver. They were merged last week (end of August 2024).

A customer who sparks joy

It’s really a joy to have a customer like this. As can be seen from their willingness to dig deeper on what others would consider obscure issues, they demonstrate they are thinking long term. It also means they come with interesting and challenging issues… and they’re appreciative of what we achieved for them!

Indeed we had the pleasure of receiving this by email during our conversations:

We very much appreciate the work you’ve done, and difficulty surrounding the challenges you’re working through. Your approach/results are spectacular, and we are very grateful to be able to be a part of it.

Of course, if you have projects involving Free Software communities or issues closer to the system, feel free to reach out, and we’ll see what we could do to help you. Maybe you too can be a forward thinking customer who sparks joy!

Categories: FLOSS Project Planets

Real Python: How to Use Conditional Expressions With NumPy where()

Planet Python - Wed, 2024-09-11 10:00

The NumPy where() function is a powerful tool for filtering array elements in lists, tuples, and NumPy arrays. It works by using a conditional predicate, similar to the logic used in the WHERE or HAVING clauses in SQL queries. It’s okay if you’re not familiar with SQL—you don’t need to know it to follow along with this tutorial.

You would typically use np.where() when you have an array and need to analyze its elements differently depending on their values. For example, you might need to replace negative numbers with zeros or replace missing values such as None or np.nan with something more meaningful. When you run where(), you’ll produce a new array containing the results of your analysis.

You generally supply three parameters when using where(). First, you provide a condition against which each element of your original array is matched. Then, you provide two additional parameters: the first defines what you want to do if an element matches your condition, while the second defines what you want to do if it doesn’t.

If you think this all sounds similar to Python’s ternary operator, you’re correct. The logic is the same.
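As a minimal sketch of the three-parameter form, the snippet below uses a small made-up array (readings is not part of the tutorial's sample data) and shows the same logic written with Python's ternary operator for comparison.

```python
import numpy as np

# Hypothetical sample data containing a negative and a missing value.
readings = np.array([[1.5, -2.0], [np.nan, 4.0]])

# Condition, value where True, value where False: negatives become 0.0.
no_negatives = np.where(readings < 0, 0.0, readings)

# The same three-part shape swaps missing values for a placeholder.
no_missing = np.where(np.isnan(readings), -1.0, readings)

# Element-wise equivalent of the first call using the ternary operator.
ternary = [
    [0.0 if value < 0 else value for value in row]
    for row in readings.tolist()
]

print(no_negatives)
print(no_missing)
```

The list comprehension visits elements one at a time in Python, whereas where() evaluates the condition over the whole array at once.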

Note: In this tutorial, you’ll work with two-dimensional arrays. However, the same principles can be applied to arrays of any dimension.

Before you start, you should familiarize yourself with NumPy arrays and how to use them. It will also be helpful if you understand the subject of broadcasting, particularly for the latter part of this tutorial.

In addition, you may want to use the data analysis tool Jupyter Notebook as you work through the examples in this tutorial. Alternatively, JupyterLab will give you an enhanced notebook experience, but feel free to use any Python environment.

The NumPy library is not part of core Python, so you’ll need to install it. If you’re using a Jupyter Notebook, create a new code cell and type !python -m pip install numpy into it. When you run the cell, the library will install. If you’re working at the command line, use the same command, only without the exclamation point (!).

With these preliminaries out of the way, you’re now good to go.

How to Write Conditional Expressions With NumPy where()

One of the most common scenarios for using where() is when you need to replace certain elements in a NumPy array with other values depending on some condition.

Consider the following array:

    >>> import numpy as np
    >>> test_array = np.array(
    ...     [
    ...         [3.1688358, 3.9091694, 1.66405549, -3.61976783],
    ...         [7.33400434, -3.25797286, -9.65148913, -0.76115911],
    ...         [2.71053173, -6.02410179, 7.46355805, 1.30949485],
    ...     ]
    ... )

To begin with, you need to import the NumPy library into your program. It’s standard practice to do so using the alias np, which allows you to refer to the library using this abbreviated form.

The resulting array has a shape of three rows and four columns, each containing a floating-point number.

Now suppose you wanted to replace all the negative numbers with their positive equivalents:

    >>> np.where(
    ...     test_array < 0,
    ...     test_array * -1,
    ...     test_array,
    ... )
    array([[3.1688358 , 3.9091694 , 1.66405549, 3.61976783],
           [7.33400434, 3.25797286, 9.65148913, 0.76115911],
           [2.71053173, 6.02410179, 7.46355805, 1.30949485]])

The result is a new NumPy array with the negative numbers replaced by positives. Look carefully at the original test_array and then at the corresponding elements of the new array, and you’ll see that the result is exactly what you wanted.

Note: The above example gives you an idea of how the where() function works. If you were doing this in practice, you’d most likely use either the np.abs() or np.absolute() functions instead. Both do the same thing because the former is shorthand for the latter:

    >>> np.abs(test_array)
    array([[3.1688358 , 3.9091694 , 1.66405549, 3.61976783],
           [7.33400434, 3.25797286, 9.65148913, 0.76115911],
           [2.71053173, 6.02410179, 7.46355805, 1.30949485]])

Once more, all negative values have been replaced by their positive equivalents.

Before moving on to other use cases of where(), you’ll take a closer look at how this all works. To achieve your aim in the previous example, you passed in test_array < 0 as the condition. In NumPy, this creates a Boolean array that where() uses:

    >>> test_array < 0
    array([[False, False, False,  True],
           [False,  True,  True,  True],
           [False,  True, False, False]])

The Boolean array, often called the mask, consists only of elements that are either True or False. If an element matches the condition, the corresponding element in the Boolean array will be True. Otherwise, it’ll be False.
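To make the role of the mask concrete, here is a short sketch on a made-up 2x2 array (sample is an assumption for illustration, not the tutorial's test_array). It builds the mask separately, then uses it both for Boolean indexing and as the condition passed to where().

```python
import numpy as np

# Hypothetical data: two negative and two positive values.
sample = np.array([[3.0, -1.5], [-2.0, 4.0]])

mask = sample < 0                          # Boolean array, same shape as sample
negatives = sample[mask]                   # keeps only the elements where mask is True
flipped = np.where(mask, -sample, sample)  # picks per element according to the mask

print(mask)
print(negatives)
print(flipped)
```

Because the mask is an ordinary array, it can be stored, combined with other masks using & and |, or reused across several where() calls.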

Read the full article at https://realpython.com/numpy-where-conditional-expressions/ »

[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

Categories: FLOSS Project Planets

Jamie McClelland: MariaDB mystery

Planet Debian - Wed, 2024-09-11 08:27

I keep getting an error in our backup logs:

    Sep 11 05:08:03 Warning: mysqldump: Error 2013: Lost connection to server during query when dumping table `1C4Uonkwhe_options` at row: 1402
    Sep 11 05:08:03 Warning: Failed to dump mysql databases ic_wp

It’s a WordPress database having trouble dumping the options table.

The error log has a corresponding message:

Sep 11 13:50:11 mysql007 mariadbd[580]: 2024-09-11 13:50:11 69577 [Warning] Aborted connection 69577 to db: 'ic_wp' user: 'root' host: 'localhost' (Got an error writing communication packets)

The Internet is full of suggestions, almost all of which focus either on the network connection between the client and the server or on the FEDERATED plugin. We aren’t using the federated plugin, and this error happens when connecting via the socket.

Check it out - what is better than a consistently reproducible problem!

It happens if I try to select all the values in the table:

    root@mysql007:~# mysql --protocol=socket -e 'select * from 1C4Uonkwhe_options' ic_wp > /dev/null
    ERROR 2013 (HY000) at line 1: Lost connection to server during query
    root@mysql007:~#

It happens when I specify one specific offset:

    root@mysql007:~# mysql --protocol=socket -e 'select * from 1C4Uonkwhe_options limit 1 offset 1402' ic_wp
    ERROR 2013 (HY000) at line 1: Lost connection to server during query
    root@mysql007:~#

It happens if I specify the field name explicitly:

    root@mysql007:~# mysql --protocol=socket -e 'select option_id,option_name,option_value,autoload from 1C4Uonkwhe_options limit 1 offset 1402' ic_wp
    ERROR 2013 (HY000) at line 1: Lost connection to server during query
    root@mysql007:~#

It doesn’t happen if I specify the key field:

    root@mysql007:~# mysql --protocol=socket -e 'select option_id from 1C4Uonkwhe_options limit 1 offset 1402' ic_wp
    +-----------+
    | option_id |
    +-----------+
    |  16296351 |
    +-----------+
    root@mysql007:~#

It does happen if I specify the value field:

    root@mysql007:~# mysql --protocol=socket -e 'select option_value from 1C4Uonkwhe_options limit 1 offset 1402' ic_wp
    ERROR 2013 (HY000) at line 1: Lost connection to server during query
    root@mysql007:~#

It doesn’t happen if I query the specific row by key field:

Hm. Surely there is some funky non-printing character in that option_value right?

    root@mysql007:~# mysql --protocol=socket -e 'select CHAR_LENGTH(option_value) from 1C4Uonkwhe_options where option_id = 16296351' ic_wp
    +---------------------------+
    | CHAR_LENGTH(option_value) |
    +---------------------------+
    |                         0 |
    +---------------------------+
    root@mysql007:~# mysql --protocol=socket -e 'select HEX(option_value) from 1C4Uonkwhe_options where option_id = 16296351' ic_wp
    +-------------------+
    | HEX(option_value) |
    +-------------------+
    |                   |
    +-------------------+
    root@mysql007:~#

Resetting the value to an empty value doesn’t make a difference:

    root@mysql007:~# mysql --protocol=socket -e 'update 1C4Uonkwhe_options set option_value = "" where option_id = 16296351' ic_wp
    root@mysql007:~# mysql --protocol=socket -e 'select * from 1C4Uonkwhe_options' ic_wp > /dev/null
    ERROR 2013 (HY000) at line 1: Lost connection to server during query
    root@mysql007:~#

Deleting the row in question causes the error to specify a new offset:

    root@mysql007:~# mysql --protocol=socket -e 'delete from 1C4Uonkwhe_options where option_id = 16296351' ic_wp
    root@mysql007:~# mysql --protocol=socket -e 'select * from 1C4Uonkwhe_options' ic_wp > /dev/null
    ERROR 2013 (HY000) at line 1: Lost connection to server during query
    root@mysql007:~# mysqldump ic_wp > /dev/null
    mysqldump: Error 2013: Lost connection to server during query when dumping table `1C4Uonkwhe_options` at row: 1401
    root@mysql007:~#

If I put the record I deleted back in, we return to the old offset:

    root@mysql007:~# mysql --protocol=socket -e 'insert into 1C4Uonkwhe_options VALUES(16296351,"z_taxonomy_image8905","","yes");' ic_wp
    root@mysql007:~# mysqldump ic_wp > /dev/null
    mysqldump: Error 2013: Lost connection to server during query when dumping table `1C4Uonkwhe_options` at row: 1402
    root@mysql007:~#

I’m losing my little mind. Let’s get drastic and create a whole new table, copy over the data delicately working around the deadly offset:

    root@mysql007:~# mysql --protocol=socket -e 'create table 1C4Uonkwhe_new_options like 1C4Uonkwhe_options;' ic_wp
    root@mysql007:~# mysql --protocol=socket -e 'insert into 1C4Uonkwhe_new_options select * from 1C4Uonkwhe_options limit 1402 offset 0;' ic_wp
    --- There are only 33 more records, not sure how to specify unlimited limit but 100 does the trick.
    root@mysql007:~# mysql --protocol=socket -e 'insert into 1C4Uonkwhe_new_options select * from 1C4Uonkwhe_options limit 100 offset 1403;' ic_wp

Now let’s make sure all is working properly:

    root@mysql007:~# mysql --protocol=socket -e 'select * from 1C4Uonkwhe_new_options' ic_wp >/dev/null;

Now let’s examine which row we are missing:

    root@mysql007:~# mysql --protocol=socket -e 'select option_id from 1C4Uonkwhe_options where option_id not in (select option_id from 1C4Uonkwhe_new_options) ;' ic_wp
    +-----------+
    | option_id |
    +-----------+
    |  18405297 |
    +-----------+
    root@mysql007:~#

Wait, what? I was expecting option_id 16296351.

Oh, now we are getting somewhere. And I see my mistake: when using offsets, you need to use ORDER BY or you won’t get consistent results.

    root@mysql007:~# mysql --protocol=socket -e 'select option_id from 1C4Uonkwhe_options order by option_id limit 1 offset 1402' ic_wp ;
    +-----------+
    | option_id |
    +-----------+
    |  18405297 |
    +-----------+
    root@mysql007:~#
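The ORDER BY lesson can be sketched in a self-contained way. The example below uses Python's built-in sqlite3 module as a stand-in, not MariaDB, and a hypothetical options table with three rows; it only shows the query shape that makes an offset refer to a stable position.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE options (option_id INTEGER PRIMARY KEY, option_name TEXT)"
)
# Rows are inserted out of key order on purpose.
conn.executemany(
    "INSERT INTO options VALUES (?, ?)",
    [(30, "c"), (10, "a"), (20, "b")],
)


def id_at(offset: int) -> int:
    """Return the option_id at a given offset, with an explicit ordering."""
    row = conn.execute(
        "SELECT option_id FROM options ORDER BY option_id LIMIT 1 OFFSET ?",
        (offset,),
    ).fetchone()
    return row[0]


ids = [id_at(offset) for offset in range(3)]
print(ids)
```

Without the ORDER BY clause, SQL leaves the row order up to the engine, so the same offset is not guaranteed to name the same row from one query to the next.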

Now that I have the correct row… what is in it:

    root@mysql007:~# mysql --protocol=socket -e 'select * from 1C4Uonkwhe_options where option_id = 18405297' ic_wp ;
    ERROR 2013 (HY000) at line 1: Lost connection to server during query
    root@mysql007:~#

Well, that makes a lot more sense. Let’s start over with examining the value:

    root@mysql007:~# mysql --protocol=socket -e 'select CHAR_LENGTH(option_value) from 1C4Uonkwhe_options where option_id = 18405297' ic_wp ;
    +---------------------------+
    | CHAR_LENGTH(option_value) |
    +---------------------------+
    |                  50814767 |
    +---------------------------+
    root@mysql007:~#

Wow, that’s a lot of characters. If it were a book, it would be 35,000 pages long (I just discovered this site). It’s a LONGTEXT field so it should be able to handle it. But now I have a better idea of what could be going wrong. The name of the option is “rewrite_rules” so it seems like something is going wrong with the generation of that option.

I imagine there is some tweak I can make to allow MariaDB to cough up the value (read_buffer_size? tmp_table_size?). But I’ll start with checking in with the database owner because I don’t think 35,000 pages of rewrite rules is appropriate for any site.
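In hindsight, asking the database for value sizes first would have pointed at the oversized row without ever fetching it. The sketch below illustrates that idea with Python's built-in sqlite3 module and a hypothetical options table; it is not the MariaDB session from this post, and SQLite's LENGTH() stands in for CHAR_LENGTH().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE options ("
    "option_id INTEGER PRIMARY KEY, option_name TEXT, option_value TEXT)"
)
# Made-up rows: one value is far larger than the others.
conn.executemany(
    "INSERT INTO options VALUES (?, ?, ?)",
    [
        (1, "siteurl", "https://example.org"),
        (2, "rewrite_rules", "x" * 50_000),
        (3, "blogname", "Example"),
    ],
)

# Ask only for the key, the name, and the size; the large value itself
# never has to travel to the client.
largest = conn.execute(
    "SELECT option_id, option_name, LENGTH(option_value) AS size "
    "FROM options ORDER BY size DESC LIMIT 2"
).fetchall()
print(largest)
```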

Categories: FLOSS Project Planets

Morpht: Nightly CI hygiene pays off

Planet Drupal - Wed, 2024-09-11 08:00

The Morpht CI pipeline caught a recent vulnerability in Drupal core which led to the problem promptly being fixed.

Categories: FLOSS Project Planets

Real Python: Quiz: Python Virtual Environments: A Primer

Planet Python - Wed, 2024-09-11 08:00

So you’ve been primed on Python virtual environments! Test your understanding of the tutorial here.

Categories: FLOSS Project Planets

Vector Graphics in Qt 6.8

Planet KDE - Wed, 2024-09-11 05:41

Two-dimensional vector graphics has been quite prevalent in recent Qt release notes, and it is something we have plans to continue exploring in the releases to come. This blog takes a look at some of the options you have, as a Qt developer.

Categories: FLOSS Project Planets

Dirk Eddelbuettel: RcppSpdlog 0.0.18 on CRAN: Updates

Planet Debian - Tue, 2024-09-10 20:21

Version 0.0.18 of RcppSpdlog arrived on CRAN today and has been uploaded to Debian. RcppSpdlog bundles spdlog, a wonderful header-only C++ logging library with all the bells and whistles you would want that was written by Gabi Melman, and also includes fmt by Victor Zverovich. You can learn more at the nice package documentation site.

This release updates the code to version 1.14.1 of spdlog, which was released as an incremental fix to 1.14.0, and adds the ability to set log levels via the environment variable SPDLOG_LEVEL.

The NEWS entry for this release follows.

Changes in RcppSpdlog version 0.0.18 (2024-09-10)

Upgraded to upstream release spdlog 1.14.1
Minor packaging upgrades
Allow logging levels to be set via environment variable SPDLOG_LEVEL

Courtesy of my CRANberries, there is also a diffstat report. More detailed information is on the RcppSpdlog page, or the package documentation site. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Categories: FLOSS Project Planets

Oliver Davies' daily list: Do you deploy on Fridays?

Planet Drupal - Tue, 2024-09-10 20:00

Do you deploy changes to production on Fridays?

Some people don't as they're worried about potential issues occurring over the weekend.

When's the last time you did a deployment which caused an issue 24 or 48 hours later?

In my experience, most issues are visible immediately or shortly after a deployment and not days later.

Deploying on a Friday may not be as risky as you think.

Categories: FLOSS Project Planets

Freexian Collaborators: Monthly report about Debian Long Term Support, August 2024 (by Roberto C. Sánchez)

Planet Debian - Tue, 2024-09-10 20:00

Like each month, have a look at the work funded by Freexian’s Debian LTS offering.

Debian LTS contributors

In August, 16 contributors were paid to work on Debian LTS; their reports are available:

Adrian Bunk did 44.5h (out of 46.5h assigned and 53.5h from previous period), thus carrying over 55.5h to the next month.
Bastien Roucariès did 20.0h (out of 20.0h assigned).
Ben Hutchings did 9.0h (out of 0.0h assigned and 21.0h from previous period), thus carrying over 12.0h to the next month.
Chris Lamb did 18.0h (out of 18.0h assigned).
Daniel Leidert did 12.0h (out of 7.0h assigned and 5.0h from previous period).
Emilio Pozuelo Monfort did 22.25h (out of 6.5h assigned and 53.5h from previous period), thus carrying over 37.75h to the next month.
Guilhem Moulin did 17.5h (out of 8.75h assigned and 11.25h from previous period), thus carrying over 2.5h to the next month.
Lee Garrett did 11.5h (out of 58.0h assigned and 2.0h from previous period), thus carrying over 48.5h to the next month.
Markus Koschany did 40.0h (out of 40.0h assigned).
Ola Lundqvist did 14.5h (out of 4.0h assigned and 20.0h from previous period), thus carrying over 9.5h to the next month.
Roberto C. Sánchez did 8.25h (out of 5.0h assigned and 7.0h from previous period), thus carrying over 3.75h to the next month.
Santiago Ruano Rincón did 21.5h (out of 11.5h assigned and 10.0h from previous period).
Sean Whitton did 4.0h (out of 2.25h assigned and 3.75h from previous period), thus carrying over 2.0h to the next month.
Sylvain Beucler did 42.0h (out of 46.0h assigned and 14.0h from previous period), thus carrying over 18.0h to the next month.
Thorsten Alteholz did 11.0h (out of 11.0h assigned).
Tobias Frost did 2.5h (out of 7.75h assigned and 4.25h from previous period), thus carrying over 9.5h to the next month.
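The carry-over figures in the list above appear to follow a simple rule: hours assigned plus hours from the previous period, minus hours done. The sketch below checks that reading against four entries copied from the list; the function name carry_over is made up for the illustration.

```python
def carry_over(assigned: float, previous: float, done: float) -> float:
    """Hours left for next month: assigned + carried in - worked."""
    return assigned + previous - done


# (assigned, from previous period, done), copied from the report above.
reports = {
    "Adrian Bunk": (46.5, 53.5, 44.5),
    "Ben Hutchings": (0.0, 21.0, 9.0),
    "Emilio Pozuelo Monfort": (6.5, 53.5, 22.25),
    "Daniel Leidert": (7.0, 5.0, 12.0),
}

carried = {name: carry_over(*hours) for name, hours in reports.items()}
for name, hours in carried.items():
    print(f"{name}: {hours}h carried over")
```

An entry with no carry-over statement, such as Daniel Leidert's, is consistent with the rule yielding zero.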

Evolution of the situation

In August, we released 1 DLA.

During the month of August Debian 11 "bullseye" officially transitioned to the responsibility of the LTS team (on 2024-08-15). However, because the final point release (11.11) was not made until 2024-08-31, LTS contributors were prevented from uploading packages to bullseye until after the point release had been made. That said, the team was not at all idle, and was busy at work on a variety of tasks which impacted both LTS and the broader Debian community, as well as preparing uploads which will be released during the month of September.

Of particular note, LTS contributor Bastien Roucariès prepared updates of the putty and cacti packages for bookworm (1 2) and bullseye (1 2), which were accepted by the old-stable release managers for the August point releases. He also analysed several security regressions in the apache2 package. LTS contributor Emilio Pozuelo Monfort worked on the Rust toolchain in bookworm and bullseye, which will be needed to support the upcoming Firefox ESR and Thunderbird ESR releases from the Mozilla project. Additionally, LTS contributor Thorsten Alteholz prepared bookworm and bullseye updates of the cups package (1 2), which were accepted by the old-stable release managers for the August point releases.

LTS contributor Markus Koschany collaborated with Emmanuel Bourg, co-maintainer of the tomcat packages in Debian. Regressions in a proposed security fix necessitated the updating of the tomcat10 package in Debian to the latest upstream release.

LTS contributors Bastien and Santiago Ruano Rincón collaborated with the upstream developers and the Debian maintainer (Bernhard Schmidt) of the FreeRADIUS project towards addressing the BlastRADIUS vulnerability in the bookworm and bullseye versions of the freeradius package. If you use FreeRADIUS in Debian bookworm or bullseye, we encourage you to test the packages following the instructions found in the call for testers to help identify any possible regression that could be introduced with these updates.

Testing is an important part of the work the LTS Team does, and in that vein LTS contributor Sean Whitton worked on improving the documentation and tooling around creating test filesystems which can be used for testing a variety of package update scenarios.

Thanks to our sponsors

Sponsors that joined recently are in bold.

Platinum sponsors:
- TOSHIBA (for 107 months)
- Civil Infrastructure Platform (CIP) (for 75 months)
- VyOS Inc (for 39 months)
Gold sponsors:
- Roche Diagnostics International AG (for 117 months)
- Akamai - Linode (for 111 months)
- Babiel GmbH (for 101 months)
- Plat’Home (for 100 months)
- CINECA (for 75 months)
- University of Oxford (for 57 months)
- Deveryware (for 44 months)
- EDF SA (for 29 months)
- Dataport AöR (for 4 months)
Silver sponsors:
- Domeneshop AS (for 122 months)
- Nantes Métropole (for 116 months)
- Univention GmbH (for 108 months)
- Université Jean Monnet de St Etienne (for 108 months)
- Ribbon Communications, Inc. (for 102 months)
- Exonet B.V. (for 91 months)
- Leibniz Rechenzentrum (for 86 months)
- Ministère de l’Europe et des Affaires Étrangères (for 69 months)
- Cloudways by DigitalOcean (for 59 months)
- Dinahosting SL (for 57 months)
- Bauer Xcel Media Deutschland KG (for 51 months)
- Platform.sh SAS (for 51 months)
- Moxa Inc. (for 45 months)
- sipgate GmbH (for 43 months)
- OVH US LLC (for 41 months)
- Tilburg University (for 41 months)
- GSI Helmholtzzentrum für Schwerionenforschung GmbH (for 32 months)
- Soliton Systems K.K. (for 29 months)
- THINline s.r.o. (for 5 months)
- Copenhagen Airports A/S
- Protegrity USA, Inc.
Bronze sponsors:
- Evolix (for 122 months)
- Seznam.cz, a.s. (for 122 months)
- Intevation GmbH (for 119 months)
- Linuxhotel GmbH (for 119 months)
- Daevel SARL (for 118 months)
- Bitfolk LTD (for 117 months)
- Megaspace Internet Services GmbH (for 117 months)
- Greenbone AG (for 116 months)
- NUMLOG (for 116 months)
- WinGo AG (for 115 months)
- Entr’ouvert (for 106 months)
- Adfinis AG (for 104 months)
- Tesorion (for 99 months)
- GNI MEDIA (for 98 months)
- Laboratoire LEGI - UMR 5519 / CNRS (for 98 months)
- Bearstech (for 90 months)
- LiHAS (for 90 months)
- Catalyst IT Ltd (for 85 months)
- Supagro (for 80 months)
- Demarcq SAS (for 79 months)
- Université Grenoble Alpes (for 65 months)
- TouchWeb SAS (for 57 months)
- SPiN AG (for 54 months)
- CoreFiling (for 50 months)
- Institut des sciences cognitives Marc Jeannerod (for 45 months)
- Observatoire des Sciences de l’Univers de Grenoble (for 41 months)
- Tem Innovations GmbH (for 36 months)
- WordFinder.pro (for 35 months)
- CNRS DT INSU Résif (for 34 months)
- Alter Way (for 27 months)
- Institut Camille Jordan (for 17 months)
- SOBIS Software GmbH

Categories: FLOSS Project Planets

Valhalla's Things: Two Linen Hoods

Planet Debian - Tue, 2024-09-10 20:00

Posted on September 11, 2024
Tags: madeof:atoms, craft:sewing, FreeSoftWear

I’ve been influenced again into feeling the need for a garment.

It was again a case of multiple sources conspiring in the same direction for unrelated reasons, but I decided I absolutely needed a linen hood, made from the heavy white linen I knew I had in my stash.

Why? I don’t know. I do like the feeling of wearing a hood, and the white linen should give a decent protection from the sun, but I don’t know how often I’m going to wear these instead of just a hat. On the other hand the linen was already there and I needed something small to sew.

My first idea was to make a square hood: some time ago I had already made one out of some leftovers of duvet cover, vaguely inspired by the S , because I have a long-term plan of making one a bit more from scratch1.

I like the fact that this pattern is completely made out of squares and rectangles, and while the flannel one is quite fitting, as suitable for a warm garment, I felt that by making it just a cm or two wider it would have worked nicely for a warm weather one, and indeed it did.

Except, before I even started on the square hood, I started to think that the same square top would also be good for a hood-scarf, one of those long flowy garments that sit on the head, wrap around the neck and fall down, moving with the wind and the movements of the person.

Because, let’s be honest: worn in a way that looks like a veil they feel nice, it’s true. But with the help of a couple of pins you can do this.

And no, I’ve never played that game2, and I’m not even 100% sure what it is about, other than killing people, climbing buildings and petting cats3, but that’s not really an issue when making a bit of casual cosplay of something, right?

Anyway, should anybody feel the need to make themselves a hood or ten, the patterns have been released as usual as #FreeSoftWear: square hood and hood scarf.

1. I’m not going to raise the sheep :D I’m actually not even going to wash and comb the wool, I’ll start from the step just after those :D
2. because proprietary software, because somewhat underpowered computers and other related reasons that are somewhat incidental to the game itself.
3. at least two out of three things that make it look like a perfectly enjoyable activity.

Categories: FLOSS Project Planets

PyCoder’s Weekly: Issue #646 (Sept. 10, 2024)

Planet Python - Tue, 2024-09-10 15:30

#646 – SEPTEMBER 10, 2024

Using Pydantic to Simplify Python Data Validation

Discover the power of Pydantic, Python’s most popular data parsing, validation, and serialization library. In this hands-on video course, you’ll learn how to make your code more robust, trustworthy, and easier to debug with Pydantic.
REAL PYTHON course

Introducing Monthly PSF Board Office Hours

The PSF is introducing monthly office hours on the PSF Discord discussion board. This is a chance to connect with the board members and learn more about what they do. The schedule for the next 12 sessions is in the post.
PYTHON SOFTWARE FOUNDATION

500 Devs, Deploying 200x a Day, While Maintaining 4 Million Lines of Code 😮‍💨

Sounds tricky, right? Well, that’s exactly what Kraken Technologies is doing. Learn how they manage 100s of deployments a day and how they handle errors when they crop up. Sneak peek: they use Sentry to reduce noise, prioritize issues, and maintain code quality–without relying on a dedicated QA team →
SENTRY sponsor

Why I’m Switching From pandas to Polars

Ari is switching from pandas to Polars and surprisingly (even to himself) it isn’t because of the better performance. Read on for the reasons why.
ARI LAMSTEIN

DjangoCon US Durham, NC Sept 22-27, Tickets Still Available

DJANGOCON.US

Announcing Djangonaut Space Session 3 Applications Open!

DJANGONAUT.SPACE

Django Security Releases Issued: 5.1.1, 5.0.9, and 4.2.16

DJANGO SOFTWARE FOUNDATION

Quiz: Asynchronous Iterators and Iterables in Python

REAL PYTHON

Quiz: Functional Programming in Python

REAL PYTHON

Articles & Tutorials

How to Create a Pre-Commit Hook

Pre-commit hooks are a great way to help maintain code quality. However, some of your code quality standards may be specific to your project, and therefore, not covered by existing code linting and formatting tools. In this article, Stefanie shows you how to incorporate custom checks into your pre-commit setup.
STEFANIEMOLIN.COM • Shared by Stefanie Molin

Debugging With Trace and PYREPL_TRACE

Just how does one debug the tool one is using to find bugs? Python 3.13’s new REPL is implemented in Python and adding print statements means you get output in your output. This quick post talks about the environment variable PYREPL_TRACE and how to use it to capture debug information.
RODRIGO GIRÃO SERRÃO

Evolving Django’s auth.User

Carlton has some strong opinions on how Django manages usernames and custom users through auth.User and how the current solution is daunting to folks new to Django. This article dives into why the current approach might be problematic and what could be done.
CARLTON GIBSON

Please Don’t Hijack My Python Root Logger

Redowan keeps running into code that mucks with the root logger’s settings, which leaks into his own code. This post explains the problem and how to make sure you aren’t doing it in your own libraries.
REDOWAN DELOWAR

Polars Has a New Lightweight Plotting Backend

Polars 1.6 allows you to natively create beautiful plots without pandas, NumPy, or PyArrow. This is enabled by Narwhals, a lightweight compatibility layer between dataframe libraries.
POLA.RS • Shared by Marco Gorelli

Why I Still Use Python Virtual Environments in Docker

Hynek often gets challenged when he suggests the use of virtual environments within Docker containers, and this post explains why he still does.
HYNEK SCHLAWACK

Web Scraping With Scrapy and MongoDB

This tutorial covers how to write a Python web crawler using Scrapy to scrape and parse data, and then store the data in MongoDB.
REAL PYTHON

Escaping From Anaconda’s Stranglehold on macOS

Once you’ve got Anaconda on macOS, using any other Python can be problematic. This article walks you through escaping Anaconda.
PAUL ROMER

I Will F(l)ail at Your Tech Interviews

Frak talks about how technical interviews often have false negatives and how this impacts your organization.
FRAK LOPEZ

Projects & Code

microrabbit: Lightweight, Asynch Framework for RabbitMQ

GITHUB.COM/TONNOBELLOSNELLO

csv_trimming: Remove Common Ugliness From CSV Files

GITHUB.COM/LUCACAPPELLETTI94

Ibis: Dataframe API That Executes on Any Query Engine

IBIS-PROJECT.ORG

PyRoboCOP: Control Robotics With Algebra

GITHUB.COM/MERLRESEARCH

django-tables2: Create HTML Tables in Django

GITHUB.COM/JIETER

Events

Weekly Real Python Office Hours Q&A (Virtual)

September 11, 2024
REALPYTHON.COM

Python Atlanta

September 12 to September 13, 2024
MEETUP.COM

Python Sul 2024 (Brazil)

September 13 to September 16, 2024
PYTHON.ORG.BR

PyData Amsterdam 2024

September 18 to September 21, 2024
PYDATA.ORG

PyCon Latam 2024, Mazatlán, México

September 19 to September 22, 2024
PYLATAM.ORG • Shared by David Sol

PyCon India 2024

September 20 to September 24, 2024
PYCON.ORG

PyCon TW 2024

September 21 to September 23, 2024
PYCON.ORG

PythonCamp Rügen 2024

September 21 to September 23, 2024
BARCAMPS.EU

Happy Pythoning!
This was PyCoder’s Weekly Issue #646.

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

Categories: FLOSS Project Planets

Darren Oh: The Drupal Forge business model

Planet Drupal - Tue, 2024-09-10 15:13

The Drupal Forge business model

Drupal Forge is a non-profit project of the Drupal community. Our mission is to support vendors that devote a portion of their revenue to sustaining the software and infrastructure Drupal needs to be a great product. Our product launch buttons are part of a business model to sustain contribution. This is what makes them different from the launch buttons that hosting vendors offer on their own.

Darren Oh, Tue, 09/10/2024 - 15:13


Categories: FLOSS Project Planets

unifont @ Savannah: Unifont 16.0.01 Released

GNU Planet! - Tue, 2024-09-10 12:49

10 September 2024

Unifont 16.0.01 is now available. This is a major release.

From the NEWS file:

* Updates to synchronize Unifont with Unicode 16.0.0 release.

* Many new upper-plane Chinese ideographs added.

* New "make" build dependency on ImageMagick's "convert" program
    to build thumbnail images of the Unicode plane bitmaps.

* unifont-combining-$(VERSION).txt is now included in the
    distribution set to provide spacing information on all
    combining characters.

* Many other minor updates; see ChangeLog for details.

Download this release from GNU server mirrors at:

     https://ftpmirror.gnu.org/unifont/unifont-16.0.01/

or if that fails,

     https://ftp.gnu.org/gnu/unifont/unifont-16.0.01/

or, as a last resort,

     ftp://ftp.gnu.org/gnu/unifont/unifont-16.0.01/

These files are also available on the unifoundry.com website:

     https://unifoundry.com/pub/unifont/unifont-16.0.01/

Font files are in the subdirectory

     https://unifoundry.com/pub/unifont/unifont-16.0.01/font-builds/

A more detailed description of font changes is available at

      https://unifoundry.com/unifont/index.html

and of utility program changes at

      https://unifoundry.com/unifont/unifont-utilities.html

Enjoy!

Paul Hardy

Categories: FLOSS Project Planets

OpenUK Awards 2024

Planet KDE - Tue, 2024-09-10 10:28

https://openuk.uk/openuk-september-2024-newsletter-1/

https://www.linkedin.com/feed/update/urn:li:activity:7238138962253344769/

Our 5th annual Awards are open for nominations and our 2024 judges are waiting for your nominations! Hannah Foxwell, Jonathan Riddell, and Nicole Tandy will be selecting winners for 12 categories.

Nominations are now open until midnight UK, 8 September 2024. Our 5th Awards again celebrate the UK’s leadership and global collaboration in open technology!

Nominate now! https://openuk.uk/awards/openuk-awards-2024/

Up to 3 shortlisted nominees will be selected in each category by early October and each nominee will be given one place at the Oscars of Open Source, the black tie Awards Ceremony and Gala Dinner for our 5th Awards held at the House of Lords on 28 November, thanks to the sponsorship of Lord Wei.

Categories: FLOSS Project Planets

FSF Events: Free Software Directory meeting on IRC: Friday, September 13, starting at 12:00 EDT (16:00 UTC)

GNU Planet! - Tue, 2024-09-10 10:23

Join the FSF and friends on Friday, September 13 from 12:00 to 15:00 EDT (16:00 to 19:00 UTC) to help improve the Free Software Directory.

Categories: FLOSS Project Planets

ListenData: How to Integrate Gemini API with Python

Planet Python - Tue, 2024-09-10 10:13

In this tutorial, you will learn how to use Google's Gemini AI model through its API in Python.

Steps to Access Gemini API

Follow the steps below to access the Gemini API and then use it in Python.

Visit the Google AI Studio website.
Sign in using your Google account.
Create an API key.
Install the Google AI Python library for the Gemini API using the command below:
pip install google-generativeai
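
Once the library is installed, a first request might look like the minimal sketch below. It is not taken from the full article and rests on a few assumptions: the API key is stored in an environment variable called GEMINI_API_KEY (a name chosen here for illustration), and the model name "gemini-1.5-flash" is only an example, so swap in whichever model your account offers.

```python
import os


def load_api_key(env=None, var_name="GEMINI_API_KEY"):
    """Return the API key stored in an environment variable.

    GEMINI_API_KEY is a hypothetical variable name picked for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set {var_name} to the API key created in Google AI Studio"
        )
    return key


def ask_gemini(prompt, api_key, model_name="gemini-1.5-flash"):
    """Send one prompt to Gemini and return the text of the reply.

    The default model name is an assumption, not a recommendation.
    """
    # Imported here so load_api_key() works even before the package is installed.
    import google.generativeai as genai

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(model_name)
    response = model.generate_content(prompt)
    return response.text


if __name__ == "__main__":
    # Requires network access and a valid key in the environment.
    print(ask_gemini("Explain what an API is in one sentence.", load_api_key()))
```

Reading the key from the environment keeps it out of your source code, which matters if you share or commit the script.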

To read this article in full, please click here. This post appeared first on ListenData.

Categories: FLOSS Project Planets

