Feeds

The Drop Times: ECA is For Every Drupal Site Out There: Jürgen Haas

Planet Drupal - Wed, 2024-07-31 08:18
In an insightful interview with Alka Elizabeth, Jürgen Haas, Co-Founder of LakeDrops, delves into the development and impact of the ECA module, a tool for workflow automation in Drupal. Jürgen, an expert in technical solution architectures, discusses how the ECA module simplifies complex tasks through a no-code approach, making it accessible to developers and non-technical users. He also explores the module's potential future integration with Drupal core and its role in promoting sustainable, open-source practices within the digital community. This interview provides a comprehensive look at how the ECA module reshapes user experiences and sets new standards in the Drupal ecosystem.
Categories: FLOSS Project Planets

Real Python: Quiz: Python's Built-in Exceptions: A Walkthrough With Examples

Planet Python - Wed, 2024-07-31 08:00

In this quiz, you’ll test your understanding of the most commonly used built-in exceptions in Python.

Exception handling is a core topic in Python. Knowing how to use some of the most common built-in exceptions can help you to debug your code and handle your own exceptions.

Good luck!


Categories: FLOSS Project Planets

Russ Allbery: Review: The Book That Wouldn't Burn

Planet Debian - Tue, 2024-07-30 22:38

Review: The Book That Wouldn't Burn, by Mark Lawrence

Series: Library Trilogy #1
Publisher: Ace
Copyright: 2023
ISBN: 0-593-43793-4
Format: Kindle
Pages: 561

The Book That Wouldn't Burn is apparently high fantasy, but of the crunchy sort that could easily instead be science fiction. It is the first of a trilogy.

Livira is a young girl, named after a weed, who lives in a tiny settlement in the Dust. She is the sort of endlessly curious and irrepressible girl who can be more annoying than delightful to adults who are barely keeping everyone alive. Her settlement is not the sort of place that's large enough to have a name; only their well keeps them alive in the desert and the ever-present dust. There is a city somewhere relatively near, which Livira dreams of seeing, but people from the settlement don't go there.

When someone is spotted on the horizon approaching the settlement, it's the first time Livira has ever seen a stranger. It's also not a good sign. There's only one reason for someone to seek them out in the Dust: to take. Livira and the other children are, in short order, prisoners of the humanoid dog-like sabbers, being dragged off to an unknown fate.

Evar lives in the library and has for his entire life. Specifically, he lives in a square room two miles to a side, with a ceiling so high that it may as well be a stone sky. He lived there with his family before he was lost in the Mechanism. Years later, the Mechanism spit him out alongside four other similarly-lost kids, all from the same library in different times. None of them had apparently aged, but everyone else was dead. Now, years later, they live a strange and claustrophobic life with way too much social contact between way too few people.

Evar's siblings, as he considers them, were each in the Mechanism with a book. During their years in the Mechanism they absorbed that book until it became their focus and to some extent their personality. His brothers are an assassin, a psychologist, and a historian. His sister, the last to enter the Mechanism and a refugee from the sabber attack that killed everyone else, is a warrior. Evar... well, presumably he had a book, since that's how the Mechanism works. But he can't remember anything about it except the feeling that there was a woman.

Evar lives in a library in the sense that it's a room full of books, but those books are not on shelves. They're stacked in piles and massive columns, with no organizational system that any of them could discern. There are four doors, all of which are closed and apparently impenetrable. In front of one of them is a hundred yards of char and burned book remnants, but that door is just as impenetrable as the others. There is a pool in the center of the room, crops surrounding it, and two creatures they call the Soldier and the Assistant. That is the entirety of Evar's world.

As you might guess from the title, this book is about a library. Evar's perspective of the library is quite odd and unexplained until well into the book, and Livira's discovery of the library and subsequent explorations are central to her story, so I'm going to avoid going into too many details about its exact nature. What I will say is that I have read a lot of fantasy novels that are based around a library, but I don't think I've ever read one that was this satisfying.

I think the world of The Book That Wouldn't Burn is fantasy, in that there are fundamental aspects of this world that don't seem amenable to an explanation consistent with our laws of physics. It is, however, the type of fantasy with discoverable rules. Even better, it's the type of fantasy where discovering the rules is central to the story, for both the characters and the readers, and the rules are worth the effort. This is a world-building tour de force: one of the most engrossing and deeply satisfying slow revelations that I have read in a long time. This book is well over 500 pages, the plot never flags, new bits of understanding were still slotting into place in the last chapter, and there are lots of things I am desperately curious about that Lawrence left for the rest of the series. If you like puzzling out the history and rules of an invented world and you have anything close to my taste in characters and setting, you are going to love this book.

(Also, there is at least one C.S. Lewis homage that I will not spoil but that I thought was beautifully done and delightfully elaborated, and I am fairly sure there is a conversation happening between this book and Philip Pullman's His Dark Materials series that I didn't quite untangle but that I am intrigued by.)

I do need to offer a disclaimer: Livira is precisely the type of character I love reading about. She's stubborn, curious, courageous, persistent, egalitarian, insatiable, and extremely sharp. I have a particular soft spot for exactly this protagonist, so adjust the weight of my opinion accordingly. But Lawrence also makes excellent use of her as a spotlight to illuminate the world-building. More than anything else in the world, Livira wants to understand, and there is so much here to understand.

There is an explanation for nearly everything in this book, and those explanations usually both make sense and prompt more questions. This is such a tricky balance for the writer to pull off! A lot of world-building of this sort fails either by having the explanations not live up to the mysteries or by tying everything together so neatly that the stakes of the world collapse into a puzzle box. Lawrence avoids both failures. This world made sense to me but remained sufficiently messy to feel like humans were living in it. I also thought the pacing and timing were impeccable: I figured things out at roughly the same pace as the characters, and several twists and turns caught me entirely by surprise.

I do have one minor complaint and one caveat. The minor complaint is that I thought one critical aspect of the ending was a little bit too neat and closed. It was the one time in the book where I thought Lawrence simplified his plot structure rather than complicated it, and I didn't like the effect it had on the character dynamics. There is, thankfully, the promise of significant new complications in the next book.

The caveat is a bit harder to put my finger on, but a comparison to Alaya Dawn Johnson's The Library of Broken Worlds might help. That book was also about a library, featured a protagonist thrown into the deep end of complex world-building, and put discovery of the history and rules at the center of the story. I found the rules structure of The Book That Wouldn't Burn more satisfyingly complicated and layered, in a way that made puzzle pieces fit together in my head in a thoroughly enjoyable way. But Johnson's book is about very large questions of identity, history, sacrifice, and pain, and it's full of murky ambiguity and emotions that are only approached via metaphor and symbolism. Lawrence's book is far more accessible, but the emotional themes are shallower and more straightforward. There is a satisfying emotional through-line, and there are some larger issues at stake, but it won't challenge your sense of morality and justice the way that The Library of Broken Worlds might. I think which of those books one finds better will depend on what mood you're in and what reading experience you're looking for.

Personally, I was looking for a scrappy, indomitable character who would channel her anger into overcoming every obstacle in the way of thoroughly understanding her world, and that's exactly what I got. This was my most enjoyable reading experience of the year to date and the best book I've read since Some Desperate Glory. Fantastic stuff, highly recommended.

Followed by The Book That Broke the World, and the ending is a bit of a cliffhanger so you may want to have that on hand. Be warned that the third book in the series won't be published until 2025.

Rating: 9 out of 10

Categories: FLOSS Project Planets

Matthew Palmer: Health Industry Company Sues to Prevent Certificate Revocation

Planet Debian - Tue, 2024-07-30 20:00

It’s not often that a company is willing to make a sworn statement to a court about how its IT practices are incompatible with the needs of the Internet, but when they do… it’s popcorn time.

The Combatants

In the red corner, weighing in at… nah, I’m not going to do that schtick.

The plaintiff in the case is Alegeus Technologies, LLC, a Delaware Corporation that, according to their filings, “is a leading provider of a business-to-business, white-label funding and payment platform for healthcare carriers and third-party administrators to administer consumer-directed employee benefit programs”. Not being subject to the US’ bonkers health care system, I have only a passing familiarity with the sorts of things they do, but presumably it involves moving a lot of money around, which is sometimes important.

The defendant is DigiCert, a CA which, based on analysis I’ve done previously, is the second-largest issuer of WebPKI certificates by volume.

The History

According to a recently opened Mozilla CA bug, DigiCert found an issue in their “domain control validation” workflow, that meant it may have been possible for a miscreant to have certificates issued to them that they weren’t legitimately entitled to. Given that validating domain names is basically the “YOU HAD ONE JOB!” of a CA, this is a big deal.

The CA/Browser Forum Baseline Requirements (BRs) (which all CAs are required to adhere to, by virtue of their being included in various browser and OS trust stores), say that revocation is required within 24 hours when “[t]he CA obtains evidence that the validation of domain authorization or control for any Fully‐Qualified Domain Name or IP address in the Certificate should not be relied upon” (section 4.9.1.1, point 5).

DigiCert appears to have at least tried to do the right thing, by opening the above Mozilla bug giving some details of the problem, and notifying their customers that their certificates were going to be revoked. One may quibble about how fast they’re doing it, but they’re giving it a decent shot, at least.

A complicating factor in all this is that, only a touch over a month ago, Google Chrome announced the removal of another CA, Entrust, from its own trust store program, citing “a pattern of compliance failures, unmet improvement commitments, and the absence of tangible, measurable progress in response to publicly disclosed incident reports”. Many of these compliance failures were failures to revoke certificates in a timely manner. One imagines that DigiCert would not like to gain a reputation for tardy revocation, particularly at the moment.

The Legal Action

Now we come to Alegeus Technologies. They’ve opened a civil case whose first action is to request the issuance of a Temporary Restraining Order (TRO) that prevents DigiCert from revoking certificates issued to Alegeus (which the court has issued). This is a big deal, because TROs are legal instruments that, if not obeyed, constitute contempt of court (or something similar) – and courts do not like people who disregard their instructions. That means that, in the short term, those certificates aren’t getting revoked, despite the requirement imposed by root stores on DigiCert that the certificates must be revoked. DigiCert is in a real “rock / hard place” situation here: revoke and get punished by the courts, or don’t revoke and potentially (though almost certainly not, in the circumstances) face removal from trust stores (which would kill, or at least massively hurt, their business).

The reason that Alegeus gives for requesting the restraining order is that “[t]o Reissue and Reinstall the Security Certificates, Alegeus must work with and coordinate with its Clients, who are required to take steps to rectify the certificates. Alegeus has hundreds of such Clients. Alegeus is generally required by contract to give its clients much longer than 24 hours’ notice before executing such a change regarding certification.”

In the filing, Alegeus does acknowledge that “DigiCert is a voluntary member of the Certification Authority Browser Forum (CABF), which has bylaws stating that certificates with an issue in their domain validation must be revoked within 24 hours.” This is a misstatement of the facts, though. It is the BRs, not the CABF bylaws, that require revocation, and the BRs apply to all CAs that wish to be included in browser and OS trust stores, not just those that are members of the CABF. In any event, given that Alegeus was aware that DigiCert is required to revoke certificates within 24 hours, one wonders why Alegeus went ahead and signed agreements with their customers that required a lengthy notice period before changing certificates.

What complicates the situation is that there is apparently a Master Services Agreement (MSA) that states that it “constitutes the entire agreement between the parties” – and that MSA doesn’t mention certificate revocation anywhere relevant. That means that it’s not quite so cut-and-dried that DigiCert does, in fact, have the right to revoke those certificates. I’d expect a lot of “update to your Master Services Agreement” emails to be going out from DigiCert (and other CAs) in the near future to clarify this point.

Not being a lawyer, I can’t imagine which way this case might go, but there’s one thing we can be sure of: some lawyers are going to be able to afford that trip to a tropical paradise this year.

The Security Issues

The requirement for revocation within 24 hours is an important security control in the WebPKI ecosystem. If a certificate is misissued to a malicious party, or is otherwise compromised, it needs to be marked as untrustworthy as soon as possible. While revocation is far from perfect, it is the best tool we have.

In this court filing, Alegeus has claimed that they are unable to switch certificates with less than 24 hours notice (due to “contractual SLAs”). This is a pretty big problem, because there are lots of reasons why a certificate might need to be switched out Very Quickly. As a practical example, someone with access to the private key for your SSL certificate might decide to use it in a blog post. Letting that sort of problem linger for an extended period of time might end up being a Pretty Big Problem of its own. An organisation that cannot respond within hours to a compromised certificate is playing chicken with their security.

The Takeaways

Contractual obligations that require you to notify anyone else of a certificate (or private key) changing are bonkers, and completely antithetical to the needs of the WebPKI. If you have to have them, you’re going to want to start transitioning to a private PKI, wherein you can do whatever you darn well please with revocation (or not). As these sorts of problems keep happening, trust stores (and hence CAs) are going to crack down on this sort of thing, so you may as well move sooner rather than later.

If you are an organisation that uses WebPKI certificates, you’ve got to be able to deal with any kind of certificate revocation event within hours, not days. This basically boils down to automated issuance and lifecycle management, because having someone manually request and install certificates is terrible on many levels. There isn’t currently a completed standard for notifying subscribers if their certificates need premature renewal (say, due to needing to be revoked), but the ACME Renewal Information Extension is currently being developed to fill that need. Ask your CA if they’re tracking this standards development, and when they intend to have the extension available for use. (Pro-tip: if they say “we’ll start doing development when the RFC is published”, run for the hills; that’s not how responsible organisations work on the Internet).
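
As a concrete (if hypothetical) sketch of what polling ARI might look like, here is some Python using the requests and cryptography libraries. The directory key ("renewalInfo"), the certificate-ID construction, and the response shape follow my reading of the draft; your CA’s deployment may differ, so treat every name here as an assumption to verify.

    import base64

    import requests
    from cryptography import x509

    def b64url(data):
        return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

    def ari_cert_id(cert):
        # certID = base64url(AKI keyIdentifier) "." base64url(serial bytes);
        # the serial uses DER-style integer bytes, so a leading zero byte is
        # kept when the high bit is set.
        aki = cert.extensions.get_extension_for_class(
            x509.AuthorityKeyIdentifier
        ).value.key_identifier
        serial = cert.serial_number
        raw = serial.to_bytes((serial.bit_length() + 8) // 8, "big")
        return f"{b64url(aki)}.{b64url(raw)}"

    def suggested_renewal_window(directory_url, cert_pem):
        cert = x509.load_pem_x509_certificate(cert_pem)
        directory = requests.get(directory_url, timeout=10).json()
        ari_base = directory["renewalInfo"]      # assumed directory key
        resp = requests.get(f"{ari_base}/{ari_cert_id(cert)}", timeout=10)
        resp.raise_for_status()
        # e.g. {"suggestedWindow": {"start": "...", "end": "..."}}
        return resp.json()["suggestedWindow"]

A client that renews whenever “now” falls inside the suggested window would have picked up DigiCert’s revocation notice automatically, with no human in the loop.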

The Givings

If you’ve found this helpful, consider shouting me a refreshing beverage. Reading through legal filings is thirsty work!

Categories: FLOSS Project Planets

Reproducible Builds (diffoscope): diffoscope 273 released

Planet Debian - Tue, 2024-07-30 20:00

The diffoscope maintainers are pleased to announce the release of diffoscope version 273. This version includes the following changes:

[ Chris Lamb ]
* Factor out version detection in test_jpeg_image. (Re: reproducible-builds/diffoscope#384)
* Ensure that 'convert' is from Imagemagick 6.x; we will need to update a few things with IM7. (Closes: reproducible-builds/diffoscope#384)
* Correct import of identify_version after refactoring change in 037bdcbb0.

[ Mattia Rizzolo ]
* tests:
  + Add OpenSSH key test with an ed25519 key.
  + Skip the OpenSSH test with DSA key if openssh is >> 9.7.
  + Support ffmpeg >= 7, which adds some extra context to the diff.
* Do not ignore testing in gitlab-ci.
* debian:
  + Temporarily remove aapt, androguard and dexdump from the build/test dependencies as they are not available in testing/trixie. Closes: #1070416
  + Bump Standards-Version to 4.7.0, no changes needed.
  + Adjust options to make sure not to pack the python s-dist directory into the debian source package.
  + Adjust the lintian overrides.

You can find out more by visiting the project homepage.

Categories: FLOSS Project Planets

ImageX: AI Assistant, Real-Time Collaboration, and More: A Glimpse at CKEditor 5 Premium Features in Drupal

Planet Drupal - Tue, 2024-07-30 18:34

Authored by Nadiia Nykolaichuk.

CKEditor 5 has become a signature innovation in Drupal 10 and a symbol of cutting-edge content editing. As more Drupal websites upgrade, editorial teams can enjoy CKEditor 5’s new and vibrant design, where every detail is crafted for usability and efficiency.

Categories: FLOSS Project Planets

GPG Key Update

Planet KDE - Tue, 2024-07-30 18:00

Quick note that the yearly extend-the-expiry-of-my-GPG-key has happened early this year, as I realised that my GPG information on FreeBSD infrastructure was outdated. This doesn’t extend the Calamares signing-subkey (not yet) but does add new EC subkeys if that’s your kind of thing. The Calamares signing-subkey is valid until November 2024.

Get the exported public key block here.

Categories: FLOSS Project Planets

PyCoder’s Weekly: Issue #640 (July 30, 2024)

Planet Python - Tue, 2024-07-30 15:30

#640 – JULY 30, 2024

Build Captivating Display Tables in Python With Great Tables

Do you need help making data tables in Python look interesting and attractive? How can you create beautiful display-ready tables as easily as charts and graphs in Python? This week on the show, we speak with Richard Iannone and Michael Chow from Posit about the Great Tables Python library.
REAL PYTHON podcast

Overview of the Module itertools

This article picks the top three most useful iterators from the itertools module, classifies all 19 of its iterators into five categories, and then provides brief usage examples for every iterator in the module.
RODRIGO GIRÃO SERRÃO • Shared by Rodrigo Girão Serrão

Take a Free Course. It’s on us

Learn how to speed up Python programs on NVIDIA GPUs using Numba, a type-specializing just-in-time compiler. Join the NVIDIA Developer Program to take our ‘Fundamentals of Accelerated Computing with CUDA Python’ course for free →
NVIDIA sponsor

Asyncio Event Loop in Separate Thread

Typically, the asyncio event loop runs in the main thread, but as that is the one used by the interpreter, sometimes you want the event loop to run in a separate thread. This article talks about why and how to do just that.
JASON BROWNLEE
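
As a minimal sketch of the pattern (mine, not the article’s code): start the loop in a daemon thread, then hand it coroutines from the main thread with asyncio.run_coroutine_threadsafe().

    import asyncio
    import threading

    loop = asyncio.new_event_loop()
    threading.Thread(target=loop.run_forever, daemon=True).start()

    async def work():
        await asyncio.sleep(0.1)
        return "done"

    # Safe to call from the main thread; returns a concurrent.futures.Future.
    future = asyncio.run_coroutine_threadsafe(work(), loop)
    print(future.result())                # blocks until the coroutine finishes
    loop.call_soon_threadsafe(loop.stop)  # the only safe way to stop it from here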

Quiz: Python Type Checking

In this quiz, you’ll test your understanding of Python type checking. You’ll revisit concepts such as type annotations, type hints, adding static types to code, running a static type checker, and enforcing types at runtime. This knowledge will help you develop your code more efficiently.
REAL PYTHON

Quiz: Build a Blog Using Django, GraphQL, and Vue

In this quiz, you’ll test your understanding of building a Django blog back end and a Vue front end, using GraphQL to communicate between them. This will help you decouple your back end and front end, handle data persistence in the API, and display the data in a single-page app (SPA).
REAL PYTHON

PEP 751: A File Format to List Python Dependencies for Installation Reproducibility (New)

This PEP proposes a new file format for dependency specification to enable reproducible installation in a Python environment.
PYTHON.ORG

pytest 8.3 Released

PYTEST.ORG

Django 5.1 RC 1 Released

DJANGO SOFTWARE FOUNDATION

Discussions

Interesting Topics for an Advanced Python Lecture?

DISCUSSIONS ON PYTHON.ORG

Articles & Tutorials Wide Angle Lens Distortion Correction With Straight Lines

Discusses how to estimate and correct wide-angle lens distortion using straight lines in an image. It covers techniques like the Radon transform, Hough transform, and an iterative optimization algorithm to estimate the distortion parameters and undistort the image. The author also provides Python code to match the division-based undistortion model to the OpenCV distortion model.
HUGO HADFIELD

Testing Python Integration With an Azure Eventhub

Using an Azure EventHub with Python is pretty easy thanks to the Azure SDK for Python. However, ensuring that your code actually sends events into an event hub in a reliable and automated way can be a bit harder. This article demonstrates how you can achieve this thanks to asyncio, Docker, and pytest.
BENOÎT GODARD • Shared by Benoît Godard

Crunchy Bridge Integrates Postgres with DuckDB

Postgres excels in managing transactional databases. DuckDB offers fast performance for queries and data analysis. Integrating these two databases provides a hybrid solution leveraging the strengths of both transactional and analytical workloads.
CRUNCHY DATA sponsor

pandas GroupBy: Grouping Real World Data in Python

In this course, you’ll learn how to work adeptly with the pandas GroupBy while mastering ways to manipulate, transform, and summarize data. You’ll work with real-world datasets and chain GroupBy methods together to get data into an output that suits your needs.
REAL PYTHON course

10 Open-Source Tools for Optimizing Cloud Expenses

The cloud gets you scale, but it can also be complicated to price properly. This article covers ten different open source tools that you can use to optimize your deployment and understand the associated costs.
TARUN SINGH

Hugging Face Transformers: Open-Source AI With Python

As the AI boom continues, the Hugging Face platform stands out as the leading open-source model hub. In this tutorial, you’ll get hands-on experience with Hugging Face and the Transformers library in Python.
REAL PYTHON

Tanda Runner: A Personalized Running Dashboard

This post talks about a new dashboard tool for visualizing your Strava running data and getting personalized recommendations for your next big race. It is built using Django and includes an LLM integration.
DUARTE O.CARMO

You Don’t Have to Guess to Estimate

“There are roughly three senses of ‘estimate.’ One is ‘a prediction of how much something will cost.’ One is ‘a guess.’ But another definition is a rough calculation.”
NAT BENNETT

Using else in a Comprehension

While list comprehensions in Python don’t support the else keyword directly, conditional expressions can be embedded within a list comprehension, as the short example below shows.
TREY HUNNER
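
A quick illustration (mine, not from the linked article):

    # The else belongs to the conditional expression before the "for":
    signed = [x if x % 2 == 0 else -x for x in range(5)]   # [0, -1, 2, -3, 4]

    # Filtering uses a trailing "if" after the "for" and takes no "else":
    evens = [x for x in range(5) if x % 2 == 0]            # [0, 2, 4]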

TIL: Difference Between __getattr__ and __getattribute__

A quick post on the difference between __getattr__ and __getattribute__.
RODRIGO GIRÃO SERRÃO
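
The short version, as a toy example of my own: __getattribute__ intercepts every attribute lookup, while __getattr__ only runs when the normal lookup fails.

    class Demo:
        exists = 1

        def __getattribute__(self, name):
            print(f"__getattribute__({name!r})")
            return super().__getattribute__(name)

        def __getattr__(self, name):
            print(f"__getattr__({name!r})")
            return None

    d = Demo()
    d.exists    # only __getattribute__ runs
    d.missing   # __getattribute__ raises AttributeError, then __getattr__ runs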

Projects & Code taipy: Turns Data Into a Web App

GITHUB.COM/AVAIGA

posting: The Modern API Client That Lives in Your Terminal

GITHUB.COM/DARRENBURNS

django-sql-explorer: Share Data With SQL Queries

GITHUB.COM/EXPLORERHQ

pyxel: A Retro Game Engine for Python

GITHUB.COM/KITAO

Herbie: Retrieve Weather Prediction Data

GITHUB.COM/BLAYLOCKBK • Shared by Brian Blaylock

Maelstrom: A Clustered Test Runner for Python and Rust

GITHUB.COM/MAELSTROM-SOFTWARE • Shared by Neal Fachan

Events

Weekly Real Python Office Hours Q&A (Virtual)

July 31, 2024
REALPYTHON.COM

Canberra Python Meetup

August 1, 2024
MEETUP.COM

Sydney Python User Group (SyPy)

August 1, 2024
SYPY.ORG

Django Girls Ecuador 2024

August 3, 2024
OPENLAB.EC

Melbourne Python Users Group, Australia

August 5, 2024
J.MP

STL Python

August 8, 2024
MEETUP.COM

Happy Pythoning!
This was PyCoder’s Weekly Issue #640.


Categories: FLOSS Project Planets

Making sense of font selection

Planet KDE - Tue, 2024-07-30 10:54

It’s been a while since my last blog post regarding text. Since then I’ve been working on the on-canvas text tool, as well as multiple reworks of rich text editing and the text properties docker that drives it, and finally I gave a talk at the Libre Graphics Meeting about my work on the text tool.

I’m now at the point that I’m going over each property and thoroughly polish it. Because I’m also doing frequent updates on the krita-artists forum, I’m hoping to punctuate each polish session with an introduction to the property, and because I also have a lot of technical things to talk about, I’ll be making technical blog posts alongside that, of which this will be the first.

So the first thing that needed to be tackled after putting together the basic text properties docker and the related interaction is font selection. Krita’s text tool is based on SVG+CSS, and uses FontConfig to select fonts. Typically, a font selection widget will show the list of fonts, and in some cases, it organises this in two dropdowns, where the first is the font family, and the second a sub family, like italic or bold. So obviously there’s meta data for this, right, and you should just plug that in the right places, and everything’s peachy? Well, we do have a lot of meta data…

Family Relations

For digital fonts, OpenType (in both ttf and otf flavours) is the most common format. For formats older than it, the family relations are usually limited to regular, italic, bold and bold-italic (‘RIBBI’), but OpenType also allows for weight/width/slant (‘WWS’) organisation, or even a completely arbitrary organisation under a single typographic family. All at once, too, because not all programs have the same font selection features. These are stored in OpenType names 1, 2, 16, 17 and 21, 22. You can model their relationship as a tree, as in the following example, where we have a single typographic family with a sans and a serif, both of which are WWS families, and each has a variety of RIBBI subfamilies, some of them (semibold) being a single font:

  • Typographic family (ids 16, 17)
    • Sans (WWS family, ids 21, 22)
      • Regular (RIBBI, ids 1 and 2)
        • Regular
        • Italic
        • Bold
        • Bold italic
      • Condensed (width variant)
        • Regular
        • Italic
        • Bold
        • Bold italic
      • Semi-bold
        • Regular
    • Serif
      • Regular
        • Regular
        • Italic
        • Etc…

This is of course not only stored in the names, it is also stored in the OS/2.fsSelection flags, and for WWS, there’s width and weight data in the OS/2 table. However for typographic family, there’s no way to identify separate WWS families besides the WWS name being present (besides a bit flag in fsSelection, which indicates there’s only one WWS family, but this too cannot be relied on). Furthermore, variable fonts don’t have subfamilies, but rather “axes”, and perhaps some “instances”, which are internal presets of those axes.

And that’s not all: fonts are only required to carry these names when they are not sufficiently described without them, so the default font of a given font family only needs names 1 and 2 to be present, the semibold only names 1, 2, 16 and 17, and so on.

FontConfig is somewhat built to handle this, the default ordering (undocumented, of course) of the font family names being WWS, Typographic and finally the RIBBI family name. However, the WWS family name is quite recent, meaning there are many fonts that only have a typographic and a RIBBI name, despite having a difference in, say, optical size data, or being one of those layer typefaces.

This works because many of these font selector widgets don’t select a family, but rather, they present a bit of ordering for you to select a font and finally store a specific identifier to that font, like the PostScript name, in the text object. Because we’re using CSS however, we store the font family, and specify the weight, width and slant. This has its benefits, as when a font is absent, we can still infer the intention that something was to be set bold or italic. But that does require that the font can be selected by family at all, so if FontConfig cannot associate a WWS family, that is kind of a problem.

Finally, some fonts have a STAT table, which gives even more info about those axes (if a variable font) and allows non-variable families to describe their relations in an axis-like manner. There’s no API to retrieve this info in HarfBuzz, however; FontConfig knows nothing about it either, and even the CSS Working Group hasn’t made any statements on whether to interpret the STAT table at all. Mind you, even with an API for the STAT table, too many fonts don’t have it, so it is not a solution in itself.

Family Reunion

So, the best way to go about this is to sort the font families. This will require opening each font file up with FreeType or Harfbuzz, and retrieving the data you need, as fontconfig doesn’t store everything we need to identify the different families.

For this purpose, I created a struct to put the data in, and organized the structs inside KisForest, which is a templated container class that allows storing data in a tree, and provides a bunch of iterators to traverse said tree. This allows me to create a top level node (‘typographic family’ node) for each font as I find them, and then sort fonts into those. Then afterwards, go over each node again and sort them into individual WWS families, as WWS family names are in fact kind of rare, and the majority of fonts that need them don’t have them.

The second sort is done by going over each toplevel typographic node and then taking all the children. Of the children, you first select all “regular” fonts (the ones closest to width: 100%, weight: 400 and no italic or slant), adding those first, each with their own WWS family, and then sort the rest into those. Care will need to be taken for fonts that have different optical sizes identified as well (there’s, of course, four ways this can be stored: OS/2 optical size range; size OpenType tag; ‘opsz’ variable axis and STAT table axis), as well as keeping track of situations where multiple formats of the same font are installed (the Nimbus family on many Linux distributions is often installed as an OpenType font as well as two separate PostScript fonts; I’m currently sorting those into the same font family). For bitmap fonts, you want to sort the separate pixel sizes into the RIBBI family, depending on how you interpret bitmap pixel size.
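
As a rough Python sketch of that grouping (my own illustration; Krita’s actual implementation is C++ built around KisForest, and the FontInfo fields here are placeholder assumptions):

    from dataclasses import dataclass

    @dataclass
    class FontInfo:
        typographic_family: str    # OpenType name 16, falling back to name 1
        weight: int                # OS/2 usWeightClass, 400 = regular
        width: int                 # width as a percentage, 100 = normal
        italic: bool

    def regularity(font):
        # Distance from the canonical "Regular": weight 400, width 100%, upright.
        return (abs(font.weight - 400) + abs(font.width - 100)
                + (1000 if font.italic else 0))

    def group_fonts(fonts):
        # Pass 1: bucket every face by its typographic family name.
        families = {}
        for font in fonts:
            families.setdefault(font.typographic_family, []).append(font)
        # Pass 2: inside each bucket, the faces closest to "Regular" come
        # first and seed the WWS subfamilies; the remaining faces would then
        # be matched against the nearest seed (elided here).
        for members in families.values():
            members.sort(key=regularity)
        return families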

Once that’s done, the CSS font matching algorithm needs to be implemented. CSS is explicitly vague about what it means by a font family (this whole blog post has assumed up till now that if a given subfamily cannot be selected with CSS parameters, it needs to be in a separate WWS family), but it does specify that any localized names should be matched (localized names are rare and only really used for CJK fonts, but they do exist). So in practice, you end up testing all the possible names, that is, OpenType ids 1, 16, and 21, in the order of the lowest child node to the parent (because remember, the most default version of a given family only has id 1, so you want to test that first). Then comes the actual style testing, the algorithm of which is more or less the same along width, weight and slant, with weight being special for having a default range to test first, while slant needs to be multiplied by -1 first, which it needed anyhow to cohere with the specs (CSS dictates that positive slant should skew to the right, while the OpenType spec requires negative slant to skew to the right).
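
Here is the weight half of that style testing as a rough Python sketch, based on my reading of the CSS matching rules rather than Krita’s actual code:

    def best_weight(desired, available):
        # CSS gives desired weights between 400 and 500 a special search
        # order: up to 500 first, then below the target, then above 500.
        if desired in available:
            return desired
        below = sorted((w for w in available if w < desired), reverse=True)
        above = sorted(w for w in available if w > desired)
        if 400 <= desired <= 500:
            order = ([w for w in above if w <= 500] + below
                     + [w for w in above if w > 500])
        elif desired < 400:
            order = below + above   # lighter weights win for light targets
        else:
            order = above + below   # heavier weights win for heavy targets
        return order[0] if order else None

    best_weight(400, [300, 500, 700])   # 500: the 400-500 range is tried first
    best_weight(300, [200, 400])        # 200: below-target first for light weights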

After all of that, the filenames that rolled out of matching can be added to the FontConfig search pattern to prioritize them in the regular fallback search, for which I am very thankful.

While nowadays an example like this would be best off using a color font, there are many examples of older fonts that are meant to be used layered (as in, two text shapes overlapped with the same text but different subfonts). This particular font, Sweetie Summer, predates any discussion about WWS, only having a typographic family and a RIBBI family, which led it to be unselectable with FontConfig.

Presentation

But getting the matching to work for odd fonts wasn’t the only thing that was necessary. Fonts also needed to be displayed nicely. Even more, I really wanted to integrate them with Krita’s resource system, as that would allow tagging and search. These are important usability features as modern operating systems come with hundreds of fonts, so being able to tag fonts with “cursive” and “display” and then filter on that can make it much easier to find an appropriate font. Not all design programs have this feature, which has led a number of designers to use a so-called font manager, which effectively allows installing and deinstalling fonts by user-defined group (KDE Plasma even has one of these built in, and I’d be surprised if there wasn’t one for Gnome somewhere). Inkscape has quite recently introduced font collections, whose purpose is similar, and given we spent 2 years reworking our resource system, which can do exactly this, it made sense to try and get this system working.

There are some quibbles, however: the vast majority of resources within Krita are tied to a file, while such a font family resource is an abstraction over a collection of files. This results in problems with getting a preview generated, as well as with updating an entry between restarts of Krita.

Then there’s selecting the style. This one is a bit abstract, so bear with me. As explained before, CSS selects the font file to use by using the user-defined font-family and a set of parameters (width, weight, slant, etc). This has both the benefit of having a certain intent (whether the text is condensed, or the weight is set heavy), as well as being a good abstraction that encompasses both regular font families and variable fonts.

This abstraction is implemented as each font family resource having a set of axes (for variable fonts these are the axes in the font, for non-variable, these are an accumulation of the different parameters associated with the subfamilies), and styles which form a sort of preset for those given parameters (encompassing the instances of variable fonts, and the actual subfamilies in non-variable fonts). This way, you can have fonts that use the OS/2.fsSelection bitflags for indicating bold and italic, you can have fonts that use the OS/2 table values, you can have fonts that have variable axes, and all these will have the same toggles in the same place. If in the future the STAT table is going to be read, the extra info for that will easily be integrated in the same system.

There’s some more toggles than the WWS parameters though, for example, toggles for synthesized slant and bold. Some people think that nowadays, these are not necessary, but that’s really only true for scripts with few glyphs. In fact, we had to implement synthesized slant and bold because CJK users were missing them dearly in the new text layout. On the flip side, in European typesetting, synthesized versions are considered ‘dangerous’ as it can be hard to tell if the correct bold version was selected, so a toggle to turn it off is required. This needed some extra work with variable fonts, as active slant and italic axes are not testable in the usual manner. There’s also optical size, though this is only supported for variable fonts that have an ‘opsz’ axis, as the CSS Working Group doesn’t seem to have an opinion on the other three ways optical size can be indicated. Finally there’s the remaining axes, in case of a variable font with extra custom axes.

So, this sounds like it works, right? Where’s the issue with styles? Well, some might say that the split between a font family and the style is unnecessary, and that it would be much better to just see all the styles at once. In fact, Inkscape has been implementing this recently. Which means I’ll be asked why I didn’t implement that, because obviously I should.

The main problem here is a philosophical one. The reason Inkscape is implementing this is because its user base wants this UI, and the reason the user base wants this UI is because other software they’re using has this UI. So far, so good.

However, other software has this UI because it has a different kind of text layout system. As noted before, the font selector in these programs is just a fancy way of selecting a specific font file. Within programming, saying “select font file XYZ” is considered an ‘imperative’ way of programming. This is quite common in WYSIWYG (what you see is what you get) editors, as it’s easier to program. Markup-based methods like CSS instead have a ‘declarative’ way of programming: “Select from this font family a font with these parameters, and otherwise these other font families”. A system like this usually tries to infer properties, which is harder to program, but also means fewer properties need to be set.

The philosophical difference here is that I don’t think it is wise to try to abstract away the underlying data structure, because I think it leads to bugs that are kind of hard to articulate if you don’t know the UI is not representing the underlying structure properly. This is also why it is weird to see people go “Well, UI should be completely decoupled from the business logic”, because even if you programmatically decouple UI and data, the fact remains that the UI can only do what the data structure allows. The data structure by itself belies a workflow, which in the case of markup-based typesetting in general is one that focuses on consistency, and on storing only a small modification when you want to make a small modification.

The importance of the underlying data structure is something that I always feel is missing from discussions about the UI of Free Open Source Software, which is why I am emphasizing it now. The main idea of “listening to your users” is not bad, and even for my own work I did talk to other artists on KA to try to get a feel of what artists using Krita prioritize. But text in particular is also far more tricky than this, because the main reason both the Inkscape folks and us went with an SVG+CSS based text layout is because it is a specification that is widely used (just not in graphics programs…) and has a lot of thought put into multi-lingual and non-European text handling (which is still quite rare today). And I think it is going to cause trouble if you try to apply UI conventions that belong to a different kind of text layout.

The main issue I foresee with this approach is that there’s no mind paid to font family fallback. In an online situation, font family fallback is mainly about what to do when a font family cannot be found. In a local situation like a graphics program, it is mostly useful in multi-script situations. Many fonts only have glyphs for a small subset of Unicode, often limited to a single script with supplementary punctuation and numbers. So in multi-script situations the CSS font matching algorithm requires you to check if a glyph can be represented, and if not, you must check other fonts on the system till you find one with the given glyph. Font family fallback allows you to have some control over this mechanism.
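
A toy sketch of that per-glyph fallback loop, where the charmap attribute is a hypothetical stand-in for the font’s actual character coverage:

    def pick_font(char, family_list, system_fonts):
        # Walk the author-specified families first, then the system fonts,
        # until one actually covers the requested character.
        for font in list(family_list) + list(system_fonts):
            if ord(char) in font.charmap:   # hypothetical coverage set
                return font
        return None  # nothing covers it: the renderer draws .notdef ("tofu")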

The font family list allows us to control the fallback. Many fonts only have glyphs for a subset of unicode, so controlling fallback can allow us to select fonts that seem to be in a similar tradition, like using a Serif Latin font for a Naskh Arabic font. Not all scripts have similar traditions, so control over the font fallback is also useful in selecting a font that may not fit within the same tradition, but might look good in terms of contrast, so the Latin text, in this case, stands out less.

Another thing that’s kind of difficult here is that it hides the fluidness of variable fonts. Where before we could treat instances as a sort of preset for the parameters, they are now presented as whole fonts to select. To further explain, one thing I’m fully expecting to happen for Krita is that we receive a bug report with “I turned off synthesis for weight, but still Krita is showing something for a weight value that doesn’t correspond to a style”, and I’ll have to reply with “Yes, that’s because you’re using a variable font”, expand on that, and then close the bug as RESOLVED, NOTABUG. By focusing too much on the styles as individual fonts to select, we’re inhibiting people from updating their mental model of how fonts can work.

Visuals

Of course, because there’s so much variation in what the different fonts can do, it is necessary to indicate the capabilities of a font. Many font selectors have at the very least an icon for the font type. Because there’s been a flurry of activity within OpenType in the last decade, there’s now also variable fonts to keep an eye on, and four (five?) different ways to do color font representation, and those can all be present at once. Krita only really supports the Bitmap and ClrV0 implementations, so we need to indicate which color font data is present.

Other than that, the font name should be present, as well as a preview for the given font. We could technically do the latter straight up with the text layout, by putting a KoSvgTextShape inside a QQuickPaintedItem, but I worry that might be slow on some systems. Instead, I’m laying out the text and converting the result to regular paths, storing the SVG, and then painting the path from that within a QQuickPaintedItem. The sample text chosen uses FontConfig’s supported-languages list, though I am wondering if we’re not better off testing CharMap support instead, so as to ensure we will always have some glyphs from the font available. Anyway, I’m quite pleased with the result, as it allows us to display the sample nice and sharp.

One thing that is also tricky is the localization. Basically all user-facing strings within OpenType can have localized variants within the font, and if those are present they should be displayed. In practice this means that these localized strings get stored in the KoResource as well. The models for tracking the styles and axes receive a ‘setLocales’ function, so that when the font name is requested within QML, the text label will receive the localized name. However, with the resource model this isn’t feasible, as the font family resource is the only one that holds localized names. Thankfully, a QVariantHash is treated as a javascript object/dict within QML, so the localized names could be stored into the metadata QVariantMap (note that QML does not support converting a QVariantHash to a dict!) and then tested against the KLocalizedString::languages() stringlist (though care must be taken to ensure the underscore is replaced with a dash).

Eventually, we’ll prolly need to do the same with the writing system samples. However, that should probably use the language the text shape is tagged with, as it is very common for multi-lingual artists to keep the software in English, so they won’t have to guess at a translation when doing an internet search for a given label. So in those cases where you’d need a different sample (like an Arabic sample if you’re typesetting Arabic), it is probably combined with a different language being set on the text. Mind, there’s no language selector yet, because SVG+CSS uses BCP47 language tags, and I haven’t figured out how to capture the richness of those tags in a discovery-friendly UI. Being able to limit the visible fonts based on whether FontConfig thinks they support the active language would also be useful in the same vein.

FontConfig Rescan Interval

When discussing the architecture of how to implement this, it was mentioned that FontConfig has a rescan mechanism. This is basically FontConfig checking if any changes have happened, and if so, updating the font list; the default on most Linux systems for this is 30 seconds. I think most programs just turn this off, but the person I was talking with went “oh, yes, this is how you need to implement this”, which was a little confusing because our resource system doesn’t actually support refreshing resources during a session. I ended up implementing what they asked of me, as a show of good faith, but there are multiple refresh problems with it (because our resource system was not built to handle this). It will probably be disabled in the end, unless the refresh problems turn out to be trivial to tackle.

Postamble

A returning theme in handling fonts, OpenType fonts in particular, is that there are at least 3-5 ways of doing one common thing. This is not unusual, and often happens when people need a certain function to be part of a specification so badly that there’s no time to standardize. Because of this complexity though, implementing a good font selector still took about a month. At the same time, I’m happy I spent time on this, because it would otherwise hang like a thunderstorm over the whole text project, as it would only be a matter of time before we’d drown in bug reports about this-or-that font not being selectable. That is still going to happen, but at least there’s a framework to deal with edge cases now.

The next topic I’ll be tackling is probably going to be font size and line height.

Appendix

TTF vs OTF

So an OpenType font should be in a file called the Open Type Format (otf), right? Then why are most of them in the True Type Format (ttf)? Aren’t these two the same?

ttf and otf are the same format, yes. The original spec was called TrueType, and the glyphs were outlined with quadratic bezier curves. One of the things that was added when the spec became OpenType was that the glyphs could now also be outlined in CFF (compact font format, PostScript, basically), which uses cubic bezier curves. Since then, a font stored in a ttf file is an OpenType font with quadratic bezier curves, and a font stored in an otf file is one with cubic bezier curves. I am unsure whether this difference isn’t purely conventional since the introduction of variable fonts, however.

Italics and Obliques

Because this blog post is aimed at readers who are probably not typography (or lettering/calligraphy) nerds, let’s speed through the history of Latin script to explain the difference between Italics and Obliques and why they’re sometimes confused:

The history of Latin script is basically the existence of a formal style of writing (a ‘ductus’) and, because clerics need to write a lot, the development of a less formal style that’s easier to write. That one then formalizes, and then a new style is developed. So if we start with Roman Square Capitals (like Trajan), it is followed by a variety of less formal styles like Uncial and Rustic Capitals. Around the Middle Ages, however, the formal ductus and less formal ductus are unified into one system of capital (‘upper case’, also called ‘majuscule’) and minuscule (‘lower case’) letters. However, clerics needed a faster way of writing, so a given blackletter style would often be developed into a chancery or court style.

Fast forward to the Renaissance. Italians are making their first printing fonts. For reasons I don’t want to get into, they choose Roman Square Capitals for the capitals of their typefaces and combine those with Carolingian minuscules. But they want more typographic richness, so the popular Italic chancery hand is turned into a printing font as well. There are some notable differences between Carolingian minuscules and Italian minuscules; in particular, the ‘a’ and ‘g’ are written differently, as is the ‘f’:

Then, much later, in the nineteenth century, Sans-serif fonts were introduced, which find their origin in sign-painting. For some reason, these fonts don’t use an Italic ductus for their corresponding slanted style; in fact, the slanted style is just that: slanted (an ‘Oblique’). My best guess is that this is because this font style comes from sign painting and thus is optimized for legibility, and frequently uses an Italic ‘g’ combined with a Carolingian ‘a’ for this purpose. So they may have considered a simple slant much more distinguishable than trying to create a full Italic-ductus-compatible variant.

By the time computers get involved, a slanted version of the font is considered paramount, as many academic style guides require Italics to indicate quotes and book names. By this time, Oblique variants were considered acceptable, which is why ‘synthesized italic’ just ends up being a digitally slanted version of the original. Fonts that combine both are rare, but they exist, so OpenType specifies an extra bit to indicate that a font is specifically an oblique version, but it seems that this didn’t catch on. Even now, with OpenType variable fonts and the STAT table making it much easier to define a slanted version, there’s still hesitance to use it as nobody knows which software supports selecting it.

Furthermore, many other writing systems use various script styles to indicate quotes and such, yet, these are not recognized as ‘Italics’ by computer software. There’s therefore something very arbitrary about Italics by themselves: There’s nothing stopping a font designer from creating a black letter or copperplate style for their font family, except how to handle the font files so that software can select these.

Slant

Slant, within variable OpenType fonts, is a different toggle from Italics, and it can go either way; similarly, there is such a thing as an ‘upright Italic’. So that begs the question: why are Italics usually slanted? This has to do with calligraphy, in particular, the reach of the right hand.

Because the right hand has a reach that goes from top-right to bottom-left, it is likely to skew vertical lines to the right.

If you write with a left hand the same as with a right hand, you get the same effect, but flipped.

For lefties like me, to do right-slanting calligraphy we need to either write with our hand over the line, or rotate the paper.

Now, for European scripts, this is usually for italics, but sometimes a slant doesn’t express a calligraphic quirk, but rather a feeling of forward motion. Which is why for right-to-left scripts, you will sometimes see a left-leaning font being used.

Categories: FLOSS Project Planets

Real Python: Simulate a Text File in Python

Planet Python - Tue, 2024-07-30 10:00

Testing applications that read files from a disk can be challenging. Issues such as machine dependencies, special access requirements, and slow performance often arise when you need to read text from a file.

In this Code Conversation with instructor Martin Breuss, you’ll discover how to simplify this process by simulating text files with StringIO from the io module in Python’s standard library.

In this video course, you’ll learn how to:

  • Use io.StringIO to simulate a text file on disk
  • Perform file operations on an io.StringIO object
  • Decide when to use io.StringIO and when to avoid it
  • Understand possible alternatives
  • Mock a file object using unittest.mock

Understanding how to simulate text file objects and mock file objects can help streamline your testing strategy and development process.
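
For example, a function written against file objects can be handed an io.StringIO instead; a minimal sketch in the spirit of the course, not taken from it:

    import io

    def count_lines(file_obj):
        # Works the same whether file_obj came from open() or io.StringIO.
        return sum(1 for _ in file_obj)

    fake_file = io.StringIO("first line\nsecond line\nthird line\n")
    assert count_lines(fake_file) == 3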


Categories: FLOSS Project Planets

The Drop Times: Thoughts on Drupal Starshot #2: A New Direction for Drupal

Planet Drupal - Tue, 2024-07-30 09:53
In the latest installment of the "Thoughts on Starshot" series, The DropTimes captures the diverse perspectives of the Drupal community on the eagerly anticipated Drupal Starshot initiative. This article delves into the reflections, hopes, and insights shared by key community members, including Kristen Pol, Murray Woodman, Nicolas Loye, Martin Anderson-Clutz, and Tim Hestenes Lehnen. As the community eagerly awaits the release of Drupal Starshot, this series highlights the collective excitement and anticipation, offering a glimpse into how Starshot is expected to shape the future of Drupal and its broader community.
Categories: FLOSS Project Planets

Matt Glaman: Trial experience for Starshot update

Planet Drupal - Tue, 2024-07-30 09:18

Earlier this month, I debuted a way to try out Drupal core and the Starshot prototype running in the browser using WebAssembly. It started as a passion project and fascination with new web technologies, something I had tried a year before but didn't find a fit for. Now, it's officially part of a Starshot initiative track.

Trial experience for Starshot track

I am the lead for the Trial experience for Starshot track. The track has three phases:

Categories: FLOSS Project Planets

LabPlot funded through NGIO Core Fund

Planet KDE - Tue, 2024-07-30 08:22

This year we applied to NLnet’s NGI Zero Core open call for proposals in February 2024. After a thorough review by the NLnet Foundation, LabPlot’s application was accepted and will be funded by the NGI0 Core Fund, a fund established by NLnet with financial support from the European Commission’s Next Generation Internet Program, under the aegis of DG Communications Networks, Content and Technology, under Grant Agreement No. 101092990.

NGI Zero Core is a grant program focused on free and open source projects that deliver free and open technologies to society with full transparency and user empowerment. See the full list of projects funded this year.

As part of this funding, the LabPlot team will mainly work on the following three features:

  • Analysis on live data where we want to enable the already existing analysis functions (FFT, smooth, etc.) on live data
  • Python Scripting which will allow users to leverage LabPlot’s C++ API via python bindings in external applications and also allow controlling LabPlot from within the running instance of the application
  • Statistical Analysis with the plan to implement the most relevant statistical hypothesis tests and correlation coefficients that are frequently used in the statistics community.

We’re not starting from scratch in these areas. For all of these topics, we have already received numerous requests and suggestions from users in the past, and there have already been discussions within the team. We have also defined some concrete tasks for what we want to achieve (see for example the planned roadmap for statistical analysis) and we have even done some proof-of-concept implementations. With the financial support, the team will now focus more on these issues, complete the development and release these important features to our users in the near future.

We would like to express our thanks to the NLnet Foundation and to the European Commission for their support!

Categories: FLOSS Project Planets

Marcos Dione: Writing a tile server in python

Planet Python - Tue, 2024-07-30 05:02

Another dictated post1, but heavily edited. Buyer beware.

I developed a tileset based on OpenStreetMap data and style plus elevation information, but I don't have a render server. What I have been doing is using my own version of an old script from the mapnik version of the OSM style. This script is called generate_tiles, and I made big modifications to it; now it's capable of doing many things, including spawning several processes for handling the rendering. You can define regions that you want to render, or you can just provide a bbox, a set of tiles, or just coordinates. You can change the size of the metatile, and it handles empty tiles. If you find a sea tile, most probably you will not need to render its children9, where children are the four tiles that are just under it in the next zoom level. For instance, in zoom level zero we have only one tile (0,0,0), and its children are (1,0,0), (1,0,1), (1,1,0) and (1,1,1). 75% of the planet's surface is water, and with the Mercator projection and the Antarctic Ocean, the percentage of tiles could be bigger, so this optimization cuts a lot of useless rendering time.
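As a quick illustration (a sketch, not the actual generate_tiles code), the four children of a tile can be computed like this:

```python
def children(z, x, y):
    """The four tiles under (z, x, y) in the next zoom level."""
    return [(z + 1, 2 * x + dx, 2 * y + dy)
            for dx in (0, 1) for dy in (0, 1)]

# children(0, 0, 0) == [(1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)]
```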

Another optimization is that it assumes that when you render zoom level N, you will be using at least the same data for zoom level N+1. Of course, I am not caching that data myself, because mapnik does not allow this, but the operating system does the caching. So if you have enough RAM, then you should be able to reuse all the data that's already in buffers and cache, instead of having to fetch it again from disk. This in theory should accelerate rendering, and it probably does10.

The script works very well, and I've been using it for years already for rendering tiles in batches for several zoom levels. Because my personal computer is way more powerful than my server (and younger; 2018 vs 2011), I render in my computer and rsync to my server.

So now I wanted to make a tile server based on this. Why do I want to make my own and not use renderd? I think my main issue with renderd is that it does not store the individual tiles, but keeps metatiles of 8x8 tiles and serves the individual tiles from there. This saves inode usage and internal fragmentation. Since my main usage so far has been (and probably will continue to be) rendering regions by hand, and since my current (static) tile server stores all the latest versions of the tiles I have rendered since I started doing this some 15 years ago, I want updating the server to be fast. Most tile storage methods I know fail terribly at update time (see here); most of the time it means sending the whole file over the wire. Also, individual tiles are easier to convert to anything else, like creating an MBTiles file, pushing it to my phone, and having an offline tile service I can carry with me on treks where there is no signal. Also, serving the tiles can be as easy as python -m http.server from the tileset root directory. So renderd is not useful for me. Another reason is, well, I already have the rendering engine working. So how does it work?

The rendering engine consists of one main thread, which I call Master, and rendering threads3. These rendering threads load the style and wait for work to do. The current style file is 6MiB+ and takes mapnik 4s+ to load and to generate all its structures, which means these threads have to be created once per service lifetime. I have one queue that sends commands from the Master to the renderer pool asking for a metatile to be rendered, which is faster than rendering the individual tiles. Then one of the rendering threads picks the request from this queue, calls mapnik, generates the metatile, cuts it into the subtiles and saves them to disk. The rendering thread posts in another queue, telling the Master about the children metatiles that must be rendered, which due to emptiness can be between 0 and 4.
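In outline, the queue plumbing looks something like this sketch; the process count, the queue size, and the sleep standing in for style loading are all illustrative:

```python
import multiprocessing as mp
import time

def renderer(work_queue, info_queue):
    # Done once per process lifetime: with the real 6MiB+ style,
    # loading takes mapnik several seconds.
    time.sleep(0.1)  # stand-in for loading the mapnik style
    while True:
        metatile = work_queue.get()  # blocks until Master sends work
        # ... call mapnik, cut the metatile into subtiles, save them ...
        children = []  # the non-empty children metatiles would go here
        info_queue.put((metatile, children))

if __name__ == "__main__":
    work_queue, info_queue = mp.Queue(maxsize=4), mp.Queue()
    pool = [mp.Process(target=renderer, args=(work_queue, info_queue),
                       daemon=True)
            for _ in range(4)]
    for p in pool:
        p.start()
```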

To implement the caching optimization I mentioned before, I use a third structure to maintain a stack. At the beginning I push into it the initial work; later I pop one element from it, and when a renderer returns the list of children to be rendered, I push them on top of the rest. This is what tries to guarantee that a metatile's children will be rendered before moving to another region that would trash the cache. And because the renderers can inspect the tiles being written, they can figure out when a child is all sea tiles and avoid returning it for rendering.
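The resulting loop is essentially a depth-first traversal of the tile pyramid; a sketch, with render() as a stand-in for the round trip through the queues:

```python
def render(metatile):
    """Stand-in: send a metatile to the pool and collect the children
    it reports back; sea-only children are never returned."""
    return []

stack = [(0, 0, 0)]                 # start from the single zoom-0 tile
while stack:
    metatile = stack.pop()          # depth-first: children are rendered
    stack.extend(render(metatile))  # right after their parent, while its
                                    # data is still hot in the OS cache
```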

At the beginning I thought that, because the multiprocessing queues are implemented with pipes, I could use select()4 to see whether the queue was ready for writing or reading and use a typical non-blocking loop. When you're trying to write, these queues will block when the queue is full, and when you're trying to read, they will block when the queue is empty. But these two conditions, full and empty, are actually handled by semaphores, not by the size of the pipe. That means that even if I could reach all the way down into the structures of the multiprocessing.Queue and add its pipes to a selector, it would only half work: the read end will correctly not be selected while the queue is empty (nothing to read), but the write end can still be selected even when the queue is full, since availability of space in the pipe does not mean the queue is not full.

So instead I'm peeking into these queues. For the work queue, I know that the Master thread8 is the only writer, so I can peek to see if it is full. If it is, I am not going to send any new work to be done, because it means that all the renderers are busy, and the only work queued to be done has not been picked up yet. For the reading side it's the same: Master is the only reader, so I can peek to see if it's empty, and if it is, I am not going to try to read any information from it. So, I have a loop, peeking first into the work queue and then into the info queue. If nothing has been done, I sleep a fraction of a second.
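A sketch of that loop body (handle() is a hypothetical stand-in for the bookkeeping; the name single_step comes up again in footnote 8):

```python
import time

def single_step(work_queue, info_queue, pending):
    # Master is the only writer of work_queue and the only reader of
    # info_queue, so peeking with full()/empty() is safe enough here,
    # even though those methods are documented as approximate.
    progressed = False
    if pending and not work_queue.full():
        work_queue.put(pending.pop())
        progressed = True
    if not info_queue.empty():
        handle(info_queue.get())  # hypothetical: record finished metatiles
        progressed = True
    if not progressed:
        time.sleep(0.01)  # nothing to do; don't spin
```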

Now let's try to think about how to replace this main loop with a web frontend. What is the web frontend going to do? It's going to be getting queries from different clients. It could be just a slippy map in a web page, so we have a browser as a client, or it could be any of the applications that can also render slippy maps. For instance, on Linux, we have marble; on Android, I use MyTrails and OsmAnd.

One of the things about these clients is that they have timeouts. Why am I mentioning this? Because rendering a metatile for me can take between 3 and 120 seconds, depending on the zoom level. There are zoom levels that are really, really expensive, like between 7 and 10. If a client is going to be asking a rendering service directly for a tile, and the tile takes too long to render, the client will time out and close the connection. How do we handle this on the server side? Well, instead of the work stack, the server will have a request queue, which will be collecting the requests from the clients, and the Master will be sending these requests to the render pool.

So if the client closes the connection, I want to be able to react to that, removing any lingering requests made by that client from the request queue. If I don't do that, the request queue will start piling up more and more requests, creating a denial of service. This is not possible with multiprocessing queues; you cannot remove an element. The only container that can do that is a deque5, which is also optimized for putting and popping things from both ends (it's probably implemented using a circular buffer), which is perfect for a queue. As for the info queue, I will not be caring anymore about children metatiles, because I will not be doing any work that the clients are not requesting.
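For example, with collections.deque the cleanup on disconnect could look like this (the shape of the request tuples is illustrative):

```python
from collections import deque

pending_requests = deque()  # pending (client, metatile) requests

def client_disconnected(client):
    # Unlike multiprocessing queues, a deque lets us drop queued work
    # for a client that went away, instead of letting it pile up.
    for item in [r for r in pending_requests if r[0] is client]:
        pending_requests.remove(item)
```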

What framework that would allow me to do this? Let's recap the requirements:

  • Results are computed, and take several seconds.
  • The library that generates the results is not async, nor thread safe, so I need to use subprocesses to achieve parallelization.
  • A current batch implementation uses 2 queues to send and retrieve computations to a pool of subprocesses; my idea is to "just" add a web frontend to this.
  • Each subprocess spends some seconds warming up, so I can't spawn a new process for each request.
  • Since I will have a queue of requested computations, if a client dies while its query is being processed, I let it finish; if the query is still waiting, I should remove it from the queue.

I started with FastAPI, but it doesn't have the support that I need. At first I just implemented a tile server; the idea was to grow from there6, but reading the docs it only allows doing long running async stuff after the response has been sent.

Next was Flask. Flask is not async unless you want to use sendfile(). sendfile() is a way to make the kernel read a file and write it directly to a socket without intervention from the process requesting that. The alternative is to open the file, read a block, write it on the socket, repeat. This definitely makes your code more complex; you have to handle lots of cases. So sendfile() is very, very handy, and it's also faster because it's zero-copy. But Flask does not give you control over what happens when the client suddenly closes the connection. I can instruct it to cancel the tasks in flight, but as per all the previous explanation, that's not what I want.
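For reference, a minimal sketch of zero-copy sending with Python's os.sendfile (Unix-only; the function name send_tile is made up):

```python
import os
import socket

def send_tile(conn: socket.socket, path: str) -> None:
    # The kernel moves the bytes from the file straight to the socket;
    # no read/write loop, no copies through userspace buffers.
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        os.sendfile(conn.fileno(), f.fileno(), 0, size)
```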

This same problem seems to affect all the async frameworks I looked into: asyncio, aiohttp, tornado. Except, of course, twisted, but its API for that is callback-based, and TBH, I was starting to get tired of all this, and the prospect of callback hell, even when all the rest of the system could be developed in a more async way, was too much. And this is not counting the fact that I need to hook into the main loop to step the Master. This could be implemented with timed callbacks, such as twisted's callLater(), but another thought started to form in my head.

Why did I go directly for frameworks? Because they're supposed to make our lives easier, but from the beginning I had the impression that this would not be a run of the mill service. The main issue came down to being able to send things to render, return the rendered data to the right clients, associate several clients to a single job before it finishes (more than one client might request the same tile or several tiles that belong to the same metatile), and handle client and job cancellation when clients disappear. The more frameworks' documentation I read, the more I started to fear that the only solution was to implement a non-blocking12 loop myself.

I gotta be honest, I dusted off an old Unix Network Programming book, 2nd Ed., 1998 (!!!), read half a chapter, and I was ready to do it. And thanks to the simple selector API, it's a breeze (a rough sketch follows the list):

  1. Create a listening socket.
  2. Register it for read events (connections).
  3. On connection, accept the client and wait for read events in that one too.
  4. We don't register for write at this point because the socket is almost always writable before we have anything to send, which would lead to tight loops.
  5. On client read, read the request and send the job to Master. Unregister for read.
  6. But if there's nothing to read, the client disconnected. Send an empty response, unregister for read and register for write.
  7. Step Master.
  8. If anything came back, generate the responses and queue them for sending. Register the right clients for write.
  9. On client write (almost always), send the response and the file with sendfile() if any.
  10. Then close the connection and unregister.
  11. Loop to #3.
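Putting those steps together, a minimal sketch with Python's selectors module might look like this; queue_request(), send_response() and step_master() are hypothetical stand-ins for the Master plumbing described above, and the disconnect branch is simplified (the real flow queues an empty response first):

```python
import selectors
import socket

sel = selectors.DefaultSelector()
listener = socket.create_server(("", 8080))
listener.setblocking(False)
sel.register(listener, selectors.EVENT_READ)          # steps 1-2

while True:
    for key, events in sel.select(timeout=0.05):
        sock = key.fileobj
        if sock is listener:                          # step 3: new client
            client, _ = sock.accept()
            client.setblocking(False)
            sel.register(client, selectors.EVENT_READ)
        elif events & selectors.EVENT_READ:           # steps 5-6
            data = sock.recv(4096)   # assumes the whole query in one recv()
            sel.unregister(sock)
            if data:
                queue_request(sock, data)             # hand the job to Master
            else:                                     # client disconnected
                sock.close()
        elif events & selectors.EVENT_WRITE:          # steps 9-10
            send_response(sock)      # headers, then the tile via sendfile()
            sel.unregister(sock)
            sock.close()
    # Steps 7-8: step Master; for finished jobs it registers the right
    # clients with sel for EVENT_WRITE.
    step_master(sel)
```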

Initially all this, including reimplementing fake Master and render threads, took less than 200 lines of code, some 11h of on-and-off work. Now that I have finished, I have a better idea of how to implement this at least with twisted, which I think I will have to do, since step 5 assumes the whole query can be recv()'ed in one go, and step 9 similarly for send()'ing; luckily I don't need to do any handholding for sendfile(), even when the socket is non-blocking. A more production ready service needs to handle short reads and writes. Also, the HTTP/1.1 protocol all clients are using allows me to assume that once a query is received, the client will be waiting for an answer before trying anything else, and that I can close the connection once a response has been sent and assume the client will open a new connection for more tiles. And even then, supporting keep-alive should not be that hard (instead of closing the client, unregister for write, register for read, and only do the close dance when the response is empty). And because I can simply step Master in the main loop, I don't have to worry about blocking queues.

Of course, now it's more complex, because it's implementing support for multiple clients with different queries requiring rendering the same metatile. This is because applications will open several connections for fetching tiles when showing a region, and unless it's only 4 tiles that happen to fall at the corners of 4 adjacent metatiles, they will always mean more than one client per metatile. Also, I could have several clients looking at the same region. The current code is approaching the 500 lines, but all that should also be present in any other implementation.

I'm pretty happy about how fast I could make it work and how easy it was. Soon I'll finish integrating a real render thread that saves the tiles, and implement the logic that if one of a metatile's tiles is not present we can assume that's OK, but if all of them are missing, I have to find out whether they were all empty or never rendered. A last step would be how to make all this testable. And of course, the twisted port.

  1. This is getting out of hand. The audio was 1h long, not sure how long it took to auto transcribe, and when editing and thinking I was getting to the end of it, the preview told me I still had like half the text to go through. 

  2. No idea what I wanted to write here :) 

  3. Because mapnik is not thread safe and because of the GIL, they're actually subprocesses via the multiprocessing module, but I'll keep calling them threads to simplify. 

  4. Again, a simplification. Python provides the selector module that allows using abstract implementations that spare us from having to select the best implementation for the platform. 

  5. I just found out it's pronounced like 'deck'. 

  6. All the implementations I did followed the same pattern. In fact, right now, I haven't implemented the rendering part of the tile server: it just blockingly sleep()s for some time (up to 75s, to trigger client timeouts), and then returns the tiles already present. What's currently missing is figuring out whether I should rerender or use the tiles already present7, and actually connecting the rendering part. 

  7. Two reasons to rerender: the data is stale, or the style has changed. The latter requires reloading the styles, which will probably mean rebuilding the rendering threads. 

  8. I keep calling this the Master thread, but at this point instead of having its own main loop, I'm just calling a function that implements the body of such loop. Following previous usage for such functions, it's called single_step(). 

  9. Except when you start rendering ferry routes. 

  10. I never measured it :( 

  11. Seems like nikola renumbers the footnotes based on which order they are here at the bottom of the source. The first note was 0, but it renumbered it and all the rest to start counting from 1. 

  12. Note that I'm explicitly distinguishing between a non-blocking/select() loop and an async/await system, but keep in mind that the latter is actually implemented with the former. 

Categories: FLOSS Project Planets

Python Bytes: #394 Python is easy now?

Planet Python - Tue, 2024-07-30 04:00
Topics covered in this episode:

  • Python is easy now (https://rdrn.me/postmodern-python/)
  • Trying out free-threaded Python on macOS (https://til.simonwillison.net/python/trying-free-threaded-python)
  • Module itertools overview (https://mathspp.com/blog/module-itertools-overview)
  • uptime-kuma (https://github.com/louislam/uptime-kuma)
  • Extras
  • Joke

Watch on YouTube: https://www.youtube.com/watch?v=6v7VLgfhZ5o

About the show

Sponsored by ScoutAPM: pythonbytes.fm/scout

Connect with the hosts

  • Michael: @mkennedy@fosstodon.org
  • Brian: @brianokken@fosstodon.org
  • Show: @pythonbytes@fosstodon.org

Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Tuesdays at 10am PT. Older video versions available there too.

Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form, add your name and email to our friends of the show list (https://pythonbytes.fm/friends-of-the-show); we'll never share it.

Brian #1: Python is easy now

  • or Postmodern Python
  • or Beyond Hypermodern
  • Chris Ardene
  • Mostly a cool review of using rye for setup, linting, typing, testing, documentation, and CI/CD
  • Also a nice discussion of how to deal with a monorepo for Python projects

Michael #2: Trying out free-threaded Python on macOS

  • via pycoders
  • How to install free-threaded Python the easy way
  • Testing the CPU-bound work speedups for FT Python

Brian #3: Module itertools overview

  • Rodrigo
  • 20 tools that every Python developer should be aware of, in 5 categories: reshaping, filtering, combinatorial, infinite, and iterators that complement other tools
  • Things I forgot about: chain, pairwise, zip_longest, tee

Michael #4: uptime-kuma

  • A fancy self-hosted monitoring tool
  • Features: uptime monitoring for HTTP(s) / TCP / HTTP(s) keyword / HTTP(s) JSON query / ping / DNS record / push / Steam game server / Docker containers; a fancy, reactive, fast UI/UX; notifications via Telegram, Discord, Gotify, Slack, Pushover, Email (SMTP), and 90+ other notification services; 20-second intervals; multiple languages; multiple status pages; mapping status pages to specific domains; ping chart; certificate info; proxy support; 2FA support

Extras

Brian:

  • Still working on a new pytest course. Hoping to get it released soon-ish.

Michael:

  • Open source Switzerland (https://x.com/kennethreitz42/status/1815881034334126539)
  • spyoungtech/FreeSimpleGUI, an actively maintained fork of the last release of PySimpleGUI (https://mastodon.social/@ffalcon31415/112852910444032717)

Joke: Java vs. JavaScript (https://devhumor.com/media/java-amp-javascript)
Categories: FLOSS Project Planets

drunomics: Green UX

Planet Drupal - Tue, 2024-07-30 03:44
Explore the profound impact of Green UX, which redefines digital experiences by placing sustainability at their core. Delve into how deliberate design decisions not only improve performance and accessibility but also reduce our environmental impact in the digital realm. Embrace the opportunity to contribute towards a more sustainable and user-friendly future online.
Categories: FLOSS Project Planets

Kay Hayen: Nuitka Release 2.4

Planet Python - Tue, 2024-07-30 03:09

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler, “download now”.

This release largely contains bug fixes for the previous changes, but also finishes full compatibility with the match statements of 3.10, something that was long overdue since there were always some incompatible behaviors there.

In terms of bug fixes, it’s also huge. An upgrade is required, especially for new setuptools that made compiled programs segfault at startup.

Table of Contents

Bug Fixes
  • UI: Fix, we had reversed disable / force and wrong option name recommendation for --windows-console-mode when the user used old-style options.

  • Python3.10+: Fix, must not check for len greater than or equal to 0 for sequence match cases. That is unnecessary and incompatible, and can raise exceptions with custom sequences not implementing __len__. Fixed in 2.3.1 already.

  • Python3.10+: Fix, match sequence with final star arguments failed in some cases to capture the rest. The assigned value then was empty when it shouldn't have been. Fixed in 2.3.1 already.

  • Python3.8+: Fix, calls to variable args functions now need to be done differently, or else they can crash, as was observed with 3.10 in PGO instrumentation, at least. Fixed in 2.3.1 already.

  • PGO: Fix, using nuitka-run did not execute the program created as expected. Fixed in 2.3.1 already.

  • Linux: Support extension modules used as DLLs by other DLLs or extension modules. That makes newer tensorflow and potentially more packages work again. Fixed in 2.3.1 already.

  • Python3.10+: Match classes were not fully compatible.

    We need to check against the case-defined class __match_args__, not the matched value type's __match_args__, which is not necessarily the same.

    Also, properly annotating the exception exit of subscript matches; the subscript value can indeed raise an exception.

    Collect keyword and positional match values in one go and detect duplicate attributes used, which we previously did not.

  • Scons: Fix, do not crash when clang is not reporting its version correctly. It happened if Clang usage was required with --clang option but not installed. Fixed in 2.3.2 already.

  • Debian: Fix, detecting the Debian flavor of Python was not working anymore, and as a result, the intended defaults were no longer applied by Nuitka, leading to incorrect suggestions that didn’t work. Fixed in 2.3.3 already.

  • Ubuntu: Fix, the static link library for Python 3.12 is not usable unless we provide parts of HACL for the sha2 module so as not to cause link errors. Fixed in 2.3.3 already.

  • Standalone: Fix, importing newer pkg_resources was crashing. Fixed in 2.3.3 already.

  • Python3.11+: Added support for newer Python with dill-compat. Fixed in 2.3.4 already.

  • Standalone: Support locating Windows icons for pywebview. Fixed in 2.3.4 already.

  • Standalone: Added support for spacy related packages. Fixed in 2.3.4 already.

  • Python3.12: Fix, our workaround for cv2 support cannot use the imp module anymore. Fixed in 2.3.4 already.

  • Compatibility: Added support for __init__ files that are extension modules. Architecture checks for macOS were false negatives for them, and the case insensitive import scan failed to find them on Windows. Fixed in 2.3.4 already.

  • Standalone: Added missing dependencies for standard library extension modules, mainly exhibited on macOS. Fixed in 2.3.4 already.

  • Windows: Fix build failures on mapped network drives. Fixed in 2.3.4 already.

  • Python3.12: Fix, need to set frame prev_inst or else f_lasti is random. Some packages, for example PySide6, use this to check what bytecode calls them or how they import them, and it could crash when attempting it. Fixed in 2.3.6 already.

  • Fix, fork bomb in cpuinfo package no longer happens. Fixed in 2.3.8 already.

  • Nuitka-Python: Fix, cannot ask for shared library prefixes. Fixed in 2.3.8 already.

  • Standalone: Make sure keras package dependency for tensorflow is visible. Fixed in 2.3.10 already.

  • Linux: Fix, for static executables we should ignore errors setting a DLL load path. Fixed in 2.3.10 already.

  • Compatibility: Fix, nuitka resource readers also need to have .parent attribute. Fixed in 2.3.10 already.

  • Fix, need to force no-locale language outputs for tools outputs on non-Windows. Our previous methods were not forcing enough.

    For non-Windows this makes Nuitka work on systems with locales active for message outputs only. Fixed in 2.3.10 already.

  • Fix, was not using proper result value for SET_ATTRIBUTE to check success in a few corner cases. Fixed in 2.3.10 already.

  • Windows: Retry deleting dist and build folders, allowing users to recognize still running programs and not crashing on Anti-Virus software still locking parts of them.

  • Fix, dict.fromkeys didn’t give compatible error messages for no args given.

  • Fix, output correct unsupported exception messages for in-place operations

    For in-place **, it was also incompatible, since it must not mention the pow function.

  • Fix, included metadata could lead to unstable code generation. We were using a dictionary for it, but that does not have a stable enough order for the C compiler to fully benefit.

  • Fix, including data files for packages that are extension modules was not working yet.

  • macOS: Detect the DLL path of libpython (if used) by looking at dependencies of the running Python binary rather than encoding what CPython does. Doing that covers other Python flavors as well.

  • Fix, need to prefer extension modules over Python code for packages.

  • Fix, immutable constant values are not to be treated as very trusted.

  • Python3: Fix, the __loader__ attribute of a module should be an object and not only the class, otherwise only static methods can work.

  • Python3: Added .name and .path attributes to Nuitka loader objects for enhanced compatibility with code that expects source code loaders.

  • Fix, the sys.argv[0] needs to be absolute for best usability.

    For dirname(sys.argv[0]) to be usable even if the program is launched via PATH environment by a shell, we cannot rely on how we are launched since that won’t be a good path, unlike with Python interpreter, where it always is.

  • Standalone: Fix, adding missing dependencies for some crypto packages.

  • Python3.12: Need to write to thread local variable during import. This however doesn’t work for Windows and non-static libpython flavors in general.

  • macOS: Enforce using system codesign as the Anaconda one is not working for us.

  • Fix, we need to read .pyi files as source code. Otherwise unicode characters can cause crashes.

  • Standalone: Fix, some packages query private values for distribution objects, so use the same attribute name for the path.

  • Multidist: Make sure to follow the multidist reformulation modules. Otherwise in accelerated mode, these could end up not being included.

  • Fix, need to hold a reference of the iterable while converting it to list.

  • Plugins: Fix, this wasn’t properly ignoring None values in load descriptions as intended.

  • macOS: Need to allow DLLs from all Homebrew paths.

  • Reports: Do not crash during report writing for very early errors.

  • Python3.11+: Fix, need to make sure we have split as a constant value when using exception groups.

  • Debian: More robust against problematic distribution folders with no metadata, these apparently can happen with OS upgrades.

  • Fix, was leaking exception in case of --python-flag=-m mode that could cause errors.

  • Compatibility: Close standard file handles on process forks as CPython does. This should enhance things for compilations using attach on Windows.

Package Support
  • Standalone: Added data file for older bokeh version. Fixed in 2.3.1 already.

  • Standalone: Support older pandas versions as well.

  • Standalone: Added data files for panel package.

  • Standalone: Added support for the newer kivy version and added macOS support as well. Fixed in 2.3.4 already.

  • Standalone: Include all kivy.uix packages with kivy, so their typical config driven usage is not too hard.

  • Standalone: Added implicit dependencies of lxml.sax module. Fixed in 2.3.4 already.

  • Standalone: Added implicit dependencies for zeroconf package. Fixed in 2.3.4 already.

  • Standalone: Added support for numpy version 2. Fixed in 2.3.7 already.

  • Standalone: More complete support for tables package. Fixed in 2.3.8 already.

  • Standalone: Added implicit dependencies for scipy.signal package. Fixed in 2.3.8 already.

  • Standalone: Added support for moviepy and imageio_ffmpeg packages. Fixed in 2.3.8 already.

  • Standalone: Added support for newer scipy. Fixed in 2.3.10 already.

  • Standalone: Added data files for bpy package. For full support more work will be needed.

  • Standalone: Added support for nes_py and gym_tetris packages.

  • Standalone: Added support for dash and plotly.

  • Standalone: Added support for usb1 package.

  • Standalone: Added support for azure.cognitiveservices.speech package.

  • Standalone: Added implicit dependencies for tinycudann package.

  • Standalone: Added support for newer win32com.server.register.

  • Standalone: Added support for jaxtyping package.

  • Standalone: Added support for open3d package.

  • Standalone: Added workaround for torch submodule import function.

  • Standalone: Added support for newer paddleocr.

New Features
  • Experimental support for Python 3.13 beta 3. We try to follow its release cycle closely and aim to support it at the time of CPython release. We also detect no-GIL Python and can make use of it. The GIL status is output in the --version format and the GIL usage is available as a new {GIL} variable for project options.

  • Scons: Added experimental option --experimental=force-system-scons to enforce system Scons to be used. That allows for the non-use of inline copy, which can be interesting for experiments with newer Scons releases. Added in 2.3.2 already.

  • Debugging: A new non-deployment handler helps when segmentation faults occurred. The crashing program then outputs a message pointing to a page with helpful information unless the deployment mode is active.

  • Begin merging changes for WASI support. Parts of the C changes were merged and for other parts, command line option --target=wasi was added, and we are starting to address cross platform compilation for it. More work will be necessary to fully merge it; right now it doesn't work at all yet.

  • PGO: Added support for using it in standalone mode as well, so once we use it more, it will immediately be practical.

  • Make the --list-package-dlls use plugins as well, and make delvewheel announce its DLL path internally, too. Listing DLLs for packages using plugins can use these paths for more complete outputs.

  • Plugins: The no-qt plugin is now usable in accelerated mode.

  • Reports: Added included metadata and reasons for it.

  • Standalone: Added support for spacy with a new plugin.

  • Compatibility: Use existing source files as if they were .pyi files for extension modules. That gives us dependencies for code that installs source code and extension modules.

  • Plugins: Make version information, onefile mode, and onefile cached mode indication available in Nuitka Package Configuration, too.

  • Onefile: Warn about using tendo.singleton in non-cached onefile mode.

    Tendo uses the running binary name for locking by default. So it’s not going to work if that changes for each execution, make the user aware of that, so they can use cached mode instead.

  • Reports: Include the micro pass counts and tracing merge statistics so we can see the impact of new optimization.

  • Plugins: Allow to specify modes in the Nuitka Package Configuration for annotations, doc_strings, and asserts. These overrule global configuration, which is often not practical. Some modules may require annotations, but for other packages, we will know they are fine without them. Simply disabling annotations globally barely works. For some modules, removing annotations can give a 30% compile-time speedup.

  • Standalone: Added module configuration for Django to find commands and load its engine.

  • Allow negative values for --jobs to be relative to the system core count so that you can tell Nuitka to use all but two cores with --jobs=-2 and need not hardcode your current core count.

  • Python3.12: Annotate libraries that are currently not supported

    We will need to provide our own Python3.12 variant to make them work.

  • Python3.11+: Catch calls to uncompiled function objects with compiled code objects. We now raise a RuntimeError in the bytecode making it easier to catch them rather than segfaulting.

Optimization
  • Statically optimize constant subscripts of variables with immutable constant values.

  • Forward propagate very trusted values for variable references enabling a lot more optimization.

  • Python3.8+: Calls of C functions are faster and more compact code using vector calls, too.

  • Python3.10+: Mark our compiled types as immutable.

  • Python3.12: Constant returning functions are dealing with immortal values only. Makes their usage slightly faster since no reference count handling is needed.

  • Python3.10+: Faster attribute descriptor lookups. Have our own replacement of PyDesc_IsData that had become an API call, making it very slow on Windows specifically.

  • Avoid using Python API function for determining sequence sizes when getting a length size for list creations.

  • Data Composer: More compact and portable Python3 int (Python2 long) value representation.

    Rather than fixed native length 8 or 4 bytes, we use variable length encoding which for small values uses only a single byte.

    This also avoids using struct.pack with C types, as we might be doing cross platform, so this makes part of the WASI changes unnecessary at the same time.

    Large values are also more compact because middle 31-bit portions can be less than 4 bytes and save space on average.

  • Data Composer: Store bytecode blob size more efficient and portable, too.

  • Prepare having knowledge of __prepare__ result to be dictionaries per compile time decisions.

  • Added more hard trust for the typing module.

    The typing.Text is a constant too. In debug mode, we now check all exports of typing for constant values. This will allow to find missing values sooner in the future.

    Added the other types to be known to exist. That should help scalability for typing-intensive code somewhat by removing error handling for them.

  • macOS: Should use static libpython with Anaconda as it works there too, and reduces issues with Python3.12 and extension module imports.

  • Standalone: Statically optimize by OS in sysconfig.

    Consequently, standalone distributions can exclude OS-specific packages such as _aix_support and _osx_support.

  • Avoid changing code names for complex call helpers

    The numbering of complex call helpers, as normally applied to all functions, caused this issue. When part of the code is used from the bytecode cache, the helpers never come to exist, and the C code of modules using them then didn't match.

    This avoids an extra C re-compilation for some modules that were using renumbered functions the second time a compilation happens. Added in 2.3.10 already.

  • Avoid using C-API when creating __path__ value.

  • Faster indentation of generated code.

Anti-Bloat
  • Add new pydoc bloat mode to trigger warnings when using it.

  • Recognize usage of numpy.distutils as setuptools bloat for more direct reporting.

  • Avoid compiling large opcua modules that generate huge C files much like asyncua package. Added in 2.3.1 already.

  • Avoid shiboken2 and shiboken6 modules from matplotlib package when the no-qt plugin is used. Added in 2.3.6 already.

  • Changes for not using pydoc and distutils in numpy version 2. Added in 2.3.7 already.

  • Avoid numpy and packaging dependencies from PIL package.

  • Avoid using webbrowser module from pydoc.

  • Avoid using unittest in keras package. Added in 2.3.1 already.

  • Avoid distutils from _osx_support (used by sysconfig) module on macOS.

  • Avoid using pydoc for werkzeug package. Fixed in 2.3.10 already.

  • Avoid using pydoc for site module. Fixed in 2.3.10 already.

  • Avoid pydoc from xmlrpc.server. Fixed in 2.3.10 already.

  • Added no_docstrings support for numpy2 as well. Fixed in 2.3.10 already.

  • Avoid pydoc in joblib.memory.

  • Avoid setuptools in gsplat package.

  • Avoid dask and jax in scipy package.

  • Avoid using matplotlib for networkx package.

Organizational
  • Python3.12: Added annotations of official support for Nuitka PyPI package and test runner options that were still missing. Fixed in 2.3.1 already.

  • UI: Change runner scripts. The nuitka3 is no more. Instead, we have nuitka2 where it applies. Also, we now use CMD files rather than batch files.

  • UI: Check filenames for data files for illegal paths on the respective platforms. Some user errors with data file options become more apparent this way.

  • UI: Check spec paths more for illegal paths as well. Also do not accept system paths like {TEMP} and no path separator after it.

  • UI: Handle report writing interrupt with CTRL-C more gracefully. No need to present this as a general problem; rather, inform the user that they triggered it.

  • NoGIL: Warn if using a no-GIL Python version, as this mode is not yet officially supported by Nuitka.

  • Added badges to the README.rst of Nuitka to display package support and more. Added in 2.3.1 already.

  • UI: Use the retry decorator when removing directories in general. It will be more thorough with properly annotated retries on Windows. For the dist folder, mention the running program as a probable cause.

  • Quality: Check replacements and replacements_plain Nuitka package configuration values.

  • Quality: Catch backslashes in paths provided in Nuitka Package Configuration values for dest_path, relative_path, dirs, raw_dirs and empty_dirs.

  • Debugging: Disable pagination in gdb with the --debugger option.

  • PGO: Warn if the PGO binary does not run successfully.

  • UI: The new console mode option is a Windows-specific option now; move it to that group.

  • UI: Detect “rye python” on macOS. Added in 2.3.8 already.

  • UI: Be forgiving about release candidates; Ubuntu shipped one in an LTS release. Changed in 2.3.8 already.

  • Debugging: Allow fine-grained debug control for immortal checks

    Can use --no-debug-immortal-assumptions to allow for corrupted immortal objects, which might be done by non-Nuitka code and then break the debug mode.

  • UI: Avoid leaking compile time Nuitka environment variables to the child processes.

    They were primarily visible with --run, but we should avoid it for everything.

    For non-Windows, we now recognize if we are the exact re-execution and otherwise, reject them.

  • Watch: Delete the existing virtualenv in case of errors updating or upgrading it.

  • Watch: Keep track of Nuitka compiled program exit code in newly added result files, too.

  • Watch: Redo compilations in case of previous errors when executing the compile program.

  • Quality: Wasn’t detecting files to ignore for PyLint on Windows properly, also detect crashes of PyLint.

Tests
  • Added test to cover the dill-compat plugin.

  • macOS: Make actual use of ctypes in its standalone test to ensure correctness on that OS, too.

  • Make compile extension module test work on macOS, too.

  • Avoid using 2to3 in our tests since newer Python no longer contains it by default, we split up tests with mixed contents into two tests instead.

  • Python3.11+: Make the large constants test executable for 3.11+ as well. We no longer can easily create those values on the fly and output them due to security enhancements.

  • Python3.3: Remove support from the test runner as well.

  • Tests: Added construct-based tests for coroutines so we can compare their performance as well.

Cleanups
  • Make try/finally variable releases through common code. It will allow us to apply special exception value trace handling for only those for scalability improvements, while also making many re-formulations simpler.

  • Avoid using anti-bloat configuration values replacements where replacements_plain is good enough. A lot of config pre-date its addition.

  • Avoid Python3 and Python3.5+ specific Jinja2 modules on versions before that, and consequently, avoid warning about the SyntaxError given.

  • Moved code object extraction of dill-compat plugin from Python module template to C code helper for shared usage and better editing.

  • Also call va_end for standards compliance when using va_start. Some C compilers may need that, so we better do it even if what we have seen so far doesn’t need it.

  • Don’t pass main filename to the tree building anymore, and make nuitka.Options functions usage explicit when importing.

  • Change comments that still mentioned Python 3.3 as where a change in Python happened, since we no longer support this version. Now, anything first seen in Python 3.4 is considered a Python3 change.

  • Cleanup, change Python 3.4 checks to 3.0 checks as Python3.3 is no longer supported. Cleans up version checks, as we now treat >=3.4 either as >=3 or can drop checks entirely.

  • The usual flow of spelling cleanups, this time for C codes.

Summary

This release cycle was longer than usual, with much new optimization and package support requiring attention.

For optimization we got quite a few things going, esp. with more forward propagation, but the big ones for scalability are still all queued up and things are only prepared.

The 3.13 work was continuing smoothly and seems to be doing fine. We are still on track for supporting it right after release.

The WASI work prepares cross-compilation, but we will not aim at it generally just yet; the target is our own Nuitka standalone backend Python that is supposed to be added in coming releases.

Categories: FLOSS Project Planets

Russell Coker: Links July 2024

Planet Debian - Tue, 2024-07-30 03:03

Interesting Scientific American article about the way that language shapes thought processes and how it was demonstrated in eye tracking experiments with people who have Aboriginal languages as their first language [1].

David Brin wrote an interesting article “Do We Really Want Immortality” [2]. I disagree with his conclusions about the politics though. Better manufacturing technology should allow decreasing the retirement age while funding schools well.

Scientific American has a surprising article about the differences between Chimp and Bonobo parenting [3]. I’d never have expected Chimp moms to be protective.

Sam Varghese wrote an insightful and informative article about the corruption in Indian politics and the attempts to silence Australian journalist Avani Dias [4].

WorksInProgress has an insightful article about the world’s first around the world solo yacht race [5]. It has some interesting ideas about engineering.

Htwo has an interesting video about adverts for fake games [6]. It’s surprising how they apparently make money from advertising games that don’t exist.

Elena Hashman wrote an insightful blog post about Chronic Fatigue Syndrome [7]. I hope they make some progress on curing it soon. The fact that it seems similar to “long Covid” which is quite common suggests that a lot of research will be applied to that sort of thing.

Bruce Schneier wrote an insightful blog post about the risks of MS Copilot [8].

Krebs has an interesting article about how Apple does Wifi AP based geo-location and how that can be abused for tracking APs in warzones etc. Bad Apple! [9].

Bruce Schneier wrote an insightful blog post on How AI Will Change Democracy [10].

Charles Stross wrote an amusing and insightful post about MS Recall titled Is Microsoft Trying to Commit Suicide [11].

Bruce Schneier wrote an insightful blog post about seeing the world as a data structure [12].

Luke Miani has an informative YouTube video about eBay scammers selling overpriced MacBooks [13].

The Yorkshire Ranter has an insightful article about Ronald Coase and the problems with outsourcing big development contracts as an array of contracts without any overall control [14].

Related posts:

  1. Links March 2024 Bruce Schneier wrote an interesting blog post about his workshop...
  2. Links January 2024 Long Now has an insightful article about domestication that considers...
  3. Links April 2024 Ron Garret wrote an insightful refutation to 2nd amendment arguments...
Categories: FLOSS Project Planets
