Feeds

Glyph Lefkowitz: The Macintosh

Planet Python - Thu, 2024-01-25 01:31

Today is the 40th anniversary of the announcement of the Macintosh. Others have articulated compelling emotional narratives that easily eclipse my own similar childhood memories of the Macintosh family of computers. So instead, I will ask a question:

What is the Macintosh?

As this is the anniversary of the beginning, that is where I will begin. The original Macintosh, the classic MacOS, the original “System Software” are a shining example of “fake it till you make it”. The original mac operating system was fake.

Don’t get me wrong, it was an impressive technical achievement to fake something like this, but what Steve Jobs did was to see a demo of a Smalltalk-76 system, an object-oriented programming environment with 1-to-1 correspondences between graphical objects on screen and runtime-introspectable data structures, a self-hosting high level programming language, memory safety, message passing, garbage collection, and many other advanced facilities that would not be popularized for decades, and make a fake version of it which ran on hardware that consumers could actually afford, by throwing out most of what made the programming environment interesting and replacing it with a much more memory-efficient illusion implemented in 68000 assembler and Pascal.

The machine’s RAM didn’t have room for a kernel. Whatever application was running was in control of the whole system. No protected memory, no preemptive multitasking. It was a house of cards that was destined to collapse. And collapse it did, both in the short term and the long. In the short term, the system was buggy and unstable, and application crashes resulted in system halts and reboots.

In the longer term, the company based on the Macintosh effectively went out of business and was reverse-acquired by NeXT, but they kept the better-known branding of the older company. The old operating system was gradually disposed of, quickly replaced at its core with a significantly more mature generation of operating system technology based on BSD UNIX and Mach. With the removal of Carbon compatibility 4 years ago, the last vestigial traces of it were removed. But even as early as 2004 the Mac was no longer really the Macintosh.

What NeXT had built was much closer to the Smalltalk system that Jobs was originally attempting to emulate. Its programming language, “Objective C” explicitly called back to Smalltalk’s message-passing, right down to the syntax. Objects on the screen now did correspond to “objects” you could send messages to. The development environment understood this too; that was a major selling point.

The NeXTSTEP operating system and Objective C runtime did not have garbage collection, but they offered a similar developer experience by providing reference counting throughout the object model. The original vision was finally achieved, for real, and that’s what we have on our desks and in our backpacks today (and in our pockets, in the form of the iPhone, which is in some sense a tiny next-generation NeXT computer itself).

The one detail I will relate from my own childhood is this: my first computer was not a Mac. My first computer, as a child, was an Amiga. When I was 5, I had a computer with 4096 colors, real multitasking, 3D graphics, and a paint program that could draw hard real-time animations with palette tricks. Then the writing was on the wall for Commodore, and I got a computer which had 256 colors, a bunch of old software that was still black and white, and an operating system that would freeze if you held down the mouse button on the menu bar and couldn’t even play animations smoothly. Many will relay their first encounter with the Mac as a kind of magic, but mine was a feeling of loss and disappointment. Unlike almost everyone at the time, I knew what a computer really could be, and despite many pleasant and formative experiences with the Macintosh in the meanwhile, it would be a decade before I saw a real one again.

But this is not to deride the faking. The faking was necessary. Xerox was not going to put an Alto running Smalltalk on anyone’s desk. People have always grumbled that Apple products are expensive, but in 2024 dollars, one of these Xerox computers cost roughly $55,000.

The Amiga was, in its own way, a similar sort of fake. It managed its own miracles by putting performance-critical functions into dedicated hardware, which became obsolete as software technology evolved much more rapidly.

Jobs is celebrated as a genius of product design, and he certainly wasn’t bad at it, but I had the rare privilege of seeing the homework he was cribbing from in that subject, and in my estimation he was a B student at best. Where he got an A was bringing a vision to life by creating an organization, both inside and outside of his companies.

If you want a culture-defining technological artifact, everybody in the culture has to be able to get their hands on one. This doesn’t just mean that the builder has to be able to build it. The buyer also has to be able to afford it, obviously. Developers have to be able to develop for it. The buyer has to actually want it; the much-derided “marketing” is a necessary part of the process of making a product what it is. Everyone needs to be able to move together in the direction of the same technological future.

This is why it was so fitting that Tim Cook was made Jobs's successor. The supply chain was the hard part.

The crowning, final achievement of Jobs’s career was the fact that not only did he fake it — the fakes were flying fast and thick at that time in history, even if they mostly weren’t as good — it was that he faked it and then he built the real version and then he bridged the transitions to get to the real thing.

I began here by saying that the Mac isn’t really the Mac, and speaking in terms of a point-in-time analysis that is true. Its technology today has practically nothing in common with its technology in 1984. This is not merely an artifact of the length of time involved: the technology at the core of various UNIXes in 1984 bears a lot of resemblance to UNIX-like operating systems today1. But looking across its whole history from 1984 to 2024, there is undeniably a continuity to the conceptual “Macintosh”.

Not just as a user, but as a developer moving through time rather than looking at just a few points: the “Macintosh”, such as it is, has transitioned from the Motorola 68000 to the PowerPC to Intel 32-bit to Intel 64-bit to ARM. From obscurely proprietary to enthusiastically embracing open source and then, sadly, much of the way back again. It moved from black and white to color, from desktop to laptop, from Carbon to Cocoa, from Display PostScript to Display PDF, all the while preserving instantly recognizable iconic features like the apple menu and the cursor pointer, while providing developers documentation and SDKs and training sessions that helped them transition their apps through multiple near-complete rewrites as a result of all of these changes.

To paraphrase Abigail Thorne’s first video about Identity, identity is what survives. The Macintosh is an interesting case study in the survival of the idea of a platform, as distinct from the platform itself. It is the Computer of Theseus, a thought experiment successfully brought to life and sustained over time.

If there is a personal lesson to be learned here, I’d say it’s that one’s own efforts need not be perfect. In fact, a significantly flawed vision that you can achieve right now is often much, much better than a perfect version that might take just a little bit longer, if you don’t have the resources to actually sustain going that much longer2. You have to be bad at things before you can be good at them. Real artists, as Jobs famously put it, ship.

So my contribution to the 40th anniversary reflections is to say: the Macintosh is dead. Long live the Mac.

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support me on Patreon as well!

  1. including, ironically, the modern macOS. 

  2. And that is why I am posting this right now, rather than proofreading it further. 

Categories: FLOSS Project Planets

Drupalize.Me: Part 2: EventDispatcher in Drupal (Spotlight on Symfony in Drupal)

Planet Drupal - Wed, 2024-01-24 20:54
Part 2: EventDispatcher in Drupal (Spotlight on Symfony in Drupal)

In Part 2 of our exploration of Symfony components in Drupal, we focus on the event dispatcher.

The event dispatcher is a tool that enables the application to communicate across objects by subscribing to and listening for events. It achieves this by maintaining a registry of the various event types and the listeners registered for each event type. When a specific type of event occurs, the code that has registered a listener for that event is invoked. If you're familiar with the Mediator and Observer design patterns, you might recognize similarities here.
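As a rough, language-agnostic illustration of that registry-and-listeners idea (a made-up Python sketch, not Symfony's actual EventDispatcher API; the class, method, and event names here are invented), the pattern boils down to something like this:

from collections import defaultdict

class EventDispatcher:
    """Keeps a registry of listeners per event name and invokes them on dispatch."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def add_listener(self, event_name, listener):
        # Register a callable to be run whenever event_name is dispatched.
        self._listeners[event_name].append(listener)

    def dispatch(self, event_name, payload=None):
        # Invoke every listener registered for this event type.
        for listener in self._listeners[event_name]:
            listener(payload)

# Usage: subscribe a listener, then dispatch an event it cares about.
dispatcher = EventDispatcher()
dispatcher.add_listener("entity.saved", lambda payload: print("Saved:", payload))
dispatcher.dispatch("entity.saved", {"id": 42})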

Blake Hall Wed, 01/24/2024 - 19:54
Categories: FLOSS Project Planets

Dirk Eddelbuettel: qlcal 0.0.10 on CRAN: Calendar Updates

Planet Debian - Wed, 2024-01-24 20:09

The tenth release of the qlcal package arrived at CRAN today.

qlcal delivers the calendaring parts of QuantLib. It is provided (for the R package) as a set of included files, so the package is self-contained and does not depend on an external QuantLib library (which can be demanding to build). qlcal covers over sixty country / market calendars and can compute holiday lists, their complement (i.e. business day lists) and much more. Examples are in the README at the repository, the package page, and of course at the CRAN package page.

This release synchronizes qlcal with the QuantLib release 1.33 and its updates to 2024 calendars.

Changes in version 0.0.10 (2024-01-24)
  • Synchronized with QuantLib 1.33

Courtesy of my CRANberries, there is a diffstat report for this release. See the project page and package documentation for more details, and more examples.

If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Categories: FLOSS Project Planets

Glyph Lefkowitz: Unsigned Commits

Planet Python - Wed, 2024-01-24 19:29

I am going to tell you why I don’t think you should sign your Git commits, even though doing so with SSH keys is now easier than ever. But first, to contextualize my objection, I have a brief hypothetical for you, and then a bit of history from the evolution of security on the web.

It seems like these days, everybody’s signing all different kinds of papers.

Bank forms, permission slips, power of attorney; it seems like if you want to securely validate a document, you’ve gotta sign it.

So I have invented a machine that automatically signs every document on your desk, just in case it needs your signature. Signing is good for security, so you should probably get one, and turn it on, just in case something needs your signature on it.

We also want to make sure that verifying your signature is easy, so we will have them all notarized and duplicates stored permanently and publicly for future reference.

No? Not interested?

Hopefully, that sounded like a silly idea to you.

Most adults in modern civilization have learned that signing your name to a document has an effect. It is not merely decorative; the words in the document being signed have some specific meaning and can be enforced against you.

In some ways the metaphor of “signing” in cryptography is bad. One does not “sign” things with “keys” in real life. But here, it is spot on: a cryptographic signature can have an effect.

It should be an input to some software, one that is acted upon. Software does a thing differently depending on the presence or absence of a signature. If it doesn’t, the signature probably shouldn’t be there.

Consider the most venerable example of encryption and signing that we all deal with every day: HTTPS. Many years ago, browsers would happily display unencrypted web pages. The browser would also encrypt the connection, if the server operator had paid for an expensive certificate and correctly configured their server. If that operator messed up the encryption, it would pop up a helpful dialog box that would tell the user “This website did something wrong that you cannot possibly understand. Would you like to ignore this and keep working?” with buttons that said “Yes” and “No”.

Of course, these are not the precise words that were written. The words, as written, said things about “information you exchange” and “security certificate” and “certifying authorities” but “Yes” and “No” were the words that most users read. Predictably, most users just clicked “Yes”.

In the usual case, where users ignored these warnings, it meant that no user ever got meaningful security from HTTPS. It was a component of the web stack that did nothing but funnel money into the pockets of certificate authorities and occasionally present annoying interruptions to users.

In the case where the user carefully read and honored these warnings in the spirit they were intended, adding any sort of transport security to your website was a potential liability. If you got everything perfectly correct, nothing happened except the browser would display a picture of a small green purse. If you made any small mistake, it would scare users off and thereby directly harm your business. You would only want to do it if you were doing something that put a big enough target on your site that you became unusually interesting to attackers, or were required to do so by some contractual obligation like credit card companies.

Keep in mind that the second case here is the best case.

In 2016, the browser makers noticed this problem and started taking some pretty aggressive steps towards actually enforcing the security that HTTPS was supposed to provide, by fixing the user interface to do the right thing. If your site didn’t have security, it would be shown as “Not Secure”, a subtle warning that would gradually escalate in intensity as time went on, correctly incentivizing site operators to adopt transport security certificates. On the user interface side, certificate errors would be significantly harder to disregard, making it so that users who didn’t understand what they were seeing would actually be stopped from doing the dangerous thing.

Nothing fundamental1 changed about the technical aspects of the cryptographic primitives or constructions being used by HTTPS in this time period, but socially, the meaning of an HTTP server signing and encrypting its requests changed a lot.

Now, let’s consider signing Git commits.

You may have heard that in some abstract sense you “should” be signing your commits. GitHub puts a little green “verified” badge next to commits that are signed, which is neat, I guess. They provide “security”. 1Password provides a nice UI for setting it up. If you’re not a 1Password user, GitHub itself recommends you put in just a few lines of configuration to do it with either a GPG, SSH, or even an S/MIME key.

But while GitHub’s documentation quite lucidly tells you how to sign your commits, its explanation of why is somewhat less clear. Their purse is the word “Verified”; it’s still green. If you enable “vigilant mode”, you can make the blank “no verification status” option say “Unverified”, but not much else changes.

This is like the old-style HTTPS verification “Yes”/“No” dialog, except that there is not even an interruption to your workflow. They might put the “Unverified” status on there, but they’ve gone ahead and clicked “Yes” for you.

It is tempting to think that the “HTTPS” metaphor will map neatly onto Git commit signatures. It was bad when the web wasn’t using HTTPS, and the next step in that process was for Let’s Encrypt to come along and for the browsers to fix their implementations. Getting your certificates properly set up in the meanwhile and becoming familiar with the tools for properly doing HTTPS was unambiguously a good thing for an engineer to do. I did, and I’m quite glad I did so!

However, there is a significant difference: signing and encrypting an HTTPS request is ephemeral; signing a Git commit is functionally permanent.

This ephemeral nature meant that errors in the early HTTPS landscape were easily fixable. Earlier I mentioned that there was a time where you might not want to set up HTTPS on your production web servers, because any small screw-up would break your site and thereby your business. But if you were really skilled and you could see the future coming, you could set up monitoring, avoid these mistakes, and rapidly recover. These mistakes didn’t need to badly break your site.

We can extend the analogy to HTTPS, but we have to take a detour into one of the more unpleasant mistakes in HTTPS’s history: HTTP Public Key Pinning, or “HPKP”. The idea with HPKP was that you could publish a record in an HTTP header where your site commits2 to using certain certificate authorities for a period of time, where that period of time could be “forever”. Attackers gonna attack, and attack they did. Even without getting attacked, a site could easily commit “HPKP Suicide” where they would pin the wrong certificate authority with a long timeline, and their site was effectively gone for every browser that had ever seen those pins. As a result, after a few years, HPKP was completely removed from all browsers.

Git commit signing is even worse. With HPKP, you could easily make terrible mistakes with permanent consequences even though you knew the exact meaning of the data you were putting into the system at the time you were doing it. With signed commits, you are saying something permanently, but you don’t really know what it is that you’re saying.

Today, what is the benefit of signing a Git commit? GitHub might present it as “Verified”. It’s worth noting that only GitHub will do this, since they are the root of trust for this signing scheme. So, by signing commits and registering your keys with GitHub, you are, at best, helping to lock in GitHub as a permanent piece of infrastructure that is even harder to dislodge because they are not only where your code is stored, but also the arbiters of whether or not it is trustworthy.

In the future, what is the possible security benefit? If we all collectively decide we want Git to be more secure, then we will need to meaningfully treat signed commits differently from unsigned ones.

There’s a long tail of unsigned commits several billion entries long. And those are in the permanent record as much as the signed ones are, so future tooling will have to be able to deal with them. If, as stewards of Git, we wish to move towards a more secure Git, as the stewards of the web moved towards a more secure web, we do not have the option that the web did. In the browser, the meaning of a plain-text HTTP or incorrectly-signed HTTPS site changed, in order to encourage the site’s operator to change the site to be HTTPS.

In contrast, the meaning of an unsigned commit cannot change, because there are zillions of unsigned commits lying around in critical infrastructure and we need them to remain there. Commits cannot meaningfully be changed to become signed retroactively. Unlike an online website, they are part of a historical record, not an operating program. So we cannot establish the difference in treatment by changing how unsigned commits are treated.

That means that tooling maintainers will need to provide some difference in behavior that provides some incentive. With HTTPS, the binary choice was clear: don’t present sites with incorrect, potentially compromised configurations to users. The question was just how to achieve that. With Git commits, the difference in treatment of a “trusted” commit is far less clear.

If you will forgive me a slight straw-man here, one possible naive interpretation is that a “trusted” signed commit is one that’s OK to run in CI. Conveniently, it’s not simply “trusted” in a general sense. If you signed it, it’s trusted to be from you, specifically. Surely it’s fine if we bill the CI costs for validating the PR that includes that signed commit to your GitHub account?

Now, someone can piggy-back off a 1-line typo fix that you made on top of an unsigned commit to some large repo, making you implicitly responsible for transitively signing all unsigned parent commits, even though you haven’t looked at any of the code.

Remember, also, that the only central authority that is practically trustable at this point is your GitHub account. That means that if you are using a third-party CI system, even if you’re using a third-party Git host, you can only run “trusted” code if GitHub is online and responding to requests for its “get me the trusted signing keys for this user” API. This also adds a lot of value to a GitHub credential breach, strongly motivating attackers to sneakily attach their own keys to your account so that their commits in unrelated repos can be “Verified” by you.

Let’s review the pros and cons of turning on commit signing now, before you know what it is going to be used for:

Pro:

  • Green “Verified” badge

Con:

  • Unknown, possibly unlimited future liability for the consequences of running code in a commit you signed
  • Further implicitly cementing GitHub as a centralized trust authority in the open source world
  • Introducing unknown reliability problems into infrastructure that relies on commit signatures
  • Temporary breach of your GitHub credentials now leads to potentially permanent consequences if someone can smuggle a new trusted key in there
  • New kinds of ongoing process overhead as commit-signing keys become new permanent load-bearing infrastructure, like “what do I do with expired keys”, “how often should I rotate these”, and so on

I feel like the “Con” column is coming out ahead.

That probably seemed like increasingly unhinged hyperbole, and it was.

In reality, the consequences are unlikely to be nearly so dramatic. The status quo has a very high amount of inertia, and probably the “Verified” badge will remain the only visible difference, except for a few repo-specific esoteric workflows, like pushing trust verification into offline or sandboxed build systems. I do still think that there is some potential for nefariousness around the “unknown and unlimited” dimension of any future plans that might rely on verifying signed commits, but any flaws are likely to be subtle attack chains and not anything flashy and obvious.

But I think that one of the biggest problems in information security is a lack of threat modeling. We encrypt things, we sign things, we institute rotation policies and elaborate useless rules for passwords, because we are looking for a “best practice” that is going to save us from having to think about what our actual security problems are.

I think the actual harm of signing git commits is to perpetuate an engineering culture of unquestioningly cargo-culting sophisticated and complex tools like cryptographic signatures into new contexts where they have no use.

Just from a baseline utilitarian philosophical perspective, for a given action A, all else being equal, it’s always better not to do A, because taking an action always has some non-zero opportunity cost even if it is just the time taken to do it. Epsilon cost and zero benefit is still a net harm. This is even more true in the context of a complex system. Any action taken in response to a rule in a system is going to interact with all the other rules in that system. You have to pay complexity-rent on every new rule. So an apparently-useless embellishment like signing commits can have potentially far-reaching consequences in the future.

Git commit signing itself is not particularly consequential. I have probably spent more time writing this blog post than the sum total of all the time wasted by all programmers configuring their git clients to add useless signatures; even the relatively modest readership of this blog will likely transfer more data reading this post than all those signatures will take to transmit to the various git clients that will read them. If I just convince you not to sign your commits, I don’t think I’m coming out ahead in the felicific calculus here.

What I am actually trying to point out here is that it is useful to carefully consider how to avoid adding junk complexity to your systems. One area where junk tends to leak in to designs and to cultures particularly easily is in intimidating subjects like trust and safety, where it is easy to get anxious and convince ourselves that piling on more stuff is safer than leaving things simple.

If I can help you avoid adding even a little bit of unnecessary complexity, I think it will have been well worth the cost of the writing, and the reading.

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support me on Patreon as well! I am also available for consulting work if you think your organization could benefit from expertise on topics such as “What else should I not apply a cryptographic signature to?”.

  1. Yes yes I know about heartbleed and Bleichenbacher attacks and adoption of forward-secret ciphers and CRIME and BREACH and none of that is relevant here, okay? Jeez. 

  2. Do you see what I did there. 

Categories: FLOSS Project Planets

Joachim Breitner: GHC Steering Committee Retrospective

Planet Debian - Wed, 2024-01-24 19:21

After seven years of service as member and secretary on the GHC Steering Committee, I have resigned from that role. So this is a good time to look back and retrace the formation of the GHC proposal process and committee.

In my memory, I helped define and shape the proposal process, optimizing it for effectiveness and throughput, but memory can be misleading, and judging from the paper trail in my email archives, this was indeed mostly Ben Gamari’s and Richard Eisenberg’s achievement: Already in Summer of 2016, Ben Gamari set up the ghc-proposals Github repository with a sketch of a process and sent out a call for nominations on the GHC user’s mailing list, which I replied to. The Simons picked the first set of members, and in the fall of 2016 we discussed the committee’s by-laws and procedures. As so often, Richard was an influential shaping force here.

Three ingredients

For example, it was him that suggested that for each proposal we have one committee member be the “Shepherd”, overseeing the discussion. I believe this was one ingredient for the process effectiveness: There is always one person in charge, and thus we avoid the delays incurred when any one of a non-singleton set of volunteers have to do the next step (and everyone hopes someone else does it).

The next ingredient was that we do not usually require a vote among all members (again, not easy with volunteers with limited bandwidth and occasional phases of absence). Instead, the shepherd makes a recommendation (accept/reject), and if the other committee members do not complain, this silence is taken as consent, and we come to a decision. It seems this idea can also be traced back on Richard, who suggested that “once a decision is requested, the shepherd [generates] consensus. If consensus is elusive, then we vote.”

At the end of the year we agreed and wrote down these rules, created the mailing list for our internal, but publicly archived committee discussions, and began accepting proposals, starting with Adam Gundry’s OverloadedRecordFields.

At that point, there was no “secretary” role yet, so how did I become one? It seems that in February 2017 I started to clean up and refine the process documentation, fixing “bugs in the process” (like requiring authors to set Github labels when they don’t even have permissions to do that). This in particular meant that someone from the committee had to manually handle submissions and so on, and by the aforementioned principle that at every step there ought to be exactly one person in charge, the role of a secretary followed naturally. In the email in which I described that role I wrote:

Simon already shoved me towards picking up the “secretary” hat, to reduce load on Ben.

So when I merged the updated process documentation, I already listed myself as “secretary”.

It wasn’t just Simon’s shoving that put me into the role, though. I dug out my original self-nomination email to Ben, and among other things I wrote:

I also hope that there is going to be clear responsibilities and a clear workflow among the committee. E.g. someone (possibly rotating), maybe called the secretary, who is in charge of having an initial look at proposals and then assigning it to a member who shepherds the proposal.

So it is hardly a surprise that I became secretary, when it was dear to my heart to have a smooth continuous process here.

I am rather content with the result: These three ingredients – single secretary, per-proposal shepherds, silence-is-consent – helped the committee to be effective throughout its existence, even as every once in a while individual members dropped out.

Ulterior motivation

I must admit, however, there was an ulterior motivation behind me grabbing the secretary role: Yes, I did want the committee to succeed, and I did want that authors receive timely, good and decisive feedback on their proposals – but I did not really want to have to do that part.

I am, in fact, a lousy proposal reviewer. I am too generous when reading proposals, and more likely mentally fill gaps in a specification rather than spotting them. Always optimistically assuming that the authors surely know what they are doing, rather than critically assessing the impact, the implementation cost and the interaction with other language features.

And, maybe more importantly: why should I know which changes are good and which are not so good in the long run? Clearly, the authors cared enough about a proposal to put it forward, so there is some need… and I do believe that Haskell should stay an evolving and innovating language… but how does this help me decide about this or that particular feature.

I even, during the formation of the committee, explicitly asked that we write down some guidance on “Vision and Guideline”; do we want to foster change or innovation, or be selective gatekeepers? Should we accept features that are proven to be useful, or should we accept features so that they can prove to be useful? This discussion, however, did not lead to a concrete result, and the assessment of proposals relied on the sum of each member’s personal preference, expertise and gut feeling. I am not saying that this was a mistake: It is hard to come up with a general guideline here, and even harder to find one that does justice to each individual proposal.

So the secret motivation for me to grab the secretary post was that I could contribute without having to judge proposals. Being secretary allowed me to assign most proposals to others to shepherd, and only once in a while myself took care of a proposal, when it seemed to be very straight-forward. Sneaky, ain’t it?

7 Years later

For years to come I happily played secretary: When an author finished their proposal and public discussion ebbed down, they would ping me on GitHub, I would pick a suitable shepherd among the committee and ask them to judge the proposal. Eventually, the committee would come to a conclusion, usually by implicit consent, sometimes by voting, and I’d merge the pull request and update the metadata thereon. Every few months I’d summarize the current state of affairs to the committee (what happened since the last update, which proposals are currently on our plate), and once per year I gathered the data for Simon Peyton Jones’ annual GHC Status Report. Sometimes some members needed a nudge or two to act. Some would eventually step down, and I’d send around a call for nominations and, when the nominations came in, distribute them off-list among the committee and tally the votes.

Initially, that was exciting. For a long while it was a pleasant and rewarding routine. Eventually, it became a mere chore. I noticed that I didn’t quite care so much anymore about some of the discussion, and there was a decent amount of navel-gazing, meta-discussions and some wrangling about claims of authority that was probably useful and necessary, but wasn’t particularly fun.

I also began to notice weaknesses in the processes that I helped shape: We could really use some more automation for showing proposal statuses, notifying people when they have to act, and nudging them when they don’t. The whole silence-is-assent approach is good for throughput, but not necessarily great for quality, and maybe the committee members need to be pushed more firmly to engage with each proposal. Like GHC itself, the committee processes deserve continuous refinement and refactoring, and since I could not muster the motivation to change my now well-trod secretarial ways, it was time for me to step down.

Luckily, Adam Gundry volunteered to take over, and that makes me feel much less bad for quitting. Thanks for that!

And although I am for my day job now enjoying a language that has many of the things out of the box that for Haskell are still only language extensions or even just future proposals (dependent types, BlockArguments, do notation with (← foo) expressions and 💜 Unicode), I’m still around, hosting the Haskell Interlude Podcast, writing on this blog and hanging out at ZuriHac etc.

Categories: FLOSS Project Planets

Matt Layman: Payments Gateway - Building SaaS with Python and Django#181

Planet Python - Wed, 2024-01-24 19:00
In this episode, we continued on the Stripe integration. I worked on a new payments gateway interface to access the Stripe APIs needed for creating a checkout session. We hit some bumps along the way because of djstripe’s new preference for putting the Stripe keys into the database exclusively.
Categories: FLOSS Project Planets

Bruno Ponne / Coding The Past: Explore art with SQL and pd.read_sql_query

Planet Python - Wed, 2024-01-24 19:00


Greetings, humanists, social and data scientists!


Have you ever tried to load a large file in Python or R? Sometimes, when file sizes are on the order of gigabytes, you may run into performance problems, with your program taking an unusually long time to load the data. SQL, or Structured Query Language, is used to work with larger data files stored in relational databases and is widely used in industry and even in research. Apart from being more efficient for preparing data, in your journey you might encounter data sources whose main form of access is through SQL.


In this lesson you will learn how to use SQL in Python to retrieve data from a relational database of the National Gallery of Art (US). You will also learn how to use a relational database management system (RDBMS) and pd.read_sql_query to extract data from it in Python.



1. Data source

The database used in this lesson is made available by the National Gallery of Art (US) under a Creative Commons Zero license. The dataset contains data about more than 130,000 artworks and their artists, from the Middle Ages to the present day.


It is a wonderful resource to study history and art. Available variables include the title of the artwork, its dimensions, the author, a description, the location, the country where it was produced, the year the artist started the work and the year he or she finished it. These are only some examples; there is much more to explore.



2. Download and install PostgreSQL and pgAdmin

PostgreSQL is a free and very popular relational database management system. It stores and manages the tables contained in a database. Please consult this guide to install it on your computer.


After you install PostgreSQL, you will need to connect to the Postgres database server. In this tutorial, we will use the pgAdmin application to establish this connection. It is a visual and intuitive interface and makes many operations easier to execute. The guide above will also walk you through the process of connecting to your local database. In the next steps, after connecting to your local database server, we will learn how to create a database that will store the National Gallery dataset.


3. Creating the database and its tables

After you are connected to the server, click “Databases” with the right mouse button and choose “Create” and “Database…” as shown in the image below.



Next, give a title to your database as shown in the figure below. In our case, it will be called “art_db”. Click “Save” and it is all set!



With the database ‘art_db’ selected, click the ‘Query Tool’ as shown below.


This will open a field where you can type SQL code. Our objective is to create the first table of our database, which will contain the content of ‘objects.csv’ available in the GitHub account of the National Gallery of Art, provided in the Data section above.


To create a table, we must specify the name and the variable type for each variable in the table. The SQL command to create a table is quite intuitive: CREATE TABLE name_of_your_table. Copy the code below and paste it into the window opened by the ‘Query Tool’. The code specifies each variable of the objects table. This table contains information on each artwork available in the collection.



CREATE TABLE objects (
    objectID integer NOT NULL,
    accessioned CHARACTER VARYING(32),
    accessionnum CHARACTER VARYING(32),
    locationid CHARACTER VARYING(32),
    title CHARACTER VARYING(2048),
    displaydate CHARACTER VARYING(256),
    beginyear integer,
    endyear integer,
    visualbrowsertimespan CHARACTER VARYING(32),
    medium CHARACTER VARYING(2048),
    dimensions CHARACTER VARYING(2048),
    inscription CHARACTER VARYING,
    markings CHARACTER VARYING,
    attributioninverted CHARACTER VARYING(1024),
    attribution CHARACTER VARYING(1024),
    provenancetext CHARACTER VARYING,
    creditline CHARACTER VARYING(2048),
    classification CHARACTER VARYING(64),
    subclassification CHARACTER VARYING(64),
    visualbrowserclassification CHARACTER VARYING(32),
    parentid CHARACTER VARYING(32),
    isvirtual CHARACTER VARYING(32),
    departmentabbr CHARACTER VARYING(32),
    portfolio CHARACTER VARYING(2048),
    series CHARACTER VARYING(850),
    volume CHARACTER VARYING(850),
    watermarks CHARACTER VARYING(512),
    lastdetectedmodification CHARACTER VARYING(64),
    wikidataid CHARACTER VARYING(64),
    customprinturl CHARACTER VARYING(512)
);


The last step is to load the data from the csv file into this table. This can be done through the ‘COPY’ command as shown below.



COPY objects (objectid, accessioned, accessionnum, locationid, title, displaydate,
              beginyear, endyear, visualbrowsertimespan, medium, dimensions, inscription,
              markings, attributioninverted, attribution, provenancetext, creditline,
              classification, subclassification, visualbrowserclassification, parentid,
              isvirtual, departmentabbr, portfolio, series, volume, watermarks,
              lastdetectedmodification, wikidataid, customprinturl)
FROM 'C:/temp/objects.csv'
DELIMITER ','
CSV HEADER;


Tip: Download the "objects.csv" file and save it in the desired folder. Note, however, that sometimes your system might block access to this file via pgAdmin. Therefore I saved it in the "temp" folder. In any case, change the path in the code above to match where you saved the "objects.csv" file.


Great! Now you should have your first table loaded to your database. The complete database includes more than 15 tables. However, we will only use two of them for this example, as shown in the scheme below. Note that the two tables relate to each other through the key variable objectid.



To load the “objects_terms” table, please repeat the same procedure with the code below.



CREATE TABLE objects_terms (
    termid INTEGER,
    objectid INTEGER,
    termtype VARCHAR(64),
    term VARCHAR(256),
    visualbrowsertheme VARCHAR(32),
    visualbrowserstyle VARCHAR(64)
);

COPY objects_terms (termid, objectid, termtype, term, visualbrowsertheme, visualbrowserstyle)
FROM 'C:/temp/objects_terms.csv'
DELIMITER ','
CSV HEADER;



4. Exploring the data with SQL commands

Click the ‘Query Tool’ to start exploring the data. First, select which variables you would like to include in your analysis. Second, tell SQL in which table these variables are. The code below selects the variables title and attribution from the objects table. It also limits the result to 5 observations.



SELECT title, attribution
FROM objects
LIMIT 5


Now, we would like to know what the different kinds of classification in this dataset are. To achieve that, we have to select the classification variable, including only distinct values.



SELECT DISTINCT(classification)
FROM objects


The result tells us that there are 11 classifications: “Decorative Art”, “Drawing”, “Index of American Design”, “Painting”, “Photograph”, “Portfolio”, “Print”, “Sculpture”, “Technical Material”, “Time-Based Media Art” and “Volume”.


Finally, let us group the artworks by classification and count the number of objects in each category. COUNT(*) counts the total number of items in the groups defined by GROUP BY. When you select a variable, you can give it a new name with AS. The command ORDER BY then orders the classifications by number of items in descending order (DESC).



SELECT classification, COUNT(*) as n_items
FROM objects
GROUP BY classification
ORDER BY n_items DESC


Note that prints form the largest classification, followed by photographs.



5. Using pd.read_sql_query to access data

Now that you have your SQL database working, it is time to access it with Python. Before using Pandas, we have to connect Python to our SQL database. We will do that with psycopg2, a very popular PostgreSQL adapter for Python. Please install it with pip install psycopg2.


We use the connect method of psycopg2 to establish the connection. It takes 4 main arguments:

  • host: in our case, the database is hosted locally, so we will pass localhost to this parameter. Note, however, that we could specify an IP if the server was external;
  • database: the name given to your SQL database, art_db;
  • user: user name required to authenticate;
  • password: your database password.



import psycopg2
import pandas as pd

conn = psycopg2.connect(
    host="localhost",
    database="art_db",
    user="postgres",
    password="*******")


The next step is to store our SQL query in a Python string variable. The query below performs a LEFT JOIN on the two tables in our database, using the variable objectid to join them. In practice, we are selecting the titles, the authors (attribution), the classification (we keep only “Painting” with a WHERE command), and the term (we filter only terms that specify the “Style” of the painting).



command = '''
SELECT o.title, o.attribution, o.classification, ot.term
FROM objects as o
LEFT JOIN objects_terms as ot
    ON o.objectid = ot.objectid
WHERE classification = 'Painting'
    AND termtype = 'Style'
'''


Finally, we can extract the data. Use the cursor() method of conn to be able to “type” your SQL query. Pass the command variable and the connection object to pd.read_sql_query, and it will return a Pandas dataframe with the data we selected. Next, commit and close the cursor and the connection.



# open cursor to insert our query
cur = conn.cursor()

# use pd.read_sql_query to query our database and get the result in a pandas dataframe
paintings = pd.read_sql_query(command, conn)

# save any changes to the database
conn.commit()

# close cursor and connection
cur.close()
conn.close()


6. Visualizing the most popular styles

From the data we gathered from our database, we would like to check which are the 10 most popular art styles in our data, by number of paintings. We can use the value_counts() method of the column term to count how many paintings are classified in each style.


The result is a Pandas Series whose index contains the styles and whose values contain the number of paintings of the respective style. The remaining code produces a horizontal bar plot showing the top 10 styles by number of paintings. If you would like to learn more about data visualization with matplotlib, please consult the lesson Storytelling with Matplotlib - Visualizing historical data.



import matplotlib.pyplot as plt

top_10_styles = paintings['term'].value_counts().head(10)

fig, ax = plt.subplots()
ax.barh(top_10_styles.index, top_10_styles.values, color="#f0027f", edgecolor="#f0027f")
ax.set_title("The Most Popular Styles")
# inverts y axis
ax.invert_yaxis()
# eliminates grids
ax.grid(False)
# set ticks' colors to white
ax.tick_params(axis='x', colors='white')
ax.tick_params(axis='y', colors='white')
# set font colors
ax.set_facecolor('#2E3031')
ax.title.set_color('white')
# eliminates top, left and right borders and sets the bottom border color to white
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
ax.spines["left"].set_visible(False)
ax.spines["bottom"].set_color("white")
# fig background color
fig.patch.set_facecolor('#2E3031')


Note that Realist, Baroque and Renaissance are the most popular art styles in our dataset.



Please feel free to share your thoughts and questions below!



7. Conclusions


  • It is possible to create a SQL database from csv files and access it with Python;
  • psycopg2 enables connection between Python and your SQL database;
  • pd.read_sql_query can be used to extract data into a Pandas dataframe.


Categories: FLOSS Project Planets

Dirk Eddelbuettel: RApiDatetime 0.0.9 on CRAN: Maintenance

Planet Debian - Wed, 2024-01-24 18:06

A new maintenance release of our RApiDatetime package is now on CRAN.

RApiDatetime provides a number of entry points for C-level functions of the R API for Date and Datetime calculations. The functions asPOSIXlt and asPOSIXct convert between long and compact datetime representation, formatPOSIXlt and Rstrptime convert to and from character strings, and POSIXlt2D and D2POSIXlt convert between Date and POSIXlt datetime. Lastly, asDatePOSIXct converts to a date type. All these functions are rather useful, but were not previously exported by R for C-level use by other packages. Which this package aims to change.

This release responds to a CRAN request to clean up empty macros and sections in Rd files. Moreover, because the windows portion of the corresponding R-internal code underwent some changes, our (#ifdef conditional) coverage here is a little behind and created a warning under the newer UCRT setup. So starting with this release we are back to OS_type: unix meaning there will not be any Windows builds at CRAN. If you would like that to change, and ideally can work in the Windows portion, do not hesitate to get in touch.

Details of the release follow based on the NEWS file.

Changes in RApiDatetime version 0.0.9 (2024-01-23)
  • Replace auto-generated stale RApiDatetime-package.Rd with macro-filled stanza to satisfy CRAN request.

Courtesy of my CRANberries, there is also a diffstat report for this release.

If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Categories: FLOSS Project Planets

Drupal Association blog: Credit for Events Sponsored and Reducing Redundancy in Contribution Credit

Planet Drupal - Wed, 2024-01-24 16:56

Drupal's contribution recognition system is a key part of the way we recognize and incentivize contribution in the Drupal project. It's a system that needs constant care and feeding, both to ensure that we're recognizing the many kinds of contributions people and organizations are making, and to ensure that the system itself is proportional and fair for the effort being put in.

Event Sponsorship credit

We're about to introduce a new way for organizations to improve their marketplace rank. Drupal.org/community/events has allowed organizers to feature their sponsors since the beginning. Now, those sponsors listed will automatically receive contribution credit as well. To start, this will be a fixed amount for small events and a larger one for DrupalCon, but in the future, in collaboration with event organizers, we'd like it to scale to each sponsor's level of support.

We hope this will encourage companies to do the important work of financially supporting the grass roots events that help our community thrive. (And reward those who already do!). 

Reducing redundancy

We're also making a small tweak to the system this week related to how we recognize Contributor Roles. Contributor Roles are community submitted, and represent all of the many ways contribution happens in our community. However, we want to avoid double counting credit for certain types of roles. For example, issue credit is a fundamental pillar of our contribution recognition system, and some of our community roles such as 'Project Contributor' are also organically receiving credit from issues. 

We will no longer be granting marketplace rank to organizations sponsoring roles that are already represented in other ways, and we'll update the contributor role descriptions to reflect when this restriction applies. 

We expect this change to help level the playing field between some organizations who've made extensive use of the role system, and others who have kept a laser focus on contribution and innovation directly in issues.

Categories: FLOSS Project Planets

The Drop Times: Drupal Mountain Camp 2024: Tech, Thrills, and Alpine Adventures Await in Davos!

Planet Drupal - Wed, 2024-01-24 13:24
Explore the wonders of Drupal Mountain Camp 2024 in Davos! From March 7-10, immerse yourself in tech, nature, and community at this Alpine gathering. Exciting workshops, coworking, and mountain adventures await.
Categories: FLOSS Project Planets

TechBeamers Python: How Do I Install Pip in Python?

Planet Python - Wed, 2024-01-24 12:40

In this tutorial, we’ll provide all the necessary steps for you to install Pip in Python on both Windows and Linux platforms. If you’re using a recent version of Python (Python 3.4 and above), pip is likely already installed. To check if pip is installed, open a command prompt or terminal and run the version check shown below. If it’s […]
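The original command did not survive aggregation here; a typical way to check for pip, and to bootstrap it from the standard library if it is missing, looks like this (on many Linux systems the interpreter is invoked as python3 rather than python):

# Check whether pip is available for this Python installation
python -m pip --version

# If pip is missing, bootstrap it using the bundled ensurepip module
python -m ensurepip --upgrade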

The post How Do I Install Pip in Python? appeared first on TechBeamers.

Categories: FLOSS Project Planets

ImageX: Libraries Going Digital: A Guide Through Useful Features For Library Websites, and How Drupal Fits In

Planet Drupal - Wed, 2024-01-24 12:11

Authored by: Nadiia Nykolaichuk.

Libraries are known as one of the most traditional ways of helping people get valuable knowledge. Most people imagine a library as a quiet building with long shelves, filled with the scent of well-worn books and the quiet rustle of turning pages. It’s a true intellectual haven, and librarians are its guardians.

Categories: FLOSS Project Planets

TechBeamers Python: How Do You Filter a List in Python?

Planet Python - Wed, 2024-01-24 09:33

In this tutorial, we’ll explain different methods to filter a list in Python with the help of multiple examples. You’ll learn to use the Python filter() function, list comprehension, and also use Python for loop to select elements from the list. Filter a List in Python With the Help of Examples As we know there […]
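As a quick, made-up illustration of the three approaches this tutorial covers (the sample list and the even-number condition are just for demonstration), filtering a list might look like this:

numbers = [3, -1, 4, -1, 5, -9, 2, 6]

# filter() with a lambda keeps only items for which the function returns True
evens_filter = list(filter(lambda n: n % 2 == 0, numbers))

# A list comprehension expresses the same selection with an if clause
evens_comprehension = [n for n in numbers if n % 2 == 0]

# A plain for loop appends matching items one by one
evens_loop = []
for n in numbers:
    if n % 2 == 0:
        evens_loop.append(n)

print(evens_filter, evens_comprehension, evens_loop)
# [4, 2, 6] [4, 2, 6] [4, 2, 6]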

The post How Do You Filter a List in Python? appeared first on TechBeamers.

Categories: FLOSS Project Planets

Real Python: What Are Python Raw Strings?

Planet Python - Wed, 2024-01-24 09:00

If you’ve ever come across a standard string literal prefixed with either the lowercase letter r or the uppercase letter R, then you’ve encountered a Python raw string:

>>> r"This is a raw string"
'This is a raw string'

Although a raw string looks and behaves mostly the same as a normal string literal, there’s an important difference in how Python interprets some of its characters, which you’ll explore in this tutorial.

Notice that there’s nothing special about the resulting string object. Whether you declare your literal value using a prefix or not, you’ll always end up with a regular Python str object.

Other prefixes available at your fingertips, which you can use and sometimes even mix together in your Python string literals, include:

  • b: Bytes literal
  • f: Formatted string literal
  • u: Legacy Unicode string literal (PEP 414)

Out of those, you might be most familiar with f-strings, which let you evaluate expressions inside string literals. Raw strings aren’t as popular as f-strings, but they do have their own uses that can improve your code’s readability.

Creating a string of characters is often one of the first skills that you learn when studying a new programming language. The Python Basics book and learning path cover this topic right at the beginning. With Python, you can define string literals in your source code by delimiting the text with either single quotes (') or double quotes ("):

>>> david = 'She said "I love you" to me.'
>>> alice = "Oh, that's wonderful to hear!"

Having such a choice can help you avoid a syntax error when your text includes one of those delimiting characters (' or "). For example, if you need to represent an apostrophe in a string, then you can enclose your text in double quotes. Alternatively, you can use multiline strings to mix both types of delimiters in the text.

You may use triple quotes (''' or """) to declare a multiline string literal that can accommodate a longer piece of text, such as an excerpt from the Zen of Python:

>>> poem = """
... Beautiful is better than ugly.
... Explicit is better than implicit.
... Simple is better than complex.
... Complex is better than complicated.
... """

Multiline string literals can optionally act as docstrings, a useful form of code documentation in Python. Docstrings can include bare-bones test cases known as doctests, as well.

Regardless of the delimiter type of your choice, you can always prepend a prefix to your string literal. Just make sure there’s no space between the prefix letters and the opening quote.

When you use the letter r as the prefix, you’ll turn the corresponding string literal into a raw string counterpart. So, what are Python raw strings exactly?



In Short: Python Raw Strings Ignore Escape Character Sequences

In some cases, defining a string through the raw string literal will produce precisely the same result as using the standard string literal in Python:

>>> r"I love you" == "I love you"
True

Here, both literals represent string objects that share a common value: the text I love you. Even though the first literal comes with a prefix, it has no effect on the outcome, so both strings compare as equal.

To observe the real difference between raw and standard string literals in Python, consider a different example depicting a date formatted as a string:

>>> r"10\25\1991" == "10\25\1991"
False

This time, the comparison turns out to be false even though the two string literals look visually similar. Unlike before, the resulting string objects no longer contain the same sequence of characters. The raw string’s prefix (r) changes the meaning of special character sequences that begin with a backslash (\) inside the literal.
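You can see where the characters go by inspecting both values directly; in the standard literal, the sequences \25 and \1 are interpreted as octal escapes, so the two strings end up with different contents and lengths:

>>> r"10\25\1991"
'10\\25\\1991'
>>> "10\25\1991"
'10\x15\x01991'
>>> len(r"10\25\1991"), len("10\25\1991")
(10, 7)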

Note: To understand how Python interprets the above string, head over to the final section of this tutorial, where you’ll cover the most common types of escape sequences in Python.

Read the full article at https://realpython.com/python-raw-strings/ »


Categories: FLOSS Project Planets

Tag1 Consulting: Unraveling the Extract, Transform, Load (ETL) Data Migration Process: A Deep Dive on Load

Planet Drupal - Wed, 2024-01-24 08:37

In this episode of Tag1 Team Talks, our team of Drupal experts delve into the essential "Load" phase of the ETL (Extract, Transform, Load) process in Drupal migrations.

Categories: FLOSS Project Planets

Ned Batchelder: You (probably) don’t need to learn C

Planet Python - Wed, 2024-01-24 06:38

On Mastodon I wrote that I was tired of people saying, “you should learn C so you can understand how a computer really works.” I got a lot of replies which did not change my mind, but helped me understand more how abstractions are inescapable in computers.

People made a number of claims. C was important because syscalls are defined in terms of C semantics (they are not). They said it was good for exploring limited-resource computers like Arduinos, but most people don’t program for those. They said it was important because C is more performant, but Python programs often offload the compute-intensive work to libraries other people have written, and these days that work is often on a GPU. Someone said you need it to debug with strace, then someone said they use strace all the time and don’t know C. Someone even said C was good because it explains why NUL isn’t allowed in filenames, but who tries to do that, and why learn a language just for that trivia?

I’m all for learning C if it will be useful for the job at hand, but you can write lots of great software without knowing C.

A few people repeated the idea that C teaches you how code “really” executes. But C is an abstract model of a computer, and modern CPUs do all kinds of things that C doesn’t show you or explain. Pipelining, cache misses, branch prediction, speculative execution, multiple cores, even virtual memory are all completely invisible to C programs.

C is an abstraction of how a computer works, and chip makers work hard to implement that abstraction, but they do it on top of much more complicated machinery.

C is far removed from modern computer architectures: there have been 50 years of innovation since it was created in the 1970s. The gap between C’s model and modern hardware is the root cause of famous vulnerabilities like Meltdown and Spectre, as explained in C Is Not a Low-level Language.

C can teach you useful things, like how memory is a huge array of bytes, but you can also learn that without writing C programs. People say, C teaches you about memory allocation. Yes it does, but you can learn what that means as a concept without learning a programming language. And besides, what will Python or Ruby developers do with that knowledge other than appreciate that their languages do that work for them and they no longer have to think about it?

Pointers came up a lot in the Mastodon replies. Pointers underpin concepts in higher-level languages, but you can explain those concepts as references instead, and skip pointer arithmetic, aliasing, and null pointers completely.
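
For instance, the reference idea can be demonstrated in Python without any pointer arithmetic; a small added illustration:

# Two names bound to the same list object: a mutation made through one
# name is visible through the other, because both are references.
a = [1, 2, 3]
b = a
b.append(4)
print(a)       # [1, 2, 3, 4]
print(a is b)  # True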

A question I asked a number of people: “What mistakes are JavaScript/Ruby/Python developers making if they don’t know these things (C, syscalls, pointers)?” I didn’t get strong answers.

We work in an enormous tower of abstractions. I write programs in Python, which provides me abstractions that C (its underlying implementation language) does not. C provides an abstract model of memory and CPU execution which the computer implements on top of other mechanisms (microcode and virtual memory). When I made a wire-wrapped computer, I could pretend the signal travelled through wires instantaneously. For other hardware designers, that abstraction breaks down and they need to consider the speed electricity travels. Sometimes you need to go one level deeper in the abstraction stack to understand what’s going on. Everyone has to find the right layer to work at.

Andy Gocke said it well:

When you no longer have problems at that layer, that’s when you can stop caring about that layer. I don’t think there’s a universal level of knowledge that people need or is sufficient.

“like jam or bootlaces” made another excellent point:

There’s a big difference between “everyone should know this” and “someone should know this” that seems to get glossed over in these kinds of discussions.

C can teach you many useful and interesting things. It will make you a better programmer, just as learning any new-to-you language will because it broadens your perspective. Some kinds of programming need C, though other languages like Rust are ably filling that role now too. C doesn’t teach you how a computer really works. It teaches you a common abstraction of how computers work.

Find a level of abstraction that works for what you need to do. When you have trouble there, look beneath that abstraction. You won’t be seeing how things really work, you’ll be seeing a lower-level abstraction that could be helpful. Sometimes what you need will be an abstraction one level up. Is your Python loop too slow? Perhaps you need a C loop. Or perhaps you need numpy array operations.
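
As a rough sketch of that last trade-off (NumPy is assumed to be installed; the data here is an arbitrary example, not from the original post):

import numpy as np

values = list(range(1_000_000))

# Pure-Python loop: every multiplication goes through the interpreter.
doubled_loop = [v * 2 for v in values]

# NumPy array operation: the same work runs in optimized native code.
arr = np.asarray(values)
doubled_vectorized = arr * 2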

You (probably) don’t need to learn C.

Categories: FLOSS Project Planets

Thomas Lange: FAI 6.2 released

Planet Debian - Wed, 2024-01-24 06:12

After more than a year, a new minor FAI version is available, but it includes some interesting new features.

Here are the items from the NEWS file:

fai (6.2) unstable; urgency=low

  • fai-cd can now create live images
  • Use systemd during installation
  • New feature: run FAI inside a screen or tmux session
  • fai-diskimage: do not use the compression of qemu-img, which is slow; instead provide .qcow2.zst, add option -C
  • fai-kvm: add support for booting from USB storage
  • new tool mk-data-partition adds a data partition to an ISO
  • easy installation of packages from /pkgs/<CLASS> directories
  • new helper functions for creating custom list of disks
  • new method detect:// for FAI_CONFIG_SRC

In the past, the command fai-cd was only used for creating installation ISOs that could be used from CD or USB stick. Now it is possible to create a live ISO as well. To do so, you create your live chroot environment using 'fai dirinstall' and then convert it into a bootable live ISO using fai-cd. See man fai-cd(8) for an example.

Years ago I had the idea to use the remaining disk space on a USB stick after copying an ISO onto it. I've blogged about this recently:

https://blog.fai-project.org/posts/extending-iso-images/

The new FAI version includes the tool mk-data-partition for adding a data partition to the ISO itself or to a USB stick.

FAI detects this data partition, mounts it to /media/data and can then use various configurations from it. You may want to copy your own set of .deb packages or your whole FAI config space to this partition. FAI now automatically searches this partition for usable FAI configuration data and packages. FAI will install all packages from pkgs/<CLASSNAME> if the equivalent class is defined. Setting FAI_CONFIG_SRC=detect:// now looks into the data partition for the subdirectory 'config' and uses this as the config space. So it's now possible to modify an existing ISO (that is read-only) and make changes to the config space. If there's no config directory in the data partition FAI uses the default location on the ISO.

The tool fai-kvm, which starts virtual machines, can now boot an ISO not only as a CD but also as a USB stick.

Sometimes users want to adjust the list of disks before partitioning is started. For this, FAI provides several new functions, including:

  • smallestdisk()
  • largestdisk()
  • matchdisks()

You can select individual disks by their model name or even the serial number.

Two new FAI flags were added (tmux and screen) that make it easy to run FAI inside a tmux or screen session.

And finally FAI uses systemd. Yeah!

This technical change had been waiting since 2015 in a merge request from Moritz 'Morty' Strübe that would enable using systemd during the installation. Before that, FAI was still using old-style SysV init scripts and did not start systemd. I hadn't tried to apply the patch, because I was afraid that it would take a lot of time to make it work. But then, in May 2023, Juri Grabowski just gave it a try at MiniDebConf Hamburg, and voilà, it just works! Many, many thanks to Moritz and Juri for their bravery.

The whole changelog can be found at https://tracker.debian.org/media/packages/f/fai/changelog-6.2

New ISOs for FAI are also available, including an example of an Xfce desktop live ISO: https://fai-project.org/fai-cd/

The FAIme service for creating customized installation ISOs will get its update later.

The new packages are available for bookworm by adding this line to your sources.list:

deb https://fai-project.org/download bookworm koeln

Categories: FLOSS Project Planets

IslandT: How to search multiple lines with Python?

Planet Python - Wed, 2024-01-24 04:34

Often you will want to search for words or phrases across an entire paragraph of text, and here is the Python regular expression code that does exactly that.

import re

pattern = re.compile(r'^\w+ (\w+) (\w+)', re.M)

We use the re.M flag so that the ^ anchor matches at the start of every line in the text, not just at the start of the whole string.

Now let us try out the program above…

gad = pattern.findall("hello mr Islandt\nhello mr gadgets")
print(gad)

…which will then display the following outcome

[('mr', 'Islandt'), ('mr', 'gadgets')]

Explanation:

The pattern skips the first word on each line and captures the next two words as a tuple. When the program reaches the newline character, the search continues on the second line and returns another tuple, and both tuples end up inside a list. Thanks to the re.M flag, the search keeps going across multiple lines for as long as there are more matches.
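
To see what the flag actually changes, you can run the same pattern with and without re.M; this contrast is an added illustration, not part of the original snippet:

import re

text = "hello mr Islandt\nhello mr gadgets"

with_flag = re.findall(r'^\w+ (\w+) (\w+)', text, re.M)
without_flag = re.findall(r'^\w+ (\w+) (\w+)', text)

print(with_flag)     # [('mr', 'Islandt'), ('mr', 'gadgets')]
print(without_flag)  # [('mr', 'Islandt')]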

Categories: FLOSS Project Planets

PyBites: Exploring the Role of Static Methods in Python: A Functional Perspective

Planet Python - Wed, 2024-01-24 04:21
Introduction

Python’s versatility in supporting different programming paradigms, including procedural, object-oriented, and functional programming, opens up a rich landscape for software design and development.

Among these paradigms, the use of static methods in Python, particularly in an object-oriented context, has been a topic of debate.

This article delves into the role and implications of static methods in Python, weighing them against a more functional approach that leverages modules and functional programming principles.

The Nature of Static Methods in Python

Definition and Usage:

Static methods in Python are defined within a class using the @staticmethod decorator.

Unlike regular methods, they do not require an instance (self) or class (cls) reference.

They are typically used for utility functions that logically belong to a class but are independent of class instances.

Example in Practice:

Consider this code example from Django:

# django/db/backends/oracle/operations.py
class DatabaseOperations(BaseDatabaseOperations):
    # ... other methods and attributes ...

    @staticmethod
    def convert_empty_string(value, expression, connection):
        return "" if value is None else value

    @staticmethod
    def convert_empty_bytes(value, expression, connection):
        return b"" if value is None else value

Here, convert_empty_string and convert_empty_bytes are static due to their utility nature and specific association with the DatabaseOperations class.
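
For comparison, the same helpers could live as plain functions in a module. The module name and layout below are hypothetical, a sketch of the alternative rather than how Django actually organizes this code:

# conversions.py -- hypothetical module holding the same helpers as
# plain functions instead of static methods on a class.

def convert_empty_string(value, expression, connection):
    """Return an empty string when the database returns None."""
    return "" if value is None else value


def convert_empty_bytes(value, expression, connection):
    """Return empty bytes when the database returns None."""
    return b"" if value is None else value

Callers would then import the module and call these functions directly, without reaching through a class.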

The Case for Modules and Functional Programming

Embracing Python’s Module System:

Python’s module system allows for effective namespace management and code organization.

Namespaces are one honking great idea — let’s do more of those!

The Zen of Python, by Tim Peters

Functions, including those that could be static methods, can be organized in modules, making them reusable and easily accessible.

Functional Programming Advantages:
  1. Quick Development: Functional programming emphasizes simplicity and stateless operations, leading to concise and readable code.
  2. Code Resilience: Pure functions (functions that do not alter external state) enhance predictability and testability. Related: 10 Tips to Write Better Functions in Python
  3. Separation of Concerns: Using functions and modules promotes a clean separation of data representation (classes) and behavior (functions).
Combining Object-Oriented and Functional Approaches

Hybrid Strategy:
  1. Abstraction with Classes: Use classes for data representation, encapsulating state and behavior that are closely related. See also our When to Use Classes article.
  2. Functional Constructs: Utilize functional concepts like higher-order functions, immutability, and pure functions for business logic and data manipulation.
  3. Factories and Observers: Implement design patterns like factory and observer for creating objects and managing state changes, respectively (shout-out to Brandon Rhodes’ great design patterns guide!)
Conclusion: Striking the Right Balance

The decision to use static methods, standalone functions, or a functional programming approach in Python depends on several factors:

  • Relevance: Is the function logically part of a class’s responsibilities?
  • Reusability: Would the function be more versatile as a standalone module function?
  • Simplicity: Can the use of regular functions simplify the class structure and align with the Single Responsibility Principle? Related article: Tips for clean code in Python.

Ultimately, the choice lies in finding the right balance that aligns with the application’s architecture, maintainability, and the development team’s expertise.

Python, with its multi-paradigm capabilities, offers the flexibility to adopt a style that best suits the project’s needs.

Fun Fact: Static Methods Were an Accident

Guido added static methods by accident! He originally meant to add class methods instead.

I think the reason is that a module at best acts as a class where every method is a *static* method, but implicitly so. And we all know how limited static methods are. (They’re basically an accident — back in the Python 2.2 days when I was inventing new-style classes and descriptors, I meant to implement class methods but at first I didn’t understand them and accidentally implemented static methods first. Then it was too late to remove them and only provide class methods.)

Guido van Rossum, see the discussion thread here, and thanks Will for pointing me to this.

Call to Action

What’s your approach to using static methods in Python?

Do you favor a more functional style, or do you find static methods indispensable in certain scenarios?

Share your thoughts and experiences in our community.

Categories: FLOSS Project Planets

eGenix.com: eGenix Antispam Bot for Telegram 0.6.0 GA

Planet Python - Wed, 2024-01-24 03:00
Introduction

eGenix has long been running a local user group meeting in Düsseldorf called Python Meeting Düsseldorf and we are using a Telegram group for most of our communication.

In the early days, the group worked well and we only had a few spammers joining it, which we could easily handle manually.

More recently, this has changed dramatically. We are seeing between 2-5 spam signups per day, often at night. Furthermore, the signup accounts are not always easy to spot as spammers, since they often come with profile images, descriptions, etc.

With the bot, we now have a more flexible way of dealing with the problem.

Please see our project page for details and download links.

Features
  • Low impact mode of operation: the bot tries to keep noise in the group to a minimum
  • Several challenge mechanisms to choose from, more can be added as needed
  • Flexible and easy to use configuration
  • Only needs a few MB of RAM, so can easily be put into a container or run on a Raspberry Pi
  • Can handle quite a bit of load due to the async implementation
  • Works with Python 3.9+
  • MIT open source licensed
News

The 0.6.0 release fixes a few bugs and adds more features:

  • Upgraded to pyrogram 2.0.106, which fixes a weird error we have been getting recently with the old version 1.4.16 (see pyrogram/pyrogram#1347)
  • Catch weird error from Telegram when deleting conversations; this seems to sometimes fail, probably due to a glitch on their side
  • Made the math and char entry challenges a little harder
  • Added new DictItemChallenge

It has been battle-tested in production for several years already and is proving to be a really useful tool to help with Telegram group administration.

More Information

For more information on the eGenix.com Python products, licensing and download instructions, please write to sales@egenix.com.

Enjoy!

Marc-Andre Lemburg, eGenix.com

Categories: FLOSS Project Planets
