Feeds
Mirek Długosz: Understanding Linux virtualization stack
I find the Linux virtualization stack confusing. KVM? libvirt? QEMU? Xen? What does that even mean?
This post is my attempt at making sense of it all. I don’t claim it to be correct, but that’s the way I understand it. If I badly missed the mark somewhere, please reach out and tell me how wrong I am.
Virtualization primer
A virtual machine is a box inside which programs think they are running on different hardware and a different operating system. That box is like a full computer, running inside a computer. Virtualization is the process of creating these boxes. The virtual machine is called the guest, while the operating system that runs the virtual machine is called the host.
Two main reasons to virtualize are security and resource management. Security - because if the virtual machine is done correctly, then a program inside it won’t even know it’s inside a virtual machine, and that means there’s a good chance it won’t escape to interfere with other programs. Resource management - so you can buy a huge server, assign a slice of resources to each box as you see fit, and dynamically change the allocation as your needs shift.
A virtual machine pretends to be an entire computer, but most discussions revolve around the CPU. Each CPU architecture has a specific set of instructions, and programs generally target only one of them. It is technically possible to translate instructions from one architecture to another, but that’s usually slow. Most modern CPUs provide extensions that allow virtual machines to run at speed comparable to non-virtualized installations. Using these extensions is only possible when the guest and the host target the same CPU architecture.
Conceptually, there is no significant difference between virtualization and emulation. Practically, emulation usually refers to the guest thinking it has a different CPU architecture than the host, and to the virtual machine pretending to be some very specific physical hardware - like a video game console. Virtualization in turn usually refers to the specific case where the guest thinks it has the same CPU architecture as the host.
KVM
KVM is the part of the kernel responsible for talking with the hardware. As I mentioned before, most modern CPUs have special extensions that allow guests to achieve near-native performance. KVM makes it possible for a Linux host to use these extensions.
QEMU
QEMU is weird, because it’s multiple different things.
First, it can run virtual machines and pass instructions from the guest to KVM, where they will be handled by the CPU’s virtualization extensions. So you can think of QEMU as the user space component of KVM.
Second, it can run virtual machines while emulating a different CPU architecture. So you can think of QEMU as a user space virtualization solution, which can run without KVM at all.
Third, it can run programs targeting one CPU architecture on a computer with another CPU architecture. It’s like lightweight virtualization, where only a single program is virtualized.
Finally, it is a set of standalone utilities related to virtualization, but not directly tied to anything in particular. Most of these programs work with disk image files in some way.
QEMU is probably the hardest part to understand, because it can do things that other parts of the stack are responsible for. This results in some overlap between different components and encountering program names in contexts where you would not expect them.
I think this overlap is mostly a historical artifact. The oldest commit in the QEMU git repository is dated 2003, while KVM was included in the Linux kernel in 2007. So it seems to me that QEMU was originally intended as a software virtualization solution that did not depend on any hardware or kernel driver. Later Linux gained drivers for virtualization CPU extensions, but they were useless without something in user space that could work with them. So instead of creating a completely new thing, QEMU was extended to work with KVM.
Technically, the journey towards understanding the Linux virtualization stack could end here - you can use QEMU to create, start, stop and modify virtual machines. But QEMU commands tend to be verbose and long, so people rarely use QEMU directly.
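To give a flavor of that verbosity, here is a sketch of what starting a single KVM-accelerated guest by calling QEMU directly can look like, wrapped in Python’s subprocess module for illustration; the disk image name, memory size and other flags are made-up example values, not a recommended configuration.

# Illustrative only: launching one KVM-accelerated guest by invoking QEMU directly.
# The image path, memory size and device flags below are made-up example values.
import subprocess

subprocess.run([
    "qemu-system-x86_64",
    "-enable-kvm",                              # use the KVM kernel driver for near-native speed
    "-m", "2048",                               # 2 GiB of RAM for the guest
    "-smp", "2",                                # 2 virtual CPUs
    "-drive", "file=disk.qcow2,format=qcow2",   # guest disk image
    "-cdrom", "installer.iso",                  # attach an installer ISO
    "-boot", "d",                               # boot from the CD-ROM first
], check=True)

Tools further up the stack exist largely so that nobody has to type or remember invocations like this by hand.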
libvirt
I have ignored it so far, but KVM is not the only virtualization driver in the kernel. There are many, some of them closed-source.
libvirt is intended as a unified layer that abstracts virtual machine management for application developers. So if you would like your application to create, start, stop or delete virtual machines, you don’t have to support each of these operations on KVM, QEMU, Xen and VirtualBox - you can just support libvirt and trust it will do the right thing.
libvirt doesn’t do any work directly, and instead asks other programs to do it. A system administrator is responsible for setting up the host and ensuring virtualization can work. This makes libvirt great for application developers, who can forget about the details of different solutions; but most libvirt users are in a less fortunate position, because they have to take on the role of system administrator.
libvirt will talk with QEMU, which in turn may pass CPU instructions to KVM. Some sources online claim that libvirt can talk with KVM directly. libvirt itself perpetuates this confusion, as KVM is listed alongside QEMU on the main page, giving the impression they are two equivalent components. But the detailed documentation page makes it clear - libvirt only talks with QEMU. It can detect KVM and allow the user to create “hardware accelerated guests”, but that case is still handled through QEMU.
It’s also worth noting that libvirt supports multiple virtualization drivers across multiple operating systems, including FreeBSD and macOS.
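To make the “just support libvirt” idea above a bit more concrete, here is a minimal sketch using libvirt’s official Python bindings; the qemu:///system connection URI is an assumption that targets a local QEMU/KVM host, and on a different stack you would only swap it for that driver’s URI.

# Minimal sketch using the libvirt Python bindings (libvirt-python).
# The qemu:///system URI is an assumption - it targets a local QEMU/KVM host;
# Xen, VirtualBox and other drivers use different connection URIs.
import libvirt

conn = libvirt.open("qemu:///system")
try:
    # "Domain" is libvirt's name for a virtual machine.
    for dom in conn.listAllDomains():
        state = "running" if dom.isActive() else "shut off"
        print(f"{dom.name()}: {state}")
finally:
    conn.close()

The same script works against any other hypervisor libvirt supports, with only the connection URI changed - which is exactly the point of the abstraction.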
virsh
virsh is a command line interface for libvirt. That’s it, there’s nothing more to it. If it were created today, it might have been called libvirtctl.
Vagrant
Vagrant abstracts virtual machine management across operating systems. It’s similar to libvirt in that it allows application developers to target Vagrant and leave all the details to someone else. And just like libvirt, Vagrant doesn’t do anything on its own, but pushes all the work to tools lower in the stack.
Vagrant’s primary audience is software developers. It allows you to succinctly define multiple virtual machines, and then start and provision them all with a single command. These machines are usually considered disposable - they might be deleted and re-created multiple times a day. If you want to manually modify a virtual machine and keep these changes for a longer period of time, Vagrant is probably not the tool for you.
On Linux, Vagrant usually works with libvirt, but may also work with VirtualBox or Xen.
VirtualBox
VirtualBox is a full virtualization solution. It consists of a kernel driver and user space programs, including one with a graphical interface. You can think of VirtualBox as something parallel to all that we’ve discussed so far.
The main reason to use VirtualBox is ease of use. It offers a simpler mental model for virtualization - you just install VirtualBox, start the VirtualBox UI, and create and run virtual machines. While it does require a kernel module, and the kernel is notorious for breaking modules that are not in the tree, VirtualBox has a lot of tooling and integrations that ensure the module is rebuilt every time you install or update a kernel image. On distributions like Ubuntu you don’t have to think about this at all.
Technically speaking, libvirt supports VirtualBox, so you can use libvirt and virsh to manage VirtualBox machines. In practice that seems to be rare.
Xen
Xen is another virtualization stack. You can think about it as something parallel to KVM/QEMU/libvirt and VirtualBox. What sets Xen apart is that instead of running inside your operating system, it runs below your operating system.
Xen manages scheduling, interrupts and timers - that is, it manages who gets access to computer resources, when, and for how long. Xen is the very first thing that starts when you boot your computer.
But Xen is not an operating system. So right after starting, it starts a guest that has one. That guest is special, because it provides drivers for the hardware and is the only one with the privilege to talk with Xen. Only through that special guest virtual machine can you create other virtual machines, assign them resources, etc.
Xen is one of the oldest virtualization solutions, with its first release back in 2003. However, it was included in the kernel only in 2010. So while many people prefer Xen and think favorably of its architecture, it seems to have lost the popularity contest to KVM and QEMU.
Xen provides the tools required to manage guest virtual machines, but you can also manage them through libvirt.
Summary
I thought I would put a visual aid in place of a summary. Each column represents a possible stack - you generally want to pick one of them. Items with a dashed line style are optional - you can use them, but don’t have to. Click to see the bigger image.
Paul Tagliamonte: Complex for Whom?
In basically every engineering organization I’ve ever regarded as particularly high functioning, I’ve sat through one specific recurring conversation which is not – a conversation about “complexity”. Things are good or bad because they are or aren’t complex, architectures need to be redone because they’re too complex – some refactor of whatever it is won’t work because it’s too complex. You may have even been a part of some of these conversations – or even been the one advocating for simple light-weight solutions. I’ve done it. Many times.
When I was writing this, I had a flash-back to an over-10-year-old post by mjg59 about LightDM. It would be a mistake not to link it here.
Rarely, if ever, do we talk about complexity within its rightful context – complexity for whom. Is a solution complex because it’s complex for the end user? Is it complex if it’s complex for an API consumer? Is it complex if it’s complex for the person maintaining the API service? Is it complex if it’s complex for someone outside the team maintaining it to understand? Complexity within a problem domain, I’ve come to believe, is fairly zero-sum – there’s a fixed amount of complexity in the problem to be solved, and you can choose to either solve it, or leave it for those downstream of you to solve on their own.
Although I believe there is a fixed amount of complexity in the lower bound of a problem, you always have the option to change the problem you're solving!
That being said, while I believe there is a lower bound in complexity to contend with for a problem, I do not believe there is an upper bound to the complexity of solutions possible. It is always possible, and in fact, very likely that teams create problems for themselves while trying to solve a problem. The rest of this post is talking to the lower bound. When getting feedback on an early draft of this blog post, I was informed that Fred Brooks coined a term for what I call “lower bound complexity” – “Essential Complexity”, in the paper “No Silver Bullet—Essence and Accident in Software Engineering”, which is a better term and can be used interchangeably.
Complexity Culture
In a large enough organization, where the team is high functioning enough to have and maintain trust amongst peers, members of the team will specialize. People will begin to engage with subsets of the work to be done, and begin to have their efficacy measured against that part of the organization’s problems. Incentives shift, and over time it becomes increasingly likely that two engineers may have two very different priorities when working on the same system together. Someone accountable for uptime and tasked with responding to outages will begin to resist changes. Someone accountable for rapidly delivering features will resist gates between them and their users. Companies (either wittingly or unwittingly) will deal with this by tasking engineers with both production (feature development) and operational tasks (maintenance), so the difference in incentives isn’t usually as bad as it could be.
The events depicted in this movie are fictitious. Any similarity to any person living or dead is merely coincidental.
When we get a bunch of folks from far-flung corners of an organization in a room, fire up a slide deck and throw up some aspirational to-be architecture diagram in order to get a sign-off to solve some problem (be it that someone needs a credible promotion packet, a new feature needs to get delivered, or the system has begun to fail and needs fixing), the initial reaction will, more often than I’d like, start to devolve into a discussion of how this is going to introduce a bunch of complexity, going to be hard to maintain, why can’t you make it less complex?
In a high functioning environment, this is a mostly healthy impulse, coming from a good place, and genuinely intended to prevent problems for the whole organization by reducing non-essential complexity. That is good. I'm talking about a conversation discussing removing lower-limit complexity.
Right around here is when I start to try and contextualize the conversation happening around me – understand what complexity is being discussed, and who is taking on that burden. Think about who should be owning that problem, and work through the tradeoffs involved. Is it best solved here, or left to consumers (be they other systems, developers, or users)? Should something become an API call’s optional param, taking on all the edge-cases and maintenance, or should users have to implement the logic using the data you return (leaving everyone else to take on all the edge-cases and maintenance)? Should you process the data, or require the user to preprocess it for you?
Carla Geisser described this as being reminiscent of the technique outlined in "end to end arguments in system design", which she uses to think about where complexity winds up in a system. It's an extremely good parallel.
Frequently it’s right to make an active and explicit decision to simplify and leave problems to be solved downstream, since they may not actually need to be solved – or perhaps you expect consumers will want to own the specifics of how the problem is solved, in which case you leave lots of documentation and examples. Many other times, especially when it’s something downstream consumers are likely to hit, it’s best solved internal to the system, since the only thing that can come of leaving it unsolved are bugs, frustration and half-correct solutions. This is a grey-space of tradeoffs, not a clear decision tree. No one wants the software manifestation of a katamari ball or a junk drawer, nor does anyone want a half-baked service unable to handle the simplest use-case.
Head-in-sand as a Service
Popoffs about how complex something is, are, to a first approximation, best understood as meaning “complicated for the person making comments”. A lot of the #thoughtleadership believe that an AWS hosted EKS k8s cluster running images built by CI talking to an AWS hosted PostgreSQL RDS is not complex. They’re right. Mostly right. This is less complex – less complex for them. It’s not, however, without complexity and its own tradeoffs – it’s just complexity that they do not have to deal with. Now they don’t have to maintain machines that have pesky operating systems or hard drive failures. They don’t have to deal with updating the version of k8s, nor ensuring the backups work. No one has to push some artifact to prod manually. Deployments happen unattended. You click a button and get a cluster.
On the other hand, developers outside the ops function need to deal with troubleshooting CI, debugging access control rules encoded in turing complete YAML, permissions issues inside the cluster due to whatever the fuck a service mesh is, everyone needs to learn how to use some k8s tools they only actually use during a bad day, likely while doing some x.509 troubleshooting to connect to the cluster (an internal only endpoint; just port forward it) – not to mention all sorts of rules to route packets to their project (a single repo’s binary being run in 3 containers on a single vm host).
Truly I'm not picking on k8s here; I do genuinely believe it when I say EKS is less complex for me to operate well; that's kinda the whole point.
Beyond that, there’s the invisible complexity – complexity on the interior of a service you depend on. I think about the dozens of teams maintaining the EKS service (which is either run on EC2 instances, or alternately, EC2 instances in a trench coat, moustache and even more shell scripts), the RDS service (also EC2 and shell scripts, but this time accounting for redundancy, backups, availability zones), scores of hypervisors pulled off the shelf (xen, kvm) smashed together with the ones built in-house (firecracker, nitro, etc) running on hardware that has to be refreshed and maintained continuously. Every request is processed by network ACL rules, AWS IAM rules, security group rules, using IP space announced to the internet wired through IXPs directly into ISPs. I don’t even want to begin to think about the complexity inherent in how those switches are designed. Shitloads of complexity to solve problems you may or may not have, or even know you had.
Do I care about invisible complexity? Generally, no. I don't. It's not my problem and they don't show up to my meetings.
What’s more complex? An app running in an in-house 4u server racked in the office’s telco closet in the back running off the office Verizon line, or an app running four hypervisors deep in an AWS datacenter? Which is more complex to you? What about to your organization? In total? Which is more prone to failure? Which is more secure? Is the complexity good or bad? What type of complexity can you manage effectively? Which threatens the system? Which threatens your users?
COMPLEXIVIBES
This extends beyond Engineering. Decisions regarding “what tools are we able to use” – be they existing contracts with cloud providers, CIO-mandated SaaS products, or a list of the only permissible open source projects – will incur costs in terms of expressed “complexity”. Pinning open source projects to a fixed set makes SBOM production “less complex”. Using only one SaaS provider’s product suite (even if it’s terrible, because it has all the types of tools you need) makes accreditation “less complex”. If all you have is a contract with Pauly T’s lowest price technically acceptable artisanal cloudary and haberdashery, the way you pay for your compute is “less complex” for the CIO shop, though you will find yourself building your own hosted database template, your own mechanism to spin up a k8s cluster, and all the operational and technical burden that comes with it. Or you won’t, and you’ll make it everyone else’s problem in the organization. Nothing you can do will solve for the fact that you must now deal with this problem somewhere, because it was less complicated for the business to put the workloads on the existing contract with a cut-rate vendor.
Suddenly, the decision to “reduce complexity” because of an existing contract vehicle has resulted in a huge amount of technical risk and maintenance burden being onboarded. Complexity you would otherwise externalize has now been taken on internally. With a large enough organization (specifically, in this case, I’m talking about you, bureaucracies), this is largely ignored or accepted as normal, since the personnel cost is understood to be free to everyone involved. Doing it this way is more expensive, more work, less reliable and less maintainable, and yet, somehow, is, in a lot of ways, “less complex” to the organization. It’s particularly bad with bureaucracies, since screwing up a contract will get you into much more trouble than delivering a broken product, leaving basically no reason for anyone to care to fix this.
I can’t shake the feeling that for every story of technical mandates gone awry, somewhere just out of sight there’s a decisionmaker optimizing for what they believe to be the least amount of complexity – least hassle, fewest unique cases, most consistency – as they can. They freely offload complexity from their accreditation and risk acceptance functions through mandates. They will never have to deal with it. That does not change the fact that someone does.
TC;DR (TOO COMPLEX; DIDN’T REVIEW)
We wish to rid ourselves of systemic Complexity – after all, complexity is bad, simplicity is good. Removing upper-bound own-goal complexity (“accidental complexity” in Brooks’s terms) is important, but once you hit the lower bound, the tradeoffs become zero-sum. Removing complexity from one part of the system means that somewhere else – maybe outside your organization, or in a non-engineering function – it must grow back. Sometimes the opposite is the case, such as when a previously manual business process is automated. Maybe that’s a good idea. Maybe it’s not. All I know is that what doesn’t help the situation is conflating complexity with everything we don’t like – legacy code, maintenance burden or toil, cost, delivery velocity.
- Complexity is not the same as proclivity to failure. The most reliable systems I’ve interacted with are unimaginably complex, with layers of internal protection to prevent complete failure. This has its own set of costs which other people have written about extensively.
- Complexity is not cost. Sometimes the cost of taking all the complexity in-house is less, for whatever value of cost you choose to use.
- Complexity is not absolute. Something simple from one perspective may be wildly complex from another. The impulse to burn down complex sections of code is helpful to have generally, but sometimes things are complicated for a reason, even if that reason exists outside your codebase or organization.
- Complexity is not something you can remove without introducing complexity elsewhere. Just as not making a decision is a decision itself, choosing to require someone else to deal with a problem rather than dealing with it internally is a choice that needs to be considered in its full context.
Next time you’re sitting through a discussion and someone starts to talk about all the complexity about to be introduced, I want to pop up in the back of your head, politely asking: what does complex mean in this context? Is it lower bound complexity? Is this complexity desirable? Does what they’re saying mean something along the lines of I don’t understand the problems being solved, or does it mean something along the lines of this problem should be solved elsewhere? Do they believe this will result in more work for them in a way that you don’t see? Should this not be solved at all, by changing the bounds of what we accept or redefining the understood limits of this system? Is the perceived complexity a result of a decision made elsewhere? Who’s taking this complexity on, or more to the point, is failing to address complexity required by the problem leaving it to others? Does it impact others? How specifically? What are you not seeing?
What can change?
What should change?
PyCoder’s Weekly: Issue #655 (Nov. 12, 2024)
#655 – NOVEMBER 12, 2024
In this video course, you’ll learn all about web scraping in Python. You’ll see how to parse data from websites and interact with HTML forms using tools such as Beautiful Soup and MechanicalSoup.
REAL PYTHON course
This article compares single-threaded, threaded, and multi-process versions of the same code under Python 3.12, 3.13, and 3.13 free-threaded with the GIL on and off.
ARTHUR PASTEL
The only web scraper API that extracts data at scale with a 98.7% average success rate while automatically handling all anti-bot systems. ZenRows is a complete web scraping toolkit with premium proxies, anti-CAPTCHA, headless browsers, and more. Try for free now →
ZENROWS sponsor
This is the first post in a multi-part series that uses Python to build tiny interpreters for other languages.
SERGE ZAITSEV
Michael (of Talk Python fame) introduces the concept of “stack-native” as the opposite of “cloud-native”, and how it applies to Python web apps: building applications with just enough full-stack building blocks to run reliably with minimal complexity, rather than relying on a multitude of cloud services.
MICHAEL KENNEDY
In this tutorial, you’ll learn how to reset a pandas DataFrame index using various techniques. You’ll also learn why you might want to do this and understand the problems you can avoid by optimizing the index structure.
REAL PYTHON
Posit Connect lets data teams publish, host, & manage Python work. It provides a secure, scalable platform for sharing data insights with those who need them, including models, Jupyter notebooks, Streamlit & Shiny apps, Plotly dashboards, and more.
POSIT sponsor
In this tutorial, you’ll learn about Python closures. A closure is a function-like object with an extended scope. You can use closures to create decorators, factory functions, stateful functions, and more.
REAL PYTHON
You can go surprisingly far with a simple software architecture, in fact simplicity can make it easier to scale. This post talks about some real world cases of just that.
DAN LUU
This project shows you how to solve the Vehicle Routing Problem (VRP) using OpenStreetMap data and Google’s OR-Tools library to find efficient routes.
MEDIUM.COM/P • Shared by Albert Ferré
This is a deep dive article on Python project management and packaging. It covers the pyproject.toml file, modules, dependencies, locking, and more.
REINFORCED KNOWLEDGE
This opinion piece discusses the fact that our faster hardware still can’t keep up with our bloated software and why that is the case.
PRANAV NUTALAPTI
This post by a Django core developer talks about the last twenty years of the framework and why it has had such staying power.
CARLTON GIBSON
This quick-hit post shows you how to use the map() and filter() functions in conjunction with list comprehensions.
JUHA-MATTI SANTALA
This post talks about when and when not to use a named tuple in your API and why you might make that choice.
BRETT CANNON
November 12, 2024
MEETUP.COM • Shared by Laura Stephens
November 13, 2024
REALPYTHON.COM
November 14 to November 16, 2024
PYCON.SE
November 15, 2024
MEETUP.COM
November 16 to November 17, 2024
PYCON.HK
November 16 to November 17, 2024
PYCON.JP
November 16 to November 18, 2024
PYTHON.IE
November 22 to November 27, 2024
PYCON.ORG.AU
Happy Pythoning!
This was PyCoder’s Weekly Issue #655.
FSF Events: Free Software Directory meeting on IRC: Friday, November 11, starting at 12:00 EST (17:00 UTC)
Sven Hoexter: fluxcd: Validate flux-system Root Kustomization
Not entirely sure how people use fluxcd, but I guess most people have something like a flux-system flux kustomization as the root to add more flux kustomizations to their kubernetes cluster. Here all of that is living in a monorepo, and as we're all humans people figure out different ways to break it, which brings the reconciliation of the flux controllers down. Thus we set out to do some pre-flight validations.
Note1: We do not use flux variable substitutions for those root kustomizations, so if you use those, you’ll have to put additional work into the validation and pipe things through flux envsubst.
First Iteration: Just Run kustomize Like Flux Would Do It
With a folder structure where we have a clusters folder with subfolders per cluster, we just run a for loop over all of them:
for CLUSTER in ${CLUSTERS}; do
  pushd clusters/${CLUSTER}
  # validate if we can create and build a flux-system like kustomization file
  kustomize create --autodetect --recursive
  if ! kustomize build . -o /dev/null 2> error.log; then
    echo "Error building flux-system kustomization for cluster ${CLUSTER}"
    cat error.log
  fi
  popd
done

Second Iteration: Make Sure Our Workload Subfolders Have a kustomization.yaml
Next someone figured out that you can delete some yaml files from a workload subfolder, including the kustomization.yaml, but not all of them. That left around a resource definition which lacks some other referenced objects, but is still happily included into the root kustomization by kustomize create and flux, which of course did not work.
Thus we started to catch that as well in our growing for loop:
for CLUSTER in ${CLUSTERS}; do
  pushd clusters/${CLUSTER}
  # validate if we can create and build a flux-system like kustomization file
  kustomize create --autodetect --recursive
  if ! kustomize build . -o /dev/null 2> error.log; then
    echo "Error building flux-system kustomization for cluster ${CLUSTER}"
    cat error.log
  fi
  # validate if we always have a kustomization file in folders with yaml files
  for CLFOLDER in $(find . -type d); do
    test -f ${CLFOLDER}/kustomization.yaml && continue
    test -f ${CLFOLDER}/kustomization.yml && continue
    if [[ $(find ${CLFOLDER} -maxdepth 1 \( -name '*.yaml' -o -name '*.yml' \) -type f|wc -l) != 0 ]]; then
      echo "Error Cluster ${CLUSTER} folder ${CLFOLDER} lacks a kustomization.yaml"
    fi
  done
  popd
done

Note2: I shortened those snippets to the core parts. In our case some things are a bit specific to how we implemented the execution of those checks in GitHub action workflows. Hope that's enough to transport the idea of what to check for.
Real Python: Formatting Floats Inside Python F-Strings
You’ll often need to format and round a Python float to display the results of your calculations neatly within strings. In earlier versions of Python, this was a messy thing to do because you needed to round your numbers first and then use either string concatenation or the old string formatting technique to do this for you.
Since Python 3.6, the literal string interpolation, more commonly known as a formatted string literal or f-string, allows you to customize the content of your strings in a more readable way.
An f-string is a literal string prefixed with a lowercase or uppercase letter f and contains zero or more replacement fields enclosed within a pair of curly braces {...}. Each field contains an expression that produces a value. You can calculate the field’s content, but you can also use function calls or even variables.
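For example, the format specifier after the colon controls how a float is rendered; the price value below is just an illustration:

price = 1234.56789

# Two decimal places, rounded
print(f"Price: {price:.2f}")     # Price: 1234.57

# Thousands separator plus two decimal places
print(f"Price: {price:,.2f}")    # Price: 1,234.57

# Pad to a fixed width of 10 characters, handy for aligning columns
print(f"Price: {price:10.2f}")   # Price:    1234.57

# Scientific notation, rounded to three decimal places
print(f"Price: {price:.3e}")     # Price: 1.235e+03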
James Bromberger: My own little server
1xINTERNET blog: Inside DrupalCamp Berlin 2024: Innovations, Awards, and the Future of Drupal CMS
DrupalCamp Berlin 2024 united 150+ Drupal enthusiasts, showcasing AI innovations, inspiring talks, and Dries Buytaert’s vision for the new “Drupal CMS.” From Splash Awards to collaboration highlights, dive into the full recap of this dynamic event!
Metadrop: Drupal Camp 2024 in Benidorm
Another year, another Drupal Camp, and I'm once again delighted by the community, its great atmosphere, and all the things showcased in the talks, informal meetings, hallways, and social events. The motto "Come for the software, stay for the community" remains as relevant as ever. This year, I noticed quite a few new faces. During the closing session, there was a raffle just for first-time attendees at Drupal Camp, and I counted about thirty people in the raffle, so at least 15% were newcomers who, I hope, felt that motto. Additionally, there was significant attendance at the Forcontu introduction to Drupal 10 course, partly because this year the Spanish Drupal Association found the right strategy to bring in students from various institutions.
As for me, I believe the community remains healthy and strong.
The sessions
Since I still haven't mastered the art of being in two places at once and attending multiple talks simultaneously, I was only able to see a few and hear opinions on some others, so the selection I provide below is incomplete and biased, but I believe it's still interesting.
Drupal future
This year, the Drupal Starshot initiative has been launched, aiming to create a new entry point for Drupal, which would be the default download from Drupal.org instead of the Drupal core. The…
joshics.in: Leveraging AI for Optimized Drupal Development: A New Era of Efficiency
In the world of web development, staying ahead means being willing to embrace innovation. One of the most transformative forces in technology today is Artificial Intelligence (AI), and its integration with Drupal development is proving to be a pivotal moment.
Why AI?
AI brings automation, precision, and speed to processes that were traditionally time-consuming and manually intensive. In the realm of Drupal development, these qualities unlock unprecedented possibilities, streamlining workflows and enhancing the overall quality of outcomes.
Automated Code Generation
One of the most practical applications of AI in Drupal development is automated code generation. AI tools, like machine learning algorithms, can automate repetitive coding tasks, freeing developers to focus on refining and innovating rather than getting bogged down in routine work.
Imagine AI-powered assistants generating initial code drafts or suggesting code snippets based on programming patterns. This capability can drastically reduce the time developers spend searching for syntax or debugging, accelerating the development cycle and increasing productivity.
Enhanced Security
AI's predictive capabilities allow for the identification and mitigation of potential security vulnerabilities before they become actual threats. In a Drupal environment, this proactive approach translates into safer websites and peace of mind for both developers and clients.
AI can analyze patterns in data traffic, detect anomalies that might suggest a security breach, and even suggest patches or updates to address these vulnerabilities swiftly.
Personalized User Experiences
The demand for personalized user experiences is growing, and AI is poised to meet this challenge head-on. By analyzing vast amounts of user data, AI can predict and tailor content delivery to individual preferences, enhancing engagement and retaining users on Drupal sites.
For instance, AI algorithms can determine the best times to display certain types of content or recommend products, ensuring users are met with relevant information every time they visit a site.
Efficient Data Analysis
Integrating AI with Drupal isn't just about improving front-end experiences; it also offers powerful data analysis capabilities. Developers can gain deep insights from user interactions and site performance metrics, leading to more informed, data-driven decisions that enhance site functionality and user satisfaction.
AI can detect patterns in user behavior, suggest optimizations for content delivery, and even predict future trends, enabling developers to remain agile and responsive to changing user needs.
Challenges and Considerations
While AI offers numerous benefits, it's essential to approach its integration with a clear understanding of its limitations and potential drawbacks. Navigating these challenges effectively requires a commitment to responsible AI usage and continuous learning. It's crucial to stay updated with the latest advancements and best practices. AI models can sometimes produce biased or unexpected results, making it important to have human oversight in place.
Ethical considerations are key. Developers need to ensure that AI tools are used to enhance the user experience while respecting privacy and data protection standards. Transparency with clients about AI interventions and data usage builds trust and aligns with ethical standards.
Training and skill development remain essential. As AI technologies evolve, so should the competencies of those who work with them. Investing in training ensures that teams are equipped to harness AI’s full potential effectively and responsibly.
In summary, AI is not just an add-on, it’s a transformative element that can elevate Drupal development to new heights of efficiency and innovation. By embracing AI thoughtfully, developers can not only streamline processes but also create more secure, personalized, and engaging digital experiences for users.
Specbee: A closer look at JavaScript’s native Array and Object methods
Freexian Collaborators: Monthly report about Debian Long Term Support, October 2024 (by Roberto C. Sánchez)
Like each month, have a look at the work funded by Freexian’s Debian LTS offering.
Debian LTS contributors
In October, 20 contributors were paid to work on Debian LTS; their reports are available:
- Abhijith PA did 6.0h (out of 7.0h assigned and 7.0h from previous period), thus carrying over 8.0h to the next month.
- Adrian Bunk did 15.0h (out of 87.0h assigned and 13.0h from previous period), thus carrying over 85.0h to the next month.
- Arturo Borrero Gonzalez did 10.0h (out of 10.0h assigned).
- Bastien Roucariès did 20.0h (out of 20.0h assigned).
- Ben Hutchings did 4.0h (out of 0.0h assigned and 4.0h from previous period).
- Chris Lamb did 18.0h (out of 18.0h assigned).
- Daniel Leidert did 29.0h (out of 26.0h assigned and 3.0h from previous period).
- Emilio Pozuelo Monfort did 60.0h (out of 23.5h assigned and 36.5h from previous period).
- Guilhem Moulin did 7.5h (out of 19.75h assigned and 0.25h from previous period), thus carrying over 12.5h to the next month.
- Lee Garrett did 15.25h (out of 0.0h assigned and 60.0h from previous period), thus carrying over 44.75h to the next month.
- Lucas Kanashiro did 10.0h (out of 10.0h assigned and 10.0h from previous period), thus carrying over 10.0h to the next month.
- Markus Koschany did 40.0h (out of 40.0h assigned).
- Ola Lundqvist did 14.5h (out of 6.5h assigned and 17.5h from previous period), thus carrying over 9.5h to the next month.
- Roberto C. Sánchez did 9.75h (out of 24.0h assigned), thus carrying over 14.25h to the next month.
- Santiago Ruano Rincón did 23.5h (out of 25.0h assigned), thus carrying over 1.5h to the next month.
- Sean Whitton did 6.25h (out of 1.0h assigned and 5.25h from previous period).
- Stefano Rivera did 1.0h (out of 0.0h assigned and 10.0h from previous period), thus carrying over 9.0h to the next month.
- Sylvain Beucler did 9.5h (out of 16.0h assigned and 44.0h from previous period), thus carrying over 50.5h to the next month.
- Thorsten Alteholz did 11.0h (out of 11.0h assigned).
- Tobias Frost did 10.5h (out of 12.0h assigned), thus carrying over 1.5h to the next month.
In October, we have released 35 DLAs.
Some notable updates prepared in October include denial of service vulnerability fixes in nss, regression fixes in apache2, multiple fixes in php7.4, and new upstream releases of firefox-esr, openjdk-17, and openjdk-11.
Additional contributions were made for the stable Debian 12 bookworm release by several LTS contributors. Arturo Borrero Gonzalez prepared a parallel update of nss, Bastien Roucariès prepared a parallel update of apache2, and Santiago Ruano Rincón prepared updates of activemq for both LTS and Debian stable.
LTS contributor Bastien Roucariès undertook a code audit of the cacti package and in the process discovered three new issues in node-dompurify, which were reported upstream and resulted in the assignment of three new CVEs.
As always, the LTS team continues to work towards improving the overall sustainability of the free software base upon which Debian LTS is built. We thank our many committed sponsors for their ongoing support.
Thanks to our sponsors
Sponsors that joined recently are in bold.
- Platinum sponsors:
- Toshiba Corporation (for 109 months)
- Civil Infrastructure Platform (CIP) (for 77 months)
- VyOS Inc (for 41 months)
- Gold sponsors:
- Roche Diagnostics International AG (for 119 months)
- Akamai - Linode (for 113 months)
- Babiel GmbH (for 103 months)
- Plat’Home (for 102 months)
- CINECA (for 77 months)
- University of Oxford (for 59 months)
- Deveryware (for 46 months)
- EDF SA (for 31 months)
- Dataport AöR (for 6 months)
- CERN (for 4 months)
- Silver sponsors:
- Domeneshop AS (for 124 months)
- Nantes Métropole (for 118 months)
- Univention GmbH (for 110 months)
- Université Jean Monnet de St Etienne (for 110 months)
- Ribbon Communications, Inc. (for 104 months)
- Exonet B.V. (for 94 months)
- Leibniz Rechenzentrum (for 88 months)
- Ministère de l’Europe et des Affaires Étrangères (for 72 months)
- Cloudways by DigitalOcean (for 61 months)
- Dinahosting SL (for 59 months)
- Bauer Xcel Media Deutschland KG (for 53 months)
- Platform.sh SAS (for 53 months)
- Moxa Inc. (for 47 months)
- sipgate GmbH (for 45 months)
- OVH US LLC (for 43 months)
- Tilburg University (for 43 months)
- GSI Helmholtzzentrum für Schwerionenforschung GmbH (for 34 months)
- Soliton Systems K.K. (for 31 months)
- THINline s.r.o. (for 7 months)
- Copenhagen Airports A/S
- Bronze sponsors:
- Evolix (for 124 months)
- Seznam.cz, a.s. (for 124 months)
- Intevation GmbH (for 121 months)
- Linuxhotel GmbH (for 121 months)
- Daevel SARL (for 120 months)
- Bitfolk LTD (for 119 months)
- Megaspace Internet Services GmbH (for 119 months)
- Greenbone AG (for 118 months)
- NUMLOG (for 118 months)
- WinGo AG (for 117 months)
- Entr’ouvert (for 108 months)
- Adfinis AG (for 106 months)
- Tesorion (for 101 months)
- GNI MEDIA (for 100 months)
- Laboratoire LEGI - UMR 5519 / CNRS (for 100 months)
- Bearstech (for 92 months)
- LiHAS (for 92 months)
- Catalyst IT Ltd (for 87 months)
- Supagro (for 82 months)
- Demarcq SAS (for 81 months)
- Université Grenoble Alpes (for 67 months)
- TouchWeb SAS (for 59 months)
- SPiN AG (for 56 months)
- CoreFiling (for 52 months)
- Institut des sciences cognitives Marc Jeannerod (for 47 months)
- Observatoire des Sciences de l’Univers de Grenoble (for 43 months)
- Tem Innovations GmbH (for 38 months)
- WordFinder.pro (for 37 months)
- CNRS DT INSU Résif (for 36 months)
- Alter Way (for 29 months)
- Institut Camille Jordan (for 19 months)
- SOBIS Software GmbH (for 4 months)
Krita Monthly Update - Edition 20
Welcome to the @Krita-promo team's October 2024 development and community update.
Development Report
Android-only Krita 5.2.8 Hotfix Release
Krita 5.2.6 was reported to crash on startup on devices running Android 14 or later. This was caused by issues with an SDK update required for release on the Play Store, so a temporary 5.2.7 release reverting it was available from the downloads page only.
However, the issue has now been resolved and 5.2.8 is rolling out on the Play Store. Note that 5.2.8 raises the minimum supported Android version to Android 7.0 (Nougat).
Community Bug Hunt Started
The development team has declared a "Bug Hunt Month" running through November, and needs the community's help to decide what to do with each and every one of the hundreds of open bug reports on the bug tracker. Which reports are valid and need to be fixed? Which ones need more info or are already resolved?
Read the bug hunting guide and join in on the bug hunt thread on the Krita-Artists forum.
Community Report
October 2024 Monthly Art Challenge Results
For the "Buried, Stuck, or otherwise Swallowed" theme, 16 members submitted 18 original artworks. And the winner is… Tomorrow, contest… I’m so finished by @mikuma_poponta!
The November Art Challenge is Open Now
For the November Art Challenge, @mikuma_poponta has chosen "Fluffy" as the theme, with the optional challenge of making it "The Ultimate Fluffy According to Me". See the full brief for more details, and get comfortable with this month's theme.
Featured Artwork
Best of Krita-Artists - September/October 2024
8 images were submitted to the Best of Krita-Artists Nominations thread, which was open from September 14th to October 11th. When the poll closed on October 14th, moderators had to break a four-way tie for the last two spots, resulting in these five wonderful works making their way onto the Krita-Artists featured artwork banner:
Sapphire by @Dehaf
Sci-Fi Spaceship by @NAKIGRAL
Oracular Oriole by @SylviaRitter
Air Port by @agarad
Dancing with butterflies 🦋 by @Kichirou_Okami
Best of Krita-Artists - October/November 2024
Nominations were accepted until November 11th. The poll is now open until November 14th. Don't forget to vote!
Ways to Help Krita
Krita is Free and Open Source Software developed by an international team of sponsored developers and volunteer contributors.
Visit Krita's funding page to see how user donations keep development going, and explore a one-time or monthly contribution. Or check out more ways to Get Involved, from testing, coding, translating, and documentation writing, to just sharing your artwork made with Krita.
The Krita-promo team has put out a call for volunteers, come join us and help keep these monthly updates going.
Notable Changes
Notable changes in Krita's development builds from Oct. 10 - Nov. 12, 2024.
Stable branch (5.2.9-prealpha):
- Layers: Fix infinite loop when a clone layer is connected to a group with clones, and a filter mask triggers an out-of-bounds update. (Change, by Dmitry Kazakov)
- General: Fix inability to save a document after saving while the image is busy and then canceling the busy operation. (bug report) (Change, by Dmitry Kazakov)
- Resources: Fix crash when re-importing a resource after modifying it. (bug report) (Change, by Dmitry Kazakov)
- Brush Presets: Fix loading embedded resources from .kpp files. (bug report, bug report, bug report) (Change, by Dmitry Kazakov)
- Brush Tools: Fix the Dynamic Brush Tool to not use the Freehand Brush Tool's smoothing settings which it doesn't properly support. (bug report) (Change, by Mathias Wein) (Change, by Dmitry Kazakov)
- Recorder Docker: Prevent interruption of the Text Tool by disabling recording while it is active. (bug report) (Change, by Dmitry Kazakov)
- File Formats: EXR: Possibly fix saving EXR files with extremely low alpha values. (Change, by Dmitry Kazakov)
- File Formats: EXR: Try to keep color space profile when saving EXR of incompatible type. (Change, by Dmitry Kazakov)
- File Formats: EXR: Fix bogus offset when saving EXR with moved layers. (Change, by Dmitry Kazakov)
- File Formats: JPEG-XL: Fix potential lockup when loading multi-page images. (Change, by Rasyuqa A H)
- Keyboard Shortcuts: Set the default shortcut for Zoom In to = instead of +. (bug report) (Change, by Halla Rempt)
- Brush Editor: Make the Saturation and Value brush options' graph and graph labels consistently agree on the range being -100% to 100% with 0% as neutral. (bug report) (Change, by Dmitry Kazakov)
Bug fixes:
- Vector Layers: Fix endlessly slow rendering of vector layers with clipping masks. (bug report) (Change, by Dmitry Kazakov)
- Layers: Fix issue with transform masks on group layers not showing until visibility change, and visibility change of passthrough groups with layer styles causing artifacts. (bug report) (Change, by Dmitry Kazakov)
- Brush Editor: Fix crash when clearing scratchpad while it's busy rendering a resource-intensive brushstroke. (bug report) (Change, by Dmitry Kazakov)
- File Formats: EXR: Add GUI option for selecting the default color space for EXR files. (Change, by Dmitry Kazakov)
- Transform Tool: Liquify: Move the Move/Rotate/Scale/Offset/Undo buttons to their own spot instead of alongside unrelated options, to avoid confusion. (bug report) (Change, by Emmet O'Neill)
- Move Tool: Fix Force Instant Preview in the Move tool to be off by default. (CCbug report) (Change, by Halla Rempt)
- Pop-Up Palette: Fix lag in selecting a color in the Pop-Up Palette. (bug report) (Change, by Dmitry Kazakov)
- Scripting: Fix accessing active node state from the Python scripts. (bug report) (Change, by Dmitry Kazakov)
- Usability: Remove unnecessary empty space at the bottom of Transform, Move and Crop tool options. (bug report) (Change, by Dmitry Kazakov)
Pre-release versions of Krita are built every day for testing new changes.
Get the latest bugfixes in Stable "Krita Plus" (5.2.9-prealpha): Linux - Windows - macOS (unsigned) - Android arm64-v8a - Android arm32-v7a - Android x86_64
Or test out the latest Experimental features in "Krita Next" (5.3.0-prealpha). Feedback and bug reports are appreciated!: Linux - Windows - macOS (unsigned) - Android arm64-v8a - Android arm32-v7a - Android x86_64
Have feedback?
Join the discussion of this post on the Krita-Artists forum!
Python⇒Speed: Using portable SIMD in stable Rust
In a previous post we saw that you can speed up code significantly on a single core using SIMD: Single Instruction Multiple Data. These specialized CPU instructions allow you to, for example, add 4 values at once with a single instruction, instead of the usual one value at a time. The performance improvement you get compounds with multi-core parallelism: you can benefit from both SIMD and threading at the same time.
Unfortunately, SIMD instructions are specific both to CPU architecture and CPU model. Thus ARM CPUs as used on modern Macs have different SIMD instructions than x86-64 CPUs. And even if you only care about x86-64, different models support different instructions; the i7-12700K CPU in my current computer doesn’t support AVX-512 SIMD, for example.
One way to deal with this is to write custom versions for each variation of SIMD instructions. Another is to use a portable SIMD library, that provides an abstraction layer on top of these various instruction sets.
In the previous post we used the std::simd library. It is built into the Rust standard library… but unfortunately it’s currently only available when using an unstable (“nightly”) compiler.
How can you write portable SIMD if you want to use stable Rust? In this article we’ll:
- Introduce the wide crate, which lets you write portable SIMD in stable Rust.
- Show how it can be used to reimplement the Mandelbrot algorithm we previously implemented with std::simd.
- Go over the pros and cons of these alternatives.
Eli Bendersky: ML in Go with a Python sidecar
Machine learning models are rapidly becoming more capable; how can we make use of these powerful new tools in our Go applications?
For top-of-the-line commercial LLMs like ChatGPT, Gemini or Claude, the models are exposed as language agnostic REST APIs. We can hand-craft HTTP requests or use client libraries (SDKs) provided by the LLM vendors. If we need more customized solutions, however, some challenges arise. Completely bespoke models are typically trained in Python using tools like TensorFlow, JAX or PyTorch that don't have real non-Python alternatives.
In this post, I will present some approaches for Go developers to use ML models in their applications - with increasing level of customization. The summary up front is that it's pretty easy, and we only have to deal with Python very minimally, if at all - depending on the circumstances.
Internet LLM services
This is the easiest category: multimodal services from Google, OpenAI and others are available as REST APIs with convenient client libraries for most leading languages (including Go), as well as third-party packages that provide abstractions on top (e.g. langchaingo).
Check out the official Go blog titled Building LLM-powered applications in Go that was published earlier this year. I've written about it before on this blog as well: #1, #2, #3 etc.
Go is typically as well supported as other programming languages in this domain; in fact, it's uniquely powerful for such applications because of its network-native nature; quoting from the Go blog post:
Working with LLM services often means sending REST or RPC requests to a network service, waiting for the response, sending new requests to other services based on that and so on. Go excels at all of these, providing great tools for managing concurrency and the complexity of juggling network services.
Since this has been covered extensively, let's move on to the more challenging scenarios.
Locally-running LLMs
There's a plethora of high-quality open models [1] one can choose from to run locally: Gemma, Llama, Mistral and many more. While these models aren't quite as capable as the strongest commercial LLM services, they are often surprisingly good and have clear benefits w.r.t. cost and privacy.
The industry has begun standardizing on some common formats for shipping and sharing these models - e.g. GGUF from llama.cpp, safetensors from Hugging Face or the older ONNX. Additionally, there are a number of excellent OSS tools that let us run such models locally and expose a REST API for an experience that's very similar to the OpenAI or Gemini APIs, including dedicated client libraries.
The best known such tool is probably Ollama; I've written extensively about it in the past: #1, #2, #3.
Ollama lets us customize an LLM through a Modelfile, which includes things like setting model parameters, system prompts etc. If we fine-tuned a model [2], it can also be loaded into Ollama by specifying our own GGUF file.
If you're running in a cloud environment, some vendors already have off-the-shelf solutions like GCP's Cloud Run integration that may be useful.
Ollama isn't the only player in this game, either; recently a new tool emerged with a slightly different approach. Llamafile distributes the entire model as a single binary, which is portable across several OSes and CPU architectures. Like Ollama, it provides REST APIs for the model.
If such a customized LLM is a suitable solution for your project, consider just running Ollama or Llamafile and using their REST APIs to communicate with the model. If you need higher degrees of customization, read on.
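As a rough sketch of what "using their REST APIs" can look like in practice (shown in Python here, though any language with an HTTP client works the same way), the snippet below assumes an Ollama server listening on its default port 11434 and a model named gemma2 that has already been pulled - both are assumptions about your local setup.

# Rough sketch: one-shot generation request against a local Ollama server.
# Assumes Ollama listens on its default port 11434 and that a model named
# "gemma2" has already been pulled - adjust both to match your setup.
import json
import urllib.request

payload = {
    "model": "gemma2",
    "prompt": "Explain the sidecar pattern in one sentence.",
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])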
A note about the sidecar pattern
Before we proceed, I want to briefly discuss the sidecar pattern of application deployment. That k8s link talks about containers, but the pattern isn't limited to these. It applies to any software architecture in which functionality is isolated across processes.
Suppose we have an application that requires some library functionality; using Go as an example, we could find an appropriate package, import it and be on our way. Suppose there's no suitable Go package, however. If libraries exist with a C interface, we could alternatively use cgo to import it.
But say there's no C API either, for example if the functionality is only provided by a language without a convenient exported interface. Maybe it's in Lisp, or Perl, or... Python.
A very general solution could be to wrap the code we need in some kind of server interface and run it as a separate process; this kind of process is called a sidecar - it's launched specifically to provide additional functionality for another process. Whichever inter-process communication (IPC) mechanism we use, the benefits of this approach are many - isolation, security, language independence, etc. In today's world of containers and orchestration this approach is becoming increasingly more common; this is why many of the links about sidecars lead to k8s and other containerized solutions.
The Ollama approach outlined in the previous section is one example of using the sidecar pattern. Ollama provides us with LLM functionality but it runs as a server in its own process.
The solutions presented in the rest of this post are more explicit and fully worked-out examples of using the sidecar pattern.
Locally-running LLM with Python and JAX
Suppose none of the existing open LLMs will do for our project, even fine-tuned. At this point we can consider training our own LLM - this is hugely expensive, but perhaps there's no choice. Training usually involves one of the large ML frameworks like TensorFlow, JAX or PyTorch. In this section I'm not going to talk about how to train models; instead, I'll show how to run local inference of an already trained model - in Python with JAX, and use that as a sidecar server for a Go application.
The sample (full code is here) is based on the official Gemma repository, using its sampler library [3]. It comes with a README that explains how to set everything up. This is the relevant code instantiating a Gemma sampler:
# Once initialized, this will hold a sampler_lib.Sampler instance that
# can be used to generate text.
gemma_sampler = None

def initialize_gemma():
    """Initialize Gemma sampler, loading the model into the GPU."""
    model_checkpoint = os.getenv("MODEL_CHECKPOINT")
    model_tokenizer = os.getenv("MODEL_TOKENIZER")
    parameters = params_lib.load_and_format_params(model_checkpoint)
    print("Parameters loaded")

    vocab = spm.SentencePieceProcessor()
    vocab.Load(model_tokenizer)

    transformer_config = transformer_lib.TransformerConfig.from_params(
        parameters,
        cache_size=1024,
    )
    transformer = transformer_lib.Transformer(transformer_config)

    global gemma_sampler
    gemma_sampler = sampler_lib.Sampler(
        transformer=transformer,
        vocab=vocab,
        params=parameters["transformer"],
    )
    print("Sampler ready")

The model weights and tokenizer vocabulary are files downloaded from Kaggle, per the instructions in the Gemma repository README.
So we have LLM inference up and running in Python; how do we use it from Go?
Using a sidecar, of course. Let's whip up a quick web server around this model and expose a trivial REST interface on a local port that Go (or any other tool) can talk to. As an example, I've set up a Flask-based web server around this inference code. The web server is invoked with gunicorn - see the shell script for details.
Excluding the imports, here's the entire application code:
def create_app():
    # Create an app and perform one-time initialization of Gemma.
    app = Flask(__name__)
    with app.app_context():
        initialize_gemma()
    return app

app = create_app()

# Route for simple echoing / smoke test.
@app.route("/echo", methods=["POST"])
def echo():
    prompt = request.json["prompt"]
    return {"echo_prompt": prompt}

# The real route for generating text.
@app.route("/prompt", methods=["POST"])
def prompt():
    prompt = request.json["prompt"]

    # For total_generation_steps, 128 is a default taken from the Gemma
    # sample. It's a tradeoff between speed and quality (higher values mean
    # better quality but slower generation).
    # The user can override this value by passing a "sampling_steps" key in
    # the request JSON.
    sampling_steps = request.json.get("sampling_steps", 128)

    sampled_str = gemma_sampler(
        input_strings=[prompt],
        total_generation_steps=int(sampling_steps),
    ).text
    return {"response": sampled_str}

The server exposes two routes:
- prompt: a client sends in a textual prompt, the server runs Gemma inference and returns the generated text in a JSON response
- echo: used for testing and benchmarking
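To show the client side, here is a hedged Go sketch of calling the /prompt route. The port (8000) is an assumption, so adjust it to whatever the gunicorn shell script in the sample actually binds to:

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "log"
    "net/http"
)

func main() {
    // The Flask server above expects a JSON body with a "prompt" key and an
    // optional "sampling_steps" key.
    reqBody, _ := json.Marshal(map[string]any{
        "prompt":         "Are bees and wasps friends?",
        "sampling_steps": 128,
    })

    resp, err := http.Post("http://localhost:8000/prompt",
        "application/json", bytes.NewReader(reqBody))
    if err != nil {
        log.Fatal(err)
    }
    defer resp.Body.Close()

    // The "response" field holds the sampler output (a list with one entry
    // per input prompt, per the server code above).
    var out map[string]any
    if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
        log.Fatal(err)
    }
    fmt.Println(out["response"])
}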
Here's how it all looks tied together:
The important takeaway is that this is just an example. Literally any part of this setup can be changed: one could use a different ML library (maybe PyTorch instead of JAX); one could use a different model (not Gemma, not even an LLM) and one can use a different setup to build a web server around it. There are many options, and each developer will choose what fits their project best.
It's also worth noting that we've written less than 100 lines of Python code in total - much of it piecing together snippets from tutorials. This tiny amount of Python code is sufficient to wrap an HTTP server with a simple REST interface around an LLM running locally through JAX on the GPU. From here on, we're safely back in our application's actual business logic and Go.
Now, a word about performance. One of the concerns developers may have with sidecar-based solutions is the performance overhead of IPC between Python and Go. I've added a simple echo endpoint to measure this effect; take a look at the Go client that exercises it; on my machine the latency of sending a JSON request from Go to the Python server and getting back the echo response is about 0.35 ms on average. Compared to the time it takes Gemma to process a prompt and return a response (typically measured in seconds, or maybe hundreds of milliseconds on very powerful GPUs), this is entirely negligible.
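For reference, such a measurement can be as simple as the following hedged Go sketch (the actual client in the sample may differ, and the port is again an assumption):

package main

import (
    "bytes"
    "fmt"
    "io"
    "log"
    "net/http"
    "time"
)

func main() {
    payload := []byte(`{"prompt": "benchmark"}`)
    const numIters = 1000

    start := time.Now()
    for i := 0; i < numIters; i++ {
        resp, err := http.Post("http://localhost:8000/echo",
            "application/json", bytes.NewReader(payload))
        if err != nil {
            log.Fatal(err)
        }
        // Drain and close the body so the HTTP connection can be reused.
        io.Copy(io.Discard, resp.Body)
        resp.Body.Close()
    }
    fmt.Printf("average round trip: %v\n", time.Since(start)/numIters)
}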
That said, not every custom model you may need to run is a full-fledged LLM. What if your model is small and fast, and the overhead of 0.35 ms becomes significant? Worry not, it can be optimized. This is the topic of the next section.
Locally-running fast image model with Python and TensorFlow
The final sample of this post mixes things up a bit:
- We'll be using a simple image model (instead of an LLM)
- We're going to train it ourselves using TensorFlow+Keras (instead of JAX)
- We'll use a different IPC method between the Python sidecar server and clients (instead of HTTP+REST)
The model is still implemented in Python, and it's still driven as a sidecar server process by a Go client [4]. The idea here is to show the versatility of the sidecar approach, and to demonstrate a lower-latency way to communicate between the processes.
The full code of the sample is here. It trains a simple CNN (convolutional neural network) to classify images from the CIFAR-10 dataset:
The neural net setup with TensorFlow and Keras was taken from an official tutorial. Here's the entire network definition:
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation="relu", input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation="relu"))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation="relu"))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation="relu"))
model.add(layers.Dense(10))

CIFAR-10 images are 32x32 pixels, each pixel being 3 values for red, green and blue. In the original dataset, these values are bytes in the inclusive range 0-255 representing color intensity. This should explain the (32, 32, 3) shape appearing in the code. The full code for training the model is in the train.py file in the sample; it runs for a bit and saves the serialized model along with the trained weights into a local file.
The next component is an "image server": it loads the trained model+weights file from disk and runs inference on images passed into it, returning the label the model thinks is most likely for each.
The server doesn't use HTTP and REST, however. It creates a Unix domain socket and uses a simple length-prefix encoding protocol to communicate:
Each packet starts with a 4-byte field that specifies the length of the rest of the contents. The type is a single byte, and the body can be anything [5]. In the sample image server, two commands are currently supported (a sketch of the framing helpers in Go follows the list):
- 0 means "echo" - the server will respond with the same packet back to the client. The contents of the packet body are immaterial.
- 1 means "classify" - the packet body is interpreted as a 32x32 RGB image, encoded as the red channel for each pixel in the first 1024 bytes (32x32, row major), then green in the next 1024 bytes and finally blue in the last 1024 bytes. Here the server will run the image through the model, and reply with the label the model thinks describes the image.
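To make the framing concrete, here is a hedged Go sketch of what sendPacket and readPacket (used by the benchmark below) could look like. The byte order of the length prefix (little-endian here) and the exact helper signatures are assumptions; the sample's actual code may differ.

// Needs encoding/binary, io, log and net from the standard library.

const messageTypeEcho = 0 // command 0 ("echo") from the list above

// sendPacket writes one packet: a 4-byte length prefix that covers the type
// byte plus the body, then the type byte, then the body itself.
func sendPacket(c net.Conn, msgType byte, body []byte) {
    buf := make([]byte, 4+1+len(body))
    binary.LittleEndian.PutUint32(buf[:4], uint32(1+len(body)))
    buf[4] = msgType
    copy(buf[5:], body)
    if _, err := c.Write(buf); err != nil {
        log.Fatal(err)
    }
}

// readPacket reads one packet and returns its type byte and body.
func readPacket(c net.Conn) (byte, []byte) {
    var lenBuf [4]byte
    if _, err := io.ReadFull(c, lenBuf[:]); err != nil {
        log.Fatal(err)
    }
    payload := make([]byte, binary.LittleEndian.Uint32(lenBuf[:]))
    if _, err := io.ReadFull(c, payload); err != nil {
        log.Fatal(err)
    }
    return payload[0], payload[1:]
}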
The sample also includes a simple Go client that can take a PNG file from disk, encode it in the required format and send it over the domain socket to the server, recording the response.
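For illustration, here is one way the planar encoding could be done in Go, assuming the PNG has already been decoded (for example with image/png) into a 32x32 image whose bounds start at (0, 0). The helper name encodeImage is hypothetical and the sample's client may differ.

// Needs the standard library image package.
// encodeImage flattens a 32x32 image into the 3072-byte planar layout the
// server expects: 1024 red bytes, then 1024 green, then 1024 blue, row major.
func encodeImage(img image.Image) []byte {
    buf := make([]byte, 3*1024)
    for y := 0; y < 32; y++ {
        for x := 0; x < 32; x++ {
            r, g, b, _ := img.At(x, y).RGBA() // 16-bit channels
            i := y*32 + x
            buf[i] = byte(r >> 8)      // red plane
            buf[1024+i] = byte(g >> 8) // green plane
            buf[2048+i] = byte(b >> 8) // blue plane
        }
    }
    return buf
}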
The client can also be used to benchmark the latency of a roundtrip message exchange. It's easier to just show the code instead of explaining what it does:
func runBenchmark(c net.Conn, numIters int) {
    // Create a []byte with 3072 bytes.
    body := make([]byte, 3072)
    for i := range body {
        body[i] = byte(i % 256)
    }

    t1 := time.Now()
    for range numIters {
        sendPacket(c, messageTypeEcho, body)
        cmd, resp := readPacket(c)
        if cmd != 0 || len(resp) != len(body) {
            log.Fatal("bad response")
        }
    }
    elapsed := time.Since(t1)

    fmt.Printf("Num packets: %d, Elapsed time: %s\n", numIters, elapsed)
    fmt.Printf("Average time per request: %d ns\n", elapsed.Nanoseconds()/int64(numIters))
}

In my testing, the average latency of a roundtrip is about 10 μs (that's microseconds). Considering the size of the message and it being Python on the other end, this is roughly in-line with my earlier benchmarking of Unix domain socket latency in Go.
How long does a single image inference take with this model? In my measurements, about 3 ms. Recall that the communication latency for the HTTP+REST approach was 0.35 ms; while this is only 12% of the image inference time, it's close enough to be potentially worrying. On a beefy server-class GPU the time can be much shorter [6].
With the custom protocol over domain sockets, the latency - being 10 μs - seems quite negligible no matter what you end up running on your GPU.
Code
The full code for the samples in this post is on GitHub.
[1] To be pedantic, these models are not entirely open: their inference architecture is open-source and their weights are available, but the details of their training remain proprietary.
[2] The details of fine-tuning models are beyond the scope of this post, but there are plenty of resources about this online.
[3] "Sampling" in LLMs means roughly "inference". A trained model is fed an input prompt and then "sampled" to produce its output.
[4] In my samples, the Python server and Go client simply run in different terminals and talk to each other. How service management is structured is very project-specific. We could envision an approach wherein the Go application launches the Python server to run in the background and communicates with it. Increasingly likely these days, however, would be a container-based setup, where each program is its own container and an orchestration solution launches and manages these containers.
[5] You may be wondering why I'm implementing a custom protocol here instead of using something established. In real life, I'd definitely recommend using something like gRPC. However, for the sake of this sample I wanted something that would be (1) simple without additional libraries and (2) very fast. FWIW, I don't think the latency numbers would be very much different for gRPC. Check out my earlier post about RPC over Unix domain sockets in Go.
[6] On the other hand, the model I'm running here is really small. It's fair to say realistic models you'll use in your application will be much larger and hence slower.
Talking Drupal: Talking Drupal #475 - Workspaces
Today we are talking about Workspaces - what they are and how they work - with guest Scott Weston. We’ll also cover Workspaces Extra as our module of the week.
For show notes visit: https://www.talkingDrupal.com/475
Topics
- What are Workspaces in Drupal
- What are common use cases for Workspaces
- Are Workspaces stable
- Do Workspaces help with content versioning
- What does the module ecosystem look like for Workspaces
- Inspiration
- Workspaces best practices
- Any interesting ways it is being used
- Is there a way to access workspace content in twig
- Navigation integration
- Workspaces and workflows
- What aspects of a Workspace are limited to live
- If someone wants to get involved or get started
- Drupal Workspaces
- Core issue: Media library form can only be submitted in the default workspace
- Integrate Navigation with Workspaces
Scott Weston - scott-weston
Hosts
- Nic Laflin - nLighteneddevelopment.com nicxvan
- John Picozzi - epam.com johnpicozzi
- Joshua "Josh" Mitchell - joshuami.com joshuami
MOTW Correspondent
Martin Anderson-Clutz - mandclu.com mandclu
- Brief description:
- Do you want to extend the capabilities of the Workspaces system in Drupal core? There’s a module for that.
- Module name/project name: Workspaces Extra (WSE)
- Brief history
- How old: created in Apr 2021 by Andrei Mateescu (amateescu) of tag1, who has also contributed to Workspaces in core, among many other things
- Versions available: 2.0.0-alpha3, which works with Drupal 10.3 or 11
- Maintainership
- Actively maintained, latest release is less than a week old
- Security coverage: technically yes, but not really until it has a stable release
- Test coverage
- Number of open issues: 20 open issues, 3 of which are bugs against the current branch, though one has already been fixed
- Usage stats:
- 89 sites
- Module features and usage
- One of the big features in Drupal 10.3 was that Workspaces is now officially stable. That said, not everything works the way some site builders will want it to. That’s where a contrib solution like Workspaces Extra can help to fill in the gaps
- It provides new options like letting you roll back changes from a published workspace, move content between workspaces, discard changes in a workspace, squash content revisions when a workspace is published, and more
- Workspaces Extra, or WSE, also includes a number of submodules to add even more capabilities. For example, they can allow your workspace to stage an allowlist of configuration changes, deploy workspace content using an import/export system, stage menu changes, and more. For workflow, there’s an option to generate a shareable workspace preview link for external users, and a scheduler to publish your workspace at a specific day and time
- I will add that the first time I played with workspaces I ran into an issue where I couldn’t create media entities within a workspace. I don’t know for sure that this hasn’t been fixed in core, but the core issue about it is still listed as “Needs work”. That said, the last comment on that issue (link in the show notes) lists WSE as something that helps, so if you encounter the same issue with Workspaces, WSE is worth a try
Of Color and Software
It’s been a minute!
We have been hard at work making sure our design system keeps moving forward. For the past weeks, we have made significant progress in the space of color creation and icons.
There is also an easter egg in the form of PenPot.
As previously mentioned, we restructured our color palettes to have set color variations at various levels. We will combine those colors into tokens that will be named something like this:
pd.sys.color.red50
Meaning:
- PD: Plasma Design
- SYS: System token (We also have reference tokens and component tokens, .ref. and .com. respectively)
- Color: Token type
- red50: color name + color value
Note that, as we follow Material Design guidelines for these colors, we have a collection of 100 different shades for a given color. Depending on the needs of the system or changes in design, we could decide not to use red50 because we would like more intensity. In that case we would choose red49, or red48, and so on.
The color variable name would change accordingly. This setup would allow designers and developers to understand the kind of token they are working with, and it would give developers and designers a shared language.
In Figma and PenPot, designers have the ability to name tokens however they like. I opted for keeping token names as we are recommending them for the Plasma system. That way there is good consistency.
This week, we consolidated these colors and added them to the list of tokens in Figma and PenPot. However, there is still more to be done in the form of documentation for our Plasma developer team. We are still working through it, making sure our development requests are accurate.
Additionally, this week we had the pleasure of meeting with Pablo Ruiz, CEO at PenPot. Mike, one of our team members, met Pablo recently and spoke about our Plasma Next project. This led to a meeting to discuss the needs that our team currently has for developing a design system.
The team at PenPot is excited to partner with our KDE team and the Plasma Next initiative. They have generously offered a few resources to help.
This couldn’t come at a better time as very recently we have been hitting gaps in our team knowledge when it comes to developing design systems. This process is a first for our desktop system and we want to get it right. With the help of the PenPot team and the changes they are making to the application, this should be easier.
As such, we also decided to request prioritization for some of our tickets that would allow us to set up and migrate our Figma assets into PenPot and eventually, share these with the community at large.
Today, we are not close to releasing a full design system for others to use, but we are making good progress. Stay tuned!
We also moved into the process of editing 16px icons. Given that we already have new icons in the 24px collection that we can leverage, we cut the design time in half or more. We don’t have to brainstorm new icons; we mostly just have to adapt each 24px icon to a 16px version. This work has only just started, but we are making good progress.
One area that is still up in the air is our colorful icons. Given we edited the monochrome icons, this calls for editing colorful icons as well. We have received many suggestions on what kind of colorful style we should follow. I would like to extend that invitation.
If you have seen or created amazing colorful icons and would like to suggest that style for us at Plasma, send us a comment!
That’s it for this week. Good progress so far!
Dirk Eddelbuettel: RcppSpdlog 0.0.19 on CRAN: New Upstream, New Features
Version 0.0.19 of RcppSpdlog arrived on CRAN early this morning and has been uploaded to Debian. RcppSpdlog bundles spdlog, a wonderful header-only C++ logging library with all the bells and whistles you would want, written by Gabi Melman, and also includes fmt by Victor Zverovich. You can learn more at the nice package documentation site.
This release updates the code to version 1.15.0 of spdlog, which was released on Saturday and contains fmt 11.0.2. It also contains a contributed PR that allows use of std::format under C++20, bypassing fmt (with some post-merge polish too), and another PR correcting a documentation double entry.
The NEWS entry for this release follows.
Changes in RcppSpdlog version 0.0.19 (2024-11-10)
- Support use of std::format under C++20 via opt-in define instead of fmt (Xanthos Xanthopoulos in #19)
- An erroneous duplicate log-level documentation entry was removed (Contantinos Giachalis in #20)
- Upgraded to upstream release spdlog 1.15.0 (Dirk in #21)
- Partially revert / simplify src/formatter.cpp accommodating both #19 and previous state (Dirk in #21)
Courtesy of my CRANberries, there is also a diffstat report. More detailed information is on the RcppSpdlog page, or the package documentation site. If you like this or other open-source work I do, you can sponsor me at GitHub.
This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.
Gunnar Wolf: Why academics under-share research data - A social relational theory
As an academic, I have cheered for and welcomed the open access (OA) mandates that, slowly but steadily, have been accepted in one way or another throughout academia. It is now often accepted that public funding means public research. Many of our universities or funding bodies will demand that, with varying intensities–sometimes they demand research to be published in an OA venue, sometimes a mandate will only “prefer” it. Lately, some journals and funder bodies have expanded this mandate toward open science, requiring not only research outputs (that is, articles and books) to be published openly but also the data backing the results to be made public. As a person who has been involved with free software promotion since the mid 1990s, it was natural for me to join the OA movement and to celebrate when various universities adopt such mandates.
Now, what happens after a university or funder body adopts such a mandate? Many individual academics cheer, as it is the “right thing to do.” However, the authors observe that this is not really followed thoroughly by academics. What can be observed, rather, is the slow pace or “foot-dragging” of academics when they are compelled to comply with OA mandates, or even an outright refusal to do so. If OA and open science are close to the ethos of academia, why aren’t more academics enthusiastically sharing the data used for their research? This paper finds a subversive practice embodied in the refusal to comply with such mandates, and explores a hypothesis based on Karl Marx’s productive worker theory and Pierre Bourdieu’s ideas of symbolic capital.
The paper explains that academics, as productive workers, become targets for exploitation: given that it’s not only the academics’ sharing ethos, but private industry’s push for data collection and industry-aligned research, they adapt to technological changes and jump through all kinds of hurdles to create more products, in a result that can be understood as a neoliberal productivity measurement strategy. Neoliberalism assumes that mechanisms that produce more profit for academic institutions will result in better research; it also leads to the disempowerment of academics as a class, although they are rewarded as individuals due to the specific value they produce.
The authors continue by explaining how open science mandates seem to ignore the historical ways of collaboration in different scientific fields, and exploring different angles of how and why data can be seen as “under-shared,” failing to comply with different aspects of said mandates. This paper, built on the social sciences tradition, is clearly a controversial work that can spark interesting discussions. While it does not specifically touch on computing, it is relevant to Computing Reviews readers due to the relatively high percentage of academics among us.
Real Python: Python News Roundup: November 2024
The latest Python developments all point to the same thing—Python is currently thriving. The recent GitHub Octoverse 2024 report has revealed that Python is now the most used language on GitHub. Also, last month saw the release of Python 3.13, which is already laying the groundwork for some exciting future improvements.
While Python core developers have been busy exploring the language’s features as they tinker with upcoming enhancements, it’s good to know that working on Python’s source code isn’t the only way you can contribute to Python’s future. Another way to shape the focus of upcoming releases is to join the Python Developers Survey 2024.
And with the end of the year in sight, you may want to venture a look at next year’s calendar and mark some dates, such as the PyCon US conference in May or the Python 3.14 release in October 2025.
Now that you know the highlights, it’s time to dive into the most important Python news for November.
Python’s Popularity Shines in GitHub’s Octoverse 2024
The latest Octoverse report for 2024 shows that Python remains one of the most widely used languages on GitHub, securing its place as a core language in open-source and professional development. Python ranked among the top three most-used languages, demonstrating its continued appeal across industries and experience levels.
As GitHub’s annual report illustrates, Python’s popularity is fueled by its solid role in developing machine learning and artificial intelligence frameworks.
Another takeaway from the Octoverse survey is Python’s strong community engagement. Python developers are not only active in contributing code but also in participating in discussions, filing issues, and reviewing pull requests.
Read the full article at https://realpython.com/python-news-november-2024/