Feeds
PyPy: PyPy v7.3.17 release
The PyPy team is proud to release version 7.3.17 of PyPy.
This release includes a new RISC-V JIT backend, an improved REPL based on work by the CPython team, and better JIT optimizations of integer operations. Special shout-outs to Logan Chien for the RISC-V backend work, to Nico Rittinghaus for better integer optimization in the JIT, and to the CPython team for their work on the REPL.
The release includes two different interpreters:
PyPy2.7, which is an interpreter supporting the syntax and the features of Python 2.7 including the stdlib for CPython 2.7.18+ (the + is for backported security updates)
PyPy3.10, which is an interpreter supporting the syntax and the features of Python 3.10, including the stdlib for CPython 3.10.14.
The interpreters are based on much the same codebase, thus the dual release. This is a micro release; all APIs are compatible with the other 7.3 releases. It follows the 7.3.16 release of April 23, 2024.
We recommend updating. You can find links to download the releases here:
https://pypy.org/download.html
We would like to thank our donors for the continued support of the PyPy project. If PyPy is not quite good enough for your needs, we are available for direct consulting work. If PyPy is helping you out, we would love to hear about it and encourage submissions to our blog via a pull request to https://github.com/pypy/pypy.org
We would also like to thank our contributors and encourage new people to join the project. PyPy has many layers and we need help with all of them: bug fixes, PyPy and RPython documentation improvements, or general help with making RPython's JIT even better.
If you are a Python library maintainer and use C-extensions, please consider making an HPy / CFFI / cppyy version of your library that would be performant on PyPy. In any case, both cibuildwheel and the multibuild system support building wheels for PyPy.
RISC-V backend for the JIT
PyPy's JIT has added support for generating 64-bit RISC-V machine code at runtime (RV64-IMAD, specifically). So far we are not releasing binaries for any RISC-V platforms, but there are instructions on how to cross-compile binaries.
REPL Improvements
The biggest user-visible change of the release is the set of new features in the REPL of PyPy3.10. CPython 3.13 has adopted and extended PyPy's pure-Python REPL, adding a number of features and fixing a number of bugs in the process. We have backported and added the following features:
Prompts and tracebacks use terminal colors, as well as terminal hyperlinks for file names.
Bracketed paste enables pasting several lines of input into the terminal without auto-indentation getting in the way.
A special interactive help browser (F1), history browser (F2), explicit paste mode (F3).
Support for Ctrl-<left/right> to jump over whole words at a time.
See the CPython documentation for further details. Thanks to Łukasz Langa, Pablo Galindo Salgado and the other CPython devs involved in this work.
Better JIT optimizations of integer operations
The optimizers of PyPy's JIT have become much better at reasoning about and optimizing integer operations. This is done with a new "knownbits" abstract domain. In many programs that do bit-manipulation of integers, some of the bits of the integer variables of the program can be statically known. Here's a simple example:
    x = a | 1
    ...
    if x & 1:
        ...
    else:
        ...
With the new abstract domain, the JIT can optimize the if-condition to True, because it already knows that the lowest bit of x must be set. This optimization applies to all Python-integers that fit into a machine word (PyPy optimistically picks between two different representations for int, depending on the size of the value). Unfortunately there is very little impact of this change on almost all Python code, because intensive bit-manipulation is rare in Python. However, the change leads to significant performance improvements in Pydrofoil (the RPython-based RISC-V/ARM emulators that are automatically generated from high-level Sail specifications of the respective ISAs, and that use the RPython JIT to improve performance).
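To make the knownbits idea above more concrete, here is a minimal sketch of such an abstract domain in plain Python. It is not PyPy's implementation: the class name, its methods, and the 64-bit word size are assumptions chosen for illustration. Each abstract value records which bits are known to be 1 and which bits are unknown; every other bit is known to be 0.

```python
from dataclasses import dataclass

WORD_MASK = (1 << 64) - 1  # assumption: 64-bit machine words


@dataclass(frozen=True)
class KnownBits:
    """Hypothetical abstract value: not PyPy's actual data structure."""
    ones: int      # bits known to be 1
    unknowns: int  # bits whose value is not known

    @staticmethod
    def constant(value):
        # Every bit of a constant is known.
        return KnownBits(value & WORD_MASK, 0)

    @staticmethod
    def unknown():
        # Nothing is known about the value.
        return KnownBits(0, WORD_MASK)

    @property
    def zeros(self):
        # Bits that are neither known-one nor unknown are known-zero.
        return ~(self.ones | self.unknowns) & WORD_MASK

    def __or__(self, other):
        # A result bit is 1 if it is known-one on either side.
        ones = self.ones | other.ones
        unknowns = (self.unknowns | other.unknowns) & ~ones & WORD_MASK
        return KnownBits(ones, unknowns)

    def __and__(self, other):
        # A result bit is 0 if it is known-zero on either side.
        ones = self.ones & other.ones
        zeros = self.zeros | other.zeros
        return KnownBits(ones, ~(ones | zeros) & WORD_MASK)

    def truth_value(self):
        """True/False if the truthiness is statically known, else None."""
        if self.ones:
            return True
        if self.unknowns == 0:
            return False
        return None


a = KnownBits.unknown()            # nothing known about a
x = a | KnownBits.constant(1)      # x = a | 1
condition = x & KnownBits.constant(1)
print(condition.truth_value())     # the if-condition is statically known
```

With this representation, the condition `x & 1` has its lowest bit known to be 1 and all other bits known to be 0, so a hypothetical optimizer could drop the else branch without knowing anything else about `a`.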
PyPy versions and speed.pypy.org
The keen-eyed will have noticed no mention of Python version 3.9 in the releases above. Typically we will maintain only one version of Python 3, but due to PyPy3.9 support on conda-forge we maintained multiple versions from the first release of PyPy3.10 in PyPy v7.3.12 (Dec 2022). Conda-forge is sunsetting its PyPy support, which means we can drop PyPy3.9. Since that was the major driver of benchmarks at https://speed.pypy.org, we revamped the site to showcase PyPy3.9, PyPy3.10, and various versions of CPython on the home page. For historical reasons, the "baseline" for comparison is still CPython 3.7.19.
We will keep the buildbots building PyPy3.9 until the end of August. These builds will still be available on the nightly builds tab of the buildbot.
What is PyPy?
PyPy is a Python interpreter, a drop-in replacement for CPython. It's fast (PyPy and CPython performance comparison) due to its integrated tracing JIT compiler.
We also welcome developers of other dynamic languages to see what RPython can do for them.
We provide binary builds for:
x86 machines on most common operating systems (Linux 32/64 bits, Mac OS 64 bits, Windows 64 bits)
64-bit ARM machines running Linux (aarch64) and macOS (macos_arm64).
PyPy supports Windows 32-bit, Linux PPC64 big- and little-endian, Linux ARM 32 bit, RISC-V RV64IMAFD Linux, and s390x Linux but does not release binaries. Please reach out to us if you wish to sponsor binary releases for those platforms. Downstream packagers provide binary builds for Debian, Fedora, conda, OpenBSD, FreeBSD, Gentoo, and more.
What else is new?
For more information about the 7.3.17 release, see the full changelog.
Please update, and continue to help us make PyPy better.
Cheers, The PyPy Team
Qt for MCUs 2.5.4 LTS Released
Qt for MCUs 2.5.4 LTS (Long-Term Support) has been released and is available for download. This patch release provides bug fixes and other improvements while maintaining source compatibility with Qt for MCUs 2.5. It does not add any new functionality.
Debian Brasil: Debian Day 2024 in Belém and Poços de Caldas - Brazil
by Paulo Henrique de Lima Santana (phls)
Below we list the links to the reports and news about Debian Day 2024 held in Belém and Poços de Caldas:
Smartbees: How to Create a Multilingual Drupal Site?
Setting up a multilingual website in Drupal opens the door to a global online marketplace, allowing businesses to reach different cultural audiences by presenting content in multiple languages. In this article, we will discuss the process of configuring a multilingual site on Drupal.
Python Software Foundation: Ask questions or tell us what you think: Introducing monthly PSF Board Office Hours!
Greetings, Pythonistas! Thank you so much for supporting the work of the Python Software Foundation (PSF) and the Python community! The current PSF Board has decided to invest more in connecting and serving the global Python community by establishing a forum to have regular conversations. The board members of the PSF with the support of PSF staff are excited to introduce monthly PSF Board Office Hours on the PSF Discord. The Office Hours will be sessions where you can share with us how we can help your community, express your perspectives, and provide feedback for the PSF.
Similar to the PSF Grants Program Office Hours where PSF staff members help to answer questions regarding the PSF Grants Program, during the PSF Board Office Hours you can participate in a text-based live chat with PSF Board Directors. This is a chance to connect, share, and collaborate with the PSF Board and staff to improve our community together. Occasionally, we will have dedicated topics such as PyCon US and the PSF Board Elections for the office hour sessions.
Here is some of the work that we collaborate with staff and volunteers on:
- Promotion and outreach for the Python programming language
- Supporting local Python communities
- Organizing PyCon US
- Diversity and Inclusion in our community
- Support handling of Code of Conduct within our communities
- Support regional Python communities via the PSF Grants Program
- Furthering the mission of the PSF
Unless we have a dedicated topic for a session, you are not limited to talking with us about the above topics, although the discussions should be focused on Python, the PSF, and our community. If you think there’s something we can help with or we should know, we welcome you to come and talk to us!
The office hour sessions will take place on the PSF Discord server in the #psf-board channel. If you are new to Discord, make sure to check out a tutorial on how you can download the Discord app and sign up for free, then join us on the PSF Discord! To make the office hours more accessible, they will be scheduled at alternating times, so no matter where you are based, you can find a time that is convenient for you! Here is a list of the dates and times:
- September 10th, 2024: 1pm UTC
- October 8th, 2024: 9pm UTC
- November 12th, 2024: 2pm UTC
- December 10th, 2024: 9pm UTC
- January 14th, 2025: 2pm UTC
- February 11th, 2025: 9pm UTC
- March 11th, 2025: 1pm UTC
- April 8th, 2025: 9pm UTC
- May 13th, 2025: 1pm UTC (Live from PyCon US!)
- June 10th, 2025: 9pm UTC
- July 9th, 2025: 1pm UTC
- August 12th, 2025: 9pm UTC
Each session lasts for an hour. Make sure to check what time these sessions are for you locally so you don't miss out! Sessions after August 13th, 2025, will be announced in the future.
Some of the board members of the PSF will be attending each office hour, as well as members of the PSF Staff. The list of the PSF Board Directors can be found on our website. We are passionate Python community members who are happy to listen, help, and provide support to you. We are happy to follow up with you if there are any issues we cannot address immediately during the office hour sessions. As always, you can email us at psf-board@python.org with inquiries, feedback, or comments at any time.
Plasma Crash Course - coredumpd
A while ago a colleague of mine asked about our crash infrastructure in Plasma and whether I could give some overview on it. This seems very useful to others as well, I thought. Here I am, telling you all about it!
Our crash infrastructure is comprised of a number of different components.
- KCrash: a KDE Framework performing crash interception and preparation for handover to…
- coredumpd: a systemd component performing process core collection and handover to…
- DrKonqi: a GUI for crashes sending data to…
- Sentry: a web service and UI for tracing and presenting crashes for developers
We’ve looked at KCrash previously. This time we look at coredumpd.
Coredumpd
coredumpd collects all crashes happening on the system, through the core_pattern system. It is shipped as part of systemd and as such mostly available out of the box.
It is fairly sophisticated and can manage the backlog of crashes, so old crashes get cleaned out from time to time. It also tightly integrates with journald giving us a well-defined interface to access crash metadata.
But before we dive into the inner workings of coredumpd, let’s talk about cores.
What are cores?
A core, or more precisely: a core dump file, is a copy of the memory image of a process and its process status (registers, mappings, etc.) in a file. Simply put, it’s like we took a copy of the running process from RAM and stored it in a file. The purpose of such a core is that it allows us to look at a snapshot of the process at that point in time without having the process still running. Using this data, we can perform analysis of the process to figure out what exactly went wrong and how we ended up in that situation.
The advantage is that since the process doesn’t need to be running anymore, we can investigate crashes even hours or days after they happened. That is of particular use when things crash while we are not able to deal with them immediately. For example if Plasma were to crash on logout there’d be no way to deal with it besides stopping the logout, which may not even be possible anymore. Instead we let the crash drop into coredumpd, let it collect a core file, and on next login we can tell the user about the crash.
With that out of the way, it’s time to dump a core!
Core Dumps
We already talked about KCrash and how it intercepts crashes to write some metadata to disk. Once it is done it calls raise() to generate one of those core dumps we just discussed. This actually very briefly turns over control to the kernel which will more or less simply invoke the defined core_pattern process. In our case, coredumpd.
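The raise() step described above can be observed from the outside with a small Python sketch. It assumes a POSIX system and is not KCrash code: the helper names are hypothetical. A child process raises SIGSEGV against itself, and the parent sees that the child was terminated by that signal. Whether a core file is actually produced depends on the system's core_pattern and resource limits, which this sketch does not inspect.

```python
import signal
import subprocess
import sys


def crash_child(signum=signal.SIGSEGV):
    """Run a child Python process that raises `signum` against itself.

    Returns the child's return code; on POSIX a negative value -N
    means the child was terminated by signal N.
    """
    code = f"import signal; signal.raise_signal({int(signum)})"
    completed = subprocess.run([sys.executable, "-c", code])
    return completed.returncode


def describe(returncode):
    """Turn a subprocess return code into a short human-readable string."""
    if returncode < 0:
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    print(describe(crash_child()))
```

On a system where coredumpd is registered as the core_pattern handler, a crash like this one is what ends up being handed to it by the kernel.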
coredumpd will immediately systemd-socket-activate itself and forward the data received from the kernel. In other words: it will start an instance of systemd-coredump@.service and the actual processing will happen in there. The advantage of this is that regular systemd security configuration can be applied as well as cgroup resource control and all that jazz — the core dumping happens in a regular systemd service.
The primary task here is to actually write the dump to a file. In addition, coredumpd will also collect lots of additional metadata besides what is in the core already. Most notably various bits and pieces of /proc information such as cgroup information, mount information, the auxiliary vector (auxv), etc.
Once all the data is collected a journald entry is written and the systemd-coredump@.service instance quits again.
The journal entry will contain the metadata as entry fields as well as the path of the core dump on disk, so we can later access it. It essentially serves as a key-value store for the crash data. A severely shortened version looks like this:
    Tue 2024-08-27 17:52:27.593233 CEST […]
    COREDUMP_UID=60106
    COREDUMP_GID=60106
    COREDUMP_SIGNAL_NAME=SIGSYS
    COREDUMP_SIGNAL=31
    COREDUMP_TIMESTAMP=1724773947000000
    COREDUMP_COMM=wine64
    COREDUMP_FILENAME=/var/lib/systemd/coredump/core.wine64.….zst
    …
Example
Since this is all rather abstract, we can look at a trivial example to illustrate things a bit better.
Let’s open two terminals. In the first we can watch the journal for the crash to appear.
    journalctl -xef SYSLOG_IDENTIFIER=systemd-coredump
In the second terminal we run an instance of sleep in the background, and then trigger a segmentation fault crash.
    sleep 99999999999&
    kill -SEGV $!
In the first terminal you’ll see the crash happening:
    Aug 27 15:01:49 ajax systemd-coredump[35535]: Process 35533 (sleep) of user 60106 terminated abnormally with signal 11/SEGV, processing...
    Aug 27 15:01:49 ajax systemd-coredump[35549]: [🡕] Process 35533 (sleep) of user 60106 dumped core.
    Stack trace of thread 35533:
    #0 0x0000729f1b961dc0 n/a (/lib/ld-linux-x86-64.so.2 + 0x1cdc0)
    ELF object binary architecture: AMD x86-64
So far so interesting. “But where is the additional data from /proc hiding?” you might wonder. We need to look at the verbose entry to see all data.
    journalctl -o verbose SYSLOG_IDENTIFIER=systemd-coredump
This actually already concludes coredumpd’s work. In the next post DrKonqi will step onto the stage.
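Since the journal entry essentially serves as a key-value store, its fields are easy to consume programmatically. The following is a minimal sketch, not DrKonqi's code: it assumes the one-field-per-line layout of the shortened entry shown earlier, the function names are hypothetical, and it treats COREDUMP_TIMESTAMP as microseconds since the epoch.

```python
import re
from datetime import datetime, timezone

# Sample taken from the shortened journal entry shown earlier in this post.
SAMPLE_ENTRY = """\
Tue 2024-08-27 17:52:27.593233 CEST […]
    COREDUMP_UID=60106
    COREDUMP_GID=60106
    COREDUMP_SIGNAL_NAME=SIGSYS
    COREDUMP_SIGNAL=31
    COREDUMP_TIMESTAMP=1724773947000000
    COREDUMP_COMM=wine64
    COREDUMP_FILENAME=/var/lib/systemd/coredump/core.wine64.….zst
    …
"""

# Field names consist of uppercase letters, digits and underscores.
FIELD_NAME = re.compile(r"[A-Z0-9_]+")


def parse_journal_fields(text):
    """Collect KEY=VALUE lines into a dict, skipping everything else."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and FIELD_NAME.fullmatch(key):
            fields[key] = value
    return fields


def crash_time(fields):
    """Convert COREDUMP_TIMESTAMP (assumed to be in microseconds) to UTC."""
    micros = int(fields["COREDUMP_TIMESTAMP"])
    return datetime.fromtimestamp(micros / 1_000_000, tz=timezone.utc)


fields = parse_journal_fields(SAMPLE_ENTRY)
print(fields["COREDUMP_COMM"], fields["COREDUMP_SIGNAL_NAME"])
print(crash_time(fields).isoformat())
```

The timestamp line and the elided parts of the entry contain no KEY=VALUE pair, so the parser simply skips them.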
PyCoder’s Weekly: Issue #644 (Aug. 27, 2024)
#644 – AUGUST 27, 2024
This course uses three problems often covered in introductory astrophysics courses to play in Python. Along the way you’ll learn some astronomy and how to use a variety of data science libraries like NumPy, Matplotlib, pandas, and pint.
REAL PYTHON course
Packaging in Python has a bit of a history and if you’ve come across the variety of ways of specifying and building packages you might wonder “why?” This article gives you the history and current best practices.
BITECODE
Discover how to create, accelerate, and deploy data pipelines with RAPIDS for GPU-accelerated data science workflows. Take this course for free when you join the NVIDIA Developer Program →
NVIDIA sponsor
This guide walks you through how to build a custom query language in Python. The example given is a language to search through song lyrics.
JAMES G
In this step-by-step project, you’ll build a blog from the ground up. You’ll turn your Django blog data models into a GraphQL API and consume it in a Vue application for users to read. You’ll end up with an admin site and a user-facing site you can continue to refine for your own use.
REAL PYTHON
Are you interested in learning robotics with Python? Can physical electronics-based projects grow a child’s interest in coding? This week on the show, we speak with author Marwan Alsabbagh about his book “Build Your Own Robot - Using Python, CRICKIT, and Raspberry Pi.”
REAL PYTHON podcast
As a user of pre-commit hooks, do you know what happens when you run pre-commit install or why you have to run it in the first place? How does pre-commit actually work with Git? In this article, Stefanie takes you behind the scenes of how your pre-commit setup works.
STEFANIEMOLIN.COM • Shared by Stefanie Molin
Master the Python range() function and learn how it works under the hood. You most commonly use ranges in loops. In this tutorial, you’ll learn how to iterate over ranges but also identify when there are better alternatives.
REAL PYTHON
The author was asked what he would consider if he wrote an in-memory cache. This article talks about the eight principles he would use, many of which he wouldn’t have considered when he was a younger developer.
TWO WRONGS
Every now and then you hear outrageous claims such as “Python has no preprocessor”. Well, it is there if you’re willing to dig deep enough. Learn how to hack Python’s compile step.
PYDONG
Work continues on removing and/or optimizing the GIL in Python. This article gives a little history so you can better understand why the GIL is there and what changes are coming.
IZZY MUERTE
“Python continues to cement its overall dominance, buoyed by things like popular libraries for hot fields such as A.I.” Read the article to see where other languages have placed.
IEEE SPECTRUM
Optimization should be your last step, but once you’re there, just what can you do? This article covers ten different techniques that address memory size and code performance.
JAMES ONONIWU
A new release of uv is out and it has added a lot of features. This post talks about what is new and how it can simplify your packaging process.
SIMON WILLISON
“PyPI has drastically improved its malware response times, resolving 90% of issues in under 24 hours and removing 900 projects since March 2024.”
SARAH GOODING
GITHUB.COM/BEN-N93 • Shared by Ben Nour
deltadb: A Lightweight Database Built on Polars and Deltalake
django-public-admin: A Public and Read-Only Django Admin
sqlfluff: A Modular SQL Linter and Auto-Formatter
authentik: The Authentication Glue You Need
Events
Weekly Real Python Office Hours Q&A (Virtual)
August 28, 2024
REALPYTHON.COM
August 29 to September 1, 2024
PYCON.ORG
August 29, 2024
MEETUP.COM
August 31, 2024
PYTHON.ORG.BR
September 2, 2024
J.MP
September 4 to September 6, 2024
DATACOVE.CO.UK
September 5 to September 7, 2024
PYCON.EE
Happy Pythoning!
This was PyCoder’s Weekly Issue #644.
Three things I learned at KubeCon + AI_Dev China 2024
KubeCon China 2024 was a whirlwind of innovation, community and technical deep dives. As it often happens at these community events, I was blown away by the energy, enthusiasm and sheer amount of knowledge being shared. Here are three key takeaways that stood out to me:
1. The focus on AI and machine learning
AI and machine learning are increasingly integrated into cloud-native applications. At KubeCon China, I saw numerous demonstrations of how these technologies are being used to automate tasks, optimize resource utilization and improve application performance. From AI-powered observability tools to machine learning-driven anomaly detection, the potential for AI and ML in the cloud-native space is astounding.
Mer Joyce and Anni Lai introduced the new draft of the Open Source AI Definition (v.0.0.9) and the Model Openness Framework.
We also saw a robot on stage demonstrating that teaching a robotic arm to use a spoon to help disabled people is not a programming issue but a data issue. This was probably my biggest learning moment: A robot can be “taught” to execute tasks by imitating humans. Follow Xavier Tao and the dora-rs project.
2. The growing maturity of cloud-native technologies
It’s clear that cloud-native technologies have come of age. From Kubernetes adoption to the rise of serverless platforms and edge computing, the ecosystem is thriving. In his keynote, Chris Aniszczyk announced that over 200 projects are hosted by the Cloud Native Computing Foundation and that half of the contributors are not in the US. The conference showcased a wide range of tools, frameworks and use cases that demonstrate the versatility and scalability of cloud-native architectures.
The presentation by Kevin Wang (Huawei) and Saint Jiang (NIO) showed how Containerd, Kubernetes and KubeEdge power the transition to electric vehicles. Modern cars are computers… no, cars are full datacenters on wheels, a collection of sensors feeding distributed applications to optimize battery usage, feeding into centralized programs to constantly improve the whole mobility system.
3. AI technology is removing the language barrier
I was absolutely amazed by being able to follow the keynote sessions delivered in Chinese. I don’t speak Chinese but I could read the automatic translation in real time superimposed on the slides behind the speakers. This technology is absolutely jaw-droppingly amazing! Within a few years, there won’t be a career for simultaneous translators or for live transcribers.
Final thoughts
KubeCon + AI_Dev China was a testament to the power of Open Source collaboration hosted in one of the most amazing regions of the world. The conference brought together developers, operators and end-users from around the world to share their experiences, best practices and contributions to Open Source projects. This collaborative spirit is essential for driving innovation and ensuring the long-term success of cloud-native technologies.
FSF News: Thank you Odile Bénassy for four years of service on the FSF Board of Directors!
James Bennett: There can't be only one
There’s a concept that I’ve heard called by a lot of different names, but my favorite name for it is “the Highlander problem”, which refers to the catchphrase of the campy-yet-still-quite-fun Highlander movie/TV franchise. In Highlander, immortal beings secretly live amongst us and sword-fight each other in hopes of being the last one standing, who will then get to rule the world forever. And when one of them is about to eliminate another, …
The Drop Times: Drupal GovCon 2024: Drupal’s Pivotal Role in Government CMS and Accessibility
KDE neon rebase progressing
Here at KDE neon tower we have been busy rebasing our KDE software builds from Ubuntu 22.04 (jammy) to Ubuntu 24.04 (noble). This always takes longer than you’d think, mostly because it’s a moving target so we also have to keep updating the incoming releases from Plasma, Frameworks, Gear and even Calligra. We had a couple of delays when Jonathan caught Covid (4 years after it was worth sympathy to do so) and then the build server had issues and needed itself rebuilding. But the package archives are there and the Docker images are there and today the first ISO got built which boots successfully. Next steps are making sure all the software is up to date and getting the upgrade solid. Be with you soon!
First KDE neon ISO based on Noble
Fresh Breeze Dialog Icons
The Breeze icons used in message boxes always felt a little odd with a status icon placed inside some kind of speech bubble, effectively an icon within an icon. Three months ago they got replaced by more simplistic ones that I felt didn’t fit very well either. Therefore I put my Inkscape skills to the test and created a new set of Breeze-style dialog icons.
As you may know I have a secret passion for vector graphics. I am not very good at it but I love that you have proper shapes and objects to work with as opposed to a mush of pixels. Sure, more sophisticated graphics programs have layers and masks and what not but my proficiency there doesn’t go much beyond Kolourpaint. More importantly, though, I can easily copy paste together various bits and pieces of other icons to implement the icon I had in mind.
A few of the file icons I have worked on but didn’t finish (feel free to take as an inspiration)
Over the years I have created a couple of Breeze icons, mostly for (exotic) file types since I am on a quest to have a thumbnailer or at least a proper icon for every file type imaginable. Take for example the Apple Wallet icon we use for KDE Itinerary, our fantastic travel companion app: it’s just a ZIP file with a JSON description and graphical assets for a boarding pass or event ticket. I used a generic blue file icon combined with the wallet-open icon. My original idea was actually a plane alongside a QR code but I couldn’t help but view it as a plane that crashed into a building.
The other day I received concert tickets in the form of an Apple Wallet bundle which contains multiple passes, something I didn’t know existed and probably didn’t back when I created the original icon. For this purpose I used the generic package icon in the same light blue, zipper shifted to the right to match the Android APK icon, and wallet emblem added.
How the Apple Wallet passes icon came to be
Friday night I sat down to finally create some new dialog icons for error, warning, question, and information. Breeze comes with shield-shaped security icons similar to the message box icons we used to have. I just wanted a triangle with a Breeze-style gradient and angled drop shadow, how hard can it be? I ended up twisting the security-medium icon (an orange shield with exclamation mark), removing nodes, stretching objects, and straightening Bezier curves until I got what I wanted.
For the error icon I initially wanted a simple circle with an X. This time security-low (a red shield with a cross) served as a base. I drew a simple circle and used the shield’s gradient as fill. I couldn’t figure out how to bend the shield-like outline at the bottom to match the circle so I went looking for a similar circular icon I could steal it from. Luckily, there was exactly one circular 64 px icon I could copy. For question and information I also used purple and blue circles, respectively.
First iteration of the new dialog-error icon
Afterwards I figured I probably want a different shape for the error icon so it’s not all circles and to keep it from looking like a close button. That’s actually the main reason our message widget lost its icon: people confused the error symbol on the left with a close button to dismiss the message. I checked what other icon themes did and found squares were quite popular (also used by the KDE 4 Oxygen theme) as well as octagons. I tried a square by modifying the utilities-terminal (Konsole) icon but that just looked way too massive. After consulting the KDE Visual Design Group we settled on an octagon like a stop sign (reminds me of the glorious Windows 3.1 days which used an actual stop sign). Finally, Janet Blackquill and Andy Betts gave the icons some finishing touches and needed polish, taking into account our Colorful Icon HIG.
This post illustrates (in more than one sense of the word) nicely that contributing to KDE is more than writing actual source code. Creating artwork, writing translations, doing promo work, managing IT infrastructure, and of course hosting community events like Akademy are just as important for a community to thrive!
Real Python: Using Astropy for Astronomy With Python
This course covers two problems from introductory astronomy to help you play with some Python libraries. You’ll use Astropy, NumPy, Matplotlib, and pandas to find planet conjunctions, and graph the best viewing times for a star.
In this course you’ll learn about:
- Astronomy concepts of conjunction and optimal viewing
- The Python package Astropy
- Using pandas to process data
- Building graphs with Matplotlib
- Python’s warnings module
Wim Leers: XB week 12: component previews & StorablePropShape
The back-end heart of Experience Builder (XB) is its two-property field type. Thanks to Ted “tedbow” Bowman, the tree field property is now strictly validated, which is essential to ensure both data integrity and the ability to evolve the codebase rapidly and confidently. Crucially, this validation constraint is used to validate both content and configuration, just like the validation that was added in week 10.
That validation finally unblocked #3446722: Introduce an example set of representative SDC components; transition from “component list” to “component tree”: now that Ted landed the necessary validation, it makes sense to add Kyle “ctrladel” Einecker’s set of Single Directory Components (SDCs) that Lauri has confirmed to well represent the spectrum of SDC functionality XB must support.
Ted will change the default component tree that XB configures for articles, so that we’ll start seeing Kyle’s two_column SDC by default!
To assist Ted, I updated the XB field type’s computed hydrated property to support hydrating component trees instead of just component lists, and updated the “preview” route to use that logic.
Ted already landed validation, I took care of rendering, so now Ted can focus on the remaining bits … because until now, we’d been testing/developing XB with only a handful of sample components — we fully expect this to reveal a bunch of missing things. That’s exactly why getting a representative set of SDCs into the XB codebase is important, even if eventually they may only be used in tests.
Hopefully that will land next week!
Missed a prior week? See all posts tagged Experience Builder.
Goal: make it possible to follow high-level progress by reading ~5 minutes/week. I hope this empowers more people to contribute when their unique skills can best be put to use!
For more detail, join the #experience-builder Slack channel. Check out the pinned items at the top!
Hearing the term “routing” in a Drupal context typically means “server-side routing”. But for an extensive JS application, client-side routing is important too: it allows sharing a URL with a friend/colleague to invite them to collaborate on a particular bit in the content being created. That’s why Jesse “jessebaker” Baker and Ben “bnjmnm” Mullins landed client-side routing for XB. After asking the community for feedback on which direction to take (thanks Bálint “balintbrews” Kléri, Ronald “roaguicr” Aguilar, Kyle and Lee “larowlan” Rowlands for your input!), they settled on React Router. The implementation will likely evolve, but a basic implementation is now in place, and includes test coverage.
Related to routing, but on the back-end side: Lee updated XB’s server-side routes to expect an entity type + ID, rather than hardcoding them all to node one. This is a welcome improvement, but would not have happened if not for Lee or somebody else in the community: for the team working full-time on XB this isn’t a priority yet, because we’re prioritizing the hard stuff — the known unknowns. Still, we definitely welcome MRs like these, and will happily review & merge them!
I know y’all are waiting for interesting progress on the experience of using XB — this week’s key progress on that front is brought to you by Ben!
Choosing a component based on its name alone might be okay … but an instantaneous visual preview would be better, right? That’s exactly what he landed in #3462636:
The funniest bugfix of the week is brought to you by Utkarsh “utkarsh_33”: the SDC prop labels were present on field widgets, but were invisible :D
Finally, in the “improve DX & velocity” department, the eslint prettier configuration was updated, which gets us closer to Drupal core’s configuration for JS. Thanks to Ivan “finnsky” Berdinsky, Ben, Gaurav “gauravvvv”, Daniel “DanielVeza”, Lee and Harumi “hooroomoo” Jang. Harumi captured the impact well:
Looks good! Will save headaches:)
Computing a StorablePropShape
Back to the back-end side, to end this week’s update in a very deep place (but also a very interesting place!): XB gained the ability to compute a field type + storage settings + instance settings for a given SDC prop shape (the normalized subset of an SDC prop’s JSON schema that affects the shape of data it expects; the title, description, examples etc. in the JSON schema are irrelevant from this point of view; I named this a PropShape).
Until now, XB has only been using matching. But that can only get us so far. For example, SDCs often have props whose JSON schema specifies type: string together with an enum that allows only a few values, such as primary and secondary.
To populate this SDC prop, XB must store a string (logical choice: Drupal core’s string field type), but not just any string: only primary or secondary. Drupal core has an answer for this too: the list_string field type. But the matching that was hitherto used requires either a field type that allows precisely those 2 values, or an existing list_string field instance that is configured to allow those 2 values. Clearly, that’s likely to result in zero matches, because the chances are vanishingly small that a Drupal site has a pre-existing field instance configured exactly like that. And that is just one example: many SDCs will have different allowed values.
That’s where computing rather than matching becomes relevant: use logic to compute what exact shape (in this case: a type: string that also specifies an enum: […]) requires which field type (list_string) and which corresponding field storage+instance settings (here only storage settings: allowed_values: [ {value: primary, label: primary}, {value: secondary, label: secondary} ]). The computed result is represented by a StorablePropShape.
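The "compute instead of match" idea can be sketched in a few lines. This is only an illustration written in Rust: XB itself is PHP, and every name below (PropShape, AllowedValue, compute_storable_prop_shape and the struct fields) is a hypothetical stand-in, not XB's actual API. It covers only the string and string-with-enum cases described above.

```rust
/// Hypothetical, heavily simplified model of a normalized SDC prop shape.
#[derive(Debug, Clone, PartialEq)]
pub struct PropShape {
    /// The JSON schema `type`, e.g. "string".
    pub json_type: String,
    /// The JSON schema `enum`, if the prop restricts its values.
    pub enum_values: Option<Vec<String>>,
}

/// One entry of a list_string field's allowed_values storage setting.
#[derive(Debug, Clone, PartialEq)]
pub struct AllowedValue {
    pub value: String,
    pub label: String,
}

/// Hypothetical result type: a field type plus the storage settings it needs.
#[derive(Debug, Clone, PartialEq)]
pub struct StorablePropShape {
    pub field_type: String,
    pub allowed_values: Vec<AllowedValue>,
}

/// Computes a storable shape from the prop shape alone, without looking
/// for a pre-existing field instance that happens to match.
pub fn compute_storable_prop_shape(shape: &PropShape) -> Option<StorablePropShape> {
    match (shape.json_type.as_str(), &shape.enum_values) {
        // A string restricted by an enum needs list_string with exactly
        // those allowed values; the label simply mirrors the value.
        ("string", Some(values)) => Some(StorablePropShape {
            field_type: "list_string".to_string(),
            allowed_values: values
                .iter()
                .map(|v| AllowedValue { value: v.clone(), label: v.clone() })
                .collect(),
        }),
        // An unrestricted string can use the plain string field type.
        ("string", None) => Some(StorablePropShape {
            field_type: "string".to_string(),
            allowed_values: Vec::new(),
        }),
        // Everything else is out of scope for this sketch.
        _ => None,
    }
}
```

Because the allowed values are derived from the schema itself, the result does not depend on what happens to be configured on the site already.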
And that is necessary for XB users to fully benefit from the work Ted is doing on #3446722: many of those representative SDCs are indeed using enum: otherwise you’d not be able to edit component instances that will be placeable once Ted’s done!
This infrastructure also paves the path to something else: allowing those computed field type + widget decisions to be altered. For example, when the Media Library module is installed and a media type that uses the image MediaSource plugin is present, an SDC with a prop that expects an image should no longer use the image field type + widget, but the Media Library widget. So I worked with Ted and Ben to introduce hook_storable_prop_shape_alter(), and made XB implement it on behalf of the media_library module.
This doesn’t mean that matching goes away: that will remain relevant for identifying which existing structured data can be used to populate an SDC prop. Much more work is needed to make XB’s matching ability complete, but that work is for after the 0.1.0 goals for DrupalCon Barcelona.
Week 12 was July 29–August 4, 2024.
FSF Events: Free Software Directory meeting on IRC: Friday, August 30, starting at 12:00 EDT (16:00 UTC)
Calligra Office 4.0 is Out!
Calligra is the office and graphics suite developed by KDE and is the successor to KOffice. With some traditional parts like Kexi and Plan having an independent release schedule, this release only contains the four following components:
- Calligra Words: Word Processor
- Calligra Sheets: Spreadsheet Application
- Calligra Stage: Presentation Application
- Karbon: Vector Graphics Editor
The most significant updates are that Calligra has been fully transitioned to Qt6 and KF6, along with a major overhaul of its user interface.
General
Words, Sheets, and Stage now feature a new sidebar design. Currently, this is implemented using a proxy style, which will no longer be necessary once the related merge request in Breeze is merged.
Sidebar with the new immutable tab design
I revamped the content of each sidebar tab, addressing various visual glitches and making the spacing much more consistent.
The “Custom Shape” docker has been removed, and custom shapes are now accessible through a popup menu in the toolbar across all Calligra applications.
Regarding the toolbar, I streamlined the default layout by removing basic actions like copy, cut, and paste.
The settings dialogs were also cleaned up and are now using the new FlatList style also used by System Settings and most Kirigami applications.
Words
Words now features the new sidebar design, and the main view uses a shadow to define the document borders.
The Style Manager and Page Layout dialog were also updated.
Stage
Stage didn’t really change aside from the sidebar redesign. But I am using it to work on my slides for Akademy and it is a pretty solid choice.
The tooltips for the slides are now compatible with Wayland.
Calligra Sheets
As part of the Qt6 port, Sheets lost its scripting system based on the unmaintained Kross framework. In the future, it would be possible to add Python scripting, thanks to the work of Manuel Alcaraz Zambrano on getting Python bindings for the KDE Frameworks.
Visually, a noticeable change is that the cell editor moved from a docker positioned on the left of the spreadsheet view by default to a normal widget at the top. This takes a lot less space, which can now be used by the spreadsheet.
Karbon
Karbon didn’t receive many changes beyond those affecting the whole platform.
Launcher
The initial window shown when opening one of the Calligra applications was redesigned and adopted the new “frameless style”.
Custom Document tab of the launcher page
Template tab of the launcher page
Other
- Braindump is now able to compile again, but since it lacks an active maintainer, the component is disabled in release builds.
- The webshape plugin has been ported from the outdated QtWebkit module to QtWebEngine and is no longer exclusive to Braindump. This means you can now embed websites directly into your word documents, slides, and spreadsheets.
- The AppStream ID of every component is prefixed by org.kde.calligra. This allows Flatpak to expose every Calligra application to your application launcher.
Calligra needs your support! You can contribute by getting involved in development, providing new or updated templates, or making a donation to KDE e.V.. Join the discussion in our Matrix channel.
Credits
This release would not have been possible without the high quality mockups provided by Manuel Jesús de la Fuente. Also big thanks to everyone who contributed to this Calligra release: Evgeniy Harchenko, Dmitrii Fomchenkov and bob sayshilol.
Packager Section
You can find the package on download.kde.org and it has been signed with my GPG key.
Specbee: Why User Experience (UX) matters and how it can transform your website
Russ Allbery: Review: Dark Horse
Review: Dark Horse, by Michelle Diener
Series: Class 5 #1
Publisher: Eclipse
Copyright: June 2015
ISBN: 0-9924559-3-6
Format: Kindle
Pages: 366
Dark Horse is a science fiction romance novel, the first of a five book series as of this writing. It is self-published, although it is sufficiently well-edited and packaged that I had to do some searching to confirm that.
Rose was abducted by aliens. The Tecrans picked her up along with a selection of Earth animals, kept her in a cell in their starship, and experimented on her. As the book opens, she has managed to make her escape with the aid of an AI named Sazo who was also imprisoned on the Tecran ship. Sazo dealt with the Tecrans, dropped the ship in the middle of Grih territory, and then got Rose and most of the animals on shuttles to a nearby planet.
Dav Jallan is the commander of the ship the Grih sent to investigate the unexplained appearance of a Class 5 Tecran warship in the middle of their territory. The Grih and the Tecran, along with three other species, are members of the United Council, which means in theory they're all at peace. With the Tecran, that theory is often strained. Dav is not going to turn down one of their highly-advanced Class 5 warships delivered to him on a silver platter. There is only the matter of the unexpected cargo, the first orange dots (indicating unknown life forms) that most of the Grih have ever seen.
There is a romance. That romance did not work for me. I thought it was highly unprofessional on Dav's part and a bit too obviously constructed on the author's part. It also leans on the subgenre convention that aliens can be remarkably physically similar and sexually compatible, which always causes problems for my suspension of disbelief even though I know it's no less plausible than faster-than-light travel.
Despite that, I had so much fun with this book! It was absolutely delightful and weirdly grabby in a way that caught me by surprise. I was skimming some parts of it to write this review and found myself re-reading multiple pages before I dragged myself back on task.
I think the most charming part of this book is that the United Council has a law called the Sentient Beings Agreement that makes what the Tecran were doing extremely illegal, and the Grih and the other non-Tecran aliens take this very seriously and with a refreshing lack of cynicism. Rose has a typical human reaction to ending up in a place where she doesn't know the rules and isn't entirely an expected guest. She almost reflexively smoothes over miscommunications and tensions, trying to adapt to their expectations. And then, repeatedly, the Grih realize how much work she's doing to adapt to them, feel enraged at the Tecran and upset that they didn't understand or properly explain something, and find some way to make Rose feel more comfortable. It's surprisingly soothing and comforting to read.
It occurred to me in several places that Dark Horse could be read as a wish-fulfillment fantasy of what life as a woman could be like if men took their fair share of the mental load. (This concept is usually applied to housework, but I think it generalizes to other social and communication contexts.) I suspect this was not an accident.
There is a lot of wish fulfillment in this book. The Grih are very human-like but hunky, which is convenient for the romance subplot. They struggle to sing, value music exceptionally highly, and consider Rose's speaking voice beautifully musical. Her typical human habit of singing to herself is a source of immediate and almost overwhelming fascination. The supplies Rose takes from the Tecran ship when she flees just happen to be absurdly expensive scented shampoo and equally expensive luxury adaptable clothing. The world she lands on, and the Grih ship, are low-gravity compared to Earth, so Rose is unusually strong for her size. Grih military camouflage has no effect on her human vision. The book is set up to make Rose special.
If that type of wish fulfillment is going to grate, wait on this book until you're more in the mood for it. But I like wish fulfillment books when they're done well. Part of why I like to read is to imagine a better world. And Rose isn't doted on; despite their hospitality, she's constantly underestimated by the Grih. Even with their deep belief in the Sentient Beings Agreement, they find it hard to believe that an unknown sentient, even an advanced sentient, is really their equal. Their concern at the start is somewhat patronizing, so watching Rose constantly surprise them delighted the part of my brain that likes both competence porn and deserved reversals, even though the competence here is often due to accidents of biology. It helps that Diener tells the story in alternating perspectives, so the reader first watches Rose do something practical and straightforward from her perspective and then gets to enjoy the profound surprise and chagrin of the aliens.
There is a plot beneath this first contact story, and beyond the political problem of figuring out what to do with Rose and the Tecran. Sazo, Rose's AI friend, does not want the Grih to know he exists. He has a history that Rose does not know about and may not be entirely safe. As the political situation with the Tecran escalates, Sazo is pursuing goals of his own, and Rose has a firm opinion about where her loyalties should lie. The resolution is nothing ground-breaking as far as SF goes, but I thought it was satisfyingly tense and complex. Dark Horse leaves obvious room for a sequel, but it comes to a satisfying conclusion.
The writing is serviceable, particularly once you get into the story. I would not call it great, and it's not going to win any literary awards, but it didn't interfere with my enjoyment of the story.
This is not the sort of book that will make anyone's award list, but it is easily in the top five of books I had the most fun reading this year. Maybe save it for when you're looking for something light and wholesome and don't mind some rather obvious tropes, but if you're in the mood for imagining people who take laws seriously and sincerely try to help other people, I found this an utterly delightful way to pass the time. I immediately bought the sequel. Recommended.
Followed by Dark Deeds.
Rating: 8 out of 10
Armin Ronacher: MiniJinja: Learnings from Building a Template Engine in Rust
Given that I can't stop creating template engines, I figured I might write a bit about my learnings of creating MiniJinja which is an implementation of my Jinja2 template engine for Rust. Disclaimer: this post might be a bit more technical.
There is a good chance you have come across Jinja2 templates before, as they have become quite commonplace over the years. They look a bit like this:
    {% extends "layout.html" %}
    {% block body %}
      <p>Hello {{ name }}!</p>
    {% endblock %}
If you want to play around with it yourself, here are some links:
- The MiniJinja playground lets you play with a WASM compiled version of MiniJinja.
- The API Documentation documents all APIs, functionality and syntax.
- The GitHub Project for all the code including lots of examples.
- minijinja and minijinja-cli on crates.io
Maybe we start with the initial question of why I wrote MiniJinja. It's the year 2024 and people don't create a ton of HTML with server side rendered template engines any more. While there is some resurgence of that model thanks to HTMX, hotwire and livewire, I personally use SolidJS for my internal UI needs. There is however always a need to generate some form of text, and so somehow the need for Jinja2 never really went away. When I originally created it, it was clearly meant for generating HTML with some JavaScript sprinkled on top, but in the years since I have encountered Jinja templates in many more places, primarily for generating YAML and similar formats. Lately it comes up for LLM prompt generation.
My personal need for MiniJinja came out of an experiment I built for infrastructure automation. Since the templates had to be loaded dynamically I could not use a system like Askama. Askama has type-safe templates that just generate Rust code. On the other hand most Jinja inspired template engines that are dynamic in Rust really do not try very hard to be Jinja compatible. Because writing template engines is also fun, I figured I might give it another try.
Over the last two years I kept adding to the engine until it got to the point where it's at almost feature parity with Jinja2 and quite enjoyable to use.
Runtime Values
When building a template engine for Rust you end up building a little dynamic programming language that is optimized for text generation. Consequently you pull in most of the challenges of building a dynamic language. Particularly when working in Rust the immediate challenge is memory management and exposing native Rust objects to the embedded language. So the interesting bit here is how to create a system that allows interactions between the template engine and the Rust world around it.
MiniJinja, unlike Jinja2, does not use code generation but has a basic stack-based VM and an AST-based bytecode compiler. Since MiniJinja follows Jinja2 it inherits a lot of the realities of the underlying object system that Jinja2 inherits from Python. For instance macros (functions) are first class objects and they can have closures. This has challenges because it's easy to create cycles and Rust has no garbage collector that can help with this problem.
The core object model in MiniJinja is a Value type which is represented by an enum that looks as follows (some less important variants removed):
    #[derive(Clone)]
    pub struct Value(ValueRepr);

    #[derive(Clone)]
    pub(crate) enum ValueRepr {
        Undefined,
        None,
        Bool(bool),
        U64(u64),
        I64(i64),
        F64(f64),
        String(Arc<str>, StringType),
        SmallStr(SmallStr),
        Invalid(Arc<Error>),
        Object(DynObject),
    }
Externally everything is a Value. If you Clone it, you usually bump a reference count or you make a cheap memcopy. Values are either primitives such as strings, numbers etc. or objects.
For objects MiniJinja provides a trait called Object which can be implemented by most Rust types. The engine provides a DynObject wrapper, which is a fancy Arc<dyn Object> that supports borrowing and object safety. I wrote about this before. What you will notice is that quite a few of the types involved have an Arc. That's because these values are for the most part reference counted. Since values here are really fat (they are 24 bytes in memory) a SmallStr type is used to hold up to 22 bytes of string data inline. One byte is used to encode the length of the string, and another byte is then used by the ValueRepr to mark which enum variant is in use. In pure theory this is all wrong. We never use weak references, so the weak count in the Arc is not used and clever bit hackery could be used to greatly reduce the size of the value type. I think one could get the whole thing down to 16 bytes trivially or even 8 bytes with NaN tagging. However I did not want to walk into the world of unsafe code more than feels appropriate.
MiniJinja is also plenty fast.
One variant that is worth calling out is Invalid. That's a value that can exist in the system but it carries an error. When you're trying to interact with it in most cases it will propagate this error. That's used in the engine in places where the API assumes infallibility (particularly during iteration) but it needs a way to emit an error. This concept is quite common when writing an engine in C, though typically the actual error is carried out of band. For instance in QuickJS there is a marker value that indicates a failure, but the actual error is held on the interpreter runtime.
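To make the size budget above concrete, here is a minimal sketch of an inline string in safe Rust. It is not MiniJinja's actual SmallStr. The struct layout, the try_new and as_str method names and the small_str_size helper are assumptions made for illustration. Only the 22 byte capacity and the single length byte come from the description above.

```rust
use std::mem::size_of;

/// Illustrative inline string: one length byte plus 22 bytes of data.
#[derive(Clone, Copy)]
pub struct SmallStr {
    len: u8,
    buf: [u8; 22],
}

impl SmallStr {
    pub const CAPACITY: usize = 22;

    /// Returns None when the string does not fit inline; a real engine
    /// would fall back to a heap-allocated representation in that case.
    pub fn try_new(s: &str) -> Option<SmallStr> {
        if s.len() > Self::CAPACITY {
            return None;
        }
        let mut buf = [0u8; 22];
        buf[..s.len()].copy_from_slice(s.as_bytes());
        Some(SmallStr { len: s.len() as u8, buf })
    }

    pub fn as_str(&self) -> &str {
        // The bytes were copied from a valid &str in try_new, so the
        // UTF-8 check cannot fail; it is kept to stay in safe code.
        std::str::from_utf8(&self.buf[..self.len as usize]).expect("valid UTF-8")
    }
}

/// Size of the inline string on its own, before an enum tag is added.
pub fn small_str_size() -> usize {
    size_of::<SmallStr>()
}
```

The struct occupies 23 bytes with an alignment of one, which leaves exactly one byte for an enum discriminant inside a 24 byte value.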
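The idea of an error travelling inside a value can be sketched with the standard library alone. The Value enum, parse_all and sum below are hypothetical and far simpler than the engine's own types. They only illustrate how an iterator whose item type cannot express failure can still surface an error to the consumer.

```rust
use std::sync::Arc;

/// Simplified stand-in for a dynamic value with an error-carrying variant.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    I64(i64),
    /// Carries the error message instead of a real value.
    Invalid(Arc<String>),
}

/// The item type is Value, not Result, so the iterator itself cannot
/// fail. Failures are smuggled out as Value::Invalid instead.
pub fn parse_all<'a>(inputs: &'a [&'a str]) -> impl Iterator<Item = Value> + 'a {
    inputs.iter().map(|s| match s.parse::<i64>() {
        Ok(n) => Value::I64(n),
        Err(err) => Value::Invalid(Arc::new(format!("cannot parse {:?}: {}", s, err))),
    })
}

/// The first interaction with an invalid value turns it back into an error.
pub fn sum(values: impl Iterator<Item = Value>) -> Result<i64, String> {
    let mut total = 0;
    for value in values {
        match value {
            Value::I64(n) => total += n,
            Value::Invalid(err) => return Err(err.to_string()),
        }
    }
    Ok(total)
}
```

The trade-off is that every consumer has to remember to check for the invalid variant, which is why keeping it inside the value type (rather than in separate runtime state) keeps the check local.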
The trait definition for objects looks like this:
    pub trait Object: Debug + Send + Sync {
        fn repr(self: &Arc<Self>) -> ObjectRepr { ... }
        fn get_value(self: &Arc<Self>, key: &Value) -> Option<Value> { ... }
        fn enumerate(self: &Arc<Self>) -> Enumerator { ... }
        fn enumerator_len(self: &Arc<Self>) -> Option<usize> { ... }
        fn is_true(self: &Arc<Self>) -> bool { ... }
        fn call(
            self: &Arc<Self>,
            state: &State<'_, '_>,
            args: &[Value],
        ) -> Result<Value, Error> { ... }
        fn call_method(
            self: &Arc<Self>,
            state: &State<'_, '_>,
            method: &str,
            args: &[Value],
        ) -> Result<Value, Error> { ... }
        fn render(self: &Arc<Self>, f: &mut Formatter<'_>) -> Result
        where
            Self: Sized + 'static { ... }
    }
Some of these methods are implemented automatically. For instance many of the methods such as is_true or enumerator_len have a default implementation that is based on object repr and the return value from enumerate. But they can be overridden to change the default behavior or to add some potential optimizations.
One of the most important types in Jinja is a map as it holds the template context. They are implemented as you can imagine as Object. The implementation is in fact pretty trivial:
    impl<V> Object for BTreeMap<Value, V>
    where
        V: Into<Value> + Clone + Send + Sync + fmt::Debug + 'static,
    {
        fn get_value(self: &Arc<Self>, key: &Value) -> Option<Value> {
            self.get(key).cloned().map(|v| v.into())
        }

        fn enumerate(self: &Arc<Self>) -> Enumerator {
            self.mapped_enumerator(|this| Box::new(this.keys().cloned()))
        }
    }
This reveals two interesting aspects of the object model: First that Value implements Hash. That means any value can be used as the key in a map. While this is untypical for Rust and not even what happens in Python, it simplifies the system greatly. When in the template engine you write {{ object.key }}, behind the scenes object.get_value(Value::from("key")) is called. Since most keys are typically less than 22 characters, creating a dummy Value wrapper around them is not too problematic.
The second and probably more interesting part here is that you can sort of borrow out of an object for the enumerator. The mapped_enumerator helper takes a reference to self and invokes a closure which itself can borrow from self. This adjacent borrowing is implemented with unsafe code as there is no other way to make it work. The combination of repr (defaults to Map), get_value and enumerate gives the object the behavior, shape and contents.
Vectors look quite similar:
    impl<T> Object for Vec<T>
    where
        T: Into<Value> + Clone + Send + Sync + fmt::Debug + 'static,
    {
        fn repr(self: &Arc<Self>) -> ObjectRepr {
            ObjectRepr::Seq
        }

        fn get_value(self: &Arc<Self>, key: &Value) -> Option<Value> {
            self.get(key.as_usize()?).cloned().map(|v| v.into())
        }

        fn enumerate(self: &Arc<Self>) -> Enumerator {
            Enumerator::Seq(self.len())
        }
    }
Enumerators and Object Behaviors
Enumeration in MiniJinja is a way to allow an object to describe what's inside of it. In combination with the return values from repr() the engine changes how iteration is performed. These are possible enumerators:
    pub enum Enumerator {
        NonEnumerable,
        Empty,
        Iter(Box<dyn Iterator<Item = Value> + Send + Sync>),
        Seq(usize),
        Values(Vec<Value>),
    }
It's probably easier to explain how enumerators turn into iterators by showing you the try_iter method in the engine:
    impl DynObject {
        fn try_iter(self: &Self) -> Option<Box<dyn Iterator<Item = Value> + Send + Sync>>
        where
            Self: 'static,
        {
            match self.enumerate() {
                Enumerator::NonEnumerable => None,
                Enumerator::Empty => Some(Box::new(None::<Value>.into_iter())),
                Enumerator::Seq(l) => {
                    let self_clone = self.clone();
                    Some(Box::new((0..l).map(move |idx| {
                        self_clone.get_value(&Value::from(idx)).unwrap_or_default()
                    })))
                }
                Enumerator::Iter(iter) => Some(iter),
                Enumerator::Values(v) => Some(Box::new(v.into_iter())),
            }
        }
    }
Some of the trivial enumerators are quick to explain: Enumerator::NonEnumerable just does not support iteration and Enumerator::Empty does but won't yield any values. The more interesting one is Enumerator::Seq(n) which basically tells the engine to call get_value from 0 to n to yield items from the object. This is how sequences are implemented. The rest are enumerators that just directly yield values.
So when you want to iterate over a map, you will usually use something like Enumerator::Iter and iterate over all the keys in the map.
The engine then uses ObjectRepr to figure out what to do with it. A value marked as ObjectRepr::Seq will display like a sequence, you can index it with integers, and it iterates over the values in the sequence. If the repr is ObjectRepr::Map then the expectation is that it will be indexable by key and it will iterate over the keys when used in a loop. Its default rendering also is a key-value pair list wrapped in curly braces.
Now quite frankly I don't like that iteration protocol. I think it's more sensible for maps to naturally iterate over the key-value pairs, but since MiniJinja follows Jinja2 and Jinja2 follows Python, emulating that behavior was important.
Enumerators are a bit different than iterators because they might only define how iteration is performed (see: Enumerator::Seq). To actually create an iterator, the object is then passed to it. They are also asked to provide a length. When an enumerator provides a length it's an indication to the engine that the object can be iterated over more than once (you can re-create the enumerator). This is why an object can land in a MiniJinja template that looks like a list but is actually just an iterable object with a known length. For this MiniJinja uses a trick where it will inspect the size hint of the iterator to make assumptions about it. Internally every enumerator allows the engine to query the length of it:
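The split between "describe what is inside" and "actually iterate" can be reproduced in a reduced, std-only model. Everything below is a simplified stand-in rather than MiniJinja's API: the Value and Enumerator enums are cut down, the Object trait takes plain &self instead of &Arc<Self>, and iterate is a hypothetical helper that plays the role of try_iter.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Int(i64),
    Str(String),
    Undefined,
}

/// Reduced set of enumerators for this sketch.
pub enum Enumerator {
    NonEnumerable,
    Seq(usize),
    Values(Vec<Value>),
}

pub trait Object {
    fn get_value(&self, key: &Value) -> Option<Value>;
    fn enumerate(&self) -> Enumerator;
}

// A sequence only reports its length; items are fetched by index later.
impl Object for Vec<i64> {
    fn get_value(&self, key: &Value) -> Option<Value> {
        match key {
            Value::Int(idx) if *idx >= 0 => self.get(*idx as usize).map(|v| Value::Int(*v)),
            _ => None,
        }
    }
    fn enumerate(&self) -> Enumerator {
        Enumerator::Seq(self.len())
    }
}

// A map enumerates its keys, following the Python iteration protocol.
impl Object for BTreeMap<String, i64> {
    fn get_value(&self, key: &Value) -> Option<Value> {
        match key {
            Value::Str(k) => self.get(k).map(|v| Value::Int(*v)),
            _ => None,
        }
    }
    fn enumerate(&self) -> Enumerator {
        Enumerator::Values(self.keys().cloned().map(Value::Str).collect())
    }
}

/// Turns an enumerator into concrete items, or None if not iterable.
pub fn iterate(obj: &dyn Object) -> Option<Vec<Value>> {
    match obj.enumerate() {
        Enumerator::NonEnumerable => None,
        // Seq(n) means: call get_value for every index from 0 up to n.
        Enumerator::Seq(n) => Some(
            (0..n)
                .map(|idx| obj.get_value(&Value::Int(idx as i64)).unwrap_or(Value::Undefined))
                .collect(),
        ),
        Enumerator::Values(values) => Some(values),
    }
}
```

Iterating the vector yields its values while iterating the map yields its keys, which is the asymmetry the paragraph above complains about.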
    impl Enumerator {
        fn query_len(&self) -> Option<usize> {
            Some(match self {
                Enumerator::Empty => 0,
                Enumerator::Values(v) => v.len(),
                Enumerator::Iter(i) => match i.size_hint() {
                    (a, Some(b)) if a == b => a,
                    _ => return None,
                },
                Enumerator::RevIter(i) => match i.size_hint() {
                    (a, Some(b)) if a == b => a,
                    _ => return None,
                },
                Enumerator::Seq(v) => *v,
                Enumerator::NonEnumerable => return None,
            })
        }
    }
The important part here is the call to size_hint. If the upper bound is known, and the lower bound matches the upper bound then MiniJinja will assume the iterator will always have that length (for as long as not iterated). As a result it will change the way the object is interacted with. This for instance means that if you run range(10) in a template it looks like a list when printed even though iteration and number creation is lazy. On the other hand if you use the Value::make_one_shot_iterator API the length hint will always be disabled and MiniJinja will not attempt to interact with the iterator when printing it:
    {{ range(4) }} -> prints [0, 1, 2, 3]
    {{ a_real_iterator }} -> prints <iterator>
Building a VM
Lexing and parsing I think is not too puzzling in Rust, but making an AST and making a VM is kinda unusual. The first thing is that Rust is just not particularly amazing at tree structures. In MiniJinja I really wanted to avoid having the AST at all, but it does come in handy to implement some of the functionality that Jinja2 requires. For instance to establish closures it will just walk the AST to figure out which names are looked up within a function. I tried a few things to improve how memory allocations work with the AST. There are great crates out there for doing this, but I really wanted MiniJinja to be light on dependencies so I ended up opting against all of them.
For the AST design I went with large enums that hold Spanned<T> values:
    pub enum Expr<'a> {
        Var(Spanned<Var<'a>>),
        Const(Spanned<Const>),
        ...
    }

    pub struct Var<'a> {
        pub id: &'a str,
    }

    pub struct Const {
        pub value: Value,
    }
You might now be curious what Spanned<T> is. It's a wrapper type that does two things: it boxes the inner node and it stores an adjacent Span which is basically the code location in the original input template for debugging:
    pub struct Spanned<T> {
        node: Box<T>,
        span: Span,
    }
It implements Deref like a smart pointer so you can poke right through it to interact with the node. The code generator just walks the AST and emits instructions for it.
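The "poke right through it" behavior comes from implementing Deref. Here is a minimal sketch. The Span fields, the new and span methods and the var_name helper are assumptions added for illustration; only the Spanned layout and the Var struct mirror the definitions shown above.

```rust
use std::ops::Deref;

/// Hypothetical source location; the real Span fields are not shown above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub start_line: u32,
    pub start_col: u32,
}

/// Boxes the node and keeps its source location next to it.
#[derive(Debug)]
pub struct Spanned<T> {
    node: Box<T>,
    span: Span,
}

impl<T> Spanned<T> {
    pub fn new(node: T, span: Span) -> Spanned<T> {
        Spanned { node: Box::new(node), span }
    }

    pub fn span(&self) -> Span {
        self.span
    }
}

impl<T> Deref for Spanned<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.node
    }
}

#[derive(Debug)]
pub struct Var<'a> {
    pub id: &'a str,
}

/// Field access auto-derefs through Spanned, so `var.id` reaches Var.
pub fn var_name<'a>(var: &Spanned<Var<'a>>) -> &'a str {
    var.id
}
```

Code that walks the tree can therefore ignore the wrapper entirely and only ask for the span when it needs to report a location.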
The instructions themselves are a large enum but the number of arguments to the variants is kept rather low to not waste too much memory. The base size of the instruction is dominated by it being able to hold a Value which as we have established is a pretty hefty thing:
    pub enum Instruction<'source> {
        EmitRaw(&'source str),
        StoreLocal(&'source str),
        Lookup(&'source str),
        LoadConst(Value),
        Jump(usize),
        JumpIfFalse(usize),
        JumpIfFalseOrPop(usize),
        JumpIfTrueOrPop(usize),
        ...
    }
The VM keeps most of the runtime state on a State object that is passed to a few places. For instance you have already seen this in the call signature further up. The state for instance holds the loaded instructions or the template context. The VM itself maintains a stack of values and then just steps through a list of instructions on the state in a loop. Since there are a lot of instructions you can have a look on GitHub to see it in its entirety. Here however is a small part that shows roughly how this works:
    let mut pc = 0;

    loop {
        let instr = match state.instructions.get(pc) {
            Some(instr) => instr,
            None => break,
        };

        let a;
        let b;

        match instr {
            Instruction::EmitRaw(val) => {
                out.write_str(val).map_err(Error::from)?;
            }
            Instruction::Emit => {
                self.env.format(&stack.pop(), state, out)?;
            }
            Instruction::StoreLocal(name) => {
                state.ctx.store(name, stack.pop());
            }
            Instruction::Lookup(name) => {
                stack.push(assert_valid!(state
                    .lookup(name)
                    .unwrap_or(Value::UNDEFINED)));
            }
            Instruction::GetAttr(name) => {
                a = stack.pop();
                stack.push(match a.get_attr_fast(name) {
                    Some(value) => value,
                    None => undefined_behavior.handle_undefined(a.is_undefined())?,
                });
            }
            Instruction::LoadConst(value) => {
                stack.push(value.clone());
            }
            Instruction::Jump(jump_target) => {
                pc = *jump_target;
                continue;
            }
            Instruction::JumpIfFalse(jump_target) => {
                a = stack.pop();
                if !undefined_behavior.is_true(&a)? {
                    pc = *jump_target;
                    continue;
                }
            }
            // ...
        }
        pc += 1;
    }
Basically the current instruction is held in pc (short for program counter), normally it's advanced by one but jump instructions can change the pc to any other location. If you run out of instructions the evaluation ends.
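The excerpt depends on engine internals, so here is a self-contained toy version of the same loop. It is a sketch under heavy simplifications: only five instructions exist, the Value type has two variants, errors are plain strings, and the demo_program helper is hypothetical. Truthiness is reduced to "anything except Bool(false) is true".

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Bool(bool),
    Str(String),
}

pub enum Instruction<'source> {
    EmitRaw(&'source str),
    LoadConst(Value),
    Emit,
    Jump(usize),
    JumpIfFalse(usize),
}

/// Steps through the instructions until the program counter runs off the end.
pub fn eval(instructions: &[Instruction<'_>]) -> Result<String, String> {
    let mut out = String::new();
    let mut stack: Vec<Value> = Vec::new();
    let mut pc = 0;

    while let Some(instr) = instructions.get(pc) {
        match instr {
            Instruction::EmitRaw(text) => out.push_str(text),
            Instruction::LoadConst(value) => stack.push(value.clone()),
            Instruction::Emit => match stack.pop().ok_or("stack underflow")? {
                Value::Str(s) => out.push_str(&s),
                Value::Bool(b) => out.push_str(if b { "true" } else { "false" }),
            },
            Instruction::Jump(target) => {
                // Jumps replace the program counter instead of advancing it.
                pc = *target;
                continue;
            }
            Instruction::JumpIfFalse(target) => {
                if stack.pop().ok_or("stack underflow")? == Value::Bool(false) {
                    pc = *target;
                    continue;
                }
            }
        }
        pc += 1;
    }
    Ok(out)
}

/// Hand-compiled equivalent of `{% if flag %}yes{% else %}no{% endif %}`.
pub fn demo_program(flag: bool) -> Vec<Instruction<'static>> {
    vec![
        Instruction::LoadConst(Value::Bool(flag)),
        Instruction::JumpIfFalse(4),
        Instruction::EmitRaw("yes"),
        Instruction::Jump(5),
        Instruction::EmitRaw("no"),
    ]
}
```

The final Jump(5) targets an index one past the last instruction, which is how the true branch skips the else branch and ends evaluation.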
One piece of complexity in the VM comes down to macros. That's because lifetimes make that really tricky. A macro is just a Value that holds a Macro Object internally. So how can that macro reference the instructions, if the instructions themselves have a lifetime to the template 'source? The answer is that they can't (at least I have not found a reasonable way). So instead a macro has an ID which acts as a handle to look up the instructions dynamically from the execution state. Additionally each state has a unique ID so the engine can assert that nothing funny was happening. The downside of this is that a macro cannot be "returned" from a template. They can however be imported from one template into another.
Here is what a macro object looks like in code (abbreviated):
    pub(crate) struct Macro {
        pub name: Value,
        pub arg_spec: Vec<Value>,
        pub macro_ref_id: usize, // id of the macro
        pub state_id: isize,
        pub closure: Value,
        pub caller_reference: bool,
    }

    impl Object for Macro {
        fn call(self: &Arc<Self>, state: &State<'_, '_>, args: &[Value]) -> Result<Value, Error> {
            // we can only call macros that point to loaded template state.
            // if a template would be returned from a template this will
            // fail.
            if state.id != self.state_id {
                return Err(Error::new(
                    ErrorKind::InvalidOperation,
                    "cannot call this macro. template state went away.",
                ));
            }

            // ... argument parsing
            let arg_values = ...;

            // find referenced instructions
            let (instructions, offset) = &state.macros[self.macro_ref_id];

            // create a nested vm and evaluate the macro
            let vm = Vm::new(state.env());
            let mut rv = String::new();
            let mut out = Output::with_string(&mut rv);
            let closure = self.closure.clone();
            ok!(vm.eval_macro(
                instructions,
                *offset,
                self.closure.clone(),
                state.ctx.clone_base(),
                caller,
                &mut out,
                state,
                arg_values
            ));

            // return rendered template as string from the call
            Ok(if !matches!(state.auto_escape(), AutoEscape::None) {
                Value::from_safe_string(rv)
            } else {
                Value::from(rv)
            })
        }
    }
Additionally the closure is a good source of cycles. For that reason the engine keeps track of all closures during the execution and breaks cycles caused by closures manually by clearing them out.
Cool APIs
The last part that I want to go over is the magic that makes this work:
    fn slugify(value: String) -> String {
        value.to_lowercase().split_whitespace().collect::<Vec<_>>().join("-")
    }

    fn timeformat(state: &State, ts: f64) -> String {
        let configured_format = state.lookup("TIME_FORMAT");
        let format = configured_format
            .as_ref()
            .and_then(|x| x.as_str())
            .unwrap_or("HH:MM:SS");
        format_unix_timestamp(ts, format)
    }

    let mut env = Environment::new();
    env.add_filter("slugify", slugify);
    env.add_filter("timeformat", timeformat);
You might have seen something like this in Rust before, but it's still a bit magical. How can you make functions with seemingly different signatures register with the add_filter function? How does the engine perform the type conversions (as we know the engine has Value types, so where does the String conversion take place?). This is a topic for a blog post on its own but the answer behind this lies in a lot of clever trait hackery. The add_filter function reveals a bit of that hackery:
pub fn add_filter<N, F, Rv, Args>(&mut self, name: N, f: F)
where
    N: Into<Cow<'source, str>>,
    F: Filter<Rv, Args> + for<'a> Filter<Rv, <Args as FunctionArgs<'a>>::Output>,
    Rv: FunctionResult,
    Args: for<'a> FunctionArgs<'a>,
{
    let filter = BoxedFilter(Arc::new(move |state, args| -> Result<Value, Error> {
        f.apply_to(Args::from_values(Some(state), args)?).into_result()
    }));
    self.filters.insert(name.into(), filter);
}

Hidden behind this rather complex set of traits are some basic ideas:
- FunctionArgs is a helper trait for type conversions. It's implemented for tuples of different sizes made of ArgType values. These tuples represent the signature of the function. It has a method called from_values which performs that conversion via ArgType.
- ArgType, which you can't really see in the code above, is a trait that knows how to convert a Value into whatever type the function expects as an argument.
- Filter is a trait implemented for functions with qualifying FunctionArgs signatures that return a FunctionResult.
- A FunctionResult is a trait that represents potential return values from the function such as a Value, something that can be converted into a Value or a Result.
- The BoxedFilter type is what converts the passed closure into a reference counted object that is held in the environment.
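To make the interplay of these traits more concrete, here is a heavily simplified, std-only sketch. It is not MiniJinja's real code: it reuses the trait names from the list above, but drops the lifetimes and the State argument, uses a hypothetical two-variant `Value`, a `String` error type, a free function `boxed` in place of `add_filter`, and a made-up `repeat` filter. It only supports one and two arguments, where the real engine covers more arities.

```rust
use std::sync::Arc;

// Hypothetical two-variant stand-in for the engine's `Value` type.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Int(i64),
    Str(String),
}
type Error = String; // simplified error type for this sketch

// ArgType: converts one `Value` into the type a parameter asks for.
trait ArgType: Sized {
    fn from_value(value: &Value) -> Result<Self, Error>;
}
impl ArgType for i64 {
    fn from_value(value: &Value) -> Result<Self, Error> {
        match value {
            Value::Int(i) => Ok(*i),
            other => Err(format!("expected integer, got {:?}", other)),
        }
    }
}
impl ArgType for String {
    fn from_value(value: &Value) -> Result<Self, Error> {
        match value {
            Value::Str(s) => Ok(s.clone()),
            Value::Int(i) => Ok(i.to_string()),
        }
    }
}

// FunctionArgs: converts the argument slice into a tuple mirroring a signature.
trait FunctionArgs: Sized {
    fn from_values(values: &[Value]) -> Result<Self, Error>;
}
impl<A: ArgType> FunctionArgs for (A,) {
    fn from_values(values: &[Value]) -> Result<Self, Error> {
        match values {
            [a] => Ok((A::from_value(a)?,)),
            _ => Err(format!("expected 1 argument, got {}", values.len())),
        }
    }
}
impl<A: ArgType, B: ArgType> FunctionArgs for (A, B) {
    fn from_values(values: &[Value]) -> Result<Self, Error> {
        match values {
            [a, b] => Ok((A::from_value(a)?, B::from_value(b)?)),
            _ => Err(format!("expected 2 arguments, got {}", values.len())),
        }
    }
}

// FunctionResult: return types that can be turned into a `Value` or an error.
trait FunctionResult {
    fn into_result(self) -> Result<Value, Error>;
}
impl FunctionResult for String {
    fn into_result(self) -> Result<Value, Error> { Ok(Value::Str(self)) }
}
impl<T: FunctionResult> FunctionResult for Result<T, Error> {
    fn into_result(self) -> Result<Value, Error> { self.and_then(T::into_result) }
}

// Filter: implemented once per arity for any function with that many arguments.
trait Filter<Rv, Args>: Send + Sync + 'static {
    fn apply_to(&self, args: Args) -> Rv;
}
impl<F: Fn(A) -> Rv + Send + Sync + 'static, Rv, A> Filter<Rv, (A,)> for F {
    fn apply_to(&self, args: (A,)) -> Rv { (self)(args.0) }
}
impl<F: Fn(A, B) -> Rv + Send + Sync + 'static, Rv, A, B> Filter<Rv, (A, B)> for F {
    fn apply_to(&self, args: (A, B)) -> Rv { (self)(args.0, args.1) }
}

// BoxedFilter: the type-erased, reference-counted form of any filter.
type BoxedFilter = Arc<dyn Fn(&[Value]) -> Result<Value, Error> + Send + Sync>;
fn boxed<F: Filter<Rv, Args>, Rv: FunctionResult, Args: FunctionArgs>(f: F) -> BoxedFilter {
    Arc::new(move |args: &[Value]| -> Result<Value, Error> {
        f.apply_to(Args::from_values(args)?).into_result()
    })
}

fn slugify(value: String) -> String {
    value.to_lowercase().split_whitespace().collect::<Vec<_>>().join("-")
}
fn repeat(value: String, times: i64) -> Result<String, Error> {
    if times < 0 {
        return Err("times must not be negative".to_string());
    }
    Ok(value.repeat(times as usize))
}

fn main() {
    // Two different signatures end up as values of one type.
    let filters: [BoxedFilter; 2] = [boxed(slugify), boxed(repeat)];
    println!("{:?}", filters[0](&[Value::Str("Hello Big World".to_string())]));
    println!("{:?}", filters[1](&[Value::Str("ab".to_string()), Value::Int(3)]));
}
```

The compiler picks the matching `Filter` impl from the function's arity, which in turn fixes the `Args` tuple and therefore which `FunctionArgs` and `ArgType` conversions run inside the boxed closure. This is why `slugify` and `repeat` can be passed to the same `boxed` function without any annotations.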
I think a lot of the patterns in MiniJinja are useful for projects outside of MiniJinja. There is quite a bit more hidden in it that I have talked about before, such as how MiniJinja is abusing serde. If you have a need for a Jinja2-compatible template engine, I would love it if you got some use out of it. If you're curious about how to build a runtime and object system in Rust, you might also find some utility in the codebase.
I myself learned quite a bit about what creative API design can look like in Rust by building it. At this point I am incredibly happy with how the public API of the engine has turned out. The engine is extensively documented both internally and publicly and you can read all about it in the API docs.