Feeds
Okteta got “Best Application” 2024 Akademy Award
The jury of this year’s KDE Akademy Awards, by tradition made up of representatives of the previous year’s winners, has selected the hex editor Okteta in the category “Best Application”. Thanks to them for this appreciation, even more so for a niche application.
Though, appreciation for what, as there are no details? The last new feature was added in 2019, and the 17th patch release since then was just done. So, for a reliable program with no need to relearn the UI every year and proudly close to zero actual open bug reports? Then again, the port to Qt6/KF6, while started in 2022, might only be completed in 2025… if ever. So is this rather an end-of-life award for an aged, 16-year-old program?
Looking Back

Triggered by the event, some reflection on the past development follows, if only for the author himself, to refresh the memories of how one got here and what it means for the future. It turned into a longer text than anticipated.
2003-2006: Years of Initial Need for a Widget

The Okteta project was born in 2003; the first known code traces date back to May 13th, 2003. The first related code was imported into KDE’s code repository on August 15th, 2003, with the commit message:
Initial import of KHexEdit2, featuring a widget, a kpart and an app.
Most important is the widget...
Hopefully it will be usable enough ready for KDE 3.2...
The name “KHexEdit2” was chosen as the project was a re-implementation of KHexEdit, the hex editor part of KDE since KDE 1.1. At the time I was trying to create a viewer for executables and libraries (project name Binspekt, stalled soon after), for which I wanted a widget for displaying bytes. KHexEdit seemed to have no code that could be cleanly ripped out and reused, so work started on coding such a widget from scratch, and then also on consumers of it, to add more reason & motivation.
The formal request on September 29th that year to move the project’s code from the “kdenonbeta” area into release-covered areas had this optimistic line in it:
Finally there will be an app, build around the ReadWritePart. In 2004.
Turned out, life did not agree to that plan, thus 2004 passed without any such app. And so did 2005.
Still, back in February 2004, as part of KDE 3.2, the first elements of the still-named-KHexEdit2 project saw a first release. Though with a bit of complexity. (Note that at this time KDE was still also the name of the released bundle product, composed of the so-called modules kdelibs, kdebase, kdeutils, etc., where kdelibs held all the public libraries, kdebase the basic desktop components, and so on.) As adding a complete implementation of a hex editor widget to the official kdelibs for just a few potential users was declared unbalanced, instead just some header files with KHE-namespaced abstract interface classes were added, with an inline utility method to dynamically load any plugin implementing them. So the actual runtime and installation size of kdelibs did not increase. And the kdeutils module got to provide such a plugin by the name KBytesEdit. This was implemented by the hex editor widget library from the KHexEdit2 project, also in the kdeutils module, whose own API and headers were kept private. To confuse everyone, this library was still named libkhexedit, even though the actual KHexEdit program did not use it: the spirit of the naming was on the level of widgets and classes, not program names, and there the “2” suffix made no sense. Consumers of this construct became at least KPilot, the utility app for Palm Pilot handhelds, and the debugger plugin of the IDE KDevelop.
In March 2005, KDE 3.4 then included the KHexEdit2 KPart (read-only) as well, also located in the kdeutils module and also implemented using the private libkhexedit library. This made some people unhappy, as it registered (like its Okteta successor still does today) as handler for the MIME type application/octet-stream, so it popped up as the fallback KPart where no other handler was found. And seeing raw bytes instead of nothing has partially been perceived as “broken”.
2006-2008: Second Try on a Program, as Sample and with a New Dedicated Name

2006 arrived, and the author still had some ever-developing curiosity about the feasibility of writing viewer & editor programs from higher-level reusable & exchangeable components. Now, a byte array is a pretty simple data structure to use for a sample implementation of such a concept. And here we had a widget implementation for byte array viewing and editing fully under our control. And a certain level of C++ skills acquired by that time. This was just too tempting not to give it an own try, for fun and experience. So a June blog post “Fun with KHexEdit” also mentioned a program again and introduced the name designed in the meantime:
I am tackling the construction of a successor to KHexEdit again, projectname “Okteta”.
Later that year a first visual snippet was shared, showing how KHexEdit’s UI served as the initial template for Okteta, also to potentially help the transfer of existing users:
November 2006: first published screenshot of Okteta in some pre-Alpha state

To avoid duplicated efforts and to increase the pressure to deliver, two weeks later, on November 27th, an optimistic email was sent to Laurent Montel, KDE’s great eternal to-newer-API porting worker, as it was the time to port KDE software to Qt4/KDELibs4:
Hi Laurent,
please don't spend too much effort at the old program KHexEdit, I am quite far
on the way to write a successor, called Okteta. Concerning feature
compatibility, so far I implemented around 60 % of the features of KHexEdit,
and hope to do the last 40 % until at least January. Yes, no code yet in SVN
(besides the library), but that will change in three weeks, promised.
[...]
This time life mainly agreed to the plan. Though the promise about the code in SVN was delivered on only with almost a year’s delay. A first dump of the program code was committed into KDE’s code repository on October 23rd, 2007, with the commit message:
Uploading the Okteta program into KDE's playground, so the code isn't lost, after growing slowly only on my hdd for more than a year. Okteta is a planned successor to KHexEdit, yet misses all of it's functionality. With it's modular architecture, based on the co-developed lib kakao, it should soon offer more than would could be done with the monolithic KHexEdit. I hope.

As can be read, this first copy also featured a first draft of the previously mentioned own higher-level component system, initially named “Kakao”, later in 2009 renamed to “Kasten”. That first name was made up ad hoc, inspired by a drink on the table (at a learned safe distance from the keyboard), only to soon find it used similarly by a certain bigger IT company, even for a somewhat related subject; thus a new name was then designed less ad hoc.
And so some months later, in April 2008, Okteta entered the “kdereview” phase, proceeding after two weeks into the kdeutils module. In time for KDE 4.1, so its first release premiered as part of that in July 2008. Okteta also took over providing the KBytesEdit plugin for the kdelibs KHE interfaces as well as the KPart, which before had resided in subdirectories of the KHexEdit program sources. KHexEdit itself stayed unported to Qt4/KDELibs4, so Okteta, as planned, did not run into duplicated efforts and rivalry (or, it avoided competition, for good and bad).
July 2008: Okteta’s first release, as version 0.1, part of KDE 4.1

2008-2012: Years of Features Flow

With the foundations laid and releases established as part of KDE releases, the following years saw a number of features added, initially even with each KDE version:
January 2009: new features in Okteta 0.2, part of KDE 4.2
August 2009: new features in Okteta 0.3, part of KDE 4.3
February 2010: new features in Okteta 0.4, part of KDE SC 4.4
July 2011: new features in Okteta 0.7, part of KDE Applications 4.7
August 2012: new features in Okteta 0.9, part of KDE Applications 4.9

2010-Present: Sharing Functionality in Rich Public Libraries

From the very beginning of the project, the byte array viewing & editing feature was embeddable by 3rd-party software using the abstract KHE interfaces in kdelibs, or the KPart at least for viewing. Though this allowed only little control & few features, due to a limited API.
Starting with Okteta 0.4 in February 2010, the two sets of underlying libraries (the Kasten and the Okteta ones) used to implement the Okteta program, the Okteta KPart, and the KHE KBytesEdit plugin have been provided with stable public API.
The lower-Qt-level Okteta GUI library also started to be accompanied by a Qt Designer plugin, to allow easy use of the two provided classes of widgets also in Qt’s UI files.
February 2010: new Okteta widgets plugin for Qt Designer, part of Okteta 0.4

In February 2010, during a week-long Kate-KDevelop development meeting in Berlin, the intended flexibility of the new public libraries proved itself by enabling the creation of a plugin for KDevelop that integrates hex viewing & editing and all the Okteta tools in just those few days, for some nice satisfaction. The plugin was first officially released with KDevelop 4.1 in October 2010 and later also ported to the Qt5/KF5 version of KDevelop. For the current Qt6/KF6-based version of KDevelop the plugin is excluded from the build for now, given the current lack of a released Qt6/KF6-based version of the Okteta and Kasten libraries.
October 2010: Okteta plugin for KDevelop, first released with KDevelop 4.1

2012-Present: Switching from Features to Architecture, from Bundled to Stand-Alone

The port to Qt5/KF5 happened without any issues and was completed for version 0.15, released as part of KDE Applications 14.12. During the transformation of KDELibs4 into KDE Frameworks 5, the KHE interfaces were dropped there, as Okteta meanwhile directly provided public libraries. So this ported version of Okteta no longer provided the KBytesEdit plugin, but otherwise, as before, the public libraries and the KPart, next to the program itself.
After KDE Applications 17.12, as for a while there had been no feature development and only occasional work on the design of the Okteta & Kasten libraries, the Okteta project switched to a stand-alone release schedule. A 0.25 version branch was created, and patch releases were only done when there were user-relevant changes like bug fixes or bigger translation improvements.
Then 2019 brought the first, and for now also latest, version to provide at least one new feature to users, for which a 0.26 version branch was created. This version has meanwhile received 17 patch releases, with bug fixes, translation improvements, and other adjustments. And after 5 years of such polishing it is the one which now received the “Best Application” 2024 Akademy Award.
March 2019: new features in Okteta 0.26, released on its own schedule

2022-Present: Preparing for Qt6 & KF6

Okteta’s code base has mostly been quickly updated after any API deprecations, also as part of a “zero build warnings” strategy. So the approach taken by both the Qt & KF libraries, to strive for source-backward-compatible C++ API in their new major version 6, made the initial port of Okteta to Qt6 & KF6(-Alpha) a matter of less than a day in May 2022. That is, if one ignores one of the tools.
May 2022: preview of Okteta port to Qt6/KF6

The Structures tool, first developed by Alexander Richardson in 2010 for Okteta 0.4, was extended by him in 2011 for Okteta 0.7 to also support dynamic structure definitions, using JavaScript expressions. For this, QtScript has been used as the engine. Four years later though, in July 2015, Qt 5.5 declared QtScript deprecated. The officially recommended substitute, QJSEngine, turned out not to allow the dynamic translation of JavaScript properties and methods that the Structures tool relies on for the copy-avoiding mapping of the data blob interpretation into the JavaScript scene (beware: as far as the author understands it so far). So it could not be used as a drop-in replacement.
As finding a suitable and more future-proof JavaScript engine, or exploring a possible reimplementation using QJSEngine, is a complex task that also needs bigger chunks of time & focus, it has been postponed. Year after year. And thus now, nine years later in 2024, there is still not even a plan. And Qt6 no longer provides QtScript.
Just dropping the Structures tool is not a real option. It is a great feature, which also got some users. So a plan is needed, and work to be done by someone. As of now, my own, surely radical idea is to rewrite the whole Structures tool from scratch, still for the Qt5/KF5 version of Okteta. This should lead to fully wrapping the brain around this complex feature, instead of trying to explore it indirectly by understanding all the details of the current elaborate implementation, with the risk of misinterpreting some intentions. Starting from scratch might also allow finally sharing all the code used for the data formats with the separate Decoding Table tool, and perhaps even introducing a more generic approach to the data formats supported in the main mass display, besides the current byte values and 8-bit charset mapping. Idea, Should, Trying, Starting, Might, Perhaps… any words of confidence, please?
For now the initial Qt6/KF6 port is maintained as a single commit containing the complete dump of the “it builds, starts and does not crash on simple usage” changes, in a work branch continuously rebased onto the latest Qt5/KF5-based development branch. At the time of the real Qt6/KF6 port, this commit would then be properly split into the different aspects of porting. For now it serves to hold the door open while still staying on the other side.
Looking Forward, by Looking Back Some More

For sure the initial goal of the Okteta program, to do something for fun and experience, has been largely achieved. The current challenge with the needed replacement of the used JavaScript engine promises more experience, though initially no fun, to me at least.
And some feedback, as well as now even a KDE Akademy Award, hints that the created and publicly shared program has also served other people, for their serious and less serious needs. Possibly even some desperate Faust-like persons, “So that they may perceive the bytes which hold their doc[ument] together in its inmost fold.” (and even tweak them for the better as needed, owning their world or document). Though no contracts were done, and thus no souls here owned.
But as before, Okteta actually is just a sample implementation of the actual interest pursued here: exploring the feasibility of writing programs from higher-level reusable & exchangeable components, ideally also allowing random end users to mesh those components themselves into tailored solutions for certain needs. So if development has stalled as it has, both on the components concept but also the hex editing features, how to increase motivation again to set resources aside, and for which part?
Position in the Hex Editor Solution Space

When it comes to the Free Software solution space for hex editor needs, next to Okteta there is currently coverage ranging from simple ones like GHex, over wxHexEditor, which serves needs beyond Okteta with support for paged loading of big files and of the working memory, though it is sadly unmaintained currently, up to the newer, yet most impressive and very powerful ImHex (by what the web pages show, never tried).
So would people suffer if Okteta were gone for current platforms, at least for a while?
Open Component Systems vs. Closed Monolithic Blobs

Now the author, while being curious, never got around to actually studying existing solutions for the concept of higher-level component systems, or even deploying them in projects; he only ever saw some theoretic surfaces. And he is fully aware that the own experiment turning into something serious is rather a pipe dream. Even more so when, after soon two decades, the initially created TODO list is not even 10 % done; this won’t work out within this single human’s life. So the following is more like the wish-wash of a hobbyist bird watcher, who also keeps some chickens in the backyard to which things get compared. Or alike.
It seems composable systems with complex interfaces are not the dominating species in the Free Software ecosystems. The Linux kernel outpaced any microkernel systems; e.g. GNU Hurd is yet to be spotted outside OS zoos. The Eclipse Rich Client Platform, whose concepts were one of the original by-headlines inspirations for this project, seems to have maxed out some time ago as well, at least in the mainstream as seen through the author’s bubble. StarOffice^WOpenOffice^WLibreOffice has the UNO component system, but how many add-ons flourish on it? Then GNUstep would have enabled spreading the component concepts of OpenStep, but little has been seen? The later GNUstep-related, indeed striving-for-the-stars impressive project Étoilé seemed to be overloaded with related ideas, but sadly never lifted off. Then the GNOME project even had a reference to component systems in its initial full name “GNU Network Object Model Environment”, but its respective Bonobo framework, based on CORBA, faded away rather soon.
Also KDE started initially with implementations around the idea of components. To quote the KDE 1.0 announcement:
In view of these circumstances the KDE Project has developed a first rate compound document application framework, implementing the latest advances in framework technology and thus positioning itself in direct competition to such popular development frameworks as for example Microsoft’s MFC/COM/ActiveX technology. KDE’s KOM/OpenParts compound document technology enables developers to quickly create first rate applications implementing cutting edge technology. KDE’s KOM/OpenParts technology is build upon the open industry standards such as the object request broker standard CORBA 2.0.
KOM/OpenParts was then replaced in KDE 2.0 by KParts. Actually, the presence of such technology development was the deciding factor to go for KDE when the author back then got into “Linux” and had to choose between GNUstep, Enlightenment, GNOME, and KDE. These days though, KDE is run with claims like “All About Apps”. The generic KServices system was dismantled for KF6. The possibly latest new KPart (a Markdown viewer) was written years ago by this author, and the once KDE-central, KPart-driven program Konqueror is only a shadow of its former self. KOffice & Calligra as component-oriented office suites also died or stalled close to extinction. Generic plugins like the KParts are not even listed on apps.kde.org or elsewhere anymore, and are no longer mentioned as a concept in KDE Gear release announcements. Similarly, specific plugins like the Plasma applets are not listed separately either, but only as part of the respective products, in this example Plasma.
Additionally, packaging formats like Flatpak or Snap are embraced and promoted without discussion, and they push in the direction of isolated and frozen software programs. Even today, Flatpak’s metadata system AppStream, also otherwise embraced by KDE without discussion, has no concept of generic plugins, so KParts cannot be described properly.
In such an environment a component system would be limited to predefined fixed component sets in libraries, from which applications would then provide a setup and offer that to users. A bit like being able to shop, as a consumer at the kitchen equipment store, only from preset exhibition rooms, instead of meshing up items from different providers into one’s own tailored meal preparation “app”. Surely that is in the interest of the dominating providers, who then will see to bundling only their own items, adding bloat, and making only half the items good. As a consumer I desire to have the choice between pre-made bundles and self-assembled ones. Like there are times for all-inclusive vacations and times for self-organized ones. So with current KDE, but also the larger current Free Software “desktop” scene, as the real-world development environment, working on and thinking about component systems has the author feeling at odds.
So maybe the experiment with Kasten as a higher-level component system could also stop here. Perhaps some research could instead be done into why such systems failed in comparison. Like, was it due to the inflexibility of fixed published interfaces, where implementations cannot simply take temporary shortcuts and make adaptions for new feature needs, to be quickly back on the market? Could it be due to the more abstract and generic thinking possibly needed for component systems, where the majority of developers working for the market might prefer to think more concretely and case-by-case? Then again, might there still be a middle ground, where the advantages of high-level component systems are the deciding factor in the competition?
Next Release Scheduled: October 10th, Version 0.26.18

As described already for the early stages, there have always been ideas and plans… and delays… and also doubts… and then things happened. Locally there are lots of notes with ideas, and a number of code drafts and sketches stacked up over the years. And at least in the near future it seems there are still time windows, electricity, a laptop, and enough human capabilities to carry on tinkering with this stack.
The Okteta (& Kasten) project for now is alive, just lurking around in front of the next evolution step to take. Which might gain it new ground. Or extinction anyway. And while it is lurking, it gets a tad more feathers polished, with another bug fix release already scheduled for next month.
mark.ie: My LocalGov Drupal contributions for week-ending September 20th, 2024
Do one thing, do it well - this week I spent most of my time creating a live preview module for microsites.
Programiz: Getting Started with Python
PyCharm: How to Use FastAPI for Machine Learning
This is a guest post from Cheuk Ting Ho, a data scientist who contributes to multiple open-source libraries, such as pandas, Polars, and Jupyter Notebook.
FastAPI provides a quick way to build a backend service with Python. With a few decorators, you can turn your Python function into an API application.
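For instance, a minimal FastAPI application (a sketch; the route and function names here are made up for illustration) needs little more than a decorator on a plain function:

```python
from fastapi import FastAPI

app = FastAPI()

@app.get("/square/")
async def square(x: int = 0):
    # The type hint on `x` lets FastAPI parse and validate
    # the query parameter automatically.
    return {"x": x, "result": x * x}
```

Served with an ASGI server such as uvicorn (`uvicorn main:app`), this already answers requests like `/square/?x=3` with JSON.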
It is widely used by many companies including Microsoft, Uber, and Netflix. According to the Python Developers Survey, FastAPI usage has grown from 21% in 2021 to 29% in 2023. For data scientists, it’s the second most popular framework, with 31% using it.
In this blog post, we will cover the basics of FastAPI for data scientists who may want to build a quick prototype for their project.
What is FastAPI?

FastAPI is a popular web framework for building APIs with Python, based on standard Python type hints. It is intuitive and easy to use, and it can provide a production-ready application in a short period of time. It is fully compatible with OpenAPI and JSON Schema.
Why use FastAPI for machine learning?

Most teams working on machine learning projects consist of data scientists whose domains and professions lie on the statistics side of things. They may not have experience developing software or applications to ship their machine learning projects. FastAPI enables data scientists to easily create APIs for the following projects:
Deploying prediction models
The data science team may have trained a model for the prediction of the sales demand in a warehouse. To make it useful, they have to provide an API interface so other parts of the stock management system can use this new prediction functionality.
Suggestion engines
One of the very common uses of machine learning is as a system that provides suggestions based on the users’ choices. For example, if someone puts certain products in their shopping cart, more items can be suggested to that user. Such an e-commerce system requires an API call to the suggestion engine that takes input parameters.
Dynamic dashboards and reporting systems
Sometimes, reports for data science projects need to be presented as dashboards so users can inspect the results themselves. One possible approach is to have the data model provide an API. Frontend developers can use this API to create applications that allow users to interact with the data.
Advantages of using FastAPI

Compared to other Python web frameworks, FastAPI is simple yet fully functional. Mainly using decorators and type hints, it allows you to build a web application without the complexity of building a whole ORM (object-relational mapping) model and with the flexibility of using any database, including any SQL and NoSQL databases. FastAPI also provides automatic documentation generation, support for additional information and validation for query parameters, and good async support.
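As a small sketch of the query-parameter validation just mentioned (the endpoint and bounds are invented for illustration), `fastapi.Query` can attach constraints and documentation to a parameter:

```python
from fastapi import FastAPI, Query

app = FastAPI()

@app.get("/items/")
async def read_items(
    # ge/le add range validation; description shows up in the generated docs.
    limit: int = Query(10, ge=1, le=100, description="Number of items to return"),
):
    return {"limit": limit}
```

Requests with `limit` outside 1-100 are rejected with a descriptive 422 response, without any hand-written checking code.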
Fast development
Creating API calls in FastAPI is as easy as adding decorators in the Python code. Little to no backend experience is needed for anyone who wants to turn a Python function into an application that will respond to API calls.
Fast documentation
FastAPI provides automatic interactive API documentation using Swagger UI, which is an industry standard. No extra effort is required to build clear documentation with API call examples. This creates an advantage for busy data science teams who may not have the energy and expertise to write technical specifications and documentation.
Easy testing
Writing tests is one of the most important steps in software development, but it can also be one of the most tedious, especially when the time of the data science team is valuable. Testing FastAPI is made simple thanks to Starlette and HTTPX. Most of the time no monkey patching is needed and tests are easy to write and understand.
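A minimal pytest-style sketch of such a test (the `/ping` endpoint is made up for illustration):

```python
from fastapi import FastAPI
from fastapi.testclient import TestClient

app = FastAPI()

@app.get("/ping")
async def ping():
    return {"message": "pong"}

# TestClient (built on HTTPX) calls the app in-process; no running server needed.
client = TestClient(app)

def test_ping():
    response = client.get("/ping")
    assert response.status_code == 200
    assert response.json() == {"message": "pong"}
```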
Fast deployment
FastAPI comes with a CLI tool that can bridge development and deployment smoothly. It allows you to switch between development mode and production mode easily. Once development is completed, the code can be easily deployed using a Docker container with images that have Python prebuilt.
How to use FastAPI for a machine learning project

In this example, we will turn a classification prediction model that uses the Nearest Neighbors algorithm to predict the species of various penguins based on their bill and flipper length into a backend application. We will provide an API that takes parameters from the query parameters of a URL and gives back the prediction. This shows how a prototype can be made quickly by any data scientist with no backend development experience.
We will use a simple `KNeighborsClassifier` on the penguin data set as an example. Details of how to build the model will be omitted, but feel free to check out the relevant notebook here. In the following tutorial, we will focus on the usage of FastAPI and explain some fundamental concepts. We will be building a prototype to do so.
1. Start a FastAPI project with PyCharm

In this blog post, we will be using PyCharm Professional 2024.1. The best way to start using FastAPI is to create a FastAPI project with PyCharm. When you click New Project in PyCharm, you will be presented with a large selection of projects to choose from. Select the FastAPI tab:
From here, you can put in the name of your project and take advantage of other options such as initializing Git and the virtual environment that you want to use.
After doing so, you will see the basic structure of a FastAPI project set up for you.
There is also a `test_main.http` file set up for you to quickly test all the endpoints.
Next, set up our environment dependency with `requirements.txt` by selecting Sync Python Requirements under PyCharm’s Tools menu.
Then you can select the `requirements.txt` file to be used.
You can copy and use this `requirements.txt` file. We will be using pandas and scikit-learn for the machine learning part of the project. Also, add the `penguins.csv` file to your project directory.
Arrange your machine learning code in the `main.py` file. We will start with a script that trains our model:
```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

data = pd.read_csv('penguins.csv')
data = data.dropna()

le = preprocessing.LabelEncoder()
X = data[["bill_length_mm", "flipper_length_mm"]]
le.fit(data["species"])
y = le.transform(data["species"])

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

clf = Pipeline(
    steps=[("scaler", StandardScaler()), ("knn", KNeighborsClassifier(n_neighbors=11))]
)
clf.set_params().fit(X_train, y_train)
```

We can place the above code after `app = FastAPI()`. All of it will be run when we start the application.
However, there is a better way to run the start-up code we used to set up our model. We will cover that in a later part of the blog post.
Next we will look at how to add our model to the FastAPI functionality. As a first step, we will add a response to the root of the URL and simply return a message about our model in JSON format. Change the code in `async def root():` from “Hello world” to our message like this:
@app.get("/") async def root(): return { "Name": "Penguins Prediction", "description": "This is a penguins prediction model based on the bill length and flipper length of the bird.", }Now, test our application. First, we will start our application, which is easy in PyCharm. Just press the arrow button () next to your project name at the top.
If you are using the default settings, your application will run on http://127.0.0.1:8000. You can double-check that by looking at the prompt from the Run window.
Once the process has started, let’s go to `test_main.http` and press the first arrow button next to `GET`. From the HTTP Client in the Services window, you will see the response message that we put in.
The response JSON file is also saved for future inspection.
Next, we would like to let users make predictions by providing query parameters in the URL. Let’s add the code below after the `root` function.
@app.get("/predict/") async def predict(bill_length_mm: float = 0.0, flipper_length_mm: float = 0.0): param = { "bill_length_mm": bill_length_mm, "flipper_length_mm": flipper_length_mm } if bill_length_mm <=0.0 or flipper_length_mm <=0.0: return { "parameters": param, "error message": "Invalid input values", } else: result = clf.predict([[bill_length_mm, flipper_length_mm]]) return { "parameters": param, "result": le.inverse_transform(result)[0], }Here we set the default value of the `bill_length_mm` and `flipper_length_mm` to be 0 if the user didn’t input a value. We also add a check to see if either of the values is 0 and return an error message instead of trying to predict which penguin the input refers to.
If the inputs are not 0, we will use the model to make a prediction and use the encoder to do an inverse transformation to get the label of the predicted target, i.e. the name of the penguin species.
This is not the only way you can verify inputs. You can also consider using Pydantic for input verification.
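A possible sketch of such a Pydantic-based variant (note it accepts the parameters as a JSON request body, which turns the endpoint into a POST; `clf` and `le` are the classifier and label encoder trained earlier, and the route name is invented here):

```python
from pydantic import BaseModel, Field

class PenguinParams(BaseModel):
    # Field(gt=0) rejects zero or negative measurements before our code runs.
    bill_length_mm: float = Field(gt=0)
    flipper_length_mm: float = Field(gt=0)

@app.post("/predict-body/")
async def predict_body(params: PenguinParams):
    result = clf.predict([[params.bill_length_mm, params.flipper_length_mm]])
    return {
        # model_dump() is Pydantic v2; on Pydantic v1 use .dict() instead.
        "parameters": params.model_dump(),
        "result": le.inverse_transform(result)[0],
    }
```

With this, invalid input is answered with an automatic 422 validation error instead of our hand-rolled error message.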
If you are using the same version of FastAPI as stated in `requirements.txt`, FastAPI automatically refreshes the service and applies changes on save. Now put in a new URL in `test_main.http` to test (separated from the URL before with ###):
```
###
GET http://127.0.0.1:8000/predict/?bill_length_mm=40.3&flipper_length_mm=195
Accept: application/json
```

Press the arrow button next to our new URL and see the output.
Next you can try a URL with one or both of the parameters removed to see the error message:
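For example, omitting `flipper_length_mm` leaves it at its default of 0.0, so the endpoint answers with the error message (in the same `test_main.http` syntax as above):

```
###
GET http://127.0.0.1:8000/predict/?bill_length_mm=40.3
Accept: application/json
```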
Last, let’s look at how we can set up our model with FastAPI lifespan events. The advantage of doing that is we can make sure no request will be accepted while the model is still being set up and the memory used will be cleaned up afterward. To do that, we will use an `asynccontextmanager`. Before `app = FastAPI()` we will add:
```python
from contextlib import asynccontextmanager

ml_models = {}

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Set up the ML model here
    yield
    # Clean up the models and release resources
    ml_models.clear()
```

Now we will move the imports of pandas and scikit-learn to be alongside the other imports. We will also move our setup code inside the `lifespan` function, storing the machine learning model and LabelEncoder inside `ml_models` like this:
```python
from fastapi import FastAPI
from contextlib import asynccontextmanager

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

ml_models = {}

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Set up the ML model here
    data = pd.read_csv('penguins.csv')
    data = data.dropna()

    le = preprocessing.LabelEncoder()
    X = data[["bill_length_mm", "flipper_length_mm"]]
    le.fit(data["species"])
    y = le.transform(data["species"])

    X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

    clf = Pipeline(
        steps=[("scaler", StandardScaler()), ("knn", KNeighborsClassifier(n_neighbors=11))]
    )
    clf.set_params().fit(X_train, y_train)

    ml_models["clf"] = clf
    ml_models["le"] = le

    yield
    # Clean up the models and release resources
    ml_models.clear()
```

After that we will add the `lifespan=lifespan` parameter in `app = FastAPI()`:
```python
app = FastAPI(lifespan=lifespan)
```

Now save and test again. Everything should work, and we should see the same result as before.
Afterthought: When to train the model?

From our example, you may wonder when the model is trained. Since `clf` is trained at the beginning, i.e. when the service is launched, you may wonder why we do not train the model every time someone makes a prediction.
We do not want the model to be trained every time someone makes a call, because it costs way more resources to re-train everything. Additionally, it may cause race conditions since our FastAPI application is working concurrently. This is especially the case if we use live data that changes all the time.
Technically, we can set up an API to collect data and re-train the model (which we will demonstrate in the next example). Other options would be to schedule a re-train at a certain time when a certain amount of new data has been collected or to let a super user upload new data and trigger the re-training.
So far, we are aiming to build a prototype that runs locally. Check out this article on deploying a FastAPI project on a cloud service for more information.
What is concurrency?

To put it simply, concurrency is like when you are cooking in the kitchen, and while waiting for the water to boil, you go ahead and chop the vegetables. In the web service world, the server talks to many terminals, and since the communication between the server and the terminals is slower than most internal operations, the server does not talk to and serve the terminals one by one. Instead, it talks to and serves many of them at the same time while fulfilling their requests. You may want to check out this explanation in the FastAPI documentation.
In Python, this is achieved by using async code. In our FastAPI code, the use of `async def` instead of `def` is obvious evidence that FastAPI is working concurrently. There are other keywords used in Python async code, like `await` and `asyncio.get_event_loop`, but we won’t be able to cover them in this blog post.
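For a taste of what those look like outside FastAPI, here is a tiny stand-alone sketch (the sleep stands in for slow I/O such as a database call):

```python
import asyncio

async def fetch_label() -> str:
    # await suspends this coroutine so the event loop can serve other work.
    await asyncio.sleep(0.1)
    return "Adelie"

async def main():
    # Both "requests" are in flight at the same time.
    labels = await asyncio.gather(fetch_label(), fetch_label())
    print(labels)

asyncio.run(main())
```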
How to use FastAPI for an image classification project

To discover more FastAPI functionality, we will also add an image classification model based on the MNIST example in Keras to our application (we are using the TensorFlow backend). If you installed the `requirements.txt` provided, you should have Keras and Pillow installed for image processing and building a convolutional neural network (CNN).
1. Refactoring

Before we start, let’s refactor our code. To make the code more organized, we will put the model setup for the penguins prediction in a function:
```python
def penguins_pipeline():
    data = pd.read_csv('penguins.csv')
    data = data.dropna()

    le = preprocessing.LabelEncoder()
    X = data[["bill_length_mm", "flipper_length_mm"]]
    le.fit(data["species"])
    y = le.transform(data["species"])

    X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

    clf = Pipeline(
        steps=[("scaler", StandardScaler()), ("knn", KNeighborsClassifier(n_neighbors=11))]
    )
    clf.set_params().fit(X_train, y_train)

    return clf, le
```

Then we rewrite the lifespan function. With full-line code completion in PyCharm, it is very easy:
2. Set up a CNN model for MNIST prediction

In a similar fashion to the penguin prediction model, we create a function for MNIST prediction (and we will store the meta parameters globally):
```python
# MNIST model meta parameters
num_classes = 10
input_shape = (28, 28, 1)
batch_size = 128
epochs = 15

def mnist_pipeline():
    # Load the data and split it between train and test sets
    (x_train, y_train), _ = keras.datasets.mnist.load_data()

    # Scale images to the [0, 1] range
    x_train = x_train.astype("float32") / 255
    # Make sure images have shape (28, 28, 1)
    x_train = np.expand_dims(x_train, -1)

    # convert class vectors to binary class matrices
    y_train = keras.utils.to_categorical(y_train, num_classes)

    model = keras.Sequential(
        [
            keras.Input(shape=input_shape),
            layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
            layers.MaxPooling2D(pool_size=(2, 2)),
            layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
            layers.MaxPooling2D(pool_size=(2, 2)),
            layers.Flatten(),
            layers.Dropout(0.5),
            layers.Dense(num_classes, activation="softmax"),
        ]
    )

    model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

    model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1)

    return model
```

Then add the model setup in the lifespan function:
ml_models["cnn"] = mnist_pipeline()Note that since this is added, every time you make changes to `main.py` and save, the model will be trained again. It can take a bit of time. So in development you may want to use a dummy model that requires no training time at all or a pre-trained model instead. After training, the CNN model will be ready to go.
3. Set up a POST endpoint for uploading an image file for prediction

To set up an endpoint that takes an uploaded file, we have to use UploadFile in FastAPI:
@app.post("/predict-image/") async def predicct_upload_file(file: UploadFile): img = await file.read() # process image for prediction img = Image.open(BytesIO(img)).convert('L') img = np.array(img).astype("float32") / 255 img = np.expand_dims(img, (0, -1)) # predict the result result = ml_models["cnn"].predict(img).argmax(axis=-1)[0] return {"filename": file.filename, "result": str(result)}Please note that this is a POST endpoint (so far we have only set up GET endpoints).
Don’t forget to import `UploadFile` from `fastapi`:
from fastapi import FastAPI, UploadFileAnd `Image` from Pillow. We are also using `BytesIO` from the `io` module:
from PIL import Image from io import BytesIOTo test this using the PyCharm HTTP Client with a test image file, we will make use of the `multipart/form-data` encoding. You can check out the HTTP request syntax here. This is what you will put in the `test_in.http` file:
```
###
POST http://127.0.0.1:8000/predict-image/ HTTP/1.1
Content-Type: multipart/form-data; boundary=boundary

--boundary
Content-Disposition: form-data; name="file"; filename="test_img0.png"

< ./test_img0.png
--boundary--
```

4. Add an API to collect data and trigger retraining

Now, here comes the retraining. We set up a POST endpoint like the one above to accept a zip file containing training images and labels. The zip file will then be processed and the training data prepared. After that we will fit the CNN model again:
@app.post("/upload-images/") async def retrain_upload_file(file: UploadFile): img_files = [] labels_file = None train_img = None with ZipFile(BytesIO(await file.read()), 'r') as zfile: for fname in zfile.namelist(): if fname[-4:] == '.txt' and fname[:2] != '__': labels_file = fname elif fname[-4:] == '.png': img_files.append(fname) if len(img_files) == 0: return {"error": "No training images (png files) found."} else: for fname in sorted(img_files): with zfile.open(fname) as img_file: img = img_file.read() # process image img = Image.open(BytesIO(img)).convert('L') img = np.array(img).astype("float32") / 255 img = np.expand_dims(img, (0, -1)) if train_img is None: train_img = img else: train_img = np.vstack((train_img, img)) if labels_file is None: return {"error": "No training labels file (txt file) found."} else: with zfile.open(labels_file) as labels: labels_data = labels.read() labels_data = labels_data.decode("utf-8").split() labels_data = np.array(labels_data).astype("int") labels_data = keras.utils.to_categorical(labels_data, num_classes) # retrain model ml_models["cnn"].fit(train_img, labels_data, batch_size=batch_size, epochs=epochs, validation_split=0.1) return {"message": "Model trained successfully."}Remember to import `ZipFile`:
```python
from zipfile import ZipFile
```

If we now try the endpoint with this zip file of 1,000 retraining images and labels, you will see that it takes a moment for the response to come, as the training takes a while:
```
POST http://127.0.0.1:8000/upload-images/ HTTP/1.1
Content-Type: multipart/form-data; boundary=boundary

--boundary
Content-Disposition: form-data; name="file"; filename="training_data.zip"

< ./retrain_img.zip
--boundary--
```

Imagine the zip file containing more training data, or a more complicated model being retrained. The user would then have to wait a long time, and it would seem to them like things are not working.
5. Retrain the model with BackgroundTasks

A better way to handle retraining is, after receiving the training data, to process it and check that it is in the right format, then give a response saying that the retraining has started, and train the model in `BackgroundTasks`. Here is how to do it. First, we will add `BackgroundTasks` to our `upload-images` endpoint:
@app.post("/upload-image/") async def retrain_upload_file(file: UploadFile, background_tasks: BackgroundTasks): ...Remember to import it from `fastapi`:
from fastapi import FastAPI, UploadFile, BackgroundTasksThen, we will put the fitting of the model into the `background_tasks`:
```python
# retrain model
background_tasks.add_task(
    ml_models["cnn"].fit,
    train_img,
    labels_data,
    batch_size=batch_size,
    epochs=epochs,
    validation_split=0.1
)
```

Also, we will update the message in the response:
return {"message": "Data received successfully, model training has started."}Now test the endpoint again. You will see that the response has arrived much quicker, and if you look at the Run window, you’ll see that the training is running after the response has arrived.
At this point, more functionality can be added, for example, an option to notify the user later (e.g. via email) when the training is finished, or tracking the training progress in a dashboard when a full application is built.
FastAPI provides an easy way to convert your data science project into a working application in several easy steps. It is perfect for data science teams that want to provide an application prototype for their machine learning model which can be further developed into a professional web application if needed.
PyCharm Professional is the Python IDE that allows you to develop FastAPI applications more easily with a preconfigured project for FastAPI, coding assistance, tailored run/debug configurations, and the Endpoints tool window for managing API endpoints efficiently.
Get a free trial of PyCharm Professional

In this blog post, we showed the process of providing a simple API for a pre-trained prediction model. To learn more about FastAPI, I would suggest checking out the official FastAPI documentation. If you’re choosing between different frameworks, explore how FastAPI differs from Django.
About the author: Cheuk Ting Ho

Cheuk has been a Data Scientist at various companies – a job that demands high numerical and programming skills, especially in Python. Following her passion for the tech community, Cheuk has been a Developer Advocate for three years. She also contributes to multiple open-source libraries like Hypothesis, Pytest, pandas, Polars, PyO3, Jupyter Notebook, and Django. Cheuk is currently a consultant and trainer at CMD Limes.
Implementing an Audio Mixer, Part 2
In Part 1, we covered PCM audio and superimposing waveforms, and developed an algorithm to combine an arbitrary number of audio streams into one.
Now we need to use these ideas to finish a full implementation using Qt Multimedia.
Using Qt Multimedia for Audio Device Access

So what do we need? Well, we want to use a single QAudioOutput, to which we pass an audio device and a supported audio format.
We can get those like this:
```cpp
const QAudioDeviceInfo &device = QAudioDeviceInfo::defaultOutputDevice();
const QAudioFormat &format = device.preferredFormat();
```

Let’s construct our QAudioOutput object using the device and format:
```cpp
static QAudioOutput audioOutput(device, format);
```

Now, to use it to write data, we have to call start on the audio output, passing in a QIODevice *.
Normally we would use the QIODevice subclass QBuffer for a single audio buffer. But in this case, we want our own subclass of QIODevice, so we can combine multiple buffers into one IO device.
We’ll call our subclass MixerStream. This is where we will do our bufferCombine, and keep our member list of streams mStreams.
We will also need another stream type for mStreams. For now let’s just call it DecodeStream, forward declare it, and worry about its implementation later.
One thing that’s good to know at this point is that DecodeStream objects will get the data buffers we need by decoding audio data from a file. Because of this, we’ll need to keep our audio format from above as a data member mFormat. Then we can pass it to decoders when they need it.
Implementing MixerStream

Since we are subclassing QIODevice, we need to provide reimplementations for these two protected virtual functions:
```cpp
virtual qint64 QIODevice::readData(char *data, qint64 maxSize);
virtual qint64 QIODevice::writeData(const char *data, qint64 maxSize);
```

We also want to provide a way to open new streams that we’ll add to mStreams, given a filename. We’ll call this function openStream. We can also allow looping a stream multiple times, so let’s add a parameter for that and give it a default value of 1.
Additionally, we’ll need a user-defined destructor to delete any pointers in the list that might remain if the MixerStream is abruptly destructed.
```cpp
// mixerstream.h

#pragma once

#include <QAudioFormat>
#include <QAudioOutput>
#include <QIODevice>

class DecodeStream;

class MixerStream : public QIODevice
{
    Q_OBJECT

public:
    explicit MixerStream(const QAudioFormat &format);
    ~MixerStream();

    void openStream(const QString &fileName, int loops = 1);

protected:
    qint64 readData(char *data, qint64 maxSize) override;
    qint64 writeData(const char *data, qint64 maxSize) override;

private:
    QAudioFormat mFormat;
    QList<DecodeStream *> mStreams;
};
```

Notice that combineSamples isn’t in the header. It’s a pretty basic function that doesn’t require any members, so we can just implement it as a free function.
Let’s put it in a header mixer.h and wrap it in a namespace:
```cpp
// mixer.h

#pragma once

#include <QtGlobal>

#include <limits>

namespace Mixer
{
inline qint16 combineSamples(qint32 samp1, qint32 samp2)
{
    const auto sum = samp1 + samp2;

    if (std::numeric_limits<qint16>::max() < sum)
        return std::numeric_limits<qint16>::max();

    if (std::numeric_limits<qint16>::min() > sum)
        return std::numeric_limits<qint16>::min();

    return sum;
}
} // namespace Mixer
```

There are some very basic things we can get out of the way quickly in the MixerStream cpp file. Recall that we must implement these member functions:
```cpp
explicit MixerStream(const QAudioFormat &format);
~MixerStream();

void openStream(const QString &fileName, int loops = 1);

qint64 readData(char *data, qint64 maxSize) override;
qint64 writeData(const char *data, qint64 maxSize) override;
```

The constructor is very simple:
```cpp
MixerStream::MixerStream(const QAudioFormat &format)
    : mFormat(format)
{
    setOpenMode(QIODevice::ReadOnly);
}
```

Here we use setOpenMode to automatically open our device in read-only mode, so we don’t have to call open() directly from outside the class.
Also, since it’s going to be read-only, our reimplementation of QIODevice::writeData will do nothing:
```cpp
qint64 MixerStream::writeData([[maybe_unused]] const char *data,
                              [[maybe_unused]] qint64 maxSize)
{
    Q_ASSERT_X(false, "writeData", "not implemented");
    return 0;
}
```

The custom destructor we need is also quite simple:
```cpp
MixerStream::~MixerStream()
{
    while (!mStreams.empty())
        delete mStreams.takeLast();
}
```

readData will be almost exactly the same as the implementation we did earlier, but returning qint64. The return value is meant to be the amount of data written, which in our case is just the maxSize argument given to it, as we write fixed-size buffers.
Additionally, we should call qAsConst (or std::as_const) on mStreams in the range-for to avoid detaching the Qt container. For more on qAsConst and range-based for loops, see Jesper Pedersen’s blog post on the topic.
```cpp
qint64 MixerStream::readData(char *data, qint64 maxSize)
{
    memset(data, 0, maxSize);

    constexpr qint16 bitDepth = sizeof(qint16);
    const qint16 numSamples = maxSize / bitDepth;

    for (auto *stream : qAsConst(mStreams)) {
        auto *cursor = reinterpret_cast<qint16 *>(data);
        qint16 sample;

        for (int i = 0; i < numSamples; ++i, ++cursor)
            if (stream->read(reinterpret_cast<char *>(&sample), bitDepth))
                *cursor = Mixer::combineSamples(sample, *cursor);
    }

    return maxSize;
}
```

That only leaves us with openStream. This one will require us to discuss DecodeStream and its interface.
The function should construct a new DecodeStream on the heap, which will need a file name and format. DecodeStream, as implied by its name, needs to decode audio files to PCM data. We’ll use a QAudioDecoder within DecodeStream to accomplish this, and for that, we need to pass mFormat to the constructor. We also need to pass loops to the constructor, as each stream can have a different number of loops.
Now our constructor call will look like this:
```cpp
DecodeStream(fileName, mFormat, loops);
```

We can then use operator<< to add it to mStreams.
Finally, we need to remove it from the list when it’s done. We’ll give it a Qt signal, finished, and connect it to a lambda expression that will remove the stream from the list and delete the pointer.
Our completed openStream function now looks like this:
```cpp
void MixerStream::openStream(const QString &fileName, int loops)
{
    auto *decodeStream = new DecodeStream(fileName, mFormat, loops);
    mStreams << decodeStream;

    connect(decodeStream, &DecodeStream::finished, this, [this, decodeStream]() {
        mStreams.removeAll(decodeStream);
        decodeStream->deleteLater();
    });
}
```

Recall from earlier that we call read on a stream, which takes a char * to which the read data will be copied and a qint64 representing the size of the data.
This is a QIODevice function, which will internally call readData. Thus, DecodeStream also needs to be a QIODevice.
Getting PCM Data for DecodeStream

In DecodeStream, we need readData to spit out PCM data, so we need to decode our audio file to get its contents in PCM format. In Qt Multimedia, we use a QAudioDecoder for this. We pass it an audio format to decode to, and a source device, in this case a QFile file handle for our audio file.
When a QAudioDecoder‘s start method is called, it will begin decoding the source file in a non-blocking manner, emitting a signal bufferReady when a full buffer of decoded PCM data is available.
On that signal, we can call the decoder’s read method, which gives us a QAudioBuffer. To store the data in a member of DecodeStream, we use a QByteArray, which we can interact with using QBuffers to get a QIODevice interface for reading and writing. This is the ideal way to work with buffers of bytes to read or write in Qt.
We’ll make two QBuffers: one for writing data to the QByteArray (we’ll call it mInputBuf), and one for reading from the QByteArray (we’ll call it mOutputBuf). The reason for using two buffers rather than one read/write buffer is so that the read and write positions can be independent. Otherwise, we would encounter more stuttering.
So when we get the bufferReady signal, we’ll want to do something like this:
```cpp
const QAudioBuffer buffer = mDecoder.read();
mInputBuf.write(buffer.data<char>(), buffer.byteCount());
```

We’ll also need to have some sort of state enum. The reason for this is that when we are finished with the stream and emit finished(), we remove and delete the stream from a connected lambda expression, but read might still be called before that has completed. Thus, we want to only read from the buffer when the state is Playing.
Let’s update mixer.h to put the enum in namespace Mixer:
```cpp
#pragma once

#include <QtGlobal>

#include <limits>

namespace Mixer
{
enum State {
    Playing,
    Stopped
};

inline qint16 combineSamples(qint32 samp1, qint32 samp2)
{
    const auto sum = samp1 + samp2;

    if (std::numeric_limits<qint16>::max() < sum)
        return std::numeric_limits<qint16>::max();

    if (std::numeric_limits<qint16>::min() > sum)
        return std::numeric_limits<qint16>::min();

    return sum;
}
} // namespace Mixer
```

Implementing DecodeStream

Now that we understand all the data members we need to use, let’s see what our header for DecodeStream looks like:
```cpp
// decodestream.h

#pragma once

#include "mixer.h"

#include <QAudioDecoder>
#include <QBuffer>
#include <QFile>

class DecodeStream : public QIODevice
{
    Q_OBJECT

public:
    explicit DecodeStream(const QString &fileName, const QAudioFormat &format, int loops);

protected:
    qint64 readData(char *data, qint64 maxSize) override;
    qint64 writeData(const char *data, qint64 maxSize) override;

signals:
    void finished();

private:
    QFile mSourceFile;
    QByteArray mData;
    QBuffer mInputBuf;
    QBuffer mOutputBuf;
    QAudioDecoder mDecoder;
    QAudioFormat mFormat;
    Mixer::State mState;
    int mLoopsLeft;
};
```

In the constructor, we’ll initialize our private members, open the DecodeStream in read-only mode (like we did earlier), make sure we open the QFile and QBuffers successfully, and finally set up our QAudioDecoder.
```cpp
DecodeStream::DecodeStream(const QString &fileName, const QAudioFormat &format, int loops)
    : mSourceFile(fileName)
    , mInputBuf(&mData)
    , mOutputBuf(&mData)
    , mFormat(format)
    , mState(Mixer::Playing)
    , mLoopsLeft(loops)
{
    setOpenMode(QIODevice::ReadOnly);

    const bool valid = mSourceFile.open(QIODevice::ReadOnly) &&
                       mOutputBuf.open(QIODevice::ReadOnly) &&
                       mInputBuf.open(QIODevice::WriteOnly);
    Q_ASSERT(valid);

    mDecoder.setAudioFormat(mFormat);
    mDecoder.setSourceDevice(&mSourceFile);
    mDecoder.start();

    connect(&mDecoder, &QAudioDecoder::bufferReady, this, [this]() {
        const QAudioBuffer buffer = mDecoder.read();
        mInputBuf.write(buffer.data<char>(), buffer.byteCount());
    });
}
```

Once again, our QIODevice subclass is read-only, so our writeData method looks like this:
```cpp
qint64 DecodeStream::writeData([[maybe_unused]] const char *data,
                               [[maybe_unused]] qint64 maxSize)
{
    Q_ASSERT_X(false, "writeData", "not implemented");
    return 0;
}
```

Which leaves us with the last part of the implementation, DecodeStream’s readData function.
We zero out the char * with memset to avoid any noise if there are areas that are not overwritten. Then we simply read from the QByteArray into the char * if mState is Mixer::Playing.
We check whether we have finished reading the file with QBuffer::atEnd(), and if we have, we decrement the loops remaining. If it’s zero now, that was the last (or only) loop, so we set mState to Stopped and emit finished(). Either way, we seek back to position 0. Then, if there are loops left, reading starts from the beginning again.
```cpp
qint64 DecodeStream::readData(char *data, qint64 maxSize)
{
    memset(data, 0, maxSize);

    if (mState == Mixer::Playing) {
        mOutputBuf.read(data, maxSize);

        if (mOutputBuf.size() && mOutputBuf.atEnd()) {
            if (--mLoopsLeft == 0) {
                mState = Mixer::Stopped;
                emit finished();
            }
            mOutputBuf.seek(0);
        }
    }

    return maxSize;
}
```

Now that we’ve implemented DecodeStream, we can actually use MixerStream to play two audio files at the same time!
Using MixerStream

Here’s an example snippet that shows how MixerStream can be used to route two simultaneous audio streams into one system mixer channel:
```cpp
const auto &device = QAudioDeviceInfo::defaultOutputDevice();
const auto &format = device.preferredFormat();

auto mixerStream = std::make_unique<MixerStream>(format);

auto *audioOutput = new QAudioOutput(device, format);
audioOutput->setVolume(0.5);
audioOutput->start(mixerStream.get());

mixerStream->openStream(QStringLiteral("/path/to/some/sound.wav"));
mixerStream->openStream(QStringLiteral("/path/to/something/else.mp3"), 3);
```

Final Remarks

The code in this series of posts is largely a reimplementation of Lova Widmark’s project QtMixer. Huge thanks to her for a great and lightweight implementation. Check the project out if you want to use something like this for a GPL-compliant project (and don’t mind that it uses qmake).
About KDAB
If you like this article and want to read similar material, consider subscribing via our RSS feed.
Subscribe to KDAB TV for similar informative short video content.
KDAB provides market leading software consulting and development services and training in Qt, C++ and 3D/OpenGL. Contact us.
The post Implementing an Audio Mixer, Part 2 appeared first on KDAB.
Vasudev Kamath: Note to Self: Enabling Unified Kernel Image on Debian
Note
These steps may not work on your system if you are using the default Debian installation. This guide assumes that your system is using systemd-boot as the bootloader, which is explained in the post linked below.
A unified kernel image (UKI) is a single executable that can be booted directly from UEFI firmware or automatically sourced by bootloaders with little or no configuration. It combines a UEFI boot stub program like systemd-stub(7), a Linux kernel image, an initrd, and additional resources into a single UEFI PE file.
systemd-boot already provides a hook for kernel installation via /etc/kernel/postinst.d/zz-systemd-boot. We just need a couple of additional configurations to generate the UKI image.
Installation and Configuration

Install the systemd-ukify package:
sudo apt-get install systemd-ukify

Create the following configuration in /etc/kernel/install.conf:
layout=uki
initrd_generator=dracut
uki_generator=ukify

This configuration specifies how to generate the UKI image for the installed kernel and which generator to use.
Define the kernel command line for the UKI image. Create /etc/kernel/uki.conf with the following content:
[UKI]
Cmdline=@/etc/kernel/cmdline
To apply these changes, regenerate the UKI image for the currently running kernel:
sudo dpkg-reconfigure linux-image-$(uname -r)

Verification

Use the bootctl list command to verify the presence of a "Type #2" entry for the current kernel. The output should look similar to this:
bootctl list

type: Boot Loader Specification Type #2 (.efi)
title: Debian GNU/Linux trixie/sid (2d0080583f1a4127ac0b073b1a9d3e61-6.10.9-amd64.efi) (default) (selected)
id: 2d0080583f1a4127ac0b073b1a9d3e61-6.10.9-amd64.efi
source: /boot/efi/EFI/Linux/2d0080583f1a4127ac0b073b1a9d3e61-6.10.9-amd64.efi
sort-key: debian
linux: /boot/efi/EFI/Linux/2d0080583f1a4127ac0b073b1a9d3e61-6.10.9-amd64.efi
options: systemd.gpt_auto=no quiet root=LABEL=root_disk ro systemd.machine_id=2d0080583f1a4127ac0b073b1a9d3e61

type: Boot Loader Specification Type #2 (.efi)
title: Debian GNU/Linux trixie/sid (2d0080583f1a4127ac0b073b1a9d3e61-6.10.7-amd64.efi)
id: 2d0080583f1a4127ac0b073b1a9d3e61-6.10.7-amd64.efi
source: /boot/efi/EFI/Linux/2d0080583f1a4127ac0b073b1a9d3e61-6.10.7-amd64.efi
sort-key: debian
linux: /boot/efi/EFI/Linux/2d0080583f1a4127ac0b073b1a9d3e61-6.10.7-amd64.efi
options: systemd.gpt_auto=no quiet root=LABEL=root_disk ro systemd.machine_id=2d0080583f1a4127ac0b073b1a9d3e61

type: Automatic
title: Reboot Into Firmware Interface
id: auto-reboot-to-firmware-setup
source: /sys/firmware/efi/efivars/LoaderEntries-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f

Cleanup and Reboot

Once the "Type #2" entries are generated, remove any "Type #1" entries using the bootctl unlink command. After this, reboot your system to boot from the UKI-based image.
Future Considerations

The primary use case for a UKI image is secure boot. Signing the UKI image can also be configured in the settings above, but this guide does not cover that process, as it requires setting up secure boot on your system.
Stack Abuse: Securing Your Email Sending With Python: Authentication and Encryption
Email encryption and authentication are modern security techniques that you can use to protect your emails and their content from unauthorized access.
Everyone, from individuals to business owners, uses email for official communication, which may contain sensitive information. Securing email therefore matters, especially with cyberattacks like phishing and smishing on the rise.
In this article, I'll discuss how to send emails in Python securely using email encryption and authentication.
Setting Up Your Python Environment

Before you start writing the code for sending emails, first set up your Python environment with the configurations and libraries you'll need.
You can send emails in Python using:
- Simple Mail Transfer Protocol (SMTP): This application-level protocol simplifies the process, since Python offers a built-in module (smtplib) for sending emails. It's suitable for businesses of all sizes, as well as individuals, to automate secure email sending in Python. We're using the Gmail SMTP service in this article.
- An email API: You can leverage a third-party API like Mailtrap Python SDK, SendGrid, Gmail API, etc., to dispatch emails in Python. This method offers more features and high email delivery speeds, although it requires some investment.
In this tutorial, we're opting for the first choice - sending emails in Python using SMTP, facilitated by the smtplib library. This library uses the RFC 821 protocol and interacts with SMTP and mail servers to streamline email dispatch from your applications. Additionally, you'll use modules for email encryption, authentication, and formatting; the ones used in this tutorial (smtplib, ssl, and the email package) all ship with Python's standard library.
Step 1: Install Python

Install Python on your computer (Windows, macOS, Linux, etc.). You can download and install it from the official Python website.
If you've already installed it, run this code to verify it:
python --version
Step 2: Install Necessary Modules and Libraries

- smtplib: This handles SMTP communications. Use the code below to import smtplib and connect with your email server:
import smtplib

- email module: This provides classes to construct and parse email messages and set headers like Subject, To, and From. It also facilitates email encoding and decoding with Multipurpose Internet Mail Extensions (MIME).
- MIMEText: It's used for the text body of your emails, whether plain text or HTML. Import this using the code below:
from email.mime.text import MIMEText

- MIMEMultipart: Use this class to add attachments and text sections separately in your email.
from email.mime.multipart import MIMEMultipart

- ssl: It provides Secure Sockets Layer (SSL) encryption.
To send emails using the Gmail SMTP email service, I recommend creating a test account to develop the code. Delete the account once you've tested the code.
The reason is, you'll need to modify the security settings of your Gmail account to enable access from the Python code for sending emails. This might expose the login details, compromising security. In addition, it will flood your account with too many test emails.
So, instead of using your own Gmail account, create a new one for creating and testing the code. Here's how to do this:
- Create a fresh Gmail account
- Set up your app password:
Google Account > Security > Turn on 2-Step Verification > Security > Set up an App Password
Next, define a name for the app password and click on "Generate". You'll get a 16-digit password after following some instructions on the screen. Store the password safely.
Use this password while sending emails in Python. Here, we're using Gmail SMTP, but if you want to use another mail service provider, follow the same process. Alternatively, contact your company's IT team to seek support in accessing your SMTP server.
Email Authentication With Python

Email authentication is a security mechanism that verifies the sender's identity, ensuring the emails from a domain are legitimate. If you have no email authentication mechanism in place, your emails might land in spam folders, or malicious actors can spoof or intercept them. This could affect your email delivery rates and the sender's reputation.
This is the reason you must enable Python email authentication mechanisms and protocols, such as:
- SMTP authentication: If you're sending emails using an SMTP server like Gmail SMTP, you can use this method of authentication. It verifies the sender's authenticity when sending emails via a specific mail server.
- SPF: Stands for Sender Policy Framework and checks whether the IP address of the sending server is among those authorized to send email for the domain, as published in the domain's DNS records.
- DKIM: Stands for DomainKeys Identified Mail and is used to add a digital signature to emails to ensure no one can alter the email's content while it's in transmission. The receiver's server will then verify the digital signature. Thus, all your emails and their content stay secure and unaltered.
- DMARC: Stands for Domain-based Message Authentication, Reporting, and Conformance. DMARC instructs mail servers what to do if an email fails authentication. In addition, it provides reports upon detecting any suspicious activities on your domain. (Example DNS records for these protocols follow this list.)
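SPF, DKIM, and DMARC live in your domain's DNS rather than in Python code. As a rough illustration (the hostname, selector, key, and policy values below are placeholders, not values taken from this article), the TXT records tend to look like this:

example.com.                      TXT  "v=spf1 include:_spf.google.com ~all"
selector._domainkey.example.com.  TXT  "v=DKIM1; k=rsa; p=<base64-public-key>"
_dmarc.example.com.               TXT  "v=DMARC1; p=quarantine; rua=mailto:dmarc-reports@example.com"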
To authenticate your email in Python using SMTP, the smtplib library is useful. Here's how Python SMTP security works:
import smtplib

server = smtplib.SMTP('smtp.domain1.com', 587)
server.starttls()  # Start TLS for secure connection
server.login('my_email@domain1.com', 'my_password')

message = "Subject: Test Email."
server.sendmail('my_email@domain1.com', 'receiver@domain2.com', message)
server.quit()

Implementing email authentication will add an additional layer of security to your emails and protect them from attackers or from being marked as spam.
Encrypting Emails With Python

Encrypting emails enables you to protect your email's content so that only authorized senders and receivers can access or view it. Encrypting emails with Python means using encryption techniques to encode the email message, transforming it into a secure, unreadable format (also known as ciphertext).
This way, email encryption secures the message from unauthorized access or attackers even if they intercept the email.
Here are different types of email encryption:
- SSL: This stands for Secure Sockets Layer, one of the most popular and widely used encryption protocols. SSL ensures email confidentiality by encrypting data transmitted between the mail server and the client.
- TLS: This stands for Transport Layer Security and is the most common email encryption protocol today; it's the modern successor to SSL. It encrypts the connection between an email client and the mail server to prevent anyone from intercepting the email during its transmission.
- E2EE: This stands for end-to-end encryption, ensuring only the intended recipient with valid credentials can decrypt the email content and read it. It aims to prevent email interception and secure the message.
If your mail server requires SSL encryption, here's how to send an email in Python:
import smtplib
import ssl

context = ssl.create_default_context()

# SSL connections require port 465
server = smtplib.SMTP_SSL('smtp.domain1.com', 465, context=context)
server.login('my_email@domain1.com', 'my_password')

message = "Subject: SSL Encrypted Email."
server.sendmail('my_email@domain1.com', 'receiver@domain2.com', message)
server.quit()

For TLS connections, you'll need the smtplib library:
import smtplib

server = smtplib.SMTP('smtp.domain1.com', 587)  # TLS requires port 587
server.starttls()  # Start TLS encryption
server.login('my_email@domain1.com', 'my_password')

message = "Subject: TLS Encrypted Email."
server.sendmail('my_email@domain1.com', 'receiver@domain2.com', message)
server.quit()

For end-to-end encryption, you'll need more advanced libraries or tools such as GnuPG, OpenSSL, Signal Protocol, and more.
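To give a flavor of the end-to-end approach, here's a minimal sketch using the third-party python-gnupg package (an assumption on our part; it isn't covered by this article, and it requires GnuPG installed plus the recipient's public key already imported into the local keyring):

import gnupg

gpg = gnupg.GPG()  # assumes the gpg binary and keyring are available

plaintext = "This body will only be readable by the key holder."
encrypted = gpg.encrypt(plaintext, recipients=['receiver@domain2.com'])

if encrypted.ok:
    # str(encrypted) is ASCII-armored ciphertext, usable as an email body
    body = str(encrypted)
else:
    raise RuntimeError("Encryption failed: " + encrypted.status)

The resulting body can then be sent with smtplib exactly like the plain messages above; only the recipient holding the matching private key can decrypt it.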
Combining Authentication and Encryption

Email security with Python requires both encryption and authentication. This ensures that mail servers find the email legitimate and that it stays safe from cyber attackers and unauthorized access during transmission. For email encryption, you can use either SSL or TLS and combine it with SMTP authentication to establish a robust email connection.
Now that you know how to enable email encryption and authentication in your emails, let's examine some complete code examples to understand how you can send secure emails in Python using Gmail SMTP and email encryption (SSL).
Code Examples
1. Sending a Plain Text Email

import smtplib
from email.mime.text import MIMEText

subject = "Plain Text Email"
body = "This is a plain text email using Gmail SMTP and SSL."
sender = "sender1@gmail.com"
receivers = ["receiver1@gmail.com", "receiver2@gmail.com"]
password = "my_password"

def send_email(subject, body, sender, receivers, password):
    msg = MIMEText(body)
    msg['Subject'] = subject
    msg['From'] = sender
    msg['To'] = ', '.join(receivers)
    with smtplib.SMTP_SSL('smtp.gmail.com', 465) as smtp_server:
        smtp_server.login(sender, password)
        smtp_server.sendmail(sender, receivers, msg.as_string())
    print("The plain text email is sent successfully!")

send_email(subject, body, sender, receivers, password)

Explanation:
- sender: This contains the sender's address.
- receivers: This contains email addresses of receiver 1 and receiver 2.
- msg: This is the content of the email.
- sendmail(): This is the SMTP object's instance method. It takes three parameters - sender, receivers, and msg - and sends the message.
- with: This is a context manager that is used to properly close an SMTP connection once an email is sent.
- MIMEText: This holds only plain text.
To send an email with attachments securely in Python, you'll need some additional pieces: the MIMEBase class and the encoders module. Here's the code for this case:
import smtplib
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

sender = "sender1@gmail.com"
password = "my_password"
receiver = "receiver1@gmail.com"
subject = "Email with Attachments"
body = "This is an email with attachments created in Python using Gmail SMTP and SSL."

# Adding the attachment to the email
with open("attachment.txt", "rb") as attachment:
    part = MIMEBase("application", "octet-stream")
    part.set_payload(attachment.read())

encoders.encode_base64(part)
# The header indicates that the file is an attachment
part.add_header(
    "Content-Disposition",
    'attachment; filename="attachment.txt"',
)

message = MIMEMultipart()
message['Subject'] = subject
message['From'] = sender
message['To'] = receiver

html_part = MIMEText(body)
message.attach(html_part)

# To attach the file
message.attach(part)

with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server:
    server.login(sender, password)
    server.sendmail(sender, receiver, message.as_string())

Explanation:
- MIMEMultipart: This class allows you to add text and attachments to an email as separate parts.
- 'rb': The attachment file is opened in binary mode so its raw bytes can be read.
- MIMEBase: This object is applicable to any file type.
- encoders.encode_base64: The file is encoded in Base64 for safe email sending.
To send an HTML email in Python using Gmail SMTP, you need a class - MIMEText.
Here's the full code for sending an HTML email in Python:
import smtplib
from email.mime.text import MIMEText

sender = "sender1@gmail.com"
password = "my_password"
receiver = "receiver1@gmail.com"
subject = "HTML Email in Python"
body = """
<html>
  <body>
    <p>HTML email created in Python with SSL and Gmail SMTP.</p>
  </body>
</html>
"""

message = MIMEText(body, 'html')  # To attach the HTML content to the email
message['Subject'] = subject
message['From'] = sender
message['To'] = receiver

with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server:
    server.login(sender, password)
    server.sendmail(sender, receiver, message.as_string())

Testing Your Email With Authentication and Encryption

Testing your emails before sending them to the recipients is important. It enables you to discover any issues or bugs in sending emails or with the formatting, content, etc.
Thus, always test your emails on a staging server before delivering them to your target recipients, especially when sending emails in bulk. Testing emails provides the following advantages:
- Ensures the email sending functionality is working fine
- Emails have proper formatting and no broken links or attachments
- Prevents flooding the recipient's inbox with a large number of test emails
- Enhances email deliverability and reduces spam rates
- Ensures the email and its contents stay protected from attacks and unauthorized access
To test this combined setup of sending emails in Python with authentication and encryption enabled, use an email testing server like Mailtrap Email Testing. This will capture all the SMTP traffic from the staging environment, and detect and debug your emails before sending them. It will also analyze the email content, validate CSS/HTML, and provide a spam score so you can improve your email sending.
To get started:
- Open Mailtrap Email Testing
- Go to 'My Inbox'
- Click on 'Show Credentials' to get your test credentials - login and password details
Here's the Full Code Example for Testing Your Emails:
import smtplib
from socket import gaierror

# Use Mailtrap-generated credentials for port, server name, login, and password
port = 2525
smtp_server = "sandbox.smtp.mailtrap.io"
login = "xyz123"  # Paste your Mailtrap login details
password = "abc$$"  # Paste your Mailtrap password

sender = "test_sender@test.com"
receiver = "test_receiver@example.com"
message = f"""\
Subject: Hello There!
To: {receiver}
From: {sender}

This is a test email."""

try:
    with smtplib.SMTP(smtp_server, port) as server:
        server.login(login, password)
        server.sendmail(sender, receiver, message)
    print('Sent')
except (gaierror, ConnectionRefusedError):
    # In case of connection errors
    print('Unable to connect to the server.')
except smtplib.SMTPServerDisconnected:
    print('Server connection failed!')
except smtplib.SMTPException as e:
    print('SMTP error: ' + str(e))

If there's no error, you should see this message in the receiver's inbox:
This is a test email.

Best Practices for Secure Email Sending

Consider the following Python email best practices for secure email sending:
- Protect data: Take appropriate security measures to protect your sensitive data, such as SMTP credentials, API keys, etc. Store them in a secure, private place like config files or environment variables, ensuring no one can access them publicly.
- Encryption and authentication: Always use email encryption and authentication so that only authorized individuals can access your emails and their content. For authentication, you can use advanced methods like API keys, two-factor authentication, single sign-on (SSO), etc. Similarly, use advanced encryption techniques like SSL, TLS, E2EE, etc.
- Error handling: Manage network issues, authentication errors, and other problems by handling errors effectively using try/except blocks in your code.
- Rate-limiting: Maintain high email deliverability by rate-limiting the email sending functionality to prevent exceeding your service limits (a minimal sketch follows this list).
- Validate emails: Validate email addresses from your list and remove invalid ones to enhance email deliverability and prevent your domain from getting marked as spam. You can use an email validation tool to do this (the sketch after this list shows a basic syntax check).
- Educate: Keep your team updated on secure email practices and cybersecurity risks. Monitor your spam score and email deliverability rates, and work to improve them.
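To make the rate-limiting and validation points concrete, here's a minimal sketch (the regex, the one-second default delay, and the Gmail server are illustrative assumptions, not recommendations from this article):

import re
import time
import smtplib
from email.mime.text import MIMEText

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # rough syntax check only

def send_batch(sender, password, receivers, subject, body, delay_seconds=1.0):
    # Drop addresses that fail the basic syntax check
    valid = [r for r in receivers if EMAIL_RE.match(r)]
    with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server:
        server.login(sender, password)
        for receiver in valid:
            msg = MIMEText(body)
            msg['Subject'] = subject
            msg['From'] = sender
            msg['To'] = receiver
            server.sendmail(sender, receiver, msg.as_string())
            time.sleep(delay_seconds)  # crude rate limit between messages

A dedicated validation service or library will catch far more than this regex does; the point is simply to filter and pace your sending in code.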
Secure your email sending with Python by using advanced encryption methods like SSL, TLS, and end-to-end encryption, as well as authentication protocols and techniques such as SPF, DMARC, 2FA, and API keys.
By combining these security measures, you can protect your confidential email information, improve email deliverability, and maintain trust with your target recipients. This way, only individuals with the appropriate credentials can access your email content, which helps prevent unauthorized access, data breaches, and other cybersecurity attacks.
The Python Show: 47 - Python Projects of 2024
I’ve been working on lots of projects this year. Here are the ones I highlighted in this episode:
JupyterLab 101 Book on Kickstarter
A book on Textual
Oliver Davies' daily list: The two ways of writing PHP code
Something that came up in my discussion with Dave Liddament for the Beyond Blocks podcast was that there seem to be two ways of writing PHP code.
One is writing strict code by enabling strict typing, using parameter and return types, and leveraging tools like PHPStan at a high level to analyze code.
The other is to not use types and to take a more "duck typing" approach.
The term "visual debt" came from a video discussing the pros and cons of these approaches.
The same can be said for JavaScript and TypeScript, but PHP can do both and gives the developer the choice of how they write their code.
I prefer writing strict code and for my code to be as explicit as possible, but I appreciate not everyone does and I like that PHP caters for both.
How do you write your PHP code?
Armin Ronacher: Accidental Spending: A Case For an Open Source Tax?
Both last week at London tech leaders and this week at the Open Source Summit in Vienna I engaged in various discussions about pledging money to Open Source. At Sentry we have been funding our Open Source dependencies for a few years now and we're trying to encourage others to do the same.
It’s not an easy ask, of course. One quite memorable point raised was what I would call “accidental spending”. The story goes like this: an engineering team spins up a bunch of Kubernetes machines. As the fleet grows in scale, some inefficiencies creep in. To troubleshoot or optimize, additional services such as load balancers, firewalls, cloud provider log services, etc. are provisioned with minimal discussion. Initially none of that was part of the plan, but ever so slightly, for every computing resource some extra stuff is paid on top, creating largely hidden costs. Ideally all of that pays off (after all, by debugging quicker you hopefully reduce downtime, and by having that load balancer you can auto-scale and save on unused computing resources, etc.). But often, the payoff feels abstract and is hard to quantify.
I call those purchases “accidental” because they are proportional to the deployed infrastructure, largely acting like a tax on top of everything. Only after a while does the scale of that line item become apparent. Purchasing a third-party system, on the other hand, is a very deliberate act: it requires conversations, and more scrutiny is placed on putting a credit card into a new service. Companies providing services understand this and are positioning themselves accordingly. Their play could be to make the case that their third-party solution is better, cheaper, etc.
Open Source funding could be seen through both of these lenses. Today, in many ways, pledging money to Open Source is a very intentional decision. It requires discussions, persuasion and justification. The purpose and the pay-off is not entirely clear. Companies are not used to the idea of funding Open Source and they don't have a strong model to reason about these investments. Likewise many Open Source projects themselves also don't have a good way of dealing with money and might lack the governance to handle funds effectively. After all many of these projects are run by individuals and not formal organizations.
Companies are unlikely to fund something without understanding the return on investment. One better understood idea is to turn that one “random person in Nebraska” maintaining a critical dependency into a well-organized team with good op-sec. But for that to happen, funding needs to scale from pennies to dollars, making it really worthwhile.
My colleague Chad Whitacre floated an idea: what if platforms like AWS or GitHub started splitting the check, by adding a line item to their customers’ invoices to support Open Source funding? It would turn giving to Open Source into something more like a tax, and it might leverage the general willingness to accept small add-on charges for a good cause. If we all paid 3% on top of our cloud or SaaS bills to Open Source, it would quickly add up.
While I’m intrigued by the idea, I also have my doubts that this would work. It goes back to the problem mentioned earlier: some Open Source projects just have no governance or are not even ready to receive money. How much value you put on a dependency is also very individual. Just because an NPM package has a lot of downloads does not necessarily mean it's critical to the mission of the company. rrweb is a good example for us at Sentry. It sits at the core of our session replay product, but since we vendor a pinned fork, you would not see rrweb in your dependency tree. We also value that package more highly than any algorithm could determine from the outside.
So the challenge with the tax — as appealing as it is — is that it might make the “purchase decision” of funding Open Source easier, but it would probably make the distribution problem much worse. Deliberate, intentional funding is key. At least for the moment.
Still, it’s worth considering. The “what if” is a powerful idea. Using a restaurant analogy, the “open-source tax” is like the mandatory VAT or health surcharge on your bill: no choice is involved. Another model could be more like the tip suggestions on a receipt offering a choice but also guidance on what’s appropriate to contribute.
The current model we propose with our upcoming Open Source Pledge is to suggest, like a tip, what you should give in relation to your developer workforce. Take the average number of full-time engineers you have over a year and multiply it by 2000. That is the amount in US dollars you should give to your Open Source dependencies. (A company averaging 50 engineers, for example, would pledge $100,000 per year.)
That sounds like a significant amount! But let's put this in relation for a typical developer you employ: it's less than a fifth of what you would pay in FICA (Federal Insurance Contributions Act) taxes in the US, and less than the communal tax you would pay in Austria. I'm sure you can think of similar payroll taxes in your country.
I believe that after step one of recognizing there is a funding problem follows an obvious step two: establishing a baseline funding amount that stands in relation to the business you own or are part of. Using the size of the development team as a metric offers an objective and quantifiable starting point. The beauty, in my mind, of the developer count in particular is that it's somewhat independently observable from both the outside and inside [1]. The latter is important! It creates a baseline for people within a company to start a conversation about Open Source funding.
If you have feedback on this, particularly the pledge, I invite you to mail me or to leave a comment on the Pledge's issue tracker.
[1] There is an analogy to historical taxation here. For instance, the Window Tax was a tax based on the number of windows in a building. That made enforcement easy, because you could count them from street level. The downside was obviously the unintended consequences that this caused. Something to always keep in mind!

Ruqola 2.3.0
Ruqola 2.3.0 is a feature and bugfix release of the Rocket.chat app.
New features:
- Implement Rocket.Chat Marketplace.
- Allow cleaning room history.
- Allow checking for a new version.
- Implement moderation (administrator mode).
- Add welcome page.
- Implement pending users info (administrator mode).
- Use cmark-rc (https://github.com/dfaure/cmark-rc) for markdown support.
- Delete the oldest files from some cache directories (file-upload and media) so they don't grow forever.
Fixed bugs:
- Clean market application model after 30 minutes (reduce memory footprint).
- Fix show discussion name in completion.
- Fix duplicated messages in search message dialog.
- Add delegate in search rooms in team dialog.
URL: https://download.kde.org/stable/ruqola/
Source: ruqola-2.3.0.tar.xz
SHA256: 051186793b7edc4fb2151c80ceab3bcfd65acb27d38305568fda54553660fdd0
Signed by: E0A3EB202F8E57528E13E72FD7574483BB57B18D Jonathan Riddell jr@jriddell.org
https://jriddell.org/jriddell.pgp
Python Engineering at Microsoft: Announcing the new Python Data Science Extension Pack for VS Code
We’re thrilled to announce the launch of the new Python Data Science Extension Pack for Visual Studio Code! This powerful pack brings together some of the most popular and essential VS Code extensions, making it your one-stop shop for all things data science in Python.
What’s Inside?

Our extension pack is designed to streamline your data science journey from start to finish. Whether you’re preparing data, conducting analysis, visualizing results, or building and training machine learning models, we’ve got you covered.
This Data Science extension pack currently includes four extensions:
- Python – Provides rich support for the Python language such as IntelliSense, debugging, formatting, linting, code navigation, refactoring, variable explorer, test explorer, and more.
- Jupyter – Used to create and edit Jupyter Notebooks, add and run code/markdown cells, render plots, create presentation-friendly versions of your notebook by exporting to HTML or PDF and more.
- GitHub Copilot – An AI pair programmer tool that helps you write code faster and smarter.
- Data Wrangler – A code-centric data viewing and cleaning tool to explore, visualize, and clean tabular data.
Dive into the world of data science by installing the Python Data Science Extension Pack for VS Code from the VS Code extension marketplace.
We encourage you to provide feedback and file issues. Additionally, if there are other VS Code extensions that you feel are essential to the data science workflow, please let us know by creating a ticket in our GitHub repo.
The post Announcing the new Python Data Science Extension Pack for VS Code appeared first on Python.
Liip: blökkli Starterkit released
Meet the blökkli starterkit for Drupal.
Spin up a preconfigured decoupled Drupal setup with Nuxt 3, GraphQL and blökkli to get started developing within seconds.
Enjoy the powerful and elegant editing experience offered by blokk.li, a fully interactive in-page editor based on the well-known Drupal Paragraphs module.
- Out-of-the-box support experience with local setup using Lando or DDEV
- Drupal backend setup with GraphQL, Paragraphs and the Paragraphs blokk.li module enabled
- 40+ Vue components and composables ready to build paragraph-based Drupal websites
- Basic frontend setup with Nuxt 3 and GraphQL Middleware included
- Multilanguage support with language negotiation and translation support
- Key-based texts managed in Drupal combined with translation extraction in the frontend
- SVG icon sprite generation with Drupal Media Library integration
- rokka.io image CDN integration and Drupal media library integration
- Reverse proxy configuration generator included
We are looking forward to getting your feedback in the issue queue and on slack.
If you happen to be in Barcelona for DrupalCon, find us at the following dates:
- Session: Large-scale content creation with Drupal — Delights, Pitfalls and support structures to help editors by Jonathan Noack (Jonock), Thomas Nagy (Thomnagy)
  Wednesday, September 25, 2024 - 11:30 to 12:15
  Room 1 (133-134)
- BoF: blökkli - Demo, Q&A for the Interactive page building experience with Nuxt
  Thursday, September 26, 2024 - 16:30 to 17:15
  Room BoF 1 (118)
Lullabot: Untangling Your Drupal Migration: Lessons from the State of Iowa
Migrating your content to a new CMS can feel daunting. You have years and years of content to sort through and move over, and you have no idea what obstacles might be hiding in the weeds. In the back of your mind, the question looms: “Do we really need all of this content?”
You don’t want to waste time. But you also don’t want to miss anything important.
Dirk Eddelbuettel: Rblpapi 0.3.15: Updated and New BLP Library
Version 0.3.15 of the Rblpapi package arrived on CRAN today. Rblpapi provides a direct interface between R and the Bloomberg Terminal via the C++ API provided by Bloomberg (but note that a valid Bloomberg license and installation is required).
This is the fifteenth release since the package first appeared on CRAN in 2016. This release updates to the current version 3.24.6 of the Bloomberg API, and rounds out a few corners in the packaging from continuous integration to the vignette.
The detailed list of changes follow below.
Changes in Rblpapi version 0.3.15 (2024-09-18)

- A warning is now issued if more than 1000 results are returned (John in #377 addressing #375)
- A few typos in the rblpapi-intro vignette were corrected (Michael Streatfield in #378)
- The continuous integration setup was updated (Dirk in #388)
- Deprecation warnings over char* where C++ class Name is now preferred have been addressed (Dirk in #391)
- Several package files have been updated (Dirk in #392)
- The request formation has been corrected, and an example was added (Dirk and John in #394 and #396)
- The Bloomberg API has been upgraded to release 3.24.6.1 (Dirk in #397)
Courtesy of my CRANberries, there is also a diffstat report for this release. As always, more detailed information is at the Rblpapi repo or the Rblpapi page. Questions, comments, etc. should go to the issue tickets system at the GitHub repo.
If you like this or other open-source work I do, you can sponsor me at GitHub.
This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.
Wim Leers: XB week 17: drag and drop party
We matched last week’s record: again 26 MRs merged! :D
Experience Builder (XB) already had a hierarchy view for a while. Lauri worked with the Acquia UX team to change that to match the more common “layers” pattern (used in Photoshop and Figma). Harumi “hooroomoo” Jang made that a reality:
The new “layers” panel, which also allows moving components. (Issue #3458503, image by Harumi.)
Ben “bnjmnm” Mullins and I (mostly Ben!) collaborated on integrating Media Library! This required expanding some of the lower-level XB infrastructure, but most importantly, it means we proved Drupal core's most complex field widget [1] can work, which is an important milestone:
Using an image from the Media Library. Note how the alt updates, but the image won’t load — more about that later in this post :) (Issue #3454173, image by me.)
During the product research phase, Lauri identified that it’s important for Content Creators’ productivity to not have to craft the same combinations of components over and over again. Lauri and the Acquia UX team have labeled such combinations “sections” — similar to Layout Builder’s sections. Creating new ones is out-of-scope for 0.1.0, but conveying what that UX would feel like is in scope. So, Jesse “jessebaker” Baker and Bálint “balintbrews” Kléri worked on a client-only implementation that hardcodes a single section (again: for now):
The sole “section” available right now: one that contains two predefined hero components. (Issue #3463300, image by me.)
It takes no designer or expert user to observe that in the above images, the drag-and-drop UX and visualization can be improved. The Figma designs do not have an answer for this. But … we have Bálint! :D He thought, tinkered, experimented and gave the Drupal ecosystem this delightful UX:
A blue line now precedes the ghost while dragging, which conveys both the current position and the target position upon dropping. (Issue #3470973, image by Bálint.)
… which subsequently enabled him together with danielveza and Jesse to also highlight the slot that a component is about to be placed in:
The precise destination of a component has a thick blue line; the containing slot gets a thin outline. (Issue #3469822, image by Bálint.)
If that isn’t an epic leap forward on the front end, then I don’t know what is! :D On so many fronts, dragging and dropping components became not only more usable, but also enjoyable.
It doesn’t end there, though:
- Utkarsh “utkarsh_33” updated the styling of text inputs, boolean toggles, etc. to match the designs, by mapping more form elements to XB’s React components using the semi-coupled theme engine that Ben introduced in week 10
- Utkarsh, fazilitehreem and Jesse made deleting component instances more intuitive: you can delete the selected one using the keyboard now
- fazilitehreem, Utkarsh and Jesse added right-click support to the selected component
- Jesse fixed a crucial bug that prevented slots in freshly dropped components from working
Missed a prior week? See all posts tagged Experience Builder.
Goal: make it possible to follow high-level progress by reading ~5 minutes/week. I hope this empowers more people to contribute when their unique skills can best be put to use!
For more detail, join the #experience-builder Slack channel. Check out the pinned items at the top!
Back end

Comparatively, the back end progress this week was very non-visual… with one exception: Ted “tedbow” Bowman and I fixed the visually broken “image” components — this was caused by the buggy PoC code I wrote 14 weeks ago — finally this rose to the top of the priorities!
Images now render as expected in Experience Builder. Compare and contrast with the Media Library image above :) (Issue #3469436, image by me.)
Feliksas “f.mazeikis” Mazeikis and I discovered a critical bug in the auto-generated Component config entities for Single-Directory Components (SDCs) meeting the criteria: the field type and widget for optional props were missing. How could this happen? Because we’ve been racing ahead to make functionality exist, without the foundations being sufficiently thoroughly checked: the Component config entity’s schema is littered with @todos for adding more validation constraints. One of those would’ve prevented this problem … so we fixed not only the problem at hand, but also ensured that it could never reoccur, by introducing a KeyForEverySdcProp validation constraint first, and then fixing the auto-generation logic.
Dave “longwave” Long, Lee “larowlan” Rowlands and Deepak “deepakkm” Mishra updated XB to declare a runtime rather than development dependency on justinrainbow/json-schema — this is what the SDC subsystem uses to validate that the provided props values are considered acceptable by an SDC, and that’s why XB uses it to validate an XB field is valid (i.e. every SDC in the component tree must be renderable and hence trigger no exceptions for provided SDC props values). So that should’ve been marked as an explicit dependency months ago, but we didn’t spot that. Easy enough!
However … Lee pointed out that this is actually unacceptable for Drupal sites that use JSON:API in production, because it causes automatic validation for every JSON:API response against the JSON:API spec if assertions are enabled. That results in a significant performance regression. That being said, having assertions enabled is also a violation of Drupal best practices (and PHP best practices). Still, Drupal should help users even when they ignore/are unaware of best practices, so the XB module warns on the status report when best practices are violated. A core issue was created to improve this upstream: #3472008.
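For those wondering what that best practice looks like in practice, disabling assertion evaluation in production is a one-line php.ini setting (shown here purely as an illustration, not as part of the XB change):

; php.ini on production servers
zend.assertions = -1  ; assertion code is not compiled in, so it has zero runtime cost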
What a week! :D
Week 17 was September 2–8, 2024.
[1] For now, that Media Library dialog looks rather stark, because it is, well … using the Stark theme. We plan to load the Claro/Gin styles, but to ensure style isolation, that requires some non-trivial <iframe> shenanigans in #3471978 to avoid loading that CSS/JS in the context of the XB React app. ↩︎
Real Python: Python 3.13 Preview: Free Threading and a JIT Compiler
Although the final release of Python 3.13 is scheduled for October 2024, you can download and install a preview version today to explore the new features. Notably, the introduction of free threading and a just-in-time (JIT) compiler are among the most exciting enhancements, both designed to give your code a significant performance boost.
In this tutorial, you’ll:
- Compile a custom Python build from source using Docker
- Disable the Global Interpreter Lock (GIL) in Python
- Enable the Just-In-Time (JIT) compiler for Python code
- Determine the availability of new features at runtime
- Assess the performance improvements in Python 3.13
- Make a C extension module targeting Python’s new ABI
Check out what’s new in the Python changelog for a complete list of the upcoming features and improvements. This document contains a quick summary of the release highlights as well as a detailed breakdown of the planned changes.
To download the sample code and other resources accompanying this tutorial, click the link below:
Get Your Code: Click here to download the free sample code that shows you how to work with the experimental free threading and JIT compiler in Python 3.13.
Take the Quiz: Test your knowledge with our interactive “Python 3.13: Free-Threading and a JIT Compiler” quiz. You’ll receive a score upon completion to help you track your learning progress:
Interactive Quiz
Python 3.13: Free-Threading and a JIT Compiler

In this quiz, you'll test your understanding of the new features in Python 3.13. You'll revisit how to compile a custom Python build, disable the Global Interpreter Lock (GIL), enable the Just-In-Time (JIT) compiler, and more.
Free Threading and JIT in Python 3.13: What’s the Fuss?

Before going any further, it’s important to note that the majority of improvements in Python 3.13 will remain invisible to the average Joe. This includes free threading (PEP 703) and the JIT compiler (PEP 744), which have already sparked a lot of excitement in the Python community.
Keep in mind that they’re both experimental features aimed at power users, who must take extra steps to enable them at Python’s build time. None of the official channels will distribute Python 3.13 with these additional features enabled by default. This is to maintain backward compatibility and to prevent potential glitches, which should be expected.
Note: Don’t try to use Python 3.13 with the experimental features in a production environment! It may cause unexpected problems, and the Python Steering Council reserves the right to remove these features entirely from future Python releases if they prove to be unstable. Treat them as an experiment to gather real-world data.
In this section, you’ll get a bird’s-eye view of these experimental features so you can set the right expectations. You’ll find detailed explanations of how to enable them and evaluate their impact on Python’s performance in the remainder of this tutorial.
Free Threading Makes the GIL Optional

Free threading is an attempt to remove the Global Interpreter Lock (GIL) from CPython, which has traditionally been the biggest obstacle to achieving thread-based parallelism when performing CPU-bound tasks. In short, the GIL allows only one thread of execution to run at any given time, regardless of how many cores your CPU is equipped with. This prevents Python from leveraging the available computing power effectively.
There have been many attempts in the past to bypass the GIL in Python, each with varying levels of success. You can read about these attempts in the tutorial on bypassing the GIL. While previous attempts were made by third parties, this is the first time that the core Python development team has taken similar steps with the permission of the steering council, even if some reservations remain.
Note: Python 3.12 approached the GIL obstacle from a different angle by allowing the individual subinterpreters to have their independent GILs. This can improve Python’s concurrency by letting you run different tasks in parallel, but without the ability to share data cheaply between them due to isolated memory spaces. In Python 3.13, you’ll be able to combine subinterpreters with free threading.
The removal of the GIL would have significant implications for the Python interpreter itself and especially for the large body of third-party code that relies on it. Because free threading essentially breaks backward compatibility, the long-term plan for its implementation is as follows:
- Experimental: Free threading is introduced as an experimental feature and isn’t a part of the official Python distribution. You must make a custom Python build to disable the GIL.
- Enabled: The GIL becomes optional in the official Python distribution but remains enabled by default to allow for a transition period.
- Disabled: The GIL is disabled by default, but you can still enable it if needed for compatibility reasons.
There are no plans to completely remove the GIL from the official Python distribution at the moment, as that would cause significant disruption to legacy codebases and libraries. Note that the steps outlined above are just a proposal subject to change. Also, free threading may not pan out at all if it makes single-threaded Python run slower than without it.
Until the GIL becomes optional in the official Python distribution, which may take a few more years, the Python development team will maintain two incompatible interpreter versions. The vanilla Python build won’t support free threading, while the special free-threaded flavor will have a slightly different Application Binary Interface (ABI) tagged with the letter “t” for threading.
This means that C extension modules built for stock Python won’t be compatible with the free-threaded version and the other way around. Maintainers of those external modules will be expected to distribute two packages with each release. If you’re one of them, and you use the Python/C API, then you’ll learn how to target CPython’s new ABI in the final section of this tutorial.
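To check at runtime which interpreter flavor you're on, you can query the build configuration. Here's a small sketch of ours (not from the tutorial); note that sys._is_gil_enabled() is a private, provisional API in 3.13:

import sys
import sysconfig

# 1 on free-threaded ("t") builds, 0 (or None) on standard builds
print(sysconfig.get_config_var("Py_GIL_DISABLED"))

# Reports whether the GIL is actually active in this process. It can be
# re-enabled even on free-threaded builds, e.g. by an incompatible C extension.
if hasattr(sys, "_is_gil_enabled"):
    print(sys._is_gil_enabled())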
JIT Compiles Python to Machine Code

As an interpreted language, Python takes your high-level code and executes it on the fly without the need for prior compilation. This has both pros and cons. Some of the biggest advantages of interpreted languages include better portability across different hardware architectures and a quick development time due to the lack of a compilation step. At the same time, interpretation is much slower than directly executing code native to your machine.
Note: To be more precise, Python interprets bytecode instructions, an intermediate binary representation between pure Python and machine code. The Python interpreter compiles your code to bytecode when you import a module and stores the resulting bytecode in the __pycache__ folder. This doesn’t inherently make your Python scripts run faster, but loading a pre-processed bytecode can indeed speed up their startup time.
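To peek at that intermediate layer yourself, the standard dis module disassembles a function into the bytecode instructions the interpreter executes (a small example of ours, not from the tutorial):

import dis

def add_one(x):
    return x + 1

dis.dis(add_one)  # prints instructions such as LOAD_FAST, BINARY_OP, RETURN_VALUE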
Languages like C and C++ leverage Ahead-of-Time (AOT) compilation to translate your high-level code into machine code before you ship your software. The benefit of this is faster execution since the code is already in the computer’s mother tongue. While you no longer need a separate program to interpret the code, you must compile it separately for all target platforms that you want supported. You should also handle platform-specific differences yourself.
Read the full article at https://realpython.com/python313-free-threading-jit/ »
Tag1 Consulting: Migrating Your Data from D7 to D10:Migrating view modes and field groups
Today, we're building on our previous work with field widget settings. We will cover migrating view modes, a prerequisite for migrating field groups and field formatter settings. We’ll then walk through migrating field groups. Field formatter settings will be addressed in our next article.
Jamie McClelland: Gmail vs Tor vs Privacy
A legit email went to spam. Here are the redacted, relevant headers:
[redacted]
X-Spam-Flag: YES
X-Spam-Level: ******
X-Spam-Status: Yes, score=6.3 required=5.0 tests=DKIM_SIGNED,DKIM_VALID,
    [redacted]
 * 1.0 RCVD_IN_XBL RBL: Received via a relay in Spamhaus XBL
 *      [185.220.101.64 listed in xxxxxxxxxxxxx.zen.dq.spamhaus.net]
 * 3.0 RCVD_IN_SBL_CSS Received via a relay in Spamhaus SBL-CSS
 * 2.5 RCVD_IN_AUTHBL Received via a relay in Spamhaus AuthBL
 * 0.0 RCVD_IN_PBL Received via a relay in Spamhaus PBL
[redacted]
[very first received line follows...]
Received: from [10.137.0.13] ([185.220.101.64])
    by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-378956d2ee6sm12487760f8f.83.2024.09.11.15.05.52
    for <xxxxx@mayfirst.org>
    (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
    Wed, 11 Sep 2024 15:05:53 -0700 (PDT)

At first I thought a Gmail IP address was listed in Spamhaus - I even opened a ticket. But then I realized it wasn't the last hop that Spamhaus is complaining about, it's the first hop, specifically the IP 185.220.101.64, which appears to be a Tor exit node.
The sender is using their own client to relay email directly to Gmail. Like any sane person, they don’t trust Gmail to protect their privacy, so they are sending via Tor. But WTF, Gmail is not stripping the sending IP address from the header.
I’m a big fan of harm reduction and have always considered using your own client to relay email with Gmail as a nice way to avoid some of the surveillance tax Google imposes.
However, it seems that if you pursue this option you have two unpleasant choices:
- Embed your IP address in every email message or
- Use Tor and have your email messages go to spam
I suppose you could also use a VPN, but I doubt the IP reputation of most VPN exit nodes is more reliable than Tor's.
Drupal Starshot blog: Drupal CMS Update for The Mid September 2024
We are in the middle of September and that means it’s time for our regular update on what’s going on with the Drupal CMS. Let’s check out what’s new!
Documentation
The Drupal Association is on a mission to bring world-class documentation and maintenance software for Drupal CMS launch and beyond. In order to achieve that, we teamed up with Drupalize.me. Our common goal is to make sure that by the time when Drupal CMS hits the market we have an easy to follow user guide suitable for our target audience. Also we prepared an announcement that will be revealed at DrupalCon Barcelona. Stay tuned for the details!
Contact Form
We are happy to announce that we found a track lead for the Contact form! J. Hogue from Oomph joined the team very recently but has already managed to show progress: he and his team are getting busy with the research and MVP mapping. It’s great to have you with us!
Blog
As per the most recent update from Laurens Van Damme and his team, the MVP version has been set up, and the team is now busy researching how to align with Drupal CMS standards.
Events
While Martin Anderson-Clutz and his team are considering expanding functionality with the options that will not be applied by default, they are working on a different calendar solution that would have more community support, as well as UX improvement for the date widget. We can’t wait to see the result of their work!
Data Privacy / Compliance
Jürgen Haas tells us that the information research and scope framing for the track have been completed and the team has agreed on the documentation. Therefore, the following three action items have been set as immediate next priorities:
- continue documentation
- break down the existing feature list into deliverables/recipes and prioritise them
- define components for the "Compliance Audit" module as an additional deliverable
Trial experience for Starshot
Some exciting news is coming from Matt Glaman: the trial now displays an interactive installer of Drupal CMS. Meanwhile, the work on styling, so that the trial looks like the Drupal CMS installer while it is being set up, continues at full steam.
Dashboard
The team, led by Christian López Espínola and Matthew Tift, has got the wireframes they can rely on, hence now it's time to get things rolling! They are busy looking deeper into the Gin theme, in particular config actions for adding blocks to the dashboards from other recipes. There is a decision to be made on whether the right sidebar should consist of shortcuts or individual blocks. At the same time, planning for the upcoming activities is in progress, and we are waiting patiently to see more details on what's ahead.
SEO
Great news from Jim Birch and John Doyle with the SEO track team: Basic and Advanced SEO recipes have been committed to the repository. The next priority is to continue iterating on the recipes, documentation, and guidelines for other tracks.
Content Publishing Workflows
We are most excited to announce that we’ve got one more valuable addition to the team - Mohammed Razem from Vardot is joining the Drupal CMS crew as a track lead of the Content Publishing Workflows track. Welcome on board!
Advanced Search
The 1xINTERNET team has been busy finalising the first version of the concept. The next target for them is to get insights from the survey asking Drupalers what they prefer to use in order to confirm earlier findings from the specification.
Media Management
Tony Barker reports that the Media track team is researching the features of content management systems identified in the strategy document, as well as working through the information and ideas from the earlier released questionnaire. They keep experimenting with modules and configuration to make the necessary choices, focusing on features that can make it into early recipes over the coming weeks.
Accessibility Tools
From Gareth Alexander we learn the following: with the discovery process and the gathering of insights on common practices finalised, the team has published a survey and is now busy reviewing the results. The review of currently available modules has been completed, and the list has been narrowed down to the ones now being assessed for feasibility.
This will lead to a proposal for the features and recipes for Accessibility Tools to be considered for inclusion in early versions of Drupal CMS.
Proposal creation is underway and the next steps are being generated.
Analytics
Dharizza Espinach meanwhile shares that the team has finished the market research, and is currently wrapping up the work on a comparison with other tools. The selected tools are to be included in the recipes. Another objective the track team is busy with is preparing the list of recommendations for features that should be included later in the project. The proposal document is underway and iterating over a first version of the basic recipe will be the next step.
AI
Jamie Abrahams is working on something very special that will be released during DrupalCon Barcelona. So I will keep the intrigue and will allow you to discover the details for yourself in just 1 week!
New Track Announcement!
As we make progress with the already defined deliverables, we keep discovering the missing parts of the puzzle. In order to close those gaps, we are excited to announce 2 new tracks we not only set up but managed to get staffed as well:
- Navigation track - Matthew Oliveira and Pablo López Escobés from Lullabot are taking this one on.
- Gin admin theme track - Known to many as a long-time maintainer of Gin, Sascha Eggenberger became track lead for this milestone.
Our heartfelt welcome to all of the newly appointed Drupal CMS track leads - we are excited to have you and look forward to all the expertise you bring along!
I truly hope you find the news outlined above as exciting as we do and look forward to sharing even more at DrupalCon Barcelona. So if you somehow didn’t get your ticket yet - better hurry up!